portelli/Grid - Grid - DiRAC Tursa git server

mirror of https://github.com/paboyle/Grid.git synced 2026-05-20 00:54:30 +01:00

Author	SHA1	Message	Date
paboyle	5a80930dd2	Charge conjugation boundary conditions for gauge fields implemented as a policy class, changing the nature of covariant Cshifts used in plaquettes, rectangles and staples. As a result same code is used for the plaq and rect action independent of the BC type. Should probably isolate the BC in a separate class that Gimpl takes as a template param. Do the same with smearing policies. This would then allow composition of BC with smearing etc....	2016-01-02 13:37:25 +00:00
paboyle	841a37f941	Fix to WilsonCompressor that fixes a bug in comms phase due to the sign change on gamma matrix in hopping term. Add logging of time spent in CG.	2015-12-29 23:49:41 +00:00
Azusa Yamaguchi	e6cad3821c	Logging improvement	2015-12-29 19:51:18 +00:00
Azusa Yamaguchi	98de1cbb6a	Optimised version of rectangle term staples. ~3.4x faster than the naive.	2015-12-29 19:22:59 +00:00
Azusa Yamaguchi	f7d61b8b81	Plaq plus rectangle and Iwasaki, Symanzik DBW2. http://arxiv.org/pdf/hep-lat/0610075.pdf plaq and rect regress plausibly over 100 trajectories and under HMC with average plaq and rectangle coming out ok.	2015-12-28 16:39:26 +00:00
Azusa Yamaguchi	78c4e862ef	Plaq, Rectangle, Iwasaki, Symanzik and DBW2 working and HMC regresses to http://arxiv.org/pdf/hep-lat/0610075.pdf	2015-12-28 16:38:31 +00:00
paboyle	0afcf1cf13	Moved all the HMC tests over to using a single HmcRunner class that manages checkpoint strategies and such like	2015-12-22 11:19:25 +00:00
paboyle	08edbb5cbe	HMC bit repro across checkpoints. Fixed parallel RNG issue with threading. Conclusion: C++11 distributions are not thread safe and must use a distinct dist as well as a distinct engine per site. Makes sense when you think of Box-Muller. Also added a reset of dist on fill to ensure repro across checkpoints.	2015-12-22 08:54:40 +00:00
paboyle	0abfbcc8eb	Naming of files improvement.	2015-12-21 15:37:26 +00:00
paboyle	1b94253ba4	Logging improvement	2015-12-21 15:36:28 +00:00
paboyle	36e6f9ac7b	Bug fix. Guess not initialised in refresh step; didn't hit before due to luck in not having a vector created with NAN data.	2015-12-21 15:34:35 +00:00
paboyle	2f41691c11	Bug fix. Guess was not zeroed prior to CG call. Was earlier accidentally benign just due to luck.	2015-12-21 15:33:36 +00:00
paboyle	31ca609d12	HMC checkpointing. Need a general HMC framework to work in restart.	2015-12-20 02:29:51 +00:00
paboyle	e108e708a3	Wilson TM tests and compiles in	2015-12-17 23:06:33 +00:00
paboyle	67ccb043f1	Added TM fermions for DSDR etc..	2015-12-17 22:34:28 +00:00
Jung	eb1759d7ea	Added Gparity instantiation to no HANDOPT case; deleted configure (as intended?)	2015-12-16 00:04:09 -05:00
paboyle	34a0fde2ad	Fixes to fermion force terms after sign of gamma_mu (0...3) change. Thought I had already committed these. Believe I have got the Gparity fermion force working. * tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion based on a small dt update of U field. * tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21. Need to accumulate a full plaquette log to believe fully which will take some hours of run time.	2015-12-15 23:14:12 +00:00
Jung	bc34b7e808	Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2 Conflicts: lib/qcd/action/fermion/WilsonKernels.h tests/Make.inc	2015-12-15 11:11:59 -05:00
Jung	284453c5e9	Added gparity mobius defs, added params to ScaledShamir; checking in before pulling master	2015-12-14 12:15:06 -05:00
paboyle	3ce10aa975	Fix a regression failure on Mobius; chroma regression added	2015-12-10 22:55:00 +00:00
Jung	f2b4edc090	Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention)	2015-12-07 02:04:57 -05:00
paboyle	b2c02a6106	Runs fastest on cori	2015-11-28 16:58:16 -08:00
paboyle	e9ff25b06b	Small threading change makes a difference on Cori.	2015-11-07 00:07:05 -08:00
paboyle	05a7029600	Stencil change	2015-11-07 00:06:31 -08:00
paboyle	899ca41cb8	Merge branch 'master' of github.com:paboyle/Grid Conflicts: lib/qcd/action/fermion/WilsonFermion5D.cc	2015-11-06 03:50:04 -08:00
paboyle	d29b4c1dee	Assembler files	2015-11-06 03:48:48 -08:00
paboyle	a2ff068e29	Asm and threading for many core	2015-11-06 03:47:14 -08:00
paboyle	17af18dcab	Changes for AVX512 assembler	2015-11-06 03:45:51 -08:00
Peter Boyle	28022755ae	Stencil class name global change to StencilImpl typedef	2015-11-06 05:30:17 -06:00
paboyle	1159de165c	Asm option for AVX512	2015-11-05 22:04:51 -08:00
paboyle	16c7993434	Merge branch 'master' of github.com:paboyle/Grid Conflicts: lib/simd/Grid_avx512.h lib/simd/Grid_imci.h	2015-11-04 03:32:10 -08:00
paboyle	4e65ad21ac	Adding a routine for AVX512 / IMCI with explicit assembly implementations	2015-11-04 03:15:08 -08:00
Peter Boyle	abb23df83f	formatting only	2015-11-04 10:00:27 +00:00
paboyle	1878bf97d0	Babbage fix	2015-09-30 16:04:01 -07:00
paboyle	a660ce716b	No compile babbage fix	2015-09-30 16:02:44 -07:00
Peter Boyle	64d64d1ab6	Updating to modify non-inlining permute routines and hopefully get better reg use and enhance performance.	2015-09-25 08:55:04 -07:00
Peter Boyle	5ef42add2d	Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly and drop swizzles in AVX512. Don't know why these compiled.	2015-09-23 05:23:45 -07:00
Peter Boyle	2f38ebc446	Reintroducing the hand unrolled loops	2015-09-08 17:45:30 +01:00
Peter Boyle	638d6675ee	Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now. Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears.	2015-08-31 16:33:20 +01:00
Peter Boyle	357c6ab46d	Reunitarise. Complete the HMC and integrator changes.	2015-08-31 16:32:04 +01:00
Peter Boyle	755dca9533	Added ForceGradient integrator. dH dropped so seems to work. Will only believe it is right once I have pulled a dt^4 error scaling plot out.	2015-08-31 06:23:02 +01:00
Peter Boyle	29fd004d54	Unified integrator and integrator algorithm into virtual class used as a policy for the HMC.	2015-08-30 13:39:19 +01:00
Peter Boyle	aa52fdadcc	Global edit on HMC sector -- making GaugeField a template parameter and preparing to pass integrator, smearing, bc's as policy classes to hmc. Propose to unify "integrator" and integrator algorithm in a base/derived way to override step. Want to read through ForceGradient to ensure that abstraction covers the force gradient case.	2015-08-30 12:18:34 +01:00
Peter Boyle	76d752585b	Started a tidy up in the HMC sector. Now comfortable with the two level integrators; took a little while to figure out what Guido had done & why -- but there is a neat saving of force evaluations across the nesting time boundary making use of linearity of the leapP in dt. I cleaned up the printing, reduced the volume of code, in the process sharing printing between all integrators. Placed an assert that the total integration time for all integrators must match at end of trajectory. Have now verified e^-dH = 1 for nested integrators in Wilson/Wilson runs with both Omelyan and with Leapfrog so substantial confidence gained.	2015-08-29 17:18:43 +01:00
Peter Boyle	dc814f30da	Binary IO file for generic Grid array parallel I/O. Number of IO MPI tasks can be varied by selecting which dimensions use parallel IO and which dimensions use Serial send to boss I/O. Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes doing the I/O. Interpolates nicely between ALL nodes write their data, a single boss per time-plane in processor space [old UKQCD fortran code did this], and a single node doing all I/O. Not sure I have the transfer sizes big enough and am not overly convinced fstream is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero. Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations on my MacOS + OpenMPI and Clang environment. It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from each node in order to gather bigger chunks at the syscall level. That would push us up to the circa 8x 1848 == 4KB size write chunk, and by taking, say, x/y non parallel we get to 16MB contiguous chunks written in multi 4KB transactions per IOnode in 64^3 lattices for configuration I/O. I suspect this is fine for system performance.	2015-08-26 13:40:29 +01:00
Peter Boyle	e8d63c9178	Merge branch 'master' of https://github.com/paboyle/Grid	2015-08-19 05:49:00 +01:00
Peter Boyle	c54c086f17	Even odd preconditioned one flavour ratio (no support for non-const EE schur block)	2015-08-19 05:46:58 +01:00
Peter Boyle	dd6bb73ee0	Added one flavour rational ratios (unprec)	2015-08-19 04:58:40 +01:00
Peter Boyle	fc160eeccc	Added one flavour rational ratios (unprec)	2015-08-19 04:58:40 +01:00
Peter Boyle	48db72259e	EvenOdd schur decomposed mpcdagmpc version of rhmc determinant. dH is also small and plaquette looks right.	2015-08-18 18:37:39 +01:00
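
Note on 08edbb5cbe: the message concludes that each site needs its own distribution object as well as its own engine, and that the distribution should be reset on fill so that runs reproduce across checkpoints. A minimal sketch of that idea in standard C++11 follows. It is not Grid's implementation: `SiteRng`, `fill` and `reproducible_after_restore` are hypothetical names chosen for illustration, and copying the engine stands in for whatever checkpoint format is really used.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical per-site RNG state (not a Grid type): every site owns its own
// engine AND its own distribution, so no mutable state is shared between sites.
struct SiteRng {
  std::mt19937 engine;
  std::normal_distribution<double> gauss{0.0, 1.0};
};

// Draw n Gaussian numbers for one site. The distribution is reset first, so
// any value it cached from an earlier call cannot leak into this fill.
std::vector<double> fill(SiteRng &rng, int n) {
  assert(n >= 0);
  rng.gauss.reset();
  std::vector<double> out(static_cast<std::size_t>(n));
  for (double &x : out) x = rng.gauss(rng.engine);
  return out;
}

// Compare a run that continues in memory with one restarted from a saved
// engine state. Only the engine is "checkpointed"; the distribution is not.
bool reproducible_after_restore(unsigned seed, int n_before, int n_after) {
  SiteRng running;
  running.engine.seed(seed);
  fill(running, n_before);                         // may leave a cached value
  const std::mt19937 checkpoint = running.engine;  // stand-in for a checkpoint
  const std::vector<double> continued = fill(running, n_after);

  SiteRng restarted;                               // fresh distribution object
  restarted.engine = checkpoint;
  const std::vector<double> replayed = fill(restarted, n_after);
  return continued == replayed;
}
```

Calling `reproducible_after_restore(1234u, 3, 5)` compares the two runs element by element. Without the `reset()` in `fill`, a distribution that caches a second Gaussian value (as Box-Muller style implementations commonly do) could hand the continued run a value the restarted run never sees.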

420 Commits