Guido Cossu
089f0ab582
Debugged HMC for Creutz relation
2016-07-28 16:44:41 +01:00
Guido Cossu
b93e18ed50
Modified the Dirac Kernel class to compile with different number of colours
...
Added the general push_back functionality to accommodate all defined representations
Compiles, not tested
2016-07-18 16:36:28 +01:00
Guido Cossu
9c77bb69a5
Added all elements for Hirep HMC
...
TODO: Test and debug
2016-07-18 12:05:23 +01:00
paboyle
fad5c675eb
sign error on the 4d gparity force
2016-07-16 01:51:56 +01:00
paboyle
4908b77d46
Fixed conflicts. PLEASE avoid making wholesale cosmetic only changes, this created
...
a HUGE amount of difficult to resolve and understand conflicts.
Wholesale formatting, reordering functions etc... in a central file like Tensor_class
or Grid_vector_types while others are also editing without making substantial functionality
changes creates pain.
2016-07-15 20:59:07 +01:00
paboyle
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
paboyle
980ff18956
Solving the instantiation no compile issue
2016-07-15 17:19:44 +01:00
Guido Cossu
7edf4c6c04
Added HMC utilities for the higher representations
...
TODO: Inherit types for the pseudofermions, Debugging, testing
2016-07-15 13:39:47 +01:00
paboyle
1a6c7204ac
Disable instantiation; Use cache version instead
2016-07-15 00:34:39 +01:00
paboyle
49310fbab3
Done with red black change over
2016-07-15 00:08:43 +01:00
paboyle
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
...
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
...
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2
5d ls vec rename of impl class
2016-07-14 23:57:26 +01:00
paboyle
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
...
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
paboyle
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
...
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00
Guido Cossu
9dc345e8e8
Debugged smearing and adding HMC functions for hirep
2016-07-13 17:51:18 +01:00
Guido Cossu
a9ae30f868
Added representations definitions for the HMC
2016-07-12 13:36:10 +01:00
paboyle
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
Guido Cossu
daea5297ee
Wrote the projector in the adjoint representation algebra
2016-07-08 16:14:16 +01:00
Guido Cossu
5028969d4b
Added generators for the adjoint representation
2016-07-08 15:40:11 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
Guido Cossu
fbf96b1bbb
Merge branch 'develop' into feature/hirep
2016-07-07 14:20:10 +01:00
Guido Cossu
3c49ddfaa4
Merge branch 'temporary-smearing' into develop
2016-07-07 14:04:59 +01:00
Guido Cossu
ffb8b3116c
Tested smeared RHMC Wilson1p1, accepting
2016-07-07 11:49:36 +01:00
Christopher Kelly
4774a3bcd2
Generalized HotConfiguration and functions it calls to accept gauge fields with precision other than the default.
2016-07-06 18:01:08 -04:00
Guido Cossu
e87182cf98
Debugged the copy constructor of the Lattice class
2016-07-06 15:31:00 +01:00
Guido Cossu
e3d5319470
Debugged the real() and imag() functions and added tests to Test_Simd
2016-07-06 14:16:03 +01:00
Guido Cossu
ffedeb1c58
Minor modifications
2016-07-06 11:41:27 +01:00
Guido Cossu
3e80947c2b
Cleaned up HMC output. Tested smeared HMCs for single precision (OK)
2016-07-05 12:03:54 +01:00
Guido Cossu
fdfbf11c6d
Merge branch 'develop' into temporary-smearing
2016-07-04 18:45:10 +01:00
Guido Cossu
9cb90f714e
Merge remote-tracking branch 'origin/develop' into temporary-smearing
2016-07-04 17:28:40 +01:00
Guido Cossu
2daffdf95d
Tested smeared WilsonRatio action, accepts
2016-07-04 16:17:28 +01:00
Guido Cossu
149f826601
Tested smearing for Nf2 WilsonFermionAction, non EO: accepts
2016-07-04 16:09:19 +01:00
Guido Cossu
cd8ee27080
Simple change in iGamma for smearing
2016-07-04 16:02:57 +01:00
Guido Cossu
0fa66e8f3c
Debugged smearing for EOWilson, accepts
2016-07-04 15:35:37 +01:00
Guido Cossu
8dd099267d
Corrected a bug in the Expression Templates (acos and asin were wrong)
2016-07-03 12:28:25 +01:00
Guido Cossu
1a6d65c6a4
Converted set_uw and set_fj to all complex functions
2016-07-03 10:27:43 +01:00
Guido Cossu
092fa0d8da
Debugged set_fj,
...
to be fixed: BUG in imag()
2016-07-01 16:06:20 +01:00
paboyle
680645f849
Merge branch 'release/v0.5.0'
2016-06-30 15:15:03 -07:00
paboyle
712b9a3489
Asm only for avx512
2016-06-30 14:35:02 -07:00
paboyle
bdaa5b1767
Updated to have perfect prefetching for the s-vectorised kernel with any cache blocking.
2016-06-30 14:35:02 -07:00
paboyle
8fcefc021a
Improved the prefetching when using cache blocking codes
2016-06-30 14:35:02 -07:00
paboyle
05c884a62a
Prefetch change
2016-06-30 14:35:01 -07:00
paboyle
2d8bb4c594
Tweaks
2016-06-30 14:35:01 -07:00
paboyle
6d58cb2a68
Enable reordering of the loops in the assembler for cache friendly.
...
This gets in the way of L2 prefetching however. Do next next link in stencil
prefetching.
2016-06-30 14:35:01 -07:00
Guido Cossu
565e9329ba
Changed the colouring classes
2016-06-30 16:51:03 +01:00
Guido Cossu
5e02392f9c
Fixed compilation error for benchmark_dwf
...
Some parts were assuming floating point precision
2016-06-20 12:30:51 +01:00
paboyle
87418e7df1
Slightly faster prefetching perf.
2016-06-13 02:32:52 -07:00
paboyle
55f65b81b5
Improvements to the assembler interface that let us move chunks of the
...
site and s loop into the kernels. This will save on function call overhead and
guarantee L2 prefetching strategy is right since OMP can't distribute the
sub-chunks of work.
2016-06-09 01:12:36 -07:00
Azusa Yamaguchi
d9408893b3
Prefetching in the normal kernel implementation.
2016-06-08 05:43:48 -07:00
paboyle
8ac021de73
Added a test and fixed it for red black precon Ls innermost vectorised DWF
2016-06-07 13:16:56 -07:00
paboyle
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
paboyle
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
paboyle
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
c698b16d75
function to generate Chroma-style gamma matrix products
2016-05-01 18:30:35 -07:00
paboyle
5341977948
IMCI fixes. Thought I had committed these. The "real" disambiguation
...
between std::real and Grid::real shouldn't have been necessary and I don't
know why only the icpc v16.0 on babbage hits it.
May need a longer term rename of Grid::real or some careful EnableIf work.
2016-04-30 03:34:16 -07:00
f6c53e5039
Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2'
2016-04-30 00:17:52 -07:00
6aa000176f
Fermion <-> Propagator functions
2016-04-30 00:14:33 -07:00
paboyle
1e554350ac
The threaded comms didn't agree with GCC. Surprised, and looks like a GCC bug.
2016-04-29 16:49:18 -07:00
paboyle
c79ea0dcef
Fixing IMCI
2016-04-22 21:52:54 -07:00
paboyle
8fd8bc25e9
simd 5th dim with rotation
2016-04-19 15:39:00 -07:00
paboyle
ba427abde9
simd 5d
2016-04-19 15:38:39 -07:00
paboyle
9b6ab6db16
simd in 5th dimension support
2016-04-19 15:38:01 -07:00
paboyle
806a83d38b
simd in fifth dim support for dwf
2016-04-19 15:36:19 -07:00
neo
339be37dba
Debugging smeared HMC
2016-04-13 17:00:14 +09:00
neo
a87b744621
HMC runs but does not accept with smearing on
2016-04-07 16:45:11 +09:00
paboyle
b1192a8908
Benchmark_zmm added
2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596
Adding extra benchmark
2016-04-06 10:32:54 +01:00
97d0d56bcb
Debugging Smearing routines (set_fj)
2016-04-06 17:58:43 +09:00
7c7ea35ffb
Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions
2016-04-05 16:28:09 +09:00
4b1cf580e0
Debugging the Smearing routines
2016-04-05 16:19:30 +09:00
paboyle
e67fc2be18
Adding a trial for openmp overhead minimisation
2016-03-31 16:00:37 +01:00
paboyle
8052556275
Cleaning up the single/double kernel implementation switch
2016-03-31 14:51:32 +01:00
paboyle
60d965f79e
AVX512 improvements; sigfpe trapping too
2016-03-30 08:42:34 +01:00
paboyle
1ecbf9794d
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-30 08:37:55 +01:00
paboyle
c77b7ee897
AddSub based alternate SU3 routine
2016-03-28 17:55:22 -06:00
paboyle
1e355a51e1
Interface change
2016-03-27 23:46:55 -07:00
paboyle
21abaf7e91
Gamma sign change
2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
paboyle
60d4564151
ICC no compile fix
2016-03-16 02:30:40 -07:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
...
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
325e745daa
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-02 07:04:03 -08:00
paboyle
61413565d0
Back off the inlined spin proj as not working
2016-03-02 07:03:09 -08:00
2d8bb356e3
Smearing routines compile (still untested)
2016-02-25 02:43:59 +09:00
a7251f28c7
Stout smearing compiles (untested)
2016-02-24 03:16:50 +09:00
Antonin Portelli
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
Peter Boyle
6aeaf6f568
Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
...
turned up problems on the BlueWaters Cray.
Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
Peter Boyle
40f2db9bc0
Disable metropolis step until 10 traj covered. Should move to exposing these
...
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
Jung
9f0d9ade68
Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
...
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
neo
c1b1b89d17
More on smearing routines, writing APEsmear (dev)
2016-02-19 17:15:27 +09:00
neo
771235017d
Adding smearing routines (development)
2016-02-19 15:30:41 +09:00
paboyle
3425751cb8
Missing return value
2016-02-19 01:06:03 +00:00
Peter Boyle
22422a84d9
Small problem in compressor fix
2016-02-17 19:03:09 -06:00
Peter Boyle
c9fadf97a5
Simplify the compressor interface again.
2016-02-17 18:16:45 -06:00
Peter Boyle
81395e85d1
Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.
2016-02-16 13:56:44 -06:00
Peter Boyle
a0fc47c6f9
Cheaper implementation
2016-02-15 16:02:36 -06:00
paboyle
e2f73e3ead
Updates for shmem
2016-02-10 16:50:32 -08:00
neo
6371676a75
Correcting some compilation errors for clang-sse
2016-02-10 11:37:03 +09:00
Jung
bd84c23298
definitions reconciled.
2016-01-25 16:30:59 -05:00
Jung
7aa8d5e8af
Failing to compile, comparing with master
2016-01-25 16:03:02 -05:00
Jung
6012b0ec23
Checking in changes before changing to chulwoo-dec12-2015
2016-01-25 09:40:58 -05:00
Jung
411ac49dd7
GparityWilsonTM typedef added. Not yet tested
...
Conflicts:
configure
lib/qcd/action/fermion/WilsonKernels.h
2016-01-25 01:36:28 -05:00
Jung
5c57d4f403
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
paboyle
fc6ad65751
Pushed the overlap comms tweaks
2016-01-11 06:34:22 -08:00
paboyle
dafc74020c
Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori
2016-01-10 16:54:27 -08:00
paboyle
d19321dfde
Overlap comms compute changes
2016-01-10 19:20:16 +00:00
Jung
5924e5a562
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
configure
lib/qcd/action/Actions.h
lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
c99d748da6
Timing reports in benchmarks now reflect the asynch comms thread statistics
2016-01-04 14:42:16 +00:00
paboyle
02452afd36
Optional overlap of comms with compute
2016-01-04 14:18:40 +00:00
paboyle
331768dcff
Added overlap comms compute mode
2016-01-03 01:38:11 +00:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
1e68b1c1bd
Create a benign default for gparity twists
2016-01-02 14:06:53 +00:00
paboyle
5a80930dd2
Charge conjugation boundary conditions for gauge fields implemented as a policy
...
class, changing the nature of covariant Cshifts used in
plaquettes, rectangles and staples.
As a result same code is used for the plaq and rect action independent of the BC type.
Should probably isolate the BC in a separate class that Gimpl takes as a template param.
Do the same with smearing policies.
This would then allow composition of BC with smearing etc....
2016-01-02 13:37:25 +00:00
paboyle
841a37f941
Fix to WilsonCompressor that fixes a bug in comms phase due to the sign change on gamma
...
matrix in hopping term.
Add logging of time spent in CG.
2015-12-29 23:49:41 +00:00
Azusa Yamaguchi
e6cad3821c
Logging improvement
2015-12-29 19:51:18 +00:00
Azusa Yamaguchi
98de1cbb6a
Optimised version of rectangle term staples.
...
~3.4x faster than the naive.
2015-12-29 19:22:59 +00:00
Azusa Yamaguchi
f7d61b8b81
Plaq plus rectangle and Iwasaki, Symanzik DBW2.
...
http://arxiv.org/pdf/hep-lat/0610075.pdf plaq and rect regress plausibly over 100 trajectories
and under HMC with average plaq and rectangle coming out ok.
2015-12-28 16:39:26 +00:00
Azusa Yamaguchi
78c4e862ef
Plaq, Rectangle, Iwasaki, Symanzik and DBW2 working and HMC regresses to http://arxiv.org/pdf/hep-lat/0610075.pdf
2015-12-28 16:38:31 +00:00
paboyle
0afcf1cf13
Moved all the HMC tests over to using a single HmcRunner class that manages checkpoint strategies and such like
2015-12-22 11:19:25 +00:00
paboyle
08edbb5cbe
HMC bit repro across checkpoints. Fixed parallel RNG issue with threading.
...
Conclusion: c++11 distributions not thread safe and must use distinct dist as well as distinct engine
per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure
repro across checkpoints.
2015-12-22 08:54:40 +00:00
paboyle
0abfbcc8eb
Naming of files improvement.
2015-12-21 15:37:26 +00:00
paboyle
1b94253ba4
Logging improvement
2015-12-21 15:36:28 +00:00
paboyle
36e6f9ac7b
Bug fix. Guess not initialised in refresh step; didn't hit before due to luck in not having a vector
...
created with NAN data.
2015-12-21 15:34:35 +00:00
paboyle
2f41691c11
Bug fix. Guess was not zeroed prior to CG call. Was earlier accidentally benign just due to luck.
2015-12-21 15:33:36 +00:00
paboyle
31ca609d12
HMC checkpointing.
...
Need a general HMC framework to work in restart.
2015-12-20 02:29:51 +00:00
paboyle
e108e708a3
Wilson TM tests and compiles in
2015-12-17 23:06:33 +00:00
paboyle
67ccb043f1
Added TM fermions for DSDR etc..
2015-12-17 22:34:28 +00:00
Jung
eb1759d7ea
Added Gparity instantiation to no HANDOPT case
...
deleted configure (as intended?)
2015-12-16 00:04:09 -05:00
paboyle
34a0fde2ad
Fixes to fermion force terms after sign of gamma_mu (0...3) change.
...
Thought I had already committed these.
Believe I have got the Gparity fermion force working.
* tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion
based on a small dt update of U field.
* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.
Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
Jung
bc34b7e808
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
tests/Make.inc
2015-12-15 11:11:59 -05:00
Jung
284453c5e9
Added gparity mobius defs, added params to ScaledShamir
...
checking in before pulling master
2015-12-14 12:15:06 -05:00
paboyle
3ce10aa975
Fix a regression failure on Mobius; chroma regression added
2015-12-10 22:55:00 +00:00
Jung
f2b4edc090
Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention)
2015-12-07 02:04:57 -05:00
paboyle
b2c02a6106
Runs fastest on Cori
2015-11-28 16:58:16 -08:00
paboyle
e9ff25b06b
Small threading change makes a difference on Cori.
2015-11-07 00:07:05 -08:00
paboyle
05a7029600
Stencil change
2015-11-07 00:06:31 -08:00
paboyle
899ca41cb8
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
paboyle
d29b4c1dee
Assembler files
2015-11-06 03:48:48 -08:00
paboyle
a2ff068e29
Asm and threading for many core
2015-11-06 03:47:14 -08:00
paboyle
17af18dcab
Changes for AVX512 assembler
2015-11-06 03:45:51 -08:00
Peter Boyle
28022755ae
Stencil class name global change to StencilImpl typedef
2015-11-06 05:30:17 -06:00
paboyle
1159de165c
Asm option for AVX512
2015-11-05 22:04:51 -08:00
paboyle
16c7993434
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/simd/Grid_avx512.h
lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
4e65ad21ac
Adding a routine for AVX512 / IMCI with explicit assembly implementations
2015-11-04 03:15:08 -08:00
Peter Boyle
abb23df83f
formatting only
2015-11-04 10:00:27 +00:00