Before this change AVX512 enabled different instruction sets depending
on the compiler:
For Intel C++ Compiler Classic (ICC):
  AVX512F, AVX512CD, AVX512DQ, AVX512BW, AVX512VL
  i.e. Intel Xeon Skylake and newer
For Intel ICX, gcc, clang:
  AVX512F, AVX512CD, AVX512ER, AVX512PF
  i.e. Intel Xeon Phi x200/x205 (KNL/KNM)
With this commit, AVX512 enables only the instruction sets common to
all CPUs that support any AVX-512 instruction set:
AVX512F and AVX512CD (called COMMON-AVX512 by icc)
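
To check which AVX-512 subsets a given compiler and flag combination actually turns on, one option is to inspect the predefined feature macros (__AVX512F__, __AVX512CD__, and so on) that gcc, clang and the Intel compilers define. The sketch below is not part of Grid; the function names enabled_avx512_subsets and only_common_subsets are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Grid): lists the AVX-512 subsets that the
// compiler enabled for this translation unit, based on predefined macros.
std::vector<std::string> enabled_avx512_subsets() {
  std::vector<std::string> subsets;
#ifdef __AVX512F__
  subsets.push_back("AVX512F");
#endif
#ifdef __AVX512CD__
  subsets.push_back("AVX512CD");
#endif
#ifdef __AVX512DQ__
  subsets.push_back("AVX512DQ");
#endif
#ifdef __AVX512BW__
  subsets.push_back("AVX512BW");
#endif
#ifdef __AVX512VL__
  subsets.push_back("AVX512VL");
#endif
#ifdef __AVX512ER__
  subsets.push_back("AVX512ER");
#endif
#ifdef __AVX512PF__
  subsets.push_back("AVX512PF");
#endif
  return subsets;
}

// Hypothetical helper: true when every listed subset is one of the two
// common ones (AVX512F, AVX512CD). An empty list also counts as "common only".
bool only_common_subsets(const std::vector<std::string>& subsets) {
  for (const auto& name : subsets) {
    if (name != "AVX512F" && name != "AVX512CD") return false;
  }
  return true;
}
```

Calling only_common_subsets(enabled_avx512_subsets()) in a small test program built with the same flags as the library would show whether any subset beyond AVX512F and AVX512CD was enabled.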
* develop:
Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case. Other cases to do. This now includes comms code path.
Better SIMD usage/coalescence
* develop: (26 commits)
Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces. Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC. Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links.
Correct misleading ac help string
Enable performance counting in WilsonFermion like in others
changed back A2AUtils warning
changed if and accelerator_for - no runtime errors any more
Mac OS (Darwin) sed -i flag for in-place editing differs from POSIX / GNU
It seems the intention with the AutoConf-produced Grid/Config.h was to use sed to translate the standard PACKAGE_ #defines into GRID_; however, because the '' after -i was missing, this has not been working. Perhaps it is too late to fix this, since we don't know who/what is relying on this downstream? ... but if they are, and AutoConf is being used, then likely these #defines have just been redefined anyway. Seems reasonable to redefine PACKAGE and VERSION as well, as none of these macros are used throughout Grid or Hadrons.
Fixed compile issues with maxLocalNorm2 for non-scalar lattices. maxLocalNorm2 test now reuses the random field.
MADWF 5d source option for hadrons - look at Grid of source. Abort on GPU error.
maxLocalNorm2()
change back benchmark_ITT
prettify
Flop count matches DiRAC-ITT-2020
revert changes
merge develop
fixes
weird bug in 2pt function...
revert changes
final version, tested on CPU and GPU
bugfix
...
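
The conjugate momentum filter mentioned in the list above can be pictured as a per-site multiplication applied after every momentum update. The following is a minimal sketch of that idea only, not Grid's Integrator API: Field, MaskFilter and update_momentum are hypothetical names, a lattice field is reduced to one real number per site, and the sign convention of the force term is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a lattice field: one real number per site.
using Field = std::vector<double>;

// Hypothetical filter: multiplies each site of the momentum by a per-site
// factor. With factors of 1.0 or 0.0 it acts as a mask that freezes sites.
struct MaskFilter {
  Field mask;

  void apply(Field& momentum) const {
    assert(momentum.size() == mask.size());
    for (std::size_t i = 0; i < momentum.size(); ++i) {
      momentum[i] *= mask[i];
    }
  }
};

// One momentum update P <- P + dt * F followed by the filter, so that
// masked sites stay at zero momentum. The sign of the force term is an
// assumption made for this sketch.
void update_momentum(Field& momentum, const Field& force, double dt,
                     const MaskFilter& filter) {
  assert(momentum.size() == force.size());
  for (std::size_t i = 0; i < momentum.size(); ++i) {
    momentum[i] += dt * force[i];
  }
  filter.apply(momentum);
}
```

Applying the filter once after the momentum refresh and again inside every update keeps masked sites at exactly zero momentum, which is why the corresponding links do not move during the trajectory.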