Guido Cossu
e1042aef77
First version of the doube prec for testing purposes
...
It does not compile single and double version at the same time
2016-10-28 17:20:04 +01:00
paboyle
aa6a839c60
avx512 build fix; detect clang/gcc intrinsics vs. ICPC
2016-10-28 09:13:09 +01:00
b4d2af8c89
threaded FFT
2016-10-26 19:46:36 +01:00
434af6aeaa
Merge branch 'develop' into feature/fft-opt
2016-10-26 18:50:38 +01:00
e90f8ac841
Merge branch 'develop' into feature/feynman-rules
2016-10-26 18:50:21 +01:00
a1705a8d53
debug message removed
2016-10-26 18:50:07 +01:00
ca21003f01
Merge branch 'feature/fft-opt' into feature/feynman-rules
...
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
14ddf2c234
more FFT optimisations
2016-10-26 17:36:26 +01:00
Guido Cossu
1d666771f9
Debugging the RNG, eliminate the barrier after broadcast
2016-10-26 16:08:23 +01:00
Guido Cossu
d50055cd96
Making the ILDG support optional
2016-10-26 09:48:01 +01:00
Azusa Yamaguchi
bca861e112
Note:FFT shoud be GridFFT (Not change yet).
...
Gauge fix with FFt is added (tests/core)
2016-10-25 14:21:48 +01:00
33d199a0ad
temporary thread safety in FFT
2016-10-25 12:56:40 +01:00
paboyle
b820076b91
Merge branch 'develop' into feature/mpi3
2016-10-25 06:02:33 +01:00
paboyle
09f66100d3
MPI 3 compile on non-linux
2016-10-25 06:01:12 +01:00
azusayamaguchi
d7d92af09d
Travis fail fix attempt
2016-10-25 01:45:53 +01:00
azusayamaguchi
460d0753a1
Merge branch 'develop' into feature/mpi3
...
Conflicts:
lib/simd/Grid_avx512.h
2016-10-25 01:08:51 +01:00
azusayamaguchi
8f8058f8a5
More random bits on parallel seeding
2016-10-25 01:05:52 +01:00
azusayamaguchi
d97a27f483
Verbose
2016-10-25 01:05:31 +01:00
azusayamaguchi
7c3363b91e
Compiles all comms targets
2016-10-25 00:04:17 +01:00
azusayamaguchi
b94478fa51
mpi, mpi3, shmem all compile.
...
mpi, mpi3 pass single node multi-rank
2016-10-24 23:45:31 +01:00
Guido Cossu
47c7159177
ILDG reader/writer works
...
Fill the xml header with the required information, todo.
2016-10-24 21:57:54 +01:00
13bf0482e3
FFT optimisation
2016-10-24 19:25:40 +01:00
a795b5705e
memory optimisation
2016-10-24 19:25:15 +01:00
392e064513
fast local peek-poke
2016-10-24 19:24:21 +01:00
azusayamaguchi
b6a65059a2
Update to use shared memory to contain the stencil comms buffers
...
Tested on 2.1.1.1 1.2.1.1 4.1.1.1 1.4.1.1 2.2.1.1 subnode decompositions
2016-10-24 17:30:43 +01:00
Guido Cossu
f415db583a
Adding ILDG format
2016-10-24 15:48:22 +01:00
Guido Cossu
f55c16f984
Adding a barrier in the RNG save
2016-10-24 11:02:14 +01:00
azusayamaguchi
ea25a4d9ac
Works
2016-10-23 06:10:05 +01:00
azusayamaguchi
c190221fd3
Internal SHM comms in non-simd directions working
...
Need to fix simd directions
2016-10-22 18:14:27 +01:00
Guido Cossu
df67e013ca
More debug output for the RNG
2016-10-22 13:34:17 +01:00
Guido Cossu
3e990c9d0a
Reverting the broadcast change
2016-10-22 13:26:43 +01:00
Guido Cossu
4b740fc8fd
Debugging the RNG state save
2016-10-22 13:06:00 +01:00
azusayamaguchi
0fcd2e7188
Simplify the comms structure prior to implementing Shared memory direct bouncs
2016-10-21 22:44:10 +01:00
azusayamaguchi
910b8dd6a1
use simd type
2016-10-21 22:35:29 +01:00
azusayamaguchi
75ebd3a0d1
Typo fixes and rotate for CLANG
2016-10-21 22:34:29 +01:00
Guido Cossu
cccd14b09e
Small cleanup
2016-10-21 17:20:54 +01:00
Guido Cossu
e6acffdfc2
Fixing the plaquette computation
2016-10-21 16:06:34 +01:00
7c8f79b147
more stochastic QED fixes
2016-10-21 15:20:12 +01:00
azusayamaguchi
09fd5c43a7
Reasonably fast version
2016-10-21 15:17:39 +01:00
462921e549
QED: fix stochastic field
2016-10-21 14:41:08 +01:00
Guido Cossu
392130a537
Working on the 5d
2016-10-21 14:22:25 +01:00
azusayamaguchi
f22317748f
Merge branch 'feature/mpi3' of https://github.com/paboyle/Grid into feature/mpi3
2016-10-21 13:36:35 +01:00
azusayamaguchi
6a9eae6b6b
Reporting improvements
2016-10-21 13:36:18 +01:00
azusayamaguchi
fad96cf250
StencilBufs
2016-10-21 13:36:00 +01:00
azusayamaguchi
f331809c27
Use variable type for loop
2016-10-21 13:35:37 +01:00
bd6a228af6
Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules
2016-10-21 13:10:30 +01:00
63d219498b
first (dirty) implementation of Feynman stoctachtic EM field
2016-10-21 13:10:13 +01:00
paboyle
2c54a53d0a
Compile verbose reduce
2016-10-21 12:12:14 +01:00
paboyle
306160ad9a
bcopy threaded
2016-10-21 12:07:28 +01:00
azusayamaguchi
20a091c3ed
Intel vs. Clang intrinsics differences absorbed
2016-10-21 09:08:36 +01:00
azusayamaguchi
202078eb1b
Cray / OpenSHMEM ordering differs
2016-10-21 09:07:20 +01:00
paboyle
a762b1fb71
MPI3 working with a bounce through shared memory on my laptop.
...
Longer term plan: make the "u_comm_buf" in Stencil point to the shared region and avoid the
send between ranks on same node.
2016-10-21 09:03:26 +01:00
Guido Cossu
deef2673b2
Separating the Lattice theories stub from the QCD.h file
2016-10-20 17:24:08 +01:00
paboyle
5b5925b8e5
Forgot to add
2016-10-20 17:09:40 +01:00
Guido Cossu
977b0a6dd9
Merge branch 'develop' into feature/hmc_generalise
2016-10-20 17:04:41 +01:00
Guido Cossu
977d844394
Few modifications on stdout messages
2016-10-20 17:01:59 +01:00
paboyle
b58adc6a4b
commVector
2016-10-20 17:00:15 +01:00
paboyle
f9d5e95d72
allocator template typedefs moved to AlignedAllocator
2016-10-20 16:59:39 +01:00
paboyle
4f8e636a43
commVector
2016-10-20 16:59:16 +01:00
paboyle
9b39f35ae6
commVector different for SHMEM compat
2016-10-20 16:58:53 +01:00
paboyle
5fe2b85cbd
MPI3 and shared memory support
2016-10-20 16:58:01 +01:00
paboyle
c7cccaaa69
Comm vector for shmem
2016-10-20 16:57:31 +01:00
paboyle
cbcfea466f
MPI3
2016-10-20 16:57:14 +01:00
paboyle
4955672fc3
MPI3
2016-10-20 16:57:00 +01:00
paboyle
8c043da5b7
SHMEM and comms allocator made different
2016-10-20 16:56:05 +01:00
paboyle
3cbe974eb4
Layout
2016-10-20 16:55:21 +01:00
997fd882ff
Merge branch 'develop' into feature/feynman-rules
...
# Conflicts:
# lib/Threads.h
# lib/qcd/action/fermion/WilsonFermion.cc
# lib/qcd/action/fermion/WilsonFermion.h
# lib/qcd/utils/SUn.h
# lib/simd/Grid_avx.h
# lib/simd/Intel512common.h
2016-10-19 18:35:18 +01:00
Guido Cossu
590675e2ca
Csum in hex format
2016-10-19 17:26:25 +01:00
Guido Cossu
8c65bdf6d3
Printing checksum for the RNG file
2016-10-19 16:56:11 +01:00
Guido Cossu
74f1ed3bc5
Adding some documentation for HMC
2016-10-19 10:51:13 +01:00
paboyle
7af9b87318
Cache face tables to improve performance.
...
Extract merge now looking poor.
2016-10-18 09:51:37 +01:00
paboyle
811ca45473
GNU clang hack for AVX512 since there are missing reduce intrinsics in Clang 3.9 and GCC-6 AVX512 support
2016-10-17 16:23:21 +01:00
paboyle
bc1a4d40ba
Faster integer handling avoid push_back
2016-10-17 16:16:44 +01:00
Guido Cossu
e250e6b7bb
Moving parameters outside of the HMCrunner
2016-10-14 17:22:32 +01:00
paboyle
c8079e6621
Time the face gateher in x-dir more carefully
2016-10-13 22:28:50 +01:00
azusayamaguchi
8b0d171c9a
32bit issue on the KNL code variant where byte offsets were stored
2016-10-12 17:49:32 +01:00
azusayamaguchi
8bbd9ebc27
Reversing changes to Stencil class
2016-10-12 13:47:20 +01:00
azusayamaguchi
6472b431f0
__rdpmc needed for gcc, clang++
2016-10-12 12:29:08 +01:00
azusayamaguchi
bd205a3293
Fixing for non x86 and non KNL
2016-10-12 12:09:15 +01:00
azusayamaguchi
496beffa88
Fix non-KNL build
2016-10-12 12:06:08 +01:00
azusayamaguchi
9b63e97108
align not absolutely required and confuses clang++
2016-10-12 11:51:21 +01:00
azusayamaguchi
81f2aeaece
KNL streaming stores, and KNL performance coutners
2016-10-12 11:45:22 +01:00
paboyle
2d4a45c758
Typecast pointer
2016-10-12 09:14:15 +01:00
paboyle
a123dcd7e9
Static required for shmem. Reading same object twice requires csum reset
2016-10-12 00:29:57 +01:00
paboyle
6b27c42dfe
Cosmetic
2016-10-12 00:29:39 +01:00
paboyle
f7c2aa3ba5
runtime by default
2016-10-12 00:29:13 +01:00
paboyle
7240d73184
Parallelise the x faces; fix the segv on KNL with comms
2016-10-11 22:21:07 +01:00
paboyle
42cd148f5e
Base pointer for comms buffer under AVX512 assembly
2016-10-11 16:06:06 +01:00
Guido Cossu
eda4dd622e
Some more edit
2016-10-11 15:45:20 +01:00
paboyle
6e01264bb7
don't use static by default
2016-10-11 10:03:39 +01:00
paboyle
6f408256bc
FMA4 option moved on the align
2016-10-11 10:03:01 +01:00
paboyle
8d11681aac
verbose remove
2016-10-10 23:50:42 +01:00
paboyle
3d5c9a1ee9
No compile fix on clang++ 3.9
2016-10-10 23:50:13 +01:00
paboyle
dc389e467c
axpy_ssp for any coeff type via template
2016-10-10 23:48:05 +01:00
paboyle
3619167d62
Mass parameter
2016-10-10 23:47:33 +01:00
paboyle
96f1d1b828
Debugged Domain wall and Overlap feynman rules (infinite Ls, finite mass).
2016-10-10 23:46:45 +01:00
paboyle
657e0a8f4d
Mass parameter
2016-10-10 23:46:10 +01:00
paboyle
616e7cd83e
Mass parameter
2016-10-10 23:45:48 +01:00
paboyle
6f26d2e8d4
Overlap tree level feynman rule
2016-10-10 23:45:18 +01:00
paboyle
c014574504
A "please implement me" feynman rule. If this were abstract virtual it would
...
require/force implementation
2016-10-10 23:44:00 +01:00
paboyle
d7ce164e6e
Feynman rule for DWF
2016-10-10 23:43:36 +01:00
paboyle
c0d5b99016
Dminus
2016-10-10 23:43:19 +01:00
paboyle
09ca32d678
Dminus added for Cayley
2016-10-10 23:42:55 +01:00
paboyle
082ae350c6
static schedule by default
2016-10-10 23:42:30 +01:00
Guido Cossu
611b5d74ba
Fix for AVX+FMA3 compilation
2016-10-10 15:26:17 +01:00
Guido Cossu
b56c9ffa52
Fix for AVXFMA
2016-10-10 14:43:37 +01:00
Guido Cossu
c68a2b9637
Minor fix
2016-10-10 11:54:58 +01:00
Guido Cossu
293df6cd20
Generalising the HMCRunner and moving parameters to the user level
2016-10-10 11:49:55 +01:00
Guido Cossu
65f61bb3bf
Reset QCD colours to 3
2016-10-10 09:46:17 +01:00
Guido Cossu
26b9740d53
Some fix for the GenericHMCrunner
2016-10-10 09:43:05 +01:00
cb02b7088f
Merge branch 'develop' into feature/doxygen
...
# Conflicts:
# configure.ac
2016-10-09 13:35:44 +01:00
Guido Cossu
6eb873dd96
Added scalar action phi^4
...
Check Norm2 output (Complex type assumption)
2016-10-07 17:28:46 +01:00
Guido Cossu
11b4c80b27
Added support for hmc and binary IO for a general field
2016-10-07 13:37:29 +01:00
Guido Cossu
2e453dfbf5
Added some instrumentation to benchmark the force computation
2016-10-06 17:52:45 +01:00
Guido Cossu
c065e454c3
Adding Binrary IO, untested
2016-10-06 10:12:11 +01:00
paboyle
4089984431
Timing hooks
2016-10-06 09:25:12 +01:00
Guido Cossu
c78bbd0f8c
Fix ASM compilation
2016-10-04 15:37:32 +01:00
Guido Cossu
d9b5fbd374
In the middle of adding a general binary writer
2016-10-04 11:24:08 +01:00
Guido Cossu
cfbc1a26b8
Now the gauge implementation has to take care of the Nexp
2016-10-03 16:20:06 +01:00
Guido Cossu
257f69f931
One more function to generalise the HMC integrator
2016-10-03 15:50:04 +01:00
Guido Cossu
e415260961
First cut on generalised HMC
...
Backward compatibility OK
2016-10-03 15:28:00 +01:00
536e2ff073
*.inc removed: please don't commit these files either!
2016-09-27 11:54:03 +01:00
paboyle
87acd06990
Use streaming stores
2016-09-26 10:11:34 +01:00
paboyle
9353b6edfe
Fenv out of grid namespace
2016-09-26 10:09:13 +01:00
paboyle
167cc2650e
GNU SOURCE problem on travis
2016-09-26 09:58:09 +01:00
paboyle
7089b6d5a5
Setting up but not implemented some QED rules
2016-09-26 09:43:40 +01:00
paboyle
2ba7d43ddd
Divide handling
2016-09-26 09:43:14 +01:00
paboyle
836e929565
Divide handling improved
2016-09-26 09:42:22 +01:00
paboyle
b6713ecb60
Momentum space rules for Overlap, DWF untested to date
2016-09-26 09:39:09 +01:00
paboyle
52a39f0fcd
Divide in ET
2016-09-26 09:38:38 +01:00
paboyle
81a7a03076
Integer <<
2016-09-26 09:38:17 +01:00
paboyle
16b37b956c
divide goes to ET
2016-09-26 09:37:59 +01:00
paboyle
567b6cf23f
demangle moves to logging
2016-09-26 09:36:51 +01:00
paboyle
296396646d
FPE's on macos set up
2016-09-26 09:36:14 +01:00
Guido Cossu
5c190a1b8c
Merge branch 'develop' into feature/hirep
2016-09-23 11:06:06 +01:00
Guido Cossu
c4ac6e7e8f
Consolidating HMC interface
...
Uniformed interface for standard action in fundamental rep and Hirep
2016-09-23 10:47:42 +01:00
Guido Cossu
510e340e16
Debugged last commit for the Two index representation
2016-09-22 22:16:21 +01:00
Guido Cossu
6ffadca153
Restored number of colours to 3
2016-09-22 14:22:54 +01:00
Guido Cossu
b6597b74e7
Added support for the Two index Symmetric and Antisymmetric representations
...
Tested for HMC convergence: OK
Added also a test file showing an example for mixed representations
2016-09-22 14:17:37 +01:00
a034e9901b
Merge branch 'develop' into feature/hadrons
2016-09-20 13:49:33 +01:00
Antonin Portelli
0724f7af75
QPX single precision implementation
2016-09-19 18:09:12 +01:00
2e74520821
removed libtool use (BG/Q compatibility)
2016-09-16 15:25:49 +01:00
Antonin Portelli
6dd75ad9e5
Merge branch 'develop' of github.com:paboyle/Grid into feature/bgq
2016-09-16 15:07:54 +01:00
Guido Cossu
fda408ee6f
Added first lines for supporting Two Index representations
2016-09-13 10:43:30 +01:00
Guido Cossu
b9c80318a2
Merge branch 'develop' into feature/hirep
2016-09-13 10:01:51 +01:00
Guido Cossu
5df5d52d41
Fix for the Intel compiler
2016-09-12 17:17:20 +01:00
Guido Cossu
f76f281e58
Cleaning files after fix
2016-09-09 11:34:25 +01:00
Guido Cossu
aa20cc8b52
Fixing compilation error with AVX512 flag
2016-09-09 02:58:52 -07:00
Guido Cossu
0fd179fb33
Merge branch 'develop' into feature/hirep
2016-09-01 12:59:53 +01:00
Guido Cossu
f45ef8d114
Minor modification in ActionBase.h
2016-09-01 11:46:46 +01:00
paboyle
8535d433a7
Cold or hot must support any precisoin
2016-08-31 00:27:53 +01:00
paboyle
b573d1f35a
Wilson tree level added
2016-08-31 00:27:04 +01:00
paboyle
0c1d7e4daf
Mom space prop for Wilson action
2016-08-31 00:26:36 +01:00
paboyle
02e983a0cd
Momentum space prop and free prop convolution
2016-08-31 00:26:02 +01:00
paboyle
d15ab66aae
FFT moves higher in include order
2016-08-31 00:25:22 +01:00
paboyle
9005b82c6d
Multi dim FFT, and normalisation fix
2016-08-31 00:24:52 +01:00
paboyle
3475f45ce7
Demangle support for typeid stuff
2016-08-31 00:23:48 +01:00
paboyle
0744f38866
Demangle support is useful
2016-08-31 00:23:28 +01:00
Guido Cossu
fd5614738d
Merge branch 'develop' into feature/hirep
2016-08-30 18:21:36 +01:00
Guido Cossu
b0d3e4bb2c
Separating travis builds
2016-08-30 13:44:07 +01:00
Guido Cossu
b512ccbee6
HMC for Adjoint fermions works
...
Accepts and reproduces known results
Check initial instability of inverters
when starting from hot configurations
2016-08-30 11:31:25 +01:00
paboyle
8c89391c02
FFTW unresolved fixed when no fftw3.h
2016-08-24 16:41:47 +01:00
paboyle
bfac5195b8
tidy up
2016-08-24 16:38:36 +01:00
paboyle
744691097f
Printing
2016-08-24 15:05:56 +01:00
paboyle
ff6da364e8
FFT double and single precision gives good performance now in multithreaded code.
2016-08-24 15:05:00 +01:00
4d11a6f5f2
first commit for QPX intrinsics
2016-08-23 14:41:44 +01:00
paboyle
88be3b39bb
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-08-22 18:29:36 +01:00
paboyle
356e7940fd
fftw can be switched off
2016-08-22 16:24:49 +01:00
paboyle
73ce476890
Include fftw headers
2016-08-22 16:24:21 +01:00
paboyle
e423a09974
FFT improved and test_FFT passing under MPI 8 processes, 8^4 for LatticeComplexD and LatticeSpinMatrixD
2016-08-18 02:23:21 +01:00
paboyle
17097a93ec
FFTW test ran over 4 mpi processes.
2016-08-17 01:33:55 +01:00
paboyle
4ab7dbfd57
Instantiate
2016-08-15 23:00:40 +01:00
paboyle
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
Guido Cossu
9c2e8d5e28
Nc=3 just to let all the test pass in Travis
2016-08-09 15:46:57 +01:00
Guido Cossu
147e2025b9
Added unit tests on the representation transformations
...
Status: Passing all tests
2016-08-08 16:54:22 +01:00
b1cfb4d661
first try at a nicer Doxygen implementation
2016-08-05 15:29:18 +01:00
paboyle
32bc7a6ab8
MPI back out of change that hangs
...
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
7ff7c7d90d
Merge branch 'develop' into feature/hadrons
2016-08-04 16:22:10 +01:00
93d29bb699
build system improvements after discussion with Peter
2016-08-04 16:19:59 +01:00
2485ef9c9c
Merge branch 'feature/new-build' into feature/hadrons
...
# Conflicts:
# Makefile.am
# scripts/copyright
2016-08-03 16:49:16 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
Guido Cossu
49b5c49851
Checked the hermiticity of the op in derivative, ok
...
Still CG fails to converge
2016-07-31 12:37:33 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
Guido Cossu
089f0ab582
Debugged HMC for Creutz relation
2016-07-28 16:44:41 +01:00
Guido Cossu
b93e18ed50
Modified the Dirac Kernel class to compile with different number of colours
...
Added the general push_back functionality to accomodate for all defined representations
Compiles, not tested
2016-07-18 16:36:28 +01:00
Guido Cossu
9c77bb69a5
Added all elements for Hirep HMC
...
TODO: Test and debug
2016-07-18 12:05:23 +01:00
paboyle
f9e90eeb1f
Sign error on the force for 4d fields fixed
2016-07-16 01:52:44 +01:00
paboyle
fad5c675eb
sign error on the 4d gparity force
2016-07-16 01:51:56 +01:00
paboyle
4908b77d46
Fixed conflicts. PLEASE avoid making wholesale cosmetic only changes, this created
...
a HUGE amount of difficult to resolve and understand conflicts .
Wholesale formatting, reordering functions etc... in a central file like Tensor_class
or Grid_vector_types while others are also editing without making substantial functionality
changes creates pain.
2016-07-15 20:59:07 +01:00
paboyle
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
paboyle
980ff18956
Solving the instantiation no compile issue
2016-07-15 17:19:44 +01:00
Guido Cossu
7edf4c6c04
Added HMC utitities for the higher representations
...
TODO: Inherit types for the pseudofermions, Debugging, testing
2016-07-15 13:39:47 +01:00
paboyle
1a6c7204ac
Disable instantiation; Use cache version instead
2016-07-15 00:34:39 +01:00
paboyle
49310fbab3
Done with red black change over
2016-07-15 00:08:43 +01:00
paboyle
5c0c8efb9e
Updated file list
2016-07-15 00:02:11 +01:00
paboyle
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
...
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
...
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2
5d ls vec rename of impl class
2016-07-14 23:57:26 +01:00
paboyle
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00