paboyle
2ba7d43ddd
Divide handling
2016-09-26 09:43:14 +01:00
paboyle
836e929565
Divide handling improved
2016-09-26 09:42:22 +01:00
paboyle
b6713ecb60
Momentum space rules for Overlap, DWF untested to date
2016-09-26 09:39:09 +01:00
paboyle
52a39f0fcd
Divide in ET
2016-09-26 09:38:38 +01:00
paboyle
81a7a03076
Integer <<
2016-09-26 09:38:17 +01:00
paboyle
16b37b956c
divide goes to ET
2016-09-26 09:37:59 +01:00
paboyle
567b6cf23f
demangle moves to logging
2016-09-26 09:36:51 +01:00
paboyle
296396646d
FPE's on macos set up
2016-09-26 09:36:14 +01:00
paboyle
8535d433a7
Cold or hot must support any precisoin
2016-08-31 00:27:53 +01:00
paboyle
b573d1f35a
Wilson tree level added
2016-08-31 00:27:04 +01:00
paboyle
0c1d7e4daf
Mom space prop for Wilson action
2016-08-31 00:26:36 +01:00
paboyle
02e983a0cd
Momentum space prop and free prop convolution
2016-08-31 00:26:02 +01:00
paboyle
d15ab66aae
FFT moves higher in include order
2016-08-31 00:25:22 +01:00
paboyle
9005b82c6d
Multi dim FFT, and normalisation fix
2016-08-31 00:24:52 +01:00
paboyle
3475f45ce7
Demangle support for typeid stuff
2016-08-31 00:23:48 +01:00
paboyle
0744f38866
Demangle support is useful
2016-08-31 00:23:28 +01:00
paboyle
8c89391c02
FFTW unresolved fixed when no fftw3.h
2016-08-24 16:41:47 +01:00
paboyle
bfac5195b8
tidy up
2016-08-24 16:38:36 +01:00
paboyle
744691097f
Printing
2016-08-24 15:05:56 +01:00
paboyle
ff6da364e8
FFT double and single precision gives good performance now in multithreaded code.
2016-08-24 15:05:00 +01:00
paboyle
88be3b39bb
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-08-22 18:29:36 +01:00
paboyle
356e7940fd
fftw can be switched off
2016-08-22 16:24:49 +01:00
paboyle
73ce476890
Include fftw headers
2016-08-22 16:24:21 +01:00
paboyle
e423a09974
FFT improved and test_FFT passing under MPI 8 processes, 8^4 for LatticeComplexD and LatticeSpinMatrixD
2016-08-18 02:23:21 +01:00
paboyle
17097a93ec
FFTW test ran over 4 mpi processes.
2016-08-17 01:33:55 +01:00
paboyle
4ab7dbfd57
Instantiate
2016-08-15 23:00:40 +01:00
paboyle
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
paboyle
32bc7a6ab8
MPI back out of change that hangs
...
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
93d29bb699
build system improvements after discussion with Peter
2016-08-04 16:19:59 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
paboyle
f9e90eeb1f
Sign error on the force for 4d fields fixed
2016-07-16 01:52:44 +01:00
paboyle
fad5c675eb
sign error on the 4d gparity force
2016-07-16 01:51:56 +01:00
paboyle
4908b77d46
Fixed conflicts. PLEASE avoid making wholesale cosmetic only changes, this created
...
a HUGE amount of difficult to resolve and understand conflicts .
Wholesale formatting, reordering functions etc... in a central file like Tensor_class
or Grid_vector_types while others are also editing without making substantial functionality
changes creates pain.
2016-07-15 20:59:07 +01:00
paboyle
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
paboyle
980ff18956
Solving the instantiation no compile issue
2016-07-15 17:19:44 +01:00
paboyle
1a6c7204ac
Disable instantiation; Use cache version instead
2016-07-15 00:34:39 +01:00
paboyle
49310fbab3
Done with red black change over
2016-07-15 00:08:43 +01:00
paboyle
5c0c8efb9e
Updated file list
2016-07-15 00:02:11 +01:00
paboyle
dfd714e1ef
Multiple implementations for the 5d hopping terms, depending on cache friendly
...
ops and/or the 5th direction being vectorised
All use 4d redblack.
2016-07-15 00:00:09 +01:00
paboyle
79a8ca1a62
Rewrite for performance. Impl dependent instantiations give
...
4d linalg impls of the 5d hopping terms (and inverse)
Cache friendly loop orderings of the above
Dense matrix stored and apply to the above
-- Switch to Ls vectorised, and use dense matrix approach for the MooeeInv
and rotate/shift of the Mooee M5D routines.
2016-07-14 23:58:15 +01:00
paboyle
fb45eb2eb2
5d ls vec rename of impl class
2016-07-14 23:57:26 +01:00
paboyle
a307274c96
Fermion impl rename for ls vectorised 5d approaches
2016-07-14 23:56:13 +01:00
paboyle
3f2c44a5fe
Updating the class to 5d selection based on impl type
2016-07-14 23:55:26 +01:00
paboyle
48fb1cdc11
Update domain 5d vectorised impl type, move the type over to 4d redblack with
...
the dense OO inverse
2016-07-14 23:54:35 +01:00
paboyle
8a79e93cc2
Rename the 5d domain wall fermion vectorised Ls impl class
2016-07-14 23:53:00 +01:00
paboyle
dd62a61c5c
Added broadcast and rotation of simd vectors
2016-07-14 23:49:00 +01:00
paboyle
8f47d0b5ab
Rotation needed for hopping term in fifth dim with Ls vectorised fields
2016-07-14 23:45:36 +01:00
paboyle
42af132dab
Fix for chris kellys request to peek poke on checkerboarded fields
2016-07-14 23:44:48 +01:00
paboyle
adbc7c1188
Adding files for multiple implementations (cache opt) and Ls vectorisation
...
of the 5D cayley form chiral fermions for the 5d matrix. With Ls entirely
in the vector direction, s-hopping terms involve rotations.
The serial dependence of the LDU inversion for Mobius and 4d even odd
checkerboarding is removed by simply applying Ls^2 operations (vectorised
many ways) as a dense matrix operation.
This should give similar throughput but high flops (non-compulsory flops)
but enable use of the KNL cache friendly kernels throughout the code.
Ls is still constrained to be a multiple of Nsimd, which is as much as 8 for AVX512
with single precision.
2016-07-14 22:59:21 +01:00