paboyle
84b441800f
Merge branch 'develop' into feature/lanczos-reorg
2017-10-27 14:21:38 +01:00
paboyle
1ef424b139
Split grid Y2K bug fix attempt
2017-10-27 14:20:35 +01:00
paboyle
08583afaff
Red black friendly coarsening
2017-10-25 23:51:18 +01:00
paboyle
08ca338875
Split grid communication
2017-10-09 23:19:45 +01:00
paboyle
4f8b6f26b4
Merge branch 'develop' into feature/dwf-multirhs
2017-10-02 11:41:49 +01:00
Azusa Yamaguchi
d9cd4f0273
Staggered multinode block cg debugged. Missing global sum.
Code stalls and resumes on KNL at Cambridge. Curious.
CG iterations 23ms each, then 3200 ms pauses. Mean bandwidth reports
as 200MB/s. Comms dominant in the report. However, the time behaviour suggests it
is *bursty*.... Could be swap to disk?
2017-08-23 15:07:18 +01:00
azusayamaguchi
659d7d1a40
For test/solver
Fixed
2017-07-12 15:01:48 +01:00
paboyle
349d75e483
Precision fix
2017-06-23 02:57:59 -07:00
Azusa Yamaguchi
e9cc21900f
Block solver complete for staggered. Now stable on mass 0.003 and
gives 8x (!) speed up on Haswell laptop vs. standard CG for 8 RHS solves.
166 iterations vs. 537 iterations so algorithmic gain + 2x in flop rate gain.
Better than a slap in the face with a wet kipper.
2017-06-20 12:37:41 +01:00
Azusa Yamaguchi
cfe3cd76d1
Block solver improvements
2017-06-19 14:04:21 +01:00
paboyle
c85024683e
Merge branch 'feature/parallelio' into develop
2017-06-19 01:39:48 +01:00
Peter Boyle
6f687a67cd
As local vols increase, use 64 bits for safety
2017-06-01 17:36:18 -04:00
paboyle
58e8d0a10d
reverse direction lexico mapping
2017-05-30 23:38:30 +01:00
Guido Cossu
ab3596d4d3
Using Cayley-Hamilton form for the exponential of SU(3) matrices
2017-05-25 12:07:47 +01:00
Guido Cossu
a8fb2835ca
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2017-05-18 14:45:00 +01:00
Azusa Yamaguchi
f46a67ffb3
No compile issue on clang on mac fixed.
Compiler version was clang++-3.9 under mpicxx
2017-05-17 10:51:01 +01:00
Guido Cossu
10f2872aae
Faster exponentiation for lattice fields
2017-05-15 15:51:16 +01:00
paboyle
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
paboyle
697c0603ce
SITMO I/O for NERSC working now bit repro
2017-05-05 16:54:44 +01:00
Guido Cossu
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
paboyle
8e161152e4
MultiRHS solver improvements with slice operations moved into lattice and sped up.
Block solver requires a lot of performance work.
2017-04-18 10:51:55 +01:00
paboyle
3141ebac10
MultiRHS working, starting to optimise. Block doesn't and I thought it already was; puzzled.
2017-04-17 10:50:19 +01:00
paboyle
7ede696126
Non compile of tests fixed
2017-04-16 23:40:00 +01:00
paboyle
bf516c3b81
higher precision reduction variables in norm and inner product
2017-04-15 12:27:28 +01:00
paboyle
441a52ee5d
First cut at higher precision reduction
2017-04-15 10:57:21 +01:00
paboyle
683550f116
Const args improvement
2017-04-09 23:41:04 +09:00
Guido Cossu
8c540333d5
Merge branch 'develop' into feature/hmc_generalise
2017-04-05 14:41:04 +01:00
paboyle
83f6fab8fa
Big/Small crush test, and fast SITMO rng init, faster but not ideal
MT and Ranlux init.
2017-04-02 12:10:51 +09:00
paboyle
9dc7ca4c3b
Sitmo fast init
2017-04-02 00:28:22 +09:00
Guido Cossu
b3dede4dd3
Merge branch 'develop' into feature/hmc_generalise
2017-03-10 23:57:37 +09:00
paboyle
586a7c90b7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:26:59 +00:00
paboyle
e099dcdae7
Merge branch 'develop' into feature/bgq-asm
2017-02-23 00:25:29 +00:00
paboyle
4e7ab3166f
Refactoring header layout
2017-02-22 18:09:33 +00:00
paboyle
aac80cbb44
Bug fix from Chris K
2017-02-22 12:19:09 -05:00
Francesco Sanfilippo
15e668eef1
now it is possible to pass {coords list} to a peek or poke
2017-02-21 22:48:38 +01:00
paboyle
3ae92fa2e6
Global changes to parallel_for structure.
Move the comms flags to more sensible names
2017-02-21 05:24:27 -05:00
Guido Cossu
e0571c872b
Merge branch 'develop' into feature/hmc_generalise
2017-02-09 16:12:00 +00:00
paboyle
71ac2e7940
Faster RNG init
2017-02-07 01:33:23 -05:00
paboyle
fdc170b8a3
Parallel fors in lattice transfer
2017-02-07 01:16:39 -05:00
Guido Cossu
899e685627
Merge branch 'feature/sitmo_rng' into develop
2017-01-27 14:15:56 +00:00
Guido Cossu
ef8d3831eb
Temporary patch the threading error in InsertSlice and ExtractSlice
Find source and fix the error
2017-01-25 18:12:04 +00:00
Guido Cossu
7b40a3e3e5
Reorganizing files
2017-01-25 18:09:46 +00:00
Guido Cossu
677757cfeb
Added and tested SITMO PRNG
2017-01-25 12:47:22 +00:00
Guido Cossu
17629b8d9e
Merge branch 'develop' into feature/hmc_generalise
2017-01-25 11:33:53 +00:00
Guido Cossu
851f2ad8ef
Adding fermions actions support in the factories
2017-01-19 10:00:02 +00:00
91a3534054
Lattice slice utilities now thread safe
2017-01-16 06:32:25 +00:00
Guido Cossu
ce1a115e0b
Removing redundant arguments for integrator functions, step 1
2016-12-20 17:51:30 +00:00
Guido Cossu
2bd4233919
Completed testing of the HMC for Ls vectorised version (on AVX2)
2016-12-07 04:56:37 +00:00
Guido Cossu
b812d5e39c
Added single threaded version of the derivative for the Ls vectorised DWF
2016-12-06 16:31:13 +00:00
Guido Cossu
a783282b8b
Merge branch 'develop' into feature/hmc_generalise
2016-11-10 18:13:07 +00:00
ca21003f01
Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
paboyle
b820076b91
Merge branch 'develop' into feature/mpi3
2016-10-25 06:02:33 +01:00
azusayamaguchi
8f8058f8a5
More random bits on parallel seeding
2016-10-25 01:05:52 +01:00
392e064513
fast local peek-poke
2016-10-24 19:24:21 +01:00
Guido Cossu
f55c16f984
Adding a barrier in the RNG save
2016-10-24 11:02:14 +01:00
Guido Cossu
3e990c9d0a
Reverting the broadcast change
2016-10-22 13:26:43 +01:00
bd6a228af6
Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules
2016-10-21 13:10:30 +01:00
paboyle
f9d5e95d72
allocator template typedefs moved to AlignedAllocator
2016-10-20 16:59:39 +01:00
Guido Cossu
590675e2ca
Csum in hex format
2016-10-19 17:26:25 +01:00
Guido Cossu
26b9740d53
Some fix for the GenericHMCrunner
2016-10-10 09:43:05 +01:00
paboyle
52a39f0fcd
Divide in ET
2016-09-26 09:38:38 +01:00
paboyle
16b37b956c
divide goes to ET
2016-09-26 09:37:59 +01:00
paboyle
17097a93ec
FFTW test ran over 4 mpi processes.
2016-08-17 01:33:55 +01:00
paboyle
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
Christopher Kelly
6f47fbb1e2
Disabled parallel for loops in ExtractSlice and InsertSlice due to race conditions. Likely will need to do so for localConvert too.
2016-07-13 10:49:18 -04:00
paboyle
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
Guido Cossu
3c49ddfaa4
Merge branch 'temporary-smearing' into develop
2016-07-07 14:04:59 +01:00
Guido Cossu
ffb8b3116c
Tested smeared RHMC Wilson1p1, accepting
2016-07-07 11:49:36 +01:00
Christopher Kelly
dd8cfff111
Another fix for pedantic compilers
2016-07-06 18:22:15 -04:00
Christopher Kelly
184642adb0
Fix for pedantic compilers
2016-07-06 18:15:15 -04:00
Christopher Kelly
85ed8175cb
Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h.
2016-07-06 15:57:04 -04:00
Guido Cossu
e87182cf98
Debugged the copy constructor of the Lattice class
2016-07-06 15:31:00 +01:00
Guido Cossu
e3d5319470
Debugged the real() and imag() functions and added tests to Test_Simd
2016-07-06 14:16:03 +01:00
Guido Cossu
9cb90f714e
Merge remote-tracking branch 'origin/develop' into temporary-smearing
2016-07-04 17:28:40 +01:00
Guido Cossu
8dd099267d
Corrected a bug in the Expression Templates (acos and asin were wrong)
2016-07-03 12:28:25 +01:00
3789e3f31c
additional fixes in slice functions
2016-05-12 18:35:38 +01:00
0c66719210
const fix in slice functions
2016-05-12 13:01:35 +01:00
e3083b6dfc
Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab'
2016-05-11 15:20:41 +01:00
paboyle
ab89418658
Precision change going in; useful for mixed precision algorithms for example.
2016-05-11 15:18:47 +01:00
paboyle
28cd99882c
Subslicing
2016-05-11 15:06:54 +01:00
paboyle
aceaee774c
ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
101aa769eb
LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers
2016-05-04 12:15:31 -07:00
c4c89336fe
SliceSum: shutting down warning about non-threaded code for now
2016-05-01 18:29:57 -07:00
paboyle
f7ca6ca889
Bernoulli reenabled -- using integral type for the discrete_distribution, but
then casts in the fill
2016-04-30 03:48:28 -07:00
paboyle
ec4a9b7f6c
The Bernoulli gives a no compile due to a static assertion that the type be integral
in 4.7 random.h
Probably need to go through an Integer type, and then convert to real after the random draw
to make clean.
2016-04-30 03:42:24 -07:00
f6c53e5039
Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2'
2016-04-30 00:17:52 -07:00
23b6172c31
Bernoulli RNG
2016-04-30 00:14:13 -07:00
paboyle
574ea4f843
const safety
2016-04-19 15:15:11 -07:00
2d8bb356e3
Smearing routines compile (still untested)
2016-02-25 02:43:59 +09:00
a7251f28c7
Stout smearing compiles (untested)
2016-02-24 03:16:50 +09:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
08edbb5cbe
HMC bit repro across checkpoints. Fixed parallel RNG issue with threading.
Conclusion: c++11 distributions not thread safe and must use distinct dist as well as distinct engine
per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure
repro across checkpoints.
2015-12-22 08:54:40 +00:00
paboyle
09bfe52840
Remove extraneous variable
2015-12-21 15:30:28 +00:00
paboyle
31ca609d12
HMC checkpointing.
Need a general HMC framework to work in restart.
2015-12-20 02:29:51 +00:00
paboyle
5710966324
Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time.
Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file.
Store a Parallel (one per site) and a single serial RNG file.
2015-12-19 18:32:25 +00:00
paboyle
02d730513a
Divide function
2015-11-28 16:54:43 -08:00
Peter Boyle
6d06bd9493
Minor change in commented out code
2015-10-09 00:42:21 +02:00
Peter Boyle
5ef42add2d
Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
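
Note on commit 08edbb5cbe (2015-12-22): the message says a C++11 distribution carries its own state (a Box-Muller style generator can hold a spare value), so the distribution is reset on each fill to keep results reproducible across checkpoints. The sketch below illustrates that idea only. It is not Grid's code: the helper names `fill_gaussian` and `restart_reproduces`, the seed, and the use of `std::mt19937_64` are all assumptions made for illustration. It saves only the engine state, restores it into a new engine with a fresh distribution, and checks that the next fill matches the uninterrupted run.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <sstream>
#include <vector>

// Hypothetical helper (not Grid's API): fill `out` with Gaussian draws.
// The distribution is reset first, so no value cached by an earlier fill
// can leak into this one.
inline void fill_gaussian(std::mt19937_64 &engine,
                          std::normal_distribution<double> &dist,
                          std::vector<double> &out) {
  dist.reset();
  for (double &x : out) x = dist(engine);
}

// Hypothetical check: returns true when a run restored from a saved engine
// state reproduces the fill of the uninterrupted run.
inline bool restart_reproduces(std::size_t warmup, std::size_t n) {
  assert(n > 0);
  std::mt19937_64 engine(12345);  // arbitrary seed, an assumption
  std::normal_distribution<double> dist(0.0, 1.0);

  // Draws made before the checkpoint; an odd count may leave a cached value
  // inside `dist` on some standard library implementations.
  std::vector<double> scratch(warmup);
  fill_gaussian(engine, dist, scratch);

  // The checkpoint stores the engine state only, not the distribution.
  std::stringstream checkpoint;
  checkpoint << engine;

  // Uninterrupted run: continue with the same engine and distribution.
  std::vector<double> original(n);
  fill_gaussian(engine, dist, original);

  // Restarted run: restore the engine and start from a fresh distribution.
  std::mt19937_64 restored;
  checkpoint >> restored;
  std::normal_distribution<double> fresh(0.0, 1.0);
  std::vector<double> replay(n);
  fill_gaussian(restored, fresh, replay);

  return original == replay;
}
```

Calling `restart_reproduces(3, 8)` compares eight draws after a three-draw warm-up. The thread-safety half of the commit message (a distinct engine and distinct distribution per site) is not shown here.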