Guido Cossu
ef8d3831eb
Temporarily patch the threading error in InsertSlice and ExtractSlice
Find source and fix the error
2017-01-25 18:12:04 +00:00
Guido Cossu
7b40a3e3e5
Reorganizing files
2017-01-25 18:09:46 +00:00
Guido Cossu
677757cfeb
Added and tested SITMO PRNG
2017-01-25 12:47:22 +00:00
Guido Cossu
17629b8d9e
Merge branch 'develop' into feature/hmc_generalise
2017-01-25 11:33:53 +00:00
Guido Cossu
851f2ad8ef
Adding fermions actions support in the factories
2017-01-19 10:00:02 +00:00
91a3534054
Lattice slice utilities now thread safe
2017-01-16 06:32:25 +00:00
Guido Cossu
ce1a115e0b
Removing redundant arguments for integrator functions, step 1
2016-12-20 17:51:30 +00:00
Guido Cossu
2bd4233919
Completed testing of the HMC for Ls vectorised version (on AVX2)
2016-12-07 04:56:37 +00:00
Guido Cossu
b812d5e39c
Added single threaded version of the derivative for the Ls vectorised DWF
2016-12-06 16:31:13 +00:00
Guido Cossu
a783282b8b
Merge branch 'develop' into feature/hmc_generalise
2016-11-10 18:13:07 +00:00
ca21003f01
Merge branch 'feature/fft-opt' into feature/feynman-rules
# Conflicts:
# lib/FFT.h
# lib/qcd/action/fermion/WilsonFermion5D.h
# tests/core/Test_fft.cc
2016-10-26 18:44:47 +01:00
paboyle
b820076b91
Merge branch 'develop' into feature/mpi3
2016-10-25 06:02:33 +01:00
azusayamaguchi
8f8058f8a5
More random bits on parallel seeding
2016-10-25 01:05:52 +01:00
392e064513
fast local peek-poke
2016-10-24 19:24:21 +01:00
Guido Cossu
f55c16f984
Adding a barrier in the RNG save
2016-10-24 11:02:14 +01:00
Guido Cossu
3e990c9d0a
Reverting the broadcast change
2016-10-22 13:26:43 +01:00
bd6a228af6
Merge commit '20a091c3eddfdb67a82ece6413740a93650a2f98' into feature/feynman-rules
2016-10-21 13:10:30 +01:00
paboyle
f9d5e95d72
allocator template typedefs moved to AlignedAllocator
2016-10-20 16:59:39 +01:00
Guido Cossu
590675e2ca
Csum in hex format
2016-10-19 17:26:25 +01:00
Guido Cossu
26b9740d53
Some fix for the GenericHMCrunner
2016-10-10 09:43:05 +01:00
paboyle
52a39f0fcd
Divide in ET
2016-09-26 09:38:38 +01:00
paboyle
16b37b956c
divide goes to ET
2016-09-26 09:37:59 +01:00
paboyle
17097a93ec
FFTW test ran over 4 mpi processes.
2016-08-17 01:33:55 +01:00
paboyle
f4dd5062d7
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-07-15 19:26:06 +01:00
Christopher Kelly
6f47fbb1e2
Disabled parallel for loops in ExtractSlice and InsertSlice due to race conditions. Likely will need to do so for localConvert too.
2016-07-13 10:49:18 -04:00
paboyle
ef97e32152
Adding persistent communicators
2016-07-08 17:16:08 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
Guido Cossu
3c49ddfaa4
Merge branch 'temporary-smearing' into develop
2016-07-07 14:04:59 +01:00
Guido Cossu
ffb8b3116c
Tested smeared RHMC Wilson1p1, accepting
2016-07-07 11:49:36 +01:00
Christopher Kelly
dd8cfff111
Another fix for pedantic compilers
2016-07-06 18:22:15 -04:00
Christopher Kelly
184642adb0
Fix for pedantic compilers
2016-07-06 18:15:15 -04:00
Christopher Kelly
85ed8175cb
Implemented mixed precision CG. Fixed filelist to exclude lib/Old directory and include Config.h.
2016-07-06 15:57:04 -04:00
Guido Cossu
e87182cf98
Debugged the copy constructor of the Lattice class
2016-07-06 15:31:00 +01:00
Guido Cossu
e3d5319470
Debugged the real() and imag() functions and added tests to Test_Simd
2016-07-06 14:16:03 +01:00
Guido Cossu
9cb90f714e
Merge remote-tracking branch 'origin/develop' into temporary-smearing
2016-07-04 17:28:40 +01:00
Guido Cossu
8dd099267d
Corrected a bug in the Expression Templates (acos and asin were wrong)
2016-07-03 12:28:25 +01:00
3789e3f31c
additional fixes in slice functions
2016-05-12 18:35:38 +01:00
0c66719210
const fix in slice functions
2016-05-12 13:01:35 +01:00
e3083b6dfc
Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab'
2016-05-11 15:20:41 +01:00
paboyle
ab89418658
Precision change going in; useful for mixed precision algorithms for example.
2016-05-11 15:18:47 +01:00
paboyle
28cd99882c
Subslicing
2016-05-11 15:06:54 +01:00
paboyle
aceaee774c
ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not distributed in the orthogonal direction.
Useful for fermion 4d/5d etc.
2016-05-11 14:12:02 +01:00
101aa769eb
LatticeBase contains the grid pointer and a virtual destructor to allow polymorphic lattice pointers
2016-05-04 12:15:31 -07:00
c4c89336fe
SliceSum: shutting down warning about non-threaded code for now
2016-05-01 18:29:57 -07:00
paboyle
f7ca6ca889
Bernoulli reenabled -- using integral type for the discrete_distribution, but then casts in the fill
2016-04-30 03:48:28 -07:00
paboyle
ec4a9b7f6c
The Bernoulli gives a no compile due to a static assertion that the type be integral in 4.7 random.h
Probably need to go through an Integer type, and then convert to real after the random draw to make clean.
2016-04-30 03:42:24 -07:00
f6c53e5039
Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2'
2016-04-30 00:17:52 -07:00
23b6172c31
Bernoulli RNG
2016-04-30 00:14:13 -07:00
paboyle
574ea4f843
const safety
2016-04-19 15:15:11 -07:00
2d8bb356e3
Smearing routines compile (still untested)
2016-02-25 02:43:59 +09:00
a7251f28c7
Stout smearing compiles (untested)
2016-02-24 03:16:50 +09:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
08edbb5cbe
HMC bit repro across checkpoints. Fixed parallel RNG issue with threading.
Conclusion: c++11 distributions not thread safe and must use distinct dist as well as distinct engine per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure repro across checkpoints.
2015-12-22 08:54:40 +00:00
paboyle
09bfe52840
Remove extraneous variable
2015-12-21 15:30:28 +00:00
paboyle
31ca609d12
HMC checkpointing.
Need a general HMC framework to work in restart.
2015-12-20 02:29:51 +00:00
paboyle
5710966324
Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time.
Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file.
Store a Parallel (one per site) and a single serial RNG file.
2015-12-19 18:32:25 +00:00
paboyle
02d730513a
Divide function
2015-11-28 16:54:43 -08:00
Peter Boyle
6d06bd9493
Minor change in commented out code
2015-10-09 00:42:21 +02:00
Peter Boyle
5ef42add2d
Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
Peter Boyle
357c6ab46d
Reunitarise. Complete the HMC and integrator changes.
2015-08-31 16:32:04 +01:00
Peter Boyle
fdfe194c41
Threading bug in RNG fill fixed.
2015-08-19 14:41:05 +01:00
Peter Boyle
4e085dd0ed
Domain wall even-odd 2f HMC with wilson gauge and PV 2f ratio now running and giving small dH.
Azusa is working hard on the rectangle term and we'll hopefully start reproducing plaquettes from RBC-UKQCD parameters soon!
My new laptop is pretty warm and is starting to groan ;)
2015-08-19 10:26:07 +01:00
Peter Boyle
4dc7c36aa8
Gparity works now even if simd distributed in a Gparity twist direction.
Tested by doubling lattice in t-direction.
2015-08-14 12:57:42 +01:00
Peter Boyle
7d3512ab21
Gparity valence test now working.
Interface in FermionOperator will change a lot in future
2015-08-14 00:01:04 +01:00
Peter Boyle
d9d4c5916a
Elemental force term for Wilson dslash added and tests thereof passing.
Now need to construct pseudofermion two flavour, ratio, one flavour, ratio action fragments.
2015-07-26 10:54:38 +09:00
Peter Boyle
d1afebf71e
Sizable improvement in multigrid for unsquared.
6000 matmuls CG unprec
2000 matmuls CG prec (4000 eo muls)
1050 matmuls PGCR on 16^3 x 32 x 8 m=.01
Substantial effort on timing and logging infrastructure
2015-07-24 01:31:13 +09:00
Peter Boyle
03ca506a3d
Big commit fixing nocompiles in defective C++11 compilers (gcc, icpc). Started getting too near the bleeding edge I guess
2015-06-30 15:17:27 +01:00
Peter Boyle
b4a6dbfa65
Patches for beginnings of an overlap multigrid
2015-06-20 22:22:56 +01:00
neo
4eb71d2cd2
Lattice matrix exponential ok
2015-06-17 20:41:07 +09:00
Azusa Yamaguchi
73494a4768
Typo fix
2015-06-16 14:06:31 +01:00
Azusa Yamaguchi
f5bcca6cdf
Where and many other functions (sin cos abs log exp) into ET system
2015-06-14 01:07:25 +01:00
Azusa Yamaguchi
be3f4ce201
Cosmetic
2015-06-14 01:06:56 +01:00
Azusa Yamaguchi
264d0d1735
real comps and expression comps
2015-06-14 01:05:57 +01:00
Azusa Yamaguchi
6ca940b5a6
Allow real comparisons and expressions in comparisons
2015-06-14 01:05:39 +01:00
Azusa Yamaguchi
b66bbed548
Allow sparse occupation of vectors in some cases
2015-06-14 01:05:06 +01:00
Azusa Yamaguchi
463b9ca374
Moving more into the ET system
2015-06-14 01:04:32 +01:00
Azusa Yamaguchi
611f7ec38c
Trying to find a way to remove functions from the ET system using explicit expression closure statements. Not sure if this works yet
2015-06-14 01:03:28 +01:00
Azusa Yamaguchi
f490f320ae
Transpose always returns self image
2015-06-14 01:02:31 +01:00
Azusa Yamaguchi
58f50b7520
Extra lattice unaries
2015-06-14 01:01:55 +01:00
Azusa Yamaguchi
e5c980f169
Moving where into the expression template system; deprecate
2015-06-14 01:01:21 +01:00
Peter Boyle
7766cc96c1
Got this sorted with the promote working in a test
2015-06-09 22:39:13 +01:00
Peter Boyle
1e5b015ee3
Some unary ops and coarse grid support
2015-06-09 10:26:19 +01:00
Peter Boyle
d6f1ddf99c
Conjugate residual algorithm; some more unary functions
2015-06-08 12:04:59 +01:00
neo
4b114fce3d
Added support for Ta to Lattice types
2015-06-04 18:29:55 +09:00
Peter Boyle
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
Azusa Yamaguchi
c851d0e705
Fix mistake
2015-06-01 12:26:20 +01:00
Peter Boyle
5644ab1e19
Large scale change to support 5d fermion formulations.
Have 5d replicated wilson with 4d gauge working and matrix regressing to Ls copies of wilson.
2015-05-31 15:09:02 +01:00
Peter Boyle
840754dd42
Hand unrolled version of dslash in a separate class.
Useful to compare; raises Intel compiler from 9 GFlop/s to 17.5 GFlop/s on ivybridge core. Raises Clang from 14.5 to 17.5
2015-05-26 19:54:03 +01:00
Peter Boyle
94d679c4e6
Better checkerboard tracking.
2015-05-25 13:45:08 +01:00
Peter Boyle
eadfb5be67
Better pragma use
2015-05-23 09:32:37 +01:00
Peter Boyle
9601890549
Streaming store option ifdef
2015-05-21 06:47:05 +01:00
Peter Boyle
1559dd4adc
Compile time select if we do the streaming store copy. Relies on Clang++ eliminating object copies, and other compilers do not necessarily cope.
2015-05-21 06:39:00 +01:00
Peter Boyle
4dba8522a1
Got unpreconditioned conjugate gradient to run and converge on a random (uniform random, not even SU(3) for now) gauge field. Convergence history is correctly independent of decomposition on 1,2,4,8,16 mpi tasks.
Found a couple of simd bugs which required fixing and enhanced the Grid_simd.cc test suite.
Implemented the Mdag, M, MdagM, Meooe Mooee schur type stuff in the wilson dop.
2015-05-19 13:57:35 +01:00
Peter Boyle
e9ed288b00
Typoo xifed
2015-05-16 05:49:32 +01:00
Peter Boyle
dda3da45fb
Update Grid_lattice_trace.h
2015-05-16 04:40:28 +01:00
Peter Boyle
a19aa9627d
Optimisation and syntax pretty
2015-05-16 04:36:22 +01:00
Peter Boyle
9e29fb2c6a
strong inline
2015-05-16 04:33:10 +01:00
Peter Boyle
537f47404b
Parallel for replace
2015-05-15 11:48:04 +01:00
Peter Boyle
46c4379592
Formatting change
2015-05-15 11:38:54 +01:00
Peter Boyle
f761ab0f50
Filed bug report Bug 66153 on GCC-5.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66153
2015-05-15 11:38:04 +01:00
Peter Boyle
2a28cfb3a3
Silly formatting change
2015-05-15 11:37:07 +01:00
Peter Boyle
a108d5d3b0
cout IO for all types
2015-05-13 09:24:10 +01:00
Peter Boyle
6cec662ac5
Enhanced SIMD interfacing
2015-05-12 20:41:44 +01:00
Peter Boyle
6103c29ee3
Threading support rework.
Placed parallel pragmas as macros; implemented deterministic thread reduction in style of BFM.
2015-05-12 07:51:41 +01:00
Peter Boyle
b1d2c60d07
Moving some things around for pretty
2015-05-11 19:09:49 +01:00
Peter Boyle
5555a852be
Lots of changes required to compile for MIC under ICPC
2015-05-10 23:29:21 +01:00
Peter Boyle
b802abc83f
Expression template hack
2015-05-10 15:35:30 +01:00
Peter Boyle
14591c72d6
Expression template engine
2015-05-10 15:34:20 +01:00
Peter Boyle
2ffd941d67
Assertion should never hit, but did due to a bug
2015-05-10 15:24:37 +01:00
Peter Boyle
ca554f661b
Moving operator stuff into separate file so that we can switch on/off replacement with expression templates
2015-05-10 15:23:49 +01:00
Peter Boyle
b9d16a7191
streaming store cases
2015-05-05 18:14:09 +01:00
Peter Boyle
193860dbc8
Comms and memory benchmarks added
2015-05-03 09:44:47 +01:00
Peter Boyle
25d523c0f4
Shaken out stencil to the point where I think wilson dslash is correct.
Need to audit code carefully, consolidate between stencil and cshift, and then benchmark and optimise.
2015-04-28 08:11:59 +01:00
Peter Boyle
94f728bee4
Big updates with progress towards wilson matrix
2015-04-26 15:51:09 +01:00
Peter Boyle
9ec3529864
Improved the gamma quite a bit.
Serial rng's which are set on node zero and broadcast
2015-04-24 20:21:40 +01:00
Peter Boyle
128ad0999f
Moved code from summation into transfer and reduction
2015-04-24 18:40:44 +01:00
Peter Boyle
52a6ba9767
Slice summation working. May move this into lattice/Grid_lattice_reduction however
2015-04-23 15:13:00 +01:00
Peter Boyle
b32c14b433
Got the NERSC IO working and fixed a bug in cshift.
2015-04-22 22:46:48 +01:00
Peter Boyle
42f167ea37
Rework of RNG to use C++11 random. Should work correctly maintaining parallel RNG across a machine. If a "fixedSeed" is used, randoms should be reproducible across different machine decomposition since the generators are physically indexed and assigned in lexico ordering.
2015-04-19 14:55:58 +01:00
Peter Boyle
5483ed641e
Split all OMP directives into lattice subdir for easy maintenance of parallelism and future OMP 4.0 offload.
2015-04-18 22:17:01 +01:00
Peter Boyle
e5a25dfcb1
Build reorg with which I am a bit happier
2015-04-18 21:22:50 +01:00