paboyle
e503ef5590
Cleaned up
2016-06-07 00:11:36 +01:00
paboyle
a7682b0060
Only instantiate the one routine to avoid duplicate symbol under g++5/MacOS
2016-06-06 23:48:21 +01:00
paboyle
d4c9d71fc8
Merge branch 'master' of https://github.com/paboyle/Grid
2016-06-06 07:06:54 -07:00
paboyle
786ca52c43
Problems remain in the red black preconditioning of the Ls vectorisation
2016-06-06 07:05:51 -07:00
Peter Boyle
f78d89bcbe
Update Lebesgue.cc
kill verbose
2016-06-03 13:33:42 +01:00
paboyle
53d06046b0
Compiling updates for KNL
2016-06-03 03:47:54 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
1c0e922585
Merge pull request #35 from aportelli/master
empty SIMD fix
2016-05-27 16:49:13 +01:00
9d5f693cbe
empty SIMD fix
2016-05-24 10:56:27 +01:00
Peter Boyle
5c90c3b457
Merge pull request #34 from aportelli/master
Polymorphic lattices & various small updates
2016-05-24 10:50:04 +01:00
91e04056f9
fix of the empty SIMD
2016-05-12 19:24:10 +01:00
3789e3f31c
additional fixes in slice functions
2016-05-12 18:35:38 +01:00
0c66719210
const fix in slice functions
2016-05-12 13:01:35 +01:00
paboyle
3a5b5c8bec
Save an old tar of tree
2016-05-12 03:20:17 -07:00
4bc21ec7cb
thread CL argument fix
2016-05-11 15:21:29 +01:00
e3083b6dfc
Merge commit 'ab894186589224d570e0ecef8eea06443194a8ab'
2016-05-11 15:20:41 +01:00
paboyle
ab89418658
Precision change going in; useful for mixed precision algorithms for example.
2016-05-11 15:18:47 +01:00
paboyle
28cd99882c
Subslicing
2016-05-11 15:06:54 +01:00
paboyle
aceaee774c
ExtractSlice / InsertSlice for lower dimensional lattices where the lattice is not
distributed in the orthogonal direction.
Useful for fermion 4d/5d etc..
2016-05-11 14:12:02 +01:00
101aa769eb
LatticeBase contain the grid pointer and a virtual destructor to allow polymorphic lattice pointers
2016-05-04 12:15:31 -07:00
0bf99bfde5
log polish
2016-05-04 12:14:49 -07:00
64bf6fe54e
macro to dump NERSC header to a stream
2016-05-04 12:14:38 -07:00
1161d566b9
minor code cleaning
2016-05-02 19:32:11 -07:00
c698b16d75
function to generate Chroma-style gamma matrix products
2016-05-01 18:30:35 -07:00
c4c89336fe
SliceSum: shutting down warning about non-threaded code for now
2016-05-01 18:29:57 -07:00
fa59789580
ConjugateGradient: cleaner output
2016-05-01 18:29:20 -07:00
92c2c7d3b5
SchurRedBlackDiagMooeeSolve: fix: guess was not initialised from input
2016-05-01 16:07:55 -07:00
e99ce0875f
directly exit when using '--help' option
2016-05-01 16:05:16 -07:00
paboyle
c23375cd65
Testing travis CI integration
2016-04-30 06:30:56 -07:00
paboyle
f7ca6ca889
Bernoulli reenabled -- using integral type for the discrete_distribution, but
then casts in the fill
2016-04-30 03:48:28 -07:00
paboyle
ec4a9b7f6c
The Bernoulli gives a no compile due to a static assertion that the type be integral
in 4.7 random.h
Probably need to go through an Integer type, and then convert to real after the random draw
to make clean.
2016-04-30 03:42:24 -07:00
paboyle
5341977948
IMCI fixes. Thought I had committed these. The "real" disambiguation
between std::real and Grid::real shouldn't have been necessary and I don't
know why only the icpc v16.0 on babbage hits it.
May need a longer term rename of Grid::real or some careful EnableIf work.
2016-04-30 03:34:16 -07:00
f6c53e5039
Merge commit '1e554350acae0e67fa7177ed0db9d4f684a54af2'
2016-04-30 00:17:52 -07:00
ba09cbae3e
function to read std::vector from a string (blank separated values)
2016-04-30 00:15:44 -07:00
6aa000176f
Fermion <-> Propagator functions
2016-04-30 00:14:33 -07:00
23b6172c31
Bernoulli RNG
2016-04-30 00:14:13 -07:00
3f128443ab
OS X icpc fix
2016-04-30 00:13:33 -07:00
paboyle
1e554350ac
The threaded comms didn't agree with GCC. Surprised, and looks like GCC bug.
2016-04-29 16:49:18 -07:00
paboyle
c79ea0dcef
Fixing IMCI
2016-04-22 21:52:54 -07:00
paboyle
e3f141f82f
Fixed SSE compile with typecasts
2016-04-22 10:30:30 -07:00
paboyle
a6dfa2386b
GCC choked on intrinsics calls that ICPC did not
2016-04-22 06:33:41 -07:00
Peter Boyle
d9b5e66877
Update Make.inc
2016-04-20 18:25:48 +01:00
paboyle
8fd8bc25e9
simd 5th dim with rotation
2016-04-19 15:39:00 -07:00
paboyle
ba427abde9
simd 5d
2016-04-19 15:38:39 -07:00
paboyle
9b6ab6db16
simd in 5th dimension support
2016-04-19 15:38:01 -07:00
paboyle
806a83d38b
simd in fifth dim support for dwf
2016-04-19 15:36:19 -07:00
paboyle
7223753355
Rotate in a direction > 2 for simd_layout
2016-04-19 15:35:15 -07:00
paboyle
b27bac4669
Updates for simd in one dir
2016-04-19 15:34:10 -07:00
paboyle
c8a93d6a93
Cartesian changes to allow all simd in one direction
2016-04-19 15:18:12 -07:00
paboyle
04072a5e1f
Rotate is a temporary hack. Would like to merge ALL
permutes as rotates of length 2, and make any rotate active
over any subset of lane bits. This is hard, and requires general
permute; current intrinsics mean this is only really possible for specific
case by case encodings as presently performed. Intel could produce a general
permute.. would help. IBM did it in VMX.
2016-04-19 15:15:34 -07:00
paboyle
574ea4f843
const safety
2016-04-19 15:15:11 -07:00
paboyle
587f80cd93
Updated to compile and pass under intel SDE
2016-04-19 15:13:54 -07:00
paboyle
528eb773ad
Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle
e5657510b0
Rotate support for Ls simd-ized
2016-04-19 22:24:18 +01:00
paboyle
f473919526
Rotate support
2016-04-19 22:23:51 +01:00
Christopher Kelly
ab56ccdd25
-Complete and working implementation of Grid_empty
2016-04-15 13:17:42 -04:00
neo
339be37dba
Debugging smeared HMC
2016-04-13 17:00:14 +09:00
neo
a87b744621
HMC runs but does not accept with smearing on
2016-04-07 16:45:11 +09:00
Christopher Kelly
a646260e82
Merge remote-tracking branch 'origin/master' into ckelly-dec12-2015
2016-04-06 13:57:28 -04:00
Christopher Kelly
af9c8d1372
-Checkerboard fixes for Lanczos
2016-04-06 13:50:56 -04:00
paboyle
b1192a8908
Benchmark_zmm added
2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596
Adding extra benchmark
2016-04-06 10:32:54 +01:00
97d0d56bcb
Debugging Smearing routines (set_fj)
2016-04-06 17:58:43 +09:00
paboyle
c7ba47bdc7
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-06 02:56:28 +01:00
7c7ea35ffb
Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions
2016-04-05 16:28:09 +09:00
4b1cf580e0
Debugging the Smearing routines
2016-04-05 16:19:30 +09:00
paboyle
e67fc2be18
Adding a trial for openmp overhead minimisation
2016-03-31 16:00:37 +01:00
paboyle
f473ef7591
Fixing the compile
2016-03-31 07:47:42 -07:00
paboyle
8052556275
Cleaning up the single/double kernel implementation switch
2016-03-31 14:51:32 +01:00
paboyle
60d965f79e
AVX512 improvements; sigfpe trapping too
2016-03-30 08:42:34 +01:00
paboyle
83b15bfcdd
Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign
2016-03-30 08:39:39 +01:00
paboyle
1ecbf9794d
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-30 08:37:55 +01:00
paboyle
2ded354403
configure
2016-03-30 00:17:43 -07:00
paboyle
340428a1fe
Eigen fixes and HDCR work
2016-03-30 00:16:02 -07:00
paboyle
c77b7ee897
AddSub based alternate SU3 routine
2016-03-28 17:55:22 -06:00
paboyle
b6c3bc574b
Moving to a more coherent organisation of the inline assembly and arch dependencies.
2016-03-28 16:24:37 +01:00
paboyle
1e355a51e1
Interface change
2016-03-27 23:46:55 -07:00
paboyle
ad80f61fba
AVX512 shaken out
2016-03-28 00:38:05 -06:00
paboyle
21abaf7e91
Gamma sign change
2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
azusa
f54e0ec9bd
Try lanczos to set up hdcr subspace
2016-03-17 10:36:16 +00:00
paboyle
60d4564151
ICC no compile fix
2016-03-16 02:30:40 -07:00
paboyle
d4e57f4bc6
IO Bandwidth reporting
2016-03-16 02:30:16 -07:00
paboyle
3920b2c0ab
HDCR updates
2016-03-16 02:29:58 -07:00
paboyle
2733c4b93c
hdcr updates
2016-03-16 02:29:37 -07:00
paboyle
36a800f26c
Microsecond granularity support
2016-03-16 02:28:51 -07:00
paboyle
b75da563d9
Resurrect timestamp. Should make optional
2016-03-16 02:28:17 -07:00
paboyle
f9faec38be
Printing fix under comms none
2016-03-16 02:27:53 -07:00
paboyle
d6b64f47d9
Uint64 sum for IO rates
2016-03-16 02:27:22 -07:00
paboyle
a359f7a9f5
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-11 16:07:07 -08:00
paboyle
b606deb3f0
Uint64 gsum
2016-03-11 16:06:54 -08:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
2dce9c3cff
HDCR running on 16^3 with 2x-3x speed up.
2016-03-08 01:01:50 -08:00
paboyle
dc72293398
More timing info
2016-03-06 10:46:55 -08:00
paboyle
e55c35734b
Fix a nocompile
2016-03-03 20:33:28 +00:00
paboyle
325e745daa
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-02 07:04:03 -08:00
paboyle
61413565d0
Back off the inlined spin proj as not working
2016-03-02 07:03:09 -08:00
paboyle
ff129d9ad9
Redundant operations removed
2016-03-02 07:02:37 -08:00
paboyle
03fcd3b33a
Back out of the colour
2016-03-02 07:01:15 -08:00
paboyle
68b02da483
Backing off the colour
2016-03-02 07:00:43 -08:00
paboyle
e051119769
extern "C" should have been in the header file, but Cray is apparently not C++ friendly.
2016-03-02 07:00:00 -08:00
2d8bb356e3
Smearing routines compile (still untested)
2016-02-25 02:43:59 +09:00
a7251f28c7
Stout smearing compiles (untested)
2016-02-24 03:16:50 +09:00
1eb169ac0b
compatibility fix
2016-02-23 16:36:50 +00:00
5674c3e241
cycle count fix for x86
2016-02-23 16:08:18 +00:00
Antonin Portelli
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
19526d09c2
Merge commit '6aeaf6f568a391e34b913f08be6a11beb28d8842'
2016-02-22 15:23:26 +00:00
Peter Boyle
6aeaf6f568
Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
turned up problems on the BlueWaters Cray.
Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
Peter Boyle
40f2db9bc0
Disable metropolis step until 10 traj covered. Should move to exposing these
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
Peter Boyle
2cfa20cc4e
Improving the logging, got fed up with color so optionally disable.
Backtrace macro used everywhere
2016-02-21 07:58:53 -06:00
Peter Boyle
a5f683d124
Machine generated
2016-02-21 07:57:42 -06:00
Jung
9f0d9ade68
Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
neo
c1b1b89d17
More on smearing routines, writing APEsmear (dev)
2016-02-19 17:15:27 +09:00
neo
771235017d
Adding smearing routines (development)
2016-02-19 15:30:41 +09:00
paboyle
3425751cb8
Missing return value
2016-02-19 01:06:03 +00:00
paboyle
db5e8050a8
Attempts at some optimisation
2016-02-18 22:33:58 +00:00
paboyle
a3fbabf404
Bug fix
2016-02-18 18:08:24 +00:00
Peter Boyle
22422a84d9
Small problem in compressor fix
2016-02-17 19:03:09 -06:00
Peter Boyle
c9fadf97a5
Simplify the compressor interface again.
2016-02-17 18:16:45 -06:00
Peter Boyle
c650bb3f3d
Very small merge speed up.
2016-02-16 18:41:53 -06:00
Peter Boyle
81395e85d1
Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.
2016-02-16 13:56:44 -06:00
Peter Boyle
340a29b735
More careful sequencing of comms
2016-02-15 16:04:59 -06:00
Peter Boyle
a0fc47c6f9
Cheaper implementation
2016-02-15 16:02:36 -06:00
Peter Boyle
42a9ac71d2
Bug fix, wait till complete.
2016-02-14 16:21:21 -06:00
Peter Boyle
41c2b09184
Shmem comms [NO MPI] target added. The dwf test runs and passes.
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current while I try a few experimentals.
2016-02-14 14:24:38 -06:00
paboyle
294dbf1bf0
Compile on OpenMPI shmem
2016-02-11 23:45:51 +00:00
Peter Boyle
9548c8b91f
Had to break this out for universal access through the code base.
2016-02-11 07:40:09 -06:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
paboyle
e2f73e3ead
Updates for shmem
2016-02-10 16:50:32 -08:00
neo
6371676a75
Correcting some compilation errors for clang-sse
2016-02-10 11:37:03 +09:00
Jung
bd84c23298
definitions reconciled.
2016-01-25 16:30:59 -05:00
Jung
7aa8d5e8af
Failing to compile, comparing with master
2016-01-25 16:03:02 -05:00
Jung
6012b0ec23
Checking in changes before changing to chulwoo-dec12-2015
2016-01-25 09:40:58 -05:00
Jung
411ac49dd7
GparityWilsonTM typedef added. Not yet tested
Conflicts:
configure
lib/qcd/action/fermion/WilsonKernels.h
2016-01-25 01:36:28 -05:00
Jung
b8fb05a422
Additional routines for Lanczos (SYM2, Chebyshev)..
2016-01-25 01:26:25 -05:00
Jung
5c57d4f403
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
paboyle
fc6ad65751
Pushed the overlap comms tweaks
2016-01-11 06:34:22 -08:00
paboyle
dafc74020c
Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori
2016-01-10 16:54:27 -08:00
paboyle
d19321dfde
Overlap comms compute changes
2016-01-10 19:20:16 +00:00
Jung
5924e5a562
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
Conflicts:
configure
lib/qcd/action/Actions.h
lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
c99d748da6
Timing reports in benchmarks now reflect the asynch comms thread statistics
2016-01-04 14:42:16 +00:00
paboyle
02452afd36
Optional overlap of comms with compute
2016-01-04 14:18:40 +00:00
paboyle
331768dcff
Added overlap comms compute mode
2016-01-03 01:38:11 +00:00
paboyle
4aac345bea
Updated logging to colour code according to message type
2016-01-02 17:21:14 +00:00
paboyle
15c0022042
GPLv2 clarified, and copyright message and banner in Init function.
Color is just showing off....
2016-01-02 15:22:30 +00:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
1e68b1c1bd
Create a benign default for gparity twists
2016-01-02 14:06:53 +00:00
paboyle
5a80930dd2
Charge conjugation boundary conditions for gauge fields implemented as a policy
class, changing the nature of covariant Cshifts used in
plaquettes, rectangles and staples.
As a result same code is used for the plaq and rect action independent of the BC type.
Should probably isolate the BC in a separate class that Gimpl takes as a template param.
Do the same with smearing policies.
This would then allow composition of BC with smearing etc....
2016-01-02 13:37:25 +00:00
paboyle
145a295231
Bug fix for stencil with large shifts (3+), would be important to Naik term for example but did not
impact Wilson based nearest neighbour stencils.
2015-12-30 19:29:48 +00:00