paboyle
574ea4f843
const safety
2016-04-19 15:15:11 -07:00
paboyle
587f80cd93
Updated to compile and pass under intel SDE
2016-04-19 15:13:54 -07:00
paboyle
528eb773ad
Merged.
...
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle
e5657510b0
Rotate support for Ls simd-ized
2016-04-19 22:24:18 +01:00
paboyle
f473919526
Rotate support
2016-04-19 22:23:51 +01:00
Christopher Kelly
ab56ccdd25
-Complete and working implementation of Grid_empty
2016-04-15 13:17:42 -04:00
neo
339be37dba
Debugging smeared HMC
2016-04-13 17:00:14 +09:00
neo
a87b744621
HMC runs but does not accept with smearing on
2016-04-07 16:45:11 +09:00
Christopher Kelly
a646260e82
Merge remote-tracking branch 'origin/master' into ckelly-dec12-2015
2016-04-06 13:57:28 -04:00
Christopher Kelly
af9c8d1372
-Checkerboard fixes for Lanczos
2016-04-06 13:50:56 -04:00
paboyle
b1192a8908
Benchmark_zmm added
2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596
Adding extra benchmark
2016-04-06 10:32:54 +01:00
97d0d56bcb
Debugging Smearing routines (set_fj)
2016-04-06 17:58:43 +09:00
paboyle
c7ba47bdc7
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-06 02:56:28 +01:00
7c7ea35ffb
Putting the Traceless Antihermitian part outside the deriv in pseudofermion actions
2016-04-05 16:28:09 +09:00
4b1cf580e0
Debugging the Smearing routines
2016-04-05 16:19:30 +09:00
paboyle
e67fc2be18
Adding a trial for openmp overhead minimisation
2016-03-31 16:00:37 +01:00
paboyle
f473ef7591
Fixing the compile
2016-03-31 07:47:42 -07:00
paboyle
8052556275
Cleaning up the single/double kernel implementation switch
2016-03-31 14:51:32 +01:00
paboyle
60d965f79e
AVX512 improvements; sigfpe trapping too
2016-03-30 08:42:34 +01:00
paboyle
83b15bfcdd
Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign
2016-03-30 08:39:39 +01:00
paboyle
1ecbf9794d
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-30 08:37:55 +01:00
paboyle
2ded354403
configure
2016-03-30 00:17:43 -07:00
paboyle
340428a1fe
Eigen fixes and HDCR work
2016-03-30 00:16:02 -07:00
paboyle
c77b7ee897
AddSub based alternate SU3 routine
2016-03-28 17:55:22 -06:00
paboyle
b6c3bc574b
Moving to a more coherent organisation of the inline assembly and arch dependencies.
2016-03-28 16:24:37 +01:00
paboyle
1e355a51e1
Interface change
2016-03-27 23:46:55 -07:00
paboyle
ad80f61fba
AVX512 shaken out
2016-03-28 00:38:05 -06:00
paboyle
21abaf7e91
Gamma sign change
2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
azusa
f54e0ec9bd
Try lanczos to set up hdcr subspace
2016-03-17 10:36:16 +00:00
paboyle
60d4564151
ICC no compile fix
2016-03-16 02:30:40 -07:00
paboyle
d4e57f4bc6
IO Bandwidth reporting
2016-03-16 02:30:16 -07:00
paboyle
3920b2c0ab
HDCR updates
2016-03-16 02:29:58 -07:00
paboyle
2733c4b93c
hdcr updates
2016-03-16 02:29:37 -07:00
paboyle
36a800f26c
Microsecond granularity support
2016-03-16 02:28:51 -07:00
paboyle
b75da563d9
Resurrect timestamp. Should make optional
2016-03-16 02:28:17 -07:00
paboyle
f9faec38be
Printing fix under comms none
2016-03-16 02:27:53 -07:00
paboyle
d6b64f47d9
Uint64 sum for IO rates
2016-03-16 02:27:22 -07:00
paboyle
a359f7a9f5
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-11 16:07:07 -08:00
paboyle
b606deb3f0
Uint64 gsum
2016-03-11 16:06:54 -08:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
...
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
2dce9c3cff
HDCR running on 16^3 with 2x-3x speed up.
2016-03-08 01:01:50 -08:00
paboyle
dc72293398
More timing info
2016-03-06 10:46:55 -08:00
paboyle
e55c35734b
Fix a nocompile
2016-03-03 20:33:28 +00:00
paboyle
325e745daa
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-02 07:04:03 -08:00
paboyle
61413565d0
Back off the inlined spin proj as not working
2016-03-02 07:03:09 -08:00
paboyle
ff129d9ad9
Redundant operations removed
2016-03-02 07:02:37 -08:00
paboyle
03fcd3b33a
Back out of the colour
2016-03-02 07:01:15 -08:00
paboyle
68b02da483
Backing off the colour
2016-03-02 07:00:43 -08:00
paboyle
e051119769
extern "C" should have been in the header file, but Cray is apparently not C++ friendly.
2016-03-02 07:00:00 -08:00
2d8bb356e3
Smearing routines compile (still untested)
2016-02-25 02:43:59 +09:00
a7251f28c7
Stout smearing compiles (untested)
2016-02-24 03:16:50 +09:00
1eb169ac0b
compatibility fix
2016-02-23 16:36:50 +00:00
5674c3e241
cycle count fix for x86
2016-02-23 16:08:18 +00:00
Antonin Portelli
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
19526d09c2
Merge commit '6aeaf6f568a391e34b913f08be6a11beb28d8842'
2016-02-22 15:23:26 +00:00
Peter Boyle
6aeaf6f568
Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
...
turned up problems on the BlueWaters Cray.
Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
Peter Boyle
40f2db9bc0
Disable metropolis step until 10 traj covered. Should move to exposing these
...
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
Peter Boyle
2cfa20cc4e
Improving the logging, got fed up with color so optionally disable.
...
Backtrace macro used everywhere
2016-02-21 07:58:53 -06:00
Peter Boyle
a5f683d124
Machine generated
2016-02-21 07:57:42 -06:00
Jung
9f0d9ade68
Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
...
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
neo
c1b1b89d17
More on smearing routines, writing APEsmear (dev)
2016-02-19 17:15:27 +09:00
neo
771235017d
Adding smearing routines (development)
2016-02-19 15:30:41 +09:00
paboyle
3425751cb8
Missing return value
2016-02-19 01:06:03 +00:00
paboyle
db5e8050a8
Attempts at some optimisation
2016-02-18 22:33:58 +00:00
paboyle
a3fbabf404
Bug fix
2016-02-18 18:08:24 +00:00
Peter Boyle
22422a84d9
Small problem in compressor fix
2016-02-17 19:03:09 -06:00
Peter Boyle
c9fadf97a5
Simplify the compressor interface again.
2016-02-17 18:16:45 -06:00
Peter Boyle
c650bb3f3d
Very small merge speed up.
2016-02-16 18:41:53 -06:00
Peter Boyle
81395e85d1
Regressing to not overlap comms and compute because bluewaters, edison, and cori are so rubbish at it.
2016-02-16 13:56:44 -06:00
Peter Boyle
340a29b735
More careful sequencing of comms
2016-02-15 16:04:59 -06:00
Peter Boyle
a0fc47c6f9
Cheaper implementation
2016-02-15 16:02:36 -06:00
Peter Boyle
42a9ac71d2
Bug fix, wait till complete.
2016-02-14 16:21:21 -06:00
Peter Boyle
41c2b09184
Shmem comms [NO MPI] target added. The dwf test runs and passes.
...
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current state while I try a few experiments.
2016-02-14 14:24:38 -06:00
paboyle
294dbf1bf0
Compile on OpenMPI shmem
2016-02-11 23:45:51 +00:00
Peter Boyle
9548c8b91f
Had to break this out for universal access through the code base.
2016-02-11 07:40:09 -06:00
Peter Boyle
7f927a541c
Shmem related fixes for shmem compile
2016-02-11 07:37:39 -06:00
paboyle
e2f73e3ead
Updates for shmem
2016-02-10 16:50:32 -08:00
neo
6371676a75
Correcting some compilation errors for clang-sse
2016-02-10 11:37:03 +09:00
Jung
bd84c23298
definitions reconciled.
2016-01-25 16:30:59 -05:00
Jung
7aa8d5e8af
Failing to compile, comparing with master
2016-01-25 16:03:02 -05:00
Jung
6012b0ec23
Checking in changes before changing to chulwoo-dec12-2015
2016-01-25 09:40:58 -05:00
Jung
411ac49dd7
GparityWilsonTM typedef added. Not yet tested
...
Conflicts:
configure
lib/qcd/action/fermion/WilsonKernels.h
2016-01-25 01:36:28 -05:00
Jung
b8fb05a422
Additional routines for Lanczos (SYM2, Chebyshef)..
2016-01-25 01:26:25 -05:00
Jung
5c57d4f403
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
paboyle
fc6ad65751
Pushed the overlap comms tweaks
2016-01-11 06:34:22 -08:00
paboyle
dafc74020c
Overlap comms compute improvements in hand op kernels, and better timing from Edison and Cori
2016-01-10 16:54:27 -08:00
paboyle
d19321dfde
Overlap comms compute changes
2016-01-10 19:20:16 +00:00
Jung
5924e5a562
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
configure
lib/qcd/action/Actions.h
lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
c99d748da6
Timing reports in benchmarks now reflect the asynch comms thread statistics
2016-01-04 14:42:16 +00:00
paboyle
02452afd36
Optional overlap of comms with compute
2016-01-04 14:18:40 +00:00
paboyle
331768dcff
Added overlap comms compute mode
2016-01-03 01:38:11 +00:00
paboyle
4aac345bea
Updated logging to colour code according to message type
2016-01-02 17:21:14 +00:00
paboyle
15c0022042
GPLv2 clarified, and copyright message and banner in Init function.
...
Color is just showing off....
2016-01-02 15:22:30 +00:00
paboyle
aae8bf31a7
Global edit adding copyright and license info to every source file.
2016-01-02 14:51:32 +00:00
paboyle
1e68b1c1bd
Create a benign default for gparity twists
2016-01-02 14:06:53 +00:00
paboyle
5a80930dd2
Charge conjugation boundary conditions for gauge fields implemented as a policy
...
class, changing the nature of covariant Cshifts used in
plaquettes, rectangles and staples.
As a result same code is used for the plaq and rect action independent of the BC type.
Should probably isolate the BC in a separate class that Gimpl takes as a template param.
Do the same with smearing policies.
This would then allow composition of BC with smearing etc....
2016-01-02 13:37:25 +00:00
paboyle
145a295231
Bug fix for stencil with large shifts (3+), would be important to naik term for example but did not
...
impact Wilson based nearest neighbour stencils.
2015-12-30 19:29:48 +00:00
paboyle
841a37f941
Fix to WilsonCompressor that fixes a bug in comms phase due to the sign change on gamma
...
matrix in hopping term.
Add logging of time spent in CG.
2015-12-29 23:49:41 +00:00
Azusa Yamaguchi
e6cad3821c
Logging improvement
2015-12-29 19:51:18 +00:00
Azusa Yamaguchi
98de1cbb6a
Optimised version of rectangle term staples.
...
~3.4x faster than the naive.
2015-12-29 19:22:59 +00:00
Azusa Yamaguchi
f7d61b8b81
Plaq plus rectangle and Iwasaki, Symanzik DBW2.
...
http://arxiv.org/pdf/hep-lat/0610075.pdf plaq and rect regress plausibly over 100 trajectories
and under HMC with average plaq and rectangle coming out ok.
2015-12-28 16:39:26 +00:00
Azusa Yamaguchi
78c4e862ef
Plaq, Rectangle, Iwasaki, Symanzik and DBW2 working and HMC regresses to http://arxiv.org/pdf/hep-lat/0610075.pdf
2015-12-28 16:38:31 +00:00
1e0be161e5
MacroMagic: inline functions to avoid double symbol issues
2015-12-23 14:20:05 +00:00
paboyle
0afcf1cf13
Moved all the HMC tests over to using a single HmcRunner class that manages checkpoint strategies and such like
2015-12-22 11:19:25 +00:00
paboyle
08edbb5cbe
HMC bit repro across checkpoints. Fixed parallel RNG issue with threading.
...
Conclusion: c++11 distributions not thread safe and must use distinct dist as well as distinct engine
per site. Makes sense when you think of box muller. Also added a reset of dist on fill to ensure
repro across checkpoints.
2015-12-22 08:54:40 +00:00
paboyle
0abfbcc8eb
Naming of files improvement.
2015-12-21 15:37:26 +00:00
paboyle
1b94253ba4
Logging improvement
2015-12-21 15:36:28 +00:00
paboyle
36e6f9ac7b
Bug fix. Guess not initialised in refresh step; didn't hit before due to luck in not having a vector
...
created with NAN data.
2015-12-21 15:34:35 +00:00
paboyle
2f41691c11
Bug fix. Guess was not zeroed prior to CG call. Was earlier accidentally benign just due to luck.
2015-12-21 15:33:36 +00:00
paboyle
09bfe52840
Remove extraneous variable
2015-12-21 15:30:28 +00:00
paboyle
8c9010d0f4
Isnan check on guess and convergence assert on result
2015-12-21 15:29:46 +00:00
paboyle
42c583265c
Remove timestamp
2015-12-21 15:28:03 +00:00
paboyle
539d698492
Prototypes for CML routines
2015-12-21 15:26:42 +00:00
paboyle
31ca609d12
HMC checkpointing.
...
Need a general HMC framework to work in restart.
2015-12-20 02:29:51 +00:00
paboyle
5710966324
Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time.
...
Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file.
Store a Parallel (one per site) and a single serial RNG file.
2015-12-19 18:32:25 +00:00
paboyle
e108e708a3
Wilson TM tests and compiles in
2015-12-17 23:06:33 +00:00
paboyle
6f0198d4d9
Merge branch 'master' of https://github.com/paboyle/Grid
2015-12-17 22:34:54 +00:00
paboyle
67ccb043f1
Added TM fermions for DSDR etc..
2015-12-17 22:34:28 +00:00
Azusa Yamaguchi
24a5a81c53
SSE compile fix
2015-12-16 09:09:37 +00:00
Jung
eb1759d7ea
Added Gparity instantiation to no HANDOPT case
...
deleted configure (as intended?)
2015-12-16 00:04:09 -05:00
paboyle
34a0fde2ad
Fixes to fermion force terms after sign of gamma_mu (0...3) change.
...
Thought I had already committed these.
Believe I have got the Gparity fermion force working.
* tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion
based on a small dt update of U field.
* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.
Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
Jung
bc34b7e808
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
tests/Make.inc
2015-12-15 11:11:59 -05:00
Jung
284453c5e9
Added gparity mobius defs, added params to ScaledShamir
...
checking in before pulling master
2015-12-14 12:15:06 -05:00
paboyle
af855cc129
Updating to fix peek poke to checkerboarded arrays since Chulwoo needs this.
2015-12-12 07:11:46 +00:00
paboyle
47fe6b5a7c
Merge branch 'master' of https://github.com/aportelli/Grid into aportelli-master
2015-12-10 23:14:52 +00:00
paboyle
b3ef09a54d
Merge branch 'master' of https://github.com/paboyle/Grid
2015-12-10 23:05:38 +00:00
paboyle
3ce10aa975
Fix a regression failure on Mobius; chroma regression added
2015-12-10 22:55:00 +00:00
Azusa Yamaguchi
a32a59fc43
Merge branch 'master' of https://github.com/paboyle/Grid
2015-12-09 12:48:44 +00:00
200de272ed
IO: serialisable enums
2015-12-08 13:54:00 +00:00
d68a72e28b
IO: code cleaning and string binary IO fix
2015-12-08 13:53:33 +00:00
17f9268a55
XmlIO: minor code cleaning
2015-12-07 18:30:00 +00:00
78f0c2595d
autotool file accidentally committed
2015-12-07 18:28:06 +00:00
Jung
f2b4edc090
Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention)
2015-12-07 02:04:57 -05:00
Jung
fb81acca3c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-12-03 12:11:10 -05:00
paboyle
93356fd246
No compile fixes on gcc/Cray
2015-11-29 03:14:44 -08:00
paboyle
ca42fe6d32
Merge branch 'master' of github.com:paboyle/Grid
...
Merge done
Conflicts:
lib/serialisation/XmlIO.h
tests/Test_stencil.cc
2015-11-28 17:03:43 -08:00
paboyle
6b97b271ae
Integer divide useful
2015-11-28 17:01:20 -08:00
paboyle
fa01ae5980
integer divide
2015-11-28 17:00:34 -08:00
paboyle
113131b01c
This failed for some reason. Suspect Antonin has made more progress.
2015-11-28 16:59:59 -08:00
paboyle
b2c02a6106
Runs fastest on Cori
2015-11-28 16:58:16 -08:00
paboyle
02d730513a
Divide function
2015-11-28 16:54:43 -08:00
paboyle
d875c2bd39
More verbose useful
2015-11-28 16:54:19 -08:00
paboyle
cc32ba615a
Verbose changes
2015-11-28 16:53:54 -08:00
paboyle
6684739452
Better to drop KMP_AFFINITY override
2015-11-28 16:52:44 -08:00
Peter Boyle
bc4b252883
Merge branch 'master' of https://github.com/paboyle/Grid
2015-11-29 00:33:01 +00:00
Peter Boyle
11cf0f08f3
This file is not yet debugged.
2015-11-29 00:32:45 +00:00
Peter Boyle
8a33846095
No compile fix
2015-11-29 00:29:58 +00:00
Peter Boyle
54f04ee5c9
Perf event interface was Linux specific; use ifdef to protect
2015-11-29 00:24:48 +00:00
Peter Boyle
825875fd48
compile fixes
2015-11-29 00:24:25 +00:00
Peter Boyle
f8290bfd58
Compile fixes
2015-11-29 00:24:04 +00:00
Azusa Yamaguchi
967be91692
update merge
2015-11-26 09:51:41 +00:00
06f8ecea04
Merge commit '899ca41cb8c8f47771bfd37cd895cbc2184e5560'
2015-11-16 18:16:25 +00:00
af19118113
new I/O interface
2015-11-16 18:14:37 +00:00
paboyle
e9ff25b06b
Small threading change makes a difference on Cori.
2015-11-07 00:07:05 -08:00
paboyle
05a7029600
Stencil change
2015-11-07 00:06:31 -08:00
paboyle
b04b8914fd
EXECINFO change
2015-11-07 00:05:57 -08:00
paboyle
899ca41cb8
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/qcd/action/fermion/WilsonFermion5D.cc
2015-11-06 03:50:04 -08:00
paboyle
d29b4c1dee
Assembler files
2015-11-06 03:48:48 -08:00
paboyle
a2ff068e29
Asm and threading for many core
2015-11-06 03:47:14 -08:00
paboyle
b362f8d27b
Threading for many core
2015-11-06 03:46:41 -08:00
paboyle
64770d9052
Threading changes for many core and asm calls
2015-11-06 03:46:21 -08:00
paboyle
17af18dcab
Changes for AVX512 assembler
2015-11-06 03:45:51 -08:00
Peter Boyle
28022755ae
Stencil class name global change to StencilImpl typedef
2015-11-06 05:30:17 -06:00
Peter Boyle
955b482aaf
Partial optimisation of the extraction/merger of simd vecs.
2015-11-06 05:26:20 -06:00
Peter Boyle
f9b2fce93b
Changing whole stencil class to be template and not just single functions
2015-11-06 05:25:10 -06:00
Peter Boyle
473fa28a6c
Partial optimisation; comms in x-dir for red black dslash will be slow as the checker skipping block strided
...
loops are non threadable. Will need to write a kernel for these instead and drive them with a lookup table
to make a loop sufficiently simple to thread.
2015-11-06 05:23:23 -06:00
Peter Boyle
5d854c869c
Stencil interface changes
2015-11-06 05:22:33 -06:00
Peter Boyle
880ff88362
Comms optimisation
2015-11-06 05:22:18 -06:00
Azusa Yamaguchi
4690acc3c8
Don't know why peter committed these as they didn't compile
2015-11-06 10:31:48 +00:00
Azusa Yamaguchi
3281745fde
Exec info and linux check to stop non-portable code breaking
2015-11-06 10:31:24 +00:00
paboyle
1159de165c
Asm option for AVX512
2015-11-05 22:04:51 -08:00
paboyle
16c7993434
Merge branch 'master' of github.com:paboyle/Grid
...
Conflicts:
lib/simd/Grid_avx512.h
lib/simd/Grid_imci.h
2015-11-04 03:32:10 -08:00
paboyle
6be9716e6f
New file
2015-11-04 03:26:28 -08:00
paboyle
4a41c885ed
Use Linux kernel interface to hardware performance counters. Dead useful.
2015-11-04 03:24:19 -08:00
paboyle
757b31ed42
Threading for KNC mods.
2015-11-04 03:22:14 -08:00
paboyle
ac7d1f26ad
Either blocking or lebesgue curve
2015-11-04 03:19:16 -08:00
paboyle
1a8bf938b3
Use either sub-blocking or lebesgue
2015-11-04 03:18:51 -08:00
paboyle
63a2993827
Exec info and cache blocking
2015-11-04 03:16:56 -08:00
paboyle
4e65ad21ac
Adding a routine for AVX512 / IMCI with explicit assembly implementations
2015-11-04 03:15:08 -08:00
Peter Boyle
dfc1de6f60
Merge branch 'master' of github.com:paboyle/Grid
2015-11-04 05:14:26 -06:00
Peter Boyle
3b7576ad53
Switch off for now
2015-11-04 05:13:29 -06:00
paboyle
9b5d31ffc1
mac , mult routines
2015-11-04 03:10:34 -08:00
paboyle
a38762159c
Inline assembly hooks for AVX 512. Better way in some ways than BAGEL to generate assembly.
...
Updated Grid_avx512.h
2015-11-04 03:09:06 -08:00
Peter Boyle
ffc5dab17f
AMD FMA4 support added for Interlagos/BlueWaters
2015-11-04 04:29:58 -06:00
Peter Boyle
96608c70d1
chrono causing some problems on Cray systems. Suspend use for now
2015-11-04 04:28:31 -06:00
Peter Boyle
d35d63b171
Algorithm in
2015-11-04 04:27:44 -06:00
Peter Boyle
24044dbc56
Debugged a problem with checkerboarded cshift in the checker dimension which arose
...
only when mpi spread out in the checker dimension. Added a test that trapped and helped debug this
2015-11-04 10:00:55 +00:00
Peter Boyle
abb23df83f
formatting only
2015-11-04 10:00:27 +00:00
Peter Boyle
12c5ec813c
Useful debug messages (commented out) are included for preservation in case I need to revisit this
2015-11-04 09:59:27 +00:00
Peter Boyle
1271508ca2
Bug fix for spread out in x (EO) direction.
...
This is really annoying -- it is very hard to thread the loops with the index
recursion on buffer offset in the red-black case. Must think of a good threading
solution here.
2015-11-04 09:57:57 +00:00
Peter Boyle
ec5af35166
EO bug fix when spread out in x-direction
2015-11-04 09:56:58 +00:00
Peter Boyle
0f59356e86
Problem in comms fixed
2015-11-02 00:00:15 +00:00
8709117aea
Log: generalised Logger class to allow separate logs in Grid-based applications
2015-10-27 17:31:13 +00:00
e6b9aa9076
Config.h removed from repository
2015-10-27 10:47:07 +00:00
Peter Boyle
8889af45ca
FMA4 added
2015-10-09 01:00:53 +02:00
Peter Boyle
83afb2e26a
Poly support for lanczos
2015-10-09 00:43:21 +02:00
Peter Boyle
6d06bd9493
Minor change in commented out code
2015-10-09 00:42:21 +02:00
Peter Boyle
6ee23f409e
Lanczos addition
2015-10-09 00:41:00 +02:00
Peter Boyle
2d95dac6b6
Lanczos untested/partially tested additions. In middle of shake out but at least compiles
2015-10-09 00:40:25 +02:00
Peter Boyle
814c79f38d
SIMD improvements for mac and madd use in complex for avx, sse
2015-10-09 00:38:52 +02:00
paboyle
1878bf97d0
Babbage fix
2015-09-30 16:04:01 -07:00
paboyle
a660ce716b
No compile babbage fix
2015-09-30 16:02:44 -07:00
paboyle
f4b6d1dfea
NGO stores reenabled
2015-09-30 16:02:14 -07:00
paboyle
23813ac798
No compile on babbage fix
2015-09-30 16:01:28 -07:00
Peter Boyle
9f4f65cb46
Added a decoupled memory system benchmark to remove thread synch overhead
2015-09-26 18:23:57 -07:00
Peter Boyle
64d64d1ab6
Updating to modify non-inlining permute routines and hopefully get better reg use and
...
enhance performance.
2015-09-25 08:55:04 -07:00
Peter Boyle
5ef42add2d
Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
...
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
Peter Boyle
2f38ebc446
Reintroducing the hand unrolled loops
2015-09-08 17:45:30 +01:00
Peter Boyle
638d6675ee
Tested rms dH is ~ dt^4 numerically, so believe the ForceGradient is correct now.
...
Paranoia makes me want to diddle with the FG step to ensure dt^2 reappears.
2015-08-31 16:33:20 +01:00
Peter Boyle
357c6ab46d
Reunitarise. Complete the HMC and integrator changes.
2015-08-31 16:32:04 +01:00
Peter Boyle
755dca9533
Added ForceGradient integrator. dH dropped so seems to work. Will only
...
believe it is right once I have pulled a dt^4 error scaling plot out.
2015-08-31 06:23:02 +01:00
Peter Boyle
29fd004d54
Unified integrator and integrator algorithm into virtual class used as a policy for the
...
HMC.
2015-08-30 13:39:19 +01:00
Peter Boyle
aa52fdadcc
Global edit on HMC sector -- making GaugeField a template parameter and
...
preparing to pass integrator, smearing, bc's as policy classes to hmc.
Propose to unify "integrator" and integrator algorithm in a base/derived
way to override step. Want to read through ForceGradient to ensure
that abstraction covers the force gradient case.
2015-08-30 12:18:34 +01:00
Peter Boyle
76d752585b
Started a tidy up in the HMC sector. Now comfortable with the two level integrators;
...
took a little while to figure out what Guido had done & why -- but there is a neat saving of force
evaluations across the nesting time boundary making use of linearity of the leapP in dt.
I cleaned up the printing, reduced the volume of code, in the process sharing printing
between all integrators. Placed an assert that the total integration time for all integrators
must match at end of trajectory.
Have now verified e-dH = 1 for nested integrators in Wilson/Wilson runs with both
Omelyan and with Leapfrog so substantial confidence gained.
2015-08-29 17:18:43 +01:00
Peter Boyle
dc814f30da
Binary IO file for generic Grid array parallel I/O.
...
Number of IO MPI tasks can be varied by selecting which
dimensions use parallel IO and which dimensions use Serial send to boss
I/O.
Thus can neck down from, say 1024 nodes = 4x4x8x8 to {1,8,32,64,128,256,1024} nodes
doing the I/O.
Interpolates nicely between ALL nodes write their data, a single boss per time-plane
in processor space [old UKQCD fortran code did this], and a single node doing all I/O.
Not sure I have the transfer sizes big enough and am not overly convinced fstream
is guaranteed to not give buffer inconsistencies unless I set streambuf size to zero.
Practically it has worked on 8 tasks, 2x1x2x2 writing/cloning NERSC configurations
on my MacOS + OpenMPI and Clang environment.
It is VERY easy to switch to pwrite at a later date, and also easy to send x-strips around from
each node in order to gather bigger chunks at the syscall level.
That would push us up to the circa 8x 18*4*8 == 4KB size write chunk, and by taking, say, x/y non
parallel we get to 16MB contiguous chunks written in multi 4KB transactions
per IOnode in 64^3 lattices for configuration I/O.
I suspect this is fine for system performance.
2015-08-26 13:40:29 +01:00
Peter Boyle
612957f057
pull in original license.
2015-08-21 10:19:08 +01:00
Peter Boyle
cea8ac9a22
Credits to orig source where I found the macro tricks.
2015-08-21 10:14:53 +01:00
Peter Boyle
476da3ee62
Separated IO reader/writers into a proper abstract base,
...
derived relationship. Have Text/Binary/Xml versions of
Reader & Writer.
Any new Reader/Writer class inheriting the interface can give object serialisation
to any desired format now.
new file: lib/serialisation/BaseIO.h
modified: lib/serialisation/BinaryIO.h
modified: lib/serialisation/Serialisation.h
modified: lib/serialisation/TextIO.h
modified: lib/serialisation/XmlIO.h
The test uses the Xml, Binary and Text formats as well as cout << Object.
2015-08-21 10:06:33 +01:00
Peter Boyle
35818fdf6c
Text and Binary readers
2015-08-20 23:04:38 +01:00
Peter Boyle
77d299b414
Cosmetic
2015-08-20 16:30:52 +01:00
Peter Boyle
ab81a25073
XMLReader implementation and a virtual Reader/Writer template framework.
...
Test_serialisation has an example of *code* *free* object serialisation
to both ostream and to XML using macro magic.
Implementing TextReader/TextWriter, YAML, JSON etc.. should be trivial
and we can use configure time options to select the default "Reader" typedef.
Presently done with
"using XMLPolicy::Reader"
to pick up the default serialisation strategy.
2015-08-20 16:21:26 +01:00
Peter Boyle
fdfe194c41
Threading bug in RNG fill fixed.
2015-08-19 14:41:05 +01:00
Peter Boyle
4e085dd0ed
Domain wall even-odd 2f HMC with wilson gauge and PV 2f ratio now running and giving small dH.
...
Azusa is working hard on the rectangle term and we'll hopefully start reproducing plaquettes
from RBC-UKQCD parameters soon !
My new laptop is pretty warm and is starting to groan ;)
2015-08-19 10:26:07 +01:00
Peter Boyle
e8d63c9178
Merge branch 'master' of https://github.com/paboyle/Grid
2015-08-19 05:49:00 +01:00
Peter Boyle
c54c086f17
Even odd preconditioned one flavour ratio
...
(no support for non-const EE schur block)
2015-08-19 05:46:58 +01:00
Peter Boyle
dd6bb73ee0
Added one flavour rational ratios (unprec)
2015-08-19 04:58:40 +01:00
Peter Boyle
fc160eeccc
Added one flavour rational ratios (unprec)
2015-08-19 04:58:40 +01:00
Peter Boyle
48db72259e
EvenOdd schur decomposed mpcdagmpc version of rhmc determinant.
...
dH is also small and plaquette looks right.
2015-08-18 18:37:39 +01:00
Peter Boyle
570150f1d3
EvenOdd schur decomposed mpcdagmpc version of rhmc determinant.
...
dH is also small and plaquette looks right.
2015-08-18 18:37:39 +01:00
Peter Boyle
5c364f8082
One flavour rational unprec added; untested but does compile.
...
Moving param structs into a single header for later connection to file I/O using
macromagic.h
2015-08-18 14:40:08 +01:00
Peter Boyle
a842a6c94d
One flavour rational unprec added; untested but does compile.
...
Moving param structs into a single header for later connection to file I/O using
macromagic.h
2015-08-18 14:40:08 +01:00
Peter Boyle
bdcbfe9310
Even Odd two flavour ratio added and dH == small
2015-08-18 10:37:08 +01:00
Peter Boyle
9306921ded
Even Odd two flavour ratio added and dH == small
2015-08-18 10:37:08 +01:00
Peter Boyle
76f3855629
Merge branch 'master' of https://github.com/paboyle/Grid
2015-08-18 09:23:58 +01:00
Peter Boyle
8621e2409f
Merge branch 'master' of https://github.com/paboyle/Grid
2015-08-18 09:23:58 +01:00
Peter Boyle
6212807a77
Small dh obtained in two flavour ratio so looks ok.
2015-08-18 09:21:29 +01:00
Peter Boyle
7622f0c441
Small dh obtained in two flavour ratio so looks ok.
2015-08-18 09:21:29 +01:00
Peter Boyle
0bc38a69ce
Adding PV pseudofermion in prep for DWF HMC.
...
Not compiled this yet, but cloned in from BFM.
2015-08-18 09:19:42 +01:00
Peter Boyle
25d0eae50c
Adding PV pseudofermion in prep for DWF HMC.
...
Not compiled this yet, but cloned in from BFM.
2015-08-18 09:19:42 +01:00
Peter Boyle
24382d77bb
Adding PV pseudofermion in prep for DWF HMC.
...
Not compiled this yet, but cloned in from BFM.
2015-08-17 23:14:48 +01:00
Peter Boyle
ef6a9e6b07
Adding PV pseudofermion in prep for DWF HMC.
...
Not compiled this yet, but cloned in from BFM.
2015-08-17 23:14:48 +01:00
Peter Boyle
353d66def1
Unused apparently
2015-08-16 01:41:05 +01:00
Peter Boyle
b8166af92b
Unused apparently
2015-08-16 01:41:05 +01:00
Peter Boyle
afeabe0d23
Tidying
2015-08-16 00:14:10 +01:00
Peter Boyle
6180487517
Tidying
2015-08-16 00:14:10 +01:00
Peter Boyle
53da927c3c
Merge branch 'master' of https://github.com/paboyle/Grid
2015-08-15 23:59:04 +01:00
Peter Boyle
f0e32f12cf
Merge branch 'master' of https://github.com/paboyle/Grid
2015-08-15 23:59:04 +01:00