e67fc2be18
Adding a trial for openmp overhead minimisation
2016-03-31 16:00:37 +01:00
8052556275
Cleaning up the single/double kernel implementation switch
2016-03-31 14:51:32 +01:00
60d965f79e
AVX512 improvements; sigfpe trapping too
2016-03-30 08:42:34 +01:00
83b15bfcdd
Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign
2016-03-30 08:39:39 +01:00
1ecbf9794d
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-30 08:37:55 +01:00
2ded354403
configure
2016-03-30 00:17:43 -07:00
340428a1fe
Eigen fixes and HDCR work
2016-03-30 00:16:02 -07:00
c77b7ee897
AddSub based alternate SU3 routine
2016-03-28 17:55:22 -06:00
b6c3bc574b
Moving to a more coherent organisation of the inline assembly and arch dependencies.
2016-03-28 16:24:37 +01:00
1e355a51e1
Interface change
2016-03-27 23:46:55 -07:00
ad80f61fba
AVX512 shaken out
2016-03-28 00:38:05 -06:00
21abaf7e91
Gamma sign change
2016-03-28 00:35:45 -06:00
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
f54e0ec9bd
Try lanczos to set up hdcr subspace
2016-03-17 10:36:16 +00:00
60d4564151
ICC no compile fix
2016-03-16 02:30:40 -07:00
d4e57f4bc6
IO Bandwidth reporting
2016-03-16 02:30:16 -07:00
3920b2c0ab
HDCR updates
2016-03-16 02:29:58 -07:00
2733c4b93c
hdcr updates
2016-03-16 02:29:37 -07:00
36a800f26c
Microsecond granularity support
2016-03-16 02:28:51 -07:00
b75da563d9
Resurrect timestamp. Should make optional
2016-03-16 02:28:17 -07:00
f9faec38be
Printing fix under comms none
2016-03-16 02:27:53 -07:00
d6b64f47d9
Uint64 sum for IO rates
2016-03-16 02:27:22 -07:00
a359f7a9f5
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-11 16:07:07 -08:00
b606deb3f0
Uint64 gsum
2016-03-11 16:06:54 -08:00
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
2dce9c3cff
HDCR running on 16^3 with 2x-3x speed up.
2016-03-08 01:01:50 -08:00
dc72293398
More timing info
2016-03-06 10:46:55 -08:00
e55c35734b
Fix a nocompile
2016-03-03 20:33:28 +00:00
325e745daa
Merge branch 'master' of https://github.com/paboyle/Grid
2016-03-02 07:04:03 -08:00
61413565d0
Back off the inlined spin proj as not working
2016-03-02 07:03:09 -08:00
ff129d9ad9
Redundant operations removed
2016-03-02 07:02:37 -08:00
03fcd3b33a
Back out of the colour
2016-03-02 07:01:15 -08:00
68b02da483
Backing off the colour
2016-03-02 07:00:43 -08:00
e051119769
extern "C" should have been in the header file, but Cray is apparently not C++ friendly.
2016-03-02 07:00:00 -08:00
1eb169ac0b
compatibility fix
2016-02-23 16:36:50 +00:00
5674c3e241
cycle count fix for x86
2016-02-23 16:08:18 +00:00
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
19526d09c2
Merge commit '6aeaf6f568a391e34b913f08be6a11beb28d8842'
2016-02-22 15:23:26 +00:00
6aeaf6f568
Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
turned up problems on the BlueWaters Cray.
Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
40f2db9bc0
Disable metropolis step until 10 traj covered. Should move to exposing these
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
2cfa20cc4e
Improving the logging, got fed up with color so optionally disable.
Backtrace macro used everywhere
2016-02-21 07:58:53 -06:00
a5f683d124
Machine generated
2016-02-21 07:57:42 -06:00
9f0d9ade68
Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
3425751cb8
Missing return value
2016-02-19 01:06:03 +00:00
db5e8050a8
Attempts at some optimisation
2016-02-18 22:33:58 +00:00
a3fbabf404
Bug fix
2016-02-18 18:08:24 +00:00
22422a84d9
Small problem in compressor fix
2016-02-17 19:03:09 -06:00
c9fadf97a5
Simplify the compressor interface again.
2016-02-17 18:16:45 -06:00
c650bb3f3d
Very small merge speed up.
2016-02-16 18:41:53 -06:00