1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-10-24 17:54:47 +01:00
Commit Graph

1115 Commits

Author SHA1 Message Date
paboyle
b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle
e8dddb1596 Adding extra benchmark 2016-04-06 10:32:54 +01:00
paboyle
c7ba47bdc7 Merge branch 'master' of https://github.com/paboyle/Grid 2016-04-06 02:56:28 +01:00
paboyle
e67fc2be18 Adding a trial for openmp overhead minimisation 2016-03-31 16:00:37 +01:00
paboyle
f473ef7591 Fixing the compile 2016-03-31 07:47:42 -07:00
paboyle
8052556275 Cleaning up the single/double kernel implementation switch 2016-03-31 14:51:32 +01:00
paboyle
60d965f79e AVX512 improvements; sigfpe trapping too 2016-03-30 08:42:34 +01:00
paboyle
83b15bfcdd Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign 2016-03-30 08:39:39 +01:00
paboyle
1ecbf9794d Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-30 08:37:55 +01:00
paboyle
2ded354403 configure 2016-03-30 00:17:43 -07:00
paboyle
340428a1fe Eigen fixes and HDCR work 2016-03-30 00:16:02 -07:00
paboyle
c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle
b6c3bc574b Moving to a more coherent organisation of the inline assembly and arch dependencies. 2016-03-28 16:24:37 +01:00
paboyle
1e355a51e1 Interface change 2016-03-27 23:46:55 -07:00
paboyle
ad80f61fba AVX512 shaken out 2016-03-28 00:38:05 -06:00
paboyle
21abaf7e91 Gamma sign change 2016-03-28 00:35:45 -06:00
paboyle
165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
azusa
f54e0ec9bd Try lanczos to set up hdcr subspace 2016-03-17 10:36:16 +00:00
paboyle
60d4564151 ICC no compile fix 2016-03-16 02:30:40 -07:00
paboyle
d4e57f4bc6 IO Bandwidth reporting 2016-03-16 02:30:16 -07:00
paboyle
3920b2c0ab HDCR updates 2016-03-16 02:29:58 -07:00
paboyle
2733c4b93c hdcr updates 2016-03-16 02:29:37 -07:00
paboyle
36a800f26c Microsecond granularity support 2016-03-16 02:28:51 -07:00
paboyle
b75da563d9 Resurrect timestamp. Should make optional 2016-03-16 02:28:17 -07:00
paboyle
f9faec38be Printing fix under comms none 2016-03-16 02:27:53 -07:00
paboyle
d6b64f47d9 Uint64 sum for IO rates 2016-03-16 02:27:22 -07:00
paboyle
a359f7a9f5 Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-11 16:07:07 -08:00
paboyle
b606deb3f0 Uint64 gsum 2016-03-11 16:06:54 -08:00
paboyle
090e7aa930 Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle
2dce9c3cff HDCR running on 16^3 with 2x-3x speed up. 2016-03-08 01:01:50 -08:00
paboyle
dc72293398 More timing info 2016-03-06 10:46:55 -08:00
paboyle
e55c35734b Fix a nocompile 2016-03-03 20:33:28 +00:00
paboyle
325e745daa Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-02 07:04:03 -08:00
paboyle
61413565d0 Back off the inlined spin proj as not working 2016-03-02 07:03:09 -08:00
paboyle
ff129d9ad9 Redundant operations removed 2016-03-02 07:02:37 -08:00
paboyle
03fcd3b33a Back out of the colour 2016-03-02 07:01:15 -08:00
paboyle
68b02da483 Backing off the colour 2016-03-02 07:00:43 -08:00
paboyle
e051119769 extern "C" should have been in the header file, but Cray is apparently not C++ friendly. 2016-03-02 07:00:00 -08:00
1eb169ac0b compatibility fix 2016-02-23 16:36:50 +00:00
5674c3e241 cycle count fix for x86 2016-02-23 16:08:18 +00:00
Antonin Portelli
497e7e4c53 BG/Q compatibility fix 2016-02-23 15:57:38 +00:00
19526d09c2 Merge commit '6aeaf6f568a391e34b913f08be6a11beb28d8842' 2016-02-22 15:23:26 +00:00
Peter Boyle
6aeaf6f568 Parallel IO worked on. I'm puzzled because I already thought I shook this out on MacOS + OpenMPI and then
turned up problems on the BlueWaters Cray.

Gets 75MB/s from home filesystem on parallel configuration read. Need to make the RNG IO parallel,
and also to look at aggregating bigger writes for the parallel write.
Not sure what the home filesystem is.
2016-02-21 08:03:21 -06:00
Peter Boyle
40f2db9bc0 Disable metropolis step until 10 traj covered. Should move to exposing these
in XML input and start having "applications" directory.
2016-02-21 08:01:44 -06:00
Peter Boyle
2cfa20cc4e Improving the logging, got fed up with color so optionally disable.
Backtrace macro used everwhere
2016-02-21 07:58:53 -06:00
Peter Boyle
a5f683d124 Machine generated 2016-02-21 07:57:42 -06:00
Jung
9f0d9ade68 Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
paboyle
3425751cb8 Missing return value 2016-02-19 01:06:03 +00:00
paboyle
db5e8050a8 Attempts at some optimisation 2016-02-18 22:33:58 +00:00