mirror of https://github.com/paboyle/Grid.git synced 2026-04-19 02:01:02 +01:00

1126 Commits

Author SHA1 Message Date
paboyle b27bac4669 Updates for simd in one dir 2016-04-19 15:34:10 -07:00
paboyle c8a93d6a93 Cartesian changes to allow all simd in one direction 2016-04-19 15:18:12 -07:00
paboyle 04072a5e1f Rotate is a temporary hack. Would like to merge ALL
permutes as rotates of length 2, and make any rotate active
over any subset of lane bits. This is hard, and requires general
permute; current intrinsics mean this is only really possible for specific
case by case encodings as presently performed. Intel could produce a general
permute.. would help. IBM did it in VMX.
2016-04-19 15:15:34 -07:00
paboyle 574ea4f843 const safety 2016-04-19 15:15:11 -07:00
paboyle 587f80cd93 Updated to compile and pass under intel SDE 2016-04-19 15:13:54 -07:00
paboyle 528eb773ad Merged.
Merge branch 'master' of https://github.com/paboyle/Grid
2016-04-19 22:24:34 +01:00
paboyle e5657510b0 Rotate support for Ls simd-ized 2016-04-19 22:24:18 +01:00
paboyle f473919526 Rotate support 2016-04-19 22:23:51 +01:00
Christopher Kelly ab56ccdd25 -Complete and working implementation of Grid_empty 2016-04-15 13:17:42 -04:00
Christopher Kelly a646260e82 Merge remote-tracking branch 'origin/master' into ckelly-dec12-2015 2016-04-06 13:57:28 -04:00
Christopher Kelly af9c8d1372 -Checkerboard fixes for Lanczos 2016-04-06 13:50:56 -04:00
paboyle b1192a8908 Benchmark_zmm added 2016-04-06 03:00:07 -07:00
paboyle e8dddb1596 Adding extra benchmark 2016-04-06 10:32:54 +01:00
paboyle c7ba47bdc7 Merge branch 'master' of https://github.com/paboyle/Grid 2016-04-06 02:56:28 +01:00
paboyle e67fc2be18 Adding a trial for openmp overhead minimisation 2016-03-31 16:00:37 +01:00
paboyle f473ef7591 Fixing the compile 2016-03-31 07:47:42 -07:00
paboyle 8052556275 Cleaning up the single/double kernel implementation switch 2016-03-31 14:51:32 +01:00
paboyle 60d965f79e AVX512 improvements; sigfpe trapping too 2016-03-30 08:42:34 +01:00
paboyle 83b15bfcdd Better Avx512 assembly sequence for SU3 using fmaddsub to get the imag imag sign 2016-03-30 08:39:39 +01:00
paboyle 1ecbf9794d Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-30 08:37:55 +01:00
paboyle 2ded354403 configure 2016-03-30 00:17:43 -07:00
paboyle 340428a1fe Eigen fixes and HDCR work 2016-03-30 00:16:02 -07:00
paboyle c77b7ee897 AddSub based alternate SU3 routine 2016-03-28 17:55:22 -06:00
paboyle b6c3bc574b Moving to a more coherent organisation of the inline assembly and arch dependencies. 2016-03-28 16:24:37 +01:00
paboyle 1e355a51e1 Interface change 2016-03-27 23:46:55 -07:00
paboyle ad80f61fba AVX512 shaken out 2016-03-28 00:38:05 -06:00
paboyle 21abaf7e91 Gamma sign change 2016-03-28 00:35:45 -06:00
paboyle 165bffc2e7 Avx512 changes for assembler kernels 2016-03-26 22:25:45 -06:00
paboyle 644fd6d32e Build avx512 clean 2016-03-25 09:35:33 -07:00
azusa f54e0ec9bd Try lanczos to set up hdcr subspace 2016-03-17 10:36:16 +00:00
paboyle 60d4564151 ICC no compile fix 2016-03-16 02:30:40 -07:00
paboyle d4e57f4bc6 IO Bandwidth reporting 2016-03-16 02:30:16 -07:00
paboyle 3920b2c0ab HDCR updates 2016-03-16 02:29:58 -07:00
paboyle 2733c4b93c hdcr updates 2016-03-16 02:29:37 -07:00
paboyle 36a800f26c Microsecond granularity support 2016-03-16 02:28:51 -07:00
paboyle b75da563d9 Resurrect timestamp. Should make optional 2016-03-16 02:28:17 -07:00
paboyle f9faec38be Printing fix under comms none 2016-03-16 02:27:53 -07:00
paboyle d6b64f47d9 Uint64 sum for IO rates 2016-03-16 02:27:22 -07:00
paboyle a359f7a9f5 Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-11 16:07:07 -08:00
paboyle b606deb3f0 Uint64 gsum 2016-03-11 16:06:54 -08:00
paboyle 090e7aa930 Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
paboyle 2dce9c3cff HDCR running on 16^3 with 2x-3x speed up. 2016-03-08 01:01:50 -08:00
paboyle dc72293398 More timing info 2016-03-06 10:46:55 -08:00
paboyle e55c35734b Fix a nocompile 2016-03-03 20:33:28 +00:00
paboyle 325e745daa Merge branch 'master' of https://github.com/paboyle/Grid 2016-03-02 07:04:03 -08:00
paboyle 61413565d0 Back off the inlined spin proj as not working 2016-03-02 07:03:09 -08:00
paboyle ff129d9ad9 Redundant operations removed 2016-03-02 07:02:37 -08:00
paboyle 03fcd3b33a Back out of the colour 2016-03-02 07:01:15 -08:00
paboyle 68b02da483 Backing off the colour 2016-03-02 07:00:43 -08:00
paboyle e051119769 extern "C" should have been in the header file, but Cray is apparently not C++ friendly. 2016-03-02 07:00:00 -08:00