mirror of https://github.com/paboyle/Grid.git synced 2026-05-23 18:44:17 +01:00
Commit Graph

1083 Commits

Author SHA1 Message Date
Peter Boyle 403bff1a47 Force reqd subgroup size for SYCL 2021-06-22 17:56:10 +00:00
Peter Boyle 6cd9224dd7 SYCL comms buffer allocate 2021-06-16 17:10:55 +00:00
Peter Boyle 4c5440fb06 const happy for sycl 2021-06-15 21:45:07 +00:00
Peter Boyle 0e27e3847d Remove synch 2021-06-03 04:24:19 +00:00
u61464 8cfc7342cd staggered hand unroll read coalesce 2021-05-05 14:17:18 -07:00
u61464 15ae317858 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-05-04 08:40:38 -07:00
u61464 834f536b5f Fastest option on SyCL is now std::complex 2021-05-04 08:40:18 -07:00
ferben cf2923d5dd Jamie's fix 2021-04-27 16:53:37 +01:00
ferben 009ccd581e bugfix 3D stout smearing 2021-04-26 10:36:33 +01:00
Peter Boyle d45c868656 Change interface 2021-04-25 10:53:34 -04:00
Peter Boyle 955a8113de Expose label only to reduce number of parameters 2021-04-25 10:36:38 -04:00
Peter Boyle dbe210dd53 Open the ens_id 2021-04-25 10:25:59 -04:00
Peter Boyle 980e721f6e Update MetaData.h 2021-04-13 09:33:01 -04:00
aznyong 895244ecc3 Merge with upstream; implemented conserved tadpole for Shamir action. 2021-04-06 13:46:33 +01:00
aznyong addeb621a7 Implemented tadpole operator for Shamir action. 2021-04-06 13:45:37 +01:00
Peter Boyle a7fb25adf6 Make Cshift fields static to avoid repeated reallocate overhead 2021-03-29 21:44:14 +02:00
Peter Boyle e947992957 Improved force terms 2021-03-29 20:04:06 +02:00
Peter Boyle bb89a82a07 Staggered coalesced read 2021-03-29 20:01:15 +02:00
Peter Boyle 15c50a7442 Explicit instantiate the template function 2021-03-18 15:40:42 -04:00
Peter Boyle 9c2b37218a sRNG parameter added 2021-03-18 06:24:11 -04:00
Peter Boyle 51f506553c Read out the local ID once, and store 2021-03-12 15:33:04 +01:00
Peter Boyle db3ac67506 Update thread issue 2021-03-12 14:55:07 +01:00
Peter Boyle da91a884ef NVCC versions found buggy added as guard 2021-03-11 23:54:53 +01:00
Peter Boyle ce1fc1f48a Possible fallback plan for Fionn's compiler bug in nvcc 2021-03-11 22:20:53 +01:00
u61464 0e21adb3f6 Gives 200GF/s on SyCL/DG1 8^4, doesn't uglify develop for other platforms too badly.
Easy to revert to clean more C++ stylistic code. There's a SYCL_HACK macro I will clean up later once dpcpp
evolves a central nervous system.
2021-03-10 05:40:51 -08:00
Peter Boyle 2146eebb65 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-03-09 04:31:46 +01:00
Peter Boyle 6a429ee6d3 2d loop hits Nvidia 16bit limit on large local vols 2021-03-09 04:31:10 +01:00
Peter Boyle 4d1ea15c79 More verbosity. The 16bit limit on Grid.y, Grid.z is annoying 2021-03-09 04:29:37 +01:00
Peter Boyle a76cb005e0 Update Tensor_exp.h 2021-03-08 13:37:57 -05:00
Peter Boyle a9604367c1 Merge pull request #336 from lehner/feature/gpt
Make ShmDims configurable; adjust GRID_MAX_SIMD to allow for 128 byte width on GPUs
2021-03-05 13:17:19 -05:00
Peter Boyle 89d299ceec Merge pull request #333 from mmphys/bugfix/LatTransfer
Fix convertType for GPU in Lattice_transfer.h
2021-03-05 12:46:33 -05:00
Christoph Lehner b24181aa4f Update Coordinate.h
Revert GRID_MAX_SIMD change
2021-03-05 16:56:58 +01:00
ferben 7a19432e0b whitespace 2021-03-05 10:57:09 +00:00
ferben 9b15704290 tested and consistent 2021-03-05 10:42:32 +00:00
Michael Marshall f252d69eef Merge branch 'develop' into bugfix/LatTransfer
* develop:
  Pass serial RNG around
  Sycl happier
2021-03-04 20:41:30 +00:00
ferben 3b06e4655e Merge branch 'develop' into feature/XiToSigma 2021-03-04 20:06:16 +00:00
ferben d4b4de8f42 changes 2021-03-04 20:01:24 +00:00
Peter Boyle c90beee774 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-03-03 23:50:29 +01:00
Peter Boyle 1eea9d73b9 Pass serial RNG around 2021-03-03 23:50:01 +01:00
u61464 679d1d22f7 Sycl happier 2021-03-03 11:21:43 -08:00
Michael Marshall 03e54722c1 Merge branch 'develop' into bugfix/LatTransfer
* develop:
  Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case. Other cases to do. This now includes comms code path.
2021-03-03 16:13:23 +00:00
Peter Boyle 442336bd96 Hand unrolled to use optimised code paths on GPU for coalesced reads in Wilson case.
Other cases to do. This now includes comms code path.
2021-03-02 14:50:51 +01:00
Christoph Lehner 9c9566b9c9 Merge pull request #23 from paboyle/develop
Sync
2021-03-01 12:33:51 +01:00
Michael Marshall 1059a81a3c Merge branch 'develop' into bugfix/LatTransfer
* develop:
  Better SIMD usage/coalescence
2021-02-27 00:21:36 +00:00
Peter Boyle 2e61556389 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-02-26 17:52:20 +01:00
Peter Boyle f9b1f240f6 Better SIMD usage/coalescence 2021-02-26 17:51:41 +01:00
Michael Marshall 69f41469dd Merge branch 'develop' into bugfix/LatTransfer
* develop: (26 commits)
  Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
  Correct misleading ac help string
  Enable performance counting in WilsonFermion like in others
  changed back A2AUtils warning
  changed if and accelerator_for - no runtime errors any more
  Mac OS (Darwin) sed -i flag for in-place editing differs from posix / gnu
  Seems the intention with AutoConf produced Grid/Config.h was to use sed to translate standard PACKAGE_ #defines into GRID_ however due to missing '' after -i this hasn't been working. Perhaps it is too late to fix this, since we don't know who/what is relying on this downstream? ... but if they are, and AutoConf is being used, then likely these #defines have just been redefined anyway. Seems reasonable to redefine PACKAGE and VERSION as well, as none of these macros are used throughout Grid or Hadrons.
  Fixed compile issues with maxLocalNorm2 for non-scalar lattices maxLocalNorm2 test now reuses the random field
  MADWF 5d source option for hadrons - look at Grid of source Abort on GPU error
  maxLocalNorm2()
  change back benchmark_ITT
  prettify
  Flop cout matches DiRAC-ITT-2020
  revert changes
  merge develop
  fixes
  weird bug in 2pt function...
  revert changes
  final version, tested on CPU and GPU
  bugfix
  ...
2021-02-25 09:19:17 +00:00
Christopher Kelly c791cb2214 Merge branch 'develop' into feature/link-update-mask 2021-02-23 11:51:54 -05:00
Christopher Kelly d5ab571a89 Added the ability to apply a custom "filter" to the conjugate momentum in the Integrator classes, applied both after refresh and after applying the forces
Added a conjugate momentum "filter" that applies a phase to each site. With sites set to 1.0 or 0.0 this acts as a mask and enables, for example, the freezing of inactive gauge links in DDHMC
Added tests/forces/Test_momentum_filter demonstrating the use of the filter to freeze boundary links
2021-02-23 11:49:56 -05:00
ferben 0ed800f6e4 merge develop 2021-02-23 14:54:46 +00:00