mirror of https://github.com/paboyle/Grid.git synced 2026-05-10 04:04:31 +01:00

Commit Graph

  • 705a8098b2 Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port Peter Boyle 2019-07-12 17:14:11 +01:00
  • a29b43d755 Stencil comms cleaner Peter Boyle 2019-07-12 17:12:25 +01:00
  • 368c8369ce Merge branch 'feature/gpu-port' of https://github.com/paboyle/Grid into feature/gpu-port Peter Boyle 2019-07-12 17:11:29 +01:00
  • c0d89a2dbb TODO updates Peter Boyle 2019-07-12 17:11:15 +01:00
  • 78ebd93281 Cuda 9.1 happy Peter Boyle 2019-07-12 17:11:00 +01:00
  • 3d58daf70f Safety check Peter Boyle 2019-07-12 17:10:35 +01:00
  • bd155ca5c0 Overlap comms with compute now supported Peter Boyle 2019-07-12 09:09:40 +01:00
  • 91e2cf9b40 All axes can be used for comms now Peter Boyle 2019-07-12 09:08:26 +01:00
  • 3cc9947731 Better welcome printing Peter Boyle 2019-07-12 06:47:51 +01:00
  • f15eeb0283 localise scope of variables declared in macro Peter Boyle 2019-07-12 06:47:01 +01:00
  • 0996ba9396 Pretty messaging Peter Boyle 2019-07-12 06:45:31 +01:00
  • 12afb0395f Debugging transposeSpin - seems just not to be implemented for Lattice<x> Michael Marshall 2019-07-11 17:42:26 +01:00
  • ec4aa978ab why cant I spinTranspose Felix Erben 2019-07-11 14:01:41 +01:00
  • 966a203dcb Interactions with GPU compilation Peter Boyle 2019-07-11 03:16:17 +01:00
  • 44170cc15f Initialise CUDA device prior to entering MPI. This may or may not interact with Summit, which configures MPI-CUDA mapping with jsrun. TBD. Cases of OpenMPI and MVAPICH are covered, and default to cudaSetDevice(0) otherwise. Peter Boyle 2019-07-11 03:14:23 +01:00
  • 7bc4a06f3f This is probably what you want ... Michael Marshall 2019-07-10 12:29:33 +01:00
  • cd659525e1 You probably want to add this to the build. And you may need to do a bootstrap Michael Marshall 2019-07-10 12:08:37 +01:00
  • dc2240d2d8 why does sliceSum in Nucleon.hpp not work Felix Erben 2019-07-10 11:34:16 +01:00
  • 98cf20cf06 continued work on baryons Felix Erben 2019-07-09 17:42:36 +01:00
  • cc3346073e continued work on baryons Felix Erben 2019-07-09 17:30:32 +01:00
  • 3848da7c50 added nucleon module (non-distillation) Felix Erben 2019-07-08 17:43:14 +01:00
  • c3d0c176ab cleaning up Kl2 contraction portelli 2019-05-24 13:08:35 +01:00
  • 0a71f8bb10 Merge pull request #222 from guelpers/feature/kl2QEDseq portelli 2019-07-05 16:22:34 +01:00
  • b7d0cf6751 bugfix in diquark sum / baryons Felix Erben 2019-07-04 22:06:37 +01:00
  • 3a31ba2ea2 Merge remote-tracking branch 'upstream/develop' into feature/kl2QEDseq guelpers 2019-07-03 14:37:56 +01:00
  • eac6337466 Hadrons: EMLepton: multiple source-sink separations at once guelpers 2019-07-03 14:36:34 +01:00
  • ab7537e002 Merge pull request #221 from fionnoh/bugfix/A2ALoop portelli 2019-07-03 14:13:51 +01:00
  • 2c1a077369 continued on baryons Felix Erben 2019-07-02 17:55:28 +01:00
  • 6e3c3214a3 Offload loops Peter Boyle 2019-07-02 17:25:40 +01:00
  • d6ffadb33b Coalesced write Peter Boyle 2019-07-02 17:25:13 +01:00
  • ae3abbe53d Added the ability for Perambulator module to save unsmeared sinks through the addition of two optional parameters. UnsmearedSinkFileName: if present, specifies the filename to write to. UnsmearedSinkMultiFile: defaults to true to write each sink vector to a different file, but can be set to 0 for a single file. Michael Marshall 2019-07-01 17:28:27 +01:00
  • 5fc0188205 started saving sinks Felix Erben 2019-07-01 14:51:59 +01:00
  • 4c3225412b Drop 5dVEC Peter Boyle 2019-07-01 07:31:26 +01:00
  • b8f7bfbb26 Dont stream as poor perf in some cases Peter Boyle 2019-07-01 07:30:25 +01:00
  • 7b7c470917 Accelerator loop Peter Boyle 2019-07-01 07:29:51 +01:00
  • 532e226b22 cuda 9.1 fixes Peter Boyle 2019-07-01 07:29:22 +01:00
  • 6a13731818 Move GPU cuda call earlier Peter Boyle 2019-07-01 07:28:41 +01:00
  • 67690df3bd Changes needed to have a current insertion on every second time slice - avoids unnecessary contractions fionnoh 2019-06-28 15:18:28 +08:00
  • 1059189abf Bugfix for A2ALoop module fionnoh 2019-06-27 13:49:55 +08:00
  • ce29b18dc9 New modules for loading in MFs as diskvectors and producing propagators from 4 quark contractions fionnoh 2019-06-27 13:46:06 +08:00
  • 421a0a8a36 Changes to A2Autils, A2AMatrix and DiskVector code that are needed for Hadrons 4 quark contraction module fionnoh 2019-06-27 13:45:20 +08:00
  • ac530636ca A2Aloop bugfix fionnoh 2019-06-27 13:44:47 +08:00
  • 2d940a598c Inserted four extra parameters just to make this test compile. Needs to be fixed properly Michael Marshall 2019-06-19 10:37:50 +01:00
  • c28c5fc61b Inserted four extra parameters just to make this test compile. Needs to be fixed properly Michael Marshall 2019-06-19 10:31:41 +01:00
  • 015340d60c Elided superfluous copy on write Michael Marshall 2019-06-19 09:37:03 +01:00
  • 1cd4ee0706 Thrust used on GPU builds Peter Boyle 2019-06-18 12:50:35 +01:00
  • b8f71b6777 Fix NVCC warning unused variable Peter Boyle 2019-06-17 13:58:45 +01:00
  • 703dc20377 Compile tests fix Peter Boyle 2019-06-16 13:59:29 +01:00
  • d976e5c514 Pow is being awkward in thrust for reasons I don't understand. Possible thrust bug. Peter Boyle 2019-06-16 12:05:11 +01:00
  • d7b3efe893 Compile fix Peter Boyle 2019-06-15 17:03:15 +01:00
  • f710d7bd45 TODO list update Peter Boyle 2019-06-15 12:54:27 +01:00
  • cb336aa8f8 Thread loop constructs changing a little Peter Boyle 2019-06-15 12:54:11 +01:00
  • 462900b48d Modified entire test directory to suit new GPU constructs for looping Peter Boyle 2019-06-15 12:53:27 +01:00
  • 0561c2edeb Benchmarks modified for new GPU constructs Peter Boyle 2019-06-15 12:52:56 +01:00
  • 0184719216 Change to predicate type Peter Boyle 2019-06-15 12:52:26 +01:00
  • 24202dbc51 Thread loop construct change Peter Boyle 2019-06-15 12:52:07 +01:00
  • d763c303c5 Clean accelerator barrier Peter Boyle 2019-06-15 12:51:45 +01:00
  • 8e394d3bf9 New loop construct Peter Boyle 2019-06-15 12:51:15 +01:00
  • b881d5489b Move SchurDiagTwoKappa to Algorithms Peter Boyle 2019-06-15 12:50:45 +01:00
  • 82306913a8 Move Schur operator into correct place Peter Boyle 2019-06-15 12:49:22 +01:00
  • 49f90cc7eb use pragma once Peter Boyle 2019-06-15 12:45:22 +01:00
  • b77af0210b Thread loop. Probably deprecate this impl Peter Boyle 2019-06-15 12:44:56 +01:00
  • 5254ede2d8 New loops. Revisit as accelerator loop in future audit Peter Boyle 2019-06-15 12:44:29 +01:00
  • 16e5d7945e Hard to make 5D vec work with GPU code Peter Boyle 2019-06-15 12:43:43 +01:00
  • decc99ca76 Accelerator version Peter Boyle 2019-06-15 12:43:00 +01:00
  • 464cd65931 Still to test this fully Peter Boyle 2019-06-15 12:35:14 +01:00
  • a1ec2f4723 Still to test this routine fully Peter Boyle 2019-06-15 12:33:55 +01:00
  • ea9662ec85 Thread loop changes Peter Boyle 2019-06-15 09:09:57 +01:00
  • 52c74f1cac Thread loop changes Peter Boyle 2019-06-15 09:08:16 +01:00
  • 9a13d2992c Clean up Peter Boyle 2019-06-15 09:05:16 +01:00
  • b0449ae270 Thread loop changes Peter Boyle 2019-06-15 09:04:19 +01:00
  • 1299225105 Accelerator loop changes Peter Boyle 2019-06-15 09:03:46 +01:00
  • 5925e7f405 Thread for changes Peter Boyle 2019-06-15 09:01:30 +01:00
  • be1fd4930f Template instantiation make happy changes Peter Boyle 2019-06-15 08:37:34 +01:00
  • 377fa5dec1 looping construct Peter Boyle 2019-06-15 08:36:48 +01:00
  • e8b78f596e Looping construct changes Peter Boyle 2019-06-15 08:35:57 +01:00
  • 09720c40cd Coalesced loops Peter Boyle 2019-06-15 08:35:26 +01:00
  • bb024dd114 Loop construct changed Peter Boyle 2019-06-15 08:30:05 +01:00
  • 52456b9ec7 New loop construct Peter Boyle 2019-06-15 08:28:45 +01:00
  • b285138be4 Better checking on types Peter Boyle 2019-06-15 08:27:48 +01:00
  • c7dbf4c87e Scalar support for GPU threads Peter Boyle 2019-06-15 08:25:43 +01:00
  • 1e889c93b8 Insert a GPU synchronise Peter Boyle 2019-06-15 08:23:26 +01:00
  • 7379047482 Threading and acceleration primitives further changes. accelerator_barrier() needed and used Peter Boyle 2019-06-15 08:22:48 +01:00
  • d836ce3b78 Clean up of acceleration and threading primitives Peter Boyle 2019-06-15 08:14:21 +01:00
  • cefaacbc07 Changing accelerator loop. Still have work to do for multi-GPU code Peter Boyle 2019-06-15 08:10:24 +01:00
  • 0074ef7f69 thread loops Peter Boyle 2019-06-15 08:04:29 +01:00
  • 20359ca15f Coalesced loops. Peter Boyle 2019-06-15 08:03:57 +01:00
  • 736358b0cb Coalesced loops Peter Boyle 2019-06-15 08:03:13 +01:00
  • 6b692aa726 Thread loops Peter Boyle 2019-06-15 08:02:26 +01:00
  • 7f99e1cd3b Coalesced loops Peter Boyle 2019-06-15 08:01:39 +01:00
  • f3c89df948 Thread loop changes Peter Boyle 2019-06-15 08:00:37 +01:00
  • b7e6d111d7 Thread loop changes. Need to offload this file Peter Boyle 2019-06-15 07:59:10 +01:00
  • f39cf69c33 Accelerator loop change Peter Boyle 2019-06-15 07:58:23 +01:00
  • 8e27338df2 Rationalise number of loop macros Peter Boyle 2019-06-15 07:57:40 +01:00
  • bcbb5e9d26 Remove assembly tests Peter Boyle 2019-06-15 07:57:05 +01:00
  • 0ea7f5279d Accelerator loop changes Peter Boyle 2019-06-15 07:56:14 +01:00
  • 18e5de426d There is a stray use of predicatedWhere introduced by Andrew Lawson in the conserved currents. The conserved currents need rewritten using data parallel operations. Peter Boyle 2019-06-15 07:53:58 +01:00
  • e896d81235 Accelerator loop redefine. Coalesce most accesses, but ET engine still to go clean. Peter Boyle 2019-06-15 07:52:44 +01:00
  • 7b8ccff4f4 Accelerated coalesced loops in most cases Peter Boyle 2019-06-15 07:48:00 +01:00
  • 68541606ab Thread loop changes. Soon try these with accelerator loops and benchmark Peter Boyle 2019-06-15 07:46:42 +01:00