Peter Boyle
a7abda89e2
View location & access mode
2020-05-21 16:13:59 -04:00
Peter Boyle
7860a50f70
Make view specify where and drive data motion - first cut.
...
This is a compile-time option: --enable-unified=yes/no
2020-05-21 16:13:16 -04:00
ferben
6c6812a5ca
GB/s output
2020-05-20 12:26:57 +01:00
Christoph Lehner
8358ee38c4
pull develop
2020-05-19 08:56:18 -04:00
ferben
1f154fe652
some cleanup in BaryonUtils
2020-05-19 13:48:56 +01:00
ferben
d708c0258d
some cleanup in BaryonUtils
2020-05-19 13:48:00 +01:00
Christoph Lehner
a7635fd5ba
summit mem
2020-05-18 17:52:26 -04:00
Peter Boyle
ebb60330c9
Automatic data motion options beginning
2020-05-17 16:34:25 -04:00
5aa60be17d
SerialisableClassName method for serialisable enum, and boolean to test if a serialisable object is an enum
2020-05-15 20:00:34 +01:00
Christoph Lehner
32fbdf4fb1
Merge pull request #5 from paboyle/develop
...
Sync upstream
2020-05-13 09:02:56 +02:00
Peter Boyle
a9847aa866
Dependence fix
2020-05-12 20:03:37 -04:00
Peter Boyle
2e652431e5
No compile on summit fix
2020-05-12 18:56:47 -04:00
Peter Boyle
8b5b55b682
Make tests all compile with current Grid, mostly MdagM removal of norms fixes but a few minor
...
issues fixed too
2020-05-12 17:57:24 -04:00
Peter Boyle
0e3c49f687
TransposeIndex was broken by Christoph
2020-05-12 17:57:01 -04:00
Peter Boyle
cb7ee37562
Close expressions in arg to cshift
2020-05-12 17:56:40 -04:00
Peter Boyle
82f71643a4
Remove the norm in MdagM
2020-05-12 17:55:53 -04:00
Peter Boyle
d24d8e8398
Use X-direction as more bits are meaningful on CUDA.
...
2^31-1 should always be enough for SIMD and thread reduced local volume
e.g. 32*2^31 = 2^36 = (2^9)^4 or 512^4 is big enough.
Where 32 is gpu_threads * Nsimd = 8*4
2020-05-12 10:35:49 -04:00
Christoph Lehner
162e4bb567
no automatic prefetching for now
2020-05-12 07:01:23 -04:00
Peter Boyle
07c0c02f8c
Speed up Cshift
2020-05-11 17:02:01 -04:00
Peter Boyle
8c31c065b5
Keep the Vector fixed to protect it from realloc
2020-05-11 17:00:30 -04:00
Christoph Lehner
b1c86900b2
Merge pull request #4 from paboyle/develop
...
merge
2020-05-11 20:59:29 +02:00
Peter Boyle
bbbee5660d
First compile on HiP
2020-05-10 05:28:09 -04:00
Peter Boyle
ea08f193e7
Allocator cache split into large/small pools
2020-05-10 05:24:26 -04:00
Peter Boyle
2bb2c68e15
Separate pools for small and large allocations cache
2020-05-09 22:57:21 -04:00
Peter Boyle
efe5bc6a3c
Split allocator cache into two pools of different sizes
2020-05-09 22:27:56 -04:00
Peter Boyle
384da487bd
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-05-08 18:55:11 -04:00
Peter Boyle
ee1de82a53
Working ITT benchmark again
2020-05-08 18:54:50 -04:00
Peter Boyle
2b576fc185
Comment dead code remove
2020-05-08 18:54:29 -04:00
Peter Boyle
52081acfa5
NVCC compile fixes
2020-05-08 13:14:12 -04:00
Peter Boyle
b01b7f761a
Merge pull request #283 from DanielRichtmann/feature/minor-fixes
...
Some small fixes
2020-05-08 10:52:03 -04:00
Daniel Richtmann
c83471bfd0
Fix missing checkerboards for adj and conjugate
2020-05-08 16:44:03 +02:00
Daniel Richtmann
ab0c5d77fb
Correct NonHermitianSchurOperatorBase
2020-05-08 16:44:02 +02:00
Daniel Richtmann
779e3c7442
Const-correctness for retrieval routines of GridStopWatch
2020-05-08 16:43:52 +02:00
Daniel Richtmann
0c570824f2
Add missing declaration of GridCmdOptionInt
2020-05-08 16:43:51 +02:00
Peter Boyle
f8b8e00090
Systematise the accelerator primitives and locate to Grid/threads/Accelerator.h / Accelerator.cc
...
Aim to reduce the amount of cuda and other code variations floating around all over the place.
Will move GpuInit into Accelerator.cc from Init.cc
Need to worry about SharedMemoryMPI.cc and the Peer2Peer windows
2020-05-08 06:23:55 -07:00
Peter Boyle
0dd1bdfa94
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2020-05-08 09:21:43 -04:00
Peter Boyle
1d65e2f62c
Slightly faster Chebyshev; ifdef'ed out the fastest until tested numerics
...
Lifted from HDCR setup
2020-05-08 09:20:54 -04:00
Peter Boyle
93920c4811
Remove verbose
2020-05-08 09:19:54 -04:00
Peter Boyle
6859a3e1d4
Schur operator
2020-05-08 09:19:12 -04:00
Peter Boyle
21ca182c36
Comments remove
2020-05-08 09:18:24 -04:00
053b4dd495
Merge pull request #282 from felixerben/baryon-reversal
...
Baryon reversal
2020-05-07 18:09:17 +01:00
ferben
42bb5f0721
assertion
2020-05-07 18:06:12 +01:00
ferben
253bcc3426
back to old version
2020-05-07 18:03:17 +01:00
a887206413
Merge pull request #281 from felixerben/feature/baryonSpeedup
...
Feature/baryon speedup
2020-05-07 13:41:29 +01:00
ferben
591ebb6213
Merge branch 'develop' of github.com:paboyle/Grid into feature/baryonSpeedup
2020-05-07 11:13:21 +01:00
ferben
56e2f7d088
deleted test routines. cleaned up fast version. assert Ns=4,Nc=3.
2020-05-07 10:03:45 +01:00
Peter Boyle
525418abfb
Merge pull request #273 from lehner/feature/gpt
...
Feature/gpt
2020-05-06 10:10:51 -04:00
Peter Boyle
5f780806c2
Merge pull request #279 from paboyle/bugfix/nvcc-config
...
configure fix for nvcc with extra arguments as CXX
2020-05-06 10:07:52 -04:00
Christoph Lehner
3c6ffcb48c
Merge branch 'develop' into feature/gpt
2020-05-06 15:03:35 +02:00
Christoph Lehner
87984ece7d
add Lattice_basis.h
2020-05-06 08:47:18 -04:00