1
0
mirror of https://github.com/paboyle/Grid.git synced 2025-06-10 19:36:56 +01:00
Commit Graph

358 Commits

Author SHA1 Message Date
303b83cdb8 Scaling benchmarks, verbosity and MPICH aware in acceleratorInit()
For some reason Dirichlet benchmark fails on several nodes; need to
debug this.
2024-02-13 19:48:03 +00:00
14643c0aab SDCC benchmarking scripts for A100 nodes and IceLake nodes (AVX512) 2023-12-04 15:45:57 -05:00
86dac5ff4f Better printing 2023-04-04 07:42:19 -07:00
900e01f49b Temporary 2023-03-27 21:35:06 -07:00
23298acb81 Merge pull request #424 from giltirn/feature/dirichlet-precchange
Precision change implementation
2023-03-22 23:04:52 -04:00
b5b759df73 Merge branch 'develop' into feature/dirichlet 2023-03-21 16:05:46 -04:00
1db58a8acc Precision change improvements
Added a new, much faster implementation of precision change that uses (optionally) a precomputed workspace containing pointer offsets that is device resident, such that all lattice copying occurs only on the device and no host<->device transfer is required, other than the pointer table. It also avoids the need to unpack and repack the fields using explicit lane copying. When this new precisionChange is called without a workspace, one will be computed on-the-fly; however it is still considerably faster than the original implementation.

In the special case of using double2 and when the Grids are the same, calls to the new precisionChange will automatically use precisionChangeFast, such that there is a single API call for all precision changes.

Reliable update and mixed-prec multishift have been modified to precompute precision change workspaces

Renamed the original precisionChange as precisionChangeOrig

Fixed incorrect pointer offset bug in copyLane

Added a test and a benchmark for precisionChange

Added a test for reliable update CG
2023-02-21 10:52:42 -05:00
67f569354e Partial dirichlet changes 2022-11-30 15:51:13 -05:00
fe6e8f5ac6 Benchmark_comms fix 2022-11-15 17:00:49 -05:00
0ae0e5f436 Partial Dirichlet test 2022-11-15 16:40:38 -05:00
653039695b Partial dirichlet changes 2022-11-15 16:37:15 -05:00
c82b164f6b Merge branch 'feature/dirichlet' of https://github.com/paboyle/Grid into feature/dirichlet 2022-10-04 17:41:48 -04:00
413312f9a9 Benchmark the halo construction.
THe bye counts are out and should be doubled for SIMD directions
2022-10-04 11:12:59 -07:00
25df2d2c3b Various precision options 2022-09-27 10:57:12 -04:00
cd5cf6d614 Tracing replaces self timing hooks 2022-08-31 17:33:41 -04:00
c0f8482402 Remove SSC marks 2022-07-07 17:49:36 +01:00
583f7c52f3 SSC mark 2022-06-01 19:27:29 -04:00
58a86c9164 SSC mark removal 2022-06-01 19:27:06 -04:00
18028f4309 Merge branch 'develop' into feature/dirichlet 2022-05-24 18:26:18 -07:00
aa008cbe99 Updated for new Dirichlet interface 2022-05-19 16:44:39 -07:00
4b1997e2f3 wilson sweep test 2022-05-16 15:58:33 +01:00
8939d5dc73 bugfix: eo operator called in correct location 2022-05-16 00:28:28 +01:00
e2fc3a0f04 Merge pull request #28 from paboyle/develop
Sync with Upstream
2022-03-08 09:58:51 +01:00
5340e50427 HMC running with new formulation 2022-03-01 17:10:25 -05:00
9616811c3d Merge branch 'feature/gpt' of https://github.com/lehner/Grid into feature/gpt 2022-02-24 22:03:05 +01:00
8a3002c03b separate left and right masses for CayleyFermion5D 2022-02-24 22:02:56 +01:00
0f1c5b08a1 Dirichlet filters running on AMD and now integrated in Fermion op 2022-02-23 19:29:28 -05:00
70988e43d2 Passes multinode dirichlet test with boundaries at
node boundary or at the single rank boundary
2022-02-23 01:42:14 -05:00
aab3bcb46f Dirichlet first cut - wrong answers on dagger multiply.
Struggling to get a compute node so changing systems
2022-02-22 19:58:33 +00:00
135808dcfa Less verbose 2021-12-07 16:24:24 -05:00
2bf3b4d576 Update to reduce memory footpring in benchmark test 2021-12-07 09:02:02 -08:00
ba7e371b90 Warning free compile on Tursa.
Hopefully got all reqd virtual dtors
2021-10-21 19:56:52 +01:00
8bd70ad8b5 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-09-16 10:22:38 -07:00
b4690e6091 Adding build basics for different systems 2021-09-16 00:00:38 +01:00
c7baeb5bae Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2021-09-14 08:31:11 -07:00
361bb8a101 Remove half prec comms 2021-09-14 15:06:29 +01:00
7efdb3cd2b Remove half prec comms 2021-09-14 15:06:06 +01:00
bcfa9cf068 Improvement of output 2021-08-28 08:08:15 -07:00
75030637cc Improved comms benchmark, same as benchmark_comms_host_device 2021-08-10 05:16:30 -07:00
fe5aaf7677 Make comms benchmark same as Benchmark_comms_host_device 2021-08-09 04:06:30 -07:00
1eea9d73b9 Pass serial RNG around 2021-03-03 23:50:01 +01:00
cf76741ec6 Intel DPCPP Gold happy now (compiles all, runs Benchmark_dwf_fp32 ) 2020-12-03 03:47:11 -08:00
147dc15d26 Update 2020-11-20 13:13:59 -05:00
8fcb392e24 Merge branch 'develop' of https://github.com/paboyle/Grid into develop 2020-11-17 04:51:31 -08:00
dd8d70eeff Build without LIME 2020-11-17 04:41:15 -08:00
3aab983760 Flop count set as in DiRAC-ITT-2020 (mistaken 20% low, but must maintain consistency) 2020-11-16 17:13:58 +01:00
9c4dcc5ea3 Merge branch 'master' into develop 2020-11-16 16:34:57 +01:00
e9bc748828 Useful GPU machine benchmark for GDR used to shakeout Booster at Juelich - see slack earlyaccess channel 2020-11-13 03:58:34 +01:00
f48156529b Work on 2,2,2,8 ranks 2020-11-13 03:57:58 +01:00
f16c2665f5 Host memory explict 2020-11-12 20:29:58 +01:00