# QUDA benchmarks

This folder contains benchmarks for the [QUDA](https://github.com/lattice/quda) library.

- `Benchmark_Quda`: This benchmark measures the floating-point performance of fermion matrices (Wilson and DWF), as well as memory bandwidth (using a simple `axpy` operation). Measurements are performed for a fixed range of problem sizes.

## Building

After setting up your compilation environment (Tursa: `source /home/dp207/dp207/shared/env/production/env-{base,gpu}.sh`):

```bash
./build-quda.sh # build QUDA
./build-benchmark.sh # build benchmark
```

where `` is an arbitrary directory in which all build products will be stored.

## Running the Benchmark

The benchmark should be run as

```bash
mpirun -np /prefix/qudabench/Benchmark_Quda
```

where `` is the total number of GPUs to use. On Tursa this is 4 times the number of nodes.

Note:

- On Tursa, the `wrapper.sh` script that is typically used with Grid is not necessary.
- Due to QUDA's automatic tuning, the benchmark might take significantly longer to run than `Benchmark_Grid` (even though it does fewer things).
- Setting `QUDA_ENABLE_TUNING=0` disables all tuning, which degrades performance severely. Tuning is enabled by default.
- Setting `QUDA_RESOURCE_PATH=` enables QUDA to save and reuse optimal tuning parameters, making repeated runs much faster.
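
The rank count and tuning-cache notes above can be combined in a small helper script. The following is a minimal sketch, not part of this repository: the function names (`ranks_for_nodes`, `setup_tune_cache`, `launch_command`) and the `GPUS_PER_NODE` variable are hypothetical, and the path to `Benchmark_Quda` is passed in by the caller rather than assumed. It only prints the `mpirun` command, so the line can be inspected before anything is launched.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; none of these names exist in the repository.

GPUS_PER_NODE=4   # value quoted above for Tursa; adjust for other machines

# Total number of MPI ranks = GPUs per node x number of nodes.
ranks_for_nodes() {
  echo $(( GPUS_PER_NODE * $1 ))
}

# Create the given directory if needed and point QUDA_RESOURCE_PATH at it,
# so that tuning parameters can be saved and reused between runs.
setup_tune_cache() {
  local dir="$1"
  mkdir -p "$dir"
  export QUDA_RESOURCE_PATH="$dir"
}

# Print (do not execute) the launch line for a given node count and binary path.
launch_command() {
  local nodes="$1" bench="$2"
  echo "mpirun -np $(ranks_for_nodes "$nodes") $bench"
}
```

After sourcing the script, one might call `setup_tune_cache` with a directory of one's choosing and then `launch_command 2` followed by the path to the built `Benchmark_Quda` binary, which prints an `mpirun -np 8 ...` line for two Tursa nodes. Keeping the cache directory on a filesystem visible to all nodes is an assumption worth checking for your site.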