1
0
mirror of https://github.com/paboyle/Grid.git synced 2024-09-20 01:05:38 +01:00

Fixed typo on --enable-comm, removed all references to --enable-precision except for config options, where it is listed as deprecated. Removed travis test for single precision.

This commit is contained in:
Sam Mangham 2020-10-12 12:33:13 +01:00
parent c2b688abc9
commit d55cc5b380
6 changed files with 52 additions and 88 deletions

View File

@ -9,11 +9,6 @@ matrix:
- os: osx - os: osx
osx_image: xcode8.3 osx_image: xcode8.3
compiler: clang compiler: clang
env: PREC=single
- os: osx
osx_image: xcode8.3
compiler: clang
env: PREC=double
before_install: before_install:
- export GRIDDIR=`pwd` - export GRIDDIR=`pwd`
@ -55,7 +50,7 @@ script:
- make -j4 - make -j4
- make install - make install
- cd $CWD/build - cd $CWD/build
- ../configure --enable-precision=$PREC --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF} - ../configure --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF}
- make -j4 - make -j4
- ./benchmarks/Benchmark_dwf --threads 1 --debug-signals - ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
- make check - make check

33
README
View File

@ -111,11 +111,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu
``` bash ``` bash
mkdir build; cd build mkdir build; cd build
../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> ../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
``` ```
where `--enable-precision=` set the default precision, where `--enable-simd=` set the SIMD type, `--enable-
`--enable-simd=` set the SIMD type, `--enable-
comms=`, and `<path>` should be replaced by the prefix path where you want to comms=`, and `<path>` should be replaced by the prefix path where you want to
install Grid. Other options are detailed in the next section, you can also use `configure install Grid. Other options are detailed in the next section, you can also use `configure
--help` to display them. Like with any other program using GNU autotool, the --help` to display them. Like with any other program using GNU autotool, the
@ -146,8 +145,8 @@ If you want to build all the tests at once just use `make tests`.
- `--enable-numa`: enable NUMA first touch optimisation - `--enable-numa`: enable NUMA first touch optimisation
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
- `--enable-precision={single|double}`: set the default precision (default: `double`). - `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option**
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. - `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
- `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
- `--disable-timers`: disable system dependent high-resolution timers. - `--disable-timers`: disable system dependent high-resolution timers.
- `--enable-chroma`: enable Chroma regression tests. - `--enable-chroma`: enable Chroma regression tests.
@ -201,8 +200,7 @@ Alternatively, some CPU codenames can be directly used:
The following configuration is recommended for the Intel Knights Landing platform: The following configuration is recommended for the Intel Knights Landing platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -212,8 +210,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -232,8 +229,7 @@ for interior communication. This is the mpi3 communications implementation.
We recommend four ranks per node for best performance, but optimum is local volume dependent. We recommend four ranks per node for best performance, but optimum is local volume dependent.
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi3-auto \ --enable-comms=mpi3-auto \
--enable-mkl \ --enable-mkl \
CC=icpc MPICXX=mpiicpc CC=icpc MPICXX=mpiicpc
@ -244,8 +240,7 @@ We recommend four ranks per node for best performance, but optimum is local volu
The following configuration is recommended for the Intel Haswell platform: The following configuration is recommended for the Intel Haswell platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3-auto \ --enable-comms=mpi3-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -262,8 +257,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -280,8 +274,7 @@ This is the default.
The following configuration is recommended for the Intel Skylake platform: The following configuration is recommended for the Intel Skylake platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=mpiicpc CXX=mpiicpc
@ -298,8 +291,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -330,8 +322,7 @@ and 8 threads per rank.
The following configuration is recommended for the AMD EPYC platform. The following configuration is recommended for the AMD EPYC platform.
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
CXX=mpicxx CXX=mpicxx
``` ```

View File

@ -115,11 +115,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu
``` bash ``` bash
mkdir build; cd build mkdir build; cd build
../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> ../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
``` ```
where `--enable-precision=` set the default precision, where `--enable-simd=` set the SIMD type, `--enable-
`--enable-simd=` set the SIMD type, `--enable-
comms=`, and `<path>` should be replaced by the prefix path where you want to comms=`, and `<path>` should be replaced by the prefix path where you want to
install Grid. Other options are detailed in the next section, you can also use `configure install Grid. Other options are detailed in the next section, you can also use `configure
--help` to display them. Like with any other program using GNU autotool, the --help` to display them. Like with any other program using GNU autotool, the
@ -150,8 +149,8 @@ If you want to build all the tests at once just use `make tests`.
- `--enable-numa`: enable NUMA first touch optimisation - `--enable-numa`: enable NUMA first touch optimisation
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
- `--enable-precision={single|double}`: set the default precision (default: `double`). - `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option**
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. - `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
- `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
- `--disable-timers`: disable system dependent high-resolution timers. - `--disable-timers`: disable system dependent high-resolution timers.
- `--enable-chroma`: enable Chroma regression tests. - `--enable-chroma`: enable Chroma regression tests.
@ -205,8 +204,7 @@ Alternatively, some CPU codenames can be directly used:
The following configuration is recommended for the Intel Knights Landing platform: The following configuration is recommended for the Intel Knights Landing platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -216,8 +214,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -236,8 +233,7 @@ for interior communication. This is the mpi3 communications implementation.
We recommend four ranks per node for best performance, but optimum is local volume dependent. We recommend four ranks per node for best performance, but optimum is local volume dependent.
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi3-auto \ --enable-comms=mpi3-auto \
--enable-mkl \ --enable-mkl \
CC=icpc MPICXX=mpiicpc CC=icpc MPICXX=mpiicpc
@ -248,8 +244,7 @@ We recommend four ranks per node for best performance, but optimum is local volu
The following configuration is recommended for the Intel Haswell platform: The following configuration is recommended for the Intel Haswell platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3-auto \ --enable-comms=mpi3-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -266,8 +261,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -284,8 +278,7 @@ This is the default.
The following configuration is recommended for the Intel Skylake platform: The following configuration is recommended for the Intel Skylake platform:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=mpiicpc CXX=mpiicpc
@ -302,8 +295,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -334,8 +326,7 @@ and 8 threads per rank.
The following configuration is recommended for the AMD EPYC platform. The following configuration is recommended for the AMD EPYC platform.
``` bash ``` bash
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi3 \ --enable-comms=mpi3 \
CXX=mpicxx CXX=mpicxx
``` ```

View File

@ -12,31 +12,31 @@ module load mpi/openmpi-aarch64
scl enable gcc-toolset-10 bash scl enable gcc-toolset-10 bash
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
* gcc 10.1 prebuild w/ MPI, QPACE4 interactive login * gcc 10.1 prebuild w/ MPI, QPACE4 interactive login
scl enable gcc-toolset-10 bash scl enable gcc-toolset-10 bash
module load mpi/openmpi-aarch64 module load mpi/openmpi-aarch64
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
* armclang 20.2 (qp4) * armclang 20.2 (qp4)
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN" ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN"
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
* gcc 10.0.1 VLA (merlin) * gcc 10.0.1 VLA (merlin)
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
* gcc 10.0.1 fixed-size ACLE (merlin) * gcc 10.0.1 fixed-size ACLE (merlin)
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
* gcc 10.0.1 fixed-size ACLE (fjt) w/ MPI * gcc 10.0.1 fixed-size ACLE (fjt) w/ MPI
@ -46,34 +46,34 @@ export OMPI_CXX=g++-10.0.1
export MPICH_CC=gcc-10.0.1 export MPICH_CC=gcc-10.0.1
export MPICH_CXX=g++-10.0.1 export MPICH_CXX=g++-10.0.1
$ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt" $ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt"
-------------------------------------------------------- --------------------------------------------------------
* armclang 20.0 VLA (merlin) * armclang 20.0 VLA (merlin)
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
TODO check ARMCLANGCOMPAT TODO check ARMCLANGCOMPAT
* armclang 20.1 VLA (merlin) * armclang 20.1 VLA (merlin)
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
TODO check ARMCLANGCOMPAT TODO check ARMCLANGCOMPAT
* armclang 20.1 VLA (fjt cluster) * armclang 20.1 VLA (fjt cluster)
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"
TODO check ARMCLANGCOMPAT TODO check ARMCLANGCOMPAT
* armclang 20.1 VLA w/MPI (fjt cluster) * armclang 20.1 VLA w/MPI (fjt cluster)
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64" ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64"
No ARMCLANGCOMPAT -> still correct ? No ARMCLANGCOMPAT -> still correct ?
@ -81,9 +81,9 @@ No ARMCLANGCOMPAT -> still correct ?
* Fujitsu fcc * Fujitsu fcc
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN" ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN"
* Fujitsu fcc w/ MPI * Fujitsu fcc w/ MPI
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"

View File

@ -184,19 +184,19 @@ Below are shown the `configure` script invocations for three recommended configu
This is the build for every day developing and debugging with Xcode. It uses the Xcode clang c++ compiler, without MPI, and defaults to double-precision. Xcode builds the `Debug` configuration with debug symbols for full debugging: This is the build for every day developing and debugging with Xcode. It uses the Xcode clang c++ compiler, without MPI, and defaults to double-precision. Xcode builds the `Debug` configuration with debug symbols for full debugging:
../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=double --prefix=$GridPre/Debug ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Debug
#### 2. `Release` #### 2. `Release`
Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`, except using single-precision (handy for validation): Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`:
../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=single --prefix=$GridPre/Release ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Release
#### 3. `MPIDebug` #### 3. `MPIDebug`
Debug configuration with MPI: Debug configuration with MPI:
../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --enable-precision=double --prefix=$GridPre/MPIDebug ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --prefix=$GridPre/MPIDebug
### 5.3 Build Grid ### 5.3 Build Grid

View File

@ -178,15 +178,10 @@ Then enter the cloned directory and set up the build system::
Now you can execute the `configure` script to generate makefiles (here from a build directory):: Now you can execute the `configure` script to generate makefiles (here from a build directory)::
mkdir build; cd build mkdir build; cd build
../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto \ ../configure --enable-simd=AVX --enable-comms=mpi-auto \
--prefix=<path> --prefix=<path>
where:: ::
--enable-precision=single|double
sets the **default precision**. Since this is largely a benchmarking convenience, it is anticipated that the default precision may be removed in future implementations,
and that explicit type selection be made at all points. Naturally, most code will be type templated in any case.::
--enable-simd=GEN|SSE4|AVX|AVXFMA|AVXFMA4|AVX2|AVX512|NEONv8|QPX --enable-simd=GEN|SSE4|AVX|AVXFMA|AVXFMA4|AVX2|AVX512|NEONv8|QPX
@ -236,7 +231,7 @@ Detailed build configuration options
--enable-mkl[=path] use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional). --enable-mkl[=path] use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional).
--enable-simd=code setup Grid for the SIMD target `<code>`(default: `GEN`). A list of possible SIMD targets is detailed in a section below. --enable-simd=code setup Grid for the SIMD target `<code>`(default: `GEN`). A list of possible SIMD targets is detailed in a section below.
--enable-gen-simd-width=size select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). E.g. SSE 128 bit corresponds to 16 bytes. --enable-gen-simd-width=size select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). E.g. SSE 128 bit corresponds to 16 bytes.
--enable-precision=single|double set the default precision (default: `double`). --enable-precision=single|double set the default precision (default: `double`). **Deprecated option**
--enable-comms=mpi|none use `<comm>` for message passing (default: `none`). --enable-comms=mpi|none use `<comm>` for message passing (default: `none`).
--enable-rng=sitmo|ranlux48|mt19937 choose the RNG (default: `sitmo`). --enable-rng=sitmo|ranlux48|mt19937 choose the RNG (default: `sitmo`).
--disable-timers disable system dependent high-resolution timers. --disable-timers disable system dependent high-resolution timers.
@ -304,8 +299,7 @@ Build setup for Intel Knights Landing platform
The following configuration is recommended for the Intel Knights Landing platform:: The following configuration is recommended for the Intel Knights Landing platform::
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -314,8 +308,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -332,8 +325,7 @@ presently performs better with use of more than one rank per node, using shared
for interior communication. for interior communication.
We recommend four ranks per node for best performance, but optimum is local volume dependent. :: We recommend four ranks per node for best performance, but optimum is local volume dependent. ::
../configure --enable-precision=double\ ../configure --enable-simd=KNL \
--enable-simd=KNL \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--enable-mkl \ --enable-mkl \
CC=icpc MPICXX=mpiicpc CC=icpc MPICXX=mpiicpc
@ -343,8 +335,7 @@ Build setup for Intel Haswell Xeon platform
The following configuration is recommended for the Intel Haswell platform:: The following configuration is recommended for the Intel Haswell platform::
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi-auto \ --enable-comms=mpi-auto \
--enable-mkl \ --enable-mkl \
CXX=icpc MPICXX=mpiicpc CXX=icpc MPICXX=mpiicpc
@ -360,8 +351,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -379,8 +369,7 @@ Build setup for Intel Skylake Xeon platform
The following configuration is recommended for the Intel Skylake platform:: The following configuration is recommended for the Intel Skylake platform::
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=mpiicpc CXX=mpiicpc
@ -396,8 +385,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
../configure --enable-precision=double\ ../configure --enable-simd=AVX512 \
--enable-simd=AVX512 \
--enable-comms=mpi \ --enable-comms=mpi \
--enable-mkl \ --enable-mkl \
CXX=CC CC=cc CXX=CC CC=cc
@ -422,8 +410,7 @@ and 8 threads per rank.
The following configuration is recommended for the AMD EPYC platform:: The following configuration is recommended for the AMD EPYC platform::
../configure --enable-precision=double\ ../configure --enable-simd=AVX2 \
--enable-simd=AVX2 \
--enable-comms=mpi \ --enable-comms=mpi \
CXX=mpicxx CXX=mpicxx