mirror of
				https://github.com/paboyle/Grid.git
				synced 2025-11-04 14:04:32 +00:00 
			
		
		
		
	Fixed typo on --enable-comm, removed all references to --enable-precision except for config options, where it is listed as deprecated. Removed travis test for single precision.
This commit is contained in:
		@@ -9,11 +9,6 @@ matrix:
 | 
			
		||||
    - os:        osx
 | 
			
		||||
      osx_image: xcode8.3
 | 
			
		||||
      compiler: clang
 | 
			
		||||
      env: PREC=single
 | 
			
		||||
    - os:        osx
 | 
			
		||||
      osx_image: xcode8.3
 | 
			
		||||
      compiler: clang
 | 
			
		||||
      env: PREC=double
 | 
			
		||||
      
 | 
			
		||||
before_install:
 | 
			
		||||
    - export GRIDDIR=`pwd`
 | 
			
		||||
@@ -55,7 +50,7 @@ script:
 | 
			
		||||
    - make -j4
 | 
			
		||||
    - make install
 | 
			
		||||
    - cd $CWD/build
 | 
			
		||||
    - ../configure --enable-precision=$PREC --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF}
 | 
			
		||||
    - ../configure --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF}
 | 
			
		||||
    - make -j4 
 | 
			
		||||
    - ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
 | 
			
		||||
    - make check
 | 
			
		||||
 
 | 
			
		||||
							
								
								
									
										33
									
								
								README
									
									
									
									
									
								
							
							
						
						
									
										33
									
								
								README
									
									
									
									
									
								
							@@ -111,11 +111,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
mkdir build; cd build
 | 
			
		||||
../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
 | 
			
		||||
../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
 | 
			
		||||
```
 | 
			
		||||
 | 
			
		||||
where `--enable-precision=` set the default precision,
 | 
			
		||||
`--enable-simd=` set the SIMD type, `--enable-
 | 
			
		||||
where `--enable-simd=` set the SIMD type, `--enable-
 | 
			
		||||
comms=`, and `<path>` should be replaced by the prefix path where you want to
 | 
			
		||||
install Grid. Other options are detailed in the next section, you can also use `configure
 | 
			
		||||
--help` to display them. Like with any other program using GNU autotool, the
 | 
			
		||||
@@ -146,8 +145,8 @@ If you want to build all the tests at once just use `make tests`.
 | 
			
		||||
- `--enable-numa`: enable NUMA first touch optimisation
 | 
			
		||||
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
 | 
			
		||||
- `--enable-precision={single|double}`: set the default precision (default: `double`).
 | 
			
		||||
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option**
 | 
			
		||||
- `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
 | 
			
		||||
- `--disable-timers`: disable system dependent high-resolution timers.
 | 
			
		||||
- `--enable-chroma`: enable Chroma regression tests.
 | 
			
		||||
@@ -201,8 +200,7 @@ Alternatively, some CPU codenames can be directly used:
 | 
			
		||||
The following configuration is recommended for the Intel Knights Landing platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi-auto  \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -212,8 +210,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi       \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -232,8 +229,7 @@ for interior communication. This is the mpi3 communications implementation.
 | 
			
		||||
We recommend four ranks per node for best performance, but optimum is local volume dependent.
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi3-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CC=icpc MPICXX=mpiicpc 
 | 
			
		||||
@@ -244,8 +240,7 @@ We recommend four ranks per node for best performance, but optimum is local volu
 | 
			
		||||
The following configuration is recommended for the Intel Haswell platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -262,8 +257,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -280,8 +274,7 @@ This is the default.
 | 
			
		||||
The following configuration is recommended for the Intel Skylake platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=mpiicpc
 | 
			
		||||
@@ -298,8 +291,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -330,8 +322,7 @@ and 8 threads per rank.
 | 
			
		||||
The following configuration is recommended for the AMD EPYC platform.
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3 \
 | 
			
		||||
             CXX=mpicxx 
 | 
			
		||||
```
 | 
			
		||||
 
 | 
			
		||||
							
								
								
									
										33
									
								
								README.md
									
									
									
									
									
								
							
							
						
						
									
										33
									
								
								README.md
									
									
									
									
									
								
							@@ -115,11 +115,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
mkdir build; cd build
 | 
			
		||||
../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
 | 
			
		||||
../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
 | 
			
		||||
```
 | 
			
		||||
 | 
			
		||||
where `--enable-precision=` set the default precision,
 | 
			
		||||
`--enable-simd=` set the SIMD type, `--enable-
 | 
			
		||||
where `--enable-simd=` set the SIMD type, `--enable-
 | 
			
		||||
comms=`, and `<path>` should be replaced by the prefix path where you want to
 | 
			
		||||
install Grid. Other options are detailed in the next section, you can also use `configure
 | 
			
		||||
--help` to display them. Like with any other program using GNU autotool, the
 | 
			
		||||
@@ -150,8 +149,8 @@ If you want to build all the tests at once just use `make tests`.
 | 
			
		||||
- `--enable-numa`: enable NUMA first touch optimisation
 | 
			
		||||
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
 | 
			
		||||
- `--enable-precision={single|double}`: set the default precision (default: `double`).
 | 
			
		||||
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option**
 | 
			
		||||
- `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
- `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
 | 
			
		||||
- `--disable-timers`: disable system dependent high-resolution timers.
 | 
			
		||||
- `--enable-chroma`: enable Chroma regression tests.
 | 
			
		||||
@@ -205,8 +204,7 @@ Alternatively, some CPU codenames can be directly used:
 | 
			
		||||
The following configuration is recommended for the Intel Knights Landing platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi-auto  \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -216,8 +214,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi       \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -236,8 +233,7 @@ for interior communication. This is the mpi3 communications implementation.
 | 
			
		||||
We recommend four ranks per node for best performance, but optimum is local volume dependent.
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi3-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CC=icpc MPICXX=mpiicpc 
 | 
			
		||||
@@ -248,8 +244,7 @@ We recommend four ranks per node for best performance, but optimum is local volu
 | 
			
		||||
The following configuration is recommended for the Intel Haswell platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -266,8 +261,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -284,8 +278,7 @@ This is the default.
 | 
			
		||||
The following configuration is recommended for the Intel Skylake platform:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=mpiicpc
 | 
			
		||||
@@ -302,8 +295,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi3      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -334,8 +326,7 @@ and 8 threads per rank.
 | 
			
		||||
The following configuration is recommended for the AMD EPYC platform.
 | 
			
		||||
 | 
			
		||||
``` bash
 | 
			
		||||
../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi3 \
 | 
			
		||||
             CXX=mpicxx 
 | 
			
		||||
```
 | 
			
		||||
 
 | 
			
		||||
@@ -12,31 +12,31 @@ module load mpi/openmpi-aarch64
 | 
			
		||||
 | 
			
		||||
scl enable gcc-toolset-10 bash
 | 
			
		||||
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
 | 
			
		||||
* gcc 10.1 prebuild w/ MPI, QPACE4 interactive login
 | 
			
		||||
 | 
			
		||||
scl enable gcc-toolset-10 bash
 | 
			
		||||
module load mpi/openmpi-aarch64
 | 
			
		||||
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
 | 
			
		||||
------------------------------------------------------------------------------
 | 
			
		||||
 | 
			
		||||
* armclang 20.2 (qp4)
 | 
			
		||||
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
 | 
			
		||||
------------------------------------------------------------------------------
 | 
			
		||||
 | 
			
		||||
* gcc 10.0.1 VLA (merlin)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* gcc 10.0.1 fixed-size ACLE (merlin)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* gcc 10.0.1 fixed-size ACLE (fjt) w/ MPI
 | 
			
		||||
@@ -46,34 +46,34 @@ export OMPI_CXX=g++-10.0.1
 | 
			
		||||
export MPICH_CC=gcc-10.0.1
 | 
			
		||||
export MPICH_CXX=g++-10.0.1
 | 
			
		||||
 | 
			
		||||
$ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt"
 | 
			
		||||
$ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt"
 | 
			
		||||
 | 
			
		||||
--------------------------------------------------------
 | 
			
		||||
 | 
			
		||||
* armclang 20.0 VLA (merlin)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
 | 
			
		||||
TODO check ARMCLANGCOMPAT
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* armclang 20.1 VLA (merlin)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static
 | 
			
		||||
 | 
			
		||||
TODO check ARMCLANGCOMPAT
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* armclang 20.1 VLA (fjt cluster)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"
 | 
			
		||||
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"
 | 
			
		||||
 | 
			
		||||
TODO check ARMCLANGCOMPAT
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* armclang 20.1 VLA w/MPI (fjt cluster)
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64"
 | 
			
		||||
../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64"
 | 
			
		||||
 | 
			
		||||
No ARMCLANGCOMPAT -> still correct ?
 | 
			
		||||
 | 
			
		||||
@@ -81,9 +81,9 @@ No ARMCLANGCOMPAT -> still correct ?
 | 
			
		||||
 | 
			
		||||
* Fujitsu fcc
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN"
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
* Fujitsu fcc w/ MPI
 | 
			
		||||
 | 
			
		||||
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"
 | 
			
		||||
../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU"
 | 
			
		||||
 
 | 
			
		||||
@@ -184,19 +184,19 @@ Below are shown the `configure` script invocations for three recommended configu
 | 
			
		||||
 | 
			
		||||
This is the build for every day developing and debugging with Xcode. It uses the Xcode clang c++ compiler, without MPI, and defaults to double-precision. Xcode builds the `Debug` configuration with debug symbols for full debugging:
 | 
			
		||||
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=double --prefix=$GridPre/Debug
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Debug
 | 
			
		||||
 | 
			
		||||
#### 2. `Release`
 | 
			
		||||
 | 
			
		||||
Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`, except using single-precision (handy for validation):
 | 
			
		||||
Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`:
 | 
			
		||||
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=single --prefix=$GridPre/Release
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Release
 | 
			
		||||
 | 
			
		||||
#### 3. `MPIDebug`
 | 
			
		||||
 | 
			
		||||
Debug configuration with MPI:
 | 
			
		||||
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --enable-precision=double --prefix=$GridPre/MPIDebug
 | 
			
		||||
    ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --prefix=$GridPre/MPIDebug
 | 
			
		||||
 | 
			
		||||
### 5.3 Build Grid
 | 
			
		||||
 | 
			
		||||
 
 | 
			
		||||
@@ -178,15 +178,10 @@ Then enter the cloned directory and set up the build system::
 | 
			
		||||
Now you can execute the `configure` script to generate makefiles (here from a build directory)::
 | 
			
		||||
 | 
			
		||||
  mkdir build; cd build
 | 
			
		||||
  ../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto \
 | 
			
		||||
  ../configure --enable-simd=AVX --enable-comms=mpi-auto \
 | 
			
		||||
      --prefix=<path>
 | 
			
		||||
 | 
			
		||||
where::
 | 
			
		||||
 | 
			
		||||
  --enable-precision=single|double
 | 
			
		||||
 | 
			
		||||
sets the **default precision**. Since this is largely a benchmarking convenience, it is anticipated that the default precision may be removed in future implementations,
 | 
			
		||||
and that explicit type selection be made at all points. Naturally, most code will be type templated in any case.::
 | 
			
		||||
::
 | 
			
		||||
 | 
			
		||||
   --enable-simd=GEN|SSE4|AVX|AVXFMA|AVXFMA4|AVX2|AVX512|NEONv8|QPX
 | 
			
		||||
 | 
			
		||||
@@ -236,7 +231,7 @@ Detailed build configuration options
 | 
			
		||||
  --enable-mkl[=path]                     use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional).
 | 
			
		||||
  --enable-simd=code                      setup Grid for the SIMD target `<code>`(default: `GEN`). A list of possible SIMD targets is detailed in a section below.
 | 
			
		||||
  --enable-gen-simd-width=size            select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). E.g. SSE 128 bit corresponds to 16 bytes.
 | 
			
		||||
  --enable-precision=single|double        set the default precision (default: `double`).
 | 
			
		||||
  --enable-precision=single|double        set the default precision (default: `double`). **Deprecated option**
 | 
			
		||||
  --enable-comms=mpi|none                 use `<comm>` for message passing (default: `none`).
 | 
			
		||||
  --enable-rng=sitmo|ranlux48|mt19937     choose the RNG (default: `sitmo`).
 | 
			
		||||
  --disable-timers                        disable system dependent high-resolution timers.
 | 
			
		||||
@@ -304,8 +299,7 @@ Build setup for Intel Knights Landing platform
 | 
			
		||||
 | 
			
		||||
The following configuration is recommended for the Intel Knights Landing platform::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
  ../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi-auto  \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -314,8 +308,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library.
 | 
			
		||||
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
  ../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi       \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -332,8 +325,7 @@ presently performs better with use of more than one rank per node, using shared
 | 
			
		||||
for interior communication.
 | 
			
		||||
We recommend four ranks per node for best performance, but optimum is local volume dependent. ::
 | 
			
		||||
 | 
			
		||||
   ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=KNL        \
 | 
			
		||||
   ../configure --enable-simd=KNL        \
 | 
			
		||||
             --enable-comms=mpi-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CC=icpc MPICXX=mpiicpc 
 | 
			
		||||
@@ -343,8 +335,7 @@ Build setup for Intel Haswell Xeon platform
 | 
			
		||||
 | 
			
		||||
The following configuration is recommended for the Intel Haswell platform::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
  ../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi-auto \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=icpc MPICXX=mpiicpc
 | 
			
		||||
@@ -360,8 +351,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
  ../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -379,8 +369,7 @@ Build setup for Intel Skylake Xeon platform
 | 
			
		||||
 | 
			
		||||
The following configuration is recommended for the Intel Skylake platform::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
  ../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=mpiicpc
 | 
			
		||||
@@ -396,8 +385,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed.
 | 
			
		||||
 | 
			
		||||
If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use::
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX512     \
 | 
			
		||||
  ../configure --enable-simd=AVX512     \
 | 
			
		||||
             --enable-comms=mpi      \
 | 
			
		||||
             --enable-mkl             \
 | 
			
		||||
             CXX=CC CC=cc
 | 
			
		||||
@@ -422,8 +410,7 @@ and 8 threads per rank.
 | 
			
		||||
The following configuration is recommended for the AMD EPYC platform::
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
  ../configure --enable-precision=double\
 | 
			
		||||
             --enable-simd=AVX2       \
 | 
			
		||||
  ../configure --enable-simd=AVX2       \
 | 
			
		||||
             --enable-comms=mpi \
 | 
			
		||||
             CXX=mpicxx 
 | 
			
		||||
 | 
			
		||||
 
 | 
			
		||||
		Reference in New Issue
	
	Block a user