mirror of https://github.com/paboyle/Grid.git, synced 2025-10-30 11:34:32 +00:00

Merge pull request #314 from smangham/issue_readme_precision
Fix for deprecated configure options in documentation (issue #313)
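
Build scripts outside this repository may still pass the deprecated flag. The sketch below shows one way to remove it from a stored `configure` command line, mirroring the edits in this change. `strip_precision` is a hypothetical helper name, not part of Grid, and the `sed` pattern assumes the flag is written as `--enable-precision=<value>` with no embedded spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid): read configure command lines on
# stdin and drop any --enable-precision=<value> argument, together with the
# whitespace in front of it. A trailing line-continuation backslash is kept.
strip_precision() {
    sed -E 's/[[:space:]]*--enable-precision=[^[:space:]\\]*//g'
}

# Example taken from the README hunk in this diff.
echo '../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>' | strip_precision
# prints: ../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path>
```

The filter only rewrites text. It does not check whether a given Grid version still accepts the option, so `configure --help` remains the authority on that.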
		| @@ -9,11 +9,6 @@ matrix: | |||||||
|     - os:        osx |     - os:        osx | ||||||
|       osx_image: xcode8.3 |       osx_image: xcode8.3 | ||||||
|       compiler: clang |       compiler: clang | ||||||
|       env: PREC=single |  | ||||||
|     - os:        osx |  | ||||||
|       osx_image: xcode8.3 |  | ||||||
|       compiler: clang |  | ||||||
|       env: PREC=double |  | ||||||
|        |        | ||||||
| before_install: | before_install: | ||||||
|     - export GRIDDIR=`pwd` |     - export GRIDDIR=`pwd` | ||||||
| @@ -55,7 +50,7 @@ script: | |||||||
|     - make -j4 |     - make -j4 | ||||||
|     - make install |     - make install | ||||||
|     - cd $CWD/build |     - cd $CWD/build | ||||||
|     - ../configure --enable-precision=$PREC --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF} |     - ../configure --enable-simd=SSE4 --enable-comms=none --with-lime=$CWD/build/lime/install ${EXTRACONF} | ||||||
|     - make -j4  |     - make -j4  | ||||||
|     - ./benchmarks/Benchmark_dwf --threads 1 --debug-signals |     - ./benchmarks/Benchmark_dwf --threads 1 --debug-signals | ||||||
|     - make check |     - make check | ||||||
|   | |||||||
							
								
								
									
README
							| @@ -111,11 +111,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu | |||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| mkdir build; cd build | mkdir build; cd build | ||||||
| ../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> | ../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> | ||||||
| ``` | ``` | ||||||
|  |  | ||||||
| where `--enable-precision=` set the default precision, | where `--enable-simd=` set the SIMD type, `--enable- | ||||||
| `--enable-simd=` set the SIMD type, `--enable- |  | ||||||
| comms=`, and `<path>` should be replaced by the prefix path where you want to | comms=`, and `<path>` should be replaced by the prefix path where you want to | ||||||
| install Grid. Other options are detailed in the next section, you can also use `configure | install Grid. Other options are detailed in the next section, you can also use `configure | ||||||
| --help` to display them. Like with any other program using GNU autotool, the | --help` to display them. Like with any other program using GNU autotool, the | ||||||
| @@ -146,8 +145,8 @@ If you want to build all the tests at once just use `make tests`. | |||||||
| - `--enable-numa`: enable NUMA first touch optimisation | - `--enable-numa`: enable NUMA first touch optimisation | ||||||
| - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. | - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. | ||||||
| - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). | - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). | ||||||
| - `--enable-precision={single|double}`: set the default precision (default: `double`). | - `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option** | ||||||
| - `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. | - `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. | ||||||
| - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). | - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). | ||||||
| - `--disable-timers`: disable system dependent high-resolution timers. | - `--disable-timers`: disable system dependent high-resolution timers. | ||||||
| - `--enable-chroma`: enable Chroma regression tests. | - `--enable-chroma`: enable Chroma regression tests. | ||||||
| @@ -201,8 +200,7 @@ Alternatively, some CPU codenames can be directly used: | |||||||
| The following configuration is recommended for the Intel Knights Landing platform: | The following configuration is recommended for the Intel Knights Landing platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi-auto  \ |              --enable-comms=mpi-auto  \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -212,8 +210,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi       \ |              --enable-comms=mpi       \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -232,8 +229,7 @@ for interior communication. This is the mpi3 communications implementation. | |||||||
| We recommend four ranks per node for best performance, but optimum is local volume dependent. | We recommend four ranks per node for best performance, but optimum is local volume dependent. | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi3-auto \ |              --enable-comms=mpi3-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CC=icpc MPICXX=mpiicpc  |              CC=icpc MPICXX=mpiicpc  | ||||||
| @@ -244,8 +240,7 @@ We recommend four ranks per node for best performance, but optimum is local volu | |||||||
| The following configuration is recommended for the Intel Haswell platform: | The following configuration is recommended for the Intel Haswell platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3-auto \ |              --enable-comms=mpi3-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -262,8 +257,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -280,8 +274,7 @@ This is the default. | |||||||
| The following configuration is recommended for the Intel Skylake platform: | The following configuration is recommended for the Intel Skylake platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=mpiicpc |              CXX=mpiicpc | ||||||
| @@ -298,8 +291,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -330,8 +322,7 @@ and 8 threads per rank. | |||||||
| The following configuration is recommended for the AMD EPYC platform. | The following configuration is recommended for the AMD EPYC platform. | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3 \ |              --enable-comms=mpi3 \ | ||||||
|              CXX=mpicxx  |              CXX=mpicxx  | ||||||
| ``` | ``` | ||||||
|   | |||||||
							
								
								
									
README.md
							| @@ -115,11 +115,10 @@ Now you can execute the `configure` script to generate makefiles (here from a bu | |||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| mkdir build; cd build | mkdir build; cd build | ||||||
| ../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> | ../configure --enable-simd=AVX --enable-comms=mpi-auto --prefix=<path> | ||||||
| ``` | ``` | ||||||
|  |  | ||||||
| where `--enable-precision=` set the default precision, | where `--enable-simd=` set the SIMD type, `--enable- | ||||||
| `--enable-simd=` set the SIMD type, `--enable- |  | ||||||
| comms=`, and `<path>` should be replaced by the prefix path where you want to | comms=`, and `<path>` should be replaced by the prefix path where you want to | ||||||
| install Grid. Other options are detailed in the next section, you can also use `configure | install Grid. Other options are detailed in the next section, you can also use `configure | ||||||
| --help` to display them. Like with any other program using GNU autotool, the | --help` to display them. Like with any other program using GNU autotool, the | ||||||
| @@ -150,8 +149,8 @@ If you want to build all the tests at once just use `make tests`. | |||||||
| - `--enable-numa`: enable NUMA first touch optimisation | - `--enable-numa`: enable NUMA first touch optimisation | ||||||
| - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. | - `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below. | ||||||
| - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). | - `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). | ||||||
| - `--enable-precision={single|double}`: set the default precision (default: `double`). | - `--enable-precision={single|double}`: set the default precision (default: `double`). **Deprecated option** | ||||||
| - `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. | - `--enable-comms=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below. | ||||||
| - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). | - `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `). | ||||||
| - `--disable-timers`: disable system dependent high-resolution timers. | - `--disable-timers`: disable system dependent high-resolution timers. | ||||||
| - `--enable-chroma`: enable Chroma regression tests. | - `--enable-chroma`: enable Chroma regression tests. | ||||||
| @@ -205,8 +204,7 @@ Alternatively, some CPU codenames can be directly used: | |||||||
| The following configuration is recommended for the Intel Knights Landing platform: | The following configuration is recommended for the Intel Knights Landing platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi-auto  \ |              --enable-comms=mpi-auto  \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -216,8 +214,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi       \ |              --enable-comms=mpi       \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -236,8 +233,7 @@ for interior communication. This is the mpi3 communications implementation. | |||||||
| We recommend four ranks per node for best performance, but optimum is local volume dependent. | We recommend four ranks per node for best performance, but optimum is local volume dependent. | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi3-auto \ |              --enable-comms=mpi3-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CC=icpc MPICXX=mpiicpc  |              CC=icpc MPICXX=mpiicpc  | ||||||
| @@ -248,8 +244,7 @@ We recommend four ranks per node for best performance, but optimum is local volu | |||||||
| The following configuration is recommended for the Intel Haswell platform: | The following configuration is recommended for the Intel Haswell platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3-auto \ |              --enable-comms=mpi3-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -266,8 +261,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -284,8 +278,7 @@ This is the default. | |||||||
| The following configuration is recommended for the Intel Skylake platform: | The following configuration is recommended for the Intel Skylake platform: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=mpiicpc |              CXX=mpiicpc | ||||||
| @@ -302,8 +295,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use: | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi3      \ |              --enable-comms=mpi3      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -334,8 +326,7 @@ and 8 threads per rank. | |||||||
| The following configuration is recommended for the AMD EPYC platform. | The following configuration is recommended for the AMD EPYC platform. | ||||||
|  |  | ||||||
| ``` bash | ``` bash | ||||||
| ../configure --enable-precision=double\ | ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi3 \ |              --enable-comms=mpi3 \ | ||||||
|              CXX=mpicxx  |              CXX=mpicxx  | ||||||
| ``` | ``` | ||||||
|   | |||||||
| @@ -12,31 +12,31 @@ module load mpi/openmpi-aarch64 | |||||||
|  |  | ||||||
| scl enable gcc-toolset-10 bash | scl enable gcc-toolset-10 bash | ||||||
|  |  | ||||||
| ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++ CC=gcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ||||||
|  |  | ||||||
| * gcc 10.1 prebuild w/ MPI, QPACE4 interactive login | * gcc 10.1 prebuild w/ MPI, QPACE4 interactive login | ||||||
|  |  | ||||||
| scl enable gcc-toolset-10 bash | scl enable gcc-toolset-10 bash | ||||||
| module load mpi/openmpi-aarch64 | module load mpi/openmpi-aarch64 | ||||||
|  |  | ||||||
| ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi-auto --enable-shm=shmget --enable-openmp CXX=mpicxx CC=mpicc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ||||||
|  |  | ||||||
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ | ||||||
|  |  | ||||||
| * armclang 20.2 (qp4) | * armclang 20.2 (qp4) | ||||||
|  |  | ||||||
| ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN" | ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DARMCLANGCOMPAT -DA64FXASM -DDSLASHINTRIN" | ||||||
|  |  | ||||||
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ | ||||||
|  |  | ||||||
| * gcc 10.0.1 VLA (merlin) | * gcc 10.0.1 VLA (merlin) | ||||||
|  |  | ||||||
| ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ||||||
|  |  | ||||||
|  |  | ||||||
| * gcc 10.0.1 fixed-size ACLE (merlin) | * gcc 10.0.1 fixed-size ACLE (merlin) | ||||||
|  |  | ||||||
| ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=g++-10.0.1 CC=gcc-10.0.1 CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN" | ||||||
|  |  | ||||||
|  |  | ||||||
| * gcc 10.0.1 fixed-size ACLE (fjt) w/ MPI | * gcc 10.0.1 fixed-size ACLE (fjt) w/ MPI | ||||||
| @@ -46,34 +46,34 @@ export OMPI_CXX=g++-10.0.1 | |||||||
| export MPICH_CC=gcc-10.0.1 | export MPICH_CC=gcc-10.0.1 | ||||||
| export MPICH_CXX=g++-10.0.1 | export MPICH_CXX=g++-10.0.1 | ||||||
|  |  | ||||||
| $ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt" | $ ../configure --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -march=armv8-a+sve -msve-vector-bits=512 -fno-gcse -DA64FXFIXEDSIZE -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64 -lrt" | ||||||
|  |  | ||||||
| -------------------------------------------------------- | -------------------------------------------------------- | ||||||
|  |  | ||||||
| * armclang 20.0 VLA (merlin) | * armclang 20.0 VLA (merlin) | ||||||
|  |  | ||||||
| ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -fno-unroll-loops -mllvm -vectorizer-min-trip-count=2 -march=armv8-a+sve -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ||||||
|  |  | ||||||
| TODO check ARMCLANGCOMPAT | TODO check ARMCLANGCOMPAT | ||||||
|  |  | ||||||
|  |  | ||||||
| * armclang 20.1 VLA (merlin) | * armclang 20.1 VLA (merlin) | ||||||
|  |  | ||||||
| ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ../configure --with-lime=/home/men04359/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN" LDFLAGS=-static GRID_LDFLAGS=-static MPI_CXXLDFLAGS=-static | ||||||
|  |  | ||||||
| TODO check ARMCLANGCOMPAT | TODO check ARMCLANGCOMPAT | ||||||
|  |  | ||||||
|  |  | ||||||
| * armclang 20.1 VLA (fjt cluster) | * armclang 20.1 VLA (fjt cluster) | ||||||
|  |  | ||||||
| ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" | ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp CXX=armclang++ CC=armclang CXXFLAGS="-std=c++11 -mcpu=a64fx -DARMCLANGCOMPAT -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" | ||||||
|  |  | ||||||
| TODO check ARMCLANGCOMPAT | TODO check ARMCLANGCOMPAT | ||||||
|  |  | ||||||
|  |  | ||||||
| * armclang 20.1 VLA w/MPI (fjt cluster) | * armclang 20.1 VLA w/MPI (fjt cluster) | ||||||
|  |  | ||||||
| ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64" | ../configure --with-lime=$HOME/local --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi3 --enable-openmp CXX=mpiFCC CC=mpifcc CXXFLAGS="-std=c++11 -mcpu=a64fx -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU -I/opt/FJSVxtclanga/tcsds-1.2.25/include/mpi/fujitsu -lrt" LDFLAGS="-L/opt/FJSVxtclanga/tcsds-1.2.25/lib64" | ||||||
|  |  | ||||||
| No ARMCLANGCOMPAT -> still correct ? | No ARMCLANGCOMPAT -> still correct ? | ||||||
|  |  | ||||||
| @@ -81,9 +81,9 @@ No ARMCLANGCOMPAT -> still correct ? | |||||||
|  |  | ||||||
| * Fujitsu fcc | * Fujitsu fcc | ||||||
|  |  | ||||||
| ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN" | ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=none --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=FCC CC=fcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN" | ||||||
|  |  | ||||||
|  |  | ||||||
| * Fujitsu fcc w/ MPI | * Fujitsu fcc w/ MPI | ||||||
|  |  | ||||||
| ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-precision=double --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" | ../configure --with-lime=$HOME/grid-a64fx/lime/c-lime --without-hdf5 --enable-gen-simd-width=64 --enable-simd=GEN --enable-comms=mpi --enable-openmp --with-mpfr=/home/users/gre/gre-1/grid-a64fx/mpfr-build/install CXX=mpiFCC CC=mpifcc CXXFLAGS="-Nclang -Kfast -DA64FX -DA64FXASM -DDSLASHINTRIN -DTOFU" | ||||||
|   | |||||||
| @@ -184,19 +184,19 @@ Below are shown the `configure` script invocations for three recommended configu | |||||||
|  |  | ||||||
| This is the build for every day developing and debugging with Xcode. It uses the Xcode clang c++ compiler, without MPI, and defaults to double-precision. Xcode builds the `Debug` configuration with debug symbols for full debugging: | This is the build for every day developing and debugging with Xcode. It uses the Xcode clang c++ compiler, without MPI, and defaults to double-precision. Xcode builds the `Debug` configuration with debug symbols for full debugging: | ||||||
|  |  | ||||||
|     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=double --prefix=$GridPre/Debug |     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Debug | ||||||
|  |  | ||||||
| #### 2. `Release` | #### 2. `Release` | ||||||
|  |  | ||||||
| Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`, except using single-precision (handy for validation): | Since Grid itself doesn't really have debug configurations, the release build is recommended to be the same as `Debug`: | ||||||
|  |  | ||||||
|     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --enable-precision=single --prefix=$GridPre/Release |     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=none --prefix=$GridPre/Release | ||||||
|  |  | ||||||
| #### 3. `MPIDebug` | #### 3. `MPIDebug` | ||||||
|  |  | ||||||
| Debug configuration with MPI: | Debug configuration with MPI: | ||||||
|  |  | ||||||
|     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --enable-precision=double --prefix=$GridPre/MPIDebug |     ../configure CXX=clang++ CXXFLAGS="-I$GridPkg/include/libomp -Xpreprocessor -fopenmp -std=c++11" LDFLAGS="-L$GridPkg/lib/libomp" LIBS="-lomp" --with-hdf5=$GridPkg --with-gmp=$GridPkg --with-mpfr=$GridPkg --with-fftw=$GridPkg --with-lime=$GridPre --enable-simd=GEN --enable-comms=mpi-auto MPICXX=$GridPre/bin/mpicxx --prefix=$GridPre/MPIDebug | ||||||
|  |  | ||||||
| ### 5.3 Build Grid | ### 5.3 Build Grid | ||||||
|  |  | ||||||
|   | |||||||
| @@ -178,15 +178,10 @@ Then enter the cloned directory and set up the build system:: | |||||||
| Now you can execute the `configure` script to generate makefiles (here from a build directory):: | Now you can execute the `configure` script to generate makefiles (here from a build directory):: | ||||||
|  |  | ||||||
|   mkdir build; cd build |   mkdir build; cd build | ||||||
|   ../configure --enable-precision=double --enable-simd=AVX --enable-comms=mpi-auto \ |   ../configure --enable-simd=AVX --enable-comms=mpi-auto \ | ||||||
|       --prefix=<path> |       --prefix=<path> | ||||||
|  |  | ||||||
| where:: | :: | ||||||
|  |  | ||||||
|   --enable-precision=single|double |  | ||||||
|  |  | ||||||
| sets the **default precision**. Since this is largely a benchmarking convenience, it is anticipated that the default precision may be removed in future implementations, |  | ||||||
| and that explicit type selection be made at all points. Naturally, most code will be type templated in any case.:: |  | ||||||
|  |  | ||||||
|    --enable-simd=GEN|SSE4|AVX|AVXFMA|AVXFMA4|AVX2|AVX512|NEONv8|QPX |    --enable-simd=GEN|SSE4|AVX|AVXFMA|AVXFMA4|AVX2|AVX512|NEONv8|QPX | ||||||
|  |  | ||||||
| @@ -236,7 +231,7 @@ Detailed build configuration options | |||||||
|   --enable-mkl[=path]                     use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional). |   --enable-mkl[=path]                     use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional). | ||||||
|   --enable-simd=code                      setup Grid for the SIMD target `<code>`(default: `GEN`). A list of possible SIMD targets is detailed in a section below. |   --enable-simd=code                      setup Grid for the SIMD target `<code>`(default: `GEN`). A list of possible SIMD targets is detailed in a section below. | ||||||
|   --enable-gen-simd-width=size            select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). E.g. SSE 128 bit corresponds to 16 bytes. |   --enable-gen-simd-width=size            select the size (in bytes) of the generic SIMD vector type (default: 32 bytes). E.g. SSE 128 bit corresponds to 16 bytes. | ||||||
|   --enable-precision=single|double        set the default precision (default: `double`). |   --enable-precision=single|double        set the default precision (default: `double`). **Deprecated option** | ||||||
|   --enable-comms=mpi|none                 use `<comm>` for message passing (default: `none`). |   --enable-comms=mpi|none                 use `<comm>` for message passing (default: `none`). | ||||||
|   --enable-rng=sitmo|ranlux48|mt19937     choose the RNG (default: `sitmo`). |   --enable-rng=sitmo|ranlux48|mt19937     choose the RNG (default: `sitmo`). | ||||||
|   --disable-timers                        disable system dependent high-resolution timers. |   --disable-timers                        disable system dependent high-resolution timers. | ||||||
| @@ -304,8 +299,7 @@ Build setup for Intel Knights Landing platform | |||||||
|  |  | ||||||
| The following configuration is recommended for the Intel Knights Landing platform:: | The following configuration is recommended for the Intel Knights Landing platform:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi-auto  \ |              --enable-comms=mpi-auto  \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -314,8 +308,7 @@ The MKL flag enables use of BLAS and FFTW from the Intel Math Kernels Library. | |||||||
|  |  | ||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi       \ |              --enable-comms=mpi       \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -332,8 +325,7 @@ presently performs better with use of more than one rank per node, using shared | |||||||
| for interior communication. | for interior communication. | ||||||
| We recommend four ranks per node for best performance, but optimum is local volume dependent. :: | We recommend four ranks per node for best performance, but optimum is local volume dependent. :: | ||||||
|  |  | ||||||
|    ../configure --enable-precision=double\ |    ../configure --enable-simd=KNL        \ | ||||||
|              --enable-simd=KNL        \ |  | ||||||
|              --enable-comms=mpi-auto \ |              --enable-comms=mpi-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CC=icpc MPICXX=mpiicpc  |              CC=icpc MPICXX=mpiicpc  | ||||||
| @@ -343,8 +335,7 @@ Build setup for Intel Haswell Xeon platform | |||||||
|  |  | ||||||
| The following configuration is recommended for the Intel Haswell platform:: | The following configuration is recommended for the Intel Haswell platform:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi-auto \ |              --enable-comms=mpi-auto \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=icpc MPICXX=mpiicpc |              CXX=icpc MPICXX=mpiicpc | ||||||
| @@ -360,8 +351,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
|  |  | ||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi      \ |              --enable-comms=mpi      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -379,8 +369,7 @@ Build setup for Intel Skylake Xeon platform | |||||||
|  |  | ||||||
| The following configuration is recommended for the Intel Skylake platform:: | The following configuration is recommended for the Intel Skylake platform:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi      \ |              --enable-comms=mpi      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=mpiicpc |              CXX=mpiicpc | ||||||
| @@ -396,8 +385,7 @@ where `<path>` is the UNIX prefix where GMP and MPFR are installed. | |||||||
|  |  | ||||||
| If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | If you are working on a Cray machine that does not use the `mpiicpc` wrapper, please use:: | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=AVX512     \ | ||||||
|              --enable-simd=AVX512     \ |  | ||||||
|              --enable-comms=mpi      \ |              --enable-comms=mpi      \ | ||||||
|              --enable-mkl             \ |              --enable-mkl             \ | ||||||
|              CXX=CC CC=cc |              CXX=CC CC=cc | ||||||
| @@ -422,8 +410,7 @@ and 8 threads per rank. | |||||||
| The following configuration is recommended for the AMD EPYC platform:: | The following configuration is recommended for the AMD EPYC platform:: | ||||||
|  |  | ||||||
|  |  | ||||||
|   ../configure --enable-precision=double\ |   ../configure --enable-simd=AVX2       \ | ||||||
|              --enable-simd=AVX2       \ |  | ||||||
|              --enable-comms=mpi \ |              --enable-comms=mpi \ | ||||||
|              CXX=mpicxx  |              CXX=mpicxx  | ||||||
|  |  | ||||||
|   | |||||||