---
title: "Documentation"
author_profile: false
excerpt: "Building on Intel and AMD targets"
header:
  overlay_color: "#5DADE2"
permalink: /docs/general_build/
sidebar:
  nav: docs
---

{% include base_path %} The information on this page was last updated in March 2018 and is valid for release version 0.7.0.

{% include toc icon="gears" title="Contents" %}

Building for the Intel Knights Landing

The following configuration is recommended for the Intel Knights Landing platform:

../configure --enable-precision=double\
             --enable-simd=KNL        \
             --enable-comms=mpi-auto  \
             --with-gmp=<path>        \
             --with-mpfr=<path>       \
             --enable-mkl             \
             CXX=icpc MPICXX=mpiicpc

where <path> is the UNIX prefix where GMP and MPFR are installed. If you are working on a Cray machine that does not use the mpiicpc wrapper, please use:

../configure --enable-precision=double\
             --enable-simd=KNL        \
             --enable-comms=mpi       \
             --with-gmp=<path>        \
             --with-mpfr=<path>       \
             --enable-mkl             \
             CXX=CC CC=cc
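The choice between the two configurations above can be scripted. The following is a minimal sketch and not part of Grid: the helper name `pick_compilers` is hypothetical, and it assumes that a missing `mpiicpc` on the PATH means the Cray `CC`/`cc` wrappers should be used.

```shell
#!/bin/bash
# Hypothetical helper: print the comms and compiler settings for configure.
# Assumption: if mpiicpc is not on the PATH, we are on a Cray system whose
# CC/cc compiler wrappers should be used instead.
pick_compilers() {
  if command -v mpiicpc >/dev/null 2>&1; then
    echo "--enable-comms=mpi-auto CXX=icpc MPICXX=mpiicpc"
  else
    echo "--enable-comms=mpi CXX=CC CC=cc"
  fi
}

# Print the resulting command instead of running it, so it can be inspected.
echo "../configure --enable-precision=double --enable-simd=KNL --enable-mkl $(pick_compilers)"
```

The printed line omits `--with-gmp` and `--with-mpfr`; add them as shown above if they are needed on your system.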

Building for the Intel Haswell

The following configuration is recommended for the Intel Haswell platform:

  ../configure --enable-precision=double\
             --enable-simd=AVX2       \
             --enable-comms=mpi-auto \
             --enable-mkl             \
             CXX=icpc MPICXX=mpiicpc

The MKL flag enables use of BLAS and FFTW from the Intel Math Kernel Library.

If GMP and MPFR are not installed in a standard location (/usr/), these flags may be needed:

               --with-gmp=<path>        \
               --with-mpfr=<path>       

where <path> is the UNIX prefix where GMP and MPFR are installed.
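One way to add these flags only when they are needed is sketched below. The helper `extra_lib_flags` and the variables `GMP_PREFIX` and `MPFR_PREFIX` are illustrative names, not Grid settings, and `/opt/gmp` and `/opt/mpfr` are placeholder paths.

```shell
#!/bin/bash
# Hypothetical helper: emit --with-gmp / --with-mpfr only when a prefix is set.
# GMP_PREFIX and MPFR_PREFIX are illustrative variable names.
extra_lib_flags() {
  local flags=""
  if [ -n "${GMP_PREFIX:-}" ]; then
    flags="$flags --with-gmp=$GMP_PREFIX"
  fi
  if [ -n "${MPFR_PREFIX:-}" ]; then
    flags="$flags --with-mpfr=$MPFR_PREFIX"
  fi
  # Strip the leading space left by the concatenation above.
  echo "${flags# }"
}

# Placeholder install prefixes; replace with the real locations.
GMP_PREFIX=/opt/gmp
MPFR_PREFIX=/opt/mpfr
echo "../configure --enable-precision=double --enable-simd=AVX2 --enable-comms=mpi-auto --enable-mkl $(extra_lib_flags) CXX=icpc MPICXX=mpiicpc"
```

Leaving both variables empty yields the plain configure line for libraries installed under the standard prefix.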

If you are working on a Cray machine that does not use the mpiicpc wrapper, please use:

  ../configure --enable-precision=double\
             --enable-simd=AVX2       \
             --enable-comms=mpi      \
             --enable-mkl             \
             CXX=CC CC=cc

If using the Intel MPI library, threads should be pinned to NUMA domains using:

        export I_MPI_PIN=1

This is the default.

Building for the Intel Skylake

The following configuration is recommended for the Intel Skylake platform:

  ../configure --enable-precision=double\
             --enable-simd=AVX512     \
             --enable-comms=mpi-auto  \
             --enable-mkl             \
             CXX=mpiicpc

The MKL flag enables use of BLAS and FFTW from the Intel Math Kernel Library.

If GMP and MPFR are not installed in a standard location (/usr/), these flags may be needed:

               --with-gmp=<path>        \
               --with-mpfr=<path>

where <path> is the UNIX prefix where GMP and MPFR are installed.

If you are working on a Cray machine that does not use the mpiicpc wrapper, please use:

  ../configure --enable-precision=double\
             --enable-simd=AVX512     \
             --enable-comms=mpi       \
             --enable-mkl             \
             CXX=CC CC=cc

If using the Intel MPI library, threads should be pinned to NUMA domains using:

        export I_MPI_PIN=1

This is the default.

Building for the AMD EPYC

The AMD EPYC is a multichip module comprising 32 cores spread over four distinct chips, each with 8 cores. So even a single-socket node contains a quad-chip module. Dual-socket nodes with 64 cores in total are common. Each chip within the module exposes a separate NUMA domain, giving four NUMA domains per socket. We recommend MPI-3 with one MPI rank per NUMA domain, that is, four ranks per socket and 8 threads per rank.
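The rank and thread counts follow directly from the node layout. As a small worked example, using only the numbers quoted above for a dual-socket node (the variable names are illustrative):

```shell
#!/bin/bash
# Node layout taken from the text: dual socket, 4 NUMA domains per socket,
# 8 cores per NUMA domain.
sockets=2
numa_per_socket=4
cores_per_numa=8

ranks=$(( sockets * numa_per_socket ))   # one MPI rank per NUMA domain
threads=$cores_per_numa                  # one thread per core in the domain
total_cores=$(( ranks * threads ))

echo "ranks=$ranks threads_per_rank=$threads total_cores=$total_cores"
# prints: ranks=8 threads_per_rank=8 total_cores=64
```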

The following configuration is recommended for the AMD EPYC platform:

  ../configure --enable-precision=double\
             --enable-simd=AVX2       \
             --enable-comms=mpi3 \
             CXX=mpicxx 

If GMP and MPFR are not installed in a standard location (/usr/), these flags may be needed:

               --with-gmp=<path>        \
               --with-mpfr=<path>       

where <path> is the UNIX prefix where GMP and MPFR are installed.

Using MPICH and g++ v4.9.2, the best performance is obtained by setting GOMP_CPU_AFFINITY explicitly for each MPI rank. This can be done by launching the executable through a wrapper script, omp_bind.sh.

It is recommended to run 8 MPI ranks on a single dual-socket AMD EPYC node, with 8 threads per rank, using MPI-3 and shared memory to communicate within the node:

  mpirun -np 8 ./omp_bind.sh ./Benchmark_dwf --mpi 2.2.2.1 --dslash-unroll --threads 8 --grid 16.16.16.16 --cacheblocking 4.4.4.4 
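The command-line values can be sanity-checked before launching. The sketch below assumes that `--mpi` gives the number of ranks along each lattice dimension, which is how the example reads (2 x 2 x 2 x 1 = 8 ranks for `-np 8`), and that each rank then holds the `--grid` extent divided by the `--mpi` entry in every dimension. The helper names `dotted_product` and `dotted_divide` are hypothetical.

```shell
#!/bin/bash
# Multiply the entries of a dotted list such as 2.2.2.1.
dotted_product() {
  local product=1 n
  local IFS=.
  for n in $1; do
    product=$(( product * n ))
  done
  echo "$product"
}

# Divide two dotted lists entry by entry, e.g. 16.16.16.16 by 2.2.2.1.
dotted_divide() {
  local IFS=.
  local -a a b
  local out="" i
  a=($1)
  b=($2)
  for i in "${!a[@]}"; do
    out="$out.$(( a[i] / b[i] ))"
  done
  # Strip the leading dot left by the concatenation above.
  echo "${out#.}"
}

np=8
mpi=2.2.2.1
grid=16.16.16.16

if [ "$(dotted_product "$mpi")" -ne "$np" ]; then
  echo "--mpi $mpi does not match -np $np" >&2
fi
echo "lattice sites per rank: $(dotted_divide "$grid" "$mpi")"
```

For the values above, the product matches the rank count and the per-rank lattice comes out as 8.8.8.16.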

The omp_bind.sh wrapper does the following:

  #!/bin/bash

  # Map this MPI rank (PMI_RANK is set by the MPI launcher) to a block index.
  numanode=$(( PMI_RANK % 8 ))
  # Each rank gets a block of 16 consecutive CPU IDs; every second ID is used.
  basecore=$(( numanode * 16 ))
  core0=$(( basecore + 0 ))
  core1=$(( basecore + 2 ))
  core2=$(( basecore + 4 ))
  core3=$(( basecore + 6 ))
  core4=$(( basecore + 8 ))
  core5=$(( basecore + 10 ))
  core6=$(( basecore + 12 ))
  core7=$(( basecore + 14 ))

  export GOMP_CPU_AFFINITY="$core0 $core1 $core2 $core3 $core4 $core5 $core6 $core7"
  echo GOMP_CPU_AFFINITY $GOMP_CPU_AFFINITY

  # Run the wrapped command with its arguments preserved.
  "$@"
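To see which CPU IDs each rank ends up with, the same arithmetic can be run outside MPI. This is only a sketch for checking the mapping: the function name `affinity_for_rank` is hypothetical, and the rank is passed as an argument instead of being read from PMI_RANK.

```shell
#!/bin/bash
# Reproduce the CPU ID list that omp_bind.sh computes for a given rank.
affinity_for_rank() {
  local rank=$1
  local numanode=$(( rank % 8 ))
  local basecore=$(( numanode * 16 ))
  local cores="" i
  for i in 0 2 4 6 8 10 12 14; do
    cores="$cores $(( basecore + i ))"
  done
  # Strip the leading space left by the concatenation above.
  echo "${cores# }"
}

# Print the mapping for the 8 ranks of one dual-socket node.
for rank in 0 1 2 3 4 5 6 7; do
  echo "rank $rank -> $(affinity_for_rank "$rank")"
done
```

Rank 0 gets IDs 0 to 14 in steps of two and rank 7 gets 112 to 126, so the script addresses CPU IDs up to 126. Whether those IDs correspond to distinct physical cores depends on how the operating system numbers them, which can be checked with a tool such as `lscpu`.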

Build setup for laptops, other compilers, non-cluster builds

Many versions of g++ and clang++ work with Grid. Using them merely involves replacing CXX (and MPICXX) and omitting the --enable-mkl flag.

Single-node, non-MPI builds are enabled with:

  --enable-comms=none

FFTW support that is not in the default search path may then be enabled with:

  --with-fftw=<installpath>

BLAS will not be compiled in by default, and Lanczos will default to Eigen diagonalisation.
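Putting these flags together, a laptop build might be configured as sketched below. The choice of g++ and AVX2 is an assumption about the machine, and `FFTW_PREFIX` is an illustrative variable name; leave it empty if FFTW is on the default search path.

```shell
#!/bin/bash
# Assemble a single-node configure line from the flags described above.
# Assumptions: g++ is the compiler and the CPU supports AVX2; adjust both.
# FFTW_PREFIX is an illustrative variable name, not a Grid setting.
FFTW_PREFIX="${FFTW_PREFIX:-}"

args="--enable-precision=double --enable-simd=AVX2 --enable-comms=none"
if [ -n "$FFTW_PREFIX" ]; then
  args="$args --with-fftw=$FFTW_PREFIX"
fi
args="$args CXX=g++"

# Print the command instead of running it, so it can be inspected first.
echo "../configure $args"
```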

Notes

  • GMP is the GNU Multiple Precision Library.
  • MPFR is a C library for multiple-precision floating-point computations with correct rounding.
  • Both libraries are necessary for RHMC support.

{% include paginator.html %}