paboyle
54e94360ad
Experimental: Multiple communicators to see if we can avoid thread locks in --enable-comms=mpit
2017-06-24 23:10:24 +01:00
Nils Meyer
3d04dc33c6
ARM neon intrinsics support
2017-06-13 13:26:59 +02:00
paboyle
0494feec98
Libz dependency
2017-06-13 12:00:23 +01:00
paboyle
3bfd1f13e6
I/O improvements
2017-06-11 23:14:10 +01:00
paboyle
7a8f6af5f8
Drop verbose compiler predefine check
2017-05-11 12:48:40 +01:00
paboyle
2b3fdd4a58
Print CXX predefines
2017-05-11 12:05:50 +01:00
paboyle
529e78d43f
Restart the v0.7.0 release
2017-05-08 18:20:04 +01:00
paboyle
c1c7566089
GCC bug work around in 5.0 through 6.2 inclusive.
2017-05-06 15:20:25 +01:00
paboyle
2439999ec8
Warning elimination; drop to -O2 on G++ bad versions
2017-05-06 14:44:49 +01:00
paboyle
751f2b9703
Better check and benchmark driving
2017-05-05 19:54:38 +01:00
Guido Cossu
de84aacdfd
Fixing a configure error for the smearing tests
2017-05-05 13:59:10 +01:00
Guido Cossu
20999c1370
Merge branch 'develop' into feature/hmc_generalise
2017-05-05 12:47:17 +01:00
124bf4d829
git ref in config summary
2017-05-02 19:41:01 +01:00
e8e56b3414
Config summary saved in git-config
2017-05-02 19:40:47 +01:00
89c430136d
grid-config program
2017-05-02 19:13:13 +01:00
Guido Cossu
3344788fa1
Merge branch 'develop' into feature/hmc_generalise
2017-05-01 12:13:56 +01:00
paboyle
3844bcf800
If no f16c instructions supported must use software half precision conversion.
...
This will also become useful on BG/Q, so will move out from SSE4 into a general area.
Lifted the Eigen half precision from web. Looks sensible, but not extensively regressed
against the intrinsics implementation yet.
2017-04-20 15:30:52 +01:00
paboyle
d3b9a7fa14
F16c apparently requires AVX, even if the 128 bit are used.
...
Seems odd.
2017-04-13 13:19:11 +01:00
paboyle
4226c633c4
Default to FP16 off again
2017-04-13 12:51:39 +01:00
paboyle
db5ea001a3
Update to use Xcode 8.3 since -mfp16 causes SIGILL
2017-04-13 12:22:40 +01:00
paboyle
1d502e4ed6
FP16 optional compile time
2017-04-13 11:55:24 +01:00
paboyle
73cdf0fffe
Drop f16c from SSE because of a macos compile error on travis
2017-04-13 11:23:41 +01:00
paboyle
9c3065b860
Debug flags off again
2017-04-13 10:01:32 +01:00
paboyle
68392ddb5b
Exchange in generic
...
Precision change in AVX, SSE, AVX512, Generic. QPX still to do.
2017-04-13 08:38:12 +01:00
paboyle
cb6b81ae82
Half precision conversion
2017-04-12 19:32:37 +01:00
Guido Cossu
8c540333d5
Merge branch 'develop' into feature/hmc_generalise
2017-04-05 14:41:04 +01:00
paboyle
d1d63a4f2d
sitmo default
2017-04-02 00:26:05 +09:00
Guido Cossu
120fb59978
Adding tests for WilsonFlow classes
2017-03-21 16:11:35 +09:00
Guido Cossu
b3dede4dd3
Merge branch 'develop' into feature/hmc_generalise
2017-03-10 23:57:37 +09:00
Francesco Sanfilippo
29b60f7e1a
adding --with switch to pass lime path
2017-02-21 23:09:39 +01:00
Francesco Sanfilippo
041884acf0
Prepending PACKAGE_ with GRID_ in Config.h
...
Avoid polluting linking progr
2017-02-21 22:51:36 +01:00
Guido Cossu
e0571c872b
Merge branch 'develop' into feature/hmc_generalise
2017-02-09 16:12:00 +00:00
Guido Cossu
677757cfeb
Added and tested SITMO PRNG
2017-01-25 12:47:22 +00:00
Guido Cossu
17629b8d9e
Merge branch 'develop' into feature/hmc_generalise
2017-01-25 11:33:53 +00:00
5803933aea
First implementation of HDF5 serial IO writer, reader is still empty
2017-01-17 16:21:18 -08:00
91e98b1dd5
Merge branch 'feature/hadrons' into develop
2016-12-15 18:15:56 +00:00
Guido Cossu
01480da0a8
Merge branch 'develop' into feature/hmc_generalise
2016-12-05 05:10:27 +00:00
7a1ac45679
Hadrons: configure.ac Linux typo
2016-12-05 14:00:10 +09:00
9ad3d3453e
Hadrons is now a library, the previous XML driven program is now a test
2016-12-01 21:36:29 +09:00
7a1a7a685e
Merge branch 'feature/fft-opt' into feature/hadrons
2016-11-27 15:32:03 +09:00
Guido Cossu
1e44fd3094
Added some details on the mpi flags for Cray machines
2016-11-26 18:30:53 +00:00
a2cffb0304
AVXFMA target fixed
2016-11-21 17:47:18 +01:00
bafbac6ac4
Merge branch 'feature/gen-simd' into develop
2016-11-19 13:45:30 +01:00
595f1ce371
GEN SIMD build fix
2016-11-19 13:45:12 +01:00
97cddda49e
Merge branch 'feature/gen-simd' into feature/doxygen
...
# Conflicts:
# Makefile.am
# configure.ac
2016-11-19 13:11:13 +01:00
b873504b90
fully generic SIMD
2016-11-19 01:32:39 +01:00
042ae5b87c
generic 256bits SIMD
2016-11-15 12:16:15 +00:00
Guido Cossu
4e1ffdd17c
Adding git info to the configure output
2016-11-10 18:44:36 +00:00
Guido Cossu
a783282b8b
Merge branch 'develop' into feature/hmc_generalise
2016-11-10 18:13:07 +00:00
13a8997789
Merge branch 'release/v0.6.0' into feature/hadrons
...
# Conflicts:
# Makefile.am
2016-11-08 20:43:39 +00:00
7df940dc3e
homemade test recursive target for old autotools versions
2016-11-04 22:32:25 +00:00
8af8b047fd
tests is now a recursive target
2016-11-04 13:44:21 +00:00
92cd797636
MPI auto configure fix
2016-11-03 13:48:07 +00:00
paboyle
9e2ec2719b
Merge branch 'develop' into feature/mpi3-master-slave
2016-11-02 13:02:56 +00:00
paboyle
791cb050c8
Comms improvements
2016-11-01 11:35:43 +00:00
e74417ca12
big build system polish
2016-10-31 16:31:27 +00:00
Guido Cossu
d50055cd96
Making the ILDG support optional
2016-10-26 09:48:01 +01:00
Guido Cossu
f415db583a
Adding ILDG format
2016-10-24 15:48:22 +01:00
Guido Cossu
f55c16f984
Adding a barrier in the RNG save
2016-10-24 11:02:14 +01:00
paboyle
39f1c880b8
mpi3
2016-10-20 16:56:40 +01:00
azusayamaguchi
81f2aeaece
KNL streaming stores, and KNL performance counters
2016-10-12 11:45:22 +01:00
Guido Cossu
b56c9ffa52
Fix for AVXFMA
2016-10-10 14:43:37 +01:00
cb02b7088f
Merge branch 'develop' into feature/doxygen
...
# Conflicts:
# configure.ac
2016-10-09 13:35:44 +01:00
77c8a94dae
AVXFMA4 flag fix for Intel Compiler
2016-10-09 12:55:12 +01:00
98439847cf
configure portability fix
2016-10-05 14:57:20 +01:00
7ea4b959a4
hopefully more portable configure output
2016-09-27 11:54:37 +01:00
Guido Cossu
15d8f5c88c
Small change to the configure.ac to include the canonical names
2016-09-23 11:05:36 +01:00
a034e9901b
Merge branch 'develop' into feature/hadrons
2016-09-20 13:49:33 +01:00
d2573189d8
build system: FFTW fix
2016-09-20 12:30:24 +01:00
2e74520821
removed libtool use (BG/Q compatibility)
2016-09-16 15:25:49 +01:00
Antonin Portelli
6dd75ad9e5
Merge branch 'develop' of github.com:paboyle/Grid into feature/bgq
2016-09-16 15:07:54 +01:00
paboyle
ff6da364e8
FFT double and single precision gives good performance now in multithreaded code.
2016-08-24 15:05:00 +01:00
4d11a6f5f2
first commit for QPX intrinsics
2016-08-23 14:41:44 +01:00
paboyle
88be3b39bb
Merge branch 'develop' of https://github.com/paboyle/Grid into develop
2016-08-22 18:29:36 +01:00
paboyle
29c4ef41de
Adding a test for libfftw3
2016-08-22 16:21:01 +01:00
paboyle
90e70790f3
Feature for z-Mobius prep
2016-08-15 22:31:29 +01:00
573b8c6020
build system: -O3 is not overridden by env CXXFLAGS
2016-08-06 01:26:24 +01:00
7b56f63a5c
configure Doxygen output fix
2016-08-05 15:35:29 +01:00
b1cfb4d661
first try at a nicer Doxygen implementation
2016-08-05 15:29:18 +01:00
paboyle
32bc7a6ab8
MPI back out of change that hangs
...
AVX2 for clang, gcc needs the -mfma flag.
2016-08-05 10:36:00 +01:00
7ff7c7d90d
Merge branch 'develop' into feature/hadrons
2016-08-04 16:22:10 +01:00
93d29bb699
build system improvements after discussion with Peter
2016-08-04 16:19:59 +01:00
2485ef9c9c
Merge branch 'feature/new-build' into feature/hadrons
...
# Conflicts:
# Makefile.am
# scripts/copyright
2016-08-03 16:49:16 +01:00
3b376ed54e
build system: error if MPI not found
2016-08-03 15:23:38 +01:00
629283726b
build system: local Grid link flag moved to configure.ac
2016-08-03 15:07:42 +01:00
6adb66dd08
build system: finer management of GMP/MPFR dependence
2016-08-03 15:06:45 +01:00
bc092ad30f
build system fix
2016-08-03 11:47:38 +01:00
dad642ed1b
various build system fixes and improvements
2016-08-03 11:39:20 +01:00
63ae39abc7
proper propagation of OpenMP flags
2016-08-02 17:41:32 +01:00
9e5b934d21
improved LAPACK configuration
2016-08-02 17:26:54 +01:00
e9f30cab2c
first working version for the new build system
2016-07-30 17:53:18 +01:00
paboyle
a0676beeb1
Open up dependency on Eigen and FFTW
2016-07-07 22:31:07 +01:00
75fc295f6e
Merge branch 'hadrons' into feature/hadrons
2016-06-14 17:51:15 +01:00
paboyle
5d3a1a025d
timers flag
2016-06-03 03:25:38 -07:00
paboyle
139cc5f1ae
Large change with KNL preparation
2016-06-03 03:24:26 -07:00
dc5f32e5f0
Merge branch 'master' into hadrons
2016-04-30 00:18:31 -07:00
paboyle
c79ea0dcef
Fixing IMCI
2016-04-22 21:52:54 -07:00
paboyle
165bffc2e7
Avx512 changes for assembler kernels
2016-03-26 22:25:45 -06:00
paboyle
644fd6d32e
Build avx512 clean
2016-03-25 09:35:33 -07:00
179e82b5ca
Merge branch 'master' into hadrons
2016-03-08 12:55:33 +00:00
paboyle
090e7aa930
Merge remote-tracking branch 'origin/chulwoo-dec12-2015'
...
Merge Chulwoo's Lanczos related improvements.
Merge Nd!=4 fixes for pure gauge HMC from Evan.
2016-03-08 09:55:14 +00:00
02f8b84ac9
Merge branch 'master' into hadrons
2016-02-23 16:13:39 +00:00
Antonin Portelli
497e7e4c53
BG/Q compatibility fix
2016-02-23 15:57:38 +00:00
cfd368596d
Merge branch 'master' into hadrons
2016-02-22 15:25:02 +00:00
Jung
9f0d9ade68
Added configure flag for LAPACK. Tested ImplicitlyRestartedLanczos::calc()
...
Checking in before cleaning up
2016-02-20 02:50:32 -05:00
Peter Boyle
41c2b09184
Shmem comms [NO MPI] target added. The dwf test runs and passes.
...
Not really shaken out to my satisfaction though as I want more tests done, so don't declare as working.
But committing my current while I try a few experimentals.
2016-02-14 14:24:38 -06:00
paboyle
e2f73e3ead
Updates for shmem
2016-02-10 16:50:32 -08:00
Jung
5c57d4f403
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
2016-01-11 11:36:45 -05:00
Jung
5924e5a562
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
configure
lib/qcd/action/Actions.h
lib/qcd/action/fermion/WilsonKernels.h
2016-01-06 03:44:57 -05:00
paboyle
02452afd36
Optional overlap of comms with compute
2016-01-04 14:18:40 +00:00
paboyle
5710966324
Options to use mersenne twister OR ranlux48 via --enable-rng flag at configure time.
...
Can save and restore RNG state via new (serial) I/O routines in a NERSC header style file.
Store a Parallel (one per site) and a single serial RNG file.
2015-12-19 18:32:25 +00:00
paboyle
34a0fde2ad
Fixes to fermion force terms after sign of gamma_mu (0...3) change.
...
Thought I had already committed these.
Believe I have got the Gparity fermion force working.
* tests/Test_gpdwf_force.cc -- correctly predicts dS for two flavour pseudofermion
based on a small dt update of U field.
* tests/Test_hmc_EODWFRatio_Gparity.cc -- ran 1 trajectory on 8^4 with dH=0.21.
Need to accumulate a full plaquette log to believe fully which will take some hours of run time.
2015-12-15 23:14:12 +00:00
Jung
bc34b7e808
Merge branch 'master' of https://github.com/paboyle/Grid into scidac1_2
...
Conflicts:
lib/qcd/action/fermion/WilsonKernels.h
tests/Make.inc
2015-12-15 11:11:59 -05:00
paboyle
3ce10aa975
Fix a regression failure on Mobius; chroma regression added
2015-12-10 22:55:00 +00:00
Jung
f2b4edc090
Fixes for Gparity comparison with CPS (Instantiation, Gamma matrix convention)
2015-12-07 02:04:57 -05:00
4a7f3d1b7b
Merge branch 'master' into hadrons
...
# Conflicts:
# configure
2015-12-02 10:57:51 +00:00
Azusa Yamaguchi
c2d96644a0
EXEC INFO check
2015-11-06 10:31:05 +00:00
538b16610b
First commit for measurement software 'Hadrons'
2015-10-27 17:33:18 +00:00
Peter Boyle
d4289a33b8
AMD FMA4 addition
2015-10-09 00:44:20 +02:00
Peter Boyle
5ef42add2d
Changes to remove warnings under icc; disambiguate AVX512 from IMCI correctly
and drop swizzles in AVX512. Don't know why these compiled.
2015-09-23 05:23:45 -07:00
Peter Boyle
2f38ebc446
Reintroducing the hand unrolled loops
2015-09-08 17:45:30 +01:00
Peter Boyle
76d752585b
Started a tidy up in the HMC sector. Now comfortable with the two level integrators;
took a little to figure out what Guido had done & why -- but there is a neat saving of force
evaluations across the nesting time boundary making use of linearity of the leapP in dt.
I cleaned up the printing, reduced the volume of code, in the process sharing printing
between all integrators. Placed an assert that the total integration time for all integrators
must match at end of trajectory.
Have now verified e-dH = 1 for nested integrators in Wilson/Wilson runs with both
Omelyan and with Leapfrog so substantial confidence gained.
2015-08-29 17:18:43 +01:00
neo
490009745c
Small change in the HMC interface.
...
Example of multiple levels in the WilsonFermion hmc test.
Merge remote-tracking branch 'upstream/master'
Conflicts:
lib/qcd/hmc/HMC.h
lib/qcd/hmc/integrators/Integrator.h
lib/qcd/hmc/integrators/Integrator_algorithm.h
tests/Test_simd.cc
2015-07-30 17:16:57 +09:00
Peter Boyle
019f7a802e
Files renamed
2015-07-27 18:30:19 +09:00
paboyle
5a68a9bbd4
Removed troublesome macros
2015-07-21 22:41:01 -07:00
neo
9adaeb061a
More NEON functionalities
2015-07-21 11:52:15 +09:00
Peter Boyle
638d2cda11
Change the SIMD command correctly with precision = double vs. single and
connect the "Real" default precision to a configure flag.
Have RealF, RealD and Real types, where Real is compile target dependent single/double,
RealF is single and RealD is double etc..
2015-07-01 22:45:15 +01:00
neo
48bf4878c1
Experimental support for ARM
2015-06-09 15:46:21 +09:00
Peter Boyle
63a61fcc2a
PartialFraction Hw with Zolo and Tanh approx converged under CG and passed EO breakdown
and hermiticity tests.
2015-06-04 13:28:37 +01:00
neo
3055d2cf2c
Added Ta functionality to the tensor types
...
Merge remote-tracking branch 'upstream/master'
Conflicts:
configure
2015-06-04 18:11:32 +09:00
Peter Boyle
1d0df449e8
Reorganise of file naming
2015-06-03 12:47:05 +01:00
neo
f41e4e8b1b
Some modifications to the configure to check SIMD support
2015-05-29 11:41:02 +09:00
neo
19bd6f103a
Check at configure time if CPU supports the requested SIMD optimization
2015-05-27 18:30:11 +09:00
neo
da46b56e85
Adding support for doxygen generation
2015-05-27 10:34:56 +09:00
neo
1a24801246
checked performance of new vector libraries.
...
Added check for c++11 support on the configure.ac
2015-05-26 12:02:54 +09:00
neo
9e29ac6549
Completed implementation of new Grid_simd classes
...
Tested performance for SSE4, Ok.
AVX1/2, AVX512 yet untested
2015-05-22 17:33:15 +09:00
neo
baa382f055
Added check of mpfr and gmp at configure time
...
It generates automatically the linker flags or complains if not found.
2015-05-19 13:54:55 +09:00
neo
99aecf1f2e
Minor modification to the configure.ac
...
Enables silent rules (use make V=1 to override)
Prints a summary after configure is completed
2015-05-18 17:15:14 +09:00
Peter Boyle
11cb3e9a01
Getting closer to having a wilson solver... introducing a first and untested
cut at Conjugate gradient. Also copied in Remez, Zolotarev, Chebyshev from
Mike Clark, Tony Kennedy and my BFM package respectively since we know we will
need these. I wanted the structure of
algorithms/approx
algorithms/iterative
etc.. to start taking shape.
2015-05-18 07:47:05 +01:00
Peter Boyle
0b4d3544b9
clang++ 3.4/5/7 compile happy for AVX and SSE
...
icpc compiles happy on MacOSX both with -xCOMMON-AVX512 and native AVX
gcc-5 does not compile happy; can work around by renaming lattice peek/poke/transpose/trace templates
relative to tensor ones, but gcc goes into a recursive template instantiation due to
matching error. I think this is a gcc bug and have filed a report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66153
2015-05-15 11:52:11 +01:00
Peter Boyle
48f425d31c
I have made the Cshift work successfully with open mp threading in
every routine. Collapse(2) is now working under clang-omp++.
2015-05-13 00:31:00 +01:00
Peter Boyle
4a1d4f1b3c
Starting a benchmarking sub dir
2015-05-02 17:52:36 +01:00
Peter Boyle
31fd146cc0
Improving the byte swap support for portability
2015-05-01 10:57:33 +01:00
Peter Boyle
5c8858f31b
Better description of Intel's many ISA targets
2015-04-23 08:02:51 +01:00
Peter Boyle
47292de769
Fixing endian on linux I hope
2015-04-23 07:51:15 +01:00
Peter Boyle
b32c14b433
Got the NERSC IO working and fixed a bug in cshift.
2015-04-22 22:46:48 +01:00
Peter Boyle
8ddfa7e6b0
Reorganisation
2015-04-18 21:23:32 +01:00
Peter Boyle
26148c3323
Build reorg
2015-04-18 14:56:05 +01:00
Peter Boyle
5aac6dc85b
spin trace type work
2015-04-16 14:48:21 +01:00
Peter Boyle
48a38ef4fd
Major rework of extract/merge/permute processing debugged and working.
2015-04-06 11:26:24 +01:00