mirror of
https://github.com/paboyle/Grid.git
synced 2024-11-09 23:45:36 +00:00
Documentation update (briefly) covering serialisation changes. For review
parent
2b1fcd78c3
commit
393727b93b
1
.gitignore
vendored
@ -88,6 +88,7 @@ Thumbs.db
# build directory #
###################
build*/*
Documentation/_build

# IDE related files #
#####################
Binary file not shown.
@ -1787,7 +1787,7 @@ Hdf5Writer Hdf5Reader HDF5
Write interfaces, similar to the XML facilities in QDP++ are presented. However,
the serialisation routines are automatically generated by the macro, and a virtual
reader adn writer interface enables writing to any of a number of formats.
reader and writer interface enables writing to any of a number of formats.

**Example**::
@ -1814,6 +1814,91 @@ reader adn writer interface enables writing to any of a number of formats.
  }


Eigen tensor support -- added 2019H1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Serialisation library was expanded in 2019 to support de/serialisation of
Eigen tensors. De/serialisation of existing types was not changed. Data files
without Eigen tensors remain compatible with earlier versions of Grid and other readers.
Conversely, writing serialised Eigen tensors to a data file is a breaking change:
such files are not compatible with earlier versions of Grid.

Eigen tensor serialisation support was added to BaseIO, which was modified to provide a Traits class
to recognise Eigen tensors whose elements are either primitive scalars (arithmetic and complex types)
or Grid tensors.

**Traits determining de/serialisable scalars**::

  // Is this an Eigen tensor
  template<typename T> struct is_tensor : std::integral_constant<bool,
    std::is_base_of<Eigen::TensorBase<T, Eigen::ReadOnlyAccessors>, T>::value> {};
  // Is this an Eigen tensor of a supported scalar
  template<typename T, typename V = void> struct is_tensor_of_scalar : public std::false_type {};
  template<typename T> struct is_tensor_of_scalar<T, typename std::enable_if<is_tensor<T>::value && is_scalar<typename T::Scalar>::value>::type> : public std::true_type {};
  // Is this an Eigen tensor of a supported container
  template<typename T, typename V = void> struct is_tensor_of_container : public std::false_type {};
  template<typename T> struct is_tensor_of_container<T, typename std::enable_if<is_tensor<T>::value && isGridTensor<typename T::Scalar>::value>::type> : public std::true_type {};
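
To see how traits of this shape select types at compile time, the following is a minimal, self-contained sketch that compiles without Eigen or Grid. Everything prefixed ``Mock`` is a hypothetical stand-in, and the simplified ``is_complex``, ``is_scalar`` and ``isGridTensor`` helpers are assumptions made for illustration; Grid's own definitions may differ.

```cpp
#include <complex>
#include <type_traits>

// Hypothetical stand-ins so the sketch compiles without Eigen or Grid.
template <typename Derived> struct MockTensorBase {};
template <typename S> struct MockTensor : MockTensorBase<MockTensor<S>> {
  using Scalar = S;  // the traits below inspect the element type via T::Scalar
};
struct MockGridTensor {};  // placeholder for a Grid tensor type

// Simplified helper traits (assumptions; not Grid's definitions).
template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};
template <typename T> struct is_scalar
  : std::integral_constant<bool, std::is_arithmetic<T>::value || is_complex<T>::value> {};
template <typename T> struct isGridTensor : std::is_same<T, MockGridTensor> {};

// Same structure as the traits quoted above, using the mock base class.
template <typename T> struct is_tensor
  : std::integral_constant<bool, std::is_base_of<MockTensorBase<T>, T>::value> {};

template <typename T, typename V = void> struct is_tensor_of_scalar : std::false_type {};
template <typename T>
struct is_tensor_of_scalar<T, typename std::enable_if<
    is_tensor<T>::value && is_scalar<typename T::Scalar>::value>::type> : std::true_type {};

template <typename T, typename V = void> struct is_tensor_of_container : std::false_type {};
template <typename T>
struct is_tensor_of_container<T, typename std::enable_if<
    is_tensor<T>::value && isGridTensor<typename T::Scalar>::value>::type> : std::true_type {};

// A tensor of doubles is a "tensor of scalar"; a plain int is neither,
// because int::Scalar does not exist and the specialisation is discarded.
static_assert(is_tensor_of_scalar<MockTensor<double>>::value, "tensor of scalar");
static_assert(!is_tensor_of_scalar<int>::value, "int is not a tensor");
```

The ``enable_if`` in the partial specialisation means a type without a nested ``Scalar`` silently falls back to the primary template, so the traits can be queried for any type.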

Eigen tensors are regular, multidimensional objects, and each Reader/Writer
was extended to support this new datatype. Where the Eigen tensor contains
a Grid tensor, the dimensions of the data written are the dimensions of the
Eigen tensor plus the dimensions of the underlying Grid scalar. Dimensions
of size 1 are preserved.
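
As a rough illustration of the dimension rule above, the sketch below concatenates the two sets of dimensions. The helper names ``combinedDimensions`` and ``totalElements`` are hypothetical, and the ordering (Eigen dimensions first, inner Grid dimensions last) is an assumption that the text does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dimensions of the written data for an Eigen tensor
// whose elements are Grid tensors. ASSUMPTION: Eigen dimensions come first.
std::vector<std::size_t> combinedDimensions(const std::vector<std::size_t> &eigenDims,
                                            const std::vector<std::size_t> &gridDims) {
  std::vector<std::size_t> dims(eigenDims);
  dims.insert(dims.end(), gridDims.begin(), gridDims.end());
  return dims;  // dimensions of size 1 are kept, not squeezed out
}

// Number of scalars in a block with the given dimensions.
std::size_t totalElements(const std::vector<std::size_t> &dims) {
  std::size_t n = 1;
  for (std::size_t d : dims) n *= d;
  return n;
}

// Example: a 3 x 1 x 4 Eigen tensor of 3 x 3 Grid tensors.
void exampleCombinedDimensions() {
  const std::vector<std::size_t> expected = {3, 1, 4, 3, 3};
  const std::vector<std::size_t> dims = combinedDimensions({3, 1, 4}, {3, 3});
  assert(dims == expected);
  assert(totalElements(dims) == 108);
}
```

Keeping the size-1 dimension means the rank of the written data always equals the Eigen rank plus the Grid tensor rank, regardless of the extents.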

**New Reader/Writer methods for multi-dimensional data**::

  template <typename U>
  void readMultiDim(const std::string &s, std::vector<U> &buf, std::vector<size_t> &dim);
  template <typename U>
  void writeMultiDim(const std::string &s, const std::vector<size_t> & Dimensions, const U * pDataRowMajor, size_t NumElements);
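
The calling convention of these two methods can be tried out with a toy in-memory store. This is only a sketch: ``ToyMultiDimStore`` and ``rowMajorIndex`` are hypothetical names, the store is not a Grid Reader or Writer, and only the two method signatures are taken from the declarations above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Toy in-memory store (hypothetical; NOT a Grid Reader/Writer).
class ToyMultiDimStore {
public:
  template <typename U>
  void writeMultiDim(const std::string &s, const std::vector<std::size_t> &Dimensions,
                     const U *pDataRowMajor, std::size_t NumElements) {
    static_assert(std::is_trivially_copyable<U>::value, "toy store copies raw bytes");
    std::size_t expected = 1;
    for (std::size_t d : Dimensions) expected *= d;
    assert(expected == NumElements);  // dimensions must describe the buffer exactly
    Entry e;
    e.dim = Dimensions;
    e.bytes.resize(NumElements * sizeof(U));
    if (NumElements > 0) std::memcpy(e.bytes.data(), pDataRowMajor, e.bytes.size());
    entries_[s] = e;
  }

  template <typename U>
  void readMultiDim(const std::string &s, std::vector<U> &buf,
                    std::vector<std::size_t> &dim) const {
    auto it = entries_.find(s);
    assert(it != entries_.end());  // unknown name: fail loudly
    const Entry &e = it->second;
    assert(e.bytes.size() % sizeof(U) == 0);
    dim = e.dim;
    buf.resize(e.bytes.size() / sizeof(U));
    if (!buf.empty()) std::memcpy(buf.data(), e.bytes.data(), e.bytes.size());
  }

private:
  struct Entry {
    std::vector<std::size_t> dim;
    std::vector<unsigned char> bytes;
  };
  std::map<std::string, Entry> entries_;
};

// Flat offset of a multi-index in a row-major buffer (last index fastest).
std::size_t rowMajorIndex(const std::vector<std::size_t> &dim,
                          const std::vector<std::size_t> &idx) {
  std::size_t flat = 0;
  for (std::size_t i = 0; i < dim.size(); ++i) flat = flat * dim[i] + idx[i];
  return flat;
}
```

A caller passes a flat, row-major buffer together with its dimensions on write, and receives both back on read; element ``(i, j)`` of a 2 x 3 block sits at offset ``i * 3 + j``.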

On readback, the Eigen tensor rank must match the data being read, but the tensor
dimensions will be resized if necessary. Resizing is not possible for ``Eigen::TensorMap<T>``
because these tensors use a buffer provided at construction, and this buffer cannot be changed.
Deserialisation failures cause Grid to assert.
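
The readback rules can be summarised as a small decision function. The sketch below is an illustration only: ``ShapeCheck`` and ``reconcileShape`` are hypothetical names, and the function returns a status code so the outcomes are visible, whereas Grid asserts on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Possible outcomes when matching a tensor's shape against the file (hypothetical).
enum class ShapeCheck { Match, Resized, RankMismatch, FixedBufferMismatch };

// tensorDim: current dimensions of the destination tensor (updated on resize).
// canResize: false models a tensor mapped onto a caller-provided buffer.
// fileDim:   dimensions of the data found in the file.
ShapeCheck reconcileShape(std::vector<std::size_t> &tensorDim, bool canResize,
                          const std::vector<std::size_t> &fileDim) {
  if (tensorDim.size() != fileDim.size()) return ShapeCheck::RankMismatch;
  if (tensorDim == fileDim) return ShapeCheck::Match;
  if (!canResize) return ShapeCheck::FixedBufferMismatch;
  tensorDim = fileDim;  // same rank, different extents: adopt the file's shape
  return ShapeCheck::Resized;
}

void exampleReconcileShape() {
  std::vector<std::size_t> owned = {2, 2};
  assert(reconcileShape(owned, true, {4, 3}) == ShapeCheck::Resized);
  assert(owned[0] == 4 && owned[1] == 3);

  std::vector<std::size_t> mapped = {2, 2};
  assert(reconcileShape(mapped, false, {4, 3}) == ShapeCheck::FixedBufferMismatch);
  assert(reconcileShape(mapped, false, {2, 2, 1}) == ShapeCheck::RankMismatch);
}
```

The two failing outcomes correspond to the cases the text describes as deserialisation failures.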


HDF5 Optimisations -- added June 2021
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Grid serialisation is intended to be lightweight and deterministic, and to provide a layer of abstraction over
multiple file formats. HDF5 excels at handling multi-dimensional data, and the Grid Hdf5Reader/Hdf5Writer exploit this.
When serialising nested ``std::vector<T>``, where ``T`` is an arithmetic or complex type,
the Hdf5Writer writes the data as an HDF5 DataSet object.

However, nested ``std::vector<std::vector<...T>>`` might be "ragged", i.e. not necessarily regular. For example, a 3d nested
``std::vector`` might contain 2 rows, the first being a 2 x 2 block and the second a 1 x 2 block.
A bug existed whereby this was not checked on write, so nested, ragged vectors
were written as a regular dataset, with a buffer under/overrun and jumbled contents.

The bug went undetected until now, which suggests that ragged vectors were not used in production. Fixing this bug
is an opportunity to further optimise the HDF5 file format.

The goals of this change are to:

* Make changes to the HDF5 file format only -- i.e. do not impact other file formats

* Implement file format changes in such a way that they are transparent to the Grid reader

* Correct the bug for ragged vectors of numeric / complex types

* Extend the support of nested ``std::vector<T>`` to arbitrarily nested Grid tensors


The trait class ``element`` has been redefined to ``is_flattenable``, which is a trait class for
potentially "flattenable" objects. These are (possibly nested) ``std::vector<T>`` where ``T`` is
an arithmetic, complex or Grid tensor type. Flattenable objects are tested on write
(with the function ``isRegularShape``) to see whether they actually are regular.
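
A regularity test of this kind can be sketched with two overloads that recurse through the nesting. This is not Grid's ``isRegularShape``; the names ``regularShape`` and ``isRegularShapeSketch`` are hypothetical, and the sketch only handles nested ``std::vector`` of plain element types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leaf case: anything that is not a std::vector is trivially regular.
template <typename T>
bool regularShape(const T &, std::vector<std::size_t> &, std::size_t) { return true; }

// Vector case: the first vector seen at each depth fixes that depth's extent;
// every later vector at the same depth must have the same size.
template <typename T>
bool regularShape(const std::vector<T> &v, std::vector<std::size_t> &dims, std::size_t depth) {
  if (depth == dims.size()) dims.push_back(v.size());
  else if (dims[depth] != v.size()) return false;
  for (const T &e : v)
    if (!regularShape(e, dims, depth + 1)) return false;
  return true;
}

// Returns true and fills dims if v is regular; dims is unspecified otherwise.
template <typename T>
bool isRegularShapeSketch(const std::vector<T> &v, std::vector<std::size_t> &dims) {
  dims.clear();
  return regularShape(v, dims, 0);
}

void exampleRegularShape() {
  using Vec3 = std::vector<std::vector<std::vector<int>>>;
  std::vector<std::size_t> dims;
  // Two rows: a 2 x 2 block, then a 1 x 2 block, as in the ragged example in the text.
  const Vec3 ragged = {{{1, 2}, {3, 4}}, {{5, 6}}};
  assert(!isRegularShapeSketch(ragged, dims));
  const Vec3 regular = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};
  assert(isRegularShapeSketch(regular, dims));
  assert(dims.size() == 3 && dims[0] == 2 && dims[1] == 2 && dims[2] == 2);
}
```

When the check succeeds, the collected dimensions are what a flattened, multidimensional write would need; when it fails, the writer has to fall back to another layout.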

Flattenable, regular objects are written to a multidimensional HDF5 DataSet.
Otherwise, an HDF5 subgroup is created with the object "name", and each element of the outer dimension is
written recursively to it as object "name_n", where n is a 0-based index.

On readback (by Grid), the presence of a subgroup containing the attribute ``Grid_vector_size`` triggers a
"ragged read", otherwise a read from a DataSet is attempted.

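The naming scheme for the ragged layout can be sketched with a toy group structure. ``ToyGroup``, ``planRaggedGroup`` and ``triggersRaggedRead`` are hypothetical names, no HDF5 calls are made, and the value stored under ``Grid_vector_size`` (the outer vector length) is an assumption based on the attribute's name; the text only says that its presence triggers a ragged read.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model of a subgroup: attributes plus the names of its member objects.
struct ToyGroup {
  std::map<std::string, std::size_t> attributes;
  std::vector<std::string> members;
};

// Plan the subgroup for a ragged object called `name` with `outerSize` elements.
ToyGroup planRaggedGroup(const std::string &name, std::size_t outerSize) {
  ToyGroup g;
  g.attributes["Grid_vector_size"] = outerSize;  // ASSUMPTION: stores the outer length
  for (std::size_t n = 0; n < outerSize; ++n)
    g.members.push_back(name + "_" + std::to_string(n));  // "name_n", n counted from 0
  return g;
}

// Reader-side test: a subgroup carrying the attribute means "ragged read";
// a null pointer models "no subgroup found", so a DataSet read is tried instead.
bool triggersRaggedRead(const ToyGroup *subgroup) {
  return subgroup != nullptr && subgroup->attributes.count("Grid_vector_size") > 0;
}

void exampleRaggedPlan() {
  const ToyGroup g = planRaggedGroup("v", 2);
  assert(g.members.size() == 2);
  assert(g.members[0] == "v_0" && g.members[1] == "v_1");
  assert(triggersRaggedRead(&g));
  assert(!triggersRaggedRead(nullptr));
}
```

Because each ``name_n`` element is written recursively, an element that is itself regular can still be stored as a single DataSet inside the subgroup.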

Data parallel field IO
-----------------------