Mirror of https://github.com/paboyle/Grid.git, synced 2025-04-04 19:25:56 +01:00
Documentation update (briefly) covering serialisation changes. For review
parent 2b1fcd78c3
commit 393727b93b
1
.gitignore
vendored
@@ -88,6 +88,7 @@ Thumbs.db
 # build directory #
 ###################
 build*/*
+Documentation/_build

 # IDE related files #
 #####################
Binary file not shown.
@@ -1787,7 +1787,7 @@ Hdf5Writer Hdf5Reader HDF5

 Write interfaces, similar to the XML facilities in QDP++ are presented. However,
 the serialisation routines are automatically generated by the macro, and a virtual
-reader adn writer interface enables writing to any of a number of formats.
+reader and writer interface enables writing to any of a number of formats.

 **Example**::

@@ -1814,6 +1814,91 @@ reader adn writer interface enables writing to any of a number of formats.
 }


+Eigen tensor support -- added 2019H1
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Serialisation library was expanded in 2019 to support de/serialisation of
+Eigen tensors. De/serialisation of existing types was not changed. Data files
+without Eigen tensors remain compatible with earlier versions of Grid and other readers.
+Conversely, data files that contain serialised Eigen tensors are a breaking change: earlier versions cannot read them.
+
+Eigen tensor serialisation support was added to BaseIO, which was modified to provide traits classes
+that recognise Eigen tensors whose elements are either primitive scalars (arithmetic and complex types)
+or Grid tensors.
+
+**Traits determining de/serialisable scalars**::
+
+    // Is this an Eigen tensor
+    template<typename T> struct is_tensor : std::integral_constant<bool,
+      std::is_base_of<Eigen::TensorBase<T, Eigen::ReadOnlyAccessors>, T>::value> {};
+    // Is this an Eigen tensor of a supported scalar
+    template<typename T, typename V = void> struct is_tensor_of_scalar : public std::false_type {};
+    template<typename T> struct is_tensor_of_scalar<T, typename std::enable_if<is_tensor<T>::value && is_scalar<typename T::Scalar>::value>::type> : public std::true_type {};
+    // Is this an Eigen tensor of a supported container
+    template<typename T, typename V = void> struct is_tensor_of_container : public std::false_type {};
+    template<typename T> struct is_tensor_of_container<T, typename std::enable_if<is_tensor<T>::value && isGridTensor<typename T::Scalar>::value>::type> : public std::true_type {};
+
+
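The selection logic of these traits can be illustrated without Eigen. The sketch below is not Grid's code: ``MockTensorBase`` and ``MockTensor`` are hypothetical stand-ins for ``Eigen::TensorBase`` and an Eigen tensor, and the ``is_scalar`` shown here is only an assumption about what counts as a primitive scalar (arithmetic types and ``std::complex``).

```cpp
#include <complex>
#include <type_traits>

// Assumed stand-in for Grid's is_scalar: arithmetic types and std::complex.
template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};
template <typename T>
struct is_scalar : std::integral_constant<bool,
    std::is_arithmetic<T>::value || is_complex<T>::value> {};

// Hypothetical CRTP base playing the role of Eigen::TensorBase.
template <typename Derived> struct MockTensorBase {};

// Hypothetical tensor type; like an Eigen tensor it exposes its element
// type through a nested Scalar typedef.
template <typename S> struct MockTensor : MockTensorBase<MockTensor<S>> {
  using Scalar = S;
};

// Same pattern as the documented is_tensor, with the mock base class.
template <typename T>
struct is_tensor : std::integral_constant<bool,
    std::is_base_of<MockTensorBase<T>, T>::value> {};

// Primary template: false unless the specialisation below is viable.
template <typename T, typename V = void>
struct is_tensor_of_scalar : std::false_type {};

// Chosen only when T is a tensor and T::Scalar is a supported scalar;
// otherwise enable_if has no ::type and the specialisation is discarded.
template <typename T>
struct is_tensor_of_scalar<T,
    typename std::enable_if<is_tensor<T>::value &&
                            is_scalar<typename T::Scalar>::value>::type>
    : std::true_type {};
```

Under these assumptions ``is_tensor_of_scalar<MockTensor<double>>::value`` is true, whereas ``is_tensor_of_scalar<int>::value`` is false: ``int`` has no ``Scalar`` member, so the specialisation is discarded and the primary template applies.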
+Eigen tensors are regular, multidimensional objects, and each Reader/Writer
+was extended to support this new datatype. Where the Eigen tensor contains
+a Grid tensor, the dimensions of the data written are the dimensions of the
+Eigen tensor plus the dimensions of the underlying Grid scalar. Dimensions
+of size 1 are preserved.
+
+**New Reader/Writer methods for multi-dimensional data**::
+
+    template <typename U>
+    void readMultiDim(const std::string &s, std::vector<U> &buf, std::vector<size_t> &dim);
+    template <typename U>
+    void writeMultiDim(const std::string &s, const std::vector<size_t> & Dimensions, const U * pDataRowMajor, size_t NumElements);
+
+
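As a rough illustration of the bookkeeping behind a row-major buffer with a dimension list, consider the sketch below. The helper names are hypothetical and not part of Grid. It also assumes that the Eigen dimensions come first and the Grid tensor dimensions follow, and that ``NumElements`` equals the product of ``Dimensions``. Neither assumption is stated in the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dimensions written for an Eigen tensor of Grid tensors.
// Assumes Eigen dimensions first, then Grid dimensions; sizes of 1 are kept.
std::vector<std::size_t> combinedDimensions(
    const std::vector<std::size_t> &eigenDims,
    const std::vector<std::size_t> &gridDims) {
  std::vector<std::size_t> total(eigenDims);
  total.insert(total.end(), gridDims.begin(), gridDims.end());
  return total;
}

// Number of elements implied by a dimension list (product of all sizes).
std::size_t numElements(const std::vector<std::size_t> &dims) {
  std::size_t n = 1;
  for (std::size_t d : dims) n *= d;
  return n;
}

// Offset of a multi-index in a row-major buffer: the last index varies fastest.
std::size_t rowMajorOffset(const std::vector<std::size_t> &dims,
                           const std::vector<std::size_t> &index) {
  assert(dims.size() == index.size());
  std::size_t offset = 0;
  for (std::size_t i = 0; i < dims.size(); ++i) {
    assert(index[i] < dims[i]);
    offset = offset * dims[i] + index[i];
  }
  return offset;
}
```

For dimensions ``{2, 3, 4}`` the buffer holds 24 elements, and the element at index ``{1, 2, 3}`` sits at offset ``(1 * 3 + 2) * 4 + 3 = 23``, the last slot.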
+On readback, the Eigen tensor rank must match the data being read, but the tensor
+dimensions will be resized if necessary. Resizing is not possible for ``Eigen::TensorMap<T>``
+because these tensors use a buffer provided at construction, and this buffer cannot be changed.
+Deserialisation failures cause Grid to assert.
+
+
+HDF5 Optimisations -- added June 2021
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Grid serialisation is intended to be light and deterministic, and to provide a layer of abstraction over
+multiple file formats. HDF5 excels at handling multi-dimensional data, and the Grid Hdf5Reader/Hdf5Writer exploit this.
+When serialising nested ``std::vector<T>``, where ``T`` is an arithmetic or complex type,
+the Hdf5Writer writes the data as an HDF5 DataSet object.
+
+However, nested ``std::vector<std::vector<...T>>`` might be "ragged", i.e. not necessarily regular. For example, a 3d nested
+``std::vector`` might contain 2 rows, the first being a 2x2 block and the second being a 1x2 block.
+A bug existed whereby this was not checked on write, so nested, ragged vectors
+were written as a regular dataset, with a buffer under/overrun and jumbled contents.
+
+Clearly this was not used in production, as the bug went undetected until now. Fixing this bug
+is an opportunity to further optimise the HDF5 file format.
+
+The goals of this change are to:
+
+* Make changes to the HDF5 file format only -- i.e. do not impact other file formats
+
+* Implement file format changes in such a way that they are transparent to the Grid reader
+
+* Correct the bug for ragged vectors of numeric / complex types
+
+* Extend the support of nested ``std::vector<T>`` to arbitrarily nested Grid tensors
+
+
+The trait class ``element`` has been redefined as ``is_flattenable``, a trait class for
+potentially "flattenable" objects. These are (possibly nested) ``std::vector<T>`` where ``T`` is
+an arithmetic, complex or Grid tensor type. Flattenable objects are tested on write
+(with the function ``isRegularShape``) to see whether they actually are regular.
+
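One way such a regularity test could work is sketched below. This is not Grid's ``isRegularShape``. The names ``firstShape``, ``matchesShape`` and ``isRegularSketch`` are hypothetical, and the approach is an assumption: record the shape along the first element at every nesting level, then check every sub-vector against it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leaf case: a non-vector element adds no further dimension.
template <typename T>
void firstShape(const T &, std::vector<std::size_t> &) {}

// Record the size of v, then of v[0], then of v[0][0], and so on.
template <typename T>
void firstShape(const std::vector<T> &v, std::vector<std::size_t> &dims) {
  dims.push_back(v.size());
  if (!v.empty()) firstShape(v[0], dims);
}

// Leaf case: a non-vector element always matches.
template <typename T>
bool matchesShape(const T &, const std::vector<std::size_t> &, std::size_t) {
  return true;
}

// Every vector at nesting level `depth` must have size dims[depth].
template <typename T>
bool matchesShape(const std::vector<T> &v,
                  const std::vector<std::size_t> &dims, std::size_t depth) {
  if (depth >= dims.size() || v.size() != dims[depth]) return false;
  for (const auto &element : v)
    if (!matchesShape(element, dims, depth + 1)) return false;
  return true;
}

// True when the nested vector is regular, i.e. could be flattened into a
// single multidimensional block without padding.
template <typename T>
bool isRegularSketch(const std::vector<T> &v) {
  std::vector<std::size_t> dims;
  firstShape(v, dims);
  return matchesShape(v, dims, 0);
}
```

With this sketch, the ragged example described earlier (two rows, the first a 2x2 block and the second a 1x2 block) is reported as not regular, while two 2x2 blocks are reported as regular.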
+Flattenable, regular objects are written to a multidimensional HDF5 DataSet.
+Otherwise, an HDF5 subgroup is created with the object "name", and each element of the outer dimension is
+recursively written as object "name_n", where n is a 0-indexed number.
+
+On readback (by Grid), the presence of a subgroup containing the attribute ``Grid_vector_size`` triggers a
+"ragged read"; otherwise a read from a DataSet is attempted.
+
+
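The write and readback rules above can be mimicked with a small in-memory model. The sketch below does not use the HDF5 library and is not Grid's implementation. ``MockFile``, ``MockDataSet``, ``writeSketch`` and ``readSketch`` are hypothetical, and the "/" path separator is an assumption. So is the idea that ``Grid_vector_size`` stores the outer vector length. Only two nesting levels of ``double`` are handled, to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for an HDF5 file.
struct MockDataSet {
  std::vector<std::size_t> dims; // dimensions of the DataSet
  std::vector<double> data;      // contents in row-major order
};
struct MockFile {
  std::map<std::string, MockDataSet> dataSets;    // keyed by object path
  std::map<std::string, std::size_t> vectorSizes; // groups with Grid_vector_size
};

using Nested = std::vector<std::vector<double>>;

// Write one 2-d DataSet if all rows have equal length; otherwise write a group
// "name" holding one 1-d DataSet per row, called "name_0", "name_1", ...
void writeSketch(MockFile &f, const std::string &name, const Nested &v) {
  bool regular = true;
  for (const auto &row : v)
    if (row.size() != v.front().size()) regular = false;
  if (regular) {
    MockDataSet &d = f.dataSets[name];
    d.dims = {v.size(), v.empty() ? 0 : v.front().size()};
    for (const auto &row : v)
      d.data.insert(d.data.end(), row.begin(), row.end());
  } else {
    f.vectorSizes[name] = v.size(); // plays the role of Grid_vector_size
    for (std::size_t n = 0; n < v.size(); ++n) {
      MockDataSet &d = f.dataSets[name + "/" + name + "_" + std::to_string(n)];
      d.dims = {v[n].size()};
      d.data = v[n];
    }
  }
}

// Read: a group carrying the size attribute triggers the "ragged read";
// otherwise the object is read back as a regular 2-d DataSet.
Nested readSketch(const MockFile &f, const std::string &name) {
  Nested v;
  auto group = f.vectorSizes.find(name);
  if (group != f.vectorSizes.end()) {
    for (std::size_t n = 0; n < group->second; ++n)
      v.push_back(
          f.dataSets.at(name + "/" + name + "_" + std::to_string(n)).data);
  } else {
    const MockDataSet &d = f.dataSets.at(name);
    assert(d.dims.size() == 2);
    const std::size_t rows = d.dims[0], cols = d.dims[1];
    for (std::size_t r = 0; r < rows; ++r)
      v.emplace_back(
          d.data.begin() + static_cast<std::ptrdiff_t>(r * cols),
          d.data.begin() + static_cast<std::ptrdiff_t>((r + 1) * cols));
  }
  return v;
}
```

In this model a ragged input such as ``{{1, 2, 3}, {4}}`` written under the name "a" produces the objects "a/a_0" and "a/a_1", and reading "a" back returns the original ragged vector. A regular input produces a single DataSet and no group.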
 Data parallel field IO
 -----------------------
