Further reading
Scientific data file format libraries
Related libraries
- openPMD C++ and Python API: a library for implementing the openPMD standard for metadata and naming schemes for particle-mesh data files, usable with several backends.
Related trainings
-
Various VSC Python trainings show how to use scientific data file formats and how to work in a file-system-friendly way in Python:
-
Scientific Python. The current version of the slides contains a chapter on the use of HDF5. The GitHub repository contains examples and a Jupyter notebook for HDF5, as well as examples and a Jupyter notebook for netCDF.
-
Python for HPC. Not everything is mentioned on the website, but the current version of the slides ends with a chapter on I/O that mostly discusses using HDF5 in Python. See also these example files and Jupyter notebook on GitHub.
-
Sometimes other data formats may be a better choice, e.g., in AI applications. With Python, one can read directly from various archive file formats, e.g., gzip and tar files (similar libraries exist for other languages, e.g., for C programmers). See these examples from the VSC Python systems programming course.
-
MPI courses often also discuss MPI I/O, which is a building block for implementing proper I/O strategies in parallel applications and is also used in the implementation of HDF5, netCDF and likely other libraries.
-
Jülich Supercomputing Centre (between Aachen and Köln) organises many training courses. The annual training "Parallel I/O and Data Formats" is particularly relevant for this chapter of the tutorial.
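
To illustrate the remark above about reading directly from archive files in Python, here is a minimal sketch that uses only the standard-library modules `tarfile`, `gzip` and `io`. It is not taken from the VSC course material. The helper names `build_demo_archive` and `read_members`, the member names and the sample data are all hypothetical, and the archive is built in memory only to keep the example self-contained. With a real archive on disk, one would pass its path to `tarfile.open` instead.

```python
import gzip
import io
import tarfile


def build_demo_archive(n_files=3):
    """Create a small in-memory .tar.gz with n_files text members (demo data only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for i in range(n_files):
            payload = f"sample {i}\n".encode("utf-8")
            info = tarfile.TarInfo(name=f"data/sample_{i}.txt")
            info.size = len(payload)  # size must be set before adding the member
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_members(archive_bytes):
    """Read every regular file in the archive without extracting it to disk."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, etc.
            with tar.extractfile(member) as fh:
                contents[member.name] = fh.read().decode("utf-8")
    return contents


# Many small samples live in a single archive and are read one by one.
archive = build_demo_archive()
samples = read_members(archive)

# gzip on its own: read a compressed text stream line by line.
packed = gzip.compress(b"x,y\n1,2\n")
with gzip.open(io.BytesIO(packed), mode="rt", encoding="utf-8") as fh:
    rows = [line.strip().split(",") for line in fh]
```

Keeping many small samples as members of one archive means the file system sees a single file rather than thousands, which is the kind of file-system-friendly access pattern these trainings discuss.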