What is PyTables?

[https://www.pytables.org PyTables] is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data.

[https://www.pytables.org PyTables] is built on top of the [https://www.hdfgroup.org/HDF5/ HDF5] library, using the Python language and the [https://numpy.scipy.org/ NumPy] package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using [https://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/ Pyrex]), makes it a fast yet extremely easy-to-use tool for interactively working with, processing and searching very large amounts of data. One important feature of [https://www.pytables.org PyTables] is that it optimizes memory and disk resources so that data takes much less space (especially if on-the-fly compression is used) than in other solutions such as relational or object-oriented databases.
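
As a rough illustration of the compression and searching features mentioned above, the following minimal sketch writes a small compressed table and then queries it. The `Reading` record layout, the file name and the choice of zlib at level 5 are assumptions made for this example only; they are not taken from the PyTables documentation.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    # Hypothetical record layout, chosen only for illustration.
    sensor_id = tables.Int32Col()
    value = tables.Float64Col()


# Write to a throwaway temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "readings.h5")

# Compression settings are an assumption; they apply transparently on I/O.
filters = tables.Filters(complevel=5, complib="zlib")

with tables.open_file(path, mode="w", filters=filters) as h5file:
    table = h5file.create_table("/", "readings", Reading, "Sensor readings")
    row = table.row
    for i in range(10000):
        row["sensor_id"] = i % 10
        row["value"] = float(i)
        row.append()
    table.flush()

with tables.open_file(path, mode="r") as h5file:
    table = h5file.root.readings
    # The condition is passed as a string and evaluated by PyTables,
    # which returns only the matching rows as a NumPy structured array.
    hits = table.read_where("(sensor_id == 3) & (value > 9000.0)")
    matching_values = hits["value"].tolist()
```

Here `matching_values` holds the `value` field of every row that satisfies both conditions, in table order.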

Design goals

PyTables has been designed to fulfill the following requirements:

Allow you to structure your data hierarchically.
Be easy to use. It implements a natural naming scheme that gives convenient access to the data.
Allow every cell in a dataset to be a multidimensional entity.
Keep the speed of most I/O operations limited only by the underlying I/O subsystem.
Enable the end user to save large datasets efficiently, i.e. each byte of data on disk should take up one byte, plus a small overhead, when loaded into memory.
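
The first three goals can be sketched in a few lines. This is only a minimal example under stated assumptions: the `Event` schema, the group name `detector` and the file name are hypothetical and exist just to show a hierarchy, natural naming and a multidimensional cell.

```python
import os
import tempfile

import numpy as np
import tables


class Event(tables.IsDescription):
    # Hypothetical schema for illustration only.
    event_id = tables.Int64Col()
    # A single cell that holds a 2x3 array rather than a scalar.
    pixels = tables.Float32Col(shape=(2, 3))


# Write to a throwaway temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "events.h5")

with tables.open_file(path, mode="w", title="Design goals demo") as h5file:
    # Hierarchy: a group under the root node, with a table inside it.
    detector = h5file.create_group("/", "detector", "Detector data")
    table = h5file.create_table(detector, "events", Event, "Events")
    row = table.row
    for i in range(3):
        row["event_id"] = i
        row["pixels"] = np.full((2, 3), i, dtype=np.float32)
        row.append()
    table.flush()

with tables.open_file(path, mode="r") as h5file:
    # Natural naming: nodes are reached as attributes of their parents.
    events = h5file.root.detector.events
    cell_shape = events[0]["pixels"].shape
    last_pixel_sum = float(events[2]["pixels"].sum())
    number_of_rows = events.nrows
```

Natural naming works here because `detector` and `events` are valid Python identifiers.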

Where to find it

For more information, documentation and downloads, please go to the official PyTables [https://www.pytables.org home page].
