Introduction to Datatable

H2O’s datatable is a Python package for manipulating 2-dimensional tabular data structures (also known as data frames). It is close in spirit to pandas or SFrame; however, we put specific emphasis on speed and big-data support. As the name suggests, the package is closely related to R’s data.table and attempts to mimic its core algorithms and API.

datatable is currently in the alpha stage and under active development. The API may be unstable, and some core features are incomplete or missing.


datatable is an open source project released under the Mozilla Public License 2.0. Open source projects live by their user and developer communities. We welcome and encourage contributions of any kind!

Whatever your skill set or level of engagement with datatable, you can help others by improving the documentation, filing bug reports and feature requests, and contributing code.

We invite anyone who is interested to contribute, whether through pull requests, tests, GitHub issues, API suggestions, or general discussion.

Have Questions?

If you have questions about using datatable, post them on Stack Overflow using the [datatable] and [python] tags.

