datatable API

Symbols listed here are available for import from the root of the datatable module.

Submodules

exceptions.

datatable warnings and exceptions.

internal.

Access to some internal details of the datatable module.

math.

Mathematical functions, similar to Python’s math module.

models.

A small set of data analysis tools.

re.

Functions using regular expressions.

str.

Functions for working with string columns.

time.

Functions for working with date/time columns.

Classes

Frame

Main “table of data” class. This is the equivalent of pandas’ or Julia’s DataFrame, R’s data.table or tibble, SQL’s TABLE, etc.

FExpr

Helper class for computing formulas over a frame.

Namespace

Helper class for addressing columns in a frame.

Type

Column’s type, similar to numpy’s dtype.

stype

[DEPRECATED] Enum of column “storage” types.

ltype

[DEPRECATED] Enum of column “logical” types.

Functions

fread()

Read CSV/text/XLSX/Jay/other files

iread()

Same as fread(), but reads multiple files at once

by()

Group-by clause for use in Frame’s square-bracket selector

join()

Join clause for use in Frame’s square-bracket selector

sort()

Sort clause for use in Frame’s square-bracket selector

update()

Create new or update existing columns within a frame

cbind()

Combine frames by columns

rbind()

Combine frames by rows

repeat()

Concatenate several copies of a frame by rows

as_type()

Cast column into another type

categories()

Get categories for categorical columns

ifelse()

Ternary if operator

shift()

Shift column by a given number of rows

cut()

Bin a column into equal-width intervals

qcut()

Bin a column into equal-population intervals

split_into_nhot()

[DEPRECATED] Split and nhot-encode a single-column frame

init_styles()

Inject datatable’s stylesheets into the Jupyter notebook

rowall()

Row-wise all() function

rowany()

Row-wise any() function

rowcount()

Calculate the number of non-missing values per row

rowfirst()

Find the first non-missing value row-wise

rowlast()

Find the last non-missing value row-wise

rowargmax()

Find the index of the largest element row-wise

rowmax()

Find the largest element row-wise

rowmean()

Calculate the mean value row-wise

rowargmin()

Find the index of the smallest element row-wise

rowmin()

Find the smallest element row-wise

rowsd()

Calculate the standard deviation row-wise

rowsum()

Calculate the sum of all values row-wise

intersect()

Calculate the set intersection of values in the frames

setdiff()

Calculate the set difference between the frames

symdiff()

Calculate the symmetric difference between the sets of values in the frames

union()

Calculate the union of values in the frames

unique()

Find unique values in a frame

corr()

Calculate correlation between two columns

count()

Count non-missing values per column

countna()

Count the number of NA values per column

cumcount()

Number rows within each group

cummax()

Calculate the cumulative maximum of values per column

cummin()

Calculate the cumulative minimum of values per column

cumprod()

Calculate the cumulative product of values per column

cumsum()

Calculate the cumulative sum of values per column

cov()

Calculate covariance between two columns

fillna()

Impute missing values

max()

Find the largest element per column

mean()

Calculate mean value per column

median()

Find the median element per column

min()

Find the smallest element per column

ngroup()

Number each group

nunique()

Count the number of unique values per column

prod()

Calculate the product of all values per column

sd()

Calculate the standard deviation per column

sum()

Calculate the sum of all values per column

Other

build_info

Information about the build of the datatable module.

dt

The datatable module itself.

f

The primary namespace used during a DT[...] call.

g

The secondary namespace used during a DT[..., join()] call.

options

datatable options.