datatable.Frame

class Frame

Two-dimensional column-oriented container of data. This is the primary data structure in the datatable module.

A Frame is two-dimensional in the sense that it is composed of rows and columns of data. Each data cell can be located via a pair of coordinates: (irow, icol). Frames with more or fewer than two dimensions are not supported.

A Frame is column-oriented in the sense that internally the data is stored separately for each column. Each column has its own name and type. Types may be different for different columns but cannot vary within each column.

Thus, the dimensions of a Frame are not symmetrical: a Frame is not a matrix. Internally the class is optimized for the use case when the number of rows significantly exceeds the number of columns.

A Frame can be viewed as a list of columns: the standard Python function len() returns the number of columns in the Frame, and frame[j] returns the column at index j (each “column” is itself a Frame with ncols == 1). Similarly, you can iterate over the columns of a Frame in a loop, or use it in a *-expansion:

for column in frame:
    # do something

list_of_columns = [*frame]

A Frame can also be viewed as a dict of columns, where the key associated with each column is its name. Thus, frame[name] returns the column with the requested name. A Frame also works with the standard Python **-expansion:

dict_of_columns = {**frame}

Construction

Frame(*args, **kws)

Construct the frame from various Python sources.

dt.fread(src)

Read an external file and convert it into a Frame.

.copy()

Create a copy of the frame.

Properties

.key

The primary key for the Frame, if any.

.ltypes

Logical types (dt.ltypes) of all columns.

.meta

The frame’s meta information.

.names

The names of all columns in the frame.

.ncols

Number of columns in the frame.

.nrows

Number of rows in the frame.

.shape

A tuple (number of rows, number of columns).

.source

Where this frame was loaded from.

.stype

The common dt.stype for the entire frame.

.stypes

Storage types (dt.stypes) of all columns.

.type

The common type (dt.Type) for the entire frame.

.types

Types (dt.Type) of all columns.

Frame manipulation

frame[i, j, ...]

Primary method for extracting data from a frame.

frame[i, j, ...] = values

Update data within the frame.

del frame[i, j, ...]

Remove rows/columns/values from the frame.

.cbind(*frames)

Append columns of other frames to this frame.

.rbind(*frames)

Append other frames at the bottom of the current.

.replace(replace_what, replace_with)

Search and replace values in the frame.

.sort(cols)

Sort the frame by the specified columns.

Convert into other formats

.to_arrow()

Convert the frame into an Arrow table.

.to_csv(file)

Write the frame’s data into CSV format.

.to_dict()

Convert the frame into a Python dictionary, by columns.

.to_jay(file)

Store the frame’s data into a binary file in Jay format.

.to_list()

Return the frame’s data as a list of lists, by columns.

.to_numpy()

Convert the frame into a numpy array.

.to_pandas()

Convert the frame into a pandas DataFrame.

.to_tuples()

Return the frame’s data as a list of tuples, by rows.

Statistical methods

.countna()

Count missing values for each column in the frame.

.countna1()

Count missing values for a one-column frame and return it as a scalar.

.kurt()

Calculate excess kurtosis for each column in the frame.

.kurt1()

Calculate excess kurtosis for a one-column frame and return it as a scalar.

.max()

Find the largest element for each column in the frame.

.max1()

Find the largest element for a one-column frame and return it as a scalar.

.mean()

Calculate the mean value for each column in the frame.

.mean1()

Calculate the mean value for a one-column frame and return it as a scalar.

.min()

Find the smallest element for each column in the frame.

.min1()

Find the smallest element for a one-column frame and return it as a scalar.

.mode()

Find the mode value for each column in the frame.

.mode1()

Find the mode value for a one-column frame and return it as a scalar.

.nmodal()

Calculate the modal frequency for each column in the frame.

.nmodal1()

Calculate the modal frequency for a one-column frame and return it as a scalar.

.nunique()

Count the number of unique values for each column in the frame.

.nunique1()

Count the number of unique values for a one-column frame and return it as a scalar.

.sd()

Calculate the standard deviation for each column in the frame.

.sd1()

Calculate the standard deviation for a one-column frame and return it as a scalar.

.skew()

Calculate skewness for each column in the frame.

.skew1()

Calculate skewness for a one-column frame and return it as a scalar.

.sum()

Calculate the sum of all values for each column in the frame.

.sum1()

Calculate the sum of all values for a one-column frame and return it as a scalar.

Miscellaneous methods

.colindex(name)

Find the position of a column in the frame by its name.

.export_names()

Create Python variables for each column of the frame.

.head()

Return the first few rows of the frame.

.materialize()

Make sure all of the frame’s data is physically written to memory.

.tail()

Return the last few rows of the frame.

Special methods

These methods are not intended to be called manually; instead, they provide a way for datatable to interoperate with other Python modules or builtin functions.

.__copy__()

Used by Python module copy.

.__deepcopy__()

Used by Python module copy.

.__delitem__()

Method that implements the del DT[...] call.

.__getitem__()

Method that implements the DT[...] call.

.__getstate__()

Used by Python module pickle.

.__init__(...)

The constructor function.

.__iter__()

Used by Python function iter(), or when the frame is used as a target in a loop.

.__len__()

Used by Python function len().

.__repr__()

Used by Python function repr().

.__reversed__()

Used by Python function reversed().

.__setitem__()

Method that implements the DT[...] = expr call.

.__setstate__()

Used by Python module pickle.

.__sizeof__()

Used by Python function sys.getsizeof().

.__str__()

Used by Python function str().

._repr_html_()

Used to display the frame in Jupyter Lab.

._repr_pretty_()

Used to display the frame in an IPython console.