datatable.Frame
Two-dimensional column-oriented container of data. This is the primary data structure in the datatable module.
A Frame is two-dimensional in the sense that it is composed of rows and columns of data. Each data cell can be located via a pair of its coordinates: (irow, icol). We do not support frames with more or fewer than two dimensions.
A Frame is column-oriented in the sense that internally the data is stored separately for each column. Each column has its own name and type. Types may be different for different columns but cannot vary within each column.
Thus, the dimensions of a Frame are not symmetrical: a Frame is not a matrix. Internally the class is optimized for the use case when the number of rows significantly exceeds the number of columns.
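To make the column-oriented layout concrete, here is a toy sketch in plain Python. It is not datatable's implementation: the class name ToyColumnStore and its methods are hypothetical and exist only for illustration. The sketch assumes that each column is kept as a separate list with its own name and a single element type, and that a cell is addressed by the pair (irow, icol).

```python
class ToyColumnStore:
    """Hypothetical illustration of column-oriented storage (not datatable code)."""

    def __init__(self, columns):
        # `columns` maps a column name to a list of values.
        self.names = tuple(columns)
        self._data = [list(values) for values in columns.values()]
        # Every column must have the same number of rows.
        lengths = {len(col) for col in self._data}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        # Types may differ between columns but not within one column.
        self.types = []
        for name, col in zip(self.names, self._data):
            kinds = {type(v) for v in col}
            if len(kinds) > 1:
                raise TypeError(f"column {name!r} mixes several types")
            self.types.append(kinds.pop() if kinds else None)

    @property
    def shape(self):
        # (number of rows, number of columns)
        nrows = len(self._data[0]) if self._data else 0
        return (nrows, len(self._data))

    def cell(self, irow, icol):
        # Pick the column first, then the row within it.
        return self._data[icol][irow]


store = ToyColumnStore({"id": [1, 2, 3], "label": ["a", "b", "c"]})
print(store.shape)       # (3, 2)
print(store.cell(1, 1))  # b
print(store.types)       # [<class 'int'>, <class 'str'>]
```

Because each column lives in its own list, scanning one column touches only that column's values, which is one reason such a layout suits tables with many rows and few columns.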
A Frame can be viewed as a list of columns: the standard Python function len() will return the number of columns in the Frame, and frame[j] will return the column at index j (each “column” will be a Frame with ncols == 1). Similarly, you can iterate over the columns of a Frame in a loop, or use it in a *-expansion:
for column in frame:
    ...  # do something with each column
list_of_columns = [*frame]
A Frame can also be viewed as a dict of columns, where the key associated with each column is its name. Thus, frame[name] will return the column with the requested name. A Frame can also work with the standard Python **-expansion:
dict_of_columns = {**frame}
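The list-like and dict-like views described above rest on ordinary Python protocols. The following is a minimal sketch of how one object can support both; ToyFrame is a hypothetical stand-in written for this illustration, not the datatable class, and the way it stores columns is an assumption.

```python
class ToyFrame:
    """Hypothetical stand-in showing the list-like and dict-like protocols."""

    def __init__(self, columns):
        self._columns = dict(columns)   # name -> list of values

    def __len__(self):
        # Number of columns, not rows.
        return len(self._columns)

    def __getitem__(self, key):
        # An integer selects by position, a string selects by name.
        if isinstance(key, int):
            name = list(self._columns)[key]
        else:
            name = key
        # Each "column" is itself a one-column ToyFrame.
        return ToyFrame({name: self._columns[name]})

    def __iter__(self):
        # Iteration (and *-expansion) yields one-column frames.
        for name in self._columns:
            yield self[name]

    def keys(self):
        # keys() together with __getitem__ is what **-expansion relies on.
        return list(self._columns)


frame = ToyFrame({"x": [1, 2], "y": [3.0, 4.0], "z": ["a", "b"]})
list_of_columns = [*frame]
dict_of_columns = {**frame}
print(len(frame))               # 3
print(len(list_of_columns))     # 3
print(sorted(dict_of_columns))  # ['x', 'y', 'z']
print(frame[1].keys())          # ['y']
```

Note that iteration yields columns while `**`-expansion goes through keys() and name lookup, so the same object behaves like a list in one context and like a dict in the other.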
Construction
Construct the frame from various Python sources. |
|
Read an external file and convert into a Frame. |
|
Create a copy of the frame. |
Properties
The primary key for the Frame, if any. |
|
Logical types ( |
|
The frame’s meta information. |
|
The names of all columns in the frame. |
|
Number of columns in the frame. |
|
Number of rows in the frame. |
|
A tuple (number of rows, number of columns). |
|
Where this frame was loaded from. |
|
The common |
|
Storage types ( |
|
The common type ( |
|
types ( |
Frame manipulation
Primary method for extracting data from a frame. |
|
Update data within the frame. |
|
Remove rows/columns/values from the frame. |
|
Append columns of other frames to this frame. |
|
Append other frames at the bottom of the current. |
|
Search and replace values in the frame. |
|
Sort the frame by the specified columns. |
Convert into other formats
Convert the frame into an Arrow table. |
|
Write the frame’s data into CSV format. |
|
Convert the frame into a Python dictionary, by columns. |
|
Store the frame’s data into a binary file in Jay format. |
|
Return the frame’s data as a list of lists, by columns. |
|
Convert the frame into a numpy array. |
|
Convert the frame into a pandas DataFrame. |
|
Return the frame’s data as a list of tuples, by rows. |
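The by-columns and by-rows conversions listed above differ only in orientation. As a rough sketch in plain Python (the variables below are hypothetical sample data, and no datatable calls are used), the three pure-Python shapes relate to each other like this:

```python
names = ["id", "score"]
columns = [[1, 2, 3], [0.5, 0.25, 0.75]]   # data stored by columns

# By columns: a dictionary keyed by column name.
as_dict = dict(zip(names, columns))

# By columns: a list of lists, one inner list per column.
as_list = [list(col) for col in columns]

# By rows: a list of tuples, one tuple per row (transpose via zip).
as_tuples = list(zip(*columns))

print(as_dict)    # {'id': [1, 2, 3], 'score': [0.5, 0.25, 0.75]}
print(as_list)    # [[1, 2, 3], [0.5, 0.25, 0.75]]
print(as_tuples)  # [(1, 0.5), (2, 0.25), (3, 0.75)]
```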
Statistical methods
Count missing values for each column in the frame. |
|
Count missing values for a one-column frame and return it as a scalar. |
|
Calculate excess kurtosis for each column in the frame. |
|
Calculate excess kurtosis for a one-column frame and return it as a scalar. |
|
Find the largest element for each column in the frame. |
|
Find the largest element for a one-column frame and return it as a scalar. |
|
Calculate the mean value for each column in the frame. |
|
Calculate the mean value for a one-column frame and return it as a scalar. |
|
Find the smallest element for each column in the frame. |
|
Find the smallest element for a one-column frame and return it as a scalar. |
|
Find the mode value for each column in the frame. |
|
Find the mode value for a one-column frame and return it as a scalar. |
|
Calculate the modal frequency for each column in the frame. |
|
Calculate the modal frequency for a one-column frame and return it as a scalar. |
|
Count the number of unique values for each column in the frame. |
|
Count the number of unique values for a one-column frame and return it as a scalar. |
|
Calculate the standard deviation for each column in the frame. |
|
Calculate the standard deviation for a one-column frame and return it as a scalar. |
|
Calculate skewness for each column in the frame. |
|
Calculate skewness for a one-column frame and return it as a scalar. |
|
Calculate the sum of all values for each column in the frame. |
|
Calculate the sum of all values for a one-column frame and return it as a scalar. |
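Each statistic above comes in two flavours: one computed for every column, and one that applies to a one-column frame and returns a bare scalar. The sketch below illustrates that pattern in plain Python. The helper names per_column, single_column, and count_missing are hypothetical, and the table is modelled as a dict of lists rather than a real Frame, so the return types here are an assumption of the sketch.

```python
def per_column(stat, columns):
    """Apply `stat` to every column; returns one value per column name."""
    return {name: stat(values) for name, values in columns.items()}


def single_column(stat, columns):
    """Apply `stat` to a one-column table and return a bare scalar."""
    if len(columns) != 1:
        raise ValueError("expected exactly one column")
    (values,) = columns.values()
    return stat(values)


def count_missing(values):
    # Missing values are modelled as None in this sketch.
    return sum(v is None for v in values)


table = {"a": [1, None, 3], "b": [None, None, 6]}
print(per_column(count_missing, table))                 # {'a': 1, 'b': 2}
print(single_column(count_missing, {"a": table["a"]}))  # 1
```

The scalar variant refuses tables with more than one column, since a single return value would otherwise be ambiguous.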
Miscellaneous methods
Find the position of a column in the frame by its name. |
|
Create python variables for each column of the frame. |
|
Return the first few rows of the frame. |
|
Make sure all of the frame’s data is physically written to memory. |
|
Return the last few rows of the frame. |
Special methods
These methods are not intended to be called manually; instead, they provide a way for datatable to interoperate with other Python modules or builtin functions.
Used by Python module |
|
Used by Python module |
|
Method that implements the |
|
Method that implements the |
|
Used by Python module |
|
The constructor function. |
|
Used by Python function |
|
Used by Python function |
|
Used by Python function |
|
Used by Python function |
|
Method that implements the |
|
Used by Python module |
|
Used by |
|
Used by Python function |
|
|
Used to display the frame in Jupyter Lab. |
|
Used to display the frame in an |