datatable.sort()

Sort clause for use in Frame’s square-bracket selector.

When a sort() object is present inside a DT[i, j, ...] expression, the rows of the resulting Frame are sorted by the columns cols passed as arguments to sort().

When used together with by(), the sort clause applies after the group-by, i.e. we sort elements within each group. Note, however, that because we use stable sorting, the operations of grouping and sorting are commutative: the result of applying groupby and then sort is the same as the result of sorting first and then doing groupby.
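
The commutativity claim can be checked with a small plain-Python model. This is only a sketch: it uses lists of tuples and Python's built-in stable sorted() rather than datatable itself, and the tuples reuse the col4 and col2 values from the Examples section, plus the original row index.

```python
# Rows as (group_key, sort_key, original_row) tuples; a stand-in for a Frame,
# not datatable code. Values mirror col4 and col2 from the Examples section.
col4 = [1, 2, 3, 3, 2, 1]
col2 = [2, 1, 9, 8, 7, 4]
rows = list(zip(col4, col2, range(len(col4))))

# Order 1: group first, then sort the rows inside each group.
group_then_sort = []
for key in sorted(set(col4)):
    members = [r for r in rows if r[0] == key]
    group_then_sort.extend(sorted(members, key=lambda r: r[1]))

# Order 2: sort the whole table first, then group with a stable sort,
# which preserves the sorted order inside each group.
sorted_first = sorted(rows, key=lambda r: r[1])
sort_then_group = sorted(sorted_first, key=lambda r: r[0])

print(group_then_sort == sort_then_group)  # True
```

Both orders rely on the sort being stable; with an unstable sort the second variant could scramble rows within a group.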

When used together with i (row filter), the i filter is applied after the sorting. For example:

DT[:10, :, sort(f.Highscore, reverse=True)]

will select the 10 records with the highest Highscore: the frame DT is first sorted by the Highscore column in descending order, and then the first 10 rows are taken.
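
To make the evaluation order concrete, here is a plain-Python analogue. It is a sketch on an ordinary list, not datatable code; the score values and the cutoff n are made up for illustration.

```python
# Stand-in for the Highscore column; the values are hypothetical.
highscore = [50, 90, 10, 70, 30]
n = 3  # plays the role of the 10 in DT[:10, ...]

# Order described above: sort in descending order first, then take the first n rows.
sort_then_slice = sorted(highscore, reverse=True)[:n]

# The opposite order, for contrast: take the first n rows, then sort them.
slice_then_sort = sorted(highscore[:n], reverse=True)

print(sort_then_slice)  # [90, 70, 50]
print(slice_then_sort)  # [90, 50, 10]
```

Only the first variant yields the top n scores, which is why applying the i filter after the sort is the useful order here.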

Parameters

cols
FExpr

Columns to sort the frame by.

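reverse
bool | List[bool]

If False, sorting is performed in the ascending order. If True, sorting is descending. A list of booleans, one per column in cols, sets the direction for each column separately.

na_position
"first" | "last" | "remove"

Placement of null values in the sorted output. With "first" (the default) they are placed at the top, with "last" at the bottom, and with "remove" rows with null values are excluded from the output.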

return
object

datatable.sort object for use in square-bracket selector.

Examples

from datatable import dt, f, by

DT = dt.Frame({"col1": ["A", "A", "B", None, "D", "C"],
               "col2": [2, 1, 9, 8, 7, 4],
               "col3": [0, 1, 9, 4, 2, 3],
               "col4": [1, 2, 3, 3, 2, 1]})
DT
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | A          2      0      1
 1 | A          1      1      2
 2 | B          9      9      3
 3 | NA         8      4      3
 4 | D          7      2      2
 5 | C          4      3      1

Sort by a single column:

DT[:, :, dt.sort("col1")]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | NA         8      4      3
 1 | A          2      0      1
 2 | A          1      1      2
 3 | B          9      9      3
 4 | C          4      3      1
 5 | D          7      2      2

Sort by multiple columns:

DT[:, :, dt.sort("col2", "col3")]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | A          1      1      2
 1 | A          2      0      1
 2 | C          4      3      1
 3 | D          7      2      2
 4 | NA         8      4      3
 5 | B          9      9      3

Sort in descending order:

DT[:, :, dt.sort(-f.col1)]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | NA         8      4      3
 1 | D          7      2      2
 2 | C          4      3      1
 3 | B          9      9      3
 4 | A          2      0      1
 5 | A          1      1      2

The frame can also be sorted in descending order by setting the reverse parameter to True:

DT[:, :, dt.sort("col1", reverse=True)]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | NA         8      4      3
 1 | D          7      2      2
 2 | C          4      3      1
 3 | B          9      9      3
 4 | A          2      0      1
 5 | A          1      1      2

By default, null values are placed at the top when sorting; to move them to the bottom, pass "last" to the na_position parameter:

DT[:, :, dt.sort("col1", na_position="last")]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | A          2      0      1
 1 | A          1      1      2
 2 | B          9      9      3
 3 | C          4      3      1
 4 | D          7      2      2
 5 | NA         8      4      3

Passing "remove" to na_position excludes rows with null values from the sorted output entirely:

DT[:, :, dt.sort("col1", na_position="remove")]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | A          2      0      1
 1 | A          1      1      2
 2 | B          9      9      3
 3 | C          4      3      1
 4 | D          7      2      2
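
The three placements of null values can be summarised in a small plain-Python model. This is a sketch only: sort_with_na is a hypothetical helper written for illustration, not part of datatable, and it works on a single list (the col1 values from above) rather than on a Frame.

```python
def sort_with_na(values, na_position="first"):
    """Hypothetical helper: sort a list in ascending order, placing None
    values according to na_position ("first", "last" or "remove")."""
    present = sorted(v for v in values if v is not None)
    missing = [v for v in values if v is None]
    if na_position == "first":
        return missing + present
    if na_position == "last":
        return present + missing
    if na_position == "remove":
        return present
    raise ValueError(f"unknown na_position: {na_position!r}")

col1 = ["A", "A", "B", None, "D", "C"]

first = sort_with_na(col1)               # nulls at the top (the default)
last = sort_with_na(col1, "last")        # nulls at the bottom
removed = sort_with_na(col1, "remove")   # nulls dropped

print(first)    # [None, 'A', 'A', 'B', 'C', 'D']
print(last)     # ['A', 'A', 'B', 'C', 'D', None]
print(removed)  # ['A', 'A', 'B', 'C', 'D']
```

The printed orderings match the col1 column of the three sorted frames shown in this section.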

Sort by multiple columns, the first in descending and the second in ascending order:

DT[:, :, dt.sort(-f.col2, f.col3)]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | B          9      9      3
 1 | NA         8      4      3
 2 | D          7      2      2
 3 | C          4      3      1
 4 | A          2      0      1
 5 | A          1      1      2

The same result can be obtained by passing a list of booleans to reverse. The length of the list must match the number of columns being sorted:

DT[:, :, dt.sort("col2", "col3", reverse=[True, False])]
   | col1    col2   col3   col4
   | str32  int32  int32  int32
-- + -----  -----  -----  -----
 0 | B          9      9      3
 1 | NA         8      4      3
 2 | D          7      2      2
 3 | C          4      3      1
 4 | A          2      0      1
 5 | A          1      1      2

In the presence of by(), sort() sorts within each group:

DT[:, :, by("col4"), dt.sort(f.col2)]
   |  col4  col1    col2   col3
   | int32  str32  int32  int32
-- + -----  -----  -----  -----
 0 |     1  A          2      0
 1 |     1  C          4      3
 2 |     2  A          1      1
 3 |     2  D          7      2
 4 |     3  NA         8      4
 5 |     3  B          9      9