|Next release:||Version 1.1.0|
|Previous release:||Version 0.11.1|
.types returns the list of dt.Type objects for each column of the frame. These types are a generalization of the previous stypes, and will eventually replace them.
.type returns the common dt.Type for all columns of the frame (provided that it exists).
New column type dt.Type.date32 added, which can store a calendar date #2858:
import datetime
import datatable as dt
DT = dt.Frame([datetime.date(2021, 2, 17)])
New column type dt.Type.time64 added, which can store timestamps within a certain time zone (in a single column all times must be in the same time zone) #2911:
import datetime
import datatable as dt
DT = dt.Frame([datetime.datetime(2021, 3, 17, 9, 0, 0)])
A Frame can now be constructed from an Arrow table:
DT = dt.Frame(arrow_table)
This process uses the Arrow C Data Interface, and therefore does not entail data copying.
A Frame can now be converted into an Arrow table, using the .to_arrow() method:
pa_table = DT.to_arrow()
.meta property now provides access to the frame’s meta information, if any, as set by datatable functions/methods or by the user.
dt.FExpr now has method .sum(), which behaves exactly as the equivalent base level function.
dt.FExpr now has methods .median(), which behave exactly as the equivalent base level functions.
dt.FExpr now has methods for all the row functions.
dt.FExpr now has methods .shift(), which behave exactly as the equivalent base level functions.
The row selector i in the delete operation del DT[i, :] can now be an unsorted list. The list can also contain duplicate values.
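The effect of deleting with an unsorted list that contains duplicates can be modelled in plain Python. This is only a sketch of the semantics, not datatable's implementation: the helper name `delete_rows` is hypothetical, rows are represented as a simple list, and only non-negative indices are handled.

```python
def delete_rows(rows, indices):
    """Return `rows` without the positions listed in `indices`.

    `indices` may be in any order and may repeat values; each listed
    row is removed exactly once. (Hypothetical helper, non-negative
    indices only.)
    """
    to_drop = set(indices)  # order and duplicates become irrelevant
    return [row for k, row in enumerate(rows) if k not in to_drop]


rows = ["a", "b", "c", "d", "e", "f"]
remaining = delete_rows(rows, [4, 1, 4, 0])
print(remaining)
```

Converting the selector into a set is what makes both the ordering and the repeated index harmless in this model.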
When a keyed Frame is converted into a pandas DataFrame, the key columns will now become the DataFrame’s index, not regular columns. #2883
When a Frame is shown in a python console, it will now display the stype of each column, as a second line under the column names. #2810
types= in Frame’s constructor can now accept arguments of class dt.Type, and also pyarrow’s types. #2986
A Frame can now be created properly from a list of numpy bool objects. #2762
Frames with 1000000+ columns will now be correctly stored in Jay. #2876
Passing an invalid value to the column= argument of the .to_numpy() method will no longer result in a crash.
Frame terminal display no longer overflows terminal’s width if it contains strings with special characters. #2844
Sorting in reverse order now works correctly in the presence of a groupby. #2838
Creating a Frame from a list of np.str_ objects now works correctly. #3026
Converting a frame with incompatible types into a numpy array will now raise an error (instead of auto-promoting to object type). However, if the user explicitly requests promotion into the object type then there won’t be any error.
Rbinding frames with columns of incompatible types will now raise an error instead of auto-promoting to string type. #2790
When a frame is converted into a numpy array of floating type, we will now produce a regular np.ndarray instead of a masked array.
.ltypes are now considered deprecated and will be removed in a future version. Currently they continue to work as before, however.
When a frame is created from a list of Python objects of disparate types, we will no longer create a column of type object; instead, a dt.exceptions.TypeError will be thrown. An object column can still be created by an explicit request via the stype= parameter in the constructor.
stypes= in Frame constructor was renamed into types=, and similarly stype= into type=. The old parameter names are still recognized, but no longer documented.
dt.internal.in_debug_mode() removed and replaced with flags
dt.internal.regex_supported() removed entirely; datatable will now always have support for regular expressions. #2636
ifelse() can now accept more than 3 arguments, implementing a chained-if functionality. This is equivalent to CASE WHEN in SQL. #2656
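The chained-if behaviour can be illustrated with a small plain-Python model. This is a sketch of the semantics only, not datatable code: the function name `chained_ifelse` is hypothetical, every argument is a plain list with one entry per row, the argument order (condition, value, condition, value, ..., default) is an assumption about how the chain is spelled, and missing values are not modelled.

```python
def chained_ifelse(*args):
    """Plain-Python model of ifelse(cond1, value1, cond2, value2, ..., default).

    Every argument is a list with one entry per row. (Hypothetical helper.)
    """
    if len(args) < 3 or len(args) % 2 == 0:
        raise ValueError("expected an odd number of arguments (at least 3)")
    *pairs, default = args
    conditions, values = pairs[0::2], pairs[1::2]
    result = []
    for row in range(len(default)):
        # The first condition that holds decides the value, as in CASE WHEN.
        for cond, value in zip(conditions, values):
            if cond[row]:
                result.append(value[row])
                break
        else:
            # No condition matched: use the default (the ELSE branch).
            result.append(default[row])
    return result


ages = [5, 15, 40, 70]
n = len(ages)
labels = chained_ifelse(
    [a < 13 for a in ages], ["child"] * n,
    [a < 20 for a in ages], ["teen"] * n,
    [a < 65 for a in ages], ["adult"] * n,
    ["senior"] * n,
)
print(labels)
```

With exactly three arguments the model reduces to an ordinary if/else, which is why the chained form can be seen as a generalization rather than a separate function.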
as_type() allows casting columns into a different stype. This function is an alternative to the already existing functionality of using the stype itself as a cast function.
date32 columns out of individual year/month/day parts.
dt.time.day() for retrieving individual components of a date.
dt.time.day_of_week() for computing the day of week (Monday to Sunday) for the given date column.
dt.str.slice() for applying a slice to a string column. #1667
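What "applying a slice to a string column" means can be sketched in plain Python. The helper `slice_strings` below is hypothetical and is not datatable's API; it assumes Python-style start/stop/step semantics applied to each value, and it assumes that missing values (modelled here as None) stay missing.

```python
def slice_strings(column, start=None, stop=None, step=None):
    """Apply the same slice to every string in `column`.

    `column` is a plain list of str or None; None is passed through
    unchanged. (Hypothetical helper modelling the semantics only.)
    """
    s = slice(start, stop, step)
    return [None if value is None else value[s] for value in column]


names = ["datatable", None, "py"]
prefixes = slice_strings(names, 0, 4)
print(prefixes)
```

Strings shorter than the requested range are simply returned as far as they go, which is the usual behaviour of Python slices.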
sort() can now accept argument na_position=. It can take three values: "remove". The values describe the position assigned to NAs after sorting. #793
cut() can now accept argument bins=, that is a list or a tuple of frames containing edges of the binning intervals. #2819
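Binning by explicit edges can be modelled with the standard library. The sketch below is not datatable's implementation: `cut_by_edges` is a hypothetical helper, the edges are a plain sorted list rather than a frame, and the `right_closed` switch and the convention that values outside all intervals map to None are assumptions made for the illustration.

```python
import bisect


def cut_by_edges(values, edges, right_closed=True):
    """Map each value to the index of the interval it falls into.

    `edges` must be sorted; N edges define N - 1 intervals. Values that
    are None or lie outside every interval map to None.
    (Hypothetical helper modelling the semantics only.)
    """
    n_bins = len(edges) - 1
    result = []
    for x in values:
        if x is None:
            result.append(None)
            continue
        if right_closed:
            # Intervals of the form (edges[k], edges[k + 1]]
            k = bisect.bisect_left(edges, x) - 1
        else:
            # Intervals of the form [edges[k], edges[k + 1])
            k = bisect.bisect_right(edges, x) - 1
        result.append(k if 0 <= k < n_bins else None)
    return result


values = [0, 5, 10, 15, 20, 25, None]
bins_right = cut_by_edges(values, [0, 10, 20])
bins_left = cut_by_edges(values, [0, 10, 20], right_closed=False)
print(bins_right)
print(bins_left)
```

The two calls differ only in which end of each interval is inclusive, which shows up at the values that coincide with an edge.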
When a whole column is updated within a DT[i, j, by()] call, the stype/ltype of that column is now allowed to change. #2685
Fix a crash that occurred when using median() on virtual columns of type ArrayView64. #2802
Fix poor performance when selecting columns from a frame with a large number of columns (10k+). #2873
Numpy scalars can now be used in expressions. #3027
f-expressions now accept a list/tuple of column names/column positions/column types in the
dt.FExpr.len() has been deprecated and replaced with a function
dt.FExpr.re_match() has been deprecated and replaced with a function
Implemented a linear model with stochastic gradient descent learning. It supports binomial and multinomial regressions, as well as regression for continuous targets. #2871
FTRL now supports dt.Type.time64 feature types. #3007
Datatable no longer supports Python 3.5, because Python 3.5 itself has reached its end of life on 2020-09-13 and will no longer be supported. If you are still using Python 3.5, please consider upgrading. #2642
dt.open(), which was deprecated since version 0.10.0. #3018
Fixed a memory leak when creating a large number of datatable objects. #2701
Datatable can now be properly installed from a source distribution. #2846
This release was created with the help of 6 people who contributed code and documentation, and 17 more people who submitted bug reports and feature requests.
Code & documentation contributors: