Version 1.1.0

Previous release: Version 1.0.0

Frame

  • Parameter force=True in method .rbind() and function dt.rbind() will now allow combining columns of incompatible types. #3062

  • Frames with columns of type obj64 can now be saved into CSV. The values in the object column will be stringified upon saving. #3064

  • .replace() now supports numpy scalars. #3164

  • .to_numpy() now has an option to control memory layout of the resulting numpy array. #3275

  • Column types returned by the method .sum() are now consistent with those returned by the function dt.sum(): int64 for void, boolean and integer columns; float32 for float32 columns; float64 for float64 columns. #2904

  • .to_csv() now has an option sep to control the field separator. #3337

  • Void columns can now be used with dt.sort() and dt.by(). In addition, datatable will now skip sorting any column that it knows contains constant values. #3088 #3104 #3108 #3109

  • Saving a frame with a void column into Jay no longer leads to a crash. #3074 #3099 #3246

  • Joining with void columns now works correctly. #3094

  • dt.sum() now works correctly when called on a grouped column. #3110

  • Fixed dt.sum() behavior when called on iterables and frames. #3406

  • Fixed a crash which could have occurred when sorting very long identical or nearly identical strings. #3134

  • It is now possible to sort all columns according to boolean flags in the reverse list. #3168

  • Fixed support for the .max_column_width option when rendering frames in Jupyter notebooks. #3160

  • Fixed a crash which in rare situations happened in .to_csv() due to multithreading. #3176

  • Fixed a crash in .to_pandas() when called on keyed frames. #3224

  • Fixed .to_csv() to quote missing values when quoting="all" is specified. #3340

  • Fixed groupby behavior on columns that contain missing values. #3331

  • Fixed creating frames from numpy arrays that contain unicode strings. #3420

  • .to_numpy() will now create a correctly shaped array in the case of zero-column frames. #3427

  • When a zero-column frame is created from a list of tuples or dictionaries, the number of rows will be equal to the number of elements in that list. #3428

  • Converting a column of void type into pandas now produces a pandas object column filled with Nones. Converting such a column back into datatable produces a void column again. #3063

  • When creating a Frame from a list of values, a floating-point nan value will now be treated as None. In particular, nans can now be safely mixed with values of other types, and a list consisting of only nans will turn into a Column of type void. #3083

  • Converting string or object columns to numpy no longer produces a masked array. Instead, we create a regular object array, filled with Nones in place of NAs. Similarly, converting a string or object column to pandas creates a Series with None values (instead of nans as before) in place of NAs. #3083

FExpr

fread

  • When reading Excel files, datetime fields will now be converted into time64 columns in the resulting frame.

  • When reading Excel files, forward slashes, backslashes, or a mix of both are now supported as separators when specifying a subpath. #3221

  • fread() now supports reading from public S3 buckets when the source has the format s3://bucket-name/key-name. #3302

  • Header detection heuristics have been improved for the case when some of the column names are missing. #3363

  • Improved handling of very small and very large float values. #3447

  • fread() will no longer fail while reading mostly empty files. #3055

  • fread() will no longer fail when reading Excel files on Windows. #3178

  • Parameter tempdir is now honored for memory-limited fread() operations. #3244

  • Parameter sep= in fread() will no longer accept values '-', '+', or '.'. Previously, these values were allowed but they produced errors during parsing. #3065

Models

  • Fixed a bug in LinearModel that in some cases caused the gradient and model coefficients to blow up. #3234

  • Fixed undefined behavior when LinearModel predicted on frames with missing values. #3260

  • Fixed target column type detection in LinearModel. #3466

General

  • Datatable no longer supports Python 3.6, which reached its end of life on 2021-12-23. If you are still using Python 3.6, please consider upgrading. #3376

  • Datatable no longer supports Python 3.7, which reached its end of life on 2023-06-27. If you are still using Python 3.7, please consider upgrading. #3434

  • Added properties .is_array, .is_boolean, .is_categorical, .is_compound, .is_float, .is_integer, .is_numeric, .is_object, .is_string, .is_temporal, .is_void to class dt.Type. #3101 #3149

  • Added support for macOS Big Sur. #3175

  • Added support for Python 3.10. #3210

  • Added support for Python 3.11. #3374

  • datatable’s thread pool can now be used to parallelize external C++ applications, and it will have no datatable-specific dependencies when the code is built with the DT_DISABLE variable defined. #3306

  • Python built-in functions min() and max() will continue working for list comprehensions even after dt.min() and dt.max() have been imported from datatable. #3409