# Row Functions¶

Functions `rowall()`, `rowany()`, `rowcount()`, `rowfirst()`, `rowlast()`, `rowmax()`, `rowmean()`, `rowmin()`, `rowsd()`, `rowsum()` are functions that aggregate across rows instead of columns and return a single column. These functions are equivalent to pandas aggregation functions with parameter `(axis=1)`.

These functions make it easy to compute rowwise aggregations – for instance, you may want the sum of columns `A`, `B`, `C` and `D`. You could say: `f.A + f.B + f.C + f.D`. Rowsum makes it easier – `dt.rowsum(f['A':'D'])`.

## rowall, rowany¶

These work only on boolean expressions – `rowall()` checks if all the values in the row are `True`, while `rowany()` checks if any value in the row is `True`. It is similar to pandas’ all or any with parameter `(axis=1)`. A single boolean column is returned:

```from datatable import dt, f, by df = dt.Frame({'A': [True, True], 'B': [True, False]}) df ```
AB
bool8bool8
011
110
```# rowall : df[:, dt.rowall(f[:])] ```
C0
bool8
01
10
```# rowany : df[:, dt.rowany(f[:])] ```
C0
bool8
01
11

The single boolean column that is returned can be very handy when filtering in the `i` section.

Filter for rows where at least one cell is greater than 0:

```df = dt.Frame({'a': [0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0], 'b': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'c': [0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0], 'd': [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0], 'e': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'f': [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]}) df ```
abcdef
int8int8int32int8int8int8
0000000
1000001
2000000
3000000
4000000
5005000
6100000
7000000
8000100
9100000
10000000
```df[dt.rowany(f[:] > 0), :] ```
abcdef
int8int8int32int8int8int8
0000001
1005000
2100000
3000100
4100000

Filter for rows where all the cells are 0:

```df[dt.rowall(f[:] == 0), :] ```
abcdef
int8int8int32int8int8int8
0000000
1000000
2000000
3000000
4000000
5000000

Filter for rows where all the columns’ values are the same:

```df = dt.Frame("""Name A1 A2 A3 A4 deff 0 0 0 0 def1 0 1 0 0 def2 0 0 0 0 def3 1 0 0 0 def4 0 0 0 0""") # compare the first integer column with the rest, # use rowall to find rows where all is True # and filter with the resulting boolean df[dt.rowall(f[1]==f[1:]), :] ```
NameA1A2A3A4
str32bool8bool8bool8bool8
0deff0000
1def20000
2def40000

Filter for rows where the values are increasing:

```df = dt.Frame({"A": [1, 2, 6, 4], "B": [2, 4, 5, 6], "C": [3, 5, 4, 7], "D": [4, -3, 3, 8], "E": [5, 1, 2, 9]}) df ```
ABCDE
int32int32int32int32int32
012345
1245-31
265432
346789
```df[dt.rowall(f[1:] >= f[:-1]), :] ```
ABCDE
int32int32int32int32int32
012345
146789

## rowfirst, rowlast¶

These look for the first and last non-missing value in a row respectively:

```df = dt.Frame({'A':[1, None, None, None], 'B':[None, 3, 4, None], 'C':[2, None, 5, None]}) df ```
ABC
int8int32int32
01NA2
1NA3NA
2NA45
3NANANA
```# rowfirst : df[:, dt.rowfirst(f[:])] ```
C0
int32
01
13
24
3NA
```# rowlast : df[:, dt.rowlast(f[:])] ```
C0
int32
02
13
25
3NA

Get rows where the last value in the row is greater than the first value in the row:

```df = dt.Frame({'a': [50, 40, 30, 20, 10], 'b': [60, 10, 40, 0, 5], 'c': [40, 30, 20, 30, 40]}) df ```
abc
int32int32int32
0506040
1401030
2304020
320030
410540
```df[dt.rowlast(f[:]) > dt.rowfirst(f[:]), :] ```
abc
int32int32int32
020030
110540

## rowmax, rowmin¶

These get the maximum and minimum values per row, respectively:

```df = dt.Frame({"C": [2, 5, 30, 20, 10], "D": [10, 8, 20, 20, 1]}) df ```
CD
int32int32
0210
158
23020
32020
4101
```# rowmax df[:, dt.rowmax(f[:])] ```
C0
int32
010
18
230
320
410
```# rowmin df[:, dt.rowmin(f[:])] ```
C0
int32
02
15
220
320
41

Find the difference between the maximum and minimum of each row:

```df = dt.Frame("""Value1 Value2 Value3 Value4 5 4 3 2 4 3 2 1 3 3 5 1""") df[:, dt.update(max_min = dt.rowmax(f[:]) - dt.rowmin(f[:]))] df ```
Value1Value2Value3Value4max_min
int32int32int32int32int32
054323
143213
233514

## rowsum, rowmean, rowcount, rowsd¶

`rowsum()` and `rowmean()` get the sum and mean of rows respectively; `rowcount()` counts the number of non-missing values in a row, while `rowsd()` aggregates a row to get the standard deviation.

Get the count, sum, mean and standard deviation for each row:

```df = dt.Frame("""ORD A B C D 198 23 45 NaN 12 138 25 NaN NaN 62 625 52 36 49 35 457 NaN NaN NaN 82 626 52 32 39 45""") df[:, dt.update(rowcount = dt.rowcount(f[:]), rowsum = dt.rowsum(f[:]), rowmean = dt.rowmean(f[:]), rowsd = dt.rowsd(f[:]) )] df ```
ORDABCDrowcountrowsumrowmeanrowsd
int32float64float64float64int32int32float64float64float64
01982345NA12427869.586.7583
113825NANA6232257557.6108
2625523649355797159.4260.389
3457NANANA822539269.5265.165
4626523239455794158.8261.277

Find rows where the number of nulls is greater than 3:

```df = dt.Frame({'city': ["city1", "city2", "city3", "city4"], 'state': ["state1", "state2", "state3", "state4"], '2005': [144, 205, 123, None], '2006': [173, 211, 123, 124], '2007': [None, None, None, None], '2008': [None, 206, None, None], '2009': [None, None, 124, 123], '2010': [128, 273, None, None]}) df ```
citystate200520062007200820092010
str32str32int32int32voidint32int32int32
0city1state1144173NANANA128
1city2state2205211NA206NA273
2city3state3123123NANA124NA
3city4state4NA124NANA123NA
```# get columns that are null, then sum on the rows # and finally filter where the sum is greater than 3 df[dt.rowsum(dt.isna(f[:])) > 3, :] ```
citystate200520062007200820092010
str32str32int32int32voidint32int32int32
0city4state4NA124NANA123NA

Rowwise sum of the float columns:

```df = dt.Frame("""ID W_1 W_2 W_3 1 0.1 0.2 0.3 1 0.2 0.4 0.5 2 0.3 0.3 0.2 2 0.1 0.3 0.4 2 0.2 0.0 0.5 1 0.5 0.3 0.2 1 0.4 0.2 0.1""") df[:, dt.update(sum_floats = dt.rowsum(f[float]))] df ```
IDW_1W_2W_3sum_floats
int32float64float64float64float64
010.10.20.30.6
110.20.40.51.1
220.30.30.20.8
320.10.30.40.8
420.200.50.7
510.50.30.21
610.40.20.10.7

## More Examples¶

Divide columns `A`, `B`, `C`, `D` by the `total` column, square it and sum rowwise:

```df = dt.Frame({'A': [2, 3], 'B': [1, 2], 'C': [0, 1], 'D': [1, 0], 'total': [4, 6]}) df ```
ABCDtotal
int32int32int8int8int32
021014
132106
```df[:, update(result = dt.rowsum((f[:-1]/f[-1])**2))] df ```
ABCDtotalresult
int32int32int8int8int32float64
0210140.375
1321060.388889

Get the row sum of the `COUNT` columns:

```df = dt.Frame("""USER OBSERVATION COUNT.1 COUNT.2 COUNT.3 A 1 0 1 1 A 2 1 1 2 A 3 3 0 0""") columns = [f[column] for column in df.names if column.startswith("COUNT")] df[:, update(total = dt.rowsum(columns))] df ```
USEROBSERVATIONCOUNT.1COUNT.2COUNT.3total
str32int32int32bool8int32int32
0A10112
1A21124
2A33003

Sum selected columns rowwise:

```df = dt.Frame({'location' : ("a","b","c","d"), 'v1' : (3,4,3,3), 'v2' : (4,56,3,88), 'v3' : (7,6,2,9), 'v4': (7,6,1,9), 'v5' : (4,4,7,9), 'v6' : (2,8,4,6)}) df ```
locationv1v2v3v4v5v6
str32int32int32int32int32int32int32
0a347742
1b4566648
2c332174
3d3889996
```df[:, {"x1": dt.rowsum(f[1:4]), "x2": dt.rowsum(f[4:])}] ```
x1x2
int32int32
01413
16618
2812
310024