datatable.symdiff()

Find the symmetric difference between the sets of values in all frames.

Each frame should have only a single column or be empty. The values in each frame will be treated as a set, and this function will perform the symmetric difference operation on these sets.

The symmetric difference of two frames is the set of values that are present in either of the frames, but not in both. The symmetric difference of more than two frames is the set of values that are present in an odd number of frames.
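The "odd number of frames" rule can be illustrated with plain Python sets. The sketch below is only a model of the semantics described above, not datatable's implementation; the helper name `symdiff_values` is hypothetical and plain lists stand in for single-column frames.

```python
from collections import Counter


def symdiff_values(*columns):
    """Hypothetical helper: return the values found in an odd number of inputs."""
    counts = Counter()
    for column in columns:
        # Each input is treated as a set, so a value repeated
        # within one input is counted only once for that input.
        counts.update(set(column))
    return {value for value, n in counts.items() if n % 2 == 1}


# Two inputs: values present in exactly one of them.
two = symdiff_values([1, 2, 2], [2, 3])

# Three inputs: 2 occurs in all three (an odd count), so it is kept.
three = symdiff_values([1, 2], [2, 3], [2, 4])
```

Here `two` is `{1, 3}` and `three` is `{1, 2, 3, 4}`. Note that a value shared by all three inputs survives, which is what distinguishes this operation from "values unique to one frame".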

Parameters

*frames
Frame | Frame | ...

Input single-column frames.

return
Frame

A single-column frame. The column stype is the smallest common stype of columns from the frames.

except
ValueError | NotImplementedError

dt.exceptions.ValueError

raised when one of the input frames has more than one column.

dt.exceptions.NotImplementedError

raised when one of the columns has stype obj64.

Examples

from datatable import dt
df = dt.Frame({'A': [1, 1, 2, 1, 2],
               'B': [None, 2, 3, 4, 5],
               'C': [1, 2, 1, 1, 2]})
df

   |     A      B      C
   | int32  int32  int32
-- + -----  -----  -----
 0 |     1     NA      1
 1 |     1      2      2
 2 |     2      3      1
 3 |     1      4      1
 4 |     2      5      2

Symmetric difference of all the columns in the frame. Note that each column is treated as a separate frame:

dt.symdiff(*df)

   |     A
   | int32
-- + -----
 0 |    NA
 1 |     2
 2 |     3
 3 |     4
 4 |     5

Symmetric difference between two frames:

dt.symdiff(df["A"], df["B"])

   |     A
   | int32
-- + -----
 0 |    NA
 1 |     1
 2 |     3
 3 |     4
 4 |     5
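
Both results can be cross-checked without datatable by folding Python's built-in set operator `^` over the column values. This is an independent sanity check under the assumption that `None` stands in for NA; it says nothing about the row order or stype of the frame that `symdiff()` returns.

```python
from functools import reduce

# Same data as the example frame; None plays the role of NA.
columns = {
    "A": [1, 1, 2, 1, 2],
    "B": [None, 2, 3, 4, 5],
    "C": [1, 2, 1, 1, 2],
}
sets = [set(values) for values in columns.values()]

# Chaining the pairwise symmetric difference keeps exactly the values
# that occur in an odd number of the sets.
all_three = reduce(lambda left, right: left ^ right, sets)

# Two-frame case: columns A and B only.
a_vs_b = sets[0] ^ sets[1]
```

`all_three` contains `None`, 2, 3, 4 and 5, and `a_vs_b` contains `None`, 1, 3, 4 and 5, matching the values in the two frames shown above.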

See Also

  • intersect() – calculate the set intersection of values in the frames.

  • setdiff() – calculate the set difference between the frames.

  • union() – calculate the union of values in the frames.

  • unique() – find unique values in a frame.