Arrow Compute Functions
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> a = pa.array([5, 3, 4, 1, 2])
>>> sorted_indices = pc.sort_indices(a)
>>> pc.take(a, sorted_indices)
<pyarrow.lib.Int64Array object at 0x127a5d1e0>
[
1,
2,
3,
4,
5
]
>>> pc.sum(a)
<pyarrow.Int64Scalar: 15>
>>> user_history_ids = pa.array([999, 777, 555])
>>> user_history_watch_time = pa.array([30, 50, 10])
>>> request_items = pa.array([111, 222, 777, 888, 999])
>>> join_index = pc.index_in(request_items, user_history_ids)
>>> request_items_user_watch_time = pc.take(user_history_watch_time, join_index)
>>> request_items_user_watch_time
<pyarrow.lib.Int64Array object at 0x127a5d4e0>
[
null,
null,
50,
null,
30
]
Common complex operations, e.g. sort, aggregate, join…