Column
- class pylibcudf.column.Column(DataType data_type, size_type size, gpumemoryview data, gpumemoryview mask, size_type null_count, size_type offset, list children)
A container of nullable device data as a column of elements.
This class implements the Arrow columnar data specification for data stored on GPUs. It relies on Python memoryview-like semantics to maintain shared ownership of the data it is constructed with, so any input data may also be co-owned by other data structures. The Column is designed to be operated on using algorithms backed by libcudf.
- Parameters:
- data_type : DataType
The type of data in the column.
- size : size_type
The number of rows in the column.
- data : gpumemoryview
The data the column will refer to.
- mask : gpumemoryview
The null mask for the column.
- null_count : int
The number of null rows in the column.
- offset : int
The offset into the data buffer where the column’s data begins.
- children : list
The children of this column if it is a compound column type.
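The shared-ownership semantics and the role of offset and size can be illustrated on the host with Python's built-in memoryview. This is only an analogy, not pylibcudf code: it assumes that offset counts elements rather than bytes (as in the Arrow columnar layout), and the names buffer and window are hypothetical.

```python
from array import array

# Host-side stand-in for a device data buffer (hypothetical values).
buffer = array("i", [10, 20, 30, 40, 50])

# Assumption: `offset` and `size` count elements, as in the Arrow layout.
offset, size = 1, 3

# Slicing a memoryview creates a window onto the same memory; nothing is copied.
window = memoryview(buffer)[offset : offset + size]
before = window.tolist()

# Because the memory is shared, a write through the owner is visible in the window.
buffer[2] = 99
after = window.tolist()

print(before, after)
```

The same reasoning explains why data passed to a Column may still be modified or co-owned by whatever produced it.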
Methods
- all_null_like(Column like, size_type size): Create an all null column from a template.
- child(self, size_type index): Get a child column of this column.
- children(self): The children of the column.
- copy(self): Create a copy of the column.
- data(self): The data buffer of the column.
- from_array(cls, obj): Create a Column from any object which supports the NumPy or CUDA array interface.
- from_array_interface(cls, obj): Create a Column from an object implementing the NumPy Array Interface.
- from_cuda_array_interface(cls, obj): Create a Column from an object implementing the CUDA Array Interface.
- from_scalar(Scalar slr, size_type size): Create a Column from a Scalar.
- list_view(self): Accessor for methods of a Column that are specific to lists.
- null_count(self): The number of null elements in the column.
- null_mask(self): The null mask of the column.
- num_children(self): The number of children of this column.
- offset(self): The offset of the column.
- size(self): The number of elements in the column.
- type(self): The type of data in the column.
- with_mask(self, gpumemoryview mask, ...): Augment this column with a new null mask.
- static all_null_like(Column like, size_type size)
Create an all null column from a template.
- Parameters:
- like : Column
Column whose type we should mimic.
- size : int
Number of rows in the resulting column.
- Returns:
- Column
An all-null column of size rows and type matching like.
- child(self, size_type index) → Column
Get a child column of this column.
- Parameters:
- index : size_type
The index of the child column to get.
- Returns:
- Column
The child column.
- data(self) → gpumemoryview
The data buffer of the column.
- classmethod from_array(cls, obj)
Create a Column from any object which supports the NumPy or CUDA array interface.
- Parameters:
- obj : object
The input array to be converted into a pylibcudf.Column.
- Returns:
- Column
- Raises:
- TypeError
If the input does not implement a supported array interface.
- ImportError
If NumPy is not installed.
Notes
Only 1D and 2D C-contiguous device arrays are supported. The data is shared with the input, not copied.
Support for numpy.ndarray is not yet implemented.
Examples
>>> import pylibcudf as plc
>>> import cupy as cp
>>> cp_arr = cp.array([[1, 2], [3, 4]])
>>> col = plc.Column.from_array(cp_arr)
- classmethod from_array_interface(cls, obj)
Create a Column from an object implementing the NumPy Array Interface.
- Parameters:
- obj : object
Must implement the __array_interface__ protocol.
- Raises:
- NotImplementedError
This method is not yet implemented.
- classmethod from_cuda_array_interface(cls, obj)
Create a Column from an object implementing the CUDA Array Interface.
- Parameters:
- obj : object
Must implement the __cuda_array_interface__ protocol.
- Returns:
- Column
A Column containing the data from the CUDA array interface.
- Raises:
- TypeError
If the object does not support __cuda_array_interface__.
- ValueError
If the object is not 1D or 2D, is not C-contiguous, or has more rows than the size_type limit allows.
- NotImplementedError
If the object has a mask.
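The documented error conditions can be read as a sequence of checks on the interface dictionary. The sketch below is a hypothetical host-side pre-check, not the library's implementation: check_cuda_array_interface and FakeDeviceArray are invented names, the order of the checks is a guess, and it assumes size_type is a signed 32-bit integer and that typestr describes a fixed-width numeric type. The dictionary keys (shape, typestr, strides, mask) are those defined by the CUDA Array Interface.

```python
SIZE_TYPE_MAX = 2**31 - 1  # assumption: size_type is a signed 32-bit integer


def check_cuda_array_interface(obj):
    """Hypothetical pre-check mirroring the documented Raises section."""
    try:
        iface = obj.__cuda_array_interface__
    except AttributeError:
        raise TypeError("object does not support __cuda_array_interface__") from None

    if iface.get("mask") is not None:
        raise NotImplementedError("objects with a mask are not supported")

    shape = iface["shape"]
    if len(shape) not in (1, 2):
        raise ValueError("only 1D and 2D arrays are supported")
    if shape[0] > SIZE_TYPE_MAX:
        raise ValueError("number of rows exceeds the size_type limit")

    # In the CUDA Array Interface, strides of None means C-contiguous.
    strides = iface.get("strides")
    if strides is not None:
        expected = int(iface["typestr"][2:])  # item size in bytes, e.g. "<i4" -> 4
        for dim, stride in zip(reversed(shape), reversed(strides)):
            if dim > 1 and stride != expected:
                raise ValueError("array is not C-contiguous")
            expected *= dim
    return shape


class FakeDeviceArray:
    """Stand-in exposing only the interface dictionary (no GPU memory)."""

    def __init__(self, shape, strides=None, mask=None):
        self.__cuda_array_interface__ = {
            "shape": shape, "typestr": "<i4", "data": (0, False),
            "version": 3, "strides": strides, "mask": mask,
        }


print(check_cuda_array_interface(FakeDeviceArray((2, 2))))
```

Running such a check before calling from_cuda_array_interface is optional; it only makes the failure modes listed above easier to reason about.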
- static from_scalar(Scalar slr, size_type size)
Create a Column from a Scalar.
- Parameters:
- slr : Scalar
The scalar to create a column from.
- size : size_type
The number of elements in the column.
- Returns:
- Column
A Column containing the scalar repeated size times.
- list_view(self) → ListColumnView
Accessor for methods of a Column that are specific to lists.
- null_count(self) → size_type
The number of null elements in the column.
- null_mask(self) → gpumemoryview
The null mask of the column.
- num_children(self) → size_type
The number of children of this column.
- offset(self) → size_type
The offset of the column.
- size(self) → size_type
The number of elements in the column.
- with_mask(self, gpumemoryview mask, size_type null_count) → Column
Augment this column with a new null mask.
- Parameters:
- mask : gpumemoryview
New mask (or None to unset the mask).
- null_count : int
New null count. It must equal the number of nulls described by mask; an incorrect value can cause wrong results in later operations.
- Returns:
- Column
New Column object sharing data with self (except for the mask, which is new).
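Since null_count has to agree with the mask, it helps to see how a count follows from a validity bitmap. The sketch below works on host bytes and assumes the Arrow convention (a set bit marks a valid row, least-significant bit first within each byte); count_nulls is a hypothetical helper, and real pylibcudf masks live in device memory.

```python
def count_nulls(mask: bytes, size: int) -> int:
    """Count unset validity bits among the first `size` bits of `mask`."""
    valid = 0
    for row in range(size):
        byte, bit = divmod(row, 8)
        valid += (mask[byte] >> bit) & 1  # least-significant bit first
    return size - valid


# Rows 0, 2 and 3 are valid; row 1 is null.
mask = bytes([0b00001101])
null_count = count_nulls(mask, 4)
print(null_count)
```

Note that bits beyond size are ignored, so padding bits in the last byte do not affect the count.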
- class pylibcudf.column.ListColumnView(Column col)
Accessor for methods of a Column that are specific to lists.
Methods
- child(self): The data column of the underlying list column.
- offsets(self): The offsets column of the underlying list column.
- child(self)
The data column of the underlying list column.
- offsets(self)
The offsets column of the underlying list column.
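How the offsets and data (child) columns together describe list rows can be shown with plain Python lists. This assumes the Arrow list layout, in which row i spans child[offsets[i]:offsets[i + 1]]; the values are made up and nothing here calls pylibcudf.

```python
# Hypothetical host-side copies of the two columns of a list column.
offsets = [0, 2, 2, 5]    # one more entry than there are rows
child = [1, 2, 3, 4, 5]   # flattened elements of all rows

# Each pair of consecutive offsets delimits one row of the list column.
rows = [child[start:stop] for start, stop in zip(offsets, offsets[1:])]
print(rows)
```

Two equal consecutive offsets, as for the second row here, denote an empty list.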
- pylibcudf.column.is_c_contiguous(shape: Sequence[int], strides: None | Sequence[int], itemsize: int) → bool
Determine if shape and strides are C-contiguous.
- Parameters:
- shape : Sequence[int]
Number of elements in each dimension.
- strides : None | Sequence[int]
The stride of each dimension in bytes. If None, the memory layout is C-contiguous.
- itemsize : int
Size of an element in bytes.
- Returns:
- bool
True if the layout described by shape and strides is C-contiguous, False otherwise.