Hash join that builds a hash table on construction and probes it in subsequent *_join
member functions.
#include <hash_join.hpp>
Public Types
using impl_type = typename cudf::detail::hash_join<cudf::hashing::detail::MurmurHash3_x86_32<cudf::hash_value_type>>
Implementation type.
This class enables a hash join scheme that builds the hash table once and probes it as many times as needed (possibly in parallel).
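The build-once, probe-many scheme can be illustrated with a minimal host-side sketch in standard C++. The class name `host_hash_join`, the use of `std::unordered_multimap`, and the plain `int` keys are assumptions made for illustration only; this is not the libcudf implementation, which runs on the GPU and operates on `cudf::table_view` inputs.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical host-side analogue of the build/probe split; not libcudf code.
class host_hash_join {
 public:
  // Build phase: index every build-side key once, at construction.
  explicit host_hash_join(std::vector<int> const& build)
  {
    for (std::size_t row = 0; row < build.size(); ++row) {
      table_.emplace(build[row], row);
    }
  }

  // Probe phase: const, so the same object can be probed repeatedly.
  // Returns one (probe_row, build_row) pair per matching key.
  std::vector<std::pair<std::size_t, std::size_t>> inner_join(
    std::vector<int> const& probe) const
  {
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t row = 0; row < probe.size(); ++row) {
      auto const range = table_.equal_range(probe[row]);
      for (auto it = range.first; it != range.second; ++it) {
        matches.emplace_back(row, it->second);
      }
    }
    return matches;
  }

 private:
  std::unordered_multimap<int, std::size_t> table_;
};
```

Because the probe method only reads the table, one object can serve any number of probe inputs without rebuilding.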
Definition at line 74 of file hash_join.hpp.
cudf::hash_join::hash_join(cudf::table_view const& build,
                           null_equality compare_nulls,
                           rmm::cuda_stream_view stream = cudf::get_default_stream())
Construct a hash join object for subsequent probe calls.
Note: The hash_join object must not outlive the table viewed by build, else behavior is undefined.
Parameters:
build | The build table, from which the hash table is built
compare_nulls | Controls whether null join-key values should match or not
stream | CUDA stream used for device memory operations and kernel launches
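The effect of compare_nulls on key matching can be sketched as follows. The enum `null_policy` and the function `keys_match` are hypothetical stand-ins written in standard C++, with `std::optional<int>` modelling a nullable key; they are assumptions for illustration, not libcudf types.

```cpp
#include <optional>

// Hypothetical stand-in for the compare_nulls argument; not the libcudf enum.
enum class null_policy { EQUAL, UNEQUAL };

// Returns true when two possibly-null join keys should be treated as a match.
bool keys_match(std::optional<int> const& lhs,
                std::optional<int> const& rhs,
                null_policy policy)
{
  if (lhs.has_value() && rhs.has_value()) { return *lhs == *rhs; }
  if (!lhs.has_value() && !rhs.has_value()) { return policy == null_policy::EQUAL; }
  return false;  // a null key never matches a non-null key
}
```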
cudf::hash_join::hash_join(cudf::table_view const& build,
                           nullable_join has_nulls,
                           null_equality compare_nulls,
                           rmm::cuda_stream_view stream = cudf::get_default_stream())
Construct a hash join object for subsequent probe calls.
Note: The hash_join object must not outlive the table viewed by build, else behavior is undefined.
Parameters:
build | The build table, from which the hash table is built
has_nulls | Flag indicating whether there are any nulls in the build table or in any probe table that will later be used for a join
compare_nulls | Controls whether null join-key values should match or not
stream | CUDA stream used for device memory operations and kernel launches
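The interaction between has_nulls and the probe-time error documented for the *_join functions can be sketched on the host. The class `null_checked_join`, its `validate_probe` method, and the use of `std::logic_error` are assumptions for illustration; the sketch only mirrors the documented rule that probing with nulls fails when the object was not constructed to expect them.

```cpp
#include <algorithm>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical host-side stand-in; not the libcudf implementation.
class null_checked_join {
 public:
  explicit null_checked_join(bool has_nulls) : has_nulls_{has_nulls} {}

  // Rejects a probe input containing nulls unless the object was built expecting them.
  void validate_probe(std::vector<std::optional<int>> const& probe) const
  {
    bool const probe_has_nulls =
      std::any_of(probe.begin(), probe.end(), [](std::optional<int> const& key) {
        return !key.has_value();
      });
    if (probe_has_nulls && !has_nulls_) {
      throw std::logic_error("probe has nulls but the join was built without null support");
    }
  }

 private:
  bool has_nulls_;
};
```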
std::pair<std::unique_ptr<rmm::device_uvector<size_type>>, std::unique_ptr<rmm::device_uvector<size_type>>>
cudf::hash_join::full_join(cudf::table_view const& probe,
                           std::optional<std::size_t> output_size = {},
                           rmm::cuda_stream_view stream = cudf::get_default_stream(),
                           rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()) const
Returns the row indices that can be used to construct the result of performing a full join between two tables.
Note: Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters:
probe | The probe table, from which the tuples are probed
output_size | Optional value which allows users to specify the exact output size
stream | CUDA stream used for device memory operations and kernel launches
mr | Device memory resource used to allocate the returned table and columns' device memory.
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
Returns: A pair [left_indices, right_indices] that can be used to construct the result of performing a full join between two tables with build and probe as the join keys.

std::size_t cudf::hash_join::full_join_size(cudf::table_view const& probe,
                                            rmm::cuda_stream_view stream = cudf::get_default_stream(),
                                            rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()) const
Returns the exact number of matches (rows) when performing a full join with the specified probe table.
Parameters:
probe | The probe table, from which the tuples are probed
stream | CUDA stream used for device memory operations and kernel launches
mr | Device memory resource used to allocate the intermediate table and columns' device memory.
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
Returns: The exact number of output rows when performing a full join between two tables with build and probe as the join keys.

std::pair<std::unique_ptr<rmm::device_uvector<size_type>>, std::unique_ptr<rmm::device_uvector<size_type>>>
cudf::hash_join::inner_join(cudf::table_view const& probe,
                            std::optional<std::size_t> output_size = {},
                            rmm::cuda_stream_view stream = cudf::get_default_stream(),
                            rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()) const
Returns the row indices that can be used to construct the result of performing an inner join between two tables.
Note: Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters:
probe | The probe table, from which the tuples are probed
output_size | Optional value which allows users to specify the exact output size
stream | CUDA stream used for device memory operations and kernel launches
mr | Device memory resource used to allocate the returned table and columns' device memory.
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
Returns: A pair [left_indices, right_indices] that can be used to construct the result of performing an inner join between two tables with build and probe as the join keys.

std::size_t cudf::hash_join::inner_join_size(cudf::table_view const& probe,
                                             rmm::cuda_stream_view stream = cudf::get_default_stream()) const
Returns the exact number of matches (rows) when performing an inner join with the specified probe table.
Parameters:
probe | The probe table, from which the tuples are probed
stream | CUDA stream used for device memory operations and kernel launches
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
Returns: The exact number of output rows when performing an inner join between two tables with build and probe as the join keys.

std::pair<std::unique_ptr<rmm::device_uvector<size_type>>, std::unique_ptr<rmm::device_uvector<size_type>>>
cudf::hash_join::left_join(cudf::table_view const& probe,
                           std::optional<std::size_t> output_size = {},
                           rmm::cuda_stream_view stream = cudf::get_default_stream(),
                           rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()) const
Returns the row indices that can be used to construct the result of performing a left join between two tables.
Note: Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters:
probe | The probe table, from which the tuples are probed
output_size | Optional value which allows users to specify the exact output size
stream | CUDA stream used for device memory operations and kernel launches
mr | Device memory resource used to allocate the returned table and columns' device memory.
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
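To make the shape of a left-join index pair concrete, here is a minimal host-side sketch in standard C++. The function `left_join_indices`, the `no_match` sentinel and its value, and the choice to hash the right-hand keys are all assumptions for illustration; this section does not state which of build and probe plays the left role, nor which value libcudf uses to mark an unmatched row.

```cpp
#include <cstddef>
#include <limits>
#include <unordered_map>
#include <utility>
#include <vector>

using index_vector = std::vector<std::ptrdiff_t>;

// Hypothetical sentinel marking "no matching row" on the right side.
constexpr std::ptrdiff_t no_match = std::numeric_limits<std::ptrdiff_t>::min();

// Left join sketch: every left row is kept; unmatched rows pair with `no_match`.
std::pair<index_vector, index_vector> left_join_indices(std::vector<int> const& left,
                                                        std::vector<int> const& right)
{
  std::unordered_multimap<int, std::ptrdiff_t> table;
  for (std::size_t row = 0; row < right.size(); ++row) {
    table.emplace(right[row], static_cast<std::ptrdiff_t>(row));
  }

  index_vector left_indices;
  index_vector right_indices;
  for (std::size_t row = 0; row < left.size(); ++row) {
    auto const left_row = static_cast<std::ptrdiff_t>(row);
    auto const range    = table.equal_range(left[row]);
    if (range.first == range.second) {
      // No right row shares this key: emit the left row once with the sentinel.
      left_indices.push_back(left_row);
      right_indices.push_back(no_match);
      continue;
    }
    for (auto it = range.first; it != range.second; ++it) {
      left_indices.push_back(left_row);
      right_indices.push_back(it->second);
    }
  }
  return {left_indices, right_indices};
}
```

The two vectors always have equal length, and position i in each names one output row of the join.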
Returns: A pair [left_indices, right_indices] that can be used to construct the result of performing a left join between two tables with build and probe as the join keys.

std::size_t cudf::hash_join::left_join_size(cudf::table_view const& probe,
                                            rmm::cuda_stream_view stream = cudf::get_default_stream()) const
Returns the exact number of matches (rows) when performing a left join with the specified probe table.
Parameters:
probe | The probe table, from which the tuples are probed
stream | CUDA stream used for device memory operations and kernel launches
Exceptions:
cudf::logic_error | If the input probe table has nulls while this hash_join object was not constructed with null check.
Returns: The exact number of output rows when performing a left join between two tables with build and probe as the join keys.
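The *_join_size functions pair naturally with the optional output_size parameter of the *_join functions: compute the size once, then pass it to the join so the output can be sized without a second count. The following host-side sketch in standard C++ illustrates that two-pass pattern. The free functions, the `key_table` alias, and the `int` keys are assumptions for illustration, not libcudf code, and the sketch returns only the build-side row indices for brevity.

```cpp
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical prebuilt table: key -> build row index.
using key_table = std::unordered_multimap<int, std::size_t>;

// Counting pass: number of matches an inner join would emit for this probe input.
std::size_t inner_join_size(key_table const& table, std::vector<int> const& probe)
{
  std::size_t total = 0;
  for (int key : probe) { total += table.count(key); }
  return total;
}

// Output pass: uses the caller-supplied size when given, otherwise counts first.
std::vector<std::size_t> inner_join_build_rows(key_table const& table,
                                               std::vector<int> const& probe,
                                               std::optional<std::size_t> output_size = {})
{
  std::vector<std::size_t> build_rows;
  build_rows.reserve(output_size ? *output_size : inner_join_size(table, probe));
  for (int key : probe) {
    auto const range = table.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
      build_rows.push_back(it->second);
    }
  }
  return build_rows;
}
```

In this host sketch an undersized output_size only costs a reallocation, whereas the documentation above states that for hash_join an undersized value leads to undefined behavior.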