cudf 25.06.00 documentation

Nvtext APIs

group NVText

Contents:

  • Nvtext Ngrams
    • generate_ngrams()
    • generate_character_ngrams()
    • hash_character_ngrams()
    • ngrams_tokenize()
  • Nvtext Normalize
    • normalize_spaces()
    • normalize_characters()
    • create_character_normalizer()
    • normalize_characters()
    • nvtext::character_normalizer
  • Nvtext Stemmer
    • letter_type
    • is_letter()
    • is_letter()
    • porter_stemmer_measure()
  • Nvtext Edit Distance
    • edit_distance()
    • edit_distance_matrix()
  • Nvtext Tokenize
    • load_merge_pairs()
    • byte_pair_encoding()
    • load_vocabulary_file()
    • subword_tokenize()
    • tokenize()
    • tokenize()
    • count_tokens()
    • count_tokens()
    • character_tokenize()
    • detokenize()
    • load_vocabulary()
    • tokenize_with_vocabulary()
    • load_wordpiece_vocabulary()
    • wordpiece_tokenize()
    • nvtext::bpe_merge_pairs
    • nvtext::hashed_vocabulary
    • nvtext::tokenizer_result
    • nvtext::tokenize_vocabulary
    • nvtext::wordpiece_vocabulary
  • Nvtext Replace
    • replace_tokens()
    • filter_tokens()
  • Nvtext Minhash
    • minhash()
    • minhash64()
    • minhash_ngrams()
    • minhash64_ngrams()
  • Nvtext Jaccard
    • jaccard_index()

© Copyright 2018-2025, NVIDIA Corporation.