Supercharging Machine Learning in Snowflake with NVIDIA CUDA-X Libraries for Scikit-learn and Pandas

Whether it’s predicting customer churn, detecting anomalies in transaction data or exploring clustering patterns in AI embeddings, companies are bringing generative AI and ML models to bear on data sets larger than ever before. As data sets grow, GPU acceleration is becoming critical, as waiting hours or days for ML algorithms to finish running can significantly reduce productivity and increase costs. 

To help customers meet the rising demands of larger data sets, Snowflake ML has invested heavily in GPU-enabled workflows over the past few years. Today, we are thrilled to announce that Snowflake ML now comes preinstalled with NVIDIA’s cuML and cuDF libraries to accelerate popular ML algorithms with GPUs. With this native integration, Snowflake customers can easily accelerate model development cycles for scikit-learn, pandas, UMAP and HDBSCAN — no code changes required. NVIDIA’s benchmark runs show speedups of 5x for Random Forest and up to 200x for HDBSCAN on NVIDIA A10 GPUs compared to CPUs.

In this blog post, we’ll walk through examples for topic modeling and genomics to illustrate how these newly integrated libraries make exploring large data sets with state-of-the-art dimensionality reduction and clustering techniques fast and seamless in Snowflake ML.

NVIDIA CUDA-X libraries for data science

As data sets reach millions of rows and include hundreds to thousands of dimensions, alternatives to traditional CPU-only processing tools become a necessity. The cuML and cuDF libraries are part of the NVIDIA CUDA-X Data Science (CUDA-X DS) ecosystem, an open source suite of GPU-accelerated libraries designed to supercharge data processing pipelines. GPUs provide parallel processing power for faster, more scalable and efficient data workflows.

Figure 1: CUDA-X Data Science is a collection of open source libraries that accelerate popular data science libraries and platforms.


CUDA-X DS libraries combine the power of GPUs with familiar Python APIs for data analytics, machine learning and graph analytics — delivering major speedups without requiring teams to rewrite their code. With CUDA-X DS, you can GPU-accelerate model training and iterative optimization cycles, processing data sets with hundreds of millions of rows on a single GPU. On an A10 GPU, cuML can accelerate machine learning algorithms such as t-SNE by up to 72x, UMAP by up to 25x and HDBSCAN by up to 200x on wide data sets compared to CPU-only computing, cutting processing times from days down to just minutes.

Getting started with GPU-accelerated model development in Snowflake ML

Figure 2: Snowflake ML includes a robust set of model development, operations and inference capabilities right on the same platform as your governed data. 


Snowflake ML is a set of end-to-end ML development and inference capabilities integrated directly with the data on a single platform. The integration with NVIDIA’s cuML and cuDF libraries is accessible through Container Runtime, a prebuilt environment for large-scale machine learning development. To accelerate ML libraries such as scikit-learn and pandas on fully managed GPUs, Snowflake customers can execute their data science scripts in Container Runtime through Snowflake Notebooks, or via remote pushdown from any IDE (function or file dispatch) with ML Jobs. This provides several benefits:

  • Simplified developer experience: With a GPU-specific runtime image, you already have access to the latest and most popular libraries and frameworks (PyTorch, XGBoost, LightGBM, scikit-learn and many more) that support ML development. In the latest update, cuML and cuDF have been fully integrated into the default GPU environment so you can gain access to acceleration for pandas, scikit-learn, UMAP and HDBSCAN directly. 

  • Easy access to GPU instances: With a simple notebook or remote execution configuration, you can select an instance from a compute pool appropriate for the workload. With a selection of GPU nodes, you have access to one or more GPUs in a single node as well as different types of GPUs to suit the complexity of your use case.

Figure 3: Easily accelerate popular ML algorithms with GPUs from Snowflake Notebooks through direct integration with NVIDIA’s CUDA-X libraries. 


Snowflake’s integration of NVIDIA’s libraries offers a powerful solution for workloads whose large data sets demand GPU acceleration, such as topic modeling and genomics use cases.

Making topic modeling possible at scale

When tackling large-scale text analysis like topic modeling, computational efficiency quickly becomes a critical factor. The iterative and exploratory nature of many data science workflows makes the need for higher performance even more pressing. Waiting hours for every iteration isn’t feasible.

Snowflake’s integration with NVIDIA CUDA-X libraries can bring significant speedups to data science and machine learning tasks with few or no changes to your existing CPU-based Python code. Tasks such as transforming hundreds of thousands to millions of product reviews from raw text into well-defined topic clusters can take just minutes on the GPU. 

This quickstart demonstrates how accelerated computing in Snowflake makes topic modeling possible with BERTopic (a popular topic modeling library) on 500,000 book reviews in just a few minutes on the GPU_NV_S instance, rather than in more than eight hours on the CPU_X64_L instance.

The BERTopic-based topic modeling workflow generally follows these steps:

  1. Read data: Read text data into memory using a library such as pandas

  2. Generate embeddings: Convert the raw text into numerical representations (embeddings) using the SentenceTransformers library

  3. Reduce dimensionality: Condense high-dimensional embeddings into a lower-dimensional space while retaining crucial information using the umap-learn library

  4. Cluster: Cluster the dimensionality-reduced embeddings to identify core topics using the HDBSCAN library
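Taken together, the four steps form a short pipeline. The sketch below is illustrative only; the file name, column name, model choice and parameter values are assumptions, not taken from the quickstart:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer
import umap
import hdbscan

# 1. Read data (hypothetical file and column name)
reviews = pd.read_parquet("book_reviews.parquet")["review_text"].tolist()

# 2. Generate embeddings from the raw text
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(reviews, show_progress_bar=True)

# 3. Reduce the high-dimensional embeddings to a lower-dimensional space
reduced = umap.UMAP(n_neighbors=15, n_components=5,
                    metric="cosine").fit_transform(embeddings)

# 4. Cluster the reduced embeddings; a label of -1 marks noise points
labels = hdbscan.HDBSCAN(min_cluster_size=50).fit_predict(reduced)
```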

You can now accelerate all four of these steps in your notebook. SentenceTransformers will automatically use CUDA-enabled PyTorch. And to accelerate your pandas, umap-learn and HDBSCAN code with zero changes, just import cuML and cuDF and “flip the switch”:
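In a Snowflake Notebook cell, the switch is a pair of extension loads; these are the standard cudf.pandas and cuml.accel entry points, while the data file below is a hypothetical placeholder:

```python
# Load the accelerators BEFORE importing the libraries they speed up,
# so those imports are transparently redirected to GPU implementations.
%load_ext cudf.pandas   # pandas operations run on cuDF where possible
%load_ext cuml.accel    # scikit-learn, umap-learn and hdbscan run on cuML

import pandas as pd
import umap
import hdbscan

# Existing CPU code runs unchanged from here on:
df = pd.read_parquet("book_reviews.parquet")   # hypothetical file
```

Operations without a GPU implementation fall back to the CPU path automatically, which is what makes the zero-code-change claim hold.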

With that, the entire topic modeling workflow is accelerated — no more waiting hours for your notebook to finish running.

Enabling innovation with accelerated genomics workflows

Accelerated computing is transforming healthcare and digital biology, making it possible for scientists and researchers to tap into the increasingly large data sets generated by next-generation medical devices and to operationalize AI to solve complex problems. 

Snowflake users can now leverage the zero-code-change capabilities of NVIDIA CUDA-X DS libraries such as cuDF and cuML to accelerate the analysis of DNA sequence data. For processing high-dimensional biological sequences, cuML and cuDF provide significant acceleration: 

  • Faster sequence analysis: By converting raw DNA sequences into feature vectors, researchers can perform classification tasks (such as predicting gene families) at scale. 

  • Seamless workflow integration: Executing pandas and scikit-learn code directly on GPUs dramatically speeds up data loading, preprocessing and ensemble model training. 

  • Zero-code-change acceleration: Existing workflows gain GPU acceleration without modification, allowing researchers to focus on biological insights and model design rather than low-level GPU programming.
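As a concrete illustration of the feature-vector idea above, DNA sequences are commonly featurized by counting k-mers (overlapping substrings of length k). This pure-Python sketch is independent of any particular library:

```python
from collections import Counter

def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in a DNA sequence, a common first
    step before feeding sequences to a gene-family classifier."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

counts = kmer_counts("ATGCGATG", k=3)
# "ATG" occurs twice (positions 0 and 5); the four other 3-mers occur once each
```

The resulting counts can be assembled into a feature matrix (one column per observed k-mer) and passed to any classifier.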

In the quickstart, we demonstrate training a machine learning model to predict the gene family of a DNA sequence using scikit-learn and XGBoost. To GPU-accelerate our training, we just need to load the cuML accelerator (for scikit-learn) and configure our XGBoost model with device="cuda".
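In script form (outside a notebook), those two changes can be sketched as follows; the estimator parameters are illustrative, not taken from the quickstart:

```python
# Install the cuML accelerator BEFORE importing scikit-learn so that
# supported estimators are dispatched to their GPU implementations.
import cuml.accel
cuml.accel.install()

from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# scikit-learn code is unchanged; it now runs on the GPU via cuML
rf = RandomForestClassifier(n_estimators=100)

# XGBoost needs only its device parameter set to use the GPU
xgb = XGBClassifier(device="cuda", tree_method="hist")
```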

As every data scientist knows, the first model you train is rarely the best. Unfortunately, robust feature engineering and model tuning can take hours or even days, requiring testing potentially hundreds or thousands of different pipelines.

With Snowflake ML, you can turn hours of model training into minutes and focus on the genomics workflow rather than worrying about how to rewrite your code for GPUs — because you don’t need to.

Get started today 

Snowflake ML is pre-integrated with NVIDIA’s cuML and cuDF libraries to boost the operational efficiency and scalability of large-scale machine learning on Snowflake data. This expanded capability significantly enhances iterative development and discovery cycles in computationally demanding domains while abstracting away the inherent complexities of GPU infrastructure and environment management.

Ready to get started? To try NVIDIA’s libraries from Snowflake’s Container Runtime, you can easily follow along in this quickstart and product documentation to accelerate your ML workflows on GPUs.
