Enterprise AI has historically been fragmented, requiring isolated environments for training, inference and serving. This separation often necessitates complex data movement across trust boundaries, increasing security risks and operational overhead. Snowflake, in collaboration with Amazon Web Services (AWS) and NVIDIA, is addressing these challenges by unifying the AI lifecycle within the Snowflake AI Data Cloud.
NVIDIA RTX PRO™ 6000 Blackwell Server Edition GPUs are coming soon to the Snowflake platform in select AWS regions via Amazon EC2 G7e instances. This integration places high-performance visual and generative AI compute directly adjacent to enterprise data. By embedding NVIDIA Blackwell-class compute into Snowflake's architecture, customers can build powerful AI models, agents and applications that require high-throughput processing, all within a governed security perimeter, right where their data lives.
With this launch, Snowflake moves beyond offering faster instances to deliver a cohesive platform that supports the diverse latencies and workflows required by modern AI teams while reducing the total cost of ownership (TCO) through improved operational velocity:
Snowflake Container Runtime for training and data loading: Data scientists can now use Snowflake Notebooks to provision GPU-powered containers for interactive development and easy deployment in Snowflake ML. This environment supports zero-code change access to GPU acceleration for pandas and scikit-learn workflows using NVIDIA cuDF and cuML libraries. It also supports scalable distributed training, allowing teams to launch multinode PyTorch training jobs across Blackwell clusters with minimal configuration. To feed these hungry GPUs, the Snowflake ML DataConnector enables parallelized reading of unstructured data such as images and PDFs directly from Snowflake stages, supporting optimal GPU utilization without I/O bottlenecks.
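To make the "zero-code change" claim concrete, here is a minimal sketch of the pattern. The workflow below is ordinary CPU pandas; in a GPU-backed notebook with RAPIDS installed, uncommenting the two `cudf.pandas` lines routes the identical code through NVIDIA cuDF. The claims data is invented for illustration.

```python
# Zero-code-change pattern: the pandas code below is unmodified. On a GPU
# runtime with RAPIDS available, the two commented lines activate cuDF's
# pandas accelerator mode before pandas is imported; no other edits needed.

# import cudf.pandas          # requires a GPU runtime with RAPIDS installed
# cudf.pandas.install()       # must run before `import pandas`

import pandas as pd

# Hypothetical claims table standing in for data read via the DataConnector.
df = pd.DataFrame(
    {
        "claim_id": [1, 2, 3, 4],
        "region": ["east", "west", "east", "west"],
        "amount": [1200.0, 540.0, 310.0, 980.0],
    }
)

# Typical feature-engineering step: aggregate claim amounts per region.
summary = df.groupby("region")["amount"].agg(["mean", "sum"]).reset_index()
print(summary)
```

The same pattern applies to scikit-learn workflows via cuML's equivalent accelerator mode.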
Flexible deployment with real-time and batch inference: For user-facing applications requiring millisecond latency, models can be deployed to Snowpark Container Services (SPCS) for real-time inference. For massive data processing, such as scoring millions of historical records, the platform leverages Snowpark-optimized warehouses with planned support for the NVIDIA Blackwell architecture.
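The two inference shapes can be sketched with plain scikit-learn (not Snowflake APIs; the model and data here are stand-ins). On the platform, the same registered model artifact would back either an SPCS service (real-time path) or a warehouse batch job (batch path).

```python
# Illustrative sketch of the two inference patterns: low-latency scoring of
# a single record vs. high-throughput vectorized scoring of a historical
# block. Toy model and synthetic data; only the access pattern matters.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1_000, 5))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

# Real-time path: one record in, one prediction out (millisecond budget).
single_record = rng.normal(size=(1, 5))
realtime_pred = model.predict(single_record)

# Batch path: score a large historical slice in one vectorized call.
historical = rng.normal(size=(100_000, 5))
batch_preds = model.predict(historical)
print(realtime_pred.shape, batch_preds.shape)
```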
Semantic context as the bridge to accuracy: Enterprise-grade AI requires a level of precision that generic models cannot achieve in isolation. The NVIDIA Blackwell architecture accelerates the high-speed reasoning loops required for Snowflake Intelligence. By bringing accelerated compute to the Snowflake semantic layer, enterprises can ensure that models understand the business logic and specific “meaning” behind their data, closing the accuracy gap and hitting the precision threshold required for production decision-making.
The NVIDIA RTX PRO 6000 Blackwell GPU is designed for the intensive demands of large language models and multimodal workflows. Each GPU provides 96 GB of GDDR7 memory, supporting more than 70 billion parameter models and vision-language models (VLMs) on a single card to minimize inter-GPU communication latency.
The inclusion of fifth-generation Tensor Cores with native FP4 support delivers up to 5x higher inference throughput compared to previous generations. This architecture optimizes the compute layer for precision and latency in AI deployments.
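A back-of-the-envelope calculation shows why FP4 is what lets a 70B-parameter model fit on a single 96 GB card. FP4 stores each weight in 4 bits (0.5 bytes), versus 2 bytes for FP16; the figures below ignore activation, KV-cache and quantization scaling-factor overhead, so real headroom is somewhat smaller.

```python
# Weight-memory footprint of a 70B-parameter model at two precisions.
params = 70e9
gib = 1024**3  # bytes per GiB

fp16_weights_gib = params * 2.0 / gib   # ~130 GiB: exceeds 96 GB
fp4_weights_gib = params * 0.5 / gib    # ~33 GiB: fits with room to spare

print(f"FP16 weights: {fp16_weights_gib:.1f} GiB")
print(f"FP4 weights:  {fp4_weights_gib:.1f} GiB")
```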
Both NVIDIA RTX PRO 6000 Blackwell Server Edition and the new NVIDIA RTX PRO 4500 Blackwell Server Edition GPU will be coming soon to Snowflake customers.
Large insurance providers manage millions of claims annually, combining structured data with massive volumes of unstructured assets such as high-resolution accident photos and handwritten reports. To build intelligent retrieval systems on this data, organizations need a pipeline that can ingest, embed and analyze complex files at scale without hitting memory or I/O walls:
Zero-copy ingestion at scale: Ingesting 100,000 accident reports daily creates a massive I/O bottleneck for traditional systems. Snowflake ML addresses this by using the DataConnector or Snowpark dynamic file access as part of user-defined functions with custom code to stream and analyze files in parallel, directly from Snowflake stages. This pipeline leverages the RTX PRO 6000 Blackwell's 96 GB of GDDR7 memory to handle massive batches of high-resolution images in-memory. By performing resizing and normalization directly on the GPU without spilling to disk, Snowflake enables the data transformation phase to keep pace with the compute capabilities.
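The in-memory preprocessing step can be sketched with a CPU stand-in (NumPy). On the GPU, the same batch-at-once pattern would run on-device (for example via CuPy or PyTorch tensors), with the 96 GB of GDDR7 holding the whole batch so no intermediate files touch disk. The function and batch sizes here are illustrative, not Snowflake APIs.

```python
# CPU stand-in for GPU batch preprocessing: resize and normalize a whole
# (N, H, W, C) image batch in one pass, entirely in memory.
import numpy as np

def preprocess_batch(images: np.ndarray, out_hw=(224, 224)) -> np.ndarray:
    """Downsample a uint8 image batch by index striding, then scale to [0, 1]."""
    n, h, w, c = images.shape
    # Nearest-neighbor "resize" via strided row/column selection.
    rows = np.linspace(0, h - 1, out_hw[0]).astype(int)
    cols = np.linspace(0, w - 1, out_hw[1]).astype(int)
    resized = images[:, rows][:, :, cols]
    return resized.astype(np.float32) / 255.0

# Synthetic stand-in for a batch of high-resolution accident photos.
batch = np.random.randint(0, 256, size=(32, 1024, 768, 3), dtype=np.uint8)
out = preprocess_batch(batch)
print(out.shape, out.dtype)
```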
Streamlined multimodal training: To “understand” a claim, models must map visual damage and text into a unified vector space. Data scientists can utilize Snowflake Notebooks to launch distributed PyTorch training jobs for custom embedding models (like CLIP) directly on the data. Snowflake abstracts the complexity of cluster management, while the underlying Blackwell architecture accelerates the training process. This combination allows teams to train on millions of claim pairs in hours rather than days, with the resulting model artifacts automatically versioned in the Snowflake Model Registry for scalable inference.
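The core of the CLIP-style objective mentioned above can be shown in a few lines of NumPy: image and text embeddings land in a shared space, and a symmetric contrastive loss pulls each matched (image, text) pair together while pushing mismatches apart. Real training would run as a distributed PyTorch job; this sketch, with random embeddings, only demonstrates the math.

```python
# Symmetric InfoNCE (CLIP-style) loss over a batch of matched
# image/text embedding pairs. Pure NumPy, for illustration only.
import numpy as np

def clip_loss(img_emb: np.ndarray, txt_emb: np.ndarray, temperature=0.07) -> float:
    # L2-normalize so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature   # (N, N): pair i matches pair i

    def xent(l):
        # Cross-entropy where the correct class for row i is column i.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diagonal(log_probs).mean()

    # Average the image->text and text->image directions.
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 64))
loss_random = clip_loss(img, rng.normal(size=(8, 64)))  # unrelated texts
loss_matched = clip_loss(img, img)  # perfectly aligned pairs: much lower
print(loss_random, loss_matched)
```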
Unified deployment for fraud prevention: Once trained, the model acts as the core of a fraud detection pipeline. With a one-click deployment, data scientists can use Snowflake ML APIs to publish the model to Snowpark Container Services for real-time inference or to a virtual warehouse with Snowpark for batch processing. This flexibility allows the same Blackwell hardware to power high-bandwidth historical analysis of thousands of prior claims or low-latency evaluation of new image uploads, all while maintaining the governance and lineage tracking provided by Snowflake Horizon.
Research teams utilize Snowflake Notebooks and NVIDIA Blackwell’s FP4 precision to backtest trading strategies against decades of unstructured alternative data (audio, imagery) at high speeds. This allows for the rapid generation of tradable signals within a governed data environment, providing a distinct velocity advantage. By moving away from slow, traditional processing, firms can now perform complex portfolio optimizations in near-real-time—achieving speeds up to 80x faster than before. This unified approach allows traders to make faster, data-driven decisions while keeping sensitive information secure and governed within a single platform. Explore the architecture.
Open up more high-throughput AI use cases by integrating NVIDIA Blackwell GPUs into Snowflake today. NVIDIA RTX PRO 6000 Blackwell GPUs are available in select AWS regions with Amazon EC2 G7e instances; instances with NVIDIA RTX PRO 4500 Blackwell GPUs are coming soon. Deployment documentation and quickstart guides will be published soon. Check out the following resources to learn more:
Read about real-world applications of NVIDIA Blackwell GPUs for financial services use cases in this blog post.
Try accelerating scikit-learn and pandas workflows using GPUs through Snowflake’s integration with NVIDIA’s CUDA-X libraries.
Discover how to use agentic ML to build and deploy production-ready AI/ML models in Snowflake.
Forward-looking statements
This article contains forward-looking statements, including about our future product offerings, which are not commitments to deliver any product offerings. Actual results and offerings may differ and are subject to known and unknown risks and uncertainties. See our latest Form 10-Q for more information.