Snowflake Kafka Connector V4

Kafka Connector V4 defaults to schematized ingestion, where each JSON key maps to its own table column. This is more performant and what we recommend. If you want to keep Kafka Connector V3’s default two-column mode (RECORD_CONTENT and RECORD_METADATA as VARIANT), set snowflake.enable.schematization=false.
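
As a minimal sketch, a sink configured for the V3-style two-column mode might look like the fragment below. Only snowflake.enable.schematization is named in this post; the other keys are standard Kafka Connect sink properties, and all names are illustrative:

  # Illustrative connector .properties fragment.
  # The fully qualified V4 class name is in the connector docs.
  name=my-snowflake-sink
  connector.class=SnowflakeStreamingSinkConnector
  topics=orders
  # Keep V3's RECORD_CONTENT/RECORD_METADATA VARIANT layout:
  snowflake.enable.schematization=false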

The path is incremental:

  1. Update to Kafka Connector V4 with the new connector class and apply the compatibility flags covered in the migration section below.
  2. Test with your existing data in a nonproduction environment.
  3. Adopt Kafka Connector V4 defaults one at a time: server-side validation, native column naming and more.

The migration is seamless. Kafka Connector V4 handles offset recovery based on the configs you choose. Data integrity and exactly-once delivery are maintained throughout. Your existing tables continue to work, and you can run Kafka Connector V3 and Kafka Connector V4 side by side on different topics during a phased rollout.
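
As a sketch of that phased rollout, the two connectors can coexist in one Kafka Connect cluster as two separate connector configs. Names, topics and everything except the V3 class path are illustrative here:

  # Config 1: the V3 connector keeps serving its existing topic.
  name=orders-sink-v3
  connector.class=com.snowflake.kafka.connector.SnowflakeSinkConnector
  topics=orders

  # Config 2: the V4 connector takes over a different topic.
  # (Fully qualified V4 class name per the connector docs.)
  name=clicks-sink-v4
  connector.class=SnowflakeStreamingSinkConnector
  topics=clicks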

You can automate your migration with Cortex Code too. The Custom Kafka Consumer Cortex Code skill guides you through setting up, running and debugging in minutes. Type kafka consumer or kafka to snowflake in Cortex Code to get started.

Get started

Kafka Connector V4 is generally available today. It's available in Maven and compatible with Apache Kafka 2.x and 3.x, including Confluent Platform and Amazon Managed Streaming for Apache Kafka (Amazon MSK). It supports Schema Registry integration with client-side validation modes and requires Java 11+. We're working with our partners to include Kafka Connector V4 in their managed Kafka Connect offerings.

Get hands-on with Cortex Code skills

Cortex Code ships with three purpose-built skills for Snowpipe Streaming that take you from setup to production patterns in minutes. Explore all Cortex Code skills on GitHub.

For a full list of current limitations, consult the Kafka Connector V4 documentation.

The shift to server-side processing simplifies everything downstream:

  • Auto table creation: The connector creates target tables from your data. No preprovisioning required.
  • Server-side schema evolution: Snowflake adds new columns as your data changes. No client-side DDL.
  • Standard community converters: Kafka Connector V4 drops the Snowflake-specific SnowflakeJsonConverter and SnowflakeAvroConverter. Use the standard JsonConverter, AvroConverter and ProtobufConverter you already know (see the converter sketch after this list).
  • Simplified configuration: Dozens of buffer, streaming and optimization knobs have been removed. The server manages what the server should manage.
  • Migration compatibility: Features like client-side validation with DLQ are fully supported. Migrate your connector class, keep your error-handling patterns and adopt Kafka Connector V4 features at your own pace.
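
Here is a minimal sketch of that converter swap for JSON values. The converter classes are the standard Kafka Connect and V3 ones; schemas.enable is the usual JsonConverter toggle:

  # Before (V3), the Snowflake-specific converter:
  # value.converter=com.snowflake.kafka.connector.records.SnowflakeJsonConverter
  # After (V4), the community JsonConverter:
  value.converter=org.apache.kafka.connect.json.JsonConverter
  value.converter.schemas.enable=false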

Try it hands-on with Cortex Code. If you want to experience the simplified setup firsthand, just type ssv2 quickstart in Cortex Code to get started.

Performance

When evaluating streaming performance, the bottleneck is rarely the network. Most often, it is the compute overhead on your Kafka Connect workers. We benchmarked V3 against V4 under a heavy workload (four CPUs, 8 GB RAM, eight partitions, eight tasks per node, a message size of roughly 10 KB and 250 columns). The data shows exactly why moving to a server-side architecture lets us push performance boundaries further than ever before.

As the throughput scaling chart shows, V3 handles standard workloads reliably. However, as enterprise throughput demands increase into the tens of megabytes per second, client-side processing begins to reach its natural limits. At 8 MB/s per partition, V3 caps out at around 37.7 MB/s total throughput as it works hard to process all the data locally.

By contrast, V4 effortlessly handles that exact same workload while maintaining a crisp 7-second ingest-to-query latency. Pushing the system even further to 12 MB/s per partition, V4 scales smoothly to 96 MB/s of total throughput. At peak capacity, V4 can deliver up to 10 GB/s of throughput per table with end-to-end latencies as low as 5 seconds.

Running Apache Kafka® at scale means your connector has been doing work it was never supposed to own: buffer management, schema validation and Java Virtual Machine (JVM) tuning. We built Kafka Connector Version 4.0 (V4) to change that. It's a ground-up rewrite of the Snowflake Kafka Connector built on Snowpipe Streaming High-Performance Architecture, a server-side ingestion service that handles validation, transformation and commit inside the platform. The connector's job is straightforward now: It delivers rows, and Snowflake handles the rest.

Snowpipe Streaming has been proven at scale, running in production across thousands of customer deployments since its general availability. Kafka Connector V4 takes full advantage of it, moving ingestion logic server-side and getting it off your workers entirely. Kafka Connector Version 3.0 (V3) has been battle-tested in production for years. The upgrade path to Kafka Connector V4 does not require starting over. If you are running V3 today, your existing Dead Letter Queue (DLQ) and error-handling patterns work unchanged from day 1. You move at your own pace.

With Kafka Connector V4 generally available (GA) today, we've observed up to 10 GB/s throughput per table and 5-second end-to-end latency from ingest to queryable data. To make spending far easier to predict, it uses the same throughput-based ingestion pricing as Snowpipe Streaming, replacing the credit-based model tied to serverless compute and client connections. You pay a consistent 0.0037 credits per GB. Based on internal benchmarks from the rollout to Business Critical and Virtual Private Snowflake (BC/VPS) edition customers in August 2025, customers are already realizing upward of 50% cost savings from the new pricing model. And to make it easier than ever, we've built a Kafka Connector V4 Cortex Code skill and a set of skills that take you from setup to a running streaming pipeline in Snowflake.

The architecture shift

In Kafka Connector V3, the connector carried a lot of responsibility: client-side validation, buffer management, custom Snowflake-specific converters, schema handling. All of it consumed resources and created potential failure points on client-side infrastructure.

Kafka Connector V4 flips this model. Processing moves server-side through PIPE objects, Snowflake-managed objects that define how streaming data is validated, transformed and committed before it lands in your table. The connector's job becomes simple: Deliver rows. Snowflake handles validation, transformation, clustering and commit. For customers migrating or upgrading to the latest version, Kafka Connector V4 provides optional client-side compatibility with V3's core features, meaning you can upgrade without worrying about downtime.
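
For orientation, a PIPE for the high-performance Snowpipe Streaming architecture is declared in SQL along these lines. This is a simplified sketch with illustrative object names; consult the Snowpipe Streaming documentation for the exact syntax available in your account:

  -- A PIPE that routes streaming rows into a target table,
  -- matching incoming fields to columns by name.
  CREATE PIPE my_streaming_pipe
    AS COPY INTO my_db.my_schema.events
    FROM TABLE(DATA_SOURCE(TYPE => 'STREAMING'))
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;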

The CPU utilization chart illustrates exactly how V4 achieves this scale. At the 8 MB/s per partition threshold, V3 maximizes its available compute resources at 96% CPU to keep up with data transformation and serialization. V4, on the other hand, processes this identical workload using just 33% CPU while also consuming significantly less memory. This shows real client infrastructure savings for organizations running at scale.

Why the massive difference? First, the V4 SDK builds upon our previous work by introducing a highly optimized, shared Rust core. Transitioning from the pure Java implementation of V3 shrinks the client footprint, dropping CPU usage, lowering memory requirements and removing Java Garbage Collection (GC) pressure. This translates directly to smaller Kafka Connect workers and leaner infrastructure for you to run.

Second, V4 leverages Snowflake PIPE objects to push processing to the server side. Instead of utilizing worker compute for client-side transformations, V4 simply delivers the rows. Snowflake natively handles in-flight clustering, renaming columns, casting types and filtering records automatically as the data lands. Your queries run faster on fresh, well-clustered data, all with zero added client-side overhead.

Pricing and efficiency

Kafka Connector V4 utilizes Snowpipe Streaming’s throughput-based pricing model. You pay per uncompressed GB ingested at 0.0037 credits per GB, or about $0.01/GB depending on your Snowflake edition. For up-to-date pricing, see the Snowflake consumption table. This replaces the previous credit-based model tied to serverless compute and client connections. It’s more predictable and easier to forecast.
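
As a rough worked example: ingesting 1 TB of uncompressed data (1,000 GB) in a day costs 1,000 × 0.0037 = 3.7 credits. At a credit price in the $2 to $4 range (it varies by edition and region), that is roughly $7 to $15 per TB, in line with the ~$0.01/GB figure above.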

The pricing model isn't the only thing that changes when you migrate from Kafka Connector V3. Client-side resource savings, better query performance on clustered data and in-flight transformations that eliminate separate processing steps all contribute to a lower total cost of ownership.

The client-side story is where the savings are most concrete. Kafka Connector V4’s Rust core and server-side processing mean your Kafka Connect workers need significantly fewer resources. One customer has reported up to 30% reduction in client-side costs. Smaller instances. Fewer workers. Less infrastructure spend.

Error handling with error logging

Server-side processing changes how you troubleshoot failed records. In Kafka Connector V3, troubleshooting failed records meant searching through client logs and Kafka Connect worker output across distributed infrastructure. With Kafka Connector V4’s server-side validation, failed records land in a SQL-queryable error table inside Snowflake. Error message, offset, timestamp, channel context — everything you need to diagnose issues, accessible with a SELECT query. Troubleshooting moves from distributed log diving to a SQL console.
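
As a sketch of what that looks like in practice (the table and column names here are illustrative, not the connector's actual schema; see the V4 documentation for the real object names):

  -- Hypothetical query against a server-side error table:
  -- inspect the last hour of failures for one topic.
  SELECT record_offset, error_code, error_message, channel_name, created_on
  FROM my_db.my_schema.ingest_errors
  WHERE topic = 'orders'
    AND created_on > DATEADD('hour', -1, CURRENT_TIMESTAMP())
  ORDER BY created_on DESC;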

Server-side validation moves type checking entirely to Snowflake's side. You get detailed error codes and real-time channel status.

For teams that prefer Kafka-native error routing, client-side validation with DLQ remains fully supported. Same configuration, same behavior as Kafka Connector V3. You can keep your existing patterns and adopt server-side error tables when you’re ready. Pick what fits your pipeline, but for most teams, error tables are the path forward.

Migration: Three steps to Kafka Connector V4

We know migration is a serious decision, especially when you’re running dozens of connectors across hundreds of topics. So we built Kafka Connector V4 with migration at its core.

You upgrade by updating to the new connector version and class (SnowflakeStreamingSinkConnector instead of Kafka Connector V3’s SnowflakeSinkConnector). Migration-ready compatibility configs ship out of the box and reproduce Kafka Connector V3 behavior so you can get running with minimal changes:
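
The full flag set is in the V4 documentation. As an illustrative sketch, where only the connector classes and snowflake.enable.schematization are named in this post and the error-handling keys are standard Kafka Connect properties:

  # Switch the connector class (was com.snowflake.kafka.connector.SnowflakeSinkConnector).
  connector.class=SnowflakeStreamingSinkConnector
  # Keep V3's two-column VARIANT layout during migration:
  snowflake.enable.schematization=false
  # Existing Kafka Connect DLQ/error-handling settings carry over unchanged:
  errors.tolerance=all
  errors.deadletterqueue.topic.name=my-dlq-topic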
