Startup Spotlight: Patch Helps Devs Unblock Pipelines With Data Packages 

Welcome to Snowflake’s Startup Spotlight, where we feature awesome companies building businesses on Snowflake. In this edition, Patch.tech Co-Founder and CPO Whelan Boyd talks about how frustration with clogged data pipelines sparked the idea for Patch’s data packages, which let engineers distribute data sets with all the built-in elements that analysts and developers need to create apps. You’ll also find out why “identifying a villain” is part of Patch’s success strategy.

Whelan, can you tell us a bit about yourself?

I co-founded Patch in 2021 with Peter Elias. I grew up in New Orleans, live in NYC, and have been birdwatching avidly since I was 10. You may not know it, but Central Park has some of the best birding in the area, especially during migratory seasons!

I’ve also always been a math and science nerd, so spending my adult life working on databases and other physics-constrained distributed systems feels almost like a guilty pleasure.

What inspired you to start Patch?

Peter and I led the data platform at Optimizely as the company moved up-market to serve clients such as Nike, Capital One, The New York Times and H&M. These companies sent massive volumes of data through our A/B testing results pipelines.

The worst feeling in the world was not being able to move fast enough because our data infrastructure was holding our roadmap back. Working with these brands closely, I realized they were dealing with the same challenge. They had plenty of data spread across multiple sources, but the time from idea to production was months, quarters or even years. Once Peter and I saw a viable solution to help these teams move faster, we had all the motivation we needed to start Patch.

What problem does Patch aim to solve, and what makes you confident that you and your team are the right people to solve it?

In a nutshell, software developers often end up blocked because the database where some of their data lives either has incomplete data or the wrong query engine for their needs. This leads the developers—or more likely their data team—to set up more pipelines and more databases and move a new copy of the data so that an application can query it the way it needs to.

At Optimizely, we computed our billing and statistics pipelines in Snowflake. We needed to combine that output with data from our operational stores and event streams to deliver interactive billing reports, user notifications, AI-based services and programmatic data access. As we prototyped solutions, we found that a considerable amount of pipeline setup, database and cache tuning, and API development was involved. Our Node and Python devs then had to write tons of boilerplate and adapt their CI/CD workflows.

Patch enables software engineers to build highly scalable, performant production apps on Snowflake data, without ETL, without changes to their existing infrastructure, and with no boilerplate code in the application. Our team has a huge amount of experience in this area. Peter, for example, previously led a complete rebuild of the data platform behind Condé Nast’s brands, and is likely one of the most knowledgeable engineers on the planet when it comes to stream processing, batch ETL and distributed systems engineering.

What’s the coolest thing you’re doing with data? 

Patch is designed for data and application teams who want to move beyond low-level pipelines and modeling and instead drive revenue impact by using their data to power high-value, customer-facing features. Here’s how we do it:

  • Developer experience: Data is queryable through generated GraphQL APIs or generated native-language clients called “data packages” (see the sketch after this list). These interfaces are designed to make using Snowflake data in production as easy as importing a code library. No ETL, no boilerplate database driver code, no raw SQL queries, no networking or authentication boilerplate. Simply import and write code.
  • Read replicas with a twist: Data packages are essentially read replicas over Snowflake that allow the developer to change the query engine. This means developers can run aggregations, searches and point reads on Snowflake data with the performance and reliability of a highly tuned cache, but with only two minutes of setup.
  • Performance: The highly optimized query engine provides sub-30ms latency on multi-million row aggregations and sub-10ms point reads.
  • Schema evolution: One of the most common reasons apps break when using data sourced from analytics teams is an upstream schema change. This is unacceptable for production apps, so we drew inspiration from package managers like npm and PyPI to offer data engineers a way to evolve schemas and provide software devs with a familiar upgrade workflow.
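
To make the “import and write code” idea concrete, here is a minimal sketch of what querying a data package might look like from a Node/TypeScript app. The package name, the createClient factory and the query-builder methods below are hypothetical illustrations of the concept, not Patch’s actual generated client API.

```typescript
// Hypothetical sketch only: the package name, createClient factory and
// query-builder methods are illustrative stand-ins, not Patch's real API.
import { createClient, orders } from "@acme/orders-data-package";

// No drivers, connection pools or SQL strings: the generated client handles
// networking and authentication against the hosted read replica.
const client = createClient({ apiKey: process.env.PATCH_API_KEY });

async function main() {
  // Aggregation over Snowflake-sourced data, served by the tuned query engine.
  const revenueByRegion = await client.query(
    orders
      .where({ orderDate: { gte: "2024-01-01" } })
      .groupBy("region")
      .sum("orderTotal")
  );

  // Low-latency point read, e.g. for a customer-facing billing page.
  const order = await client.query(orders.byId("ord_12345"));

  console.log(revenueByRegion, order);
}

main().catch(console.error);
```

In this model, the schema-evolution piece works like a dependency upgrade: when the upstream model changes, the data engineer publishes a new package version and the app team bumps it on their own schedule, much as they would with an npm dependency.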

How has Snowflake enabled you to push the envelope in your line of business?

Snowflake’s ecosystem is unmatched. Snowflake Marketplace in particular makes it easy for us to develop and distribute demo apps that appeal to various industries.

Snowflake’s ambition also attracts forward-looking customers. These teams are looking to unlock more value from their data, which includes customer-facing products, online back-end services and data monetization revenue streams that rely on data living in Snowflake.

When you were implementing Snowflake, how did you decide whether to use a managed, hybrid or connected architecture?

We decided to use a connected architecture because it allows us to unlock the value of customers’ data by integrating it into their production stacks.

We constantly hear that data engineers have developed a real command of the Snowflake platform, especially when it comes to unlocking insights. But application engineers are not as familiar with it and work with different development tools and workflows. A connected architecture allows us to bridge that gap and help app developers see the immense, latent potential of Snowflake data.

What’s the most valuable piece of advice you got about how to run a startup? 

One of our advisors told us that “to build a category-defining company, you have to identify your villain and state your mission.” For too long, application engineers have waited on data infrastructure teams to set up new pipelines and specialized databases for each new use case. That low-level engineering work is our villain. Data packages empower application engineers to build performant and scalable applications over data sets of any shape and size, no matter where they are stored.

If you had a chance to go back to the early days of your startup and do something differently, what would you change?

I wish we had leaned into Snowflake earlier. The feedback we’ve gotten from the team, the ability to explore customer opportunities, and the platform itself have opened a lot of doors.

What’s on the horizon for Patch? 

Soon, software developers won’t have to wait for a stream or batch pipeline in order to develop their app. Instead, they’ll import a data package and start coding right away.

Looking further forward, we aim to push the envelope on what software engineers can build with data, both within their company and from the public internet. Infrastructure concerns will fade into the background, and the only limiting factor will be the developer’s imagination. Learn more about Patch at https://www.patch.tech/ and check out the Powered by Snowflake Startup Program today. If you’re an early-stage startup, don’t forget about the 2024 Snowflake Startup Challenge: submissions are due March 1!
