Accelerate Your Time Series Analytics with Snowflake’s ASOF JOIN, Now Generally Available

Time series data is everywhere. It captures how systems, behaviors and processes change over time. Enterprises across industries, such as Internet of Things (IoT), financial services, manufacturing and more, use this data to drive business and operational decisions. 

When using time series data to perform analytics and drive decisions, it’s often necessary to join several data sets. For instance, a developer in an IoT company, building an application for predictive maintenance, may need to analyze equipment health time series data in the context of historical maintenance data to derive patterns. Similarly, a financial data analyst might need to associate intraday option trade data with prevailing market price data for auditing purposes. 

A common challenge with performing these kinds of joins is that timestamps in different time series tables often don’t match exactly, forcing customers to write complex and cumbersome queries to perform such joins.

At Snowflake, we’re committed to helping customers derive meaningful insights from their data with simplicity and speed. That’s why we are excited to announce the general availability (GA) of ASOF JOIN, a purpose-built, easy-to-use and performant Time Series feature that joins time series data sets when timestamps don’t match exactly.

What is the ASOF JOIN capability? 

ASOF JOIN is a type of join that pairs a record from two tables based on their proximity (usually temporal). For each row on the left side of the join, the operation finds the closest matching value from the right side. 

The SQL syntax is as follows:

SELECT ... 
FROM left_table ASOF JOIN right_table
MATCH_CONDITION... // define proximity (closest preceding / following)
[ ON...] // Optional join key

The corresponding Snowpark syntax is:

left_dataframe.
join(right_dataframe, on=[...], how="asof", match_condition=(...))

How can this syntax be used? As an example, assume a financial analyst is tasked with finding the closest stock quote price prior to each stock trade, for audit or regulatory purposes. Below, find a snippet of their hypothetical data.  

They would do so using ASOF JOIN as follows: 

SELECT 
    t.stock_symbol,
    t.trade_time,
    t.quantity,
    q.quote_time,
FROM trades t
ASOF JOIN quotes q
    MATCH_CONDITION (t.trade_time >= q.quote_time)
    ON t.stock_symbol = q.stock_symbol

To achieve the same in Snowpark, the code snippet would look as follows: 

quotes_dataframe.
join(trades_dataframe, on = [“stock_symbol”], how=“asof”, match_condition=(trades_dataframe.trade_time >= quotes_dataframe.quote_time))

The result would look as follows:

How ASOF JOIN works 

Without native support for ASOF JOIN, customers typically have to use complex workarounds involving multiple subqueries, window functions, range joins, etc. This often results in long and complicated queries that are difficult to maintain and have suboptimal performance. 

To address this, the purpose-built ASOF JOIN capability in Snowflake offers a concise syntax that clearly specifies the desired outcome, leaving the heavy lifting to the query engine. Snowflake converts the ASOF JOIN operator into a series of operations that involve aligning rows on the left and right based on the join keys and the timestamp expression, applying partition-aware sorting, and searching for the closest previous or next value based on the query. 

To illustrate sample performance gains from using ASOF JOIN, we used sample data sets with stock quotes and stock trades (39 million quotes and 2.2 million trades) and ran queries to find the closest stock quote price prior to each trade. We compared two query types — ASOF JOIN (shown in light blue, below) and a workaround involving LEFT OUTER JOIN (in dark blue). The ASOF JOIN queries were 12x faster than the workaround. We then increased the size of both data sets tenfold. With the larger data volume, ASOF JOIN was 16x faster than the workaround.   

Using Snowflake’s native ASOF JOIN helps customers focus on building simple analytic queries that align time series data sets in a highly performant manner instead of formulating complex workarounds. 

How ZeroNorth uses ASOF JOIN 

ZeroNorth, a cleantech company, helps the global shipping industry achieve optimal commercial performance and reduce emissions. Their tech teams used ASOF JOIN to combine hourly reported vessel data, such as velocity, with the closest satellite position of the vessel prior to and after the reading. Two of the specific ASOF JOINs utilized helped provide insights such as ship trajectory tracking and ETA estimation.

Dimo Boyadzhiev, Principal Data Engineer at ZeroNorth shared: “ASOF JOIN performance is really good. This syntax has improved our ways of working to be clearer and faster.”

How Panasonic Connect uses ASOF JOIN 

Panasonic Connect, which plays a central role in the growth of Panasonic Group’s B2B solutions, used ASOF JOIN when migrating its workload from Spark to Snowpark. ASOF JOIN helped the organization perform analytics, such as finding relevant data for each ship in a specific time period, in an easy and efficient manner.

Specifically, the team shared that ASOF JOIN has proved very useful when joining large volumes of data, citing a 99% improvement in performance over previous solutions.

More ways Snowflake customers use ASOF JOIN today: 

  • A wearable technology company is combining users’ haptic interaction data (click, double clicks) with battery level information to determine how certain user interactions impact the device’s battery level. 
  • A financial services company is linking historical stock trade data with quote data for applications like trade surveillance. 
  • A cloud security company is joining application process details with network connection details for potential threat detection.
  • A design company is using ASOF JOIN to analyze user interactions with a certain feature of interest (such as cropping) by associating it with the user’s next action data. 
  • A digital analytics company is joining user behavior data with event data and other relevant data sets to perform user engagement and funnel analytics. 
  • A credit card company is analyzing credit card applications in the context of advertisement click data to gauge the effectiveness of marketing campaigns.

Learn more

ASOF JOIN is now available to all customers in their Snowflake accounts. To learn more, read the Snowflake documentation. You can also refer to the time series user guide to learn more about working with time series data in Snowflake.

The post Accelerate Your Time Series Analytics with Snowflake’s ASOF JOIN, Now Generally Available appeared first on Snowflake.

LATEST ARTICLE

See Our Latest

Blog Posts

admin July 15th, 2024

In an increasingly interconnected world, where technology drives every facet of life, even the weather predictions we rely on may […]

admin July 10th, 2024

There is a scene in Mission: Impossible – Rogue Nation where Tom Cruise is hanging onto the outside of a […]

admin July 10th, 2024

Regulated and sovereign markets across the world have stringent requirements stipulating certain important data be kept within geographical borders or […]