At TCS, alongside our broader IT services, we help companies shift their enterprise data warehouse (EDW) platforms to the cloud. We’re extremely familiar with just how tricky a cloud migration can be, especially when it involves moving historical business data.
Choosing a migration approach involves balancing cloud strategy, architecture needs and business priorities. But there are additional considerations when dealing with historical business data. This data has typically undergone multiple cycles of redesign, changes and upgrades, and its volume is often measured in terabytes or petabytes. Business contextual frameworks and design patterns are tightly bound to existing data models, and regulatory requirements may demand that historical data be stored as is and remain readily available for auditing.
A “lift and shift” migration approach handles all of these requirements by moving the business workload and data together. This simplifies change management and reduces impact and downtime for the rest of the business. Migrating historical data in the right quality and format ensures it is ready for reporting as well as for use in AI applications. With that data in place, an enterprise can fast-track ML workloads such as predictive, prescriptive, descriptive and cognitive analytics. And when moving to Snowflake, you get the advantage of the Data Cloud’s architectural benefits (flexibility, scalability and high performance) as well as availability across multiple cloud providers and global regions.
In this blog post, we’ll take a closer look at the details that can help decision-makers successfully plan and execute an optimal lift-and-shift cloud data migration strategy for their business. We’ll also present some best practices for overcoming common migration challenges and provide an overview of migrating historical data to Snowflake using the TCS Daezmo solution suite.
Identifying the best migration approach for your organization starts with a better understanding of your historical data environment. Information gathering is the first mandate: begin by asking stakeholders, application owners, DBAs and others targeted questions about the current data estate and how it is used.
It’s critical for IT experts to be clear-eyed about the bandwidth available between on-premises data centers and cloud providers for the data migration, and to identify and factor in any workload dependencies. They should also classify all legacy data assets as hot, warm or cold to finalize the migration plan and the refresh and sync strategies.
At the same time, operations teams need to determine lead times for staging server procurement, as well as manage the security approvals needed to move data assets from on-premises servers.
Once the initial technical and operational factors are settled, a migration plan has four main steps: data extraction, transfer, upload and validation (see Figure 1).
Figure 1: Extract, transfer, upload and validate are the four main steps of a data migration plan.
Here’s a look at the challenges you may face in each step, and how to overcome them:
Challenges: Efficient extraction is often stymied by low compression ratios of legacy data, long-running jobs or resource contention on certain tables, and restrictions on how many parallel connections can be opened on the source system.
Best practices
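To address the connection-cap and compression challenges above, extraction scripts typically stream each table in chunks, compress output as it is written, and bound the number of concurrent source connections. The following is a minimal sketch of that pattern; the pyodbc DSN, table list, chunk size and connection cap are illustrative assumptions, not part of any specific tool.

```python
# Sketch: parallel table extraction with a capped number of source
# connections and gzip-compressed output. DSN, tables and sizes are
# illustrative assumptions.
import csv
import gzip
from concurrent.futures import ThreadPoolExecutor

import pyodbc  # assumed DB-API driver for the legacy warehouse

MAX_PARALLEL_CONNECTIONS = 4                  # respect source-system limits
TABLES = ["SALES_HISTORY", "LEDGER_2019"]     # illustrative table list

def extract_table(table: str) -> str:
    out_path = f"/staging/{table.lower()}.csv.gz"
    conn = pyodbc.connect("DSN=legacy_dw")    # assumed DSN
    try:
        cur = conn.cursor()
        cur.execute(f"SELECT * FROM {table}")
        with gzip.open(out_path, "wt", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow([col[0] for col in cur.description])
            while True:
                rows = cur.fetchmany(50_000)  # stream in chunks
                if not rows:
                    break
                writer.writerows(rows)
    finally:
        conn.close()
    return out_path

# Bound concurrency so extraction jobs don't exhaust source connections.
with ThreadPoolExecutor(max_workers=MAX_PARALLEL_CONNECTIONS) as pool:
    for path in pool.map(extract_table, TABLES):
        print("extracted", path)
```

Capping the worker pool keeps long-running extracts from monopolizing the source system, and compressing during extraction avoids a second pass over terabyte-scale files.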
Challenges: Limited network bandwidth, or highly variable throughput between peak and off-peak times, poses a barrier to swift data transfer. High data volumes in each iteration can also drag down transfer rates. And moving files introduces the chance of data corruption, especially with large files.
Best practices
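One common safeguard against in-transit corruption is to record a checksum for every extracted file before transfer and verify it on the receiving side, re-sending only the files that fail. A minimal sketch follows; the staging and landing directories are placeholder assumptions.

```python
# Sketch: checksum every extracted file before transfer and verify it
# afterward so corrupted files can be re-sent. Directory paths are
# illustrative placeholders.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 8 * 1024 * 1024) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(src_dir: Path, manifest: Path) -> None:
    checksums = {p.name: sha256_of(p) for p in sorted(src_dir.glob("*.csv.gz"))}
    manifest.write_text(json.dumps(checksums, indent=2))

def verify_manifest(dst_dir: Path, manifest: Path) -> list[str]:
    expected = json.loads(manifest.read_text())
    return [name for name, digest in expected.items()
            if sha256_of(dst_dir / name) != digest]

# Before transfer: write_manifest(Path("/staging"), Path("/staging/manifest.json"))
# After transfer:  corrupted = verify_manifest(Path("/landing"), Path("/staging/manifest.json"))
# Re-send anything listed in `corrupted`, ideally during off-peak windows.
```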
Challenges: If object storage is available within the customer subscription, you can use it as an external stage for data upload to Snowflake; if it’s not available, you’ll need to use an internal stage instead. High data volume growth in the legacy platform can also mean a shorter freeze period or cutover window, affecting the timing of incremental data sync-ups after the initial migration. And incorrectly sized clusters can increase your credit consumption rate.
Best practices
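When no customer-managed object storage is available, files can be pushed through a Snowflake internal stage and bulk-loaded with COPY INTO, using a warehouse sized to the load to keep credit consumption in check. Here is a minimal sketch using the Snowflake Python connector; the connection parameters, warehouse, table and file paths are assumptions.

```python
# Sketch: upload compressed extracts to a Snowflake internal (table)
# stage and bulk-load them with COPY INTO. Connection parameters,
# warehouse, table and file paths are illustrative.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="migration_user", password="...",
    warehouse="MIGRATION_WH",     # size to the load to control credits
    database="EDW", schema="HISTORY",
)
cur = conn.cursor()

# Push the already-gzipped files to the table's internal stage.
cur.execute(
    "PUT file:///staging/sales_history.csv.gz @%SALES_HISTORY "
    "AUTO_COMPRESS=FALSE PARALLEL=8"
)

# Bulk-load from the stage; ON_ERROR keeps failed loads visible.
cur.execute("""
    COPY INTO SALES_HISTORY
    FROM @%SALES_HISTORY
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1
                   FIELD_OPTIONALLY_ENCLOSED_BY = '"')
    ON_ERROR = 'ABORT_STATEMENT'
""")
conn.close()
```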
Challenges: Manual validations and certifications are time-consuming and error-prone. Assessing data quality requires additional time and effort from the team.
Best practices
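Even simple automated reconciliation, such as comparing per-table row counts between the source and Snowflake, removes a large share of the manual certification effort. A minimal sketch follows; the cursors and table list are assumptions, and the same pattern extends to column-level aggregates or sampled comparisons.

```python
# Sketch: automated row-count reconciliation between the legacy source
# and Snowflake. Cursors and table names are illustrative.
def row_count(cursor, table: str) -> int:
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]

def reconcile(source_cur, snowflake_cur, tables: list[str]) -> list[str]:
    mismatches = []
    for table in tables:
        src = row_count(source_cur, table)
        dst = row_count(snowflake_cur, table)
        if src != dst:
            mismatches.append(f"{table}: source={src}, snowflake={dst}")
    return mismatches

# Usage (cursors come from the source DB-API driver and the Snowflake
# Python connector, respectively):
# issues = reconcile(legacy_cur, sf_cur, ["SALES_HISTORY", "LEDGER_2019"])
# print("\n".join(issues) or "All tables reconciled")
```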
There are several options for migrating data and analytics to a modern cloud data platform like Snowflake. Using a data migration framework, however, gives you significant flexibility. This is where the TCS Daezmo solution suite comes in. It combines several migration approaches, methodologies and machine-first solution accelerators to help companies modernize their data and analytics estate on Snowflake. Daezmo includes a suite of accelerators for data and process lineage identification, historical data migration, code conversion, and data validation and quality. The TCS Data Migrator Tool provides connectors for standard RDBMSs, data warehouses and file systems that you can leverage for historical data migration.
Figure 2: Historical data migration using TCS Daezmo Data Migrator Tool.
Figure 2 shows what the four-step migration process (extract, transfer, upload and validate) looks like when run through the TCS Data Migrator Tool.
By streamlining the process of moving data into the Snowflake destination, the TCS Data Migrator Tool helps fast-track historical data migration. The time savings can be especially crucial for organizations facing strict migration deadlines, such as a company that must complete its data migration before a license expires.
One leading bank in the EU needed to move its large financial and regulatory data warehouse off its RDBMS before the license expired in 14 months, and the data had to be migrated as is. The bank used the TCS Data Migrator Tool to migrate historical data for non-production environments holding about 300 TB of data. Because of the high data volume, the team used the tool’s integration with native platform utilities to complete the task. For the more than 1 petabyte of data in the production environment, the short migration timeline and network bandwidth limitations meant a device-based approach was the best choice. The team used an AWS Snowball storage device; after loading the data into S3 buckets, they developed custom scripts to load it into Snowflake tables. The fully operational financial and regulatory platform was up and running on Snowflake within months.
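Custom load scripts in a scenario like this typically amount to defining an external stage over the S3 buckets and running COPY INTO per table. The following is a hedged sketch of that pattern, not the bank’s actual scripts; the bucket, credentials, stage and table names are assumptions, and a storage integration would normally replace inline credentials.

```python
# Sketch: load S3-staged files into Snowflake via an external stage.
# Bucket, credentials, stage and table names are illustrative.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="migration_user", password="...",
    warehouse="MIGRATION_WH", database="EDW", schema="HISTORY",
)
cur = conn.cursor()

# Point an external stage at the S3 bucket holding the extracted files.
cur.execute("""
    CREATE STAGE IF NOT EXISTS S3_HISTORY_STAGE
      URL = 's3://example-migration-bucket/history/'
      CREDENTIALS = (AWS_KEY_ID = '...' AWS_SECRET_KEY = '...')
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")

# Bulk-load one table's files; PATTERN limits the load to matching keys.
cur.execute("""
    COPY INTO SALES_HISTORY
    FROM @S3_HISTORY_STAGE
    PATTERN = '.*sales_history.*[.]csv[.]gz'
    ON_ERROR = 'ABORT_STATEMENT'
""")
conn.close()
```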
Data-driven organizations know that careful data management is crucial to their success—perhaps doubly so when migrating enterprise data warehouse platforms to cloud-based platforms like Snowflake. Enterprise-level strategic and tactical decisions are driven by analytics built on top of historical data, so historical data migration must be treated with paramount importance. The best practices outlined in this post are the result of TCS’s significant experience with various migration approaches and options, and of the capabilities of our TCS Daezmo data warehouse modernization solution suite. We are proud to work with Snowflake to make data migrations as smooth as possible for our customers.
Learn more about TCS and the TCS Daezmo solution suite here. To learn more about migrating to the Snowflake Data Cloud, visit snowflake.com/migrate-to-the-cloud.