How Does Data Migration Work from Netezza to Snowflake?
Netezza launched the world's first data warehouse appliance in 2003 and went on to hold a string of coveted firsts: the world's first 100 TB data warehouse appliance in 2006 and the world's first petabyte data warehouse appliance in 2009.
The efficiency of Netezza was unmatched in its heyday, thanks to a proprietary hardware-acceleration layer of field-programmable gate arrays (FPGAs) finely tuned to process analytical queries at blazing speed and scale. These FPGAs handled data compression, row-to-column conversion, and data pruning.
During its lifetime, various Netezza models were produced, including Skimmer, TwinFin, Striper, and Mako. Its value proposition included:
- No indexing and partitioning of data
- Simplified management
- Data pruning
- Purpose-built for data warehouse and analytics
However, in the wake of the cloud revolution, IBM has largely withdrawn support for Netezza. Most models are no longer offered by IBM, and there is no new Netezza appliance; in fact, no new hardware has been released since 2014. By ending crucial product support, IBM is effectively pushing Netezza users off the appliance.
Cloud computing is the on-demand provision of computing infrastructure (servers, databases, storage, networking, and so on) that enables faster innovation, flexible resources, and economies of scale.
What Makes the Cloud Attractive?
- Lower cost — the pay-as-you-go model appeals to startups and enterprises alike, and users with variable computing needs can realize considerable IT cost savings.
- Improved reliability and uptime — most cloud services offer availability of 99.9% or better.
- Scalability — computing power can be ramped up or down at the push of a button.
- Deployment speed — businesses can build and deploy applications with near-instant access to virtually limitless computing and storage resources.
- Cheap, practically limitless storage.
- Economies of scale — the more companies share cloud services, the more widely each company's cost is amortized.
- Increased security — most cloud providers comply with industry standards such as HIPAA, PCI, ISO, and Sarbanes-Oxley.
- Disaster recovery — cloud storage lets data be backed up quickly and automatically, helping vital IT systems recover from disasters faster.
Why Use Snowflake?
Snowflake is a data warehouse built for the cloud from the ground up. It offers better efficiency, scalability, flexibility, and workgroup/workload agility than just about any other cloud-based data warehouse on the market, and it provides a safe haven for refugees from Netezza.
Snowflake also scales automatically and instantly, in a way Netezza cannot, by separating storage from compute. This is accomplished through Snowflake's unique multi-cluster, shared-data architecture. The value proposition for Snowflake is as follows:
- Zero management — no knobs to tweak and no tuning needed.
- All your data in one location — both structured and semi-structured data (JSON, Avro, XML).
- Unrestricted access for concurrent users and applications, without performance loss.
- Pay-as-you-go pricing — pay only for what you use.
- Seamless data sharing (the Data Sharehouse).
- A complete SQL database.
Netezza to Snowflake Migration Strategy
Migrating data from Netezza to Snowflake can follow one of two approaches:
- Lift and Shift
- Staged migration
Choosing between the two methods depends largely on factors such as timescale, the number of data sources, data types, and future ambitions.
Factors Favoring a Lift-and-Shift Approach:
- The data is highly integrated across the existing data warehouse.
- Timescale pressure to move off Netezza.
- Migration of a single independent, standalone data mart.
- Well-designed data and processes using standard ANSI SQL.
Factors Favoring a Staged Approach:
- The present data warehouse consists of several separate data marts that can be transferred individually over time.
- Essential data and processes within the data warehouse need re-engineering because of performance issues.
- The emphasis is on new capabilities rather than reworking legacy processes.
- The data ecosystem is changing, for example with new ELT, BI, and data visualization tools.
What Are the Migration Steps?
To migrate your data warehouse from Netezza to Snowflake successfully, you need to establish and stick to a logical plan. That plan covers the steps below.
What Is Data Model Migration?
The first stage of the data migration process is data model migration. The data model comprises databases, tables, views, sequences, user accounts, grants, functions, and other objects. At a minimum, the owner of the Netezza database must be created in Snowflake before migration can begin.
The artefacts to migrate depend in large measure on the scope of the migration. The data model can be migrated from Netezza to Snowflake in three ways:
- Using a data modeling tool — if your data warehouse architecture is stored in a modeling tool, you can use that tool to generate ANSI SQL-compliant DDL to recreate the database objects in Snowflake.
- Using existing DDL scripts — you may need to modify certain data types.
- Developing new DDL scripts by extracting metadata from Netezza with the nzdumpschema or NZ_DDL utilities — again, some data types may need to be changed.
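As an illustration of the data-type adjustments mentioned above, here is a minimal Python sketch that rewrites type names in extracted DDL. The mapping is an illustrative subset chosen for this example, not an official conversion table; every migration should verify its own type mappings against the Snowflake documentation.

```python
# Sketch: adjusting Netezza DDL data types for Snowflake.
# TYPE_MAP is an illustrative subset, not an exhaustive or official list.
import re

TYPE_MAP = {
    "BYTEINT": "TINYINT",         # Snowflake has no BYTEINT; TINYINT is a NUMBER alias
    "DATETIME": "TIMESTAMP_NTZ",  # Netezza DATETIME -> wall-clock timestamp
    "ST_GEOMETRY": "GEOGRAPHY",   # spatial types need case-by-case review
}

def convert_ddl(ddl: str) -> str:
    """Rewrite known Netezza type names in a DDL statement."""
    for nz_type, sf_type in TYPE_MAP.items():
        ddl = re.sub(rf"\b{nz_type}\b", sf_type, ddl, flags=re.IGNORECASE)
    return ddl

nz_ddl = "CREATE TABLE sales (id BYTEINT, sold_at DATETIME, amount NUMERIC(12,2))"
print(convert_ddl(nz_ddl))
# CREATE TABLE sales (id TINYINT, sold_at TIMESTAMP_NTZ, amount NUMERIC(12,2))
```

A real conversion would also handle distribution clauses and other Netezza-specific DDL syntax that has no Snowflake equivalent.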
What Is Dataset Migration?
After the data model is migrated to Snowflake, you can begin migrating historical data from Netezza to Snowflake. Depending on the size of the data, a third-party ETL tool can be used to transfer small datasets.
For tens or hundreds of terabytes of data, however, a manual process is more realistic, relying on a data transfer service such as AWS Snowball / Snowball Edge or Azure Data Box. If you have a direct link to AWS or Azure, tools such as IBM Aspera can also help with the transfer. For petabytes or exabytes of data, AWS Snowmobile or an equivalent service is the logical way to go.
For a manual move, the data must be extracted from each Netezza table into one or more delimited flat files (e.g. CSV), using external tables to create single files or the NZ_UNLOAD tool to create multiple files in parallel. The Snowflake PUT command uploads the extract files into either an internal stage or an external cloud storage bucket. Snowflake suggests file sizes between 100 MB and 1 GB for faster database loading through parallel bulk loads.
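The file-sizing recommendation above can be applied at extraction time. Below is a minimal Python sketch, under assumed file naming, that rolls extract files over once they approach a target size; real extracts should use proper CSV quoting and escaping rather than naive joins.

```python
# Sketch: splitting a table extract into delimited files sized for
# Snowflake's recommended 100 MB - 1 GB range. Names are illustrative.
import os

TARGET_BYTES = 512 * 1024 * 1024  # aim mid-range, ~512 MB per file

def write_chunks(rows, out_dir, target_bytes=TARGET_BYTES):
    """Write rows as comma-delimited lines to numbered chunk files,
    starting a new file once the current one reaches target_bytes."""
    paths, part, f, written = [], 0, None, 0
    for row in rows:
        line = ",".join(str(v) for v in row) + "\n"
        if f is None or written >= target_bytes:
            if f is not None:
                f.close()
            part += 1
            path = os.path.join(out_dir, f"extract_{part:04d}.csv")
            f = open(path, "w")
            paths.append(path)
            written = 0
        written += f.write(line)  # write() returns chars written (bytes for ASCII)
    if f is not None:
        f.close()
    return paths
```

Keeping files in this size band lets Snowflake's bulk loader spread the work across parallel load threads.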
Once the data reaches your preferred cloud provider's storage service, the Snowflake COPY command loads it into the database.
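The PUT and COPY steps can be sketched as generated statements. The stage, table, and file names below are placeholders; the statements themselves would be run through a Snowflake client such as SnowSQL.

```python
# Sketch: generating the PUT + COPY statement pair for a staged load.
# Stage, table, and path names are placeholders, not a real deployment.

def put_stmt(local_path: str, stage: str) -> str:
    # PUT uploads (and by default compresses) local files to a stage.
    return f"PUT file://{local_path} @{stage} AUTO_COMPRESS=TRUE"

def copy_stmt(table: str, stage: str) -> str:
    # COPY bulk-loads the staged files into the target table.
    return f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"

print(put_stmt("/extracts/sales_0001.csv", "my_int_stage"))
print(copy_stmt("sales", "my_int_stage"))
```

Separating the two steps means a failed load can be retried from the stage without re-uploading the extract files.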
How Are Queries and Workloads Migrated?
- Queries — thanks to Snowflake's ANSI-compliant SQL, most of your current Netezza queries can be migrated without modification. The few Netezza constructs that are not ANSI-compliant can be modified manually.
- BI tools — Snowflake supports both ODBC and JDBC, so moving BI tools should be straightforward.
- Workload management — Snowflake's multi-cluster architecture makes Netezza's workload management features redundant.
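As an illustration of the last point, concurrency in Snowflake is handled by a multi-cluster warehouse rather than Netezza-style workload queues. The sketch below emits the corresponding DDL as a string; the warehouse name, size, and limits are placeholders.

```python
# Sketch: DDL for a multi-cluster warehouse. Snowflake adds or removes
# clusters between the min and max as query concurrency changes.
# All names and limits here are illustrative placeholders.

def create_warehouse_stmt(name: str, size: str = "MEDIUM",
                          min_clusters: int = 1, max_clusters: int = 3) -> str:
    return (
        f"CREATE WAREHOUSE {name} WITH WAREHOUSE_SIZE = '{size}' "
        f"MIN_CLUSTER_COUNT = {min_clusters} MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = 'STANDARD' AUTO_SUSPEND = 300 AUTO_RESUME = TRUE"
    )

print(create_warehouse_stmt("BI_WH"))
```

Separate warehouses for BI, ETL, and ad-hoc users give each workload its own compute, which is what replaces Netezza's queue-based tuning.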
What Do ETL and Data Pipeline Migration Include?
Snowflake supports both ELT (Extract, Load, Transform) and ETL (Extract, Transform, Load), though it is highly optimized for ELT. Native connectors are available for popular ETL tools such as Ab Initio, Talend, and Informatica. For a quick comparison, it is recommended to run the data pipeline in both Netezza and Snowflake during the initial migration. The Snowpipe tool is excellent for loading data continuously as it arrives in your cloud storage.
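For illustration, continuous loading with Snowpipe is defined as a pipe wrapping a COPY statement. The sketch below emits that DDL as a string; the pipe, table, and stage names are placeholders, and AUTO_INGEST relies on event notifications from the cloud storage service.

```python
# Sketch: DDL for a Snowpipe pipe that loads new files as they land
# in a stage. Pipe, table, and stage names are placeholders.

def create_pipe_stmt(pipe: str, table: str, stage: str) -> str:
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS "
        f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = CSV)"
    )

print(create_pipe_stmt("sales_pipe", "sales", "landing_stage"))
```

Once the pipe exists, files dropped into the stage are loaded without any scheduled batch job, which suits the ELT style Snowflake favors.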
What Is Cut-Over?
After moving the data model, data, queries, and so on to Snowflake, plan the permanent switch from Netezza to Snowflake using the following actions as a guide:
- Use Snowball / Snowball Edge / IBM Aspera for a one-time extract and load of historical data.
- Establish a delta (incremental) load of new data.
- Inform all Netezza users about the cut-over and its impact.
- Check in and save all application code in a code repository, for example GitHub, Bitbucket, or GitLab.
- Re-point all Business Intelligence (BI) reports to pull data from Snowflake.
- Run Netezza and Snowflake in parallel for a few days, then compare the results and validate them.
- Turn off and decommission Netezza.
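The parallel-run validation step above can start with something as simple as comparing per-table row counts pulled from each system. A minimal sketch, using stand-in counts rather than live query results:

```python
# Sketch: validating a parallel run by diffing per-table row counts
# from Netezza and Snowflake. The counts below are stand-in data.

def diff_counts(netezza_counts: dict, snowflake_counts: dict) -> dict:
    """Return tables whose row counts disagree, or exist on one side only,
    as {table: (netezza_count, snowflake_count)}."""
    mismatches = {}
    for table in set(netezza_counts) | set(snowflake_counts):
        nz = netezza_counts.get(table)
        sf = snowflake_counts.get(table)
        if nz != sf:
            mismatches[table] = (nz, sf)
    return mismatches

nz = {"sales": 1000, "users": 500, "events": 42}
sf = {"sales": 1000, "users": 499, "events": 42}
print(diff_counts(nz, sf))  # {'users': (500, 499)}
```

Row counts catch gross load failures; checksums or sampled column aggregates are the natural next step for deeper validation.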
Netezza to Snowflake Data Migration Consultancy
Slalom is a modern consulting firm and a Snowflake solution partner with deep experience in Netezza to Snowflake migrations. With over 200 certified Snowflake consultants, Slalom provides consultancy services ranging from project management to implementation and execution. Slalom's Netezza-to-Snowflake migration experience is unparalleled: most recently, it helped Sony Interactive Entertainment (SIE) deliver the largest global data migration from Netezza to Snowflake on time and on budget.
Sony Interactive Entertainment (SIE) collects data from consumers worldwide, and data dating back to 2006 was stored on an on-premises IBM Netezza system. Customers across SIE used the data to gain insights that improved decision making and business success. The legacy platform, however, had reached its limits in both storage and processing power. As a consequence, its own success became a problem, and continuous effort was needed just to keep the lights on. Significant investment would have been required to update the existing system to meet long-term business needs.
Slalom led the team responsible for the effective preparation, management, and execution of the data migration, which culminated in the transition of all Sony consumer data from the legacy storage platform to Snowflake, a cloud-based enterprise data warehouse. The 10-month migration project was completed with only two days of disruption to end users and almost no effect on business as usual.
What Are the Project Highlights?
- Began with a Discovery phase, followed by four more delivery phases
- Delivered a successful migration, going live 10 hours ahead of schedule
- One of Snowflake's largest full data migrations worldwide
What Are the Migration Statistics?
- SIE went live on Snowflake with 322 TB of data on day 1
- 35 TB of data transferred per 24-hour period
- Size of the largest table: 21 TB
- 10 trillion rows of data migrated
- 30 production databases
- 58,000 database tables
- 20,000 database views migrated
Snowflake Tool Use in Business Intelligence