While organizations are benefiting from the performance, flexibility, and cost savings offered by the cloud, many enterprises have struggled with their data migration initiatives. Cloud data migration can be fraught with business risk: disruption of critical business operations, data loss, and project complexity that often results in cost overruns or failed initiatives. According to Gartner, “Through 2022, more than 50% of data migration initiatives will exceed their budget and timeline—and potentially harm the business—because of flawed strategy and execution.”
These issues are particularly acute with legacy data migration approaches and constitute what we call the “data migration gap”: how can organizations migrate petabytes of business-critical, actively changing customer data without causing business disruption, while minimizing the time, cost, and risk associated with legacy data migration approaches?
Legacy data migration approaches typically fall into the following categories.
Lift and Shift
A lift and shift approach migrates applications and data from one environment to another with zero or minimal changes. It is probably the most common approach, as companies perceive it to be straightforward. However, it is simplistic and risk-laden because it assumes that no changes are needed, and as a result a majority of these projects fail. Instead of gaining the efficiencies promised by the cloud, organizations leave new capabilities unused and transfer the shortcomings of their existing on-premises implementation into the new cloud environment.
Summary of issues:
- Assumes no changes are needed, making the approach simplistic and risk-laden
- Shortcomings of the existing on-premises implementation are carried into the new cloud environment
- New cloud capabilities and efficiencies go unused
- A majority of these projects fail
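As a rough illustration, a lift and shift amounts to a single one-shot copy with no transformation. The sketch below uses hypothetical local directories in place of a real on-premises store and cloud target:

    import shutil
    from pathlib import Path

    # Hypothetical stand-ins for an on-premises store and a cloud target.
    SOURCE = Path("/data/source")
    TARGET = Path("/data/target")

    def lift_and_shift(source: Path, target: Path) -> None:
        """Copy everything once, unchanged. Any data written to the source
        after the copy begins is simply missed, which is the core weakness
        of the approach when the data set is under active change."""
        shutil.copytree(source, target, dirs_exist_ok=True)

    if __name__ == "__main__":
        lift_and_shift(SOURCE, TARGET)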
Incremental Copy
An incremental copy approach periodically copies new and modified data from the source to the target environment using multiple passes over the source data. It requires that all of the original data first be migrated from the source system to the target, with incremental changes processed on each subsequent pass. The problem is that with a large volume of actively changing data, it may be impossible to ever catch up and complete the migration without requiring downtime.
Summary of issues:
- Requires multiple passes over the source data
- With a large volume of changing data, the migration may never catch up
- Completing the migration may still require downtime
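A minimal sketch of the incremental copy loop follows, using hypothetical local directories and file modification times to identify changed data. Note that the loop only terminates when a pass copies nothing, which under heavy change may never happen:

    import shutil
    import time
    from pathlib import Path

    # Hypothetical stand-ins for the source and target environments.
    SOURCE = Path("/data/source")
    TARGET = Path("/data/target")

    def incremental_pass(source: Path, target: Path, since: float) -> int:
        """Copy files created or modified after `since`; return how many
        files this pass copied. Deletions are not handled in this sketch."""
        copied = 0
        for src in source.rglob("*"):
            if src.is_file() and src.stat().st_mtime > since:
                dst = target / src.relative_to(source)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                copied += 1
        return copied

    def migrate(source: Path, target: Path) -> None:
        since = 0.0  # the first pass copies everything
        while True:
            pass_started = time.time()
            copied = incremental_pass(source, target, since)
            since = pass_started
            if copied == 0:
                break  # converged; under heavy change this may never occur

    if __name__ == "__main__":
        migrate(SOURCE, TARGET)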
Dual Pipeline / Ingest
A dual pipeline or dual ingest approach ingests new data into both the source and target environments simultaneously. It takes significant effort to develop, test, operate, and maintain the duplicate pipelines, and every application must be modified to update both environments whenever it changes data.
Summary of issues:
- Significant effort to develop, test, operate, and maintain duplicate pipelines
- Every application must be modified to update both source and target environments on every data change
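The sketch below shows the kind of change dual ingest forces into every application write path; the directories and the dual_write helper are hypothetical stand-ins for real on-premises and cloud endpoints:

    from pathlib import Path

    # Hypothetical stand-ins for the two environments.
    SOURCE = Path("/data/source")
    TARGET = Path("/data/cloud-target")

    def dual_write(relative_path: str, payload: bytes) -> None:
        """Duplicate every application write to both environments. Each
        existing application needs a change like this, and a failure of
        either write leaves the two sides inconsistent."""
        for root in (SOURCE, TARGET):
            dst = root / relative_path
            dst.parent.mkdir(parents=True, exist_ok=True)
            dst.write_bytes(payload)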
If the migration technology does not need to re-compare data that were modified during an extended period of data transfer, it can eliminate the gap between source and target and maintain the target as a current replica of the source data. This requires combining change event information with the activity of scanning the existing data, and an intelligent approach to bringing those two streams of information together so that the correct actions are taken against the migration target. This is not just change data capture: it is the use of a continuous stream of change notifications alongside a scan of the very data that are undergoing change.
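The following sketch illustrates that merge logic, assuming an in-memory queue of change notifications and a simple replicate placeholder in place of a real transfer engine; it is an illustration of the technique, not any particular product's implementation. A change event for an item the scan has already transferred is replayed on the target, while an event for an item not yet scanned is skipped, because the scan will transfer its latest state when it reaches it:

    import queue
    import threading
    from dataclasses import dataclass

    @dataclass
    class ChangeEvent:
        path: str  # source item that changed
        kind: str  # "create", "modify", or "delete"

    def replicate(path: str) -> None:
        """Placeholder for transferring one item's current state to the target."""
        print(f"replicating {path}")

    def live_migrate(existing: list, events: queue.Queue) -> None:
        scanned: set = set()
        lock = threading.Lock()
        scan_done = threading.Event()

        def scan() -> None:
            # Single pass over the data that already exists on the source.
            for path in sorted(existing):
                replicate(path)
                with lock:
                    scanned.add(path)
            scan_done.set()

        def consume() -> None:
            # Continuous stream of change notifications from the source.
            while True:
                event = events.get()
                if event is None:  # sentinel: stream closed (demo only)
                    return
                with lock:
                    already = event.path in scanned
                if already or scan_done.is_set():
                    replicate(event.path)  # replay the change on the target
                # Otherwise skip it: the scan has not reached this item
                # yet and will pick up its latest state when it does.

        threads = [threading.Thread(target=scan), threading.Thread(target=consume)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    if __name__ == "__main__":
        stream: queue.Queue = queue.Queue()
        stream.put(ChangeEvent("/data/accounts/a.csv", "modify"))
        stream.put(None)  # close the stream so the demo terminates
        live_migrate(["/data/accounts/a.csv", "/data/accounts/b.csv"], stream)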
This approach is what we call a live data strategy, because it enables data migrations to be performed even while the data sets are under active change. A live data strategy addresses the challenges of the legacy data migration approaches by enabling migrations without application downtime or business disruption. It supports data migration and replication use cases regardless of data volume or rate of change, and it is the only approach able to cost-effectively manage large-scale migrations while large amounts of data change are occurring.