Data Migration – From A To B and everything in-between
Any 'normal' person could be forgiven for thinking 'data migration' is just fancy jargon for 'copy and paste'. By 'normal', I mean people without programming superpowers.
And to be honest, it is. Shift data from A to B.
However, it is not as straightforward as one might think. This blog post is just a quick insight into "data migration" without all its technicalities.
A successful migration of data implies that the legacy system – the old system – can be cut off completely so that the new system can give better response times. If the old system cannot be switched off, neither the new system nor the legacy system can be used to its full potential, and hence the project can be considered failed. Large-scale data migration is a significant project, or at least should be treated as such – especially considering that 83% of data-migration projects either fail or exceed their budgets and schedules.
And that raises the question: WHY?!
Why migration projects can fail
Most data migration projects, like any other projects, are expected to succeed at their inception. A significant part of the blame for failures lies with misconceptions and inaccurate assumptions about the effort required. For example, an assumption that the data in the current system is complete and accurate could lead to complications later in the project. Or thinking that an in-house IT department can migrate the whole database cleanly may result in having to perform the data migration a second time, properly. (No offense, IT department. You are super! <3 )
Most of the complications in migration projects surface during the last quarter of the project, when millions of data units are being transferred into the new system. All the 'perfect' data may get transferred without a hitch, but with data units in the millions, the probability of everything running smoothly is much closer to zero than one would like. New adjustments will have to be made. And we cannot tell if the next run will be without errors. Run, fix, repeat. It takes (unexpected) time. Hello Ms. Over-budget and Mr. Behind-schedule.
Or an unexpected hardware failure. Or maybe the database server is down. Or a nuclear war… Okay! I guess you get the point… Data migration. Not as simple as it first sounds. A lot of things could go wrong.
So the next questions: What could be done? Is there a happy ending?
Yes, there is!
What are the secret ingredients?
First things first. Business stakeholders (and not only the people with programming superpowers) must get involved enough in the project to understand the amount of resources that must be invested in order to achieve the expected quality. It is not just about migrating data from A to B but about making the data fit the new system. In large-scale data migration, it is almost never the case that data units exist independently. They are almost always interconnected. And that network may look very different in the target system than in the legacy system.
Short story shorter, business stakeholders need to understand that it is not an IT problem. Now to the secret ingredients.
Extract. Transform. Load.
Extract. Data extraction is not about legacyDatabase.getEverything(). Understanding what to extract (and why) is just as important. What can the new system do that the legacy system could not? Practical discussions need to be held on what to migrate and what to leave behind, on the true state and consistency of the legacy data compared to what has been documented, and on the expected quality of data in the target system. A data-profiling approach helps in truly understanding the data and exposes many risks early in the project. It also helps in making an informed estimate of the time and effort needed to migrate the data successfully.
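To make "data profiling" a little less abstract, here is a minimal sketch in Python. Everything in it is made up for illustration: the `profile_column` helper, the `legacy_rows` sample and its column names are assumptions, not part of any real system. The idea is simply to count what is actually in a column before anyone promises that the data is "complete and accurate".

```python
from collections import Counter


def is_missing(value):
    """Treat None and blank strings as missing (an assumption for this sketch)."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Summarise one column: row count, missing values, distinct values, top values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical legacy records, standing in for a real extract.
legacy_rows = [
    {"id": 1, "country": "DE", "email": "a@example.com"},
    {"id": 2, "country": "de", "email": ""},
    {"id": 3, "country": "Germany", "email": None},
    {"id": 4, "country": "DE", "email": "d@example.com"},
]

for column in ("country", "email"):
    print(column, profile_column(legacy_rows, column))
```

Even on four rows, the profile shows three spellings of the same country and half the e-mail addresses missing. On millions of rows, that kind of finding is exactly what should feed the time and effort estimate.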
Transform. The target system most probably has a different structure than the legacy system. A square peg cannot fit in a round hole. And hence the migration. Data needs to be cleaned, enriched, modified – in short, transformed. Transforming data means making it fit the new system purposefully. More questions will crop up once the implementation starts. What, why, why not, who, how – the most likely way to get answers is by asking questions. Just make sure they are asked early enough.
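As a rough illustration of "cleaned, enriched, modified", the sketch below maps one legacy record to a target shape. The field names, the `COUNTRY_CODES` mapping and the `transform` function are all hypothetical; a real project would derive them from the discussions with the business side, not from a blog post.

```python
# Hypothetical mapping from free-text legacy values to target country codes.
COUNTRY_CODES = {"de": "DE", "germany": "DE", "deutschland": "DE"}


def transform(record):
    """Map one legacy record to the (assumed) structure of the target system."""
    # Clean: trim whitespace and split the single legacy name field in two.
    first, _, last = record["name"].strip().partition(" ")
    # Modify: normalise the country; unknown values become None for later review.
    country_raw = (record.get("country") or "").strip().lower()
    return {
        "customer_id": int(record["id"]),
        "first_name": first,
        "last_name": last.strip(),
        "country_code": COUNTRY_CODES.get(country_raw),
    }


print(transform({"id": "7", "name": " Ada Lovelace ", "country": "Germany"}))
```

Note the deliberate choice to return None for an unknown country instead of guessing. Whether that is acceptable in the target system is one of those questions that should be asked early enough.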
Load. The only thing left now is to load (or store) the data in the target system. All the low-hanging fruit will have been harvested. Even after doing everything as correctly as possible, it is a valid presumption that a small percentage of the data will cause unforeseeable problems that might not have simple solutions. Programming superpowers must be used. Brains must be stormed. Time must be invested. It is inevitable. However, if the project was well thought through, the time, resources and plan to address this very situation will have been built into the project. That means: Hello Mr. Still-on-time and Ms. Calculated.
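One way to plan for that stubborn small percentage is to let the load step set failing records aside instead of aborting the whole run. The sketch below assumes an in-memory SQLite database as a stand-in for the target system; the `customer` table, the sample records and the `load` function are invented for illustration.

```python
import sqlite3


def load(connection, records):
    """Insert records one by one; collect failures instead of aborting the run."""
    rejected = []
    for record in records:
        try:
            # The connection context manager commits on success
            # and rolls back if the statement raises.
            with connection:
                connection.execute(
                    "INSERT INTO customer (customer_id, last_name) VALUES (?, ?)",
                    (record["customer_id"], record["last_name"]),
                )
        except sqlite3.Error as error:
            rejected.append((record, str(error)))
    return rejected


# Stand-in target system with two constraints the data must satisfy.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)

records = [
    {"customer_id": 1, "last_name": "Lovelace"},
    {"customer_id": 1, "last_name": "Duplicate"},  # violates the primary key
    {"customer_id": 2, "last_name": None},         # violates NOT NULL
    {"customer_id": 3, "last_name": "Hopper"},
]

rejected = load(connection, records)
loaded = connection.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(f"loaded={loaded}, rejected={len(rejected)}")
for record, reason in rejected:
    print("needs attention:", record, "->", reason)
```

The rejected list is the run-fix-repeat backlog: the good records are already in, and the superpowers can be spent on the handful that are not. Committing row by row is slow at real volumes, so batching would be the obvious next refinement.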
To the business-stakeholders – it is serious.
To the people with superpowers – it is fun.*
There are any number of other factors that influence the success or failure of a data-migration – or any – project. Large system upgrades, system implementations, new management, new regulations, mergers: whatever the reason for migrating data at enterprise scale, it is important to treat data migration as a long process – a crucial project and not a one-time event.
Undervalued, underestimated complexity, underestimated risk, underestimated effort, insufficient preparation, misconception. Those are the words that come to my mind when I think about how other people think about data migration. One might think I am getting a little emotional about data migration.
Very likely. As I might end up writing a thesis around this topic. :)
Thanks for reading.