I have been working on a data migration project for PinpointCare (pinpointcare.com) for about 10 months now. I will try to describe it from as high of an altitude that I can.
So, what is the purpose of this project?
We currently have both version 1 and version 2 of our software in production. Version 1 is written in PHP (backend), KnockoutJS (frontend), and uses MySQL as it’s data store. Version 2 is written in C# .NET (backend), AngularJS (frontend), and uses SQL Server as it’s data store. The purpose was to move all Version 1 data into our Version 2 system.
So, how did we accomplish this complex task?
First, Version 1 is based on a relational model that was designed with one customer at hand. Of course, we didn’t design our database with no foresight of having more than one customer. I am just stating a fact. We were designing this version with our first customer asking for specific functionality. We were learning as we were building. We were building our mental model of what the requirements were as we were building this Version 1, a first attempt at solving our customer’s problems. Fred Brooks, author of The Mythical Manmonth (first published in 1975 and then republished in 1995), said “Build one, then throw one away.” Well, no one really has a chance to do that in the real world, but we did. We did because we wanted to build a more scalable and more secure product. We could not do that with PHP.
Once we had the Version 1 in the hands of a customer, we were able to get about 25 more in rapid succession. In the meantime, we built a separate team working on a completely new code base for Version 2. Version 2 was to be completely rewritten with a different software architecture and database design. We used Entity Framework Code-First, a Microsoft .NET technology which emphasizes coding first before database design. Needless to say, this methodology resulted in the database design of our Version 2 application being even more different than that of Version 1, compounded by allowing the code to drive the database design. This resulted in a low priority work item for data migration for Version 1 patients and careplans (We have a care coordination platform. See pinpointcare.com for more on what we do.) into our new Version 2 software.
One reason (for this lower priority designation of something so desired by our customers) is that it was now a complex issue, and the priority of our Version 2 resources was still focused on the delivery of Version 2 into production.
But, how did we solve this complex migration issue once Version 2 was already put into production?
After some brainstorming, we decided on a hybrid approach. This involved migrating part of our data values from MySQL to SQL Server and capturing the other part of our data (data which populated surveys or assessments with data elements formatted for the screen and printing) as .pdf screen captures. We did the .pdf screen capture via use of existing open source libraries and custom programming.
I created a dump of our Version 1 MySQL database into a .csv file, flatting out all the fields so that all the data could populate one .csv file. We then used our interface software component (which we used both in Version 1 and Version 2 for HL7 interfaces) to read the .csv extract that I had created and build API calls to Version 2 to supply the data. The Version 2 migration API would insert the Version 1 data into the Version 2 data structure. It would also insert the generated .pdfs (for the screen captures) into a BLOB-typed field in the Version 2 data structure. The Version 2 application was was then modified to recognize Version 1 data and to display Version 1 .pdfs when appropriate. The .pdfs were read-only and that was ok based on the requirements of what we needed.
I hope you found this anecdote interesting. We are still in the process of finishing this project. We have a variety of people working on this with different skill sets, but it is looking like we are nearing the finish line.