Data migration is the process of moving data from one location to another, one format to another, or one application to another. It is typically triggered by a database upgrade, the setup of a new data warehouse, or the need to merge in data from an acquisition or other sources. Data migration is also necessary when deploying a new system that sits alongside existing applications. Ideally, the data is moved with no data loss and with minimal manual manipulation or re-creation.

Before a migration, the source must be explored and assessed: understanding the existing system and application is the key to a successful migration. Next, the migration is designed, which involves drawing out the technical architecture of the solution and detailing the migration processes. To ensure a successful implementation, an iterative workflow should be adopted: the solution is implemented in stages so that it can be tested continuously. Once the implementation is finished, it should be tested again with real data to uncover any shortcomings or data anomalies; if needed, the scripts are updated and retested. Lastly, depending on the project, the data can be transferred in a limited time window or in phases.
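As an illustration of the staged, testable approach described above, here is a minimal sketch of a batch-wise transfer followed by a simple row-count check. It is not the project's actual code: the table names (`legacy_customers`, `customers`), the column names, and the helper functions are hypothetical, and the standard-library `sqlite3` module stands in for Postgres so that the example runs without a database server.

```python
import sqlite3


def migrate_in_batches(source, target, transform, batch_size=2):
    """Copy rows from source to target in batches, committing after each one.

    Committing per batch means a failure only affects the current batch,
    which suits a phased transfer. Table and column names are hypothetical.
    """
    cursor = source.execute("SELECT id, name FROM legacy_customers ORDER BY id")
    migrated = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(
            "INSERT INTO customers (id, full_name) VALUES (?, ?)",
            [transform(row) for row in rows],
        )
        target.commit()
        migrated += len(rows)
    return migrated


def verify_counts(source, target):
    """Basic post-migration check: both sides hold the same number of rows."""
    src = source.execute("SELECT COUNT(*) FROM legacy_customers").fetchone()[0]
    dst = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    return src == dst


def demo():
    # In-memory SQLite databases stand in for the old and new systems.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE legacy_customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    source.executemany(
        "INSERT INTO legacy_customers VALUES (?, ?)",
        [(1, " alice "), (2, "BOB"), (3, "carol")],
    )
    source.commit()
    target.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT)"
    )

    # Example transformation: trim whitespace and normalize capitalization.
    migrated = migrate_in_batches(
        source, target, lambda row: (row[0], row[1].strip().title())
    )
    names = target.execute(
        "SELECT full_name FROM customers ORDER BY id"
    ).fetchall()
    return migrated, verify_counts(source, target), names


if __name__ == "__main__":
    print(demo())
```

A row-count comparison is only the weakest useful check; in practice it would likely be complemented by per-column checksums or spot checks against real data, as the testing step above suggests.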

The goal of this project was to migrate data from the old system into the new one. A major challenge was analyzing the old system and reverse engineering its logic. The data also had to be analyzed, cleaned, and transformed, which was done using various algorithms.
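The source does not say which cleaning and transformation algorithms were used, so the following is only a hypothetical sketch of what such a step can look like in Python. The field names (`id`, `email`, `signup`), the list of date formats, and the rejection rules are all assumptions made for illustration.

```python
from datetime import datetime

# Assumed legacy date formats; a real project would derive these
# from an analysis of the old system's data.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def parse_date(value):
    """Try each known format; return an ISO date string, or None if none match."""
    if value is None:
        return None
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_record(raw):
    """Normalize one legacy record (hypothetical fields) into the target shape."""
    email = (raw.get("email") or "").strip().lower()
    return {
        "id": int(raw["id"]),
        "email": email or None,
        "signup_date": parse_date(raw.get("signup")),
    }


def clean_all(records):
    """Split records into cleaned rows and rejected originals.

    A record is rejected if its id was already seen (duplicate) or if it
    has no usable email. Rejected rows are kept for manual review rather
    than silently dropped.
    """
    cleaned, rejected = [], []
    seen_ids = set()
    for raw in records:
        record = clean_record(raw)
        if record["id"] in seen_ids or record["email"] is None:
            rejected.append(raw)
            continue
        seen_ids.add(record["id"])
        cleaned.append(record)
    return cleaned, rejected
```

Keeping the rejected rows alongside the cleaned ones makes it easier to spot data anomalies during test runs and to adjust the scripts before the final transfer.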

Tools used: Python, Postgres, Docker