Data Migration from On Premise Oracle Database to SQL Managed Instance on Azure Cloud using Azure Data Factory - A Working Approach

Milind Jadhav, Dr. Amol Goje, Jitendra Chavan


Data migration has become an important concern as organizations move data from on-premises databases to cloud storage or cloud databases. In this paper we present a working case study in which Azure Data Factory, a cloud-based ETL tool, is used to migrate an organization's data from an on-premises Oracle database to a cloud-based SQL Managed Instance database. The paper evaluates the implementation of the data migration process both in general and with respect to the specific tools and technologies involved when Azure Data Factory is used. When an organization needs to move its application to the cloud, the essentials of data migration must be discussed, and a proper architecture is required to break the migration down into individual tasks. A proof of concept should be established to verify that data is not truncated or altered during migration and that the logic existing on the on-premises database still works after the data has moved to the cloud. We also discuss the encryption process involved in migration, an important aspect because data must be migrated with the algorithms already in use in the on-premises database, and we describe its implementation while data is moved with Azure Data Factory. In Oracle, encryption algorithms are used to store sensitive user data, so the existing encryption/decryption process has to be analyzed and an architecture implemented with the help of the data migration tool so that the data remains intact after the move. The developer has to devise testing strategies to compare the on-premises data with the data moved to the cloud. Azure Data Factory is a powerful cloud ETL tool that can move data from hundreds of tables at a time to a new cloud database with high data transfer throughput. The migration process requires thorough evaluation of multiple factors, e.g. the actual size of the tables being migrated, throughput, the virtual machine used for data transfer, and network bandwidth.
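As an illustration of the kind of testing strategy mentioned above, the following minimal Python sketch compares a source table with its migrated copy using a row count and an order-independent checksum. It is not the procedure used in the case study; the function names `table_fingerprint` and `compare_tables` are hypothetical. The rows are assumed to have already been fetched from each database as tuples, and the values are assumed to render to the same text on both sides, which in practice may require normalising numeric and date types returned by the Oracle and SQL Server drivers.

```python
import hashlib
from typing import Iterable, Sequence, Tuple


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, order-independent digest) for a table's rows."""
    count = 0
    combined = 0
    for row in rows:
        # Render every value as text; NULL gets an explicit marker so that
        # it is not confused with an empty string.
        normalised = "\x1f".join("\\N" if v is None else str(v) for v in row)
        digest = hashlib.sha256(normalised.encode("utf-8")).digest()
        # Adding the row digests modulo 2**256 makes the result independent
        # of row order while still accounting for duplicate rows.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, f"{combined:064x}"


def compare_tables(source_rows: Iterable[Sequence[object]],
                   target_rows: Iterable[Sequence[object]]) -> bool:
    """True when both sides have the same row count and the same checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)


if __name__ == "__main__":
    # Hypothetical sample data standing in for rows read from each database.
    source = [(1, "Alice", None), (2, "Bob", "x")]
    migrated = [(2, "Bob", "x"), (1, "Alice", None)]   # same rows, new order
    truncated = [(1, "Alic", None), (2, "Bob", "x")]   # one value truncated
    print(compare_tables(source, migrated))
    print(compare_tables(source, truncated))
```

A fingerprint of this kind only reports that a table differs, not which rows differ, so a mismatch would still need a row-level comparison on the affected table.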