Migrating data from Oracle to PostgreSQL
Until recently, the “landscape” of the DBMS in the projects of our company looked like this: the majority were Oracle, the substantially smaller ones were MS SQL and MySQL.
But, as you know, there is nothing eternal, and recently we received a request about the applicability of Postgres in one of our projects. In the last couple of years, we have looked closely at this DBMS very closely - we attended conferences, meetups, but until recently it has not been possible to try it in "combat" conditions.
Given: Oracle server (single instance) 11.2.0.3 and a set of unrelated schemes with a total capacity of ~ 50GB. Required: Transfer data, indexes, primary and reference keys from Oracle to Postgres.
A review of migration tools revealed the availability of both commercial tools, such as the Enterprise DB Migration Toolkit and Oracle Golden Gate, as well as free software. The translation was planned once, so a mature tool was required, but it was clear and simple. In addition, of course, the issue of value was taken into account. Of the free software, the most mature today is the Ora2Pg project by Gilles Darold (Darold Gill), he also greatly surpassed the commercial options in functionality. Advantages that tip the scales in his direction:
The principle of operation of the Ora2Pg command-line utility is quite simple: it connects to the Oracle database, scans the circuit specified in the configuration file and unloads the circuit objects in the form of DDL instructions into sql files. The data itself can be either uploaded as INSERTs to the sql file or inserted directly into the created Postgres DBMS tables.
At the company, we use the DevOps approach to create virtual machines, install the necessary software, configure and deploy software. Our working tool is Ansible. But in order to facilitate perception and not introduce new entities that are irrelevant to the article, we will continue to show manual actions from the command line. For those who are interested, we post Ansible playbook for all the steps here .
So, on the virtual machine with OS Centos 6.6, we will perform the following steps.
All further actions will be performed from under the account
Register Postgres as a service and run it:
To install Oracle instant client, you must download the following packages from OTN:
Install them:
Create a folder for
Set the following environment variables (in the user's .bash_profile):
And check the performance.
If everything is ok, then we get something like this:
The last step of the setup remained - installation of Ora2pg. Download the latest version of Ora2Pg from the site (at the time of writing, there was version 15.2). Install the necessary packages:
Install the CPan module:
Install additional modules for Perl:
Unzip Ora2pg into, say, / install:
Build Ora2pg:
Postgres DBMS is by its “spirit” the closest to Oracle. In both the data types are correlated well, and there, and there is such a thing as a schema. We will take advantage of this and will transfer the data “by the way”. The migration process will consist of the following steps.
All subsequent actions will be performed by the postgres user.
Create a migration project. The project consists of a set of tables / functions / views / packages folders, which will contain sql files with DDL of the corresponding objects, the ora2pg.conf configuration file, and the startup script export_schema.sh.
The Ora2pg configuration file is quite voluminous, and I will focus only on those parameters that are root or were required during the migration of our data. About the rest, I recommend learning from this article .
The section that describes the parameters for connecting to the Oracle database:
Section that describes which scheme we upload:
And an indication in which circuit we load:
Specify the type of export. The parameter
Section that describes the parameters for connecting to the Postgres database:
Section for converting data types. To ensure that the type
This completes the configuration steps, and we are ready to proceed with the migration. Unload the schema descriptions as a set of sql files with DDL objects:
Let's create the qqq database, the user test_owner, and issue the necessary rights.
Let's import the sql file from DDL tables:
Now you are ready to copy data. We launch:
Despite the fact that on the command line we specify a parameter
The last step left is to create indexes and constraints.
If during the execution of the previous commands you didn’t get errors - congratulations, the migration was successful! But, as is known from Murphy's Law: "Anything that can go wrong will go wrong."
The first pitfall has already been mentioned above: the type
The next challenge was that Postgres does not have a type similar to Oracle anydata. In this regard, we were forced to analyze and correct the application logic, to the detriment of flexibility, to convert it to "suitable" types, for example, c
The Ora2Pg utility, despite the complexity of the setup, is simple and reliable to use. It can be safely recommended for the migration of small and medium databases. By the way, its author at PGConf Russia announced that it is starting the MS2Pg project. It sounds promising.
Successful migrations!
But, as you know, there is nothing eternal, and recently we received a request about the applicability of Postgres in one of our projects. In the last couple of years, we have looked closely at this DBMS very closely - we attended conferences, meetups, but until recently it has not been possible to try it in "combat" conditions.
So the challenge
Given: Oracle server (single instance) 11.2.0.3 and a set of unrelated schemes with a total capacity of ~ 50GB. Required: Transfer data, indexes, primary and reference keys from Oracle to Postgres.
Choosing a Migration Tool
A review of migration tools revealed the availability of both commercial tools, such as the Enterprise DB Migration Toolkit and Oracle Golden Gate, as well as free software. The translation was planned once, so a mature tool was required, but it was clear and simple. In addition, of course, the issue of value was taken into account. Of the free software, the most mature today is the Ora2Pg project by Gilles Darold (Darold Gill), he also greatly surpassed the commercial options in functionality. Advantages that tip the scales in his direction:
- rich functionality;
- active development of the project (15 years of development, 15 major releases).
The principle of operation of the Ora2Pg command-line utility is quite simple: it connects to the Oracle database, scans the circuit specified in the configuration file and unloads the circuit objects in the form of DDL instructions into sql files. The data itself can be either uploaded as INSERTs to the sql file or inserted directly into the created Postgres DBMS tables.
Installation and configuration of the environment
At the company, we use the DevOps approach to create virtual machines, install the necessary software, configure and deploy software. Our working tool is Ansible. But in order to facilitate perception and not introduce new entities that are irrelevant to the article, we will continue to show manual actions from the command line. For those who are interested, we post Ansible playbook for all the steps here .
So, on the virtual machine with OS Centos 6.6, we will perform the following steps.
- Install the Postgres repository.
- Install Postgres 9.4 server.
- Create a database and configure access.
- Install Postgres as a service and run it.
- Install the instant Oracle client.
- Install the Ora2Pg utility.
All further actions will be performed from under the account
root
. Install the repository:
Install Postgres 9.4:
Create a Postgres cluster:
Setting access means that we specifically lower the security of the Postgres connection for easy testing. Of course, in the production environment, we do not recommend doing this.
The file /var/lib/pgsql/9.4/data/postgresql.conf needs to uncomment the line . In the file /var/lib/pgsql/9.4/data/pg_hba.conf for local and remote connections, you must set the method . The section after editing looks like this:#yum install yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-1.noarch.rpm
#yum install postgresql94-server
#service postgresql-9.4 initdb
listen_addresses = '*'
trust
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all all trust
Register Postgres as a service and run it:
#chkconfig postgresql-9.4 on
#service postgresql-9.4 restart
To install Oracle instant client, you must download the following packages from OTN:
oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
oracle-instantclient11.2-sqlplus-11.2.0.4.0-1.x86_64.rpm
oracle-instantclient11.2-jdbc-11.2.0.4.0-1.x86_64.rpm
Install them:
#yum install /tmp/oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
#yum install /tmp/oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
#yum install /tmp/oracle-instantclient11.2-sqlplus-11.2.0.4.0-1.x86_64.rpm
#yum install /tmp/oracle-instantclient11.2-jdbc-11.2.0.4.0-1.x86_64.rpm
Create a folder for
tnsnames.ora
:#mkdir -p /usr/lib/oracle/11.2/client64/network/admin
#chmod 755 /usr/lib/oracle/11.2/client64/network/admin
Set the following environment variables (in the user's .bash_profile):
export ORACLE_HOME=/usr/lib/oracle/11.2/client64
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export TNS_ADMIN=$ORACLE_HOME/network/admin
And check the performance.
sqlplus system/@host.domain.ru/SERVICE
If everything is ok, then we get something like this:
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL>
The last step of the setup remained - installation of Ora2pg. Download the latest version of Ora2Pg from the site (at the time of writing, there was version 15.2). Install the necessary packages:
#yum install gcc cpan postgresql94-plperl postgresql94-devel
Install the CPan module:
#cpan
Install additional modules for Perl:
#cpan Test::Simple DBI Time::HiRes DBD::Oracle DBD::Pg
Unzip Ora2pg into, say, / install:
#cd /install
#tar -xvf ora2pg-15.2.tar.gz
Build Ora2pg:
#perl Makefile.PL
#make
#make install
Migration
Postgres DBMS is by its “spirit” the closest to Oracle. In both the data types are correlated well, and there, and there is such a thing as a schema. We will take advantage of this and will transfer the data “by the way”. The migration process will consist of the following steps.
- Creating a migration project using Ora2pg.
- Editing the ora2pg.conf configuration file.
- Unloading DDL tables, indexes, constraints from Oracle.
- Creating a database in Postgres.
- Import DDL tables prepared in step 3.
- Copy data.
- Import DDL indexes and constraints.
All subsequent actions will be performed by the postgres user.
#su -l postgres
Create a migration project. The project consists of a set of tables / functions / views / packages folders, which will contain sql files with DDL of the corresponding objects, the ora2pg.conf configuration file, and the startup script export_schema.sh.
$ora2pg --init_project my_project_name
$cd my_project_home
$vi config/ora2pg.conf
Configuration
The Ora2pg configuration file is quite voluminous, and I will focus only on those parameters that are root or were required during the migration of our data. About the rest, I recommend learning from this article .
The section that describes the parameters for connecting to the Oracle database:
ORACLE_HOME /usr/lib/oracle/11.2/client64
ORACLE_DSN dbi:Oracle:host=oracle_host.domain.ru;sid=
ORACLE_USER SYSTEM
ORACLE_PWD MANAGER
Section that describes which scheme we upload:
EXPORT_SCHEMA 1
SCHEMA TST_OWNER
And an indication in which circuit we load:
PG_SCHEMA tst_owner
Specify the type of export. The parameter
COPY
says that we will copy data directly from Oracle to Postgres, bypassing the text file.TYPE TABLE,COPY
Section that describes the parameters for connecting to the Postgres database:
PG_DSN dbi:Pg:dbname=qqq;host=localhost;port=5432
PG_USER tst_owner
PG_PWD tst_onwer
Section for converting data types. To ensure that the type
number()
without precision is not converted to bigint
, we indicate:DEFAULT_NUMERIC numeric
This completes the configuration steps, and we are ready to proceed with the migration. Unload the schema descriptions as a set of sql files with DDL objects:
$./export_schema.sh
Let's create the qqq database, the user test_owner, and issue the necessary rights.
$psql
postgres=#create database qqq;
CREATE DATABASE
postgres=#create user test_owner password ‘test_owner’;
CREATE ROLE
postgres=#grant all on database qqq to test_owner;
GRANT
postgres=#\q
Let's import the sql file from DDL tables:
$psql -d qqq -U test_owner < schema/tables/table.sql
Now you are ready to copy data. We launch:
$ora2pg -t COPY -o data.sql -b ./data -c ./config/ora2pg.conf
Despite the fact that on the command line we specify a parameter
-о
with the name of the file to which the upload should be saved, data is inserted directly from Oracle to Postgres. In our case, the insertion speed was about 6 thousand lines per second, but this, of course, depends on the types of data being copied and the surrounding infrastructure. The last step left is to create indexes and constraints.
$psql -d qqq -U test_owner < schema/tables/INDEXES_table.sql
$psql -d qqq -U test_owner < schema/tables/CONSTRAINTS_table.sql
If during the execution of the previous commands you didn’t get errors - congratulations, the migration was successful! But, as is known from Murphy's Law: "Anything that can go wrong will go wrong."
Our pitfalls
The first pitfall has already been mentioned above: the type
number()
without accuracy is converted to bigint
, but this can easily be fixed with the correct configuration. The next challenge was that Postgres does not have a type similar to Oracle anydata. In this regard, we were forced to analyze and correct the application logic, to the detriment of flexibility, to convert it to "suitable" types, for example, c
varchar2(100)
. In addition, if you have any custom types, then everything will have to be redone, since they are not broadcast, but this is a topic for at least a separate article.To summarize
The Ora2Pg utility, despite the complexity of the setup, is simple and reliable to use. It can be safely recommended for the migration of small and medium databases. By the way, its author at PGConf Russia announced that it is starting the MS2Pg project. It sounds promising.
Successful migrations!