Migrating data from Oracle to PostgreSQL

    Until recently, the DBMS “landscape” in our company's projects looked like this: the majority ran on Oracle, with a substantially smaller share on MS SQL and MySQL.

    But, as you know, nothing lasts forever, and recently we received a request to evaluate whether Postgres would be suitable for one of our projects. Over the last couple of years we had been watching this DBMS closely - attending conferences and meetups - but until recently we had no chance to try it in "combat" conditions.

    The challenge


    Given: an Oracle server (single instance) 11.2.0.3 and a set of unrelated schemas with a total size of ~50 GB. Required: transfer the data, indexes, primary keys and foreign keys from Oracle to Postgres.

    Choosing a Migration Tool


    A review of migration tools revealed both commercial products, such as EnterpriseDB Migration Toolkit and Oracle GoldenGate, and free software. The migration was planned as a one-off, so we needed a tool that was mature yet clear and simple; and, of course, cost was taken into account. Among the free tools, the most mature today is the Ora2Pg project by Gilles Darold, which also far surpasses the commercial options in functionality. The advantages that tipped the scales in its favor:

    • rich functionality;
    • active development of the project (15 years of development, 15 major releases).

    The principle of operation of the Ora2Pg command-line utility is quite simple: it connects to the Oracle database, scans the schema specified in the configuration file and unloads the schema objects as DDL statements into sql files. The data itself can either be unloaded as INSERTs into an sql file or inserted directly into the already created tables of the Postgres DBMS.
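
    To make these two modes concrete, here is a minimal sketch (the file names are placeholders and ora2pg.conf is assumed to be already filled in; the exact commands we used in the project appear later in the article):
    $ora2pg -c ora2pg.conf -t TABLE -o tables.sql    # unload table DDL into an sql file
    $ora2pg -c ora2pg.conf -t INSERT -o data.sql     # unload the data as INSERT statements
    $ora2pg -c ora2pg.conf -t COPY -o data.sql       # or, with PG_DSN set, copy the data straight into Postgres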

    Installation and configuration of the environment


    At our company we use a DevOps approach to create virtual machines, install the necessary software, and configure and deploy it; our working tool is Ansible. But to make the article easier to follow and not introduce entities irrelevant to it, below we will show the manual command-line actions. For those who are interested, we have posted the Ansible playbook for all the steps here.

    So, on a virtual machine running CentOS 6.6, we will perform the following steps.

    1. Install the Postgres repository.
    2. Install Postgres 9.4 server.
    3. Create a database and configure access.
    4. Register Postgres as a service and run it.
    5. Install the instant Oracle client.
    6. Install the Ora2Pg utility.

    All further actions will be performed under the root account.

    Install the repository:
    #yum install yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-1.noarch.rpm

    Install the Postgres 9.4 server:
    #yum install postgresql94-server

    Create a Postgres cluster:
    #service postgresql-9.4 initdb

    Now configure access. Keep in mind that here we deliberately lower the security of Postgres connections to make testing easier; of course, we do not recommend doing this in a production environment. In the file /var/lib/pgsql/9.4/data/postgresql.conf, uncomment the line:
    listen_addresses = '*'

    In the file /var/lib/pgsql/9.4/data/pg_hba.conf, set the method to trust for local and remote connections. The section after editing looks like this:
    # TYPE DATABASE USER ADDRESS METHOD
    # "local" is for Unix domain socket connections only
    local all all trust
    # IPv4 local connections:
    host all all all trust
    

    Register Postgres as a service and run it:
    #chkconfig postgresql-9.4 on
    #service postgresql-9.4 restart
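
    At this point it is worth making sure the server is up and accepting network connections; a quick sanity check of our own (with the trust setting above no password is asked; if psql is not on the PATH, use the full path /usr/pgsql-9.4/bin/psql):
    #psql -h 127.0.0.1 -U postgres -c "select version();"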
    

    To install Oracle instant client, you must download the following packages from OTN:
    oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
    oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
    oracle-instantclient11.2-sqlplus-11.2.0.4.0-1.x86_64.rpm
    oracle-instantclient11.2-jdbc-11.2.0.4.0-1.x86_64.rpm
    

    Install them:
    #yum install /tmp/oracle-instantclient11.2-basic-11.2.0.4.0-1.x86_64.rpm
    #yum install /tmp/oracle-instantclient11.2-devel-11.2.0.4.0-1.x86_64.rpm
    #yum install /tmp/oracle-instantclient11.2-sqlplus-11.2.0.4.0-1.x86_64.rpm
    #yum install /tmp/oracle-instantclient11.2-jdbc-11.2.0.4.0-1.x86_64.rpm
    

    Create a folder for tnsnames.ora:
    #mkdir -p /usr/lib/oracle/11.2/client64/network/admin
    #chmod 755 /usr/lib/oracle/11.2/client64/network/admin
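
    The tnsnames.ora itself can be as small as a single alias; a sample entry (host, port and service name are placeholders for your Oracle server):
    SERVICE =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = host.domain.ru)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = SERVICE))
      )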
    

    Set the following environment variables (in the user's .bash_profile):
    export ORACLE_HOME=/usr/lib/oracle/11.2/client64
    export PATH=$PATH:$ORACLE_HOME/bin
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    

    And check that the client works:
    sqlplus system/@host.domain.ru/SERVICE
    

    If everything is ok, then we get something like this:
     Connected to:
         Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
         With the Partitioning, OLAP, Data Mining and Real Application Testing options
         SQL>
    

    The last setup step remains: installing Ora2Pg. Download the latest version of Ora2Pg from the project site (version 15.2 at the time of writing). Install the necessary packages:
    #yum install gcc cpan postgresql94-plperl postgresql94-devel
    

    Run cpan once so that it performs its initial configuration:
    #cpan
    

    Install additional modules for Perl:
    #cpan Test::Simple DBI Time::HiRes DBD::Oracle DBD::Pg
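
    A quick sanity check of our own that the Oracle and Postgres drivers actually built; each command should print a version and no errors:
    #perl -MDBD::Oracle -e 'print "DBD::Oracle $DBD::Oracle::VERSION\n"'
    #perl -MDBD::Pg -e 'print "DBD::Pg $DBD::Pg::VERSION\n"'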
    

    Unpack Ora2Pg into, say, /install:
    #cd /install
    #tar -xvf ora2pg-15.2.tar.gz 
    

    Build Ora2pg:
    #perl Makefile.PL
    #make
    #make install
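
    If the build went through, the utility should now respond with its version (a check of our own; ora2pg --help also lists the available options):
    #ora2pg --version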
    

    Migration


    Of all DBMSs, Postgres is the closest to Oracle in “spirit”: the data types map onto each other well, and both have the notion of a schema. We will take advantage of this and transfer the data as directly as possible. The migration process consists of the following steps.

    1. Creating a migration project using Ora2pg.
    2. Editing the ora2pg.conf configuration file.
    3. Unloading DDL tables, indexes, constraints from Oracle.
    4. Creating a database in Postgres.
    5. Import DDL tables prepared in step 3.
    6. Copy data.
    7. Import DDL indexes and constraints.

    All subsequent actions will be performed by the postgres user.
    #su -l postgres
    

    Create a migration project. The project consists of a set of folders (tables, functions, views, packages, and so on) that will hold sql files with the DDL of the corresponding objects, the ora2pg.conf configuration file, and the export_schema.sh startup script.
    $ora2pg --init_project my_project_name
    $cd my_project_name
    $vi config/ora2pg.conf
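
    After initialization the project directory looks roughly like this (shortened to the paths referenced later in the article):
    my_project_name/
        config/ora2pg.conf    - the configuration file edited below
        export_schema.sh      - the script that unloads the DDL
        schema/tables/        - table DDL (plus separate files for indexes and constraints)
        schema/views/, schema/functions/, schema/packages/, ...
        data/                 - working directory for the data export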
    

    Configuration


    The Ora2pg configuration file is quite voluminous, so I will focus only on the parameters that are essential or that we needed during the migration of our data. I recommend reading about the rest in this article.

    The section that describes the parameters for connecting to the Oracle database:
    ORACLE_HOME /usr/lib/oracle/11.2/client64
    ORACLE_DSN dbi:Oracle:host=oracle_host.domain.ru;sid=
    ORACLE_USER SYSTEM
    ORACLE_PWD MANAGER
    

    The section that describes which schema we export:
    EXPORT_SCHEMA 1
    SCHEMA TST_OWNER
    

    And the schema into which we load:
    PG_SCHEMA tst_owner
    

    Specify the export type. The COPY value means that the data will be copied directly from Oracle into Postgres, bypassing an intermediate text file.
    TYPE TABLE,COPY
    

    Section that describes the parameters for connecting to the Postgres database:
    PG_DSN dbi:Pg:dbname=qqq;host=localhost;port=5432
    PG_USER tst_owner
    PG_PWD tst_owner
    

    The data type conversion section. To keep the number() type without precision from being converted to bigint, we specify:
    DEFAULT_NUMERIC numeric
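
    To estimate in advance how many columns this concerns, a helper query of our own (not part of Ora2Pg) can be run in sqlplus against the schema configured above:
    SELECT table_name, column_name
      FROM all_tab_columns
     WHERE owner = 'TST_OWNER'
       AND data_type = 'NUMBER'
       AND data_precision IS NULL;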
    

    This completes the configuration, and we are ready to proceed with the migration. Unload the schema description as a set of sql files with the objects' DDL:
    $./export_schema.sh
    

    Let's create the qqq database and the tst_owner user, and grant the necessary rights.
    $psql
    postgres=#create database qqq;
    CREATE DATABASE
    postgres=#create user tst_owner password 'tst_owner';
    CREATE ROLE
    postgres=#grant all on database qqq to tst_owner;
    GRANT
    postgres=#\q
    

    Let's import the sql file with the table DDL:
    $psql -d qqq -U tst_owner < schema/tables/table.sql
    

    Now we are ready to copy the data. Launch:
    $ora2pg -t COPY -o data.sql -b ./data -c ./config/ora2pg.conf
    

    Although we specify a file name for the output on the command line, the data is inserted directly from Oracle into Postgres. In our case the insertion speed was about 6 thousand rows per second, but this, of course, depends on the types of the data being copied and on the surrounding infrastructure.
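
    If this speed becomes a bottleneck on larger volumes, Ora2Pg can parallelize the load. We did not need it for ~50 GB, and the available flags depend on the version (check ora2pg --help), but roughly it looks like this, with -j setting the number of parallel processes writing to Postgres:
    $ora2pg -t COPY -o data.sql -b ./data -c ./config/ora2pg.conf -j 4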

    The last step is to create the indexes and constraints.
    $psql -d qqq -U tst_owner < schema/tables/INDEXES_table.sql
    $psql -d qqq -U tst_owner < schema/tables/CONSTRAINTS_table.sql
    

    If the previous commands completed without errors, then congratulations: the migration was successful! But, as Murphy's Law has it, "Anything that can go wrong will go wrong."

    Our pitfalls


    The first pitfall has already been mentioned above: the number() type without precision gets converted to bigint, but this is easily fixed with the right configuration setting.

    The next challenge was that Postgres has no type similar to Oracle's anydata. Because of this we had to analyze and adjust the application logic and, at the expense of flexibility, convert such columns to "suitable" types, for example varchar2(100). In addition, if you have any custom types, everything will have to be reworked by hand, since they are not translated, but that is a topic for at least a separate article.
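
    It is worth finding such columns before the migration. A helper query of our own (run in sqlplus, not part of Ora2Pg) lists every column whose type is an object type rather than a built-in, which covers both anydata and custom types:
    SELECT table_name, column_name, data_type_owner || '.' || data_type AS col_type
      FROM all_tab_columns
     WHERE owner = 'TST_OWNER'
       AND data_type_owner IS NOT NULL
     ORDER BY table_name, column_name;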

    To summarize


    The Ora2Pg utility, despite a somewhat involved setup, turned out to be simple and reliable in use. It can safely be recommended for migrating small and medium-sized databases. By the way, at PGConf Russia its author announced that he is starting an MS2Pg project. It sounds promising.

    Successful migrations!
