From Oracle Database 12c EE to PostgreSQL, or the main differences between PostgreSQL when developing for the Ultimate IEM platform

Published on June 27, 2017

From Oracle Database 12c EE to PostgreSQL, or the main differences between PostgreSQL when developing for the Ultimate IEM platform

    In a previous article, I talked about the release of Solid12, a version of the IEM platform for PostgreSQL. As promised, we talk in more detail about what the developers will have to face (and what we encountered during the migration).

    This review is not exhaustive; rather, it should be considered as the minimum introductory note for an Oracle developer starting to work on PostgreSQL.

    So, here is the list of differences that we recognized as the most significant:

    • The names of tables, columns, stored procedures are lower case by default (except identifiers in quotation marks). This also affects the column names in the results of SQL queries.
    • Packages are not supported. The Solid12 kernel uses circuits to group functions into packages, so several core circuits are used.
    • Instead of package variables, session variables are used. Their semantics are similar, although there are some differences.
    • PostgreSQL temporary tables are always local and cannot be schema bound, unlike Oracle. Solid12 has a feature set that emulates Oracle-style global temporary tables that fully support the standard syntax for working with global temporary tables.
    • The PL / PgSQL stored procedure language is similar, but still different from Oracle PL / SQL in many details.
    • Stored procedures always use dynamic binding (i.e. characters are interpreted at run time). For example, if a stored procedure executes a query like "SELECT * FROM USERS", a table search by the name "USERS" will be run at run time, and not during compilation. Dynamic binding makes PostgreSQL procedures very flexible, but prone to much more runtime errors than Oracle.
    • By default, PostgreSQL functions run with the set of permissions of the current user (in Oracle, the permissions of the creator of the function are used by default). This option can be explicitly redefined in each function where it is required.
    • Empty lines are not the same as NULL. Unlike Oracle, PostgreSQL concatenating varchar or text with NULL always returns NULL. Always initialize local variables with blank lines to avoid inadvertently nullifying the result.
    • PostgreSQL DDL operations are transactional. Creating tables and functions, changing column types, clearing tables (TRUNCATE), etc., needs to be committed.
    • Replacing a function with a new version will fail if the signature of the function differs from the original. In this situation, performing “CREATE OR REPLACE FUNCTION” is not enough: you will need to delete (DROP) the old version of the function and recreate it.
    • Any database error marks the current transaction as erroneous, so the transaction cannot execute any commands except ROLLBACK. However, such a transaction can be rolled back to the last named savepoint (SAVEPOINT) that was completed before the error occurred. After the rollback to the save point, the transaction can continue to work and commit (this technique is used in integration tests, which should continue to run, regardless of errors).
    • Multiple simultaneously active DataReaders are not supported on the same connection to the database (the so-called MARS mode - Multiple Active Result Sets). To read several data sets, you must either open several database connections, one for each data set, or execute queries in turn.

    It is not always possible to completely abstract from the features of specific databases in the application code. Often in commands or services, it is necessary to form and execute a dynamic SQL query, call a stored procedure, and so on. An application might require triggers, views, constraints, or indexes, so an Oracle application developer will need to understand at least the basic PostgreSQL properties.

    The following are some instructions that may help you deal with some of the difficulties described.

    How to bypass dynamic binding in PL / PgSQL code


    Dynamic binding is a powerful mechanism that in some cases can replace dynamic query execution (EXECUTE sql). The flip side of the coin is the fragility of the design, the lack of checks during compilation. The compiler cannot statically check whether a given symbol refers to any database object.

    When a function refers to a symbol, such as a table or function, a specific object will be found by name only during the execution of the function. In addition, the contents of the variable “search_path” affect this search, which means that the symbol can be found in any scheme, depending on the settings of the current session.
    This is usually not the case.

    To disable dynamic snapping, we follow two simple rules:

    • Add the line “set search_path to (current schema name)” to all function definitions and
    • We qualify all tables outside the current schema with the names of their schemas.

    This does not make the binding static (PostgreSQL still does not check the validity of the characters), but simply disables the possibility of unintentionally binding the character to something else.

    Here is an example of the source code for a PL / PgSQL function that no longer suffers from dynamic binding:

    -- current search_path = my_schema 
    create or replace function my_func(my_arg text) returns void as $$ 
    declare v_id bigint; 
    begin 
    perform another_func(my_arg); -- same as perform my_schema.another_func(my_arg); 
    select id into v_id from kernel.users -- table name is qualified with kernel schema name where login = my_arg; -- the rest is skipped... 
    end $$ language plpgsql set search_path to my_schema;
    

    Override permissions that apply to a function


    By default, PostgreSQL functions are called with the set of permissions of the current DBMS user, similar to the Oracle option “AUTHID CURRENT_USER” (by default, Oracle uses a different mode - “AUTHID DEFINER”).

    To emulate the behavior of Oracle, a function must override the “security option” like this:

    create or replace function my_secure_func() returns void as $$
    begin  -- call here any functions available to the superuser 
    end $$ language plpgsql security definer; -- default is security invoker
    

    Oracle style global temporary table emulation


    The semantics of temporary tables in PostgreSQL are significantly different from Oracle. Here is a brief overview of the differences:

    • Temporary tables in Oracle are constant, that is, their structure is fixed and visible to all users, and the contents are temporary.
    • In PostgreSQL, a temporary table is created before each use. Both the structure and contents of the temporary table are visible only to the current DBMS process that created this table. PostgreSQL temporary tables are always deleted either at the end of a session or at the end of a transaction.
    • In Oracle, temporary tables are always located within the specific schema specified during creation.
    • In PostgreSQL, temporary tables cannot be placed in an arbitrary scheme; they are always created in a special implicit temporary scheme.

    The pack_temp schema contains a library to emulate Oracle-style temporary tables. We are interested in only two functions:

    create_permanent_temp_table(table_name [, schema_name]);
    drop_permanent_temp_table(table_name [, schema_name]);
    


    Creating a permanent temporary table is done in two steps:
    1. Create a regular PostgreSQL temporary table (the one that is deleted at the end of the transaction).
    2. We call the create_permanent_temp_table function to turn this temporary table into a permanent one:
      create temporary table if not exists another_temp_table (    first_name varchar,    last_name varchar,    date timestamp(0) with time zone,    primary key(first_name, last_name) ) on commit drop;
      -- create my_schema.another_temp_table select pack_temp.create_permanent_temp_table('another_temp_table', 'my_schema');
      -- or create another_temp_table in the current schema -- select create_permanent_temp_table('another_temp_table');
      -- don't forget to commit: PostgreSQL DDL is transactional commit;
      


    This creates a view that behaves exactly like the global Oracle temporary table. You can delete it with the drop_permanent_temp_table function.

    Multiple active DataReaders on the same database connection


    This is PostgreSQL's most annoying limitation: each database connection can only have one DataReader open at a time.

    A new request cannot be executed until the previous one has been executed and processed.

    The problem regularly pops up in application services, LINQ queries, and SQL queries in many different forms. Here are a few typical cases:

    1. A LINQ query uses a constant (or calls a service). The request opens the first DataReader, and the constant service tries to open the second and receives an error. To get rid of the error, you need to read the constant in a local variable before executing the query (or call the service after reading the query results). Example:

      // было 
      var query = from a in DataContext.GetTable<Agent>() 
                        where a.ID = Constants.TestAgentID select a;
      // стало 
      var testAgentId = Constants.TestAgentID; 
      var query = from a in DataContext.GetTable<Agent>() 
                        where a.ID = testAgentId select a;
      

    2. The results of a LINQ query are processed in a loop, but the body of the loop executes the second query, whether LINQ or SQL. How to get around the limitation: materialize the results of the query into a list or array, and run through the list after the first query has already been completed. Example:

      // было 
      foreach (var langId in DataContext.GetTable<Language>().Select(x => x.ID)) 
      { 
        using (LanguageService.UseLanguage(langId)) 
       { 
          // do something language-specific 
       } 
      }
      // стало 
      foreach (var langId in DataContext.GetTable<Language>().Select(x => x.ID).ToIDList()) 
      { 
        using (LanguageService.UseLanguage(langId)) 
        { 
            // do something language-specific 
        } 
      }
      

    3. Call ToArray / ToList / ToIDList inside a LINQ query. To fix, you need to break the request into parts:

      // было 
      var dictionary = DataContext.GetTable<CalendarDayStatus>().Where(d => dates.Contains(d.DT)) .GroupBy(g => g.DT, e => e.StatusID) .ToDictionary(k => k.Key, e => e.ToIDList());
      // стало 
      var dictionary = DataContext.GetTable<CalendarDayStatus>() .Where(d => dates.Contains(d.DT)) .GroupBy(g => g.DT, e => e.StatusID) .ToDictionary(p => p.Key);
      var dict = dictionary.ToDictionary(p => p.Key, p => p.Value.ToIDList());
      

      Unfortunately, this type of error is very difficult to detect statically. Therefore, each non-trivial LINQ query needs to be thoroughly tested to make sure that it is not trying to open several DataReaders at the same time.

    What's next?


    We plan to intensively develop interaction with the PostgreSQL team. Actually, most of the restrictions do not seem insurmountable, perhaps we can find the resources to make appropriate changes to the PostgreSQL code.

    We didn’t use some features that PostgreSQL already has, for example, support for geodata processing, but we hope that we can use them in future versions.

    In any case, both versions - light Solid12 and enteprise Ultimate Solid - will be developed in parallel, all the important functionality will be supported in both versions of the platform.