SQL Access to NoSQL Data: Implementing an SQL Procedure in Caché with Dynamically Defining Returned Metadata

    As you know, Caché can be used as a relational DBMS, including through JDBC / ODBC drivers , with the ability to execute arbitrary SQL queries and invoke SQL procedures.
    It is also known that all data in Caché is stored in multidimensional sparse arrays - globals . This allows, in the case of insufficient performance of a single SQL procedure, not to use the standard CachéSQL engine, but to rewrite its execution code in the server business logic Caché ObjectScript (COS), in which you can implement the optimal algorithm for executing the SQL procedure, often using more optimal NoSQL data structures (globals).
    However, in the Caché standard class library, there is one limitation: for SQL procedures in which the selection is performed using COS code, it is necessary to determine the set of returned fields at the compilation stage - i.e. there is no way to dynamically set metadata for an SQL procedure that works with NoSQL structures.

    How to remove this restriction is described under the cut.

    Working with SQL Procedures in Caché

    Queries through JDBC / ODBC to non-relational Caché structures are implemented using stored procedures as follows:

    image

    Such a stored procedure can return one or more sets of records (ResultSets), or a scalar value.

    As an example, we will call the stored procedure sample.SP_Sample_By_Name from the Samples area using one of the tools for working with ODBC:

    image

    By the signature of the SQL procedure, it is not known what it will return, it becomes known only during the execution of the procedure.

    Caché allows you to make class methods stored SQL procedures that either return a value or return a ResultSet. For example, a stored procedure is declared that returns a ResultSet:

    ClassMethod  SomeSqlProc (  p1  As% Integer  = 0 ) [  ReturnResultsets ,  SqlProc  ]

    Using this construct, you can write Caché ObjectScript code that can be called via ODBC as a stored procedure that will return a ResultSet (or several).

    Caché has two standard methods for generating NoSQL data returned in the form of a ResultSet:

    The first way. Using class queries

    Using class queries
    ClassMethod SomeSqlProc( p1 As %Integer = 0) [ ReturnResultsets, SqlProc ]
    {
      if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() } 
      Set query = ##class(%ResultSet).%New("User.SomeClass:Query")
      Do query.Execute(p1)
      do %sqlcontext.AddResultSet(query)
    }

    See here for details.

    This method allows you to write arbitrary code to generate data on Caché ObjectScript, but the metadata of the returned ResultSet is created by the compiler based on the% Query. # ROWSPEC parameter, i.e. at compile time.

    The second way. Using% SQL.CustomResultSet

    Using% SQL.CustomResultSet
    ClassMethod SomeSqlProc( p1 As %Integer = 0) [ ReturnResultsets, SqlProc ]
    {
      if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() }
      Set query = ##class(User.MyResultSet).%New(,p1)
      do %sqlcontext.AddResultSet(query)
    }

    Read more about % SQL.CustomResultSet , an example implementation.

    The method is similar to the previous one, but metadata is generated based on the definition of the inheritance class% SQL.CustomResultSet - as in the previous case, during compilation.

    Note: You can get SQL data in the same way:

    Retrieving SQL Data
    ClassMethod SomeSqlProc( p1 As %Integer = 0) [ ReturnResultsets, SqlProc ]
    {
      s sqltext="SELECT * FROM dbo.Classname" ##; Подготавливаем текст запроса
      if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() } 
      Set query = ##class(%ResultSet).%New("%DynamicQuery:SQL")
      Do query.Prepare(sqltext) 
      Do query.Execute()
      do %sqlcontext.AddResultSet(query)
    }

    In this case, metadata is generated at runtime, but data can only be obtained from SQL.

    Thus, if we want to generate result metadata in runtime and use arbitrary Caché ObjectScript to generate data, then it is clear that the funds available in Caché's supply are not enough.

    The solution of the problem

    There are 4 solutions to the problem:

    • at run time, create a class containing class query with the generated on-the-fly ROWSPEC;
    • at run time, create a class inherited from% SQL.CustomResultSet with the required set of fields;
    • implement an alternative to% SQL.CustomResultSet, which will generate metadata at run time based on call parameters, and not at compile time;
    • implement an alternative to% Query, which will generate metadata at runtime.

    I chose the last method - it seemed to me the most elegant (looking ahead, I still couldn’t do without crutches).

    First, create the User.Query class and inherit it from% Query - so as not to overwrite the implementation of all% Query. When using% Query, the consumer (% ResultSet) requests metadata through two class methods: GetInfo and GetODBCInfo. In the successor class, it is necessary to write alternative implementations of these methods. Through several experiments (this is easier than understanding the generators) I found out about GetInfo parameters (.colinfo, .parminfo, .idinfo, .qHandle, extoption, .extinfo):

    • colinfo - add $ lb ($ lb (name, typeid, caption), ...) to it, where name is the internal name of the field, typeid is the identifier of the Caché type, caption is the column heading;
    • parminfo - add $ lb into it ($ lb (name, typeid), ...) - the same format as in the previous paragraph, but without a title;
    • idinfo - you can add $ lb (0,0) into it (system information, something related to the index, we assume that it is not);
    • qHandle - multidimensional local array formed by the programmer;
    • the rest can be left alone (it seems, for object references, optionally in the absence of objects).


    With GetODBCInfo everything is similar, there are a bit more fields, and the result should be added to single-level lists, but in general it's the same.

    In order to return the correct metadata from GetInfo and GetODBCInfo, I need to find several not quite obvious tricks, which are mainly given below:
    • To get the identifier of type Caché (typeid), you need to call $$ externaltype ^% apiOLE (ctype, .type, "0"), where ctype is the type name in Caché (for example,% String [reference to the% string class]). The function will put the identifier in type.
      Before recognizing the identifier type (ctype) to normalize (to the form Package.Class), it is possible to make a macro $$$ NormalizeClassname (ctype)
      To get information for GetODBCInfo, you must call
      GetODBCColInfo ^% ourODBC ( ctype ,. ColParms ,. colODBCTypeName ,. colODBCType ,. maxlen ,. precision ,. scale ) ,
      where ctype is a type name in Caché, not necessarily normalized.

      Since we want to generate metadata (field names and types) dynamically, we need to pass information about them to our Query. The most obvious way to do this is with the qHandle parameter. Through it, we will transmit information about the ResultSet. For this, the programmer, in his implementation of query execution (QueryExecute), must form the ROWSPEC line for the required fields and the line of formal query parameters (similar to ROWSPEC) and put them in qHandle (“rowspec”) and qHandle (“params”), respectively.

      As a result, we get the following implementation of the User.Query class:

      User.Query Class
      Class User.Query Extends %Query
      {

      ClassMethod GetInfo(ByRef colinfo As %List, ByRef parminfo As %List, ByRef idinfo As %List, ByRef qHandle As %Binary, extoption As %Integer = 0, ByRef extinfo As %List) As %Status
      {
        if $get(qHandle("colinfo"))=""
        {
          set RowSpec = qHandle("rowspec")
          set qHandle("colinfo")=""
          set sc=$$$OK
          for i=1:1:$length(RowSpec,",")
          { 
            set col=$piece(RowSpec,",",i)
            set name="p"_i
            set ctype=$$$NormalizeClassname($select($piece(col,":",2)'="":$piece(col,":",2),1:"%String"))
            set sc = $$externaltype^%apiOLE(ctype,.type,"0")
            quit:$$$ISERR(sc)
            set caption=$piece(col,":",1)
            set qHandle("colinfo")=qHandle("colinfo")_$listbuild($listbuild(name,type,caption))
          }
          quit:$$$ISERR(sc) sc
          
        }
        if $get(qHandle("parminfo"))=""
        {
          set Params = qHandle("params")
          set qHandle("parminfo")=""
          set sc=$$$OK
          for i=1:1:$length(Params,",")
          { 
            set col=$piece(Params,",",i)
            set name="p"_i
            set ctype=$$$NormalizeClassname($select($piece(col,":",2)'="":$piece(col,":",2),1:"%String"))
            set sc = $$externaltype^%apiOLE(ctype,.type,"0")
            quit:$$$ISERR(sc)
            set qHandle("parminfo")=qHandle("parminfo")_$listbuild($listbuild(name,type))
          }
          quit:$$$ISERR(sc) sc
        }
        set colinfo = qHandle("colinfo")
        set parminfo = qHandle("parminfo")
        set idinfo = $listbuild(0,0)
        quit $$$OK
      }

      ClassMethod GetODBCInfo(ByRef colinfo As %List, ByRef parminfo As %List, ByRef qHandle As %Binary)
      {  if $get(qHandle("colinfoodbc"))=""
        {
          set RowSpec = qHandle("rowspec")
          set qHandle("colinfoodbc")=$listbuild($LENGTH(RowSpec,","))
          for i=1:1:$length(RowSpec,",")
          { 
            set col=$piece(RowSpec,",",i)
            set ctype=$select($piece(col,":",2)'="":$piece(col,":",2),1:"%String")
            Do GetODBCColInfo^%ourODBC(ctype,.colParms,.colODBCTypeName,.colODBCType,.maxLen,.precision,.scale)
            set bstr = "$Char(0,0,0,0,0,0,0,0,0,0,0,0)"
            set name = $piece(col,":",1)
            set qHandle("colinfoodbc")=qHandle("colinfoodbc")_$listbuild(name,colODBCType,precision,scale,2,name,"Query","%Library","",bstr)
          }
        }
        if $get(qHandle("parminfoodbc"))=""
        {
          set Params = qHandle("params")
          set qHandle("parminfoodbc")=$listbuild($LENGTH(Params,","))
          for i=1:1:$length(RowSpec,",")
          { 
            set col=$piece(Params,",",i)
            set ctype=$select($piece(col,":",2)'="":$piece(col,":",2),1:"%String")
            Do GetODBCColInfo^%ourODBC(ctype,.colParms,.colODBCTypeName,.colODBCType,.maxLen,.precision,.scale)
            set name="p"_i
            set qHandle("parminfoodbc")=qHandle("parminfoodbc")_$listbuild(colODBCType,precision,scale,2,name,1)
          }
        }
        set colinfo = qHandle("colinfoodbc")
        set parminfo = qHandle("parminfoodbc")
        quit $$$OK
      }

      }


      How to apply the User.Query class

      Using User.Query is similar to using % Query , but during initialization, you need to pass information to it to generate metadata.
      A class using User.Query should look something like this:

      Class User.DynamicQuery
      Class User.DynamicQuery [ Abstract ]
      {

      Query Query(p1 As %Integer) As User.Query
      {
      }

      ClassMethod QueryExecute(ByRef qHandle As %Binary, p1 As %Integer) As %Status
      {
        /// Делаем все приготовления
        ;…
        /// Формируем ROWSPEC
        s RowSpec = "ID:%Integer,date:%TimeStamp,Info:%String"
        
        s qHandle("rowspec")=RowSpec
        /// Формируем строку формальных параметров, константа
        s qHandle("params")="p1:%Integer"
        
        q $$$OK
      }

      ClassMethod QueryClose(ByRef qHandle As %Binary) As %Status [ PlaceAfter = QueryExecute ]
      {
        Quit $$$OK
      }

      ClassMethod QueryFetch(ByRef qHandle As %Binary, ByRef Row As %List, ByRef AtEnd As %Integer = 0) As %Status [ PlaceAfter = QueryExecute ]
      {
      /// Пишем обычный QueryFetch, как описано в документации по class queries
      }
      }

      ///Код хранимой процедуры, вызывающей User.Query:
      ClassMethod DynamicProc(p1 As %Integer = 0) [ ReturnResultsets, SqlProc ]
      {
        if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() } 
        Set query = ##class(%ResultSet).%New("User.DynamicQuery:Query")
        Do query.Execute(p1)
        do %sqlcontext.AddResultSet(query)
      }


      Usage example

      In the Samples area, create the Queries class. It will contain only one request, so that it can be made abstract

      Class User.Queries
      Class User.Queries [ Abstract ]
      {

      Query NoSQL(ColCount As %Integer) As User.Query
      {
      }

      ClassMethod NoSQLExecute(ByRef qHandle As %Binary, ColCount As %Integer) As %Status
      {
        set RowSpec = "Id:%Integer"
        for colNum = 1:1:ColCount
        {
          set RowSpec=RowSpec_",p"_colNum_":%Integer"
        }
        set qHandle("rowspec")=RowSpec
        set qHandle("params")="ColCount:%Integer"
        
        kill ^||MyData(+##this)
        for rowNum = 1:1:100 {
          for colNum = 1:1:ColCount
          {
            set $list(^||MyData(+##this,rowNum),colNum)=$R(1000)
          }
        }

        set qHandle("colcount") = ColCount
        set qHandle("cursor") = $order(^||MyData(+##this,""))
        
        quit $$$OK
      }

      ClassMethod NoSQLClose(ByRef qHandle As %Binary) As %Status [ PlaceAfter = NoSQLExecute ]
      {
        kill ^||MyData(+##this), qHandle
        
        Quit $$$OK
      }

      ClassMethod NoSQLFetch(ByRef qHandle As %Binary, ByRef Row As %List, ByRef AtEnd As %Integer = 0) As %Status [ PlaceAfter = NoSQLExecute ]
      {
        if qHandle("cursor") = ""
        {
          set Row = "", AtEnd = 1
          quit $$$OK
        }
        
        set rowNum = qHandle("cursor")
        set Row = $listbuild(rowNum)_^||MyData(+##this,rowNum)
        
        set qHandle("cursor") = $order(^||MyData(+##this,rowNum))
        
        Quit $$$OK
      }

      }


      Our query accepts the number of columns, and returns 100 records filled with random numbers. Now let's write the Procedures class, which will contain the class-stored procedure method using our query.

      Class User.Procedures
      Class User.Procedures Extends %Persistent
      {

      ClassMethod ProcNoSQL(p1 As %Integer) [ ReturnResultsets, SqlName = proc_nosql, SqlProc ]
      {
        if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() } 
        Set query = ##class(%ResultSet).%New("User.Queries:NoSQL")
        Do query.Execute(p1)
        do %sqlcontext.AddResultSet(query)
      }

      ClassMethod ProcSQL(p1 As %String = "") [ ReturnResultsets, SqlName = proc_sql, SqlProc ]
      {
        set sqltext="SELECT ID, Name, DOB, SSN"
        set sqltext=sqltext_" FROM Sample.Person"
        set sqltext=sqltext_" WHERE (Name %STARTSWITH '"_p1_"')"
        set sqltext=sqltext_" ORDER BY Name"

        if '$isobject($Get(%sqlcontext)) { set %sqlcontext = ##class(%ProcedureContext).%New() } 
        Set query = ##class(%ResultSet).%New("%DynamicQuery:SQL")
        Do query.Prepare(sqltext)
        Do query.Execute()
        do %sqlcontext.AddResultSet(query)
      }

      }


      Now the created SQL procedure that executes the NoSQL query can be called via xDBC:

      image

      Conclusion

      I hope that the method I proposed to create NoSQL queries for dynamic-defined SQL procedures will turn out to be useful to someone, as it turned out to be useful to me when implementing a specific practical task to improve the performance of SQL procedures, which I will discuss in the next article.

    Also popular now: