
Data extraction from 1C database: problems with transfers
I decided to write an article on how to extract data from 1C using SQL queries. All of the following applies to 1C version 8.2, it should also work in 1C version 8.1. Particular attention is paid to the problem with extracting enumeration headers.
In this case, everything works extremely well: the areas of responsibility are separated, if an error is found in the report data - it is first searched in the cube, if everything is OK in the cube - it is searched in the HD, if everything is OK in the CD - it is looked in ETL, if everything is fine in ETL - that means let the 1C-programmer figure out where he has an error in the processing that fills the “buffer database”.
But this method is not always available. It happens that there is no 1C specialist at all, or is too busy, or there is not enough iron capacity to “push” the data from 1C using processing. And one thing remains - to do data extraction using SQL queries.
Here you can download several such treatments (which we “filtered out” by sorting through dozens of such ones, choosing the best ones). They do almost the same thing - they allow you to see all the fields, understand which field leads to which directory, and even offer to automatically build the query:

Thus, we begin to examine the documents we need:

Next, open any of them, and find where it is written - in which registers:

Well, then finding this register and generating the SQL query using the above processing (as in the first figure) is not difficult.
We usually make two levels of SQL queries: “lower level” - views for renaming fields, “upper level” - views that take data from the lower level, and they already do the necessary joins.

And now, if we try to pull this field out of the database directly, we get this:

Yes, we found where the headers of the enumerations are sitting: the table is called Config, it contains image fields in which a zipped set of bytes sits, which if unzipped, we get incomprehensible structures a set of characters, delimiters, etc. Unfortunately, this data format is not documented.
Therefore, we have found another way. We made a small program in C # that uses the 1C-COM COM object in order to establish a connection with it and pull all the values of all the enumerations into one table.
You can download it from here.
The code:
It starts like this:
He does the following: connects to 1C using COM, takes all the enumerations from there, and puts them in the table of the specified database you specified, having previously cleaned it. The table should have the following structure
Further it is clear that the SSIS package (or other mechanism) can run this code before retrieving the fact / reference data, and we will get the populated table

and then you can build the join by the _EnumOrder field: the reference refers to the _Enum table by IDRRef, in it is the _EnumOrder field which refers to the EnumOrder field of your table that C # code just pulled.
If you have any comments or additional ideas - all of them are happy to accept, write to ibobak at bitimpulse dot com.
Cultural way
Ideally, 1C programmer should select data from 1C. It’s good if he creates a processing that will output data to the so-called “buffer base”: csv files, tables in SQL - whatever. The designer of the XD and ETL must take data from the buffer.In this case, everything works extremely well: the areas of responsibility are separated, if an error is found in the report data - it is first searched in the cube, if everything is OK in the cube - it is searched in the HD, if everything is OK in the CD - it is looked in ETL, if everything is fine in ETL - that means let the 1C-programmer figure out where he has an error in the processing that fills the “buffer database”.
But this method is not always available. It happens that there is no 1C specialist at all, or is too busy, or there is not enough iron capacity to “push” the data from 1C using processing. And one thing remains - to do data extraction using SQL queries.
Not a very cultural way
This is actually this method - “make a SQL query to the 1C database”. The main task is to correctly write the queries themselves. I think it’s no secret to anyone that in 1C the data structure is “tricky” and that fields and tables have intricate names. The task of the ETL designer is to pull data from this structure.View metadata
There are treatments that make it possible to see which fields of directories / documents in which tables / fields of the database are.Here you can download several such treatments (which we “filtered out” by sorting through dozens of such ones, choosing the best ones). They do almost the same thing - they allow you to see all the fields, understand which field leads to which directory, and even offer to automatically build the query:

Thus, we begin to examine the documents we need:

Next, open any of them, and find where it is written - in which registers:

Well, then finding this register and generating the SQL query using the above processing (as in the first figure) is not difficult.
We usually make two levels of SQL queries: “lower level” - views for renaming fields, “upper level” - views that take data from the lower level, and they already do the necessary joins.
Transfers
There is one big problem - these are transfers. Example:
And now, if we try to pull this field out of the database directly, we get this:

Yes, we found where the headers of the enumerations are sitting: the table is called Config, it contains image fields in which a zipped set of bytes sits, which if unzipped, we get incomprehensible structures a set of characters, delimiters, etc. Unfortunately, this data format is not documented.
Therefore, we have found another way. We made a small program in C # that uses the 1C-COM COM object in order to establish a connection with it and pull all the values of all the enumerations into one table.
You can download it from here.
The code:
using System;
using System.Data;
using System.Data.SqlClient;
namespace _1CEnumParser
{
class Program
{
///
/// Пробегается по всем перечислениям и заполняет таблицу с тремя полями: название перечисления, название значения, порядок
///
private static void DataTableFill(DataTable aTable, dynamic a1CConn)
{
foreach (dynamic catalog in a1CConn.Metadata.Enums)
{
string enumName = catalog.Name;
dynamic query = a1CConn.NewObject("Query");
query.Text = "select * from enum." + enumName;
dynamic items = query.Execute().Unload();
// бежим по строкам
for (int i = 0; i < items.Count(); i++)
{
string enumValue = null;
int enumOrder = -1;
for (int j = 0; j < items.Columns.Count(); j++)
{
string colName = items.Columns.Get(j).Name;
dynamic colValue;
try
{
colValue = a1CConn.String(items.Get(i).Get(j));
}
catch
{
colValue = "-1";
}
switch (colName.ToLower())
{
case "ссылка":
enumValue = colValue.ToString();
break;
case "порядок":
enumOrder = int.Parse(colValue.ToString());
break;
default:
throw new ApplicationException("unknown column name in enum.recordset: " + colName);
}
}
// получили 3 заполненные значения: enumName, enumValue, enumOrder. вставка в таблицу
aTable.Rows.Add(new object[] {enumName, enumValue, enumOrder});
}
}
}
///
/// Устанавливает все соединения, заполняет DataTable с перечислениями, записывает в БД
///
///
///
///
private static void ConnectAndFill(string aConnectionString1C, string aConnectionStringSQL, string aTableName)
{
// входим в SQL базу и удаляем все из таблицы
var connSQL = new SqlConnection(aConnectionStringSQL);
connSQL.Open();
// входим в 1С
var connector1C = new V82.COMConnector();
dynamic conn1C = connector1C.Connect(aConnectionString1C);
// удаляем из таблицы все данные
var command = new SqlCommand("delete from " + aTableName, connSQL);
command.ExecuteNonQuery();
// заполняем таблицу
var da = new SqlDataAdapter("select EnumName, EnumValue, EnumOrder from " + aTableName, connSQL);
var thisBuilder = new SqlCommandBuilder(da);
var ds = new DataSet();
da.Fill(ds);
DataTableFill(ds.Tables[0], conn1C);
da.Update(ds);
// закрываем коннект
connSQL.Close();
}
static void Main()
{
string[] args = Environment.GetCommandLineArgs();
string aConnectionString1C = args[1];
string aConnectionStringSQL = args[2];
string aTableName = args[3];
ConnectAndFill(aConnectionString1C, aConnectionStringSQL, aTableName);
}
}
}
It starts like this:
1cEnumParser.exe "строчка_соединения_1С" " строчка_соединения_SQL" "таблица_в_SQL"
He does the following: connects to 1C using COM, takes all the enumerations from there, and puts them in the table of the specified database you specified, having previously cleaned it. The table should have the following structure
CREATE TABLE [dbo].[tbl1CEnums](
[EnumName] [nvarchar](1024) NULL,
[EnumValue] [nvarchar](2014) NULL,
[EnumOrder] [int] NULL
) ON [PRIMARY]
Further it is clear that the SSIS package (or other mechanism) can run this code before retrieving the fact / reference data, and we will get the populated table

and then you can build the join by the _EnumOrder field: the reference refers to the _Enum table by IDRRef, in it is the _EnumOrder field which refers to the EnumOrder field of your table that C # code just pulled.
If you have any comments or additional ideas - all of them are happy to accept, write to ibobak at bitimpulse dot com.