[HATE] Pain. RusMARC format

    Warning. This post contains an overwhelming amount of hate! Keep pregnant women, children and people with fragile mental health away from the monitor. Although where would children come from here?

    Good day to all.
    Let me start with a quote from Wikipedia.
    UNIMARC (acronym for Universal Machine Readable Cataloging) is a format developed with the assistance of the International Federation of Library Associations and Institutions (IFLA) in 1977 to solve the incompatibility problem between various national MARC formats. The main goal was to create an international MARC format that could accommodate bibliographic records of all types of MARC formats. Such records could be converted to UNIMARC, which would be the base format, and from it, if necessary, into any other MARC format. The UNIMARC format facilitated the international exchange of bibliographic information in machine-readable form.


    to solve the incompatibility problem between different national MARC formats.

    That is, this format was designed to unify bibliographic records and to get away from the situation where each country had its own MARC format, incompatible with the others. People got together, thought it over and developed a uniform format. For a while everyone used it, and everyone was happy (well, except for the developers who had to write the import / export functions for this format).

    But we are not looking for easy ways ...


    What are the problems of UNIMARC from a programmer's point of view?
    Let's start in order.
    • All offsets and the record length are stored as 5-byte strings
    • Field numbers and subfield identifiers are stored as strings (not such a serious problem, really, but still)
    • Two encodings are possible, MARC8 and UTF8 (in fact you can use any encoding, but only these two are discussed as standard, and accordingly the encoding indicator flag exists only for them). MARC8 is not exactly a common encoding
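
    To show what the first two points mean in practice, here is a minimal Python sketch that pulls the numbers out of a record leader. It assumes the usual ISO 2709 layout that UNIMARC builds on (a 24-byte leader, record length in positions 0-4, base address of data in positions 12-16). The function name `parse_leader` and the sample leader are made up for illustration.

```python
def parse_leader(leader: bytes) -> dict:
    """Decode the numeric parts of a 24-byte ISO 2709 leader.

    The numbers are not binary integers: they are zero-padded decimal
    digits stored as ASCII text, so each one has to go through int().
    """
    if len(leader) != 24:
        raise ValueError("a leader is exactly 24 bytes long")
    text = leader.decode("ascii")
    return {
        "record_length": int(text[0:5]),    # positions 0-4
        "base_address": int(text[12:17]),   # positions 12-16
    }


# Hypothetical leader: a 45-byte record whose data starts at offset 37.
sample = b"00045nam  2200037   450 "
print(parse_leader(sample))
```

    A side effect of the 5-digit length is that a single record cannot be longer than 99999 bytes.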


    Otherwise it is quite a convenient format: records are stored one after another, with delimiters, and each record is laid out like this:
    Record leader (the information needed to read the record)
    VariableField1
    -VariableSubfield1
    - ...
    -VariableSubfieldN
    ...
    VariableFieldM
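
    The layout above can be sketched in Python roughly as follows. This is not taken from any library. It assumes the ISO 2709 conventions as I understand them: a directory of 12-character entries (3-character tag, 4-digit length, 5-digit offset) right after the leader, 0x1E closing each field, 0x1F in front of each subfield code and 0x1D closing the record. The names `build_record` and `parse_record`, the sample data and the UTF-8 default are my own assumptions.

```python
FIELD_END = b"\x1e"   # closes the directory and every field
SUBFIELD = b"\x1f"    # precedes each subfield code
RECORD_END = b"\x1d"  # closes the whole record


def build_record(fields):
    """Pack (tag, body) pairs into one record: leader, directory, data."""
    directory, data = b"", b""
    for tag, body in fields:
        body += FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FIELD_END
    base = 24 + len(directory)
    total = base + len(data) + len(RECORD_END)
    leader = f"{total:05d}nam  22{base:05d}   450 ".encode("ascii")
    return leader + directory + data + RECORD_END


def parse_record(raw, encoding="utf-8"):
    """Return a list of (tag, indicators, [(subfield_code, value), ...])."""
    base = int(raw[12:17].decode("ascii"))
    directory = raw[24:base - 1]  # skip the directory's own terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # length includes the field terminator, so drop the last byte.
        body = raw[base + start:base + start + length - 1].decode(encoding)
        if tag.startswith("00"):
            # Control fields carry plain data: no indicators, no subfields.
            fields.append((tag, "", [("", body)]))
        else:
            chunks = body[2:].split(SUBFIELD.decode("ascii"))[1:]
            fields.append((tag, body[:2], [(c[:1], c[1:]) for c in chunks]))
    return fields


# Hypothetical record: an identifier field and a title field.
record = build_record([
    ("001", b"RU1"),
    ("200", b"1 " + SUBFIELD + b"aTitle" + SUBFIELD + b"fAuthor"),
])
for field in parse_record(record):
    print(field)
```

    Note that the parser has to be told the encoding from outside; in this sketch it is simply a parameter.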

    There are special tables that specify which field / subfield is responsible for what, which ones can be used for your own purposes, and so on.

    It would seem an idyll: not without problems, but you can live with it.

    And then RusMARC bursts onto the scene.
    1.1. Purpose of the Russian communicative format

    The Russian communicative format was developed by order of the Ministry of Culture under the LIBNET program, under the auspices of the Russian Library Association. The format is intended to serve as an intermediary in the exchange of bibliographic records and to help solve the following tasks:

    a. Improving the availability of bibliographic information
    b. Creating union catalogs
    c. Reducing cataloging costs

    The communicative format does not specify the form, content or structure of records in local systems; it contains recommendations on the form and content of data intended for exchange. A record in the communicative format does not prescribe the output forms required by a local system, but it must provide a sufficient set of data to generate the kinds of descriptions adopted in that system.

    1.4. Relationship of the UNIMARC format and the Russian communicative format

    The areas in which the UNIMARC format was adapted determined the relationship between the two formats. The Russian communicative format is not a completely new, stand-alone development.
    The Russian communicative format is the Russian version of the UNIMARC International Communicative Format, interpreted in the terms and categories of the existing GOST standards and Cataloging Rules in Russia, and based on the choice of the most common data element representation schemes.
    From a practical point of view, this means that any record transmitted in the Russian communicative format must be adequately understood by any software that claims to work with the UNIMARC format.


    What beautiful thoughts ...
    What do we have in practice in the RusMARC format?
    • A zoo of encodings: yes, here you can use utf8, marc8, cp1251 or ibm866, and the standard won't even say “tut-tut!”
    • No flag indicating the encoding
    • And as a bonus, the assignment of fields and subfields does not match the UNIMARC format
    • Oh yeah, the contents of the record leader do not match UNIMARC either, although, sure, software that reads UNIMARC is supposed to cope
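
    With no encoding flag, the reader is left guessing. Below is a rough Python heuristic of the kind one ends up writing; it is my own sketch, not something the format prescribes. It only tells apart utf-8, cp1251 and cp866 (Python's name for ibm866) on Russian text, and it does not handle MARC8 at all, since the Python standard library has no codec for it. The name `guess_encoding` is hypothetical.

```python
def guess_encoding(raw: bytes) -> str:
    """Guess between utf-8, cp1251 and cp866 for Russian text."""
    try:
        raw.decode("utf-8")
        return "utf-8"  # strict UTF-8 rarely validates by accident
    except UnicodeDecodeError:
        pass

    def russian_letters(codec: str) -> int:
        # errors="replace" matters: cp1251 has an undefined byte (0x98).
        text = raw.decode(codec, errors="replace")
        # Count only the basic Russian alphabet, U+0410..U+044F.
        return sum("\u0410" <= ch <= "\u044f" for ch in text)

    # Pick the codec under which the bytes look most like Russian.
    return max(("cp1251", "cp866"), key=russian_letters)


sample = "Привет, мир"
for codec in ("utf-8", "cp1251", "cp866"):
    print(codec, "->", guess_encoding(sample.encode(codec)))
```

    A guess like this can be wrong on short or mostly Latin fields, which is exactly why a flag in the leader would have been nice.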


    The volume of work done is astounding: almost all fields and subfields are shuffled and the record leader is changed. After all, who needs an encoding flag? We need to know whether the record is under archival control or not. And what difference does it make how the encoding will be determined?

    Gentlemen of the jury, I have nothing to add.


    Do you think that the creation of a domestic version of the MARC format is justified?

