The term interoperability began to appear as the number of information resources grew, and today the staff of libraries, museums, galleries and archives deal with it in almost everyday work. To achieve semantic interoperability of such resources, we must focus on two key areas: metadata schemes together with the rules for filling them in, and the building of knowledge organization systems, authority files and systems for classifying resources. Among memory institutions in the Czech Republic, libraries were the first to begin standardizing at the national and international level. Current technologies, together with existing international standards, enable cooperation not only within a given community but also across memory institutions. This makes it possible to provide users with far more information from heterogeneous sources and to interconnect that information on the basis of its meaning. In this contribution we describe the infrastructure and an extension layer for personal authorities. The solution was designed and implemented within the project “National Authorities in the Environment of Museums and Galleries - Interoperability with the National Library of the Czech Republic” (Národní autority v prostředí muzeí a galerií – interoperabilita s NK ČR). The metadata scheme for personal authorities is extended in accordance with the ontological framework CIDOC CRM (ISO 21127:2006). This lets us offer users much more precise and comprehensive searching and at the same time lays a foundation for interconnecting heterogeneously built collections on the basis of the meaning of the information they contain. We report on the practical results of the project and briefly describe a closely related project, the Register of Fine Art Collections (Registr sbírek výtvarného umění), which already presents more than 80 thousand works of art from the collections of galleries united in the Association of Galleries of the Czech Republic (Rada galerií ČR). In conclusion we offer several further suggestions for using such interconnected resources to give users more information and to reveal some of the knowledge hidden in these sources.
standards, ontology, knowledge organization systems, interoperability, conceptual models, CIDOC CRM, authorities
Reviewers:
Ing. Martin Lhoták
Ing. Petr Žabička
With the ever faster and more massive development of information and communication technologies, the number of information systems holding huge amounts of information, findings and knowledge from various fields keeps increasing. Finding the required information, gaining access to it and, above all, interconnecting related items from different sources is often difficult and demands enormous effort and time. Questions of how to share and reuse digital content, regardless of where, how and under what rules it was created, are therefore coming more and more into focus. The aim of these solutions is to give users a clear and exhaustive answer that fully corresponds to their areas of interest. Because contemporary users take the Internet and mobile phones almost for granted as a daily part of their lives, mere compatibility of information systems, which enables data to be exported and imported among several systems, is no longer sufficient. This is why the new term “interoperability” became established.
Paul Miller states in his article that being interoperable means managing both the creation of information systems and the organizational culture so that the exchange and reuse of information resources is ensured at all levels to the greatest possible extent [1].
Lagoze [2] deals mainly with interoperability in the field of digital libraries and understands it as a broad term covering issues from data structures, through the discovery of networked resources and their naming, to service architecture [3].
According to Arms [4], interoperability means the ability of systems, of information and communication technologies, and of the working processes that support them to share and reuse data and knowledge. In her dissertation, Andrejčíková [5] added “[...] on-line, without the need for any further manipulation of them” to the definition, precisely in order to prevent confusion between interoperability and compatibility.
The main aims of interoperability, as described by Arms and Gill [6], are efficient usability, mutual semantic interconnection and long-term preservation of information resources in digital form. Interoperability has several levels. Three basic levels are identified first: technical, semantic and organizational.
Naturally, further interoperability levels, such as international or legislative, can be identified over time. Nevertheless, these three basic levels are sufficient to point out what would enable memory institutions in the Czech Republic to offer their users much more. In this contribution we deal mainly with questions of achieving semantic interoperability and briefly also with the technical level.
Memory institutions follow a number of rules when processing information, findings and acquired knowledge, and they store data in their own metadata schemes, which are mostly defined at the national and international level for a particular community (libraries, museums, galleries, archives, etc.). Metadata are therefore very important at the semantic interoperability level.
According to Gill [7], the term metadata in its broader sense covers everything we can find out about any information object, regardless of its level of aggregation. An information object is anything that either a person or an information system can access and handle. It can be a single element, a group of several elements or a whole database of such elements. The metadata of an information object describe its three basic characteristics:
From the point of view of achieving semantic interoperability, the rules that memory institutions follow when filling in metadata schemes are themselves very important. For better orientation, the following table gives an overview of the basic metadata schemes and rules recommended for particular communities, that is, for particular types of memory institutions.
| Institution type | Metadata schemes | Rules and models |
|---|---|---|
| Libraries | MARC formats (also XML-based) | AACR2, RDA, ISBD, FRAD, FRSAD, FRBR (FRBRoo), GAAR |
| Archives | EAD | ISAD, ISAAR |
| Museums and galleries | VRA Core, CDWA | CCO |
| Universal | Dublin Core | CIDOC CRM |

Tab. 1 Overview of the rules and metadata schemes in particular types of memory and fund institutions
The main task of such metadata schemes and rules is to make it possible to describe and organize, and subsequently to find and make available, the original objects of our cultural heritage. We therefore understand semantic interoperability in memory institutions mainly as interoperability of metadata or metadata schemes, aimed at interconnecting data that are related on the basis of their meaning. To achieve semantic interoperability in the metadata field we can use several methods, which are grouped into three basic levels: the schema level, the record level and the repository level.
The cited resource describes in more detail methods such as derivation, the creation of application profiles, mapping, and the use of a mediator, framework or register, which are used to achieve interoperability at the metadata scheme level. These methods are mostly applied in particular systems before the metadata records are created.
When we talk about the semantic interoperability of resources that already exist, we have to choose from the methods designed to achieve interoperability at the record or repository level. These methods are described in detail in the second part of the mentioned study [9].
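To make the mapping method more concrete, the following is a minimal sketch of a schema-level crosswalk that converts a simplified MARC 21 bibliographic record into Dublin Core elements. The field-to-element pairs follow commonly published MARC to Dublin Core mappings. The record layout (a list of tag, subfield code and value tuples), the function name and the sample record are assumptions made for illustration only.

```python
# Illustrative crosswalk: (MARC tag, subfield code) -> Dublin Core element.
# Only a handful of pairs are shown; a real crosswalk is far larger.
MARC_TO_DC = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def marc_to_dublin_core(record):
    """Map a simplified MARC record to a dict of Dublin Core elements.

    `record` is assumed to be a list of (tag, subfield code, value) tuples.
    Real MARC records also carry indicators and control fields.
    """
    dc = {}
    for tag, code, value in record:
        element = MARC_TO_DC.get((tag, code))
        if element is None:
            continue  # no mapping defined, so this piece of data is dropped
        dc.setdefault(element, []).append(value)
    return dc


# Hypothetical sample record used only to exercise the function.
sample = [
    ("100", "a", "Čapek, Karel"),
    ("245", "a", "R.U.R."),
    ("650", "a", "Robots"),
    ("650", "a", "Drama"),
    ("300", "a", "95 p."),  # physical description has no target in this sketch
]
result = marc_to_dublin_core(sample)
print(result)
```

The dropped field in the sample shows the usual cost of mapping between schemes of different richness: anything without a counterpart in the target scheme is lost unless the crosswalk is extended.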
Conceptual models represent a significant shift in achieving semantic interoperability of memory institutions and cultural heritage. As early as 1997, librarians came up with an entity-relationship model focused on the functional requirements that users place on bibliographic records. This model is known under the abbreviation FRBR (Functional Requirements for Bibliographic Records). Its first version was approved by the Standing Committee of the IFLA Section on Cataloguing on 5 September 1997 in Copenhagen, during the 63rd General Conference of IFLA (International Federation of Library Associations and Institutions), and it was published in 1998 [10]. The currently valid version dates from February 2009 and is available electronically [11].
The aim was to create a generalized view of the bibliographic domain that would depend neither on any format used for recording and exchanging catalogue records nor on particular implementations [12]. The model therefore provides a clearly defined structural framework that relates the data in bibliographic records to the needs of the users of those records. The authors of FRBR first isolated the entities that are the key objects of interest to users of bibliographic records. They then identified the attributes that characterize these entities and the relationships that can exist among them; these relationships serve users for navigating the whole universe of entities described in bibliographic records. From this point of view the entities are divided into three groups:
1st group contains the entities that represent the products of intellectual or artistic effort: work, expression, manifestation and item,
2nd group contains the entities responsible for the products in the first group: person and corporate body,
3rd group contains the entities that serve as the subjects of the products in the first group: concept, object, place and event.
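The three groups can be sketched as a small data structure. The example below is only an illustration of how the Group 1 chain (work, expression, manifestation, item) nests and how Group 2 and Group 3 entities attach to a work. The class and attribute names, the simplification of subjects to plain strings and the sample data are assumptions, not part of the FRBR specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Group 1: a single exemplar of a manifestation (e.g. one copy)."""
    identifier: str


@dataclass
class Manifestation:
    """Group 1: the physical embodiment of an expression (e.g. an edition)."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """Group 1: a realization of a work (e.g. a text in a given language)."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Person:
    """Group 2: an entity responsible for a Group 1 entity."""
    name: str


@dataclass
class Work:
    """Group 1: the abstract intellectual or artistic creation."""
    title: str
    creators: List[Person] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)  # Group 3, simplified
    expressions: List[Expression] = field(default_factory=list)


def all_items(work: Work) -> List[Item]:
    """Navigate work -> expression -> manifestation -> item."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Hypothetical sample data; edition labels and copy identifiers are invented.
work = Work(
    title="R.U.R.",
    creators=[Person("Čapek, Karel")],
    subjects=["Robots"],
    expressions=[
        Expression("cze", [Manifestation("Edition A", [Item("copy-1"), Item("copy-2")])]),
        Expression("eng", [Manifestation("Edition B", [Item("copy-3")])]),
    ],
)
item_ids = [item.identifier for item in all_items(work)]
print(item_ids)
```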
Representing bibliographic data through this model offers users a new environment for navigation, although it is usually not simple to identify the abstract entities precisely, especially when common metadata schemes such as MARC21 and UNIMARC are used. The abstract entities are the first two entities from the first group, work and expression. A work represents a distinct intellectual or artistic creation. It is a purely abstract entity and its definition can be influenced by a particular culture. An expression is difficult to define, although FRBR draws the boundary between what is and is not a new expression in a way that excludes any physical form. The physical form affects the manifestation, that is, the data carrier or the binding of the work. The definition and mapping of the other entities is somewhat simpler and more concrete. The mapping of bibliographic resources to this model is documented in detail on the Library of Congress website, which presents a functional analysis of bibliographic and holdings records in the MARC21 format [13].
Other conceptual models, FRAD (Functional Requirements for Authority Data) [14] and FRSAD (Functional Requirements for Subject Authority Data) [15], describe the entities from the second and third groups in detail.
The authors of CIDOC CRM (Conceptual Reference Model) [16] chose a completely different approach to creating a conceptual model. CIDOC CRM was created under the auspices of ICOM CIDOC (International Council of Museums, International Committee for Documentation) [17]. This conceptual reference framework provides definitions and a formal structure for describing the concepts used in cultural heritage documentation and the relationships among them. It is a comprehensive formal ontological standard (ISO 21127 since 2006 [18]) that can be used as a common language for achieving advanced information integration [19].
We can say that the main goal of CIDOC CRM is to enable the exchange and sharing of information among heterogeneous cultural heritage resources [20]. CIDOC CRM in no way prescribes what memory institutions should document or how they should process and document cultural heritage. Similarly, it does not provide any manual for implementing the ontology, nor does it specify data formats or interfaces. The latest CIDOC CRM version, from 2010, defines 90 main classes and 148 properties that can be used to create relationships among entities; each property has a defined domain and range. Classes and properties in CIDOC CRM are characterized not only by hierarchy but also by multiple inheritance, which must be respected in any implementation.
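The point about multiple inheritance can be illustrated with a well-known case from the CIDOC CRM class hierarchy: E21 Person is a subclass of both E20 Biological Object and E39 Actor. The sketch below models this with Python classes. It is only an illustration: several intermediate classes between E77 Persistent Item and E20 Biological Object are deliberately omitted, and the helper function is an assumption introduced here.

```python
class E1_CRMEntity:
    """Root of the CIDOC CRM class hierarchy."""


class E77_PersistentItem(E1_CRMEntity):
    """Items that persist through time (things and actors)."""


class E39_Actor(E77_PersistentItem):
    """People or groups that can perform intentional actions."""


class E20_BiologicalObject(E77_PersistentItem):
    """Living or once-living physical objects.

    Simplification: the intermediate classes of the real hierarchy
    (for example E18 Physical Thing and E19 Physical Object) are omitted.
    """


class E21_Person(E20_BiologicalObject, E39_Actor):
    """A person is both a biological object and an actor."""


def superclass_names(cls):
    """Return the class and all its CRM superclasses in resolution order."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


lineage = superclass_names(E21_Person)
print(lineage)
```

An implementation that stores only a single parent per class would lose one of these two lines of descent, so properties defined for actors or for physical objects would no longer apply to persons.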
The model is based on the idea that every information object can be described through events. Events are the basic element of CIDOC CRM; they are represented by temporal entities that connect a time-span with actors and objects. Objects can be physical (physical things) or conceptual (conceptual objects). So that the model can be used to link heterogeneous resources independently of the language used, CIDOC CRM distinguishes between an entity and its name, because the name differs between languages and a single entity often has many alternative names. This is well known from the cooperative creation of national authorities, where one uniform heading can have any number of variant names in “see” references. Every entity is therefore described by a type or by an appellation.
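A minimal sketch of this event-centred view follows. It separates an entity from its appellations and uses an event to connect a time-span, a place and an actor. The comments name the CIDOC CRM classes and properties that each element loosely corresponds to (E41 Appellation, P1 is identified by, P4 has time-span, P7 took place at, P11 had participant). The Python class names, the identifiers and the helper function are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Appellation:
    """A name of an entity in a given language (cf. E41 Appellation)."""
    text: str
    language: str


@dataclass
class Entity:
    """An entity identified independently of its names (cf. E1 CRM Entity)."""
    identifier: str  # hypothetical local identifier
    appellations: List[Appellation] = field(default_factory=list)  # cf. P1

    def name_in(self, language: str) -> Optional[str]:
        """Return the first appellation in the requested language, if any."""
        for appellation in self.appellations:
            if appellation.language == language:
                return appellation.text
        return None


@dataclass
class Event:
    """A temporal entity linking time, place and participants."""
    kind: str                  # e.g. "birth" (cf. E67 Birth)
    time_span: str             # cf. P4 has time-span
    place: Entity              # cf. P7 took place at
    participants: List[Entity]  # cf. P11 had participant


def events_of(entity: Entity, events: List[Event]) -> List[Event]:
    """Find all events in which the given entity took part."""
    return [
        event for event in events
        if any(p.identifier == entity.identifier for p in event.participants)
    ]


person = Entity("person-1", [
    Appellation("Alfons Mucha", "cze"),
    Appellation("Alphonse Mucha", "fre"),
])
place = Entity("place-1", [Appellation("Ivančice", "cze")])
birth = Event("birth", "1860-07-24", place, [person])

found = events_of(person, [birth])
print(found[0].kind, found[0].time_span, found[0].place.name_in("cze"))
```

Because links are made between identifiers and not between name strings, the same birth event is found regardless of which language form of the name a source system uses.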
Because the entity-relationship model FRBR is not concerned with events, it cannot take time into account, nor can it capture what happened between the initial idea for a work, through its realization, up to the final product. This gave rise to the idea of harmonizing the two models. Experts from both communities formed a working group for harmonizing FRBR and CIDOC CRM with two main goals:
The first draft of the object-oriented model FRBRoo was presented in 2006; the first version was accepted two years later and published in 2009 [22]. Version 1.0.1 from January 2010 [23] is the current version, and a draft representation of the model in RDFS [24] (Resource Description Framework Schema) was published in April 2011 [25].
Thanks to the successful harmonization of these two models, an environment has been created that fulfils the main condition for achieving semantic interoperability among memory institutions. FRBRoo and CIDOC CRM together provide a generally acceptable ontological model that can serve as a new form of data representation. Such a representation gives users direct answers to questions that cannot be answered with the systems currently used in our environment, for instance questions like:
A lot of work remains to be done to achieve these goals and to answer such questions using current data about cultural heritage from memory institutions in the Czech Republic. Nevertheless, the results achieved so far, which are presented in detail in the conference presentation, show that we have set out in the right direction.
Project DC07P02OUK002 “Authorities in the environment of museums and galleries - interoperability with the National Library of the Czech Republic” represents the first step in the practical use of the principles and standards mentioned above. The principal investigator is CITeM (Methodological Centre for Information Technologies in Museology), a department of the Moravian Museum in Brno (Moravské zemské muzeum v Brně). The project was planned to last five years (2007-2011).
Defined project goals were:
At this time, half a year before the end of the project, we can say that the goals will be accomplished. A cooperation model focused mainly on personal authorities has been designed and implemented in cooperation with the Cosmotron company. The main attributes of the model are:
The main principle of the technical solution is complete adherence to the current scheme of the National Authority Files, without enforcing any changes. The museum interlayer uses a separate database that is regularly synchronized with the National Authority Files of the Czech Republic. A new online web-based interface is used for creating and modifying museum authorities. Cooperation with the National Authority Files is currently being tested on data from four institutions (Oblastní galerie Vysočiny v Jihlavě, České muzeum výtvarného umění, Památník národního písemnictví and Regionální muzeum v Litomyšli). First, the data were collected and harmonized in bulk. Second, they were revised, modified and manually enriched. This work is carried out by 5 curators and 2 supervisors. By the end of the project we expect over 3000 confirmed authority records at the museum database level. A very positive “side effect” is that 314 new personal authority records, 94 corporate body records (mainly art schools and groups), over 100 geographical name records and almost 100 subject records (mainly art professions) were added to the National Authority Files. In addition, errors were corrected in almost 200 records.
Direct communication between the Demus system (a museum collection management system) and the museum authorities through a web service was successfully tested. The parameters of this web service, as well as the XML structure for museum authority records, will be released at the end of the project. These results mean that collection management systems can communicate online with the museum authority database, which will improve the quality of records and simplify curators’ work.
In designing the XML for museum authority records, the goal was to combine the advantages of CIDOC/CRM and its XML representations with the MARC 21/Authorities format in XML. Creating a separate XML structure was rejected at the very beginning because it would serve only a single purpose. Using MARCXML alone was excluded as well, for a simple reason: the data structure of museum authority records exceeds the range of fields in MARC 21/Authorities, and the emphasis is placed on data that cannot be recorded in that format. Two problems arose when we tried to apply an XML representation of CIDOC/CRM to the project. First, no unified approach to creating XML in accordance with CIDOC/CRM has existed so far, so inspiration was sought in other projects. Second, no other project has yet focused mainly on processing information about people. After evaluating the results of these projects, we chose the XML representation of CIDOC/CRM that is the most correct technically and formally and at the same time open enough to changes. The characteristics of XML allow MARCXML elements to be embedded in CIDOC/CRM XML. This was necessary mainly in the parts where any form of heading is used (personal names, names of places, corporate bodies), where library practice has proved to be suitable and well established. The result, which combines the advantages of the analysed approaches, is an XML structure with two defined namespaces:
The core of every record consists of elements describing the individual events connected to a person (see the example in Picture 1).

Picture 1 Example of base XML structure
The MARC namespace, and thus the MARCXML structure, is used for entering names (an example of a preferred name and an alternative name is in Picture 2).

Picture 2 Example of preferred and alternative name description
Analogously, the MARC namespace is embedded in the birth event, where it is used to describe the place of birth (see the example in Picture 3).

Picture 3 Example of place of birth description
Work on the XML structure for the project is continuing intensively, and the structure is likely to be enriched with further useful data.
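Since the project's actual XML structure is not reproduced here, the following is only a rough sketch of the general idea described above: a record whose event-oriented skeleton lives in one namespace while names and places are expressed as embedded MARCXML data fields. The MARCXML namespace URI and the `datafield`/`subfield` elements are the standard ones. Everything else is an assumption made for illustration and does not reflect the project's schema: the CRM namespace URI, the element names derived from CIDOC CRM labels, the choice of MARC tags 100, 400 and 151, and the sample values.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # standard MARCXML namespace
CRM_NS = "http://example.org/crm-sketch"    # placeholder, not the project's URI

ET.register_namespace("marc", MARC_NS)
ET.register_namespace("crm", CRM_NS)


def crm(parent, name):
    """Append a child element in the (hypothetical) CRM namespace."""
    return ET.SubElement(parent, f"{{{CRM_NS}}}{name}")


def marc_datafield(parent, tag, subfields, ind1=" ", ind2=" "):
    """Append a MARCXML datafield with the given subfields."""
    datafield = ET.SubElement(
        parent, f"{{{MARC_NS}}}datafield",
        {"tag": tag, "ind1": ind1, "ind2": ind2},
    )
    for code, value in subfields:
        subfield = ET.SubElement(datafield, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return datafield


# The record skeleton uses CRM-style elements; headings use MARCXML.
record = ET.Element(f"{{{CRM_NS}}}E21_Person")

names = crm(record, "P1_is_identified_by")
marc_datafield(names, "100", [("a", "Mucha, Alfons"), ("d", "1860-1939")], ind1="1")
marc_datafield(names, "400", [("a", "Mucha, Alphonse")], ind1="1")

birth = crm(record, "E67_Birth")
birth_place = crm(birth, "P7_took_place_at")
marc_datafield(birth_place, "151", [("a", "Ivančice (Česko)")])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping headings in MARCXML means that the name and place forms can be exchanged with library authority files unchanged, while the surrounding event elements can carry museum data for which the MARC authority format has no fields.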
A certified methodology for creating museum authority records should be one of the main results of the project. The methodology conforms to the one prepared by the National Library of the Czech Republic for the National Authority Files and extends its scope to the newly added museum data fields.
Given the permanent lack of funding from the Ministry of Culture of the Czech Republic, the sustainability of the further creation and development of museum authority files remains in question. Unfortunately, a large project for an Integrated System of Collection Administration (which also covered authority files) was terminated prematurely. The principal investigators are looking for alternative resources, so far without notable success.
The first practical use of museum authority files can be found in the Register of Fine Art Collections (Registr sbírek výtvarného umění), a project carried out by CITeM in cooperation with the Association of Galleries of the Czech Republic (Rada galerií ČR). It is the first online union catalogue of museum and gallery objects, with 80 241 objects (as of 3 May 2011) from 18 participating galleries. Personal name data from the museum authority files were used mainly to unify the different name forms of authors.
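How an authority file can unify different name forms may be sketched as a simple lookup: every preferred and variant form is normalized and indexed under a single authority identifier, and the author names found in catalogue records are resolved against that index. The normalization rules, the record layout, the identifier `auth-0001` and the sample names are assumptions for illustration and do not describe how the Register is actually implemented.

```python
import unicodedata


def normalize(name):
    """Lower-case, strip diacritics and punctuation, and ignore word order."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = without_marks.replace(",", " ").replace(".", " ").lower().split()
    return " ".join(sorted(words))


def build_index(authorities):
    """Index every preferred and variant name form by its authority id."""
    index = {}
    for authority_id, record in authorities.items():
        for form in [record["preferred"], *record["variants"]]:
            index[normalize(form)] = authority_id
    return index


def resolve(name, index):
    """Return the authority id for a name form, or None if it is unknown."""
    return index.get(normalize(name))


# Hypothetical authority record; the identifier is invented.
authorities = {
    "auth-0001": {
        "preferred": "Mucha, Alfons",
        "variants": ["Mucha, Alphonse", "Alfons Maria Mucha"],
    },
}
index = build_index(authorities)

resolved = resolve("Alphonse MUCHA", index)
unresolved = resolve("A. Mucha", index)
print(resolved, unresolved)
```

Forms that cannot be resolved automatically, such as the abbreviated name in the last call, are the cases that still need a curator's decision.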
A new representation of the data, designed in conformity with the FRBRoo specification, will give users answers even to the complex questions mentioned above, as described in the examples attached to this study.
That we have set out in the right direction is confirmed by the research project INTERPI [26], initiated this year. The principal investigators of this project are the National Library of the Czech Republic and the National Archive of the Czech Republic. The project loosely builds on the results achieved in this field.
We can say that we have succeeded not only at the technical and semantic levels of interoperability but also at the organizational level, the level at which interoperability is in many cases considered lowest.
To the question “How can we offer more to users?” the answer is simple. If we are receptive and able to accept the changes brought by our environment, and do not insist on workflows used for years, that will be enough to succeed. This is a very simple principle of action and reaction. If we do not, the prognosis made in 2006 by Dr. Řehák from the Municipal Library of Prague might come true very quickly. He said that there are just two types of libraries: those that will change and those that will disappear. Today, when change is accelerating exponentially, this call for a change of paradigm, with its emphasis on users’ needs and modern technologies, is more relevant than ever.
ANDREJČÍKOVÁ, Nadežda; LENHART, Zdeněk; PODOLNÍKOVÁ, Jarmila; ŠUBOVÁ, Jana. How to offer more to users. ProInflow [online]. 14.10.2011 [cit. 17.05.2012]. Available from WWW: <http://pro.inflow.cz/how-offer-more-users>. ISSN 1804–2406.