Kwaaitaal, I., Hoogeveen, M.J., & Van der Weide, T. (1994). A Reference Model for the Impact of Standardisation on Multimedia Database Management Systems. Computer Standards & Interfaces, 16, 45-54.
A Reference Model for the Impact of Standardisation on Multimedia Database Management Systems
Irene Kwaaitaal, Martijn Hoogeveen and Theo Van Der Weide
Abstract
This article discusses the standardisation of MDBMSs which is needed to keep pace with rapid developments in the area of multimedia systems. A reference model for the impact of standardisation on MDBMSs is presented which summarises the results of a survey of relevant standards. The reference model is used to identify standardisation gaps. Currently, one of the most important gaps is the lack of a standard multimedia database language for definition and manipulation of multimedia data.
Keywords: Multimedia; DBMS; reference model; standardisation.
1. Introduction
Technical developments in the areas of DataBase Management Systems (DBMSs) [6], Information Retrieval Systems (IRSs) [31], Hypermedia Systems (HSs) [4], graphical user interfaces and storage media have fostered an increase in multimedia applications which use multimedia databases. Multimedia databases contain data that are represented by multiple representation media, such as data types for text, audio, video, graphics and animation [17]. The management of multimedia databases is performed by so-called Multimedia DataBase Management Systems (MDBMSs), which need to integrate IRS, DBMS and HS facilities [11]. An MDBMS provides an interface to offer its facilities to client applications. Other interfaces of an MDBMS (see figure 1) are the interface with external information systems to exchange data and the interface to the database in which the data is actually stored. An MDBMS may manage several databases, distributed over different computer platforms. Standardisation of these interfaces is necessary to ease communication between the different entities.
Currently, the main problem is the proliferation of MDBMSs on the market which do not comply with existing standards and which, as a consequence, hamper the integrated use of multi-vendor databases. Recently developed multimedia standards are expected to have an impact on MDBMSs. On the one hand, improved support of multimedia standards will increase the usability of MDBMSs. On the other hand, since the standardisation of MDBMSs is a relatively underexposed issue, there is no sound MDBMS standard available.
Therefore, the goals of our research were to determine the impact current standardisation should have on MDBMSs and then to identify standardisation gaps. To limit the scope of the survey, only de jure ISO and CCITT standards were considered. To reach these goals we first listed the facilities an MDBMS may offer and studied a selection of standards. Then we constructed a reference model to visualise the impact of the standards and the standardisation gaps. One should be cautious with conclusions about gaps, because the absence of a standard does not mean that standardisation is really necessary. To quote Asker [1]: "To standardise as many interfaces as possible is definitely not in our best interest as users."
The organisation of this report is as follows. First, the standards included in our research are briefly discussed. Next, our reference model for the impact of standardisation on MDBMSs is presented. Then, for each facility provided by an MDBMS, standardisation areas are discussed, together with the existing standards that cover these areas. When a standardisation area is not covered by any standard, or only partly covered, a standardisation gap is identified. Finally, conclusions are drawn and recommendations are given.
2. Standards
Three main groups of standards are investigated: reference model standards, information coding standards and document standards.
The reference model standards determine frameworks for other, more specific standards. The Reference Model of Database Management (RMDM) [24], the Basic Reference Model of Open Distributed Processing (ODP) [25] and the Basic Reference Model of OSI [29] are the three reference model standards included in this survey. The OSI standards for networked services that are also included in our research are Document Filing and Retrieval (DFR) [28], Search and Retrieve (SR) [27] and Remote Database Access (RDA) [26].
Another important group of standards is concerned with the representation of data. Data content coding standardisation concerns the coding of text, graphics, still image, audio and video. The standards that were included in the research are Computer Graphics Metafile [12], JPEG for still image [13] [14], JBIG [16] for two-tone image, and MPEG [15] and H.261 [5] for audio and video.
Standardisation of the structuring of data is needed for communication between the MDBMS and client applications, (remote) databases, and other information systems. Standards are needed for structuring data relationally, in an object-oriented way, and in documents. For relational structuring the SQL standard [18] [8] is available. Document structuring standards are ODA [19] and SGML [21]. ODIF [20] and SDIF [22] standardise the respective exchange formats of ODA and SGML documents. HyTime [23] is a modelling language for hypermedia documents based on SGML. It standardises the representation of hyperlinks between objects and the positioning of hypermedia objects in a time and space domain. HyperODA is the hypermedia extension of ODA, which is currently being worked out. MHEG [17] is a standard for coding multimedia and hypermedia information objects. MHEG also standardises hyperlinks and a mechanism to position objects in a time and space domain. The abovementioned standards overlap, and in the near future the creator of a document will have to decide whether the information objects to be represented must conform to HyTime, MHEG or HyperODA.
3. A simple reference model for the impact of standardisation
The reference model for the impact of standardisation consists of a vertical axis, on which existing standards are set out, and a horizontal axis, on which the facilities of an MDBMS [11] [24] are set out. For each facility, potential standardisation areas are identified. Five types of impact of a standard on an MDBMS facility are identified, and are shown below:
Impact on facilities:
- Totally covered by the standard.
- Partially covered by the standard.
- Affected: some aspects of the facility are influenced by the standard.
- Global: global aspects of a facility, e.g. terminology, are determined by a standard.
- None: no impact.
4. Identification of standardisation areas, coverage and gaps
An MDBMS may provide the following facilities [24]: a Data Modelling Facility (DMF), a facility for creation and modification of data, a storage and access facility, several search facilities, a projection facility, an import / export facility, access control management, recovery of lost data, distribution management, configuration management and an API.
Below, for each facility, standardisation areas are identified, the impact of standards is discussed and standardisation gaps are highlighted.
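As an illustration of the reference model of section 3, the two axes and the five impact types can be pictured as a small lookup table. The sketch below is only one possible encoding: the cell values, the facility names used as keys and the helper functions `impact` and `gaps` are hypothetical and are not copied from the reference model itself.

```python
from enum import Enum


class Impact(Enum):
    """The five impact types named in the reference model."""
    TOTAL = "totally covered"
    PARTIAL = "partially covered"
    AFFECTED = "affected"
    GLOBAL = "global"
    NONE = "none"


# Hypothetical cell values for illustration only; they are not taken
# from the paper's reference model.
reference_model = {
    ("SQL", "data modelling"): Impact.PARTIAL,
    ("SQL", "access control"): Impact.PARTIAL,
    ("ODIF", "import/export"): Impact.PARTIAL,
    ("RMDM", "data modelling"): Impact.GLOBAL,
}


def impact(standard: str, facility: str) -> Impact:
    """Look up one cell; a missing cell means the standard has no impact."""
    return reference_model.get((standard, facility), Impact.NONE)


def gaps(facilities: list[str]) -> list[str]:
    """Facilities that no standard covers totally are candidate gaps."""
    covered = {f for (_, f), i in reference_model.items() if i is Impact.TOTAL}
    return [f for f in facilities if f not in covered]
```

Treating every facility without total coverage as a candidate gap follows the rule stated in the introduction: an area that is not covered, or only partly covered, counts as a gap.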
4.1 Data Modelling Facility (DMF)
The Data Modelling Facility [24] is the facility that executes statements specified in a data definition language (DDL) and a data manipulation language (DML). The DDL is used to specify the schemas of the database. A schema is a collection of data definitions which determines the structure and constraints of the data present in the database and is stored in a so-called data dictionary. A multimedia DDL must be able to model multimedia information in a relational way, in an object-oriented fashion, or according to some grammar (ODA, SGML). The DML is used to specify processes which may be performed on data structured according to the database schemas. Usually, a DDL and a DML are provided together in a database language. In this report, the query language used to search for information is discussed separately from the DML (see paragraph 4.4).
For the DMF, two standardisation areas can be distinguished (see the reference model): the DDL and the DML. Standardisation of the DDL is necessary to improve the exchange of structured data and related schemas. When the receiving system has the same DDL, the schema of the received data can be added to the data dictionary without loss of information. Standardisation of the DML makes it possible to exchange procedures easily between database systems and to access multi-vendor databases with the same statements from the same application.
SQL [18] is a standard database language that defines (among other things) a DDL and a DML for relationally structured data. Example DDL statements of SQL are CREATE SCHEMA, CREATE TABLE and CREATE VIEW. Example DML statements of SQL are INSERT, UPDATE and DELETE. SQL is limited regarding the definition and manipulation of free text, graphics, image, audio and video. Thus, either SQL should be extended for multimedia purposes or a completely new standard multimedia database language is needed.
ODA [19], SGML [21] and HyTime [23] are all document structuring standards which offer a language to specify the structure of documents. ODA and SGML do not support facilities for multimedia documents, but HyTime does. MHEG [17] standardises the structuring of multimedia information objects and can be used within a HyTime document. An important standardisation gap is the absence of a definition of an object-oriented DML for handling multimedia information objects and documents (compound objects). A multimedia database language should include document type definition and object-oriented data modelling.
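The DDL and DML statements named above can be tried out in a few lines. The sketch below uses Python's built-in sqlite3 module as a stand-in for a relational DBMS; SQLite implements its own SQL dialect rather than the 1989 standard cited here, and the `memo` table, the `memo_authors` view and the row contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the structure of the data (hypothetical 'memo' table and view).
conn.execute("CREATE TABLE memo (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.execute("CREATE VIEW memo_authors AS SELECT DISTINCT author FROM memo")

# DML: create, change and remove rows structured by that schema.
conn.execute("INSERT INTO memo (id, author, body) VALUES (1, 'Kwaaitaal', 'draft')")
conn.execute("INSERT INTO memo (id, author, body) VALUES (2, 'Hoogeveen', 'notes')")
conn.execute("UPDATE memo SET body = 'final' WHERE id = 1")
conn.execute("DELETE FROM memo WHERE id = 2")

# Read back the remaining row and the authors visible through the view.
rows = conn.execute("SELECT id, author, body FROM memo").fetchall()
authors = conn.execute("SELECT author FROM memo_authors").fetchall()
```

All columns here hold plain text or numbers, which is the kind of data the section describes SQL as well suited for.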
4.2 Creation and manipulation of data
This facility is about editor functions for creation and manipulation of content data. MDBMSs need to communicate with external editors, but they also may incorporate them. An example of the use of an editor is a text editor for the creation of the text of a memo which will later be added to a database record with a DML statement. Other editors are graphical, video and audio editors. For the exchange of documents, editors need to support structuring standards, for example SGML or MHEG. Standardisation of the user interface aspects of the editor itself is outside the scope of this paper, although of clear importance.
4.3 Storage and access facility
The storage and access facility [11] is used to store information in, and access it from, secondary memory. When information is accessed it may also be decompressed according to a data content coding standard. The best known compression standards are MPEG [15], JPEG [13] [14], JBIG [16] and H.261 [5].
Standardisation of the database storage format is important to be able to use multi-vendor MDBMSs for access to multi-vendor databases.
4.4 Search facilities
Multimedia applications need different search techniques to perform effective searches on different types of data. The search facilities an MDBMS may provide [11] can be classified as DBMS search, IRS search, and HS search facilities. DBMS search facilities can be used to search for data structured in a relational or an object-oriented way. IRS search facilities are used to search through less structured information like text in documents. HS search is an associative search technique for tracing hyperlinks, which may interconnect any two fragments of information in a database.
DBMS search facilities
An exact query language is used for searching a database using the DBMS search facility. The most important advantage of a standardised query language is that a remote database system can be queried with the help of a protocol standard.
SQL [18] provides the SELECT statement for searching for textual and numerical information structured in tables. As noted before (see paragraph 4.1), a standardisation gap can be distinguished for searching for graphical, image, audio and video information structured in tables. Another important standardisation gap is the absence of an object-oriented query language.
IRS search facilities
In an IRS search, the relevance of information objects to a particular request is not determined directly. Indexing terms are inferred from the information objects present in the database [31] and a reference to the information object may be added to the index. The indexing terms are expressed in an indexing language. The requests are also indexed. The process for determining which objects should be retrieved compares the entries of the index with the index terms derived from the request and selects a reference to an object if the index terms are sufficiently close. Indexing may be done automatically by the computer system or manually by a user who specifies the index terms when an object is created. Automatic indexing of natural language is present in most IR systems. Automatic indexing of images is still under development [30] [2].
The use of a standardised indexing language by an MDBMS has the advantage that data objects can be exchanged together with their index terms. This removes the need for reindexing when an information object is imported into another information system. Furthermore, if the format of the index entries is standardised, indexes in remote databases can be searched.
A thesaurus can be used to determine additional index terms by including synonyms. Standardisation of the structure and representation of a thesaurus allows the use of thesauri on different information systems. Two standards for thesauri exist: ISO 2788 [32] for monolingual thesauri and ISO 5964 [33] for multilingual thesauri. Support of these standards by MDBMSs should be advocated.
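The indexing and matching process described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: index terms are just lower-cased words, "sufficiently close" is reduced to an exact term match, and the documents, the thesaurus entries and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents and thesaurus, for illustration only.
documents = {
    "d1": "video compression for multimedia databases",
    "d2": "thesaurus construction for text retrieval",
    "d3": "film archives and moving picture storage",
}
thesaurus = {"video": {"film"}, "film": {"video"}}


def index_terms(text: str) -> set[str]:
    """Derive indexing terms from text (here: lower-cased words)."""
    return set(text.lower().split())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each index term to references of the objects containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index[term].add(doc_id)
    return index


def search(index, request: str, use_thesaurus: bool = False) -> set[str]:
    """Index the request the same way and collect matching references."""
    terms = index_terms(request)
    if use_thesaurus:
        # Add synonyms of each request term as additional index terms.
        for term in list(terms):
            terms |= thesaurus.get(term, set())
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


index = build_index(documents)
```

With these data, `search(index, "video")` finds only `d1`, while `search(index, "video", use_thesaurus=True)` also finds `d3` through the synonym "film".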
HS facilities
In a hypermedia search [4] the user can navigate or "browse" from one information fragment on a "node", "card" or "record" to another by tracing a hyperlink. A hyperlink is often activated by clicking a graphical button within a node. The links represent a kind of association between the information fragments, for example a reference to literature. It must be possible to create a certain order of nodes within a hypermedia document, a suggested and predefined path. Graphical browsers are used to present an overview of nodes and their links to a 'reader' of a hypermedia document. These graphical browsers help to prevent users from becoming 'lost in hyperspace'.
To exchange hypermedia documents, standardisation of the representation of the hypermedia documents, including hyperlinks and paths, is necessary (see the reference model). MHEG [17] and HyTime [23] both standardise the representation of hyperlinks and overlap each other at this point. Only MHEG standardises the representation of a path. Standardisation of the browsing procedure makes it possible to use a browser for documents created in different systems. Coverage of the standardisation areas derived from the hypermedia search facility can have large consequences. It makes it possible to create a world wide web of hyperlinks over which one can browse, independent of software and hardware platforms. One use of this facility could be a worldwide system of scientific reports. The references of a report can be implemented as hyperlinks, which eases the search for information on a certain subject.
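Nodes, hyperlinks and a predefined path can be pictured as a small directed graph. The sketch below is an illustration only and does not follow the MHEG or HyTime representations; the node names and the functions `follow`, `path_is_valid` and `reachable` are hypothetical.

```python
# Hypothetical nodes and hyperlinks of a small hypermedia document.
links = {
    "report": ["section-1", "references"],
    "section-1": ["references"],
    "references": ["cited-paper"],
    "cited-paper": [],
}

# A predefined path: the order of nodes suggested by the author.
suggested_path = ["report", "section-1", "references"]


def follow(node: str, target: str) -> str:
    """Trace one hyperlink; fail if no such link leaves the node."""
    if target not in links[node]:
        raise ValueError(f"no hyperlink from {node!r} to {target!r}")
    return target


def path_is_valid(path: list[str]) -> bool:
    """A path is valid if each step follows an existing hyperlink."""
    return all(b in links[a] for a, b in zip(path, path[1:]))


def reachable(start: str) -> set[str]:
    """All nodes a reader can reach by browsing from the start node."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node not in seen:
            seen.add(node)
            todo.extend(links[node])
    return seen
```

A graphical browser of the kind mentioned above would present such a set of reachable nodes and their links as an overview map.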
Incremental search
An incremental search facility offers the possibility to use the result set of earlier queries in a new query. The MDBMS needs to provide the possibility to manipulate sets of selected information objects with set operations like "and", "or", and "not" [9]. Standardisation of these operations and of the representation of sets in a set manipulation language has the advantage that set operations can be performed on multi-vendor databases.
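The set operations mentioned above map directly onto the set type of most programming languages. A minimal sketch, with hypothetical object identifiers and result sets:

```python
# Hypothetical result sets of earlier queries, as sets of object identifiers.
all_objects = {"o1", "o2", "o3", "o4", "o5"}
video_hits = {"o1", "o2", "o3"}
recent_hits = {"o2", "o3", "o4"}

both = video_hits & recent_hits        # "and": in both result sets
either = video_hits | recent_hits      # "or": in at least one result set
not_video = all_objects - video_hits   # "not": complement within the database

# Incremental step: refine an earlier result with a new condition.
refined = both - {"o3"}
```

Note that "not" only has a meaning relative to some universe of objects, here the whole database; a standard set manipulation language would have to fix that universe as well.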
Storage of queries
An interesting facility for frequent searchers is the ability to save queries or parts of a query that are used often. Standardisation of the representation of a query has the advantage that requests may be exchanged and then executed by remote systems. If standard database languages like SQL are used, this problem is avoided.
Personal interest profile
A personal interest profile [3] can be specified by defining a set of information objects, a topic to which the objects belong and a user who is interested in the topic. New information objects in the database can be evaluated by a profiler, using the topic profiles. The objects that score high enough for a certain topic are added to the set of objects. The user may automatically be sent a message about the addition.
Standardisation of the representation of the profiles, the messages and the structure of the information objects creates the possibility to profile information objects which reside in a remote system and send messages to a remote user. Standardisation of the profiler has the advantage of allowing the profiler to execute on different platforms.
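A profiler of this kind can be sketched as follows. The scoring rule (the fraction of topic terms that occur in the text), the threshold and all names are assumptions made for the illustration; the text does not prescribe how objects are scored.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A personal interest profile: a user, a topic and the matching objects."""
    user: str
    topic_terms: set[str]
    threshold: float
    objects: set[str] = field(default_factory=set)


def score(text: str, topic_terms: set[str]) -> float:
    """Fraction of topic terms that occur in the object's text (assumed rule)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def profile_new_object(obj_id: str, text: str, profiles: list[Profile]) -> list[str]:
    """Add the object to every profile it scores high enough for.

    Returns the notification messages that would be sent to the users.
    """
    messages = []
    for p in profiles:
        if score(text, p.topic_terms) >= p.threshold:
            p.objects.add(obj_id)
            messages.append(f"to {p.user}: {obj_id} added to your profile")
    return messages
```

The three things that would have to be standardised for remote profiling are visible here: the profile (`Profile`), the message format (the returned strings) and the structure of the information object (here simply its text).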
Summary
Summarising, the standardisation gaps found for the search facility are: a relational query language for graphics, image, audio and video, an object-oriented query language which supports multimedia data types, an indexing language, an index format, a set manipulation language, the representation of a query and the areas concerned with the personal interest profile facility. It is clear that a standard database language is needed which combines SQL with IRS and HS search capabilities.
4.5 Projection of the information
Compared to traditional DBMSs, the projection of multimedia information by MDBMSs is more complicated because of the addition of presentational aspects such as colour, location in space and time, loudness, etc. [23]. The MDBMS must manage the interactive modification of this projection information (in the external model).
Standardisation areas affecting the projection facility are the representation of projection information and the time and space model.
4.6 Import / export facility
To make data exchange in the form of file transfer [24] possible between different information systems, the data must be coded in a way that both systems can interpret. This can be achieved by using the same DDL in both systems or by using a common exchange format to which the exported data and database schema are converted. The importing system converts the exchange format back to its internal representation.
ODIF [20] covers this area of standardisation for ODA documents, SDIF [22] for SGML and HyTime documents and MHEG information objects. A standardisation gap can be seen for relationally structured multimedia information.
4.7 Access control management
An MDBMS must provide two categories of services regarding access control [24]. First, it must be possible to define and modify access control privileges and assign them to users (authorisation). Second, it must be possible to enforce the access control. To exchange access control information, it is necessary to standardise object identification, user identification and privilege information. Standardisation of the access control functions and the access control information opens the possibility of performing access control over actions of a remote user or actions involving remote information.
SQL [18] covers this standardisation area with the GRANT statement for relationally structured information. For object-oriented structured information and documents, standardisation gaps concerning this facility are present.
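The two categories of service, authorisation and enforcement, can be sketched with a set of privilege triples. This is a toy model and not the SQL GRANT mechanism itself; the user, object and privilege names and the functions `grant`, `revoke` and `check` are hypothetical.

```python
# Granted privileges as (user, object, privilege) triples; all names hypothetical.
privileges: set[tuple[str, str, str]] = set()


def grant(user: str, obj: str, privilege: str) -> None:
    """Authorisation: record that a user holds a privilege on an object."""
    privileges.add((user, obj, privilege))


def revoke(user: str, obj: str, privilege: str) -> None:
    """Withdraw a privilege; revoking one that was never granted is a no-op."""
    privileges.discard((user, obj, privilege))


def check(user: str, obj: str, privilege: str) -> None:
    """Enforcement: refuse any action the user holds no privilege for."""
    if (user, obj, privilege) not in privileges:
        raise PermissionError(f"{user} may not {privilege} {obj}")


grant("irene", "memo", "SELECT")
```

Each triple combines exactly the three items the text names as needing standardisation before access control information can be exchanged: user identification, object identification and privilege information.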
4.8 Recovery of lost data
Recovery is defined [24] as a requirement to be able to return a database to a prior consistent state. To perform recovery, the database can be returned to a state recorded in a periodically made backup. Recovery can also be done by executing the reverse of transactions performed earlier. The transaction information needed for this is called logging information. A problem with MDBMSs is that the logfiles may 'explode' when logging transactions on multimedia records. Compression of multimedia data or omission of video data may reduce or solve this problem.
If the representation of logging information is standardised, the logfiles can be interpreted by multi-vendor MDBMSs. No such standard exists; most systems have their own implementations. In our opinion the standardisation of recovery does not have a high priority.
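Recovery by reversing earlier transactions can be illustrated with a simple undo log. The sketch below keeps the previous value of every changed key; the dictionary used as a database, the log layout and the function names are assumptions and do not reflect any particular MDBMS.

```python
database = {"title": "Memo", "status": "draft"}
log: list[tuple[str, object]] = []  # logging information: (key, previous value)

_MISSING = object()  # marks keys that did not exist before the update


def update(key: str, value: object) -> None:
    """Apply a change and log what is needed to reverse it."""
    log.append((key, database.get(key, _MISSING)))
    database[key] = value


def recover(checkpoint: int) -> None:
    """Reverse logged updates, newest first, back to a log position."""
    while len(log) > checkpoint:
        key, previous = log.pop()
        if previous is _MISSING:
            del database[key]
        else:
            database[key] = previous


checkpoint = len(log)            # position of a prior consistent state
update("status", "final")
update("video", b"\x00" * 1024)  # a new key; its undo entry is the _MISSING marker
recover(checkpoint)
```

Because the log stores previous values, overwriting an existing video value would put a copy of the whole old video in the log, which is the 'explosion' problem described above.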
4.9 Configuration management, version control and variants
The configuration of an information system is defined [24] as the set of processes comprising an information system and the way in which the processes are inter-related. A requirement is the ability to manage the changes made to the configuration over a period of time, to identify different versions of the system configuration and to let variants of a process exist concurrently. If the representation of these configurations, versions and variants is standardised in a kind of logfile, a database can be exported to another information system without losing this information. If the specification of these management functions is standardised, they can be executed on different platforms too. As with recovery, no standards exist for these areas because existing systems have their own implementations.
4.10 Distribution management
When an MDBMS manages one or more databases which are stored remotely, the distribution of the data and of the transactions on the data must be managed [24]. The standardisation necessary for distribution management concerns the protocols that define the connection between two or more systems and the service descriptions which describe what services are offered remotely. These standards are needed for remote creation and manipulation, storage and retrieval, search and access control operations. DFR [28] and SR [27] cover the areas for these remote operations on documents. RDA [26] is a standard that needs to be used with a database language for performing remote database transactions. A multimedia database language needs to include the RDA standard for remote operations on multimedia data.
4.11 Application Programming Interface (API)
The Application Programming Interface consists of all facilities a programmer of a multimedia information system application can use. An API may consist not only of a database language and a 3GL programming language, but also of a report writer, a graphic screen painter, authoring tools and all kinds of object-oriented tools. The database language has been discussed in previous paragraphs. Standardisation of the API is very difficult since the development of tools proceeds very quickly and the supply is very diverse.
On the other hand, a screen painter and a report painter need to make use of external data models, the views on databases that are presented to the user. For the definition of views it is necessary to make use of the DMF and, by preference, a standard format for external models. This has the advantage that views can be defined independently of the MDBMS used.
5. Conclusions
Our main conclusion is that no complete and coherent standard exists for a multimedia database language, which consists of a DDL and a DML. SQL is not suited for document type definitions, nor for IRS and HS searches. If a standard multimedia database language is defined, it should permit SGML/HyTime hypermedia document structuring. Such an ideal database language should also permit object-oriented data modelling.
Another main point is that, for the exchange of data and related schemas between database systems, document standards like SGML or ODA (later HyperODA) and their exchange formats need to be supported.
To allow for an ideal situation in which multi-vendor MDBMSs can be used for any database, many aspects need to be dealt with. First of all, the storage format of data in databases needs to be standardised. Second, the formats in which logging information, query and result set information, and system configuration information are represented should be standardised. For distributed database management it is necessary to standardise remote operations (with DFR, SR and RDA) and to standardise the protocols that define the connection between connected systems.
It is clear that the current situation, with overlapping and fragmented standardisation and with limited support of standards by MDBMS manufacturers, is far from ideal. Nevertheless, if a sound multimedia successor of SQL is defined, it will probably be accepted by the market, as is the case with SQL.
References:
[1] B. Asker. Information Technology Standards, a scarce resource. Computer Standards & Interfaces 14, 1992, 275-276.
[2] G. Bordogna, P. Carrara, I. Gagliardi, D. Merelli, P. Mussio, F. Naldi & M. Padula. Pictorial indexing for an integrated pictorial and textual IR environment. Journal of Information Science, 16, 1990, 165-173.
[3] S.C. Bouwens, R. Klopman & E.P. Van Zonneveld. Document Retrieval en Informatiediensten: selectie van en ervaring met een full text retrieval produkt (Report no. TI-RA-92-0451). PTT Research, Groningen, Netherlands, 1992.
[4] P.D. Bruza & T.P. Van der Weide. Two Level Hypermedia. An Improved Architecture for Hypertext. In D. Karagiannis (Ed.), DEXA 91: database and expert systems applications: proceedings of the international conference in Berlin, Federal Republic of Germany, 1991 (pp. 76-83). Springer, Wien, Austria, 1991.
[5] CCITT Recommendation H.261 Video Codec for Audiovisual Services at p x 64 kbit/s (Report no. CDM XV-R 37-E), 1990.
[6] C.J. Date. An introduction to database systems: volume 1 (4th ed.). Addison-Wesley, Reading, MA, 1986.
[7] R. Elmasri & S.B. Navathe. Fundamentals of Database Systems. The Benjamin/Cummings Publishing Company, Redwood City, CA, 1989.
[8] L. Gallagher. Database management standards: status and applicability. Computer Standards & Interfaces 12, 1991, 185-192.
[9] M.J. Hoogeveen, K. van der Meer & H.G. Sol. The integration of Information Retrieval and Database Management Facilities in Support of Multimedia Information Work, in: Proceedings of the 3rd International Symposium on Information Science (ISI'92), Saarbrücken, November 1992, p. 260-273.
[10] M.J. Hoogeveen. De opkomst van multimedia- en hypermedia systemen [The rise of multimedia and hypermedia systems]. Informatie, March 1993.
[11] M.J. Hoogeveen. Multimedia. Infrastructuur voor tekst, beeld en geluid [Multimedia. Infrastructure for text, audio and video]. Lansa Publishing, Rijswijk, Netherlands, 1993.
[12] ISO/IEC JTC1 ISO 8632. Computer Graphics - Metafile for the storage and transfer of picture description information, 1987.
[13] ISO/IEC JTC1 CD 10918-1 Digital Compression and Coding of Continuous-Tone Still Images, Part 1, Requirements and Guidelines (Report No. 29N0090), 1991.
[14] ISO/IEC JTC1 CD 10918-2 Digital Compression and Coding of Continuous-Tone Still Images, Part 2, Compliance Testing (Report No. 29N0070), 1991.
[15] ISO/IEC JTC1 CD 11172/1/2/3 Coding of moving pictures and associated digital storage media at up to about 1.5 Mbit/s (Report No. 29N0071), 1991.
[16] ISO/IEC JTC1 DIS 11544 Coded representation of picture and audio information - Progressive bi-level image compression, 1992.
[17] ISO/IEC MHEG WD S7 Information Technology - Coded Representation of Multimedia and Hypermedia Information Objects. Part 1: Base Notation. Working Document S7 (Report No. N 235), 1992.
[18] ISO/IEC IS 09075 Database language SQL with integrity enhancement (Report no. 21N1893), 1989.
[19] ISO/IEC JTC1 ISO 8613, Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 1: Introduction and general principles (Report No. 18N1538), 1989.
[20] ISO/IEC JTC1 ISO 8613-5, Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 5: Office Document Interchange Format (ODIF), 1989.
[21] ISO/IEC JTC1 ISO 8879, Text and Office Systems - Standard Generalized Markup Language (SGML), 1986.
[22] ISO/IEC JTC1 ISO 9069, SGML support facilities - SGML Document Interchange Format (SDIF), 1988.
[23] ISO/IEC DIS 10744 Revised Text of CD 10744, Information Technology - Hypermedia/Time-based Structuring Language (Report No. N3190), 1991.
[24] ISO/IEC JTC1 DIS 10032, Reference Model of Data Management (Report No. 21N5991), 1991.
[25] ISO/IEC JTC1 Working draft - Basic reference model of open distributed processing - Part 1: Overview and guide to use (Report no. 21N7053), 1992.
[26] ISO/IEC DIS 9579-1 Remote Database Access - Part 1: Generic model, service and protocol (Report no. 21N6375), 1991.
[27] ISO/IEC DIS 10162 Documentation - Search and Retrieve Service Definition (Report no. 18N2547), 1990.
[28] ISO/IEC ISO 10166-1 Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures, 1991.
[29] ISO/IEC ISO 7498 Open System Interconnection - Basic reference model, 1984.
[30] F. Rabitti & P. Savino. Image query processing based on multi-level signatures. In A. Bookstein, Y. Chiaramella, G. Salton & V.V. Raghavan (Eds.), Proceedings of the Fourteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval (pp. 305-314). ACM, New York, NY, 1991.
[31] G. Salton & M.J. McGill. Introduction to modern information retrieval. McGraw-Hill, Singapore, 1983.
[32] ISO IS 2788, Guidelines for the establishment and development of monolingual thesauri, 1986.
[33] ISO IS 5964, Guidelines for the establishment and development of multilingual thesauri, 1985.
© 1995-2002 Martijn Hoogeveen