Published: March 1999.
Convergence and Integration Online: The Arts and Humanities Data Service Gateway and Catalogues
Neil Beagrie, King's College London, United Kingdom
Introduction
The use of digital information in the modern world is increasing at a phenomenal rate. In the museum and education sectors there is increasing investment in the digitisation of existing materials to make them accessible online or in other electronic media and products. At the same time an increasing proportion of new information is being conceived, produced, and distributed only in electronic form, and curators, archivists, librarians, and other information professionals are facing a new world of primary electronic objects, requiring professional management, curation, and access.
In the US, the National Information Infrastructure Project, and in the UK, the National Grid for Learning, envisage national online networks linking individuals and institutions and forming the backbone of future electronic developments. Similar initiatives are also emerging across Europe and world-wide. Institutions such as museums, archives, libraries, and cultural heritage organisations are considering the development of national and international networks to link and facilitate access to the holdings of different institutions within their sectors. Increasingly there is also interest in cross-sectoral collaboration and integration of information between these sectors.
Within UK Higher Education similar trends are influencing developments in universities and national services. A Humanities Information Review Panel sponsored jointly by the British Academy and the British Library, and the Follett Report of the Joint Funding Councils' Libraries Review Group both recognised the importance of a national strategy for preserving and providing access to electronic resources in the humanities and led to a feasibility study (Burnard and Short 1994) and then the foundation of the Arts and Humanities Data Service (AHDS) in 1995.
Over the past three years the AHDS has established itself as a series of digital services. From the beginning the AHDS has recognised that it will never be the sole provider of high-quality digital resources for the arts and humanities and has built relationships with organisations in other sectors. Substantial activity and resources of interest to the AHDS's user community exist outside the Higher Education sector, and external organisations are similarly interested in the activity and resources being created within Higher Education. There is a clear need for partnership and collaboration in developing a critical mass of online resources for learning and research. The AHDS has worked closely with the museum, library and archive sectors to promote online access and standards for inter-operability and retrieval from collections in different disciplines and professional sectors.
Many organisations and consortia are working to develop online access to distributed collections, often using the Z39.50 protocol and Dublin Core metadata records. The AHDS's recently launched Gateway is believed to be one of the first such initiatives to be available as a public service.
This paper outlines the nature of the AHDS, its collections and catalogues, and its experiences in developing the Gateway. Inevitably comments on the Gateway must be provisional at this stage (January 1999) but it is hoped to provide additional feedback during the conference itself in March 1999. Given the comprehensive range of activities and media in the AHDS and its collections, the development of the AHDS Gateway is likely to be of interest to a wide range of museums. If in future we see increasing partnership, collaboration, and convergence between museums and the library and archive sectors, the cross-sectoral nature of much of the AHDS's collections and activities in developing the Gateway will also have increasing relevance.
The Arts and Humanities Data Service
The Arts and Humanities Data Service is a national service funded by the Joint Information Systems Committee of the UK's Higher Education Funding Councils to collect, describe, and preserve electronic resources. Its objectives are to:
The Service Providers collect, preserve, catalogue, and distribute digital resources which are relevant to their academic disciplines, facilitate good practice in their creation and use, and provide user services. They also contribute to the AHDS-wide initiatives.
The Executive co-ordinates the work of the Service Providers, ensures their development of coherent collection, curatorial, and distribution policies, and takes responsibility for all Service-wide initiatives. The Executive also maintains a website (http://ahds.ac.uk) providing a central source of information on the Service and links through to the individual websites maintained by the AHDS Service Providers.
The AHDS Service Providers and Their Collections
The Archaeology Data Service (ADS) is a consortium of archaeological departments and institutions located at the University of York. The Service's development reflects archaeologists' widespread use of computers in their work; the existence of established regional and national agencies in England, Scotland, Wales and Northern Ireland which develop and maintain the archaeological record for different areas of the UK; and the fact that where that record is developed through excavation, it results in the destruction of the primary evidence upon which archaeological research is based. The ADS therefore works in collaboration with existing archaeological agencies both inside and outside higher education to provide for the long-term preservation of digital records and to facilitate inter-operability and integrated access to existing archaeological databases.
The ADS provides integrated online access to nationally significant and distributed archaeological data resources including: a large proportion of the Scottish National Monuments Record maintained by the Royal Commission on the Ancient and Historical Monuments of Scotland; the Royal Commission on the Historical Monuments of England's (RCHME) Excavation Index for England; the RCHME's Microfilm Index; the Council for British Archaeology's Carbon-14 database; and the Society of Antiquaries of London's library catalogue. It also accessions and manages high-quality archaeological data resources which fall outside the collections remit of other heritage organisations which share responsibility for Britain's archaeological record. Actual deposits currently extend to over 300 data resources.
The History Data Service (HDS) is located within the Data Archive at Essex University, and its establishment as a distinctive department of the Archive pre-dates the foundation of the AHDS.
The HDS holds nearly 500 data resources to which it regularly adds some 30-40 new ones each year. The most significant new accessions include the 1881 Census for Great Britain (this collection alone contains some 30 million records) and the Irish Historical Statistics database. Data are distributed by arrangement on a variety of portable media or by ftp. In addition, the HDS has in the past two years developed a number of major online collections including the Great Britain Historical Database which integrates selected nineteenth-century census and other holdings, and a related GIS which is available through UKborders in Edinburgh.
The Oxford Text Archive (OTA) is located within the Oxford University Computing Service, where it has resided for some 20 years. Together with the HDS, it is one of the two Services which were already in existence when the AHDS was founded. Under the auspices of the AHDS, the OTA has now focused its collections development on the needs of those working in the literary and linguistic disciplines, whilst continuing to take materials from any literary genre, period, or language. It has also placed a considerable emphasis on providing user-friendly online access to the numerous "open access" items in its collection and on training researchers in the creation and use of electronic texts.
The OTA's collections extend to some 3,500 electronic texts and linguistic corpora including electronic versions of literary works by many major authors in Greek, Latin, English and a dozen other languages; collections and corpora of unpublished materials prepared by field workers in linguistics; electronic versions of some standard reference works; and copies of texts and corpora prepared by individual scholars and major research projects from around the world. It has also developed state-of-the-art online facilities enabling users to conduct innovative and analytical searches across texts which are available online.
The Performing Arts Data Service (PADS) is located at the University of Glasgow and represents a collaboration between the departments of Music, and Theatre Film and Television Studies. It focuses on collecting and promoting the use of digital resources to support research and teaching across the broad field of the performing arts: music, film, theatre, and dance. Given the relative immaturity of these disciplines in terms of the creation and use of digital materials for learning and research, the relative sophistication of computer applications required, and the more restrictive copyright regime, the PADS has focused its efforts on: assessing users' needs; raising awareness about the benefits derived from creating and using digital resources; researching and promoting standards and best practice; and developing online data resources in two areas - music and film studies.
The PADS collections currently include a pilot online service which delivers some 30 hours of film and video from the holdings of the British Film Institute; an online demonstrator catalogue of 1,000 items available from the Scottish Music Information; and a database of 300 high-quality Internet resources of interest to the performing arts. The PADS is also working with the Scottish National Film and Video Archive to provide networked access to the Archive's catalogue, and with the nine UK Music Conservatoires to integrate access to their online collections catalogues.
The Visual Arts Data Service (VADS) is another consortium (similar to that of the ADS), headed by the Surrey Institute of Art & Design. Like the PADS, it works within a community whose exploitation of digital resources and computer technologies has been slow to take off. Accordingly, the VADS has also focused on identifying users' requirements; on raising awareness about the creation and use of digital resources; and on documenting and promoting appropriate standards and best practices.
High-quality digital collections are emerging within the visual arts, many of them from the museum and heritage sectors. Accordingly, the VADS has a collections policy which emphasises accession of data resources which have no other archival home and integrating online access to extant data resources which are served by other agencies.
Developing Access
A key part of the AHDS's overall mission is to improve access to and the integration of digital resources. The growing use of computers in collections management, research and education has resulted in a proliferation of on-line digital resources, catalogues and finding aids. Although in many institutions computerisation is still limited, a disparate and ever growing corpus of resources is emerging world-wide. Users can locate digital information for museum artifacts and paper-based and digital resources in many different locations from a single desktop computer, but access is limited unless these resources can be integrated seamlessly and retrieved. This requires the adoption of shared standards and good practices, the creation of high-quality digital resources, multiple gateways to provide access to them, and the creation of new and creative partnerships between different sectors and communities.
In the museums sector significant progress has been made through the CIMI test-bed projects. The AHDS is a member of the CIMI consortium and is participating in its current research and development programmes.
The AHDS itself has been working on cross-sectoral research and standards development, particularly the potential application of the emerging "Dublin Core" to provide a common high-level description of digital resources catalogued and described to specialised standards in different sectors (Greenstein and Miller 1997). This work is complemented by the development of inter-operable catalogues and gateways for the AHDS service (a common gateway and facility to search the catalogues of the five AHDS Service Providers became available in December 1998). The AHDS collections catalogues, Gateway, and the "resource discovery" methods underpinning them are described further below.
This work also facilitates access between the Service Providers of the AHDS and other external partners and resources for their subject areas. There are a number of examples of such partnerships to develop and expand access to online resources across the AHDS. In archaeology, Accessing Scotland's Past, an Archaeology Data Service project funded by SCRAN, has linked the Scottish National Monuments Record, selected Scottish SMRs, and the ADS. Through the Data Archive, the HDS has data exchange and data access agreements which open out onto the substantial collections maintained by European and North American social science and history data archives. The OTA has close working relations with text archives at the Universities of Michigan and Virginia. The PADS provides access to a pilot digital film collection being developed by the British Film Institute (in collaboration with the British Universities Film and Video Council and the JISC).
The AHDS Collection Catalogues
All items in the collections of an AHDS Service Provider are described in a local collection management database. Portions of this database are made available to users of the collections as an online collections catalogue by each Service Provider. A Gateway, which is able to search all the catalogues simultaneously and acts as a virtual on-line union catalogue of all AHDS holdings, has also been developed.
Information about Service Provider holdings, whether managed by the Service Provider or by a co-operating third party, is available from online catalogues mounted at each Service Provider as follows:
Archaeology Data Service: ArchSearch. A Z39.50-enabled SQL database, OLIB VDX, supplied by Fretwell-Downing Informatics. A number of search screens are offered, each tailored towards a different type of query. A user can elect, for example, to carry out a keyword search across the entire catalogue, or they can restrict themselves to a location-specific search. As well as different search forms, users can interact with the catalogue by means of a map interface, allowing them to point at locations of interest on their computer screen.
History Data Service: Cheshire and BIRON. Cheshire is a Z39.50-enabled, SGML-aware text retrieval engine, supplied by the Universities of Liverpool and Berkeley, and accessible via the HDS web site. BIRON is the online catalogue for the Data Archive and permits queries of HDS and Data Archive holdings. BIRON is also part of the Integrated Data Catalogue (IDC) which integrates access to the holdings of eleven social science data archives in Europe and the US.
Oxford Text Archive: OTA catalogue. A Z39.50-enabled, SGML-aware text retrieval engine, developed by the OTA using Open Text's PAT Software as an online catalogue which provides the additional functionality of permitting text retrieval and analysis across the bodies as well as the headers of those texts which are available on open access.
Performing Arts Data Service: PADS System Architecture. The PADS System Architecture is Z39.50-enabled and combines an information management system based on a central object-oriented HyperWave database linked to SGI's MediaBase for real-time streaming of audio and video.
Visual Arts Data Service: The ADAM/VADS catalogue. A Z39.50-enabled information system, Index+, supplied by Systems Simulation Ltd. The system has been designed to support a wide range of standards for information description, for example Dublin Core, IAFA Templates, VRA Core Categories and also interoperability through use of the CIMI & Aquarelle Access Points, Z39.50 and WHOIS++. In addition, it will provide a range of powerful as well as simple searching options and sophisticated tools for managing controlled vocabularies and authority files.
The AHDS Gateway
The AHDS's Gateway forms a virtual "union catalogue" and single interface to the collection catalogues of the AHDS Service Providers and can be accessed from the AHDS home page at http://ahds.ac.uk. It bases its search and retrieval capabilities on unqualified Dublin Core metadata from these catalogues and uses the Z39.50 protocol (developed initially in the library sector but increasingly used elsewhere) to query the remote catalogues and return result sets to the Gateway.
The Gateway also incorporates software to support the following functions:
Resource Discovery: methods and issues
As outlined above, the AHDS's collections are broad in scope and deeply heterogeneous. They are distributed amongst five Service Providers, each of which provides information about its collections in its own online catalogue. The range and diversity of collections which can be accessed through these catalogues can also be extended beyond the AHDS because several Service Providers have data exchange and interoperability agreements with third parties. The AHDS's collections are also heterogeneous in terms of the wide variety of digital data types that are held, including electronic texts, databases, digital images, and digital film. The needs of the different disciplines they serve, together with the diverse composition of data types held, require the AHDS Services to adopt very different descriptive and cataloguing practices. It was recognised at an early stage when the AHDS was founded that no single cataloguing standard was sufficiently flexible to be applied across them all.
A major aim of establishing a "faculty level" organisation for the five Service Providers had been to encourage inter-disciplinary access and use. A mechanism for achieving the desired level of inter-operability and user access without imposition of a single cataloguing standard was needed. The AHDS therefore began to investigate the potential use of resource discovery metadata which could facilitate access to individual and distinctive catalogues. It also investigated appropriate systems architectures to support this.
At this time the Dublin Core was beginning to emerge as a potential standard for resource discovery metadata. Evaluation of the Dublin Core was, however, limited to a small number of domains. Nor was there much in the way of guidance with regard to its implementation in any single domain. The AHDS's work on resource discovery metadata therefore focused initially on a formal, cross-domain evaluation of the Dublin Core and was conducted in conjunction with the UK Office for Library and Information Networking (Greenstein and Miller 1997).
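The idea of a common high-level description sitting above specialised local cataloguing can be illustrated with a simple crosswalk. The following is a minimal sketch only: the local field names, the two crosswalk tables, and the sample records are invented for illustration and are not taken from any AHDS catalogue. Only the list of 15 element names comes from the unqualified Dublin Core element set.

```python
# The 15 elements of the unqualified Dublin Core element set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical crosswalks: local field name -> Dublin Core element.
# The local field names are assumptions made for this example.
CROSSWALKS = {
    "archaeology": {"site_name": "title", "excavator": "creator",
                    "period": "coverage", "grid_ref": "coverage"},
    "texts": {"work_title": "title", "author": "creator",
              "lang": "language", "written": "date"},
}


def to_dublin_core(record, crosswalk):
    """Map a local record to unqualified Dublin Core.

    Every element is optional and repeatable, so each element maps to a
    list of values; unmapped or empty local fields are dropped.
    """
    dc = {}
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is None or value in (None, ""):
            continue
        dc.setdefault(element, []).append(value)
    return dc


# Invented sample records in two different local schemas.
site = {"site_name": "Example hillfort", "period": "Iron Age",
        "grid_ref": "NT 000 000", "context_count": 412}
text = {"work_title": "Example poem", "author": "A. Poet", "lang": "en"}

site_dc = to_dublin_core(site, CROSSWALKS["archaeology"])
text_dc = to_dublin_core(text, CROSSWALKS["texts"])
```

In this sketch both records end up with a "title" element that can be searched in the same way, while detail that has no Dublin Core equivalent (the hypothetical `context_count` field) stays in the local catalogue.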
Research into information architectures and tools drew upon and also to some degree influenced the development at UKOLN of the MODELS Information Architecture. Briefly, MODELS envisaged a number of broker services which mediated between the user accessing a network and a range of underlying information resources or targets (Russell 1998). The outline provided by MODELS essentially describes the AHDS Gateway, which contains the broker services, and the targets, which are the online catalogues of the AHDS Service Providers and their external partners.
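The broker-and-target arrangement can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of the Gateway's software: the `Target` class is an in-memory stand-in for a remote catalogue and does not speak Z39.50, and the catalogue names, records, and the `broker_search` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """In-memory stand-in for one remote catalogue (a target)."""
    name: str
    records: list  # each record: dict of Dublin Core element -> list of str

    def search(self, element, term):
        """Return records whose given element contains the term."""
        term = term.lower()
        return [r for r in self.records
                if any(term in v.lower() for v in r.get(element, []))]


def broker_search(targets, element, term):
    """Send one fielded query to every target and merge the result sets."""
    merged = []
    for target in targets:
        for record in target.search(element, term):
            # Present every hit in the same shape, tagged with its origin.
            merged.append({
                "catalogue": target.name,
                "title": "; ".join(record.get("title", ["(untitled)"])),
            })
    return merged


# Hypothetical catalogues and records, for illustration only.
targets = [
    Target("catalogue-a", [
        {"title": ["Roman pottery survey"], "subject": ["pottery"]},
        {"title": ["Census extract"], "subject": ["population"]},
    ]),
    Target("catalogue-b", [
        {"title": ["Letters on Roman Britain"], "creator": ["A. Scholar"]},
    ]),
]

hits = broker_search(targets, "title", "roman")
```

The user issues a single query; the broker decides which targets to ask and returns one consistent result list, so differences between the underlying catalogues are hidden at the point of discovery.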
The AHDS adopted Z39.50 as its network application protocol standard and procured the development of a brokering web client and Z39.50 capability for the Service Providers' catalogues. The systems were procured from several suppliers selected through an open tender process (Greenstein 1997).
At the time of writing, the Gateway and its interaction with the catalogues of the AHDS Services are within their first month of operation and are still being evaluated. However, a number of preliminary observations can be made.
The AHDS Gateway is a new type of service within the Higher Education community in the UK. The system has proved the technical feasibility of using Dublin Core and Z39.50 in combination to enable "resource discovery" across heterogeneous and geographically distributed resources. The system also successfully combines "resource discovery" based on catalogue metadata with "resource access", which is online wherever possible. It provides a "virtual union catalogue": a single point of access to search across a variety of independent and distinct catalogues and return result sets in a consistent and intelligible way to the user.
The Gateway also provides a service and access which are more specific than those available from most internet search engines. The ability to search on "fielded" metadata should allow more focussed searching and retrieval. Similarly the resources catalogued or made available through the Gateway are pre-selected and of known origin, quality, and relevance to the user. Although the Gateway currently searches only across the holdings of the five AHDS Service Providers, it is being developed further, and as content increases so too will the scope for inter-disciplinary searches and development. In future we envisage it will present a broader range of information providers, linking into other resources both within and beyond the UK and across a range of sectors holding resources for learning and research.
Alongside these successes, a number of challenges remain.
The existence of fields in Service Providers' or other catalogues which can be mapped to the 15 Dublin Core fields does not of itself guarantee that all these fields can be used effectively for cross-domain discovery. For this to be the case these fields must consistently contain the information being sought. All Dublin Core elements are optional: fields in a catalogue may be optional and rarely used in one domain but mandatory and consistently used in another. The functionality will therefore vary according to the catalogues included in the search and the fields selected to query between them. The AHDS Gateway has begun to address this through user information in its help screens and by including in the search form dynamic updating and a facility for flagging common "assured" access points between catalogues to the user.
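One way to picture common "assured" access points is as the intersection of the elements that each selected catalogue populates consistently. The sketch below assumes a simple rule, that an element counts as assured only if it is mandatory in every catalogue included in the search. That rule, the function name, and the usage table are assumptions for illustration and are not drawn from the Gateway's implementation.

```python
# Hypothetical record of how each catalogue uses Dublin Core elements.
USAGE = {
    "catalogue-a": {"title": "mandatory", "creator": "optional",
                    "coverage": "mandatory", "date": "mandatory"},
    "catalogue-b": {"title": "mandatory", "creator": "mandatory",
                    "date": "optional"},
    "catalogue-c": {"title": "mandatory", "creator": "mandatory",
                    "coverage": "mandatory"},
}


def assured_access_points(usage, selected):
    """Return the elements that every selected catalogue always fills.

    Assumed rule: an element is "assured" only if it is mandatory in
    every catalogue included in the search.
    """
    common = None
    for name in selected:
        mandatory = {element for element, status in usage[name].items()
                     if status == "mandatory"}
        common = mandatory if common is None else common & mandatory
    return sorted(common or [])


all_three = assured_access_points(
    USAGE, ["catalogue-a", "catalogue-b", "catalogue-c"])
b_and_c = assured_access_points(USAGE, ["catalogue-b", "catalogue-c"])
```

Under these assumptions the set of assured fields shrinks as more catalogues are added to a search (here, from creator and title for two catalogues down to title alone for all three), which is why a search form would need to update as the user changes the selection.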
The DC element set initially evolved to describe electronic "document-like" objects, and the description and retrieval of spatial and temporal data are perhaps less well served. The greatest variation in description across the AHDS Services is apparent in their use of the DC date and coverage elements. Refining retrieval in these areas is a potential area for future development.
Another issue is the form and syntax of field content: agreement on the use of controlled vocabularies across different domains is unlikely ever to exist. Assisting users to search across catalogues which are populated according to different domain-specific controlled vocabularies is also a challenge for the future.
Conclusions
In many ways the challenges faced by the AHDS in dealing with heterogeneous digital content, formats, and systems can be seen as a microcosm of the future challenges to be faced in the wider world as a range of institutions move towards being part of a highly diverse, distributed network of content providers on the Web. It is likely to be an environment where users are increasingly interested in single-point access and may wish to retrieve information simultaneously from different sectors (museums, libraries or archives) or from different subject areas and disciplines. These are the assumptions which have shaped the development of the AHDS Gateway and associated systems. Our experience of how users actually begin to exploit the functionality of the Gateway will provide useful feedback for the further development of our own systems, and also for other applied research on future resource discovery systems on the Web.
Over the past three years the AHDS has provided a testbed for research and development in networking the cultural heritage and developing distributed collaborative collections, and a potential model for partnerships between higher education, museums and other sectors in the digital age. In a digital environment it can be argued that the traditional divisions between our curatorial institutions begin to blur: there is a common currency between archives, libraries and museums and the potential to form new partnerships to service common audiences.
Already in the UK, there are a number of pointers to such changes which may have a profound impact on museums. The first is the government's consultation paper (HMSO 1997) on the proposed National Grid for Learning and its role in education and a learning society: its aims are closely linked to the principles which lie behind the founding of many museums. The following is a quotation from the consultation paper on the National Grid for Learning:
"For the first time we have the opportunity to link all our learning institutions and training providers-including schools, colleges, universities, libraries, adult learning institutions, museums and galleries... To achieve a learning society these links must also extend in an effective way to homes, the workplace,...in the same way that public utilities like the telephone are now universally available" (HMSO 1997)
The second is the Library and Information Commission's report (LIC 1997) on the future of public libraries in the digital age: a publication which has been widely praised for its vision of a Public Library Network providing online access for all sectors of society and providing entry points to business and government information, and the cultural heritage.
The third is the Department for Culture, Media and Sport's review of its funding to museums and other institutions in the public sector and its decision to create a new Museums, Libraries and Archives Council, a single body to co-ordinate and fund these sectors, which have traditionally been separately funded and managed (DCMS 1998).
All these developments point to radical shifts in the landscape for our institutions. In the nineteenth century the invention of the telephone helped to transform communications between individuals and institutions. If in the 21st century the digitisation and networking of our intellectual and cultural heritage has the same transforming effect, then the AHDS may provide some insights and one model for the future.
Acknowledgements
The AHDS Gateway has been developed by AHDS staff and their contractors. Daniel Greenstein and Paul Miller have been particularly influential in its development. The AHDS Gateway is still being evaluated and it should be noted that this paper expresses the views of the author, who is solely responsible for any errors or omissions.
Sources of Further Information on the AHDS
Online information on the AHDS is available from the AHDS Website at http://ahds.ac.uk. This also provides a gateway from the home page to the websites of each of the AHDS Service Providers.
It is also possible if you have access to email to join the mailbase list AHDS-ALL which carries announcements on the Service and of general relevance to the arts and humanities and cultural heritage. There is also a mailbase list ADS-ALL specifically for the Archaeology Data Service and archaeological interests. Full details of these lists and joining instructions are available from the Mailbase web site at http://www.mailbase.ac.uk.
Finally if you have a general enquiry, would like an information pack, or to join our mailing list you can email firstname.lastname@example.org or write to the AHDS Executive, King's College, The Strand, London WC2R 2LS.
References
Burnard, L and Short, H 1994, An Arts and Humanities Data Service. Report of a Feasibility Study Commissioned by the Information Services Sub-committee of the Joint Information Systems Committee of the Higher Education Funding Councils, Oxford.
Greenstein, D and Miller, P (eds) 1997, Discovering Online Resources Across the Humanities: A practical implementation of the Dublin Core, Arts and Humanities Data Service and UK Office for Library and Information Networking, Bath. A Web version is available online from http://ahds.ac.uk/public/metadata/discovery.html, last modified 17 November 1997, consulted 27 January 1999.
Greenstein, D 1997, AHDS Systems Operational Requirement, public version 1.1 dated 1 August 1997. Last modified 7 April 1998, consulted 27 January 1999, http://ahds.ac.uk/public/ahds-or/ahds-or.html.
Her Majesty's Stationery Office 1997, Connecting the Learning Society: The Government's consultation paper on the National Grid for Learning. A Web version is available online from http://www.open.gov.uk/dfee/grid/consult/index.htm, released 7 October 1997, consulted 27 January 1999.
Library and Information Commission 1997, New Library: The People's Network. LIC, London. A Web version is available online from http://www.ukoln.ac.uk/services/lic/newlibrary/, released 16 April 1998, consulted 27 January 1999.
Russell, R 1998, MODELS: MOving to Distributed Environments for Library Services, last modified 5 November 1998, consulted 27 January 1999, http://www.ukoln.ac.uk/dlis/models/