Archives & Museum Informatics

last updated:
October 7, 2014 2:55 PM

Unifying our cultural memory: Could electronic environments bridge the historical accidents that fragment cultural collections?

In Information Landscapes for a Learning Society, Networking and the Future of Libraries 3, 1998, and presented at the UK Office of Library Networking Conference, July 1998.

David Bearman and Jennifer Trant, Partners, Archives & Museum Informatics, USA

(Section 5)

Mechanisms: metadata declarations, common models, and shared vocabularies

Bridging the intellectual perspectives of users and documentalists will require two things: explicit declaration, by the creators of documentation, of the schemas (formal rules and data models) they employ, and methods to assess the attributes of each user's requirements. These schemas operate at the levels of data structure (what attributes are described) and data values (what terms are used). The data structure reflects the sum of all elements deemed relevant to discourse in a domain - in this case the domain of the management of this documentation rather than the domain of the disciplines whose research questions the documentation is intended to support.

On the surface, the declaration of such schemas by content creators is fairly simple [16]. Each repository will need to define its methods (or at least its methods with respect to any given type of data source). Information resources of the same genre documented by the repository will have the same attributes. The cataloguing and documentation standards we now implement provide the key building blocks in this part of the solution. Tools such as the Categories for the Description of Works of Art [17] and the CIDOC Data Model (in both relational and object-oriented form) [18] provide prototypes for the more generalized expression of information content.
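A minimal sketch of what such a declaration might look like, separating the data structure level (which attributes are described) from the data value level (which terms are used). The class name, the attribute names and the sample vocabulary are all hypothetical illustrations; they are not taken from the Categories for the Description of Works of Art or from the CIDOC Data Model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SchemaDeclaration:
    """A repository's declared documentation scheme for one genre (hypothetical)."""

    repository: str
    genre: str
    # Data structure level: which attributes are described.
    attributes: tuple[str, ...]
    # Data value level: the controlled terms each attribute may take.
    # An attribute absent from this mapping accepts free text.
    vocabularies: dict[str, frozenset[str]] = field(default_factory=dict)

    def validate(self, record: dict[str, str]) -> list[str]:
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for name in self.attributes:
            if name not in record:
                problems.append(f"missing attribute: {name}")
        for name, value in record.items():
            if name not in self.attributes:
                problems.append(f"undeclared attribute: {name}")
            elif name in self.vocabularies and value not in self.vocabularies[name]:
                problems.append(f"term not in vocabulary for {name}: {value}")
        return problems


# Assumed example: one repository declaring how it documents paintings.
paintings = SchemaDeclaration(
    repository="Example Museum",
    genre="painting",
    attributes=("title", "creator", "medium"),
    vocabularies={"medium": frozenset({"oil", "tempera"})},
)
```

Because every resource of the same genre is checked against the same declaration, a record that omits a declared attribute or uses a term outside the declared vocabulary is reported rather than silently accepted.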

However, the problem of identifying the perspectives of the user is more complex. Schemes reflecting user needs include the user's language, age or interest level, purpose, and knowledge; understanding the user in those terms is essential to understanding a query. Determining language is quite easy, though bridging it as a barrier to access could require a significant, long-term investment in translation. Establishing the age level and interest of a user is only possible with repeated interactions, so the system must build such knowledge over the course of the "reference interview". Correctly defining the user's purpose, on which much of the success of the research strategy depends, requires the explicit recognition of objectives as a separate component of query formulation. Plotting the terms posed by a user on the knowledge landscape that the user wishes to traverse, and providing the appropriate map, requires that the system be aware of templates for a wide range of known discourse structures and that the user be involved, implicitly or explicitly, in the selection of the ones appropriate to their search [19].

Users employ terms in their queries that identify attributes which distinguish the resources they seek from the general universe, but these attributes are most likely not the ones actually of interest to the user (since they will be common to all the resources the user wishes to examine). In searching, the system must not only find the resources, however documented, that are most likely to have the attributes identified; it must also report them by displaying the terms actually used in their documentation and the additional attributes usually present in the discourse frame of the user. If the user provided enough terms as part of the query, a sufficient breadth of attributes might be selected to define this discourse frame. But users have learned that search systems are unlikely to return results if given too many terms, so sophisticated searchers do not provide language that defines their domain. Instead, only a few attributes will be specified in a query [20].

Other terms could be proposed to a user based on a partial match with one or more discourse profiles. In this scenario, discourse frames defined and stored in the system would be mapped against queries. If the user is (mentally) employing a discourse frame that has previously been registered with the system, then the system can automatically fill in the 'slots' (attributes) of that frame not specified by the user's query but implied by the form of the question. In effect, the system proposes a number of schemas and the user selects the one that best fits their needs.
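The partial matching of queries against registered discourse frames can be sketched as follows. This is only an illustration under stated assumptions: the frame names, their slots and the scoring rule (the fraction of the query's attributes that a frame accounts for) are hypothetical and are not drawn from any existing system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DiscourseFrame:
    """A registered discourse frame: a name plus its attribute 'slots'."""

    name: str
    slots: frozenset[str]


def rank_frames(query_attributes, frames):
    """Rank registered frames by partial match with the query's attributes."""
    query = set(query_attributes)
    scored = []
    for frame in frames:
        overlap = query & frame.slots
        if overlap:
            # Assumed scoring rule: share of the query the frame accounts for.
            scored.append((len(overlap) / len(query), frame))
    # Best match first; ties broken by name so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [frame for _, frame in scored]


def unfilled_slots(query_attributes, frame):
    """Slots of the frame that the query left unspecified."""
    return sorted(frame.slots - set(query_attributes))


# Hypothetical frames registered with the system.
FRAMES = [
    DiscourseFrame("conservation", frozenset({"creator", "medium", "condition", "treatment"})),
    DiscourseFrame("provenance", frozenset({"creator", "date", "owner", "sale"})),
]

# A sparse query naming only two attributes.
candidates = rank_frames(["creator", "owner"], FRAMES)
suggestions = unfilled_slots(["creator", "owner"], candidates[0])
```

The ranked list plays the role of the schemas proposed to the user, and the unfilled slots of the chosen frame are the additional attributes the system could offer or report on.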

Alternatively, a more experienced or sophisticated user could explicitly declare the schema in which they are working. This declaration would likewise define relevant discourse frames and enable the system to identify attributes missing from the user's query. Network architectures and software applications to support this kind of functionality are in the early stages of discussion as part of the implementation of the Resource Description Framework (RDF) in XML. Front-ends would be required to negotiate user schemas in an intelligent way.

Key here is the abstract declaration of a series of schemas that characterize the "points of view" of users and the documentation practices of repositories. These would need to be expressed in a formal language and mapped between disciplines. A means of establishing consensus among those holding each viewpoint would also be needed, to ensure that the schema correctly reflects their assumptions. In developing and articulating schemas, the abstract model must be pushed far enough that its concrete implications for all the stakeholders are clear. Task groups of representative professionals from appropriate institutions, with the correct leadership, could produce framework schemas in a year or two. Such groups would define the abstract types of user discourse frames, identify repository documentary practices, and determine the relevant genres of primary, secondary and tertiary documentation structures.
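One way to picture mapping between formally declared schemas is a simple attribute crosswalk. The sketch below is an assumption-laden illustration, not a proposed standard: the function name, the two sets of attribute names and the sample record are hypothetical.

```python
def translate_record(record, crosswalk):
    """Re-express a record using another schema's attribute names.

    Returns the translated record together with the attributes that had
    no mapping, so that what is lost in translation stays visible.
    """
    translated = {}
    unmapped = []
    for attribute, value in record.items():
        target = crosswalk.get(attribute)
        if target is None:
            unmapped.append(attribute)
        else:
            translated[target] = value
    return translated, sorted(unmapped)


# Hypothetical crosswalk from a museum schema to an archival one.
MUSEUM_TO_ARCHIVE = {
    "creator": "originator",
    "title": "unit_title",
    "date_made": "unit_date",
}

record = {"creator": "Unknown", "title": "Portrait", "medium": "oil"}
translated, unmapped = translate_record(record, MUSEUM_TO_ARCHIVE)
```

Reporting the unmapped attributes is a deliberate choice here: it makes explicit where two communities' schemas fail to align, which is the kind of concrete implication that stakeholders would need to see before agreeing on a framework.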

But such conceptual models will only define a possible path; implementation requires that the institutions and individuals in the system perceive advantages, indeed sufficient economic advantages, in implementing it. In practical terms, this means they must either be pushed by the costs imposed on users by inefficient searches in distributed environments (this has been the impetus for the Dublin Core activity) or pulled by the competitive advantage of offering better search functionality. The potential exists for a fully integrated information space within which scholarly humanistic research could be enabled. The triangle model provides a framework for parsing the problems we must overcome and for identifying where we need to go to find the answers. But it doesn't suggest how to provide a solution that will fit into the social process of the user's research.

NEXT: Social Barriers: The Humanities Research Process

PREVIOUS: The Goal of Integrated Access
