Museums and the Web 2005 Papers

Reports and analyses from around the world are presented at MW2005.

New Ways to Search, Navigate and Use Multimedia Museum Collections over the Web

Matthew Addis, IT Innovation Centre, Kirk Martinez, Paul Lewis, University of Southampton, James Stevenson, Victoria & Albert Museum, United Kingdom, and Fabrizio Giorgini, Giunti Interactive Labs, Italy

Abstract

Museums and galleries are becoming increasingly rich in digital information. This is often created for internal activities such as cataloguing, curation, conservation and restoration, but also has many additional uses including gallery terminals, Web access, and educational, scientific and commercial licensing. New forms of multimedia content such as 3D models and virtual spaces have huge potential for enhancing the way people interact with museum collections; for example, in structured eLearning environments. Despite drivers for increased integration of information sources within the museum or gallery, and for improved Web accessibility for external users, this content is often hard to access and is held in multiple internal systems with non-standard schemas and descriptions. Providing information to external users or applications in a structured and machine-readable form is particularly difficult due to a lack of tools and standards. This makes it difficult to expose this rich source of information so it can be used over the Web in external applications. Over the past three years, the European Commission IST-supported SCULPTEUR project has been addressing these problems by developing new ways to create, search, navigate, access, share, repurpose and use multimedia content over the Web for professional users. This paper describes the tools and techniques developed in the project.

Keywords: Semantic Web, 3D models, CIDOC CRM, Search and Retrieval, eLearning.

Introduction

Museums and galleries are becoming increasingly rich in multimedia representations of works of art such as 2D images or 3D models.
It is not uncommon for large museums and galleries to have tens or hundreds of thousands of digital images of works of art in their possession. The use of digital photography to create high resolution and colour accurate representations is now well established and is within the reach of even modestly sized organizations. A similar effect will follow for new forms of multimedia content, for example 3D models and interactive virtual environments, which will in turn swell and diversify museum multimedia collections. Applications include collection management, cataloguing, conservation and restoration, commercial picture sales, public access material for gallery terminals and Web sites, promotion and marketing, and sales and acquisitions, to name but a few. Each application has its own requirements for content type and quality, and often this gives rise to multiple systems and processes within an organization. When there is a need to source content from across these systems, the approach is typically ad hoc and requires 'cut and paste' between the user interfaces of the multiple software systems involved. This problem is not restricted to internal use of digital content. The recent drive towards increased Web accessibility and openness for cultural heritage organizations now means that there are high-value uses for digital content in public access, education and research. As a result, it can be frustrating that this content is 'locked away' in internal legacy systems with non-standard schemas and descriptions, and that it requires aggregation and repurposing to transform it into the right form for external consumption. Even for digital content that is created directly for use by third parties, there is still a lack of standards and infrastructure to make it available in a uniform way that allows the content to be easily used and understood, especially in conjunction with other information sources.
In summary, museums and galleries are faced with a wealth of new opportunities to create and deliver exciting new forms of digital content, both for internal use and for remote use over the Web. However, this is a double-edged sword. Significant technological barriers exist due to immaturity of the technology, lack of standards and best practice, and difficulties in combining information from multiple sources, whether they be within a single museum or distributed across the Web. Over the past three years, the European Commission IST Sculpteur project (http://www.sculpteurweb.org) has been addressing these problems by developing new ways to create, search, navigate, access, share, repurpose and use multimedia content from multiple sources over the Web. In particular, the project has four focus areas:
This paper focuses on the aspects related to the search and navigation of multimedia museum collections over the Web. Sculpteur involves five major museums and galleries: the Uffizi in Florence; the National Gallery and the Victoria and Albert Museum in London; the Musee de Cherbourg and the Centre de Recherche et de Restauration des Musees de France (C2RMF). These galleries have substantial digital archives comprising images, 3D models and videos together with textual information and metadata. Figure 1 shows our overall approach in terms of the system we have built to allow their internal staff and external professional users to gain access to the multimedia museum information they possess. The main input is content extracted from existing museum and gallery systems; for example collection management databases, photo catalogues, document repositories and stores of 3D models. We index this collection according to the multimedia content (for example, the colour of 2D images and the shape of 3D models) as well as the textual descriptions extracted from the museum and gallery legacy systems. At this stage the classification techniques are used to automatically associate 3D models with different classifications of art objects. We structure the textual descriptions, 3D models, 2D images, content indexes and the classified models using an ontology, in particular an ontology based on the CIDOC Conceptual Reference Model (http://cidoc.ics.forth.gr/). Next, the ontology is published on the Web in an XML form to describe the collection, which, along with a search and retrieval service based on Z39.50 SRW (http://www.loc.gov/z3950/agency/zing/srw/), allows remote applications to access the multimedia content. A range of Web interfaces for navigation and search and retrieval are built on top of this interface. These include a graphical ontology browser so that users unfamiliar with museum collections can understand and explore the rich cultural heritage information space. 
Giunti Interactive Labs have used the SRW to integrate their Learning Content Management System, Learn eXact (http://www.learnexact.com/). Motivated by the recent increased interest among cultural institutions in reusable multimedia components for learning (called Cultural Learning Objects, CLO) and in on-line learning content delivery and management, the result is a content authoring tool able to remotely create and manage 3D virtual learning environments. The rest of this paper covers the themes of Sculpteur in more detail, in particular focusing on the challenges we have faced and the solutions we have developed.

Searching and Navigating Multimedia Collections

Modes of Searching

Text based searching using Web forms and 'Google'-type interfaces is a familiar way for many to search large digital collections. Simple substring searching works well when significant amounts of free text are present, which is often the case with descriptive metadata for museum and gallery objects. Sculpteur supports text-based searching by allowing the user to search against one or more text attributes in the collection. The user can look for strings within free text fields, choose from items in controlled vocabularies, and combine several search attributes together using logical operators. In this way it is easy to specify queries such as "find all works of art painted by Van Gogh using oil where the title contains the word 'sunflowers'" (although the query isn't physically entered in this free text form). This is, of course, fairly standard for museum information systems and Web sites. However, there are cases where new search modalities can greatly improve the results of searching, especially for large collections. In Sculpteur, we provide two additional ways of searching and exploring a collection: by concept and by content.
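The attribute-based query described above can be illustrated with a small sketch. It is not Sculpteur's implementation: the record fields (`artist`, `medium`, `title`), the sample records and the helper names are all hypothetical, chosen only to show how substring matches, controlled-vocabulary matches and logical operators might be combined.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Predicate = Callable[[Record], bool]


def contains(field: str, text: str) -> Predicate:
    """Case-insensitive substring match against a free-text field."""
    return lambda rec: text.lower() in rec.get(field, "").lower()


def equals(field: str, term: str) -> Predicate:
    """Exact (case-insensitive) match against a controlled-vocabulary field."""
    return lambda rec: rec.get(field, "").lower() == term.lower()


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND over several predicates."""
    return lambda rec: all(p(rec) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR over several predicates."""
    return lambda rec: any(p(rec) for p in preds)


def search(records: List[Record], query: Predicate) -> List[Record]:
    """Return the records that satisfy the query, in their original order."""
    return [rec for rec in records if query(rec)]


# Hypothetical sample records; field names are assumptions for illustration.
RECORDS: List[Record] = [
    {"artist": "Vincent van Gogh", "medium": "oil", "title": "Sunflowers"},
    {"artist": "Vincent van Gogh", "medium": "oil", "title": "A Wheatfield, with Cypresses"},
    {"artist": "Claude Monet", "medium": "oil", "title": "The Water-Lily Pond"},
]

# "Painted by Van Gogh using oil where the title contains 'sunflowers'".
query = all_of(
    contains("artist", "van gogh"),
    equals("medium", "oil"),
    contains("title", "sunflowers"),
)
titles = [rec["title"] for rec in search(RECORDS, query)]
```

Building queries from small composable predicates keeps the free-text and controlled-vocabulary cases separate while letting a form-based interface assemble them with AND and OR.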
Concept Based Searching and Navigation

Searching by concept provides the user with a high-level way to explore a collection by abstracting the relatively low-level text attributes found in many legacy systems. The use of an ontology allows text attributes to be grouped together according to common semantics; for example according to the concepts of people (e.g. artist, curator, owner, restorer), art objects and representations (e.g. painting, sculptures, films, digital representations), events and activities (e.g. creation, acquisition, restoration, loan, birth, death, period), places (e.g. gallery, conservation centre, country, city, town, studio), and methods and techniques (e.g. oil, watercolour, carving, x-ray, restoration technique). These concepts are linked together by relationships specified in the ontology, and in our case we adopt the CIDOC Conceptual Reference Model (CRM). For example, the ontology specifies that objects are created during production events in which various people participate in different roles. The ability to search and navigate by concept provides several benefits. For example, a user can make complex queries such as 'find me works of art that were painted by, depict, or were owned by Van Gogh' instead of having to manually combine the results of several separate queries against 'author', 'subject' and 'owner' fields. If a search provides too few (or too many) results, then the user can generalize (or specialize) their query to get a better match. For example, if searching for something specific like a 'teapot' does not yield enough results, then the query can be generalized to 'vessels', which will retrieve 'pots', 'vases', 'urns' etc. as well. The use of an ontology that makes the relationships between concepts explicit also allows different explorative paths to be taken through the collection.
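The 'teapot' to 'vessels' generalization can be sketched as a walk up a hierarchy of broader terms. The sketch below is only an illustration under stated assumptions: the `BROADER` links, the object records and the function names are hypothetical and are not taken from the CRM or from any museum thesaurus.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical broader-term links (child -> parent), for illustration only.
BROADER: Dict[str, str] = {
    "teapot": "pot",
    "pot": "vessel",
    "vase": "vessel",
    "urn": "vessel",
}

# Hypothetical collection records.
OBJECTS: List[Dict[str, str]] = [
    {"id": "A1", "type": "teapot"},
    {"id": "A2", "type": "vase"},
    {"id": "A3", "type": "urn"},
    {"id": "A4", "type": "chair"},
]


def narrower_terms(concept: str) -> Set[str]:
    """Return the concept together with every term below it in the hierarchy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in BROADER.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search_by_concept(concept: str) -> List[str]:
    """Ids of objects whose type is the concept or any narrower term."""
    terms = narrower_terms(concept)
    return [obj["id"] for obj in OBJECTS if obj["type"] in terms]


def search_with_generalization(concept: str, min_hits: int) -> Tuple[str, List[str]]:
    """Climb to broader concepts until at least min_hits objects are found."""
    current = concept
    while current is not None:
        hits = search_by_concept(current)
        if len(hits) >= min_hits:
            return current, hits
        current = BROADER.get(current)
    # No broader concept gave enough hits: fall back to the original query.
    return concept, search_by_concept(concept)


used_concept, hits = search_with_generalization("teapot", min_hits=3)
```

Because the hierarchy is stored as explicit links rather than baked into field names, the same structure can be walked in other directions when exploring a collection.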
For example, the user might be interested in the relationship between 'style', 'artist' and 'materials' for a set of paintings and would want to explore which artists adopted which styles and what materials they used to do so. This 'slice and dice' approach to exploring information is not easily supported using legacy systems. In Sculpteur, the ontology can be graphically visualized using a Concept Browser that implements a graph-based approach. Due to the complexity of the full CRM, this view is generally hidden from the user. Instead, a simplification of the ontology is displayed that will show only the concepts and relations that are present in the museum metadata structure. These concepts and relations are further refined and simplified, in some cases using terms from the original metadata schema to increase familiarity of the users with the interface. The choice not to display the CRM was a result of several trials involving the museum and gallery partners in the project that evaluated several user interface approaches. The terminology and complexity of the CRM proved to be too challenging to visualize in an intuitive way; hence we adopted a simplification strategy with much better results. The ability to overlay a simplified and personalized view on top of the CRM is also a potentially powerful way to enable cross-collection searching since it allows the user of one collection to visualize the contents of someone else's collection in their own context by using the CRM as an underlying (and hidden) interlingua. An important aspect of ontological visualization tools is querying for instances of concepts. Although the visualization of instance information within a graph based interface has been investigated before, for example Fenfire (http://www.nongnu.org/fenfire/) and IsaViz (http://www.w3.org/2001/11/IsaViz), trying to display even a subset of large museum collections in a graph-based visualization results in a confusing and messy display for the user. 
Instead, we base instance visualization and querying on mSpace interfaces (http://mspace.ecs.soton.ac.uk; McGuffin, 2004). mSpace is an interaction model designed to allow a user to navigate in a meaningful manner the multi-dimensional space that an ontology can provide. Our mSpace interface uses a multipanel display, where 'slices' through the ontology are presented as columns arranged from left to right. Selecting a value in a slice updates the display so that the values displayed in the next slice (i.e. to the right of the current slice) are related to that value. For example, if there is a slice of artists and the next slice is painting titles, then selecting an artist will display only that artist's paintings in the titles slice. When an item is chosen in a slice, details about that item are displayed in a detail panel. Slices can be freely interchanged or removed, and new slices can be added to the mSpace. The ontology simplification interface, based on TouchGraph (http://www.touchgraph.com), allows users to browse and add the slices in which they are interested into the mSpace browser, where they can be arranged to suit the user's preference. A preview panel displays the current slice arrangement. Predefined groups of slices can be selected, and users are able to save and load their own arrangements.

Content Based Searching

Searching by content allows the user to query and compare different aspects of 2D images and 3D models. For example, a user can find images that have a pattern or colour similar to an image that they supply, or they can find other objects in a collection that have a similar shape to a 3D model that they have already found. More specialized search capabilities are also available; for example, allowing users to search for paintings with a particular type of craquelure identified according to image-based analysis of the pattern of cracks in the painting surface.
Another example is finding high quality colour images based on low quality black and white images, in particular photocopies and faxes, which is useful for museums that provide identification or picture services. Further details of our image-based searching can be found in (Addis, 2002; Lewis, 2004). For 3D content-based retrieval, in addition to simple searches according to various 3D shape descriptors (for example, Zhang, 2001), we are developing some specific 3D searching applications. These include a way to compare figurines with the moulds from which they were produced. This is an interesting problem that highlights some of the benefits and challenges of content based searching. When clay figurines are created from a mould (which is often in several pieces itself) and are then fired in a kiln, the figurine will often shrink and distort. Second generation moulds are sometimes made from such figurines, and these moulds are used to make further figurines. Over time, the moulds and figurines are dispersed and often find their way into different museums. The challenge is to match them again, an ideal application for 3D content analysis. Whilst content, concept and text based queries each have their individual merits, querying by any one aspect in isolation can still result in too many hits when large collections are being searched. However, when these search modalities are combined, new user search scenarios can be supported and much better results are achieved. For example, a user might search for items of furniture that have upholstery of a particular colour or texture, or search for religious oil paintings that used a pigment of a particular shade of blue; for example, to study the transition from lapis to artificial aquamarine pigment. See the example in Figure 5, where a user selects a colour with a picker tool and also enters a keyword in a text form.
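A combined colour and keyword query of this kind can be sketched as a text filter followed by a ranking on colour distance. This is a deliberately simplified illustration, not the descriptors used in Sculpteur: it assumes each item carries a single hypothetical `dominant_rgb` value and compares colours by Euclidean distance in RGB space, and the item records and threshold are invented for the example.

```python
import math
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# Hypothetical items; 'dominant_rgb' stands in for a real colour descriptor.
ITEMS: List[Dict[str, object]] = [
    {"id": "W1", "description": "upholstered chair", "dominant_rgb": (178, 34, 34)},
    {"id": "W2", "description": "carved oak chair", "dominant_rgb": (120, 90, 60)},
    {"id": "W3", "description": "silk wall hanging", "dominant_rgb": (200, 30, 40)},
]


def colour_distance(a: RGB, b: RGB) -> float:
    """Euclidean distance between two RGB colours (an assumed, simple metric)."""
    return math.dist(a, b)


def combined_search(items: List[Dict[str, object]], keyword: str,
                    target: RGB, max_distance: float) -> List[str]:
    """Keep items whose description contains the keyword and whose colour is
    within max_distance of the target; return ids ordered by closeness."""
    scored = []
    for item in items:
        if keyword.lower() not in str(item["description"]).lower():
            continue  # fails the text part of the query
        distance = colour_distance(item["dominant_rgb"], target)
        if distance <= max_distance:
            scored.append((distance, item["id"]))
    return [item_id for _, item_id in sorted(scored)]


# 'Red chairs': keyword from the text form, colour from the picker tool.
red_chairs = combined_search(ITEMS, "chair", target=(200, 0, 0), max_distance=80.0)
```

Applying the text filter first keeps the more expensive content comparison to the subset of items that can match at all, which is one plausible way to combine the two modalities.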
The results of the search are shown in Figure 6, which presents thumbnails of the matching objects in the collection (the V&A in this case).

(Figure: Specification of a combined content and text based query to find 'red chairs' in the Victoria and Albert Museum.)

An example of a 3D query is shown in Figure 7. The user uploads a VRML model of an object to be used as the basis of finding other objects in the collection with a similar shape. The first three results of the query are shown in Figure 8. The objects found clearly have a similar 3D shape, but also note that they have different textual descriptions (sugar shaker, ceramic object etc.). Text based searching in isolation would have proved difficult since a large variety of objects would need to be included in order to cover all the items likely to have a similar shape. The user would then have to manually sort through a large number of results.

Clustering and Classification of Museum Objects Using Shape

In Sculpteur, the combination of content semantics (colour, pattern, shape) and application semantics (who, what, where, when etc.) also forms the basis of a classifier whereby multimedia can be analyzed to classify the art object represented according to art domain semantics. For example, flat round objects which are gold/silver/bronze in colour might be automatically classified as coins, medals or metal plates. Likewise, an oil painting with surface relief of a spider web type pattern might be classified in terms of severity of craquelure and need for restoration. In Sculpteur we use a classifier agent that has both k-Means and k-NN classifiers available. The classifier agent is supplied with a dataset based on 3D models held by the museum partner. The user is able to choose the parameters for the classifier and train it against the data set supplied. The classifier is then used to label new objects supplied by the user, who can graphically inspect the clusters containing the other models with the same label.
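As a rough illustration of the k-NN option, the sketch below labels a new object from a small labelled set of shape descriptors and then lists the other models sharing that label. It is not the Sculpteur classifier agent: the two-value descriptors (a hypothetical 'flatness' and 'elongation'), the training data and the function names are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Descriptor = Tuple[float, float]  # hypothetical (flatness, elongation) values

# Hypothetical labelled training set derived from 3D models.
TRAINING: List[Tuple[Descriptor, str]] = [
    ((0.95, 0.05), "coin"),
    ((0.90, 0.08), "coin"),
    ((0.92, 0.04), "coin"),
    ((0.30, 0.85), "vase"),
    ((0.35, 0.90), "vase"),
    ((0.25, 0.80), "vase"),
]


def knn_classify(descriptor: Descriptor,
                 training: Sequence[Tuple[Descriptor, str]],
                 k: int = 3) -> str:
    """Label a descriptor by majority vote among its k nearest neighbours."""
    nearest = sorted(training, key=lambda ex: math.dist(ex[0], descriptor))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def models_with_label(label: str,
                      training: Sequence[Tuple[Descriptor, str]]) -> List[Descriptor]:
    """Descriptors of the training models that carry the given label."""
    return [desc for desc, lab in training if lab == label]


new_object: Descriptor = (0.88, 0.10)
label = knn_classify(new_object, TRAINING, k=3)
neighbours = models_with_label(label, TRAINING)
```

Here `k` plays the role of a user-chosen classifier parameter; in practice the descriptors would have many more dimensions than the two used in this sketch.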
The ability of the classifier to cluster objects of a similar shape can also be used to inspect groups of objects that share a similar trait. This is useful when the user wants to explore how a particular style (e.g. amphora style of Grecian urn) has been implemented across a range of periods, geographical locations, materials and artists.

Web Interface

The Concept Browser, mSpace values explorer, and content-based searching interfaces described above are part of an integrated Web interface. A series of interlinked pages allow the user to move back and forth between these different aspects of searching. For example, the results of content-based searches can be transferred to the Concept Browser interface, allowing users to perform mSpace queries with the content query results. The target users are museum professionals or similar 'power users' who require advanced searching and exploration tools. The user interface is not intended for general-purpose public access, e.g. on a Museum Web site or inside a gallery terminal. In addition to integrated and multimodal searching, several supporting tools are available to help the professional user in using the system. These include:
The rest of this section presents a series of screen shots of a typical exploration and querying activity by a user (based on a collection of objects from the V&A).

CRM Mapping

Each museum and gallery has mapped the information in their legacy systems to the CRM in order to fully benefit from Sculpteur (the system can be used without any mappings, or with other mappings such as Dublin Core). A graphical representation of this mapping process is shown in Figure 18 along with the resultant text mappings in Figure 19. In Sculpteur, we found that ontology mapping requires close collaboration between computer scientists who understand ontologies and knowledge engineering in the context of the Sculpteur software system, museum professionals who understand the legacy data and the cultural heritage domain, and external experts who understand the CRM, its origins and how to use it. Collaboration among these parties is costly in time and effort, but is also essential to achieve accurate and meaningful mapping of each user's legacy data to the CRM framework. Once the mapping has been completed and the corresponding legacy dataset identified, work still needs to be done to export the data from museum and gallery legacy systems so it can be imported into Sculpteur. This in itself presents issues due to the different staff involved at the user site or service provider and the need for suitable formats and transfer mechanisms. Population of Sculpteur with this metadata from legacy systems is heavily dependent on the structure and semantics of the user's metadata, which are not always explicit in the legacy data structure. As a result, further manual steps are needed to ensure the data imported into Sculpteur matches the semantics of the mappings. We found that there is a balance between the level of interoperability desired and the effort needed to use the CRM. Currently, due to a lack of CRM tool support and of examples and processes to follow, the effort is significant.
On the other hand, the benefits of using the CRM are clear, and the level of interoperability and cross-collection searching that can be achieved both 'in house' and with external systems goes far beyond what can be done with simpler approaches such as Dublin Core. We also anticipate the CRM gaining adoption, especially now that it has been submitted as an ISO standard and real examples of its use start to appear. As methodologies, tools and examples become more prevalent, the barriers will come down. In effect, Sculpteur has been one of several 'frontier' efforts that we believe will help ease the life of the 'settlers' that follow.

Interoperability and Remote Access over the Web

One of the main benefits of mapping to a common ontology is to achieve interoperability and cross-collection searching, both within a set of legacy systems installed at a museum or gallery site, and between separate museums over the Web. However, there is more to interoperability than mapping to a shared ontology, especially if the objective is to provide remote access to museum information. Sculpteur takes a layered approach to interoperability based on a series of Web and cultural heritage domain standards. The 'nuts and bolts' of interoperability are provided by XML as a syntax to structure information, Web Service standards to physically allow exchange of this information over the Internet, SRW (http://www.loc.gov/z3950/agency/zing/srw/) to provide a protocol that allows one party to request information from another party, and CQL (http://www.loc.gov/z3950/agency/zing/cql/) to act as a query language to express what information is desired. These 'nuts and bolts' allow the syntactic exchange of information and do not say anything about the meaning of this information, i.e. the semantics.
This is fine if the only use of the information is 'one collection at a time' by people who can read additional user manuals and descriptions of the semantics of the specific service and data they are using. However, this does not work if software systems need to communicate with each other to use the information; for example, an eLearning tool that sources content from a remote museum, or a search engine that can search across and combine content from more than one source. These applications need the semantics to be made explicit. This is the next level up. The data semantics need to be made explicit both in terms of structure (what each field or attribute means) and content (what controlled vocabularies and languages are involved). Finally, the supported search capabilities need to be described so that a remote system knows what parts of the collection can be queried and in what way. The interoperability 'stack' that we adopt is shown in Figure 20. In Sculpteur we have implemented this stack based around an extended version of the SRW 1.1 standard (Addis, 2004). The CRM domain ontology is expressed in a standard ontology language, RDF (http://www.w3.org/RDF/), and is made available for download. The mappings of the legacy systems to the CRM are published as an XML structure, available through the SRW 'explain' operation. The SRW is able to dynamically map CQL queries expressed in terms of these CRM mappings to the relevant legacy database fields, execute a combined metadata and content search, and then return the results as XML structured according to the CRM mappings. The user can explore the CRM ontology and then use the SRW to retrieve corresponding instances. These instances can then be displayed to the user, for example as slices in the mSpace viewer. By using the CRM mappings and the common CRM ontology, a common result schema is achieved and cross-collection searching can be done across multiple art object collections.
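The step of mapping a CQL query onto legacy database fields can be sketched as follows. This is only an illustration under stated assumptions, not the Sculpteur SRW implementation: the index names (`crm.creator`, `crm.title`, `crm.technique`) and the legacy column names are hypothetical, and the translator handles just a tiny subset of CQL, namely `index = "term"` clauses joined by `and`.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mapping from CRM-based index names to legacy database columns.
INDEX_TO_COLUMN: Dict[str, str] = {
    "crm.creator": "objects.artist_name",
    "crm.title": "objects.title_text",
    "crm.technique": "objects.medium",
}

# One clause of the supported subset: index = "term"
CLAUSE = re.compile(r'\s*([\w.]+)\s*=\s*"([^"]*)"\s*')


def translate(cql: str) -> Tuple[str, List[str]]:
    """Translate a simple CQL conjunction into a parameterized SQL condition.

    Limitation of this sketch: terms that themselves contain ' and ' are
    not supported, because clauses are split on that keyword.
    """
    conditions: List[str] = []
    params: List[str] = []
    for clause in re.split(r"\s+and\s+", cql.strip(), flags=re.IGNORECASE):
        match = CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        index, term = match.groups()
        if index not in INDEX_TO_COLUMN:
            raise ValueError(f"unknown index: {index}")
        # Substring match on the legacy column; the term is passed as a
        # parameter rather than interpolated into the SQL text.
        conditions.append(f"{INDEX_TO_COLUMN[index]} LIKE ?")
        params.append(f"%{term}%")
    return " AND ".join(conditions), params


where_clause, parameters = translate(
    'crm.creator = "van gogh" and crm.title = "sunflowers"'
)
```

Keeping the mapping in a table separate from the translation logic reflects the idea that each collection publishes its own mapping while remote clients phrase queries only in terms of the shared index names.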
In this way we leverage Semantic Web (http://www.semanticweb.org) techniques to describe and visualize the complex space of cultural heritage information, whilst using XML and Web Service standards to provide an easy to use search and retrieval service to access this information.

eLearning

Giunti Interactive Labs have used the SRW interface to integrate their Learning Content Management System, Learn eXact. The e-learning system can search and import 2D and 3D images and related metadata from remote collections and use them to build new cultural learning objects. These learning objects include virtual museums and galleries which offer an interactive learning experience following e-learning standards such as IEEE LOM, IMS Content Package and ADL SCORM. Functionality includes evaluation of sessions (based on the IMS QTI specifications) and tracking of end-user actions based on the ADL SCORM specification.

Conclusions

This paper has presented our approach to search, retrieval, navigation and interoperability of multimedia museum collections over the Web. Mapping of museum legacy information to the CRM ontology is our bedrock, and around this are built advanced search modalities, innovative navigation and exploration tools, and the ability to provide access to this functionality to remote applications over the Web. Sculpteur has needed to overcome significant technological barriers to make this possible, and the investment needed by museums and galleries to fully benefit from our approach should not be underestimated. The benefits are significant and include integrated and powerful access to multimedia museum information as well as the ability to deliver this capability to remote users and collaborating organizations.

References

Zhang, C. and T. Chen (2001). Efficient Feature Extraction for 2D/3D Objects in Mesh Representation. ICIP 2001, pages 935-938, Thessaloniki, Greece.

Addis, M., M. Boniface, S. Goodall, P. Grimwood, S. Kim, P. Lewis, K. Martinez, and A. Stevenson (2003). SCULPTEUR: Towards a New Paradigm for Multimedia Museum Information Handling. In Proceedings of Semantic Web ISWC 2870, pages 582-596.

Addis, M., P. Lewis, and K. Martinez (2002). ARTISTE image retrieval system puts European galleries in the picture. Cultivate Interactive.

Lewis, P. H., K. Martinez, F.S. Abas, M.F. Ahmad Fauzi, M. Addis, C. Lahanier, J. Stevenson, S.C.Y. Chan, J.B. Mike, and G. Paul (2004). An Integrated Content and Metadata based Retrieval System for Art. IEEE Transactions on Image Processing 13(3), pages 302-313.

McGuffin, M. J. and M.C. Schraefel (2004). A Comparison of Hyperstructures: Zzstructures, mSpaces, and Polyarchies. In Proceedings of ACM Conference on Hypertext and Hypermedia, 2004 (in press), pages 153-162, Santa Cruz, California, USA.

Acknowledgements

The authors wish to thank the European Commission for support through the SCULPTEUR project under grant IST-2001-35372. We would also like to thank our collaborators on the project, including Francis Schmitt and Tony Tung of ENST, Paris, Christian Lahanier of C2RMF, James Stevenson and Rachel Coates of the V&A museum, Joseph Padfield of the National Gallery, Raffaela Rimaboschi of the Uffizi and Jean-Pierre of the Musee de Cherbourg for many useful discussions, use of data and valuable help and advice; Patrick Sinclair and Simon Goodall from IAM and Adrian Pillinger and Dan Prideaux from IT Innovation for their intellectual contributions and development of the Sculpteur system; M. Chapman, P. Dibdin, A. Pillinger, R. Sadotra, S. Samangooei, A. Smithson and T. Wirdyanto who, as Master of Engineering students, contributed to the development of SCULPTEUR prototypes; Patrick Le Boeuf of the Bibliothèque nationale de France for assistance with mapping to the CRM; TouchGraph (http://www.touchgraph.com) for software used in the concept browser; and Hewlett Packard's Art & Science programme for the donation of server equipment.

Cite as: Addis et al., New Ways to Search, Navigate and Use Multimedia Museum Collections over the Web, in J. Trant and D. Bearman (eds.). Museums and the Web 2005: Proceedings, Toronto: Archives & Museum Informatics, published March 31, 2005 at http://www.archimuse.com/mw2005/papers/addis/addis.html
Last updated: April 2005. Analytic scripts updated: October 2010.
Copyright © 2005 - Archives & Museum Informatics.