October 24-26, 2007
Toronto, Ontario, Canada

Paper: The eye of the beholder: steve.museum and social tagging of museum collections

Jennifer Trant and David Bearman, Archives & Museum Informatics, Canada; and Susan Chun, Cultural Heritage Consulting, USA


Steve is a two-year-old collaboration between art museums whose research project, funded by the U.S. Institute of Museum and Library Services, seeks to learn whether (and how) social tagging can serve art museums. We are interested in discovering whether and how non-professionals’ descriptions of artworks differ from professional cataloguing, as well as how volunteer taggers may experience the activity of looking at and describing museum collections. To support our research, we have built an open source tagging tool, developed methods for analyzing user-contributed descriptions of artworks, and begun to study the nature of the tags contributed by end users. In keeping with the open, collaborative philosophy of the project, we will make both our research results and our raw data available to interested members of the community. Our briefing will provide an update on current and future project activities and discoveries, including a review of our tools, a summary of findings to date, and a discussion of our data set and analytical framework.

Keywords: social tagging, museums, open source, folksonomy, research

About steve

Steve is a two-year-old consortium of art museums interested in exploring the role user-contributed descriptions can play in improving on-line access to works of art. Participants include:

  • Denver Art Museum
  • Guggenheim Museum
  • The Cleveland Museum of Art
  • Indianapolis Museum of Art
  • Los Angeles County Museum of Art
  • The Metropolitan Museum of Art
  • Minneapolis Institute of Arts
  • The Rubin Museum of Art
  • San Francisco Museum of Modern Art
  • Archives & Museum Informatics
  • Think Design

Our collaboration was catalyzed by the coincidence of a growing interest in user-contributed content and the emergence of Web 2.0 technical strategies to support it. Steve’s origins, and the potential for user-contributed content in museums, have been widely discussed elsewhere (Bearman & Trant, 2005; Chun, Cherry, Hiwiller, Trant, & Wyman, 2006; Trant, 2006a, 2006b; Trant & Wyman, 2006).

To build our understanding of social tagging and folksonomy, we’ve developed a research project, now funded by the U.S. Institute of Museum and Library Services, that seeks to learn whether (and how) social tagging can serve art museums. With a National Leadership Grant that runs from October 2006 through September 2008 we are exploring the question: Can social tagging and folksonomy improve access to art museum collections on-line? (The Metropolitan Museum of Art, Indianapolis Museum of Art, Chun, Stein, & Trant, 2006).

The steve.museum research agenda addresses three aspects of social tagging: users’ interactions with tagging interfaces, tags themselves, and tags in relation to other vocabularies used to describe works of art (Figure 1). In addition, it examines the question of how social tagging is perceived in museums and whether the results of the steve research influence that perception.

Figure 1
Figure 1. Research questions are positioned in the data collection and analysis process of the steve.museum research project, from term collection in a social tagging environment through folksonomy analysis, comparison with controlled vocabularies and assessment in relationship to the work of art tagged.

Museum attitudes towards social tagging

A part of understanding tagging and the contribution it might make to the accessibility of online art museum collections is to understand the barriers to incorporating user-contributed content into museum documentation. Much effort has been invested in standards for collections description, and the anarchy of emergent folksonomy seemed a cause for concern (as it is in bibliographic circles (Guy & Tonkin, 2006)). To appreciate the nature of our colleagues’ concerns, our research program includes surveys of attitudes towards social tagging before and after the results of our research are known.

Very preliminary analysis of a “baseline” attitude survey revealed largely positive, if somewhat uncertain, attitudes to the potential for social tagging (Figure 2). When asked about the statement “Museums Could Use Social Tagging”, more than 66% of respondents were inclined to agree (answering “somewhat agree”, “agree”, or “strongly agree”). There was divergence by area of responsibility, as hypothesized. Technologists were all positive, as were Management/Executive. Those in Collections Information Management and the Library were more likely to “somewhat disagree” or “disagree”. A large number of “don’t know” and “no response” replies (more than 23% of responses) shows the jury is still out on the value of social tagging in the art museum context.

Figure 2
Figure 2. Museum Attitude Survey: Baseline: Museums Could Use Social Tagging. Responses By Role

The steve tagger

To test the ways in which users tag works of art, and to gather tags in an environment that facilitates analysis, we have developed the steve tagger. This open source piece of software allows us to alter the interfaces that are randomly assigned to taggers and to keep track of which interface features each user was exposed to. Because the experiment required that we examine a range of interfaces, the tool has been designed to simplify how interfaces are associated with the data, thereby enabling any institution that adopts it to create a custom interface for itself. The steve tagger has been developed with an eye to utility beyond our research project; we envision that it will provide a foundation for museums wishing to deploy tagging on their sites.
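The core experimental mechanism described above can be sketched in a few lines: each tagging session is randomly assigned an interface configuration, and that configuration is stored alongside every tag so that later analysis can group tags by the features users saw. This is an illustrative sketch only; the names (`INTERFACES`, `start_session`, `record_tag`) and data layout are assumptions, not the actual steve tagger code.

```python
import random

# Hypothetical interface configurations, mirroring the experimental
# variables discussed in this paper (metadata shown or hidden,
# others' tags shown or hidden).
INTERFACES = [
    {"show_metadata": False, "show_tags": False},  # baseline
    {"show_metadata": True,  "show_tags": False},
    {"show_metadata": False, "show_tags": True},
    {"show_metadata": True,  "show_tags": True},
]

def start_session(user_id):
    """Randomly assign an interface configuration to a new session."""
    config = random.choice(INTERFACES)
    return {"user": user_id, "config": config, "tags": []}

def record_tag(session, work_id, term):
    """Store a tag together with the interface the user was shown."""
    session["tags"].append(
        {"work": work_id, "term": term, "config": session["config"]}
    )

session = start_session("visitor-42")
record_tag(session, "work-001", "horse")
```

Because every stored tag carries its interface configuration, tags collected under different environments can be compared directly, which is the analysis the experiment depends on.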

The tagger, like other software developed by the steve project, is an open source tool; developers, information architects, webmasters, museum IT staff, and anyone else with an interest in contributing are invited to join the community and contribute to a well-defined open source project with established standards of software quality. Our software development work is documented in an online tracking tool: http://trac.steve.museum. The software package itself is available from http://sourceforge.net/projects/steve-museum.

In addition, steve.museum maintains several discussion lists for community involvement in aspects of the project. The steve.discuss list is the place for sharing thoughts about social tagging in general, and the tagging of works of art in particular. Two specialized lists complement it: steve.tech, for general discussion of technical issues, and steve.dev, used by the project’s active developers to discuss the software coding, testing, and revision cycle.

User interfaces for tagging

How does the tagger interface influence tagging behaviour (as shown by the tags assigned)? When we looked at popular tagging tools, such as flickr and del.icio.us, we noticed variations in the way that tags are assigned. Before deploying tagging on museum sites, we wanted to be aware of the impact of interface variables on tagger behaviour. We realized that the institutions collaborating in steve envisioned a number of different deployment scenarios, and we wanted to understand the variables that would determine success. For example, if an institutional goal was to collect as many tags as possible for a work of art, we needed to understand what factors might limit tagging.

Over the course of a year starting in late March 2007, the steve tagger is being made available in a number of different configurations, each guided by hypotheses about user behaviour. Because we are interested in what motivates users to tag, and to continue tagging, we’ve launched our tagging experiment in two venues. At http://tagger.steve.museum the general public has been invited to tag a selection of works drawn from participating collections. A functionally identical, but branded, implementation of the tagger has been launched privately at The Metropolitan Museum of Art, containing only Met works. We wonder whether users who feel affiliated with a particular museum will tag differently. These parallel installations will allow us to compare what users do in different contexts.

Each environment is deployed for a set period of time. Users and their tags are linked to a record of environment variables, so that we will be able to analyze accumulated tag data and determine the effects, if any, of tagger interface variables on tagging behaviour. To date we have tested such options as showing or not showing museum metadata, and showing or not showing tags assigned by others. We have also examined how showing groups of works, or allowing users to choose works to tag, affects tagging.

No tags, no metadata

We began our tagging experiments in March of 2007 with an environment designed to gather baseline data. The steve tagger was deployed in its simplest configuration, showing only an image of a work and a box to collect tags (Figure 3).

Figure 3
Figure 3. steve tagger: Do users tag differently when they don't see others' tags or museum metadata?

Show metadata

The first significant question we asked was whether the presence of museum documentation for a work of art influences the tags assigned. Would users mimic a museum label, or would they contribute new, different tags? We designed an environment (Figure 4) that adds museum metadata, formatted as ‘traditional label copy’, to the tagger interface, and will compare tags assigned to the same work with and without metadata showing.

Figure 4
Figure 4. steve tagger: show metadata. Do the tags supplied by users vary when they can see museum documentation?

Show tags

Does user behaviour change when they see the tags that others assign? We can hypothesize two possibilities: that users mimic what is presented to them, or that they strive to be different. Understanding this is critical to future deployments of tagging on museum sites, particularly if statistical thresholding is considered as a way of reviewing tags contributed. If a tag is considered useful after it has been assigned n times, then an interface that impedes the assignment of tags perturbs this equation. An experimental interface that shows tags previously assigned (Figure 5) will allow us to determine if user tagging is encouraged, dissuaded or otherwise influenced by the presence or absence of pre-existing tags for works of art.
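The statistical thresholding idea above, where a tag counts as useful only once it has been assigned n times, can be sketched simply. The data and the function name are invented for illustration; they are not part of the steve tagger.

```python
from collections import Counter

def useful_tags(assignments, n=3):
    """Return the tags assigned to a work at least n times.

    `assignments` is the list of raw tag terms contributed for one
    work, possibly with repeats from different taggers.
    """
    counts = Counter(assignments)
    return {term for term, count in counts.items() if count >= n}

# Hypothetical tag stream for one work of art:
tags_for_work = ["horse", "horse", "brown", "horse", "field", "brown"]
print(useful_tags(tags_for_work, n=3))  # -> {'horse'}
```

The point made in the text follows directly: if an interface discourages users from repeating tags they can already see, counts stay below the threshold and otherwise-useful tags never qualify.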

Figure 5
Figure 5. steve tagger: show tags. Do the tags supplied by users vary when they can see what others have done?

Show tags and metadata

Exploring the relationships between user-supplied tags and the presence or absence of museum metadata and others’ tags raises questions about interaction effects between metadata and tags. A steve tagger interface showing both museum metadata and user-supplied tags has been deployed (Figure 6). We wondered whether users might just ‘give up’ at this point, thinking there was nothing else to say. We also hypothesized that tags contributed in this environment might be the most useful, as they may add the most to the description of the work of art.

Figure 6
Figure 6. steve tagger: show tags and metadata. Do the tags supplied by users vary when they can see user tags and museum documentation?

Works in sets

One scenario for deploying tagging envisioned users volunteering to tag works of art as their contribution to the museum. Here, creating an environment that effectively stimulated tagging would be important. We hypothesized that users were likely to ‘get in the groove’ when tagging similar works, and that their tagging of sets of like works might be more useful than the tags assigned to randomly presented, diverse groups. We are testing this hypothesis with an environment that groups works in sets by medium (Figure 7), providing some continuity between one work and the next and preventing the jarring sense of seeing a non-representational contemporary painting right after a classical sculpture.

Figure 7
Figure 7. steve tagger: show sets. Do users 'get in the groove' when they tag groups of like works?

The tags people assign

To understand the tags people assign we need to look at them in a number of different ways. We have developed methods for analyzing user-contributed descriptions of artworks, and begun to study the nature of the tags. Reporting tools are being developed for the steve tagger that describe tags statistically: to determine, for example, how many tags are assigned to each work, and to establish how much this varies by tagger, by type of work, and by tagging environment.
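The kind of descriptive reporting mentioned above amounts to grouping tag assignments by work and summarizing the counts. A minimal sketch, assuming a flat list of (work, term) pairs rather than the steve tagger's actual schema:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tag assignments: (work identifier, tag term) pairs.
assignments = [
    ("work-001", "horse"), ("work-001", "brown"),
    ("work-002", "vase"),  ("work-001", "field"),
]

# Group terms by the work they describe.
tags_by_work = defaultdict(list)
for work, term in assignments:
    tags_by_work[work].append(term)

# Per-work tag counts, and a simple summary statistic across works.
counts = {work: len(terms) for work, terms in tags_by_work.items()}
print(counts)                 # tags assigned to each work
print(mean(counts.values()))  # average tags per work
```

The same grouping could be keyed by tagger or by interface environment instead of by work, which is how variation across those dimensions would be measured.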

Term review

Collections documentation specialists in museums have expressed doubts about the quality of tagging. One respondent to the Baseline Attitude Survey captured this concern well: “if well managed, this could be useful. if not, utter chaos” (Trant & steve.museum, 2007). During our research, museum staff will review all tags assigned to works of art and assess their usefulness for searching – their potential to aid retrieval in an online environment.

A tool has been developed specifically to support this function (Figure 8). Museum staff can approach the review either from a display of works that have been tagged (which shows the number of unique terms assigned and reviewed for each work) or from a display of terms assigned (which then displays the works which have been tagged with that term). In both cases, they are offered options of indicating that the terms could be useful or not, and characterizing the term in a variety of ways (such as that it is judgmental).

Figure 8
Figure 8. steve tagger term review: unique tags assigned to works of art are presented for review by museum staff

With this input from museum staff, we will be able to characterize the relevance and utility of the contributed content on another dimension. The review tool is available in a “pre-release” and will be made widely available once we have had sufficient experience to be comfortable with its performance for the project requirements, and its flexibility for institution-specific review needs. Our goal has not been to censor tags – though there is a blacklist in place to prevent obscenities from being displayed to the public. Term review within the context of our research project is intended to help us understand the nature and contribution of user-contributed tags.

Preliminary tagging results

Promising results from early prototype analysis showed that users could contribute significant numbers of new terms, reflecting new concepts (Trant, 2006b). Indeed, up to 90% of the terms that users contributed were not present in the documentation of the works as provided by the museums, even though those works had quite rich records. Significant numbers of new terms, provided by more than one or two visitors, revealed that users see, and presumably recall, details of images that are not explicitly documented by professionals, suggesting some scope for using these terms to aid in discovery.

These preliminary results are holding up in our more formal experiments. Of the tags assigned to all works during Term Set 1 (March 27 – July 11, 2007), 76.5% (7,973 of 10,418) were not found in museum documentation.

Search log analysis

However, even if we find that many new and appropriate terms are provided by the public, we know almost nothing about what searchers of museum collections actually seek, and so cannot really say what contribution tags could make. To date, we have found no systematic, published information retrieval studies using museum databases. Therefore, Trant conducted a second preliminary study, using the search logs of the Guggenheim Museum (Trant, 2006c). She found that:

  • artists’ names were searched significantly more often than other things (comprising 63% of searches made more than 10 times)
  • the usage curve strongly matched the academic year and the days of the week (with significant dips on weekends)
  • the exhibition program does influence searches (searches related to the theme of a show increase during its time on view)
  • search terms are very diverse. The most popular search term 'picasso' made up only 2.8% of searches; the curve of term distribution was very steep and the tail exceptionally long
  • the characteristics of the tail differ from the 'head' of the curve; infrequent searches were more likely to be for subject-related topics, and combinations of categories
  • spelling errors accounted for 36% of unsuccessful searches, but half (50%) of the unsuccessful artists’ name searches failed because of a spelling error.
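The steep head and long tail described in these findings are straightforward to measure from a query log. A hedged sketch, using an invented toy log rather than the Guggenheim data:

```python
from collections import Counter

# Invented search-log sample, standing in for real query logs.
log = ["picasso", "picasso", "kandinsky", "blue rider", "picasso",
       "miro", "guggenheim hours", "kandinsky", "rothko", "pollock"]

freq = Counter(log)
ranked = freq.most_common()   # terms ordered by descending frequency
total = sum(freq.values())

# Share of all searches taken by the single most popular term
# (2.8% for 'picasso' in the Guggenheim study).
head_share = ranked[0][1] / total

# Size of the long tail: terms searched exactly once.
singletons = sum(1 for _, count in ranked if count == 1)

print(f"top term: {ranked[0][0]} ({head_share:.0%} of searches)")
print(f"terms searched only once: {singletons} of {len(ranked)}")
```

Inspecting the tail terms themselves, rather than just their counts, is how the study could distinguish the subject-oriented queries in the tail from the artist-name queries in the head.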

This preliminary study made it clear that comparative analysis of searching in a number of museum databases is essential if we are to draw usable conclusions from the folksonomy contributions received in tagging. Does the penchant for specificity displayed in the Guggenheim logs match the level of interest common in the browsing museum visitor, or does the collection search function at the Guggenheim somehow encourage the use of a specific query term? Is browsing behaviour supported elsewhere? Does searching the Guggenheim collection on-line reflect the focus of modern art-making and critical theory, or are users adjusting to the data source, knowing that artists’ name searches are most likely to be successful?

We need to know more about what users really search for, and how that behaviour differs in different types of museums. Then we can make some informed decisions about how to facilitate access, both to individual collections and to aggregated collections data, and assess the contribution of user tagging and the resulting folksonomy.

The tags in relation to vocabularies

For many years, museums have been urged to use controlled vocabularies to improve access to their collections. Since there have not been empirical studies of searches of museum databases, we are not able to say whether this strategy would work, beyond the theory that controlled vocabulary improves precision. But we can compare the terms that taggers supply to those in controlled vocabularies. This enables us to establish whether the current documentation, enhanced by terms that are related through controlled vocabularies, would contain a greater number of the tags that users assign, and therefore be more likely to match a user's search term.

We begin by automatically comparing terms assigned by users to the museum’s own documentation (as noted above), to the Art and Architecture Thesaurus (AAT) and Union List of Artist Names (ULAN) controlled vocabularies, and to WordNet, a lexical database of English-language terms and concepts. By identifying not just the term itself, but the hierarchical concept tree to which the term belongs, we hope to determine to what extent taggers are elaborating the descriptions by applying a broader range of language, or whether their vocabulary is independent of the concepts described by museum professionals.
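The comparison pipeline above can be sketched as a simple classification of each tag: found in the museum's documentation, found in a controlled vocabulary (with its broader-concept path), or genuinely novel. The tiny vocabulary and documentation dictionaries below are hypothetical stand-ins for resources like the AAT or WordNet, whose real structures and APIs are far richer.

```python
# Hypothetical controlled vocabulary: term -> path of broader concepts.
vocabulary = {
    "horse":    ["animal", "mammal", "equine"],
    "stallion": ["animal", "mammal", "equine"],
    "vase":     ["object", "container"],
}

# Hypothetical museum documentation: work id -> set of catalogued terms.
documentation = {"work-001": {"oil on canvas", "equestrian portrait"}}

def classify(work_id, tag):
    """Label a tag as in-documentation, in-vocabulary, or novel.

    For vocabulary matches, return the broader-concept path as well,
    so related tags can be grouped under a shared concept tree.
    """
    if tag in documentation.get(work_id, set()):
        return ("documentation", None)
    if tag in vocabulary:
        return ("vocabulary", vocabulary[tag])
    return ("novel", None)

print(classify("work-001", "horse"))  # matched via the vocabulary
print(classify("work-001", "zebra"))  # a genuinely new term
```

Grouping tags by the returned concept path is what would let the analysis say whether taggers are elaborating existing concepts in new words or introducing concepts absent from professional description.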

Term analysis tools to perform these comparisons are in the final stages of testing and will be added to the open source steve tagger software.

User Motivations for Tagging

Tags are one indication of tagger behaviour that we can study independent of the taggers themselves. But we are also interested in the motivations for tagging. We are planning to survey the people who return to the steve tagger regularly and tag large numbers of works (both in the multi-institutional and The Metropolitan Museum of Art context). We hope to determine what motivates taggers, and establish how we might best reach people like them who find social tagging of museum collections rewarding.

Future work

This paper highlights the first steps in a two-year research agenda exploring the contribution social tagging might make to the accessibility of on-line collections of museum documentation. Subsequent developments of the steve tagger will focus on the more social aspects of ‘social tagging’, empowering the tagger with more choice in the works they will tag, and offering some more immediate rewards for their activity. In tandem, a number of participating museums are moving ahead with deployments of tagging on their Web sites. We’re hoping to compare the results of the steve experiments with data collected ‘live’.

In keeping with the open, collaborative philosophy of the project, we will make both our research results and our raw data available to interested members of the community. Regular reports are posted to http://www.steve.museum and discussion forums are maintained on threaded lists for research, technical issues and impacts. Before the project ends, all its raw data will be deposited in social science data archives for use by others.


Establishing how useful social tagging can be to museums interested in enhancing their documentation and enabling greater access, and determining how museums can best attract and utilize the contributions of the public, is crucially important at this early stage in the implementation of Web 2.0 technologies. Not only will it answer questions that management will want answered before committing resources; co-operative research and broad sharing of results can also save the community a great deal of wasted energy by providing hard data on the behaviour of taggers supported by different interface functions. Early results of the steve.museum research are already proving useful; full results will be reported by late fall 2008.


Acknowledgements

All the members of the steve.museum collaboration contributed to the work reported here. Credit for the implementation of the steve tagger goes to the steve dev team, including Rob Stein, Charlie Moad, Ed Bachta, David Ellis, Willy Lee, and Michael Jenkins.


References

Bearman, D., & Trant, J. (2005). Social Terminology Enhancement through Vernacular Engagement: Exploring Collaborative Annotation to Encourage Interaction with Museum Collections. D-Lib Magazine, 11(9), http://www.dlib.org/dlib/september05/bearman/09bearman.html

Chun, S., Cherry, R., Hiwiller, D., Trant, J., & Wyman, B. (2006). Steve.museum: An Ongoing Experiment in Social Tagging, Folksonomy, and Museums. Paper presented at the Museums and the Web 2006: selected papers from an international conference. Retrieved September 11, 2006, from http://www.archimuse.com/mw2006/papers/wyman/wyman.html.

Guy, M., & Tonkin, E. (2006). Folksonomies: Tidying up Tags? D-Lib Magazine, 12(1), http://www.dlib.org/dlib/january06/guy/01guy.html

The Metropolitan Museum of Art, Indianapolis Museum of Art, Chun, S., Stein, R., & Trant, J. (2006). Researching Social Tagging and Folksonomy in Art Museums: U.S. Institute of Museum and Library Services (IMLS), http://www.imls.gov/applicants/samples/NLG%20RD%20Sample%20The%20Met.pdf

Trant, J. (2006a). Exploring the potential for social tagging and folksonomy in art museums: proof of concept. New Review of Hypermedia and Multimedia, 12(1), 83-105, http://www.archimuse.com/papers/steve-nrhm-0605preprint.pdf

Trant, J. (2006b). Social Classification and Folksonomy in Art Museums: early data from the steve.museum tagger prototype. Paper presented at the ASIST SIG-CR workshop on Social Classification. http://www.archimuse.com/papers/asist-CR-steve-0611.pdf

Trant, J. (2006c). Understanding Searches of an On-line Contemporary Art Museum Catalogue. A Preliminary Study:  Fall 2006 http://conference.archimuse.com/system/files/trantSearchTermAnalysis061220a.pdf

Trant, J., & steve.museum. (2007). Museum Attitudes to Social Tagging: Baseline Survey Responses.

Trant, J., & Wyman, B. (2006). Investigating social tagging and folksonomy in art museums with steve.museum. Paper presented at the World Wide Web 2006: Tagging Workshop. http://www.archimuse.com/research/www2006-tagging-steve.pdf

Cite as:

Trant, J., et al., The eye of the beholder: steve.museum and social tagging of museum collections, in International Cultural Heritage Informatics Meeting (ICHIM07): Proceedings, J. Trant and D. Bearman (eds). Toronto: Archives & Museum Informatics. 2007. Published October 24, 2007 at http://www.archimuse.com/ichim07/papers/trant/trant.html