April 13-17, 2010
Denver, Colorado, USA

Evaluating the On-line Audience of a New Collections Web site

Graham Davies and Dafydd James,  Amgueddfa Cymru – National Museum Wales, United Kingdom


Amgueddfa Cymru – National Museum Wales's virtual museum, Rhagor (Welsh for 'more'), was launched on-line in 2007. The site focuses on increasing accessibility to the vast number of items that are stored out of sight of the general public. Rhagor makes available the Museum's knowledge and the expertise intrinsically linked with its collections and objects.

Prior to the site's launch, the effectiveness of making this information available on-line was difficult to judge. Therefore, as well as practical user-testing exercises, post-launch evaluation was undertaken over an entire year to examine both widescale usage trends and specific visitor behaviour patterns. The methodology combined standard quantitative metrics gathered from statistical packages with a host of qualitative metrics, such as user ratings, comments and user-submitted content.

The results have since prompted evidence-driven targets for the site's performance, as opposed to targets based on expectations alone. In terms of content delivery, new articles published on the site reflect the style of those articles rated highest by users of the site. The site's presence on various social media sites was also studied, giving an overarching understanding of the effectiveness and reach of data disseminated to wider audiences beyond the main Web site. A further requirement of the evaluation was to study the bilingual nature of the site, allowing a comparison of the behaviour patterns of its Welsh and English users.

Keywords: evaluation, user analysis, engagement, on-line audience, user behaviour, metrics, statistics

1. Background

Measuring the popularity of Amgueddfa Cymru – National Museum Wales' Web site

Amgueddfa Cymru – National Museum Wales's Web site had never previously been scrutinised as to its impact on the community and audience it serves.

The Museum Web site has, since its inception in 1996, traditionally reported its success simply as a measure of year-on-year growth in the form of hits, visits and visits over 10 minutes. This was rationalised from 2000 onwards to report only visits (due in large part to increased awareness of the misleading notion of hits and the debatable accuracy of visits over 10 minutes).

These figures, compiled into cumulative quarterly performance indicators, are submitted for corporate accountability and statutory reporting to Government. As long as this primary performance indicator exceeded its targets, Web site activity was not examined further.

Due to the inherent nature of the Internet, activity on our Web site has always tended to increase over time, consistently beating or matching set targets. As a consequence, the Web team at the Museum has never had to specifically account for any significant decline, or negative deviation from these targets.

Figure 1

Fig 1: Graph showing the historical page views per month for the National Museum Wales Web site: 1999-2009 (Various sources)

Forecasting future trends in Museum Web site demand

The rate of growth shown in Figure 1 will, at some point in the future, become unsustainable. Ultimately, this growth will likely peak and begin levelling off. With this in mind, projected targets for Web site performance (based on current methods) will have to be forecast intelligently to take account of these future growth patterns, rather than continuing to be set on predictions of exponential growth.

Further to this, following a decade of exponential growth in traffic to museum Web sites on the back of exponential growth of overall Internet use, there are indications that the share of clicks going to museum Web sites is most likely declining (Peacock, 2007). Therefore, understanding trends for both our site and others in the sector is key to forecasting and understanding future demands to our Web sites.

Given the fast-paced changes in technology and in the social behaviour of the Internet, the use of page views and Web site sessions as key performance indicators by the UK government is becoming increasingly problematic (Haynes, 2007). It therefore seems appropriate to gain a more holistic understanding of our Web site activity – so that, at the least, we do not continue into the future wholly reliant on measuring our Web site's success by traditional statistics alone.

2. Our Virtual Museum, Rhagor

In the past, the Museum devoted time and resources to developing Web site projects where the primary measurement of success was the publication of that project onto the Web. Although this met the needs of the relevant project commissioners, it offered no insight into the value gained by its audience (whoever they may be).

The launch of our Virtual Museum (Rhagor, Welsh for 'more') in 2007 provided an ideal opportunity to examine the impact of this specific project on its on-line audience.

By examining both widescale usage trends and specific visitor behaviour patterns, we have looked at where our users come from, what they are after, what they actually end up looking at and, to some extent, why they might be leaving. Internally, this is a much more effective guide to the success of the site than solely tracking visits. The evaluation assisted with prioritising areas for improvement, as well as justifying the project's investment to our stakeholders.

Rhagor was developed as a means of interpreting and exploring our collections in an on-line environment and increasing accessibility to the vast number of items that are stored out of sight of the general public. With curators writing short and accessible articles, Rhagor makes available the Museum's knowledge and the expertise intrinsically linked with its collections and objects.

Technical background

The administration system for the entire Web site is a custom-built Web content management system called Amgueddfa CMS. Owing to difficulties in finding an appropriate bilingual content management system, Amgueddfa CMS was developed in-house by the Museum's Web Manager. It is a PHP/MySQL system that allows developers to add blocks of code; over several years it has accumulated numerous modules, considerably increasing its functionality. Amgueddfa CMS is now used to publish all Web and intranet pages – including Rhagor – as well as most gallery interactives. It is accessible via the Museum's network through a Web browser. A scaled-down version is available for curators to blog while off site, as well as a secure version for remote technical administration of Web pages. The Museum's Web Manager, Chris Owen, will be demonstrating Amgueddfa CMS at MW2010 in: Amgueddfa CMS: the in-house content management system for National Museum Wales.
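The "blocks of code" module pattern described above can be sketched as a simple registry; this is an illustrative sketch in Python (not the CMS's actual PHP), and the module names and page structure are hypothetical.

```python
# Illustrative sketch of a module-registry pattern like the one
# described for Amgueddfa CMS: developers register named blocks of
# functionality, and pages are composed from registered modules.
class SimpleCMS:
    def __init__(self):
        self.modules = {}  # module name -> render callable

    def register(self, name, render):
        """Add a block of functionality under a given name."""
        self.modules[name] = render

    def render_page(self, module_names, context):
        """Compose a page by rendering the named modules in order."""
        return "\n".join(self.modules[n](context) for n in module_names)

cms = SimpleCMS()
cms.register("header", lambda ctx: f"<h1>{ctx['title']}</h1>")
cms.register("article", lambda ctx: f"<p>{ctx['body']}</p>")
html = cms.render_page(["header", "article"],
                       {"title": "Rhagor", "body": "Welsh for 'more'"})
```

Each new block adds functionality without touching existing modules, which is one way a system like this could grow considerably over several years.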

3. Evaluating an On-line Audience: Methods and Metrics

The evaluation undertaken on Rhagor focused mainly on the one-year period following its launch, but elements of this paper are further supported by ongoing data analysis. The methodology focused on a mix of standard quantitative metrics gathered from statistical packages, combined with a host of qualitative metrics such as user ratings, comments and user interaction.

Usability testing was also undertaken on the Web site in 2008, though this was specifically to provide insights into the practical usability and navigability of the site.

This evaluation study was largely retrospective. There were no specific measures of success defined at the launch of the project, apart from trying to appeal to special interest groups, experts and the elusive 'general audience'.

The title of 'general audience', however, is far from a definitive, measurable audience. In fact, it has been suggested that the use of the term 'general visitor' actively avoids the definition of the group (Hamma, 2004).

Peacock (2007) examines the preconceived assumptions and biases we hold about our on-line users and how we adopt these assumptions into the planning and managing of museum Web sites. We need to critically examine our own assumptions and beliefs if we really want to understand and meet the needs of our Web site users (Peacock, 2007).

4. Data Analysis

Many organizations now use more than one analytics package to gather statistics for measuring Web site activity. The Museum installed BetterAWStats in September 2007 to analyse log data and report key performance indicators for both of the Museum's URLs (Welsh and English). We found that BetterAWStats provided much more data than just the performance indicators for our operational plan. In contrast to previous methods, we were also able to filter out robots and crawlers.

At the same time, we also installed Google Analytics. This gave us much greater control and flexibility to filter and segment data for in-depth analysis, including advanced segmentation for viewing statistics for users from Wales who visit Rhagor – a feature not available in BetterAWStats.

4.1 Log file data

As good as log files are at faithfully recording every piece of activity on your Web site, it is estimated that as much as 50-60% of this activity is due to non-human traffic (Chan, 2008), such as search engine robots ('bots'), content scrapers, hackers, e-mail harvesters, etc.

Not only does all this non-human activity over-inflate your figures; the remaining figures can also be hugely inflated by automated activity from scripts that may be running on your site.

It is important to distinguish here between visits and page views. Although the Museum's key performance indicator is the total number of visits to the Web site, this metric does not allow us to segment activity on different areas of the Web site. For this we use the page views metric; however, it is this metric that is over-inflated by automated activity.

A quick look at a sample log file illustrates that, once all the non-human traffic has been filtered out, the top four highest page requests are all generated by scripts.

Analysis of log file data from September 2008 (Figure 2) showed that non-viewed traffic accounted for 58% of the total number of pages viewed. Of the remaining 42%, a further 46% of traffic was due to script activity. This leaves a figure considerably smaller than the one we started with (19%), but one that truly reflects user activity.

The effect of scripts on page views can be seen graphically in Figure 3, comparing September 2008 to September 2009: the chart shows a rise in overall viewed traffic, but this is due solely to additional script activity.
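The two-stage filtering described above can be sketched as follows; the user-agent pattern and script URLs here are hypothetical examples, not the Museum's actual configuration.

```python
import re

# Sketch of filtering log data down to genuine user activity:
# first strip robots/crawlers by user-agent, then strip
# script-generated requests by URL, and report the genuine share.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)
SCRIPT_PATHS = {"/rss.php", "/poll.php"}  # hypothetical script URLs

def genuine_share(requests):
    """requests: list of (path, user_agent) tuples from a log file."""
    human = [(p, ua) for p, ua in requests if not BOT_PATTERN.search(ua)]
    genuine = [p for p, ua in human if p not in SCRIPT_PATHS]
    return len(genuine) / len(requests)

log = [
    ("/rhagor/article-1", "Mozilla/5.0"),
    ("/rss.php",          "Mozilla/5.0"),    # script-generated request
    ("/rhagor/article-2", "Googlebot/2.1"),  # robot
    ("/rhagor/article-1", "Mozilla/5.0"),
]
share = genuine_share(log)  # 2 of the 4 requests are genuine page views
```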

Figure 2

Fig 2: BetterAWStats log file reporting on Web site activity for September 2009. The top four URL requests are automatically generated by scripts.

Figure 3

Fig 3: The effect of additional scripts on page views, as reported by BetterAWStats.

As long as the issues noted above are taken into consideration, Web log data can still prove to be a valuable tool. Peacock (2002) provides persuasive arguments that log file data can offer valuable insights into user behaviour – if used in the right way and for the right purpose.

4.2 Page-tagging

In comparison, page-tagging software (such as Google Analytics) does not record any of this non-human activity. A Web page has to be served up in its entirety before activity is logged, ensuring that all recorded activity is generated by actual end users of the site.

However, this page-tagging method of recording activity is not without its own issues. Chan (2008) emphasises that in-page dynamic content – which is increasingly included in today's more socially orientated Web sites – cannot be tracked without complex programming.

The more widespread this in-page dynamic content is becoming within Web sites, the greater the need will be to measure its impact independently from the page itself.

Although individual media items and downloads can be tracked via BetterAWStats, we cannot at this time record any in-page events. Google Analytics does provide functionality to record this type of activity, but it would need to be custom-built into Amgueddfa CMS – a complex and time-consuming task that would have to be weighed against ever-increasing departmental priorities.

4.3 How Rhagor compares to the rest of the site

In trying to understand the early success of Rhagor, we explored how it compared to the rest of the Web site. Before analysing the data, we estimated that Rhagor's share of page views would be at least 15-20%. In reality, the true figure was 7%. Although this figure seemed remarkably small, contextualising it within the rest of the Web site provided insightful information (Figures 4 and 5):

Figure 4

Fig 4: Pie chart demonstrating the popularity of Web site sections for the English site (Source: Google Analytics)

Figure 5

Fig 5: Pie chart demonstrating the popularity of Web site sections for the Welsh site (Source: Google Analytics)

For the period April 2008 to September 2008, Rhagor accounted for 7% of the English Web site page views, translating as the sixth most visited section.

In comparison we discovered that Rhagor accounted for 15% of the Welsh Web site page views, meaning Rhagor was actually the most visited section during the same period.

4.4 Most viewed vs. most popular

We also looked at the most popular articles on Rhagor. By identifying the highest viewed articles (Table 1) we saw variations in what our two main audience segments (Welsh and English language) are reading.

Top English content:

1. Help us find the mysterious "Ghost Slug"
2. Blaschka image gallery
3. The Largest turtle in the World
4. Podcast download page
5. Your History image gallery

Top Welsh content:

1. Faces of Wales interactive timeline
2. Podcast download page
3. Your History image gallery
4. Welsh lovespoon interactive
5. Blaschka image gallery

Table 1: Top 5 highest-viewed articles on Rhagor, September 2007 – September 2008 (Source: Google Analytics)

A simple look at the articles with the highest page views may not truly reflect the most popular content, so we were careful to use these results in conjunction with those articles receiving the highest ratings (see section 5).

4.5 Segmentation by geography and language

The bilingual nature of Amgueddfa Cymru – National Museum Wales is reflected in its two Web site URLs. Switching language changes the URL, providing a good indicator for our Welsh audience.

For Rhagor, the number of people accessing the Welsh URL accounted for 6% of total Web visits from April 2008 to September 2008 (using Google Analytics). As expected, when we segmented the audience by location to Wales, this increased to 15% – a healthy proportion, considering that 20.2% of people in Wales aged 3 and over can read Welsh (Welsh Language Board, 2003).

In a UK context, we estimated that approximately 1% of people can read Welsh (Welsh Language Board, 2003; Office for National Statistics, 2002).

In comparison, 7.7% of our UK audience for Rhagor accessed the Welsh site from April 2008 to September 2008. This could be due to the large Welsh-speaking communities in England: London had the largest Welsh-language audience for Rhagor globally during this period.

However, we considered that there might be a site-wide issue of people accidentally coming across Welsh content, not understanding it, and wanting the English content instead. This could be reflected in the higher bounce rate for the Welsh URL (56.7%, April 2008 to September 2008) than for the English URL (44.1% over the same period). It is also worth noting that referrals over this period were considerably higher from the Welsh URL to the English URL (12.5%) than vice versa (0.9%).

Our original measure of Welsh language users in Wales (15%) may therefore not be a true reflection of Rhagor’s bilingual audience.


Considering the Welsh population is only around 3 million people, the overall number of visits to Rhagor (59,362 from April 2008 to September 2008, according to Google Analytics) seems reasonable. Segmenting geographically, our Rhagor audience from Wales was 8,416 for the same period.
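As a quick worked check of the geographic segmentation above, using the visit figures quoted from Google Analytics:

```python
# Worked check of the geographic segmentation figures quoted above
# (Rhagor visits, April 2008 - September 2008, Google Analytics).
total_visits = 59_362   # all visits to Rhagor
wales_visits = 8_416    # visits segmented to Wales

wales_share = wales_visits / total_visits  # fraction of visits from Wales
wales_percent = round(wales_share * 100)   # about 14% of all Rhagor visits
```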

The global Rhagor audience already spent a comparatively long time on the Web site, but for visitors from Wales this figure nearly doubled, with an average time on site of 7:51 (English URL). Rhagor users from Wales therefore seem to be getting more value from the Web site.

As an organization our priorities with geographical audiences are mixed – even though our core audience is in Wales (since Amgueddfa Cymru – National Museum Wales is a Welsh Assembly Government-sponsored body), one of our goals is to increase awareness of Wales and the Museum’s collections internationally.

Some of the more popular pages are the Visit Us pages, which are a focus of the Marketing department’s audience development in Wales. We have tried to capture some of these users by inserting links to relevant articles in Rhagor.

Mobile Analytics

While exploring the statistics for mobile usage we found that visits to the National Museum Wales Web site from iPhones increased steadily over the period January 2009 to December 2009. Also, when comparing with data from the previous year, we found that iPhone usage in 2009 increased by 440% (English URL).

Analysing the data further made it clear that iPhone usage on our Web site focused on visiting and event information, with only two Rhagor features appearing in the top 50 viewed pages. Given the nature of mobile browsing, where people need practical information there and then, this calls into question our Web site's capacity to deliver simple, optimised information when users search for it.

The need for collections information on mobiles has been questioned where there is no augmented experience (Ellis, 2009). However, if we can give added value to an application (e.g. cultural trails) and take account of the more interpretive collections information in Rhagor, we can tap into the marketing potential of mobile application stores (iTunes, Android Market) to increase our audience.

Other segmentation considerations

Peacock (2007) segments the audience by its needs, defining users as 'Visitors', 'Searchers', 'Browsers' and 'Transactors'. Traditionally we viewed these segments as relating to certain sections of our Web sites, though this is not necessarily the case.

5. Ratings

Articles published on Rhagor include a simple rating system where users just click once to rate 1 to 5 stars on the following three elements:

  • Content
  • Images
  • Style and readability.

The ability to alter these questions gives us freedom and flexibility to test specific elements should we require.

We are also considering including a 'How likely are you to recommend us to a friend?' poll, a key measure of user satisfaction as advised in Tom Loosemore's keynote speech at UK Museums on the Web 2008: Integrate, federate, aggregate.

5.1 Meaningful insights into user experience

Due to the functionality of Amgueddfa CMS, we are able to record all ratings submitted for Rhagor articles. This basic cumulative figure alone is of minimal benefit and only tells us that articles are actually being rated (be they good or bad).

For a more meaningful and insightful perspective of what this rating metric can tell us about users' experience, we charted and analysed each rating over a year, and plotted this in relation to the frequency of ratings received (Figure 6).

Figure 6

Fig 6: Frequency of ratings received plotted against rating score for Rhagor articles.

The overall clustering pattern reveals that, generally, more articles were rated between 3 and 5 than between 1 and 2 across all three polls. This is therefore of more value than the total number of ratings received alone, as it tells us that people tend to rate our content high rather than low.

We can break down this metric further and identify those articles that receive the lowest ratings, allowing us to address possible editorial and style issues within those articles.
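The two analyses above – the frequency of each rating score, and the identification of the lowest-rated articles – can be sketched as a simple tally; the article names and scores here are invented for illustration.

```python
from collections import Counter

# Sketch of the ratings analysis behind Figure 6: tally how many
# ratings each score (1-5) received overall, then flag articles whose
# mean score is low. Article names and scores are invented.
ratings = {  # article -> list of scores submitted by users
    "Ghost Slug": [5, 4, 5, 4],
    "Blaschka gallery": [4, 5, 3],
    "Example low-rated article": [2, 1, 2],
}

all_scores = [s for scores in ratings.values() for s in scores]
frequency = Counter(all_scores)  # score -> number of ratings received

def lowest_rated(ratings, threshold=2.5):
    """Articles whose mean score falls below the threshold."""
    return [a for a, s in ratings.items() if sum(s) / len(s) < threshold]

flagged = lowest_rated(ratings)  # candidates for editorial review
```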

5.2 Most popular style of article - as rated by the users

We took this methodology a step further to identify those articles that had most frequently and consistently been rated highest in terms of style and readability. The top two highest rated articles are now distributed to content authors in the Museum as exemplar articles.


6. Comments

As with the ratings system, each article published comes with the functionality for users to comment on that article. When a user submits a comment, it is not published immediately, but flagged internally on the Amgueddfa CMS homepage, ready to go through an approval process where allocated staff can approve, mark as spam, or delete each comment.

6.1 Comments received this year: 147

This is a prime example of where numbers are meaningless. There is nothing that puts the total number of comments received in context; it only tells us that people are posting comments. Rather than doing a simple tally of total number of comments received, we looked at what people were actually saying and tried to evaluate this in a more meaningful and insightful way.

6.2 Trending analysis

As well as looking at a selection of comments, we wanted to gauge the overall tone of what people were saying, and whether we could identify any trends in positive words as opposed to negative words.

As an experiment, we gathered the individual words submitted as comments into a tag cloud (Figure 7), revealing that among the top 50 most popular words there were no negative words trending. The overall usefulness of this approach is debatable (and it ironically turns something qualitative into something quantitative!); however, the methodology could be applied more effectively elsewhere.
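The word-frequency tally behind such a tag cloud, together with a check for trending negative words, can be sketched as follows; the comments and the negative-word list are invented for illustration.

```python
from collections import Counter
import re

# Sketch of the trending analysis described above: tally word
# frequencies across all comments, then check whether any word from a
# (hypothetical) negative list appears among the most popular words.
NEGATIVE_WORDS = {"broken", "missing", "disappointed", "slow"}

comments = [  # invented sample comments
    "Lovely article about Wales",
    "My ysgol (school) in Wales loved this",
    "Wonderful images of Wales",
]

words = re.findall(r"[a-z]+", " ".join(comments).lower())
top = [w for w, _ in Counter(words).most_common(50)]
negative_trending = [w for w in top if w in NEGATIVE_WORDS]
```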

Figure 7

Fig 7: Tag cloud demonstrating the most popular words that have been submitted as comments. (Welsh translations: Cymru = Wales; yn = in; ysgol = school)

6.3 A lesson from comments

On looking through a sample of the comments received across the whole Web site, some seemed to be expressing frustration at the omission of images in our on-line art catalogue – hence evidence of negative experiences. As a direct response, we have altered our art database to display a disclaimer explaining why not all images are available (Figures 8 and 9).

Figure 8

Fig 8: Example page from the National Museum Wales Art On-line Web pages illustrating a blunt non-informative message

Figure 9

Fig 9: The updated, more user-friendly version.

One measure of success would therefore be to lower the proportion of negative / non-positive comments. In hindsight, these negative comments gave us an opportunity to improve communication with the user.

6.4 Comment 'value'

In trying to make this methodology actionable, it may be useful to implement a 'value' system during the comment approval process, where approved comments could be tagged (privately) as:

  • Approved: Positive - evidence of a positive user experience, includes questions and enquiries;
  • Approved: General - passive comments that don’t tell us anything about a user’s experience;
  • Approved: Negative - evidence of frustrations or issues.

Adopting this methodology would depend on weighing the overall benefits of such a system, but as well as gauging positive comments, it could also reveal content issues that could be promptly rectified.

Something of definite value would be to tag and track those comments that require a Museum reply. An 'Approved: Action required' tag would aid the management and workflow of comments awaiting responses.
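The proposed tagging scheme could be sketched as below. The tag names come from the text above; the surrounding workflow (an inbox and a reply queue) is a hypothetical design, not an existing Amgueddfa CMS feature.

```python
from enum import Enum

# Sketch of the proposed comment 'value' tagging. Tag names are from
# the text; the inbox/queue workflow around them is hypothetical.
class Tag(Enum):
    POSITIVE = "Approved: Positive"
    GENERAL = "Approved: General"
    NEGATIVE = "Approved: Negative"
    ACTION_REQUIRED = "Approved: Action required"

inbox = [  # (comment text, tag assigned during approval)
    ("Wonderful images, diolch!", Tag.POSITIVE),
    ("Why is there no image for this painting?", Tag.ACTION_REQUIRED),
    ("I visited last week.", Tag.GENERAL),
]

def awaiting_reply(comments):
    """Comments tagged as needing a Museum response."""
    return [text for text, tag in comments if tag is Tag.ACTION_REQUIRED]

queue = awaiting_reply(inbox)
```

The private tags stay internal to the approval workflow, so the public-facing comment display is unchanged.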

7. Evaluating Our Social Audiences

Soon after the launch of Rhagor, we decided to use social media to distribute content. This was not driven by a desire to simply be seen as ‘doing’ social media, but from the acute desire to distribute our collections and knowledge beyond our own Web site.

We were aware that popular social media sites held much larger captive audiences and special interest groups than could ever have stumbled upon our own Web site, whatever promotional lengths we went to. We also realised that we had something of value and interest to offer these social forums.

After earmarking the specific type of content we wanted to disseminate – namely, the images displayed in our on-line image galleries – we decided that the best forum to use would be Flickr, a Web site centred around images and photography. Uploading content was fairly swift, as all the images and bilingual text had already been published on our own Web site. Content has continued to be added ever since.

Fundamentally, we realised that simply uploading content was not going to get us very far, so we invested (and still do invest) time in getting actively involved, especially engaging in groups focused on our key curatorial subject areas.

7.1 Evaluating social success

It didn’t take long to realise that the audience viewing our content on Flickr was much larger than that of our own Web site (Figure 10).

Figure 10

Fig 10: Cumulative total of views of our Flickr content, January 2008 – December 2009

However, despite the much higher number of people interacting with our content on Flickr, the referral rate to the Museum's Web site is low (1.34%), even though we have increased the number of links back to our Web site.

Making data discoverable and publishing relevant data where the audience spends its time are key factors that align closely with Seb Chan's '5 rules of Museum content'.

Flickr provides a number of metrics that allow monitoring of activity on your content. As mentioned previously, it is not the total number of views that is important, but the qualitative metrics that give insight into your content's appeal. In the case of Flickr, it's tracking and analysing the number of contacts made (and who they are), how many images people tag as favourites, the insightful (and sometimes authoritative) conversations posted about items, the number of images requested to be posted into special interest groups, etc.

Significantly, all the methodology applied during evaluating the metrics from our own site can also be applied to sites such as Flickr. It is possible to work out the most/least favourite (or popular) content, produce tag clouds from comments received, and so forth.

8. Summary and Conclusion

We have evidenced in this paper the requirement for Amgueddfa Cymru – National Museum Wales to look beyond key performance indicators if we are to gain informative insights into our Web site users. The launch of Rhagor provided an opportunity to experiment with a range of metrics to research the methodologies required to evidence success.

8.1 Actionable Research:

  • Our methodology prompted us to identify Rhagor's most popular content as a combination of both highest viewed and highest rated.
  • Analysis of the ratings within Rhagor revealed that, overall, articles are rated high rather than low. Exemplar articles identified from the highest-rated now provide practical guidance in the creation of new content. Conversely, identifying the lowest-rated articles highlights editorial issues to be addressed.
  • By filtering out automated activity from the page views generated by log file data, we are left with figures that are far more reflective of true user activity.
  • Contextualising figures can provide much more meaningful insight into your data; for example, stating that Rhagor is the most visited section of the Welsh URL is more meaningful than stating that it accounts for 15% of its page views.
  • During this evaluation it has become apparent that popular content for our two main languages differs. This has led us to question our simple duplication of Welsh and English content for Rhagor's homepage.
  • Through analysis of various qualitative metrics, we were able to quickly identify negative visitor experiences and remedy issues where possible.
  • Finally, we stated that the same evaluation methodologies can be applied to content you may have on other sites.

8.2 Conclusion

It became apparent during this evaluation study that as more analysis was done, more insightful information could be obtained. This prompts the question as to how much effort should be invested, not just in the analysis of data, but also in the backend programming that goes into producing that data. The fact that we are unable to track page events might not be too much of a problem at the moment, but in the future, this will become a higher priority as we evolve to publish more dynamic content applications.

Given the changing nature of the Internet, the sustainability of reporting traditional metrics as key performance indicators for our Web sites needs examination. In the future, what will be the most effective method of measuring success as our Web sites evolve into numerous content widgets and data feeds, and when Museums won't have Web sites, only Web presences? These are metrics that we need to identify if we are to continue to evidence our Internet success into the future.


References

Welsh Language Board (2003). "Census 2001: Main Statistics about Welsh". Published 23 September 2003. Consulted 27 January 2010.

Office for National Statistics (2002). "Census 2001: United Kingdom". Consulted 27 January 2010.

Chan, S. (2008). "Towards New Metrics Of Success For On-line Museum Projects". In J. Trant and D. Bearman (eds). Museums and the Web 2008: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2008. Consulted January 12, 2010.

Ellis, M. (2009). "What's so great about mobile?" In Electronic Museum. Published December 18, 2009. Consulted January 29, 2010.

Hamma, K. (2004). “The role of museums in on-line teaching, learning and research”. First Monday, Volume 9, Number 5. Published May 3 2004. Consulted 27 January 2010.

Haynes, J., and D. Zambonini (2007). "Why Are They Doing That!? How Users Interact With Museum Web sites". In J. Trant and D. Bearman (eds). Museums and the Web 2007: Proceedings. Toronto: Archives & Museum Informatics. Published March 1, 2007. Consulted January 12, 2010.

Peacock, D. (2002). "Statistics, Structures & Satisfied Customers: Using Web log data to improve site performance". In D. Bearman and J. Trant (eds). Museums and the Web 2002: Proceedings. Pittsburgh: Archives & Museum Informatics, 2002. Consulted 12 January 2010.

Peacock, D. and J. Brownbill (2007). "Audiences, Visitors, Users: Reconceptualising Users Of Museum On-line Content and Services". In J. Trant and D. Bearman (eds). Museums and the Web 2007: Proceedings. Toronto: Archives & Museum Informatics, published March 1 2007. Consulted January 12 2010.

Cite as:

Davies, G. and D. James, "Evaluating the On-line Audience of a New Collections Web site". In J. Trant and D. Bearman (eds). Museums and the Web 2010: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2010.