April 11-14, 2007
San Francisco, California

'Instant Multimedia': A New Challenge For Cultural Heritage

Nicoletta Di Blas, HOC-DEI, Politecnico di Milano, Italy; Davide Bolchini, Università della Svizzera Italiana, Lugano, Switzerland; and Paolo Paolini, HOC-DEI & Università della Svizzera Italiana


Cultural Heritage is traditionally associated with long-term values and stable (i.e. not rapidly changing) content; consequently, we are used to thinking that 'editorial products' associated with it have a long life and deserve a long-term, well-planned effort. Most museum catalogues, publications, Web sites and multimedia applications follow this pattern: they need sizable development time, significant effort and proportionately significant financial resources.

Recently, a new need has been emerging, for a number of reasons: budgets are shrinking, new opportunities for communication are springing up, and technologies and devices are changing rapidly. A new approach can therefore be considered: 'instant multimedia' (analogous to 'instant books'), meaning that a good-quality multimedia application must be produced in a short span of time and on a small budget, possibly targeting several technological channels (from Web, to mobile devices, to iPod, to information points, to iPhone ...).

This paper investigates the concept of instant multimedia and identifies its basic constituents. Furthermore, the paper illustrates a specific paradigm, OneThousandandOneStory, consisting of a design pattern, a workflow, a production method and finally an engine to generate applications. The paradigm produces high-quality applications that can be delivered as CD-Roms, Information Points, Web sites, Podcasts and, very soon, phone applications. It has been used to generate 11 instant multimedia applications for a variety of partners, including museums, ministries, research teams, and academic institutions. It has been used by college students and, from January 2007, more than three thousand high-school students in Italy will use it to generate their own multimedia narratives about culture and history in their territory.

Keywords: low-cost multimedia, fast development, user experience, storytelling, design

1. Introduction And Motivation

Sometimes interesting ideas stem from the least expected situations. In October 2005, we at HOC-LAB at Politecnico di Milano were asked to develop, in a very short time (approximately one month), an interactive application on CD-ROM for an exhibition about Bramantino (a Renaissance painter quite active in Lombardia, the region of Milan) that was about to be opened at the Pinacoteca Ambrosiana (in cooperation with the National Gallery of London).

Given the very short lead-time, we had to speed up the production process in two respects: cutting down design and programming, and cutting down the time and cost of preparing the content. Let us focus on the latter, i.e. content production, first. Since the exhibition had already printed its own catalogue, and there was neither time nor money to reproduce it in an interactive way, we decided to interview the exhibition's curator for a couple of hours and then use the interview as the basis for the application's content. Our original idea was to use the interview format: in a number of projects, and especially for Learning@Europe (see Di Blas & Poggi, 2006), we had already relied heavily on interviews as an effective way to gather high-quality content quickly, and also as a format for delivering textual content that could otherwise be too cumbersome. After analyzing what we had, we realized that in this case it would not be possible to make a 'multimedia interview' (our experience was with text-based interviews) without video recording, and that was too expensive for this production. Moreover, since a large number of pictures had to be used (all related to the pictures on display and referenced by the curator), we wanted to use them in a proper way, but we did not want to end up with a pure slide-show (not even one accompanied by an audio comment).

Turning our attention now to the issue of design, basic choices were again taken almost by accident: the curator turned out to be a magical storyteller who kept us spell-bound for 2 hours. Under the influence of this interview, we tried to retain the 'flavor' of what we had experienced, offering our users the possibility of 'listening to the tales' told by the curator. However, we had to accommodate the need for variable user experiences, ranging from visitors who would not spend more than a few minutes playing with technology, to visitors who could spend 20 minutes or more. An additional requirement was to support, at the same time, Web technology (for on-line users) and CD-Roms (for visitors attending the exhibition), with podcasting as a future option.

The solution was to break the narrative into 8 topics: for each topic we would have a short version and a longer version, the latter consisting of a set of sub-topics (4 for each topic), as shown in Fig.1.


Fig 1: The Bramantino application's editorial plan (pdf)

In terms of the 'user experience' we had the following options in mind: the user could focus on the main topics only (short version); the user could focus on one topic and its details; the user could get everything (long version); the user could just pick up the item of his/her interest (either a topic or sub-topic).
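The editorial plan and the four user experiences listed above can be pictured as a small tree with four ways of walking it. The following is only a minimal sketch under stated assumptions: the class `Topic`, the helper `build_plan`, the four function names and the placeholder titles are hypothetical and do not come from the Bramantino application itself; only the shape (8 topics with 4 subtopics each) is taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic of the editorial plan (hypothetical model, not the authors' code)."""
    title: str                                      # the short version
    subtopics: list = field(default_factory=list)   # details used by the long version


def build_plan(n_topics=8, n_subtopics=4):
    """Build a placeholder plan shaped like the one in the text: 8 topics x 4 subtopics."""
    return [
        Topic(f"Topic {t}", [f"Subtopic {t}.{s}" for s in range(1, n_subtopics + 1)])
        for t in range(1, n_topics + 1)
    ]


def short_version(plan):
    """Experience 1: the main topics only."""
    return [topic.title for topic in plan]


def topic_with_details(plan, index):
    """Experience 2: one topic followed by its subtopics."""
    topic = plan[index]
    return [topic.title, *topic.subtopics]


def long_version(plan):
    """Experience 3: everything, each topic followed by its subtopics."""
    playlist = []
    for topic in plan:
        playlist.append(topic.title)
        playlist.extend(topic.subtopics)
    return playlist


def single_item(plan, title):
    """Experience 4: just the one topic or subtopic the user picked."""
    return [item for item in long_version(plan) if item == title]


plan = build_plan()
print(len(short_version(plan)))   # 8
print(len(long_version(plan)))    # 40
print(topic_with_details(plan, 0))
```

With this shape the long version holds 8 + 8 x 4 = 40 items, while the short version holds only the 8 topics.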

The final decision was about the medium for each item; i.e. for each 'topic' or 'subtopic'. Again we took a pragmatic decision: an 'audio' reading of a short text coupled with visual communication; i.e. a set of flash animations highlighting details. This choice ensured quality (i.e. impact for the user) at the same time as low cost, since the audio was obtained by 'reading' a text rather than directly editing (in audio and/or in video) the interview.

The production experiment was extremely successful. The interview's transcript was done, adjustments and additions were made in record time (by asking the curator very specific questions, in order to integrate missing pieces of information), and in 3 days all the texts were ready (for audio recording) in final version. When the exhibition opened, on December 5th 2005, the CD-Rom was shown in the installation (putting the narrative in a loop) and also distributed, with great appreciation from the public.

Fig 2: Bramantino’s application: the home page


From the Bramantino exhibition we learned some lessons, the most important of which was the following: cutting down costs and time, without significantly affecting the quality of the communication, is a strategic direction, rather than an accident.

There are several reasons to consider this a strategy, rather than an accident, for the whole field of Cultural Heritage:

On the basis of the above considerations, we have developed a strategy different from our past one (see for example Di Blas, Paolini, Poggi 2004), focusing on low cost, fast production, and the possibility of serving multiple purposes (e.g. interactive information point, self-running information point, portable guide, podcasting...).

For this class of applications we coined the term instant multimedia, evoking the concept of 'instant books', i.e. applications of good quality that can be put on the market in a short time. Besides the above considerations, we had an additional requirement: to spend less time on info-architecture and design in order to focus on 'deep content'; i.e. the 'cultural message', rather than using resources for high-end content (e.g. high-quality video, 3D graphics, etc.). In this way, we felt, the impact on the user could be deeper.

2. The Instant Multimedia Package

In order to make instant multimedia 'real', an instant multimedia package is needed, consisting of a few elements:

Different styles of instant multimedia can be obtained by using different packages, or at least different design patterns. Instant multimedia can be considered also an outpost version of the (so called) CMS's, Content Management Systems: systems where content is inserted (through data entry of some complexity), and publishing is almost automatic. Some CMS's prefer flexibility, 'info-architecture power', interface personalization, etc. The consequences are more flexible design, more time to configure the system and to put it in place, and more complex data entry. Instant multimedia emphasizes speed and low cost, therefore less flexible design, very simple data entry, little control over the interface, etc.

There are other examples of systems that we consider half-way between traditional CMS's and Instant Multimedia. The 'Medina' effort (Garzotto, 2006) and 'Pachyderm' (LaMar et al., 2005; Samis, 2005) are examples of this trend: a standardization of design concepts and layout that allows fast production of quality applications.

In the next two sections we will describe a specific Instant Multimedia package, directly derived from the Bramantino exhibition, that we called OneThousandandOneStory.

In Section 3 we describe the design pattern, the workflow and the content production method; in Section 4 we describe the technology; i.e. the engine that supports data entry and generates the final applications.

3. OneThousandandOneStory

OneThousandandOneStory is an instant multimedia package developed by the HOC laboratory of Politecnico di Milano. (It will also be demonstrated in a special session of Museums and the Web 2007.) In different versions it has been used for building 11 applications (as of January 07), for different clients and purposes: promotion of HOC activities, in museums (Pinacoteca Ambrosiana, Museo Archeologico di Milano), in the ministry of Tourism of Syria, in ministries of the Mediterranean area, in the Dean's office of Politecnico di Milano. As of January 07, OneThousandandOneStory is being considered by other cultural institutions (e.g. the Hermann Hesse Museum near Lugano, Switzerland) and by other national initiatives (in cooperation with the Italian Ministry for Cultural Heritage); it is also the basis for a national competition involving nearly 3,000 high school students from 10 different Regions. It has also been used by students at USI (University of Italian Switzerland) as a basis for content authoring exercises. Figures 3, 4 and 5 display some screenshots from the Archeological Museum of Milan application, with some examples of the related narratives.

Fig 3: The Roman section of the Archeological Museum of Milan: the home page


Fig 4: the tower


"...In the museum gardens a tower is conserved, tied to a line of walls that back in time surrounded the city. It was emperor Maximian that, at the end of the III century A.D., decided to enlarge the city walls, in which an extraordinary polygonal tower, with 24 faces, was included. Frescoes can still be seen in its inner walls, dating back to the XV century, when the tower was transformed into a chapel inside the monastery..."

Fig 5: the cup


"...The cage cup housed at the archaeological museum is considered one of the most rare archaeological findings from the late Roman Empire. What makes the cup so unique, apart from its aesthetic value, is the manufacturing technique, that remains a secret even for nowadays glass manufacturers. The technique refers to glass openwork and was mastered by very few experts in the Antiquity"

The First 3 Components Of This Instant Multimedia Package

3.1 The Design Pattern

The design pattern for OneThousandandOneStory is a generalization of what we used for the Bramantino exhibition described in Section 1. In the generalization process we had a few requirements in mind:

The design pattern we came up with is very simple indeed, but it is apparently able to satisfy the above requirements:

3.2 The Workflow

The workflow is divided into 10 major steps.

W1: Gathering Of The "Raw" Material And A General Idea (1-2 Hours)

The preliminary actions are quite simple: focusing on the subject, talking loosely with the expert, collecting the pictures already available, consulting the text already available (if any).

At the end of this step the generic idea of the narrative is made clear and agreed between the project staff and the expert.

W2: Editorial Plan (1-2 Hours)

The project staff, together with the expert, transforms the general idea into a precise plan; i.e. identifies each topic and subtopic. The editorial plan may be revised when the content is actually produced (see below), but it is better to start the content production with a precise idea of the envisioned outcome.

W3: Visual Communication (3-8 Hours)

In the 'professional' version of the format we can accommodate different kinds of visual communication, including flash animations, videos, slideshows of images (including power point presentations) etc. In the simplest version (like, for example, the one we provide for schools) we only allow still images: they are played in a sequence with the audio.

In our format there is no strict synchronization between the images and the audio: images are meant to be evocative (rather than identifying or descriptive) of the topic. This lack of strict synchronization allows great freedom for the author, and also cuts down time and cost. Strict synchronization can be simulated by carefully placing images in the proper order, in order to get them along with the audio at the right moment (more or less, of course).

W4: Writing The Narratives (8-16 Hours)

This is a crucial step. A text must be produced, for each topic or subtopic: the time depends on the technique of production (see the corresponding paragraph). Each piece (topic or subtopic) needs something between 90 and 120 words; we can estimate the time for production of the first version of text at between 15 minutes and 25 minutes each: a 6-topic narrative (with an average of 5 subtopics each) has 36 narrative elements (audio pieces), and can therefore take something between 540 minutes and 900 minutes, plus some additional editorial time.
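The estimate above is simple arithmetic and can be checked with a few lines. This is a sketch only: the function names are hypothetical, and the figures (15 to 25 minutes and 90 to 120 words per piece, 6 topics with 5 subtopics each) are the ones quoted in the text.

```python
def narrative_elements(topics, avg_subtopics):
    """Each topic contributes one piece for itself plus one per subtopic."""
    return topics * (1 + avg_subtopics)


def writing_time_minutes(elements, min_per_piece=15, max_per_piece=25):
    """Range of first-draft writing time, in minutes, for all pieces."""
    return elements * min_per_piece, elements * max_per_piece


def word_count(elements, min_words=90, max_words=120):
    """Range of total words across all pieces."""
    return elements * min_words, elements * max_words


elements = narrative_elements(topics=6, avg_subtopics=5)
low, high = writing_time_minutes(elements)

print(elements)               # 36
print(low, high)              # 540 900
print(low / 60, high / 60)    # 9.0 15.0
print(word_count(elements))   # (3240, 4320)
```

The 540 to 900 minutes correspond to 9 to 15 hours, which sits inside the 8-16 hours allotted to this step once additional editorial time is considered.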

W5: From Text To Audio (4-8 Hours)

Our experience shows that it is better to make a revision with recorded audio, rather than with just the textual narratives. Recording 36 fragments of 1 minute each may take varying amounts of time, depending on whether a professional studio (as we suggest) or an in-house production is used.

W6: First Version: Putting The Pieces Together (2 Hours)

Using the engine, in order to create the first version, the following pieces must be inserted:

Partial or global previews allow an easy check of the partial work.

W7: Quality Check (2-4 Hours)

The overall impact of the result must be checked when all the content is there. Checking may find trivial mistakes (e.g. wrong picture, fault in the text, etc.) or discover flaws in cultural quality and completeness. The thoroughness of this test depends, obviously, on the level of perfection being sought, and also on the concrete possibility for improvements (depending on available resources).

W8: Revising Text And Audio (4-8 Hours)

According to the observations of W7, the texts can be improved, and consequently a new session of audio recording and editing may be necessary.

W9: Revising Visual Communication (1-4 Hours)

Critical evaluation of the pictures may involve shifting them around and/or, most important of all, selecting other pictures to reinforce the visual communication.

W10: Final Version (2 Hours)

The revised audio files, texts and pictures (with captions) can be inserted and, after a quick check, the generation can be accomplished: a Web version, a CD-ROM version and a set of files for pod-casting can be generated in one step.

From the above workflow we can see that more or less 28 hours is the minimum amount of time needed, while 56 hours is the maximum. These rough estimates, of course, can't be taken as scientific determination, but rather are the results of our experience over several developments.
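The two totals follow directly from the hour ranges given in the headings of steps W1 to W10. A minimal check, with the dictionary name `WORKFLOW_HOURS` chosen here purely for illustration:

```python
# (minimum, maximum) hours per step, copied from the headings W1..W10.
WORKFLOW_HOURS = {
    "W1 raw material": (1, 2),
    "W2 editorial plan": (1, 2),
    "W3 visual communication": (3, 8),
    "W4 writing the narratives": (8, 16),
    "W5 from text to audio": (4, 8),
    "W6 first version": (2, 2),
    "W7 quality check": (2, 4),
    "W8 revising text and audio": (4, 8),
    "W9 revising visual communication": (1, 4),
    "W10 final version": (2, 2),
}

minimum = sum(low for low, _ in WORKFLOW_HOURS.values())
maximum = sum(high for _, high in WORKFLOW_HOURS.values())

print(minimum, maximum)   # 28 56
```

Writing the narratives (W4) alone accounts for 8 of the 28 minimum hours and 16 of the 56 maximum hours, more than any other single step.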

3.3 Content Production

One key ingredient for an instant multimedia package, in order to be successful, is a fast and reliable way to produce content. 'Fast' means little time and little budget; 'reliable' means quality, at the level needed for the type of production. As far as content production goes, there are three key ingredients of OneThousandandOneStory: transcribed interviews (with transcription and subsequent reading by professional speakers), evocative images, and professional speakers.


We use interviews with highly competent people to get the texts that later will generate audio. The reasons in favor of transcribed interviews are several:

Evocative Images

The audio narrative is the driving force to keep the user absorbed; images add an emotional aspect, rather than providing strictly rational information. The advantages are several:

Professional Narrators

We prefer professionals to the use of 'author speakers', such as the curator, for example. Although speeches by authors can be very emotional, they are more difficult to organize, more expensive (due to editing costs), and often, for 25-30 minutes of audio, of inferior quality.

4. OneThousandandOneStory: Technology

An application engine is the cornerstone of the instant multimedia approach. Through the engine a number of advantages can be obtained:

Fig 6: The engine OneThousandandOneStory: editorial plan (left) and content (right)

The engine (Milleeunastoria in Italian and OneThousandandOneStory in English) has been developed in a number of different versions by HOC-lab in the year 2006. It has been used, so far, by the staff of the laboratory, by high school students and by college students. In January 2007 thousands of students started using it for the competition Policultura; the engine has been made available to schools around Italy.

Fig 7: The architecture for the QST engine


Figure 7 summarizes the main functional architecture for the QST engine. The most important components of the engine are Data Entry, Preview and Generation.

In order to avoid technicalities for the development team, in the current version the engine is provided as a Web service. The production team remotely inserts data and looks at previews; when the work is complete, the generated application (that can be quite sizable) is available on the server. The generation of the applications (for the different channels) takes virtually no time.
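The idea of entering content once, previewing it, and then generating every channel in one step can be illustrated with a toy model. This is not the engine's actual code or interface: the class `Piece`, the functions `preview` and `generate`, and all file names (such as `index.html`, `autorun.inf` and `feed.xml`) are assumptions made only for this sketch, which builds file listings in memory rather than real applications.

```python
from dataclasses import dataclass, field


@dataclass
class Piece:
    """A topic or subtopic as captured by data entry (hypothetical fields)."""
    slug: str
    title: str
    audio_file: str
    images: list = field(default_factory=list)


def preview(piece):
    """Partial preview: a one-line summary of a single piece."""
    return f"{piece.title} [{piece.audio_file}, {len(piece.images)} image(s)]"


def generate(pieces):
    """Produce the file listing for every channel from the same content."""
    pages = [f"{p.slug}.html" for p in pieces]
    audio = [p.audio_file for p in pieces]
    images = [img for p in pieces for img in p.images]
    return {
        "web": ["index.html", *pages, *audio, *images],
        "cdrom": ["autorun.inf", "index.html", *pages, *audio, *images],
        "podcast": ["feed.xml", *audio],   # audio only, no pages or images
    }


pieces = [
    Piece("topic-1", "Topic 1", "topic-1.mp3", ["img-a.jpg", "img-b.jpg"]),
    Piece("topic-1-1", "Subtopic 1.1", "topic-1-1.mp3", ["img-c.jpg"]),
]

print(preview(pieces[0]))
for channel, files in generate(pieces).items():
    print(channel, len(files))
```

The point of the sketch is the design choice described in the text: because all channels are derived from a single body of entered content, adding or revising a piece requires no per-channel work.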

5. Conclusions And Future Work

Instant multimedia could become a pervasive phenomenon for Cultural Heritage: numerous small, good quality applications made available for different channels and supporting different user experiences. Users today are more likely to pay attention to small applications, possibly available on their mobile devices, rather than to 'consume' large, expensive applications.

In order to make this vision real, the Cultural Heritage community needs reliable Instant Multimedia Packages, possibly evolved from current Content Management Systems or from efforts like Pachyderm (LaMar et al., 2005; Samis, 2005), MEDINA (Garzotto, 2006) and similar applications. We hope that in the near future cultural institutions willing to develop a new lean and low-budget application will be able to choose among a dozen different options, in terms of format, technology, user experience, etc.

HOC-LAB (and its partner, TEC-LAB) are committed to the following developments:

Most important of all, we are working together with cultural institutions to make instant multimedia more widely used, and to provide more satisfactory experiences for users.


Di Blas, N., P. Paolini, C. Poggi (2005). "A Virtual Museum where Students can Learn". In L. Tan Wee Hin & R. Subramaniam (eds). E-learning and Virtual Science Centers. Idea Group Inc., U.S.A. 308-326.

Di Blas, N., C. Poggi (2006). "3D for Cultural Heritage and Education: Evaluating the Impact". In D. Bearman & J. Trant (Eds.) Museums and the Web, Selected Papers from an International Conference. Toronto, Canada: Archives & Museum Informatics, 141-150.

Folmer, E., M. van Welie, and J. Bosch (2006). "Bridging Patterns - an approach to bridge gaps between SE and HCI". Journal of Information and Software Technology. Volume 48, Issue 2, 69-89, February 2006.

Garzotto, F. (2006). "MEDINA three years later: Towards 'Enterprise Frameworks' for Cultural Tourism Web Applications". In D. Bearman & J. Trant (Eds.) Museums and the Web, Selected Papers from an International Conference. Toronto: Archives & Museum Informatics, 173-184.

LaMar, Michelle, et al. Architecting the Elephant: Software Architecture and User Interface Design for Pachyderm 2.0, in J. Trant and D. Bearman (eds.). Museums and the Web 2005: Proceedings, Toronto: Archives & Museum Informatics, published March 31, 2005 at

Paolini, P., F. Garzotto, D. Bolchini and S. Valenti (1999). 'Modelling by Pattern' of Web Applications. In Peter P.S. Chen (Eds). Proc. of Advances in Conceptual Modeling (ER'99). Workshops on Evolution and Change in Data Management, Reverse Engineering in Information Systems, and the World Wide Web and Conceptual Modeling, Paris, France, November 15-18, 1999.

Samis, P. (2005). "Just Add Elephants: Breeding and Browsing Rich Media Educational Resources at the San Francisco Museum of Modern Art". Studies in Communication Sciences. Vol. 5, Number 1, Summer 2005.

van Duyne, D.K., J.A. Landay, and J.I. Hong (2002). The Design of Sites: Patterns, Principles, and Processes for Crafting a Customer-Centered Web Experience. (Paperback) Addison Wesley, 2002.

Welie, M. and H. Traetteberg (2000). "Interaction patterns in user interfaces". In Proceedings of the Seventh Pattern Languages of Programs Conference. (Monticello, IL, Aug. 13-16, 2000).

Cite as:

di Blas, N. et al., 'Instant Multimedia': A New Challenge For Cultural Heritage, in J. Trant and D. Bearman (eds.). Museums and the Web 2007: Proceedings, Toronto: Archives & Museum Informatics, published March 1, 2007 Consulted
