“The Hand Dance” installation is an immersive virtual reality experience in which visitors interact with a stereoscopic virtual environment. It integrates diverse technologies into a flexible system that allows users to record human movement, analyze discrete gestures to define the elements that make each gesture uniquely identifiable, and test the recognition of a recorded gesture set in a networked real-time interface environment. The system can therefore be used as a body-based learning tool, as a digital archive of manual skill, or for the interactive enhancement of live dance performance.
The system is designed to be easily configurable for a variety of different tasks and applications, and integrates technologies including a VICON motion tracking system, gesture recognition algorithms, and XVR, an online virtual environment development platform (http://www.vrmedia.it). The gesture recognition process proceeds through several inter-related steps: kinematic model construction and gesture recording using the VICON system, gesture analysis, data processing, gesture recognition via neural networks, and gesture model testing. The system has been tested with a wide variety of gestures focusing on arm and hand movement, with 98% accuracy. In addition, a prototype installation/performance has been developed for eventual use in cultural institutions, learning environments, and performance spaces.
2. The Role of Dance in Society
Dance generally refers to human movement used as a form of expression, social interaction, or presented in a spiritual or performance setting. The term can also be used metaphorically to describe methods of non-verbal communication (body language) between humans, the motion of inanimate objects (the leaves danced in the wind), and certain musical forms or genres. Indeed, a recent trend in philosophical and cognitive models of the human mind understands all linguistic and iconic knowledge in terms of “embodied” mind and metaphor (Varela, Thompson and Rosch, 1991, Lakoff and Johnson, 1999).
Definitions of what constitutes dance are dependent on social, cultural, aesthetic, artistic and moral constraints, and range from functional movement (such as folk dance) to virtuoso techniques such as ballet. Dance movements may be without significance in themselves, as in ballet or European folk dance, or have a gestural vocabulary/symbolic system, as in many Asian dances. Dance can embody or express ideas and emotions, or tell a story. In short, dance is an enactive language (based on the act of doing) with direct ties to literal or abstract forms of symbolic and iconic expression, often deeply entrenched in cultural understandings.
The meaning of a particular dance, therefore, is tied to its roots in the popular folklore of a given region. Folklore is “the body of expressive culture, including tales, music, dance, legends, oral history, proverbs, jokes, popular beliefs, customs, and so forth within a particular population comprising the traditions (including oral traditions) of that culture, subculture, or group. It is also the set of practices through which those expressive genres are shared.” (http://en.wikipedia.org/wiki/Folklore) The academic and usually ethnographic study of folklore is sometimes called folkloristics, a field with the implicit interest in preserving human movement for the purpose of cultural study (Clemente and Mugnaini, 2001).
For this reason dance represents, in its own small way through music and movement, a true look towards other cultures, ancient cultures and works of rare beauty: from architecture to literature, philosophy to music. It is a story that began in ancient Mesopotamia, with honorary dances to the goddess Ishtar (goddess of fertility), passing through the ancient dances of the Aboriginal Australians, to the old popular Italian folk dances, to today. In one sense, the history of human movement can be considered the heritage which dance provides. In this light, the preservation of body-based knowledge, both for anthropological study and artistic expression, is of great interest in the field of cultural heritage informatics.
3. The Hand Dance: Gesture and Learning by Doing
“The Hand Dance” is an interactive “learning machine” performance/installation dealing with body motion and its decoding and visualization. In dance, gesture is a natural part of body movement, particularly exemplified by the use of the hands. The “figures” of motion performed during ordinary tasks express diverse meanings as understood through the context of culture and tradition. The principal objective of this live interactive performance/installation is, through the use of innovative technologies, to understand and interpret these movements and to visualize the resulting figures, symbologies, archetypes, etc. At the core of this experience is an interest in gesture and the expressive potential of hands, and the objective of creating an interactive experience where the hands of participants are the main protagonists. The installation uses technology to capture the atmosphere and symbolism of theatrical action, and to make it universally accessible through a cultural exchange based on the visual language of popular dance.
In Indian dance, for example, the performer executes basic movements to “draw” meaning with her hands, including geometric figures, elements of nature, concepts and emotions. A symbol could have many meanings; for example, a circle can represent the moon, the sun, a wheel, a pond, a snake biting its tail, time, the creative process, etc. Indeed, these symbols inhabit the body and evoke other deep truths. Perception - the audience’s ability to explore a determined symbol - is at the core of interpreting this type of evocative dance. Without a familiar acquaintance with its content, popular dance remains a superficial art: a dance composed of beautiful movements and technical resolution, but deprived of the magic of ancient or sacred dance. The language of the hands can be considered a “universal language” that narrates through the body the various symbols determined by the dancer.
“The Hand Dance” is an explicative choreography of hand movements; in other words, a simulation of the interpretation of symbols that bodily movements can communicate. The interpretation of such symbols is carried out naturally within the culture to which hand dance belongs; for other cultures, innovative technologies can give real-time access to this kind of knowledge by means of networked multimodal interfaces. Indeed, “The Hand Dance” illustrates a form of knowledge tied to action and capable of conveying concepts (generally expressed through symbolic communication) by means of body intelligence. The enactive nature of “The Hand Dance” interface therefore offers the possibility of creating a resonance between various cultures: sharing knowledge of culture through dance, promoting enjoyment and well-being with other countries, and improving the health and wellness of the body through physical activity. Physical movement that is not confined to geographic limits, but dispersed throughout the world via eLearning technologies, offers a universal language, especially for expressing the identity of a culture (Castells, 2002, Bausinger, 2006).
4. Didactics of Skills and Manual Learning
The instruction and learning of body-based skills play an essential role in the preservation of culture. Because knowledge of manual skills such as dance is handed down through principally oral traditions, it tends to remain concentrated both geographically and in the hands of a declining number of traditional masters. The preservation of this knowledge requires first-hand learning opportunities for students, access to which is often difficult.
On the other hand, the increasing interest in these disciplines is evident in the number of folk dance courses and festivals, and interest from within the scientific community. A growing need for new teaching and learning mechanisms - especially for physical skills and activities - can be observed in recent scientific literature. At a high level, a summary of the didactic process includes:
- A social obligation to: Transmit cultural knowledge (today understood not only as a pure and simple transmission, but to include appropriation, reworking, and production) in institutional and non-institutional forms; develop/educate the individual through learning experiences that become internalized and fuel further knowledge/abilities/expertise through the use of critical thought; develop specific sectors of knowledge/abilities/expertise in specific subjects when social institutions require it.
- Differing didactic processes depending on whether the individual has general, specific, or special objectives.
- Deliberate and projected acts of transmission/communication, mediation and relationship.
- The three interrelated and coexisting periods of planning, action, and evaluation.
In this overview the importance of “didactic action” should be noted: it is conducted by the “teacher” on the basis of a plan; active participation of the subject is implied; and it takes place in an “atmosphere of learning” that connotes a relational type (climate, dialogue, narration, etc.) and a type of mediation (methodology, instrument, configuration, etc.) in which real-time evaluation and adaptation of action are integral. Indeed, just as the parallels between computers and theatrical action have been noted (Laurel, 1993), a strong correspondence between digital interaction and didactic action is evident.
5. Technologies for Dance Learning: From Video Courses to Virtual Reality
New technologies are adopted in response to social need. Due to the difficulty of transmitting manual skills, technologies such as video have become commonplace in the teaching of activities ranging from dance to sport to home repair. Unfortunately, these video courses are often ineffective due to their lack of involvement, awkward timing, and lapses of connection between teacher and student. At the same time, however, they satisfy an important desire of cultural consumers interested in learning a new skill. A representative example of the video course paradigm for dance is the DVD “Hula for Everyone” (Taina Productions, 2005; running time 63 minutes), the cover of which is shown in figure 1.
This program promises to be “your complete Hula lesson,” with special features including “Imagine learning the hula with the magical island paradise as your background. Be enchanted by the melodic sounds of Hawai’i and be inspired by the voice of experience and charm of ‘Taina.’ Now you can learn to do the hula with these easy step-by-step lessons.”
Video-courses such as “Hula for Everyone” follow a format that includes the following elements:
- Each basic step is taught with music, which conditions the dancer to feel the rhythm from the very beginning.
- Each lesson progresses from simple movements to those of increased difficulty and speed.
- Each song is taught one phrase or verse at a time. The first time through the music and dance is slowed down for learning, and subsequent examples are at the full speed of the performance.
- “Mirror-image” is used. The instructor performs the movements in reverse, allowing the dancer to follow each lesson in a mirror-like fashion.
Despite the temporal and interactive limitations of the video-course, these points provide a solid foundation for the development of multimodal or virtual reality (VR) dance learning systems.
For example, imagine a space in which you can stand surrounded by a virtual landscape and music, and control the ebb and flow of that place (choose the location, the folk dance, the people you interact with) by your own full-body movement. In addition, imagine that this space also includes a motion tracking system that reproduces your silhouette in real time, with all of the information about your body contained in this silhouette, and a “dance teacher” you can interact with, who can entertain, sing, and speak to you, as would an actor on the stage.
VR systems make this possible. The element that differentiates VR from other representations, such as a projected film or a video watched on television, DVD or CD-ROM, is the direct involvement of the body and action of the subject of the experience. When we enter into a virtual world, we become present in it with our body and our actions: we can move, change our gaze, interact, and explore the virtual world in which we find ourselves completely immersed. This sensation of immersion is produced by multisensory kinesthetic involvement and the inclusion of our point of view within the computer-generated space. At an underlying level it is through our body that we understand the world and manipulate information and objects. In virtual reality we touch the world with our hand, and the effect of our actions through touch is a fundamental aspect of understanding, directly connected to our biological roots. This important bodily aspect, sensitive to our cognition, gives us a means, a language, to interpret the experience of synthetic worlds in terms of presence and action beyond simple visual experience. The experience of virtual worlds does not emphasize purely visual experiences and the related cognitive processes alone, but accompanies them with action; experience, action, perception and knowledge are unified. Virtual Reality systems are constructed taking this holistic means of perceiving into account, designed around the body and action rather than around a single sense.
In gyms, dance schools and other locales, students often watch themselves in the mirror while trying to learn the movements they have seen the teacher execute. Many dance teachers, however, prefer not to use the mirror, in order to intensify the student’s bodily perception and the connection between mind and body. This ultimately makes dance learning happen in the physical dimensions of the non-visual, with physical knowledge anticipating visual understanding (Sparacino, Davenport and Pentland, 2000).
Historically, many body-driven interactive artistic experiences have explored this terrain. For example, the projects Myron Krueger (1991) created in the early 1970s are the precursors to the work many researchers tackle today in the interactive art and entertainment community. In academia, Bregler (1997), Kakadiaris and Metaxas (1995), and Cham and Rehg (1999), among others, have produced 2D and 3D body tracking systems towards a camera-based quantitative understanding of the body in motion, using different original approaches including kinematic and dynamic models.
“The Hand Dance” builds on these works through the development of a gesture recognition system that can be used, in conjunction with a large projected theatrical environment display, to provide virtual scenographies and accompaniment to live performance. In this way, installation visitors can see what participants inside the installation are doing, their abilities tracked in real time and interpreted on the projected screen, emphasizing the beauty of gesture. Abstract and figurative images, the interpretations of hand movement, are projected like virtual shadows; in this way, the gesture of the hand can become a bird, flower, religious rite, message, etc., as shown in figure 2.
Furthermore, “The Hand Dance” is a machine learning system conceived as an interactive installation that can be used at home, in gyms, cultural centers, schools, etc. Teachers are able to record gestures into the system, which can then be recognized by subsequent users. Because the teacher may not be present to explain the given gestures, an interactive avatar/teacher is included in the interface to demonstrate the ideal movement. The experience is enhanced by the use of real-time stereoscopic visualization and an enhanced form of “mirror learning”, in which the student sees his or her own kinematic-skeleton reflected in a virtual “mirror” superimposed with that of the correct gesture. The system can also be installed on a network and distributed across numerous geographic locations.
6. “The Hand Dance” Installation in Detail
Body-driven artistic projects today can benefit from a variety of tools that capture information in real time and use it creatively to generate or drive lights, music, graphics, or video. A variety of commercially available body motion systems using real-time optical tracking can be put to the desired artistic objective (Sparacino, Davenport and Pentland, 2000).
On the high end, VICON (http://www.vicon.com) offers an expensive multiple-camera motion tracking system that uses passive (non-powered) markers on the body. This system requires a dedicated calibrated space and can handle tracking of multiple bodies at once, with the use of several cameras; the more cameras are employed, the higher the accuracy of the system. A more affordable solution by Infusion Systems (http://www.infusionsystems.com) offers a simple and powerful MIDI-based plug-and-play sensor package and software to capture aspects of body motion in real time. These sensors require wiring up the stage and/or the performer with cables, which, in some cases, can be a limiting factor. In the mid range, Rokeby’s VNS (Very Nervous System) (http://www.interlog.com/~drokeby/vns.html) is the first example of a camera-based tool to gather information on body motion in real time. While the VNS does not model an articulated body in motion, as the VICON tracking system does, it is able to detect the presence of any foreground object in motion in sufficiently high-contrast lighting conditions.
“The Hand Dance” system, intended for installation in a theatrical, educational, or museum environment, is currently implemented with the use of a VICON motion tracking system. This system is composed of 8 cameras that track the positions of retroreflective markers attached to the user performing the movement. The system’s cameras are placed around an enclosed area to create a capture volume, as shown in figure 3.
The user wears a special upper-body suit containing the retroreflective markers, allowing the gesture and position of the body to be tracked in real time. The position of the markers allows movements of the hands and arms to be captured and interpreted by the system, and includes one marker on the back, one on each shoulder, one on each elbow, two on each wrist, one on each thumb, one on the knuckle of each index finger and one on each index finger tip. This information is used to control the motion of an avatar in the 3D environment, as shown in figure 4. This template has nine segments connected by ball joints, allowing the velocities, angles and positions of a gesture performance to be analyzed by the real-time recognition system. Alternate kinematic models and marker configurations are easy to save within the VICON control software, so the system can be expanded to include the capture and analysis of full-body gestures.
After analyzing the behavior of numerous dance movements, it was determined that the only attributes that do not change significantly with the user’s position, placement or body size are the angles of the kinematic structure. With this in mind, the values of the angles between the finger/thumb, palm/forearm, forearm/arm and arm/shoulder of each arm are calculated based on the markers’ positions, as shown in figure 5.
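As an illustration of why joint angles are a convenient feature, the following minimal sketch computes the angle at a joint from three 3D marker positions and checks that it is unchanged when the markers are translated and uniformly scaled. The function name `joint_angle` and the sample coordinates are hypothetical; this is not the authors' implementation, only one standard way to obtain such an angle from a dot product.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at joint b, between segments b->a and b->c.

    a, b, c are (x, y, z) marker positions (hypothetical input format).
    """
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("coincident markers: angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Illustrative marker positions (made-up values, arbitrary units).
shoulder, elbow, wrist = (0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.25, 0.0)

def move(p, scale=1.5, offset=(2.0, -1.0, 0.5)):
    """Simulate a larger user standing elsewhere in the capture volume."""
    return tuple(scale * p[i] + offset[i] for i in range(3))

original = joint_angle(shoulder, elbow, wrist)
transformed = joint_angle(move(shoulder), move(elbow), move(wrist))
print(original, transformed)
```

Marker positions change with where the user stands and how long their limbs are, whereas the angle between two segments depends only on their relative orientation, which is the property the text relies on.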
Inspection of the computed angles revealed that some recorded values are not needed because they do not belong to the movement itself. For this reason, a “chopper” was developed. The idea behind the chopper is to use the angular velocities and palm velocities to determine when the desired movements start and finish.
The overall sequence of the system architecture is shown in figure 6, beginning with the user performing movements to record the markers placed on the body.
For each captured gesture, a series of manipulations is performed on the data. The first algorithm corrects the angles in the case of a lost marker (the VICON system sometimes loses a marker and sets its position to zero). Next, another algorithm calculates the velocities of the palms and of the angles (with basic mathematical methods). These velocities are used to determine where each movement starts and finishes, and to filter the data from the gesture. A subsequent algorithm uses these values to chunk the movement: a movement starts when the first velocity (either angular or of the palms) changes from zero, and ends when the velocities return to zero and remain there for some time.
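The steps above can be sketched as follows. The text does not say how lost markers are corrected, so linear interpolation between the nearest valid samples is an assumption here, as are the function names (`repair_lost`, `velocities`, `chop`), the noise threshold `eps`, and the number of still frames `hold` that ends a movement. Velocities are taken as simple finite differences.

```python
def repair_lost(track):
    """Replace samples reported as (0, 0, 0) by linear interpolation.

    Assumption: a marker exactly at the origin is treated as lost.
    """
    valid = [i for i, p in enumerate(track) if any(p)]
    if not valid:
        raise ValueError("no valid samples in track")
    fixed = list(track)
    for i, p in enumerate(track):
        if any(p):
            continue
        prev = max((j for j in valid if j < i), default=None)
        nxt = min((j for j in valid if j > i), default=None)
        if prev is None:
            fixed[i] = track[nxt]      # lost at the start: hold next value
        elif nxt is None:
            fixed[i] = track[prev]     # lost at the end: hold last value
        else:
            t = (i - prev) / (nxt - prev)
            fixed[i] = tuple(track[prev][k] + t * (track[nxt][k] - track[prev][k])
                             for k in range(3))
    return fixed

def velocities(values, dt):
    """Backward finite-difference velocity of a scalar signal."""
    return [0.0] + [(values[i] - values[i - 1]) / dt
                    for i in range(1, len(values))]

def chop(speeds, eps, hold):
    """Return (start, end) frame pairs, end exclusive, for each movement.

    A movement starts when |speed| first exceeds eps and ends once the
    speed has stayed at or below eps for `hold` consecutive frames.
    """
    segments, start, still = [], None, 0
    for i, s in enumerate(speeds):
        moving = abs(s) > eps
        if start is None:
            if moving:
                start, still = i, 0
        elif moving:
            still = 0
        else:
            still += 1
            if still >= hold:
                segments.append((start, i - hold + 1))
                start = None
    if start is not None:              # recording ended mid-movement
        segments.append((start, len(speeds) - still))
    return segments

# Illustrative joint-angle signal in degrees (made-up values).
angle = [10, 10, 10, 20, 30, 40, 40, 40, 40, 40]
segments = chop(velocities(angle, dt=1.0), eps=1.0, hold=3)
print(segments)
```

With several angular and palm velocity channels, one option is to pass `chop` the per-frame maximum of the absolute channel speeds, so that a movement starts as soon as any one channel leaves rest.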
Two neural networks have been tested to recognize the gestures: a Feed Forward Neural Network (FFNN) and a Probabilistic Neural Network (PNN) (Bishop, 2006). The repeatability of the results for each instance of the PNN suggests that this neural network is the best option in the real-time implementation of a Gesture Recognition System for Virtual Environment Control (Portillo-Rodriguez et al., 2007).
Within the context of this technological backdrop, visitors can choose to experience different virtual environments in which gesture plays a critical role. For example, the hands of a potter control the shape of the ceramic vase being thrown on the potter’s wheel, the traffic policeman uses gesture to communicate with cars and pedestrians at busy intersections, and the Indian dancer uses gesture to tell stories by evoking metaphorical images. The system is equipped with a gesture recognition algorithm to control the virtual environment. Thus the motion of the user’s body controls each environment in a different way, depending on the context.
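For readers unfamiliar with the PNN, the following is a minimal sketch of the general technique (a Gaussian Parzen-window classifier), not the network configuration used in the system. It assumes each gesture has been reduced to a fixed-length feature vector such as a few joint angles; the gesture labels, feature values and smoothing width `sigma` are all hypothetical.

```python
import math

def pnn_classify(x, training, sigma):
    """Classify feature vector x with a probabilistic neural network.

    training: dict mapping a class label to a list of feature vectors.
    Pattern layer: one Gaussian kernel per stored training sample.
    Summation layer: average kernel response per class.
    Decision layer: the class with the largest average wins.
    """
    scores = {}
    for label, samples in training.items():
        total = 0.0
        for s in samples:
            d2 = sum((xi - si) ** 2 for xi, si in zip(x, s))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        scores[label] = total / len(samples)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical recorded gestures: two joint angles (degrees) per sample.
training = {
    "circle": [[90.0, 45.0], [85.0, 50.0]],
    "wave": [[10.0, 170.0], [15.0, 160.0]],
}

label, scores = pnn_classify([88.0, 47.0], training, sigma=10.0)
print(label, scores)
```

A PNN stores the training samples directly and has no randomly initialized weights, so the same training set always yields the same classifier; this may be one reason its results are more repeatable than those of an iteratively trained feed-forward network.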
7. Conclusions and Future Developments
“The Hand Dance” concept points to the possibility of various future developments, including elaborated virtual scenographies and accompaniment for live performance, the creation of virtual schools of manual skill instruction, the creation of a collaborative didactic format integrating interested associations, research laboratories and schools with residents in places where dance or manual skill are localized, applications of movement therapy, laboratories where choreographies can be performed between people in diverse locations, the creation of new theatrical performances, virtual dance companies, etc.
To date the system provides accurate gesture recognition for gestures with a consistent starting and stopping position, but real gestures are often part of a fluid sequence. Improving the system to account for the continuous and varying motion of everyday tasks is an ongoing area of research.
To enhance motor learning, elements providing haptic feedback to correct faulty posture could be incorporated into the system. Vibrotactile feedback suits, for example, have been shown to be effective aids to accelerated human motor learning (Avizzano and Bergamasco, 1999, Lieberman and Breazeal, 2007).
Finally, improvements to the reactive virtual environment demonstration will be developed to test the efficiency and implications of gesture learning on a variety of different users and situations. For interactive learning environments in particular, performed data could be mined by an intelligent system to understand which aspects of movement are similar and how students’ gestures evolve as they learn. Opportunities incorporating real-time machine learning and an interactive give and take of teaching and learning between human and system would be interesting to apply in a networked environment where a variety of users collaborate to teach the system, and each other, the fundamentals of newly invented and/or improvised collaborative performance.
“The Hand Dance” is an evolution of continuing research at the PERCRO Laboratory, Scuola Superiore Sant’Anna in Pisa, Italy. In addition to inclusion in ICHIM 07, “The Hand Dance” will be shown at the scientific and artistic event “Enaction_in_Arts” 2007 in Grenoble, France (http://www.enactivenetwork.org). In addition to the Enactive Network of Excellence, special thanks to the SKILLS IP consortium, an EU funded project dealing with the acquisition, interpretation, storing and transfer of human skill by means of multimodal interfaces, robotics and virtual environment technologies.
Thanks also to Carlo Alberto Avizzano, Oscar O. Sandoval-González, Jesús A. Velázquez Lechuga, and Gerardo A. Saucedo Basilio for contributions to the technological development of the gesture recognition system.
Avizzano and Bergamasco, 1999. Carlo Alberto Avizzano and Massimo Bergamasco, Haptic Interfaces: a New Interaction Paradigm, Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems, 1999
Bausinger, 2006. Hermann Bausinger, Cultura popolare e mondo tecnologico, Guida, Napoli, 2006.
Bishop, 2006. Christopher M. Bishop, Pattern Recognition and Machine Learning, ISBN 9780387310732, Springer, 2006.
Bregler, 1997. Bregler, C., Learning and recognizing human dynamics in video sequences, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 1997
Cadoz, 1994. Claude Cadoz, Le geste, canal de communication homme/machine, Technique et science informatiques, 13, 1, pp. 31-61, 1994.
Castells, 2002. Manuel Castells, La nascita della società in rete, Università Bocconi ed., Milano, 2002.
Cham and Rehg, 1999. T.J.Cham, J.M.Rehg, A multiple hypothesis approach to figure tracking, IEEE Conference on Computer Vision and Pattern Recognition, 2, 1999
Clemente and Mugnaini, 2001. Pietro Clemente and Fabio Mugnaini, Oltre il folklore: Tradizioni popolari e antropologia nella società contemporanea, Carocci ed., Roma, 2001.
Kakadiaris and Metaxas, 1995. Ioannis Kakadiaris and Dimitris Metaxas, 3D Human body model acquisition from multiple views, Proc. International Conference on Computer Vision, 1995
Krueger, 1991. Myron Krueger, Artificial Reality 2, Addison-Wesley Professional, ISBN 0-201-52260-8, 1991.
Kurtenbach and Hulteen, 1990. Gordon Kurtenbach and Eric A. Hulteen, Gestures in Human-Computer Interaction, The Art of Human-Computer Interface Design, Addison Wesley, 1990.
Lakoff and Johnson, 1999. George Lakoff and Mark Johnson, Philosophy in the Flesh: The embodied mind and its challenge to western thought, Basic Books, New York, 1999.
Laurel, 1993. Brenda Laurel, Computers as Theater, Addison-Wesley Professional, 1993.
Lieberman and Breazeal, 2007. Jeff Lieberman and Cynthia Breazeal, Development of a Wearable Vibrotactile Feedback Suit for Accelerated Human Motor Learning, IEEE International Conference on Robotics and Automation, Roma, Italy, 2007.
Morris, Collet, Marsh and O’Shaughnessy, 1979. Desmond Morris, Peter Collet, Peter Marsh and Marie O’Shaughnessy, Gestures: Their Origin and Distribution, Stein and Day, 1979.
Portillo-Rodriguez et al., 2007. Otniel Portillo-Rodriguez, Oscar O. Sandoval-González, Haakon Faste, Jesús A. Velázquez Lechuga, Gerardo A. Saucedo Basilio, Elvira Todaro, Carlo Alberto Avizzano and Massimo Bergamasco, Towards a Flexible Real-time Gesture Recognition System for Virtual Environment Control, 4th International Conference on Enactive Interfaces, Grenoble, France, November 19-22, 2007. Submitted.
Sparacino, Davenport and Pentland, 2000. Flavia Sparacino, Glorianna Davenport and Alex Pentland, Media in performance: Interactive spaces for dance, theater, circus, and museum exhibits, IBM Systems Journal, Vol. 39, Nos. 3&4, 2000.
Taina Productions, 2005. Taina Productions, Hula for Everyone, distributed by Island Heritage, 2005.
Varela, Thompson and Rosch, 1991. Francisco Varela, Evan Thompson and Eleanor Rosch, The Embodied Mind: Cognitive Science and Human Experience, MIT Press, 1991.
Todaro, E., et al., The Hand Dance: A Didactic Performance Platform, in International Cultural Heritage Informatics Meeting (ICHIM07): Proceedings, J. Trant and D. Bearman (eds). Toronto: Archives & Museum Informatics. 2007. Published October 24, 2007 at http://www.archimuse.com/ichim07/papers/todaro/todaro.html