Proceedings paper

 

Symposium: The Power of the Voice for Singer and Listener

Antonio G. Salgado

Departamento de Comunicação e Arte,

Universidade de Aveiro, Aveiro, Portugal

 

Voice, Emotion and Facial Gesture in Singing

Structured Abstract

1. Background: Recent enquiry into facial gesture and the information it conveys has shown that the human face is a commanding site for investigating a person's emotional state and personality. Either by comparing observers' judgements of emotion from facial behaviour, or by measuring the facial movement itself, a body of data has been collected relating facial gesture to the emotion communicated through it.

Alongside this, recent psychodynamic, phenomenological and therapeutic research has approached singing as a creative musical experience occurring in both time and space, and existing precisely at the threshold between 'self and the world' - a resonant field where self and other may experience feelings of oneness and wholeness, a channel through which one expresses and communicates something from the interior world. In other words, singing provides a bridge between the inner world of mood, emotion, image, thought, experience and the outer world of relationship, discourse and interaction.

2. Aim: The aim of the current study is to investigate empirically how voice, emotion and facial gesture might be connected in singing and to verify how (or if) this possible connection might have any impact on the singer's awareness of his/her self-perception.

3. Method: A series of videotapes of three singers performing a line from a German Lied with different emotional intentions, from neutral to anger, were analysed. The analysis comprised measurements of facial movements, spectrographic analyses of the singing, and singers' and onlookers' judgements of emotional content from the facial behaviour and the vocal communication.

4. Results

Data analysis revealed that both larynx and face move and work very differently according to emotional state. The particular profiles of each share some common characteristics. For instance, when singing with a sad expression, the face contracts, reducing its overall surface area, and so too does the vocal sound, producing a more breathy, whispered tone, indicating that the ventricle folds are further apart than in the other emotional conditions. Interestingly, a similar result was obtained for the communication of fear for singers A and C, whereas singer B used a much fuller sound and lifted the frontalis, as is typical of surprise (Ekman and Friesen, 1969). This suggests that the three singers had slightly different interpretations of what fear was and how it was created. One possible reason for the difference is that singer B was thinking about the surprise within the Erlkönig itself, where the child suddenly, in a fearful state, calls out for his father. Singers A and C said that they focused more on a generic expression of fear, and did not have the element of surprise in mind when singing.

Whilst there was a high correlation for all states, especially happiness, it is worth mentioning that all the singers noticed discomfort in their throats when singing in the angry condition. Upon analysis of the data, it seems that in this state the platysma muscle is involved. Since the singers were wearing throat straps, the two electrodes were forced downwards as the platysma was brought into action, causing the discomfort. The sung effect was a rougher, less vibrant tone.

Interview data can be touched upon here to show that all the singers believed that their emotions were 'authentic' in that they were constructed out of memories of those states. However, they were all able to recognise that in their different interpretations some were more 'successful' than others. The quantitative data so far do not show any statistical differences between these more and less successful interpretations. Thus, it is only possible to begin to theorise about what might allow the differentiation between the interpretations to occur. From the commentaries of the singers and the viewers, it appears that there are qualitative differences in the intensity with which the muscles are used. That is to say, if the singer is clearly working with a stronger inner intention, the effect is more successful.

 

5. Implications: Within the signifying process of singing, it is evident that facial behaviour plays an important role in communication. The singer's awareness of his/her vocal expression and vocal self-perception can clearly be increased through awareness of facial behaviour.

 

Paper

Intentions

The current study investigates empirically how voice, emotion and facial gesture might be connected in Western classical singing. The need for such a study arose out of an awareness of several issues. Firstly, facial expression in singing is often discussed anecdotally, but has rarely been subjected to any empirical analysis. Secondly, singing teachers often ask singers to 'sing with the eyes', 'make a smile' and so on, to achieve technical ends in singing. Thus, it was considered important to know whether these different facial movements do in fact affect the quality of the produced vocalisation. Thirdly, given the second point, it seemed important to know if there was an objective correlation between the facial gesture and the sound made, in terms of its expressive intention. From a perceptual perspective, for instance, does the audience understand more if the singer looks as well as sounds 'sad', and what do these emotions objectively look and sound like? Fourthly, it is well known that singers and actors often show empathy with an emotional state, without entering into it completely or 'authentically'. In fact, singers 'act' out emotions. It was a final intention of this work to explore the extent to which the emotional expressions requested were perceived as being authentic by both the performer him/herself and the audience. It was possible to match these data against 'norms' for emotional expression in the face by comparing the profiles of the singers with measurements of facial formation/musculature arrangement for real emotions recorded by Ekman and Friesen (1969) from still photographs taken when people were subjected to specific emotion-eliciting situations.

Background

According to Manén (1974), historically, Bel Canto vocal technique was a musical exploration of the different vocal expressions for the different emotional states. So, in a practical way there has been an exploration of vocal emotion in singing, but there have been few systematic empirical studies. Of the existing empirical work, key research has been undertaken by Kotlyar and Morozov (1976) and Sundberg (1980), who have demonstrated that when singers are asked to sing with the emotional intentions of joy, sorrow, anger, fear, and no emotion, very different spectrographic analyses of the vocal sounds emerge. In joy, for instance, there is a much higher frequency than in the other emotions, the tonal course of the pitches is moderate, both up and down, the tonal colour comprises many overtones and the volume is loud. In sadness, there is a much lower frequency produced, the tonal course of the pitches is downward, the tonal colour is very restricted, with few overtones, and the volume is soft. Given the complete absence of research precedents for what happens to the face in singing, it was hypothesised that the face would differ greatly according to the emotion, with happiness involving very different gestures to sadness, as it was felt that there would be a correlation between size of expression and loudness of sound produced. These hypotheses were in part based on intuitions from everyday observations, but also emerged out of drawing parallels with the work of Davidson (1994), who discovered that when a pianist was asked to perform with different emotional intentions, the louder he played, the larger the movements he made in order to produce the sounds.

In the general emotional expression literature, key contributions to the field have been made by Ekman and Friesen (1969) and their co-workers. Whilst they largely support Davidson's findings, it is worth noting that in terms of musculature, fewer muscles are involved in happiness than in the other emotions.

Linking these general research findings about musculature to singing technique, it is important to note that in classical singing, the intention is generally to keep the larynx free, to allow for optimum vibration. Additionally, the singer is taught to use the resonating cavities of the face and the pharynx. To achieve this, vowel sounds are often modified from those used in everyday speech, with the mouth opening rather more at the back than the front (Helmsley, 1998). These factors may have an impact on how the face works when the highly trained singer is asked to produce an emotional expression. In fact, there may even be some source of conflict, with natural facial expression involving a muscle in one direction which may need to function in another way for the sake of optimum vocalisation of the same emotion when interfaced with the technique of singing.

Fonagy (1962) undertook some pioneering research examining glottal behaviour during emotional speech. He found very different glottal behaviours for the different emotions, with the ventricle folds being further apart in sadness or tender whispering, and the laryngeal ventricle squeezed together for pressed phonation in anger, for instance. But if, as Helmsley (1998) argues, the emotions have to come first from the mind (thought of anger), then through the eyes (visualise to realise the emotion) and eventually into the voice, it is important to examine whether the facial musculature leads to the sound production or whether the sound production causes the facial expressions. The issues of learning, formation or innate expression of emotion require careful consideration, for, in contrast, Fonagy (1962) refers to the glottal profiles created in his study as 'pre-conscious expressive gestures'.

Singers are, of course, actors, often asked to characterise different people or emotional states in their work. They are people who apparently learn to 'fake' behavioural moods. Runeson and Frykholm (1983) believe that faked emotional states and expressions are formed in a slightly different manner to genuine ones, and thus an expert viewer can distinguish between the two. Of course, in singing, we accept that in an operatic characterisation there is an element of dramatic play, and so perhaps even expect the gestures and expressions used to be 'fake' or 'larger' than in real life. There is an expectation which needs to be fulfilled. It seems critical for these reasons to compare data from singers producing 'faked' emotional facial expression and sounds with those of genuine ones.

Alongside all of the research described above, it is important to note that recent psychodynamic, phenomenological and therapeutic research has approached singing as a creative musical experience occurring in both time and space, and existing precisely at the threshold between 'self and the world' - a resonant field where self and other may experience feelings of oneness and wholeness, a channel through which one expresses and communicates something from the interior world. In other words, singing provides a bridge between the inner world of mood, emotion, image, thought, experience and the outer world of relationship, discourse and interaction (Salgado, 1999; Draffan, 2000). Thus, finally, to get a deeper insight into the issue of emotion versus faked emotion, it is important to interview performers to ask how they feel when producing these emotions, and to explore audience reactions to the facial gestures used.

Methods

Recordings

Two male singers and one female singer (mean age 32 years) with an average professional experience of 8 years singing in solo oratorio, opera and recital work were used as the subjects of investigation. They were asked to prepare the musical phrase "Mein Vater, mein Vater" from the Lied Erlkönig by Schubert for recording. This phrase was chosen since the words could apply to almost any emotional situation. The musical line itself both rises and falls within a limited range of a third, so does not make particular technical demands on the singer, and so again leaves interpretative possibilities open. Recordings were made in five different conditions:

Neutral: this was based on Fonagy's (1962) examination, which always used the neutral state as a reference against which to measure the other emotional states. It acted as a base-line measurement.

Happy.

Sad.

Fearful.

Angry.

These are the four fundamental emotions which are now well reported as being the strongest and most clearly recognisable in many contexts (Ekman and Friesen, 1969).

For the recording, it was necessary to videotape the singers in full-face close-up. Measurements of facial muscle activity were taken by digitising and tracking muscle activity over time with a specially designed software package. To track the movements it was necessary to mark the muscles to be mapped with 25 coloured circular stickers: 12 on each side of the face and an anchor marker on the bridge of the nose. For the vocal recordings, the Betacam sound channel was used to generate a spectrogram through software which allowed for an immediate plot of the harmonic spectrum and the singer's formant. Additionally, an electrolaryngograph was used to collect data about the opening and closing of the glottis. This was recorded by placing two electrodes on the outside of the larynx. These were kept in place using an elasticated neck strap.
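
As an illustration of how marker tracking of this kind might be quantified, the following minimal Python sketch sums each marker's frame-to-frame movement relative to the anchor marker on the bridge of the nose. It is not the specially designed software package used in the study: the marker names, the coordinate format and the choice of summed Euclidean distance as the activity measure are all assumptions made for the example.

```python
from math import hypot


def anchor_relative(frame, anchor="nose_bridge"):
    """Express every marker position relative to the anchor marker.

    `frame` maps a (hypothetical) marker name to an (x, y) position.
    Subtracting the anchor removes whole-head movement from the data.
    """
    ax, ay = frame[anchor]
    return {
        name: (x - ax, y - ay)
        for name, (x, y) in frame.items()
        if name != anchor
    }


def total_displacement(frames, anchor="nose_bridge"):
    """Sum, per marker, the distance travelled between consecutive frames."""
    totals = {}
    previous = anchor_relative(frames[0], anchor)
    for frame in frames[1:]:
        current = anchor_relative(frame, anchor)
        for name, (x, y) in current.items():
            px, py = previous[name]
            totals[name] = totals.get(name, 0.0) + hypot(x - px, y - py)
        previous = current
    return totals
```

Expressing positions relative to the anchor means that a singer who moves the whole head, without changing facial expression, contributes no apparent muscle activity.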

After being videotaped, the singers, together with three other viewers, watched the recordings to assess the success of the tasks. From between two and four different attempts at each emotional state made by each singer, the viewers assessed which were perceptually the most and least authentic, and these bi-polar pairs were used as sources of data for analysis.

The singers were also interviewed about their views on emotion and singing and these qualitative comments were used to help interpret the data.

 

Results

Sound analysis

Emotion - Neutral. In this condition, the singer's formant was not dominant, and the voice was weak in both amplitude and harmonics.

Emotion - Sadness. Relatively low harmonic content compared with that in the other emotions, with singer B's voice being far weaker in both amplitude and harmonics than in his other examples. In all cases, the singer's formant (the strong harmonics between about 2500 and 3500 Hz) is not particularly dominant.

Emotion - Fear. Singers A and C both show very low harmonic content, with only the first two harmonics coming out strongly in singer A's case. Singer C is totally lacking the singer's formant, and singer A's is very weak, reflecting the tendency to half-voice or whisper to create the impression of fear. For singer B, however, there is quite strong harmonic content, indicating a much fuller-voiced interpretation.

Emotion - Happy. Singers C and A still show a weaker harmonic content, though the 2500-3500 Hz area is stronger than in the fearful example. Singer B's plot, however, is not greatly different from that for fear.

Emotion - Anger. This showed the most dramatic change: both amplitude and harmonic content are considerably greater for singers A and C, and slightly greater for singer B. Singer C's singer's formant still seems relatively weak, but the overall amplitude for all of this singer's plots is considerably weaker than for either of the other singers (that is, singer C is singing rather more quietly).
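
The notion of a 'dominant' or 'weak' singer's formant used in these descriptions can be made concrete with a simple ratio. The sketch below is an illustration only and not the analysis software used in the study: it assumes that each spectrum is available as a list of (frequency in Hz, amplitude) pairs for the harmonics, that energy can be approximated by squared amplitude, and that the band of interest is the 2500-3500 Hz region mentioned above.

```python
def formant_band_ratio(harmonics, low_hz=2500.0, high_hz=3500.0):
    """Return the share of spectral energy lying in the band [low_hz, high_hz].

    `harmonics` is a list of (frequency_hz, amplitude) pairs; this input
    format and the squared-amplitude energy measure are assumptions.
    """
    total = sum(amplitude ** 2 for _, amplitude in harmonics)
    if total == 0:
        return 0.0  # a silent spectrum has no energy in any band
    in_band = sum(
        amplitude ** 2
        for frequency, amplitude in harmonics
        if low_hz <= frequency <= high_hz
    )
    return in_band / total
```

On this measure, a half-voiced or whispered rendition with only its lowest harmonics present would give a ratio near zero, whereas a fuller-voiced rendition would give a larger value.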

Visual analysis

Emotion - Neutral. In this condition, the measurements of the movements show a very limited range of muscle activity, with a high degree of correlation between the three singers' use of their faces.

Emotion - Sadness. Corrugator muscles are used extensively in this condition, with singer B showing the most movement activity, singer A a moderate range of activity, and singer C the least activity. There is a correlation within each individual's data for the bi-polar pairings of sadness recordings, showing that whether authentic or inauthentic emotion is expressed, the same muscles are involved.

Emotion - Fear. Levator labii and frontalis muscles are used here, but the degree of involvement varies according to individual and bi-polar interpretation. For instance, in singer B, both sets of muscles are equally involved in both interpretations of fear. For singer C, there is limited activity in both. For singer A, there is little frontalis activity, but more in her more authentic interpretation of fear.

Emotion - Happy. Zygomaticus major and risorius muscles are involved. As in the neutral condition, there is a high degree of correlation between singers and between bi-polar pairs.

Emotion - Anger. Platysma and procerus are involved. Here, singers A and B use very similar formations and degrees of activity in both renditions of the emotion. For singer C, there is less overall facial activity.
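
The statistic behind the correlations between singers and between bi-polar pairs is not specified here. As one possible reading, the sketch below computes a Pearson correlation coefficient between two muscle-activity profiles. The choice of Pearson's coefficient and the representation of a profile as a list of per-marker activity values are assumptions made for the illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length activity profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / sqrt(spread_x * spread_y)
```

A value close to 1 would indicate that two renditions engage the markers in the same proportions, even if one rendition is more intense overall, which fits the observation that the same muscles are involved in both members of a bi-polar pair.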

 

Conclusions

In summary, the data analysis reveals that both larynx and face move and work very differently according to emotional state. The particular profiles of each share some common characteristics. For instance, when singing with a sad expression, the face contracts, reducing its overall surface area, and so too does the vocal sound, producing a more breathy, whispered tone, indicating that the ventricle folds are further apart than in the other emotional conditions. Interestingly, a similar result was obtained for the communication of fear for singers A and C, whereas singer B used a much fuller sound and lifted the frontalis, as is typical of surprise (Ekman and Friesen, 1969). This suggests that the three singers had slightly different interpretations of what fear was and how it was created. One possible reason for the difference is that singer B was thinking about the surprise within the Erlkönig itself, where the child suddenly, in a fearful state, calls out for his father. Singers A and C said that they focused more on a generic expression of fear, and did not have the element of surprise in mind when singing.

Whilst there was a high correlation for all states, especially happiness, it is worth mentioning that all the singers noticed discomfort in their throats when singing in the angry condition. Upon analysis of the data, it seems that in this state the platysma muscle is involved. Since the singers were wearing throat straps, the two electrodes were forced downwards as the platysma was brought into action, causing the discomfort. The sung effect was a rougher, less vibrant tone.

Interview data can be touched upon here to show that all the singers believed that their emotions were 'authentic' in that they were constructed out of memories of those states. However, they were all able to recognise that in their different interpretations some were more 'successful' than others. The quantitative data so far do not show any statistical differences between these more and less successful interpretations. Thus, it is only possible to begin to theorise about what might allow the differentiation between the interpretations to occur. From the commentaries of the singers and the viewers, it appears that there are qualitative differences in the intensity with which the muscles are used. That is to say, if the singer is clearly working with a stronger inner intention, the effect is more successful.

References

Davidson, J.W. (1994) What type of information is conveyed in the body movements of solo musician performers? Journal of Human Movement Studies, 6, 279-301.

Draffan, K. (2000) Singing from the Soul. Unpublished MMus Dissertation, University of Sheffield.

Ekman, P. and Friesen, W.V. (1969) The repertory of nonverbal behaviour: Categories, origins, usage, and coding. Semiotica, 1, 49-98.

Fonagy, I. (1962) Mimik auf glottaler Ebene. Phonetica, 8, 209-219.

Helmsley, T. (1998) Singing and Imagination. Oxford: Oxford University Press.

Kotlyar, G.M. and Morozov, V.P. (1976) Acoustical correlates of the emotional content of vocalised speech. Soviet Physiology and Acoustics, 22, 208-211.

Manén, L. (1974) The Art of Singing. London: Faber Music Ltd.

Runeson, S. and Frykholm, G. (1983) Kinematic specification of dynamics as an informational basis for person-and-action perception: Expectations, gender, recognition, and deceptive intention. Journal of Experimental Psychology: General, 112, 585-615.

Salgado, A. (1999) Rethinking voice evaluation in singing. Conference Proceedings from European Society of Cognitive Sciences of Music conference on Research Relevant to Music Training Institutions. Lucerne, Switzerland, September.

Sundberg, J. (1980) Röstlära. Stockholm: Proprius Förlag.

 
