Voice Acoustics: an introduction (and redirection)
Speech
science has a long history. Speech and voice acoustics are an active
area of research in many labs, including our own, which studies the
singing and speaking voice. This document gives an introduction and
overview, followed by a more detailed account that sometimes uses
experimental data to illustrate the main points. Throughout, we
suggest a number of simple experiments for the reader to try.
More about speech and singing
Voice science is a broad and
active area of research. The references quoted in this essay appear
below, and below that is a collection of links.
One of the aims of this essay is to provide an introduction to our research on the
voice and to our publications on voice and music acoustics.
This web essay was written by Joe Wolfe, Maëva
Garnier and John Smith of the Acoustics
Group at UNSW, 2009.
References
- Alku, P., (1991). "Glottal Wave Analysis With
Pitch Synchronous Iterative Adaptive Inverse Filtering", in
Proceedings of Second European Conference on Speech Communication
and Technology, Genova, Italy.
- Barrichelo, V. M. O., Heuer, R. J., Dean, C. M.
& Sataloff, R. T. (2001). “Comparison of singer's formant,
speaker's ring, and LTA spectrum among classical singers and
untrained normal speakers”, J. Voice, 15, 344-350.
- Baken, R.J. and Orlikoff, R.F. (2000). Clinical
Measurement of Speech and Voice. 2nd ed. Singular Publishing
Group, San Diego, California.
- Barney, A., De Stefano, A., and Henrich, N.
(2007). “The effect of glottal opening on the acoustic response of
the vocal tract” Acta Acustica united with Acustica, 93,
1046-1056.
- Behnke E. (1880). The mechanism of the human
voice, 12th ed. London: J. Curwen & Sons, Warwick Lane,
E.C.
- Bele, I. (2006) "The speaker's formant". J.
Voice, 20, 555-578.
- Bjorkner, E. (2006). Why so different? Doctoral
dissertation. KTH, Stockholm.
- Bloothooft, G. and Plomp, R. (1986a). “Spectral
analysis of sung vowels. III. Characteristics of singers and modes
of singing.” J. Acoust. Soc. Am. 79, 852-864.
- Bloothooft, G. and Plomp, R. (1986b). The sound
level of the singer's formant in professional singing.
J.Acoust.Soc.Am., 79, 2028-2033.
- Carlson, R., Granström, B. and Fant, G. (1970).
"Some studies concerning perception of isolated vowels." STL-QPSR
2-3: 19-35.
- Chen, M.Y. (1997). “Acoustic correlates of
English and French nasalized vowels”. J. Acoust. Soc. Am. 102,
2360-2370.
- Childers, D.G., Krishnamurthy A.K., (1985). "A
critical review of electroglottography". Critical Rev. Biomed.
Eng., 12, 131-161.
- Childers, D. G. & Lee, C. K. (1991). “Vocal
quality factors: analysis, synthesis, and perception”.
J.Acoust.Soc.Am., 90, 2394-2410.
- Clark, J. Yallop, C. and Fletcher, J., An
Introduction to Phonetics and Phonology, Blackwell, Oxford
(2007).
- Cleveland, T. F., Sundberg, J. and Stone, R. E.
(2001). “Long-term-average spectrum characteristics of country
singers during speaking and singing.” J. Voice, 15, 54-60.
- Dang, J. and Honda, K., (1997). "Acoustic
characteristics of the piriform fossa in models and humans",
J.Acoust.Soc.Am., 101: 456-465.
- Ekholm, E., Papagiannis, G. C. and Chagnon, F.
P. 1998. Relating objective measurements to expert evaluation of
voice quality in western classical singing: Critical perceptual
parameters. J. Voice 12, 182-196.
- Elliot, S.J., Bowsher, J.M. (1982).
"Regeneration in brass wind instruments", J. Sound & Vibration
83, 181-217.
- Fant, G. (1960). Acoustic Theory of Speech
Production. Mouton & Co, The Hague, Netherlands.
- Feng, G. and Castelli, E. 1996. Some acoustic
features of nasal and nasalized vowels: A target for vowel
nasalization. J. Acoust. Soc. Am., 99, 3694-3706.
- Flanagan, J. and Landgraf, L. (1968).
"Self-oscillating source for vocal-tract synthesizers", IEEE
Trans. Audio and Electroacoustics, 16, 57-64.
- Flanagan, J. L. (1960). Analog Measurements of
Sound Radiation from the Mouth. J.Acoust.Soc.Am., 32,
1613-1620.
- Fletcher, N.H. "Autonomous vibration of simple pressure-controlled
valves in gas flows" J. Acoust. Soc. Am. 93: 2172-2180,
1993.
- Garcia M. (1855). Observations on the human
voice. In: Proc. Royal Soc. London, p. 399-410.
- Garnier, M. (2007). Communication in noisy
environments: from adaptation to vocal straining. Ph.D thesis,
University of Paris 6.
- Garnier, M., Henrich, N., Castellengo, M.,
Sotiropoulos, D. and Dubois, D. (2007). "Characterisation of Voice
Quality in Western Lyrical Singing: from Teachers' Judgements to
Acoustic Descriptions". J. Interdisciplinary Music Studies 1(2):
62-91.
- Garnier, M., Wolfe, J., Henrich, N. and Smith,
J. (2008). "Interrelationship between vocal effort and vocal tract
acoustics: a pilot study". Proc. of ICSLP, Brisbane,
Australia.
- Garnier, M., Henrich, N., Smith, J. and Wolfe,
J. (2010) "Vocal
tract adjustments in the high soprano range" J.
Acoust. Soc. America. 127, 3771-3780.
- Gauffin, J. and Sundberg, J. (1989). "Spectral
correlates of glottal voice source waveform characteristics." J.
Speech and Hearing Research 32(3): 556-565.
- Ghonim, A., Smith, J. and Wolfe, J. (2007) “The
sounds of world English”
- Goldstein, J.L. (1973). "An optimum processor
theory for the central formation of the pitch of complex tones".
J.Acoust.Soc.Am., 54, 1496-1516.
- Hardcastle, W. and Laver, J.D. (1999). “The
Handbook of Phonetic Sciences”. Blackwell Handbooks in
Linguistics, Wiley-Blackwell.
- Henrich, N., Kiek, M., Smith, J. and Wolfe, J.
(2007) "Resonance strategies in Bulgarian women's
singing", Logopedics Phoniatrics Vocology, 32, 171-177.
- Henrich, N. (2006). "Mirroring the voice from
Garcia to the present day: some insights into singing voice
registers." Logopedics Phoniatrics Vocology 31(1): 3-14.
- Henrich, N., d'Alessandro, C., Doval, B. and
Castellengo, M. (2005). "Glottal open quotient in singing:
Measurements and correlation with laryngeal mechanisms, vocal
intensity, and fundamental frequency." J.Acoust.Soc.Am. 117:
1417-1430.
- Henrich, N., Smith, J. and Wolfe, J. (2011) "Vocal tract resonances in singing: strategies in
different vocal ranges" J.
Acoust. Soc. America. 129, 1024-1035.
- Hertegard, S., Larsson, H. and Granqvist, S. (2003).
"Vocal fold resonances at high and low pitch tuning", Proc.
Stockholm Music Acoustics Conference (SMAC 03), (R. Bresin, ed)
Stockholm, Sweden. 459–461.
- Hertegard, S., Gauffin, J., Sundberg, J. 1990.
“Open and covered singing as studied by means of fiberoptics,
inverse filtering and spectral analysis”. J. Voice, 4,
220-230.
- Hirano M, Vennard W, Ohala J. (1970).
"Regulation of register, pitch and intensity of voice". Folia
Phoniatrica, 22, 1-20.
- Hollien, H. and Michel, J. F. (1968). "Vocal fry
as a phonational register." J. Speech Hearing Research 11(3):
600-604.
- Itoh, T., Takeda, K. and Itakura, F. (2002)
"Acoustic analysis and recognition of whispered speech", in
Proceedings of ICASSP, vol.1, 389-392.
- Imagawa, H., Sakakibara, K.-I., Tayama, N.
(2003). "The effect of the hypopharyngeal and supra-glottic shapes
on the singing voice", in Proceedings of SMAC, Stockholm,
Sweden.
- Joliveau, E., Smith, J. and Wolfe, J. (2004a)
“Tuning of vocal tract resonances by sopranos”
Nature, 427, 116.
- Joliveau, E., Smith, J. and Wolfe, J. (2004b) "Vocal
tract resonances in singing: the soprano voice", J. Acoust. Soc. America, 116,
2434-2439.
- Johnson, K. (2003). Acoustic and Auditory
Phonetics. 2nd ed. Blackwell, Oxford.
- Kallail, K. J., and Emanuel, F. W. (1984a).
"Formant–frequency differences between isolated whispered and
phonated vowel samples produced by adult female subjects", J.
Speech Hear. Res. 27, 245–251.
- Kallail, K. J., and Emanuel, F. W. (1984b). "An
acoustic comparison of isolated whispered and phonated vowel
samples produced by adult male subjects", J. Phonetics 12,
175-186.
- Katz, B. & D'Alessandro, C. (2007).
Measurement of 3D Phoneme-Specific Radiation Patterns in Speech
and Singing, LIMSI.
http://rs2007.limsi.fr/index.php/PS:Page_14
- Kitamura, T., Honda, K. and Takemoto, H.,
(2005). "Individual variation of the hypopharyngeal cavities and
its acoustic effects", Acoustic Science and Technology, 26(1):
16-26.
- Klatt, D. H. and Klatt, L. C. (1990) "Analysis,
synthesis, and perception of voice quality variations among female
and male talkers", J.Acoust.Soc.Am., 87, 820-857.
- Kob, M. & Jers, H. (1999). Directivity
measurement of a singer. J.Acoust.Soc.Am., 105, 1003.
- Kob, M. (2003) “Analysis and modelling of
overtone singing in the sygyt style” Appl. Acoust., 65,
1249-1259.
- Kob, M. Henrich, N. Howard, D., Herzel, H.,
Tokuda, I. and Wolfe, J. "Analysing and understanding the singing
voice: recent progress and open questions" Current Bioinformatics.
In press.
- Leino, T. (1993). Long-term average spectrum
study on speaking voice quality in male actors. Proceedings of
SMAC, Stockholm, Sweden, 206-210.
- Lieberman, P., and Blumstein, S.E. (1988).
"Speech physiology, speech perception, and acoustic phonetics."
Cambridge University Press, Cambridge, UK.
- Lindblom, B. E. F., and Sundberg, J. E. F.
(1971). “Acoustical consequences of lip, tongue, jaw, and larynx
movement,” J. Acoust. Soc. Am. 50, 1166-1179.
- Matsuda, M. and Kasuya, H., (1999). "Acoustic
nature of the whisper", in Proceedings of Eurospeech'99,
133-136.
- Miller, R.L. (1959). "Nature of the Vocal Cord
Wave". J.Acoust.Soc.Am., 31, 6, 667-677.
- Miller, D.G. and Schutte, H.K. (1993). "Physical
definition of the ‘flageolet register’". J. Voice, 7, 3,
206-212.
- Miller DG. (2000). Registers in singing:
empirical and systematic studies in the theory of the singing
voice. Doctoral dissertation, University of Groningen.
- Nawka, T., Anders, L. C., Cebulla, M. &
Zurakowski, D. (1997). “The speaker's formant in male voices”, J.
Voice, 11, 422-428.
- Nearey, T. (1989). "Static, dynamic, and
relational properties in vowel perception". J.Acoust.Soc.Am., 85,
pp. 2088-2113.
- Novak, A. and Vokral, J. (1995). "Acoustic
parameters for the evaluation of voice of future voice
professionals." Folia Phoniatrica Logopedica 47: 279-285.
- Peterson, G.E., and Barney, H.L., ‘Control
methods used in a study of vowels’, J. Acoust. Soc. Am. 24,
175-184 (1952).
- Pinczower, R., Oates, J. (2005) “Vocal
Projection in Actors: The Long-Term Average Spectral Features That
Distinguish Comfortable Acting Voice From Voicing With Maximal
Projection in Male Actors”, J. Voice 19, 440-453.
- Rothenberg, M. “An interactive model for the
voice source” Quarterly Prog. Status Report, Dept Speech, Music
and Hearing, KTH, Stockholm, 22. 1-17 (1981).
- Rothenberg, M. (1973). "A new inverse-filtering
technique for deriving the glottal air flow waveform during
voicing". J.Acoust.Soc.Am., 53, 6, 1632-1645.
- Roubeau B, Castellengo M, Bodin P, Ragot M.
(2004). "Laryngeal registers as shown in the voice range profile".
Folia Phoniatrica Logopaedica, 56, 5, 321-33.
- Scherer, R.C. (1991). "Physiology of phonation:
A review of basic mechanics". Phonosurgery: Assessment and
surgical management of voice. 77-93.
- Smith J., Henrich N., Wolfe J. (2007)
“Resonance tuning in singing”, 19th International Congress on
Acoustics, Madrid, Spain, Sept. 2007.
- Smits, R., ten Bosch, L., and Collier, R.
(1996). "Evaluation of various sets of acoustic cues for the
perception of prevocalic stop consonants. I. Perception
experiment". J.Acoust.Soc.Am., 100, 3852-3864.
- Steinhauer, K.M., Rekart, D.M. and Keaten, J.
(1992). “Nasality in modal speech and twang qualities:
Physiologic, acoustic, and perceptual differences”,
J.Acoust.Soc.Am., 92, p. 2340.
- Stevens, K.N. (1999). Acoustic Phonetics. MIT
Press, Cambridge, MA.
- Stone, R., Cleveland, T., Sundberg, J., Prokop,
J. (2003). “Aerodynamic and acoustical measures of speech,
operatic, and broadway vocal styles in a professional female
singer.” J. Voice, 17, 283-297.
- Sundberg, J. (1974) “Articulatory interpretation
of the ‘singing formant’,” J.Acoust.Soc.Am. 55, 838-844.
- Sundberg, J., Gramming, P. and Lovetri, J.
(1993) “Comparisons of pharynx, source, formant, and pressure
characteristics in operatic and musical theatre singing”, J.
Voice, 7, 301-310.
- Sundberg, J. (2001), ‘Level and centre frequency
of the singer’s formant’, J. Voice 15, 176-186.
- Sundberg, J., and Skoog, J. (1997) “Dependence
of jaw opening on pitch and vowel in singers,” J. Voice 11,
301-306.
- Sundberg, J. (1970). "Formant structure and
articulation of spoken and sung vowels." Folia Phoniatrica (Basel)
22(1): 28-48.
- Svec, J., Schutte, H.K. and Miller, D.G. (1999).
"On pitch jumps between chest and falsetto registers in voice:
Data from living and excised human larynges".
J.Acoust.Soc.Am., 106, 3, 1523-1531.
- Svec, J., Schutte, H.K. (1996).
"Videokymography: High-speed line scanning of vocal fold
vibration". J. Voice, 10, 2, 201-205.
- Swerdlin, Y., Smith, J. and Wolfe, J. (2010) "The
effect of whisper and creak vocal mechanisms on vocal tract
resonances" J. Acoust. Soc. America. 127,
2590-2598.
- Takemoto, H., Adachi, S., Kitamura, T.,
Mokhtari, P., Honda, K. (2006). "Acoustic roles of the laryngeal
cavity in vocal tract resonance", J.Acoust.Soc.Am., 120:
2228-2238.
- Titze, I. (2001). “Acoustic Interpretation of
Resonant Voice”, J. Voice 15, 519-528.
- Titze, I.R., Bergan, C.C, Hunter, E.J. and
Story, B. (2003). Source and filter adjustments affecting the
perception of the vocal qualities twang and yawn. Logopedics
Phoniatrics Vocology 28: 147-155.
- Van Den Berg, J. (1958). "Myoelastic-aerodynamic
theory of voice production". The Journal of Speech Language and
Hearing Research, 1, 3, 227-244.
- Van Den Berg, J., Zantema, J.T., Doornenbal, P. Jr. (1957).
“On the Air Resistance and the Bernoulli Effect of the Human
Larynx”. J.Acoust.Soc.Am., 29, 5, 626-631.
- Vurma, A., Ross, J. (2002). “Where Is a Singer's
Voice if It Is Placed “Forward”?”, J. Voice, 16, 383-391.
- Weiss, R., Brown, W.S., Jr. and Morris, J. (2001).
"Singer's Formant in Sopranos: Fact or Fiction?" J. Voice, 15,
457-468.
- Wolfe, J. and Smith, J. (2008) "Acoustical
coupling between lip valves and vocal folds" Acoust. Australia,
26, 23-27.
Links