Beyond the average: embracing speaker individuality in the dynamic modeling of the acoustic-articulatory relationship

Authors

Lins Machado, C., & He, L.

DOI:

https://doi.org/10.3989/loquens.2023.e103

Keywords:

individual differences, continuous-time modeling, formants, tongue kinematics

Abstract


This paper explores the acoustic-articulatory relationship while considering individual differences in speech production. We aimed to determine whether there is a causal relationship between tongue movements and the contours of the first and second formant frequencies (F1 and F2). To this end, we employed a hierarchical Bayesian continuous-time dynamic model, which allows for a more direct connection between the measured acoustic and articulatory variables and theories that involve dynamicity. The results show predictive tendencies for both formants: anteroposterior and vertical tongue movements may predict changes in F1, with rising predicting an increase and retraction a decrease, while tongue fronting and tongue height inversely predict F2. Further, the modeled individual differences showed similar global tendencies, except for the rate of change of F2. Overall, this study provides insights into the relationship between tongue articulatory variables and formant contours while accounting for between-speaker variability.

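To make the idea of a continuous-time dynamic model more concrete, the following is a minimal sketch in Python of how a drift matrix in such a model maps onto lagged effects for a chosen time interval, using the standard relation that the discrete-time effect matrix equals the matrix exponential of the drift matrix times the interval. It is not the authors' implementation: the two-variable setup (one tongue coordinate, one formant), the numerical values in `DRIFT`, and the helper names `expm_taylor`, `discrete_effects`, and `simulate` are all illustrative assumptions, and the sketch omits the stochastic noise, measurement model, and hierarchical (between-speaker) priors that a full hierarchical Bayesian model would include.

```python
import numpy as np


def expm_taylor(matrix, terms=30):
    """Matrix exponential via a truncated Taylor series.

    Adequate for the small, well-scaled matrices used in this sketch.
    """
    size = matrix.shape[0]
    result = np.eye(size)
    term = np.eye(size)
    for k in range(1, terms):
        term = term @ matrix / k
        result = result + term
    return result


# Hypothetical drift matrix (NOT estimates from the paper).
# Row/column order: [tongue coordinate, formant].
# Diagonal entries are auto-effects (negative means the process decays back
# toward its baseline); DRIFT[1, 0] is the cross-effect of the tongue
# coordinate on the rate of change of the formant.
DRIFT = np.array([[-1.0, 0.0],
                  [0.5, -2.0]])


def discrete_effects(drift, dt):
    """Lagged effects implied by the drift matrix over an interval dt."""
    return expm_taylor(drift * dt)


def simulate(drift, x0, dt, n_steps):
    """Deterministic trajectory (no system noise) sampled every dt."""
    step = discrete_effects(drift, dt)
    states = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        states.append(step @ states[-1])
    return np.array(states)


if __name__ == "__main__":
    # The cross-lagged effect depends on the interval length.
    for dt in (0.1, 0.5, 1.0):
        print(dt, discrete_effects(DRIFT, dt)[1, 0])
    # Start with a displaced tongue coordinate and a formant at baseline.
    print(simulate(DRIFT, [1.0, 0.0], dt=0.1, n_steps=5))
```

Because the lagged effect is derived from the drift matrix rather than estimated for one fixed sampling interval, the same underlying parameters can describe data sampled at different rates, which is one reason continuous-time formulations are attractive for kinematic and formant trajectories.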
Published

2023-12-30

How to Cite

Lins Machado, C., & He, L. (2023). Beyond the average: embracing speaker individuality in the dynamic modeling of the acoustic-articulatory relationship. Loquens, 10(1-2), e103. https://doi.org/10.3989/loquens.2023.e103

Issue

Section

Articles