Beyond the average: embracing speaker individuality in the dynamic modeling of the acoustic-articulatory relationship
DOI:
https://doi.org/10.3989/loquens.2023.e103
Keywords:
individual differences, continuous-time modeling, formants, tongue kinematics
Abstract
This paper explores the acoustic-articulatory relationship while considering individual differences in speech production. We aimed to determine whether there is a causal relationship between tongue movements and the contours of the first and second formant frequencies (F1 and F2), employing a hierarchical Bayesian continuous-time dynamic model, which allows for a more direct connection between the measured acoustic and articulatory variables and theories involving dynamicity. The results show predictive tendencies for both formants: anteroposterior and vertical tongue movements may predict changes in F1, with rising predicting an increase and retraction a decrease, while tongue fronting and tongue height inversely predict F2. Further, the modeled individual differences showed similar global tendencies, except for the rate of change of F2. Overall, this study provides valuable insights into the relationship between tongue articulatory variables and formant contours, while accounting for between-speaker variability.
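As a rough illustration of what a continuous-time dynamic model implies, the sketch below converts a continuous-time drift matrix into the lagged effects expected over a chosen time interval, using the matrix exponential. It is a minimal sketch under stated assumptions, not the model fitted in the paper: the two-variable system (tongue height and F1), the drift values, the time intervals, and the function names are all hypothetical and chosen only for demonstration.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=30):
    """Matrix exponential via a truncated Taylor series with scaling and squaring."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Scale the matrix down so the Taylor series converges quickly.
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms):
        # term now holds scaled**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    # Undo the scaling by repeated squaring.
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def discrete_effects(drift, dt):
    """Lagged effects implied by a drift matrix over an interval of length dt."""
    return mat_exp([[x * dt for x in row] for row in drift])


if __name__ == "__main__":
    # Hypothetical drift matrix; state order is [tongue_height, F1].
    # Row = affected variable, column = predicting variable.
    # Diagonal: self-feedback (negative = returns toward baseline).
    # Off-diagonal: cross-effect of one variable on the other's rate of change.
    drift = [[-2.0, 0.0],
             [1.5, -3.0]]
    for dt in (0.01, 0.05, 0.1):  # hypothetical intervals, in arbitrary time units
        phi = discrete_effects(drift, dt)
        print(f"dt={dt}: effect of tongue_height on later F1 = {phi[1][0]:.4f}")
```

The point of the sketch is that a single set of continuous-time parameters yields different lagged effects for different intervals, which is why such models do not depend on one fixed sampling rate.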
License
Copyright (c) 2023 Consejo Superior de Investigaciones Científicas (CSIC)

This work is licensed under a Creative Commons Attribution 4.0 International License.
© CSIC. Manuscripts published in both the print and online versions of this journal are the property of the Consejo Superior de Investigaciones Científicas, and quoting this source is a requirement for any partial or full reproduction.
All contents of this electronic edition, except where otherwise noted, are distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. You may read the basic information and the legal text of the licence. The indication of the CC BY 4.0 licence must be expressly stated in this way when necessary.
Self-archiving in repositories, personal webpages or similar, of any version other than the final version of the work produced by the publisher, is not allowed.
Funding data
Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
Grant numbers PZ00P1_193328