Towards the applicability of voice quality in forensic phonetics

Marianela Fernández Trinidad

doi:10.3989/loquens.2022.e093

Authors

Marianela Fernández Trinidad Universidad Complutense de Madrid https://orcid.org/0000-0002-0087-0829


Keywords:

Voice disguise, laryngeal voice quality, forensic phonetics, applicability

Abstract

Voice quality derived from long-term laryngeal settings stands out as a potentially individualizing trait of speakers, which gives it an advantage over other phonetic parameters used in forensic linguistics. However, its analysis immediately runs into a methodological difficulty stemming from its inherently multidimensional nature. This is its main disadvantage and the fundamental reason why it is not always considered in the traditional approach to comparing speakers for identification purposes. Drawing on an experiment with voices disguised by means of falsetto, this study shows that it is possible to work with a reduced set of the laryngeal features responsible for voice quality, which facilitates its interpretation and explanation, a critical issue for forensic practice.

