Automatic speaker recognition with Spanish siblings: twin (monozygotic and dizygotic) and non-twin brothers
DOI:
https://doi.org/10.3989/loquens.2015.021
Keywords:
forensic phonetics, twins, automatic recognition, Spanish
Abstract
We used the automatic speaker recognition system Batvox™ (version 4.1) with a population of male speakers comprising 24 monozygotic twins, 10 dizygotic twins, 8 non-twin brothers, and 12 unrelated speakers (aged 18 to 52, with Central Peninsular Spanish as their native language). Because the cepstral parameters on which Batvox™ is based depend largely on the anatomical and physiological foundations of the speaker's vocal tract, we hypothesized that they should be genetically influenced. This hypothesis was corroborated: the similarity coefficients yielded by the automatic system decrease in exactly the same direction as the degree of kinship of the speaker pairs, that is, monozygotic twins, dizygotic twins, non-twin brothers, and unrelated speakers. In other words, monozygotic twins obtained higher values than dizygotic twins; these, in turn, obtained higher values than non-twin brothers; and these, finally, obtained higher values than unrelated speakers. These results suggest that the parameters on which this recognition system is based are largely conditioned by genetic factors and are therefore useful and robust for comparing the questioned and known voice samples found in a typical forensic case. Moreover, the EER (Equal Error Rate) of 9% obtained in comparisons involving only monozygotic twins is very similar to the value found in earlier studies with German monozygotic twins, such as Künzel (2010): an EER of 11%.
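As an illustration of the EER figure quoted in the abstract, the following is a minimal sketch of how an equal error rate can be estimated from two sets of similarity scores: same-speaker comparisons and different-speaker comparisons (for example, a speaker against his co-twin). The function name `equal_error_rate` and the scores are hypothetical and chosen only for the example; this is not the procedure implemented in Batvox™, and none of the numbers come from the study.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    target_scores: Sequence[float],
    nontarget_scores: Sequence[float],
) -> Tuple[float, float]:
    """Estimate the EER by sweeping every observed score as a threshold.

    target_scores: similarity scores from same-speaker comparisons.
    nontarget_scores: similarity scores from different-speaker comparisons.
    Returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = float("inf")
    best = (1.0, 0.0)
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scoring below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scoring at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            best = ((far + frr) / 2, threshold)
    return best


# Hypothetical scores, invented for illustration only.
same_speaker = [0.9, 0.8, 0.7, 0.3]
different_speaker = [0.1, 0.2, 0.4, 0.75]
eer, threshold = equal_error_rate(same_speaker, different_speaker)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With so few trials the estimate is coarse. Evaluations with many trials typically interpolate between adjacent thresholds instead of taking the nearest observed one.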
References
Agnitio Voice Biometrics (2013). Batvox 4.1 Basic User Manual [Computer software].
Ariyaeeinia, A., Morrison, C., Malegaonkar, A., & Black, S. (2008). A test of the effectiveness of speaker verification for differentiating between identical twins. Science & Justice, 48(4), 182–186. http://dx.doi.org/10.1016/j.scijus.2008.02.002 PMid:19192680
Bimbot, F., Bonastre, J.-F., Fredouille, C., Gravier, G., Magrin-Chagnolleau, I., Meignier, S., . . . & Reynolds, D. A. (2004). A tutorial on text-independent speaker verification. EURASIP Journal on Advances in Signal Processing, 4, 1–22. http://dx.doi.org/10.1155/s1110865704310024
Brümmer, N., & du Preez, J. (2006). Application-independent evaluation of speaker detection. Computer Speech & Language, 20(2–3), 230–275. http://dx.doi.org/10.1016/j.csl.2005.08.001
Campbell, W. M., Campbell, J. P., Reynolds, D. A., Singer, E., & Torres-Carrasquillo, P. A. (2006). Support vector machines for speaker and language recognition. Computer Speech & Language, 20(2–3), 210–229. http://dx.doi.org/10.1016/j.csl.2005.06.003
Charlet, D., & Lecha, V. P. (2007). Voice biometrics within the family: Trust, privacy and personalisation. In J. Filipe, H. Coelhas, & M. Saramago (Eds.), E-business and telecommunication networks: Second International Conference, ICETE 2005, Vol. 3 (pp. 93–100). Berlin: Springer. http://dx.doi.org/10.1007/978-3-540-75993-5_8
Debruyne, F., Decoster, W., Van Gijsel, A., & Vercammen, J. (2002). Speaking fundamental frequency in monozygotic and dizygotic twins. Journal of Voice, 16(4), 466–471. http://dx.doi.org/10.1016/S0892-1997(02)00121-2
Del Abril Alonso, Á., Ambrosio Flores, E., de Blas Calleja, M. d. R., Caminero Gómez, Á., García Lecumberri, C., & de Pablo González, J. M. (2009). Fundamentos de psicobiología. Madrid: Sanz y Torres.
Doddington, G., Liggett, W., Martin, A., Przybocki, M., & Reynolds, D. (1998). SHEEP, GOATS, LAMBS and WOLVES: A statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation. Proceedings of the International Conference on Spoken Language (ICSLP '98), paper 0608.
Drygajlo, A. (2007). Forensic automatic speaker recognition [Exploratory DSP]. IEEE Signal Processing Magazine, 24(2), 132–135. http://dx.doi.org/10.1109/MSP.2007.323278
Feiser, H. S. (2009). Acoustic similarities and differences in the voices of same-sex siblings. Paper presented at the 18th Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Cambridge, UK. PMid:19633830
Felson, J. (2014). What can we learn from twin studies? A comprehensive evaluation of the equal environments assumption. Social Science Research, 43, 184–199. http://dx.doi.org/10.1016/j.ssresearch.2013.10.004 PMid:24267761
Forrai, G., & Gordos, G. (1983). A new acoustic method for the discrimination of monozygotic and dizygotic twins. Acta paediatrica Academiae Scientiarum Hungarica, 24(4), 315–322.
Foulkes, P., & French, J. P. (2012). Forensic speaker comparison: A linguistic–acoustic perspective. In P. Tiersma & L. M. Solan (Eds.), Oxford handbook of language and law, 557–572. Oxford: Oxford University Press. http://dx.doi.org/10.1093/oxfordhb/9780199572120.013.0041
Galton, F. (1875). The history of twins, as a criterion of the relative powers of nature and nurture (Rev. ed.). Journal of the Anthropological Institute of Great Britain and Ireland, 5, 391–406.
Giles, H., Coupland, J., & Coupland, N. (1991). Contexts of accommodation: Developments in applied sociolinguistics. Cambridge: Cambridge University Press. http://dx.doi.org/10.1017/CBO9780511663673
Gómez-Vilda, P., Fernández-Baillo, R., Nieto, A., Díaz, F., Fernández-Camacho, F. J., Rodellar, V., . . . & Martínez, R. (2007). Evaluation of voice pathology based on the estimation of vocal fold biomechanical parameters. Journal of Voice, 21(4), 450–476. http://dx.doi.org/10.1016/j.jvoice.2006.01.008 PMid:16549321
Gonzalez-Rodriguez, J., Fierrez-Aguilar, J., & Ortega-Garcia, J. (2003). Forensic identification reporting using automatic speaker recognition systems. Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), 2, 93–96. http://dx.doi.org/10.1109/icassp.2003.1202302
Gonzalez-Rodriguez, J., Rose, P., Ramos, D., Toledano, D. T., & Ortega-Garcia, J. (2007). Emulating DNA: Rigorous quantification of evidential weight in transparent and testable forensic speaker recognition. IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 2104–2115. http://dx.doi.org/10.1109/TASL.2007.902747
Homayounpour, M. M., & Chollet, G. (1995). Discrimination of voices of twins and siblings for speaker verification. In Proceedings of the 4th European Conference on Speech Communication and Technology (EUROSPEECH 1995), 345–348.
Jain, A. K., Prabhakar, S., & Pankanti, S. (2002). On the similarity of identical twin fingerprints. Pattern Recognition, 35(11), 2653–2663. http://dx.doi.org/10.1016/S0031-3203(01)00218-7
Jessen, M. (2008). Forensic phonetics. Language and Linguistics Compass, 2(4), 671–711. http://dx.doi.org/10.1111/j.1749-818X.2008.00066.x
Kenny, P., Boulianne, G., Ouellet, P., & Dumouchel, P. (2005). Factor analysis simplified. Proceedings of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 1, 637–640. http://dx.doi.org/10.1109/ICASSP.2005.1415194
Kim, K. (2010). Automatic speaker identification of Korean male twins. Paper presented at the 19th Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Trier.
Kinnunen, T., & Li, H. (2010). An overview of text-independent speaker recognition: From features to supervectors. Speech Communication, 52(1), 12–40. http://dx.doi.org/10.1016/j.specom.2009.08.009
Kong, A. W. K., Zhang, D., & Lu, G. (2006). A study of identical twins' palmprints for personal verification. Pattern Recognition, 39(11), 2149–2156. http://dx.doi.org/10.1016/j.patcog.2006.04.035
Künzel, H. J. (1994). Current approaches to forensic speaker recognition. In Proceedings of the ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, 135–141.
Künzel, H. J. (2010). Automatic speaker recognition of identical twins. International Journal of Speech, Language and the Law, 17(2), 251–277.
Künzel, H. J., & Alexander, P. (2014). Forensic automatic speaker recognition with degraded and enhanced speech. Journal of the Audio Engineering Society, 62(4), 244–253. http://dx.doi.org/10.17743/jaes.2014.0014
Labov, W. (1972). The transformation of experience in the narrative syntax. In W. Labov, Language in the inner city: Studies in the Black English Vernacular (pp. 354–396). Philadelphia, PA: University of Pennsylvania Press.
Loakes, D. (2006). A forensic phonetic investigation into the speech patterns of identical and non-identical twins (Doctoral dissertation). University of Melbourne.
Martino, D., Loke, Y. J., Gordon, L., Ollikainen, M., Cruickshank, M. N., Saffery, R., & Craig, J. M. (2013). Longitudinal, genome-scale analysis of DNA methylation in twins from birth to 18 months of age reveals rapid epigenetic change in early life and pair-specific effects of discordance. Genome Biology, 14(5), R42. http://dx.doi.org/10.1186/gb-2013-14-5-r42 PMid:23697701 PMCid:PMC4054827
Meuwly, D. (2001). Reconnaissance de locuteurs en sciences forensiques: l'apport d'une approche automatique (PhD dissertation). University of Lausanne.
Morrison, G. S. (2010). Forensic voice comparison. In I. Freckelton & H. Selby (Eds.), Expert evidence (Chapter 99). Sydney: Thomson Reuters.
Morrison, G. S., & Kinoshita, Y. (2008). Automatic-type calibration of traditionally derived likelihood ratios: Forensic analysis of Australian English /o/ formant trajectories. Proceedings of the 9th INTERSPEECH Conference, 1501–1504.
Nolan, F. (1983). The phonetic bases of speaker recognition. Cambridge: Cambridge University Press.
Nolan, F. (1997). Speaker recognition and forensic phonetics. In W. J. Hardcastle & J. Laver (Eds.), The handbook of phonetic sciences (pp. 744–767). Oxford: Blackwell.
Nolan, F., & Oh, T. (1996). Identical twins, different voices. International Journal of Speech Language and the Law, 3(1), 39–49. http://dx.doi.org/10.1558/ijsll.v3i1.39
Pardo, J. S. (2006). On phonetic convergence during conversational interaction. The Journal of the Acoustical Society of America, 119(4), 2382–2393. http://dx.doi.org/10.1121/1.2178720 PMid:16642851
Philips, T. (2008). The role of methylation in gene expression. Nature Education, 1(1), 116.
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–190. http://dx.doi.org/10.1017/S0140525X04000056 PMid:15595235
Przybocki, M. A., Martin, A. F., & Le, A. N. (2007). NIST speaker recognition evaluations utilizing the Mixer corpora—2004, 2005, 2006. IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 1951–1959. http://dx.doi.org/10.1109/TASL.2007.902489
Przybyla, B. D., Horii, Y., & Crawford, M. H. (1992). Vocal fundamental frequency in a twin sample: Looking for a genetic effect. Journal of Voice, 6(3), 261–266. http://dx.doi.org/10.1016/S0892-1997(05)80151-1
Ramos, D. (2007). Forensic evaluation of the evidence using automatic speaker recognition systems (Doctoral dissertation). Universidad Autónoma de Madrid. Retrieved from http://hdl.handle.net/10486/1774.
Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 10(1–3), 19–41. http://dx.doi.org/10.1006/dspr.1999.0361
Reynolds, D. A., & Rose, R. C. (1995). Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing, 3(1), 72–83. http://dx.doi.org/10.1109/89.365379
Rose, P. (2002). Forensic speaker identification. London: Taylor & Francis. http://dx.doi.org/10.1201/9780203166369
Rose, P. (2006). Technical forensic speaker recognition: Evaluation, types and testing of evidence. Computer Speech & Language, 20(2–3), 159–191. http://dx.doi.org/10.1016/j.csl.2005.07.003
San Segundo, E. (2010a). Parametric representations of the formant trajectories of Spanish vocalic sequences for likelihood-ratio-based forensic voice comparison. The Journal of the Acoustical Society of America, 128(4), 2394. http://dx.doi.org/10.1121/1.3508586
San Segundo, E. (2010b). Variación inter e intralocutor: Parámetros acústicos segmentales que caracterizan fonéticamente a tres hermanos. Interlingüística, 21, 352–363.
San Segundo, E. (2012). Glottal source parameters for forensic voice comparison: An approach to voice quality in twins' voices. Paper presented at the 21st Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Santander, Spain.
San Segundo, E. (2013a). Guess who is laughing: A perceptual experiment on twin and non-twin siblings' identification. Paper presented at the 31st International Conference AESLA (Asociación Española de Lingüística Aplicada). San Cristóbal de La Laguna: Universidad de La Laguna.
San Segundo, E. (2013b). A phonetic corpus of Spanish male twins and siblings: Corpus design and forensic application. Procedia - Social and Behavioral Sciences, 95, 59–67. http://dx.doi.org/10.1016/j.sbspro.2013.10.622
San Segundo, E. (2014). Forensic speaker comparison of Spanish twins and non-twin siblings: A phonetic-acoustic analysis of formant trajectories in vocalic sequences, glottal source parameters and cepstral characteristics (PhD thesis). Consejo Superior de Investigaciones Científicas-Universidad Internacional Menéndez Pelayo, Spain.
San Segundo, E. (2015). Forensic speaker comparison of Spanish twins and non-twin siblings: A phonetic-acoustic analysis of formant trajectories in vocalic sequences, glottal source parameters and cepstral characteristics (Thesis abstract). International Journal of Speech Language and the Law, 22(2), 249–253. http://dx.doi.org/10.1558/ijsll.v22i2.28821
San Segundo, E., & Gómez-Vilda, P. (2013). Voice biometrical match of twin and non-twin siblings. In C. Manfredi (Ed.), Models and analysis of vocal emissions for biomedical applications: 8th International Workshop, Firenze, Italy, 2013 (pp. 253–256). Retrieved from http://digital.casalini.it/9788866554707.
San Segundo, E., & Gómez-Vilda, P. (2015). Evaluating the forensic importance of glottal source features through the voice analysis of twins and non-twin siblings. Language and Law/Linguagem e Direito, 1(2), 22–41.
Sataloff, R. T. (1995). Genetics of the voice. Journal of Voice, 9(1), 16–19. http://dx.doi.org/10.1016/S0892-1997(05)80218-8
Scheffer, N., Bonastre, J.-F., Ghio, A., & Teston, B. (2004). Gémellité et reconnaissance automatique du locuteur. Actes des XXV Journées d'Étude sur la Parole (JEP), 445–448.
Segal, N. L. (1993). Implications of twin research for legal issues involving young twins. Law and Human Behavior, 17(1), 43–58. http://dx.doi.org/10.1007/BF01044536
Srihari, S., Huang, C., & Srinivasan, H. (2008). On the discriminability of the handwriting of twins. Journal of Forensic Sciences, 53(2), 430–446. http://dx.doi.org/10.1111/j.1556-4029.2008.00682.x PMid:18366576
Stromswold, K. (2006). Why aren't identical twins linguistically identical? Genetic, prenatal and postnatal factors. Cognition, 101(2), 333–384. http://dx.doi.org/10.1016/j.cognition.2006.04.007 PMid:16797523
Tomblin, J. B., & Buckwalter, P. P. (1998). Heritability of poor language achievement among twins. Journal of Speech, Language, and Hearing Research, 41, 188–189. http://dx.doi.org/10.1044/jslhr.4101.188
Trouvain, J., & Truong, K. P. (2012). Convergence of laughter in conversational speech: Effects of quantity, temporal alignment and imitation. Paper presented at the International Symposium on Imitation and Convergence in Speech, Aix-en-Provence, France. PMCid:PMC3382493
van Leeuwen, D. A., & Brümmer, N. (2007). An introduction to application-independent evaluation of speaker recognition systems. In C. Müller (Ed.), Speaker classification I: Fundamentals, features, and methods (pp. 330–353). Heidelberg: Springer-Verlag. http://dx.doi.org/10.1007/978-3-540-74200-5_19 PMid:17498209
Van Lierde, K. M., Vinck, B., De Ley, S., Clement, G., & Van Cauwenberge, P. (2005). Genetics of vocal quality characteristics in monozygotic twins: A multiparameter approach. Journal of Voice, 19(4), 511–518. http://dx.doi.org/10.1016/j.jvoice.2004.10.005 PMid:16301097
Weirich, M., & Lancia, L. (2011). Perceived auditory similarity and its acoustic correlates in twins and unrelated speakers. In Proceedings of the 17th International Congress of Phonetic Sciences (ICPhS 17-Hong Kong), 2118–2121.
Wolf, J. J. (1972). Efficient acoustic parameters for speaker recognition. The Journal of the Acoustical Society of America, 51(6B), 2044–2056. http://dx.doi.org/10.1121/1.1913065
License
Copyright 2015 Consejo Superior de Investigaciones Científicas (CSIC)

This work is licensed under a Creative Commons Attribution 4.0 International license.
© CSIC. Originals published in the print and electronic editions of this journal are the property of the Consejo Superior de Investigaciones Científicas, and the source must be cited in any partial or total reproduction.
Unless otherwise indicated, all content of the electronic edition is distributed under a "Creative Commons Attribution 4.0 International" (CC BY 4.0) use and distribution license. See the informative version and the legal text of the license. This must be expressly stated in this form whenever necessary.
Depositing any version other than the one published by the publisher in repositories, personal websites, or similar is not authorized.