Automatic speaker recognition of Spanish siblings: (monozygotic and dizygotic) twins and non-twin brothers
DOI:
https://doi.org/10.3989/loquens.2015.021
Keywords:
forensic phonetics, twins, siblings, automatic speaker recognition, Spanish
Abstract
The performance of the automatic speaker recognition (ASR) system Batvox™ (Version 4.1) was tested with a male population of 24 monozygotic (MZ) twins, 10 dizygotic (DZ) twins, 8 non-twin siblings and 12 unrelated speakers (aged 18–52, with Standard Peninsular Spanish as their mother tongue). Since the cepstral features on which this ASR system is based depend largely on anatomical–physiological foundations, we hypothesized that such features ought to be gene-dependent. Therefore, higher similarity values should be found in MZ twins (100% shared genes) than in DZ twins, in brothers (B) or in a reference population of unrelated speakers (US). Results corroborated the expected decreasing scale MZ > DZ > B > US: the similarity coefficients yielded by the automatic system decreased in the same order as the degree of kinship across the four speaker groups. This suggests that the system features are to a great extent genetically conditioned and that they are hence useful and robust for comparing speech samples of known and unknown origin, as found in legal cases. Furthermore, the 9.9% EER (Equal Error Rate) obtained when testing MZ pairs is close to the 11% EER reported by Künzel (2010) for German twins.
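As a reading aid for the EER figures quoted above, the following is a minimal sketch of how an equal error rate can be estimated from two sets of similarity scores. It is a generic illustration, not the procedure used by Batvox or by the authors: the function name `equal_error_rate`, the acceptance rule (score at or above the threshold) and the score values are all assumptions made up for the example.

```python
def equal_error_rate(same_speaker_scores, different_speaker_scores):
    """Return (eer, threshold) for two lists of similarity scores.

    Assumption: a comparison is accepted as "same speaker" when its
    score is greater than or equal to the threshold.
    """
    # Candidate thresholds: every distinct score observed in either set.
    thresholds = sorted(set(same_speaker_scores) | set(different_speaker_scores))
    best = None
    for t in thresholds:
        # False rejection rate: same-speaker comparisons scored below t.
        frr = sum(s < t for s in same_speaker_scores) / len(same_speaker_scores)
        # False acceptance rate: different-speaker comparisons scored at or above t.
        far = sum(s >= t for s in different_speaker_scores) / len(different_speaker_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, invented for illustration only.
same = [2.0, 3.0, 4.0, 5.0]       # same-speaker comparisons
different = [0.0, 1.0, 2.5, 3.5]  # different-speaker comparisons (e.g. twin pairs)

eer, threshold = equal_error_rate(same, different)
print(eer, threshold)  # prints: 0.25 3.0
```

With these invented scores, one of four same-speaker comparisons is rejected and one of four different-speaker comparisons is accepted at threshold 3.0, so both error rates equal 25%. The closer the scores of different speakers (such as MZ twins) lie to same-speaker scores, the higher this value becomes.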
References
Agnitio Voice Biometrics (2013). Batvox 4.1 Basic User Manual [Computer software].
Ariyaeeinia, A., Morrison, C., Malegaonkar, A., & Black, S. (2008). A test of the effectiveness of speaker verification for differentiating between identical twins. Science & Justice, 48(4), 182–186. http://dx.doi.org/10.1016/j.scijus.2008.02.002 PMid:19192680
Bimbot, F., Bonastre, J.-F., Fredouille, C., Gravier, G., Magrin-Chagnolleau, I., Meignier, S., . . . & Reynolds, D. A. (2004). A tutorial on text-independent speaker verification. EURASIP Journal on Advances in Signal Processing, 4, 1–22. http://dx.doi.org/10.1155/s1110865704310024
Brümmer, N., & du Preez, J. (2006). Application-independent evaluation of speaker detection. Computer Speech & Language, 20(2–3), 230–275. http://dx.doi.org/10.1016/j.csl.2005.08.001
Campbell, W. M., Campbell, J. P., Reynolds, D. A., Singer, E., & Torres-Carrasquillo, P. A. (2006). Support vector machines for speaker and language recognition. Computer Speech & Language, 20(2–3), 210–229. http://dx.doi.org/10.1016/j.csl.2005.06.003
Charlet, D., & Lecha, V. P. (2007). Voice biometrics within the family: Trust, privacy and personalisation. In J. Filipe, H. Coelhas, & M. Saramago (Eds.), E-business and telecommunication networks: Second International Conference, ICETE 2005, Vol. 3 (pp. 93–100). Berlin: Springer. http://dx.doi.org/10.1007/978-3-540-75993-5_8
Debruyne, F., Decoster, W., Van Gijsel, A., & Vercammen, J. (2002). Speaking fundamental frequency in monozygotic and dizygotic twins. Journal of Voice, 16(4), 466–471. http://dx.doi.org/10.1016/S0892-1997(02)00121-2
Del Abril Alonso, Á., Ambrosio Flores, E., de Blas Calleja, M. d. R., Caminero Gómez, Á., García Lecumberri, C., & de Pablo González, J. M. (2009). Fundamentos de psicobiología. Madrid: Sanz y Torres.
Doddington, G., Liggett, W., Martin, A., Przybocki, M., & Reynolds, D. (1998). SHEEP, GOATS, LAMBS and WOLVES: A statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation. Proceedings of the International Conference on Spoken Language (ICSLP '98), paper 0608.
Drygajlo, A. (2007). Forensic automatic speaker recognition [Exploratory DSP]. IEEE Signal Processing Magazine, 24(2), 132–135. http://dx.doi.org/10.1109/MSP.2007.323278
Feiser, H. S. (2009). Acoustic similarities and differences in the voices of same-sex siblings. Paper presented at the 18th Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Cambridge, UK. PMid:19633830
Felson, J. (2014). What can we learn from twin studies? A comprehensive evaluation of the equal environments assumption. Social Science Research, 43, 184–199. http://dx.doi.org/10.1016/j.ssresearch.2013.10.004 PMid:24267761
Forrai, G., & Gordos, G. (1983). A new acoustic method for the discrimination of monozygotic and dizygotic twins. Acta paediatrica Academiae Scientiarum Hungarica, 24(4), 315–322.
Foulkes, P., & French, J. P. (2012). Forensic speaker comparison: A linguistic–acoustic perspective. In P. Tiersma & L. M. Solan (Eds.), Oxford handbook of language and law (pp. 557–572). Oxford: Oxford University Press. http://dx.doi.org/10.1093/oxfordhb/9780199572120.013.0041
Galton, F. (1875). The history of twins, as a criterion of the relative powers of nature and nurture (Rev. ed.). Journal of the Anthropological Institute of Great Britain and Ireland, 5, 391–406.
Giles, H., Coupland, J., & Coupland, N. (1991). Contexts of accommodation: Developments in applied sociolinguistics. Cambridge: Cambridge University Press. http://dx.doi.org/10.1017/CBO9780511663673
Gómez-Vilda, P., Fernández-Baillo, R., Nieto, A., Díaz, F., Fernández-Camacho, F. J., Rodellar, V., . . . & Martínez, R. (2007). Evaluation of voice pathology based on the estimation of vocal fold biomechanical parameters. Journal of Voice, 21(4), 450–476. http://dx.doi.org/10.1016/j.jvoice.2006.01.008 PMid:16549321
Gonzalez-Rodriguez, J., Fierrez-Aguilar, J., & Ortega-Garcia, J. (2003). Forensic identification reporting using automatic speaker recognition systems. Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), 2, 93–96. http://dx.doi.org/10.1109/icassp.2003.1202302
Gonzalez-Rodriguez, J., Rose, P., Ramos, D., Toledano, D. T., & Ortega-Garcia, J. (2007). Emulating DNA: Rigorous quantification of evidential weight in transparent and testable forensic speaker recognition. IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 2104–2115. http://dx.doi.org/10.1109/TASL.2007.902747
Homayounpour, M. M., & Chollet, G. (1995). Discrimination of voices of twins and siblings for speaker verification. In Proceedings of the 4th European Conference on Speech Communication and Technology (EUROSPEECH 1995), 345–348.
Jain, A. K., Prabhakar, S., & Pankanti, S. (2002). On the similarity of identical twin fingerprints. Pattern Recognition, 35(11), 2653–2663. http://dx.doi.org/10.1016/S0031-3203(01)00218-7
Jessen, M. (2008). Forensic phonetics. Language and Linguistics Compass, 2(4), 671–711. http://dx.doi.org/10.1111/j.1749-818X.2008.00066.x
Kenny, P., Boulianne, G., Ouellet, P., & Dumouchel, P. (2005). Factor analysis simplified. Proceedings of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 1, 637–640. http://dx.doi.org/10.1109/ICASSP.2005.1415194
Kim, K. (2010). Automatic speaker identification of Korean male twins. Paper presented at the 19th Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Trier.
Kinnunen, T., & Li, H. (2010). An overview of text-independent speaker recognition: From features to supervectors. Speech Communication, 52(1), 12–40. http://dx.doi.org/10.1016/j.specom.2009.08.009
Kong, A. W. K., Zhang, D., & Lu, G. (2006). A study of identical twins' palmprints for personal verification. Pattern Recognition, 39(11), 2149–2156. http://dx.doi.org/10.1016/j.patcog.2006.04.035
Künzel, H. J. (1994). Current approaches to forensic speaker recognition. In Proceedings of the ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification, 135–141.
Künzel, H. J. (2010). Automatic speaker recognition of identical twins. International Journal of Speech, Language and the Law, 17(2), 251–277.
Künzel, H. J., & Alexander, P. (2014). Forensic automatic speaker recognition with degraded and enhanced speech. Journal of the Audio Engineering Society, 62(4), 244–253. http://dx.doi.org/10.17743/jaes.2014.0014
Labov, W. (1972). The transformation of experience in the narrative syntax. In W. Labov, Language in the inner city: Studies in the Black English Vernacular (pp. 354–396). Philadelphia, PA: University of Pennsylvania Press.
Loakes, D. (2006). A forensic phonetic investigation into the speech patterns of identical and non-identical twins (Doctoral dissertation). University of Melbourne.
Martino, D., Loke, Y. J., Gordon, L., Ollikainen, M., Cruickshank, M. N., Saffery, R., & Craig, J. M. (2013). Longitudinal, genome-scale analysis of DNA methylation in twins from birth to 18 months of age reveals rapid epigenetic change in early life and pair-specific effects of discordance. Genome Biology, 14(5), R42. http://dx.doi.org/10.1186/gb-2013-14-5-r42 PMid:23697701 PMCid:PMC4054827
Meuwly, D. (2001). Reconnaissance de locuteurs en sciences forensiques: l'apport d'une approche automatique (PhD dissertation). University of Lausanne.
Morrison, G. S. (2010). Forensic voice comparison. In I. Freckelton & H. Selby (Eds.), Expert evidence (Chapter 99). Sydney: Thomson Reuters.
Morrison, G. S., & Kinoshita, Y. (2008). Automatic-type calibration of traditionally derived likelihood ratios: Forensic analysis of Australian English /o/ formant trajectories. Proceedings of the 9th INTERSPEECH Conference, 1501–1504.
Nolan, F. (1983). The phonetic bases of speaker recognition. Cambridge: Cambridge University Press.
Nolan, F. (1997). Speaker recognition and forensic phonetics. In W. J. Hardcastle & J. Laver (Eds.), The handbook of phonetic sciences (pp. 744–767). Oxford: Blackwell.
Nolan, F., & Oh, T. (1996). Identical twins, different voices. International Journal of Speech Language and the Law, 3(1), 39–49. http://dx.doi.org/10.1558/ijsll.v3i1.39
Pardo, J. S. (2006). On phonetic convergence during conversational interaction. The Journal of the Acoustical Society of America, 119(4), 2382–2393. http://dx.doi.org/10.1121/1.2178720 PMid:16642851
Philips, T. (2008). The role of methylation in gene expression. Nature Education, 1(1), 116.
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–190. http://dx.doi.org/10.1017/S0140525X04000056 PMid:15595235
Przybocki, M. A., Martin, A. F., & Le, A. N. (2007). NIST speaker recognition evaluations utilizing the Mixer corpora—2004, 2005, 2006. IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 1951–1959. http://dx.doi.org/10.1109/TASL.2007.902489
Przybyla, B. D., Horii, Y., & Crawford, M. H. (1992). Vocal fundamental frequency in a twin sample: Looking for a genetic effect. Journal of Voice, 6(3), 261–266. http://dx.doi.org/10.1016/S0892-1997(05)80151-1
Ramos, D. (2007). Forensic evaluation of the evidence using automatic speaker recognition systems (Doctoral dissertation). Universidad Autónoma de Madrid. Retrieved from http://hdl.handle.net/10486/1774.
Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 10(1–3), 19–41. http://dx.doi.org/10.1006/dspr.1999.0361
Reynolds, D. A., & Rose, R. C. (1995). Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing, 3(1), 72–83. http://dx.doi.org/10.1109/89.365379
Rose, P. (2002). Forensic speaker identification. London: Taylor & Francis. http://dx.doi.org/10.1201/9780203166369
Rose, P. (2006). Technical forensic speaker recognition: Evaluation, types and testing of evidence. Computer Speech & Language, 20(2–3), 159–191. http://dx.doi.org/10.1016/j.csl.2005.07.003
San Segundo, E. (2010a). Parametric representations of the formant trajectories of Spanish vocalic sequences for likelihood-ratio-based forensic voice comparison. The Journal of the Acoustical Society of America, 128(4), 2394. http://dx.doi.org/10.1121/1.3508586
San Segundo, E. (2010b). Variación inter e intralocutor: Parámetros acústicos segmentales que caracterizan fonéticamente a tres hermanos. Interlingüística, 21, 352–363.
San Segundo, E. (2012). Glottal source parameters for forensic voice comparison: An approach to voice quality in twins' voices. Paper presented at the 21st Annual Conference of the International Association for Forensic Phonetics and Acoustics (IAFPA), Santander, Spain.
San Segundo, E. (2013a). Guess who is laughing: A perceptual experiment on twin and non-twin siblings' identification. Paper presented at the 31st International Conference AESLA (Asociación Española de Lingüística Aplicada). San Cristóbal de La Laguna: Universidad de La Laguna.
San Segundo, E. (2013b). A phonetic corpus of Spanish male twins and siblings: Corpus design and forensic application. Procedia - Social and Behavioral Sciences, 95, 59–67. http://dx.doi.org/10.1016/j.sbspro.2013.10.622
San Segundo, E. (2014). Forensic speaker comparison of Spanish twins and non-twin siblings: A phonetic-acoustic analysis of formant trajectories in vocalic sequences, glottal source parameters and cepstral characteristics (PhD thesis). Consejo Superior de Investigaciones Científicas-Universidad Internacional Menéndez Pelayo, Spain.
San Segundo, E. (2015). Forensic speaker comparison of Spanish twins and non-twin siblings: A phonetic-acoustic analysis of formant trajectories in vocalic sequences, glottal source parameters and cepstral characteristics (Thesis abstract). International Journal of Speech Language and the Law, 22(2), 249–253. http://dx.doi.org/10.1558/ijsll.v22i2.28821
San Segundo, E., & Gómez-Vilda, P. (2013). Voice biometrical match of twin and non-twin siblings. In C. Manfredi (Ed.), Models and analysis of vocal emissions for biomedical applications: 8th International Workshop, Firenze, Italy, 2013 (pp. 253–256). Retrieved from http://digital.casalini.it/9788866554707.
San Segundo, E., & Gómez-Vilda, P. (2015). Evaluating the forensic importance of glottal source features through the voice analysis of twins and non-twin siblings. Language and Law/Linguagem e Direito, 1(2), 22–41.
Sataloff, R. T. (1995). Genetics of the voice. Journal of Voice, 9(1), 16–19. http://dx.doi.org/10.1016/S0892-1997(05)80218-8
Scheffer, N., Bonastre, J.-F., Ghio, A., & Teston, B. (2004). Gémellité et reconnaissance automatique du locuteur. Actes des XXV Journées d'Étude sur la Parole (JEP), 445–448.
Segal, N. L. (1993). Implications of twin research for legal issues involving young twins. Law and Human Behavior, 17(1), 43–58. http://dx.doi.org/10.1007/BF01044536
Srihari, S., Huang, C., & Srinivasan, H. (2008). On the discriminability of the handwriting of twins. Journal of Forensic Sciences, 53(2), 430–446. http://dx.doi.org/10.1111/j.1556-4029.2008.00682.x PMid:18366576
Stromswold, K. (2006). Why aren't identical twins linguistically identical? Genetic, prenatal and postnatal factors. Cognition, 101(2), 333–384. http://dx.doi.org/10.1016/j.cognition.2006.04.007 PMid:16797523
Tomblin, J. B., & Buckwalter, P. P. (1998). Heritability of poor language achievement among twins. Journal of Speech, Language, and Hearing Research, 41, 188–189. http://dx.doi.org/10.1044/jslhr.4101.188
Trouvain, J., & Truong, K. P. (2012). Convergence of laughter in conversational speech: Effects of quantity, temporal alignment and imitation. Paper presented at the International Symposium on Imitation and Convergence in Speech, Aix-en-Provence, France. PMCid:PMC3382493
van Leeuwen, D. A., & Brümmer, N. (2007). An introduction to application-independent evaluation of speaker recognition systems. In C. Müller (Ed.), Speaker classification I: Fundamentals, features, and methods (pp. 330–353). Heidelberg: Springer–Verlag. http://dx.doi.org/10.1007/978-3-540-74200-5_19 PMid:17498209
Van Lierde, K. M., Vinck, B., De Ley, S., Clement, G., & Van Cauwenberge, P. (2005). Genetics of vocal quality characteristics in monozygotic twins: A multiparameter approach. Journal of Voice, 19(4), 511–518. http://dx.doi.org/10.1016/j.jvoice.2004.10.005 PMid:16301097
Weirich, M., & Lancia, L. (2011). Perceived auditory similarity and its acoustic correlates in twins and unrelated speakers. In Proceedings of the 17th International Congress of Phonetic Sciences (ICPhS 17-Hong Kong), 2118–2121.
Wolf, J. J. (1972). Efficient acoustic parameters for speaker recognition. The Journal of the Acoustical Society of America, 51(6B), 2044–2056. http://dx.doi.org/10.1121/1.1913065
License
Copyright (c) 2015 Consejo Superior de Investigaciones Científicas (CSIC)

This work is licensed under a Creative Commons Attribution 4.0 International License.
© CSIC. Manuscripts published in both the print and online versions of this journal are the property of the Consejo Superior de Investigaciones Científicas, and quoting this source is a requirement for any partial or full reproduction.
All contents of this electronic edition, except where otherwise noted, are distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. You may read the basic information and the legal text of the licence. The indication of the CC BY 4.0 licence must be expressly stated in this way when necessary.
Self-archiving in repositories, personal webpages or similar, of any version other than the final version of the work produced by the publisher, is not allowed.