1. INTRODUCTION
First language (L1 henceforth) acquisition and development have drawn
the attention of researchers for centuries. However, technological
developments over the last few decades have brought about a qualitative change
in research (Dolgova & Tyler, 2019; Ellis, 2017; Kern et al., 2014; MacWhinney, 1996). The gradual introduction of new technological
tools and the adoption of common methodologies and procedures made the
design of the first corpora of child language possible. In those
corpora, hundreds of speech recordings from different-aged children were
transcribed, providing researchers with an invaluable database for the
study of child language (Ellis, 2017). Currently, the international corpus of reference is CHILDES (https://childes.talkbank.org/; MacWhinney & Snow, 1985), a multilingual child language corpus, in which
we can find samples of Spanish, some of which were used to
corroborate results from this study. Regarding the
phonological treatment of corpora in particular, the development of the software PHON (https://www.phon.ca/phon-manual/index.html; Hedlund & Rose, 2020) was a landmark in the study of child language.
Within the field of L1 acquisition research, the description of phonological development involves four basic concerns (Grunwell, 1981): the great variation from one individual to
another; the extension and gradual regularisation of the child’s
pronunciation system, characterised by unsystematicity; the difficulty
of determining the starting point of phonological development; and the
need to consider both the input and output in the process of
description. Grunwell (1981, p. 167) disapproved of the fact that “studies are to
discover when children achieve the correct pronunciation of the sounds
of their language”. She considered that the question of when speech sounds are learnt was ill posed, due to factors such as
the wide range of individual variation, or the fact that a child does
not acquire each phoneme separately. Therefore, research on phonological
acquisition should focus not so much on the precise moment at which a
child acquires a certain phoneme as on the search for patterns by
describing large samples of speech. “We need models of usage
and its effects upon acquisition” (Ellis, 2017, p. 48).
Research on phonological
acquisition has largely been aimed at improving research on language
disorders. From a detailed study of a child’s normal linguistic
development and the establishment of patterns in language behaviour it
is possible to detect atypical phenomena in the development of an
individual. According to Ingram (1976),
knowledge of the patterns of typical language development gives us
the clues for the treatment of pathologies. Corpus linguistics plays
a pertinent role in this regard, since corpora are a rich source for
the analysis of natural language, for instance in the elaboration of
what Acosta and Ramos (1998) called for, a phonological inventory, or in the study of
the role of input in the acquisition process by examining child-directed
speech (CDS) in natural contexts.
The present study is based on CHIEDE (Garrote, 2010), a cross-sectional corpus in which n=59
children aged 3;0-6;0 participated. The corpus was recorded,
transcribed and, subsequently, tagged by means of automatic processing
techniques (phonological and morphosyntactic tagging software), and then
manually checked to correct possible tagging errors. This methodology
facilitates the retrieval of linguistically annotated data (parts of
speech, morphological, and phonological information) to quantify
linguistic features. It is descriptive work, following an observational
method based on performance, on external empirical data, and not on
competence and experimentation.
This paper presents a
phonological study of L1 Spanish children with the aim of showing the
phonological development displayed by the participants. Taking into
account the participants’ age, our purpose was not to establish the
order of acquisition of phonemes, but to describe the
typical phonological development of Spanish-speaking children from 3;0
to 6;0 years old, based on the frequency of occurrence of phonemes
(providing a phonological inventory), and to highlight the role of
input frequency as a facilitator in the acquisition of phonemes (even those
traditionally considered more complex). Three questions are considered:
(1) Is the phonological system completely acquired at 3;0? (2) Is 4;0 a
turning point in the acquisition process, as many linguistic studies
claim (Bosch, 1983; Díez-Itza & Martínez López, 2004; Maratsos, 1974)? And finally, and most importantly, (3) To what
extent is input frequency relevant in this process? The goal is to
clarify these questions through a review of some of the most
significant theories and research, and the analysis of data from
different corpora.
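As an illustration of what a frequency-based phonological inventory amounts to, the following is a minimal sketch in Python. It is not the processing pipeline used for CHIEDE: the function name, the age-group labels, and the toy transcriptions are assumptions made for the example, and each word is assumed to be already segmented into phoneme symbols.

```python
from collections import Counter


def phoneme_inventory(transcriptions):
    """Return the relative frequency of each phoneme.

    `transcriptions` is a list of words, each word being a list of
    phoneme symbols (a hypothetical input format for this sketch).
    """
    counts = Counter(p for word in transcriptions for p in word)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders phonemes from most to least frequent
    return {p: n / total for p, n in counts.most_common()}


# Toy data invented for illustration only; not taken from any corpus
sample_by_age = {
    "3;0-3;11": [["k", "a", "s", "a"], ["m", "a", "m", "a"]],
    "4;0-4;11": [["p", "e", "r", "o"], ["k", "a", "s", "a"]],
}

inventories = {
    age: phoneme_inventory(words) for age, words in sample_by_age.items()
}

for age, inventory in inventories.items():
    print(age, inventory)
```

Computing one inventory per age group, as here, is what allows the relative frequency of a given phoneme to be compared across ages and against adult or child-directed speech.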
2. PREVIOUS RESEARCH
Morphology and syntax are the linguistic levels that have been
addressed most extensively by research on L1 acquisition. Studies
carried out on child language have mainly focused on the acquisition of
lexical and grammatical structure, to the detriment of phonology,
semantics, or pragmatics. According to Vihman et al. (2009, p. 164), “The role of phonology in the development of
linguistic knowledge is often given short shrift by researchers
interested in word learning”. Consequently, phonological studies on
acquisition are less frequent (Polo, 2016).
Moreover, the vast majority focus on the English language. Though
research has gradually been carried out on other languages, it is
“heavily biased toward Indo-European languages of Western Europe with
the bulk of research still concentrated on English” (Stoll, 2009, p. 89).
One of the pioneering works on phonological development was Stampe’s (1969),
for whom the language acquisition process is based upon an innate
mechanism children have in order to simplify adult words. By means of
these mechanisms or processes (unstressed syllable deletion,
cluster reduction, merging vowels into /a/) the child goes from what
Stampe called a “language-innocent state” to the adult production.
Later, Ingram (1976) adopted Stampe’s theory for clinical phonology research. Following the Piagetian stages (Piaget, 1926)
of cognitive development and their corresponding linguistic periods,
Ingram established a parallelism with the phonological level, thus
locating the evolution of the different phonemes and phonological skills
at distinct stages from the sensorimotor stage (0;0-1;6) to the formal
operational stage (12;0-16;0).
However, crosslinguistic studies
on acquisition beyond the early period (around one year of age) have
shown that it is not possible to establish clear stages of development
applicable to every language. For instance, Durgunoğlu and Öney (1999, p. 283) examined the “effects of language-specific
influences on the development of phonological awareness” and explained
how structural phonetic differences among languages entail differences in
the child’s development of phonology. Along similar lines, Bleses, Basbøll, Lum and Vach (2010) set up a ranking of 7 languages based on the
complexity of their phonetic systems (vowel/consonant ratio) and
concluded that the most complex one was the Danish phonemic system,
followed by the Swedish, the Dutch, the French, the English (American),
the Galician and the Croatian. Bernhardt and Stemberger (2017), comparing typical development with protracted
phonological development, showed that in four languages, Mandarin,
Arabic, Slovene and European Spanish, the WWM (“whole word match”, that is, the child’s pronunciation equals the adult’s) scores for 4-year-old children were 80-85% (85.4% for European Spanish).
Despite differences across languages, McLeod and Crowe (2018), after reviewing 64 studies involving more than
26,000 children and 27 languages, concluded that 93% of consonants were
correctly produced by the age of 5. Along the same lines, Stoel-Gammon (2006, p. 646) stated that “By the age of 3 years, the level of
intelligibility increases to 75%, and by age 4, it is 100%”, meaning
that, though not yet adult-like, the child’s phonological system is
sufficiently developed to be intelligible.
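The whole word match (WWM) measure mentioned above can be sketched as a simple percentage. The Python snippet below is only an illustration under stated assumptions: the function name and the word pairs are invented, transcriptions are represented as plain strings, and the scoring conventions actually used by Bernhardt and Stemberger (2017) may differ in detail.

```python
def whole_word_match(pairs):
    """Percentage of words whose child form equals the adult target.

    `pairs` is a list of (adult_target, child_production) tuples;
    a word counts as a match only if the two forms are identical.
    """
    if not pairs:
        return 0.0
    matches = sum(1 for target, produced in pairs if target == produced)
    return 100.0 * matches / len(pairs)


# Invented example pairs, for illustration only
pairs = [
    ("kasa", "kasa"),  # match
    ("pero", "pelo"),  # substitution, no match
    ("mama", "mama"),  # match
    ("tres", "tes"),   # cluster reduction, no match
]

score = whole_word_match(pairs)
print(f"WWM = {score:.1f}%")
```

Because a single deviating segment makes the whole word count as a mismatch, WWM is a strict measure; a score of 80-85% therefore implies that most words are produced in a fully adult-like way.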
In Spain, the theories set out first by Stampe and then by Ingram were later introduced by authors such as Bosch (1983) and Díez-Itza (1995).
For both researchers, the phonological acquisition period is placed
between approximately one and a half years old and six to seven years
old, with an intermediate division around four years old (Bosch, 1983).
This means that one cannot talk about total control of the complete
phonological system until the age of six or seven, when the child
masters certain complicated phonemes and their combination in more
complex syllables. In spite of that, as mentioned before (Bernhardt & Stemberger, 2017; Stoel-Gammon, 2006), by the age of 4 years intelligibility is complete.
Spanish studies have mostly focused on what Díez-Itza and Martínez López (2004) call periodo temprano ‘early period’, that is, until about three years old. These authors consider it necessary to increase research on the periodo tardío ‘late period’, i.e., from three to six years old. They determined three stages in phonological acquisition: expansión ‘expansion’, the stage until 3;0, characterised by a progressive
diminution of phonological processes (such as unstressed syllable
deletion, cluster reduction, etc.), after which there would be a
standstill; estabilización ‘stabilisation’, from three to four
years old and initially defined by a considerable decrease of processes,
which increase again at around four years old (showing a U-shaped
developmental pattern); and resolución ‘resolution’ from the age of five years onwards, when phonological processes are residual. Díez-Itza and Martínez López’s (2004) intention was to confirm whether the age of four clearly becomes a universal
milestone of transition towards subsequent periods, as has been
repeatedly assumed by descriptive studies. In fact, at the age of four
years children’s language is characterised, from the standpoint of
phonology, by an increased speech rate, which means more coarticulation
and the lengthening of utterances and conversational turns (Díez-Itza & Martínez López, 2004). Many scholars agree on a transition point at four years old (Bosch, 1983; Díez-Itza & Martínez López, 2004) regarding phonological acquisition, but also other linguistic levels. For example, Maratsos (1974), analysing the acquisition of the passive
structure, concluded that children show a U-shaped developmental pattern
around four years old, as the rate of passive comprehension decreased in
comparison to younger children. Also, Garrote (2010) found that it was around 4;0 that children produced more non-target-like
speech as a consequence of rule overgeneralisation errors.
Bosch (1983), based on studies by Serra (1983) and Melgar de González (1976),
summarised the most problematic phonemes during the acquisition process
of Spanish: the trill /r/, fricatives such as /s/, /θ/ and /x/, and the
voiced plosive /d/. She concluded that the most difficult place of
articulation is the dento-alveolar area, where a great
number of sounds are differentiated just by the manner of articulation (Bosch, 1983). López Valero et al. (1989) supported Bosch’s findings, concluding that the sounds acquired late in Spanish are /x/, /f/, /r/ and /θ/.
Other authors such as Serra (1983) established the following order of acquisition: nasals, plosives, fricatives, and, finally, liquids and the alveolar trill.
Two studies related to the present one are worth mentioning here,
due to the age range (3 to almost 6 years old) and the language
(Spanish, though the Mexican variety). First, Jiménez (1987) found that, by age 5 years, the 120 children
forming the sample showed production problems with only two consonants:
/s/ and /r/. Second, Acevedo (1993, p. 11) also tested 120 Mexican children. Results showed
that sound “mastery occurred by the 4;0-4;5 age group”, with the following consonants remaining
problematic: /ɲ/, /g/, /f/, /s/, and /x/. Both
studies were based on elicitation tasks, not on spontaneous speech.
Most significant works on Spanish phonological acquisition, unlike
the present study, focus on the early period and are
crosslinguistic studies (Bosch & Sebastián-Gallés, 2001, 2003; Bunta & Ingram, 2007; Goldstein & Cintrón, 2001; Kehoe & Lleó, 2003, 2005; Kehoe, Lleó & Rakow, 2005; Lleó, 2002, 2003, 2006). However, the interest here is in knowing how,
once the Spanish phonemes are acquired (late period), the children’s
phonological system becomes as stable as the adults’, by observing the
frequency of use.
Taking into account previous research and the above-mentioned claims (Acosta & Ramos, 1998; Grunwell, 1981; MacWhinney, 1996, among others), there is a need for a
phonological frequency-based analysis of the linguistic performance of
children aged 3;0 to 6;0 (late period), using a spontaneous speech
corpus as a data source.
2.1. The role of input and frequency
Although input is considered irrelevant by advocates of nativist theories of a
Chomskyan nature, who cite the Poverty of Stimulus Argument (Chomsky, 1980), later tendencies such as connectionist models (Menn & Stoel-Gammon, 1996) give the input a key role in the learning
process, considering it the source of empirical knowledge from which
children, through statistical processing, acquire language. Indeed, “a
number of linguists have recently proposed statistical explanations for
patterns of phonological productions” (Rose, 2009, p. 329).
In recent years, the cognitive-functional or usage-based model (Tomasello, 2003)
has posited the emergence of language as a result of use, from which
linguistic patterns arise, and then grammatical constructions are
consolidated. From a usage-based approach to language acquisition,
“children learn linguistic constructions from the conspiracy of
experienced exemplars, with abstract syntactic constructions and their
associated meanings emerging from the statistical distribution of
form-function correspondences in usage” (Ellis, 2017, p. 46).
Zamuner, Gerken and Hammond (2004, p. 1406) based their research on the Specific Language
Grammar Hypothesis (SLGH), which states that “language acquisition is
best described with respect to the patterns in the input or ambient
language”. Thus, children will first acquire those phonemes which are
more frequent in their language.
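One way to make the SLGH prediction concrete is to check whether phonemes that are frequent in the input are also frequent in children’s productions. The sketch below computes a Spearman rank correlation between two relative-frequency tables in plain Python. It is an illustrative assumption, not the procedure followed by Zamuner et al. (2004) or by the present study, and the frequency values are invented.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


# Invented relative frequencies, for illustration only
input_freq = {"a": 0.30, "s": 0.20, "r": 0.10, "x": 0.05}
child_freq = {"a": 0.32, "s": 0.18, "r": 0.06, "x": 0.04}

phonemes = sorted(input_freq)
rho = spearman(
    [input_freq[p] for p in phonemes],
    [child_freq[p] for p in phonemes],
)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based coefficient is used here because the hypothesis concerns the relative ordering of phonemes by frequency rather than the exact frequency values.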
Studies based on frequency and likelihood of occurrence have shed some light on the process of language acquisition (Ellis, 2017; Polo, 2016; Rose, 2009; Zamuner et al., 2004). For example, Lleó (2003), in a crosslinguistic study of German and
Spanish, found that coda consonants are acquired earlier in languages
where codas and coda clusters are common. The same author concluded some
years later that “We now know that babbling results from a combination
of unmarked sounds and the most frequent sounds produced around the
baby” (Lleó, 2012, p. 693). Also, Demuth (2009), after analysing the fact that /t/ (and not
voiced /d/) is the first coda consonant acquired by English-speaking
children, determined that “although frequency and markedness typically
pattern together, children may show a preference for frequency over
markedness effects in their early productions” (Demuth, 2009, p. 189). Roark and Demuth (2000) carried out a corpus-based study on prosodic properties of language.
Results showed that “young language learners are sensitive to
statistical properties of the input, and this influences the course of
language development.” (Roark & Demuth, 2000, p. 599). For a more complete view of the role of input and frequency in child language acquisition, see Kern et al. (2014), who, in a special issue, crosslinguistically
analyse the essential function of these two factors in the process of L1
acquisition, covering distinct linguistic levels.
The present research is framed within usage-based phonology (Polo, 2016) and the SLGH (Zamuner et al., 2004), following Ellis’s (2008, p. 95)
statement: “language processing is intimately tuned to input frequency
and probabilities of mappings at all levels of grain: phonology and
phonotactics, reading, spelling, lexis, morphosyntax, formulaic
language, language comprehension, grammaticality, sentence production,
and syntax. It relies on this prior statistical knowledge”.
Nevertheless, following Rose (2009, p. 346), “while statistics of the input seem to play a
central role in infant speech perception, such statistics appear to be
only one of the many factors underlying patterns observed in speech
production”. Therefore, a single approach is not enough to account for
language acquisition; it is rather one contribution to the general research
scenario.
2.2. Contribution of Corpus Linguistics
Investigation of language acquisition has traditionally been based on experiments or tests of a logopedic kind rather than on spontaneous speech (see Acevedo, 1993, or Jiménez, 1987, as examples of research describing the phonological development of Mexican Spanish children ranging in age from 3 to more than 5 years). There may be two reasons for this. On the one hand, such studies tend to focus on speech and language disorders and, therefore, the samples in many cases belong to subjects who show atypical language development. These samples are collected in assessment situations where the context tends to be artificially created. On the other hand, large samples of spontaneous speech are difficult to obtain, which poses a major disadvantage to any investigation: recordings must not only be made but also transcribed afterwards. This difficulty is compounded by the challenges of working with children, since it is necessary not only to have the permission of parents or guardians but also to be particularly respectful of the children’s right to privacy.
Ellis (2017) states that usage-based linguistics is supported by findings from Corpus Linguistics, Cognitive Linguistics, and Psycholinguistics. In the same line, Dolgova and Tyler (2019, p. 914) claim that Corpus Linguistics studies are an example of the different existing usage-based models, which “reveals frequency patterns and meanings in natural usage contexts”. The need to use corpus linguistics in research from a usage-based perspective is also stressed: “The usage-based research program necessitates extensive analysis both of the usage from which learners learn and of learner usage as it develops” (Ellis, 2017, p. 41), by means of corpora and computational techniques. Nonetheless, Ellis (2017, p. 46) warns about the need for complementary sources of information: “Learner language corpora show what learners say; they do not show what they know. Experimental techniques are needed to probe aspects of knowledge and understanding”.
The use of corpora for assessing phonological development has been extensively promoted by researchers (Demuth, 2009; Dolgova & Tyler, 2019; Ellis, 2017; MacWhinney, 1996; Stoll, 2009, among others) as a complement to tests carried out in artificial contexts in order to observe the production of selected words. The acquisition of a sound is gradual, and its production fluctuates for a certain period between the correct form and non-target alternatives until its fossilisation. However, experimental tasks typically use isolated words as a model of production of a certain sound; during tests in which the child repeats a word or group of words after the adult, immediate imitation can lead to a better pronunciation than the child would achieve outside those contexts. Acosta and Ramos (1998) criticised the historically used assessment procedure that focused on isolated words as opposed to the analysis of spontaneous speech samples.
In addition, corpora can be easily queried with automatic or semi-automatic computational tools, which facilitate work and save time. Therefore, corpus linguistics can be either a method in itself or a complement to the traditional approach, especially for describing the most unconscious and spontaneous facet of language.
The main contribution of naturalistic language corpora to the study of language acquisition is that they provide samples of authentic language in real context, an invaluable source for the study of child language. Spontaneous language corpora are preferable for studying the real use of language in children. On occasion they are combined with corpora made up of texts obtained by means of elicitation tasks or tests, as a supplement to evoke phenomena that are difficult to find in spontaneous speech, either because of their low frequency of occurrence or because of avoidance strategies (words children systematically avoid due to pronunciation difficulties; Stoll, 2009).
3. METHODOLOGY
3.1. The CHIEDE corpus
CHIEDE, a spontaneous child language corpus of Spanish, is made up of approximately 60,000 words. About a third of the corpus consists of child language and the remainder is CDS. The main feature of CHIEDE is the spontaneity of interactions. The corpus is made up of transcribed recordings of communicative situations in their natural context. The recordings were carried out in central Spain, where the linguistic variety is Peninsular Spanish, in a medium-sized town. The speakers are monolingual and belong to a middle socioeconomic status regarding their families’ income and occupation.
The corpus presents two types of interactions: spontaneous collective interactions, recorded during a daily classroom activity in which the whole group of children and the teacher chatted informally; and dialogues, in which an adult talks with a single child. Figure 1 shows the corpus design (note 4: for further details, consult the website http://www.lllf.uam.es/ESP/Chiede.html#:~:text=El%20Corpus%20de%20Habla%20Infantil,comunicativas%20en%20su%20contexto%20natural.). Children were grouped according to their year of birth.
CHIEDE contains 58,616 word tokens in 30 text files for a total of 7 hours and 53 minutes of recordings in 30 audio files from n=59 child participants. Table 1 presents figures regarding word tokens, number of utterances, word types and the token/type ratio by age group.
Age group | Word tokens | Utterances | Types | Token/type ratio |
---|---|---|---|---|
3;0-3;12 | 5,628 | 4,909 | 985 | 5.7 |
4;0-4;12 | 6,787 | 5,092 | 1,155 | 5.8 |
5;0-5;12 | 9,004 | 5,443 | 1,450 | 6.2 |
Adults | 37,197 | 20,876 | 2,910 | 12.7 |
Total | 58,616 | 36,320 | | |
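The token/type ratio in Table 1 can be recomputed from the token and type counts. The following is a minimal sketch in Python; the dictionary and function names are illustrative and not part of the original study. The published ratios appear to be truncated rather than rounded to one decimal, so small differences in the last digit are to be expected.

```python
# Figures copied from Table 1: (word tokens, word types) per group.
TABLE_1 = {
    "3;0-3;12": (5628, 985),
    "4;0-4;12": (6787, 1155),
    "5;0-5;12": (9004, 1450),
    "Adults": (37197, 2910),
}


def token_type_ratio(tokens, types):
    """Average number of tokens per distinct word form (type)."""
    return tokens / types


ratios = {group: token_type_ratio(*counts) for group, counts in TABLE_1.items()}
total_tokens = sum(tokens for tokens, _ in TABLE_1.values())

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
print("Total word tokens:", total_tokens)
```

A higher ratio means that each word form is repeated more often; the adult figure is partly a consequence of the much larger adult sample, since the ratio grows with text length.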
The fact that the corpus was going to be published required being extremely respectful and compliant with the current legal framework. Consequently, before recording, parents, teachers and participants were properly apprised and asked to sign an informed consent agreeing to participate in the research. Regarding ethical concerns, all names were anonymised and, on occasion, parts of the recordings were cut and discarded due to sensitive information the children gave about their private lives.
The device used to record the corpus was a Sony DAT (Digital Audio Tape), which allows for a digital recording with professional quality, with a Sony Stereo microphone placed in the most adequate spot to capture the sound. Even so, when recording ambient sound, a certain level of background noise is inevitable; it is impossible to obtain studio sound quality. For this reason, sound-editing software (Wavelab, https://www.steinberg.net/es/wavelab/) was used to improve the quality of the recordings.
The topics of conversation were varied, but all of them related to the children’s everyday lives: what they did yesterday or the previous weekend, describing their family, talking about their friends, their pets, or the things they like to do, etc.
Each recording is aligned with its corresponding orthographic transcription, including a header with metadata or sociolinguistic and contextual information. In addition to the audio and the text files, two other kinds of files are included: those with the sound-text alignment by utterances and those in XML format with morphosyntactic annotation. The files are identified with a name in which the age of the child participant is specified.
3.2. Procedure
This work was conducted from the perspectives of computational linguistics and corpus linguistics, to assist other disciplines such as phonology and psycholinguistics. The main advantage of working with corpora is to improve and facilitate the empirical work through computational tools that make tasks such as labelling, counting of items and calculation of frequencies faster and more reliable. Undoubtedly, the phonological transcription of a text is a task which needs the investment of many working hours. If the orthographic transliteration already consumes most of the time devoted to the creation of a corpus, the phonological transcription would at least double that time. Nowadays, software such as PHON (Hedlund & Rose, 2020) facilitates this task. The present study, however, used the software developed by Moreno Sandoval et al. (2008), which, to simplify, transforms “the orthographical representation of a word to its phonemic transcription based on context-dependent rules” (Moreno Sandoval et al., 2008, p. 1098).
The reliability of the automatic phonological transcription was high: only 4% of the words transcribed automatically were found to have a transcription error (either phonemic or syllabic). Even so, a group of linguists had to carry out a second part of the task (peer review), listening to the audio files, manually correcting the mistakes, and completing those features and nuances absent in an orthographic representation. It must also be clarified that the phonological transcription was a broad one, not a narrow annotation, which would have considerably increased the work. As the children were not too young regarding the language acquisition period, most of them exhibited adult-like speech in phonological terms, and just three children from the 3;0 group had typical (not due to any pathology) pronunciation difficulties, which were carefully annotated (files ADR3.wav, BRU3.wav, and NAT3.wav, and their corresponding ADR3.txt, BRU3.txt, and NAT3.txt files, which can be consulted on the website mentioned in Note 4).
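To illustrate what a transcription “based on context-dependent rules” can look like, the following Python sketch converts Spanish orthography into a broad phonemic string. It is not the transcriber of Moreno Sandoval et al. (2008): the rule set, the function name `to_phonemes` and the treatment of <y> (merged with /ʎ/ before a vowel) are simplifying assumptions made here for illustration, and stress, syllabification and loanwords are ignored.

```python
FRONT = set("eiéí")            # front vowels that soften <c> and <g>
VOWELS = set("aeiouáéíóúü")
DIGRAPHS = {"ch": "ʧ", "ll": "ʎ", "rr": "r", "qu": "k"}
SIMPLE = {
    "v": "b", "z": "θ", "j": "x", "ñ": "ɲ", "x": "ks",
    "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u", "ü": "u",
}


def to_phonemes(word):
    """Return a list of phoneme symbols for one orthographic word."""
    w = word.lower()
    out, i = [], 0
    while i < len(w):
        ch, pair = w[i], w[i:i + 2]
        nxt = w[i + 1:i + 2]                   # "" at the end of the word
        if pair in DIGRAPHS:                   # two letters, one phoneme
            out.append(DIGRAPHS[pair])
            i += 2
            continue
        if pair == "gu" and w[i + 2:i + 3] in FRONT:
            out.append("g")                    # silent <u> in gue, gui
            i += 2
            continue
        if ch == "c":
            out.append("θ" if nxt in FRONT else "k")
        elif ch == "g":
            out.append("x" if nxt in FRONT else "g")
        elif ch == "r":                        # trill word-initially and after n, l, s
            out.append("r" if i == 0 or w[i - 1] in "nls" else "ɾ")
        elif ch == "y":                        # assumption: consonantal <y> merged with /ʎ/
            out.append("ʎ" if nxt in VOWELS else "i")
        elif ch != "h":                        # <h> is silent
            out.extend(SIMPLE.get(ch, ch))
        i += 1
    return out


for word in ["chico", "guerra", "cielo", "pero", "niño"]:
    print(word, "".join(to_phonemes(word)))
```

Each rule inspects the neighbouring letters before choosing a phoneme, which is the sense in which the mapping is context dependent. Child productions that deviate from the target form would still need the manual correction described above.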
Finally, to be faithful to the children’s production, the phonological transcription was carried out over the actual orthographic transcription, that is, a second orthographic line (introduced by %pho) in which the real production of the child (including errors) was represented, as shown in example (1).
In this way, figures regarding frequencies are real, and not based on target forms.
4. RESULTS
Results are presented in four separate sections (note 5: statistical analysis was carried out using the software IBM SPSS Statistics). In the first one a frequency-based phonological inventory is provided to address research questions 1 and 3. The next two sections offer data regarding variability between the three age groups. Finally, data from CHIEDE are corroborated by comparing results with three corpora from the CHILDES database.
4.1. Data retrieved from the phonological transcription
According to the data collected, Table 2 presents the relative frequency (in percentages) of phoneme tokens in the three child groups that make up the corpus.
Phoneme | 3;0-3;12 | 4;0-4;12 | 5;0-5;12 | CDS (adults) |
---|---|---|---|---|
e | 12.9 | 13.44 | 13.58 | 15.12 |
a | 13.29 | 13.09 | 13.13 | 12.27 |
o | 11.68 | 10.46 | 10.81 | 10.38 |
s | 7.34 | 7.55 | 7.94 | 8.11 |
i | 8.41 | 8.1 | 7.94 | 7.22 |
n | 7.35 | 7.2 | 6.72 | 7.05 |
ɾ | 4.41 | 4.76 | 4.84 | 5.12 |
t | 4.03 | 3.73 | 3.78 | 4.52 |
l | 4.49 | 4.8 | 4.74 | 4.51 |
k | 3.72 | 4.39 | 4.58 | 4.49 |
d | 2.66 | 3.11 | 3.53 | 4.36 |
m | 3.46 | 3.85 | 3.92 | 3.15 |
u | 3.89 | 3.91 | 3.53 | 3.14 |
p | 3.18 | 2.99 | 2.71 | 2.74 |
b | 2.42 | 2.71 | 2.42 | 2.5 |
θ | 0.84 | 0.88 | 1.03 | 1.52 |
g | 1.5 | 1.3 | 1.06 | 0.91 |
ʎ (note 6) | 1.75 | 1.15 | 1.12 | 0.83 |
x | 0.84 | 0.93 | 0.67 | 0.62 |
f | 0.35 | 0.4 | 0.39 | 0.5 |
r | 0.56 | 0.43 | 0.62 | 0.42 |
ʧ | 0.64 | 0.45 | 0.62 | 0.3 |
ɲ | 0.29 | 0.37 | 0.31 | 0.19 |

Note 6: The phoneme /ʎ/ is the default output of the automatic phonological transcriber. However, it must be clarified that the language variety studied (central Spain) presents yeísmo. Thus, the actual phonetic representation of /ʎ/ is /ʝ/.
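The relative frequencies in Table 2 are percentages of phoneme tokens. A minimal sketch of that calculation follows; the function name and the three toy words are illustrative only and do not come from CHIEDE.

```python
from collections import Counter


def relative_frequencies(phoneme_tokens):
    """Return each phoneme's share of all tokens as a percentage, most frequent first."""
    counts = Counter(phoneme_tokens)
    total = sum(counts.values())
    return {p: round(100 * n / total, 2) for p, n in counts.most_common()}


# Toy input: the phonemes of "kasa", "mesa" and "oso", one symbol per phoneme.
tokens = list("kasa") + list("mesa") + list("oso")
freqs = relative_frequencies(tokens)
print(freqs)
```

Because the counts are taken over the actual productions (the %pho line) rather than over target forms, the same function applied to child and adult tiers separately would yield the columns of the table.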
The total number of phoneme tokens is 75,535, and the Phonological Mean Length of Utterance (PMLU) (Ingram, 2002) is 12.72 phonemes. In this case, the table does not present the order of acquisition of phonemes (already acquired due to the children’s age), but their usage frequency, as data were not longitudinally collected. The phonemes that occupy the final rows of the table are the least frequent in Spanish, and therefore their frequency is low in relation to the most common ones; nevertheless, the figures increase as the children grow older. This shows that from three to five years old the process of language acquisition is still ongoing, and therefore studies on the acquisition of language should not stop at 36 months. However, according to these data, all children show a complete (intelligible) acquisition, even of those phonemes considered to be acquired later.
In addition to the phonological data extracted from the children’s speech, a fourth column that includes the frequencies of phonemes in the child-directed speech (adults’) has been added. Although data are similar for both children and adults, greater similarity can be noticed, especially at the top of the table, between the oldest group (5;0-6;0) and the group of adults than between the 3;0-4;12-year-olds’ and the adults’ speech.
Analysing absolute frequency means for the three child groups, the asymptotic significance is p = 0.000 (note 7: if adults’ absolute frequency means are compared with the children’s, the asymptotic significance is p = .000 in all cases), which indicates notably different distributions across the three groups. If we observe the sample in detail (Table 3), the results are as follows:
Pairwise comparisons (df = 2)

Sample 1/Sample 2 | Test statistic | Std. error | Std. test statistic | Sig. |
---|---|---|---|---|
3/4-year-olds | -.870 | .295 | -2.949 | .003 |
3/5-year-olds | -1.413 | .295 | -4.792 | .000 |
4/5-year-olds | -.543 | .295 | -1.843 | .065 |
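By comparing the distribution of data from the three groups, we can observe a significant difference between the youngest group (3;0-3;12) and each of the two older groups (4;0-4;12 and 5;0-6;0), whereas the difference between the two older groups does not reach significance (p = .065).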
Another feature of the automatic phonological transcriber is the segmentation of words into syllables. In this way, it is possible to quickly and reliably know the total number of syllables that make up our corpus, and their frequency of use. The total number of syllable tokens is 35,086 and the Syllable Mean Length of Utterance (SMLU) is 5.91.
The 25 most frequent syllables are made up of no more than two phonemes, and most of them follow the pattern CV, supporting previous research (Carreira, 1991; Goldstein & Cintrón, 2001; Kehoe & Lleó, 2003). Closed syllables (CVC) or consonant clusters like CCV involve a higher articulatory difficulty and therefore their frequency of use is lower compared to open syllables consisting of no more than two phonemes. The four groups largely coincide (80%): 20 out of the 25 most frequent syllables are the same for children as for adults. From these data, it is possible to easily and accurately calculate PMLU and SMLU for each age group. Table 4 shows how the figures increase appreciably from three to six years old.
Age group | PMLU | SMLU |
---|---|---|
3;0-3;12 | 10.29 | 4.88 |
4;0-4;12 | 13.57 | 6.26 |
5;0-6;0 | 14.11 | 6.49 |
Adults | 28.35 | 13.03 |
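The sketch below shows one way of obtaining such length measures from a syllabified phonemic transcription. It assumes that PMLU and SMLU are used here as simple means of phonemes and syllables per utterance (Ingram’s (2002) original whole-word scoring is not reproduced), that each phoneme is written with one character, and that syllables are separated by a dot. The function names and the sample utterances are hypothetical.

```python
from collections import Counter

VOWELS = set("aeiou")


def syllables_of(utterance):
    """Split an utterance such as 'la ka.sa' into its syllables."""
    return [syl for word in utterance.split() for syl in word.split(".")]


def length_measures(utterances):
    """Return (mean phonemes per utterance, mean syllables per utterance)."""
    per_utterance = [syllables_of(u) for u in utterances]
    n_syllables = sum(len(syls) for syls in per_utterance)
    n_phonemes = sum(len(syl) for syls in per_utterance for syl in syls)
    return n_phonemes / len(utterances), n_syllables / len(utterances)


def skeleton(syllable):
    """Map a syllable to its consonant/vowel pattern, e.g. 'ka' -> 'CV'."""
    return "".join("V" if p in VOWELS else "C" for p in syllable)


sample = ["la ka.sa", "es ro.xa", "si"]
pmlu, smlu = length_measures(sample)
patterns = Counter(skeleton(s) for u in sample for s in syllables_of(u))
print(round(pmlu, 2), round(smlu, 2), patterns.most_common())
```

The same syllable list also yields the distribution of syllable patterns, which is how the predominance of CV syllables can be checked.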
Statistics show that there is a significant difference between the groups’ means: p = 0.026 for PMLU and p = 0.025 for SMLU.
Findings (Table 2) prove a relationship between input frequency and order of acquisition that will be thoroughly analysed in the section devoted to the discussion, revisiting research question 3.
4.2. Standard deviation analysis
So far, all data presented belong to the whole corpus divided into age groups. However, to calculate the standard deviation, a sub-corpus was extracted in order to balance the participants. As seen in Figure 1, representing the corpus design, CHIEDE is divided into two sub-corpora: collective interactions and dialogues. In the former communicative setting, the number of subjects is about twenty children (see Figure 1 for exact numbers), and they do not all participate equally. When extracting the phoneme inventory for each participant, it was observed that some of them produced a very high number of words (and therefore a high frequency of phonemes), while for others the figures were considerably lower due to their moderate participation. Hence, a decision was made to use just the dialogues sub-corpus for this task: as only one child participates in each interaction, the number of conversational turns increases and so does his/her production in terms of number of words. In addition, it was found that the number of words uttered by the children was similar in each dialogue (Table 5). The balance needed to compare data from different subjects was thus obtained.
Age | Words |
---|---|
3;0-3;12-year-olds | 4,021 |
4;0-4;12-year-olds | 4,416 |
5;0-6;0-year-olds | 4,119 |
Total | 12,556 |
Thus, the total sample consists of 24 children, equally divided into three age groups (3;0-3;12, 4;0-4;12 and 5;0-6;0 years old), each one made up of eight children, four boys and four girls (see Figure 1). The relative frequency was calculated from the automatic count of the absolute frequency of the twenty-three Spanish phonemes, and then the standard deviation across all children in each age group was computed. Table 6 presents the values for each age group.
Phoneme (3;0-3;12) | x̄ | s | Phoneme (4;0-4;12) | x̄ | s | Phoneme (5;0-6;0) | x̄ | s |
---|---|---|---|---|---|---|---|---|
e | 12.95 | 2.40 | a | 13.51 | 2.47 | e | 13.97 | 0.73 |
a | 12.68 | 2.20 | e | 13.51 | 2.24 | a | 12.16 | 1.32 |
o | 11.80 | 1.32 | o | 10.35 | 1.72 | o | 11.14 | 0.58 |
i | 8.57 | 1.78 | i | 8.19 | 1.81 | s | 8.71 | 1.35 |
n | 7.87 | 1.12 | s | 8.02 | 1.85 | i | 8.22 | 1.34 |
s | 7.76 | 1.39 | n | 7.65 | 0.66 | n | 7.15 | 0.94 |
ɾ | 4.27 | 0.84 | ɾ | 4.82 | 0.79 | ɾ | 4.69 | 0.95 |
u | 4.25 | 1.12 | k | 4.15 | 1.25 | l | 4.61 | 0.67 |
l | 4.05 | 0.97 | l | 4.13 | 0.93 | k | 4.12 | 0.61 |
t | 4.05 | 1.02 | u | 3.95 | 0.93 | m | 3.71 | 0.86 |
k | 3.90 | 1.05 | t | 3.83 | 0.89 | t | 3.65 | 0.42 |
m | 3.57 | 1.16 | m | 3.50 | 0.99 | u | 3.55 | 0.65 |
p | 3.18 | 0.78 | p | 3.03 | 0.66 | d | 3.20 | 0.69 |
d | 2.54 | 0.19 | d | 3.01 | 0.81 | p | 3.11 | 0.79 |
b | 2.41 | 0.93 | b | 2.48 | 0.70 | b | 2.45 | 0.49 |
ʎ | 1.52 | 0.67 | g | 1.28 | 0.26 | g | 1.19 | 0.52 |
g | 1.51 | 0.41 | ʎ | 1.07 | 0.39 | ʎ | 1.08 | 0.52 |
x | 0.72 | 0.37 | x | 0.99 | 0.41 | θ | 1.00 | 0.25 |
ʧ | 0.63 | 0.31 | θ | 0.77 | 0.44 | x | 0.70 | 0.18 |
θ | 0.60 | 0.42 | ʧ | 0.52 | 0.21 | ʧ | 0.52 | 0.16 |
r | 0.57 | 0.21 | r | 0.51 | 0.31 | r | 0.48 | 0.25 |
f | 0.33 | 0.24 | ɲ | 0.45 | 0.38 | f | 0.38 | 0.26 |
ɲ | 0.27 | 0.14 | f | 0.31 | 0.11 | ɲ | 0.21 | 0.06 |
The values show an appreciable degree of deviation of each phoneme from the mean, especially in the 4;0-4;12 group, where 11 out of 23 phonemes show a higher fluctuation from the mean. In contrast, the 5;0-6;0 group displays less variation, although it is notably salient in four cases: /f/, /g/, /p/ and /ɾ/. To appreciate the differences more clearly, these data have been transferred to boxplots (Figures 2, 3 and 4). For the last values, due to the low frequency of phonemes, differences are hardly substantial; but for higher values, the degree of variability is noticeable.
In the boxplots, the form of the median line shows three distinct blocks: after the first six most frequent phonemes (/e/, /a/, /o/, /i/, /n/, /s/) there is a marked drop, after which the values are kept within a stable range until a second drop in the last and least frequent ones (from /x/ in the 3;0-3;12 and 4;0-4;12 groups, and from /θ/ in the 5;0-6;0 group). The highest frequency rates are distributed among six phonemes: the vowels /a/, /e/, /i/, and /o/, the nasal /n/, and the fricative /s/ (mean above 7, Table 6). Within the second block, we find plosives, the vowel /u/, the liquids /l/ and /ɾ/, and the nasal /m/. Finally, the last block (mean below 1, Table 6), in which the frequency of sounds is low, includes the rest of the fricatives, the trill /r/, and the nasal /ɲ/; here the degree of variability decreases due to the low frequency of use.
Despite the fact that the median line pattern is similar for the three charts, in the first two age groups there are more striking irregularities, while the last age group’s plot shows a softer median curve. In the latter case the degree of deviation is lower, showing more consistency.
Again, Friedman’s Two-Way Analysis of Variance by Ranks presents an asymptotic significance of p = 0.018, detailed by age groups as follows:
Table 7 shows significant differences between 5 and 4-year-olds and between 5 and 3-year-olds. However, between the 3 and the 4 years old groups there seems to be no significant difference, which means that in the oldest age group (5;0-6;0 years old) there is a stabilisation of the phonological system, since figures for standard deviation are lower (as can be seen in 8), given that fluctuation from the mean decreases. At ages 3;0-3;12 and 4;0-4;12 years the values present a higher variation, especially for the most frequent phonemes. However, from 5 years old these differences disappear and the figures are stabilised, decreasing the distance between the values and the mean, in contrast to the irregularities which the other two age groups show, especially the 4;0-4;12 years old group. Thus, the idea of a turning point at the age of four years in the process of phonological acquisition is reinforced: again, it seems that it is from that age when children’s language begins to approach adult use.
Pairwise comparisons (df = 2)

Sample 1/Sample 2 | Test statistic | Std. error | Std. test statistic | Sig. |
---|---|---|---|---|
5/4-year-olds | .652 | .295 | 2.212 | .027 |
5/3-year-olds | .783 | .295 | 2.654 | .008 |
4/3-year-olds | .130 | .295 | .442 | .658 |
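The figures in Table 7 can be recomputed from the standard deviations in Table 6. The sketch below is written in plain Python rather than SPSS and rests on an assumption that the text does not state outright: that the Friedman test was applied to the 23 per-phoneme standard deviations, with phonemes as blocks and the three age groups as related samples. Under that assumption the sketch yields mean-rank differences of 0.652, 0.783 and 0.130, a standard error of 0.295 and p ≈ 0.018, in line with the values reported above. Variable names are illustrative.

```python
import math

# Standard deviations from Table 6, per phoneme: (3;0-3;12, 4;0-4;12, 5;0-6;0).
SD = {
    "e": (2.40, 2.24, 0.73), "a": (2.20, 2.47, 1.32), "o": (1.32, 1.72, 0.58),
    "i": (1.78, 1.81, 1.34), "n": (1.12, 0.66, 0.94), "s": (1.39, 1.85, 1.35),
    "ɾ": (0.84, 0.79, 0.95), "u": (1.12, 0.93, 0.65), "l": (0.97, 0.93, 0.67),
    "t": (1.02, 0.89, 0.42), "k": (1.05, 1.25, 0.61), "m": (1.16, 0.99, 0.86),
    "p": (0.78, 0.66, 0.79), "d": (0.19, 0.81, 0.69), "b": (0.93, 0.70, 0.49),
    "ʎ": (0.67, 0.39, 0.52), "g": (0.41, 0.26, 0.52), "x": (0.37, 0.41, 0.18),
    "ʧ": (0.31, 0.21, 0.16), "θ": (0.42, 0.44, 0.25), "r": (0.21, 0.31, 0.25),
    "f": (0.24, 0.11, 0.26), "ɲ": (0.14, 0.38, 0.06),
}


def ranks(row):
    """Rank the values of one block (1 = smallest); ties share the average rank."""
    return [1 + sum(v < x for v in row) + (sum(v == x for v in row) - 1) / 2
            for x in row]


n, k = len(SD), 3                                  # 23 phonemes, 3 age groups
rank_sums = [sum(col) for col in zip(*(ranks(row) for row in SD.values()))]
mean_ranks = [r / n for r in rank_sums]

# Friedman chi-square statistic (no tie correction; this data set has no ties).
chi2 = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
p_value = math.exp(-chi2 / 2)                      # closed form, valid for df = 2 only

# Pairwise comparisons: absolute mean-rank difference over its standard error.
std_error = math.sqrt(k * (k + 1) / (6 * n))
pairs = {"5/4": (2, 1), "5/3": (2, 0), "4/3": (1, 0)}
pairwise = {name: abs(mean_ranks[a] - mean_ranks[b]) for name, (a, b) in pairs.items()}

print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
for name, diff in pairwise.items():
    print(f"{name}: diff = {diff:.3f}, z = {diff / std_error:.3f}")
```

Absolute differences are reported to avoid committing to a sign convention. The significance values of the pairwise comparisons are not recomputed here.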
4.3. U-shape development at four years old
Linked to the question about whether 4;0 is a turning point in the language acquisition process, and to the above data (standard deviation analysis), it is relevant to describe the finding of the greatest variability of 4-year-olds in the present study as a sign of a U-shaped (inverted in the chart) development pattern. Figure 5 shows how variability (based on standard deviation) is higher for 11 out of 23 phonemes (47.8%) in the 4;0 group: /e/, /a/, /o/, /s/, /i/, /l/, /p/, /θ/, /x/, /r/, and /f/. Therefore, it can be concluded that, at least in these 11 cases, a U-shaped development pattern can be observed. This issue will be thoroughly discussed later.
4.4. Extrapolation of results
Phonological frequencies depend on the lexical use and on the lexical selection the child makes (statistical acquisition based on the lexicon, Polo, 2016). “Children who still have a small vocabulary may be very selective in their choice of words, that is, either actively avoid words which are difficult to pronounce or substitute consonants systematically” (Stoll, 2009, p. 94). Therefore, a study such as the one presented
here is incomplete if lexical units are not taken into account. To
accomplish this, the most frequent lexical units presented in were
analysed. However, in order to reinforce conclusions, we used not only CHIEDE but also three more corpora from the CHILDES database (MacWhinney & Snow, 1985). In this way, it can be determined whether the results presented here are contextual or, on the contrary, a general tendency. To carry out this test, the methodology was as follows:
- Among the CHILDES corpora in Spanish, three corpora which shared features with CHIEDE were selected, especially regarding age range. They were the Spanish Díez-Itza Corpus (Díez-Itza, 1995), the Spanish BecaCESNo Corpus (Benedet & Snow, 2004) and the Spanish Marrero Corpus (Albalá & Marrero, 2004).
- From two of them, BecaCESNo and Marrero, those files (transcriptions) in which the child was younger than 3;0 or older than 6;0 years old were discarded, as CHIEDE’s participants are within that age range.
- Once the corpora were selected, CLAN, a tool provided by the CHILDES Project (MacWhinney & Snow, 1985), was used to extract the list of different forms (types) and their frequency of use.
- After cleaning up those lists (deleting proper names, as they are contextual, and correcting orthographic mistakes), they were compared and the most frequent lexical units or types common to the four corpora were extracted.
- The 500 most frequent types were selected and the phonological transcriber was applied to them.
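The comparison and selection steps above can be sketched as follows. The frequency lists are assumed to have already been exported (for instance from CLAN) and cleaned. The function name, the toy counts and, in particular, the ranking criterion (the sum of each type’s relative frequency across corpora) are assumptions made for illustration, since the text does not specify how the common types were ranked.

```python
from collections import Counter


def top_common_types(freq_lists, n=500):
    """Return the n most frequent types that occur in every corpus.

    freq_lists: one Counter (type -> absolute frequency) per corpus.
    """
    common = set.intersection(*(set(freqs) for freqs in freq_lists))
    totals = [sum(freqs.values()) for freqs in freq_lists]
    # Relative frequencies keep a large corpus from dominating the ranking.
    score = {t: sum(freqs[t] / total for freqs, total in zip(freq_lists, totals))
             for t in common}
    return sorted(common, key=lambda t: (-score[t], t))[:n]


# Toy frequency lists standing in for two of the four corpora.
corpora = [
    Counter({"la": 40, "casa": 10, "perro": 5, "sí": 20}),
    Counter({"la": 30, "casa": 12, "gato": 7, "sí": 25}),
]
print(top_common_types(corpora, n=2))
```

Types that appear in only some of the corpora (here "perro" and "gato") are dropped before ranking, which is what makes the resulting list independent of any single recording context.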
Phoneme | Beca CESNo | Díez-Itza | Marrero | CHIEDE | cv |
---|---|---|---|---|---|
a | 13.92 | 14.06 | 14.90 | 13.49 | 4.18 |
e | 14.26 | 13.54 | 13.56 | 13.19 | 3.31 |
o | 11.97 | 11.97 | 12.40 | 11.68 | 2.47 |
s | 8.88 | 8.77 | 8.68 | 8.85 | 1.04 |
i | 6.94 | 8.45 | 7.31 | 8.11 | 9.08 |
n | 7.62 | 7.54 | 6.80 | 8.42 | 8.70 |
k | 4.81 | 5.20 | 5.16 | 4.15 | 10.05 |
m | 4.06 | 4.37 | 3.30 | 4.68 | 14.42 |
l | 3.95 | 4.08 | 4.34 | 3.86 | 5.09 |
ɾ | 3.87 | 3.53 | 4.59 | 3.37 | 14.09 |
t | 3.82 | 3.68 | 3.91 | 3.61 | 3.63 |
u | 3.40 | 3.12 | 2.44 | 3.99 | 19.97 |
d | 2.95 | 2.75 | 2.95 | 2.58 | 6.34 |
p | 2.95 | 2.32 | 2.73 | 2.85 | 10.13 |
b | 2.74 | 2.53 | 2.77 | 2.47 | 5.56 |
ʎ | 1.20 | 1.32 | 1.15 | 1.50 | 12.01 |
g | 0.83 | 0.79 | 0.80 | 0.98 | 10.26 |
θ | 0.62 | 0.52 | 0.65 | 0.59 | 9.42 |
x | 0.40 | 0.53 | 0.56 | 0.46 | 14.07 |
ʧ | 0.23 | 0.23 | 0.22 | 0.43 | 34.71 |
f | 0.19 | 0.29 | 0.32 | 0.26 | 20.47 |
ɲ | 0.25 | 0.21 | 0.18 | 0.28 | 18.97 |
r | 0.13 | 0.17 | 0.28 | 0.20 | 31.92 |
Table 8 shows the results. The most relevant figures are those in the last column, in which the coefficient of variation shows the variability of the four samples in relation to the mean. The most homogeneous values belong to the phonemes /a/, /e/, /o/, /s/, /i/, /n/, /l/, /t/, /d/, /b/, and /θ/. On the other hand, /ʧ/, /f/, and /r/ show the most heterogeneous distribution. These phonemes are precisely the most infrequent ones not only in CHIEDE, but in the other three corpora too, as well as in the adults’ speech, again reinforcing the assumption about an existing relationship of the input frequency with the order of acquisition of phonemes.
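The coefficient of variation expresses the standard deviation as a percentage of the mean. The sketch below recomputes it for three rows of Table 8, assuming that the sample standard deviation (n − 1 in the denominator) was used. Under that assumption the results come close to the published values for frequent phonemes; since the table entries are rounded to two decimals, an exact match should not be expected, and the discrepancy may be larger for rare phonemes.

```python
from statistics import mean, stdev

# Relative frequencies from Table 8: BecaCESNo, Díez-Itza, Marrero, CHIEDE.
TABLE_8 = {
    "a": (13.92, 14.06, 14.90, 13.49),
    "e": (14.26, 13.54, 13.56, 13.19),
    "s": (8.88, 8.77, 8.68, 8.85),
}


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(values) / mean(values)


cv = {phoneme: round(coefficient_of_variation(values), 2)
      for phoneme, values in TABLE_8.items()}
print(cv)
```

Because the measure is relative to the mean, a small absolute difference between corpora produces a large coefficient for low-frequency phonemes, which is consistent with the rarest phonemes showing the most heterogeneous distribution.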
Broadly speaking, the differences among the four corpora are not meaningful, as frequency figures are almost equal, which means that the basic lexical units are not context dependent, but generalised, as well as the most frequent phonemes. Therefore, the results obtained after the phonological analysis carried out on CHIEDE can be extrapolated.
5. DISCUSSION
The major findings are summarised here in light of the research
questions. Regarding the first research question posed in the
present study, it can be concluded that, according to the sample, the
Spanish phonological system is essentially acquired (in terms of
intelligibility) at the age of three years (as shown in Table 2).
Acquisition is here understood as development, that is, as a process
where phonemes are already organised into patterns (what Velleman and Vihman (2002)
call templates) typical of the final stages of development in children, showing that units are rooted. According to Velleman and Vihman (2002, p. 20),
“templates serve as a stepping stone in the
direction of the adult system, despite the decrease in accuracy that may
temporarily result”. Vihman and Wauquier (2018, p. 38)
also state that “template formation is neither
the outcome of a pre-existing principle nor an end in itself, but
instead a dynamic (and momentary) child response, in the early stages of
acquisition, to the phonological and lexical challenges of the
language”.
It is generally accepted in Spanish phonological
acquisition research that the most problematic phonemes are liquid
consonants, the fricatives /s/, /θ/ and /x/, the nasal /ɲ/, and the
plosive /d/ (Acevedo, 1993; Bosch, 1983; Jiménez, 1987). However, after analysing these sounds in
CHIEDE, it can be observed that both the fricative /s/ and the liquids
/l/ and /ɾ/ are among the most frequent phonemes. CHIEDE’s participants
showed no added difficulty in their use, indicating that, although they
may be problematic phonemes at the time of their acquisition, from three
years old onwards these three sounds do not present any difficulty for
children with typical development; in fact, they are widely used.
Regarding the rest of the phonemes which are considered problematic, it
can be concluded that they are characterised by a lower use. The higher
frequency of certain phonemes over others is a lexical matter: “Thus,
when we examine the lexicon (words) of a language, not all sounds have
an equal opportunity to appear in all positions” (Bernstein-Ratner, 1994, p. 351).
Certain phonemes, such as /r/ or /ɲ/, are less frequent in the Spanish
lexicon, and thus their frequency of use is low (as seen in the frequency
lists, Tables 2 and 8).
Results from the present study shed light on the existence of a turning
point at four years old in the process of L1 acquisition (research
question 2). On the one hand, figures on PMLU and SMLU (Table 4)
indicate that from four to five years of age there is a significant
increase towards adult language. Furthermore, standard deviation (Table 6)
shows how language becomes stabilised from five years old onwards. It
can also be stated that the subjects from this study fit Díez-Itza and Martínez López’s (2004)
stages, as it seems that from 3;0 to 5;0 years old children are in a
period of reorganisation of the phonological system, termed
“stabilisation” by the authors; however, from 5;0 years old onwards
children seem to achieve the “resolution” stage. The variability shown by
the group of 4;0-4;12 leads to the conclusion that around four years old
there is a landmark which is relevant not only for research on typical
language development, but especially for research on speech and language
disorders. This turning point is also supported by the U-shaped
developmental pattern evidenced by the analysis in Figure 5.
Although the 3-year-old group displayed a similar pattern, it appeared
only in the less frequent phonemes. The 4-year-olds, however, exhibited a
higher variation and a U-shaped pattern precisely for those phonemes
which are acquired earlier and, therefore, should be stable at this age.
The overriding question guiding this research is to what extent
the input frequency is relevant in the L1 acquisition process (research
question 3). In disciplines such as Psycholinguistics, and more
specifically in Speech and Language Therapy, it is widely accepted that
phonemes which usually pose a problem in the acquisition process, such
as the Spanish trill, are characterised by a more difficult
physiological articulation (Bosch, 1983; López Valero et al., 1989).
However, this idea is conceived from the standpoint of adult speakers
whose articulatory system is fossilised. The baby’s physiology is ready
to adapt to different circumstances, and therefore we cannot determine
whether it is difficult for a child to manage his/her articulators to
pronounce a sound or whether he/she simply lacks enough examples to learn it.
According to Zamuner et al. (2004, p. 1420), “it appears that children are not limited by
articulatory or perceptual constraints, but rather that children’s
errors are largely influenced by their ability to access stored
representations”. For these reasons, and mainly based on the results
obtained from CHIEDE, the relevance of probability and frequency in
studies on language ontogenesis is highlighted here, as
frequency of use may be an essential indicator of typical development.
It is also agreed that at the age of three all vowels are acquired,
followed by nasals, approximants, and later plosives. However, at this
age, the incomplete acquisition of liquids, fricatives and affricates
prevails (Lleó, 2012). Interestingly, this order of acquisition
coincides with the order of frequency of spontaneous adult speech
phonemes in Spanish (Table 2).
Studies such as those by Demuth (2009), Ellis (2017), Kern et al. (2014) or Tomasello (2003),
among several others, demonstrate the probabilistic relationship
between input and language acquisition. The present study is another
example of how the input frequency affects language development (in this
particular case, phonological acquisition). “Ease of articulation seems
to play only a partial role in determining the overall developmental
route” (Pye, Ingram & List, 1987, p. 182).
Another factor influencing phonological
learning is phonological neighbourhoods, that is, sets of phonologically
similar words. Zamuner (2009)
showed that the words which are acquired first have denser neighbourhoods than those acquired later. Maekawa and Storkel (2006)
also highlighted the importance of phonotactic
probability and neighbourhood density. These authors concluded that
“[...] phonotactic probability, density and frequency appeared to
predict expressive vocabulary development but with individual variation
across children” (Maekawa & Storkel, 2006, p. 457).
Likewise, Pierrehumbert (2003)
referred to various studies that have shown that children are sensitive
to statistical patterns of sound. This stands in opposition to the idea
of a universal inventory from which the individual selects the
necessary elements to design his/her phonological system. The main
counter-argument she stated is that this theory does not explain why
children take so much time from when they acquire or distinguish an
element as one of their own language until they master its production in
an adult manner. Phonetic knowledge is gradually acquired and it is
updated through experience. “Acquiring the phonetic encoding system of a
language involves acquiring probability distributions over the phonetic
space” (Pierrehumbert, 2003, p. 184). Pierrehumbert (2003) defined the concept of phonetic space as the acoustic and articulatory
parameterisation of speech as a physical event, that is, what Moreno Cabrera (1997) called espacio de variación articulatorio (‘articulatory variation space’), or the different articulatory realisations of a phoneme.
This last idea leads us to consider how crucial the roles of probability
and frequency of use are in the process of language acquisition. Bernstein-Ratner (1994)
suggested that those elements that children acquire earlier are the
most frequent both in adult speech and in all languages throughout the
world, while phonemes that present a higher learning difficulty are
precisely those that are less represented.
In this study (Table 2),
the frequency of use that the oldest age group shows is very similar to
that shown by adults in spontaneous speech, whereas the differences
between the other two groups of children and the adult one are larger.
According to data from CHIEDE, the most common sounds in adult speech
are precisely those which, based on previous research (Bosch, 1983; López Valero et al., 1989; Serra, 1983), are acquired earlier and more easily, i.e.,
vowels and nasals in the first place, followed by plosives and liquids.
Lower positions on the frequency list are occupied by fricatives, which
are precisely the last and most problematic in the acquisition process.
The same phenomenon occurs in other languages. For example, the English sounds identified as more complicated to learn (Grunwell, 1981) are those which have a lower frequency rate in adult language (Mines et al., 1978).
Among these phonemes are the voiceless dental fricative
/θ/ and the voiceless and voiced postalveolar affricates /ʧ/ and /ʤ/.
There is an undeniable relationship between less frequent phonemes in
adult language and those which are more problematic in the acquisition
process.
The evidence so far leads us to emphasise the importance of
the input frequency in the study of L1 acquisition and its relation to
the most problematic phonemes. Accordingly, the importance of the place
and manner of articulation as the sole factor causing the delayed
acquisition of certain phonemes should be played down (Rose, 2009). As stated by Menn and Stoel-Gammon (1996, p. 352), “A theory of child phonology cannot ignore word
frequency although current adult phonological theory has no place for
this notion”.
5.1. Limitations
Although the children in CHIEDE showed a complete (intelligible) acquisition of phonemes at 3 years old, this finding must be regarded with caution, since the participants represent only a part of the whole population of Spanish-speaking children. Giving priority to sub-corpora balance (between the three different age groups’ language production) limited the number of participants per age group. Nevertheless, the comparison of CHIEDE’s data with those from three different corpora supports, to some extent, the findings of the present study.
As Grunwell (1981) stated, language acquisition is characterised by
great variation from one individual to another. However, data from CHIEDE
may serve as a paradigmatic pattern of linguistic behaviour for
research on child language.
Another potential limitation could be the grouping of participants. As 4 years old is hypothesised as a critical age, speakers could have been grouped by different age limits to analyse the range 3;5-4;5. However, a balanced distribution of children in three groups prevailed here. Otherwise, age ranges and number of participants per group would be unbalanced. In addition, it would also be relevant to consider the role of the gender factor for future research.
Concerning the characteristics of the transcription, further research is suggested on issues such as the distribution of phonemes, syllable structure, and the description of clusters or allophones. This would require a narrow transcription, which exceeds the scope of this research. Moreover, as mentioned, recording conditions were not ideal due to ambient sound.
Finally, it would be interesting to extend this experiment to other languages, particularly to other Spanish dialects and varieties, and observe to what extent patterns coincide.
6. CONCLUSIONS
Research on language acquisition beyond English and crosslinguistically
has thrived during the last decades, although many unsolved questions
still remain. There is a need for large cross-sectional spontaneous
speech corpora, sufficiently representative and linguistically
annotated. Furthermore, standards must be established to facilitate
analysis and comparison. As Stoll (2009, p. 91) complained, “the use of different data sets,
different methods or different criteria for coding makes it difficult to
compare across languages”. Also, corpus-based analysis of the late
acquisition period should be increased, that is, exceeding 36 months
old, as most of the existing corpora do not include child participants
exceeding that initial period of language development. The use of
representative corpora and computational tools enriches research on
language acquisition and is a reliable method for the study of
frequency, which, as several investigations reveal, is a significant
factor throughout the acquisition process.
The findings from the present study contribute to current research on Spanish-speaking children’s phonological acquisition in three ways:
- Providing a phonological inventory which may serve as a model for future research on typical and atypical child language development (from 3 years old onwards).
- Contributing to the assumption that 4 years old is a turning point in the process of language acquisition, as the variability analysis of the frequency of phonemes in CHIEDE shows.
- Corroborating the importance of the role of input frequency as a factor to take into consideration when analysing child language.
From a methodological point of view, we encourage language acquisition research based on natural language corpora. Corpus Linguistics and Computational Linguistics are essential in language analysis, especially from a usage-based approach, as discussed above and shown in this research. In addition to the three contributions mentioned above, the findings of this research have practical implications for Clinical Linguistics and Speech and Language Therapy, as they can be used as a paradigm for the assessment of child language