1. INTRODUCTION
The complexity of phonemic inventories varies across languages, and in today’s lingua franca (English), non-native English speakers of various ages and with a variety of first languages (L1s) encounter some challenges in producing cross-language speech.
Theories and
language learning models have described different factors that influence
the learning of L2 phonemes. These include the age at which L2 is
learned, the length of L2 exposure, L1 use (Flege, 1999; Flege et al., 2021), and the relationship between L1 and L2 phoneme systems (Best & Tyler, 2007; Flege, 1995).
Despite claims that a mixture of factors, especially those related to
age, can help or obstruct learning L2 segments, most scholars agree that
adults can still acquire phonological proficiency in an L2 (Munro & Derwing, 2008).
A significant factor in this case is the degree of difference in the
vowel systems between the L1 and English, and the use of contrastive
features such as tone, nasality, and relative duration that are
implemented to produce and perceive vowel contrasts (Martínez-Celdrán & Elvira-García, 2019; Ronquest, 2018).
As an example, Spanish (5 monophthong vowels) and Standard Southern
British English (12 monophthong vowels) are languages with different
phonemic structures. These cross-linguistic disparities tend to present
problems for Spanish L1 learners in producing English vowel contrasts
because, while there are some vowel phoneme approximations between the
languages, the general realisations regarding vowel quality are not
precisely equivalent.
In terms of age, adult L2 learners face a complex task when creating new phoneme categories for the L2. Flege (1999)
explains that the challenge is due to the L1
system’s advanced development, which hinders the adaptation of L2
sounds. However, the generation of novel phonological contrasts, such as
the English /iː/ and /ɪ/ or /uː/ and /ʊ/, may be aided by the L2 input
received, for example exposure to the language, explicit phonetics
instruction, or opportunities to utilise the L2. Developing a new
language is not just a matter of exposure to it, since immersion does
not necessarily translate into good quality and quantity of L2 input.
For instance, learners with the same length of residence in their L2
environment, but with different use of their L1 and L2 may have
disparities in their linguistic development (see Piske et al., 2001). Flege & MacKay (2004)
found that L2 learners with greater use of their L1 perceived English
vowels less accurately than participants with greater use of English
rather than their mother tongue.
The length of exposure/immersion
in the target language required to accurately develop foreign segments
is still open to debate, since studies have not succeeded in identifying
or agreeing on a minimal time frame. For example, Baptista (2006)
found that Brazilian-Portuguese speakers who had lived in the US for
six months were unable to produce the English /iː/-/ɪ/ vowel contrasts;
but, after eight months, some speakers were able to do so. Another study
by Morrison (2002)
found that Japanese and Spanish speakers living in Canada required more
than five months of exposure to the English language to establish the
vowel difference /i/-/ɪ/. Munro & Derwing (2008)
in their one-year longitudinal study found that different vowels
develop at varying rates, and the first six months of immersion in the
L2 are crucial to improving the intelligibility of vowels. The overall
findings supported the hypothesis of a rapid progress of L2 segments at
an initial stage of immersion before plateauing (Flege, 1988).
In contrast, other studies have found a lack of fast progress within
six months of exposure and suggested a longer period for sufficient
changes in segment accuracy, of more than three years (see Baker & Trofimovich, 2006; Koffi & Lesniak, 2019; Smith et al., 2019).
1.1. Native Language (L1) influence
The
phonemic system of the first language, particularly in adult learners,
can play a role in how quickly or accurately non-native English speakers
create and understand English tokens (Flege, 1995).
In the case of Spanish, the vowel inventory differs from English by
featuring less diphthongality and durational distinctions between vowels
(Flege et al., 1997).
Further, during the early stages of learning, Spanish L1 learners of
English may assimilate certain English vowels to their nearest
counterparts in the Spanish inventory. For example, the vowels of the English minimal
pair pool-pull both tend to be produced and perceived as the Spanish
/u/; the English /ɑ/ is likely to be perceived and produced as the
Spanish /a/ or as the vowel /o/ (Escudero & Chládková, 2010). A similar pattern has been found in earlier acoustic-phonetic research, such as Flege (1991).
This is especially true for the English vowels /ɪ/, /ɛ/ and /æ/ that
were realised as the Spanish /i/, /e/ and /a/, respectively. It has also
been found that Spanish speakers produce a distinctively English /iː/,
but an /ɪ/ overlapping with /iː/ (Cebrian et al., 2021; Flege, 1991; Flege et al., 1997; Fullana-Rivera & Mackay, 2003).
Generally, there seems to be evidence to support the idea that adult learners can adjust the production of segments with time. However, individual variables, including motivation, age, social contact, and L1 background, play an essential role in the development of L2 English segments.
The aim of the present research was to examine longitudinally the progress of adult Spanish speakers, both individually and as a group, in distinguishing between the English vowel pairs /iː/ and /ɪ/, /ɪ/ and /e/, and /uː/ and /ʊ/ in terms of vowel quality in citation-style speech.
Using word lists is a widely adopted technique for
gathering speech data. One benefit of this method is its capacity to
regulate phonological context, thus circumventing connected speech
processes such as vowel reduction, coarticulation effects, and prosodic
features that might influence the ultimate pronunciation outcome (Fogerty & Humes, 2012; Shattuck-Hufnagel & Turk, 1996). Additionally, adopting a Chomskian perspective (Chomsky, 1965),
word lists can offer researchers a more direct avenue to delve into the
speaker’s underlying phonological/phonetic competence compared to
alternative elicitation methods. In spontaneous conversations, speakers
tend to attend to various factors, such as the content of their message.
When reading from a scripted passage, the speaker may concentrate on
the prosodic fluency of their delivery. In contrast, a word list
minimises the potential influence of ‘distracting’ factors such as
stress, intonation, and rhythm. While it is acknowledged that one can
never directly access competence (a speaker’s unconscious underlying
knowledge of a system) except through observing performance, the
argument can be made that the word list, with its emphasis on
pronunciation of isolated words, reduces performance factors to the
greatest extent possible. Consequently, it brings researchers closer to
competence compared to other speech elicitation techniques. In support
of this, a study conducted by Leung et al. (2016)
investigated temporal and spectral differences in the production of
English vowel pairs, specifically /iː/-/ɪ/, /a/-/ʌ/, and /uː/-/ʊ/, in
citation style of speech. The study involved eighteen native speakers of
Canadian English reading six isolated words (in KVD context e.g.,
‘keyed,’ ‘kid,’ ‘cad’). The findings revealed distinctions in both
temporal and spectral dimensions. When speakers articulated words more
clearly, the duration of tense vowels (/iː/, /a/, and /uː/) increased,
while lax vowels (/ɪ/, /ʌ/, and /ʊ/) showed more formant changes. In
essence, the vowel pairs underwent different dimensional changes in
clear-citation form style. Another study by Smiljanić and Bradlow (2005),
which tested the effect of clear speech in English and Croatian, found
that word list productions increased the first and second formant
values, raising the degree of separation between contrasting vowels (in
both languages).
1.2. Research questions
In this study we address the following questions by reference to the data collected from our forty Spanish L1 participants: (1) Does the group of participants as a whole modify the production of the phonemic contrasts between /iː/ and /ɪ/, /ɪ/ and /e/, and /uː/ and /ʊ/ over the course of a year? (2) How do the learners mark the developing contrast between members of the front vowel pairs and of the back vowel pair in terms of vowel quality, as indexed by the first or second formant values? (3) Do some individual speakers develop more marked contrasts than others? (4) Which specific experiential L2 exposure/engagement factors are associated with the changes?
Marking the phonemic contrasts between the vowel pairs on which we focussed may present challenges because these contrasts are absent in Spanish. Likewise, shifting pronunciation from Spanish to English norms can be problematic because some of these English vowels are closely similar to Spanish vowels.
It is recognised that the relative functional loads of the examined vowel pairs are not uniform (Gilner, 2020).
The /iː/-/ɪ/ and /ɪ/-/e/ contrasts bear a relatively high functional
load, which means that many minimal pairs of English words depend
entirely on the presence of one member of these pairs rather than the
other. If individuals fail to effectively distinguish between members of
these pairs, it could result in frequent misunderstandings among
listeners. Consequently, non-native speakers of English may have a
strong motivation to master these distinctions. On the other hand, the
/uː/-/ʊ/ contrast has a low functional load, meaning that it is used to
distinguish between few words, and failure to make a clear distinction
between members of this pair is unlikely to result in frequent
misunderstandings. Including both low and high functional load pairs in
the study serves the purpose of offering initial insights into the
relative importance of factors shaping participants’ alignment with
native English norms. A marked difference in change between front and
back pairs would imply a predominant influence of communicative needs,
specifically the essential requirement of being understood. Conversely, a
roughly similar rate of change for both back and front pairs could
suggest that the motivation to sound proficient in English is just as
crucial as the drive to fulfill communicative needs. Essentially,
substantial progress in achieving intelligibility may occur even without
a strong communicative need if there is ample motivation for linguistic
proficiency itself.
Given the participants’ exposure to various
English accents, it is not possible to specify the exact phonetic target
pronunciations for them. However, the challenges in this area are
significantly reduced by the fact that three of the five vowels studied
are classified as British English phonologically “short” vowels. These
vowels are the least socially and regionally variable (French et al., 2008; Wells, 1982).
2. METHODOLOGY
2.1. Participants
The participants were forty native Spanish-speaking postgraduate students at the University of York, UK (20 female/20 male) from Mexico, Spain, Ecuador, Chile, Colombia, Peru and Argentina. The average age was 27 years (M = 27.3, SD = 2.81). All the participants had learned English as a foreign language in traditional classroom settings (2-4 hours a week). Before arriving in the UK, none of them had ever spent more than three weeks living in an English-speaking country.
The participants had an average IELTS score of 6.5, which, according to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (2001), is equivalent to independent user level B2.
2.2. Procedure
2.2.1. Stimuli
The corpus was collected by recording participants reading a list of sixty monosyllabic English words at three different times during a year. In each reading session, fifteen of the words were targets and forty-five were fillers. The list of words was recorded with no repetitions. Before reading the word list, participants were asked to introduce themselves and briefly talk about what they did the day before as a warm-up activity.
The first session was recorded one month after arriving at university; the second session was recorded five months after the first one, and the final session was recorded at the end of the academic year (five months after the second one).
The decision to
exclusively use monosyllabic words aimed to eliminate the influence of
connected speech processes. The online platform English Lexicon Project (Balota et al., 2007)
was employed to generate high-frequency words, to ensure a clear
pronunciation without the uncertainties that might arise with less
frequent and potentially unfamiliar words. Each word in the list
followed a CVC structure and consisted of three to five orthographic
letters, allowing for controlled data by limiting the set of contexts.
The word order was randomized using the Random.org program (Haahr, 1998).
2.2.2. Recordings
The individual recordings were made in a sound-treated recording studio with a DPA 4066 head-worn omnidirectional microphone (frequency response 20 Hz to 20 kHz, 3 dB soft boost at 8-20 kHz). A sampling rate of 44.1 kHz and a bit depth of 16 bits were used. All data were collected without any EQ or filtering applied. Twelve dB of headroom was allowed during the recording process to avoid overloading or clipping of the signal.
2.2.3. Language background questionnaire
Flege (2018)
emphasizes the significance of both the quality
and quantity of L2 input for successful learning of L2 speech. The
various opportunities for interaction in the L2 and the necessity to
communicate in the target language will equip motivated speakers with
additional tools and time to enhance and practice their English-speaking
skills (Flege & Liu, 2001).
To uncover the specific types of L2 experiences and spoken interactions
linked to notable progress in distinguishing among the members of the
three vowel pairs, participants completed a language background
questionnaire.
The questionnaire was administered on three occasions throughout the year. We divided it into four primary sections: 1) background information, 2) attitudes toward English, 3) opportunities for English language development, and 4) circumstances of exposure to English.
All participants were adults, aged eighteen
and above, placing them beyond the sensitive or critical period for
language acquisition. The critical period hypothesis suggests that adult second
language learners are less responsive to input compared to children,
making it challenging for them to attain native-like pronunciation in a
second language after puberty (Lenneberg, 1967).
The overall findings shed light on the relationships between the extent and nature of L2 exposure and engagement and the progress in developing the vowel contrasts.
2.3. Acoustic analysis
The study reports on the F1 and F2 values extracted from fifteen monosyllabic words
containing the target vowels /iː/, /ɪ/, /e/, /uː/ and /ʊ/ from the word
list read across the three time points. A total of 1,800 tokens (15
words x 3 times x 40 speakers) were obtained and analysed. Words with
palatal onset /j/ were not included in order to avoid fronting as a
co-articulatory effect, as opposed to an adjustment to newly emerged and
well-documented British English L2 fronting norms, for /u:/ and /ʊ/ (Kleber et al., 2011; Sóskuthy et al., 2015).
Similarly, no words with initial /w/ were included to avoid co-articulatory F2 reduction effects arising from the velarity and the
lip rounding of the initial consonant. Formant measurements were
extracted at the midpoint of each vowel using a script in Praat software
version 6.0.49, with the maximum formant set to 5500 Hz and the
number of formants set to 4.5 (with manual adjustment when needed) (Boersma & Weenink, 2019). Finally, using the NORM software by Thomas & Kendall (2007), the raw values of F1 and F2 were normalized for the mean per speaker of each vowel using the vowel-extrinsic Lobanov method.
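As an illustration of the normalization step, the following minimal sketch applies a Lobanov-style z-score transformation to one speaker's formant values. It is not the NORM implementation: the function name, the use of the sample standard deviation, and the example values are assumptions made for illustration, and any rescaling of the z-scores back to a Hz-like range is not shown.

```python
from statistics import mean, stdev


def lobanov_normalize(tokens):
    """Return Lobanov z-scores for a single speaker.

    tokens: list of (vowel_label, f1_hz, f2_hz) tuples from one speaker.
    Each formant is centred on that speaker's mean for the formant and
    divided by that speaker's standard deviation for the formant.
    """
    f1_values = [f1 for _, f1, _ in tokens]
    f2_values = [f2 for _, _, f2 in tokens]
    # Sample standard deviation is an assumption; the source does not specify it.
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return [
        (vowel, (f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for vowel, f1, f2 in tokens
    ]


# Hypothetical midpoint measurements (Hz) for one speaker, for illustration only.
speaker_tokens = [
    ("i:", 300.0, 2500.0),
    ("ɪ", 400.0, 2200.0),
    ("e", 550.0, 2000.0),
    ("u:", 350.0, 1400.0),
    ("ʊ", 420.0, 1200.0),
]
normalized = lobanov_normalize(speaker_tokens)
for vowel, z1, z2 in normalized:
    print(f"{vowel}: F1 z = {z1:+.2f}, F2 z = {z2:+.2f}")
```

Because the transformation is computed speaker by speaker across all of that speaker's vowels, the resulting values are comparable between speakers with different vocal tract sizes.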
2.4. Statistical analysis
To evaluate the progress of participants, both individually and as a group, the following steps were taken:
- Group: we conducted a repeated-measures Analysis of Variance (ANOVA) separately for each vowel to detect differences in vowel quality across Time 1 (T1), Time 2 (T2), and Time 3 (T3). The within-subject effects for the ANOVA were time (3), formant (2), and time * formant. The interactions between time and formants were followed by post-hoc analyses: paired-sample t-tests (p < .05) for which all significant results were Bonferroni corrected. We calculated Euclidean distances to observe the separation between each pair of vowels.
- Individual: Euclidean distances were obtained to calculate the degree of separation between the pair of tokens examined. These analyses were conducted for each speaker with the normalised mean value for each pair of vowels: /iː/ and /ɪ/, /ɪ/ and /e/, and /uː/ and /ʊ/. The Euclidean distance was proportional to the separation between the contrasting vowels produced by each individual speaker across three time points.
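The Bonferroni correction mentioned for the group-level post-hoc tests can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the three pairwise time comparisons are assumed to form one family of tests, and the p-values are invented rather than taken from the study.

```python
def bonferroni_correct(p_values, alpha=0.05):
    """Multiply each raw p-value by the number of comparisons (capped at 1.0)
    and flag the comparisons that remain significant at `alpha`."""
    m = len(p_values)
    corrected = {}
    for label, p in p_values.items():
        p_adj = min(1.0, p * m)
        corrected[label] = (p_adj, p_adj < alpha)
    return corrected


# Hypothetical uncorrected p-values for the three pairwise time comparisons.
raw_p = {"T1-T2": 0.004, "T2-T3": 0.020, "T1-T3": 0.300}
for label, (p_adj, significant) in bonferroni_correct(raw_p).items():
    print(f"{label}: corrected p = {p_adj:.3f}, significant = {significant}")
```

With three comparisons, a raw p-value must fall below roughly .0167 to remain significant at the .05 level after correction.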
To assess the degree of change and the distribution/spread of the vowel pairs during the three testing times, the participants were split into two categories: high and low performers. The high performers were participants with a clear vowel distinction, and the low performers were the speakers who did not produce a clear separation of the vowels. The arbitrary division was based on the Euclidean distance obtained from the reference data points for English RP speakers (broad values were used); the values were taken from Deterding (1990, as cited in Deterding, 1997) and then used to calculate the Euclidean distance between the values examined. The rationale for this was to establish mean reference values for contrasting and comparing the findings of the current data. The reference broad value for the Euclidean distance between the /i:/ and /ɪ/ vowels was 250 Hz; for /u:/ and /ʊ/ it was 110 Hz; and for /ɪ/ and /e/ it was 250 Hz.
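The distance measure and the high/low split can be sketched in a few lines. The Euclidean distance between two vowels is taken here as the square root of the summed squared differences in F1 and F2; the function names, the example speaker means, and the choice to treat a distance equal to the reference value as "high" are assumptions for illustration only.

```python
import math

# Reference Euclidean distances (Hz) quoted in the text for RP speakers.
REFERENCE_DISTANCE_HZ = {
    ("i:", "ɪ"): 250.0,
    ("u:", "ʊ"): 110.0,
    ("ɪ", "e"): 250.0,
}


def euclidean_distance(vowel_a, vowel_b):
    """Distance between two (F1, F2) mean points in the F1-F2 plane."""
    return math.hypot(vowel_a[0] - vowel_b[0], vowel_a[1] - vowel_b[1])


def performer_category(pair, vowel_a, vowel_b):
    """Return 'high' if the separation reaches the reference value, else 'low'.

    Treating equality as 'high' is an assumption, not stated in the source.
    """
    distance = euclidean_distance(vowel_a, vowel_b)
    return "high" if distance >= REFERENCE_DISTANCE_HZ[pair] else "low"


# Hypothetical speaker means (F1, F2) in Hz, for illustration only.
mean_long_i = (330.0, 2500.0)
mean_short_i = (390.0, 2300.0)
print(round(euclidean_distance(mean_long_i, mean_short_i), 1))
print(performer_category(("i:", "ɪ"), mean_long_i, mean_short_i))
```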
Acknowledging these are
average group values, it is worth noting that some RP speakers in the
study might exhibit Euclidean distance values above or below the group
means. Unfortunately, the author (Deterding, 1990, as cited in Deterding, 1997)
did not furnish individual mean ranges, making it difficult to
determine if participants in the current study produced values within
the range for native L1 speakers. At best, we could assess if they
aligned with the norms set by the RP group.
The ongoing shifts in
the phonetic realisations of English segments pose challenges in
directly applying the acoustic data published for RP/SSBE vowels to
current pronunciation trends. Despite this, recent studies conducted by Bjelaković (2017), Deterding (1997), Fabricius (2007), and Ferragne & Pellegrino (2010) aim to supplement existing data from works such as those by Wells (1982), Henton (1983), and Bauer (1985).
While these studies may be considered somewhat dated, they presently
stand as the most comprehensive empirical data available for assessing
vowel quality.
3. RESULTS
3.1. Group results
We examined the results to obtain a general view of the performance of the group as a whole, and to observe the effect of time on the production of both formants (the results are presented independently per vowel). Table 1 summarises the outcomes.
Vowel | Time (x 3) | Time * Formants | Post-hoc test F1 | Post-hoc test F2 |
---|---|---|---|---|
/i:/ | p < .002 | non-sig | non-sig | non-sig |
/ɪ/ | p < .001 | p < .001 | T1-T2 p < .018; T2-T3 p < .036; T1-T3 non-sig | T1-T2 p < .001; T2-T3 p < .001; T1-T3 non-sig |
/u:/ | p < .001 | p < .001 | non-sig | T1-T2 p < .001; T2-T3 p < .001; T1-T3 p < .001 |
/ʊ/ | p < .001 | p < .001 | T1-T2 non-sig; T2-T3 p < .001; T1-T3 p < .001 | T1-T2 non-sig; T2-T3 p < .001; T1-T3 p < .001 |
/e/ | p < .001 | p < .001 | T1-T2 non-sig; T2-T3 p < .001; T1-T3 non-sig | T1-T2 p < .015; T2-T3 p < .001; T1-T3 p < .001 |
As shown in Table 1, by T3, the vowels /ɪ/ and /ʊ/ showed significant differences in their F1 and F2 structure across time. By T3, /u:/ showed significant differences in its F2, and /e/ showed significant differences in its F1 and F2 values.
In addition, the group performance in terms of vowel quality at the beginning (T1) and end of the year (T3), as measured by F1-F2 Euclidean distances (E.d.), showed the following: a) for /iː/ vs /ɪ/, the group did not have a clear distinction between this pair at T1 (E.d. = 112 Hz) and did not develop one by T3 (E.d. = 235 Hz); b) for /ɪ/ vs /e/, the group had a clear distinction between this pair at T1 (E.d. = 268 Hz) and had maintained it by T3 (E.d. = 394 Hz); c) for /uː/ vs /ʊ/, the group did not have a clear distinction between this pair at T1 (E.d. = 68 Hz) and did not develop one by T3 (E.d. = 93 Hz).
3.2. Individual results
We examined the rates of progress across individual members of the group in terms of Euclidean distance separation. To identify speakers who exhibited changes in formant values, we established three primary groups:
- Moderate/static group: this group comprised participants who did not surpass the established thresholds for vowel separation for each vowel pair (e.g., results below 250 Hz for /i:/ and /ɪ/, below 110 Hz for /u:/ and /ʊ/, and below 250 Hz for /ɪ/ and /e/).
- Substantial/large movement group: participants placed in this group were those who either exhibited changes in formant values surpassing the established thresholds for each vowel pair or consistently maintained their formant values above the RP norms (e.g., results above 250 Hz for /i:/ and /ɪ/, above 110 Hz for /u:/ and /ʊ/, and above 250 Hz for /ɪ/ and /e/).
- Backward movement group: this group included participants who initially produced formant values above the set thresholds for vowel pairs but, at some testing point (T2 or T3), exhibited values falling below those thresholds.
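The three-way grouping above can be expressed as a simple decision rule. The sketch below is one possible reading of the definitions rather than the procedure actually used: the function name is hypothetical, and the handling of a speaker who crosses the threshold at T2 but drops below it again at T3 (classified here as substantial movement) is an assumption.

```python
def movement_group(distances, threshold):
    """Classify one speaker for one vowel pair.

    distances: Euclidean distances (Hz) at T1, T2 and T3, in that order.
    threshold: reference distance (Hz) for the vowel pair.
    """
    t1, *later = distances
    # Started above the threshold but fell below it at T2 or T3.
    if t1 > threshold and any(d < threshold for d in later):
        return "backward movement"
    # Never surpassed the threshold at any time point.
    if all(d <= threshold for d in distances):
        return "moderate/static"
    # Surpassed the threshold at some point, or stayed above it throughout.
    return "substantial/large movement"


# Hypothetical distance trajectories (Hz), for illustration only.
print(movement_group([87.0, 90.0, 95.0], 250.0))
print(movement_group([120.0, 200.0, 355.0], 250.0))
print(movement_group([151.0, 130.0, 90.0], 110.0))
```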
As shown in Figure 1, most of the participants [≈70%] maintained their separation boundaries below the 250 Hz threshold across time points. Six speakers had the distinction between this pair at T1 and kept it by T3 (avg: over 250 Hz). Four speakers progressed to making the distinction by T3 (avg: 355 Hz). Thirty speakers did not develop a distinction during the whole year (avg: 87 Hz).
As observed in Figure 2, seven speakers had a near-native distinction between this pair at T1 (avg: 151 Hz). By T3, four had maintained the distinction and three had not. Twenty-five speakers progressed to making a clear distinction by T3 (avg: 239 Hz). Eight speakers did not develop one by the end (avg: 68 Hz).
Figure 3 shows that participants exhibited little variation in the separation of the vowels across time. Twenty-six speakers had a near-native distinction between this pair at the start of the year (over 250 Hz). By T3, twenty had maintained the distinction, while six had not kept the separation. Only two speakers progressed to making a distinction by T3, and twelve speakers did not develop a clear separation between the tokens by T3.
3.3. F1-F2 individual results
Considering the previous outcomes, it was important to look in more detail at what may have caused the distance, or the lack of it, between each pair of vowels under examination, i.e., whether the separation was made by a distinction in F1, in F2, or in both. The following are the results in terms of individual formants.
Vowel | Nº sp | F1 | Vowel | Nº sp | F1 |
---|---|---|---|---|---|
/i:/ | 19 | Higher values (avg: 350 Hz) | /ɪ/ | 13 | Higher values (avg: 421 Hz) |
/i:/ | 21 | Lower values (avg: 335 Hz) | /ɪ/ | 27 | Lower values (avg: 383 Hz) |
These results show that, by the end of the year, roughly half of the speakers produced /i:/ as a higher fronted vowel and the other half as a lower fronted vowel; /ɪ/, however, was produced by most of the speakers as a higher and more fronted vowel than at T1.
Vowel | Nº sp | F2 | Vowel | Nº sp | F2 |
---|---|---|---|---|---|
/i:/ | 26 | Increased values (fronted) (avg: 2599 Hz) | /ɪ/ | 34 | Increased values (fronted) (avg: 2406 Hz) |
/i:/ | 24 | Decreased values (central) (avg: 2505 Hz) | /ɪ/ | 6 | Decreased values (central) (avg: 2209 Hz) |

Vowel | Nº sp | F1 | Vowel | Nº sp | F1 |
---|---|---|---|---|---|
/u:/ | 17 | Higher values (avg: 412 Hz) | /ʊ/ | 27 | Higher values (avg: 538 Hz) |
/u:/ | 23 | Lower values (avg: 365 Hz) | /ʊ/ | 13 | Lower values (avg: 406 Hz) |

Vowel | Nº sp | F2 | Vowel | Nº sp | F2 |
---|---|---|---|---|---|
/u:/ | 39 | Increased values (fronted) (avg: 1519 Hz) | /ʊ/ | 40 | Increased values (fronted) (avg: 1331 Hz) |
/u:/ | 1 | Decreased values (retracted) (avg: 843 Hz) | /ʊ/ | 0 | Decreased values (retracted) |
Regarding F1, by T3 a bigger separation between the vowels was observed, whereby /u:/ exhibited lower F1 values (365 Hz), although the change was non-significant, and /ʊ/ followed the opposite tendency (538 Hz). In terms of F2, by T3 both vowels were produced in a more fronted position than at previous times: /u:/ obtained a mean of 1519 Hz, while /ʊ/ obtained a mean of 1331 Hz.
Vowel | Nº sp | F1 | Vowel | Nº sp | F1 |
---|---|---|---|---|---|
/ɪ/ | 13 | Higher values (avg: 412 Hz) | /e/ | 16 | Higher values (avg: 669 Hz) |
/ɪ/ | 23 | Lower values (avg: 385 Hz) | /e/ | 24 | Lower values (avg: 587 Hz) |

Vowel | Nº sp | F2 | Vowel | Nº sp | F2 |
---|---|---|---|---|---|
/ɪ/ | 34 | Increased values (fronted) (avg: 2406 Hz) | /e/ | 30 | Increased values (fronted) (avg: 2011 Hz) |
/ɪ/ | 6 | Decreased values (central-back) (avg: 2209 Hz) | /e/ | 10 | Decreased values (retracted) (avg: 1901 Hz) |
The tables above show that, for F1, most of the speakers shifted their production of /ɪ/ and /e/ (T1-T3) from higher values to lower ones. In terms of F2, by T3 most of the speakers had moved both vowels towards a more fronted position compared to the previous time (although the results for /ɪ/ were not significant).
In summary, /ɪ/ tended to move to a higher and more fronted position and /e/ was also inclined towards a higher and more fronted position, possibly owing to an adjustment towards native speaker norms (a vowel close to Cardinal 3).
When taken as a whole, these findings offer significant new information about the development of participants’ contrast between the English vowel pairs /i:/ and /ɪ/, /u:/ and /ʊ/, and /ɪ/ and /e/ in terms of spectral properties. Additionally, it has been possible to determine whether speakers’ improvements or alterations in vowel contrasts were made in F1 and F2 individually or in both formants jointly.
In general, participants displayed some formant difference, regardless of the degree of separation between the vowel pairs under examination. In other words, Spanish speakers generally realised the vowels as British English speakers do, with /i:/ produced higher than /ɪ/, /u:/ pronounced higher than /ʊ/, and /e/ produced lower than /ɪ/.
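The per-formant counts reported in this section (number of speakers with higher or lower F1, and with increased or decreased F2) amount to a simple tally of the direction of change between T1 and T3. A minimal Python sketch of such a tally follows; the function names, the data layout, and the formant values are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def direction(t1_hz, t3_hz):
    """Label the direction of change of one formant between T1 and T3."""
    if t3_hz > t1_hz:
        return "higher"
    if t3_hz < t1_hz:
        return "lower"
    return "unchanged"


def tally_changes(speakers):
    """Count speakers by direction of F1 and F2 change for one vowel.

    `speakers` maps a speaker label to {"T1": (F1, F2), "T3": (F1, F2)},
    with all values in Hz.
    """
    f1_counts, f2_counts = Counter(), Counter()
    for measures in speakers.values():
        f1_counts[direction(measures["T1"][0], measures["T3"][0])] += 1
        f2_counts[direction(measures["T1"][1], measures["T3"][1])] += 1
    return f1_counts, f2_counts


# Made-up /ɪ/ measurements for three hypothetical speakers.
example = {
    "S01": {"T1": (400.0, 2200.0), "T3": (380.0, 2400.0)},
    "S02": {"T1": (390.0, 2300.0), "T3": (420.0, 2410.0)},
    "S03": {"T1": (410.0, 2350.0), "T3": (385.0, 2210.0)},
}
f1_counts, f2_counts = tally_changes(example)
```

In articulatory terms, a higher F1 corresponds to a more open (lower) vowel and a higher F2 to a more fronted one, so the two tallies can be read as movement along vowel height and frontness respectively.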
3.4. Language background questionnaire results
To assess the reasons behind vowel changes observed in certain speakers compared to others, particularly concerning their L2 exposure and engagement, we conducted a language background questionnaire.
The questionnaire consisted of four sections. The first part gathered background information, encompassing details such as age, origin, and the discipline participants were studying. The second part probed into attitudes towards English, addressing topics like the importance of maintaining a Spanish accent, self-assessment of English skills, and opinions on various English accents. The third section focused on opportunities to develop English language skills, exploring factors like the amount of supervision time participants were given, on/off-campus work, and participation in extracurricular activities. The final section centered on the circumstances of exposure to English, covering living arrangements, friendships, and daily use of both the first (L1) and second (L2) languages.
The main results are summarized in Table 8.
Similarities ✔ | Differences X |
---|---|
a) Attitudes toward English | a) Attitudes toward English |
✔ Highly motivated to speak English. ✔ To achieve native-like British pronunciation was a goal. ✔ To maintain their Spanish accent while speaking was not important. ✔ Speaking was the most difficult English skill to perform. | X The high performers reported progress in understanding the different British English accents. The low performers did not. X The high performers found speaking the most difficult skill. The low performers did not. |
b) Opportunities to develop the English language | b) Opportunities to develop the English language |
✔ Both groups had more exposure to English in academic and non-academic settings by Time 2. | X By T3, the high performers increased their interactions in academic and non-academic settings with native speakers. The low performers reduced them. X The high performers did not participate in extracurricular activities. The low performers did. |
c) Circumstances of exposure to English | c) Circumstances of exposure to English |
✔ Neither group lived with British host families or with their Spanish-speaking family in the UK. ✔ None of the members of the groups had native English speakers as partners. ✔ None of the members of the groups had native English speakers as close friends. | X The high performers spoke more Spanish than English during the day. The low performers did the opposite. X The high performers reported speaking English with native speakers most of the time. The low performers spoke English mostly with international speakers (e.g., Chinese, Italian, or other Spanish speakers), showing a difference in the quality of input rather than quantity. |
The comparative table above offers an overview of the key distinctions and similarities between both groups. It is evident that diverse linguistic and social experiences played a role in shaping the development of the English vowel contrast.
Social factors emerged as significant influences on participants who made progress in achieving a native-like contrast between vowel pairs, particularly through sustained and increased academic and non-academic social interactions with native English speakers post-T2. Participants who developed clear vowel contrasts predominantly showed more interaction with native British English speakers, which resulted in an improvement in their English comprehension, thus influencing the advance of their vowel productions.
By contrast, the trend observed in the low-performing group suggests that the potential ‘low quality’ of English input (coming from non-native speakers) and the reduction in academic activities after T2 played a pivotal role in the lack of English comprehension, and, therefore, the production of English vowel contrasts.
Contrary to expectations, some differences between high and low performers were noted in terms of confidence and competence in speaking English. Surprisingly, high performers reported finding speaking a challenging skill throughout the year, while low performers did not. Another surprising result concerned L1 use during a typical day: high-performing participants reported speaking more Spanish than their low-performing counterparts over the course of a day.
4. DISCUSSION AND CONCLUSION
The purpose of this study was to examine the progress of adult speakers of Spanish, both individually and as a group, in distinguishing productively between English vowels /iː/ and /ɪ/, /uː/ and /ʊ/, and /ɪ/ and /e/ in terms of spectral features. It also sought to identify and consider the factors associated with the different rates of progress among speakers.
It is acknowledged that, in addition to vowel quality as indexed here by F1 and F2 values, duration is another feature that speakers may use to distinguish vowels phonemically from each other, and Spanish speakers may face some problems producing length differences between ‘short’ and ‘long’ English vowels. However, in this article, we focus only on vowel quality. A further study, which includes quantity data, is in preparation.
First, our findings concerning whether or not the participants, as a group, produced phonemic contrasts between the pairs of vowels examined indicated that for /iː/ and /ɪ/ and for /uː/ and /ʊ/ the group exhibited some progress in achieving a phonemic contrast between these vowel pairs; however, by the end of the year the distinctions made by the group did not fully align with native RP English norms. The phonemic contrast /iː/ and /ɪ/ at T1 was realised with an average of 112 Hz; by T3 this value had increased to 235 Hz. It was below the English norm for the quality distinction, but not by a wide margin. Moreover, if the RP English norm is 250 Hz and the T3 mean for this group is 235 Hz, there is likely to be some overlap between the two populations.
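The size of the group's progress on /iː/-/ɪ/ can be made explicit with a short calculation using only the three figures quoted above (112 Hz at T1, 235 Hz at T3, and the 250 Hz RP norm). The function name and the returned keys are illustrative choices, not part of the study.

```python
def progress_summary(t1_hz, t3_hz, norm_hz):
    """Summarise the change in vowel separation relative to a norm."""
    return {
        "gain_hz": t3_hz - t1_hz,            # growth in separation T1 -> T3
        "shortfall_hz": norm_hz - t3_hz,     # distance still missing at T3
        "share_of_norm": t3_hz / norm_hz,    # T3 separation as a fraction of the norm
    }


# Group means for /iː/-/ɪ/ as reported in the text.
summary = progress_summary(t1_hz=112.0, t3_hz=235.0, norm_hz=250.0)
```

On these figures the separation grew by 123 Hz over the year, leaving the group 15 Hz short of the norm, i.e., at 94% of the RP value.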
The findings for the /ɪ/-/e/ contrast reveal that the group generally maintained the separation between the vowels by T3, with a clear, native-like distinction between the /ɪ/-/e/ pair.
These results provide more evidence in favour of the earlier claims made by certain researchers (e.g., Flege et al., 1992; Guion et al., 2000) that L2 immersion through residency, particularly for adult learners, improves L2 performance.
In addition, the findings for /iː/ and /ɪ/ are consistent with earlier studies by, for example, Flege et al. (1997), Morrison (2002) and Fullana-Rivera & Mackay (2003), which have shown that Spanish L2 learners were unable to distinguish between /iː/ and /ɪ/. Their findings are primarily related to a brief exposure (less than a year) to the new vowel categories, which differs from the immersion length of the group in this study (1 year).
Second, regarding potential changes in vowel quality produced by the participants, and whether these changes were marked by the first or second formant values, our findings showed an important alteration in the vowel quality of the vowel pairs. When producing /iː/ and /ɪ/, the group increased their F1 and F2 values over a year. The final outcomes for /ɪ/ showed an increase in the F1 value (avg: 407 Hz among participants), which shows a movement of /ɪ/ to a more open position over the course of a year.
The fact that the development of the vowel pair contrast was marked by producing /ɪ/ in a more open position indicates progress in utilising vowel quality cues for English vowel production. This is in contrast to Flege’s (1991) assimilation pattern and the findings of Fullana-Rivera & Mackay (2003), as the production of /ɪ/ did not assimilate to the Spanish /i/ category, which has an F1 value of approximately 286 Hz (Bradlow, 1995).
The developing contrast between /uː/ and /ʊ/ was marked by vowel quality, as indexed for /uː/ in F2 and for /ʊ/ in F1 and F2. The final outcome for /uː/ shows a statistically significant increase in the F2 value, resulting in a more fronted vowel compared to the initial production during the testing period. For /ʊ/, the final result shows an increase in both F1 and F2. The higher F1 value signified a movement of /ʊ/ to a more open position, while the higher F2 value indicated progress toward a more fronted position.
These findings indicate that the speakers established a contrast closer to contemporary British English. The fronting observed in /uː/ and /ʊ/ indicates an accommodatory gravitation towards present-day /uː/ and /ʊ/ targets. This shift involves moving from very high back vowels, resembling those found in English learning textbooks, to more central or fronted ones. This adjustment may be attributed to a systemic change resulting from a ‘push effect’. In this process, back vowels, specifically /uː/ and /ʊ/, tend to shift towards a more centralised position. This shift is motivated by the constraints of a reduced auditory space, prompting the adjustment in their articulation (Lubowicz, 2011; Torgersen & Kerswill, 2004), but it may also result from exposure to the evolving English fronting tendency.
The group’s average age of 27 years may have influenced the tokens’ realisation to align more closely with native English speakers. This is due to the fronting of /uː/ and /ʊ/, a phenomenon observed more frequently in younger speakers than in older ones (Harrington et al., 2008; Hawkins & Midgley, 2005). By the end of the year, the group’s engagement and interactions with native English speakers may have influenced the continuous changes linked to this fronting tendency in their performance.
These results are somewhat contrary to previous studies, e.g., Escudero & Chládková (2010), because the changes in F1 and F2 values, particularly the repositioning of /ʊ/ to a more open and fronted position, indicated that the speakers established a new vowel category by splitting the Spanish /u/ into two. This is evident as /ʊ/ was produced with an average of 451 Hz, not resembling the Spanish value for /u/ of 322 Hz. Additionally, these findings differ from Koffi & Lesniak (2019) and Wang & Munro (1999), which proposed a more gradual process (more than a year) for back vowel changes. Notwithstanding the fact that the distinction between the pair was not greater than the English norm, the formants did change within a year, suggesting a faster adaptation process compared to the timelines indicated by these previous authors.
Moreover, the developing contrast between /ɪ/ and /e/ was marked by quality changes for both vowels. By T3, the F1 values for /e/ decreased from 676 Hz to approximately 600 Hz, and the F2 values increased beyond 1900 Hz. The decrease in F1 indexes a closer (higher) vowel, potentially influenced by a systemic ‘push’ effect and an adjustment towards pronunciation norms typical of native speakers (a vowel closer to Cardinal Vowel 3).
This outcome contradicts the findings and assimilation patterns suggested by Escudero & Chládková (2010) and Flege (1991), who have proposed that the realisation of English /e/ by Spanish speakers would be assimilated to the Spanish /e/, which presents average F1 values of 458 Hz and average F2 values of 1814 Hz.
Third, regarding the factors associated with the changes produced by some individuals who developed more marked contrasts than others, our findings revealed that individuals exhibited differences in the development of production contrasts between vowel pairs. These variations became apparent from T2 onward (after five months of residing in England), indicating a potential shift in the timeline for second language phonetic/phonological learning.
This contradicts earlier assertions that non-native English speakers in an immersion setting experience rapid progress in the initial period (0-5 months) followed by a plateau, as suggested by Flege et al. (1992). The individual outcomes show that some learners achieve noticeable vowel contrasts at a faster pace than others.
Six speakers developed contrasts (T2-T3) between all vowel pairs, whereas eight speakers did not establish a clear distinction during the entire year. The variation in the pace of progress in producing vowel contrasts could be linked to factors such as the ability to comprehend English spoken by native speakers, social interactions in both academic and non-academic environments with native English speakers, and the frequency of using English with native speakers as opposed to international English speakers.
The factors mentioned were significantly more influential for the ‘high performers’ compared to the ‘low performers’. These findings align with the claims made by Flege (2018) and Jun & Cowie (1994), suggesting that adult learners can achieve a clear distinction between contrasting vowels through active engagement in spoken interactions with native English speakers, emphasising the importance of this over just passive exposure to the second language. The limited social interactions with native English speakers and increased interactions with international users of English reported by the ‘low performers’ may have contributed to their lower performance.
To conclude, various factors, including the age of learning (DeKeyser, 2000), learning environment (Best & Tyler, 2007), length of immersion (Guion et al., 2000), and the use of both native and second languages (Polka, 1991), have been demonstrated to influence the development of English vowel contrasts in second language speakers. Our longitudinal study has provided valuable insights into L2 phonemic development, revealing the journey of adult non-native English speakers as they adapt to vowel contrasts during exposure to and engagement with the target language. Notably, the initial five months of interaction and exposure to the second language appeared as a linear period of adaptation before noticeable progress in English vowel development occurred.