Team Me: Foxing Around

reineke
Black Belt - 3rd Dan
Posts: 3570
Joined: Wed Jan 06, 2016 7:34 pm
Languages: Fox (C4)
Language Log: https://forum.language-learners.org/vie ... =15&t=6979
x 6554

Re: Team Me: Foxing Around

Postby reineke » Tue Feb 04, 2020 8:22 pm



Update (of sorts)

Has it been a year since I provided an update? Well, I'm a whole year wiser.

Right now I can remember only the following:
Gibbon (Ru & Port)
Matteo Saudino (YouTube)
Stephen King (Ru, Port, Pol)
Documentaries
Toons
...

Uh, I should really keep some kind of a record of these things. A log, perhaps? I will try to go through my YouTube history and maybe I can find something interesting to share.

To Japanese or not to Japanese?

Re: Team Me: Foxing Around

Postby reineke » Sat Feb 08, 2020 3:08 pm


Re: Team Me: Foxing Around

Postby reineke » Sun Feb 09, 2020 2:31 pm



2019 ITA

NFLX
Adventure Time (a few eps)
Rick and Morty the same
Final Space 2 seasons liked it
A Series of Unfortunate Events 2 seasons
Grimm several seasons Meh
Daybreak season 1 Liked it
Happy 7 eps bleh
V Wars 6 eps barf
Stranger Things 3 seasons
Black Summer Season 1 - crunchy

YouTube
Matteo Saudino - lots
a scruffy leftie high school teacher who posts his history and philosophy lessons online. Here I should include a disclaimer about my own political views but I'm fairly apolitical.
Documentaries
Music

German
1 season of a German series about spooky time travel

I should really log into the account because I have probably forgotten about some things. Clicking around, I found out that Once Upon a Time... Life and Lucky Luke are available in French and Italian (LL only in Fr), and I just saw a neat Korean series in Br Portuguese.

Turkish

Magnificent Century
90 eps or so. Then I found out I could have watched it in Russian. Well, it was probably more satisfying from a couch potato point of view. The subs ranged from excellent in the beginning to "I will cut my head off with you" which kind of kills the drama.

Re: Team Me: Foxing Around

Postby reineke » Tue Feb 11, 2020 12:10 pm



So I have been using Netflix to watch bad TV. All that media consumption occurred in the second half of 2019 so I am doing this backwards. In December I lowered my digital footprint. I also watched a bunch of Mosfilm... films and some miniseries and I finished Burroughs' Mars novels. Right before that I ran into a Polish crime series featuring a guy wearing a mohawk. In 2019 I "finished" several S. Kong* novels in Polish and Russian. I may have finished one in Portuguese.

*King. Sigh. This does remind me that in 2019 I did some writing. I did not "practice" writing (eww) - I simply wrote to people. While I still have to find that soulmate or the perfect forum, it was an interesting experience.

Between October and January I upgraded some overseas property. In the process I found out that the local workmen are either MIA or unavailable so I mostly used unskilled "foreigners". I had to do everything twice with varying degrees of success. Ok, everything was pretty much done wrong. All I have to say is that the people sitting behind Home Depot stores across the US are a national treasure. I don't think I watched much Netflix while this was going on or in the first half of '19. Last summer I was busy dealing with my mother's medical problems and other heavy stuff. While this was going on I regularly listened to the audio files on my phone. If I used "life" as an excuse I would never have learned anything. I finished the Gulag Archipelago and I bit off some serious chunks from Gibbon's book. I used the phone as a sort of a juke box in the sense that I mostly listened to bits and pieces of novels, short stories, some (approachable) philosophy and of course music. I am now going to reminisce at a special place. I will probably listen to something in one of my languages.

Carmody
Black Belt - 1st Dan
Posts: 1747
Joined: Fri Jan 01, 2016 4:00 am
Location: NYC, NY
Languages: English (N)
French (B1)
Language Log: http://tinyurl.com/zot7wrs
x 3395

Re: Team Me: Foxing Around

Postby Carmody » Tue Feb 11, 2020 8:03 pm

If I used "life" as an excuse I would never have learned anything.

Brilliant!

Thank you!!

Re: Team Me: Foxing Around

Postby reineke » Sun Feb 16, 2020 8:54 pm



I went for a walk today a little beyond my "special" place to another such place to catch some fresh air and ponder the sunset. As I was making my way towards the drawbridge and trying to breathe with full lungs (while listening to this) I had an oh sh... moment when I realized I was surrounded by a large group of Chinese tourists.

In other news, I joined the 6WC late and was dropping on it like a furry version of Major Kusanagi, but while the mission target stated "Japanese", all my ammo was spent on blue, white and red. So I turned on my invisibility cloak.

Re: Team Me: Foxing Around

Postby reineke » Thu Jan 06, 2022 7:23 pm



Lo vuoi un palloncino?


Excerpts.



Vocabulary

Tentacles!

In other stories

A guy sitting on the toilet tries to keep the lid shut while at the same time reaching for that one clean toothpick on the bloodstained floor...

Typewriter>"terminal/word processor">cell phone/laptop toting daytrader trying to wiggle out of a port-a-potty

Re: Team Me: Foxing Around

Postby reineke » Fri Jan 07, 2022 1:35 am

"Fast-mapping is the ability to acquire a word rapidly on the basis of minimal information. As proposed by Carey (1978), we assume that children are able to achieve fast-mapping because their initial word meanings are skeletal placeholders that will be extended gradually over time."

Fast Mapping in Word Learning: What Probabilities Tell Us

"Our experimental results suggest that fast mapping can be explained as an induction process over the acquired associations between words and objects. In that sense, fast mapping is a general cognitive ability, and not a hard-coded, specialized mechanism of word learning. In addition, our results confirm that the onset of fast mapping is a natural consequence of learning more words, which in turn accelerates the learning of new words. This bootstrapping approach results in a rapid pace of vocabulary acquisition in children, without requiring a developmental change in the underlying learning mechanism."

Neurophysiological Correlates of Fast Mapping of Novel Words in the Adult Brain

"Word acquisition could be mediated by the neurocognitive mechanism known as fast mapping (FM). It refers to a process of incidental exclusion-based learning and is believed to be a critical mechanism for the rapid build-up of lexicon, although its neural mechanisms are still poorly understood. To investigate the neural bases of this key learning skill, we used event-related potentials (ERPs) and employed an audio-visual paradigm that included a counterbalanced set of familiar and novel spoken word forms presented, in a single exposure, in conjunction with novel and familiar images."

https://www.frontiersin.org/articles/10 ... 00304/full

"According to some commentators, (Laufer, 1989, 1992; Hirsh & Nation, 1992), efficient reading in a second language requires knowledge of approximately 95% of the tokens in a given text, and critically, it is only when readers have reached this level that they are able to reliably infer meaning, and use reading as a means of extending vocabulary knowledge. Even then, as Cobb (n.d.) notes, acquiring vocabulary through reading has been shown to need around 10 encounters with a particular item. And the problem here is that in a natural extensive reading approach, the further one travels down the lists, the longer it takes for those encounters to occur. On the other hand, most learners obviously do not acquire vocabulary purely from lists. Research, for example, into students who successfully achieve at least band 5.5 in IELTS shows that they know at least 1,650 out of the 2,000 most commonly used words in English (Neufeld, 2008). At this threshold, a 'fast mapping' principle illustrated mathematically by McMurray (2007) seems to take effect. This is reflected in an extended study of the same students which shows that they actually know over 6,000 out of the 10,000 most common words. If, however, students fall even 100 words short of the 1,650 word threshold, the McMurray model shows that fast mapping is disabled and the chance of significant incidental language development minimal. Other studies have reached similar conclusions. Cobb (1995) cites a threshold of 1,500 words, below which Arabic students in a university preparatory program failed the PET exit level. This threshold was confirmed for remedial Turkish students in an unpublished study by Billuroğlu (2007), which showed that they knew about 1,300 of the 2,000 most commonly used words and only about 3,500 of the 10,000 most commonly used words. In other words, a deficit in knowledge of 300 of the most commonly used words meant that fast-mapping could not take effect and as a result these students were 2,500 words short of the 6,000 word threshold, and condemned either to repeat foundation courses, or to struggle in academic environments, often falling back on L1.

The implications of such research are profound. For learners with limited time, the suggestion that extensive reading will raise lexical proficiency to acceptable levels may prove to be misplaced. The idea too that learners should grapple with 'authentic' text early in the learning process, and be trained to 'infer' the meaning of unknown vocabulary may also be misguided. For learners to develop their vocabulary at speed, what they would seem to need is an approach that maximizes exposure to the most frequent vocabulary items, and increases the likelihood of incidental learning of vocabulary from a much earlier stage.

The very strength of the graded reader approach is also its most significant weakness. In the pursuit of a homogeneous set of texts, the depth and range of meanings of words is diminished. The concordance extracts below illustrate the difference between the nature of exposure to words in a reader corpus to the breadth and depth of meaning in real English, approximated by a corpus of articles from Wikipedia, drawn from a range of topics, of comparable size (608,466 words). In the graded readers corpus, DRAW is almost exclusively present in the sense of drawing a picture, with single occurrences of "draw a knife," "draw the curtains," "draw a breath," and "draw [lots]."

This does not mean that the general approach of graded readers is unhelpful. Indeed, wide reading in a second language and the cultivation of a reading culture has much to offer. Most reading series however have not been designed to maximise coverage of lexis to the levels required, and nor have they ensured sufficient exposure to selected lexis to provide the foundation for its long-term acquisition. We should note here that this is an issue of application, not of potential, and we hope to show that readers in fact offer an ideal basis for vocabulary acquisition.

Summary of Discussion

An interesting point of departure has been reached in terms of vocabulary pedagogy. On the positive side, the importance of building an in-depth vocabulary as a prerequisite for communicative competence and second language comprehension is not disputed. Ongoing research clearly challenges previously held views that suggested that what students mostly required in terms of reading were certain defined skills and strategies, and that these were best acquired by immersing students into the world of 'authentic' text as early as possible. The over-reliance on skills training, and the somewhat inconclusive debates that emerged concerning the nature of textual authenticity may perhaps now be taken as given. What is of more concern is the need for practitioners to apply the research findings and produce materials that are fundamentally designed on a lexically orientated approach to language acquisition."

http://www.readingmatrix.com/articles/s ... q=ereaders


luke
Brown Belt
Posts: 1243
Joined: Fri Aug 07, 2015 9:09 pm
Languages: English (N). Spanish (intermediate), Esperanto (B1), French (intermediate but rusting)
Language Log: https://forum.language-learners.org/vie ... 15&t=16948
x 3631

Re: Team Me: Foxing Around

Postby luke » Fri Jan 07, 2022 9:54 am

reineke wrote: And the problem here is that in a natural extensive reading approach, the further one travels down the lists, the longer it takes for those encounters to occur. On the other hand, most learners obviously do not acquire vocabulary purely from lists. Research, for example, into students who successfully achieve at least band 5.5 in IELTS shows that they know at least 1,650 out of the 2,000 most commonly used words in English (Neufeld, 2008). At this threshold, a 'fast mapping' principle illustrated mathematically by McMurray (2007) seems to take effect. This is reflected in an extended study of the same students which shows that they actually know over 6,000 out of the 10,000 most common words. If, however, students fall even 100 words short of the 1,650 word threshold, the McMurray model shows that fast mapping is disabled and the chance of significant incidental language development minimal. Other studies have reached similar conclusions. Cobb (1995) cites a threshold of 1,500 words, below which Arabic students in a university preparatory program failed the PET exit level. This threshold was confirmed for remedial Turkish students in an unpublished study by Billuroğlu (2007), which showed that they knew about 1,300 of the 2,000 most commonly used words and only about 3,500 of the 10,000 most commonly used words. In other words, a deficit in knowledge of 300 of the most commonly used words meant that fast-mapping could not take effect and as a result these students were 2,500 words short of the 6,000 word threshold, and condemned either to repeat foundation courses, or to struggle in academic environments, often falling back on L1.

Thank you for posting that research snippet! It shows why research is so interesting and can be controversial.

Imagine 3 students and each achieves the "1,650 word threshold".
Student 1 got there by studying the 1,650 most frequent words according to the corpus in question using an SRS program like Anki.
Student 2 got there by reading graded readers.
Student 3 got there by extensive reading without graded readers.

All other things being equal:
Which student got to the threshold first?
Which student achieved the threshold in the most pedagogically effective way?
Which student will do better when reading non-graded readers?

I would hypothesize:
Student 1 (SRS) may learn the 1,650 words fastest, but be the least prepared for real texts.
Student 2 (graded readers) may have had the most pleasant journey.
Student 3 (enough reading of non-graded material to hit the threshold) may be the best prepared for more advanced texts.

It's interesting how the researchers introduce fast mapping and conflate it with the threshold. It's difficult to run a broad, extensive experiment comparing "pure SRS", "pure graded readers", and "pure ungraded reading" (with no SRS either) that takes each individual from 0 to the 1,650-word threshold.

So, of course, some speculation and hypothesizing is necessary.

Back to fast mapping, and especially children. Isn't it possible that children's brains are less "fixed" (more neural plasticity) and that children hopefully have fewer preoccupations (parents provide food, shelter, physical and emotional safety), and that those factors are really the overriding feature of "fast mapping"?

Re: Team Me: Foxing Around

Postby reineke » Fri Jan 07, 2022 7:49 pm

Hi Luke

This is not in lieu of a response. I'm just saving some handy and relevant research I posted previously here instead of leaving it strewn in random threads.

How many words are needed to do the things a language user needs to do?

Although the language makes use of a large number of words, not all of these words are equally useful. One measure of usefulness is word frequency, that is, how often the word occurs in normal use of the language. From the point of view of frequency, the word the is a very useful word in English. It occurs so frequently that about 7% of the words on a page of written English and the same proportion of the words in a conversation are repetitions of the word the. Look back over this paragraph and you will find an occurrence of the in almost every line.

The good news for second language learners and second language teachers is that a small number of the words of English occur very frequently and if a learner knows these words, that learner will know a very large proportion of the running words in a written or spoken text. Most of these words are content words and knowing enough of them allows a good degree of comprehension of a text. Here are some figures showing what proportion of a text is covered by certain numbers of high frequency words.

Table 1: Vocabulary size and text coverage in the Brown corpus

Vocabulary size    Text coverage
1000               72.0%
2000               79.7%
3000               84.0%
4000               86.8%
5000               88.7%
6000               89.9%
15,851             97.8%

The figures in Table 1 refer to written texts and are from Francis and Kucera (1982) which is a very diverse corpus of over 1,000,000 running words made up of 500 texts of around 2000 running words long. As we shall see the more diverse the texts in a corpus, the greater the number of different words and the high frequency words cover slightly less of the text, so these figures are a conservative estimate. The figures in the last line of the table are from Kucera (1982). The COBUILD Dictionary claims that 15,000 words cover 95% of the running words of their corpus. The figures in Table 1 are for lemmas and not word families. Word families would give fractionally higher coverage. Table 1 assumes that high frequency words are known before lower frequency words and shows that knowing about 2,000 word families gives near to 80% coverage of written text. The same number of words gives greater coverage of informal spoken text - around 96% (Schonell, Meddleton and Shaw, 1956).
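To make the diminishing returns in Table 1 easier to see, here is a minimal Python sketch that only does arithmetic on the figures quoted above. The names `TABLE_1` and `marginal_gains` are my own and do not come from the paper.

```python
# Table 1 figures as quoted above: (vocabulary size, % text coverage).
TABLE_1 = [
    (1000, 72.0),
    (2000, 79.7),
    (3000, 84.0),
    (4000, 86.8),
    (5000, 88.7),
    (6000, 89.9),
    (15851, 97.8),
]

def marginal_gains(rows):
    """Return (from_size, to_size, gain in points, gain per 1000 extra words)."""
    gains = []
    for (size_a, cov_a), (size_b, cov_b) in zip(rows, rows[1:]):
        gain = round(cov_b - cov_a, 1)
        per_thousand = round(gain / (size_b - size_a) * 1000, 2)
        gains.append((size_a, size_b, gain, per_thousand))
    return gains

for size_a, size_b, gain, per_k in marginal_gains(TABLE_1):
    print(f"{size_a:>6} -> {size_b:>6}: +{gain} points ({per_k} per 1000 words)")
```

Each extra thousand words buys less coverage than the one before it, which is the same pattern the Spanish study further down calls a law of diminishing returns.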


How much vocabulary and how should it be learned?

We are now ready to answer the question "How much vocabulary does a second language learner need?" Clearly the learner needs to know the 3,000 or so high frequency words of the language. These are an immediate high priority and there is little sense in focusing on other vocabulary until these are well learned. Nation (1990) argues that after these high frequency words are learned, the next focus for the teacher is on helping the learners develop strategies to comprehend and learn the low frequency words of the language. Because of the very poor coverage that low frequency words give, it is not worth spending class time on actually teaching these words. It is more efficient to spend class time on the strategies of (1) guessing from context, (2) using word parts and mnemonic techniques to remember words, and (3) using vocabulary cards to remember foreign language - first language word pairs. Detailed description of these strategies can be found in Nation (1990). Notice that although the teacher's focus is on helping learners gain control of important strategies, a major function of these strategies is to help the learners to continue to learn new words and increase their vocabulary size.

A way to manage the learning of huge amounts of vocabulary is through indirect or incidental learning. An example of this is learning new words (or deepening the knowledge of already known words) in context through extensive listening and reading. Learning from context is so important that some studies suggest that first language learners learn most of their vocabulary in this way (Sternberg, 1987). Extensive reading is a good way to enhance word knowledge and get a lot of exposure to the most frequent and useful words. At the earlier and intermediate levels of language learning, simplified reading books can be of great benefit. Other sources of incidental learning include problem solving group work activities (Joe, Nation and Newton, 1996) and formal classroom activities where vocabulary is not the main focus.

The problem for beginning learners and readers is getting to the threshold where they can start to learn from context. Simply put, if one does not know enough of the words on a page and have comprehension of what is being read, one cannot easily learn from context. Liu Na and Nation (1985) have shown that we need a vocabulary of about 3000 words which provides coverage of at least 95% of a text before we can efficiently learn from context with unsimplified text. This is a large amount of startup vocabulary a learner needs, and this just to comprehend general texts. So how can we get learners to learn large amounts of vocabulary in a short space of time?
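The 95% figure is a share of running words (tokens), so it can be estimated for any text once you have a list of words you know. A rough Python sketch follows. The tokeniser is deliberately crude and it matches surface forms rather than lemmas or word families, so treat it as an assumption-laden illustration, not the method used in the studies. The function name `coverage` and the sample data are made up.

```python
import re

def coverage(text, known_words):
    """Share of running words (tokens) in `text` that appear in `known_words`."""
    # Crude tokeniser: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in known_words)
    return known / len(tokens)

# Hypothetical sample text and known-word list.
sample = "The fox ran over the hill and the fox saw the drawbridge"
known = {"the", "fox", "ran", "over", "hill", "and", "saw"}

share = coverage(sample, known)
print(f"{share:.1%} of tokens known; meets 95% threshold: {share >= 0.95}")
```

In this toy sentence a single unknown word out of twelve tokens already drops coverage below 95%, which gives a feel for how demanding that threshold is.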

The suggestion that learners should directly learn vocabulary from cards, to a large degree out of context, may be seen by some teachers as a step back to outdated methods of learning and not in agreement with a communicative approach to language learning. This may be so, but the research evidence supporting the use of such an approach as one part of a vocabulary learning program is strong.

To these research based arguments might be added the argument that most serious learners make use of such an approach. They can be helped to do it more effectively. There are other advantages for using word cards. They can give a sense of progress, and a sense of achievement, particularly if numerical targets are set and met. They are readily portable and can be used in idle moments in or out of class either for learning new words or revising old ones. They are specifically made to suit particular learners and their needs and are thus self motivating.

It should not be assumed that learning from word lists or word cards means that the words are learned forever, nor does it mean that all knowledge of a word has been learned. Learning from lists or word cards is only an initial stage of learning a particular word (see Schmitt and Schmitt, 1995 for further information). It is however a learning tool for use at any level of vocabulary proficiency. There will always be a need to have extra exposure to the words through reading, listening and speaking as well as extra formal study of the words, their collocates, associations, different meanings, grammar and so on. This shows a complementary relationship between contextualized learning of new words and the decontextualized learning from word cards.

What vocabulary does a language learner need?

The previous sections of this paper have suggested that second language learners need first to concentrate on the high frequency words of the language. In this section we look at some useful vocabulary lists based on frequency and review the research on the adequacy of the General Service List (West, 1953). Most counts also consider range, that is the occurrence of a word across several subsections of a corpus...

The second 1000 words behave in this way because they are lower frequency words than the first 1000 words and have a narrower range of occurrence. That is their occurrence is more closely related to the topic or subject area of a text than the wide ranging more general purpose words in the first 1000. But given a range of topics and genres, and enough texts, the second 1000 words are more generally useful than other lists of words.

After the 2000 high frequency words of the GSL, what vocabulary does a second language learner need? The answer to this question depends on what the language learner intends to use English for. If the learner has no special academic purpose then the learner should work on the strategies for dealing with low frequency words. If however the learner intends to go on to academic study in upper high school or at university, then there is a clear need for general academic vocabulary. This can be found in the 836 word list called the University Word List (UWL) (Xue and Nation, 1984; Nation, 1990).

The UWL consists of words that are not in the first 2000 words of the GSL but which are frequent and of wide range in academic texts. Wide range means that the words occur not just in one or two disciplines like economics or mathematics, but occur across a wide range of disciplines. Here are some items from it.

accompany formulate index major objective
biology genuine indicate maintain occur
comply hemisphere individual maximum passive
deficient homogeneous job modify persist
edit identify labour negative quote
feasible ignore locate notion random
(Nation, 1990)

The value of the UWL can be seen when we look at the coverage of academic text that it provides. Note the low coverage the UWL has of fiction. Newspapers and magazines which are more formal make use of more of the UWL. Very formal academic text makes the greatest use of the UWL. The UWL is thus a word list for learners with specific purposes namely academic reading. The purpose behind the setting up of the UWL is to create a list of high frequency words for learners with academic purposes, so that these words can be taught and directly studied in the same way as the words from the GSL can.

Word frequency lists

The major theme of this paper has been that we need to have clear sensible goals for vocabulary learning. Frequency information provides a rational basis for making sure that learners get the best return for their vocabulary learning effort. Vocabulary frequency lists which take account of range have an important role to play in curriculum design and in setting learning goals.

This does not necessarily mean that learners must be provided with large vocabulary lists as the major source of their vocabulary learning. It does mean however that course designers should have lists to refer to when they consider the vocabulary component of a language course...

The following list suggests several of the factors that would need to be considered in the development of a resource list of high frequency words.

1 Representativeness The corpora that the list is based on should adequately represent the wide range of uses of language.

2 Frequency and range Most frequency studies have given recognition to the importance of range of occurrence. A word should not become part of a general service list because it occurs frequently. It should occur frequently across a wide range of texts. This does not mean that its frequency has to be roughly the same across the different texts, but means that it should occur in some form or other in most of the different texts or groupings of texts.

3 Word families

4 Idioms and set expressions Some items larger than a word behave like high frequency words. That is, they occur frequently as a unit (Good morning, Never mind), and their meaning is not clear from the meaning of the parts (at once, set out). If the frequency of such items is high enough to get them into a general service list in direct competition with single words, then perhaps they should be there. Certainly the arguments for idioms are strong, whereas set expressions could be included under one of their constituent words.

5 Range of information To be of full use in course design, a list of high frequency words would need to include the following information for each word - the forms and parts of speech included in a word family, frequency, the underlying meaning of the word, variations of meaning and collocations and the relative frequency of these meanings and uses, and restrictions on the use of the word with regard to politeness, geographical distribution etc.
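The "frequency and range" criterion in the list above can be sketched in a few lines of Python: count how often a word occurs overall, and separately in how many texts it occurs. The function names, the thresholds and the three toy texts are all hypothetical, and real counts would work on lemmas or word families rather than raw tokens.

```python
from collections import Counter
import re

def frequency_and_range(texts):
    """Return {word: (total frequency, number of texts it occurs in)}."""
    freq = Counter()
    rng = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        freq.update(tokens)       # every occurrence counts towards frequency
        rng.update(set(tokens))   # each text counts at most once towards range
    return {w: (freq[w], rng[w]) for w in freq}

def candidates(texts, min_freq, min_range_share):
    """Words that are frequent AND occur in at least the given share of texts."""
    stats = frequency_and_range(texts)
    needed = min_range_share * len(texts)
    return sorted(w for w, (f, r) in stats.items() if f >= min_freq and r >= needed)

# Toy corpus of three "texts" on different topics.
texts = [
    "the market fell and the market rose",
    "the painter mixed the paint",
    "the team won and the crowd cheered",
]
print(candidates(texts, min_freq=2, min_range_share=0.6))
```

Here "market" is as frequent as "and" but appears in only one text, so it fails the range test. That is the point of the criterion: a topic-bound word should not get into a general list on frequency alone.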


http://www.robwaring.org/papers/CUP/cup.html

Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish

"In the following table -- which represents the main conclusions of this study -- we see the percent coverage of all tokens in three different registers (oral, fiction, and non-fiction) at three different levels of lexemes -- top 1000 words, top 2000 and top 3000.

Table 3. Percent coverage of tokens by groups of types/lemma
[Table images not reproduced]

"As the data indicate, a limited vocabulary of 1000 words would allow language learners to recognize between 75-80% of all lexemes in written Spanish, and about 88% of all lexemes in spoken Spanish (which is due to the higher repetition of basic words in the spoken register). Subsequent extensions of the base vocabulary have increasingly marginal importance. By doubling the vocabulary list to 2000 words, we account for only about 5-8% more words in a given text, and the third thousand words in the list increases this only about 2-4% more. There clearly is a law of “diminishing returns” in terms of vocabulary learning."

The data from Spanish and English are roughly comparable, but there is an important difference in the way in which the data was obtained. In Nation (2000), the words are grouped by what he calls “word families”, so that [courage, discouragement, encourage] would all be grouped under the headword [COURAGE], and [paint, painted, painter, painting] would all be grouped under the headword [PAINT]. In our study, however, we used the traditional lemma approach, in which pintar, pintura, pintor, and pintoresco would all be assigned to different lemma, and [pintamos, pinto, and pintarás] would all be assigned to the lemma [PINTAR]. Because we separate the nominal, verbal, and adjectival uses, we might expect that the same number of headwords would lead to less text coverage than in English. The fact that this does not happen, however, is probably due to the fact that English has a larger lexical stock than Spanish, due to the influence of native Anglo-Saxon and imported Franco-Norman and Latinate words (e.g. real, royal, regal). The fact that the same amount of lexemes in German leads to lower textual coverage is somewhat more difficult to explain. It may be due to the still-incomplete state of the German tagger (Jones, p.c.). Or again, it may be due to a generally larger lexical stock in German than in Spanish, though this is much more debatable.
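The lemma versus word-family distinction changes how many "headwords" you end up counting for the same tokens. A small Python sketch using the Spanish forms from the quote: the `LEMMA` mapping follows the description above, while the `FAMILY` mapping is my own assumption about how a word-family scheme might group them (the paper gives word-family examples only for English).

```python
# Lemma approach as described in the quote: inflected verb forms share a
# lemma, but the noun and adjective derivatives are separate lemmas.
LEMMA = {
    "pintamos": "pintar",
    "pinto": "pintar",
    "pintarás": "pintar",
    "pintura": "pintura",
    "pintor": "pintor",
    "pintoresco": "pintoresco",
}

# ASSUMPTION: a word-family scheme would put all of these under one headword.
FAMILY = {form: "PINTAR" for form in LEMMA}

def headwords(tokens, mapping):
    """Set of distinct headwords the tokens map to under a grouping scheme."""
    return {mapping[t] for t in tokens if t in mapping}

tokens = ["pintamos", "pintura", "pintor", "pinto", "pintoresco", "pintarás"]
print("lemmas:  ", len(headwords(tokens, LEMMA)))
print("families:", len(headwords(tokens, FAMILY)))
```

Under these assumptions the same six tokens cost four headwords in a lemma count and one in a family count, which is why "2,000 words" in one study is not directly comparable with "2,000 words" in another.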

10. Conclusion

Hopefully the preceding discussion provides some useful insight into the issue of vocabulary range and text coverage, and the way in which the extracted data can be used to create a more useful frequency dictionary of Spanish. From the point of view of a language learner, the important point is that text coverage clearly obeys the law of diminishing returns. With about 4000 words, a language learner would be able to recognize more than 90% of the words in a typical native speaker conversation. If s/he learns two thousand more words, however, this will increase coverage by only about 3-4%. We have also seen that the degree of coverage is a function of register and part of speech, and have provided detailed data to support this view. We have also considered the role of vocabulary range, and how factors such as register affect this as well."

http://www.lingref.com/cpp/hls/7/paper1091.pdf

Selecting Television Programs for Language Learning: Investigating Television Programs from the Same Genre

"In a corpus-driven study looking at the number of words needed to understand the vocabulary in television programs, Webb and Rodgers (2009a) found that a vocabulary size of 3000 word families plus proper nouns and marginal words provided 95.45% coverage of a corpus made up of 88 television programs from a variety of genres."

"Webb and Rodgers' (2009a) findings also shed light on differences between television genres. Children’s programs were found to have the smallest vocabulary load; the most frequent 2000 word families, plus proper nouns and marginal words accounted for 95% coverage. The most frequent 3000 word families plus proper nouns and marginal words accounted for 95% of American drama, older programs, situation comedies and British programs. The genres with the greatest proportions of low frequency words were news stories and science fiction programs. Results also indicated that coverage is likely to vary between episodes of programs, leading Webb and Rodgers to suggest that randomly viewing programs may limit comprehension. Instead they proposed watching programs from within the same subgenre that have similar topics and storyline."

https://www.researchgate.net/publicatio ... Same_Genre