Vocabulary Tests?

Ask specific questions about your target languages. Beginner questions welcome!
Xmmm
Blue Belt
Posts: 821
Joined: Tue Oct 06, 2015 1:19 am
Languages: ru it tr
x 2221

Re: Vocabulary Tests?

Postby Xmmm » Tue Jan 30, 2018 4:36 am

Deinonysus wrote:I can attest to the inaccuracy of that first vocabulary test that's been going around. My wife and I both took the test and got the exact same result for French, a bit over 5,000 words, equivalent to an 8-year-old French child.

What's the problem with that, you might ask? Well, I'm a solid B1 in French, and I only got so many words because there were a lot of direct cognates to words I knew in English. My greatest French accomplishments include successfully ordering fried chicken at St-Hubert and managing to have a 30-second conversation with a 2-year-old French child with only one major grammatical error that was pointed out to me.

My wife is a professional French teacher and is often mistaken for a native speaker. She has aced graduate-level courses that were taught in French. She's at least a solid C1. She definitely has a vocabulary of well over 5,000 French words. She just happened to miss some of the random, obscure words on this test.

If a test says that she and I have the same vocabulary in French, then the test is wrong.


The test could be fixed (maybe easier in theory and harder in practice):

1. the sample would have to be random and always changing
2. the sample would have to be large enough that someone taking it 4 or 5 times would get consistent results
3. you'd have to be punished for wrong answers. Right now I could take the test in Georgian and would get 10 or 15 right just by luck, with a report that I knew 1500 words. But I literally don't know even one word of Georgian.
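
Point 3 could be sketched roughly as follows. This is only an illustration, assuming a multiple-choice test with a fixed number of options per item. The function names and the 100-words-per-item scaling are made up for the example and are not taken from any of the tests discussed here. It uses the classic correction for guessing, where each wrong answer cancels 1/(options - 1) of a right answer.

```python
def corrected_score(right: int, wrong: int, options: int) -> float:
    """Correction for guessing: right - wrong / (options - 1), floored at zero."""
    if options < 2:
        raise ValueError("a multiple-choice item needs at least 2 options")
    return max(0.0, right - wrong / (options - 1))


def estimated_vocabulary(right: int, wrong: int, options: int,
                         words_per_item: int) -> float:
    """Scale the corrected score to a word count (words_per_item is an assumed scale)."""
    return corrected_score(right, wrong, options) * words_per_item


# Pure guessing on 60 four-option items: about 15 right and 45 wrong.
naive = 15 * 100                                    # uncorrected report: 1500 words
corrected = estimated_vocabulary(15, 45, 4, 100)    # 15 - 45/3 = 0 items, so 0 words
print(naive, corrected)
```

With a penalty like this, blind guessing averages out to zero instead of a four-digit word count.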

My result in Russian was reasonable. My number in Italian was sky high ... but I do know a lot of words in English, and a lot of them have cognates in Italian. For example, I just took a word I know in English (prevaricate) and guessed that in Italian there would be a word "prevaricare". And guess what, there is! And it means what you think it means. So I "know" that word. I'm reading a book, I see it, I know exactly what they are saying. So I know a lot of five dollar words in Italian. The problem is, I don't know the words for broom, gum, shoelaces, etc.

And for those who say the only thing that matters is what you can say, and that there's no value to learning the words you'd actually need to know to read fiction in the TL, well that's one way of looking at things.
1 x

s_allard
Blue Belt
Posts: 985
Joined: Sat Jul 25, 2015 3:01 pm
Location: Canada
Languages: French (N), English (N), Spanish (C2 Cert.), German (B2 Cert)
x 2370

Re: Vocabulary Tests?

Postby s_allard » Tue Jan 30, 2018 5:21 am

Xmmm wrote:... My number in Italian was sky high ... but I do know a lot of words in English, and a lot of them have cognates in Italian. For example, I just took a word I know in English (prevaricate) and guessed that in Italian there would be a word "prevaricare". And guess what, there is! And it means what you think it means. So I "know" that word. I'm reading a book, I see it, I know exactly what they are saying. So I know a lot of five dollar words in Italian. The problem is, I don't know the words for broom, gum, shoelaces, etc.

...

This is actually a very interesting example that points to a major problem with these tests. Despite appearances, prevaricate in English does not mean the same thing as prevaricare in Italian. The Italian verb, just like its Spanish counterpart but unlike the English one, means to abuse one's official power. This is why one sometimes sees judges in Spain accused of prevaricación, as defined here according to Wikipedia:

Prevaricación, or prevaricato, is an offense in which an authority, judge, or other public servant issues an arbitrary ruling in an administrative or judicial matter, knowing that the ruling is unjust and contrary to the law. It is comparable to a breach of the duties of a public servant.
https://es.wikipedia.org/wiki/Prevaricaci%C3%B3n

This sort of problem points to the larger issue of the so-called cognate discount that allows people to guess correct answers on the vocabulary tests. For technical topics one can probably safely assume that meaning and usage are the same for words of the same origin. But even that is not so sure, as the case of prevarication shows. When it comes to less technical topics, the resemblances between words often hide major differences.
1 x

trui
Orange Belt
Posts: 111
Joined: Sun Jun 04, 2017 5:54 pm
Languages: Native: English
Good enough: Dutch
Not there yet: German
Just starting: French
We'll see?: Russian
Language Log: https://forum.language-learners.org/vie ... 3&start=10
x 215

Re: Vocabulary Tests?

Postby trui » Tue Jan 30, 2018 6:38 am

Vocabulary tests are fun, even though the accuracy is questionable.

I got 18000 for the Dutch arealme test and took the English test afterwards and got 29000. Given that the meanings of the words tested were the same in both tests, that doesn't exactly give me confidence in their accuracy.

For the sprachenlernen test I got 3300 (B2) in Dutch and 5400 (C2) in English (presumably the maximum result). B2 sounds accurate for what it's worth. Ironically enough, both tests put my Dutch vocab count as ~60% of my English. Just 40% to go!

More seriously though, I prefer trying to read something or watch something that I haven't tried in a while as a measure of progress. Often I'm pleasantly surprised. I can appreciate the desire to find out how many words one knows though, even though vocab tests have their own problems as others have pointed out.

One thing that's easy to measure at this point: is my Dutch vocab as big as my English? No? Guess I've still got work to do.
1 x

garyb
Black Belt - 1st Dan
Posts: 1582
Joined: Mon Jul 20, 2015 12:35 pm
Location: Scotland
Languages: Native: English
Advanced: Italian, French
Intermediate: Spanish
Beginner: German, Japanese
Language Log: viewtopic.php?f=15&t=1855
x 6050

Re: Vocabulary Tests?

Postby garyb » Tue Jan 30, 2018 10:29 am

That site thinks I'm C2 just because I recognise all of a list of relatively simple vocabulary... At first that seems ridiculous and I'm inclined to say that this is yet another silly online test that exaggerates everyone's level and is only useful as an ego boost. Thinking about it more though, perhaps it validates what s_allard's been trying to tell us all along, that even for a C2 exam you don't need a gigantic vocabulary but just a good grasp of relatively common words. If I decided to take a C-level exam in French I'd certainly focus most of my efforts on areas other than receptive vocabulary.
0 x

s_allard
Blue Belt
Posts: 985
Joined: Sat Jul 25, 2015 3:01 pm
Location: Canada
Languages: French (N), English (N), Spanish (C2 Cert.), German (B2 Cert)
x 2370

Re: Vocabulary Tests?

Postby s_allard » Tue Jan 30, 2018 10:45 am

Iversen wrote:...
But it may be more interesting that the overlap between the headwords from 2014.1 and 2014.2 (amounting to 3498 resp. 3914 words) was no less than 1979 words. Which simply means that if I had taken a third sample of around 37.000 and counted headwords in it, it is likely that up to half of the words would be overlap, and the rest would be new words.

But of course this couldn't continue forever: if I had produced ten samples of 35-40.000 words each (all written by me) then the proportion of totally new words in no. 10 would almost certainly be far less than half the words. But even the test with just two corpora suggests that I only use a small part of my total number of active words in English and even less of my passive vocabulary. With ten corpora or more I would of course be closer to emptying my complete stock of known words, but even message 11 would almost certainly contain some not hitherto represented words.

...

As always, there are some very interesting and relevant observations in Iversen's post. As he so rightly points out, the larger the number of samples the larger the total vocabulary size. This is exactly what those vocabulary coverage studies do. If you read only one book, you might need let's say 6000 word families, but to read two books by different authors you'll need 8000 word families because there is only so much overlap. By the time you get to 50 books, you might need maybe 40000 word families to read all the books.
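
To make the overlap arithmetic concrete, here is a small sketch. It assumes naive whitespace tokenization with no lemmatization or word families, and the function names are hypothetical. The only real figures are Iversen's headword counts quoted above; the toy sentences are invented.

```python
def union_size(size_a: int, size_b: int, overlap: int) -> int:
    """Distinct words across two samples, by inclusion-exclusion."""
    return size_a + size_b - overlap


def cumulative_vocabulary(samples: list[str]) -> list[int]:
    """Running count of distinct lowercase tokens as each sample is added."""
    seen: set[str] = set()
    totals = []
    for text in samples:
        seen.update(text.lower().split())
        totals.append(len(seen))
    return totals


# Iversen's two samples: 3498 and 3914 headwords with 1979 in common.
print(union_size(3498, 3914, 1979))   # 5433 distinct headwords in total

# Toy samples: every new text adds words, though less than its own size.
print(cumulative_vocabulary(["the cat sat", "the dog sat down", "a cat ran"]))
```

The second function shows the same effect on any set of texts: the total keeps growing with each sample, but by less than the sample's own vocabulary because of the overlap.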

In Iversen's personal example, we are looking at chronological samples. We see that vocabulary usage evolves over time, as is to be expected. While we see new words entering Iversen's vocabulary, it would be interesting to observe whether certain words are no longer being used at a certain point in time.

I for example use a six-month timeframe to define my contemporary real vocabulary. This is quite small. But my productive vocabulary encompasses all the words that I have used recently (in terms of years). My active vocabulary includes all the words I feel I can use confidently. Then I add all the words that I can understand well - my receptive vocabulary - and finally I add the words I can recognize without necessarily knowing their meaning - my passive vocabulary.

I recently saw interviews of polyglots Luca Lampariello and Alex Rawlings in German. Both demonstrated very impressive levels of proficiency. How much German vocabulary did they produce? Given the length of the interviews, quite a small amount. But their productive vocabulary is undoubtedly much larger.

What is so impressive in these interviews is overall proficiency: excellent pronunciation, fluency, impeccable grammar and good word choices. But I return to my main contention: vocabulary size is basically irrelevant, it is the ability to use words that is important.
0 x

Iversen
Black Belt - 4th Dan
Posts: 4782
Joined: Sun Jul 19, 2015 7:36 pm
Location: Denmark
Languages: Monolingual travels in Danish, English, German, Dutch, Swedish, French, Portuguese, Spanish, Catalan, Italian, Romanian and (part time) Esperanto
Ahem, not yet: Norwegian, Afrikaans, Platt, Scots, Russian, Serbian, Bulgarian, Albanian, Greek, Latin, Irish, Indonesian and a few more...
Language Log: viewtopic.php?f=15&t=1027
x 15017

Re: Vocabulary Tests?

Postby Iversen » Tue Jan 30, 2018 12:15 pm

S_allard raises the question of chronology. In the samples from my HTLAL log it obviously took me several months to write the messages that contained those words, and it is not just likely but certain that I would be unable to remember the meaning of some word I had used months before. It could be interesting to check how many of all those words from my HTLAL log I still 'know' (actively as well as passively), but I don't have the time to do the stats now (too many other things - like music - on my agenda right now!).

I typically wrote about something I had just read or watched on TV or experienced during a voyage somewhere, and once I had written about something I would mostly leave that topic and continue to other topics. And now several years have passed, and in the meantime I will almost certainly have forgotten some of the words. But that's life, and you just need to add as many (or more) new words to the stock as seep out from the bottom of the leaky container, then you will be happy.
0 x

rdearman
Site Admin
Posts: 7255
Joined: Thu May 14, 2015 4:18 pm
Location: United Kingdom
Languages: English (N)
Language Log: viewtopic.php?f=15&t=1836
x 23251

Re: Vocabulary Tests?

Postby rdearman » Tue Jan 30, 2018 12:25 pm

It seems to me that vocabulary is like what my brother-in-law used to say about money. "Rich or Poor, it's always good to have money."

But what I'm looking for isn't to know what amount of vocabulary I need to speak, or be B2/C1 or whatever. What I'm looking for is some test that will give me an estimate of the vocabulary I know. Here is a good explanation of how a test was created for English vocabulary. http://testyourvocab.com/details

I suspect that s_allard is correct that the amount of vocabulary required isn't huge, but I just wanted to try and figure out what mine is. Having read Iversen's Guide on the matter I think I could probably test myself. What are your thoughts on these as a test?

  • Find 5 pages from books, newspapers, magazine articles, etc. then mark unknown words. Find the percentage of unknowns. Then apply that ratio to the estimated number of words in the language.
  • Download a corpus sorted by frequency and mark the first 10,000 if I know them or not. Find the median point where known and unknown diverge.
  • They say you can estimate the size of your vocabulary as pN where N is the number of words in the dictionary and p is an estimate of the proportion of words in the dictionary that you know. If, for example, you sample 40 words from a dictionary of 100 thousand words, and you know the meaning of half of them, then the estimated size of your vocabulary is 50 thousand words.
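
The pN estimate in the last bullet is easy to sketch. The standard error below is an addition that is not in any of the methods above: it assumes the sampled headwords are a simple random sample from the dictionary and uses the usual binomial formula. The function name is hypothetical.

```python
import math


def estimate_vocabulary(known: int, sampled: int,
                        dictionary_size: int) -> tuple[float, float]:
    """Return (estimated vocabulary, standard error) for a random dictionary sample."""
    if sampled <= 0 or not 0 <= known <= sampled:
        raise ValueError("need 0 <= known <= sampled and sampled > 0")
    p = known / sampled                              # proportion of sampled words known
    estimate = p * dictionary_size                   # the pN estimate
    std_error = math.sqrt(p * (1 - p) / sampled) * dictionary_size
    return estimate, std_error


# The example from the bullet: 40 words sampled, half known, 100,000-word dictionary.
estimate, std_error = estimate_vocabulary(20, 40, 100_000)
print(round(estimate), round(std_error))   # 50000 and roughly 7906
```

So a 40-word sample gives 50 thousand words with an uncertainty of several thousand either way. Since the error shrinks with the square root of the sample size, a sample four times larger would halve it.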

These should give me a rough guide. I've been reading up on this rather dry topic. Some links below.

http://www.tandfonline.com/doi/pdf/10.1 ... 0.11659728
https://www.frontiersin.org/articles/10 ... 01116/full
http://languagelog.ldc.upenn.edu/nll/?p=22743
http://hosted.jalt.org/test/beg_1.htm
http://ww2.amstat.org/publications/jse/ ... arton.html
http://nflrc.lll.hawaii.edu/rfl/April20 ... /meara.pdf

Some online tests.

http://www.insightin.com/test/test.phtml
http://www.majortests.com/word-focus/vo ... -tests.php
http://www.learningspanish-spain.com/sp ... -test.aspx
http://www.practicerussian.com/Tests/TestWords.aspx
http://www.teeveetee.com
http://www.open.ac.uk/Arts/classical-st ... ocab.shtml
http://www.javacamp.org/misc/VocTester.html
http://learnersdictionary.com/quiz/vocabulary-start
http://www.er.uqam.ca/nobel/r21270/levels/
4 x

s_allard
Blue Belt
Posts: 985
Joined: Sat Jul 25, 2015 3:01 pm
Location: Canada
Languages: French (N), English (N), Spanish (C2 Cert.), German (B2 Cert)
x 2370

Re: Vocabulary Tests?

Postby s_allard » Tue Jan 30, 2018 1:20 pm

Never one to fear contradicting myself, I'll admit that vocabulary tests can be useful for comparative purposes, i.e. people taking the same test or an individual taking the same test over time. There is a sense of satisfaction in arriving at a big number.

Thanks a lot to rdearman for those great links. Very useful indeed.
0 x

tarvos
Black Belt - 2nd Dan
Posts: 2889
Joined: Sun Jul 26, 2015 11:13 am
Location: The Lowlands
Languages: Native: NL, EN
Professional: ES, RU
Speak well: DE, FR, RO, EO, SV
Speak reasonably: IT, ZH, PT, NO, EL, CZ
Need improvement: PO, IS, HE, JP, KO, HU, FI
Passive: AF, DK, LAT
Dabbled in: BRT, ZH (SH), BG, EUS, ZH (CAN), and a whole lot more.
Language Log: http://how-to-learn-any-language.com/fo ... PN=1&TPN=1
x 6094

Re: Vocabulary Tests?

Postby tarvos » Tue Jan 30, 2018 4:16 pm

s_allard wrote:I recently saw interviews of polyglots Luca Lampariello and Alex Rawlings in German. Both demonstrated very impressive levels of proficiency. How much German vocabulary did they produce? Given the length of the interviews, quite a small amount. But their productive vocabulary is undoubtedly much larger.


It is.

The problem is that breadth of vocabulary can be skewed. I tend to have a large vocabulary in pretty much all of my languages (occasionally I will use words even my teachers are confused about, but that are entirely acceptable). My vocabulary, however, skews technical, and many of the words are obscure, rare, or specific, but they just correspond well to my interests.

I like the example of the Spanish word "cama", which means bed. But in fantasy literature and old texts, you will encounter "lecho" (cf. French lit, Italian letto). A word entirely transparent to me (I never had to look it up), but if you don't read fantasy in Spanish, you don't really have reason to come across it very often. And on the other hand there's "catre", which is colloquial and used often in Rioplatense Spanish (not my style).
2 x

Cainntear
Black Belt - 3rd Dan
Posts: 3526
Joined: Thu Jul 30, 2015 11:04 am
Location: Scotland
Languages: English(N)
Advanced: French,Spanish, Scottish Gaelic
Intermediate: Italian, Catalan, Corsican
Basic: Welsh
Dabbling: Polish, Russian etc
x 8793

Re: Vocabulary Tests?

Postby Cainntear » Tue Jan 30, 2018 6:06 pm

While the tests in question are clearly a slightly unusual case, I think it highlights something that's very important for teaching.

I've had similar experiences of "I don't know this, but we were discussing [insert thing here], so it probably means that."

I've had that in end-of-unit tests, and I had it when I volunteered for a classmate's master's research on vocabulary.

But where I've seen it most is in computer-based language courses -- I saw it before Duolingo and I've seen it in DL itself and a lot of its lookalikes. When you do a receptive-skills question and you can complete it this way... are you learning anything? Does the "getting it right" part reinforce the pattern or is it just too easy to lead to learning? Maybe you do learn something from it sometimes, but at times I've found myself looking at the possible answers to a multiple-choice L2->L1 translation question, spotting that only one option has been taught in the course yet, and clicking on the correct answer without even knowing what the L2 wordform was. Seriously -- immediately after clicking I would realise that I had literally no idea.
0 x