A few weeks ago I wrote this in the "Intermediate level" thread:
"Most language tests agree: a B2 test requires you know between 4,000 and 5,000 terms to understand what is on the test. A C1 test between 6000 and 9000. What most people think of as "good" speakers of a foreign language, based on anecdotal evidence I have read over time, have an average vocabulary in their L2 of about 8,000 to 10,000 words, which is still half to 1/3 of native speakers. I would assume a C2 test would require at least 15,000 words to get through without major challenges. Let's remember even natives have problems passing such tests."
I forgot to mention at the time that these figures were based on a combination of anecdotal evidence (about the level of "fluent" L2 speakers) and two tests done on this subject a few years back, as well as my own extrapolation from such studies and from the average vocabulary size that gets thrown around for natives. Plus the vocabulary lists of tests like the HSK, the Chinese proficiency test for foreigners. Ironically, while I personally used the HSK levels and their vocabulary sizes as a landmark of sorts to estimate CEFR vocab size per level, the HSK itself is criticized by European teachers of Chinese as inaccurate about what language level each HSK level purportedly certifies. Basically, they say the HSK grossly overestimates, since Hanban (the designers of the HSK) makes a 1:1 correspondence between HSK 1 through 6 and the six CEFR levels A1 to C2.
As an advanced student of Mandarin Chinese, I basically concur with the European teachers of Chinese that the HSK 5 is definitely not C1, but rather a mid B1. And the HSK 6 is certainly high B2, maybe borderline C1 (I say this because I have taken mock tests of the HSK 6 and they include some rather obvious topic jargon: from various diseases like Lou Gehrig's, to neutron stars in articles about astronomy, to various financial terms in interviews with a CEO). The interesting thing is that when the German association of teachers of Chinese specifically (the Fachverband Chinesisch) put out their letter of refutation, they actually said the following:
The Fachverband welcomes the new HSK Chinese Proficiency Test that was published by the People’s Republic of China (PRC) earlier this year, especially insofar as it certifies elementary knowledge of Chinese for beginners with a vocabulary of 150 to 300 lexical units on the basis of the Hanyu Pinyin transcription system. It thus serves as a valuable motivator for students of Chinese.
However, in the interests of a proper and realistic assessment of Chinese language proficiency, we at the Fachverband Chinesisch, after examining the documents, consider it our duty to categorically deny the linking between the new HSK levels, as set out in the official HSK documents, and those of the Common European Framework of Reference for Languages (CEFR):
At present, the vocabulary size required for level A1 in all foreign languages is about 500 lexical units, for A2 about 1,000, for level B1 about 2,000. The new HSK suggests that just one-third of this vocabulary size would be needed to achieve the same levels of proficiency.
The official data given by the Hanban envisage that level B2 (HSK 4) will be reached after just 2 years of learning with 2-4 hours of lessons per week (160-320 hours). These figures are out of the question, even for European languages. In this context we would like to refer once again to the resolution taken by the Fachverband in 2005, according to which we estimated that between 1,200 and 1,600 hours of instruction (+ private study time) are required to attain oral and written proficiency in Chinese that is comparable to level B2.
http://www.fachverband-chinesisch.de/si ... ungHSK.pdf
I slightly disagree (how dare I) with their view that HSK 6 is strictly a B2 level, since there is some fairly technical material in there, and I have the feeling that you need to know more than the 5,000 "lexical units" (see below) to really feel comfortable with the listening and reading. I would estimate you need north of 6,000 words to pass the test.
If you extrapolate those figures, then you would get something like this (I may be wrong of course):
A1 = 500
A2 = 1,000
B1 = 2,000
B2 = 4,000
C1 = 8,000
C2 = 16,000
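For anyone who wants to see the arithmetic spelled out, here is a quick Python sketch of that extrapolation. The only real inputs are the Fachverband's figures for A1 to B1 (500, 1,000, 2,000); the rule of "double it at every level" and the function name extrapolate_vocab are my own assumptions, not anything official.

```python
# Rough sketch: extrapolate vocabulary size per CEFR level by doubling.
# Assumption: the doubling seen in A1 -> A2 -> B1 simply continues up to C2.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def extrapolate_vocab(base=500, factor=2):
    """Return {level: estimated vocabulary size}, multiplying by `factor` per level."""
    return {level: base * factor ** i for i, level in enumerate(LEVELS)}

estimates = extrapolate_vocab()
for level, size in estimates.items():
    print(f"{level} = {size:,}")
```

Changing factor to something below 2 gives smaller figures at the top end, which shows how much the C1 and C2 numbers hang on that one assumption.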
I am not trying to claim I am a genius, I may just have made an educated guess, but these figures match very closely with what I wrote in my quote above. That is 4,000 words for B2, which is the level at which most people feel some "freedom of expression" in a foreign language, and twice the famous figure of 2,000 words needed to communicate in a basic sense in most languages. They also match what I have read about what people and experts consider "fluent foreign speakers of L2", who have a vocabulary of "just 8,000" words, which matches the C1 estimate above. The C2 vocabulary level is just my guess; it could be more. Most native speakers who have finished high school have vocabularies larger than 16,000 words from what I have read, so in theory they should be able to pass C2 exams. But then it also depends on their test-taking skills and other factors. University graduates, I would assume, could pass a C2, but who knows.
Finally, back to "lexical units". Is this a lemma? Or a word? If it is a word, does a lexical unit cover just one definition, or all the definitions of the same word? That is my major doubt, and it could throw all this yapping I am doing about vocabulary size and language levels for a loop.
Anyways, I am just sharing this bit of info and my thoughts with anyone who is interested in this kind of stuff.