Arnaud's lazy log (Russian & co)

Arnaud
Blue Belt
Posts: 984
Joined: Sat Jul 18, 2015 11:57 am
Location: Paris, France
Languages: Native: French
Intermediate: English, Russian, Italian
Tourist : Breton, Greek, Chinese, Japanese, German, Spanish, Latin
Language Log: viewtopic.php?t=1524
x 2172

Re: Arnaud's lazy log (Russian & co)

Postby Arnaud » Sun Aug 20, 2017 5:37 pm

A little page about the most frequent Russian words.

The average word length is 5.28 characters.
The average sentence length is 10.38 words.
1000 most frequent lemmas cover 64.0708% of word forms in texts.
2000 most frequent lemmas cover 71.9521% of word forms in texts.
3000 most frequent lemmas cover 76.6824% of word forms in texts.
5000 most frequent lemmas cover 82.0604% of word forms in texts.

As you can see, the 5000 most frequent lemmas cover 82.0604% of word forms in texts.
To cover 90% of word forms you need 11545 lemmas, and to cover 95% you need 24438.
So in Russian it's impossible to cover 90% with 5000 words, which is what everybody feels from experience and keeps repeating here and there in the different personal logs.
I could add that, as the corpus is made of texts from 1970 to 2002, a lot of archaic or old vocab you meet in classical literature is not counted, so I suspect (but that's just my personal feeling) that the number needed to cover 90% is even higher than 11545 if your goal is to read Russian classical literature. I also suppose that slang and mat are not covered, but you meet them frequently in certain books or even TV series (if you watch физрук, it's full of colloquialisms that you don't find in dictionaries).
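
For anyone who wants to check figures like these against their own frequency list, here is a minimal sketch in Python of how such a coverage curve can be computed. It is not the method used by the page above: the function names and the toy counts are made up for illustration, and it assumes you already have a count of occurrences per lemma.

```python
from itertools import accumulate


def coverage_curve(lemma_counts):
    """Cumulative share of all occurrences covered by the top 1, 2, 3... lemmas.

    lemma_counts is a hypothetical dict mapping lemma -> number of occurrences.
    """
    counts = sorted(lemma_counts.values(), reverse=True)
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


def lemmas_needed(lemma_counts, target):
    """Smallest number of most frequent lemmas whose coverage reaches target.

    target is a fraction such as 0.90. Returns None if the list is empty.
    """
    for rank, covered in enumerate(coverage_curve(lemma_counts), start=1):
        if covered >= target:
            return rank
    return None


# Toy data, invented for the example (100 occurrences in total).
toy_counts = {"и": 50, "в": 30, "не": 10, "на": 6, "я": 4}

print(lemmas_needed(toy_counts, 0.85))  # 3
print(lemmas_needed(toy_counts, 0.95))  # 4
```

With a real frequency list, the same function would tell you how many lemmas are needed for 90% or 95%, which is the kind of number quoted above.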

Perhaps it's the same with other languages, I don't know, but as Russian is inflected, it gives you a lot of forms to learn, which perhaps reinforces the feeling of difficulty.

neofight78
Blue Belt
Posts: 539
Joined: Wed Jul 22, 2015 8:02 pm
Location: Novosibirsk, Russia
Languages: English (N), Russian (B2+), Spanish (A0)
Language Log: viewtopic.php?t=833
x 1232

Re: Arnaud's lazy log (Russian & co)

Postby neofight78 » Sun Aug 20, 2017 5:55 pm

To back up Arnaud's general stats with some that relate to a specific book...

Человек-амфибия contains 5662 unique words. So if we very carefully select our 5000 words so that they are all in this book, we would come out as knowing 5000 / 5662 = 88% of all the words in the book. So we fall slightly short of the stated 90%, but let's not quibble over a few percent here. The real killer is that the odds that all those 5000 words we already know are in the book are pretty small.

I have just shy of 8000 words in my Anki deck. I have also read Человек-амфибия and learned a load of new words from it. Yet, these 8000 words still only give 86% coverage of the book.

Now let's take a book that I've not read yet: Контрольный выброс. There are 8157 unique words. Our ideal 5000 gives 5000 / 8157 = 61% - this is a long way from 90%! My current vocabulary of around 8000 comes out at only 73% coverage.
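
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are invented, and how the 86% and 73% figures were actually measured is not stated here, so the set-based function is just one plausible way to do it. Note that both functions count unique words, not running words.

```python
def best_case_coverage(known_words, unique_words_in_book):
    """Upper bound on the share of a book's unique words that are known,
    assuming every known word happens to occur in the book."""
    if unique_words_in_book <= 0:
        raise ValueError("the book must contain at least one unique word")
    return min(known_words, unique_words_in_book) / unique_words_in_book


def actual_coverage(known_vocabulary, book_words):
    """Share of the book's unique words found in a known vocabulary.

    Both arguments are hypothetical iterables of already normalised words.
    """
    book = set(book_words)
    return len(book & set(known_vocabulary)) / len(book)


# Figures from the post: 5662 and 8157 unique words, 5000 known words.
print(round(best_case_coverage(5000, 5662) * 100))  # 88
print(round(best_case_coverage(5000, 8157) * 100))  # 61
```

The gap between the best case and a measured figure comes from known words that never appear in the book: they count towards the vocabulary size but add nothing to the coverage.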

In short, Arnaud is indeed a hero for finishing Человек-амфибия, especially considering he was using a paper dictionary. He is also right to moan about the amount of vocabulary needed to read Russian fiction. The idea that learning a select 5000 words will make reading easy also looks very doubtful.

smallwhite
Black Belt - 2nd Dan
Posts: 2386
Joined: Mon Jul 06, 2015 6:55 am
Location: Hong Kong
Languages: Native: Cantonese;
Good: English, French, Spanish, Italian;
Mediocre: Mandarin, German, Swedish, Dutch.
.
x 4876

Re: Arnaud's lazy log (Russian & co)

Postby smallwhite » Sun Aug 20, 2017 6:02 pm

To Arnaud,

Yes. While MY 5000 words, the 5000 words in MY quizlet and mentioned in MY posts, do not refer to these 5000 most frequent words. My 5k words don't include cognates, place names, easy words like personal pronouns or "and", "but", etc. I do not quizlet every single word I encounter. My 5k words are maybe half of the 10000 most frequent words or something. Do you understand now? I was not saying that the 5000 most frequent Greek words cover 90% of my novels. What happened was, I read, I picked SOME words to study, I picked 5k, and now I know 90% of my novels.

And yes, I know Russian is difficult. What I'd like to hear from you is how you think Russian compares with German and Greek. Which is harder and how?
Last edited by smallwhite on Sun Aug 20, 2017 6:18 pm, edited 1 time in total.
Dialang or it didn't happen.

Re: Arnaud's lazy log (Russian & co)

Postby smallwhite » Sun Aug 20, 2017 6:16 pm

neofight78 wrote:The idea that learning a select 5000 words will make reading easy also looks very doubtful.

I am sure I did not say that about Russian. The 5000 was my GREEK wordcount all along, and I don't even know Russian?

Re: Arnaud's lazy log (Russian & co)

Postby neofight78 » Sun Aug 20, 2017 6:18 pm

smallwhite wrote:Yes. While MY 5000 words, the 5000 words in MY quizlet and mentioned in MY posts, do not refer to these 5000 most frequent words. My 5k words don't include cognates, place names, easy words like personal pronouns or "and", "but", etc. I do not quizlet every single word I encounter. My 5k words are maybe half of the 10000 most frequent words or something. Do you understand now? I was not saying that the 5000 most frequent Greek words cover 90% of my novels. What happened was, I read, I picked SOME words to study, I picked 5k, and now I know 90% of my novels.


Go back and read my post. I've posted best case scenarios for 5000 words, the frequency of the words is irrelevant.

smallwhite wrote:And yes, I know Russian is difficult. What I'd like to hear from you is how you think Russian compares with German and Greek. Which is harder and how?


If you get 90% coverage with 5000 words in Greek, it must be easier, because that's not possible in Russian. FSI says German is easier than Russian.
Last edited by neofight78 on Sun Aug 20, 2017 6:39 pm, edited 1 time in total.

Re: Arnaud's lazy log (Russian & co)

Postby Arnaud » Sun Aug 20, 2017 6:20 pm

smallwhite wrote:And yes, I know Russian is difficult. What I'd like to hear from you is how you think Russian compares with German and Greek. Which is harder and how?
Do you see German or Greek in my profile?
My level of German is probably A0.5, no more, so how can I compare carrots and apples?
In fact I've read in Blaurebel's log that you studied Russian for at least 100 hours, so you're better placed than me to answer your own question.

Re: Arnaud's lazy log (Russian & co)

Postby smallwhite » Mon Aug 21, 2017 4:09 am

neofight78 wrote:
smallwhite wrote:Yes. While MY 5000 words, ...


Go back and read my post. I've posted best case scenarios for 5000 words, the frequency of the words is irrelevant.

My post above was for Arnaud, whose post was before yours.



I've been explaining and re-explaining in vain, but I think I get it now. Let me try again.

This sentence by Arnaud is simply not true: "Smallwhite wrote in a previous message that with 5000 words in greek she knows 90% of the words of her book".
This part is true: "in greek she knows 90% of the words of her book".
But this part is not true: "with 5000 words in greek".

Looking at how you two present your numbers, apparently you only know words that are in your notebook/Anki and you do not know other words. I, on the other hand, study and read differently. So, forget that "5000" figure, which was stated by Arnaud and not by me in the first place and misused. I do not know 5000 words. I know maybe 10,000 words.

Smallwhite has about 5,000 flashcards in Greek.
Smallwhite knows maybe 10,000 words in Greek.

I hope everything is clear now.



neofight78 wrote:
In short, Arnaud is indeed a hero for finishing Человек-амфибия, especially considering he was using a paper dictionary. He is also right to moan about the amount of vocabulary needed to read Russian fiction.

Maybe I'm misinterpreting your message, but thought I'd quote the below just to be clear - it was Arnaud himself who said his method didn't work, not me:
Arnaud wrote:
... That neverending list of unknown vocab is depressing...
... It's frustrating. Russian is frustrating and the more I progress the more I feel my knowledge of that language is superficial and fragile. Something definitively didn't work, but I don't know what: perhaps the lack of teachers to explain why this and not that. I've too many unanswered questions and they keep on accumulating.

And I did not think he was moaning incorrectly. In fact I said the exact opposite: I would moan, too, if I didn't use Quizlet, so I brought up Quizlet.



Arnaud wrote:
Do you see German or Greek in my profile?
My level of German is probably A0.5, no more, so how can I compare carrots and apples?
In fact I've read in Blaurebel's log that you studied Russian for at least 100 hours, so you're better placed than me to answer your own question.

No, I don't see Russian on your profile nor on mine, but this comparison between Greek and Russian was actually started by you yourself and I was only continuing it:
Arnaud wrote:
russian...
2. The quantity of vocab you need is astronomical...
3. How many words, then ? Smallwhite wrote in a previous message that with 5000 words in greek she knows 90% of the words of her book. Well, with 5000 words in russian, you ...
Myself after having read Assimil (sans peine+perfectionnement) which gives about 7500 words and the textbook of Patapova which gives 8000 words, I was unable to read children books. I had to add several graded-readers and transcripts of series during a long time to be able to reach a level where I could start to read and be under 10% of unknown words.

It's fine if you do not want to continue your comparison of Greek and Russian now, but please know that I was only continuing your comparison. I've been trying to let you know that the "5000" number there was incorrect, and was hoping for a revised comparison from you. But if you do not want to revise your comparison then never mind. Thank you for your time.

And neofight78, you do not need to revise your calculations based on my new figure of 10,000 either. I can crunch the numbers myself. Thank you for the data.

Re: Arnaud's lazy log (Russian & co)

Postby neofight78 » Mon Aug 21, 2017 5:20 am

Oh, now suddenly it's 10000 words?

All I can discern from your posts is that you seem to like Quizlet and trolling.

Re: Arnaud's lazy log (Russian & co)

Postby Arnaud » Mon Aug 21, 2017 5:24 am

Ok, you know 10000 words in Greek, I started the comparison between Greek and Russian, and I didn't understand anything of what you wrote.
Thanks for dropping by and good luck with your studies.

Re: Arnaud's lazy log (Russian & co)

Postby Arnaud » Sat Aug 26, 2017 7:28 pm

Russian:
- I've reached 50% of Голова профессора Доуэля and it's going well. It seems to me that the story is easier to read than Человек-амфибия: fewer descriptions of the outdoors or natural landscapes, less action, but the novel is still very good: I'm completely hooked :D
I've added Чернобыльская молитва as a "spare tire" for when I'm too tired to read the main book: fewer words to look up in the dictionary.
- I've "watched" (more precisely, listened to in the background while reading) Метод Фрейда this week: well, I understood almost nothing. I was surprised to see the same actor as in Мажор, Павел Прилучный, and I absolutely don't understand what that guy says, in general, so I think I'm going to go back to studying series with that actor to try to improve my listening abilities, because it's frustrating. Of course, I'll have to read less (probably half an hour less per day). I think I need more quantity, as simple as that: I should try to push my current 500h of listening to 1000h.
- I've read Paul Noble's little book about language learning (thanks reineke): a lot of ideas in it, including the idea that you need to repeat a lot, far more than I do: I count too much on my natural genius and my perfect memory to learn things :mrgreen: . I think I should repeat more, especially the listening material, as listening is one of my main problems.