Hash wrote: If you don't mind, please share the three lists of word forms for both languages.
Hi! I'm new here so I don't think I can post links yet, but I'll copy-paste the top 50 results for each language here. The first column is the raw frequency. The second column is the contextual diversity (aka "range" or "document frequency" in AntConc terms), i.e. the number of files the word form appears in. The third column is the word form, and the last column is the cumulative coverage: the share of all running words in the corpus that you would recognize if you knew every word form down to that row. I believe the huge difference is due to Russian affixing case markers to words (I don't know the proper grammatical term), thus creating many more word forms.
@Sfuqua: I know conjugated forms are supposed to count as the same word, but in my experience with Spanish learners, they can be unrecognizable to everyone except advanced students. I guess when you're at an advanced level you can rely on lemmatized word lists. Thankfully German-style compound words are rare in Spanish. The RAE wants us to write ex preso (ex-prisoner) as expreso (exprisoner, but also express), but I'm a rebel.
@Einzelne, thanks for commenting. I created the Russian list when I was learning a bit of Russian last year, but got disheartened by the statistics. I guess most of the words are compound words of some kind, maybe including case markers, so the difference in vocabulary is actually not as big as it seems. BTW, something interesting about Russian is how some words are pronounced almost exactly as they are in Spanish. I would dare say some cognates are easier to recognize in spoken Russian than in spoken French, despite French being a closer relative of Spanish.
*****Spanish*****
1258295 376 que 4.37 %
1051015 376 no 8.01 %
915078 376 a 11.19 %
767170 376 de 13.85 %
622668 376 la 16.01 %
570860 376 y 17.99 %
448317 376 me 19.55 %
435842 376 es 21.06 %
410289 376 lo 22.48 %
402304 376 qué 23.88 %
381298 376 el 25.20 %
342302 376 en 26.39 %
315485 376 por 27.49 %
273007 376 se 28.43 %
265823 376 con 29.36 %
250949 376 un 30.23 %
197721 376 ya 30.91 %
193379 376 para 31.58 %
190311 376 mi 32.24 %
188555 376 una 32.90 %
173597 376 está 33.50 %
168349 376 los 34.08 %
152414 376 si 34.61 %
122822 376 las 35.04 %
121733 376 más 35.46 %
80159 376 del 35.74 %
316200 375 yo 36.84 %
217892 375 pero 37.59 %
179939 375 le 38.22 %
142582 375 eso 38.71 %
113079 375 todo 39.11 %
107742 375 como 39.48 %
99473 375 muy 39.82 %
92225 375 su 40.14 %
89402 375 o 40.45 %
87623 375 al 40.76 %
83775 375 así 41.05 %
57044 375 esta 41.25 %
35685 375 son 41.37 %
325899 374 te 42.50 %
127360 374 bien 42.94 %
101864 374 porque 43.30 %
95926 374 nada 43.63 %
84298 374 sé 43.92 %
69819 374 hacer 44.17 %
63472 374 tiene 44.39 %
58213 374 hay 44.59 %
54820 374 ahora 44.78 %
50390 374 ser 44.95 %
47713 374 mucho 45.12 %
*****Russian*****
48075 98 не 2.47 %
43774 98 и 4.72 %
43618 98 я 6.96 %
41715 98 в 9.10 %
29340 98 а 10.61 %
23778 98 на 11.83 %
18706 98 с 12.79 %
18584 98 да 13.75 %
15311 98 у 14.53 %
14806 98 как 15.29 %
12412 98 так 15.93 %
12180 98 все 16.56 %
11554 98 он 17.15 %
10286 98 мы 17.68 %
9873 98 меня 18.18 %
9665 98 мне 18.68 %
8592 98 за 19.12 %
8111 98 нет 19.54 %
7482 98 же 19.92 %
6780 98 из 20.27 %
6623 98 тебя 20.61 %
6496 98 к 20.95 %
6134 98 очень 21.26 %
5985 98 если 21.57 %
5562 98 она 21.85 %
5257 98 там 22.12 %
4907 98 о 22.38 %
4763 98 бы 22.62 %
4533 98 нас 22.85 %
4499 98 вас 23.08 %
4426 98 от 23.31 %
4352 98 они 23.54 %
3978 98 где 23.74 %
2477 98 ли 23.87 %
2433 98 нам 23.99 %
22121 97 ты 25.13 %
15309 97 то 25.91 %
11076 97 вы 26.48 %
5727 97 здесь 26.78 %
5724 97 сейчас 27.07 %
5357 97 тебе 27.35 %
4763 97 для 27.59 %
3921 97 вам 27.79 %
3447 97 или 27.97 %
3082 97 был 28.13 %
3013 97 тут 28.28 %
2735 97 даже 28.42 %
2501 97 быть 28.55 %
2234 97 теперь 28.67 %
1855 97 куда 28.76 %
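
For anyone who wants to rebuild the same four columns from their own corpus, here is a rough Python sketch. It is not the AntConc procedure, just one way to get comparable numbers. The function name `coverage_table` and the input format (one list of already tokenized, lowercased word forms per file) are assumptions made for illustration, and the sort order (range first, then raw frequency) is inferred from how the lists above are arranged.

```python
from collections import Counter


def coverage_table(documents):
    """Build (frequency, range, word form, cumulative coverage %) rows.

    `documents` is a list of token lists, one list per file.
    Tokenization and lowercasing are assumed to have been done already.
    """
    freq = Counter()      # raw frequency across the whole corpus
    doc_freq = Counter()  # number of files each word form appears in

    for tokens in documents:
        freq.update(tokens)
        doc_freq.update(set(tokens))  # count each form once per file

    total_tokens = sum(freq.values())

    # Highest range first, then highest raw frequency; the word form
    # itself is only a tie-breaker so the order is deterministic.
    ordered = sorted(freq, key=lambda w: (-doc_freq[w], -freq[w], w))

    rows = []
    running = 0
    for word in ordered:
        running += freq[word]
        coverage = 100.0 * running / total_tokens
        rows.append((freq[word], doc_freq[word], word, coverage))
    return rows


if __name__ == "__main__":
    # Tiny made-up corpus of two "files", only to show the output shape.
    sample = [["que", "no", "que"], ["no", "que", "yo"]]
    for f, r, w, c in coverage_table(sample):
        print(f"{f} {r} {w} {c:.2f} %")
```

Because word forms are counted as they appear, every inflected form gets its own row, which is exactly why a heavily inflected language needs more rows to reach the same coverage. Feeding lemmas instead of surface forms into the same function would give the lemmatized picture.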