Frequency lists and dictionaries

HerrSignore
White Belt
Posts: 19
Joined: Wed Apr 29, 2020 7:07 pm
Languages: English (N)
German (B1/B2)
Italian (A2)
Portuguese (A2)
Spanish (A1)
French (A1)
x 32

Frequency lists and dictionaries

Postby HerrSignore » Sun May 24, 2020 8:59 am

How do you incorporate frequency lists and dictionaries into your learning? For Portuguese, I had a frequency dictionary of the top 5000 words or so, and I essentially learnt each word, out of context, via flashcards and repetition, with varying success. This method will of course not teach you grammar or sentence structure, but as a raw method to quickly gain vocabulary, do you think it has merit?
2 x

smallwhite
Black Belt - 2nd Dan
Posts: 2154
Joined: Mon Jul 06, 2015 6:55 am
Location: Hong Kong
Languages: Native: Cantonese;
Good: English, French, Spanish, Italian;
Mediocre: Mandarin, German, Swedish, Dutch.
.
x 3992

Re: Frequency lists and dictionaries

Postby smallwhite » Sun May 24, 2020 9:18 am

I plug vocabulary holes with frequency lists. Let's say I'm a beginner and have just learnt 2000 words from textbooks and various sources. I would then go through a frequency list, words 1 to 2000, to look for words that I haven't learnt yet (holes) and make flashcards out of them (plugging holes).

The numbers never match like that (2000 vs 2000) so I have to use judgement or just stop when I feel sick of the list.
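
The hole-plugging step described above can be sketched in Python as a filtered pass over the frequency list. This is only an illustration: the function name `find_holes`, the sample words and the cutoff are made up for the example and do not come from any real frequency list.

```python
def find_holes(frequency_list, known_words, cutoff):
    """Return words among the first `cutoff` entries of the frequency
    list that are not yet known, keeping frequency order."""
    known = {w.lower() for w in known_words}
    return [w for w in frequency_list[:cutoff] if w.lower() not in known]


# Hypothetical sample data, for illustration only.
frequency_list = ["de", "la", "que", "el", "en", "y", "a", "los"]
known_words = ["la", "el", "y", "casa", "perro"]

holes = find_holes(frequency_list, known_words, cutoff=6)
print(holes)  # ['de', 'que', 'en']
```

The cutoff plays the role of the judgement call: it does not have to equal the number of words already learnt, and it can be raised or lowered until the list of holes feels manageable.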
9 x
This post was written as a public service at no cost to readers.

Dialang or it didn't happen.

Iversen
Black Belt - 2nd Dan
Posts: 2789
Joined: Sun Jul 19, 2015 7:36 pm
Location: Denmark
Languages: Monolingual travels in Danish, English, German, Dutch, Swedish, French, Portuguese, Spanish, Catalan, Italian, Romanian and (part time) Esperanto
Ahem, not yet: Norwegian, Afrikaans, Platt, Scots, Russian, Serbian, Bulgarian, Albanian, Greek, Latin, Irish, Indonesian and a few more...
Language Log: viewtopic.php?f=15&t=1027
x 6132

Re: Frequency lists and dictionaries

Postby Iversen » Sun May 24, 2020 9:45 am

There are a number of words which you HAVE to learn, but that doesn't amount to 5000 words. It is hard to judge the exact number of words in this category, but if I should venture a guess it might be fewer than 500 - maybe just 100 or so. So you have to learn all these words, but you can't do it from a frequency list (and not even from a full-size dictionary). The really important words are often irregular, and being so common they also tend to be used in many idiomatic expressions and may have special functions in grammar. In the case of irregular verbs etc. you might check their forms in a grammar or a special morphology source (like the website Verbix - and yes, I do know that errors have been found in it, but you still learn more correct facts than blatant errors from using it). But much of what you should know about the extremely common words has to be learnt from native speech and writing, supplemented by occasional lookups in dictionaries and grammars.

This doesn't mean that the frequency list is completely irrelevant, but it should only be used to check that you know all the essential words - and that means that it should be used fairly late in the learning process, not as a tool in the early phase.

After the absolutely necessary words there are some relatively frequent words which you should know, but being frequent they will also tend to pop up in your study materials, so it is less problematic if you miss a few here and there. Again, the best use of the frequency table is to check at a fairly late stage where your gaps are, but you don't need to be fanatical about it if you find a hole. If you have a hobby or special interest it would probably be more interesting and relevant for you to learn the special vocabulary pertaining to that area. Besides, the frequencies of words near the end of the 5000-word range are so low that you have time to look them up when you finally meet them.

Personally I do wordlists based on bilingual dictionaries to extend my vocabularies, and if possible I use dictionaries with some morphological information and examples of idiomatic expressions. You will rarely find that in a mere frequency list, but this kind of information is essential, and if you use a dictionary you can also choose to learn words that are relevant for YOU instead of just some words others have chosen for you. You may end up learning some rare and arcane words this way, but you can easily detect by using simple common sense whether the words you choose are too obscure to be relevant.

And yes, there are language learners who expect to learn all their words from genuine speech and texts, but I see them as the hunter-gatherers among us. Those who use dictionaries and language learning tools like Anki in a systematic way are the farmers.
11 x

Kraut
Blue Belt
Posts: 864
Joined: Mon Aug 07, 2017 10:37 pm
Languages: German (N)
French (C)
English (C)
Spanish (A2)
Lithuanian
x 1097

Re: Frequency lists and dictionaries

Postby Kraut » Sun May 24, 2020 11:08 am

https://www.linguee.de/german-spanish/t ... 1-200.html

Linguee has a page with frequency lists, that according to them is updated weekly. This looks great but I have realized that it does not seem to be updated - it looks the same as half a year ago, and I could not find other language pairs.
1 x

Haselnuss
White Belt
Posts: 11
Joined: Sun May 24, 2020 7:55 pm
Languages: English (N), German (learning)
x 38

Re: Frequency lists and dictionaries

Postby Haselnuss » Sun May 24, 2020 8:16 pm

I've put the German frequency dictionary at the center of my efforts to firm up my German. Learning the words in the frequency dictionary and completing an intermediate grammar book allowed me, with some extra effort, to listen to German radio and read the German news online every day.

Specifically, I created an Anki deck with all the words in the frequency dictionary, making sure for nouns to note down the plural and to color code for gender: masculine (blue), feminine (red) and neuter (green). I also marked all the nouns with any special grammatical attributes: weak masculine, irregular masculine, irregular neuter, adjectival declension, etc. For verbs, I noted down the principal parts of any strong or irregular verbs. (These are tips I picked up from reading the online handbook for German instruction at Oxford University.)
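
As a rough sketch of how such cards could be generated rather than typed by hand, the Python below writes tab-separated rows with the noun wrapped in a colored HTML span. Only the color scheme comes from the description above; the helper `noun_card`, the two-field layout and the sample nouns are assumptions made for illustration, and it also assumes the deck is filled through Anki's text-file import with HTML allowed in fields.

```python
import csv
import io

# Colour scheme from the description: masculine blue, feminine red, neuter green.
GENDER_COLOURS = {"m": "blue", "f": "red", "n": "green"}


def noun_card(article, noun, plural, gender, meaning, notes=""):
    """Build one (front, back) pair with the noun colour-coded by gender."""
    colour = GENDER_COLOURS[gender]
    # Single quotes inside the HTML avoid CSV quoting of the field.
    front = f"<span style='color:{colour}'>{article} {noun}</span>"
    back = f"{meaning} | plural: {plural}"
    if notes:
        back += f" | {notes}"
    return front, back


# Hypothetical sample entries, not copied from any frequency dictionary.
cards = [
    noun_card("der", "Student", "die Studenten", "m", "student", "weak masculine"),
    noun_card("die", "Zeit", "die Zeiten", "f", "time"),
    noun_card("das", "Jahr", "die Jahre", "n", "year"),
]

# Write one tab-separated row per card into an in-memory text buffer.
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(cards)
print(buffer.getvalue())
```

Keeping the grammatical notes in their own argument makes it easy to move them into a separate field later if the note type has one.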

One other factor that came into play in creating my FreqDict Anki deck was sorting out synonyms. I did this by cross-checking the relevant words in a couple of synonym dictionaries. An example is the pair "Anlass" and "Ursache", which both map to the English word "cause" but have slightly different meanings. Anlass = proximate cause; Ursache = root cause.

Just published last fall, the second edition of the Routledge German Frequency dictionary now has 5,000 words. The first edition, which is the one I originally used, has only 4,000 words.
1 x


Re: Frequency lists and dictionaries

Postby Kraut » Mon May 25, 2020 12:12 pm

Spanisch 5000 is a German - Spanish flashcard course based on the Davies frequency list with a lot more context than usual - and a third page.

Die 3 Prinzipien von „Spanisch 5000“

https://www.spanisch-5000.de/konzept.htm

Spanisch 5000 vereinigt 3 Prinzipien zu einem hoch effektiven, neuartigen Lernkonzept für Vokabeln: Das Prinzip der Häufigkeit, das Prinzip der feststehenden Wendungen und das Prinzip des Kontextlernens. Der Kurs geht damit über einen reinen Vokabelkurs deutlich hinaus und vermittelt in umfassender Weise Sprachkenntnisse. Das Prinzip der Häufigkeit: Jeder Sprach- oder Vokabelkurs bemüht sich, mit „einfachen“ Worten zu beginnen. Da Einheiten jedoch i.d.R. thematisch aufgebaut sind, fällt die Auswahl meist indes recht willkürlich aus - man erinnere sich an den Englischunterricht in der Schule. Wäre also nicht eine strikte Anordnung nach Häufigkeit ideal? Dieser - keineswegs trivialen - Aufgabe hat sich u.a. Prof. Mark Davies verschrieben, der die spanische Sprache (weltweit!) auf Worthäufigkeiten untersucht hat und die Ergebnisse als „Frequency Dictionary“ veröffentlicht hat - eine sortierte Wortliste der 5000 häufigsten Worte. Auf diesem Dictionary, für das uns Prof. Davies freundlicherweise die Nutzungsrechte gewährt hat, beruht nun unser Kurs.



Spanisch 5000 combines 3 principles into a highly effective, novel learning concept for vocabulary: the principle of frequency, the principle of fixed phrases and the principle of contextual learning. The course thus goes far beyond a mere vocabulary course and provides comprehensive language skills. The principle of frequency: Every language or vocabulary course tries to start with "simple" words. However, since units are usually thematically structured, the choice is usually quite arbitrary - remember the English lessons at school. So wouldn't a strict frequency order be ideal? This - by no means trivial - task has been taken on by Prof. Mark Davies, among others, who has examined the Spanish language (worldwide!) for word frequencies and published the results as a "Frequency Dictionary" - a sorted word list of the 5000 most frequent words. Our course is based on this dictionary, for which Prof. Davies has kindly granted us the rights of use.

(Translated with DeepL)
2 x

HerbM
Posts: 5
Joined: Thu Feb 20, 2020 11:11 am
Languages: English, native
French, upper beginner, perhaps closing on lower intermediate
Spanish beginner (inactive)
German, beginner (dormant, inactive)
German, lower beginner (dormant, inactive)
Arabic, lower beginner (dormant, inactive)
Nederland, lower beginner (dormant, inactive)
x 6

Re: Frequency lists and dictionaries

Postby HerbM » Sat May 30, 2020 12:23 am

For myself, I always start a new language with frequency lists (audio or written) among other things.

Even for those who subscribe to "comprehensible input" (as I do), this makes sense, because learning the first 1000, 2000, up to 5000 words or so allows you to read more interesting and appropriate material sooner.

Note that this is different from "learning" vocabulary in order to learn the language; it is rather about getting to the point where we can read and listen well enough to really learn the language.

Pronunciation is important to get started correctly and be able to read accurately from the beginning.
1 x


Re: Frequency lists and dictionaries

Postby HerbM » Thu Jun 18, 2020 5:30 am

Kraut wrote:
https://www.linguee.de/german-spanish/topgerman/1-200.html

Linguee has a page with frequency lists, that according to them is updated weekly. This looks great but I have realized that it does not seem to be updated - it looks the same as half a year ago, and I could not find other language pairs.


Many thanks.

Studying French from English, I had trouble locating this page, or the link to it, or even Googling for it in French based on the German-Spanish URL, so here is the pattern:

https://www.linguee.com       # the site
   /language1-language2       # the language pair; names are apparently in English in the URL
   /topLANGUAGE1              # the word "top" plus the language1 name, e.g., /topenglish
   /START-END.html            # positions of the words to show, e.g., /1-1000.html

Apparently 1000 words is the limit -- if you choose more, it builds links to each set of 100 words, with no actual words showing.
   
Example:
https://www.linguee.com/english-french/topenglish/1-1000.html


https://www.linguee.com/english-french/ ... 1-200.html

With this method you can get the desired list by downloading only 1 page per 1000 words.
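
The URL pattern above can be written as a small Python helper. This is a sketch under assumptions: the function names `linguee_top_url` and `page_urls` are made up here, and whether pages beyond the first 1000 words exist for a given language pair has not been checked. The code only builds URL strings; it does not download anything.

```python
def linguee_top_url(language1, language2, start, end):
    """Build a frequency-list URL following the pattern described above.

    Language names are lowercase English names, e.g. "english", "french".
    """
    if not 1 <= start <= end:
        raise ValueError("expected 1 <= start <= end")
    return (
        f"https://www.linguee.com/{language1}-{language2}"
        f"/top{language1}/{start}-{end}.html"
    )


def page_urls(language1, language2, total_words, page_size=1000):
    """One URL per block of `page_size` words, up to `total_words`."""
    return [
        linguee_top_url(language1, language2, first,
                        min(first + page_size - 1, total_words))
        for first in range(1, total_words + 1, page_size)
    ]


print(linguee_top_url("english", "french", 1, 1000))
# https://www.linguee.com/english-french/topenglish/1-1000.html

for url in page_urls("english", "french", 3000):
    print(url)
```

The default page size of 1000 follows the limit mentioned above; a smaller value can be passed if a page turns out to show only links.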

However, do note that the target language (e.g., language2) is NOT shown on the initial page with all the source language words.

Instead, each source language word is linked to the translation page for LANGUAGE1 which translates into language2.

Understandable, but perhaps slightly disappointing -- you need to either capture the translations one at a time, or paste the words into a translator (Google Translate will do this for about 5000 characters, not words, per submission, and will keep the translations aligned line for line).

It's easiest to align the translations by pasting into Excel columns or by using an editor with "column pasting" such as Notepad++ or Emacs.
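
For anyone who prefers a script to a spreadsheet, the same two steps can be sketched in Python: splitting the word list into chunks that fit the roughly 5000-character limit, and pairing source lines with translated lines. The function names and sample words are invented for this example, and the translation step itself is left to whatever tool you paste into.

```python
def chunk_by_chars(words, limit=5000):
    """Group words into newline-joined chunks of at most `limit` characters.

    A single word longer than `limit` still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for word in words:
        extra = len(word) + (1 if current else 0)  # +1 for the newline separator
        if current and size + extra > limit:
            chunks.append("\n".join(current))
            current, size = [], 0
            extra = len(word)
        current.append(word)
        size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks


def align_lines(source_text, translated_text):
    """Pair source and translated lines one-to-one as tab-separated rows."""
    source = source_text.strip().splitlines()
    translated = translated_text.strip().splitlines()
    if len(source) != len(translated):
        raise ValueError(f"line counts differ: {len(source)} vs {len(translated)}")
    return "\n".join(f"{s.strip()}\t{t.strip()}" for s, t in zip(source, translated))


# Hypothetical sample data, for illustration only.
print(chunk_by_chars(["aa", "bb", "cc"], limit=5))  # ['aa\nbb', 'cc']
print(align_lines("house\ndog\ncat", "maison\nchien\nchat"))
```

The line-count check is there because a single dropped or merged line in the translator output would silently shift every pair after it.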

On the other hand, translating a large group (500-1000 words) will only provide the main or primary translation rather than the full translation you would get by following the link or otherwise translating them individually.

Of course, with a bit of effort that too can all be automated pretty easily. (I'll get to that and share "real soon now".)
1 x

yong321
Orange Belt
Posts: 112
Joined: Thu Feb 25, 2016 12:42 am
Location: Texas
Languages: English, Chinese. Spanish, French, Italian, German, reading comprehension only.
Language Log: http://yong321.freeshell.org/misc.html#lang
x 126

Re: Frequency lists and dictionaries

Postby yong321 » Mon Jun 22, 2020 7:51 pm

HerbM wrote:
Kraut wrote:
https://www.linguee.de/german-spanish/topgerman/1-200.html

Linguee has a page with frequency lists, that according to them is updated weekly. This looks great but I have realized that it does not seem to be updated - it looks the same as half a year ago, and I could not find other language pairs.


Many thanks.
...
Of course, with a bit of effort that too can all be automated pretty easily. (I'll get to that and share "real soon now".)


By the way, how do you think the Linguee word frequency list compares with the others (Wiktionary frequency lists, Mark Davies' Routledge dictionaries, http://corpus.rae.es/lfrecuencias.html etc.)? If I'm not mistaken, this list is based on the words that visitors to the Linguee website search for. For example, on page
https://www.linguee.com/spanish-english ... 1-200.html
it says "Most common Spanish *queries*, 1 to 200".
But that frequency won't reflect how often the words *occur* in reading or listening materials. This is not criticizing their list; in fact, it's a very good idea since it implicitly gives certain words a higher weight according to real users' learning experience. But I just want to point it out and would like to hear you all's thoughts.
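
One rough way to see how far a query-based list diverges from a corpus-based one is to compare their top entries directly. The Python sketch below does that for two ranked lists; the function `compare_rankings` and both sample lists are invented for illustration and are not real Linguee or corpus data.

```python
def compare_rankings(list_a, list_b, top_n):
    """Compare the top `top_n` entries of two ranked word lists.

    Returns the shared words, the words unique to each list, and the
    fraction of list_a's top entries that also appear in list_b's.
    """
    top_a, top_b = list_a[:top_n], list_b[:top_n]
    set_a, set_b = set(top_a), set(top_b)
    shared = [w for w in top_a if w in set_b]
    only_a = [w for w in top_a if w not in set_b]
    only_b = [w for w in top_b if w not in set_a]
    overlap = len(shared) / len(top_a) if top_a else 0.0
    return shared, only_a, only_b, overlap


# Made-up sample lists: one ranked by lookups, one ranked by occurrence.
query_list = ["aprovechar", "de", "lograr", "que", "desarrollo"]
corpus_list = ["de", "la", "que", "el", "en"]

shared, only_queries, only_corpus, overlap = compare_rankings(
    query_list, corpus_list, top_n=5
)
print(shared)        # ['de', 'que']
print(only_queries)  # ['aprovechar', 'lograr', 'desarrollo']
print(only_corpus)   # ['la', 'el', 'en']
print(overlap)       # 0.4
```

Words that rank high only in the query list would be the ones learners look up often, while words that rank high only in the corpus list would be the ones that occur constantly but rarely need a lookup.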
1 x

tungemål
Green Belt
Posts: 374
Joined: Sat Apr 06, 2019 3:56 pm
Location: Norway
Languages: Norwegian (N)
English, German, Spanish, Japanese, Dutch, Polish
x 624

Re: Frequency lists and dictionaries

Postby tungemål » Tue Jun 23, 2020 8:09 am

Kraut wrote:
Spanisch 5000 is a German - Spanish flashcard course based on the Davies frequency list with a lot more context than usual - and a third page.


I had a look at their website. The course looks very good, and it is free. Are you using it? And do you use their flashcard app?

If I were to do this course I would simultaneously practice both Spanish and German. Whether that is an advantage I'm unsure of.
0 x

