Word frequency checker

Iversen
Black Belt - 4th Dan
Posts: 4787
Joined: Sun Jul 19, 2015 7:36 pm
Location: Denmark
Languages: Monolingual travels in Danish, English, German, Dutch, Swedish, French, Portuguese, Spanish, Catalan, Italian, Romanian and (part time) Esperanto
Ahem, not yet: Norwegian, Afrikaans, Platt, Scots, Russian, Serbian, Bulgarian, Albanian, Greek, Latin, Irish, Indonesian and a few more...
Language Log: viewtopic.php?f=15&t=1027
x 15049

Re: Word frequency checker

Postby Iversen » Thu Jan 14, 2016 1:10 am

The most common words are typically also the most irregular, and if the software only counts word forms then the statistics will be skewed. However, it would be far more complicated to mark frequencies for headwords or word families, and the problem peters out after the first couple of hundred words.

It could actually be interesting to see a colour-coded text as proposed. The programming should not be too hard if you work on text files, but doing it inside a program like Word or LibreOffice Writer would be more complicated. First you have to make a version of the frequency table with frequency codes (maybe the colour codes themselves), and this list should be sorted alphabetically and saved as a text file. Then you take the text and mark all the word forms, i.e. all letter sequences that have either nothing before or after them or a space, dot, comma or other sign. If you work with text files then this can be done simply by moving a pointer (or rather two, because you need to 'remember' the beginning of each word). If you work inside a full-blown word processor like Word or LibreOffice Writer then there are definitely some modules that can identify words. Right now I don't know how to activate them, but it can be done.
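
A minimal sketch of the two-pointer scan described above, in Python. It assumes that a "word form" is simply an unbroken run of letters, so apostrophes and hyphens split words; the function name `find_words` and the sample sentence are made up for illustration.

```python
def find_words(text):
    """Return (start, end) index pairs for each unbroken run of letters."""
    spans = []
    start = None  # second "pointer": remembers where the current word began
    for pos, ch in enumerate(text):
        if ch.isalpha():
            if start is None:
                start = pos  # first letter of a new word
        elif start is not None:
            spans.append((start, pos))  # a non-letter ends the word
            start = None
    if start is not None:
        spans.append((start, len(text)))  # word running to the end of the text
    return spans


sample = "Det er en prøve, ikke sandt?"
words = [sample[a:b] for a, b in find_words(sample)]
print(words)
```

Keeping the positions rather than just the words means the original text can later be rebuilt with something wrapped around each word.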

When a word form has been identified, it should be looked up in the frequency table. 'Not found' is also a result. Then you simply put the colour code you found inside a tag before and after each and every word to produce the multicolour version. The only question is finding someone who is willing to invest some time in the project, and right now it is not me.
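
The lookup and tagging step could be sketched roughly as below. Several things here are assumptions rather than anything specified in the post: the table format (one `word<TAB>colour` pair per line), HTML `<span>` tags as the output markup, grey as the 'not found' colour, and the names `load_table` and `colourise`. Because the table is loaded into a Python dict, it does not actually need to be sorted; sorting would matter if you searched the file directly.

```python
import html
import re

# Roughly "runs of letters": word characters minus digits and underscore.
WORD = re.compile(r"[^\W\d_]+")


def load_table(lines):
    """Build a {word: colour} dict from lines of the form 'word<TAB>colour'."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        word, colour = line.split("\t")
        table[word.lower()] = colour
    return table


def colourise(text, table, default="grey"):
    """Wrap each word in a span coloured by its table entry ('not found' -> default)."""
    out = []
    last = 0
    for m in WORD.finditer(text):
        out.append(html.escape(text[last:m.start()]))  # punctuation and spaces
        colour = table.get(m.group().lower(), default)
        out.append('<span style="color:%s">%s</span>'
                   % (colour, html.escape(m.group())))
        last = m.end()
    out.append(html.escape(text[last:]))
    return "".join(out)


table = load_table(["the\tgreen", "cat\tblue"])
print(colourise("The cat sat.", table))
```

With a real frequency list you would pass an open file to `load_table` instead of the two-line list, and save the result as an `.html` file to view it in a browser.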

Elenia
Black Belt - 1st Dan
Posts: 1888
Joined: Sun Jul 19, 2015 1:22 am
Location: London
Languages: English (N), Swedish (C1), French (Massively Atrophied) German (lowly beginner, somehow learnt to read)


Finnish?!
Language Log: viewtopic.php?t=708
x 3280

Re: Word frequency checker

Postby Elenia » Thu Jan 14, 2016 1:23 am

rdearman wrote: I use something called FBReader which works on just about every platform, and the smartphone version has integrated google translations.


Off topic, but thanks for linking to FBReader! I used it on my old computer, but had difficulty finding it. I also never knew there was an app! I bet it's Android only, though :(

---

On topic: what is Notepad++? By which I really mean, is it a standard Windows app, or does it need to be downloaded? Or is it an app for another OS?

Montmorency
Brown Belt
Posts: 1035
Joined: Tue Oct 06, 2015 3:01 pm
Location: Oxfordshire, UK
Languages: English (Native)
Maintaining: German (active skills lapsed somewhat).
Studying: Welsh (advanced beginner/intermediate);
Dabbling/Beginner: Czech

Back-burner: Spanish (intermediate) Norwegian (bit more than beginner) Danish (beginner).

Have studied: Latin, French, Italian, Dutch; OT Hebrew (briefly) NT Greek (briefly).
Language Log: viewtopic.php?f=15&t=1429
x 1184

Re: Word frequency checker

Postby Montmorency » Thu Jan 14, 2016 5:16 pm

Elenia wrote:
On topic: what is Notepad++? By which I really mean, is it a standard Windows app, or does it need to be downloaded? Or is it an app for another OS?


It's a Windows program (or programme, if you prefer). It's a much more powerful alternative to MS Notepad, and it is free and open source. I think it's mostly aimed at programmers, but of course, it's useful for other people as well.

It needs to be downloaded, but I don't remember it being a problem to install. It has a load of plugins as well, though I can't remember if you have to install those separately. I've probably only exploited a fraction of its power though.

https://notepad-plus-plus.org