Selecting extensive reading materials


Re: Selecting extensive reading materials

Postby rdearman » Wed Jul 06, 2016 2:38 pm

Zegpoddle wrote:Does anyone know of similar sites that grade text samples in other languages? Knowing the grade level of a text will not tell you how close you are to the magical 98% threshold that Nation recommends, but you could try reading books at various grade levels, see how difficult they are for you, and eventually identify what your ideal grade level is in your target language.

Unix/Linux has a command "style":
DESCRIPTION
Style analyses the surface characteristics of the writing style of a
document. It prints various readability grades, length of words,
sentences and paragraphs. It can further locate sentences with certain
characteristics. If no files are given, the document is read from
standard input.

Numbers are counted as words with one syllable. A sentence is a
sequence of words, that starts with a capitalised word and ends with a
full stop, double colon, question mark or exclamation mark. A single
letter followed by a dot is considered an abbreviation, so it does not
end a sentence. Various multi-letter abbreviations are recognized,
they do not end a sentence as well. A paragraph consists of two or
more new line characters.


It has options for different languages; I know it handled German and English. Here is the output for a random text file I had.

readability grades:
  Kincaid: 8.8
  ARI: 10.5
  Coleman-Liau: 10.5
  Flesch Index: 66.4/100 (plain English)
  Fog Index: 11.7
  Lix: 41.9 = school year 7
  SMOG-Grading: 10.6
sentence info:
  5459 characters
  1152 words, average length 4.74 characters = 1.43 syllables
  60 sentences, average length 19.2 words
  40% (24) short sentences (at most 14 words)
  11% (7) long sentences (at least 29 words)
  6 paragraphs, average length 10.0 sentences
  0% (0) questions
  70% (42) passive sentences
  longest sent 57 wds at sent 14; shortest sent 5 wds at sent 24
word usage:
  verb types:
  to be (51) auxiliary (4)
  types as % of total:
  conjunctions 7% (80) pronouns 2% (28) prepositions 13% (146)
  normalisations 3% (32)
sentence beginnings:
  pronoun (2) interrogative pronoun (0) article (14)
  subordinating conjunction (1) conjunction (1) preposition (11)
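
As a sanity check, several of the grades above can be recomputed from the counts in the "sentence info" block. This is only a sketch using the textbook coefficients for each formula. That style uses exactly these coefficients is an assumption on my part, although they do reproduce the reported figures to one decimal place.

```python
# Counts copied from the "sentence info" block of the output above.
characters = 5459
words = 1152
sentences = 60
syllables_per_word = 1.43  # already rounded in the output, so results are approximate

words_per_sentence = words / sentences   # 19.2
chars_per_word = characters / words      # about 4.74

# Textbook coefficients (assumed, not taken from the style source code).
kincaid = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
ari = 4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43
# Coleman-Liau works on letters and sentences per 100 words.
coleman_liau = (0.0588 * (100 * chars_per_word)
                - 0.296 * (100 * sentences / words) - 15.8)

for name, value in [("Kincaid", kincaid), ("ARI", ari),
                    ("Coleman-Liau", coleman_liau), ("Flesch Index", flesch)]:
    print(f"{name}: {value:.1f}")
```

The Fog, Lix and SMOG grades cannot be rebuilt this way, because they depend on counts of long or polysyllabic words that the summary does not print.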
"Never blame on malice that which can be explained by stupidity."


Postby patrickwilken » Wed Jul 06, 2016 4:49 pm

Zegpoddle wrote:Does anyone know of similar sites that grade text samples in other languages? Knowing the grade level of a text will not tell you how close you are to the magical 98% threshold that Nation recommends, but you could try reading books at various grade levels, see how difficult they are for you, and eventually identify what your ideal grade level is in your target language.


The site Readlang http://readlang.com allows you to upload ebooks and other text and, amongst other things, calculates a readability score based on some sort of word-frequency assessment. My (admittedly crude) assessment for German is that it does an OK job of sorting easy from intermediate from harder books.

Postby aokoye » Wed Jul 06, 2016 5:23 pm

patrickwilken wrote:The site Readlang http://readlang.com allows you to upload ebooks and other text and, amongst other things, calculates a readability score based on some sort of word-frequency assessment. My (admittedly crude) assessment for German is that it does an OK job of sorting easy from intermediate from harder books.


I do like Readlang, but I think it does a pretty poor job at CEFR levels. When I've entered texts that have been ranked by publishers, primarily Cornelsen but also some by Hueber (both have a good selection of texts ranked by level, though Cornelsen has significantly more), Readlang's rankings were generally higher than the publishers' rankings. Now that I've typed that out, you could probably just assume that Readlang is going to rate things higher than they actually are and work from there.

That said, I'm looking at my Readlang account right now and it ranked Tonio Kröger by Thomas Mann as B2, which I think is low. Meanwhile, it ranked Die Verwandlung by Kafka as C2. I didn't find much difference between the two texts; if anything, Die Verwandlung was a slightly easier read for me.

From their website:
The difficulty is calculated by a combination of:
- Automated Readability Index: wikipedia article
- The percentage of words which are in the top 2000 most frequent words in the language. The majority of the word frequency lists are based on movie subtitles and come from this site: Invoke IT Word Frequency Lists
It's far from perfect, but seems to work reasonably well for Spanish, English, French and German texts
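
For anyone curious what those two ingredients look like in practice, here is a minimal sketch. The ARI coefficients are the standard published ones, but the tokenizer, the sentence splitter, the toy frequency list and all the function names are my own assumptions. The quote does not say how Readlang combines the two numbers, so the sketch only computes them separately.

```python
import re

def tokenize(text):
    """Lowercased runs of letters; keeps non-ASCII letters such as ö and ß."""
    return re.findall(r"[^\W\d_]+", text.lower())

def split_sentences(text):
    """Naive split on ., ! or ? followed by whitespace or the end of the text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return [p for p in parts if p.strip()]

def automated_readability_index(text):
    """ARI from letters per word and words per sentence (digits are ignored here)."""
    words = tokenize(text)
    sentences = split_sentences(text)
    letters = sum(len(w) for w in words)
    return 4.71 * letters / len(words) + 0.5 * len(words) / len(sentences) - 21.43

def top_n_coverage(text, frequency_list, n=2000):
    """Share of running words that appear among the n most frequent words."""
    top = set(frequency_list[:n])
    words = tokenize(text)
    return sum(w in top for w in words) / len(words)

# Toy data, purely illustrative: a real list would hold thousands of entries.
toy_frequency_list = ["der", "die", "ist", "im", "hund", "katze"]
sample_text = "Der Hund ist alt. Die Katze schläft im Garten."

print(automated_readability_index(sample_text))
print(top_n_coverage(sample_text, toy_frequency_list, n=6))
```

Very short or empty inputs would need a guard against division by zero, and a frequency list built from subtitles will probably under-represent literary vocabulary, which may be one reason the levels drift upwards.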


That said, I'm not really complaining: I have a free lifetime account because of how long I've been using the service (I was one of the first 600 users), and it made my life a lot easier during my last German lit class.
Preferred gender pronouns: Masculine


Postby MorkTheFiddle » Sat Mar 11, 2017 10:26 pm

paz wrote:I found this project, which from my point of view is quite interesting. They created a Readability Catalog of Project Gutenberg eBooks. I quote:
This website determines vocabulary difficulty by a more direct measure than the conventional readability formulas. Earlier research found that the vocabulary difficulty of a text correlates with its text comprehension (e.g. Schmitt et al. 2011). This website investigates the vocabulary of a text with a vocabulary frequency list. The investigation process used in this website is similar to that described by Nation (2006), where this website only counts words included in the word family list and omits words with diacritical marks.

Since the ebooks are from Project Gutenberg, everything is free!
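
A rough sketch of the kind of frequency-band count described in that quote, under several assumptions of mine: the ranked list, the band size and the function names are hypothetical, each word form is treated on its own rather than grouped into word families, and "words with diacritical marks" is approximated as any token containing a non-ASCII character.

```python
import re

def cumulative_coverage(text, ranked_words, band_size=1000):
    """Cumulative share of counted tokens covered by each frequency band."""
    rank = {word: i for i, word in enumerate(ranked_words)}
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Following the quoted description: skip tokens with diacritics
    # and count only tokens that appear on the list.
    counted = [t for t in tokens if t.isascii() and t in rank]
    if not counted:
        return []
    n_bands = max(rank[t] for t in counted) // band_size + 1
    per_band = [0] * n_bands
    for t in counted:
        per_band[rank[t] // band_size] += 1
    coverage, running = [], 0
    for hits in per_band:
        running += hits
        coverage.append(running / len(counted))
    return coverage

def bands_needed(coverage, threshold=0.98):
    """Number of bands required to reach the threshold, or None if never reached."""
    for band_number, share in enumerate(coverage, start=1):
        if share >= threshold:
            return band_number
    return None

# Toy example with a band size of 2 so that the bands are visible.
toy_ranked = ["the", "a", "cat", "sat", "on", "mat", "zebra", "quagga"]
toy_text = "The cat sat on the mat. A naïve zebra sat."
toy_coverage = cumulative_coverage(toy_text, toy_ranked, band_size=2)
print(toy_coverage, bands_needed(toy_coverage))
```

With a real frequency list and a band size of 1000, the result would say roughly how many thousand words a reader needs before reaching the 98% figure discussed earlier in the thread. Because off-list words are left out of the count, rare names and unusual vocabulary make a book look easier than it is.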

Whatever became of this, Paz? What books did you select? What criteria did you use for your selections? Did your criteria turn out to be useful, or did you just have to feel your way to books you could read?
Ah ! Le bon billet qu'a La Châtre !


Postby jeffers » Tue Mar 14, 2017 12:17 pm

I'm a little surprised nobody has mentioned e-readers in this context. I find it frustrating to read a paper book with a lot of unknown words, but on my Kindle I can look up words without significant impact on my reading flow. If I am interested enough in the subject matter, I can read quite advanced material with the help of my Kindle.

Part of what Nation meant by "enjoying" reading is being able to read without stopping regularly to look things up. E-readers alleviate some of the stopping, enabling me to enjoy books that would otherwise be a chore to read.
Fr books: 7 / 100, films: 90 / 200
De books: 1 / 50, films: 6 / 50
Hi books: 0 / 50, films: 2 / 50
Gr books: 0 / 50, films: 0 / 50


Postby Tomás » Tue Mar 14, 2017 1:57 pm

jeffers wrote:I'm a little surprised nobody has mentioned e-readers in this context. I find it frustrating to read a paper book with a lot of unknown words, but on my Kindle I can look up words without significant impact on my reading flow. If I am interested enough in the subject matter, I can read quite advanced material with the help of my Kindle.

Part of what Nation meant by "enjoying" reading is being able to read without stopping regularly to look things up. E-readers alleviate some of the stopping, enabling me to enjoy books that would otherwise be a chore to read.


Easy fix: don't stop to look things up. I never do anymore--even on e-readers--and enjoy reading a lot more because of it.

Postby jeffers » Tue Mar 14, 2017 2:45 pm

Tomás wrote:
jeffers wrote:I'm a little surprised nobody has mentioned e-readers in this context. I find it frustrating to read a paper book with a lot of unknown words, but on my Kindle I can look up words without significant impact on my reading flow. If I am interested enough in the subject matter, I can read quite advanced material with the help of my Kindle.

Part of what Nation meant by "enjoying" reading is being able to read without stopping regularly to look things up. E-readers alleviate some of the stopping, enabling me to enjoy books that would otherwise be a chore to read.


Easy fix: don't stop to look things up. I never do anymore--even on e-readers--and enjoy reading a lot more because of it.


Which leaves you with the previous dilemma: how much can I get away with not knowing? Nation et al. assume 98%; others differ. Everyone probably has their own line of comfort. My point is simply that with an e-reader dictionary I can read things above the level I would be comfortable with on paper.

Postby Cavesa » Tue Mar 14, 2017 3:41 pm

aokoye wrote:
patrickwilken wrote:The site Readlang http://readlang.com allows you to upload ebooks and other text and, amongst other things, calculates a readability score based on some sort of word-frequency assessment. My (admittedly crude) assessment for German is that it does an OK job of sorting easy from intermediate from harder books.


I do like Readlang, but I think it does a pretty poor job at CEFR levels. When I've entered texts that have been ranked by publishers, primarily Cornelsen but also some by Hueber (both have a good selection of texts ranked by level, though Cornelsen has significantly more), Readlang's rankings were generally higher than the publishers' rankings. Now that I've typed that out, you could probably just assume that Readlang is going to rate things higher than they actually are and work from there.

That said, I'm looking at my Readlang account right now and it ranked Tonio Kröger by Thomas Mann as B2, which I think is low. Meanwhile, it ranked Die Verwandlung by Kafka as C2. I didn't find much difference between the two texts; if anything, Die Verwandlung was a slightly easier read for me.

From their website:
The difficulty is calculated by a combination of:
- Automated Readability Index: wikipedia article
- The percentage of words which are in the top 2000 most frequent words in the language. The majority of the word frequency lists are based on movie subtitles and come from this site: Invoke IT Word Frequency Lists
It's far from perfect, but seems to work reasonably well for Spanish, English, French and German texts


That said, I'm not really complaining: I have a free lifetime account because of how long I've been using the service (I was one of the first 600 users), and it made my life a lot easier during my last German lit class.



I am under the same impression. In my opinion, the Readlang level assessment seems to converge everything towards B2, with some B1 and occasional C1 book assessments. That is not a complaint: the assessment is not easy to do, and none I have ever seen, except for artificial graded readers, has seemed very reliable. Fortunately, it is not a vital function of the site. Really, the best assessment seems to be either asking other learners (preferably ones whose judgement we know how large a grain of salt to take with) or just opening the book and trying it out. I recommend opening it somewhere other than the first page, as many books start with descriptions and similar passages that are not very representative (for example, the Wheel of Time).

Postby Tomás » Tue Mar 14, 2017 3:54 pm

jeffers wrote:
Tomás wrote:
jeffers wrote:I'm a little surprised nobody has mentioned e-readers in this context. I find it frustrating to read a paper book with a lot of unknown words, but on my Kindle I can look up words without significant impact on my reading flow. If I am interested enough in the subject matter, I can read quite advanced material with the help of my Kindle.

Part of what Nation meant by "enjoying" reading is being able to read without stopping regularly to look things up. E-readers alleviate some of the stopping, enabling me to enjoy books that would otherwise be a chore to read.


Easy fix: don't stop to look things up. I never do anymore--even on e-readers--and enjoy reading a lot more because of it.


Which leaves you with the previous dilemma: how much can I get away with not knowing? Nation et al. assume 98%; others differ. Everyone probably has their own line of comfort. My point is simply that with an e-reader dictionary I can read things above the level I would be comfortable with on paper.


I assume that everyone has a different level of tolerance for ambiguity. However, the translate button itself is also remarkably addictive. I have a high tolerance for ambiguity while reading, but if I have a translate button available then I will click it like a cocaine-addicted monkey seeking his fix. For me, the solution was to get rid of the button and thus temptation.

Postby Ani » Tue Mar 14, 2017 5:28 pm

Tomás wrote:
I assume that everyone has a different level of tolerance for ambiguity. However, the translate button itself is also remarkably addictive. I have a high tolerance for ambiguity while reading, but if I have a translate button available then I will click it like a cocaine-addicted monkey seeking his fix. For me, the solution was to get rid of the button and thus temptation.


So what's wrong with the translation button? It is like a mix of iguanamon's side-by-side readers and EMK's cheating method. Personally, I don't see how anyone could learn plants and animals from an L2-L2 dictionary :) Eventually you will learn enough that it would take more time to look up or translate.
But there's no sense crying over every mistake. You just keep on trying till you run out of cake.

