I have a friend who used to use color highlighting to parse complex sentences, e.g. verb phrases were highlighted in green. He had a complete color system worked out. You might consider this a streamlined form of sentence diagramming, without actual diagrams. He thought it helped him to see the syntactic structure much more easily. Other people's reactions varied - some thought it was a crutch he should abandon, and others thought it was just way too much work... He didn't seem to mind investing the time, though, and I don't think it really slowed him down that much.
He only did this on paper, not on a computer. His syntax highlighting system probably could have been more elaborate if he hadn't been limited to the physical color highlighters he had readily available in those days. It was rather colorful as it was, though.
32000 words of intensive reading - an accidental experiment
-
- Orange Belt
- Posts: 228
- Joined: Sun Feb 26, 2017 4:01 pm
- Languages: English (native); strong reading skills - Russian, Spanish, French, Italian, German, Serbo-Croatian, Macedonian, Bulgarian, Slovene, Farsi; fair reading skills - Polish, Czech, Dutch, Esperanto, Portuguese; beginner/rusty - Swedish, Norwegian, Danish
- x 590
-
- Black Belt - 3rd Dan
- Posts: 3532
- Joined: Thu Jul 30, 2015 11:04 am
- Location: Scotland
- Languages: English (N)
Advanced: French, Spanish, Scottish Gaelic
Intermediate: Italian, Catalan, Corsican
Basic: Welsh
Dabbling: Polish, Russian etc.
- x 8809
- Contact:
Re: 32000 words of intensive reading - an accidental experiment
I personally find that the small amount of time I spent explicitly parsing English taught me enough about the process that I can now do it in my head, and can "see" divisions in text without marking them, both in English and in languages that I'm studying.
4 x
- Tristano
- Blue Belt
- Posts: 640
- Joined: Mon Jul 20, 2015 7:11 am
- Location: The Netherlands
- Languages: Native: Italian
Speaks: English, Dutch, French, Spanish
Understands but not yet speaks: Romanian
Studies: German
Can't wait to put his hands on: Scandinavian languages, Slavic languages, Turkish, Arabic and other stuff
- Language Log: viewtopic.php?f=15&t=5141
- x 1015
Re: 32000 words of intensive reading - an accidental experiment
Except for thanking you for the nice post and congratulating the woman for both effort and results, I'm intruding here with a side question.
I'm used to finding my English horrible. The reported excerpts and part of the posts belonging to this thread: to which level of the CEFR scale can we associate them? That would be my level in passive English, and one below would be a rough representation of my active skills.
0 x
-
- Posts: 7
- Joined: Sun Mar 06, 2016 9:24 pm
- x 8
Re: 32000 words of intensive reading - an accidental experiment
I'm taking a similar approach with Greek, using the so-called advanced texts from GreekPod101. By my estimate, the 50 texts available amount to approximately 15,000 words. I copy them to Google Docs and add notes to the words I'm not totally familiar with. I can feel a little progress in my understanding, but I'm not even halfway through.
5 x
- luke
- Brown Belt
- Posts: 1243
- Joined: Fri Aug 07, 2015 9:09 pm
- Languages: English (N). Spanish (intermediate), Esperanto (B1), French (intermediate but rusting)
- Language Log: https://forum.language-learners.org/vie ... 15&t=16948
- x 3632
Re: 32000 words of intensive reading - an accidental experiment
s_allard wrote:Axon wrote:...
In political articles there are monsters of sentences like this one: In a speech at Georgetown University, she laid out the U.S. military maneuvers over the past several months—including a nuclear-powered submarine heading to South Korea, the movement of three aircraft carriers to the Western Pacific, and the Army testing out “mobilization centers” for deploying troops and training soldiers to fight in tunnels like those beneath North Korea—that inform this worry.
I'd be curious to see what the interactive process was to arrive at a complete understanding of that sentence.
One should be able to join the parts before and after the dashes.
In a speech at Georgetown University, she laid out the U.S. military maneuvers over the past several months that inform this worry. ("this worry", refers to a sentence before the monster one quoted).
Then one can treat the part between the dashes as support for the main idea ("this worry"):
including a nuclear-powered submarine heading to South Korea, the movement of three aircraft carriers to the Western Pacific, and the Army testing out “mobilization centers” for deploying troops and training soldiers to fight in tunnels like those beneath North Korea (Refers to the "U.S. military maneuvers" in the main sentence).
So, formulaically, in good English (like the New York Times article that was quoted), one can parse a sentence with dashes something like this:
A -- B -- C.
as
AC. B.
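The A -- B -- C rule above can be tried out mechanically. What follows is only a rough sketch: the function name `lift_aside` and the dash pattern are my own assumptions, and it handles nothing beyond a sentence with exactly one pair of dashes.

```python
import re

# An em dash, or a double hyphen used as a dash, with optional spaces around it.
DASH = re.compile(r"\s*(?:—|--)\s*")

def lift_aside(sentence):
    """Split 'A -- B -- C' into ('A C', 'B').

    Returns (sentence, None) when the sentence does not contain
    exactly one pair of dashes, since the rule does not apply then.
    """
    parts = DASH.split(sentence.strip())
    if len(parts) != 3:
        return sentence, None
    before, aside, after = parts
    # Join the outer parts to recover the main clause; keep the aside separate.
    return f"{before} {after}", aside

main, aside = lift_aside(
    "She laid out the maneuvers—including a submarine heading to South Korea—that inform this worry."
)
print(main)   # the main clause, read first
print(aside)  # the supporting detail, read second
```

A reader could do the same thing by hand, of course. The point is only that the rule is regular enough to write down.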
0 x
: Cien años de soledad 20x
: 5500 pages - Reading
: FSI Basic Spanish 3x
: Camino a Macondo
- reineke
- Black Belt - 3rd Dan
- Posts: 3570
- Joined: Wed Jan 06, 2016 7:34 pm
- Languages: Fox (C4)
- Language Log: https://forum.language-learners.org/vie ... =15&t=6979
- x 6554
Re: 32000 words of intensive reading - an accidental experiment
Axon wrote:
In political articles there are monsters of sentences like this one: In a speech at Georgetown University, she laid out the U.S. military maneuvers over the past several months—including a nuclear-powered submarine heading to South Korea, the movement of three aircraft carriers to the Western Pacific, and the Army testing out “mobilization centers” for deploying troops and training soldiers to fight in tunnels like those beneath North Korea—that inform this worry.
Look how long that middle part is! And less obvious but equally confusing: what are the mobilization centers for? Deploying troops and training soldiers. To do what? Fight in tunnels. Tunnels like what? Like the tunnels beneath North Korea. What does "inform a worry" really mean, and how can it be used? I'm a native speaker with a degree in a writing-heavy subject and I'm positive I've never used that phrase before.
You cut out important details.
'The Military Has Seen the Writing on the Wall'
The United States is preparing for a war with North Korea that it hopes never to have to fight
"When Senator Tammy Duckworth returned from a recent trip to South Korea and Japan, she brought back a sobering message: “Americans simply are not in touch with just how close we are to war on the Korean peninsula.” In a speech at Georgetown University, she laid out the U.S. military maneuvers over the past several months—including a nuclear-powered submarine heading to South Korea, the movement of three aircraft carriers to the Western Pacific, and the Army testing out “mobilization centers” for deploying troops and training soldiers to fight in tunnels like those beneath North Korea—that inform this worry. In an interview with me, she said the U.S. military seems to be operating with the attitude that a conflict “‘will probably happen, and we better be ready to go.’”
https://www.theatlantic.com/internation ... ea/551381/
Your text: When Senator Tammy Duckworth returned from a recen ...
Flesch Reading Ease score: 44.3 (text scale)
Flesch Reading Ease scored your text: difficult to read.
Gunning Fog: 15 (text scale)
Gunning Fog scored your text: hard to read.
Flesch-Kincaid Grade Level: 13.2
Grade level: College.
The Coleman-Liau Index: 11
Grade level: Eleventh Grade
The SMOG Index: 12
Grade level: Twelfth Grade
Automated Readability Index: 13.6
Grade level: 21-22 yrs. old (college level)
Linsear Write Formula : 16.6
Grade level: College Graduate and above.
http://www.readabilityformulas.com
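For anyone curious how two of the scores above are computed, here is a minimal sketch of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The helper `rough_counts` is my own assumption: it treats each group of vowels as a syllable, which is crude, and I don't know how the website counts words, sentences or syllables, so don't expect it to reproduce the exact figures listed.

```python
import re

def flesch_reading_ease(words, sentences, syllables):
    # Higher scores mean easier text; long sentences and long words lower it.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # Same two ratios, rescaled to approximate a US school grade.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def rough_counts(text):
    """Return (words, sentences, syllables) using crude heuristics (an assumption)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    tokens = re.findall(r"[A-Za-z']+", text)
    # Count vowel groups as syllables, with at least one per word.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", t.lower()))) for t in tokens)
    return len(tokens), sentences, syllables

words, sentences, syllables = rough_counts("The cat sat. It ran!")
print(flesch_reading_ease(words, sentences, syllables))
print(flesch_kincaid_grade(words, sentences, syllables))
```

Both formulas depend only on average sentence length and average syllables per word, which is why one very long sentence pushes the scores so far.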
2 x
- Axon
- Blue Belt
- Posts: 776
- Joined: Thu Jun 16, 2016 12:29 am
- Location: California
- Languages: Native English, in order of comfort: Mandarin, German, Indonesian, Spanish, French, Russian, Cantonese, Vietnamese, Polish.
- Language Log: viewtopic.php?f=15&t=5086
- x 3297
Re: 32000 words of intensive reading - an accidental experiment
Thanks, reineke! I didn't know there were so many different reading level calculators. I agree, you do need to know what "this worry" refers to in order to truly grasp the meaning of the sentence.
She's up past article 40 now, and one new thing I've noticed is that a few of the words she's asking me for help with are words that I would not really be able to use confidently in my own writing. Words like stolid, renal, putatively, bumptious, comport. Other educated native speakers, can you give definitions of these out of context off the top of your head? Have they recently or ever appeared in your writing?
Some might worry: If you don't have a tutor or native speaker around to tell you, then how can you be sure that the words you're learning from native sources are actually good words to use? I'd say more extensive reading will give you good intuition in the right direction. Also, if you vary the register of your readings, you'll notice which words appear in highbrow or more general writing.
Here (with her permission) is a sample of words that she's recently written down. This is a nice view into a self-learner's process of improving English literacy.
inhospitable, blunder, minted, akin to, ferocious, clout, well versed in, overthrow, a mob of, unleash, memorandum, hard-nosed, stemming from, croon, uninhibited, zealot, subordinate, atrocity, redoubled, undulating, slab, bohemian, hub, gripe, fret, cobblestone, sundering, allege, centrist, lag, pidgin, fizz
I searched the forum for several of these at random and found very few results. I think that goes to show the value of finding varied sources for reading practice. You could read this forum for weeks on end and only read "cobblestone" twice, but any novel set in London or any Minecraft video will expose you to that word dozens of times.
7 x
-
- Blue Belt
- Posts: 985
- Joined: Sat Jul 25, 2015 3:01 pm
- Location: Canada
- Languages: French (N), English (N), Spanish (C2 Cert.), German (B2 Cert)
- x 2373
Re: 32000 words of intensive reading - an accidental experiment
I think this last post illustrated once again two fundamental truths about vocabulary learning. Firstly, outside the fundamental set of function or grammar words and the basic vocabulary of everyday life (including work, of course), most of the words in a language are rarely heard, read or used by most people.
This is a broad statement that demands lots of explanation that I don't have time to get into now, but the point is that what we actually use is only a tiny portion of what is out there. I watch about an hour of television every day in English and I'd say that I hear a new word or saying every day. Just last night, I heard "If wishes were horses, beggars would ride". That was completely new to me. Who knows when I will hear it again.
The second point is that we will acquire new vocabulary as needed. That is exactly what reading or exposure in general does. This is of course how professional or occupational vocabulary is learned. A bus driver, a lawyer, an engineer, each will have their own subset of words that I do not know.
4 x
- tommus
- Blue Belt
- Posts: 957
- Joined: Sat Jul 04, 2015 3:59 pm
- Location: Kingston, ON, Canada
- Languages: English (N), French (B2), Dutch (B2)
- x 1937
Re: 32000 words of intensive reading - an accidental experiment
s_allard wrote:I watch about an hour of television every day in English and I'll say that I hear a new word or saying every day
I watch a half hour of Dutch news every day. Yesterday, I did an analysis of the last 1,000 days of the subtitles of that news. I removed all the proper nouns and the numbers, leaving all forms of the remaining words (so not word families but the much more numerous word variations). There were 55,000 of these in total. I then graphed the cumulative number of word forms per day. After a brief steep rise at the beginning, the curve became almost linear, growing by about 55 word forms per day, and it was still growing at that rate after 1,000 days (about 3 years). So, allowing that some (I don't yet know how many) are variations of words I have already seen in the last 3 years, each new half hour of daily news contains about 55 words (or forms of words) that I have not actually seen before. Now I am going to look more closely at just what these 55 per day are.
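A count like this can be sketched in a few lines. This is not tommus's actual script, just a minimal illustration: the function names are hypothetical, the input is assumed to be one subtitle text per day, numbers are dropped by the tokenizer, and proper nouns are assumed to have been removed beforehand.

```python
import re

def cumulative_new_forms(daily_texts):
    """Return the running total of distinct word forms after each day."""
    seen = set()
    totals = []
    for text in daily_texts:
        # Letters only (Unicode-aware), so digits and punctuation are ignored.
        seen.update(re.findall(r"[^\W\d_]+", text.lower()))
        totals.append(len(seen))
    return totals

def daily_increase(totals):
    """Number of previously unseen word forms on each day."""
    return [b - a for a, b in zip([0] + totals[:-1], totals)]

days = ["Het nieuws van vandaag", "Het weer van morgen", "Het nieuws"]
totals = cumulative_new_forms(days)
print(totals)
print(daily_increase(totals))
```

Plotting `totals` against the day number gives the growth curve described above; a steady slope means each day keeps bringing roughly the same number of unseen forms.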
6 x
Dutch: 01 September -> 31 December 2020
● Watch 1000 Dutch TV Series Videos
- reineke
- Black Belt - 3rd Dan
- Posts: 3570
- Joined: Wed Jan 06, 2016 7:34 pm
- Languages: Fox (C4)
- Language Log: https://forum.language-learners.org/vie ... =15&t=6979
- x 6554
Re: 32000 words of intensive reading - an accidental experiment
Most adult native test-takers range from 20,000–35,000 words
Average native test-takers of age 8 already know 10,000 words
Average native test-takers of age 4 already know 5,000 words
Adult native test-takers learn almost 1 new word a day until middle age
Adult test-taker vocabulary growth basically stops at middle age
The most common vocabulary size for foreign test-takers is 4,500 words
http://testyourvocab.com/blog/
In another vocabulary test study researchers found that "an average 20-year-old native speaker of American English knows 42,000 lemmas and 4,200 non-transparent multiword expressions, derived from 11,100 word families.." "The numbers range from 27,000 lemmas for the lowest 5% to 52,000 for the highest 5%. The knowledge of the words can be as shallow as knowing that the word exists. In addition, people learn tens of thousands of inflected forms and proper nouns (names), which account for the substantially high numbers of ‘words known’ mentioned in other publications."
http://journal.frontiersin.org/article/ ... 01116/full
http://vocabulary.ugent.be/wordtest/start
Russian
http://www.myvocab.info/articles/slovar ... razovaniya
However...
What Is Advanced-Level Vocabulary? The Case of Chunks and Clusters
http://www.tesol.org/docs/default-sourc ... .pdf?sfvrs
".. we move from the notion of advanced vocabulary as a set of words to the notion of advanced vocabulary as sets of words in combination. Once again, corpus analysis will be employed to help us search for patterns and frequencies. However, when we expand our search criteria to look for groupings of more than one word, things become more complicated, and there are clear lessons to be learned about how we describe the vocabulary of a language, as well as implications for what teachers teach in their vocabulary lessons and how learners approach the task of acquiring vocabulary and developing fluency. Throughout this paper we work from, but also hope to challenge, the understanding of many teachers, researchers, and learners that vocabulary means no more than all the single words of a language."
"Using a 4.7-million-word sample of North American English conversation from the Cambridge International Corpus (CIC), and applying corpus analytical software to obtain a frequency count for recurrent chunks, the following totals emerge for chunks occurring more than 20 times:
two-word chunks 19,509
three-word chunks 12,681
four-word chunks 2,953
five-word chunks 385
Chunks and Single Words
Only 14 items in a single-word frequency list occur more often than the most frequent chunk (i.e., you know, which occurs 45,873 times). Of the first 100 items in the overall frequency list, 11 are two-word chunks, including I think and I mean. By the time we reach 500 items, there are 177 two-word chunks and 7 three-word chunks, that is, 35% of the most frequent items are chunks, not single words."
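To make the chunk counts above concrete, here is a minimal sketch of counting recurrent word sequences in a tokenized text. It is not the corpus software the authors used; the function names are my own, and it naively lets chunks run across sentence and speaker boundaries, which a real corpus tool would probably not do.

```python
from collections import Counter

def chunk_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def frequent_chunks(tokens, n, min_count):
    """Keep chunks occurring more than min_count times (the paper used 20)."""
    return {chunk: k for chunk, k in chunk_counts(tokens, n).items() if k > min_count}

tokens = "you know i mean you know i think you know".split()
print(chunk_counts(tokens, 2).most_common(3))
print(frequent_chunks(tokens, 2, 1))
```

Even in this toy sample, "you know" comes out on top, which matches the paper's observation that it is the most frequent chunk.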
So, in order to achieve advanced-level vocabulary, you may need to know items like "up," "yours," and "up yours," which represents a considerable learning burden.
Also...
"experience changes the quality of lexical representations, and does so differently for different words and different individuals. Some aspects of this relationship are well-described, including the logarithmic relationship between word frequency of occurrence and behavioral correlates of word recognition: ten exposures to an infrequent word may have a similarly strong impact on the quality of that word’s mental representation as 100 exposures to a word that is well entrenched in one’s mental lexicon...
Importantly, it may not be simply the number of exposures to a word – larger for good readers, smaller for poor ones, due to their differences in reading experience – that would give rise to individual variability. It may be that poor readers are not able to use the exposures they do get to create the kind of high quality lexical representations that skilled readers have.. .
For example, readers who make fewer phonological discriminations due to poor phonological processing skills will not end up with the same quality of lexical representation after 100 exposures than someone without phonological problems would end up with, even if their level of reading experience is matched. The same holds true for readers with a limited learning capacity or a compromised long-term lexical memory, or any other behavioral or organic characteristic that impedes the entrenchment of mental lexical representation: in all these cases the readers would have to have a larger number of exposures to a word than readers without those characteristics to create a representation of the same quality. None of these scenarios can be accounted for by general-use corpora, however large and genre-balanced they are..."
https://www.ncbi.nlm.nih.gov/pmc/articl ... po=2.33161
5 x