Kids store 1.5 megabytes of information to master their native language
And the link at the bottom of that page to Humans store about 1.5 megabytes of information during language acquisition
It sounds so easy. Only 1.5 MB. That can't be so much that an adult couldn't do it quickly?
Native fluency: only 1.5 MB of information needed?
- tommus
- Blue Belt
- Posts: 957
- Joined: Sat Jul 04, 2015 3:59 pm
- Location: Kingston, ON, Canada
- Languages: English (N), French (B2), Dutch (B2)
- x 1937
Native fluency: only 1.5 MB of information needed?
2 x
Dutch: 01 September -> 31 December 2020
● Watch 1000 Dutch TV Series Videos | : |
- Wurstmann
- Yellow Belt
- Posts: 50
- Joined: Sun Jul 26, 2015 12:32 pm
- Location: Germany
- Languages: German (N), Mandarin (intermediate?), Spanish (beginner)
- x 37
Re: Native fluency: only 1.5 MB of information needed?
If it were only used for vocabulary, 1.5 MB would be around 349525 words. So still a lot of information xD
3 x
-
- Black Belt - 1st Dan
- Posts: 1988
- Joined: Mon Aug 27, 2018 11:26 am
- Languages: English (native), French & German (learning).
- Language Log: https://forum.language-learners.org/vie ... &start=200
- x 4079
Re: Native fluency: only 1.5 MB of information needed?
tommus wrote:Kids store 1.5 megabytes of information to master their native language
And the link at the bottom of that page to Humans store about 1.5 megabytes of information during language acquisition
It sounds so easy. Only 1.5 MB. That can't be so much that an adult couldn't do it quickly?
One question that comes out of this is: where do we store it all?
Some people have perfect recall, so the information must be there somewhere.
PS
(A lecture about how babies learn language suggested statistical analysis of everything they heard was the process.)
1 x
- tommus
- Blue Belt
- Posts: 957
- Joined: Sat Jul 04, 2015 3:59 pm
- Location: Kingston, ON, Canada
- Languages: English (N), French (B2), Dutch (B2)
- x 1937
Re: Native fluency: only 1.5 MB of information needed?
Wurstmann wrote:If it were only used for vocabulary, 1.5 MB would be around 349525 words. So still a lot of information xD
I would estimate 180,000 words. Still a lot.
The 3000 most common English words take about 25 KB to store. And 1500/25 = 60. Then 3000 * 60 = 180,000
3000 most common English words
0 x
- Wurstmann
- Yellow Belt
- Posts: 50
- Joined: Sun Jul 26, 2015 12:32 pm
- Location: Germany
- Languages: German (N), Mandarin (intermediate?), Spanish (beginner)
- x 37
Re: Native fluency: only 1.5 MB of information needed?
tommus wrote:Wurstmann wrote:If it were only used for vocabulary, 1.5 MB would be around 349525 words. So still a lot of information xD
I would estimate 180,000 words. Still a lot.
The 3000 most common English words take about 25 KB to store. And 1500/25 = 60. Then 3000 * 60 = 180,000
3000 most common English words
I was going by this. It says the average length of an English word is 4.5 letters.
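For what it's worth, both estimates can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: one byte per letter, no separators between words, and a binary megabyte (1,048,576 bytes) for the first estimate, since that is the reading that yields 349525. The variable names are made up for illustration.

```python
# Assumption: "1.5 MB" is read as 1.5 * 1024 * 1024 bytes, one byte per letter.
BYTES_PER_MB = 1024 * 1024
budget_bytes = 1.5 * BYTES_PER_MB  # 1,572,864 bytes

# Estimate 1: divide the budget by the 4.5-letter average word length.
words_by_avg_length = int(budget_bytes // 4.5)

# Estimate 2: scale up the 3000-word list that takes about 25 KB,
# treating 1.5 MB as 1500 KB, as in the post.
list_copies = 1500 / 25
words_by_word_list = int(3000 * list_copies)

print(words_by_avg_length)  # 349525
print(words_by_word_list)   # 180000
```

The gap between the two numbers comes almost entirely from the assumed bytes per word (4.5 versus roughly 8.4).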
0 x
- tommus
- Blue Belt
- Posts: 957
- Joined: Sat Jul 04, 2015 3:59 pm
- Location: Kingston, ON, Canada
- Languages: English (N), French (B2), Dutch (B2)
- x 1937
Re: Native fluency: only 1.5 MB of information needed?
Wurstmann wrote:I was going by this. It says the average length of an English word is 4.5 letters.
Interesting.
For the 3000 words (above), they take 25,222 bytes which means 8.4 bytes per word on average. But in regular text, a lot of small words show up very often, words like a, I, in, at, on, the, and, etc. So in regular text, the average word length would probably be smaller. So I took some plain text articles from today's BBC news, and put them together in a plain text file. In that sample, the average word length was 6.1 bytes. So perhaps the 4.5 letters per word would be for simple regular text, not word lists where all the common short words occur only once.
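A rough way to repeat this kind of measurement on any plain-text sample is sketched below. It is not the exact method used above: the function name is hypothetical, words are simply split on whitespace (so punctuation stays attached), and whether to count one separator byte per word is left as an option, since a word list usually stores a newline after each entry.

```python
def average_word_bytes(text: str, count_separator: bool = False) -> float:
    """Average UTF-8 bytes per whitespace-separated word.

    If count_separator is True, one extra byte per word is added to
    stand in for the space or newline that follows it (an assumption).
    """
    words = text.split()
    if not words:
        return 0.0
    total = sum(len(word.encode("utf-8")) for word in words)
    if count_separator:
        total += len(words)
    return total / len(words)


# The word-list figure quoted above: 25,222 bytes for 3000 words.
print(round(25222 / 3000, 1))  # 8.4

# Any plain-text sample can be measured the same way.
sample = "the cat sat on the mat"
print(average_word_bytes(sample))
print(average_word_bytes(sample, count_separator=True))
```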
0 x
- Deinonysus
- Brown Belt
- Posts: 1222
- Joined: Tue Sep 13, 2016 6:06 pm
- Location: MA, USA
- Languages:
• Native: English
• Advanced: French
• Intermediate: German, Spanish, Hebrew
• Beginner: Italian, Arabic
- x 4635
Re: Native fluency: only 1.5 MB of information needed?
Proof that English is easier: English has no diacritics so your brain can store the vocabulary as ASCII text, only one byte per letter. Other languages need Unicode encoding which takes up more neurons.
13 x
/daɪ.nə.ˈnaɪ.səs/
- tommus
- Blue Belt
- Posts: 957
- Joined: Sat Jul 04, 2015 3:59 pm
- Location: Kingston, ON, Canada
- Languages: English (N), French (B2), Dutch (B2)
- x 1937
Re: Native fluency: only 1.5 MB of information needed?
Deinonysus wrote:Proof that English is easier.
Indeed. In some other languages like Dutch and German, where long compound words are common, average word lengths must be considerably longer. For example:
Dutch
Hottentottensoldatententententoonstellingsbouwterrein
meaning: "construction ground for the Hottentot soldiers' tents exhibition"
German
Bundespräsidentenstichwahlwiederholungsverschiebung
meaning: "deferral of the second iteration of the federal presidential run-off election"
Afrikaans
Tweedehandsemotorverkoopsmannevakbondstakingsvergaderingsameroeperstoespraakskrywerspersverklaringuitreikingsmediakonferensieaankondiging
meaning: "issuable media conference's announcement at a press release regarding the convener's speech at a secondhand car dealership union's strike meeting"
And of course, these are very common words that often occur in normal conversation!
One would tend to think that these long words don't put an extra burden on learning the language because they are made up of a bunch of short, simple words. That works to some extent in reading and listening. However, in writing or speaking, it is not so easy to know or remember which words to put together and in which order. So this just increases the difficulty for active usage versus passive usage. I often wonder about native speakers using such long compound words. Do they even think about them being run together, or do they just say them as an English native would use a series of separate words? I think it depends on the mostoftenusedenglishcommonwordcombinations.
Check out the Wikipedia longest words by language.
0 x
-
- Black Belt - 3rd Dan
- Posts: 3527
- Joined: Thu Jul 30, 2015 11:04 am
- Location: Scotland
- Languages: English (N)
Advanced: French, Spanish, Scottish Gaelic
Intermediate: Italian, Catalan, Corsican
Basic: Welsh
Dabbling: Polish, Russian etc.
- x 8794
Re: Native fluency: only 1.5 MB of information needed?
Deinonysus wrote:Proof that English is easier: English has no diacritics so your brain can store the vocabulary as ASCII text, only one byte per letter. Other languages need Unicode encoding which takes up more neurons.
Only if you use boring old 7-bit ASCII. 8-bit ASCII with a Latin Extended charset did most of Europe well enough in the 90s.
Now excuse me... I’m off to download my languages onto 3.5” floppy disks...
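For anyone curious what the byte counts actually look like, here is a small Python sketch. The example word and the helper name are hypothetical; the encodings are standard Python codec names.

```python
word = "Präsident"  # hypothetical example word with one diacritic


def encoded_size(text: str, encoding: str):
    """Return the byte length in the given encoding, or None if it cannot be encoded."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


for enc in ("ascii", "latin-1", "utf-8"):
    print(enc, encoded_size(word, enc))
# ascii None      (7-bit ASCII has no "ä")
# latin-1 9       (one byte per letter)
# utf-8 10        ("ä" takes two bytes)
```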
6 x
- zenmonkey
- Black Belt - 2nd Dan
- Posts: 2528
- Joined: Sun Jul 26, 2015 7:21 pm
- Location: California, Germany and France
- Languages: Spanish, English, French trilingual - German (B2/C1) on/off study: Persian, Hebrew, Tibetan, Setswana.
Some knowledge of Italian, Portuguese, Ladino, Yiddish ...
Want to tackle Tzotzil, Nahuatl
- Language Log: viewtopic.php?f=15&t=859
- x 7032
Re: Native fluency: only 1.5 MB of information needed?
The brain is not a computer, nor does it use bits. It’s not surprising that the article referenced von Neumann’s 1958 “The Computer and the Brain” for its estimation.
All these discussions about how many bits a word takes ... we process language as sounds, not as text.
10 x
I am a leaf on the wind, watch how I soar