Oulipo approach to language learning

General discussion about learning languages
Cainntear
Black Belt - 3rd Dan
Posts: 3468
Joined: Thu Jul 30, 2015 11:04 am
Location: Scotland
Languages: English(N)
Advanced: French, Spanish, Scottish Gaelic
Intermediate: Italian, Catalan, Corsican
Basic: Welsh
Dabbling: Polish, Russian etc
x 8660

Re: Oulipo approach to language learning

Postby Cainntear » Thu Jan 06, 2022 11:17 pm

McCarthy wrote:Chunks are chunks: analyzing them and taking them apart may not be useful, and they should be processed and retrieved holistically (see Wray, 2002).

I'm going to basically repeat what I said last time chunks came up: that is not what the concept of chunking in neuropsychology is about, and if linguists want to borrow terminology from psychology, they shouldn't change the concept; if they want to talk about a different concept, they should change the word.

Chunking is intrinsically multi-level. Recalling a chunk may manifest itself superficially with the instant recall of the entire phrase, but the hypothesis in (psychological) chunking is (as it was explained to me, at least) that all the schemata relating to the chunk are recalled simultaneously.

If I respond to "thank you" with "no bother, pal", I recall the full sentence as a chunk. But that chunk is itself composed of two chunks: "no bother" and "pal". One of those chunks is a word, and there's (arguably) nothing else to recall, but "no bother" is also a noun phrase consisting of a quantifier (no) and a noun (bother). My brain's activating and recalling all the "circuitry" for the two words, and the rule, and everything that revolves around "bother" being uncountable... all simultaneously.

Chunks built of chunks built of chunks -- that's the psychological model.

This notion, held by a lot of the teaching community, that chunks are strictly indivisible doesn't come from any cognitive theory that I'm aware of, and I've never heard any solid intellectual argument for it. The closest I've heard to an argument is that it's quicker than spontaneous production of original utterances, but that's not a counterargument to the "nested" chunking of psychology, because chunking was always an explanation of higher-order concepts and skills and of how such complex things are quick to recall.
reineke wrote:
Iversen wrote:
reineke wrote:Chunks are chunks: analyzing them and taking them apart may not be useful, and they should be processed and retrieved holistically (see Wray, 2002).


And that's one point where I beg to differ. I have just started to do wordlists with French expressions, and before that I read scores of pages of such expressions from a book filled with them page after page, and it is obvious that they are almost always meaningful - the only problem is that you sometimes don't have the background information to see the logic.

Of course you also have to learn expressions and chunks as multi-word compounds, but I know from experience that I remember them better when I know what their components mean. Learning such compounds 'holistically' (without caring about the meaning of their components) would be the same as burdening my memory with thousands of very long words. Learning them as meaningful word combinations makes the task much easier.


I don't disagree with your comment at all. Idioms, colloquial expressions and phrasal verbs count as lexical chunks and analyzing them and taking them apart may prove fun, useful and interesting. In some cases even rank beginners (hey, a chunk!)

Rank amateur, surely...?
will already understand the basic constituents of expressions such as "you know" but feel confused about the word combination or miss it completely. However, chunks do need to be processed and retrieved holistically to make sense.

Yes, but psychology says they cannot be processed and retrieved holistically and naturally if the constituent concepts are not themselves already acquired.

If you start learning English and are taught to say "no bother pal" without knowing that "no bother" is itself a chunk, then you won't have the internal organisation that "no bother pal" is "no bother" + "pal", and you won't know that "no bother" = "quantifier:no" + "noun:uncount:bother" + "rule: quantifier before noun".

Thus you will not be processing the chunk in the same way a fluent speaker would, instead processing it as a memorised sequence, i.e. you're not processing it as a chunk.
2 x

User avatar
reineke
Black Belt - 3rd Dan
Posts: 3570
Joined: Wed Jan 06, 2016 7:34 pm
Languages: Fox (C4)
Language Log: https://forum.language-learners.org/vie ... =15&t=6979
x 6554

Re: Oulipo approach to language learning

Postby reineke » Fri Jan 07, 2022 5:33 am

einzelne wrote:
reineke wrote:Now you mentioned getting around 500 words per your favorite type of novel. Polars are kind of thin (I think) so I would say that's a considerable number of unknown words already or just about right for a dedicated student.


See my answer to luke above. To provide more specificity: the book in question is Musso's Sauve-moi. It has 84k words. I highlighted 453 words, of which 75 were either weird technical words or useful words I'd definitely met before but which for some reason refuse to get into my long-term memory (oh, those sneaky bastards!). So, what do we have? 400 words for an 80k-word book. That's 99.5% comprehension! The audiobook is 9.5h, btw.

In my answer to luke I mentioned specific words I highlighted. You can check yourself how many meanings they have. As a reader of general fiction like Musso, I don't have your balloon problem. It's simply not an issue.

Now, imagine an Oulipo writer who has a specific task: build a narrative or a dialogue on the basis of these 400 words (Oulipo adores this kind of literary experiment). To make it more feasible, for each word we can allow 5 high-frequency words to be added (or repetitions of the same new words). So in the end you would have 400x5=2000 words. Make an audio version of it and you get (2000/150) a 13-minute audio story which allows you to review these 400 words time and again (in contrast to the original 9.5h audiobook).

You can construct your own version of an Oulipo text. For instance, you can have only 50 new words per story and add more repetitions to demonstrate different meanings. So you would have a 10-minute story which mentions all these words 5-8 times in different contexts (because in real novels/stories such words are hapaxes).

Wouldn't it be wonderful?

I have a strong feeling that you don't understand the original problem to which my Oulipo utopia was a response. It's about effective tools for expanding and reviewing passive vocabulary at high-intermediate, advanced stages.


I get it. I think we all do. We all want to concoct high-octane Mad Max fuel for our language learning and avoid flipping over, crashing and burning (all the fun stuff, basically). Your first post involved native novels and a hypothetical book with a vocabulary of 20k words. That's not a beginner book, but we can steal your dream and make it a graded effort, especially since such material is lacking. Your second example, 50 new words being recycled, sounds very much like a cartoon, only as you expand the vocabulary the number of new words falls, and there's generally room to repeat only a few low-frequency words 7-8 times. 400 new words plus 5 high-frequency words each for sentence creation amounts to two or three new practice words per sentence and 75% text coverage. The text could clobber Moby Dick in complexity and the learner would not feel very advanced. You could easily miss the meaning of a known common word in a sentence, or the entire sentence, and your comprehension problems would balloon.

My guess is that sentence length would have to increase and that you would need more than 5 connecting words. Context, redundancy and a higher percentage of known words create the environment for successful learning. You're basically reading at a sweet spot, and if you want to challenge yourself you can try older literature and some nonfiction. I suspect that many of the words that fascinate you belong to the familiar register, so you may also wish to explore TV fiction.
1 x

User avatar
einzelne
Blue Belt
Posts: 804
Joined: Sat Mar 17, 2018 11:33 pm
Languages: Russian (N), English (Working knowledge), French (Reading), German (Reading), Italian (Reading on Kindle)
x 2882

Re: Oulipo approach to language learning

Postby einzelne » Fri Jan 07, 2022 1:08 pm

reineke wrote:I get it.


Except that you don't. I would really appreciate it if, for a change, you would listen to me at least once before rushing to project your notions re language learning (clearly I'm not into Mad Max fuel or any other magic bullet). Had you done so, you would've seen that many of your counter-arguments missed the point. Because, clearly, my imaginary text cannot be used for extensive reading purposes...

reineke wrote:My guess is....


... And you didn't have to guess. My utopia is actually not that utopian. Take any chapter in the second half of Lingua Latina Per Se Illustrata. On average, it has 50 new words and its audio version is about 10-15 minutes. The problem is that the textbook only covers 1700 words, and that's not enough for reading. So, yes, you need something the size of Moby Dick to introduce you to another 20k words. Again, try to see my point: this imaginary Oulipo Moby Dick is not meant to serve as the main and only book to read but rather as a narrativized form of a dictionary (it's a common recommendation to review your words within a meaningful context and use sample sentences for review; I just suggest going one step further, because a string of sentences connected into a meaningful whole is even better). True, textbook texts can be bland, and that's why I brought Oulipo onto the scene. They love to work with artificial formal limitations and, in spite of that, have managed to produce interesting and engaging books (at least to my taste).
4 x

User avatar
reineke
Black Belt - 3rd Dan
Posts: 3570
Joined: Wed Jan 06, 2016 7:34 pm
Languages: Fox (C4)
Language Log: https://forum.language-learners.org/vie ... =15&t=6979
x 6554

Re: Oulipo approach to language learning

Postby reineke » Fri Jan 07, 2022 5:46 pm

einzelne wrote:It’s not a question but rather thinking out loud.
I've got to the point where reading an average contemporary fiction book in French gives me on average 500 new words/expressions. I would love to continue to expand my knowledge of everyday words/idiomatic expressions, but the problem is I rarely enjoy contemporary novels and, frankly speaking, I don't have that much time for reading...

All learner materials are centered around high-frequency words. Novels, on the other hand, become 'ineffective' tools of vocabulary expansion (you read 80k words and only get 500 new words).
So this leads me to my utopian vision of a novel for language learners. What I have in mind is some kind of Oulipo experiment for learning purposes: a novel (a long sequence of stories, sketches?) which would methodically go through the frequency dictionary and try to connect its words into a semblance of a coherent narrative. (Suppose every 15-minute chapter has to introduce 50 new words and mention them at least 5-7 times in different contexts.)

Imagine a book which doesn't have 7-8k unique words, as an average novel does, but at least 20k? That means that by relistening to a single audiobook, you would review the core vocabulary which, as studies show, has to be way bigger than 5k.
I'm aware that this is just a utopian vision but, damn, what a beautiful utopia it could be!


It sounds like a dystopian nightmare to me, but okay. Here's the thing - you're waffling and sending mixed messages.

In Lingua Latina...  "students first learn grammar and vocabulary intuitively through extended contextual reading and an innovative system of marginal notes. It is the only textbook currently available that gives students the opportunity to learn Latin without resorting to translation".

The first sentence in this textbook is "Roma in Italia est". The second one is "Italia in Europa est". This reminds me of my high school textbook. Sentences and text gradually get more complex and the book features a vocabulary of 1800 words. Is this an appropriate example if you're aiming to create an incredibly dense advanced book?

If you are introducing the same word several times in different contexts, as you mention in your opening post, you're necessarily creating opportunities for incidental learning, or at the very least you're likely to lead the reader in this direction.

Such material will necessarily expand like a balloon: if you squeeze it on one side it will expand on another, until it pops under the predetermined constraints.

Target language: French
Constraint: limit the use of the 5,000 most frequent French words to 5 per each new post-5,000 word used
Word pool: Le Dictionnaire de l'Académie Française, 1694 to today: 60,000 words
Le Grand Robert: 100,000 mots, 350,000 sens in 2013. Today it's 500,000 "mots". The magic of advertising.
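
For what it's worth, that constraint is at least mechanically checkable. A rough sketch, where the tokenizer, the toy rank table and the 5:1 ratio are my own stand-ins rather than anything einzelne actually specified:

Code: Select all

# Sanity check for the constraint above: at most 5 tokens from the top-5000
# band per "new" (post-5000) word used. Plug in a real French frequency list.
import re

def within_constraint(text, rank, cutoff=5000, max_ratio=5.0):
    tokens = re.findall(r"[a-zàâäçéèêëîïôöûùüÿœæ'-]+", text.lower())
    common = sum(1 for t in tokens if rank.get(t, cutoff + 1) <= cutoff)
    new = sum(1 for t in tokens if rank.get(t, cutoff + 1) > cutoff)
    return new > 0 and common / new <= max_ratio

# Toy ranks: "le", "chat", "est", "un" are high frequency, "matou" sits past 5000.
print(within_constraint("Le chat est un matou.",
                        {"le": 4, "chat": 812, "est": 7, "un": 5, "matou": 14250}))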

I don't know that you've made a case that Proust and older writers are somehow a less valuable source of useful language. What's useful and what's not? What's the main criterion for introducing a new word? If it's their ranking on the frequency list, you'll be reading gobs of literary French in any case.

Is this a learner tool or a torture device? You'll need some kind of constraint on the use of low-frequency words too.
I don't think you can exclude literary language. There's a ton of it in the example sentences. I would argue that since you don't like the stuff and it's regularly included in all the dictionaries, you'll need to learn it, and this could potentially be a very painful (painless?) way of doing it. Of course you could just read some Rabelais, Chateaubriand and Proust? Each can furnish you with a book featuring 20k unique words. You don't need to read all of it. Idiolect and narrative genre constraints tend to get revealed early, and you can hop between books.

Regardless, you want a neat, dense package peppered with modern words and expressions so that you can be more efficient at vocabulary learning. Okay. No one's going to write this monster, but you should keep dreaming. In the meantime I can think of anthologies of modern literature as a practical and readily available solution. There are anthologies of modern French literature, francophone literature, sci-fi, crime, etc.
1 x

User avatar
einzelne
Blue Belt
Posts: 804
Joined: Sat Mar 17, 2018 11:33 pm
Languages: Russian (N), English (Working knowledge), French (Reading), German (Reading), Italian (Reading on Kindle)
x 2882

Re: Oulipo approach to language learning

Postby einzelne » Fri Jan 07, 2022 10:43 pm

reineke wrote:Maybe you can share your strategies because learners keep discussing, arguing, humping frequency lists and groaning and moaning under vocabulary load and researchers keep researching.


Sorry, the website has been behaving weirdly recently, so I missed this post.

I don't have any kung-fu strategies for vocabulary learning. Unsurprisingly, it all boils down to repetition at more or less regular intervals. Everyone organizes these repetitions in their own way. Somebody, for instance, likes Anki; I, for one, can't stand it.

When I switch from Assimil or textbook audio, I usually listen intensively to a couple of audiobooks. I cut them myself into small 1-2 minute chunks and listen to them on repeat in my dead time (just like I listen to Assimil at the beginner stage). Your first book gives you several thousand new words, so at this stage the strategy makes total sense. Then, when it's only 500-1000 words... emm, not so much.

At a certain moment I even toyed with the idea of cutting out sentences with new words/expressions from my audiobooks for vocabulary review but, first, it's very time-consuming. Second, simply listening to a string of unconnected sentences produces a schizophrenic feeling in my head. Context is good, but a story, a narration, an argumentative essay (anything that connects these words into a semblance of a whole) is way better. At least for me. That's why I would love to have such a book built from low-frequency words. Also, listening to audio helps tremendously to retain words in your memory, so that's why in my utopian vision there's definitely an audio version of it.

Now, as for your research regarding the percentage of unknown words: from my own experience, 95% is a huge underestimation. I think even 98% is too optimistic. Since the beginning of COVID I started to read a lot of books in the Kindle app on my iPad (the libraries were closed and it was very hard to get foreign books from overseas). In my case, comfortable unassisted reading starts at 99% or, better yet, at 99.5%. Just before Musso, I read an experimental novel (Volodine's Le port intérieur); my comprehension level was 98% and it was brutal. I doubt I could have enjoyed it without a pop-up dictionary.
Last edited by einzelne on Sat Jan 08, 2022 3:34 am, edited 1 time in total.
2 x

User avatar
einzelne
Blue Belt
Posts: 804
Joined: Sat Mar 17, 2018 11:33 pm
Languages: Russian (N), English (Working knowledge), French (Reading), German (Reading), Italian (Reading on Kindle)
x 2882

Re: Oulipo approach to language learning

Postby einzelne » Fri Jan 07, 2022 11:18 pm

reineke wrote:The first sentence in this textbook is "Roma in Italia est". The second one is "Italia in Europa est". This reminds me of my high school textbook. Sentences and text gradually get more complex and the book features a vocabulary of 1800 words. Is this an appropriate example if you're aiming to create an incredibly dense advanced book?


Well, what do you think? If this book is targeted at advanced students, clearly you no longer have to think about grammar and you won't have such artificial sentences.

reineke wrote:What's useful and what's not? What's the main criterion for introducing a new word? If it's their ranking on the frequency list, you'll be reading gobs of literary French in any case.


Statistical analysis which takes that into account. And good frequency dictionaries address the problem of literary/newspaper language: you sort your corpus and assign ranks to words accordingly. Also, linguists have corpora of spoken language, radio, TV, you know?

reineke wrote:Is this a learner tool or a torture device?


Is an Assimil course a learner tool or a torture device? An average course has 2000-2500 unique words and about 1.5 hours of audio (if you cut the unnecessary pauses). A set of one week's dialogues is about 10-14 minutes and gives you at least 70 new words/expressions.

It's funny, because, in a way, all I'm suggesting is to expand the Assimil approach to lower-frequency words for the advanced stages.

reineke wrote:Regardless, you want a neat dense package peppered with modern words and expressions so that you can be more efficient at vocabulary learning.


Yes, this is pretty much it.

reineke wrote:No one's going to write this monster.


Thank you for the warning! For a moment I had a faint glimpse of hope...

reineke wrote:In the meantime I can think of anthologies of modern literature as a practical and readily available solution. There are anthologies of modern French literature, francophone literature, sci-fi, crime etc.


The other day, I happened to download several collections of stories written by contemporary and not-so-contemporary writers (Camus). Maybe I was lucky (or unlucky, in my case) to find easy writers, but so far the percentage of known words doesn't differ that much from full-fledged novels.
1 x

User avatar
reineke
Black Belt - 3rd Dan
Posts: 3570
Joined: Wed Jan 06, 2016 7:34 pm
Languages: Fox (C4)
Language Log: https://forum.language-learners.org/vie ... =15&t=6979
x 6554

Re: Oulipo approach to language learning

Postby reineke » Sat Jan 08, 2022 12:20 am

Your comment about advanced students and grammar opens a whole new can of worms. Here you are birthing a dream and I'm sort of pushing you down flights of stairs. I mean, it's an idea... the book, that is, not the DIY abortion. It raises some interesting questions about vocabulary learning. Most people don't have your problem. They struggle a lot more. Camus is barely cold in my eyes. You can try actual anthologies of modern literature, which should include author notes, discussion of the period, footnotes, poetry, essays, etc.
2 x

User avatar
reineke
Black Belt - 3rd Dan
Posts: 3570
Joined: Wed Jan 06, 2016 7:34 pm
Languages: Fox (C4)
Language Log: https://forum.language-learners.org/vie ... =15&t=6979
x 6554

Re: Oulipo approach to language learning

Postby reineke » Sat Jan 08, 2022 1:00 am

Regarding Camus -

"Camus has not become popular overnight. Though his intellectual star may have risen and fallen over the decades, novels such as The Stranger and The Plague, along with major essays such as "The Myth of Sisyphus" and "The Rebel," have always had a large readership. It’s not difficult to see why. Camus’s spare, unornamented style makes him one of the more accessible and translatable exponents of literary modernism, and has facilitated his translation into English. The Stranger, in particular, has been a point of entry into French literature for generations of high-schoolers, and as such is an object of nostalgia as well as a badge of cultural literacy."

https://www.thenationalbookreview.com/f ... orld-today
1 x

User avatar
einzelne
Blue Belt
Posts: 804
Joined: Sat Mar 17, 2018 11:33 pm
Languages: Russian (N), English (Working knowledge), French (Reading), German (Reading), Italian (Reading on Kindle)
x 2882

Re: Oulipo approach to language learning

Postby einzelne » Sat Jan 08, 2022 3:33 pm

reineke wrote:Your comment about advanced students and grammar opens a whole new can of worms. Here you are birthing a dream and I'm sort of pushing you down flights of stairs. I mean, it's an idea...


What kind of worms? Metaphorically speaking, I 'read' such books all the time at the beginner, intermediate, and early advanced stages of my language learning: your vocabulary is not vast, and each text is densely packed with new words and expressions. You read and review, and since the density of new words is so high, you generally review the whole text (the whole dialogue in Assimil, the whole chapter of the adapted book, the whole page of your first audiobook). It's only at the advanced stage that this strategy loses its effectiveness, because the number of new words thins out to the point where you get only one or two per page. So, yeah, Moby Dick is actually a great book from this perspective: 18k unique words. If only they were 18k from the 10k-30k range of the frequency list...

Your suggestion to read anthologies misses the point, just like your comment re Camus. When it comes to vocabulary acquisition, this is what you want to read: simple, straightforward prose, so you won't get distracted by stylistic quirks. In fact, a simple, descriptive style actually encourages a richer vocabulary, for you're forced to use concrete, simple terms to describe objective reality. I read La peste at an early stage of my French and it gave me 1500 new words/expressions (just like my first Levy/Musso novels), something that other higher-register writers (Duras, Blanchot, Klossowski, Robbe-Grillet) could not deliver. I had the same experience with English: in terms of vocabulary density, a book like Gone Girl has more new words per page than any Virginia Woolf novel.

I read enough high-register literature; in fact, this is my major focus in all my languages, but I still would like to expand my everyday vocabulary. So anthologies won't be of much help.

In principle, podcasts for language learners do the thing I'm suggesting all the time: they compose a short dialogue or a mini-story in which they introduce new words/expressions. The main problem is that they give these words and expressions au compte-gouttes (in dribs and drabs) and unsystematically. I know that no one will write a book like this. But you think that there are some deep methodological flaws in my utopian vision of such a book, whereas there are none. As I've been trying to show here, language learners 'read' and 'listen to' such a book all the time.
2 x

User avatar
luke
Brown Belt
Posts: 1243
Joined: Fri Aug 07, 2015 9:09 pm
Languages: English (N). Spanish (intermediate), Esperanto (B1), French (intermediate but rusting)
Language Log: https://forum.language-learners.org/vie ... 15&t=16948
x 3631

Re: Oulipo approach to language learning

Postby luke » Sat Jan 08, 2022 7:58 pm

einzelne wrote:I know that no one will write a book like this.

Get a computational linguist on your side and your "dreams" can all come true.

Not that it's trivial, but a Ph.D student who's interested in providing the solution you're proposing could certainly turn out a good first draft and create the groundwork for a broader application.

Rough draft of algorithm:
a) Parse Moby Dick into tokens.
b) Create a lookup table (synonyms, frequency ranges, etc.) for all tokens, their synonyms, antonyms, etc.
c) Divide the book into 40 roughly equal parts (500 new words per section * 40 sections) to get to 20K.
d) Substitute words over the current "frequency marker" with those in the current 500-word window as much as possible. Remember the words that didn't get inserted in the current "frequency window" or were unused in previous "frequency windows". Let the frequency of substitutions also be a factor guiding substitution. (Don't just use a word once and call it done. E.g., say we're substituting for "tiger": start with "cat", use "tiger", use "huge feline", substitute "tiger" again, etc.)
e) Continue until the book is complete (a rough code sketch of the windowing loop follows below).

Moby Dick has 135 chapters, so each 500-word "frequency window" covers about 3 chapters.

So, something like this:
Chapters 1-6: the first 1000 words in the frequency list are substituted for words above 1000 in the frequency list.
Chapters 7-9: the first 1500 words in the frequency list are substituted for words above 1500, and try to use all of the words 1001-1500. The program remembers those it wasn't able to shoehorn in.
Chapters 10-12: the first 2000 words in the frequency list are available for words above 2000. Try to use any word that hasn't been used yet in the 1-2000 list.
...
Chapters 117-120: all 20,000 words are available and should be used up or used enough.
Chapters 121-135: let Herman Melville determine word usage, but do substitutions for any remaining 20K frequency-list words that are still unused.

If you want the frequency list to be 10k-30k, that's still doable with a similar algorithm and a different "frequency list":
Chapters 1-6: the first 10,000 words in the frequency list are substituted for words above 10,000 in the frequency list.
Chapters 7-9: the first 10,500 words in the frequency list are substituted for words above 10,500, and try to use all of the words 10,001-10,500. The program remembers those it wasn't able to shoehorn in.
etc.
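
Here's a minimal Python sketch of that window bookkeeping, assuming a plain-text Moby Dick and a frequency-ranked word list as inputs; the synonym lookup is left as a stub you'd back with a real thesaurus and lemmatizer:

Code: Select all

# Sketch of the "frequency window" loop described above. The inputs
# (book text, ranked word list) and the synonym lookup are placeholders.
import re

def substitute_with_windows(text, freq_words, start=1000, step=500, max_rank=20000, sections=40):
    """Walk the book in `sections` roughly equal slices. In each slice, words ranked
    above the current cap are substitution candidates; the cap widens by `step` per
    slice. Returns the rewritten text and the window words never worked in."""
    tokens = re.findall(r"[A-Za-z']+|[^A-Za-z']+", text)   # words and the gaps between them
    n_words = sum(1 for t in tokens if t[0].isalpha())
    per_slice = max(1, n_words // sections)

    rank = {w: i + 1 for i, w in enumerate(freq_words)}    # word -> frequency rank
    cap, seen = start, 0
    unused = set(freq_words[:start])                       # window words not yet seen/placed
    out = []

    def synonym_in_window(word):
        # Stub: a real version would consult a thesaurus, prefer words in `unused`,
        # and vary the substitutions ("cat", "tiger", "huge feline", ...).
        return None

    for tok in tokens:
        if tok[0].isalpha():
            seen += 1
            if seen % per_slice == 0 and cap < max_rank:   # slice boundary: widen the window
                unused.update(freq_words[cap:cap + step])
                cap += step
            if rank.get(tok.lower(), max_rank + 1) > cap:  # too rare for the current window
                repl = synonym_in_window(tok.lower())
                if repl is not None:
                    unused.discard(repl)
                    tok = repl
            else:
                unused.discard(tok.lower())                # window word occurred naturally
        out.append(tok)
    return "".join(out), unused

The 10k-30k variant is the same call with start=10000 and max_rank=30000; a real pass would also need lemmatization so that "whales" counts as "whale", which is where the PhD student earns their keep.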
3 x
124 / 124 Cien años de soledad 20x
5479 / 5500 pages - Reading
51 / 55 FSI Basic Spanish 3x
309 / 506 Camino a Macondo

