zac299 wrote:Emk: If you happen to read this...
Warning! Giant wall of text ahead! Run now!
zac299 wrote:I'm up to page 45, i think, of the book. It's been a real pleasure jumping into this one. It's really, really building up my automaticity of understanding the past tenses and grasping the meaning immediately. This is starting to translate into dividends in my listening.
This is a really good sign! You're reading an interesting book, you're getting in some real volume, and it's increasing the number of things you recognize automatically. That's exactly what you want to have happening.
To answer your other questions, let me grab a copy of my favorite diagram.
This is my personal mental framework. I break content into three categories:
- Opaque. This is just a wall of incomprehensible content, and staring at it doesn't yield much information.
- Decipherable. Here, you can theoretically understand the content, but you need to actively work at it.
- Automatic. You hear it (or read it) and you just understand it.
To get from "opaque" to "decipherable", you can use almost any trick you can imagine. This is why I labelled it "cheating", to encourage myself to think deviously and creatively! To get from "decipherable" to "automatic", your best tools are sheer volume and repetition. This is where running up the page count or the episode count really pays off. It's exactly what you're describing with your "automaticity of understanding the past tenses."
In practice, all of this is going on at once. Parts of a given page or episode will be opaque, others will be decipherable, and some will be transparent. The transparent bits will help make other parts decipherable. But you are also using outside knowledge, the action you're seeing on the screen, and even things you look up.
This explains what's going on with my bilingual book, and my sentence flash cards. I'm in the weird position of being able to decipher quite a lot in the context of the book. But when I rip a sentence out and put it on a flashcard, I'm losing too much context (relative to my actual language skills). And those sentences get a lot harder. So I'm going to rewrite my card-creating script to add more context.
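As a rough illustration of what "adding more context" to a sentence card could look like, here is a minimal Python sketch. It is not the actual card-creating script. The function name `card_with_context`, the field names, and the choice of two lines before and one line after are all assumptions made up for this example.

```python
def card_with_context(lines, index, before=2, after=1):
    """Build a hypothetical card for lines[index], keeping nearby lines as context.

    `lines` is a list of sentences in book order. The `before` and `after`
    defaults are illustrative guesses, not values taken from any real script.
    """
    start = max(0, index - before)               # do not run off the start of the text
    end = min(len(lines), index + after + 1)     # do not run off the end of the text
    return {
        "context_before": lines[start:index],
        "target": lines[index],
        "context_after": lines[index + 1:end],
    }


if __name__ == "__main__":
    book = ["Sentence one.", "Sentence two.", "Sentence three.", "Sentence four."]
    print(card_with_context(book, 2))
```

The idea is only that the target sentence stays the thing being tested, while the neighbouring sentences travel with it so the card keeps some of the context the book provided.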
And one final important thing to understand: Because consolidation requires so much sheer volume, your minute-to-minute approach can be pretty flexible. For me, it took about 20 real books and probably 200 hours of television before I looked at a C2 reading exam and I was like, "That's it? They just want me to read that little passage and answer some questions? So what do I do with the rest of the time?" But once you're talking about that kind of volume, it's OK to miss some stuff—you're going to get another chance later, when it's a bit more in reach.
So the classic approach is to mix "intensive" reading and watching with "extensive" versions. Sometimes you'll pick a page, or an episode, or a chapter, and you'll try to "zoom in". You can look up more-or-less everything interesting. Learn surprising new facts. But a lot of the time, you want to focus on the stuff you can decipher without too much work, and just keep going until it becomes second nature. It's OK to let the opaque stuff go a lot of the time. Because the more decipherable stuff you make automatic, the more the opaque stuff will become decipherable. And you can adjust the mix of intensive and extensive as needed.
But one of the key insights I've seen from looking at years of logs is that people who don't do any extensive reading or watching tend to get stuck around A2 or B1. They may get really good at deciphering text, but they don't build the automaticity. And to step onto theoretically controversial ground, there is a part of your brain which is meant to learn languages. It's trying to build a "model" of how everything fits together. But that automatic, subconscious part needs a certain minimum amount of content to do its work. And it needs some way to match up the content and the meaning: not 100% of the time, but at least enough to gain a beachhead and expand.
And when you think of it that way, "20 adult books to reach C1/C2 reading" is a low, low price. From the end where you're standing now, yeah, that seems like a lot. But as you go, the process snowballs as you consolidate more and more. I remember the first time I sat down and read an 80,000-word French book in a single day. Totally mind-blowing.
zac299 wrote:I'm happy to say I've started on the transcripts for the 3 pablo escobar episodes as well. It is taking a long time to wade my way through them and make translations for the parts I don't know. Especially because it's so colloquial. There's phrases and sentences I've gotta google around for just to get a proper translation. Heck, there's been 1 or 2 parts not even googling has given me any answers. But I've picked up some cool phrases from this already. Can't wait to finish the 3 transcripts and go back to these episodes to watch/study/read along all at once.
Again, this is just perfect. You've watched a ton of episodes of Pablo Escobar, so you're definitely getting in your "consolidation." But you've also picked out a couple of them and dug deeper, and you learned a bunch of interesting stuff. All the new stuff you've deciphered is now eligible for future consolidation.
zac299 wrote:I've also started writing out sentences and paragraphs using all the new words I've taken from my books thus far. I've done 2 full pages and estimate it'll take another 3 or 4 to catch up on all the new words. It's been a lot of fun. I think I'll post my silly writings here in my log on the off chance anyone wants to laugh along.
In saying that, when it's done, I'll send it to a friend for corrections and to get them to record themselves reading everything for me. I'll give this test a shot for a while and see if it helps make more of the vocab stick.
This kind of writing exercise can be great. Once I threw myself in the deep end and started speaking French with my wife at home, there was a period of 30 days where I consistently wrote 50–100 words, and got them corrected. In my case, I often focused on writing things I'd wanted to talk about, but hadn't been ready for.
zac299 wrote:I tend to do about 45 minutes with the Advanced podcasts from News in Slow Spanish. My native friends tell me they'd rate them as a 5 or 6/10 for native-speed. But it is nice to be understanding large swathes of these podcasts. Each run through I'm picking up on more details I didn't notice the previous time through. I'd say I'm getting around 75% of these podcast episodes nowadays.
This is a really great place to be. To oversimplify, native content comes in two main types: (1) clear, professionally enunciated audio, and (2) aggressively idiomatic audio. (1) appears in native content designed for a wide audience. After all, many native speakers are hard of hearing, or they're familiar with different regional accents, etc. (2) often appears in unscripted content, or in content that's trying to be edgy and idiomatic. The Spanish version of Avatar falls into category (1), as do some singers like Julieta Venegas. Her diction really is crystal clear. But Enrique Iglesias is moving towards category (2). And a film like Y Tu Mamá También is pretty unapologetically in category (2).
From where you're standing, category (1) is pretty broadly in reach. Read a few books, watch another 50 hours of a television series or two, and it should start coming together. One huge milestone that's not too far into your future is to be able to pick up a brand new TV show in category (1), and to be able to get up above 90% comprehension within the first season, just by watching.
But that category (2) audio? That's a long-term project. Even native speakers can struggle with category (2) audio sometimes, which is why I sometimes turn on subtitles for some English-language Netflix series, even if I'm the only person watching. There is absolutely nothing wrong with focusing on category (1) audio right now, because it's well within your reach. And once you're pretty solid on category (1) audio and you're comfortable reading, that's a fantastic point for tackling category (2).
zac299 wrote:As for Radio Ambulante, I'm trying to decide between 2 methods of listening. Should I:
A) try and understand every part that I can, naturally. Even if it means "pausing" in my mind to think about what was said, working it out, then realising I've then missed the next 5 or 10 seconds of audio because I wasn't concentrating
Or...
B) throw "understanding" out the window for a while and simply focus on hearing each syllable and word as they come, to practice listening at a native speed... In preparation for when I can understand most of the words being used.
Either of these can be beneficial, I think. Iversen talks about "listening like a bloodhound", by which I understand he means (B). But (A) can be fine, too, especially if you're going to listen multiple times anyways. This is probably like asking, "Should I train curls? Or bench presses?" They're different but related exercises.
zac299 wrote:Regarding your episode transcript idea, would you mind elaborating on how you used them for your own study?
I've been working my way through them to translate whatever I didn't understand into English. But should I be translating the whole thing? While there's significant portions I can understand while I read them... I'm sure when I start watching the corresponding episode at its normal pace, there'll be parts I "understand" but still miss because sometimes it moves too fast.
For Buffy in French, I read through the transcripts, looking up anything that seemed interesting. At that point, I'd say that the transcripts were at least 80% "decipherable", with big chunks of that already automatic, at least in written form. And I could get up to 90% "decipherable" with some dictionary work. Then I watched the episode while reading along with the transcript, but only because I didn't have accurate subtitles. I likely did some rewinding. If you have one of those "jump back 30 seconds" buttons, go wild! And then I watched the episodes through at least one time more without the transcripts.
But the specific details weren't critical. Any other version of the basic idea would have been useful. But after I'd gone through a couple of episodes like this, I just extensively watched about 5 seasons of Buffy. And they were long seasons. This got me up to about 95% listening comprehension without subtitles or rewinding. Probably this worked so well because I could already "decipher" 80–90% of the written episodes. And I could do that because I already had maybe 2 books under my belt, around 800 pages of reading. By just watching through 5 seasons, I "consolidated" that knowledge, converted reading to listening, and made it mostly "automatic." Of course, the first few times you do this, it's very "series specific": you learn a certain specialized vocabulary and become familiar with particular voices. But after a few different series, these skills broaden out.
Whereas for Spanish, I decided to take this whole idea to a ridiculous extreme. I made the audio cards you see in my log, breaking the episode down into very short segments with bilingual text. Then when I "learned" each card, I'd replay the audio over and over until I could close my eyes, hear the audio, and, at least in that moment, I could map the audio to the meaning.
The first hundred cards or so were hard, because I didn't actually know how Spanish verbs worked, or what the pronouns were. Though I did take out my handy 6-page laminated grammar sometimes! I cannot overstate just how vague those first 100 cards were, really.
And then, of course, I reviewed the cards. During each review, I'd listen to the audio over and over, until I could once again decipher just that one line of dialog that I'd "learned" a few days ago. Then I'd show the back of the card and listen a few times more. (I almost never "failed" cards. I'm actually abusing Anki here.) And as I reviewed, the gap between each review would increase about 2.5 times each time I saw it. (Less for harder cards, more for easier ones.)
And once the cards hit a month old, they underwent a strange and magic "sea change": I could just listen to most of them, and they were mostly automatic. At this point, I'd sit down and watch the episodes. Or I turned them into MP3 playlists using "substudy export tracks" and just listened to them a bunch of times.
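To make the "gap grows about 2.5 times per review" arithmetic concrete, here is a small Python sketch. It is a deliberate simplification and not Anki's real scheduler, which also handles learning steps, per-card ease adjustments and randomized fuzz. The function name `review_schedule` and the 1-day first gap are assumptions chosen for illustration.

```python
def review_schedule(first_gap_days=1.0, multiplier=2.5, reviews=5):
    """Return a list of (gap_days, card_age_days) for a card that is never failed.

    Assumes a fixed multiplier for every review. The 1-day first gap is a
    hypothetical starting point, not a value taken from the post.
    """
    schedule = []
    gap = first_gap_days
    age = 0.0
    for _ in range(reviews):
        age += gap                     # the card is this many days old at the review
        schedule.append((gap, age))
        gap *= multiplier              # the next gap is about 2.5 times longer
    return schedule


if __name__ == "__main__":
    for number, (gap, age) in enumerate(review_schedule(), start=1):
        print(f"review {number}: gap {gap} days, card age {age} days")
```

Under these assumptions the gaps run 1, 2.5, 6.25, 15.625 and 39.0625 days, so a card passes the one-month mark after only four or five reviews. That is roughly consistent with cards reaching the month-old stage after a handful of short listening sessions each.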
Within about 30–60 hours of work, I was in the ridiculous position of someone who couldn't reliably conjugate ser in the present tense, but who could close his eyes and actually understand 4 specific episodes of Avatar, at full native speed. And I could "follow" at least 30–40% of most other episodes. But these skills were entirely Avatar-specific. For anything else, my listening dropped off a cliff.
So as you can see, these are two very different versions of the same basic idea. For Buffy, I had OK reading knowledge (much of it in the "deciphering" range), a couple of printed transcripts made by a fan, and a DVD box set. And I used a small amount of intensive study to kickstart an extensive watching process. Which paid off amazingly. Whereas in the case of the first four episodes of Avatar, I had nothing. (Well, OK, I had English and B2+ French.) Accurate bilingual subs gave me a massive "cheating" boost. And then I used lots of repetition to "consolidate". I basically shredded each episode down to individual lines of dialog and scheduled their reviews algorithmically. And then once they had undergone that weird "sea change", I watched or listened to each episode several times more. When I've paid that much to understand something, and when I really enjoy the content, it's worth a few more passes to get some cheap & easy consolidation. (This is also why I learn songs, because I can listen to them hundreds of times.)
So we have two examples of the same process, but with the dials set very differently. And you want to know where you should set your dials. Well, the good news is that your brain is designed to do this, robustly, under less than ideal circumstances. Humans developed about three key tricks over the last few million years: endurance hunting, a really good throwing arm, and language learning. And I think it's the third that ultimately put us on top. Like, tool-making is great, but if you want to teach someone how to build a spear-thrower, then you're going to need to do a lot of talking until they get the details right. And if you study enough anthropology, then you'll quickly realize that band- and village-level societies can have incredible linguistic diversity in a small region, and people move around all the time, thanks to trade, warfare, kidnapping and sheer wanderlust. So from a strictly evolutionary perspective, adults need to be able to learn languages. And they need to be able to do it under "field" conditions, without textbooks or grammars or AI-assisted flash card systems.
And just how good is this "learning in the field" ability? Well, I have some friends who could be charitably described as "firearm enthusiasts". And they will not shut up about Kalashnikov's engineering, and how his designs could be manufactured using sketchy equipment, buried in the mud, washed out, mistreated, and still mostly work. And our human language-learning ability strikes me as much the same kind of thing: once our brain decides that we have no choice but to learn a language, it's going to happen. Whatever parts of our brain learn languages, they're not some kind of specialized equipment that only works under carefully-controlled conditions. They're not cheap to activate, so our brain will try to get out of it if it can. But once we convince our brain this is going to happen, the process is robust.
However! Once we understand how the basic system works, we can jump-start it. Which is where grammar books and flash cards and FSI courses and dictionaries and all that come in. All of this stuff allows us to speed up that initial "deciphering" work. But metaphorically, we're just trying to get that engine to turn over a few times. Once it's running, the "consolidating" process is largely self-sustaining, in much the same way that an alternator keeps the battery charged and the spark plugs firing. To abuse the metaphor even further, it's not necessarily a new car, so maybe you'll need to open the hood and tinker sometimes to keep it running optimally.
So I can't give you magical, precise instructions for your current situation. But happily, you don't need precise instructions. Instead:
- Have faith that once your brain has some initial comprehension to consolidate, some kind of starting point, then it can turn a few million words of a related language into decent comprehension skills. For an adult English speaker learning a Romance language, the total amount of content you need to successfully decipher and consolidate is ludicrously small (as big as it seems right now). The 5,000-page SC starting next month should absolutely do it.
- But you can help the process along via a mix of occasionally digging deeper, repeating things until they sink in, or using outside knowledge to artificially boost your comprehension. A modest amount of this can go surprisingly far.
In modern AI, there's a split between "supervised" and "unsupervised" learning. "Supervised" learning involves giving the AI carefully-labeled examples explaining how everything works. "Unsupervised" learning basically involves handing the AI the entire internet and saying, "I dunno, learn the patterns, it's all there somewhere." And one of the big lessons of AI has been that the most powerful systems do huge amounts of "unsupervised" learning. There's a famous essay related to this, titled The Bitter Lesson (emphasis added):
Rich Sutton wrote:In speech recognition, there was an early competition, sponsored by DARPA, in the 1970s. Entrants included a host of special methods that took advantage of human knowledge---knowledge of words, of phonemes, of the human vocal tract, etc. On the other side were newer methods that were more statistical in nature and did much more computation, based on hidden Markov models (HMMs). Again, the statistical methods won out over the human-knowledge-based methods. This led to a major change in all of natural language processing, gradually over decades, where statistics and computation came to dominate the field. The recent rise of deep learning in speech recognition is the most recent step in this consistent direction. Deep learning methods rely even less on human knowledge, and use even more computation, together with learning on huge training sets, to produce dramatically better speech recognition systems. As in the games, researchers always tried to make systems that worked the way the researchers thought their own minds worked---they tried to put that knowledge in their systems---but it proved ultimately counterproductive, and a colossal waste of researcher's time, when, through Moore's law, massive computation became available and a means was found to put it to good use...
The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity.
That part of your brain that learns languages under "field" conditions? It's at least 1,000 times more efficient than the statistical language-learning systems described above. We can take an entire nuclear power plant's worth of energy, and pour it into advanced nanoscale circuitry doing billions of operations per second, and it's still a really bad imitation of the parts of your brain that learn languages.
When you work through a transcript in detail, or when I "learn" an episode using Anki, we're basically hand-feeding that part of our brain some easy input. We're saying "this content X has meaning Y, please take note." And our brain gulps that down. We're doing "supervised learning", carefully matching language to its meaning. And this is really important for anyone who hasn't been suddenly dropped in a village speaking an unfamiliar language! And in fact, we can get all the way to A2 or even B1 with a heavy mix of "supervised" learning. But to get from A2 to B2 or C1, we need more and more volume. And the only way to efficiently get enough volume is to turn to "unsupervised" (or "self-supervised") learning. Of course, we can still keep profitably supplementing that volume with extra "supervised" learning. The AI equivalent of this supplementation is reinforcement learning from human feedback, where a large self-supervised model is rapidly fine-tuned by feeding it a comparatively small number of human-rated examples.
Or if you'd prefer a less technological and evolutionary metaphor, we could go with C.S. Lewis instead:
C.S. Lewis wrote:And when the garden is in its full glory the gardener’s contributions to that glory will still have been in a sense paltry compared with those of nature. Without life springing from the earth, without rain, light and heat descending from the sky, he could do nothing. When he has done all, he has merely encouraged here and discouraged there, powers and beauties that have a different source. But his share, though small, is indispensable and laborious.
So all this is a giant wall of words to say that you're on the right path, and it will pay off faster and faster.
The important thing to remember is that there's an underlying natural process here, and it is powerful. Even if we take a nuclear power plant, some large buildings full of nanoscale circuitry, a copy of the entire internet, and $100 million in cash, we can only produce an incomplete imitation. And we don't understand that imitation much better than we understand our own brains. But you've reached the point where you're reading books, and you're understanding 75% of news podcasts, and you're enjoying a television series. Often, you just understand what you can, and keep going. Sometimes, you dig deeper and upgrade some "opaque" bits to "decipherable", or you repeat a podcast to get some extra "consolidation." And you're adjusting on the fly.
The path in front of you will often look impossible and frustrating, if you've never walked this way before. But from where I'm standing, looking back at the path? And having seen lots of other people walk the same path? Well, your success looks inevitable.