The Spanish subtitles were generated using Whisper, and the English translations using the GPT-3.5-Turbo API (in "function calling" mode). The two subtitle tracks were combined using "substudy combine". The total API costs for this episode were under US$0.20.
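For the curious, the combine step boils down to pairing cues from the two tracks by time overlap. Here's a rough Python sketch of the idea, not substudy's actual implementation, and the toy parser only handles well-formed SRT:

```python
import re

def parse_srt(text):
    """Parse well-formed SRT into (start_ms, end_ms, text) tuples."""
    def to_ms(ts):
        h, m, rest = ts.split(":")
        s, ms = rest.split(",")
        return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        start, end = [to_ms(t.strip()) for t in lines[1].split("-->")]
        cues.append((start, end, "\n".join(lines[2:])))
    return cues

def combine(l2_cues, l1_cues):
    """Attach each L1 cue to the L2 cues it overlaps, bilingual-subs style."""
    out = []
    for start, end, text in l2_cues:
        overlaps = [t for s1, e1, t in l1_cues if min(end, e1) > max(start, s1)]
        out.append((start, end, text + ("\n" + " ".join(overlaps) if overlaps else "")))
    return out
```

In practice the real tool also has to cope with cues that span multiple lines on the other track, which is why I'm happy to let substudy do it.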
(I imagine performance will drop off sharply with smaller languages, and I've been focusing on good "beginner" shows for now. Your mileage may vary. Results may not be typical. Professional driver on a closed course.)
Carl wrote: This was just one anecdote. But if it's a general issue, maybe you could simply feed ChatGPT shorter chunks of subtitles at a time, rather than, say, a whole episode's worth?
I'm feeding it 10-15 lines at a time, using carefully constructed prompts and the "function calling" API. Seems moderately robust so far?
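Roughly, that means batching the subtitle lines and declaring a function the model must "call" with its answer, so the reply comes back as structured JSON instead of free text. A sketch of the shape of it (the function name, schema, and prompt wording here are my guesses, not necessarily what I shipped):

```python
import json

def chunk(lines, size=15):
    """Split subtitle lines into batches small enough for the model to track."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def build_request(batch, model="gpt-3.5-turbo"):
    """Build a Chat Completions payload that forces a function-call reply.

    Asking for one translation per input line keeps the line counts on
    the two subtitle tracks aligned, which is the whole battle.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Translate these Spanish subtitle lines to English."},
            {"role": "user", "content": json.dumps(batch, ensure_ascii=False)},
        ],
        "functions": [{
            "name": "report_translations",
            "description": "Report one English translation per input line.",
            "parameters": {
                "type": "object",
                "properties": {
                    "translations": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                },
                "required": ["translations"],
            },
        }],
        "function_call": {"name": "report_translations"},
    }
```

Each payload goes to the chat completions endpoint, and the `arguments` string on the returned function call gets parsed as JSON.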
sfuqua wrote: This is very exciting stuff. I wasted a few days trying to get subtitles for Children of the Sea (Kaijuu no Kodomo) before I gave up... I eventually got a subs2srs deck for the movie where the audio and the written stuff barely matched...
Yeah, it's tough. For audio cards to really work, you need:
- Native L2 audio, complete with inflection, emotions and a story. This should be considered non-negotiable for any language with a media industry. The inflection will be burned into your brain. The emotions and story make it interesting, because you're going to wind up seeing this stuff a lot.
- Mostly accurate L2 subtitles. If a subtitle doesn't match the audio, you'll almost always need to throw that card out.
- A basically decent translation for the L1 subtitles. This is the most forgiving part.
But ah, if you could only take an interesting native video, wave a magic wand, and get solid bilingual subs in a couple of minutes for US$0.20. Even if it were only for the biggest languages and only for intermediate audio, it would be a win. Let's see how far this will go.
(Binaries will be available soonish, as usual.)