elAmericanoTranquilo wrote:Very cool stuff, emk! It's really interesting to think about what might be possible now that we have access to all of these fantastic underlying tools. By the way, have you checked out knowclip?
Oooh, that is very shiny and I had not seen it. Thank you very much!
In general, I put down substudy back in 2017 or so, and I've paid almost no attention to what's happening with language learning tools since. I've been trying to fix that recently by reviewing a bunch of popular tools, but there are still neat things out there that I've never seen.
Knowclip is a free, open source project supported by Patreon, and it has very similar goals to substudy—though the author focuses more on selecting "easy" cards at an appropriate level. Which, honestly, a pretty easy feature idea to overlook if:
- You're an English and French speaker learning Spanish, and
- You have a high tolerance for partially comprehensible input.
elAmericanoTranquilo wrote:It ended up being more fun for me to step through the video in Language Reactor versus working through the cards. But a big part of that might be because I've never been able to get into the habit of studing with SRS cards
The exact kind of exercise that I'm doing (turning an entire episode's dialog into flash cards) is most useful at the beginner level. At this point, basically everything is worth studying to some extent or another. At higher levels, I think it makes sense to do a lot more extensive watching, and only make cards when I stumble over something interesting. Sheer volume of largely-comprehensible input is magic.
I do have plans to add a basic video-watching UI to substudy. That's actually the next thing on my list. Though I reserve the right to change the list or ignore it for another 5 years!
elAmericanoTranquilo wrote:Another thing I came across back when I was working on this is that there are various forks of whisper. In particular, I was using stable-ts in order to get more accurate timestamps. I don't know if whisper itself has since made improvements in this area.
Very interesting! I generally can't run serious models on a 4GB nVidia graphics card, and I assume that most users of substudy can't either. If I were willing to spend more money on hosting fees (besides what I've been spending all these years to keep the forum up), I suppose I could host a modified Whisper model as a private REST API. But then I'd need to charge for access, or start up a Patreon, or something. Still, I'm very glad to know that this is out there.
An interesting card, and a conversation with ChatGPT. I'm having breakfast late this morning, and this fascinating card just popped up for the first time:
Quizás lo que sea que tenga que hacer, ¿se me ocurrirá?
"Maybe whatever I have to do will come to me?"
Hmm, those are not familiar verb forms. And we're speaking about possibilities, not facts. It's time to do a little research:
emk wrote:Hello! I'm learning Spanish. Please respond to me as a skilled and professional Spanish tutor who works with experienced language learners.
I encountered the following phrase:
> Quizás lo que sea que tenga que hacer, ¿se me ocurrirá?
There's a bunch of stuff going on there, including some unfamiliar verb forms. Could you please break this sentence down piece by piece and explain what's happening here? Thank you!
ChatGPT 4 wrote:¡Por supuesto! Este tipo de frase es muy rica en estructuras gramaticales y expresiones idiomáticas del español, así que vamos a desglosarla para entender mejor cada uno de sus componentes.
"Quizás"
- Quizás: Es una palabra que introduce una oración para expresar una posibilidad o duda. Es similar al "perhaps" en inglés. Puede usarse indistintamente con "tal vez".
"lo que sea que tenga que hacer"
- lo que: Esta expresión se usa para referirse a una cosa o situación de manera no específica. Funciona como un pronombre relativo que introduce una cláusula relativa. Equivale al "what" en inglés en frases como "what I need".
- sea: Es el subjuntivo presente del verbo "ser". Aquí se usa para expresar una idea de incertidumbre o algo hipotético relacionado con "lo que". El uso del subjuntivo en este caso señala que no se está hablando de una realidad específica, sino de una posibilidad o suposición.
- que tenga que hacer: Otra cláusula con el subjuntivo, esta vez con el verbo "tener". "Tenga que" es una forma de expresar obligación o necesidad, y el uso del subjuntivo aquí también indica que se trata de algo hipotético o no confirmado. Es equivalente a decir "that I have to do" en inglés.
"¿se me ocurrirá?"
- se me ocurrirá: Esta es una construcción reflexiva que indica que algo ocurre a la persona que habla, a menudo usado para expresar que a alguien se le viene una idea a la mente. El verbo "ocurrir" está en futuro ("ocurrirá"), lo cual indica una pregunta sobre la posibilidad de que algo suceda o se le ocurra a la persona en el futuro. La partícula "se" indica que el verbo es reflexivo, y el "me" señala que el sujeto de la acción es el hablante. Es decir, "will it occur to me?" en inglés.
Entonces, al traducir la frase completa al inglés, sería algo como "Perhaps whatever I have to do will occur to me?" o más naturalmente, "Maybe I'll figure out whatever I need to do?"
Esta frase expresa una mezcla de esperanza y duda sobre la capacidad del hablante de llegar a entender o descubrir qué acción debe tomar en el futuro. Utiliza estructuras gramaticales avanzadas como el subjuntivo para transmitir incertidumbre y reflexividad para expresar que el proceso de pensamiento ocurrirá internamente al hablante.
Ah, yup, it's el subjuntivo!
And I'm pleased to note that I can actually read quite a lot of that grammatical explanation, thanks to English, French, my prior knowledge of grammar, and my knowledge of Spanish so far. I guess ChatGPT-playing-a-serious-Spanish-tutor thinks it's time I start my monolingual transition. But seriously, having that explanation in Spanish is perfectly fine, and I can follow the key points with no problem.
My whole entire argument in Cheating & Consolidating was that it doesn't matter why I understand something like the passage above. Prior knowledge, context, whatever. As long as I can understand what I'm reading, my brain can use that exposure to "consolidate" my knowledge of what things mean, and move it one tiny step close to automatic knowledge. The entire point of this exercise is to bootstrap my way to the point that I can learn Spanish just by reading books and watching TV. I've seen a few early intermediate learners get fixated almost exclusively on intensive study, when they'd probably benefit more from a mix of intensive and extensive study. Sheer volume is what turns slow, laborious knowledge into effortless automatic knowledge. Both parts are important.
LLM advice. Since people have been asking in other threads about using ChatGPT for language learning, here are a few tips:
- I'm talking to the expensive model, ChatGPT 4. It's $20/month, but it's smarter than the free version, and it's less likely to hallucinate. Substudy uses the cheapter GPT-3.5-Turbo to translate subtitles, but it contains a few tricks to coax out better results.
- My question is very concrete. I'm not asking general questions about Spanish grammar, I'm asking how a particular phrase works.
- I have told ChatGPT what character to play. This is a subtle but powerful trick, because ChatGPT has seen thousands of examples of language tutors, both good and bad. And ChatGPT is fundmentally an improv actor, trying to predict how a conversation would go. But the average conversation is, well, average. And we don't want average, we want good! So we tell ChatGPT which character to play, and it just gets "smarter." Simply because we asked it.
- I'm not asking for too much at once. I see lots of prompts where people ask for things like, "Here are 10 vocabulary words. Please construct example sentences using each of these words, where each word appears in each grammatical case, and not using any nouns or verbs except these." That kind of prompt is "overconstrained". It's asking for too much at once. And the model is doing a high-speed improv act, and it's trying to make up answers "off the top of its head." So you get mistakes and hallucination.
- Literally nothing I'm doing here requires 100% correct answers from the model. Like, maybe it explains the subjunctive to me incorrectly. So what? If it gets things 90% right 90% of the time, I'm way ahead of the game.
Repeating cards aloud, or "shadowing" them in real time. Yes, it's already obvious that this is an useful idea! Maybe not starting on day 1, before I've had a chance to "marinate" in the sounds. But now that I've been working on it for a while, "echoing" the dialog is almost certainly going to be very helpful. Also I need to go look up Spanish phonetics and learn the high points again.