How not to learn Spanish: Building too much stuff, not studying enough

kundalini
Orange Belt
Posts: 117
Joined: Sun Jan 24, 2021 8:17 pm
Languages: English (C), Greek (low intermediate)
x 365

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby kundalini » Mon Apr 01, 2024 2:00 am

emk wrote:I prepared a bilingual interlinear ebook, as discussed up thread. Bertalign is fantastic. And the interlinear format is better for me right now, but it requires a lot of discipline to use effectively.


Do you mean that it's too tempting to look at the translation? Or something else?

I haven't been disciplined in my reading of Le Comte de Monte Cristo at all, but I still feel as though I'm learning a lot. I've been reading along with an interlinear text while listening to the audio (L-R) -- sometimes reading along in French, sometimes in English; sometimes pausing to mull over new terms, and sometimes not; sometimes reviewing vocab, but just as often forgoing review if I don't feel like it. The experience has been very pleasurable, made so by a magnificent story that is well-narrated. I'd probably learn more if I undertook vocab acquisition more intentionally, but an intensive approach has its costs, too, as I think you pointed out earlier.
Iliad: 12 / 24
French Super Challenge Books: 0 / 5000 (0/5000 pages)
French Super Challenge Films: 0 / 9000 (0/9000 minutes)

emk
Black Belt - 1st Dan
Posts: 1722
Joined: Sat Jul 18, 2015 12:07 pm
Location: Vermont, USA
Languages: English (N), French (B2+)
Badly neglected "just for fun" languages: Middle Egyptian, Spanish.
Language Log: viewtopic.php?f=15&t=723
x 6806

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby emk » Tue Apr 02, 2024 3:23 am

kundalini wrote:Do you mean that it's too tempting to look at the translation? Or something else?

When I had just started reading, it was too easy for me to glance at the English. Happily that didn't last too long. I'm actually reading nearly all of the Spanish now, even if it's pretty slow.

But please don't take my personal struggles as any kind of guideline on what you should or should not do! How intensively you might want to engage with a certain text depends on your level, your goals, your individual skills, how much energy you have that day, etc. For me, at my level, I need to actually read the Spanish.

I'm hitting an average of two unknown words per sentence, and I'm using a mix of the bilingual text and the pop-up dictionary. I'm not trying to actually learn the vocab, because I only have the time to learn 10-20 cards per day, and I'm already learning 15 between Avatar and music! So that would allow me to read only another 5 sentences a day if I wanted to fully learn all the vocab. :lol:

So I'm keeping my book as an "extensive" activity: I look up some words, or glance at some bilingual text. If I'm feeling super inspired, I'll highlight a sentence. But above all, I try to keep going!

The pros. So far, this project has left me with some nice stuff:

  • I have 4.5 episodes of Avatar that I can watch and understand very large parts of, often including very precise details of what's being said. (And I can put the dialog on loop without the other audio.)
  • I will have a growing number of songs I can sing along to.
  • I am somehow successfully getting into a book series that's a few thousand pages long.

Given how few total hours I've put into Spanish, and the fact that most of them were over half a decade ago, this is not a bad start!

The cons. I still haven't done any serious output work. I have some ideas, but I've been too busy prepping for a potential half Super Challenge. Worse, I haven't looked at a Spanish conjugation table in 5 years. So I "know" a lot of endings in context, sort of, but I'm extremely vague on the details. I'll fix this at some point!

I am not, by any reasonable definition, anywhere near A1. All of my efforts have been poured into a few hyper-specific skills, all of them narrowly focused on a few pieces of content.

But the hell of it is, I'm already on the runway and taxiing for takeoff for a half Super Challenge. This is a ridiculous and unwise experiment. And I'm certainly benefitting from prior knowledge of English and French.

But I remember just how big a difference those first few full-sized books and those first 50-60 hours of television made in French. So I'm trying to take some ridiculous shortcuts to get there again in Spanish. (But again, please note the title of my log: This is not a log for reasonable or responsible methods.)

Reading progress since Saturday morning: 8% of a 370 page book!

elAmericanoTranquilo
White Belt
Posts: 49
Joined: Sun Sep 18, 2022 5:11 pm
Languages: English (N), Spanish (B1)
Language Log: https://forum.language-learners.org/vie ... 15&t=18495
x 153

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby elAmericanoTranquilo » Wed Apr 03, 2024 3:36 pm

Thanks to a post by Le Baron, I've been digging into the SpanishInput youtube channel by Miguel Lescano. The reason I mention it here is because there's some synergy with this discussion; he recommends studying by transcribing videos that have both native audio and perfect subtitles, and he recently started creating Anki flash cards that play audio snippets and then require you to type what you have heard.

Jump to the 13:50 mark in this video to see the kind of cards he's using - they are pretty nice in that they highlight character by character any discrepancies between your transcription and the reference one.
https://www.youtube.com/watch?v=inl79UK3vx0

So far I really like doing these transcription exercises and I want to incorporate something like this into my study plan. I'm just not sure yet whether I prefer transcribing on paper, or with flash cards that require me to type the transcription, or with flash cards that just show me the correct transcription and ask me if I got it right.
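
For anyone curious how the character-by-character highlighting might work outside Anki, here is a minimal sketch using Python's standard `difflib`. This is not how the cards in the video are built; the function name `diff_transcription` is made up for illustration, and it only reports the spans that differ.

```python
import difflib
from typing import List, Tuple


def diff_transcription(attempt: str, reference: str) -> List[Tuple[str, str, str]]:
    """Return (operation, typed_text, expected_text) for every span that differs.

    Hypothetical helper: it compares what was typed against the reference
    subtitle one character at a time and skips the parts that match.
    """
    matcher = difflib.SequenceMatcher(a=attempt, b=reference, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, attempt[i1:i2], reference[j1:j2]))
    return differences


if __name__ == "__main__":
    # A missing accent shows up as a one-character "replace".
    for op, typed, expected in diff_transcription("Como estas", "Cómo estás"):
        print(f"{op}: typed {typed!r}, expected {expected!r}")
```

A stricter or more lenient grader (ignoring case or punctuation, say) would only need to normalize both strings before comparing.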

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby emk » Thu Apr 04, 2024 1:21 am

elAmericanoTranquilo wrote:Thanks to a post by Le Baron, I've been digging into the SpanishInput youtube channel by Miguel Lescano. The reason I mention it here is because there's some synergy with this discussion; he recommends studying by transcribing videos that have both native audio and perfect subtitles, and he recently started creating Anki flash cards that play audio snippets and then require you to type what you have heard.

Ah, neat!

For me, the biggest risk with Anki is adding too many hard cards. If a card is hard, I'm much more likely to fail it outright. And then it comes back again. And pretty soon, all my hard and bad cards will be cycling through my deck a couple of times per week. So I tend to lean towards card formats that are a bit easier, and I try to avoid hitting Again/Fail once I've learned a card.

But I've discovered that marginal sentence or audio cards both tend to get easier over time, even if I don't ever fail them. Because of this, I tend to treat Anki as an "amplifier", and not as a set list of things I must learn. Cards are disposable; I can make thousands.

So my Anki decks are full of delightful cards, and I keep the stress level low.

But active recall is definitely useful. And transcribing audio is sort of the ultimate active recall. Still, I might be tempted to do transcriptions outside Anki, especially if I started failing lots of cards.

New song! OK, I have finished Eres para mí! That was a fun song, not too hard, lots of good bits. And then in an unwise moment, I chose Bailando. This is a catchy song!



But, ugh, it's a lot harder than it seemed. The song is fast, with lots of "reduced" sounds. And there are lots of fun little grammar bits going on.

Image Image

This is a good exercise for me, though! I'm ready to work on hearing lots of little grammar words very quickly.

Substudy song improvements (soon). Oh, and in order to get this working, I had to upgrade substudy's song support. I'll polish this code up a bit and release it this coming weekend. But basically, if you supply substudy with the exact lyrics, it will try to use that as the base of the transcription. This works especially well for complicated songs, and it should probably fix some of the problems rdearman saw with Korean.

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby elAmericanoTranquilo » Thu Apr 04, 2024 5:52 am

emk wrote:But I've discovered that marginal sentence or audio cards both tend to get easier over time, even if I don't ever fail them. Because of this, I tend to treat Anki as an "amplifier", and not as a set list of things I must learn. Cards are disposable; I can make thousands.

So my Anki decks are full of delightful cards, and I keep the stress level low.

But active recall is definitely useful. And transcribing audio is sort of the ultimate active recall. Still, I might be tempted to do transcriptions outside Anki, especially if I started failing lots of cards.
Very helpful, thanks!

An active wave!

Postby emk » Thu Apr 11, 2024 7:21 pm

I have finished learning Avatar 01.07 passively, so it's finally time for an "active wave"!

To do this, I manually edited my "Substudy Audio" template, and added support for optional "reverse" cards. I'll explain this in more detail later. But the idea is that there's an "AddActive" field on every card, and if I fill it with "y", Anki will generate an active version of a card. Here's what it looks like:

[Images: front and back of an example "active" card]

As you can see, the front side still contains the image. But the audio has been moved to the back side. And the front side contains the surrounding context in Spanish, plus the English text I need to translate. I toss the English context on the back, even though it's pretty much unused at this point.

Here are two more examples:

[Images: two more example "active" cards]

On the left, you can see a hint "Oigan" to help me guess what interjection is used here. This is a good way to salvage cards which would otherwise have ambiguous L1→L2 translations.

So far, these cards feel fairly easy, which is good. This is probably because I've already been ear-wormed by the "L2 comprehension" version of these cards. It also helps that I hand-pick which cards I "set as active". I pick shorter cards, with complete sentences, and focus on useful expressions, vocabulary, and grammar.

Also, I'm planning to grade these in a slightly forgiving fashion. If I mess up one verb form, I'll probably just pass the card. If I later find these cards too difficult, or if I want to focus on very specific verb forms, I might start doing cloze cards at some point. But for now, this seems easy and promising.
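
To make the optional "reverse" card idea above a bit more concrete, here is a small Python model of the logic. It is only a sketch: apart from "AddActive", every field name is hypothetical, and the layout of the comprehension card is a guess. In Anki itself this kind of switch is normally done with conditional replacement in the card template rather than in Python.

```python
from typing import Dict, List

Card = Dict[str, object]


def cards_for_note(note: Dict[str, str]) -> List[Card]:
    """Model which cards one note would generate.

    Field names other than "AddActive" are hypothetical stand-ins.
    """
    # Assumed layout of the ordinary comprehension card: image and audio
    # on the front, the text on the back.
    cards: List[Card] = [{
        "kind": "comprehension",
        "front": [note["Image"], note["Audio"]],
        "back": [note["Spanish"], note["English"]],
    }]

    # The active card only exists when AddActive is filled in (e.g. "y").
    if note.get("AddActive", "").strip():
        front = [note["Image"], note["ContextSpanish"], note["English"]]
        if note.get("Hint"):
            front.append(f"Hint: {note['Hint']}")
        cards.append({
            "kind": "active",
            "front": front,
            # The audio moves to the back, next to the expected Spanish.
            "back": [note["Audio"], note["Spanish"], note["ContextEnglish"]],
        })
    return cards
```

Leaving "AddActive" empty keeps a note as a single comprehension card, which matches the idea of hand-picking only some cards for active practice.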

Singing along with music. I find that I can sing along with significant parts of "Eres para Mí" when it comes up on my play list. This is a big reason why I study songs in the first place. They're convenient listening and sing-a-long practice!

substudy: Improved song support! This is for rdearman, in particular, but also for anyone else who's had mixed results with songs.

rdearman wrote:Here I'm actually giving it the entire song lyrics as the example text. (Didn't know if you wanted me to post to github or not). This song (like many Korean songs) has a mixture of both Korean and English.

Try the new "--expected-text" option, which works much better with lyrics:

Code:

substudy transcribe song.mp3 --expected-text=lyrics.txt > song.es.srt

Download / Documentation

Daily progress. Anki reviews currently take around 20 minutes per day, though I do sometimes have 30 minute days. I'm currently learning:

  • 5 new active Avatar cards/day. The limiting factor here is probably that I only have 250–300 comprehension cards per episode, and I probably don't want to make more than 20–30% active. But I can always go back and mine episodes 01.01, 01.02, 01.05 and 01.06, from my experiment back in the day.
  • 5 new music comprehension cards/day. These are less useful in many ways than the TV cards. But they do expose me to a wider range of accents, help me build up a playlist where I understand most of the lyrics, and give me something to sing along to. Plus they're fun.

Reading. I would need to read ~4.2 pages/day for 20 months to complete a half Super Challenge. My current book has ~370 pages in a monolingual edition, and I'm aiming to read at least 1% of the bilingual version per day, or the equivalent of 3.7 monolingual pages. So this is already pretty close to the speed I'll need for the Super Challenge. I'm already at 18% (or ~66 monolingual pages) in under 2 weeks. Which is, uh, probably OK for someone who probably can't conjugate the present tense of ser? And this is an adult fiction novel.
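
The reading numbers above can be checked with a few lines of Python. Two things here are assumptions rather than figures from the post: that a half Super Challenge means 2,500 pages (half of the 5,000-page book target), and that a month is counted as 30 days.

```python
# Assumption: a half Super Challenge is 2,500 pages of books.
HALF_CHALLENGE_PAGES = 2500
MONTHS = 20
DAYS_PER_MONTH = 30  # rough average, an assumption

BOOK_PAGES = 370            # monolingual page count from the post
DAILY_FRACTION = 0.01       # 1% of the book per day
PROGRESS_SO_FAR = 0.18      # 18% read

required_pace = HALF_CHALLENGE_PAGES / (MONTHS * DAYS_PER_MONTH)
current_pace = BOOK_PAGES * DAILY_FRACTION
pages_read = BOOK_PAGES * PROGRESS_SO_FAR

print(f"Required pace: {required_pace:.1f} pages/day")
print(f"Current pace:  {current_pace:.1f} pages/day")
print(f"Pages read:    about {int(pages_read)}")
```

Under those assumptions the 1%-per-day target lands a little under the required pace, which agrees with "pretty close".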

In terms of the actual experience of reading, sometimes I can get entire Spanish sentences without needing to glance at the English. But if I do need to glance at the English, I can almost always understand the Spanish sentence when I re-read it. This is basically Cheating and Consolidating. My guess is that by the time I reach 500–750 pages, I'll need the bilingual English a lot less.

Thoughts on this experiment. The goal of this whole experiment has been to skip straight from zero to the fun B1+ activities: watching TV, reading books, etc. I've sacrificed a lot of things (like active skills, breadth of vocabulary, etc) to get straight to the fun bits. But now I'm (slowly) reading a bilingual book, and there are now 5 episodes of Avatar that I can just watch. And with a slight effort to focus, I'll be able to understand 80% of the dialog in the episodes I've studied. This is fantastic for building automaticity!

Counting both my first and second attempts at Spanish, including my extensive watching of 6 seasons of Avatar and Korra over 5 years ago, I've probably put in 150–175 hours on actual Spanish. Which is about on par with an Assimil course, and under the 180–200 classroom hours expected for A2. But if you look at just the "study" parts, then that cuts out ~50 hours of extensive television watching (plus some other stuff), probably leaving ~100 hours of Anki and other real study. The most interesting part of this experiment will be seeing how long it takes to develop some active Spanish, and how long it takes to hit B1+ reading skills.

An AI experiment: Auto-translating & annotating text cards

Postby emk » Fri Apr 12, 2024 12:11 pm

One of the things I've noticed while reading is certain constructions are sticking quickly, but others are a bit harder to pick up. This was 100% expected.

  • Old school: I'd just plow through and hope to figure everything out eventually. This works surprisingly well, especially around B1.
  • "New" school (pre-2015): I built a web-based UI to process sentences and add definitions from Wiktionary, plus an Anki plugin to import them. This allowed me to pick up rarer phrases more quickly.
  • The school of "We're about 10 years from being the second-smartest species on this planet": Oh, what the heck, let's just use AI.

So I grabbed the "My Clippings.txt" file off my Kindle. (You can also find your clippings online, depending on what Kindle model you're using.) I manually reformatted these into a text file as follows:

Code:

Videntes, fantasmas, vampiros..., [[todo lo habido y por haber.]]
--
A pesar de los avances tecnológicos, las cosas no habían cambiado como todos esperaban y pensaban que lo harían.
--
Ni [[siquiera]] me gusta utilizar portaminas.

Note the "[[...]]". I used those to manually mark expressions which seemed especially interesting.

Then, I wrote a little script:

Code:

#!/usr/bin/env python
#
# Usage:
#   python make-text-cards.py <deck> <source-name> <input-text-file> <output-csv-file>

import csv
import json
from typing import Dict, List, Optional

from dotenv import load_dotenv
from markdown import markdown
from openai import OpenAI


# Load environment variables. Create a file named `.env` in the same directory as this file
# and add the following line to it:
#
# OPENAI_API_KEY="your-api-key"
load_dotenv()

def generate_cards(input_texts: List[str], *, source: Optional[str] = None) -> List[Dict[str, str]]:
    """Read in a file of text snippets and convert them into cards.

    Output fields should be:

    - Front: The original text
    - Back: The translation
    - Notes: Extra notes or context generated by the model for text marked with
      "[[...]]".
    - Source: The source of the text snippet.
    """

    # We need to build up a sample dialog between the "user" and the
    # "assistant", before asking our actual question. This "teaches" the model
    # how to respond, essentially by putting words into its mouth.
    system_message = """
You are a translator helping prepare Anki cards. You will be given short text in
Spanish, which will be put onto the front of cards. Your job is to translate the
short text to English. Following the translation, you should briefly break
down any phrases surrounded by [[ ]] and explain how they work. Do not include
any explanations if there are no [[ ]].
"""
    prompt_1 = "Tenía un alma de tigre."
    response_1 = {
        "translation": "He had a tiger's soul."
    }
    prompt_2 = "Ni [[siquiera]] hay una gramola."
    response_2 = {
        "translation": "There isn't even a jukebox.",
        "explanations": "- **siquiera:** The word \"siquiera\" in Spanish is used to add emphasis, typically in negative contexts, similar to the English word \"even.\" In this sentence, \"Ni siquiera\" translates directly to \"not even,\" emphasizing that there isn’t a jukebox at all.",
    }

    # Declare the function that the model should call.
    tools = [{
        "type": "function",
        "function": {
            "name": "add_data_to_card",
            "description": "Add the translation (and optionally explanations) to the current card.",
            "parameters": {
                "type": "object",
                "properties": {
                    "translation": { "type": "string" },
                    "explanations": { "type": "string" },
                },
                "required": ["translation"]
            }
        }
    }]

    # Generate the translations using GPT-3.5.
    client = OpenAI()

    result = []
    for input_text in input_texts:

        print(f"Input: {input_text}")
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": prompt_1},
                {"role": "function", "name": "add_data_to_card", "content": json.dumps(response_1)},
                {"role": "user", "content": prompt_2},
                {"role": "function", "name": "add_data_to_card", "content": json.dumps(response_2)},
                {"role": "user", "content": input_text},
            ],
            tools = tools,
            tool_choice = {"type": "function", "function": {"name": "add_data_to_card"}},
        )

        # Extract the tool call from the response.
        tool_calls = response.choices[0].message.tool_calls
        assert len(tool_calls) == 1
        args = json.loads(tool_calls[0].function.arguments)
        print(f"{json.dumps(args, indent=4)}")

        # Convert [[ and ]] to ** and **.
        front = input_text.replace("[[", "**").replace("]]", "**")

        # Convert the explanations to Markdown.
        if args.get("explanations"):
            explanations = markdown(args["explanations"])
        else:
            explanations = None

        result.append({
            "Front": markdown(front),
            "Back": markdown(args["translation"]),
            "Notes": explanations,
            "Source": source,
        })

    return result

def texts_to_csv(in_texts_path: str, out_csv_path: str, *, deck: str, source: Optional[str] = None) -> None:
    """Read in a file of text snippets separated by "\\n--\\n" and write the
    generated cards to a CSV file."""

    with open(in_texts_path, "r", encoding="utf-8") as f:
        input_texts = f.read().strip().split("\n--\n")

    cards = generate_cards(input_texts, source=source)

    # Write CSV correctly using a library. Note that Anki imports work much
    # better if we provide a header.
    with open(out_csv_path, "w", newline="", encoding="utf-8") as f:
        f.write(f"""#separator:Semicolon
#html:true
#notetype:Text Snippet
#deck:{deck}
#columns:""")
        writer = csv.DictWriter(f, fieldnames=["Front", "Back", "Notes", "Source"], delimiter=";")
        writer.writeheader()
        writer.writerows(cards)

# Command line entry point.
if __name__ == "__main__":
    import sys

    if len(sys.argv) != 5:
        print(f"Usage: {sys.argv[0]} <deck> <source-name> <input-text-file> <output-csv-file>")
        sys.exit(1)

    deck = sys.argv[1]
    source = sys.argv[2]
    in_texts_path = sys.argv[3]
    out_csv_path = sys.argv[4]

    texts_to_csv(in_texts_path, out_csv_path, deck=deck, source=source)

If you're a programmer, see the "Python experiments" README for notes on running this. There are some really neat tricks in here, including both "function calling" and "few-shot prompting". Oh, and it shows how to generate an "Anki header" for a CSV file we want to import, which improves the experience dramatically.
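
If you'd like to see what the "Anki header" looks like without spending any API credits, here is a stripped-down sketch of just the CSV step. It writes to an in-memory string instead of a file, and the sample card is made up; the header lines themselves are the same ones the script above emits.

```python
import csv
import io
from typing import Dict, List, Optional

FIELDS = ["Front", "Back", "Notes", "Source"]


def render_anki_csv(cards: List[Dict[str, Optional[str]]], *, deck: str) -> str:
    """Return the text of an Anki-importable CSV, header included."""
    buf = io.StringIO(newline="")
    # File header read by Anki's importer. "#columns:" is deliberately left
    # unterminated so the CSV header row completes that line.
    buf.write(
        "#separator:Semicolon\n"
        "#html:true\n"
        "#notetype:Text Snippet\n"
        f"#deck:{deck}\n"
        "#columns:"
    )
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter=";")
    writer.writeheader()
    writer.writerows(cards)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample card, purely for illustration.
    sample = [{
        "Front": "<p>Ni <strong>siquiera</strong> me gusta.</p>",
        "Back": "<p>I don't even like it.</p>",
        "Notes": None,
        "Source": "Sample book",
    }]
    print(render_anki_csv(sample, deck="Spanish"))
```

With the header in place, the import dialog should pick up the note type, deck, and column mapping on its own, instead of requiring them to be set by hand each time.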

We can take the generated CSV file, and import it into Anki:

[Images: the generated cards after import into Anki]

The translations are generated automatically using GPT-3.5-Turbo. And any phrases in "[[...]]" have been explained by the model! (Sometimes it gets confused and explains interesting things I didn't mark. I'd probably get better results if I switched from GPT-3.5-Turbo to Claude 3 Opus.)

Again, these cards are pretty easy to produce, and therefore pretty disposable (which is what we want). The hardest part right now is extracting sentences from "My Clippings.txt", which is a gross format. Other than that, I just need to highlight things on my Kindle when reading, then go back through and add "[[...]]" around interesting bits later.
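
The clippings extraction could probably be scripted too. Here's a rough sketch, and it leans on assumptions about the file layout: that entries are separated by a line of ten equals signs, that each entry is a title line, a metadata line containing the word "Highlight", a blank line, and then the highlighted text, and that the Kindle interface is set to English. The function names are made up. I'd check a few entries of a real file before trusting it.

```python
from typing import List, Optional

ENTRY_SEPARATOR = "=========="  # assumed separator between clippings


def parse_clippings(raw: str, *, title_contains: Optional[str] = None) -> List[str]:
    """Pull highlight texts out of the contents of a "My Clippings.txt" file.

    Hypothetical helper; bookmarks and empty entries are skipped.
    """
    highlights = []
    for entry in raw.split(ENTRY_SEPARATOR):
        lines = [line.strip() for line in entry.strip().splitlines()]
        if len(lines) < 3:
            continue
        title = lines[0].lstrip("\ufeff")  # the file may start with a BOM
        metadata = lines[1]
        text = " ".join(line for line in lines[2:] if line)
        if "Highlight" not in metadata or not text:
            continue
        if title_contains and title_contains.lower() not in title.lower():
            continue
        highlights.append(text)
    return highlights


def to_snippet_file(highlights: List[str]) -> str:
    """Join highlights with the "--" separator lines the card script expects."""
    return "\n--\n".join(highlights) + "\n"
```

The "[[...]]" markers would still need to be added by hand afterwards, since only the reader knows which expressions were the interesting ones.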

Again, grading is very gentle. If I can more-or-less understand the highlighted bit in context, that's good enough for me.

More attempts to estimate time spent. Also, my total study hours in my last post are extremely approximate. During the first part of my Spanish experiment, many years ago, I wrote down that I put in 60 hours of total study. In retrospect, I'm guessing that this was probably 45 hours of Anki (probably 20 new audio cards a day for two months?) and 15 hours of other study, including using my laminated Spanish poster and looking various things up? Plus there were 6 seasons of Avatar and Korra, which might have been very roughly 45 hours of extensive TV watching.

Since restarting, I've put in 28 days of Anki reviews, usually around 17–23 minutes a day, and I've been reading my book for just under two weeks. Book reading rarely exceeds 30 minutes, because trying to understand Spanish at my level basically puts my brain into emergency power-save mode. But I've also gotten in plenty of extra listening time while working on substudy, so it's hard to calculate. But there's at least another 20–25 hours of real study in there.

5 months of Assimil at 45 minutes/day would be about 110 hours, depending on how seriously I took the active wave after lesson 60 (which is arguably not worth it). I probably have stronger comprehension (including much stronger listening comprehension) than I would after Assimil, but I've sacrificed elsewhere. Still, I can handle a surprising number of Spanish verb forms in clear native audio, sometimes without even thinking. Stuff like "las cosas no habían cambiado como todos esperaban y pensaban que lo harían" is pretty much automatic.
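
For the curious, the rough arithmetic behind these hour estimates looks like this. The 30-day month is an assumption, and the Anki range just multiplies the 28 days of reviews by the 17–23 minutes mentioned above.

```python
DAYS_PER_MONTH = 30  # assumption: a rough average month

# Assimil: 5 months at 45 minutes per day.
assimil_hours = 5 * DAYS_PER_MONTH * 45 / 60

# Anki since restarting: 28 days at 17 to 23 minutes per day.
anki_hours_low = 28 * 17 / 60
anki_hours_high = 28 * 23 / 60

print(f"Assimil: about {assimil_hours:.1f} hours")
print(f"Anki:    about {anki_hours_low:.1f} to {anki_hours_high:.1f} hours")
```

So the Anki reviews account for roughly 8–11 of the recent hours; the rest is reading and the extra listening, which is much harder to pin down.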

Study numbers. Here is my current daily workload:

  • 5 new Avatar "active" cards (audio)
  • 5 new music comprehension cards (audio)
  • 5 new text comprehension cards
  • 1% of my book

Reading is at 20%, or the equivalent of 74 monolingual pages, after 13 days. This feels like a well-balanced plan for the next week or so, since it includes both active production and a solid investment in reading. After a week, I should have some idea how the active audio cards are working.

Re: An AI experiment: Auto-translating & annotating text cards

Postby emk » Sat Apr 13, 2024 1:37 pm

emk wrote:Study numbers. Here is my current daily workload:

  • 5 new Avatar "active" cards (audio)
  • 5 new music comprehension cards (audio)
  • 5 new text comprehension cards
  • 1% of my book

Ouch, 30 minutes of Anki reviews this morning. :shock: A couple of things are making this difficult:

  • Even though I review each category of cards separately, having this many types of cards means I keep switching "modes".
  • I've gone through a lot of different songs recently, which gives me a very diverse mix of young cards.
  • The new "active Avatar" format is a bit challenging!
Here are some more examples of the "active" cards:

[Images: more example "active" cards]

A key point about the cards is that I know this dialog. It's practically earwormed. So I'm trying to map the English phrase back to the audio I remember. These cards definitely take more effort to "learn", especially since I don't really know my tenses or conjugation tables. I'm essentially brute-forcing them. Also, this is literally the first time I've tried to use any active Spanish. So I'll keep doing these for another week, and then decide if I need to adjust the difficulty any. Also, it's probably about time I found my laminated Spanish poster and studied (gasp) some very basic grammar again.

But the format of trying to recall entire short phrases with which I've been earwormed is definitely interesting! It feels like it's doing curious things to my brain. So in the spirit of this experiment, let's keep trying it! (That's the liberating thing about this log—I'm free to pursue all sorts of weird ideas to see what happens.)

More examples from the brain blender. Just to give you an idea of the diversity of these cards, here are some more:

[Images: a song card (left) and a text sentence card (right)]

On the left, we have another song card. This song's lyrics tend to be terser and less predictable, and I am discovering yet another way to pronounce "ll". Spanish is absolutely a "pluricentric" language, and my song cards make this painfully obvious.

On the right, notice the hint "Murphy is stressed." I add these manually as needed, often the first time I review the card. This is just a way to adjust the difficulty downwards, because I like easy cards. I think these sentence cards would benefit from including the preceding and following sentences, just like I do for the video cards. But to do that while keeping the workflow clean, I'd probably need to build an actual bilingual reading UI.

The translations here are machine-generated, not from an English version of the book. There are some tradeoffs to each type of translation:

  • Professional translations: These usually understand the meaning and the context. (Although not always—I suspect I've found several translation errors so far.) But the translation is often fairly loose, which makes things harder when I'm reaching this far beyond my knowledge base.
  • AI translations: These tend to be more literal, which helps a lot. But if I'm translating one sentence at a time, the AI sometimes lacks the context to translate things correctly. This is not helped by the fact that the "person" of Spanish verbs and pronouns often relies on context.
If I wanted to build even more tools, I could definitely get better results out of the AI translations.
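
One possible first step, sketched in Python: pair each sentence with its neighbors, so that the preceding and following sentences could go on the card or be passed to the model as context. This is not something the posted scripts do, and the function name and the shape of the output are made up for illustration.

```python
from typing import Dict, List


def with_context(sentences: List[str], window: int = 1) -> List[Dict[str, object]]:
    """Attach up to `window` neighboring sentences on each side of a sentence.

    Hypothetical helper, not part of the scripts posted in this thread.
    """
    result = []
    for i, sentence in enumerate(sentences):
        result.append({
            "before": sentences[max(0, i - window):i],
            "sentence": sentence,
            "after": sentences[i + 1:i + 1 + window],
        })
    return result


if __name__ == "__main__":
    book = [
        "Videntes, fantasmas, vampiros..., todo lo habido y por haber.",
        "Ni siquiera me gusta utilizar portaminas.",
        "Tenía un alma de tigre.",
    ]
    for item in with_context(book):
        print(item["before"], "|", item["sentence"], "|", item["after"])
```

The catch is that this needs the sentences in book order, which highlights alone don't provide. That is a big part of why a real bilingual reading UI would help.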

Experimenting with Claude 3 Haiku. Up until now, I've mostly been using GPT-3.5-Turbo, the model behind the free version of ChatGPT 3.5. When used via the API in "function calling" mode, it's actually quite reliable and easy to use, and it gets great results with minimal effort. However, Anthropic's Claude 3 Haiku model is both cheaper, and—given clear enough examples and instructions—probably a bit "smarter."

And Anthropic recently announced support for "function calling." I was eventually able to get very good results out of this, but it took a lot more work than GPT-3.5-Turbo. Compare the prompts used by the GPT-3.5-Turbo card maker with the prompts used by the Claude 3 Haiku version. Claude 3 Haiku is more likely to ignore instructions, or to produce output in the wrong format.

So if you just want to get something working, go with GPT-3.5-Turbo for now. If you need to process a lot of data, then it might be worth porting to Claude 3 Haiku. And if all else fails, and you don't care about budget, you can always talk to one of the big models like GPT-4-Turbo or Claude 3 Opus.

But if you're already writing scripts to generate Anki cards, definitely feel free to take apart the scripts I've been posting and modify them to try out your ideas. There's a lot of untapped potential here.

Re: How not to learn Spanish: Building too much stuff, not studying enough

Postby emk » Sat Apr 13, 2024 9:01 pm

In another thread, Iversen joked:

Iversen wrote:How to Learn a Language without acknowledging that you also are studying.

I believe in truth-in-advertising. I do actually study! Sometimes.

And now that I can watch some Avatar, understand some songs, and painfully slog my way through a real book, it's time for me to study again. I've found my trusty 6-page laminated grammar reference, and I'm going to take some notes. :lol:

Distances. Spanish has a three-way division, basically "here", "there", and "over there." For adverbs, we have aquí "here", ahí "there" and allí "over there." Similarly, we have este "this", ese "that" and aquel "that ... over there".

Verbs. We have three regular conjugations, which seem to divide into two groups: the -ar verbs, and the -er/-ir verbs. Plus, you know, a bunch of irregulars. I probably need to know ser, estar, haber and maybe 1–3 others at this point (dar, ir, poner? what about common auxiliaries?). Other irregulars I can mostly pick up from context for now.

A lot of this is pretty familiar, but I'm not 100% on any of it.

Conjugations:

  • Present: -o/-as/-a/-amos/-áis/-an. For -er, replace "a" with "e". For -ir, like -er, except it's -imos and -ís.
  • Preterite: -é/-aste/-ó/-amos/-asteis/-aron. For -er and -ir, use -i for the initial vowel, giving -í/-iste/-ió/-imos/-isteis/-ieron
  • Imperfect: -aba forms. For -er/-ir, -ía/-ías/-ía/-íamos/-íais/-ían (this I hadn't quite puzzled out!).
  • Future: Infinitive plus -é/-ás/-á/-emos/-éis/-án.
  • Conditional: Like future, but use the imperfect -er/-ir endings.
  • Present subjunctive: Use the yo present indicative form, minus -o. Take the present indicative endings, and swap the -ar and -er/-ir groups. Finally, use 3sing for 1sing.
  • Past subjunctive: Use the -ron form minus -on, plus the present subjunctive -er/-ir endings.
  • Perfect & progressive: Mostly like English, with -do and -ndo forms, respectively.
That's actually pretty reasonable.
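To check that these rules really are mechanical, here is a minimal Python sketch that applies them to regular verbs. It's only an illustration: the `conjugate` function and the table names are made up for this example, it assumes the future and conditional attach their endings to the full infinitive, and it ignores irregular verbs, stem changes and spelling changes (buscar → busque, etc.).

```python
# Regular endings, in the order yo / tú / él / nosotros / vosotros / ellos.
PRESENT = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}
PRETERITE = {
    "ar": ["é", "aste", "ó", "amos", "asteis", "aron"],
    "er": ["í", "iste", "ió", "imos", "isteis", "ieron"],
}
IMPERFECT = {
    "ar": ["aba", "abas", "aba", "ábamos", "abais", "aban"],
    "er": ["ía", "ías", "ía", "íamos", "íais", "ían"],
}
FUTURE = ["é", "ás", "á", "emos", "éis", "án"]


def conjugate(infinitive: str, tense: str) -> list[str]:
    """Conjugate a REGULAR verb; irregulars and spelling changes are not handled."""
    stem, group = infinitive[:-2], infinitive[-2:]
    if group not in PRESENT:
        raise ValueError(f"not an -ar/-er/-ir infinitive: {infinitive}")
    # -er and -ir verbs share their preterite and imperfect endings.
    shared = "ar" if group == "ar" else "er"
    if tense == "present":
        return [stem + e for e in PRESENT[group]]
    if tense == "preterite":
        return [stem + e for e in PRETERITE[shared]]
    if tense == "imperfect":
        return [stem + e for e in IMPERFECT[shared]]
    if tense == "future":
        return [infinitive + e for e in FUTURE]
    if tense == "conditional":
        # Like the future, but with the -er/-ir imperfect endings.
        return [infinitive + e for e in IMPERFECT["er"]]
    if tense == "present_subjunctive":
        # Swap the -ar and -er/-ir ending groups, then reuse 3sing for 1sing.
        swapped = PRESENT["er"] if group == "ar" else PRESENT["ar"]
        return [stem + e for e in [swapped[2]] + swapped[1:]]
    raise ValueError(f"unknown tense: {tense}")


if __name__ == "__main__":
    for verb in ("hablar", "deber", "vivir"):
        print(verb, conjugate(verb, "present_subjunctive"))
```

The past subjunctive is left out on purpose, since it hangs off the preterite -ron form and would need the accent on the nosotros form handled separately.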

Observed oddities: Passive reflexives, a bit like French? And some odd auxiliary verbs that take a progressive form, including ir, venir, seguir, andar. Plus there's ir a with an infinitive, similar to "going to". And actually a bunch of others. Spanish is into modals!

Traps. Por/para, ser/estar. Might need to do some focused cloze work on these pairs at some point.

That's probably enough for now. If I'm feeling inspired, I could probably ask GPT-3.5-Turbo to generate me some short Assimil-style dialogs illustrating this information, and then turn those dialogs into cloze cards. But if I can internalize just this, it should last me most of the way to B1, thanks to my copious input. After that, I can break out the actual book, with a whole 65 tiny pages of grammar! I'm all about the laminated quick references and the Dover Essential Grammar series; my goal is to deal with grammar in manageable chunks, once I can already guess half of it anyways.
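For the cloze step, a rough sketch of what the conversion might look like, using Anki's standard `{{c1::...}}` cloze markup. The `make_cloze` helper is hypothetical and not taken from any of the scripts posted earlier; it just wraps whole-word matches of the target words (por/para here) so each one becomes its own cloze deletion.

```python
import re


def make_cloze(sentence: str, targets: list[str]) -> str:
    """Wrap each whole-word occurrence of a target word in Anki cloze markup."""
    if not targets:
        return sentence
    # \b keeps "por" from matching inside longer words such as "porque".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in targets) + r")\b",
        re.IGNORECASE,
    )
    counter = 0

    def wrap(match: re.Match) -> str:
        nonlocal counter
        counter += 1  # each deletion gets its own number, so Anki makes one card per blank
        return "{{c%d::%s}}" % (counter, match.group(0))

    return pattern.sub(wrap, sentence)


if __name__ == "__main__":
    print(make_cloze("Gracias por venir, esto es para ti.", ["por", "para"]))
```

Numbering every match separately means a sentence with both words yields two cards, which is probably what you want for a contrast pair like this.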

Postby emk » Sun Apr 14, 2024 4:41 pm

Conjugation cards. So, I'm going to break several of my usual self-imposed Anki rules. I'm going to use a deck that someone else made, one which is focused primarily on drilling a certain point of grammar. But it's a nice deck that someone put a lot of effort into: Ultimate Spanish Conjugation (Lisardo's KOFI Method) (homepage). This contains 4,246 verb forms, conjugated in extremely simple and repetitive phrases.

Which, well, I certainly don't want all that. So let's start by suspending all the cards, and just reactivate the ones I want. The deck has a fantastic tagging scheme, so we can just search for the forms we want:

Code:

tag:ser tag:irregular_form -tag:subjuntivo_futuro
tag:haber tag:irregular_form -tag:subjuntivo_futuro
tag:estar tag:irregular_form -tag:subjuntivo_futuro
tag:hablar -tag:subjuntivo_futuro
tag:deber -tag:subjuntivo_futuro
tag:vivir -tag:subjuntivo_futuro

This gives me 262 cards, covering 3 irregular verbs and 3 regular verbs (one for each conjugation). I've made some decisions here, based on what I've seen in Avatar, various songs, and my book:

  • For irregular verbs, I'm going to start out by learning just the irregular forms.
  • I don't care about the subjunctive future. Apparently it mostly appears in fixed expressions and extremely elevated prose? Whatever, I'll deal with it when I run into it.
  • More or less all of the other verb tenses are familiar from Avatar and my reading. I think Spanish just uses all of them in normal prose and speech? Might as well learn them all.
  • I'm going to learn the vos/vosotros forms because they're apparently pretty common in certain regions and it's not that much extra work. And one of the Spanish speakers I interact with most often lived in Argentina, so I might as well just learn these forms.
  • deber kills two birds with one stone: it gives me -er verbs and the most useful modal verb.
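If the verb list grows, the search strings above are regular enough to generate. Here's a small sketch; `build_queries` and `combine` are hypothetical helpers, and the only thing assumed about the deck is the tag names already shown in the searches (`irregular_form`, `subjuntivo_futuro`, and one tag per verb).

```python
def build_queries(irregular: list[str], regular: list[str], excluded: list[str]) -> list[str]:
    """Build one Anki browser search per verb, mirroring the hand-written ones."""
    exclusions = [f"-tag:{tag}" for tag in excluded]
    queries = []
    for verb in irregular:
        # For irregular verbs, only the irregular forms are wanted.
        queries.append(" ".join([f"tag:{verb}", "tag:irregular_form", *exclusions]))
    for verb in regular:
        queries.append(" ".join([f"tag:{verb}", *exclusions]))
    return queries


def combine(queries: list[str]) -> str:
    """Join the searches with Anki's `or` so they can be pasted in as one search."""
    return " or ".join(f"({q})" for q in queries)


if __name__ == "__main__":
    qs = build_queries(["ser", "haber", "estar"], ["hablar", "deber", "vivir"], ["subjuntivo_futuro"])
    print("\n".join(qs))
    print(combine(qs))
```

The combined form is handy for selecting everything at once in the browser before unsuspending.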
Also, I've set my "Easy" graduation threshold for new cards to several weeks. If I easily recognize a card the first time I see it, I don't need to see it again soon. So let's see how this looks:

Image Image

Yup, that seems like it's worth 262 cards. Which is, like, an entire episode of Avatar. But I bought him his coffee, because I really don't want to write a program to generate this deck myself. :lol:

And all this brings my daily new cards up to 20, which is a lot.

Image

...and I need to add a new song. Didn't I do that a couple of days ago? Glad I have this mostly automated!

Problems with sentence cards. This whole plan of "highlight sentences on the Kindle, export to my laptop, translate using GPT-3.5-Turbo and turn into cards" is producing too many difficult cards. The big problems seem to be that:

  • Getting random sentences out of context is too hard at my level. I keep needing to add hints to remind me who's talking, etc.
  • There just isn't enough context on the cards for GPT-3.5-Turbo to guess a good translation.
So I think what I need to do is take the Kindle highlights, and map them back to the underlying book, and then add (Spanish/English, prev/curr/next) data, like I do for media cards. Ugh, but I bet it makes the cards a lot better.