Miguel's transcription log

SpanishInput
Yellow Belt
Posts: 97
Joined: Sun Sep 26, 2021 3:11 pm
Location: Ecuador
Languages: Spanish (N), English (C2), Mandarin (HSK 5)
x 469

Miguel's transcription log

Postby SpanishInput » Mon Nov 15, 2021 1:59 am

Hi! Seeing the language logs here inspired me to start working again on my very rusty Mandarin. I know some users in this forum highly recommend doing transcription exercises. Back in 2017 I spent 3 months doing transcription exercises every day, and this helped me pass HSK 5. At that time I kept a log of my progress on Chinese Forums. Today I decided to start doing this again. This time I'll use the Netflix drama "Meteor Garden". On my first day, I did this for 2 pomodori (25-minute blocks) and got a 19.56% error rate. This is the first time in a long time I'm grabbing paper and pencil to write hanzi, so please excuse my awful handwriting. (It's not really better in Spanish!)

IMG_4066 (2).jpg


I'm aiming for accuracy, not speed. I'll keep track of my daily error rate on a spreadsheet and once a week I'll generate graphs to see how I'm doing. Hopefully, with time, my error rate should be going down.

AllSubNoDub
Orange Belt
Posts: 172
Joined: Thu Aug 26, 2021 10:44 pm
Languages: English (N)
Speaks: Spanish (B1+), German (B2 dormant)
Learns: Japanese (Kanji only)
Language Log: https://forum.language-learners.org/vie ... 15&t=17191
x 475

Re: Miguel's transcription log

Postby AllSubNoDub » Mon Nov 15, 2021 1:32 pm

Hey Miguel, would you mind going into greater detail? So for this method you're watching a video, pausing (I'm assuming) and trying to write the characters as you think you hear them, then checking them against the subtitles? I'm curious, this sounds like an interesting method. I assume you must already have a pretty good vocabulary and spoken comprehension (well, you're HSK V, so I guess that goes without saying). How did you go about learning characters?

Re: Miguel's transcription log

Postby SpanishInput » Mon Nov 15, 2021 3:12 pm

Hi, AllSubNoDub. Thank you for your comment.

How I do this:

-Grab a pencil, a piece of paper, eraser and a dictionary app (Pleco).

-Sit down for one pomodoro (25 minutes) and transcribe a TV show on Netflix using these Language Reactor features: Autopause (pauses with each subtitle line), line repeat (just press S), hide subtitles (set to hide both captions and translations) and speed controls (press 1 to make it slower, 2 to make it faster). I have a video about how you do all this with Language Reactor. I won't link to it here as my account is still new, but someone embedded it on the thread about Spanish n-grams. I use the dictionary app when I need to figure something out or remember the correct "spelling" of a word. I'm transcribing episodes I already watched years ago, so I'm familiar with the plot. If I can't figure out a character even after consulting Pleco, I just write a ? in the box.

-Immediately after writing each line, I reveal the subtitles (press E key) and compare them to what I wrote. I then write the correct characters in the spaces between lines. I use my own transcription paper for this. It has 25 squares per row and 20 rows per side of an A4 sheet. This makes it easy to count how many characters you wrote during a session. The paper also has spaces between rows, where I can write corrections. I count the following as errors: 1) Wrong character 2) Missing character 3) Extra character (I write an X in this case).

-Get up and walk for 5 minutes after one pomodoro and then work for another pomodoro.

-After 2 pomodori, I then copy the following data to Google Sheets: Date, episode, squares filled, and number of errors. It's easier to see and count errors if you rotate the sheet 90°. I then calculate my error rate (errors divided by squares I filled) and number of correct squares (total squares minus errors).
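The two spreadsheet formulas in the last step can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the actual spreadsheet: the `Session` class, its field names and the sample numbers are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One day's transcription totals, as they would be copied into the sheet."""
    date: str
    episode: int
    squares_filled: int  # characters written, counted from the grid paper
    errors: int          # wrong + missing + extra characters

    @property
    def error_rate(self) -> float:
        """Errors divided by squares filled."""
        return self.errors / self.squares_filled

    @property
    def correct_squares(self) -> int:
        """Total squares filled minus errors."""
        return self.squares_filled - self.errors


# Hypothetical numbers, for illustration only (not taken from the log).
example = Session(date="2021-01-01", episode=1, squares_filled=400, errors=60)
print(f"{example.error_rate:.2%} errors, {example.correct_squares} correct squares")
```

An error rate of 15% in this sense corresponds to getting 85% of the filled squares right.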

The nice thing about this exercise is that you're being exposed to authentic input while writing things down at the same time, and you get immediate feedback on whether what you wrote is correct or not. Transcription helps you notice things you normally would not pay attention to.

My Mandarin and Hanzi/Kanji background
I started with Japanese back in 2004. I went through Heisig's Remembering the Kanji on-and-off and finished in about a year and a half. I used Stackz! as a basic Leitner-based review system. I also had the full "Japanese for busy people" series. I hated the dialogues, grammar explanations and drills, but I really, really enjoyed the short stories in the workbooks. I didn't really make much progress in Japanese until I gave all the books away, installed Anki, filled it with words from paper flashcards I had and just started to read stuff and try to have fun with the "Erin's challenge" videos and exercises. That's when I learned the power of word lists and SRS. By 2011 I had reached a point where I could have simple conversations in Japanese and I could read stuff I was familiar with without major problems. Then a friend convinced me to take up Mandarin.

I started Mandarin in a formal setting in 2012, but I found those classes boring, and I stopped learning altogether. Around 2013-2014 I found FluentFlix (Now FluentU), which at the time was free, and I practically started from zero again with Mandarin. I had forgotten almost everything. With FluentU I learned the power of these features: Autopause, hide subtitles, repeat a subtitle line, popup dictionary and bilingual subtitles. However, FluentU had mostly random content that I wasn't interested in, there was no "discovery" engine for new stuff and it only included short clips instead of full movies or full TV series, so I stopped using it.

In 2015, I made the decision to drop Japanese completely and focus on Mandarin. I went through Heisig's books for Simplified Hanzi in about 6 months, and I supplemented what the books were lacking (pronunciation, sample words) with Pleco and other means. The details of what I did are in my Amazon review of Heisig's first book. By October of the same year, some 3-4 months after starting Heisig, I was able to read in Mandarin, without the help of pinyin, in front of other people. Of course, with lots of preparation. I still have my recordings from 2015. BTW, at the time I was spending 3-5 hours every day practising writing Hanzi by hand, using Anki, practising tones and hanzi readings with Pleco flashcards, etc. Around October I also started memorizing the readings of words from the HSK lists and from the Subtlex list, which is a list gathered from subtitles. That's when I learned the power of word lists gathered from movie subtitles, and of ranking words by contextual diversity instead of frequency. It was really crazy. Of course my previous experience with Japanese helped, but lots of hanzi are different, all have different readings and many have different meanings. If I were a clickbait YouTuber I could have made a video "How I learned to read and write Chinese characters in 3 months", but of course that would have been a misrepresentation.

I kept reading and listening to Chinese every week (including a trip to China) until early 2017, when life happened and I dropped Chinese almost completely. But you probably know about the sunk cost fallacy: We just can't let go of something in which we've invested a lot of time and energy, even if it has no practical benefit. So in August 2017 I signed up to take the HSK 5 exam in December. I started looking for a FluentU replacement that I could use with native content I actually wanted to watch instead of the short curated isolated clips provided by FluentU. I tried several video and audio players for language learners, and settled on the now defunct LAMP player and Workaudiobook, and I used them to do transcription exercises for 3 months. I used LAMP with a TV show (downloaded the mp4 and the srt) and Workaudiobook with a podcast. I passed the HSK 5. 2018 is a blur. I can't remember if I engaged with Mandarin at all.

Fast forward to 2019. Browser add-ons to watch Netflix with similar features to FluentU and LAMP started to appear. For a time, I used "Netflix Dual subtitle for learning languages", created by Niko (Japanese). Then LLN (now Language Reactor) appeared, and it became wildly popular for its ease of use. So popular that there's now a copycat extension with a similar name. There's also GlotDojo, which works with Chinese streaming services such as iQIYI and WeTV and is more customizable than Language Reactor. So now there are no excuses not to be consuming Chinese media. There's an overflow of good quality Chinese media! I started to watch Chinese dramas with these extensions on and off. I wasn't doing transcription exercises, though. I did use LLN's Anki flashcard export feature, but that quickly became boring. I then started to read the progress logs in this forum, and here I am. Luke's log in particular inspired me to start actively engaging with Chinese content again.

I'm attaching my transcription paper in case anyone else is learning a character-based language. For Latin-based languages I'd suggest just typing in a text editor with word count features (or maybe using transcription paper with wider boxes?).
1000transcriptionpaper-double-sided.zip


And here's a list of the most common words in Chinese Netflix shows that I would consider "watchable". Fantasy, horror and overly violent shows and movies were omitted. Only words that appear in at least 10% of contexts (contextual diversity, aka "document range") in this small sample (61 contexts) were included.

Chinese Netflix 10 percent words.zip
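The 10% cutoff can be written down as a small filter. This is a sketch under assumptions: the function name `filter_by_range`, the data layout (each word mapped to the set of contexts it appears in) and the toy entries are all invented for illustration, and the attached list itself was not produced with this code. With 61 contexts, 10% is 6.1, so a word has to show up in at least 7 of them.

```python
def filter_by_range(word_contexts, total_contexts, min_percent=10):
    """Keep words whose document range is at least min_percent of all contexts.

    word_contexts maps each word to the set of contexts (shows or movies)
    it appears in. Integer arithmetic avoids float rounding at the cutoff.
    """
    return [
        word
        for word, contexts in word_contexts.items()
        if len(contexts) * 100 >= min_percent * total_contexts
    ]


# Toy data, invented for illustration: context IDs are just numbers.
word_contexts = {
    "知道": set(range(7)),  # appears in 7 of 61 contexts
    "螃蟹": set(range(6)),  # appears in 6 of 61 contexts
}
kept = filter_by_range(word_contexts, total_contexts=61)
print(kept)
```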

rdearman
Site Admin
Posts: 7231
Joined: Thu May 14, 2015 4:18 pm
Location: United Kingdom
Languages: English (N)
Language Log: viewtopic.php?f=15&t=1836
x 23128

Re: Miguel's transcription log

Postby rdearman » Mon Nov 15, 2021 3:43 pm

You should be able to post links now. Pity it is Chrome only, looked useful there for a second. :D

alaart
Green Belt
Posts: 338
Joined: Sat Aug 03, 2019 6:58 am
Location: Kaoshiung
Languages: DE (N), EN
B1: NL, JP, PT (BR), ZH
A2: KR
A1: ES
Language Log: https://forum.language-learners.org/vie ... hp?t=10867
x 1027

Re: Miguel's transcription log

Postby alaart » Mon Nov 15, 2021 4:02 pm

Thanks for the detailed explanation.

I used transcription exercises before for Chinese around 2018, but with the focus on pinyin, trying to catch the right tones and training my ear. It was with the help of a website similar to FluentU called Yabla. It's nice to read that it works; maybe I should have kept going, as I'm now half tone-deaf again (I'm always struggling to maintain my Chinese).

Good luck to you, and I will follow your log and hope to someday do something similar when life permits it.

Re: Miguel's transcription log

Postby AllSubNoDub » Mon Nov 15, 2021 5:28 pm

Thank you for this amazing post and breakdown! Somehow, I've never heard of this method, though it seems in some way familiar (sentido común?). My brother has worked through RTH and I'm inclined to say that it's probably even more useful for Chinese than its originally-intended Japanese, given the stronger written-to-spoken connection between characters with the former.

I'm going to pretend you didn't mention this part:
SpanishInput wrote:But you probably know about the sunk cost fallacy: We just can't let go of something in which we've invested a lot of time and energy, even if it has no practical benefit.

Since you went a bit beyond the original question, do you mind sharing your experiences with your amazing English? Did you learn it in school? My goal for Spanish is C2+ (the only language for which I have such a lofty target at this point), but I find it seems so rare for anyone to actually reach and test at that level in any foreign language. No rush, this is your log, take your time answering in any way you feel comfortable.

Miguel vs. Zipf's Law

Postby SpanishInput » Sun Nov 21, 2021 9:08 pm

Weekly update: Here’s how my transcription exercise has been going so far. Not looking as good as I'd like. :oops:
error rate.PNG

By Friday I had already noticed my error rate had been hovering above 15%, which meant my comprehension was below 85%. I decided to do something about it: Learn words from my list of the most common words in different dramas. But then I remembered Refold’s advice to focus on only one “domain” at a time. In this case, my current domain is the drama “Meteor Garden (2018)”, which has 49 episodes.

General lists vs. Specific lists

Imron, the administrator of Chinese-forums, talks about a similar idea in his blog “Chinese the hard way”. He says after a certain point (1,200 words/HSK 4), learning words from a general word list is inefficient, and learning from a list specific to the content you’re consuming is an order of magnitude more efficient:
https://www.chinesethehardway.com/artic ... efficient/

If I had started going through my “general list”, I would have needed to reach word #11,463 just to get 90% coverage.

I used AntConc to find the most common words in “Meteor Garden”. Surprisingly, the drama only contains 11,053 unique words (144,498 words total):
antconc.PNG

After exporting the AntConc results (File › Save output) I loaded this file into LibreOffice Calc. After deleting the first column and moving the data in the first two rows to another sheet, I then sorted the data first by the first column (frequency) and then by the second column (contextual diversity). This way contextual diversity (the number of different episodes a word appears in) becomes the main ranking criterion. This will save me from wasting time learning words that might appear several times, but only in one or two episodes. I then used a couple of simple formulas to calculate cumulative coverage:
libreoffice_calc.png
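For anyone who prefers a script to a spreadsheet, the ranking and the cumulative coverage column can be sketched roughly like this. The post does not show the actual Calc formulas, so this rests on assumptions: coverage is taken as the running sum of frequencies divided by the total frequency of all listed words, and frequency is used as the tie-breaker within the same episode count. The names `Entry`, `rank_by_range`, `cumulative_coverage` and `words_needed`, and the toy numbers, are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    word: str
    freq: int      # total occurrences in the whole show
    episodes: int  # contextual diversity: episodes the word appears in


def rank_by_range(entries):
    """Contextual diversity first, raw frequency as the tie-breaker."""
    return sorted(entries, key=lambda e: (e.episodes, e.freq), reverse=True)


def cumulative_coverage(ranked):
    """Share of all running words covered by the top 1, 2, 3... entries."""
    total = sum(e.freq for e in ranked)
    running = 0
    coverage = []
    for e in ranked:
        running += e.freq
        coverage.append(running / total)
    return coverage


def words_needed(ranked, target=0.90):
    """How many top-ranked words are needed to reach the target coverage."""
    for count, cov in enumerate(cumulative_coverage(ranked), start=1):
        if cov >= target:
            return count
    return len(ranked)


# Toy data, invented for illustration (100 running words in total).
entries = [
    Entry("螃蟹", freq=12, episodes=1),  # frequent, but only in one episode
    Entry("我", freq=48, episodes=5),
    Entry("知道", freq=10, episodes=4),
    Entry("你", freq=30, episodes=5),
]
ranked = rank_by_range(entries)
print([e.word for e in ranked])
print(words_needed(ranked, target=0.85))
```

Note how the word that is frequent in a single episode drops to the bottom of the ranking even though its raw frequency is higher than that of a word spread across four episodes.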

It turns out that with just the 2,318 most common words you can reach 90% coverage! And I’m still working with words that appear in at least 4 episodes. BTW, I know some of these “words” might not be actual “words”. Chinese text segmentation software is not perfect.
cumulative coverage.png

A bit more specific

I recalculated everything for just episodes 13-49, because I’m already at the beginning of episode 13. Now there are only 9,237 unique words, and I can reach 90% coverage with just the 2,151 most common ones.

I'm going over this list with Pleco and adding new words/possible troublesome words as flashcards. I'm also going to save the list as a custom dictionary in Notepad++ so I can easily skip these already-processed words when I repeat this word selection process for other shows in the future.
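The "skip what I've already processed" step is basically a set lookup. Here is a minimal sketch, assuming the processed words are kept as plain text with one word per line; the helper names `load_words` and `skip_processed` and the sample words are made up, and this is not how Notepad++ itself handles custom dictionaries.

```python
def load_words(text):
    """Parse a word list with one word per line, ignoring blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def skip_processed(candidates, processed):
    """Return the candidates that are not in the processed set, in order."""
    return [word for word in candidates if word not in processed]


# Toy data, invented for illustration.
processed = load_words("知道\n朋友\n\n喜欢\n")
fresh = skip_processed(["知道", "道歉", "喜欢", "螃蟹"], processed)
print(fresh)
```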

In other news

I found an exchange partner on conversationexchange.com, a site recommended by Rick. My partner and I have agreed to correct each other's writing. Time for production! :geek:

rdearman wrote:You should be able to post links now

Thanks, Rick! And thank you for recommending conversationexchange.com.

alaart wrote:trying to catch the right tones and training my ear

Tones were the bane of my existence for years. They still give me trouble. 加油! And 谢谢 for your good wishes.

AllSubNoDub wrote:Somehow, I've never heard of this method, though it seems in some way familiar.

It was mentioned in this very same forum back in 2018: https://forum.language-learners.org/vie ... php?t=9631
A nice introductory article about the benefits of this method can be found here, although I don't follow these exact instructions: https://www.fluentin3months.com/transcr ... technique/ (No matter what you think of this site, this article is actually good)

AllSubNoDub wrote:It's probably even more useful for Chinese than its originally-intended Japanese, given the stronger written-to-spoken connection between characters with the former.

Yes, but you need to add some extra work. I give some more details about what I did with this book here: (100% independent, non-affiliated review. However, I did have contact with Dr. Heisig years ago... To ask him when the Spanish version would be out :lol:, and later to thank him. :) )
https://www.amazon.com/gp/customer-revi ... 0824833236

AllSubNoDub wrote:Did you learn it in school? My goal for Spanish is C2+

Thanks! Congratulations on that goal! I learned English in the Bénédict school of languages as a teen. They have a nice method in a small group setting (max 9), with their own books. I highly recommend them. After that, the Internet came along and I had to use it in English because online information in Spanish was almost non-existent. Cable TV with closed captions, DVDs with subtitles and music also helped a lot. Even though the Oxford Online Placement Test tells me my English is C2, I know my accent is not always easy to understand, and my prepositions and word usage are sometimes off. I need to work on those.

I’ll try to update this log only on Sundays, so each post (hopefully) contains a bit more progress. Feel free to add comments/questions/advice in between posts.

Re: Miguel's transcription log

Postby SpanishInput » Sat Nov 27, 2021 7:29 pm

This week I've been keeping at it with my daily transcription exercise. I'm still using the same drama, Meteor Garden (2018). It's based on the manga "Boys Over Flowers". It's utterly ridiculous and over-the-top. It's literally a live-action adaptation of a manga, so what else could I expect? :roll: Still, the production value is very, very high, which makes this thing watchable, and the fact that all the audio is dubbed in Mandarin (in China they often don't use the audio recorded with the footage) makes it easier for me to understand what they're saying.

BTW, this is actually a remake of another adaptation... of another adaptation.... OK, let's just say this story is the "Ugly Betty/Betty la fea" of the Asian world when it comes to the number of adaptations and remakes. :lol:

There's high day-to-day variability in my comprehension level. One day the drama has only common words and I do fine, the next day it has some references to Chinese poetry and cooking terms that I don't know. This week I've had both my worst and my best day when it comes to the error rate. So I've decided to stop looking at my daily error rate and instead look at a 7-day moving average, so I can more easily see trends. It's still hovering above 15%. To increase my comprehension, I'm adding words from this drama to Pleco flashcards. My criteria for adding them are:
  • The word must appear in at least 3 episodes I haven't watched yet.
  • OR I might add it even if it appears only once in this show, ONLY IF it's also in the list of top words in all the TV shows I pre-selected.
So far I'm confident this strategy is going to work. We'll see.
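The two criteria above could be expressed as a simple yes/no check. This is a sketch, not the actual workflow (the words are being picked by hand in Pleco): the function name `should_add`, the data layout and the sample words are hypothetical, and the second rule is discretionary in the post ("I might add it"), so the code only flags candidates. It also assumes the episode counts are taken over the unwatched episodes.

```python
def should_add(word, unwatched_episode_count, general_top_words, min_episodes=3):
    """Flag a word as a flashcard candidate.

    unwatched_episode_count maps a word to the number of not-yet-watched
    episodes it appears in; general_top_words is the set of top words
    across all pre-selected shows.
    """
    count = unwatched_episode_count.get(word, 0)
    if count >= min_episodes:
        return True
    # Rule 2: even a single appearance qualifies if the word is also
    # common across the other shows.
    return count >= 1 and word in general_top_words


# Toy data, invented for illustration.
unwatched = {"知道": 30, "律师": 1, "螃蟹": 1}
general_top = {"律师"}
for word in ["知道", "律师", "螃蟹", "诗"]:
    print(word, should_add(word, unwatched, general_top))
```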

In other news, I wrote something in Chinese and sent it to my exchange partner. She kindly marked all mistakes, explained them and suggested a more natural alternative. Now it's my turn to mark her Spanish text.

BTW, here's a graph of how I'm doing with the listening exercise:

M23
Orange Belt
Posts: 161
Joined: Wed Dec 09, 2015 6:58 am
Location: Colorado (USA)
Languages: Analog languages - English (N), Spanish (intermediate), German (n00b). Digital languages- Java (n00b)
Language Log: viewtopic.php?f=15&t=2186
x 297

Re: Miguel's transcription log

Postby M23 » Sat Nov 27, 2021 8:58 pm

Very interesting idea. I might have to try this out with Spanish when I am not bogged down with school. Keep it up for sure. :D

Re: Miguel's transcription log

Postby SpanishInput » Sat Dec 11, 2021 8:23 pm

Time for an update:
I've kept doing the transcription exercise on most days, except Wednesday and Thursday this week, because I was busy and not in the best mood for transcription. But I've kept my Pleco flashcard routine. According to Pleco, I've learned 276 words these days. I'm adding around 10 "new" words per day. Some are not actually new, but just words that trip me up when I hear them. I find that sometimes tones trip me up, and sometimes it's the difference between u and ü.

After my Wed-Thu gap, I decided to lower my daily transcription quota to just one pomodoro (25 minutes), and spend another pomodoro just enjoying the show. This seems a lot more sustainable than just transcribing for one hour every day and never watching the show just for fun. I need to remember to have fun.

As for my error rate, it keeps going up and down. Nowadays I'm actually skipping lines that are shouted or where two people speak at the same time, because those are just too hard.
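The 7-day moving average mentioned in the Nov 27 post is one way to smooth out this up-and-down. A minimal sketch of a trailing average follows; the function name `moving_average` and the sample error rates are made up, and it averages over the last 7 recorded sessions (an assumption, since skipped days have no data point).

```python
def moving_average(values, window=7):
    """Trailing mean; yields one value per session once a full window exists."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


# Hypothetical daily error rates in percent, for illustration only.
daily_error_rates = [20, 18, 16, 22, 12, 17, 14, 21]
smoothed = moving_average(daily_error_rates, window=7)
print([round(x, 2) for x in smoothed])
```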

I've also recently tried to use the Migaku browser extension instead of Language Reactor. It has lots of promising features, but it's just not suitable for what I'm doing right now, so I'm sticking with LR.