Interlaced bilingual subtitles

kundalini
Orange Belt
Posts: 114
Joined: Sun Jan 24, 2021 8:17 pm
Languages: English (C), Greek (low intermediate)
x 358

Postby kundalini » Tue Mar 02, 2021 9:52 pm

Interlaced bilingual texts have been something of a holy grail for those interested in the Listening-Reading method. As far as I'm aware, they've traditionally required intensive labor to create, even more than parallel texts, and for that reason are relatively rare, even commercially. They're also much easier to follow than parallel texts, since they're often interlaced at the sentence level.

So I was excited when, thanks to this forum, specifically a post by Kraut (https://forum.language-learners.org/viewtopic.php?f=19&t=9001&p=176030&hilit=subtitletools#p176030), I came across a way to create such texts easily with video subtitles. I discussed this in another post, but I found it so helpful that I thought I'd create a separate thread. The only requirements are subtitle files in the desired languages, and a free service at https://subtitletools.com/.

If you use Netflix, there's a browser extension that allows the downloading of subtitles in all available languages in any Netflix program. The details are outlined here: https://www.reddit.com/r/LearnJapanese/comments/iq2q24/tutorial_how_to_download_japanese_subtitle_or_any/ However, the subtitles do not need to come from Netflix. Subtitles to films and TV shows in multiple languages are widely available on the web.

Once I have the subtitle files, I go to https://subtitletools.com/ and merge the files with Subtitle Merger. Then I download the output file and convert to plain text by uploading it once again to subtitletools (Convert to Plain Text).
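
For anyone who would rather do the merge offline, the same idea can be approximated in a few lines of Python. This is only a minimal sketch and not a description of how subtitletools.com works internally. It assumes both files are ordinary SRT, and the function names `parse_srt` and `interlace` are made up for illustration.

```python
import re

# Matches SRT timing lines such as "00:01:02,500 --> 00:01:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) cues from SRT content."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
                end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
                # Everything after the timing line is the subtitle text.
                body = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                if body:
                    cues.append((start, end, body))
                break
    return cues


def interlace(target_srt, native_srt):
    """Merge two subtitle tracks into one plain-text listing ordered by start time."""
    tagged = [(start, 0, text) for start, _, text in parse_srt(target_srt)]
    tagged += [(start, 1, text) for start, _, text in parse_srt(native_srt)]
    tagged.sort()  # by start time; the target-language line comes first on ties
    return "\n".join(text for _, _, text in tagged)
```

Passing the contents of the two subtitle files to `interlace` returns a single string with the lines of both languages alternating wherever their start times line up, which can then be saved as a txt file.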

subtitletools.png


The result looks like this. You'll notice that some lines have no counterpart, since translators don't render every line directly. This has not been a major problem for me, however.

Call_My_Agent__S01E01_WEBRip_Netflix_fr_cc__txt.jpg


What you do with the text file at this point is up to you. You can export selected words and phrases into a flashcard app, or you can read it beforehand and try to watch the video without looking at subtitles, or you can read along with the video. The possibilities are wide open. Personally, I have been loading them onto an e-reader (after converting the txt files into epub with Calibre) to read the subs before watching an episode. I save words and passages by highlighting them, which I can review later. Here's what it looks like on the e-reader:

IMG_9130.jpg


Of course, there are some cons. Compared to lengthy novels, the vocabulary found in a film or TV series script is much less dense. There are often scenes where not much is said. From a language learning perspective, these linguistically impoverished moments are dead weight. But such moments are much easier to gloss over when reading solely the text (albeit at the expense of listening comprehension). One is also unlikely to gain much exposure to sophisticated prose with subtitles. On the other hand, they offer a much better window into the actual daily spoken language than classic novels.

The pros, I think, outweigh the downsides. One of the key benefits of the L-R approach with long texts is that, through exposure to the author's idiolect, words and phrases particular to that author are consolidated into one's memory as they are encountered repeatedly in various contexts. TV shows, too, have their own idiolect that can be repeatedly encountered over a number of seasons. The main difficulty with L-R is that it can be hard to obtain both bilingual texts and unabridged audio recordings in one's target language. The text and videos that can be used with interlaced bilingual subs, on the other hand, are much more widely available. I think the sheer volume of material that can be obtained through TV shows and subtitles is this approach's biggest strength.

Lastly, I'm not affiliated with any of the products or services mentioned in this post, including Netflix and subtitletools.com
Iliad: 12 / 24

Kraut
Black Belt - 2nd Dan
Posts: 2620
Joined: Mon Aug 07, 2017 10:37 pm
Languages: German (N)
French (C)
English (C)
Spanish (A2)
Lithuanian
x 3226

Re: Interlaced bilingual subtitles

Postby Kraut » Wed Mar 03, 2021 10:28 am

You can extract SRT-files from any satellite TV program that you have recorded using ProjectX ( https://www.chip.de/downloads/ProjectX_15629057.html ) as long as it's in SD. If you have recorded a programme in HD, you need something like TS-Doctor:
( http://www.cypheros.de/tsdoctor3.html?g ... gIyXfD_BwE )

Then you can proceed as described above and get interlaced texts or watch a film with two subtitles.

mcthulhu
Orange Belt
Posts: 228
Joined: Sun Feb 26, 2017 4:01 pm
Languages: English (native); strong reading skills - Russian, Spanish, French, Italian, German, Serbo-Croatian, Macedonian, Bulgarian, Slovene, Farsi; fair reading skills - Polish, Czech, Dutch, Esperanto, Portuguese; beginner/rusty - Swedish, Norwegian, Danish
x 590

Re: Interlaced bilingual subtitles

Postby mcthulhu » Wed Mar 03, 2021 10:06 pm

I'm interested in why you think that interlaced texts are much easier to follow than parallel texts. Is it just the separation into sentences? Or is above + below easier to follow than left + right?

Interlaced texts may be a holy grail but shouldn't be that difficult to produce, at least not if you're interlacing sentences rather than individual phrases and words. Have you tried LF-Aligner?

Re: Interlaced bilingual subtitles

Postby kundalini » Thu Mar 04, 2021 1:30 pm

mcthulhu wrote:I'm interested in why you think that interlaced texts are much easier to follow than parallel texts. Is it just the separation into sentences? Or is above + below easier to follow than left + right?

Interlaced texts may be a holy grail but shouldn't be that difficult to produce, at least not if you're interlacing sentences rather than individual phrases and words. Have you tried LF-Aligner?


Good question! I think it is mostly related to the distances involved in the eyes' saccadic movements. In a parallel text, the eyes often have to jump horizontally across the entire width of the page to find the equivalent of an unknown word. An interlaced text requires a much shorter vertical jump. Even more so when interlaced at a sentence level. I noticed that I would often lose my place in a parallel text after jumping across the page to consult the meaning of a word or a phrase. All told, I'd consequently spend a fair amount of time trying to reestablish where I was in the text, an experience that I found distracting. Naturally, this happened more often when the parallel texts came in longer blocks of text.

FT_-_The_Mysterious_Island.jpg


In terms of personal preference as well, I've found interlaced texts to suit my needs better than parallel texts. I prefer to use an e-reader when possible over a computer monitor, and the smaller screen is better suited to interlaced texts than parallel.

Thank you for suggesting LF-Aligner. I'll try it, and am curious to see if it is capable of handling translation irregularities. With subtitles, the colloquial translations often add or subtract lines from the original. As seen below, some lines go untranslated. But the app is able to match up the next pair of sentences based on their timestamps in the subtitle files. I'd be very excited if there are programs that can create interlaced texts automatically with novels and non-fiction books!

A_language_learners’_forum_-_Post_a_reply.jpg
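
The timestamp matching described above can be sketched roughly as follows. This is a guess at the general idea rather than the app's actual algorithm: it assumes the cues have already been parsed into `(start_ms, end_ms, text)` tuples, and the names `overlap` and `pair_by_time` are hypothetical.

```python
def overlap(a, b):
    """Milliseconds during which two (start_ms, end_ms, text) cues are both on screen."""
    return min(a[1], b[1]) - max(a[0], b[0])


def pair_by_time(target, native):
    """Pair each target cue with the unused native cue it overlaps most.

    A target line with no overlapping translation is paired with None.
    """
    pairs = []
    used = set()
    for cue in target:
        best, best_overlap = None, 0
        for j, other in enumerate(native):
            shared = overlap(cue, other)
            if j not in used and shared > best_overlap:
                best, best_overlap = j, shared
        if best is None:
            pairs.append((cue[2], None))  # orphan: nothing on screen at the same time
        else:
            used.add(best)
            pairs.append((cue[2], native[best][2]))
    return pairs
```

Because the pairing relies only on when each line is on screen, an untranslated line simply ends up with `None` and the following lines still match up. Extra lines that exist only in the translation are ignored in this sketch.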

Re: Interlaced bilingual subtitles

Postby Kraut » Thu Mar 04, 2021 3:13 pm

kundalini wrote:
Good question! I think it is mostly related to the distances involved in the eyes' saccadic movements. In a parallel text, the eyes often have to jump horizontally across the entire width of the page to find the equivalent of an unknown word. An interlaced text requires a much shorter vertical jump. Even more so when interlaced at a sentence level. I noticed that I would often lose my place in a parallel text after jumping across the page to consult the meaning of a word or a phrase. All told, I'd consequently spend a fair amount of time trying to reestablish where I was in the text, an experience that I found distracting. Naturally, this happened more often when the parallel texts came in longer blocks of text.


This is exactly my experience when I do reverse translation in the bidirectional method, with my eyes going back to the original. At first everything is very comfortable when DeepL gives you the parallel format. But after a while, seeing my German script bothers me to the point that I now record my German translations and do the back translation via audio, while my eyes occasionally look at the original for checking.

Re: Interlaced bilingual subtitles

Postby mcthulhu » Fri Mar 05, 2021 2:16 am

...mostly related to the distances involved in the eyes' saccadic movements. In a parallel text, the eyes often have to jump horizontally across the entire width of the page to find the equivalent of an unknown word. An interlaced text requires a much shorter vertical jump. Even more so when interlaced at a sentence level. I noticed that I would often lose my place in a parallel text...

That's a very good point, kundalini, thanks. In Jorkens, I have the "traditional" parallel book display format with the original book on the left side and the translated version on the right side (or vice versa, depending). The two versions are just shown in two independently scrollable iframes, which is fine for readability but maybe not for going back and forth. I've been planning at some point to incorporate some sort of sentence or at least paragraph alignment, to calculate which sentence in text 2 is a translation of which sentence in text 1, and present the two arrays of sentences in side-by-side columns. I guess that format would still suffer from the same problem of having to look all the way across the page, though dividing the text into rows and columns at least ought to make it harder to lose your place. Maybe I'll try to have equivalent sentences paired vertically instead of horizontally (or make it an option).

Or I could hide the second version, and have it appear only when you hover over a sentence in the first version. (I always worry that I might end up just focusing on my native language because it's so much easier to read, which defeats the whole purpose.)

siouxchief
Yellow Belt
Posts: 77
Joined: Sat May 25, 2019 8:36 am
Location: Ireland
Languages: Learning French
x 125

Re: Interlaced bilingual subtitles

Postby siouxchief » Sun Mar 07, 2021 11:56 am

Forgive me if you know this already but there is a really useful plugin for Netflix with parallel text and lots of other features. Worth checking out:

https://languagelearningwithnetflix.com/

Re: Interlaced bilingual subtitles

Postby kundalini » Fri Mar 12, 2021 4:39 pm

mcthulhu wrote:In Jorkens, I have the "traditional" parallel book display format with the original book on the left side and the translated version on the right side (or vice versa, depending). The two versions are just shown in two independently scrollable iframes, which is fine for readability but maybe not for going back and forth. I've been planning at some point to incorporate some sort of sentence or at least paragraph alignment, to calculate which sentence in text 2 is a translation of which sentence in text 1, and present the two arrays of sentences in side-by-side columns. I guess that format would still suffer from the same problem of having to look all the way across the page, though dividing the text into rows and columns at least ought to make it harder to lose your place. Maybe I'll try to have equivalent sentences paired vertically instead of horizontally (or make it an option).


Now that I've had a chance to install Jorkens, I'm excited to see where it goes in the future. I agree -- continuous matched scrolling in parallel texts would be very helpful.

Interlaced text matching, then being able to export it to epub or txt, would be even better.

How much of a programming challenge would it be to pair up matching sentences? This would likely include taking into account orphan sentences that go untranslated, extra sentences in the translation, and liberal translations.
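
To give a feel for the size of the problem, here is a minimal sketch of one classic approach: dynamic programming over sentence lengths, loosely in the spirit of length-based aligners. It is a simplified stand-in, not a description of how LF-Aligner or Jorkens work. The function name `align` and the `skip_cost` value are arbitrary choices for illustration, and it only handles one-to-one pairs plus unmatched sentences on either side.

```python
def align(src, tgt, skip_cost=0.6):
    """Align two lists of sentences with dynamic programming.

    Allowed moves: pair one sentence with one (the cost grows with the
    relative difference in length), or leave a sentence on either side
    unmatched (fixed skip_cost, an arbitrary illustrative value).
    Returns a list of (src_sentence_or_None, tgt_sentence_or_None).
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i and j:  # pair src[i-1] with tgt[j-1]
                a, b = len(src[i - 1]), len(tgt[j - 1])
                c = cost[i - 1][j - 1] + abs(a - b) / max(a, b, 1)
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j - 1)
            if i:  # src[i-1] has no translation
                c = cost[i - 1][j] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j)
            if j:  # tgt[j-1] is an extra sentence in the translation
                c = cost[i][j - 1] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i, j - 1)
    # Walk back from the end to recover the chosen moves.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[i - 1] if pi < i else None,
                      tgt[j - 1] if pj < j else None))
        i, j = pi, pj
    pairs.reverse()
    return pairs
```

Orphan and extra sentences come out paired with `None`. Liberal translations are the hard part: sentence length is a weak signal on its own, so a real tool would likely also need a dictionary or some other measure of similarity, and support for merging or splitting sentences.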

Re: Interlaced bilingual subtitles

Postby Kraut » Tue Mar 16, 2021 12:45 am

This is from Dehaene's book on reading: words when read in groups should not be too big. I think I have been doing this wrong and should stick to colouring words from now on.

https://www.penguinrandomhouse.de/Stani ... 0_5395.rhd

2. One might think that how easily we can read something depends on the size of the characters: small characters should be harder to grasp than large ones. Strangely enough, this is not the case. The larger the letters of a word are printed, the more space they take up on the retina, so the letters move to the edge of the field of vision, where they are difficult to distinguish.

2. Man könnte meinen, es hänge von der Größe der Zeichen ab, wie leicht wir etwas lesen können: Kleine Zeichen sollten schwerer zu erfassen sein als große. Seltsamerweise ist das nicht der Fall. Denn je größer die Buchstaben eines Wortes gedruckt sind, desto mehr Platz beanspruchen sie auf der Retina – die Buchstaben wandern also an den Rand des Sehfeldes, wo sie nur schwer unterscheidbar sind.

Re: Interlaced bilingual subtitles

Postby kundalini » Tue Mar 16, 2021 2:40 pm

Kraut wrote: This is from Dehaene's book on reading: words when read in groups should not be too big. I think I have been doing this wrong and should stick to colouring words from now on.

https://www.penguinrandomhouse.de/Stani ... 0_5395.rhd

2. One might think that how easily we can read something depends on the size of the characters: small characters should be harder to grasp than large ones. Strangely enough, this is not the case. The larger the letters of a word are printed, the more space they take up on the retina, so the letters move to the edge of the field of vision, where they are difficult to distinguish.


This is interesting -- thanks for sharing the quote. I've noticed that reading big fonts and reading small fonts are both uncomfortable experiences, though perhaps for different reasons.