Ripping DVDs to Hard Drive + How To Learn With TV

General discussion about learning languages
Stefan
Green Belt
Posts: 379
Joined: Sun Dec 20, 2015 9:59 pm
Location: Sweden
Languages: -
x 920

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Stefan » Sat Sep 23, 2017 8:58 am

Seneca wrote:How would I do that? Are you thinking something like what is at this link? This seems quite labor intensive, so I'd want to know if this is what was meant before taking the time!

That's the one. I created Anki decks for a handful of movies years ago but haven't since, so I decided to test one episode now (La légende de Korra) and timed it with a stopwatch.

The time-consuming part is gathering good material (ripping DVDs or editing subs), although Netflix is a gold mine for subs today (the videos are DRM-protected). Back then I searched subtitle sites but gave up when I concluded that only one in ten subs had decent quality. Then I switched to real DVDs with included subs, which was a bit better, before ultimately ending up with public service broadcasters and their near-perfect transcriptions. Unfortunately, public service doesn't offer a translation in your native language, so you still need to find those subs or run the file through Google Translate.

Anyway, once you have ripped the DVD and ended up with a movie file and matching subs, there are basically two steps left. First, enter the information into subs2srs, which takes about one minute, and click Go!. This starts the export, which runs automatically but can take some time. On my cheap laptop, bought used for $75, it took nine minutes to export one episode with 193 cards and 23 minutes of playtime. I'm sure it's a lot quicker on a decent computer, but it's not a big deal since I can spend the time surfing this forum. Then import the newly created tsv and media files into Anki, which took me less than a minute.

Initiate subs2srs: 1 min
Exporting data: 9 min (auto)
Importing to Anki: 1 min

tl;dr: once you have the movie file and matching subs, it's about two minutes of work and a few minutes of waiting time.



This would probably be my process since it's the most time efficient method I can think of:

1) Find a decent movie on a mediathek and capture with matching subs.
2) Edit subs to unbreak and merge lines to avoid cards with half a sentence.
3) Upload txt version of the subs to Google Translate and get your native version. (fixed)
4) Run the files through subs2srs and export finished media files.
5) Import into Anki.

Note that I haven't tried this so I don't know what the end result would be like, but I reckon the machine translation might be worth it instead of spending time finding and editing multiple subs. I believe Netflix has a bunch of identical subs for different languages so it might be an option if you have it available. You would probably still need to merge lines in both subs though.
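Step 2 (unbreaking and merging lines) could be sketched in Python roughly like this. It is only a minimal sketch under assumptions, not something subs2srs or SubtitleEdit provides: the names `parse_srt`, `merge_half_sentences` and `to_srt` are made up for illustration, the input is assumed to be a plain .srt file, and a cue counts as "half a sentence" simply when its text does not end in sentence-ending punctuation.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,000".
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")

# Assumption: a cue ending in one of these closes a sentence.
ENDERS = (".", "!", "?", "…", '"', "»")


@dataclass
class Cue:
    start: str
    end: str
    text: str


def parse_srt(data):
    """Parse SRT text into cues, joining each cue's text lines with spaces."""
    cues = []
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                # "Unbreak": a two-line subtitle becomes one line of text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(match.group(1), match.group(2), text))
                break
    return cues


def merge_half_sentences(cues):
    """Merge a cue into the previous one when the previous text is unfinished."""
    merged = []
    for cue in cues:
        if merged and not merged[-1].text.rstrip().endswith(ENDERS):
            prev = merged[-1]
            # Keep the earlier start time and the later end time.
            merged[-1] = Cue(prev.start, cue.end, f"{prev.text} {cue.text}")
        else:
            merged.append(cue)
    return merged


def to_srt(cues):
    """Serialise cues back to SRT text, renumbering from 1."""
    blocks = [f"{n}\n{c.start} --> {c.end}\n{c.text}" for n, c in enumerate(cues, 1)]
    return "\n\n".join(blocks) + "\n"
```

Because merging changes the number of cues, it would have to be done before translating, or applied identically to both sub files, so the two sides still line up in subs2srs.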
Last edited by Stefan on Sat Sep 23, 2017 1:57 pm, edited 1 time in total.
1 x

DaveBee
Blue Belt
Posts: 952
Joined: Wed Nov 02, 2016 8:49 pm
Location: UK
Languages: English (native). French (studying).
Language Log: https://forum.language-learners.org/vie ... =15&t=7466
x 1386

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby DaveBee » Sat Sep 23, 2017 9:25 am

Stefan wrote:3) Upload txt version of the subs to Google Translate and get your native version.
4) Run the files through subs2srs and export finished media files.
5) Import into Anki.

Note that I haven't tried this so I don't know what the end result would be like, but I reckon the machine translation might be worth it instead of spending time finding and editing multiple subs. I believe Netflix has a bunch of identical subs for different languages so it might be an option if you have it available. You would probably still need to merge lines in both subs though.
DeepL might be a better option than Google Translate.
Last edited by DaveBee on Sat Sep 23, 2017 9:26 am, edited 1 time in total.
0 x

Seneca
Green Belt
Posts: 268
Joined: Sat Jun 11, 2016 5:08 pm
Location: Eurasia
Languages: English (N); 日本語 (beginner)
x 351

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Seneca » Sat Sep 23, 2017 9:25 am

Stefan wrote:This would probably be my process since it's the most time efficient method I can think of:

1) Find a decent movie on a mediathek and capture with matching subs.
2) Edit subs to unbreak and merge lines to avoid cards with half a sentence.
3) Upload txt version of the subs to Google Translate and get your native version.
4) Run the files through subs2srs and export finished media files.
5) Import into Anki.

Note that I haven't tried this so I don't know what the end result would be like, but I reckon the machine translation might be worth it instead of spending time finding and editing multiple subs. I believe Netflix has a bunch of identical subs for different languages so it might be an option if you have it available. You would probably still need to merge lines in both subs though.

Thanks for the detailed rundown and summary of this! I really appreciate it. I have purchased the box sets of a few TV shows (The Shield and The Wire) to use with this. I think The Wire would be a lot more challenging due to all the street slang. Around the end of October (I have a lot going on the next month and a half), I am going to try out The Shield with this method.

The good news is that, with the delay, I have plenty of time to edit/prep/tweak this stuff. I really like this as a concept. Plus, these DVDs all have English and the target language. So it seems it should be easy for me to rip one version for watching just with target language subs per your useful links I quoted in the first post and then to do the above to make the flashcards. :D I look forward to reporting back at the end of November to say how it goes.

edit: or, is it better to get an online translation of the target language subs so they are the same rather than use the English track on the DVD since that'd likely be wordier than the target language?
0 x

Stefan
Green Belt
Posts: 379
Joined: Sun Dec 20, 2015 9:59 pm
Location: Sweden
Languages: -
x 920

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Stefan » Sat Sep 23, 2017 10:20 am

DaveBee wrote:DeepL might be a better option than Google Translate.

I have high hopes for it, but sadly there's no API or file upload yet, so you're limited to 5000 characters per request. It's also a bit buggy, with single numbers turning into something else: for example, 6 turns into 6 6.6 and 16 turns into 16 16 16. You could divide the sub file (mine is 23k chars) into pieces, translate them, and then combine the results while fixing the number errors.
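The divide-and-combine idea could be sketched like this in Python. It is a rough sketch under assumptions: `split_into_chunks` and `numbers_differ` are hypothetical helper names, the 5000-character limit is the one mentioned above, and the translating itself is still done by hand by pasting each chunk into the site.

```python
import re


def split_into_chunks(text, limit=5000):
    """Split text into chunks of at most `limit` characters, on line boundaries."""
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if len(line) > limit:
            raise ValueError("a single line is longer than the limit")
        # Start a new chunk when this line would push the current one over.
        if current and size + len(line) > limit:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


def numbers_differ(source_line, translated_line):
    """Flag a line whose digits changed in translation (e.g. 6 -> 6 6.6)."""
    return re.findall(r"\d+", source_line) != re.findall(r"\d+", translated_line)
```

Since chunks are cut only between lines, joining the translated chunks in order gives back one file with the same line order, and `numbers_differ` can then be run line by line to point at the places that need fixing by hand.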
1 x

Adrianslont
Blue Belt
Posts: 827
Joined: Sun Aug 16, 2015 10:39 am
Location: Australia
Languages: English (N), Learning Indonesian and French
x 1936

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Adrianslont » Sat Sep 23, 2017 11:14 am

Seneca wrote:
Stefan wrote:This would probably be my process since it's the most time efficient method I can think of:

1) Find a decent movie on a mediathek and capture with matching subs.
2) Edit subs to unbreak and merge lines to avoid cards with half a sentence.
3) Upload txt version of the subs to Google Translate and get your native version.
4) Run the files through subs2srs and export finished media files.
5) Import into Anki.

Note that I haven't tried this so I don't know what the end result would be like, but I reckon the machine translation might be worth it instead of spending time finding and editing multiple subs. I believe Netflix has a bunch of identical subs for different languages so it might be an option if you have it available. You would probably still need to merge lines in both subs though.

Thanks for the detailed rundown and summary of this! I really appreciate it. I have purchased the box sets of a few TV shows (The Shield and The Wire) to use with this. I think The Wire would be a lot more challenging due to all the street slang. Around the end of October (I have a lot going on the next month and a half), I am going to try out The Shield with this method.

The good news is that, with the delay, I have plenty of time to edit/prep/tweak this stuff. I really like this as a concept. Plus, these DVDs all have English and the target language. So it seems it should be easy for me to rip one version for watching just with target language subs per your useful links I quoted in the first post and then to do the above to make the flashcards. :D I look forward to reporting back at the end of November to say how it goes.

edit: or, is it better to get an online translation of the target language subs so they are the same rather than use the English track on the DVD since that'd likely be wordier than the target language?


I actually use SubtitleEdit, a free program, to get a translation of my target language subs to my native language, English. It runs the subs through Google Translate for you very quickly. You get a very literal translation and it can be ugly but there are two advantages.

1. Using Google Translate, directly or through SubtitleEdit, means you have the same number of target language and native language subs. Often, if you use the target lang and native lang subs provided on a DVD, there may not be the same number of lines, e.g. 1237 target lines and 1264 native lang lines. This causes all sorts of problems when making Anki cards - the subs don't match without a lot of manual editing.

2. SubtitleEdit has dictionaries that can help you clean up your subs. After ripping subs from a DVD, you have to convert a VobSub to an .srt file and they can need cleaning up as basically it's an OCR operation. SubtitleEdit does it pretty quickly in a semi-automated way.

If all of this sounds like a lot of work, it is. The first time. Then it is much easier the second time. If you can rip a DVD with Handbrake, the rest is only a little bit harder. And in my opinion very much worth learning.

The really hard part is finding shows where the target language soundtrack and target language subs are a close match. I find it easy for Indonesian but very difficult for French. And also you want to find shows that you like and that are helpful at your stage of learning.
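A quick way to spot the mismatch from point 1 before making any cards is to count the cues in both files. This is just a small Python sketch: `count_cues` and `compare_cue_counts` are made-up names, and it assumes standard .srt timing lines.

```python
import re

# One timing line per cue, e.g. "00:00:01,000 --> 00:00:02,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}", re.MULTILINE
)


def count_cues(srt_text):
    """Count the subtitle cues in SRT text by counting timing lines."""
    return len(TIMING.findall(srt_text))


def compare_cue_counts(target_srt, native_srt):
    """Return (target count, native count, whether the two counts match)."""
    target, native = count_cues(target_srt), count_cues(native_srt)
    return target, native, target == native
```

With two hypothetical files it could be used like `compare_cue_counts(open("show.id.srt", encoding="utf-8").read(), open("show.en.srt", encoding="utf-8").read())`. Equal counts don't prove the lines are aligned, but unequal counts mean manual editing is coming.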
3 x

DaveBee
Blue Belt
Posts: 952
Joined: Wed Nov 02, 2016 8:49 pm
Location: UK
Languages: English (native). French (studying).
Language Log: https://forum.language-learners.org/vie ... =15&t=7466
x 1386

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby DaveBee » Sat Sep 23, 2017 11:25 am

Adrianslont wrote:The really hard part is finding shows where the target language soundtrack and target language subs are a close match. I find it easy for Indonesian but very difficult for French. And also you want to find shows that you like and that are helpful at your stage of learning.
For French, Arte.tv/fr is good for this (Version Française - ST sourds/mal).

Arte do some geo-IP blocking. I use Firefox with the Modify Headers extension to spoof a French IP address, and that seems to be enough to get through.
1 x

Stefan
Green Belt
Posts: 379
Joined: Sun Dec 20, 2015 9:59 pm
Location: Sweden
Languages: -
x 920

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Stefan » Sat Sep 23, 2017 1:56 pm

Adrianslont wrote:I actually use SubtitleEdit, a free program, to get a translation of my target language subs to my native language, English. It runs the subs through Google Translate for you very quickly.

This is quite amazing. I knew about the built-in link to Google Translate but didn't know that you could run the whole text through it automatically. It seems to work great and only takes a second while saving me the work of going through their site. The development of SubtitleEdit is active as well, so I sent them a message about adding support for the unofficial DeepL API.
0 x

Seneca
Green Belt
Posts: 268
Joined: Sat Jun 11, 2016 5:08 pm
Location: Eurasia
Languages: English (N); 日本語 (beginner)
x 351

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Seneca » Sat Sep 23, 2017 6:20 pm

zenmonkey wrote:Use Handbrake. (Instructions: http://lifehacker.com/how-to-rip-a-dvd- ... er-5809765)
Choose the subtitles during the ripping process in the software.

For the subtitle, these are files with an srt format (you can 'find' these or use SubRip to create them from your DVD).
If you don't want to use subtitle files Handbrake will burn them directly into the video during extraction. Done.

I tend to watch on my computer and pause, return if there is something I don't get. I also write down a few sentences if I find expressions interesting or useful.


Edit: beat me by seconds.

I was not able to figure out how to use SubRip to extract the subtitles directly.

Adrianslont wrote:I actually use SubtitleEdit, a free program, to get a translation of my target language subs to my native language, English. It runs the subs through Google Translate for you very quickly. You get a very literal translation and it can be ugly but there are two advantages.

1. Using Google Translate, directly or through SubtitleEdit, means you have the same number of target language and native language subs. Often, if you use the target lang and native lang subs provided on a DVD, there may not be the same number of lines, e.g. 1237 target lines and 1264 native lang lines. This causes all sorts of problems when making Anki cards - the subs don't match without a lot of manual editing.

2. SubtitleEdit has dictionaries that can help you clean up your subs. After ripping subs from a DVD, you have to convert a VobSub to an .srt file and they can need cleaning up as basically it's an OCR operation. SubtitleEdit does it pretty quickly in a semi-automated way.

If all of this sounds like a lot of work, it is. The first time. Then it is much easier the second time. If you can rip a DVD with Handbrake, the rest is only a little bit harder. And in my opinion very much worth learning.

The really hard part is finding shows where the target language soundtrack and target language subs are a close match. I find it easy for Indonesian but very difficult for French. And also you want to find shows that you like and that are helpful at your stage of learning.


I have Mad Men on DVD in Spanish. I was fiddling around with it to try to make sure I am doing this right. Below is step-by-step what I did.

I used HandBrake and after selecting the DVD in my optical drive I put the HandBrake settings this way:

MP4
Video: FPS set to "30"
Audio: "español (AC3) (2.0)"
Subtitles: "español (VobSub)" and of the three options to the right, I only checked "Default". This took ~35 minutes.

I did not rip the English subtitles as I want to base the English side of the flash card on the Spanish subtitle itself.

After this process was done in HandBrake, I right-clicked the new file and opened it with Subtitle Edit. I had OCR method as "OCR as Tesseract" and downloaded "Spanish" under language. Then I hit "Start OCR." This took about 10 minutes. Then I hit okay and "save as" and now have the .srt file of the Spanish.

Then, in Subtitle Edit, I hit Auto-translate->Powered by Google. Then I selected all of the Spanish in the left column and clicked "Translate" at the top. This took less than 2 minutes. Then I hit save-as, and saved the file with the same name except it added ".en" before the ".srt".

Then I opened subs2srs, and put the Spanish .srt into subs1, the English .srt into subs2, and the mp4 video from above into the Video options. Here is where things go a bit off the rails. When I try to choose the mp4 file I created after selecting video, I get an error that says, "Unhandled exception has occurred in your application. If you click Continue, the application will ignore the error and attempt to continue. If you click Quit, the application will close immediately. The system cannot find the file specified." But the file definitely exists. I can even open and play it just fine in VLC. Any ideas?

I am trying to follow the advice here.

Stefan wrote:
Adrianslont wrote:I actually use SubtitleEdit, a free program, to get a translation of my target language subs to my native language, English. It runs the subs through Google Translate for you very quickly.

This is quite amazing. I knew about the built-in link to Google Translate but didn't know that you could run the whole text through it automatically. It seems to work great and only takes a second while saving me the work of going through their site. The development of SubtitleEdit is active as well, so I sent them a message about adding support for the unofficial DeepL API.


You seemed to have instant luck with this :lol: Do you see where I might be going wrong?

edited: typos
0 x

Seneca
Green Belt
Posts: 268
Joined: Sat Jun 11, 2016 5:08 pm
Location: Eurasia
Languages: English (N); 日本語 (beginner)
x 351

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Seneca » Sat Sep 23, 2017 11:06 pm

I have also tried a few other assorted files I happen to have on my laptop in the subs2srs Video section, and got the same error each time. So it seems at least it wasn't a problem of my handbrake usage! Progress, kind of.
0 x

Adrianslont
Blue Belt
Posts: 827
Joined: Sun Aug 16, 2015 10:39 am
Location: Australia
Languages: English (N), Learning Indonesian and French
x 1936

Re: Ripping DVDs to Hard Drive + How To Learn With TV

Postby Adrianslont » Sun Sep 24, 2017 1:53 am

Seneca, I think the only thing I do differently is that I rip to .mkv. I don't know if that will resolve your problem but might be worth a try.
0 x

