substudy: Make Anki cards and other resources from video & bilingual subtitles (command-line)

Re: substudy: A tool for making bilingual subtitles (MacOS X or Linux, command-line)

Postby Marais » Sun Oct 09, 2016 4:49 pm

I tried to email some official TV stations in France and ask them for their subtitled versions of programmes for deaf people and they don't have any. I wonder sometimes how deaf people get on here. Every show should have accurate subtitles available imo.

emk
Black Belt - 1st Dan
Posts: 1620
Joined: Sat Jul 18, 2015 12:07 pm
Location: Vermont, USA
Languages: English (N), French (B2+)
Badly neglected "just for fun" languages: Middle Egyptian, Spanish.
Language Log: viewtopic.php?f=15&t=723
x 6330
Contact:

Postby emk » Tue Oct 11, 2016 10:59 am

Marais wrote:I tried to email some official TV stations in France and ask them for their subtitled versions of programmes for deaf people and they don't have any. I wonder sometimes how deaf people get on here. Every show should have accurate subtitles available imo.

Subtitling in France is terrible. Even when they do subtitle for the deaf, it sometimes bears very little resemblance to the spoken dialog. Although if you look at DVDs released in the last five years, it's not quite as hopeless as it used to be.

We have a Wiki page with French movies and series that we've verified to have reasonably good subs, and there are links to various websites where you can look for more.

But if you're using substudy, especially as a beginner, you want subtitles that are at least 90% accurate, because you'll throw away basically every card where the audio doesn't match the text. I've tried doing subs2srs and substudy with less accurate subs, and it was only about 20% as effective as finding a series with clear dialog and high-quality subs. It seems like everyone ignores this advice at first, but it really does make a huge difference.

maschingon
Yellow Belt
Posts: 64
Joined: Wed Aug 24, 2016 6:57 am
Location: Mérida, YUC, México
Languages: Spanish [C2], Portuguese [B2? C1 writing? no idea], Chinese [high intermediate level, no idea specifically]
Language Log: viewtopic.php?f=15&t=3906
x 33

Postby maschingon » Fri Oct 21, 2016 7:44 am

rdearman wrote:It would appear you don't have the substudy program in your path. This means you'll probably have to type the entire path, or figure out how to edit the PATH variable. I don't use Macs; however, you can do it temporarily on a Unix machine, which is what a Mac really is, by using the export command. Anyway, the simple way is to find the directory where the substudy command lives, using a search tool or find:

Code: Select all

find ~ -iname substudy


then type in the full path to the executable

Mine is located:

Code: Select all

/home/rick/.multirust/toolchains/nightly/cargo/bin/substudy


Yours is probably similar. I have the substudy folder in my path. You can do this (on a Linux machine, and probably a Mac) by editing the .bashrc file in your home area. Then insert something like the line below, with the appropriate corrections.

Code: Select all

export PATH=${PATH}:/home/rick/.multirust/toolchains/nightly/cargo/bin



I understand everything you say up until "I have the substudy folder in my path. You can do this by editing the .bashrc file in your home area".

I found the file path,

Code: Select all

/Users/ElMasChingon/.cargo/registry/index/github.com-1ecc6299db9ec823/su/bs/substudy


What does .bashrc mean? You lose me there.
Michael King (seudónimos: Miguel Rey / Miguel Rei / 金一迈)

Postby maschingon » Fri Oct 21, 2016 7:53 am

emk wrote:If you've installed more recently, you might need to run:

Code: Select all

export PATH=${PATH}:~/.cargo/bin


I need to go update substudy a bit and track down a compilation issue on some newer Mac systems. I've been terribly busy with work and training lately, and I've fallen behind on maintaining some of my programs!

As always, if you're a developer, I do accept pull requests on github.


I'm trying my hardest here haha, I'm probably doing something wrong that's really obvious, but I have no idea right now. Anyways, here's what I've tried in the last few minutes since the last time I re-opened the terminal:

Code: Select all

Michaels-MBP:bs ElMasChingon$ ./substudy
-bash: ./substudy: Permission denied
Michaels-MBP:bs ElMasChingon$ ./substudy export csv traicionera.mp3 \ Traicionera_subtitles.es.srt Traicionera_subtitles.en.srt
-bash: ./substudy: Permission denied
Michaels-MBP:bs ElMasChingon$ ./substudy PATH=${PATH}://Users/ElMasChingon/.cargo/registry/index/github.com-1ecc6299db9ec823/su/bs/substudy
-bash: ./substudy: Permission denied
Michaels-MBP:bs ElMasChingon$ ./substudy PATH=${PATH}:/Users/ElMasChingon/.cargo/registry/index/github.com-1ecc6299db9ec823/su/bs/substudy traicionera.mp3 \ Traicionera_subtitles.es.srt Traicionera_subtitles.en.srt
-bash: ./substudy: Permission denied
Michaels-MBP:bs ElMasChingon$
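
For anyone hitting the same "Permission denied": the path above sits under ~/.cargo/registry/index, which holds cargo's package index (metadata about crates), not compiled programs, so the file named substudy there is not something bash can run. Below is a rough sketch of how to check a candidate path before trying to execute it. The helper name check_candidate is hypothetical, and ~/.cargo/bin as the install location is an assumption taken from emk's advice quoted above.

```shell
# Report whether the given path is something the shell can execute.
# check_candidate is a hypothetical helper name, not part of substudy or cargo.
check_candidate() {
    candidate="$1"
    if [ ! -e "$candidate" ]; then
        echo "missing"
    elif [ -d "$candidate" ]; then
        echo "directory"
    elif [ -x "$candidate" ]; then
        echo "executable"
    else
        echo "not executable"
    fi
}

# Example (assumed install location, per emk's suggestion):
#   check_candidate ~/.cargo/bin/substudy
#   command -v substudy   # prints the path the shell would run, if one is on PATH
```

If the candidate comes back as "not executable", it is probably a data file rather than the program, and adding its folder to PATH will not help.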

rdearman
Site Admin
Posts: 7231
Joined: Thu May 14, 2015 4:18 pm
Location: United Kingdom
Languages: English (N)
Language Log: viewtopic.php?f=15&t=1836
x 23127
Contact:

Postby rdearman » Fri Oct 21, 2016 5:18 pm

maschingon wrote:What does .bashrc mean? You lose me there.


In a bash terminal shell, which is what you are using, your configuration file lives in your home area. It is called .bashrc, and normally you can't see it because files beginning with a dot are hidden by default. However, you can list it with the ls command like this:

Code: Select all

ls -la

Or you can just edit it. I don't know what text editors there are on a Mac, so you will have to figure that one out yourself, but just replace EDITOR in the command below with the name of your editor program.

Code: Select all

EDITOR ~/.bashrc


And you can edit the file and add the stuff I said earlier.
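
A minimal sketch of the same edit done from the terminal instead of a text editor. The helper name add_path_line is hypothetical, and the choice of startup file is an assumption: bash reads ~/.bashrc for interactive non-login shells, while the macOS Terminal app starts login shells, which read ~/.bash_profile instead, so on a Mac you may need to pass that file.

```shell
# Append an "export PATH" line for a directory to a shell startup file,
# unless that exact line is already present.
# add_path_line is a hypothetical helper name.
add_path_line() {
    dir="$1"
    file="$2"
    line="export PATH=\"\${PATH}:${dir}\""

    # Create the startup file if it does not exist yet.
    touch "$file"

    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (assumes cargo put the binary in ~/.cargo/bin):
#   add_path_line "$HOME/.cargo/bin" "$HOME/.bashrc"
#   . "$HOME/.bashrc"    # reload the file in the current terminal
```

Running it twice is harmless, because the line is only appended when it is missing.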
: 0 / 150 Read 150 books in 2024

My YouTube Channel
The Autodidactic Podcast
My Author's Newsletter

I post on this forum with mobile devices, so excuse short msgs and typos.

Postby maschingon » Sun Oct 23, 2016 4:36 am

I got it working a few days ago, I'm now off to the races! Finally... I'm not getting it to work with an mp3 file, but that's not a crucial problem - for now I can just grab the music video / lyric video (although the lyric video would add a whole new interesting element...)

Another question: is there a way to modify the information it spits out? In other words, can I change the layout in which it gives the output so that I can have it follow my preferred deck model?

Postby emk » Sun Nov 06, 2016 4:59 pm

I think I have substudy working again on new Mac systems! But I'll need a technically-inclined tester to be sure.

If you're running Xcode 8 under OS X El Capitan or Sierra, I've made a pre-release version of substudy. To test it out, see the instructions here, but replace the actual install command with:

Code: Select all

cargo install --vers 0.4.1-pre.1 substudy

This should produce a working copy of substudy. But please let me know if you encounter any issues with the auto-detection of character sets; it seems to be detecting certain very short UTF-8 files as using a legacy central European encoding, and I don't know whether that will affect longer files. I suspect it won't, but I'd love independent confirmation.

In related good news, you may be able to get substudy working on Windows if you have your Rust compiler configured to use MSVC. But that's still an "experts only" possibility right now. If people are happy with this updated version of substudy, I'll make an official release and try to set up automatic binary builds so that you can just download it. (Of course, you'll still need to install ffmpeg.)

maschingon wrote:I got it working a few days ago, I'm now off to the races! Finally... I'm not getting it to work with an mp3 file, but that's not a crucial problem - for now I can just grab the music video / lyric video (although the lyric video would add a whole new interesting element...)

Thank you for your bug report about this! I think this is a case of missing MP3 metadata, and I'll look into fixing it when I get a moment.

maschingon wrote:Another question: is there a way to modify the information it spits out? In other words, can I change the layout in which it gives the output so that I can have it follow my preferred deck model?

Substudy spits out raw CSV files, which you can import into Anki using any deck model of your choice.

Postby maschingon » Mon Nov 07, 2016 11:27 am

emk wrote:
maschingon wrote:I got it working a few days ago, I'm now off to the races! Finally... I'm not getting it to work with an mp3 file, but that's not a crucial problem - for now I can just grab the music video / lyric video (although the lyric video would add a whole new interesting element...)

Thank you for your bug report about this! I think this is a case of missing MP3 metadata, and I'll look into fixing it when I get a moment.


No, thank you! (putting emphasis on the "you" is quite hard to do through text...)

emk wrote:
maschingon wrote:Another question: is there a way to modify the information it spits out? In other words, can I change the layout in which it gives the output so that I can have it follow my preferred deck model?

Substudy spits out raw CSV files, which you can import into Anki using any deck model of your choice.


Ya, that was a very dumb question, sorry about that... :oops: I figured that one out on my own :mrgreen:

Postby rdearman » Sat Dec 17, 2016 4:47 pm

I have another request, based on a recent thread about Gradint (http://people.ds.cam.ac.uk/ssb22/gradint/), a program for making Pimsleur-like exercises.

I had a look at using substudy in combination with Gradint to create a Pimsleur-style course from a film. Substudy is probably 90% of the way toward making all the input files required by Gradint. The only difference is that when you create tracks, the file name ends in .fi.mp3 (for Finnish) instead of _fi.mp3, which is what Gradint expects.

I can do this with a script, but what would be cool is to give substudy two video files (or, in the case of mkv, the audio language track numbers) and two SRT files. It would then generate a directory filled with files named timestamp_language.mp3, for example 02089_940-02117_019_fr.mp3 and 02089_940-02117_019_en.mp3. Gradint could then create a Pimsleur course from these files. Better yet would be for substudy to generate a set of files using the Pimsleur algorithm, which could be burned to disc and played in the car.

I may have a go at this myself.
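
The renaming half of this can be sketched in a few lines of shell. It only covers the .fi.mp3 to _fi.mp3 suffix change described above. The helper name rename_for_gradint is hypothetical, and the assumption that every track ends in .LANG.mp3 comes from this post, not from the substudy or Gradint documentation.

```shell
# Rename every "*.LANG.mp3" file in a directory to "*_LANG.mp3".
# rename_for_gradint is a hypothetical helper name.
rename_for_gradint() {
    dir="$1"
    lang="$2"
    suffix=".${lang}.mp3"
    for f in "$dir"/*"$suffix"; do
        # With no matches the glob stays literal, so skip names that don't exist.
        [ -e "$f" ] || continue
        # Strip the old suffix and attach the underscore form.
        mv -- "$f" "${f%"$suffix"}_${lang}.mp3"
    done
}

# Example:
#   rename_for_gradint ./tracks fi
```

Files for other languages in the same directory are left alone, so the function can be run once per language code.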