arthaey wrote:emk wrote:Code:
substudy export review avatar_01_01.mkv avatar_01_01.es.srt avatar_01_01.en.srt
Yay for simple usage!
Thanks! My theory is that this shouldn't have a zillion command-line flags. It's willing to go to some pretty ridiculous lengths to figure things out for you. If your data isn't in UTF-8 format, for example, it will run an encoding detector and convert automatically if possible. Or if your video file contains multiple audio tracks tagged with different languages, it's soon going to run Google's human language detector on the subtitles (which is extremely accurate given that much text) and try to pick the right track by default. I'm working on that right now. Why have these computer things if they're just going to ask a hundred incomprehensible questions?
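To give a rough idea of the "convert automatically if possible" step, here is a minimal sketch. It is not substudy's actual code: the function name decode_subtitle_bytes is made up for illustration, and instead of a real statistical encoding detector it assumes that anything which isn't valid UTF-8 is Latin-1 (ISO-8859-1).

```rust
/// Hypothetical helper (not part of substudy): turn raw subtitle bytes
/// into a String without asking the user what encoding they are in.
fn decode_subtitle_bytes(bytes: &[u8]) -> String {
    // Drop a UTF-8 byte-order mark if the file starts with one.
    let bytes = if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        &bytes[3..]
    } else {
        bytes
    };
    match std::str::from_utf8(bytes) {
        // Already valid UTF-8: use it as-is.
        Ok(text) => text.to_owned(),
        // ASSUMPTION: treat everything else as Latin-1, where each byte
        // maps to the Unicode code point with the same value.
        Err(_) => bytes.iter().map(|&b| b as char).collect(),
    }
}
```

A real detector looks at byte statistics to choose among many encodings, so the Latin-1 fallback here only stands in for that step.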
As for the ffmpeg version: avoid libav in favor of real ffmpeg. Version 2.8.x should be pretty safe. You can try whatever you've got, but keep a process monitor open and kill ffmpeg if it tries to eat 2GB of RAM.
arthaey wrote:You seem to be on a roll here, so I don't want to duplicate your dev efforts. But if you're not actually planning on doing the CSV export today, I may take a crack at it later.
The final version of the CSV export code will involve a somewhat messy refactoring to support multiple exporters. I'm not sure that I want to inflict that on somebody else.
But if you want to hack together a simple CSV export that piggybacks on the "review" exporter, it would be a great warmup exercise. To do this, you would want to look up BurntSushi's awesome Rust csv library, add an entry for it to the Cargo.toml file, and then add an 'extern crate csv;' line in src/lib.rs. Then go to src/export/mod.rs, add a 'use csv;' near the top, and just hack the CSV output directly into the big 'export' function without worrying about the design.
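To show roughly what each exported row might look like, here is a std-only sketch. The helper names csv_field and csv_row and the column layout (audio file, foreign text, native text) are assumptions for illustration, not anything in substudy. The quoting is hand-rolled only so the example has no dependencies; in the real export you'd let the csv library do this part.

```rust
/// Hypothetical helper: quote one field RFC 4180 style. Fields containing
/// a comma, a double quote, or a line break get wrapped in double quotes,
/// and any embedded double quotes are doubled.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_owned()
    }
}

/// Hypothetical helper: join already-quoted fields into one CSV line
/// (without the trailing newline).
fn csv_row(fields: &[&str]) -> String {
    let quoted: Vec<String> = fields.iter().map(|f| csv_field(f)).collect();
    quoted.join(",")
}
```

Calling csv_row once per subtitle pair inside the export loop and writing each result to a file would be enough for a first hack.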
If you want to import into Anki, you'll probably also want to change 'grow(0.5, 0.5)' to 'grow(1.5, 1.5)' to get a bit more context around each clip. You may also want to replace the 'index' in the audio and video file names with a timestamp based on the start time of the audio/video. This will prevent name clashes in Anki by making the file names unique.
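Here's a small sketch of both ideas. Everything in it is a stand-in: this Period struct, its grow method, and clip_file_name are hypothetical and only imitate what the real code does, and the file name format (stem, then the start time in zero-padded milliseconds) is just one possible choice.

```rust
/// Hypothetical stand-in for a clip's time span, in seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Period {
    begin: f64,
    end: f64,
}

impl Period {
    /// Widen the span by `before` seconds at the start and `after`
    /// seconds at the end, never going below zero.
    fn grow(self, before: f64, after: f64) -> Period {
        Period {
            begin: (self.begin - before).max(0.0),
            end: self.end + after,
        }
    }
}

/// Build a media file name from the clip's start time instead of its
/// index, so clips from different exports don't collide in Anki.
fn clip_file_name(stem: &str, period: Period, ext: &str) -> String {
    // Start time in whole milliseconds, zero-padded to 8 digits.
    let millis = (period.begin * 1000.0).round() as u64;
    format!("{}_{:08}.{}", stem, millis, ext)
}
```

Putting the episode name in the stem matters too, since two episodes can easily have clips starting at the same time.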
If you get stuck, the main tutorial / manual provided on the Rust site is excellent. And I'm always happy to answer questions, though I reserve the right to do so in French sometimes in your case.
I'm also happy to do remote pair programming to help explain the code and show off Rust tricks.
I'm honestly not sure if this is an ideal project to merge--too much cleanup will need to happen elsewhere--but it's a great warmup exercise. It's up to you if you want to take it on.
Also, this will be much easier if you have some prior knowledge of a language like C or C++ (at least heap versus stack, and pointers).
Oh, yeah: The Rust borrow checker will hurt (so good) for the first week or two, but once you make friends with it and stop picking fights, it mostly fades into the background like a badass guardian angel who watches over your code for vile, vile memory ownership bugs, while still letting you get all the speed of a true low-level language. Just don't pick fights with it if you can avoid doing so.