kunsttyv wrote:Hi emk, thanks for taking the initiative creating such useful tools for language learners. I have some limited experience with subs2srs, and while I could see its potential and usefulness, I didn't really use it that much since I usually don't have access to a Windows installation, and also I didn't like the clunky uncustomizable nature of the program.
You might want to try out my substudy command-line tool. It's trivial to install on Mac or Linux systems, and despite not having a GUI, it actually has a much simpler "UI" than Subs2SRS because it doesn't insist on asking you dozens of questions when it can just figure things out on its own.
kunsttyv wrote:How much effort would you think it would take someone with a decent amount of experience in imperative languages like Python, C etc. to be functional in Rust? Is it just a matter of adjusting oneself to the syntax, or would it be necessary to learn totally new concepts or even coding paradigms? I've heard that Rust comes with a pretty comprehensive type system for handling type safety issues.
Rust combines several old and new ideas. If you already know most of these, the learning curve shouldn't be too bad (maybe a week or two to be truly comfortable). But if none of this is familiar, it might take a while:
- Memory. Like C and C++, Rust exposes the difference between the stack and the heap, and between passing things as values and passing them as references. People who've only worked in garbage-collected languages might find this confusing.
- Anonymous functions. Like JavaScript, Ruby, or a functional language, Rust tends to rely more heavily on functions which take anonymous functions as arguments. Python or C programmers might find this surprising, but users of any other reasonably modern programming language are totally used to this.
- Generic types. Like C++, C#, Java and TypeScript, Rust uses generic types. So instead of using a type like "Vec" to represent a simple array, you would write "Vec<u8>" or "Vec<String>", where "u8" and "String" are compile-time type parameters. This may be challenging for people who've only ever used C and scripting languages. Generic types do add some complexity to the language, but they make it easier to write "zero-overhead" abstractions: code which is both fast and high-level. So it's a tradeoff.
- The "borrow checker". This is the only genuinely new thing in Rust. Basically, all values must be "owned" by some piece of code, and the owner can choose to hand out either a single mutable reference at a time, or multiple read-only references. This allows Rust to have automatic, correct memory management without needing a garbage collector. Some people already code in a style where "who owns what" is mostly obvious, and they'll adjust relatively rapidly. Other people will take longer and may need to change how they think about programs. (This also means that implementing doubly-linked lists in Rust is an advanced topic.)
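To make those four ideas a bit more concrete, here's a minimal sketch that touches each of them once. The names `count_matching` and `demo` are made up for illustration and don't come from substudy or any real library; treat it as a toy, not as how you'd structure a real program.

```rust
// Illustrative only: `count_matching` and `demo` are hypothetical names.

/// Generic function: works for any element type `T`, and takes an
/// anonymous function (closure) `keep` as an argument.
fn count_matching<T>(items: &[T], keep: impl Fn(&T) -> bool) -> usize {
    items.iter().filter(|item| keep(*item)).count()
}

fn demo() -> (usize, usize, Vec<String>) {
    // Each `Vec` is a small handle on the stack; its elements live on the heap.
    let mut words: Vec<String> = vec!["bonjour".to_string(), "salut".to_string()];
    let bytes: Vec<u8> = vec![1, 2, 3, 200];

    // Several read-only borrows of `words` may exist at the same time.
    let first = &words[0];
    let long_words = count_matching(&words, |w| w.len() > 5);
    let big_bytes = count_matching(&bytes, |b| *b > 100);
    assert_eq!(first.as_str(), "bonjour");

    // `push` needs a mutable borrow. That is allowed here only because
    // `first` is never used again after this point.
    words.push("merci".to_string());
    // Uncommenting the next line would make the borrow checker reject the
    // program, since `first` would then overlap with the mutable borrow:
    // println!("{}", first);

    // Ownership of `words` moves to the caller. `bytes` is freed
    // automatically when this function returns, with no garbage collector.
    (long_words, big_bytes, words)
}

fn main() {
    let (long_words, big_bytes, words) = demo();
    println!("{} long word(s), {} big byte(s), {:?}", long_words, big_bytes, words);
}
```

If the commented-out `println!` is restored, the program should fail to compile rather than misbehave at runtime, which is the trade the borrow checker makes.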
Good Rust introductions include:
For the kind of work I'm doing in this project, I would say the advantages of Rust are:
- If a Rust program compiles at all, it typically works correctly. In most languages, it's relatively easy to get your code to compile, but you often need to spend a fair bit of time running it and debugging it. Rust has a different workflow: You may spend more time trying to figure out compiler error messages, but once your program actually compiles, it will very often work correctly on the first try. Some people love this; other people get depressed "fighting" with the compiler and just want to see their program run, even if it's not correct yet. (If you get stuck, go to the Mozilla IRC server and ask for help on the #rust channel.)
- Rust code tends to be fast. This is especially useful for things like parsing MPEG-2 format subtitles, or for doing OCR. Rust gives you more-or-less the same speed as C, but with many convenient things you'd expect from higher-level languages.
- Rust's "cargo" is an excellent library manager and build tool. Cargo provides support for downloading third-party libraries and compiling your code. It's very advanced and well-thought-out, and you can easily use any library on crates.io.
- It's easy to make Rust programs work on Mac, Windows and Linux. This is partly thanks to Cargo, and partly thanks to the fact that common Rust libraries handle cross-platform issues for you. My vobsub2png tool worked on MacOS and Windows on the very first try!
- It's easy to make standalone statically-linked binaries using Rust. This makes it trivial to just download and unzip a binary without a lot of fuss, even on Linux.
The disadvantages are that Rust may be overkill, especially for throwaway scripts, and especially if you haven't learned Rust yet!
kunsttyv wrote:And one more thing, this utility project, would it only be for "subtitle study" or do you plan to incorporate other srs functionality as well? I have written some code to generate anki cards from kindle dictionary look-ups. It's working nicely for my specific setup, but I would like to rewrite it and make it more robust.
Nice! I'm always happy to see people working on language-learning tools!
Personally, I actually have two projects here: substudy, which is meant for learning languages using subtitles, and the "Rust subtitle utilities" that I'm discussing in this thread, which are a bit lower-level but which may be useful to language learners building their own tools, and which may someday be part of a newer version of substudy.
For both projects, I'm very likely to focus strictly on learning using subtitles and video. It's already a huge project as is! For working with text, there are already tools like readlang.com, Learning with Texts, and my own (permanently beta, and not really maintained) SRS Collector, which can import Kindle highlights and turn them into Anki cards. (There's an associated Anki plugin and Chrome plugin that I used to provide to beta testers.)
But like leosmith, I increasingly feel like "Listening is Everything", especially in the beginning. I still feel like tools like readlang, LWT and SRS Collector are useful for rapidly improving vocabulary while doing extensive reading. But I feel like video is much more useful early on, and that's where I want to focus my efforts as both a student and a programmer.