Solved: How to create your own Glossika-like GSR files?

Post by **kelciour** » Thu Apr 05, 2018 7:28 pm

neumanc wrote:This seems to be an excellent add-on for Anki to emulate Glossika. Thank you so much for letting me know, kelciour! However, I have the impression that this is only available for Anki, not for Ankidroid. Am I right?

AnkiDroid has a similar option, Automatic display answer, but the cards will be marked as failed:

The automatic display answer feature allows you to have the answer shown automatically after some timeout period. You can also have the next question shown automatically; in this case the card is assumed to be failed (i.e. the again button is automatically chosen)

There are several issues on GitHub about it (https://github.com/ankidroid/Anki-Android/issues/2609 and https://github.com/ankidroid/Anki-Android/issues/4787). For personal use, I think this behaviour can be easily "fixed" by replacing all occurrences of mEase1Layout with mEase3Layout in this function (AbstractFlashcardViewer.java#L1897) and following the first steps of this guide to build a custom version of AnkiDroid. It could take 30 minutes, a couple of hours, or maybe a little more.

Post by **neumanc** » Thu Apr 05, 2018 9:28 pm

kelciour wrote:AnkiDroid has a similar option, Automatic display answer, but the cards will be marked as failed:
The automatic display answer feature allows you to have the answer shown automatically after some timeout period. You can also have the next question shown automatically; in this case the card is assumed to be failed (i.e. the again button is automatically chosen)
There are several issues on GitHub about it (https://github.com/ankidroid/Anki-Android/issues/2609 and https://github.com/ankidroid/Anki-Android/issues/4787). For personal use, I think this behaviour can be easily "fixed" by replacing all occurrences of mEase1Layout with mEase3Layout in this function (AbstractFlashcardViewer.java#L1897) and following the first steps of this guide to build a custom version of AnkiDroid. It could take 30 minutes, a couple of hours, or maybe a little more.

Thank you very much for your effort, kelciour. So it seems that whichever way one emulates Glossika GSR files, some kind of computer programming is involved. I knew the day would come when I would have to pick up coding again.

Post by **neumanc** » Thu Apr 05, 2018 10:57 pm

jeff_lindqvist wrote:Instead of editing the algorithm intervals, I give each card a lower rating (never Again, because that will eventually result in a leech; Hard makes the card appear relatively soon ( ! ); Good is possibly too long into the future and Easy is definitely too long). When I flip a card, I also see when it's due - usually that tells me how new a card is, and if I "should" send it to the future by choosing a better rating.

In short, Hard is what I sometimes use to (kind of) mimic overlearning. It may defeat the purpose of the software, but again, we can never know if the SRS is "perfectly" adapted to our own forgetting curve.

This is probably the easiest way to do overlearning with AnkiDroid. What I could do is combine this with defining three swipe gestures: one for "again", one for "hard", and one for "easy". On the first day I would just gesture "again" a few times, then "hard" so that the card shows up the next day. On the next two days I could do the same, so overlearning would continue. On the fourth day, I would have to gesture "easy". If I changed the interval for "easy" to a very long time (say 30 days), I would have successfully emulated Glossika. But really, is it worth it? I would have to do muuuch swiping.

jeff_lindqvist wrote:Or just don't use Anki at all. People have learned languages before. You can cram the material in a number of ways. I think Ari created playlists full of ChinesePod lessons and just deleted stuff as he got tired of it.

This is so true. People have learned languages in all times without any gadgets. Normally I'm not an Anki "follower" either. For overlearning, I would just sit in an armchair with my Assimil or Linguaphone or whatever in hand and repeat the dialogues until they become second nature. This works without any "preparations" (i.e. feeding sentences into Anki) whatsoever. But believe me, this is not as effective as doing Glossika, as I found out at least for myself: With Glossika, there's no "position effect", that is to say that one recalls a sentence only in the context of a specific dialogue. Furthermore, without modern devices, you always have to guide yourself whether you should repeat the dialogue at hand another time or not, or whether you should revise a specific dialogue at all. With Glossika, it's just listening and speaking, be it in the armchair, on the go or - as I did this very evening - doing the dishes. Unbelievable (at least for me) but true: I revised tonight about 400 sentence pairs (which I had "overlearned" with Glossika a few weeks ago) while doing household chores and with only very few mistakes. How would I have done this with an Assimil book in my hands? When have I been able to memorize so many sentences without any effort to remember? Certainly not by reading, reciting and repeating dialogues from a book. So modern gadgets can provide considerable assistance. But unfortunately it takes a hell of a lot of effort to set up and "feed" the devices in such a way that their advantages come into effect.

Post by **jeff_lindqvist** » Thu Apr 05, 2018 11:32 pm

I agree that some tools can really help us - as long as we know how to let them work for us instead of the other way around.

I once experimented with comparing two mock cards which I had added the same day: how long it would take for #1 to move well into the future with a certain series of ratings, and when #2 was scheduled. I wrote down the info on the back of the card (e.g. Aug 1 - Hard - 1 day, Aug 2 - Hard - 1 day, Aug 3 - Hard - 2 days, Aug 5 - Hard - 2 days, Aug 7 - Hard - 3 days and so on). At some point I accidentally managed to overwrite the back info with just the original answer. :oops:

If your deck is big, it may be difficult to remember when you added a certain card, and indeed to know "why" it shows up when it does. Has it been "upgraded" to this date (due to an Easy rating), or "downgraded" (due to any of the others)? Of course, none of this is really important.

For the time being, I just give myself a lower rating to make sure the card appears rather soon. I'm still at 200-300 reviews per day for two languages where I add something more or less every day. (Further info in Huge Anki decks)

Post by **kelciour** » Fri Apr 06, 2018 7:46 pm

neumanc wrote:That's why I came up with the idea that I could create my own Glossika-like course. I have plenty of audio in the target language. I could split these files into separate sentences and number them consecutively. I could also record the translation in my native language with a microphone and split this audio file into individual numbered sentences. All this would be easy to do with a program like Audacity. The only problem would be to generate mp3 files from them, which would contain the sentences in different order similar to the GSR files.

Anki could be used to solve this problem. However, the repetition intervals would quickly become too large, so that the overlearning effect, as Glossika achieves it, would be missed. It would therefore be better to write an appropriate computer program that is capable of automatically creating GSR-like audio files from numbered audio files in the teaching language and in the target language. Each day 10 new sentence pairs would be taught and 30 already known sentences from the last three days would be repeated. Such a program would certainly not be too complicated.

Does anyone of you know if such a program already exists?

It does exist. Some time ago ufff wrote it to generate new GMS & GSR files for the English-Russian Glossika course, because some sentences had been translated incorrectly into Russian. It's a script written in Python 3. Take a look at the scripts in this src folder.
The header files for GMS & GSR files can be generated using the code from "get_grs_headers.py" & "gen_gsm_headers.py" (it's here). They won't work out of the box, as the IVONA Speech Cloud Beta service has been closed and replaced by Amazon Polly. There seem to be two ways to use Amazon Polly to batch-generate audio: the AWS CLI and the AWS SDK for Python. The live demo can be found here (it won't work without an AWS account). Another option is to use Google Cloud Text-to-Speech instead. In that case, install it by following Quickstart: Text-to-Speech first (step 2 can be skipped) and then Text-to-Speech API Client Libraries. The updated versions of "get_grs_headers.py" and "gen_gsm_headers.py" for this option can be found here.
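For the AWS SDK for Python route, batch generation might look roughly like the sketch below. It is not taken from ufff's scripts: the file naming scheme, the output folder and the function names are made up for illustration, and it assumes boto3 is installed and AWS credentials are already configured.

```python
from pathlib import Path


def numbered_filename(index, lang):
    """Hypothetical naming scheme: 0001_ru.mp3, 0002_ru.mp3, ..."""
    return f"{index:04d}_{lang}.mp3"


def synthesize_sentences(sentences, lang, voice_id, out_dir="tts_out"):
    """Save one mp3 per sentence using Amazon Polly (needs an AWS account)."""
    import boto3  # third-party AWS SDK for Python, imported only when needed

    polly = boto3.client("polly")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(sentences, start=1):
        response = polly.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id
        )
        path = out / numbered_filename(i, lang)
        # The audio comes back as a stream in the "AudioStream" field.
        path.write_bytes(response["AudioStream"].read())
        paths.append(path)
    return paths
```

A call such as `synthesize_sentences(["Hello."], "en", "Joanna")` would then write `tts_out/0001_en.mp3`; the voice has to be one that Polly offers for the language in question.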

Post by **neumanc** » Sat Apr 07, 2018 9:35 am

kelciour wrote:It does exist. Some time ago ufff wrote it to generate new GMS & GSR files for the English-Russian Glossika course, because some sentences had been translated incorrectly into Russian. It's a script written in Python 3. Take a look at the scripts in this src folder.
The header files for GMS & GSR files can be generated using the code from "get_grs_headers.py" & "gen_gsm_headers.py" (it's here). They won't work out of the box, as the IVONA Speech Cloud Beta service has been closed and replaced by Amazon Polly. There seem to be two ways to use Amazon Polly to batch-generate audio: the AWS CLI and the AWS SDK for Python. The live demo can be found here (it won't work without an AWS account). Another option is to use Google Cloud Text-to-Speech instead. In that case, install it by following Quickstart: Text-to-Speech first (step 2 can be skipped) and then Text-to-Speech API Client Libraries. The updated versions of "get_grs_headers.py" and "gen_gsm_headers.py" for this option can be found here.

Thank you so much for your effort, kelciour! So it seems there are two different Python scripts out there for emulating Glossika, this one and the one created by Axon. I have to look very deeply into this. As I am new to Python, this may take some time. I have just installed Python 3.6.5, the latest version. The next step will be watching some tutorials on YouTube in order to get going with this programming language. After that, I hope that I will be able to at least comprehend what these algorithms do. It may well be that I will have to ask you some (or several) questions about these scripts. Right now, I cannot do more than thank you. If I make any progress, I will update this thread.

Post by **kelciour** » Sat Apr 07, 2018 10:05 am

No problem. I can't recommend any tutorials, but maybe some of these will do:

Post by **neumanc** » Sun Apr 08, 2018 1:48 pm

Progress report: Unfortunately, I couldn't get the scripts running out of the box. Don't know why yet. I have the impression that Axon's script is written in Python 2, not in Python 3. However, I succeeded in converting an mp3 file into a wave file using Python! This is a small breakthrough, because Python alone is not at all able to manipulate mp3 files. In order to achieve this, the pydub library has to be installed. But this can only be done if you learn how to use pip, Python's package installer. Furthermore, you need to get either ffmpeg or libav from the Internet, which pydub in turn relies on. Never heard of these before? Don't know how to install these, or which version to get? Me neither until today! You just have to try it out. Why does this have to be so complicated? Now that everything is set up, I'll have to learn how to code with Python and how to use pydub's modules in order to comprehend what these scripts were designed to do. Then I'll have to fix the scripts. I don't know if I have the time to do this. I'll keep trying.
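For anyone following along, the mp3-to-wave conversion described above can be sketched as follows. This is not the code from the post; the function names are made up for illustration. It assumes pydub has been installed with `pip install pydub` and that ffmpeg or libav is available on the system path.

```python
from pathlib import Path


def wav_path_for(mp3_path):
    """Return the output path: same name, .wav instead of .mp3."""
    return Path(mp3_path).with_suffix(".wav")


def mp3_to_wav(mp3_path):
    """Decode an mp3 file and write it next to the original as a wave file."""
    from pydub import AudioSegment  # third-party; relies on ffmpeg or libav

    out = wav_path_for(mp3_path)
    audio = AudioSegment.from_mp3(str(mp3_path))
    audio.export(str(out), format="wav")
    return out
```

Calling `mp3_to_wav("lesson01.mp3")` should then produce `lesson01.wav` in the same folder.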

Post by **neumanc** » Mon Apr 09, 2018 6:32 pm

Progress Report: Success!!!! Done!!!

I worked all weekend and today to learn Python. Since I could neither make these two scripts work nor understand them, I had to write the GSR emulating program myself. And guess what: It worked out, at least I think so. That was fun! I enjoyed coding so much that I almost have the feeling that I have missed my calling! Maybe I should have studied computer science instead of law. I must point out that this success would not have been possible without Axon's explanation of what the GSR algorithm really does. Many thanks to Axon at this point!

I have tested my script with up to 60 sentence pairs. But I don't see any reason why it should not be able to process 1,000 or 3,000 or even more sentence pairs. In contrast to Glossika, the user is able to determine how long the pause between two sentences should be. The user can choose between fixed pauses, the length of which he can determine as desired, or pauses which correspond to the length of the respective sentence in the target language. This means that you don't have to press the pause button of your mp3 player in order to follow the audio files, even in case of very long sentences. The script emulates the pattern of the old GSR files, i.e. every day (up to) 40 known sentences are repeated first, then 10 new sentences are practiced. As with Glossika, the individual spaced repetition files have (up to) 180 "reps" (i.e. the first four and the last four have fewer "reps").
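The script itself is not posted here, so the following is only a guess at how such a daily pattern could be computed. It assumes that each day reviews the sentences introduced on the previous four days (oldest first) and then introduces ten new ones, and that review-only days follow once the new sentences run out. It does not try to reproduce the exact number of "reps" per file, and all names are hypothetical.

```python
def daily_schedule(n_sentences, new_per_day=10, review_days=4):
    """Return a list of (review_ids, new_ids) tuples, one tuple per day.

    Assumption: each day reviews the batches introduced on the previous
    `review_days` days, then introduces one new batch of sentences.
    """
    batches = [
        list(range(start, min(start + new_per_day, n_sentences + 1)))
        for start in range(1, n_sentences + 1, new_per_day)
    ]
    days = []
    # Extra review-only days at the end let the last batches be repeated too.
    for day in range(len(batches) + review_days):
        first = max(0, day - review_days)
        last = min(day, len(batches))
        review = [s for batch in batches[first:last] for s in batch]
        new = batches[day] if day < len(batches) else []
        days.append((review, new))
    return days


def pause_ms(l2_duration_ms, mode="match", fixed_ms=3000):
    """Pause length: a fixed value, or as long as the target-language clip."""
    return fixed_ms if mode == "fixed" else l2_duration_ms
```

Under these assumptions the first four days have fewer than 40 review sentences and the last four days have no new ones, which would fit the remark that the first and last four files are shorter.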

Besides the GSR-like spaced repetition files, the script also creates "mass sentences files" according to the familiar pattern of the GMS-A, GMS-B and GMS-C files. However, instead of files of 50 sentences each, all sentences are merged into one file for learning new sentences (L1, L2, L2), another file for translation training (L1, L2) and a third file for reviewing only the sentences in the target language (L2).
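The three orderings described above can be written down compactly. The sketch below only builds the playback order as (language, sentence number) pairs; the dictionary keys are hypothetical labels, not names used by the actual script.

```python
def mass_file_orders(n_sentences):
    """Return the clip order for the three mass files as (lang, id) pairs."""
    ids = range(1, n_sentences + 1)
    return {
        # Learning new sentences: L1 once, then L2 twice.
        "learn": [(lang, i) for i in ids for lang in ("L1", "L2", "L2")],
        # Translation training: L1 prompt, then the L2 answer.
        "translate": [(lang, i) for i in ids for lang in ("L1", "L2")],
        # Review: target language only.
        "review": [("L2", i) for i in ids],
    }
```

Each list could then be turned into one long audio file by concatenating the matching clips in that order.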

With the help of the script, any sentence pairs can be "overlearned". I myself intend to translate my resources (Assimil, Linguaphone, etc.) orally and record myself in doing so. Then I will split the audio files for the target language and the teaching language into individual sentences and number them in the same order. Afterwards, a GSR-like program can be created with the script within a few minutes. In this way, a GSR-like program can be created from any audio resource.
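Since everything depends on the two sets of clips being numbered identically, a small check before generating files may save some trouble. The sketch below assumes, purely for illustration, that each file name starts with its sentence number (for example `001_en.mp3` and `001_de.mp3`).

```python
import re


def pair_by_number(l1_names, l2_names):
    """Pair file names that start with the same sentence number."""

    def index(names):
        numbered = {}
        for name in names:
            match = re.match(r"(\d+)", name)
            if match:
                numbered[int(match.group(1))] = name
        return numbered

    l1, l2 = index(l1_names), index(l2_names)
    # Numbers present on one side only point to a splitting or naming error.
    missing = sorted(set(l1) ^ set(l2))
    if missing:
        raise ValueError(f"sentence numbers without a partner: {missing}")
    return [(l1[n], l2[n]) for n in sorted(l1)]
```

If a sentence is missing on either side, the function raises an error that lists the unmatched numbers instead of silently pairing the wrong clips.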

If the moderators don't object for legal reasons (the idea of the GSR algorithm itself isn't legally protected, is it?), I am willing to publish my script on this forum for everybody to use. If necessary, I could change the algorithm a little bit. Anyway, as long as I haven't heard back from the moderators, I will abstain from publishing it, but the script is "privately available", if you know what I mean by that.

Post by **scivola** » Mon Apr 09, 2018 7:40 pm

I, for one, would love to see your script, neumanc. It sounds like you've done some excellent work!
