statistics for Jorkens


Post by mcthulhu » Thu May 06, 2021 1:49 am

(also posted on Reddit)

As of the latest update to the source code at https://github.com/mcthulhu/jorkens, Jorkens now saves statistics for the latest session to the database when it closes. These statistics consist of:

    the current date and time;

    the language being read;

    the length of time spent reading, in minutes (caveat: Jorkens does not currently check for user inactivity, so if you accidentally leave it running overnight, your statistics will be thrown off);

    the number of words read (it is assumed for the sake of simplicity that the pages displayed on the screen have been read);

    a list of the words looked up (a number might have been enough right now, but someday I'd like to try analyzing changes over time in the level of difficulty of unknown words);

    the number of words added to the local glossary from this portion of the text;

    the number of new flashcards created;

    the number of flashcards gotten right, if flashcards were reviewed during this session;

    the vocabulary size of this portion of the text, after stopword removal and lemmatization;

    the average sentence length for this portion of the text; and

    the Type-Token Ratio (vocabulary size divided by total number of words).

The last three items can be considered different measures of reading difficulty. It might be interesting to look at them at some point in connection with reading speed.
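For anyone curious how these three measures fit together, here is a rough Python sketch. It is not the code Jorkens uses: the function name, the tiny English stopword list, and the naive sentence splitting are all assumptions made for illustration, and it skips lemmatization entirely. It does follow the definition above, with the Type-Token Ratio taken as vocabulary size divided by total number of words.

```python
import re

# Tiny illustrative stopword list (an assumption); a real run would use a
# proper per-language list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "it", "is"}


def text_metrics(text, stopwords=STOPWORDS):
    """Return vocabulary size, average sentence length, and type-token ratio."""
    # Naive sentence split on ., ! and ? (abbreviations will confuse it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Tokens are lowercased runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Vocabulary: distinct tokens after stopword removal (no lemmatization here).
    vocabulary = {t for t in tokens if t not in stopwords}

    average_sentence_length = len(tokens) / len(sentences) if sentences else 0.0
    type_token_ratio = len(vocabulary) / len(tokens) if tokens else 0.0
    return {
        "vocabulary_size": len(vocabulary),
        "average_sentence_length": average_sentence_length,
        "type_token_ratio": type_token_ratio,
    }


if __name__ == "__main__":
    print(text_metrics("The cat sat on the mat. The dog sat."))
```

Without lemmatization, inflected forms count as separate vocabulary items, so the numbers for heavily inflected languages would come out higher than with the approach described above.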

The next step will be to add options to display some of this information in an intelligible way, probably using chart.js, at least to start with. I'll probably begin with the number of consecutive days with at least some time spent reading (i.e., not breaking the chain); average and total time spent reading; average and total words read; and reading speed. These would all need to be by language, of course.
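The per-language figures listed above could be computed along these lines. This is only a sketch in Python for illustration; the record keys ("language", "date", "minutes", "words") and the function names are hypothetical and do not reflect the actual database columns.

```python
from collections import defaultdict
from datetime import date, timedelta


def current_streak(session_dates, today):
    """Count consecutive days with at least one session, ending at `today`."""
    days = set(session_dates)
    streak = 0
    day = today
    while day in days:
        streak += 1
        day -= timedelta(days=1)
    return streak


def summarize(sessions, today):
    """Group session records by language and compute totals and averages.

    Each session is a dict with the hypothetical keys
    "language", "date", "minutes" and "words".
    """
    by_language = defaultdict(list)
    for session in sessions:
        by_language[session["language"]].append(session)

    summary = {}
    for language, rows in by_language.items():
        total_minutes = sum(r["minutes"] for r in rows)
        total_words = sum(r["words"] for r in rows)
        summary[language] = {
            "streak_days": current_streak((r["date"] for r in rows), today),
            "total_minutes": total_minutes,
            "average_minutes": total_minutes / len(rows),
            "total_words": total_words,
            "average_words": total_words / len(rows),
            # Reading speed in words per minute; 0.0 if no time was logged.
            "words_per_minute": total_words / total_minutes if total_minutes else 0.0,
        }
    return summary


if __name__ == "__main__":
    demo = [
        {"language": "Russian", "date": date(2021, 5, 5), "minutes": 10, "words": 1000},
        {"language": "Russian", "date": date(2021, 5, 6), "minutes": 20, "words": 2000},
    ]
    print(summarize(demo, today=date(2021, 5, 6)))
```

The streak here only counts a chain that reaches the given day, so a language with no session on that day reports a streak of zero; a chart of the longest streak ever would need a different loop.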

In the meantime, the data will be accumulating in the database. A tool like DB Browser for SQLite can be used to export the "sessionstats" table to a .csv file for those who would like to use external tools like Excel to do their own analysis.
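For those who would rather script the export than use a GUI, something like the following Python sketch should do the same job with only the standard library. The "sessionstats" table name is the one mentioned above; the function name and the file paths are assumptions, so point it at wherever your Jorkens database file actually lives. It selects every column, so it does not need to know the column names in advance.

```python
import csv
import sqlite3


def export_sessionstats(db_path, csv_path):
    """Dump the whole sessionstats table to a CSV file with a header row."""
    connection = sqlite3.connect(db_path)
    try:
        cursor = connection.execute("SELECT * FROM sessionstats")
        # cursor.description holds one 7-tuple per column; item 0 is the name.
        headers = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(headers)
            writer.writerows(cursor)
    finally:
        connection.close()
    return headers


# Example call (both paths are placeholders):
# export_sessionstats("path/to/jorkens-database", "sessionstats.csv")
```

The resulting file opens directly in Excel or any other spreadsheet program.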