So as readers of this log know, I've incorporated regular writing into my routine. Since about January this year I have been practicing a variety of writing styles:
1) Learning new grammar points and explicitly practicing them by making up random sentences to use them
2) Writing about daily events and topics
3) Translating English to Korean
4) Writing out set stories in Korean (I did a fairytale)
The reason for my focus on output was that in my last review (circa December, or whenever it was last year) I came to the realization that I was still making many very fundamental mistakes in Korean grammar which other learners had managed to overcome. Up to that point I had taken an almost exclusively input-focused approach to learning. I *did* look up grammar points fairly frequently, but only enough to understand the text at hand - I did not practice them or read any further so long as I was able to keep working through input.
So after some consideration I decided that doing roughly 33% output / 66% input, rather than 90%+ input, might fix this issue.
I have *felt* that this change has been a benefit, but I want to find some way of 'measuring' this if possible.
LIGHT BLUE LINE: The above graph represents some 60+ individual submissions (of one or more sentences each). For each data point (i.e. one set of sentences submitted to Lang8, LingQ or wherever for correction) I count how many grammatical errors I made, then divide that by the number of words in the submission to reach a "grammatical errors per word" statistic.
This is plotted in light blue. It represents the original data.
GREEN LINE: However, the problem is that I don't write the same amount of text each time. Sometimes the word count is very low and other times I may write a whole bunch. So I can't really start drawing trend lines on the original data, because a data point representing 20 words is not as reliable as one representing 100 words.
So you will note there is a plotted green line. I took the original data and put it into buckets of 80+ words. So instead of having 4 successive entries of 20 words each, I have one statistic which represents 80 words. Similarly, if I had two successive entries of 70 words and 20 words, I'd combine them into one representing 90 words.
As you can tell, the green line already smooths out the data somewhat.
RED LINE: Then a trend line - a moving average of 3 - has been fitted to the data points on the green line.
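For anyone curious, the bucketing and smoothing steps above can be sketched in a few lines of Python. This is a rough reconstruction of what I did in my spreadsheet, not the actual script; the submission numbers below are made-up examples, but the 80-word threshold and 3-point window are the real ones described above:

```python
# Each submission: (word_count, error_count). Example data only.
submissions = [(20, 3), (20, 2), (70, 6), (20, 1), (100, 5),
               (90, 3), (40, 2), (50, 1)]

def bucket_rates(data, min_words=80):
    """Merge successive submissions until each bucket holds 80+ words,
    then compute errors-per-word for the bucket (the green line)."""
    rates, words, errors = [], 0, 0
    for w, e in data:
        words += w
        errors += e
        if words >= min_words:
            rates.append(errors / words)
            words, errors = 0, 0
    return rates

def moving_average(xs, n=3):
    """Simple n-point moving average (the red trend line)."""
    return [sum(xs[i:i + n]) / n for i in range(len(xs) - n + 1)]

green = bucket_rates(submissions)
red = moving_average(green)
```

Note that any leftover submissions that never reach 80 words are simply dropped from the green line, which matches how a trailing short bucket would behave.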
So it is all incredibly noisy - there were different correctors at different stages of my language learning, and some may have applied more corrections than others. But I wanted to see what the output would be!
ADDITIONAL NOTES:
Now that I've described what is going on in the graph, here are some further notes before I make observations:
* I had used Lang8 a *long* time ago (back in February last year) for a number of corrections. I have not included all of them, but I took a few just to show my scores from right at the start of my language learning process.
* There is a big gap until July 2016 and a couple of little gaps late in the year. I did no writing during these times, as far as I recall or have record of.
* Prior to about August 2016 all my writing was really simple - Subject-Object-Verb for most sentences. I did not attempt to employ any complex grammar, and my sentences were typically very short.
* From late 2016 onward I began to employ a variety of grammar points when writing to break out of simple text and have continued to do so.
* From January 2017 onward I began to write very regularly, often longer entries than before (which is why the green data points occur more frequently - it only took 2 submissions to accrue 80 words).
* From January 2017 onward I began to explicitly practice grammar points using random sentences. I did not include any of these submissions on the graph.
So the data is very noisy, as it depends on the corrector how much grammar might be adjusted. Additionally, how I scored grammar mistakes is somewhat arbitrary, even though I made up a set of rules to try to ensure consistency. On some occasions it was really difficult to tell how much I got wrong when things were considerably reworded.
OBSERVATIONS:
There does appear to be a slight downward trend, but given how noisy the data is and how slight the trend is, I think it's quite reasonable to conclude the graph itself is telling me nothing. Perhaps the trend would have to be more pronounced to confidently draw any conclusion.
Something of note is that the peaks (looking at the raw-data blue line) are getting smaller over time, which at least might indicate more *consistency* in my writing.
The graph shows nothing of the makeup of my mistakes, but I did note that last year I had a lot of obviously misplaced (or missing) subject, object and topic particles. My mistakes on this front have dropped off considerably, but the more obscure uses of 은/는 in a comparative function are cropping up much more.
So I think there has been progress, but I am still mulling over my conclusion on this:
Has daily output in Korean (in the form of writing) proven effective at addressing the grammar related problems I found last review? Is it something I should continue doing? If so, is there a modification to it which might improve the effectiveness of it?