MrWarper's self-checking exercise generator

mrwarper
Orange Belt
Posts: 106
Joined: Sat Jul 18, 2015 4:06 pm
Languages: A bunch, in various stages
Language Log: http://how-to-learn-any-language.com/fo ... ?TID=39905

Re: MrWarper's self-checking exercise generator

Postby mrwarper » Sat May 21, 2022 10:56 am

Sorry, read the last posts and got busy.
Cainntear wrote:What I mean is about making it "semantic" in the sense that the file tells us the information, not the formatting.[...]

If you look at the likes of text2qti, the AMC-text
[...]
In theory the same format could be used to import directly into anything (including the likes of your and my apps)[...]

If I choose to render this as a numbered list [...]. Or maybe I stick them in a table so that I get columns to print out.
I try to think of my data in these format-agnostic ways.
Data format agnosticism is always a good thing.

However, while I am all for 'semantic' over 'formatting'-based files, formatting itself rests on an arbitrary convention: something in the file, e.g. a "line break" or other characters in the data stream, is interpreted as a control code and used to actually break lines and so on -- so in the end there is no big difference...

From a slightly more abstract point of view, what the generator does (for exercises, but also in general) is little more than string search and replace: this (a "dynamic element" or whatever) is found in the input, i.e. the text as typed by the exercise writer or imported from a file, and it is replaced in the output with that (<HTML code>).

All that thus matters is what delimiters we choose for each exercise dynamic element, and / or any sub-parts. I think regular expressions allow for enough flexibility both in input and output, and a template system would give us even more, if used to read or store our delimiters of choice.
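
As a minimal sketch of the search-and-replace idea, assuming a hypothetical `{{answer}}` delimiter for gaps (nobody in this thread has settled on one) and a made-up `data-answer` attribute for the checking script to read, a regular expression substitution in Python could look like this:

```python
import html
import re

# Hypothetical delimiter: {{answer}} marks a gap the student must fill in.
GAP = re.compile(r"\{\{(.+?)\}\}")


def render_gaps(text: str) -> str:
    """Replace every {{answer}} marker with an HTML text input."""

    def to_input(match):
        # Escape the answer so quotes or angle brackets cannot break the attribute.
        answer = html.escape(match.group(1), quote=True)
        return f'<input type="text" data-answer="{answer}">'

    return GAP.sub(to_input, text)


print(render_gaps("She {{goes}} to school every day."))
```

Changing the delimiter only means changing the `GAP` pattern, which is the kind of thing a template or settings file could store.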
Cainntear wrote:In text2qti and most similar things, a single line break doesn't have any independent meaning -- instead it's having specific characters at the start of the line that counts. This means you can break a line if you need to stop it getting too long -- see question 11 above (text2qti treats any line starting with space or tab as being a continuation of the previous line, and it replaces the newline and all initial space with a single space in the output).
Just like any number of spaces and line breaks in HTML source are condensed to a single space for rendering.

OK, the problem with considering single line breaks in 'simple' input is that they are too likely to end up in the input by accident -- either because people tend to insert them before long lines wrap automatically in text areas, or because it should be possible to import plain text files which may include them. But delimiters consisting of specific x/y/z strings at the beginning of a line are really nothing more than "line break + x/y/z" strings, so for these to be useful we should make sure we're using stuff that is less likely to be accidental. Like perhaps, double line breaks ; )
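
To illustrate, here is a small sketch in Python that treats only double line breaks as meaningful and folds accidental single line breaks into spaces, much as HTML rendering does. The function name `split_blocks` is made up for the example, and this is not how text2qti itself works:

```python
import re
from typing import List


def split_blocks(source: str) -> List[str]:
    """Split on blank lines; fold single line breaks inside a block into spaces."""
    normalized = source.replace("\r\n", "\n").strip()
    # A separator is a line break, optional whitespace, and another line break.
    blocks = re.split(r"\n\s*\n", normalized)
    # Inside a block, a lone line break (plus surrounding spaces) becomes one space.
    return [re.sub(r"\s*\n\s*", " ", block).strip() for block in blocks if block.strip()]


typed = "First question, typed\nover two lines.\n\nSecond question."
print(split_blocks(typed))
```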
MrWarper while HTLAL is offline.

Re: MrWarper's self-checking exercise generator

Postby mrwarper » Sat May 21, 2022 11:07 am

zenmonkey wrote:Ok, I get now that you want to have code that is light, transportable, and device or connection agnostic.
Yes, agnosticism of most kinds is generally important for my projects -- which does not mean stuff meant to cater to some special needs can't be added later on. However, 'special' needs will never be a priority for me, especially if that means leaving someone else out. But that does not need to happen. More on this below.
zenmonkey wrote:I do hope you figure out how to include audio files because a lot of useful exercises are based on that.
Of course, a lot of exercises are based on having some audio recording available, just like in the examples given (did you have a look at them, and the comments about 'missing' audio?). If that is the case, it is obvious that the recording audio file must be linked (whether using normal <a href=""> links, <audio> elements or what have you) so that students can play it, and the exercises make sense.

What I fail to see is, how that relates to the exercises themselves in such a way that somehow "including" (embedding?) audio files in the output is necessary -- for me linking is more than enough.

Then again, I may be missing some really good idea here, and / or some exercise primitive that I overlooked. If so, please explain what you think I am missing regarding audio.
zenmonkey wrote:And I'm curious, do you plan to manage languages that typically did not show up correctly in those old devices because the codepages were not installed and required some sort of in-line font declaration? I remember running into issues with Tibetan, Urdu and Assyrian in 2014. I never did get an old phone to show Arabic natively.

BTW, should you be declaring your DTD?
I thought that the current standard is that if it is undeclared it will be read as HTML5 and then you may have some obsolete identifiers.
I think these are good examples of catering to special needs that should not get in the way when properly done:

Leaving aside specific "codepages" and "inline" stuff considerations for the time being, exercise writers should generally expect the generator to output HTML that is fully functional. Naturally, this includes, for example, the <head> section of the output file, which must be present, but with which they should generally not need to tinker.

But if that part of the output HTML is not generated from user input (other than maybe its title, please bear with me), where will all the code in it come from? I think it is easiest to inject generated exercises into a ready-made functional HTML template. This, in turn, leads directly to the idea of letting users provide their own template as well, and having exercises injected directly into whatever they supply instead, if that works better for them. They should be in a better position to know their exact needs anyway, and it would avoid the need to edit output afterwards.

Obviously, a number of pre-made templates could be made readily available if specific reasons to create them are identified.
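
A rough sketch of the template idea, using Python's standard `string.Template`. The placeholder names `$title`, `$lang` and `$exercises` are assumptions made for this example, and a user-supplied template would simply replace `DEFAULT_TEMPLATE`. The doctype and charset live in the template, so exercise writers never have to touch them:

```python
import html
from string import Template

# Stand-in for a ready-made template; a user-supplied one could replace it.
DEFAULT_TEMPLATE = Template("""<!DOCTYPE html>
<html lang="$lang">
<head>
<meta charset="utf-8">
<title>$title</title>
</head>
<body>
$exercises
</body>
</html>
""")


def build_page(exercises_html: str, title: str, lang: str = "en",
               template: Template = DEFAULT_TEMPLATE) -> str:
    """Inject already-generated exercise HTML into a page template."""
    return template.substitute(
        title=html.escape(title),   # the title is plain text, so escape it
        lang=lang,
        exercises=exercises_html,   # already HTML, inserted as-is
    )


print(build_page("<p>Exercise 1 goes here.</p>", "Unit 3 & review"))
```

One caveat of this particular choice: a literal `$` in a custom template has to be written as `$$`.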

Cainntear
Black Belt - 3rd Dan
Posts: 3469
Joined: Thu Jul 30, 2015 11:04 am
Location: Scotland
Languages: English(N)
Advanced: French,Spanish, Scottish Gaelic
Intermediate: Italian, Catalan, Corsican
Basic: Welsh
Dabbling: Polish, Russian etc

Re: MrWarper's self-checking exercise generator

Postby Cainntear » Sat May 21, 2022 2:25 pm

mrwarper wrote:
zenmonkey wrote:I do hope you figure out how to include audio files because a lot of useful exercises are based on that.
Of course, a lot of exercises are based on having some audio recording available, just like in the examples given (did you have a look at them, and the comments about 'missing' audio?). If that is the case, it is obvious that the recording audio file must be linked (whether using normal <a href=""> links, <audio> elements or what have you) so that students can play it, and the exercises make sense.

What I fail to see is, how that relates to the exercises themselves in such a way that somehow "including" (embedding?) audio files in the output is necessary -- for me linking is more than enough.

You fail to see the advantage of having a play button right next to the question, rather than clicking and loading a new page, possibly leading to the loss of your progress so far...?

Including vs. linking audio, etc.

Postby mrwarper » Sat May 21, 2022 6:46 pm

I have been clicking on links to audio files for more than twenty years now, and the files have always loaded just fine in one external player or another -- never once in place of the current page in the browser -- so you really made me wonder what you were talking about...

Sure enough, if/when an audio file is missing and you click on a link to it, the browser will report the problem -- maybe replacing the current page with a 404 error or something (hence your "loss of progress"?) -- the exact way depends on your browser.

I assume if you have a 'proper' <audio> element pointing to a missing audio file you will always get either no audio controls or an error icon from the start, or some error message when you click on them and the file is not found, but never a "not found" page replacing the current one?

My guess is what you get exactly will depend on the browser as well, but even if the above is admittedly a much more graceful way to handle missing files, pointing to missing audio will still prevent students from completing exercises in any meaningful way, so the bottom line when writing exercises for real is, make sure your audio is not missing.

All of that said, of course I do see an advantage in having the browser generate audio controls by itself if it knows how to play the file, instead of simply offering to load it externally. But that's where the magic of HTML lets you get both just in case by embedding an old school "<a href..." inside an <audio>, as you pointed out right away.

What I meant, however, is that from an abstract, simple "search and replace" point of view, there is again no difference between replacing <whatever you choose to signal an external audio file> with '<a href=[...]' and with a fancy-pants '<audio><source src="[...]"><a href="[same again...]">'. It is exactly the same thing from the point of view of the exercise writer: you write the equivalent of "audio: this file.mp3" and the system generates the right code for you.

In other words, no matter what the best code is, embedding (linking, actually) external audio (or video!) as a whole does not imply any new exercise primitive, since in itself it is unrelated to any checking, programming, or "moving parts" within the logic of generating an exercise: some simple string will be replaced with another, and that's it.
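
As a sketch of that replacement, assuming a hypothetical input marker of the form `audio: some file.mp3` on a line of its own and a relative file path, the following Python turns each marker into an `<audio>` element with a plain link as fallback for browsers that cannot play it inline:

```python
import html
import re
from urllib.parse import quote

# Hypothetical marker: a whole line reading "audio: <file name>".
AUDIO_LINE = re.compile(r"^audio:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def render_audio(text: str) -> str:
    """Replace each 'audio: file' line with an <audio> element plus fallback link."""

    def to_player(match):
        name = match.group(1)
        # Percent-encode spaces etc.; assumes a relative path, not a full URL.
        src = html.escape(quote(name), quote=True)
        label = html.escape(name)
        return (f'<audio controls><source src="{src}">'
                f'<a href="{src}">{label}</a></audio>')

    return AUDIO_LINE.sub(to_player, text)


print(render_audio("Listen and answer.\naudio: this file.mp3"))
```

Swapping in better markup later only means editing `to_player`; the exercise source stays untouched.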

What I thought was meant by "including" is to somehow take or otherwise handle parts of external audio data streams from within JS/HTML. That I won't even consider, unless a couple of really bright ideas for exercises that force it come up first.

Re: Including vs. linking audio, etc.

Postby Cainntear » Sun May 22, 2022 9:56 am

mrwarper wrote:I have been clicking on links to audio files for more than twenty years now, and the files have always loaded just fine in one external player or another -- never once in place of the current page in the browser -- so you really made me wonder what you were talking about...

Sure enough, if/when an audio file is missing and you click on a link to it, the browser will report the problem -- maybe replacing the current page with a 404 error or something (hence your "loss of progress"?) -- the exact way depends on your browser.

I've never configured my browser to do anything with audio files, so if I click on a link to an mp3, I get a "page" consisting of an HTML5 audio element in the centre of the screen, and that will be most people's experience as it's the default behaviour in most browsers now.

mrwarper wrote:I assume if you have a 'proper' <audio> element pointing to a missing audio file you will always get either no audio controls or an error icon from the start, or some error message when you click on them and the file is not found, but never a "not found" page replacing the current one?

Yup. My recollection is that most (if not all) browsers give you the embedded player bar with an error in it.

mrwarper wrote:What I meant, however, is that from an abstract, simple "search and replace" point of view, there is again no difference between replacing <whatever you choose to signal an external audio file> with '<a href=[...]' and with a fancy-pants '<audio><source src="[...]"><a href="[same again...]">'. It is exactly the same thing from the point of view of the exercise writer: you write the equivalent of "audio: this file.mp3" and the system generates the right code for you.


Search and replace... this is the final thing I needed to understand the difference between our design philosophies here.

You are right -- if you do it as search-and-replace, then there is no difference, because search-and-replace makes the data file the master document. After all you can't do search and replace on an output template because you don't know in advance the number of questions, number of multiple-choice options etc.

The approach I take is parsing the source file into an abstract object representation -- I'll represent it here in XML-style syntax, but I wouldn't generate this, it just represents the objects I'd have in code:

Code:

<quiz>
  <title>An example quiz</title>
  <question type="MCQ">
    <prompt>What colour is the sky?</prompt>
    <options>
      <option correct="false">Green</option>
      <option correct="false">Red</option>
      <option correct="true">Blue</option>
    </options>
  </question>
</quiz>


Then I would generate appropriate HTML corresponding to each structure.

This makes the generation engine far more flexible; for example you could have nesting and grouping to allow the easy generation of reading exercises with a single passage and multiple questions.
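
To make the parse-then-generate approach concrete, here is a minimal sketch in Python. The class names mirror the XML-style tags above, but they, the `data-correct` attribute and the chosen HTML are illustrative assumptions, not the actual code of either app:

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Option:
    text: str
    correct: bool = False


@dataclass
class Question:
    prompt: str
    options: List[Option] = field(default_factory=list)


@dataclass
class Quiz:
    title: str
    questions: List[Question] = field(default_factory=list)


def render_question(number: int, question: Question) -> str:
    """One multiple-choice question as a fieldset of radio buttons."""
    items = "\n".join(
        f'    <li><label><input type="radio" name="q{number}" '
        f'data-correct="{str(option.correct).lower()}"> {escape(option.text)}</label></li>'
        for option in question.options
    )
    return (f"<fieldset>\n  <legend>{escape(question.prompt)}</legend>\n"
            f"  <ul>\n{items}\n  </ul>\n</fieldset>")


def render_quiz(quiz: Quiz) -> str:
    """Generate HTML from the object structure, one piece per object."""
    body = "\n".join(render_question(n, q) for n, q in enumerate(quiz.questions, start=1))
    return f"<h1>{escape(quiz.title)}</h1>\n{body}"


quiz = Quiz("An example quiz", [
    Question("What colour is the sky?",
             [Option("Green"), Option("Red"), Option("Blue", correct=True)]),
])
print(render_quiz(quiz))
```

Because the renderer only sees objects, the same `Quiz` could be rendered as a numbered list, a table, or another export format by writing a different render function.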

Re: Including vs. linking audio, etc.

Postby mrwarper » Sun May 22, 2022 4:51 pm

Cainntear wrote:I've never configured my browser to do anything with audio files, so if I click on a link to an mp3, I get a "page" consisting of an HTML5 audio element in the centre of the screen, and that will be most people's experience as it's the default behaviour in most browsers now.
I tried Firefox on a different machine and you're right. I can only imagine this started happening so long ago that I forgot I had configured my usual system to behave.

I just can't stand how modern browsers constantly try to become some kind of hyper-bloated universal viewer and show you huge PDFs and all sorts of files "without downloading them" (and yes, often replacing the current page!), so I have grown used to just saving or opening non-HTML links in another tab when on somebody else's computer, and not even thinking about it any more.

Given enough time, you do these things on autopilot and effectively forget that you do something different. Mystery solved, and easy to fix by improving the replacement code. Thank you for making me snap out of it.
Cainntear wrote:My recollection is that most (if not all) browsers give you the embedded player bar with an error in it.
Thank you, I had quickly put together a file with an <audio> container and such to test and that's what I got, but you can never be sure about all other systems, which is an important reason to remain agnostic in many ways.
Cainntear wrote:Search and replace... this is the final thing I needed to understand the difference between our design philosophies here.
Yes, I guess we never emphasize enough --assuming we even mention-- what is obvious to us : /
Cainntear wrote:You are right -- if you do it as search-and-replace, then there is no difference, because search-and-replace makes the data file the master document. After all you can't do search and replace on an output template because you don't know in advance the number of questions, number of multiple-choice options etc.

The approach I take is parsing the source file into an abstract object representation [...] Then I would generate appropriate HTML corresponding to each structure.
This is an interesting approach, but I think the fewer stages and intermediate structures are created between input and output, the better. It also means less coding : )

Regarding the keeping of simple texts as 'master' documents, I think that's the way to go because writers' time should be more valuable when compared to whatever can be generated automatically. It would be pointless to have exercises rewritten just to get newer or better HTML if they can be simply generated again with a couple of clicks using the same 'source'.
Cainntear wrote:This makes the generation engine far more flexible; for example you could have nesting and grouping to allow the easy generation of reading exercises with a single passage and multiple questions.
This sounds intriguing. Could you please provide a simple example? Thank you,

Re: Including vs. linking audio, etc.

Postby Cainntear » Mon May 23, 2022 6:22 pm

mrwarper wrote:
Cainntear wrote:You are right -- if you do it as search-and-replace, then there is no difference, because search-and-replace makes the data file the master document. After all you can't do search and replace on an output template because you don't know in advance the number of questions, number of multiple-choice options etc.

The approach I take is parsing the source file into an abstract object representation [...] Then I would generate appropriate HTML corresponding to each structure.
This is an interesting approach, but I think the fewer stages and intermediate structures are created between input and output, the better. It also means less coding : )

The goal of a computer program is to stop you having to do something multiple times, though -- take the time to program it once, and then it's handled automatically in the future. The more the program does, the less you have to think about the format as you write the questions, and the more you can focus on the logic of the task being set.

mrwarper wrote:Regarding the keeping of simple texts as 'master' documents, I think that's the way to go because writers' time should be more valuable when compared to whatever can be generated automatically. It would be pointless to have exercises rewritten just to get newer or better HTML if they can be simply generated again with a couple of clicks using the same 'source'.

Sorry, bad choice of wording on my part. The text files are of course the master document, but in the same way that an HTML file is. It doesn't contain the full details of implementation, and the browser and any attached CSS files can adapt it.

My point is that the more you're thinking in terms of layout (search-and-replace) and not in terms of question semantics, the harder it will be to revise the output style, because you're encoding style in your data file. If you use tags like <center> <b> <font> etc. in a webpage to explicitly set the output style, you need to edit your HTML to change the style; if instead you use <h1> <h2> ... <h5>, you can change the output formatting of the tags with an external style sheet.

mrwarper wrote:
Cainntear wrote:This makes the generation engine far more flexible; for example you could have nesting and grouping to allow the easy generation of reading exercises with a single passage and multiple questions.
This sounds intriguing. Could you please provide a simple example? Thank you,

For example

Code:

<questiongroup>
   <stimulus>Leo Pasvolsky (August 22, 1893 – May 5, 1953) was a journalist, economist, state department official and special assistant to Secretary of State Cordell Hull. He was one of the United States government's main planners for the post World War II world and "probably the foremost author of the UN Charter."[1] Thomas Connally said in his memoirs "Certainly he had more to do with writing the framework of the charter than anyone else."[2] His New York Times obituary is subtitled "Wrote Charter of World Organization." A short, rotund, mustachioed pipe smoker with a very large and round head, he joked that he might find it easier to roll than to walk. An aide compared him to the third little pig in the Three Little Pigs, Hull called him "Friar Tuck". A hardworking "one-man think tank" for Hull, he preferred to stay invisible, in the background.[3] In the words of Richard Holbrooke, he "was one of those figures peculiar to Washington – a tenacious bureaucrat who, fixed on a single goal, left behind a huge legacy while virtually disappearing from history."
  </stimulus>
  <questions>
    <question>
      <prompt>When was Leo Pasvolsky born?</prompt>
      <options>
        <option correct="true">1893</option>
        <option correct="false">1953</option>
      </options>
    </question>
    <question>
      <prompt>The New York Times article subtitled "Wrote Charter of World Organization" was written...</prompt>
      <options>
        <option correct="true">... after Pasvolsky's death</option>
        <option correct="false">... by Pasvolsky</option>
        <option correct="false">... as a birthday present by his wife</option>
      </options>
    </question>
  </questions>
</questiongroup>


[Text taken from a randomly selected Wikipedia article. Out of curiosity, I checked how far it is from "philosophy"... 23 clicks]

It's a short passage and only has 2 questions, but in a real paper-based exam, you could have most of a side of paper for each of the questions and answers. It can be pretty unwieldy on a computer screen, but the best approach is to have two separately scrollable panes side-by-side: text stimulus on the left, interactive questions on the right.

Have a look here, and click on "part 5". You'll see the side-by-side format, but without independent scrolling, and if you work through the questions, you'll see how intrusive the lack of independent scrolling is.
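
For illustration only, a question group like the one above could be rendered into two independently scrollable panes roughly as follows. This Python sketch uses plain tuples instead of real objects, and the inline flexbox styles and function name `render_group` are assumptions chosen to keep it self-contained:

```python
from html import escape
from typing import List, Tuple

# Each pane takes half the width and scrolls on its own.
PANE_STYLE = "flex: 1; overflow-y: auto; padding: 1em;"


def render_group(stimulus: str, questions: List[Tuple[str, List[str]]]) -> str:
    """Render one passage (left pane) and its questions (right pane)."""
    rendered = []
    for number, (prompt, options) in enumerate(questions, start=1):
        choices = "".join(
            f'<li><label><input type="radio" name="q{number}"> {escape(option)}</label></li>'
            for option in options
        )
        rendered.append(
            f"<fieldset><legend>{escape(prompt)}</legend><ul>{choices}</ul></fieldset>"
        )
    questions_html = "".join(rendered)
    return (
        '<div style="display: flex; height: 100vh;">\n'
        f'  <section style="{PANE_STYLE}">{escape(stimulus)}</section>\n'
        f'  <section style="{PANE_STYLE}">{questions_html}</section>\n'
        "</div>"
    )


print(render_group(
    "Leo Pasvolsky (August 22, 1893 - May 5, 1953) was a journalist...",
    [("When was Leo Pasvolsky born?", ["1893", "1953"])],
))
```

In practice the styles would more likely live in an external style sheet, so the layout can change without regenerating the exercises.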
Last edited by Cainntear on Thu Jun 02, 2022 8:52 pm, edited 1 time in total.

Re: MrWarper's self-checking exercise generator

Postby Cainntear » Mon May 23, 2022 6:38 pm

Or here, where I've taken a genuine Cambridge sample question and input it to Hot Potatoes. Hot Potatoes lets you look at one question at a time, navigating with forward and back buttons, or look at all questions simultaneously, but it also doesn't have independent scrolling of the reading passage.

Edit:
I've hacked the CSS to give a side-by-side scrolling format.