Korean podcasts sorted by difficulty (plus experimental web player)

Post by **ryanheise** » Wed Feb 05, 2020 3:56 am

Thanks for the feedback. I've posted a new list at the bottom of the previous page which interleaves all episodes from all podcasts into a single list, and from it you should be able to see everything, including what were the (estimated) most difficult episodes for each podcast.

I personally prefer the interleaved list since you can see where each podcast falls on the global spectrum. But I'll just say again that this is only approximate and you may still find that the relative positions of podcasts might not be perfect. For example, one episode might have easier vocabulary than another episode, but it might also be harder in other ways that are not measured such as grammar, speed of speech, slurred speech and background noises that make speech less intelligible, and general processing errors.

By the way, when you click on the "listen" link, I may also be able to modify that little audio player widget (not 100% sure) to do things like pause after each sentence, or maybe repeat each sentence a couple of times before continuing to the next sentence. Worth doing? Not that I'd be able to do it, but I could investigate.

Post by **Christi** » Wed Feb 05, 2020 7:08 pm

ryanheise wrote: I may also be able to modify that little audio player widget (not 100% sure) to do things like pause after each sentence, or maybe repeat each sentence a couple of times before continuing to the next sentence. Worth doing? Not that I'd be able to do it, but I could investigate.

That would be great! I'm starting to feel a little case of hero worship coming up :lol:

Post by **rdearman** » Wed Feb 05, 2020 7:10 pm

Christi wrote:
ryanheise wrote: I may also be able to modify that little audio player widget (not 100% sure) to do things like pause after each sentence, or maybe repeat each sentence a couple of times before continuing to the next sentence. Worth doing? Not that I'd be able to do it, but I could investigate.

That would be great! I'm starting to feel a little case of hero worship coming up

Don't bother, just download WorkAudioBook; it already does all that.

Post by **ryanheise** » Thu Feb 06, 2020 3:37 am

rdearman wrote:
Christi wrote:
ryanheise wrote: I may also be able to modify that little audio player widget (not 100% sure) to do things like pause after each sentence, or maybe repeat each sentence a couple of times before continuing to the next sentence. Worth doing? Not that I'd be able to do it, but I could investigate.

That would be great! I'm starting to feel a little case of hero worship coming up

Don't bother, just download WorkAudioBook; it already does all that.

Yep, I think you got me onto WorkAudioBook in the first place, and I was using this excellent software for my listening attention span experiment. I highly recommend it.

However, being a programmer, I suffer from this severe hacker condition where I also like to build things ;-)

If I can see something could be made even a little more convenient and save me time, I'll take even the feeblest of excuses for the chance to write some code. Here, I think it could be convenient to get this listening experience right in the web page. (I'm sure if I saw a doctor, they would tell me there is nothing that can be done for my condition...)

Just to talk a bit about what's possible here: in the process of coming up with the analysis code for estimating podcast difficulty, I realised I needed to analyse at the sentence level to improve the accuracy of the analysis, and I ended up developing a new approach to sentence detection that in some ways is more sophisticated than WorkAudioBook's. In other ways it's not yet as good, but it looks more promising overall, and I know I can keep improving it. Basically, WorkAudioBook operates on a min/max threshold for the length of a segment and tries to find a split within those parameters. It appears to do a standard measurement of the audio energy, and when it detects the energy dipping below a certain threshold (which could be dynamic or absolute), it does a split. These are fairly standard techniques in audio processing, but they are inaccurate when it comes to voice.
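As a rough illustration of the energy-threshold technique described above, here is a minimal Python sketch. It is not WorkAudioBook's actual code; the function names, the fixed (absolute) threshold, and the frame-based min/max lengths are all assumptions made for the example.

```python
from typing import List


def frame_energies(samples: List[float], frame_size: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame of samples."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def split_on_energy_dips(
    energies: List[float],
    threshold: float,
    min_len: int,
    max_len: int,
) -> List[int]:
    """Return the frame indices at which to cut (hypothetical helper).

    A cut is made at the first frame whose energy is below `threshold`
    once the current segment is at least `min_len` frames long. If no
    dip is found by the time the segment reaches `max_len` frames, a
    cut is forced there regardless of what the audio contains.
    """
    cuts: List[int] = []
    start = 0
    for i, energy in enumerate(energies):
        length = i - start
        if length >= min_len and energy < threshold:
            cuts.append(i)  # natural cut at an energy dip
            start = i
        elif length >= max_len:
            cuts.append(i)  # forced cut: no dip was found in time
            start = i
    return cuts


if __name__ == "__main__":
    # Two quiet spots (frames 3-4 and frame 9) in otherwise loud audio.
    energies = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1]
    print(split_on_energy_dips(energies, threshold=0.1, min_len=2, max_len=100))
```

The forced cut at `max_len` is the part that does not look at the audio at all, so it can land anywhere, including inside a word.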

The problem is that in many podcasts, there is background noise or music where the energy does not dip and so the dips won't be detected, and you end up with an algorithm that splits at inconvenient times, like in the middle of a word. This is just a limitation of that particular approach to sentence detection. So basically what I ended up doing was using more information to make the determination about where to split, looking at characteristics of the audio besides energy, including voice properties. My algorithm will still have problems, though, where someone is talking continuously without taking a breath, or two people are conversing with each other in such a way that there is continuous voice, but I'll keep working on this over time. One idea is that if the two people talking to each other are speaking in different vocal registers, I may be able to detect that, and infer another sentence split.

Anyway, now that I have developed this algorithm, it actually seems like it should be quite easy to modify one of those web audio player widgets to break at those timestamps. I'm simply talking about reusing the sentence detection code I've now written and just seeing if I can tie it into the web player. I don't have as much experience with the web APIs, so I do have some question marks over whether it would be possible, particularly with how accurate seeking is with these web APIs. If I can accurately determine the correct timestamp for a sentence split, but the web APIs don't let me jump to those positions accurately, then it won't work. But... if it DOES work, I think it could be quite convenient for a web player to be able to give you this behaviour immediately, rather than the several steps it may take to get the podcast into the app (find, download, transfer, open, configure).
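To make the idea of breaking playback at sentence timestamps concrete, here is a small Python sketch of the bookkeeping only. A real web player would do this in JavaScript against the browser's audio APIs; everything here (the function names, the list of boundaries in seconds, and the 0.05 s seek tolerance) is a hypothetical illustration, not code from the project.

```python
from bisect import bisect_right
from typing import List, Tuple

Segment = Tuple[float, float]


def segment_at(boundaries: List[float], position: float) -> Segment:
    """Return the (start, end) of the segment containing `position`.

    `boundaries` is a sorted list of split timestamps in seconds that
    includes 0.0 and the total duration. Positions outside the audio
    are clamped to the first or last segment.
    """
    i = bisect_right(boundaries, position) - 1
    i = max(0, min(i, len(boundaries) - 2))
    return boundaries[i], boundaries[i + 1]


def repeat_plan(boundaries: List[float], repeats: int) -> List[Segment]:
    """List the segments in playback order, each one played `repeats` times."""
    plan: List[Segment] = []
    for start, end in zip(boundaries, boundaries[1:]):
        plan.extend([(start, end)] * repeats)
    return plan


def seek_is_usable(requested: float, actual: float, tolerance: float = 0.05) -> bool:
    """Check whether a seek landed close enough to the requested time.

    The default tolerance is an arbitrary assumption for illustration.
    """
    return abs(actual - requested) <= tolerance


if __name__ == "__main__":
    boundaries = [0.0, 2.5, 6.0, 9.0]
    print(segment_at(boundaries, 3.0))
    print(repeat_plan(boundaries, repeats=2))
```

The `seek_is_usable` check reflects the open question in the post: accurate split timestamps only help if the player can actually land on them.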

I'll do some investigations and if it doesn't work, that's fine. Let's see.

Post by **rdearman** » Thu Feb 06, 2020 9:02 am

You might want to check out EMK's sub-study stuff. He has a web-player in the software.

https://github.com/emk/subtitles-rs

Post by **ryanheise** » Thu Feb 06, 2020 11:59 am

Thanks. By the way, while looking for a video of substudy in action, I came across one of your YouTube videos: https://www.youtube.com/watch?v=p23qGYuFnr4 . You have a new subscriber ;-)

I just checked out the player code in that project, although unfortunately it's not quite what I'm looking for (the code is written in ReasonML and is part of a monolithic code base rather than being a widget I can reuse). Most likely, I'll use howler.js.

Post by **Christi** » Fri Feb 07, 2020 3:44 pm

Do you need to download the podcast to be able to use WorkAudioBook?

Post by **ryanheise** » Fri Feb 07, 2020 4:28 pm

Christi wrote:Do you need to download the podcast to be able to use WorkAudioBook?

Yes, basically download it and then open it within WorkAudioBook via the menu.

Post by **ryanheise** » Wed Feb 26, 2020 2:06 am

An update: I got a "proof of concept" working a couple of weeks ago. The web player is able to repeat individual sentences and also slow down audio.

With the proof of concept, I was able to see how the sentence detection algorithm "actually" works in different situations:

- When there is a clear silent gap between sentences, the algorithm works perfectly.
- But when there is no gap, adjacent sentences effectively combine together making one big long sentence (not good).

This has been working well enough for my Japanese material, but it's turned out to be more of a problem with Korean (but read on, because I have figured out a solution). One pattern in particular that fools it is where Koreans start their next sentence immediately after the previous one, and then pause after the topic of the next sentence (i.e. right after the "는").

Even for the "Real-Life Korean Conversations For Beginners" podcast, there are some episodes where there is almost no gap between sentences, and the longest gaps appear mid-sentence and not between sentences.

So I knew this was a problem I had to solve for it to be useful, and that's what I've been working on solving these past two weeks.

The solution

What I have now is a much more precise way of detecting Korean sentences. The algorithm works in two phases:

1. Split the audio where the "long" gaps are. This phase produces very accurate boundaries.
2. Within each of these segments (if the segment is still too long), we do a more aggressive split that is based on properties of the Korean language rather than merely on looking for a silent gap (where in fact there might not be one). The split points in this phase are usually accurate within half a second. This is still enough to miss a whole word, so in this phase, I am thinking about having an option to add extra padding on the ends of segments, so the beginning and ending of each segment can be extended by a bit to ensure that you don't miss anything. As you increase this setting, you may be able to hear a little bit of the next or previous sentence giving you a little bit of overlap, depending on how close the sentences are together.
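The two-phase structure and the padding option can be sketched roughly as follows in Python. The post does not describe the language-specific analysis used in phase 2, so it is represented here by a hypothetical `refine` callback that returns extra split points for a segment; the function names and the choice to pad every segment equally are also assumptions for the example.

```python
from typing import Callable, List, Tuple

Segment = Tuple[float, float]


def split_two_phase(
    gap_segments: List[Segment],
    max_len: float,
    refine: Callable[[Segment], List[float]],
) -> List[Segment]:
    """Refine phase-1 segments that are still too long.

    `gap_segments` are the (start, end) times, in seconds, produced by
    splitting at the long silent gaps (phase 1). Any segment longer
    than `max_len` is passed to `refine`, a stand-in for the
    language-specific analysis, which returns split points inside it.
    """
    result: List[Segment] = []
    for start, end in gap_segments:
        if end - start <= max_len:
            result.append((start, end))
            continue
        # Keep only split points that fall strictly inside the segment.
        points = [p for p in sorted(refine((start, end))) if start < p < end]
        edges = [start, *points, end]
        result.extend(zip(edges, edges[1:]))
    return result


def pad_segments(
    segments: List[Segment], padding: float, duration: float
) -> List[Segment]:
    """Extend both ends of each segment by `padding` seconds, clamped to the audio."""
    return [
        (max(0.0, start - padding), min(duration, end + padding))
        for start, end in segments
    ]


if __name__ == "__main__":
    # Stand-in refinement: split at the midpoint (purely illustrative).
    midpoint = lambda seg: [(seg[0] + seg[1]) / 2]
    segments = split_two_phase([(0.0, 3.0), (4.0, 12.0)], max_len=5.0, refine=midpoint)
    print(pad_segments(segments, padding=0.5, duration=12.0))
```

With padding applied, adjacent segments share a little audio at their edges, which is the overlap described above; a larger padding value gives more overlap.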

In general, it will be more accurate by default on easier content, and less accurate on advanced content. But for "Real-Life Korean Conversations For Beginners", it seems to be performing very well with little overlap.

The approach does seem very promising, and it is something I can keep improving the accuracy of in the future. It's also something I'd now like to go back and apply to Japanese as well. Most of the analysis that goes into this is very language specific, so the techniques have to be specially crafted for each language, but now that I have seen what works for Korean, I can try to find similar language features in Japanese that could make it work for that language, too.

Anyway, I just want to say that I am working on it, and I will hopefully be able to make it available soon.

Post by **AnneL** » Wed Feb 26, 2020 3:48 am

ryanheise wrote: I am thinking about having an option to add extra padding on the ends of segments, so the beginning and ending of each segment can be extended by a bit to ensure that you don't miss anything

It will probably be good enough. From my own experience and what I assume you're doing, the big work (chunking paragraphs) is where automation is most useful, since you don't have to do it by hand. Once it's down to several sentences connected together, the padding will help automate the rest roughly, and well enough to work with.
Have fun with what's left to do!
