Hindi Numbers Project

jeffers
Blue Belt
Posts: 936
Joined: Sat Aug 22, 2015 4:12 pm
Location: UK
Languages: Speaks: English (N), Hindi (A2-B1)

Learning: The above, plus French (A2-B1), German (A1), Ancient Greek (?), Sanskrit (beginner)
Language Log: https://forum.language-learners.org/vie ... 15&t=19785
x 3160
Contact:

Hindi Numbers Project

Postby jeffers » Wed May 08, 2024 3:02 pm

I've had an idea to write a series of Pimsleur-style questions to learn and practice Hindi numbers. The problem is, it would really need to be audio, and finding a Hindi speaker who would want to record the audio would be quite a hurdle. However, I've recently realized that AI might be able to do the audio for me, if I write the questions. Heck, given the right prompts AI should be able to write the script as well.

The TL;DR is that in Hindi the numbers from 1-99 are basically each unique, although there are recognizable patterns.

Below is what I wrote in my log, but I'm posting here as well in order to possibly get advice from people who don't read my log.
Hindi Numbers

Hindi numbers can be quite difficult because every number from 1 to 99 is unique. There isn't a predictable pattern like "thirty-two, thirty-three", etc. However, there are patterns which can help with recognition. Today I laid all the numbers out in a table and noted the patterns. Basically, each number has a prefix showing the value of the ones (units), and a suffix showing which multiple of ten it belongs to. For example, numbers ending in 2 start with बा (baa) or ब (ba), and all of the numbers from 71 to 78 end in हत्तर (hattar), so 72 is बहत्तर (bahattar).

The 9s are a tricky case, but they basically sound like "one less than the next ten", so while eighty is अस्सी (assi), 79 is उन्नासी (unnaasi). The a sound at the beginning of assi is lengthened in this case. 69 is उनहत्तर (unhattar), which fits nicely since हत्तर (hattar) is the standard suffix for numbers in the 70s, even though 70 itself is सत्तर (sattar).

I wouldn't recommend memorizing the prefixes and suffixes in isolation, but being aware of the patterns will certainly help with learning the numbers, and especially with recognizing them. Unfortunately, it is rare to see Hindi numbers written out as words, so when reading you either have to know the numbers or read them as English numbers (what I usually do :oops: ). There's an additional wrinkle which is that bilingual Hindi speakers will often just use the English numbers when they speak. This is something that you will often hear in Hindi films.


Anyway, here's a picture of the table I made, for anyone who is curious:
Screenshot 2024-05-08 153113.png


My approach to learning Hindi numbers was to first learn to count to 20, then learn all the other tens, then work on the fives (e.g. 25, 35, etc.). That's sort of as far as I got, and not very well. The next step was to begin to learn the numbers in between, one ten at a time (e.g. 21-29, then 31-39, and so on). I never actually did this latter step, but I've been thinking about a method taken from the Pimsleur playbook. My idea is to make some audio lessons that practice the numbers using maths (instead of counting). I think Pimsleur really got this right. Just memorizing a list of numbers teaches you to count, but doesn't actually help you when you need to use a number in isolation. Incidentally, Pimsleur does the same with days of the week: they don't teach you them in order, but teach you Wednesday in one lesson, and Friday in another.

So what I'm planning to do is create a bunch of sentences doing easy maths with the numbers, like "What is 5 plus 4?....... 5 plus 4 is nine." I then want to try to use an AI text-to-speech tool to turn this into audio. I would start with 1-10, then 10-20, and then the tens, so questions like "What is 20 + 20?" or "What is 50 - 10?". The next lessons would work on all the 5s (25, 35, etc.), and then finally lessons covering each remaining set of ten (21-29, 31-39, etc.). In between I guess it would be helpful to include a couple of review lessons which work all the numbers used so far. Finally, there would be a lesson or lessons doing similar things with bigger numbers.
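
As a rough sketch of how a question list like this could be generated rather than typed out by hand, here is a small Python example. The function name, the English placeholder wording, and the idea of shuffling with a fixed seed are all illustrative assumptions; the Hindi wording would be substituted later.

```python
import random

def addition_drills(numbers, max_sum, seed=0):
    """Return (question, answer) pairs for every a + b that stays within max_sum."""
    numbers = list(numbers)
    pairs = [(a, b) for a in numbers for b in numbers if a + b <= max_sum]
    # Shuffle so the drill practises numbers in isolation instead of counting in order.
    random.Random(seed).shuffle(pairs)
    return [
        (f"What is {a} plus {b}?", f"{a} plus {b} is {a + b}.")
        for a, b in pairs
    ]

# Lesson order described above: 1-10 first, later the tens.
lesson_1 = addition_drills(range(1, 11), max_sum=10)
tens = addition_drills(range(10, 101, 10), max_sum=100)

for question, answer in lesson_1[:3]:
    print(question, "...", answer)
```

Capping the sum keeps every answer inside the set of numbers the lesson has already taught.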

That's the idea, but I don't really know if I will have the time to actually do it. Does anyone know of a good text-to-speech engine which can handle Hindi?


So there it is. The Hindi Numbers Project is my idea to try to make a Pimsleur-style audio tool which could help me and others finally learn Hindi numbers!
8 x
Le mieux est l'ennemi du bien (roughly, the perfect is the enemy of the good)

French SC Books: 0 / 5000 (0/5000 pp)
French SC Films: 0 / 9000 (0/9000 mins)

rdearman
Site Admin
Posts: 7396
Joined: Thu May 14, 2015 4:18 pm
Location: United Kingdom
Languages: English (N)
Language Log: viewtopic.php?f=15&t=1836
x 24079
Contact:

Re: Hindi Numbers Project

Postby rdearman » Wed May 08, 2024 8:23 pm

If I were to attempt this, I would use OpenTTS or Amazon Polly for the audio. Then I would use gradint to create a Pimsleur-style audio course.

I am on my phone so can't give you links. But I also did videos about both topics on my YT Channel, so you could look there.

EDIT: Added links to the software. I also have a Python script in my log that will generate translations from a list of sentences in a spreadsheet. I have a second Python script which will convert the text in both columns into audio for use with gradint.
2 x
: 98 / 150 Read 150 books in 2024

My YouTube Channel
The Autodidactic Podcast
My Author's Newsletter

I post on this forum with mobile devices, so excuse short msgs and typos.

zac299
Yellow Belt
Posts: 65
Joined: Fri Feb 09, 2024 2:43 am
Languages: English (N)
Spanish (Beginner)
x 249

Re: Hindi Numbers Project

Postby zac299 » Thu May 09, 2024 3:56 am

I would create the scripts, then use a website like Upwork to hire people cheaply to read them. You'd get a million people applying for your job, and you could pick 3 or 4 from different places so you even get different voices, accents, speeds of speech, etc. for the same script.
1 x


Re: Hindi Numbers Project

Postby jeffers » Fri May 10, 2024 12:59 pm

Sometimes the simplest parts are the hardest.

Today I started writing the script, and realised I wasn't sure how I should script something like:
"What is two plus three? ... Two plus three makes five. "
I've never really had a lesson on maths in Hindi, so I wasn't quite sure how best to say something like this. The first problem was how to say "plus". It took a while to find a list of mathematical terms in Hindi, which has जमा (jamaa) for plus. However, jamaa appears to be quite formal, and I'm leaning towards something more like "What is two and two?", which would use और (aur). The next problem was how to say "equals". Again, there is a technical answer, बराबर (baraabar), but I don't think it would be used as much in an informal setting. Looking on a WordReference page, the most popular answer is not to say "equals", but to pause and say the number, or say the number with "होता है" (hota hai = becomes) or "होते हैं" (hote hain), the plural form. The people on the thread didn't seem to have a preference between the singular and the plural, although Google Translate goes with the plural form. One person on the thread said you would use the plural form with और "and" but the singular form with जमा "plus".

After working this much out, I went to my staff room to get some coffee, and a teacher from Delhi was there, so I asked her how she would say "two plus two equals four", and she answered "दो और दो, चार". I asked if you would use "होता है" and she said, "Yes, yes, we could say दो और दो, चार होते हैं". Sadly, I forgot to ask her how to say minus, but I can leave that for another day.

For now I've settled on the following (with a pause after the question for the student to answer first, then a pause after the answer so the student can repeat it):
दो और तीन क्या होते हैं?
दो और तीन पांच होते हैं
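
Since every question and answer follows the same frame, the lines could be filled in from a template. This is only a sketch in Python: the function name is made up for illustration, and the words for 6-10 are added here as an assumption (the post itself only lists 1-5).

```python
# Hindi number words. 1-5 appear in the post; 6-10 are added for this sketch.
HINDI = {
    1: "एक", 2: "दो", 3: "तीन", 4: "चार", 5: "पांच",
    6: "छह", 7: "सात", 8: "आठ", 9: "नौ", 10: "दस",
}

def addition_pair(a, b):
    """Build the question and answer lines for a + b using the frame above."""
    total = a + b
    if a not in HINDI or b not in HINDI or total not in HINDI:
        raise ValueError("this sketch only covers numbers and sums from 1 to 10")
    question = f"{HINDI[a]} और {HINDI[b]} क्या होते हैं?"
    answer = f"{HINDI[a]} और {HINDI[b]} {HINDI[total]} होते हैं"
    return question, answer

question, answer = addition_pair(2, 3)
print(question)
print(answer)
```

Keeping the frame in one place means a later change of wording (for example, switching to जमा) only has to be made once.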


Although I said this would be a Pimsleur-inspired course, I actually intend it to be 99% in Hindi, and I will only include English the first time a word is used. For example, the first maths question in the first exercise would be followed by an English translation, but subsequent questions would be asked and answered only in Hindi. And the first time a set of numbers is introduced, they would be named in Hindi, then English, then Hindi, with pauses after each for the student to repeat the Hindi, or say the Hindi after the English, as in:
एक, दो, तीन, चार, पांच
one, two, three, four, five
एक, दो, तीन, चार, पांच


I also want to do a few sentence problems, such as, "Raj has one chicken, Sita has three chickens. How many chickens are there all together?" Which would also help students practice singular and plural, because why not? In that case, words might need to be translated on the first use, but I would stick to common Hindi words, so perhaps I could assume the student would know the word. Certain phrases like "all together" ("कुल मिलाकर", kul milaakar) would be used quite frequently, and I would work to keep the sentence problems as simple as possible. I'll have to work out the details of that sort of thing later, but any suggestions would be welcome!
1 x

Re: Hindi Numbers Project

Postby jeffers » Mon Jun 24, 2024 2:18 pm

Just an update to say that I'm still working on this, and it's gotten much bigger than I anticipated. Of course! :lol:

First of all, TTS. I looked up the basics of SSML (a markup language used for TTS), to verify that I could do things like slow speech down, add pauses, change voices, and even adjust the emphasis within a sentence. Next I signed up to Amazon AWS, but they only have one Hindi voice, and I want to at least be able to alternate between a male and female voice.

I found another company called Narakeet, which has 6 Hindi voices. However, they have their own method of marking TTS texts, and I found that it didn't work consistently. For example, when I slowed speech down to 80%, I believe it sped back up after a couple of paragraphs. After a bit of experimenting I did get something working, but now I've used up my free allocation with them, and I'm not sure I want to spend money on Narakeet if I can't get the SSML to do what I want it to.

Finally, today I signed up for Google's Cloud services, including TTS. Google has four Hindi voices, which is good enough for my purposes, and it has a pricing model which works well for me: the first million characters per month are free, and after that it comes to $16 per million characters. How much is a million characters, really? I have a 25-page document where I have made notes about the project and written the script for the first lesson. That full document comes to around 20,000 characters, so 1 million characters would cover about 50 full passes through it, which is more than sufficient to give me time to play with the TTS and test variations with the SSML. On top of that, Google is giving new customers $300 credit for 3 months.
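
To make the SSML part concrete, here is a minimal Python sketch that wraps one question and answer in standard SSML tags: `<prosody rate>` to slow the speech and `<break time>` for the pauses. The function name, the 80% rate, and the pause lengths are assumptions for illustration, and individual engines may support these tags to different degrees.

```python
from xml.sax.saxutils import escape

def drill_ssml(question, answer, rate="80%", think_pause="4s", repeat_pause="3s"):
    """Wrap one question/answer pair in SSML with slowed speech and pauses."""
    return (
        "<speak>"
        f'<prosody rate="{rate}">{escape(question)}</prosody>'
        f'<break time="{think_pause}"/>'   # time for the student to answer
        f'<prosody rate="{rate}">{escape(answer)}</prosody>'
        f'<break time="{repeat_pause}"/>'  # time for the student to repeat
        "</speak>"
    )

ssml = drill_ssml("दो और तीन क्या होते हैं?", "दो और तीन पांच होते हैं")
print(ssml)
print(len(ssml), "characters including tags")
```

Printing the length gives a quick way to see how much of a character allowance a lesson would use, depending on how the engine counts tags.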

I started writing the script for lesson 1 (numbers from 1 to 10), and the ideas have grown and grown. I still have the exercises which add numbers as I mentioned above, but I realised that if that is all the lesson does, it would get really boring and repetitive. So I started writing some word problems along the lines of, "Rani has 2 chickens, Rajesh has 3 chickens. How many chickens do they have all together?" Those sorts of questions are a bit long for someone who might be new to the language, so I also wrote some substitution-style questions. In the end, I realised an FSI type of model would be more useful here than a Pimsleur style model.

All of that to say, I've essentially completed the script for the first full lesson, which includes 8 separate sections practicing the first 10 numbers in a variety of ways. As an estimate, the whole lesson should last 25-30 minutes, but that is a very rough guess. Now I need to experiment with putting more of the script through TTS, use the SSML to switch voices and make the voice pitch a bit more varied, and then finally produce the first full lesson. If I can get that lesson sounding reasonably good, I'll get to work on the rest of the lessons, which for some of the exercises will simply be a "cut, paste, edit, repeat" job, since the exercises will essentially be the same, except for the word problems and substitution exercises.

The problem is, working on this course has taken a lot of time away from reading for the Super Challenge. Oh well, I'm learning, I'm doing some writing in Hindi, I'm enjoying myself, and I might even end up producing something useful! That's a win in my opinion.
2 x

Re: Hindi Numbers Project

Postby zac299 » Thu Jun 27, 2024 4:10 am

Good stuff.

One other thought would be using Audacity to slow down or speed up your audio. It lets you adjust pitch as well, so it doesn't come out sounding like a chipmunk.

So, you could simply use your subscription to generate every paragraph at normal pace, and then use Audacity afterwards to get a consistent result at whatever other speed you want.
1 x


Re: Hindi Numbers Project

Postby jeffers » Tue Jul 02, 2024 12:18 pm

I've made progress using Google's TTS API, and finally worked out how to successfully give it a file with SSML tags and receive back an mp3 file. Since then I've added SSML to the whole first part of the first lesson, which lasts around 4 1/2 minutes, and I've been tweaking little things like pause duration and the wording of instructions. I'm not 100% happy with Google's TTS, because there is one section where it slurs the first consonant at the beginning of three separate sentences, and I can't figure out why. I might change the wording, shift those sentences elsewhere, or explore another TTS engine with multiple Hindi voices. The nice thing is, once I have a complete SSML file, I could use it with any engine, as long as I edit the voice name properties to voices used in that engine.
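
A minimal Python sketch of the voice-switching idea, assuming the file uses the standard SSML `<voice name="...">` element. VOICE_A and VOICE_B are placeholder names, not real voices from any engine, and both function names are hypothetical. The second function shows how the names could be swapped when moving the same file to a different engine.

```python
import xml.etree.ElementTree as ET

def alternate_voices(lines, voices=("VOICE_A", "VOICE_B"), pause="3s"):
    """Build SSML where consecutive lines alternate between the given voices."""
    speak = ET.Element("speak")
    for i, line in enumerate(lines):
        voice = ET.SubElement(speak, "voice", name=voices[i % len(voices)])
        voice.text = line
        ET.SubElement(speak, "break", time=pause)  # pause after every line
    return ET.tostring(speak, encoding="unicode")

def rename_voices(ssml, mapping):
    """Replace voice names (old -> new) so the file suits another engine."""
    root = ET.fromstring(ssml)
    for voice in root.iter("voice"):
        old = voice.get("name")
        voice.set("name", mapping.get(old, old))
    return ET.tostring(root, encoding="unicode")

ssml = alternate_voices(["दो और दो क्या होते हैं?", "दो और दो चार होते हैं।"])
print(rename_voices(ssml, {"VOICE_A": "OTHER_ENGINE_1", "VOICE_B": "OTHER_ENGINE_2"}))
```

Editing the names through an XML parser rather than search-and-replace avoids accidentally changing text that happens to contain a voice name.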

At the end of the long post below, taken from my log, I'm asking for a couple volunteers to listen to and critique the first lesson section. If you're interested, please let me know.


Update on the Hindi Numbers Project (This is copied from my log)
I spent a lot of time early on with a very basic language question: how do you do verbal maths in Hindi? How do you say, "What is two plus two? Two plus two is four."? This seems rather basic, but it's not really covered clearly in textbooks or even online (where either the language is quite wordy, or they use English terms). I found some help on Word Reference Forums, and I also messaged three friends I used to work with in India, two Hindi teachers and a maths teacher.

I wanted to keep the language itself simple, since this was going to be about practicing the numbers. Rather than use the word for plus, जमा (jama), I decided to use "and", so came up with:
दो और दो क्या होते हैं? (Do aur do kya hote hain? = What are two and two? or more literally, What do two and two make?)
दो और दो चार होते हैं। (Do aur do char hote hain. Two and two make four.)
The script needed to include some brief explanation and practice of the words, but I think it works fine.

Subtraction was going to be a bit trickier, but my friends agreed on a relatively straightforward solution, so that's what I went with:
दो में से, एक घटाएंगे, क्या बचेगा? (Do mein se, ek ghataaenge, kya bachegaa? Literally: From two, we will reduce one, what will be left?)
For my script, I needed to include more explanation for this section than for the addition, but I felt like it would be useful in the long run as the language would continue to be used, in exactly the same pattern, in all future lessons.
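
Because the pattern stays exactly the same, the subtraction questions can be templated too. A small Python sketch, with a hypothetical function name; the words for 6-10 are an addition for illustration, and only the question frame is taken from the post, so the function returns the answer as a bare number word rather than guessing at a full answer sentence.

```python
# Hindi number words 1-10 (6-10 are added for this sketch).
HINDI = {
    1: "एक", 2: "दो", 3: "तीन", 4: "चार", 5: "पांच",
    6: "छह", 7: "सात", 8: "आठ", 9: "नौ", 10: "दस",
}

def subtraction_question(a, b):
    """Return the question for a - b and the number word that is left over."""
    if a not in HINDI or b not in HINDI or a - b not in HINDI:
        raise ValueError("this sketch only covers numbers and results from 1 to 10")
    question = f"{HINDI[a]} में से, {HINDI[b]} घटाएंगे, क्या बचेगा?"
    return question, HINDI[a - b]

question, answer_word = subtraction_question(2, 1)
print(question)     # the question line from the post
print(answer_word)
```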

I've now written the full script for the first lesson, which is divided into 8 sections. I intend to make a separate track for each section:
  • Introduce numbers 1-5, practice adding 1 to each number, in order, then add 2 to each number, in order.
  • Do the same with numbers 6-10.
  • Review and practice (addition problems adding all the numbers to all the numbers, mixed order).
  • Substitution exercises (same sentence, different numbers following a prompt in English).
  • Subtraction questions, -1 from each number in order, -2 from each number in order.
  • Simple word problems (e.g. speaker says a time, student has to say the time one hour earlier). I'm calling these "type-1 word problems".
  • Longer word problems (e.g. Sita has 3 chickens, Arjun has 5 chickens. How many do they have all together?) I'm calling these "type-2 word problems".
  • Mixed review. Addition and subtraction problems, mixed order, also using some numbers from previous lessons.
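
The ordered and mixed sections in this list could be produced from the same small helper. This Python sketch uses hypothetical names and works with plain digits; skipping negative results is an assumption made here, not something stated in the plan.

```python
import random

def ordered_drill(numbers, amounts=(1, 2), subtract=False):
    """Apply each amount to every number in order: all the 1s first, then the 2s."""
    sign = "-" if subtract else "+"
    items = []
    for k in amounts:
        for n in numbers:
            result = n - k if subtract else n + k
            if result >= 0:  # assumption: negative results are left out
                items.append((n, sign, k, result))
    return items

def mixed_review(*sections, seed=0):
    """Pool the items from several sections and shuffle them."""
    items = [item for section in sections for item in section]
    random.Random(seed).shuffle(items)
    return items

section_1 = ordered_drill(range(1, 6))                   # 1-5, +1 then +2
section_2 = ordered_drill(range(6, 11))                  # 6-10, +1 then +2
section_5 = ordered_drill(range(1, 11), subtract=True)   # -1 then -2
section_8 = mixed_review(section_1, section_2, section_5)

for n, sign, k, result in section_1[:3]:
    print(f"{n} {sign} {k} = {result}")
```

Each tuple could then be passed to whatever builds the Hindi sentence and the SSML for that item.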

I've planned out up to 17 lessons, and each lesson will essentially follow the same pattern:
  • Numbers 1-10
  • Numbers 11-20
  • All the tens (10, 20, 30, 40, etc, up to 100)
  • All the fives (5, 15, 25, etc, up to 95)
  • Numbers 31-40
  • 41-50
  • 51-60
  • 61-70
  • 71-80
  • 81-90
  • 91-100
  • Full review of 1-100
  • Fractions
  • Large numbers
  • Ordinal numbers(?)

I reckon each lesson would last around 30 minutes, so if I actually finish the whole course it will run 7-8 hours in total. It seems like a lot of lessons to learn numbers from 1-100, but remember that for Hindi each number is unique.

Telling the time will be worked into lessons on other numbers, so it's not all taught at once. Note that some fractions are used in time, just as in English. I haven't thought about it yet, but I should do something similar with dates and months. For example, lesson 2 could teach a couple month names, and practice dates with those months. I don't think I would want a single lesson trying to teach all the months in one go, but that might be needed, and then practice with them could be spaced across the course.

So where have I gotten to? I have written the full script for all parts of lesson 1 (numbers 1-10), and I have added the SSML tags to part 1 so that I can generate an mp3 with multiple voices (e.g. male and female alternating), speaking rates, and pauses. Most other lessons will simply follow the same pattern, but more thought is needed for word problems, because each lesson will have different questions. Adding the SSML tags and testing them is a chore, but is at least a straightforward task. However, I find I'm constantly making micro adjustments to the tags as I work on them, e.g. this pause is too short, the wording of an instruction needs changing, etc.

Eventually, I'm thinking I will put the files into Audacity and add "room tone", which is a low background noise so the silences aren't jarringly quiet. I'm even considering adding a short 15-second music clip over the beginning and end of each lesson, to make it sound more professional, but maybe not. I know a lot of us don't like our time being wasted like that! :lol: The raw audio for the first section (about 4 1/2 minutes) is nearly ready!

I would like to ask for some help. I would like a couple volunteers to be willing to listen to the raw sound files and give me feedback. E.g. is the pace too quick or too slow? Are the instructions unclear? Is it really too boring? And so on. Please let me know if you are willing to give it a listen, and I will send you the first section this weekend.
1 x