Yandex translations

Dragon27
Blue Belt
Posts: 616
Joined: Tue Aug 25, 2015 6:40 am
Languages: Russian (N)
English - best foreign language
Polish, Spanish - passive advanced
Tatar, German, French, Greek - studying
x 1375

Re: Yandex translations

Postby Dragon27 » Wed Jan 19, 2022 7:44 pm

Well, it translates "Ты красивый. Вы красивый." from Russian into Ukrainian properly as "Ти красивий. Ви красивий.", and it even translates "хорошо" (well, the adverb) and "колодец" (well, the object) into "добре" and "колодязь" respectively, which it didn't do previously. I'm impressed!
It probably works better with some language pairs than with others (Russian and Ukrainian are not related for nothing). Translating "колодец" into most other languages gives their equivalents of "хорошо", although the proper word may show up in the synonyms list underneath, so the impression is a bit dulled ):

luke
Brown Belt
Posts: 1243
Joined: Fri Aug 07, 2015 9:09 pm
Languages: English (N). Spanish (intermediate), Esperanto (B1), French (intermediate but rusting)
Language Log: https://forum.language-learners.org/vie ... 15&t=16948
x 3631

Re: Yandex translations

Postby luke » Wed Jan 19, 2022 7:56 pm

zenmonkey wrote:But not for the reason you stated. Google will also fail for single words as the meaning is ambiguous.

Machine translation has issues and will continue to have them for a long time, but it is definitely reaching human-level quality, particularly when you provide sufficient context.

Do YouTube subtitles use the same translation engine as translate.google? Does YouTube figure out the subtitles and keep them for a long time, or does it periodically try to "retranslate" audio, based on its newer technology?
: 124 / 124 Cien años de soledad 20x
: 5479 / 5500 5500 pages - Reading
: 51 / 55 FSI Basic Spanish 3x
: 309 / 506 Camino a Macondo

zenmonkey
Black Belt - 2nd Dan
Posts: 2528
Joined: Sun Jul 26, 2015 7:21 pm
Location: California, Germany and France
Languages: Spanish, English, French trilingual - German (B2/C1) on/off study: Persian, Hebrew, Tibetan, Setswana.
Some knowledge of Italian, Portuguese, Ladino, Yiddish ...
Want to tackle Tzotzil, Nahuatl
Language Log: viewtopic.php?f=15&t=859
x 7030

Re: Yandex translations

Postby zenmonkey » Wed Jan 19, 2022 8:49 pm

luke wrote:
zenmonkey wrote:But not for the reason you stated. Google will also fail for single words as the meaning is ambiguous.

Machine translation has issues and will continue to have them for a long time, but it is definitely reaching human-level quality, particularly when you provide sufficient context.

Do YouTube subtitles use the same translation engine as translate.google? Does YouTube figure out the subtitles and keep them for a long time, or does it periodically try to "retranslate" audio, based on its newer technology?


I know that part of the process is generating transcriptions of the video, and that this is done a few different ways: provided by the channel, crowdsourced, or through the automated algorithms provided by YouTube/Google. In 2009(?) or so, Google announced it was doing the captioning for YouTube, and I suspect that YouTube uses the Google Cloud Speech-to-Text API.

So that then generates L1 speech to L1 text. And this process, of course, generates errors. Let us call that set of errors E1.
The L1 text is then used as input for the L2 text. E1 is also input to that translation, so L2 is a function of (L1, E1). As you can imagine, this is why some of the closed-captioned translations are ... well, hilarious. This L1 text -> L2 text translation probably uses the Google Translate API; there is no real reason for them to use something else.
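
To make the cascade concrete, here is a minimal Python sketch of the two stages. Everything in it is a made-up stand-in: `toy_speech_to_text`, `toy_translate` and the tiny dictionary are hypothetical and have nothing to do with the real YouTube or Google services. It only shows how an E1 error from the first stage is passed straight into the translation.

```python
from typing import Callable, Tuple


def cascade(audio: str,
            speech_to_text: Callable[[str], str],
            translate: Callable[[str], str]) -> Tuple[str, str]:
    """Run L1 speech -> L1 text -> L2 text and return both texts."""
    l1_text = speech_to_text(audio)  # may already contain E1 errors
    l2_text = translate(l1_text)     # translates whatever it is given, errors included
    return l1_text, l2_text


def toy_speech_to_text(audio: str) -> str:
    # Hypothetical recogniser: the "audio" is just a string, and one word
    # is deliberately misheard to simulate an E1 error.
    return audio.replace("cordero", "cordel")


# Hypothetical word-for-word dictionary, only for this illustration.
TOY_DICTIONARY = {"el": "the", "cordero": "lamb", "cordel": "cord", "extraviado": "lost"}


def toy_translate(text: str) -> str:
    # Naive word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.split())


l1, l2 = cascade("el cordero extraviado", toy_speech_to_text, toy_translate)
print(l1)  # L1 text with the recognition error
print(l2)  # L2 text that inherits the error
```

The translation stage has no way of knowing that "cordel" was a mishearing, so the error ends up in the L2 output even though the lookup itself worked as designed.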

I have no idea if they regularly reprocess source audio to improve captioned output quality. But audio to text is still computationally expensive, and doing it for zillions of YouTube videos is probably NOT built into the platform management. That is just my guess, though.

Speech-to-speech translation is a more complicated process (because of the absence of a stored corpus, time to process, etc.), but there are projects like Translatotron (Google supported - true S2S translation), Moses and Mastor (IBM, DARPA - S2T2S) that are advancing in this area. I'm not current on their status or success. I just read enough about them to know that small fish like me should stay away.
I am a leaf on the wind, watch how I soar

vonPeterhof
Blue Belt
Posts: 879
Joined: Sat Aug 08, 2015 1:55 am
Languages: Russian (N), English (C2), Japanese (~C1), German (~B2), Kazakh (~B1), Norwegian (~A2)
Studying: Kazakh, Mandarin, Coptic
Language Log: viewtopic.php?f=15&t=1237
x 2833

Re: Yandex translations

Postby vonPeterhof » Wed Jan 19, 2022 9:34 pm

einzelne wrote:
vonPeterhof wrote:Edit: actually now that I've played around with this a bit more this test doesn't seem to work for Irish. If you set it as the source language then the output stays in English no matter what you pick as the target language. Somehow I doubt that there's enough of an Irish/Hmong or Irish/Chichewa corpus of translations for the machine learning algorithm to be drawing upon :lol:


Yes, I used it last fall for Latin, had the same issue, and also thought it's because of their small corpus.

Just checked Latin and no, it does the expected thing and translates the English text into the target language.
Снимок экрана 2022-01-20 в 00.33.04.png

Irish for comparison:
Снимок экрана 2022-01-20 в 00.33.10.png

sfuqua
Black Belt - 1st Dan
Posts: 1642
Joined: Sun Jul 19, 2015 5:05 am
Location: san jose, california
Languages: Bad English: native
Samoan: speak, but rusty
Tagalog: imperfect, but use all the time
Spanish: read
French: read some
Japanese: beginner, obsessively studying
Language Log: https://forum.language-learners.org/vie ... =15&t=9248
x 6299

Re: Yandex translations

Postby sfuqua » Wed Jan 19, 2022 11:43 pm

I just played around with Yandex, Irish->English, and it got a lot of things right, although it was pretty lost with Munster verb forms. I think it has a really limited vocabulary.
荒海や佐渡によこたふ天の川

the rough sea / stretching out towards Sado / the Milky Way
Basho[1689]

Sometimes Japanese is just too much...


Re: Yandex translations

Postby luke » Wed Jan 26, 2022 9:50 pm

A very good example of Yandex having a better translation than Google for this sentence from Cien años de soledad:

Original: Úrsula lo interpretó como el regreso del cordero extraviado.
Professional: Úrsula interpreted it as the return of the strayed lamb.
Google: Úrsula interprets it as the regression of the extravagant cord.
Yandex: Ursula interpreted it as the return of the lost lamb.

Based on the famous translated quotes below, I'm thinking Yandex won this round:

Luke 15:4 - Passion and KJV translations wrote:There once was a shepherd with a hundred lambs, but one of his lambs wandered away and was lost. So the shepherd left the ninety-nine lambs out in the open field and searched in the wilderness for that one lost lamb.

What man of you, having an hundred sheep, if he lose one of them, doth not leave the ninety and nine in the wilderness, and go after that which is lost, until he find it?

Re: Yandex translations

Postby zenmonkey » Wed Jan 26, 2022 9:56 pm

luke wrote:A very good example of Yandex having a better translation than Google for this sentence from Cien años de soledad:

Original: Úrsula lo interpretó como el regreso del cordero extraviado.
Professional: Úrsula interpreted it as the return of the strayed lamb.
Google: Úrsula interprets it as the regression of the extravagant cord.
Yandex: Ursula interpreted it as the return of the lost lamb.

Based on the famous translated quotes below, I'm thinking Yandex won this round:

Luke 15:4 - Passion and KJV translations wrote:There once was a shepherd with a hundred lambs, but one of his lambs wandered away and was lost. So the shepherd left the ninety-nine lambs out in the open field and searched in the wilderness for that one lost lamb.

What man of you, having an hundred sheep, if he lose one of them, doth not leave the ninety and nine in the wilderness, and go after that which is lost, until he find it?


Just entered the original into Google and got ...

Úrsula interpreted it as the return of the lost lamb.

Screenshot 2022-01-26 at 13.56.34.png


I don't know why we get different results...

Re: Yandex translations

Postby luke » Thu Jan 27, 2022 12:48 am

zenmonkey wrote:Just entered the original into Google and got ...

Úrsula interpreted it as the return of the lost lamb.

I don't know why we get different results...

Google is always listening. It hears GOOgle GOOgle and thinks, "I am Big Brother". ;)

But actually, I think there may have been an issue with the nut behind the wheel: I was manually setting certain language dropdowns to Greek, and I think that is where the bad translation originated. I sort of confirmed that by copying a Spanish sentence containing the adjective "puro" into Google Translate with the source language set to Greek, and it used "cigar" in the translation. When I corrected the source language to Spanish, Google knew that, in context, the proper translation was "pure".

Mea culpa!

Re: Yandex translations

Postby luke » Tue Feb 01, 2022 3:21 pm

I was curious what Yandex and Google had to say about the difference between two verbs. It's a question people have asked in the WordReference and SpanishDict forums:

Google translated portarse and comportarse the same. Yandex made a distinction.

O: Carlos se portó mal.
Y: Carlos misbehaved.
G: Carlos misbehaved.

O: Carlos se comportó mal.
Y: Carlos behaved badly.
G: Carlos misbehaved.

O: Estoy portándome bien.
Y: I'm being good.
G: I am behaving well.

O: Estoy comportándome bien.
Y: I'm behaving well.
G: I am behaving well.

O: = Original Spanish
Y: = Yandex Translation
G: = Google Translation
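
For anyone who wants to repeat this kind of comparison with other verb pairs, here is a small Python sketch that checks whether an engine gives different output for the two verbs. The data is just the eight translations listed above, typed in by hand; no translation service is called, and the names `RESULTS`, `PAIRS` and `distinguishes` are only for this illustration.

```python
# (original Spanish, Yandex output, Google output), copied from the list above.
RESULTS = [
    ("Carlos se portó mal.", "Carlos misbehaved.", "Carlos misbehaved."),
    ("Carlos se comportó mal.", "Carlos behaved badly.", "Carlos misbehaved."),
    ("Estoy portándome bien.", "I'm being good.", "I am behaving well."),
    ("Estoy comportándome bien.", "I'm behaving well.", "I am behaving well."),
]

# Row indices of each portarse / comportarse sentence pair.
PAIRS = [(0, 1), (2, 3)]

YANDEX, GOOGLE = 1, 2  # column positions in RESULTS


def distinguishes(results, engine, pair):
    """True if the engine translated the two sentences of a pair differently."""
    first, second = pair
    return results[first][engine] != results[second][engine]


yandex_flags = [distinguishes(RESULTS, YANDEX, pair) for pair in PAIRS]
google_flags = [distinguishes(RESULTS, GOOGLE, pair) for pair in PAIRS]

print("Yandex distinguishes:", yandex_flags)
print("Google distinguishes:", google_flags)
```

A different output does not mean a more correct one, of course. The check only shows whether the engine treats the two verbs as interchangeable.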

The WordReference forum thread on portarse versus comportarse was not unanimous on the difference.

Just off the cuff, I prefer the Yandex distinction, but I don't know that it's correct.

Just another "thank you" to sfuqua for bringing this helpful resource into the light.

BeaP
Green Belt
Posts: 405
Joined: Sun Oct 17, 2021 8:18 am
Languages: Hungarian (N), English, German, Spanish, French, Italian
x 1990

Re: Yandex translations

Postby BeaP » Tue Feb 01, 2022 3:48 pm

luke wrote:Just off the cuff, I prefer the Yandex distinction, but I don't know that it's correct.


This is an explanation from a source I find reliable: https://www.espanolavanzado.com/significados/2000-portarse-comportarse