The new ChatGPT is a polyglot, apparently

General discussion about learning languages
jeffers
Blue Belt
Posts: 848
Joined: Sat Aug 22, 2015 4:12 pm
Location: UK
Languages: Speaks: English (N), Hindi (A2-B1)

Learning: The above, plus French (A2-B1), German (A1), Ancient Greek (?), Sanskrit (beginner)
Language Log: https://forum.language-learners.org/vie ... 15&t=19785

Re: The new ChatGPT is a polyglot, apparently

Postby jeffers » Thu Jan 05, 2023 8:05 pm

rdearman wrote:It is a useful typing tutor. I used this prompt to get it to allow me to type in Korean.

Code: Select all

i want you to act as a korean typing tutor. I want you to only reply inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. You output one random korean word and the english translation and I will attempt to write the korean word as quickly as I can. If I get it correct, you output a new word and translation in the same code block with the new word. please put all your output in the same code block so the page doesn't scroll. when i need to tell you something in english, i will do so by putting text inside curly brackets {like this}. Start


Telling it not to scroll doesn't work, but otherwise it is a good typing tutor.


This is an awesome idea for practicing typing in languages with other scripts. I edited your instructions to do it for Hindi, but had a few hiccups. Initially it gave me a list of 10 words each time. On the next attempt it gave me very difficult words to type, with a lot of conjunct characters such as क्ष (क + ् + ष). The next version of the instructions ended up giving me mostly easy two-letter words. Finally, I settled on this version and typed for 5 or 10 minutes. Although I asked it to give me increasingly difficult words, it continued to give me words of middling difficulty. I suppose the next step would be to ask it to give sentences to type.

Here is the question I used:
i want you to act as a hindi typing tutor. You output a single random hindi word and the english translation and I will attempt to write the hindi word in response. If I get it correct, you output a new word and translation. If I get it wrong, let me know. If I am doing well, give me increasingly difficult words.


EDIT: I've just run into a hitch with the word पढ़ना. After a few attempts I said that I was correct, and it gave the following response:
ChatGPT
I'm sorry, but the correct spelling is actually पढ़ना. The spelling पढ़ना is not correct. Please try again.
Jeff Lehman
Which letter am I getting wrong?
ChatGPT
The correct spelling has a "ढ़" (dhri) in it, whereas the spelling you provided has a "ढ" (Dhri) in it. "ढ़" (dhri) is a different letter from "ढ" (Dhri) in Hindi. "ढ़" (dhri) is pronounced like "dh" as in "father", while "ढ" (Dhri) is pronounced like "th" as in "there". Please try again using the correct spelling of the word, which is पढ़ना.


Getting annoyed, I opened Python IDLE and copied what I typed and what the chat was saying was correct, and got this:
>>>"पढ़ना" == "पढ़ना"
False

Stunned, I cursored through the two words and discovered that my version of ढ़ (typed using the standard Hindi keyboard layout built into Windows) is made up of two separable characters, ढ and ़, whereas the Chat version is a single inseparable character. I pulled up the onscreen keyboard, and it looks like I can't make that version of the character, sadly. I might be scuppered.
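
A minimal sketch of how the difference described above can be made visible in Python, using explicit escapes so that copy and paste cannot silently change anything. The assumption here (not confirmed in the post) is that the keyboard produced U+0922 followed by U+093C, while ChatGPT sent the single code point U+095D. The helper name `describe` is made up for illustration.

```python
import unicodedata

# Assumed forms of the same word (an assumption, not taken from the post):
# the keyboard version uses DDHA + NUKTA, the chat version uses RHA alone.
typed = "\u092a\u0922\u093c\u0928\u093e"   # PA, DDHA, NUKTA, NA, vowel sign AA
from_chat = "\u092a\u095d\u0928\u093e"     # PA, RHA, NA, vowel sign AA


def describe(text):
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(typed == from_chat)            # False: different code point sequences
print(len(typed), len(from_chat))    # 5 4

for entry in describe(typed):
    print(entry)
print("---")
for entry in describe(from_chat):
    print(entry)
```

Under that assumption the two listings differ only in the second position, which is why the words look identical on screen but compare as unequal.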

EDIT 2: one final update:
It gave me the word कमजोर (kamajor), which is a common spelling, but not the proper dictionary spelling, which should have a ज़ "z" in place of the ज "j". Really, it comes down to dialect, region, and/or whether the speaker acknowledges Urdu loanwords as coming from Urdu. Anyway, I typed, "कमजोर is an urdu loanword and should have a z sound instead of a j sound. So the proper spelling should be कमज़ोर". ChatGPT responded:
"You are correct! I apologize for the error. The correct spelling for the word is indeed कमज़ोर. Thank you for pointing out the mistake."

TopDog_IK
Yellow Belt
Posts: 80
Joined: Thu Aug 04, 2022 4:21 am
Languages: English (N), German (B2/C1)

Re: The new ChatGPT is a polyglot, apparently

Postby TopDog_IK » Sat Jan 07, 2023 2:47 am

Free AI teacher?

noblethings
Yellow Belt
Posts: 54
Joined: Wed Feb 08, 2023 12:12 pm
Location: at home
Languages: English (N), French (False beginner)
Language Log: https://forum.language-learners.org/vie ... 15&t=18899

Re: The new ChatGPT is a polyglot, apparently

Postby noblethings » Sat Feb 11, 2023 2:26 pm

I didn't test its actual polyglot capabilities, but ChatGPT is so helpful in giving me a broad understanding of the language families and of what makes certain languages similar to each other. It explains everything clearly.

tungemål
Blue Belt
Posts: 947
Joined: Sat Apr 06, 2019 3:56 pm
Location: Norway
Languages: Norwegian (N)
English, German, Spanish, Japanese, Dutch, Polish
Language Log: https://forum.language-learners.org/vie ... 15&t=17672

Re: The new ChatGPT is a polyglot, apparently

Postby tungemål » Sat Feb 11, 2023 2:54 pm

Good for practicing chatting in a language. I wrote this:

Can we have a conversation in Dutch? And please correct me if I use wrong grammar.
- Natuurlijk! Laten we gerust Nederlands praten. Ik zal u corrigeren als u verkeerde grammatica gebruikt. Hoe gaat het vandaag met u?

ChatGPT is a good conversation partner, but it didn't correct my wrong sentences. However, if I write "is this sentence correct: [sentence]", then it will correct it.

emk
Black Belt - 1st Dan
Posts: 1620
Joined: Sat Jul 18, 2015 12:07 pm
Location: Vermont, USA
Languages: English (N), French (B2+)
Badly neglected "just for fun" languages: Middle Egyptian, Spanish.
Language Log: viewtopic.php?f=15&t=723

Re: The new ChatGPT is a polyglot, apparently

Postby emk » Sun Apr 16, 2023 12:23 pm

I'm observing a big difference in results between GPT 3.5-based models (including the free version of ChatGPT) and the new GPT 4-based models (available as the paid version of ChatGPT). For certain college-level tasks, it's gone from 10th to 90th percentile performance. If you prompt it clearly, it can pass a lot of college midterms.

Here's an example of ChatGPT w/GPT 4 in action, doing simple French corrections. Notice how I prompt it.

emk wrote:Good morning! I would like you to help me with correcting short French texts. For example, if I write:

> Je vais à la bibliothèque.

You would respond:

> CORRECT.

If I write:

> Oui, c'est la voiture que j'ai vu!

You would respond:

> You should change "vu" to "vue" here, because...

You should also add any necessary accents and fix any minor errors.

Let's begin with:

J'ai un beau chapeau!

ChatGPT wrote:CORRECT.

emk wrote:Je veux que tu fais quelque chose pour moi.

ChatGPT wrote:You should change "fais" to "fasses" here, because you need to use the subjunctive form of the verb "faire" after "que" in this case. The corrected sentence is:

> Je veux que tu fasses quelque chose pour moi.

This is correct!

The key ideas here are:

  • Be clear about what you want. Do you want basic grammar correction, or do you also want it to reword things in a more natural style?
  • Provide examples! Just like when asking people to do something, it helps to be clear.
  • Make absolutely certain that your examples are correct. If you use errors in your examples, you may get more errors in your responses!
  • Basic good manners don't seem to hurt, I think? These models are trained off of millions of human conversations, and they have a very strong model of how conversations usually proceed. If you're rude to humans, you don't get their best effort. Now, the ChatGPT interface has been trained heavily to always be polite, but slightly different configurations will perform visibly worse if you are rude. If you're rude to ChatGPT, it needs to balance two competing goals: responding usefully, and responding in a way that accurately models real conversations. Certain GPT-based models will actually resort to using polite sarcasm in this case! But using good manners means that the goals of "responding helpfully" and "responding in a way that models real conversations" are in alignment.
  • Long conversations eventually get weird and performance goes down. If ChatGPT seems to be drifting, start a fresh conversation. This is especially important if it starts giving bad corrections, because once you have a few bad corrections in your history, it will use those as examples for what it should do, and it will start generating more bad corrections.
Warning: GPT 4 will still occasionally make up bad advice! "Hallucination" is the technical term. So whenever it explains something to you, you need to double-check it and make sure that you actually understand what's going on. This is a key part of the learning process.

In terms of accuracy, it seems OK to me for short examples? It may give you bad advice, obviously. But if you've ever used online "correct my writing" sites like lang8 back in the day, you'll know that humans are also prone to giving horrible advice and bad corrections. The GPT 4 version of ChatGPT seems to be better at this than random internet people.

Socratic tutor mode

I've also had surprisingly good luck with prompts like this, using ChatGPT 4:

emk wrote:I would like you to act as a Socratic tutor. Please don't directly give me the answer to my problems. Instead, please lead me through the solution process step by step, asking questions that will help me to understand the subject in depth myself.

It is OK to begin by printing an outline of what you would like to teach, and to use that as notes. You can do this by writing it between brackets like {{ What to teach goes here }}. I won't look too closely at your notes. I'll focus on the Socratic questions you ask me.

I am trying to understand AdamW optimizer for neural networks, and why the algorithm works the way it does. I already understand Stochastic Gradient Descent.

Thank you for your help!

Some useful points here:

  • I should consider adding an example of what I'm looking for, though that wasn't necessary in this case.
  • In general, ChatGPT will get a lot smarter if you ask it to "show its work." For example, it's normally pretty bad at math, but if you ask it to show each step, it gets noticeably better. It's usually a bit sloppy about poetry, but if you ask it to write a first draft and a final draft, it will do better. Here, I tell ChatGPT that it can write notes to itself inside "{{ }}" containing a "lesson plan", which allows it to be a more coherent tutor.
Also, ChatGPT isn't great at staying in "Socratic tutor" mode. It will eventually resort to explaining things outright. If it does, you can nudge it back with something like:

emk wrote:Could you please walk me through the moments as a Socratic tutor, like you were doing earlier? Thank you! You can write your "inner thoughts" and teaching plan in {{ }} like before.

GPT 4 is better at quite a few things than GPT 3.5, but at $20/month for 25 messages every 3 hours, it's fairly expensive.

Anyway, I hope that some of these examples help!

Le Baron
Black Belt - 3rd Dan
Posts: 3507
Joined: Mon Jan 18, 2021 5:14 pm
Location: Koude kikkerland
Languages: English (N), fr, nl, de, eo, Sranantongo,
Maintaining: es, swahili.
Language Log: https://forum.language-learners.org/vie ... 15&t=18796

Re: The new ChatGPT is a polyglot, apparently

Postby Le Baron » Sun Apr 16, 2023 1:01 pm

I've worked quite hard at trying to make the instructions and prompts more accurate and more targeted. Originally I tried this on languages I know well, simply so that I can see and follow the process without getting lost. Knowing what is 'correct' helps a lot with watching the process. There is however quite an error margin, which is easy to catch if I'm doing this in English/French/Dutch, but not so easy in e.g. Spanish, Russian, Swahili.

Nevertheless, it does provide the impetus for me to double check and confirm what I am getting in the output. So on the one hand it still does facilitate learning for me and does provide help and assistance, but I can't rely on ChatGPT for accuracy simply because I am not in a strong enough position to confirm what is true or false in a language I'm only learning. For languages where you're in a stronger position it's much better.


Re: The new ChatGPT is a polyglot, apparently

Postby emk » Sun Apr 16, 2023 1:11 pm

jeffers wrote:Getting annoyed, I opened Python IDLE and copied what I typed and what the chat was saying was correct, and got this:

>>>"पढ़ना" == "पढ़ना"
False

So as an experiment, I asked the GPT 4 version of ChatGPT what's going on here:

emk wrote:Someone typed the following into Python:

>>>"पढ़ना" == "पढ़ना"
False

This seems weird! Can you explain what's going on here?

ChatGPT wrote:The Python code you've provided compares two strings and returns False, which is indeed surprising, as the strings look identical. One possible reason for this could be that the two strings are composed of different Unicode characters that visually appear the same, or have different encodings or byte sequences.

To debug this issue, let's look at the Unicode code points for each character in both strings:

Code: Select all

string1 = "पढ़ना"
string2 = "पढ़ना"

for char in string1:
    print(f"{char}: {ord(char)}")

print("---")

for char in string2:
    print(f"{char}: {ord(char)}")

This code will print the Unicode code points for each character in both strings. If there are any differences, you can spot them by comparing the output. If the issue is related to encodings, you can check the byte sequences of both strings using the encode() method:

Code: Select all

print(string1.encode('utf-8'))
print(string2.encode('utf-8'))

If you find any differences in Unicode code points or byte sequences, that's likely the reason for the comparison returning False. To fix the issue, you may need to normalize the strings or use the correct encoding.

emk wrote:If I wanted to make sure that two visually-similar strings compared the same, are there any tricks I can use to make that happen more often?

ChatGPT wrote:Yes, you can use Unicode normalization to make sure that visually similar strings compare the same more often. Unicode normalization is a process that converts equivalent Unicode strings into a standardized form, which can then be compared more easily.

Python's unicodedata module provides normalization functions for this purpose. There are four normalization forms: NFC, NFD, NFKC, and NFKD. NFC (Normalized Form C) and NFD (Normalized Form D) are the most commonly used forms.

Here's an example of how to normalize two visually similar strings and compare them:

Code: Select all

import unicodedata

string1 = "पढ़ना"
string2 = "पढ़ना"

normalized_string1 = unicodedata.normalize('NFC', string1)
normalized_string2 = unicodedata.normalize('NFC', string2)

print(normalized_string1 == normalized_string2)

In this example, we normalize both strings using the NFC form and then compare them. If the strings are visually similar, the normalization process should make them equivalent, and the comparison should return True.

Note that normalization may not cover all edge cases, and you might still encounter issues with certain combinations of characters or encodings. However, it's a good first step to improve the comparison of visually similar strings.

Yup, that's a pretty good summary of why things like this might happen! I haven't double-checked whether the Python code works. Sometimes ChatGPT makes up plausible APIs that don't actually exist.
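
For what it's worth, here is a hedged sketch of how the normalization step could be checked without relying on copy and paste. It assumes (this is a guess about what happened, not something confirmed in the thread) that one string contains U+0922 followed by U+093C and the other contains the single code point U+095D. The function name `same_after_nfc` is invented for this example. `unicodedata.normalize` is part of the Python standard library.

```python
import unicodedata

# Assumed inputs, written as escapes so no tool can re-normalize them.
decomposed = "\u092a\u0922\u093c\u0928\u093e"   # DDHA followed by NUKTA
precomposed = "\u092a\u095d\u0928\u093e"        # single RHA code point


def same_after_nfc(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(decomposed == precomposed)                # False
print(same_after_nfc(decomposed, precomposed))  # True

# U+095D is on Unicode's composition exclusion list, so NFC leaves the
# word in its two-code-point form rather than composing it.
print(len(unicodedata.normalize("NFC", precomposed)))  # 5
```

One consequence of the last line: a tool that normalizes pasted text to NFC would turn the single-character version into the two-character one, not the other way around.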

So if you encounter an issue like this, you might try responding as follows:

ChatGPT wrote:I'm sorry, but the correct spelling is actually पढ़ना. The spelling पढ़ना is not correct. Please try again.

User wrote:Huh, those two words look visually similar to me. Is it possible that we're running into a Unicode normalization problem? What happens if you convert both words to the Unicode normalization form NFC before comparing them? Please show your work by breaking each word down into characters separated by spaces and indicating which letters differ, and how.

Unfortunately, I can't test this, because my browser automatically converts text to a specific Unicode normalization format whenever I paste. So I keep asking ChatGPT "You said that STRING_1_IN_NFC_FORMAT is identical to EXACTLY_THE_SAME_STRING_IN_NFC_FORMAT" and it just gets really confused and starts offering to write Python code to figure things out.

But there's another complicating factor here! Specifically, we need to know how a specific GPT parser "tokenizes" inputs. It might not even be able to see individual letters in some cases, because the pre-processing steps might convert words to a vector embedded in a high-dimensional space. Which is actually pretty fun from a linguistic perspective! If you're familiar with "word2vec", most toy GPT-based systems do something similar. So some of these systems may be totally unable to understand ideas like "the third letter of 'king' is 'n'", because that's not how they "see" words at all. Instead, they might see words in a representation where a statement like "KING - MAN + WOMAN = QUEEN" is a primitive operation. Seriously, word2vec is a fun linguistic game.
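
To make the "KING - MAN + WOMAN = QUEEN" idea concrete, here is a toy sketch. The two-dimensional vectors are invented by hand purely for illustration (one axis loosely meaning "royalty", the other "gender"); real word2vec vectors are learned from text and have hundreds of dimensions, and the function names here are hypothetical.

```python
import math

# Hypothetical hand-made 2-D "embeddings": (royalty, gender).
# These numbers are made up; nothing here comes from a trained model.
vectors = {
    "king":     (1.0,  1.0),
    "queen":    (1.0, -1.0),
    "man":      (0.0,  1.0),
    "woman":    (0.0, -1.0),
    "prince":   (0.9,  1.0),
    "princess": (0.9, -1.0),
}


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c):
    """Return the word whose vector is closest to a - b + c."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    # Exclude the three input words, as analogy demos usually do.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("king", "man", "woman"))  # queen
```

Note that nothing in this representation says anything about the letters k-i-n-g, which is the point being made above.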

But it looks like ChatGPT 3.5 has access to the raw letters:

emk wrote:What are the individual letters in "my dog is happy"? Please list them separated by spaces!

ChatGPT 3.5 wrote:The individual letters in "my dog is happy" are:

m y d o g i s h a p p y

So there's probably some way to get ChatGPT to handle NFC vs NFD normalization, by using a sufficiently clever prompt. But it's probably going to be an exercise in mutual "confusion" between you and the language model, because (1) normalization is just kinda confusing, and (2) the way these models typically "see" the world may involve some complicated pre-processing steps.

mattmo
White Belt
Posts: 34
Joined: Tue Feb 28, 2023 2:27 pm
Languages: English (N), Cantonese Chinese (C1), Mandarin Chinese (B1)

Re: The new ChatGPT is a polyglot, apparently

Postby mattmo » Wed Apr 19, 2023 2:34 pm

ChatGPT is very impressive. I wonder how much it will change the landscape around us in the future, and how much we will use it for language learning too.

lichtrausch
Blue Belt
Posts: 511
Joined: Thu Jul 23, 2015 3:21 pm
Languages: English (N), German, Japanese, Mandarin, Korean

Re: The new ChatGPT is a polyglot, apparently

Postby lichtrausch » Fri May 05, 2023 2:58 am

The AI revolution is leaving Arabic speakers behind

A tool like ChatGPT has yet to be trained on a massive amount of high-quality, diverse and representative Arabic written data. Its lack of data makes the tool's results in Arabic unable to distinguish the nuances, accuracy and depth needed to generate quality content.

Many will say: "But the problem is that the Arabic language is complex, and it is difficult to comprehend all its unique grammar, syntax and vocabulary."

So I say: "Mankind has been able to send a man to the moon, but it cannot teach a chatbot Arabic grammar rules?"[...]

Some AI-driven video-making tools, for example, are still unable to properly regulate the direction of writing in Arabic. Imagine the extreme frustration felt by every journalist or content creator when he/she wants to start the title of the article with a number, only to find that the words overlapped, their positions were reversed and the title became incomprehensible.

Missed opportunities mean less ability to use these tools to create high-quality content, less ability to achieve significant reach, and less ability to contribute to the growth and development of the Arabic content industry ecosystem.


Re: The new ChatGPT is a polyglot, apparently

Postby lichtrausch » Tue May 09, 2023 3:58 am
