Tool to convert conjugated words to the infinitive form

mverse
Posts: 8
Joined: Sun Nov 15, 2020 7:29 pm
Languages: English (N)
Learning: French (beginner)
Dabbled: Korean
School classes long forgotten: Spanish
x 16

Tool to convert conjugated words to the infinitive form

Postby mverse » Fri Nov 20, 2020 10:53 pm

Hi everyone,

I'm making a word list from a French book I have in text form. Before trying to read the book, I want to use Anki to learn the couple thousand most frequent words in it that I don't yet know. I already have a list of unique words, but some are conjugated. Does anyone know of a tool that takes a list of conjugated French words and returns their infinitive forms?

EDIT: I already know about websites where I can enter each word individually, but I'm looking for something I can upload a list to and get a list back from, rather than doing all that manual work.
0 x

Adrianslont
Blue Belt
Posts: 770
Joined: Sun Aug 16, 2015 10:39 am
Location: Australia
Languages: English (N), Learning Indonesian and French
x 1651

Re: Tool to convert conjugated words to the infinitive form

Postby Adrianslont » Sun Nov 22, 2020 9:20 am

I believe you are looking for a lemmatiser. Have a google for French lemmatiser and you might find one.

Sorry I can't be more helpful. I have never used one and I'm not attracted to this approach to learning, but I noticed no one had chimed in with a specific suggestion.

I’d be interested to hear details about your experience later, though.
0 x

Doitsujin
Orange Belt
Posts: 237
Joined: Sat Jul 18, 2015 6:21 pm
Languages: German (N)
x 426

Re: Tool to convert conjugated words to the infinitive form

Postby Doitsujin » Sun Nov 22, 2020 11:10 am

mverse wrote:EDIT: I already know about websites where I can enter each word individually, but I'm looking for something I can upload a list to and get a list back from, rather than doing all that manual work.

I don't think that such a tool exists, but if you have basic Python programming skills or are interested in acquiring them, you might be able to write a custom NLTK 3 script for this task.

There are also a couple of stand-alone Python libraries that you could use, for example:
Pattern (supports Dutch, English, Spanish, German, French and Italian.)

You also might find the French Verb Conjugation Rules library helpful, which comes with a stand-alone Windows conjugation app. (It's in the FrenchVerbWorkshop\FrenchVerbWorkshop\bin\Debug folder.)

There's also a website with inflection lists for German, English, Spanish, French, Italian, Portuguese and Russian that you might be able to use.

For more links, also see the list of natural language processing resources and tools topic.
3 x

mverse

Re: Tool to convert conjugated words to the infinitive form

Postby mverse » Sun Nov 22, 2020 6:33 pm

Thank you for the help, Doitsujin and Adrianslont!

For future readers, I also found this library: https://mlconjug3.readthedocs.io/en/latest/.
0 x

白田龍
Orange Belt
Posts: 175
Joined: Wed Mar 21, 2018 6:54 pm
Languages: English, Portuguese, Spanish, Catalan, French, Persian, Arabic, Mandarin, Japanese.
x 280

Re: Tool to convert conjugated words to the infinitive form

Postby 白田龍 » Sun Nov 22, 2020 8:29 pm

There's http://lexique.org, which you can query online.

There are also some lemmatization dictionaries here:
https://github.com/michmech/lemmatization-lists

Using them in Python is trivial:


# Build a lookup table from inflected form to lemma.
# Each line of the file is "lemma<TAB>inflected form".
lemmaDict = {}
# utf-8-sig reads plain UTF-8 and also drops a byte-order mark if one is present
with open('lemmatization-es.txt', encoding='utf-8-sig') as f:
    for line in f:
        parts = line.rstrip('\r\n').split('\t')
        if len(parts) > 1:
            lemmaDict[parts[1]] = parts[0]

def lemmatize(word):
    # Words missing from the list come back with a trailing '*'
    return lemmaDict.get(word, word + '*')

def test():
    for word in ['salió', 'usuarios', 'abofeteéis', 'diferenciando', 'diferenciándola']:
        print(lemmatize(word))

test()



If you can't find your way around it, paste your word list at https://pastebin.com/ and I can lemmatize it for you.
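
For the use case in the first post (a whole word list in, a list of base forms out), the same lookup could be wrapped roughly as below. This is only a sketch: the function names are made up for illustration, the sample rows are hand-written stand-ins for the real file (presumably lemmatization-fr.txt from that repository for French), and lowercasing the input is an assumption that may not suit proper nouns.

```python
import io

def load_lemma_dict(lines):
    """Build an inflected-form -> lemma mapping from 'lemma<TAB>form' lines."""
    lemma_dict = {}
    for line in lines:
        parts = line.rstrip("\r\n").split("\t")
        if len(parts) > 1:
            lemma_dict[parts[1]] = parts[0]
    return lemma_dict

def lemmatize_words(words, lemma_dict):
    """Return unique lemmas in first-seen order, plus the words not found."""
    lemmas, unknown, seen = [], [], set()
    for word in words:
        key = word.strip().lower()  # assumption: the list stores lowercase forms
        if not key:
            continue
        lemma = lemma_dict.get(key)
        if lemma is None:
            unknown.append(key)
        elif lemma not in seen:
            seen.add(lemma)
            lemmas.append(lemma)
    return lemmas, unknown

# Hand-written sample rows standing in for the real lemmatization file
sample = io.StringIO("aller\tvais\naller\tallons\nêtre\tsuis\n")
lemma_dict = load_lemma_dict(sample)
lemmas, unknown = lemmatize_words(["vais", "Allons", "suis", "zzz"], lemma_dict)
print(lemmas)   # ['aller', 'être']
print(unknown)  # ['zzz']
```

To run it on real data, pass an open file (for example `open(path, encoding="utf-8-sig")`) to `load_lemma_dict` and the lines of the word list to `lemmatize_words`. Note that a plain dictionary keeps only one lemma per form, so an ambiguous form such as "suis" (être or suivre) resolves to whichever row comes last in the file.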
4 x

mverse

Re: Tool to convert conjugated words to the infinitive form

Postby mverse » Sun Nov 22, 2020 9:01 pm

Amazing! Thank you, 白田龍! That is exactly what I was looking for. I've used Python before, so I'm good to go.
0 x

mcthulhu
Orange Belt
Posts: 207
Joined: Sun Feb 26, 2017 4:01 pm
Languages: English (native); strong reading skills - Russian, Spanish, French, Italian, German, Serbo-Croatian, Macedonian, Bulgarian, Slovene, Farsi; fair reading skills - Polish, Czech, Dutch, Esperanto, Portuguese; beginner/rusty - Swedish, Norwegian, Danish
x 553

Re: Tool to convert conjugated words to the infinitive form

Postby mcthulhu » Mon Nov 23, 2020 5:18 am

I've been using the Stanza Python library from Stanford NLP lately for lemmatization in Jorkens. I'm sending a chapter's worth of words at a time to be lemmatized, and the response time is pretty good. FWIW, Jorkens can generate a lemmatized word frequency list for the current book and save it in .csv format.
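
A lemmatized frequency list of that kind could be sketched with Stanza roughly as follows. This is not Jorkens's actual code; the function names and the CSV layout are assumptions for illustration, and `stanza_lemmas` needs the stanza package installed plus a one-time model download.

```python
import csv
from collections import Counter

def stanza_lemmas(text, lang="fr"):
    """Lemmatize running text with Stanza and return the lemmas in order."""
    import stanza  # imported lazily so the helpers below work without it
    stanza.download(lang)  # fetches the model files if they are missing
    nlp = stanza.Pipeline(lang, processors="tokenize,mwt,pos,lemma")
    doc = nlp(text)
    return [w.lemma for sent in doc.sentences for w in sent.words if w.lemma]

def lemma_frequencies(lemmas):
    """Count lemmas case-insensitively, skipping tokens with no letters."""
    counts = Counter(
        lemma.lower() for lemma in lemmas if any(ch.isalpha() for ch in lemma)
    )
    return counts.most_common()

def save_csv(rows, path):
    """Write (lemma, count) rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["lemma", "count"])
        writer.writerows(rows)

# Demonstration on already-lemmatized input, so no model is needed here
rows = lemma_frequencies(["être", "aller", "être", ",", "Aller", "être"])
print(rows)  # [('être', 3), ('aller', 2)]
```

With a model available, `save_csv(lemma_frequencies(stanza_lemmas(chapter_text)), "lemmas.csv")` would produce the file, where `chapter_text` is a hypothetical string holding the text to process.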

There's also TreeTagger, which supports French and can be used as a stand-alone program without Python. It can handle bulk input, though you have to put each word on a separate line. You can also find at least one JavaScript library on GitHub that is for French lemmatization: https://github.com/bastienbot/nlp-js-tools-french, which gave me pretty good results.
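
For TreeTagger, preparing the input and reading the result back could look something like this. It is a rough sketch under assumptions: that the output has three tab-separated columns (token, tag, lemma) and that unknown lemmas are marked `<unknown>`, which is worth checking against your own output. The function names and the sample output are hand-written for illustration.

```python
def to_one_per_line(words):
    """Format a word list as TreeTagger-style input: one word per line."""
    return "\n".join(w.strip() for w in words if w.strip()) + "\n"

def parse_treetagger(output):
    """Parse 'token<TAB>tag<TAB>lemma' lines into (token, lemma) pairs."""
    pairs = []
    for line in output.splitlines():
        cols = line.split("\t")
        if len(cols) != 3:
            continue  # skip blank or malformed lines
        token, _tag, lemma = cols
        if lemma == "<unknown>":
            lemma = token  # assumption: fall back to the surface form
        pairs.append((token, lemma))
    return pairs

print(repr(to_one_per_line(["vais", " allons ", ""])))  # 'vais\nallons\n'

# Hand-written sample, not real TreeTagger output
sample_output = "vais\tVER:pres\taller\nzzz\tNOM\t<unknown>\n"
print(parse_treetagger(sample_output))  # [('vais', 'aller'), ('zzz', 'zzz')]
```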

Another way to do it is with a finite state transducer (software with rules to convert a set of strings into another set of strings, basically). See https://sourceforge.net/projects/hfst/f ... ansducers/, which includes one for French.

As you can see from the other responses, there are a lot of ways to do this.
3 x

