Bilinual, free interlinear books

All about language programs, courses, websites and other learning resources
bilinual
Posts: 3
Joined: Mon May 18, 2020 11:12 am
Languages: Swedish, English
x 2

Bilinual, free interlinear books

Postby bilinual » Mon May 18, 2020 12:02 pm

Hello, I am Hamid. Being stuck at home during the coronavirus quarantine, I worked on a website to improve my reading and vocabulary skills by reading books that are annotated with translation hints. A few of the books even have audio. Maybe it will also be interesting for you, whether you are stuck at home or just interested in language learning.
The objective of the Bilinual project is not to translate books but to annotate them with in-context translation and language hints. When I use the website, I do not rely on the annotations but use these hints and clues to actively guess the right translation. Bilinual takes the burden of looking up words off me and provides the needed information through interaction/distraction-free interfaces. No extra steps to check the dictionary, no distraction from a full-text translation, just smart word-by-word translation hints!

Books listed on the main page are complete books, and you can browse them at no cost and without limitation; no registration is needed. As of now, there are no paid features, but I would appreciate it if you registered for the mailing list and shared the link with your friends.
Please click the link below to try some of the free classic books in the Bilinual library.

It is possible to add your own preferred text or import books from Project Gutenberg in this link.

Please let me know if you have any requests or suggestions.
0 x

mcthulhu
Orange Belt
Posts: 228
Joined: Sun Feb 26, 2017 4:01 pm
Languages: English (native); strong reading skills - Russian, Spanish, French, Italian, German, Serbo-Croatian, Macedonian, Bulgarian, Slovene, Farsi; fair reading skills - Polish, Czech, Dutch, Esperanto, Portuguese; beginner/rusty - Swedish, Norwegian, Danish
x 590

Re: Bilinual, free interlinear books

Postby mcthulhu » Mon May 18, 2020 8:53 pm

It's a nice idea, and thank you for doing it. The selection of books seems pretty good. I think the annotations could possibly use some more work, however. I took a quick look at Die Leiden des jungen Werther, and I'm not sure how "Des Moines" got to be the annotation for every occurrence of "des." I'm pretty sure that is an incorrect translation... "Go" doesn't seem to be a useful annotation for "die," nor does "Leyden" for "Leiden"; and that's just in the title of the book. It's difficult to believe that these were proofread and corrected by a person with a PhD, as the Web site says.

Further on, there are a lot of gaps, with function words annotated, sort of, while more important nouns are left unannotated - e.g. in the phrase "vom Schicksal," the first word has an annotation but the noun, which is more likely to be unknown, does not.

A lot of annotations are just wrong or random - for instance, at the beginning of the English-Spanish version of Pride and Prejudice, in the sentence starting with "It is a truth universally acknowledged...," the word "truth" is annotated as "Dios" instead of "verdad," as I would have expected. The surname Bennet is annotated as "garofilea," which seems to be the Spanish translation of some herb called "bennet" in English; that's just confusing, as is the translation of Mrs. Long's name as the verb "ansiar."

The next and previous buttons don't seem to work correctly; I was on page 7 and clicked the previous button, and was taken to the home page instead of page 6.
9 x

iguanamon
Black Belt - 2nd Dan
Posts: 2363
Joined: Sat Jul 18, 2015 11:14 am
Location: Virgin Islands
Languages: Speaks: English (Native); Spanish (C2); Portuguese (C2); Haitian Creole (C1); Ladino/Djudeo-espanyol (C1); Lesser Antilles French Creole (B2)
Studies: Catalan (B2)
Language Log: viewtopic.php?t=797
x 14269

Re: Bilinual, free interlinear books

Postby iguanamon » Tue May 19, 2020 12:18 am

First, mods, did you all allow this post? Second, many of the English notations for "Platero y yo" are so wrong that I don't get, given the premise of the site, how this would help a learner to read with ease or learn the correct translation.
Bilinual.com- about wrote: What is the challenge?
Words can have different meanings. For example, if you give someone directions, you would say “Turn left at the first intersection, then turn right.” However, you might use the same word in a different context and meaning: “Freedom of speech is a basic human right.” Or, for some words, even another completely different meaning: “The algorithm needs to decide which translation is correct to display the right words.”. The tools aims to eliminate the extra step of looking up words in a dictionary and finding the best translation so that the readers can instead focus on reading and enjoying the book itself.

ADVERTENCIA Á LOS HOMBRES QUE LEAN ("lean" is read, not run) ESTE LIBRO PARA NIÑOS ("hombres" is men, not cat).

So, how would this help me to read if I didn't know Spanish well? I'd have to look up these words and no dictionary would have these meanings for these words in English. Also, "niños" is notated later in the paragraph, which makes me wonder why it wasn't notated when it first appeared. If the word is supposed to be unknown, how would it help me to see the notation the second time the word appears?

There are other glaring errors in the text and the notations. As far as I am aware, Spanish does not have a solo accented "á". (edited to reflect member Ser's observation that this was correct at the time the book was written). "No" notated as "nope"... good grief! "Niños" does not mean "baby". The conjugation of poner- "pongo" should not be translated as "send". There are other egregious errors that should not be there. Frankly, I don't see how this, in its current form, would be of any benefit to a learner. They would be constantly looking up words because the notations just don't make sense within the context.
Bilinual.com- about wrote: How does Bilinual service works?
Proper annotations are selected by a machine learning algorithm that is powered by Google Word2Vec and SpaCy/Gensim/NLTK libraries. All books are in the public domain in the U.S and are available as a part of "Project Gutenberg".
This project was not possible without these open projects and libraries: Fasttext, SpaCy, Gensim, Numpy, Flask, NLTK WordNet, Celery, SQLite, jQuery.
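
For illustration only, here is a minimal sketch of the kind of context-based selection the About page describes: pick, from several candidate glosses, the one whose vector is closest to the averaged vectors of the surrounding words. This is not Bilinual's actual code, which is not shown in this thread. The names `VECTORS`, `CANDIDATES` and `pick_gloss` are hypothetical, the three-dimensional vectors are made-up numbers, and the sketch assumes that source and target words share a single vector space, which a real system would have to obtain from trained, aligned embeddings.

```python
import math

# Hypothetical toy embeddings. All numbers are invented for illustration;
# a real system would load trained vectors instead.
VECTORS = {
    "turn":     [0.9, 0.1, 0.0],
    "left":     [0.8, 0.2, 0.0],
    "freedom":  [0.0, 0.9, 0.1],
    "human":    [0.1, 0.8, 0.2],
    "derecha":  [0.9, 0.1, 0.1],  # "right" as a direction
    "derecho":  [0.1, 0.9, 0.1],  # "right" as an entitlement
    "correcto": [0.1, 0.2, 0.9],  # "right" as "correct"
}

# Candidate Spanish glosses for the English word "right" (assumed list).
CANDIDATES = ["derecha", "derecho", "correcto"]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def context_vector(words):
    """Average the vectors of the context words we have vectors for."""
    known = [VECTORS[w] for w in words if w in VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def pick_gloss(context_words, candidates):
    """Return the candidate gloss most similar to the context.

    Falls back to the first candidate when no context word is known.
    """
    ctx = context_vector(context_words)
    if ctx is None:
        return candidates[0]

    def score(gloss):
        vec = VECTORS.get(gloss)
        return cosine(ctx, vec) if vec is not None else -1.0

    return max(candidates, key=score)


print(pick_gloss(["turn", "left", "at", "the"], CANDIDATES))      # derecha
print(pick_gloss(["freedom", "of", "speech", "human"], CANDIDATES))  # derecho
```

With vectors this clean the right sense wins easily. With sparse or poorly aligned real-world vectors, the nearest candidate can be an unrelated word, which may be one way glosses like the ones reported in this thread arise.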

Anyone can do better with DeepL or Google Translate or Bing Translate. The developer should go back to the drawing board.
Last edited by iguanamon on Tue May 19, 2020 11:48 am, edited 1 time in total.
3 x

Querneus
Blue Belt
Posts: 841
Joined: Thu Dec 01, 2016 5:28 am
Location: Vancouver, Canada
Languages: Speaks: Spanish (N), English
Studying: Latin, French, Mandarin
x 2287

Re: Bilinual, free interlinear books

Postby Querneus » Tue May 19, 2020 2:11 am

iguanamon wrote:There are other glaring errors in the text and the notations. As far as I am aware, Spanish does not have a solo accented "á".

It did between the 17th century (or so) and the early 20th century. So this simply reflects the public domain book that the text was taken from.

iguanamon wrote:Anyone can do better with DeepL or Google Translate or Bing Translate. The developer should go back to the drawing board.

I am similarly saddened that automatizing the creation of these pages with Google Translate would produce better results, yes.
1 x

Gordafarin2
Orange Belt
Posts: 161
Joined: Wed Aug 22, 2018 10:53 am
Languages: English (N)
Current focus: Mandarin (A2), Italian (A2)
Maintaining: Persian (B2), Esperanto (B2), Spanish (rusty B1-2)
Dabbled: ASL, French
Language Log: https://forum.language-learners.org/vie ... 15&t=17156
x 557

Re: Bilinual, free interlinear books

Postby Gordafarin2 » Tue May 19, 2020 10:11 am

The design of your site looks very nice. I'm at a rusty intermediate level of Spanish where annotating words like this would, in theory, be really helpful.

But I'm so utterly confused by the 'translations' of some of these words that I can't even focus on the Spanish. Where on earth did you get 'cat' for hombres (=men)? Or 'play' for sé (=I know)?

I know single word-for-word translations are hard to automate, but these are far from the most common or likely translations of those words. I'm honestly perplexed.

es interlinear.PNG


I'm sorry to say I had to laugh at this part in the About page
Words can have different meanings. For example, if you give someone directions, you would say “Turn left at the first intersection, then turn right.” However, you might use the same word in a different context and meaning: “Freedom of speech is a basic human right.” Or, for some words, even another completely different meaning: “The algorithm needs to decide which translation is correct to display the right words.”. The tools aims to eliminate the extra step of looking words in dictionary and finding the best translation so that the readers can instead focus on reading and enjoying the book itself.



I think the idea holds a lot of promise, but if it can't outperform any of the number of "click to Google translate" tools out there, it's not useable yet. It needs a better algorithm or some human editing. Or at least a pop-up dictionary that can bring up a full list of definitions, so I can figure out what the word is really supposed to mean when the app gives a howler translation.
pregnant forest.PNG

:?
3 x

Serpent
Black Belt - 3rd Dan
Posts: 3657
Joined: Sat Jul 18, 2015 10:54 am
Location: Moskova
Languages: heritage
Russian (native); Belarusian, Polish

fluent or close: Finnish (certified C1), English; Portuguese, Spanish, German, Italian
learning: Croatian+, Ukrainian; Romanian, Galician; Danish, Swedish; Estonian
exploring: Latin, Karelian, Catalan, Dutch, Czech, Latvian
x 5181

Re: Bilinual, free interlinear books

Postby Serpent » Tue May 19, 2020 1:58 pm

iguanamon wrote:First, mods, did you all allow this post?
Yes :?
Sorry, I checked the Swedish story first and didn't notice any obvious errors (though now I found some). I was too focused on whether the stories are samples or complete ones, and whether you can read everything for free :?
1 x

bilinual
Posts: 3
Joined: Mon May 18, 2020 11:12 am
Languages: Swedish, English
x 2

Re: Bilinual, free interlinear books

Postby bilinual » Thu May 21, 2020 12:55 pm

mcthulhu wrote:It's a nice idea, and thank you for doing it. The selection of books seems pretty good. I think the annotations could possibly use some more work

Gordafarin2 wrote:The design of your site looks very nice. I'm at a rusty intermediate level of Spanish where annotating words like this would, in theory, be really helpful.

Thanks for being positive about the project and I am very glad you like the idea.

mcthulhu wrote:I took a quick look at Die Leiden des jungen Werther, and I'm not sure how "Des Moines" got to be the annotation for every occurrence of "des." I'm pretty sure that is an incorrect translation... "Go" doesn't seem to be a useful annotation for "die," nor does "Leyden" for "Leiden"; and that's just in the title of the book.


Thanks everyone for pointing out the annotation problems. You are definitely right. The annotations are not manual but done entirely using an ML algorithm that I coded. I don't mean to blame the algorithm, but I would like to emphasize that this is the first try and that, with more data and a little bit of tuning, I expect to get better results in the future.

mcthulhu wrote:It's difficult to believe that these were proofread and corrected by a person with a PhD, as the Web site says.


Annotations are not done by a person; sorry for the confusion. The books are taken directly and automatically from the Project Gutenberg website. The sentence you are referring to is part of the book's text file (https://www.gutenberg.org/cache/epub/2407/pg2407.txt). You are right. It is indeed confusing, and I will make a note to get this line removed from the book.

mcthulhu wrote:The next and previous buttons don't seem to work correctly; I was on page 7 and clicked the previous button, and was taken to the home page instead of page 6.

Thanks. This is a bug that I will fix soon.

iguanamon wrote:First, mods, did you all allow this post?

iguanamon wrote:Anyone can do better with DeepL or Google Translate or Bing Translate. The developer should go back to the drawing board.

Querneus wrote:I am similarly saddened that automatizing the creation of these pages with Google Translate would produce better results, yes.


Guys, please calm down. I am aware of the obvious annotation problems. I am the developer, designer, and data scientist in this project using free/open datasets to train the models for all these languages on my laptop during my free time paying for the hosting service from my own money. I want to assure you that I am aware that my project is not comparable with Google/Bing translate, but thanks for reminding me.
0 x

Doitsujin
Green Belt
Posts: 404
Joined: Sat Jul 18, 2015 6:21 pm
Languages: German (N)
x 807

Re: Bilinual, free interlinear books

Postby Doitsujin » Thu May 21, 2020 5:55 pm

bilinual wrote:Guys, please calm down. I am aware of the obvious annotation problems. I am the developer, designer, and data scientist in this project using free/open datasets to train the models for all these languages on my laptop during my free time paying for the hosting service from my own money. I want to assure you that I am aware that my project is not comparable with Google/Bing translate, but thanks for reminding me.
The problem is that you've presented your website as a language learning tool, which it isn't, because none of your bilingual books contains useful interlinear translations. It's also not exactly a novel idea, because there are lots of other websites out there who offer Public Domain books with machine-translated annotations/translations.
3 x

bilinual
Posts: 3
Joined: Mon May 18, 2020 11:12 am
Languages: Swedish, English
x 2

Re: Bilinual, free interlinear books

Postby bilinual » Thu May 21, 2020 8:11 pm

Thanks for your comment Doitsujin.
Doitsujin wrote:It's also not exactly a novel idea, because there are lots of other websites out there who offer Public Domain books with machine-translated annotations/translations.

I never claimed the project to be novel.

I would like to emphasize again, this is the first try and with more data and a little bit of tuning, I expect to get better results in the future.
2 x

