
Re: New subforum? Language software development

Posted: Fri Jan 06, 2017 5:11 pm
by mrwarper
Will be watching this thread with interest ;)

tommus wrote:I think we need a more targeted objective(s) for such a sub-forum. [...]
Let me propose one topic that has wide applicability.
[...]
So has anyone found effective ways to approach the challenge of character encoding in language learning software?

I think this is too broad a question. Character encoding is not that difficult to understand, but the challenges it poses vary widely depending on exactly what one is trying to do, and on all sorts of other circumstances...

Re: New subforum? Language software development

Posted: Fri Jan 06, 2017 7:36 pm
by Doitsujin
tommus wrote:I have tried to find good tutorials to help make this all clear and understandable, with no satisfactory results. Character encoding sounds relatively easy until you actually try using it in serious applications. For example, to work with applications that use articles from the Dutch Wikipedia, one might use ISO-8859-1 which is generally used to encode western European languages, but there are many, many articles, with people and place names for example, that fail to display properly.


Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.
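
To make the problem tommus describes concrete, here is a minimal sketch in plain Python (standard library only, no Beautiful Soup). The sample string is made up for illustration. Wikipedia serves its pages as UTF-8, so decoding those bytes as ISO-8859-1 never raises an error but silently garbles every non-ASCII character, and some names cannot be represented in ISO-8859-1 at all.

```python
# Made-up sample text; "Łódź" is a Polish place name of the kind that
# turns up in articles about people and places.
text = "Curaçao, Łódź, São Paulo"

# Bytes as a UTF-8 encoded page would deliver them.
raw = text.encode("utf-8")

# Decoding with the wrong codec does not fail, it just produces mojibake.
wrong = raw.decode("iso-8859-1")
# Decoding with the codec the bytes were written in round-trips cleanly.
right = raw.decode("utf-8")

print(repr(wrong))
print(repr(right))

# ISO-8859-1 has no code point for "Ł" or "ź", so encoding fails outright.
try:
    "Łódź".encode("iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False

print("Łódź fits in ISO-8859-1:", representable)
```

Beautiful Soup takes the guessing off your hands: if you pass it the raw bytes, it works out the encoding itself, hands back Unicode strings, and records its guess in `original_encoding`.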

You also might want to look into building HTML5/CSS/JavaScript-based web apps. These have the added advantage that you might be able to convert them to apps for mobile devices with toolkits such as PhoneGap.

Re: New subforum? Language software development

Posted: Fri Jan 06, 2017 9:08 pm
by tommus
Doitsujin wrote:Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.

Thank you very much! That information looks to be very useful for processing character encoding. And because it is Python, that is a good reason to further develop my use of Python. Computer programming is much like language learning; you are constantly being attracted by other interesting "languages" that are too interesting to ignore.

Discussions and suggestions like these are good examples of the potential of this proposed sub-forum on language software development. Language learning and computer programming make a great match: each supports the other and, perhaps equally important, each provides very good incentive and motivation to sustain an interest in both.

Re: New subforum? Language software development

Posted: Fri Jan 06, 2017 10:09 pm
by galaxyrocker
Doitsujin wrote:Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.



I use BeautifulSoup for a little desktop app I have that searches Irish language dictionaries, and it's absolutely wonderful for what it does, also covering the accents and such.

Re: New subforum? Language software development

Posted: Mon Jan 09, 2017 12:27 pm
by DangerDave2010
I think we should have the new sub-forum. Surely we can discuss our projects in other sections, but these discussions will soon be flooded into oblivion. If we have a dedicated section, over the years, this will bring about a great revolution in computer-assisted language learning, as newcomers will have a place to learn from the accumulated corpus of experience.

Re: New subforum? Language software development

Posted: Thu Jan 19, 2017 4:48 pm
by Whodathunkitz
For Tatoeba, I think I just downloaded the full list of sentences in Cebuano (1200-ish?).

Possibly downloaded a join list between sentences/language.

I looked at the languages they linked to (somehow).

Mostly German, some English.

Downloaded German and English.

Used PowerShell to trim the files and MS Access to join them (possibly used PowerShell for the join too).

Concatenated the German and English.

Imported into Google Sheets and added a GOOGLETRANSLATE function (for German to English), sorted by length (ascending), uniqued.

Manipulated into Memrise bulk import format and imported into an English>Cebuano course (German in note field) and a German>Cebuano course (English in note field or ignored?).

http://www.memrise.com/course/1185245/1 ... sentences/
http://www.memrise.com/course/1185300/deutsch-cebuano/
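
For anyone who would rather script it, the steps above (join sentences to their translations, keep one source language, de-duplicate, sort by length) can be sketched in a few lines of Python. This is only a rough sketch under assumptions: it assumes tab-separated exports laid out as `id, lang, text` for sentences and `id, translation_id` for links, the sample rows are made up, and `read_tsv` and `build_pairs` are hypothetical helper names. It prints plain tab-separated lines and does not try to reproduce the exact Memrise bulk import format.

```python
import csv
import io

# Made-up rows in the assumed export layout (tab-separated).
SENTENCES_TSV = (
    "1\tceb\tMaayong buntag.\n"
    "2\teng\tGood morning.\n"
    "3\tdeu\tGuten Morgen.\n"
    "4\tceb\tSalamat.\n"
    "5\teng\tThank you.\n"
)
LINKS_TSV = "1\t2\n1\t3\n4\t5\n"


def read_tsv(text):
    """Parse tab-separated text into a list of rows, without quote handling."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def build_pairs(sentences_tsv, links_tsv, source="ceb", targets=("eng", "deu")):
    """Join source-language sentences to their translations.

    Returns unique (source_text, target_lang, target_text) tuples,
    sorted by the length of the source sentence, shortest first.
    """
    sentences = {sid: (lang, text) for sid, lang, text in read_tsv(sentences_tsv)}
    pairs = set()  # a set drops duplicate rows
    for src_id, dst_id in read_tsv(links_tsv):
        if src_id not in sentences or dst_id not in sentences:
            continue  # link points at a sentence we did not load
        src_lang, src_text = sentences[src_id]
        dst_lang, dst_text = sentences[dst_id]
        if src_lang == source and dst_lang in targets:
            pairs.add((src_text, dst_lang, dst_text))
    # Ties on length fall back to ordinary tuple ordering.
    return sorted(pairs, key=lambda p: (len(p[0]), p))


pairs = build_pairs(SENTENCES_TSV, LINKS_TSV)
for ceb, lang, translation in pairs:
    print(f"{ceb}\t{translation}\t{lang}")
```

With real files you would read them with `open(path, encoding="utf-8")` instead of the inline strings; everything else stays the same.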

Re: New subforum? Language software development

Posted: Tue Feb 07, 2017 11:28 am
by smallwhite
Cainntear wrote:
smallwhite wrote:And what about apps in beta or apps that are published with certain functions still under development, which of the two subforums would they belong to?

If it's a matter for users, then the user-orientated forum (language programs and resources) -- if it's about software development, the development forum.


Is it correct for this new thread to be in the subforum "Development Area" as opposed to "Language Programs and Resources"?
A program to learn to write numbers in different languages

the OP of which wrote:I just wrote this program for learning to write cardinal numbers, for fun and also because online converters make mistakes and therefore propagate false information about languages.
I am counting on native speakers to check whether there are errors.

PS: There is an encoding problem with Japanese in Translate mode.

EnToutesLettres.zip