New subforum? Language software development

Small area for language-related software developers. If you have a feature request, please put it in the appropriate place. This area is for developers of language software and forum software development only.
mrwarper
Orange Belt
Posts: 106
Joined: Sat Jul 18, 2015 4:06 pm
Languages: A bunch, in various stages
Language Log: http://how-to-learn-any-language.com/fo ... ?TID=39905
x 151

Re: New subforum? Language software development

Postby mrwarper » Fri Jan 06, 2017 5:11 pm

Will be watching this thread with interest ;)

tommus wrote:I think we need a more targeted objective(s) for such a sub-forum. [...]
Let me propose one topic that has wide applicability.
[...]
So has anyone found effective ways to approach the challenge of character encoding in language learning software?

I think this is too broad a question. Character encoding is not that difficult to understand, but the challenges it poses are very variable depending on what one tries to do exactly, and all sorts of other circumstances...
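To make one of those circumstances concrete, here is a minimal sketch, assuming the text contains characters that ISO-8859-1 cannot represent. The sample name is only an illustration; it uses nothing beyond the Python standard library.

```python
# Hypothetical sample text: a place name with characters outside ISO-8859-1.
name = "Łódź"

# UTF-8 can encode any Unicode text.
utf8_bytes = name.encode("utf-8")

# Decoding those bytes as ISO-8859-1 never raises an error, because every
# byte value maps to some character; it silently produces garbled text.
garbled = utf8_bytes.decode("iso-8859-1")

# Encoding in the other direction fails outright: ISO-8859-1 has no "Ł".
try:
    name.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(repr(garbled))                               # mojibake, one character per byte
print(utf8_bytes.decode("utf-8") == name, latin1_ok)
```

So the same bytes are either fine or garbage depending purely on which codec the program assumes, which is why the answer depends so much on where the text comes from.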
0 x
MrWarper while HTLAL is offline.

Doitsujin
Green Belt
Posts: 425
Joined: Sat Jul 18, 2015 6:21 pm
Languages: German (N)
x 860

Re: New subforum? Language software development

Postby Doitsujin » Fri Jan 06, 2017 7:36 pm

tommus wrote:I have tried to find good tutorials to help make this all clear and understandable, with no satisfactory results. Character encoding sounds relatively easy until you actually try using it in serious applications. For example, to work with applications that use articles from the Dutch Wikipedia, one might use ISO-8859-1 which is generally used to encode western European languages, but there are many, many articles, with people and place names for example, that fail to display properly.


Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.
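A minimal sketch of what that looks like, assuming the `beautifulsoup4` package is installed and the page declares its encoding. The HTML here is a made-up stand-in for a downloaded page, not real Wikipedia markup.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Hypothetical page, held as raw UTF-8 bytes the way a download would deliver it.
html_bytes = (
    '<html><head><meta charset="utf-8"><title>Łódź</title></head>'
    "<body><p>Curaçao en Łódź</p></body></html>"
).encode("utf-8")

# Passing bytes (not str) lets Beautiful Soup work out the encoding itself.
soup = BeautifulSoup(html_bytes, "html.parser")

text = soup.p.get_text()        # an ordinary Python str, i.e. Unicode
print(text)
print(soup.original_encoding)   # the encoding Beautiful Soup settled on
```

If a site declares the wrong encoding, the `from_encoding` argument of `BeautifulSoup` can be used to override the guess.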

You might also want to look into building HTML5/CSS/JavaScript-based web apps. These have the added advantage that you may be able to convert them into apps for mobile devices with toolkits such as PhoneGap.
3 x

tommus
Blue Belt
Posts: 972
Joined: Sat Jul 04, 2015 3:59 pm
Location: Kingston, ON, Canada
Languages: English (N), French (B2), Dutch (B2)
x 1985

Re: New subforum? Language software development

Postby tommus » Fri Jan 06, 2017 9:08 pm

Doitsujin wrote:Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.

Thank you very much! That information looks to be very useful for processing character encoding. And because it is Python, that is a good reason to further develop my use of Python. Computer programming is much like language learning; you are constantly being attracted by other interesting "languages" that are too interesting to ignore.

Discussions and suggestions like these are good examples of the potential of this proposed sub-forum on language software development. Language learning and computer programming make a great match: each supports the other and, perhaps equally important, each provides strong incentive and motivation to sustain an interest in both.
0 x
Dutch: 01 September -> 31 December 2020
Watch 1000 Dutch TV Series Videos : 40 / 1000

galaxyrocker
Brown Belt
Posts: 1148
Joined: Mon Jul 20, 2015 12:44 am
Languages: English (N), Irish (Teastas Eorpach na Gaeilge B2), French, dabbling elsewhere sometimes
Language Log: viewtopic.php?f=15&t=757
x 3476

Re: New subforum? Language software development

Postby galaxyrocker » Fri Jan 06, 2017 10:09 pm

Doitsujin wrote:Have you looked into using Beautiful Soup? As long as the website is properly encoded, it'll always return Unicode, which greatly simplifies post-processing.



I use BeautifulSoup for a little desktop app I have that searches Irish language dictionaries, and it's absolutely wonderful for what it does, also covering the accents and such.
0 x

DangerDave2010
Orange Belt
Posts: 214
Joined: Sun Feb 14, 2016 5:10 am
Languages: gibberish (N)
x 291

Re: New subforum? Language software development

Postby DangerDave2010 » Mon Jan 09, 2017 12:27 pm

I think we should have the new sub-forum. Surely we can discuss our projects in other sections, but those discussions will soon be buried. If we have a dedicated section, then over the years it will bring about a great revolution in computer-assisted language learning, as newcomers will have a place to learn from the accumulated corpus of experience.
3 x

Whodathunkitz
Green Belt
Posts: 421
Joined: Mon Dec 26, 2016 7:40 pm
Location: UK
Languages: English (N), Cebuano (basic spoken daily, best L2), Spanish (beginner, but can read), Esperanto (beginner and not maintained). Sometimes dabble with Dutch, Serbian, Slovak, Czech, German and Arabic.
Language Log: viewtopic.php?f=15&t=5133&start=30
x 324

Re: New subforum? Language software development

Postby Whodathunkitz » Thu Jan 19, 2017 4:48 pm

For Tatoeba, I think I just downloaded the full list of sentences in Cebuano (about 1,200?).

Possibly downloaded a join list between sentences/language.

I looked at the languages they linked to (somehow).

Mostly German, some English.

Downloaded German and English.

Used PowerShell to trim the files and MS Access to join them (or possibly PowerShell for that too).

Concatenated the German and English.

Imported into Google Sheets, added a GOOGLETRANSLATE function (German to English), sorted by length (ascending), and removed duplicates.

Manipulated into Memrise bulk import format and imported into an English>Cebuano course (German in note field) and a German>Cebuano course (English in note field or ignored?).

http://www.memrise.com/course/1185245/1 ... sentences/
http://www.memrise.com/course/1185300/deutsch-cebuano/
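The join, de-duplicate and sort steps above could also be done in a single script. This is a rough sketch in Python rather than the PowerShell/Access/Sheets route described, and it makes assumptions: the tab-separated layouts (id, language code, text for sentences; sentence id, translation id for links), the sample rows and IDs, and the final output layout are all made up for illustration. The Google Translate step is left out.

```python
import csv
import io

# Made-up sample rows in an assumed tab-separated layout: id, language code, text.
SENTENCES_TSV = (
    "1\tceb\tMaayong buntag\n"
    "2\tceb\tSalamat\n"
    "3\tdeu\tGuten Morgen\n"
    "4\teng\tGood morning\n"
    "5\teng\tThank you\n"
)
# Assumed layout of the link list: sentence id, translation id (one row duplicated).
LINKS_TSV = "1\t3\n1\t4\n2\t5\n2\t5\n"


def read_tsv(text):
    """Parse tab-separated text into rows without treating quotes specially."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))


def build_pairs(sentences_tsv, links_tsv, source="ceb", targets=("deu", "eng")):
    """Join source sentences to their translations, unique and shortest first."""
    by_id = {sid: (lang, text) for sid, lang, text in read_tsv(sentences_tsv)}
    pairs = set()  # a set drops duplicate links, like the de-duplication step
    for sid, tid in read_tsv(links_tsv):
        if sid in by_id and tid in by_id:
            s_lang, s_text = by_id[sid]
            t_lang, t_text = by_id[tid]
            if s_lang == source and t_lang in targets:
                pairs.add((s_text, t_lang, t_text))
    # Sort by source sentence length, ascending; the tuple breaks ties.
    return sorted(pairs, key=lambda p: (len(p[0]), p))


pairs = build_pairs(SENTENCES_TSV, LINKS_TSV)
for ceb, lang, translation in pairs:
    # Tab-separated lines; the real bulk-import layout is an assumption here.
    print(f"{translation}\t{ceb}\t({lang})")
```

With real data, the two strings would be replaced by the contents of the downloaded files, opened with `encoding="utf-8"`.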
0 x
2018 Cebuano SuperChallenge 1 May 2018-Dec 2019
: 150 / 600 SC days:
: 6 / 1250 Read (aim daily 2000 words):
: 299 / 9000 Video (aim daily 15 minutes):

smallwhite
Black Belt - 2nd Dan
Posts: 2391
Joined: Mon Jul 06, 2015 6:55 am
Location: Hong Kong
Languages: Native: Cantonese;
Good: English, French, Spanish, Italian;
Mediocre: Mandarin, German, Swedish, Dutch.
.
x 4895

Re: New subforum? Language software development

Postby smallwhite » Tue Feb 07, 2017 11:28 am

Cainntear wrote:
smallwhite wrote:And what about apps in beta or apps that are published with certain functions still under development, which of the two subforums would they belong to?

If it's a matter for users, then the user-orientated forum (language programs and resources) -- if it's about software development, the development forum.


Is it correct for this new thread to be in the subforum "Development Area" as opposed to "Language Programs and Resources"?
A program to learn to write numbers in different languages

the OP of which wrote:I just wrote this program to learn to write cardinal numbers, for fun and also because online converters make mistakes and therefore propagate false information about languages.
I am counting on native speakers to check for errors.

PS: There is an encoding problem with Japanese in Translate mode.

EnToutesLettres.zip
0 x
Dialang or it didn't happen.

