The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets

Montmorency
Brown Belt
Posts: 1035
Joined: Tue Oct 06, 2015 3:01 pm
Location: Oxfordshire, UK
Languages: English (Native)
Maintaining: German (active skills lapsed somewhat).
Studying: Welsh (advanced beginner/intermediate);
Dabbling/Beginner: Czech

Back-burner: Spanish (intermediate) Norwegian (bit more than beginner) Danish (beginner).

Have studied: Latin, French, Italian, Dutch; OT Hebrew (briefly) NT Greek (briefly).
Language Log: viewtopic.php?f=15&t=1429

Postby Montmorency » Sat Apr 16, 2016 12:44 am

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
by Joel Spolsky


http://www.joelonsoftware.com/articles/Unicode.html

This is an old article (2003), but it still helped me fill in some of my horrible knowledge gaps about Unicode, which, for various reasons, I've been trying to fill recently.

It would have been very applicable to me back in 2003, when I was still employed in IT. Although I had restarted learning natural languages seriously in the early-to-mid-1990s, my knowledge of computer internationalisation was just about confined to the notes I had taped to my keyboard or screen, detailing the ALT-codes for German, French or, later, Spanish "special characters". (Of course, they were not "special" to German, French, or Spanish speakers, just a normal part of the language; how ignorant some of us English speakers were back then.) I originally worked on IBM or IBM-compatible mainframes, and lived my life through EBCDIC. If it wasn't listed on the "little green [IBM 360 reference] card", then it didn't exist as far as I was concerned. I was aware that at the IBM User Group conferences I went to in mainland Europe, there were often heated sessions over what IBM called "National Character Sets", but to be honest, we English speakers really couldn't see what the fuss was about. I blush now...
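For anyone wondering what the fuss over character sets looks like in practice, here is a minimal Python sketch. It encodes the same text under an EBCDIC code page, Latin-1, and UTF-8 and shows the bytes as hex. The choice of cp037 (one EBCDIC code page among many) and the helper names byte_views and as_hex are my own illustrative assumptions, not anything taken from the article.

```python
def byte_views(text, codecs=("cp037", "latin-1", "utf-8")):
    """Return {codec: bytes} for the same text under each codec.

    cp037 is one EBCDIC code page shipped with Python; picking it here
    is an assumption made purely for illustration.
    """
    return {codec: text.encode(codec) for codec in codecs}


def as_hex(data):
    """Format a bytes object as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    # Same characters, three different byte sequences.
    for codec, data in byte_views("Grüße").items():
        print(f"{codec:8} {as_hex(data)}")
```

The point is only that the characters stay the same while the bytes differ, so bytes written under one encoding and read back under another will generally come out as the wrong characters.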


☺☺☺☺ ~~~~~~~~~~ ☺☺☺☺


We recently had a thread about computer programming, and although this article doesn't belong there, I wonder if it may be of interest to some of the participants in that thread. In any case, I get the impression there is a fairly high count of IT types on this forum, as I think there was on HTLAL.

langwijes
Posts: 8
Joined: Thu Apr 14, 2016 10:40 pm
Languages: ...
English (N)
French (A0-A1)
Korean (A0-A1)
Mandarin (A0)

Re: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets

Postby langwijes » Mon Apr 18, 2016 1:25 am

Thanks for posting this article. For language learners these days, there is an extra dimension that's not often discussed, namely settling on the "correct" keyboard layout(s) in the target language(s).

As an example, I've recently been investigating whether to devote the time to learning French. I can touch type in English on a QWERTY keyboard. Ideally, I'd like to keep as much of that muscle memory as possible to transfer to other languages. France is contemplating standardizing a new keyboard layout (to be announced) to replace the non-standardized yet widely adopted French AZERTY layout[1]. After some digging, it seems most pragmatic multilingual typists choose the U.S. International keyboard layout, and I decided to do the same.

Of course, this decision and the muscle memory transfer become much harder and drastically less effective when we consider non-Latin scripts. I learned the Korean keyboard many years ago, and that was thankfully relatively easy because the layout is phonetic and typing mostly alternates between consonants (left hand) and vowels (right hand). However, I've recently been looking at Chinese and am frankly a little disheartened by the Wubi and Cangjie methods and their shape-based learning curves. I believe that many younger Chinese use pinyin to type, although I question whether this is still common in more serious corporate settings, given the typing speeds involved.

At least Russian seems to have a standard keyboard (ЙЦУКЕН), so hopefully by the time I get around to learning it in retirement it will be one less thing to worry about (or by then we'll have 100% accurate speech-based computer input methods!).

There's no question in my post above, merely a reflection on the additional technological dimension that some (or maybe only a few) of us may be grappling with.

[1] http://www.theverge.com/2016/1/21/10805 ... out-azerty