2009-02-20

use of unicode, emacs doc, ascii kludge

2008-09-11

David Combs wrote:
> Xah-- question about the characters in your posts:
>
> If I or someone sees eg a url in one of your posts, and
> wants to go to that url (because you've suggested doing that,
> maybe), it's a little difficult to just cut-n-paste your string,
> what with all the extra control or whatever they are characters
> mixed in.
>
> What is that stuff, why is it there, and is it really necessary
> for you to include it.

The summation sign “∑” in my sig is my signet, and my website's.

At the end of my sig there's the character “☄” (unicode name “comet”). It is there to force groups.google.com and the Apple Mail application to send the message in a unicode encoding. Otherwise, their heuristics will typically pick a Japanese encoding. (in google groups, when i last checked about a year ago, there was no way to set the encoding. And Apple Mail 2 years ago had no unicode encoding option... it has been added now, but last i checked there's no preference that sets the encoding ... )
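Here is a rough sketch of why one odd character can be enough. It assumes the sending software tries a list of charsets in order and uses the first one that can encode the whole message. That is only a guess at how such heuristics behave, not documented behavior of Google Groups or Apple Mail, and the names `CANDIDATES` and `pick_charset` are made up for illustration.

```python
# Hypothetical "first charset that fits" selection.
# The candidate list and its order are assumptions for illustration only.
CANDIDATES = ["us-ascii", "iso-2022-jp", "utf-8"]


def pick_charset(text, candidates=CANDIDATES):
    """Return the first candidate charset that can encode all of text."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset lacks some character; try the next one
        return name
    raise ValueError("no candidate charset can encode the text")


print(pick_charset("plain ascii sig"))
print(pick_charset("a “quoted” phrase"))
print(pick_charset("a “quoted” phrase ☄"))
```

Under that assumption, curly quotes alone fit a Japanese charset because they exist in JIS X 0208. The comet does not exist there, so only the unicode encoding can carry the message.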

My use of curly quotes “ ” and other unicode chars is just a matter of convenience and practical need. I have various easy ways to type them in my emacs. The need to quote is seen, for example, throughout GNU's docs, but they use an ascii kludge: backtick for the left curly single quote and straight quote for the right curly single quote, e.g. “`something'”. (quoting is needed for highlighting, or to set a phrase's meaning apart from its normal interpretation in the sentence.)

Since about 2006, i have found emacs's support of unicode very robust, and i have no problem with these and other mathematical chars or chinese in emacs. In general, opensource langs and tools in e.g. the linux world have largely caught on and support unicode, and i think that is good. (the commercial world has long supported and used these chars in practice, e.g. Apple in the early 1990s and Microsoft Windows since about Windows NT 4 or Windows 2000.) The OpenSource world typically has a lag of 5 to 10 years in catching up with most desktop techs. Even today, there are still a few cave dwelling tech geekers you'll occasionally see complaining about unicode in posts (e.g. Alan M here insists that newsgroup posts should be in ascii only!). But thankfully these days you'll often see other tech geekers follow up chiding the complainer, like “dude, get a proper newsreader” ...

You FreeSoftware and OpenSource supporters really should move on and embrace unicode.

I have made a few suggestions here and elsewhere in the past 2 years, including several private exchanges with Richard Stallman, about updating the emacs doc (and in general the convention of all GNU docs) to use “” and ‘’ in place of the painful, ugly, technically problematic and ambiguous ascii kludge `', among a few other modernization issues... but in general it's met with extreme difficulty...

i've been wanting to file an emacs bug report on this particular issue ... but with so much resistance and my “troll” persona etc ... basically it's very discouraging ...

The problem with `' or ``'' is that:

• it's just a 1980s ascii kludge to get around the fact that there were no matching quotes in ascii. In some technical sense, it's a misuse and abuse of symbols.
• it's ugly.
• it's ambiguous. The straight quote has many meanings, and both the straight quote and the backtick also have special meanings in the elisp lang and in functions' inline doc strings.
• it is not possible to do a reliable syntactic parse. (compare it to quoting with chars that are matching pairs.)
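The parsing point can be shown with a small sketch. The two regexes and the function names `curly_quoted` and `kludge_quoted` are made up for illustration. They are one naive way to pull out quoted phrases, not how emacs or any GNU tool actually processes doc strings.

```python
import re

# Matching pair: opening and closing quote are different characters.
CURLY = re.compile(r"“([^“”]*)”")
# ASCII kludge: backtick opens, straight quote closes.
KLUDGE = re.compile(r"`([^`']*)'")


def curly_quoted(text):
    """Return the phrases quoted with “ ”."""
    return CURLY.findall(text)


def kludge_quoted(text):
    """Return the phrases quoted with ` and ' (naive first-close match)."""
    return KLUDGE.findall(text)


# An elisp quoted symbol inside the quotation.
print(curly_quoted("Evaluate “'foo” to get the symbol."))
print(kludge_quoted("Evaluate `'foo' to get the symbol."))

# An apostrophe inside the quotation.
print(curly_quoted("Set it to “don't-care” here."))
print(kludge_quoted("Set it to `don't-care' here."))
```

With the curly pair, the quoted phrase comes back whole in both cases. With the kludge, the parser cannot tell an apostrophe or an elisp quote from the closing delimiter, so it stops early: it gets an empty string in the first case and only “don” in the second.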

If you think there's some merit in this suggestion, please file a bug report. (menu “Help‣Send Bug Report...”)

Xah
∑ http://xahlee.org/



http://groups.google.com/group/gnu.emacs.help/browse_frm/thread/a272d87845efc639

http://xahlee.org/UnixResource_dir/writ2/unicode_use.txt
