July 10, 2003

KDE Gains Support for International Domain Names

Konqueror and the KDE base libraries in CVS now support domain names written with characters outside the usual strict 7-bit ASCII letters. This means that one can now register and access domain names written in the proper letters of almost every language on the planet, not just English. Konqueror is among the first browsers to support this new technology, developed in cooperation with VeriSign, which has also been cooperating with the Safari and Mozilla teams (Mozilla IDN announcement and explanation).

The support was added to our base libraries several months ago, but the last bugs were ironed out only about a month ago. In the meantime, the drafts provided to us have become official Internet standards (RFCs), and domain name registrars are starting to sell domain names written in other languages.

Although we have done basic tests with Konqueror and other base applications and the code seems to work fine, we are still expecting bugs to surface and intend to squash them quickly. It's my expectation that we will fix everything by the time KDE 3.2 is released.

The support currently requires the GNU IDN Library to be installed on your system. If that is so and you're using CVS HEAD, you can test the support at the following addresses:

  1. Examples of domains in the Latin 1 set
  2. Examples of other language domains

Please note that KDE only supports the standards-compliant Punycode encoding. The IDN testbed led by VeriSign used another encoding (RACE) which has since been deprecated. Konqueror will therefore not work on those domains.
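
To give an idea of what the Punycode encoding does to a hostname, here is a minimal sketch using Python's built-in "idna" codec, which follows the IDNA 2003 rules. It is only an illustration: KDE itself relies on the GNU IDN Library, and the helper names and the example domain below are assumptions made for this sketch.

```python
# Sketch: convert a hostname between its Unicode form and the ASCII
# (Punycode, "xn--") form that is actually stored in the DNS.
# The function names are hypothetical; the "idna" codec ships with Python.

def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII-compatible form of a Unicode hostname."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Return the Unicode form of an ASCII-compatible hostname."""
    return hostname.encode("ascii").decode("idna")


print(to_ascii_hostname("bücher.example"))           # xn--bcher-kva.example
print(to_unicode_hostname("xn--bcher-kva.example"))  # bücher.example
```

Plain ASCII hostnames pass through unchanged, so the same code path can be used for every lookup.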

Comments

kah-däh.äh ;-)

Kidding aside, it's good to see new technology implemented that fast
in KDE. Open source proves once again to be where the innovation is.
IMO the Greeks will profit the most from this within the Union. A German
or French or Swedish word can be read without those special letters,
and we've got used to that anyway from crosswords and years of awful
support for our languages on various electronic devices. But people whose language
has completely different letters will certainly be glad they can at last have
native domain names. And Chinese people can now put a whole story
in the URL bar... So cheers to them. Keep up the good work, KDE team.


By Jan at Thu, 2003/07/10 - 5:00am

Actually, French people do not much like going without accents...
so I imagine from now on I will have to do horrible things with my keyboard to access French websites ;)


By ParideP at Thu, 2003/07/10 - 5:00am

> ... I imagine from now on I will have to do horrible things with my keyboard
> to access French websites ;)

That was also my first thought... I think this could become a real problem
for domains in more exotic languages.

And I think it'll be quite some source of confusion because
there may be a version with and without accents or
even mixed cases that do not necessarily belong to the same
person.


By cm at Mon, 2003/07/14 - 5:00am

Heya,

For what it's worth, I'm French but I don't like using accents for Net purposes. I think it just makes things more confusing. *g* Hopefully the use of international scripts won't make things difficult for people with a different keyboard though...


By Random Ribbit Person at Wed, 2003/07/16 - 5:00am

How is a non-Chinese person going to visit a Chinese website? All those strange characters are not on my keyboard. Granted, I can't read Chinese, let alone write it, but the internet should be open to everyone.


By Mark Hannessen at Thu, 2003/07/10 - 5:00am

Use a program for chinese input. For example, xcin and fcitx accept input in PinYin.


By Robert Klein at Thu, 2003/07/10 - 5:00am

You can have two domains for a single server: a full-ASCII domain name and an international domain name.


By capit. igloo at Thu, 2003/07/10 - 5:00am

Just type in the Unicode character... code. Alt+00+code_number.


By Anonymous at Fri, 2003/07/11 - 5:00am

Oh wait... does the Alt trick even work in unix!?!? I hope I haven't just made an ass of myself.


By Anonymous at Fri, 2003/07/11 - 5:00am

It doesn't (at least for me), would be a nice feature request tho. ;)


By Datschge at Fri, 2003/07/11 - 5:00am

and how is a chinese person supposed to visit a website written in ascii - i.e. american encoding? is it a bit much to expect every computer user on planet earth to learn the roman alphabet?

okay, so their keyboards probably have ascii letters written on them anyway, and some chinese input methods actually rely on a knowledge of roman letter phonetics to work, but even this is a symptom of past narrow-mindedness on the part of the dominant american technology makers. technological development seems to be shifting to the far east - let's hope the chinese developers are more thoughtful or else we'll all have to learn mandarin pretty soon!

the vast majority of websites visited in the world are clicked-through hyperlinks rather than typed in anyway, and even if it does have to be typed in, there'll probably be an ascii alternative domain name.


By David Wilkinson at Mon, 2004/12/20 - 6:00am

And how is the support for these new URLs in other KDE applications? I mean, it's nice that the website can be reached, but can I also reach its webmaster using KMail? Can I read their newsgroups in KNode?
What kind of changes does this need?


By André Somers at Thu, 2003/07/10 - 5:00am

As for the SMTP kioslave, that one should have been IDNA-ready for months now.
Note, however, that IDNA is not about localized mail addresses. Those are currently in the process of being discussed in the IETF. Search for "IMAA".

To answer your question: Yes, you can send mail to webmaster@.org, but not to @normaldomain.org.

Marc


By Marc Mutz at Thu, 2003/07/10 - 5:00am

That depends on whether the application does anything funny with its URLs. The support is inside KURL and our DNS lookup functions. As long as the application doesn't try to do the lookup by itself and as long as it keeps the URL in Unicode, there should be no problems.

Of course, some protocols like SMTP and HTTP require the hostname being accessed to be transmitted over the wire. That's a protocol-specific implementation and has to be checked on a case-by-case basis. HTTP and SMTP send the Punycode-encoded hostname, but other protocols could just send it in UTF-8 if they felt like doing so.

And while kio_smtp has been made to work and I believe the other ioslaves are good to go, I'm not sure about KMail itself. I mean, it wouldn't stop you, but it's certainly nicer to see the properly decoded domain name in the KMail/KNode panes than the ugly Punycode domains.
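
To illustrate the "keep the URL in Unicode, encode only on the wire" idea for HTTP, here is a rough sketch in Python. It is not the KURL or kio_http code: the function name and the example URL are made up for the illustration, and it leans on Python's built-in "idna" codec instead of the GNU IDN Library.

```python
# Sketch: the application keeps the URL in Unicode and the hostname is
# converted to its Punycode form only when the request is written out.
from urllib.parse import quote, urlsplit


def build_get_request(url: str) -> bytes:
    """Build a bare HTTP/1.1 GET request for a URL with a Unicode hostname.

    Hypothetical helper; the query string is ignored to keep it short.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no hostname")

    # Only the hostname goes through IDNA; this is what ends up in "Host:".
    wire_host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        wire_host += f":{parts.port}"

    # Non-ASCII path characters are percent-encoded, which is a separate
    # mechanism from IDNA.
    path = quote(parts.path or "/", safe="/%")

    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {wire_host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return request.encode("ascii")


if __name__ == "__main__":
    print(build_get_request("http://bücher.example/index.html").decode("ascii"))
```

The user-visible string stays "bücher.example" everywhere else; only the bytes sent to the server carry the "xn--" form.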


By Thiago Macieira at Thu, 2003/07/10 - 5:00am

It seems IE supports this, too; at least mine does.

Apart from that, is there a place where you can already register .com and .de domains with those special characters?


By me at Thu, 2003/07/10 - 5:00am

Currently you can't register such .de domains, see http://www.denic.de/doc/faq/domainregistrierung.en.html#r0011


By Olaf Jan Schmidt at Thu, 2003/07/10 - 5:00am

I was pleasantly surprised that KDE has only a run-time dependency on libidn for this feature.


By Anonymous at Thu, 2003/07/10 - 5:00am

Will khtml support XSLT? It's actually coming into use on a few sites now. It's not going to go away....

Also, will ksvg be integrated into khtml (by a plugin, if DOM handlers can be dynamically registered?) for inline SVG?


By rjw at Thu, 2003/07/10 - 5:00am

If your site isn't only for an audience in your own country, most people will have problems getting to your site (I'm talking about Windows users who don't even know what a character map is).

But anyway I think it's nice, because most navigation these days is by links (not like the old days when I wrote URLs down in my journal), so it should not be a big problem, and for Brazilian sites it would be welcome news to be able to create URLs like: http://www.ação.com.br/

Regards.


By Iuri Fiedoruk at Thu, 2003/07/10 - 5:00am

What I also like is what FFII.org does with internationalization.

index.en.html
index.de.html

and so on. This is very convenient, and I would prefer a web browser standard that automatically picks up a page in the correct language, so you don't need scripts or additional tools.


By Bert at Thu, 2003/07/10 - 5:00am

Ever since 1.0, HTTP has had the Accept-Language header. The idea is that when a browser requests a file, it transmits this header with the language codes its user likes in order of preference, such as "Accept-Language: fr, de, en-au". It's then up to the server to decide what to do with the codes, and sadly most people don't configure their servers to pay attention to the header. (That's why you might not have heard of it.)

Internet Explorer and Mozilla have supported this for many years, and both have sections in their preferences where you specify the languages you want and their order. As I recall, Konqi has supported this for ages as well (but OTOH I forget where the pref is for selecting your languages).
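
For the curious, the server side of this could look roughly like the Python sketch below. It is only an illustration: the function names are invented, the index.<lang>.html naming is borrowed from the FFII example mentioned earlier in the thread, and a real server would normally use its built-in content negotiation rather than hand-written code.

```python
# Sketch: pick a translated page from an Accept-Language header.
# Hypothetical helpers; the file naming scheme is an assumption.

def parse_accept_language(header):
    """Return the language tags from the header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # no q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_variant(header, available, default="en"):
    """Return the file name of the best available translation."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "en-au" falls back to "en"
        for candidate in (tag, primary):
            if candidate in available:
                return f"index.{candidate}.html"
    return f"index.{default}.html"


print(pick_variant("fr, de, en-au", {"en", "de"}))  # index.de.html
```

With the header from the example above and only English and German pages on disk, French is skipped and the German page is served.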


By azza-bazoo at Thu, 2003/07/10 - 5:00am

IIRC, it just uses the fallback sequence defined in the KDE-wide language settings.


By SadEagle at Thu, 2003/07/10 - 5:00am

Wouldn't it be better if everything was just encoded in UTF-8?


By David Wilkinson at Mon, 2004/12/20 - 6:00am