When the Domain Name System (DNS) was designed, nobody thought of it that eventually other characters than the default [a-z0-9-] will be needed. In 2003, the first top-level domains (TLD) - .info, .jp, .kr, .lt, .pl, .se, ... - started to expand their character set. By now you can register IDN domains with all large TLDs (cnobi / dcnobi) and many Country-code TLDs.
But what is IDN? An internationalized domain name (IDN) is a domain name contains non-ASCII characters. The implementation must be done on the client side. The Server gets standard ASCII strings for the domain name. To encode the domains a simple algorithm called Punycode is used. The string "löl" will become: "xn--ll-fka". Characteristic is that the string always begin with "xn--".
Of course you can build a simple script which converts a Unicode string in its Punycode equivalent. The Punycode is defined in RFC 3492. But it is much easier with a library. As for almost anything there is something prefabricated. There is the GNU libidn library which will do this task for us. I wrote a simple wrapper to use libidn with PHP.
The PHP extension adds two new functions:
<?php
echo idna_toASCII('xärg.örg'), "\n";
echo idna_toUnicode('xn--xrg-9ka.xn--rg-eka');
?>
The output looks like this:
xn--xrg-9ka.xn--rg-eka
xärg.örg