Documentation: shared/utf8.php

Class Utf8 - Encode and decode utf-8 strings

This script is a reference file of this system.

License: GNU Lesser General Public License

decode_recursively() - Transcode utf-8 to Unicode recursively

function &decode_recursively(&$fields)

This function extends utf8_decode() to arrays and to Unicode entities. I know, it's a hack...
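
The recursion over arrays can be sketched as follows. This is only an illustration, not the YACS implementation: the helper name sketch_decode_recursively() is hypothetical, it handles arrays and strings but not Unicode entities, and it assumes the mbstring extension is available. It uses mb_convert_encoding() instead of utf8_decode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: not part of YACS, shown only to illustrate the recursion.
function sketch_decode_recursively($fields)
{
    // Recurse into arrays, preserving keys.
    if (is_array($fields)) {
        foreach ($fields as $key => $value) {
            $fields[$key] = sketch_decode_recursively($value);
        }
        return $fields;
    }

    // Convert UTF-8 strings to ISO-8859-1 (requires the mbstring extension).
    if (is_string($fields)) {
        return mb_convert_encoding($fields, 'ISO-8859-1', 'UTF-8');
    }

    // Leave other types (integers, booleans, null) untouched.
    return $fields;
}

// Nested input: every string is converted, the integer is kept as is.
$decoded = sketch_decode_recursively(['title' => 'café', 'tags' => ['été', 42]]);
```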

from_entities() - Restore UTF-8 from HTML entities

function &from_entities(&$html)

This function adds to utf8::from_unicode() the capability to decode HTML entities as well.
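
For comparison, standard PHP offers html_entity_decode(), which decodes both named and numeric entities in one pass. The snippet below is a minimal sketch of that behavior and makes no claim about how from_entities() is implemented; the sample string is made up for illustration.

```php
<?php
// Sample input (an assumption for illustration): one named and one numeric entity,
// plus an escaped ampersand.
$html = 'caf&eacute; &amp; th&#233;';

// Decode every HTML 4.01 entity to UTF-8, including quotes.
$utf8 = html_entity_decode($html, ENT_QUOTES | ENT_HTML401, 'UTF-8');
```

Note that this call also decodes markup-significant entities such as &amp;amp;, so it should only be applied to text that will be escaped again before output.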

from_unicode() - Restore UTF-8 from HTML Unicode entities

function &from_unicode(&$unicode)

This function is triggered by the YACS handler during page rendering. It transcodes HTML Unicode entities (e.g., &#8364;) back to actual UTF-8 encoding (e.g., €).
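
A minimal sketch of this kind of transcoding, assuming decimal numeric entities only and PHP 7.2 or later with the mbstring extension. The helper name sketch_from_unicode() is hypothetical and does not reflect the actual YACS code.

```php
<?php
// Hypothetical helper: replaces decimal numeric entities such as &#8364; with UTF-8.
function sketch_from_unicode($text)
{
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($matches) {
            // mb_chr() returns false for invalid code points; keep the entity then.
            $char = mb_chr((int) $matches[1], 'UTF-8');
            return ($char === false) ? $matches[0] : $char;
        },
        $text
    );
}

$restored = sketch_from_unicode('Price: 10 &#8364;');
```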

to_ascii() - Transcode to US ASCII

function &to_ascii($utf, $options='')

This function is primarily used to build strings matching RFC 2183 and RFC 822 requirements on MIME data.

You should use it to build valid file names for downloads. According to the RFC: "Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII."

For example:
// get a valid file name
$file_name = utf8::to_ascii($context['page_title'].'.xml');

// suggest a download
Safe::header('Content-Disposition: attachment; filename="'.$file_name.'"');
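
To illustrate why the file name has to be reduced to US-ASCII, here is a self-contained sketch that does not depend on YACS. The helper sketch_ascii_file_name() is hypothetical and much cruder than a real transliteration: it simply replaces every run of characters outside a conservative ASCII set with an underscore.

```php
<?php
// Hypothetical helper, not the YACS implementation: keeps a conservative ASCII subset.
function sketch_ascii_file_name($name)
{
    // Collapse each run of bytes outside [A-Za-z0-9._-] into one underscore.
    // Multi-byte UTF-8 characters are made of such bytes, so they are replaced too.
    return preg_replace('/[^A-Za-z0-9._-]+/', '_', $name);
}

$file_name = sketch_ascii_file_name('Résumé 2024.xml');

// Build the header value; a real page would pass it to header().
$header = 'Content-Disposition: attachment; filename="' . $file_name . '"';
```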


This function can also be used to enforce the ASCII character set in texts. For this kind of usage, it is recommended to add spaces to the optional parameters, as in:
// enforce ascii
$text = utf8::to_ascii($text, ' ');


to_iso8859() - Transcode to ISO 8859

function &to_iso8859(&$utf, $options='')

Use this function only when there is no alternative.

to_unicode() - Transcode multi-byte characters to HTML representations for Unicode

function &to_unicode(&$input)

This function aims to preserve Unicode characters through storage in an ISO-8859-1 compliant system.

Every multi-byte UTF-8 character is transformed to its equivalent HTML numerical entity (e.g., &#4568;) that may be handled safely by PHP and by MySQL.

Of course, this solution does not allow for full-text search in the database and therefore, is not a definitive solution to internationalization issues. It does enable, however, practical use of Unicode to build pages in foreign languages.
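
The transformation of multi-byte characters into numerical entities can be approximated with the standard mb_encode_numericentity() function. This is a sketch under the assumption that the mbstring extension is available; it is not the code used by to_unicode(), and the sample strings are made up.

```php
<?php
// Conversion map: encode every code point from U+0080 to U+10FFFF,
// with no offset, and leave plain ASCII untouched.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Each multi-byte character becomes a decimal numeric entity.
$euro = mb_encode_numericentity("10 \u{20AC}", $map, 'UTF-8');
$hangul = mb_encode_numericentity("\u{11D8}", $map, 'UTF-8');
```

The resulting strings contain only ASCII bytes, which is why they survive storage in an ISO-8859-1 column.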

This function also transforms HTML entities into their equivalent Unicode entities. For example, w.bloggar posts pages using HTML entities. If you have to modify these pages using web forms, you would prefer to get UTF-8 instead.

transcode() - Transcode unicode entities to/from HTML entities

function &transcode(&$input, $to_unicode=TRUE)

This function transforms HTML entities into their equivalent Unicode entities. For example, w.bloggar posts pages using HTML entities. If you have to modify these pages using web forms, you would prefer to get UTF-8 instead.
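
One possible way to turn named HTML entities into numeric Unicode entities is sketched below. The helper sketch_named_to_numeric() is hypothetical and covers only the "to Unicode" direction; it assumes PHP 7.2 or later with the mbstring extension and deliberately leaves markup-significant entities escaped. The actual transcode() may work differently.

```php
<?php
// Hypothetical helper: rewrites named entities (&eacute;) as numeric ones (&#233;).
function sketch_named_to_numeric($html)
{
    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function ($matches) {
            // Markup-significant entities must stay escaped.
            if (in_array($matches[1], ['amp', 'lt', 'gt', 'quot', 'apos'], true)) {
                return $matches[0];
            }

            // Resolve the name through the HTML 4.01 entity table.
            $char = html_entity_decode($matches[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');

            // Unknown names come back unchanged; leave them as they are.
            if ($char === $matches[0]) {
                return $matches[0];
            }

            // Emit the code point as a decimal numeric entity.
            return '&#' . mb_ord($char, 'UTF-8') . ';';
        },
        $html
    );
}

$numeric = sketch_named_to_numeric('caf&eacute; &amp; 10 &euro;');
```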
