Documentation: shared/utf8.php
Class Utf8 - Encode and decode utf-8 strings
This script is a reference file of this system.
See also:
- UTF-8, a transformation format of ISO 10646
- How to develop multilingual, Unicode applications with PHP
- A tutorial on character code issues
Licence: GNU Lesser General Public License
Authors:
- Bernard Paques bernard.paques@bigfoot.com
decode_recursively() - Transcode utf-8 to Unicode recursively
function &decode_recursively(&$fields)
- &$fields - array of encoded fields
- returns the transformed array
This function extends utf8_decode() to arrays and to Unicode entities.
I know, it's a hack...
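The recursive walk over an array of fields can be pictured with the following minimal sketch. It is not the actual decode_recursively() code: the name sketch_decode_recursively and the $decode callback are hypothetical, introduced only so the example does not depend on a particular decoder.

```php
<?php
// Hypothetical sketch, not the real utf8::decode_recursively().
// Applies a caller-supplied decoder to every string in a nested array.
function sketch_decode_recursively(array $fields, callable $decode): array
{
    foreach ($fields as $key => $value) {
        if (is_array($value)) {
            // Descend into nested arrays.
            $fields[$key] = sketch_decode_recursively($value, $decode);
        } elseif (is_string($value)) {
            // Decode leaf strings; other scalar types are left untouched.
            $fields[$key] = $decode($value);
        }
    }
    return $fields;
}
```

A caller could pass any string-to-string function as $decode, for example a wrapper around the decoding routine of its choice.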
from_entities() - Restore UTF-8 from HTML entities
function &from_entities(&$html)
- &$html - string a string with a mix of HTML entities
- returns a UTF-8 string
This function adds to utf8::from_unicode() the capability to decode HTML entities as well.
from_unicode() - Restore UTF-8 from HTML Unicode entities
function &from_unicode(&$unicode)
- &$unicode - string a string with a mix of UTF-8 and of HTML Unicode entities
- returns a UTF-8 string
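To illustrate what restoring UTF-8 from Unicode entities involves, here is a minimal sketch that replaces decimal numeric entities with the corresponding UTF-8 characters. It assumes the mbstring extension (mb_chr(), PHP 7.2 or later); the function name sketch_from_unicode is hypothetical and the real from_unicode() may work differently.

```php
<?php
// Hypothetical sketch, not the real utf8::from_unicode().
// Replaces decimal numeric entities (&#NNNN;) with UTF-8 characters.
function sketch_from_unicode(string $text): string
{
    return preg_replace_callback(
        '/&#([0-9]+);/',
        function (array $m): string {
            $code = (int) $m[1];
            // Leave out-of-range values and surrogates as they are.
            if ($code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
                return $m[0];
            }
            $char = mb_chr($code, 'UTF-8');
            return $char === false ? $m[0] : $char;
        },
        $text
    );
}
```

Text that is already UTF-8 passes through unchanged, so a string mixing UTF-8 and entities is handled in one pass.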
to_ascii() - Transcode to US ASCII
function &to_ascii($utf, $options='')
- $utf - string a complex string using HTML entities
- $options='' - string optional characters to accept
- returns a US-ASCII string
Use it to build valid file names for downloads. According to RFC 2183: "Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII."
For example:
// get a valid file name
$file_name = utf8::to_ascii($context['page_title'].'.xml');
// suggest a download
Safe::header('Content-Disposition: attachment; filename="'.$file_name.'"');
This function can also be used to enforce the ASCII character set in texts. For this kind of usage, it is recommended to add a space to the optional characters, as in:
// enforce ascii
$text = utf8::to_ascii($text, ' ');
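The role of the $options parameter can be sketched as follows. This is only an illustration under stated assumptions: the set of characters accepted by default (letters, digits, dot, dash, underscore) and the replacement by an underscore are guesses, and the name sketch_to_ascii is hypothetical. The real to_ascii() may transliterate characters instead of replacing them.

```php
<?php
// Hypothetical sketch, not the real utf8::to_ascii().
// Keeps a conservative ASCII set plus any extra characters in $options.
function sketch_to_ascii(string $text, string $options = ''): string
{
    // Assumed default set: letters, digits, dot, dash, underscore.
    $allowed = 'A-Za-z0-9._\-' . preg_quote($options, '/');
    // Replace each run of other bytes with a single underscore.
    return preg_replace('/[^' . $allowed . ']+/', '_', $text);
}
```

With an empty $options, spaces are replaced, which suits file names; passing ' ' keeps them, which suits running text.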
See also:
- The Content-Disposition Header Field
- ASCII - ISO 8859-1 (Latin-1) Table with HTML Entity Names
- articles/export.php
- articles/fetch_as_msword.php
- articles/fetch_as_pdf.php
- articles/fetch_for_palm.php
- articles/ie_bookmarklet.php
- files/edit.php
- files/fetch.php
- files/fetch_all.php
- images/edit.php
- tables/fetch_as_csv.php
- tables/fetch_as_xml.php
- users/fetch_vcard.php
to_iso8859() - Transcode to ISO 8859
function &to_iso8859(&$utf, $options='')
- &$utf - string a complex string using unicode entities
- $options='' - string optional characters to accept
- returns an ISO 8859 string
to_unicode() - Transcode multi-byte characters to HTML representations for Unicode
function &to_unicode(&$input)
- &$input - string the original UTF-8 string
- returns a string acceptable in an ISO-8859-1 storage system (i.e., PHP 4 + MySQL 3)
Every multi-byte UTF-8 character is transformed to its equivalent HTML numeric entity (e.g., &#4568;) that may be handled safely by PHP and by MySQL.
Of course, this solution does not allow for full-text search in the database and therefore, is not a definitive solution to internationalization issues. It does enable, however, practical use of Unicode to build pages in foreign languages.
Also, this function transforms HTML entities into their equivalent Unicode entities. For example, w.bloggar posts pages using HTML entities. If you have to modify these pages using web forms, you would like to get UTF-8 instead.
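The core transformation, turning multi-byte UTF-8 characters into numeric entities, can be sketched with PHP's mb_encode_numericentity(). This assumes the mbstring extension is available; the name sketch_to_unicode is hypothetical, and unlike the function described here the sketch does not convert named HTML entities.

```php
<?php
// Hypothetical sketch, not the real utf8::to_unicode().
// Encodes every non-ASCII character as a decimal numeric entity.
function sketch_to_unicode(string $utf8): string
{
    // Convert map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}
```

The result contains only ASCII bytes, which is why it can be stored in a single-byte database column without loss.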
transcode() - Transcode unicode entities to/from HTML entities
function &transcode(&$input, $to_unicode=TRUE)
- &$input - string the string to be transcoded
- $to_unicode=TRUE - boolean TRUE to transcode to Unicode, FALSE to transcode to HTML
- returns a transcoded string
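One way to picture the two directions of this transcoding is a lookup table built from PHP's get_html_translation_table(). The sketch below is an assumption about the general approach, not the actual transcode() code; the name sketch_transcode is hypothetical, it needs mbstring for mb_ord(), and it covers only the HTML 4.01 named entities.

```php
<?php
// Hypothetical sketch, not the real utf8::transcode().
// Maps named HTML entities to numeric ones, or the reverse.
function sketch_transcode(string $input, bool $to_unicode = true): string
{
    // Table of character => named entity, e.g. "é" => "&eacute;".
    $table = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');
    $pairs = [];
    foreach ($table as $char => $named) {
        $char = (string) $char;
        // Keep markup characters (&amp; &lt; &gt;) as named entities.
        if (in_array($char, ['&', '<', '>'], true)) {
            continue;
        }
        $numeric = '&#' . mb_ord($char, 'UTF-8') . ';';
        if ($to_unicode) {
            $pairs[$named] = $numeric;
        } else {
            $pairs[$numeric] = $named;
        }
    }
    return strtr($input, $pairs);
}
```

Calling it with the second argument set to false reverses the mapping, mirroring the $to_unicode flag described above.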