MODx Bug/Feature Tracker and Feature Requests

FS#685 — Encoding problems in QuickEdit

Attached to Project — MODx
Opened by Pontus Ågren (Pont) - Saturday, 18 November 2006, 08:59AM
Last edited by Ryan Thrash (rthrash) - Wednesday, 23 January 2008, 07:39PM
Task Type Bug Report
Category Core Distribution
Status Unconfirmed
Assigned To No-one
Operating System All
Severity High
Priority Normal
Reported Version 0.9.5 RC2
Due in Version 0.9.6.2
Due Date Undecided
Percent Complete 0%

Details

Being Swedish I naturally use åäöÅÄÖ when appropriate. If I edit a page using QuickEdit and use the Apply button, the Swedish letters on the page change to garbled two-character sequences after the refresh (for example, å shows up as Ã¥). This lasts until I click OK and the refresh restores the correct letters.

When submitting a form (i.e. clicking OK), JavaScript uses the specified encoding of the page. However, when using "Apply" the form is saved with an AJAX call to index.php. JavaScript encodes this internally as UTF-8, and the posted data has to be converted in the backend to the encoding used on the site (unless that is UTF-8) to be saved correctly.

On my site the encoding is ISO-8859-1, so the interim save wrongly stores the content as UTF-8. Saving with "OK" corrects this, but should I choose to just "Close" the editor window, the content remains saved with the wrong encoding.
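The mismatch described above can be reproduced outside MODx. The following is only an illustrative sketch: it assumes the iconv extension is available and uses byte escapes so the result does not depend on how the file itself is saved. The variable names are made up for the example.

```php
<?php
// "å" is the two bytes C3 A5 in UTF-8, but the single byte E5 in ISO-8859-1.
$utf8 = "\xC3\xA5";

// A page served as ISO-8859-1 reads those two bytes as two separate
// characters (U+00C3 and U+00A5). Re-encoding them as UTF-8 makes that visible.
$misread = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo bin2hex($misread), "\n"; // c383c2a5

// Converting the posted UTF-8 data to the site charset before saving
// restores the single byte the site expects.
$fixed = iconv('UTF-8', 'ISO-8859-1', $utf8);
echo bin2hex($fixed), "\n"; // e5
```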

This temporary fix for the issue makes it work with sites encoded with ISO-8859-1. Add the following code to assets/modules/quick_edit/editor.class.inc.php.

274 $value_prep = $modx->db->escape($value);
275
276 /* ADD THIS */
277 if (mb_detect_encoding($value_prep,'UTF-8, ISO-8859-1') == 'UTF-8') {
278 $value_prep = utf8_decode($value_prep);
279 }
280
281 if(is_numeric($cv->id)) {

I'm sure there is a better way to do this. And the code should of course work for any encoding, but my coding skills are limited. I still hope this can serve as a beginning of a proper fix.

Comment by Garry Nutting (garryn) - Saturday, 18 November 2006, 12:15PM
Something to note: the proposed change will only work if the mbstring extension has been installed for PHP. It is a non-default extension that has to be explicitly enabled, which could mean that it's not that widely available.

There seem to be a few alternative ways of detecting UTF-8 without the mbstring extension on the manual page for the function: http://us2.php.net/mb_detect_encoding

Comment by Pontus Ågren (Pont) - Wednesday, 29 November 2006, 03:05PM
The other solutions using the W3C regex work equally well. I used the quicker one. Remove lines 277-279 in the above fix and insert the following lines instead.

if (preg_match('%(?:
[\xC2-\xDF][\x80-\xBF]
|\xE0[\xA0-\xBF][\x80-\xBF]
|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
|\xED[\x80-\x9F][\x80-\xBF]
|\xF0[\x90-\xBF][\x80-\xBF]{2}
|[\xF1-\xF3][\x80-\xBF]{3}
|\xF4[\x80-\x8F][\x80-\xBF]{2}
)+%xs', $value_prep)) {
$value_prep = utf8_decode($value_prep);
}
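To show what this kind of pattern actually detects, here is a small sketch that wraps it in a function. The function name is made up for this example, and the branches are joined with | so that any one of them can match. Note that the pattern is not anchored, so it reports whether the string contains at least one well-formed multibyte UTF-8 sequence, not whether the whole string is valid UTF-8.

```php
<?php
// Hypothetical wrapper, not part of QuickEdit.
function contains_utf8_multibyte($s)
{
    // Each branch describes one well-formed multibyte UTF-8 sequence;
    // the x modifier lets the pattern span several lines.
    return preg_match('%(?:
        [\xC2-\xDF][\x80-\xBF]
        |\xE0[\xA0-\xBF][\x80-\xBF]
        |[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
        |\xED[\x80-\x9F][\x80-\xBF]
        |\xF0[\x90-\xBF][\x80-\xBF]{2}
        |[\xF1-\xF3][\x80-\xBF]{3}
        |\xF4[\x80-\x8F][\x80-\xBF]{2}
    )+%xs', $s) === 1;
}

var_dump(contains_utf8_multibyte("\xC3\xA5"));     // UTF-8 "å": true
var_dump(contains_utf8_multibyte("\xE5"));         // ISO-8859-1 "å": false
var_dump(contains_utf8_multibyte("abc"));          // plain ASCII: false
var_dump(contains_utf8_multibyte("\xE5\xC3\xA5")); // mixed bytes: true
```

The last case is the limitation to keep in mind: ISO-8859-1 text that happens to contain a byte pair such as C3 A5 is indistinguishable from UTF-8 to this check.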

Comment by Pontus Ågren (Pont) - Wednesday, 29 November 2006, 04:56PM
I just read up on utf8_decode and realised that it only decodes to ISO-8859-1. So the above solution is only relevant for ISO-8859-1 encoded sites.

A more general solution is needed to fix the issue for other encodings.

Comment by Pontus Ågren (Pont) - Saturday, 02 December 2006, 02:10PM
I've come up with the following solutions. Both of them assume that the site's encoding is the same as the manager's encoding.

If the mbstring extension cannot be relied on, this fixes the issue.

if (preg_match('%(?:
[\xC2-\xDF][\x80-\xBF]
|\xE0[\xA0-\xBF][\x80-\xBF]
|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
|\xED[\x80-\x9F][\x80-\xBF]
|\xF0[\x90-\xBF][\x80-\xBF]{2}
|[\xF1-\xF3][\x80-\xBF]{3}
|\xF4[\x80-\x8F][\x80-\xBF]{2}
)+%xs', $value_prep)) {
$value_prep = iconv("UTF-8", $modx->config['etomite_charset'], $value_prep);
}


On the other hand - if the mbstring extension is considered required, this solution can be used.

$charset = $modx->config['etomite_charset'];
if (mb_detect_encoding($value_prep, "UTF-8, $charset") == 'UTF-8') {
$value_prep = mb_convert_encoding($value_prep, $charset, "UTF-8");
}

These solutions need to be tested on other encodings. There are limitations on which encodings are supported by iconv and mb_convert_encoding.
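The two variants could also be combined so that whichever extension is present gets used. This is an untested sketch, not a proposed patch. The function name is hypothetical, and the validity check uses preg_match with the u modifier (which fails on input that is not valid UTF-8) in place of the regex or mb_detect_encoding used above.

```php
<?php
// Hypothetical helper: convert posted UTF-8 data to the site charset.
function quickedit_to_site_charset($value, $charset)
{
    // Nothing to convert when the site itself uses UTF-8.
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $value;
    }
    // Leave the value alone unless it is valid UTF-8.
    if (preg_match('//u', $value) !== 1) {
        return $value;
    }
    // Prefer mbstring when it is installed.
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($value, $charset, 'UTF-8');
    }
    // Otherwise fall back to iconv; it returns false on failure.
    if (function_exists('iconv')) {
        $converted = iconv('UTF-8', $charset, $value);
        return $converted === false ? $value : $converted;
    }
    return $value;
}

echo bin2hex(quickedit_to_site_charset("\xC3\xA5", 'ISO-8859-1')), "\n"; // e5
```

In the fix above it would be called as quickedit_to_site_charset($value_prep, $modx->config['etomite_charset']). The same caveat applies as for the other variants: text in the site charset that happens to be valid UTF-8 would be converted by mistake.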