When using the REST resource to process some JSON containing Unicode escape sequences such as \u00fc (u umlaut), I was getting dodgy characters back.
I set about debugging this and, having spent a day or two on it, have narrowed it down to two things:
Firstly, the installed version of Spidermonkey does not appear to interpret these characters correctly, so I tried the latest version of Rhino, which does. Here's Spidermonkey on the command line:
~> js
js> '\u00fc'
js>
... and Rhino:
Rhino 1.7 release 2 2009 03 22
js> '\u00fc'
ü
js>
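Since a terminal or browser can mis-render output on its own, comparing raw bytes rather than glyphs may make the difference between the two shells easier to pin down. The following is only an illustrative sketch: it assumes `printf`, `od` and `tr` are available, and the variable names are made up for the example. U+00FC encoded as UTF-8 should come out as the two bytes c3 bc.

```shell
# U+00FC (u umlaut) encoded as UTF-8 is the two bytes 0xC3 0xBC.
# Octal escapes are used so the example does not depend on this file's own encoding.
u_umlaut=$(printf '\303\274')

# Dump the bytes as hex and strip od's padding to see exactly what the string holds.
hex=$(printf '%s' "$u_umlaut" | od -An -tx1 | tr -d ' \n')
echo "$hex"   # c3bc
```

The same `od -An -tx1` pipeline could presumably be put on the end of a `js -` invocation fed with a one-line `print('\u00fc')` script, to see whether the interpreter emits c3 bc, a lone fc, or something else entirely.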
So I modified data/private/conf/tools.inc to temporarily point at the jar of Rhino:
#define('SQ_TOOL_JS_PATH', '/usr/bin/js');
define('SQ_TOOL_JS_PATH', '/usr/bin/java -jar /home/me/rhino/js-14.jar');
Secondly, the js process requires the LANG environment variable to be set. The only way I could find to do this was to pass it directly from the PHP in packages/web_services/rest/page_templates/page_rest_resource_js/page_rest_resource_js.inc, like so:
#$process = proc_open(SQ_TOOL_JS_PATH . ' - ', $descriptorspec, $pipes);
$process = proc_open(SQ_TOOL_JS_PATH . ' - ', $descriptorspec, $pipes, NULL, Array('LANG=en_GB.utf8'));
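Passing an explicit environment array to proc_open means the child sees only the variables in that array, not the rest of the PHP process's environment. A rough command-line equivalent, useful for checking whether LANG alone makes the difference, is `env -i`. The sketch below uses `/bin/sh` as a stand-in for the interpreter (an assumption made so that it runs anywhere), and the variable names are illustrative only.

```shell
# Report the LANG value a child process sees when started from an empty environment.
# env -i clears everything first, much like handing proc_open an explicit env array.
without_lang=$(env -i /bin/sh -c 'echo "${LANG:-unset}"')
with_lang=$(env -i LANG=en_GB.utf8 /bin/sh -c 'echo "${LANG:-unset}"')

echo "without: $without_lang"
echo "with:    $with_lang"
```

Swapping the `/bin/sh -c ...` part for the real js command should allow the same with/without comparison outside of PHP and Apache.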
With these changes the REST resource correctly interprets the UTF-8 characters.
I would prefer not to have to modify the PHP, so my next step would be to compile the latest version of Rhino so that it lives at /usr/bin/js, and to set LANG somewhere else. I've already tried setting it with Apache SetEnv and adding it to the default profile, but neither works. Before I embark on this, does anyone have any experience of this? Am I on a massive wild goose chase? Is there a simple solution?
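One possible way to set LANG without touching the PHP is a small wrapper around the interpreter. The following is a minimal sketch, untested against Matrix: `run_js` and `JS_CMD` are hypothetical names, and the default command is simply the Rhino invocation from tools.inc.

```shell
# Hypothetical wrapper: run the JS interpreter with LANG set for that one process.
# JS_CMD is an illustrative override; by default the Rhino command from tools.inc is used.
run_js() {
    # The assignment prefix puts LANG in the child's environment only.
    # The unquoted expansion is deliberate so the command splits into words.
    LANG=en_GB.utf8 ${JS_CMD:-/usr/bin/java -jar /home/me/rhino/js-14.jar} "$@"
}
```

As a standalone script, the body of the function would sit under a `#!/bin/sh` line (with `exec` in front of the command) and SQ_TOOL_JS_PATH would point at that file, so the existing `proc_open(SQ_TOOL_JS_PATH . ' - ', ...)` call would pass `-` through as `"$@"`. Whether Matrix accepts a wrapper script in that setting is an assumption.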