Every so often, as Editors and Developers weave Tridion CMS content and design together, I run into an unexpected character encoding issue. The published page shows a quirky A or a funny U, drawing attention to syntax and appearance rather than to the pertinent content.
The simple explanation is a discrepancy between the character encoding settings of the various systems that the published content passes through on the journey to its final destination. Here are the checkpoints I work through when I play detective, and what I look for at each one to solve the mystery:
1. Publication Target (Tridion): What setting has been selected for the Publication Target Default Code Page value? I check this value in the Publication Target properties in the Tridion CMS Admin Panel. By default it is set to “System Default”, which picks up the code page dictated by the Windows operating system of the publisher machine. I usually change this to Unicode (UTF-8)[1].
2. Browser: What character set is the browser using? In Internet Explorer I check View ⇒ Encoding and make sure the Unicode (UTF-8) menu item is selected. In Firefox, check Options ⇒ Content ⇒ Fonts & Colors ⇒ Advanced ⇒ Default Character Encoding. In Google Chrome, check Options ⇒ Under the Hood ⇒ Web Content ⇒ Customize fonts ⇒ Encoding.
3. Java Virtual Machine: Which JVM does the Tridion Deployer run in (for instance the one used by Tomcat), and which encodings are set there? As of JDK 1.4 you can find out what a particular JVM supports via java.nio.charset.Charset.availableCharsets()[2].
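As a quick way to run this check, here is a minimal sketch that prints the JVM's default charset and everything it supports. The class name CharsetReport is only for illustration, and the result depends on the JVM and operating system it runs on, so run it with the same JVM the Deployer uses.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Illustrative helper, not part of Tridion: reports what this JVM knows about charsets.
final class CharsetReport {
    /** Returns true if this JVM supports the named charset. */
    static boolean supports(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        // The charset the JVM falls back to when no encoding is given explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 supported: " + supports("UTF-8"));

        // Every charset this JVM can use, keyed by canonical name.
        SortedMap<String, Charset> available = Charset.availableCharsets();
        for (String name : available.keySet()) {
            System.out.println(name);
        }
    }
}
```

If the default charset printed here is not the one you expect, that is usually the first clue.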
4. Application JVM:
· Is the IDE forcing a specific encoding, for instance if I’m using Eclipse?
· Is every operation that depends on the standard locale for character I/O carrying along the correct encoding, for example when reading a file? Reader r = new InputStreamReader(new FileInputStream("myfile"), "UTF-8");
· Tridion Deployer: if it runs on a file system, consider starting the Deployer with the -Dfile.encoding=UTF8 command-line option.
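The file-reading bullet above can be expanded into a small runnable sketch. It assumes the file really was written as UTF-8. The class and method names are mine, and "myfile" is just the placeholder from the bullet.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Illustrative helper: never relies on the JVM default charset when reading.
final class ExplicitEncodingRead {
    /** Reads the whole file as UTF-8, regardless of the JVM's default charset. */
    static String readUtf8(String path) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader and the underlying stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "myfile" is a placeholder path; pass a real one as the first argument.
        String path = args.length > 0 ? args[0] : "myfile";
        System.out.println(readUtf8(path));
    }
}
```

Naming the charset in code means the result no longer depends on the -Dfile.encoding value or the operating system locale.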
5. Web Servers: Further along the trail, and despite all of the above, most web servers are happily unaware of any encoding or treat the communication channel as ISO-8859-1. So the next checkpoint, really two in one, is at the level of web servers such as IIS, Tomcat or Sun Java System Application Server. Did you know that, depending on the web server, even GET and POST requests can be treated differently by the same server? Beware that these settings are server dependent: while Sun’s JSAS treats GET and POST the same based on one configuration, Tomcat may not, and IIS expects the individual settings to be specified[3].
· IIS/.NET web.config: <globalization fileEncoding="UTF-8" requestEncoding="UTF-8" responseEncoding="UTF-8"/>
· Tomcat server.xml: set URIEncoding="UTF-8" on the <Connector> element
· Sun Java System Application Server sun-web.xml: include <parameter-encoding default-charset="UTF-8"/>
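To see what a hop that assumes ISO-8859-1 does to UTF-8 content, here is a small sketch that reproduces the symptom outside any server. The class name is made up for illustration, and the strings use \u escapes so the source file's own encoding cannot interfere.

```java
import java.nio.charset.StandardCharsets;

// Illustrative demo of the "quirky A" effect; not tied to any particular server.
final class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes as ISO-8859-1, as a misconfigured hop would. */
    static String misread(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the damage; this only works while no bytes were lost along the way. */
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e9";           // e with acute accent, one character
        String garbled = misread(original);   // two characters: A with tilde, copyright sign
        System.out.println(original.length() + " character became " + garbled.length());
        System.out.println("Repaired correctly: " + repair(garbled).equals(original));
    }
}
```

One accented character turning into two, the first of them an accented A, is the typical fingerprint of UTF-8 bytes being read as ISO-8859-1.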
6. Page level: The page itself can override the encoding directives of the HTTP header settings, in:
· HTML: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
· .NET: <%@ Page ResponseEncoding="utf-8" %>
· Java/JSP:
<%@page pageEncoding="UTF-8"%>
<%@page contentType="text/html;charset=UTF-8"%>
<% request.setCharacterEncoding("UTF-8"); %>
· XML: <?xml version="1.0" encoding="UTF-8"?>
7. Create your own abstraction layer to interact with the CM, also for overriding server settings. If step 5 has given you visions of long, dark nights bravely searching your server’s documentation for that minuscule setting, there is light at the end of the tunnel. Put your magnifying glass away. It is possible to establish a server-independent encoding layer. Consider setting a context parameter in WEB-INF/web.xml and propagating it throughout your code by reading it before any other parameters and passing it along in extensions of the request object for both GET and POST methods.
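One way such a layer could look is sketched below, kept free of the servlet API so it runs on its own. QueryDecoder is a hypothetical helper, not something Tridion or any server provides. In a real web application I would expect the encoding to come from the context parameter (for example via ServletContext.getInitParameter) and the raw string from HttpServletRequest.getQueryString() for GET, or from the form body for POST.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes request parameters with one configured encoding,
// instead of relying on whatever the web server assumes.
final class QueryDecoder {
    private final String encoding;

    QueryDecoder(String encoding) {
        this.encoding = encoding;
    }

    /**
     * Splits a raw (still percent-encoded) query string or form body and
     * decodes names and values. If a name repeats, the last value wins.
     */
    Map<String, String> decode(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        try {
            for (String pair : rawQuery.split("&")) {
                if (pair.isEmpty()) {
                    continue; // tolerate stray separators such as "a=1&&b=2"
                }
                int eq = pair.indexOf('=');
                String name = eq >= 0 ? pair.substring(0, eq) : pair;
                String value = eq >= 0 ? pair.substring(eq + 1) : "";
                params.put(URLDecoder.decode(name, encoding),
                           URLDecoder.decode(value, encoding));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
        return params;
    }

    public static void main(String[] args) {
        // "UTF-8" stands in for the value read from the context parameter.
        QueryDecoder decoder = new QueryDecoder("UTF-8");
        System.out.println(decoder.decode("city=M%C3%BCnchen&lang=de"));
    }
}
```

Because the same decoder handles both the GET query string and a form-encoded POST body, the two can no longer drift apart because of server configuration.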
Here’s hoping data fidelity serves you right,
and happy encoding.
[1] http://sdllivecontent.sdl.com/LiveContent/content/en-US/SDL_Tridion_2011/concept_879633C70905448885956711778D2C0E
[2] http://mindprod.com/jgloss/encoding.html
[3] http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
Great post, Elena! Very applicable since character encoding issues can cross all three consulting roles and can affect authors and users.
Well done Elena. Very good article.
As more customers use Experience Manager and/or Tridion's Content Delivery Web Service, we can include any "OData" Web servers as places to also check under #5 Web Servers. :-)
Hi Elena, I have done all the settings you have mentioned in this article. Yet after setting up Tridion UGC 2011 SP1 my data encoding doesn't work.
To add to #4, setting the correct file encoding for a Deployer on a Windows Server is done by setting -Dfile.encoding via a jvmarg in the registry. The key is "HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Tridion\Content Delivery\General", and this image shows an example of setting it to UTF-8: http://bkjh.home.xs4all.nl/images/ContentDeliveryJvmArg.png
This works for me:
In this file weblogic-application.xml:
webapp.encoding.default
ISO-8859-1
Greetings from Chile!