Starting with version 3.1, Simplicité uses the UTF-8 encoding by default (previous versions used the ISO-8859-1 encoding by default).
Depending on your installation, using UTF-8 may require some additional configuration, described below.
If you experience any character encoding issues, something has probably been misconfigured or forgotten.
It is required to set the JVM default encoding to UTF-8.
The most reliable way to do this is to add an explicit -Dfile.encoding=UTF-8 to the JVM options.
Note: on Linux it is also possible to set the LANG environment variable (e.g. en_US.UTF-8), either for the account running Tomcat or, preferably, globally.
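To verify that the JVM option has actually been taken into account, a small check along these lines can be run with the same JVM options as the application server. This is only a sketch: the class name `DefaultEncodingCheck` and the helper `isUtf8` are made up for illustration and are not part of the platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultEncodingCheck {
    /** Returns true when the given charset is UTF-8 (hypothetical helper). */
    static boolean isUtf8(Charset charset) {
        return StandardCharsets.UTF_8.equals(charset);
    }

    public static void main(String[] args) {
        // Charset used by the JVM whenever no explicit encoding is given
        Charset defaultCharset = Charset.defaultCharset();
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + defaultCharset);
        if (!isUtf8(defaultCharset)) {
            System.err.println("Warning: the JVM default encoding is not UTF-8");
        }
    }
}
```

Running it once with and once without -Dfile.encoding=UTF-8 shows whether the option changes anything on a given system.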
For Tomcat 6 and 7, the connector definitions in conf/server.xml need to be updated to force UTF-8 for URI encoding:
<Connector URIEncoding="UTF-8" ... />
Starting with Tomcat 8 this is the default, so you only need to change something if you don't use UTF-8.
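The effect of a wrong URI encoding can be illustrated with plain Java. The sketch below decodes the same percent-encoded value once as UTF-8 and once as ISO-8859-1. It uses `java.net.URLDecoder` purely as an illustration (it is not what Tomcat runs internally), and the class name `UriDecodingDemo` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UriDecodingDemo {
    /** Decodes a percent-encoded value using the given charset. */
    static String decode(String encoded, Charset charset) {
        try {
            return URLDecoder.decode(encoded, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a charset obtained from a Charset instance
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = "caf%C3%A9"; // "café" percent-encoded as UTF-8
        // Decoding with the matching charset yields "café"
        System.out.println(decode(encoded, StandardCharsets.UTF_8));
        // Decoding with ISO-8859-1 yields the garbled "cafÃ©"
        System.out.println(decode(encoded, StandardCharsets.ISO_8859_1));
    }
}
```

The second result is the kind of garbled text that typically shows up in request parameters when the connector does not decode URIs as UTF-8.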
Nothing to do :-)
The default encoding of the database must be set to utf8 and the default collation to utf8_unicode_ci.
Warning: you can use other language-specific Unicode collations instead of utf8_unicode_ci if needed, but if you use the utf8_bin collation, column searches will be case sensitive.
This can be set as server's default in your MySQL config file:
[mysqld]
(...)
collation_server=utf8_unicode_ci
character_set_server=utf8
(...)
Note: a database service restart is needed after changing these values.
This can also be set at the database level by:
CREATE DATABASE <database name> DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
This can also be done after creation with:
ALTER DATABASE <database name> DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Note: if the database was loaded before its character set was set to UTF-8, you must either reload it or explicitly convert all tables (see below).
When using the setup package, db-mysql.properties must be adjusted to enable UTF-8 support in the JDBC URL of the datasource: the JDBC URL must contain the character encoding parameters (e.g. characterEncoding=utf8).
For Tomcat, this results in a datasource descriptor similar to this one:
<Resource name="jdbc/mysqlexample" type="javax.sql.DataSource" auth="Container" username="<username>" password="<password>" driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://<host>:<port>/<database name>?autoReconnect=true&amp;characterEncoding=utf8&amp;characterSetResults=utf8"/>
To check the current charset and collation of existing tables you can use:
SHOW TABLE STATUS LIKE '<table name>';
To convert existing tables to UTF-8 you can use:
ALTER TABLE <table name> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
The database must be created with UTF-8 encoding:
create database simplicite encoding 'UTF8' lc_ctype 'en_US.UTF-8' lc_collate 'en_US.UTF-8' template <an UTF-8 database template name>;
Note: if you have created a database using another encoding, you must drop it and create it again.
No additional configuration is then needed at the datasource descriptor level.
Unicode support must be present and installed on the server. Nothing else is required.
There is no native UTF-8 support (unless using nvarchar types, which is not the default).
Note: this constraint does not seem to apply to SQLServer 2017+ on Linux.
Java and server side scripts
Make sure that you use Globals.getPlatformEncoding() to designate the platform encoding (instead of a hard-coded encoding name) when you use APIs that have encoding argument(s).
Note: you shouldn't be using such APIs unless you really need to do explicit encoding conversions (e.g. from an ISO-8859-1 encoded file in an adapter).
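As an illustration of such an explicit conversion, the sketch below re-encodes ISO-8859-1 bytes into the platform encoding. To keep it self-contained, the constant `PLATFORM_ENCODING` is a hypothetical stand-in for the value that `Globals.getPlatformEncoding()` would provide in platform code (assumed here to be a charset name); the class and method names are likewise made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class AdapterEncodingSketch {
    // Hypothetical stand-in: in platform code, use Globals.getPlatformEncoding()
    // instead of this hard-coded value (assumed to return a charset name)
    static final String PLATFORM_ENCODING = "UTF-8";

    /** Re-encodes ISO-8859-1 bytes into the platform encoding. */
    static byte[] toPlatformEncoding(byte[] isoBytes) {
        // Decode with the known source encoding...
        String text = new String(isoBytes, StandardCharsets.ISO_8859_1);
        // ...then encode with the platform encoding
        return text.getBytes(Charset.forName(PLATFORM_ENCODING));
    }

    public static void main(String[] args) {
        byte[] iso = { (byte) 0xE9 }; // "é" in ISO-8859-1 (one byte)
        byte[] converted = toPlatformEncoding(iso);
        System.out.println(converted.length + " byte(s) after conversion");
    }
}
```

The point of the pattern is that only the source encoding is named explicitly, since it is a property of the incoming data; the target encoding is always taken from the platform.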
Custom JSP pages and servlets
If you have custom JSP pages, you need to adjust the following directive if present (note that you shouldn't have any if you use recent versions of the platform, for which external objects must be preferred to custom JSP pages and servlets):
<%@ page pageEncoding="UTF-8" %>
You should also adjust the following instruction if present in your custom JSP pages and/or servlets:
Note: if you use the standard API ServletTool.setHTTPHeaders method instead of the above directive and/or instruction (which is definitely the right approach), you don't need to do anything.
If you need to convert a text file from ISO-8859-1 to UTF-8 you can, for instance, use the Linux iconv command line tool:
iconv -f ISO-8859-1 -t UTF-8 iso.txt > utf.txt
Most modern text editors also provide features to convert files from one encoding to another.
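Where neither iconv nor a suitable editor is available, the same conversion can be sketched in a few lines of Java. The class name `FileEncodingConverter` is hypothetical, and the file names simply mirror the iconv example above. The whole file is read into memory, so this is only suitable for reasonably small files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileEncodingConverter {
    /** Reads source using the 'from' charset and writes it to target using the 'to' charset. */
    static void convert(Path source, Charset from, Path target, Charset to) {
        try {
            byte[] raw = Files.readAllBytes(source);
            String text = new String(raw, from);    // decode with the source encoding
            Files.write(target, text.getBytes(to)); // encode with the target encoding
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same conversion as: iconv -f ISO-8859-1 -t UTF-8 iso.txt > utf.txt
        convert(Paths.get("iso.txt"), StandardCharsets.ISO_8859_1,
                Paths.get("utf.txt"), StandardCharsets.UTF_8);
    }
}
```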