Overview:
This is a brief article on enabling support for displaying UTF-8 characters (e.g. Japanese text) on the JSP/HTML pages of a web application running on Tomcat. We can achieve this in THREE easy steps.
Making UTF-8 work with Java and Tomcat on Linux/Windows requires the following:
- Update Tomcat's server.xml
- Define a javax.servlet.Filter and Update the Web Application's web.xml
- Enable UTF-8 encoding on JSP/HTML
Update Tomcat's server.xml
This handles the GET request URL. With this configuration, the Connector uses UTF-8 to decode all incoming GET request parameters.
<Connector
. . .
URIEncoding="UTF-8"/>
http://localhost:8080/foo-app/get?foo_param=こんにちは世界
e.g. request.getParameter("foo_param") // the value is decoded as UTF-8, so you get the original text as it is ("こんにちは世界").
IMPORTANT NOTE: POST requests are NOT affected by this change.
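To illustrate why the decoding charset matters for GET parameters, here is a minimal standalone sketch. It does not use Tomcat at all; it only mimics the percent-decoding step with the JDK's URLDecoder, and the class and method names (UriEncodingDemo, decode, encodeUtf8) are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UriEncodingDemo {

    // Percent-decodes a query-string value using the given charset name.
    public static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Percent-encodes a value as UTF-8, roughly what a browser sends on the wire.
    public static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes for "こんにちは", so the source file encoding does not matter.
        String original = "\u3053\u3093\u306B\u3061\u306F";
        String onTheWire = encodeUtf8(original);

        System.out.println(onTheWire);
        // Decoding with the charset the client used restores the text.
        System.out.println(decode(onTheWire, "UTF-8").equals(original));      // true
        // Decoding the same bytes as ISO-8859-1 yields one garbage char per byte.
        System.out.println(decode(onTheWire, "ISO-8859-1").equals(original)); // false
    }
}
```

The same bytes arrive in both cases; only the charset chosen for decoding differs, which is what the URIEncoding attribute controls for the request URL.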
Define a javax.servlet.Filter and Update the Web Application's web.xml
Now, we need to make our web application handle all requests and responses as UTF-8. This way, POST requests are covered as well. For this purpose, we define a character set filter that applies UTF-8 encoding to all requests and responses, as follows.
package org.fazlan.tomcat.ext.filter;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import java.io.IOException;

/**
 * This is a filter class to force the Java webapp to handle all requests and responses as UTF-8 encoded by default.
 * This requires that we define a character set filter.
 * This filter makes sure that if the browser hasn't set the encoding used in the request, it is set to UTF-8.
 */
public class CharacterSetFilter implements Filter {

    private static final String UTF8 = "UTF-8";
    private static final String CONTENT_TYPE = "text/html; charset=UTF-8";

    private String encoding;

    @Override
    public void init(FilterConfig config) throws ServletException {
        // Read the request encoding from web.xml; fall back to UTF-8 if it is not configured
        encoding = config.getInitParameter("requestCharEncoding");
        if (encoding == null) {
            encoding = UTF8;
        }
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Honour the client-specified character encoding
        if (null == request.getCharacterEncoding()) {
            request.setCharacterEncoding(encoding);
        }

        // Set the default response content type and encoding
        response.setContentType(CONTENT_TYPE);
        response.setCharacterEncoding(UTF8);

        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
    }
}
The filter ensures that if the browser has not set the encoding format in the request, UTF-8 is set as the default encoding. Also, it sets UTF-8 as the default response encoding.
Now, we need to add this to our web application's web.xml to make it work.
. . .
<filter>
    <filter-name>CharacterSetFilter</filter-name>
    <filter-class>org.fazlan.tomcat.ext.filter.CharacterSetFilter</filter-class>
    <init-param>
        <!-- Must match the name read by config.getInitParameter(...) in the filter -->
        <param-name>requestCharEncoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>CharacterSetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
. . .
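To see why the filter's request-encoding default matters for POST data, consider the following simplified sketch. It assumes the container falls back to ISO-8859-1 when a request declares no encoding (the Servlet specification's default) and it skips the percent-decoding of the form body. The class and method names (RequestBodyEncodingDemo, readBody) are hypothetical and are not part of the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RequestBodyEncodingDemo {

    // Mimics turning raw body bytes into text: use the encoding the client
    // declared, or the given fallback when the client declared none.
    public static String readBody(byte[] body, String declaredEncoding, Charset fallback) {
        Charset charset = (declaredEncoding != null)
                ? Charset.forName(declaredEncoding)
                : fallback;
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes for "世界", sent by the browser as UTF-8 bytes.
        String submitted = "\u4E16\u754C";
        byte[] body = submitted.getBytes(StandardCharsets.UTF_8);

        // No declared encoding, ISO-8859-1 fallback: the text is mangled.
        System.out.println(readBody(body, null, StandardCharsets.ISO_8859_1).equals(submitted)); // false
        // No declared encoding, UTF-8 fallback (what the filter sets up): text survives.
        System.out.println(readBody(body, null, StandardCharsets.UTF_8).equals(submitted));      // true
        // A client-declared encoding is honoured regardless of the fallback.
        System.out.println(readBody(body, "UTF-8", StandardCharsets.ISO_8859_1).equals(submitted)); // true
    }
}
```

This mirrors the null check in doFilter: the fallback only applies when request.getCharacterEncoding() returns null.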
Enable UTF-8 encoding on JSP/HTML
JSP Pages
All JSP pages that need to render UTF-8 content must have the following page directive at the top of the page.
<%@ page contentType="text/html;charset=UTF-8" language="java" pageEncoding="UTF-8" %>
HTML Pages
All HTML pages that need to render UTF-8 content must have the following in their header section.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
...
</head>
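Declaring UTF-8 in the page directive or meta tag only helps if the page file itself is actually saved as UTF-8. As a rough sketch, the bytes of a file could be checked with the JDK's strict charset decoder. The class and method names here (Utf8Check, isValidUtf8) are made up for illustration, and reading the file into a byte array is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    // Returns true when the bytes form a well-formed UTF-8 sequence.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Report bad input instead of silently replacing it.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "こんにちは" saved as UTF-8.
        byte[] good = "\u3053\u3093\u306B\u3061\u306F".getBytes(StandardCharsets.UTF_8);
        // "café" saved as ISO-8859-1: the trailing 0xE9 byte is not valid UTF-8.
        byte[] bad = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(isValidUtf8(good)); // true
        System.out.println(isValidUtf8(bad));  // false
    }
}
```

Note that pure ASCII content passes this check too, since ASCII is a subset of UTF-8.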
Summary:
The article looked at how to support UTF-8 content in your web application deployed on Tomcat.