Java Unit Test to check UTF-8 chars
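During a migration to a new platform, we detected an issue with the character encoding: some of the messages contained the Unicode replacement character (�, U+FFFD), which typically appears when a decoder encounters bytes that are not valid in the expected encoding.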
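To illustrate how this character can end up in a message, here is a minimal sketch. It assumes the text was encoded as ISO-8859-1 and then decoded as UTF-8, which is one common way to produce this symptom; the actual misconfiguration behind the issue may have been different. The class and method names are hypothetical and only serve the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is not taken from the original project.
class ReplacementCharDemo {

    // Encodes the text as ISO-8859-1, then (wrongly) decodes those bytes as UTF-8.
    static String decodeWithWrongCharset(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        // new String(byte[], Charset) replaces malformed input with U+FFFD.
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf\u00E9" is "café"; the escape keeps this file encoding-independent.
        String garbled = decodeWithWrongCharset("caf\u00E9");
        // The single byte 0xE9 is not a complete UTF-8 sequence, so it becomes U+FFFD.
        System.out.println(garbled.indexOf('\uFFFD') >= 0); // prints "true"
    }
}
```

Plain ASCII text survives this round trip unchanged, which is why such problems often go unnoticed until a message contains accented or other non-ASCII characters.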
Fortunately, we were able to fix the configuration issue. To make sure it does not happen again, we put in place a variation of the following unit test:
import org.junit.Assert;
import org.junit.Test;

public class CheckUtf8ReplacementChar {

    // Returns true if the given string contains at least one replacement character (U+FFFD).
    private boolean containsUtf8ReplacementCharacter(String target) {
        // 65533 is the decimal value of U+FFFD.
        final int REPLACEMENT_CHARACTER_VALUE = 65533;
        for (int i = 0; i < target.length(); i++) {
            if ((int) target.charAt(i) == REPLACEMENT_CHARACTER_VALUE) {
                return true;
            }
        }
        return false;
    }

    @Test
    public void shouldDetectUtf8ReplacementChar() {
        final String wrongString = "Wrong characters ������������������<br>";
        final String okString = "OK characters";
        Assert.assertTrue(containsUtf8ReplacementCharacter(wrongString));
        Assert.assertFalse(containsUtf8ReplacementCharacter(okString));
    }
}
This can certainly be improved, but we didn’t have much time to think about it. It is simply the first approach we came up with.
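One possible simplification, offered as a sketch rather than as the test from the project: `String.indexOf` can replace the manual loop, and writing the character as the escape `'\uFFFD'` keeps the check independent of the encoding of the source file itself. The class name `ReplacementCharCheck` and the decision to treat `null` as clean are assumptions made for this example.

```java
// Hypothetical helper; not part of the original test class.
final class ReplacementCharCheck {

    // U+FFFD (decimal 65533), written as an escape so the source file's
    // own encoding cannot corrupt it.
    static final char REPLACEMENT_CHARACTER = '\uFFFD';

    private ReplacementCharCheck() {
        // Utility class, not meant to be instantiated.
    }

    // Returns true if the text contains at least one U+FFFD.
    // Assumption: a null input is treated as containing no bad characters.
    static boolean containsReplacementCharacter(String text) {
        return text != null && text.indexOf(REPLACEMENT_CHARACTER) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(containsReplacementCharacter("Wrong characters \uFFFD\uFFFD<br>")); // prints "true"
        System.out.println(containsReplacementCharacter("OK characters")); // prints "false"
    }
}
```

The same method can be called from a JUnit test with `Assert.assertTrue` and `Assert.assertFalse`, exactly as in the test above.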