Monday, March 23, 2009

How do I use regular expressions to decode text containing HTML escape codes?

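import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sample input containing decimal numeric character references.
String s = "Some html escape codes &#65;&#66;&#67;";
StringBuffer stringBuffer = new StringBuffer();
// Capture the decimal digits between "&#" and ";".
String re = "&#(\\d+);";
Pattern pattern = Pattern.compile(re);
Matcher matcher = pattern.matcher(s);
while (matcher.find())
{
    // Convert the captured decimal code to its character.
    char c = (char) Integer.parseInt(matcher.group(1));
    // quoteReplacement keeps '$' and '\' from being read as group references.
    matcher.appendReplacement(stringBuffer, Matcher.quoteReplacement(Character.toString(c)));
}
// Copy whatever follows the last match.
matcher.appendTail(stringBuffer);
String result = stringBuffer.toString();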
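
The same loop can be wrapped in a reusable method. The sketch below is one possible variation, not part of the snippet above. The class name NumericEntityDecoder and the method decode are made up for illustration. It assumes you also want hexadecimal references such as &#x41;, code points above U+FFFF (which a plain char cast would truncate), and malformed or out-of-range codes left untouched. Named entities such as &amp; are deliberately not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class NumericEntityDecoder {
    // Group 1 captures hex digits (&#x41;), group 2 captures decimal digits (&#65;).
    private static final Pattern ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|(\\d+));");

    static String decode(String input) {
        Matcher matcher = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement;
            try {
                int codePoint = matcher.group(1) != null
                        ? Integer.parseInt(matcher.group(1), 16)
                        : Integer.parseInt(matcher.group(2));
                // Assumption: invalid code points are left as the original text.
                replacement = Character.isValidCodePoint(codePoint)
                        ? new String(Character.toChars(codePoint))
                        : matcher.group();
            } catch (NumberFormatException e) {
                // The number does not fit in an int; keep the reference as written.
                replacement = matcher.group();
            }
            // Quote the replacement so '$' and '\' are inserted literally.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

For example, NumericEntityDecoder.decode("&#x41;&#66;&#67;") returns "ABC", while a named entity such as &amp; passes through unchanged.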
