Regular Expression to Find HTML Comments in Java

by Max Rohde


You would like to select the contents of all comments in an HTML (or XML) document using Java Regular Expressions.

For instance:


There is a powerful regular expression, which can be found on this page.

static final String commentRegex = "(// )?\\<![ \\r\\n\\t]*(--([^\\-]|[\\r\\n]|-[^\\-])*--[ \\r\\n\\t]*)\\>";
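As a rough sketch of how this expression might be applied, the class below compiles it once and collects every match from a string. The class name CommentFinder, the method findComments and the sample HTML are made up for illustration; only commentRegex comes from the text above. It returns the whole comment including the <!-- and --> delimiters; group(2) of each match would give the part between <! and > instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentFinder {

    // The expression quoted above, unchanged.
    static final String commentRegex = "(// )?\\<![ \\r\\n\\t]*(--([^\\-]|[\\r\\n]|-[^\\-])*--[ \\r\\n\\t]*)\\>";

    // Compile once and reuse; Pattern instances are immutable and thread-safe.
    private static final Pattern COMMENT_PATTERN = Pattern.compile(commentRegex);

    // Hypothetical helper: returns every comment found in the input, in document order.
    static List<String> findComments(String html) {
        List<String> comments = new ArrayList<>();
        Matcher matcher = COMMENT_PATTERN.matcher(html);
        while (matcher.find()) {
            // group() is the full match; group(2) would omit the leading "<!" and trailing ">".
            comments.add(matcher.group());
        }
        return comments;
    }

    public static void main(String[] args) {
        // Illustrative input with a single-line and a multi-line comment.
        String html = "<p>before</p><!-- first --><p>between</p><!-- second\nline --><p>after</p>";
        for (String comment : findComments(html)) {
            System.out.println(comment);
        }
    }
}
```

This sketch assumes well-formed input in which every comment is closed.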

However, this regular expression can cause your application to 'hang' when it is given (bad, bad input) documents that open a comment without a matching comment end, like: