Suppose you want to select the contents of all comments in an HTML (or XML) document using Java regular expressions.

For instance:


There is a powerful regular expression, which can be found on this page.

static final String commentRegex = "(// )?\\<![ \\r\\n\\t]*(--([^\\-]|[\\r\\n]|-[^\\-])*--[ \\r\\n\\t]*)\\>";
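As a minimal sketch of how this expression might be applied, the following uses the standard `java.util.regex` classes to collect every match. The class name `CommentFinder`, the helper `findComments`, and the sample input are hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentFinder {
    // The regular expression quoted above, unchanged.
    static final String commentRegex = "(// )?\\<![ \\r\\n\\t]*(--([^\\-]|[\\r\\n]|-[^\\-])*--[ \\r\\n\\t]*)\\>";

    // Compile once and reuse; Pattern instances are immutable and thread-safe.
    static final Pattern COMMENT = Pattern.compile(commentRegex);

    // Hypothetical helper: returns every full comment, delimiters included.
    static List<String> findComments(String html) {
        List<String> comments = new ArrayList<>();
        Matcher m = COMMENT.matcher(html);
        while (m.find()) {
            comments.add(m.group());
        }
        return comments;
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        String html = "<p>a</p><!-- first --><p>b</p><!-- second -->";
        for (String comment : findComments(html)) {
            System.out.println(comment);
        }
    }
}
```

Here `m.group()` returns the whole comment including `<!` and `>`. Capturing group 2 holds the part between those delimiters, so `m.group(2)` could be used instead if only that portion is wanted.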

However, this regular expression can cause your application to hang on malformed input, that is, documents in which a comment is opened but never closed, like:
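One plausible reason for the hang is that the alternatives `[^\\-]` and `[\\r\\n]` both match a line break, so on an unterminated comment the matcher may try a very large number of equivalent paths before giving up. The sketch below is one possible mitigation and not necessarily the fix the original author had in mind. It drops the redundant `[\\r\\n]` alternative and makes the repetition possessive (`*+`) so that the engine does not backtrack into it. The names `SafeCommentFinder`, `safeCommentRegex`, and `findComments` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeCommentFinder {
    // Variant of the expression above (an assumption of this sketch):
    //  - the redundant [\r\n] alternative is removed, since [^\-] already matches line breaks
    //  - the repetition uses the possessive quantifier *+ so it is never re-tried
    static final String safeCommentRegex = "(// )?\\<![ \\r\\n\\t]*(--([^\\-]|-[^\\-])*+--[ \\r\\n\\t]*)\\>";

    static final Pattern COMMENT = Pattern.compile(safeCommentRegex);

    // Hypothetical helper: returns every full comment, delimiters included.
    static List<String> findComments(String html) {
        List<String> comments = new ArrayList<>();
        Matcher m = COMMENT.matcher(html);
        while (m.find()) {
            comments.add(m.group());
        }
        return comments;
    }

    public static void main(String[] args) {
        // Build a comment that is opened but never closed, followed by many line breaks.
        StringBuilder bad = new StringBuilder("<!-- never closed");
        for (int i = 0; i < 200; i++) {
            bad.append('\n');
        }
        bad.append("<p>text</p>");
        System.out.println(findComments(bad.toString()).size());
    }
}
```

Whether this variant matches exactly the same set of well-formed comments as the original expression should be checked against your own documents before relying on it.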


