Using the XML parser for HTML

torontonian · August 17, 2020, 2:23pm

I tried feeding some fetched HTML into the XML parser to extract a few elements, and it gave me several errors. It seems to be strict about closing tags and unrecognized entities.
Is it possible to make it more permissive, so that it accepts loosely formed HTML?

bill · August 17, 2020, 4:01pm

It is not possible. A new parser would likely have to be written.
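
To illustrate the kind of strictness described above, here is a minimal sketch in Python. It does not use the parser discussed in this thread. It assumes Python's standard library as a stand-in: `xml.etree.ElementTree` plays the strict XML parser and `html.parser.HTMLParser` plays a separate, tolerant HTML parser. The sample markup and the names `SAMPLE`, `parses_as_xml`, `LinkCollector`, and `collect_links` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical loosely formed HTML: <p> and <br> are never closed, and
# &nbsp; is not one of the five entities predefined by XML.
SAMPLE = "<div><p>Hello&nbsp;world<br><a href='/x'>link</a></div>"


def parses_as_xml(text):
    """Return True if the strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, tolerating unclosed tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self.links.extend(value for name, value in attrs if name == "href")


def collect_links(text):
    """Run the tolerant HTML parser over the text and return the links found."""
    collector = LinkCollector()
    collector.feed(text)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(parses_as_xml(SAMPLE))   # the strict parser rejects the sample
    print(collect_links(SAMPLE))   # the tolerant parser still finds the link
```

In this sketch the tolerance comes from using a different parser, not from relaxing the XML one, which matches the point that a separate parser is needed for loosely formed HTML.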