The problem is that this kind of philosophy is fundamentally incompatible with HTML5.
There was an attempt at a "strict-mode" HTML: XHTML, i.e. HTML served as XML. It failed (on the web) for various reasons (including IE). HTML5 instead specifies exactly what every browser must do upon encountering tag soup, which is useful because real-world HTML has been tag soup for a very long time.
I guess the strictest thing you can do is to die upon encountering "validation errors", but I don't think this would do much to simplify your job. (Maybe you can drop the adoption agency algorithm?) And now your parser chokes on a lot of websites, most likely on hand-written HTML, which has greater potential for validation errors but also typically has simpler layout.
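To make the "die on errors" trade-off concrete, here is a rough sketch using only Python's standard library. The names `TAG_SOUP`, `strict_parse`, `TagCollector` and `lenient_tags` are made up for illustration. Note that `html.parser` is only a forgiving tokenizer, not an HTML5 tree builder (there is no adoption agency in there), so this shows the strict vs. lenient contrast and nothing more.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical hand-written markup: unclosed <p>, void <br> without a slash.
TAG_SOUP = "<p>Hello<br>world<p>Second paragraph"


def strict_parse(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A "strict-mode" parser gives up here instead of recovering.
        return False


class TagCollector(HTMLParser):
    """Records start tags in order; unclosed elements are tolerated."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup):
    """Return the start tags a forgiving tokenizer sees in the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(strict_parse(TAG_SOUP))   # strict parser rejects the page
    print(lenient_tags(TAG_SOUP))   # lenient tokenizer keeps going
```

The strict path rejects the whole page for a missing `</p>`, while the lenient one carries on. What the lenient path should build out of that input is exactly the part HTML5 spells out.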
And HTML parsing is still the easy part of writing a browser! Layout is much harder to do, partly because layout is inherently hard, but also because it's under-specified. Implement "undefined behavior" differently from other browsers, and your browser won't work on a lot of pages.
(There have been improvements, but the HTML spec is still miles ahead of CSS here. E.g. CSS 2 has no normative automatic table layout algorithm, and AFAICT the CSS 3 version is still "not yet ready for implementation".)