
> Try to parse this with it, <html>hello</lol>world</html>.

Parsing invalid XML with an XML parser throws an error, as it should.

I am using cElementTree for parsing, and this is what happens with your input:

    In [1]: import xml.etree.cElementTree as ElementTree

    In [2]: ElementTree.XML('<html>hello</lol>world</html>')
    ---------------------------------------------------------------------------
    ParseError                                Traceback (most recent call last)
    /home/rahul/musings/python/<ipython-input-2-54e782b0af58> in <module>()
    ----> 1 ElementTree.XML('<html>hello</lol>world</html>')

    /home/rahul/musings/python/<string> in XML(text)

    ParseError: mismatched tag: line 1, column 13
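
The same check can be done programmatically instead of reading a traceback. This is a minimal sketch, assuming Python 3, where the module is imported as `xml.etree.ElementTree` (the `cElementTree` alias was removed in Python 3.9). The helper name `try_parse` is made up for illustration.

```python
import xml.etree.ElementTree as ElementTree


def try_parse(text):
    """Return (root, None) on success, or (None, error message) on failure."""
    try:
        return ElementTree.XML(text), None
    except ElementTree.ParseError as exc:
        # The parser rejects input that is not well-formed instead of guessing.
        return None, str(exc)


root, error = try_parse('<html>hello</lol>world</html>')
print(error)  # reports the mismatched tag and its position
```

Well-formed input such as `<html>hello</html>` comes back as an element with `error` set to `None`.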

> make my own xmldict parser in that way

What is "that way" you are talking about? There isn't an xmldict implementation for Python (at least I couldn't find one on Google). I needed one, so I wrote a recursive-descent parser. Recursive descent is a popular choice when your grammar is LL(k): GCC and Clang, for example, use hand-written recursive-descent parsers. Recursive-descent parsers are also among the easiest to hand-roll.
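
To make the idea concrete, here is a rough sketch of a recursive-descent XML-to-dict parser. It is not the actual xmldict code. It assumes a tiny XML subset (no attributes, comments, entities, or self-closing tags), and the names `Parser`, `xml_to_dict`, and `XMLSyntaxError` are invented for the example. Each grammar rule becomes one method, and nesting is handled by the methods calling each other.

```python
import re

NAME = re.compile(r'[A-Za-z_][\w.-]*')


class XMLSyntaxError(ValueError):
    """Raised when the input does not match the tiny grammar below."""


class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def expect(self, literal):
        if not self.text.startswith(literal, self.pos):
            raise XMLSyntaxError('expected %r at offset %d' % (literal, self.pos))
        self.pos += len(literal)

    def name(self):
        match = NAME.match(self.text, self.pos)
        if match is None:
            raise XMLSyntaxError('expected a tag name at offset %d' % self.pos)
        self.pos = match.end()
        return match.group()

    def element(self):
        # element := '<' NAME '>' content '</' NAME '>'
        self.expect('<')
        tag = self.name()
        self.expect('>')
        value = self.content()
        self.expect('</')
        closing = self.name()
        if closing != tag:
            raise XMLSyntaxError('mismatched tag: <%s> closed by </%s>' % (tag, closing))
        self.expect('>')
        return tag, value

    def content(self):
        # content := (TEXT | element)*
        children, text = {}, []
        while self.pos < len(self.text) and not self.text.startswith('</', self.pos):
            if self.text[self.pos] == '<':
                tag, value = self.element()  # recursion handles nesting
                children.setdefault(tag, []).append(value)
            else:
                end = self.text.find('<', self.pos)
                end = len(self.text) if end == -1 else end
                text.append(self.text[self.pos:end])
                self.pos = end
        if children:
            # Repeated child tags become a list; text next to children is dropped.
            return {t: v[0] if len(v) == 1 else v for t, v in children.items()}
        return ''.join(text).strip()


def xml_to_dict(text):
    parser = Parser(text.strip())
    tag, value = parser.element()
    if parser.pos != len(parser.text):
        raise XMLSyntaxError('trailing data at offset %d' % parser.pos)
    return {tag: value}


print(xml_to_dict('<a><b>1</b><b>2</b><c>x</c></a>'))
```

Telling a child element (`<`) from a closing tag (`</`) takes two characters of lookahead, which is the kind of bounded lookahead LL(k) refers to. Like the real XML parser above, the sketch raises an error on `<html>hello</lol>world</html>` rather than guessing.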

PS - There is a way to disagree. Yours isn't the right way.


