
Well, nothing prevents this from being 3-5 lines of Python with a more suitable data structure:

    $ pip install dawg
and then

    import dawg
    words = open('/usr/share/dict/words', 'r').read().splitlines()
    d = dawg.DAWG(words)
(this example actually works)



Let's not store the whole file in memory at once!


I think the point of using data structures like DAWG is to reduce memory consumption to the point where it is feasible to store the whole dataset in memory.

A practical DAWG application would look like this anyway:

    import dawg
    d = dawg.DAWG().load('words.dawg')
because DAWG minimization may require a lot of memory.
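
To illustrate why a DAWG can end up small enough to keep in memory, here is a toy sketch in pure Python: build a plain trie, then merge identical subtrees so shared suffixes are stored once. This is not how the `dawg` package is implemented; the names `Node`, `build_trie`, `minimize`, `count_nodes` and `contains` are made up for this example, and it also shows why minimization itself can be memory-hungry, since the full trie exists before merging.

```python
class Node:
    """One state of the automaton: outgoing edges plus an end-of-word flag."""
    __slots__ = ("children", "final")

    def __init__(self):
        self.children = {}
        self.final = False


def build_trie(words):
    """Build an ordinary (unminimized) trie from an iterable of words."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.final = True
    return root


def minimize(node, registry):
    """Merge equivalent subtrees bottom-up; returns the canonical node."""
    for ch, child in list(node.children.items()):
        node.children[ch] = minimize(child, registry)
    # Two nodes are equivalent if they agree on the end-of-word flag and
    # point to the same canonical children via the same letters.
    key = (node.final,
           tuple(sorted((ch, id(c)) for ch, c in node.children.items())))
    return registry.setdefault(key, node)


def count_nodes(root):
    """Count distinct nodes reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.children.values())
    return len(seen)


def contains(root, word):
    """Check membership by walking the edges letter by letter."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.final


words = ["tap", "taps", "top", "tops"]
trie = build_trie(words)
before = count_nodes(trie)           # 8 nodes as a plain trie
graph = minimize(trie, {})
after = count_nodes(graph)           # 5 nodes once "p"/"ps" are shared
print(before, after, contains(graph, "tops"), contains(graph, "to"))
```

On a real word list the savings come mostly from common endings like "-ing" or "-s", which a plain trie stores once per prefix.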


Ok.

    import dawg
    with open('/usr/share/dict/words','r') as words:
        d = dawg.DAWG(w.strip() for w in words)


A minor remark: with the current `dawg` module implementation, this should be

    import dawg
    with open('/usr/share/dict/words','r') as f:
        words = (w.strip() for w in f)
        d = dawg.DAWG(words, input_is_sorted=True)


I assume GP meant: don't store all the words in memory at once.
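
For what it's worth, a minimal sketch of that distinction using only the standard library: a generator hands over one word at a time, so the consumer never needs the full list. `iter_words` is a hypothetical helper name, and `io.StringIO` stands in for an open file here (a real file object iterates line by line the same way).

```python
import io


def iter_words(lines):
    """Yield one stripped, non-empty word at a time from an iterable of lines."""
    for line in lines:
        word = line.strip()
        if word:
            yield word


# Stand-in for an open word-list file.
fake_file = io.StringIO("apple\nbanana\n\ncherry\n")

words = iter_words(fake_file)
first = next(words)   # only the first line has been consumed at this point
rest = list(words)    # the remaining words are pulled on demand
print(first, rest)
```

Whether the consumer builds its own in-memory structure from those words is a separate question, which is the point made upthread.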



