Hacker News
Replace OCR with Vision Language Models (github.com/vlm-run)
292 points by EarlyOom 5 months ago | 125 comments
Show HN: Visually parse an entire YouTube video frame by frame (github.com/vlm-run)
5 points by EarlyOom 6 months ago
A Node.js SDK for calling Vision Language Models (github.com/vlm-run)
6 points by EarlyOom 6 months ago
Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
170 points by EarlyOom 6 months ago | 29 comments
