Open‑source InternVL3.5 crushes GPT‑4V on multimodal benchmarks (medium.com/data-science-in-your-pocket)
4 points by acossta 14 days ago | 1 comment


This isn’t another hype piece. InternVL3.5 is a coherent vision‑language model that actually understands pixels and text together. It comes in sizes from 1B up to a monster 241B parameters, and on benchmarks like MMMU and ChartQA it beats proprietary models like GPT‑4V and Claude, along with open competitors like Qwen. An open‑source model this competitive signals that we can build cutting‑edge multimodal apps without depending on a black‑box API, which is a big deal for devs who care about hackability and reproducibility.
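For anyone who wants to kick the tires locally, here’s a minimal sketch of querying it through Hugging Face transformers. Caveats: the repo id OpenGVLab/InternVL3_5-8B, the model.chat() helper, and the single-tile 448x448 preprocessing are my best reading of the InternVL model cards, not verified against this exact release, so check the current card before copying.

    # Minimal sketch of a local InternVL3.5 query (repo id assumed; verify on HF).
    import torch
    import torchvision.transforms as T
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer

    MODEL_ID = "OpenGVLab/InternVL3_5-8B"  # assumed repo id, check Hugging Face

    # InternVL ships custom modeling code, hence trust_remote_code=True.
    model = AutoModel.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
    ).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

    # Single-tile preprocessing: resize to the 448x448 ViT input and normalize
    # with ImageNet statistics (the model card's dynamic tiling is omitted).
    transform = T.Compose([
        T.Resize((448, 448)),
        T.ToTensor(),
        T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ])
    image = Image.open("chart.png").convert("RGB")
    pixel_values = transform(image).unsqueeze(0).to(torch.bfloat16).cuda()

    # The chat() helper from the custom code handles the <image> placeholder
    # and the conversation template.
    question = "<image>\nWhat trend does this chart show?"
    response = model.chat(tokenizer, pixel_values, question,
                          generation_config=dict(max_new_tokens=256))
    print(response)

The trust_remote_code flag is what pulls in InternVL’s custom modeling code; the model card also documents a dynamic multi-tile preprocessing scheme for high-resolution inputs that this sketch skips for brevity.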



