
OK, that makes sense; so basically, if I want to give this a shot, I would just need to read llm_chat.js and the TVM docs, and effectively translate llm_chat.js to my language of choice?


I think what would instead be needed is native wgpu runtime support for TVM, like the existing Vulkan implementation in TVM. It would then naturally link against any runtime that provides webgpu.h.

Then, yeah, llm_chat.js would be the high-level logic that targets the TVM runtime, and it could be implemented in any language the TVM runtime supports (that includes JS, Java, C++, Rust, etc.).
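To make the split concrete, here is a minimal sketch of what that "high-level logic over the runtime" looks like. This is NOT the actual web-llm or TVM API: the module class is a stand-in for a loaded TVM runtime module, and the function names (prefill, decode) are illustrative assumptions. The point is that the generation loop itself is plain host-language code, portable to any language that can call into the runtime.

```python
from typing import List


class FakeModule:
    """Stand-in for a compiled model loaded through a runtime.

    Real code would instead load a TVM module and look up its packed
    functions; here we fake the logits so the sketch is self-contained.
    """

    def prefill(self, tokens: List[int]) -> List[float]:
        # Fake: ingesting the prompt yields all-zero logits (argmax -> 0).
        return [0.0] * 32

    def decode(self, token: int) -> List[float]:
        # Fake: the "model" always predicts previous token + 1 (mod 32).
        logits = [0.0] * 32
        logits[(token + 1) % 32] = 1.0
        return logits


def argmax(logits: List[float]) -> int:
    return max(range(len(logits)), key=lambda i: logits[i])


def generate(mod, prompt: List[int], max_new: int, eos: int) -> List[int]:
    # This loop is the kind of logic llm_chat.js carries: prefill the
    # prompt once, then decode token by token until EOS or the budget
    # runs out. Nothing here is JS-specific.
    tok = argmax(mod.prefill(prompt))
    out: List[int] = []
    for _ in range(max_new):
        out.append(tok)
        if tok == eos:
            break
        tok = argmax(mod.decode(tok))
    return out
```

With the fake logits above, `generate(FakeModule(), [1, 2, 3], 4, eos=31)` returns `[0, 1, 2, 3]`; with a real runtime module the same loop would drive the compiled model instead.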

Supporting native WebGPU is an interesting direction. Feel free to open a thread on the TVM Discuss forum; there may be fun things to collaborate on in OSS.


How big is the "runtime" part? My use case would basically be: run this in a native app that links against WebGPU (wgpu or Dawn). Is there a reference implementation of this runtime that one could study?


The TVM runtime is pretty compact (roughly 700 KB to 2 MB, depending on which dependencies are included). You can check in with the TVM community and bring up the question there; I think there might be some common interest. There are runtime implementations for Vulkan and Metal that can be used as references.



