
My assumption is that this gives you the ability to encode vectors locally. That's useful if you're not using an API service to build your vectors.
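For a sense of what local encoding looks like in practice, here's a minimal sketch using a sentence-transformers-style workflow; the library and model name are illustrative, not taken from the submission:

  # Sketch of local embedding: weights are downloaded once, then
  # encoding runs on your own machine with no API call per query.
  # Library and model name are assumptions, not from the thread.
  from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("all-MiniLM-L6-v2")
  vectors = model.encode(["no API key or network call needed at query time"])
  print(vectors.shape)  # (1, 384) for this particular model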



Transformer inference is ~60 lines of NumPy [0] (closer to 500 once you add tokenization, etc.). It would be nice to have just that and not all of PyTorch and Transformers.

[0] https://jaykmody.com/blog/gpt-from-scratch/
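For reference, the core of that post is scaled dot-product attention. A rough NumPy sketch in the same spirit (not the post's exact code; the weight names and toy shapes here are made up):

  import numpy as np

  def softmax(x):
      # numerically stable softmax over the last axis
      e = np.exp(x - x.max(axis=-1, keepdims=True))
      return e / e.sum(axis=-1, keepdims=True)

  def attention(q, k, v, mask):
      # scaled dot-product attention: softmax(q k^T / sqrt(d) + mask) v
      return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v

  def causal_self_attention(x, w_qkv, w_out):
      # x: [seq_len, d_model]; w_qkv projects to queries, keys, values
      q, k, v = np.split(x @ w_qkv, 3, axis=-1)
      # causal mask: position i only attends to positions <= i
      mask = (1 - np.tri(x.shape[0])) * -1e10
      return attention(q, k, v, mask) @ w_out

  # toy usage with random weights: 4 tokens, model dim 8
  rng = np.random.default_rng(0)
  x = rng.standard_normal((4, 8))
  out = causal_self_attention(x, rng.standard_normal((8, 24)), rng.standard_normal((8, 8)))
  print(out.shape)  # (4, 8)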


It's 60 lines for CPU-only inference, which will be slow. If you want GPU acceleration, it'll be a lot more than 60 lines.


What about models besides GPT? Most of the popular vector-encoding models are BERT-style encoders, not decoder-only models like GPT.

If you really don't want PyTorch/Transformers at inference time, you could export your models to ONNX and run them with ONNX Runtime (https://github.com/microsoft/onnxruntime).
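A rough sketch of what running an exported encoder looks like, assuming the model has already been exported to a file named model.onnx and that its first output is the token-level hidden states; the model name, file name, and input/output layout are assumptions, not from the thread:

  # Run an already-exported sentence encoder with onnxruntime only;
  # the tokenizer comes from transformers but needs no torch install.
  import numpy as np
  import onnxruntime as ort
  from transformers import AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
  session = ort.InferenceSession("model.onnx")  # hypothetical exported file

  def encode(texts):
      enc = tokenizer(texts, padding=True, truncation=True, return_tensors="np")
      # assumes the ONNX graph's input names match the tokenizer's keys
      # (input_ids, attention_mask, ...) and its first output is
      # hidden states of shape [batch, seq, dim]
      hidden = session.run(None, dict(enc))[0]
      mask = enc["attention_mask"][:, :, None]
      # mean-pool over tokens, ignoring padding positions
      return (hidden * mask).sum(axis=1) / mask.sum(axis=1)

  print(encode(["local embeddings without pytorch"]).shape)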



