
Hey Chris, can you further qualify performance?

Before I share some thoughts on this, let me just say that our primary motivators for Axilla have much more to do with bringing better AI tooling to an otherwise flourishing ecosystem rather than shaving milliseconds off an arbitrary task or request. Given that, I'm not sure how fruitful a performance discussion will be.

If by performance you meant maturity of third party packages for AI-related functionality, then yes JS/TS is lacking. This is what is motivating us :). We want better tooling for AI applications in TS.

If you're referring to performance for CPU-bound tasks, then yes, JS would not be as good as lower-level languages like Rust or Go. If you're referring to JS compared to Python, then I don't know how true that is. Python doesn't have a great concurrency story either (at least not today). JS may be single-threaded for the most part, but with web workers and WASM (+ WebGPU!), we now have tools at our disposal for dramatically speeding up CPU-bound tasks without blocking the main thread. Assuming we get the interfaces right, we can swap out a subset of the implementation with a WASM-based implementation later if justified.

There is nothing about Python the language that makes it especially well-suited for AI/ML-related functionality. It is just the language whose ecosystem has the most maturity when it comes to that functionality. We hope to chip away at that over time.



I'm no expert in actual ML implementations, but I was under the impression that Python ML tooling (e.g. TensorFlow) is actually C/C++-based under the hood. I just meant I can't imagine the V8 engine can be as performant for all that matrix math in those models.

But now that I'm looking at the actual code samples, I'm not even sure JavaScript is doing any of the actual heavy lifting? (I see you use OpenAI's embedding) so this tool is more of the glue connecting all the parts? Again, I'm out of my wheelhouse here.


Ahh yes, right now we're operating at a higher level of the stack.

That said, we are investigating serving from Node and possibly on edge devices with WebGPU. For serving from Node, it would be similar to what you describe with TensorFlow compiling down to C/C++. There are various backends for frameworks like TensorFlow, PyTorch, etc., and those backends are often C/C++. We would bridge this lower-level code to Node through e.g. Node-API (https://nodejs.org/api/n-api.html), or use the ONNX model format with a runtime like ONNX Runtime.
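For a sense of what the ONNX Runtime route looks like from Node, here's a hedged sketch using the onnxruntime-node package. The model path ("model.onnx") and the "input"/"output" tensor names are hypothetical placeholders; a real model defines its own, and the package has to be installed separately.

```javascript
// Sketch only: assumes `npm install onnxruntime-node` and an ONNX model
// file on disk. "model.onnx", "input", and "output" are placeholders.
const ort = require("onnxruntime-node");

async function runModel(values /* Float32Array */, dims /* number[] */) {
  // The session delegates the actual matrix math to ONNX Runtime's
  // C/C++ core; Node is just the glue layer, as discussed above.
  const session = await ort.InferenceSession.create("model.onnx");
  const input = new ort.Tensor("float32", values, dims);
  const results = await session.run({ input });
  return results.output.data;
}
```

This is the same division of labor TensorFlow has in Python: a high-level scripting layer over a native backend, reached here through Node's native bindings instead of CPython's.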



