Pretty neat! We've been using Lambda at Roboflow for serving low-volume CV models for a couple of years (and my understanding is AWS's SageMaker Serverless is a Lambda wrapper), and it's really good for low-volume and bursty use cases. The latency is surprisingly not bad. It gets really expensive relative to GPUs under high load, though (especially predictable high load like monitoring security cameras 24/7), so our biggest enterprise customers end up running things in a Kubernetes cluster.
There are a few serverless GPU companies like Banana.dev and Modal; I really want to give them a shot. Anyone have experience using them in prod?
We've been building with Modal over the past few months (though no prod-scale tests yet) and were slightly disappointed by very long (10-20 second) cold start times. In the long term we're more interested in inference servers that use compiled/optimized models instead of running plain old PyTorch (which on its own adds another few seconds to cold start).
We are adding support for inference servers to Pipeless. We started with ONNX Runtime and its OpenVINO, CoreML, CUDA, and TensorRT execution providers. Some people have suggested I also integrate with the Triton server, but I still need to dig into that and check its license. The good part is that there is no cold start right now, at the cost of keeping some resources allocated from node startup.
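For anyone unfamiliar with how those execution providers work: ONNX Runtime takes an ordered preference list and falls back to the first providers actually available on the node. Here's a minimal sketch of that selection logic (`pick_providers` is a hypothetical illustrative helper, not a real Pipeless or ONNX Runtime function; the provider names are the standard ORT ones):

```python
def pick_providers(preferred, available):
    """Keep the preferred execution providers that are actually available,
    preserving preference order; fall back to CPU if none match."""
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]


# Typical preference order: try TensorRT, then CUDA, then plain CPU.
preferred = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

# On a CPU-only node, only the CPU provider survives the filter.
print(pick_providers(preferred, ["CPUExecutionProvider"]))

# With ONNX Runtime itself you'd pass the list straight to the session:
#   import onnxruntime as ort
#   sess = ort.InferenceSession(
#       "model.onnx", providers=ort.get_available_providers())
```

The nice property of this scheme is that the same model file runs everywhere; the provider list just degrades gracefully on nodes without a GPU.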
After ChatGPT was announced, I found many cool projects that simplify integrating LLM capabilities into your services. However, I didn't find many equivalents in the vision ecosystem.
Getting started in AI + vision with just 3 commands is amazing! I will definitely try it for some personal projects with IP cameras.
This looks like a really cool project; would you be open to us PR'ing support for the 50k fine-tuned models on Roboflow Universe[1] via an `inference`[2] integration?
Definitely! It's something I've been thinking about doing; I just haven't found the time yet. I think letting people automatically load models from Roboflow Universe would be awesome!