Looks like I can't use it on mobile, which is annoying given we've had touchscreen phones for almost 20 years now. The good news is that I care enough to be annoyed I can't use it. I'd imagine you'll get a lot more sharing if mobile works.
This seems completely reasonable pre-revenue, and I've no doubt they'll make an AI app that 1000x's everyone's investment using the best AI technology and algorithms, and maybe even some better data. Hard to know for sure, though; sometimes they lose money, sometimes they won't, but AI is big and I think AI will stay big, so: more AI.
Could you explain more? I was never able to understand this argument. I can see that AI might reduce labor costs in the future, but what about finite resources like minerals?
The moment I received my first packet on a cut-up wired headset I used as a makeshift transceiver, something clicked and I began to understand more about how the universe works. I recommend projects of this type and wish more folks did them.
Hey HN, this is a lil side project I've been working on.
I've run an ML hosting platform (banana.dev) for the last few years.
Not being able to get GPUs has stressed me out a lot, so I took a crack at solving it.
This is a super alpha product; please don't trust it with sensitive workloads or production applications yet. Down to chat if you'd like to use this at scale!
Would love honest feedback, and feel free to AMA!
-kyle
Hey HN, we ran an ML consultancy for a year that helped companies build & host models in prod. We learned how tedious & expensive it was to host ML: customer models had to run on a fleet of always-on GPUs that would often get <10% utilization, which felt like a big money sink.
Over time we built infrastructure to improve GPU utilization. Six months ago we pivoted to focus solely on productizing this infra into a hosting platform for ML teams, one that removes the pain of deployment and reduces the cost of hosting models.
We deploy on A100 GPUs, and you pay per second of inference; if you aren't running inferences, you pay nothing. A couple of points to clarify: yes, the models are actually cold-booted, we aren't just running them in the background. We boot models faster due to how we manage OS memory. And yes, there is still cold-boot time; it's not instant, but it's significantly faster (e.g., 15 seconds instead of 10 minutes for some transformers like GPTJ).
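To make the pricing and cold-boot behavior concrete, here's a minimal client-side sketch. The endpoint URL, payload shape, and auth header are all hypothetical placeholders, not Banana's actual API; it just illustrates the first call paying the cold-boot cost while a follow-up call hits a warm replica.

```python
import time
import requests

API_URL = "https://api.example-host.dev/v1/infer"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                           # hypothetical credential

def run_inference(prompt: str) -> dict:
    """Send one inference request; billing would cover only the seconds of compute."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "gptj", "inputs": {"prompt": prompt}},  # hypothetical payload
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

# First call pays the cold-boot cost (weights loaded into GPU memory).
t0 = time.time()
run_inference("Hello, world")
print(f"cold call: {time.time() - t0:.1f}s")  # e.g. ~15s per the numbers above

# A follow-up call lands on a warm replica and returns much faster.
t0 = time.time()
run_inference("Hello again")
print(f"warm call: {time.time() - t0:.1f}s")
```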
Lastly, model quality is not lost on Banana, because we aren't doing traditional weight quantization or network pruning, which make networks smaller/faster but sacrifice quality. You can think of Banana more as a compiler + hosting platform: we break down your code to run faster on GPUs.
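For reference, here's a minimal PyTorch sketch of the naive weight quantization described above (illustrative only, not anything Banana runs); it shows where the quality loss comes from:

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Naive symmetric int8 quantization of a weight tensor."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

w = torch.randn(1024, 1024)   # a stand-in weight matrix
q, scale = quantize_int8(w)
w_hat = q.float() * scale     # dequantized approximation of the original weights
print(f"max abs error: {(w - w_hat).abs().max():.4f}")  # nonzero: precision is lost
```

The rounding step throws away precision, which is exactly the quality tradeoff a compile-your-code approach avoids.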
We are starting with a Visual Question-Answer model, and plan to expand its capabilities to be increasingly general-purpose over time as we build in common CV features and scale up the parameter count.
This is a hybrid of vision and language models: it can extract semantic meaning from images and let you query against it in natural English. This v0 runs on 13B parameters, with 18B and 34B iterations in the pipeline.
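To show what querying the model over the API might look like, here's a short sketch; the endpoint URL and request/response field names are hypothetical placeholders, not the actual beta API:

```python
import base64
import requests

API_URL = "https://api.example-vqa.dev/v1/ask"  # hypothetical endpoint

def ask_about_image(image_path: str, question: str) -> str:
    """Send an image plus a natural-English question; return the model's answer."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    resp = requests.post(
        API_URL,
        json={"image": image_b64, "question": question},  # hypothetical payload shape
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["answer"]  # hypothetical response field

print(ask_about_image("kitchen.jpg", "How many mugs are on the counter?"))
```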
The API is in beta, so jump into the waitlist linked above to get early access.