Hey, can you give an example? The model is not perfect and this is our first version, so it will get better and faster for sure. Still, I generated the full prompt you referenced and it sounds good to me. It adds some laughter, but that makes it sound less robotic in my mind.
Speaker 1: Oh yes, the deep sea, nature’s basement. Home to creatures so bizarre, even nightmares are like, “Nah, I’ll pass.”
Speaker 2: Right? It's like the ocean was running a clearance sale on leftover parts. “Hey, who wants a fish with a lightbulb head? No one? Alright, let’s just drop this bad boy in the Mariana Trench.”
Speaker 1: Oh man, let’s start with a classic: the anglerfish. It’s a fish that decided it was uh, tired of chasing its food and thought, “What if I just dangle a glow stick on my head and let dinner come to me?”
Speaker 2: Honestly, I respect that. Can you imagine if we had that? Like, I’m sitting on my couch with a glowing Dorito on my forehead, waiting for snacks to find me.
Speaker 1: Dang man, I’d come find you for sure.
Depends on the machine, the number of threads selected and the model checkpoint used (ViT-B, ViT-L or ViT-H). The attached video demo is running on an Apple M2 Ultra with the ViT-B model. Generating the image embedding takes ~1.9s there, and each subsequent mask segmentation takes ~45ms.
However, I am now focusing on improving the inference speed by making better use of ggml and trying out quantization. Once I make some progress in this direction I will compare to other SAM alternatives and benchmark more thoroughly.
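To put those two numbers in perspective, here is a rough back-of-the-envelope sketch of how the one-time embedding cost amortizes over several masks on the same image. The function names are made up for illustration, and the defaults are just the ~1.9s and ~45ms figures quoted above for ViT-B on the M2 Ultra, not a benchmark.

```python
# Hypothetical helpers; defaults are the rough timings quoted above
# (ViT-B on Apple M2 Ultra), not measured benchmark results.

def total_latency_ms(num_masks, embed_ms=1900.0, mask_ms=45.0):
    """Time for one image: a single embedding pass plus num_masks mask decodes."""
    if num_masks < 0:
        raise ValueError("num_masks must be non-negative")
    return embed_ms + num_masks * mask_ms


def amortized_ms_per_mask(num_masks, embed_ms=1900.0, mask_ms=45.0):
    """Average cost per mask once the embedding cost is spread over all masks."""
    if num_masks <= 0:
        raise ValueError("num_masks must be positive")
    return total_latency_ms(num_masks, embed_ms, mask_ms) / num_masks


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, total_latency_ms(n), amortized_ms_per_mask(n))
```

So the embedding dominates if you only click once, but with many prompts on the same image the per-mask cost trends toward the ~45ms decode time.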
https://drive.google.com/file/d/1JzfweTdvCWzJ6Wwv0KdgfaxZcyn...