Yeah, sorry, my answer was more than insufficient, to be honest. I wrote it _in bed_ and was embarrassed the next day because it was of really low quality. I thought about expanding it later. So yeah, I screwed the pooch here and I'm sorry; I'll try to do better now by expanding on my answer.

First of all, this is all from memory; I didn't try SeaweedFS again for this.

So, first things first. I evaluated SeaweedFS for HPC cluster usage in 2020 (oh my, this is some time ago), but my test setup was VMs. I tried it with many small and larger files and it didn't scale at all (at least when I tested it) for parallel loads. The response time was acceptable, but the throughput was very low. When I tried it, "weed server" spun up everything more or less fine, but had problems binding correctly so that a distributed setup would work. Based on the wiki documentation I configured a master server, a filer and a few volume servers (iirc). My main gripes at that time were as follows:

  * the syntax of the different clients was inconsistent
  * the throughput was rather low
  * the components didn't work well together in certain configurations and I had to specify many things manually
  * the wiki was lacking

I tried the filer (FUSE), S3 and Hadoop clients. The S3 API wasn't compatible enough to work with everything I tried against it, so I spun up a MinIO instance as a gateway to test the whole thing. When working over a longer period I had some hangs as well.
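
To give an idea of the kind of S3 compatibility check I mean, here is a minimal sketch (not my original test code). It assumes a local cluster roughly like the one above, with the S3 gateway on its default port 8333 and dummy credentials:

    # Minimal S3 smoke test against the SeaweedFS S3 gateway. Endpoint, port
    # and credentials are assumptions, not my 2020 setup.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://localhost:8333",  # default `weed s3` port, iirc
        aws_access_key_id="any",
        aws_secret_access_key="any",
    )

    s3.create_bucket(Bucket="bench")
    s3.put_object(Bucket="bench", Key="small/file-0001", Body=b"x" * 1024)
    obj = s3.get_object(Bucket="bench", Key="small/file-0001")
    assert obj["Body"].read() == b"x" * 1024
    # Basic put/get was fine; the incompatibilities I hit (from memory) were
    # with the less common calls, which is why I put MinIO in front as a
    # gateway.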

That's sadly everything I remember about it, but I made a presentation; if you are interested I can look for it and give you the benchmarks I ran and the limitations I found (although they will all be HORRIBLY out of date). When I tested it there were two versions with different size limitations, iirc. I just now looked over your GitHub releases and can't find these.

Sorry again if I misrepresented SeaweedFS here with my outdated tests. I looked at the GitHub wiki and it looks much better than when I last played with it. I will give it a spin again soon, and if I find my old experience to be unrepresentative, maybe write something about it and post it here.

---

MinIO was, when I tried it, mainly an S3 server and gateway. It had a simple web frontend that allowed you to upload and share files. One of the use cases we thought we could use MinIO for was a bucket browser/web interface. It was easy to set up as a server as well. Like I said, I didn't track it after testing it for about a month. Today it boasts about its performance and AI/ML use cases. Here is their pricing model https://min.io/pricing and you can see how they add value to their product.
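
For what it's worth, the bucket browser / file sharing use case needs very little code if you script it against the S3 API instead of the web UI. A rough sketch, assuming a local "minio server /data" on the default port 9000 with the default minioadmin credentials:

    # MinIO as a plain S3 endpoint plus "share a file" links. Endpoint and
    # credentials are the usual local-test defaults; adjust for a real setup.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://localhost:9000",
        aws_access_key_id="minioadmin",
        aws_secret_access_key="minioadmin",
    )

    s3.create_bucket(Bucket="shared")
    s3.upload_file("report.pdf", "shared", "report.pdf")

    # A pre-signed URL is essentially the "share" button of the web interface.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "shared", "Key": "report.pdf"},
        ExpiresIn=3600,  # link valid for one hour
    )
    print(url)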

---

Ceph is, like I said, the most complex product of the three, with the most components that need to be set up (even though it's quite easy now). Performance is being optimized in their Crimson project https://next.redhat.com/2021/01/18/crimson-evolving-ceph-for... (this is a WIP and not enabled by default). It's not the most straightforward thing to tune, since many small things can lead to big performance gains and losses (for instance the erasure code k and m you choose), but I found that the defaults got saner over time.
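
To make the k and m point concrete: an erasure-coded pool with k data chunks and m coding chunks stores (k+m)/k bytes of raw data per usable byte and survives the loss of up to m OSDs, while a larger k means every write touches more OSDs. A small back-of-the-envelope helper (my own, just to illustrate the trade-off behind something like "ceph osd erasure-code-profile set myprofile k=4 m=2"):

    # Back-of-the-envelope numbers for Ceph erasure-code profile choices.
    def ec_profile(k: int, m: int) -> str:
        overhead = (k + m) / k  # raw bytes stored per usable byte
        return (f"k={k} m={m}: overhead {overhead:.2f}x, "
                f"tolerates {m} lost OSDs, each write hits {k + m} OSDs")

    for k, m in [(2, 1), (2, 2), (4, 2), (8, 3)]:
        print(ec_profile(k, m))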




Thanks for the detailed clarification! I am too deep into the SeaweedFS low-level details and am all ears on how to make it simpler to use. SeaweedFS has weekly releases and is constantly evolving.

Depending on your case, you may need to add more filers. UCSD has a setup that uses about 10 filers to achieve 1.5 billion iops. https://twitter.com/SeaweedFS/status/1549890262633107456 There are many AI/ML users switching from MinIO or CEPH to SeaweedFS, especially with lots of images/text/audio files to process.
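
The scale-out pattern itself is simple: run several filers against the same masters (typically with a shared or replicated filer store) and spread client load across them. A rough sketch using the filer's plain HTTP file API; the hostnames and paths are made up:

    # Spread small-file writes across several filers (round-robin). The filer
    # accepts multipart-form uploads on the target path; hostnames are
    # placeholders.
    import itertools
    import requests

    FILERS = itertools.cycle([
        "http://filer-1:8888",
        "http://filer-2:8888",
        "http://filer-3:8888",
    ])

    def put_file(path: str, data: bytes) -> None:
        filer = next(FILERS)
        resp = requests.post(filer + path, files={"file": data})
        resp.raise_for_status()

    for i in range(1000):
        put_file(f"/datasets/images/img_{i:05d}.bin", b"\x00" * 1024)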

I found MinIO's benchmark results to be really, well, "marketing". MinIO is basically just an S3 API layer on top of the local disks. Any object is mapped to at least two files on disk, one for the metadata and one for the object itself.
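
You can see that layout for yourself by uploading a few objects and walking the data directory. A small sketch, assuming "minio server /data"; the exact file names depend on the MinIO version and backend format:

    # List what MinIO put on disk for the uploaded objects; per object you
    # should see the metadata file and the object data described above.
    import os

    DATA_DIR = "/data"  # whatever directory was passed to `minio server`

    for root, _dirs, files in os.walk(DATA_DIR):
        for name in files:
            full = os.path.join(root, name)
            print(f"{os.path.getsize(full):>12}  {full}")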



