By default it “scrubs” basic metadata daily and does a deep scrub where it fully reads the object and confirms the checksum is correct from all 3 replicas weekly for all of the data in the cluster.
So what amount of disk bandwidth/usage is involved?
For instance, say that I have 30 TB of disk space used and it is spread across 3 replicas, thus 3 systems.
When I kick off the deep scrub operation, what amount of reads will happen on each system? Just the smaller amount of metadata, or the actual full size of the files themselves?
In Ceph, objects are organized into placement groups (PGs), and a scrub is performed on one PG at a time, operating on all replicas of that PG.
For a normal scrub, only the metadata (essentially, the list of stored objects) is compared, so the amount of data read is very small. For a deep scrub, each replica reads and verifies the contents of all its data, and compares the hashes with its peers. So a deep scrub of all PGs ends up reading the entire contents of every disk. (Depending on what you mean by "disk space used", that could be 30TB, or 30TBx3.)
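To make the two readings of "30 TB used" concrete, here is a rough back-of-the-envelope sketch. It assumes data is spread evenly across hosts and ignores metadata overhead; the function name and the `used_is_raw` flag are made up for illustration and are not part of Ceph.

```python
def deep_scrub_reads_tb(used_tb, replicas, hosts, used_is_raw):
    """Estimate reads for one full deep-scrub pass over every PG.

    used_is_raw=True  -> used_tb already counts every replica (raw usage).
    used_is_raw=False -> used_tb is the logical data size before replication.
    Returns (total TB read across the cluster, TB read per host),
    assuming an even spread of data across hosts.
    """
    total_tb = used_tb if used_is_raw else used_tb * replicas
    return total_tb, total_tb / hosts


# 30 TB of logical data, 3 replicas, 3 hosts: each host reads about 30 TB.
logical_case = deep_scrub_reads_tb(30, replicas=3, hosts=3, used_is_raw=False)

# 30 TB of raw usage, 3 replicas, 3 hosts: each host reads about 10 TB.
raw_case = deep_scrub_reads_tb(30, replicas=3, hosts=3, used_is_raw=True)

print(logical_case, raw_case)
```

Either way, the read volume per host is the full size of whatever that host stores, not just the metadata.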
The deep scrub frequency is configurable. For example, if each disk is fast enough to sequentially read its entire contents in 24 hours, and you choose to deep-scrub every 30 days, you're devoting roughly 1/30th of each disk's read throughput to scrubbing.
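The same arithmetic can be written as a small helper. This is only a sketch: it assumes scrub reads are purely sequential and spread evenly over the interval, and the function name is hypothetical.

```python
def scrub_throughput_fraction(full_read_hours, deep_scrub_interval_days):
    """Fraction of a disk's read throughput spent on deep scrubbing.

    full_read_hours: time the disk needs to sequentially read all of its
    contents once. deep_scrub_interval_days: how often every PG is deep
    scrubbed. Assumes scrub reads are spread evenly over the interval.
    """
    interval_hours = deep_scrub_interval_days * 24
    return full_read_hours / interval_hours


# The example from the text: a 24-hour full read, deep scrub every 30 days.
monthly = scrub_throughput_fraction(24, 30)   # 1/30

# The same disk with a weekly deep scrub.
weekly = scrub_throughput_fraction(24, 7)     # 1/7

print(f"monthly: {monthly:.1%}, weekly: {weekly:.1%}")
```

With a weekly interval the same disk would spend about a seventh of its read throughput on scrubbing, which is why lengthening the interval is a common tuning step on slow or very full disks.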
Note that "3 replicas" is not necessarily the same as "3 systems". The normal way to use Ceph is that if you set a replication factor of 3, each PG has 3 replicas that are chosen from your pool of disks/servers; a cluster with N replicas and N servers is just a special case of this (with more limited fault-tolerance). In a typical cluster, any given scrub operation only touches a small fraction of the disks at a time.
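As a rough illustration of that last point, the share of disks involved in scrubbing a single PG is just the replica count divided by the number of disks. The sketch below assumes one PG is being scrubbed at a time and that each replica lives on a different disk; real clusters may run several scrubs concurrently, and the function name is made up.

```python
def disks_touched_fraction(replicas, num_disks):
    """Fraction of disks involved in scrubbing one PG.

    Assumes each replica of the PG sits on a different disk and that
    only this one PG is being scrubbed.
    """
    if replicas > num_disks:
        raise ValueError("cannot place more replicas than there are disks")
    return replicas / num_disks


# 3 replicas on 3 disks: every scrub touches the whole cluster.
small_cluster = disks_touched_fraction(3, 3)

# 3 replicas chosen from 60 disks: one PG scrub touches a small share.
large_cluster = disks_touched_fraction(3, 60)

print(small_cluster, large_cluster)
```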
Scrubbing is automatic and enabled by default.