ec vs raid vs replication (data safety) #507

eleaner · 2022-11-19T13:27:44Z


Hi Guys

It is a theoretical question that started to bother me, but I cannot find any info online.

Imagine these setups (chosen at random):
a) replication with 3 copies (1+2)
b) EC 4+2
c) EC 8+2
d) RAID 6+2

In each case, I will lose my data with a third failure.
But which one should I use if my priority is data safety?
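To put the four options side by side, here is a minimal Python sketch of the arithmetic behind them. It assumes "RAID 6+2" means 6 data disks plus 2 parity disks and treats replication as 1 data part plus 2 extra copies; the `Layout` class and its field names are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    data_parts: int        # parts that hold the actual data
    redundancy_parts: int  # extra copies or parity parts

    @property
    def total_parts(self) -> int:
        return self.data_parts + self.redundancy_parts

    @property
    def failures_tolerated(self) -> int:
        # Each layout survives the loss of as many parts as it has redundancy.
        return self.redundancy_parts

    @property
    def raw_per_usable(self) -> float:
        # Raw capacity consumed per unit of usable data.
        return self.total_parts / self.data_parts


# Assumption: RAID 6+2 is read as 6 data disks + 2 parity disks.
LAYOUTS = [
    Layout("replication 1+2", 1, 2),
    Layout("EC 4+2", 4, 2),
    Layout("EC 8+2", 8, 2),
    Layout("RAID 6+2", 6, 2),
]

for layout in LAYOUTS:
    print(
        f"{layout.name:16} parts={layout.total_parts:2} "
        f"tolerates={layout.failures_tolerated} "
        f"raw/usable={layout.raw_per_usable:.2f}"
    )
```

All four tolerate two failures, so on that count alone they are equal. What differs is how many parts are exposed to failure and how much raw space each usable byte costs.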

jkiebzak · 2022-11-19T14:58:47Z


For data integrity, repair speed is essential. I believe the following is true, but I don't have metrics to prove it.

a) Replication with 3 copies: this would be the fastest to repair. Chunks from one failed drive exist on many other drives in the cluster, so many drives are *read from* to recover the missing chunks and many drives are *written to* while re-creating them.
b) EC 4+2: this would be faster than 8+2, since only 4 shards (any 4) need to be read to rewrite the missing shards. Many disks across the whole pool are used to read and write shards, so it is faster than standard RAID.
- Caveat: network speed and chunk server CPU speed affect shard rebuild time.
c) EC 8+2: this would be faster than RAID 6+2, since many disks from the whole cluster are used to read and write (similar to EC 4+2).
- Caveat: network speed and chunk server CPU speed affect shard rebuild time.
d) RAID 6+2: standard RAID will have the slowest repair time, since rebuild I/O is limited to the disks that are part of the pool.
- This assumes you have a hot spare. If you don't, add "human time" to the repair time, since a tech will need to swap the drive before the repair begins.
- Cluster load and RAID pool I/O load will determine rebuild times, along with disk speeds.
e) Another option: ZFS dRAID. If you have a large pool of disks on one chunk server, rebuild times can be very fast ( https://openzfs.github.io/openzfs-docs/Basic%20Concepts/dRAID%20Howto.html#rebuilding-to-a-distributed-spare ).

Caveat for all RAID: if you are layering goal1 on top of chunk servers using RAID and a chunk server goes down, all the chunks for each file on that server will be unavailable until the server is operational again.
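To illustrate why repair speed matters, here is a rough Python sketch of the chance that two more parts fail while a first failure is still being repaired. It is a toy model, not a measurement: it assumes independent failures, a constant failure rate derived from a hypothetical 2% annual failure rate, and hypothetical repair times. It also ignores that in a distributed cluster each chunk's parts sit on a different set of drives.

```python
import math

HOURS_PER_YEAR = 8760.0


def p_fail_within(hours: float, afr: float) -> float:
    """Probability that one drive fails within `hours`, given an annual
    failure rate `afr` and a constant (exponential) failure rate."""
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-rate_per_hour * hours)


def p_loss_during_repair(total_parts: int, repair_hours: float,
                         afr: float = 0.02) -> float:
    """Probability that at least 2 of the surviving parts fail before the
    repair of a first failed part completes (the 'third failure' case)."""
    survivors = total_parts - 1
    p = p_fail_within(repair_hours, afr)
    p_none = (1.0 - p) ** survivors
    p_one = survivors * p * (1.0 - p) ** (survivors - 1)
    return max(0.0, 1.0 - p_none - p_one)


# Hypothetical repair times in hours, chosen only to show the trend.
scenarios = [
    ("replication 1+2", 3, 2.0),
    ("EC 4+2", 6, 6.0),
    ("EC 8+2", 10, 12.0),
    ("RAID 6+2", 8, 48.0),
]

for name, parts, hours in scenarios:
    risk = p_loss_during_repair(parts, hours)
    print(f"{name:16} repair={hours:5.1f}h  p(loss during repair)={risk:.3e}")
```

In this model the risk grows with both the number of parts and the length of the repair window, which matches the ordering argued above. The absolute numbers depend entirely on the assumed inputs.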


0 replies

anon314159 · 2023-12-13T11:15:12Z


Thanks for the useful information, but how do I enable EC with the 3.x community edition? I have tried several options with both the master and chunk server configs, but nothing seems to work.

# Enable erasure coding
use_ec = on

# Define erasure coding parameters
ec_k = 4
ec_m = 2
ec_policy = EVENODD

1 reply

xandrus Dec 13, 2023

Currently, EC4+X and EC8+X are only available in the MooseFS 4 PRO version.
MooseFS 3 CE and PRO do not implement EC functionality.
