Based on a recommendation, I am moving the discussion from bug https://bugzilla.redhat.com/show_bug.cgi?id=874068 to anaconda-devel-list. The initial posts can be found in that bug.
If you are willing to discuss my notes, please keep me on cc list as I am not subscribed in anaconda-devel list. Thank you.
To address the concerns you specifically wanted explicit addressing:
- multiple usage of term redundancy/redundant for few checkboxes.
The term 'Redundancy' is used once on the RAID configuration screen, I'm looking at the screen now. It is used once, not multiple times. If that isn't what you meant, please clarify.
On this screenshot (attached to original bug): https://bugzilla.redhat.com/attachment.cgi?id=640017
First checkbox has label 'Redundancy', last checkbox has label 'Redundant'. In my understanding same terms. Additionally they aren't different by any means, just two equivalent checkboxes for 'redundancy' in same group. Btw that 'RAID6' label visible on the screenshot does not bring any light to selected configuration.
- error detection is feature of all RAIDs but RAID0, levels can't be
differentiated
This is incorrect, Marian. Neither RAID0 or RAID1 have error detection.
Linked section does not seem to mention error detection for any RAID level at all (maybe following table, however that clearly states RAID1 is able to detect/tolerate failure of n-1 drives which is far away best error detection/correction from all other levels).
Contrary last sentence of top-level initial paragraph of same wiki page clearly states:
RAID levels > 0 provide protection against unrecoverable (sector) read errors, as well as whole disk failure.
Additionally md manual page (try man md or read http://linux.die.net/man/4/md ) states exactly same in section RECOVERY:
If the md driver detects a write error on a device in a RAID1, RAID4, RAID5, RAID6, or RAID10 array, it immediately disables that device (marking it as faulty) and continues operation on the remaining devices.
Marian is right, wiki as well as man md agree with him, all RAID levels except of RAID0 do error detection.
Therefore I say that usage of term 'error detection' is highly confusing as it actually counts wide range of RAID levels, RAID is error detecting redundant composition of disks by definition (once more, many professionals do not count RAID0 to RAID family).
- concern that 'common terminology' used in anaconda is not used anywhere else.
If you read comment #10 again, you will see I wrote the following:
in the RAID # format ("common terminology").
I was referring to the RAID # format (RAID 0, RAID 5, RAID10, etc) as the "common terminology" as that was what Martin called it in comment #0.
This confusion is caused by insufficient description of my intention, that's my fault sorry for that. I'll try to correct it. I absolutely agree with initial reporter of referred bug, Martin Banas, that common terminology is crucial for easy adoption of new RAID handling. In my opinion it is absolutely necessary to bring similar naming as other products providing RAID capability such as BIOS RAID, full featured RAID controllers or other operating systems. I am afraid that terminology introduced in new anaconda isn't used by any similar product and as such it tends to be highly confusing (it is highly confusing to me for all expressed reasons). Therefore I asked to show usage of approach also in other products. I hope it is much clearer now.
Well, this is it. Last but not least initial message of bug https://bugzilla.redhat.com/show_bug.cgi?id=874068 describes issue with current implementation of RAID definition in really specific comments as is and it does not deserve to be closed/rejected without further discussion/research.
As the anaconda implements creation of MD devices it would be really nice to have it aligned with MD terminology as expressed in relevant man pages.
Thank you very much for audience.
Regards,
Marian
I'm guessing your mail bounced when you tried to send it to anaconda-devel since you're not subscribed. I think you have to be subscribed to post.
I'll comment inline:
On 12/19/2012 04:26 PM, Marian Ganisin wrote:
On this screenshot (attached to original bug): https://bugzilla.redhat.com/attachment.cgi?id=640017
First checkbox has label 'Redundancy', last checkbox has label 'Redundant'. In my understanding same terms. Additionally they aren't different by any means, just two equivalent checkboxes for 'redundancy' in same group. Btw that 'RAID6' label visible on the screenshot does not bring any light to selected configuration.
They're not in the same group. The first checkbox you're referring to is the Redundancy checkbox for data redundancy. The second one, you'll note in the screenshot, is indented under the 'Error detection (parity)' checkbox. It is a child of the error detection checkbox, and means redundant error detection / parity, not redundant data.
I'm sorry that the two similar words confused you. Maybe we can indent the parity options a bit more so it's clearer they apply to error detection, or grey them out unless error detection is selected.
- error detection is feature of all RAIDs but RAID0, levels can't be
differentiated
This is incorrect, Marian. Neither RAID0 or RAID1 have error detection.
Linked section does not seem to mention error detection for any RAID level at all (maybe following table, however that clearly states RAID1 is able to detect/tolerate failure of n-1 drives which is far away best error detection/correction from all other levels).
First of all, the user interface clearly identifies the error detection type we are talking about as "parity." The label for the option is "Error detection (parity)."
The link I pointed you to states:
"RAID 0 (block-level striping without parity " "In RAID 1 (mirroring without parity"
Contrary last sentence of top-level initial paragraph of same wiki page clearly states:
RAID levels > 0 provide protection against unrecoverable (sector) read errors, as well as whole disk failure.
Yes, they do. That is because they are either mirrored or feature parity. Just because you've mirrored your data and have a spare copy when one device fails, doesn't mean you're using a RAID level with parity.
Additionally md manual page (try man md or read http://linux.die.net/man/4/md ) states exactly same in section RECOVERY:
If the md driver detects a write error on a device in a RAID1, RAID4, RAID5, RAID6, or RAID10 array, it immediately disables that device (marking it as faulty) and continues operation on the remaining devices.
Again, this is *not* equivalent to parity.
If you read comment #10 again, you will see I wrote the following:
in the RAID # format ("common terminology").
I was referring to the RAID # format (RAID 0, RAID 5, RAID10, etc) as the "common terminology" as that was what Martin called it in comment #0.
This confusion is caused by insufficient description of my intention, that's my fault sorry for that. I'll try to correct it. I absolutely agree with initial reporter of referred bug, Martin Banas, that common terminology is crucial for easy adoption of new RAID handling. In my opinion it is absolutely necessary to bring similar naming as other products providing RAID capability such as BIOS RAID, full featured RAID controllers or other operating systems. I am afraid that terminology introduced in new anaconda isn't used by any similar product and as such it tends to be highly confusing (it is highly confusing to me for all expressed reasons). Therefore I asked to show usage of approach also in other products. I hope it is much clearer now.
You've completely lost me here. I'm sorry again that you apparently did not understand or read carefully what I wrote in the bug.
Well, this is it. Last but not least initial message of bug https://bugzilla.redhat.com/show_bug.cgi?id=874068 describes issue with current implementation of RAID definition in really specific comments as is and it does not deserve to be closed/rejected without further discussion/research.
I closed the bug, as I explained twice in the bug, because we cannot fix it for Fedora 18 and will not be able to revisit it until we do the usability research I mentioned in the bug. Yes, we will do research and gather data on the issue. Just because I closed the bug doesn't mean that isn't going to happen.
As the anaconda implements creation of MD devices it would be really nice to have it aligned with MD terminology as expressed in relevant man pages.
You've missed the point.
~m
On Wed, Dec 19, 2012 at 04:39:46PM -0500, Máirín Duffy wrote:
On 12/19/2012 04:26 PM, Marian Ganisin wrote:
On this screenshot (attached to original bug): https://bugzilla.redhat.com/attachment.cgi?id=640017
First checkbox has label 'Redundancy', last checkbox has label 'Redundant'. In my understanding same terms. Additionally they aren't different by any means, just two equivalent checkboxes for 'redundancy' in same group. Btw that 'RAID6' label visible on the screenshot does not bring any light to selected configuration.
They're not in the same group. The first checkbox you're referring to is the Redundancy checkbox for data redundancy. The second one, you'll note in the screenshot, is indented under the 'Error detection (parity)' checkbox. It is a child of the error detection checkbox, and means redundant error detection / parity, not redundant data.
I think it is not surprise they are different from functional point of view. However code is invisible for the user, he can not differentiate based on this unless he reads code. UI makes the difference. If we are talking about UI, because that's what guides user, checkboxes are equivalent.
I'm sorry that the two similar words confused you. Maybe we can indent the parity options a bit more so it's clearer they apply to error detection, or grey them out unless error detection is selected.
Yes, this is highly desirable to make that clear and understandable. Still explicit mention about RAID levels would be even better.
- error detection is feature of all RAIDs but RAID0, levels can't be
differentiated
This is incorrect, Marian. Neither RAID0 or RAID1 have error detection.
Linked section does not seem to mention error detection for any RAID level at all (maybe following table, however that clearly states RAID1 is able to detect/tolerate failure of n-1 drives which is far away best error detection/correction from all other levels).
First of all, the user interface clearly identifies the error detection type we are talking about as "parity." The label for the option is "Error detection (parity)."
It is clear to you, you designed it and you know what you wanted to put behind it. For others it is _error_detection_ and as such it is highly confusing. This is what I am trying to tell you. Please do not ignore me, we are providing you look from different point of view. I believe this is always valuable.
The link I pointed you to states:
"RAID 0 (block-level striping without parity " "In RAID 1 (mirroring without parity"
Use this wording and I won't complain, it will be clear to wider user base. Keep RAID levels and you'll have a solution for Martin's bug report.
However anaconda is using completely different naming, maybe based on this, but different. It does not help in orientation.
I asked my colleagues how to build RAID6, despite their rich experience they didn't know and they needed to click some checkboxes and press apply to see what is chosen.
I asked my colleagues if RAID10 has parity, the answer was: If it is able to detect error, it has to have a parity.
This confusion is caused by insufficient description of my intention, that's my fault sorry for that. I'll try to correct it. I absolutely agree with initial reporter of referred bug, Martin Banas, that common terminology is crucial for easy adoption of new RAID handling. In my opinion it is absolutely necessary to bring similar naming as other products providing RAID capability such as BIOS RAID, full featured RAID controllers or other operating systems. I am afraid that terminology introduced in new anaconda isn't used by any similar product and as such it tends to be highly confusing (it is highly confusing to me for all expressed reasons). Therefore I asked to show usage of approach also in other products. I hope it is much clearer now.
You've completely lost me here. I'm sorry again that you apparently did not understand or read carefully what I wrote in the bug.
Read carefully and understood pretty well.
My team did lots of testing of new anaconda. You won't find a man on the earth who would do at least half of testing of what each of my colleagues did.
We accepted new design and we have started to work on it. Simply because we believe that each of us has same goal, we want to make better product. We actively participate on that goal. During the testing we figured out that new implementation of RAID definition does not have to be best choice and we believe it is good idea to do some correction. It would be really nice at least to accept our comments and to think about that. Instead of that we are rejected with comments like: "you missed idea; this is only your opinion; you rant!; and many more" You are happy just because you can prove that _you_ are right. This really does not match with my idea of good collaboration. :(
Please let's stop to argue who is wrong and who is right and simply try to accept your QA as worthy counterpart. Once more, our mission goal is exactly same.
On Dec 19, 2012, at 2:39 PM, Máirín Duffy duffy@fedoraproject.org wrote:
On 12/19/2012 04:26 PM, Marian Ganisin wrote:
On this screenshot (attached to original bug): https://bugzilla.redhat.com/attachment.cgi?id=640017
First checkbox has label 'Redundancy', last checkbox has label 'Redundant'. In my understanding same terms. Additionally they aren't different by any means, just two equivalent checkboxes for 'redundancy' in same group. Btw that 'RAID6' label visible on the screenshot does not bring any light to selected configuration.
They're not in the same group. The first checkbox you're referring to is the Redundancy checkbox for data redundancy. The second one, you'll note in the screenshot, is indented under the 'Error detection (parity)' checkbox. It is a child of the error detection checkbox, and means redundant error detection / parity, not redundant data.
I'm sorry that the two similar words confused you. Maybe we can indent the parity options a bit more so it's clearer they apply to error detection, or grey them out unless error detection is selected.
There are some problems with this UI that I mentioned in August, primarily related to Btrfs. I didn't revisit this RAID UI after the extra options that don't apply to Btrfs were removed.
Currently the UI implies at first glance that a lot of RAID and nested RAID levels are supported because of the use of only checkboxes and no graying: RAID linear, 0, 1, 4, 5, 6, and also nested 01/10, 04/40, 41, 05/50, 51, 06/60, 61 are implied, with associated ambiguity on which nesting order is used. And I have to ask if any nested RAID other than 10 is seriously going to be supported for installs. It seems excessive to me.
Anyway, for someone who understands RAID I find the UI intensely confusing as to what I'm going to get, even setting aside the RAID level label in 18.37.4-1 always says RAID0 no matter what is chosen.
1. I suggest exchanging "Redundancy (mirror)" for "Mirroring". Redundancy applies to RAID 1, 4, 5, 6 so using it as a primary term is confusing. If this checkbox means RAID 1, then it really ought to say just "Mirroring" or "Mirroring (RAID1)".
2. "Optimize performance (stripe)" I think it's more clear if this is "Performance optimized striping" if this is meant just for RAID 0.
3. "Error detection (parity)" is a confusing label. Parity applies to md RAID 4, 5 and 6 and I can check this option by itself with nothing else checked which then implies RAID 4 which is so uncommon that supporting it doesn't make sense. Once selected I can't deselect it; and I can't select either option below. But I can select the two options above. So upon checking this, I have RAID 4, and can choose either RAID 41 or RAID 40, which are even more rare than RAID 4.
Error detection itself is misleading because in normal operation RAID 4, 5, 6 themselves do not detect any errors above what the drive firmware detects (which is the same for RAID 1 and RAID 0). In order to get error detection the user must initiate or schedule a scrub or repair. Conversely, Btrfs does have error detection which is active during normal operation regardless of the profile used.
Error correction is true, but it's also true for RAID 1.
4. "Distributed" parity applies to RAID 5 or 6. It's unclear which I'll get, or why I can select it all by itself with no other options including without Error detection (parity) selected.
5. "Redundant" under Error detection is confusing. RAID 4 is redundant, RAID 5 is redundant, RAID 6 is redundant. I'd guess it means RAID 6 except…
6. I can't choose both Distributed and Redundant, also very confusing, so Redundant must not mean RAID 6.
Items 3-6 I think need to be consolidated into just RAID 5 or 6, and skip RAID 4. As for what plain language to use to describe them, it's difficult. "Single drive fault tolerance" is maybe OK, but that applies to a two disk RAID 1.
And possibly the best multidisk options, which are also the easiest and fastest to grow when more space is needed: linear, and RAID 1+linear, combined with XFS for parallelization. But this isn't an option at all.
Chris Murphy
With the caveat that any changes suggested have close to a 0 chance of getting into F18 since we are well past string freeze and non-blocker freeze and some of the developers have already left for the holidays. (Thanks for suggesting a few concrete ideas, though):
On Thu, 2012-12-20 at 05:00 -0700, Chris Murphy wrote:
- I suggest exchanging "Redundancy (mirror)" for "Mirroring".
Redundancy applies to RAID 1, 4, 5, 6 so using it as a primary term is confusing. If this checkbox means RAID 1, then it really ought to say just "Mirroring" or "Mirroring (RAID1)".
Swapping 'Redundancy (mirror)' for 'Mirroring' seems like a reasonable change to me, especially if it makes the label less confusing. Would you be willing to file a bug for this under F19?
- "Optimize performance (stripe)" I think it's more clear if this is
"Performance optimized striping" if this is meant just for RAID 0.
It's not just meant for RAID 0.
- "Error detection (parity)" is a confusing label. Parity applies to
md RAID 4, 5 and 6 and I can check this option by itself with nothing else checked which then implies RAID 4 which is so uncommon that supporting it doesn't make sense. Once selected I can't deselect it; and I can't select either option below. But I can select the two options above. So upon checking this, I have RAID 4, and can choose either RAID 41 or RAID 40, which are even more rare than RAID 4.
While I understand your concern that the arrangement of checkboxes allows for more rare RAID configurations, a flat list of RAID levels also treats more rare levels at the same level as the more common ones so the base problem here is the same.
Error detection itself is misleading because in normal operation RAID 4, 5, 6 themselves do not detect any errors above what the drive firmware detects (which is the same for RAID 1 and RAID 0). In order to get error detection the user must initiate or schedule a scrub or repair. Conversely, Btrfs does have error detection which is active during normal operation regardless of the profile used.
How would you suggest changing the label?
Error correction is true, but it's also true for RAID 1.
- "Distributed" parity applies to RAID 5 or 6. It's unclear which I'll
get, or why I can select it all by itself with no other options including without Error detection (parity) selected.
- "Redundant" under Error detection is confusing. RAID 4 is redundant,
RAID 5 is redundant, RAID 6 is redundant. I'd guess it means RAID 6 except…
It's indented under parity, meaning redundant parity. Marian had this confusion as well, so I think we need to tighten up the visual design there and perhaps grey out those two sub checkboxes unless error detection (parity) is active, perhaps even change the labels to 'redundant parity' and 'distributed parity'
- I can't choose both Distributed and Redundant, also very confusing, so Redundant must not mean RAID 6.
I think you should be able to choose distributed and redundant to get 6 (distributed only for 5), so this may be a bug.
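The checkbox-to-level mapping discussed in this thread can be sketched as a small lookup. This is only an illustration of how the participants read the screen: the function name `raid_level` and its parameters are hypothetical, nested levels are deliberately rejected, and nothing here is taken from the anaconda source.

```python
def raid_level(mirror=False, stripe=False, parity=False,
               distributed=False, redundant=False):
    """Map the five checkboxes from the thread to an md RAID level name.

    Hypothetical sketch: parameter names mirror the checkbox labels,
    not any real anaconda API.
    """
    # The two sub-options are children of "Error detection (parity)".
    if (distributed or redundant) and not parity:
        raise ValueError("parity sub-options need 'Error detection (parity)'")
    if parity:
        if mirror or stripe:
            # Assumption: nested parity levels (40, 41, 50, ...) are not offered.
            raise ValueError("nested parity levels are not offered in this sketch")
        if distributed and redundant:
            return "RAID6"   # distributed + redundant parity
        if distributed:
            return "RAID5"   # distributed single parity
        if redundant:
            # Assumption: redundant parity without distribution has no md level here.
            raise ValueError("redundant parity requires distributed parity")
        return "RAID4"       # dedicated parity drive
    if mirror and stripe:
        return "RAID10"
    if mirror:
        return "RAID1"
    if stripe:
        return "RAID0"
    raise ValueError("no RAID option selected")


print(raid_level(parity=True, distributed=True, redundant=True))
```

Rejecting invalid combinations up front corresponds to greying out options in the UI instead of refusing them at Apply time.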
Items 3-6 I think need to be consolidated into just RAID 5 or 6, and skip RAID 4. As for what plain language to use to describe them, it's difficult. "Single drive fault tolerance" is maybe OK, but that applies to a two disk RAID 1.
If we drop RAID 4 I'm sure we'll get more self-righteous hate mail. Can I blame you and provide your email address if we do this? :)
And possibly the best multidisk options, which are also the easiest and fastest to grow when more space is needed: linear, and RAID 1+linear, combined with XFS for parallelization. But this isn't an option at all.
If XFS is a supported fs type (I don't know if it is offhand) you could select RAID as a partition type and XFS as the filesystem, using RAID for the technology dropdown.
~m
On Dec 20, 2012, at 8:25 AM, Máirín Duffy duffy@fedoraproject.org wrote:
On Thu, 2012-12-20 at 05:00 -0700, Chris Murphy wrote:
- I suggest exchanging "Redundancy (mirror)" for "Mirroring".
Redundancy applies to RAID 1, 4, 5, 6 so using it as a primary term is confusing. If this checkbox means RAID 1, then it really ought to say just "Mirroring" or "Mirroring (RAID1)".
Swapping 'Redundancy (mirror)' for 'Mirroring' seems like a reasonable change to me, especially if it makes the label less confusing. Would you be willing to file a bug for this under F19?
Yes.
- "Optimize performance (stripe)" I think it's more clear if this is
"Performance optimized striping" if this is meant just for RAID 0.
It's not just meant for RAID 0.
There's a considerable write performance penalty for RAID 5 and 6. In certain cases there's also one for RAID 4.
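As a rough illustration of that penalty, the commonly cited I/O counts for one small write can be computed as below. This is a sketch that assumes a read-modify-write update with no caching; the function name `small_write_ios` is made up for the example, and md may pick a different strategy (for instance reconstruct-write) depending on the stripe.

```python
def small_write_ios(parity_chunks=0, mirror_copies=1):
    """Disk I/Os needed for one small write.

    parity_chunks: parity chunks per stripe (0 = none, 1 = RAID 4/5, 2 = RAID 6).
    mirror_copies: copies kept by non-parity layouts (1 = RAID 0, 2 = two-way RAID 1).
    """
    if parity_chunks:
        # Read old data and old parity, then write new data and new parity.
        return 2 * (1 + parity_chunks)
    # Without parity, each copy is simply written once.
    return mirror_copies


for name, ios in [("RAID 0", small_write_ios()),
                  ("RAID 1", small_write_ios(mirror_copies=2)),
                  ("RAID 5", small_write_ios(parity_chunks=1)),
                  ("RAID 6", small_write_ios(parity_chunks=2))]:
    print(f"{name}: {ios} I/Os per small write")
```

RAID 4 has the same count as RAID 5, but every parity update lands on the single parity drive.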
- "Error detection (parity)" is a confusing label. Parity applies to
md RAID 4, 5 and 6 and I can check this option by itself with nothing else checked which then implies RAID 4 which is so uncommon that supporting it doesn't make sense. Once selected I can't deselect it; and I can't select either option below. But I can select the two options above. So upon checking this, I have RAID 4, and can choose either RAID 41 or RAID 40, which are even more rare than RAID 4.
While I understand your concern that the arrangement of checkboxes allows for more rare RAID configurations, a flat list of RAID levels also treats more rare levels at the same level as the more common ones so the base problem here is the same.
While I'm not advocating a return to a flat list, the advantage is the rare/unadvisable RAID options are simply not displayed. It's immediately clear what's possible. Presently I have to play with the interface to discover what is and isn't possible, and there's actually some hidden discovery in that I can check the boxes "Redundancy + Redundant" for 61, but when I click Apply Changes, this is refused.
Error detection itself is misleading because in normal operation RAID 4, 5, 6 themselves do not detect any errors above what the drive firmware detects (which is the same for RAID 1 and RAID 0). In order to get error detection the user must initiate or schedule a scrub or repair. Conversely, Btrfs does have error detection which is active during normal operation regardless of the profile used.
How would you suggest changing the label?
No easy answer.
A user who wants redundancy while using most of their drives for their stuff, would be served by this option. So some way of conveying "Redundancy using X data drives, Y parity drives, some performance loss for writes" with an additional checkbox for Dual Parity. So that would mean collapsing the three parity options into two options.
If that's in the ball park of agreeable, then I'd actually change my mind on the first checkbox Redundancy vs Mirroring, to have some way of conveying "Redundancy using X data drives, X mirror drives, minimal performance loss for writes".
RAID 0 would be "No redundancy, using X data drives, performance optimized".
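A minimal sketch of how such plain-language labels could be generated from a level and a drive count. The helper names `plural` and `describe` are hypothetical, RAID 1 is treated as one data drive plus mirrors, and for distributed parity the "parity drives" figure is a capacity equivalent, not a physical drive.

```python
def plural(count, noun):
    """Format a count with a singular or plural noun."""
    return f"{count} {noun}" + ("" if count == 1 else "s")


def describe(level, drives):
    """Build a label in the style proposed above (illustrative only)."""
    if level == "RAID0":
        return f"No redundancy, using {plural(drives, 'data drive')}, performance optimized"
    if level == "RAID1":
        data, extra, kind, cost = 1, drives - 1, "mirror drive", "minimal"
    elif level in ("RAID4", "RAID5"):
        data, extra, kind, cost = drives - 1, 1, "parity drive", "some"
    elif level == "RAID6":
        data, extra, kind, cost = drives - 2, 2, "parity drive", "some"
    else:
        raise ValueError(f"unsupported level: {level}")
    if data < 1 or extra < 1:
        raise ValueError(f"{level} needs more than {drives} drives")
    return (f"Redundancy using {plural(data, 'data drive')}, "
            f"{plural(extra, kind)}, {cost} performance loss for writes")


print(describe("RAID5", 4))
# Redundancy using 3 data drives, 1 parity drive, some performance loss for writes
```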
I think I understand the goal, which is to get away from secret decoder ring UI. "Distributed" is better than "RAID 5" in that sense, but I wonder if "distributed parity" and "error detection" and even the word "parity" itself, which I used earlier, are helpful or just another term people would have to look up to understand what to do.
I wouldn't mind seeing this functionality made even more aggressive based on layered use cases, even using a use case based UI. Something based on this:
http://blog.linuxgrrl.com/wp-content/uploads/2011/12/partitioning-personas.p...
But put into a separate application that makes it easier to configure RAID for general use, not just installing an OS, would be seriously very useful.
Error correction is true, but it's also true for RAID 1.
- "Distributed" parity applies to RAID 5 or 6. It's unclear which I'll
get, or why I can select it all by itself with no other options including without Error detection (parity) selected.
- "Redundant" under Error detection is confusing. RAID 4 is redundant,
RAID 5 is redundant, RAID 6 is redundant. I'd guess it means RAID 6 except…
It's indented under parity, meaning redundant parity. Marian had this confusion as well, so I think we need to tighten up the visual design there and perhaps grey out those two sub checkboxes unless error detection (parity) is active, perhaps even change the labels to 'redundant parity' and 'distributed parity'
I think also what's happening is part of the UI changes live as I check boxes (graying out or checkboxes check or uncheck automatically) – but in other cases they change or an option is refused once I click on Apply Changes.
Items 3-6 I think need to be consolidated into just RAID 5 or 6, and skip RAID 4. As for what plain language to use to describe them, it's difficult. "Single drive fault tolerance" is maybe OK, but that applies to a two disk RAID 1.
If we drop RAID 4 I'm sure we'll get more self-righteous hate mail. Can I blame you and provide your email address if we do this? :)
Yeah sure, for all two people who install an OS to RAID 4? Haha. It doesn't hurt anyone to leave it in, unless their use case is heavy small file write and then the parity disk gets bogged down well before the data disks. So either the user still needs to know such things about the RAID level they've effectively chosen, or the UI instead needs to draw out the usage context, and make a choice for the user based on that. And I think that actually would help both pros and non-sophisticated users alike.
And possibly the best multidisk options, which are also the easiest and fastest to grow when more space is needed: linear, and RAID 1+linear, combined with XFS for parallelization. But this isn't an option at all.
If XFS is a supported fs type (I don't know if it is offhand) you could select RAID as a partition type and XFS as the filesystem, using RAID for the technology dropdown.
XFS is, but unfortunately there isn't a UI option for RAID linear/concat. Exposing this with checkboxes probably clutters the UI and would add to confusion, even if it presents a good option for a number of use cases. It's better to stuff this behind a use case UI, and if the usage/context fits, then this becomes the suggested storage tech to use.
Chris Murphy
On Dec 19, 2012, at 2:26 PM, Marian Ganisin mganisin@redhat.com wrote:
If the md driver detects a write error on a device in a RAID1, RAID4, RAID5, RAID6, or RAID10 array, it immediately disables that device (marking it as faulty) and continues operation on the remaining devices.
Marian is right, wiki as well as man md agree with him, all RAID levels except of RAID0 do error detection.
This md entry is referring to the drive firmware itself reporting a (sector) read or write error. This error detection always occurs, and is totally independent of md. The md RAID levels merely dictate subsequent behaviors of this drive error detection.
When a sector read error occurs, md will get the data from a mirrored copy (RAID 1), or rebuild it from parity (RAID 4, 5, 6). That recovered data is then also written to the LBA that previously had the read error, and if the firmware determines the sector is bad it will remap to a reserve sector. So md isn't actually doing error detection at all in normal operation, it's the drive firmware that does this. What md provides is a way to correct for the error. In the case of a write error there is no way out, the device must wholesale be considered faulty.
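The recovery step described above can be sketched in a few lines. This is a toy model, not md's code: chunks are plain byte strings, single parity is modelled as XOR across the data chunks, and the function names are invented for the example.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def recover_raid1(mirrors, failed):
    """Return the chunk from any mirror other than the one that reported the error."""
    for index, chunk in enumerate(mirrors):
        if index != failed:
            return chunk
    raise ValueError("no surviving mirror")


def recover_parity(data_chunks, parity, failed):
    """Rebuild the unreadable data chunk from the surviving chunks plus parity."""
    survivors = [c for i, c in enumerate(data_chunks) if i != failed]
    return xor_chunks(survivors + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_chunks(data)

# The drive reports that chunk 1 is unreadable; the array only has to rebuild it.
assert recover_parity(data, parity, failed=1) == data[1]
assert recover_raid1([b"same", b"same"], failed=0) == b"same"
```

In both functions the `failed` index comes from outside: the array is told which chunk is bad and only corrects it.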
Therefore I say that usage of term 'error detection' is highly confusing as it actually counts wide range of RAID levels, RAID is error detecting redundant composition of disks by definition (once more, many professionals do not count RAID0 to RAID family).
Error correction is more correct than error detection. Scrub check or repair is the method for md based error detection. But scrubs aren't configured by default, so in fact md error detection is never occurring out of the box. Therefore I find the term "error detection" is misleading.
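For reference, a scrub is requested through the md sysfs files `sync_action` and `mismatch_cnt`. The sketch below wraps them in two small helpers; the helper names and the `md0` default are assumptions for illustration, and writing to the real sysfs path requires root.

```python
from pathlib import Path


def start_check(array="md0", sysfs_root="/sys/block"):
    """Ask md to start a read-only scrub ("check") of the array."""
    action = Path(sysfs_root) / array / "md" / "sync_action"
    action.write_text("check\n")


def mismatch_count(array="md0", sysfs_root="/sys/block"):
    """Return the mismatch count md reported for the last check or repair."""
    counter = Path(sysfs_root) / array / "md" / "mismatch_cnt"
    return int(counter.read_text())
```

Writing `repair` in place of `check` requests the repairing variant of the scrub. Until something (a cron job, a timer, an administrator) triggers one of these, no scrub-based detection takes place.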
As the anaconda implements creation of MD devices it would be really nice to have it aligned with MD terminology as expressed in relevant man pages.
What I'm finding is that it does do this, it's just not immediately discoverable. You must check a box, and click Apply, for the RAID level label to change.
I asked my colleagues if RAID10 has parity, the answer was: If it is able to detect error, it has to have a parity.
Well, they're wrong. Md using RAID 1 can also detect error during a scrub.
In the case of a RAID 1 scrub, if data chunks don't match between two devices, md reports an error has been detected. Of course, it's ambiguous which drive contains the valid/invalid chunks.
In the case of a RAID 5 scrub, it means reading all data and parity chunks and recomputing parity to compare to the parity chunks. If there's a mismatch, md reports an error has been detected. But again it's ambiguous whether it's the data chunk or parity chunk that's wrong. Yet a repair type of scrub for RAID 5/6 assume data chunks are correct, and write new parity chunks.
So parity is not required for detecting error.
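Both kinds of scrub can be sketched as follows. This is a toy model under stated assumptions: chunks are byte strings, RAID 5 parity is plain XOR, and the function names are hypothetical, not md interfaces.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def scrub_raid1(copy_a, copy_b):
    """Return the chunk indexes where the two mirrors disagree."""
    return [i for i, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]


def scrub_raid5(stripes):
    """Return the stripe indexes whose stored parity does not match the data.

    stripes: list of (data_chunks, parity) tuples.
    """
    return [i for i, (data, parity) in enumerate(stripes)
            if xor_chunks(data) != parity]


# RAID 1: a mismatch is detected without any parity, but not which copy is wrong.
print(scrub_raid1([b"aa", b"bb"], [b"aa", b"bX"]))

# RAID 5: a mismatch is detected, but not whether data or parity is wrong.
good = ([b"\x01", b"\x02"], b"\x03")
bad = ([b"\x01", b"\x02"], b"\x00")
print(scrub_raid5([good, bad]))
```

Both functions report only where a mismatch is, which is the ambiguity noted above.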
Chris Murphy