Hi!
I've put a fuller explanation of the bug in a Bugzilla comment[1], and I've included the text below. Please let us know if you have any additional questions.
============================================================
The problem was discovered when the cache was initialized, and a user then issued a command to add an additional data device to the pool, resulting in the following assertion failure:
stratisd[6378]: thread 'stratis-wt-6' panicked at 'assertion failed: `(left == right)`
stratisd[6378]:   left: `Device { major: 252, minor: 7 }`,
stratisd[6378]:  right: `Device { major: 252, minor: 0 }`', src/engine/strat_engine/thinpool/thinpool.rs:516:9
The assertion that failed was:
assert_eq!(
    backstore.device().expect(
        "thinpool exists and has been allocated to, so backstore must have a cap device"
    ),
    self.backstore_device
);
This assertion checks that the device from which the thinpool allocates its component devices is the same as the device that the backstore considers to be its cap device, that is, the device to be allocated from.
The assertion itself was quite correct, and effective in identifying the bug.
The assertion failed because the backstore's device was the cache device, but the thinpool was still allocating from the cap device it had been using before the cache was initialized, thereby bypassing the cache. There is no risk of data corruption in this case; the problem is that the cache is unused and so no performance benefit is gained.
In the example above, where the operation is correct, before the cache is initialized the thinpool devices, 253:1 and 253:2, are allocated from the cap device, 253:0. After the cache is initialized, the thinpool devices are allocated from the cache device, 253:7, instead. This is the correct configuration.
Previously, 253:1 and 253:2 would continue to allocate from 253:0 even though the cache device, 253:7, was constructed properly.
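The stale-device situation described above can be sketched in a few lines of Rust. This is only a simplified model, not stratisd's actual code: the types Backstore and ThinPool, the methods init_cache and set_device, and the function init_cache_scenario are all hypothetical names chosen for illustration. The sketch assumes that the thinpool remembers the device it allocates from, and that initializing a cache changes the device the backstore reports.

```rust
// Simplified model of the bug. All type, method, and function names here
// are hypothetical and are not stratisd's actual API.

/// A devicemapper device identified by its major:minor numbers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Device {
    major: u32,
    minor: u32,
}

/// Stand-in for the backstore.
struct Backstore {
    cap: Device,
    cache: Option<Device>,
}

impl Backstore {
    /// The device upper layers should allocate from: the cache device
    /// if one has been initialized, otherwise the cap device.
    fn device(&self) -> Device {
        self.cache.unwrap_or(self.cap)
    }

    fn init_cache(&mut self, cache: Device) {
        self.cache = Some(cache);
    }
}

/// Stand-in for the thinpool, which remembers the device it allocates from.
struct ThinPool {
    backstore_device: Device,
}

impl ThinPool {
    /// Building the thinpool reads the backstore's current device, which
    /// is why rebuilding the whole stack picks up the cache device.
    fn new(backstore: &Backstore) -> Self {
        ThinPool {
            backstore_device: backstore.device(),
        }
    }

    /// Re-point the thinpool at a new underlying device.
    fn set_device(&mut self, device: Device) {
        self.backstore_device = device;
    }
}

/// Initialize a cache and return (backstore device, thinpool device).
/// With `propagate == false` the thinpool keeps its stale device, which
/// is the condition the failed assertion detected.
fn init_cache_scenario(propagate: bool) -> (Device, Device) {
    let mut backstore = Backstore {
        cap: Device { major: 253, minor: 0 },
        cache: None,
    };
    let mut thinpool = ThinPool::new(&backstore);

    backstore.init_cache(Device { major: 253, minor: 7 });
    if propagate {
        thinpool.set_device(backstore.device());
    }

    (backstore.device(), thinpool.backstore_device)
}
```

Calling init_cache_scenario(false) yields two different devices (253:7 for the backstore, 253:0 for the thinpool), so an equality assertion on the pair would fail in the same way as the one in the panic message. Calling init_cache_scenario(true) yields 253:7 for both.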
Because the pool metadata had been written properly, any action that destroyed and rebuilt the device stack would cause the thinpool devices to be set up properly to allocate from the cache device. So, a reboot would certainly cause the device stack to be reconstructed correctly.
Because of the particular nature of the code defect that caused the bug, adding an additional device to the cache would cause the thinpool devices to be allocated properly from the cache device.
=====================================================================
- mulhern
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2007018#c4
On Mon, Nov 15, 2021 at 11:28 PM Ryan Gonzalez rymg19@gmail.com wrote:
stratisd was not immediately updating the devicemapper device stack when a cache was initialized with the result that the cache was not immediately put in use
Out of curiosity, what were the practical effects of this issue? Was it just degraded performance in some cases?
-- Ryan
https://refi64.com/
On Nov 15, 2021, 8:25 PM -0600, the Mulhern amulhern@redhat.com, wrote:
Hi!
Stratis 3.0.0, which includes new versions of stratisd and stratis-cli, has been released.
Please see the blog post[1] for details of the release.
Thanks for your continued interest in the Stratis project.
- mulhern
[1] https://stratis-storage.github.io/stratis-release-notes-3-0-0/
_______________________________________________
stratis-devel mailing list -- stratis-devel@lists.fedorahosted.org
To unsubscribe send an email to stratis-devel-leave@lists.fedorahosted.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedorahosted.org/archives/list/stratis-devel@lists.fedorahoste...
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure