Hi,
Jiri found this review (among many) of Fedora 25:
http://www.hecticgeek.com/2016/12/fedora-25-review/
It includes a couple recommendations:
* We should restore systemd-readahead to speed boot time by ~30% for users without SSDs. Endless has a downstream patch for this. Or we could use Ubuntu's readahead utility.
* We should switch from CFQ to deadline I/O scheduler (which Ubuntu has been using for years) for subjective massive responsiveness improvements when the system is under load.
Of these, the latter seems easier to change and more important. Anyone know why we're still using CFQ? If the answer is "it's better for servers" then perhaps we need a mechanism to adjust this on a per-product basis.
Just wanted to put these issues back on the radar....
Michael
On Mon, Jan 16, 2017 at 06:07:42PM -0600, Michael Catanzaro wrote:
http://www.hecticgeek.com/2016/12/fedora-25-review/ It includes a couple recommendations:
- We should restore systemd-readahead to speed boot time by ~30% for users without SSDs. Endless has a downstream patch for this. Or we could use Ubuntu's readahead utility.
Has Endless done benchmarks? I'd hate to re-enable it based mainly on anecdotes. If it turns out to be worthwhile, it'd be nice to work with Endless to get the feature back upstream in systemd - if I remember right, one of the points at its removal was that the main developers didn't have the spinning-disk hardware to benefit from it and that no one who cared stepped up. If it turns out that there _are_ people who care, upstream seems like the right home.
- We should switch from CFQ to deadline I/O scheduler (which Ubuntu has been using for years) for subjective massive responsiveness improvements when the system is under load
Same point about benchmarks applies here, although it's a lot less work to experiment with and the consequences easily reversed. FWIW, RHEL 7 defaults to deadline on all devices except SATA disks, which default to CFQ. And on virtual disks, you get no IO scheduling at all, which makes sense and makes this irrelevant in those cases.
Of these, the latter seems easier to change and more important. Anyone know why we're still using CFQ? If the answer is "it's better for servers" then perhaps we need a mechanism to adjust this on a per-product basis.
Dunno, but doing it per-edition might make sense. At least, unless the above is applicable for everything.
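A minimal sketch of how cheap the experiment mentioned above could be, assuming the usual sysfs layout (/sys/block/<device>/queue/scheduler). The helper names and the SYSFS_ROOT variable are hypothetical, added only so the sketch can be tried against a scratch directory. On a real system the write needs root and does not survive a reboot, which is what makes it easy to reverse.

```shell
#!/bin/sh
# Assumption: SYSFS_ROOT is not a real kernel or distro setting; it only
# lets the helpers below be pointed at a fake tree. It defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the scheduler line for a device, e.g. "noop deadline [cfq]".
# $1 is a device name such as "sda".
show_scheduler() {
    cat "$SYSFS_ROOT/block/$1/queue/scheduler"
}

# Select a scheduler for a device until the next reboot.
# $1 is a device name, $2 a scheduler name such as "deadline".
set_scheduler() {
    printf '%s\n' "$2" > "$SYSFS_ROOT/block/$1/queue/scheduler"
}
```

For example, `set_scheduler sda deadline` followed by `show_scheduler sda` would switch one disk and confirm the change; switching back is the same call with `cfq`.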
On Mon, 2017-01-16 at 20:19 -0500, Matthew Miller wrote:
On Mon, Jan 16, 2017 at 06:07:42PM -0600, Michael Catanzaro wrote:
http://www.hecticgeek.com/2016/12/fedora-25-review/ It includes a couple recommendations:
* We should restore systemd-readahead to speed boot time by ~30% for users without SSDs. Endless has a downstream patch for this. Or we could use Ubuntu's readahead utility.
Has Endless done benchmarks? I'd hate to re-enable it based mainly on anecdotes.
I don't know. I think their benchmarks would not be super relevant to us anyway, since they are targeting extremely low-end hardware; I would be unsurprised if readahead is far more important for Endless than it is for Fedora.
But this reviewer has taken measurements (the basis for that 30% number). They're at the bottom of his Fedora 25 review. They are not super scientific, because they are comparing Fedora to Ubuntu, but this same reviewer also reviewed Fedora 22 a while back with and without readahead. His boot took 38 seconds without readahead and 28 seconds with readahead. It is an anecdote, since it's only tested on one single computer, but the difference is quite major:
http://www.hecticgeek.com/2015/06/fedora-22-review/
Same point about benchmarks applies here, although it's a lot less work to experiment with and the consequences easily reversed. FWIW, RHEL 7 defaults to deadline on all devices except SATA disks, which default to CFQ. And on virtual disks, you get no IO scheduling at all, which makes sense and makes this irrelevant in those cases.
Very interesting that our behavior is different from RHEL!
Michael
Matthew Miller píše v Po 16. 01. 2017 v 20:19 -0500:
On Mon, Jan 16, 2017 at 06:07:42PM -0600, Michael Catanzaro wrote:
http://www.hecticgeek.com/2016/12/fedora-25-review/ It includes a couple recommendations:
* We should restore systemd-readahead to speed boot time by ~30% for users without SSDs. Endless has a downstream patch for this. Or we could use Ubuntu's readahead utility.
Has Endless done benchmarks? I'd hate to re-enable it based mainly on anecdotes. If it turns out to be worthwhile, it'd be nice to work with Endless to get the feature back upstream in systemd - if I remember right, one of the points at its removal was that the main developers didn't have the spinning-disk hardware to benefit from it and that no one who cared stepped up. If it turns out that there _are_ people who care, upstream seems like the right home.
* We should switch from CFQ to deadline I/O scheduler (which Ubuntu has been using for years) for subjective massive responsiveness improvements when the system is under load
Same point about benchmarks applies here, although it's a lot less work to experiment with and the consequences easily reversed. FWIW, RHEL 7 defaults to deadline on all devices except SATA disks, which default to CFQ. And on virtual disks, you get no IO scheduling at all, which makes sense and makes this irrelevant in those cases
I briefly spoke with Jiri Hladky, the manager of the FS perf team in Red Hat, and he says that deadline is clearly better for SSD. Results for rotating disks are mixed, but he'd still prefer deadline there, too. He can provide more detailed info if we need it.
Jiri
On Tue, Jan 17, 2017 at 04:38:31PM +0100, Jiri Eischmann wrote:
I briefly spoke with Jiri Hladky, the manager of the FS perf team in Red Hat, and he says that deadline is clearly better for SSD. Results for rotating disks are mixed, but he'd still prefer deadline there, too. He can provide more detailed info if we need it.
It's my understanding that CFQ is better in the case where you a) have a single spinning disk with b) mixed workload on top of that and c) care about overall throughput more than latency. An example might be if you are a budget hosting provider and are running multiple VMs on single-disk servers. If deadline is preferred for lower latency on desktops, and is overall preferred for servers (a win on RAID, on SSD, and even on single-disk systems in many cases), switching to deadline as the across-the-board default seems like the straightforward choice.
I guess the next step is to engage the kernel team, and possibly FESCo since this is obviously a big engineering steering decision. And someone other than me can decide if this should be a Change.
On Tue, Jan 17, 2017 at 12:04 PM, Matthew Miller mattdm@fedoraproject.org wrote:
On Tue, Jan 17, 2017 at 04:38:31PM +0100, Jiri Eischmann wrote:
I briefly spoke with Jiri Hladky, the manager of the FS perf team in Red Hat, and he says that deadline is clearly better for SSD. Results for rotating disks are mixed, but he'd still prefer deadline there, too. He can provide more detailed info if we need it.
It's my understanding that CFQ is better in the case where you a) have a single spinning disk with b) mixed workload on top of that and c) care about overall throughput more than latency. An example might be if you are a budget hosting provider and are running multiple VMs on single-disk servers. If deadline is preferred for lower latency on desktops, and is overall preferred for servers (a win on RAID, on SSD, and even on single-disk systems in many cases), switching to deadline as the across-the-board default seems like the straightforward choice.
I guess the next step is to engage the kernel team, and possibly FESCo since this is obviously a big engineering steering decision. And someone other than me can decide if this should be a Change.
That is a somewhat odd suggestion. It's a runtime tunable for a reason. No default will ever work for 100% of the cases Fedora may be used in. Rather than go through all the effort of making a decision for all of Fedora at the FESCo level, I would prefer the Workgroups look at what their target machine types are and come up with a default they would prefer to see. Then they can work on setting it at boot via an Edition specific setting.
josh
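A rough sketch of what an Edition-specific boot-time setting could look like, assuming it is delivered as a udev rule of the same form as the one posted later in this thread. The function name is hypothetical, and where an Edition would ship the resulting file is left open here.

```shell
#!/bin/sh
# Print a udev rule that selects scheduler $1 for sd* disks whose
# queue/rotational attribute equals $2 (0 = non-rotating, 1 = rotating).
# The rule text follows the form used elsewhere in this thread.
make_scheduler_rule() {
    printf 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="%s", ATTR{queue/scheduler}="%s"\n' "$2" "$1"
}

# Example: a Workgroup that prefers deadline on non-rotating disks.
make_scheduler_rule deadline 0
```

Each Workgroup could generate a different rule from the same helper, so the kernel default stays untouched and the choice remains a runtime tunable.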
On Tue, Jan 17, 2017 at 12:17:30PM -0500, Josh Boyer wrote:
That is a somewhat odd suggestion. It's a runtime tunable for a reason. No default will ever work for 100% of the cases Fedora may be used in.
Sure, when it *seems* that a certain value is likely to be right in the (possibly vast) majority of cases, doesn't it make sense for that to be the default and let people tune differently when they have a specific need?
Rather than go through all the effort of making a decision for all of Fedora at the FESCo level, I would prefer the Workgroups look at what their target machine types are and come up with a default they would prefer to see. Then they can work on setting it at boot via an Edition specific setting.
I mean, that's fine too.
On Tue, Jan 17, 2017 at 12:24 PM, Matthew Miller mattdm@fedoraproject.org wrote:
On Tue, Jan 17, 2017 at 12:17:30PM -0500, Josh Boyer wrote:
That is a somewhat odd suggestion. It's a runtime tunable for a reason. No default will ever work for 100% of the cases Fedora may be used in.
Sure, when it *seems* that a certain value is likely to be right in the (possibly vast) majority of cases, doesn't it make sense for that to be the default and let people tune differently when they have a specific need?
Yes, but doing that across all of Fedora is pointless. What works for the majority of Server users won't necessarily work for Cloud or Workstation. Also, the end user's ability to even discover that it's tunable varies greatly from one Edition to the next. One could expect sysadmins knowing about this and changing the default on their Server install. The Workstation end user may not be as low-level detail aware and could just suffer with poor performance because they think that's the only option.
Rather than go through all the effort of making a decision for all of Fedora at the FESCo level, I would prefer the Workgroups look at what their target machine types are and come up with a default they would prefer to see. Then they can work on setting it at boot via an Edition specific setting.
I mean, that's fine too.
It's more realistic.
josh
On Tue, Jan 17, 2017 at 12:31:04PM -0500, Josh Boyer wrote:
Yes, but doing that across all of Fedora is pointless. What works for the majority of Server users won't necessarily work for Cloud or Workstation. Also, the end user's ability to even discover that it's tunable varies greatly from one Edition to the next. One could expect sysadmins knowing about this and changing the default on their Server install. The Workstation end user may not be as low-level detail aware and could just suffer with poor performance because they think that's the only option.
Well, I'm starting from Michael's premise that deadline would be better for latency for most desktop users (regardless of disk type), and clearly better when using SSD. This leads me to a different conclusion than the above.
It's irrelevant for cloud and any other virt deployment of Atomic or Server. As far as I know, the special case on hardware where cfq is better is the one I outlined (on hardware, single spindle, prefer throughput, mixed workload) and I agree that it's okay to expect sysadmins to handle that.
On Tue, Jan 17, 2017 at 12:38 PM, Matthew Miller mattdm@fedoraproject.org wrote:
On Tue, Jan 17, 2017 at 12:31:04PM -0500, Josh Boyer wrote:
Yes, but doing that across all of Fedora is pointless. What works for the majority of Server users won't necessarily work for Cloud or Workstation. Also, the end user's ability to even discover that it's tunable varies greatly from one Edition to the next. One could expect sysadmins knowing about this and changing the default on their Server install. The Workstation end user may not be as low-level detail aware and could just suffer with poor performance because they think that's the only option.
Well, I'm starting from Michael's premise that deadline would be better for latency for most desktop users (regardless of disk type), and clearly better when using SSD. This leads me to a different conclusion than the above.
Then set it as such in Workstation. I don't see how your conclusion conflicts with mine at all.
It's irrelevant for cloud and any other virt deployment of Atomic or Server. As far as I know, the special case on hardware where cfq is better is the one I outlined (on hardware, single spindle, prefer throughput, mixed workload) and I agree that it's okay to expect sysadmins to handle that.
Why is it irrelevant on virt? Do people not care about local storage impacts of their guests? That would be surprising.
josh
On Tue, Jan 17, 2017 at 12:50:43PM -0500, Josh Boyer wrote:
Well, I'm starting from Michael's premise that deadline would be better for latency for most desktop users (regardless of disk type), and clearly better when using SSD. This leads me to a different conclusion than the above.
Then set it as such in Workstation. I don't see how your conclusion conflicts with mine at all.
Well, if it seems like the best default for Workstation (and therefore probably also most of the desktop Spins) *and* for server, doesn't changing the overall default make the most sense?
It's irrelevant for cloud and any other virt deployment of Atomic or Server. As far as I know, the special case on hardware where cfq is better is the one I outlined (on hardware, single spindle, prefer throughput, mixed workload) and I agree that it's okay to expect sysadmins to handle that.
Why is it irrelevant on virt? Do people not care about local storage impacts of their guests? That would be surprising.
It's relevant to virt hosts, but not to cloud and virt _guests_, where the io scheduler is bypassed completely. See http://www.linux-kvm.org/images/6/63/02x06a-VirtioBlk.pdf
On Tue, Jan 17, 2017 at 3:10 PM, Matthew Miller mattdm@fedoraproject.org wrote:
On Tue, Jan 17, 2017 at 12:50:43PM -0500, Josh Boyer wrote:
Well, I'm starting from Michael's premise that deadline would be better for latency for most desktop users (regardless of disk type), and clearly better when using SSD. This leads me to a different conclusion than the above.
Then set it as such in Workstation. I don't see how your conclusion conflicts with mine at all.
Well, if it seems like the best default for Workstation (and therefore probably also most of the desktop Spins) *and* for server, doesn't changing the overall default make the most sense?
Via the runtime tunable, sure (read that as: we are not carrying a damn kernel patch). But to assume that it makes sense without at least discussing it with the Editions seems odd.
It's irrelevant for cloud and any other virt deployment of Atomic or Server. As far as I know, the special case on hardware where cfq is better is the one I outlined (on hardware, single spindle, prefer throughput, mixed workload) and I agree that it's okay to expect sysadmins to handle that.
Why is it irrelevant on virt? Do people not care about local storage impacts of their guests? That would be surprising.
It's relevant to virt hosts, but not to cloud and virt _guests_, where the io scheduler is bypassed completely. See http://www.linux-kvm.org/images/6/63/02x06a-VirtioBlk.pdf
Is that the default IO driver in QEMU across all releases at this point?
josh
On Tue, Jan 17, 2017 at 6:38 PM, Matthew Miller mattdm@fedoraproject.org wrote:
On Tue, Jan 17, 2017 at 12:31:04PM -0500, Josh Boyer wrote:
Yes, but doing that across all of Fedora is pointless. What works for the majority of Server users won't necessarily work for Cloud or Workstation. Also, the end user's ability to even discover that it's tunable varies greatly from one Edition to the next. One could expect sysadmins knowing about this and changing the default on their Server install. The Workstation end user may not be as low-level detail aware and could just suffer with poor performance because they think that's the only option.
Well, I'm starting from Michael's premise that deadline would be better for latency for most desktop users (regardless of disk type), and clearly better when using SSD. This leads me to a different conclusion than the above.
It's irrelevant for cloud and any other virt deployment of Atomic or Server. As far as I know, the special case on hardware where cfq is better is the one I outlined (on hardware, single spindle, prefer throughput, mixed workload) and I agree that it's okay to expect sysadmins to handle that.
Well there are a few things to consider here ...
1) Afaik deadline does not support I/O priorities (ionice), while CFQ does. That might be harmful when you have processes like tracker competing for I/O in the background.
2) What about external media? It might perform a bit better on internal disks but what about USB connected storage?
3) Given the recent developments upstream (BFQ and multi queue) shouldn't we rather wait for that?
The scheduler choice could be set by a simple udev rule
$ cat /etc/udev/rules.d/60-ssd-scheduler.rules
# set deadline scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"

$ for f in /sys/block/sd?/queue/rotational; do printf "$f is "; cat $f; done
/sys/block/sda/queue/rotational is 0
/sys/block/sdb/queue/rotational is 1
/sys/block/sdc/queue/rotational is 0

$ for f in /sys/block/sd?/queue/scheduler; do printf "$f is "; cat $f; done
/sys/block/sda/queue/scheduler is noop [deadline] cfq
/sys/block/sdb/queue/scheduler is noop deadline [cfq]
/sys/block/sdc/queue/scheduler is noop [deadline] cfq
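In the scheduler listing above, the active scheduler is the name in square brackets. A small sketch for pulling out just that name, for instance to check that the udev rule took effect; the function name is hypothetical, and the loop simply skips devices that are not present.

```shell
#!/bin/sh
# Extract the bracketed (active) scheduler from a queue/scheduler line.
# $1 is the file content, e.g. "noop [deadline] cfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every sd? disk that exists.
for f in /sys/block/sd?/queue/scheduler; do
    [ -r "$f" ] || continue
    printf '%s: %s\n' "$f" "$(active_scheduler "$(cat "$f")")"
done
```

With the three disks listed above, this would report deadline for sda and sdc and cfq for sdb.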
On Sun, Jul 09, 2017 at 10:26:01AM -0000, Leigh Scott wrote:
The scheduler choice could be set by a simple udev rule
$ cat /etc/udev/rules.d/60-ssd-scheduler.rules
# set deadline scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"

$ for f in /sys/block/sd?/queue/rotational; do printf "$f is "; cat $f; done
/sys/block/sda/queue/rotational is 0
/sys/block/sdb/queue/rotational is 1
/sys/block/sdc/queue/rotational is 0

$ for f in /sys/block/sd?/queue/scheduler; do printf "$f is "; cat $f; done
/sys/block/sda/queue/scheduler is noop [deadline] cfq
/sys/block/sdb/queue/scheduler is noop deadline [cfq]
/sys/block/sdc/queue/scheduler is noop [deadline] cfq
This would be F27 material. What about the two new fancy schedulers added in 4.12 (BFQ, Kyber, https://lwn.net/Articles/720675/)?
Zbyszek