Add descriptions of parallel dumping and how to use it.
Signed-off-by: Zhou wenjian <zhouwj-fnst@cn.fujitsu.com>
Signed-off-by: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
---
 kexec-kdump-howto.txt | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/kexec-kdump-howto.txt b/kexec-kdump-howto.txt
index 05b497f..fab7b09 100644
--- a/kexec-kdump-howto.txt
+++ b/kexec-kdump-howto.txt
@@ -616,6 +616,50 @@ options are copied from /proc/cmdline. In general it is best to append
 command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
 the original command line completely.
 
+Parallel Dumping Operation
+==========================
+Kexec allows kdump using multiple cpus. So parallel feature can
+accelerate dumping greatly, especially in doing compress and filter.
+For example:
+"makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
+has 2 or more times performance of
+"makedumpfile -c /proc/vmcore dumpfile",
+if THREAD_NUM is larger than 2 and the num of cpus that can be used is
+larger than THREAD_NUM.
+
+Notes on how to use multiple cpus on a capture kernel on x86 system:
+
+To use multiple cpus on a capture kernel on x86 system:
+
+- First, confirm that you are using a sufficiently new kernel version
+  that supports disable_cpu_apicid kernel option as a capture kernel,
+  which is needed to avoid x86 specific hardware issue (*). The
+  disable_cpu_apicid kernel option is automatically appended by
+  kdumpctl script and is ignored if the kernel doesn't support
+  it. Thus, you don't need to do anything else except for the
+  confirmation.
+
+- Then, you need to specify how many cpus you use in a capture kernel
+  by specifying the number of cpus in nr_cpus kernel option in
+  /etc/sysconfig/kdump. nr_cpus is 1 at default.
+
+Note strongly that you should use necessary and sufficnet amount of
+cpus on a capture kernel. IOW, don't use too many cpus on a capture
+kernel, or the capture kernel easily leads to panic due to Out Of
+Memory.
+
+There are kernel data structures and drivers allocating memory in
+proportion to the number of cpus. More you use cpus, more and more
+memory system consumes. Memory is rare, limited resource in a capture
+kernel. Reserved memory should be kept as less as possible. When
+configuring nr_cpus option, you should confirm that kdump certainly
+successfully works without leading to panic due to Out Of Memory on a
+capture kernel.
+
+(*) Without disable_cpu_apicid kernel option, capture kernel leads to
+hang, system reset or power-off at boot, depending on your system and
+runtime situation at the time of crash.
+
 Debugging Tips
 --------------
 - One can drop into a shell before/after saving vmcore with the help of
Hi, Wenjian.
How about setting the nr_cpus as default value for parallel dumping? Then we do not need to specify the nr_cpus in kdump.conf.
Thanks Minfei
kexec mailing list kexec@lists.fedoraproject.org https://lists.fedoraproject.org/mailman/listinfo/kexec
Hello Minfei,
On 09/08/2015 10:46 AM, Minfei Huang wrote:
Hi, Wenjian.
How about setting the nr_cpus as default value for parallel dumping? Then we do not need to specify the nr_cpus in kdump.conf.
What do you mean? Do you mean changing the default value of nr_cpus? Note that nr_cpus is set in /etc/sysconfig/kdump, not in kdump.conf.
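A minimal sketch of what changing nr_cpus could look like, assuming the option is carried in the KDUMP_COMMANDLINE_APPEND string of /etc/sysconfig/kdump (the variable named in the patch context). The helper name `set_nr_cpus` and the sample option string are illustrative assumptions, not part of kexec-tools:

```python
import re

# Matches a standalone "nr_cpus=<number>" token in a kernel command line.
_NR_CPUS = re.compile(r"(?<!\S)nr_cpus=\d+(?!\S)")


def set_nr_cpus(append_value: str, count: int) -> str:
    """Return the option string with nr_cpus set to `count`.

    `append_value` stands for the text between the quotes of
    KDUMP_COMMANDLINE_APPEND="..." (hypothetical input for this sketch).
    """
    if count < 1:
        raise ValueError("nr_cpus must be at least 1")
    option = f"nr_cpus={count}"
    if _NR_CPUS.search(append_value):
        # Replace the existing setting in place.
        return _NR_CPUS.sub(option, append_value)
    # No setting yet: append one.
    return f"{append_value} {option}".strip()


if __name__ == "__main__":
    # Sample value for illustration only.
    print(set_nr_cpus("irqpoll nr_cpus=1 reset_devices", 4))
```

The sketch only rewrites a string; it does not touch the real file, and kdump would still need to be restarted for a changed value to take effect.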
On 09/08/15 at 10:51am, "Zhou, Wenjian/周文剑" wrote:
Hello Minfei,
On 09/08/2015 10:46 AM, Minfei Huang wrote:
Hi, Wenjian.
How about setting the nr_cpus as default value for parallel dumping? Then we do not need to specify the nr_cpus in kdump.conf.
What do you mean? Do you mean changing the default value of nr_cpus?
We can use the number of CPUs running in the 2nd kernel as the parallel thread parameter. Then we do not need to pass num_thread in kdump.conf.
nr_cpus is set in /etc/sysconfig/kdump not in kdump.conf.
Due to the Intel CPU bug, we cannot know the exact number of CPUs used in the 2nd kernel if nr_cpus is equal to MAX_CPU (all CPUs plugged into the machine). If cpu 0 is the crashed cpu, the number of CPUs booted in the 2nd kernel is nr_cpus; otherwise it is nr_cpus - 1.
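The rule described above can be written down as a small sketch. The function name and parameters are hypothetical, and the behaviour for nr_cpus below the machine's CPU count is an assumption (the message only covers the nr_cpus == MAX_CPU case):

```python
def cpus_booted_in_capture_kernel(nr_cpus: int, max_cpus: int, crashed_cpu: int) -> int:
    """Estimate how many CPUs come up in the 2nd kernel.

    nr_cpus     -- value of the nr_cpus kernel option
    max_cpus    -- CPUs plugged into the machine (MAX_CPU in the message)
    crashed_cpu -- index of the CPU that crashed and boots the 2nd kernel
    """
    if nr_cpus < 1 or max_cpus < 1:
        raise ValueError("CPU counts must be at least 1")
    if nr_cpus < max_cpus:
        # Assumption: with spare CPUs available, the requested count is reached.
        return nr_cpus
    if crashed_cpu == 0:
        # cpu 0 is the booting cpu, so no CPU has to be left out.
        return nr_cpus
    # cpu 0 cannot be brought up, so one CPU is missing.
    return nr_cpus - 1
```

For example, on a 4-CPU machine with nr_cpus=4, a crash on cpu 2 would give 3 usable CPUs, while a crash on cpu 0 would give 4.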
How about adding an option to do this work? Once we pass an option in kdump.conf to enable the parallel thread feature, makedumpfile can get the max cpu number in the 2nd kernel and create that many threads.
Thanks Minfei
Hello Minfei,
I get what you mean now. You mean adding an option that lets the user choose whether to use the parallel feature and, if it is used, then setting the default value of num_threads according to nr_cpus, right?
I think there are two problems. One is that the option is only for makedumpfile, and it even concerns makedumpfile's arguments. It's not a good idea to do so much work for makedumpfile in kexec.
The other is that if users set the num_threads value themselves and the value is larger than the number of cpus, it won't help at all.
I think the best way is to tell users what it is and how to use it. In kexec, we should tell users how to bring up multiple cpus and what the benefit is. In makedumpfile, we should tell users how to use multiple threads and what may affect the performance.
On 09/08/15 at 01:57pm, "Zhou, Wenjian/周文剑" wrote:
Hello Minfei,
I get what you mean now. You mean adding an option that lets the user choose whether to use the parallel feature and, if it is used, then setting the default value of num_threads according to nr_cpus, right?
Yes.
I think there are two problems. One is that the option is only for makedumpfile, and it even concerns makedumpfile's arguments. It's not a good idea to do so much work for makedumpfile in kexec.
Yes, makedumpfile will add a new argument to enable/disable the parallel thread feature. As I said earlier in this thread, the system may only bring up nr_cpus - 1 CPUs in the 2nd kernel, due to the Intel CPU bug.
Following is the performance test result you pasted:
without num-threads: 9.5s
num-threads 2: 6.2s
num-threads 3: 56.7s
num-threads 4: 3m23s
num-threads 5: 3m43s
num-threads 6: 4m51s
Thus the parallel thread feature may cause worse performance.
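To make the trend in these numbers explicit, here is a small sketch that converts the pasted timings to seconds and computes the speedup relative to the run without --num-threads. The helper name `parse_duration` is an assumption introduced for illustration; the timings are the ones quoted above:

```python
import re


def parse_duration(text: str) -> float:
    """Convert strings such as '9.5s' or '3m23s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Timings quoted in the thread, keyed by the --num-threads value.
baseline = parse_duration("9.5s")  # run without --num-threads
timings = {2: "6.2s", 3: "56.7s", 4: "3m23s", 5: "3m43s", 6: "4m51s"}

# Speedup above 1.0 means faster than the baseline, below 1.0 means slower.
speedups = {n: baseline / parse_duration(t) for n, t in timings.items()}
best_threads = max(speedups, key=speedups.get)

if __name__ == "__main__":
    for n, s in speedups.items():
        print(f"num-threads {n}: {s:.2f}x")
    print(f"best: num-threads {best_threads}")
```

In this data set only 2 threads beat the single-threaded run; every larger thread count is slower than the baseline, which is the point being made here.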
The other is that if users set the num_threads value themselves and the value is larger than the number of cpus, it won't help at all.
What I said earlier in this thread is that we do not have to specify the number of threads in kdump.conf. All we need to do is enable/disable the parallel thread feature (makedumpfile will create threads according to the number of CPUs running in the 2nd kernel).
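A rough sketch of the proposal, assuming the thread count is derived from the CPUs that are online in the 2nd kernel. The function name is hypothetical, and the policy of keeping the thread count below the CPU count (and falling back to a plain run when fewer than 2 threads would result) is an assumption based on the condition in the patch text, not makedumpfile behaviour:

```python
import os
from typing import List


def build_makedumpfile_argv(online_cpus: int,
                            vmcore: str = "/proc/vmcore",
                            target: str = "dumpfile") -> List[str]:
    """Build a makedumpfile command line for the given number of online CPUs."""
    argv = ["makedumpfile", "-c"]
    # Keep the usable CPU count larger than the thread count.
    threads = online_cpus - 1
    if threads >= 2:
        argv += ["--num-threads", str(threads)]
    argv += [vmcore, target]
    return argv


if __name__ == "__main__":
    # os.cpu_count() may return None, so fall back to a single CPU.
    print(" ".join(build_makedumpfile_argv(os.cpu_count() or 1)))
```

With nr_cpus=1, the default, this degenerates to the plain "makedumpfile -c /proc/vmcore dumpfile" command.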
Thanks Minfei
On 09/08/2015 02:37 PM, Minfei Huang wrote:
What I said earlier in this thread is that we do not have to specify the number of threads in kdump.conf. All we need to do is enable/disable the parallel thread feature (makedumpfile will create threads according to the number of CPUs running in the 2nd kernel).
Using more cpus won't always give better performance. Since num_threads was introduced to improve performance, setting a default num_threads value has little meaning. Users should know the parallel feature well; otherwise they will be satisfied without multiple threads.
And even if the parallel thread feature is enabled, parallel threads won't work if nr_cpus is 1. That may seem very strange to users.
But in any case, some description of the parallel feature is needed.
On 09/08/15 at 04:03pm, "Zhou, Wenjian/周文剑" wrote:
But in any case, some description of the parallel feature is needed.
Agreed. It would be helpful to have this documented in the howto or the man page.
Thanks Minfei
Hi,
Some comments inline. I always worry about reviewing documentation patches because my English is not good enough.
Ccing Pratyush and Vivek; I hope one of you has time to review the documentation.
On 09/08/15 at 10:22am, Zhou Wenjian wrote:
Add descriptions of parallel dumping and how to use it.
Signed-off-by: Zhou wenjian zhouwj-fnst@cn.fujitsu.com Signed-off-by: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
kexec-kdump-howto.txt | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+)
diff --git a/kexec-kdump-howto.txt b/kexec-kdump-howto.txt index 05b497f..fab7b09 100644 --- a/kexec-kdump-howto.txt +++ b/kexec-kdump-howto.txt @@ -616,6 +616,50 @@ options are copied from /proc/cmdline. In general it is best to append command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing the original command line completely.
+Parallel Dumping Operation +========================== +Kexec allows kdump using multiple cpus. So parallel feature can +accelerate dumping greatly, especially in doing compress and filter. +For example: +"makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile" +has 2 or more times performance of +"makedumpfile -c /proc/vmcore dumpfile", +if THREAD_NUM is larger than 2 and the num of cpus that can be used is
s/the num of cpus that can be used/the usable cpu numbers
+larger than THREAD_NUM.
The above paragraph is not easy to understand. How about dropping the example? We know it can increase performance, but doesn't that also depend on the test case, such as network traffic when dumping over the network? So it may give 2 or more times the performance, but that is not guaranteed, right?
+Notes on how to use multiple cpus on a capture kernel on x86 system:
I feel "Notes about" is better, but I'm not sure.
+To use multiple cpus on a capture kernel on x86 system:
Since there is a "Notes on how to.." line above, the "To use" line is redundant.
+- First, confirm that you are using a sufficiently new kernel version
"make sure" is slightly better than "confirm", "sufficiently new" is not necessary
+ that supports disable_cpu_apicid kernel option as a capture kernel,
+ which is needed to avoid x86 specific hardware issue (*). The
+ disable_cpu_apicid kernel option is automatically appended by
+ kdumpctl script and is ignored if the kernel doesn't support
+ it. Thus, you don't need to do anything else except for the
+ confirmation.
Thus, [...] is not necessary.
+- Then, you need to specify how many cpus you use in a capture kernel
s/you use/to be used
+ by specifying the number of cpus in nr_cpus kernel option in
+ /etc/sysconfig/kdump. nr_cpus is 1 at default.
+Note strongly that you should use necessary and sufficnet amount of
s/sufficnet/sufficient
+cpus on a capture kernel. IOW, don't use too many cpus on a capture +kernel, or the capture kernel easily leads to panic due to Out Of +Memory.
It is a good concern, do you have some test result about the memory usage for nr_cpus > 1?
+There are kernel data structures and drivers allocating memory in +proportion to the number of cpus. More you use cpus, more and more +memory system consumes. Memory is rare, limited resource in a capture +kernel. Reserved memory should be kept as less as possible. When +configuring nr_cpus option, you should confirm that kdump certainly +successfully works without leading to panic due to Out Of Memory on a +capture kernel.
The above paragraph does not seem necessary.
+(*) Without disable_cpu_apicid kernel option, capture kernel leads to +hang, system reset or power-off at boot, depending on your system and +runtime situation at the time of crash.
Because disable_cpu_apicid is x86 only, that should be mentioned. BTW, has anyone tested other arches?
Debugging Tips
- One can drop into a shell before/after saving vmcore with the help of
-- 1.8.3.1
Hi Zhou,
On 09/09/2015:05:59:46 PM, Dave Young wrote:
Ccing Pratyush and Vivek, hope any of you can have time to review the documentation.
Thanks for CCing.
On 09/08/15 at 10:22am, Zhou Wenjian wrote:
+Parallel Dumping Operation +========================== +Kexec allows kdump using multiple cpus. So parallel feature can +accelerate dumping greatly, especially in doing compress and filter.
I think 'considerably' or 'substantially' would be a better fit than 'greatly'. Similarly, 'executing' or 'performing' rather than 'doing'.
+For example: +"makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile" +has 2 or more times performance of
s/of/compared to/
[...]
+- First, confirm that you are using a sufficiently new kernel version
[...]
+- Then, you need to specify how many cpus you use in a capture kernel
Instead of 'First' and 'Then', I think the following would be better: 1) Make sure that you are using..... 2) You need to specify....
+Note strongly that you should use necessary and sufficnet amount of
How about
Note: You must use necessary and sufficient number of .....
s/sufficnet/sufficient
+cpus on a capture kernel. IOW, don't use too many cpus on a capture +kernel, or the capture kernel easily leads to panic due to Out Of +Memory.
s/easily leads/may lead/
+There are kernel data structures and drivers allocating memory in
s/allocating/which allocate
+proportion to the number of cpus. More you use cpus, more and more +memory system consumes. Memory is rare, limited resource in a capture +kernel. Reserved memory should be kept as less as possible. When +configuring nr_cpus option, you should confirm that kdump certainly +successfully works without leading to panic due to Out Of Memory on a +capture kernel.
The above paragraph does not seem necessary.
I agree.
~Pratyush
On 09/09/2015 05:59 PM, Dave Young wrote:
+Notes on how to use multiple cpus on a capture kernel on x86 system:
I feel "Notes about" is better, but I'm not sure.
"Notes on" is used in the above of "Parallel Dumping Operation", such as "Notes on rootfs mount:".
+Note strongly that you should use necessary and sufficnet amount of
+cpus on a capture kernel. IOW, don't use too many cpus on a capture +kernel, or the capture kernel easily leads to panic due to Out Of +Memory.
It is a good concern, do you have some test result about the memory usage for nr_cpus > 1?
I have just tested it. The following is the result.

crashkernel=128M

nr_cpus                1       2       4       8      16
total mem (KB)    113372  113212  112972  112492  111532
free mem (KB)      74360   72648   69760   65128   57032
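As a quick reading aid for these numbers, the sketch below computes how much free memory each additional cpu costs between consecutive measurements. The function name is illustrative; the figures are the ones reported above for crashkernel=128M, and nothing here generalises beyond that single test:

```python
from typing import List

# Figures reported above (crashkernel=128M), in KB.
nr_cpus = [1, 2, 4, 8, 16]
free_kb = [74360, 72648, 69760, 65128, 57032]


def free_memory_cost_per_cpu(cpus: List[int], free: List[int]) -> List[float]:
    """Free-memory drop (KB) per added cpu between consecutive measurements."""
    pairs = list(zip(cpus, free))
    return [
        (f0 - f1) / (c1 - c0)
        for (c0, f0), (c1, f1) in zip(pairs, pairs[1:])
    ]


step_costs = free_memory_cost_per_cpu(nr_cpus, free_kb)
# Average over the whole range, from 1 cpu to 16 cpus.
overall_cost = (free_kb[0] - free_kb[-1]) / (nr_cpus[-1] - nr_cpus[0])

if __name__ == "__main__":
    print(step_costs)
    print(overall_cost)
```

In this test each extra cpu costs roughly 1.0 to 1.7 MB of free memory, so going from 1 to 16 cpus consumes about 17 MB of the memory left free in the capture kernel.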
BTW, other comments have been reflected in the new patch. But I forgot to change the patch version to v2.