Hi,
I noticed this in my dmesg
general protection fault: 0000 [#2] SMP
last sysfs file: /sys/fs/ecryptfs/version
CPU 2
Modules linked in: cbc cryptd aes_x86_64 aes_generic ecb ecryptfs smsc47m192 hwmon_vid coretemp nf_conntrack_netbios_ns ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 iTCO_wdt i2c_i801 iTCO_vendor_support serio_raw r8169 mii sata_sil i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]

Pid: 18329, comm: flush-8:16 Tainted: G D 2.6.35.5-29.fc13.x86_64 #1 D945GCLF2/
RIP: 0010:[<ffffffff8122f04f>] [<ffffffff8122f04f>] cfq_free_io_context+0x18/0x34
RSP: 0018:ffff88000d909d90 EFLAGS: 00010202
RAX: 00000001075550e9 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
RDX: 00000001075550ec RSI: ffff880005bd0dd0 RDI: 0000000000000282
RBP: ffff88000d909da0 R08: ffff880040225340 R09: 0000000000000000
R10: ffff88000d909d90 R11: dead000000200200 R12: ffff880040225340
R13: ffff8800785a8688 R14: 0000000000000000 R15: ffff88007c800000
FS: 0000000000000000(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f96ea945c03 CR3: 0000000077ab7000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:16 (pid: 18329, threadinfo ffff88000d908000, task ffff8800785a8000)
Stack:
 ffff880040225340 ffff8800785a8000 ffff88000d909dd0 ffffffff812263db
<0> ffffffff812263a6 ffff8800785a8000 ffff8800785a8688 ffff880040225340
<0> ffff88000d909e10 ffffffff812264e7 ffffffff81226492 ffff8800785a8688
Call Trace:
 [<ffffffff812263db>] put_io_context+0x69/0x9c
 [<ffffffff812263a6>] ? put_io_context+0x34/0x9c
 [<ffffffff812264e7>] exit_io_context+0xa0/0xad
 [<ffffffff81226492>] ? exit_io_context+0x4b/0xad
 [<ffffffff810555a1>] do_exit+0x786/0x7ba
 [<ffffffff810f9965>] ? bdi_start_fn+0x0/0xda
 [<ffffffff8106aebc>] kthreadd+0x0/0x13a
 [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
 [<ffffffff814a1310>] ? restore_args+0x0/0x30
 [<ffffffff8106ae1a>] ? kthread+0x0/0xa2
 [<ffffffff8100aaa0>] ? kernel_thread_helper+0x0/0x10
Code: 81 e8 48 93 e8 ff 41 5a 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 48 8b 5f 78 49 89 fc 48 85 db 74 17 <48> 8b 03 48 8d 73 b0 4c 89 e7 0f 18 08 e8 45 ff ff ff 48 8b 1b
RIP [<ffffffff8122f04f>] cfq_free_io_context+0x18/0x34
 RSP <ffff88000d909d90>
---[ end trace f5f95d6a4269a603 ]---
Fixing recursive fault but reboot is needed!
uname -a
Linux ozzy.pl 2.6.35.5-29.fc13.x86_64 #1 SMP Wed Sep 22 00:00:29 CEST 2010 x86_64 x86_64 x86_64 GNU/Linux
This is a self-compiled 2.6.35.5-29 kernel. I did not see this with earlier versions. Full dmesg attached.
Regards, Michal
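A note on reading the register dump above: RBX holds 6b6b6b6b6b6b6b6b, which matches the slab POISON_FREE byte (0x6b) that the allocator writes over freed objects when slab debugging is enabled, and R11 holds dead000000200200, the x86_64 LIST_POISON2 value. That hints at a use-after-free in the io_context teardown path, but it does not prove one. The following is only a minimal Python sketch for spotting such values in an oops, assuming the oops text is available as a string; the helper name `find_poison` and its lookup tables are illustrative and not part of any existing tool.

```python
import re

# Well-known kernel poison values (see include/linux/poison.h, x86_64).
POISON_BYTES = {
    0x6B: "POISON_FREE (slab object already freed)",
    0x5A: "POISON_INUSE (slab object allocated but never initialised)",
}
LIST_POISON = {
    0xDEAD000000100100: "LIST_POISON1 (->next of a deleted list entry)",
    0xDEAD000000200200: "LIST_POISON2 (->prev of a deleted list entry)",
}

# Matches general-purpose registers such as "RBX: 6b6b6b6b6b6b6b6b".
# RSP/RIP carry a segment prefix ("0018:...") and are deliberately skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}):\s*([0-9a-f]{16})\b")


def find_poison(oops_text):
    """Return {register: description} for registers holding poison values."""
    hits = {}
    for name, hexval in REG_RE.findall(oops_text):
        value = int(hexval, 16)
        raw = bytes.fromhex(hexval)
        if value in LIST_POISON:
            hits[name] = LIST_POISON[value]
        elif len(set(raw)) == 1 and raw[0] in POISON_BYTES:
            # All eight bytes are the same poison byte.
            hits[name] = POISON_BYTES[raw[0]]
    return hits
```

Feeding it the register lines from the oops would flag RBX and R11 and leave the ordinary pointers alone.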
Yet another "invoked rcu_dereference_check"
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/exit.c:1387 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by mc/1364:
 #0: (tasklist_lock){.+.+..}, at: [<ffffffff81054a76>] do_wait+0xde/0x233
 #1: (&(&sighand->siglock)->rlock){-.....}, at: [<ffffffff810545a7>] wait_consider_task+0x705/0xaf6

stack backtrace:
Pid: 1364, comm: mc Not tainted 2.6.35.5-29.fc13.x86_64 #1
Call Trace:
 [<ffffffff8107bda6>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffff81054636>] wait_consider_task+0x794/0xaf6
 [<ffffffff81054aa0>] do_wait+0x108/0x233
 [<ffffffff81054c69>] sys_wait4+0x9e/0xc1
 [<ffffffff814a12f5>] ? retint_swapgs+0x13/0x1b
 [<ffffffff81052f11>] ? child_wait_callback+0x0/0x58
 [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
That is fixed upstream in commit f362b73244fb16ea4ae127ced1467dd8adaa7733. If that's not already queued for 2.6.35-stable, then it probably should be.
Thanks, Roland
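For anyone comparing the backtraces in this thread: each frame has the form `[<address>] symbol+0xoffset/0xsize`, optionally followed by a module name in brackets, and a `?` before the symbol marks an address that was found on the stack but that the unwinder could not confirm as a real caller. A small Python sketch of splitting a trace into those fields follows; the function name `parse_frames` and the dictionary keys are made up for illustration.

```python
import re

# One frame: "[<ffffffff81054636>] ? wait_consider_task+0x794/0xaf6 [module]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frames(trace_text):
    """Return a list of dicts, one per frame found in a kernel backtrace."""
    frames = []
    for m in FRAME_RE.finditer(trace_text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            # Frames printed with "?" are unverified stack contents.
            "reliable": m.group("guess") is None,
            "symbol": m.group("sym"),
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("mod"),  # None for built-in code
        })
    return frames
```

Filtering on `reliable` gives the call chain the unwinder actually trusts, which is usually the quickest way to compare two reports.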
2010/9/23 Chris Wright chrisw@redhat.com:
> * Roland McGrath (roland@redhat.com) wrote:
>> That is fixed upstream in commit f362b73244fb16ea4ae127ced1467dd8adaa7733. If that's not already queued for 2.6.35-stable, then it probably should be.
>
> It is queued for 2.6.35-stable
Ok, thanks.
What about the CFQ problem?
Regards, Michal
On Fri, Sep 24, 2010 at 08:58:59AM +0200, Michał Piotrowski wrote:
> Ok, thanks.
>
> What about the CFQ problem?
I looked at that on Monday, but haven't found it yet.
2010/9/24 Kyle McMartin kyle@mcmartin.ca:
> On Fri, Sep 24, 2010 at 08:58:59AM +0200, Michał Piotrowski wrote:
>> What about the CFQ problem?
>
> I looked at that on Monday, but haven't found it yet.
So it is a known problem. Good to know.
Regards, Michal
On Fri, Sep 24, 2010 at 09:03:00AM -0400, Kyle McMartin wrote:
> On Fri, Sep 24, 2010 at 02:50:35PM +0200, Michał Piotrowski wrote:
>>> I looked at that on Monday, but haven't found it yet.
>>
>> So it is a known problem. Good to know.
>
> Aside from you reporting it, no, I haven't seen anything aside from some mails from Alexey in 2008.
(This one fwiw: http://kerneltrap.org/mailarchive/linux-kernel/2008/5/30/1984184)
2010/9/24 Kyle McMartin kyle@mcmartin.ca:
> Aside from you reporting it, no, I haven't seen anything aside from some mails from Alexey in 2008.
I think the bug appeared while I was copying data from one directory to another (/home/samba4 -> /home/samba5). The two directories are on different disks, and both are mounted with eCryptfs. The first drive has worked flawlessly for almost a year. I added the second disk a few days ago as a backup drive. I'll check the connections. Maybe it's a hardware fault? I checked the disk with badblocks and smartctl before using it, and there were no problems. I do not see anything in dmesg that would indicate a hardware problem.
I'll check the hardware and try to reproduce the error.
> --Kyle
Regards, Michal
On Thu, 23 Sep 2010 13:06:25 +0200 Michał Piotrowski mkkp4x4@gmail.com wrote:
> RIP: 0010:[<ffffffff8122f04f>] [<ffffffff8122f04f>] cfq_free_io_context+0x18/0x34
2010/9/24 Chuck Ebbert cebbert@redhat.com:
> On Thu, 23 Sep 2010 13:06:25 +0200 Michał Piotrowski mkkp4x4@gmail.com wrote:
>> RIP: 0010:[<ffffffff8122f04f>] [<ffffffff8122f04f>] cfq_free_io_context+0x18/0x34
Thanks. If I find a reliable way to reproduce the error, I'll describe it in the bugzilla thread.
Regards, Michal
On 24 September 2010 at 15:28, Michał Piotrowski mkkp4x4@gmail.com wrote:
> 2010/9/24 Chuck Ebbert cebbert@redhat.com:
>> On Thu, 23 Sep 2010 13:06:25 +0200 Michał Piotrowski mkkp4x4@gmail.com wrote:
>>> RIP: 0010:[<ffffffff8122f04f>] [<ffffffff8122f04f>] cfq_free_io_context+0x18/0x34
No luck with this bug so far, but I noticed something else
------------[ cut here ]------------
WARNING: at fs/fs-writeback.c:78 inode_to_bdi+0x62/0x6d()
Hardware name:
Dirtiable inode bdi default != sb bdi ecryptfs
Modules linked in: cbc cryptd aes_x86_64 aes_generic ecb ecryptfs smsc47m192 hwmon_vid coretemp nf_conntrack_netbios_ns ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 iTCO_wdt iTCO_vendor_support i2c_i801 r8169 mii serio_raw sata_sil i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Pid: 1393, comm: mc Not tainted 2.6.35.6-33.rc1.fc13.x86_64 #1
Call Trace:
 [<ffffffff8104d490>] warn_slowpath_common+0x85/0x9d
 [<ffffffff8104d54b>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff81132527>] inode_to_bdi+0x62/0x6d
 [<ffffffff81132f7a>] __mark_inode_dirty+0xc1/0x12b
 [<ffffffffa01d1871>] ecryptfs_write_lower+0x9b/0xae [ecryptfs]
 [<ffffffffa01d2711>] ecryptfs_write_metadata+0x200/0x258 [ecryptfs]
 [<ffffffffa01cf3e4>] ecryptfs_create+0x23b/0x2a7 [ecryptfs]
 [<ffffffff8112048b>] vfs_create+0x70/0x92
 [<ffffffff81121125>] do_last+0x293/0x5b8
 [<ffffffff81122d20>] do_filp_open+0x217/0x5fe
 [<ffffffff812205c3>] ? might_fault+0x21/0x23
 [<ffffffff8112bb72>] ? alloc_fd+0x7b/0x124
 [<ffffffff811157a6>] do_sys_open+0x63/0x10f
 [<ffffffff81115885>] sys_open+0x20/0x22
 [<ffffffff81009c72>] system_call_fastpath+0x16/0x1b
---[ end trace 33fc98bd38e19cc2 ]---
------------[ cut here ]------------
This is 2.6.35.6-33.rc1.fc13.x86_64
After copying 10 files to /home/samba4, there are five such warnings in the log.
Also, I noticed that a simple "grep -R something ." on my tmpfs is a lot faster on this kernel than on 2.6.35.5-29. Namely:
- on 2.6.35.6-33 it takes 2 to 3 seconds
- on 2.6.35.5-29 it took 8 to 10 seconds
Of course, this speedup makes me very happy, as it will shorten the time of the task up to 3 days. But I wonder why it happens: I do not see any changes related to tmpfs.
Regards, Michal
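The "five such warnings for 10 copied files" observation above is easier to track across kernels if the count comes from the log itself. Below is a rough Python sketch, assuming the dmesg output has been saved as text; `count_warnings` is a made-up helper name, and the pattern only covers the `WARNING: at file:line func+0x...` header format seen in this thread.

```python
import re
from collections import Counter

# Header line of a kernel WARN splat, e.g.
# "WARNING: at fs/fs-writeback.c:78 inode_to_bdi+0x62/0x6d()"
WARN_RE = re.compile(
    r"WARNING: at (?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+0x"
)


def count_warnings(dmesg_text):
    """Count WARN splats per (source file, line, function) site."""
    return Counter(
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in WARN_RE.finditer(dmesg_text)
    )
```

Comparing the counter before and after a copy run shows whether every file create triggers the warning or only some of them.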
Hi Jan,
I think that this problem is caused by bdi-fix-warnings-in-__mark_inode_dirty-for-dev-zero-and-friends.patch
Can you take a look at this?
Regards, Michal
kernel@lists.fedoraproject.org