Truncated log entries

List overview All Threads
Download

newer

older

PT_GNU_STACK, ld.so.cache and java

set/getxattrs - I am badly struck...

Barry Roomberg

27 Oct 2004 27 Oct '04

12:13 p.m.

I'm running Fedora Core 2 Kernel: 2.6.5-1.358 I'm logging activity in a directory (thanks Stephen).

I occasionally get what look like to be truncated log entries such as:

Oct 27 11:24:21 mstoppel1 kernel: audit(1098890661.257:8894633): avc: granted { read } for pid=17834 exed=500 fsuid=500 egid=500 sgid=500 fsgid=500

"exed=500" ???

also: Oct 27 11:26:47 mstoppel1 kernel: =500 fsgid=500

Any idea why? They are rare and interspersed with good entries.

Show replies by date

Stephen Smalley

27 Oct 27 Oct

12:24 p.m.

On Wed, 2004-10-27 at 13:13, Barry Roomberg wrote:

...

I'm running Fedora Core 2 Kernel: 2.6.5-1.358 I'm logging activity in a directory (thanks Stephen).

I occasionally get what look like to be truncated log entries such as:

Oct 27 11:24:21 mstoppel1 kernel: audit(1098890661.257:8894633): avc: granted { read } for pid=17834 exed=500 fsuid=500 egid=500 sgid=500 fsgid=500

"exed=500" ???

also: Oct 27 11:26:47 mstoppel1 kernel: =500 fsgid=500

Any idea why? They are rare and interspersed with good entries.

/me guesses that the kernel audit framework isn't SMP-safe. Is anyone at RedHat looking into this? It was already bugzilla'd by Tom London.

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Peter Martuccelli

29 Oct 29 Oct

11:18 a.m.

On Wed, 2004-10-27 at 13:24, Stephen Smalley wrote:

...

On Wed, 2004-10-27 at 13:13, Barry Roomberg wrote:

...
I'm running Fedora Core 2 Kernel: 2.6.5-1.358 I'm logging activity in a directory (thanks Stephen).

I occasionally get what look like to be truncated log entries such as:

Oct 27 11:24:21 mstoppel1 kernel: audit(1098890661.257:8894633): avc: granted { read } for pid=17834 exed=500 fsuid=500 egid=500 sgid=500 fsgid=500

"exed=500" ???

also: Oct 27 11:26:47 mstoppel1 kernel: =500 fsgid=500

Any idea why? They are rare and interspersed with good entries.

/me guesses that the kernel audit framework isn't SMP-safe. Is anyone at RedHat looking into this? It was already bugzilla'd by Tom London.

I want to verify that people have only seen this issue on SMP systems.

I can work with Dan on getting a test together to try and reproduce the problem.

Regards,

Peter

Stephen Smalley

12:38 p.m.

On Fri, 2004-10-29 at 12:18, Peter Martuccelli wrote:

...

I want to verify that people have only seen this issue on SMP systems.

I can work with Dan on getting a test together to try and reproduce the problem.

A couple of people have reported it on UP as well.

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Tom London

12:53 p.m.

Yeah, me, for example! Running on UP, no hyper, etc.

tom

-- Tom London

Valdis.Kletnieks＠vt.edu

4:01 p.m.

On Fri, 29 Oct 2004 12:18:03 EDT, Peter Martuccelli said:

...

...
/me guesses that the kernel audit framework isn't SMP-safe. Is anyone at RedHat looking into this? It was already bugzilla'd by Tom London.

I want to verify that people have only seen this issue on SMP systems.

I can work with Dan on getting a test together to try and reproduce the problem.

Possibly a variant issue, often seen on my laptop (is a UP with Ingo Molnar's VP patches and PREEMPT defined). Lots of leading blanks. Always show up in pairs, one with 115-120 blanks, followed by a second that has 15-25 more blanks. Both messages are always truncated after the exe= field.

Oct 19 09:25:28 turing-police kernel: audit(1098192328.373:0): avc: denied { unlink } for pid=3110 exe=/sbin/ldconfig name= ld.so.cache dev=dm-5 ino=24601 scontext=root:sysadm_r:ldconfig_t tcontext=root:object_r:etc_t tclass=file Oct 19 09:25:28 turing-police kernel: audit(1098192328.986:0): avc: denied { search } for pid=15579 exe=/usr/bin/dbus-daemo n-1 Oct 19 09:25:28 turing-police kernel: audit(1098192328.986:0): avc: denied { write } for pid=15579 exe=/usr/ bin/dbus-daemon-1

Those were 3 consecutive messages out of the kernel. Might be the issue is 'SMP or PREEMPT'. Two more examples from that day...

Oct 19 09:59:03 turing-police crond(pam_unix)[13653]: session opened for user dshield by (uid=0) Oct 19 09:59:03 turing-police kernel: audit(1098194343.340:0): avc: denied { search } for pid=30651 exe=/usr/sbin/crond Oct 19 09:59:03 turing-police kernel: audit(1098194343.340:0): avc: denied { write } for pid=30651 exe=/usr/sbin/crond ... Oct 19 10:19:37 turing-police ntpd[30628]: sendto(198.82.1.204): Invalid argument Oct 19 10:20:18 turing-police kernel: audit(1098195618.634:0): avc: denied { search } for pid=21753 exe=/usr/sbin/smartd Oct 19 10:20:18 turing-police kernel: audit(1098195618.923:0): avc: denied { write } for pid=21753 exe=/usr/sbin/smartd

Russell Coker

2 Nov 2 Nov

9:16 a.m.

On Sat, 30 Oct 2004 02:18, Peter Martuccelli peterm@redhat.com wrote:

...

I want to verify that people have only seen this issue on SMP systems.

I'm running 2.6.9-1.649 on a single Pentium-M 1700MHz (Thinkpad T41p). I see the truncated errors repeatedly and often. I don't do anything special, just run strict policy and do basic desktop stuff like checking my email. gam_server triggers a good supply of AVC messages.

-- http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark http://www.coker.com.au/postal/ Postal SMTP/POP benchmark http://www.coker.com.au/~russell/ My home page

Steve G

10:31 a.m.

...

I see the truncated errors repeatedly and often. I don't do anything special, just run strict policy and do basic desktop stuff like checking my email.

I was wondering where the code was that actually does the logging. I see an avc_init function call that takes a logger callback function. I was wondering what is being used for the callback. Is it in user space or kernel?

-Steve Grubb

__________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com

Stephen Smalley

11:36 a.m.

On Tue, 2004-11-02 at 11:31, Steve G wrote:

...

...
I see the truncated errors repeatedly and often. I don't do anything special, just run strict policy and do basic desktop stuff like checking my email.

I was wondering where the code was that actually does the logging. I see an avc_init function call that takes a logger callback function. I was wondering what is being used for the callback. Is it in user space or kernel?

The issue has to do with the kernel audit framework (linux-2.6.x/kernel/audit.c), which is called by the kernel AVC (linux-2.6.x/security/selinux/avc.c:avc_audit). Userspace AVC is a port of the kernel AVC to userspace, not relevant to this particular bug.

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Valdis.Kletnieks＠vt.edu

27 Oct 27 Oct

12:24 p.m.

On Wed, 27 Oct 2004 13:13:26 EDT, Barry Roomberg said:

...

"exed=500" ???

Typo in a kprintf? ;)

...

also: Oct 27 11:26:47 mstoppel1 kernel: =500 fsgid=500

Any idea why? They are rare and interspersed with good entries.

Are you perchance on an SMP system (which includes a 1-CPU HT)? There's a few race conditions when processes on both/multiple processors printk() at the same time. Other possibility is a burst of traffic wrapped the kernel syslog buffer before klogd read it. On recent kernels, you can tune how big the buffer is at kernel build time with CONFIG_LOG_BUF_SHIFT (16 for a 64K buffer, 17 for 128K, etc).

Stephen Smalley

12:26 p.m.

On Wed, 2004-10-27 at 13:24, Valdis.Kletnieks@vt.edu wrote:

...

Are you perchance on an SMP system (which includes a 1-CPU HT)? There's a few race conditions when processes on both/multiple processors printk() at the same time. Other possibility is a burst of traffic wrapped the kernel syslog buffer before klogd read it. On recent kernels, you can tune how big the buffer is at kernel build time with CONFIG_LOG_BUF_SHIFT (16 for a 64K buffer, 17 for 128K, etc).

SELinux was migrated from using printk to using the kernel audit framework developed by RedHat a while back. We started getting bug reports about truncated audit messages not long after...

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Tom London

12:54 p.m.

I get this problem on my simple little uniprocessor box....

I may be creating a pattern out of noise, but this problem appears to be weakly correlated with loading a new policy.

I spent a bit of time looking at the code to see if I could find a stale pointer in the audit interface. Didn't see anything...

Has anyone noticed this running the 'boot policy" (i.e., the same policy the system booted with)?

tom

-- Tom London

Stephen Smalley

1:02 p.m.

On Wed, 2004-10-27 at 13:54, Tom London wrote:

...

I get this problem on my simple little uniprocessor box....

I may be creating a pattern out of noise, but this problem appears to be weakly correlated with loading a new policy.

I spent a bit of time looking at the code to see if I could find a stale pointer in the audit interface. Didn't see anything...

Has anyone noticed this running the 'boot policy" (i.e., the same policy the system booted with)?

I actually haven't seen it recently at all, even while after doing massive numbers of policy reloads at the same time as running tests while testing/debugging the RCU scalability work.

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Valdis.Kletnieks＠vt.edu

1:32 p.m.

On Wed, 27 Oct 2004 13:26:35 EDT, Stephen Smalley said:

...

SELinux was migrated from using printk to using the kernel audit framework developed by RedHat a while back. We started getting bug reports about truncated audit messages not long after...

There's this code in kernel/audit.c, in audit_log_drain():

if (!audit_pid) { /* No daemon */ int offset = ab->nlh ? NLMSG_SPACE(0) : 0; int len = skb->len - offset; printk(KERN_ERR "%*.*s\n", len, len, skb->data + offset); }

That len/offset look racy to me. It's called from audit_log_end_fast(), which checks for calls in IRQ context, but I'm not seeing where we do any SMP or PREEMPT locking.

Stephen Smalley

1:34 p.m.

On Wed, 2004-10-27 at 14:32, Valdis.Kletnieks@vt.edu wrote:

...

There's this code in kernel/audit.c, in audit_log_drain():
            if (!audit_pid) { /* No daemon */
                    int offset = ab->nlh ? NLMSG_SPACE(0) : 0;
                    int len    = skb->len - offset;
                    printk(KERN_ERR "%*.*s\n",
                           len, len, skb->data + offset);
            }
That len/offset look racy to me. It's called from audit_log_end_fast(), which checks for calls in IRQ context, but I'm not seeing where we do any SMP or PREEMPT locking.

I think that's ok, as it is acting upon an audit buffer that was necessarily allocated by and only accessible to the same thread (by audit_log_start).

-- Stephen Smalley sds@epoch.ncsc.mil National Security Agency

Tom London

29 Oct 29 Oct

11:55 a.m.

OK.... answered my own question:

This is NOT connected with loading new policies. I just got it running the 'boot policy'.

tom

-- Tom London

7121

Age (days ago)

7127

Last active (days ago)

selinux@lists.fedoraproject.org

15 comments

7 participants

tags (0)

participants (7)

Barry Roomberg
Peter Martuccelli
Russell Coker
Stephen Smalley
Steve G
Tom London
Valdis.Kletnieks＠vt.edu