I'm running 3.0-0.rc6.git6.1.fc16.x86_64 on a new sandy bridge laptop on updated f15 (except for kernel, procps and mdadm from rawhide).
(I'll put rc7 on once the build is done - thanks kyle!)
The message I get (temps certainly seem normal):
------------- /var/log/messages -----------------------------
Jul 12 08:39:43 lap3 mcelog[1001]: HARDWARE ERROR. This is *NOT* a software problem!
Jul 12 08:39:43 lap3 mcelog[1001]: Please contact your hardware vendor
Jul 12 08:39:43 lap3 mcelog[1001]: MCE 11
Jul 12 08:39:43 lap3 mcelog[1001]: CPU 6 THERMAL EVENT TSC 985befbeec
Jul 12 08:39:43 lap3 mcelog[1001]: TIME 1310474383 Tue Jul 12 08:39:43 2011
Jul 12 08:39:43 lap3 mcelog[1001]: Processor 6 below trip temperature. Throttling disabled
Jul 12 08:39:43 lap3 mcelog[1001]: STATUS c000000088400c08 MCGSTATUS 0
Jul 12 08:39:43 lap3 mcelog[1001]: MCGCAP c09 APICID 6 SOCKETID 0
Jul 12 08:39:43 lap3 mcelog[1001]: CPUID Vendor Intel Family 6 Model 42
Jul 12 08:39:43 lap3 mcelog[1001]: mcelog: Unsupported new Family 6 Model 2a CPU: only decoding architectural err
------------- ver_linux --------------------------------
Linux lap3.prv.sapience.com 3.0-0.rc6.git6.1.fc16.x86_64 #1 SMP Sun Jul 10 16:00:07 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Gnu C                  4.6.0
Gnu make               3.82
binutils               2.21.51.0.6
util-linux             2.19.1
mount                  support
module-init-tools      3.16
e2fsprogs              1.41.14
jfsutils               1.1.13
xfsprogs               3.1.4
pcmciautils            017
quota-tools            4.00-pre1.
PPP                    2.4.5
isdn4k-utils           3.13
Linux C Library        2.14
Dynamic linker (ldd)   2.14
Procps                 3.2.8
Net-tools              1.60
Kbd                    1.15.2
oprofile               0.9.6
Sh-utils               8.10
wireless-tools         29
Modules Loaded         tcp_lp vboxnetadp vboxnetflt vboxdrv hidp fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc ppdev parport_pc lp parport sunrpc capi kernelcapi rfcomm bnep ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xts gf128mul dm_crypt arc4 uvcvideo videodev media v4l2_compat_ioctl32 btusb bluetooth snd_hda_codec_conexant microcode joydev i2c_i801 snd_hda_intel snd_hda_codec snd_hwdep iwlagn snd_seq snd_seq_device mac80211 snd_pcm cfg80211 xhci_hcd iTCO_wdt snd_timer iTCO_vendor_support e1000e snd_page_alloc thinkpad_acpi rfkill snd soundcore virtio_net kvm_intel kvm firewire_ohci sdhci_pci sdhci firewire_core mmc_core crc_itu_t wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video
Hi,
2011/7/12 Genes MailLists lists@sapience.com:
I'm running 3.0-0.rc6.git6.1.fc16.x86_64 on a new sandy bridge laptop on updated f15 (except for kernel, procps and mdadm from rawhide).
AFAIK it can be changed somehow http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Do...
_______________________________________________
kernel mailing list
kernel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/kernel
On 07/12/2011 09:14 AM, Michał Piotrowski wrote:
AFAIK it can be changed somehow http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Do...
Seems so - hopefully those who understand this better can fix it for sandy bridge chipsets - which I assume is the source of the problem.
Obviously my laptop was not so cold as to be under thermal specs :D
Still have same error in latest kernel build:
rc7.git10
Jul 21 23:30:28 lap3 mcelog[937]: HARDWARE ERROR. This is *NOT* a software problem!
Jul 21 23:30:28 lap3 mcelog[937]: Please contact your hardware vendor
Jul 21 23:30:28 lap3 mcelog[937]: MCE 15
Jul 21 23:30:28 lap3 mcelog[937]: CPU 7 THERMAL EVENT TSC 7f14a833e3
Jul 21 23:30:28 lap3 mcelog[937]: TIME 1311305428 Thu Jul 21 23:30:28 2011
Jul 21 23:30:28 lap3 mcelog[937]: Processor 7 below trip temperature. Throttling disabled
Jul 21 23:30:28 lap3 mcelog[937]: STATUS c000000088230800 MCGSTATUS 0
Jul 21 23:30:28 lap3 mcelog[937]: MCGCAP c09 APICID 7 SOCKETID 0
Jul 21 23:30:28 lap3 mcelog[937]: CPUID Vendor Intel Family 6 Model 42
Jul 21 23:31:36 lap3 pulseaudio[2281]: ratelimit.c: 16 events suppressed
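Since the event shows up on different CPUs across reports (CPU 6 earlier, CPU 7 here), it may help to tally the records per CPU before and after a kernel or BIOS change. A minimal sketch in Python, assuming the syslog layout shown in the excerpts above; the function name `count_thermal_events` is made up for illustration and is not part of mcelog:

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Jul 21 23:30:28 lap3 mcelog[937]: CPU 7 THERMAL EVENT TSC 7f14a833e3
# The layout is taken from the log excerpts in this thread; other mcelog
# versions may format the record differently.
THERMAL_RE = re.compile(r"mcelog\[\d+\]: CPU (\d+) THERMAL EVENT TSC ([0-9a-f]+)")


def count_thermal_events(lines):
    """Return a Counter mapping CPU number -> number of THERMAL EVENT records."""
    counts = Counter()
    for line in lines:
        match = THERMAL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

The function accepts any iterable of lines, so an open file object for /var/log/messages (readable by root on most setups) can be passed in directly.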
On Thu, Jul 21, 2011 at 11:36:17PM -0400, Genes MailLists wrote:
Still have same error in latest kernel build:
rc7.git10
If you are running a Sandy Bridge laptop, there is probably a BIOS update that might fix this. I don't think Sandy Bridge processors have been released to the general public yet.
We were chasing broken MCEs in Nehalem that reported bad memory all the time. Various kernel hacks worked around it until a BIOS fix came along.
I wouldn't be surprised if this is just another broken MCE, especially if the laptop doesn't feel hot.
Cheers, Don
On 07/22/2011 09:14 AM, Don Zickus wrote:
Jul 21 23:30:28 lap3 mcelog[937]: Processor 7 below trip temperature.
^^^^^^^^
If you are running a Sandy Bridge laptop, there is probably a BIOS update that might fix this. I don't think Sandy Bridge processors have been released to the general public yet.
We were chasing broken MCEs in Nehalem that reported bad memory all the time. Various kernel hacks worked around it until a BIOS fix came along.
I wouldn't be surprised if this is just another broken MCE, especially if the laptop doesn't feel hot.
It is indeed a Sandy Bridge laptop - and it is most definitely available to the public - I bought it a couple of months ago from the (public) Lenovo website :-)
The complaint is that it is too cold, funnily enough ... and no, it does not feel hot either.
I'll check for a BIOS update - but last I checked there wasn't one - I wish it was easy to do a BIOS update on this in Linux :-(
gene/
On 07/22/2011 09:27 AM, Genes MailLists wrote:
Processor 7 below trip temperature
ISTR seeing some mention of problems with X & MCE messages on Sandy Bridge. The first thing to try is reverting this patch in your kernel: linux-2.6 commit ccab5c82759e2ace74b2e84f82d1e0eedd932571.
Some of the Dell R910 servers had this weird error too. It was caused by some bogus MCEs being kicked during C1E and C-state transitions BUT ... that was a different processor (Westmere or Nehalem IIRC).
In any case this isn't a critical situation -- your laptop keeps running and is warning you that it thinks something is wrong. Can you try disabling C1E and C-state transitions in your BIOS?
P.
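Before hunting through BIOS menus, it can be useful to see which idle states the kernel currently exposes. A rough sketch in Python, assuming the kernel provides the cpuidle sysfs directory under /sys/devices/system/cpu/cpuN/cpuidle; the helper name `list_idle_states` is hypothetical, and the state names depend on the idle driver in use:

```python
from pathlib import Path


def list_idle_states(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return [(state_dir_name, state_name), ...] for one CPU.

    Reads cpuidle/stateN/name under cpu_dir. Returns an empty list when
    the cpuidle directory is missing (for example, no cpuidle driver).
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    state_dirs = [p for p in cpuidle.glob("state[0-9]*") if p.name[5:].isdigit()]
    states = []
    # Sort numerically so that state10 comes after state9.
    for state_dir in sorted(state_dirs, key=lambda p: int(p.name[5:])):
        name_file = state_dir / "name"
        if name_file.is_file():
            states.append((state_dir.name, name_file.read_text().strip()))
    return states
```

Calling `list_idle_states()` before and after changing the BIOS setting gives a quick check of whether the deeper states actually disappeared from the kernel's point of view.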
On 07/22/2011 10:19 AM, Prarit Bhargava wrote:
On 07/22/2011 09:27 AM, Genes MailLists wrote:
Processor 7 below trip temperature
ISTR seeing some mention of problems with X & MCE messages on Sandy Bridge. The first thing to try is reverting this patch in your kernel: linux-2.6 commit ccab5c82759e2ace74b2e84f82d1e0eedd932571.
Some of the Dell R910 laptops had this weird error too. It was caused by some bogus MCEs being kicked during C1E and C-state transitions BUT ... that was a different processor (Westmere or Nehalem IIRC).
In any case this isn't a critical situation -- your laptop keeps running and is warning you that it thinks something is wrong. Can you try disabling C1E and C-state transitions in your BIOS?
P.
Forgot to check BIOS settings - would they be labeled as such (C1E / C-state) ?
Also, after updating the BIOS I see no issues - but I am not actually sure they occur until after a sleep - so I'll keep the list posted.
thanks for your help.
gene
On 07/25/2011 10:43 PM, Genes MailLists wrote:
Forgot to check BIOS settings - would they be labeled as such (C1E / C-state) ?
Yeah, something like that. It might be labeled as just "Processor Power State" or something too.
Also after updating bios i see no issues - but I am actually not sure they occur until after a sleep - so I'll keep the list posted.
Try backing out the commit above.
P.
thanks for your help.
gene
On 07/22/2011 09:27 AM, Genes MailLists wrote:
On 07/22/2011 09:14 AM, Don Zickus wrote:
Jul 21 23:30:28 lap3 mcelog[937]: Processor 7 below trip temperature.
^^^^^^^^
If you are running a Sandy Bridge laptop, there is probably a BIOS update that might fix this. I don't think Sandy Bridge processors have been released to the general public yet.
BIOS updated - will report back if I get any more errors ...
On 07/25/2011 10:39 PM, Genes MailLists wrote:
Bios updated - will report back if get any more errors ...
Problem is still there - I note a new message about Unsupported new Family :
Jul 25 23:25:59 lap3 mcelog[941]: HARDWARE ERROR. This is *NOT* a software problem!
Jul 25 23:25:59 lap3 mcelog[941]: Please contact your hardware vendor
Jul 25 23:25:59 lap3 mcelog[941]: MCE 14
Jul 25 23:25:59 lap3 mcelog[941]: CPU 6 THERMAL EVENT TSC 8bcbc2a2def
Jul 25 23:25:59 lap3 mcelog[941]: TIME 1311650759 Mon Jul 25 23:25:59 2011
Jul 25 23:25:59 lap3 mcelog[941]: Processor 6 below trip temperature. Throttling disabled
Jul 25 23:25:59 lap3 mcelog[941]: STATUS c000000088300808 MCGSTATUS 0
Jul 25 23:25:59 lap3 mcelog[941]: MCGCAP c09 APICID 6 SOCKETID 0
Jul 25 23:25:59 lap3 mcelog[941]: CPUID Vendor Intel Family 6 Model 42
Jul 25 23:25:59 lap3 mcelog[941]: mcelog: Unsupported new Family 6 Model 2a CPU: only decoding architectural errors
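Two fields in this record are easy to misread. "Model 42" and "Model 2a" are the same CPU model, printed once in decimal and once in hex, and the TIME field is plain Unix epoch seconds. A small Python sketch that checks both; the helper names are made up for illustration, and the conversion assumes nothing beyond the values printed in the log:

```python
from datetime import datetime, timezone


def model_as_hex(decimal_model):
    """Render a decimal CPUID model number the way the 'Model 2a' line prints it."""
    return format(decimal_model, "x")


def decode_time(epoch_seconds):
    """Convert the TIME field (Unix epoch seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


print(model_as_hex(42))         # hex form of the decimal model number
print(decode_time(1311650759))  # TIME field from the record above, in UTC
```

The decoded time comes out in UTC, four hours ahead of the local syslog timestamp, which is consistent with the -0400 offset seen in the mail headers quoted earlier in the thread.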
On 07/25/2011 11:29 PM, Genes MailLists wrote:
On 07/25/2011 10:39 PM, Genes MailLists wrote:
Bios updated - will report back if get any more errors ...
Problem is still there - I note a new message about Unsupported new Family :
Forgot to mention this is 3.0 kernel now ...