monitoring CPU temp

Postby stack » Tue Jan 22, 2008 7:44 pm

So I turned on the monitor screen for my server today and saw this message on the login screen:
kernel: CPU0: Temperature above threshold, cpu clock throttled (total events = 3851 )
kernel: CPU0: Temperature / speed normal

The only thing that came to my mind was "what the hell...??" I monitor my hard drive temperatures religiously, but because I can never get CPU temperature monitoring to work right, I never bother with it (with my pick-of-the-litter luck I always seem to get hardware that isn't supported). So I just keep my systems well cooled and always set the BIOS option of "when temp gets above XX degrees C, turn it off".

When I saw this message, I was quite shocked. So I did some digging. Sure enough /var/log/kern.log has some more info (it throttled and then 5 minutes later returned to normal) but nothing about how hot the temperature was nor about where I could get such info. So off to Google. It turns out that the Linux kernel monitors such things for you now (learn something new every day). However, I can't find out where such information would be stored.
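
For anyone wanting to see how often the throttling happened, here is a minimal Python sketch that counts the throttle and return-to-normal messages in a kernel log. It assumes the messages are worded exactly as quoted above and that the log lives at /var/log/kern.log; the function name `summarize_throttling` is made up for illustration, and other kernel versions may phrase the messages differently.

```python
import re
import sys

# Patterns follow the two messages quoted in the post. The exact wording is
# an assumption and may differ between kernel versions.
THROTTLE_RE = re.compile(
    r"kernel: CPU(?P<cpu>\d+): Temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<events>\d+)\s*\)"
)
NORMAL_RE = re.compile(
    r"kernel: CPU(?P<cpu>\d+): Temperature\s*/\s*speed normal"
)


def _entry(summary, cpu):
    """Fetch (or create) the per-CPU counters."""
    return summary.setdefault(
        cpu, {"throttled": 0, "normal": 0, "last_total": 0}
    )


def summarize_throttling(lines):
    """Count throttle / normal messages per CPU in an iterable of log lines.

    Hypothetical helper: returns
    {cpu_number: {"throttled": n, "normal": n, "last_total": n}}
    where last_total is the most recent "total events" value seen.
    """
    summary = {}
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match:
            entry = _entry(summary, int(match.group("cpu")))
            entry["throttled"] += 1
            entry["last_total"] = int(match.group("events"))
            continue
        match = NORMAL_RE.search(line)
        if match:
            _entry(summary, int(match.group("cpu")))["normal"] += 1
    return summary


if __name__ == "__main__":
    # Default path is the one mentioned in the post; pass another as argv[1].
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/kern.log"
    with open(path, errors="replace") as log:
        for cpu, counts in sorted(summarize_throttling(log).items()):
            print(f"CPU{cpu}: {counts}")
```

This only tells you how often the CPU throttled, not how hot it got, since the messages themselves carry no temperature value.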

I dug around in /proc/acpi and it did have some information about the CPU, but nothing about the temperature. The processor is a P4 2.2 on a standard Intel motherboard, and there is nothing special about it as far as I can tell. So I caved and started installing more programs (Hooray! Just what you want to do with your server, kids! Installing random packages helps you clutter your drive, which is important for servers. Remember, "if it's not broke, you're not trying!").

The BIOS sees the fans, knows their speed, and has the three temperature zones listed. It shouldn't be too difficult to just read what it is already capturing, right?

First up, lm-sensors. Did the install, configured and compiled the modules, then modprobe'd the modules and ran it! ...or not...because even though it finds 1 of the 6 fans in the case, it doesn't know what to do with it, and of course my motherboard is not supported for anything else....

Next up, MBmon. This is the one program that I have had the best results with in the past...and once again my motherboard is not recognized.

OK, let's try gkrellm. Can we guess the results? Not supported. Dah.... :x

OK, if the kernel can figure out that the temp is too hot, why can't at least one of the three more popular programs for monitoring CPU temperatures figure it out? But more importantly, does anyone know how to get that info from the kernel? It apparently has figured it out, but it just isn't saying (or I am looking in the wrong spot).

OK, well, it's dinner time, so I will start searching Google again when I get back. In the meantime, if anyone has any ideas, please let me know.

Thanks!
~Stack~
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

Re: monitoring CPU temp

Postby Randy » Tue Jan 22, 2008 9:10 pm

Have you tried running the sensors-detect script, as root, to see what hardware sensors it will detect?

Admittedly, even with the script, it's still kind of hit or miss. You can make a copy of its output and manually load any modules it recommends to test them out before putting the output in /etc/modprobe.conf (or wherever your distro looks for module info).

After you manually load the modules, just type sensors and see what kind of output you get. Should it display the values you seek, you may need to delve further into the correct formulas for computing the thresholds of your system. Look for a sensors.conf file somewhere on your system.
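
As a rough cross-check on what sensors reports, the following Python sketch reads temperature values straight from the kernel's hwmon sysfs files. It assumes the loaded driver exposes `temp*_input` files under /sys/class/hwmon (either directly in the `hwmon*` directory or in its `device` subdirectory) and that the values are in millidegrees Celsius, which is the hwmon convention. The function name `read_hwmon_temps` is made up for illustration, and on an unsupported board the list will simply come back empty.

```python
import glob
import os


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return a list of (path, degrees Celsius) for readable temp inputs.

    Hypothetical helper. Assumes hwmon-style files named temp*_input that
    hold an integer number of millidegrees Celsius.
    """
    patterns = [
        os.path.join(root, "hwmon*", "temp*_input"),
        # Some kernels place the attributes under the device subdirectory.
        os.path.join(root, "hwmon*", "device", "temp*_input"),
    ]
    readings = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as sensor_file:
                    millidegrees = int(sensor_file.read().strip())
            except (OSError, ValueError):
                # Unreadable or non-numeric file: skip it rather than fail.
                continue
            readings.append((path, millidegrees / 1000.0))
    return readings


if __name__ == "__main__":
    temps = read_hwmon_temps()
    if not temps:
        print("No hwmon temperature inputs found (driver not loaded?)")
    for path, celsius in temps:
        print(f"{path}: {celsius:.1f} C")
```

These are raw values, so any scaling formulas from sensors.conf are not applied here.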
-- Randy
Randy
Site Admin
 
Posts: 351
Joined: Mon Feb 13, 2006 9:45 pm
Location: Fort Worth, Texas

Re: monitoring CPU temp

Postby stack » Tue Jan 22, 2008 10:06 pm

Yeah, I ran sensors-detect and it only detected the fan. The module loaded into the kernel just fine with modprobe, but when I run sensors I get an error on reading the fan. But I don't care about the fan :P .

I have done some more research and I can't figure out how the kernel is determining what the temperature is. I did confirm my suspicion that the motherboard is not supported, so these other programs are not going to work :x

If I can figure out what the kernel is seeing, then maybe I can figure out a solution. So far I am not finding much...guess it's time to start reading the code....
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

