
...making Linux just a little more fun!

Tracking load average issues

Neil Youngman [ny at youngman.org.uk]


Wed, 18 Jul 2007 09:27:12 +0100

I was asked to look at a system that had a consistent load average around 5.3 to 5.5. Now I know very little about how to track down load average issues, so I haven't been able to find much. The CPU usage is about 90% idle, so it's not CPU bound.

I googled for "load average", "high load average" and "diagnose load average" and I found very little of use. The one thing I found was that if it's processes stuck waiting on I/O, "ps ax" should show processes in state "D". There are none visible on this box.
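
For anyone who wants to repeat that check, something like this lists any processes currently in state "D" (just a sketch):

 ps axo pid,stat,wchan:25,comm | awk '$2 ~ /D/'   # stat column containing D = uninterruptible sleep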

Do the gang know of any good resources for diagnosing load average issues or have any useful tips?

Neil Youngman




Thomas Adam [thomas at edulinux.homeunix.org]


Wed, 18 Jul 2007 19:27:37 +0100

On Wed, Jul 18, 2007 at 09:27:12AM +0100, Neil Youngman wrote:

> I was asked to look at a system that had a consistent load average around 5.3 
> to 5.5. Now I know very little about how to track down load average issues, 
> so I haven't been able to find much. The CPU usage is about 90% idle, so it's 
> not CPU bound.

Which means it can only be I/O related.

> I googled for "load average", "high load average" and "diagnose load average" 
> and I found very little of use. the one thing I found was that if it's 
> processes stuck waiting on I/O "ps ax" should show processes in state "D". 
> There are none visible on this box.

No, that's completely bogus. All active processes can be I/O bound -- look at communication over sockets and pipes, for example.

> Do the gang know of any good resources for diagnosing load average issues or 
> have any useful tips?

I sometimes help diagnose this sort of thing at work, and it can be tricky. I suppose you could look at some of your processes and attach strace to them to see what they're up to, but only you will know which processes are worth looking at.
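
Something along these lines, say, where PID is a suspect process ID and the output file is whatever you like (just a sketch):

 strace -f -p PID -o /tmp/suspect.strace   # follow children, log every syscall to a file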

Sometimes a high load average can be hardware-related as well, such as having APIC turned on in the BIOS. You could also look at lsof to make sure nothing out of the ordinary is left open, etc. But at that point, you're really clutching at straws.

-- 
Thomas Adam
"He wants you back, he screams into the night air, like a fireman going
through a window that has no fire." -- Mike Myers, "This Poem Sucks".



Jim Jackson [jj at franjam.org.uk]


Wed, 18 Jul 2007 22:19:05 +0100 (BST)

On Wed, 18 Jul 2007, Neil Youngman wrote:

> I was asked to look at a system that had a consistent load average around 5.3
> to 5.5. Now I know very little about how to track down load average issues,
> so I haven't been able to find much. The CPU usage is about 90% idle, so it's
> not CPU bound.

Have you run the "top" command? It shows which processes are most active. I believe there are such things as gtop, ktop? etc. for those suffering command line timidity^H^H^H^H^H^H^H^Htemerity (timidity is a MIDI application :-).

A load average of 5 means that there were, on average, 5 processes ready to run for the whole of the averaging period - the usual averaging periods are 1 min, 5 min and 15 min.
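
You can read the raw figures straight from the kernel too, e.g.

 cat /proc/loadavg   # the 1, 5 and 15 minute averages, runnable/total tasks, and the last PID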




René Pfeiffer [lynx at luchs.at]


Wed, 18 Jul 2007 23:57:54 +0200

On Jul 18, 2007 at 0927 +0100, Neil Youngman appeared and said:

> I was asked to look at a system that had a consistent load average around 5.3
> to 5.5. Now I know very little about how to track down load average issues,
> so I haven't been able to find much. The CPU usage is about 90% idle, so it's
> not CPU bound.

"htop" is a nice tool for browsing the process table. It has some nice features and a better screen output than "top". Modern "procinfo" variants can also distinguish between user, nice, system, waiting for I/O, interrupt and idle time. This helps to get a better view at the load.

Thomas already noticed that you are probably I/O bound on this machine. Look at everything that shuffles or waits for data: databases, network applications, file processing. "iostat", "netstat" and "vmstat" can help you see I/O rates.
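
For example (just a rough sketch):

 vmstat 5      # watch the "b" (blocked) and "wa" (I/O wait) columns
 iostat -x 5   # extended per-device statistics, including utilisation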

If this system is attached to a monitoring system (such as Nagios or Munin, http://munin.sourceforge.net/), you might also get an impression of what the system is doing.

Best wishes, René.




Neil Youngman [ny at youngman.org.uk]


Thu, 19 Jul 2007 09:09:04 +0100

On or around Wednesday 18 July 2007 22:19, Jim Jackson reorganised a bunch of electrons to form the message:

> On Wed, 18 Jul 2007, Neil Youngman wrote:
> > I was asked to look at a system that had a consistent load average around
> > 5.3 to 5.5. Now I know very little about how to track down load average
> > issues, so I haven't been able to find much. The CPU usage is about 90%
> > idle, so it's not CPU bound.
>
> Have you run the "top" command? It shows which processes are most active.
> I believe there are such things as gtop, ktop? etc for the those suffering
> command line timidity^H^H^H^H^H^H^H^Htemerity (timidity is a midi
> application :-).

Yes, top was the first thing I ran, which is where I got the 90% idle figure.

> A load ave of 5 means that there were on ave 5 processes read to run for
> the whole of the averaging period - the usual ave periods are 1min 5min
> and 15min.

That much I found, although apparently "ready to run" includes waiting for I/O; otherwise they would be soaking up that 90% CPU idle time. What I haven't found yet is a way to tell which processes are waiting on I/O, and why.

Neil




Jim Jackson [jj at franjam.org.uk]


Thu, 19 Jul 2007 22:57:07 +0100 (BST)

On Thu, 19 Jul 2007, Neil Youngman wrote:

> On or around Wednesday 18 July 2007 22:19, Jim Jackson reorganised a bunch of
> electrons to form the message:
> > On Wed, 18 Jul 2007, Neil Youngman wrote:
> > > I was asked to look at a system that had a consistent load average around
> > > 5.3 to 5.5. Now I know very little about how to track down load average
> > > issues, so I haven't been able to find much. The CPU usage is about 90%
> > > idle, so it's not CPU bound.
> >
> > Have you run the "top" command? It shows which processes are most active.
> > I believe there are such things as gtop, ktop? etc for the those suffering
> > command line timidity^H^H^H^H^H^H^H^Htemerity (timidity is a midi
> > application :-).
>
> Yes, top was the first thing I ran, which is where I got the 90% idle figure.
>
> > A load ave of 5 means that there were on ave 5 processes read to run for
> > the whole of the averaging period - the usual ave periods are 1min 5min
> > and 15min.
>
> That much I found, although apparently "ready to run" includes waiting for
> I/O,

It sort of depends on how the process is "waiting" for I/O. Done the sensible way, the process should be sleeping until the I/O completes, i.e. doing a blocking read or using select() or similar. However, badly designed code spinning on a non-blocking read could possibly account for it.

>From the ps manual page
      PROCESS STATE CODES
       D   uninterruptible sleep (usually IO)
       R   runnable (on run queue)
       S   sleeping
       T   traced or stopped
       Z   a defunct ("zombie") process
If the process is runnable then the userland code is ready to run, and that is what load measures.

> otherwise they would soak up that 90% CPU idle time. What I haven't
> found yet is a way to tell which processes are waiting on I/O and why?

It does indeed seem strange. Every time I've investigated high persistent loads there have been obvious culprits.

>From top get the process numbers of likely suspect and watch them with suitable
 while true do
  ps uax | grep PID
  sleep 1
 done
to see what they are doing. You may need to customise the ps flags.




Thomas Adam [thomas at edulinux.homeunix.org]


Thu, 19 Jul 2007 23:01:45 +0100

On Thu, Jul 19, 2007 at 10:57:07PM +0100, Jim Jackson wrote:

> It sort of depends on how the process is "waiting" for i/o. Doing it the
> sensible way and the process should be sleeping untill i/o, i.e. doing a
> blocking read or using select or similar. However bad design spinning on a
> non blocking read would possibly account for it.

Maybe, but that's going slightly in the wrong direction. If Neil can confirm whether or not this is a persistent issue, I suspect it's going to be hardware-related, and not software I/O (the mark of a suspect program, for instance). In which case, trying the noapic suggestion both in the kernel and in the BIOS (as I suggested) is still worthwhile.

>  while true do
>   ps uax | grep PID
>   sleep 1
>  done

Let's do this properly using watch(1), please.
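
For instance (with PID standing in for the process in question):

 watch -n 1 'ps uax | grep PID'   # refresh the listing every second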

-- 
Thomas Adam
"He wants you back, he screams into the night air, like a fireman going
through a window that has no fire." -- Mike Myers, "This Poem Sucks".



Mulyadi Santosa [mulyadi.santosa at gmail.com]


Fri, 20 Jul 2007 12:38:55 +0700

Hi Neil...

> That much I found, although apparently "ready to run" includes waiting for
> I/O, otherwise they would soak up that 90% CPU idle time. What I haven't
> found yet is a way to tell which processes are waiting on I/O and why?

I don't know if compiling a kernel is possible in your environment, but if you can, maybe blktrace can help you. You can find it here: http://git.kernel.dk/?p=blktrace.git;a=summary

I don't know how to fetch a git repository, so the best I can do is point you at the tool. You also need to turn on a kernel config option... I forget the exact name; try grepping for "block trace" in your kernel config.
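
Once it is built and the kernel option is enabled, a typical invocation looks something like this (a sketch, assuming /dev/sda is the busy disk):

 blktrace -d /dev/sda -o - | blkparse -i -   # trace block I/O on sda and parse it live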

Another thing you can try is SystemTap. However, that also needs kprobes support. What you need to hook is probably any filesystem-related operation (read, write, readahead, flush and so on).
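
As a very rough sketch of the idea (assuming stap and the kernel debug info are in place), something like this prints every VFS read along with the process doing it:

 stap -e 'probe vfs.read { printf("%s (%d) read\n", execname(), pid()) }'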

Going at it the other way: you said the load is about 5? IIRC, that means on average there are 5-6 processes in the run queue (on Linux this also counts tasks in uninterruptible sleep), and they are likely sleeping or waiting on something. If you can confirm via iostat that it's indeed block device access, then at least you have a lead. If not, it could be anything: busy-waiting on a socket, a busy ping-pong between Unix domain sockets, etc.

I think what you can do now is attach strace to every suspicious process ID. It takes time, but the result could be more satisfying than just relying on statistics.
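
For example, to get a per-syscall time summary for one suspect (PID being a placeholder):

 strace -c -p PID   # press Ctrl-C to stop and print the summary table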

regards,

Mulyadi




Neil Youngman [ny at youngman.org.uk]


Fri, 20 Jul 2007 10:53:07 +0100

On or around Thursday 19 July 2007 23:01, Thomas Adam reorganised a bunch of electrons to form the message:

> On Thu, Jul 19, 2007 at 10:57:07PM +0100, Jim Jackson wrote:
> > It sort of depends on how the process is "waiting" for i/o. Doing it
> > the sensible way and the process should be sleeping untill i/o, i.e.
> > doing a blocking read or using select or similar. However bad design
> > spinning on a non blocking read would possibly account for it.
>
> Maybe, but that's slightly going in the wrong direction.  I suspect if Neil
> can confirm if this is a persistent issue or not, that it's going to be
> hardware-related, and not software I/O (the mark of a suspect program, for
> instance.)  In which case, going with the noapic suggestion both in the
> kernel and in the BIOS (as I suggested) is still something worth trying.

It's intermittent rather than persistent. The affected servers will run with a load average around 0.3 normally for weeks, then the load average will ramp up and settle at a relatively high level for an hour or so before returning to normal.

There is no obvious degradation of service, so we're not panicking about it, but we are wondering if the systems are trying to tell us something.

These aren't systems on which I can easily mess with kernel and BIOS options, but I'll look into whether we can do something with the NOAPIC option.

Neil




Kapil Hari Paranjape [kapil at imsc.res.in]


Fri, 20 Jul 2007 16:03:06 +0530

Hello,

On Fri, 20 Jul 2007, Neil Youngman wrote:

> It's intermittent rather than persistent. The affected servers will run with a 
> load average around 0.3 normally for weeks, then the load average will ramp 
> up and settle at a relatively high level for an hour or so before returning 
> to normal. 
> 
> There is no obvious degradation of service, so we're not panicking about it, 
> but we are wondering if the systems are trying to tell us something.

This seems to point to something like a cron job which raises the load but is "nice" about it.

I suppose you have looked at the output of ps and tried to find processes that make a lot of use of I/O resources.

One thing that does use I/O resources extensively, but does not otherwise load the system, is any program that uses "polling" to get interactive input. One non-interactive example is a PPP dialer. Other expect-send routines are equally suspect.

Until recently I used "cat /dev/xconsole" to get log messages in one of my "screen" windows. This tends to use up significantly more I/O resources than "inotail -f /dev/xconsole".

I suppose you can also look for processes that use enough memory to cause significant swapping.

Regards,

Kapil. --




Raj Shekhar [rajlist2 at rajshekhar.net]


Fri, 27 Jul 2007 20:40:28 +0530

in infinite wisdom Neil Youngman spoke thus On 07/18/2007 01:57 PM:

> I googled for "load average", "high load average" and "diagnose load average" 
> and I found very little of use. the one thing I found was that if it's 
> processes stuck waiting on I/O "ps ax" should show processes in state "D". 
> There are none visible on this box.

I doubt you can fix it without any monitoring. There are lots of lightweight monitoring scripts that you can use (I use Nagios, but I can't remember the names of any lightweight scripts off the top of my head; Nagios would be overkill for you). In generic terms, what you need to do is:

- install a monitoring script

- all monitoring systems have hooks that allow you to insert your own monitoring scripts

- monitor the system load - the one-liner below should give you the load average for the past minute

uptime | perl -lane 'if (m/.+ (load average: )(.+), (.+), (.+)/) {print $2}'

- when the number goes above UPPER-LIMIT, do a 'ps auxww >> LOG_FILE' from your monitoring script itself (where UPPER-LIMIT and LOG_FILE are values you supply)

- study the log file deeply to find a pattern.

- fix the problem

You can also use sar to monitor system load. sar is part of the sysstat package, and you need to enable it explicitly through cron (see man sar, sadc, and sa1 for more details). Once you have sar running, check its data to see if you can find a pattern in the spikes, and then dig further.
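
For example (a sketch):

 sar -q 60 10   # run-queue length and load averages, sampled every 60 seconds, ten times
 sar -q         # or look back over the data already collected today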

-- 
raj shekhar
facts: http://rajshekhar.net | opinions: http://rajshekhar.net/blog
I dare do all that may become a man; Who dares do more is none.



Ben Okopnik [ben at linuxgazette.net]


Fri, 27 Jul 2007 14:10:00 -0400

On Fri, Jul 27, 2007 at 08:40:28PM +0530, Raj Shekhar wrote:

> in infinite wisdom Neil Youngman spoke thus  On 07/18/2007 01:57 PM:
> 
> > I googled for "load average", "high load average" and "diagnose load average" 
> > and I found very little of use. the one thing I found was that if it's 
> > processes stuck waiting on I/O "ps ax" should show processes in state "D". 
> > There are none visible on this box.
> 
> I doubt you can fix it without any monitoring.  There are lots of light 
> weight monitoring scripts that you can use (I use nagios, and I cannot 
> remember any names for light weight scripts from the top of my head. 
> Nagios would be an overkill for you).  In generic terms, what you need 
> to do is
>   - install a monitoring script
>   - all monitoring systems have hooks that allow you to insert your own 
> monitoring scripts
>   - monitor for system load - the bash oneliner should give you the 
> system load for the past 1 minute
> "
> uptime  |perl -lane 'if (m/.+ (load average: )(.+), (.+), (.+)/) {print $2}'
> "
>   - when the number goes above UPPER-LIMIT, then do a 'ps auxww >> 
> LOG_FILE' from your monitoring script itself (where UPPER-LIMIT and 
> LOG_FILE are your user supplied values)

You could easily combine the two tasks and automate them, perhaps by running a cron job. Or, you could wrap a '{ ...; sleep 10; redo }' construct around the statements below to get snapshots at 10-second intervals.

perl -we'`uptime`=~/average: ([^,]+)/;system "ps auxww>>LOG_FILE" if $1>$LIMIT'

Obviously, you'll want to modify the name of the log file and set $LIMIT (or use an explicit value).

>   - study the log file deeply to find a pattern.

I think that this is the part that Neil was asking about, really. :)

>   - fix the problem

Once the above is done, that's most likely trivial.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
