server uptime

Thu Mar 20 17:42:57 EDT 2008

On Thu, Mar 20, 2008 at 09:46:04AM -0400, Jerry Feldman wrote:
> On Wed, 19 Mar 2008 21:38:52 -0400
> "Mark E. Mallett" <mem at mv.mv.com> wrote:
> 
> > sometimes it's good to reboot a system just to make sure you can.
> 
> That's very old school :-)

thank you :)

> Back in the days where mainframes had the power of my PDA, operating
> systems were somewhat unsophisticated. I ran a data center in San
> Antonio where we ran VM/370 - IBM's virtualization, with the batch os
> (OS/VS1) in one VM, and CMS for online users - data control and
> programmers.  The thinking back then was that if we shut down we might
> never get the system back up, but this was 1950s mentality. But memory
> leaks and things could cause the OS to degrade over time.
> 
> Today, for the most part, the Linux and Unix kernels really do not need
> periodic reboots unless there is a problem. On our BLU mail server
> we've seen that the routing table gets screwed up and is difficult to
> fix. In any case, since nearly every service and driver can be stopped
> and started remotely, the only reason I might want to reboot other than
> a kernel upgrade is that it might be faster to reboot than to try to
> fix an issue. That tends to be more of a Microsoft mentality, though
> Windows Server has become more stable also.

But all of that is completely different from what I said.  I agree that
software can keep running without a reboot.  But as I mentioned,
sometimes a reboot will find something that you can't possibly find by
keeping a system running.  Like some of the things I listed.  My point
is that a planned reboot can help protect you from surprises that you
might learn only from an unplanned reboot.

Note that this is a "do as I say, not as I do" kind of remark.  I never
actually reboot anything for that reason; I just think it's a good reason :)

Relatedly, a group of systems can get into a state where it's hard to
reboot the whole group back into that state.  This can happen when you
build up a collection of services and servers over time, but never from
scratch.  For example, you might have a system A that uses a service on
system B during its boot process, and vice versa (although bigger trouble
can come with harder-to-find dependency loops that creep in through some
crack in the plans).
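
If you happen to have written down which system needs which other system
at boot time, a loop like that can be looked for mechanically.  Here is a
rough sketch, not anything I actually run: the host names and the
`boot_deps` mapping are made up for illustration, and the function
`find_boot_cycle` is just a plain depth-first search for a cycle.

```python
def find_boot_cycle(boot_deps):
    """Return one dependency loop as a list of hosts, or None if there is none.

    boot_deps is a hypothetical mapping: host -> hosts it needs while booting.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []  # hosts on the current chain of "needs" relationships

    def visit(host):
        state[host] = IN_PROGRESS
        path.append(host)
        for dep in boot_deps.get(host, ()):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # dep is already on the current chain, so we have a loop
                return path[path.index(dep):] + [dep]
            if dep_state == UNSEEN:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[host] = DONE
        return None

    for host in list(boot_deps):
        if state.get(host, UNSEEN) == UNSEEN:
            cycle = visit(host)
            if cycle:
                return cycle
    return None


# Made-up example: A needs B, B needs C, C needs A; D just needs A.
boot_deps = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A"],
    "D": ["A"],
}
print(find_boot_cycle(boot_deps))  # ['A', 'B', 'C', 'A']
```

Of course this only finds the loops you have already documented, and the
hard part is that the ones that bite are usually the ones nobody wrote down.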

mm