server uptime

Mark E. Mallett mem at mv.mv.com
Thu Mar 20 17:42:57 EDT 2008


On Thu, Mar 20, 2008 at 09:46:04AM -0400, Jerry Feldman wrote:
> On Wed, 19 Mar 2008 21:38:52 -0400
> "Mark E. Mallett" <mem at mv.mv.com> wrote:
> 
> > sometimes it's good to reboot a system just to make sure you can.
> 
> That's very old school :-)

thank you :)


> Back in the days where mainframes had the power of my PDA, operating
> systems were somewhat unsophisticated. I ran a data center in San
> Antonio where we ran VM/370 - IBM's virtualization, with the batch os
> (OS/VS1) in one VM, and CMS for online users - data control and
> programmers.  The thinking back then was that if we shut down we might
> never get the system back up, but this was 1950s mentality. But memory
> leaks and things could cause the OS to degrade over time.
> 
> Today, for the most part, the Linux and Unix kernels really do not need
> periodic reboots unless there is a problem. On our BLU mail server
> we've seen that the routing table gets screwed up and is difficult to
> fix. In any case, since nearly every service and driver can be stopped
> and started remotely, the only reason I might want to reboot other than
> a kernel upgrade is that it might be faster to reboot than to try to
> fix an issue. That tends to be more of a Microsoft mentality, though
> Windows Server has become more stable also.

But all of that is completely different from what I said.  I agree that
software can keep running without a reboot.  But as I mentioned,
sometimes a reboot will find something that you can't possibly find by
keeping a system running.  Like some of the things I listed.  My point
is that a planned reboot can help protect you from surprises that you
might learn only from an unplanned reboot.

Note that this is a "do as I say, not as I do" kind of remark.  I never
actually reboot anything for that reason; I just think it's a good reason :)

Relatedly, a group of systems can get into a state where it's hard to
reboot the whole group back into that state.  This can happen when you
build up a collection of services and servers over time but never bring
the whole thing up from scratch.  For example, you might have a system A
that uses a service on system B during its boot process, and vice versa
(although bigger trouble can come from harder-to-find dependency loops
that creep in through some crack in the plans).
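
If you happen to have the boot-time dependencies written down somewhere,
a loop like that is easy to check for mechanically.  Here is a minimal
sketch in Python, assuming you can express "host X needs a service on
host Y while booting" as a plain dictionary.  The function name
find_boot_cycle and the dictionary layout are made up for illustration;
they don't come from any real tool.

```python
def find_boot_cycle(deps):
    """Return one boot dependency loop as a list of hosts, or None.

    deps is a hypothetical mapping: each host maps to the hosts whose
    services it needs while it boots.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []  # hosts on the current depth-first search path

    def visit(host):
        state[host] = IN_PROGRESS
        path.append(host)
        for needed in deps.get(host, ()):
            seen = state.get(needed, UNSEEN)
            if seen == IN_PROGRESS:
                # We came back to a host still on the path: that is a loop.
                return path[path.index(needed):] + [needed]
            if seen == UNSEEN:
                cycle = visit(needed)
                if cycle:
                    return cycle
        path.pop()
        state[host] = DONE
        return None

    for host in deps:
        if state.get(host, UNSEEN) == UNSEEN:
            cycle = visit(host)
            if cycle:
                return cycle
    return None
```

For the A/B case above, find_boot_cycle({"A": ["B"], "B": ["A"]})
returns ["A", "B", "A"].  The hard part in practice is collecting an
accurate dependency list in the first place, which this sketch does not
attempt.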

mm


More information about the gnhlug-discuss mailing list