TIP: yum hangs on futex()
Bill McGonigle
bill at bfccomputing.com
Mon Jan 15 18:25:01 EST 2007
Ever since a server locked up a couple of weeks ago, yum has hung
whenever I tried to do anything with it. I decided to tackle it today.
Running strace showed that the hang came right after yum opened the
files in /var/lib/rpm, where it sat waiting on a futex() call.
To make a long story short, it turns out RPM uses a BerkeleyDB and,
as I seem to find in some unique situation every week, BerkeleyDB
doesn't survive a hard reboot if it's in use when such a thing
happens (usually power outages around here...).
The tell-tale sign, in general, is a trail of __db.nnn files, e.g.,
in /var/lib/rpm:
$ ls -l
total 144528
-rw-r--r-- 1 rpm rpm 20131840 Jan 15 14:45 Basenames
-rw-r--r-- 1 rpm rpm 12288 Jan 15 13:52 Conflictname
-rw-r--r-- 1 rpm rpm 9764864 Jan 15 14:45 Dirnames
-rw-r--r-- 1 rpm rpm 20750336 Jan 15 14:45 Filemd5s
-rw-r--r-- 1 rpm rpm 45056 Jan 15 14:45 Group
-rw-r--r-- 1 rpm rpm 36864 Jan 15 14:45 Installtid
-rw-r--r-- 1 rpm rpm 86016 Jan 15 14:45 Name
-rw-r--r-- 1 rpm rpm 104443904 Jan 15 14:45 Packages
-rw-r--r-- 1 rpm rpm 663552 Jan 15 14:45 Providename
-rw-r--r-- 1 rpm rpm 249856 Jan 15 14:45 Provideversion
-rw-r--r-- 1 rpm rpm 12288 Jan 12 14:20 Pubkeys
-rw-r--r-- 1 rpm rpm 897024 Jan 15 14:45 Requirename
-rw-r--r-- 1 rpm rpm 442368 Jan 15 14:45 Requireversion
-rw-r--r-- 1 rpm rpm 315392 Jan 15 14:45 Sha1header
-rw-r--r-- 1 rpm rpm 167936 Jan 15 14:45 Sigmd5
-rw-r--r-- 1 rpm rpm 12288 Jan 15 13:53 Triggername
-rw-r--r-- 1 root root 0 Jan 13 05:06 __db.000
-rw-r--r-- 1 root root 24576 Jan 13 05:04 __db.001
-rw-r--r-- 1 root root 1318912 Jan 13 05:04 __db.002
-rw-r--r-- 1 root root 450560 Jan 13 05:04 __db.003
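A quick way to check a directory for those region files, as a minimal sketch: list_db_regions is a hypothetical helper name, not part of rpm or BerkeleyDB, and the default path is just the one from this post. Finding __db.nnn files does not by itself prove corruption; it only tells you where to look.

```shell
#!/bin/sh
# Hypothetical helper: list_db_regions [DIR]
# Prints any BerkeleyDB region files (__db.NNN) found in DIR, one per line.
# DIR defaults to /var/lib/rpm, the directory discussed in this post.
list_db_regions() {
    dir=${1:-/var/lib/rpm}
    dir=${dir%/}
    for f in "$dir"/__db.*; do
        # If the glob matched nothing, the literal pattern comes back; skip it.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

Calling list_db_regions /var/lib/rpm on the machine above should print the four __db.00x paths shown in the listing.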
To fix this, or any general case of BerkeleyDB corruption:
  1. Kill any processes that look like they're waiting on the rpm database.
  2. cd /var/lib/rpm
  3. db_recover
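The recovery step can be wrapped up roughly as follows. This is a sketch under assumptions, not something from the original fix: recover_bdb is a hypothetical function name, the backup copy is an added precaution, and DB_RECOVER is an assumed override for systems where the tool is installed under a different name. The -h (database home) and -v (verbose) options are standard db_recover flags. Stop any processes still holding the database before running it.

```shell
#!/bin/sh
# Hypothetical helper: recover_bdb [DIR]
# Copies DIR to a timestamped backup, then runs db_recover against DIR.
# DB_RECOVER is an assumed override for the tool's name; it defaults
# to plain db_recover.
recover_bdb() {
    dir=${1:-/var/lib/rpm}
    dir=${dir%/}
    backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
    # Keep a copy first, in case recovery makes things worse.
    cp -Rp "$dir" "$backup" || return 1
    echo "backup: $backup"
    # -h sets the database home directory, -v asks for verbose output.
    "${DB_RECOVER:-db_recover}" -v -h "$dir"
}
```

Run as root, recover_bdb /var/lib/rpm does the same thing as the cd and db_recover steps above, with a copy of the directory left beside it.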
And, lo, yum works again.
Other things that have similarly crapped out on various servers in
the past few weeks for me include postgrey and openldap. The
db_recover trick seems to work most of the time. db_verify rarely
reports anything useful but will reliably segfault on a broken
database. (What's the emoticon for 'bangs-head-on-wall'?)
Good news: it looks like versions of RPM in CVS (please be in Fedora
7...) will use SQLite for a back-end if available.
Plug: Come hear John Harris learn us about SQLite at DLSLUG on Feb. 1st.
-Bill
-----
Bill McGonigle, Owner Work: 603.448.4440
BFC Computing, LLC Home: 603.448.1668
bill at bfccomputing.com Cell: 603.252.2606
http://www.bfccomputing.com/ Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf