Link atomicity [was Re: NFS Question]
Derek D. Martin
ddm+gnhlug at pizzashack.org
Fri Aug 30 00:07:05 EDT 2002
At some point hitherto, jim.mcginness at att.net hath spake thusly:
[using link(2) to create a lock]
> I have always told people who need portability and aren't looking for
> high-performance, fine-grained locks that they should use this sort of
> link mechanism. It's definitely atomic and you can implement it in shell
> scripts.
This is also the method favored by WR Stevens...
> I've never heard of the suggestion that it would be okay to
> consider the link successful if the link call returned failure but the
> link count, on a subsequent stat, was 2. That sounds like bad advice.
I hadn't either, and I agree. However, that is what the man page
says... ;-) I suppose it's not impossible that at some point,
link(2) could succeed but still return non-zero. However, at least
according to the man pages on my systems, that doesn't seem to be the
case.
In any event, if the link count has increased to 2, that may mean that
someone else has linked the file. To assume that the link count being
raised to 2 means your lock was successful after a non-zero return
from link(2) would seem folly to me...
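
To make that concrete, here is a minimal sketch of the conservative
approach in Python. The function name try_lock and the way the unique
file is named are my own assumptions, not anything taken from Stevens or
the man page. It treats a clean return from link(2) (that is, os.link()
not raising) as the only evidence of success, and never consults the
link count:

```python
import os
import socket
import tempfile


def try_lock(lockfile):
    """Return True if we created lockfile via link(2), else False."""
    lockdir = os.path.dirname(os.path.abspath(lockfile))
    # The unique file must live on the same filesystem as the lockfile,
    # because link(2) cannot cross filesystems.
    prefix = "lock.%s.%d." % (socket.gethostname(), os.getpid())
    fd, unique = tempfile.mkstemp(prefix=prefix, dir=lockdir)
    os.close(fd)
    try:
        # os.link() wraps link(2); it raises (EEXIST) if lockfile exists.
        os.link(unique, lockfile)
        return True
    except OSError:
        # Any failure counts as "no lock"; we do not fall back to stat().
        return False
    finally:
        # The unique name is no longer needed either way.
        os.unlink(unique)
```

The caller removes the lockfile itself when it is done, and only if
try_lock() returned True.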
> > However, kernel developers at MCL have told me that because NFS by
> > default uses asynchronous I/O, this also contains a race condition.
>
> I don't see this.
Well I think the case that maddog raised was an example. If the
server crashes before the operation is committed on both ends, you
could have a problem. IANAKH (I am not a kernel hacker)!
> There would of course be a race condition if a process other than
> the lock owner, the process that succeeded in creating the link,
> were to unlink the lockfile. The removal could arrive immediately
> after a successful link attempt.
Good point! However, this I think is more a programming error than an
actual race condition. The program should only remove the lock file
if it actually had a lock; to do otherwise is a logic error. If it
did have a lock, then no other program would be able to take the lock,
and thus should not be trying to remove the lock.
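
One way to guard against that kind of logic error, sketched in Python
under my own assumptions: the names acquire and release, and the idea of
writing an owner token (say, "hostname:pid") into the lock file, are
hypothetical and not part of the scheme discussed above. The lock is
removed only if the token stored in it matches the caller's:

```python
import os
import tempfile


def acquire(lockfile, token):
    """Create lockfile via link(2), recording token as the owner."""
    lockdir = os.path.dirname(os.path.abspath(lockfile))
    fd, unique = tempfile.mkstemp(dir=lockdir)
    with os.fdopen(fd, "w") as f:
        f.write(token)
    try:
        os.link(unique, lockfile)  # raises if lockfile already exists
        return True
    except OSError:
        return False
    finally:
        os.unlink(unique)


def release(lockfile, token):
    """Remove lockfile only if it records our token; return True if removed."""
    try:
        with open(lockfile) as f:
            owner = f.read()
    except FileNotFoundError:
        return False
    if owner != token:
        return False  # someone else's lock; leave it alone
    os.unlink(lockfile)
    return True
```

This only helps cooperating programs avoid their own mistakes; it does
nothing about a process that unlinks the file without checking.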
Ugh, my head hurts, like trying to do proofs in geometry. "Assume
file A successfully took a lock..." =8^)
> But perhaps what they were referring to is that changes to other
> files during the interval while the lockfile is held would not
> necessarily be committed to disk at the point when the lockfile
> is removed.
It was a discussion we had about a year ago; I don't remember the
details. I could try to find out, if you're really interested. =8^)
--
Derek Martin ddm at pizzashack.org
---------------------------------------------
I prefer mail encrypted with PGP/GPG!
GnuPG Key ID: 0x81CFE75D
Retrieve my public key at http://pgp.mit.edu
Learn more about it at http://www.gnupg.org