NFS stops responding
Michael ODonnell
michael.odonnell at comcast.net
Fri Apr 2 11:03:07 EDT 2010
> The client isn't seeing the replies? Blame the router, blame
> the router!
Heh. I'd love to, and I just acquired a brand new switch to use as
an experimental replacement for the one currently deployed. I'll be
ecstatic if that fixes things, though I'm not optimistic.
I don't really trust my interpretation of what Wireshark is showing
me but, if I'm correct, the problem is not that we stop seeing return
traffic from the server, it's more that the client code stops making sane
decisions in response when it arrives. Maybe the packets aren't getting
all the way back down the stack to be processed by the client code?
Wireshark display of the relevant traffic, captured while 'ls -l mountPoint'
on the client hangs and then returns with 'I/O Error':
On CLIENT A:
#            Time  SRC  DST  PROT  INFO
1031     1.989127  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa
4565    10.121595  B    A    NFS   V3 GETATTR Call, FH:0x00091508
4567    10.124981  A    B    NFS   V3 FSSTAT Call, FH:0x17a976a8
4587    10.205087  A    B    NFS   V3 GETATTR Call, FH:0xf2c997c8
29395   61.989380  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa [retransmission of #1031]
66805  130.119722  B    A    NFS   V3 GETATTR Call, FH:0x0089db89
66814  130.124815  A    B    NFS   V3 FSSTAT Call, FH:0x18a979a8
97138  181.989898  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa
On SERVER B:
#            Time  SRC  DST  PROT  INFO
677      1.342486  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa
4045     9.474848  B    A    NFS   V3 GETATTR Call, FH:0x00091508
4047     9.478325  A    B    NFS   V3 FSSTAT Call, FH:0x17a976a8
4076     9.558433  A    B    NFS   V3 GETATTR Call, FH:0xf2c997c8
28625   61.342630  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa [retransmission of #677]
61257  129.472779  B    A    NFS   V3 GETATTR Call, FH:0x0089db89
61268  129.477965  A    B    NFS   V3 FSSTAT Call, FH:0x18a979a8
87631  181.342989  A    B    NFS   V3 GETATTR Call, FH:0x70ab15aa
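For what it's worth, here is a rough sketch of the arithmetic on those timestamps, using only the three GETATTR calls for FH:0x70ab15aa copied from the two captures above. The variable and function names are my own, and the reading of the result is tentative: the gaps come out at roughly 60 s and then 120 s, which would be consistent with the client's retry interval growing between attempts, and the client-minus-server offset stays near 0.647 s, which suggests both captures are looking at the same packets on their way from A to B.

```python
# Timestamps (seconds) of the GETATTR call for FH:0x70ab15aa,
# copied from the CLIENT A and SERVER B captures above.
client_times = [1.989127, 61.989380, 181.989898]
server_times = [1.342486, 61.342630, 181.342989]


def gaps(times):
    """Return the interval between each consecutive pair of timestamps."""
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


# Time between successive transmissions of the same call, per capture.
client_gaps = gaps(client_times)
server_gaps = gaps(server_times)

# Difference between the two capture clocks for each matching packet.
offsets = [round(c - s, 3) for c, s in zip(client_times, server_times)]

print("client gaps:", client_gaps)
print("server gaps:", server_gaps)
print("capture offsets:", offsets)
```

This only covers the A-to-B direction of that one call; it says nothing about whether replies from B are reaching the client code.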
> Simplify, simplify, simplify. [...] However, NFS is often the
> dominant traffic source and people are surprised to see that
> telnet/ftp/ssh don't work either.
All other network plumbing appears to be in working order while the
problem is occurring - I can connect from one system to another at will
via SSH, rsync, HTTP, ping, etc.