NFS stops responding

Michael ODonnell michael.odonnell at comcast.net
Fri Apr 2 11:03:07 EDT 2010



> The client isn't seeing the replies?  Blame the router, blame
> the router!

Heh.  I'd love to, and I just acquired a brand new switch to use as
an experimental replacement for the one currently deployed.  I'll be
ecstatic if that fixes things, though I'm not optimistic.

I don't really trust my interpretation of what Wireshark is showing
me but, if I'm correct, the problem is not that we stop seeing return
traffic from the server; it's more that the client code stops making sane
decisions in response when that traffic arrives.  Maybe the packets aren't
getting all the way back down the stack to be processed by the client code?

Wireshark display of the relevant traffic, captured while 'ls -l mountPoint'
on the client hangs and then returns with 'I/O Error':

  On CLIENT A:
  #     Time       SRC DST PROT INFO
  1031  1.989127   A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa
  4565  10.121595  B   A   NFS  V3   GETATTR Call, FH:0x00091508
  4567  10.124981  A   B   NFS  V3   FSSTAT  Call, FH:0x17a976a8
  4587  10.205087  A   B   NFS  V3   GETATTR Call, FH:0xf2c997c8
  29395 61.989380  A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa [retransmission of #1031]
  66805 130.119722 B   A   NFS  V3   GETATTR Call, FH:0x0089db89
  66814 130.124815 A   B   NFS  V3   FSSTAT  Call, FH:0x18a979a8
  97138 181.989898 A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa

  On SERVER B:
  #     Time       SRC DST PROT INFO
  677   1.342486   A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa
  4045  9.474848   B   A   NFS  V3   GETATTR Call, FH:0x00091508
  4047  9.478325   A   B   NFS  V3   FSSTAT  Call, FH:0x17a976a8
  4076  9.558433   A   B   NFS  V3   GETATTR Call, FH:0xf2c997c8
  28625 61.342630  A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa [retransmission of #677]
  61257 129.472779 B   A   NFS  V3   GETATTR Call, FH:0x0089db89
  61268 129.477965 A   B   NFS  V3   FSSTAT  Call, FH:0x18a979a8
  87631 181.342989 A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa
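
One way to cross-check a reading of the trace above is to group the
calls by direction, operation and file handle, then look at the time
between repeats.  The following is only a minimal sketch in Python: the
rows are copied from the client-side capture, while the names ROW,
repeated_calls and gaps are made up for illustration and the regular
expression assumes exactly the column layout shown here.

```python
import re
from collections import defaultdict

# Rows copied verbatim from the client-side capture above.
CLIENT_TRACE = """\
1031  1.989127   A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa
4565  10.121595  B   A   NFS  V3   GETATTR Call, FH:0x00091508
4567  10.124981  A   B   NFS  V3   FSSTAT  Call, FH:0x17a976a8
4587  10.205087  A   B   NFS  V3   GETATTR Call, FH:0xf2c997c8
29395 61.989380  A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa [retransmission of #1031]
66805 130.119722 B   A   NFS  V3   GETATTR Call, FH:0x0089db89
66814 130.124815 A   B   NFS  V3   FSSTAT  Call, FH:0x18a979a8
97138 181.989898 A   B   NFS  V3   GETATTR Call, FH:0x70ab15aa
"""

# Hypothetical pattern matching the columns above:
# frame number, time, source, destination, operation, file handle.
ROW = re.compile(
    r"^\s*(\d+)\s+([\d.]+)\s+(\S+)\s+(\S+)\s+NFS\s+V3\s+(\w+)\s+Call, FH:(0x[0-9a-f]+)"
)


def repeated_calls(trace):
    """Map (src, dst, op, file handle) to its timestamps, keeping only repeats."""
    seen = defaultdict(list)
    for line in trace.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # skip anything that is not a trace row
        _frame, time, src, dst, op, fh = match.groups()
        seen[(src, dst, op, fh)].append(float(time))
    return {key: times for key, times in seen.items() if len(times) > 1}


def gaps(times):
    """Seconds between consecutive timestamps, rounded to 0.1 s."""
    return [round(later - earlier, 1) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for key, times in repeated_calls(CLIENT_TRACE).items():
        print(key, gaps(times))
```

Run against the rows above, the only call that repeats is the A-to-B
GETATTR on FH:0x70ab15aa, and its gaps come out at roughly 60 and 120
seconds.  That looks like a retry timer that grows after each attempt,
but that is an interpretation of the numbers, not something the capture
itself proves.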

> Simplify, simplify, simplify.  [...]  However, NFS is often the
> dominant traffic source and people are surprised to see that
> telnet/ftp/ssh don't work either.

All other network plumbing appears to be in working order while the
problem is occurring - I can connect from one system to another at will
via SSH, rsync, HTTP, ping, etc.


