DevHeads.net

nfsvers and nfs-utils-1.3.0-0.48.el7

We encountered a weird problem today, and I thought some of you might
like to hear the solution.

The underlying change was listed in the 7.4 changelog, so it's not a
bug, but it may drive you buggy.

The majority of our HPC cluster nodes run CentOS 7, though the exact
patch levels vary from node to node. None is older than 7.3, but a few
newer nodes were kickstarted right to 7.4.

The problem was that mounts of our Isilon NFS exports were failing on
a seemingly random subset of nodes. Routing was fine. Network
connectivity was fine.

The short answer is that the default NFS version changed in 7.4, I
think in the nfs-utils-1.3.0-0.48.el7 package in particular. While NFS
v4.0 was the default up to 7.3, the version negotiation sequence in
7.4 is subtly different:

1. Try NFS v4.1 first
2. Fail down to NFS v3
3. Fail down to NFS v2

The problem is that our Isilon works with NFS v4.0, not 4.1, and 4.0
is no longer in the fail-down path, so a 7.4 client never tries it.

The short-term answer is to specify nfsvers=4.0 in our autofs
configuration files, which works like a charm.
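
For illustration, an indirect autofs map entry with the version pinned
might look roughly like the sketch below. The map file name, mount
key, server name, and export path are hypothetical placeholders; only
the nfsvers=4.0 option comes from the fix described above.

```
# /etc/auto.nfs (hypothetical map file name)
# key    options                      location
data     -fstype=nfs,nfsvers=4.0      isilon.example.com:/ifs/data
```

After reloading autofs and triggering the mount, the vers= field in
/proc/mounts should show which version was actually negotiated.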

As I said, this was an announced change, but the implications escaped
us until now. So this little writeup is just for the record.

Comments

Re: nfsvers and nfs-utils-1.3.0-0.48.el7

By Johnny Hughes at 10/12/2017 - 13:42

On 10/12/2017 12:33 PM, Paul Heinlein wrote:
You are not the first person to have this issue. Thanks for the post.