Resolution
This issue was discussed with Red Hat Engineering under Private Bug 702085, but could not be fixed within the RHEL 5 product lifecycle.
Red Hat Product Management have elected not to repair this issue within RHEL 6 for the following reasons:
- Customer exposure to this issue is not significant.
- This behaviour is documented on the Customer Portal along with workarounds.
- This behaviour is much harder (takes longer) to reproduce in RHEL 6 compared to RHEL 5.
- The NFS protocol does not guarantee cache coherence.
Workarounds:
- Ensure one client cannot read a file while another client is accessing it by using file locks, such as flock in shell scripts, or fcntl() in C.
- Open the file with O_DIRECT so that the page cache is avoided.
- Do not read past the EOF.
  - For example, in Python use os.stat() to get the file size, os.open() to open the file and os.read() to read only up to the file size.
- Avoid running tail -f on files residing on NFS mounts.
- If using RHEL 5, the sync mount option will also avoid this issue. This will not work in RHEL 6, as the nfs_readpage_sync() function was removed upstream between RHEL 5 and RHEL 6.