My NFS timeouts / dirty page writeback problem.

kuvwert@lemmy.dbzer0.com · edit-2 3 months ago

My NFS timeouts / dirty page writeback problem.

deadcade@lemmy.deadca.de · 3 months ago

Sounds like a band-aid fix to a completely different problem. If NFS is timing out, something is clearly broken. Assuming it’s not your network (though it could very well be), it’s likely the Synology NAS. Since they’re relatively closed devices afaik, I sadly can’t help much in troubleshooting. And sure, dumping 25GB on it all at once is heavy, but it should handle that, being a NAS.

kuvwert@lemmy.dbzer0.com · 3 months ago

You could 1000% percent be right here. I don’t have an education on networking or system ops.

The NAS can sustain gigabit writes without complaint. My premis is the kernel isn’t flushing at wire speed in a steady stream. It hoards 2+ GB in page cache then fires a burst of concurrent NFS RPCs all at once. The NAS can’t ack them fast enough and the client declares it dead.

I ran three dd tests, same 2GB file, same NAS, only changed client config:

128K buffers: 101 MB/s, dies at 2GB
32K buffers: 2.1 GB/s (all page cache, nothing hitting the wire), dies even harder
64MB dirty page cap: steady 11.4 MB/s, zero timeouts, clean finish

The NAS handled all 2GB in test 3 because it arrived as a drip instead of a wall.

At least thats the working theory :)

deadcade@lemmy.deadca.de · 3 months ago

What I’m noticing more, is that you can keep a consistent 11.4MB/s, this feels relatively close to what you’d usually pull through a 100mbit/s link (after accounting for overhead). If that’s the case, it shouldn’t matter how the NFS client decides to chunk the data, for how much throughput there is to the NAS. Which means you’re looking at a broken NFS server that can’t handle large single transmissions.

If it’s not the case, and you’ve got a faster network link, it seems that the NAS just can’t keep up when given >2gb at once. That could be a hardware resource limitation, where this fix is probably the best you can do without upgrading hardware. If it’s not a resource limitation, then the NFS server is misbehaving when sent large chunks of data.

Basically, if your network itself (like switches, cables) isn’t broken, you’re either dealing with a NAS that is severely underspecced for what it’s supposed to do, or a broken NFS server.

Another possibility for network issues, is that your proxmox thinks it has gigabit (or higher), but some device or cable in between your server and NAS limits speed to 100mbit/s. I think it’d be likely to cause the specific issues you’re seeing, and something like mixed cable speeds would explain why the issue is so uncommon/hard to find. The smaller buffers more frequent acknowledgements would sidestep this.

Do note I am also not an expert in NFS, I’m mostly going off experience with the “fuck around and find out” method.

kuvwert@lemmy.dbzer0.com · 3 months ago

Holy Canoli You were right.

Checked the Synology’s network interface page and it was negotiating at 100 Mbps to my Orbi router. Your comment about 11.4 MB/s lining up with a 100 Mbps ceiling is exactly what got me to actually check the link speed instead of just assuming gigabit.

Bought a cheap unmanaged switch, plugged the NAS and server into it directly, came up at 1000 Mbps, and now I’m getting 107-115 MB/s on the same tests. Updated the post and credited you. Appreciate you pushing back on it.

deadcade@lemmy.deadca.de · 3 months ago

Hell yeah! 10x speed improvement for free!

frongt@lemmy.zip · 3 months ago

I started writing the same comment, but discarded it when I came to the same conclusion. You can hand off 2gb of writes, sure, but if the Synology doesn’t receive them fast enough, proxmox times out.

I suppose you could also increase the buffer on Synology side, but if you’re going to buffer that much then you should have a battery-backed cache to avoid data loss.

I don’t know enough about nfs to know if it supports anything that would bubble up the “please wait I need time to write all this data” message from the physical disks all the way up through your disk controller, Synology system, nfs connection, nfs client, to the actual writing application.

You might also consider a different kind of mount, like iscsi, if you aren’t sharing it with any other system.

tal@lemmy.today · edit-2 3 months ago

He could probably run an NFS server that isn’t a closed box, and have that just use the Synology box as storage for that server. That’d give whatever options Linux and/or the NFS server you want to run have for giving fair prioritization to writes, or increasing cache size (like, say he has bursty load and blows through the cache on the Synology NAS, but a Linux NFS server with more write cache available could potentially just slurp up writes quickly and then more-slowly hand them off to the NAS).

Honestly, though, I think that a preferable option, if one doesn’t want to mess with client global VM options (which wouldn’t be my first choice, but it sounds like OP is okay with it) is just to crank up the timeout options on the NFS clients, as I mention in my other comment, if he just doesn’t want timeout errors to percolate up and doesn’t mind the NAS taking a while to finish whatever it’s doing in some situations. It’s possible that he tried that, but I didn’t see it in his post.

NFSv4 has leases, and — I haven’t tested it, but it’s plausible to me from a protocol standpoint — it might be possible that it can be set up such that as long as a lease can be renewed, it doesn’t time out outstanding file operations, even if they’re taking a long time. The Synology NAS might be able to avoid taking too long to renew leases and causing clients to time out on that as long as it’s reachable, even if it’s doing a lot of writing. That’d still let you know if you had your NFS server wedge or lost connectivity to it, because your leases would go away within a bounded amount of time, but might not time out on time to complete other operations. No guarantee, just it’s something that I might go look into if I were hitting this myself.

tal@lemmy.today · edit-2 3 months ago

That’s a global VM setting, which is also going to affect your other filesystems mounted by that Linux system, which may or may not be a concern.

If that is an issue, you might also consider — I’m not testing these, but would expect that it should work:

Passing the sync mount option on the client for the NFS mount. That will use no writeback caching for that filesystem, which may impact performance more than you want.
Increasing the NFS mount options on the client for timeo= or retrans=. These will avoid having the client time out and decide that the NFS server is taking excessively long (though an operation may still take longer to complete if the NFS server is taking a while to respond).

happy_wheels@lemmy.blahaj.zone · 3 months ago

Based off what you wrote - and the fact that I’m massively sleep-deprived - it all makes sense. The issue you describe and the fix applied are akin to what we see in the database world, where users complain about queries being slow or unresponsive after trying to force-kill. Only for us to find out, that they submitted queries with a COMMIT after the whole 10mil record transaction, which clearly the DB can handle, but it will take a significant amount of time to rollback vs if the COMMITs are broken up and submitted more frequent. Basically chunking up the data into more manageable pieces as to not saturate the db threads, not to mention the underlying REDO and transaction log files too. So hope this was truly a long-term fix vs just a short-term one. Either way, great write up.( Also, you may want to invest in some 2.5gb networking for later, not that 1GbE isn’t enough, but just more pipeline is great, although I don’t know how much upgradeability your Synology will have in that department, so YMMV)

My NFS timeouts / dirty page writeback problem.

My NFS timeouts / dirty page writeback problem.

Double Check Your NFS timeouts to your NAS arent an NFS problem. They might be a dirty page writeback problem.

The fix