Niall’s virtual diary archives – Monday 08 March 2021


Monday 8 March 2021: 20:44. I had a pretty bad week last week at work. My main development workstation had, on the preceding Friday, crashed, taking my development VMs with it, so I spent the weekend before last reinstalling Ubuntu 20.04 with ZFS-on-Root (my standard Linux setup for five years now!) and moving the work codebase onto it, shared with Windows over Samba, with the intent of building for Linux within the VM and building for Windows from the Samba share. This was a big divergence from my previous setup, where Windows Subsystem for Linux v1 did the builds for both Linux and Windows, and I’d then run the Linux executables over a Samba share. There is nothing wrong with my former setup for smaller codebases, but as the work codebase approaches 150k LOC, WSL v1 based Linux builds are getting unwieldily slow. And WSL v2 is the same as a Linux VM, except the file system is shared by 9p rather than Samba, and 9p is very considerably slower than Samba, so you’re much better off configuring your own Linux VM and Samba installation and tuning the snot out of Samba.

Anyway, all of last week my developer workstation kept locking up, losing me work in progress. I tried relocating the NVMe SSD (a Samsung 970 Pro) into a different M.2 socket, and since then it appears to be reliable again. But that’s water under the bridge. What I’m here to talk about now is how I fixed Visual Studio 2019 not building reliably over a Samba share, because absolutely nobody else seemed to have found a solution to this oft reported problem (well, apart from this guy here who found a workaround to a related but different problem which has the same manifestation as mine).

Firstly, I am not building into the Samba share. I create a build directory on Windows, and tell cmake to populate that Windows build directory using a git worktree located on a mapped network drive M:\ which is the Samba share of the git worktree in the Linux VM \\kate-linux. As the build never writes into the source worktree, Samba is being used here for reads only, and so thanks to opportunistic locking (oplocks), Windows aggressively caches the source tree locally and build performance is pretty close to native speed.

Except, it’s not quite reliable. 99.9% of the time it works fine. But occasionally MSVC doesn’t find some header file, or Visual Studio refuses to save a file, and if you look in the directory it is creating lots of orphaned temporary files from the failed saves. The problem is much worse if you use --parallel with cmake --build . --config Debug, where MSVC will fail to find lots of header files, sufficiently so that you don’t get a usable build. Initially I thought this was purely an MSVC/Visual Studio problem, as it only ever appeared there, not helped by all the Google searches reporting the same problem, almost all of which also mentioned MSVC/Visual Studio. But I also noticed that occasionally executing git from Windows where the git repo was on the mapped network drive would fail too, with messages such as:

fatal: update_ref failed for ref 'HEAD': cannot lock ref 'HEAD': unable to create lock file non-directory in the way

… and other messages suggesting that the network share was being racy with respect to changes made on it.

My initial thought was that Samba must be misconfigured, even though it was pretty much on its default config, and Ubuntu 20.04’s Samba is v4.11.6 which, to the best of my knowledge, has no known major bugs, and its default config is pretty optimal for performance, unlike earlier versions before Samba v4. I spent all of last week, whenever I was waiting on a Linux build, A-B testing various network and Samba configurations by trial and error, alas to no avail.

This weekend passed, and today Monday morning I had a bit of a brainwave: What if Samba is absolutely fine, and it is Windows 10 which is the cause?

That led me to Microsoft’s documentation page about SMB2 Redirector Caches which documents three registry settings to fiddle with. It turns out that setting these parameters in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters fixes all my MSVC failed-to-find-file, Visual Studio 2019 failed-save-edited-file, and git failed-to-checkout-branch problems:

  1. DirectoryCacheLifetime = (DWORD) 0
  2. FileNotFoundCacheLifetime = (DWORD) 0
  3. FileInfoCacheLifetime = (DWORD) 0
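If you would rather not click through regedit by hand, the same three values could be captured in a .reg file and imported. Below is a minimal Python sketch which only renders the text of such a file; the function name render_reg is mine for illustration, and writing the result out as UTF-16 (which is what regedit itself exports) is an assumption about what you would do next.

```python
# Sketch only: render a .reg file which zeroes the three SMB2 redirector
# cache lifetimes named above. The key path and value names come from the
# list above; the helper name is hypothetical.

KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters"

VALUES = {
    "DirectoryCacheLifetime": 0,
    "FileNotFoundCacheLifetime": 0,
    "FileInfoCacheLifetime": 0,
}


def render_reg(key, values):
    """Return .reg file text setting each name under key to a DWORD."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        # DWORDs in .reg files are written as eight hex digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Print the text so it can be inspected before being saved anywhere.
    print(render_reg(KEY, VALUES), end="")
```

Saving that text to a file with a .reg extension and importing it as Administrator should have the same effect as editing the three values by hand, and you still need to restart the Workstation service afterwards.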

After you have set these using regedit, run services.msc, find the Workstation service and restart it. To verify it’s working, open PowerShell with Administrator privileges and run Get-SmbClientConfiguration:

ConnectionCountPerRssNetworkInterface : 4
DirectoryCacheEntriesMax              : 16
DirectoryCacheEntrySizeMax            : 65536
DirectoryCacheLifetime                : 0
DormantFileLimit                      : 1023
EnableBandwidthThrottling             : True
EnableByteRangeLockingOnReadOnlyFiles : True
EnableInsecureGuestLogons             : False
EnableLargeMtu                        : True
EnableLoadBalanceScaleOut             : True
EnableMultiChannel                    : True
EnableSecuritySignature               : True
ExtendedSessionTimeout                : 1000
FileInfoCacheEntriesMax               : 64
FileInfoCacheLifetime                 : 0
FileNotFoundCacheEntriesMax           : 128
FileNotFoundCacheLifetime             : 0
KeepConn                              : 600
MaxCmds                               : 50
MaximumConnectionCountPerServer       : 32
OplocksDisabled                       : False
RequireSecuritySignature              : False
SessionTimeout                        : 60
UseOpportunisticLocking               : True
WindowSizeThreshold                   : 8

Note the zero values for the parameters we forced to zero, but large MTUs remain on, oplocks are on, and multichannel is on.
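Eyeballing that listing gets tedious if you repeat it across machines, so here is a rough Python sketch of checking it mechanically. It assumes you have captured the Get-SmbClientConfiguration text yourself (the SAMPLE string is just an excerpt of the output above), and the helper names parse_config and mismatches are made up for illustration.

```python
# Sketch only: check captured Get-SmbClientConfiguration output against the
# settings this post cares about. SAMPLE is an excerpt of the output above.

SAMPLE = """\
DirectoryCacheLifetime                : 0
EnableLargeMtu                        : True
EnableMultiChannel                    : True
FileInfoCacheLifetime                 : 0
FileNotFoundCacheLifetime             : 0
OplocksDisabled                       : False
UseOpportunisticLocking               : True
"""

# Caches forced off, but large MTU, multichannel and oplocks left on.
EXPECTED = {
    "DirectoryCacheLifetime": "0",
    "FileInfoCacheLifetime": "0",
    "FileNotFoundCacheLifetime": "0",
    "EnableLargeMtu": "True",
    "EnableMultiChannel": "True",
    "OplocksDisabled": "False",
    "UseOpportunisticLocking": "True",
}


def parse_config(text):
    """Parse 'Name : Value' lines into a dict of stripped strings."""
    config = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            config[name.strip()] = value.strip()
    return config


def mismatches(config, expected=EXPECTED):
    """Return the settings whose value differs from what is expected."""
    return {k: config.get(k) for k, v in expected.items() if config.get(k) != v}


if __name__ == "__main__":
    bad = mismatches(parse_config(SAMPLE))
    print("all settings as expected" if not bad else f"unexpected: {bad}")
```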

SMB Multichannel is probably the only major Samba performance enhancing feature not enabled by default in Samba v4. This is because it was buggy until recently, but now it’s working very well. SMB Multichannel lets file transfers multiplex over multiple TCP connections, so just like with Download Accelerators on the internet, you can multiply a per-TCP-connection maximum several fold over multiple connections, thus greatly increasing transfer rates. This isn’t particularly important for many small files like during a C++ compile run, but if you have multiple threads all accessing a single Samba share, with SMB Multichannel those threads actually see some concurrency whereas without SMB Multichannel, they all get funnelled through a single TCP connection with a global mutex. So, for a parallel build like what Visual Studio now does by default, SMB Multichannel is a big gain.

You can see if your Hyper-V Linux VM and your Windows installation are already employing SMB Multichannel using this command in an Administrator privileged PowerShell:

Get-SmbMultichannelConnection -IncludeNotSelected

Server Name Selected Client IP     Server IP       Client Interface Index Server Interface Index Client RSS Capable Client RDMA Capable
----------- -------- ---------     ---------       ---------------------- ---------------------- ------------------ -------------------
kate-linux  False    192.168.137.1 192.168.137.235 7                      2                      False              False
kate-linux  False    192.168.2.172 192.168.137.234 11                     1                      False              False
kate-linux  False    192.168.2.172 192.168.137.234 11                     1                      False              False
kate-linux  False    192.168.137.1 192.168.137.234 7                      1                      False              False
kate-linux  True     192.168.137.1 192.168.137.234 7                      1                      True               False
kate-linux  False    192.168.2.172 192.168.137.235 11                     2                      False              False

If it prints nothing, SMB Multichannel is NOT being employed.
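The row that matters in a table like the one above is the one with Selected set to True, and whether it is RSS capable. As a rough sketch, assuming you have captured the command's text output and that your server name contains no spaces, something like the following Python could pick that row out. The names Channel and parse_channels are hypothetical, and SAMPLE holds three of the rows shown above.

```python
# Sketch only: find the selected connection in captured
# Get-SmbMultichannelConnection output. Assumes no spaces in server names.
from typing import NamedTuple

SAMPLE = """\
Server Name Selected Client IP     Server IP       Client Interface Index Server Interface Index Client RSS Capable Client RDMA Capable
----------- -------- ---------     ---------       ---------------------- ---------------------- ------------------ -------------------
kate-linux  False    192.168.137.1 192.168.137.235 7                      2                      False              False
kate-linux  True     192.168.137.1 192.168.137.234 7                      1                      True               False
kate-linux  False    192.168.2.172 192.168.137.235 11                     2                      False              False
"""


class Channel(NamedTuple):
    server: str
    selected: bool
    client_ip: str
    server_ip: str
    client_rss: bool


def parse_channels(text):
    """Return one Channel per data row, skipping header and ruler lines."""
    channels = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have eight fields with True/False in the second column;
        # neither the header nor the dashed ruler line matches that.
        if len(fields) == 8 and fields[1] in ("True", "False"):
            channels.append(Channel(
                server=fields[0],
                selected=fields[1] == "True",
                client_ip=fields[2],
                server_ip=fields[3],
                client_rss=fields[6] == "True",
            ))
    return channels


if __name__ == "__main__":
    for ch in parse_channels(SAMPLE):
        if ch.selected:
            print(f"{ch.client_ip} -> {ch.server_ip}, RSS capable: {ch.client_rss}")
```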

If your Samba is after v4.13, it should autodetect when your network setup is RSS capable on its own. If both sides can do RSS, enabling SMB multichannel is as simple as adding this into your smb.conf:

[global]

server min protocol = SMB3
server multi channel support = yes

Note you need to reboot your Linux VM and then your host Windows machine before this takes effect.

If your Samba is before v4.13, you will need to either force RSS on (ideal, as it can parallelise according to the CPUs in your machine) or assign more than one network adapter to both your VM and your host on the Hyper-V bridge (not ideal, as max concurrency is the number of NIC pairs between Linux and Windows). Here is how you force Samba to advertise support for RSS and RDMA:

interfaces = "192.168.137.234;if_index=1,capability=RSS,capability=RDMA,speed=10000000000"

Obviously, you will need a static IP for your Linux VM for this to work, and you need to enable RSS in the virtual 10Gb NIC and on the Hyper-V bridge you are using.
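To make the pieces of that interfaces line easier to see, here is a small Python sketch which assembles it from its parts: the static IP, the interface index, the advertised capabilities, and the speed in bits per second. The helper name interfaces_line and its parameters are mine for illustration only; the output format simply mirrors the line above.

```python
# Sketch only: assemble an smb.conf "interfaces" line in the same shape as
# the one above. The helper and its parameter names are hypothetical.

def interfaces_line(ip, if_index, speed_gbit, rss=True, rdma=False):
    """Build the interfaces setting for one statically addressed NIC."""
    options = [f"if_index={if_index}"]
    if rss:
        options.append("capability=RSS")
    if rdma:
        options.append("capability=RDMA")
    # The speed is given in bits per second, so 10 Gbit is 10 * 10**9.
    options.append(f"speed={speed_gbit * 10**9}")
    joined = ",".join(options)
    return f'interfaces = "{ip};{joined}"'


if __name__ == "__main__":
    # Reproduces the line above: index 1, RSS and RDMA, 10 Gbit.
    print(interfaces_line("192.168.137.234", 1, 10, rdma=True))
```

Leaving rdma at its default of False gives a line which advertises RSS only, which is probably the more honest setting for a software emulated NIC.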

I left RDMA enabled in there too, though it only makes sense on real hardware with a sufficiently capable real NIC on a real server. Obviously if you do have such capable hardware, you can sustain 10Gb/sec on a 100Gbit link with 256Kb per i/o @ QD8, or 2Gb/sec on a 100Gbit link with 4Kb per i/o @ QD200. Over a software emulated switch and NIC, SMB Multichannel mainly increases concurrency for both host and VM, helping ameliorate the VM<=>Host latency.

Finally, the only other settings which Samba v4 doesn’t currently enable which might help are:

socket options = TCP_NODELAY IPTOS_LOWDELAY
use sendfile = yes

TCP_NODELAY is already on by default in Samba v4, but IPTOS_LOWDELAY is not. This might improve performance a bit, given that Windows now does no caching of metadata whatsoever after the registry changes above. And use of kernel sendfile() to zero copy transmit files is off by default, for some reason, so turning it on might reduce CPU cache loading a little.

Hopefully this helps other people reading this to figure out the solution to what has been a very frustrating week for me in getting Visual Studio/MSVC to reliably build a git worktree supplied from a Linux Samba share.

#samba





Contact the webmaster: Niall Douglas @ webmaster2<at symbol>nedprod.com (Last updated: 2021-03-08 20:44:01 +0000 UTC)