Hosting on a headless server
GPU servers can greatly accelerate the training of deep learning algorithms. However, most GPU servers are headless and have no display device attached, which is a problem for Unity's visual rendering. We recommend setting up a virtual display with an X server to enable headless rendering.
1) Package installation: First make sure the server has the following NVIDIA packages installed (this guide assumes NVIDIA driver version 515):
> apt list --installed | grep "nvidia"
libnvidia-cfg1-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-common-515-server/unknown,unknown,now 515.65.01-0lambda0~20.04.1 all [installed,automatic]
libnvidia-compute-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-compute-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-decode-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-decode-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-encode-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-encode-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-extra-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-extra-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-fbc1-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-fbc1-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-gl-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
libnvidia-gl-515-server/unknown,now 515.65.01-0lambda0~20.04.1 i386 [installed,automatic]
libnvidia-ml-dev/unknown,now 11.6.55~11.6.2-0lambda1.1 amd64 [installed,automatic]
nvidia-compute-utils-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
nvidia-cuda-dev/unknown,now 11.6.2-0lambda1.1 amd64 [installed,automatic]
nvidia-cuda-gdb/unknown,now 11.6.124~11.6.2-0lambda1.1 amd64 [installed,automatic]
nvidia-cuda-toolkit-doc/unknown,unknown,now 11.6.2-0lambda1.1 all [installed,automatic]
nvidia-cuda-toolkit/unknown,now 11.6.2-0lambda1.1 amd64 [installed]
nvidia-dkms-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
nvidia-driver-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed]
nvidia-kernel-common-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
nvidia-kernel-source-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
nvidia-prime/focal-updates,focal-updates,now 0.8.16~0.20.04.2 all [installed]
nvidia-profiler/unknown,now 11.6.124~11.6.2-0lambda1.1 amd64 [installed,automatic]
nvidia-settings/unknown,now 515.76-0lambda1 amd64 [installed]
nvidia-utils-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
Install the following packages:
> apt install xorg mesa-utils xserver-xorg xserver-xorg-video-dummy xserver-xorg-video-nvidia-515-server
Then verify that the installed xserver-related packages match the following:
> apt list --installed | grep "xserver"
x11-xserver-utils/focal,now 7.7+8 amd64 [installed,automatic]
xserver-common/focal-updates,focal-updates,focal-security,focal-security,now 2:1.20.13-1ubuntu1~20.04.3 all [installed,automatic]
xserver-xephyr/focal-updates,focal-security,now 2:1.20.13-1ubuntu1~20.04.3 amd64 [installed,automatic]
xserver-xorg-core/focal-updates,focal-security,now 2:1.20.13-1ubuntu1~20.04.3 amd64 [installed,automatic]
xserver-xorg-dev/focal-updates,focal-security,now 2:1.20.13-1ubuntu1~20.04.3 amd64 [installed]
xserver-xorg-input-all/focal,now 1:7.7+19ubuntu14 amd64 [installed,automatic]
xserver-xorg-input-libinput/focal,now 0.29.0-1 amd64 [installed,automatic]
xserver-xorg-input-wacom/focal,now 1:0.39.0-0ubuntu1 amd64 [installed,automatic]
xserver-xorg-legacy/focal-updates,focal-security,now 2:1.20.13-1ubuntu1~20.04.3 amd64 [installed,automatic]
xserver-xorg-video-all/focal,now 1:7.7+19ubuntu14 amd64 [installed,automatic]
xserver-xorg-video-amdgpu/focal-updates,now 19.1.0-1ubuntu0.1 amd64 [installed,automatic]
xserver-xorg-video-ati/focal,now 1:19.1.0-1 amd64 [installed,automatic]
xserver-xorg-video-dummy/focal,now 1:0.3.8-1build3 amd64 [installed]
xserver-xorg-video-fbdev/focal,now 1:0.5.0-1ubuntu1 amd64 [installed,automatic]
xserver-xorg-video-intel/focal,now 2:2.99.917+git20200226-1 amd64 [installed,automatic]
xserver-xorg-video-nouveau/focal,now 1:1.0.16-1 amd64 [installed,automatic]
xserver-xorg-video-nvidia-515-server/unknown,now 515.65.01-0lambda0~20.04.1 amd64 [installed,automatic]
xserver-xorg-video-qxl/focal,now 0.1.5+git20200331-1 amd64 [installed,automatic]
xserver-xorg-video-radeon/focal,now 1:19.1.0-1 amd64 [installed,automatic]
xserver-xorg-video-vesa/focal,now 1:2.4.0-2 amd64 [installed,automatic]
xserver-xorg-video-vmware/focal,now 1:13.3.0-3 amd64 [installed,automatic]
xserver-xorg/focal,now 1:7.7+19ubuntu14 amd64 [installed]
2) Configuration: Edit “/etc/X11/Xwrapper.config” so that it contains the following line:
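A setting with this effect is `allowed_users=anybody`, one of the values Xwrapper.config accepts for `allowed_users`. The sketch below assumes that this is the intended line. `set_allowed_users` is a hypothetical helper name that writes the line into a given file and replaces any existing `allowed_users` entry.

```shell
# Hypothetical helper: make sure FILE contains exactly one
# "allowed_users=anybody" line (assumed to be the intended setting).
set_allowed_users() {
    conf="$1"
    tmp="$(mktemp)"
    # Copy every line except an existing allowed_users entry.
    grep -v '^allowed_users=' "$conf" > "$tmp" 2>/dev/null || true
    echo 'allowed_users=anybody' >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

From a root shell this would be called as `set_allowed_users /etc/X11/Xwrapper.config`.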
This allows any user to start and stop the X server. Then prepare two configuration files, “headless-gpu.conf” and “headless-dummy.conf”:
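As a rough illustration of what such a pair of files can look like, the sketch below writes a minimal NVIDIA configuration and a minimal dummy-driver configuration. Everything in it is an assumption rather than the original files: the helper name `write_headless_confs`, the identifiers, the 1280x1024 virtual screen, and the placeholder BusID values.

```shell
# Write illustrative versions of both files into DIR (default: current
# directory). All identifiers and values are placeholders.
write_headless_confs() {
    dir="${1:-.}"
    cat > "$dir/headless-gpu.conf" <<'EOF'
Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0"
EndSection

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    # Placeholder: replace with the BusID of the first GPU.
    BusID "PCI:0:0:0"
EndSection

Section "Device"
    Identifier "Device1"
    Driver "nvidia"
    # Placeholder: replace with the BusID of the second GPU.
    BusID "PCI:0:0:0"
EndSection

Section "Screen"
    Identifier "Screen0"
    # Name of the Device section (GPU) that should render.
    Device "Device0"
    DefaultDepth 24
    Option "AllowEmptyInitialConfiguration" "True"
    SubSection "Display"
        Depth 24
        Virtual 1280 1024
    EndSubSection
EndSection
EOF
    cat > "$dir/headless-dummy.conf" <<'EOF'
Section "Device"
    Identifier "DummyDevice"
    Driver "dummy"
    VideoRam 256000
EndSection

Section "Monitor"
    Identifier "DummyMonitor"
    HorizSync 5.0-1000.0
    VertRefresh 5.0-200.0
EndSection

Section "Screen"
    Identifier "DummyScreen"
    Device "DummyDevice"
    Monitor "DummyMonitor"
    DefaultDepth 24
    SubSection "Display"
        Depth 24
        Virtual 1280 1024
    EndSubSection
EndSection
EOF
}
```

Calling `write_headless_confs .` creates both files in the current directory, ready to be edited.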
Edit “headless-gpu.conf” so that the BusID of each device matches your hardware. To find the BusIDs:
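One option is `nvidia-xconfig --query-gpu-info`, which reports each GPU's BusID already in the decimal `PCI:bus:device:function` form that xorg.conf expects. Tools such as `lspci` and `nvidia-smi` print the address in hexadecimal instead, so it has to be converted. A small sketch of that conversion, where `to_xorg_busid` is a hypothetical helper name:

```shell
# Convert a PCI address in lspci/nvidia-smi form
# ([domain:]bus:device.function, hexadecimal) to the decimal
# "PCI:bus:device:function" form used in xorg.conf.
to_xorg_busid() {
    addr="$1"
    func="${addr##*.}"   # part after the last dot
    rest="${addr%.*}"    # everything before the function
    dev="${rest##*:}"    # part after the last colon
    rest="${rest%:*}"
    bus="${rest##*:}"    # drops the domain if one is present
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
}
```

For example, `nvidia-smi --query-gpu=pci.bus_id --format=csv,noheader | while read -r addr; do to_xorg_busid "$addr"; done` prints one BusID per GPU.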
Finally, copy “headless-gpu.conf” and “headless-dummy.conf” to “/etc/X11/”.
3) Start the X server: To start the X server on display 0 with GPU rendering:
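A minimal sketch of such a start command, assuming the configuration files were copied to “/etc/X11/” as described above. Non-root users may only pass `-config` a path relative to the X configuration directories, which is why the file name is given without a directory. The wrapper name `start_headless_x` and the `XORG_BIN` override are hypothetical and exist only to allow a dry run.

```shell
# Hypothetical wrapper around Xorg. CONFIG is given relative to /etc/X11
# so that non-root users are allowed to pass it to -config.
# XORG_BIN can be overridden (for example with "echo") for a dry run.
start_headless_x() {
    config="$1"
    display="${2:-0}"
    # -noreset keeps the server alive after the last client disconnects.
    "${XORG_BIN:-Xorg}" -noreset -config "$config" ":$display"
}
```

To run the server in the background on display 0: `start_headless_x headless-gpu.conf 0 &`.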
Test GPU acceleration by:
“OpenGL vendor string” should show “NVIDIA Corporation”.
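This check can be scripted with `glxinfo` from the mesa-utils package installed earlier. In the sketch below, `is_nvidia_gl` is a hypothetical helper that reads `glxinfo` output from standard input.

```shell
# Succeeds if glxinfo output on stdin reports NVIDIA as the OpenGL vendor.
is_nvidia_gl() {
    grep -q '^OpenGL vendor string: NVIDIA Corporation'
}
```

Usage on display 0: `DISPLAY=:0 glxinfo | is_nvidia_gl && echo "GPU rendering active"`.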
To start the X server on display 0 with CPU rendering (the dummy driver) instead:
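When the same setup script runs on machines with and without a working NVIDIA driver, the choice between the two configuration files can be automated. A sketch, assuming `nvidia-smi -L` (which lists the GPUs the driver can see) is an adequate probe. `pick_x_config` is a hypothetical helper name.

```shell
# Print the configuration file to use: the GPU one if the NVIDIA driver
# reports at least one GPU, otherwise the dummy (CPU) one.
pick_x_config() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo headless-gpu.conf
    else
        echo headless-dummy.conf
    fi
}
```

Assuming the server is started with `Xorg -noreset -config <file> :0`, CPU rendering on display 0 would be `Xorg -noreset -config headless-dummy.conf :0 &`, and `Xorg -noreset -config "$(pick_x_config)" :0 &` selects the file automatically.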
4) Debugging: If the X server was started on display 0 and an error occurred, the log can be found in:
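By default, Xorg writes the log for display 0 to “/var/log/Xorg.0.log” when it runs with root rights and to “~/.local/share/xorg/Xorg.0.log” otherwise. Which of the two applies on a given server is an assumption to verify. Error entries carry the `(EE)` marker. A sketch that extracts them, with `xorg_errors` as a hypothetical helper name:

```shell
# Print only the error lines of an Xorg log. Real entries start with an
# optional "[ timestamp]" followed by the (EE) marker; the legend line in
# the log header mentions "(EE)" mid-line and is skipped.
xorg_errors() {
    grep -E '^(\[[ 0-9.]+\] )?\(EE\)' "$1"
}
```

For example, `xorg_errors /var/log/Xorg.0.log`. The command exits with a non-zero status when the log contains no error lines.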
5) Note that Docker and the Nakama server still need to be installed on the headless server. Install Docker Engine rather than Docker Desktop. Once Docker is installed, add the relevant users to the docker group and reboot the server. Running Nakama is the same as described in Setup Nakama.
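Adding a user to the docker group is typically done with `usermod -aG docker <user>`. A small sketch that first checks whether the membership already exists, where `has_group` is a hypothetical helper name:

```shell
# Succeeds if USER is a member of GROUP.
has_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qxF "$2"
}
```

Usage: `has_group "$USER" docker || sudo usermod -aG docker "$USER"`. The new membership only applies to sessions started afterwards, which the reboot takes care of.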
6) Unity visuals can now be rendered on the X server. Make sure X is started:
In case of error:
(EE)
Fatal server error:
(EE) Server is already active for display 0
If this server is no longer running, remove /tmp/.X0-lock
and start again.
(EE)
(EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
(EE)
This means an X server is already running on display 0. Either use that display or start a new server on a different display index. To select a GPU, open “/etc/X11/headless-gpu.conf”, find Section “Screen”, and change the Device name to that of the target GPU.
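The lock file named in the error message can also be inspected before starting a server. The sketch below treats a display as in use when its lock file names a process that is still alive, and finds the first display index that is not. Both function names are hypothetical, and the optional directory argument exists only so the logic can be tried outside “/tmp”.

```shell
# Succeeds if display N has a lock file whose PID is still alive.
# The second argument (default /tmp) is the directory holding the lock.
x_display_running() {
    lock="${2:-/tmp}/.X${1}-lock"
    [ -f "$lock" ] || return 1
    # The lock file holds the server's PID, padded with spaces.
    pid="$(tr -d ' \n' < "$lock")"
    [ -n "$pid" ] || return 1
    # kill -0 fails for other users' processes, so also check /proc.
    kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]
}

# Print the lowest display index that has no live server.
first_free_display() {
    n=0
    while x_display_running "$n" "${1:-/tmp}"; do
        n=$((n + 1))
    done
    echo "$n"
}
```

For example, `x_display_running 0 && echo "display :0 is in use"`, or `first_free_display` to pick an index for a new server.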
Now run your Python script with:
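X clients find the server through the `DISPLAY` environment variable, so the script has to be launched with it set to the display started above. A minimal sketch, where `run_on_display` is a hypothetical helper and `train.py` in the usage line is a placeholder script name:

```shell
# Run a command with DISPLAY pointing at the given display index.
run_on_display() {
    display="$1"
    shift
    DISPLAY=":$display" "$@"
}
```

Usage: `run_on_display 0 python train.py`, which is equivalent to `DISPLAY=:0 python train.py`.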