Compare commits
2 Commits
a25881bc3e...706b62058b
| Author | SHA1 | Date |
|---|---|---|
| | 706b62058b | 2 years ago |
| | c7f2ccb03b | 2 years ago |
@ -0,0 +1,27 @@
# Setup

On a new host

## Min Browser


## Element Desktop

To share passwords and other random links.


## Git

### User Name and Email

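This section is still empty in the notes. A minimal sketch of what usually goes here, assuming the standard `user.name` and `user.email` keys of `git config`; the name and email below are placeholders, not values from these notes:

```shell
# Placeholder identity (hypothetical values); substitute your own.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Show what Git will record in new commits.
git config --global --get user.name
git config --global --get user.email
```

With `--global` the values land in `~/.gitconfig`, so they apply to every repository on the host unless a repository overrides them.
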
### Cache Password

Git keeps prompting for credentials, I believe because Gitea / Forgejo require that the repository directory belong to the `gitea` or `forgejo` user and
I've misconfigured it on `petra`.

According to this article

https://www.freecodecamp.org/news/how-to-fix-git-always-asking-for-user-credentials/

the following makes Git remember the credentials:

```
git config --global credential.helper store
```

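Note that the `store` helper writes credentials unencrypted to `~/.git-credentials`. If that is a concern on `petra`, one possible alternative is Git's built-in `cache` helper, which keeps them in memory only. A sketch, where the one-hour timeout is an arbitrary choice and not something from the article:

```shell
# Alternative to `store`: keep credentials in memory only, for one hour.
git config --global credential.helper 'cache --timeout=3600'

# Check which helper is currently configured.
git config --global --get credential.helper
```

The trade-off is that the password has to be typed again once the timeout expires or the host reboots.
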
@ -1,133 +0,0 @@
# Kernel Boot Params

## Updating Grub

To select the `AMDGPU` driver / module.

https://askubuntu.com/a/1314983

I'll add the following flags to the appropriate line in `/etc/default/grub`

```
GRUB_CMDLINE_LINUX_DEFAULT="radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 amdgpu.dc=1 amdgpu.dpm=1 amdgpu.modeset=1"
```

Then run

```
sudo update-grub
```

and reboot, selecting Ubuntu 20.04 per this branch of the debugging tree.

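After the reboot, one way to confirm that the flags actually reached the kernel is to compare them against `/proc/cmdline`. The sketch below does that; `check_flags` is a helper name made up here, not part of GRUB or any package:

```shell
# check_flags CMDLINE FLAG...: report whether each FLAG appears as a whole
# word in CMDLINE. (Hypothetical helper for this sketch.)
check_flags() {
    cmdline=$1
    shift
    for flag in "$@"; do
        case " $cmdline " in
            *" $flag "*) echo "ok      $flag" ;;
            *)           echo "missing $flag" ;;
        esac
    done
}

# /proc/cmdline holds the parameters the running kernel was booted with.
check_flags "$(cat /proc/cmdline 2>/dev/null)" \
    radeon.si_support=0 radeon.cik_support=0 \
    amdgpu.si_support=1 amdgpu.cik_support=1 \
    amdgpu.dc=1 amdgpu.dpm=1 amdgpu.modeset=1
```

Any `missing` line suggests that `update-grub` was not run, or that a different boot entry was selected.
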
## Rebooting

After rebooting, `OpenCL` detects the GPU! But the following call hangs and does not return:

```
sudo clinfo -l
```

The call without `sudo` yields just the CPU and `Number of devices: 0` as before.

Here's the glorious output:

```
$ sudo clinfo -l
[sudo] password for arcologos:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3558.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback


Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon (TM) R9 390 Series
Device Topology: PCI[ B#3, D#0, F#0 ]
Max compute units: 40
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1005Mhz
Address bits: 64
Max memory allocation: 7301444400
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 8192
Max samplers within kernel: 26545
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 8589934592
Constant buffer size: 7301444400
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 3006477104
Max global variable size: 7301444400
Max global variable preferred total size: 8589934592
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
```

## Add User to Video Group

Per this response

https://github.com/RadeonOpenCompute/ROCm/issues/482#issuecomment-410551357

I add the current user to the `video` group. I forget where I first read that this was necessary. The next step
is to also try the `render` group.

```
sudo usermod -G video -a $USER
```

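To see whether the membership is recorded, a small check along these lines may help. `in_group` is a helper name made up for this sketch; it relies only on `id -nG`, which prints a user's group names:

```shell
# in_group GROUP [USER]: succeed if USER (default: current user) is a member
# of GROUP according to the group database. (Hypothetical helper.)
in_group() {
    id -nG "${2:-$(id -un)}" | tr ' ' '\n' | grep -qxF "$1"
}

for g in video render; do
    if in_group "$g"; then
        echo "$g: yes"
    else
        echo "$g: no (try: sudo usermod -aG $g \$USER, then log in again)"
    fi
done
```

Passing a user name to `id` makes it read the group database, so a new membership shows up right away. A plain `id` reports the groups of the current session, which only change after logging in again or rebooting.
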
@ -0,0 +1,95 @@
# Render Group and Rebooting

When trying to do a hardware shutdown, I got these error messages for every instance
of the `clinfo` process that I had started up. They hang hard, and don't respond to
kill signals.

```
systemd-shutdown[1]: Waiting for process: clinfo, clinfo, clinfo, clinfo, clinfo
```

## Rebooting

New groups indeed show up upon reboot.

```
$ id
uid=1000(arcologos) gid=1000(arcologos) groups=1000(arcologos),4(adm),24(cdrom),27(sudo),30(dip),44(video),46(plugdev),109(render),120(lpadmin),133(lxd),134(sambashare)
```

## Killing `clinfo` Process

Repeatedly calling

```
ps -eax
```

and

```
sudo kill -KILL <pid>
```

on anything mentioning `clinfo` eventually did kill the `bash` processes
waiting on `clinfo` after many minutes. However, the real processes
remain. `ps` shows them in state `D` (uninterruptible sleep) rather than `Z` (zombie),
which is why even `SIGKILL` has no effect on them

```
$ ps -eax | grep clinfo
1519 pts/0 D 0:44 [clinfo]
1553 pts/1 D 0:00 [clinfo]
```

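A sketch for spotting such stuck processes without scanning the full `ps` listing. It filters on the `STAT` column and adds `wchan`, the kernel function a process is sleeping in, which can hint at the driver involved. `only_dstate` is a helper name made up here:

```shell
# only_dstate: keep the header line plus rows whose STAT column (field 2)
# starts with D, i.e. uninterruptible sleep. (Hypothetical helper.)
only_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# pid, state, kernel wait channel (up to 32 chars), command name
ps -eo pid,stat,wchan:32,comm | only_dstate
```

Processes in this state cannot be killed from user space; they go away only when the kernel call they are blocked in returns, or on reboot.
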
## Installing `libnuma-dev`

One piece of advice from the GitHub issue was to install `libnuma-dev`,
and now I wish I had been paying more attention to NUMA when
working on SaLSa self-driving cars with Samhitha.
The log below is from installing the `numactl` tool.

```
$ sudo apt install numactl
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
numactl
0 upgraded, 1 newly installed, 0 to remove and 171 not upgraded.
Need to get 38.5 kB of archives.
After this operation, 150 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu focal/main amd64 numactl amd64 2.0.12-1 [38.5 kB]
Fetched 38.5 kB in 0s (140 kB/s)
Selecting previously unselected package numactl.
(Reading database ... 165065 files and directories currently installed.)
Preparing to unpack .../numactl_2.0.12-1_amd64.deb ...
Unpacking numactl (2.0.12-1) ...
Setting up numactl (2.0.12-1) ...
Processing triggers for man-db (2.9.1-1) ...
```

The two cores of the Celeron show up in `numactl` but not the GPU.

```
$ sudo numactl -s
policy: default
preferred node: current
physcpubind: 0 1
cpubind: 0
nodebind: 0
membind: 0
$ sudo numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1
node 0 size: 1574 MB
node 0 free: 103 MB
node distances:
node 0
0: 10
```

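`numactl` reports NUMA nodes, meaning CPUs and memory, so the GPU not appearing there may simply be expected. The kernel exposes a PCI device's NUMA affinity in sysfs instead. A sketch, assuming the `B#3, D#0, F#0` values from the `clinfo` output are decimal and the card sits in PCI domain `0000`; `pci_addr` is a helper name made up here:

```shell
# pci_addr BUS DEV FN: format decimal bus/device/function numbers as a sysfs
# PCI address, assuming PCI domain 0000. (Hypothetical helper.)
pci_addr() {
    printf '0000:%02x:%02x.%x\n' "$1" "$2" "$3"
}

# B#3, D#0, F#0 from the clinfo "Device Topology" line.
addr=$(pci_addr 3 0 0)
node_file="/sys/bus/pci/devices/$addr/numa_node"

if [ -r "$node_file" ]; then
    echo "$addr numa_node: $(cat "$node_file")"
else
    echo "$addr: no numa_node entry on this machine"
fi
```

A value of `-1` means the kernel has no NUMA affinity recorded for the device, which is common on a single-node machine like this one.
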
## Perhaps a Problem with `rocm-opencl` or the Ubuntu Distribution

A problem with this on Fedora prompted the `clinfo` maintainer to suggest taking it up with Fedora

https://github.com/Oblomov/clinfo/issues/81

An unanswered report of `clinfo` hanging, on the AMD community forum

https://community.amd.com/t5/drivers-software/clinfo-gets-hanged/td-p/444906
File diff suppressed because it is too large