# gpuutil

A simple tool for observing GPU status and automatically selecting visible GPUs in Python code.

## How to use
1. Install the package:

```shell
pip install https://git.zmy.pub/zmyme/gpuutil/archive/v0.0.3.tar.gz
```

2. To observe GPU status, run:

```shell
python -m gpuutil <options>
```

Running ```python -m gpuutil``` directly produces output similar to:

```text
+----+------+------+----------+----------+------+----------------+
| ID | Fan  | Temp | Pwr      | Freq     | Util | Vmem           |
+----+------+------+----------+----------+------+----------------+
| 0  | 22 % | 21 C | 9.11 W   | 300 MHz  | 0 %  | 3089/11019 MiB |
| 1  | 22 % | 23 C | 6.28 W   | 300 MHz  | 0 %  | 786/11019 MiB  |
| 2  | 38 % | 59 C | 92.04 W  | 1890 MHz | 6 %  | 3608/11019 MiB |
| 3  | 40 % | 67 C | 246.38 W | 1740 MHz | 93 % | 3598/11019 MiB |
+----+------+------+----------+----------+------+----------------+
|                          Process Info                          |
+----------------------------------------------------------------+
| [26107|0] user1(737 MiB) python                                |
| [34033|0,1] user2(1566 MiB) python                             |
| [37190|0] user2(783 MiB) python                                |
| [37260|0] user2(783 MiB) python                                |
| [30356|2] user3(3605 MiB) python train.py --args --some really |
| long arguments                                                 |
| [34922|3] user3(3595 MiB) python train.py --args --some really |
| long arguments version 2                                       |
+----------------------------------------------------------------+
```

For more options, run ```python -m gpuutil -h```; you will see:

```text
usage: __main__.py [-h] [--profile PROFILE] [--cols COLS] [--style STYLE]
                   [--show-process SHOW_PROCESS] [--vertical VERTICAL] [--save]

optional arguments:
  -h, --help            show this help message and exit
  --profile PROFILE, -p PROFILE
                        profile keyword, corresponding configuration are saved in ~/.gpuutil.conf
  --cols COLS, -c COLS  colums to show.(Availabel cols: ['ID', 'Fan', 'Temp', 'TempMax', 'Pwr',
                        'PwrMax', 'Freq', 'FreqMax', 'Util', 'Vmem', 'UsedMem', 'TotalMem', 'FreeMem',
                        'Users']
  --style STYLE, -sty STYLE
                        column style, format: |c|l:15|r|c:14rl:13|, c,l,r are align methods, | is line
                        and :(int) are width limit.
  --show-process SHOW_PROCESS, -sp SHOW_PROCESS
                        whether show process or not
  --vertical VERTICAL, -v VERTICAL
                        whether show each user in different lines. (show user vertically)
  --save                save config to profile
```

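The `--style` string described in the help packs one alignment token per column between `|` separators: a token is `c`, `l`, or `r`, optionally followed by `:<width>` to limit the column width. As a rough illustration of the idea (this is a sketch, not gpuutil's actual parser, and `parse_style` is a hypothetical name):

```python
def parse_style(style: str):
    """Parse a gpuutil-like style string such as '|c|l:15|r|' into a
    list of (align, width) pairs; width is None when no limit is given."""
    cols = []
    for token in style.strip('|').split('|'):
        if ':' in token:
            # Token carries a width limit, e.g. 'l:15'.
            align, width = token.split(':', 1)
            cols.append((align, int(width)))
        else:
            # Bare align method, e.g. 'c' or 'r'.
            cols.append((token, None))
    return cols

print(parse_style('|c|l:15|r|'))  # → [('c', None), ('l', 15), ('r', None)]
```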
3. To automatically select visible GPUs in your Python code, use:

```python
from gpuutil import auto_set
auto_set(1)
```

The ```auto_set``` function is defined as follows:

```python
# num: the number of GPUs you want to use.
# allow_nonfree: whether to use non-empty GPUs when there are not enough free ones.
# ask: if True, the script asks for confirmation before using a non-empty GPU;
#      if False, it uses the non-empty GPU directly.
# blacklist: a list of ints; GPUs in this list will not be used unless you manually choose them.
# show: if True, print which GPUs are currently selected.
def auto_set(num, allow_nonfree=True, ask=True, blacklist=[], show=True):
    ...  # implementation omitted
```

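Conceptually, this kind of auto-selection ranks GPUs by free memory, skips blacklisted IDs, and exports the winners via `CUDA_VISIBLE_DEVICES`. A minimal self-contained sketch of that idea (not gpuutil's real implementation; `pick_gpus` and the sample memory numbers are made up for illustration):

```python
import os

def pick_gpus(free_mem, num, blacklist=()):
    """Pick the `num` GPUs with the most free memory, skipping
    blacklisted IDs, and return a CUDA_VISIBLE_DEVICES-style string."""
    candidates = [(mem, gpu_id) for gpu_id, mem in enumerate(free_mem)
                  if gpu_id not in blacklist]
    # Sort by free memory, largest first, and keep the top `num` IDs.
    chosen = [gpu_id for _, gpu_id in sorted(candidates, reverse=True)[:num]]
    return ','.join(str(g) for g in chosen)

# Free memory (MiB) per GPU, e.g. parsed from `nvidia-smi -q -x`.
visible = pick_gpus([7930, 10233, 7411, 7421], num=1, blacklist=[1])
os.environ['CUDA_VISIBLE_DEVICES'] = visible
print(visible)  # → '0' (GPU 1 has the most free memory but is blacklisted)
```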
## Using gpuutil inside Docker

Code running inside a Docker container cannot obtain correct information about the processes using the GPU. To work around this, gpuutil can read the output of the `nvidia-smi` and `ps` commands from files, which you generate on the host machine.

To use gpuutil in Docker, follow these steps:

1. Find a way to pass the output of ```nvidia-smi -q -x``` into the container you are using, saved as a text file.

2. Pass the output of a ps-like command into the container. This is a table-like output whose first line is a header, and the header must contain at least user, pid, and command. Below is a valid output generated by running ```ps -axo user,pid,command``` on the host machine:

```text
USER       PID COMMAND
root         1 /bin/bash -c bash /etc/init.docker; /usr/sbin/sshd -D
root         8 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
root         9 sshd: user1 [priv]
user1       19 sshd: user1@pts/0
user1       20 -zsh
user1       97 tmux
user1       98 -zsh
```

If your generated output uses different header names (for example, when you use ```docker top``` instead of ```ps```, the ```COMMAND``` column is named ```CMD```), you need to prepare a dict that maps each of your header names to one of ```user, pid, command```. Note that the matching is case-insensitive.

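The case-insensitive header mapping can be sketched like this (an illustration of the idea, not gpuutil's actual code; `normalize_header` is a hypothetical name):

```python
def normalize_header(headers, name_trans):
    """Map raw ps-like header names onto 'user', 'pid' and 'command',
    case-insensitively, using a user-supplied translation dict."""
    trans = {k.lower(): v.lower() for k, v in name_trans.items()}
    # Unmapped names pass through unchanged (lowercased).
    return [trans.get(h.lower(), h.lower()) for h in headers]

# `docker top` style header, translated with cmd=command:
print(normalize_header(['USER', 'PID', 'CMD'], {'cmd': 'command'}))
# → ['user', 'pid', 'command']
```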
3. Run the configuration script:

```shell
python -m gpuutil.set_redirect -nv path/to/your/nvidia/output -ps /path/to/your/ps/output -pst cmd=command,username=user
```

For more information about the script, run ```python -m gpuutil.set_redirect -h```; you will see:

```text
usage: set_redirect.py [-h] [--nvsmi NVSMI] [--ps PS] [--ps_name_trans PS_NAME_TRANS]

optional arguments:
  -h, --help            show this help message and exit
  --nvsmi NVSMI, -nv NVSMI
                        a file indicates real nvidia-smi -q -x output.
  --ps PS, -ps PS       a file indicates real ps-like output.
  --ps_name_trans PS_NAME_TRANS, -pst PS_NAME_TRANS
                        a dict of name trans, format: name1=buildin,name2=buildin, buildin can be choosen from cmd,user,pid
```

> Some advice:
> 1. You can use a script that runs the `nvidia-smi` and `ps` commands and saves their output to a directory, then mount that directory into the container as read-only.
> 2. You could consider mounting the directory as tmpfs.

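A host-side snapshot script along the lines of the advice above might look like this. It is a hedged sketch, not part of gpuutil: the script name, output directory, and file names are all choices you would make yourself; the `|| true` merely lets the sketch run on machines without `nvidia-smi`.

```shell
#!/bin/sh
# Hypothetical host-side helper: snapshot the two commands gpuutil needs
# into a directory that can be mounted read-only into the container.
OUT_DIR="${1:-/tmp/gpuutil-redirect}"
mkdir -p "$OUT_DIR"
# Tolerate a missing nvidia-smi so the sketch also runs on non-GPU hosts.
nvidia-smi -q -x > "$OUT_DIR/nvsmi.xml" 2>/dev/null || true
ps -axo user,pid,command > "$OUT_DIR/ps.txt"
echo "snapshots written to $OUT_DIR"
```

You could then run it periodically (for example from cron) and start the container with something like `-v /tmp/gpuutil-redirect:/redirect:ro`, pointing `gpuutil.set_redirect` at `/redirect/nvsmi.xml` and `/redirect/ps.txt`.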
## Notes

1. You can get more detailed GPU info via the `gpuutil.GPUStat` class; for details, just read the code.
2. Since gpuutil uses the `ps` command to get detailed process info, it fully works only on Linux; on Windows, some information may be missing.
3. If you have any trouble, feel free to open an issue.
4. The code is straightforward, so taking a look at it is also a good option when you run into trouble.