Note: This article was originally written for my classmates attending COMP4163 Neural Networks and Deep Learning.
Hi! In the next few chapters we will learn some very basic usage of Linux. Linux is widely used on cloud servers and in high-performance computing (including the GPU server we may use), but it is quite different from Microsoft Windows, which we use every day, so some concepts may seem difficult at first. But don’t worry: if you run into any Linux problem that is not covered in this document, feel free to ask me!
What is Linux
“Linux is not another DOS.”
– Dr. Jefferson Fong
In 1991, Linus Torvalds created the Linux operating system at the University of Helsinki in Finland. What makes Linux so successful is its openness: its source code is open for access and free (both as in “freedom” and as in “free beer”), and it keeps improving thanks to a large worldwide community of contributors (which means we can change Linux too, if we have the ability).
Linux per se is only an operating system kernel, not a complete, usable operating system. (Recall the COMP3033 Operating System course!) People around the world therefore write the surrounding software (a.k.a. user-space programs) and package it together with the kernel. Operating systems built on the Linux kernel that are usable out of the box are called Linux distributions (a.k.a. distros). Well-known Linux distributions include Ubuntu, Fedora, CentOS, Arch Linux, and many more.
The GPU server provided for our use runs Ubuntu 18.04, a stable long-term support (LTS) release.
Log into a Local Linux Operating System
To log into a local Linux operating system, we first need to install one. Alex Ning will cover this in a future tutorial, so we skip it for now.
Log into a Remote Linux Operating System
To log into a remote Linux operating system, we use Secure Shell (`ssh`). We first need an SSH client installed on our local operating system. Here are some options:
- On Microsoft Windows:
  - PuTTY is recommended.
  - A recent version of Windows 10 (1809 and newer) should already have `ssh` installed. To use it, follow these steps:
    - Press <WINDOWS> + X.
    - In the pop-up context menu, click “Windows PowerShell” (or “Command Prompt”).
    - Start using `ssh` as described below.
  - SSH is also built into Git Bash, MSYS2, and Cygwin.
- On macOS: SSH is pre-installed.
- On Linux: SSH is usually pre-installed. If not, ask Google.
  - Linux distributions from the Microsoft Store, which run on the Windows Subsystem for Linux (WSL), are used in the same way as normal Linux operating systems.
- On Android: Termux, ConnectBot.
- On iOS and iPadOS: Termius.
In this document, only the standard `ssh` command-line program is covered. Information about the GPU server is given in the “Introduction to GPU server” document (J. Wu, J. Yu, L. Zheng, et al.).
Next, connect to the server by typing `ssh <name>@<host>`, where:
- `<name>` is our account name (on our GPU server this is our student ID).
- `<host>` is the hostname or IP address of the server.
The following is an example of first login:
    $ ssh email@example.com
    The authenticity of host 'nndl.uiccst.com (172.31.19.19)' can't be established.
    ECDSA key fingerprint is SHA256:oSlAV0yVYODHi6EOLQujjOSVtBg+aYbgd4eRmmj+nMY.
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
    Warning: Permanently added 'nndl.uiccst.com,172.31.19.19' (ECDSA) to the list of known hosts.
    firstname.lastname@example.org's password:
    Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-65-generic x86_64)
    ...
    Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law.
    1630003072@cst-ml-server001:~$
On the first connection, the SSH client asks whether we trust the remote machine; type `yes` to accept. After we enter our password (nothing is shown while typing it), we should see the command-line prompt (`$` means we are currently logged in as a normal user; `#` means the superuser, root).
Look at the text before the cursor. It has the form `<username>@<hostname>:<currdir>`, followed by `$` or `#`:
- `<username>` is the user name.
- `<hostname>` is the name of the connected machine.
- `<currdir>` is the current location. There are some special values:
  - `/` is the root directory (see below).
  - `~` is our home directory; we should put all personal files here.
Files and directories in Linux operating systems (and all UNIX-like systems) are organized as a tree. `/` is called the “root directory”; under it, directories are nodes with children, and files are leaves. This is different from Microsoft Windows: there is no separate tree for each drive, and every file and directory lives somewhere under the single root `/`.
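As a small sketch of how locations in this single tree are named, the commands below visit the two special directories and print where we are. `pwd` (print working directory) is a standard command that is not introduced elsewhere in this document.

```shell
# Absolute paths start at the root directory "/".
cd /
root_location=$(pwd)
echo "$root_location"      # the top of the whole tree

# "~" is shorthand for our home directory.
cd ~
home_location=$(pwd)
echo "$home_location"      # the full path that "~" stands for
```

The second line of output shows the real path hidden behind `~`; its exact value depends on the account.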
To list files and directories in the current location, type:
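    ls    # a brief list
    ll    # a detailed list (alias of `ls -lh`)
    l     # a more detailed list with hidden files listed (alias of `ls -lah`)
    # Note: `ll` and `l` are shell aliases, so their exact definitions
    # depend on the shell configuration of the machine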
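Because `ll` and `l` are aliases that may not exist on every machine, here is a hedged sketch that uses plain `ls` options instead. It works in a throwaway directory created with `mktemp -d`; the file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing in the home directory is touched.
scratch=$(mktemp -d)
cd "$scratch"

touch train.txt .hidden_config   # names starting with "." are hidden
mkdir data

plain_list=$(ls)       # hidden entries are skipped
full_list=$(ls -A)     # -A also lists hidden entries (except "." and "..")
echo "$plain_list"
echo "---"
echo "$full_list"
ls -lh                 # long format with human-readable sizes

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

The second listing contains `.hidden_config`, while the first one does not.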
To go into a directory, type `cd <path>`, where `<path>` is the path to the directory. Examples:

    # Enter the "training" directory in the "data" directory
    # under the current location
    cd data/training

    # Enter the system configuration directory, which is located
    # under the root directory
    cd /etc

    # Go back to the home directory (yes, just type `cd`)
    cd
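The following sketch contrasts relative and absolute paths in a throwaway directory. It also uses two shortcuts that are not mentioned above but are standard: `..` (the parent directory) and `cd -` (jump back to the previous location).

```shell
# Build a small directory tree to walk around in.
scratch=$(mktemp -d)
mkdir -p "$scratch/data/training"

cd "$scratch"          # absolute path: starts with "/"
cd data/training       # relative path: resolved from the current location
deep=$(pwd)

cd ..                  # ".." is the parent directory
parent=$(pwd)

cd - >/dev/null        # "-" returns to the previous location
back=$(pwd)

echo "$deep"
echo "$parent"
echo "$back"

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

The first and third lines printed are identical, because `cd -` undoes the `cd ..` step.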
To create a directory, type `mkdir <name>`, where `<name>` is the name of the directory. The directory is created under the current location; to create it somewhere else, give a path (relative or absolute) instead of a bare name.
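A minimal sketch of the variants, run in a throwaway directory. The directory names are hypothetical, and the `-p` option (create missing parent directories too) is an addition that is not covered above.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir data                 # created under the current location
mkdir data/training        # created inside data/, addressed by a path
mkdir -p runs/2019/exp1    # -p also creates the missing parents runs/ and runs/2019/

created=$(find . -type d | sort)
echo "$created"

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

Without `-p`, the last `mkdir` would fail because `runs/2019` does not exist yet.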
To copy a file or a directory, type:

    cp [-r] <source> <target>

`<source>` is the file we want to copy, and `<target>` is where we want the copy to go. The optional `-r` argument tells `cp` to copy directories recursively; without it, `cp` refuses to copy a directory.
    # Copy train.txt from the current directory to the data directory
    # The trailing slash is not required
    cp train.txt data/

    # Copy the whole directory named "data" to .keras/mnist
    # The trailing slash is not required
    cp -r data/ .keras/mnist
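To see these rules in action without touching real data, the sketch below runs in a throwaway directory. The names `backup.txt`, `data_copy`, and `data_fail` are made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "sample line" > train.txt
mkdir data

cp train.txt data/         # copy into an existing directory
cp train.txt backup.txt    # copy under a new name
cp -r data data_copy       # copy a whole directory (data_copy did not exist before)

# Without -r, cp reports an error and copies nothing.
if cp data data_fail 2>/dev/null; then
    dir_copy_without_r="allowed"
else
    dir_copy_without_r="refused"
fi
echo "$dir_copy_without_r"

copied=$(cat data_copy/train.txt)
echo "$copied"

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

One detail worth knowing: if the target of `cp -r` already exists as a directory, the source directory is placed inside it rather than replacing it.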
To move a file (instead of copying it), use `mv`. Note that `mv` is also used to rename a file or directory:
    # Move train.txt to directory data (the original file will disappear)
    # The trailing slash is not required
    mv train.txt data/

    # Rename directory data to dataset
    mv data dataset
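The same two commands can be tried safely in a throwaway directory. This sketch lists the resulting tree with `find`, which is used here only to show the outcome.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "sample line" > train.txt
mkdir data

mv train.txt data/     # move: train.txt now lives only inside data/
mv data dataset        # rename: the directory keeps its contents

after=$(find . | sort)
echo "$after"

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

The listing shows `./dataset/train.txt` and no trace of the old names.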
To delete a file, type `rm <file>`, where `<file>` is the name of the file we want to delete.
To delete a directory, type `rm -r <dirname>`, where `<dirname>` is the name of the directory we want to delete. The `-r` argument means “delete the directory recursively”; without it, `rm` refuses to delete a directory. Be careful: `rm` deletes permanently, and there is no recycle bin to restore from.
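A short sketch of both forms, again in a throwaway directory so that nothing important can be lost. The file and directory names are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p old_runs/exp1
touch old_runs/exp1/log.txt notes.txt

rm notes.txt               # delete a single file

# Plain rm fails on a directory and leaves it untouched.
if rm old_runs 2>/dev/null; then
    plain_rm="deleted"
else
    plain_rm="refused"
fi
echo "$plain_rm"

rm -r old_runs             # delete the directory and everything inside it
remaining=$(ls -A)         # empty: nothing is left in the scratch directory

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

Since deletion cannot be undone, it is a good habit to run `ls` on the target first and check that it is really what we mean to remove.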
One of the most fascinating things about Linux operating systems is that the system utilities become extremely powerful when combined. Let’s see how to use some of these tools to do cool things.
Suppose there is a big text file named “train.log”, and we want to find text lines containing the string “accuracy”. Type:
    cat train.log | grep "accuracy"
Let’s break this down:
- `cat` concatenates files (not cats!), but if only one file is provided, it just prints it. So `cat train.log` prints the content of `train.log`.
- `|` is a pipe. Recall what we learned in the COMP3033 Operating System course: a pipe connects the standard output of the former command to the standard input of the latter command. So the pipe takes the content of `train.log` and passes it to `grep`.
- `grep "accuracy"` searches its standard input for the string `accuracy`. Whenever it finds a match, it prints the whole line containing the string.
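To try the pipeline without a real training log, the sketch below first writes a tiny made-up `train.log` (real logs will look different) and then chains a few standard tools. `grep -c` (count matching lines) and `tail -n 1` (keep the last line) are additions not covered above.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A made-up four-line log, for illustration only.
printf '%s\n' \
    'epoch 1 loss 0.91' \
    'epoch 1 accuracy 0.62' \
    'epoch 2 loss 0.55' \
    'epoch 2 accuracy 0.81' > train.log

cat train.log | grep "accuracy"     # the pipeline from the text
grep "accuracy" train.log           # same result: grep can read the file itself

match_count=$(grep -c "accuracy" train.log)            # number of matching lines
last_match=$(grep "accuracy" train.log | tail -n 1)    # pipe into a third tool
echo "$match_count"
echo "$last_match"

# Clean up the throwaway directory.
cd /
rm -r "$scratch"
```

The last two lines printed are `2` and `epoch 2 accuracy 0.81`: each tool does one small job, and the pipe glues them together.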
To edit files on the server, use `nano`, which is simple to use. Just type `nano` followed by the path to the file we want to edit. To exit `nano`, press <CTRL> + X.
`vim` can also be used, which is more advanced but hard to use (its modal way of editing dates back to the 1970s). In case we accidentally get stuck in the `vim` editor, remember how to exit it: first press <ESC>, then press capital Z twice to save and exit, or capital Z followed by capital Q to exit without saving.
Sometimes we may want to download files from the server or upload files to it. SSH has file transfer support, so we just use that.
Secure Copy (`scp`) is used to transfer files from and to SSH-enabled servers. Basic usage of this command:

    scp <from> <to>

- `<from>` is the path to the file to be copied:
  - if it’s on our local computer, just type the path to the file
  - if it’s on the server, use the format `<name>@<host>:<path>`:
    - `<name>@<host>` is mostly the same as in `ssh`
    - if `<path>` is not absolute, `scp` treats it as relative to our home directory
- `<to>` is the path to the destination
  - it basically follows the rules above
    $ scp train.txt email@example.com:
    firstname.lastname@example.org's password:
    train.txt                                 100%    0     0.0KB/s   00:00
This copies `train.txt` from the local current directory to 1630003072’s home directory on the server (note that there is nothing after the colon).
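The sketch below spells out the three most common directions of transfer. It is a dry run: the helper `run` only prints each command, so it can be tried without a server. The account name, hostname, and remote paths are hypothetical placeholders, not the real values for our GPU server.

```shell
name="student_id"         # hypothetical account name
host="gpu.example.edu"    # hypothetical hostname

# Dry-run helper: prints the command instead of executing it.
# Replace the body with "$@" to really run the commands.
run() { echo "$*"; }

# Upload a file into the remote home directory (nothing after the colon).
run scp train.txt "${name}@${host}:"

# Download a file from the server into the current local directory (".").
run scp "${name}@${host}:results/model.h5" .

# Upload a whole directory; like cp, scp needs -r for directories.
run scp -r data "${name}@${host}:datasets/"
```

The remote paths in the last two commands are relative, so they are resolved from the home directory on the server.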
There is a more convenient way: use FileZilla, which should be familiar to us. When typing the remote hostname, remember to prefix it with `sftp://`, which selects the file transfer protocol that runs over SSH (SFTP). The port should be the one SSH uses (22 by default).
To see how heavily the GPU server is being used, type `htop` to open a task manager. It shows the CPU usage, memory usage, and more; press `q` to quit.
To see the GPU usage, type `nvidia-smi`. Its output should be straightforward to read.
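When only a few numbers are of interest, `nvidia-smi` can print selected fields as CSV. The sketch below is a minimal example; the choice of fields is mine, and the check at the top only guards against machines without an NVIDIA driver.

```shell
if command -v nvidia-smi >/dev/null 2>&1; then
    # One CSV row per GPU, restricted to the listed fields.
    gpu_report=$(nvidia-smi \
        --query-gpu=index,name,utilization.gpu,memory.used,memory.total \
        --format=csv)
else
    gpu_report="nvidia-smi not found (no NVIDIA driver on this machine)"
fi
echo "$gpu_report"
```

For a continuously refreshing view, `watch -n 1 nvidia-smi` reruns the command every second until we press <CTRL> + C.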