Introduction
Objective
- Tried teaching [Docker][docker] to data scientists
- Had to back up every few minutes to explain underlying concepts
  - "What does it mean to 'mount' a filesystem?"
  - "What does it mean to 'background' a job?"
  - "What is a 'port'?"
- Answer those questions step-by-step so that a web service running in Docker will make sense
What This Is
- Notes and working examples that instructors can use to perform a lesson
- Do not expect novices with no prior Unix experience to be able to learn from them on their own
- Musical analogy
  - These notes are the chord changes and melody
  - We expect instructors to create an arrangement and/or improvise while delivering
- Please see the license for terms of use, the Code of Conduct for community standards, and these guidelines for notes on contributing
Scope
- [Intended audience][persona]
  - Ning did a bachelor's degree in economics and now works as a data analyst for the Ministry of Health
  - They are comfortable working with Unix command-line tools, writing data analysis programs in Python, and downloading data from the web manually
  - Ning wants to understand what happens when they install a package or run a pipeline in the cloud
  - Their work schedule is unpredictable and highly variable, so they need to be able to learn a bit at a time
Prerequisites
- Unix shell commands covered in [this Software Carpentry lesson][sc_shell]:
  - `pwd`; `ls`; `cd`; `.` and `..`; `rm` and `rmdir`; `mkdir`; `touch`; `mv`; `cp`; `tree`; `cat`; `wc`; `head`; `tail`; `less`; `cut`; `echo`; `history`; `find`; `grep`; `zip`; `man`
  - current working directory; absolute and relative paths; naming files; editing with `nano`
  - standard input; standard output; standard error; redirection; pipes
  - `*` and `?` wildcards; shell variables with `$` expansion; `for` loop
- Python for command-line scripting
  - variables; numbers and strings; lists; dictionaries; `for` and `while` loops; `if`/`else`; `with`; defining and calling functions; `sys.argv`, `sys.stdin`, and `sys.stdout`; simple regular expressions; reading JSON data; reading CSV files using [Pandas][pandas] or [Polars][polars]
  - `pip install`
  - `python -m venv` or `conda create`
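As a rough gauge of the Python level assumed above, a learner should be able to read a short script like the following sketch. The function names `add_counts` and `main` and the choice of JSON output are illustrative assumptions, not part of the lesson materials.

```python
import json
import re
import sys


def add_counts(counts, lines):
    """Add the words found in an iterable of lines to the counts dictionary."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            counts[word] = counts.get(word, 0) + 1


def main(filenames, reader=sys.stdin, writer=sys.stdout):
    """Count words in the named files, or in reader if no files are given."""
    counts = {}
    if filenames:
        for filename in filenames:
            with open(filename, "r") as source:
                add_counts(counts, source)
    else:
        add_counts(counts, reader)
    # Write the result as JSON so that another program can read it.
    json.dump(counts, writer, indent=2)
    writer.write("\n")
```

Adding `main(sys.argv[1:])` as the last line would turn this into a command-line script that reads either the files named as arguments or standard input.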
Learning Outcomes
- Explain the difference between shell variables and environment variables and write shell scripts that use each.
- Create a virtual environment and explain what this actually does.
- Create a `requirements.txt` file for [`pip`][pip] and explain version pinning.
- Explain what a filesystem is (disk partitions, inodes, symbolic links) and use `df`, `ln`, and similar commands to explore them.
- Explain what a process is and use commands like `ps` and `kill` to explore and manage them.
- Explain what a job is and use commands like `jobs`, `bg`, and `fg` to manage them.
- Explain what `cron` jobs are and how to create them.
- Explain the difference between a container and a virtual machine.
- Create and manage Docker images.
- Explain what ports are and write Python code that uses sockets and HTTP.
- Explain what certificates are and how they are used to support HTTPS.
- Explain what key pairs are and how they are stored, and create and manage key pairs.
- Explain what IP addresses are and how they are resolved.
- Explain how traditional password authentication works and describe its weaknesses.
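To give a flavor of the first outcome, the sketch below uses Python to show the same distinction that separates shell variables from environment variables: an ordinary variable stays inside the process that defined it, while an environment variable is inherited by child processes. The names `LESSON_DEMO_SETTING`, `local_setting`, and `run_child` are hypothetical and chosen only for this illustration.

```python
import os
import subprocess
import sys

# Hypothetical variable name used only for this demonstration.
DEMO_NAME = "LESSON_DEMO_SETTING"

# Code for the child process: report the environment variable and whether
# the parent's ordinary variable exists there.
CHILD_CODE = (
    "import os\n"
    f"print(os.environ.get('{DEMO_NAME}'))\n"
    "print('local_setting' in globals())\n"
)


def run_child():
    """Start a child Python process and return the lines it prints."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# An ordinary variable belongs to this process only.
local_setting = "parent only"

# An environment variable is copied into every child process.
os.environ[DEMO_NAME] = "visible in child"

print(run_child())
```

The child sees the environment variable's value but reports `False` for `local_setting`, which is roughly how an unexported shell variable behaves when the shell starts another program.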
Setup
- Download the latest release
- Unzip the file in a temporary directory to create:
  - `./site/*.*`: files and directories used in examples
  - `./src/*.*`: shell scripts and Python programs
  - `./out/*.*`: expected output for examples
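One way to confirm that the release unpacked as described is a small check like the sketch below. The function name `missing_dirs` is hypothetical, and the check assumes only that the three directories listed above should exist under the directory where the file was unzipped.

```python
from pathlib import Path

# Directory names taken from the setup list above.
EXPECTED_DIRS = ("site", "src", "out")


def missing_dirs(root):
    """Return the expected subdirectories that are not present under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dirs(".")` from the unzip directory should return an empty list if all three directories are in place.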
Acknowledgments
My thanks to everyone who helped make this tutorial possible:
[% thanks %]