Introduction
Objective
- Tried teaching Docker to data scientists
- Had to back up every few minutes to explain underlying concepts
  - "What does it mean to 'mount' a filesystem?"
  - "What does it mean to 'background' a job?"
  - "What is a 'port'?"
- Answer those questions step-by-step so that a web service running in Docker will make sense
What This Is
- Notes and working examples that instructors can use to perform a lesson
- Do not expect novices with no prior Unix experience to be able to learn from them on their own
- Musical analogy
  - This is the chord changes and melody
  - We expect instructors to create an arrangement and/or improvise while delivering
- Please see the license for terms of use,
the Code of Conduct for community standards,
and the contributing guidelines for how to contribute
Scope
- Intended audience
  - Ning did a bachelor's degree in economics
    and now works as a data analyst for the Ministry of Health
  - They are comfortable working with Unix command-line tools,
    writing data analysis programs in Python,
    and downloading data from the web manually
  - Ning wants to understand what happens when they install a package
    or run a pipeline in the cloud
  - Their work schedule is unpredictable and highly variable,
    so they need to be able to learn a bit at a time
Prerequisites
- Unix shell commands covered in the Software Carpentry Unix shell lesson:
  - pwd; ls; cd; . and ..; rm and rmdir; mkdir; touch; mv; cp; tree;
    cat; wc; head; tail; less; cut; echo; history; find; grep; zip; man
  - current working directory; absolute and relative paths; naming files;
    editing with nano
  - standard input; standard output; standard error; redirection; pipes
  - * and ? wildcards; shell variables with $ expansion; for loops
- Python for command-line scripting
  - variables; numbers and strings; lists; dictionaries;
    for and while loops; if/else; with;
    defining and calling functions;
    sys.argv, sys.stdin, and sys.stdout;
    simple regular expressions; reading JSON data;
    reading CSV files using Pandas or Polars
  - pip install
  - python -m venv or conda create
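As a rough gauge of the Python prerequisite, the sketch below combines several of the items listed above: functions, a for loop, if, a simple regular expression, JSON input, and the sys.argv / sys.stdin / sys.stdout trio. It is only an illustration of the assumed level. The names count_matches and main, the script name count_names.py, and the sample records are hypothetical and are not part of the lesson's files.

```python
import io
import json
import re
import sys


def count_matches(records, pattern):
    """Count the records whose "name" field matches a regular expression."""
    regex = re.compile(pattern)
    count = 0
    for record in records:
        if regex.search(record["name"]):
            count += 1
    return count


def main(argv, stdin, stdout):
    """Read a JSON list of records from stdin and write the match count to stdout."""
    # argv[0] is the script name; the pattern is the first real argument.
    pattern = argv[1] if len(argv) > 1 else "."
    records = json.load(stdin)
    print(count_matches(records, pattern), file=stdout)


# Demonstration with an in-memory stream so the sketch runs without piped input.
# A real script would instead call: main(sys.argv, sys.stdin, sys.stdout)
sample_input = io.StringIO(
    '[{"name": "ward-3"}, {"name": "clinic"}, {"name": "ward-12"}]'
)
main(["count_names.py", r"ward-\d+"], sample_input, sys.stdout)
```

Passing the streams in as parameters, rather than using sys.stdin and sys.stdout directly inside the function, is a design choice that makes the script easy to try out without a shell pipeline. A learner who can read this and predict what it prints is probably ready for the material.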
Learning Outcomes
- Explain the difference between shell variables and environment variables
and write shell scripts that use each.
- Create a virtual environment and explain what this actually does.
- Create a requirements.txt file for pip
and explain version pinning.
- Explain what a filesystem is (disk partitions, inodes, symbolic links)
and use df, ln, and similar commands to explore it.
- Explain what a process is and use commands like ps and kill
to explore and manage them.
- Explain what a job is and use commands like jobs, bg, and fg
to manage them.
- Explain what cron jobs are and how to create them.
- Explain the difference between a container and a virtual machine.
- Create and manage Docker images.
- Explain what ports are and write Python code that uses sockets and HTTP.
- Explain what certificates are and how they are used to support HTTPS.
- Explain what key pairs are and how they are stored, and create and manage key pairs.
- Explain what IP addresses are and how they are resolved.
- Explain how traditional password authentication works and describe its weaknesses.
Setup
- Download the latest release
- Unzip the file in a temporary directory to create:
  - ./site/*.*: files and directories used in examples
  - ./src/*.*: shell scripts and Python programs
  - ./out/*.*: expected output for examples