Contributing
Contributions are very welcome. Please file issues or submit pull requests in our GitHub repository. All contributors will be acknowledged.
In Brief
-
Use
uv venv
anduv pip install -r pyproject.toml
to install the packages required by the helper tools and Python examples. -
The source files for examples are in their section directories along with captured output in
.out
files. -
Makefile
contains the commands used to re-run each example. If you add a new example, please add a corresponding rule inMakefile
. -
Use a level-2 heading for each sub-topic. Use
{: .aside}
for an aside or{: .exercise}
for exercise. -
Please create SVG diagrams using draw.io. Please use 14-point Helvetica for text, solid 1-point black lines, and unfilled objects.
Labels
Name | Description | Color |
---|---|---|
change | something different | #FBCA04 |
feature | new feature | #B60205 |
fix | something broken | #5319E7 |
good first issue | newcomers are always welcome | #D4C5F9 |
talk | question or discussion | #0E8A16 |
task | one-off task | #1D76DB |
Please use Conventional Commits style for pull requests
by using change:
, feature:
, fix:
, or task:
as the first word
in the title of the commit message.
You may also use publish:
if the PR just rebuilds the HTML version of the lesson.
Contributors
Greg Wilson was the co-founder and first Executive Director of Software Carpentry and received ACM SIGSOFT's Influential Educator Award in 2020.
Konstantinos Kitsios is a PhD student at University of Zurich, working on applications of machine learning to software engineering. Previously he worked as a software engineer at Tesla.
Pao Corrales has a PhD in Atmospheric Sciences and works on improving severe weather forecasts in Argentina. She loves teaching science and programming using evidence-based techniques centered on the students and their context.
FAQ
- Why SQL?
- Because if you dig down far enough, almost every data science project sits on top of a relational database. (Jon Udell once called PostgreSQL "an operating system for data science".) SQL's relational model has also been a powerful influence on dataframe libraries like the tidyverse, Pandas, and Polars; understanding the former therefore helps people understand the latter.
- Why McCole?
- The first version of this tutorial used Jekyll because it is the default for GitHub Pages and because its frustrating limitations would discourage contributors from messing around with the template instead of writing content. However, those limitations proved more frustrating than anticipated: in particular, very few data scientists speak Ruby, so previewing changes locally required them to install and use yet another language framework.
- Why Make?
- It runs everywhere, no other build tool is a clear successor, and, like Jekyll, it's uncomfortable enough to use that people won't be tempted to fiddle with it when they could be writing.
- Why hand-drawn figures rather than Graphviz or Mermaid?
- Because it's faster to Just Effing Draw than it is to try to tweak layout parameters for text-to-diagram systems. If you really want to make developers' lives better, build a diff-and-merge tool for SVG: programmers shouldn't have to use punchard-compatible data formats in the 21st Century just to get the benefits of version control.
- Why make this tutorial freely available?
- Because if we all give a little, we all get a lot.
Colophon
-
The colors in this theme are lightened versions of those used in classic Canadian postage stamps. The art in the title is by Danielle Navarro and used with her gracious permission.
-
The CSS files used to style code were obtained from highlight-css; legibility was checked using WebAIM WAVE.
-
Diagrams were created with the desktop version of draw.io.
-
The site is hosted on GitHub Pages.
-
Thanks to the authors of BeautifulSoup, html5validator, ruff, and all the other software used in this project. If we all give a little, we all get a lot.