Tutorials

The following is a list of online self-study tutorials prepared by the SCF and partners. A zip file with all the (non-screencast) materials for each tutorial can be downloaded by following that tutorial's (materials on GitHub) link and using the "Download ZIP" button in the lower right of the GitHub page.

Basics of UNIX

Provides a basic introduction to the UNIX command line (including Linux and the Mac terminal).
Last updated December 2021. Prepared by Chris Paciorek.

Introduction to LaTeX

A quick introduction to LaTeX, a powerful and flexible system for formatting documents, especially those using mathematical notation. Focuses on demonstration using a concrete example.
Last updated August 2015. Prepared by Chris Paciorek.

Dynamic documents with code chunks

A quick introduction to embedding R, bash, and Python code in PDF and HTML documents using R Markdown, LaTeX-based formats (knitr and Sweave), and Jupyter notebooks.
Last updated January 2022. Prepared by Chris Paciorek.

Introduction to git and GitHub

The basics of git, a version control system, and hosting git repositories on GitHub.
Last updated August 2017. Prepared by Jarrod Millman.

Using the bash shell

UNIX utilities, shortcuts, shell scripting, job control, and regular expressions.
Last updated September 2019. Prepared by Jarrod Millman and Chris Paciorek.

String processing

String processing, including regular expressions, in R and Python.
Last updated September 2019. Prepared by Chris Paciorek.

Parallel processing in Python, R, Matlab, and C/C++

How to use threaded linear algebra and basic parallel processing on one or more machines (including use with the Slurm scheduler).
Last updated December 2021. Prepared by Chris Paciorek.

Flexible parallel processing using Dask in Python and future in R

Parallel processing on one or more machines, including using distributed datasets in Dask.
Last updated November 2021. Prepared by Chris Paciorek.

Working with large datasets in SQL, R, and Python

Using databases from R and Python, plus material on packages in R for working with large datasets.
Last updated February 2022. Prepared by Chris Paciorek.

Using make for workflows

How to use make to automate workflows and make them reproducible.
Last updated August 2015. Prepared by Chris Paciorek.

Writing efficient R code

How to assess the speed of your code and write code that will run quickly in R.
Last updated October 2021. Prepared by Chris Paciorek.

Debugging in R

How to use R's debugging tools, handle errors, and avoid bugs.
Last updated August 2021. Prepared by Chris Paciorek.

Last updated December 2021.