Training

We’ve aggregated some training materials on good practices for computational research. Please let us know what else would be useful.

  • Extracting data from PDFs with Tabula

    Tutorial explaining how to extract data from tables in PDFs using Tabula, an open-source project. The tutorial uses Docker.

  • Avian Influenza Nextstrain Build

    Step-by-step instructions for installing Nextstrain and building a pipeline with Snakemake, using an example dataset of publicly available nonhuman H3Nx data from GenBank.

  • Seasonal Influenza Nextstrain Quickstart Guide

    How to build a Nextstrain analysis from GISAID data.

  • Git for Science

    With the rise of digital methods in research, version control systems such as Git serve as modern lab notebooks. Version control platforms (GitHub, GitLab) also facilitate publishing and collaboration.

  • Code Structure: Part 1

    Learn how to write simple, canonical code.

    Part 1 covers the theoretical framework, functional design, the limits of Functional Programming, and some well-known software principles.

  • Code Structure: Part 2

    Learn how to write simple, canonical code.

    Part 2 covers interfaces and a realistic example from modeling populations of pathogens.

  • Pathogen evolution, selection, and immunity

    Trevor Bedford and Sarah Cobey teach a 2.5-day module on pathogen evolution, selection, and immunity for SISMID each July, with an emphasis on modeling and statistics. Our slides and exercises are here. Registration for the course usually starts in January.

  • Digital Validation in Research

    Repeatability, in science and computation, is conceptually very simple. Make conditions the same, as exactly as necessary, and the process will repeat. Drop an apple and it will fall. There are, of course, details that can be omitted. Knowing which details are indispensable is essential to ensuring repeatability.

  • Nextstrain in HPC Environments

    If you can't install Nextstrain on a local machine, cloud host, or another dedicated environment, a traditional HPC environment will work just as well with some adjustments. It is also a chance to learn Snakemake, a useful tool for orchestrating complex jobs.

Suggestions?

If you have feedback on training, ideas for new modules, or collaboration proposals, please let us know.