Reproducible Research Environments

cross language & cross platform

Eli Knaap

San Diego State University

2024-12-03

why

  • use pixi to recreate the exact same environment anywhere (quickly)
  • use quarto to execute and build
  • use docker and devcontainers to work anywhere
    • locally
    • in a virtual machine
    • in a cloud-based container
  • include all necessary tooling for spatial analysis & academic writing (or teaching!)

what (tl;dr)

https://github.com/knaaptime/rypyrx

  • this repo demonstrates a reproducible research project that can (should!) run anywhere, including just in a github codespaces instance (coding on your ipad!)
  • this also works nicely for teaching workshops
  • the bones are just pixi and quarto
  • the magic is the .devcontainer and .github configuration files

end result

setup

this is a template repository; you can run it as-is to get a feel for the workflow, or clone and edit for your needs

working locally

go to wherever you keep your projects on your computer, then clone (or fork, then clone, to run from your own copy)

  • cd projects
  • git clone https://github.com/knaaptime/rypyrx.git
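
a minimal sketch of the same steps wrapped in one shell function. The function name setup_project and the explicit pixi install call are assumptions added for illustration (the bullets above only use cd and git clone); pixi run would also build the environment the first time it is used.

```shell
# hypothetical helper: clone the template and build its pixi environment
setup_project() {
  repo_url="${1:-https://github.com/knaaptime/rypyrx.git}"
  dest="${2:-rypyrx}"

  git clone "$repo_url" "$dest" || return 1

  # assumption: run `pixi install` explicitly so the environment is
  # resolved from pixi.toml / pixi.lock before the first `pixi run`
  (cd "$dest" && pixi install)
}

# usage, from the directory where you keep projects:
#   setup_project
#   setup_project https://github.com/yourgithubname/rypyrx.git my-paper
```

the second usage line covers the fork-then-clone case: pass your fork's URL and, optionally, a different folder name.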

working in a cloud-based container (codespaces)

To open this repository exactly as it appears, just click the button

(you can use the same config on platforms other than github, but this is nice)

working in codespaces

wait for the VM to finish building, then click the plus to open a new terminal

then continue with the pixi run commands

(currently you need pixi run quarto render paper --latex-engine=tectonic or similar because pixi’s sandboxing gets in quarto’s way of finding tinytex 🤷‍♂️)

from your own repository (1)

if you want to start a github codespace from another project (like if you forked your own version of this repo)

from your own repository (2)

working in a local devcontainer (1)

  • install vscode
  • install docker

working in a local devcontainer (2)

  • open vscode
  • click the little blue icon in the bottom left

working in a local devcontainer (3)

click ‘reopen in container’

run

if you need to add/change dependencies, edit the pixi.toml file
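
to give a feel for what such an edit looks like, here is a sketch that writes a made-up manifest to a scratch directory. The project name, platforms, and package list are illustrative assumptions, not the contents of the template's actual pixi.toml.

```shell
# write a hypothetical manifest to a scratch directory
# (illustrative only; not the template's actual pixi.toml)
demo_dir="$(mktemp -d)"

cat > "$demo_dir/pixi.toml" <<'EOF'
[project]
name = "my-paper"
channels = ["conda-forge"]
platforms = ["linux-64", "osx-arm64", "win-64"]

[dependencies]
python = ">=3.11"
jupyterlab = "*"
geopandas = "*"
r-base = "*"
r-irkernel = "*"
# new dependency added by hand
r-sf = "*"
EOF

cat "$demo_dir/pixi.toml"

# alternative: let pixi edit the manifest and refresh the lockfile
#   pixi add r-sf
```

either route should leave the new package recorded in both pixi.toml and the lockfile the next time pixi resolves the environment.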

the notebooks directory

I like jupyter notebooks for doing my main analysis

  • the existing pixi.toml file includes both Python and R kernels
    • you can install R stuff outside conda-forge using the usual install.packages, but you won’t have the lockfile for those packages…
  • the notebooks show how to use both languages and generate output for the same paper

run jupyter

pixi run jupyter lab

edit notebooks

write

use quarto to manage file execution order and generate outputs

the paper directory

this directory stores a standalone quarto project

  • use the _quarto.yml file to include any pieces you need executed
  • i like to split academic drafts into files by section, then use includes to wrap them together
  • when submitting latex to journals, they often require all additional files (figures, bibs) be in the same directory as the main file. That usually creates a ton of files in one directory which i hate, so i use prefixes to differentiate file types

the index.qmd file

I use a single index.qmd file as the main source document, then include each section as a different file
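
a sketch of what that layout can look like, written to a scratch directory so nothing in the real paper folder is touched. The title and section filenames are hypothetical; the include shortcode is standard quarto syntax, and the leading underscore keeps quarto from rendering each section as its own document.

```shell
# write a hypothetical main document to a scratch directory
demo_dir="$(mktemp -d)"

cat > "$demo_dir/index.qmd" <<'EOF'
---
title: "My Paper"
---

{{< include _01-introduction.qmd >}}

{{< include _02-methods.qmd >}}

{{< include _03-results.qmd >}}
EOF

# list the section files that index.qmd pulls in
grep -F 'include' "$demo_dir/index.qmd"
```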

generate the paper

  • edit stuff in the paper directory
  • pixi run quarto render paper

output

the output will be rendered to the default location at paper/_manuscript/

configure any preferences in the _quarto.yml file. Mine are

  • the default pdf file has the same name as the project (rypyrx.pdf in this case)
  • the default html file is called index.html to make it easier to serve
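
a minimal sketch that ties the render step to the output location described above. The helper name build_paper is made up for illustration, and the check assumes the html output is named index.html as in the author's settings.

```shell
# hypothetical helper: render the paper, then confirm the html landed
# where it is expected
build_paper() {
  pixi run quarto render paper || return 1

  out_dir="paper/_manuscript"
  if [ -f "$out_dir/index.html" ]; then
    echo "rendered: $out_dir/index.html"
  else
    echo "no index.html in $out_dir; check _quarto.yml" >&2
    return 1
  fi
}

# usage, from the repository root:
#   build_paper
```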

publish

There is a preconfigured github-action recipe in the .github directory that will serve a static website from the slides directory

(edit the names if you like, but i usually have slides to publish). Anything in this directory will be served, so you can also toss your paper in pdf/html in here when it’s ready for the public

configure the repository

one small config:

go to the repository settings

go to pages settings

use github actions

the config file is already set up. You’re done.

edit the slides

the slides directory is another standalone quarto project

  • edit the index.qmd file in slides
  • pixi run quarto render slides

commit the directory

commit everything you want public in slides, then push it up.

  • cd slides
  • git add .
  • git commit -m 'add slides'
  • git push

everything will be available at https://{{yourgithubname}}.github.io/{{yourprojectname}}
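
the render, commit, and push steps can be sketched as one function that also prints the resulting address. The helper name publish_slides and its two arguments are assumptions for illustration; the commands inside are the ones from the slides, run from the repository root instead of from inside slides.

```shell
# hypothetical helper: render the slides, push them, and print the
# address GitHub Pages will serve them from
publish_slides() {
  user="$1"      # your github name
  project="$2"   # your repository name

  pixi run quarto render slides || return 1

  # same as `cd slides && git add .`, but run from the repository root
  git add slides || return 1
  # note: `git commit` exits non-zero when there is nothing new to commit
  git commit -m 'add slides' || return 1
  git push || return 1

  printf 'https://%s.github.io/%s\n' "$user" "$project"
}

# usage, from the repository root:
#   publish_slides yourgithubname yourprojectname
```

the site only appears at that address once the github action has finished, so expect a short delay after the push.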

more

use quarto extensions to make this look nice.