1.2. Data & Workflows

Learning Goals

  1. Understand the necessity for reproducible workflows.
  2. Experience working with data.
  3. Recognize that tools are needed.
  4. Realize that all of this is a process and that getting started is better than doing nothing.

Working with data

Hey everyone,
I found this cool dataset online and I wanted to check this out. It is attached to this email.

Best wishes, TG

data.csv

Activity: Make a graph
  • Make a plot of the data in data.csv.
  • What do you think you plotted?
    • Give it a title and labels.
  • Modify your graph according to Dr. Gerken’s suggestion.
Reflection Questions:
  • Apply the FAIR principles to data.csv
  • What are other issues with the dataset?
  • How easy would it be to make the exact same figure
    • by another person?
    • by yourself in 2 years?

FAIR(-er) and Open Data

The Global Carbon Budget

Annual Carbon Emissions and their Partitioning
Discussion Questions:
  • How FAIR is this dataset?
  • Why did I make you do this?

Reproducible Workflows

According to Bowers & Voors (2016), reproducible data analysis is programming rather than using a tool like Excel.

Reflection Questions:
  • What do they mean by this?
  • How does your experience during the activity relate?

Next, we will introduce some tools to help you do just that.
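
As a preview of what "programming rather than using a tool like Excel" can look like, here is a minimal Python sketch of the plotting activity written as a script. It rests on assumptions, not on what data.csv actually contains: it assumes the file has a header row and at least two numeric columns. The function names (`load_columns`, `plot_first_two_columns`, `main`) and the output file name are made up for illustration, and the plotting step assumes matplotlib is installed (it ships with Anaconda).

```python
import csv
from pathlib import Path


def load_columns(path):
    """Read a CSV file with a header row into {column name: list of floats}.

    Assumption: every column is numeric. The real data.csv may not be.
    """
    with Path(path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name in reader.fieldnames:
                columns[name].append(float(row[name]))
    return columns


def plot_first_two_columns(columns, title, out_path):
    """Plot the second column against the first and save the figure to disk."""
    # Imported here so that loading the data works even without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display window needed
    import matplotlib.pyplot as plt

    x_name, y_name = list(columns)[:2]
    fig, ax = plt.subplots()
    ax.plot(columns[x_name], columns[y_name])
    ax.set_xlabel(x_name)  # axis labels come from the header row
    ax.set_ylabel(y_name)
    ax.set_title(title)
    fig.savefig(out_path, dpi=150)
    plt.close(fig)


def main(csv_path, out_path):
    """Run the whole workflow: load the data, then save the figure."""
    data = load_columns(csv_path)
    plot_first_two_columns(data, f"Data from {csv_path}", out_path)
```

Calling `main("data.csv", "figure.png")` would then rebuild the figure from scratch. Because every step is written down, another person, or you in two years, can rerun the script and get the same figure, which is hard to guarantee after a series of mouse clicks in a spreadsheet.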

Looking Forward

On Friday, we will start working with some of these tools.

I am asking you to install two pieces of software.

  • GitHub Desktop
  • Anaconda Python with Jupyter

You can find details here. Let’s have a look.