Lesson Plan for Week 7

Objectives

  • We will check in about the semester project. I will ask each group to briefly introduce its research questions and to share one interesting thing about a dataset it found.
  • We are now moving on to more complex data analysis, including satellite and model data. This makes it a good time to do some additional groundwork on how best to do reproducible science.
    • We will talk about APIs and Python environments

Specific learning goals

  • Be able to define what an API is
  • Understand how APIs can be used to access environmental datasets through computer code
  • Describe how APIs contribute to FAIR data and reproducible data analysis pipelines
  • Understand why Python virtual environments are useful and often necessary
  • Be able to create, activate, and save virtual environments using Anaconda or conda from the command line
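
To make the API goals above concrete, here is a minimal sketch of how code can talk to a data API. It assumes a hypothetical endpoint (https://example.org/api/v1/observations) and hypothetical query parameters (station, start, end); none of these come from the course materials, so swap in the details of a real data provider before using it. The idea is that the whole data request is written down as code, so anyone can rerun exactly the same request later.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration; replace with a real data API.
BASE_URL = "https://example.org/api/v1/observations"


def build_request_url(base_url, **params):
    """Combine an endpoint and query parameters into one request URL."""
    # Sorting the parameters makes the URL identical on every run.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body of the response."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Parameter names and values are made up for this example.
    url = build_request_url(
        BASE_URL, station="ABC123", start="2020-01-01", end="2020-12-31"
    )
    print(url)
    # records = fetch_json(url)  # uncomment once BASE_URL points at a real API
```

Because the request lives in a script rather than in a series of clicks on a website, it can be shared, version-controlled, and repeated, which is one way APIs support reproducible analysis pipelines.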

Class Preparation

Before coming to class, please explore the materials that were posted in the Data Analysis Pipelines section of the lecture notes.

Readings and Materials

APIs

Anaconda environments and managing Python packages

The resources below introduce Python packages and explain why using virtual Python environments is best practice when doing data analysis.

  • CU Boulder EarthLab: Python Packages for Earth Data Science
  • CU Boulder EarthLab: Conda Environments
    • Note: Conda is the backbone of Anaconda. Anaconda Navigator provides a graphical user interface (GUI) to conda, which means that rather than working on the command line in the terminal, you can do things by clicking on the screen. The downside is that Anaconda is
      1. very slow compared to conda
      2. owned by a company that limits access (e.g. you had to register to download Anaconda)
  • Carpentries: Working with environments
    • The Carpentries is a group dedicated to improving software development skills for (data) scientists. They have an entire course on working with conda. Their lesson on conda environments is a comprehensive guide to creating, working with, and sharing conda environments.

Planned Agenda

  • Monday:
    • Discussion of first learning reflection
    • Semester project update
    • Group work on practice exercises
      • I have to leave 20 minutes early
  • Wednesday:
    • 2-minute stand-up
      • Problem statement/research questions
      • Dataset
      • What’s next/roadblocks
    • Mini-Lecture on Environments
    • Practice working with Python environments
      • Checklist:
        • I know how to create a new Anaconda environment
        • I know how to activate a Python environment using Anaconda/conda
        • I can execute Python code (e.g. a Jupyter notebook) using a specific environment
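
One way to work through the checklist above is sketched below. The conda commands in the comments are the standard ones for creating, activating, and exporting an environment; the environment name "week7" and the Python version are placeholders chosen for this example. The Python function then reports which environment is actually running the code, which is a quick way to confirm the last checklist item.

```python
import os
import sys

# Typical command-line workflow (run in a terminal, not in Python).
# "week7" is a hypothetical environment name:
#   conda create --name week7 python=3.11
#   conda activate week7
#   conda env export > environment.yml


def describe_environment():
    """Return details about the interpreter that is running this code."""
    return {
        # Name set by `conda activate`; None if no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Folder of the environment the interpreter belongs to.
        "prefix": sys.prefix,
        # Full path of the Python executable.
        "executable": sys.executable,
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

In a Jupyter notebook, the "prefix" and "executable" entries are usually the most reliable indicators, because the notebook kernel may have been started from a different environment than the one active in the terminal.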

Assignments

  • Semester project:
    • Submit a first draft of the problem statement
    • Find and share at least one tabular dataset that you can then explore