Lesson Plan for Week 4

Objectives

We will continue and expand on two themes from week 3. (1) Connecting data and environmental questions. This prepares us for the semester project: if things go well, we will end up with a set of questions that we can use to form project groups. (2) Using the tools that we introduced to start some actual data analysis.

This means we will:

  • General:
    • clarify questions from the previous week
  • Environmental Issues/Data:
    • discuss environmental issues and their relationship to data
    • clarify what is needed to analyze and communicate environmental issues
    • collect questions that can be addressed using data
    • explore sources for environmental datasets
  • Data Analysis:
    • learn how to use GitHub to share code and get updated code
    • explore tabular, structured data as a common form of environmental data
    • use pandas to
      • read tabular data into Python
      • perform exploratory data analysis
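
As a rough preview of the pandas steps listed above, here is a minimal sketch. The small table is invented for illustration (it is not a real dataset), and `io.StringIO` stands in for a CSV file on disk; in class you would pass a file path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Invented example table (NOT real measurements): a few water-quality readings.
CSV_TEXT = """site,date,temperature_c,ph
A,2023-06-01,14.2,7.1
A,2023-06-02,15.0,7.0
B,2023-06-01,12.8,6.8
B,2023-06-02,,6.9
"""

# read_csv accepts a file path or a file-like object; StringIO plays the file here.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

print(df.shape)         # (number of rows, number of columns)
print(df.dtypes)        # data type of each column
print(df.head())        # first rows of the table
print(df.describe())    # summary statistics for the numeric columns
print(df.isna().sum())  # count of missing values per column
```

Checking shape, types, and missing values before computing anything is usually the first step of an exploratory analysis.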

Specific learning goals

  • practice Git/GitHub workflows to retrieve, update, and share data analysis code
  • learn how to read tabular data into Python using pandas and Jupyter notebooks
  • perform exploratory data analysis
  • understand the importance of exploratory data analysis for understanding environmental datasets
  • be able to describe basic features of environmental data and why they matter
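
One way to see why exploratory analysis matters is a placeholder value hiding in a column. The sketch below uses invented numbers and assumes a hypothetical "no data" code of -999; real datasets document their own conventions, so treat this as an illustration only.

```python
import numpy as np
import pandas as pd

# Invented values for illustration; -999 is a hypothetical "no data" code.
depth_cm = pd.Series([12.0, 15.0, -999.0, 14.0, 13.0], name="snow_depth_cm")

# Without looking at the data first, the placeholder distorts the mean.
naive_mean = depth_cm.mean()

# A quick look at the range makes the problem visible.
print(depth_cm.min(), depth_cm.max())

# Mark the placeholder as missing; mean() skips NaN by default.
cleaned = depth_cm.replace(-999.0, np.nan)
clean_mean = cleaned.mean()

print(naive_mean, clean_mean)
```

The two means differ by two orders of magnitude, which is the kind of issue that a summary of the minimum and maximum tends to reveal early.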

Class Preparation

We will get started with developing some workflows for working with data. These will involve Git/GitHub and Jupyter Notebooks. This means that you should:

  • Have the tools installed (see Assignments from last week)
  • Have your GitHub account connected to GitHub Desktop
  • Be able to open a Jupyter notebook
  • Bring your laptop
  • Please also make sure to bring any notes from the previous week as a basis for discussion

Readings

Tools

We are starting to use Python for data analysis. For reading and exploring tabular data, we will be using the pandas library.

The Abernathey (2021) textbook has two chapters that are worth reading as a reference. Our in-class activities are based on the notebooks for these chapters.

  • Python Fundamentals
    • A refresher on the basic features of the Python programming language.
  • Basic Pandas
    • An introduction to pandas Series and DataFrames as the basic organizational structure and how these can be manipulated and processed for doing exploratory analysis and plotting of data.
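
To make the Series and DataFrame terms concrete, here is a small sketch. The values and column names are invented for illustration and are not taken from the textbook notebooks.

```python
import pandas as pd

# Invented values for illustration only.
df = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B"],
        "rainfall_mm": [3.0, 0.0, 12.5, 7.5],
    }
)

# Selecting a single column of a DataFrame gives a Series.
rain = df["rainfall_mm"]
print(type(rain).__name__)

# Filter rows with a boolean mask.
wet_days = df[df["rainfall_mm"] > 0]

# Add a derived column computed from an existing one.
df["rainfall_cm"] = df["rainfall_mm"] / 10

# Group rows by site and average within each group.
mean_by_site = df.groupby("site")["rainfall_mm"].mean()
print(mean_by_site)
```

Selecting, filtering, deriving columns, and grouping cover most of what a first exploratory pass needs.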

Background: Palmer Penguin Data

We will use data on penguins from the Palmer Archipelago. This dataset has become a popular example for environmental analysis because it is comparatively simple to understand and has some nice features for analysis (see e.g. here).

These data were collected from 2007 to 2009 by Dr. Kristen Gorman with the Palmer Station Long Term Ecological Research Program, part of the US Long Term Ecological Research Network. The data originally come from the Environmental Data Initiative (EDI) Data Portal and are available for use under a CC0 license (“No Rights Reserved”) in accordance with the Palmer Station Data Policy.

Planned Agenda

Monday:

  • Questions from previous week
  • Comment on submissions for first learning note
  • Continue with the data discussion and activities from last Wednesday

Wednesday:

  • Begin working with the Palmer Penguin Dataset to get started with data analysis.

Assignments