Lesson Plan for Week 10
Objectives
We continue working with Xarray and introduce some more data processing concepts. We have already encountered aggregation methods such as groupby and rolling averages in pandas.
These operations are just as useful when working with global, gridded environmental data.
For example, because environmental data has annual cycles, it often makes sense to calculate anomalies, a task where groupby helps.
This is one step toward working more in-depth with this kind of data, including fitting models or calculating correlations between variables.
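That groupby-based anomaly workflow can be sketched in a few lines. This is a minimal example on synthetic monthly data; the variable name and values are made up for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly time series with an annual cycle plus a small trend
time = pd.date_range("2000-01-01", periods=120, freq="MS")
values = 10 * np.sin(2 * np.pi * time.month / 12) + 0.01 * np.arange(120)
da = xr.DataArray(values, coords={"time": time}, dims="time", name="temperature")

# Climatology: mean over all years for each calendar month
climatology = da.groupby("time.month").mean()

# Anomaly: deviation of each time step from its month's climatological mean
anomaly = da.groupby("time.month") - climatology
```

Because each month's own mean is subtracted, the annual cycle drops out of `anomaly` and what remains is the trend plus noise.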
Specific learning goals
Technical
- Selecting and subsetting variables from an xarray dataset.
- Plotting xarray data on maps.
- Performing aggregation operations like `.rolling()` or `.groupby()` to process gridded data.
- Using `pooch` to access NOAA Climate Data Products on Amazon Web Services.
Weather and Climate System
- Comprehend the fundamentals of climatologies.
- Calculate an anomaly to a climatology.
- Calculate the rolling mean of the anomaly data to smooth the time series and extract long-term signals/patterns.
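The smoothing goal above can be sketched with `.rolling()`. This uses a synthetic anomaly-like series; the window length is an illustrative choice, not a prescription:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Noisy monthly anomaly-like series
rng = np.random.default_rng(0)
time = pd.date_range("2000-01-01", periods=240, freq="MS")
anom = xr.DataArray(rng.normal(0, 1, 240), coords={"time": time}, dims="time")

# A 12-month centered rolling mean smooths out high-frequency variations
smooth = anom.rolling(time=12, center=True).mean()
```

Note that time steps where the 12-month window is incomplete come back as NaN, so the smoothed series is shorter in practice than the original.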
Class Preparation
Readings and Materials
Background
- Abernathy: Maps in Scientific Python
- We won’t discuss this explicitly, but it provides some background on why working with maps in Python matters and how it is done in practice.
- Climate Match: We will be working through two tutorials from Climate Match, an open source bootcamp for using python for climate science.
Specifically, we will be using materials from Tutorial 4: Understanding Climatology Through Precipitation Data and Tutorial 5: Calculating Anomalies Using Precipitation Data
Please watch the two introductory videos:
Data:
This week also makes use of satellite-observed climate data that is published by NOAA and freely accessible on Amazon AWS.
Climate Match has a good overview of providers of climate and environmental datasets.
Planned Agenda
Monday:
- More xarray: Grouping, Anomaly Calculation, Correlation, …
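The correlation item on Monday's agenda can be previewed with `xr.corr`, which computes the Pearson correlation along a shared dimension. The data here is synthetic, constructed so the two variables correlate strongly:

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(42)
time = np.arange(100)

# Two synthetic variables: y is x plus noise, so they should correlate strongly
x = xr.DataArray(rng.normal(size=100), dims="time", coords={"time": time})
y = x + 0.5 * xr.DataArray(rng.normal(size=100), dims="time", coords={"time": time})

# Pearson correlation along the shared "time" dimension
r = xr.corr(x, y, dim="time")
```

The same call works on gridded data: correlating two (time, lat, lon) arrays along `time` returns a map of correlation coefficients.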
Wednesday:
- Check-In: Python Skills and Resources
- Skills Check
- Check-In: Learning Reflection
- Semester Project Check-In.
Activity:
Motivation
With the upcoming learning reflection in mind, it is a good idea to review all the Python skills we have covered so far.
This follows up on a similar exercise about the tools in Week 9.
Task
With a partner or your team:
- Go through the list and discuss where you have encountered each skill in ISAT 420.
- Where in the course was it covered?
- Does it apply to your semester project?
- Do you know how you would apply it in an exercise?
| Python Area | Skill | I can do this | Where in the course? | Additional Resource |
|---|---|---|---|---|
| General | Import a package | | | |
| | Specify the path of a file | | | |
| | Select multiple files for reading using `glob` | | | |
| Pandas | Select items from a pandas series or dataframe object | | AP | |
| | Calculate basic statistics on a dataframe or series using `.min()`, `.max()`, `.mean()`, ... | | AP | |
| | Selecting data on index using `.loc[]` | | AP | |
| | Select data in a dataframe on a condition | | AP | |
| | Merge data from two different dataframes into a new dataframe | | AP, PDA 5.2 | |
| | Use `df.describe()` and `df.info()` to understand data | | AP, PDA 5.3 | |
| | Read tabular data (e.g. csv) into a dataframe using `pd.read_csv()` | | AP, PDA 6.1 | |
| | Use the `df.plot()` functionality to make simple plots such as scatter plots, histograms, or bar plots | | AP, PDA 8, PDA 9.2 | |
| | Plotting a column in a dataframe using `.plot(y=...)` | | | |
| | Parsing time-series data and using the date as index | | AP | |
| | Selecting data by columns using a list of columns `df[['col1', 'col2']]` | | AP, PDA 5.1, E1 | |
| | Identify and fill missing values in a dataframe | | AP | |
| | Using `df.groupby()` as an aggregation function | | AP | |
| | Using `pd.read_csv()` to read more complex tabular data (i.e. tab-delimited, skipping rows, naming columns) | | AP, PDA 6.1 | |
| | Temporally resampling and aggregating data using `df.resample().mean()` | | AP, E1 | |
| | Performing calculations and assigning results to a new column | | AP | |
| | Adding titles, labels, text, and other features to plots | | | |
| Xarray | Reading a gridded netCDF dataset using `.open_dataset()` and `.open_mfdataset()` | | AX, CM4 | |
| | Exploring dimensions, coordinates, data variables, and attributes of a dataset | | AX, CM4 | |
| | Selecting a variable from a dataset using `ds.<var_name>` | | AX, CM4 | |
| | Selecting data using `.sel()` | | AX, CM4 | |
| | Creating a mapped plot and adding map features | | AX, CM4 | |
| | Calculating statistics like means across named dimensions `.mean(dim=...)` | | AX | |
| | Selecting a slice from a dataset using `slice(<start>, <end>)` to only plot a region | | AX | |
| | Accessing remote data using `s3fs` and `pooch` | | CM4 | |
| | Using `.groupby()` to find the typical behavior (climatology) of environmental data | | AX, CM4 | |
| | Using `.groupby()` and climatologies to find deviations from typical conditions (e.g. anomalies) | | AX, CM4 | |
| | Using `.rolling()` to remove high-frequency variations in environmental data | | AX, CM5 | |
| | Using `.weighted()` to calculate area averages on a sphere | | AX, CM5 |
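The last row of the table, area-weighted averages, deserves a quick sketch: on a latitude/longitude grid, cells shrink toward the poles, so an unweighted mean over-counts high latitudes. Weighting by cos(latitude) corrects this. The grid and field below are synthetic:

```python
import numpy as np
import xarray as xr

# Synthetic global field that increases toward the poles: value = |latitude|
lat = np.linspace(-89.5, 89.5, 180)
lon = np.linspace(0.5, 359.5, 360)
data = xr.DataArray(
    np.tile(np.abs(lat)[:, None], (1, 360)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# Grid-cell area on a sphere scales with cos(latitude)
weights = np.cos(np.deg2rad(data.lat))

unweighted = float(data.mean())
weighted = float(data.weighted(weights).mean(("lat", "lon")))
```

Because the high-valued polar cells are down-weighted, the weighted mean comes out lower than the unweighted one, which is exactly the correction you want for a quantity averaged over the sphere.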
Resources
- [PDA] McKinney W., Python for Data Analysis - Open Edition, 3e, O’Reilly, 2022
- Abernathy, R: Earth and Environmental Data Science, 2021
- [AP]: Chapter: Pandas
- [AX]: Chapter: Xarray Fundamentals
- Earth Lab, Intermediate Earth Data Science Textbook, University of Colorado Boulder, Earth Lab, Updated: 2022, Citation DOI: https://doi.org/10.5281/zenodo.4683910
Semester Project
- Where are you in the project?
- What data do you have?
- What data do you need?
- How can you apply the concepts of ISAT 420 to your data and your questions?
- How should you structure your repository?
Keep notes