4. Environmental Data

Motivation

"A problem defined is a problem half solved." (via quotefancy)

Environmental issues are complex and often hard to understand and even harder to solve. If they were simple, we’d already have solutions!

If we want to be successful, we need the right tools and the right data for our problem.

If you are an ISAT major, you have learned that solving a complex problem starts with defining it.

Once you have defined your problem, you can look for data that will help you solve it.

Specific learning goals

We are working towards the following learning goals:

  • Understand that solving environmental problems requires the right data for the problem
  • Be able to relate environmental data to a research question
  • Become familiar with different sources of environmental data
  • Identify issues with environmental datasets regarding the FAIR principles (Findable, Accessible, Interoperable, Reusable)
  • Understand the complexity and difficulty of working with environmental datasets and gaining access to them

Skills:

  • access various collections of environmental data
  • formulate questions that can be solved using environmental data

Recap: Framework for approaching environmental issues

flowchart LR
  A(Environmental Issue) --> B(Specific Question)
  B --> C(Data Analysis Workflow)
  B1[Environmental Data] --> C
  C --> D(Product)

Figure 1: Schematic representation of the Data Analysis Process

Recap: Open Science and Workflows

Open and reproducible science is a collection of practices (Figure 2) that allow us to easily share our work and collaborate with others [1].

Figure 2: An open science workflow highlighting the roles of data, code, and workflows, as well as stakeholder engagement, in finding solutions for environmental problems. Source: Max Joseph, Earth Lab at University of Colorado, Boulder.

Problem Identification

First things first, we have to define our problem. Without a clear problem definition, we cannot even start to address problems in a meaningful way.

Here is what I want you to do.

Important

Please keep a written electronic record for each of these activities.

I recommend putting everything into a single file.

It would be a good idea to practice your Jupyter notebook and markdown skills and to put everything into an .ipynb file. That way, you can easily add links, images, etc.

Tip

Activity: Going from issue to question
  1. Start with the environmental issue from last week
  2. As a group, introduce each other's issues and the specific questions that you asked.
  3. You can either select two research questions that you discussed or come up with new ones.
    • It helps to be as specific here as possible.
    • You could think about temporal change, a comparison between places, magnitude of impact, …
  4. For each research question, hypothesize what data (e.g., environmental, climate, social) you would need to address it
    • Think about specific measurements or observations that would be needed
    • Where would these need to be?
    • How often and over what period of time?
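If you keep your record in a Jupyter notebook, one way to stay organized is to write each research question and its hypothesized data needs down in a fixed structure. The sketch below is only a suggestion: the class names, the field names, and the example entry are all hypothetical, and you should adapt them to your own issue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataNeed:
    """One measurement or observation needed to answer a research question."""
    variable: str   # what would be measured or observed
    location: str   # where it would need to be measured
    frequency: str  # how often
    period: str     # over what period of time


@dataclass
class ResearchQuestion:
    """An environmental issue, one specific question, and the data it needs."""
    issue: str
    question: str
    needs: List[DataNeed] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the record as Markdown text for a notebook cell."""
        lines = [
            f"**Issue:** {self.issue}",
            "",
            f"**Question:** {self.question}",
            "",
            "| Variable | Where | How often | Period |",
            "| --- | --- | --- | --- |",
        ]
        for need in self.needs:
            lines.append(
                f"| {need.variable} | {need.location} "
                f"| {need.frequency} | {need.period} |"
            )
        return "\n".join(lines)


# Hypothetical example entry, for illustration only
rq = ResearchQuestion(
    issue="Urban heat",
    question="Have summer night-time temperatures in my city risen since 1990?",
    needs=[
        DataNeed(
            variable="daily minimum air temperature",
            location="city weather stations",
            frequency="daily",
            period="1990 to present",
        )
    ],
)
print(rq.to_markdown())
```

Pasting the printed text into a Markdown cell gives you a small table that you can extend as your group refines the question.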

Note

We should acknowledge here that we are doing quite a bit of speculation about our environmental issue.

When we start thinking about semester projects that deal with an environmental issue, you will need to do some background research on whether your problem statement and data needs make sense.

But for now, this is okay as a thought exercise.

Finding and Obtaining Environmental Data

There is a deluge of environmental data, and it is easy to get overwhelmed by the sheer amount of information and by the question of whether it is useful and reliable.

There are many good resources that can provide starting points on where to find data related to environmental problems.

Starting points

Note

It is easy to get overwhelmed by all these resources, and I don’t expect you to review all of them.

I invite you to review a few of them that you think might relate to your environmental issue and research questions.

  • Many universities have environmental data research guides. For example, Yale Library maintains a Public Environmental Data resource page with General Environmental Resources and links to Public Environmental Data organized by theme.

Screenshot of organized public environmental datasets on the Yale Library website

  • Government agencies and international organizations maintain data portals. Here are a few examples:

    • NASA Data Pathfinders: A collection of guided resources to find NASA datasets organized by environmental issues.
    • NOAA OneStop: The National Oceanic and Atmospheric Administration’s data search platform, containing data on weather, climate, fisheries, coasts, and oceans.
    • DATA.gov: The open data portal of the U.S. Federal Government. All federal government agencies and many states and cities publish their data there. You can filter the data catalog for environmental topics.
    • EPA Data: The U.S. Environmental Protection Agency’s landing portal for EPA data. Note that the actual data are archived on DATA.gov.
  • Academic journals specific to environmental data:

    • Scientific Data
      • A journal for publishing scientific datasets (all topics)
      • There is a search function that lets you filter by topic area (e.g. Climate Sciences)
    • Earth System Science Data
      • An open-access journal for documenting and publishing datasets from observations of the Earth System
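Some of these portals can also be searched from code instead of through the website. The sketch below assumes that the DATA.gov catalog exposes the standard CKAN `package_search` action at the address stored in `CATALOG_API`; that address, the helper names, and the shape of the response are assumptions that you should check against the portal's current documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the DATA.gov catalog serves the standard CKAN search action here.
CATALOG_API = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query: str, rows: int = 5) -> str:
    """Build a search URL using the CKAN 'q' (query) and 'rows' (limit) parameters."""
    return f"{CATALOG_API}?{urlencode({'q': query, 'rows': rows})}"


def summarize(response: dict) -> list:
    """Pull each dataset's title and file formats out of a CKAN-style response."""
    summaries = []
    for dataset in response.get("result", {}).get("results", []):
        formats = sorted(
            {
                resource.get("format", "")
                for resource in dataset.get("resources", [])
                if resource.get("format")
            }
        )
        summaries.append({"title": dataset.get("title", ""), "formats": formats})
    return summaries


def search(query: str, rows: int = 5) -> list:
    """Query the catalog over the network and return the summarized results."""
    with urlopen(build_search_url(query, rows), timeout=30) as resp:
        return summarize(json.load(resp))


# Example call (requires internet access; the query text is hypothetical):
# for hit in search("stream temperature"):
#     print(hit["title"], hit["formats"])
```

Looking at titles and file formats first is a quick way to judge whether a dataset is worth opening in the browser for a closer look.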

Tip

Activity: Finding data
  1. Review some of the resources that you think are most relevant to your problem. (Hint: The NASA Data Pathfinders are well organized and give a good overview of different problems.)
  2. Try to find data that are relevant to your specific research question using these resources. As you research, keep a record of:
    • Relevant datasets
      • What is measured in the dataset?
      • Is there information about temporal and spatial coverage, resolution, …?
    • Are you finding what you are looking for?
    • Could you access, download, and use these datasets?
      • How can you access or download the data?
      • What is the file format? Can you open it, or what software do you need?
  3. Overall Reflection:
    • Compare the data sources you visited.
      • What made them easy or difficult to use?
      • Who do you think is the target audience?
      • Are they effective in reaching their audience?
    • What was the most difficult or confusing part of this exercise?
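To keep the record of datasets consistent, you could log each one with the same fields and note which tool is likely to open its file format. The following is a minimal sketch using only the Python standard library. The field names and the example entry are hypothetical. The reader functions named in `READERS` exist in their libraries, but whether they suit a particular file is something you will have to test.

```python
import csv
import io
from pathlib import PurePosixPath

# Suggested readers by file extension. This is a starting point, not a
# complete list, and the libraries must be installed separately.
READERS = {
    ".csv": "pandas.read_csv",
    ".xlsx": "pandas.read_excel",
    ".json": "json.load",
    ".nc": "xarray.open_dataset",
    ".tif": "rasterio.open",
    ".shp": "geopandas.read_file",
    ".geojson": "geopandas.read_file",
}

FIELDS = ["dataset", "source", "measures", "coverage", "file", "reader"]


def suggest_reader(filename: str) -> str:
    """Return a likely reader function for a file, based on its extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    return READERS.get(suffix, "unknown: check the dataset documentation")


def log_to_csv(entries) -> str:
    """Turn a list of dataset notes into CSV text, adding a suggested reader."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        row = dict(entry)
        row["reader"] = suggest_reader(entry["file"])
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical entry, for illustration only
entries = [
    {
        "dataset": "Example stream gauge record",
        "source": "DATA.gov",
        "measures": "daily discharge",
        "coverage": "2000-2020, one station",
        "file": "gauge_daily.CSV",
    }
]
print(log_to_csv(entries))
```

An entry whose reader comes back as unknown is a useful flag for the reflection questions: it marks a dataset where access or file format may be a barrier.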

Note

Deliverable:

There is nothing to submit, but you should keep a written record that we can continue to work with, both in class discussion and later when analyzing data. This would also be a good basis for a learning note.

Footnotes

  1. CU Boulder’s Intro to Earth Data Science (2020) textbook, Section 1: Introduction to Open Reproducible Workflows