Tools that we will be using in ISAT 420
Motivation
There are many environmental issues that we need to understand and solve as a society.
To do so, we need to make use of all the environmental data that is being collected.
Overall, the goal is to have reliable, reproducible, and open science for understanding and solving these environmental issues.
The entire process can be visualized as a flowchart for environmental data analysis that connects collected environmental data to decision making (Figure 1):
Figure 1: Environmental Data Analysis Flowchart; Credit: NumberAnalytics
We will be practicing this workflow in class and will learn to apply a set of tools to do so.
For this, we will follow in the footsteps of the Pangeo community, which has built an entire Python-based ecosystem for open science, originally developed by climate scientists (Figure 2):
Figure 2: Pangeo – A community for open, reproducible, scalable geoscience
Tools
Selecting the right tools for the job is important. With the wrong tools, you might still get the job done, but things will be more cumbersome (Figure 3).
The free tools we will be using for data analysis are:
- Python has become the de facto standard programming language for data analysis and data science.
- Jupyter Notebooks allow you to create documents that combine Python code, formatted text, images, and visualizations in your web browser.
- Anaconda is an ecosystem for managing your Python installation and supporting packages.
- GitHub with GitHub Desktop lets us collaborate on and share code, and helps ensure that our data analysis is reproducible.
- Markdown is a lightweight markup language that uses plain-text symbols for formatting. It is used in GitHub and Jupyter to format text.
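To give a flavor of how these tools fit together, here is a minimal sketch of the kind of Python code cell you might run in a Jupyter Notebook. The temperature values are made up purely for illustration, and the function name `summarize` is hypothetical. The sketch uses only the Python standard library, so it should run in any Python installation.

```python
from statistics import mean

# Made-up daily stream temperatures in degrees Celsius (illustrative values only,
# not real measurements)
temperatures_c = [14.2, 15.1, 13.8, 16.4, 15.0]


def summarize(values):
    """Return the minimum, mean, and maximum of a list of numbers."""
    return {"min": min(values), "mean": mean(values), "max": max(values)}


# Compute the summary and print it with one decimal place
summary = summarize(temperatures_c)
print(f"min={summary['min']:.1f}, mean={summary['mean']:.1f}, max={summary['max']:.1f}")
```

In a notebook, a Markdown cell above this code cell could describe where the data came from, and committing the notebook to GitHub would let others rerun the same analysis.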