Semester Project
Timeline & Steps
| Step | Target Date |
|------|-------------|
| | Week 6 |
| | Week 7 |
| | Week 7 |
| | Week 8+ |
Requirements
Let me know as early as possible if you run into technical issues or need help.
Your semester project should
- report your results in a Jupyter notebook that combines text, figures, and the code that generates these figures.
- make use of at least two datasets (observations or models)
  - describe these datasets and document how the data was acquired
- be placed in the Semester Project GitHub repository containing
  - a directory structure as discussed in class. At a minimum, there should be folders for

        -Main_Repository
        |
        |- Data
        |- Code
        |- ProjectReport
        |- OtherMaterials
        |- Documentation
        |- <folders as needed>

  - an environment.yml file to document your computational environment. See instructions here for how to create this file with Anaconda Navigator.
  - code for all analysis (including any analysis not contained in the final report, such as exploratory analysis or code to download data). Place these into the Code directory.
  - a README.md file to explain the contents of the GitHub repository and its purpose.
  - the data used or a description of the data and how it was acquired/downloaded (in the Data directory)
  - other files, such as working docs, moved into the OtherMaterials folder
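The folder layout above can be created by hand, but a small script keeps it consistent across group members. The following is a minimal sketch using only the Python standard library. The function name `scaffold`, the `.gitkeep` placeholder files, and the README text are illustrative assumptions, not part of the course requirements.

```python
from pathlib import Path

# Folder names taken from the minimum structure listed above; extend as needed.
FOLDERS = ["Data", "Code", "ProjectReport", "OtherMaterials", "Documentation"]


def scaffold(root):
    """Create the project folders and a placeholder README.md under root.

    Returns the sorted names of the directories found under root.
    """
    root = Path(root)
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # Git does not track empty directories, so add an empty placeholder
        # file (".gitkeep" is a common convention, assumed here).
        (folder / ".gitkeep").touch()

    readme = root / "README.md"
    if not readme.exists():
        # Hypothetical starter text; replace with a real project description.
        readme.write_text("# Semester Project\n\nDescribe the repository here.\n")

    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Calling `scaffold("Main_Repository")` from the folder where the repository was cloned creates any missing folders and leaves existing files untouched, so it is safe to run more than once.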
A sample report structure is found in this document
Grading Rubric
Because you will present the main results in class, the grading rubric emphasizes criteria related to data management, methods, and background information.
Formal Requirements (20%)
- Uploaded/submitted via GitHub
- Uses a markdown or .ipynb format
- GitHub directory has adequate directory structure
- GitHub directory contains a Readme.md file containing information about the project
Report Content:
Note: Consult the Semester Project Outline for details.
- Problem Introduction (25%):
  - adequately introduces the problem using a problem statement
  - provides sufficient background using external references (with References Cited section) for a scientifically minded person to understand the problem space
- Data and Methods (30%):
  - describes the datasets including sources, identifiers, and links
  - provides background about datasets
    - including exploratory analysis
  - outlines your data processing including how and why it was done
- Results, Discussion, and Conclusion (25%):
  - the results in the report support the presentation results, ideally expanding on the presentation
    - provide evidence for relationships between variables, groups, etc.
  - the discussion expands on the discussion presented in the presentation, especially regarding data aspects and future implications of results
  - a logical conclusion is presented
Task: Problem Statement
The problem statement should provide a clear and concise description of the issue that will help guide your research and analysis. It should include the following components:
Based on your topic:
- Name the core problem you want to investigate.
- Provide context for the problem:
  - Who is affected by this problem? This could include specific groups of people, ecosystems, or other stakeholders.
  - What are the specific harms that are occurring or could occur as a result of this problem? This could include health impacts, economic impacts, environmental impacts, etc.
  - Why is it important to address this problem?
- Formulate at least two specific research questions that arise from the problem statement and that you want to answer with your project. These should be specific and focused, and they should be answerable with data. This means there should be a quantitative aspect to the question that can be addressed.
The following is a good template for writing a concise problem statement (adapted from Dr. Papadakis; ISAT 491).
> The purpose of this project is to [A]. Because of [B] and [C]. We expect our project to result in [D]. Which will allow [E].
- A: Your environmental issue
- B: Broad Background
- C: Harms/ Impacts to stakeholders/ ecosystems
- D: Your specific research questions
- E: Any next steps
- What data are needed?
  - variables
  - time periods
  - spatial/temporal resolution
- Check initial data sources
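When checking a candidate data source against the points above, it helps to summarize which variables it contains, which period it covers, and at what time step. The following is a rough sketch with pandas. The sample data, the column names (`date`, `temp_c`, `precip_mm`), and the helper `summarize` are made up for illustration; a real file would be read from disk instead of from a string.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a downloaded file; the column names
# and values are invented for illustration only.
SAMPLE_CSV = """date,temp_c,precip_mm
2020-01-01,1.5,0.0
2020-01-02,2.1,3.2
2020-01-03,,1.1
2020-01-04,0.4,0.0
"""


def summarize(df, time_col):
    """Report variables, covered period, typical time step, and missing values."""
    times = df[time_col].sort_values()
    return {
        "variables": [c for c in df.columns if c != time_col],
        "start": times.iloc[0],
        "end": times.iloc[-1],
        # Median difference between consecutive time stamps
        "median_step": times.diff().median(),
        # Number of missing values per column
        "missing": df.isna().sum().to_dict(),
    }


df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"])
summary = summarize(df, "date")
for key, value in summary.items():
    print(f"{key}: {value}")
```

If the reported period or time step does not match what the research questions need, that is a sign to look for another source or to adjust the questions early.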
Exploratory Data Analysis
Here are some specific steps and suggestions:
- With pandas and xarray (will be introduced soon) you should be able to load most datasets.
  - If you come across a format you don’t know or don’t know how to open, please let me know.
- Download data ASAP and check whether you can open the datasets.
  - Document exactly what you download for reproducibility.
  - Try to download only what you need. Gridded data in particular can get large very quickly.
  - If datasets are less than ~20 MB each, put them on GitHub in a Data directory.
  - If they are larger, put them into cloud storage and provide links in the repository.
- Create initial plots and plan your analysis.
  - Are there issues with the data?
  - What steps do you need to take to conduct your analysis?
  - Do you need to modify research questions?
  - …
  - I will discuss your initial plots and analysis plan with your groups.
- Make sure to have all code on GitHub
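One way to document exactly what was downloaded, and to decide where each file should live, is to write a small manifest next to the data. The following sketch uses only the Python standard library. The manifest fields, the function names, and the use of a SHA-256 checksum are assumptions for illustration; the 20 MB limit mirrors the approximate threshold given above.

```python
import hashlib
import json
from datetime import date
from pathlib import Path

SIZE_LIMIT_MB = 20  # approximate threshold from the guidance above


def sha256_of(path, chunk_size=1 << 20):
    """Compute a checksum in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_file(path, source_url):
    """Return a provenance record for one downloaded file."""
    path = Path(path)
    size_mb = path.stat().st_size / 1e6
    return {
        "file": path.name,
        "source": source_url,  # where the file was downloaded from
        "recorded": date.today().isoformat(),
        "size_mb": round(size_mb, 3),
        "sha256": sha256_of(path),
        # Files at or above the limit belong in cloud storage, with a link
        # in the repository instead of the file itself.
        "store_on_github": size_mb < SIZE_LIMIT_MB,
    }


def write_manifest(records, manifest_path):
    """Save all records as JSON so the download can be reproduced later."""
    Path(manifest_path).write_text(json.dumps(records, indent=2))
```

A typical use would be to call `describe_file` once per downloaded file, passing the URL it came from, and then `write_manifest` to save the list as, for example, a JSON file in the Data directory.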