Commit

Update formatting for exercises
davewhipp committed Jun 4, 2024
1 parent 94d0117 commit 8a8c96c
Showing 2 changed files with 25 additions and 25 deletions.
24 changes: 12 additions & 12 deletions source/part1/chapter-03/md/04-exercises.md
@@ -15,27 +15,27 @@ jupyter:
# Exercises


-## Exercise 3.1
+### Exercise 3.1

In this exercise your task is to open and explore a NOAA weather data file using Pandas. The data file name is 6153237444115dat.csv and it is located in the data folder (*add link*). An overview of the tasks in this exercise:

- Import the Pandas module
- Read the data using Pandas into a variable called data
- Calculate a number of basic statistics from the data

-### Problem 1 - Read the file and clean it
+#### Problem 1 - Read the file and clean it

Import the pandas module and read the weather data into a variable called `data`. Print the first five rows of the data file.

-### Problem 2 - Basic characteristics of the data
+#### Problem 2 - Basic characteristics of the data

Based on the `data` DataFrame from Problem 1, answer the following questions:

1. How many rows are there in the data?
2. What are the column names?
3. What are the datatypes of the columns?
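
The three questions above map onto a few basic DataFrame attributes. The following is a minimal sketch, assuming a tiny made-up table in place of the real weather file; the station codes and temperatures are illustrative values, not data from 6153237444115dat.csv.

```python
import pandas as pd

# Made-up stand-in for the weather data (hypothetical values only).
data = pd.DataFrame(
    {
        "USAF": [1001, 1001, 1002],
        "TEMP": [31.0, None, 27.5],
    }
)

n_rows = len(data)                 # number of rows
column_names = list(data.columns)  # column labels as a plain list
column_types = data.dtypes         # one dtype per column

print(n_rows)
print(column_names)
print(column_types)
```

With the real file, the same three expressions would be applied to the DataFrame returned by `pd.read_csv()`.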

-### Problem 3 - Descriptive statistics
+#### Problem 3 - Descriptive statistics

Based on the `data` DataFrame from Problem 1, answer the following questions:

@@ -44,7 +44,7 @@ Based on the `data` DataFrame from Problem 1, answer the following questions:
- How many unique stations exist in the data (use the `USAF` column)?


-## Exercise 3.2
+### Exercise 3.2

In this exercise, you will clean the data from our data file by removing no-data values, convert temperature values in Fahrenheit to Celsius, and split the data into separate datasets using the weather station identification code. We will start this problem by cleaning and converting our temperature data. An overview of the tasks in this exercise:

@@ -54,14 +54,14 @@ In this exercise, you will clean the data from our data file by removing no-data
- Divide the data into separate DataFrames for the Helsinki Kumpula and Rovaniemi stations
- Save the new DataFrames to CSV files

-### Problem 1 - Read the data and remove NaN values
+#### Problem 1 - Read the data and remove NaN values

The first step for this problem is to read the data file 6153237444115dat.csv into a variable `data` using pandas and clean it a bit:

- Select the columns `USAF, YR--MODAHRMN, TEMP, MAX, MIN` from the `data` DataFrame and assign them to a variable `selected`
- Remove all rows from `selected` that have NoData in the column `TEMP` using the `dropna()` function
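
The two steps above can be sketched as follows. This is an illustration only: the table is made up, the `DIR` column is a hypothetical extra column, and it assumes missing values have already been parsed as NaN when the file was read (how NoData is encoded in the real file is not shown here).

```python
import pandas as pd

# Made-up stand-in for the weather data (hypothetical values only).
data = pd.DataFrame(
    {
        "USAF": [1001, 1001, 1002],
        "YR--MODAHRMN": [201705010000, 201705010020, 201705010000],
        "DIR": [180, 190, 200],  # hypothetical extra column that is not selected
        "TEMP": [31.0, None, 27.0],
        "MAX": [None, None, 30.0],
        "MIN": [None, None, 25.0],
    }
)

# Keep only the columns of interest.
selected = data[["USAF", "YR--MODAHRMN", "TEMP", "MAX", "MIN"]]

# Drop rows where TEMP is missing; NaN in MAX or MIN is left alone.
selected = selected.dropna(subset=["TEMP"])

print(selected)
```

Passing `subset=["TEMP"]` matters here: calling `dropna()` with no arguments would also drop rows that are missing only `MAX` or `MIN`.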

-### Problem 2 - Convert temperatures to Celsius
+#### Problem 2 - Convert temperatures to Celsius

Convert the temperature values from Fahrenheit to Celsius:

@@ -71,7 +71,7 @@ Convert the temperature values from Fahrenheit to Celsius:
- Round the values in the `Celsius` column to have 0 decimals (do not create a new column, update the current one)
- Convert the `Celsius` values into integers (do not create a new column, update the current one)
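
One possible way to carry out the conversion, rounding, and integer cast is sketched below. The helper name `fahr_to_celsius` and the sample temperatures are assumptions made for illustration, and the sketch assumes the `TEMP` column holds Fahrenheit values.

```python
import pandas as pd


def fahr_to_celsius(temp_fahrenheit):
    """Convert Fahrenheit to Celsius (hypothetical helper for this sketch)."""
    return (temp_fahrenheit - 32) / 1.8


# Made-up Fahrenheit temperatures (illustrative values only).
selected = pd.DataFrame({"TEMP": [32.0, 50.0, 69.8]})

# Create the Celsius column, then update it in place twice:
# first round to 0 decimals, then cast to integers.
selected["Celsius"] = fahr_to_celsius(selected["TEMP"])
selected["Celsius"] = selected["Celsius"].round(0)
selected["Celsius"] = selected["Celsius"].astype(int)

print(selected)
```

Rounding before the cast is deliberate: `astype(int)` truncates toward zero, so casting without rounding first would turn 20.9 into 20 rather than 21.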

-### Problem 3 - Select data and save to disk
+#### Problem 3 - Select data and save to disk

Divide the data in `selected` into two separate DataFrames:

@@ -83,22 +83,22 @@ Divide the data in `selected` into two separate DataFrames:
- Repeat the same procedures and save the `rovaniemi` DataFrame into a file `Rovaniemi_temps_May_Aug_2017.csv`.


-## Exercise 3.3
+### Exercise 3.3

In this Exercise, we will explore our temperature data by comparing spring temperatures between Kumpula and Rovaniemi. To do this we'll use some conditions to extract subsets of our data and then analyse these subsets using basic pandas functions. Notice that in this exercise, we will use data saved from the previous Exercise (2.2.6), hence you should finish that Exercise before this one. An overview of the tasks in this exercise:

- Calculate the median temperatures for Kumpula and Rovaniemi for the summer of 2017
- Select temperatures for May and June 2017 in separate DataFrames for each location
- Calculate descriptive statistics for each month (May, June) and location (Kumpula, Rovaniemi)

-### Problem 1 - Read the data and calculate basic statistics
+#### Problem 1 - Read the data and calculate basic statistics

Read in the CSV files generated in Exercise 2.2.6 to the variables `kumpula` and `rovaniemi` and answer the following questions:

- What was the median Celsius temperature during the observed period in Helsinki Kumpula? Store the answer in a variable `kumpula_median`.
- What was the median Celsius temperature during the observed period in Rovaniemi? Store the answer in a variable `rovaniemi_median`.
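
The median calculation itself is a single method call per station. The sketch below uses small in-memory DataFrames with made-up `Celsius` values so that it runs on its own; in the exercise, `kumpula` and `rovaniemi` would instead come from reading the saved CSV files.

```python
import pandas as pd

# Made-up temperatures standing in for the saved station data
# (illustrative values only).
kumpula = pd.DataFrame({"Celsius": [10, 12, 15, 9, 14]})
rovaniemi = pd.DataFrame({"Celsius": [4, 8, 6, 11]})

# Median of the Celsius column for each station.
kumpula_median = kumpula["Celsius"].median()
rovaniemi_median = rovaniemi["Celsius"].median()

print(kumpula_median, rovaniemi_median)
```

For an even number of observations, `median()` returns the mean of the two middle values, which is why the result can be a non-integer even when all inputs are integers.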

-### Problem 2 - Select data and compare temperatures between months
+#### Problem 2 - Select data and compare temperatures between months

The median temperatures above consider data from the entire summer (May-Aug), hence the differences might not be so clear. Let's now find out the mean temperatures from May and June 2017 in Kumpula and Rovaniemi:

@@ -108,7 +108,7 @@ The median temperatures above consider data from the entire summer (May-Aug), he
- Does there seem to be a large difference in temperatures between the months?
- Is Rovaniemi a much colder place than Kumpula?

-### Problem 3 - Parse daily temperatures by aggregating data
+#### Problem 3 - Parse daily temperatures by aggregating data

In this problem, the aim is to aggregate the hourly temperature data for Kumpula and Rovaniemi weather stations to a daily level. Currently, there are at most three measurements per hour in the data, as you can see from the YR--MODAHRMN column:

26 changes: 13 additions & 13 deletions source/part1/chapter-03/nb/04-exercises.ipynb
@@ -13,27 +13,27 @@
"id": "59979bb7-98eb-423e-9893-6261dd9a43e1",
"metadata": {},
"source": [
"## Exercise 3.1\n",
"### Exercise 3.1\n",
"\n",
"In this exercise your task is to open and explore a NOAA weather data file using Pandas. The data file name is 6153237444115dat.csv and it is located in the data folder (*add link*). An overview of the tasks in this exercise:\n",
"\n",
"- Import the Pandas module\n",
"- Read the data using Pandas into a variable called data\n",
"- Calculate a number of basic statistics from the data\n",
"\n",
"### Problem 1 - Read the file and clean it\n",
"#### Problem 1 - Read the file and clean it\n",
"\n",
"Import the pandas module and read the weather data into a variable called `data`. Print the first five rows of the data file.\n",
"\n",
"### Problem 2 - Basic characteristics of the data\n",
"#### Problem 2 - Basic characteristics of the data\n",
"\n",
"Based on the `data` DataFrame from Problem 1, answer to following questions:\n",
"\n",
"1. How many rows is there in the data?\n",
"2. What are the column names?\n",
"3. What are the datatypes of the columns?\n",
"\n",
"### Problem 3 - Descriptive statistics\n",
"#### Problem 3 - Descriptive statistics\n",
"\n",
"Based on the `data` DataFrame from Problem 1, answer to following questions:\n",
"\n",
Expand All @@ -47,7 +47,7 @@
"id": "6cca621b-1faa-4e66-b649-e47f639c8866",
"metadata": {},
"source": [
"## Exercise 3.2\n",
"### Exercise 3.2\n",
"\n",
"In this exercise, you will clean the data from our data file by removing no-data values, convert temperature values in Fahrenheit to Celsius, and split the data into separate datasets using the weather station identification code. We will start this problem by cleaning and converting our temperature data. An overview of the tasks in this exercise:\n",
"\n",
@@ -57,14 +57,14 @@
"- Divide the data into separate DataFrames for the Helsinki Kumpula and Rovaniemi stations\n",
"- Save the new DataFrames to CSV files\n",
"\n",
"### Problem 1 - Read the data and remove NaN values\n",
"#### Problem 1 - Read the data and remove NaN values\n",
"\n",
"The first step for this problem is to read the data file 6153237444115dat.csv into a variable `data` using pandas and cleaning it a bit:\n",
"\n",
"- Select the columns `USAF, YR--MODAHRMN, TEMP, MAX, MIN` from the `data` DataFrame and assign them to a variable `selected`\n",
"- Remove all rows from `selected` that have NoData in the column `TEMP` using the `dropna()` function\n",
"\n",
"### Problem 2 - Convert temperatures to Celsius\n",
"#### Problem 2 - Convert temperatures to Celsius\n",
"\n",
"Convert the temperature values from Fahrenheits to Celsius:\n",
"\n",
Expand All @@ -74,7 +74,7 @@
"- Round the values in the `Celsius` column to have 0 decimals (do not create a new column, update the current one)\n",
"- Convert the `Celsius` values into integers (do not create a new column, update the current one)\n",
"\n",
"### Problem 3 - Select data and save to disk\n",
"#### Problem 3 - Select data and save to disk\n",
"\n",
"Divide the data in `selected` into two separate DataFrames:\n",
"\n",
@@ -91,22 +91,22 @@
"id": "b7ff4901-2d10-4047-9dcb-ef9d781b7250",
"metadata": {},
"source": [
"## Exercise 3.3\n",
"### Exercise 3.3\n",
"\n",
"In this Exercise, we will explore our temperature data by comparing spring temperatures between Kumpula and Rovaniemi. To do this we'll use some conditions to extract subsets of our data and then analyse these subsets using basic pandas functions. Notice that in this exercise, we will use data saved from the previous Exercise (2.2.6), hence you should finish that Exercise before this one. An overview of the tasks in this exercise:\n",
"\n",
"- Calculate the median temperatures for Kumpula and Rovaniemi for the summer of 2017\n",
"- Select temperatures for May and June 2017 in separate DataFrames for each location\n",
"- Calculate descriptive statistics for each month (May, June) and location (Kumpula, Rovaniemi)\n",
"\n",
"### Problem 1 - Read the data and calculate basic statistics\n",
"#### Problem 1 - Read the data and calculate basic statistics\n",
"\n",
"Read in the CSV files generated in Exercise 2.2.6 to the variables `kumpula` and `rovaniemi` and answer to following questions:\n",
"\n",
"- What was the median Celsius temperature during the observed period in Helsinki Kumpula? Store the answer in a variable `kumpula_median`.\n",
"- What was the median Celsius temperature during the observed period in Rovaniemi? Store the answer in a variable `rovaniemi_median`.\n",
"\n",
"### Problem 2 - Select data and compare temperatures between months\n",
"#### Problem 2 - Select data and compare temperatures between months\n",
"\n",
"The median temperatures above consider data from the entire summer (May-Aug), hence the differences might not be so clear. Let's now find out the mean temperatures from May and June 2017 in Kumpula and Rovaniemi:\n",
"\n",
@@ -116,7 +116,7 @@
" - Does there seem to be a large difference in temperatures between the months?\n",
" - Is Rovaniemi a much colder place than Kumpula?\n",
"\n",
"### Problem 3 - Parse daily temperatures by aggregating data \n",
"#### Problem 3 - Parse daily temperatures by aggregating data \n",
"\n",
"In this problem, the aim is to aggregate the hourly temperature data for Kumpula and Rovaniemi weather stations to a daily level. Currently, there are at most three measurements per hour in the data, as you can see from the YR--MODAHRMN column:\n",
"\n",
@@ -153,7 +153,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.11.7"
}
},
"nbformat": 4,
