Commit

differences for PR #140
actions-user committed Oct 8, 2024
1 parent 6d2c65f commit 93fbed6
Showing 3 changed files with 11 additions and 16 deletions.
25 changes: 10 additions & 15 deletions data-visualisation.md
@@ -47,15 +47,10 @@ df_long.head()

Ok! We are now ready to plot our data. Since this data is monthly data, we can plot the circulation data over time.

::::::::::::::::::::::::::::::::::::: instructor
## Instructor note: Pandas 2.2.* bug
There is a bug in the latest release of Pandas that is causing certain plots to display in a garbled manner. This is a [known issue](https://github.com/pandas-dev/pandas/issues/59960) that the Pandas team plans to address. In the meantime, learners and instructors can use older versions of pandas *or* add `.sort_index()` before any instance of `.plot()`. For example, use `albany['circulation'].sort_index().plot()` instead of `albany['circulation'].plot()`.
:::::::::::::::::::::::::::::::::::::::::::::::::

At first, let’s focus on a specific branch. We can select the rows for the Albany Park branch:
At first, let’s focus on a specific branch. We can select the rows for the Albany Park branch and then use `.sort_index()` to be explicit that we want our data to be sorted in the order of the date index.

``` python
albany = df_long[df_long['branch'] == 'Albany Park']
albany = df_long[df_long['branch'] == 'Albany Park'].sort_index()
```
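
To see what `.sort_index()` changes, here is a minimal sketch on a toy DataFrame. The three rows reuse circulation values from the table in this diff, but the `toy` frame and the variable names are assumptions made for illustration and are not part of the lesson's dataset. The rows are deliberately stored out of chronological order, which is the situation the sort guards against.

``` python
import pandas as pd

# Hypothetical toy data (not the lesson's df_long): the 2012 row sits
# between two 2011 rows, so the date index is not in chronological order.
toy = pd.DataFrame(
    {
        'branch': ['Albany Park', 'Albany Park', 'Albany Park'],
        'circulation': [8427, 10173, 7023],
    },
    index=pd.DatetimeIndex(['2011-01-01', '2012-01-01', '2011-02-01'], name='date'),
)

# Filtering keeps the rows in their stored order.
unsorted_branch = toy[toy['branch'] == 'Albany Park']

# sort_index() returns a new DataFrame ordered by the date index.
sorted_branch = unsorted_branch.sort_index()

print(unsorted_branch.index.is_monotonic_increasing)
print(sorted_branch.index.is_monotonic_increasing)
```

Checking `is_monotonic_increasing` on the index is one quick way to confirm that the dates are in order before calling `.plot()`.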

``` python
@@ -66,13 +61,13 @@ albany.head()
|------------|-------------|----------------------|---------|----------|--------|------|---------|-------------|
| date | | | | | | | | |
| 2011-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | january | 8427 |
| 2012-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 83297 | 2012 | january | 10173 |
| 2013-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 572 | 2013 | january | 0 |
| 2014-01-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 50484 | 2014 | january | 35 |
| 2015-01-01 | Albany Park | NaN | NaN | NaN | 133366 | 2015 | january | 10889 |
| 2011-02-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | february | 7023 |
| 2011-03-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | march | 9702 |
| 2011-04-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | april | 9344 |
| 2011-05-01 | Albany Park | 5150 N. Kimball Ave. | Chicago | 60625.0 | 120059 | 2011 | may | 8865 |


Now we can use the `plot()` function that is built in to pandas. Let’s try it:
Now we can use the `plot()` function that is built in to pandas. Let’s try it:

``` python
albany.plot()
@@ -199,7 +194,7 @@ Here is a view of the [interactive output of the Plotly bar chart](learners/bar_
## Plotting with Pandas

1. Load the dataset `df_long.pkl` using Pandas.
2. Create a new DataFrame that only includes the data for the "Chinatown" branch.
2. Create a new DataFrame that only includes the data for the "Chinatown" branch. (Don't forget to sort by the index)
3. Use the Pandas plotting function to plot the "circulation" column over time.


@@ -211,7 +206,7 @@ Here is a view of the [interactive output of the Plotly bar chart](learners/bar_
```python
import pandas as pd
df_long = pd.read_pickle('data/df_long.pkl')
chinatown = df_long[df_long['branch'] == 'Chinatown']
chinatown = df_long[df_long['branch'] == 'Chinatown'].sort_index()
chinatown['circulation'].plot()
```

@@ -235,7 +230,7 @@ Add a line to the code below to plot the Uptown branch circulation including the
```python
import pandas as pd
df_long = pd.read_pickle('data/df_long.pkl')
uptown = df_long[df_long['branch'] == 'Uptown']
uptown = df_long[df_long['branch'] == 'Uptown'].sort_index()
```

::::::::::::::: solution
Binary file modified fig/albany-plot-1.png
2 changes: 1 addition & 1 deletion md5sum.txt
@@ -14,7 +14,7 @@
"episodes/conditionals.md" "b567ac5270b3dc82c4ed119870a0a890" "site/built/conditionals.md" "2024-06-17"
"episodes/writing-functions.md" "99171306646b8b63c66a493acef12e63" "site/built/writing-functions.md" "2024-06-17"
"episodes/tidy.md" "03e41c4d6c93d0b4b1ea4b2ea0c17522" "site/built/tidy.md" "2024-06-27"
"episodes/data-visualisation.md" "c0f8a792ea7b637a782ee67879d5f34f" "site/built/data-visualisation.md" "2024-10-07"
"episodes/data-visualisation.md" "83929e8d9a980200de5f6fa293cf41ee" "site/built/data-visualisation.md" "2024-10-08"
"episodes/wrap.md" "6e2c8fe8bab006ad451a481d27982d06" "site/built/wrap.md" "2024-06-17"
"instructors/design.md" "644a2269c636c2de465fe655b899a508" "site/built/design.md" "2023-05-08"
"instructors/instructor-notes.md" "62646361a3b355df21da3707168fee01" "site/built/instructor-notes.md" "2023-05-08"
