Embed video about reinforcement learning
s2t2 committed Oct 4, 2024
1 parent 728ef43 commit 054293a
Showing 4 changed files with 24 additions and 19 deletions.
22 changes: 12 additions & 10 deletions docs/notes/predictive-modeling/autoregressive-models/arima.qmd
@@ -1,4 +1,4 @@
# Autocorrelation and Auto-Regressive Models
# Auto-Regressive Models

```{python}
#| echo: false
@@ -15,8 +15,6 @@ set_option('display.max_rows', 6)

**Auto-Regressive Integrated Moving Average (ARIMA)** is a "method for forecasting or predicting future outcomes based on a historical time series. It is based on the statistical concept of serial correlation, where past data points influence future data points." - [Source: Investopedia](https://www.investopedia.com/terms/a/autoregressive-integrated-moving-average-arima.asp)

In practice, ARIMA models may be better at short-term forecasting, and may not perform as well over the long term.

An ARIMA model has three key components:

+ **Auto-Regressive (AR)** part: involves regressing the current value of the series against its past values (lags). The idea is that past observations have an influence on the current value.
@@ -26,6 +24,8 @@ An ARIMA model has three key components:
+ **Moving Average (MA)** part: involves modeling the relationship between the current value of the series and past forecast errors (residuals). The model adjusts the forecast based on the error terms from previous periods.


In practice, ARIMA models may be better at short-term forecasting, and may not perform as well over the long term.

## Assumption of Stationarity

:::{.callout-warning title="Assumption of stationarity"}
@@ -35,7 +35,7 @@ For instance, while stock *prices* are generally non-stationary, ARIMA models ca
:::


## Examples of ARMA Models
## Examples

:::{.callout-note title="Data Source"}
These examples of autoregressive models are based on material by Prof. Ram Yamarthy.
@@ -65,7 +65,7 @@ df
```


## Data Exploration
#### Data Exploration

Sorting data:

@@ -100,7 +100,7 @@ px.line(x=y.index, y=y, height=450,



### Stationarity
##### Stationarity

Check for stationarity:

@@ -115,7 +115,7 @@ print(f'P-value: {result[1]}')
# If p-value > 0.05, we cannot reject non-stationarity, and differencing may be required
```

### Autocorrelation
##### Autocorrelation


Examining autocorrelation over ten lagging periods:
@@ -142,7 +142,7 @@ fig.show()

Moderately high autocorrelation persists for two to four lagging periods.

## Train/Test Split
#### Train/Test Split

```{python}
#test_size = 0.2
@@ -169,7 +169,7 @@ print("Y TEST:", y_test.shape)
```


## Model Training
#### Model Training

To implement an autoregressive moving average model in Python, we can use the [`ARIMA` class](https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima.model.ARIMA.html) from `statsmodels`.

@@ -214,7 +214,7 @@ px.line(train_set, y=["W-L%", "Predicted"], height=350,
```


## Evaluation
#### Evaluation

Reconstructing test set with predictions for the test period:

@@ -276,3 +276,5 @@ Experimenting with different `order` parameter values may yield different result


### Example 2 - GDP Growth

TBA
3 changes: 3 additions & 0 deletions docs/notes/predictive-modeling/ml-foundations/index.qmd
@@ -63,6 +63,9 @@ Example unsupervised learning tasks include:
**Reinforcement learning** is a different type of machine learning approach, where an "agent" learns to make decisions by interacting with an environment. The agent takes actions and receives feedback in the form of rewards or penalties, adjusting its strategy to maximize the cumulative reward over time.


{{< video https://www.youtube.com/watch?v=kopoLzvh5jY >}}


## Machine Learning Problem Formulation

Machine learning problem formulation refers to the process of clearly defining the task that a machine learning model is meant to solve. This step is crucial in guiding the development of the model and ensuring that the right data, techniques, and metrics are applied to achieve the desired outcome. Problem formulation involves several key components:
2 changes: 1 addition & 1 deletion docs/notes/predictive-modeling/regression/index.qmd
@@ -53,7 +53,7 @@ $$

Where:

+ $\beta_0$ is the intercept (the value of $y$ when $X=0$
+ $\beta_0$ is the intercept (the value of $y$ when $X=0$)
+ $\beta_1$ is the coefficient representing the slope of the line, which measures how much $y$ changes for a one-unit change in $X$


@@ -261,7 +261,7 @@ px.line(df, y=["employment", "prediction"], height=350,

**Regression Residuals**

Plotting residuals:
Removing the trend, plotting just the residuals:


```{python}
@@ -270,6 +270,8 @@ px.line(df, y="residual",
)
```

There seem to be some periodic movements in the residuals.

#### Seasonality via Means of Periodic Residuals

Calculating periodic means of the residuals can reveal whether they follow cyclical patterns:
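
A minimal sketch of this calculation, using synthetic data rather than the employment series from these notes. The `residual`, `quarter`, and `month` column names mirror the ones used here; the sine-wave data and the `monthly_means` / `quarterly_means` names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic monthly "residuals": a 12-month cycle plus noise (illustrative only).
rng = np.random.default_rng(0)
index = pd.date_range("2010-01-01", periods=120, freq="MS")
months = index.month.to_numpy()
seasonal = 10 * np.sin(2 * np.pi * (months - 1) / 12)
df = pd.DataFrame({"residual": seasonal + rng.normal(0, 1, len(index))}, index=index)

# Label each row with its period, then average the residuals within each period.
df["quarter"] = df.index.quarter
df["month"] = df.index.month

monthly_means = df.groupby("month")["residual"].mean()
quarterly_means = df.groupby("quarter")["residual"].mean()

print(monthly_means.round(2))
print(quarterly_means.round(2))
```

If the residuals had no seasonal structure, every periodic mean would sit near zero. Means that differ consistently by month or quarter suggest a seasonal component worth modeling.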
@@ -282,7 +284,6 @@ set_option('display.max_rows', 15)
```

```{python}
df = df.copy()
df["year"] = df.index.year
df["quarter"] = df.index.quarter
df["month"] = df.index.month
@@ -348,18 +349,17 @@ df["prediction_monthly"] = results_monthly.fittedvalues
df["residual_monthly"] = results_monthly.resid
```

```{python}
#height = 450
Decomposition of the original data into trend, seasonal component, and residuals:

#px.line(df, y=["employment", "prediction"], title="Employment vs trend", height=height)
```{python}
px.line(df, y=["employment", "prediction"], title="Employment vs trend", height=350)
```


```{python}
#px.line(df, y="prediction_monthly", title="Employment vs seasonal trend", height=height)
px.line(df, y="prediction_monthly", title="Employment seasonal component", height=350)
```


```{python}
px.line(df, y="residual_monthly", title="Employment seasonal trend residual", height=450)
px.line(df, y="residual_monthly", title="Employment de-trended residual", height=350)
```
