
Commit

Adding the notebook check to CI and updating the spelling mistakes in the notebooks
Alex Lee committed Sep 16, 2024
1 parent 54ff563 commit 5143ad5
Showing 17 changed files with 36 additions and 32 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/ci.yml
@@ -29,6 +29,10 @@ jobs:
        run: pytest docs/source/.codespell/test_notebook_to_markdown.py
      - name: Run tests
        run: pytest --cov-report=xml --no-cov-on-fail
+     - name: Check codespell for notebooks
+       run: |
+         python ./docs/source/.codespell/notebook_to_markdown.py --tempdir tmp_markdown
+         codespell
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
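For context on what the new CI step appears to do: convert the notebooks' prose into plain markdown files so codespell can check them without tripping over JSON and code noise. A rough sketch of that idea is below; the repository's actual docs/source/.codespell/notebook_to_markdown.py (and its --tempdir argument handling) may well differ, and the function name here is purely illustrative.

```python
# Illustrative sketch only: dump each notebook's markdown cells into .md files
# in a temp directory so codespell can spell-check the prose. The real
# notebook_to_markdown.py in the repo may be implemented quite differently.
from pathlib import Path

import nbformat


def notebooks_to_markdown(src: str = "docs/source", tempdir: str = "tmp_markdown") -> None:
    out = Path(tempdir)
    out.mkdir(parents=True, exist_ok=True)
    for nb_path in Path(src).rglob("*.ipynb"):
        nb = nbformat.read(nb_path, as_version=4)
        prose = "\n\n".join(
            cell.source for cell in nb.cells if cell.cell_type == "markdown"
        )
        (out / f"{nb_path.stem}.md").write_text(prose)


if __name__ == "__main__":
    notebooks_to_markdown()
```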
2 changes: 1 addition & 1 deletion docs/source/knowledgebase/quasi_dags.ipynb
@@ -104,7 +104,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "This leads us to Randomized Controlled Trials (RCTs) which are considered the gold standard for estimating causal effects. One reason for this is that we (as experimenters) intervene in the system by assigning units to treatment by {term}`random assignment`. Because of this intervention, any causal influence of the confounders upon the treatment $\\mathbf{X} \\rightarrow Z$ is broken - treatment is now soley determined by the randomisation process, $R \\rightarrow T$. The following causal DAG illustrates the structure of an RCT."
+ "This leads us to Randomized Controlled Trials (RCTs) which are considered the gold standard for estimating causal effects. One reason for this is that we (as experimenters) intervene in the system by assigning units to treatment by {term}`random assignment`. Because of this intervention, any causal influence of the confounders upon the treatment $\\mathbf{X} \\rightarrow Z$ is broken - treatment is now solely determined by the randomisation process, $R \\rightarrow T$. The following causal DAG illustrates the structure of an RCT."
]
},
{
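As a quick visual aid for the RCT structure described in that cell, here is a minimal graphviz sketch (not the notebook's own figure; the node names simply mirror the letters used in the text, with T standing in for the treatment node):

```python
# Minimal illustrative DAG for an RCT: randomisation R determines the treatment,
# so the confounders X influence only the outcome Y.
from graphviz import Digraph

dag = Digraph(comment="RCT structure (illustrative)")
for node in ["R", "T", "X", "Y"]:
    dag.node(node)
dag.edge("R", "T")  # treatment assigned purely by the randomisation process
dag.edge("T", "Y")  # treatment affects the outcome
dag.edge("X", "Y")  # confounders still affect the outcome, but no longer the treatment
dag.render("rct_dag", format="png", cleanup=True)  # or just display `dag` in a notebook
```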
2 changes: 1 addition & 1 deletion docs/source/notebooks/ancova_pymc.ipynb
@@ -222,7 +222,7 @@
"## Run the analysis\n",
"\n",
":::{note}\n",
- "The `random_seed` keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.\n",
+ "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
":::"
]
},
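Since that note about `random_seed` recurs throughout these notebooks, here is a minimal, self-contained PyMC sketch (a toy model, not the notebook's ANCOVA model) of what the argument actually does:

```python
import numpy as np
import pymc as pm

# Toy data, purely for illustration.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=0.5, size=100)

with pm.Model():
    mu = pm.Normal("mu", mu=0, sigma=1)
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)

    # random_seed is optional; it only pins down the draws so that
    # re-running the notebook reproduces identical results.
    idata = pm.sample(random_seed=42, progressbar=False)
```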
2 changes: 1 addition & 1 deletion docs/source/notebooks/did_pymc.ipynb
@@ -148,7 +148,7 @@
"## Run the analysis\n",
"\n",
":::{note}\n",
- "The `random_seed` keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.\n",
+ "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
":::"
]
},
4 changes: 2 additions & 2 deletions docs/source/notebooks/did_pymc_banks.ipynb
@@ -329,7 +329,7 @@
"* $\\mu_i$ is the expected value of the outcome (number of banks in business) for the $i^{th}$ observation.\n",
"* $\\beta_0$ is an intercept term to capture the baseline number of banks in business of the control group, in the pre-intervention period.\n",
"* `district` is a dummy variable, so $\\beta_{d}$ will represent a main effect of district, that is, any offset of the treatment group relative to the control group.\n",
- "* `post_treatment` is also a dummy variable which captures any shift in the outcome after the treatment time, regardless of recieving treatment or not.\n",
+ "* `post_treatment` is also a dummy variable which captures any shift in the outcome after the treatment time, regardless of receiving treatment or not.\n",
"* the interaction of the two dummy variables `district:post_treatment` will only take on values of 1 for the treatment group after the intervention. Therefore $\\beta_{\\Delta}$ will represent our estimated causal effect."
]
},
@@ -515,7 +515,7 @@
"source": [
"## Analysis 2 - DiD with multiple pre/post observations\n",
"\n",
- "Now we'll do a difference in differences analysis of the full dataset. This approach has similarities to {term}`CITS` (Comparative Interrupted Time-Series) with a single control over time. Although slightly abitrary, we distinguish between the two techniques on whether there is enough time series data for CITS to capture the time series patterns."
+ "Now we'll do a difference in differences analysis of the full dataset. This approach has similarities to {term}`CITS` (Comparative Interrupted Time-Series) with a single control over time. Although slightly arbitrary, we distinguish between the two techniques on whether there is enough time series data for CITS to capture the time series patterns."
]
},
{
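To make the role of the `district:post_treatment` interaction described above concrete, here is a small stand-alone simulation (an OLS stand-in for the notebook's Bayesian model, with entirely made-up numbers) showing that the interaction coefficient recovers the built-in causal effect:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a 2x2 difference-in-differences design with a known effect of -10.
rng = np.random.default_rng(1)
df = pd.DataFrame(
    {"district": np.repeat([0, 1], 200), "post_treatment": np.tile([0, 1], 200)}
)
true_delta = -10.0
df["y"] = (
    100.0                                  # beta_0: baseline level (control, pre-period)
    + 5.0 * df["district"]                 # beta_d: main effect of district
    - 20.0 * df["post_treatment"]          # beta_t: common shift after the treatment time
    + true_delta * df["district"] * df["post_treatment"]  # beta_delta: causal effect
    + rng.normal(0, 2, size=len(df))
)

fit = smf.ols("y ~ district + post_treatment + district:post_treatment", data=df).fit()
print(fit.params["district:post_treatment"])  # approximately -10
```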
2 changes: 1 addition & 1 deletion docs/source/notebooks/geolift1.ipynb
@@ -269,7 +269,7 @@
"We can use `CausalPy`'s API to run this procedure, but using Bayesian inference methods as follows:\n",
"\n",
":::{note}\n",
- "The `random_seed` keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.\n",
+ "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
":::"
]
},
6 changes: 3 additions & 3 deletions docs/source/notebooks/inv_prop_pymc.ipynb
@@ -22,9 +22,9 @@
"\n",
"In this notebook we will briefly demonstrate how to use propensity score weighting schemes to recover treatment effects in the analysis of observational data. We will first showcase the method with a simulated data example drawn from Lucy D’Agostino McGowan's [excellent blog](https://livefreeordichotomize.com/posts/2019-01-17-understanding-propensity-score-weighting/) on inverse propensity score weighting. Then we shall apply the same techniques to the NHEFS data set discussed in Miguel Hernan and Robins' _Causal Inference: What if_ [book](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/). This data set measures the effect of quitting smoking between 1971 and 1982. At each of these two points in time the participant's weight was recorded, and we seek to estimate the effect of quitting in the intervening years on the weight recorded in 1982.\n",
"\n",
- "We will use inverse propensity score weighting techniques to estimate the average treatment effect. There are a range of weighting techniques available: we have implemented `raw`, `robust`, `doubly robust` and `overlap` weighting schemes, all of which aim to estimate the average treatment effect. The idea of a propensity score (very broadly) is to derive a one-number summary of an individual's probability of adopting a particular treatment. This score is typically calculated by fitting a predictive logit model on all of an individual's observed attributes, predicting whether or not those attributes drive the indivdual towards the treatment status. In the case of the NHEFS data we want a model to measure the propensity for each individual to quit smoking. \n",
+ "We will use inverse propensity score weighting techniques to estimate the average treatment effect. There are a range of weighting techniques available: we have implemented `raw`, `robust`, `doubly robust` and `overlap` weighting schemes, all of which aim to estimate the average treatment effect. The idea of a propensity score (very broadly) is to derive a one-number summary of an individual's probability of adopting a particular treatment. This score is typically calculated by fitting a predictive logit model on all of an individual's observed attributes, predicting whether or not those attributes drive the individual towards the treatment status. In the case of the NHEFS data we want a model to measure the propensity for each individual to quit smoking. \n",
"\n",
- "The reason we want this propensity score is because with observed data we often have a kind of imbalance in our covariate profiles across treatment groups. Meaning our data might be unrepresentative in some crucial aspect. This prevents us from cleanly reading off treatment effects by looking at simple group differences. These \"imbalances\" can be driven by selection effects into the treatment status so that if we want to estimate the average treatment effect in the population as a whole we need to be wary that our sample might not give us generalisable insight into the treatment differences. Using propensity scores as a measure of the prevalence to adopt the treatment status in the population, we can cleverly weight the observed data to privilege observations of \"rare\" occurence in each group. For example, if smoking is the treatment status and regular running is generally not common among the group of smokers, then on the occasion we see a smoker marathon runner we should heavily weight their outcome measure to overcome their low prevalence in the treated group but real presence in the unmeasured population. Inverse propensity weighting tries to define weighting schemes that are inversely proportional to an individual's propensity score so as to better recover an estimate which mitigates (somewhat) the risk of selection effect bias. For more details and illustration of these themes see the PyMC examples [write up](https://www.pymc.io/projects/examples/en/latest/causal_inference/bayesian_nonparametric_causal.html) on Non-Parametric Bayesian methods. {cite:p}`forde2024nonparam`\n"
+ "The reason we want this propensity score is because with observed data we often have a kind of imbalance in our covariate profiles across treatment groups. Meaning our data might be unrepresentative in some crucial aspect. This prevents us from cleanly reading off treatment effects by looking at simple group differences. These \"imbalances\" can be driven by selection effects into the treatment status so that if we want to estimate the average treatment effect in the population as a whole we need to be wary that our sample might not give us generalisable insight into the treatment differences. Using propensity scores as a measure of the prevalence to adopt the treatment status in the population, we can cleverly weight the observed data to privilege observations of \"rare\" occurrence in each group. For example, if smoking is the treatment status and regular running is generally not common among the group of smokers, then on the occasion we see a smoker marathon runner we should heavily weight their outcome measure to overcome their low prevalence in the treated group but real presence in the unmeasured population. Inverse propensity weighting tries to define weighting schemes that are inversely proportional to an individual's propensity score so as to better recover an estimate which mitigates (somewhat) the risk of selection effect bias. For more details and illustration of these themes see the PyMC examples [write up](https://www.pymc.io/projects/examples/en/latest/causal_inference/bayesian_nonparametric_causal.html) on Non-Parametric Bayesian methods. {cite:p}`forde2024nonparam`\n"
]
},
{
@@ -832,7 +832,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "We see here how the particular weighting scheme was able to recover the true treatment effect by defining a contrast in a different pseudo population. This is a useful reminder in that, while propensity score weighting methods are aids to inference in observational data, not all weighting schemes are created equal and we need to be careful in our assessment of when each is applied appropriately. Fundamentally the weighting scheme of choice should be tied to the question of what you are trying to estimate. Aronow and Miller's _Foundations of Agnostic Statistics_ {cite:p}`aronowFoundations` has a good explantion of the differences between the `raw`, `robust` and `doubly robust` weighting schemes. In some sense these offer an escalating series of refined estimators each trying to improve the variance in the ATE estimate. The `doubly robust` approach also tries to offer some guarantees against model misspecification. The `overlap` estimator represents an attempt to calculate the ATE among the population with the overlapping propensity scores. This can be used to guard against poor inference in cases where propensity score distributions have large non-overlapping regions."
+ "We see here how the particular weighting scheme was able to recover the true treatment effect by defining a contrast in a different pseudo population. This is a useful reminder in that, while propensity score weighting methods are aids to inference in observational data, not all weighting schemes are created equal and we need to be careful in our assessment of when each is applied appropriately. Fundamentally the weighting scheme of choice should be tied to the question of what you are trying to estimate. Aronow and Miller's _Foundations of Agnostic Statistics_ {cite:p}`aronowFoundations` has a good explanation of the differences between the `raw`, `robust` and `doubly robust` weighting schemes. In some sense these offer an escalating series of refined estimators each trying to improve the variance in the ATE estimate. The `doubly robust` approach also tries to offer some guarantees against model misspecification. The `overlap` estimator represents an attempt to calculate the ATE among the population with the overlapping propensity scores. This can be used to guard against poor inference in cases where propensity score distributions have large non-overlapping regions."
]
},
{
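As a back-of-the-envelope illustration of the weighting idea in this notebook's introduction, the raw inverse propensity weights and the overlap weights can be computed as below. These are the standard textbook forms and are not meant to mirror CausalPy's internal implementation.

```python
import numpy as np


def ipw_weights(t, p, scheme="raw"):
    """Illustrative propensity-score weights.

    t: 0/1 treatment indicator; p: estimated propensity score P(T=1 | X).
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    if scheme == "raw":
        # Inverse probability of the treatment actually received.
        return t / p + (1 - t) / (1 - p)
    if scheme == "overlap":
        # Up-weights units whose propensity scores overlap across groups.
        return t * (1 - p) + (1 - t) * p
    raise ValueError(f"unknown scheme: {scheme}")


# Toy example: a treated unit with a low propensity score gets a large weight.
t = np.array([1, 1, 0, 0])
p = np.array([0.1, 0.8, 0.3, 0.6])
print(ipw_weights(t, p, "raw"))      # approx [10.  1.25  1.43  2.5]
print(ipw_weights(t, p, "overlap"))  # [0.9  0.2  0.3  0.6]
```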
4 changes: 2 additions & 2 deletions docs/source/notebooks/its_covid.ipynb
@@ -167,7 +167,7 @@
"\n",
"* `date` + `year`: self explanatory\n",
"* `month`: month, numerically encoded. Needs to be treated as a categorical variable\n",
- "* `temp`: average UK temperature (Celcius)\n",
+ "* `temp`: average UK temperature (Celsius)\n",
"* `t`: time\n",
"* `pre`: boolean flag indicating pre or post intervention"
]
@@ -182,7 +182,7 @@
"In this example we are going to standardize the data. So we have to be careful in how we interpret the inferred regression coefficients, and the posterior predictions will be in this standardized space.\n",
"\n",
":::{note}\n",
- "The `random_seed` keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.\n",
+ "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
":::"
]
},
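Relating to the standardization note above, a minimal sketch (with hypothetical column names, not the notebook's actual data) of z-scoring the data and mapping model output back to the original scale:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {"temp": [3.1, 7.4, 11.8, 15.2], "deaths": [1200.0, 1100.0, 950.0, 900.0]}
)

# Standardize, remembering the original location and scale ...
means, stds = df.mean(), df.std()
df_std = (df - means) / stds

# ... so that predictions made in standardized space can be transformed back.
pred_std = df_std["deaths"]                        # stand-in for a model prediction
pred = pred_std * stds["deaths"] + means["deaths"]
```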
4 changes: 2 additions & 2 deletions docs/source/notebooks/its_pymc.ipynb
@@ -163,7 +163,7 @@
"Run the analysis\n",
"\n",
":::{note}\n",
- "The `random_seed` keyword argument for the PyMC sampler is not neccessary. We use it here so that the results are reproducible.\n",
+ "The `random_seed` keyword argument for the PyMC sampler is not necessary. We use it here so that the results are reproducible.\n",
":::"
]
},
@@ -304,7 +304,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As well as the model coefficients, we might be interested in the avarage causal impact and average cumulative causal impact.\n",
+ "As well as the model coefficients, we might be interested in the average causal impact and average cumulative causal impact.\n",
"\n",
":::{note}\n",
"Better output for the summary statistics is in progress!\n",
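For the arithmetic behind the "average causal impact" and "cumulative causal impact" mentioned above, here is a tiny sketch with made-up numbers (in the Bayesian analysis this would be computed against the counterfactual prediction for each posterior draw):

```python
import numpy as np

# Hypothetical post-intervention series: observed outcomes and the model's
# counterfactual (no-intervention) predictions.
observed = np.array([12.0, 11.5, 10.8, 10.1])
counterfactual = np.array([13.0, 13.2, 13.1, 13.3])

impact = observed - counterfactual      # pointwise causal impact
average_impact = impact.mean()          # average causal impact
cumulative_impact = impact.cumsum()     # cumulative causal impact over time
print(average_impact, cumulative_impact[-1])
```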