diff --git a/pull337/classification1.html b/pull337/classification1.html
index e001036e..6f49dd91 100644
--- a/pull337/classification1.html
+++ b/pull337/classification1.html
@@ -864,23 +864,23 @@

5.4.3. Exploring the cancer data
-
+
@@ -974,23 +974,23 @@

5.5. Classification with K-nearest neigh
-
+

Fig. 5.2 Scatter plot of concavity versus perimeter with new observation represented as a red diamond.#

@@ -1049,23 +1049,23 @@

5.5. Classification with K-nearest neigh
-
+

Fig. 5.3 Scatter plot of concavity versus perimeter. The new observation is represented as a red diamond with a line to the one nearest neighbor, which has a malignant

@@ -1128,23 +1128,23 @@

5.5. Classification with K-nearest neigh
-
+

Fig. 5.4 Scatter plot of concavity versus perimeter. The new observation is represented as a red diamond with a line to the one nearest neighbor, which has a benign

@@ -1207,23 +1207,23 @@

5.5. Classification with K-nearest neigh
-
+

Fig. 5.5 Scatter plot of concavity versus perimeter with three nearest neighbors.#

@@ -1303,23 +1303,23 @@

5.5.1. Distance between points
-
+

Fig. 5.6 Scatter plot of concavity versus perimeter with new observation represented as a red diamond.#

@@ -1499,23 +1499,23 @@

5.5.1. Distance between points
-
+

Fig. 5.7 Scatter plot of concavity versus perimeter with 5 nearest neighbors circled.#

@@ -1711,9 +1711,9 @@

5.5.2. More than two explanatory variabl
}); }
-

Fig. 5.9 Comparison of K = 3 nearest neighbors with unstandardized and standardized data.#

@@ -2504,23 +2504,23 @@

5.7.1. Centering and scaling
-
+

Fig. 5.10 Close-up of three nearest neighbors for unstandardized data.#

@@ -2617,23 +2617,23 @@

5.7.2. Balancing
-
+

@@ -2714,23 +2714,23 @@

5.7.2. Balancing
-
+

Fig. 5.12 Imbalanced data with 7 nearest neighbors to a new observation highlighted.#

@@ -2788,23 +2788,23 @@

5.7.2. Balancing
-
+

Fig. 5.13 Imbalanced data with background color indicating the decision of the classifier and the points represent the labeled data.#

@@ -2898,23 +2898,23 @@

5.7.2. Balancing
-
+

Fig. 5.14 Upsampled data with background color indicating the decision of the classifier.#

@@ -3437,23 +3437,23 @@

5.7.3. Missing data
-
+
+
diff --git a/pull337/classification2.html b/pull337/classification2.html
index 7bdd6fd2..648158b3 100644
--- a/pull337/classification2.html
+++ b/pull337/classification2.html
@@ -802,23 +802,23 @@

6.5. Evaluating performance with
-
+
@@ -1539,32 +1539,32 @@

6.6.1. Cross-validation
6.6.2. Parameter value selection
-
+

Fig. 6.5 Plot of estimated accuracy versus the number of neighbors.#

@@ -2276,23 +2276,23 @@

6.6.3. Under/Overfitting
-
+

Fig. 6.6 Plot of accuracy estimate versus number of neighbors for many K values.#

@@ -2367,23 +2367,23 @@

6.6.3. Under/Overfitting
-
+

Fig. 6.7 Effect of K in overfitting and underfitting.#

@@ -2802,23 +2802,23 @@

6.8.1. The effect of irrelevant predicto
-
+

Fig. 6.9 Effect of inclusion of irrelevant predictors.#

@@ -2881,23 +2881,23 @@

6.8.1. The effect of irrelevant predicto
-
+

Fig. 6.10 Tuned number of neighbors for varying number of irrelevant predictors.#

@@ -2951,23 +2951,23 @@

6.8.1. The effect of irrelevant predicto
-
+

Fig. 6.11 Accuracy versus number of irrelevant predictors for tuned and untuned number of neighbors.#

@@ -3430,23 +3430,23 @@

6.8.3. Forward selection in Python
-
+

Fig. 6.12 Estimated accuracy versus the number of predictors for the sequence of models built using forward selection.#

diff --git a/pull337/clustering.html b/pull337/clustering.html
index 68911515..a4f173ee 100644
--- a/pull337/clustering.html
+++ b/pull337/clustering.html
@@ -448,7 +448,7 @@

9.4. An illustrative example
the palmerpenguins R package [Horst et al., 2020]. This data set was collected by Dr. Kristen Gorman and the Palmer Station, Antarctica Long Term Ecological Research Site, and includes
-measurements for adult penguins (Fig. 9.1) found near there [Gorman et al., 2014].
+measurements for adult penguins (Fig. 9.1) found near there [Gorman et al., 2014].
Our goal will be to use two variables—penguin bill and flipper length, both in millimeters—to determine whether there are distinct types of penguins in our data.

@@ -749,23 +749,23 @@

9.4. An illustrative example
-
+

Fig. 9.2 Scatter plot of standardized bill length versus standardized flipper length.#

@@ -843,23 +843,23 @@

9.4. An illustrative example
-
+

Fig. 9.3 Scatter plot of standardized bill length versus standardized flipper length with colored groups.#

@@ -952,23 +952,23 @@

9.5.1. Measuring cluster quality
-
+

Fig. 9.4 Cluster 0 from the penguins_standardized data set example. Observations are small blue points, with the cluster center highlighted as a large blue point with a black outline.#

@@ -1035,23 +1035,23 @@

9.5.1. Measuring cluster quality
-
+

Fig. 9.5 Cluster 0 from the penguins_standardized data set example. Observations are small blue points, with the cluster center highlighted as a large blue point with a black outline. The distances from the observations to the cluster center are represented as black lines.#

@@ -1114,23 +1114,23 @@

9.5.1. Measuring cluster quality
-
+

Fig. 9.6 All clusters from the penguins_standardized data set example. Observations are small orange, blue, and yellow points with cluster centers denoted by larger points with a black outline. The distances from the observations to each of the respective cluster centers are represented as black lines.#

@@ -1198,23 +1198,23 @@

9.5.2. The clustering algorithm
-
+

Fig. 9.7 Random initialization of labels. Each cluster is depicted as a different color and shape.#

@@ -1280,23 +1280,23 @@

9.5.2. The clustering algorithm
-
+

Fig. 9.8 First three iterations of K-means clustering on the penguins_standardized example data set. Each pair of plots corresponds to an iteration. Within the pair, the first plot depicts the center update, and the second plot depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.#

@@ -1366,23 +1366,23 @@

9.5.3. Random restarts
-
+

Fig. 9.9 Random initialization of labels.#

@@ -1437,23 +1437,23 @@

9.5.3. Random restarts
-
+

Fig. 9.10 First four iterations of K-means clustering on the penguins_standardized example data set with a poor random initialization. Each pair of plots corresponds to an iteration. Within the pair, the first plot depicts the center update, and the second plot depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.#

@@ -1523,23 +1523,23 @@

9.5.4. Choosing K
-
+

Fig. 9.11 Clustering of the penguin data for K clusters ranging from 1 to 9. Cluster centers are indicated by larger points that are outlined in black.#

@@ -1599,23 +1599,23 @@

9.5.4. Choosing K
-
+

Fig. 9.12 Total WSSD for K clusters ranging from 1 to 9.#

@@ -1927,23 +1927,23 @@

9.6. K-means in Python
-
+

Fig. 9.13 The data colored by the cluster assignments returned by K-means.#

@@ -2172,23 +2172,23 @@

9.6. K-means in Python
-
+

Fig. 9.14 A plot showing the total WSSD versus the number of clusters.#

@@ -2295,7 +2295,7 @@

9.9. References
GWF14
-

Kristen Gorman, Tony Williams, and William Fraser. Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus \emph Pygoscelis). PLoS ONE, 2014.

+

Kristen Gorman, Tony Williams, and William Fraser. Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus pygoscelis). PLoS ONE, 2014.

HHG20

Allison Horst, Alison Hill, and Kristen Gorman. palmerpenguins: Palmer Archipelago penguin data. 2020. R package version 0.1.0. URL: https://allisonhorst.github.io/palmerpenguins/.

diff --git a/pull337/inference.html b/pull337/inference.html
index f9edcb7e..fdfb71f3 100644
--- a/pull337/inference.html
+++ b/pull337/inference.html
@@ -1220,23 +1220,23 @@

10.4.1. Sampling distributions for propo
-
+

Fig. 10.2 Sampling distribution of the sample proportion for sample size 40.#

@@ -1344,23 +1344,23 @@

10.4.2. Sampling distributions for means
-
+

Fig. 10.3 Population distribution of price per night (dollars) for all Airbnb listings in Vancouver, Canada.#

@@ -1466,23 +1466,23 @@

10.4.2. Sampling distributions for means
-
+

Fig. 10.4 Distribution of price per night (dollars) for sample of 40 Airbnb listings.#

@@ -1681,23 +1681,23 @@

10.4.2. Sampling distributions for means
-
+

Fig. 10.5 Sampling distribution of the sample means for sample size of 40.#

@@ -1777,23 +1777,23 @@

10.4.2. Sampling distributions for means
-
+
@@ -1859,23 +1859,23 @@

10.4.2. Sampling distributions for means
-
+
@@ -2012,23 +2012,23 @@

10.5.1. Overview
-
+

Fig. 10.8 Comparison of samples of different sizes from the population.#

@@ -2617,23 +2617,23 @@

10.5.2. Bootstrapping in Python
-
+
@@ -2717,23 +2717,23 @@

10.5.2. Bootstrapping in Python
-
+
@@ -3013,23 +3013,23 @@

10.5.2. Bootstrapping in Python
-
+
@@ -3286,23 +3286,23 @@

10.5.2. Bootstrapping in Python
-
+
@@ -3363,23 +3363,23 @@

10.5.2. Bootstrapping in Python
-
+
@@ -3511,23 +3511,23 @@

10.5.3. Using the bootstrap to calculate
-
+
+
diff --git a/pull337/intro.html b/pull337/intro.html
index 9101406a..af269dbd 100644
--- a/pull337/intro.html
+++ b/pull337/intro.html
@@ -2047,23 +2047,23 @@

1.11.1. Using
-
+

Fig. 1.9 Bar plot of the ten Aboriginal languages most often reported by Canadian residents as their mother tongue#

@@ -2151,23 +2151,23 @@

1.11.1. Using
-
+

Fig. 1.10 Bar plot of the ten Aboriginal languages most often reported by Canadian residents as their mother tongue with x and y labels. Note that this visualization is not done yet; there are still improvements to be made.#

@@ -2237,23 +2237,23 @@

1.11.1. Using
-
+

Fig. 1.11 Horizontal bar plot of the ten Aboriginal languages most often reported by Canadian residents as their mother tongue. There are no more serious issues with this visualization, but it could be refined further.#

@@ -2325,23 +2325,23 @@

1.11.1. Using
-
+

Fig. 1.12 Bar plot of the ten Aboriginal languages most often reported by Canadian residents as their mother tongue with bars reordered.#

@@ -2451,23 +2451,23 @@

1.11.3. Putting it all together
-
+

Fig. 1.13 Bar plot of the ten Aboriginal languages most often reported by Canadian residents as their mother tongue#

diff --git a/pull337/regression1.html b/pull337/regression1.html
index 1cdb8d88..c24a7229 100644
--- a/pull337/regression1.html
+++ b/pull337/regression1.html
@@ -672,23 +672,23 @@

7.4. Exploring a data set
-
+

Fig. 7.1 Scatter plot of price (USD) versus house size (square feet).#

@@ -801,23 +801,23 @@

7.5. K-nearest neighbors regression
-
+

Fig. 7.2 Scatter plot of price (USD) versus house size (square feet) with vertical line indicating 2,000 square feet on x-axis.#

@@ -986,23 +986,23 @@

7.5. K-nearest neighbors regression
-
+

Fig. 7.3 Scatter plot of price (USD) versus house size (square feet) with lines to 5 nearest neighbors (highlighted in orange).#

@@ -1076,23 +1076,23 @@

7.5. K-nearest neighbors regression
-
+

Fig. 7.4 Scatter plot of price (USD) versus house size (square feet) with predicted price for a 2,000 square-foot house based on 5 nearest neighbors represented as a red dot.#

@@ -1215,23 +1215,23 @@

7.6. Training, evaluating, and tuning th
-
+

Fig. 7.5 Scatter plot of price (USD) versus house size (square feet) with example predictions (orange line) and the error in those predictions compared with true response values (vertical lines).#

@@ -1606,23 +1606,23 @@

7.6. Training, evaluating, and tuning th
-
+

Fig. 7.6 Effect of the number of neighbors on the RMSPE.#

@@ -1704,23 +1704,23 @@

7.7. Underfitting and overfitting
-
+

Fig. 7.7 Predicted values for house price (represented as an orange line) from K-NN regression models for six different values for \(K\).#

@@ -1913,23 +1913,23 @@

7.8. Evaluating on the test set
-
+

Fig. 7.8 Predicted values of house price (orange line) for the final K-NN regression model.#

@@ -2020,23 +2020,23 @@

7.9. Multivariable K-NN regression
-
+

Fig. 7.9 Scatter plot of the sale price of houses versus the number of bedrooms.#

@@ -2241,9 +2241,9 @@

7.9. Multivariable K-NN regression
-

Fig. 8.1 Scatter plot of sale price versus size with line of best fit for subset of the Sacramento housing data.#

@@ -523,23 +523,23 @@

8.3. Simple linear regression
-
+

Fig. 8.2 Scatter plot of sale price versus size with line of best fit and a red dot at the predicted sale price for a 2,000 square-foot home.#

@@ -599,23 +599,23 @@

8.3. Simple linear regression
-
+

Fig. 8.3 Scatter plot of sale price versus size with many possible lines that could be drawn through the data points.#

@@ -675,23 +675,23 @@

8.3. Simple linear regression
-
+

Fig. 8.4 Scatter plot of sale price versus size with lines denoting the vertical distances between the predicted values and the observed data points.#

@@ -921,23 +921,23 @@

8.4. Linear regression in Python
-
+

Fig. 8.5 Scatter plot of sale price versus size with line of best fit for the full Sacramento housing data.#

@@ -1000,23 +1000,23 @@

8.5. Comparing simple linear and K-NN re
-
+

Fig. 8.6 Comparison of simple linear regression and K-NN regression.#

@@ -1187,9 +1187,9 @@

8.6. Multivariable linear regression
-

Fig. 8.8 Scatter plot of a subset of the data, with outlier highlighted in red.#

@@ -1414,23 +1414,23 @@

8.7.1. Outliers
-
+

Fig. 8.9 Scatter plot of the full data, with outlier highlighted in red.#

@@ -1496,23 +1496,23 @@

8.7.2. Multicollinearity
-
+

Fig. 8.10 Scatter plot of house size (in square feet) measured by person 1 versus house size (in square feet) measured by person 2.#

@@ -1685,23 +1685,23 @@

8.8. Designing new predictors
-
+

Fig. 8.11 Example of a data set with a nonlinear relationship between the predictor and the response.#

@@ -1773,23 +1773,23 @@

8.8. Designing new predictors
-
+

Fig. 8.12 Relationship between the transformed predictor and the response.#

diff --git a/pull337/searchindex.js b/pull337/searchindex.js
index ee7f30f9..82937c33 100644
--- a/pull337/searchindex.js
+++ b/pull337/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["acknowledgements", "authors", "classification1", "classification2", "clustering", "foreword-text", "index", "inference", "intro", "jupyter", "preface-text", "reading", "regression1", "regression2", "setup", "version-control", "viz", "wrangling"], "filenames": ["acknowledgements.md", "authors.md", "classification1.md", "classification2.md", "clustering.md", "foreword-text.md", "index.md", "inference.md", "intro.md", "jupyter.md", "preface-text.md", "reading.md", "regression1.md", "regression2.md", "setup.md", "version-control.md", "viz.md", "wrangling.md"], "titles": ["Acknowledgments", "About the authors", "5. Classification I: training & predicting", "6. Classification II: evaluation & tuning", "9. Clustering", "Foreword", "Data Science", "10. Statistical inference", "1. Python and Pandas", "11. Combining code and text with Jupyter", "Preface", "2. Reading in data locally and from the web", "7. Regression I: K-nearest neighbors", "8. Regression II: linear regression", "13. Setting up your computer", "12. Collaboration with version control", "4. Effective data visualization", "3. 
Cleaning and wrangling data"], "terms": {"we": [0, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17], "d": [0, 1, 5, 7, 8, 11, 16], "like": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "thank": 0, "everyon": 0, "ha": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "contribut": [0, 1, 5, 15], "develop": [0, 1, 3, 5, 7, 8, 9, 10, 11, 15], "data": [0, 1, 4, 5, 7, 10, 13, 14, 15], "scienc": [0, 1, 2, 3, 5, 8, 9, 10, 14, 15, 17], "A": [0, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "first": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "introduct": [0, 3, 4, 5, 7, 8, 10, 11, 13], "thi": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16], "an": [0, 1, 2, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17], "open": [0, 1, 6, 8, 9, 11, 14, 15, 16], "sourc": [0, 1, 11, 16], "textbook": [0, 1, 2, 3, 5, 6, 10, 11, 13, 15, 17], "began": [0, 11], "collect": [0, 2, 3, 4, 5, 7, 8, 11, 16, 17], "cours": [0, 1, 3, 4, 5, 7, 8, 9, 10, 11, 13, 17], "read": [0, 2, 3, 6, 7, 8, 9, 10, 12, 13, 15, 16, 17], "dsci": [0, 11, 14], "100": [0, 2, 3, 7, 8, 11, 12, 13, 14, 16, 17], "new": [0, 3, 4, 7, 8, 11, 12, 14, 15, 16, 17], "introductori": [0, 3, 5, 7], "univers": [0, 1, 7, 11], "british": [0, 1, 7, 8, 11], "columbia": [0, 1, 7, 11], "ubc": [0, 1, 11, 14], "sever": [0, 1, 2, 7, 11, 15, 16, 17], "faculti": 0, "member": [0, 2, 5, 15], "depart": [0, 1], "statist": [0, 1, 2, 3, 4, 5, 8, 11, 12, 13, 16], "were": [0, 2, 3, 7, 8, 9, 11, 13, 15, 16, 17], "pivot": 0, "shape": [0, 2, 4, 7, 8, 11, 13, 16, 17], "direct": [0, 2, 5, 11, 16], "greatli": [0, 17], "broad": [0, 5, 16], "structur": [0, 3, 4, 8, 11, 16], "list": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16], "topic": [0, 3, 4, 9, 13, 15], "book": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "would": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "especi": [0, 2, 8, 11, 14, 15, 16], "mat\u00eda": 0, "salib\u00edan": 0, "barrera": 0, "hi": [0, 1], "mentorship": 0, "dure": [0, 1, 3, 8, 12, 15, 17], "initi": [0, 1, 2, 
4, 8, 11, 12, 13, 15, 16], "roll": 0, "out": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "both": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "door": 0, "wa": [0, 1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "alwai": [0, 2, 3, 4, 9, 11, 12, 13, 14, 16, 17], "when": [0, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17], "need": [0, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "chat": 0, "about": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "how": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "best": [0, 3, 7, 8, 11, 12, 13, 15, 16], "introduc": [0, 5, 7, 8, 13, 15, 16, 17], "teach": [0, 1, 2, 5, 8], "our": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "year": [0, 2, 5, 8, 11, 16, 17], "student": [0, 1, 5, 7], "also": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "gabriela": 0, "cohen": 0, "freue": 0, "her": [0, 1], "561": 0, "regress": [0, 2, 3, 4, 8, 10], "i": [0, 3, 4, 7, 8, 9, 10, 11, 13, 14, 16, 17], "materi": [0, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 16, 17], "from": [0, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 16], "master": [0, 1], "program": [0, 1, 3, 5, 8, 9, 10, 11, 14, 16], "some": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "linear": [0, 3, 9, 12, 16], "figur": [0, 2, 8, 17], "inspir": [0, 11], "all": [0, 2, 3, 4, 5, 7, 9, 10, 12, 13, 14, 15, 16, 17], "those": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "who": [0, 2, 3, 5, 7, 8, 9, 11, 15, 16, 17], "process": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "publish": [0, 11, 16], "In": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "particular": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "review": [0, 11, 15], "feedback": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "suggest": [0, 2, 3, 7, 8, 12, 13, 16, 17], "rohan": 0, "alexand": 0, "isabella": 0, "ghement": 0, "virgilio": 0, "g\u00f3mez": 0, "rubio": 0, "albert": [0, 16], "kim": 0, "adam": 0, "loi": 0, "maria": 0, "prokofieva": 0, "emili": 0, "rieder": 0, "greg": 
[0, 15], "wilson": [0, 8, 15], "The": [0, 1, 5, 7, 8, 11, 14, 15, 16, 17], "improv": [0, 2, 3, 4, 7, 8, 12, 13, 15, 16], "substanti": [0, 3, 12], "insight": [0, 4, 10, 16], "give": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "special": [0, 2, 7, 8, 11, 15, 16, 17], "jim": 0, "zidek": 0, "support": [0, 2, 3, 8, 14, 16, 17], "encourag": [0, 5, 17], "throughout": [0, 2, 3, 8, 10, 15, 17], "roger": [0, 5, 8], "peng": [0, 5, 8], "gracious": 0, "offer": [0, 3, 7, 11, 12, 13, 15], "write": [0, 2, 8, 9, 13, 15, 17], "foreword": 0, "final": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "ow": 0, "debt": 0, "gratitud": 0, "over": [0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 14, 15, 16, 17], "past": [0, 2, 3, 4, 5, 11, 12, 13, 14, 15, 16], "few": [0, 2, 3, 4, 5, 7, 8, 11, 12, 13, 14, 15, 16, 17], "thei": [0, 2, 3, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "provid": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "invalu": 0, "worksheet": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "found": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "bug": [0, 15, 17], "us": [0, 1, 2, 3, 4, 5, 9, 10, 12, 13, 14, 16], "stood": 0, "veri": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "patient": [0, 2, 3], "class": [0, 2, 3, 8, 11, 16, 17], "while": [0, 2, 3, 4, 8, 10, 11, 13, 16, 17], "frantic": 0, "fix": [0, 2, 3, 7, 9, 12, 15, 16, 17], "brought": 0, "level": [0, 3, 4, 7, 8, 10, 13, 16], "enthusiasm": 0, "sustain": 0, "hard": [0, 8, 11, 16, 17], "work": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 16, 17], "creat": [0, 1, 2, 4, 7, 10, 11, 12, 13, 17], "interact": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "them": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "taught": [0, 2], "learn": [0, 1, 5, 10], "reflect": [0, 1, 16], "content": [0, 1, 2, 6, 11, 15, 17], "translat": [0, 11], "origin": [0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "which": [0, 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "focus": [0, 1, 2, 3, 5, 12], "r": [0, 1, 3, 
4, 5, 6, 7, 8, 11, 16], "languag": [0, 1, 2, 3, 5, 7, 9, 10, 11, 12, 14, 17], "ar": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "navya": 0, "dahiya": 0, "gloria": 0, "ye": [0, 2], "complet": [0, 1, 3, 7, 8, 9, 11, 12, 14, 15], "round": [0, 3, 7], "philip": 0, "austin": 0, "leadership": 0, "guidanc": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "gratefulli": 0, "educ": [0, 1, 2, 5], "resourc": [0, 1, 2, 12], "fund": 0, "earth": [0, 1, 16], "ocean": [0, 1], "atmospher": [0, 1, 16], "exercis": [0, 5, 10, 14], "version": [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "tiffani": [1, 6, 8, 16], "timber": [1, 6, 8, 16], "trevor": [1, 3, 4, 6, 13], "campbel": [1, 6], "melissa": [1, 6], "lee": [1, 6, 16], "adapt": [1, 16], "python": [1, 2, 5, 10, 12, 14, 15, 16], "joel": [1, 6], "ostblom": [1, 6], "lindsei": [1, 6], "heagi": [1, 6], "associ": [1, 7, 8, 10, 11, 15, 17], "professor": 1, "co": [1, 16], "director": 1, "vancouv": [1, 7, 11, 16, 17], "option": [1, 2, 9, 11, 13, 14, 15, 16, 17], "role": [1, 8, 11, 16], "she": 1, "curriculum": 1, "around": [1, 3, 5, 7, 8, 12, 13, 16, 17], "respons": [1, 2, 3, 4, 11, 12, 13, 15], "applic": [1, 3, 5, 7, 8, 11, 12, 13, 14, 17], "solv": [1, 2, 3, 4, 8, 10, 13, 15, 17], "real": [1, 2, 3, 7, 8, 11, 12, 13, 15, 17], "world": [1, 7, 8, 10, 15, 16, 17], "problem": [1, 3, 4, 5, 7, 8, 9, 10, 13, 15, 16, 17], "One": [1, 2, 3, 7, 8, 9, 12, 13, 15, 16, 17], "favorit": [1, 13], "graduat": 1, "collabor": [1, 5, 9, 10], "softwar": [1, 2, 10, 11, 14, 15, 16, 17], "packag": [1, 2, 3, 4, 5, 8, 11, 12, 13, 14, 16, 17], "modern": [1, 2, 5, 11, 16], "tool": [1, 2, 3, 4, 5, 8, 9, 10, 11, 13, 16, 17], "workflow": [1, 2, 3, 4, 5, 8, 9, 10, 12, 13], "research": [1, 4, 5, 16], "autom": [1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "scalabl": 1, "bayesian": 1, "infer": [1, 5], "algorithm": [1, 3, 12, 13, 16], "nonparametr": [1, 2, 12], "stream": 1, "theori": [1, 2, 4, 7, 12], "he": 1, "previous": [1, 2, 7, 8, 11, 12, 16, 17], 
"postdoctor": 1, "advis": [1, 11, 12, 16], "tamara": 1, "broderick": 1, "comput": [1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "artifici": 1, "intellig": 1, "laboratori": [1, 4], "csail": 1, "institut": [1, 3, 16], "system": [1, 11, 14, 15], "societi": 1, "idss": 1, "mit": 1, "ph": 1, "candid": [1, 3, 13], "under": [1, 5, 6, 9, 14, 15, 17], "jonathan": 1, "inform": [1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "decis": [1, 2, 3, 5, 7, 15], "lid": 1, "befor": [1, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17], "engin": [1, 11, 13, 14, 16], "toronto": [1, 11, 17], "assist": 1, "undergradu": [1, 7], "center": [1, 4, 5, 7, 11, 13, 16, 17], "approach": [1, 2, 3, 4, 7, 8, 10, 12, 13, 15, 17], "assess": [1, 3, 4, 12, 13, 15, 16], "promot": 1, "equiti": 1, "divers": [1, 7], "inclus": [1, 3, 13, 15], "phd": 1, "passion": 1, "reproduc": [1, 3, 4, 5, 7, 9, 10, 11, 15], "through": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "quantit": [1, 3, 4, 7, 8, 12, 16], "imag": [1, 2, 9, 10, 11, 14, 16], "analysi": [1, 2, 3, 4, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17], "pipelin": [1, 3, 4, 12], "studi": [1, 2, 3, 4, 5, 7, 8, 12, 16, 17], "stem": [1, 12], "cell": [1, 2, 3, 4, 8, 11, 13, 17], "development": 1, "biologi": [1, 15], "sinc": [1, 2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "lead": [1, 2, 3, 5, 8, 9, 15, 16], "workshop": [1, 2], "now": [1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "care": [1, 2, 3, 11, 12, 13, 16, 17], "deepli": [1, 17], "spread": [1, 2, 4, 7, 8, 16, 17], "literaci": 1, "excit": [1, 5, 8], "programmat": [1, 3, 11], "project": [1, 2, 9, 11, 16], "geophys": 1, "invers": 1, "facil": [1, 15], "combin": [1, 2, 3, 4, 10, 11, 13, 15, 16], "method": [1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 15, 16, 17], "numer": [1, 2, 7, 11, 12, 13, 16, 17], "simul": [1, 2, 7, 16], "machin": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "answer": [1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 16, 17], "question": [1, 2, 3, 4, 5, 7, 10, 12, 13, 16, 17], "subsurfac": 1, "primari": 
[1, 2, 8, 15, 16, 17], "includ": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "miner": 1, "explor": [1, 3, 4, 7, 11, 13, 15, 16, 17], "carbon": [1, 16], "sequestr": 1, "groundwat": 1, "environment": [1, 4], "bsc": 1, "alberta": [1, 11, 17], "held": [1, 3], "posit": [1, 3, 4, 8, 12, 16], "california": [1, 12], "berkelei": 1, "prior": [1, 2, 11, 15], "start": [1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "current": [1, 2, 8, 9, 11, 13, 15, 16, 17], "previou": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "sole": [2, 9], "descript": [2, 7, 8, 9, 10, 11, 15, 16, 17], "exploratori": [2, 3, 4, 8, 10, 12, 16], "next": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "serv": [2, 3, 5, 7, 9, 13, 15], "forai": [2, 12], "focu": [2, 3, 4, 7, 8, 9, 12, 13, 15, 16, 17], "e": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "one": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16], "valu": [2, 4, 7, 9, 12, 13, 16], "categor": [2, 3, 4, 7, 8, 12, 16, 17], "interest": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "cover": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "basic": [2, 3, 8, 11, 13, 15, 16], "make": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "suitabl": [2, 17], "classifi": 2, "accur": [2, 3, 7, 12, 13, 16], "well": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "where": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "possibl": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "maxim": [2, 3, 12], "accuraci": [2, 3, 7, 12, 13], "By": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "end": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "reader": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "abl": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "do": [2, 3, 4, 5, 8, 9, 11, 12, 13, 14, 15, 16], "follow": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "recogn": [2, 9, 11, 12, 15, 17], "situat": [2, 3, 4, 8, 12, 15, 16, 17], "appropri": [2, 3, 4, 8, 11, 12, 14, 16, 17], "what": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 16], "interpret": [2, 3, 4, 8, 
9, 11, 12, 13, 15, 16, 17], "output": [2, 3, 4, 8, 9, 11, 12, 13, 16, 17], "hand": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16, 17], "straight": [2, 4, 12, 13, 16], "line": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17], "euclidean": [2, 4], "graph": 2, "predictor": [2, 4, 12], "explain": [2, 4, 7, 8, 12, 13], "perform": [2, 4, 7, 8, 9, 10, 11, 12, 13, 16], "imput": 2, "step": [2, 3, 4, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17], "model": [2, 3, 4, 5, 13, 15, 16], "make_pipelin": [2, 3, 4, 12], "mani": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "want": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "base": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16], "experi": [2, 16], "For": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "instanc": [2, 3, 7, 8, 11, 17], "doctor": [2, 3], "mai": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "diagnos": [2, 3], "either": [2, 3, 4, 8, 9, 10, 12, 13, 15, 17], "diseas": 2, "healthi": 2, "symptom": 2, "s": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "email": [2, 11, 15], "might": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "tag": [2, 11, 14], "given": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "spam": 2, "text": [2, 3, 7, 8, 10, 12, 13, 14, 15, 17], "credit": 2, "card": 2, "compani": 2, "whether": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16], "purchas": [2, 4, 12, 13], "fraudul": 2, "item": [2, 4, 8, 9, 11, 12, 15, 16, 17], "amount": [2, 3, 4, 9, 11, 12, 13, 16], "locat": [2, 11, 15, 16], "These": [2, 3, 4, 5, 8, 9, 11, 13, 15, 16], "task": [2, 4, 7, 10, 12, 16, 17], "exampl": [2, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "sometim": [2, 3, 7, 8, 11, 12, 13, 14, 16, 17], "call": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "label": [2, 4, 8, 12, 16, 17], "other": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17], "featur": [2, 3, 9, 12, 13, 15, 16], "gener": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 16, 17], "assign": [2, 4, 7, 8, 11, 16, 17], "without": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "known": [2, 3, 8, 11, 13, 16], 
"g": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "basi": [2, 11, 16], "similar": [2, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "know": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "name": [2, 3, 4, 7, 9, 12, 13, 14, 15, 16], "come": [2, 4, 7, 8, 12, 13, 14, 16, 17], "fact": [2, 3, 7, 8, 9, 11, 13, 15], "onc": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "can": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "There": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "could": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "wide": [2, 3, 4, 5, 11, 13, 15, 16], "hart": [2, 12], "1967": [2, 3, 12], "hodg": [2, 12], "1951": [2, 12], "your": [2, 3, 4, 7, 8, 10, 11, 12, 13, 16, 17], "futur": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16], "you": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "encount": [2, 4, 11, 12, 13, 14, 17], "tree": [2, 3, 12], "vector": [2, 3, 11, 16], "svm": 2, "logist": [2, 3, 13], "neural": 2, "network": [2, 11], "see": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "addit": [2, 8, 12], "section": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "begin": [2, 3, 4, 7, 8, 11, 12, 15, 16, 17], "It": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "worth": [2, 3, 16, 17], "mention": [2, 3, 4, 7, 9, 11, 13, 14, 15, 17], "variat": [2, 7, 12, 16], "binari": [2, 3], "onli": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "involv": [2, 3, 4, 9, 11, 13, 14, 15, 16, 17], "diagnosi": [2, 3], "run": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "multiclass": 2, "categori": [2, 3, 4, 7, 8, 11, 16, 17], "bronchiti": 2, "pneumonia": 2, "common": [2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17], "cold": 2, "digit": 2, "breast": [2, 3], "dr": [2, 4, 16], "william": [2, 3, 4], "h": [2, 11], "wolberg": [2, 3], "w": [2, 8, 11], "nick": [2, 3, 8], "street": [2, 3], "olvi": [2, 3], "l": [2, 11], "mangasarian": [2, 3], "et": [2, 3, 4, 7, 11, 13, 15, 16], "al": [2, 3, 4, 7, 11, 13, 15, 16], "1993": [2, 3], "row": [2, 3, 4, 7, 9, 12, 13, 15, 16], "repres": [2, 3, 4, 
5, 7, 8, 11, 12, 13, 15, 16, 17], "tumor": [2, 8], "sampl": [2, 3, 12], "benign": [2, 3, 8, 12], "malign": [2, 3, 8, 12], "measur": [2, 3, 7, 8, 12, 13, 16, 17], "nucleu": 2, "textur": [2, 3], "perimet": [2, 3, 8], "area": [2, 3, 8, 11, 12, 13, 15, 16, 17], "conduct": [2, 11], "physician": 2, "As": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "analys": [2, 3, 4, 5, 8, 9, 10, 11, 15, 16, 17], "formul": [2, 7, 8, 12, 16], "precis": [2, 3, 7, 9, 12, 14, 15, 17], "here": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "avail": [2, 3, 5, 8, 10, 11, 13, 14, 16], "unknown": [2, 7, 8], "show": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "import": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "becaus": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "tradit": 2, "non": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "driven": [2, 4], "quit": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "subject": [2, 9, 15, 16], "depend": [2, 3, 4, 7, 9, 11, 12, 13, 16, 17], "upon": [2, 3, 11], "skill": [2, 5, 9, 11, 16], "experienc": 2, "furthermor": [2, 3, 5, 16], "normal": [2, 3, 7, 15, 17], "danger": [2, 14], "stai": [2, 7, 11, 16], "same": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "place": [2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 16, 17], "stop": [2, 3, 4, 9, 13, 14], "grow": [2, 3, 5, 13], "get": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "larg": [2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 16], "contrast": [2, 3, 4, 7, 8, 11, 12, 13, 15], "invad": 2, "surround": [2, 8, 11, 15, 16], "tissu": 2, "nearbi": [2, 3, 11], "organ": [2, 8, 9, 11, 15, 16, 17], "caus": [2, 3, 4, 8, 9, 12, 13, 16, 17], "seriou": [2, 8, 15], "damag": [2, 3], "stanford": 2, "health": [2, 5], "2021": [2, 8, 11], "thu": [2, 3, 9, 11, 12, 13, 15, 17], "quickli": [2, 3, 8, 13, 16], "type": [2, 3, 4, 7, 9, 11, 12, 13, 14, 15, 16], "guid": [2, 8, 12, 15, 16], "treatment": [2, 3, 17], "wrangl": [2, 3, 5, 8, 10, 11, 13, 16], "visual": [2, 3, 4, 7, 9, 10, 12, 13, 15, 17], "order": [2, 3, 4, 5, 7, 9, 10, 11, 12, 
13, 14, 15, 16, 17], "better": [2, 3, 4, 12, 13, 16], "understand": [2, 3, 4, 5, 7, 8, 10, 11, 13, 15, 16, 17], "panda": [2, 3, 4, 7, 9, 11, 12, 13, 16, 17], "altair": [2, 3, 4, 9, 12, 13], "pd": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "alt": [2, 3, 4, 7, 8, 12, 13, 16], "case": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "file": [2, 8, 14, 17], "contain": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "csv": [2, 3, 4, 7, 8, 12, 13, 16, 17], "header": [2, 8, 9, 15, 17], "ll": [2, 3, 7, 8, 11, 12, 14, 15, 16, 17], "read_csv": [2, 3, 4, 7, 8, 9, 12, 13, 16, 17], "function": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], "argument": [2, 3, 4, 7, 8, 9, 12, 16, 17], "inspect": [2, 8, 11, 16, 17], "wdbc": 2, "id": [2, 3, 7, 16], "radiu": [2, 3], "smooth": [2, 3, 12, 16], "compact": [2, 3], "concav": [2, 3], "concave_point": [2, 3], "symmetri": [2, 3], "fractal_dimens": [2, 3], "0": [2, 3, 4, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17], "842302": 2, "m": [2, 3, 8, 11, 16], "1": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "096100": 2, "2": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17], "071512": 2, "268817": 2, "983510": 2, "567087": 2, "3": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "280628": 2, "650542": 2, "530249": 2, "215566": 2, "253764": 2, "842517": 2, "828212": 2, "353322": 2, "684473": 2, "907030": 2, "826235": 2, "486643": 2, "023825": 2, "547662": 2, "001391": 2, "867889": 2, "84300903": 2, "578499": 2, "455786": 2, "565126": 2, "557513": 2, "941382": 2, "052000": 2, "362280": 2, "035440": 2, "938859": 2, "397658": 2, "84348301": 2, "768233": 2, "253509": 2, "592166": 2, "763792": 2, "280667": 2, "399917": 2, "914213": 2, "450431": 2, "864862": 2, "4": [2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "906602": 2, "84358402": 2, "748758": 2, "150804": 2, "775011": 2, "824624": 2, "280125": 2, "538866": 2, "369806": 2, "427237": 2, "009552": 2, "561956": 2, "564": [2, 3], "926424": 2, "109139": 2, "720838": 2, "058974": 2, "341795": 2, "040926": 2, "218868": 
2, "945573": 2, "318924": 2, "312314": 2, "930209": 2, "565": [2, 3], "926682": 2, "703356": 2, "083301": 2, "614511": 2, "722326": 2, "102368": 2, "017817": 2, "692434": 2, "262558": 2, "217473": 2, "057681": 2, "566": [2, 3], "926954": 2, "701667": 2, "043775": 2, "672084": 2, "577445": 2, "839745": 2, "038646": 2, "046547": 2, "105684": 2, "808406": 2, "894800": 2, "567": [2, 3], "927241": 2, "836725": 2, "334403": 2, "980781": 2, "733693": 2, "524426": 2, "269267": 2, "294046": 2, "656528": 2, "135315": 2, "042778": 2, "568": [2, 3], "92751": 2, "b": [2, 3], "806811": 2, "220718": 2, "812793": 2, "346604": 2, "109349": 2, "149741": 2, "113893": 2, "260710": 2, "819349": 2, "560539": 2, "569": [2, 3], "12": [2, 3, 4, 7, 8, 9, 11, 13, 14, 15, 16, 17], "column": [2, 3, 4, 7, 9, 12, 13, 14, 16], "biopsi": [2, 3], "remov": [2, 3, 8, 14, 15, 16], "bodi": [2, 15], "examin": [2, 3, 4, 11, 12], "presenc": [2, 3], "tradition": 2, "procedur": [2, 3, 4, 5, 12], "invas": 2, "fine": [2, 3, 9, 15, 16, 17], "needl": 2, "aspir": 2, "present": [2, 3, 5, 7, 8, 11, 15, 16, 17], "extract": [2, 3, 4, 8, 11, 12, 13], "small": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16], "less": [2, 3, 4, 7, 11, 12, 13, 15, 16, 17], "ten": [2, 7, 8, 16], "differ": [2, 3, 4, 5, 7, 8, 12, 13, 14, 15, 17], "below": [2, 3, 7, 8, 11, 13, 15, 16], "mean": [2, 3, 8, 9, 11, 12, 13, 15, 16, 17], "across": [2, 3, 5, 7, 8, 11, 12, 13, 15, 16], "nuclei": 2, "record": [2, 3, 7, 8, 11, 15, 16, 17], "part": [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 15, 16, 17], "prepar": [2, 3, 16], "have": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], "been": [2, 3, 9, 11, 12, 13, 15, 16, 17], "standard": [2, 3, 4, 7, 9, 12, 13, 15, 16, 17], "discuss": [2, 3, 5, 7, 11, 12, 13, 14, 15, 16, 17], "why": [2, 3, 8, 12, 16, 17], "later": [2, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "addition": [2, 3, 4, 7, 9, 11, 13, 15, 17], "uniqu": [2, 3, 5, 8, 15], "therefor": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 16, 17], "total": [2, 3, 4, 8, 11, 12, 16, 17], 
"per": [2, 7, 11, 15, 16, 17], "identif": 2, "number": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "deviat": [2, 3, 4, 7, 17], "grai": [2, 15, 17], "length": [2, 4, 7, 8, 16, 17], "contour": 2, "insid": [2, 3, 7, 8, 9, 11, 14, 15, 16, 17], "local": [2, 12, 14], "ratio": [2, 17], "squar": [2, 3, 4, 8, 9, 11, 12, 13, 16, 17], "portion": [2, 11], "mirror": 2, "fractal": 2, "dimens": 2, "rough": [2, 4, 16], "info": [2, 3, 8, 16, 17], "preview": [2, 3, 4, 7, 8, 9, 10, 12, 13, 15, 16, 17], "frame": [2, 3, 4, 7, 9, 11, 12, 13, 16], "easier": [2, 3, 7, 8, 11, 12, 13, 14, 15, 16, 17], "lot": [2, 3, 4, 8, 11, 13, 16, 17], "print": [2, 3, 7, 8, 9, 11, 13, 14, 16, 17], "down": [2, 9, 11, 14, 15, 17], "page": [2, 3, 4, 6, 9, 11, 13, 14, 15], "instead": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "entri": [2, 3, 7, 8, 11, 16, 17], "core": [2, 3, 5, 8, 16, 17], "datafram": [2, 3, 4, 7, 11, 12, 13, 16, 17], "rangeindex": [2, 3, 16, 17], "null": [2, 3, 16, 17], "count": [2, 3, 7, 8, 11, 16, 17], "dtype": [2, 3, 7, 16, 17], "int64": [2, 3, 11, 16, 17], "float64": [2, 3, 7, 11, 16, 17], "6": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "7": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "8": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "9": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "10": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "11": [2, 3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "memori": [2, 3, 11, 16, 17], "usag": [2, 3, 8, 11, 13, 16, 17], "53": [2, 3, 7, 13, 15], "kb": [2, 3, 16, 17], "abov": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16], "arrai": [2, 3, 4, 5, 12, 13], "readabl": [2, 3, 8, 11, 12, 15, 16, 17], "renam": [2, 3, 7, 8, 9, 11, 12, 17], "replac": [2, 3, 7, 8, 11, 13, 14, 15, 16], "take": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "dictionari": [2, 3, 11, 17], "map": [2, 4, 8, 11, 12, 13, 16], "desir": [2, 3, 8, 11, 12, 15, 17], "verifi": [2, 7, 14], "result": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "ani": [2, 3, 4, 7, 8, 9, 11, 12, 
13, 15, 16, 17], "let": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "groupbi": [2, 7], "size": [2, 3, 4, 7, 11, 12, 13, 17], "find": [2, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "percentag": [2, 3, 7, 8, 16], "pair": [2, 3, 4, 11, 17], "Then": [2, 3, 4, 7, 8, 9, 12, 13, 14, 15, 16, 17], "calcul": [2, 3, 4, 12, 13], "group": [2, 3, 4, 7, 8, 14, 16], "divid": [2, 3, 8, 11, 16, 17], "multipli": [2, 8, 16], "equal": [2, 3, 4, 7, 12, 13, 17], "access": [2, 3, 4, 7, 12, 14, 16, 17], "via": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16], "attribut": [2, 3, 4, 6, 8, 11, 12], "357": [2, 3], "63": [2, 3, 11, 12], "212": [2, 4, 8, 11, 16, 17], "37": [2, 3, 4, 15, 16], "62": [2, 11, 12, 16], "741652": 2, "258348": 2, "conveni": [2, 3, 8, 11, 17], "value_count": [2, 3, 7, 17], "occurr": [2, 16], "If": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "pass": [2, 3, 8, 11, 12, 16, 17], "seri": [2, 3, 7, 12, 13], "occur": [2, 3, 4, 7, 9, 12, 13, 15, 16, 17], "true": [2, 3, 4, 7, 8, 9, 12, 16, 17], "fraction": [2, 3, 7, 12, 15, 16], "627417": 2, "372583": 2, "proport": [2, 3, 8, 16, 17], "draw": [2, 7, 8, 12, 13, 16], "color": [2, 3, 4, 11, 12, 13, 17], "scatter": [2, 3, 4, 12, 13], "plot": [2, 3, 4, 7, 12, 13, 17], "relationship": [2, 3, 4, 7, 8, 12, 13, 16, 17], "recal": [2, 3, 4, 7, 8, 12, 13, 15, 16, 17], "default": [2, 3, 8, 9, 11, 12, 14, 15, 16, 17], "palett": 2, "colorblind": [2, 16], "friendli": [2, 16], "so": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "stick": [2, 3, 11, 15], "perim_concav": [2, 3], "chart": [2, 3, 4, 7, 12, 13], "mark_circl": [2, 3, 4, 12, 13, 16], "encod": [2, 3, 4, 7, 8, 11, 12, 13, 16], "x": [2, 3, 4, 7, 8, 11, 12, 13, 14, 16], "titl": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16], "y": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16], "versu": [2, 3, 4, 8, 11, 12, 13, 17], "fig": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "typic": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "fall": [2, 3, 7, 12, 15, 16], "upper": [2, 7, 15, 16], "right": [2, 3, 4, 7, 8, 
9, 10, 11, 13, 14, 15, 16, 17], "corner": [2, 3, 14, 15, 16], "lower": [2, 3, 7, 9, 13, 16], "left": [2, 3, 4, 6, 8, 9, 11, 13, 14, 15, 16, 17], "word": [2, 3, 7, 8, 9, 11, 12, 13, 15, 17], "tend": [2, 3, 12, 15, 16], "ones": [2, 16, 17], "larger": [2, 3, 4, 7, 11, 12, 13, 15, 16], "suppos": [2, 4, 7, 8, 9, 11, 12, 15, 17], "obtain": [2, 3, 4, 7, 8, 12, 13, 15, 16, 17], "except": [2, 11, 13, 15, 17], "sai": [2, 3, 7, 9, 11, 12, 13, 14, 16, 17], "respect": [2, 3, 4, 7, 11, 15, 16, 17], "lie": 2, "middl": [2, 7, 11], "orang": [2, 4, 12, 13], "cloud": [2, 11, 15, 16], "probabl": [2, 3, 7, 11, 13], "seem": [2, 3, 5, 7, 9, 11, 12, 13, 16, 17], "actual": [2, 3, 4, 7, 8, 11, 12, 13, 15, 17], "practic": [2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 15, 16, 17], "To": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "most": [2, 3, 4, 5, 7, 8, 9, 11, 15, 16, 17], "must": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "choos": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 17], "advanc": [2, 3, 4, 5, 7, 9, 13, 14, 15, 16, 17], "assum": [2, 7, 9], "someon": [2, 3, 8, 9, 15], "chosen": [2, 3, 4, 13, 17], "ourselv": [2, 3, 12], "illustr": [2, 3, 7, 12, 13, 16, 17], "concept": [2, 5, 7, 8, 10, 12, 13, 15, 16], "walk": [2, 8, 12, 15], "whose": [2, 9, 11, 15, 17], "depict": [2, 4], "red": [2, 4, 9, 11, 12, 13, 14, 15], "diamond": 2, "coordin": [2, 4, 8, 16], "idea": [2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "close": [2, 3, 4, 7, 8, 11, 12, 15, 16], "anoth": [2, 3, 7, 8, 9, 11, 12, 13, 15, 16, 17], "expect": [2, 3, 4, 7, 8, 9, 11, 12, 13, 17], "look": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "doe": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16], "consid": [2, 3, 4, 5, 7, 8, 12, 13, 15, 16, 17], "closest": [2, 3, 11, 16], "among": [2, 11, 15, 17], "major": [2, 3, 4, 8, 12, 13, 16, 17], "shown": [2, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "vote": [2, 3, 8], "three": [2, 3, 4, 7, 8, 9, 10, 11, 12, 15, 16, 17], "chose": [2, 3, 16], "noth": [2, 7, 8, 13], "though": [2, 3, 7, 8, 11, 12, 13, 15, 16, 17], "odd": [2, 
11], "avoid": [2, 3, 13, 16], "ti": [2, 11], "decid": [2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "often": [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 16, 17], "just": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "denot": [2, 4, 8, 11, 12, 13, 16, 17], "a_x": 2, "a_i": 2, "b_x": 2, "b_y": 2, "definit": [2, 5, 11, 16, 17], "plane": [2, 13], "formula": [2, 3, 4, 12, 13, 16], "mathrm": [2, 3], "sqrt": [2, 3, 12, 15], "select": [2, 4, 7, 9, 11, 12, 13, 14, 15, 16], "correspond": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "smallest": [2, 8, 12, 16, 17], "code": [2, 3, 5, 8, 10, 11, 12, 14, 15, 16, 17], "add": [2, 3, 4, 7, 8, 11, 12, 14, 16, 17], "root": [2, 3, 11, 12, 15], "nsmallest": [2, 12, 16], "new_obs_perimet": 2, "new_obs_concav": 2, "dist_from_new": 2, "112": 2, "241202": 2, "653051": 2, "880626": 2, "258": 2, "750277": 2, "870061": 2, "979663": 2, "351": 2, "622700": 2, "541410": 2, "143088": 2, "430": 2, "416930": 2, "314364": 2, "256806": 2, "152": 2, "160091": 2, "039155": 2, "279258": 2, "tabl": [2, 3, 6, 8, 9, 11, 14, 16, 17], "mathemat": [2, 3, 7, 12, 13, 16], "detail": [2, 3, 4, 8, 9, 11, 13, 14, 15, 16, 17], "24": [2, 15, 16], "65": [2, 3, 7, 11, 12, 17], "88": [2, 3], "75": [2, 3, 7, 8, 11, 12, 13, 16, 17], "87": [2, 3, 12], "98": [2, 8, 13, 16], "54": [2, 3, 15, 16, 17], "14": [2, 3, 4, 7, 8, 9, 11, 15, 16, 17], "42": [2, 7, 15, 16, 17], "31": [2, 3, 15, 16, 17], "26": [2, 3, 15, 16], "16": [2, 3, 4, 7, 11, 15, 16], "04": [2, 11, 14, 16, 17], "28": [2, 4, 12, 13, 15, 16], "circl": [2, 9, 15, 16], "although": [2, 3, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "toward": [2, 7, 8, 15], "exactli": [2, 3, 7, 8, 11, 12, 13, 14, 16], "appli": [2, 3, 5, 8, 12, 13, 16], "higher": [2, 3, 7, 8, 12, 13, 16, 17], "help": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "a_": 2, "dot": [2, 8, 11, 12, 13, 16], "b_": 2, "becom": [2, 3, 4, 5, 7, 8, 9, 12, 13, 15, 17], "still": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16], "space": [2, 9, 11, 12, 13, 14, 16], "417": [2, 16], "837": 2, 
"had": [2, 3, 7, 8, 11, 12, 16, 17], "ad": [2, 3, 4, 11, 12, 13, 15], "up": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16], "took": [2, 7, 8], "27": [2, 8, 12, 15, 16], "new_obs_symmetri": 2, "836722": 2, "267368": 2, "400": [2, 12, 17], "334664": 2, "886368": 2, "099359": 2, "472326": 2, "562": 2, "470430": 2, "084810": 2, "154075": 2, "499268": 2, "68": 2, "365450": 2, "812359": 2, "092064": 2, "531594": 2, "055065": 2, "555575": 2, "dimension": 2, "five": [2, 3, 14, 16, 17], "3d": [2, 12, 13], "note": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "recommend": [2, 8, 9, 10, 12, 13, 14, 15, 17], "against": [2, 9, 12, 13], "purpos": [2, 3, 4, 7, 11, 12, 13, 15, 16, 17], "complic": [2, 8, 11, 12, 16], "handl": [2, 3, 8, 16], "multipl": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16], "thankfulli": [2, 4], "implement": [2, 3, 4, 5, 13, 16], "buitinck": 2, "2013": [2, 3, 4, 13], "along": [2, 3, 5, 7, 8, 11, 12, 14, 15, 16], "sklearn": [2, 3, 4, 12, 13], "keep": [2, 3, 7, 8, 11, 14, 15, 16, 17], "simpl": [2, 3, 4, 7, 11, 12, 14, 16, 17], "fewer": [2, 3], "mistak": [2, 3, 12, 16], "tell": [2, 3, 7, 8, 9, 11, 12, 13, 15, 16, 17], "prefer": [2, 3, 4, 11, 13, 16, 17], "regular": [2, 11, 12, 15, 16, 17], "set_config": [2, 3, 4, 12, 13], "notic": [2, 3, 7, 8, 11, 13, 16, 17], "wai": [2, 3, 4, 7, 8, 9, 10, 11, 14, 15, 16, 17], "prefix": 2, "extens": [2, 9, 11, 13, 14, 15, 16], "subsequ": [2, 8, 16], "long": [2, 3, 4, 7, 8, 9, 11, 13, 15, 16], "clutter": [2, 16], "kneighborsclassifi": [2, 3], "38": [2, 4, 12, 15, 16], "charact": [2, 8, 9, 11, 15, 16, 17], "transform_output": [2, 3, 4, 12, 13], "modul": 2, "build": [2, 3, 5, 12, 16], "pick": [2, 3, 4, 11, 13, 15, 16], "store": [2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17], "cancer_train": [2, 3], "specifi": [2, 3, 7, 8, 9, 11, 12, 13, 14, 16, 17], "weight": 2, "control": [2, 3, 9, 10, 11, 14], "uniform": [2, 3, 11], "choic": [2, 3, 4, 7, 12, 15, 16, 17], "weigh": [2, 8], "websit": [2, 3, 6, 11, 13, 15], "knn": [2, 3], "n_neighbor": [2, 3, 12], 
"jupyt": [2, 3, 4, 5, 8, 10, 13, 14], "environ": [2, 3, 4, 5, 8, 9, 13, 14, 15], "pleas": [2, 3, 4, 6, 8, 9, 13], "rerun": [2, 3, 4, 13], "html": [2, 3, 4, 13, 16, 17], "represent": [2, 3, 4, 11, 13], "trust": [2, 3, 4, 7, 13], "notebook": [2, 3, 4, 5, 13, 14, 15], "On": [2, 3, 4, 8, 11, 12, 13, 15, 16, 17], "github": [2, 3, 4, 5, 8, 11, 13, 16], "unabl": [2, 3, 4, 11, 13, 15], "render": [2, 3, 4, 9, 13, 15, 16], "try": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "nbviewer": [2, 3, 4, 13], "org": [2, 3, 4, 7, 8, 11, 13, 16, 17], "kneighborsclassifierkneighborsclassifi": [2, 3], "fit": [2, 3, 4, 12, 13, 16], "much": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "outsid": [2, 3, 7, 9, 12, 13, 15, 16], "heavi": 2, "lift": 2, "modifi": [2, 3, 15], "after": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "itself": [2, 3, 5, 7, 11, 13, 16, 17], "ran": 2, "manual": [2, 3, 4, 7, 9, 11, 12, 14, 17], "time": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 17], "new_ob": 2, "Is": [2, 4, 8, 12, 16, 17], "don": [2, 3, 4, 7, 8, 9, 11, 12, 15, 16, 17], "t": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "necessarili": [2, 3, 8, 17], "correct": [2, 3, 8, 14, 15, 16, 17], "quantifi": [2, 3, 13], "think": [2, 3, 5, 8, 9, 11, 13, 17], "rang": [2, 3, 4, 11, 12, 13, 16, 17], "matter": [2, 12, 16, 17], "identifi": [2, 3, 4, 8, 10, 11, 12, 15, 16], "effect": [2, 4, 7, 8, 12, 13, 14, 17], "But": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "doesn": [2, 3, 9, 11, 16, 17], "salari": 2, "dollar": [2, 7, 11, 12, 13], "job": [2, 11, 16], "1000": [2, 3, 7, 16], "huge": [2, 11], "compar": [2, 3, 7, 8, 11, 12, 15, 16, 17], "conceptu": [2, 15], "opposit": 2, "yearli": 2, "temperatur": 2, "degre": 2, "kelvin": 2, "celsiu": 2, "constant": [2, 13, 16], "shift": [2, 8, 9], "273": [2, 17], "even": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "likewis": [2, 17], "hypothet": 2, "thousand": [2, 3, 11, 16], "singl": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "affect": [2, 3, 8, 9, 12, 13, 16], "chang": [2, 3, 4, 
7, 8, 9, 11, 12, 13, 14, 16, 17], "outcom": [2, 8], "averag": [2, 3, 7, 8, 11, 12, 13, 17], "central": 2, "subtract": [2, 3, 8], "said": [2, 3], "unstandard": [2, 4], "wisconsin": 2, "until": [2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17], "did": [2, 3, 7, 8, 10, 11, 12, 13, 15, 16, 17], "earlier": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "thing": [2, 3, 7, 9, 11, 14, 15, 16, 17], "unscaled_canc": 2, "wdbc_unscal": [2, 3], "1001": 2, "11840": [2, 3], "1326": 2, "08474": [2, 3], "1203": 2, "10960": [2, 3], "386": 2, "14250": [2, 3], "1297": 2, "10030": [2, 3], "1479": 2, "11100": [2, 3], "1261": 2, "09780": [2, 3], "858": 2, "08455": [2, 3], "1265": 2, "11780": [2, 3], "181": [2, 4], "05263": [2, 3], "unscal": 2, "uncent": 2, "Will": 2, "framework": [2, 5, 13], "preprocessor": [2, 3, 4, 12], "manipul": [2, 11, 17], "standardscal": [2, 3, 4, 12], "transform": [2, 3, 4, 8, 12, 13, 17], "wrap": [2, 3, 4, 12], "columntransform": [2, 3, 4], "make_column_transform": [2, 3, 4, 12], "enabl": [2, 9, 11, 14, 15, 16, 17], "handi": [2, 8, 17], "sequenc": [2, 3, 8, 11, 14, 16], "compos": [2, 3, 4, 8, 12], "x27": [2, 3, 4], "columntransformercolumntransform": [2, 3, 4], "standardscalerstandardscal": [2, 3, 4], "individu": [2, 3, 7, 8, 13, 15, 16], "difficult": [2, 3, 4, 5, 8, 9, 11, 13, 16, 17], "rather": [2, 3, 7, 8, 9, 11, 12, 15, 16, 17], "make_column_selector": [2, 3], "dtype_includ": [2, 3], "equival": [2, 8, 11, 13, 17], "lt": 2, "_column_transform": 2, "0x7fd984a1ae50": 2, "gt": 2, "readi": [2, 3, 8, 9, 11, 14, 15], "happen": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "necessari": [2, 4, 12, 14, 16], "bit": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16, 17], "unnecessari": 2, "howev": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "quantiti": [2, 3, 5, 7, 16, 17], "scaled_canc": 2, "standardscaler__area": 2, "standardscaler__smooth": 2, "984375": 2, "568466": 2, "908708": 2, "826962": 2, "558884": 2, "942210": 2, "764464": 2, "283553": 2, "826229": 2, "280372": 2, "343856": 2, 
"041842": 2, "723842": 2, "102458": 2, "577953": 2, "840484": 2, "735218": 2, "525767": 2, "347789": 2, "112085": 2, "woohoo": 2, "input": [2, 3, 4, 8, 11, 12, 15, 17], "behavior": [2, 4, 12, 16, 17], "drop": [2, 3, 9, 14, 15, 16, 17], "remain": [2, 3, 4, 8, 14], "rest": [2, 3, 8, 13, 17], "remaind": [2, 3, 8, 11, 12, 17], "passthrough": 2, "separ": [2, 3, 4, 8, 9, 15, 16], "underscor": [2, 8, 9, 15, 17], "again": [2, 3, 7, 8, 9, 11, 12, 13, 14, 16, 17], "preserv": [2, 3], "verbose_feature_names_out": [2, 4], "fals": [2, 3, 4, 8, 11, 12, 13, 16, 17], "should": [2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 16, 17], "leav": [2, 4, 13], "preprocessor_keep_al": 2, "scaled_cancer_al": 2, "wonder": [2, 7, 11], "technic": [2, 3, 8, 9, 12, 14, 15, 16, 17], "error": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "prone": [2, 3, 11, 17], "accident": [2, 3, 9, 11, 15, 16, 17], "forget": [2, 4, 15], "proper": 2, "free": [2, 3, 13, 15], "requir": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "yourself": [2, 4, 8, 11, 13, 15], "further": [2, 3, 4, 7, 8, 9, 11, 13, 16, 17], "automat": [2, 3, 4, 11, 12, 15, 16], "streamlin": 2, "effort": [2, 9, 11, 15], "side": [2, 6, 7, 8, 14, 15, 16], "annot": [2, 4, 16], "within": [2, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "nearli": [2, 4, 13, 17], "vertic": [2, 7, 8, 12, 13, 16, 17], "align": [2, 11, 16], "black": [2, 4, 11, 12, 16], "region": [2, 3, 11, 12, 17], "domin": 2, "intuit": [2, 3, 12, 16, 17], "reason": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "carefulli": [2, 4, 8, 11, 17], "domain": [2, 8, 11, 16], "comparison": [2, 7, 13, 16, 17], "potenti": [2, 3, 4, 12, 13, 17], "issu": [2, 8, 9, 11, 13, 14, 16, 17], "imbal": 2, "overal": [2, 3, 8, 12, 16], "pattern": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "otherwis": [2, 3, 4, 7, 8, 16], "rare": [2, 4, 16], "malici": 2, "detect": [2, 4], "rarer": 2, "unimport": 2, "revisit": [2, 3, 11, 13, 17], "head": [2, 9, 11, 14, 15, 16], "top": [2, 3, 6, 8, 9, 11, 12, 13, 14, 16, 17], "concat": [2, 7], "glue": 2, 
"filter": [2, 7, 11, 16], "back": [2, 3, 5, 7, 9, 11, 12, 13, 14, 15, 16, 17], "concaten": [2, 7], "axi": [2, 8, 12, 13, 15, 17], "yield": [2, 3, 7], "taller": 2, "horizont": [2, 8, 16], "produc": [2, 3, 5, 8, 9, 13, 16, 17], "wider": [2, 7, 8, 17], "imbalanc": [2, 3], "rare_canc": 2, "rare_plot": 2, "With": [2, 4, 8, 11, 16, 17], "least": [2, 3, 4, 7, 9, 16], "win": 2, "highlight": [2, 4, 7, 9, 11, 12, 13, 14, 15, 17], "13": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "background": [2, 3, 7, 11, 13, 16], "blue": [2, 4, 9, 12, 15, 17], "indic": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "despit": [2, 3, 11, 16], "simplic": [2, 3, 15], "sound": [2, 3, 9], "manner": [2, 5, 9, 13], "fairli": [2, 3, 7, 14, 16], "nuanc": 2, "suffic": [2, 7], "rebal": 2, "oversampl": 2, "replic": [2, 7], "power": [2, 3, 5, 8, 11, 15, 16, 17], "own": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "increas": [2, 3, 4, 5, 7, 12, 13, 16, 17], "n": [2, 3, 4, 7, 8, 11, 12, 16, 17], "randomli": [2, 3, 4, 7, 13], "properli": [2, 3, 16], "random": [2, 7, 12, 13], "malignant_canc": 2, "benign_canc": 2, "malignant_cancer_upsampl": 2, "upsampled_canc": 2, "vice": [2, 3], "versa": [2, 3], "closer": [2, 16], "upsampl": 2, "wild": [2, 8, 13], "unfortun": [2, 3, 4, 7, 9, 11, 13, 16], "challeng": [2, 15, 17], "reli": [2, 3, 9, 12, 13, 16], "expert": [2, 3, 8, 14], "knowledg": [2, 8, 13, 15, 17], "relat": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "survei": [2, 8, 17], "particip": [2, 3], "margin": [2, 9], "peopl": [2, 5, 7, 8, 9, 12, 13, 15, 16, 17], "respond": [2, 11, 15], "certain": [2, 8, 11, 15, 16], "kind": [2, 3, 4, 7, 8, 11, 16], "fear": [2, 8], "honestli": 2, "neg": [2, 3, 9, 12, 13, 15, 16, 17], "consequ": [2, 3, 7, 9, 17], "simpli": [2, 3, 11, 16, 17], "throw": 2, "awai": [2, 3, 7, 11, 12, 13, 15, 17], "bia": 2, "conclus": [2, 7, 8, 16], "inadvert": [2, 9], "ignor": [2, 3, 8, 12, 17], "easili": [2, 3, 4, 8, 9, 10, 11, 15, 16, 17], "mislead": 2, "detriment": 2, "impact": [2, 4, 
5, 7, 13, 17], "techniqu": [2, 3, 4, 5, 7, 8, 11, 13, 16], "deal": [2, 9, 11], "isn": [2, 3, 8, 11, 12, 16], "anyth": [2, 3, 8, 13, 17], "els": [2, 8, 9, 11], "subset": [2, 7, 9, 11, 12, 13, 17], "missing_canc": 2, "wdbc_miss": 2, "nan": [2, 11, 17], "475956": 2, "834601": 2, "386808": 2, "169878": 2, "160508": 2, "137124": 2, "henc": [2, 3, 4, 9, 11, 12, 16], "too": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "accomplish": [2, 3, 7, 8, 9, 16, 17], "dropna": 2, "no_missing_canc": 2, "strategi": [2, 3, 16], "fill": [2, 9, 11, 13, 16], "synthet": 2, "simpleimput": 2, "simpleimputersimpleimput": 2, "directli": [2, 3, 4, 7, 8, 9, 14, 15, 17], "imputed_canc": 2, "846860": 2, "384942": 2, "document": [2, 4, 9, 10, 11, 14, 15, 16, 17], "crucial": 2, "critic": [2, 5, 7, 8, 9, 13, 16, 17], "chain": [2, 17], "intermedi": [2, 8], "whole": [2, 3, 4, 7, 11, 15, 17], "scratch": [2, 7, 15, 16], "nn": [2, 3], "knn_pipelin": [2, 3], "pipelinepipelin": [2, 3, 4], "500": [2, 7, 12, 13], "075": 2, "1500": 2, "new_observ": 2, "second": [2, 3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "15": [2, 3, 4, 7, 8, 9, 11, 13, 15, 16, 17], "seen": [2, 3, 5, 12, 13, 15, 16, 17], "littl": [2, 3, 11, 12, 13, 16, 17], "grid": [2, 3, 12, 16], "meshgrid": 2, "numpi": [2, 3, 4, 7, 11, 12, 13, 16, 17], "high": [2, 3, 5, 7, 8, 9, 10, 13], "transpar": [2, 8], "low": [2, 3, 13], "opac": [2, 12, 16], "np": [2, 3, 4, 7, 12, 13], "val": 2, "arrang": [2, 7, 8, 16], "are_grid": 2, "linspac": 2, "min": [2, 12, 13, 16, 17], "95": [2, 3, 7, 8, 11, 13, 16], "max": [2, 3, 12, 13, 16, 17], "05": [2, 8, 11, 16], "50": [2, 3, 7, 8, 11, 12, 13, 15, 17], "smo_grid": 2, "asgrid": 2, "reshap": [2, 17], "knnpredgrid": 2, "bind": 2, "prediction_t": 2, "copi": [2, 11, 15, 17], "unscaled_plot": 2, "mark_point": [2, 16], "40": [2, 3, 7, 8, 11, 15, 16, 17], "nice": [2, 3, 9, 11, 13, 16], "fade": 2, "prediction_plot": 2, "300": [2, 3, 7, 16, 17], "accompani": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "repositori": [2, 3, 4, 7, 
8, 11, 12, 13, 14, 16, 17], "launch": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "browser": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "click": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "binder": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "button": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "view": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "download": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "sure": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "instruct": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "setup": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "ensur": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "intend": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "blb": 2, "lar": 2, "gill": 2, "loupp": 2, "mathieu": 2, "blondel": 2, "fabian": 2, "pedregosa": 2, "andrea": 2, "mueller": 2, "olivi": 2, "grisel": 2, "vlad": 2, "nicula": 2, "peter": [2, 12], "prettenhof": 2, "alexandr": 2, "gramfort": 2, "jaqu": 2, "grobler": 2, "robert": [2, 3, 4, 13], "layton": 2, "jake": 2, "vanderpla": [2, 16], "arnaud": 2, "joli": 2, "brian": [2, 16], "holt": 2, "ga": [2, 16], "\u00eb": 2, "varoquaux": 2, "api": 2, "design": [2, 3, 9, 11, 15, 16, 17], "ecml": 2, "pkdd": 2, "mine": [2, 7], "108": [2, 3], "122": [2, 3], "ch67": [2, 12], "thoma": [2, 12], "ieee": [2, 4, 12], "transact": [2, 4, 12], "21": [2, 3, 8, 11, 12, 15, 16], "fh51": [2, 12], "evelyn": [2, 3, 12], "joseph": [2, 12], "discriminatori": [2, 12], "discrimin": [2, 3, 12], "consist": [2, 4, 7, 8, 11, 12, 14, 15, 16, 17], "properti": [2, 3, 8, 11, 12, 13, 16, 17], "report": [2, 3, 7, 8, 9, 12, 16, 17], "usaf": [2, 12], "school": [2, 5, 8, 12], "aviat": [2, 12], "medicin": [2, 12], "randolph": [2, 12], "field": [2, 5, 11, 12, 16], "texa": [2, 12], "swm93": [2, 3], "nuclear": [2, 3], "intern": [2, 3, 6, 16], "symposium": [2, 3], "electron": [2, 3, 15], "technolog": [2, 3, 16], "stanfordhcare21": 2, "url": [2, 3, 4, 7, 8, 13, 14, 15, 16, 
17], "http": [2, 3, 4, 6, 7, 8, 10, 11, 13, 14, 15, 16, 17], "stanfordhealthcar": 2, "medic": [2, 3], "condit": 2, "continu": [3, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "its": [3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "describ": [3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "matric": 3, "neighbor": [3, 13], "k": [3, 8, 11, 16], "nearest": [3, 4, 13], "estim": [3, 7, 8, 10, 12, 13], "underfit": [3, 13], "advantag": [3, 4, 7, 11, 12, 13, 14, 15, 16, 17], "disadvantag": [3, 4, 12, 13, 16], "wrong": [3, 7, 8, 13, 16, 17], "cancer": 3, "ask": [3, 4, 5, 7, 11, 12, 13, 15, 16, 17], "kei": [3, 5, 7, 8, 11, 14, 15, 16, 17], "impli": [3, 7], "between": [3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "oppos": [3, 11, 12, 16, 17], "memor": 3, "visit": [3, 6, 7, 8, 11, 14, 15, 16], "hospit": 3, "more": [3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "trick": 3, "asid": [3, 8, 11, 13], "match": [3, 11, 12, 13, 15, 16, 17], "observ": [3, 4, 7, 8, 12, 13, 15, 16, 17], "confid": [3, 7, 12], "golden": 3, "rule": [3, 7, 8, 12, 16], "cannot": [3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "than": [3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "realli": [3, 7, 8, 11, 12, 16, 17], "imagin": [3, 7, 9, 11, 15, 16, 17], "bad": [3, 4, 11, 16], "overestim": [3, 7], "made": [3, 4, 8, 12, 13, 14, 15, 16, 17], "frac": [3, 4, 7, 12], "summar": [3, 7, 8, 10, 11, 16, 17], "stori": [3, 9, 12, 16], "alon": [3, 7, 15], "comprehens": [3, 4, 7], "each": [3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "correctli": [3, 8, 11, 14, 16, 17], "incorrectli": 3, "57": 3, "bottom": [3, 9, 14, 15], "roughli": [3, 4, 7, 12, 13, 16], "89": [3, 8, 16], "892": 3, "misclassifi": 3, "disastr": 3, "receiv": [3, 11, 15], "particularli": [3, 11, 13, 16], "unaccept": 3, "term": [3, 4, 7, 8, 11, 12, 16, 17], "talk": [3, 11, 16], "four": [3, 4, 8, 10, 16], "perfect": [3, 16], "zero": [3, 4, 12, 13, 16, 17], "almost": [3, 4, 8, 11, 12, 16], "two": [3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "commonli": [3, 7, 8, 9, 13, 15, 16, 17], "metric": [3, 4, 12, 
13], "togeth": [3, 4, 7, 9, 11, 16, 17], "inde": [3, 4, 7, 8, 11, 13, 16, 17], "20": [3, 7, 8, 11, 13, 15, 16, 17], "quad": [3, 4], "25": [3, 7, 8, 11, 12, 15, 16, 17], "rel": [3, 4, 8, 16], "context": [3, 11, 12, 13, 16, 17], "certainli": [3, 7], "achiev": [3, 8, 12, 16, 17], "guess": [3, 4, 7, 8], "everi": [3, 7, 8, 9, 11, 13, 15, 17], "similarli": [3, 8, 11, 16, 17], "never": [3, 8, 12, 15], "obsev": 3, "Of": [3, 7, 13, 17], "somewher": [3, 8, 11, 12, 16], "extrem": [3, 7, 12, 13], "trade": [3, 4], "off": [3, 4, 7, 13], "fair": [3, 11, 12], "unbias": 3, "influenc": [3, 4, 7, 12, 13, 16], "human": [3, 4, 7, 11, 15, 16, 17], "counter": 3, "main": [3, 8, 14, 17], "tenet": 3, "determin": [3, 4, 7, 12, 14, 15, 16, 17], "everyth": [3, 7, 8, 14, 17], "point": [3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "investig": [3, 7, 8, 11, 16], "integ": [3, 11, 16, 17], "At": [3, 8, 9, 10, 11, 13], "track": [3, 7, 8, 15, 17], "to_list": 3, "convert": [3, 7, 11, 12, 16, 17], "nums_0_to_9": 3, "5": [3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "random_numbers1": 3, "appear": [3, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16], "fresh": [3, 9], "batch": 3, "random_numbers2": 3, "forc": [3, 16], "random_numbers1_again": 3, "random_numbers2_again": 3, "And": [3, 7, 8, 11, 12, 13, 15, 16, 17], "4235": 3, "random_numbers1_differ": 3, "random_numbers2_differ": 3, "beyond": [3, 4, 8, 11, 12, 13, 14, 15, 16, 17], "explicitli": [3, 11, 15, 16, 17], "insert": [3, 15, 17], "therebi": [3, 16], "global": [3, 16], "drawback": 3, "buri": 3, "undesir": 3, "entir": [3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "plai": [3, 8, 11, 14], "random_st": 3, "pcg64": 3, "rng": 3, "random_numbers1_third": 3, "random_numbers2_third": 3, "load": [3, 4, 9, 10, 11, 12, 13, 16, 17], "quick": [3, 8, 11], "re": [3, 4, 8, 9, 10, 11, 12, 15, 16, 17], "scale": [3, 4, 11, 12, 13, 15, 16], "done": [3, 8, 9, 11, 14, 15, 16, 17], "preliminari": 3, "train_test_split": [3, 12, 13], "shuffl": 3, "stratifi": [3, 12], "exist": [3, 8, 9, 
11, 13, 14, 15, 16, 17], "train_siz": [3, 12, 13], "model_select": [3, 12, 13], "cancer_test": 3, "index": [3, 8, 11, 15, 17], "426": 3, "196": [3, 4, 7, 8, 12], "296": 3, "43": [3, 4, 15, 16], "143": 3, "116": 3, "miss": [3, 4, 16, 17], "626761": 3, "373239": 3, "last": [3, 7, 8, 10, 11, 15, 16, 17], "sensit": [3, 8, 13], "consider": 3, "aspect": [3, 7, 13, 16], "fortun": [3, 7, 8, 11, 12, 13, 17], "construct": [3, 7, 8, 11, 16, 17], "cancer_preprocessor": 3, "augment": [3, 4], "864726": 3, "146": [3, 7], "869691": 3, "86": 3, "86135501": 3, "846226": 3, "105": [3, 7, 8, 11, 16, 17], "863030": 3, "244": 3, "884180": 3, "23": [3, 11, 15, 16, 17], "851509": 3, "125": [3, 8, 17], "86561": 3, "281": 3, "8912055": 3, "84799002": 3, "score": [3, 11, 12], "8951048951048951": 3, "90": [3, 7, 8, 16, 17], "precision_scor": 3, "recall_scor": 3, "y_true": [3, 12, 13], "y_pred": [3, 12, 13], "pos_label": 3, "8275862068965517": 3, "9056603773584906": 3, "83": 3, "91": [3, 7], "crosstab": 3, "alphabet": [3, 8, 16, 17], "80": [3, 17], "48": [3, 7, 8, 15, 16], "agre": [3, 11, 13], "displaystyl": 3, "51": [3, 7, 15, 16], "82": [3, 13, 16], "76": 3, "That": [3, 7, 8, 11, 12, 16, 17], "pretti": [3, 7, 11], "wait": [3, 8, 11, 12, 13, 16, 17], "Or": [3, 7, 13], "someth": [3, 4, 7, 8, 9, 11, 12, 15, 16, 17], "99": [3, 4, 7, 12, 13, 16], "terribl": 3, "impress": [3, 16], "attent": [3, 8, 12, 17], "sacrif": 3, "easi": [3, 4, 8, 9, 11, 13, 15, 16, 17], "baselin": [3, 16], "regardless": [3, 11, 12, 16], "sens": [3, 4, 7, 8, 12, 13, 16, 17], "hope": [3, 11, 13, 16], "signific": [3, 5, 8], "Be": [3, 11, 12, 16], "enough": [3, 7, 8, 11, 12, 13, 15, 16, 17], "usual": [3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "suspect": [3, 4, 7], "built": [3, 8, 9, 14, 17], "perspect": [3, 4, 12, 17], "hoorai": 3, "cautiou": 3, "misdiagnos": 3, "vast": [3, 4, 16, 17], "behav": [3, 7, 13], "principl": [3, 5, 16], "ideal": [3, 9, 12, 17], "somehow": [3, 7, 11], "hasn": 3, "yet": [3, 7, 8, 9, 11, 12, 15, 16, 17], 
"rememb": [3, 7, 8, 9, 11, 13, 16, 17], "touch": [3, 16], "dai": [3, 9, 11, 15, 16], "strongli": [3, 10, 13], "whatev": [3, 4, 8, 16], "lucki": [3, 7], "perhap": [3, 7, 8, 9, 11, 12, 13, 16], "sub": [3, 11], "cancer_subtrain": 3, "cancer_valid": 3, "acc": 3, "897196261682243": 3, "repeat": [3, 4, 5, 7, 15], "none": [3, 4, 11, 13, 15, 17], "underli": [3, 4, 8], "reduc": [3, 4, 11, 16], "un": [3, 4], "c": [3, 8, 11], "evenli": [3, 12], "chunk": [3, 13], "iter": [3, 4, 8, 15, 16, 17], "fold": [3, 12], "cross_valid": 3, "cv": [3, 12], "cancer_pip": 3, "cv_5_df": 3, "fit_tim": 3, "score_tim": 3, "test_scor": 3, "004374": 3, "005595": 3, "837209": 3, "003659": 3, "005356": 3, "870588": 3, "003580": 3, "005332": 3, "894118": 3, "003673": 3, "005435": 3, "003497": 3, "005305": 3, "882353": 3, "aggreg": [3, 8], "sem": 3, "uncertain": [3, 7, 12], "scope": [3, 4, 8, 12, 13, 14, 15, 16], "01": [3, 7, 11, 16, 17], "cv_5_metric": 3, "agg": [3, 13, 17], "003757": 3, "005404": 3, "870971": 3, "000158": 3, "000052": 3, "009501": 3, "limit": [3, 4, 11, 13, 15, 16, 17], "speed": 3, "trial": [3, 16], "cv_10": 3, "cv_10_df": 3, "cv_10_metric": 3, "003604": 3, "004122": 3, "884939": 3, "000067": 3, "000032": 3, "006718": 3, "slightli": [3, 7, 11, 12, 13, 16], "due": [3, 4, 5, 7, 11, 17], "reduct": 3, "dramat": 3, "cv_50_df": 3, "cv_50_metric": 3, "003526": 3, "003071": 3, "888056": 3, "000012": 3, "000008": 3, "003005": 3, "downstream": 3, "expens": [3, 11], "chemo": 3, "radiat": 3, "therapi": 3, "death": 3, "mispredict": 3, "gridsearchcv": [3, 12], "unspecifi": 3, "cancer_tune_pip": 3, "tunabl": 3, "get_param": [3, 12], "verbos": 3, "columntransformer__n_job": 3, "columntransformer__remaind": 3, "columntransformer__sparse_threshold": 3, "columntransformer__transformer_weight": 3, "columntransformer__transform": 3, "columntransformer__verbos": 3, "columntransformer__verbose_feature_names_out": 3, "columntransformer__standardscal": 3, "columntransformer__standardscaler__copi": 3, 
"columntransformer__standardscaler__with_mean": 3, "columntransformer__standardscaler__with_std": 3, "kneighborsclassifier__algorithm": 3, "auto": [3, 15], "kneighborsclassifier__leaf_s": 3, "30": [3, 5, 7, 8, 11, 12, 15, 16, 17], "kneighborsclassifier__metr": 3, "minkowski": 3, "kneighborsclassifier__metric_param": 3, "kneighborsclassifier__n_job": 3, "kneighborsclassifier__n_neighbor": 3, "kneighborsclassifier__p": 3, "kneighborsclassifier__weight": 3, "wow": [3, 7, 16], "stuff": 3, "sift": 3, "muck": [3, 11], "stand": [3, 11, 12, 16], "parameter_grid": 3, "allow": [3, 4, 7, 8, 9, 10, 11, 12, 15, 16, 17], "greater": [3, 4, 11, 17], "third": [3, 4], "skip": [3, 9, 17], "96": [3, 13, 16], "emploi": [3, 5, 11, 12], "okai": [3, 16, 17], "param_grid": [3, 12], "cancer_tune_grid": 3, "cv_results_": [3, 12], "format": [3, 5, 10, 11, 12, 13, 17], "accuracies_grid": 3, "19": [3, 7, 11, 15, 16], "mean_fit_tim": 3, "std_fit_tim": 3, "mean_score_tim": 3, "std_score_tim": 3, "param_kneighborsclassifier__n_neighbor": 3, "param": 3, "split0_test_scor": 3, "split1_test_scor": 3, "split2_test_scor": 3, "split3_test_scor": 3, "split4_test_scor": 3, "split5_test_scor": 3, "split6_test_scor": 3, "split7_test_scor": 3, "split8_test_scor": 3, "split9_test_scor": 3, "mean_test_scor": [3, 12], "17": [3, 4, 11, 13, 15, 16], "std_test_scor": [3, 12], "18": [3, 4, 7, 8, 11, 15, 16], "rank_test_scor": 3, "int32": [3, 17], "param_kneighbors_classifier__n_neighbor": 3, "unus": 3, "sem_test_scor": [3, 12], "845127": 3, "019966": 3, "873200": 3, "015680": 3, "861517": 3, "019547": 3, "861573": 3, "017787": 3, "866279": 3, "017889": 3, "875637": 3, "016026": 3, "885050": 3, "015406": 3, "36": [3, 4, 15, 16], "887375": 3, "013694": 3, "41": [3, 15, 16, 17], "46": [3, 4, 15, 16], "887320": 3, "013314": 3, "882669": 3, "014523": 3, "56": [3, 8], "878018": 3, "014414": 3, "61": [3, 7, 8, 11], "880343": 3, "014299": 3, "66": [3, 7, 11, 12], "015416": 3, "71": [3, 11], "877962": 3, "013660": 3, 
"014698": 3, "81": [3, 16], "880288": 3, "011277": 3, "875581": 3, "012967": 3, "008193": 3, "shortcut": [3, 9, 16], "layer": 3, "accuracy_vs_k": 3, "mark_lin": [3, 4, 12, 13, 16], "neighbour": [3, 12], "highest": [3, 17], "best_params_": [3, 12], "vari": [3, 7, 12, 13, 14, 16, 17], "exact": [3, 7, 13, 16], "justifi": [3, 16], "optim": [3, 11, 12], "decreas": [3, 4, 7, 16, 17], "reliabl": [3, 5, 7, 9, 16], "uncertainti": [3, 7], "cost": [3, 7, 12, 13], "prohibit": [3, 12], "large_param_grid": 3, "385": 3, "large_cancer_tune_grid": 3, "large_accuracies_grid": 3, "large_accuracy_vs_k": 3, "farther": [3, 16], "sort": [3, 4, 8, 9, 11, 13, 16, 17], "boundari": [3, 13], "simpler": 3, "stronger": 3, "regard": [3, 7, 8, 9, 12, 13, 17], "themselv": [3, 5, 11, 16], "noisi": [3, 12, 16], "jag": 3, "essenti": [3, 7, 8, 9, 11, 12, 17], "problemat": [3, 9, 11, 16], "unreli": [3, 7, 13], "strike": 3, "balanc": [3, 5, 7], "qualiti": [3, 5, 9, 12, 13], "retrain": [3, 12], "9090909090909091": 3, "8846153846153846": 3, "8679245283018868": 3, "84": [3, 16], "glanc": 3, "surpris": 3, "knew": 3, "return": [3, 4, 7, 8, 11, 13, 14, 17], "put": [3, 7, 11, 12, 13, 14, 15, 17], "defin": [3, 7, 8, 10, 11, 12, 13, 16, 17], "execut": [3, 11, 15], "search": [3, 4, 11, 14, 15], "strength": [3, 13, 16], "weak": [3, 12, 13, 16], "assumpt": [3, 4, 12, 13], "multi": 3, "slow": [3, 9, 12, 13], "treat": [3, 4, 8, 15, 16, 17], "accept": [3, 11, 12, 14, 15], "wors": [3, 8, 17], "meaning": [3, 4, 8, 11, 13, 15], "cancer_irrelev": 3, "irrelevant1": 3, "irrelevant2": 3, "30010": 3, "08690": 3, "132": [3, 7], "19740": 3, "130": [3, 7, 17], "00": [3, 7, 17], "24140": 3, "77": [3, 7], "58": [3, 16], "19800": 3, "135": [3, 7, 13, 16], "24390": 3, "142": 3, "14400": 3, "131": 3, "09251": 3, "35140": 3, "140": [3, 7], "00000": [3, 7], "47": [3, 4, 15, 16], "92": [3, 7], "increasingli": [3, 11], "distanc": [3, 4, 12, 13, 16], "corrupt": 3, "outperform": 3, "combat": 3, "extra": [3, 11, 13], "nois": [3, 16], 
"smoothli": 3, "trend": [3, 7, 8, 12, 13, 16], "corrobor": 3, "evid": [3, 5], "untun": 3, "scientif": [3, 12, 13, 15], "clear": [3, 4, 5, 7, 8, 13, 15, 16, 17], "cut": 3, "obviou": [3, 9, 13, 16], "relev": [3, 11, 12, 13], "consum": [3, 7, 17], "systemat": 3, "beal": 3, "hock": 3, "lesli": 3, "moder": 3, "ab": [3, 11, 12], "bc": [3, 7, 8], "ac": 3, "abc": 3, "million": [3, 13, 16], "computation": 3, "draper": 3, "smith": 3, "1966": 3, "eforymson": 3, "straightforward": [3, 11, 16], "form": [3, 4, 7, 8, 11, 12, 13, 16, 17], "updat": [3, 4, 14, 15, 16], "big": [3, 7, 8, 11, 15, 16], "55": [3, 7, 12, 16, 17], "caution": [3, 9, 11], "move": [3, 8, 10, 12, 13, 15, 16], "likelihood": 3, "unlucki": [3, 4], "stumbl": 3, "risk": [3, 12], "suffer": 3, "turn": [3, 4, 8, 11, 12, 13, 17], "smaller": [3, 12, 13, 16], "irrelevant3": 3, "full": [3, 7, 8, 11, 13, 15, 16, 17], "cancer_subset": 3, "sequentialfeatureselector": 3, "tri": [3, 4, 12, 13, 16], "flexibl": [3, 9, 13, 17], "resort": 3, "loop": [3, 17], "flow": 3, "mckinnei": [3, 11, 16, 17], "2012": [3, 8, 11, 16, 17], "n_total": 3, "check": [3, 8, 10, 11, 15, 16, 17], "j": [3, 8, 11], "len": [3, 11], "accuracy_dict": 3, "selected_predictor": 3, "empti": [3, 9, 15], "n_job": 3, "best_set": 3, "argmax": 3, "append": [3, 11, 16, 17], "join": [3, 11, 15], "del": [3, 16], "891103": 3, "917450": 3, "931454": 3, "926253": 3, "906955": 3, "exhibit": [3, 9], "fluctuat": [3, 12], "attempt": [3, 4, 16], "account": [3, 14, 15], "chanc": [3, 7, 14], "elbow": [3, 4], "successfulli": [3, 9, 11, 15], "judgement": 3, "excel": [3, 8, 13, 15], "tutori": [3, 5, 9, 11, 13], "go": [3, 7, 8, 10, 11, 13, 14, 16], "jame": [3, 4, 11, 13], "great": [3, 4, 7, 8, 9, 11, 13, 15, 16], "naiv": 3, "bay": 3, "goe": [3, 8, 9, 11, 13], "popular": [3, 4, 5, 11, 13, 15], "bkm67": 3, "martin": 3, "lansdown": 3, "mauric": 3, "georg": 3, "kendal": 3, "david": [3, 7], "mann": 3, "discard": 3, "multivari": 3, "biometrika": 3, "366": 3, "ds66": 3, "norman": 3, 
"harri": 3, "wilei": [3, 16], "efo66": 3, "stepwis": 3, "backward": 3, "eastern": 3, "meet": 3, "hl67": 3, "ronald": 3, "technometr": 3, "531": 3, "540": 3, "jwht13": [3, 4, 13], "gareth": [3, 4, 13], "daniela": [3, 4, 13], "witten": [3, 4, 13], "hasti": [3, 4, 13], "tibshirani": [3, 4, 13], "springer": [3, 4, 13, 16], "1st": [3, 4, 13], "edit": [3, 4, 13, 14, 16], "www": [3, 4, 8, 11, 13], "statlearn": [3, 4, 13], "com": [3, 4, 7, 11, 13, 14, 15, 16], "mck12": [3, 11, 16, 17], "ipython": [3, 11, 14, 16, 17], "o": [3, 8, 11, 14, 16, 17], "reilli": [3, 11, 16, 17], "media": [3, 9, 11, 16, 17], "inc": [3, 7, 11, 16, 17], "subgroup": [4, 8, 16, 17], "predict": [4, 5, 8, 10, 12, 13, 16], "differenti": 4, "classif": [4, 5, 8, 10, 12, 13], "variabl": [4, 7, 8, 9, 11, 12, 13, 16, 17], "scikit": [4, 12, 13], "set": [4, 7, 9, 10, 11, 13, 15, 17], "genet": [4, 16], "ancestr": 4, "subpopul": 4, "onlin": [4, 7, 11, 14, 15, 16], "custom": [4, 16], "uncov": [4, 9, 16], "fundament": [4, 7, 8, 16], "supervis": 4, "unsupervis": 4, "imposs": [4, 7], "articl": [4, 8], "wikipedia": [4, 11], "evalu": [4, 7, 8, 13, 16], "test": [4, 7, 13, 14], "good": [4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "ascertain": 4, "rigor": [4, 5, 12], "lloyd": 4, "1982": 4, "hierarch": 4, "princip": 4, "compon": [4, 8], "multidimension": 4, "semisupervis": 4, "goal": [4, 8, 12, 16, 17], "benefici": [4, 11], "unlabel": [4, 8], "willing": [4, 7], "seed": [4, 7, 12, 13], "palmerpenguin": 4, "horst": 4, "2020": [4, 7, 8, 16], "kristen": 4, "gorman": 4, "palmer": 4, "station": [4, 16], "antarctica": [4, 16], "ecolog": 4, "site": [4, 6, 11], "adult": 4, "penguin": 4, "2014": [4, 5, 17], "bill": 4, "flipper": 4, "millimet": 4, "distinct": [4, 9, 16], "speci": 4, "discoveri": [4, 13], "gentoo": 4, "bill_length_mm": 4, "flipper_length_mm": 4, "39": [4, 15, 16], "182": 4, "34": [4, 15, 16, 17], "187": [4, 7, 12], "190": [4, 12, 17], "195": [4, 8, 17], "193": [4, 12], "213": [4, 8, 11, 16, 17], "215": [4, 17], "45": [4, 
8, 11, 15, 16, 17], "220": [4, 17], "49": [4, 15, 16], "208": 4, "52": [4, 15], "197": 4, "189": [4, 7], "penguins_standard": 4, "bill_length_standard": 4, "flipper_length_standard": 4, "641361": 4, "189773": 4, "144917": 4, "328412": 4, "517922": 4, "921755": 4, "107617": 4, "846513": 4, "409743": 4, "677761": 4, "238168": 4, "271104": 4, "902464": 4, "433767": 4, "720106": 4, "192860": 4, "645505": 4, "355522": 4, "962559": 4, "440353": 4, "762179": 4, "205012": 4, "111528": 4, "123299": 4, "786203": 4, "626855": 4, "757407": 4, "783170": 4, "108442": 4, "776057": 4, "759092": 4, "subtyp": 4, "scatter_plot": 4, "meaningless": 4, "etc": [4, 7, 8, 11, 15, 16, 17], "adjust": [4, 16], "sum": [4, 12, 17], "wssd": 4, "inertia": 4, "mu_x": 4, "mu_i": 4, "x_1": 4, "x_2": 4, "x_3": 4, "x_4": 4, "y_1": 4, "y_2": 4, "y_3": 4, "y_4": 4, "35": [4, 8, 15, 16, 17], "outlin": [4, 8, 11, 16, 17], "far": [4, 13, 15, 16, 17], "yellow": [4, 17], "variant": 4, "minim": [4, 12, 13, 16], "reassign": 4, "longer": [4, 8, 17], "termin": [4, 14], "onward": [4, 11, 14, 16], "guarante": [4, 14], "forev": 4, "logic": [4, 8, 11, 17], "finit": [4, 7, 16], "unlik": [4, 7, 11, 12, 16], "stuck": [4, 9, 17], "solut": [4, 7, 8], "poor": [4, 11], "lowest": [4, 11, 16], "cross": [4, 12, 13], "valid": [4, 10, 11, 12, 13], "subdivid": 4, "merg": [4, 11], "diminish": 4, "reach": [4, 11, 13, 15, 16], "being": [4, 5, 7, 8, 9, 11, 12, 15, 16, 17], "address": [4, 5, 8, 11, 12, 13, 15], "preprocess": [4, 12], "kmean": 4, "n_cluster": 4, "kmeanskmean": 4, "penguin_clust": 4, "labels_": 4, "altern": [4, 8, 13, 15, 16, 17], "suffix": [4, 16], "nomin": [4, 16], "discret": [4, 16], "cluster_plot": 4, "inertia_": 4, "730719092276117": 4, "varieti": [4, 5, 11, 13, 15, 17], "ks": 4, "oper": [4, 7, 8, 11, 14, 15, 16], "safest": 4, "reus": 4, "penguin_clust_k": 4, "000000": 4, "576264": 4, "730719": 4, "343613": 4, "362131": 4, "678383": 4, "293320": 4, "975016": 4, "785232": 4, "elbow_plot": 4, "bump": [4, 16], 
"prevent": [4, 8, 9, 11, 16, 17], "n_init": 4, "paramet": [4, 7, 11, 12, 13, 16, 17], "realm": 4, "specif": [4, 5, 11, 12, 14, 15, 16], "companion": [4, 11], "pca": 4, "gwf14": 4, "toni": 4, "fraser": 4, "sexual": 4, "dimorph": 4, "commun": [4, 5, 8, 11, 16], "ntarctic": 4, "genu": 4, "emph": 4, "pygosc": 4, "plo": [4, 15], "ONE": 4, "hhg20": 4, "allison": 4, "alison": 4, "hill": [4, 11], "archipelago": 4, "allisonhorst": 4, "io": [4, 8, 11, 16], "llo82": 4, "stuart": 4, "quantiz": 4, "pcm": 4, "129": 4, "137": [4, 7, 8, 13], "releas": [4, 11], "bell": [4, 7], "telephon": 4, "paper": [4, 5, 16], "1957": 4, "john": [5, 17], "hopkin": 5, "bloomberg": 5, "public": [5, 8, 15], "2023": [5, 11], "expand": [5, 14, 15, 17], "grown": 5, "significantli": [5, 7, 8, 9, 13, 16, 17], "recent": [5, 11, 14, 15], "attract": 5, "demand": [5, 16], "concurr": 5, "growth": [5, 16], "prolifer": 5, "blog": 5, "post": [5, 11, 15], "fast": [5, 16], "literatur": 5, "inclin": 5, "moment": [5, 12, 17], "former": [5, 12], "activ": [5, 9, 11], "amongst": 5, "practition": 5, "consensu": 5, "date": [5, 11, 15, 16], "element": [5, 11, 16, 17], "clean": [5, 8, 10, 11], "highli": [5, 8, 15], "nevertheless": [5, 13, 16], "emerg": 5, "lack": [5, 15], "agreement": 5, "strong": [5, 13, 16], "propos": [5, 8], "vision": [5, 16], "implic": [5, 7], "engag": 5, "tidi": [5, 8], "tabular": [5, 17], "formal": [5, 8], "hadlei": [5, 17], "wickham": [5, 17], "tidyvers": 5, "independ": [5, 9, 10, 16], "facilit": [5, 15], "audit": [5, 10], "complex": [5, 11, 13, 15, 16], "eas": [5, 15], "clearli": [5, 11, 16], "unobserv": 5, "popul": [5, 7, 8, 11, 16, 17], "succe": 5, "foster": 5, "fluent": 5, "vers": 5, "behind": [5, 11, 15, 16], "immedi": [5, 11, 13], "integr": [5, 11], "git": [5, 14, 15], "train": [5, 13], "ever": [5, 13, 15, 17], "awar": [5, 9, 15], "sophist": [5, 8], "mix": [5, 9, 17], "generaliz": 5, "confront": 5, "web": [6, 9, 15], "navig": [6, 8, 9, 11, 14, 15], "mobil": 6, "devic": [6, 11], "menu": [6, 8, 
9, 14], "datasciencebook": [6, 10, 11, 14], "ca": [6, 8, 10, 11, 12, 13, 14], "licens": 6, "creativ": 6, "noncommerci": 6, "sharealik": 6, "extend": [7, 13, 16], "inferenti": [7, 8, 10, 12, 16], "interv": 7, "approxim": 7, "broader": 7, "retail": 7, "sell": 7, "iphon": 7, "accessori": 7, "market": [7, 12, 13], "strateg": 7, "product": [7, 8, 15], "north": [7, 11, 16], "american": [7, 8, 11], "colleg": 7, "campus": 7, "america": [7, 16], "owner": [7, 11, 13], "characterist": [7, 8, 11, 16, 17], "costli": 7, "taken": [7, 8, 12, 15, 16], "60": [7, 8], "canada": [7, 8, 11, 16, 17], "apart": [7, 11, 16], "rent": 7, "budget": [7, 12], "studio": 7, "rental": [7, 11], "price": [7, 11, 12, 13], "month": [7, 15, 16], "monthli": 7, "airbnb": 7, "cox": 7, "marketplac": 7, "vacat": 7, "septemb": [7, 16], "neighborhood": 7, "room": 7, "accommod": 7, "bathroom": 7, "bedroom": [7, 11, 12, 13], "bed": [7, 12, 13], "night": 7, "neighbourhood": 7, "room_typ": 7, "downtown": 7, "home": [7, 8, 11, 12, 13, 14, 16, 17], "apt": [7, 14], "bath": [7, 12], "150": [7, 16], "eastsid": 7, "west": 7, "85": [7, 8, 11, 12, 13, 16], "kensington": 7, "cedar": 7, "cottag": 7, "110": 7, "4589": 7, "4590": 7, "4591": 7, "oakridg": 7, "privat": [7, 11, 15], "4592": 7, "dunbar": 7, "southland": 7, "share": [7, 9, 11, 15, 16, 17], "29": [7, 12, 15, 16, 17], "4593": 7, "145": 7, "4594": 7, "shaughnessi": 7, "citi": [7, 11, 12, 17], "plan": [7, 15], "bylaw": 7, "747497": 7, "246408": 7, "005224": 7, "hotel": 7, "000871": 7, "747": 7, "155": [7, 13, 17], "725": 7, "250": [7, 12, 17], "025": 7, "625": 7, "350": [7, 12, 13, 17], "confirm": [7, 15, 16], "histogram": 7, "000": [7, 8, 11, 12, 13, 16], "20_000": 7, "605": 7, "606": 7, "marpol": 7, "4579": 7, "4580": 7, "160": [7, 12], "1739": 7, "1740": 7, "151": [7, 8, 16], "3904": 7, "3905": 7, "185": [7, 17], "1596": 7, "1597": 7, "kitsilano": 7, "3060": 7, "3061": 7, "hast": 7, "sunris": 7, "78": 7, "19999": 7, "527": 7, "528": 7, "1587": 7, "1588": 7, "169": 
[7, 13], "3860": 7, "3861": 7, "2747": 7, "2748": 7, "285": 7, "800000": 7, "0000": 7, "999": 7, "750": [7, 16], "775": 7, "225": [7, 11], "19998": 7, "700": [7, 17], "275": 7, "44552": 7, "reset_index": [7, 17], "caveat": [7, 16, 17], "twice": [7, 13], "sample_proport": 7, "44547": 7, "44548": 7, "44549": 7, "44550": 7, "44551": 7, "sample_estim": 7, "675": 7, "44541": 7, "19995": 7, "44543": 7, "19996": 7, "44545": 7, "19997": 7, "20000": 7, "mind": [7, 8, 11, 15], "sampling_distribut": 7, "mark_bar": [7, 8, 16], "bin": [7, 16], "maxbin": [7, 16], "symmetr": 7, "peak": [7, 16], "74848375": 7, "748": [7, 12], "neither": [7, 12, 16], "nor": [7, 9, 13], "underestim": 7, "tendenc": 7, "travel": 7, "wish": [7, 8, 15], "overpr": [7, 12], "population_distribut": 7, "skew": 7, "tail": [7, 11], "154": [7, 13], "5109773617762": 7, "one_sampl": 7, "sample_distribut": 7, "153": 7, "48225": 7, "wouldn": [7, 15], "alreadi": [7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "mean_pric": 7, "148": 7, "56075": 7, "165": [7, 17], "50500": 7, "93925": 7, "139": 7, "14650": 7, "198": 7, "50000": 7, "192": 7, "66425": 7, "144": 7, "88600": 7, "08800": 7, "156": [7, 12], "25000": 7, "170": 7, "mean_of_sample_mean": 7, "sample_mean": 7, "disappear": 7, "thumb": [7, 16], "emphasi": 7, "saw": [7, 11, 12, 17], "notion": [7, 12], "pretend": 7, "clever": 7, "drawn": [7, 13, 16], "median": [7, 16, 17], "slope": [7, 13], "displai": [7, 8, 9, 11, 13, 15, 16, 17], "4025": 7, "4026": 7, "renfrew": 7, "collingwood": 7, "1977": [7, 16], "1978": 7, "fairview": 7, "70": [7, 11, 16, 17], "4008": 7, "4009": 7, "269": [7, 16], "1543": 7, "1544": 7, "320": 7, "3350": 7, "3351": 7, "804": 7, "805": 7, "mount": 7, "pleasant": 7, "2286": 7, "2287": 7, "1010": 7, "1011": 7, "strathcona": 7, "120": [7, 8, 11, 17], "1878": 7, "1879": [7, 16], "175": 7, "1644": 7, "1645": 7, "2771": 7, "2772": 7, "4151": 7, "4152": 7, "289": 7, "4495": 7, "4496": 7, "rilei": 7, "park": [7, 16], "115": 7, "1308": 7, "1309": 7, "2246": 7, 
"2247": 7, "2335": 7, "2336": 7, "4059": 7, "4060": 7, "1280": 7, "1281": 7, "4324": 7, "4325": 7, "3403": 7, "3404": 7, "arbutu": 7, "ridg": 7, "664": 7, "1729": 7, "1730": 7, "93": [7, 16], "3722": 7, "3723": 7, "241": 7, "242": 7, "3955": 7, "3956": 7, "1042": 7, "1043": 7, "649": 7, "650": [7, 16], "sunset": 7, "1995": [7, 16], "1996": 7, "363": 7, "364": 7, "1783": 7, "1784": 7, "806": 7, "254": 7, "255": 7, "3365": 7, "3366": 7, "4562": 7, "4563": 7, "64": [7, 11, 12, 14], "2124": 7, "2125": 7, "200": [7, 8, 11, 12, 16], "1997": 7, "1998": 7, "257": 7, "4329": 7, "4330": [7, 17], "3408": 7, "3409": 7, "635": 7, "636": 7, "grandview": 7, "woodland": 7, "103": [7, 17], "one_sample_dist": 7, "boot1": 7, "boot1_dist": 7, "ident": [7, 8, 11], "mimic": 7, "break": [7, 11, 12, 13], "boot20000": 7, "six": [7, 8, 10, 12, 16, 17], "six_bootstrap_sampl": 7, "queri": [7, 11], "height": [7, 13, 16], "facet": [7, 16], "67175": 7, "42500": 7, "149": [7, 8], "35000": 7, "13225": 7, "179": [7, 8], "79675": 7, "188": 7, "28225": 7, "boot20000_mean": 7, "159": 7, "29675": 7, "136": 7, "55725": 7, "161": 7, "93950": 7, "22500": 7, "boot_est_dist": 7, "resampl": 7, "repeatedli": 7, "percentil": [7, 17], "captur": [7, 11, 13, 16], "narrow": [7, 11, 17], "comfort": [7, 15], "strict": [7, 8], "unhelp": 7, "life": [7, 8], "deadli": 7, "ascend": [7, 8, 16], "bound": [7, 16], "97": [7, 13, 16], "quantil": 7, "express": [7, 16, 17], "5th": 7, "975": 7, "ci_bound": 7, "121": [7, 12], "607069": 7, "191": [7, 8], "525362": 7, "finish": [7, 9, 10, 11, 14, 15, 16], "journei": 7, "surfac": [7, 12, 13, 16], "foundat": [7, 8, 11, 13], "openintro": 7, "diez": 7, "2019": [7, 16], "solid": [7, 16], "grasp": 7, "natur": [7, 15, 16, 17], "coxd": 7, "murrai": 7, "insideairbnb": 7, "09": [7, 11, 16], "dccetinkayarb19": 7, "\u00e7": 7, "etinkaya": 7, "rundel": 7, "christoph": 7, "barr": 7, "os": [7, 9], "dirti": 8, "dig": [8, 11, 17], "jump": [8, 10, 11, 16], "symbol": [8, 14, 16, 17], "spoken": [8, 
16, 17], "resid": [8, 16], "indigen": 8, "cultur": 8, "anywher": [8, 9], "2018": [8, 16], "sadli": 8, "colon": [8, 17], "led": [8, 16], "loss": 8, "children": 8, "speak": [8, 11, 16, 17], "mother": [8, 16, 17], "tongu": [8, 16, 17], "childhood": 8, "residenti": [8, 12], "discov": 8, "act": [8, 15, 16, 17], "harm": 8, "endang": 8, "geograph": 8, "walker": 8, "2017": [8, 15], "came": [8, 12, 16], "aborigin": [8, 11, 16, 17], "truth": 8, "reconcili": 8, "commiss": 8, "action": 8, "2015": 8, "canlang": [8, 11, 16], "2016": [8, 11, 16, 17], "censu": [8, 11, 16, 17], "214": [8, 11, 16, 17], "offici": [8, 11, 16, 17], "mother_tongu": [8, 11, 16, 17], "expos": 8, "birth": 8, "most_at_hom": [8, 11, 16, 17], "most_at_work": [8, 11, 16, 17], "lang_known": [8, 11, 16, 17], "accord": [8, 11, 16, 17], "deep": [8, 13], "simplifi": [8, 11, 17], "concentr": [8, 16], "expertis": 8, "bias": 8, "aim": [8, 10, 16], "causal": [8, 12, 16], "mechanist": [8, 16], "leek": 8, "matsui": 8, "earli": [8, 10], "live": [8, 11, 16], "provinc": [8, 11], "territori": 8, "hypothes": [8, 16], "polit": 8, "parti": 8, "wealth": [8, 16], "elect": 8, "quantif": 8, "factor": [8, 16], "mechan": [8, 11, 12], "pertain": [8, 16, 17], "occasion": [8, 14, 17], "race": [8, 12, 13], "runner": 8, "regularli": [8, 9], "graphic": [8, 9, 11, 14, 15, 16], "ag": 8, "old": [8, 11, 15], "50kg": 8, "cluster": [8, 10, 16], "bought": 8, "amazon": 8, "cellphon": 8, "ownership": 8, "android": 8, "phone": 8, "essenc": 8, "spreadsheet": [8, 11], "microsoft": 8, "rectangular": 8, "primarili": [8, 12, 15, 16], "speaker": [8, 16, 17], "comma": [8, 9, 12, 17], "short": [8, 11, 16], "save": [8, 11, 14, 15], "googl": [8, 11], "sheet": [8, 11], "can_lang": [8, 11, 16, 17], "plain": [8, 9, 15], "editor": [8, 9, 11, 15], "notepad": 8, "590": [8, 11, 16], "235": [8, 11, 16, 17], "665": [8, 11, 16], "afrikaan": [8, 11, 16, 17], "10260": [8, 11, 16], "4785": [8, 11, 16], "23415": [8, 11, 16], "afro": [8, 11, 16, 17], "asiat": [8, 11, 16, 
17], "1150": [8, 11, 16], "44": [8, 11, 15, 16], "akan": [8, 11, 16, 17], "twi": [8, 11, 16, 17], "13460": [8, 11, 16], "5985": [8, 11, 16], "22150": [8, 11, 16], "albanian": [8, 11, 16, 17], "26895": [8, 11, 16], "13135": [8, 11, 16], "345": [8, 11, 16], "31930": [8, 11, 16], "algonquian": [8, 11, 17], "algonquin": [8, 11, 17], "1260": [8, 11], "370": [8, 11, 17], "2480": [8, 11], "sign": [8, 11, 12, 15, 16], "2685": [8, 11], "3020": [8, 11], "1145": [8, 11], "amhar": [8, 11], "22465": [8, 11], "12785": [8, 11], "33670": [8, 11], "instal": [8, 9, 10, 11, 14], "team": [8, 15], "es": 8, "innei": 8, "2010": 8, "command": [8, 9, 11, 14], "shorter": [8, 9, 11, 15, 16], "alia": [8, 9], "gave": [8, 11], "harder": [8, 16, 17], "quot": [8, 11], "letter": [8, 14, 15], "distinguish": [8, 16], "satisfi": [8, 11], "syntax": [8, 11, 15, 17], "amp": [8, 11, 16, 17], "445": [8, 11, 16, 17], "2775": [8, 11, 16], "209": [8, 11, 16, 17], "wolof": [8, 11, 16, 17], "3990": [8, 11, 16], "1385": [8, 11, 16], "8240": [8, 11, 16], "210": [8, 11, 16, 17], "wood": [8, 11, 16, 17], "cree": [8, 11, 16, 17], "1840": [8, 11, 16], "800": [8, 11, 16], "2665": [8, 11, 16], "211": [8, 11, 12, 16, 17], "wu": [8, 11, 16, 17], "shanghaines": [8, 11, 16, 17], "12915": [8, 11, 16], "7650": [8, 11, 16], "16530": [8, 11, 16], "yiddish": [8, 11, 16, 17], "13555": [8, 11, 16], "7085": [8, 11, 16], "895": [8, 11, 16], "20985": [8, 11, 16], "yoruba": [8, 11, 16, 17], "9080": [8, 11, 16], "2615": [8, 11, 16], "22415": [8, 11, 16], "screen": [8, 9, 11], "string": [8, 11, 15, 16, 17], "my_numb": 8, "alic": 8, "_": [8, 9, 16, 17], "won": [8, 11, 13, 15, 17], "complain": 8, "my": [8, 9], "syntaxerror": 8, "mayb": [8, 11], "meant": 8, "convent": [8, 9, 15], "lowercas": [8, 15], "language_data": 8, "pep": 8, "guido": 8, "van": 8, "rossum": 8, "2001": 8, "minut": [8, 9, 13, 16], "underneath": [8, 9], "ve": [8, 11, 15], "largest": [8, 11, 16, 17], "restrict": [8, 13, 17], "bracket": [8, 9, 12, 17], "statement": [8, 
11, 17], "written": [8, 9, 11, 15], "doubl": [8, 9, 10, 14, 16, 17], "athabaskan": [8, 11, 17], "atikamekw": [8, 11, 17], "6150": [8, 11], "5465": 8, "1100": 8, "6645": 8, "thompson": [8, 11], "ntlakapamux": [8, 11], "335": [8, 11], "450": 8, "tlingit": [8, 11], "260": 8, "tsimshian": [8, 11], "410": 8, "206": 8, "wakashan": [8, 11], "67": [8, 11, 12, 16], "aboriginal_lang": 8, "alias": 8, "wrote": 8, "terminolog": 8, "obj": 8, "f": [8, 11, 12, 14], "programm": 8, "confus": [8, 11, 17], "appar": 8, "rescu": 8, "selected_lang": 8, "descend": [8, 16], "decend": 8, "arranged_lang": 8, "64050": 8, "inuktitut": 8, "35210": 8, "138": 8, "ojibwai": 8, "17885": 8, "oji": 8, "12855": 8, "dene": 8, "10700": 8, "32": [8, 13, 15, 16, 17], "cayuga": 8, "squamish": 8, "iroquoian": 8, "ten_lang": 8, "montagnai": 8, "innu": 8, "10235": 8, "119": 8, "mi": [8, 16], "kmaq": 8, "6690": 8, "3065": 8, "180": [8, 13], "stonei": 8, "3025": 8, "becam": 8, "curiou": 8, "728": [8, 16], "canadian_popul": [8, 16], "overwrit": 8, "opt": [8, 11, 12], "mother_tongue_perc": [8, 16], "35_151_728": [8, 16], "35151728": 8, "latter": [8, 12], "clearer": [8, 16], "182210": 8, "100166": 8, "050879": 8, "036570": 8, "030439": 8, "029117": 8, "019032": 8, "017496": 8, "008719": 8, "008606": 8, "ten_lang_perc": 8, "008": 8, "temporari": [8, 15, 17], "arranged_lang_sort": 8, "trace": [8, 9], "split": [8, 12, 13, 16], "rewrit": 8, "unwieldi": 8, "parenthesi": 8, "demonstr": [8, 11, 12, 13, 16, 17], "cleaner": 8, "messi": [8, 15, 17], "pars": [8, 11, 16], "block": [8, 11], "piec": 8, "period": [8, 9, 11, 16], "Not": [8, 17], "feed": 8, "redo": 8, "overwhelm": 8, "debug": 8, "midwai": 8, "audienc": [8, 9, 15, 16], "difficulti": 8, "scrutin": 8, "convei": [8, 16], "understood": 8, "shortli": 8, "ax": [8, 16], "mark": [8, 11, 15, 16], "channel": [8, 11, 12, 15, 16], "barplot_mother_tongu": 8, "refin": [8, 11], "quotat": [8, 11], "modif": [8, 17], "tackl": 8, "rotat": 8, "swap": [8, 16], 
"barplot_mother_tongue_axi": 8, "forward": [8, 11, 12], "suit": [8, 16, 17], "reorder": 8, "ordered_barplot_mother_tongu": 8, "swampi": 8, "elsewher": [8, 11], "moos": 8, "northern": 8, "east": 8, "southern": 8, "comment": [8, 15], "hash": [8, 15], "importantli": 8, "self": [8, 11], "habit": [8, 12], "got": 8, "tast": 8, "ten_lang_plot": 8, "nobodi": 8, "pull": [8, 11, 14], "forgotten": [8, 15], "pop": [8, 9, 11], "slowli": 8, "adept": 8, "remind": [8, 17], "lab": [8, 14], "lookup": 8, "concis": 8, "press": [8, 9], "tab": [8, 9, 11, 14, 15], "bring": [8, 11], "typo": 8, "hold": [8, 11, 16, 17], "dialogu": 8, "dialog": [8, 15], "contextu": 8, "gvr01": 8, "coghlan": 8, "barri": [8, 17], "warsaw": 8, "style": [8, 11], "0008": 8, "lp15": 8, "jeffrei": [8, 16], "347": 8, "6228": 8, "1314": 8, "1315": 8, "pm15": 8, "elizabeth": 8, "art": [8, 16], "anyon": [8, 9, 11, 15], "skybrud": 8, "consult": [8, 11, 15], "llc": 8, "bookdown": 8, "rdpeng": 8, "artofdatasci": 8, "tim20": [8, 16], "ttimber": [8, 11, 16], "wal17": 8, "anada": 8, "canadiangeograph": 8, "wil18": 8, "kori": 8, "bccampu": 8, "opentextbc": 8, "indigenizationfound": 8, "statisticscanada16a": 8, "www12": 8, "statcan": 8, "gc": 8, "recens": 8, "dp": 8, "eng": 8, "cfm": 8, "statisticscanada16b": 8, "borigin": 8, "irst": 8, "ation": 8, "\u00e9ti": 8, "nuit": 8, "sa": 8, "2016022": 8, "x2016022": 8, "statisticscanada18": 8, "evolut": 8, "1901": 8, "www150": 8, "n1": 8, "pub": 8, "630": 8, "x2018001": 8, "htm": 8, "thepdteam20": 8, "dev": 8, "februari": 8, "doi": [8, 16], "5281": 8, "zenodo": 8, "3509134": 8, "trutharcocanada12": 8, "govern": 8, "servic": [8, 11, 15], "trutharcocanada15": 8, "ction": 8, "www2": 8, "gov": [8, 11, 16], "asset": 8, "columbian": 8, "calls_to_action_english2": 8, "pdf": [8, 16], "wesmckinney10": 8, "ata": 8, "tructur": 8, "tatist": 8, "omput": 8, "p": [8, 11, 14], "ython": 8, "t\u00e9fan": 8, "der": 8, "arrod": 8, "illman": 8, "roceed": 8, "9th": 8, "cienc": 8, "onfer": 8, "25080": 8, 
"majora": 8, "92bf1922": 8, "00a": 8, "interleav": 9, "narrat": 9, "platform": [9, 15], "interfac": [9, 11, 14, 15], "dress": 9, "morn": 9, "configur": [9, 10, 14, 15], "formatt": 9, "artifact": 9, "analyz": [9, 10, 11, 17], "realiti": [9, 13], "consciou": [9, 15], "screenshot": 9, "easiest": [9, 14], "jupyterhub": [9, 15], "provis": 9, "authent": [9, 15], "gain": [9, 11], "instructor": [9, 10], "refer": 9, "entireti": 9, "cursor": 9, "rectangl": [9, 15, 16], "toolbar": [9, 11], "keyboard": [9, 15], "enter": [9, 11, 14, 15, 16], "arrow": [9, 15], "restart": [9, 14], "bar": [9, 11, 13, 14], "slight": [9, 12], "session": [9, 14, 15], "delet": [9, 14, 15], "emul": 9, "window": [9, 11], "statu": 9, "idl": 9, "busi": 9, "excess": 9, "unrespons": 9, "lose": 9, "connect": [9, 11, 13, 14, 15, 16], "interrupt": 9, "paus": 9, "server": [9, 11, 15], "hub": 9, "panel": 9, "shut": [9, 14], "rich": [9, 15], "bold": 9, "italic": 9, "bullet": [9, 11], "eventu": [9, 11, 16], "unformat": 9, "unrend": 9, "box": [9, 12, 13, 14, 15], "progress": [9, 14], "autosav": 9, "disk": [9, 11], "icon": [9, 11, 14, 15], "mac": 9, "arbitrari": [9, 16], "downsid": [9, 14], "nonlinear": [9, 13, 16], "deliber": [9, 15], "referenc": 9, "unconvent": 9, "fail": 9, "nonfunct": 9, "scenario": [9, 11], "event": [9, 15], "guard": 9, "sooner": 9, "linearli": [9, 13], "suffici": [9, 16], "extern": [9, 15], "heavili": 9, "loc": [9, 16], "package_nam": 9, "pn": 9, "librari": [9, 16], "hidden": [9, 11], "ipynb": [9, 11, 15], "shareabl": 9, "firefox": 9, "safari": 9, "chrome": 9, "edg": 9, "adob": 9, "acrobat": 9, "benefit": [9, 11, 15, 17], "standalon": 9, "font": [9, 11, 16], "launcher": 9, "visibl": [9, 15, 16], "untitl": 9, "white": 9, "troublesom": [9, 15], "repetit": 9, "dash": [9, 16], "jupyterlab": 9, "keen": 9, "commonmark": 9, "cheatsheet": 9, "friend": 10, "colleagu": 10, "histori": [10, 15], "chapter": 10, "spend": [10, 11, 12, 17], "restructur": 10, "usabl": 10, "coher": 10, "variou": [11, 14, 17], 
"laptop": [11, 15], "gatewai": 11, "unless": [11, 14, 16], "upfront": [11, 17], "devot": 11, "shoelac": 11, "trip": 11, "skiprow": 11, "ibi": 11, "list_tabl": 11, "to_csv": 11, "astronomi": 11, "pictur": [11, 16], "request": [11, 17], "internet": [11, 14], "remot": 11, "directori": [11, 14, 15, 16], "filesystem": 11, "folder": [11, 14, 15], "project3": 11, "happiness_report": 11, "slash": [11, 17], "proce": [11, 14, 15, 17], "happy_data": 11, "bike_shar": 11, "project2": 11, "silli": [11, 13], "redund": [11, 16], "whew": 11, "bonu": 11, "fatima": 11, "jayden": 11, "usernam": [11, 15], "link": [11, 14, 15], "video": [11, 14], "omma": 11, "epar": 11, "v": [11, 14], "alu": 11, "aren": [11, 16, 17], "canadian": [11, 17], "canlang_data": 11, "oftentim": [11, 17], "sentenc": 11, "paragraph": [11, 16], "scientist": 11, "distribut": [11, 15, 16], "permiss": [11, 15], "21930": 11, "parsererror": 11, "messag": [11, 14, 15, 16, 17], "wasn": [11, 16], "can_lang_meta": 11, "token": 11, "didn": [11, 17], "tsv": 11, "escap": 11, "backslash": 11, "can_lang_no_nam": 11, "curli": [11, 17], "brace": 11, "col_map": 11, "canlang_data_renam": 11, "u": [11, 14, 16], "niform": 11, "esourc": 11, "ocat": 11, "raw": [11, 14, 16, 17], "githubusercont": [11, 14], "datasci": 11, "whichev": 11, "xlsx": 11, "snippet": [11, 15], "_rel": 11, "j1": 11, "w8": 11, "qrj": 11, "tf": 11, "wz": 11, "hlio": 11, "8f": 11, "3wn": 11, "ed2": 11, "gz": 11, "_r": 11, "yg": 11, "tuee": 11, "6q": 11, "rzy": 11, "l60": 11, "xtp": 11, "4vt": 11, "jq": 11, "sheet_nam": 11, "sad": 11, "usecol": 11, "beforehand": 11, "libr": 11, "offic": 11, "semicolon": 11, "decim": [11, 16, 17], "european": 11, "countri": 11, "storag": 11, "user": [11, 14, 15], "manag": [11, 14, 15], "mysql": 11, "oracl": 11, "sql": 11, "simplest": [11, 16], "db": 11, "backend": 11, "send": [11, 15], "sqlalchemi": 11, "matur": 11, "deeper": 11, "friendlier": 11, "conn": 11, "retriev": [11, 12, 15, 17], "secretli": 11, "scene": [11, 15], "canlang_t": 
11, "databaset": 11, "r0": 11, "countstar": 11, "haven": [11, 14], "sent": [11, 15], "effici": [11, 13, 15, 16], "lazi": 11, "compil": 11, "str": 11, "AS": 11, "nfrom": 11, "t0": 11, "arab": 11, "419890": 11, "223535": 11, "5585": 11, "629055": 11, "mostli": [11, 15, 16, 17], "canlang_table_filt": 11, "predic": 11, "canlang_table_select": 11, "r1": 11, "aboriginal_lang_data": 11, "attributeerror": 11, "traceback": 11, "conda": [11, 14], "lib": 11, "python3": 11, "expr": 11, "py": [11, 14, 17], "645": 11, "__getattr__": 11, "641": 11, "hint": 11, "common_typo": 11, "642": [11, 13], "rais": [11, 16], "643": 11, "__name__": 11, "644": 11, "tahltan": 11, "crash": 11, "postgr": 11, "client": [11, 12], "host": [11, 14, 15], "localhost": 11, "port": [11, 14], "endpoint": 11, "5432": 11, "password": [11, 15], "can_mov_db": 11, "movi": 11, "fakeserv": 11, "stat": 11, "user0001": 11, "abc123": 11, "theme": [11, 16], "medium": [11, 15], "title_alias": 11, "episod": 11, "names_occup": 11, "occup": 11, "rate": 11, "ratings_t": 11, "alchemyt": 11, "average_r": 11, "num_vot": 11, "avg_rat": 11, "order_bi": 11, "backup": 11, "secur": [11, 15], "simultan": [11, 15, 17], "conflict": 11, "billion": 11, "daili": 11, "chao": 11, "ensu": 11, "no_official_lang_data": 11, "no_official_languag": 11, "magic": 11, "uncommon": 11, "pplicat": 11, "rogram": 11, "nterfac": 11, "secret": [11, 15], "somewhat": [11, 13], "thought": [11, 13, 17], "painstak": 11, "gather": [11, 16], "yper": 11, "ext": 11, "arkup": 11, "anguag": 11, "ascad": 11, "tyle": 11, "heet": 11, "webpag": [11, 15], "wherea": [11, 13, 17], "layout": [11, 16], "subsect": 11, "richardson": 11, "2007": 11, "reitz": 11, "foot": [11, 12, 13], "craiglist": 11, "craigslist": 11, "advertis": [11, 12, 13], "span": 11, "meta": 11, "hous": [11, 12, 13], "1br": 11, "hood": 11, "13768": 11, "108th": 11, "avenu": 11, "maptag": 11, "pid": 11, "6786042973": 11, "banish": 11, "trash": [11, 14], "hide": [11, 16], "unbanish": 11, "href": 11, 
"restor": 11, "2285": 11, "oof": 11, "keyword": [11, 17], "grab": 11, "selectorgadget": 11, "cc": 11, "deselect": 11, "pic": 11, "footag": 11, "gadget": 11, "robot": 11, "txt": [11, 15], "cl": 11, "spider": 11, "script": 11, "scraper": 11, "crawler": 11, "explicit": [11, 17], "realist": 11, "disallow": 11, "td": 11, "nth": 11, "child": [11, 13], "largestc": 11, "target": 11, "bs4": 11, "wiki": 11, "en": 11, "parser": 11, "population_nod": 11, "slice": [11, 16, 17], "clariti": [11, 16], "greater_toronto_area": 11, "202": 11, "london": [11, 17], "_ontario": 11, "ontario": 11, "543": 11, "551": 11, "greater_montr": 11, "montreal": [11, 17], "node": 11, "rid": 11, "get_text": 11, "fantast": 11, "albeit": 11, "canada_wiki_t": 11, "metropolitan": [11, 17], "droplevel": 11, "canada_wiki_df": 11, "rank": 11, "unnam": 11, "8_level_1": 11, "9_level_1": 11, "6202225": 11, "543551": 11, "quebec": 11, "4291732": 11, "halifax": [11, 17], "nova": 11, "scotia": 11, "465703": 11, "2642825": 11, "st": [11, 17], "catharin": [11, 17], "niagara": [11, 17], "433604": 11, "ottawa": [11, 17], "gatineau": [11, 17], "1488307": 11, "windsor": [11, 17], "422630": 11, "calgari": [11, 17], "1481806": 11, "oshawa": 11, "415311": 11, "edmonton": [11, 17], "1418118": 11, "victoria": [11, 16, 17], "397237": 11, "839311": 11, "saskatoon": 11, "saskatchewan": 11, "317480": 11, "winnipeg": [11, 17], "manitoba": 11, "834678": 11, "regina": [11, 17], "249217": 11, "hamilton": 11, "785184": 11, "sherbrook": 11, "227398": 11, "kitchen": [11, 17], "cambridg": [11, 17], "waterloo": [11, 17], "575847": 11, "kelowna": [11, 17], "222162": 11, "desktop": 11, "stun": 11, "rho": 11, "ophiuchi": 11, "juli": 11, "webb": 11, "telescop": 11, "nircam": 11, "molecular": [11, 16], "signup": 11, "safe": [11, 15], "transfer": [11, 12], "infinit": 11, "bandwidth": 11, "frequent": [11, 15], "success": [11, 15], "bog": 11, "revok": 11, "grant": 11, "quota": 11, "overrun": 11, "abid": 11, "hourli": 11, "hour": [11, 12], 
"planetari": 11, "apod": 11, "api_kei": 11, "your_api_kei": 11, "07": [11, 16], "explan": [11, 16], "mere": 11, "390": 11, "light": [11, 15], "sun": [11, 16], "star": 11, "planet": 11, "peer": 11, "natal": 11, "infrar": 11, "spectacular": 11, "cosmic": 11, "snapshot": [11, 14, 15], "celebr": 11, "young": 11, "brighter": 11, "sport": 11, "diffract": 11, "spike": 11, "jet": 11, "shock": 11, "hydrogen": 11, "blast": 11, "newborn": 11, "yellowish": 11, "dusti": 11, "caviti": 11, "carv": 11, "energet": 11, "Near": 11, "shadow": 11, "cast": 11, "protoplanetari": 11, "hdurl": 11, "2307": 11, "stsci": 11, "01_rhooph": 11, "png": [11, 16], "media_typ": 11, "service_vers": 11, "v1": 11, "01_rhooph1024": 11, "neat": 11, "json": 11, "javascript": 11, "notat": [11, 17], "nasa_data_singl": 11, "start_dat": 11, "end_dat": 11, "nasa_data": 11, "74": [11, 16], "copyright": 11, "data_dict": 11, "nasa_df": 11, "carina": 11, "nebula": 11, "ncarlo": 11, "taylor": 11, "2305": 11, "carnorth": 11, "02": [11, 16], "flat": [11, 12, 13, 16], "rock": 11, "mar": 11, "nnasa": 11, "njpl": 11, "caltech": 11, "nmsss": 11, "nprocess": 11, "ne": 11, "flatmar": 11, "03": [11, 16, 17], "centauru": 11, "peculiar": 11, "island": 11, "nmarco": 11, "lorenzi": 11, "nangu": 11, "lau": 11, "tommi": 11, "tse": 11, "ntex": 11, "ngc5128_": 11, "galaxi": 11, "famou": 11, "hole": 11, "pia23122": 11, "shackleton": 11, "shadowcam": 11, "shacklet": 11, "69": 11, "doom": 11, "eta": 11, "nesa": 11, "nhubbl": 11, "nlice": 11, "etacarin": 11, "dust": 11, "ngc": 11, "6559": 11, "nadam": 11, "ntelescop": 11, "ngc6559_": 11, "sunspot": 11, "spot": 11, "72": 11, "ring": 11, "spiral": 11, "1398": 11, "ngc1398_": 11, "73": [11, 16], "readili": 11, "heart": 11, "awesom": 11, "udac": 11, "linux": [11, 14], "rthepsfoundation23": 11, "kenneth": 11, "readthedoc": 11, "latest": [11, 14, 15, 17], "ric07": 11, "leonard": 11, "beauti": 11, "soup": 11, "april": [11, 16], "nasaesacsa": 11, "esa": 11, "csa": 11, "pontoppidan": 11, 
"pagan": 11, "esawebb": 11, "weic2316a": 11, "realtsproject21": 11, "internetlivestat": 11, "faster": [12, 16], "rmspe": [12, 13], "person": [12, 13, 16], "week": 12, "annual": 12, "boston": 12, "marathon": 12, "sale": [12, 13], "spline": 12, "heurist": 12, "932": 12, "estat": [12, 13], "sacramento": [12, 13], "bee": 12, "newspap": 12, "realtor": 12, "zip": [12, 14, 15], "sqft": [12, 13], "latitud": 12, "longitud": 12, "z95838": 12, "836": [12, 17], "59222": 12, "631913": 12, "434879": 12, "z95823": 12, "1167": 12, "68212": 12, "478902": 12, "431028": 12, "z95815": 12, "796": 12, "68880": 12, "618305": 12, "443839": 12, "852": 12, "69307": 12, "616835": 12, "439146": 12, "z95824": 12, "797": 12, "81900": 12, "519470": 12, "435768": 12, "927": 12, "z95829": 12, "2280": 12, "232425": 12, "457679": 12, "359620": 12, "928": [12, 17], "1477": 12, "234000": 12, "499893": 12, "458890": 12, "929": 12, "citrus_height": 12, "z95610": 12, "1216": 12, "235000": 12, "708824": 12, "256803": 12, "930": [12, 16], "elk_grov": 12, "z95758": 12, "1685": 12, "235301": 12, "417000": 12, "397424": 12, "931": 12, "el_dorado_hil": 12, "z95762": 12, "1362": 12, "235738": 12, "655245": 12, "075915": 12, "livabl": 12, "feet": [12, 13], "usd": [12, 13], "unit": [12, 13, 16, 17], "front": [12, 16], "0f": [12, 13], "sold": [12, 13], "dive": 12, "subsampl": 12, "small_sacramento": 12, "pai": 12, "absent": 12, "small_plot": 12, "overlai": 12, "line_df": 12, "2000": 12, "mark_rul": [12, 16], "strokedash": [12, 16], "dist": 12, "nearest_neighbor": 12, "298": 12, "1900": 12, "361745": 12, "487409": 12, "461413": 12, "718": 12, "antelop": 12, "z95843": 12, "2160": 12, "290000": 12, "704554": 12, "354753": 12, "rosevil": 12, "z95678": 12, "1744": 12, "326951": 12, "771917": 12, "304439": 12, "256": 12, "252": 12, "z95835": 12, "1718": 12, "250000": 12, "676658": 12, "528128": 12, "282": 12, "rancho_cordova": 12, "z95670": 12, "1671": 12, "175000": 12, "591477": 12, "315340": 12, "329": 12, "280739": 
12, "280": [12, 16, 17], "739": 12, "unansw": 12, "abil": [12, 15, 16, 17], "lock": [12, 13], "sacramento_train": [12, 13], "sacramento_test": [12, 13], "limits_": 12, "y_i": 12, "hat": 12, "_i": 12, "th": 12, "forecast": 12, "overshoot": 12, "undershoot": 12, "rmse": [12, 13], "equat": [12, 13], "kneighborsregressor": [12, 13], "neg_root_mean_squared_error": 12, "kneighborsregressor__n_neighbor": 12, "sacr_pipelin": 12, "sacr_preprocessor": 12, "201": 12, "sacr_gridsearch": 12, "sacr_result": 12, "param_kneighborsregressor__n_neighbor": 12, "117365": 12, "988307": 12, "2715": 12, "383001": 12, "93956": 12, "523683": 12, "2466": 12, "200227": 12, "89859": 12, "401722": 12, "2739": 12, "713448": 12, "87893": 12, "534919": 12, "2958": 12, "587153": 12, "86444": 12, "413831": 12, "3383": 12, "712997": 12, "92909": 12, "550051": 12, "2562": 12, "784826": 12, "93137": 12, "289780": 12, "2511": 12, "564001": 12, "93395": 12, "588763": 12, "2492": 12, "272799": 12, "93671": 12, "588088": 12, "2473": 12, "312705": 12, "199": 12, "93986": 12, "752272": 12, "048651": 12, "nonneg": 12, "neg_": 12, "convolut": 12, "alright": [12, 16], "101": [12, 17], "minimum": [12, 13, 17], "699": 12, "perfectli": [12, 15, 16], "datapoint": 12, "inflex": 12, "idiosyncrat": 12, "unseen": [12, 13], "mean_squared_error": [12, 13], "87498": 12, "86808211416": 12, "499": 12, "578": 12, "neglig": 12, "buyer": 12, "afford": 12, "maximum": [12, 13, 17], "5000": 12, "superimpos": [12, 13], "qualit": [12, 13], "opportun": 12, "sqft_prediction_grid": [12, 13], "arang": 12, "base_plot": 12, "sacr_preds_plot": [12, 13], "best_k_sacr": 12, "ff7f0e": [12, 13], "concern": [12, 13], "incorpor": [12, 17], "plot_b": 12, "moreov": 12, "85156": 12, "027067": 12, "3376": 12, "143313": 12, "rmspe_mult": 12, "85083": 12, "2902421959": 12, "083": 12, "overlaid": [12, 13], "2d": 12, "newli": [12, 15], "character": 13, "conclud": 13, "slower": 13, "confusingli": 13, "undervalu": 13, "beta_0": 13, "beta_1": 13, "cdot": 
13, "intercept": [13, 16], "coeffici": 13, "parametr": 13, "push": 13, "happili": 13, "crazi": 13, "shouldn": 13, "600": [13, 16], "276": 13, "027": 13, "plausibl": 13, "linearregress": 13, "linear_model": 13, "coef_": 13, "intercept_": 13, "lm": 13, "285652": 13, "15642": 13, "309105": 13, "hurt": 13, "afterward": [13, 17], "85376": 13, "59691629931": 13, "377": [13, 16], "tricki": [13, 14], "all_point": 13, "wiggli": 13, "curv": [13, 16], "oscil": [13, 16], "Such": 13, "fare": 13, "extrapol": 13, "obvious": 13, "mlm": 13, "linearregressionlinearregress": 13, "lm_mult_test_rmsp": 13, "82331": 13, "04630202598": 13, "331": 13, "hallmark": 13, "59235377": 13, "20333": 13, "43213798": 13, "53180": 13, "26906624224": 13, "beta_2": 13, "hyperplan": 13, "333": [13, 16], "tune": [13, 16], "collinear": 13, "judg": 13, "unbeknownst": 13, "analyst": 13, "parent": 13, "absurdli": 13, "subtl": [13, 17], "inaccur": 13, "238": 13, "ft": 13, "041": 13, "166": 13, "539": 13, "ic": 13, "cream": 13, "flavor": [13, 16], "remark": 13, "homeown": 13, "df": [13, 17], "fulli": [13, 16], "5994": 13, "288853": 13, "1688": 13, "092090": 13, "9859": 13, "021194": 13, "9160": 13, "812375": 13, "6400": 13, "212624": 13, "7341": 13, "333609": 13, "8434": 13, "656970": 13, "3329": 13, "106273": 13, "7170": 13, "311442": 13, "7895": 13, "567003": 13, "cubic": 13, "z": 13, "magnitud": [13, 16], "leap": 13, "stone": 13, "enjoi": 13, "ventura": 14, "22": [14, 15, 16], "cpu": 14, "english": [14, 16, 17], "virtual": 14, "rightmost": 14, "compress": [14, 16], "unzip": 14, "autograd": 14, "pre": 14, "isol": 14, "interf": 14, "ex": 14, "wizard": 14, "wsl": 14, "hyper": 14, "prompt": [14, 15], "cmd": 14, "admin": 14, "administr": 14, "log": [14, 15, 16], "bio": 14, "hotkei": 14, "esc": 14, "reboot": 14, "familiar": 14, "ubcdsci": 14, "proceed": [14, 17], "dockerfil": 14, "besid": [14, 15], "textbox": 14, "8888": 14, "volum": 14, "path": [14, 16, 17], "jovyan": 14, "scroll": [14, 15], "127": 14, 
"troubleshoot": 14, "tip": 14, "dmg": 14, "intel": 14, "processor": 14, "older": 14, "appl": 14, "newer": 14, "drag": [14, 15], "sudo": 14, "certif": 14, "curl": 14, "gnupg": 14, "fssl": 14, "sh": 14, "chmod": 14, "rm": 14, "pwd": 14, "homepag": 14, "bundl": 14, "kernel": 14, "pip": 14, "upgrad": 14, "env": 14, "intro": 14, "yml": 14, "compat": 14, "xcode": 14, "x64": 14, "arm64": 14, "debian": 14, "deb": 14, "dpkg": 14, "jlab": 14, "me": 15, "ago": 15, "holder": 15, "lifespan": 15, "resolv": 15, "revis": 15, "mess": [15, 16], "repercuss": 15, "boggl": 15, "unclear": 15, "document_final_draft_fin": 15, "to_hand_in_final_v2": 15, "polish": 15, "springboard": 15, "fruit": 15, "revert": 15, "Being": 15, "todai": [15, 16], "safeti": 15, "workspac": 15, "schemat": 15, "maintain": 15, "told": 15, "metadata": 15, "brief": 15, "narr": 15, "readm": 15, "md": 15, "draft": 15, "shorten": 15, "daa29d6": 15, "884c7ce": 15, "prerequisit": 15, "stage": 15, "physic": [15, 16], "placehold": 15, "synchron": 15, "templat": 15, "canadian_languag": 15, "hyphen": 15, "privaci": 15, "happi": 15, "green": [15, 17], "respositori": 15, "reserv": 15, "upload": [15, 16], "toggl": 15, "markdown": 15, "archiv": 15, "defeat": 15, "prove": 15, "beginn": 15, "grain": 15, "expiri": 15, "creation": 15, "absolut": [15, 16], "tick": [15, 16], "repo": 15, "fret": 15, "eda": 15, "flag": 15, "pane": 15, "plu": 15, "untrack": 15, "checkpoint": 15, "state": [15, 16], "datetim": [15, 16], "stamp": 15, "ok": 15, "credenti": 15, "author": 15, "33": [15, 16, 17], "dismiss": 15, "invit": 15, "collaborators_github_user_nam": 15, "refresh": 15, "blend": [15, 16], "offend": 15, "preced": 15, "histor": 15, "float": [15, 17], "app": 15, "convers": [15, 16, 17], "subtop": 15, "persist": 15, "thread": 15, "searchabl": 15, "notif": 15, "repli": 15, "submit": [15, 16], "submiss": 15, "youtub": 15, "advic": 15, "gitlab": 15, "bitbucket": 15, "wbc": 15, "jennif": 15, "bryan": 15, "karen": 15, "cranston": 15, "justin": 15, 
"kitz": 15, "lex": 15, "nederbragt": 15, "traci": 15, "teal": 15, "subplot": 16, "raster": 16, "svg": 16, "distract": 16, "poster": 16, "wilk": 16, "oft": 16, "pie": 16, "static": 16, "math": 16, "cognit": 16, "mental": 16, "plainli": 16, "legend": 16, "scheme": 16, "surprisingli": 16, "sex": 16, "ancestri": 16, "deeb": 16, "2005": 16, "blind": 16, "reinforc": 16, "sparingli": 16, "detract": 16, "wari": 16, "overplot": 16, "overlap": 16, "zoom": 16, "vegafus": 16, "data_transform": 16, "curat": 16, "pieter": 16, "tan": 16, "noaa": 16, "gml": 16, "ralph": 16, "keel": 16, "scripp": 16, "oceanographi": 16, "dioxid": 16, "hawaii": 16, "1959": 16, "1980": 16, "co2_df": 16, "mauna_loa_data": 16, "parse_d": 16, "date_measur": 16, "ppm": 16, "338": 16, "340": 16, "341": 16, "06": [16, 17], "479": 16, "414": 16, "480": 16, "481": 16, "416": 16, "482": [16, 17], "483": 16, "484": 16, "datetime64": 16, "ns": 16, "iso": 16, "8601": 16, "alphanumer": 16, "mark_": 16, "leverag": 16, "helper": 16, "co2_scatt": 16, "upward": 16, "affirm": 16, "predecessor": 16, "successor": 16, "alter": 16, "segment": 16, "emphas": 16, "co2_lin": 16, "aha": 16, "phenomenon": 16, "muddl": 16, "settl": 16, "configure_axi": 16, "titlefonts": 16, "co2_line_label": 16, "co2": 16, "configure_": 16, "1990": 16, "clip": 16, "stack": [16, 17], "co2_line_scal": 16, "late": 16, "season": 16, "summer": 16, "octob": 16, "winter": 16, "novemb": 16, "analog": 16, "paint": 16, "blank": 16, "canva": 16, "primer": 16, "akin": 16, "sketch": 16, "durat": 16, "geyser": 16, "yellowston": 16, "nation": 16, "wyom": 16, "79": 16, "283": 16, "533": 16, "267": 16, "117": [16, 17], "268": [16, 17], "270": 16, "817": 16, "271": 16, "467": 16, "272": 16, "faithful_scatt": 16, "faithful_scatter_label": 16, "faithful_scatter_labels_black": 16, "whom": 16, "hollow": 16, "can_lang_plot": 16, "vs": 16, "can_lang_plot_label": 16, "bunch": 16, "clump": 16, "french": [16, 17], "460": 16, "850": 16, "19460850": 16, "22162865": 16, 
"15265335": 16, "29748265": 16, "59": [16, 17], "7166700": 16, "6943800": 16, "3825215": 16, "10242945": 16, "logarithm": 16, "squish": 16, "log_": 16, "log10": 16, "inf": 16, "can_lang_plot_log": 16, "gridlin": 16, "seven": 16, "can_lang_plot_log_revis": 16, "tickcount": 16, "kilo": 16, "mutat": 16, "most_at_home_perc": 16, "001678": 16, "000669": 16, "029188": 16, "013612": 16, "003272": 16, "001266": 16, "038291": 16, "017026": 16, "076511": 16, "037367": 16, "011351": 16, "003940": 16, "005234": 16, "002276": 16, "036741": 16, "021763": 16, "038561": 16, "020155": 16, "025831": 16, "007439": 16, "can_lang_plot_perc": 16, "meaningfulli": 16, "onto": 16, "belong": [16, 17], "can_lang_plot_categori": 16, "laid": 16, "can_lang_plot_legend": 16, "orient": 16, "tableau10": 16, "unsur": 16, "dark2": 16, "aesthet": 16, "switch": 16, "can_lang_plot_them": 16, "tooltip": 16, "hover": 16, "mous": 16, "pointer": 16, "can_lang_plot_tooltip": 16, "mile": 16, "mcneil": 16, "contin": 16, "south": 16, "africa": 16, "europ": 16, "asia": 16, "australia": 16, "islands_df": 16, "landmass_typ": 16, "11506": 16, "5500": 16, "16988": 16, "2968": 16, "axel": 16, "heiberg": 16, "baffin": 16, "184": 16, "bank": 16, "borneo": 16, "britain": 16, "celeb": 16, "celon": 16, "cuba": 16, "devon": 16, "ellesmer": 16, "3745": 16, "greenland": 16, "840": 16, "hainan": 16, "hispaniola": 16, "hokkaido": 16, "honshu": 16, "iceland": 16, "ireland": 16, "java": 16, "kyushu": 16, "luzon": 16, "madagascar": 16, "227": 16, "melvil": 16, "mindanao": 16, "molucca": 16, "guinea": 16, "306": 16, "zealand": 16, "newfoundland": 16, "9390": 16, "novaya": 16, "zemlya": 16, "princ": 16, "wale": 16, "sakhalin": 16, "6795": 16, "southampton": 16, "spitsbergen": 16, "sumatra": 16, "183": 16, "taiwan": 16, "tasmania": 16, "tierra": 16, "fuego": 16, "timor": 16, "islands_bar": 16, "nlargest": 16, "tilt": 16, "sort_valu": 16, "islands_top12": 16, "islands_bar_top": 16, "appeal": 16, "minu": 16, "revers": 16, "caption": 
16, "slide": 16, "summari": 16, "twelv": 16, "islands_plot_sort": 16, "morlei": 16, "1882": 16, "299": 16, "792": 16, "458": 16, "km": 16, "sec": 16, "kilometr": 16, "morley_df": 16, "expt": 16, "740": 16, "900": 16, "1070": [16, 17], "940": 16, "950": 16, "810": 16, "870": 16, "experiment": 16, "fell": 16, "morley_bar": 16, "thin": 16, "bucket": 16, "morley_hist": 16, "datum": 16, "thick": 16, "v_line": 16, "morley_hist_lin": 16, "morley_hist_color": 16, "sit": 16, "transluc": 16, "morley_hist_categor": 16, "deriv": 16, "incorrect": 16, "clearest": 16, "morley_hist_facet": 16, "1050": 16, "foremost": 16, "subtli": 16, "speed_of_light": 16, "299792": 16, "relativeerror": 16, "299000": 16, "019194": 16, "017498": 16, "035872": 16, "092578": 16, "045879": 16, "049215": 16, "052550": 16, "002516": 16, "005851": 16, "025865": 16, "morley_hist_rel": 16, "recreat": 16, "admir": 16, "morley_hist_maxbin": 16, "width": 16, "motiv": 16, "establish": 16, "pose": 16, "wiggl": 16, "discern": 16, "parenthes": [16, 17], "energi": 16, "automot": 16, "plant": 16, "burn": [16, 17], "fossil": 16, "fuel": 16, "greenhous": 16, "gase": 16, "byproduct": 16, "trap": 16, "heat": 16, "warm": 16, "observatori": 16, "amplitud": 16, "1800": 16, "kilomet": 16, "farthest": 16, "confer": 16, "shop": 16, "billboard": 16, "pixel": 16, "lossi": 16, "lossless": 16, "jpeg": 16, "jpg": 16, "photograph": 16, "bmp": 16, "tiff": 16, "tif": 16, "gimp": 16, "redraw": 16, "ep": 16, "inkscap": 16, "shrink": 16, "portabl": 16, "hardl": 16, "1991": 16, "filenam": 16, "img": 16, "viz": 16, "faithful_plot": 16, "mb": 16, "decent": 16, "bigger": 16, "dee05": 16, "sameer": 16, "clinic": 16, "369": 16, "har91": 16, "wolfgang": 16, "york": 16, "mcn77": 16, "donald": 16, "mic82": 16, "veloc": 16, "nite": 16, "tate": 16, "aval": 16, "cademi": 16, "nnapoli": 16, "astronom": 16, "tk20": 16, "ccgg": 16, "vgh": 16, "jacob": 16, "granger": 16, "heer": 16, "dominik": 16, "moritz": 16, "kanit": 16, "wongsuphasawat": 16, 
"arvind": 16, "satyanarayan": 16, "eitan": 16, "ilia": 16, "timofeev": 16, "ben": 16, "welsh": 16, "scott": 16, "sievert": 16, "journal": [16, 17], "1057": 16, "21105": 16, "joss": 16, "01057": 16, "wil19": 16, "clau": 16, "clauswilk": 16, "dataviz": 16, "util": 17, "entiti": 17, "2235145": 17, "abbrevi": 17, "int": 17, "14159": 17, "boolean": 17, "bool": 17, "hello": 17, "nonetyp": 17, "arithmet": 17, "dict": 17, "cities_seri": 17, "separt": 17, "population_in_2016": 17, "1027613": 17, "1823281": 17, "544870": 17, "571146": 17, "321484": 17, "upcom": 17, "population_in_2016_df": 17, "criteria": 17, "No": 17, "bespok": 17, "untidi": 17, "2006": 17, "2011": 17, "land": 17, "region_lang_top5_cities_wid": 17, "cite": 17, "montr\u00e9al": 17, "lang_wid": 17, "985": 17, "1435": 17, "960": 17, "575": 17, "360": 17, "240": 17, "8485": 17, "1015": 17, "705": 17, "885": 17, "13260": 17, "2450": 17, "1090": 17, "1365": 17, "770": 17, "2440": 17, "5290": 17, "1025": 17, "380": 17, "3355": 17, "8960": 17, "3380": 17, "1430": 17, "tough": 17, "lang_mother_tidi": 17, "id_var": 17, "var_nam": 17, "value_nam": 17, "1065": 17, "1066": 17, "1067": 17, "1068": 17, "1069": 17, "met": 17, "commut": 17, "widen": 17, "region_lang_top5_cities_long": 17, "lang_long": 17, "2135": 17, "2136": 17, "2137": 17, "2138": 17, "2139": 17, "2140": 17, "lang_home_tidi": 17, "2495": 17, "1622735": 17, "1330555": 17, "8630": 17, "3245": 17, "behaviour": 17, "colum": 17, "messier": 17, "dealt": 17, "lang_messi": 17, "region_lang_top5_cities_messi": 17, "265": 17, "520": 17, "505": 17, "4045": 17, "440": 17, "330": 17, "6380": 17, "1445": 17, "530": 17, "620": 17, "3130": 17, "760": 17, "6665": 17, "860": 17, "1080": 17, "lang_messy_long": 17, "tidy_lang": 17, "astyp": 17, "depth": 17, "occas": 17, "official_lang": 17, "3836770": 17, "3218725": 17, "29800": 17, "11940": 17, "620510": 17, "412120": 17, "2669195": 17, "1607550": 17, "487": 17, "696": 17, "1065070": 17, "844740": 17, "701": 17, "910": 17, 
"1050410": 17, "792700": 17, "915": 17, "10950": 17, "2520": 17, "1060": 17, "ampersand": 17, "pipe": 17, "region_data": 17, "household": 17, "dwell": 17, "bellevil": 17, "43002": 17, "1354": 17, "65121": 17, "103472": 17, "45050": 17, "lethbridg": 17, "45696": 17, "3046": 17, "69699": 17, "117394": 17, "48317": 17, "thunder": 17, "bai": 17, "52545": 17, "2618": 17, "26318": 17, "121621": 17, "57146": 17, "peterborough": 17, "50533": 17, "1636": 17, "98336": 17, "121721": 17, "55662": 17, "saint": 17, "52872": 17, "3793": 17, "42158": 17, "126202": 17, "58398": 17, "535499": 17, "7168": 17, "96442": 17, "1323783": 17, "519693": 17, "5241": 17, "70103": 17, "1392609": 17, "960894": 17, "3040": 17, "41532": 17, "2463431": 17, "1727310": 17, "4638": 17, "24059": 17, "4098927": 17, "2135909": 17, "6269": 17, "93132": 17, "5928040": 17, "interst": 17, "city_nam": 17, "five_c": 17, "502143": 17, "9857": 17, "77908": 17, "1321426": 17, "537634": 17, "seriesa": 17, "seriesb": 17, "669": 17, "capabl": 17, "omit": 17, "startswith": 17, "darker": 17, "region_lang": 17, "moncton": 17, "saguenai": 17, "7485": 17, "7486": 17, "7487": 17, "abbotsford": 17, "mission": 17, "7488": 17, "7489": 17, "7490": 17, "23171710": 17, "std": 17, "490000e": 17, "093686e": 17, "401258e": 17, "000000e": 17, "836770e": 17, "25th": 17, "50th": 17, "75th": 17, "skipna": 17, "3061820": 17, "5600480": 17, "numeric_onli": 17, "3200": 17, "341121": 17, "3093": 17, "686248": 17, "1853": 17, "757677": 17, "5127": 17, "499332": 17, "55231": 17, "640268": 17, "64012": 17, "578320": 17, "48574": 17, "532066": 17, "94001": 17, "162338": 17, "cartoon": 17, "dataframegroupbi": 17, "0x7fab6d73a450": 17, "137445": 17, "182390": 17, "97840": 17, "brantford": 17, "124560": 17, "troi": 17, "rivi\u00e8r": 17, "149835": 17, "331375": 17, "270715": 17, "612595": 17, "23015": 17, "875": 17, "8235": 17, "2695": 17, "102": 17, "365": 17, "23565": 17, "104": 17, "11185": 17, "122100": 17, "93495": 17, "167835": 17, 
"168990": 17, "115125": 17, "193445": 17, "93655": 17, "54150": 17, "100855": 17, "116645": 17, "73910": 17, "130835": 17, "937055": 17, "1343335": 17, "147805": 17, "78610": 17, "149805": 17, "1316635": 17, "2289515": 17, "302690": 17, "211705": 17, "354470": 17, "235990": 17, "166220": 17, "318540": 17, "530570": 17, "437460": 17, "749285": 17, "keyerror": 17, "qu\u00e9bec": 17, "028571": 17, "region_lang_num": 17, "wise": 17, "040": 17, "aforement": 17, "english_lang": 17, "1898": 17, "444955": 17, "2500590": 17, "1903": 17, "1918": 17, "1919": 17, "930405": 17, "1275265": 17, "1923": 17, "city_pop": 17, "unchang": 17, "tmp": 17, "ipykernel_12": 17, "2654974267": 17, "settingwithcopywarn": 17, "row_index": 17, "col_index": 17, "pydata": 17, "doc": 17, "stabl": 17, "user_guid": 17, "warn": 17, "went": 17, "silenc": 17, "div": 17, "divis": 17, "108554": 17, "151384": 17, "100543": 17, "610060": 17, "516498": 17, "647224": 17, "542966": 17, "944744": 17, "672877": 17, "764802": 17, "606588": 17, "964617": 17, "704092": 17, "794906": 17, "599882": 17, "965067": 17, "534472": 17, "658730": 17, "540123": 17, "929401": 17, "city_popul": 17, "wic14": 17}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"acknowledg": 0, "python": [0, 3, 4, 6, 7, 8, 9, 11, 13, 17], "edit": [0, 6, 9, 15], "about": 1, "author": 1, "classif": [2, 3], "i": [2, 12, 15], "train": [2, 3, 12], "predict": [2, 3], "overview": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "chapter": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "learn": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "object": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "The": [2, 3, 4, 9, 12, 13], "problem": [2, 12], "explor": [2, 8, 9, 12], "data": [2, 3, 6, 8, 9, 11, 12, 16, 17], "set": [2, 3, 8, 12, 14, 16], "load": [2, 8], "cancer": 2, "describ": 2, "variabl": [2, 3], "k": [2, 4, 12, 13], "nearest": [2, 12], "neighbor": [2, 12], "distanc": 2, "between": 2, "point": 2, "evalu": [2, 3, 12], "from": [2, 11, 
15, 17], "new": [2, 9, 13], "observ": 2, "each": 2, "its": 2, "5": 2, "more": 2, "than": 2, "two": 2, "explanatori": 2, "summari": [2, 3, 7, 9, 11, 17], "algorithm": [2, 4], "scikit": [2, 3], "preprocess": [2, 3], "center": 2, "scale": 2, "balanc": 2, "miss": [2, 11], "put": [2, 8], "togeth": [2, 8], "pipelin": 2, "exercis": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "refer": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "ii": [3, 13], "tune": [3, 12], "perform": [3, 17], "an": [3, 4, 11, 16], "exampl": [3, 4], "confus": 3, "matrix": 3, "tumor": 3, "imag": 3, "random": [3, 4], "seed": 3, "creat": [3, 8, 9, 15, 16], "test": [3, 12], "split": [3, 17], "classifi": 3, "label": 3, "critic": 3, "analyz": 3, "cross": 3, "valid": 3, "paramet": 3, "valu": [3, 8, 11, 17], "select": [3, 8, 17], "under": 3, "overfit": [3, 12], "predictor": [3, 13], "effect": [3, 16], "irrelev": 3, "find": 3, "good": 3, "subset": [3, 8], "forward": 3, "addit": [3, 4, 7, 9, 11, 13, 15, 16, 17], "resourc": [3, 4, 7, 9, 11, 13, 15, 16, 17], "cluster": 4, "illustr": 4, "mean": [4, 7], "measur": 4, "qualiti": 4, "restart": 4, "choos": [4, 16], "foreword": 5, "scienc": 6, "A": 6, "first": 6, "introduct": 6, "welcom": 6, "statist": [7, 17], "infer": 7, "why": [7, 11, 15], "do": [7, 17], "we": [7, 11], "need": 7, "sampl": 7, "distribut": 7, "proport": 7, "bootstrap": 7, "us": [7, 8, 11, 15, 17], "calcul": [7, 17], "plausibl": 7, "rang": 7, "panda": 8, "canadian": [8, 16], "languag": [8, 16], "ask": 8, "question": 8, "type": [8, 17], "analysi": 8, "tabular": [8, 11], "name": [8, 11, 17], "thing": 8, "frame": [8, 17], "loc": [8, 17], "filter": [8, 17], "row": [8, 11, 17], "column": [8, 11, 17], "sort_valu": 8, "head": 8, "order": 8, "ad": [8, 16, 17], "modifi": [8, 17], "combin": [8, 9, 17], "step": 8, "chain": 8, "multilin": 8, "express": 8, "visual": [8, 16], "altair": [8, 16], "bar": [8, 16], "plot": [8, 16], "format": [8, 9, 16], "chart": [8, 16], "all": [8, 11], "access": [8, 9, 11, 15], "document": 8, 
"code": 9, "text": [9, 11, 16], "jupyt": [9, 15], "cell": 9, "execut": 9, "kernel": 9, "markdown": 9, "save": [9, 16], "your": [9, 14, 15], "work": [9, 14, 15], "best": 9, "practic": 9, "run": 9, "notebook": 9, "includ": 9, "packag": 9, "file": [9, 11, 15, 16], "export": 9, "differ": [9, 11, 16], "html": [9, 11], "pdf": 9, "prefac": 10, "read": 11, "local": [11, 15], "web": 11, "absolut": 11, "rel": 11, "path": 11, "plain": 11, "read_csv": 11, "comma": 11, "separ": [11, 17], "skip": 11, "when": [11, 16], "sep": 11, "argument": 11, "header": 11, "handl": [11, 15], "directli": 11, "url": 11, "preview": 11, "befor": 11, "microsoft": 11, "excel": 11, "read_excel": 11, "databas": 11, "sqlite": 11, "postgresql": 11, "should": [11, 15], "bother": 11, "write": 11, "csv": 11, "obtain": [11, 14], "scrape": 11, "css": 11, "selector": 11, "beautifulsoup": 11, "read_html": 11, "api": 11, "nasa": 11, "regress": [12, 13], "model": 12, "underfit": 12, "multivari": [12, 13], "nn": [12, 13], "strength": 12, "limit": 12, "linear": 13, "simpl": 13, "compar": 13, "multicollinear": 13, "outlier": 13, "design": 13, "other": 13, "side": 13, "up": [14, 17], "comput": 14, "worksheet": 14, "thi": [14, 17], "book": 14, "docker": 14, "window": 14, "maco": 14, "ubuntu": 14, "jupyterlab": 14, "desktop": 14, "collabor": 15, "version": 15, "control": 15, "what": [15, 17], "repositori": 15, "workflow": 15, "commit": 15, "chang": 15, "push": 15, "remot": 15, "pull": 15, "github": 15, "pen": 15, "tool": 15, "add": 15, "menu": 15, "gener": 15, "person": 15, "token": 15, "clone": 15, "specifi": 15, "make": 15, "give": 15, "project": 15, "merg": [15, 17], "conflict": 15, "commun": 15, "issu": 15, "refin": 16, "scatter": 16, "line": 16, "mauna": 16, "loa": 16, "co_": 16, "2": 16, "old": 16, "faith": 16, "erupt": 16, "time": 16, "axi": 16, "transform": 16, "color": 16, "island": 16, "landmass": 16, "histogram": 16, "michelson": 16, "speed": 16, "light": 16, "layer": 16, "binwidth": 16, "explain": 16, 
"size": 16, "clean": 17, "wrangl": 17, "seri": 17, "basic": 17, "doe": 17, "have": 17, "structur": 17, "tidi": 17, "go": 17, "wide": 17, "long": 17, "melt": 17, "pivot": 17, "str": 17, "deal": 17, "multipl": 17, "extract": 17, "certain": 17, "satisfi": 17, "condit": 17, "least": 17, "one": 17, "list": 17, "isin": 17, "abov": 17, "below": 17, "threshold": 17, "queri": 17, "iloc": 17, "posit": 17, "aggreg": 17, "individu": 17, "oper": 17, "group": 17, "groupbi": 17, "appli": 17, "function": 17, "across": 17}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 6, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.intersphinx": 1, "sphinxcontrib.bibtex": 9, "sphinx": 56}}) \ No newline at end of file +Search.setIndex({"docnames": ["acknowledgements", "authors", "classification1", "classification2", "clustering", "foreword-text", "index", "inference", "intro", "jupyter", "preface-text", "reading", "regression1", "regression2", "setup", "version-control", "viz", "wrangling"], "filenames": ["acknowledgements.md", "authors.md", "classification1.md", "classification2.md", "clustering.md", "foreword-text.md", "index.md", "inference.md", "intro.md", "jupyter.md", "preface-text.md", "reading.md", "regression1.md", "regression2.md", "setup.md", "version-control.md", "viz.md", "wrangling.md"], "titles": ["Acknowledgments", "About the authors", "5. Classification I: training & predicting", "6. Classification II: evaluation & tuning", "9. Clustering", "Foreword", "Data Science", "10. Statistical inference", "1. Python and Pandas", "11. Combining code and text with Jupyter", "Preface", "2. Reading in data locally and from the web", "7. Regression I: K-nearest neighbors", "8. Regression II: linear regression", "13. Setting up your computer", "12. Collaboration with version control", "4. 
Effective data visualization", "3. Cleaning and wrangling data"], "terms": {"we": [0, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17], "d": [0, 1, 5, 7, 8, 11, 16], "like": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "thank": 0, "everyon": 0, "ha": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "contribut": [0, 1, 5, 15], "develop": [0, 1, 3, 5, 7, 8, 9, 10, 11, 15], "data": [0, 1, 4, 5, 7, 10, 13, 14, 15], "scienc": [0, 1, 2, 3, 5, 8, 9, 10, 14, 15, 17], "A": [0, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "first": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "introduct": [0, 3, 4, 5, 7, 8, 10, 11, 13], "thi": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16], "an": [0, 1, 2, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17], "open": [0, 1, 6, 8, 9, 11, 14, 15, 16], "sourc": [0, 1, 11, 16], "textbook": [0, 1, 2, 3, 5, 6, 10, 11, 13, 15, 17], "began": [0, 11], "collect": [0, 2, 3, 4, 5, 7, 8, 11, 16, 17], "cours": [0, 1, 3, 4, 5, 7, 8, 9, 10, 11, 13, 17], "read": [0, 2, 3, 6, 7, 8, 9, 10, 12, 13, 15, 16, 17], "dsci": [0, 11, 14], "100": [0, 2, 3, 7, 8, 11, 12, 13, 14, 16, 17], "new": [0, 3, 4, 7, 8, 11, 12, 14, 15, 16, 17], "introductori": [0, 3, 5, 7], "univers": [0, 1, 7, 11], "british": [0, 1, 7, 8, 11], "columbia": [0, 1, 7, 11], "ubc": [0, 1, 11, 14], "sever": [0, 1, 2, 7, 11, 15, 16, 17], "faculti": 0, "member": [0, 2, 5, 15], "depart": [0, 1], "statist": [0, 1, 2, 3, 4, 5, 8, 11, 12, 13, 16], "were": [0, 2, 3, 7, 8, 9, 11, 13, 15, 16, 17], "pivot": 0, "shape": [0, 2, 4, 7, 8, 11, 13, 16, 17], "direct": [0, 2, 5, 11, 16], "greatli": [0, 17], "broad": [0, 5, 16], "structur": [0, 3, 4, 8, 11, 16], "list": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16], "topic": [0, 3, 4, 9, 13, 15], "book": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "would": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "especi": [0, 2, 8, 11, 14, 15, 16], "mat\u00eda": 0, "salib\u00edan": 0, "barrera": 0, "hi": [0, 1], "mentorship": 0, "dure": [0, 1, 3, 
8, 12, 15, 17], "initi": [0, 1, 2, 4, 8, 11, 12, 13, 15, 16], "roll": 0, "out": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "both": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "door": 0, "wa": [0, 1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "alwai": [0, 2, 3, 4, 9, 11, 12, 13, 14, 16, 17], "when": [0, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17], "need": [0, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "chat": 0, "about": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "how": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "best": [0, 3, 7, 8, 11, 12, 13, 15, 16], "introduc": [0, 5, 7, 8, 13, 15, 16, 17], "teach": [0, 1, 2, 5, 8], "our": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "year": [0, 2, 5, 8, 11, 16, 17], "student": [0, 1, 5, 7], "also": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "gabriela": 0, "cohen": 0, "freue": 0, "her": [0, 1], "561": 0, "regress": [0, 2, 3, 4, 8, 10], "i": [0, 3, 4, 7, 8, 9, 10, 11, 13, 14, 16, 17], "materi": [0, 2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 16, 17], "from": [0, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 16], "master": [0, 1], "program": [0, 1, 3, 5, 8, 9, 10, 11, 14, 16], "some": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "linear": [0, 3, 9, 12, 16], "figur": [0, 2, 8, 17], "inspir": [0, 11], "all": [0, 2, 3, 4, 5, 7, 9, 10, 12, 13, 14, 15, 16, 17], "those": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "who": [0, 2, 3, 5, 7, 8, 9, 11, 15, 16, 17], "process": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "publish": [0, 11, 16], "In": [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "particular": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "review": [0, 11, 15], "feedback": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "suggest": [0, 2, 3, 7, 8, 12, 13, 16, 17], "rohan": 0, "alexand": 0, "isabella": 0, "ghement": 0, "virgilio": 0, "g\u00f3mez": 0, "rubio": 0, "albert": [0, 16], "kim": 0, "adam": 0, "loi": 0, "maria": 0, "prokofieva": 0, 
"emili": 0, "rieder": 0, "greg": [0, 15], "wilson": [0, 8, 15], "The": [0, 1, 5, 7, 8, 11, 14, 15, 16, 17], "improv": [0, 2, 3, 4, 7, 8, 12, 13, 15, 16], "substanti": [0, 3, 12], "insight": [0, 4, 10, 16], "give": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "special": [0, 2, 7, 8, 11, 15, 16, 17], "jim": 0, "zidek": 0, "support": [0, 2, 3, 8, 14, 16, 17], "encourag": [0, 5, 17], "throughout": [0, 2, 3, 8, 10, 15, 17], "roger": [0, 5, 8], "peng": [0, 5, 8], "gracious": 0, "offer": [0, 3, 7, 11, 12, 13, 15], "write": [0, 2, 8, 9, 13, 15, 17], "foreword": 0, "final": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "ow": 0, "debt": 0, "gratitud": 0, "over": [0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 14, 15, 16, 17], "past": [0, 2, 3, 4, 5, 11, 12, 13, 14, 15, 16], "few": [0, 2, 3, 4, 5, 7, 8, 11, 12, 13, 14, 15, 16, 17], "thei": [0, 2, 3, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "provid": [0, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "invalu": 0, "worksheet": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "found": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "bug": [0, 15, 17], "us": [0, 1, 2, 3, 4, 5, 9, 10, 12, 13, 14, 16], "stood": 0, "veri": [0, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "patient": [0, 2, 3], "class": [0, 2, 3, 8, 11, 16, 17], "while": [0, 2, 3, 4, 8, 10, 11, 13, 16, 17], "frantic": 0, "fix": [0, 2, 3, 7, 9, 12, 15, 16, 17], "brought": 0, "level": [0, 3, 4, 7, 8, 10, 13, 16], "enthusiasm": 0, "sustain": 0, "hard": [0, 8, 11, 16, 17], "work": [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 16, 17], "creat": [0, 1, 2, 4, 7, 10, 11, 12, 13, 17], "interact": [0, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "them": [0, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "taught": [0, 2], "learn": [0, 1, 5, 10], "reflect": [0, 1, 16], "content": [0, 1, 2, 6, 11, 15, 17], "translat": [0, 11], "origin": [0, 1, 2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "which": [0, 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "focus": [0, 
1, 2, 3, 5, 12], "r": [0, 1, 3, 4, 5, 6, 7, 8, 11, 16], "languag": [0, 1, 2, 3, 5, 7, 9, 10, 11, 12, 14, 17], "ar": [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "navya": 0, "dahiya": 0, "gloria": 0, "ye": [0, 2], "complet": [0, 1, 3, 7, 8, 9, 11, 12, 14, 15], "round": [0, 3, 7], "philip": 0, "austin": 0, "leadership": 0, "guidanc": [0, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "gratefulli": 0, "educ": [0, 1, 2, 5], "resourc": [0, 1, 2, 12], "fund": 0, "earth": [0, 1, 16], "ocean": [0, 1], "atmospher": [0, 1, 16], "exercis": [0, 5, 10, 14], "version": [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "tiffani": [1, 6, 8, 16], "timber": [1, 6, 8, 16], "trevor": [1, 3, 4, 6, 13], "campbel": [1, 6], "melissa": [1, 6], "lee": [1, 6, 16], "adapt": [1, 16], "python": [1, 2, 5, 10, 12, 14, 15, 16], "joel": [1, 6], "ostblom": [1, 6], "lindsei": [1, 6], "heagi": [1, 6], "associ": [1, 7, 8, 10, 11, 15, 17], "professor": 1, "co": [1, 16], "director": 1, "vancouv": [1, 7, 11, 16, 17], "option": [1, 2, 9, 11, 13, 14, 15, 16, 17], "role": [1, 8, 11, 16], "she": 1, "curriculum": 1, "around": [1, 3, 5, 7, 8, 12, 13, 16, 17], "respons": [1, 2, 3, 4, 11, 12, 13, 15], "applic": [1, 3, 5, 7, 8, 11, 12, 13, 14, 17], "solv": [1, 2, 3, 4, 8, 10, 13, 15, 17], "real": [1, 2, 3, 7, 8, 11, 12, 13, 15, 17], "world": [1, 7, 8, 10, 15, 16, 17], "problem": [1, 3, 4, 5, 7, 8, 9, 10, 13, 15, 16, 17], "One": [1, 2, 3, 7, 8, 9, 12, 13, 15, 16, 17], "favorit": [1, 13], "graduat": 1, "collabor": [1, 5, 9, 10], "softwar": [1, 2, 10, 11, 14, 15, 16, 17], "packag": [1, 2, 3, 4, 5, 8, 11, 12, 13, 14, 16, 17], "modern": [1, 2, 5, 11, 16], "tool": [1, 2, 3, 4, 5, 8, 9, 10, 11, 13, 16, 17], "workflow": [1, 2, 3, 4, 5, 8, 9, 10, 12, 13], "research": [1, 4, 5, 16], "autom": [1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "scalabl": 1, "bayesian": 1, "infer": [1, 5], "algorithm": [1, 3, 12, 13, 16], "nonparametr": [1, 2, 12], "stream": 1, "theori": [1, 2, 4, 7, 12], "he": 1, "previous": 
[1, 2, 7, 8, 11, 12, 16, 17], "postdoctor": 1, "advis": [1, 11, 12, 16], "tamara": 1, "broderick": 1, "comput": [1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "artifici": 1, "intellig": 1, "laboratori": [1, 4], "csail": 1, "institut": [1, 3, 16], "system": [1, 11, 14, 15], "societi": 1, "idss": 1, "mit": 1, "ph": 1, "candid": [1, 3, 13], "under": [1, 5, 6, 9, 14, 15, 17], "jonathan": 1, "inform": [1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "decis": [1, 2, 3, 5, 7, 15], "lid": 1, "befor": [1, 2, 3, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17], "engin": [1, 11, 13, 14, 16], "toronto": [1, 11, 17], "assist": 1, "undergradu": [1, 7], "center": [1, 4, 5, 7, 11, 13, 16, 17], "approach": [1, 2, 3, 4, 7, 8, 10, 12, 13, 15, 17], "assess": [1, 3, 4, 12, 13, 15, 16], "promot": 1, "equiti": 1, "divers": [1, 7], "inclus": [1, 3, 13, 15], "phd": 1, "passion": 1, "reproduc": [1, 3, 4, 5, 7, 9, 10, 11, 15], "through": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "quantit": [1, 3, 4, 7, 8, 12, 16], "imag": [1, 2, 9, 10, 11, 14, 16], "analysi": [1, 2, 3, 4, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17], "pipelin": [1, 3, 4, 12], "studi": [1, 2, 3, 4, 5, 7, 8, 12, 16, 17], "stem": [1, 12], "cell": [1, 2, 3, 4, 8, 11, 13, 17], "development": 1, "biologi": [1, 15], "sinc": [1, 2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "lead": [1, 2, 3, 5, 8, 9, 15, 16], "workshop": [1, 2], "now": [1, 2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "care": [1, 2, 3, 11, 12, 13, 16, 17], "deepli": [1, 17], "spread": [1, 2, 4, 7, 8, 16, 17], "literaci": 1, "excit": [1, 5, 8], "programmat": [1, 3, 11], "project": [1, 2, 9, 11, 16], "geophys": 1, "invers": 1, "facil": [1, 15], "combin": [1, 2, 3, 4, 10, 11, 13, 15, 16], "method": [1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 15, 16, 17], "numer": [1, 2, 7, 11, 12, 13, 16, 17], "simul": [1, 2, 7, 16], "machin": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "answer": [1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 16, 17], "question": [1, 2, 3, 4, 5, 7, 10, 12, 13, 16, 
17], "subsurfac": 1, "primari": [1, 2, 8, 15, 16, 17], "includ": [1, 2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "miner": 1, "explor": [1, 3, 4, 7, 11, 13, 15, 16, 17], "carbon": [1, 16], "sequestr": 1, "groundwat": 1, "environment": [1, 4], "bsc": 1, "alberta": [1, 11, 17], "held": [1, 3], "posit": [1, 3, 4, 8, 12, 16], "california": [1, 12], "berkelei": 1, "prior": [1, 2, 11, 15], "start": [1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "current": [1, 2, 8, 9, 11, 13, 15, 16, 17], "previou": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "sole": [2, 9], "descript": [2, 7, 8, 9, 10, 11, 15, 16, 17], "exploratori": [2, 3, 4, 8, 10, 12, 16], "next": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "serv": [2, 3, 5, 7, 9, 13, 15], "forai": [2, 12], "focu": [2, 3, 4, 7, 8, 9, 12, 13, 15, 16, 17], "e": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "one": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16], "valu": [2, 4, 7, 9, 12, 13, 16], "categor": [2, 3, 4, 7, 8, 12, 16, 17], "interest": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "cover": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "basic": [2, 3, 8, 11, 13, 15, 16], "make": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "suitabl": [2, 17], "classifi": 2, "accur": [2, 3, 7, 12, 13, 16], "well": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "where": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "possibl": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "maxim": [2, 3, 12], "accuraci": [2, 3, 7, 12, 13], "By": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "end": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "reader": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "abl": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "do": [2, 3, 4, 5, 8, 9, 11, 12, 13, 14, 15, 16], "follow": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "recogn": [2, 9, 11, 12, 15, 17], "situat": [2, 3, 4, 8, 12, 15, 16, 17], "appropri": [2, 3, 4, 8, 11, 12, 14, 16, 17], "what": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 
16], "interpret": [2, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "output": [2, 3, 4, 8, 9, 11, 12, 13, 16, 17], "hand": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16, 17], "straight": [2, 4, 12, 13, 16], "line": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17], "euclidean": [2, 4], "graph": 2, "predictor": [2, 4, 12], "explain": [2, 4, 7, 8, 12, 13], "perform": [2, 4, 7, 8, 9, 10, 11, 12, 13, 16], "imput": 2, "step": [2, 3, 4, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17], "model": [2, 3, 4, 5, 13, 15, 16], "make_pipelin": [2, 3, 4, 12], "mani": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "want": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "base": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16], "experi": [2, 16], "For": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "instanc": [2, 3, 7, 8, 11, 17], "doctor": [2, 3], "mai": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "diagnos": [2, 3], "either": [2, 3, 4, 8, 9, 10, 12, 13, 15, 17], "diseas": 2, "healthi": 2, "symptom": 2, "s": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "email": [2, 11, 15], "might": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "tag": [2, 11, 14], "given": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "spam": 2, "text": [2, 3, 7, 8, 10, 12, 13, 14, 15, 17], "credit": 2, "card": 2, "compani": 2, "whether": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16], "purchas": [2, 4, 12, 13], "fraudul": 2, "item": [2, 4, 8, 9, 11, 12, 15, 16, 17], "amount": [2, 3, 4, 9, 11, 12, 13, 16], "locat": [2, 11, 15, 16], "These": [2, 3, 4, 5, 8, 9, 11, 13, 15, 16], "task": [2, 4, 7, 10, 12, 16, 17], "exampl": [2, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "sometim": [2, 3, 7, 8, 11, 12, 13, 14, 16, 17], "call": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "label": [2, 4, 8, 12, 16, 17], "other": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17], "featur": [2, 3, 9, 12, 13, 15, 16], "gener": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 16, 17], "assign": [2, 4, 7, 8, 11, 16, 17], "without": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], 
"known": [2, 3, 8, 11, 13, 16], "g": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "basi": [2, 11, 16], "similar": [2, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "know": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "name": [2, 3, 4, 7, 9, 12, 13, 14, 15, 16], "come": [2, 4, 7, 8, 12, 13, 14, 16, 17], "fact": [2, 3, 7, 8, 9, 11, 13, 15], "onc": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "can": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "There": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "could": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16, 17], "wide": [2, 3, 4, 5, 11, 13, 15, 16], "hart": [2, 12], "1967": [2, 3, 12], "hodg": [2, 12], "1951": [2, 12], "your": [2, 3, 4, 7, 8, 10, 11, 12, 13, 16, 17], "futur": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16], "you": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "encount": [2, 4, 11, 12, 13, 14, 17], "tree": [2, 3, 12], "vector": [2, 3, 11, 16], "svm": 2, "logist": [2, 3, 13], "neural": 2, "network": [2, 11], "see": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "addit": [2, 8, 12], "section": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "begin": [2, 3, 4, 7, 8, 11, 12, 15, 16, 17], "It": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "worth": [2, 3, 16, 17], "mention": [2, 3, 4, 7, 9, 11, 13, 14, 15, 17], "variat": [2, 7, 12, 16], "binari": [2, 3], "onli": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "involv": [2, 3, 4, 9, 11, 13, 14, 15, 16, 17], "diagnosi": [2, 3], "run": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "multiclass": 2, "categori": [2, 3, 4, 7, 8, 11, 16, 17], "bronchiti": 2, "pneumonia": 2, "common": [2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17], "cold": 2, "digit": 2, "breast": [2, 3], "dr": [2, 4, 16], "william": [2, 3, 4], "h": [2, 11], "wolberg": [2, 3], "w": [2, 8, 11], "nick": [2, 3, 8], "street": [2, 3], "olvi": [2, 3], "l": [2, 11], "mangasarian": [2, 3], "et": [2, 3, 4, 7, 11, 13, 15, 16], "al": [2, 3, 4, 7, 11, 13, 15, 16], "1993": [2, 3], "row": [2, 3, 4, 7, 9, 12, 
13, 15, 16], "repres": [2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 16, 17], "tumor": [2, 8], "sampl": [2, 3, 12], "benign": [2, 3, 8, 12], "malign": [2, 3, 8, 12], "measur": [2, 3, 7, 8, 12, 13, 16, 17], "nucleu": 2, "textur": [2, 3], "perimet": [2, 3, 8], "area": [2, 3, 8, 11, 12, 13, 15, 16, 17], "conduct": [2, 11], "physician": 2, "As": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "analys": [2, 3, 4, 5, 8, 9, 10, 11, 15, 16, 17], "formul": [2, 7, 8, 12, 16], "precis": [2, 3, 7, 9, 12, 14, 15, 17], "here": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "avail": [2, 3, 5, 8, 10, 11, 13, 14, 16], "unknown": [2, 7, 8], "show": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "import": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "becaus": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "tradit": 2, "non": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "driven": [2, 4], "quit": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "subject": [2, 9, 15, 16], "depend": [2, 3, 4, 7, 9, 11, 12, 13, 16, 17], "upon": [2, 3, 11], "skill": [2, 5, 9, 11, 16], "experienc": 2, "furthermor": [2, 3, 5, 16], "normal": [2, 3, 7, 15, 17], "danger": [2, 14], "stai": [2, 7, 11, 16], "same": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "place": [2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 16, 17], "stop": [2, 3, 4, 9, 13, 14], "grow": [2, 3, 5, 13], "get": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "larg": [2, 3, 4, 5, 7, 8, 11, 12, 13, 15, 16], "contrast": [2, 3, 4, 7, 8, 11, 12, 13, 15], "invad": 2, "surround": [2, 8, 11, 15, 16], "tissu": 2, "nearbi": [2, 3, 11], "organ": [2, 8, 9, 11, 15, 16, 17], "caus": [2, 3, 4, 8, 9, 12, 13, 16, 17], "seriou": [2, 8, 15], "damag": [2, 3], "stanford": 2, "health": [2, 5], "2021": [2, 8, 11], "thu": [2, 3, 9, 11, 12, 13, 15, 17], "quickli": [2, 3, 8, 13, 16], "type": [2, 3, 4, 7, 9, 11, 12, 13, 14, 15, 16], "guid": [2, 8, 12, 15, 16], "treatment": [2, 3, 17], "wrangl": [2, 3, 5, 8, 10, 11, 13, 16], "visual": [2, 3, 4, 7, 9, 10, 12, 13, 15, 17], 
"order": [2, 3, 4, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17], "better": [2, 3, 4, 12, 13, 16], "understand": [2, 3, 4, 5, 7, 8, 10, 11, 13, 15, 16, 17], "panda": [2, 3, 4, 7, 9, 11, 12, 13, 16, 17], "altair": [2, 3, 4, 9, 12, 13], "pd": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "alt": [2, 3, 4, 7, 8, 12, 13, 16], "case": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "file": [2, 8, 14, 17], "contain": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "csv": [2, 3, 4, 7, 8, 12, 13, 16, 17], "header": [2, 8, 9, 15, 17], "ll": [2, 3, 7, 8, 11, 12, 14, 15, 16, 17], "read_csv": [2, 3, 4, 7, 8, 9, 12, 13, 16, 17], "function": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], "argument": [2, 3, 4, 7, 8, 9, 12, 16, 17], "inspect": [2, 8, 11, 16, 17], "wdbc": 2, "id": [2, 3, 7, 16], "radiu": [2, 3], "smooth": [2, 3, 12, 16], "compact": [2, 3], "concav": [2, 3], "concave_point": [2, 3], "symmetri": [2, 3], "fractal_dimens": [2, 3], "0": [2, 3, 4, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17], "842302": 2, "m": [2, 3, 8, 11, 16], "1": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "096100": 2, "2": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17], "071512": 2, "268817": 2, "983510": 2, "567087": 2, "3": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "280628": 2, "650542": 2, "530249": 2, "215566": 2, "253764": 2, "842517": 2, "828212": 2, "353322": 2, "684473": 2, "907030": 2, "826235": 2, "486643": 2, "023825": 2, "547662": 2, "001391": 2, "867889": 2, "84300903": 2, "578499": 2, "455786": 2, "565126": 2, "557513": 2, "941382": 2, "052000": 2, "362280": 2, "035440": 2, "938859": 2, "397658": 2, "84348301": 2, "768233": 2, "253509": 2, "592166": 2, "763792": 2, "280667": 2, "399917": 2, "914213": 2, "450431": 2, "864862": 2, "4": [2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "906602": 2, "84358402": 2, "748758": 2, "150804": 2, "775011": 2, "824624": 2, "280125": 2, "538866": 2, "369806": 2, "427237": 2, "009552": 2, "561956": 2, "564": [2, 3], "926424": 2, "109139": 2, "720838": 2, "058974": 
2, "341795": 2, "040926": 2, "218868": 2, "945573": 2, "318924": 2, "312314": 2, "930209": 2, "565": [2, 3], "926682": 2, "703356": 2, "083301": 2, "614511": 2, "722326": 2, "102368": 2, "017817": 2, "692434": 2, "262558": 2, "217473": 2, "057681": 2, "566": [2, 3], "926954": 2, "701667": 2, "043775": 2, "672084": 2, "577445": 2, "839745": 2, "038646": 2, "046547": 2, "105684": 2, "808406": 2, "894800": 2, "567": [2, 3], "927241": 2, "836725": 2, "334403": 2, "980781": 2, "733693": 2, "524426": 2, "269267": 2, "294046": 2, "656528": 2, "135315": 2, "042778": 2, "568": [2, 3], "92751": 2, "b": [2, 3], "806811": 2, "220718": 2, "812793": 2, "346604": 2, "109349": 2, "149741": 2, "113893": 2, "260710": 2, "819349": 2, "560539": 2, "569": [2, 3], "12": [2, 3, 4, 7, 8, 9, 11, 13, 14, 15, 16, 17], "column": [2, 3, 4, 7, 9, 12, 13, 14, 16], "biopsi": [2, 3], "remov": [2, 3, 8, 14, 15, 16], "bodi": [2, 15], "examin": [2, 3, 4, 11, 12], "presenc": [2, 3], "tradition": 2, "procedur": [2, 3, 4, 5, 12], "invas": 2, "fine": [2, 3, 9, 15, 16, 17], "needl": 2, "aspir": 2, "present": [2, 3, 5, 7, 8, 11, 15, 16, 17], "extract": [2, 3, 4, 8, 11, 12, 13], "small": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16], "less": [2, 3, 4, 7, 11, 12, 13, 15, 16, 17], "ten": [2, 7, 8, 16], "differ": [2, 3, 4, 5, 7, 8, 12, 13, 14, 15, 17], "below": [2, 3, 7, 8, 11, 13, 15, 16], "mean": [2, 3, 8, 9, 11, 12, 13, 15, 16, 17], "across": [2, 3, 5, 7, 8, 11, 12, 13, 15, 16], "nuclei": 2, "record": [2, 3, 7, 8, 11, 15, 16, 17], "part": [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 15, 16, 17], "prepar": [2, 3, 16], "have": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], "been": [2, 3, 9, 11, 12, 13, 15, 16, 17], "standard": [2, 3, 4, 7, 9, 12, 13, 15, 16, 17], "discuss": [2, 3, 5, 7, 11, 12, 13, 14, 15, 16, 17], "why": [2, 3, 8, 12, 16, 17], "later": [2, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "addition": [2, 3, 4, 7, 9, 11, 13, 15, 17], "uniqu": [2, 3, 5, 8, 15], "therefor": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 16, 17], 
"total": [2, 3, 4, 8, 11, 12, 16, 17], "per": [2, 7, 11, 15, 16, 17], "identif": 2, "number": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "deviat": [2, 3, 4, 7, 17], "grai": [2, 15, 17], "length": [2, 4, 7, 8, 16, 17], "contour": 2, "insid": [2, 3, 7, 8, 9, 11, 14, 15, 16, 17], "local": [2, 12, 14], "ratio": [2, 17], "squar": [2, 3, 4, 8, 9, 11, 12, 13, 16, 17], "portion": [2, 11], "mirror": 2, "fractal": 2, "dimens": 2, "rough": [2, 4, 16], "info": [2, 3, 8, 16, 17], "preview": [2, 3, 4, 7, 8, 9, 10, 12, 13, 15, 16, 17], "frame": [2, 3, 4, 7, 9, 11, 12, 13, 16], "easier": [2, 3, 7, 8, 11, 12, 13, 14, 15, 16, 17], "lot": [2, 3, 4, 8, 11, 13, 16, 17], "print": [2, 3, 7, 8, 9, 11, 13, 14, 16, 17], "down": [2, 9, 11, 14, 15, 17], "page": [2, 3, 4, 6, 9, 11, 13, 14, 15], "instead": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "entri": [2, 3, 7, 8, 11, 16, 17], "core": [2, 3, 5, 8, 16, 17], "datafram": [2, 3, 4, 7, 11, 12, 13, 16, 17], "rangeindex": [2, 3, 16, 17], "null": [2, 3, 16, 17], "count": [2, 3, 7, 8, 11, 16, 17], "dtype": [2, 3, 7, 16, 17], "int64": [2, 3, 11, 16, 17], "float64": [2, 3, 7, 11, 16, 17], "6": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "7": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "8": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "9": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "10": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "11": [2, 3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "memori": [2, 3, 11, 16, 17], "usag": [2, 3, 8, 11, 13, 16, 17], "53": [2, 3, 7, 13, 15], "kb": [2, 3, 16, 17], "abov": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16], "arrai": [2, 3, 4, 5, 12, 13], "readabl": [2, 3, 8, 11, 12, 15, 16, 17], "renam": [2, 3, 7, 8, 9, 11, 12, 17], "replac": [2, 3, 7, 8, 11, 13, 14, 15, 16], "take": [2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "dictionari": [2, 3, 11, 17], "map": [2, 4, 8, 11, 12, 13, 16], "desir": [2, 3, 8, 11, 12, 15, 17], "verifi": [2, 7, 14], "result": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 
17], "ani": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "let": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "groupbi": [2, 7], "size": [2, 3, 4, 7, 11, 12, 13, 17], "find": [2, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "percentag": [2, 3, 7, 8, 16], "pair": [2, 3, 4, 11, 17], "Then": [2, 3, 4, 7, 8, 9, 12, 13, 14, 15, 16, 17], "calcul": [2, 3, 4, 12, 13], "group": [2, 3, 4, 7, 8, 14, 16], "divid": [2, 3, 8, 11, 16, 17], "multipli": [2, 8, 16], "equal": [2, 3, 4, 7, 12, 13, 17], "access": [2, 3, 4, 7, 12, 14, 16, 17], "via": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16], "attribut": [2, 3, 4, 6, 8, 11, 12], "357": [2, 3], "63": [2, 3, 11, 12], "212": [2, 4, 8, 11, 16, 17], "37": [2, 3, 4, 15, 16], "62": [2, 11, 12, 16], "741652": 2, "258348": 2, "conveni": [2, 3, 8, 11, 17], "value_count": [2, 3, 7, 17], "occurr": [2, 16], "If": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "pass": [2, 3, 8, 11, 12, 16, 17], "seri": [2, 3, 7, 12, 13], "occur": [2, 3, 4, 7, 9, 12, 13, 15, 16, 17], "true": [2, 3, 4, 7, 8, 9, 12, 16, 17], "fraction": [2, 3, 7, 12, 15, 16], "627417": 2, "372583": 2, "proport": [2, 3, 8, 16, 17], "draw": [2, 7, 8, 12, 13, 16], "color": [2, 3, 4, 11, 12, 13, 17], "scatter": [2, 3, 4, 12, 13], "plot": [2, 3, 4, 7, 12, 13, 17], "relationship": [2, 3, 4, 7, 8, 12, 13, 16, 17], "recal": [2, 3, 4, 7, 8, 12, 13, 15, 16, 17], "default": [2, 3, 8, 9, 11, 12, 14, 15, 16, 17], "palett": 2, "colorblind": [2, 16], "friendli": [2, 16], "so": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "stick": [2, 3, 11, 15], "perim_concav": [2, 3], "chart": [2, 3, 4, 7, 12, 13], "mark_circl": [2, 3, 4, 12, 13, 16], "encod": [2, 3, 4, 7, 8, 11, 12, 13, 16], "x": [2, 3, 4, 7, 8, 11, 12, 13, 14, 16], "titl": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16], "y": [2, 3, 4, 7, 8, 9, 11, 12, 13, 16], "versu": [2, 3, 4, 8, 11, 12, 13, 17], "fig": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "typic": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "fall": [2, 3, 7, 12, 15, 16], "upper": [2, 
7, 15, 16], "right": [2, 3, 4, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17], "corner": [2, 3, 14, 15, 16], "lower": [2, 3, 7, 9, 13, 16], "left": [2, 3, 4, 6, 8, 9, 11, 13, 14, 15, 16, 17], "word": [2, 3, 7, 8, 9, 11, 12, 13, 15, 17], "tend": [2, 3, 12, 15, 16], "ones": [2, 16, 17], "larger": [2, 3, 4, 7, 11, 12, 13, 15, 16], "suppos": [2, 4, 7, 8, 9, 11, 12, 15, 17], "obtain": [2, 3, 4, 7, 8, 12, 13, 15, 16, 17], "except": [2, 11, 13, 15, 17], "sai": [2, 3, 7, 9, 11, 12, 13, 14, 16, 17], "respect": [2, 3, 4, 7, 11, 15, 16, 17], "lie": 2, "middl": [2, 7, 11], "orang": [2, 4, 12, 13], "cloud": [2, 11, 15, 16], "probabl": [2, 3, 7, 11, 13], "seem": [2, 3, 5, 7, 9, 11, 12, 13, 16, 17], "actual": [2, 3, 4, 7, 8, 11, 12, 13, 15, 17], "practic": [2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 15, 16, 17], "To": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "most": [2, 3, 4, 5, 7, 8, 9, 11, 15, 16, 17], "must": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "choos": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 17], "advanc": [2, 3, 4, 5, 7, 9, 13, 14, 15, 16, 17], "assum": [2, 7, 9], "someon": [2, 3, 8, 9, 15], "chosen": [2, 3, 4, 13, 17], "ourselv": [2, 3, 12], "illustr": [2, 3, 7, 12, 13, 16, 17], "concept": [2, 5, 7, 8, 10, 12, 13, 15, 16], "walk": [2, 8, 12, 15], "whose": [2, 9, 11, 15, 17], "depict": [2, 4], "red": [2, 4, 9, 11, 12, 13, 14, 15], "diamond": 2, "coordin": [2, 4, 8, 16], "idea": [2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "close": [2, 3, 4, 7, 8, 11, 12, 15, 16], "anoth": [2, 3, 7, 8, 9, 11, 12, 13, 15, 16, 17], "expect": [2, 3, 4, 7, 8, 9, 11, 12, 13, 17], "look": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "doe": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16], "consid": [2, 3, 4, 5, 7, 8, 12, 13, 15, 16, 17], "closest": [2, 3, 11, 16], "among": [2, 11, 15, 17], "major": [2, 3, 4, 8, 12, 13, 16, 17], "shown": [2, 3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "vote": [2, 3, 8], "three": [2, 3, 4, 7, 8, 9, 10, 11, 12, 15, 16, 17], "chose": [2, 3, 16], "noth": [2, 7, 8, 13], "though": [2, 3, 7, 8, 
11, 12, 13, 15, 16, 17], "odd": [2, 11], "avoid": [2, 3, 13, 16], "ti": [2, 11], "decid": [2, 3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "often": [2, 3, 4, 5, 7, 8, 9, 11, 12, 15, 16, 17], "just": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "denot": [2, 4, 8, 11, 12, 13, 16, 17], "a_x": 2, "a_i": 2, "b_x": 2, "b_y": 2, "definit": [2, 5, 11, 16, 17], "plane": [2, 13], "formula": [2, 3, 4, 12, 13, 16], "mathrm": [2, 3], "sqrt": [2, 3, 12, 15], "select": [2, 4, 7, 9, 11, 12, 13, 14, 15, 16], "correspond": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "smallest": [2, 8, 12, 16, 17], "code": [2, 3, 5, 8, 10, 11, 12, 14, 15, 16, 17], "add": [2, 3, 4, 7, 8, 11, 12, 14, 16, 17], "root": [2, 3, 11, 12, 15], "nsmallest": [2, 12, 16], "new_obs_perimet": 2, "new_obs_concav": 2, "dist_from_new": 2, "112": 2, "241202": 2, "653051": 2, "880626": 2, "258": 2, "750277": 2, "870061": 2, "979663": 2, "351": 2, "622700": 2, "541410": 2, "143088": 2, "430": 2, "416930": 2, "314364": 2, "256806": 2, "152": 2, "160091": 2, "039155": 2, "279258": 2, "tabl": [2, 3, 6, 8, 9, 11, 14, 16, 17], "mathemat": [2, 3, 7, 12, 13, 16], "detail": [2, 3, 4, 8, 9, 11, 13, 14, 15, 16, 17], "24": [2, 15, 16], "65": [2, 3, 7, 11, 12, 17], "88": [2, 3], "75": [2, 3, 7, 8, 11, 12, 13, 16, 17], "87": [2, 3, 12], "98": [2, 8, 13, 16], "54": [2, 3, 15, 16, 17], "14": [2, 3, 4, 7, 8, 9, 11, 15, 16, 17], "42": [2, 7, 15, 16, 17], "31": [2, 3, 15, 16, 17], "26": [2, 3, 15, 16], "16": [2, 3, 4, 7, 11, 15, 16], "04": [2, 11, 14, 16, 17], "28": [2, 4, 12, 13, 15, 16], "circl": [2, 9, 15, 16], "although": [2, 3, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "toward": [2, 7, 8, 15], "exactli": [2, 3, 7, 8, 11, 12, 13, 14, 16], "appli": [2, 3, 5, 8, 12, 13, 16], "higher": [2, 3, 7, 8, 12, 13, 16, 17], "help": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "a_": 2, "dot": [2, 8, 11, 12, 13, 16], "b_": 2, "becom": [2, 3, 4, 5, 7, 8, 9, 12, 13, 15, 17], "still": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16], "space": [2, 9, 11, 12, 13, 
14, 16], "417": [2, 16], "837": 2, "had": [2, 3, 7, 8, 11, 12, 16, 17], "ad": [2, 3, 4, 11, 12, 13, 15], "up": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16], "took": [2, 7, 8], "27": [2, 8, 12, 15, 16], "new_obs_symmetri": 2, "836722": 2, "267368": 2, "400": [2, 12, 17], "334664": 2, "886368": 2, "099359": 2, "472326": 2, "562": 2, "470430": 2, "084810": 2, "154075": 2, "499268": 2, "68": 2, "365450": 2, "812359": 2, "092064": 2, "531594": 2, "055065": 2, "555575": 2, "dimension": 2, "five": [2, 3, 14, 16, 17], "3d": [2, 12, 13], "note": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "recommend": [2, 8, 9, 10, 12, 13, 14, 15, 17], "against": [2, 9, 12, 13], "purpos": [2, 3, 4, 7, 11, 12, 13, 15, 16, 17], "complic": [2, 8, 11, 12, 16], "handl": [2, 3, 8, 16], "multipl": [2, 3, 4, 7, 8, 11, 12, 13, 14, 15, 16], "thankfulli": [2, 4], "implement": [2, 3, 4, 5, 13, 16], "buitinck": 2, "2013": [2, 3, 4, 13], "along": [2, 3, 5, 7, 8, 11, 12, 14, 15, 16], "sklearn": [2, 3, 4, 12, 13], "keep": [2, 3, 7, 8, 11, 14, 15, 16, 17], "simpl": [2, 3, 4, 7, 11, 12, 14, 16, 17], "fewer": [2, 3], "mistak": [2, 3, 12, 16], "tell": [2, 3, 7, 8, 9, 11, 12, 13, 15, 16, 17], "prefer": [2, 3, 4, 11, 13, 16, 17], "regular": [2, 11, 12, 15, 16, 17], "set_config": [2, 3, 4, 12, 13], "notic": [2, 3, 7, 8, 11, 13, 16, 17], "wai": [2, 3, 4, 7, 8, 9, 10, 11, 14, 15, 16, 17], "prefix": 2, "extens": [2, 9, 11, 13, 14, 15, 16], "subsequ": [2, 8, 16], "long": [2, 3, 4, 7, 8, 9, 11, 13, 15, 16], "clutter": [2, 16], "kneighborsclassifi": [2, 3], "38": [2, 4, 12, 15, 16], "charact": [2, 8, 9, 11, 15, 16, 17], "transform_output": [2, 3, 4, 12, 13], "modul": 2, "build": [2, 3, 5, 12, 16], "pick": [2, 3, 4, 11, 13, 15, 16], "store": [2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17], "cancer_train": [2, 3], "specifi": [2, 3, 7, 8, 9, 11, 12, 13, 14, 16, 17], "weight": 2, "control": [2, 3, 9, 10, 11, 14], "uniform": [2, 3, 11], "choic": [2, 3, 4, 7, 12, 15, 16, 17], "weigh": [2, 8], "websit": [2, 3, 6, 11, 13, 15], "knn": 
[2, 3], "n_neighbor": [2, 3, 12], "jupyt": [2, 3, 4, 5, 8, 10, 13, 14], "environ": [2, 3, 4, 5, 8, 9, 13, 14, 15], "pleas": [2, 3, 4, 6, 8, 9, 13], "rerun": [2, 3, 4, 13], "html": [2, 3, 4, 13, 16, 17], "represent": [2, 3, 4, 11, 13], "trust": [2, 3, 4, 7, 13], "notebook": [2, 3, 4, 5, 13, 14, 15], "On": [2, 3, 4, 8, 11, 12, 13, 15, 16, 17], "github": [2, 3, 4, 5, 8, 11, 13, 16], "unabl": [2, 3, 4, 11, 13, 15], "render": [2, 3, 4, 9, 13, 15, 16], "try": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 16, 17], "nbviewer": [2, 3, 4, 13], "org": [2, 3, 4, 7, 8, 11, 13, 16, 17], "kneighborsclassifierkneighborsclassifi": [2, 3], "fit": [2, 3, 4, 12, 13, 16], "much": [2, 3, 4, 5, 7, 8, 11, 12, 13, 16, 17], "outsid": [2, 3, 7, 9, 12, 13, 15, 16], "heavi": 2, "lift": 2, "modifi": [2, 3, 15], "after": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "itself": [2, 3, 5, 7, 11, 13, 16, 17], "ran": 2, "manual": [2, 3, 4, 7, 9, 11, 12, 14, 17], "time": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 17], "new_ob": 2, "Is": [2, 4, 8, 12, 16, 17], "don": [2, 3, 4, 7, 8, 9, 11, 12, 15, 16, 17], "t": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "necessarili": [2, 3, 8, 17], "correct": [2, 3, 8, 14, 15, 16, 17], "quantifi": [2, 3, 13], "think": [2, 3, 5, 8, 9, 11, 13, 17], "rang": [2, 3, 4, 11, 12, 13, 16, 17], "matter": [2, 12, 16, 17], "identifi": [2, 3, 4, 8, 10, 11, 12, 15, 16], "effect": [2, 4, 7, 8, 12, 13, 14, 17], "But": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "doesn": [2, 3, 9, 11, 16, 17], "salari": 2, "dollar": [2, 7, 11, 12, 13], "job": [2, 11, 16], "1000": [2, 3, 7, 16], "huge": [2, 11], "compar": [2, 3, 7, 8, 11, 12, 15, 16, 17], "conceptu": [2, 15], "opposit": 2, "yearli": 2, "temperatur": 2, "degre": 2, "kelvin": 2, "celsiu": 2, "constant": [2, 13, 16], "shift": [2, 8, 9], "273": [2, 17], "even": [2, 3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "likewis": [2, 17], "hypothet": 2, "thousand": [2, 3, 11, 16], "singl": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "affect": [2, 3, 8, 
9, 12, 13, 16], "chang": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16, 17], "outcom": [2, 8], "averag": [2, 3, 7, 8, 11, 12, 13, 17], "central": 2, "subtract": [2, 3, 8], "said": [2, 3], "unstandard": [2, 4], "wisconsin": 2, "until": [2, 3, 4, 7, 8, 9, 11, 14, 15, 16, 17], "did": [2, 3, 7, 8, 10, 11, 12, 13, 15, 16, 17], "earlier": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "thing": [2, 3, 7, 9, 11, 14, 15, 16, 17], "unscaled_canc": 2, "wdbc_unscal": [2, 3], "1001": 2, "11840": [2, 3], "1326": 2, "08474": [2, 3], "1203": 2, "10960": [2, 3], "386": 2, "14250": [2, 3], "1297": 2, "10030": [2, 3], "1479": 2, "11100": [2, 3], "1261": 2, "09780": [2, 3], "858": 2, "08455": [2, 3], "1265": 2, "11780": [2, 3], "181": [2, 4], "05263": [2, 3], "unscal": 2, "uncent": 2, "Will": 2, "framework": [2, 5, 13], "preprocessor": [2, 3, 4, 12], "manipul": [2, 11, 17], "standardscal": [2, 3, 4, 12], "transform": [2, 3, 4, 8, 12, 13, 17], "wrap": [2, 3, 4, 12], "columntransform": [2, 3, 4], "make_column_transform": [2, 3, 4, 12], "enabl": [2, 9, 11, 14, 15, 16, 17], "handi": [2, 8, 17], "sequenc": [2, 3, 8, 11, 14, 16], "compos": [2, 3, 4, 8, 12], "x27": [2, 3, 4], "columntransformercolumntransform": [2, 3, 4], "standardscalerstandardscal": [2, 3, 4], "individu": [2, 3, 7, 8, 13, 15, 16], "difficult": [2, 3, 4, 5, 8, 9, 11, 13, 16, 17], "rather": [2, 3, 7, 8, 9, 11, 12, 15, 16, 17], "make_column_selector": [2, 3], "dtype_includ": [2, 3], "equival": [2, 8, 11, 13, 17], "lt": 2, "_column_transform": 2, "0x7f4057c72610": 2, "gt": 2, "readi": [2, 3, 8, 9, 11, 14, 15], "happen": [2, 3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "necessari": [2, 4, 12, 14, 16], "bit": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 16, 17], "unnecessari": 2, "howev": [2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "quantiti": [2, 3, 5, 7, 16, 17], "scaled_canc": 2, "standardscaler__area": 2, "standardscaler__smooth": 2, "984375": 2, "568466": 2, "908708": 2, "826962": 2, "558884": 2, "942210": 2, "764464": 2, "283553": 2, "826229": 
2, "280372": 2, "343856": 2, "041842": 2, "723842": 2, "102458": 2, "577953": 2, "840484": 2, "735218": 2, "525767": 2, "347789": 2, "112085": 2, "woohoo": 2, "input": [2, 3, 4, 8, 11, 12, 15, 17], "behavior": [2, 4, 12, 16, 17], "drop": [2, 3, 9, 14, 15, 16, 17], "remain": [2, 3, 4, 8, 14], "rest": [2, 3, 8, 13, 17], "remaind": [2, 3, 8, 11, 12, 17], "passthrough": 2, "separ": [2, 3, 4, 8, 9, 15, 16], "underscor": [2, 8, 9, 15, 17], "again": [2, 3, 7, 8, 9, 11, 12, 13, 14, 16, 17], "preserv": [2, 3], "verbose_feature_names_out": [2, 4], "fals": [2, 3, 4, 8, 11, 12, 13, 16, 17], "should": [2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 16, 17], "leav": [2, 4, 13], "preprocessor_keep_al": 2, "scaled_cancer_al": 2, "wonder": [2, 7, 11], "technic": [2, 3, 8, 9, 12, 14, 15, 16, 17], "error": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "prone": [2, 3, 11, 17], "accident": [2, 3, 9, 11, 15, 16, 17], "forget": [2, 4, 15], "proper": 2, "free": [2, 3, 13, 15], "requir": [2, 3, 4, 8, 9, 11, 12, 13, 14, 15, 16, 17], "yourself": [2, 4, 8, 11, 13, 15], "further": [2, 3, 4, 7, 8, 9, 11, 13, 16, 17], "automat": [2, 3, 4, 11, 12, 15, 16], "streamlin": 2, "effort": [2, 9, 11, 15], "side": [2, 6, 7, 8, 14, 15, 16], "annot": [2, 4, 16], "within": [2, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "nearli": [2, 4, 13, 17], "vertic": [2, 7, 8, 12, 13, 16, 17], "align": [2, 11, 16], "black": [2, 4, 11, 12, 16], "region": [2, 3, 11, 12, 17], "domin": 2, "intuit": [2, 3, 12, 16, 17], "reason": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "carefulli": [2, 4, 8, 11, 17], "domain": [2, 8, 11, 16], "comparison": [2, 7, 13, 16, 17], "potenti": [2, 3, 4, 12, 13, 17], "issu": [2, 8, 9, 11, 13, 14, 16, 17], "imbal": 2, "overal": [2, 3, 8, 12, 16], "pattern": [2, 3, 4, 7, 8, 11, 12, 13, 16, 17], "otherwis": [2, 3, 4, 7, 8, 16], "rare": [2, 4, 16], "malici": 2, "detect": [2, 4], "rarer": 2, "unimport": 2, "revisit": [2, 3, 11, 13, 17], "head": [2, 9, 11, 14, 15, 16], "top": [2, 3, 6, 8, 9, 11, 12, 13, 14, 16, 17], 
"concat": [2, 7], "glue": 2, "filter": [2, 7, 11, 16], "back": [2, 3, 5, 7, 9, 11, 12, 13, 14, 15, 16, 17], "concaten": [2, 7], "axi": [2, 8, 12, 13, 15, 17], "yield": [2, 3, 7], "taller": 2, "horizont": [2, 8, 16], "produc": [2, 3, 5, 8, 9, 13, 16, 17], "wider": [2, 7, 8, 17], "imbalanc": [2, 3], "rare_canc": 2, "rare_plot": 2, "With": [2, 4, 8, 11, 16, 17], "least": [2, 3, 4, 7, 9, 16], "win": 2, "highlight": [2, 4, 7, 9, 11, 12, 13, 14, 15, 17], "13": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "background": [2, 3, 7, 11, 13, 16], "blue": [2, 4, 9, 12, 15, 17], "indic": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "despit": [2, 3, 11, 16], "simplic": [2, 3, 15], "sound": [2, 3, 9], "manner": [2, 5, 9, 13], "fairli": [2, 3, 7, 14, 16], "nuanc": 2, "suffic": [2, 7], "rebal": 2, "oversampl": 2, "replic": [2, 7], "power": [2, 3, 5, 8, 11, 15, 16, 17], "own": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "increas": [2, 3, 4, 5, 7, 12, 13, 16, 17], "n": [2, 3, 4, 7, 8, 11, 12, 16, 17], "randomli": [2, 3, 4, 7, 13], "properli": [2, 3, 16], "random": [2, 7, 12, 13], "malignant_canc": 2, "benign_canc": 2, "malignant_cancer_upsampl": 2, "upsampled_canc": 2, "vice": [2, 3], "versa": [2, 3], "closer": [2, 16], "upsampl": 2, "wild": [2, 8, 13], "unfortun": [2, 3, 4, 7, 9, 11, 13, 16], "challeng": [2, 15, 17], "reli": [2, 3, 9, 12, 13, 16], "expert": [2, 3, 8, 14], "knowledg": [2, 8, 13, 15, 17], "relat": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "survei": [2, 8, 17], "particip": [2, 3], "margin": [2, 9], "peopl": [2, 5, 7, 8, 9, 12, 13, 15, 16, 17], "respond": [2, 11, 15], "certain": [2, 8, 11, 15, 16], "kind": [2, 3, 4, 7, 8, 11, 16], "fear": [2, 8], "honestli": 2, "neg": [2, 3, 9, 12, 13, 15, 16, 17], "consequ": [2, 3, 7, 9, 17], "simpli": [2, 3, 11, 16, 17], "throw": 2, "awai": [2, 3, 7, 11, 12, 13, 15, 17], "bia": 2, "conclus": [2, 7, 8, 16], "inadvert": [2, 9], "ignor": [2, 3, 8, 12, 17], "easili": [2, 3, 4, 8, 9, 10, 11, 15, 16, 17], "mislead": 2, 
"detriment": 2, "impact": [2, 4, 5, 7, 13, 17], "techniqu": [2, 3, 4, 5, 7, 8, 11, 13, 16], "deal": [2, 9, 11], "isn": [2, 3, 8, 11, 12, 16], "anyth": [2, 3, 8, 13, 17], "els": [2, 8, 9, 11], "subset": [2, 7, 9, 11, 12, 13, 17], "missing_canc": 2, "wdbc_miss": 2, "nan": [2, 11, 17], "475956": 2, "834601": 2, "386808": 2, "169878": 2, "160508": 2, "137124": 2, "henc": [2, 3, 4, 9, 11, 12, 16], "too": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "accomplish": [2, 3, 7, 8, 9, 16, 17], "dropna": 2, "no_missing_canc": 2, "strategi": [2, 3, 16], "fill": [2, 9, 11, 13, 16], "synthet": 2, "simpleimput": 2, "simpleimputersimpleimput": 2, "directli": [2, 3, 4, 7, 8, 9, 14, 15, 17], "imputed_canc": 2, "846860": 2, "384942": 2, "document": [2, 4, 9, 10, 11, 14, 15, 16, 17], "crucial": 2, "critic": [2, 5, 7, 8, 9, 13, 16, 17], "chain": [2, 17], "intermedi": [2, 8], "whole": [2, 3, 4, 7, 11, 15, 17], "scratch": [2, 7, 15, 16], "nn": [2, 3], "knn_pipelin": [2, 3], "pipelinepipelin": [2, 3, 4], "500": [2, 7, 12, 13], "075": 2, "1500": 2, "new_observ": 2, "second": [2, 3, 4, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17], "15": [2, 3, 4, 7, 8, 9, 11, 13, 15, 16, 17], "seen": [2, 3, 5, 12, 13, 15, 16, 17], "littl": [2, 3, 11, 12, 13, 16, 17], "grid": [2, 3, 12, 16], "meshgrid": 2, "numpi": [2, 3, 4, 7, 11, 12, 13, 16, 17], "high": [2, 3, 5, 7, 8, 9, 10, 13], "transpar": [2, 8], "low": [2, 3, 13], "opac": [2, 12, 16], "np": [2, 3, 4, 7, 12, 13], "val": 2, "arrang": [2, 7, 8, 16], "are_grid": 2, "linspac": 2, "min": [2, 12, 13, 16, 17], "95": [2, 3, 7, 8, 11, 13, 16], "max": [2, 3, 12, 13, 16, 17], "05": [2, 8, 11, 16], "50": [2, 3, 7, 8, 11, 12, 13, 15, 17], "smo_grid": 2, "asgrid": 2, "reshap": [2, 17], "knnpredgrid": 2, "bind": 2, "prediction_t": 2, "copi": [2, 11, 15, 17], "unscaled_plot": 2, "mark_point": [2, 16], "40": [2, 3, 7, 8, 11, 15, 16, 17], "nice": [2, 3, 9, 11, 13, 16], "fade": 2, "prediction_plot": 2, "300": [2, 3, 7, 16, 17], "accompani": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 
17], "repositori": [2, 3, 4, 7, 8, 11, 12, 13, 14, 16, 17], "launch": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "browser": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "click": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "binder": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "button": [2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "view": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "download": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "sure": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "instruct": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "setup": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "ensur": [2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "intend": [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17], "blb": 2, "lar": 2, "gill": 2, "loupp": 2, "mathieu": 2, "blondel": 2, "fabian": 2, "pedregosa": 2, "andrea": 2, "mueller": 2, "olivi": 2, "grisel": 2, "vlad": 2, "nicula": 2, "peter": [2, 12], "prettenhof": 2, "alexandr": 2, "gramfort": 2, "jaqu": 2, "grobler": 2, "robert": [2, 3, 4, 13], "layton": 2, "jake": 2, "vanderpla": [2, 16], "arnaud": 2, "joli": 2, "brian": [2, 16], "holt": 2, "ga": [2, 16], "\u00eb": 2, "varoquaux": 2, "api": 2, "design": [2, 3, 9, 11, 15, 16, 17], "ecml": 2, "pkdd": 2, "mine": [2, 7], "108": [2, 3], "122": [2, 3], "ch67": [2, 12], "thoma": [2, 12], "ieee": [2, 4, 12], "transact": [2, 4, 12], "21": [2, 3, 8, 11, 12, 15, 16], "fh51": [2, 12], "evelyn": [2, 3, 12], "joseph": [2, 12], "discriminatori": [2, 12], "discrimin": [2, 3, 12], "consist": [2, 4, 7, 8, 11, 12, 14, 15, 16, 17], "properti": [2, 3, 8, 11, 12, 13, 16, 17], "report": [2, 3, 7, 8, 9, 12, 16, 17], "usaf": [2, 12], "school": [2, 5, 8, 12], "aviat": [2, 12], "medicin": [2, 12], "randolph": [2, 12], "field": [2, 5, 11, 12, 16], "texa": [2, 12], "swm93": [2, 3], "nuclear": [2, 3], "intern": [2, 3, 6, 16], "symposium": [2, 3], "electron": [2, 3, 15], "technolog": [2, 3, 16], "stanfordhcare21": 2, "url": 
[2, 3, 4, 7, 8, 13, 14, 15, 16, 17], "http": [2, 3, 4, 6, 7, 8, 10, 11, 13, 14, 15, 16, 17], "stanfordhealthcar": 2, "medic": [2, 3], "condit": 2, "continu": [3, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17], "its": [3, 4, 8, 9, 11, 12, 13, 15, 16, 17], "describ": [3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "matric": 3, "neighbor": [3, 13], "k": [3, 8, 11, 16], "nearest": [3, 4, 13], "estim": [3, 7, 8, 10, 12, 13], "underfit": [3, 13], "advantag": [3, 4, 7, 11, 12, 13, 14, 15, 16, 17], "disadvantag": [3, 4, 12, 13, 16], "wrong": [3, 7, 8, 13, 16, 17], "cancer": 3, "ask": [3, 4, 5, 7, 11, 12, 13, 15, 16, 17], "kei": [3, 5, 7, 8, 11, 14, 15, 16, 17], "impli": [3, 7], "between": [3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "oppos": [3, 11, 12, 16, 17], "memor": 3, "visit": [3, 6, 7, 8, 11, 14, 15, 16], "hospit": 3, "more": [3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "trick": 3, "asid": [3, 8, 11, 13], "match": [3, 11, 12, 13, 15, 16, 17], "observ": [3, 4, 7, 8, 12, 13, 15, 16, 17], "confid": [3, 7, 12], "golden": 3, "rule": [3, 7, 8, 12, 16], "cannot": [3, 4, 7, 8, 9, 11, 12, 13, 15, 16, 17], "than": [3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "realli": [3, 7, 8, 11, 12, 16, 17], "imagin": [3, 7, 9, 11, 15, 16, 17], "bad": [3, 4, 11, 16], "overestim": [3, 7], "made": [3, 4, 8, 12, 13, 14, 15, 16, 17], "frac": [3, 4, 7, 12], "summar": [3, 7, 8, 10, 11, 16, 17], "stori": [3, 9, 12, 16], "alon": [3, 7, 15], "comprehens": [3, 4, 7], "each": [3, 4, 7, 8, 10, 11, 12, 13, 15, 16, 17], "correctli": [3, 8, 11, 14, 16, 17], "incorrectli": 3, "57": 3, "bottom": [3, 9, 14, 15], "roughli": [3, 4, 7, 12, 13, 16], "89": [3, 8, 16], "892": 3, "misclassifi": 3, "disastr": 3, "receiv": [3, 11, 15], "particularli": [3, 11, 13, 16], "unaccept": 3, "term": [3, 4, 7, 8, 11, 12, 16, 17], "talk": [3, 11, 16], "four": [3, 4, 8, 10, 16], "perfect": [3, 16], "zero": [3, 4, 12, 13, 16, 17], "almost": [3, 4, 8, 11, 12, 16], "two": [3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17], "commonli": [3, 7, 8, 9, 13, 15, 
16, 17], "metric": [3, 4, 12, 13], "togeth": [3, 4, 7, 9, 11, 16, 17], "inde": [3, 4, 7, 8, 11, 13, 16, 17], "20": [3, 7, 8, 11, 13, 15, 16, 17], "quad": [3, 4], "25": [3, 7, 8, 11, 12, 15, 16, 17], "rel": [3, 4, 8, 16], "context": [3, 11, 12, 13, 16, 17], "certainli": [3, 7], "achiev": [3, 8, 12, 16, 17], "guess": [3, 4, 7, 8], "everi": [3, 7, 8, 9, 11, 13, 15, 17], "similarli": [3, 8, 11, 16, 17], "never": [3, 8, 12, 15], "obsev": 3, "Of": [3, 7, 13, 17], "somewher": [3, 8, 11, 12, 16], "extrem": [3, 7, 12, 13], "trade": [3, 4], "off": [3, 4, 7, 13], "fair": [3, 11, 12], "unbias": 3, "influenc": [3, 4, 7, 12, 13, 16], "human": [3, 4, 7, 11, 15, 16, 17], "counter": 3, "main": [3, 8, 14, 17], "tenet": 3, "determin": [3, 4, 7, 12, 14, 15, 16, 17], "everyth": [3, 7, 8, 14, 17], "point": [3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], "investig": [3, 7, 8, 11, 16], "integ": [3, 11, 16, 17], "At": [3, 8, 9, 10, 11, 13], "track": [3, 7, 8, 15, 17], "to_list": 3, "convert": [3, 7, 11, 12, 16, 17], "nums_0_to_9": 3, "5": [3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "random_numbers1": 3, "appear": [3, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16], "fresh": [3, 9], "batch": 3, "random_numbers2": 3, "forc": [3, 16], "random_numbers1_again": 3, "random_numbers2_again": 3, "And": [3, 7, 8, 11, 12, 13, 15, 16, 17], "4235": 3, "random_numbers1_differ": 3, "random_numbers2_differ": 3, "beyond": [3, 4, 8, 11, 12, 13, 14, 15, 16, 17], "explicitli": [3, 11, 15, 16, 17], "insert": [3, 15, 17], "therebi": [3, 16], "global": [3, 16], "drawback": 3, "buri": 3, "undesir": 3, "entir": [3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "plai": [3, 8, 11, 14], "random_st": 3, "pcg64": 3, "rng": 3, "random_numbers1_third": 3, "random_numbers2_third": 3, "load": [3, 4, 9, 10, 11, 12, 13, 16, 17], "quick": [3, 8, 11], "re": [3, 4, 8, 9, 10, 11, 12, 15, 16, 17], "scale": [3, 4, 11, 12, 13, 15, 16], "done": [3, 8, 9, 11, 14, 15, 16, 17], "preliminari": 3, "train_test_split": [3, 12, 13], "shuffl": 3, 
"stratifi": [3, 12], "exist": [3, 8, 9, 11, 13, 14, 15, 16, 17], "train_siz": [3, 12, 13], "model_select": [3, 12, 13], "cancer_test": 3, "index": [3, 8, 11, 15, 17], "426": 3, "196": [3, 4, 7, 8, 12], "296": 3, "43": [3, 4, 15, 16], "143": 3, "116": 3, "miss": [3, 4, 16, 17], "626761": 3, "373239": 3, "last": [3, 7, 8, 10, 11, 15, 16, 17], "sensit": [3, 8, 13], "consider": 3, "aspect": [3, 7, 13, 16], "fortun": [3, 7, 8, 11, 12, 13, 17], "construct": [3, 7, 8, 11, 16, 17], "cancer_preprocessor": 3, "augment": [3, 4], "864726": 3, "146": [3, 7], "869691": 3, "86": 3, "86135501": 3, "846226": 3, "105": [3, 7, 8, 11, 16, 17], "863030": 3, "244": 3, "884180": 3, "23": [3, 11, 15, 16, 17], "851509": 3, "125": [3, 8, 17], "86561": 3, "281": 3, "8912055": 3, "84799002": 3, "score": [3, 11, 12], "8951048951048951": 3, "90": [3, 7, 8, 16, 17], "precision_scor": 3, "recall_scor": 3, "y_true": [3, 12, 13], "y_pred": [3, 12, 13], "pos_label": 3, "8275862068965517": 3, "9056603773584906": 3, "83": 3, "91": [3, 7], "crosstab": 3, "alphabet": [3, 8, 16, 17], "80": [3, 17], "48": [3, 7, 8, 15, 16], "agre": [3, 11, 13], "displaystyl": 3, "51": [3, 7, 15, 16], "82": [3, 13, 16], "76": 3, "That": [3, 7, 8, 11, 12, 16, 17], "pretti": [3, 7, 11], "wait": [3, 8, 11, 12, 13, 16, 17], "Or": [3, 7, 13], "someth": [3, 4, 7, 8, 9, 11, 12, 15, 16, 17], "99": [3, 4, 7, 12, 13, 16], "terribl": 3, "impress": [3, 16], "attent": [3, 8, 12, 17], "sacrif": 3, "easi": [3, 4, 8, 9, 11, 13, 15, 16, 17], "baselin": [3, 16], "regardless": [3, 11, 12, 16], "sens": [3, 4, 7, 8, 12, 13, 16, 17], "hope": [3, 11, 13, 16], "signific": [3, 5, 8], "Be": [3, 11, 12, 16], "enough": [3, 7, 8, 11, 12, 13, 15, 16, 17], "usual": [3, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "suspect": [3, 4, 7], "built": [3, 8, 9, 14, 17], "perspect": [3, 4, 12, 17], "hoorai": 3, "cautiou": 3, "misdiagnos": 3, "vast": [3, 4, 16, 17], "behav": [3, 7, 13], "principl": [3, 5, 16], "ideal": [3, 9, 12, 17], "somehow": [3, 7, 11], "hasn": 3, 
"yet": [3, 7, 8, 9, 11, 12, 15, 16, 17], "rememb": [3, 7, 8, 9, 11, 13, 16, 17], "touch": [3, 16], "dai": [3, 9, 11, 15, 16], "strongli": [3, 10, 13], "whatev": [3, 4, 8, 16], "lucki": [3, 7], "perhap": [3, 7, 8, 9, 11, 12, 13, 16], "sub": [3, 11], "cancer_subtrain": 3, "cancer_valid": 3, "acc": 3, "897196261682243": 3, "repeat": [3, 4, 5, 7, 15], "none": [3, 4, 11, 13, 15, 17], "underli": [3, 4, 8], "reduc": [3, 4, 11, 16], "un": [3, 4], "c": [3, 8, 11], "evenli": [3, 12], "chunk": [3, 13], "iter": [3, 4, 8, 15, 16, 17], "fold": [3, 12], "cross_valid": 3, "cv": [3, 12], "cancer_pip": 3, "cv_5_df": 3, "fit_tim": 3, "score_tim": 3, "test_scor": 3, "004744": 3, "006074": 3, "837209": 3, "003767": 3, "005599": 3, "870588": 3, "003626": 3, "005471": 3, "894118": 3, "003746": 3, "005538": 3, "003584": 3, "005502": 3, "882353": 3, "aggreg": [3, 8], "sem": 3, "uncertain": [3, 7, 12], "scope": [3, 4, 8, 12, 13, 14, 15, 16], "01": [3, 7, 11, 16, 17], "cv_5_metric": 3, "agg": [3, 13, 17], "003894": 3, "005637": 3, "870971": 3, "000215": 3, "000111": 3, "009501": 3, "limit": [3, 4, 11, 13, 15, 16, 17], "speed": 3, "trial": [3, 16], "cv_10": 3, "cv_10_df": 3, "cv_10_metric": 3, "003672": 3, "004243": 3, "884939": 3, "000032": 3, "000034": 3, "006718": 3, "slightli": [3, 7, 11, 12, 13, 16], "due": [3, 4, 5, 7, 11, 17], "reduct": 3, "dramat": 3, "cv_50_df": 3, "cv_50_metric": 3, "003633": 3, "003160": 3, "888056": 3, "000011": 3, "000008": 3, "003005": 3, "downstream": 3, "expens": [3, 11], "chemo": 3, "radiat": 3, "therapi": 3, "death": 3, "mispredict": 3, "gridsearchcv": [3, 12], "unspecifi": 3, "cancer_tune_pip": 3, "tunabl": 3, "get_param": [3, 12], "verbos": 3, "columntransformer__n_job": 3, "columntransformer__remaind": 3, "columntransformer__sparse_threshold": 3, "columntransformer__transformer_weight": 3, "columntransformer__transform": 3, "columntransformer__verbos": 3, "columntransformer__verbose_feature_names_out": 3, "columntransformer__standardscal": 3, 
"columntransformer__standardscaler__copi": 3, "columntransformer__standardscaler__with_mean": 3, "columntransformer__standardscaler__with_std": 3, "kneighborsclassifier__algorithm": 3, "auto": [3, 15], "kneighborsclassifier__leaf_s": 3, "30": [3, 5, 7, 8, 11, 12, 15, 16, 17], "kneighborsclassifier__metr": 3, "minkowski": 3, "kneighborsclassifier__metric_param": 3, "kneighborsclassifier__n_job": 3, "kneighborsclassifier__n_neighbor": 3, "kneighborsclassifier__p": 3, "kneighborsclassifier__weight": 3, "wow": [3, 7, 16], "stuff": 3, "sift": 3, "muck": [3, 11], "stand": [3, 11, 12, 16], "parameter_grid": 3, "allow": [3, 4, 7, 8, 9, 10, 11, 12, 15, 16, 17], "greater": [3, 4, 11, 17], "third": [3, 4], "skip": [3, 9, 17], "96": [3, 13, 16], "emploi": [3, 5, 11, 12], "okai": [3, 16, 17], "param_grid": [3, 12], "cancer_tune_grid": 3, "cv_results_": [3, 12], "format": [3, 5, 10, 11, 12, 13, 17], "accuracies_grid": 3, "19": [3, 7, 11, 15, 16], "mean_fit_tim": 3, "std_fit_tim": 3, "mean_score_tim": 3, "std_score_tim": 3, "param_kneighborsclassifier__n_neighbor": 3, "param": 3, "split0_test_scor": 3, "split1_test_scor": 3, "split2_test_scor": 3, "split3_test_scor": 3, "split4_test_scor": 3, "split5_test_scor": 3, "split6_test_scor": 3, "split7_test_scor": 3, "split8_test_scor": 3, "split9_test_scor": 3, "mean_test_scor": [3, 12], "17": [3, 4, 11, 13, 15, 16], "std_test_scor": [3, 12], "18": [3, 4, 7, 8, 11, 15, 16], "rank_test_scor": 3, "int32": [3, 17], "param_kneighbors_classifier__n_neighbor": 3, "unus": 3, "sem_test_scor": [3, 12], "845127": 3, "019966": 3, "873200": 3, "015680": 3, "861517": 3, "019547": 3, "861573": 3, "017787": 3, "866279": 3, "017889": 3, "875637": 3, "016026": 3, "885050": 3, "015406": 3, "36": [3, 4, 15, 16], "887375": 3, "013694": 3, "41": [3, 15, 16, 17], "46": [3, 4, 15, 16], "887320": 3, "013314": 3, "882669": 3, "014523": 3, "56": [3, 8], "878018": 3, "014414": 3, "61": [3, 7, 8, 11], "880343": 3, "014299": 3, "66": [3, 7, 11, 12], "015416": 3, 
"71": [3, 11], "877962": 3, "013660": 3, "014698": 3, "81": [3, 16], "880288": 3, "011277": 3, "875581": 3, "012967": 3, "008193": 3, "shortcut": [3, 9, 16], "layer": 3, "accuracy_vs_k": 3, "mark_lin": [3, 4, 12, 13, 16], "neighbour": [3, 12], "highest": [3, 17], "best_params_": [3, 12], "vari": [3, 7, 12, 13, 14, 16, 17], "exact": [3, 7, 13, 16], "justifi": [3, 16], "optim": [3, 11, 12], "decreas": [3, 4, 7, 16, 17], "reliabl": [3, 5, 7, 9, 16], "uncertainti": [3, 7], "cost": [3, 7, 12, 13], "prohibit": [3, 12], "large_param_grid": 3, "385": 3, "large_cancer_tune_grid": 3, "large_accuracies_grid": 3, "large_accuracy_vs_k": 3, "farther": [3, 16], "sort": [3, 4, 8, 9, 11, 13, 16, 17], "boundari": [3, 13], "simpler": 3, "stronger": 3, "regard": [3, 7, 8, 9, 12, 13, 17], "themselv": [3, 5, 11, 16], "noisi": [3, 12, 16], "jag": 3, "essenti": [3, 7, 8, 9, 11, 12, 17], "problemat": [3, 9, 11, 16], "unreli": [3, 7, 13], "strike": 3, "balanc": [3, 5, 7], "qualiti": [3, 5, 9, 12, 13], "retrain": [3, 12], "9090909090909091": 3, "8846153846153846": 3, "8679245283018868": 3, "84": [3, 16], "glanc": 3, "surpris": 3, "knew": 3, "return": [3, 4, 7, 8, 11, 13, 14, 17], "put": [3, 7, 11, 12, 13, 14, 15, 17], "defin": [3, 7, 8, 10, 11, 12, 13, 16, 17], "execut": [3, 11, 15], "search": [3, 4, 11, 14, 15], "strength": [3, 13, 16], "weak": [3, 12, 13, 16], "assumpt": [3, 4, 12, 13], "multi": 3, "slow": [3, 9, 12, 13], "treat": [3, 4, 8, 15, 16, 17], "accept": [3, 11, 12, 14, 15], "wors": [3, 8, 17], "meaning": [3, 4, 8, 11, 13, 15], "cancer_irrelev": 3, "irrelevant1": 3, "irrelevant2": 3, "30010": 3, "08690": 3, "132": [3, 7], "19740": 3, "130": [3, 7, 17], "00": [3, 7, 17], "24140": 3, "77": [3, 7], "58": [3, 16], "19800": 3, "135": [3, 7, 13, 16], "24390": 3, "142": 3, "14400": 3, "131": 3, "09251": 3, "35140": 3, "140": [3, 7], "00000": [3, 7], "47": [3, 4, 15, 16], "92": [3, 7], "increasingli": [3, 11], "distanc": [3, 4, 12, 13, 16], "corrupt": 3, "outperform": 3, "combat": 3, 
"extra": [3, 11, 13], "nois": [3, 16], "smoothli": 3, "trend": [3, 7, 8, 12, 13, 16], "corrobor": 3, "evid": [3, 5], "untun": 3, "scientif": [3, 12, 13, 15], "clear": [3, 4, 5, 7, 8, 13, 15, 16, 17], "cut": 3, "obviou": [3, 9, 13, 16], "relev": [3, 11, 12, 13], "consum": [3, 7, 17], "systemat": 3, "beal": 3, "hock": 3, "lesli": 3, "moder": 3, "ab": [3, 11, 12], "bc": [3, 7, 8], "ac": 3, "abc": 3, "million": [3, 13, 16], "computation": 3, "draper": 3, "smith": 3, "1966": 3, "eforymson": 3, "straightforward": [3, 11, 16], "form": [3, 4, 7, 8, 11, 12, 13, 16, 17], "updat": [3, 4, 14, 15, 16], "big": [3, 7, 8, 11, 15, 16], "55": [3, 7, 12, 16, 17], "caution": [3, 9, 11], "move": [3, 8, 10, 12, 13, 15, 16], "likelihood": 3, "unlucki": [3, 4], "stumbl": 3, "risk": [3, 12], "suffer": 3, "turn": [3, 4, 8, 11, 12, 13, 17], "smaller": [3, 12, 13, 16], "irrelevant3": 3, "full": [3, 7, 8, 11, 13, 15, 16, 17], "cancer_subset": 3, "sequentialfeatureselector": 3, "tri": [3, 4, 12, 13, 16], "flexibl": [3, 9, 13, 17], "resort": 3, "loop": [3, 17], "flow": 3, "mckinnei": [3, 11, 16, 17], "2012": [3, 8, 11, 16, 17], "n_total": 3, "check": [3, 8, 10, 11, 15, 16, 17], "j": [3, 8, 11], "len": [3, 11], "accuracy_dict": 3, "selected_predictor": 3, "empti": [3, 9, 15], "n_job": 3, "best_set": 3, "argmax": 3, "append": [3, 11, 16, 17], "join": [3, 11, 15], "del": [3, 16], "891103": 3, "917450": 3, "931454": 3, "926253": 3, "906955": 3, "exhibit": [3, 9], "fluctuat": [3, 12], "attempt": [3, 4, 16], "account": [3, 14, 15], "chanc": [3, 7, 14], "elbow": [3, 4], "successfulli": [3, 9, 11, 15], "judgement": 3, "excel": [3, 8, 13, 15], "tutori": [3, 5, 9, 11, 13], "go": [3, 7, 8, 10, 11, 13, 14, 16], "jame": [3, 4, 11, 13], "great": [3, 4, 7, 8, 9, 11, 13, 15, 16], "naiv": 3, "bay": 3, "goe": [3, 8, 9, 11, 13], "popular": [3, 4, 5, 11, 13, 15], "bkm67": 3, "martin": 3, "lansdown": 3, "mauric": 3, "georg": 3, "kendal": 3, "david": [3, 7], "mann": 3, "discard": 3, "multivari": 3, "biometrika": 3, 
"366": 3, "ds66": 3, "norman": 3, "harri": 3, "wilei": [3, 16], "efo66": 3, "stepwis": 3, "backward": 3, "eastern": 3, "meet": 3, "hl67": 3, "ronald": 3, "technometr": 3, "531": 3, "540": 3, "jwht13": [3, 4, 13], "gareth": [3, 4, 13], "daniela": [3, 4, 13], "witten": [3, 4, 13], "hasti": [3, 4, 13], "tibshirani": [3, 4, 13], "springer": [3, 4, 13, 16], "1st": [3, 4, 13], "edit": [3, 4, 13, 14, 16], "www": [3, 4, 8, 11, 13], "statlearn": [3, 4, 13], "com": [3, 4, 7, 11, 13, 14, 15, 16], "mck12": [3, 11, 16, 17], "ipython": [3, 11, 14, 16, 17], "o": [3, 8, 11, 14, 16, 17], "reilli": [3, 11, 16, 17], "media": [3, 9, 11, 16, 17], "inc": [3, 7, 11, 16, 17], "subgroup": [4, 8, 16, 17], "predict": [4, 5, 8, 10, 12, 13, 16], "differenti": 4, "classif": [4, 5, 8, 10, 12, 13], "variabl": [4, 7, 8, 9, 11, 12, 13, 16, 17], "scikit": [4, 12, 13], "set": [4, 7, 9, 10, 11, 13, 15, 17], "genet": [4, 16], "ancestr": 4, "subpopul": 4, "onlin": [4, 7, 11, 14, 15, 16], "custom": [4, 16], "uncov": [4, 9, 16], "fundament": [4, 7, 8, 16], "supervis": 4, "unsupervis": 4, "imposs": [4, 7], "articl": [4, 8], "wikipedia": [4, 11], "evalu": [4, 7, 8, 13, 16], "test": [4, 7, 13, 14], "good": [4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "ascertain": 4, "rigor": [4, 5, 12], "lloyd": 4, "1982": 4, "hierarch": 4, "princip": 4, "compon": [4, 8], "multidimension": 4, "semisupervis": 4, "goal": [4, 8, 12, 16, 17], "benefici": [4, 11], "unlabel": [4, 8], "willing": [4, 7], "seed": [4, 7, 12, 13], "palmerpenguin": 4, "horst": 4, "2020": [4, 7, 8, 16], "kristen": 4, "gorman": 4, "palmer": 4, "station": [4, 16], "antarctica": [4, 16], "ecolog": 4, "site": [4, 6, 11], "adult": 4, "penguin": 4, "2014": [4, 5, 17], "bill": 4, "flipper": 4, "millimet": 4, "distinct": [4, 9, 16], "speci": 4, "discoveri": [4, 13], "gentoo": 4, "bill_length_mm": 4, "flipper_length_mm": 4, "39": [4, 15, 16], "182": 4, "34": [4, 15, 16, 17], "187": [4, 7, 12], "190": [4, 12, 17], "195": [4, 8, 17], "193": [4, 12], "213": [4, 8, 11, 
16, 17], "215": [4, 17], "45": [4, 8, 11, 15, 16, 17], "220": [4, 17], "49": [4, 15, 16], "208": 4, "52": [4, 15], "197": 4, "189": [4, 7], "penguins_standard": 4, "bill_length_standard": 4, "flipper_length_standard": 4, "641361": 4, "189773": 4, "144917": 4, "328412": 4, "517922": 4, "921755": 4, "107617": 4, "846513": 4, "409743": 4, "677761": 4, "238168": 4, "271104": 4, "902464": 4, "433767": 4, "720106": 4, "192860": 4, "645505": 4, "355522": 4, "962559": 4, "440353": 4, "762179": 4, "205012": 4, "111528": 4, "123299": 4, "786203": 4, "626855": 4, "757407": 4, "783170": 4, "108442": 4, "776057": 4, "759092": 4, "subtyp": 4, "scatter_plot": 4, "meaningless": 4, "etc": [4, 7, 8, 11, 15, 16, 17], "adjust": [4, 16], "sum": [4, 12, 17], "wssd": 4, "inertia": 4, "mu_x": 4, "mu_i": 4, "x_1": 4, "x_2": 4, "x_3": 4, "x_4": 4, "y_1": 4, "y_2": 4, "y_3": 4, "y_4": 4, "35": [4, 8, 15, 16, 17], "outlin": [4, 8, 11, 16, 17], "far": [4, 13, 15, 16, 17], "yellow": [4, 17], "variant": 4, "minim": [4, 12, 13, 16], "reassign": 4, "longer": [4, 8, 17], "termin": [4, 14], "onward": [4, 11, 14, 16], "guarante": [4, 14], "forev": 4, "logic": [4, 8, 11, 17], "finit": [4, 7, 16], "unlik": [4, 7, 11, 12, 16], "stuck": [4, 9, 17], "solut": [4, 7, 8], "poor": [4, 11], "lowest": [4, 11, 16], "cross": [4, 12, 13], "valid": [4, 10, 11, 12, 13], "subdivid": 4, "merg": [4, 11], "diminish": 4, "reach": [4, 11, 13, 15, 16], "being": [4, 5, 7, 8, 9, 11, 12, 15, 16, 17], "address": [4, 5, 8, 11, 12, 13, 15], "preprocess": [4, 12], "kmean": 4, "n_cluster": 4, "kmeanskmean": 4, "penguin_clust": 4, "labels_": 4, "altern": [4, 8, 13, 15, 16, 17], "suffix": [4, 16], "nomin": [4, 16], "discret": [4, 16], "cluster_plot": 4, "inertia_": 4, "730719092276117": 4, "varieti": [4, 5, 11, 13, 15, 17], "ks": 4, "oper": [4, 7, 8, 11, 14, 15, 16], "safest": 4, "reus": 4, "penguin_clust_k": 4, "000000": 4, "576264": 4, "730719": 4, "343613": 4, "362131": 4, "678383": 4, "293320": 4, "975016": 4, "785232": 4, 
"elbow_plot": 4, "bump": [4, 16], "prevent": [4, 8, 9, 11, 16, 17], "n_init": 4, "paramet": [4, 7, 11, 12, 13, 16, 17], "realm": 4, "specif": [4, 5, 11, 12, 14, 15, 16], "companion": [4, 11], "pca": 4, "gwf14": 4, "toni": 4, "fraser": 4, "sexual": 4, "dimorph": 4, "commun": [4, 5, 8, 11, 16], "ntarctic": 4, "genu": 4, "pygosc": 4, "plo": [4, 15], "ONE": 4, "hhg20": 4, "allison": 4, "alison": 4, "hill": [4, 11], "archipelago": 4, "allisonhorst": 4, "io": [4, 8, 11, 16], "llo82": 4, "stuart": 4, "quantiz": 4, "pcm": 4, "129": 4, "137": [4, 7, 8, 13], "releas": [4, 11], "bell": [4, 7], "telephon": 4, "paper": [4, 5, 16], "1957": 4, "john": [5, 17], "hopkin": 5, "bloomberg": 5, "public": [5, 8, 15], "2023": [5, 11], "expand": [5, 14, 15, 17], "grown": 5, "significantli": [5, 7, 8, 9, 13, 16, 17], "recent": [5, 11, 14, 15], "attract": 5, "demand": [5, 16], "concurr": 5, "growth": [5, 16], "prolifer": 5, "blog": 5, "post": [5, 11, 15], "fast": [5, 16], "literatur": 5, "inclin": 5, "moment": [5, 12, 17], "former": [5, 12], "activ": [5, 9, 11], "amongst": 5, "practition": 5, "consensu": 5, "date": [5, 11, 15, 16], "element": [5, 11, 16, 17], "clean": [5, 8, 10, 11], "highli": [5, 8, 15], "nevertheless": [5, 13, 16], "emerg": 5, "lack": [5, 15], "agreement": 5, "strong": [5, 13, 16], "propos": [5, 8], "vision": [5, 16], "implic": [5, 7], "engag": 5, "tidi": [5, 8], "tabular": [5, 17], "formal": [5, 8], "hadlei": [5, 17], "wickham": [5, 17], "tidyvers": 5, "independ": [5, 9, 10, 16], "facilit": [5, 15], "audit": [5, 10], "complex": [5, 11, 13, 15, 16], "eas": [5, 15], "clearli": [5, 11, 16], "unobserv": 5, "popul": [5, 7, 8, 11, 16, 17], "succe": 5, "foster": 5, "fluent": 5, "vers": 5, "behind": [5, 11, 15, 16], "immedi": [5, 11, 13], "integr": [5, 11], "git": [5, 14, 15], "train": [5, 13], "ever": [5, 13, 15, 17], "awar": [5, 9, 15], "sophist": [5, 8], "mix": [5, 9, 17], "generaliz": 5, "confront": 5, "web": [6, 9, 15], "navig": [6, 8, 9, 11, 14, 15], "mobil": 6, "devic": 
[6, 11], "menu": [6, 8, 9, 14], "datasciencebook": [6, 10, 11, 14], "ca": [6, 8, 10, 11, 12, 13, 14], "licens": 6, "creativ": 6, "noncommerci": 6, "sharealik": 6, "extend": [7, 13, 16], "inferenti": [7, 8, 10, 12, 16], "interv": 7, "approxim": 7, "broader": 7, "retail": 7, "sell": 7, "iphon": 7, "accessori": 7, "market": [7, 12, 13], "strateg": 7, "product": [7, 8, 15], "north": [7, 11, 16], "american": [7, 8, 11], "colleg": 7, "campus": 7, "america": [7, 16], "owner": [7, 11, 13], "characterist": [7, 8, 11, 16, 17], "costli": 7, "taken": [7, 8, 12, 15, 16], "60": [7, 8], "canada": [7, 8, 11, 16, 17], "apart": [7, 11, 16], "rent": 7, "budget": [7, 12], "studio": 7, "rental": [7, 11], "price": [7, 11, 12, 13], "month": [7, 15, 16], "monthli": 7, "airbnb": 7, "cox": 7, "marketplac": 7, "vacat": 7, "septemb": [7, 16], "neighborhood": 7, "room": 7, "accommod": 7, "bathroom": 7, "bedroom": [7, 11, 12, 13], "bed": [7, 12, 13], "night": 7, "neighbourhood": 7, "room_typ": 7, "downtown": 7, "home": [7, 8, 11, 12, 13, 14, 16, 17], "apt": [7, 14], "bath": [7, 12], "150": [7, 16], "eastsid": 7, "west": 7, "85": [7, 8, 11, 12, 13, 16], "kensington": 7, "cedar": 7, "cottag": 7, "110": 7, "4589": 7, "4590": 7, "4591": 7, "oakridg": 7, "privat": [7, 11, 15], "4592": 7, "dunbar": 7, "southland": 7, "share": [7, 9, 11, 15, 16, 17], "29": [7, 12, 15, 16, 17], "4593": 7, "145": 7, "4594": 7, "shaughnessi": 7, "citi": [7, 11, 12, 17], "plan": [7, 15], "bylaw": 7, "747497": 7, "246408": 7, "005224": 7, "hotel": 7, "000871": 7, "747": 7, "155": [7, 13, 17], "725": 7, "250": [7, 12, 17], "025": 7, "625": 7, "350": [7, 12, 13, 17], "confirm": [7, 15, 16], "histogram": 7, "000": [7, 8, 11, 12, 13, 16], "20_000": 7, "605": 7, "606": 7, "marpol": 7, "4579": 7, "4580": 7, "160": [7, 12], "1739": 7, "1740": 7, "151": [7, 8, 16], "3904": 7, "3905": 7, "185": [7, 17], "1596": 7, "1597": 7, "kitsilano": 7, "3060": 7, "3061": 7, "hast": 7, "sunris": 7, "78": 7, "19999": 7, "527": 7, "528": 7, 
"1587": 7, "1588": 7, "169": [7, 13], "3860": 7, "3861": 7, "2747": 7, "2748": 7, "285": 7, "800000": 7, "0000": 7, "999": 7, "750": [7, 16], "775": 7, "225": [7, 11], "19998": 7, "700": [7, 17], "275": 7, "44552": 7, "reset_index": [7, 17], "caveat": [7, 16, 17], "twice": [7, 13], "sample_proport": 7, "44547": 7, "44548": 7, "44549": 7, "44550": 7, "44551": 7, "sample_estim": 7, "675": 7, "44541": 7, "19995": 7, "44543": 7, "19996": 7, "44545": 7, "19997": 7, "20000": 7, "mind": [7, 8, 11, 15], "sampling_distribut": 7, "mark_bar": [7, 8, 16], "bin": [7, 16], "maxbin": [7, 16], "symmetr": 7, "peak": [7, 16], "74848375": 7, "748": [7, 12], "neither": [7, 12, 16], "nor": [7, 9, 13], "underestim": 7, "tendenc": 7, "travel": 7, "wish": [7, 8, 15], "overpr": [7, 12], "population_distribut": 7, "skew": 7, "tail": [7, 11], "154": [7, 13], "5109773617762": 7, "one_sampl": 7, "sample_distribut": 7, "153": 7, "48225": 7, "wouldn": [7, 15], "alreadi": [7, 8, 9, 10, 11, 12, 13, 14, 16, 17], "mean_pric": 7, "148": 7, "56075": 7, "165": [7, 17], "50500": 7, "93925": 7, "139": 7, "14650": 7, "198": 7, "50000": 7, "192": 7, "66425": 7, "144": 7, "88600": 7, "08800": 7, "156": [7, 12], "25000": 7, "170": 7, "mean_of_sample_mean": 7, "sample_mean": 7, "disappear": 7, "thumb": [7, 16], "emphasi": 7, "saw": [7, 11, 12, 17], "notion": [7, 12], "pretend": 7, "clever": 7, "drawn": [7, 13, 16], "median": [7, 16, 17], "slope": [7, 13], "displai": [7, 8, 9, 11, 13, 15, 16, 17], "4025": 7, "4026": 7, "renfrew": 7, "collingwood": 7, "1977": [7, 16], "1978": 7, "fairview": 7, "70": [7, 11, 16, 17], "4008": 7, "4009": 7, "269": [7, 16], "1543": 7, "1544": 7, "320": 7, "3350": 7, "3351": 7, "804": 7, "805": 7, "mount": 7, "pleasant": 7, "2286": 7, "2287": 7, "1010": 7, "1011": 7, "strathcona": 7, "120": [7, 8, 11, 17], "1878": 7, "1879": [7, 16], "175": 7, "1644": 7, "1645": 7, "2771": 7, "2772": 7, "4151": 7, "4152": 7, "289": 7, "4495": 7, "4496": 7, "rilei": 7, "park": [7, 16], "115": 7, 
"1308": 7, "1309": 7, "2246": 7, "2247": 7, "2335": 7, "2336": 7, "4059": 7, "4060": 7, "1280": 7, "1281": 7, "4324": 7, "4325": 7, "3403": 7, "3404": 7, "arbutu": 7, "ridg": 7, "664": 7, "1729": 7, "1730": 7, "93": [7, 16], "3722": 7, "3723": 7, "241": 7, "242": 7, "3955": 7, "3956": 7, "1042": 7, "1043": 7, "649": 7, "650": [7, 16], "sunset": 7, "1995": [7, 16], "1996": 7, "363": 7, "364": 7, "1783": 7, "1784": 7, "806": 7, "254": 7, "255": 7, "3365": 7, "3366": 7, "4562": 7, "4563": 7, "64": [7, 11, 12, 14], "2124": 7, "2125": 7, "200": [7, 8, 11, 12, 16], "1997": 7, "1998": 7, "257": 7, "4329": 7, "4330": [7, 17], "3408": 7, "3409": 7, "635": 7, "636": 7, "grandview": 7, "woodland": 7, "103": [7, 17], "one_sample_dist": 7, "boot1": 7, "boot1_dist": 7, "ident": [7, 8, 11], "mimic": 7, "break": [7, 11, 12, 13], "boot20000": 7, "six": [7, 8, 10, 12, 16, 17], "six_bootstrap_sampl": 7, "queri": [7, 11], "height": [7, 13, 16], "facet": [7, 16], "67175": 7, "42500": 7, "149": [7, 8], "35000": 7, "13225": 7, "179": [7, 8], "79675": 7, "188": 7, "28225": 7, "boot20000_mean": 7, "159": 7, "29675": 7, "136": 7, "55725": 7, "161": 7, "93950": 7, "22500": 7, "boot_est_dist": 7, "resampl": 7, "repeatedli": 7, "percentil": [7, 17], "captur": [7, 11, 13, 16], "narrow": [7, 11, 17], "comfort": [7, 15], "strict": [7, 8], "unhelp": 7, "life": [7, 8], "deadli": 7, "ascend": [7, 8, 16], "bound": [7, 16], "97": [7, 13, 16], "quantil": 7, "express": [7, 16, 17], "5th": 7, "975": 7, "ci_bound": 7, "121": [7, 12], "607069": 7, "191": [7, 8], "525362": 7, "finish": [7, 9, 10, 11, 14, 15, 16], "journei": 7, "surfac": [7, 12, 13, 16], "foundat": [7, 8, 11, 13], "openintro": 7, "diez": 7, "2019": [7, 16], "solid": [7, 16], "grasp": 7, "natur": [7, 15, 16, 17], "coxd": 7, "murrai": 7, "insideairbnb": 7, "09": [7, 11, 16], "dccetinkayarb19": 7, "\u00e7": 7, "etinkaya": 7, "rundel": 7, "christoph": 7, "barr": 7, "os": [7, 9], "dirti": 8, "dig": [8, 11, 17], "jump": [8, 10, 11, 16], "symbol": 
[8, 14, 16, 17], "spoken": [8, 16, 17], "resid": [8, 16], "indigen": 8, "cultur": 8, "anywher": [8, 9], "2018": [8, 16], "sadli": 8, "colon": [8, 17], "led": [8, 16], "loss": 8, "children": 8, "speak": [8, 11, 16, 17], "mother": [8, 16, 17], "tongu": [8, 16, 17], "childhood": 8, "residenti": [8, 12], "discov": 8, "act": [8, 15, 16, 17], "harm": 8, "endang": 8, "geograph": 8, "walker": 8, "2017": [8, 15], "came": [8, 12, 16], "aborigin": [8, 11, 16, 17], "truth": 8, "reconcili": 8, "commiss": 8, "action": 8, "2015": 8, "canlang": [8, 11, 16], "2016": [8, 11, 16, 17], "censu": [8, 11, 16, 17], "214": [8, 11, 16, 17], "offici": [8, 11, 16, 17], "mother_tongu": [8, 11, 16, 17], "expos": 8, "birth": 8, "most_at_hom": [8, 11, 16, 17], "most_at_work": [8, 11, 16, 17], "lang_known": [8, 11, 16, 17], "accord": [8, 11, 16, 17], "deep": [8, 13], "simplifi": [8, 11, 17], "concentr": [8, 16], "expertis": 8, "bias": 8, "aim": [8, 10, 16], "causal": [8, 12, 16], "mechanist": [8, 16], "leek": 8, "matsui": 8, "earli": [8, 10], "live": [8, 11, 16], "provinc": [8, 11], "territori": 8, "hypothes": [8, 16], "polit": 8, "parti": 8, "wealth": [8, 16], "elect": 8, "quantif": 8, "factor": [8, 16], "mechan": [8, 11, 12], "pertain": [8, 16, 17], "occasion": [8, 14, 17], "race": [8, 12, 13], "runner": 8, "regularli": [8, 9], "graphic": [8, 9, 11, 14, 15, 16], "ag": 8, "old": [8, 11, 15], "50kg": 8, "cluster": [8, 10, 16], "bought": 8, "amazon": 8, "cellphon": 8, "ownership": 8, "android": 8, "phone": 8, "essenc": 8, "spreadsheet": [8, 11], "microsoft": 8, "rectangular": 8, "primarili": [8, 12, 15, 16], "speaker": [8, 16, 17], "comma": [8, 9, 12, 17], "short": [8, 11, 16], "save": [8, 11, 14, 15], "googl": [8, 11], "sheet": [8, 11], "can_lang": [8, 11, 16, 17], "plain": [8, 9, 15], "editor": [8, 9, 11, 15], "notepad": 8, "590": [8, 11, 16], "235": [8, 11, 16, 17], "665": [8, 11, 16], "afrikaan": [8, 11, 16, 17], "10260": [8, 11, 16], "4785": [8, 11, 16], "23415": [8, 11, 16], "afro": [8, 11, 
16, 17], "asiat": [8, 11, 16, 17], "1150": [8, 11, 16], "44": [8, 11, 15, 16], "akan": [8, 11, 16, 17], "twi": [8, 11, 16, 17], "13460": [8, 11, 16], "5985": [8, 11, 16], "22150": [8, 11, 16], "albanian": [8, 11, 16, 17], "26895": [8, 11, 16], "13135": [8, 11, 16], "345": [8, 11, 16], "31930": [8, 11, 16], "algonquian": [8, 11, 17], "algonquin": [8, 11, 17], "1260": [8, 11], "370": [8, 11, 17], "2480": [8, 11], "sign": [8, 11, 12, 15, 16], "2685": [8, 11], "3020": [8, 11], "1145": [8, 11], "amhar": [8, 11], "22465": [8, 11], "12785": [8, 11], "33670": [8, 11], "instal": [8, 9, 10, 11, 14], "team": [8, 15], "es": 8, "innei": 8, "2010": 8, "command": [8, 9, 11, 14], "shorter": [8, 9, 11, 15, 16], "alia": [8, 9], "gave": [8, 11], "harder": [8, 16, 17], "quot": [8, 11], "letter": [8, 14, 15], "distinguish": [8, 16], "satisfi": [8, 11], "syntax": [8, 11, 15, 17], "amp": [8, 11, 16, 17], "445": [8, 11, 16, 17], "2775": [8, 11, 16], "209": [8, 11, 16, 17], "wolof": [8, 11, 16, 17], "3990": [8, 11, 16], "1385": [8, 11, 16], "8240": [8, 11, 16], "210": [8, 11, 16, 17], "wood": [8, 11, 16, 17], "cree": [8, 11, 16, 17], "1840": [8, 11, 16], "800": [8, 11, 16], "2665": [8, 11, 16], "211": [8, 11, 12, 16, 17], "wu": [8, 11, 16, 17], "shanghaines": [8, 11, 16, 17], "12915": [8, 11, 16], "7650": [8, 11, 16], "16530": [8, 11, 16], "yiddish": [8, 11, 16, 17], "13555": [8, 11, 16], "7085": [8, 11, 16], "895": [8, 11, 16], "20985": [8, 11, 16], "yoruba": [8, 11, 16, 17], "9080": [8, 11, 16], "2615": [8, 11, 16], "22415": [8, 11, 16], "screen": [8, 9, 11], "string": [8, 11, 15, 16, 17], "my_numb": 8, "alic": 8, "_": [8, 9, 16, 17], "won": [8, 11, 13, 15, 17], "complain": 8, "my": [8, 9], "syntaxerror": 8, "mayb": [8, 11], "meant": 8, "convent": [8, 9, 15], "lowercas": [8, 15], "language_data": 8, "pep": 8, "guido": 8, "van": 8, "rossum": 8, "2001": 8, "minut": [8, 9, 13, 16], "underneath": [8, 9], "ve": [8, 11, 15], "largest": [8, 11, 16, 17], "restrict": [8, 13, 17], "bracket": [8, 
9, 12, 17], "statement": [8, 11, 17], "written": [8, 9, 11, 15], "doubl": [8, 9, 10, 14, 16, 17], "athabaskan": [8, 11, 17], "atikamekw": [8, 11, 17], "6150": [8, 11], "5465": 8, "1100": 8, "6645": 8, "thompson": [8, 11], "ntlakapamux": [8, 11], "335": [8, 11], "450": 8, "tlingit": [8, 11], "260": 8, "tsimshian": [8, 11], "410": 8, "206": 8, "wakashan": [8, 11], "67": [8, 11, 12, 16], "aboriginal_lang": 8, "alias": 8, "wrote": 8, "terminolog": 8, "obj": 8, "f": [8, 11, 12, 14], "programm": 8, "confus": [8, 11, 17], "appar": 8, "rescu": 8, "selected_lang": 8, "descend": [8, 16], "decend": 8, "arranged_lang": 8, "64050": 8, "inuktitut": 8, "35210": 8, "138": 8, "ojibwai": 8, "17885": 8, "oji": 8, "12855": 8, "dene": 8, "10700": 8, "32": [8, 13, 15, 16, 17], "cayuga": 8, "squamish": 8, "iroquoian": 8, "ten_lang": 8, "montagnai": 8, "innu": 8, "10235": 8, "119": 8, "mi": [8, 16], "kmaq": 8, "6690": 8, "3065": 8, "180": [8, 13], "stonei": 8, "3025": 8, "becam": 8, "curiou": 8, "728": [8, 16], "canadian_popul": [8, 16], "overwrit": 8, "opt": [8, 11, 12], "mother_tongue_perc": [8, 16], "35_151_728": [8, 16], "35151728": 8, "latter": [8, 12], "clearer": [8, 16], "182210": 8, "100166": 8, "050879": 8, "036570": 8, "030439": 8, "029117": 8, "019032": 8, "017496": 8, "008719": 8, "008606": 8, "ten_lang_perc": 8, "008": 8, "temporari": [8, 15, 17], "arranged_lang_sort": 8, "trace": [8, 9], "split": [8, 12, 13, 16], "rewrit": 8, "unwieldi": 8, "parenthesi": 8, "demonstr": [8, 11, 12, 13, 16, 17], "cleaner": 8, "messi": [8, 15, 17], "pars": [8, 11, 16], "block": [8, 11], "piec": 8, "period": [8, 9, 11, 16], "Not": [8, 17], "feed": 8, "redo": 8, "overwhelm": 8, "debug": 8, "midwai": 8, "audienc": [8, 9, 15, 16], "difficulti": 8, "scrutin": 8, "convei": [8, 16], "understood": 8, "shortli": 8, "ax": [8, 16], "mark": [8, 11, 15, 16], "channel": [8, 11, 12, 15, 16], "barplot_mother_tongu": 8, "refin": [8, 11], "quotat": [8, 11], "modif": [8, 17], "tackl": 8, "rotat": 8, "swap": [8, 
16], "barplot_mother_tongue_axi": 8, "forward": [8, 11, 12], "suit": [8, 16, 17], "reorder": 8, "ordered_barplot_mother_tongu": 8, "swampi": 8, "elsewher": [8, 11], "moos": 8, "northern": 8, "east": 8, "southern": 8, "comment": [8, 15], "hash": [8, 15], "importantli": 8, "self": [8, 11], "habit": [8, 12], "got": 8, "tast": 8, "ten_lang_plot": 8, "nobodi": 8, "pull": [8, 11, 14], "forgotten": [8, 15], "pop": [8, 9, 11], "slowli": 8, "adept": 8, "remind": [8, 17], "lab": [8, 14], "lookup": 8, "concis": 8, "press": [8, 9], "tab": [8, 9, 11, 14, 15], "bring": [8, 11], "typo": 8, "hold": [8, 11, 16, 17], "dialogu": 8, "dialog": [8, 15], "contextu": 8, "gvr01": 8, "coghlan": 8, "barri": [8, 17], "warsaw": 8, "style": [8, 11], "0008": 8, "lp15": 8, "jeffrei": [8, 16], "347": 8, "6228": 8, "1314": 8, "1315": 8, "pm15": 8, "elizabeth": 8, "art": [8, 16], "anyon": [8, 9, 11, 15], "skybrud": 8, "consult": [8, 11, 15], "llc": 8, "bookdown": 8, "rdpeng": 8, "artofdatasci": 8, "tim20": [8, 16], "ttimber": [8, 11, 16], "wal17": 8, "anada": 8, "canadiangeograph": 8, "wil18": 8, "kori": 8, "bccampu": 8, "opentextbc": 8, "indigenizationfound": 8, "statisticscanada16a": 8, "www12": 8, "statcan": 8, "gc": 8, "recens": 8, "dp": 8, "eng": 8, "cfm": 8, "statisticscanada16b": 8, "borigin": 8, "irst": 8, "ation": 8, "\u00e9ti": 8, "nuit": 8, "sa": 8, "2016022": 8, "x2016022": 8, "statisticscanada18": 8, "evolut": 8, "1901": 8, "www150": 8, "n1": 8, "pub": 8, "630": 8, "x2018001": 8, "htm": 8, "thepdteam20": 8, "dev": 8, "februari": 8, "doi": [8, 16], "5281": 8, "zenodo": 8, "3509134": 8, "trutharcocanada12": 8, "govern": 8, "servic": [8, 11, 15], "trutharcocanada15": 8, "ction": 8, "www2": 8, "gov": [8, 11, 16], "asset": 8, "columbian": 8, "calls_to_action_english2": 8, "pdf": [8, 16], "wesmckinney10": 8, "ata": 8, "tructur": 8, "tatist": 8, "omput": 8, "p": [8, 11, 14], "ython": 8, "t\u00e9fan": 8, "der": 8, "arrod": 8, "illman": 8, "roceed": 8, "9th": 8, "cienc": 8, "onfer": 8, "25080": 
8, "majora": 8, "92bf1922": 8, "00a": 8, "interleav": 9, "narrat": 9, "platform": [9, 15], "interfac": [9, 11, 14, 15], "dress": 9, "morn": 9, "configur": [9, 10, 14, 15], "formatt": 9, "artifact": 9, "analyz": [9, 10, 11, 17], "realiti": [9, 13], "consciou": [9, 15], "screenshot": 9, "easiest": [9, 14], "jupyterhub": [9, 15], "provis": 9, "authent": [9, 15], "gain": [9, 11], "instructor": [9, 10], "refer": 9, "entireti": 9, "cursor": 9, "rectangl": [9, 15, 16], "toolbar": [9, 11], "keyboard": [9, 15], "enter": [9, 11, 14, 15, 16], "arrow": [9, 15], "restart": [9, 14], "bar": [9, 11, 13, 14], "slight": [9, 12], "session": [9, 14, 15], "delet": [9, 14, 15], "emul": 9, "window": [9, 11], "statu": 9, "idl": 9, "busi": 9, "excess": 9, "unrespons": 9, "lose": 9, "connect": [9, 11, 13, 14, 15, 16], "interrupt": 9, "paus": 9, "server": [9, 11, 15], "hub": 9, "panel": 9, "shut": [9, 14], "rich": [9, 15], "bold": 9, "italic": 9, "bullet": [9, 11], "eventu": [9, 11, 16], "unformat": 9, "unrend": 9, "box": [9, 12, 13, 14, 15], "progress": [9, 14], "autosav": 9, "disk": [9, 11], "icon": [9, 11, 14, 15], "mac": 9, "arbitrari": [9, 16], "downsid": [9, 14], "nonlinear": [9, 13, 16], "deliber": [9, 15], "referenc": 9, "unconvent": 9, "fail": 9, "nonfunct": 9, "scenario": [9, 11], "event": [9, 15], "guard": 9, "sooner": 9, "linearli": [9, 13], "suffici": [9, 16], "extern": [9, 15], "heavili": 9, "loc": [9, 16], "package_nam": 9, "pn": 9, "librari": [9, 16], "hidden": [9, 11], "ipynb": [9, 11, 15], "shareabl": 9, "firefox": 9, "safari": 9, "chrome": 9, "edg": 9, "adob": 9, "acrobat": 9, "benefit": [9, 11, 15, 17], "standalon": 9, "font": [9, 11, 16], "launcher": 9, "visibl": [9, 15, 16], "untitl": 9, "white": 9, "troublesom": [9, 15], "repetit": 9, "dash": [9, 16], "jupyterlab": 9, "keen": 9, "commonmark": 9, "cheatsheet": 9, "friend": 10, "colleagu": 10, "histori": [10, 15], "chapter": 10, "spend": [10, 11, 12, 17], "restructur": 10, "usabl": 10, "coher": 10, "variou": [11, 14, 
17], "laptop": [11, 15], "gatewai": 11, "unless": [11, 14, 16], "upfront": [11, 17], "devot": 11, "shoelac": 11, "trip": 11, "skiprow": 11, "ibi": 11, "list_tabl": 11, "to_csv": 11, "astronomi": 11, "pictur": [11, 16], "request": [11, 17], "internet": [11, 14], "remot": 11, "directori": [11, 14, 15, 16], "filesystem": 11, "folder": [11, 14, 15], "project3": 11, "happiness_report": 11, "slash": [11, 17], "proce": [11, 14, 15, 17], "happy_data": 11, "bike_shar": 11, "project2": 11, "silli": [11, 13], "redund": [11, 16], "whew": 11, "bonu": 11, "fatima": 11, "jayden": 11, "usernam": [11, 15], "link": [11, 14, 15], "video": [11, 14], "omma": 11, "epar": 11, "v": [11, 14], "alu": 11, "aren": [11, 16, 17], "canadian": [11, 17], "canlang_data": 11, "oftentim": [11, 17], "sentenc": 11, "paragraph": [11, 16], "scientist": 11, "distribut": [11, 15, 16], "permiss": [11, 15], "21930": 11, "parsererror": 11, "messag": [11, 14, 15, 16, 17], "wasn": [11, 16], "can_lang_meta": 11, "token": 11, "didn": [11, 17], "tsv": 11, "escap": 11, "backslash": 11, "can_lang_no_nam": 11, "curli": [11, 17], "brace": 11, "col_map": 11, "canlang_data_renam": 11, "u": [11, 14, 16], "niform": 11, "esourc": 11, "ocat": 11, "raw": [11, 14, 16, 17], "githubusercont": [11, 14], "datasci": 11, "whichev": 11, "xlsx": 11, "snippet": [11, 15], "_rel": 11, "j1": 11, "w8": 11, "qrj": 11, "tf": 11, "wz": 11, "hlio": 11, "8f": 11, "3wn": 11, "ed2": 11, "gz": 11, "_r": 11, "yg": 11, "tuee": 11, "6q": 11, "rzy": 11, "l60": 11, "xtp": 11, "4vt": 11, "jq": 11, "sheet_nam": 11, "sad": 11, "usecol": 11, "beforehand": 11, "libr": 11, "offic": 11, "semicolon": 11, "decim": [11, 16, 17], "european": 11, "countri": 11, "storag": 11, "user": [11, 14, 15], "manag": [11, 14, 15], "mysql": 11, "oracl": 11, "sql": 11, "simplest": [11, 16], "db": 11, "backend": 11, "send": [11, 15], "sqlalchemi": 11, "matur": 11, "deeper": 11, "friendlier": 11, "conn": 11, "retriev": [11, 12, 15, 17], "secretli": 11, "scene": [11, 15], 
"canlang_t": 11, "databaset": 11, "r0": 11, "countstar": 11, "haven": [11, 14], "sent": [11, 15], "effici": [11, 13, 15, 16], "lazi": 11, "compil": 11, "str": 11, "AS": 11, "nfrom": 11, "t0": 11, "arab": 11, "419890": 11, "223535": 11, "5585": 11, "629055": 11, "mostli": [11, 15, 16, 17], "canlang_table_filt": 11, "predic": 11, "canlang_table_select": 11, "r1": 11, "aboriginal_lang_data": 11, "attributeerror": 11, "traceback": 11, "conda": [11, 14], "lib": 11, "python3": 11, "expr": 11, "py": [11, 14, 17], "645": 11, "__getattr__": 11, "641": 11, "hint": 11, "common_typo": 11, "642": [11, 13], "rais": [11, 16], "643": 11, "__name__": 11, "644": 11, "tahltan": 11, "crash": 11, "postgr": 11, "client": [11, 12], "host": [11, 14, 15], "localhost": 11, "port": [11, 14], "endpoint": 11, "5432": 11, "password": [11, 15], "can_mov_db": 11, "movi": 11, "fakeserv": 11, "stat": 11, "user0001": 11, "abc123": 11, "theme": [11, 16], "medium": [11, 15], "title_alias": 11, "episod": 11, "names_occup": 11, "occup": 11, "rate": 11, "ratings_t": 11, "alchemyt": 11, "average_r": 11, "num_vot": 11, "avg_rat": 11, "order_bi": 11, "backup": 11, "secur": [11, 15], "simultan": [11, 15, 17], "conflict": 11, "billion": 11, "daili": 11, "chao": 11, "ensu": 11, "no_official_lang_data": 11, "no_official_languag": 11, "magic": 11, "uncommon": 11, "pplicat": 11, "rogram": 11, "nterfac": 11, "secret": [11, 15], "somewhat": [11, 13], "thought": [11, 13, 17], "painstak": 11, "gather": [11, 16], "yper": 11, "ext": 11, "arkup": 11, "anguag": 11, "ascad": 11, "tyle": 11, "heet": 11, "webpag": [11, 15], "wherea": [11, 13, 17], "layout": [11, 16], "subsect": 11, "richardson": 11, "2007": 11, "reitz": 11, "foot": [11, 12, 13], "craiglist": 11, "craigslist": 11, "advertis": [11, 12, 13], "span": 11, "meta": 11, "hous": [11, 12, 13], "1br": 11, "hood": 11, "13768": 11, "108th": 11, "avenu": 11, "maptag": 11, "pid": 11, "6786042973": 11, "banish": 11, "trash": [11, 14], "hide": [11, 16], "unbanish": 11, 
"href": 11, "restor": 11, "2285": 11, "oof": 11, "keyword": [11, 17], "grab": 11, "selectorgadget": 11, "cc": 11, "deselect": 11, "pic": 11, "footag": 11, "gadget": 11, "robot": 11, "txt": [11, 15], "cl": 11, "spider": 11, "script": 11, "scraper": 11, "crawler": 11, "explicit": [11, 17], "realist": 11, "disallow": 11, "td": 11, "nth": 11, "child": [11, 13], "largestc": 11, "target": 11, "bs4": 11, "wiki": 11, "en": 11, "parser": 11, "population_nod": 11, "slice": [11, 16, 17], "clariti": [11, 16], "greater_toronto_area": 11, "202": 11, "london": [11, 17], "_ontario": 11, "ontario": 11, "543": 11, "551": 11, "greater_montr": 11, "montreal": [11, 17], "node": 11, "rid": 11, "get_text": 11, "fantast": 11, "albeit": 11, "canada_wiki_t": 11, "metropolitan": [11, 17], "droplevel": 11, "canada_wiki_df": 11, "rank": 11, "unnam": 11, "8_level_1": 11, "9_level_1": 11, "6202225": 11, "543551": 11, "quebec": 11, "4291732": 11, "halifax": [11, 17], "nova": 11, "scotia": 11, "465703": 11, "2642825": 11, "st": [11, 17], "catharin": [11, 17], "niagara": [11, 17], "433604": 11, "ottawa": [11, 17], "gatineau": [11, 17], "1488307": 11, "windsor": [11, 17], "422630": 11, "calgari": [11, 17], "1481806": 11, "oshawa": 11, "415311": 11, "edmonton": [11, 17], "1418118": 11, "victoria": [11, 16, 17], "397237": 11, "839311": 11, "saskatoon": 11, "saskatchewan": 11, "317480": 11, "winnipeg": [11, 17], "manitoba": 11, "834678": 11, "regina": [11, 17], "249217": 11, "hamilton": 11, "785184": 11, "sherbrook": 11, "227398": 11, "kitchen": [11, 17], "cambridg": [11, 17], "waterloo": [11, 17], "575847": 11, "kelowna": [11, 17], "222162": 11, "desktop": 11, "stun": 11, "rho": 11, "ophiuchi": 11, "juli": 11, "webb": 11, "telescop": 11, "nircam": 11, "molecular": [11, 16], "signup": 11, "safe": [11, 15], "transfer": [11, 12], "infinit": 11, "bandwidth": 11, "frequent": [11, 15], "success": [11, 15], "bog": 11, "revok": 11, "grant": 11, "quota": 11, "overrun": 11, "abid": 11, "hourli": 11, "hour": 
[11, 12], "planetari": 11, "apod": 11, "api_kei": 11, "your_api_kei": 11, "07": [11, 16], "explan": [11, 16], "mere": 11, "390": 11, "light": [11, 15], "sun": [11, 16], "star": 11, "planet": 11, "peer": 11, "natal": 11, "infrar": 11, "spectacular": 11, "cosmic": 11, "snapshot": [11, 14, 15], "celebr": 11, "young": 11, "brighter": 11, "sport": 11, "diffract": 11, "spike": 11, "jet": 11, "shock": 11, "hydrogen": 11, "blast": 11, "newborn": 11, "yellowish": 11, "dusti": 11, "caviti": 11, "carv": 11, "energet": 11, "Near": 11, "shadow": 11, "cast": 11, "protoplanetari": 11, "hdurl": 11, "2307": 11, "stsci": 11, "01_rhooph": 11, "png": [11, 16], "media_typ": 11, "service_vers": 11, "v1": 11, "01_rhooph1024": 11, "neat": 11, "json": 11, "javascript": 11, "notat": [11, 17], "nasa_data_singl": 11, "start_dat": 11, "end_dat": 11, "nasa_data": 11, "74": [11, 16], "copyright": 11, "data_dict": 11, "nasa_df": 11, "carina": 11, "nebula": 11, "ncarlo": 11, "taylor": 11, "2305": 11, "carnorth": 11, "02": [11, 16], "flat": [11, 12, 13, 16], "rock": 11, "mar": 11, "nnasa": 11, "njpl": 11, "caltech": 11, "nmsss": 11, "nprocess": 11, "ne": 11, "flatmar": 11, "03": [11, 16, 17], "centauru": 11, "peculiar": 11, "island": 11, "nmarco": 11, "lorenzi": 11, "nangu": 11, "lau": 11, "tommi": 11, "tse": 11, "ntex": 11, "ngc5128_": 11, "galaxi": 11, "famou": 11, "hole": 11, "pia23122": 11, "shackleton": 11, "shadowcam": 11, "shacklet": 11, "69": 11, "doom": 11, "eta": 11, "nesa": 11, "nhubbl": 11, "nlice": 11, "etacarin": 11, "dust": 11, "ngc": 11, "6559": 11, "nadam": 11, "ntelescop": 11, "ngc6559_": 11, "sunspot": 11, "spot": 11, "72": 11, "ring": 11, "spiral": 11, "1398": 11, "ngc1398_": 11, "73": [11, 16], "readili": 11, "heart": 11, "awesom": 11, "udac": 11, "linux": [11, 14], "rthepsfoundation23": 11, "kenneth": 11, "readthedoc": 11, "latest": [11, 14, 15, 17], "ric07": 11, "leonard": 11, "beauti": 11, "soup": 11, "april": [11, 16], "nasaesacsa": 11, "esa": 11, "csa": 11, "pontoppidan": 
11, "pagan": 11, "esawebb": 11, "weic2316a": 11, "realtsproject21": 11, "internetlivestat": 11, "faster": [12, 16], "rmspe": [12, 13], "person": [12, 13, 16], "week": 12, "annual": 12, "boston": 12, "marathon": 12, "sale": [12, 13], "spline": 12, "heurist": 12, "932": 12, "estat": [12, 13], "sacramento": [12, 13], "bee": 12, "newspap": 12, "realtor": 12, "zip": [12, 14, 15], "sqft": [12, 13], "latitud": 12, "longitud": 12, "z95838": 12, "836": [12, 17], "59222": 12, "631913": 12, "434879": 12, "z95823": 12, "1167": 12, "68212": 12, "478902": 12, "431028": 12, "z95815": 12, "796": 12, "68880": 12, "618305": 12, "443839": 12, "852": 12, "69307": 12, "616835": 12, "439146": 12, "z95824": 12, "797": 12, "81900": 12, "519470": 12, "435768": 12, "927": 12, "z95829": 12, "2280": 12, "232425": 12, "457679": 12, "359620": 12, "928": [12, 17], "1477": 12, "234000": 12, "499893": 12, "458890": 12, "929": 12, "citrus_height": 12, "z95610": 12, "1216": 12, "235000": 12, "708824": 12, "256803": 12, "930": [12, 16], "elk_grov": 12, "z95758": 12, "1685": 12, "235301": 12, "417000": 12, "397424": 12, "931": 12, "el_dorado_hil": 12, "z95762": 12, "1362": 12, "235738": 12, "655245": 12, "075915": 12, "livabl": 12, "feet": [12, 13], "usd": [12, 13], "unit": [12, 13, 16, 17], "front": [12, 16], "0f": [12, 13], "sold": [12, 13], "dive": 12, "subsampl": 12, "small_sacramento": 12, "pai": 12, "absent": 12, "small_plot": 12, "overlai": 12, "line_df": 12, "2000": 12, "mark_rul": [12, 16], "strokedash": [12, 16], "dist": 12, "nearest_neighbor": 12, "298": 12, "1900": 12, "361745": 12, "487409": 12, "461413": 12, "718": 12, "antelop": 12, "z95843": 12, "2160": 12, "290000": 12, "704554": 12, "354753": 12, "rosevil": 12, "z95678": 12, "1744": 12, "326951": 12, "771917": 12, "304439": 12, "256": 12, "252": 12, "z95835": 12, "1718": 12, "250000": 12, "676658": 12, "528128": 12, "282": 12, "rancho_cordova": 12, "z95670": 12, "1671": 12, "175000": 12, "591477": 12, "315340": 12, "329": 12, 
"280739": 12, "280": [12, 16, 17], "739": 12, "unansw": 12, "abil": [12, 15, 16, 17], "lock": [12, 13], "sacramento_train": [12, 13], "sacramento_test": [12, 13], "limits_": 12, "y_i": 12, "hat": 12, "_i": 12, "th": 12, "forecast": 12, "overshoot": 12, "undershoot": 12, "rmse": [12, 13], "equat": [12, 13], "kneighborsregressor": [12, 13], "neg_root_mean_squared_error": 12, "kneighborsregressor__n_neighbor": 12, "sacr_pipelin": 12, "sacr_preprocessor": 12, "201": 12, "sacr_gridsearch": 12, "sacr_result": 12, "param_kneighborsregressor__n_neighbor": 12, "117365": 12, "988307": 12, "2715": 12, "383001": 12, "93956": 12, "523683": 12, "2466": 12, "200227": 12, "89859": 12, "401722": 12, "2739": 12, "713448": 12, "87893": 12, "534919": 12, "2958": 12, "587153": 12, "86444": 12, "413831": 12, "3383": 12, "712997": 12, "92909": 12, "550051": 12, "2562": 12, "784826": 12, "93137": 12, "289780": 12, "2511": 12, "564001": 12, "93395": 12, "588763": 12, "2492": 12, "272799": 12, "93671": 12, "588088": 12, "2473": 12, "312705": 12, "199": 12, "93986": 12, "752272": 12, "048651": 12, "nonneg": 12, "neg_": 12, "convolut": 12, "alright": [12, 16], "101": [12, 17], "minimum": [12, 13, 17], "699": 12, "perfectli": [12, 15, 16], "datapoint": 12, "inflex": 12, "idiosyncrat": 12, "unseen": [12, 13], "mean_squared_error": [12, 13], "87498": 12, "86808211416": 12, "499": 12, "578": 12, "neglig": 12, "buyer": 12, "afford": 12, "maximum": [12, 13, 17], "5000": 12, "superimpos": [12, 13], "qualit": [12, 13], "opportun": 12, "sqft_prediction_grid": [12, 13], "arang": 12, "base_plot": 12, "sacr_preds_plot": [12, 13], "best_k_sacr": 12, "ff7f0e": [12, 13], "concern": [12, 13], "incorpor": [12, 17], "plot_b": 12, "moreov": 12, "85156": 12, "027067": 12, "3376": 12, "143313": 12, "rmspe_mult": 12, "85083": 12, "2902421959": 12, "083": 12, "overlaid": [12, 13], "2d": 12, "newli": [12, 15], "character": 13, "conclud": 13, "slower": 13, "confusingli": 13, "undervalu": 13, "beta_0": 13, "beta_1": 
13, "cdot": 13, "intercept": [13, 16], "coeffici": 13, "parametr": 13, "push": 13, "happili": 13, "crazi": 13, "shouldn": 13, "600": [13, 16], "276": 13, "027": 13, "plausibl": 13, "linearregress": 13, "linear_model": 13, "coef_": 13, "intercept_": 13, "lm": 13, "285652": 13, "15642": 13, "309105": 13, "hurt": 13, "afterward": [13, 17], "85376": 13, "59691629931": 13, "377": [13, 16], "tricki": [13, 14], "all_point": 13, "wiggli": 13, "curv": [13, 16], "oscil": [13, 16], "Such": 13, "fare": 13, "extrapol": 13, "obvious": 13, "mlm": 13, "linearregressionlinearregress": 13, "lm_mult_test_rmsp": 13, "82331": 13, "04630202598": 13, "331": 13, "hallmark": 13, "59235377": 13, "20333": 13, "43213798": 13, "53180": 13, "26906624224": 13, "beta_2": 13, "hyperplan": 13, "333": [13, 16], "tune": [13, 16], "collinear": 13, "judg": 13, "unbeknownst": 13, "analyst": 13, "parent": 13, "absurdli": 13, "subtl": [13, 17], "inaccur": 13, "238": 13, "ft": 13, "041": 13, "166": 13, "539": 13, "ic": 13, "cream": 13, "flavor": [13, 16], "remark": 13, "homeown": 13, "df": [13, 17], "fulli": [13, 16], "5994": 13, "288853": 13, "1688": 13, "092090": 13, "9859": 13, "021194": 13, "9160": 13, "812375": 13, "6400": 13, "212624": 13, "7341": 13, "333609": 13, "8434": 13, "656970": 13, "3329": 13, "106273": 13, "7170": 13, "311442": 13, "7895": 13, "567003": 13, "cubic": 13, "z": 13, "magnitud": [13, 16], "leap": 13, "stone": 13, "enjoi": 13, "ventura": 14, "22": [14, 15, 16], "cpu": 14, "english": [14, 16, 17], "virtual": 14, "rightmost": 14, "compress": [14, 16], "unzip": 14, "autograd": 14, "pre": 14, "isol": 14, "interf": 14, "ex": 14, "wizard": 14, "wsl": 14, "hyper": 14, "prompt": [14, 15], "cmd": 14, "admin": 14, "administr": 14, "log": [14, 15, 16], "bio": 14, "hotkei": 14, "esc": 14, "reboot": 14, "familiar": 14, "ubcdsci": 14, "proceed": [14, 17], "dockerfil": 14, "besid": [14, 15], "textbox": 14, "8888": 14, "volum": 14, "path": [14, 16, 17], "jovyan": 14, "scroll": [14, 15], "127": 
14, "troubleshoot": 14, "tip": 14, "dmg": 14, "intel": 14, "processor": 14, "older": 14, "appl": 14, "newer": 14, "drag": [14, 15], "sudo": 14, "certif": 14, "curl": 14, "gnupg": 14, "fssl": 14, "sh": 14, "chmod": 14, "rm": 14, "pwd": 14, "homepag": 14, "bundl": 14, "kernel": 14, "pip": 14, "upgrad": 14, "env": 14, "intro": 14, "yml": 14, "compat": 14, "xcode": 14, "x64": 14, "arm64": 14, "debian": 14, "deb": 14, "dpkg": 14, "jlab": 14, "me": 15, "ago": 15, "holder": 15, "lifespan": 15, "resolv": 15, "revis": 15, "mess": [15, 16], "repercuss": 15, "boggl": 15, "unclear": 15, "document_final_draft_fin": 15, "to_hand_in_final_v2": 15, "polish": 15, "springboard": 15, "fruit": 15, "revert": 15, "Being": 15, "todai": [15, 16], "safeti": 15, "workspac": 15, "schemat": 15, "maintain": 15, "told": 15, "metadata": 15, "brief": 15, "narr": 15, "readm": 15, "md": 15, "draft": 15, "shorten": 15, "daa29d6": 15, "884c7ce": 15, "prerequisit": 15, "stage": 15, "physic": [15, 16], "placehold": 15, "synchron": 15, "templat": 15, "canadian_languag": 15, "hyphen": 15, "privaci": 15, "happi": 15, "green": [15, 17], "respositori": 15, "reserv": 15, "upload": [15, 16], "toggl": 15, "markdown": 15, "archiv": 15, "defeat": 15, "prove": 15, "beginn": 15, "grain": 15, "expiri": 15, "creation": 15, "absolut": [15, 16], "tick": [15, 16], "repo": 15, "fret": 15, "eda": 15, "flag": 15, "pane": 15, "plu": 15, "untrack": 15, "checkpoint": 15, "state": [15, 16], "datetim": [15, 16], "stamp": 15, "ok": 15, "credenti": 15, "author": 15, "33": [15, 16, 17], "dismiss": 15, "invit": 15, "collaborators_github_user_nam": 15, "refresh": 15, "blend": [15, 16], "offend": 15, "preced": 15, "histor": 15, "float": [15, 17], "app": 15, "convers": [15, 16, 17], "subtop": 15, "persist": 15, "thread": 15, "searchabl": 15, "notif": 15, "repli": 15, "submit": [15, 16], "submiss": 15, "youtub": 15, "advic": 15, "gitlab": 15, "bitbucket": 15, "wbc": 15, "jennif": 15, "bryan": 15, "karen": 15, "cranston": 15, "justin": 
15, "kitz": 15, "lex": 15, "nederbragt": 15, "traci": 15, "teal": 15, "subplot": 16, "raster": 16, "svg": 16, "distract": 16, "poster": 16, "wilk": 16, "oft": 16, "pie": 16, "static": 16, "math": 16, "cognit": 16, "mental": 16, "plainli": 16, "legend": 16, "scheme": 16, "surprisingli": 16, "sex": 16, "ancestri": 16, "deeb": 16, "2005": 16, "blind": 16, "reinforc": 16, "sparingli": 16, "detract": 16, "wari": 16, "overplot": 16, "overlap": 16, "zoom": 16, "vegafus": 16, "data_transform": 16, "curat": 16, "pieter": 16, "tan": 16, "noaa": 16, "gml": 16, "ralph": 16, "keel": 16, "scripp": 16, "oceanographi": 16, "dioxid": 16, "hawaii": 16, "1959": 16, "1980": 16, "co2_df": 16, "mauna_loa_data": 16, "parse_d": 16, "date_measur": 16, "ppm": 16, "338": 16, "340": 16, "341": 16, "06": [16, 17], "479": 16, "414": 16, "480": 16, "481": 16, "416": 16, "482": [16, 17], "483": 16, "484": 16, "datetime64": 16, "ns": 16, "iso": 16, "8601": 16, "alphanumer": 16, "mark_": 16, "leverag": 16, "helper": 16, "co2_scatt": 16, "upward": 16, "affirm": 16, "predecessor": 16, "successor": 16, "alter": 16, "segment": 16, "emphas": 16, "co2_lin": 16, "aha": 16, "phenomenon": 16, "muddl": 16, "settl": 16, "configure_axi": 16, "titlefonts": 16, "co2_line_label": 16, "co2": 16, "configure_": 16, "1990": 16, "clip": 16, "stack": [16, 17], "co2_line_scal": 16, "late": 16, "season": 16, "summer": 16, "octob": 16, "winter": 16, "novemb": 16, "analog": 16, "paint": 16, "blank": 16, "canva": 16, "primer": 16, "akin": 16, "sketch": 16, "durat": 16, "geyser": 16, "yellowston": 16, "nation": 16, "wyom": 16, "79": 16, "283": 16, "533": 16, "267": 16, "117": [16, 17], "268": [16, 17], "270": 16, "817": 16, "271": 16, "467": 16, "272": 16, "faithful_scatt": 16, "faithful_scatter_label": 16, "faithful_scatter_labels_black": 16, "whom": 16, "hollow": 16, "can_lang_plot": 16, "vs": 16, "can_lang_plot_label": 16, "bunch": 16, "clump": 16, "french": [16, 17], "460": 16, "850": 16, "19460850": 16, "22162865": 16, 
"15265335": 16, "29748265": 16, "59": [16, 17], "7166700": 16, "6943800": 16, "3825215": 16, "10242945": 16, "logarithm": 16, "squish": 16, "log_": 16, "log10": 16, "inf": 16, "can_lang_plot_log": 16, "gridlin": 16, "seven": 16, "can_lang_plot_log_revis": 16, "tickcount": 16, "kilo": 16, "mutat": 16, "most_at_home_perc": 16, "001678": 16, "000669": 16, "029188": 16, "013612": 16, "003272": 16, "001266": 16, "038291": 16, "017026": 16, "076511": 16, "037367": 16, "011351": 16, "003940": 16, "005234": 16, "002276": 16, "036741": 16, "021763": 16, "038561": 16, "020155": 16, "025831": 16, "007439": 16, "can_lang_plot_perc": 16, "meaningfulli": 16, "onto": 16, "belong": [16, 17], "can_lang_plot_categori": 16, "laid": 16, "can_lang_plot_legend": 16, "orient": 16, "tableau10": 16, "unsur": 16, "dark2": 16, "aesthet": 16, "switch": 16, "can_lang_plot_them": 16, "tooltip": 16, "hover": 16, "mous": 16, "pointer": 16, "can_lang_plot_tooltip": 16, "mile": 16, "mcneil": 16, "contin": 16, "south": 16, "africa": 16, "europ": 16, "asia": 16, "australia": 16, "islands_df": 16, "landmass_typ": 16, "11506": 16, "5500": 16, "16988": 16, "2968": 16, "axel": 16, "heiberg": 16, "baffin": 16, "184": 16, "bank": 16, "borneo": 16, "britain": 16, "celeb": 16, "celon": 16, "cuba": 16, "devon": 16, "ellesmer": 16, "3745": 16, "greenland": 16, "840": 16, "hainan": 16, "hispaniola": 16, "hokkaido": 16, "honshu": 16, "iceland": 16, "ireland": 16, "java": 16, "kyushu": 16, "luzon": 16, "madagascar": 16, "227": 16, "melvil": 16, "mindanao": 16, "molucca": 16, "guinea": 16, "306": 16, "zealand": 16, "newfoundland": 16, "9390": 16, "novaya": 16, "zemlya": 16, "princ": 16, "wale": 16, "sakhalin": 16, "6795": 16, "southampton": 16, "spitsbergen": 16, "sumatra": 16, "183": 16, "taiwan": 16, "tasmania": 16, "tierra": 16, "fuego": 16, "timor": 16, "islands_bar": 16, "nlargest": 16, "tilt": 16, "sort_valu": 16, "islands_top12": 16, "islands_bar_top": 16, "appeal": 16, "minu": 16, "revers": 16, "caption": 
16, "slide": 16, "summari": 16, "twelv": 16, "islands_plot_sort": 16, "morlei": 16, "1882": 16, "299": 16, "792": 16, "458": 16, "km": 16, "sec": 16, "kilometr": 16, "morley_df": 16, "expt": 16, "740": 16, "900": 16, "1070": [16, 17], "940": 16, "950": 16, "810": 16, "870": 16, "experiment": 16, "fell": 16, "morley_bar": 16, "thin": 16, "bucket": 16, "morley_hist": 16, "datum": 16, "thick": 16, "v_line": 16, "morley_hist_lin": 16, "morley_hist_color": 16, "sit": 16, "transluc": 16, "morley_hist_categor": 16, "deriv": 16, "incorrect": 16, "clearest": 16, "morley_hist_facet": 16, "1050": 16, "foremost": 16, "subtli": 16, "speed_of_light": 16, "299792": 16, "relativeerror": 16, "299000": 16, "019194": 16, "017498": 16, "035872": 16, "092578": 16, "045879": 16, "049215": 16, "052550": 16, "002516": 16, "005851": 16, "025865": 16, "morley_hist_rel": 16, "recreat": 16, "admir": 16, "morley_hist_maxbin": 16, "width": 16, "motiv": 16, "establish": 16, "pose": 16, "wiggl": 16, "discern": 16, "parenthes": [16, 17], "energi": 16, "automot": 16, "plant": 16, "burn": [16, 17], "fossil": 16, "fuel": 16, "greenhous": 16, "gase": 16, "byproduct": 16, "trap": 16, "heat": 16, "warm": 16, "observatori": 16, "amplitud": 16, "1800": 16, "kilomet": 16, "farthest": 16, "confer": 16, "shop": 16, "billboard": 16, "pixel": 16, "lossi": 16, "lossless": 16, "jpeg": 16, "jpg": 16, "photograph": 16, "bmp": 16, "tiff": 16, "tif": 16, "gimp": 16, "redraw": 16, "ep": 16, "inkscap": 16, "shrink": 16, "portabl": 16, "hardl": 16, "1991": 16, "filenam": 16, "img": 16, "viz": 16, "faithful_plot": 16, "mb": 16, "decent": 16, "bigger": 16, "dee05": 16, "sameer": 16, "clinic": 16, "369": 16, "har91": 16, "wolfgang": 16, "york": 16, "mcn77": 16, "donald": 16, "mic82": 16, "veloc": 16, "nite": 16, "tate": 16, "aval": 16, "cademi": 16, "nnapoli": 16, "astronom": 16, "tk20": 16, "ccgg": 16, "vgh": 16, "jacob": 16, "granger": 16, "heer": 16, "dominik": 16, "moritz": 16, "kanit": 16, "wongsuphasawat": 16, 
"arvind": 16, "satyanarayan": 16, "eitan": 16, "ilia": 16, "timofeev": 16, "ben": 16, "welsh": 16, "scott": 16, "sievert": 16, "journal": [16, 17], "1057": 16, "21105": 16, "joss": 16, "01057": 16, "wil19": 16, "clau": 16, "clauswilk": 16, "dataviz": 16, "util": 17, "entiti": 17, "2235145": 17, "abbrevi": 17, "int": 17, "14159": 17, "boolean": 17, "bool": 17, "hello": 17, "nonetyp": 17, "arithmet": 17, "dict": 17, "cities_seri": 17, "separt": 17, "population_in_2016": 17, "1027613": 17, "1823281": 17, "544870": 17, "571146": 17, "321484": 17, "upcom": 17, "population_in_2016_df": 17, "criteria": 17, "No": 17, "bespok": 17, "untidi": 17, "2006": 17, "2011": 17, "land": 17, "region_lang_top5_cities_wid": 17, "cite": 17, "montr\u00e9al": 17, "lang_wid": 17, "985": 17, "1435": 17, "960": 17, "575": 17, "360": 17, "240": 17, "8485": 17, "1015": 17, "705": 17, "885": 17, "13260": 17, "2450": 17, "1090": 17, "1365": 17, "770": 17, "2440": 17, "5290": 17, "1025": 17, "380": 17, "3355": 17, "8960": 17, "3380": 17, "1430": 17, "tough": 17, "lang_mother_tidi": 17, "id_var": 17, "var_nam": 17, "value_nam": 17, "1065": 17, "1066": 17, "1067": 17, "1068": 17, "1069": 17, "met": 17, "commut": 17, "widen": 17, "region_lang_top5_cities_long": 17, "lang_long": 17, "2135": 17, "2136": 17, "2137": 17, "2138": 17, "2139": 17, "2140": 17, "lang_home_tidi": 17, "2495": 17, "1622735": 17, "1330555": 17, "8630": 17, "3245": 17, "behaviour": 17, "colum": 17, "messier": 17, "dealt": 17, "lang_messi": 17, "region_lang_top5_cities_messi": 17, "265": 17, "520": 17, "505": 17, "4045": 17, "440": 17, "330": 17, "6380": 17, "1445": 17, "530": 17, "620": 17, "3130": 17, "760": 17, "6665": 17, "860": 17, "1080": 17, "lang_messy_long": 17, "tidy_lang": 17, "astyp": 17, "depth": 17, "occas": 17, "official_lang": 17, "3836770": 17, "3218725": 17, "29800": 17, "11940": 17, "620510": 17, "412120": 17, "2669195": 17, "1607550": 17, "487": 17, "696": 17, "1065070": 17, "844740": 17, "701": 17, "910": 17, 
"1050410": 17, "792700": 17, "915": 17, "10950": 17, "2520": 17, "1060": 17, "ampersand": 17, "pipe": 17, "region_data": 17, "household": 17, "dwell": 17, "bellevil": 17, "43002": 17, "1354": 17, "65121": 17, "103472": 17, "45050": 17, "lethbridg": 17, "45696": 17, "3046": 17, "69699": 17, "117394": 17, "48317": 17, "thunder": 17, "bai": 17, "52545": 17, "2618": 17, "26318": 17, "121621": 17, "57146": 17, "peterborough": 17, "50533": 17, "1636": 17, "98336": 17, "121721": 17, "55662": 17, "saint": 17, "52872": 17, "3793": 17, "42158": 17, "126202": 17, "58398": 17, "535499": 17, "7168": 17, "96442": 17, "1323783": 17, "519693": 17, "5241": 17, "70103": 17, "1392609": 17, "960894": 17, "3040": 17, "41532": 17, "2463431": 17, "1727310": 17, "4638": 17, "24059": 17, "4098927": 17, "2135909": 17, "6269": 17, "93132": 17, "5928040": 17, "interst": 17, "city_nam": 17, "five_c": 17, "502143": 17, "9857": 17, "77908": 17, "1321426": 17, "537634": 17, "seriesa": 17, "seriesb": 17, "669": 17, "capabl": 17, "omit": 17, "startswith": 17, "darker": 17, "region_lang": 17, "moncton": 17, "saguenai": 17, "7485": 17, "7486": 17, "7487": 17, "abbotsford": 17, "mission": 17, "7488": 17, "7489": 17, "7490": 17, "23171710": 17, "std": 17, "490000e": 17, "093686e": 17, "401258e": 17, "000000e": 17, "836770e": 17, "25th": 17, "50th": 17, "75th": 17, "skipna": 17, "3061820": 17, "5600480": 17, "numeric_onli": 17, "3200": 17, "341121": 17, "3093": 17, "686248": 17, "1853": 17, "757677": 17, "5127": 17, "499332": 17, "55231": 17, "640268": 17, "64012": 17, "578320": 17, "48574": 17, "532066": 17, "94001": 17, "162338": 17, "cartoon": 17, "dataframegroupbi": 17, "0x7f6c577be950": 17, "137445": 17, "182390": 17, "97840": 17, "brantford": 17, "124560": 17, "troi": 17, "rivi\u00e8r": 17, "149835": 17, "331375": 17, "270715": 17, "612595": 17, "23015": 17, "875": 17, "8235": 17, "2695": 17, "102": 17, "365": 17, "23565": 17, "104": 17, "11185": 17, "122100": 17, "93495": 17, "167835": 17, 
"168990": 17, "115125": 17, "193445": 17, "93655": 17, "54150": 17, "100855": 17, "116645": 17, "73910": 17, "130835": 17, "937055": 17, "1343335": 17, "147805": 17, "78610": 17, "149805": 17, "1316635": 17, "2289515": 17, "302690": 17, "211705": 17, "354470": 17, "235990": 17, "166220": 17, "318540": 17, "530570": 17, "437460": 17, "749285": 17, "keyerror": 17, "qu\u00e9bec": 17, "028571": 17, "region_lang_num": 17, "wise": 17, "040": 17, "aforement": 17, "english_lang": 17, "1898": 17, "444955": 17, "2500590": 17, "1903": 17, "1918": 17, "1919": 17, "930405": 17, "1275265": 17, "1923": 17, "city_pop": 17, "unchang": 17, "tmp": 17, "ipykernel_12": 17, "2654974267": 17, "settingwithcopywarn": 17, "row_index": 17, "col_index": 17, "pydata": 17, "doc": 17, "stabl": 17, "user_guid": 17, "warn": 17, "went": 17, "silenc": 17, "div": 17, "divis": 17, "108554": 17, "151384": 17, "100543": 17, "610060": 17, "516498": 17, "647224": 17, "542966": 17, "944744": 17, "672877": 17, "764802": 17, "606588": 17, "964617": 17, "704092": 17, "794906": 17, "599882": 17, "965067": 17, "534472": 17, "658730": 17, "540123": 17, "929401": 17, "city_popul": 17, "wic14": 17}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"acknowledg": 0, "python": [0, 3, 4, 6, 7, 8, 9, 11, 13, 17], "edit": [0, 6, 9, 15], "about": 1, "author": 1, "classif": [2, 3], "i": [2, 12, 15], "train": [2, 3, 12], "predict": [2, 3], "overview": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "chapter": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "learn": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "object": [2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17], "The": [2, 3, 4, 9, 12, 13], "problem": [2, 12], "explor": [2, 8, 9, 12], "data": [2, 3, 6, 8, 9, 11, 12, 16, 17], "set": [2, 3, 8, 12, 14, 16], "load": [2, 8], "cancer": 2, "describ": 2, "variabl": [2, 3], "k": [2, 4, 12, 13], "nearest": [2, 12], "neighbor": [2, 12], "distanc": 2, "between": 2, "point": 2, "evalu": [2, 3, 12], "from": [2, 11, 
15, 17], "new": [2, 9, 13], "observ": 2, "each": 2, "its": 2, "5": 2, "more": 2, "than": 2, "two": 2, "explanatori": 2, "summari": [2, 3, 7, 9, 11, 17], "algorithm": [2, 4], "scikit": [2, 3], "preprocess": [2, 3], "center": 2, "scale": 2, "balanc": 2, "miss": [2, 11], "put": [2, 8], "togeth": [2, 8], "pipelin": 2, "exercis": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "refer": [2, 3, 4, 7, 8, 11, 12, 13, 15, 16, 17], "ii": [3, 13], "tune": [3, 12], "perform": [3, 17], "an": [3, 4, 11, 16], "exampl": [3, 4], "confus": 3, "matrix": 3, "tumor": 3, "imag": 3, "random": [3, 4], "seed": 3, "creat": [3, 8, 9, 15, 16], "test": [3, 12], "split": [3, 17], "classifi": 3, "label": 3, "critic": 3, "analyz": 3, "cross": 3, "valid": 3, "paramet": 3, "valu": [3, 8, 11, 17], "select": [3, 8, 17], "under": 3, "overfit": [3, 12], "predictor": [3, 13], "effect": [3, 16], "irrelev": 3, "find": 3, "good": 3, "subset": [3, 8], "forward": 3, "addit": [3, 4, 7, 9, 11, 13, 15, 16, 17], "resourc": [3, 4, 7, 9, 11, 13, 15, 16, 17], "cluster": 4, "illustr": 4, "mean": [4, 7], "measur": 4, "qualiti": 4, "restart": 4, "choos": [4, 16], "foreword": 5, "scienc": 6, "A": 6, "first": 6, "introduct": 6, "welcom": 6, "statist": [7, 17], "infer": 7, "why": [7, 11, 15], "do": [7, 17], "we": [7, 11], "need": 7, "sampl": 7, "distribut": 7, "proport": 7, "bootstrap": 7, "us": [7, 8, 11, 15, 17], "calcul": [7, 17], "plausibl": 7, "rang": 7, "panda": 8, "canadian": [8, 16], "languag": [8, 16], "ask": 8, "question": 8, "type": [8, 17], "analysi": 8, "tabular": [8, 11], "name": [8, 11, 17], "thing": 8, "frame": [8, 17], "loc": [8, 17], "filter": [8, 17], "row": [8, 11, 17], "column": [8, 11, 17], "sort_valu": 8, "head": 8, "order": 8, "ad": [8, 16, 17], "modifi": [8, 17], "combin": [8, 9, 17], "step": 8, "chain": 8, "multilin": 8, "express": 8, "visual": [8, 16], "altair": [8, 16], "bar": [8, 16], "plot": [8, 16], "format": [8, 9, 16], "chart": [8, 16], "all": [8, 11], "access": [8, 9, 11, 15], "document": 8, 
"code": 9, "text": [9, 11, 16], "jupyt": [9, 15], "cell": 9, "execut": 9, "kernel": 9, "markdown": 9, "save": [9, 16], "your": [9, 14, 15], "work": [9, 14, 15], "best": 9, "practic": 9, "run": 9, "notebook": 9, "includ": 9, "packag": 9, "file": [9, 11, 15, 16], "export": 9, "differ": [9, 11, 16], "html": [9, 11], "pdf": 9, "prefac": 10, "read": 11, "local": [11, 15], "web": 11, "absolut": 11, "rel": 11, "path": 11, "plain": 11, "read_csv": 11, "comma": 11, "separ": [11, 17], "skip": 11, "when": [11, 16], "sep": 11, "argument": 11, "header": 11, "handl": [11, 15], "directli": 11, "url": 11, "preview": 11, "befor": 11, "microsoft": 11, "excel": 11, "read_excel": 11, "databas": 11, "sqlite": 11, "postgresql": 11, "should": [11, 15], "bother": 11, "write": 11, "csv": 11, "obtain": [11, 14], "scrape": 11, "css": 11, "selector": 11, "beautifulsoup": 11, "read_html": 11, "api": 11, "nasa": 11, "regress": [12, 13], "model": 12, "underfit": 12, "multivari": [12, 13], "nn": [12, 13], "strength": 12, "limit": 12, "linear": 13, "simpl": 13, "compar": 13, "multicollinear": 13, "outlier": 13, "design": 13, "other": 13, "side": 13, "up": [14, 17], "comput": 14, "worksheet": 14, "thi": [14, 17], "book": 14, "docker": 14, "window": 14, "maco": 14, "ubuntu": 14, "jupyterlab": 14, "desktop": 14, "collabor": 15, "version": 15, "control": 15, "what": [15, 17], "repositori": 15, "workflow": 15, "commit": 15, "chang": 15, "push": 15, "remot": 15, "pull": 15, "github": 15, "pen": 15, "tool": 15, "add": 15, "menu": 15, "gener": 15, "person": 15, "token": 15, "clone": 15, "specifi": 15, "make": 15, "give": 15, "project": 15, "merg": [15, 17], "conflict": 15, "commun": 15, "issu": 15, "refin": 16, "scatter": 16, "line": 16, "mauna": 16, "loa": 16, "co_": 16, "2": 16, "old": 16, "faith": 16, "erupt": 16, "time": 16, "axi": 16, "transform": 16, "color": 16, "island": 16, "landmass": 16, "histogram": 16, "michelson": 16, "speed": 16, "light": 16, "layer": 16, "binwidth": 16, "explain": 16, 
"size": 16, "clean": 17, "wrangl": 17, "seri": 17, "basic": 17, "doe": 17, "have": 17, "structur": 17, "tidi": 17, "go": 17, "wide": 17, "long": 17, "melt": 17, "pivot": 17, "str": 17, "deal": 17, "multipl": 17, "extract": 17, "certain": 17, "satisfi": 17, "condit": 17, "least": 17, "one": 17, "list": 17, "isin": 17, "abov": 17, "below": 17, "threshold": 17, "queri": 17, "iloc": 17, "posit": 17, "aggreg": 17, "individu": 17, "oper": 17, "group": 17, "groupbi": 17, "appli": 17, "function": 17, "across": 17}, "envversion": {"sphinx.domains.c": 2, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 6, "sphinx.domains.index": 1, "sphinx.domains.javascript": 2, "sphinx.domains.math": 2, "sphinx.domains.python": 3, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx.ext.intersphinx": 1, "sphinxcontrib.bibtex": 9, "sphinx": 56}}) \ No newline at end of file diff --git a/pull337/viz.html b/pull337/viz.html index 0b9679ae..b4fdb0a3 100644 --- a/pull337/viz.html +++ b/pull337/viz.html @@ -733,23 +733,23 @@

4.5.1. Scatter plots and line plots: the
-
+

Fig. 4.2 Scatter plot of atmospheric concentration of CO\(_{2}\) over time.

@@ -833,23 +833,23 @@

4.5.1. Scatter plots and line plots: the
-
+

Fig. 4.3 Line plot of atmospheric concentration of CO\(_{2}\) over time.

@@ -929,23 +929,23 @@

4.5.1. Scatter plots and line plots: the
-
+

Fig. 4.4 Line plot of atmospheric concentration of CO\(_{2}\) over time with clearer axes and labels.

@@ -1035,23 +1035,23 @@

4.5.1. Scatter plots and line plots: the
-
+

Fig. 4.5 Line plot of atmospheric concentration of CO\(_{2}\) from 1990 to 1995.

@@ -1247,23 +1247,23 @@

4.5.2. Scatter plots: the Old Faithful e
-
+

Fig. 4.6 Scatter plot of waiting time and eruption time.

@@ -1334,23 +1334,23 @@

4.5.2. Scatter plots: the Old Faithful e
-
+

Fig. 4.7 Scatter plot of waiting time and eruption time with clearer axes and labels.

@@ -1415,23 +1415,23 @@

4.5.2. Scatter plots: the Old Faithful e
-
+

Fig. 4.8 Scatter plot of waiting time and eruption time with black points.

@@ -1657,23 +1657,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.9 Scatter plot of number of Canadians reporting a language as their mother tongue vs the primary language at home.

@@ -1750,23 +1750,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.10 Scatter plot of number of Canadians reporting a language as their mother tongue vs the primary language at home with x and y labels.

@@ -1924,23 +1924,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.11 Scatter plot of number of Canadians reporting a language as their mother tongue vs the primary language at home with log-adjusted x and y axes.

@@ -2020,23 +2020,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.12 Scatter plot of number of Canadians reporting a language as their mother tongue vs the primary language at home with log-adjusted x and y axes. Only the major gridlines are shown. The suffix “k” indicates 1,000 (“kilo”), while the suffix “M” indicates 1,000,000 (“million”).

@@ -2227,23 +2227,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.13 Scatter plot of percentage of Canadians reporting a language as their mother tongue vs the primary language at home.

@@ -2364,23 +2364,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.14 Scatter plot of percentage of Canadians reporting a language as their mother tongue vs the primary language at home colored by language category.

@@ -2459,23 +2459,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.15 Scatter plot of percentage of Canadians reporting a language as their mother tongue vs the primary language at home colored by language category with the legend edited.

@@ -2570,23 +2570,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.16 Scatter plot of percentage of Canadians reporting a language as their mother tongue vs the primary language at home colored by language category with custom colors and shapes.

@@ -2672,23 +2672,23 @@

4.5.3. Axis transformation and colored s
-
+

Fig. 4.17 Scatter plot of percentage of Canadians reporting a language as their mother tongue vs the primary language at home colored by language category with custom colors and mouse hover tooltip.

@@ -3118,23 +3118,23 @@

4.5.4. Bar plots: the island landmass da
-
+

Fig. 4.18 Bar plot of Earth’s landmass sizes. The plot is too wide with the default settings.

@@ -3219,23 +3219,23 @@

4.5.4. Bar plots: the island landmass da
-
+

Fig. 4.19 Bar plot of size for Earth’s largest 12 landmasses.

@@ -3330,23 +3330,23 @@

4.5.4. Bar plots: the island landmass da
-
+

Fig. 4.20 Bar plot of size for Earth’s largest 12 landmasses, colored by landmass type, with clearer axes and labels.

@@ -3558,23 +3558,23 @@

4.5.4. Bar plots: the island landmass da
-
+

Fig. 4.21 A bar chart of Michelson’s speed of light data.

@@ -3649,23 +3649,23 @@

4.5.4. Bar plots: the island landmass da
-
+

Fig. 4.22 Histogram of Michelson’s speed of light data.

@@ -3771,23 +3771,23 @@

Adding layers to an
-
+

Fig. 4.23 Histogram of Michelson’s speed of light data with vertical line indicating the true speed of light.

@@ -3865,23 +3865,23 @@

Adding layers to an
-
+

Fig. 4.24 Histogram of Michelson’s speed of light data colored by experiment.

@@ -3989,23 +3989,23 @@

Adding layers to an
-
+

Fig. 4.25 Histogram of Michelson’s speed of light data colored by experiment as a categorical variable.

@@ -4092,23 +4092,23 @@

Adding layers to an
-
+

Fig. 4.26 Histogram of Michelson’s speed of light data split vertically by experiment.

@@ -4324,23 +4324,23 @@

Adding layers to an
-
+

Fig. 4.27 Histogram of relative error split vertically by experiment with clearer axes and labels.

@@ -4414,23 +4414,23 @@

Choosing a binwidth for histograms
-
+

Fig. 4.28 Histogram of Michelson’s speed of light data.

@@ -4502,23 +4502,23 @@

Choosing a binwidth for histograms
-
+

Fig. 4.29 Effect of varying number of max bins on histograms.

diff --git a/pull337/wrangling.html b/pull337/wrangling.html
index ea0a93b4..30dcba61 100644
--- a/pull337/wrangling.html
+++ b/pull337/wrangling.html
@@ -4792,7 +4792,7 @@

3.9. Performing operations on groups of

-<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fab6d73a450>
+<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f6c577be950>