diff --git a/pull301/classification1.html b/pull301/classification1.html
index 8bfef4fa..556ca7a3 100644
--- a/pull301/classification1.html
+++ b/pull301/classification1.html
@@ -872,23 +872,23 @@
Fig. 5.2 Scatter plot of concavity versus perimeter with new observation represented as a red diamond.#
Fig. 5.3 Scatter plot of concavity versus perimeter. The new observation is represented as a red diamond with a line to the one nearest neighbor, which has a malignant
Fig. 5.4 Scatter plot of concavity versus perimeter. The new observation is represented as a red diamond with a line to the one nearest neighbor, which has a benign
Fig. 5.5 Scatter plot of concavity versus perimeter with three nearest neighbors.#
Fig. 5.6 Scatter plot of concavity versus perimeter with new observation represented as a red diamond.#
Fig. 5.7 Scatter plot of concavity versus perimeter with 5 nearest neighbors circled.#
Fig. 5.9 Comparison of K = 3 nearest neighbors with standardized and unstandardized data.#
Fig. 5.10 Close-up of three nearest neighbors for unstandardized data.#
Fig. 5.12 Imbalanced data with 7 nearest neighbors to a new observation highlighted.#
Fig. 5.13 Imbalanced data, with background color indicating the decision of the classifier and points representing the labeled data.#
Fig. 5.14 Upsampled data with background color indicating the decision of the classifier.#
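diff --git a/pull301/classification2.html b/pull301/classification2.html
index b719005e..7439883a 100644
--- a/pull301/classification2.html
+++ b/pull301/classification2.html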
Fig. 6.5 Plot of estimated accuracy versus the number of neighbors.#
Fig. 6.6 Plot of accuracy estimate versus number of neighbors for many K values.#
Fig. 6.7 Effect of K in overfitting and underfitting.#
Fig. 6.9 Effect of inclusion of irrelevant predictors.#
Fig. 6.10 Tuned number of neighbors for varying number of irrelevant predictors.#
Fig. 6.11 Accuracy versus number of irrelevant predictors for tuned and untuned number of neighbors.#
Fig. 6.12 Estimated accuracy versus the number of predictors for the sequence of models built using forward selection.#
diff --git a/pull301/clustering.html b/pull301/clustering.html
index 6411487d..0eb30e97 100644
--- a/pull301/clustering.html
+++ b/pull301/clustering.html
Fig. 9.2 Scatter plot of standardized bill length versus standardized flipper length.#
Fig. 9.3 Scatter plot of standardized bill length versus standardized flipper length with colored groups.#
Fig. 9.4 Cluster 0 from the penguins_standardized data set example. Observations are in blue, with the cluster center highlighted in orange.#
Fig. 9.5 Cluster 0 from the penguins_standardized data set example. Observations are in blue, with the cluster center highlighted in orange. The distances from the observations to the cluster center are represented as black lines.#
Fig. 9.6 All clusters from the penguins_standardized data set example. Observations are in blue, orange, and red with the cluster center highlighted in orange. The distances from the observations to each of the respective cluster centers are represented as black lines.#
Fig. 9.7 Random initialization of labels. Each cluster is depicted as a different color and shape.#
Fig. 9.8 First three iterations of K-means clustering on the penguins_standardized example data set. Each pair of plots corresponds to an iteration. Within the pair, the first plot depicts the center update, and the second plot depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.#
Fig. 9.9 Random initialization of labels.#
Fig. 9.10 First four iterations of K-means clustering on the penguins_standardized example data set with a poor random initialization. Each pair of plots corresponds to an iteration. Within the pair, the first plot depicts the center update, and the second plot depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.#
Fig. 9.11 Clustering of the penguin data for K clusters ranging from 1 to 9. Cluster centers are indicated by larger points that are outlined in black.#
Fig. 9.12 Total WSSD for K clusters ranging from 1 to 9.#
Fig. 9.13 The data colored by the cluster assignments returned by K-means.#
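Fig. 9.14 A plot showing the total WSSD versus the number of clusters.#
diff --git a/pull301/inference.html b/pull301/inference.html
index a7d74bdc..6d395d04 100644
--- a/pull301/inference.html
+++ b/pull301/inference.html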
Fig. 10.2 Sampling distribution of the sample proportion for sample size 40.#
Fig. 10.3 Population distribution of price per night (dollars) for all Airbnb listings in Vancouver, Canada.#
Fig. 10.4 Distribution of price per night (dollars) for sample of 40 Airbnb listings.#
Fig. 10.5 Sampling distribution of the sample means for sample size of 40.#