From 648e3b58183a837076b91200d11c0c942ff7b1f0 Mon Sep 17 00:00:00 2001
From: shiffman
Date: Mon, 7 Aug 2023 21:42:48 +0000
Subject: [PATCH] Notion - Update docs

---
 content/06_libraries.html |   4 +-
 content/09_ga.html        |   1 -
 content/10_nn.html        | 137 ++++++++++++++++++--------------------
 3 files changed, 68 insertions(+), 74 deletions(-)

diff --git a/content/06_libraries.html b/content/06_libraries.html
index b524c678..84140576 100644
--- a/content/06_libraries.html
+++ b/content/06_libraries.html
@@ -835,8 +835,8 @@

Exercise 6.6

Create a vehicle that has revolute joints for its wheels. Consider the size and positioning of the wheels. How does changing the stiffness property affect their movement?

-   -
 
+ +

Mouse Constraints

diff --git a/content/09_ga.html b/content/09_ga.html
index f52c841a..3a0a7cc7 100644
--- a/content/09_ga.html
+++ b/content/09_ga.html
@@ -237,7 +237,6 @@

B) Create a mating pool.

Figure 9.2: A “wheel of fortune” where each slice of the wheel is sized according to a fitness value.

-

Spin the wheel and you’ll notice that Element B has the highest chance of being selected, followed by A, then E, then D, and finally C. This probability-based selection according to fitness is an excellent approach. One, it guarantees that the highest-scoring elements will be most likely to reproduce. Two, it does not entirely eliminate any variation from the population. Unlike with the elitist method, even the lowest-scoring element (in this case C) has a chance to pass its information down to the next generation. It’s quite possible (and often the case) that even low-scoring elements have a tiny nugget of genetic code that is truly useful and should not entirely be eliminated from the population. For example, in the case of evolving “to be or not to be”, we might have the following elements.
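To make the wheel concrete in code, here is a minimal sketch of fitness-proportionate selection, assuming a population array whose elements each have a numeric fitness property (the function name and structure are my own illustration; random() is the p5.js function):

function weightedSelection(population) {
  // Add up all the fitness scores to size the "wheel"
  let totalFitness = 0;
  for (let element of population) {
    totalFitness += element.fitness;
  }
  // Pick a random point on the wheel...
  let pick = random(totalFitness);
  // ...and walk through the slices until the point falls inside one
  for (let element of population) {
    pick -= element.fitness;
    if (pick < 0) {
      return element;
    }
  }
}

An element with twice the fitness occupies twice the arc of the wheel and so is twice as likely to be picked, yet no element's chance ever drops to zero.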

diff --git a/content/10_nn.html b/content/10_nn.html
index ddd8c8a4..9ad12216 100644
--- a/content/10_nn.html
+++ b/content/10_nn.html
@@ -283,13 +283,6 @@

Coding the Perceptron

-
-

Example 10.1: The Perceptron

-
-
-
-
-

The error is the determining factor in how the perceptron’s weights should be adjusted. For any given weight, what I am looking to calculate is the change in weight, often called \Delta\text{weight} (or “delta” weight, delta being the Greek letter \Delta).

\text{new weight} = \text{weight} + \Delta\text{weight}

\Delta\text{weight} is calculated as the error multiplied by the input.
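Written in the same notation:

\Delta\text{weight} = \text{error} \times \text{input}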

@@ -390,7 +383,13 @@

Example 10.1: The Perceptron

perceptron.train(trainingInputs, desired);

Now, it’s important to remember that this is just a demonstration. Remember the Shakespeare-typing monkeys? I asked the genetic algorithm to solve for “to be or not to be”—an answer I already knew. I did this to make sure the genetic algorithm worked properly. The same reasoning applies to this example. I don’t need a perceptron to tell me whether a point is above or below a line; I can do that with simple math. By using an example that I can easily solve without a perceptron, I can both demonstrate the algorithm of the perceptron and verify that it is working properly.

Let’s look at the perceptron trained with an array of many points.

-

+
+

Example 10.1: The Perceptron

+
+
+
+
+
// The Perceptron
 let perceptron;
 //{!1} 2,000 training points
@@ -459,39 +458,38 @@ 

It’s a “Network,” Remember?

Figure 10.11

On the left of Figure 10.11 is an example of classic linearly separable data. Graph all of the possibilities; if you can classify the data with a straight line, then it is linearly separable. On the right, however, is non-linearly separable data. You can’t draw a straight line to separate the black dots from the gray ones.

-

One of the simplest examples of a non-linearly separable problem is XOR, or “exclusive or.” By now your should be familiar with AND. For A AND B to be true, both A and B must be true. With OR, either A or B can be true for A OR B to evaluate as true. These are both linearly separable problems. Let’s look at the solution space, a “truth table.”

+

One of the simplest examples of a non-linearly separable problem is XOR, or “exclusive or.” I’m guessing, as someone who works with coding and p5.js, you are familiar with a logical \text{AND}. For A \text{ AND } B to be true, both A and B must be true. With \text{OR}, either A or B can be true for A \text{ OR } B to evaluate as true. These are both linearly separable problems. Let’s look at the solution space, a “truth table.”

Figure 10.12
Figure 10.12

See how you can draw a line to separate the true outputs from the false ones?

-

XOR is the equivalent of OR and NOT AND. In other words, A XOR B only evaluates to true if one of them is true. If both are false or both are true, then we get false. Take a look at the following truth table.

+

\text{XOR} (“exclusive or”) is the equivalent of \text{OR} and \text{NOT AND}. In other words, A \text{ XOR } B only evaluates to true if one of them is true. If both are false or both are true, then we get false. Take a look at the following truth table.

Figure 10.13
Figure 10.13

This is not linearly separable. Try to draw a straight line to separate the true outputs from the false ones—you can’t!

-

So perceptrons can’t even solve something as simple as XOR. But what if we made a network out of two perceptrons? If one perceptron can solve OR and one perceptron can solve NOT AND, then two perceptrons combined can solve XOR.

+

So perceptrons can’t even solve something as simple as \text{XOR}. But what if we made a network out of two perceptrons? If one perceptron can solve \text{OR} and one perceptron can solve \text{NOT AND}, then two perceptrons combined can solve \text{XOR}.
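Before wiring up any neurons, the logic itself can be verified in plain JavaScript (this snippet is my own illustration, not code from the chapter):

// XOR is (A OR B) AND (NOT (A AND B))
function xor(a, b) {
  let or = a || b;
  let notAnd = !(a && b);
  return or && notAnd;
}

console.log(xor(false, false)); // false
console.log(xor(true, false));  // true
console.log(xor(false, true));  // true
console.log(xor(true, true));   // false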

Figure 10.14
Figure 10.14

The above diagram is known as a multi-layered perceptron, a network of many neurons. Some are input neurons and receive the inputs, some are part of what’s called a “hidden” layer (as they are connected to neither the inputs nor the outputs of the network directly), and then there are the output neurons, from which the results are read.

-

Training these networks is much more complicated. With the simple perceptron, you could easily evaluate how to change the weights according to the error. But here there are so many different connections, each in a different layer of the network. How does one know how much each neuron or connection contributed to the overall error of the network?

-

The solution to optimizing weights of a multi-layered network is known as backpropagation. The output of the network is generated in the same manner as a perceptron. The inputs multiplied by the weights are summed and fed forward through the network. The difference here is that they pass through additional layers of neurons before reaching the output. Training the network (i.e. adjusting the weights) also involves taking the error (desired result - guess). The error, however, must be fed backwards through the network. The final error ultimately adjusts the weights of all the connections.

-

Backpropagation is beyond the scope of this book and involves a fancier activation function (called the sigmoid function) as well as some basic calculus. If you are interested in continuing down this road and learning more about how backpropagation works, you can find my “toy neural network” project at github.com/CodingTrain with links to accompanying video tutorials. They go through all the steps of solving XOR using a multi-layered feed forward network with backpropagation. For this chapter, however, I’d like to get some help and phone a friend.

+

Training these networks is more complex. With the simple perceptron, you could easily evaluate how to change the weights according to the error. But here there are so many different connections, each in a different layer of the network. How does one know how much each neuron or connection contributed to the overall error of the network?

+

The solution to optimizing the weights of a multi-layered network is known as backpropagation. In this process, the output of the network is generated in the same manner as a perceptron. The inputs multiplied by the weights are summed and fed forward through the network. The difference here is that they pass through additional layers of neurons before reaching the output. Training the network (i.e. adjusting the weights) also involves taking the error (desired result - guess). The error, however, must be fed backwards through the network. The final error ultimately adjusts the weights of all the connections.

+

Backpropagation is beyond the scope of this book and involves a variety of different activation functions (one classic example is the “sigmoid” function) as well as some calculus. If you are interested in continuing down this road and learning more about how backpropagation works, you can find my “toy neural network” project at github.com/CodingTrain with links to accompanying video tutorials. They go through all the steps of solving \text{XOR} using a multi-layered feed forward network with backpropagation. For this chapter, however, I’d like to get some help and phone a friend.

Machine Learning with ml5.js

-

That friend is ml5.js. Inspired by the philosophy of p5.js, ml5.js is a JavaScript library that aims to make machine learning accessible to a wide range of artists, creative coders, and students. It is built on top of TensorFlow.js, Google's open-source library that runs machine learning models directly in the browser without the need to install or configure complex environments. However, TensorFlow.js's low-level operations and highly technical API can be intimidating to beginners. That's where ml5.js comes in, providing a friendly entry point for those who are new to machine learning and neural networks.

-

Before I get to my goal of adding a "neural network" brain to a steering agent and tying ml5.js back into the story of the book, I would like to demonstrate step-by-step how to train a neural network model with "supervised learning." There are several key terms and concepts important to cover, namely “classification”, “regression”, “inputs”, and “outputs”. Examining these ideas within the context of supervised learning scenario is a great way to explore on these foundational concepts, introduce the syntax of the ml5.js library, and tie everything together.

+

That friend is ml5.js. Inspired by the philosophy of p5.js, ml5.js is a JavaScript library that aims to make machine learning accessible to a wide range of artists, creative coders, and students. It is built on top of TensorFlow.js, Google's open-source library that runs machine learning models directly in the browser without the need to install or configure complex environments. TensorFlow.js's low-level operations and highly technical API, however, can be intimidating to beginners. That's where ml5.js comes in, providing a friendly entry point for those who are new to machine learning and neural networks.

+

Before I get to my goal of adding a "neural network" brain to a steering agent and tying ml5.js back into the story of the book, I would like to demonstrate step-by-step how to train a neural network model with "supervised learning." There are several key terms and concepts important to cover, namely “classification”, “regression”, “inputs”, and “outputs”. By walking through the full process of a supervised learning scenario, I hope to define these terms, explore other foundational concepts, introduce the syntax of the ml5.js library, and provide the tools to train your first machine learning model with your own data.

Classification and Regression

-

The majority of machine learning tasks fall into one of two categories: classification and regression. Classification is probably the easier of the two to understand at the start. It involves predicting a “label” (or “category” or “class”) for a piece of data. For example, an “image classifier" might try to guess if a photo is of a cat or a dog and assign the corresponding label.

+

The majority of machine learning tasks fall into one of two categories: classification and regression. Classification is probably the easier of the two to understand at the start. It involves predicting a “label” (or “category” or “class”) for a piece of data. For example, an image classifier might try to guess if a photo is of a cat or a dog and assign the corresponding label.

[FIGURE OF CAT OR DOG OR BIRD OR MONKEY OR ILLUSTRATIONS ASSIGNED A LABEL?]

-

This doesn’t happen by magic, however. The model must first be shown many examples of dog and cat illustrations with the correct labels in order to properly configure all the weights of all the connections. This is the supervised learning training process.

-

The simplest version of this scenario is probably the classic “Hello, World” demonstration of machine learning known as “MNIST”. MNIST, short for 'Modified National Institute of Standards and Technology,' is a dataset that was collected and processed by Yann LeCun and Corinna Cortes (AT&T Labs) and Christopher J.C. Burges (Microsoft Research). It is widely used for training and testing in the field of machine learning and consists of 70,000 handwritten digits from 0 to 9, each digit being a 28x28 pixel grayscale image.

+

This doesn’t happen by magic, however. The model must first be shown many examples of dogs and cats with the correct labels in order to properly configure the weights of all the connections. This is the supervised learning training process.

+

The classic “Hello, World” demonstration of machine learning and supervised learning is known as “MNIST”. MNIST, short for “Modified National Institute of Standards and Technology,” is a dataset that was collected and processed by Yann LeCun and Corinna Cortes (AT&T Labs) and Christopher J.C. Burges (Microsoft Research). It is widely used for training and testing in the field of machine learning and consists of 70,000 handwritten digits from 0 to 9, with each one being a 28x28 pixel grayscale image.

[FIGURE FOR MNIST?]

-

While I won't be building a complete MNIST model for training and deployment, it serves as a canonical example of a training dataset for image classification: 70,000 images each assigned one of 10 possible labels. The key element of classification is that the output of the model involves a fixed number of discrete options. There are only 10 possible digits that the model can guess, no more and no less. After the data is used to train the model, the goal is to classify new images and assign the appropriate label.

-

Regression, on the other hand, is a machine learning task where the prediction is a continuous value, typically a floating point number. A regression problem can involve multiple outputs, but when beginning it’s often simpler to think of it as just one.

-

Consider a machine learning model that predicts the daily electricity usage of a house based on any number of factors like number of occupants, size of house, temperature outside. Here, rather than a goal of the neural network picking from a discrete set of options, it makes more sense for the neural network to guess a number. Will the house use 30.5 kilowatt-hours of energy that day? 48.7 kWh? 100.2 kWh? The output is therefore a continuous value that the model attempts to predict.

+

While I won't be building a complete MNIST model with ml5.js (you could if you wanted to!), it serves as a canonical example of a training dataset for image classification: 70,000 images each assigned one of 10 possible labels. This idea of a “label” is fundamental to classification, where the output of a model involves a fixed number of discrete options. There are only 10 possible digits that the model can guess, no more and no less. After the data is used to train the model, the goal is to classify new images and assign the appropriate label.

+

Regression, on the other hand, is a machine learning task where the prediction is a continuous value, typically a floating point number. A regression problem can involve multiple outputs, but when beginning it’s often simpler to think of it as just one. Consider a machine learning model that predicts the daily electricity usage of a house based on any number of factors like number of occupants, size of house, temperature outside. Here, rather than a goal of the neural network picking from a discrete set of options, it makes more sense for the neural network to guess a number. Will the house use 30.5 kilowatt-hours of energy that day? 48.7 kWh? 100.2 kWh? The output is therefore a continuous value that the model attempts to predict.

[FIGURE ILLUSTRATING REGRESSION?]

Inputs and Outputs

Once the task has been determined, the next step is to finalize the configuration of inputs and outputs of the neural network. In the case of MNIST, each image is a collection of 28x28 grayscale pixels, and each pixel can be represented as a single value (ranging from 0 to 255). The total number of pixels is 28 \times 28 = 784. The grayscale value of each pixel is an input to the neural network.

@@ -557,14 +555,14 @@

Inputs and Outputs

Here in this table, the inputs to the neural network are the first three columns (occupants, size, temperature). The fourth column on the right is what the neural network is expected to guess, or the output.

[FIGURE SHOWING 3 inputs + 1 output]

Setting up the Neural Network with ml5.js

-

In a typical machine learning scenario, the next step after establishing the inputs and outputs is to configure the full architecture of the neural network. This involves specifying the number of hidden layers between the inputs and outputs, the number of neurons in each layer, which activation functions to use, and more! While all of this is technically possible in ml5.js, using a high-level library has the advantage of making its best guesses based on the task, inputs, and outputs to configure the network and so I can get started writing the code itself!

-

Just as demonstrated with Matter.js and toxiclibs.js in chapter 6, the ml5.js library can be imported into index.html.

+

In a typical machine learning scenario, the next step after establishing the inputs and outputs is to configure the architecture of the neural network. This involves specifying the number of hidden layers between the inputs and outputs, the number of neurons in each layer, which activation functions to use, and more! While all of this is possible with ml5.js, it will make its best guess and design a model for you based on the task and data.

+

As demonstrated with Matter.js and toxiclibs.js in chapter 6, you can import the ml5.js library into your index.html file.

<script src="https://unpkg.com/ml5@latest/dist/ml5.min.js"></script>
-

The ml5.js library is a collection of machine learning models and functions that can be accessed with the syntax ml5.functionName(). If you wanted to use a pre-trained model that detects hands, you might say ml5.handpose() or for classifying images ml5.imageClassifier(). I encourage to explore all of what ml5.js has to offer (and I will reference some of these pre-trained models in upcoming exercise ideas), however, for this chapter, I’ll be focusing on one function only in ml5.js, the function for creating a generic “neural network”: ml5.neuralNetwork().

-

Creating the neural network involves first making a JavaScript object with the necessary configuration properties of the network. There are many options you can use to list, but almost all of them are optional as the network will use many defaults. The default task in ml5.js is “regression” so if you wanted to create a neural network for classification you would have to write the code as follows:

+

The ml5.js library is a collection of machine learning models that can be accessed using the syntax ml5.functionName(). For example, to use a pre-trained model that detects hands, you can use ml5.handpose(). For classifying images, you can use ml5.imageClassifier(). While I encourage exploring all that ml5.js has to offer (I will reference some of these pre-trained models in upcoming exercise ideas), for this chapter, I will focus on only one function in ml5.js: ml5.neuralNetwork(), which creates an empty neural network for you to train.

+

To create a neural network, you must first create a JavaScript object that will configure the model. While there are many properties that you can set, most of them are optional, as the network will use default values. Let’s begin by specifying the “task” that you intend the model to perform: “regression” or “classification.”

let options = { task: "classification" };
 let classifier = ml5.neuralNetwork(options);
-

This, however, gives ml5.js very little to go on in terms of designing the network architecture. Adding the inputs and outputs will complete the rest of the puzzle for it. In the case of MNIST, we established there were 784 inputs (grayscale pixel colors) and 10 possible output labels (digits “0” through “9”). This can be configured in ml5.js with a single integer for the number of inputs and an array of strings for the list of output labels.

+

This, however, gives ml5.js very little to go on in terms of designing the network architecture. Adding the inputs and outputs will complete the rest of the puzzle for it. In the case of MNIST, there are 784 inputs (grayscale pixel colors) and 10 possible output labels (digits “0” through “9”). This can be configured in ml5.js with a single integer for the number of inputs and an array of strings for the list of output labels.

let options = {
   inputs: 784,
   outputs: ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
@@ -578,27 +576,28 @@ 

Setting up the Neural Network

  task: "regression",
};
let energyPredictor = ml5.neuralNetwork(options);
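Pieced back together with the inputs and outputs from the energy usage table shown earlier, the full configuration might look like the following (the property names are my own placeholder guesses, following the three-inputs-one-output structure described in that table):

let options = {
  inputs: ["occupants", "size", "temperature"],
  outputs: ["usage"],
  task: "regression",
};
let energyPredictor = ml5.neuralNetwork(options);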

-

While the MNIST and energy predictor scenarios are useful starting points for understanding how machine learning works, it's important to note that these are simplified versions of what you might encounter in a “real-world” machine learning application. Depending on the problem, there could be significantly higher levels of complexity both in terms of the network architecture and the scale and preparation of data. Instead of a neatly packaged dataset like MNIST, you might be dealing with enormous amounts of messy data. This data might need to be processed and refined before it can be effectively used. You might think of it like organizing, washing, and chopping ingredients before you can start cooking with them.

+

While the MNIST and energy predictor scenarios are useful starting points for understanding how machine learning works, it's important to note that they are simplified versions of what you might encounter in a “real-world” machine learning application. Depending on the problem, there could be significantly higher levels of complexity both in terms of the network architecture and the scale and preparation of data. Instead of a neatly packaged dataset like MNIST, you might be dealing with enormous amounts of messy data. This data might need to be processed and refined before it can be effectively used. You can think of it like organizing, washing, and chopping ingredients before you can start cooking with them.

The “lifecycle” of a machine learning model is typically broken down into seven steps.

  1. Data Collection: Data forms the foundation of any machine learning task. This stage might involve running experiments, manually inputting values, sourcing public data, or a myriad of other methods.
-
  2. Data Preparation: Raw data often isn't in a format suitable for machine learning algorithms. It might also have duplicate or missing values, or contain outliers that skew the data. Such inconsistencies may need to be manually adjusted. Additionally, neural networks typically work best with “normalized” data. While this term might remind you of normalizing vectors, it's important to understand that it carries a slightly different meaning in the context of data preparation. A “normalized” vector’s length is set to a fixed value, typically 1, with the direction intact. However, data normalized for machine learning involves adjusting the values so that they fit within a specific range, commonly between 0 and 1 or -1 and 1. Another key part of preparing data is separating it into two distinct sets: "training" and "testing" data. The training data is used to teach the model (Step 5). The testing data, on the other hand, is set aside and reserved evaluating the model’s performance (Step 6).
+
  2. Data Preparation: Raw data often isn't in a format suitable for machine learning algorithms. It might also have duplicate or missing values, or contain outliers that skew the data. Such inconsistencies may need to be manually adjusted. Additionally, neural networks work best with “normalized” data. While this term might remind you of normalizing vectors, it's important to understand that it carries a slightly different meaning in the context of data preparation. A “normalized” vector’s length is set to a fixed value, usually 1, with the direction intact. However, data normalized for machine learning involves adjusting the values so that they fit within a specific range, generally between 0 and 1 or -1 and 1 (a minimal sketch of this follows the list). Another key part of preparing data is separating it into distinct sets: training, validation, and testing. The training data is used to teach the model (Step 4). On the other hand, the validation and testing data (the distinction is subtle, more on this later) are set aside and reserved for evaluating the model's performance (Step 5).
  3. Choosing a Model: This step involves designing the architecture of the neural network. Different models are more suitable for certain types of data and outputs.
-
  4. Training: This step involves feeding the "training" data through the model, allowing the model to learn and adjust the weights of the neural network based on its errors. This process is known as “optimization” where the model tunes the weights to optimize for the least amount of errors.
-
  5. Evaluation: Remember that data that was saved for “testing” in step 3? Since that data wasn’t used in training, it provides a means to evaluate how well the model performs on new, unseen data.
-
  6. Parameter Tuning: The training process is influenced by a set of parameters (often called “hyperparameters”), such as the "learning rate," which dictates how much the model should adjust its weights based on errors in prediction. By fine-tuning these parameters and possibly revisiting steps 5 (Training), 4 (Choosing a Model), or even 3 (Data Preparation), you can often improve the model's performance.
+
  4. Training: This step involves feeding the "training" data through the model, allowing the model to adjust the weights of the neural network based on its errors. This process is known as “optimization,” where the model tunes the weights to optimize for the least amount of error.
+
  5. Evaluation: Remember that “testing” data that was set aside in step 2? Since that data wasn’t used in training, it provides a means to evaluate how well the model performs on new, unseen data.
+
  6. Parameter Tuning: The training process is influenced by a set of parameters (often called “hyperparameters”), such as the "learning rate," which dictates how much the model should adjust its weights based on errors in prediction. By fine-tuning these parameters and revisiting steps 4 (Training), 3 (Choosing a Model), or even 2 (Data Preparation), you can often improve the model's performance.
  7. Deployment: Once the model is trained and its performance is evaluated satisfactorily, it’s time to actually use the model out in the real world with new data!
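As promised in step 2, here is a minimal sketch of “min-max” normalization, one common way to squash values into the 0 to 1 range (the function is my own illustration and assumes you know each feature's minimum and maximum):

// Scale a value into the range 0 to 1 given its known minimum and maximum
function normalize(value, min, max) {
  return (value - min) / (max - min);
}

// For example, an outdoor temperature of 25 degrees on a 0 to 40 degree scale
console.log(normalize(25, 0, 40)); // 0.625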

Building a Gesture Classifier

-

I’d like to now follow the 7 steps above with an example p5.js problem and build all the code for each step using ml5.js. However, even though 7 is a truly excellent number, I think I missed a critical step. Let’s call it step 0.

+

I’d like to now follow the 7 steps outlined with an example problem well suited for p5.js and build all the code for each step using ml5.js. However, even though 7 is a truly excellent number, I think I missed a critical step. Let’s call it step 0.

  0. Identify the Problem: This initial step involves defining the problem that needs solving. What is the objective? What are you trying to accomplish or predict with your machine learning model?

After all, how are you supposed to collect your data without knowing what you are even trying to do? Are you predicting a number? A category? A sequence? Is it a binary choice, or are there multiple options? These considerations about your inputs (the data fed into the model) and outputs (the predictions) are critical for every other step of the machine learning journey.

-

And so let’s take a crack at step 0 for an example problem of training your first machine learning model with ml5.js and p5.js. Imagine for a moment, you’re working on an interactive application that responds to a gesture, maybe that gesture is ultimately meant to be classified via body tracking, but you want to start with something much simpler—one single stroke of the mouse. Each gesture could be recorded as a vector (extending from the start to the end points of a mouse movement) and the model’s task could be to predict one of four options: “up”, “down”, “left”, or “right.” Perfect! I’ve now got the objective and boiled it down into inputs and outputs!

+

[POSSIBLE ILLUSTRATION OF A SINGLE MOUSE SWIPE AS A GESTURE: basically can the paragraph below be made into a drawing?]

+

Let’s take a crack at step 0 for an example problem of training your first machine learning model with ml5.js and p5.js. Imagine for a moment, you’re working on an interactive application that responds to a gesture, maybe that gesture is ultimately meant to be classified via body tracking, but you want to start with something much simpler—one single stroke of the mouse. Each gesture could be recorded as a vector (extending from the start to the end points of a mouse movement) and the model’s task could be to predict one of four options: “up”, “down”, “left”, or “right.” Perfect! I’ve now got the objective and boiled it down into inputs and outputs!
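As a sketch of what step 0 implies (my own illustration of the idea, not code from the example), a single mouse gesture could be captured in p5.js like this:

// Record the start of the gesture when the mouse is pressed
let start;

function mousePressed() {
  start = createVector(mouseX, mouseY);
}

// On release, the gesture is the vector from start to end
function mouseReleased() {
  let end = createVector(mouseX, mouseY);
  let gesture = p5.Vector.sub(end, start);
  // Normalizing to a length of 1 keeps the components between -1 and 1
  gesture.normalize();
  console.log(gesture.x, gesture.y);
}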

Data Collection and Preparation

-

Next, I’ve got steps 1 and 2: data collection and preparation. Here, I’d like to take the approach of ordering a machine learning “meal-kit,” where the ingredients (data) comes pre-portioned and prepared. I’d like to focus here on the cooking itself, the process of training a machine learning model. After all, this is really just an appetizer for what will be the ultimate meal later in this chapter when I get to applying neural networks to steering agents.

-

So for me, I’m going to hard-code that data itself and manually keep it normalized within a range of -1 and 1. Here it is directly into the code (rather than loaded from a separate file) and organized into an array of objects, pairing the x,y components of a vector with a string label.

+

Next, I’ve got steps 1 and 2: data collection and preparation. Here, I’d like to take the approach of ordering a machine learning “meal-kit,” where the ingredients (data) come pre-portioned and prepared. This way, I’ll get straight to the cooking itself, the process of training the model. After all, this is really just an appetizer for what will be the ultimate meal later in this chapter when I get to applying neural networks to steering agents.

+

For this step, I’ll hard-code the data and manually keep it normalized within a range of -1 to 1. Here it is directly written into the code, rather than loaded from a separate file. It is organized into an array of objects, pairing the x,y components of a vector with a string label.

let data = [
   { x: 0.99, y: 0.02, label: "right" },
   { x: 0.76, y: -0.1, label: "right" },
@@ -609,7 +608,7 @@ 

Data Collection and Preparation

  { x: 0.01, y: -0.9, label: "up" },
  { x: -0.1, y: -0.8, label: "up" },
];
-

In truth, it would likely be better to collect example data by asking users to perform specific gestures and recording their inputs, or by creating synthetic data that represents the idealized versions of the gestures I want the model to recognize. In either case, the key is to collect a diverse set of examples that adequately represent the variations in how the gestures might be performed. But let’s see how it goes with this small amount of dummy data.

+

In truth, it would likely be better to collect example data by asking users to perform specific gestures and recording their inputs, or by creating synthetic data that represents the idealized versions of the gestures I want the model to recognize. In either case, the key is to collect a diverse set of examples that adequately represent the variations in how the gestures might be performed. But let’s see how it goes with just a few servings of data.

Exercise 10.3

@@ -617,7 +616,7 @@

Exercise 10.3

JSON (JavaScript Object Notation) and CSV (Comma-Separated Values) are two popular formats for storing and loading data. JSON stores data in key-value pairs and follows the same exact format as JavaScript objects. CSV is a file format that stores “tabular” data (like a spreadsheet). There are numerous other data formats you could use depending on your needs and what programming environment you are working with.
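For example, one of the gesture data points might be stored in either format as follows (hypothetical file snippets, just for illustration):

gestures.json:
[
  { "x": 0.99, "y": 0.02, "label": "right" }
]

gestures.csv:
x,y,label
0.99,0.02,right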

-

I’ll also note that, much like some of the genetic algorithm demonstrations in chapter 9, I am selecting a problem here that has a known solution and could have been solved more easily and efficiently without a neural network. The direction of a vector can be classified with the heading2D() function and a series of if statements! However, by using this seemingly trivial scenario, I’m hoping it will help me explain the process of training a machine learning model in a clear way as well as make it easy to check if the code is functioning as expected. When I’m done I’ll provide some ideas about how to expand this classifier to a scenario where if statements would not apply.

+

I’ll also note that, much like some of the genetic algorithm demonstrations in chapter 9, I am selecting a problem here that has a known solution and could have been solved more easily and efficiently without a neural network. The direction of a vector can be classified with the heading() function and a series of if statements! However, by using this seemingly trivial scenario, I hope to explain the process of training a machine learning model in an understandable and friendly way. Additionally, it will make it easy to check if the code is working as expected! When I’m done I’ll provide some ideas about how to expand the classifier to a scenario where if statements would not apply.
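For comparison, here is roughly what that non-machine-learning alternative could look like in p5.js (the function and the angle boundaries are my own choices):

function classifyDirection(v) {
  // heading() returns the vector's angle; degrees() converts from radians
  let angle = degrees(v.heading());
  if (angle >= -45 && angle <= 45) {
    return "right";
  } else if (angle > 45 && angle < 135) {
    // Positive y points down in p5.js
    return "down";
  } else if (angle < -45 && angle > -135) {
    return "up";
  } else {
    return "left";
  }
}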

Choosing a Model

This is where I am going to let ml5.js do the heavy lifting for me. To create the model with ml5.js, all I need to do is specify the task, the inputs, and the outputs!

let options = {
@@ -627,12 +626,12 @@ 

Choosing a Model

  debug: true
};
let classifier = ml5.neuralNetwork(options);
-

That's it! I'm done! Thanks to ml5.js, I can bypass a host of complexities related to the manual configuration and setup of the neural network. This includes decisions about the network architecture, such as how many layers and neurons per layer to have, the kind of activation functions to use, and the setup of algorithms for training the network. Keep in mind that the default model architecture selected by ml5.js may not be perfect for all cases. I encourage you to read the ml5.js reference for additional explanations and details on how to customize the model.

-

I’ll also point out that ml5.js is able to infer the inputs and outputs from the data itself, so those properties is not entirely necessary to include here in the options object. However, for the sake of clarity (and since I’ll need to specify those for later examples), I’m including them here.

-

The debug property, when set to true, enables a visual interface for the training process. It’s a helpful too for spotting potential issues during training and for getting a better understanding of what's happening behind the scenes.

+

That's it! I'm done! Thanks to ml5.js, I can bypass a host of complexities related to the manual configuration and setup of the model. This includes decisions about the network architecture, such as how many layers and neurons per layer to have, the kind of activation functions to use, and the setup of algorithms for training the network. Keep in mind that the default model architecture selected by ml5.js may not be perfect for all cases. I encourage you to read the ml5.js reference for additional details on how to customize the model.

+

I’ll also point out that ml5.js is able to infer the inputs and outputs from the data itself, so those properties are not entirely necessary to include here in the options object. However, for the sake of clarity (and since I’ll need to specify those for later examples), I’m including them here.

+

The debug property, when set to true, enables a visual interface for the training process. It’s a helpful tool for spotting potential issues during training and for getting a better understanding of what's happening behind the scenes.

Training

-

Now that I have the data and a neural network initialized in the classifier variable, I’m ready to train the model! The thing is, I’m not really done with the data. In the “Data Collection and Preparation” section, I organized the data neatly into an array of objects, representing the x,y components of a vector paired with a string label. This format, while typical, isn't directly consumable by ml5.js for training. I need to be more specific about what are the inputs and what are the outputs for training the model. I certainly could have originally organized the data into a format that ml5.js recognizes, but I’m including this extra step as it’s much more likely to be what happens when you are using a “real” dataset that you’ve collected or sourced elsewhere.

-

ml5.js offers a fair amount of flexibility in the kinds of formats it will accept, the one I will choose to use here involves arrays—one for the inputs and one for the outputs.

+

Now that I have the data and a neural network initialized in the classifier variable, I’m ready to train the model! The thing is, I’m not really done with the data. In the “Data Collection and Preparation” section, I organized the data neatly into an array of objects, representing the x,y components of a vector paired with a string label. This format, while typical, isn't directly consumable by ml5.js for training. I need to specify which elements of the data are the inputs and which are the outputs for training the model. I could have initially organized the data into a format that ml5.js recognizes, but I'm including this extra step because it's more likely to be what happens when using a "real" dataset that has been collected or sourced elsewhere.

+

The ml5.js library offers a fair amount of flexibility in the kinds of formats it will accept; here, I will use arrays—one for the inputs and one for the outputs.

for (let i = 0; i < data.length; i++) {
   let item = data[i];
   // An array of 2 numbers for the inputs
@@ -644,27 +643,27 @@ 

Training

}

A term you will often hear when talking about data in machine learning is “shape.” What is the “shape” of your data?

The "shape" of data in machine learning describes its dimensions and structure. It indicates how the data is organized in terms of rows, columns, and potentially even deeper, into additional dimensions. In the context of machine learning, understanding the shape of your data is crucial because it determines how the model should be structured.

-

Here, the input data's shape is a one-dimensional array containing 2 numbers (representing x and y). The output data, similarly, is an array but instead contains a single string label. While this is a very small and simple example, it nicely mirrors many real-world scenarios where input features are numerically represented in an array, and outputs are string labels.

+

Here, the input data's shape is a one-dimensional array containing 2 numbers (representing x and y). The output data, similarly, is an array but just contains a single string label. While this is a very small and simple example, it nicely mirrors many real-world scenarios where input features are numerically represented in an array, and outputs are string labels.

Oh dear, another term to unpack—features! In machine learning, the individual pieces of information used to make predictions are often called features. The term “feature” is chosen because it underscores the idea that distinct characteristics of the data are most salient for the prediction. This will come into focus more clearly in future examples in this chapter.

-

Once the data has been passed into the classifier, ml5.js offers a helper function to normalize it.

+

After passing the data into the classifier, ml5.js provides a helper function to normalize it.

// Normalize the data
 classifier.normalizeData();
-

As I’ve mentioned, normalizing data (adjusting the scale to a standard range) is a critical step in the machine learning process. However, if you recall during the data collection process, the hand-coded data was written with values that already range between -1 and 1. So, while calling normalizeData() here is likely redundant, it's important to demonstrate. Normalizing your data as part of the pre-processing step will absolutely work, the auto-normalization feature of ml5.js is a quite convenient alternative.

+

As I’ve mentioned, normalizing data (adjusting the scale to a standard range) is a critical step in the machine learning process. However, if you recall from the data collection process, the hand-coded data was written with values that already range between -1 and 1. So, while calling normalizeData() here is likely redundant, it's important to demonstrate. Normalizing your data as part of the pre-processing step will absolutely work, but the auto-normalization feature of ml5.js is quite a convenient alternative.

Ok, this subsection is called training. So now it’s time to train! Here’s the code:

-
-// The "train" method initiates the training process
+
// The "train" method initiates the training process
 classifier.train(finishedTraining);
 
 // A callback function for when the training is complete
 function finishedTraining() {
   console.log("Training complete!");
 }
-

Yes, that’s it! After all, the hard work as already been completed! The data was collected, prepared, and fed into the model. However, if I were to run the above code and then test the model, the results would probably be inadequate. Here is where it’s important to introduce another key term in machine learning: the epoch. The train() method tells the neural network to start the learning process. But how long should it train for? You can think of an epoch as one round of practice, one cycle of using the entire dataset to update the weights of the neural network. Generally speaking, the longer you train, the better the network will perform, but at a certain point there are diminishing returns. You can specify the number of epochs with an options object passed into train().

+

Yes, that’s it! After all, the hard work has already been completed! The data was collected, prepared, and fed into the model. However, if I were to run the above code and then test the model, the results would probably be inadequate. Here is where it’s important to introduce another key term in machine learning: epoch. The train() method tells the neural network to start the learning process. But how long should it train for? You can think of an epoch as one round of practice, one cycle of using the entire dataset to update the weights of the neural network. Generally speaking, the longer you train, the better the network will perform, but at a certain point there are diminishing returns. The number of epochs can be set by passing an options object into train().

 //{!1} Setting the number of epochs for training
 let options = { epochs: 25 };
 classifier.train(options, finishedTraining);
-

There are other “hyperparameters” you can set in the options variable (learning rate is one again!) but I’m going to stick with the defaults. You can read more about customization options in the ml5.js reference. The second argument finishedTraining() is optional, but good to include as its a callback that runs when the training process has completed. This is useful for knowing when you can begin the next steps in your code. There is also an additional optional callback typically named whileTraining() that is triggered after each epoch but for my purposes just knowing when it is done is plenty.

+

There are other "hyperparameters" that you can set in the options variable (learning rate is one again!), but I'm going to stick with the defaults. You can read more about customization options in the ml5.js reference.

+

The second argument, finishedTraining(), is optional, but it's good to include because it's a callback that runs when the training process is complete. This is useful for knowing when you can proceed to the next steps in your code. There is even another optional callback, which I usually name whileTraining(), that is triggered after each epoch. However, for my purposes, knowing when the training is done is plenty!
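A sketch of wiring up both callbacks might look like the following (the exact arguments passed to whileTraining() can vary by ml5.js version, so treat this as an approximation and check the ml5.js reference):

classifier.train(options, whileTraining, finishedTraining);

// Called after each epoch during training
function whileTraining(epoch, loss) {
  console.log("epoch: " + epoch, loss);
}

// Called once when training completes
function finishedTraining() {
  console.log("Training complete!");
}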

Callbacks

If you've worked with p5.js, you're already familiar with the concept of a callback even if you don't know it by that name. Think of the mousePressed() function. You define what should happen inside it, and p5.js takes care of calling it at the right moment, when the mouse is pressed.

@@ -672,28 +671,28 @@

Callbacks

In JavaScript, there's also a more recent approach for handling asynchronous operations known as "Promises." With Promises, you can use keywords like async and await to make your asynchronous code look more like traditional synchronous code. While ml5.js also supports this style, I’ll stick to using callbacks to stay aligned with p5.js style.

Evaluation

-

With debug set to true as part of the original call to ml5.neuralNetwork(), as soon train() is called, a visual interface will appear covering most of the p5.js page and canvas.

+

If debug is set to true in the initial call to ml5.neuralNetwork(), once train() is called, a visual interface appears covering most of the p5.js page and canvas.

-

This panel or “Visor” represents the evaluation step, as shown in Figure X.X. The “visor” is part of TensorFlow.js and includes a graph that provides real-time feedback on the progress of the training. I’d like to focus on the “loss” plotted on the y-axis against the number of epochs along the x-axis.

+

This panel, called "Visor," represents the evaluation step, as shown in Figure X.X. The Visor is a part of TensorFlow.js and includes a graph that provides real-time feedback on the progress of the training. Let’s take a moment to focus on the "loss" plotted on the y-axis against the number of epochs along the x-axis.

So, what exactly is this "loss"? Loss is a measure of how far off the model's predictions are from the “correct” outputs provided by the training data. It quantifies the model’s total error. When training begins, it's common for the loss to be high because the model has yet to learn anything. As the model trains through more epochs, it should, ideally, get better at its predictions, and the loss should decrease. If the graph goes down as the epochs increase, this is a good sign!
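ml5.js chooses the loss function behind the scenes based on the task, but as one concrete illustration of how a loss can quantify total error (not necessarily the function used here), mean squared error averages the squared differences between the known outputs y_i and the model’s guesses \hat{y}_i over n training examples:

\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2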

-

Running the training for 200 epochs might strike you as a bit excessive, especially for such a tiny dataset. In a real-world scenario with more extensive data, I would probably use fewer epochs. However, because the dataset here is limited, the higher number of epochs ensures that our model gets enough "practice" with the data. Remember, this is a "toy" example, aiming to make the concepts clear rather than to produce a sophisticated machine learning model.

-

Below the graph, you will also see a "model summary" table. This provides details on the lower-level TensorFlow.js model architecture that ml5.js created behind the scenes. This summary details default layer names, neuron counts per layer, and an aggregate "parameters" count, referring to weights connecting the neurons.

-

Now, before moving on, I’d like to refer back to the data preparation step. There I mentioned the idea of splitting the data between “training” and “testing.” In truth, a full machine learning workflow would split the data into three categories:

+

Running the training for 200 epochs might strike you as a bit excessive. In a real-world scenario with more extensive data, I would probably use fewer epochs. However, because the dataset here is so tiny, the higher number of epochs helps the model get enough "practice" with the data. Remember, this is a "toy" example, aiming to make the concepts clear rather than to produce a sophisticated machine learning model.

+

Below the graph, you will find a "model summary" table that provides details on the lower-level TensorFlow.js model architecture created behind the scenes. The summary includes layer names, neuron counts per layer, and a "parameters" count, which is the total number of weights, one for each connection between two neurons.

+

Now, before moving on, I’d like to refer back to the data preparation step. There I mentioned the idea of splitting the data between “training,” “validation,” and “testing.”

  1. training: primary dataset used to train the model
  2. validation: subset of data used to check the model during training
  3. testing: additional untouched data never considered during the training process, used to determine the model’s final performance.
-

With ml5.js, while it’s possible to incorporate all three categories of data. However, I’m simplfying things here and focusing only on the training dataset. After all, my dataset only has 8 records in it, it’s much too small to divide into separate stages. For a more rigorous demonstration, this would be a terrible idea! Working only with training data risks the model “overfitting” the data. Overfitting is a term that describes when a machine learning model has learned the training data too well. In this case, it’s become so “tuned” to the specific details and any pecularities or noise in that data, that is is much less effective when working with new, unseen data. The best way to combat overfitting, is to use validation data during the training process! If it performs well on the training data but poorly on the validation data, it's a strong indicator that overfitting might be occurring.

+

While it’s possible to incorporate all three categories of data with ml5.js, I’m simplifying things here and focusing only on the training dataset. After all, my dataset only has 8 records; it’s much too small to divide into three different sets! Using such a small dataset risks the model “overfitting” the data. Overfitting is a term that describes when a machine learning model has learned the training data too well. In this case, it’s become so “tuned” to the specific peculiarities of the training data that it is much less effective when working with new, unseen data. The best way to combat overfitting is to use validation data during the training process! If the model performs well on the training data but poorly on the validation data, it's a strong indicator that overfitting might be occurring.

ml5.js provides some automatic features to employ validation data; if you are inclined to go further, you can explore the full set of neural network examples at ml5js.org.

Parameter Tuning

-

After the evaluation step, there is typically an iterative process of adjusting "hyperparameters" to achieve the best performance from the model. The ml5.js library is designed to provide a higher-level, user-friendly interface to machine learning. So while it does offer some capabilities for parameter tuning (which you can explore in the ml5.js reference), it is not as geared towards low-level, fine-grained adjustments as some other frameworks might be. However, ultimately, TensorFlow.js might be your best bet since it offers a broader suite of tools and allows for lower-level control over the training process. For this demonstration—seeing a loss all the way down to 0.1 on the evaluation graph—I am satisfied with the result and happy to move onto deployment!

+

After the evaluation step, there is typically an iterative process of adjusting "hyperparameters" to achieve the best performance from the model. The ml5.js library is designed to provide a higher-level, user-friendly interface to machine learning. So while it does offer some capabilities for parameter tuning (which you can explore in the reference), it is not as geared towards low-level, fine-grained adjustments as some other frameworks might be. Using TensorFlow.js directly might be your best bet since it offers a broader suite of tools and allows for lower-level control over the training process. For this demonstration—seeing a loss all the way down to 0.1 on the evaluation graph—I am satisfied with the result and happy to move onto deployment!

Deployment

-

This is it, all that hard work has paid off! Now it’s time to deploy the model. This typically involves integrating it into a separate application to make predictions or decisions based on new, unseen data. For this, ml5.js offers the convenience of a save() and load() function. After all, there’s no reason to re-train a model every single time you use it! You can download the model to a file in one sketch and then load it for use in a completely different one. However, in this tiny, toy example, I’m going to demonstrate deploying and utilizing the model in the same sketch where it was trained.

-

The model is saved in the classifier variable so, in essence, it is already deployed. I know when it’s done because of the finishedTraining() callback so can use a boolean or other logic to engage the prediction stage of the code. In this example, I’ll create a global variable called label which will display the status of training and ultimately the predicted label to the canvas.

+

This is it, all that hard work has paid off! Now it’s time to deploy the model. This typically involves integrating it into a separate application to make predictions or decisions based on new, unseen data. For this, ml5.js offers the convenience of a save() and load() function. After all, there’s no reason to re-train a model every single time you use it! You can download the model to a file in one sketch and then load it for use in a completely different one. However, for simplicity, I’m going to demonstrate deploying and utilizing the model in the same sketch where it was trained.

+

Once the training process is complete, the resulting model is saved in the classifier variable and is, in essence, deployed. You can detect the completion of the training process using the finishedTraining() callback and use a boolean variable or other logic to initiate the prediction stage of the code. For this example, I’ll include a global variable called status to track the training process and ultimately display the predicted label on the canvas.

// When the sketch starts, it will show a status of "training"
 let status = "training";
 
@@ -708,18 +707,18 @@ 

Deployment

function finishedTraining() {
  status = "ready";
}
-

Once the model is trained, the classify() function can be used to send new data into the model for prediction. The format of the data sent to classify() should match the format of the data used in training, in this case two floating point numbers, representing the x and y components of a direction vector.

+

Once the model is trained, the classify() method can be called to send new data into the model for prediction. The format of the data sent to classify() should match the format of the data used in training, in this case two floating point numbers, representing the x and y components of a direction vector.

// Manually creating a vector
 let direction = createVector(1, 0);
 // Converting the x and y components into an input array
 let inputs = [direction.x, direction.y];
 // Asking the model to classify the inputs
 classifier.classify(inputs, gotResults);
-

The second argument of the classify() function is a callback. While it would be more convenient to receive the results back immediately and move onto the next line of code, just like with model loading and training, the results come back a later time via a separate callback event.

+

The second argument of the classify() function is a callback. Although it would be more convenient to receive the results immediately and move on to the next line of code, the results are returned later through a separate callback event (just as with model loading and training).

function gotResults(results) {
   console.log(results);
 }
-

The models prediction arrives in the form of an argument to the callback. Inside, you’ll find an array of the labels, sorted by “confidence.” Confidence refers to the probability assigned by the model to each label, representing how sure it is of that particular prediction. It ranges from 0 to 1, with values closer to 1 indicating higher confidence and values near 0 suggesting lower confidence.

+

The model’s prediction arrives in the argument to the callback, which I’m calling results in the code. Inside, you’ll find an array of the labels, sorted by “confidence.” Confidence refers to the probability assigned by the model to each label, representing how sure it is of that particular prediction. It ranges from 0 to 1, with values closer to 1 indicating higher confidence and values near 0 suggesting lower confidence.

[
   {
     "label": "right",
@@ -738,7 +737,7 @@ 

Deployment

"confidence": 0.00029277068097144365 } ]
-

In the example output here, the model is highly confident (approximately 96.7%) that the correct label is "right," while it has minimal confidence in the "left" label, 0.03%. The confidence values are also normalized and add up to 100%.

+

In the example output here, the model is highly confident (approximately 96.7%) that the correct label is "right," while it has minimal confidence in the "left" label, 0.03%. The confidence values are normalized and add up to 100%.

Example 10.2: Gesture Classifier

@@ -773,16 +772,12 @@

Example 10.2: Gesture Classifier

}

Since the array is sorted by confidence, if I just want to use a single label as the prediction, I can access the first element of the array with results[0].label as in the gotResults() function in Example 10.2.
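Condensed down, that pattern looks like this (mirroring the gotResults() function from Example 10.2):

function gotResults(results) {
  // The first element of the sorted array is the model's most confident guess
  status = results[0].label;
}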

-

- Exercise 10.4 - Divide Example 10.2 into three different sketches, one for collecting data, one for training, and one for deployment. Using the ml5.neuralNetwork functions save() and load() for saving and loading the model to and from a file. -

+

Exercise 10.4

+

Divide Example 10.2 into three different sketches: one for collecting data, one for training, and one for deployment. Use the ml5.neuralNetwork functions save() and load() to save and load the model to and from a file.

-

- Exercise 10.5 - Expand the gesture recognition to classify a sequence of vectors, capturing more accurately the path of a longer mouse movement. Remember your input data must have a consistent shape! So you’ll have to decide on how many vectors to use to represent a gesture and store no more and no less for each data point. While this approach can work, other machine learning models (such as Recurrent Neural Networks) are specifically designed to handle sequential data and might offer more flexibility and potential accuracy. -

+

Exercise 10.5

+

Expand the gesture recognition to classify a sequence of vectors, capturing more accurately the path of a longer mouse movement. Remember your input data must have a consistent shape! So you’ll have to decide on how many vectors to use to represent a gesture and store no more and no less for each data point. While this approach can work, other machine learning models (such as Recurrent Neural Networks) are specifically designed to handle sequential data and might offer more flexibility and potential accuracy.

What is NEAT? “NeuroEvolution of Augmenting Topologies”

flappy bird scenario (classification) vs. steering force (regression)?