Ai lessons #2029

Open
wants to merge 85 commits into
base: master

retabak
Collaborator

@retabak retabak commented Apr 17, 2024

hey!

So, there are three lessons in-progress. I actually do think it would be helpful to review all three, although the only one that is close to being complete is ai-decision-trees. I think you should read them in the order in which I've listed them below.

Shriram has reviewed everything that you're going to review at least once. I have responded to his feedback, therefore things have changed and evolved a bit since he last saw them.

I'm aware that there is vocab that is not yet in the glossary (and the accompanying error messages).

One request: Since we're trying to get at least one lesson online, I'm hoping y'all will consider if ai-decision-trees can stand on its own and - if not - let me know what must change / be added. If we really want to get two lessons online, there's the possibility of combining what I've written so far for the machine learning lesson and the regression lesson? Maybe? That said, I'm 1000% confident that everything is going to change and evolve a ton with your amazing feedback!

ai-machine-learning : My plan was for this lesson to include a broad introduction to machine learning (1st section), and then kids would play with the spell checker (2nd section). I have since learned that the spell checker doesn't demonstrate the concepts that I originally thought it would demonstrate, so will need to collaborate with Shriram more to write this lesson. I'd like to introduce the idea of supervised / unsupervised learning here (which is relevant because those terms come up in subsequent lessons), and generally emphasize that ML is entirely driven by data and that an AI’s capabilities are directly linked to the data it was trained on.

ai-regression : The first section is designed to get kids interested in learning about regression as a type of machine learning. The second section (which has not been written) will let kids practice regression and think more deeply about regression in an AI context. @schanzer and I have brainstormed some about this, but I'm still feeling a little bit stuck. I need an example of regression that is AI, not just pure DS... and that's tricky.

ai-decision-trees : There are two completed sections of this lesson, and a bunch of activities. Shriram has reviewed this lesson, and I have responded to his feedback. Someday I might like to add a third section on overfitting. I should mention that I wasted a lot of time making trees / images for this lesson, and they are definitely far from perfect. I don't know if there's any easier / better way to produce trees. Also: I've made some attempts to weave in #1335 but am confident that @flannery-denny will do so far more capably. Have at it, @flannery-denny !

ai-clustering : This lesson does not yet exist, but I have lots of ideas, and I think that it should exist. :)

@flannery-denny
Collaborator

flannery-denny commented Apr 18, 2024

I've started with the decision tree lesson.

  • I think it could stand on its own if polished.

RT: Yay!

  • I couldn't read any of your diagrams so I had to open the asciidoc and increase sizes (I do not generally experience any difficulties with reading things). If you remade the diagrams and made the font sizes uniform, we could live with much smaller diagrams.

RT: I'd like to chat with you and / or @schanzer about making the diagrams. It was a miserable struggle and I think there has to be an easier way. Agreed that they suck.

  • Work on the asciidoc got me into the weeds and I've pushed some changes...
  • At some point I realized that the first half of the lesson focuses on a game and I'm not sure that it works unless you become much more prescriptive and I got out of the weeds. What happens if they don't ask about utensils as the first question? What happens if both the first item and the second item are a utensil? None of the diagrams and follow up activities have much connection. It seems you actually need to tell teachers which items to use as their first and second items. And, perhaps teachers could have students popcorn up questions and then force the choice of the utensil question from everything that pops up?

RT: This is the comment that I really do not understand, which also has me concerned that I have utterly failed to communicate something in the lesson. What's wrong with asking a different first question? What's wrong if the first and second item are a utensil? The same concepts will still be illustrated. The part where you said "None of the diagrams and follow up activities have much connection" is TOTALLY confusing to me. The entire first section of the lesson is dedicated to growing a single decision tree that categorizes knife, spoon, cup, plate, fork and mug. Is that not clear? I'm assuming you think it's too much of an ask to have teachers create a tree based on the questions that their kids actually ask? I'm not sure why that would be too big of a challenge for teachers? FWIW, there is a teacher note in the lesson that says "If you want your tree to look similar to the sample, choose "knife" as the first secret word, then help students to notice that there are three utensils on the list of items. That said, there is no need for your class diagram to be an exact copy of the sample provided."

  • I think we need to come up with different synthesize questions for the first section because, as you can see from the edits I made earlier, I think lots of people won't have a frame of reference for 20 questions. I didn't know how to play until I read this lesson, and we aren't actually teaching them to play 20 questions; I got my understanding from your possible answers. Perhaps we could give the game in this lesson a name and reference it? The same is true for where it's referenced in the second section.

RT: I literally thought that everyone in the world knows 20 questions, but perhaps I was wrong! This is a bummer, because it's a really nice frame of reference. I'm not opposed to naming the game in the lesson but am curious if @schanzer thinks this is not a common enough reference.

  • I love the part of the workbook page where students have to classify the spork and the chopstick. I wonder if it belongs earlier on the page.

RT: seems like a good idea.

  • You begin part 2 by stating "We have a sense of the hierarchical structure, flexibility, and versatility of decision trees." I don't think you actually specifically talked about any of those things. Perhaps they could be drawn out in the synthesize of the first section?

RT: Hm, I might need to think on this more. I think I was hoping some of this would be self-evident after students had completed the regression lesson.

  • We need to actually explain what a decision stump is - and what purity is before we send students to the workbook page. Also seems like something that belongs in the glossary. I was really confused by where those diagrams came from. And I think question 4 belongs above questions 2 (that ask how often our machine will make the right decision)?

RT: We aren't sending students to the workbook page! This one is supposed to be done as a class - guided practice. I also don't know that a definition of "decision stump" will be useful until students have made one. Purity is also way easier to understand when you have a stump in front of you. If both you and @schanzer want these definitions before the page, that's fine with me. I think that the best approach here would be to add a teacher note before the class completes this worksheet, to walk teachers through how to make a decision stump. Any more scaffolding would be scaffolding to the scaffolding... which feels excessive. I disagree about moving Q2 above Q4. I want students to figure out 2 and 3 just using the dataset, to appreciate how the stump simplifies things.

  • I think the why start with age pedagogy box deserves a defend and decide workbook page with a long skinny tree and a wide tree.

RT: I don't think this is worth getting into - it's very nitty gritty (as in, I couldn't find info on "how to break a tie" between attributes on the web, so had to consult with Shriram). I also don't think the trees will look terribly different with such a small dataset. If you'd like, I can send you an email exchange between me and Shriram about this topic for more details. (He does not think it's worth getting into either.)

  • There are two sections of Decision Tree: Level 1 but I'm not seeing any directions about when to tackle the second part or scaffolding that would help the student know how to fill in the blanks with column names and it isn't obvious to me what you're expecting here.

RT: This entire page is supposed to be completed as a class. This page is the scaffolding. Does that help? Should I emphasize that more? Give teachers suggestions for guiding students through it?

  • On Decision Tree: Level 2 - it seems we should be asking this question of both stumps?
  • "4 According to our training data, this rule is correct in instances out of five."

RT: Q8 asks, "Will a computer following this rule make the correct prediction every time?" The answer is yes, so I thought adding in "according to our training data..." would be redundant.

  • I think 3a and 3c of Building and Testing a Decision Tree are written in a confusing way, but because there are things students are supposed to do before number 1, I didn't notice until I went back to write this comment that students are supposed to fill in the tree. I would also suggest rewriting question 8, as I had to look at the solution to determine the scope of the suggestion you were looking for.

RT: 3a and 3c use the same sentence structure that students encounter on the Level 1 and Level 2 worksheets. Was hoping it would be recognizable. Maybe it's not? I can definitely add more prompting to Q8.
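
For anyone following the "decision stump" and "purity" discussion in this comment, here is a minimal Python sketch of the two terms. Everything in it is an assumption for illustration: the toy rows, the `has_handle` attribute, the "utensil" / "dish" labels, and the helper names `stump_purity` and `stump_accuracy` are hypothetical and are not taken from the lesson's dataset or workbook pages. It treats a stump as a one-question tree and computes purity as the share of the majority label in each branch, which is one common definition and may not match the wording the lesson ends up using.

```python
from collections import Counter

# Hypothetical toy rows: (item, has_handle, label). Not the lesson's dataset.
ROWS = [
    ("knife", True, "utensil"),
    ("spoon", True, "utensil"),
    ("fork", True, "utensil"),
    ("mug", True, "dish"),
    ("cup", False, "dish"),
    ("plate", False, "dish"),
]


def stump_purity(rows):
    """Split rows on one yes/no attribute and report, for each branch,
    the majority label and its purity (share of rows with that label)."""
    branches = {True: [], False: []}
    for _item, has_handle, label in rows:
        branches[has_handle].append(label)
    report = {}
    for answer, labels in branches.items():
        if not labels:
            continue  # skip a branch that received no rows
        majority, count = Counter(labels).most_common(1)[0]
        report[answer] = (majority, count / len(labels))
    return report


def stump_accuracy(rows):
    """Fraction of rows labeled correctly when each branch predicts
    its majority label."""
    report = stump_purity(rows)
    correct = sum(
        1 for _item, has_handle, label in rows
        if report[has_handle][0] == label
    )
    return correct / len(rows)


if __name__ == "__main__":
    print(stump_purity(ROWS))
    print(stump_accuracy(ROWS))
```

In this toy data the "no handle" branch is perfectly pure, while the "has handle" branch is mixed because of the mug, so the stump gets five of the six rows right. That is the kind of "correct in some instances out of the total" statement the worksheet questions ask about.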

@flannery-denny
Collaborator

@schanzer I’ve finished my feedback on Rachel’s decision tree lesson and left a long comment. It’s got a lot of potential but is going to take some work. lmk if you want a chance to weigh in before I "have at it", as Rachel suggested.

Regardless, I will read the other two before making further edits.

@schanzer
Member

@flannery-denny "have at it!" I'll wait to review until you give me the go-ahead

@retabak
Collaborator Author

retabak commented Apr 19, 2024

@flannery-denny - Thanks for all of your thoughts / ideas and for your edits! I agree with all edits. As for your comments... I agree with some, disagree with some, and am very very confused by one.

Hoping @schanzer can weigh in on a few of my responses above as well.

I recognize that this may be easier to discuss in real time?

@flannery-denny
Collaborator

@retabak fyi - tabling work on this until PDprep is completed. Will definitely check back in with you before I have at it.


Before we start learning about AI (How does it actually work? What's the connection to data science?), let’s consider what sort of information we’ve absorbed about AI just by consuming books, movies, TV, and video games.

@QandA{

Member

I would actually start with these questions. Use them as the hook! I could even see a worksheet out there that gets students talking to one another about AI, what they've heard, what they've used, etc.

Only after kids have shared all of this does it make sense to point out that there's a lot of hype and misinformation.


As it turns out, characterizations of AI featuring robot rebellion and the like envision technology that *does not currently exist*. That said, there is plenty of AI that we interact with each and every day—and the futuristic dystopian AI imagined in movies, video games, and books can sometimes interfere with the way that we understand the AI of the present.

The AI that we interact with—in fact the only type of AI that we have achieved—is called Artificial Narrow Intelligence (ANI). ANI generally focuses on a single narrow task and has a limited range of abilities. Virtual assistants (Google Assistant, Siri, and Alexa, to name a few) are one example. These apps respond to voice commands in order to perform simple tasks (setting alarms, making phone calls, and answering questions). Can you think of any others?

Member

Is it worth talking about ANI if we're not going to talk about other kinds of AI? The term "ANI" is better understood when they also know what's not ANI.

I could even see a situation where the teacher groups student reactions to the previous prompts by whether they're talking about one kind of AI or another.

@fitb{}{}


@n Respond to the questions below, considering how each variation in training will influence ALVINN's performance.

Member

These all seem different enough that they should be different numbers, rather than sub-questions of question 2. If they were all different questions about a model after a single training, that would be different.

Collaborator Author

Ugh. They used to all be their own numbers. Shriram thought that was confusing and asked me to change them to sub-questions.

@fitb{}{}


b) Predict how safely ALVINN will drive on that same road on a sunny, snowy day. Explain. @fitb{}{}

Member

I think I'm getting hung up on "predict how safely", because it makes me think you want like a percentage or something. How about "Do you think ALVINN will drive just as well on that same road on a sunny, snowy day? Why or why not?"

}


@n In addition to producing a steering angle for each image of the road, ALVINN produces a _numeric_ measure of "confidence" in its response. What do you think causes ALVINN's "confidence" to increase or decrease? Is 100% "confidence" possible? @fitb{}{}

Member

This is really cool, but I want to know where this confidence number comes from!

Collaborator Author

I wanted to know too. Couldn't figure it out or find it. Do you think this is a problem?

Member

I hope this doesn't put you in the middle between Shriram and me, but overall I feel like the word "regression" could be removed from the lesson without any impact at all. That tells me that we're not really making the connection between machine learning and linear regression that Shriram was hoping for. Probably easiest to discuss in our one-on-one.

Collaborator Author

yep, let's discuss.

4 participants