diff --git a/functions_advanced.ipynb b/functions_advanced.ipynb new file mode 100644 index 00000000..8200ef09 --- /dev/null +++ b/functions_advanced.ipynb @@ -0,0 +1,4959 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "658fd9eb-ee74-4c01-843a-69985c393c11", + "metadata": {}, + "source": [ + "# Functions (advanced)" + ] + }, + { + "cell_type": "markdown", + "id": "f9ad2013-b8f7-4c9c-8ee7-6316a503795b", + "metadata": {}, + "source": [ + "# Table of Contents\n", + "\n", + "- [Recap](#Recap)\n", + " - [Functions are objects](#Functions-are-objects)\n", + " - [Scopes and namespaces](#Scopes-and-namespaces)\n", + "- [Mutable objects as default values of function's parameters](#Mutable-objects-as-default-values-of-function's-parameters)\n", + "- [Lambdas](#Lambdas)\n", + " - [Lambdas and sorting](#Lambdas-and-sorting)\n", + "- [Closures](#Closures)\n", + " - [Modifying the free variable](#Modifying-the-free-variable)\n", + " - [Multiple instances of closures](#Multiple-instances-of-closures)\n", + " - [Closures can be tricky](#Closures-can-be-tricky)\n", + " - [Nested closures](#Nested-closures)\n", + " - [Closures: examples](#Closures:-examples)\n", + " - [Example 1](#Example-1)\n", + " - [Example 2](#Example-2)\n", + "- [Decorators](#Decorators)\n", + " - [Decorators: examples](#Decorators:-examples)\n", + " - [Example 1: timer](#Example-1:-timer)\n", + " - [Fibonacci with recursion](#Fibonacci-with-recursion)\n", + " - [Fibonacci with a simple loop](#Fibonacci-with-a-simple-loop)\n", + " - [Fibonacci using `reduce`](#Fibonacci-using-reduce)\n", + " - [Example 2: memoization](#Example-2:-memoization)\n", + " - [Parametrized decorators](#Parametrized-decorators)\n", + "- [Generators](#Generators)\n", + " - [Create an interable from a generator](#Create-an-interable-from-a-generator)\n", + " - [Combining generators](#Combining-generators)\n", + "- [Exercises](#Exercises)\n", + " - [Password checker factory](#Password-checker-factory)\n", + " - [String 
range](#String-range)\n", + " - [Read `n` lines](#Read-n-lines)\n", + " - [Only run once](#Only-run-once)" + ] + }, + { + "cell_type": "markdown", + "id": "1e7a4298-0533-43e8-b7d3-2e330edac443", + "metadata": {}, + "source": [ + "We are going to cover the following topics:\n", + "\n", + "- Decorators\n", + "- Lambdas\n", + "- Arguments and object's mutability\n", + "- Generators" + ] + }, + { + "cell_type": "markdown", + "id": "2b7dac95-e83a-46c6-8ad5-6b55a369bf17", + "metadata": {}, + "source": [ + "## Recap" + ] + }, + { + "cell_type": "markdown", + "id": "218acd08-b0f9-4c4f-83d3-613497d17b2e", + "metadata": {}, + "source": [ + "Before starting our deep dive on functions, we must quickly review two important concepts. Have a look at the [Functions](./functions.ipynb#The-scope-of-a-function) notebook for more detail.\n", + "\n", + "1. Scopes and namespaces\n", + "2. Functions are objects" + ] + }, + { + "cell_type": "markdown", + "id": "b618d311-0e0d-4748-a434-39b5a0a0e585", + "metadata": {}, + "source": [ + "### Functions are objects" + ] + }, + { + "cell_type": "markdown", + "id": "380773ce-09de-42c1-8146-2ed3502ddeca", + "metadata": {}, + "source": [ + "As an example, suppose that we want to create a \"password checker\", that is, a function that can verify if an input password complies with some rules (e.g., minimum length, a given number of special characters). We could create a function with the following signature:\n", + "\n", + "```python\n", + "def check_password(\n", + "    password: str,\n", + "    min_length: int,\n", + "    min_uppercase: int,\n", + "    min_punctuation: int,\n", + "    min_digits: int\n", + ") -> bool:\n", + "    \"\"\"Check if a given password complies with pre-defined rules.\"\"\"\n", + "```\n", + "\n", + "In various situations, passwords are subject to distinct rules. Once these rules are defined, our goal is to streamline the process of handling them. 
We want to avoid re-entering them for every password we check, as this quickly becomes tedious." + ] + }, + { + "cell_type": "markdown", + "id": "438649ea-35d6-4404-8250-fa76bc4fda9d", + "metadata": {}, + "source": [ + "We can instead define a so-called **higher-order function** (see [Functional programming](./functional_programming.ipynb#Higher-Order-Functions-/-Functions-as-Values)): here, a function that returns another function.\n", + "It does **not** call that function; it just returns it." + ] + }, + { + "cell_type": "markdown", + "id": "fc062b73-370e-4fc2-8f51-7237e9022791", + "metadata": {}, + "source": [ + "```python\n", + "\n", + "def check_password_factory(\n", + "    min_length: int,\n", + "    min_uppercase: int,\n", + "    min_punctuation: int,\n", + "    min_digits: int\n", + "):\n", + "    \"\"\"Our password checker factory\"\"\"\n", + "\n", + "    def check_password(password: str) -> bool:\n", + "        \"\"\"Password checker function\"\"\"\n", + "        # our password checking logic\n", + "        # ...\n", + "        return  # True or False\n", + "\n", + "    return check_password\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "ee874e2b-6f6d-454f-a1c1-f79cc50415a5", + "metadata": {}, + "source": [ + "You would first call your factory function with some password requirements:\n", + "\n", + "```python\n", + "password_checker = check_password_factory(min_length=10, min_uppercase=4, min_punctuation=3, min_digits=1)\n", + "```\n", + "\n", + "And then you could verify that an input password adheres to the constraints:\n", + "\n", + "```python\n", + "password_checker(\"MyveryComplexPWD123\")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "5d443e29-214b-4b1e-ae50-c1948080ee91", + "metadata": {}, + "source": [ + "### Scopes and namespaces" + ] + }, + { + "cell_type": "markdown", + "id": "c76d3dde-eb94-4552-9df7-d0e95d946967", + "metadata": {}, + "source": [ + "Python's variables are just names (i.e., labels) that we can **bind** to objects. 
Each variable simply tells Python where to look in our computer's memory to retrieve some data. These bindings are **not global**: some of them exist only in specific parts of our code.\n", + "\n", + "> The portion of code where a name binding is defined is called the **lexical scope** (or just \"scope\"). The bindings are stored in a scope's **namespace**." + ] + }, + { + "cell_type": "markdown", + "id": "ee364526-00d2-40ec-bb33-3478ec3390a0", + "metadata": {}, + "source": [ + "We always have the following scopes:\n", + "\n", + "1. `built-in` scope\n", + "2. `global` (or module) scope\n", + "\n", + "We also have the `local` scope that's created when we are **calling** a function.\n", + "The local scope associated with any called function is **destroyed** after the function has done its job. Its namespace is gone as well." + ] + }, + { + "cell_type": "markdown", + "id": "f2654077-9a14-434a-acd6-81ca8b996228", + "metadata": {}, + "source": [ + "When Python needs to retrieve which object is referenced by a given name, it always starts from the current scope (the `local` one if we are inside a function's body). If a name binding is not found there, it searches the next scope up in the hierarchy.\n", + "\n", + "> **LEGB rule**: **L**ocal → **E**nclosing → **G**lobal → **B**uilt-in" + ] + }, + { + "cell_type": "markdown", + "id": "c3b21272-45b0-4e21-ae46-8d4deba09164", + "metadata": {}, + "source": [ + "When Python encounters a function **definition** (i.e., at compile-time), it does two things:\n", + "\n", + "1. It scans for any variables that have values **assigned** anywhere in the function. By default, names that are assigned are **local**, unless we explicitly declare otherwise with the `global` or `nonlocal` keyword.\n", + "2. Variables that are **referenced** but **not assigned** a value anywhere in the function will **not be local**. When we call the function (i.e., at run-time), Python will look for them in the **enclosing scopes**." 
+ ] + }, + { + "cell_type": "markdown", + "id": "a43ecf5e-d689-4bd7-8eee-2f1805e191d1", + "metadata": {}, + "source": [ + "Examples:\n", + "\n", + "```python\n", + "var = 10  # global (or module) scope\n", + "\n", + "def func_1():\n", + "    print(var)  # var is referenced but not assigned: at compile-time it is marked as \"non-local\"\n", + "\n", + "def func_2():\n", + "    var = 100  # var is assigned: at compile-time it is placed in the \"local\" scope\n", + "\n", + "def func_3():\n", + "    global var\n", + "    var = 1000  # var is assigned, so it would be local, but the `global` declaration above overrides that\n", + "\n", + "def func_4():\n", + "    print(var)\n", + "    var = 100  # what happens here?\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "a3de049f-475f-4984-932e-a938dff21106", + "metadata": {}, + "source": [ + "A function gets its local scope when it is called. Since we can have function definitions inside of other functions, there can be **nested scopes**. This is where the `nonlocal` keyword becomes useful or even needed.\n", + "\n", + "> The `nonlocal` keyword is used to declare that a variable is not local to the current function but is defined in the **nearest enclosing scope** that is **not global**. 
It allows you to access and modify variables in the outer (non-global) scope from within an inner function.\n", + "\n", + "An example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "394f52d5-3141-4273-9852-dc099d0adcdb", + "metadata": {}, + "outputs": [], + "source": [ + "def outer_function():\n", + "    x = \"outer\"\n", + "\n", + "    # prints \"outer\"\n", + "    print(\"Inside outer_function:\", x)\n", + "\n", + "    def inner_function():\n", + "        nonlocal x\n", + "        x = \"inner\"\n", + "        # prints \"inner\"\n", + "        print(\"Inside inner_function:\", x)\n", + "\n", + "    inner_function()\n", + "    # prints \"inner\" again, because we modified `x` from a nested scope\n", + "    print(\"Inside outer_function:\", x)\n", + "\n", + "outer_function()" + ] + }, + { + "cell_type": "markdown", + "id": "50184808-ee68-4a1e-84a0-ff421074aabe", + "metadata": {}, + "source": [ + "Two important notes about the `nonlocal` keyword:\n", + "\n", + "1. Python will search for a `nonlocal` name in the **enclosing local scopes** until it first encounters the specified variable.\n", + "2. **Only** local scopes are searched, never the global one." + ] + }, + { + "cell_type": "markdown", + "id": "49f00325-11df-402c-a0c9-e2cb8dd091a2", + "metadata": {}, + "source": [ + "
\n", + "

Hint

Experiment a bit with the code above. If you didn't fully understand what global and nonlocal do, try changing the scope of x with the different keywords and see whether you obtain what you expect.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "4543c0c9-d6b7-4c48-803d-1d629845fe0b", + "metadata": {}, + "source": [ + "## Mutable objects as default values of function's parameters" + ] + }, + { + "cell_type": "markdown", + "id": "ca2ca3de-4cb0-41fc-93ad-d5c38b942a99", + "metadata": {}, + "source": [ + "Using mutable objects like lists or dictionaries as default values for a function's parameters requires extra care, as it can lead to unexpected or unwanted behavior.\n", + "It can even produce mistakes that are difficult to debug." + ] + }, + { + "cell_type": "markdown", + "id": "992d588c-88c9-4c68-a078-924335fe2e62", + "metadata": {}, + "source": [ + "Consider what happens when Python evaluates the following code:\n", + "\n", + "```python\n", + "def func(a=10):\n", + "    return a ** 2\n", + "```\n", + "\n", + "A new function is **created**, but its body runs only when the function is **called**.\n", + "Also, a local scope for this function is only created when it is called.\n", + "\n", + "What does happen when the `def` statement runs (what we loosely call \"compile-time\") is the evaluation of the default value of the parameter `a`.\n", + "Where's the problem, you might say? 
Look at this example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f610ff20-e5a1-4b15-a3ff-8977ad13dbe7", + "metadata": {}, + "outputs": [], + "source": [ + "from datetime import datetime\n", + "\n", + "def log(msg, *, dt=datetime.utcnow()):\n", + "    print(f'{dt}: {msg}')" + ] + }, + { + "cell_type": "markdown", + "id": "70572399-3c56-4c14-ac92-519e5bfbc70f", + "metadata": {}, + "source": [ + "A simple logging function.\n", + "Let's run it a couple of times, waiting a few seconds in between:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "28d78cfc-68b3-4bb4-ae9d-d2baec4f3ed1", + "metadata": {}, + "outputs": [], + "source": [ + "from time import sleep\n", + "\n", + "log(\"my first message\")\n", + "sleep(5)\n", + "log(\"my second message\")" + ] + }, + { + "cell_type": "markdown", + "id": "e788532b-a0bc-4b58-b6ae-f6c6a4d0a13a", + "metadata": {}, + "source": [ + "Something is wrong here: we waited 5 seconds, but the timestamp of our log message **did not change**.\n", + "Why? Because the default value of `dt` is evaluated only once, **when the function is defined**, and never again afterwards, unless we pass `dt` ourselves." 
+ ] + }, + { + "cell_type": "markdown", + "id": "6b2b6299-4a8b-4f2a-8bf1-b1e1024b6f83", + "metadata": {}, + "source": [ + "The correct pattern to avoid this unwanted behavior is setting a default value of `None`, so that the argument remains optional.\n", + "Then, inside the function's body, we can assign the correct or updated value:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ee7af02-a935-425a-b57a-c6a747758d30", + "metadata": {}, + "outputs": [], + "source": [ + "def log(msg, *, dt=None):\n", + "    dt = dt or datetime.utcnow()\n", + "    print(f'{dt}: {msg}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80584b4f-978c-4153-9ff9-10183225139a", + "metadata": {}, + "outputs": [], + "source": [ + "log(\"my first message\")\n", + "sleep(5)\n", + "log(\"my second message\")" + ] + }, + { + "cell_type": "markdown", + "id": "3174d7ae-6353-4561-b691-83b7fc729cae", + "metadata": {}, + "source": [ + "Now the output is what we expected." + ] + }, + { + "cell_type": "markdown", + "id": "fbcf8f91-86d5-404f-ab46-8a159a9ec18f", + "metadata": {}, + "source": [ + "Another problematic context is when we're dealing with **mutable objects** (e.g., lists, sets, dictionaries).\n", + "Instances of custom classes are usually mutable too, so the same care applies to them.\n", + "\n", + "Let's consider this example: we want to keep track of our groceries in different stores.\n", + "We might create a function that adds an item to a grocery list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "81c0cdff-32fc-4c6f-9072-f7a236c957ec", + "metadata": {}, + "outputs": [], + "source": [ + "def add_item(name, quantity, unit, grocery_list):\n", + "    grocery_list.append(f\"{name} ({quantity} {unit})\")\n", + "    return grocery_list" + ] + }, + { + "cell_type": "markdown", + "id": "195ce01b-363f-45de-8af5-02ec542b88c7", + "metadata": {}, + "source": [ + "We now have two stores and want to add some items to them:" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "id": "9a88b5de-c0fe-42e8-a103-4ba5b24943b5", + "metadata": {}, + "outputs": [], + "source": [ + "store_1 = []\n", + "store_2 = []\n", + "\n", + "add_item('bananas', 2, 'units', store_1)\n", + "add_item('grapes', 1, 'bunch', store_1)\n", + "add_item('python', 1, 'medium-rare', store_2)\n", + "\n", + "print(store_1, store_2)" + ] + }, + { + "cell_type": "markdown", + "id": "428ba9cb-bc96-4b6c-b5a3-c9b3ba483a69", + "metadata": {}, + "source": [ + "All good.\n", + "What if we don't supply the store list where we want to add the new item?\n", + "We could have our function create a new, empty store list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "47bb8773-dd5a-4836-8c5b-8e66fcae92e3", + "metadata": {}, + "outputs": [], + "source": [ + "def add_item(name, quantity, unit, grocery_list=[]):\n", + "    grocery_list.append(f\"{name} ({quantity} {unit})\")\n", + "    return grocery_list" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ba49b5b4-7127-40f7-b19c-53cd197cc693", + "metadata": {}, + "outputs": [], + "source": [ + "store_1 = add_item('bananas', 2, 'units')\n", + "add_item('grapes', 1, 'bunch', store_1)\n", + "\n", + "print(store_1)" + ] + }, + { + "cell_type": "markdown", + "id": "20f49218-9fa5-48b7-b21a-535cfbeb2810", + "metadata": {}, + "source": [ + "Okay, all good.\n", + "Let's create our second list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b5044e41-7c41-469b-974d-6b5f7f052b61", + "metadata": {}, + "outputs": [], + "source": [ + "store_2 = add_item('milk', 1, 'gallon')\n", + "\n", + "print(store_2)" + ] + }, + { + "cell_type": "markdown", + "id": "a132eb6b-7b4d-436c-925a-7e504ab6d4f0", + "metadata": {}, + "source": [ + "Again, not what we expected, right?\n", + "`store_2` should be a completely new list, but Python keeps appending to the same list object, the one that was created once, when the function was defined, as the default value of the `grocery_list` parameter.\n", + "\n", + "The solution pattern 
is similar to the one we used for the logging function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bf6937c7-633c-45e7-9220-ecf3b8bade12", + "metadata": {}, + "outputs": [], + "source": [ + "def add_item(name, quantity, unit, grocery_list=None):\n", + "    if grocery_list is None:  # an empty list passed by the caller must be kept, not replaced\n", + "        grocery_list = []\n", + "    grocery_list.append(f\"{name} ({quantity} {unit})\")\n", + "    return grocery_list" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ffc972b4-2c29-4362-8564-0f260f33b242", + "metadata": {}, + "outputs": [], + "source": [ + "store_1 = add_item('bananas', 2, 'units')\n", + "add_item('grapes', 1, 'bunch', store_1)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e6dbd7f2-1e44-489c-bbcc-0e6704c27fb4", + "metadata": {}, + "outputs": [], + "source": [ + "store_1" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b6d27cda-6795-4421-b917-cfae946b5713", + "metadata": {}, + "outputs": [], + "source": [ + "store_2 = add_item('milk', 1, 'gallon')\n", + "store_2" + ] + }, + { + "cell_type": "markdown", + "id": "3e5f9f9f-c57b-4b85-bb29-ff56d739e06e", + "metadata": {}, + "source": [ + "And now everything works as we expected." 
+ ] + }, + { + "cell_type": "markdown", + "id": "9c458327-b5bd-4df6-8025-d54ec60edeaa", + "metadata": {}, + "source": [ + "Using mutable objects as default values usually leads to unwanted results.\n", + "But there are some cases when this is precisely what we want.\n", + "\n", + "As an example where this might be useful, consider a function to calculate the factorial:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5d88ed0c-5e32-4db1-aebc-e066891a8f47", + "metadata": {}, + "outputs": [], + "source": [ + "def factorial(n):\n", + "    if n < 1:\n", + "        return 1\n", + "    else:\n", + "        print(f'Calculating {n}!')\n", + "        return n * factorial(n-1)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3c9ba891-accc-446c-bf67-b06146411470", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "93ba1699-67e1-4c08-a321-077d2354e0d4", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(6)" + ] + }, + { + "cell_type": "markdown", + "id": "69cdf435-23fb-419e-9746-dd5a29bc40e4", + "metadata": {}, + "source": [ + "We had to recalculate some values the second time, values that we could have saved for any subsequent call.\n", + "We will see a much better approach later on, but for now consider the following `factorial` function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9955eba2-62dc-445b-bf16-25f81fd02581", + "metadata": {}, + "outputs": [], + "source": [ + "def factorial(n, cache={}):\n", + "    if n < 1:\n", + "        return 1\n", + "    elif n in cache:\n", + "        return cache[n]\n", + "    else:\n", + "        print(f'Calculating {n}!')\n", + "        result = n * factorial(n-1)\n", + "        cache[n] = result\n", + "        return result" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a77781a1-5bd7-4e2e-8863-a5d1e92ba72a", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(3)" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "id": "dd3120ab-4ad3-46c3-9302-b4a9e20fc137", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b919b438-63e6-4869-8bc0-86eee0ea7b54", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(6)" + ] + }, + { + "cell_type": "markdown", + "id": "e5e6276e-1674-4b9a-8da2-5dffbaf1e09a", + "metadata": {}, + "source": [ + "Since the dictionary `cache` is created only once, as an empty `dict()`, **when the function is defined**, every subsequent call to `factorial` shares it and can update its content.\n", + "This is an efficient way of reducing the run-time of a computation when we know we can store previously computed results." + ] + }, + { + "cell_type": "markdown", + "id": "6cdafe9c-2448-4ba2-b351-98de62508d4f", + "metadata": {}, + "source": [ + "## Lambdas" + ] + }, + { + "cell_type": "markdown", + "id": "e1bf04d6-c294-4143-bc9a-c647c930a934", + "metadata": {}, + "source": [ + "We already know how to create a function: we use the `def` keyword:\n", + "\n", + "```python\n", + "def mult(a, b):\n", + "    return a * b\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "633b0bfa-1f64-4e3e-a100-82c148a13ff6", + "metadata": {}, + "source": [ + "A function can have **parameters** and a `return` statement.\n", + "If we don't write a `return` statement, Python adds one for us and returns the `None` object." 
+ ] + }, + { + "cell_type": "markdown", + "id": "c4aa553c-a309-4f5d-be37-0f4ab0d9176d", + "metadata": {}, + "source": [ + "There's another way to define a function object: with **lambda expressions** (or lambdas).\n", + "The syntax is similar, with a few differences:\n", + "\n", + "```python\n", + "lambda x: x ** 2\n", + "```\n", + "\n", + "- We don't have `def`\n", + "- The function has **no name**\n", + "- There is **no `return` statement**\n", + "\n", + "The \"body\" of a lambda expression follows the `:` mark and must be a single expression.\n", + "The lambda implicitly returns whatever that expression evaluates to." + ] + }, + { + "cell_type": "markdown", + "id": "c515caf1-0a60-4fcd-9dd7-3596ab834e21", + "metadata": {}, + "source": [ + "Lambdas are objects, just like functions are, so we can define a lambda and assign it to a name:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7266f0f2-cc8f-4326-b436-cce3512f992b", + "metadata": {}, + "outputs": [], + "source": [ + "f = lambda x: x ** 2" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "767cc544-5153-4472-973a-dc607f78558f", + "metadata": {}, + "outputs": [], + "source": [ + "f" + ] + }, + { + "cell_type": "markdown", + "id": "ec0dea2c-e714-44dc-9334-a2fe58debf0c", + "metadata": {}, + "source": [ + "We can also define lambdas with parameters **with a default value**:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e8681478-f223-4e7c-8d67-ad42da7aff20", + "metadata": {}, + "outputs": [], + "source": [ + "g = lambda x, y=10: (x, y)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "efb414e2-f002-4b46-8ee0-bc8d980235ca", + "metadata": {}, + "outputs": [], + "source": [ + "g(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "41acbdfa-a3a9-47e8-9343-69e60365a085", + "metadata": {}, + "outputs": [], + "source": [ + "g(10, -10)" + ] + }, + { + "cell_type": "markdown", + "id": 
"6742a7f5-c214-4ccb-9a4c-2e5d882f7de6", + "metadata": {}, + "source": [ + "Lambdas are very handy when we need something that behaves like a function, but we don't plan to use it many times. Examples:\n", + "\n", + "```python\n", + "\n", + "lambda x: x ** 2\n", + "lambda x, y: x + y\n", + "lambda: 'hello!'  # no params\n", + "lambda s: s[::-1].upper()  # what does it do?\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "96f87d21-5e7f-408f-aa6c-c15929cca11e", + "metadata": {}, + "source": [ + "Lambdas are **anonymous function objects**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "961f2782-38ab-4a7d-9a4d-6e9d43801827", + "metadata": {}, + "outputs": [], + "source": [ + "type(g)" + ] + }, + { + "cell_type": "markdown", + "id": "f62922e3-9809-4205-b875-06c793003e89", + "metadata": {}, + "source": [ + "Since they are objects, they can be passed to (or returned from) other functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fa3eb3c3-9933-453a-8e59-27dc79ccc34f", + "metadata": {}, + "outputs": [], + "source": [ + "def apply_func(fn, *args, **kwargs):\n", + "    return fn(*args, **kwargs)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4eeb04fe-d544-48ef-a77b-2f145e720236", + "metadata": {}, + "outputs": [], + "source": [ + "apply_func(lambda x, y: x+y, 1, 2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "505718d0-7079-4869-b855-811add5d4e6e", + "metadata": {}, + "outputs": [], + "source": [ + "apply_func(lambda *args: sum(args), 1, 2, 3, 4, 5)" + ] + }, + { + "cell_type": "markdown", + "id": "39da23ba-fead-49de-bd27-43ac6cfd019f", + "metadata": {}, + "source": [ + "The previous example is **not** the suggested way to sum the values of an iterable: we should use the built-in `sum()` directly when appropriate:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9636ece4-fe53-47eb-a4a3-fd9ccee6cc76", + "metadata": {}, + "outputs": [], + "source": [ + 
"apply_func(sum, (1, 2, 3, 4, 5))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7de26d22-e493-4c87-9d1d-d5866f243149", + "metadata": {}, + "outputs": [], + "source": [ + "sum((1, 2, 3, 4, 5))" + ] + }, + { + "cell_type": "markdown", + "id": "0a22c386-b830-4038-8930-ccc172137e6f", + "metadata": {}, + "source": [ + "### Lambdas and sorting" + ] + }, + { + "cell_type": "markdown", + "id": "d646f3b7-9d6c-48bd-993a-df4b58703d78", + "metadata": {}, + "source": [ + "Python has a built-in `sorted` function that returns a new list with the items of any iterable, sorted according to a default ordering.\n", + "Sometimes you may want to (or need to) specify a different criterion for sorting." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3c4b63b7-5e14-4dca-a12d-3cddf9ae3d5d", + "metadata": {}, + "outputs": [], + "source": [ + "letters = list(\"ABCDzywab\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fa77063b-852f-4bd3-a3c3-d06e0acf41fc", + "metadata": {}, + "outputs": [], + "source": [ + "sorted(letters)" + ] + }, + { + "cell_type": "markdown", + "id": "e3230390-88d4-45d1-ba88-fda0cd284d2a", + "metadata": {}, + "source": [ + "Python's `sorted` has a keyword-only argument named `key` that takes a function. That function is called on each item and returns the key, that is, the value by which we want the items to be compared when sorting the iterable." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dcded938-a811-4c9b-9754-f0ecbbb54c19", + "metadata": {}, + "outputs": [], + "source": [ + "sorted(letters, key=str.upper)  # sort as if all letters are CAPITALIZED" + ] + }, + { + "cell_type": "markdown", + "id": "dded89fa-3d95-4e66-8320-0f4b1137dff7", + "metadata": {}, + "source": [ + "Let's see how we can create a \"sorted dictionary\".\n", + "Recall that a dictionary is an **unordered collection**, so it doesn't have a built-in order.\n", + "\n", + "_(Well, that's not completely true. 
Since Python 3.7, dictionaries keep their key-value pairs in the order they are entered or supplied.\n", + "The thing is: accessing a dictionary **by index** doesn't make sense.)_" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f9c0fd39-09be-4326-9bdd-57dfeaa8ab97", + "metadata": {}, + "outputs": [], + "source": [ + "d = {'def': 300, 'abc': 200, 'ghi': 100}\n", + "sorted(d)" + ] + }, + { + "cell_type": "markdown", + "id": "fa44023f-406e-4bfe-8f84-8731ea68edd7", + "metadata": {}, + "source": [ + "Iterating over a dictionary is equivalent to iterating **over its keys**.\n", + "If we want to sort our dictionary by its values, we can use a lambda:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1cefc1c7-190d-4a32-9ecb-851315b6f6d5", + "metadata": {}, + "outputs": [], + "source": [ + "sorted(d, key=lambda k: d[k])" + ] + }, + { + "cell_type": "markdown", + "id": "7d0882a6-a8a7-4687-9bfd-0d96476d348a", + "metadata": {}, + "source": [ + "But wait: now we lost our values!\n", + "We need to do something more elaborate if we want to have a dictionary back:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e0b14481-087b-4431-a146-7511e3a8bc7c", + "metadata": {}, + "outputs": [], + "source": [ + "dict(sorted(d.items(), key=lambda x: x[1]))" + ] + }, + { + "cell_type": "markdown", + "id": "421950b6-55da-42f6-91b4-5395a681fb6b", + "metadata": {}, + "source": [ + "Another useful application of lambdas is when Python doesn't know how to apply an ordering to some kind of data.\n", + "For example, with complex numbers:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e92b884c-e9f5-42ff-819d-bad7694b2999", + "metadata": {}, + "outputs": [], + "source": [ + "complex_numbers = [3+3j, 1+1j, 0, 4-2j]  # avoid the name `complex`: it would shadow the built-in type" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "850758c1-e616-4eaf-b99d-aea169da0359", + "metadata": {}, + "outputs": [], + "source": [ + "sorted(complex_numbers)" + ] + }, + { + 
"cell_type": "markdown", + "id": "703ea9e5-72f3-4eaf-95cd-a579e186bd01", + "metadata": {}, + "source": [ + "We can sort complex numbers based on their modulus:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "68923fed-b32d-4cd0-a58b-424a9a8f5d9f", + "metadata": {}, + "outputs": [], + "source": [ + "sorted(complex_numbers, key=lambda z: (z.real)**2 + (z.imag)**2)" + ] + }, + { + "cell_type": "markdown", + "id": "e8475bd0-9d51-40b1-99d2-0a61038124f6", + "metadata": {}, + "source": [ + "
\n", + "

Question

Can you find a way to randomize a list using sorted and lambdas?\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "1e906839-6146-4943-a61d-258de87ae279", + "metadata": {}, + "source": [ + "
\n", + "

Hint

Have a look at the random module of Python's standard library.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "f3613b0b-eb09-4152-954f-bddd92fdb7f4", + "metadata": {}, + "source": [ + "## Closures" + ] + }, + { + "cell_type": "markdown", + "id": "1e72d59a-2a74-4502-acbe-6fc9ac3ff3aa", + "metadata": {}, + "source": [ + "Let's consider the following code:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2aefd3ca-cb6d-463a-88be-05c89c7dc112", + "metadata": {}, + "outputs": [], + "source": [ + "def outer():\n", + "    lang = \"Python\"\n", + "\n", + "    def inner():\n", + "        print(f\"{lang} rocks!\")\n", + "\n", + "    inner()\n", + "\n", + "outer()" + ] + }, + { + "cell_type": "markdown", + "id": "9afd7b80-4b96-4879-ba7b-7bc36f727a0d", + "metadata": {}, + "source": [ + "Here the `lang` variable is **non-local** to `inner()` because it's only referenced there. `lang` is also called a **free variable**.\n", + "\n", + "> A **free variable** is a variable referenced locally but defined in the enclosing scope.\n", + "\n", + "Also, `lang` and `inner()` both belong to the local scope of `outer()`. Since this bond between a function and its free variable is special, it has its own name: it's called a **closure**.\n", + "The name \"closure\" comes from the function `inner()` _enclosing_ its free variable `lang`." 
+ ] + }, + { + "cell_type": "markdown", + "id": "a4b00d4c-dc4d-4550-bbcd-07a15f9aa7d2", + "metadata": {}, + "source": [ + "Let's make a small adjustment that will change a lot of things:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a9efa784-f542-41aa-9879-18452b77a1e2", + "metadata": {}, + "outputs": [], + "source": [ + "def outer():\n", + "    lang = \"Python\"\n", + "\n", + "    def inner():\n", + "        print(f\"{lang} rocks!\")\n", + "\n", + "    return inner\n", + "\n", + "outer()" + ] + }, + { + "cell_type": "markdown", + "id": "717f601a-eefc-4303-be9b-2b66613eed39", + "metadata": {}, + "source": [ + "We turned `outer()` into a higher-order function that does not return a simple function, but a closure (`inner()` + the free variable).\n", + "\n", + "Since functions are objects, we can assign the returned closure to a name:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "550b55d7-8055-446b-ab9c-1245818c3aa7", + "metadata": {}, + "outputs": [], + "source": [ + "fn = outer()" + ] + }, + { + "cell_type": "markdown", + "id": "e19c9332-80a4-4771-ae05-4b744d718543", + "metadata": {}, + "source": [ + "And then call it like any other function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7a58ef5c-3dc9-4a28-b01e-0128ef4232a1", + "metadata": {}, + "outputs": [], + "source": [ + "fn()" + ] + }, + { + "cell_type": "markdown", + "id": "7d753841-49d6-4656-b4b9-d16e2a3cb107", + "metadata": {}, + "source": [ + "But wait a second! How's that possible? 🤔\n", + "\n", + "`fn()` is called **after** `outer()` has finished running: `outer()` ran when we assigned the result of calling it to the name `fn`.\n", + "If the local scope of a function is destroyed after the function has run, how can `fn` know that `lang = \"Python\"`?\n", + "\n", + "That's because Python noticed that we created a closure, and it did something unusual." 
+ ] + }, + { + "cell_type": "markdown", + "id": "402bd5c1-af47-4716-baa0-ac60da76efce", + "metadata": {}, + "source": [ + "If we look once again at the example above, we see that the name `lang` is **shared** between the local scope of `outer()` and the `print` call inside `inner()`.\n", + "When Python sees this, it does something different: it creates an **intermediary** object, called a _cell object_, that only contains a memory address.\n", + "A memory address of what? Of whatever object (i.e. data) is assigned to `lang`, the free variable." + ] + }, + { + "cell_type": "markdown", + "id": "0b1977a3-2854-4015-a5d9-6fd066b89e85", + "metadata": {}, + "source": [ + "![](./images/cell_object.png)" + ] + }, + { + "cell_type": "markdown", + "id": "8f1015aa-ad47-4be6-8612-f9de5ee22e6b", + "metadata": {}, + "source": [ + "We can see all that by inspecting some _hidden_ attributes of `fn`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "22189979-698f-4813-8bff-c70fd83d5a8d", + "metadata": {}, + "outputs": [], + "source": [ + "fn.__code__.co_freevars" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2a304bcc-100a-464e-81e1-74355d65d8bf", + "metadata": {}, + "outputs": [], + "source": [ + "fn.__closure__" + ] + }, + { + "cell_type": "markdown", + "id": "0d18060e-78d7-42e3-8e0c-984e3e5f6dcc", + "metadata": {}, + "source": [ + "Now we can see why calling `fn()` still prints \"Python rocks!\" even though the local scope of `outer()`, and with it the local name `lang`, is gone.\n", + "The cell object is still referenced by the closure (`inner()` plus its free variable), so it is kept alive.\n", + "When `outer()` runs, `inner()` is **not called**; yet when we call it later, Python still knows how to retrieve the string object."
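As a small sketch of the inspection above: each entry of `__closure__` is a cell object, and its `cell_contents` attribute gives back the object the free variable points to. The block re-defines `outer()` so that it runs on its own, and `inner()` returns the string instead of printing it (an adaptation made here only to keep the example easy to check).

```python
def outer():
    lang = "Python"

    def inner():
        # `lang` is a free variable of inner()
        return f"{lang} rocks!"

    return inner


fn = outer()

# One cell per free variable, in the same order as co_freevars
free_vars = fn.__code__.co_freevars
cell = fn.__closure__[0]

print(free_vars)           # ('lang',)
print(cell.cell_contents)  # Python
print(fn())                # Python rocks!
```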
+ ] + }, + { + "cell_type": "markdown", + "id": "6d84832a-a780-4439-bacf-4c45615563f5", + "metadata": {}, + "source": [ + "### Modifying the free variable" + ] + }, + { + "cell_type": "markdown", + "id": "06ee4be4-170b-490b-b17a-0e60b0d9d252", + "metadata": {}, + "source": [ + "We know that the `nonlocal` keyword allows us to modify variables from the **enclosing scope**. Therefore, the following closure will work as expected:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "12eb4f64-93bd-4a99-804c-386f8e10b048", + "metadata": {}, + "outputs": [], + "source": [ + "def counter():\n", + "    count = 0 # local variable\n", + "\n", + "    def inc():\n", + "        nonlocal count\n", + "        count += 1\n", + "        return count\n", + "    return inc\n", + "\n", + "c = counter()\n", + "\n", + "c()" + ] + }, + { + "cell_type": "markdown", + "id": "7cdf2e9e-1551-4ebe-88c9-796dfd8fc937", + "metadata": {}, + "source": [ + "The `inc()` function and the `count` variable form the closure; this time, `count` is not only accessed but also modified."
+ ] + }, + { + "cell_type": "markdown", + "id": "d209f602-2ff7-497c-9c4f-a3c419f38da7", + "metadata": {}, + "source": [ + "We can also have multiple closures that reference (and modify) the same free variable:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "377e42f8-c05e-4982-aa9c-fd6bef43e91d", + "metadata": {}, + "outputs": [], + "source": [ + "def adders():\n", + "    count = 0\n", + "\n", + "    def add_1():\n", + "        nonlocal count\n", + "        count += 1\n", + "        return count\n", + "\n", + "    def add_2():\n", + "        nonlocal count\n", + "        count += 2\n", + "        return count\n", + "\n", + "    return add_1, add_2\n", + "\n", + "fn1, fn2 = adders()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "94937713-746a-4d50-8e06-1772c94a89ff", + "metadata": {}, + "outputs": [], + "source": [ + "fn1()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "565ea9d9-6e9d-40b4-9450-c765bc747985", + "metadata": {}, + "outputs": [], + "source": [ + "fn2()" + ] + }, + { + "cell_type": "markdown", + "id": "8ed30e46-85b7-4b80-bf20-a2339d5d9509", + "metadata": {}, + "source": [ + "### Multiple instances of closures" + ] + }, + { + "cell_type": "markdown", + "id": "b9055daa-be56-41b0-97c4-44d33deda357", + "metadata": {}, + "source": [ + "As we saw before when talking about scopes, every time a function is **called** a new **local scope** is created.\n", + "A closure can therefore be created multiple times, and each call to the outer function creates a new **extended scope**.\n", + "\n", + "Consider the following example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c55535ac-d32a-491f-9509-423a5b4557b2", + "metadata": {}, + "outputs": [], + "source": [ + "def power(n):\n", + "    # `n` is a local variable\n", + "    def op(x):\n", + "        return x ** n\n", + "    return op" + ] + }, + { + "cell_type": "markdown", + "id": "6bdc2d9a-0cc5-4dba-88b4-a942408ffa6c", + "metadata": {}, + "source": [ + "`n` is our free variable, and we have
a closure that contains the function `op()` and the free variable." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e4bfdffb-b059-4d50-ac96-53638b238314", + "metadata": {}, + "outputs": [], + "source": [ + "square = power(2)\n", + "cube = power(3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c573a1e0-ab96-4292-a02d-6cdfbeb886db", + "metadata": {}, + "outputs": [], + "source": [ + "print(square(10))\n", + "print(cube(10))" + ] + }, + { + "cell_type": "markdown", + "id": "806c43d9-c78c-4fd0-9f9a-ec6cbbfa34ba", + "metadata": {}, + "source": [ + "We can verify that the two closures are completely different even though they were created from the same `power()`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6667349a-6efc-4d22-8709-865c0caf2320", + "metadata": {}, + "outputs": [], + "source": [ + "square.__closure__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c708373c-9439-4a7b-8953-7f87e6309271", + "metadata": {}, + "outputs": [], + "source": [ + "cube.__closure__" + ] + }, + { + "cell_type": "markdown", + "id": "7490f47f-e0a6-42c3-a56f-59503703e200", + "metadata": {}, + "source": [ + "As we expected, the free variable `n` (of type `int`) is referenced by the cell objects of both closures.\n", + "And since a new value for `n` is created in the local namespace of `power()` every time it's called, we obtained two **different** `int` objects." + ] + }, + { + "cell_type": "markdown", + "id": "5fe10082-37a7-4d69-a5d1-839dd63050f5", + "metadata": {}, + "source": [ + "### Closures can be tricky" + ] + }, + { + "cell_type": "markdown", + "id": "2e7d1a9a-04f8-4e4e-a667-391b4e722708", + "metadata": {}, + "source": [ + "One important aspect of closures can be the source of nasty bugs if we don't understand it well.\n", + "\n", + "> A free variable is **referenced** when the closure is created, but its value is **looked up** upon calling." 
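To make the statement above concrete, here is a minimal sketch. The names `make_reporter` and `report` are hypothetical and used only for illustration. The free variable is rebound after the inner function has been defined, and the closure sees the latest binding when it is finally called.

```python
def make_reporter():
    value = "first"

    def report():
        # `value` is looked up when report() is called, not when it is defined
        return value

    value = "second"  # rebinding happens after report() was created
    return report


report = make_reporter()
print(report())  # second
```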
+ ] + }, + { + "cell_type": "markdown", + "id": "9186f397-d07c-4a2c-9de4-9a6df15fb0e9", + "metadata": {}, + "source": [ + "Let's say we want to create functions that add different fixed amounts to an arbitrary number.\n", + "We could do:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c0988d39-7321-49d5-a619-0f960c5e352b", + "metadata": {}, + "outputs": [], + "source": [ + "def adder(n):\n", + "    def op(x):\n", + "        return n + x\n", + "    return op" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "466b3e15-6fd4-4643-bf1b-b161bba237b2", + "metadata": {}, + "outputs": [], + "source": [ + "add_1 = adder(1)\n", + "add_2 = adder(2)\n", + "add_3 = adder(3)" + ] + }, + { + "cell_type": "markdown", + "id": "e446d1fc-a129-4fd9-81b1-1142fa99d865", + "metadata": {}, + "source": [ + "This works as expected:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb6d0536-14f4-4399-91ac-d31e23412ad2", + "metadata": {}, + "outputs": [], + "source": [ + "add_1(10), add_2(10), add_3(10)" + ] + }, + { + "cell_type": "markdown", + "id": "73753cb3-edb3-4dac-be9c-9cb850b9f945", + "metadata": {}, + "source": [ + "Now we have the following idea to improve our code:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1af7eb21-c379-4a86-be7b-4278ca5a4b81", + "metadata": {}, + "outputs": [], + "source": [ + "def fancy_adders():\n", + "    adders = []\n", + "    for n in range(1, 5):\n", + "        def op(x):\n", + "            return x + n\n", + "        adders.append(op)\n", + "    return adders" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "984491cc-a6f6-4bef-a8cf-83cc151fc2bf", + "metadata": {}, + "outputs": [], + "source": [ + "adders = fancy_adders()" + ] + }, + { + "cell_type": "markdown", + "id": "6fecef03-780c-49a0-85be-f44378f3936c", + "metadata": {}, + "source": [ + "We now have 4 functions in the `adders` list:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id":
"c27ceb4c-53dd-4c0e-8384-a879d28ed0b6", + "metadata": {}, + "outputs": [], + "source": [ + "adders" + ] + }, + { + "cell_type": "markdown", + "id": "980694ef-8baf-4d6a-8908-4bb839717706", + "metadata": {}, + "source": [ + "Let's see what happens when we call them:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a1277798-7a02-4c67-8f77-ee86eb460134", + "metadata": {}, + "outputs": [], + "source": [ + "adders[0](10), adders[1](10), adders[2](10), adders[3](10)" + ] + }, + { + "cell_type": "markdown", + "id": "4dedb6df-c484-4a98-afb8-f95b5ff2d980", + "metadata": {}, + "source": [ + "Wait, why?! It seems that we picked up the **same value** of the free variable.\n", + "The free variable is always `n`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b05043cf-3a6e-4243-a0a5-ea98d4fa0358", + "metadata": {}, + "outputs": [], + "source": [ + "adders[0].__code__.co_freevars" + ] + }, + { + "cell_type": "markdown", + "id": "2e6cd006-ebd2-4191-902d-f88cbd0c51fb", + "metadata": {}, + "source": [ + "And we can indeed verify that its value referenced by each closure is exactly the same:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f056d992-d712-46d8-933d-c6cb8fc276da", + "metadata": {}, + "outputs": [], + "source": [ + "[x.__closure__ for x in adders]" + ] + }, + { + "cell_type": "markdown", + "id": "5f7124e3-856e-4e7f-a71d-e7e3a4198804", + "metadata": {}, + "source": [ + "Which value? 
The one from the last iteration of our loop, `n=4`.\n", + "In fact:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0f90363f-b29a-4829-801d-9eaf681801e2", + "metadata": {}, + "outputs": [], + "source": [ + "hex(id(4))" + ] + }, + { + "cell_type": "markdown", + "id": "18300b91-bcfc-43db-a2a0-61ffe9c61816", + "metadata": {}, + "source": [ + "The key to understanding this behavior is remembering that **closures capture _variables_ and not _values_**.\n", + "This means that every `op()` function created in the loop is closing over the same variable `n`.\n", + "\n", + "By the time the call to `fancy_adders()` is over, `n` holds its final value, that is, 4.\n", + "This is the value that will be looked up when calling our closures! And this is why we can see that `hex(id(4))` (the memory address of the integer `4`) is indeed the same for all closures." + ] + }, + { + "cell_type": "markdown", + "id": "c3bc9c11-2691-4de9-84ed-49bdc9bfff17", + "metadata": {}, + "source": [ + "If we want to fix the example so that it behaves as intended, we need to capture the **current** value of `n` when defining the closure:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "635e47e3-2043-44b2-b452-295a933b43dc", + "metadata": {}, + "outputs": [], + "source": [ + "def truly_fancy_adders():\n", + "    adders = []\n", + "    for n in range(1, 5):\n", + "        def op(x, n=n): # Capture the current value of n as a default argument\n", + "            return x + n\n", + "        adders.append(op)\n", + "    return adders" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bfc73083-d4d7-4a41-94d3-b1a46f37acec", + "metadata": {}, + "outputs": [], + "source": [ + "correct_adders = truly_fancy_adders()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "98f957a1-b04a-4529-8876-d5fa60917186", + "metadata": {}, + "outputs": [], + "source": [ + "correct_adders[0](10), correct_adders[1](10), correct_adders[2](10)" + ] + }, + { + "cell_type":
"markdown", + "id": "ed3b46d3-aade-4b31-bfdd-b03b9916ef23", + "metadata": {}, + "source": [ + "Now, let's inspect our correct closures:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8fc351e4-847d-470f-9a5d-0c74f6e5eeb2", + "metadata": {}, + "outputs": [], + "source": [ + "[x.__closure__ for x in correct_adders]" + ] + }, + { + "cell_type": "markdown", + "id": "ac143f81-4aae-49ad-950d-9ca092361bda", + "metadata": {}, + "source": [ + "Hey, why `None`? Let's check our free variable:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9f0719f3-bc41-4d42-b633-fa49bf501cc5", + "metadata": {}, + "outputs": [], + "source": [ + "correct_adders[0].__code__.co_freevars" + ] + }, + { + "cell_type": "markdown", + "id": "0cfd31d5-b83a-439b-9690-4fbd1fe20b3d", + "metadata": {}, + "source": [ + "Nothing yet?\n", + "Can you think why you don't get any output?" + ] + }, + { + "cell_type": "markdown", + "id": "b0563fc3-de7b-439a-9a23-07af7d1dff03", + "metadata": {}, + "source": [ + "
\n", + "

Hint

Think about your answer before evaluating the cell below.\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "20cadf44-2207-4c16-95c4-f3eed8ef19db", + "metadata": {}, + "outputs": [], + "source": [ + "from tutorial.functions_advanced import tricky_closures as answer\n", + "answer" + ] + }, + { + "cell_type": "markdown", + "id": "ac01ce82-7dbf-42fb-8d78-69c1a8f34740", + "metadata": {}, + "source": [ + "### Nested closures" + ] + }, + { + "cell_type": "markdown", + "id": "a7751b1d-0b4e-418c-abab-efac075436f3", + "metadata": {}, + "source": [ + "We can also nest closures, as much as we can nest functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "eebd35bd-8e14-41a1-9df0-a4c716bc31ff", + "metadata": {}, + "outputs": [], + "source": [ + "def incrementer(n):\n", + " def inner(start):\n", + " current = start\n", + " def inc():\n", + " a = 10 # local variable, NOT a free variable\n", + " nonlocal current\n", + " current += n\n", + " return current\n", + " return inc\n", + " return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9a08d1c9-d201-4dc0-ad8a-7e6cb1805f77", + "metadata": {}, + "outputs": [], + "source": [ + "f = incrementer(2)\n", + "f.__code__.co_freevars" + ] + }, + { + "cell_type": "markdown", + "id": "be996408-4438-493e-8549-13c212e442db", + "metadata": {}, + "source": [ + "We create an incrementer function with the default increment, `n=2`, starting from `100`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3754bf22-54f8-4960-b591-b7ed1a6df33d", + "metadata": {}, + "outputs": [], + "source": [ + "inc_2 = f(100)\n", + "inc_2()" + ] + }, + { + "cell_type": "markdown", + "id": "b00c8f22-a319-4e7f-b0e9-1c9a41fe2bbd", + "metadata": {}, + "source": [ + "We can also create another _custom_ incrementer with a different increment value:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9d4443b1-c814-4c69-912a-9dadba8955a9", + "metadata": {}, + "outputs": [], + "source": [ + "inc_10 = 
incrementer(10)(100)\n", + "inc_10()" + ] + }, + { + "cell_type": "markdown", + "id": "dd0f62cc-da51-413d-bc0b-4b456abdd824", + "metadata": {}, + "source": [ + "### Closures: examples" + ] + }, + { + "cell_type": "markdown", + "id": "2b5c88fa-9a60-4286-b526-38a7294d433d", + "metadata": {}, + "source": [ + "#### Example 1" + ] + }, + { + "cell_type": "markdown", + "id": "69343a4d-aa85-46db-8194-80e64b2a8f2e", + "metadata": {}, + "source": [ + "Let's see a practical example of using closures.\n", + "We're going to see how closures can replace classes and be more straightforward for simple tasks.\n", + "\n", + "Say that we want to calculate a running average of some numbers that we don't know in advance.\n", + "We could create a class as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f9202909-547f-40df-8cf7-c897045d0865", + "metadata": {}, + "outputs": [], + "source": [ + "class Averager:\n", + "    def __init__(self):\n", + "        self.numbers = []\n", + "\n", + "    def add(self, number):\n", + "        self.numbers.append(number)\n", + "        return sum(self.numbers) / len(self.numbers)\n", + "\n", + "a = Averager()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "610e83ce-e2f1-430f-9bac-86c1449d637a", + "metadata": {}, + "outputs": [], + "source": [ + "a.add(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ac4095a4-7dac-44c2-bd0d-4a40c55fc3eb", + "metadata": {}, + "outputs": [], + "source": [ + "a.add(20)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bad7b335-796d-443d-9bbd-13ce2bb5cd68", + "metadata": {}, + "outputs": [], + "source": [ + "a.add(30)" + ] + }, + { + "cell_type": "markdown", + "id": "d744a1c6-22c1-4616-b663-b19fb75ef0de", + "metadata": {}, + "source": [ + "How can we rewrite our class as a closure?\n", + "The free variable will be the list `numbers`."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ce547faa-4b3a-4c91-b005-a79d8f0f8191", + "metadata": {}, + "outputs": [], + "source": [ + "def averager():\n", + "    numbers = []\n", + "    def add(number):\n", + "        numbers.append(number)\n", + "        return sum(numbers) / len(numbers)\n", + "    return add\n", + "\n", + "a = averager()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "54a62857-e94d-4247-805c-f041143e7ef6", + "metadata": {}, + "outputs": [], + "source": [ + "a.__closure__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b91fa945-ca65-4266-ac7d-7e9da458a8dd", + "metadata": {}, + "outputs": [], + "source": [ + "a.__code__.co_freevars" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2f7eced0-7ee1-47d1-9c3b-32e9c19a25df", + "metadata": {}, + "outputs": [], + "source": [ + "a(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9ffd05c2-45b9-40a3-b38f-e781164c6a27", + "metadata": {}, + "outputs": [], + "source": [ + "a(20)" + ] + }, + { + "cell_type": "markdown", + "id": "29e37eeb-cc17-4d5d-9803-425c6d399d4b", + "metadata": {}, + "source": [ + "We can make it better: instead of accumulating all the numbers in a list, we only need to keep a running total and count."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "253d0e95-1b33-42d9-88c0-95c6a10cb7df", + "metadata": {}, + "outputs": [], + "source": [ + "def averager():\n", + "    total = 0\n", + "    count = 0\n", + "    def add(number):\n", + "        nonlocal total\n", + "        nonlocal count\n", + "\n", + "        total += number\n", + "        count += 1\n", + "        return total / count\n", + "    return add\n", + "\n", + "a = averager()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2cadaf0b-04fd-4210-b92b-f17708e1cd23", + "metadata": {}, + "outputs": [], + "source": [ + "a(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e5ad1f57-b8f5-462d-b5f2-0aa28d4274ad", + "metadata": {}, + "outputs": [], + "source": [ + "a(20)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "37e5be30-718e-485a-b60f-d8fa1ab8f47f", + "metadata": {}, + "outputs": [], + "source": [ + "a(30)" + ] + }, + { + "cell_type": "markdown", + "id": "5e63c686-923f-4b83-9081-d3cbbde42d68", + "metadata": {}, + "source": [ + "#### Example 2" + ] + }, + { + "cell_type": "markdown", + "id": "4b52912f-bcfd-4eab-98df-8a09af454415", + "metadata": {}, + "source": [ + "We want to create a counter function, that is, a function that increments a variable every time it's called:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3faacb42-febc-4175-a0d6-634c00af9def", + "metadata": {}, + "outputs": [], + "source": [ + "def counter(initial_value):\n", + "    def inc(increment=1):\n", + "        nonlocal initial_value\n", + "        initial_value += increment\n", + "        return initial_value\n", + "    return inc" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb73542b-1949-4290-9a50-13018f97a5d8", + "metadata": {}, + "outputs": [], + "source": [ + "c1 = counter(0)\n", + "c100 = counter(100)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f0044533-5a6b-43de-afc0-baa37dd2a0ab", + "metadata": {}, + "outputs": [], + "source":
[ + "c1()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ca113b53-d223-40a6-a291-c5e30c33c860", + "metadata": {}, + "outputs": [], + "source": [ + "c100()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4a14adc4-2d99-47a2-8c3a-19d6d4d3cec1", + "metadata": {}, + "outputs": [], + "source": [ + "c1(2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cc144f72-dd32-4889-a864-4d854d322e97", + "metadata": {}, + "outputs": [], + "source": [ + "c100(10)" + ] + }, + { + "cell_type": "markdown", + "id": "448356f1-141a-43aa-85c0-8399eb86dd92", + "metadata": {}, + "source": [ + "As you can see, each closure maintains a reference to the `initial_value` variable that was created when the `counter()` function was **called**.\n", + "\n", + "Each time that function was called, a new local variable `initial_value` was created (with a value assigned from the argument), and it became a nonlocal (captured) variable in the inner scope." + ] + }, + { + "cell_type": "markdown", + "id": "b44f078d-99a5-4182-bf5c-8534874b13fd", + "metadata": {}, + "source": [ + "Let's extend this example to a **function counter**: a counter that keeps track of how many times a function is run."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b70ff4ba-fe36-4a8a-8722-9983f7836e65", + "metadata": {}, + "outputs": [], + "source": [ + "def fcounter(function):\n", + "    count = 0\n", + "\n", + "    def inner(*args, **kwargs):\n", + "        nonlocal count\n", + "        count += 1\n", + "        print(f\"Function '{function.__name__}' has been called {count} times.\")\n", + "        return function(*args, **kwargs)\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "markdown", + "id": "9106d52f-4d4d-44ac-8b14-2d01aa530764", + "metadata": {}, + "source": [ + "Let's define a function we want to keep track of:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ed85089-96f6-43bf-a339-c0b4a61652aa", + "metadata": {}, + "outputs": [], + "source": [ + "def add(a, b):\n", + "    return a + b\n", + "\n", + "counter_add = fcounter(add)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "02a87923-d871-4689-b486-97d832daa504", + "metadata": {}, + "outputs": [], + "source": [ + "counter_add.__code__.co_freevars" + ] + }, + { + "cell_type": "markdown", + "id": "b1400efa-d0d4-4fe9-8d97-81bd012c9c0d", + "metadata": {}, + "source": [ + "We have **two** free variables, one of which is a function (remember: functions are objects)."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55821ef1-6d90-4c5d-ab5a-35920bb169e6", + "metadata": {}, + "outputs": [], + "source": [ + "counter_add(1, 2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4645c14e-1024-4d6a-8fc4-a0643dfd826e", + "metadata": {}, + "outputs": [], + "source": [ + "counter_add(2, 3)" + ] + }, + { + "cell_type": "markdown", + "id": "03c3e2d6-c94e-43f4-9b10-c4aff030bedf", + "metadata": {}, + "source": [ + "## Decorators" + ] + }, + { + "cell_type": "markdown", + "id": "46a4a41b-107e-4244-9b1f-75a7fde3b309", + "metadata": {}, + "source": [ + "Now that we know what closures are and what we can do with them, understanding what a decorator is takes only a (small) step.\n", + "\n", + "Consider again the last example in the previous section: a counter for functions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2379a0f9-d93b-4a69-9d72-9b504fda9115", + "metadata": {}, + "outputs": [], + "source": [ + "def fcounter(function):\n", + "    count = 0\n", + "\n", + "    def inner(*args, **kwargs):\n", + "        nonlocal count\n", + "        count += 1\n", + "        print(f\"Function '{function.__name__}' has been called {count} times.\")\n", + "        return function(*args, **kwargs)\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "markdown", + "id": "c2af34e6-fb63-41ea-8885-8f1b4f9042ef", + "metadata": {}, + "source": [ + "And then create a function we want to keep track of:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "937b6b94-b093-4955-98f4-2aad9caf2a16", + "metadata": {}, + "outputs": [], + "source": [ + "def factorial(n):\n", + "    product = 1\n", + "    if n == 0:\n", + "        return 1\n", + "    for i in range(2, n+1):\n", + "        product *= i\n", + "    return product" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "445d8327-a1f2-41b7-86ed-860fcf1dc2b0", + "metadata": {}, + "outputs": [], + "source": [ + "counter_fact = fcounter(factorial)" + ] + }, + { +
"cell_type": "code", + "execution_count": null, + "id": "c74867df-f407-4f22-8631-42fef69f09ba", + "metadata": {}, + "outputs": [], + "source": [ + "counter_fact.__closure__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a14265c3-579e-4968-a9b9-e96dc9fc9bf2", + "metadata": {}, + "outputs": [], + "source": [ + "counter_fact(10)" + ] + }, + { + "cell_type": "markdown", + "id": "0733182d-bd6d-4268-ad92-6647baf97ebd", + "metadata": {}, + "source": [ + "Of course, `counter_fact` is an arbitrary name.\n", + "Nothing prevents us from calling it `factorial`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "eb942a01-86d2-4d56-87d8-2008253245a8", + "metadata": {}, + "outputs": [], + "source": [ + "factorial = fcounter(factorial)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "db807859-9005-40bc-bf4a-7e00c632123b", + "metadata": {}, + "outputs": [], + "source": [ + "factorial(10)" + ] + }, + { + "cell_type": "markdown", + "id": "87f34a69-b2dd-4a11-a491-227d8b73ff45", + "metadata": {}, + "source": [ + "This way of defining a function, creating a closure that \"wraps\" a function object, and then **renaming** the initial function is so common in Python that gained a special syntax:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "242d5a97-935c-4215-a9eb-9ec2d6e0f989", + "metadata": {}, + "outputs": [], + "source": [ + "@fcounter\n", + "def mult(a: float, b: float) -> float:\n", + " \"\"\"Multiplies two floats\"\"\"\n", + " return a * b" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "67da9a4e-a0bf-48e3-a28b-e1dde0bca8e3", + "metadata": {}, + "outputs": [], + "source": [ + "mult(2.0, 4.0)" + ] + }, + { + "cell_type": "markdown", + "id": "f28db141-183b-434f-9e0a-b1eab1d90522", + "metadata": {}, + "source": [ + "The function `fcounter` is called a **decorator**, because it's placed **before** the function definition line with the special symbol `@`." 
+ ] + }, + { + "cell_type": "markdown", + "id": "cd26059a-0e81-4c32-9324-cc08b32c7c5c", + "metadata": {}, + "source": [ + "There's one problem, though.\n", + "If we inspect our `mult` function, we can see that it has lost something:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ce41b7be-c854-4d51-8539-635de67d1102", + "metadata": {}, + "outputs": [], + "source": [ + "mult.__name__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ecf4afb7-f02b-4793-95bf-7aa3d78cd07f", + "metadata": {}, + "outputs": [], + "source": [ + "help(mult)" + ] + }, + { + "cell_type": "markdown", + "id": "7ce8eaef-9c35-4c4b-82ea-a54503a1f8e9", + "metadata": {}, + "source": [ + "As you can see, we've also lost our docstring and type hints!\n", + "What's left is the docstring and the type annotations of the `inner` function, which has neither." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7c4816c2-c705-41a5-911f-d8f36e9543e6", + "metadata": {}, + "outputs": [], + "source": [ + "import inspect\n", + "\n", + "print(inspect.getsource(mult))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe8a5163-b1e4-4817-8c16-25e2a67c0120", + "metadata": {}, + "outputs": [], + "source": [ + "print(inspect.signature(mult))" + ] + }, + { + "cell_type": "markdown", + "id": "eea17387-3cac-4982-9e39-fae19b6e6bed", + "metadata": {}, + "source": [ + "We _could_ put back that information, but it might not be straightforward:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "153128bf-aa89-4b02-9205-213fc3deeab8", + "metadata": {}, + "outputs": [], + "source": [ + "def fcounter(function):\n", + "    count = 0\n", + "\n", + "    def inner(*args, **kwargs):\n", + "        nonlocal count\n", + "        count += 1\n", + "        print(f\"Function '{function.__name__}' has been called {count} times.\")\n", + "        return function(*args, **kwargs)\n", + "\n", + "    inner.__name__ = function.__name__\n", + "    inner.__doc__ = function.__doc__\n", + "\n", +
" return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "77e21a34-e89c-4419-b70c-144b803070f9", + "metadata": {}, + "outputs": [], + "source": [ + "@fcounter\n", + "def add(a: int, b: int = 10) -> int:\n", + " \"\"\"Sum two integers\"\"\"\n", + " return a + b" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "04284d5f-58d8-4821-9cb4-58da6bf1e8fa", + "metadata": {}, + "outputs": [], + "source": [ + "help(add)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6de233cd-21ba-4c1e-a1a7-341a8a75f6e7", + "metadata": {}, + "outputs": [], + "source": [ + "add.__name__" + ] + }, + { + "cell_type": "markdown", + "id": "567e660a-5adb-4b8a-8e39-eba77358f77f", + "metadata": {}, + "source": [ + "Okay, at least our docstring and function's name are back.\n", + "What about the type annotations?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0864b0f5-4a50-4444-aa5e-942e8c04d29e", + "metadata": {}, + "outputs": [], + "source": [ + "inspect.signature(add).parameters" + ] + }, + { + "cell_type": "markdown", + "id": "d71b09d2-fe46-485c-a73d-594ab13da6b8", + "metadata": {}, + "source": [ + "Unfortunately, they stil belong to the `inner` function.\n", + "There's a way to bring them back, and we have to use a built-in function from the `functolls` module called `wraps`.\n", + "\n", + "Curiously, `functools.wraps` is **itself a decorator**!" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "57485151-33e2-4b29-a449-0e379975efe1", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import wraps\n", + "\n", + "def fcounter(function):\n", + "    count = 0\n", + "\n", + "    @wraps(function)\n", + "    def inner(*args, **kwargs):\n", + "        nonlocal count\n", + "        count += 1\n", + "        print(f\"Function '{function.__name__}' has been called {count} times.\")\n", + "        return function(*args, **kwargs)\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "87968847-225f-45fb-8d07-37cc7a52cef0", + "metadata": {}, + "outputs": [], + "source": [ + "@fcounter\n", + "def add(a: int, b: int = 10) -> int:\n", + "    \"\"\"Sum two integers\"\"\"\n", + "    return a + b" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "02d33237-63ca-4c50-ac37-21cda9dc262a", + "metadata": {}, + "outputs": [], + "source": [ + "help(add)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a73c3ea7-7755-434c-8a70-b0c7b3eebc6b", + "metadata": {}, + "outputs": [], + "source": [ + "inspect.signature(add)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4fbb7916-ed84-42cb-9869-e117eff96b7c", + "metadata": {}, + "outputs": [], + "source": [ + "inspect.signature(add).parameters" + ] + }, + { + "cell_type": "markdown", + "id": "d1187b91-b02a-4f92-8c15-5d6618225140", + "metadata": {}, + "source": [ + "And now everything is back to normal."
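One detail may help explain why the signature is recovered: besides copying attributes such as `__name__` and `__doc__`, `functools.wraps` also stores the original function on the wrapper as `__wrapped__`, which `inspect.signature` follows by default. The sketch below uses a hypothetical pass-through decorator, `passthrough`, only to keep the output free of the counter messages.

```python
import inspect
from functools import wraps


def passthrough(function):
    """Hypothetical decorator that only forwards the call."""
    @wraps(function)
    def inner(*args, **kwargs):
        return function(*args, **kwargs)
    return inner


@passthrough
def add(a: int, b: int = 10) -> int:
    """Sum two integers"""
    return a + b


print(add.__name__)            # add
print(add.__doc__)             # Sum two integers
print(add.__wrapped__(1, 2))   # 3, calling the undecorated function directly
print(inspect.signature(add))  # (a: int, b: int = 10) -> int
```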
+ ] + }, + { + "cell_type": "markdown", + "id": "ae092a72-f0e7-4f32-89c3-5ff12d1c27a5", + "metadata": {}, + "source": [ + "### Decorators: examples" + ] + }, + { + "cell_type": "markdown", + "id": "851353f7-fa38-49ef-b265-1950ff0d0715", + "metadata": {}, + "source": [ + "#### Example 1: timer" + ] + }, + { + "cell_type": "markdown", + "id": "f2951d11-964f-4b3d-a631-fe853e458739", + "metadata": {}, + "source": [ + "This is a classic example of using decorators: creating a timer for a generic function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "efc370cb-7f4f-4e5e-bbd4-f20c0760eb81", + "metadata": {}, + "outputs": [], + "source": [ + "from time import perf_counter\n", + "from functools import wraps\n", + "\n", + "def timed(function):\n", + "\n", + "    @wraps(function)\n", + "    def inner(*args, **kwargs):\n", + "        start = perf_counter()\n", + "\n", + "        result = function(*args, **kwargs)\n", + "\n", + "        end = perf_counter()\n", + "        elapsed = end - start\n", + "\n", + "        args_ = [str(a) for a in args]\n", + "        kwargs_ = [f'{k}={v}' for (k, v) in kwargs.items()]\n", + "        all_args = args_ + kwargs_\n", + "        args_str = ','.join(all_args)\n", + "\n", + "        print(f'{function.__name__}({args_str}) took {elapsed:.6f}s to run.')\n", + "\n", + "        return result\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "markdown", + "id": "2d9bf042-cc6f-4adb-a901-54bf0d8f92ce", + "metadata": {}, + "source": [ + "Let's test it with a function to calculate the n-th Fibonacci number: `1, 1, 2, 3, 5, 8, 13, ...`\n", + "\n", + "We're going to write **three** Fibonacci implementations to compare their efficiency:\n", + "\n", + "1. With recursion\n", + "2. With a simple loop\n", + "3. A functional approach\n", + "\n", + "**NOTE**: while Python indexes start from 0, our Fibonacci sequence starts from 1 (by choice)."
+ ] + }, + { + "cell_type": "markdown", + "id": "cac78813-e4cf-4156-9ce1-2f725f3a0272", + "metadata": {}, + "source": [ + "##### Fibonacci with recursion" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a24b09c5-97ce-42b5-9a05-ffe0265c0a23", + "metadata": {}, + "outputs": [], + "source": [ + "def calc_fib_recursive(n):\n", + "    if n <= 2:\n", + "        return 1\n", + "    return calc_fib_recursive(n - 1) + calc_fib_recursive(n - 2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4a161836-1271-460d-822d-ec9d58482606", + "metadata": {}, + "outputs": [], + "source": [ + "calc_fib_recursive(3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "68504937-8af3-414e-a702-b6686e0694b7", + "metadata": {}, + "outputs": [], + "source": [ + "calc_fib_recursive(6)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "614a913c-59ca-4a0d-bfd9-ab764ea9eace", + "metadata": {}, + "outputs": [], + "source": [ + "@timed\n", + "def fib_recursive(n):\n", + "    return calc_fib_recursive(n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ef9b981-58e2-455a-a037-784790ba7efa", + "metadata": {}, + "outputs": [], + "source": [ + "fib_recursive(33)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "42c695d9-1659-4154-aabf-7a177eb40ebe", + "metadata": {}, + "outputs": [], + "source": [ + "fib_recursive(35)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d3da9690-6d8b-4559-af89-9dffb048bad0", + "metadata": {}, + "outputs": [], + "source": [ + "fib_recursive(40)" + ] + }, + { + "cell_type": "markdown", + "id": "30b1d00c-b868-430f-b37e-84042003d051", + "metadata": {}, + "source": [ + "Sounds a bit long, doesn't it?\n", + "Well, it is: we are calculating the same numbers **over and over again**.\n", + "Once we're past the 30th number, we start seeing a considerable slowdown."
+ ] + }, + { + "cell_type": "markdown", + "id": "56480f96-ebd2-41ff-8ab9-e6d361fbab6c", + "metadata": {}, + "source": [ + "##### Fibonacci with a simple loop" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0fb9d93b-458c-4ba5-841a-a1c635c7edab", + "metadata": {}, + "outputs": [], + "source": [ + "@timed\n", + "def fib_loop(n):\n", + "    fib_1 = 1\n", + "    fib_2 = 1\n", + "\n", + "    for i in range(3, n + 1):\n", + "        fib_1, fib_2 = fib_2, fib_1 + fib_2\n", + "\n", + "    return fib_2\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80491592-1e14-418d-ad07-763309d433a3", + "metadata": {}, + "outputs": [], + "source": [ + "for n in (3, 10, 30, 35, 40):\n", + "    fib_loop(n)" + ] + }, + { + "cell_type": "markdown", + "id": "963ec600-406d-491c-a39e-46d7d5862c8b", + "metadata": {}, + "source": [ + "Far more efficient!\n", + "All we did was get rid of the repeated (useless) calculations." + ] + }, + { + "cell_type": "markdown", + "id": "40658db4-2751-4778-a801-3b87eec988b5", + "metadata": {}, + "source": [ + "##### Fibonacci using `reduce`" + ] + }, + { + "cell_type": "markdown", + "id": "ca74bba8-b255-4460-b957-5f9990048b3f", + "metadata": {}, + "source": [ + "First, a quick refresher:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3914bc14-5216-4bde-a1f7-2d5672c36885", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import reduce" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9b5209ea-958b-47c3-8b9f-a6f7496ffad7", + "metadata": {}, + "outputs": [], + "source": [ + "reduce(lambda x, y: x + y, (1, 2, 3, 4, 5))" + ] + }, + { + "cell_type": "markdown", + "id": "2b308e95-0eb7-4096-bc77-0bad55e3895c", + "metadata": {}, + "source": [ + "It's just the running sum of the numbers, taken one pair at a time.\n", + "`reduce` applies a two-argument function (1st argument) cumulatively to the elements of an iterable (2nd argument): the result of each step becomes the first operand of the next one."
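, + "\n", + "\n", + "To make the step-by-step reduction easier to see, here is a minimal sketch (not part of the original notebook) that uses the standard library's `itertools.accumulate` to list every intermediate result that `reduce` would otherwise discard:\n", + "\n", + "```python\n", + "from functools import reduce\n", + "from itertools import accumulate\n", + "\n", + "numbers = (1, 2, 3, 4, 5)\n", + "\n", + "# reduce keeps only the final running result\n", + "total = reduce(lambda x, y: x + y, numbers)\n", + "\n", + "# accumulate applies the same function but yields every running result\n", + "steps = list(accumulate(numbers, lambda x, y: x + y))\n", + "\n", + "print(total)  # 15\n", + "print(steps)  # [1, 3, 6, 10, 15]\n", + "```\n", + "\n", + "The last element of `steps` is exactly the value returned by `reduce`."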
+ ] + }, + { + "cell_type": "markdown", + "id": "9cd0be33-3259-4459-b25c-c077e2aa5d87", + "metadata": {}, + "source": [ + "To calculate the Fibonacci sequence with `reduce`:\n", + "\n", + "```\n", + "n=1:\n", + "(1, 0) : result = 1\n", + "\n", + "n=2:\n", + "(1, 0) --> (1 + 0, 1) = (1, 1) : result = 1\n", + "\n", + "n=3:\n", + "(1, 0) --> (1, 1) --> (1 + 1, 1) = (2, 1) : result = 2\n", + "\n", + "n=4:\n", + "(1, 0) --> (1, 1) --> (2, 1) --> (2 + 1, 2) = (3, 2) : result = 3\n", + "```\n", + "\n", + "In general each step in the reduction is as follows:\n", + "\n", + "```\n", + "previous value = (a, b)\n", + "new value = (a+b, a)\n", + "```\n", + "\n", + "If we start our reduction with an initial value of `(1, 0)`, which already holds the first Fibonacci number, we need to run our \"loop\" `n - 1` times to reach the n-th one.\n", + "We therefore use a \"dummy\" sequence of length `n - 1` to create `n - 1` steps in our reduce." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "76da0450-1c18-465a-916b-a7a14e4d6c51", + "metadata": {}, + "outputs": [], + "source": [ + "@timed\n", + "def fib_reduce(n):\n", + "    initial = (1, 0)\n", + "    # the second lambda argument is the dummy loop index and is ignored\n", + "    fib_n = reduce(lambda prev, _: (prev[0] + prev[1], prev[0]), range(n - 1), initial)\n", + "    return fib_n[0]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bac97e9b-b2d3-4a7c-9dc2-2afd10c056a9", + "metadata": {}, + "outputs": [], + "source": [ + "for n in (3, 10, 30, 35, 40):\n", + "    fib_reduce(n)" + ] + }, + { + "cell_type": "markdown", + "id": "f6908625-d87b-4290-a26b-f7584d02fbb3", + "metadata": {}, + "source": [ + "If we compare the three methods:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1380b3c9-5848-4c20-ba24-a31821d55467", + "metadata": {}, + "outputs": [], + "source": [ + "fib_recursive(35)\n", + "fib_loop(35)\n", + "fib_reduce(35)" + ] + }, + { + "cell_type": "markdown", + "id": "d538441b-7e05-4f09-81ee-b9c07f457cfe", + "metadata": {}, + "source": [ + "Although the recursive method is the __easiest__ to understand, it's also
the slowest because it's written inefficiently.\n", + "How can we improve it? Let's see a second example of using decorators." + ] + }, + { + "cell_type": "markdown", + "id": "cd0bdfaf-8869-42da-bbda-b2e7c58cac1b", + "metadata": {}, + "source": [ + "#### Example 2: memoization" + ] + }, + { + "cell_type": "markdown", + "id": "a6855d4c-4bee-499c-a83b-4a757f9bacd2", + "metadata": {}, + "source": [ + "The previous example showed one task that a decorator can accomplish pretty well: adding some feature to a predefined function.\n", + "But what about __changing__ the behavior of the function itself?" + ] + }, + { + "cell_type": "markdown", + "id": "29a3838a-b172-4b22-accd-2f28631f9015", + "metadata": {}, + "source": [ + "Remember the Fibonacci sequence example.\n", + "We discovered that the recursive approach is by far the most intuitive, yet it's tremendously inefficient because the same number gets calculated multiple times." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "037ea535-54b8-46a3-a792-73d178bf9156", + "metadata": {}, + "outputs": [], + "source": [ + "def fib(n):\n", + "    print(f'Calculating fib({n})')\n", + "    return 1 if n < 3 else fib(n-1) + fib(n-2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3a21a00e-25e3-407d-9a3c-ad635fa4c711", + "metadata": {}, + "outputs": [], + "source": [ + "fib(5)" + ] + }, + { + "cell_type": "markdown", + "id": "70b34b5f-329a-4ed7-893f-bd5c941e7717", + "metadata": {}, + "source": [ + "You can see that `fib(2)` is calculated **three times**.\n", + "And the larger the input, the more often each smaller number is recalculated.\n", + "That's why with `fib(40)` the recursive approach takes ages to finish."
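, + "\n", + "\n", + "A quick way to confirm these numbers is to count the calls instead of printing them. The sketch below is an illustration only: `counted_fib` and `calls` are hypothetical names that do not appear elsewhere in this notebook.\n", + "\n", + "```python\n", + "from collections import Counter\n", + "\n", + "calls = Counter()  # maps each argument to the number of times it was requested\n", + "\n", + "def counted_fib(n):\n", + "    calls[n] += 1  # record every call for this argument\n", + "    return 1 if n < 3 else counted_fib(n - 1) + counted_fib(n - 2)\n", + "\n", + "counted_fib(5)\n", + "print(dict(calls))  # {5: 1, 4: 1, 3: 2, 2: 3, 1: 2}\n", + "```\n", + "\n", + "Nine calls in total are needed for `n = 5`, even though only five distinct values are involved."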
+ ] + }, + { + "cell_type": "markdown", + "id": "3d051bd9-356e-43ae-9241-01ca0685f806", + "metadata": {}, + "source": [ + "We'll see how we can improve this approach using a decorator and a caching mechanism for previously calculated numbers.\n", + "This approach is well-known in computer science, and it's called **memoization**." + ] + }, + { + "cell_type": "markdown", + "id": "d0dc6010-55bf-4f05-ad43-0d23e37e0adb", + "metadata": {}, + "source": [ + "For the sake of comparison, let's first approach this problem with a simple class:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "74bcdcf3-fd56-407b-a29e-2e389a9b7119", + "metadata": {}, + "outputs": [], + "source": [ + "class Fib:\n", + "    def __init__(self):\n", + "        self.cache = {1: 1, 2: 1} # initial values already known\n", + "\n", + "    def fib(self, n):\n", + "        if n not in self.cache:\n", + "            print(f'Calculating fib({n})')\n", + "            self.cache[n] = self.fib(n-1) + self.fib(n-2)\n", + "        return self.cache[n]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a4410484-9f84-4129-bbb7-03aa54b86ebd", + "metadata": {}, + "outputs": [], + "source": [ + "f = Fib()\n", + "f.fib(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e1c4e895-0df0-44f5-a433-2e301aa6a42a", + "metadata": {}, + "outputs": [], + "source": [ + "f.fib(12)" + ] + }, + { + "cell_type": "markdown", + "id": "5eb4de0f-8764-4780-902d-866ed71c083d", + "metadata": {}, + "source": [ + "You can see that numbers $\\leq 10$ are **not** recalculated, but are fetched from the cache.\n", + "\n", + "Let's see how we can do this with a closure:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d3c8d45b-8206-430e-b036-ace879fb5eeb", + "metadata": {}, + "outputs": [], + "source": [ + "def fib():\n", + "    # `cache` is our free variable\n", + "    cache = {1: 1, 2: 1}\n", + "\n", + "    def calc_fib(n):\n", + "        if n not in cache:\n", + "            print(f'Calculating fib({n})')\n", + "            cache[n]
= calc_fib(n-1) + calc_fib(n-2)\n", + "        return cache[n]\n", + "\n", + "    return calc_fib" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e70c5950-aebc-46ce-adb9-7aef09842f49", + "metadata": {}, + "outputs": [], + "source": [ + "f = fib() # create our closure\n", + "f(10) # call it" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d4b745b4-3949-4311-af3f-59ea310a8693", + "metadata": {}, + "outputs": [], + "source": [ + "f(15)" + ] + }, + { + "cell_type": "markdown", + "id": "70606145-7e3d-4d39-b912-0a9417243c2b", + "metadata": {}, + "source": [ + "Once again, cached values are simply returned and not recalculated." + ] + }, + { + "cell_type": "markdown", + "id": "d5909ef9-5a98-4b10-b75b-7e2524f7476a", + "metadata": {}, + "source": [ + "How can we implement this as a decorator?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "927c6b8c-53b0-429e-9056-a73ce32c1118", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import wraps\n", + "\n", + "def memoize_fib(fn):\n", + "    cache = {}\n", + "\n", + "    @wraps(fn)\n", + "    def inner(n):\n", + "        if n not in cache:\n", + "            cache[n] = fn(n)\n", + "        return cache[n]\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5e31857a-01f4-462a-b1b5-589991580de0", + "metadata": {}, + "outputs": [], + "source": [ + "@memoize_fib\n", + "def fib(n):\n", + "    print(f'Calculating fib({n})')\n", + "    return 1 if n < 3 else fib(n-1) + fib(n-2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3bbe4852-3229-48d3-9b57-aa8bd5fb19c8", + "metadata": {}, + "outputs": [], + "source": [ + "fib(3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ecd39ea1-6452-4d08-ba34-0ef25c2cb8c8", + "metadata": {}, + "outputs": [], + "source": [ + "fib(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6a3349e4-fbf4-46ec-ad3f-8f024338eda0", + "metadata": {}, +
"outputs": [], + "source": [ + "fib(6)" + ] + }, + { + "cell_type": "markdown", + "id": "0601bada-12d1-4ea5-88a3-dea98a518da2", + "metadata": {}, + "source": [ + "`fib(6)` was literally instantaneous because we already had it in the cache." + ] + }, + { + "cell_type": "markdown", + "id": "9e4d6bd0-be10-4e9c-82fd-e858217c28ba", + "metadata": {}, + "source": [ + "How to create a generic decorator that caches the return values of **any** function?\n", + "We know how to do it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0dcc0d34-4ce6-4226-93ad-60cdbffb82e8", + "metadata": {}, + "outputs": [], + "source": [ + "def memoize(fn):\n", + " cache = {}\n", + " \n", + " @wraps(fn)\n", + " def inner(*args):\n", + " if args not in cache:\n", + " cache[args] = fn(*args)\n", + " return cache[args]\n", + " \n", + " return inner" + ] + }, + { + "cell_type": "markdown", + "id": "bda787dc-56c4-4039-9dfb-0bde5a2ecb74", + "metadata": {}, + "source": [ + "And we can now give any function a cache to store previously calculated results:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3305b178-4529-490b-b874-f90e2e04057b", + "metadata": {}, + "outputs": [], + "source": [ + "@memoize\n", + "def fact(n):\n", + " print(f'Calculating {n}!')\n", + " return 1 if n < 2 else n * fact(n-1)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cdb57a21-3369-429b-be6b-d8070f6274d8", + "metadata": {}, + "outputs": [], + "source": [ + "fact(6)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "85424c51-aad7-4a65-86be-dfcae38fca5d", + "metadata": {}, + "outputs": [], + "source": [ + "fact(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9e72cc94-1625-4a31-bfc5-9c5d960ebf9c", + "metadata": {}, + "outputs": [], + "source": [ + "fact(9)" + ] + }, + { + "cell_type": "markdown", + "id": "97663e19-c050-4ddd-9966-5517329f82d1", + "metadata": {}, + "source": [ + "Caching and decorators play a 
crucial role in optimizing function performance.\n", + "By caching previously calculated results in memory (or on disk), we can drastically reduce the time required for the calculation.\n", + "\n", + "However, our simple memoizer has a limitation: the cache size is **unbounded**, which may not be ideal.\n", + "In practice, it's often desirable to restrict the cache to a specific number of entries.\n", + "This helps strike a balance between computational efficiency and memory utilization." + ] + }, + { + "cell_type": "markdown", + "id": "34a56f44-3a4a-4ef5-b13d-a8996c2705cb", + "metadata": {}, + "source": [ + "Additionally, our current implementation does not handle keyword arguments (`**kwargs`), which can be a significant limitation in more complex scenarios.\n", + "\n", + "Fortunately, Python provides a built-in solution for memoization in the `functools` module, known as `lru_cache`.\n", + "This decorator is designed to address the drawbacks of our basic memoization example.\n", + "`lru_cache` stands for **Least Recently Used** caching, meaning that when the cache reaches its limit, the least recently used entries are automatically removed to make room for new ones.\n", + "This feature ensures efficient memory management while improving performance." 
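, + "\n", + "\n", + "As a minimal sketch of how this looks in practice, the function wrapped by `lru_cache` exposes a `cache_info()` method that reports hits and misses, and keyword arguments are accepted as part of the cache key. The function `power` below is a hypothetical example and not part of this notebook:\n", + "\n", + "```python\n", + "from functools import lru_cache\n", + "\n", + "@lru_cache(maxsize=32)\n", + "def power(base, exponent=2):\n", + "    return base ** exponent\n", + "\n", + "power(2, exponent=10)  # miss: the result is computed and stored\n", + "power(2, exponent=10)  # hit: the same call is served from the cache\n", + "\n", + "info = power.cache_info()\n", + "print(info)  # CacheInfo(hits=1, misses=1, maxsize=32, currsize=1)\n", + "```\n", + "\n", + "All arguments must be hashable, because they are used to build the cache key."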
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "85542bcc-f74a-4ad1-8f49-0f28a05924a1", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import lru_cache" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e3dc8989-b32f-4df1-abfe-163b8e402a3d", + "metadata": {}, + "outputs": [], + "source": [ + "@lru_cache()\n", + "def fact(n):\n", + " print(f\"Calculating fact({n})\")\n", + " return 1 if n < 2 else n * fact(n-1)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "58e99db4-c68f-4361-b1fc-4a36d56bc63c", + "metadata": {}, + "outputs": [], + "source": [ + "for n in (2, 5, 6, 10, 15, 8):\n", + " print(fact(n))" + ] + }, + { + "cell_type": "markdown", + "id": "f2ff8c04-31b2-41cf-9d53-5f342770a4ab", + "metadata": {}, + "source": [ + "Once again, the last value `fact(8)` was simply fetched from the cache." + ] + }, + { + "cell_type": "markdown", + "id": "0127faae-a296-41d2-b369-19e0af6e87dd", + "metadata": {}, + "source": [ + "Now let's see if we have improved on our recursive approach of calculating Fibonacci numbers.\n", + "Recall the naive implementation:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bbe7626d-d436-41e0-85f0-803ee8033399", + "metadata": {}, + "outputs": [], + "source": [ + "from time import perf_counter" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3ca9a287-aff9-493a-9904-9f583ee2d32e", + "metadata": {}, + "outputs": [], + "source": [ + "def fib_no_memo(n):\n", + " return 1 if n < 3 else fib_no_memo(n-1) + fib_no_memo(n-2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3f23099d-5362-4629-ad10-816a61318118", + "metadata": {}, + "outputs": [], + "source": [ + "start = perf_counter()\n", + "result = fib_no_memo(35)\n", + "done = perf_counter() - start\n", + "\n", + "print(f\"result={result}, elapsed: {done}s\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": 
"53f342fd-8e57-4960-8fb4-65a9164d1b4f", + "metadata": {}, + "outputs": [], + "source": [ + "@lru_cache()\n", + "def fib_memo(n):\n", + " return 1 if n < 3 else fib_memo(n-1) + fib_memo(n-2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6205cc9e-47f0-46c1-ad92-f066dd5f2e29", + "metadata": {}, + "outputs": [], + "source": [ + "start = perf_counter()\n", + "result = fib_memo(35)\n", + "done = perf_counter() - start\n", + "\n", + "print(f\"result={result}, elapsed: {done}s\")" + ] + }, + { + "cell_type": "markdown", + "id": "db34d63c-f25e-4d4f-aa7e-19aea575a6cb", + "metadata": {}, + "source": [ + "It's about **4 orders of magnitude** faster than the naive approach! 🔥\n", + "Let's time it again to see what happens:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71c5f5ed-c8cf-493b-a8ae-4af3cfcac730", + "metadata": {}, + "outputs": [], + "source": [ + "start = perf_counter()\n", + "result = fib_memo(35)\n", + "done = perf_counter() - start\n", + "\n", + "print(f\"result={result}, elapsed: {done}s\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "344492e4-35cd-4f1e-953c-6a25ed824beb", + "metadata": {}, + "outputs": [], + "source": [ + "start = perf_counter()\n", + "result = fib_memo(35)\n", + "done = perf_counter() - start\n", + "\n", + "print(f\"result={result}, elapsed: {done}s\")" + ] + }, + { + "cell_type": "markdown", + "id": "e4b2b121-1fa2-41c2-bc4c-6cfc8fa018de", + "metadata": {}, + "source": [ + "Not the same time, but about the same order of magnitude.\n", + "It means that no extra calculation was needed." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b2ed8eac-d3be-4de5-894a-f72efc2c2046", + "metadata": {}, + "source": [ + "You may have noticed that `lru_cache` was called with an **empty list of arguments**, but it supports some.\n", + "One of them is the **cache size** (`maxsize`): by default, the cache can hold up to **128 items**.\n", + "It's best to use powers of 2 for performance reasons, but you can change it to anything you want, including `None` for an **unbounded** cache (not recommended)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "af2d1111-b1e8-41d9-94cf-5a385d143ba4", + "metadata": {}, + "outputs": [], + "source": [ + "@lru_cache(maxsize=8)\n", + "def fib(n):\n", + "    print(f\"Calculating fib({n})\")\n", + "    return 1 if n < 3 else fib(n-1) + fib(n-2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4d7667da-ef73-44e1-a502-4e7bee54db8b", + "metadata": {}, + "outputs": [], + "source": [ + "fib(8)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "33242e6d-3695-43bb-8e3f-22b3a5455d7a", + "metadata": {}, + "outputs": [], + "source": [ + "fib(9)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4558e886-905b-4a75-86ca-5b52d113baca", + "metadata": {}, + "outputs": [], + "source": [ + "fib(1)" + ] + }, + { + "cell_type": "markdown", + "id": "defa65f2-8f62-45c5-a295-78eca6320b14", + "metadata": {}, + "source": [ + "We had to recalculate `fib(1)` because the cache was already full (8 entries) when we called `fib(9)`: to store the new result, the least recently used item (the result of `fib(1)`) was evicted from the cache."
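, + "\n", + "\n", + "The same eviction can be reproduced on a smaller scale. This is only a sketch with a deliberately tiny cache; the function `square_of` is a hypothetical example:\n", + "\n", + "```python\n", + "from functools import lru_cache\n", + "\n", + "@lru_cache(maxsize=2)\n", + "def square_of(n):\n", + "    return n * n\n", + "\n", + "square_of(1)  # miss\n", + "square_of(2)  # miss: the cache is now full\n", + "square_of(3)  # miss: evicts the entry for 1, the least recently used\n", + "square_of(1)  # miss again, because it was evicted\n", + "\n", + "stats = square_of.cache_info()\n", + "print(stats)  # CacheInfo(hits=0, misses=4, maxsize=2, currsize=2)\n", + "```"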
+ ] + }, + { + "cell_type": "markdown", + "id": "11cfa5cb-e95f-47fd-8eda-de7b671e4a0c", + "metadata": {}, + "source": [ + "### Parametrized decorators" + ] + }, + { + "cell_type": "markdown", + "id": "56dec685-4796-483a-b4f6-86652160af57", + "metadata": {}, + "source": [ + "Here comes a natural question: what if I need to pass some argument to my decorator?\n", + "Think again of `functools.lru_cache`: it accepts parameters, for example the cache size, called `maxsize`." + ] + }, + { + "cell_type": "markdown", + "id": "5e80180f-c6ec-4298-a276-75b84e35aa8e", + "metadata": {}, + "source": [ + "Let's bring back our `timed` decorator and make a small change.\n", + "Instead of calculating the time of a **single run**, we want to calculate an **average** of, say, `10` runs:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c4f17d45-364d-4af7-9c8d-2be2c34c6eed", + "metadata": {}, + "outputs": [], + "source": [ + "from time import perf_counter\n", + "\n", + "def timed(fn):\n", + "    def inner(*args, **kwargs):\n", + "        total_elapsed = 0\n", + "\n", + "        for i in range(10):\n", + "            start = perf_counter()\n", + "            result = fn(*args, **kwargs)\n", + "            end = perf_counter()\n", + "            total_elapsed += end - start\n", + "\n", + "        avg_elapsed = total_elapsed / 10\n", + "\n", + "        print(f'Avg runtime: {avg_elapsed:.6f}s')\n", + "\n", + "        return result\n", + "\n", + "    return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "07889c31-4a4a-45f2-a6a4-e0da8cdf19bf", + "metadata": {}, + "outputs": [], + "source": [ + "def calc_fib_recurse(n):\n", + "    return 1 if n < 3 else calc_fib_recurse(n-1) + calc_fib_recurse(n-2)\n", + "\n", + "@timed\n", + "def fib(n):\n", + "    return calc_fib_recurse(n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7664bbd0-beec-42b8-88fa-a634781e40c7", + "metadata": {}, + "outputs": [], + "source": [ + "fib(30)" + ] + }, + { + "cell_type": "markdown", + "id":
"7fcf76cf-906c-4d8b-a69d-163ec31fa4d9", + "metadata": {}, + "source": [ + "But what if I wanted to time this function **100 times**?\n", + "Or say that I have different functions that should be timed with a different number of repetitions?\n", + "It's not the best to have the value `10` hard-coded, right?\n", + "\n", + "Let's change this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8649f150-aeaa-4d6e-a8e1-e8e6f444e4af", + "metadata": {}, + "outputs": [], + "source": [ + "def timed(fn, num_reps): \n", + " def inner(*args, **kwargs):\n", + " total_elapsed = 0\n", + " \n", + " for i in range(num_reps):\n", + " start = perf_counter()\n", + " result = fn(*args, **kwargs)\n", + " end = perf_counter()\n", + " total_elapsed += (perf_counter() - start)\n", + " \n", + " avg_elapsed = total_elapsed / num_reps\n", + " \n", + " print(f'Avg runtime: {avg_elapsed:.6f}s ({num_reps} reps)')\n", + " return result\n", + " \n", + " return inner" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6033e377-bb3c-458f-8df9-c29fe6ec0ade", + "metadata": {}, + "outputs": [], + "source": [ + "def fib(n):\n", + " return calc_fib_recurse(n)\n", + "\n", + "fib = timed(fib, 5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f52d07c1-53a5-4f70-beee-b4ea23dec26b", + "metadata": {}, + "outputs": [], + "source": [ + "fib(28)" + ] + }, + { + "cell_type": "markdown", + "id": "8949b05a-b68f-4c6d-9cd5-f44eb394a973", + "metadata": {}, + "source": [ + "But wait: why did we use the fancy `@-` syntax?\n", + "The reason is simple: with `@` the decorating function (`timed` here) can only take a **single argument**, that is, the function to be decorated." 
+ ] + }, + { + "cell_type": "markdown", + "id": "b6badfec-b32c-416c-9c51-4291666b2ed8", + "metadata": {}, + "source": [ + "To fix this behavior we need to rethink what `@` is doing.\n", + "Writing\n", + "\n", + "```python\n", + "@timed\n", + "def my_func():\n", + "    pass\n", + "```\n", + "\n", + "is equivalent to\n", + "\n", + "```python\n", + "my_func = timed(my_func)\n", + "```\n", + "\n", + "When called, `timed` returns the **inner closure**, where the original function is the free variable.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "604bd997-d375-4620-a3f8-84daa3ba63f1", + "metadata": {}, + "outputs": [], + "source": [ + "fib.__closure__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "42189578-2918-4525-9e40-184d8b527434", + "metadata": {}, + "outputs": [], + "source": [ + "fib.__code__.co_freevars" + ] + }, + { + "cell_type": "markdown", + "id": "7f702383-31b4-42a9-b97b-d6ef5d34d6cc", + "metadata": {}, + "source": [ + "So, for the syntax `@timed(10)` to work, where `10` is the number of repetitions, `timed` should return **a decorator itself**, and not our closure.\n", + "In practice, the `timed` function becomes a **decorator factory**: something that's able to return a \"parametrized\" decorator."
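, + "\n", + "\n", + "Before applying this to `timed`, here is a stripped-down sketch of the same idea. The names `repeat`, `greet` and `calls` are hypothetical and only serve the illustration; the point is that `@repeat(3)` is shorthand for `greet = repeat(3)(greet)`:\n", + "\n", + "```python\n", + "from functools import wraps\n", + "\n", + "def repeat(num_reps):\n", + "    # the factory: receives the parameter and returns a decorator\n", + "    def decorator(fn):\n", + "        @wraps(fn)\n", + "        def inner(*args, **kwargs):\n", + "            result = None\n", + "            for _ in range(num_reps):\n", + "                result = fn(*args, **kwargs)\n", + "            return result\n", + "        return inner\n", + "    return decorator\n", + "\n", + "calls = []\n", + "\n", + "def greet(name):\n", + "    calls.append(name)\n", + "    return f'Hello, {name}!'\n", + "\n", + "# same effect as writing @repeat(3) above the definition of greet\n", + "greet = repeat(3)(greet)\n", + "\n", + "print(greet('Ada'))  # Hello, Ada!\n", + "print(len(calls))    # 3\n", + "```\n", + "\n", + "Three nested levels are involved: the factory receives the parameter, the decorator receives the function, and the inner closure receives the call arguments."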
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6975521b-162d-495f-b185-b86559cddd86", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import wraps\n", + "from time import perf_counter\n", + "\n", + "def timed(num_reps=10):\n", + "\n", + "    def decorator(fn):\n", + "\n", + "        @wraps(fn)\n", + "        def inner(*args, **kwargs):\n", + "            total_elapsed = 0\n", + "\n", + "            for i in range(num_reps):\n", + "                start = perf_counter()\n", + "                result = fn(*args, **kwargs)\n", + "                end = perf_counter()\n", + "                total_elapsed += end - start\n", + "\n", + "            avg_elapsed = total_elapsed / num_reps\n", + "\n", + "            print(f'Avg Run time: {avg_elapsed:.6f}s ({num_reps} reps)')\n", + "            return result\n", + "\n", + "        return inner\n", + "\n", + "    return decorator" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f097a33c-48b7-4ee3-9460-53b31a5835c1", + "metadata": {}, + "outputs": [], + "source": [ + "def calc_fib_recurse(n):\n", + "    return 1 if n < 3 else calc_fib_recurse(n-1) + calc_fib_recurse(n-2)\n", + "\n", + "@timed(10)\n", + "def fib(n):\n", + "    return calc_fib_recurse(n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f87ec0fe-ef8d-4d5a-b8cf-56eb24cce6e8", + "metadata": {}, + "outputs": [], + "source": [ + "fib(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "85f7bc42-d42e-41ca-8bc4-b2a1369b765e", + "metadata": {}, + "outputs": [], + "source": [ + "from functools import lru_cache\n", + "\n", + "def calc_fact(n):\n", + "    return 1 if n < 2 else n * calc_fact(n-1)\n", + "\n", + "@timed(20)\n", + "@lru_cache()\n", + "def fact(n):\n", + "    return calc_fact(n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6d205959-7384-4a5a-a5fd-a60816ae102b", + "metadata": {}, + "outputs": [], + "source": [ + "fact(10)" + ] + }, + { + "cell_type": "markdown", + "id": "a17be0fa-db15-4a09-b92e-4da09c5d3c3a", + "metadata": {}, + "source": [ + "And yes,
you can **stack multiple decorators**! 😎" + ] + }, + { + "cell_type": "markdown", + "id": "4d73ede6-b0ef-435f-9eb5-ef10a4f099e0", + "metadata": {}, + "source": [ + "## Generators" + ] + }, + { + "cell_type": "markdown", + "id": "93753115-b428-49ad-883b-29d1ad5cefef", + "metadata": {}, + "source": [ + "The concept of generators is very much tied to that of \"looping over some kind of container\".\n", + "And we have already used generators many times without realizing it.\n", + "The easiest example is a standard `for` loop over some range of integers:\n", + "\n", + "```python\n", + "for i in range(10):\n", + "    # do something\n", + "```\n", + "\n", + "The object that Python builds for us with `range(10)` is something very close to a generator. " + ] + }, + { + "cell_type": "markdown", + "id": "210bcc5e-58cf-466d-8e17-2f0c4c9eaec1", + "metadata": {}, + "source": [ + "To understand generators, we first need to review what it means to be **iterable** and, more importantly, what an **iterator** is.\n", + "\n", + "1. An **iterable** is any object that can return one item at a time until there are no items left.\n", + "2. An **iterator** is an object that represents a stream of data and keeps track of the current position while processing the stream. It must implement two methods of the _iterator protocol_: `__next__` (returns the next element in the stream and advances the position) and `__iter__` (returns the iterator object itself)" + ] + }, + { + "cell_type": "markdown", + "id": "015bb6cd-f87b-4b8b-acd6-5087bca63911", + "metadata": {}, + "source": [ + "Delving deep into iterators is out of the scope of this section, so we are going to show you a practical example of a class that implements the \"iterator protocol\".\n", + "\n", + "Example: we want an iterator that builds squares of successive integers."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a5202c22-1018-4652-a75d-b5027f6de5fd", + "metadata": {}, + "outputs": [], + "source": [ + "class Squares:\n", + "    def __init__(self, n):\n", + "        self.n = n\n", + "        self.i = 0\n", + "\n", + "    def __iter__(self):\n", + "        return self\n", + "\n", + "    def __next__(self):\n", + "        if self.i >= self.n:\n", + "            raise StopIteration\n", + "\n", + "        self.i += 1\n", + "        return self.i ** 2" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "192ba5cc-bd97-4310-9ad4-ff520ea8050f", + "metadata": {}, + "outputs": [], + "source": [ + "for sq in Squares(5):\n", + "    print(sq)" + ] + }, + { + "cell_type": "markdown", + "id": "aa22aa62-570b-43c1-bbf4-6948ed8940cb", + "metadata": {}, + "source": [ + "We see that we can indeed loop over our custom `Squares` class.\n", + "How is Python able to do this?\n", + "\n", + "1. The `__next__` method returns the next item, without going past the last one\n", + "2. We raise a special exception (`StopIteration`) once there are no items left\n", + "3. The `__iter__` method returns the instance itself (`self`), meaning that the object itself _is_ an iterator" + ] + }, + { + "cell_type": "markdown", + "id": "89b69921-b35b-4c6b-a958-a11697e37ec8", + "metadata": {}, + "source": [ + "As you might have learned by now, we can implement some built-in behavior in our classes by using the so-called \"special methods\" or **dunder methods**: those with this naming schema `__method__`.\n", + "\n", + "A few examples:\n", + "\n", + "- The `len()` built-in can be defined with the `__len__` method\n", + "- The string returned by `str()` can be defined with the `__str__` method.
The same goes for an object's representation with `repr()` (and `__repr__`)\n", + "- The `[]` operator (fetching an item by index from an ordered collection) can be defined with the `__getitem__` method" + ] + }, + { + "cell_type": "markdown", + "id": "cf30162e-8c18-4dbe-be1e-f0796d1ceecf", + "metadata": {}, + "source": [ + "Python also has the built-in `next()` which does what you think it does: it takes an **iterator** object and returns the next element in the stream of data by calling the `__next__` method implemented by that object.\n", + "\n", + "In the same way, we can call `iter()` with an object as its **only** argument to get back an iterator.\n", + "Our class supports that by implementing the `__iter__` method.\n", + "\n", + "But there's another way of calling `iter()`, with **two arguments**: the first must be a **callable** (e.g., a function) that takes no arguments, and the second is a **sentinel**. As soon as the callable returns the sentinel value, a `StopIteration` is raised." + ] + }, + { + "cell_type": "markdown", + "id": "a41a06b7-b900-457e-bbe5-2e1f49d03260", + "metadata": {}, + "source": [ + "We could've written our `Squares` class using a closure instead:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "32ed78a0-093c-4c0f-ad1e-08040f0767f2", + "metadata": {}, + "outputs": [], + "source": [ + "def square():\n", + "    i = 0\n", + "    def inner():\n", + "        nonlocal i\n", + "        i += 1\n", + "        return i ** 2\n", + "    return inner\n", + "\n", + "square_iter = iter(square(), 5**2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cf6b292a-0b8b-483a-b6c2-02197ed0a617", + "metadata": {}, + "outputs": [], + "source": [ + "for sq in square_iter:\n", + "    print(sq)" + ] + }, + { + "cell_type": "markdown", + "id": "2c7e0730-2f0b-4c05-b327-0c608f1d8e6e", + "metadata": {}, + "source": [ + "If the value returned by the function we got from `square()` is 25 (our sentinel), then a `StopIteration` is raised and the loop ends without printing 25."
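+ ] + }, + { + "cell_type": "markdown", + "id": "added-iter-sentinel-sketch", + "metadata": {}, + "source": [ + "The two-argument form of `iter()` is not tied to closures. As a minimal sketch (the in-memory `stream` and the chunk size of 3 are illustrative assumptions, not part of the example above), it can also read a stream in fixed-size chunks until `read()` returns the empty string:\n", + "\n", + "```python\n", + "import io\n", + "from functools import partial\n", + "\n", + "stream = io.StringIO('abcdefgh')\n", + "# Call stream.read(3) repeatedly; stop as soon as it returns '' (the sentinel)\n", + "for chunk in iter(partial(stream.read, 3), ''):\n", + "    print(chunk)\n", + "```\n", + "\n", + "Note that the sentinel itself is never produced: the loop ends as soon as the callable returns it."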
+ ] + }, + { + "cell_type": "markdown", + "id": "60418aad-f45f-4e8d-92f8-f1d25c31a3ac", + "metadata": {}, + "source": [ + "These two approaches do essentially the same thing: in the first case (the class), we built the iterator ourselves. In the second case, Python built it for us. One difference: the sentinel version stops _before_ producing 25, while the class also returns it.\n", + "\n", + "The second example is shorter, but maybe a bit more difficult to understand if we didn't write it.\n", + "There's a better way to do the same, and it's using **generators** with their special keyword `yield`." + ] + }, + { + "cell_type": "markdown", + "id": "12189bba-06b8-46d7-ade8-d98f155ccec4", + "metadata": {}, + "source": [ + "The `yield` statement is used almost like a `return` statement in a function, but there is a huge difference.\n", + "When the `yield` statement is encountered, Python returns whatever value `yield` specifies, but it **pauses** execution of the function.\n", + "We can then _resume_ the function from where the last `yield` was encountered.\n", + "\n", + "We do **not** resume the function by calling it the standard way, but we have to use the built-in `next()`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bcebcd57-1eb3-464b-90a1-a4472460ee9f", + "metadata": {}, + "outputs": [], + "source": [ + "def my_func():\n", + "    print('line 1')\n", + "    yield 'Python'\n", + "    print('line 2')\n", + "    yield 'Is'\n", + "    print('line 3')\n", + "    yield 'Great'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "02645231-a433-4d6d-b715-44ad0c8de2c5", + "metadata": {}, + "outputs": [], + "source": [ + "gen_my_func = my_func()\n", + "type(gen_my_func)" + ] + }, + { + "cell_type": "markdown", + "id": "a64b6ee5-c532-4562-bf28-de0d49ce5214", + "metadata": {}, + "source": [ + "Here it is: calling our function did not return one of the strings, but a **generator** object.\n", + "Nothing in the function body runs until we pass that object to `next()`:" + ] + }, + { + "cell_type": "code", +
"execution_count": null, + "id": "d481c9c0-da8a-4b32-a6d4-2ad26a32a1dd", + "metadata": {}, + "outputs": [], + "source": [ + "next(gen_my_func)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5b73e97f-af8a-4aeb-8e09-8c78fd90ad0e", + "metadata": {}, + "outputs": [], + "source": [ + "next(gen_my_func)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7f4792ae-2e76-496e-83d6-e5fa0cd3b511", + "metadata": {}, + "outputs": [], + "source": [ + "next(gen_my_func)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "08af1d98-6ef1-4b91-9cb8-36e84a89c803", + "metadata": {}, + "outputs": [], + "source": [ + "next(gen_my_func)" + ] + }, + { + "cell_type": "markdown", + "id": "fd0cd226-51e9-4776-b02d-805900edce5e", + "metadata": {}, + "source": [ + "A `StopIteration` is raised if we are trying to go past the last `yield` statement.\n", + "This should ring a bell: the `next()` method, a `StopIteration`... it seems that `gen_my_func` is very similar to an iterator.\n", + "\n", + "How can we check it?\n", + "We know that an iterator **must** implement an `__iter__` method, right?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4d97ad51-16ea-48c6-bcab-fbf7dd09e1cc", + "metadata": {}, + "outputs": [], + "source": [ + "'__iter__' in dir(gen_my_func)" + ] + }, + { + "cell_type": "markdown", + "id": "f14e9fd4-ab5b-44fa-bf3c-c65f2f931ab4", + "metadata": {}, + "source": [ + "And also the `__next__` method" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c0c60262-cf6b-445d-8026-5da9623f2cae", + "metadata": {}, + "outputs": [], + "source": [ + "'__next__' in dir(gen_my_func)" + ] + }, + { + "cell_type": "markdown", + "id": "f4727a96-ecab-4ed2-bccb-665bad5b5935", + "metadata": {}, + "source": [ + "We can also check that `iter()` applied on our object returns indeed the same thing.\n", + "That is, our object is itself an iterator." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d52671aa-ae89-4ffa-a86f-d7b8d4801d19", + "metadata": {}, + "outputs": [], + "source": [ + "gen_my_func" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb89ae4b-6e03-449a-853d-783ef380a904", + "metadata": {}, + "outputs": [], + "source": [ + "iter(gen_my_func)" + ] + }, + { + "cell_type": "markdown", + "id": "9e43f3ba-8b8b-49c7-a10d-44640df0ad65", + "metadata": {}, + "source": [ + "Precisely the same object." + ] + }, + { + "cell_type": "markdown", + "id": "c2a8686c-3ebe-449b-9857-0d33284c8d3a", + "metadata": {}, + "source": [ + "How does Python know when to stop the iteration?\n", + "When should it raise the `StopIteration`?\n", + "In the simple example above, it's easy: when there's nothing else after the last `yield`.\n", + "\n", + "Well, not really \"nothing\". Remember that Python returns `None` for us if we don't specify any `return` statement.\n", + "So, in general, the iteration will terminate **when the function returns**, either implicitly at the end of its body or through an explicit `return` statement.\n", + "\n", + "Let's go back to our `squares` example and refactor it into a generator:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b0d6a96c-03b2-4c5c-97b5-e8eeb92d3459", + "metadata": {}, + "outputs": [], + "source": [ + "def squares(sentinel):\n", + "    i = 0\n", + "    while True:\n", + "        if i < sentinel:\n", + "            yield i ** 2\n", + "            i += 1\n", + "        else:\n", + "            return 'Finished.'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c3036908-c8a4-4fb8-8ba4-0d4923af2bcc", + "metadata": {}, + "outputs": [], + "source": [ + "sq = squares(3)\n", + "next(sq)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3b22bd05-6164-4d0f-b056-bc99c9cac527", + "metadata": {}, + "outputs": [], + "source": [ + "next(sq)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "135dd1b8-557b-4b40-9b91-3025f0e956db", + "metadata": {}, +
"outputs": [], + "source": [ + "next(sq) # this is the last" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7b0200d9-4c70-46df-ab4a-83d942aa439d", + "metadata": {}, + "outputs": [], + "source": [ + "next(sq) # a StopIteration is raised" + ] + }, + { + "cell_type": "markdown", + "id": "31517002-ba25-41f3-81ae-113b3ddbb8ac", + "metadata": {}, + "source": [ + "Note how in the generator function above we incremented the number `i` **after** the `yield` statement.\n", + "That is, as soon as we resume our function, we make sure to be in the correct position of our _stream of data_ – in this case, a sequence of integers squared." + ] + }, + { + "cell_type": "markdown", + "id": "3cbf6eab-aff5-4e39-b974-559d581d7d2a", + "metadata": {}, + "source": [ + "### Create an interable from a generator" + ] + }, + { + "cell_type": "markdown", + "id": "379ea24c-55eb-4a8b-85b8-4747c988af2d", + "metadata": {}, + "source": [ + "As we know, generators are iterators.\n", + "This means that we can **consume** them (i.e., exhaust the elements they can return).\n", + "However, sometimes we want to create an interable instead, like a list, that we can loop over as many time as we want.\n", + "\n", + "We know all the pieces to put together to obtain such a thing: we need a class that implements the iterator protocol." 
+ ] + }, + { + "cell_type": "markdown", + "id": "eb5c13df-f019-4f7d-8e2d-55c2bfff6f96", + "metadata": {}, + "source": [ + "Let's consider again the example of generating squares of integers:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a3a39c29-fb62-4bed-957f-ea38fdad1665", + "metadata": {}, + "outputs": [], + "source": [ + "def squares_gen(n):\n", + " for i in range(n):\n", + " yield i ** 2" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1f78d206-0dc6-4ef1-8016-6a3a9430dea7", + "metadata": {}, + "outputs": [], + "source": [ + "sq = squares_gen(5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5f82b60a-e623-4d86-98fb-7eef3bc72dca", + "metadata": {}, + "outputs": [], + "source": [ + "for num in sq:\n", + " print(num)" + ] + }, + { + "cell_type": "markdown", + "id": "564771fe-2f21-4d8d-97f2-b2f592623426", + "metadata": {}, + "source": [ + "But our generator is now exhausted and it has nothing left to return:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "af2a0872-99fe-440c-8767-959a3ec95338", + "metadata": {}, + "outputs": [], + "source": [ + "next(sq)" + ] + }, + { + "cell_type": "markdown", + "id": "9a14144f-e80a-4a9b-99f8-2b4eceda4217", + "metadata": {}, + "source": [ + "To restart the iteration, we need to create another instance of the generator.\n", + "We can wrap this behavior in an **iterable class**:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "33e71361-0245-4cc3-bc5d-73c11d0ab59f", + "metadata": {}, + "outputs": [], + "source": [ + "class Squares:\n", + " def __init__(self, n):\n", + " self.n = n\n", + "\n", + " def __iter__(self):\n", + " return squares_gen(self.n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "305966ad-0c54-4edb-86de-2008653874fa", + "metadata": {}, + "outputs": [], + "source": [ + "sq = Squares(5)\n", + "[num for num in sq]" + ] + }, + { + "cell_type": "markdown", + "id": 
"2e104cb5-3c8a-4d62-b3f9-50472a3a85e3", + "metadata": {}, + "source": [ + "And we can do it again:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "76499d2b-227b-4558-8815-1325baa1379a", + "metadata": {}, + "outputs": [], + "source": [ + "[num for num in sq]" + ] + }, + { + "cell_type": "markdown", + "id": "10db0c6c-7d58-40b5-8600-28f4876f6899", + "metadata": {}, + "source": [ + "We can put everything is a single class to make things easier to read:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cc5b2d58-ac9a-451a-ae31-ca212ba74843", + "metadata": {}, + "outputs": [], + "source": [ + "class Squares:\n", + " def __init__(self, n):\n", + " self.n = n\n", + " \n", + " @staticmethod\n", + " def squares_gen(n):\n", + " for i in range(n):\n", + " yield i ** 2\n", + " \n", + " def __iter__(self):\n", + " return Squares.squares_gen(self.n)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3ed78f78-2054-4b72-9715-17f45d8e1754", + "metadata": {}, + "outputs": [], + "source": [ + "sq = Squares(10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb1c8334-1263-43d9-b626-72513e2ea779", + "metadata": {}, + "outputs": [], + "source": [ + "[num for num in sq]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d281f4d9-2e91-41c7-9287-ce7bfd79def2", + "metadata": {}, + "outputs": [], + "source": [ + "[num for num in sq]" + ] + }, + { + "cell_type": "markdown", + "id": "57990dd0-af52-4316-a71b-bd28edf57cd6", + "metadata": {}, + "source": [ + "### Combining generators" + ] + }, + { + "cell_type": "markdown", + "id": "95750a37-00ee-4292-ac6c-f64320048327", + "metadata": {}, + "source": [ + "We have to be careful when using a generator with one another.\n", + "For example, the `enumerate()` built-in returns a generator to iterate over an indexed container." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "462988f1-79f4-4a3d-997e-c415c7301691", + "metadata": {}, + "outputs": [], + "source": [ + "def squares(n):\n", + "    for i in range(n):\n", + "        yield i ** 2" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6767cfe1-bc52-424b-8665-cfd7f4b36dd9", + "metadata": {}, + "outputs": [], + "source": [ + "sq = squares(5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b35c2c33-d798-4458-a712-75edcfbc676a", + "metadata": {}, + "outputs": [], + "source": [ + "enum_sq = enumerate(sq)" + ] + }, + { + "cell_type": "markdown", + "id": "b338a634-13a3-4b93-b533-d5aa6c15e199", + "metadata": {}, + "source": [ + "Now, `enumerate` is lazy as well, so `sq` has not been consumed yet at this point:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e8beb7cc-3bbb-4153-aa71-fc4def671e68", + "metadata": {}, + "outputs": [], + "source": [ + "next(sq)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "62bd58ff-2b79-4f3f-ad6f-06d22736b53c", + "metadata": {}, + "outputs": [], + "source": [ + "next(sq)" + ] + }, + { + "cell_type": "markdown", + "id": "0f2ce2d7-2ec6-428f-aa8c-94cc6d6a65b2", + "metadata": {}, + "source": [ + "But since we have now consumed **2 elements** from `sq`, when we use `enumerate` it will also get two fewer items from `sq`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9070f983-4938-4ee1-a97a-b85ecfc2c5bd", + "metadata": {}, + "outputs": [], + "source": [ + "next(enum_sq)" + ] + }, + { + "cell_type": "markdown", + "id": "040e46b2-206a-4051-b548-62acec5287be", + "metadata": {}, + "source": [ + "And this might not be what you expected: the value is the **third** element of `sq` ($2^2$), while the index is `0`, as if we were starting from the beginning.\n", + "From the point of view of the iterator returned by `enumerate`, **we are at the beginning**.\n", + "\n", + "So, beware
when you are combining multiple generators, and think carefully about the behavior you expect." + ] + }, + { + "cell_type": "markdown", + "id": "f5214c67-0f92-4a7f-a768-eee90dd14bbb", + "metadata": {}, + "source": [ + "## Exercises" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "69a49f30-1d16-4c84-89bc-30d6bd7421f2", + "metadata": {}, + "outputs": [], + "source": [ + "%reload_ext tutorial.tests.testsuite" + ] + }, + { + "cell_type": "markdown", + "id": "777251e1-d655-4e36-a41c-ddefe020d1b7", + "metadata": {}, + "source": [ + "### Password checker factory" + ] + }, + { + "cell_type": "markdown", + "id": "bbec8f73-9b4f-49bc-bd59-c3db2ed6132d", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "Create a function called `password_checker_factory` that can be used to generate different password checkers.\n", + "This function will take **four parameters**: `min_uppercase`, `min_lowercase`, `min_punctuation`, and `min_digits`.\n", + "They represent the constraints on a given password:\n", + "\n", + "1. The minimum number of uppercase letters.\n", + "2. The minimum number of lowercase letters.\n", + "3. The minimum number of punctuation characters.\n", + "4. The minimum number of digits.\n", + "\n", + "\n", + "The `password_checker_factory` function generates another function that assesses a given password (string).\n", + "This resulting function returns a **tuple with two elements**:\n", + "\n", + "1. The first element is a **boolean** indicating if the password passed validation.\n", + "2. The second element is a **dictionary** mapping `uppercase`, `lowercase`, `punctuation`, and `digits` to the difference between the actual count of each type in the password and its minimum requirement. Positive values mean that the minimum is exceeded, negative values mean that it is not met."
+ ] + }, + { + "cell_type": "markdown", + "id": "ed1919fe-96d2-4c59-8753-9f2113aa8c3e", + "metadata": {}, + "source": [ + "For example, to create a password checker that requires a password to have at least 2 uppercase letters, at least 3 lowercase letters, at least 1 punctuation mark, and at least 4 digits, we can write\n", + "\n", + "```python\n", + "pc1 = password_checker_factory(2, 3, 1, 4)\n", + "```\n", + "\n", + "If we test the following passwords:\n", + "\n", + "```python\n", + "print(pc1('Ab!1'))\n", + "print(pc1('ABcde!1234'))\n", + "```\n", + "\n", + "We should get these results:\n", + "\n", + "```python\n", + "(False, {'uppercase': -1, 'lowercase': -2, 'punctuation': 0, 'digits': -3})\n", + "(True, {'uppercase': 0, 'lowercase': 0, 'punctuation': 0, 'digits': 0})\n", + "```\n", + "\n", + "In this example, the first password `Ab!1` is **invalid**: it lacks `1` uppercase character, `2` lowercase, and `3` digits.\n", + "Instead, the second password is **valid**." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "565bd836-35f0-40c7-9897-653d2e6253f6", + "metadata": {}, + "outputs": [], + "source": [ + "%%ipytest\n", + "\n", + "def solution_password_checker_factory(min_up: int, min_low: int, min_pun: int, min_dig: int):\n", + "    \"\"\"Password checker factory\"\"\"\n", + "    " + ] + }, + { + "cell_type": "markdown", + "id": "1fe377f9-0f6c-44d5-8d15-c74a8502bc52", + "metadata": {}, + "source": [ + "### String range" + ] + }, + { + "cell_type": "markdown", + "id": "740cf590-13c5-4df3-b476-7c0ca7d80e29", + "metadata": {}, + "source": [ + "Create a function called `str_range` that emulates the built-in `range`, but for characters.\n", + "That is, when you call `str_range('j', 'm')`, you will get back a generator that produces the letters `j`, `k`, `l`, and `m`.\n", + "\n", + "The function takes two **mandatory** parameters, `start` and `end`, plus an **optional** `step` value, with a default value of `1`.\n", + "\n", + "As opposed to Python's
numeric `range()`, the string ranges generated by `str_range` **include** their final character (that is, `end`).\n", + "Moreover, since Python 3 supports non-Latin characters (and even non-alphabetic ones), it should be possible to use any of them as a `start` or `end` value." + ] + }, + { + "cell_type": "markdown", + "id": "48b1b02e-4971-4ee0-9d1e-7a746b847a9a", + "metadata": {}, + "source": [ + "
\n", + "

Hint

You might want to look up what it means to get an \"integer representing the Unicode code point of that character\". The official docs of Python might help you.\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "a0618164-199a-4211-8c83-1341c0fc84f7", + "metadata": {}, + "source": [ + "
\n", + "

Note

It's okay if the idea of \"iterating over a range of characters\" doesn't quite make sense in some languages.\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "00b721c6-87e8-4eda-8cc3-b1bca3d2bdd6", + "metadata": {}, + "outputs": [], + "source": [ + "%%ipytest\n", + "\n", + "def solution_str_range(start: str, end: str, step: int):\n", + " \"\"\"Return a generator from `start` to `end` strings (inclusive)\"\"\"\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "09965c4e-db8a-46a6-9b57-a4b5350ae398", + "metadata": {}, + "source": [ + "### Read `n` lines" + ] + }, + { + "cell_type": "markdown", + "id": "ba075f54-be74-41c9-9c56-a25aefb6569f", + "metadata": {}, + "source": [ + "Create a function called `read_n_lines` that takes two arguments: the filename from which to read, and the **maximum number of lines** that should be returned with each iteration.\n", + "\n", + "For example, if we had a file like\n", + "\n", + "```\n", + "File line 0 aaa\n", + "File line 1 bbb\n", + "File line 2 ccc\n", + "File line 3 ddd\n", + "File line 4 eee\n", + "File line 5 fff\n", + "File line 6 ggg\n", + "```\n", + "\n", + "Then we could use the `read_n_lines` to read pairs of lines:\n", + "\n", + "```python\n", + "for two_lines in read_n_lines(filename, 2):\n", + " print(two_lines.rstrip())\n", + "```\n", + "\n", + "And the output would be:\n", + "\n", + "```\n", + "File line 0 aaa\n", + "File line 1 bbb\n", + "\n", + "File line 2 ccc\n", + "File line 3 ddd\n", + "\n", + "File line 4 eee\n", + "File line 5 fff\n", + "\n", + "File line 6 ggg\n", + "```\n", + "\n", + "The last line is returned by itself because the file has an odd number of lines." 
+ ] + }, + { + "cell_type": "markdown", + "id": "e21a3c08-1584-409f-8e3b-d36c5d22b960", + "metadata": {}, + "source": [ + "We could also do:\n", + "\n", + "```python\n", + "for four_lines in read_n_lines(filename, 4):\n", + "    print(four_lines.rstrip())\n", + "```\n", + "\n", + "And get back\n", + "\n", + "```\n", + "File line 0 aaa\n", + "File line 1 bbb\n", + "File line 2 ccc\n", + "File line 3 ddd\n", + "\n", + "File line 4 eee\n", + "File line 5 fff\n", + "File line 6 ggg\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "e20056ea-b537-4cb3-9e6a-624348c8d443", + "metadata": {}, + "source": [ + "
\n", + "

Note

With each iteration, read_n_lines should return a string (not a list) containing up to the number of lines specified by the parameter lines.\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5d58c7d1-e361-4d41-a7d2-162ca06746b0", + "metadata": {}, + "outputs": [], + "source": [ + "%%ipytest\n", + "\n", + "def solution_read_n_lines(filename: str, lines: int):\n", + " \"\"\"Read multiple lines from a file\"\"\"\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "a941b875-c1a1-4153-9e3b-57f2ca4118b7", + "metadata": {}, + "source": [ + "### Only run once" + ] + }, + { + "cell_type": "markdown", + "id": "1867171a-da42-419e-8a13-b5727fa4f805", + "metadata": {}, + "source": [ + "Create a decorator called `once` that restricts a function to run at most **once every `allowed_time` seconds**, where `allowed_time` is a parameter with a default value of `15`.\n", + "\n", + "If you try to invoke the function too soon, the decorator should raise an exception called `RuntimeError` which tells you how long you need to wait before running your function again.\n", + "The error message should be `Wait another {remaining_time} seconds`, where `remaining_time` is the time left to wait before running the function again." 
+ ] + }, + { + "cell_type": "markdown", + "id": "ee62b302-a6a8-4207-933f-88e2a63a634e", + "metadata": {}, + "source": [ + "For example, the following code:\n", + "\n", + "```python\n", + "import time\n", + "\n", + "@once(15)\n", + "def hello(name):\n", + "    return f\"Hello, {name}!\"\n", + "\n", + "for i in range(30):\n", + "    print(i)\n", + "    try:\n", + "        time.sleep(3)\n", + "        print(hello(f\"attempt #{i}\"))\n", + "    except RuntimeError as err:\n", + "        print(f\"Too soon: {err}\")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "037e2acc-a4bd-4682-84af-861f7874ce92", + "metadata": {}, + "source": [ + "Should print something like:\n", + "\n", + "```\n", + "0\n", + "Hello, attempt #0!\n", + "1\n", + "Too soon: Wait another 12.00 seconds\n", + "2\n", + "Too soon: Wait another 8.99 seconds\n", + "3\n", + "Too soon: Wait another 5.98 seconds\n", + "4\n", + "Too soon: Wait another 2.98 seconds\n", + "5\n", + "Hello, attempt #5!\n", + "6\n", + "Too soon: Wait another 12.00 seconds\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "269e6e73-6a3b-41d7-a50e-25d20d3cf0ba", + "metadata": {}, + "source": [ + "
\n", + "

Note

The decorator should handle any kind of function, i.e., it should not care about the kind or number of parameters the function accepts.\n", + "
\n", + "\n", + "
\n", + "

Important

The tests need to run for some time to check the solution. Don't worry if the execution of the cell below seems to be hanging: it's not.\n", + "
\n", + "\n", + "
\n", + "

Hint

If you are stuck, you can always comment out the line %%ipytest to skip the tests, and enable them again when you think your solution is ready.\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "23bd64fd-470c-4685-8cd2-e489a535b68c", + "metadata": {}, + "outputs": [], + "source": [ + "%%ipytest\n", + "import time\n", + "\n", + "def solution_once(allowed_time: int = 15) -> t.Callable:\n", + " \"\"\"Decorator to run a function at most once per given seconds\"\"\"\n", + " " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/images/cell_object.png b/images/cell_object.png new file mode 100644 index 00000000..6edfb3a5 Binary files /dev/null and b/images/cell_object.png differ diff --git a/index.ipynb b/index.ipynb index 1538118d..a8a1a328 100644 --- a/index.ipynb +++ b/index.ipynb @@ -16,8 +16,17 @@ "- [Modules and packages](./modules_and_packages.ipynb)\n", "\n", "# Advanced tutorial\n", - "- [Manage Python project](./manage_python_project.ipynb)" + "\n", + "- [Manage Python project](./manage_python_project.ipynb)\n", + "- [Advanced functions](./functions_advanced.ipynb)\n" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -36,7 +45,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.1" + "version": "3.10.10" } }, "nbformat": 4, diff --git a/tutorial/common.py b/tutorial/common.py index ec076b3f..1d21a601 100644 --- a/tutorial/common.py +++ b/tutorial/common.py @@ -154,3 +154,17 @@ def clear(self, _=None): for question in self.questions: question.clear() self.output.value = "" + + +class Spoiler(ipw.Accordion): + def __init__(self, title: str, content: str, 
show_content: bool = False): + """A Spoiler widget. + + title: The title of the spoiler. + content: The content of the spoiler. + open: Whether the spoiler is open or closed. + """ + super().__init__() + self.children = [ipw.HTML(content)] + self.set_title(0, title) + self.selected_index = 0 if show_content else None diff --git a/tutorial/functions_advanced.py b/tutorial/functions_advanced.py new file mode 100644 index 00000000..ac026f7e --- /dev/null +++ b/tutorial/functions_advanced.py @@ -0,0 +1,11 @@ +from markdown import markdown + +from .common import Spoiler + +tricky_closures = Spoiler( + "Answer", + markdown( + "Look at the definition of `op()`. Does it reference the variable `n` other than in setting the default value of `n`? **No**. " + "Hence, the variable `n` is **not** a free variable, and the function `op()` is **not** a closure, just a plain function." + ), +) diff --git a/tutorial/tests/test_functions_advanced.py b/tutorial/tests/test_functions_advanced.py new file mode 100644 index 00000000..42433091 --- /dev/null +++ b/tutorial/tests/test_functions_advanced.py @@ -0,0 +1,273 @@ +import pathlib +import time +import typing as t +from string import ascii_lowercase as lowercase + +import pytest + + +# +# Exercise: Password checker factory +# +def reference_password_checker_factory( + min_up: int, min_low: int, min_pun: int, min_dig: int +) -> t.Callable: + """Password checker factory""" + # The `string` module contains a number of useful constants + import string + + # The `sub` function from the operator module can be used to subtract two numbers + # sub(x, y) is equivalent to x - y + from operator import sub + + def password_checker(password: str) -> tuple[bool, dict]: + """Password checker function""" + + # Counts the number of chars for each class in a password + counts = [ + sum(1 for char in password if char in _class) + for _class in ( + string.ascii_uppercase, + string.ascii_lowercase, + string.punctuation, + string.digits, + ) + ] + + # 
Compare with requirements and calculate the differences + diffs = [ + sub(*pair) for pair in zip(counts, (min_up, min_low, min_pun, min_dig)) + ] + + result = dict(zip(("uppercase", "lowercase", "punctuation", "digits"), diffs)) + + return all(diff >= 0 for diff in diffs), result + + return password_checker + + +def test_password_checker_factory_no_min_no_pw(function_to_test: t.Callable): + pc = function_to_test(0, 0, 0, 0) + result, details = pc("") + + assert result + assert len(details) == 4 + for value in details.values(): + assert value == 0 + + +def test_password_checker_factory_no_min_some_pw(function_to_test: t.Callable): + pc = function_to_test(0, 0, 0, 0) + result, details = pc("ABCDefgh!@#$1234") + + assert result + assert len(details) == 4 + for value in details.values(): + assert value == 4 + + +def test_password_checker_factory_simple_good(function_to_test: t.Callable): + pc = function_to_test(1, 2, 3, 4) + result, details = pc("Abc!@#1234") + + assert result + assert len(details) == 4 + for value in details.values(): + assert value == 0 + + +def test_password_checker_factory_simple_bad(function_to_test: t.Callable): + pc = function_to_test(1, 2, 3, 4) + result, details = pc("b!#234") + + assert not result + assert len(details) == 4 + for value in details.values(): + assert value == -1 + + +@pytest.mark.parametrize("onlyset", ["uppercase", "lowercase", "punctuation", "digits"]) +def test_password_checker_factory_only_set_one(onlyset, function_to_test: t.Callable): + for source in ["uppercase", "lowercase", "punctuation", "digits"]: + if onlyset == source: + pw = globals()[source][:4] + + pc = function_to_test(4, 4, 4, 4) + result, details = pc(pw) # type: ignore + + assert not result + assert len(details) == 4 + for key, value in details.items(): + if key == onlyset: + assert value == 0 + else: + assert value == -4 + + +@pytest.mark.parametrize( + "donotset", ["uppercase", "lowercase", "punctuation", "digits"] +) +def 
test_password_checker_factory_only_ignore_one( + donotset, function_to_test: t.Callable +): + pw = "" + for source in ["uppercase", "lowercase", "punctuation", "digits"]: + if donotset == source: + continue + pw += globals()[source][:4] + + pc = function_to_test(4, 4, 4, 4) + result, details = pc(pw) + + assert not result + assert len(details) == 4 + for key, value in details.items(): + if key == donotset: + assert value == -4 + else: + assert value == 0 + + +# +# Exercise: Once per minute +# + + +def hello(name): + return f"Hello {name}!" + + +def reference_once(allowed_time: int = 15) -> t.Callable: + """Decorator to run a function at most once""" + + def decorator(func: t.Callable) -> t.Callable: + timer = 0.0 + + def wrapper(*args, **kwargs) -> t.Any: + """Wrapper""" + nonlocal timer + + if not timer: + timer = time.perf_counter() + return func(*args, **kwargs) + + if (stop := time.perf_counter()) - timer < allowed_time: + raise RuntimeError( + f"Wait another {allowed_time - (stop - timer):.2f} seconds" + ) + + timer = time.perf_counter() + + return func(*args, **kwargs) + + return wrapper + + return decorator + + +def test_once_simple(function_to_test: t.Callable) -> None: + _hello = function_to_test(5)(hello) + assert _hello("world") == "Hello world!" + + +def test_once_twice(function_to_test: t.Callable) -> None: + allowed_time = 5 + _hello = function_to_test(allowed_time)(hello) + + time.sleep(allowed_time) + assert _hello("world") == "Hello world!" + + with pytest.raises(RuntimeError) as err: + _hello("world 2") + + assert err.type is RuntimeError + assert "Wait another 5." in err.value.args[0] + + +def test_once_waiting_not_enough_time(function_to_test: t.Callable) -> None: + allowed_time = 10 + _hello = function_to_test(allowed_time)(hello) + + time.sleep(allowed_time) + assert _hello("world") == "Hello world!" 
+ time.sleep(allowed_time - 1) + + with pytest.raises(RuntimeError) as err: + _hello("world 2") + + assert err.type is RuntimeError + assert "Wait another 1." in err.value.args[0] + + +# +# Exercise: String range +# + + +def reference_str_range(start: str, end: str, step: int = 1) -> t.Iterator[str]: + """Return an iterator of strings from start to end, inclusive""" + for i in range(ord(start), ord(end) + (1 if step > 0 else -1), step): + yield chr(i) + + +def test_str_range_same_start_end(function_to_test: t.Callable): + r = function_to_test("a", "a") + assert iter(r) == r + assert "".join(list(r)) == "a" + + +def test_str_range_simple(function_to_test: t.Callable): + r = function_to_test("a", "c") + assert "".join(list(r)) == "abc" + + +def test_str_range_simple_with_step(function_to_test: t.Callable): + r = function_to_test("a", "c", 2) + assert "".join(list(r)) == "ac" + + +def test_str_range_simple_with_negativestep(function_to_test: t.Callable): + r = function_to_test("c", "a", -2) + assert "".join(list(r)) == "ca" + + +def test_str_range_hebrew(function_to_test: t.Callable): + r = function_to_test("א", "ז", 2) + assert "".join(list(r)) == "אגהז" + + +# +# Exercise Read n lines +# + + +def reference_read_n_lines(filename: str, lines: int): + with open(filename) as file: + while True: + first_line = file.readline() + + if not first_line: + break + + yield first_line + "".join(file.readline() for _ in range(lines - 1)) + + +def create_alphabet_file(tmp_path: pathlib.Path): + d = tmp_path / "sub" + d.mkdir() + p = d / "alphabet.txt" + + text = "\n".join(f"{one_letter*20}" for one_letter in lowercase) + "\n" + + p.write_text(text) + + return p + + +@pytest.mark.parametrize("n,expected", [(1, 26), (2, 13), (3, 9), (4, 7)]) +def test_read_n_lines(tmp_path, n, expected, function_to_test: t.Callable): + p = create_alphabet_file(tmp_path) + + i = function_to_test(p, n) + assert i == iter(i) + assert len(list(i)) == expected