From 116684bf410b3dcfcd1c8df53517f6c7999209b9 Mon Sep 17 00:00:00 2001
From: Annamalai
Date: Thu, 21 Dec 2023 23:37:40 +0530
Subject: [PATCH] post dec20 fix3

---
 _posts/2023-12-20-TIL_Dec_20.md | 44 +++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 _posts/2023-12-20-TIL_Dec_20.md

diff --git a/_posts/2023-12-20-TIL_Dec_20.md b/_posts/2023-12-20-TIL_Dec_20.md
new file mode 100644
index 0000000000000..58b1567cb1130
--- /dev/null
+++ b/_posts/2023-12-20-TIL_Dec_20.md
@@ -0,0 +1,44 @@
---
layout: post
title: TIL(20/12/23)
---

## Bayesian Optimization

**Bayesian optimization** is a powerful strategy for finding the extrema of objective functions that are expensive to evaluate. It is particularly useful when one does not have access to derivatives, when the problem at hand is non-convex, or when each evaluation is costly.

```
The Bayesian Optimization algorithm can be summarized as follows:

1. Select a Sample by Optimizing the Acquisition Function.
2. Evaluate the Sample With the Objective Function.
3. Update the Data and, in turn, the Surrogate Function.
4. Go To 1.
```

- It uses two components:
  - **Surrogate function** - approximates the input/output relationship of the samples evaluated so far. There are many ways to model it; one way is to use a Random Forest or a Gaussian Process (GP, with many different kernels). In effect we are approximating the *objective function* with something that can be *easily sampled*.
  - **Acquisition function** - proposes the next sample to be evaluated by the objective function. That sample is found by optimizing this function (by various methods), and it balances exploitation and exploration (E&E) [1].

- It is widely used for tuning hyperparameters, e.g. in Optuna and HyperOpt. (A rough code sketch of the loop above is at the end of this post.)

## Optuna

- [API Reference](https://optuna.readthedocs.io/en/stable/reference/index.html)

- This [video](https://www.youtube.com/watch?v=t-INgABWULw) gives a good intro.

I am trying to use it for HPO of the Lunar Lander environment; initially the results weren't that good. I think it's because I didn't give a *proper interval*, i.e. I used such a large interval that it didn't lead to a good choice of hyperparameters. Maybe I have to try other ways to make it work. (A small Optuna sketch is also at the end of this post.)

## Paper Reading

- I'm reading the paper "[Improving Environment Robustness of Deep Reinforcement Learning Approaches for Autonomous Racing Using Bayesian Optimization-based Curriculum Learning](https://arxiv.org/pdf/2312.10557.pdf)", which I saw on arXiv (in cs.RO) today. The authors propose a method for **automated curriculum selection**, which is what led me to Bayesian Optimization.

- *Curriculum Learning* is about **spoon-feeding the agent different environments to explore**, so that we can get a robust policy.


## References

1. https://machinelearningmastery.com/what-is-bayesian-optimization/
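
## Appendix: Code Sketches

To make the Bayesian Optimization loop above concrete, here is a minimal sketch, assuming a Gaussian Process surrogate (via scikit-learn) and an Expected Improvement acquisition maximized by plain random search over candidate points. The 1-D toy objective and all the names in it are my own illustrative choices, not taken from [1] or the paper.

```python
# Minimal Bayesian Optimization sketch: GP surrogate + Expected Improvement.
# The objective below is a toy stand-in for an expensive black-box function.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Pretend this is expensive and unknown (toy 1-D example).
    return -(x - 0.7) ** 2 + 0.1 * np.sin(20 * x)

def expected_improvement(X_cand, gp, y_best, xi=0.01):
    # Acquisition function: trades off exploitation (high mean)
    # against exploration (high predictive uncertainty).
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))            # initial random samples
y = objective(X).ravel()
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    gp.fit(X, y)                              # 3. update the surrogate
    X_cand = rng.uniform(0, 1, size=(1000, 1))
    ei = expected_improvement(X_cand, gp, y.max())
    x_next = X_cand[np.argmax(ei)]            # 1. optimize the acquisition
    y_next = objective(x_next)                # 2. evaluate the objective
    X = np.vstack([X, x_next])
    y = np.append(y, y_next)

print("best x:", X[np.argmax(y)], "best y:", y.max())
```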
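
And here is a minimal Optuna sketch in the spirit of what I'm trying for the Lunar Lander HPO. `train_and_evaluate` is a hypothetical stand-in for a real training run (e.g. a short PPO run returning mean episode reward); it is faked here so the snippet runs on its own, and the ranges are only examples of keeping the search intervals reasonably tight rather than overly wide.

```python
import math
import optuna

def train_and_evaluate(learning_rate, gamma):
    # Placeholder for an actual Lunar Lander training run; here we just
    # pretend the "true" best settings are lr ~ 3e-4 and gamma ~ 0.99.
    return -((math.log10(learning_rate) + 3.5) ** 2) - 100 * (gamma - 0.99) ** 2

def objective(trial):
    # Reasonably tight, log-scaled ranges; overly wide intervals are exactly
    # what made my first attempts wander.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.9999)
    return train_and_evaluate(lr, gamma)

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```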