---
layout: post
title: TIL(20/12/23)
---

## Bayesian Optimization

**Bayesian optimization** is a powerful strategy for finding the extrema of objective functions that are expensive to evaluate. It is particularly useful when each evaluation is costly, when one does not have access to derivatives, or when the problem at hand is non-convex.

```
The Bayesian Optimization algorithm can be summarized as follows:
1. Select a Sample by Optimizing the Acquisition Function.
2. Evaluate the Sample With the Objective Function.
3. Update the Data and, in turn, the Surrogate Function.
4. Go To 1.
```

- It uses a
  - **Surrogate function** - approximates the relationship between the sampled inputs and their objective values. There are many ways to model it; one is a Random Forest or a Gaussian Process (GP, with many different kernels), i.e. here we're approximating the *objective function* with something that can be *easily sampled*.
  - **Acquisition function** - proposes the next sample that is to be evaluated by the objective function. The sample is found by optimizing this function with various methods, and it balances exploitation and exploration (E&E)[1].

- It is widely used for hyperparameter tuning, e.g. in Optuna and HyperOpt. A minimal sketch of the full loop is just below.

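To make the four steps above concrete, here is a minimal, self-contained sketch of the loop on a toy 1-D problem, assuming a GP surrogate and an expected-improvement acquisition; the `objective` and `expected_improvement` functions are my own illustrative choices, not from the reference.

```
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Stand-in for an expensive black-box function (we maximize it).
    return -np.sin(3 * x) - x**2 + 0.7 * x

def expected_improvement(X_cand, gp, y_best, xi=0.01):
    # EI balances exploitation (high mean) and exploration (high std).
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(4, 1))            # a few initial random samples
y = objective(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(20):
    gp.fit(X, y)                               # 3. update the surrogate
    X_cand = np.linspace(-2, 2, 1000).reshape(-1, 1)
    ei = expected_improvement(X_cand, gp, y.max())
    x_next = X_cand[np.argmax(ei)]             # 1. optimize the acquisition
    y_next = objective(x_next)                 # 2. evaluate the objective
    X, y = np.vstack([X, [x_next]]), np.append(y, y_next)

print("best x:", X[np.argmax(y)], "best y:", y.max())
```
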
## Optuna

- [API Reference](https://optuna.readthedocs.io/en/stable/reference/index.html)

- This [video](https://www.youtube.com/watch?v=t-INgABWULw) gives a good intro.

I am trying to use it for HPO of the lunar lander environment; initially the results weren't that good. I think it's because I'm not giving a *proper interval*, i.e. a large interval won't result in a good choice of HPs. Maybe I have to try other ways to make it work; a sketch of how I'd narrow the search space is below.

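As a note to self, here's a hedged sketch of narrowing the search space with Optuna's `suggest_float`; the parameter ranges and the dummy `train_and_evaluate` are my own placeholders, not a working lunar-lander setup.

```
import optuna

def train_and_evaluate(lr, gamma):
    # Placeholder for training a lunar-lander agent and returning its
    # mean episodic reward; a real run would call the RL training loop.
    return -(lr - 3e-4) ** 2 - (gamma - 0.99) ** 2

def objective(trial):
    # log=True spreads samples across orders of magnitude, so a tight,
    # log-scaled range beats one huge linear interval.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.9999)
    return train_and_evaluate(lr, gamma)

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params)
```
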
## Paper Reading

- I'm reading the paper "[Improving Environment Robustness of Deep Reinforcement Learning Approaches for Autonomous Racing Using Bayesian Optimization-based Curriculum Learning](https://arxiv.org/pdf/2312.10557.pdf)", which I saw on arXiv (in cs.RO) today. They propose a method for **automated curriculum selection**, which led me to Bayesian Optimization.

- *Curriculum Learning* is about **spoon-feeding the agent different environments to explore**, so that we end up with a robust policy; a toy sketch of the idea follows.

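To pin the idea down, here is a toy, self-contained sketch of a hand-picked curriculum (not the paper's BO-based selection); the scalar "policy", the difficulty levels, and the `train` routine are all stand-ins I made up.

```
import random

def episode_reward(policy, difficulty):
    # A made-up task: the policy "solves" a level once its parameter
    # exceeds the difficulty.
    return 1.0 if policy >= difficulty else 0.0

def train(policy, difficulty, steps=500):
    # Toy hill-climbing stand-in for an RL training loop.
    for _ in range(steps):
        candidate = policy + random.uniform(-0.05, 0.1)
        if episode_reward(candidate, difficulty) >= episode_reward(policy, difficulty):
            policy = candidate
    return policy

random.seed(0)
policy = 0.0
for difficulty in [0.2, 0.5, 0.8]:      # the curriculum: easy -> hard
    policy = train(policy, difficulty)  # warm-start from the last stage
    print(f"difficulty {difficulty}: reward {episode_reward(policy, difficulty)}")
```
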
## References

1. https://machinelearningmastery.com/what-is-bayesian-optimization/