drone in turbulent air flow #211
Conversation
Peter, I always enjoy seeing your work and your learning process. It is a very good thought to add randomness, because that happens in the real world too, and control strategies need to be robust. But trajectory optimization won't give you a more robust control strategy, unfortunately. If you put rand() functions in your system, IPOPT won't be able to solve, because it is gradient-based and can't find consistent search directions. IPOPT will, however, solve if you "precalculate" a random signal, store it, and then use it as a time-dependent perturbing force F(t) in your dynamics. But then the turbulence is perfectly predictable, and IPOPT will find a solution that perfectly compensates for the anticipated perturbations, or even takes advantage of them to achieve the goal.

One way to handle that in trajectory optimization is to optimize over several (maybe 10) instances of the same task, each with its own randomly precalculated perturbation signal F(t). The optimization minimizes the total cost over all instances and uses the same controller for all of them [1,2]. When the system is unstable, that controller must have a feedback component to make that work. Solving becomes very hard when you increase the number of instances, and with a low number there is still a risk that the optimal controller has specialized too much for the specific instances of perturbation and is still not robust. I don't think (so far) that this is a practical or reliable approach.

The good news, though, is that there is such a thing as the "certainty equivalence principle", which says that the optimal control solution for the deterministic problem (without turbulence) is the same as the solution of the stochastic problem when two things are true: (1) the dynamics are linear, and (2) the cost function is linear or quadratic in the states and controls. The drone problem may be close enough to linear-quadratic.
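As an aside, a minimal NumPy sketch of what "precalculating" the random signal looks like (all names and numbers here are illustrative, not opty's API):

```python
import numpy as np

# Precompute the random turbulence ONCE, with a fixed seed, so that every
# evaluation of the dynamics sees exactly the same stored signal.
rng = np.random.default_rng(42)
duration = 5.0                  # total duration of the motion [s] (assumed)
n_nodes = 501
t_nodes = np.linspace(0.0, duration, n_nodes)
sigma = 2.0                     # turbulence strength [N] (assumed)
f_nodes = sigma * rng.standard_normal(n_nodes)

def F(t):
    """Deterministic, time-dependent perturbing force F(t).

    Linear interpolation of the stored samples: repeated calls with the
    same t always return the same value, so a gradient-based solver such
    as IPOPT sees a consistent, differentiable problem.
    """
    return np.interp(t, t_nodes, f_nodes)

# The equations of motion then include F(t) as a known external force,
# e.g. m*xddot = u(t) + F(t), instead of calling rand() inside the model.
```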
A simple example where certainty equivalence does not hold is the so-called "cliff walking" problem. You have to walk from point A to point B. The straight line from A to B gets close to a cliff but does not fall off, so that is the optimal path in the deterministic case: the cliff does not influence your performance and is not even seen. When it is (randomly) windy, the straight path is too dangerous. A stochastic trajectory optimization should not give you that solution if you have defined the problem so that falling off the cliff is very costly (large effort required to climb back, etc.). The optimal strategy in the stochastic case is a curved path from A to B that keeps a larger distance from the cliff.

Such truly stochastic problems seem more suitable for methods such as reinforcement learning, which can handle rand() functions and actually uses randomness intentionally to explore the system and the solution space. You can also use trajectory optimization with an optimization method that can handle randomness, such as CMA (covariance matrix adaptation). CMA does not use gradients and tolerates random variations when evaluating the cost function; it is more or less how nature optimizes. Simulated annealing and genetic algorithms are also in that class. However, such methods cannot handle constraints. Unstable systems need feedback control. You have to formulate the trajectory optimization problem without constraints, which means shooting rather than collocation. We did this in [3].

In relation to this pull request, I think it's probably not suitable for the examples gallery.

[1] Koelewijn AD, van den Bogert AJ (2020) A solution method for predictive simulations in a stochastic environment. J Biomech 104:109759.
[2] Wang H, van den Bogert AJ (2020) Identification of the human postural control system through stochastic trajectory optimization. J Neurosci Meth 334:108580.
[3] Koelewijn AD, van den Bogert AJ (2022) Antagonistic co-contraction can minimize muscular effort in systems with uncertainty. PeerJ 10:e13085.
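The cliff-walking example above can be sketched in a few lines of Python. This is a toy reconstruction under assumed numbers (cliff position, wind strength, fall penalty), using a simple (1+1) evolution strategy as a gradient-free stand-in for CMA:

```python
import numpy as np

rng = np.random.default_rng(0)

n_way = 8            # free path waypoints between fixed endpoints at y = 0
cliff_y = -0.5       # cliff edge; y below this means falling off (assumed)
wind_sigma = 0.4     # wind strength (assumed)
n_draws = 200        # wind samples used to estimate the expected cost

# Fixed set of wind draws (common random numbers), so repeated
# evaluations of the same candidate path are directly comparable.
wind = wind_sigma * rng.standard_normal((n_draws, n_way))

def expected_cost(y):
    # effort: stay close to the straight line A->B (y = 0 everywhere)
    effort = np.sum(y**2)
    # penalty: large cost whenever wind pushes the walker off the cliff
    fell = (y[None, :] + wind) < cliff_y
    penalty = 100.0 * fell.sum(axis=1)
    return effort + penalty.mean()

# (1+1) evolution strategy: no gradients, so the noisy, penalized cost
# is no obstacle.  CMA-ES would be the more capable relative of this.
y = np.zeros(n_way)          # start from the deterministic optimum
best = expected_cost(y)
step = 0.3
for _ in range(2000):
    trial = y + step * rng.standard_normal(n_way)
    c = expected_cost(trial)
    if c < best:
        y, best = trial, c

print("mean offset from the straight line:", y.mean())
```

With the wind on, the optimizer moves the path away from the cliff (positive mean offset), even though the straight line is optimal in the deterministic problem.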
Dear Ton, this cliff problem makes complete sense!! I love such examples! My goal is to play around with sympy.physics.mechanics and opty, and hopefully be able to contribute. For example, for KanesMethod I think I found a way to handle non-linear motion constraints, but
Could you send me a link to [1]? I'd like to read it. Following your advice, which I totally agree with, I will close this PR. Thanks, Peter
Googling the title finds a full-text link at PubMed, which would require access to the Journal of Biomechanics. It also finds a preprint on bioRxiv, but I am not sure whether that is the final version. If you would like a PDF of the final publication, send me a request at [email protected].
This is the drone already in the examples gallery. I just added some turbulence, modeled as white noise acting on the center of mass of the drone. Basically, I just wanted to see how opty would work under such irregular input. Up to a certain strength of the white noise it seems fine; of course the running time is longer, 48 s in this case.
I do not think it is worth putting in the examples gallery, as it is such a small modification.
@moorepants if you agree with me, simply close this PR.
(I put it here for two reasons: