Issues using MaxUCB criterion #38

JvThunder · 2023-10-22T15:17:22Z

I was trying to use POMCPOWSolver(criterion=MaxUCB(1.0)) for my project, but I got an error.
I then tried a very simple environment with simple transitions and got the same error, as in the following code:

from julia.POMDPs import solve, simulate
from julia.POMDPTools import Deterministic, HistoryRecorder, RandomPolicy
from julia.POMCPOW import POMCPOWSolver, MaxUCB
from julia.CommonRLSpaces import Box
from quickpomdps import QuickPOMDP

def transition(state, action):
    return Deterministic([state[0] + 1])

def observation(state, action, next_state):
    return Deterministic(next_state)

def reward(state, action, next_state):
    return 1

def terminal(state):
    return (state[0] >= 2)

pomdp = QuickPOMDP(
    states = Box([0], [3]),
    actions = Box([0], [1]),
    observations = Box([0], [3]),
    discount = 0.9,
    isterminal = terminal,
    transition = transition,
    observation = observation,
    reward = reward,
    initialstate = Deterministic([1])
)

# This works well:
# solver = POMCPOWSolver(max_time = 1, tree_queries = 15)
# This fails with "MethodError: no method matching insert!":
solver = POMCPOWSolver(criterion=MaxUCB(1.0))

policy = solve(solver, pomdp)
hr = HistoryRecorder(max_steps=2)
hist = simulate(hr, pomdp, policy)
rhist = simulate(hr, pomdp, RandomPolicy(pomdp))

it = 0
for step in hist:
    print(f"____step:{it}____")
    print("State: ", step.s)
    print("Action: ", step.a)
    print("Reward: ", step.r)
    print("__________________")
    it += 1

Note that I am using python-jl to run this. POMCPOWSolver(max_time = 1, tree_queries = 15) works fine, so I think the issue is related to MaxUCB. The error I got is:

Traceback (most recent call last):
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 308, in main
    python(**vars(ns))
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/site-packages/julia/pseudo_python_cli.py", line 59, in python
    scope = runpy.run_path(script, run_name="__main__")
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/jvthunder/anaconda/envs/pomdp/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "test.py", line 60, in <module>
    hist = simulate(hr, pomdp, policy)
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching insert!(::POMCPOW.CategoricalVector{Tuple{StaticArraysCore.SVector{1, Float64}, Float64}}, ::Tuple{Vector{Int64}, Float64}, ::Float64)

Closest candidates are:
  insert!(!Matched::DataStructures.SortedMultiDict{K, D, Ord}, ::Any, ::Any) where {K, D, Ord<:Base.Order.Ordering}
   @ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/sorted_multi_dict.jl:167
  insert!(::POMCPOW.CategoricalVector{T}, !Matched::T, ::Float64) where T
   @ POMCPOW ~/.julia/packages/POMCPOW/f6XAQ/src/categorical_vector.jl:12
  insert!(!Matched::DataStructures.BalancedTree23{K, D, Ord}, ::Any, ::Any, !Matched::Bool) where {K, D, Ord<:Base.Order.Ordering}
   @ DataStructures ~/.julia/packages/DataStructures/MKv4P/src/balanced_tree.jl:358
  ...

Can you please tell me how to make this work with MaxUCB?


zsunberg · 2023-10-23T20:53:09Z

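The error message suggests that POMCPOW expects an SVector{1, Float64} where your transition supplies a Vector{Int64}. You might be able to fix the error by adding from julia.StaticArrays import SVector and then replacing your current transition function with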

def transition(state, action):
    return Deterministic(SVector(state[0] + 1))

Let me know if that works and/or if you need more explanation.
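
For readers unfamiliar with Julia's dispatch, here is a minimal plain-Python sketch of the kind of mismatch the MethodError reports. It is only an analogy, not POMCPOW code: TypedCategoricalVector, next_state_mismatched, and next_state_matched are hypothetical names, a fixed-length tuple of floats stands in for SVector{1, Float64}, and a list of ints stands in for Vector{Int64}.

```python
from typing import List, Sequence, Tuple

# Stand-in for SVector{1, Float64}: a fixed-length tuple of floats.
State = Tuple[float, ...]


class TypedCategoricalVector:
    """Hypothetical container whose element type is fixed when it is created."""

    def __init__(self, state_len: int) -> None:
        self.state_len = state_len
        self.items: List[Tuple[State, float]] = []
        self.weights: List[float] = []

    def insert(self, item, weight: float) -> None:
        state, reward = item
        # Imitate strict dispatch: accept only a float tuple of the right length.
        matches = (
            isinstance(state, tuple)
            and len(state) == self.state_len
            and all(type(x) is float for x in state)
        )
        if not matches:
            raise TypeError(f"no method matching insert for state {state!r}")
        self.items.append((state, float(reward)))
        self.weights.append(weight)


def next_state_mismatched(state: Sequence[float]) -> list:
    # Returns a list of int, analogous to Vector{Int64}.
    return [int(state[0]) + 1]


def next_state_matched(state: Sequence[float]) -> State:
    # Returns a fixed-length float tuple, analogous to SVector{1, Float64}.
    return (float(state[0]) + 1.0,)


cv = TypedCategoricalVector(state_len=1)
cv.insert((next_state_matched((1.0,)), 1.0), 1.0)  # accepted

try:
    cv.insert((next_state_mismatched((1.0,)), 1.0), 1.0)
    rejected = False
except TypeError:
    rejected = True  # the list-of-int state is refused
```

The point of the sketch is that the container itself is fine; only the type of the state returned by the transition has to match what the container was built for.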

zsunberg · 2023-10-24T02:08:06Z

I think that #39 fixes the problem, so you should now be able to use the original code anyway.
