In Isaac Lab, I have actions and observations for six joint angles. I only know that the actions are scaled by 0.5, the second joint's initial position is 0.8, and the third joint's initial position is -0.7. In my URDF configuration, the first joint's angle range is [-150, 150] degrees, the second is [0, 170], the third is [-165, 0], the fourth is [-87, 87], the fifth is [-77, 77], and the last is [-10, 10] degrees. I used a manager-based RL environment and PPO to train my policy. I recorded the following actions applied to the environment and the observations I got while the robot arm was static. How do these actions and observations correspond to each other?
actions is tensor([[-0.0552, 2.1343, -3.4098, 4.9304, 21.5618, 3.0488]],
device='cuda:0')
obs[0:6] is tensor([[-0.0196, 1.0895, -1.6983, 1.5181, 1.3439, 0.1745]],
device='cuda:0')
joint angles are tensor([[-0.0197, 1.8895, -2.3984, 1.5181, 1.3439, 0.1745]],
device='cuda:0')
actions is tensor([[-5.3144, 2.2712, -4.4182, 11.5965, 21.9344, 2.8949]],
device='cuda:0')
obs[0:6] is tensor([[-0.8412, 1.1578, -2.1798, 1.5181, 0.7094, 0.1745]],
device='cuda:0')
joint angles are tensor([[-0.8412, 1.9578, -2.8798, 1.5181, 0.7092, 0.1745]],
device='cuda:0')
actions is tensor([[-4.5678, 1.8938, -3.6141, 10.9934, 16.4971, 5.3105]],
device='cuda:0')
obs[0:6] is tensor([[-0.8171, 0.9566, -1.7877, 1.5184, 0.4591, 0.1745]],
device='cuda:0')
joint angles are tensor([[-0.8170, 1.7565, -2.4878, 1.5183, 0.4591, 0.1745]],
device='cuda:0')
actions is tensor([[-1.4693, 1.6199, -2.4913, 6.2381, 10.3354, 10.9402]],
device='cuda:0')
obs[0:6] is tensor([[-0.7388, 0.8239, -1.2290, 1.5181, 1.3439, 0.1745]],
device='cuda:0')
joint angles are tensor([[-0.7388, 1.6240, -1.9291, 1.5181, 1.3439, 0.1745]],
device='cuda:0')
The difference between obs and joint angles is the offset angles (0.8 and -0.7). But how do the observations match the actions?
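The offset relationship can be checked numerically with the first sample above. This is only a sketch in plain Python: it assumes the default positions of the other four joints are 0 (consistent with obs and joint angles agreeing for them), which would fit a relative joint position observation, though the thread does not say which observation term is used.

```python
# Values copied from the first printed sample in this post.
obs = [-0.0196, 1.0895, -1.6983, 1.5181, 1.3439, 0.1745]
joint_angles = [-0.0197, 1.8895, -2.3984, 1.5181, 1.3439, 0.1745]

# Default (initial) joint positions: 0.8 and -0.7 are from the post,
# the zeros for the remaining joints are an assumption.
default_offset = [0.0, 0.8, -0.7, 0.0, 0.0, 0.0]

# If obs is "joint position relative to default", then obs + default = joint angle.
reconstructed = [o + d for o, d in zip(obs, default_offset)]
max_abs_diff = max(abs(r - q) for r, q in zip(reconstructed, joint_angles))

print("reconstructed joint angles:", [round(r, 4) for r in reconstructed])
print("max |reconstructed - joint_angles|:", round(max_abs_diff, 4))
```

The remaining difference is at the level of the rounding in the printed tensors.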
Hi @hanlin-ga,
This depends on the kind of action term you are using. For the standard JointPositionAction, the actions first go through an affine transform, applied_actions = policy_output * scale + offset (https://github.com/isaac-sim/IsaacLab/blob/main/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/mdp/actions/joint_actions.py#L122). The offset is the default joint positions (if activated in the config).
The applied_actions are then given as the joint position target to the PD control strategy of the actuators, which drives the joint angles towards the desired joint positions. In most PD-controlled systems, the target cannot be reached exactly (due to disturbances such as gravity), and there will always be a steady-state error. As a result, your observations will never exactly match the applied actions, even after taking all the scales and offsets into account.
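As a rough illustration, the affine transform can be applied by hand to the first sample from the question. This sketch uses plain Python rather than Isaac Lab itself and assumes the offset is enabled with default positions [0, 0.8, -0.7, 0, 0, 0]. The clamping step is only a stand-in for a joint stopping at its URDF limit; it is an assumption for comparison, not something confirmed in this thread.

```python
import math

scale = 0.5                                        # from the question
default_offset = [0.0, 0.8, -0.7, 0.0, 0.0, 0.0]   # assumed default joint positions

# First sample from the question.
policy_output = [-0.0552, 2.1343, -3.4098, 4.9304, 21.5618, 3.0488]
joint_angles = [-0.0197, 1.8895, -2.3984, 1.5181, 1.3439, 0.1745]

# URDF joint limits from the question, in degrees.
limits_deg = [(-150, 150), (0, 170), (-165, 0), (-87, 87), (-77, 77), (-10, 10)]
limits_rad = [(math.radians(lo), math.radians(hi)) for lo, hi in limits_deg]

# applied_actions = policy_output * scale + offset
targets = [a * scale + d for a, d in zip(policy_output, default_offset)]

# Assumption: a joint cannot move past its limit, so compare against the
# target clamped to the limit range.
clamped = [min(max(t, lo), hi) for t, (lo, hi) in zip(targets, limits_rad)]

for i, (t, c, q) in enumerate(zip(targets, clamped, joint_angles), start=1):
    print(f"joint {i}: target={t:.4f}  clamped={c:.4f}  measured={q:.4f}")
```

Under these assumptions, the first three targets land close to the measured joint angles, while the last three targets lie far outside the limits and the measured angles sit at 87, 77, and 10 degrees expressed in radians.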
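The steady-state error can be reproduced with a toy model. The sketch below is not Isaac Lab's actuator model: it is a hypothetical single joint with unit inertia, made-up gains, and a constant disturbance torque standing in for gravity, integrated with a simple Euler scheme.

```python
def simulate_pd(target, kp, kd, disturbance, inertia=1.0, dt=0.001, steps=20000):
    """Integrate a 1-DoF joint under PD control plus a constant disturbance torque."""
    q, qd = 0.0, 0.0  # joint position and velocity
    for _ in range(steps):
        torque = kp * (target - q) - kd * qd + disturbance
        qdd = torque / inertia
        qd += qdd * dt
        q += qd * dt
    return q

target = 1.0          # desired joint position (rad)
kp, kd = 100.0, 20.0  # illustrative gains, not taken from any real config
disturbance = -5.0    # constant torque pulling the joint away from the target

q_final = simulate_pd(target, kp, kd, disturbance)
steady_state_error = target - q_final

# At rest, kp * (target - q) + disturbance = 0, so the error is -disturbance / kp.
predicted_error = -disturbance / kp

print(f"final position:  {q_final:.4f}")
print(f"measured error:  {steady_state_error:.4f}")
print(f"predicted error: {predicted_error:.4f}")
```

In this model, a higher stiffness kp shrinks the error but never removes it, which is why the measured joint angles sit near the targets rather than on them.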
Hi Dhoeller19, thank you for the reply! Do you know why some action items are greater than 10? I found that all the joint angles and obs are within [-2pi, 2pi]. For example: actions = tensor([[-5.3144, 2.2712, -4.4182, 11.5965, 21.9344, 2.8949]], device='cuda:0')
This discussion was converted from issue #1167 on October 07, 2024 07:36.