It runs successfully locally, but fails to start the head node on an Ubuntu server. The error reported in the log is as follows. #4

Open
lp1106 opened this issue Apr 28, 2024 · 3 comments

Comments


lp1106 commented Apr 28, 2024

2024-04-28 00:09:16,621 WARNING node_head.py:209 -- Head node is not registered even after 10 seconds. The API server might not work correctly. Please report a Github issue. Internal states :{'head_node_registration_time_s': None, 'registered_nodes': 0, 'registered_agents': 0, 'node_update_count': 100, 'module_lifetime_s': 10.10161828994751}

@matteobettini
Member

Hello, is this an RLlib error? Are you using the suggested Ray version?

@lp1106
Author

lp1106 commented Apr 28, 2024

Yes.
It is very strange: when I run a simple program, no error is reported, but when I run this project, the console reports the following error:
  File "/home/clp/anaconda3/envs/hetgppo/lib/python3.8/site-packages/ray/_private/node.py", line 312, in __init__
    ray._private.services.wait_for_node(
  File "/home/clp/anaconda3/envs/hetgppo/lib/python3.8/site-packages/ray/_private/services.py", line 385, in wait_for_node
    raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_give_way.py", line 205, in <module>
    TrainingUtils.init_ray(scenario_name=scenario_name, local_mode=ON_MAC)
  File "/home/clp/HetGPPO/utils.py", line 73, in init_ray
    ray.init(
  File "/home/clp/anaconda3/envs/hetgppo/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/clp/anaconda3/envs/hetgppo/lib/python3.8/site-packages/ray/_private/worker.py", line 1429, in init
    _global_node = ray._private.node.Node(
  File "/home/clp/anaconda3/envs/hetgppo/lib/python3.8/site-packages/ray/_private/node.py", line 319, in __init__
    raise Exception(
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup.
The log also contains this warning: "2024-04-28 00:09:16,621 WARNING node_head.py:209 -- Head node is not registered even after 10 seconds. The API server might not work correctly. Please report a Github issue. Internal states :{'head_node_registration_time_s': None, 'registered_nodes': 0, 'registered_agents': 0, 'node_update_count': 100, 'module_lifetime_s': 10.10161828994751}"

@matteobettini
Member

I have never seen this. Try reinstalling Ray:

pip install --force-reinstall "ray[rllib]"==2.1.0
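
After reinstalling, it may be worth confirming which Ray version the environment actually resolves. The following is only a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`). The helper names `installed_version` and `matches_expected` are hypothetical and not part of Ray or this project. The expected version 2.1.0 is taken from the command above.

```python
from importlib import metadata
from typing import Optional

# Version taken from the reinstall command in this thread.
EXPECTED_RAY_VERSION = "2.1.0"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_expected(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly version `expected`."""
    return installed_version(package) == expected


if __name__ == "__main__":
    found = installed_version("ray")
    if found is None:
        print("ray is not installed in this environment")
    elif matches_expected("ray", EXPECTED_RAY_VERSION):
        print(f"ray {found} matches the suggested version")
    else:
        print(f"ray {found} is installed, but {EXPECTED_RAY_VERSION} is suggested")
```

Running this inside the same conda environment used for training (here `hetgppo`) shows whether the reinstall took effect there rather than in another interpreter.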
