Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Resubmit #47871 & #48030 and Fix the Failed Test for Windows #48054

Merged
merged 6 commits into from
Oct 18, 2024

Conversation

MengjinYan
Copy link
Collaborator

@MengjinYan MengjinYan commented Oct 16, 2024

Why are these changes needed?

Related issue number

Closes #48037

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@MengjinYan MengjinYan added the go add ONLY when ready to merge, run all tests label Oct 16, 2024
Signed-off-by: Mengjin Yan <[email protected]>
Copy link
Collaborator

@aslonnie aslonnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this PR ready for review yet?

@@ -1,10 +1,10 @@
DASHBOARD_PROVISIONING_TEMPLATE = """
apiVersion: 1
# DASHBOARD_PROVISIONING_TEMPLATE = """
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these lines commented out?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out!

This file and python/ray/dashboard/modules/metrics/grafana_datasource_template.py should be deleted as I moved the content to the more general template file: python/ray/dashboard/modules/metrics/templates.py. Something might be wrong when I reapplying the patch from the previous PR #47871. I'll update the PR to remove the files.

@MengjinYan
Copy link
Collaborator Author

MengjinYan commented Oct 17, 2024

is this PR ready for review yet?

The original changes are reverted because a windows test failed due to the change. I did the fix and now waiting for all the CI tests to pass. It will be ready for review after the CI tests passed.

2. Make the test that specifies the temp-dir parameter doesn't run in Windows. This is because the specified path doesn't apply to Windows

Signed-off-by: Mengjin Yan <[email protected]>
@rynewang
Copy link
Contributor

This is surprising since I thought Python can treat / as dir separator on Windows as well. But anyway LGTM

@MengjinYan
Copy link
Collaborator Author

This is surprising since I thought Python can treat / as dir separator on Windows as well. But anyway LGTM

is this PR ready for review yet?

The original changes are reverted because a windows test failed due to the change. I did the fix and now waiting for all the CI tests to pass. It will be ready for review after the CI tests passed.

@aslonnie The PR passed the Windows test and I made sure the reverted changes are all included in this one so it should be ready to review.

@MengjinYan
Copy link
Collaborator Author

This is surprising since I thought Python can treat / as dir separator on Windows as well. But anyway LGTM

Thanks! I think there are 2 things we hit in the Windows test:

  1. The temp path in Windows are different from that on linux. On Windows, it is "C:\Users..." while on Linux, we are using "/tmp/ray", so we cannot assume the temp path to always be "/tmp/ray"
  2. Originally in the test, I used formatted string to generate the path instead of os.path. This cause the mismatch of the separator.

@rynewang rynewang merged commit 75f6047 into master Oct 18, 2024
6 checks passed
@rynewang rynewang deleted the issue-41648-2 branch October 18, 2024 20:32
Copy link
Contributor

@angelinalg angelinalg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some style nits if they can be fixed in a separate PR.

# The function assumes the Ray cluster to be monitored by Prometheus uses the
# default configuration with "/tmp/ray" as the default root temporary directory.
#
# This is to support the `ray metrics launch-prometheus` command, when a ray cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...when a Ray cluster...

# default configuration with "/tmp/ray" as the default root temporary directory.
#
# This is to support the `ray metrics launch-prometheus` command, when a ray cluster
# is not started yet and we have no way to get a `--temp-dir` anywhere. So we choose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasn't started yet and the user doesn't have a way to.....

shutil.rmtree(grafana_config_output_path)
os.makedirs(os.path.dirname(grafana_config_output_path), exist_ok=True)
shutil.copytree(GRAFANA_CONFIG_INPUT_PATH, grafana_config_output_path)
# Create grafana configuration folder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create Grafana configuration folder.

shutil.rmtree(self._grafana_config_output_path)
os.makedirs(self._grafana_config_output_path, exist_ok=True)

# Overwrite grafana's configuration file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verwrite Grafana's configuration file.

GRAFANA_INI_TEMPLATE.format(
grafana_provisioning_folder=grafana_prov_folder_with_latest_session
)
)

# Overwrite grafana's dashboard provisioning directory based on env var
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overwrite Grafana's dashboard provisioning directory based on env var.

# scrape_timeout is set to the global default (10s).
scrape_configs:
# Scrape from each ray node as defined in the service_discovery.json provided by ray.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ray node...provided by Ray.
(capitalize Ray)

def test_prometheus_config_content():
# Test to make sure the content in the hardcoded file
# (python/ray/dashboard/modules/metrics/export/prometheus/prometheus.yml) will
# always be the same as the template (templates.py) used to generate prometheus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prometheus
(capitalize Prometheus)

# Test to make sure the content in the hardcoded file
# (python/ray/dashboard/modules/metrics/export/prometheus/prometheus.yml) will
# always be the same as the template (templates.py) used to generate prometheus
# config file when ray startup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when Ray starts up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI test windows://python/ray/tests:test_metrics_head is consistently_failing
6 participants