[Bug] Old RayServices not deleted after operator update to 1.2.1 #2374

Open
1 of 2 tasks
aravind-anantha opened this issue Sep 11, 2024 · 8 comments
Labels
bug (Something isn't working), triage

Comments

@aravind-anantha

Search before asking

  • I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

After updating the KubeRay operator to 1.2.1, new Ray services were created and are up and running as expected. However, the old Ray services were never deleted. The expected behavior is for the old services to be torn down once the new ones are up, but it did not happen.

Reproduction script

I followed the instructions at https://docs.ray.io/en/latest/cluster/kubernetes/user-guides/upgrade-guide.html to upgrade the operator to 1.2.1.

Reconciling the ingress and service resources on the active Ray cluster. No pending Ray cluster found.

The message above appears in the operator logs, but the old cluster still exists and is left dangling.
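One way to confirm which cluster is dangling is to compare the RayClusters that originate from the RayService against the ones its status still references. The following is only a sketch under assumptions: it expects the parsed JSON of the RayService and of the RayCluster list (for example from `kubectl get ... -o json`), it assumes the status fields `activeServiceStatus.rayClusterName` and `pendingServiceStatus.rayClusterName`, and it assumes the operator sets the label `ray.io/originated-from-cr-name` on clusters it creates. The resource names in the sample data are hypothetical.

```python
# Assumption: label the operator puts on RayClusters created for a RayService.
ORIGIN_LABEL = "ray.io/originated-from-cr-name"


def find_dangling_clusters(ray_service, ray_clusters):
    """Return names of RayClusters that originate from ray_service but are
    referenced neither as its active nor as its pending cluster."""
    status = ray_service.get("status") or {}
    in_use = {
        (status.get("activeServiceStatus") or {}).get("rayClusterName"),
        (status.get("pendingServiceStatus") or {}).get("rayClusterName"),
    }
    service_name = ray_service["metadata"]["name"]

    dangling = []
    for cluster in ray_clusters:
        meta = cluster["metadata"]
        labels = meta.get("labels") or {}
        if labels.get(ORIGIN_LABEL) == service_name and meta["name"] not in in_use:
            dangling.append(meta["name"])
    return dangling


# Hypothetical sample data shaped like the parsed kubectl JSON output.
ray_service = {
    "metadata": {"name": "my-rayservice"},
    "status": {
        "activeServiceStatus": {"rayClusterName": "my-rayservice-raycluster-new"},
        "pendingServiceStatus": {},  # matches "No pending Ray cluster found."
    },
}
ray_clusters = [
    {"metadata": {"name": "my-rayservice-raycluster-old",
                  "labels": {ORIGIN_LABEL: "my-rayservice"}}},
    {"metadata": {"name": "my-rayservice-raycluster-new",
                  "labels": {ORIGIN_LABEL: "my-rayservice"}}},
    {"metadata": {"name": "unrelated-cluster", "labels": {}}},
]

print(find_dangling_clusters(ray_service, ray_clusters))
```

Any name this returns is a cluster the RayService no longer points to, which is the situation described above.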

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@aravind-anantha added the bug and triage labels Sep 11, 2024
@andrewsykim
Collaborator

> The expected behavior is for the old services to be torn down once the new ones are up but it did not happen.

To clarify, do you mean old RayService or Service?

@aravind-anantha
Author

> > The expected behavior is for the old services to be torn down once the new ones are up but it did not happen.
>
> To clarify, do you mean old RayService or Service?

Yes, I meant the old RayService

@andrewsykim
Collaborator

I don't think upgrading KubeRay to v1.2.1 is supposed to delete existing RayService objects. Can you share a link to code or documentation that implies that we do this?

@andrewsykim
Collaborator

Or are you referring to the "old" RayCluster used for the RayService?

@aravind-anantha
Author

> Or are you referring to the "old" RayCluster used for the RayService?

Yes, the old cluster being used for the RayService is not being deleted.

@andrewsykim
Collaborator

I believe that's expected. If you upgrade KubeRay without changing any fields of the RayService, the RayCluster should not be replaced.

@aravind-anantha
Author

> I believe that's expected. If you upgrade KubeRay without changing any fields of the RayService, the RayCluster should not be replaced.

Oh, we also made another change to the resources, specifically the amount of memory the headGroup requests. Could something have gone wrong because both changes happened around the same time?

@andrewsykim
Collaborator

Yes, if you change some property of the RayService, it will trigger a replacement of the RayCluster. However, this doesn't apply to every field: changes to fields like minReplicas and maxReplicas do not trigger a replacement.
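To illustrate the rule, here is a minimal sketch of a check that ignores scaling-only fields when comparing two cluster specs. It is not KubeRay's actual implementation. The function names, the memory values, and the spec layout in the sample are assumptions made for illustration, and only minReplicas and maxReplicas (the two fields named above) are treated as scaling-only.

```python
import copy
import json

# Assumption: only the fields named in this thread are treated as
# "scaling only". The operator may exclude other fields as well.
SCALING_ONLY_FIELDS = {"minReplicas", "maxReplicas"}


def strip_scaling_fields(value):
    """Return a copy of value with scaling-only keys removed at every depth."""
    if isinstance(value, dict):
        return {
            key: strip_scaling_fields(item)
            for key, item in value.items()
            if key not in SCALING_ONLY_FIELDS
        }
    if isinstance(value, list):
        return [strip_scaling_fields(item) for item in value]
    return value


def would_replace_cluster(old_spec, new_spec):
    """True if the specs differ in anything other than scaling-only fields."""
    old = json.dumps(strip_scaling_fields(old_spec), sort_keys=True)
    new = json.dumps(strip_scaling_fields(new_spec), sort_keys=True)
    return old != new


# Hypothetical cluster spec, trimmed to the fields relevant here.
base = {
    "headGroupSpec": {"template": {"spec": {"containers": [
        {"name": "ray-head", "resources": {"requests": {"memory": "4Gi"}}},
    ]}}},
    "workerGroupSpecs": [
        {"groupName": "workers", "minReplicas": 1, "maxReplicas": 5},
    ],
}

# Only maxReplicas changes: treated as a scaling change.
scaled = copy.deepcopy(base)
scaled["workerGroupSpecs"][0]["maxReplicas"] = 10

# The head group's memory request changes, as described in this thread.
resized = copy.deepcopy(base)
head = resized["headGroupSpec"]["template"]["spec"]["containers"][0]
head["resources"]["requests"]["memory"] = "8Gi"

print(would_replace_cluster(base, scaled))
print(would_replace_cluster(base, resized))
```

Under these assumptions, the memory change on the head group counts as a spec change that replaces the RayCluster, while the maxReplicas change alone does not.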
