[Bug] KubeRay worker group pod keeps restarting on EKS - falls into CrashLoopBackOff #2420
Open
Labels: bug, external-author-action-required, P1 (should be fixed within a few weeks)
Search before asking
KubeRay Component
Others
What happened + What you expected to happen
I am following steps 2-5 here on an Amazon EKS cluster. I can run a job and access the dashboard; however, the worker pods keep restarting (K9s screenshot attached).
Logs of the ray-worker can be found below:
Running the same steps on kind works as expected: the worker pod reaches the Ready state and does not fail.
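For anyone hitting the same symptom, a minimal diagnostic sketch for a worker pod stuck in CrashLoopBackOff (this assumes the pods carry KubeRay's `ray.io/node-type=worker` label and live in the current namespace; adjust the selector and namespace to match your install, and note these commands require access to the live cluster):

```shell
# List the Ray worker pods and their restart counts.
kubectl get pods -l ray.io/node-type=worker

# Grab the name of the first (failing) worker pod.
POD=$(kubectl get pods -l ray.io/node-type=worker \
  -o jsonpath='{.items[0].metadata.name}')

# Pull logs from the *previous* (crashed) container instance;
# a CrashLoopBackOff pod's current logs are often empty.
kubectl logs "$POD" --previous

# Pod events usually show the underlying reason
# (OOMKilled, failed probes, image pull errors, etc.).
kubectl describe pod "$POD" | tail -n 30
```

Comparing the `--previous` logs and the `describe` events between the EKS and kind clusters is usually the quickest way to spot an environment-specific difference (resource limits, networking, image pulls).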
Reproduction script
Anything else
No response
Are you willing to submit a PR?