snc-library: Delete failed pods before checking pod availability
Sometimes a pod goes into the `ContainerStatusUnknown` state, where it is
not able to send its status to the kubelet. It stays there until it is
manually deleted, and because of that our snc script fails. In this PR we
delete the pods that are in the Failed phase (which is also what a
`ContainerStatusUnknown` pod reports) and then check pod availability.

```
+ sleep 256
+ all_pods_are_running_completed none
+ local ignoreNamespace=none
+ ./openshift-clients/linux/oc get pod --no-headers --all-namespaces '--field-selector=metadata.namespace!=none'
+ grep -v Running
+ grep -v Completed
openshift-kube-apiserver                           installer-11-crc                                         0/1   ContainerStatusUnknown   1                19m
+ exit=1
+ wait=512
+ count=10
+ '[' 10 -lt 10 ']'
+ echo 'Retry 10/10 exited 1, no more retries left.'
Retry 10/10 exited 1, no more retries left.
```
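
The trace above ends after ten attempts with a last sleep of 256 seconds, which is consistent with a retry loop that doubles its wait each time. A minimal sketch of such a loop follows; `retry_sketch` and `RETRY_SLEEP_CMD` are hypothetical names used for illustration, and this is not necessarily how the retry helper in snc-library.sh is written.

```shell
# Hypothetical sketch of a retry helper with doubling waits; NOT the actual
# function from snc-library.sh. RETRY_SLEEP_CMD is an assumption added so the
# sketch can be exercised without really sleeping (defaults to `sleep`).
retry_sketch() {
    local retries=10
    local count=0
    local wait=1
    local exit_code
    # Re-run the given command until it succeeds or attempts run out.
    until "$@"; do
        exit_code=$?
        count=$((count + 1))
        if [ "${count}" -lt "${retries}" ]; then
            echo "Retry ${count}/${retries} exited ${exit_code}, retrying in ${wait} seconds..." >&2
            ${RETRY_SLEEP_CMD:-sleep} "${wait}"
            wait=$((wait * 2))
        else
            echo "Retry ${count}/${retries} exited ${exit_code}, no more retries left." >&2
            return "${exit_code}"
        fi
    done
    return 0
}
```

With these assumed values the loop sleeps 1, 2, 4, ... 256 seconds between attempts, so a pod stuck in `ContainerStatusUnknown` keeps the check failing for the whole retry budget unless something removes it.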

fixes: #920
praveenkumar committed Jul 3, 2024
1 parent 0d4dfbf commit 06e6b4b
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions snc-library.sh
@@ -241,8 +241,13 @@ function no_operators_degraded() {
    ${OC} get co -ojsonpath='{.items[*].status.conditions[?(@.type=="Degraded")].status}' | grep -v True
}

function retry_failed_pods() {
    ${OC} delete pods --field-selector=status.phase=Failed -A
}

function all_pods_are_running_completed() {
    local ignoreNamespace=$1
    retry_failed_pods
    ! ${OC} get pod --no-headers --all-namespaces --field-selector=metadata.namespace!="${ignoreNamespace}" | grep -v Running | grep -v Completed
}
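
To see why a single stuck pod makes `all_pods_are_running_completed` fail, the filter can be tried without a cluster. The sketch below reads pod lines from stdin instead of calling `${OC} get pod`; `pods_all_settled` is a hypothetical name, and the sample lines are illustrative apart from the one taken from the trace in the commit message.

```shell
# Sketch of the check in all_pods_are_running_completed, with the `oc get pod`
# output replaced by text on stdin so it runs without a cluster.
# pods_all_settled is a hypothetical name, not part of snc-library.sh.
pods_all_settled() {
    # The last grep exits 0 while at least one line survives both filters;
    # the leading `!` inverts the pipeline's status, so the function succeeds
    # only when no pod is left outside Running/Completed.
    ! grep -v Running | grep -v Completed
}

# Only Running/Completed pods: the check succeeds.
printf 'ns-a  pod-1  1/1  Running  0  5m\nns-b  pod-2  0/1  Completed  0  5m\n' \
    | pods_all_settled && echo "all pods settled"

# A pod in ContainerStatusUnknown survives both filters: the check fails.
printf 'openshift-kube-apiserver  installer-11-crc  0/1  ContainerStatusUnknown  1  19m\n' \
    | pods_all_settled || echo "still waiting"
```

Because the stuck pod's line never matches `Running` or `Completed`, the check cannot pass until the pod is gone, which is why `retry_failed_pods` runs first.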
