This repository has been archived by the owner on Nov 7, 2018. It is now read-only.

Preventing data loss on data nodes during rolling update #188

Open
depauna opened this issue Jun 5, 2018 · 3 comments

Comments

@depauna

depauna commented Jun 5, 2018

Shouldn't there be an HTTP livenessProbe on the data nodes that prevents older data nodes from being terminated while they are still replicating new data to newly running data nodes during a rolling update?

Meaning newly created data nodes are not seen as fully running until the cluster status is green.

I'm looking at something like this:

        livenessProbe:
          httpGet:
            path: /_cluster/health?wait_for_status=green
            host: elasticsearch
            port: 9200
            scheme: HTTP
          initialDelaySeconds: 300
          timeoutSeconds: 60
          failureThreshold: 5
@pires
Owner

pires commented Jun 8, 2018

A failed livenessProbe kills a pod. Are you thinking of readinessProbe?
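
For illustration, a minimal sketch of the difference, with assumed ports and timings rather than anything taken from this repo's manifests: the same cluster-health check only takes the pod out of rotation when it sits under readinessProbe, whereas under livenessProbe a failure gets the container restarted.

        # Sketch only; port numbers and timings here are assumptions.
        readinessProbe:
          # A readiness failure marks the pod NotReady: it is removed from Service
          # endpoints, and a StatefulSet rolling update will not move on to the
          # next pod until this one is Ready again.
          httpGet:
            path: /_cluster/health?wait_for_status=green&timeout=1s
            port: 9200
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        livenessProbe:
          # Keep liveness to a cheap local check, so a temporarily yellow cluster
          # never gets healthy containers killed.
          tcpSocket:
            port: 9300
          initialDelaySeconds: 60
          periodSeconds: 10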

@depauna
Author

depauna commented Jun 14, 2018

Yup, totally messed that up. Something like this.

        readinessProbe:
          exec:
            command:
            - curl
            - -i
            - -H
            - "Accept: application/json"
            - -H
            - "Content-Type: application/json"
            - -X
            - GET
            # wait_for_status=green makes this call block until the cluster is green
            # (the year-long timeout means it effectively never gives up on its own);
            # when the cluster is not green, the probe's timeoutSeconds kills the
            # call and the check fails.
            - http://{{ .Values.elasticsearch.name }}:{{ .Values.elasticsearch.client.restPort }}/_cluster/health?wait_for_status=green&timeout=31557600s
          initialDelaySeconds: 30
          timeoutSeconds: 10
          failureThreshold: 3

The only downside is: when a node goes down or reboots, the cluster status becomes yellow, and when that node then tries to start it never becomes fully ready (it stays at 0/1 Running), since the status is not green.
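
One way to soften that, sketched here as a suggestion rather than something these manifests ship: gate readiness on cluster-wide green only the first time the container comes up, and afterwards just check that the local node answers. That keeps already-healthy data pods Ready while the cluster is temporarily yellow; a restarted pod still waits for green once, which it reaches after its shards have recovered. This assumes bash and curl exist in the image, that the data node serves HTTP on 9200, and that a NotReady pod can still join the cluster (e.g. discovery goes through a headless service that publishes not-ready addresses); the marker-file path is made up for the example.

        readinessProbe:
          exec:
            command:
            - bash
            - -c
            - |
              # The marker file lives only as long as this container: a fresh
              # container waits for green once, after which a yellow cluster
              # (caused by some other node restarting) no longer flips this
              # pod to NotReady.
              if [ -f /tmp/.es-was-ready ]; then
                curl -sf "http://127.0.0.1:9200/" > /dev/null
              else
                curl -sf "http://127.0.0.1:9200/_cluster/health?wait_for_status=green&timeout=5s" > /dev/null \
                  && touch /tmp/.es-was-ready
              fi
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 10
          failureThreshold: 3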

Still wondering if we need it, because with rolling updates data could get lost if, as soon as one data node is available again, the next one goes down before the new data has been synced to it.

Or do the containers have something built in for that?
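
On the rolling-update question itself: if the data nodes run as a StatefulSet with the RollingUpdate strategy, the controller deletes and recreates one pod at a time, highest ordinal first, and waits until the recreated pod is Running and Ready before touching the next one. So a readiness probe that only passes once the cluster is green again is what keeps the next data node from being taken down while shards are still being replicated; there is nothing Elasticsearch-specific built into Kubernetes for this. A minimal, illustrative skeleton follows (names, labels, image tag and replica count are assumptions, not this repo's actual manifests):

        apiVersion: apps/v1
        kind: StatefulSet
        metadata:
          name: es-data            # illustrative name
        spec:
          serviceName: es-data
          replicas: 3
          podManagementPolicy: OrderedReady
          updateStrategy:
            type: RollingUpdate    # replace pods one at a time, waiting for Ready
          selector:
            matchLabels:
              component: elasticsearch
              role: data
          template:
            metadata:
              labels:
                component: elasticsearch
                role: data
            spec:
              containers:
              - name: es-data
                image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4  # illustrative tag
                ports:
                - containerPort: 9300
                  name: transport
                readinessProbe:
                  # Only report Ready once the cluster is green again, so the
                  # rolling update cannot move on to the next data node before
                  # replication has caught up (assumes HTTP is enabled on 9200).
                  httpGet:
                    path: /_cluster/health?wait_for_status=green&timeout=1s
                    port: 9200
                  initialDelaySeconds: 60
                  periodSeconds: 10
                  timeoutSeconds: 5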

@depauna
Author

depauna commented Jun 21, 2018

Any thoughts?
