
Weakforced synchost should not download data if data structure differs [BUG] #267

Open
sshipway opened this issue May 21, 2019 · 5 comments

Describe the bug
On startup, the synchost feature can download the in-memory data from a sister node. This should not happen if the data has a different structure, such as if HLLBits is changed.

Package: wforce-2.0.1-2.x86_64

To Reproduce
Steps to reproduce the behavior:

  1. Set up a 2-node cluster with a stringstatsDB that has an HLL field, and synchost definitions with the minimum required uptime.
  2. Cause records to be added to the HLL stringstatsDB.
  3. Modify one node to have setHLLBits(8).
  4. Restart that node.
  5. Attempt to have records updated in the HLL stringstatsDB.
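The setup in the steps above might look roughly like the following wforce.conf fragment (Lua). This is a hedged, untested sketch: the IPs, key, passwords and DB/field names are placeholders, and the addSyncHost()/setMinSyncHostUptime() names and signatures are recalled from the wforce 2.x docs rather than taken from this issue, so verify them against your version's documentation.

```lua
-- Sketch of one node's wforce.conf; the sibling node mirrors this with the
-- addresses swapped. All names and values here are illustrative placeholders.
setKey("CcQnZDUfvnDbXLSjLYmxa1RuOAa+U63PNFKpdoO9pWg=")  -- cluster key, same on both nodes
addSibling("192.0.2.2:4001")                             -- replication peer (the other node)

-- Sync-host setup (wforce 2.x): on startup, pull the in-memory state from the
-- sibling, provided that sibling has been up long enough.
addSyncHost("192.0.2.2:8084", "syncpassword",
            "http://192.0.2.1:8084/?command=syncDone", "webpassword")
setMinSyncHostUptime(60)  -- assumed name for the "minimum required uptime" setting

-- stringstatsDB with an HLL field (step 1).
local field_map = { loginIPs = "hll" }
local sdb = newStringStatsDB("OneHourDB", 600, 6, field_map)

-- Step 3 of the repro: on one node only, change the HLL register size.
-- setHLLBits(8)  -- restarting with this uncommented triggers the mismatch
```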

Expected behavior
Since the stringstatsDB is now in a different format, the restarting node should spot this and not request the data from the existing node, but instead start with an empty database.

Actual behavior
Data is downloaded into memory from the other node even though it does not match the expected format. When the data is later accessed in operation, it produces an error:

{"status":"failure", "reason":"number of registers doesn't match: 256 != 64"}

OS:

  • OS: CentOS
  • Version: 7

Additional context
This problem may also occur if new fields are added or other fields are changed. Perhaps a version code, derived from a hash of the database definition, could be checked as part of the sync handshake?

@sshipway (Author)

Workaround: restart both nodes immediately. Since the uptime is then less than the required minimum, the sync transfer does not happen and a fresh database is created.

@neilcook (Collaborator)

I would say this is a feature request rather than a bug report. Given that a multitude of things (like replication) will fail if the wforce instances have different configs, running with mismatched configs is really a bad idea. I might just use a hash based on the wforce.conf file.

@neilcook (Collaborator)

Just to stress this (and I might make it more explicit in the docs): wforce clusters expect all cluster nodes to have the same config. I'm not going to spend a huge amount of time checking that people haven't followed that rule.

@sshipway (Author)

I agree that all nodes should have the same config, and you don't want to waste a lot of time checking everything -- though some simple verification might help prevent mistakes.

A hash based on the entire config file might not work, because there are some legitimate differences between nodes (the synchost and sibling definitions might not list the node itself).

The situation that affected us arose during a config upgrade. To keep the service up, the new config was rolled out to hosts progressively, but the new synchost feature caused this process to break. Something like a config/database fingerprint comparison before the download would make progressive changes less error-prone.

Maybe a better workaround is to recommend that the cluster key (setKey()) be changed at the same time as any change to the database structure? I haven't tested this, but would that cause nodes in a rolling update to fail to connect to their siblings until the siblings were also updated?
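For concreteness, the key-rotation idea would amount to a change like the following in wforce.conf (untested, as noted above; the key string is a placeholder, normally generated with makeKey()):

```lua
-- Roll out the schema change and a new cluster key together. Siblings still on
-- the old key cannot complete the encrypted sync/replication handshake, so the
-- restarted node would (if this works as hoped) start from an empty database
-- instead of downloading incompatible state.
setKey("NewKeyGeneratedWithMakeKey0000000000000000k=")  -- rotated alongside the change below
setHLLBits(8)                                           -- the database-structure change
```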

@neilcook
Copy link
Collaborator

neilcook commented May 22, 2019 via email

@neilcook neilcook removed the bug label May 27, 2019