
Weakforced synchost should not download data if data structure differs [BUG] #267

Open
sshipway opened this issue May 21, 2019 · 5 comments

Describe the bug
On startup, the synchost feature can download the in-memory data from a sister node. This should not happen if the data has a different structure, such as if HLLBits is changed.

Package: wforce-2.0.1-2.x86_64

To Reproduce
Steps to reproduce the behavior:

  1. Set up a 2-node cluster with a stringstatsDB that has an HLL field, and synchost definitions with the minimum required uptime.
  2. Cause records to be added to the HLL stringstatsDB.
  3. Modify one node to have setHLLBits(8).
  4. Restart that node.
  5. Attempt to have records updated in the HLL stringstatsDB.
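The setup in the steps above might look roughly like the following wforce.conf fragment (Lua). This is a hedged, untested sketch: the IPs, key, passwords and DB/field names are placeholders, and the addSyncHost()/setMinSyncHostUptime() names and signatures are recalled from the wforce 2.x docs rather than taken from this issue, so verify them against your version's documentation.

```lua
-- Sketch of one node's wforce.conf; the sibling node mirrors this with the
-- addresses swapped. All names and values here are illustrative placeholders.
setKey("CcQnZDUfvnDbXLSjLYmxa1RuOAa+U63PNFKpdoO9pWg=")  -- cluster key, same on both nodes
addSibling("192.0.2.2:4001")                             -- replication peer (the other node)

-- Sync-host setup (wforce 2.x): on startup, pull the in-memory state from the
-- sibling, provided that sibling has been up long enough.
addSyncHost("192.0.2.2:8084", "syncpassword",
            "http://192.0.2.1:8084/?command=syncDone", "webpassword")
setMinSyncHostUptime(60)  -- assumed name for the "minimum required uptime" setting

-- stringstatsDB with an HLL field (step 1).
local field_map = { loginIPs = "hll" }
local sdb = newStringStatsDB("OneHourDB", 600, 6, field_map)

-- Step 3 of the repro: on one node only, change the HLL register size.
-- setHLLBits(8)  -- restarting with this uncommented triggers the mismatch
```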

Expected behavior
Since the stringstatsDB is now in a different format, the restarting node should spot this and not request the data from the existing node, but instead start with an empty database.

Actual behavior
Data is downloaded into memory from the other node even though it does not match the expected format. When the data is later accessed in operation, it produces an error:

{"status":"failure", "reason":"number of registers doesn't match: 256 != 64"}

OS:

  • OS: CentOS
  • Version: 7

Additional context
This problem may also occur if new fields are added or other fields are changed. Perhaps a version code, derived from a hash of the database definition, could be checked as part of the sync handshake?

@sshipway (Author)

Workaround: restart both nodes immediately. Since the uptime is then less than the required minimum, the sync transfer does not happen and a fresh database is created.

@neilcook (Collaborator)

I would say this is a feature request rather than a bug report. Given that a multitude of things (like replication) will fail if the wforce instances have different configs, running with mismatched configs is really a bad idea. I might just use a hash based on the wforce.conf file.

@neilcook (Collaborator)

Just to stress this (and I might make it more explicit in the docs): wforce clusters expect all cluster nodes to have the same config. I'm not going to spend a huge amount of time checking that people haven't followed that rule.

@sshipway (Author)

I agree that all nodes should have the same config, and you don't want to waste a lot of time checking everything -- though some simple verification might help prevent mistakes.

A hash based on the entire config file might not work, because there are some legitimate differences between nodes (the synchost and sibling definitions might not list the node itself).

The situation that affected us arose during a config upgrade. To keep the service up, the new config was rolled out to hosts progressively, but the new synchost feature caused this process to break. Something like a config/database fingerprint comparison before the download would make progressive changes less error-prone.

Maybe a better workaround is to recommend that the cluster key (setKey()) be changed at the same time as any change to the database structure? I haven't tested this, but would that cause nodes in a rolling update to fail to connect to their siblings until the siblings were also updated?
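For concreteness, the key-rotation idea would amount to a change like the following in wforce.conf (untested, as noted above; the key string is a placeholder, normally generated with makeKey()):

```lua
-- Roll out the schema change and a new cluster key together. Siblings still on
-- the old key cannot complete the encrypted sync/replication handshake, so the
-- restarted node would (if this works as hoped) start from an empty database
-- instead of downloading incompatible state.
setKey("NewKeyGeneratedWithMakeKey0000000000000000k=")  -- rotated alongside the change below
setHLLBits(8)                                           -- the database-structure change
```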

@neilcook
Copy link
Collaborator

neilcook commented May 22, 2019 via email

@neilcook neilcook removed the bug label May 27, 2019