Nodes with tfchain error don't update #2393

Open
scottyeager opened this issue Aug 7, 2024 · 2 comments

@scottyeager

I noticed that I can't reach some mainnet nodes over RMB.

RMBError: 104 invalid envelope signature: sr25519 signature verification failed

Here's an example from the dashboard, when attempting to deploy a VM on node 1479:

image

Same result using the RMB proxy:

image

Here's a non-exhaustive list of affected node IDs on mainnet:

1087
1226
1479
1640
1723
1926
1966
2158
2723
4349
@ramezsaeed ramezsaeed added this to the 3.12 milestone Aug 19, 2024
@rawdaGastan rawdaGastan self-assigned this Aug 19, 2024
@rawdaGastan
Contributor

Are you sure those nodes are updated? Can you please check their versions if possible?

@xmonader xmonader added the type_bug Something isn't working label Aug 19, 2024
@scottyeager
Author

I have reviewed the logs for all nodes in my list above. It seems they all have some issue that's preventing them from updating.

What's common in the logs of all nodes is this line:

[+] identityd: error failed to get flist info error="failed to get flist (tf-zos/zos:production-3:latest.flist) info: 404 Not Found"

Most of the nodes also have an error about a read-only cache and a resulting boltdb failure. For example:

[+] provisiond: fatal exiting error="error running integrity checks: unlinkat /var/cache/modules/provisiond/metrics-diff.bolt: read-only file system"

1087
1226
1479
1640
1723
2158
2723
4349

A couple don't have the read-only cache error but instead have an error regarding tfchain, like this:

[+] noded:  error failed to decode events from tfchain error="unable to find field Balances_Locked for event #62 with EventID [20 17]"

1926
1966
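
For anyone triaging more nodes the same way, here is a rough sketch of how the grouping above could be automated. It only matches the three error strings quoted in this thread. The category names and the `classify` helper are made up for illustration and are not part of zos or any Grid tooling.

```python
import re

# Patterns taken from the log lines quoted in this thread.
# The category names are hypothetical labels chosen for this sketch.
PATTERNS = {
    "flist_not_found": re.compile(r"failed to get flist \(.*\) info: 404 Not Found"),
    "read_only_cache": re.compile(r"read-only file system"),
    "tfchain_decode": re.compile(r"failed to decode events from tfchain"),
}


def classify(log_text: str) -> set:
    """Return the labels of every pattern that appears somewhere in log_text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(log_text)}


if __name__ == "__main__":
    # Two of the lines quoted above, as they would appear in one node's log.
    sample = (
        '[+] identityd: error failed to get flist info error="failed to get flist '
        '(tf-zos/zos:production-3:latest.flist) info: 404 Not Found"\n'
        '[+] noded:  error failed to decode events from tfchain error="unable to find '
        'field Balances_Locked for event #62 with EventID [20 17]"\n'
    )
    print(sorted(classify(sample)))
```

A node whose log yields `flist_not_found` together with `tfchain_decode` but not `read_only_cache` would fall into the same group as 1926 and 1966.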

Checking now, I see that there's a fix for nodes with a read-only cache not getting the latest version.

But what about those last two nodes (1926 and 1966)? They aren't reporting a read-only cache, but they seem to behave similarly in not accepting the latest version.

@scottyeager scottyeager changed the title RMBError: signature verification failed Nodes with tfchain error don't update Aug 23, 2024
@ashraffouda ashraffouda modified the milestones: 3.12, 3.13 Sep 18, 2024