[META] Blutgang 0.3.0 Garreg Mach #57

Open
makemake-kbo opened this issue Feb 9, 2024 · 20 comments
Labels
bug Something isn't working good first issue Good for newcomers Major release For a new major release

Comments

@makemake-kbo
Contributor

This is a meta issue for general questions and troubleshooting related to Blutgang 0.3.0 Garreg Mach.

If you have trouble updating or using Blutgang, run into undefined behaviour, or hit anything minor that does not deserve its own issue, you can report it here.

@makemake-kbo added the enhancement, bug, good first issue, and Major release labels, then removed the enhancement label, on Feb 9, 2024
@makemake-kbo self-assigned this on Feb 9, 2024
@makemake-kbo pinned this issue on Feb 9, 2024
@chris-vest

@makemake-kbo Trying to run this on Kubernetes, and after checking the RPC latencies, it simply exits:

outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Wrn: All data cleared from the database.
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Info: Starting Blutgang 0.3.0 Garreg Mach
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Info: Bound to: 0.0.0.0:3000
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Wrn: Reorg detected!
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Removing stale entries from the cache.
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Info: Adding user 1 to sink map
outbound-blutgang-lb-dev-f382c-1 outbound-blutgang-lb Info: Subscribe_user finding: ["newHeads"]
- outbound-blutgang-lb-dev-f382c-1 › outbound-blutgang-lb

Running locally with the same config using Docker, it seems to work fine. I have no liveness / readiness set on Kubernetes, so it's not being killed by the orchestrator.

    Last State:     Terminated
      Reason:       Error
      Exit Code:    132
      Started:      Wed, 21 Feb 2024 11:20:37 +0100
      Finished:     Wed, 21 Feb 2024 11:20:49 +0100
spec:
  containers:
  - args:
    - -c
    - /app/config.toml
    command:
    - /app/blutgang

@makemake-kbo
Contributor Author

@chris-vest it's probably being killed by something. Exiting on its own without any error code should not be happening.

@chris-vest

@makemake-kbo Can you confirm the health check endpoint? I saw there was a feature added for it but I can't find the endpoint.

@makemake-kbo
Contributor Author

@chris-vest it's at / on the admin API. If you get a response of {"id":0}, that means it's working. If you get anything else, that means it's unhealthy.
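
For anyone wiring this into a probe, here is a minimal Python sketch of that check. It assumes the admin API answers a plain HTTP GET on / and that it is reachable at http://localhost:5715/ (both are assumptions, adjust to your config). The helper names is_healthy and check_admin are hypothetical and not part of Blutgang.

```python
import json
import urllib.request

HEALTHY_RESPONSE = {"id": 0}  # per the comment above: anything else means unhealthy


def is_healthy(body: bytes) -> bool:
    """Return True only if the body parses as JSON equal to {"id": 0}."""
    try:
        return json.loads(body) == HEALTHY_RESPONSE
    except (ValueError, UnicodeDecodeError):
        # Not JSON at all, so treat it as unhealthy.
        return False


def check_admin(url: str = "http://localhost:5715/", timeout: float = 2.0) -> bool:
    """Fetch the admin root and apply is_healthy.

    Assumption: the endpoint accepts a plain GET; the URL is a placeholder.
    Connection errors and timeouts are reported as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read())
    except OSError:
        return False
```

Calling check_admin() from a small script and exiting non-zero on False would be one way to use it as an exec-style liveness probe.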

@chris-vest

@makemake-kbo Thank you.

I don't think K8s is killing the container, it seems to error out:

    lastState:
      terminated:
        containerID: containerd://46bd3f545980b859a4be7dfa96463197fbbd8efff6ee1b407167230a86da3957
        exitCode: 132
        finishedAt: "2024-02-21T10:59:28Z"
        reason: Error
        startedAt: "2024-02-21T10:59:28Z"

@chris-vest

Is there a debug log?

@chris-vest

If you could provide an example config for Kubernetes so I can check against it, that would be great. I'm using the 0.3.0 image.

@makemake-kbo
Contributor Author

@chris-vest the feature flag debug-verbose prints verbose output about what it's doing.

There are Helm charts here that you can use as a reference: https://github.com/ethpandaops/ethereum-helm-charts/tree/master/charts/blutgang

@chris-vest

Any idea what the 132 exit code could be?

I'm using 0.2.0 now and that seems to work fine.

@makemake-kbo
Contributor Author

If 0.2.0 works fine this is probably a regression. Could you post your config/full output?

@chris-vest

chris-vest commented Feb 21, 2024

redacted

@chris-vest

Same config as above but I removed Quicknode and DRPC config to see if it would help, so you only see it checking one RPC latency at startup.

+ outbound-blutgang-lb-dev-f382c-0 › outbound-blutgang-lb
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Using config file at /app/config.toml
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Sorting RPCs by latency...
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb https://REDACTED: 126060066.75ns
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Wrn: All data cleared from the database.
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Starting Blutgang 0.3.0 Garreg Mach
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Bound to: 0.0.0.0:3000
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Admin namespace enabled, accepting admin methods at admin port
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Bound admin to: 0.0.0.0:5715
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Wrn: Reorg detected!
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Removing stale entries from the cache.
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Adding user 1 to sink map
outbound-blutgang-lb-dev-f382c-0 outbound-blutgang-lb Info: Subscribe_user finding: ["newHeads"]
- outbound-blutgang-lb-dev-f382c-0 › outbound-blutgang-lb

@makemake-kbo
Contributor Author

Could you run sudo cat /proc/cpuinfo | grep avx on the machine running k8s? There's a chance that it's erroring out with 132 because it doesn't have AVX2 instructions.
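
Since grep avx also matches avx, avx512f and friends, an exact-token check can be less ambiguous. A rough Python sketch follows, assuming an x86 Linux host where /proc/cpuinfo has one "flags" line per logical CPU. The function name cpus_missing_flag is hypothetical.

```python
from typing import List


def cpus_missing_flag(cpuinfo_text: str, flag: str) -> List[int]:
    """Return the positions of logical CPUs whose 'flags' line lacks `flag`.

    Positions count 'flags' lines in order of appearance, starting at 0.
    The flag is compared as a whole token, so "avx" does not match "avx2".
    """
    missing = []
    index = 0
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            if flag not in value.split():
                missing.append(index)
            index += 1
    return missing


# Usage on the node itself (Linux only):
#   with open("/proc/cpuinfo") as f:
#       print(cpus_missing_flag(f.read(), "avx2"))
# An empty list means every logical CPU advertises the flag.
```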

@chris-vest

$ sudo cat /proc/cpuinfo | grep avx2
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512_vnni md_clear arch_capabilities

@makemake-kbo
Contributor Author

That's really bizarre. Exit code 132 should mean that it encountered an illegal instruction. Just to confirm what's happening, if you run the x86_64 build from the releases page, does it still SIGILL?
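
For reference, the 132 follows the usual convention that a process killed by signal N is reported with exit code 128 + N, and SIGILL is signal 4 on Linux. A small Python sketch of that decoding, with describe_exit_code as a hypothetical helper name:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map an exit code to a signal name when it follows the 128 + N convention."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return f"exit code {code} (no matching signal)"
    return f"exit code {code} (normal exit, not a signal)"


print(describe_exit_code(132))  # SIGILL on Linux, since 132 - 128 = 4
```

The same arithmetic explains other common container exit codes, for example 137 for SIGKILL (9) and 139 for SIGSEGV (11).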

@chris-vest

I'll try to build an image with that x86_64 build and let you know.

@makemake-kbo
Contributor Author

@chris-vest I pushed a new container with a conservative build target and verbose debug output. If error 132 doesn't happen with this one, then it's probably safe to say it's target-CPU related.

@chris-vest

@makemake-kbo Your 0.3.0-debug image seems to work! Last question I think is whether I can make the logging less verbose? I probably really only want errors. I've set both RUST_LIB_BACKTRACE and RUST_BACKTRACE to 0.

@jhvst
Contributor

jhvst commented Feb 23, 2024

I'm getting malformed JSON output on the websocket connection.

websocat ws://upstream:8545
> {"jsonrpc":"2.0","id":155,"method":"eth_subscribe","params":["newHeads"]}
< {"jsonrpc":"2.0","id":155,"result":"0x76540fa86762d185d54875398cf69c4b"}
websocat ws://blutgang:8545
> {"jsonrpc":"2.0","id":155,"method":"eth_subscribe","params":["newHeads"]}
< {"jsonrpc":"2.0","id":155,"result":0x76540fa86762d185d54875398cf69c4b}

As you can see, the blutgang result value is not a string for some reason, but a raw value, even though upstream reports this back correctly. This seemingly only happens in this first response with the subscribe -- all other responses thereafter are correctly relayed.
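
A quick way to reproduce the difference is to feed both responses to a strict JSON parser. The sketch below uses Python's standard json module on the two lines copied from the transcript above; parses_as_json is a hypothetical helper name.

```python
import json

# Responses copied verbatim from the websocat sessions above.
upstream = '{"jsonrpc":"2.0","id":155,"result":"0x76540fa86762d185d54875398cf69c4b"}'
blutgang = '{"jsonrpc":"2.0","id":155,"result":0x76540fa86762d185d54875398cf69c4b}'


def parses_as_json(text: str) -> bool:
    """Return True if `text` is valid JSON according to a strict parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(upstream), parses_as_json(blutgang))  # True False
```

JSON has no hexadecimal number literal, so the unquoted 0x... value is rejected by the parser, which is why clients choke on the first response.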

@makemake-kbo
Contributor Author

> @makemake-kbo Your 0.3.0-debug image seems to work! Last question I think is whether I can make the logging less verbose? I probably really only want errors. I've set both RUST_LIB_BACKTRACE and RUST_BACKTRACE to 0.

@chris-vest awesome! I'll make a new minor release with various small bug fixes and a more conservative target either later today or tomorrow. Verbose debug output is a compile-time feature.

> I'm getting malformed JSON output on the websocket connection.

fixed in 8268e2b

3 participants