AS3/FAST RPM installation fails in GCP cloud deployment intermittent #46

Open
RavinderReddyF5 opened this issue Aug 18, 2022 · 5 comments
Labels
bug Something isn't working

@RavinderReddyF5

error snapshot:

Wed, 17 Aug 2022 19:55:35 GMT - severe: FAST Worker: Failed to save config: Error: AS3 Driver failed to GET declaration: Request failed with status code 404
""
    at AS3Driver._handleAS3Error (/var/config/rest/iapps/f5-appsvcs-templates/lib/drivers.js:597:19)
    at /var/config/rest/iapps/f5-appsvcs-templates/lib/drivers.js:570:24
    at callReaction (/var/config/rest/iapps/f5-appsvcs-templates/node_modules/core-js/modules/es.promise.constructor.js:75:18)
I am able to replicate this error.

It occurs because AS3 is in a bad state. Interestingly, AS3 was alive at some point after installation:

2022-08-17T19:30:47.381Z [25812]: info: Validating - as3 extension is available.
2022-08-17T19:30:47.382Z [25812]: silly: Making request: GET http://localhost:8100/mgmt/shared/appsvcs/info verifyTls: true
2022-08-17T19:30:48.674Z [25812]: silly: Request response: 404 {"code":404,"message":"","referer":"Unknown","errorStack":[]}
2022-08-17T19:30:48.675Z [25812]: silly: Error: Is available check failed 404
2022-08-17T19:30:51.675Z [25812]: silly: Retrying... Attempts left: 9
2022-08-17T19:30:51.677Z [25812]: silly: Making request: GET http://localhost:8100/mgmt/shared/appsvcs/info verifyTls: true
2022-08-17T19:30:53.469Z [25812]: silly: Request response: 200 {"version":"3.38.0","release":"4","schemaCurrent":"3.38.0","schemaMinimum":"3.0.0"}

but then FAST's checks for AS3 availability start failing:

Wed, 17 Aug 2022 19:36:43 GMT - info: FAST Worker [0]: Entering Fetching AS3 info
Wed, 17 Aug 2022 19:36:43 GMT - finest: socket 342 opened
Wed, 17 Aug 2022 19:36:43 GMT - severe: [RestOperationDispatcher] 'shared/fast/info' not found.
Wed, 17 Aug 2022 19:36:43 GMT - severe: [ErrorHandlingModule] RestOperation failed: "/shared/fast/info". {"code":404,"message":"","referer":"Unknown","originalRequestBody":"","errorStack":[]}
Wed, 17 Aug 2022 19:36:44 GMT - info: FAST Worker [0]: Exiting Fetching AS3 info
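The retry pattern visible in the runtime-init log above (a 404, a roughly 3-second pause, "Attempts left: 9", then a 200) can be sketched as a small polling loop. This is only an illustration, not Runtime Init's actual code: `wait_until_available`, its parameters, and the default of 10 attempts are assumptions inferred from the log. The `probe` callable stands in for a GET against an info endpoint such as `/mgmt/shared/appsvcs/info` and returns the HTTP status code.

```python
import time
from typing import Callable


def wait_until_available(
    probe: Callable[[], int],
    attempts: int = 10,          # assumed from "Attempts left: 9" after the first failure
    delay_s: float = 3.0,        # assumed from the ~3 s gap between log entries
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns HTTP 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if probe() == 200:
            return True
        if attempt < attempts:
            # Pause before the next try; no pause after the final attempt.
            sleep(delay_s)
    return False
```

As the logs in this thread show, one successful check does not guarantee the endpoint stays up: AS3 answered 200 at 19:30:53 and was unavailable a few minutes later. A check like this may therefore need to be repeated after every subsequent restnoded restart.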


and right now, AS3 is not listed under installed applications:
# curl -s -u admin: http://localhost:8100/mgmt/shared/iapp/global-installed-packages | jq .items[].appName
"f5-service-discovery"
"f5-declarative-onboarding"
"f5-cloud-failover"
"f5-telemetry"
"f5-appsvcs-templates"
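The same check can be scripted. Below is a minimal sketch that takes the already-parsed JSON body of the `global-installed-packages` response and reports which expected extensions are absent. The helper name `missing_packages` is hypothetical, and the assumption that AS3 appears with the `appName` `f5-appsvcs` is inferred from its RPM name, not confirmed by the output above.

```python
from typing import Iterable, List


def missing_packages(response: dict, expected: Iterable[str]) -> List[str]:
    """Return the expected appName values not present in response['items']."""
    installed = {item.get("appName") for item in response.get("items", [])}
    return [name for name in expected if name not in installed]


# Mirrors the jq output shown above; "f5-appsvcs" is the assumed AS3 appName.
observed = {
    "items": [
        {"appName": "f5-service-discovery"},
        {"appName": "f5-declarative-onboarding"},
        {"appName": "f5-cloud-failover"},
        {"appName": "f5-telemetry"},
        {"appName": "f5-appsvcs-templates"},
    ]
}
expected_apps = ["f5-declarative-onboarding", "f5-appsvcs", "f5-telemetry", "f5-appsvcs-templates"]
absent = missing_packages(observed, expected_apps)
```

With the package list from this comment, `absent` contains only the AS3 entry.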

Somehow, AS3 either gets uninstalled or fails to start after the restnoded restarts caused by other extension installations.

In restnoded.log, I found the following message around the failure time:
Wed, 17 Aug 2022 19:36:26 GMT - warning: [appsvcs] {"message":"AS3 version: 3.38.0","level":"warning"}
OK, I tried re-running the installation without the DO config, just the extensions, and everything went through (but only on the 3rd attempt: a couple of times I got a null RPM file from GitHub, which caused the installation to fail).

I then re-ran the declaration with just the DO config (after the extensions had already been installed), and everything seems to be working without problems.


How consistent is this issue? As I said before, it seems that AS3 gets into a bad state after restnoded restarts (restarts caused by extension installations). If this is a consistent issue, the first step is to open a ticket/JIRA for RuntimeInit.


Ugh, I am not able to re-run the installation from the same host due to this error:
RPM installation failed: Package f5-telemetry version 1.30.0-1 has status null
This appears to happen because GitHub returns a null file. This could potentially be improved on the RuntimeInit side as well.


Perhaps GitHub is throttling requests from this IP because of too many requests.
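One way the "null file" case could be guarded against is to validate the downloaded bytes before handing them to the installer, and to back off between retries in case the host is being rate limited. The sketch below is an assumption about how that might look, not Runtime Init's implementation: `looks_like_rpm`, `download_with_backoff`, and their parameters are hypothetical, and `fetch` stands in for whatever performs the HTTP download. The only external fact relied on is that RPM files begin with the 4-byte magic `ED AB EE DB`.

```python
import time
from typing import Callable, Optional

# Leading magic bytes of an RPM package file.
RPM_MAGIC = b"\xed\xab\xee\xdb"


def looks_like_rpm(data: Optional[bytes]) -> bool:
    """True only for a non-empty payload that starts with the RPM magic."""
    return bool(data) and data[:4] == RPM_MAGIC


def download_with_backoff(
    fetch: Callable[[], Optional[bytes]],
    attempts: int = 4,            # hypothetical default
    base_delay_s: float = 2.0,    # hypothetical default; doubles on each retry
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Retry fetch() with exponential backoff until the payload looks like an RPM."""
    for attempt in range(attempts):
        data = fetch()
        if looks_like_rpm(data):
            return data
        if attempt < attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    raise RuntimeError("download did not return a valid RPM payload")
```

Rejecting an empty or non-RPM payload up front would turn the "has status null" installation failure into a clearer download error.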


Anyhow, if this is a consistent issue, I would advise reporting a bug (contact Shyawn Karim or Krithika Chidambaram, who will file a JIRA story) so that Runtime Init can investigate why AS3 is in a bad state after all the other extensions are installed.


@RavinderReddyF5
Author

@shyawnkarim this issue was reported based on comments from @andreykashcheev. He is aware of the issue.

@f5-applebaum

Hi,

Noticed that the repo is defaulting to:
n1-standard-4
https://github.com/F5Networks/terraform-gcp-bigip-module/blob/main/variables.tf#L28

When installing many extensions, I've noticed this type of symptom with smaller instance sizes. Can you try bumping up to:

n1-standard-8:
ex.
https://github.com/F5Networks/f5-google-gdm-templates-v2/blob/main/examples/quickstart/sample_quickstart.yaml#L36

to see if that helps resolve it. If that doesn't work, try increasing the delay between installs:

https://github.com/F5Networks/f5-bigip-runtime-init#controls
extensionInstallDelayInMs: 15000

@shyawnkarim

@RavinderReddyF5, did @f5-applebaum's advice solve your issue?

@JeffGiroux

JeffGiroux commented Nov 3, 2022

Getting similar failures in GCP. Tried the template as-is with tag 2.4.0.0 and also with the latest, 2.6.0.0. Both quickstart deployments result in the same failure.

instance = n1-standard-8

snippet...
2022-11-03T23:31:53.888Z [3042]: info: Validating - fast extension is available after restnoded restart.
2022-11-03T23:32:53.601Z [3042]: error: Is available check failed 404

full...

cat /var/log/cloud/startup-script-post-swap-nic.log
2022-11-03T23:29:57.153Z [3042]: info: Configuration file: /config/cloud/runtime-init-conf.yaml
2022-11-03T23:29:57.176Z [3042]: info: Processing controls parameters
2022-11-03T23:29:57.180Z [3042]: info: Validating provided declaration
2022-11-03T23:29:57.289Z [3042]: info: Successfully validated declaration
2022-11-03T23:29:57.377Z [3042]: info: Resolving parameters
2022-11-03T23:29:58.428Z [3042]: info: Executing install operations.
2022-11-03T23:29:58.439Z [3042]: info: Installing - do 1.33.0
2022-11-03T23:30:02.182Z [3042]: info: Validating - do extension is available.
2022-11-03T23:30:15.223Z [3042]: info: Installing - as3 3.40.0
2022-11-03T23:30:19.567Z [3042]: info: Validating - as3 extension is available.
2022-11-03T23:30:54.873Z [3042]: info: Installing - ts 1.32.0
2022-11-03T23:31:01.831Z [3042]: info: Validating - ts extension is available.
2022-11-03T23:31:11.851Z [3042]: info: Installing - fast 1.21.0
2022-11-03T23:31:14.675Z [3042]: info: Validating - fast extension is available.
2022-11-03T23:31:44.750Z [3042]: info: fast extension  is not available. Attempt to restart restnoded.
2022-11-03T23:31:53.888Z [3042]: info: Validating - fast extension  is available after restnoded restart.
2022-11-03T23:32:53.601Z [3042]: error: Is available check failed 404
2022-11-03T23:32:53.601Z [3042]: info: Sending F5 Teem report for failure case.
2022-11-03T23:32:58.754Z [3042]: info: {"id":"defc7c6b-c4ad-878d-65b645131685","product":"BIG-IP","cpuCount":8,"diskSize":83968,"memoryInMb":30160,"version":"16.1.3.2","nicCount":3,"regKey":"OOUIC-GSJFA-JMOZI-BVXJR-YAFXINC","platformId":"Z100","hostname":"bigip1","management":"10.0.0.2/32","provisionedModules":{"ltm":"nominal"},"installedPackages":{"f5-service-discovery-1.10.15-1.noarch":"1.10.15","f5-declarative-onboarding-1.33.0-7.noarch":"1.33.0","f5-appsvcs-3.40.0-5.noarch":"3.40.0","f5-telemetry-1.32.0-2.noarch":"1.32.0","f5-appsvcs-templates-1.21.0-1.noarch":"1.21.0"},"environment":{"pythonVersion":"Python 2.7.5","pythonVersionDetailed":"2.7.5 (default, Sep 14 2022, 06:56:50) \n[GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]","nodeVersion":"v6.9.1","libraries":{"ssh":"OpenSSH_7.4p1, OpenSSL 1.0.2u-fips  20 Dec 2019"}}}
2022-11-03T23:33:05.777Z [3042]: info: F5 Teem report was successfully sent for failure case.
2022-11-03T23:33:05.778Z [3042]: info: Is available check failed 404
[admin@localhost:Active:Standalone] ~ # cat /var/log/cloud/startup-script-post-swap-nic.log 
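To see where the time goes in a log like the one above, the timestamps can be parsed and the largest gap between consecutive entries reported. This is a rough sketch under the assumption that every line follows the `<ISO timestamp>Z [pid]: level: message` layout seen here; `parse_line` and `slowest_gap` are hypothetical helper names, and lines that do not match the layout are skipped.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Layout assumed from the log above: "<timestamp>Z [pid]: level: message"
LINE_RE = re.compile(r"^(\S+) \[(\d+)\]: (\w+): (.*)$")


def parse_line(line: str) -> Optional[Tuple[datetime, str, str]]:
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, _pid, level, message = match.groups()
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ"), level, message


def slowest_gap(lines: Iterable[str]) -> Optional[Tuple[float, str]]:
    """Return (seconds, message) for the entry followed by the longest pause."""
    entries = [entry for entry in map(parse_line, lines) if entry is not None]
    best = None
    for (start, _, message), (end, _, _) in zip(entries, entries[1:]):
        gap = (end - start).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, message)
    return best
```

Applied to the FAST portion of this log, the longest pause is the roughly 60 seconds between "Validating - fast extension  is available after restnoded restart." and the final "Is available check failed 404".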

@shyawnkarim

The fix for this issue, internal ID ESECLDTPLT-3219, has been completed and will be available in the next release.

@shyawnkarim shyawnkarim added the bug Something isn't working label Nov 7, 2022
@shyawnkarim shyawnkarim added this to the backlog milestone Nov 7, 2022