Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restnoded high CPU and Memory when handling ~700 http request per second Request logging. #233

Open
eos500n2000 opened this issue Nov 15, 2022 · 2 comments
Labels
bug Something isn't working untriaged Issue needs to be reviewed for validity

Comments

@eos500n2000
Copy link

eos500n2000 commented Nov 15, 2022

Environment

  • Telemetry Streaming Version: 1.31. 1.32
  • BIG-IP Version: v15.1.6.1

Summary

Restnoded high CPU and Memory when handling ~700 http request per second Request logging.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Submit the following declaration:
[root@HQF5LB02:Active:Disconnected (Trust Domain Only)] config # restcurl -u admin:admin /mgmt/shared/telemetry/declare
{
  "message": "success",
  "declaration": {
    "class": "Telemetry",
    "controls": {
      "class": "Controls",
      "logLevel": "debug",
      "debug": true,
      "memoryThresholdPercent": 90
    },
    "My_System": {
      "class": "Telemetry_System",
      "systemPoller": {
        "interval": 60,
        "enable": true,
        "actions": [
          {
            "setTag": {
              "tenant": "`T`",
              "application": "`A`"
            },
            "enable": true
          }
        ]
      },
      "enable": true,
      "host": "localhost",
      "port": 8100,
      "protocol": "http",
      "allowSelfSignedCert": false
    },
    "My_Listener": {
      "class": "Telemetry_Listener",
      "port": 6514,
      "enable": true,
      "trace": false,
      "match": "",
      "actions": [
        {
          "setTag": {
            "tenant": "`T`",
            "application": "`A`"
          },
          "enable": true
        }
      ]
    },
    "Httptest_Consumer": {
      "class": "Telemetry_Consumer",
      "type": "Generic_HTTP",
      "host": "10.20.4.176",
      "protocol": "http",
      "port": 8001,
      "enable": true,
      "trace": true,
      "path": "/",
      "allowSelfSignedCert": false,
      "method": "POST",
      "outputMode": "processed",
      "compressionType": "none"
    },
    "schemaVersion": "1.32.0"
  }
}

[VS and Request log Config]

ltm pool /Common/telemetry {
members {
/Common/255.255.255.254:6514 {
address 255.255.255.254
}
}
}
ltm rule /Common/telemetry_local_rule {
when CLIENT_ACCEPTED {
node 127.0.0.1 6514
}
}

ltm virtual /Common/telemetry_local {
creation-time 2022-09-28:14:16:12
destination /Common/255.255.255.254:6514
ip-protocol tcp
last-modified-time 2022-09-28:14:16:12
mask 255.255.255.255
persist {
/Common/source_addr {
default yes
}
}
profiles {
/Common/f5-tcp-progressive { }
}
rules {
/Common/telemetry_local_rule
}
serverssl-use-sni disabled
source 0.0.0.0/0
source-address-translation {
type automap
}
translate-address enabled
translate-port enabled
}

ltm profile request-log /Common/telemetry {
app-service none
defaults-from /Common/request-log
request-log-pool /Common/telemetry
request-log-protocol mds-tcp
request-log-template event_source="request_logging",hostname="$BIGIP_HOSTNAME",client_ip="$CLIENT_IP",server_ip="$SERVER_IP",http_method="$HTTP_METHOD",http_uri="$HTTP_URI",virtual_name="$VIRTUAL_NAME",event_timestamp="$DATE_HTTP"
request-logging enabled
response-log-pool /Common/telemetry
response-log-protocol mds-tcp
response-log-template event_source="response_logging",hostname="$BIGIP_HOSTNAME",client_ip="$CLIENT_IP",server_ip="$SERVER_IP",http_method="$HTTP_METHOD",http_uri="$HTTP_URI",virtual_name="$VIRTUAL_NAME",event_timestamp="$DATE_HTTP",http_statcode="$HTTP_STATCODE",http_status="$HTTP_STATUS",response_ms="$RESPONSE_MSECS"
response-logging enabled
}

[http traffic test VS and pool]

ltm virtual /Common/telemetry_test_vs {
creation-time 2022-10-04:19:38:53
destination /Common/10.20.4.16:80
ip-protocol tcp
last-modified-time 2022-10-04:19:38:53
mask 255.255.255.255
pool /Common/telemetry_test_pool
profiles {
/Common/http { }
/Common/tcp { }
/Common/telemetry { }
}
serverssl-use-sni disabled
source 0.0.0.0/0
source-address-translation {
type automap
}
translate-address enabled
translate-port enabled
}

ltm pool /Common/telemetry_test_pool {
members {
/Common/12.0.0.180:80 {
address 12.0.0.180
}
}
monitor /Common/tcp
}

Send ~ 700 request/second to telemetry_test_vs.

  1. Observe the following error response:
    Restnoded high CPU and Memory, restarted in about 7 mins.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

20972 restnod+ 20 0 2821160 1.6g 16860 R 124.6 0.3 7:07.54 /usr/bin/f5-rest-node /usr/share/rest/node/src/restnode.js -p 8105 --logLevel finest -i /var/log/restnoded/restnod+

top - 15:20:04 up 1 day, 20:28, 1 user, load average: 4.11, 3.23, 2.70

Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie

%Cpu(s): 6.8 us, 1.5 sy, 0.6 ni, 91.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

KiB Mem : 52832300+total, 35512116 free, 48742419+used, 5386728 buff/cache

KiB Swap: 5242876 total, 5242876 free, 0 used. 35985340 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

20972 restnod+ 20 0 2817576 1.5g 16860 R 126.2 0.3 7:11.34 /usr/bin/f5-rest-node /usr/share/rest/node/src/restnode.js -p 8105 --logLevel finest -i /var/log/restnoded/restnod+

Broadcast message from [systemd-journald@HQF5LB02] (Wed 2022-11-09 15:20:05 CST):

logger[24708]: Re-starting restnoded

2022 Nov 9 15:20:05 [HQF5LB02] logger[24708]: Re-starting restnoded

Expected Behavior

Restnoded does not restart

Actual Behavior

Restnoded restarts

PS: A F5 Support case [00277783] has been filed for the same issue.

@eos500n2000 eos500n2000 added bug Something isn't working untriaged Issue needs to be reviewed for validity labels Nov 15, 2022
@eos500n2000
Copy link
Author

1.)
Tested ~ 700 http requests per second in lab for request logging.
restnoded CPU was 100%, Memory went to 1.6G very quickly and Restnoded restarted after about 7~8 mins.

2.)
I used http consumer in lab ( customer uses Azure log analysis). PE suggested to add two customOpts as below.
"Httptest_Consumer": {
"class": "Telemetry_Consumer",
"type": "Generic_HTTP",
"host": "10.20.4.176",
"protocol": "http",
"port": 8001,
"enable": true,
"trace": true,
"path": "/",
"allowSelfSignedCert": false,
"method": "POST",
"customOpts": [
{ "name": "keepAlive","value": true }, #####
{ "name": "maxSockets", "value": 30} #####
]

There is no improvements. Restnoded still restarts after about 7 mins.

And there is no such tuning options for Azure LA consumer anyway.

3.)
Also tested disabled/ removed all consumers. ( for debugging)

"Httptest_Consumer": {
"class": "Telemetry_Consumer",
"type": "Generic_HTTP",
"host": "10.20.4.176",
"protocol": "http",
"port": 8001,
"enable": false, ######## disabled
"trace": true,
"path": "/",

With 700 TPS http traffic, restnodes memory was stable, CPU is between 50%~70% (comparing with ~0% when there is no traffic.) And no restarts.

@G-gonzalezjimenez
Copy link

Thanks, created ticket TS718 internally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working untriaged Issue needs to be reviewed for validity
Projects
None yet
Development

No branches or pull requests

2 participants