The server gets berserk on CPU and RAM out of nowhere #70

Open
triuk opened this issue Feb 26, 2023 · 21 comments


triuk commented Feb 26, 2023

Hi, I do not know what changed, but even after a clean install the CPU hits 100 % and RAM usage grows by about 30 MB/s. The game runs but is unplayable in these conditions, and the server becomes unresponsive after a while anyway.
Do you experience a similar issue?
I tried to revert to b2e4e04b2763604b4e3cedd5241cd123f3a84fe3
with

git reset --hard b2e4e04b2763604b4e3cedd5241cd123f3a84fe3
./install.sh

but the result is the same (can I revert to a previously installed version this way?).
Before I start doing some tests, I'd like to make sure everything is fine on your side.

Collaborator

bviktor commented Feb 26, 2023

Actually, I noticed the same thing: "connection lost" all the time. But I can't imagine what could possibly have broken things on my side.

Apparently the last update to KF2 was on Feb 2nd:

https://steamdb.info/app/232090/patchnotes/

So it shouldn't be that either I guess?

I tried skipping the UKFP mutator, but that didn't solve it for me. Will keep you updated if I find out the cause, please do the same.

Author

triuk commented Mar 1, 2023

Well, the error disappeared after a longer time. It took longer than expected. It is downloaded. False alarm.
Maybe it is related, maybe not. I'll just post everything suspicious I find. I did a clean install in VBox Ubuntu 22.04 server (faster to do things on my workstation than the potato low-power server).
Here is the output of klf status workshop, nothing is downloaded.

------------------------------------------------------
Subscribed workshop maps:                             
------------------------------------------------------
838775511       KF-HorzineArenaRMEdition        ❌❌🌐
1210703659      KF-KillingPool                  ❌❌🌐
------------------------------------------------------
Subscribed workshop mutators:                         
------------------------------------------------------
2625647922      N/A                             ❌❌🌐
2875147606      N/A                             ❌❌🌐
find: ‘/home/steam/Steam/KF2Server/Binaries/Win64/steamapps/workshop/content/232090’: No such file or directory
find: ‘/home/steam/Steam/KF2Server/KFGame/Cache’: No such file or directory

Author

triuk commented Mar 1, 2023

OK, I found what causes the behavior.
The server hits 100 % CPU and eats RAM (KFGameSteamServ process) as soon as I expose the UDP game port 7777 to the internet.
It looks like someone is mining crypto on the KF2 server (so far a joke, but not really).
The game is totally fine when the server is reachable only from my local network.
Unfortunately, I have no idea how to fix this, or whether you can even fix it. But the KF2 server is unusable in this state.

Collaborator

bviktor commented Mar 1, 2023

Huh, so maybe that's why I saw a lot of discussions about DDoS protection on the TWI forums...

So maybe we're being flooded with bullcrap? That would explain why my new server was usable the other day.

Whatever the case, implementing rate limits on the exposed port would be a good idea, so I'll see what I can do about it.

Collaborator

bviktor commented Mar 1, 2023

And also thanks a lot for your reports!

Author

triuk commented Mar 1, 2023

Yes, it can be some kind of DDoS. A firewall solution (if even possible) is just a workaround. Thank you for the info, I'll try it later, I am going to bed now :)
Nevertheless, TWI must patch this, because it is their application that is being exploited and eats an absurd amount of CPU and RAM.

Collaborator

bviktor commented Mar 1, 2023

Unfortunately in this day and age DDoS protection is not optional :)

I already have something in my mind - rate limit for connections on KF2 ports, then log with firewalld the ones that got rejected, and then fail2ban those IPs for a day or so.

Author

triuk commented Mar 2, 2023

There is a solution, updated today. I still think TWI should resolve it on the KF2 server side, but that is probably a vain hope, since the problem started in 2021.
https://forums.tripwireinteractive.com/index.php?threads/kf2-or-any-unreal-engine-3-server-on-redhat-centos-rocky-alma-linux-ddos-defense-with-the-help-of-firewalld.2337631/

Author

triuk commented Mar 2, 2023

OK, so I installed the needed packages:
sudo apt install cron firewalld
Then I ran every following command as root (sudo):
crontab -e
and put this in the crontab (the path points to the Launch.log file):

*/20 * * * * tail -5000 /home/steam/Steam/KF2Server/KFGame/Logs/Launch.log|grep -F -A2 'Connection timed out after'|awk -F" |:" '/Close/ {a[$7]++} END {for (b in a) {if (a[b]>4) {print b}}}'|uniq|while read ip; do firewall-cmd --permanent --ipset=networkblock --add-entry=$ip/20;done && firewall-cmd --reload >/dev/null 2>&1
0 6 */3 * * firewall-cmd --ipset=networkblock --get-entries|while read ip; do firewall-cmd --permanent --ipset=networkblock --remove-entry=$ip;done;firewall-cmd --permanent --ipset=networkblock --add-entry=128.116.0.0/17 >/dev/null 2>&1

after that run these commands (I use default port 7777):

firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -p udp --dport 7777 -m connlimit --connlimit-above 5 --connlimit-mask 20 -j DROP
firewall-cmd --permanent --new-ipset=networkblock --type=hash:net
firewall-cmd --permanent --zone=drop --add-source=ipset:networkblock

and finally allow desired ports, for me they are:

firewall-cmd --add-port=7777/udp --permanent
firewall-cmd --add-port=27015/udp --permanent
firewall-cmd --add-port=8080/tcp --permanent

Restart the firewall and done:
systemctl restart firewalld

tl;dr, it works. There is still overhead, but the server is usable and my friends can connect. According to the author, the banlist is persistent, so maybe there will be less overhead in the future, when the banlist is more complete.
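
The first cron entry above is dense, so here is a minimal sketch of just its counting step, pulled out into a readable function. It assumes the addresses have already been extracted from Launch.log, one per line; the function name `repeat_offenders` and the sample addresses (taken from the RFC 5737 documentation ranges) are made up for illustration. The threshold of 4 mirrors the `a[b]>4` test in the one-liner.

```shell
#!/bin/sh
# Print every address that appears more than `threshold` times on stdin.
# Assumption: input is one address per line (already extracted from the log).
repeat_offenders() {
    threshold="$1"
    awk -v max="$threshold" '
        NF  { seen[$1]++ }                      # count each non-empty line
        END { for (ip in seen) if (seen[ip] > max) print ip }
    ' | sort                                    # stable order for readability
}

# Example: 192.0.2.10 appears 5 times, 198.51.100.7 only once.
printf '%s\n' 192.0.2.10 192.0.2.10 198.51.100.7 \
              192.0.2.10 192.0.2.10 192.0.2.10 |
    repeat_offenders 4
```

With the sample input this prints only 192.0.2.10. Because the keys of an awk array are already unique, the `uniq` stage in the original one-liner is harmless but not needed.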

Collaborator

bviktor commented Mar 3, 2023

Permanently banning IPs is not a good idea in general, since public IPs often change hands.

I'm trying to implement some kind of rate limiting. Will get back to you soon.
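
To put a number on how broad those bans are: the forum script adds each offender as `$ip/20` and uses `--connlimit-mask 20`, so every entry covers a whole /20 network rather than one host. A small sketch of the arithmetic follows; the helper names `network_of` and `addresses_in` and the sample address are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Network address of an IPv4 address under a given prefix length.
network_of() {
    ip="$1"; prefix="$2"
    old_ifs="$IFS"; IFS=.
    set -- $ip                       # split the dotted quad into $1..$4
    IFS="$old_ifs"
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
}

# Number of addresses covered by a prefix length.
addresses_in() {
    echo $(( 1 << (32 - $1) ))
}

network_of 203.0.113.77 20   # prints 203.0.112.0/20
addresses_in 20              # prints 4096
addresses_in 17              # prints 32768
```

So one banned address takes 4096 neighbours with it, which is part of why long-lived entries are risky when public IPs change hands.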

bviktor added a commit that referenced this issue Mar 3, 2023
Collaborator

bviktor commented Mar 3, 2023

This is an initial stab at it. For now it seems to be working, but we will find out in the coming days.

As for you, you already made several manual changes, so I'm afraid there's no easy way to test this, since your changes will probably interfere.

bviktor added a commit that referenced this issue Mar 4, 2023
bviktor added a commit that referenced this issue Mar 4, 2023
bviktor added a commit that referenced this issue Mar 4, 2023
bviktor added a commit that referenced this issue Mar 4, 2023
bviktor added a commit that referenced this issue Mar 4, 2023
Refs #70
Refs #73
bviktor added a commit that referenced this issue Mar 4, 2023
bviktor added a commit that referenced this issue Mar 4, 2023
@bviktor bviktor added this to the 3.0 milestone Mar 5, 2023
Collaborator

bviktor commented Mar 5, 2023

Things kinda settled down I think, so if you get the chance to test it out from scratch sometime, please report back :)

Collaborator

bviktor commented Mar 5, 2023

Well, the error disappeared after a longer time. It took longer than expected. It is downloaded. False alarm. Maybe it is related, maybe not. I'll just post everything suspicious I find. I did a clean install in VBox Ubuntu 22.04 server (faster to do things on my workstation than the potato low-power server). Here is the output of klf status workshop, nothing is downloaded.

------------------------------------------------------
Subscribed workshop maps:                             
------------------------------------------------------
838775511       KF-HorzineArenaRMEdition        ❌❌🌐
1210703659      KF-KillingPool                  ❌❌🌐
------------------------------------------------------
Subscribed workshop mutators:                         
------------------------------------------------------
2625647922      N/A                             ❌❌🌐
2875147606      N/A                             ❌❌🌐
find: ‘/home/steam/Steam/KF2Server/Binaries/Win64/steamapps/workshop/content/232090’: No such file or directory
find: ‘/home/steam/Steam/KF2Server/KFGame/Cache’: No such file or directory

For the record, these are valid issues as well, please see #72 and #75. But they're unrelated to the CPU/RAM problem.

@bviktor bviktor removed this from the 3.0 milestone Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
Refs #70
Refs #73
bviktor added a commit that referenced this issue Mar 7, 2023
bviktor added a commit that referenced this issue Mar 7, 2023
Author

triuk commented Mar 8, 2023

Hi, I tried your workaround long term and I hate to write this, but your solution does not work well for me.
Comparison of the latest 52e4884 and the pre-DDoS 00c3c11 with the IP ban solution from the forum:

  1. Just a thought: the 4 default workshop items take ages to download, but they do download eventually. That affects both versions, so it is probably unrelated to the DDoS?

  2. The main difference is in the webadmin responsiveness. With 52e4884 it takes seconds to load the page, sometimes it does not load at all and I need to reload it. The chat console at the bottom permanently shows a "page not found" error.
    I do not see this with the IP ban solution at all.

For the record, I do those tests on my workstation with Ryzen 5 5600H, so there is not a lack of resources.

Collaborator

bviktor commented Mar 8, 2023

Thanks for your response!

Uh, yeah, maybe I was a bit foolish to take an nginx reverse proxy for granted.

If webadmin is slow, then you're probably hitting the rate limits over HTTP.

Would you be so kind as to reinstall and retry with the rate limit increased to dunno, maybe 50/m? Here:

https://github.com/noobient/killinuxfloor/blob/master/roles/install/tasks/firewalld.yml#L45
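
For anyone who wants to experiment with different rates outside the playbook, here is a rough sketch of what a rate-limited accept rule can look like in firewalld's rich language. It is an assumption that this resembles the rule in the linked firewalld.yml; I have not reproduced that file here. The helper `rate_limit_rule` is hypothetical and only prints the command (a dry run), so nothing is changed until you run the printed line as root yourself.

```shell
#!/bin/sh
# Print (do not execute) a firewall-cmd call that accepts traffic to one
# port at up to the given rate. Arguments: port, protocol, rate (e.g. 50/m).
rate_limit_rule() {
    port="$1"; proto="$2"; rate="$3"
    printf "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" port port=\"%s\" protocol=\"%s\" accept limit value=\"%s\"'\n" \
        "$port" "$proto" "$rate"
}

# Dry run for the game port with the rate suggested above.
rate_limit_rule 7777 udp 50/m
```

After adding a permanent rule, `firewall-cmd --reload` is needed for it to take effect.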

Author

triuk commented Mar 8, 2023

Increasing to 50/m did not help much. The delay is unbearable, though the chat console sometimes comes to life.
But I tried removing 8080/tcp from your script, as TCP is not vulnerable to that type of attack, and just opened the port in the firewall with firewall-cmd --add-port=8080/tcp --permanent
This way it probably works like you intended, with a responsive web interface.


k0dat commented May 25, 2023

@bviktor,

I probably should have posted earlier in this thread. But anyway, for the last few months I've had my rate limit set at 20/m, and it seems like a more sensible value than the default 10/m. I found that with 10/m I was personally hitting the rate limit. I think it happened in the KF2 client while I was adjusting search parameters in the server browser, probably hitting refresh a few times, and my server didn't show up until I left it for a minute. I've had a friend report a similar issue even with 20/m; I think he was similarly adjusting parameters and spamming refresh. Which makes me think, should the limit be even higher than my 20/m?

Apart from a single person coming from a single IP, my thinking is that some people could be trying to join our servers at LAN parties from a shared IP address and would hit the limit when searching for the same server at the same time. There could also be some people on a shared IP via CGNAT, but given the small KF2 population I'd think that would be less likely than a LAN party environment, though still possible. So perhaps some sort of temporary IP ban might be worth considering when hitting a higher limit? Below are DDoS stats from one of my servers. There's only a relatively small number of IPs being denied compared to the number of overall requests.


Today's DDoS stats:

Denied packets: 2,564,918
Unique IPs: 255
Log size: 587M
Log throttled: yes
Log limits: 20000 allowed within 600 seconds
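
To make the "small number of IPs" point concrete, the two headline figures above can be divided out. This is only back-of-the-envelope integer arithmetic on the numbers quoted in the stats; the per-second figure additionally assumes the stats cover a full 24 hours, which the output does not say.

```shell
#!/bin/sh
denied=2564918   # "Denied packets" from the stats above
ips=255          # "Unique IPs" from the stats above

# Average denied packets per source address (integer division).
echo "denied per IP: $(( denied / ips ))"            # prints 10058

# Average denied packets per second, ASSUMING a full 24-hour window.
echo "denied per second: $(( denied / 86400 ))"      # prints 29
```

Roughly ten thousand denied packets per address is far beyond anything a player refreshing the server browser would produce, which fits the idea of temporarily banning only the sources that exceed a much higher limit.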

Author

triuk commented May 25, 2023

Hi, yeah the 10/m is too low as my own server kicked me out a few times :P

  • Would it be possible to make the rate tunable, e.g. klf setrate 20, for much faster testing?

  • Also, could you remove the rate limit for TCP, as it is solely a UDP attack anyway? The web interface does not work when I host it on the same machine.


k0dat commented May 25, 2023

@triuk - Regarding your second point with web admin, I might be able to provide some guidance. From your earlier post you mentioned port 8080, so it sounds like you're exposing web admin to the Internet over HTTP? If so, I highly recommend against this. What you need to do is set up a reverse proxy, ideally with HTTPS. I'm using NGINX as a reverse proxy with HTTPS. It's not that hard to set up, and I can provide you with some of my setup notes if you're interested.

Author

triuk commented May 25, 2023

Let's continue in the discussion: #83 (comment)
