Mixed mount of HugePages leads to process hang #1741

Open
tiagolobocastro opened this issue Sep 20, 2024 · 0 comments
Labels
BUG Something isn't working

Comments

@tiagolobocastro
Contributor

Describe the bug
On k8s, the io-engine container hangs while trying to flock /dev/hugepages.

To Reproduce
Set up your system with a default hugepage size of 1GiB and allocate 2MiB hugepages.

Expected behavior
The io-engine container should start normally instead of hanging.

Additional context
The problem is that we end up with both 1Gi and 2Mi hugepages mounted on /dev/hugepages.

/ # mount | grep huge
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=1024M)
nodev on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)

EAL init is not able to cope with this and hangs as soon as it tries to flock /dev/hugepages.
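To check whether a node or container is affected, the `mount` output can be scanned for a path that carries hugetlbfs mounts of more than one page size. The following is a rough sketch only: the function name `find_mixed_hugepage_mounts` and the regular expression are my own, and it assumes the `mount` output format shown above.

```python
import re
from collections import defaultdict

# Matches lines printed by `mount`, for example:
# "hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=1024M)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def find_mixed_hugepage_mounts(mount_output: str) -> dict:
    """Return mount points that carry hugetlbfs mounts of more than one page size."""
    sizes_by_path = defaultdict(list)
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match is None or match.group(3) != "hugetlbfs":
            continue  # not a hugetlbfs mount (or not a mount line at all)
        path = match.group(2)
        for option in match.group(4).split(","):
            if option.startswith("pagesize="):
                sizes_by_path[path].append(option.split("=", 1)[1])
    # Keep only paths where at least two different page sizes are stacked.
    return {p: s for p, s in sizes_by_path.items() if len(set(s)) > 1}


# The two lines reported in this issue.
SAMPLE = """\
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=1024M)
nodev on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
"""

if __name__ == "__main__":
    print(find_mixed_hugepage_mounts(SAMPLE))
    # prints {'/dev/hugepages': ['1024M', '2M']}
```

An empty result means no path has mixed page sizes. A non-empty result matches the situation described here.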

I think the fix is to ensure we mount 1Gi and 2Mi hugepages on separate mount points, for example:

        - name: hugepage
          mountPath: /dev/hugepages-2MiB

In fact, the helm chart is not very flexible today, as it simply specifies 2MiB hugepages. We should add some values to allow choosing which hugepage size to use...
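As a rough illustration of what such a setting could drive, here is a sketch of how a chosen size might map to a size-specific volume and mount path. Everything in it is an assumption for discussion, not what the chart does today: the function `hugepage_volume`, the `SUPPORTED_SIZES` table and the `/dev/hugepages-1GiB` path are hypothetical. Only the volume name `hugepage` and the `/dev/hugepages-2MiB` path come from the example above. The `HugePages-<size>` medium and `hugepages-<size>` resource names follow the Kubernetes conventions.

```python
# Hypothetical mapping from a chart value such as "2Mi" to the suffix
# used in the mount path (only the 2MiB path appears in this issue).
SUPPORTED_SIZES = {"2Mi": "2MiB", "1Gi": "1GiB"}


def hugepage_volume(size: str, name: str = "hugepage") -> dict:
    """Build pod-spec fragments that mount one hugepage size on its own path."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported hugepage size: {size}")
    return {
        # emptyDir volume backed by hugepages of exactly this size
        "volume": {"name": name, "emptyDir": {"medium": f"HugePages-{size}"}},
        # size-specific mount path, so 1Gi and 2Mi never share /dev/hugepages
        "volumeMount": {
            "name": name,
            "mountPath": f"/dev/hugepages-{SUPPORTED_SIZES[size]}",
        },
        # resource name to request under the container's limits
        "resource": f"hugepages-{size}",
    }


if __name__ == "__main__":
    print(hugepage_volume("2Mi")["volumeMount"])
    # prints {'name': 'hugepage', 'mountPath': '/dev/hugepages-2MiB'}
```

The point of the sketch is that the volume medium, the mount path and the resource request all follow from a single size value, which avoids mismatched combinations.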

@tiagolobocastro tiagolobocastro added the BUG Something isn't working label Sep 20, 2024