Failing to mount XFS volumes #2110
/assign
By the way, these are the logs from the CSI driver when a new XFS volume fails to be mounted:
Manually re-formatting the XFS volume on the worker node with the host's `mkfs.xfs` allows it to be mounted successfully.
Thanks for the very detailed bug report @mpb10. This issue is caused by a compatibility mismatch between the version of xfsprogs used by the driver and the kernel version on the worker nodes: the driver uses xfsprogs v5.18, which formats XFS volumes with features requiring kernel v5.18 or higher. However, as noted above, the custom Ubuntu 20.04 worker nodes are running an older kernel (v5.4), which does not support the newer XFS features. Relevant:
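The mismatch described above can be checked ahead of time. This is a minimal sketch, not part of the driver: the `kernel_at_least` helper and the `5.18` threshold (taken from the comment above) are assumptions; the real minimum depends on which XFS feature flags the bundled `mkfs.xfs` actually enables.

```shell
# Sketch: check whether a node's kernel is new enough for the XFS
# features the driver's bundled mkfs.xfs enables by default.
# kernel_at_least is a hypothetical helper; 5.18 follows the comment
# above and may differ depending on the enabled feature flags.
kernel_at_least() {
  # usage: kernel_at_least <kernel-version> <minimum>
  # sort -V orders version strings; if the minimum sorts first (or
  # equal), the kernel meets the requirement.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if kernel_at_least "$(uname -r | cut -d- -f1)" 5.18; then
  echo "kernel supports the driver's default XFS format"
else
  echo "kernel too old: pre-format volumes or wait for the opt-out option"
fi
```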
The fact that manually reformatting the volume with the host's older xfsprogs (v5.3.0) resolves the issue further confirms that the problem lies in the kernel's inability to mount volumes formatted by the newer xfsprogs version used by the driver. Ideally, the best solution here would be upgrading the kernel or using an AMI that includes a more recent kernel version : ) I understand this may be challenging or not feasible. In that case, as far as workarounds go, formatting the volumes with the older xfsprogs version available on the host before they are mounted by the driver (as you are currently doing), or using statically provisioned volumes that are pre-formatted, would be viable options. I'll discuss this pain point with the team during our next sync-up and follow up here with the long-term view for this class of issue.
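The statically-provisioned workaround mentioned above could look roughly like this: format the EBS volume out-of-band with the host's older `mkfs.xfs`, then hand it to the driver as a pre-existing `PersistentVolume` so it is only mounted, never formatted. All names and the volume ID below are placeholders.

```yaml
# Sketch of a statically provisioned, pre-formatted XFS volume.
# The volume referenced by volumeHandle was formatted beforehand
# with the host's (older) mkfs.xfs, so the driver will not format it.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: preformatted-xfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0123456789abcdef0  # placeholder EBS volume ID
    fsType: xfs
```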
Thank you for the fast response to this! To add some context, the Ubuntu 20.04 kernel version that we're using is their FIPS-enabled kernel, which only goes up to v5.4. Also, having to manually format our volumes during the provisioning process is rather inefficient and breaks up our automation workflows, so this solution isn't ideal either. I think having the ability to build our own version of the CSI driver image with an older version of xfsprogs would work well for us. Thanks!
Thanks for that feedback @mpb10, it's very helpful. The team is looking to implement a new optional parameter on the node plugin to let users disable some of the newer XFS formatting features. This should solve the compatibility issues you're seeing with the older kernel. To be clear, this will be an opt-in feature: we're doing it this way to preserve existing behavior and, more importantly, because disabling the newer XFS features may result in other compatibility issues down the road. Relevant WIP PR: #2121 - feel free to leave further feedback/questions either here or directly on the PR.
This is great! This solution will work perfectly for us, and we eagerly await it. Thank you very much @torredil and team!
Hi @torredil, is there an ETA for this feature?
@chethan-das We're actively working on this feature and hope to release it in the near future, but we won't have a firm ETA until it's fully ready and tested. I (or somebody else from the team) will update this issue when we have a firm ETA or other information available.
/close

This should be fixed in aws-ebs-csi-driver v1.35.0, and has been tested by a user on nodes with Linux kernel versions ≤ 5.4. Thank you for raising this issue! Please set the new node option to enable this behavior; see our driver options documentation or PR #2121 for more details.
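For Helm users, enabling the option might look like the fragment below. The exact value name is an assumption on my part (shown here as `legacyXFS`); confirm it against the driver options documentation or PR #2121 before use.

```yaml
# Hypothetical Helm values sketch for opting into the legacy XFS
# format on older kernels. The key name legacyXFS is an assumption;
# verify against the driver's documented options.
node:
  legacyXFS: true
```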
@AndrewSirenko: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/kind bug
We're running into an issue with the `aws-ebs-csi-driver` on an Ubuntu 20.04 worker node where we believe it is incorrectly formatting XFS volumes, and therefore they can't be mounted in the pod's containers.

What happened?

We get the following pod error trying to create and mount an XFS volume to a pod that is running on an Ubuntu 20.04 worker node:
Events:
What you expected to happen?
We expect the XFS volume to be mounted automatically by the CSI driver without any errors.
How to reproduce it (as minimally and precisely as possible)?
We use the following manifest to reproduce the error:
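The original manifest is not shown in this transcript. A minimal equivalent setup would be a StorageClass requesting `xfs` as the filesystem type, plus a PVC and pod that consume it; all names and the image below are placeholders.

```yaml
# Sketch of a minimal reproduction (names/image are placeholders):
# a StorageClass that asks the EBS CSI driver for an XFS filesystem,
# a claim against it, and a pod that mounts the claim.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-xfs
provisioner: ebs.csi.aws.com
parameters:
  csi.storage.k8s.io/fstype: xfs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: xfs-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-xfs
  resources:
    requests:
      storage: 4Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: xfs-test
spec:
  containers:
    - name: app
      image: public.ecr.aws/amazonlinux/amazonlinux:2023
      command: ["sleep", "infinity"]
      volumeMounts:
        - mountPath: /data
          name: xfs-vol
  volumes:
    - name: xfs-vol
      persistentVolumeClaim:
        claimName: xfs-claim
```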
Our worker nodes are Ubuntu 20.04 AWS EC2 instances running in FIPS mode. This issue does not occur on a different OS, such as SUSE chost images. We also tried disabling FIPS mode, however that didn't make a difference:
We've ensured that our `xfsprogs` package, which supplies `mkfs.xfs`, is up-to-date:

We're using the following Kubernetes version:
We've replicated this on `aws-ebs-csi-driver` versions `v1.29.0` and `v1.33.0`. Changing the CSI driver version doesn't seem to make a difference. We also don't see any mention of this issue in the changelog.

Anything else we need to know?:
This error does not occur if the volume type is EXT4 or if we change the worker node OS to something other than Ubuntu 20.04.
We believe that the CSI driver is formatting the XFS volumes incorrectly. This is because, if we SSH into the worker node and manually re-format the XFS volume that is failing to mount, it can then be mounted by the `aws-ebs-csi-driver` without any issues. This leads us to believe that the `aws-ebs-csi-driver` is incorrectly formatting the XFS volume before attempting to mount it.

Additionally, the `xfs_info` values for the XFS volume are slightly different when the CSI driver formats it versus when it's manually re-formatted. Both formatting commands use the default parameters for `mkfs.xfs`.
This is the `xfs_info` of the XFS volume after it is formatted by the `aws-ebs-csi-driver`:

And this is the `xfs_info` of the same XFS volume after it is re-formatted manually and can be mounted without issue. Notice that the only value that changed is `blocks` under the `log` section:

As a temporary workaround, we've created a DaemonSet that watches the worker nodes for newly created XFS volumes, tests whether they mount successfully, and re-formats them automatically if they don't. While this does work, it's risky and not production-worthy.
Environment

Kubernetes version (use `kubectl version`):

Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3

Driver version: `v1.29.0` and `v1.33.0`
Thanks!