-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CSI driver] Implementation of staged mount for vhost block and FS volumes in VM mode #1982
Merged
qkrorlqr
merged 19 commits into
ydb-platform:main
from
aikuchin:users/antonkuchin/csi-stage-volumes
Sep 30, 2024
Merged
[CSI driver] Implementation of staged mount for vhost block and FS volumes in VM mode #1982
qkrorlqr
merged 19 commits into
ydb-platform:main
from
aikuchin:users/antonkuchin/csi-stage-volumes
Sep 30, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Hi! Thank you for contributing! |
Direct comparison can fail for wrapped errors, use errors.Is(err, target) instead. Signed-off-by: Anton Kuchin <[email protected]>
This way it is more obvious that they are unused and makes the linter happy. Signed-off-by: Anton Kuchin <[email protected]>
This is an impossible code path now but someday this can change and otherwise function is ready to have nil mnt, so fix one more condition. Signed-off-by: Anton Kuchin <[email protected]>
aikuchin
force-pushed
the
users/antonkuchin/csi-stage-volumes
branch
2 times, most recently
from
September 23, 2024 05:09
2271f27
to
6381d5d
Compare
aikuchin
changed the title
[RFC] [CSI driver] PoC of staged mount for vhost block volumes in VM mode
[CSI driver] PoC of staged mount for vhost block volumes in VM mode
Sep 23, 2024
qkrorlqr
added
ok-to-test
Label to approve test launch for external members
large-tests
Launch large tests for PR
labels
Sep 23, 2024
github-actions
bot
removed
the
ok-to-test
Label to approve test launch for external members
label
Sep 23, 2024
aikuchin
changed the title
[CSI driver] PoC of staged mount for vhost block volumes in VM mode
[CSI driver] Imlementation of staged mount for vhost block volumes in VM mode
Sep 23, 2024
aikuchin
changed the title
[CSI driver] Imlementation of staged mount for vhost block volumes in VM mode
[CSI driver] Implementation of staged mount for vhost block volumes in VM mode
Sep 23, 2024
aikuchin
changed the title
[CSI driver] Implementation of staged mount for vhost block volumes in VM mode
[CSI driver] Implementation of staged mount for vhost block and FS volumes in VM mode
Sep 23, 2024
Signed-off-by: Anton Kuchin <[email protected]>
Some errors are expected or don't matter, so ignore them explicitly. Signed-off-by: Anton Kuchin <[email protected]>
Signed-off-by: Anton Kuchin <[email protected]>
This will be required later to add staged mount for vhost endpoints Signed-off-by: Anton Kuchin <[email protected]>
Volume context and capabilities should be reused in Stage and Publish Move most of other constants from arguments to variables. Signed-off-by: Anton Kuchin <[email protected]>
This will be used later when part of NodePublish logic for vhost endpoints will be moved to NodeStage. Signed-off-by: Anton Kuchin <[email protected]>
Before we assumed that pod ID is always present and can be used as a part of mount path, but if endpoint is started in NodePublish podId is not known. For such cases return staging directory name where endpoint will be started. Signed-off-by: Anton Kuchin <[email protected]>
aikuchin
force-pushed
the
users/antonkuchin/csi-stage-volumes
branch
from
September 25, 2024 17:53
6381d5d
to
aa98a8f
Compare
tpashkin
reviewed
Sep 26, 2024
…anceId attribute Signed-off-by: Anton Kuchin <[email protected]>
Also rename all socketDir variables to endpointDir for conststency. Signed-off-by: Anton Kuchin <[email protected]>
In most callsites we already know endpointDir, so just accept it as an argument instead of rebuilding every time. Signed-off-by: Anton Kuchin <[email protected]>
This keeps apart codepaths that require podId separate from ones that rely on instanceID. It also allows to remove temporary hack from getInstanceOrPodId and rename it back to getPodId. Signed-off-by: Anton Kuchin <[email protected]>
This way instanceId variable can be used only if it is not empty and error handling happens only if any of stage functions was called. Signed-off-by: Anton Kuchin <[email protected]>
We will need these parameters later to properly stop endpoints during NodeUnstage. This also allows to skip unnecessary calls in unstage of old volumes. Signed-off-by: Anton Kuchin <[email protected]>
We had all staged disk on the node to be in shared staging directory, but now we have all necessary information in stage record to get instanceId in NodeUnstage so we can group disks in instance directory. Old path looked like: /...NBS.../sockets/${POD_ID}/${VOLUME_ID}/[nbs,nfs].sock Intermediate path looked like: /...NBS.../sockets/staging/${VOLUME_ID}/[nbs,nfs].sock New path will look like: /...NBS.../sockets/${INSTANCE_ID}/${VOLUME_ID}/[nbs,nfs].sock Signed-off-by: Anton Kuchin <[email protected]>
aikuchin
force-pushed
the
users/antonkuchin/csi-stage-volumes
branch
from
September 27, 2024 09:16
aa98a8f
to
fc02929
Compare
According to spec target directory must be removed after NodeUnstage but now mounter mock doesn't delete it during mount point cleanup. If directory deletion is enabled then other test breaks because it relies on previous tests leaving temoprary directories. This will be fixed separately form this PR. Signed-off-by: Anton Kuchin <[email protected]>
We use the same code for all types of volumes in NodeUnpublish because we have no good way to get their type. Add comment to explain why StopEndpoint calls in NodeUnpublishVolume has no effect for staged disks. Signed-off-by: Anton Kuchin <[email protected]>
aikuchin
force-pushed
the
users/antonkuchin/csi-stage-volumes
branch
from
September 27, 2024 09:26
fc02929
to
e7c8ad8
Compare
antonmyagkov
approved these changes
Sep 27, 2024
tpashkin
added
blockstore
Add this label to run only cloud/blockstore build and tests on PR
ok-to-test
Label to approve test launch for external members
labels
Sep 27, 2024
github-actions
bot
removed
the
ok-to-test
Label to approve test launch for external members
label
Sep 27, 2024
drbasic
approved these changes
Sep 30, 2024
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
blockstore
Add this label to run only cloud/blockstore build and tests on PR
large-tests
Launch large tests for PR
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously endpoints were started on NodePublish phase, this breaks hotplug of multiple disks to the same VM (causes recreation of endpoints when helpler pod chandges, and ends up in deadlock attemptiong to create new endpoint before the old wan was stopped).
This commit splits volume mount between NodeStage and NodePublish phases. So endpoint is started in NodeStage, and NodePublish just creates mount to requested pod. With this split change of hotplug pod doesn't cause endpoint stop (because it doesn't trigger NodeUnpublish) but intead results in remounts, avoiding deadlock and reconnects.
What is unclear:
should stage mount be implemented for other types (FS, nbd?)Decided that only vhost endpoints (NBS and NFS) should use new scheme
how to get podId in NodeStage phase (from volume attributes?)We don't use podId at all, use instead instanceId from volume attributes, if it is not provided then use old scheme
how to make tranfer from old scheme seamlessVolumes that have instanceId attrubute will use new scheme, others will continue using old scheme
should we use separate stage directory or just use s.socketsDirGrouped volumes in instanceId directories