Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NBS-4681: cache DA sessions in a file #41

Merged
merged 1 commit into from
Jan 9, 2024
Merged

Conversation

sharpeye
Copy link
Collaborator

@sharpeye sharpeye commented Jan 4, 2024

No description provided.

@sharpeye sharpeye force-pushed the NBS-4681-session-cache branch 2 times, most recently from ed5f8bb to 9df623b Compare January 5, 2024 14:13
Copy link
Contributor

github-actions bot commented Jan 5, 2024

Note

This is an automated comment that will be appended during run.

🔴 linux-x86_64-relwithdebinfo: some tests FAILED for commit f094392.

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
4645 4551 0 22 0 72

@@ -455,6 +460,8 @@ TFuture<TInitializeResult> TDiskAgentState::Initialize(
Logging->CreateLog("BLOCKSTORE_DISK_AGENT"));

InitRdmaTarget(std::move(rdmaTargetConfig));

RestoreSessions(*DeviceClient);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeviceClient it's already member of TDiskAgentState so you can avoid passing it to member functions RestoreSessions and UpdateSessionCache

Copy link
Collaborator Author

@sharpeye sharpeye Jan 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a feature :) Inside RestoreSessions and UpdateSessionCache there is no need to check the validity of DeviceClient.

std::make_move_iterator(session.MutableDeviceIds()->begin()),
std::make_move_iterator(session.MutableDeviceIds()->end()));

CheckError(client.AcquireDevices(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before calling to client.AcquireDevices we need to filter out uuids from old devices that were removed from the disk agent.
Otherwise disk agent can be stopped and some change in config will remove device. Then we will try to acquire the old device from the saved sessions file and it will fail because of the code here https://github.com/ydb-platform/nbs/blob/main/cloud/blockstore/libs/storage/disk_agent/model/device_client.cpp#L40

Also if we won't prune old devices that stopped to be part of the disk agent config we might end with large sessions file containing sessions with invalid device uuids

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tweaked the code so that only broken sessions will be rejected now. There is no problem with old devices because UpdateSessionCache rewrites the cache file completely with actual sessions.

budevg
budevg previously approved these changes Jan 8, 2024
@@ -21,6 +21,51 @@ TDeviceClient::TDeviceClient(
, Log(std::move(log))
{}

TVector<NProto::TDiskAgentDeviceSession> TDeviceClient::GetSessions() const
{
THashMap<TString, NProto::TDiskAgentDeviceSession> sessions;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

а зачем дедупликация через мапу? разве у нас могут быть дубли?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Это группировка по clientId:

uuid-1 : writer-1, reader-1
uuid-2 : writer-2
uuid-3 : writer-1, reader-1

в

reader-1: uuid-1, uuid-3
writer-1: uuid-1, uuid-3
writer-2: uuid-2

@qkrorlqr qkrorlqr self-assigned this Jan 9, 2024
@sharpeye sharpeye merged commit abd42ec into main Jan 9, 2024
4 of 5 checks passed
@sharpeye sharpeye deleted the NBS-4681-session-cache branch January 9, 2024 16:46
sharpeye added a commit that referenced this pull request Jan 9, 2024
NBS-4681: cache DA sessions in a file (#41)
sharpeye added a commit that referenced this pull request Jan 9, 2024
sharpeye added a commit that referenced this pull request Jan 10, 2024
sharpeye added a commit that referenced this pull request Jan 11, 2024
* NBS-4681: cache DA sessions in a file (#41)

* [tests] set service_local timeout to 3min (#65)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants