Connecting to MongoDB Atlas seems to take ~40s every time #1112

Closed
Jasper-Bekkers opened this issue May 29, 2024 · 3 comments

@Jasper-Bekkers

Versions/Environment

  1. What version of Rust are you using? 1.78
  2. What operating system are you using? Windows
  3. What versions of the driver and its dependencies are you using? (Run cargo pkgid mongodb & cargo pkgid bson) 2.8.2; the problem persists on the main branch
  4. What version of MongoDB are you using? (Check with the MongoDB shell using db.version()) 7.3.2
  5. What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)? serverless

Describe the bug

Using this crate with a mongodb+srv:// connection string makes the ClientOptions::parse function take ~40s. Additionally, when connecting with mongodb:// it connects nearly instantly, but then fails with the following error message (unclear if related).

called `Result::unwrap()` on an `Err` value: Error { kind: ServerSelection { message: "Server selection timeout: No available servers. Topology: { Type: Unknown, Servers: [ { Address: evolve-highscores.4ngync0.mongodb.net:27017, Type: Unknown, Error: Kind: I/O error: No such host is known. (os error 11001), labels: {} } ] }" }, labels: {}, wire_version: None, source: None }

When connecting through MongoDB Compass it connects near instantly, when connecting through nodejs it connects near instantly.

...

use mongodb::options::ClientOptions;

#[tokio::main]
async fn main() {
    // Measure how long parsing the connection string takes
    // (for mongodb+srv:// this includes the DNS lookups).
    let k = std::time::Instant::now();
    println!("start: {:?}", k.elapsed());

    let _client_options = ClientOptions::parse("mongodb+srv://..mongodb.net....")
        .await
        .unwrap();

    println!("ClientOptions: {:?}", k.elapsed());
}

Digging in with the profiler it's showing something interesting:
Tokio seems to be mostly doing nothing for ~5 seconds in between bursts of ~1ms in which it does some DNS-related activity.

Data captured with Superluminal Profiler, red indicates "thread is stalling" and specifically in this case it's in SwapContext (win32)
image

If we zoom in a bit more we can see there are distinct sections of "nothing" for 5 seconds:

image

If we zoom in to what's going on in between these 5-second gaps it becomes a bit clearer:

image

Here we see that we end up in small calls to trust_dns. Right now for me it's unclear if this is a problem with trust-dns or if this is related to something this crate is doing. However the fact that nodejs and Compass don't have this behavior definitely seems to indicate that something is going wrong on the Rust side of things.
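One way to narrow this down is to time the operating system's own resolver from Rust, without trust-dns involved. The following is a minimal std-only sketch: the `timed` helper and the host string are illustrative assumptions, not part of the driver. Note that `to_socket_addrs` performs an ordinary address lookup rather than an SRV lookup, so it should be pointed at a host that has an address record.

```rust
use std::net::ToSocketAddrs;
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall time.
/// (Hypothetical helper, introduced only for this sketch.)
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

fn main() {
    // Placeholder host: substitute a hostname that has an A/AAAA record,
    // for example one of the hosts behind your SRV record.
    let host = "localhost:27017";

    // Resolve through the OS resolver and count the returned addresses.
    let (result, elapsed) = timed(|| host.to_socket_addrs().map(|addrs| addrs.count()));

    match result {
        Ok(n) => println!("OS resolver: {n} address(es) in {elapsed:?}"),
        Err(e) => println!("OS resolver failed after {elapsed:?}: {e}"),
    }
}
```

If this returns quickly while ClientOptions::parse still stalls, that would point at the resolver library's configuration rather than at the network or the DNS server.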

@Jasper-Bekkers
Author

Jasper-Bekkers commented May 29, 2024

Alright, it looks like the default timeout in trust-dns is 5 seconds, and it tries a bunch of things sequentially.

Changing AsyncResolver as follows and passing in a custom ResolverConfig speeds things up significantly.

    pub(crate) async fn new(config: Option<ResolverConfig>) -> Result<Self> {
        let resolver = match config {
            Some(config) => {
                let mut opts = trust_dns_resolver::config::ResolverOpts::default();
                opts.timeout = std::time::Duration::from_millis(100); // the change
                trust_dns_resolver::TokioAsyncResolver::tokio(config, opts).map_err(Error::from_resolve_error)?
            }
            None => trust_dns_resolver::TokioAsyncResolver::tokio_from_system_conf() // need to bypass this since it's using default values
                .map_err(Error::from_resolve_error)?,
        };

        Ok(Self { resolver })
    }
}

@abr-egn
Contributor

abr-egn commented May 30, 2024

Yup, this is unfortunately a known issue in trust-dns-resolver that we can't do much about :( https://crates.io/crates/mongodb#windows-dns-note

@abr-egn abr-egn closed this as completed May 30, 2024
@Jasper-Bekkers
Author

Jasper-Bekkers commented May 31, 2024

> Yup, this is unfortunately a known issue in trust-dns-resolver that we can't do much about :( https://crates.io/crates/mongodb#windows-dns-note

Thanks for linking the readme. I couldn't find an actual issue for this when looking, so it appears it's mostly untracked on the mongodb side.

Can we work with the trust-dns community to fix this? It seems like it would hamper prototyping and development quite a bit?

One of the questions that they had, that I can't answer for them, is where the 8x multiplier comes from. Their timeout is 5 seconds, yet with mongodb it's 40s. Additionally, why are we even running into timeout situations? If we feed this information back into trust-dns maybe we can make that crate slightly better for everyone.
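The arithmetic behind that multiplier can be sketched as follows. This is only a back-of-the-envelope model, assuming the stall is made up entirely of lookups that run one after another and each wait out the full timeout. The count of 8 is inferred from the observed ~40s, not taken from the trust-dns source, and both function names are hypothetical.

```rust
use std::time::Duration;

/// Worst-case wall time when `attempts` lookups run sequentially
/// and each one waits out the full `per_attempt` timeout.
fn worst_case_wait(per_attempt: Duration, attempts: u32) -> Duration {
    per_attempt * attempts
}

/// Number of full timeouts that fit in an observed stall (rounded down).
/// `per_attempt` must be non-zero.
fn implied_attempts(observed: Duration, per_attempt: Duration) -> u32 {
    (observed.as_millis() / per_attempt.as_millis()) as u32
}

fn main() {
    let default_timeout = Duration::from_secs(5); // trust-dns default, per this thread
    let observed = Duration::from_secs(40); // ~40s reported in this issue

    let attempts = implied_attempts(observed, default_timeout);
    println!("implied sequential timeouts: {attempts}");

    // The same number of attempts with the 100ms timeout tried earlier in the thread.
    let patched = worst_case_wait(Duration::from_millis(100), attempts);
    println!("worst case with a 100ms timeout: {patched:?}");
}
```

Under this model, lowering the timeout only shortens each stall. It does not explain why the lookups time out in the first place, which is the open question above.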
