Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batching the probes to spread the workload and more exact time readings #15

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

fmotrifork
Copy link
Contributor

When having many probes on a single board instance, the current implementation results in probes waiting for each other and not for the actual services.
This results in many probes that are marked as "slow" even though the problem stems from to many open probes at once.

This PR batches the probes in 8 at a time.
It also splits the http probe time into two:

  • Time from request is created and until first byte is received from the server
  • Time from request is created and until the server finishes sending bytes.

This allows us to better understand if a slow reading is from infrastructure / loadbalancers or from a slow http service.

Copy link
Owner

@Lesterpig Lesterpig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this pull request!

It actually changes a lot of things in the project (Dockerfile, Makefile, logging, prometheus...).
I would only consider changing the manager.go and probe/http.go files in this PR.
If needed, please open new pull requests for the proposed changes.

I also have put a few comments in the modified files.
If you have some time, could you please update your pull request? Otherwise I can do it for you 😃

for category, services := range manager.Services {
for _, service := range services {
// Batching the probes to spread the workload
if math.Mod(i, 8) == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is there a reason to use a float for i?
We could use integers and check with if i % 8 == 0.

Moreover, the i = 0 line might not be necessary (or the condition can just be i >= 8).

TLSClientConfig: &tls.Config{InsecureSkipVerify: !opts.VerifyCertificate},
}
tr := http.DefaultTransport.(*http.Transport).Clone()
tr.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the opts.VerifyCertificate option has been forgotten?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants