-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add dns record delete validation #221
Conversation
* Fix an issue with the doamin owners checks when multiple instances are being used, allOwners was being appended with multiple duplicates. * Add additional checks to ensure all remaining records are eventually updated with the correct domain owners. * Add logging of time taken around dnsrecord create/update/delete checks. Signed-off-by: Michael Nairn <[email protected]>
dfb7260
to
7e5df90
Compare
// QueuedAt is a time when DNS record was received for the reconciliation | ||
// QueuedAt is a time when DNS record was received for the reconciliation. | ||
// During deletion QueuedAt is used to track the last time a change was detected when processing the delete request. | ||
// Used for validation purposes to ensure the record is completely removed from the provider. | ||
QueuedAt metav1.Time `json:"queuedAt,omitempty"` | ||
|
||
// QueuedFor is a time when we expect a DNS record to be reconciled again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think QueuedFor
is used anywhere, should it be removed(in a different PR)?
// QueuedAt is a time when DNS record was received for the reconciliation | ||
// QueuedAt is a time when DNS record was received for the reconciliation. | ||
// During deletion QueuedAt is used to track the last time a change was detected when processing the delete request. | ||
// Used for validation purposes to ensure the record is completely removed from the provider. | ||
QueuedAt metav1.Time `json:"queuedAt,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure about re-using this for a different purpose, but it didn't seem like a great idea adding yet another time field just for tracking the delete updates.
7e5df90
to
7f8f5f0
Compare
Adds validation around the deletion of DNSRecord resources to reduce the chances of unexpected records being left in the external dns provider (AWS Routes53 etc..) after the record has been garbage collected. It now takes into account the possibly that multiple records could be manipulating the same zone record set simultaneously which can happen if a significant number of DNSRecord resources were contributing to the same rootHost record set in the provider and were all deleted at roughly the same time. The deletion logic was changed to always queue for another reconciliation loop an amount of time (currently 15 seconds) after the last time a change was detected. A change can either be that it had to update the provider itself or that some status on the record was changed, the most important one being `Status.DomainOwners`. If either of these changes occur the delete validation loop of 15 seconds is restarted and will only move on to remove the finalizer once it reaches the 15 seconds duration without any changes having been made. Note: Some providers, such as google, handle this better as they require the API requests to know the current state of the record you are updating/deleting before it will be accepted. Others such as AWS(Route53) unfortunately do not meaning that some things one record just deleted, may get added back by another depending on when it last queried the providers zone records. Signed-off-by: Michael Nairn <[email protected]>
7f8f5f0
to
27d5067
Compare
closes #181
fix e2e multi record test issues
feat: Add dns record delete validation
Adds validation around the deletion of DNSRecord resources to reduce the chances of unexpected records being left in the external dns provider (AWS Routes53 etc..) after the record has been garbage collected. It now takes into account the possibly that multiple records could be manipulating the same zone record set simultaneously which can happen if
a significant number of DNSRecord resources were contributing to the same rootHost record set in the provider and were all deleted at roughly the same time.
The deletion logic was changed to always run validation checks, at random intervals, for an amount of time (currently 15 seconds) after the last time a change was detected. A change here can either be that it had to update the provider itself or that some status on the record was changed, the most important one being
Status.DomainOwners
. If either of these changes occur the delete validation loop of 15 seconds is restarted and will only move on to remove the finalizer once it reaches the 15 seconds duration without any changes having been made.Note: Some providers, such as google, handle this better as they require the API requests to know the current state of the record you are updating/deleting before it will be accepted. Others such as AWS(Route53) unfortunately do not meaning that some things one record just deleted, may get added back by another depending on when it last queried the providers zone records.
Validation
Terminal 1:
Setup local env with multiple instances and watch all dnsrecords:
Terminal 2:
Tail all operator logs:
Terminal 3:
Run multi record e2e with 20 instances (AWS)
Run multi record e2e with 20 instances (GCP)