Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: prevent creating topics just after deletion ( #1739) #1918

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

danigiri
Copy link

@danigiri danigiri commented Oct 29, 2024

Deleting and recreating topics very quickly (<5min) means there is still stale metadata in the kafka producer, which creates problems (see #1739) and frontend instances need to be restarted for stuck messages to be sent.

Deletion creates a node in the group called /deletion_time, stores the time of deletion there. All creation and updates are still within a transaction.

Throws an exception if not enough time has passed (5 min hardcoded ATM).

Functionality has been added at the topic level and left group impl agnostic of those details as much as possible (except deletion and empty logic).

Input requested

  • look at overall design/approach and naming (/deletion_time or more verbose /kafka_topics_deletion_time), or deleted vs removed
  • ZookeeperTopicRepositoryTest tests used a shared group that cross-contaminated tests, added a mitigation
  • add enhancement label to PR
  • examples of PRs for integration testing or any related guidance
  • google code style check seems to fail without really checking anything
  • if this PR has potential you can add the hacktoberfest-accepted tag so it's counted before the end of the month

Done

  • added the code changes
  • added new groovy unit tests for the added functionality
  • tests on ZookeeperTopicRepositoryTest now do not share state
  • all existing unit tests pass (except the docker one)
  • ran on modified files like google-java-format -i $(find . -name ZookeeperTopicRepository.java)

Pending:

  • configuration of the time delay
  • updating with latest upstream if needed
  • looking at failing checks
  • manual testing in UI and running docker test
  • checking integration tests and how they work
  • see if the malformed topic can be deleted in master

see  allegro#1739
Deletion creates a node in the group /deletion_time, stores the time there
Throws an exception if not enough time has passed (5 min)
all existing unit tests pass (except the docker)

pending:
- updating with latest upstream
- manual testing in UI and running docker test
- adding unit tests for the added functionality
- checking integration tests
- configuration of the time delay
@danigiri
Copy link
Author

Test cycles are taking a while in my local hardware. To better debug this will proceed with creating some additional tests to test this enhancement specifically.

existing tests use a shared group which cross-contaminates tests
so added a mitigation
@danigiri
Copy link
Author

Checked the failing tests and they look transient and unrelated:
org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a Lambda expression in pl.allegro.tech.hermes.test.helper.client.integration.HermesInitHelper expected: ACTIVE but was: PENDING within 10 seconds.

integrationTests succedded locally for me and slowIntegrationTests is still running but I'd say this is in a good enough state to be reviewed. Happy to bump a file to re-run the checks again

@danigiri danigiri marked this pull request as ready for review October 30, 2024 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant