KAFKA-18915: Migrate AdminClientRebootstrapTest to use new test infra #19094
Overall LGTM.
I have a nit comment.
    var topic = "topic";
    try (var admin = clusterInstance.admin()) {
        admin.createTopics(List.of(new NewTopic(topic, BROKER_COUNT, (short) 2)));
nit: inline topic
    .setTypes(Set.of(Type.KRAFT))
    .setBrokers(BROKER_COUNT)
    .setAdminClientProperties(rebootstrapProperties)
    .setServerProperties(serverProperties).build();
nit: here and above there's too much whitespace
Thank you for your comments.
I've fixed it.
    server1.awaitShutdown();

    // Only the server 0 is available for the admin client during the bootstrap.
    admin.listTopics().names().get();
I know you've just translated an existing test, but it's good to use timeouts with these get() calls.
Sure, adding timeouts to prevent get() calls from blocking.
    // Only the server 0 is available for the admin client during the bootstrap.
    admin.listTopics().names().get();
    TestUtils.waitForCondition(() -> admin.listTopics().names().get().contains(topic),
This get() will still block until the future completes. What I was suggesting was that we do something like .get(5, TimeUnit.MINUTES) or some other reasonable timeout. This would prevent the test from getting stuck if something went wrong.
Adding the waitForCondition is a good improvement. It will make this test more reliable.
I see. Adding the waitForCondition only waits for get() to complete, while your suggestion prevents the test from blocking.
I've fixed it.
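For readers following the thread, here is a minimal sketch of the difference between an unbounded get() and a bounded one. It uses java.util.concurrent.CompletableFuture as a stand-in for the future returned by admin.listTopics().names() (Kafka's KafkaFuture also offers get(long, TimeUnit)); the class name BoundedGetExample and the helper namesWithin are hypothetical and not part of the PR.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; CompletableFuture stands in for the admin result future.
final class BoundedGetExample {

    /** Waits at most the given timeout for the future instead of blocking forever. */
    static Set<String> namesWithin(CompletableFuture<Set<String>> names, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        return names.get(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        // A future that is already complete: the bounded get returns its value.
        CompletableFuture<Set<String>> ready = CompletableFuture.completedFuture(Set.of("topic"));
        System.out.println(namesWithin(ready, 5, TimeUnit.SECONDS));

        // A future that never completes stands in for a request that cannot reach any broker.
        CompletableFuture<Set<String>> stuck = new CompletableFuture<>();
        try {
            namesWithin(stuck, 100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The test fails fast here rather than hanging the build.
            System.out.println("timed out instead of hanging");
        }
    }
}
```

With a plain get(), the second call would never return; with the bounded variant the failure surfaces as a TimeoutException that the test framework can report.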
LGTM Thanks!
@clarkwtc sorry for the late review. I left some major comments. Please take a look, thanks!
    return ClusterConfig.defaultBuilder()
        .setTypes(Set.of(Type.KRAFT))
        .setBrokers(BROKER_COUNT)
        .setAdminClientProperties(rebootstrapProperties)
This method is a no-op, and we should use clusterInstance.admin(Map) to build the admin with custom configs.
Opened https://issues.apache.org/jira/browse/KAFKA-18944 to remove those unused setters.
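A minimal sketch of what passing custom configs to the admin could look like. The key "metadata.recovery.strategy" and the value "rebootstrap" are real Kafka client settings, but the contents of rebootstrapProperties are not shown in this thread, so treating this as the relevant override is an assumption. The class name RebootstrapAdminConfig is hypothetical, and the call to clusterInstance.admin(Map) appears only in a comment because it needs the Kafka test infra on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the PR.
final class RebootstrapAdminConfig {

    /** Builds the client overrides to hand to the admin factory. */
    static Map<String, Object> overrides() {
        Map<String, Object> configs = new HashMap<>();
        // Assumption: this is the override the test needs. It asks the client to go
        // back to its bootstrap servers when no known broker is reachable.
        configs.put("metadata.recovery.strategy", "rebootstrap");
        return configs;
    }

    public static void main(String[] args) {
        // In the test this map would be passed as suggested in the review, e.g.:
        //   try (var admin = clusterInstance.admin(RebootstrapAdminConfig.overrides())) { ... }
        System.out.println(overrides());
    }
}
```

Passing the overrides at admin creation keeps the configuration next to the client that uses it, rather than relying on a ClusterConfig setter that has no effect.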
    var topic = "topic";
    var timeout = 5;
    try (var admin = clusterInstance.admin()) {
        admin.createTopics(List.of(new NewTopic(topic, BROKER_COUNT, (short) 2)));
BROKER_COUNT should be used to define replicas rather than partitions, since our goal is to ensure each broker has a replica.
Sure, these are different settings, so the partition count should not reuse the BROKER_COUNT variable.
    }

    @ClusterTemplate(value = "generator")
    public void testRebootstrap(ClusterInstance clusterInstance) throws InterruptedException {
After rewriting, this test no longer covers the scenario. We need to verify that the admin can "rebootstrap" when all "known" brokers are unavailable. Therefore, we should shut down one broker before creating the admin to ensure it's aware of "one" broker. Then, we shut down the broker the admin is aware of and restart the other one.
- Shut down broker0.
- Create the admin with bootstrap=broker0,broker1.
- The admin is aware of broker1.
- Run some admin APIs to ensure everything is fine.
- Shut down broker1 and restart broker0.
- The admin can't connect to broker1, so it starts to rebootstrap to find other available brokers.
- The admin finds broker0.
- Run some admin APIs to ensure everything is fine.
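The steps above can be sketched as a toy simulation. This is not Kafka's client logic and not the test from the PR: ToyCluster, ToyAdmin, and every method on them are hypothetical stand-ins that only model which brokers are running and which ones the admin currently knows about.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of the scenario; none of these types exist in Kafka.
final class RebootstrapScenario {

    /** Tracks which broker ids are currently running. */
    static final class ToyCluster {
        private final Set<Integer> running = new TreeSet<>();
        ToyCluster(int brokers) { for (int id = 0; id < brokers; id++) running.add(id); }
        void shutdown(int id) { running.remove(id); }
        void start(int id) { running.add(id); }
        boolean isUp(int id) { return running.contains(id); }
    }

    /** Learns the reachable brokers at creation and rebootstraps when all known ones are gone. */
    static final class ToyAdmin {
        private final ToyCluster cluster;
        private final List<Integer> bootstrap;
        private final boolean rebootstrapEnabled;
        private Set<Integer> known;
        int rebootstraps = 0;

        ToyAdmin(ToyCluster cluster, List<Integer> bootstrap, boolean rebootstrapEnabled) {
            this.cluster = cluster;
            this.bootstrap = bootstrap;
            this.rebootstrapEnabled = rebootstrapEnabled;
            this.known = discover();
        }

        /** Returns the bootstrap brokers that are reachable right now. */
        private Set<Integer> discover() {
            Set<Integer> found = new TreeSet<>();
            for (int id : bootstrap) if (cluster.isUp(id)) found.add(id);
            return found;
        }

        /** Simulates one admin API call and returns the id of the broker that served it. */
        int request() {
            for (int id : known) if (cluster.isUp(id)) return id;
            if (!rebootstrapEnabled) throw new IllegalStateException("no known broker is reachable");
            known = discover(); // fall back to the original bootstrap list
            rebootstraps++;
            for (int id : known) if (cluster.isUp(id)) return id;
            throw new IllegalStateException("no bootstrap broker is reachable");
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(2);
        cluster.shutdown(0);                                          // shut down broker0
        ToyAdmin admin = new ToyAdmin(cluster, List.of(0, 1), true);  // admin only learns broker1
        System.out.println("served by broker " + admin.request());
        cluster.shutdown(1);                                          // the only known broker goes away
        cluster.start(0);                                             // broker0 comes back
        System.out.println("served by broker " + admin.request());
        System.out.println("rebootstraps: " + admin.rebootstraps);
    }
}
```

The ordering is the point of the sketch: if both brokers were up when the admin was created, it would already know broker0 and the second request would succeed without any rebootstrap, so the scenario would not be exercised.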
Thank you for redefining this test to cover the scenario.
We will move forward with this PR: #19187
Remove unused `saslServerProperties`, `saslClientProperties`, `adminClientProperties`, `producerProperties`, and `consumerProperties` in ClusterConfig.
First, I quickly fixed the unused adminClientProperties, and then I will move on to #19094 to fix the related issues.
Pass AdminClientRebootstrapTest
<img width="1398" alt="Screenshot 2025-03-09 at 12 54 57 PM" src="https://github.com/user-attachments/assets/73c50376-6602-493d-8abd-0eb2bb304114" />
Pass ClusterConfigTest
<img width="1117" alt="Screenshot 2025-03-09 at 12 55 28 PM" src="https://github.com/user-attachments/assets/b4da59da-dfdf-4698-9077-5086854360ab" />
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Migrate AdminClientRebootstrapTest to the new test infra and remove the old Scala test.
The test results

Reviewers: TengYao Chi <kitingiao@gmail.com>, David Arthur <mumrah@gmail.com>