Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kdump: Show message in 'Test configuration' to verify the crash #19107

Merged
merged 3 commits into from
Aug 23, 2023

Conversation

skobyda
Copy link
Contributor

@skobyda skobyda commented Jul 18, 2023

This adds message showing a location where user can verify whetever crash was successful:

  • Local filesystem: "To verify whether the crash was successful check /var/crash location for kernel crash dump(vmcore)."
  • SSH: "To verify whether the crash was successful check /var/crash location at [email protected] for kernel crash dump(vmcore)"
  • -NFS: "To verify whether the crash was successful check /var/crash location at myserver.com:/export/cores for kernel crash dump(vmcore)."

Closes https://issues.redhat.com/browse/RHELPLAN-125375
Closes https://bugzilla.redhat.com/show_bug.cgi?id=2097440


Kdump: Show location of kdump to verify the successful configuration test

You can now see the location where kdump can be found if testing of kdump settings is successful.

Screenshot from 2023-08-23 11-17-45

@skobyda skobyda requested a review from garrett July 18, 2023 12:45
@marusak
Copy link
Member

marusak commented Jul 18, 2023

Nice! One more additional idea is whether we can check this automatically? That once you reboot and visit kdupm page again it would show big green check mark that there is fresh kernel dump in designed location?

@skobyda
Copy link
Contributor Author

skobyda commented Jul 18, 2023

Nice! One more additional idea is whether we can check this automatically? That once you reboot and visit kdupm page again it would show big green check mark that there is fresh kernel dump in designed location?

I have been thinking also about this to send as followup (since this PR is really simple and fixes jira/bugzilla, I think it's worth it to merge separately ASAP).

And about the automatic check. I guess we have to save to browser's localStorage the timestamp of the last simulated kernel crash. And then we can check the last write to the crash dump location. As long as we only support it for crash dump location at Local filesystem/SSH/NFS, it should be fairly easy to check. For others, I think it's right now not feasible to implement.

EDIT: Or maybe just looking at kdump service logs + the timestamp should be enough

@garrett
Copy link
Member

garrett commented Jul 25, 2023

Local:

kdump, local

SSH:

kdump, ssh

NFS:

kdump, nfs


Additionally, most people aren't going to remember where to check, especially if it takes a while to test. Therefore, we should also make sure this information is easy to see on the front page (like perhaps highlight a crash report if there was a crash on the last boot).

@skobyda
Copy link
Contributor Author

skobyda commented Aug 3, 2023

@garrett updated the design:

Also, if you configure kdump to use SSH, it will not move it to [email protected]:/home/you but [email protected]:${path} where path is configurable by the user through our UI

NFS:

Screenshot from 2023-08-03 13-53-16

SSH:

Screenshot from 2023-08-03 13-52-58

Local:

Screenshot from 2023-08-03 13-53-25

@skobyda skobyda requested review from garrett and removed request for garrett August 3, 2023 11:56
Copy link
Member

@garrett garrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Minor nitpicks:

  1. The modals should have an icon (specifically, the golden <!> icon, like in the mockups)
  2. There should be an X on closable modals (either with "Cancel", as the case here, and also on "Close" ones (not the case here; just pointing that out))

Additionally: I notice the width is wider in the screenshots than in the modal. Generally, for less complex modals (especially confirmations), going with the narrower ones would be better for readability. However, there is some path and filename info here; going too narrow could provide more opportunities for wrapping (not great here). So it's probably fine at your width. This is just a comment, not something to change. 😉

@skobyda skobyda force-pushed the kdumpMessage branch 3 times, most recently from 6006f68 to 55c6ab4 Compare August 15, 2023 11:25
@skobyda skobyda marked this pull request as ready for review August 15, 2023 11:49
@skobyda
Copy link
Contributor Author

skobyda commented Aug 15, 2023

Updated the design, tests seem green, I think it's ready for code-review:
Screenshot from 2023-08-15 13-51-45

@liutgnu
Copy link

liutgnu commented Aug 16, 2023

Hi skobyda,

I'm Tao Liu from kdump team, came from bz2215024. Thanks a lot for your efforts on this. Since cockpit support is one subtask of bz2215024, and I have worked on the TUI part for sometime, I can share some of my thoughts:

  1. It will be better to have an automatic test check mechanism, as @marusak pointed out, after machine reboot and user re-enter the kdump page, a sign indicating if the test is successful or not will be good. Relying on user himself to check if the "test vmcore" been generated is inconvenient, there might be plenty of vmcores located in a nfs/ssh server for "nfs/ssh kdump", the manual checking will be a boring work.

  2. It may not be the final desgin, I planed to introduce 2 new shell command line options for kdumpctl: "kdumpctl test" and "kdumpctl test-check". "kdumpctl test" will generate a timestamp, which will be used to mark the vmcore that it belongs to a specifc test. "kdumpctl test-check" will check the kdump folder after reboot back, no matter if it's a local folder or a nfs/ssh remote folder, to see a vmcore with the specific timestamp string exist or not, which then be the indicator of the test is successful or not. Maybe you can call "kdumpctl test-check" in cockpit if it is easier.

  3. If the user modifies the configuration file of kdump, the prior test result will be invalid and notify he should retest based on the current kdump config. I hadn't figure out the tech details, maybe some file monitor service to be involved.

Anyway, it's currently an early design phase for me, we can share thoughts or discuss kdump APIs if you need some underlying support.

Thanks,
Tao Liu

@skobyda
Copy link
Contributor Author

skobyda commented Aug 16, 2023

  1. It will be better to have an automatic test check mechanism, as @marusak pointed out, after machine reboot and user re-enter the kdump page, a sign indicating if the test is successful or not will be good. Relying on user himself to check if the "test vmcore" been generated is inconvenient, there might be plenty of vmcores located in a nfs/ssh server for "nfs/ssh kdump", the manual checking will be a boring work.

Yes, that will be a followup

  1. It may not be the final desgin, I planed to introduce 2 new shell command line options for kdumpctl: "kdumpctl test" and "kdumpctl test-check". "kdumpctl test" will generate a timestamp, which will be used to mark the vmcore that it belongs to a specifc test. "kdumpctl test-check" will check the kdump folder after reboot back, no matter if it's a local folder or a nfs/ssh remote folder, to see a vmcore with the specific timestamp string exist or not, which then be the indicator of the test is successful or not. Maybe you can call "kdumpctl test-check" in cockpit if it is easier.

That's exactly what I have as work-in-progress locally. My plan was to generate timestamp when "Test configuration" is triggered and store in in browser's storage. But "kdumpctl test-check" would work be even better! while using the similar principle.

  1. If the user modifies the configuration file of kdump, the prior test result will be invalid and notify he should retest based on the current kdump config. I hadn't figure out the tech details, maybe some file monitor service to be involved.

From our point of view, we are alsofine with calling some command to reset the timestamp manually when user uses our UI to change configuration file.

@liutgnu
Copy link

liutgnu commented Aug 16, 2023

  1. It will be better to have an automatic test check mechanism, as @marusak pointed out, after machine reboot and user re-enter the kdump page, a sign indicating if the test is successful or not will be good. Relying on user himself to check if the "test vmcore" been generated is inconvenient, there might be plenty of vmcores located in a nfs/ssh server for "nfs/ssh kdump", the manual checking will be a boring work.

Yes, that will be a followup

^_^

  1. It may not be the final desgin, I planed to introduce 2 new shell command line options for kdumpctl: "kdumpctl test" and "kdumpctl test-check". "kdumpctl test" will generate a timestamp, which will be used to mark the vmcore that it belongs to a specifc test. "kdumpctl test-check" will check the kdump folder after reboot back, no matter if it's a local folder or a nfs/ssh remote folder, to see a vmcore with the specific timestamp string exist or not, which then be the indicator of the test is successful or not. Maybe you can call "kdumpctl test-check" in cockpit if it is easier.

That's exactly what I have as work-in-progress locally. My plan was to generate timestamp when "Test configuration" is triggered and store in in browser's storage. But "kdumpctl test-check" would work be even better! while using the similar principle.

OK, I will try to design the kdumpctl APIs in consideration of cockpit, hope it can be clean and easy to use. Just FYI, there is the repo which we used for implementing TUI "kdumpctl test/test-check" etc, currently just a POC in the ltao-dev branch.

  1. If the user modifies the configuration file of kdump, the prior test result will be invalid and notify he should retest based on the current kdump config. I hadn't figure out the tech details, maybe some file monitor service to be involved.

From our point of view, we are alsofine with calling some command to reset the timestamp manually when user uses our UI to change configuration file.

OK, thanks!

@garrett
Copy link
Member

garrett commented Aug 16, 2023

I'm working on integrating the discussion into a new mockup!

Here's what I have so far:

Kdump page

Source: Kdump.penpot in Kdump.zip

It's still lacking a lot, such as: the modals for each option, the alert for when the service is out of sync (that is: if it's enabled and the service isn't running), other error cases, etc. — but the bulk of the direction to modernize the Kdump page (and bring it up to established patterns in both Cockpit and PatternFly) is there.

@garrett
Copy link
Member

garrett commented Aug 16, 2023

I've opened up a discussion @ #19203 with a summary of what @liutgnu, @skobyda, and I have contributed here... so we can continue talking about kdump more in general, without it having to be tied to this PR (as this PR will be merged pretty soon and I'd love to keep the conversation going).

@skobyda skobyda force-pushed the kdumpMessage branch 3 times, most recently from 54d1cb9 to bc80683 Compare August 17, 2023 09:33
@liutgnu
Copy link

liutgnu commented Aug 17, 2023

I've opened up a discussion @ #19203 with a summary of what @liutgnu, @skobyda, and I have contributed here... so we can continue talking about kdump more in general, without it having to be tied to this PR (as this PR will be merged pretty soon and I'd love to keep the conversation going).

OK thanks, we can discuss there.

@skobyda
Copy link
Contributor Author

skobyda commented Aug 17, 2023

After a bit more discussion, one last change to the dialogs:
Screenshot from 2023-08-17 11-09-51
Screenshot from 2023-08-17 11-16-30
Screenshot from 2023-08-17 11-15-46

pkg/kdump/kdump.scss Outdated Show resolved Hide resolved
pkg/kdump/kdump-view.jsx Show resolved Hide resolved
pkg/kdump/kdump-view.jsx Show resolved Hide resolved
pkg/lib/cockpit-components-dialog.jsx Show resolved Hide resolved
@skobyda skobyda force-pushed the kdumpMessage branch 3 times, most recently from 086b465 to 1ea4253 Compare August 22, 2023 22:31
This adds message showing a location where user can verify whetever
crash was successful:
Local filesystem: "To verify whether the crash was successful check
/var/crash location for kernel crash dump(vmcore)."
SSH: "To verify whether the crash was successful check /var/crash
location at [email protected] for kernel crash dump(vmcore)"
NFS: "To verify whether the crash was successful check /var/crash
location at myserver.com:/export/cores for kernel crash dump(vmcore)."

Closes https://issues.redhat.com/browse/RHELPLAN-125375
Closes https://bugzilla.redhat.com/show_bug.cgi?id=2097440
' ' + _("Results of the crash will be stored in $0 as $1, if kdump is properly configured."),
<span className="pf-v5-u-font-family-monospace-vf">{path}</span>,
<span className="pf-v5-u-font-family-monospace-vf">vmcore</span>);
} else if (target.type === "ssh" || target.type == "nfs") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This added line is not executed by any test. Details

Comment on lines +282 to +283
<span className="pf-v5-u-font-family-monospace-vf">{target.type === "ssh" ? "SSH" : "NFS"}</span>,
<span className="pf-v5-u-font-family-monospace-vf">{`${target.server}:${target.type === "nfs" ? target.export + path : path}`}</span>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 added lines are not executed by any test. Details

@jelly jelly dismissed garrett’s stale review August 23, 2023 08:43

Garrett's ack'ed the changes on matrix.

@skobyda skobyda merged commit 9cceccb into cockpit-project:main Aug 23, 2023
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants