[BUG] targetcli commands hanging #178
Comments
Hello, thanks for the report.
@karibertils Can you please post a link to the entire dmesg output?
@maurizio-lombardi Here is the log for the whole day when the issue last occurred: kernel.2020.12.28.log. There were some PCs trying to attach nonexistent targets at the time. I don't know if that could be related to the issue. I have been rebooting every other day since then to avoid it. I can skip reboots to gather new logs if that helps. Should I also enable some debug mode?
No news on this bug? I have also tried Ubuntu Server 20.10. Using kernel 5.8.0-44-generic and targetcli-fb 2.1.53, and the same thing happens there also.
@karibertils Hi, I am trying to reproduce it.
"Using kernel 5.8.0-44-generic and targetcli-fb 2.1.53 and same thing happens there also."
I am sure the bug is still present in the latest upstream kernel.
@maurizio-lombardi No, I believe it happens regardless of whether initiators are connecting to nonexistent targets. When the issue starts, every targetcli command hangs. Doing
I am network booting 80 PCs. Each of them has 1 target with 2 LUNs. LUN 0 is almost never changed, but LUN 1 is removed and re-added every time the PCs boot. example:
We previously removed and re-added LUN 1 once every 24 hours, and the issue happened with similar frequency then. It can happen after running for 1-9 days, though it usually takes 3+ days.
Hmm, I asked because of the following backtrace:
Dec 28 17:17:16 rocky kernel: [361650.549542] INFO: task targetcli:3414923 blocked for more than 120 seconds.
That makes me think that there is a race condition somewhere, causing a problem with the refcounting. Note that lio_target_call_delnpfromtpg() is called when you execute a command like "targetcli iscsi/ delete iqn....". If you have new dmesg logs, please post them here; they might help.
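For anyone gathering more logs, here is a minimal Python sketch that pulls hung-task warnings like the one quoted above out of kernel log lines. It assumes the message format is exactly as in that line; the name `find_hung_tasks` is made up for illustration and is not part of targetcli or any kernel tooling.

```python
import re

# Matches the hung-task warning format quoted in this thread, e.g.
# "... INFO: task targetcli:3414923 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a (comm, pid, seconds) tuple for each hung-task warning found."""
    hits = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# The log line quoted in this thread, used as sample input.
sample = (
    "Dec 28 17:17:16 rocky kernel: [361650.549542] INFO: task "
    "targetcli:3414923 blocked for more than 120 seconds."
)
print(find_hung_tasks([sample]))  # [('targetcli', 3414923, 120)]
```

To scan a saved log, pass an open file object instead of the sample list, since the function only iterates over lines.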
OK, that sounds plausible. But we did previously try a few times to make sure there were no connections to nonexistent targets, and the issue persisted. I guess there have always been at least a few attempts, though. We only delete targets a few times a day, but the boot script deletes and re-adds the LUNs on every boot, so there's more stress on that path. Here is a recent kernel log.
I have 80 targets, each with 2 block devices using zfs clones from snapshot.
The issue usually starts after 3-7 days. When it starts, various commands hang indefinitely.
Right now I'm running
/iscsi> delete iqn.2020-01.is.gz:192-168-101-163
and it's frozen indefinitely. Logs show
Running Proxmox 6.3 with kernel 5.4.78-2-pve. Using targetcli-fb version 2.1.48-2.
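While a command such as the delete above is frozen, it can help to confirm which processes are stuck in uninterruptible sleep (state D), since that is the state the kernel's hung-task detector reports on. The following is a rough Python sketch, assuming a Linux /proc filesystem; the function names are hypothetical and this is not something targetcli provides.

```python
import os


def parse_stat_state(stat_text):
    """Return (pid, comm, state) parsed from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left].strip())
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def list_uninterruptible(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux system, or /proc is not mounted
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries
        try:
            with open(os.path.join(proc_root, entry, "stat")) as stat_file:
                pid, comm, state = parse_stat_state(stat_file.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `list_uninterruptible()` on the affected host and checking whether targetcli shows up would only confirm that the command is blocked inside the kernel; it does not say why.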