Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: k_work_sync is modified after k_work_flush returns #64530

Closed
c1728p9 opened this issue Oct 28, 2023 · 2 comments · Fixed by #66995
Closed

kernel: k_work_sync is modified after k_work_flush returns #64530

c1728p9 opened this issue Oct 28, 2023 · 2 comments · Fixed by #66995
Assignees
Labels
area: Kernel bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug Stale

Comments

@c1728p9
Copy link

c1728p9 commented Oct 28, 2023

Describe the bug
After a call to k_work_flush returns the k_work_sync struct may still be modified by the workq.

To Reproduce
The problem can be replicated by running the following code:

#include <zephyr/kernel.h>
#include <zephyr/logging/log.h>
#include <assert.h>
#include <string.h>

#define LOG_LEVEL 4
LOG_MODULE_REGISTER(main);

K_THREAD_STACK_DEFINE(workq_stack, 2048);
struct k_work_q workq;

void work_handler(struct k_work *item)
{
	LOG_INF("Work running");
}

void main(void)
{
	LOG_INF("Staring Test");
	k_work_queue_init(&workq);
	k_work_queue_start(&workq, workq_stack, K_THREAD_STACK_SIZEOF(workq_stack), 10, NULL);

	struct k_work work;
	struct k_work_sync sync;

	k_work_init(&work, work_handler);

	LOG_INF("Submitting work to workq");
	int ret = k_work_submit_to_queue(&workq, &work);
	assert(ret >= 0);
	k_work_flush(&work, &sync);
	LOG_INF("k_work_flush finished");

	LOG_INF("Filling k_work with 0xff");
	struct k_work work_expected;
	memset(&work_expected, 0xff, sizeof(work_expected));
	memcpy(&work, &work_expected, sizeof(work));

	LOG_INF("Filling k_work_sync with 0xff");
	struct k_work_sync sync_expected;
	memset(&sync_expected, 0xff, sizeof(sync_expected));
	memcpy(&sync, &sync_expected, sizeof(sync));

	LOG_INF("Waiting");
	k_msleep(100);

	if (memcmp(&work, &work_expected, sizeof(work)) == 0) {
		LOG_INF("No changes to k_work detected");
	} else {
		LOG_ERR("Modification to k_work detected");
		LOG_HEXDUMP_ERR(&work_expected, sizeof(work_expected), "Expected: ");
		LOG_HEXDUMP_ERR(&work, sizeof(work), "Got: ");
	}

	if (memcmp(&sync, &sync_expected, sizeof(sync)) == 0) {
		LOG_INF("No changes to k_work_sync detected");
	} else {
		LOG_ERR("Modification to k_work_sync detected");
		LOG_HEXDUMP_ERR(&sync_expected, sizeof(sync_expected), "Expected: ");
		LOG_HEXDUMP_ERR(&sync, sizeof(sync), "Got: ");
	}

	while (1)
	{
		k_msleep(1000);
	}
}

Expected behavior
The struct k_work_sync is not modified by the workq after k_work_flush returns.

Impact
It is unclear when it is safe to reuse or free k_work_sync after a call to k_work_flush.

Logs and console output
Logs from the above program running on a K64F with Zephyr v3.3.0:

*** Booting Zephyr OS build zephyr-v3.3.0 ***
[00:00:00.000,000] <inf> main: Staring Test
[00:00:00.000,000] <inf> main: Submitting work to workq
[00:00:00.000,000] <inf> main: Work running
[00:00:00.000,000] <inf> main: k_work_flush finished
[00:00:00.000,000] <inf> main: Filling k_work with 0xff
[00:00:00.000,000] <inf> main: Filling k_work_sync with 0xff
[00:00:00.000,000] <inf> main: Waiting
[00:00:00.100,000] <inf> main: No changes to k_work detected
[00:00:00.100,000] <err> main: Modification to k_work_sync detected
[00:00:00.100,000] <err> main: Expected:
                               ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff |........ ........
                               ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff |........ ........
[00:00:00.100,000] <err> main: Got:
                               ff ff ff ff ff ff ff ff  ff ff ff ff fc ff ff ff |........ ........
                               ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff |........ ........

Environment (please complete the following information):

  • OS: Windows
  • Toolchain: Zephyr SDK
  • Commit SHA: 07c6af3 (v3.3.0)
@c1728p9 c1728p9 added the bug The issue is a bug, or the PR is fixing a bug label Oct 28, 2023
@github-actions
Copy link

Hi @c1728p9! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙

@MaureenHelm MaureenHelm added the priority: low Low impact/importance bug label Oct 31, 2023
Songjf-ttk added a commit to Songjf-ttk/zephyr that referenced this issue Dec 25, 2023
Assert that k_work_sync is modified after k_work_flush
returns. This is because the work queue thread continues
to modify the flag in sync even after k_work_flush returns.
This commit adds K_WORK_SYNCED_BIT, and with this bit, we
can prevent the work queue thread from modifying sync after
k_work_flush returns.

It is now safe to reuse or free k_work_sync after a call to
k_work_flush(), k_work_cancel_sync() and sibling flush and
cancel APIs.

Fixes zephyrproject-rtos#64530

Signed-off-by: Junfan Song <[email protected]>
Songjf-ttk added a commit to Songjf-ttk/zephyr that referenced this issue Dec 29, 2023
After a call to k_work_flush returns the sync variable
may still be modified by the workq.  This is because
the work queue thread continues to modify the flag in
sync even after k_work_flush returns.  This commit adds
K_WORK_FLUSHING_BIT, and with this bit, we moved the
logic of waking up the caller from handle_flush to the
finalize_flush_locked in workq, so that after waking up
the caller, the workqueue will no longer operate on sync.

Fixes: zephyrproject-rtos#64530

Signed-off-by: Junfan Song <[email protected]>
Songjf-ttk added a commit to Songjf-ttk/zephyr that referenced this issue Dec 30, 2023
After a call to k_work_flush returns the sync variable
may still be modified by the workq.  This is because
the work queue thread continues to modify the flag in
sync even after k_work_flush returns.  This commit adds
K_WORK_FLUSHING_BIT, and with this bit, we moved the
logic of waking up the caller from handle_flush to the
finalize_flush_locked in workq, so that after waking up
the caller, the workqueue will no longer operate on sync.

Fixes: zephyrproject-rtos#64530

Signed-off-by: Junfan Song <[email protected]>
Copy link

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label Dec 31, 2023
carlescufi pushed a commit that referenced this issue Jan 3, 2024
After a call to k_work_flush returns the sync variable
may still be modified by the workq.  This is because
the work queue thread continues to modify the flag in
sync even after k_work_flush returns.  This commit adds
K_WORK_FLUSHING_BIT, and with this bit, we moved the
logic of waking up the caller from handle_flush to the
finalize_flush_locked in workq, so that after waking up
the caller, the workqueue will no longer operate on sync.

Fixes: #64530

Signed-off-by: Junfan Song <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Kernel bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug Stale
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants