Re: [RFC PATCH 01/10] mm/damon/core: introduce damon_ctx->paused
From: SeongJae Park
Date: Tue Mar 17 2026 - 00:20:25 EST
On Sun, 15 Mar 2026 14:00:00 -0700 SeongJae Park <sj@xxxxxxxxxx> wrote:
> DAMON supports only start and stop of the execution. When it is
> stopped, its internal data that it self-trained goes away. It will be
> useful if the execution can be paused and resumed with the previous
> self-trained data.
>
> Introduce per-context API parameter, 'paused', for the purpose. The
> parameter can be set and unset while DAMON is running and paused, using
> the online parameters commit helper functions (damon_commit_ctx() and
> damon_call()). Once 'paused' is set, the kdamond_fn() main loop does
> only limited work, sleeping for the sampling interval between
> iterations. The limited work includes handling online parameter
> updates, so that users can unset 'pause' and resume the execution when
> they want. It also keeps checking and handling the DAMON stop
> conditions, so that DAMON can be stopped while paused if needed.
>
> Signed-off-by: SeongJae Park <sj@xxxxxxxxxx>
> ---
> include/linux/damon.h | 2 ++
> mm/damon/core.c | 8 ++++++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/include/linux/damon.h b/include/linux/damon.h
> index 3a441fbca170d..421e51eff3bd2 100644
> --- a/include/linux/damon.h
> +++ b/include/linux/damon.h
> @@ -811,6 +811,8 @@ struct damon_ctx {
> * intervals tuning
> */
> unsigned long next_intervals_tune_sis;
> + /* pause kdamond main loop */
> + bool pause;
> /* for waiting until the execution of the kdamond_fn is started */
> struct completion kdamond_started;
> /* for scheme quotas prioritization */
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index f9854aedc42d1..1e9f6aa569fd2 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1331,6 +1331,7 @@ int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
> if (err)
> return err;
> }
> + dst->pause = src->pause;
> dst->ops = src->ops;
> dst->addr_unit = src->addr_unit;
> dst->min_region_sz = src->min_region_sz;
> @@ -2978,6 +2979,13 @@ static int kdamond_fn(void *data)
> * kdamond_merge_regions() if possible, to reduce overhead
> */
> kdamond_call(ctx, false);
> + while (ctx->pause) {
> + if (kdamond_need_stop(ctx))
> + goto done;
> + kdamond_usleep(ctx->attrs.sample_interval);
> + /* allow caller unset pause via damon_call() */
> + kdamond_call(ctx, false);
> + }
sashiko.dev comments [1] are below. Let me reply inline.
: Could this cause a deadlock if a walk request is initiated while DAMON is
: paused?
:
: Looking at kdamond_fn(), when ctx->pause is true, the thread stays in the
: while (ctx->pause) loop and skips kdamond_apply_schemes(). If a user
: triggers a sysfs command that initiates a walk (like writing
: update_schemes_tried_regions), the sysfs thread calls damos_walk() and waits
: for walk_control->completion.
:
: Since the kdamond thread is paused, the completion is never signaled, and the
: blocked sysfs writer will hold the global damon_sysfs_lock permanently. This
: prevents further interactions with the DAMON sysfs interface, making it
: impossible to even unpause the context.
Correct. I was able to trigger the deadlock on my test setup.
:
: Should we call damos_walk_cancel(ctx) inside the pause loop to abort pending
: walk requests, similar to what is done in kdamond_wait_activation()?
Good suggestion. I will add the fixup change attached below in the next spin.
I confirmed that the deadlock can no longer be triggered after applying the
fixup.
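
For illustration only, the effect of cancelling walks inside the pause loop can
be modeled in userspace roughly as below. This is a sketch, not kernel code.
struct walk_request, struct ctx_model, model_walk_cancel() and
model_pause_iteration() are hypothetical stand-ins that I made up for this
example. The sleep and the kdamond_call() handling are elided, and the
'completed' flag only models the completion that the damos_walk() caller waits
on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending walk request. */
struct walk_request {
	bool completed;	/* models the completion the requester waits on */
	bool canceled;	/* set when the request is aborted, not served */
};

/* Hypothetical, heavily reduced model of the context state. */
struct ctx_model {
	bool pause;
	bool need_stop;
	struct walk_request *walk;	/* pending request, or NULL */
};

/* Models the cancel step: finish the pending request as canceled. */
static void model_walk_cancel(struct ctx_model *c)
{
	struct walk_request *w = c->walk;

	if (!w)
		return;
	w->canceled = true;
	w->completed = true;	/* the waiter can now make progress */
	c->walk = NULL;
}

/*
 * One iteration of the modeled pause loop.  Returns false when the loop
 * should be left (unpaused or stop requested), true when it keeps looping.
 */
static bool model_pause_iteration(struct ctx_model *c, bool cancel_walks)
{
	assert(c != NULL);
	if (!c->pause || c->need_stop)
		return false;
	/* sampling interval sleep and parameter update handling elided */
	if (cancel_walks)
		model_walk_cancel(c);
	return true;
}
```

With cancel_walks set to false, a request that arrives while paused is never
completed, which corresponds to the blocked sysfs writer. With it set to true,
the request is completed as canceled on the next iteration, so the writer can
return and release the lock.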
[1] https://sashiko.dev/#/patchset/20260315210012.94846-2-sj@xxxxxxxxxx
Thanks,
SJ
[...]
=== >8 ===
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -3405,6 +3405,7 @@ static int kdamond_fn(void *data)
kdamond_usleep(ctx->attrs.sample_interval);
/* allow caller unset pause via damon_call() */
kdamond_call(ctx, false);
+ damos_walk_cancel(ctx);
}
if (!list_empty(&ctx->schemes))
kdamond_apply_schemes(ctx);