Re: [RFC PATCH v4 0/1] mm/damon: add node_eligible_mem_bp and node_ineligible_mem_bp goal metrics

From: SeongJae Park

Date: Sat Mar 21 2026 - 12:57:57 EST


Hello Ravi,


Thank you for this patch! TL;DR: Other than the trivial things I commented on
below and on the patch, I believe it is time to drop the RFC tag and work on
merging this.

On Fri, 20 Mar 2026 12:04:52 -0700 Ravi Jonnalagadda <ravis.opensrc@xxxxxxxxx> wrote:

> This patch introduces two new DAMON quota goal metrics for controlling

s/DAMON/DAMOS/ ?

> memory distribution in heterogeneous memory systems (e.g., DRAM and CXL
> memory tiering) using physical address (PA) mode monitoring.
>
> v3: https://lore.kernel.org/linux-mm/20260223123232.12851-1-ravis.opensrc@xxxxxxxxx/

It would be better to put the above link in the 'Changes since v3' section below.

>
> Changes since v3:
> =================
>
> - The first two patches from v3 (goal_tuner initialization fix and
> esz=0 quota bypass fix) are now in damon/next. This submission

It is now also in mm-unstable :)

> contains only the core metrics patch, rebased on top of those fixes.
>
> - Simplified implementation: removed per-node eligible_bytes array, now
> iterates scheme-eligible regions directly for each goal evaluation.
>
> - Handle regions crossing node boundaries: uses damon_get_folio() to
> determine actual NUMA node placement of each folio rather than
> assuming uniform node placement within a region.
>
> - Pass scheme pointer directly to metric calculation functions, avoiding
> container_of() derivation from quota pointer.
>
> - Fixed 80-column wrapping issues.

Thank you for addressing all my comments!

>
> Background and Motivation
> =========================
>
> In heterogeneous memory systems, controlling memory distribution across
> NUMA nodes is essential for performance optimization. This patch enables
> system-wide page distribution with target-state goals like "maintain 30%
> of scheme-eligible memory on CXL" using PA-mode DAMON schemes.
>
> What These Metrics Measure
> ==========================
>
> node_eligible_mem_bp:
> scheme_eligible_bytes_on_node / total_scheme_eligible_bytes * 10000
>
> node_ineligible_mem_bp:
> (total - scheme_eligible_bytes_on_node) / total * 10000
>
> The metrics are complementary: eligible_bp + ineligible_bp = 10000 bp.
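
For reference, the arithmetic above can be sketched in plain C as below. This
is only an illustration of the two formulas, not the code in the patch: the
helper names, the userspace types, and the choice to return 0 when nothing is
eligible are all assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): part/total in basis points,
 * where 10000 bp == 100%.  Returning 0 for an empty total is an assumption.
 */
static uint64_t ratio_bp(uint64_t part, uint64_t total)
{
	if (!total)
		return 0;
	return part * 10000 / total;
}

/* scheme_eligible_bytes_on_node / total_scheme_eligible_bytes * 10000 */
static uint64_t node_eligible_mem_bp(uint64_t on_node, uint64_t total)
{
	return ratio_bp(on_node, total);
}

/* (total - scheme_eligible_bytes_on_node) / total * 10000 */
static uint64_t node_ineligible_mem_bp(uint64_t on_node, uint64_t total)
{
	return ratio_bp(total - on_node, total);
}
```

Note that with integer division as written in this sketch, the two values can
sum to 9999 rather than 10000 when the ratio is not exactly representable
(e.g., 1 byte on the node out of 3 in total).
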
>
> Two-Scheme Setup for Hot Page Distribution
> ==========================================
>
> For maintaining 30% of hot memory on CXL (node 1):

I think it would be easier to read if the above sentence also explained that
node 0 is DRAM. For example,

For maintaining hot memory on DRAM (node 0) and CXL (node 1) in 7:3 ratio:

>
> PUSH scheme: migrate_hot from node 0 -> node 1
> goal: node_ineligible_mem_bp, nid=0, target=3000
> "Push hot pages out until 30% of hot memory is NOT on DRAM"

Seems the sentence assumes the actor is in DRAM. It was not very clear to me.
How about making it clear? E.g.,

"Move hot pages from DRAM to CXL, if more than 70% of hot data is in DRAM"

>
> PULL scheme: migrate_hot from node 1 -> node 0
> goal: node_eligible_mem_bp, nid=0, target=7000
> "Pull hot pages back until 70% of hot memory IS on DRAM"

If the above example is good for you, to be consistent with it, how about
rewording like below?

"Move hot pages from CXL to DRAM, if less than 70% of hot data is in DRAM"

>
> The complementary goals create a feedback loop that converges to the
> target distribution.
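
As a toy model of the two goals in this example, the direction of movement
could be sketched as below. This is a simplification under stated assumptions:
real DAMOS adjusts the quota of each scheme through the goal tuner rather than
picking a single direction, and the enum and function names here are
hypothetical, not from the patch.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum hot_move { MOVE_NONE, MOVE_DRAM_TO_CXL, MOVE_CXL_TO_DRAM };

/*
 * hot_on_dram: scheme-eligible (hot) bytes on node 0 (DRAM)
 * hot_total:   scheme-eligible (hot) bytes on all nodes
 */
static enum hot_move next_move(uint64_t hot_on_dram, uint64_t hot_total)
{
	uint64_t eligible_bp, ineligible_bp;

	if (!hot_total)
		return MOVE_NONE;

	eligible_bp = hot_on_dram * 10000 / hot_total;
	ineligible_bp = (hot_total - hot_on_dram) * 10000 / hot_total;

	/* PUSH goal: node_ineligible_mem_bp, nid=0, target=3000 */
	if (ineligible_bp < 3000)
		return MOVE_DRAM_TO_CXL;
	/* PULL goal: node_eligible_mem_bp, nid=0, target=7000 */
	if (eligible_bp < 7000)
		return MOVE_CXL_TO_DRAM;
	return MOVE_NONE;
}
```

In this model, at most one of the two goals is unmet at any time, so the
schemes push the distribution toward 7:3 from either side.
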
>
> Dependencies
> ============
>
> This patch is based on SJ's damon/next branch which includes the
> TEMPORAL goal tuner required for these metrics.

Your test might depend on the feature. But this patch series itself does not,
as users could also use it with the CONSIST tuner?

Also, as I mentioned above, the feature is now also in the mm-unstable tree.

>
> Testing Results
> ===============
>
> Functionally tested on a two-node heterogeneous memory system with DRAM
> (node 0) and CXL memory (node 1). Used PUSH+PULL scheme configuration
> with migrate_hot action to maintain a target hot memory ratio between
> the two tiers.
>
> With the TEMPORAL goal tuner, the system converges quickly to the target
> distribution. The tuner drives esz to maximum when under goal and to
> zero once the goal is met, forming a simple on/off feedback loop that
> stabilizes at the desired ratio.
>
> With the CONSIST tuner, the scheme still converges but more slowly, as
> it migrates and then throttles itself based on quota feedback. The time
> to reach the goal varies depending on workload intensity.

Sounds reasonable!

Do you plan to further evaluate some performance metrics? I wouldn't strongly
request that, but it would be very nice if we could have that.

Regardless of your answer to the above question, I think the current code and
the test are good enough to consider merging this. I suggest dropping the RFC
tag from the next spin.

Thank you for doing this, Ravi!


Thanks,
SJ

[...]