Re: [RFC PATCH v4 0/1] mm/damon: add node_eligible_mem_bp and node_ineligible_mem_bp goal metrics

From: Ravi Jonnalagadda

Date: Mon Mar 23 2026 - 15:26:04 EST


On Sat, Mar 21, 2026 at 9:57 AM SeongJae Park <sj@xxxxxxxxxx> wrote:
>
> Hello Ravi,
>
>
> Thank you for this patch! TL;DR: Other than the trivial things I commented on
> below and on the patch, I believe it is time to drop the RFC tag and work on
> merging this.
>

Thanks, SJ, for the prompt and detailed feedback!

> On Fri, 20 Mar 2026 12:04:52 -0700 Ravi Jonnalagadda <ravis.opensrc@xxxxxxxxx> wrote:
>
> > This patch introduces two new DAMON quota goal metrics for controlling
>
> s/DAMON/DAMOS/ ?
>

Will fix it.

> > memory distribution in heterogeneous memory systems (e.g., DRAM and CXL
> > memory tiering) using physical address (PA) mode monitoring.
> >
> > v3: https://lore.kernel.org/linux-mm/20260223123232.12851-1-ravis.opensrc@xxxxxxxxx/
>
> The above link would be better placed in the 'Changes since v3' section below.
>

Got it. Will take care of it next time.

> >
> > Changes since v3:
> > =================
> >
> > - The first two patches from v3 (goal_tuner initialization fix and
> > esz=0 quota bypass fix) are now in damon/next. This submission
>
> It is now also in mm-unstable :)

Good to know. Will mention this in the next version.

>
> > contains only the core metrics patch, rebased on top of those fixes.
> >
> > - Simplified implementation: removed per-node eligible_bytes array, now
> > iterates scheme-eligible regions directly for each goal evaluation.
> >
> > - Handle regions crossing node boundaries: uses damon_get_folio() to
> > determine actual NUMA node placement of each folio rather than
> > assuming uniform node placement within a region.
> >
> > - Pass scheme pointer directly to metric calculation functions, avoiding
> > container_of() derivation from quota pointer.
> >
> > - Fixed 80-column wrapping issues.
>
> Thank you for addressing all my comments!
>
> >
> > Background and Motivation
> > =========================
> >
> > In heterogeneous memory systems, controlling memory distribution across
> > NUMA nodes is essential for performance optimization. This patch enables
> > system-wide page distribution with target-state goals like "maintain 30%
> > of scheme-eligible memory on CXL" using PA-mode DAMON schemes.
> >
> > What These Metrics Measure
> > ==========================
> >
> > node_eligible_mem_bp:
> > scheme_eligible_bytes_on_node / total_scheme_eligible_bytes * 10000
> >
> > node_ineligible_mem_bp:
> > (total - scheme_eligible_bytes_on_node) / total * 10000
> >
> > The metrics are complementary: eligible_bp + ineligible_bp = 10000 bp.
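
For anyone following the arithmetic, the two formulas above can be checked
with a small userspace sketch. This is an illustration only and not the
kernel code: the function names simply mirror the metric names, the byte
counts are plain integers supplied by the caller, and returning 0 when
nothing is scheme-eligible is an assumption made for this sketch.

```python
# Userspace illustration of the two formulas; not the kernel implementation.
BP_SCALE = 10000  # basis points: 10000 bp == 100%


def node_eligible_mem_bp(eligible_on_node, total_eligible):
    """Share of scheme-eligible bytes that sit on the given node, in bp."""
    if total_eligible == 0:
        return 0  # assumption for this sketch: no eligible memory reads as 0 bp
    return eligible_on_node * BP_SCALE // total_eligible


def node_ineligible_mem_bp(eligible_on_node, total_eligible):
    """Share of scheme-eligible bytes that sit off the given node, in bp."""
    if total_eligible == 0:
        return 0  # same assumption as above
    return (total_eligible - eligible_on_node) * BP_SCALE // total_eligible


GIB = 1 << 30
print(node_eligible_mem_bp(7 * GIB, 10 * GIB))    # 7000
print(node_ineligible_mem_bp(7 * GIB, 10 * GIB))  # 3000
```

Note that with integer division, as used in this sketch, the two values can
add up to 9999 rather than 10000 when the ratio does not divide evenly.
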
> >
> > Two-Scheme Setup for Hot Page Distribution
> > ==========================================
> >
> > For maintaining 30% of hot memory on CXL (node 1):
>
> I think it would be easier to read if the above sentence also explained that
> node 0 is DRAM. For example,
>
> For maintaining hot memory on DRAM (node 0) and CXL (node 1) in 7:3 ratio:

Good suggestion, will clarify the node mapping.

>
> >
> > PUSH scheme: migrate_hot from node 0 -> node 1
> > goal: node_ineligible_mem_bp, nid=0, target=3000
> > "Push hot pages out until 30% of hot memory is NOT on DRAM"
>
> Seems the sentence assumes the actor is in DRAM. It was not very clear to me.
> How about making it clear? E.g.,
>
> "Move hot pages from DRAM to CXL, if more than 70% of hot data is in DRAM"

Got it. Will use your suggested wording.

>
> >
> > PULL scheme: migrate_hot from node 1 -> node 0
> > goal: node_eligible_mem_bp, nid=0, target=7000
> > "Pull hot pages back until 70% of hot memory IS on DRAM"
>
> If the above example is good for you, to be consistent with it, how about
> rewording like below?
>
> "Move hot pages from CXL to DRAM, if less than 70% of hot data is in DRAM"

Agreed. Will reword this too.
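
To make the reworded PUSH/PULL descriptions concrete, here is a minimal
userspace sketch of which scheme would still be under its goal for a given
split of hot memory. It is not kernel code. The helper name active_schemes()
and its parameters are hypothetical, and treating "current value below
target" as "scheme keeps working" is a simplification assumed for this
illustration.

```python
# Userspace illustration of the PUSH/PULL goal checks; not kernel code.
BP_SCALE = 10000


def active_schemes(hot_dram, hot_cxl, dram_target_bp=7000):
    """Return the schemes whose goal is not yet met (hypothetical helper).

    hot_dram / hot_cxl: scheme-eligible (hot) bytes on node 0 / node 1.
    """
    total = hot_dram + hot_cxl
    if total == 0:
        return []  # assumption: nothing eligible, nothing to do
    eligible_bp_nid0 = hot_dram * BP_SCALE // total
    ineligible_bp_nid0 = (total - hot_dram) * BP_SCALE // total

    active = []
    # PUSH: node_ineligible_mem_bp, nid=0, target=3000
    if ineligible_bp_nid0 < BP_SCALE - dram_target_bp:
        active.append("PUSH")  # more than 70% of hot data is in DRAM
    # PULL: node_eligible_mem_bp, nid=0, target=7000
    if eligible_bp_nid0 < dram_target_bp:
        active.append("PULL")  # less than 70% of hot data is in DRAM
    return active


print(active_schemes(90, 10))  # ['PUSH']
print(active_schemes(50, 50))  # ['PULL']
print(active_schemes(70, 30))  # []
```

In this model at most one of the two schemes is under its goal at any time,
and neither is once the split is exactly 7:3.
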

>
> >
> > The complementary goals create a feedback loop that converges to the
> > target distribution.
> >
> > Dependencies
> > ============
> >
> > This patch is based on SJ's damon/next branch which includes the
> > TEMPORAL goal tuner required for these metrics.
>
> Your test might depend on the feature. But this patch series itself does not,
> as users could also use it with the CONSIST tuner?
>

Correct, the metrics work with both tuners. Will reword to clarify that
testing used TEMPORAL but the patch itself does not depend on it.

> Also, as I mentioned above, the feature is now also in the mm-unstable tree.
>
> >
> > Testing Results
> > ===============
> >
> > Functionally tested on a two-node heterogeneous memory system with DRAM
> > (node 0) and CXL memory (node 1). Used PUSH+PULL scheme configuration
> > with migrate_hot action to maintain a target hot memory ratio between
> > the two tiers.
> >
> > With the TEMPORAL goal tuner, the system converges quickly to the target
> > distribution. The tuner drives esz to maximum when under goal and to
> > zero once the goal is met, forming a simple on/off feedback loop that
> > stabilizes at the desired ratio.
> >
> > With the CONSIST tuner, the scheme still converges but more slowly, as
> > it migrates and then throttles itself based on quota feedback. The time
> > to reach the goal varies depending on workload intensity.
>
> Sounds reasonable!
>
> Do you plan to further evaluate some performance metrics? I'd not strongly
> request that, but it would be very nice if we can have that.
>

Yes, I am planning to run additional tests. I will send v5 addressing
all the review comments and dropping the RFC tag. Results will follow
as the testing progresses.
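
In the meantime, the on/off behavior described for the TEMPORAL tuner above
can be pictured with a toy userspace model. To be clear, this is not a test
result and not the kernel tuner: it assumes a fixed amount of hot memory
counted in abstract units, a hypothetical per-interval cap max_esz, and a
scheme that moves exactly that cap whenever it is under its goal and nothing
otherwise.

```python
# Toy on/off feedback model; an illustration under stated assumptions only.
BP_SCALE = 10000
TARGET_DRAM_BP = 7000  # 7:3 DRAM:CXL target from the example in this thread


def step(dram, cxl, max_esz):
    """Run one interval: at most one scheme moves up to max_esz units."""
    total = dram + cxl
    if total == 0:
        return dram, cxl
    dram_bp = dram * BP_SCALE // total
    not_dram_bp = cxl * BP_SCALE // total
    if not_dram_bp < BP_SCALE - TARGET_DRAM_BP:  # PUSH is under goal
        moved = min(max_esz, dram)
        return dram - moved, cxl + moved
    if dram_bp < TARGET_DRAM_BP:                 # PULL is under goal
        moved = min(max_esz, cxl)
        return dram + moved, cxl - moved
    return dram, cxl                             # both goals met: quota is zero


def simulate(dram, cxl, max_esz, intervals):
    """Apply step() repeatedly and return the final (dram, cxl) split."""
    for _ in range(intervals):
        dram, cxl = step(dram, cxl, max_esz)
    return dram, cxl


print(simulate(1000, 0, 50, 20))  # (700, 300)
print(simulate(0, 1000, 50, 20))  # (700, 300)
```

When the step size does not divide the distance to the target evenly, this
toy model alternates between PUSH and PULL and stays within one step of the
7:3 split instead of settling exactly on it.
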

> Regardless of your answer to the above question, I think the current code and
> the test are good enough to consider merging this. I suggest dropping the RFC
> tag from the next spin.
>
> Thank you for doing this, Ravi!
>

Thank you! Will drop the RFC tag for v5.

>
> Thanks,
> SJ
>

Best Regards,
Ravi.

> [...]