Re: [PATCH net] page_pool: Fix use-after-free in page_pool_recycle_in_ring

From: dongchenchen (A)
Date: Mon May 26 2025 - 10:47:59 EST



On Fri, May 23, 2025 at 1:31 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
On 2025/5/23 14:45, Dong Chenchen wrote:

static bool page_pool_recycle_in_ring(struct page_pool *pool, netmem_ref netmem)
{
+	bool in_softirq;
	int ret;
int -> bool?

	/* BH protection not needed if current is softirq */
-	if (in_softirq())
-		ret = ptr_ring_produce(&pool->ring, (__force void *)netmem);
-	else
-		ret = ptr_ring_produce_bh(&pool->ring, (__force void *)netmem);
-
-	if (!ret) {
+	in_softirq = page_pool_producer_lock(pool);
+	ret = !__ptr_ring_produce(&pool->ring, (__force void *)netmem);
+	if (ret)
		recycle_stat_inc(pool, ring);
-		return true;
-	}
+	page_pool_producer_unlock(pool, in_softirq);

-	return false;
+	return ret;
}

/* Only allow direct recycling in special circumstances, into the
@@ -1091,10 +1088,14 @@ static void page_pool_scrub(struct page_pool *pool)

static int page_pool_release(struct page_pool *pool)
{
+	bool in_softirq;
	int inflight;

	page_pool_scrub(pool);
	inflight = page_pool_inflight(pool, true);
+	/* Acquire producer lock to make sure producers have exited. */
+	in_softirq = page_pool_producer_lock(pool);
+	page_pool_producer_unlock(pool, in_softirq);
Is a compiler barrier needed to ensure the compiler doesn't optimize away
the above code?

I don't want to derail this conversation too much, and I suggested a
similar fix to this initially, but now I'm not sure I understand why
it works.

Why is the existing barrier not working and acquiring/releasing the
producer lock fixes this issue instead? The existing barrier is the
producer thread incrementing pool->pages_state_release_cnt, and
page_pool_release() is supposed to block the freeing of the page_pool
until it sees the
`atomic_inc_return_relaxed(&pool->pages_state_release_cnt);` from the
producer thread. Any idea why this barrier is not working? AFAIU it
should do the exact same thing as acquiring/dropping the producer
lock.

Hi, Mina
As previously mentioned:
page_pool_recycle_in_ring
  ptr_ring_produce
    spin_lock(&r->producer_lock);
    WRITE_ONCE(r->queue[r->producer++], ptr)
      //recycle last page to pool, producer + release_cnt = hold_cnt
                                page_pool_release
                                  page_pool_scrub
                                    page_pool_empty_ring
                                      ptr_ring_consume
                                      page_pool_return_page
                                        //release_cnt=hold_cnt
                                  __page_pool_destroy //inflight=0
                                    free_percpu(pool->recycle_stats);
                                    free(pool) //free
    spin_unlock(&r->producer_lock); //pool->ring uaf read
  recycle_stat_inc(pool, ring);

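The release_cnt check does block the freeing of the page_pool until
release_cnt == hold_cnt. However, page_pool_release() can run
concurrently with a page being recycled (e.g. via kfree_skb). In that
case release_cnt is incremented by the release path itself: as soon as
the producer has written the page into the ring, page_pool_scrub()
consumes it and page_pool_return_page() bumps release_cnt, so inflight
reaches 0 and the pool is freed while the producer is still inside
ptr_ring_produce(). The producer's later accesses to the pool
(spin_unlock() and recycle_stat_inc()) then trigger the UAF.
So taking and dropping the producer lock in page_pool_release() waits
for any in-progress recycle to complete, which fixes it.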
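
To illustrate why a bare lock/unlock pair is enough to wait for the
producer, here is a minimal userspace sketch in C, with pthreads standing
in for the kernel primitives. struct toy_pool, toy_recycle() and
toy_release_demo() are made-up names for illustration only, and the
one-entry slot is an assumed stand-in for the ptr_ring; this is not the
kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page_pool; not the kernel layout. */
struct toy_pool {
	pthread_mutex_t producer_lock;	/* plays the role of ring.producer_lock */
	atomic_int slot;		/* one-entry "ring": 0 means empty */
	int recycle_stat;		/* only written under producer_lock */
};

/* Producer: publish one entry and update the stat inside the lock. */
static void *toy_recycle(void *arg)
{
	struct toy_pool *pool = arg;

	pthread_mutex_lock(&pool->producer_lock);
	atomic_store(&pool->slot, 1);	/* the release path can now see the entry */
	pool->recycle_stat++;		/* still inside the critical section */
	pthread_mutex_unlock(&pool->producer_lock);
	/* The producer does not touch *pool past this point. */
	return NULL;
}

/* Release path: returns the stat value observed just before the free. */
static int toy_release_demo(void)
{
	struct toy_pool *pool = calloc(1, sizeof(*pool));
	pthread_t producer;
	int rc, seen;

	if (!pool)
		return -1;
	pthread_mutex_init(&pool->producer_lock, NULL);
	rc = pthread_create(&producer, NULL, toy_recycle, pool);
	assert(rc == 0);
	(void)rc;

	/* "Scrub": consume the entry as soon as it becomes visible. */
	while (atomic_load(&pool->slot) == 0)
		sched_yield();
	atomic_store(&pool->slot, 0);

	/*
	 * The producer may still be inside its critical section here.
	 * Taking and dropping the lock waits until it has left.
	 */
	pthread_mutex_lock(&pool->producer_lock);
	pthread_mutex_unlock(&pool->producer_lock);

	seen = pool->recycle_stat;	/* producer is done with the pool */
	pthread_mutex_destroy(&pool->producer_lock);
	free(pool);

	pthread_join(producer, NULL);
	return seen;
}
```

Unless the allocation fails, toy_release_demo() is expected to return 1:
the entry only becomes visible while the producer holds the lock, so the
lock/unlock pair cannot complete before the stat update has happened.
Without that pair, the free could run while the producer is still between
the store and the unlock, which is the window shown in the trace above.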

Best Regards,
Dong Chenchen