Re: [PATCH v2 7/7] selftests: memcg: Treat failure for zeroing sock in test_memcg_sock as XFAIL

From: Li Wang

Date: Mon Mar 23 2026 - 05:47:58 EST

On Fri, Mar 20, 2026 at 04:42:41PM -0400, Waiman Long wrote:
> Although there is supposed to be a periodic and asynchronous flush of
> stats every 2 seconds, the actual time lag between successive runs can
> actually vary quite a bit. In fact, I have seen time lag of up to 10s
> of seconds in some cases.
>
> At the end of test_memcg_sock, it waits up to 3 seconds for the
> "sock" attribute of memory.stat to go back down to 0. Obviously it
> may occasionally fail especially when the kernel has large page size
> (e.g. 64k). Treat this failure as an expected failure (XFAIL) to
> distinguish it from the other failure cases.
>
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> ---
> tools/testing/selftests/cgroup/test_memcontrol.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 5336be5ed2f5..af3e8fe4e50e 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1486,12 +1486,21 @@ static int test_memcg_sock(const char *root)
> * Poll memory.stat for up to 3 seconds (~FLUSH_TIME plus some
> * scheduling slack) and require that the "sock " counter
> * eventually drops to zero.
> + *
> + * The actual run-to-run elapse time between consecutive run
> + * of asynchronous memcg rstat flush may varies quite a bit.
> + * So the 3 seconds wait time may not be enough for the "sock"
> + * counter to go down to 0. Treat it as a XFAIL instead of
> + * a FAIL.
> */
> sock_post = cg_read_key_long_poll(memcg, "memory.stat", "sock ", 0,
> MEMCG_SOCKSTAT_WAIT_RETRIES,
> DEFAULT_WAIT_INTERVAL_US);
> - if (sock_post)
> + if (sock_post) {
> + if (sock_post > 0)
> + ret = KSFT_XFAIL;

XFAIL means "expected failure" and is intended for known kernel bugs or
unsupported features. A timing issue where the test simply doesn't wait
long enough is probably not an expected failure; it is a test that needs
a longer timeout.

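I'm wondering whether we can just enlarge the MEMCG_SOCKSTAT_WAIT_RETRIES
value, e.g. from 30 to 150? If the polling interval stays the same, that
would stretch the wait from about 3 seconds to about 15 seconds.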
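
To illustrate the idea, here is a rough standalone sketch of a bounded
poll loop. It is not the selftest code: poll_counter(), fake_read() and
the SKETCH_* values are made-up names, and the 100 ms interval is only
my assumption, inferred from 30 retries covering roughly 3 seconds.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <unistd.h>

/* Hypothetical values for illustration; not taken from the selftest. */
#define SKETCH_WAIT_INTERVAL_US 100000	/* assumed 100 ms per retry */
#define SKETCH_WAIT_RETRIES     150	/* 150 * 100 ms = 15 s upper bound */

/* Hypothetical stand-in for reading the "sock " key from memory.stat. */
typedef long (*read_counter_fn)(void *ctx);

/*
 * Read the counter, then retry up to `retries` times, sleeping
 * `interval_us` between reads, until it equals `expected`.
 * Returns the last value read, so a mismatch means the poll timed out.
 */
static long poll_counter(read_counter_fn read_counter, void *ctx,
			 long expected, int retries, unsigned int interval_us)
{
	long val;
	int i;

	assert(retries >= 0);

	val = read_counter(ctx);
	for (i = 0; i < retries && val != expected; i++) {
		usleep(interval_us);
		val = read_counter(ctx);
	}
	return val;
}

/* Fake counter: nonzero for the first `reads_until_zero` reads, then 0. */
struct fake_counter {
	int reads_until_zero;
};

static long fake_read(void *ctx)
{
	struct fake_counter *fc = ctx;

	if (fc->reads_until_zero > 0) {
		fc->reads_until_zero--;
		return 4096;
	}
	return 0;
}
```

With a fake counter that needs 40 reads to reach zero, a 30-retry poll
gives up with a nonzero value while a 150-retry poll sees it hit zero,
so the test can keep reporting a plain pass or fail without XFAIL.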

--
Regards,
Li Wang