Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for conditional locking

From: David Laight
Date: Sat May 17 2025 - 05:18:26 EST


On Tue, 13 May 2025 14:28:37 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 13 May 2025 at 13:31, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Nevermind - should've read back through the thread for context.
>
> Well, your comment did make me test what I can make gcc generate..
>
> I still can't get gcc to do
>
>     cmpq $-4095,%rdi
>     jns .L189
>
> for IS_ERR_OR_NULL() however hard I try.
>
> The best I *can* get both gcc and clang to at least do
>
>     movq %rdi, %rcx
>     addq $4095, %rcx
>     jns .L189
>
> which I suspect is much better than the "lea+cmpq", because a pure
> register move is handled by the renaming and has no cost aside from
> the front end (ie decoding).
>
> So
>
> #define IS_ERR_OR_NULL(ptr) (MAX_ERRNO + (long)(ptr) >= 0)
>
> does seem to be potentially something we could use, and maybe we could
> push the compiler people to realize that their current code generation
> is bad.
>
> Of course, it doesn't actually *really* work for IS_ERR_OR_NULL(),
> because it gets the wrong results for user pointers, and while the
> *common* case for the kernel is to test various kernel pointers, the
> user pointer case does happen (ie mmap() and friends).
>
> IOW, it's not actually correct in that form, I just wanted to see what
> we could do for some limited form of this common pattern.
>
> Anyway, I am surprised that neither gcc nor clang seem to have
> realized that you can turn an "add" that just checks the condition
> codes for sign or equality into a "cmp" of the negative value.
>
> It seems such a trivial and obvious thing to do. But maybe I'm
> confused and am missing something.

Doing the signed compare (long)(ptr) >= -MAX_ERRNO generates cmp + jl
(sign != overflow) which is a better test.

To let user pointers through it might be possible to generate:
    leaq -1(%reg), %reg
    cmpq $-4097, %reg
    leaq 1(%reg), %reg
    ja label
which trades a register for an instruction.
It wouldn't be too bad if the second 'leaq' were moved to the branch
target - especially for any cpu that doesn't have an inc/dec that
leaves the flags untouched.
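
A minimal C sketch of the two tests above, for illustration only. The
function names are hypothetical and the code is user-space C, not kernel
code; MAX_ERRNO is assumed to be 4095 as in the kernel. The _ref helper is
a plain NULL check plus an unsigned compare against -MAX_ERRNO, used only
as a baseline. Whether a compiler turns the unsigned form into the
lea/cmp/lea/ja sequence is not something this sketch shows.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* assumed: same value as the kernel's MAX_ERRNO */

/* Baseline: NULL, or one of the top MAX_ERRNO addresses. */
static inline int is_err_or_null_ref(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Signed compare: a single cmp, but every positive (user) address
 * also compares >= -MAX_ERRNO, so user pointers are misreported.
 */
static inline int is_err_or_null_signed(const void *ptr)
{
	return (intptr_t)ptr >= -(intptr_t)MAX_ERRNO;
}

/*
 * Unsigned compare on (ptr - 1): subtracting 1 wraps NULL to the top
 * of the address space, so NULL and -MAX_ERRNO..-1 form one contiguous
 * range above -4097. User pointers fall below it.
 */
static inline int is_err_or_null_unsigned(const void *ptr)
{
	return (uintptr_t)ptr - 1 > (uintptr_t)-(MAX_ERRNO + 2);
}

/* Hypothetical self-check over the boundary values. */
static void check_forms(void)
{
	const void *null_ptr = (const void *)0;
	const void *last_err = (const void *)(intptr_t)-MAX_ERRNO;
	const void *not_err  = (const void *)(intptr_t)-(MAX_ERRNO + 1);
	const void *user_ptr = (const void *)(uintptr_t)0x1000;

	assert(is_err_or_null_unsigned(null_ptr) == is_err_or_null_ref(null_ptr));
	assert(is_err_or_null_unsigned(last_err) == is_err_or_null_ref(last_err));
	assert(is_err_or_null_unsigned(not_err)  == is_err_or_null_ref(not_err));
	assert(is_err_or_null_unsigned(user_ptr) == is_err_or_null_ref(user_ptr));

	/* The signed form disagrees with the baseline only for user pointers. */
	assert(is_err_or_null_signed(user_ptr) != is_err_or_null_ref(user_ptr));
}
```

The signed form matches the baseline for NULL, error values and negative
(kernel) addresses; the unsigned form matches it for user addresses too.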

David