Re: [PATCH 0/3] crypto: Remove arch-optimized des and des3_ede code

From: Simon Richter

Date: Fri Mar 27 2026 - 06:04:00 EST


Hi,

On 3/27/26 5:27 AM, Eric Biggers wrote:

> In general that's good of course, but DES and 3DES? Really? Why is
> effort going into these obsolete algorithms at all?

If there are dedicated instructions, we need to emulate them, even if the kernel stops using them, because userspace might still use them. The alternative is implementing them as a trap in the kernel that delegates to the crypto subsystem, and nobody wants that. O_O

I wonder if it would make sense to split between "crypto" and "offloading" subsystems, so the "crypto" side can focus on a small number of contemporary algorithms and give them simple, easily auditable interfaces, and move all the complexity of asynchronous request processing in offload hardware over to the "offloading" side. The userspace API would also move to the "offloading" subsystem.

This would give the offloading subsystem a bit more flexibility in API design as well, so we could maybe represent offload capabilities in network or storage hardware as well, or allow userspace to set policies or find an optimized routing, without compromising security in the crypto subsystem.
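
To make the split a bit more concrete, here is a rough userspace sketch of the two interface shapes. Everything in it is made up for illustration: none of these names exist in the kernel, the XOR transform is only a stand-in for a real algorithm, and the "device" is a loop that drains a small queue. The only point is that the "crypto" side is a plain function call, while the queueing and completion handling all live on the "offloading" side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "crypto" side (hypothetical): synchronous, result is ready on return.
 * XOR is a placeholder transform, not a cipher. */
static void toy_xor_sync(uint8_t *dst, const uint8_t *src, size_t len,
                         uint8_t key)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] ^ key;
}

/* "offloading" side (hypothetical): a request is queued and completes
 * later through a callback. */
struct offload_req {
    uint8_t *dst;
    const uint8_t *src;
    size_t len;
    uint8_t key;
    void (*done)(struct offload_req *req, int err);
    void *ctx; /* caller-owned cookie for the callback */
};

#define OFFLOAD_QUEUE_DEPTH 4

struct offload_queue {
    struct offload_req *slots[OFFLOAD_QUEUE_DEPTH];
    size_t count;
};

/* Queue a request. Returns 0 if queued, -1 if the queue is full. */
static int offload_submit(struct offload_queue *q, struct offload_req *req)
{
    assert(req->done != NULL);
    if (q->count == OFFLOAD_QUEUE_DEPTH)
        return -1;
    q->slots[q->count++] = req;
    return 0;
}

/* Stand-in for the device's completion path: process every queued
 * request in submission order and invoke its callback. Returns the
 * number of requests completed. */
static size_t offload_poll(struct offload_queue *q)
{
    size_t n = q->count;

    for (size_t i = 0; i < n; i++) {
        struct offload_req *r = q->slots[i];

        toy_xor_sync(r->dst, r->src, r->len, r->key);
        r->done(r, 0);
    }
    q->count = 0;
    return n;
}

/* Example callback: count successful completions via the cookie. */
static void count_done(struct offload_req *req, int err)
{
    if (err == 0)
        (*(int *)req->ctx)++;
}
```

A caller of the synchronous function never sees a request object or a callback. A caller of `offload_submit()` has to keep the request and its buffers alive until `done` fires, which is exactly the kind of complexity that would move out of the "crypto" side.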

However, even from the "crypto" perspective I believe that we can't get around support for asynchronous offload devices, because of mobile devices. I suspect no one would be building dedicated silicon for asynchronous AES into mobile CPUs if that weren't worth it somehow. So if such a device is present, we want to use it as much as possible: the expectation is that while the difference in performance compared to the CPU is hardly noticeable, the difference in battery life is. That's why dropping async request support from fscrypt makes it largely useless on mobile.

Most of the other offload scenarios already bypass the crypto subsystem: the network stack has its own offloading mechanism, while nx-gzip is a regular device driver and does not even register an acomp algorithm (even though that would be really cool for zram/zswap, and would benefit dozens (dozens!) of users).

A lot of the resistance to changes in the crypto subsystem comes from the long tail: either hardware that is somewhat rare, or hardware built for some special purpose where the crypto APIs are already a limiting factor. Further consolidation towards standard PCs is making the situation worse.

I can certainly see that the complexity in the API that would be needed to support all the interesting use cases is somewhat undesirable, hence the idea to split off generic transforms and allow the interfaces there to become more expressive (on-device dmabufs, in-place operation, device-side contexts, device-side queues, device-to-device transfer offload, ...).

The current state, where these use cases are technically inside the scope of the crypto subsystem but deemed out of scope by the crypto subsystem, leaves them in a kind of limbo, and that is very frustrating.

I don't know whether it would be worth dedicating a weekend to implementing nx-gzip support as an acomp module, or nx-aes support as acrypt, or whether that work would be rejected or removed in half a year, and I'm sure maintainers of ports to older hardware feel similarly.

Simon