Re: [PATCH v3 1/1] x86/mce/amd: Guard SMCA DESTAT access on non-SMCA machines

From: William Roche

Date: Tue Mar 17 2026 - 17:56:54 EST

On 3/17/26 21:24, Borislav Petkov wrote:

> On Tue, Mar 17, 2026 at 09:06:54PM +0100, William Roche wrote:
>
>> Relaying the error to the guest doesn't only have a value to target a VM
>> process but also deal with free memory or clean file cache memory impacted
>> etc... Cases where a memory error may not crash the kernel can benefit to
>> the VM too
>
> I don't understand - what do you mean with "free memory or clean file cache
> memory"?

The physical address of an uncorrected memory error (if/when it can be identified) gives the kernel a chance to react according to the state (and type) of the impacted memory, as implemented in mm/memory-failure.c with error_states[], me_pagecache_clean() or try_memory_failure()...

The kernel can try to "deal" with the error. The process case (with its SIGBUS) is probably the most common one, but a few kernel memory pages hit by a memory error can be isolated (poisoned) without requiring a kernel crash. Free memory pages and clean page cache pages are examples: they are poisoned and must not be used by the system afterwards. The kernel can also return an EIO error when an access to a poisoned page cache page is attempted, etc...
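
To illustrate the idea of reacting according to the page state, here is a toy user-space model of a table-driven dispatch. It is only a sketch and not the code in mm/memory-failure.c: the enum values, the struct layout and the function names (handle_memory_error() and friends) are all made up for this example, and the outcome chosen for each state is a simplification of what is described above.

```c
#include <stddef.h>

/* Hypothetical, simplified page states (the real kernel matches page flags). */
enum page_kind {
	PAGE_FREE,
	PAGE_CACHE_CLEAN,
	PAGE_USER_MAPPED,
	PAGE_KERNEL_IN_USE,
};

/* Hypothetical outcomes for this model only. */
enum outcome {
	OUTCOME_ISOLATED,	/* page poisoned and taken out of use */
	OUTCOME_SIGBUS,		/* owning process is notified */
	OUTCOME_UNRECOVERABLE,	/* nothing safe can be done */
};

struct state_handler {
	enum page_kind kind;
	const char *name;
	enum outcome (*action)(void);
};

static enum outcome isolate_page(void) { return OUTCOME_ISOLATED; }
static enum outcome signal_owner(void) { return OUTCOME_SIGBUS; }
static enum outcome give_up(void)      { return OUTCOME_UNRECOVERABLE; }

/* One entry per state: the first matching entry decides the reaction. */
static const struct state_handler handlers[] = {
	{ PAGE_FREE,          "free page",        isolate_page },
	{ PAGE_CACHE_CLEAN,   "clean page cache", isolate_page },
	{ PAGE_USER_MAPPED,   "user mapped page", signal_owner },
	{ PAGE_KERNEL_IN_USE, "kernel page",      give_up      },
};

/* Look up the handler for this page state and run it. */
enum outcome handle_memory_error(enum page_kind kind)
{
	for (size_t i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++)
		if (handlers[i].kind == kind)
			return handlers[i].action();

	/* Unknown state: be conservative. */
	return OUTCOME_UNRECOVERABLE;
}
```

In this model, handle_memory_error(PAGE_FREE) and handle_memory_error(PAGE_CACHE_CLEAN) both end with the page isolated, which is the "no crash needed" case discussed here.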

These mechanisms are implemented for a kernel running on bare metal, but what is really interesting when relaying the error to a VM is that the guest kernel can, in some cases, also benefit from them. Having a chance (even a small one) to avoid a VM crash is a significant gain for virtualized workloads.

Just giving my point of view on why we care about VM relayed memory errors :)

William.