Re: [PATCH v2 2/2] x86/CPU/AMD: Print the reason for the last reset
From: Ingo Molnar
Date: Tue Apr 08 2025 - 16:32:09 EST
* Mario Limonciello <superm1@xxxxxxxxxx> wrote:
> +static inline char *get_s5_reset_reason(u32 value)
> +{
> + if (value & BIT(0))
> + return "trip of thermal pin BP_THERMTRIP_L";
> + if (value & BIT(1))
> + return "power button";
> + if (value & BIT(2))
> + return "shutdown pin";
> + if (value & BIT(4))
> + return "remote ASF power off command";
> + if (value & BIT(9))
> + return "internal CPU thermal trip";
> + if (value & BIT(16))
> + return "user reset via BP_SYS_RST_L pin";
> + if (value & BIT(17))
> + return "PCI reset";
> + if (value & BIT(18) ||
> + value & BIT(19) ||
> + value & BIT(20))
> + return "CF9 reset";
> + if (value & BIT(21))
> + return "power state of acpi state transition";
> + if (value & BIT(22))
> + return "keyboard reset pin KB_RST_L";
> + if (value & BIT(23))
> + return "internal CPU shutdown";
> + if (value & BIT(24))
> + return "failed boot timer";
> + if (value & BIT(25))
> + return "watchdog timer";
> + if (value & BIT(26))
> + return "remote ASF reset command";
> + if (value & BIT(27))
> + return "data fabric sync flood event due to uncorrected error";
> + if (value & BIT(29))
> + return "MP1 watchdog timer timeout";
> + if (value & BIT(30))
> + return "parity error";
> + if (value & BIT(31))
> + return "software sync flood event";
> + return "unknown reason";
Can multiple bits be set in principle, belonging to different reasons?
Also, wouldn't a clean, readable text array and find_first_bit() result
in more readable and more maintainable code?
Which can be initialized thusly:
static const char *s6_reset_reason_txt[] = {
[0] = "trip of thermal pin BP_THERMTRIP_L",
[1] = "power button",
[2] = "shutdown pin",
[4] = "remote ASF power off command",
[9] = "internal CPU thermal trip",
...
};
Also the text should probably be expanded into standard noun+verb
sentences or so, to make it all less ambiguous:
static const char *s6_reset_reason_txt[] = {
[0] = "thermal pin BP_THERMTRIP_L was tripped",
[1] = "power button was pressed",
[2] = "shutdown pin was shorted",
[4] = "remote ASF power off command was received",
[9] = "internal CPU thermal limit was tripped",
...
};
etc. Note the deliberate use of past tense, to make it clear this
refers to a previous event, while usually syslog events indicate
current events.
> + /*
> + * FCH::PM::S5_RESET_STATUS
> + * PM Base = 0xFED80300
> + * S5_RESET_STATUS offset = 0xC0
> + */
> + addr = ioremap(0xFED803C0, sizeof(value));
0xFED803C0 is a magic number, please define a symbol for it.
> + if (!addr)
> + return 0;
> + value = ioread32(addr);
> + iounmap(addr);
> +
> + pr_info("System was reset due to %s (0x%08x)\n",
> + get_s5_reset_reason(value), value);
Please make the source of this printout a bit more specific, something
like:
x86/amd/Fam17h: Previous system reset reason [%0x08x]: %s
or so? Also note how grepped output will be easier to read due to
flipping the fixed-width numeric and the variable-length text output:
# Before:
x86/amd/Fam17h: Previous system reset reason: thermal pin BP_THERMTRIP_L was tripped (0x00000001)
x86/amd/Fam17h: Previous system reset reason: power button was pressed (0x00000002)
x86/amd/Fam17h: Previous system reset reason: shutdown pin was shorted (0x00000004)
x86/amd/Fam17h: Previous system reset reason: remote ASF power off command was received (0x00000010)
x86/amd/Fam17h: Previous system reset reason: internal CPU thermal limit was tripped (0x00000200)
# After:
x86/amd/Fam17h: Previous system reset reason: [0x00000001]: thermal pin BP_THERMTRIP_L was tripped
x86/amd/Fam17h: Previous system reset reason: [0x00000002]: power button was pressed
x86/amd/Fam17h: Previous system reset reason: [0x00000004]: shutdown pin was shorted
x86/amd/Fam17h: Previous system reset reason: [0x00000010]: remote ASF power off command was received
x86/amd/Fam17h: Previous system reset reason: [0x00000200]: internal CPU thermal limit was tripped
Thanks,
Ingo