Re: Concerns about AI code in the kernel
From: Ellie
Date: Wed May 13 2026 - 08:10:26 EST
On 5/13/26 10:17 AM, Ellie wrote:
> It also seems like if an LLM was made with properly licensed data
I just realized I should clarify what such data might look like:
I'm not a lawyer, but I'm guessing that from a moral and practical angle, the expectation would be that the training data for an LLM has a license that 1. doesn't require attribution and 2. is considered compatible with relicensing to the kernel's GPL.
(Whether this approach would safely avoid legal problems, I wouldn't know. But it seems to me like it would cause less upset in the community.)
The majority of public projects on GitHub, which are apparently what Copilot is trained on, seem not to meet the above requirements. I'm guessing the situation for Claude Code might be similarly concerning.
Regards,
Ellie