Re: Concerns about AI code in the kernel

From: Ellie

Date: Wed May 13 2026 - 08:10:26 EST


On 5/13/26 10:17 AM, Ellie wrote:
> It also seems like if an LLM was made with properly licensed data

I just realized I should clarify what such data might look like:

I'm not a lawyer, but I'm guessing that, from a moral and practical angle, the expectation would be that the training data for an LLM has a license that 1. doesn't require attribution and 2. is considered compatible with relicensing under the kernel's GPL.

(Whether this approach would safely avoid legal problems I wouldn't know. But it seems to me like it'd cause less upset in the community.)

The majority of public projects on GitHub, which are apparently what Copilot is trained on, seem not to meet the above requirements. I'm guessing the situation might be similarly concerning for Claude Code.

Regards,

Ellie