Re: [RFC PATCH net-next 2/3] seg6: add SRv6 L2 tunnel device (srl2)
From: Justin Iurman
Date: Tue Mar 24 2026 - 12:37:42 EST
On Tue, Mar 24, 2026 at 4:08 PM Justin Iurman <justin.iurman@xxxxxxxxx> wrote:
>
> On Sun, Mar 22, 2026 at 12:06 AM Andrea Mayer <andrea.mayer@xxxxxxxxxxx> wrote:
> >
> > Introduce srl2, an Ethernet pseudowire device over SRv6. It
> > encapsulates L2 frames in IPv6 with a Segment Routing Header for
> > transmission across an SRv6 network.
> >
> > The encapsulation logic reuses seg6_do_srh_encap() with
> > IPPROTO_ETHERNET. The transmit path uses the standard IPv6 tunnel
> > infrastructure (dst_cache, ip6_route_output, ip6tunnel_xmit).
> >
> > The device is configured with a segment list for point-to-point
> > L2 encapsulation.
> >
> > Usage:
> >
> > ip link add srl2-0 type srl2 segs fc00::a,fc00::b
> >
> > Co-developed-by: Stefano Salsano <stefano.salsano@xxxxxxxxxxx>
> > Signed-off-by: Stefano Salsano <stefano.salsano@xxxxxxxxxxx>
> > Signed-off-by: Andrea Mayer <andrea.mayer@xxxxxxxxxxx>
> > ---
> > include/linux/srl2.h | 7 +
> > include/uapi/linux/srl2.h | 20 +++
> > net/ipv6/Kconfig | 16 +++
> > net/ipv6/Makefile | 1 +
> > net/ipv6/seg6.c | 1 +
> > net/ipv6/srl2.c | 269 ++++++++++++++++++++++++++++++++++++++
> > 6 files changed, 314 insertions(+)
> > create mode 100644 include/linux/srl2.h
> > create mode 100644 include/uapi/linux/srl2.h
> > create mode 100644 net/ipv6/srl2.c
> >
>
> [snip]
>
> > diff --git a/net/ipv6/srl2.c b/net/ipv6/srl2.c
> > new file mode 100644
> > index 000000000000..66aa5375d218
> > --- /dev/null
> > +++ b/net/ipv6/srl2.c
> > @@ -0,0 +1,269 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * SRv6 L2 tunnel device (srl2)
> > + *
> > + * A virtual Ethernet device that encapsulates L2 frames in IPv6 with a
> > + * Segment Routing Header (SRH) for transmission over an SRv6 network.
> > + * On the remote side, a seg6_local behavior such as End.DT2U or End.DX2
> > + * decapsulates the inner Ethernet frame for L2 delivery.
> > + *
> > + * The encapsulation logic reuses seg6_do_srh_encap() from seg6_iptunnel.c
> > + * with IPPROTO_ETHERNET (143). The transmit path uses the standard IPv6
> > + * tunnel infrastructure (dst_cache, ip6_route_output, ip6tunnel_xmit).
> > + *
> > + * Authors:
> > + * Andrea Mayer <andrea.mayer@xxxxxxxxxxx>
> > + * Stefano Salsano <stefano.salsano@xxxxxxxxxxx>
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/netdevice.h>
> > +#include <linux/etherdevice.h>
> > +#include <net/dst_cache.h>
> > +#include <net/ip6_route.h>
> > +#include <net/ip_tunnels.h>
> > +#include <net/ip6_tunnel.h>
> > +#include <net/seg6.h>
> > +#include <linux/seg6.h>
> > +#include <linux/srl2.h>
> > +
> > +/* Conservative initial estimate for SRH size before newlink provides
> > + * the actual value. 256 bytes accommodates up to 15 SIDs.
> > + */
> > +#define SRL2_SRH_HEADROOM_EST 256
> > +
> > +struct srl2_priv {
> > + struct ipv6_sr_hdr *srh;
> > + struct dst_cache dst_cache;
> > +};
> > +
> > +/*
> > + * srl2_xmit - encapsulate an L2 frame in IPv6+SRH and transmit
> > + *
> > + * When the bridge (or local stack) sends a frame through this device,
> > + * skb->data points to the inner Ethernet header. We look up a route
> > + * towards the first SID, prepend the outer IPv6+SRH via
> > + * seg6_do_srh_encap(), and transmit via ip6tunnel_xmit().
> > + *
> > + * The route lookup result is cached per-cpu in dst_cache. Since the
> > + * first SID is constant for the lifetime of the device, the cache
> > + * avoids repeated route lookups in the common case.
> > + */
> > +static netdev_tx_t srl2_xmit(struct sk_buff *skb, struct net_device *dev)
> > +{
> > + struct srl2_priv *priv = netdev_priv(dev);
> > + struct net *net = dev_net(dev);
> > + struct dst_entry *dst;
> > + struct flowi6 fl6;
> > + int err;
> > +
> > + local_bh_disable();
> > + dst = dst_cache_get(&priv->dst_cache);
> > + local_bh_enable();
> > +
> > + if (unlikely(!dst)) {
> > + memset(&fl6, 0, sizeof(fl6));
> > + fl6.daddr = priv->srh->segments[priv->srh->first_segment];
> > +
> > + dst = ip6_route_output(net, NULL, &fl6);
> > + if (dst->error) {
> > + dst_release(dst);
> > + DEV_STATS_INC(dev, tx_carrier_errors);
> > + goto drop;
> > + }
> > +
> > + if (dst_dev(dst) == dev) {
> > + dst_release(dst);
> > + DEV_STATS_INC(dev, collisions);
> > + goto drop;
> > + }
> > +
> > + local_bh_disable();
> > + /* saddr is unused */
> > + dst_cache_set_ip6(&priv->dst_cache, dst, &fl6.saddr);
> > + local_bh_enable();
> > + }
> > +
> > + skb_scrub_packet(skb, false);
> > +
> > + skb_dst_set(skb, dst);
> > +
> > + err = seg6_do_srh_encap(skb, priv->srh, IPPROTO_ETHERNET);
>
> We shouldn't be reusing seg6_do_srh_encap() as it also manages its own
> lwt dst_cache. There's probably a need to rework that part to avoid
> code duplication.
Never mind. The only effect is that dst_dev_overhead() would be called
with a default (NULL) dst_entry, so I guess we should remove the likely()
annotation there for perf reasons in this case. FYI, dst_dev_overhead()
was introduced to mitigate a double reallocation of skbs, but I don't
think we should worry about it in this context, as it is only triggered
with many more segments (see https://arxiv.org/pdf/2503.14959 and
related commits 0600cf40e9b3 and 40475b63761a).
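
To make the NULL-dst point concrete, here is a rough userspace model, not
kernel code. It assumes dst_dev_overhead() returns the output device's
link-layer reserve when given a dst and falls back to skb->mac_len
otherwise, and that the encap path requests the outer header length plus
that overhead as headroom. All model_* names are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_dev { int ll_reserved; };       /* stands in for LL_RESERVED_SPACE(dev) */
struct model_dst { struct model_dev *dev; }; /* stands in for struct dst_entry */
struct model_skb { int mac_len; };           /* stands in for struct sk_buff */

/*
 * Assumed shape of dst_dev_overhead(): with a dst, use the device's
 * link-layer reserve; with a NULL dst, fall back to the skb's mac_len.
 */
static int model_dst_dev_overhead(const struct model_dst *dst,
				  const struct model_skb *skb)
{
	if (dst)
		return dst->dev->ll_reserved;

	return skb->mac_len;
}

/*
 * Headroom the encap path would request before pushing outer IPv6 + SRH
 * (assumption: tot_len plus the overhead computed above).
 */
static int model_encap_headroom(int tot_len,
				const struct model_dst *cache_dst,
				const struct model_skb *skb)
{
	return tot_len + model_dst_dev_overhead(cache_dst, skb);
}
```

With a two-SID SRH (8 + 2 * 16 = 40 bytes) plus the 40-byte outer IPv6
header, a NULL dst and a mac_len of 14 gives 94 bytes in this model; a dst
whose device reserves 16 bytes gives 96. srl2 would always take the NULL
branch, which is why the likely() hint looks wrong for it.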