From: Magnus Karlsson
Date: Mon, 23 Oct 2023 10:28:33 +0200
To: Stanislav Fomichev
CC: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
	john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
	jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com,
	dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org,
	maciej.fijalkowski@intel.com, hawk@kernel.org,
	yoong.siang.song@intel.com, netdev@vger.kernel.org,
	xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [PATCH bpf-next v4 01/11] xsk: Support tx_metadata_len
References: <20231019174944.3376335-1-sdf@google.com>
	<20231019174944.3376335-2-sdf@google.com>
In-Reply-To: <20231019174944.3376335-2-sdf@google.com>
List-Id: XDP hardware hints design discussion

On Thu, 19 Oct 2023 at 19:50, Stanislav Fomichev wrote:
>
> For zerocopy mode, tx_desc->addr can point to the arbitrary offset

nit: the -> an

> and carry some TX metadata in the headroom. For copy mode, there
> is no way currently to populate skb metadata.
>
> Introduce new tx_metadata_len umem config option that indicates how many
> bytes to treat as metadata. Metadata bytes come prior to tx_desc address
> (same as in RX case).
>
> The size of the metadata has the same constraints as XDP:
> - less than 256 bytes
> - 4-byte aligned
> - non-zero
>
> This data is not interpreted in any way right now.
>
> Signed-off-by: Stanislav Fomichev
> ---
>  include/net/xdp_sock.h            |  1 +
>  include/net/xsk_buff_pool.h       |  1 +
>  include/uapi/linux/if_xdp.h       |  1 +
>  net/xdp/xdp_umem.c                |  4 ++++
>  net/xdp/xsk.c                     | 12 +++++++++++-
>  net/xdp/xsk_buff_pool.c           |  1 +
>  net/xdp/xsk_queue.h               | 17 ++++++++++-------
>  tools/include/uapi/linux/if_xdp.h |  1 +
>  8 files changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> index 7dd0df2f6f8e..5ae88a00f34a 100644
> --- a/include/net/xdp_sock.h
> +++ b/include/net/xdp_sock.h
> @@ -30,6 +30,7 @@ struct xdp_umem {
>  	struct user_struct *user;
>  	refcount_t users;
>  	u8 flags;
> +	u8 tx_metadata_len;
>  	bool zc;
>  	struct page **pgs;
>  	int id;
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index b0bdff26fc88..1985ffaf9b0c 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -77,6 +77,7 @@ struct xsk_buff_pool {
>  	u32 chunk_size;
>  	u32 chunk_shift;
>  	u32 frame_len;
> +	u8 tx_metadata_len; /* inherited from umem */
>  	u8 cached_need_wakeup;
>  	bool uses_need_wakeup;
>  	bool dma_need_sync;
> diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
> index 8d48863472b9..2ecf79282c26 100644
> --- a/include/uapi/linux/if_xdp.h
> +++ b/include/uapi/linux/if_xdp.h
> @@ -76,6 +76,7 @@ struct xdp_umem_reg {
>  	__u32 chunk_size;
>  	__u32 headroom;
>  	__u32 flags;
> +	__u32 tx_metadata_len;
>  };
>
>  struct xdp_statistics {
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index 06cead2b8e34..333f3d53aad4 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -199,6 +199,9 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  	if (headroom >= chunk_size - XDP_PACKET_HEADROOM)
>  		return -EINVAL;
>
> +	if (mr->tx_metadata_len > 256 || mr->tx_metadata_len % 4)
> +		return -EINVAL;

Should be >= 256 since the final internal destination is a u8 and the
documentation says "should be less than 256 bytes".

> +
>  	umem->size = size;
>  	umem->headroom = headroom;
>  	umem->chunk_size = chunk_size;
> @@ -207,6 +210,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
>  	umem->pgs = NULL;
>  	umem->user = NULL;
>  	umem->flags = mr->flags;
> +	umem->tx_metadata_len = mr->tx_metadata_len;
>
>  	INIT_LIST_HEAD(&umem->xsk_dma_list);
>  	refcount_set(&umem->users, 1);
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index ba070fd37d24..ba4c77a24a83 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -1265,6 +1265,14 @@ struct xdp_umem_reg_v1 {
>  	__u32 headroom;
>  };
>
> +struct xdp_umem_reg_v2 {
> +	__u64 addr; /* Start of packet data area */
> +	__u64 len; /* Length of packet data area */
> +	__u32 chunk_size;
> +	__u32 headroom;
> +	__u32 flags;
> +};
> +
>  static int xsk_setsockopt(struct socket *sock, int level, int optname,
>  			  sockptr_t optval, unsigned int optlen)
>  {
> @@ -1308,8 +1316,10 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
>
>  		if (optlen < sizeof(struct xdp_umem_reg_v1))
>  			return -EINVAL;
> -		else if (optlen < sizeof(mr))
> +		else if (optlen < sizeof(struct xdp_umem_reg_v2))
>  			mr_size = sizeof(struct xdp_umem_reg_v1);
> +		else if (optlen < sizeof(mr))
> +			mr_size = sizeof(struct xdp_umem_reg_v2);
>
>  		if (copy_from_sockptr(&mr, optval, mr_size))
>  			return -EFAULT;
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index 49cb9f9a09be..386eddcdf837 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -85,6 +85,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
>  		XDP_PACKET_HEADROOM;
>  	pool->umem = umem;
>  	pool->addrs = umem->addrs;
> +	pool->tx_metadata_len = umem->tx_metadata_len;
>  	INIT_LIST_HEAD(&pool->free_list);
>  	INIT_LIST_HEAD(&pool->xskb_list);
>  	INIT_LIST_HEAD(&pool->xsk_tx_list);
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index 13354a1e4280..c74a1372bcb9 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -143,15 +143,17 @@ static inline bool xp_unused_options_set(u32 options)
>  static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
>  					    struct xdp_desc *desc)
>  {
> -	u64 offset = desc->addr & (pool->chunk_size - 1);
> +	u64 addr = desc->addr - pool->tx_metadata_len;
> +	u64 len = desc->len + pool->tx_metadata_len;
> +	u64 offset = addr & (pool->chunk_size - 1);
>
>  	if (!desc->len)
>  		return false;
>
> -	if (offset + desc->len > pool->chunk_size)
> +	if (offset + len > pool->chunk_size)
>  		return false;
>
> -	if (desc->addr >= pool->addrs_cnt)
> +	if (addr >= pool->addrs_cnt)
>  		return false;
>
>  	if (xp_unused_options_set(desc->options))
> @@ -162,16 +164,17 @@ static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool,
>  static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool,
>  					      struct xdp_desc *desc)
>  {
> -	u64 addr = xp_unaligned_add_offset_to_addr(desc->addr);
> +	u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len;
> +	u64 len = desc->len + pool->tx_metadata_len;
>
>  	if (!desc->len)
>  		return false;
>
> -	if (desc->len > pool->chunk_size)
> +	if (len > pool->chunk_size)
>  		return false;
>
> -	if (addr >= pool->addrs_cnt || addr + desc->len > pool->addrs_cnt ||
> -	    xp_desc_crosses_non_contig_pg(pool, addr, desc->len))
> +	if (addr >= pool->addrs_cnt || addr + len > pool->addrs_cnt ||
> +	    xp_desc_crosses_non_contig_pg(pool, addr, len))
>  		return false;
>
>  	if (xp_unused_options_set(desc->options))
> diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
> index 73a47da885dc..34411a2e5b6c 100644
> --- a/tools/include/uapi/linux/if_xdp.h
> +++ b/tools/include/uapi/linux/if_xdp.h
> @@ -76,6 +76,7 @@ struct xdp_umem_reg {
>  	__u32 chunk_size;
>  	__u32 headroom;
>  	__u32 flags;
> +	__u32 tx_metadata_len;
>  };
>
>  struct xdp_statistics {
> --
> 2.42.0.655.g421f12c284-goog
>
>
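
To make the boundary case behind the ">= 256" remark concrete, here is a
minimal userspace sketch. It is not kernel code: the function names
(check_len_v4, check_len_suggested, store_len, demo_boundary) are
hypothetical and exist only for illustration. It assumes nothing beyond
what the patch shows, namely that the uapi field is a __u32 and the
internal field is a u8.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Check as posted in v4: only values above 256 are rejected, so 256 passes. */
int check_len_v4(uint32_t tx_metadata_len)
{
	if (tx_metadata_len > 256 || tx_metadata_len % 4)
		return -EINVAL;
	return 0;
}

/* Same check with the suggested ">= 256" bound. */
int check_len_suggested(uint32_t tx_metadata_len)
{
	if (tx_metadata_len >= 256 || tx_metadata_len % 4)
		return -EINVAL;
	return 0;
}

/*
 * Mimics "umem->tx_metadata_len = mr->tx_metadata_len", where the
 * destination is a u8 and the source is a __u32.
 */
uint8_t store_len(uint32_t tx_metadata_len)
{
	return (uint8_t)tx_metadata_len;
}

/* Walks through the boundary values discussed in the review comment. */
void demo_boundary(void)
{
	/* 256 is a multiple of 4, so the v4 check accepts it ... */
	assert(check_len_v4(256) == 0);
	/* ... and the u8 store then silently truncates it to 0. */
	assert(store_len(256) == 0);

	/* With ">= 256" the value is rejected before it can be stored. */
	assert(check_len_suggested(256) == -EINVAL);

	/* The largest value that is both 4-byte aligned and fits a u8. */
	assert(check_len_suggested(252) == 0);
	assert(store_len(252) == 252);

	/* Misaligned lengths are rejected by both variants. */
	assert(check_len_v4(6) == -EINVAL);
	assert(check_len_suggested(6) == -EINVAL);
}
```

Calling demo_boundary() from a small main() runs all of the checks. With
the ">" comparison, a requested length of 256 is accepted and then stored
as 0, which is why the bound needs to be ">=".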