XDP hardware hints discussion mail archive
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: <sdf@google.com>
Cc: "Jakub Kicinski" <kuba@kernel.org>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
	haoluo@google.com, jolsa@kernel.org,
	"Tariq Toukan" <tariqt@nvidia.com>,
	"David Ahern" <dsahern@gmail.com>,
	"Willem de Bruijn" <willemb@google.com>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Anatoly Burakov" <anatoly.burakov@intel.com>,
	"Alexander Lobakin" <alexandr.lobakin@intel.com>,
	"Magnus Karlsson" <magnus.karlsson@gmail.com>,
	"Maryam Tahhan" <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [xdp-hints] Re: [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff
Date: Wed, 23 Nov 2022 22:54:18 +0100	[thread overview]
Message-ID: <Y36WiiijyDqNioIn@boxer> (raw)
In-Reply-To: <Y3557Ecr80Y9ZD2z@google.com>

On Wed, Nov 23, 2022 at 11:52:12AM -0800, sdf@google.com wrote:
> On 11/23, Jakub Kicinski wrote:
> > On Wed, 23 Nov 2022 10:26:41 -0800 Stanislav Fomichev wrote:
> > > > This embedding trick works for drivers that put xdp_buff on the stack,
> > > > but mlx5 supports XSK zerocopy, which uses the xsk_buff_pool for
> > > > allocating them. This makes it a bit awkward to do the same thing
> > there;
> > > > and since it's probably going to be fairly common to do something like
> > > > this, how about we just add a 'void *drv_priv' pointer to struct
> > > > xdp_buff that the drivers can use? The xdp_buff already takes up a
> > full
> > > > cache line anyway, so any data stuffed after it will spill over to a
> > new
> > > > one; so I don't think there's much difference performance-wise.
> > >
> > > I guess the alternative is to extend xsk_buff_pool with some new
> > > argument for xdp_buff tailroom? (so it can kmalloc(sizeof(xdp_buff) +
> > > xdp_buff_tailroom))
> > > But it seems messy because there is no way of knowing what the target
> > > device's tailroom is, so it has to be a user setting :-/
> > > I've started with a priv pointer in xdp_buff initially, it seems fine
> > > to go back. I'll probably convert veth/mlx4 to the same mode as well
> > > to avoid having different approaches in different places..
> 
> > Can we not do this please? Add 16B of "private driver space" after
> > the xdp_buff in xdp_buff_xsk (we have 16B to full cacheline), the

It is time to jump the hints train I guess:D

We have 8 bytes left in the cacheline that xdp_buff occupies - pahole
output below shows that cb spans through two cachelines. Did you mean
something else though?

> > drivers decide how they use it. Drivers can do BUILD_BUG_ON() for their
> > expected size and cast that to whatever struct they want. This is how
> > various offloads work, the variable size tailroom would be an over
> > design IMO.
> 
> > And this way non XSK paths can keep its normal typing.
> 
> Good idea, prototyped below, lmk if it that's not what you had in mind.
> 
> struct xdp_buff_xsk {
> 	struct xdp_buff            xdp;                  /*     0    56 */
> 	u8                         cb[16];               /*    56    16 */
> 	/* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
> 	dma_addr_t                 dma;                  /*    72     8 */
> 	dma_addr_t                 frame_dma;            /*    80     8 */
> 	struct xsk_buff_pool *     pool;                 /*    88     8 */
> 	u64                        orig_addr;            /*    96     8 */
> 	struct list_head           free_list_node;       /*   104    16 */
> 
> 	/* size: 120, cachelines: 2, members: 7 */
> 	/* last cacheline: 56 bytes */
> };
> 
> Toke, I can try to merge this into your patch + keep your SoB (or feel free
> to try this and retest yourself, whatever works).
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> index bc2d9034af5b..837bf103b871 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> @@ -44,6 +44,11 @@
>  	(MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \
>  	 sizeof(struct mlx5_wqe_inline_seg))
> 
> +struct mlx5_xdp_cb {
> +	struct mlx5_cqe64 *cqe;
> +	struct mlx5e_rq *rq;
> +};
> +
>  struct mlx5e_xsk_param;
>  int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param
> *xsk);
>  bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index c91b54d9ff27..84d23b2da7ce 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -5,6 +5,7 @@
>  #include "en/xdp.h"
>  #include <net/xdp_sock_drv.h>
>  #include <linux/filter.h>
> +#include <linux/build_bug.h>
> 
>  /* RX data path */
> 
> @@ -286,8 +287,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct
> mlx5e_rq *rq,
>  					      u32 cqe_bcnt)
>  {
>  	struct xdp_buff *xdp = wi->au->xsk;
> +	struct mlx5_xdp_cb *cb;
>  	struct bpf_prog *prog;
> 
> +	BUILD_BUG_ON(sizeof(struct mlx5_xdp_cb) > XSKB_CB_SIZE);
> +	cb = xp_get_cb(xdp);
> +	cb->cqe = NULL /*cqe*/;
> +	cb->rq = rq;

I believe that these could be set once at a setup time within a pool -
take a look at xsk_pool_set_rxq_info(). This will save us cycles so that
we will skip assignments per each processed xdp_buff.

AF_XDP ZC performance comes in a major part from the fact that thanks to
xsk_buff_pool we have less work to do per each processed buffer.

> +
>  	/* wi->offset is not used in this function, because xdp->data and the
>  	 * DMA address point directly to the necessary place. Furthermore, the
>  	 * XSK allocator allocates frames per packet, instead of pages, so
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index f787c3f524b0..b298590429e7 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -19,8 +19,11 @@ struct xdp_sock;
>  struct device;
>  struct page;
> 
> +#define XSKB_CB_SIZE 16
> +
>  struct xdp_buff_xsk {
>  	struct xdp_buff xdp;
> +	u8 cb[XSKB_CB_SIZE]; /* Private area for the drivers to use. */
>  	dma_addr_t dma;
>  	dma_addr_t frame_dma;
>  	struct xsk_buff_pool *pool;
> @@ -143,6 +146,11 @@ static inline dma_addr_t xp_get_frame_dma(struct
> xdp_buff_xsk *xskb)
>  	return xskb->frame_dma;
>  }
> 
> +static inline void *xp_get_cb(struct xdp_buff *xdp)
> +{
> +	return (void *)xdp + offsetof(struct xdp_buff_xsk, cb);
> +}

This should have a wrapper in include/net/xdp_sock_drv.h that drivers will
call.

Generally I think this should fly but I'm not sure about cb being 16
bytes.

> +
>  void xp_dma_sync_for_cpu_slow(struct xdp_buff_xsk *xskb);
>  static inline void xp_dma_sync_for_cpu(struct xdp_buff_xsk *xskb)
>  {
> 
> > > > I'll send my patch to add support to mlx5 (using the drv_priv pointer
> > > > approach) separately.
> > >
> > > Saw them, thanks! Will include them in v3+.

  reply	other threads:[~2022-11-23 21:55 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 18:25 [xdp-hints] [PATCH bpf-next v2 0/8] xdp: hints via kfuncs Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 1/8] bpf: Document XDP RX metadata Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 2/8] bpf: XDP metadata RX kfuncs Stanislav Fomichev
2022-11-21 23:31   ` [xdp-hints] " kernel test robot
2022-11-23  6:34   ` Martin KaFai Lau
2022-11-23 18:43     ` Stanislav Fomichev
2022-11-23 14:24   ` Toke Høiland-Jørgensen
2022-11-23 18:43     ` Stanislav Fomichev
2022-11-24  2:23   ` kernel test robot
2022-11-24 12:19   ` kernel test robot
2022-11-24 13:09   ` kernel test robot
2022-11-25 17:53   ` Toke Høiland-Jørgensen
2022-11-28 18:53     ` Stanislav Fomichev
2022-11-28 19:21       ` Stanislav Fomichev
2022-11-28 22:25         ` Toke Høiland-Jørgensen
2022-11-28 22:10       ` Toke Høiland-Jørgensen
2022-11-30 17:24   ` Larysa Zaremba
2022-11-30 19:06     ` Stanislav Fomichev
2022-11-30 20:17       ` Stanislav Fomichev
2022-12-01 13:52         ` Larysa Zaremba
2022-12-01 17:14           ` Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 3/8] veth: Introduce veth_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 4/8] veth: Support RX XDP metadata Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 5/8] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2022-11-29 10:06   ` [xdp-hints] " Anton Protopopov
2022-11-29 18:52     ` Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-11-22 13:49   ` [xdp-hints] " Tariq Toukan
2022-11-22 18:08     ` Stanislav Fomichev
2022-11-23 14:33   ` Toke Høiland-Jørgensen
2022-11-23 18:26     ` Stanislav Fomichev
2022-11-23 19:14       ` Jakub Kicinski
2022-11-23 19:52         ` sdf
2022-11-23 21:54           ` Maciej Fijalkowski [this message]
2022-11-23 21:55           ` Toke Høiland-Jørgensen
2022-11-24  1:47             ` Jakub Kicinski
2022-11-24 14:39               ` Toke Høiland-Jørgensen
2022-11-24 15:17                 ` Maciej Fijalkowski
2022-11-24 16:11                   ` Maciej Fijalkowski
2022-11-25  0:36                     ` Toke Høiland-Jørgensen
2022-11-28 21:58                       ` Stanislav Fomichev
2022-11-28 22:11                         ` Toke Høiland-Jørgensen
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 7/8] mxl4: Support RX XDP metadata Stanislav Fomichev
2022-11-22 13:50   ` [xdp-hints] " Tariq Toukan
2022-11-22 18:08     ` Stanislav Fomichev
2022-11-21 18:25 ` [xdp-hints] [PATCH bpf-next v2 8/8] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2022-11-23 14:26   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-23 18:29     ` Stanislav Fomichev
2022-11-23 19:17       ` Jakub Kicinski
2022-11-23 19:54         ` Stanislav Fomichev
2022-11-23 14:46 ` [xdp-hints] [PATCH bpf-next 1/2] xdp: Add drv_priv pointer to struct xdp_buff Toke Høiland-Jørgensen
2022-11-23 14:46   ` [xdp-hints] [PATCH bpf-next 2/2] mlx5: Support XDP RX metadata Toke Høiland-Jørgensen
2022-11-23 22:29     ` [xdp-hints] " Saeed Mahameed
2022-11-23 22:44       ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y36WiiijyDqNioIn@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=tariqt@nvidia.com \
    --cc=toke@redhat.com \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox