From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: sdf@google.com, Jakub Kicinski <kuba@kernel.org>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
haoluo@google.com, jolsa@kernel.org,
Tariq Toukan <tariqt@nvidia.com>, David Ahern <dsahern@gmail.com>,
Willem de Bruijn <willemb@google.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Anatoly Burakov <anatoly.burakov@intel.com>,
Alexander Lobakin <alexandr.lobakin@intel.com>,
Magnus Karlsson <magnus.karlsson@gmail.com>,
Maryam Tahhan <mtahhan@redhat.com>,
xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [xdp-hints] Re: [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff
Date: Wed, 23 Nov 2022 22:55:21 +0100
Message-ID: <871qptuyie.fsf@toke.dk>
In-Reply-To: <Y3557Ecr80Y9ZD2z@google.com>

sdf@google.com writes:
> On 11/23, Jakub Kicinski wrote:
>> On Wed, 23 Nov 2022 10:26:41 -0800 Stanislav Fomichev wrote:
>> > > This embedding trick works for drivers that put xdp_buff on the
>> > > stack, but mlx5 supports XSK zerocopy, which uses the xsk_buff_pool
>> > > for allocating them. This makes it a bit awkward to do the same
>> > > thing there; and since it's probably going to be fairly common to
>> > > do something like this, how about we just add a 'void *drv_priv'
>> > > pointer to struct xdp_buff that the drivers can use? The xdp_buff
>> > > already takes up a full cache line anyway, so any data stuffed
>> > > after it will spill over to a new one; so I don't think there's
>> > > much difference performance-wise.
>> >
>> > I guess the alternative is to extend xsk_buff_pool with some new
>> > argument for xdp_buff tailroom? (so it can kmalloc(sizeof(xdp_buff) +
>> > xdp_buff_tailroom))
>> > But it seems messy because there is no way of knowing what the target
>> > device's tailroom is, so it has to be a user setting :-/
>> > I started with a priv pointer in xdp_buff initially; it seems fine
>> > to go back. I'll probably convert veth/mlx4 to the same mode as well
>> > to avoid having different approaches in different places.
>
>> Can we not do this please? Add 16B of "private driver space" after
>> the xdp_buff in xdp_buff_xsk (we have 16B to a full cacheline), and
>> let the drivers decide how they use it. Drivers can do BUILD_BUG_ON()
>> for their expected size and cast that to whatever struct they want.
>> This is how various offloads work; a variable-size tailroom would be
>> over-design IMO.
>
>> And this way the non-XSK paths can keep their normal typing.
>
> Good idea, prototyped below; lmk if that's not what you had in mind.
>
> struct xdp_buff_xsk {
>         struct xdp_buff            xdp;                  /*     0    56 */
>         u8                         cb[16];               /*    56    16 */
>         /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */

As pahole helpfully says here, xdp_buff is actually only 8 bytes from
being a full cache line. I thought about adding a 'cb' field like this
to xdp_buff itself, but figured that since there's only room for a
single pointer, why not just add that and let the driver point it at
wherever it wants to store the extra context data?

I am not suggesting making anything variable-size; the 'void *drv_priv'
is just a normal pointer. There are no changes to any typing; I'm not
sure where you got that from, Jakub?

Also, the priv pointer approach works for both XSK and on-stack
allocations, unlike the cb approach (see below).
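
To make this concrete, here is a minimal sketch of the idea (the
context struct and its members are illustrative, not lifted verbatim
from my patch):

  /* One pointer added to struct xdp_buff: */
  struct xdp_buff {
          /* ... existing members ... */
          void *drv_priv; /* opaque, driver-owned context pointer */
  };

  /* Hypothetical driver-side context: */
  struct mlx5_xdp_ctx {
          struct mlx5_cqe64 *cqe;
          struct mlx5e_rq *rq;
  };

  /* The driver sets it up the same way regardless of whether the
   * xdp_buff sits on the stack or inside a pool-allocated
   * xdp_buff_xsk: */
  struct mlx5_xdp_ctx ctx = { .cqe = cqe, .rq = rq };

  xdp->drv_priv = &ctx;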
>         dma_addr_t                 dma;                  /*    72     8 */
>         dma_addr_t                 frame_dma;            /*    80     8 */
>         struct xsk_buff_pool *     pool;                 /*    88     8 */
>         u64                        orig_addr;            /*    96     8 */
>         struct list_head           free_list_node;       /*   104    16 */
>
>         /* size: 120, cachelines: 2, members: 7 */
>         /* last cacheline: 56 bytes */
> };
>
> Toke, I can try to merge this into your patch + keep your SoB (or feel free
> to try this and retest yourself, whatever works).
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> index bc2d9034af5b..837bf103b871 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> @@ -44,6 +44,11 @@
> 	(MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \
> 	 sizeof(struct mlx5_wqe_inline_seg))
>
> +struct mlx5_xdp_cb {
> +	struct mlx5_cqe64 *cqe;
> +	struct mlx5e_rq *rq;
> +};
> +
> struct mlx5e_xsk_param;
> int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
> bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index c91b54d9ff27..84d23b2da7ce 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -5,6 +5,7 @@
> #include "en/xdp.h"
> #include <net/xdp_sock_drv.h>
> #include <linux/filter.h>
> +#include <linux/build_bug.h>
>
> /* RX data path */
>
> @@ -286,8 +287,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
> 					      u32 cqe_bcnt)
> {
> 	struct xdp_buff *xdp = wi->au->xsk;
> +	struct mlx5_xdp_cb *cb;
> 	struct bpf_prog *prog;
>
> +	BUILD_BUG_ON(sizeof(struct mlx5_xdp_cb) > XSKB_CB_SIZE);
> +	cb = xp_get_cb(xdp);
> +	cb->cqe = NULL /*cqe*/;
> +	cb->rq = rq;

So this works fine for the XSK path, but for the regular XDP path, mlx5
*does* indeed put the xdp_buff on the stack. So reusing code between
the two paths would carry an implicit assumption that both the memory
layout and the size match. I'm not sure that's better than just having
a pointer inside the xdp_buff and pointing it wherever makes sense for
that driver (as my patch did)?
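
To spell out that assumption: the prototype's xp_get_cb() presumably
resolves the cb area via container_of() on xdp_buff_xsk, so any code
shared with the regular path has to find the same data at the same
offset behind an on-stack wrapper. A rough sketch (the wrapper struct
is made up for illustration):

  /* XSK path: cb[] lives in the pool-allocated xdp_buff_xsk */
  static void *xp_get_cb(struct xdp_buff *xdp)
  {
          struct xdp_buff_xsk *xskb;

          xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
          return xskb->cb;
  }

  /* Regular path: the driver wraps the xdp_buff on the stack */
  struct mlx5_xdp_buff {
          struct xdp_buff xdp;
          struct mlx5_xdp_cb cb;
  };

  /* Shared code can only cast its way to the cb area if it sits at
   * the same offset in both layouts: */
  static_assert(offsetof(struct mlx5_xdp_buff, cb) ==
                offsetof(struct xdp_buff_xsk, cb));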

-Toke