From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mail.toke.dk (Postfix) with ESMTPS id A5AC79DE275 for ; Sun, 15 Jan 2023 12:29:17 +0100 (CET) Authentication-Results: mail.toke.dk; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=WX59T3jF DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673782155; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TsRwH7a5ZAzBzan8pdnub+ZTHCBAS6u97LqnGtseybw=; b=WX59T3jFzt7P8QSGg3j+jae1Pt61/ieGbYgh0l/Fwch0hiftmg0348J4j09ADSo2FtoFAd +motObFxx3mCEQUTeiMK+hxfgyfFc+jwZYURdQuWf0wfZ4w1v/Qs2RQlcwaR9UUG6Z5G8d e0V0K3eCCT047xggHo9Hxg37rBoQ8sA= Received: from mail-ed1-f70.google.com (mail-ed1-f70.google.com [209.85.208.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-513-fC1Ete6VPuGYwKp4L7SVgw-1; Sun, 15 Jan 2023 06:13:50 -0500 X-MC-Unique: fC1Ete6VPuGYwKp4L7SVgw-1 Received: by mail-ed1-f70.google.com with SMTP id x13-20020a05640226cd00b0047ac11c9774so17317904edd.17 for ; Sun, 15 Jan 2023 03:13:50 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Qo+RjDN6fRsLN+/Dxl3u28Xl4qmoaGjS11UlhtqdFaA=; b=AuVxYMOpDW1ZFrMzNw463ho/oHkLSjB1gOok94gnltaFJvRsY2A5OtO/CBr9X6RSqo koFwsh/wU7LdFBoHBcZTfzw8Edh2KLb1um37EDv1W6DIyD8GWhMMAFh7M0NX35/Vx3AX NvS5BmfduyPE3b0VMsY0lUG2sTIx4qM9aQNhnuqv4NyZXvWaqnunrafwSfQNo8D27fGq iRJMksF2/s8gPu8eE+iAv6oXO3o5tNF3ERSAJqhLNxdiwoJ8x4Htr3zCRMSAw0wmHK2u lPx9rg5Un/+Tci5mNfP1H6S6jQdDEbocIaUwQLth+NvxwuSSWP+wXGfzRQIzPgIASe0f Ts0A== X-Gm-Message-State: AFqh2kpi672zXCRrPdnKdqTUeXCUKG0e5BM1F07pV+/gN91OFiTPwiA5 loBuG9vdGwMys7e940ZlKxyTrdj1AF/UnsVysBTvU+uHhUmbETe74RkJZ8ZagKiU6Dm/GMJksmk w6mKgPFecizw64Q89DDRs X-Received: by 2002:a17:907:cc03:b0:7c4:f8fb:6a27 with SMTP id uo3-20020a170907cc0300b007c4f8fb6a27mr7859389ejc.0.1673781228913; Sun, 15 Jan 2023 03:13:48 -0800 (PST) X-Google-Smtp-Source: AMrXdXtE12kaaGZX4CIw6tlhtGPaOV/YEsVFLoHlrViFgewZBY/VEgAEpT3y0ZvredEKvY0EAkNDBg== X-Received: by 2002:a17:907:cc03:b0:7c4:f8fb:6a27 with SMTP id uo3-20020a170907cc0300b007c4f8fb6a27mr7859201ejc.0.1673781223811; Sun, 15 Jan 2023 03:13:43 -0800 (PST) Received: from alrua-x1.borgediget.toke.dk ([2a0c:4d80:42:443::2]) by smtp.gmail.com with ESMTPSA id f12-20020a170906494c00b00860c51f7de5sm4826573ejt.131.2023.01.15.03.13.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 15 Jan 2023 03:13:43 -0800 (PST) Received: by alrua-x1.borgediget.toke.dk (Postfix, from userid 1000) id 42EA2900CC6; Sun, 15 Jan 2023 12:13:42 +0100 (CET) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= To: Tariq Toukan , Stanislav Fomichev In-Reply-To: <493fd525-10b3-c136-8458-a1560ed2cdcb@gmail.com> References: <20230112003230.3779451-1-sdf@google.com> <20230112003230.3779451-16-sdf@google.com> <87k01rfojm.fsf@toke.dk> <87h6wvfmfa.fsf@toke.dk> <87358ef7e8.fsf@toke.dk> <493fd525-10b3-c136-8458-a1560ed2cdcb@gmail.com> X-Clacks-Overhead: GNU Terry Pratchett Date: Sun, 15 Jan 2023 12:13:42 +0100 Message-ID: <87tu0sdp95.fsf@toke.dk> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Message-ID-Hash: FIX2NA3P34IIUHEO2BHIN7C3PX77EN2C X-Message-ID-Hash: FIX2NA3P34IIUHEO2BHIN7C3PX77EN2C X-MailFrom: toke@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, Saeed Mahameed , David Ahern , Jakub Kicinski , Willem de Bruijn , Jesper Dangaard Brouer , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org X-Mailman-Version: 3.3.7 Precedence: list Subject: [xdp-hints] Re: [PATCH bpf-next v7 15/17] net/mlx5e: Introduce wrapper for xdp_buff List-Id: XDP hardware hints design discussion Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Tariq Toukan writes: > On 13/01/2023 23:31, Toke H=C3=B8iland-J=C3=B8rgensen wrote: >> Tariq Toukan writes: >>=20 >>> On 12/01/2023 23:55, Toke H=C3=B8iland-J=C3=B8rgensen wrote: >>>> Toke H=C3=B8iland-J=C3=B8rgensen writes: >>>> >>>>> Stanislav Fomichev writes: >>>>> >>>>>> On Thu, Jan 12, 2023 at 12:07 AM Tariq Toukan wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 12/01/2023 2:32, Stanislav Fomichev wrote: >>>>>>>> From: Toke H=C3=B8iland-J=C3=B8rgensen >>>>>>>> >>>>>>>> Preparation for implementing HW metadata kfuncs. No functional cha= nge. >>>>>>>> >>>>>>>> Cc: Tariq Toukan >>>>>>>> Cc: Saeed Mahameed >>>>>>>> Cc: John Fastabend >>>>>>>> Cc: David Ahern >>>>>>>> Cc: Martin KaFai Lau >>>>>>>> Cc: Jakub Kicinski >>>>>>>> Cc: Willem de Bruijn >>>>>>>> Cc: Jesper Dangaard Brouer >>>>>>>> Cc: Anatoly Burakov >>>>>>>> Cc: Alexander Lobakin >>>>>>>> Cc: Magnus Karlsson >>>>>>>> Cc: Maryam Tahhan >>>>>>>> Cc: xdp-hints@xdp-project.net >>>>>>>> Cc: netdev@vger.kernel.org >>>>>>>> Signed-off-by: Toke H=C3=B8iland-J=C3=B8rgensen >>>>>>>> Signed-off-by: Stanislav Fomichev >>>>>>>> --- >>>>>>>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + >>>>>>>> .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +- >>>>>>>> .../net/ethernet/mellanox/mlx5/core/en/xdp.h | 6 +- >>>>>>>> .../ethernet/mellanox/mlx5/core/en/xsk/rx.c | 25 ++++---- >>>>>>>> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 58 +++++++++--= -------- >>>>>>>> 5 files changed, 50 insertions(+), 43 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/driver= s/net/ethernet/mellanox/mlx5/core/en.h >>>>>>>> index 2d77fb8a8a01..af663978d1b4 100644 >>>>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>>>> @@ -469,6 +469,7 @@ struct mlx5e_txqsq { >>>>>>>> union mlx5e_alloc_unit { >>>>>>>> struct page *page; >>>>>>>> struct xdp_buff *xsk; >>>>>>>> + struct mlx5e_xdp_buff *mxbuf; >>>>>>> >>>>>>> In XSK files below you mix usage of both alloc_units[page_idx].mxbu= f and >>>>>>> alloc_units[page_idx].xsk, while both fields share the memory of a = union. >>>>>>> >>>>>>> As struct mlx5e_xdp_buff wraps struct xdp_buff, I think that you ju= st >>>>>>> need to change the existing xsk field type from struct xdp_buff *xs= k >>>>>>> into struct mlx5e_xdp_buff *xsk and align the usage. >>>>>> >>>>>> Hmmm, good point. I'm actually not sure how it works currently. >>>>>> mlx5e_alloc_unit.mxbuf doesn't seem to be initialized anywhere? Toke= , >>>>>> am I missing something? >>>>> >>>>> It's initialised piecemeal in different places; but yeah, we're mixin= g >>>>> things a bit... >>>>> >>>>>> I'm thinking about something like this: >>>>>> >>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>> index af663978d1b4..2d77fb8a8a01 100644 >>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>>>> @@ -469,7 +469,6 @@ struct mlx5e_txqsq { >>>>>> union mlx5e_alloc_unit { >>>>>> struct page *page; >>>>>> struct xdp_buff *xsk; >>>>>> - struct mlx5e_xdp_buff *mxbuf; >>>>>> }; >>>>> >>>>> Hmm, for consistency with the non-XSK path we should rather go the ot= her >>>>> direction and lose the xsk member, moving everything to mxbuf? Let me >>>>> give that a shot... >>>> >>>> Something like the below? >>>> >>>> -Toke >>>> >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/ne= t/ethernet/mellanox/mlx5/core/en.h >>>> index 6de02d8aeab8..cb9cdb6421c5 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h >>>> @@ -468,7 +468,6 @@ struct mlx5e_txqsq { >>>> =20 >>>> union mlx5e_alloc_unit { >>>> =09struct page *page; >>>> -=09struct xdp_buff *xsk; >>>> =09struct mlx5e_xdp_buff *mxbuf; >>>> }; >>>> =20 >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/driver= s/net/ethernet/mellanox/mlx5/core/en/xdp.h >>>> index cb568c62aba0..95694a25ec31 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h >>>> @@ -33,6 +33,7 @@ >>>> #define __MLX5_EN_XDP_H__ >>>> =20 >>>> #include >>>> +#include >>>> =20 >>>> #include "en.h" >>>> #include "en/txrx.h" >>>> @@ -112,6 +113,21 @@ static inline void mlx5e_xmit_xdp_doorbell(struct= mlx5e_xdpsq *sq) >>>> =09} >>>> } >>>> =20 >>>> +static inline struct mlx5e_xdp_buff *mlx5e_xsk_buff_alloc(struct xsk_= buff_pool *pool) >>>> +{ >>>> +=09return (struct mlx5e_xdp_buff *)xsk_buff_alloc(pool); >>> >>> What about the space needed for the rq / cqe fields? xsk_buff_alloc >>> won't allocate it. >>=20 >> It will! See patch 14 in the series that adds a 'cb' field to >> xdp_buff_xsk, meaning there's actually space after the xdp_buff struct >> being allocated by the xsk_buff_alloc API. The XSK_CHECK_PRIV_TYPE macro >> call is there to ensure the cb field is big enough for the struct we're >> casting to in the driver. >>=20 > > Oh okay, got it. > >>>> +} >>>> + >>>> +static inline void mlx5e_xsk_buff_free(struct mlx5e_xdp_buff *mxbuf) >>>> +{ >>>> +=09xsk_buff_free(&mxbuf->xdp); >>>> +} >>>> + >>>> +static inline dma_addr_t mlx5e_xsk_buff_xdp_get_frame_dma(struct mlx5= e_xdp_buff *mxbuf) >>>> +{ >>>> +=09return xsk_buff_xdp_get_frame_dma(&mxbuf->xdp); >>>> +} >>>> + >>>> /* Enable inline WQEs to shift some load from a congested HCA (HW) = to >>>> * a less congested cpu (SW). >>>> */ >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/dri= vers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c >>>> index 8bf3029abd3c..1f166dbb7f22 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c >>>> @@ -3,7 +3,6 @@ >>>> =20 >>>> #include "rx.h" >>>> #include "en/xdp.h" >>>> -#include >>>> #include >>>> =20 >>>> /* RX data path */ >>>> @@ -21,7 +20,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u1= 6 ix) >>>> =09if (unlikely(!xsk_buff_can_alloc(rq->xsk_pool, rq->mpwqe.pages_p= er_wqe))) >>>> =09=09goto err; >>>> =20 >>>> -=09BUILD_BUG_ON(sizeof(wi->alloc_units[0]) !=3D sizeof(wi->alloc_unit= s[0].xsk)); >>>> +=09BUILD_BUG_ON(sizeof(wi->alloc_units[0]) !=3D sizeof(wi->alloc_unit= s[0].mxbuf)); >>>> =09XSK_CHECK_PRIV_TYPE(struct mlx5e_xdp_buff); >>>> =09batch =3D xsk_buff_alloc_batch(rq->xsk_pool, (struct xdp_buff **= )wi->alloc_units, >>>> =09=09=09=09 rq->mpwqe.pages_per_wqe); >>> >>> This batching API gets broken as well... >>> xsk_buff_alloc_batch fills an array of struct xdp_buff pointers, it >>> cannot correctly act on the array of struct mlx5e_xdp_buff, as it >>> contains additional fields. >>=20 >> See above for why this does, in fact, work. I agree it's not totally >> obvious, and in any case there's going to be a point where the cast >> happens where type safety will break, which is what I was alluding to in >> my reply to Stanislav. >>=20 >> I guess we could try to rework the API in xdp_sock_drv.h to make this >> more obvious instead of using the casting driver-specific wrappers I >> suggested here. Or we could go with Stanislav's suggestion of keeping >> allocation etc using xdp_buff and only casting to mlx5e_xdp_buff in the >> function where it's used; then we can keep the casting localised to that >> function, and put a comment there explaining why it works? >>=20 > > Stanislav's proposal LGTM. > Let's keep the casting localised, and make sure there's a comment there. Alright, cool! -Toke