From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by mail.toke.dk (Postfix) with ESMTPS id 5129CA1C1D6 for ; Wed, 26 Jul 2023 01:48:24 +0200 (CEST) Authentication-Results: mail.toke.dk; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Su10Lw0a Received: by mail-pg1-x54a.google.com with SMTP id 41be03b00d2f7-5634dbfb8b1so2839909a12.1 for ; Tue, 25 Jul 2023 16:48:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1690328902; x=1690933702; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=B1gbm7SHBDO0dPR/G75Fc7n9oOa169DATobxNqRb7Fo=; b=Su10Lw0aZkMXrozhNalfPEWGBz3FhkeaPpGtzzOAn1qKwDNInPVoz6XPfyQwSzANfF eMDlGZoQ3Es5KlBPDvYqSatRdy++wyXBAZyr3hvFYL1BdkpUtjqCDsXziWGQfvgt7Z1H ccUxYrLGoL4bOJ19LhS5PN2iEoPp41kxKBrENXDFjCQQAcByjx4gfmJ+e9sGrRwRobtw q2gnlpM+g2l0uNWR0L5w5SdRVCmFAy3COMYFvE7D8DrI1RwSrtdSvLPjEuKrqpDfyiJO /I+j8+Gw0LFObfsHX37p6uBdeWU6jaihzh4YEi6wglZB7NFhiNrw5BQIxEGJbP3JCDwO mAxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690328902; x=1690933702; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=B1gbm7SHBDO0dPR/G75Fc7n9oOa169DATobxNqRb7Fo=; b=GrcpJlKGgntIbBtCyDZC8Z3WvSKeBhF2moJv0eYwRjd+w1Rz4wE6DL65opVGvIsa72 GqcY94u9aP9tss2F7zgy27cDaaTp7cZfRjC2STARyv0DscysEY0MnvTFW2NUgf2Aq1fE hrf6QFEQqFBK0thnjESX6TmYImgohf7trXjuAjqNygYWY1r0YFx8Z9uj7RXlWGQS66L1 h+unf5Y7uWYSlvb65Tu1HhR9D7eJdh68eo6q3E3PfGoMNvNno1nT6eXLFw0M1f2DsGtq G0uHEOXp53647AiMy3UJm9aY/1HTJhHubxC2zmoeMJl9LMUTE4rJmyIOprV8pPzJ9XaK LyAg== X-Gm-Message-State: ABy/qLZyXBP/mXE5ujSm/mR8Zex05a2N9MD20+dYi6jtUH/ogPnB82nu izp8ttVgheBMvHbHPDw3JekfSR4= X-Google-Smtp-Source: APBJJlH3rRw/OjSC+3HxgvpyffsTjs0Fu3o5pNslJdK66NIDXlelQFYhoNUxe0vY3Ue+cuoipTE3cqc= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a63:3ec5:0:b0:557:6227:bf47 with SMTP id l188-20020a633ec5000000b005576227bf47mr3108pga.9.1690328902214; Tue, 25 Jul 2023 16:48:22 -0700 (PDT) Date: Tue, 25 Jul 2023 16:48:20 -0700 In-Reply-To: <64c056686b527_3a4d294e6@willemb.c.googlers.com.notmuch> Mime-Version: 1.0 References: <20230724235957.1953861-1-sdf@google.com> <20230724235957.1953861-3-sdf@google.com> <64c0369eadbd5_3fe1bc2940@willemb.c.googlers.com.notmuch> <64c056686b527_3a4d294e6@willemb.c.googlers.com.notmuch> Message-ID: From: Stanislav Fomichev To: Willem de Bruijn Content-Type: text/plain; charset="utf-8" Message-ID-Hash: HWQRG3QAWEQC27BOUOQPYVXNI7HKQ7C7 X-Message-ID-Hash: HWQRG3QAWEQC27BOUOQPYVXNI7HKQ7C7 X-MailFrom: 3Rl_AZAMKCTMhSUVddVaT.RdbmSe-WXcihmSe-egdYTRi.cTi@flex--sdf.bounces.google.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org, netdev@vger.kernel.org, xdp-hints@xdp-project.net X-Mailman-Version: 3.3.8 Precedence: list Subject: [xdp-hints] Re: [RFC net-next v4 2/8] xsk: add TX timestamp and TX checksum offload support List-Id: XDP hardware hints design discussion Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On 07/25, Willem de Bruijn wrote: > Stanislav Fomichev wrote: > > On 07/25, Willem de Bruijn wrote: > > > Stanislav Fomichev wrote: > > > > This change actually defines the (initial) metadata layout > > > > that should be used by AF_XDP userspace (xsk_tx_metadata). > > > > The first field is flags which requests appropriate offloads, > > > > followed by the offload-specific fields. The supported per-device > > > > offloads are exported via netlink (new xsk-flags). > > > > > > > > The offloads themselves are still implemented in a bit of a > > > > framework-y fashion that's left from my initial kfunc attempt. > > > > I'm introducing new xsk_tx_metadata_ops which drivers are > > > > supposed to implement. The drivers are also supposed > > > > to call xsk_tx_metadata_request/xsk_tx_metadata_complete in > > > > the right places. Since xsk_tx_metadata_{request,_complete} > > > > are static inline, we don't incur any extra overhead doing > > > > indirect calls. > > > > > > > > The benefit of this scheme is as follows: > > > > - keeps all metadata layout parsing away from driver code > > > > - makes it easy to grep and see which drivers implement what > > > > - don't need any extra flags to maintain to keep track of that > > > > offloads are implemented; if the callback is implemented - the offload > > > > is supported (used by netlink reporting code) > > > > > > > > Two offloads are defined right now: > > > > 1. XDP_TX_METADATA_CHECKSUM: skb-style csum_start+csum_offset > > > > 2. XDP_TX_METADATA_TIMESTAMP: writes TX timestamp back into metadata > > > > area upon completion (tx_timestamp field) > > > > > > > > The offloads are also implemented for copy mode: > > > > 1. Extra XDP_TX_METADATA_CHECKSUM_SW to trigger skb_checksum_help; this > > > > might be useful as a reference implementation and for testing > > > > 2. XDP_TX_METADATA_TIMESTAMP writes SW timestamp from the skb > > > > destructor (note I'm reusing hwtstamps to pass metadata pointer) > > > > > > > > The struct is forward-compatible and can be extended in the future > > > > by appending more fields. > > > > > > > > Signed-off-by: Stanislav Fomichev > > > > > +/* Request transmit checksum offload. Checksum start position and offset > > > > + * are communicated via csum_start and csum_offset fields of struct > > > > + * xsk_tx_metadata. > > > > + */ > > > > +#define XDP_TX_METADATA_CHECKSUM (1 << 1) > > > > + > > > > +/* Force checksum calculation in software. Can be used for testing or > > > > + * working around potential HW issues. This option causes performance > > > > + * degradation and only works in XDP_COPY mode. > > > > + */ > > > > +#define XDP_TX_METADATA_CHECKSUM_SW (1 << 2) > > > > > > Not sure how useful this is, especially if only for copy mode. > > > > Seems useful at least as a reference implementation? But I'm happy > > to drop. It's used only in the tests for now. I was using it to > > verify csum_offset/start field values. > > If testing over veth, does anything even look at the checksum? My receiver in the xdp_metadata test looks at it and compares to the fixed (verified) value: ASSERT_EQ(udph->check, 0x1c72, "csum"); The packet is always the same (and macs are fixed), so we are able to do that. > > > > +struct xsk_tx_metadata { > > > > + __u32 flags; > > > > + > > > > + /* XDP_TX_METADATA_CHECKSUM */ > > > > + > > > > + /* Offset from desc->addr where checksumming should start. */ > > > > + __u16 csum_start; > > > > + /* Offset from csum_start where checksum should be stored. */ > > > > + __u16 csum_offset; > > > > + > > > > + /* XDP_TX_METADATA_TIMESTAMP */ > > > > + > > > > + __u64 tx_timestamp; > > > > +}; > > > > > > Is this structure easily extensible for future offloads, > > > such as USO? > > > > We can append more field. What do we need for USO? Something akin > > to gso_size/gso_segs/gso_type ? > > Yes, a bit to set the feature (gso_type) and a field to store the > segment size (gso_size). > > Pacing offload is the other feature that comes to mind. That could > conceivably use the tx_timestamp field. Right, so we can append to this struct and add more XDP_TX_METADATA_$(FLAG)s to signal various features. Jakub mentioned that it might be handy to pass l2_offset/l3_offset/l4_offset, but I'm not sure whether he was talking about xSO offloads or something else.