From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jesper Dangaard Brouer
Message-ID: <9cd44759-416c-7274-f805-ee9d756f15b1@redhat.com>
Date: Tue, 4 Jul 2023 12:39:06 +0200
To: Larysa Zaremba, John Fastabend
CC: brouer@redhat.com, bpf@vger.kernel.org, ast@kernel.org,
 daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
 song@kernel.org, yhs@fb.com, kpsingh@kernel.org, sdf@google.com,
 haoluo@google.com, jolsa@kernel.org, David Ahern, Jakub Kicinski,
 Willem de Bruijn, Anatoly Burakov, Alexander Lobakin, Magnus Karlsson,
 Maryam Tahhan, xdp-hints@xdp-project.net, netdev@vger.kernel.org,
 "David S. Miller", Alexander Duyck
References: <20230703181226.19380-1-larysa.zaremba@intel.com>
 <20230703181226.19380-13-larysa.zaremba@intel.com>
 <64a331c338a5a_628d3208cb@john.notmuch>
Subject: [xdp-hints] Re: [PATCH bpf-next v2 12/20] xdp: Add checksum level hint
List-Id: XDP hardware hints design discussion

Cc. DaveM+Alex Duyck, as I value your insights on checksums.

On 04/07/2023 11.24, Larysa Zaremba wrote:
> On Mon, Jul 03, 2023 at 01:38:27PM -0700, John Fastabend wrote:
>> Larysa Zaremba wrote:
>>> Implement functionality that enables drivers to expose to XDP code,
>>> whether checksums were checked and on what level.
>>>
>>> Signed-off-by: Larysa Zaremba
>>> ---
>>>  Documentation/networking/xdp-rx-metadata.rst |  3 +++
>>>  include/linux/netdevice.h                    |  1 +
>>>  include/net/xdp.h                            |  2 ++
>>>  kernel/bpf/offload.c                         |  2 ++
>>>  net/core/xdp.c                               | 21 ++++++++++++++++++++
>>>  5 files changed, 29 insertions(+)
>>>
>>> diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
>>> index ea6dd79a21d3..4ec6ddfd2a52 100644
>>> --- a/Documentation/networking/xdp-rx-metadata.rst
>>> +++ b/Documentation/networking/xdp-rx-metadata.rst
>>> @@ -26,6 +26,9 @@ metadata is supported, this set will grow:
>>>  .. kernel-doc:: net/core/xdp.c
>>>     :identifiers: bpf_xdp_metadata_rx_vlan_tag
>>>
>>> +.. kernel-doc:: net/core/xdp.c
>>> +   :identifiers: bpf_xdp_metadata_rx_csum_lvl
>>> +
>>>  An XDP program can use these kfuncs to read the metadata into stack
>>>  variables for its own consumption. Or, to pass the metadata on to other
>>>  consumers, an XDP program can store it into the metadata area carried
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index 4fa4380e6d89..569563687172 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -1660,6 +1660,7 @@ struct xdp_metadata_ops {
>>>  			       enum xdp_rss_hash_type *rss_type);
>>>  	int (*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tag,
>>>  			       __be16 *vlan_proto);
>>> +	int (*xmo_rx_csum_lvl)(const struct xdp_md *ctx, u8 *csum_level);
>>>  };
>>>
>>>  /**
>>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>>> index 89c58f56ffc6..61ed38fa79d1 100644
>>> --- a/include/net/xdp.h
>>> +++ b/include/net/xdp.h
>>> @@ -391,6 +391,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
>>>  			   bpf_xdp_metadata_rx_hash) \
>>>  	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
>>>  			   bpf_xdp_metadata_rx_vlan_tag) \
>>> +	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CSUM_LVL, \
>>> +			   bpf_xdp_metadata_rx_csum_lvl) \
>>>
>>>  enum {
>>>  #define XDP_METADATA_KFUNC(name, _) name,
>>> diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
>>> index 986e7becfd42..a133fb775f49 100644
>>> --- a/kernel/bpf/offload.c
>>> +++ b/kernel/bpf/offload.c
>>> @@ -850,6 +850,8 @@ void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id)
>>>  		p = ops->xmo_rx_hash;
>>>  	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_VLAN_TAG))
>>>  		p = ops->xmo_rx_vlan_tag;
>>> +	else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_CSUM_LVL))
>>> +		p = ops->xmo_rx_csum_lvl;
>>>  out:
>>>  	up_read(&bpf_devs_lock);
>>>
>>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>>> index f6262c90e45f..c666d3e0a26c 100644
>>> --- a/net/core/xdp.c
>>> +++ b/net/core/xdp.c
>>> @@ -758,6 +758,27 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan
>>>  	return -EOPNOTSUPP;
>>>  }
>>>
>>> +/**
>>> + * bpf_xdp_metadata_rx_csum_lvl - Get depth at which HW has checked the checksum.
>>> + * @ctx: XDP context pointer.
>>> + * @csum_level: Return value pointer.
>>> + *
>>> + * In case of success, csum_level contains depth of the last verified checksum.
>>> + * If only the outermost checksum was verified, csum_level is 0, if both
>>> + * encapsulation and inner transport checksums were verified, csum_level is 1,
>>> + * and so on.
>>> + * For more details, refer to csum_level field in sk_buff.
>>> + *
>>> + * Return:
>>> + * * Returns 0 on success or ``-errno`` on error.
>>> + * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
>>> + * * ``-ENODATA`` : Checksum was not validated
>>> + */
>>> +__bpf_kfunc int bpf_xdp_metadata_rx_csum_lvl(const struct xdp_md *ctx, u8 *csum_level)
>>
>> Instead of ENODATA should we return what would be put in the ip_summed field
>> CHECKSUM_{NONE, UNNECESSARY, COMPLETE, PARTIAL}? Then sig would be,

I was thinking the same, what about checksum "type".

>>
>> bpf_xdp_metadata_rx_csum_lvl(const struct xdp_md *ctx, u8 *type, u8 *lvl);
>>
>> or something like that? Or is the thought that it's not really necessary?
>> I don't have a strong preference but figured it was worth asking.
>>
>
> I see no value in returning CHECKSUM_COMPLETE without the actual checksum value.
> Same with CHECKSUM_PARTIAL and csum_start. Returning those values too would
> overcomplicate the function signature.
>

So, is success of this kfunc, bpf_xdp_metadata_rx_csum_lvl(), equivalent
to CHECKSUM_UNNECESSARY?

Looking at documentation[1] (generated from skbuff.h):
 [1] https://kernel.org/doc/html/latest/networking/skbuff.html#checksumming-of-received-packets-by-device

Is the idea that we can add another kfunc (new signature) that can deal
with the other types of checksums (in a later kernel release)?

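To make the question above concrete: if success of the v2 kfunc were read as CHECKSUM_UNNECESSARY and -ENODATA as CHECKSUM_NONE, the (type, lvl) pair from the alternative signature could be derived from the existing return value. The following is only a userspace sketch in C under that assumption. The enum hint_csum_type, the function csum_lvl_to_type() and the mapping itself are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the ip_summed states named in the thread. */
enum hint_csum_type {
	HINT_CSUM_NONE,
	HINT_CSUM_UNNECESSARY,
	HINT_CSUM_COMPLETE,	/* not producible here: needs the checksum value */
	HINT_CSUM_PARTIAL,	/* not producible here: needs csum_start */
};

/*
 * Translate the result of the v2 kfunc (return code plus csum_level)
 * into a (type, lvl) pair.
 *
 * Assumption, not confirmed in the thread: success means "unnecessary".
 *
 * Returns 0 when *type and *lvl were filled in, or the original
 * negative errno when nothing is known (outputs are left untouched).
 */
static int csum_lvl_to_type(int kfunc_ret, uint8_t kfunc_lvl,
			    uint8_t *type, uint8_t *lvl)
{
	assert(type != NULL && lvl != NULL);

	if (kfunc_ret == 0) {
		*type = HINT_CSUM_UNNECESSARY;
		*lvl = kfunc_lvl;
		return 0;
	}
	if (kfunc_ret == -ENODATA) {
		/* HW did not validate anything: software must check. */
		*type = HINT_CSUM_NONE;
		*lvl = 0;
		return 0;
	}
	return kfunc_ret;	/* e.g. -EOPNOTSUPP from a driver without the kfunc */
}
```

Under this reading, COMPLETE and PARTIAL can never be reported, which matches the argument that they are of little use without the checksum value or csum_start.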
>>> +{
>>> +	return -EOPNOTSUPP;
>>> +}
>>> +
>>>  __diag_pop();
>
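For readers following the patch: the default kfunc body above returns -EOPNOTSUPP, and the hunk in bpf_dev_bound_resolve_kfunc() picks the driver's xmo_rx_csum_lvl callback instead when one is set. A rough userspace model of that fallback, in C, is sketched below. All demo_* names are invented for illustration, and the idea that a missing callback leads to the default body is my reading of the mechanism, not something stated in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*demo_csum_lvl_fn)(uint8_t *csum_level);

/* Hypothetical stand-in for struct xdp_metadata_ops (one member only). */
struct demo_metadata_ops {
	demo_csum_lvl_fn rx_csum_lvl;	/* NULL when the driver lacks support */
};

/* Models the default kfunc body quoted in the patch. */
static int demo_default_rx_csum_lvl(uint8_t *csum_level)
{
	(void)csum_level;	/* output is never written on this path */
	return -EOPNOTSUPP;
}

/* Pretend driver: HW verified the outer and one inner checksum. */
static int demo_driver_rx_csum_lvl(uint8_t *csum_level)
{
	assert(csum_level != NULL);
	*csum_level = 1;
	return 0;
}

/* Pick the driver callback if present, else fall back to the default. */
static demo_csum_lvl_fn demo_resolve_csum_lvl(const struct demo_metadata_ops *ops)
{
	if (ops && ops->rx_csum_lvl)
		return ops->rx_csum_lvl;
	return demo_default_rx_csum_lvl;
}
```

A caller would invoke demo_resolve_csum_lvl(ops)(&lvl) and only trust lvl when the return value is 0.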