From: John Fastabend <john.fastabend@gmail.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>,
Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
brouer@redhat.com, bpf@vger.kernel.org, ast@kernel.org,
daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
song@kernel.org, yhs@fb.com, kpsingh@kernel.org, sdf@google.com,
haoluo@google.com, jolsa@kernel.org,
David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
Willem de Bruijn <willemb@google.com>,
Anatoly Burakov <anatoly.burakov@intel.com>,
Alexander Lobakin <alexandr.lobakin@intel.com>,
Magnus Karlsson <magnus.karlsson@gmail.com>,
Maryam Tahhan <mtahhan@redhat.com>,
xdp-hints@xdp-project.net, netdev@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Alexander Duyck <alexander.duyck@gmail.com>
Subject: [xdp-hints] Re: [PATCH bpf-next v2 12/20] xdp: Add checksum level hint
Date: Wed, 05 Jul 2023 22:50:31 -0700
Message-ID: <64a656273ee15_b20ce2087a@john.notmuch>
In-Reply-To: <ZKQAPBcIE/iCkiX2@lincoln>
Larysa Zaremba wrote:
> On Tue, Jul 04, 2023 at 12:39:06PM +0200, Jesper Dangaard Brouer wrote:
> > Cc. DaveM+Alex Duyck, as I value your insights on checksums.
> >
> > On 04/07/2023 11.24, Larysa Zaremba wrote:
> > > On Mon, Jul 03, 2023 at 01:38:27PM -0700, John Fastabend wrote:
> > > > Larysa Zaremba wrote:
> > > > > Implement functionality that enables drivers to expose to XDP code,
> > > > > whether checksums was checked and on what level.
> > > > >
> > > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > > ---
> > > > > Documentation/networking/xdp-rx-metadata.rst | 3 +++
> > > > > include/linux/netdevice.h | 1 +
> > > > > include/net/xdp.h | 2 ++
> > > > > kernel/bpf/offload.c | 2 ++
> > > > > net/core/xdp.c | 21 ++++++++++++++++++++
> > > > > 5 files changed, 29 insertions(+)
> > > > >
> > > > > diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
> > > > > index ea6dd79a21d3..4ec6ddfd2a52 100644
> > > > > --- a/Documentation/networking/xdp-rx-metadata.rst
> > > > > +++ b/Documentation/networking/xdp-rx-metadata.rst
> > > > > @@ -26,6 +26,9 @@ metadata is supported, this set will grow:
> > > > > .. kernel-doc:: net/core/xdp.c
> > > > > :identifiers: bpf_xdp_metadata_rx_vlan_tag
> > > > > +.. kernel-doc:: net/core/xdp.c
> > > > > + :identifiers: bpf_xdp_metadata_rx_csum_lvl
> > > > > +
> > > > > An XDP program can use these kfuncs to read the metadata into stack
> > > > > variables for its own consumption. Or, to pass the metadata on to other
> > > > > consumers, an XDP program can store it into the metadata area carried
> > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > > > index 4fa4380e6d89..569563687172 100644
> > > > > --- a/include/linux/netdevice.h
> > > > > +++ b/include/linux/netdevice.h
> > > > > @@ -1660,6 +1660,7 @@ struct xdp_metadata_ops {
> > > > > enum xdp_rss_hash_type *rss_type);
> > > > > int (*xmo_rx_vlan_tag)(const struct xdp_md *ctx, u16 *vlan_tag,
> > > > > __be16 *vlan_proto);
> > > > > + int (*xmo_rx_csum_lvl)(const struct xdp_md *ctx, u8 *csum_level);
> > > > > };
> > > > > /**
> > > > > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > > > > index 89c58f56ffc6..61ed38fa79d1 100644
> > > > > --- a/include/net/xdp.h
> > > > > +++ b/include/net/xdp.h
> > > > > @@ -391,6 +391,8 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
> > > > > bpf_xdp_metadata_rx_hash) \
> > > > > XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \
> > > > > bpf_xdp_metadata_rx_vlan_tag) \
> > > > > + XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CSUM_LVL, \
> > > > > + bpf_xdp_metadata_rx_csum_lvl) \
> > > > > enum {
> > > > > #define XDP_METADATA_KFUNC(name, _) name,
> > > > > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
> > > > > index 986e7becfd42..a133fb775f49 100644
> > > > > --- a/kernel/bpf/offload.c
> > > > > +++ b/kernel/bpf/offload.c
> > > > > @@ -850,6 +850,8 @@ void *bpf_dev_bound_resolve_kfunc(struct bpf_prog *prog, u32 func_id)
> > > > > p = ops->xmo_rx_hash;
> > > > > else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_VLAN_TAG))
> > > > > p = ops->xmo_rx_vlan_tag;
> > > > > + else if (func_id == bpf_xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_CSUM_LVL))
> > > > > + p = ops->xmo_rx_csum_lvl;
> > > > > out:
> > > > > up_read(&bpf_devs_lock);
> > > > > diff --git a/net/core/xdp.c b/net/core/xdp.c
> > > > > index f6262c90e45f..c666d3e0a26c 100644
> > > > > --- a/net/core/xdp.c
> > > > > +++ b/net/core/xdp.c
> > > > > @@ -758,6 +758,27 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx, u16 *vlan
> > > > > return -EOPNOTSUPP;
> > > > > }
> > > > > +/**
> > > > > + * bpf_xdp_metadata_rx_csum_lvl - Get depth at which HW has checked the checksum.
> > > > > + * @ctx: XDP context pointer.
> > > > > + * @csum_level: Return value pointer.
> > > > > + *
> > > > > + * In case of success, csum_level contains depth of the last verified checksum.
> > > > > + * If only the outermost checksum was verified, csum_level is 0, if both
> > > > > + * encapsulation and inner transport checksums were verified, csum_level is 1,
> > > > > + * and so on.
> > > > > + * For more details, refer to csum_level field in sk_buff.
> > > > > + *
> > > > > + * Return:
> > > > > + * * Returns 0 on success or ``-errno`` on error.
> > > > > + * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
> > > > > + * * ``-ENODATA`` : Checksum was not validated
> > > > > + */
> > > > > +__bpf_kfunc int bpf_xdp_metadata_rx_csum_lvl(const struct xdp_md *ctx, u8 *csum_level)
> > > >
> > > > Instead of ENODATA should we return what would be put in the ip_summed field
> > > > CHECKSUM_{NONE, UNNECESSARY, COMPLETE, PARTIAL}? Then sig would be,
> >
> > I was thinking the same, what about checksum "type".
> >
> > > >
> > > > bpf_xdp_metadata_rx_csum_lvl(const struct xdp_md *ctx, u8 *type, u8 *lvl);
> > > >
> > > > or something like that? Or is the thought that it's not really necessary?
> > > > I don't have a strong preference but figured it was worth asking.
> > > >
> > >
> > > I see no value in returning CHECKSUM_COMPLETE without the actual checksum value.
> > > Same with CHECKSUM_PARTIAL and csum_start. Returning those values too would
> > > overcomplicate the function signature.
> >
> > So, is success from this kfunc, bpf_xdp_metadata_rx_csum_lvl(), equivalent to
> > CHECKSUM_UNNECESSARY?
>
> This is 100% true for physical NICs. It's more complicated for veth, because it
> often receives CHECKSUM_PARTIAL, which shouldn't normally appear on RX, but is
> treated by the network stack as a validated checksum, because there is no way
> an internally generated packet could be messed up. I would be grateful if you
> could look at the veth patch and share your opinion about this.
>
> >
> > Looking at documentation[1] (generated from skbuff.h):
> > [1] https://kernel.org/doc/html/latest/networking/skbuff.html#checksumming-of-received-packets-by-device
> >
> > Is the idea that we can add another kfunc (new signature) that can deal
> > with the other types of checksums (in a later kernel release)?
> >
>
> Yes, that is the idea.
If we think there is a chance we might need to handle the other checksum
types, we should add that to this kfunc rather than plan on a second one. It
would be unfortunate to have to do two kfuncs when one would work. It
shouldn't cost much/anything(?) to hardcode the type for most cases? I think
if we need it later I would advocate for updating this kfunc to support it.
Of course then userspace will have to adapt to the changed kfunc signature.
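
To make the single-kfunc idea concrete, here is a rough userspace sketch of
the (type, lvl) signature floated earlier in the thread. It is not the
patch's code and not a real kernel API: every sketch_* name and the
descriptor struct are made up for illustration. Only the numeric values are
taken from the CHECKSUM_* constants in include/linux/skbuff.h. The
assumption is that a physical NIC which only reports "verified up to depth
N" can hardcode the type, and that "not validated" is reported as the NONE
type rather than as -ENODATA.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the CHECKSUM_* constants from include/linux/skbuff.h. */
enum sketch_csum_type {
	SKETCH_CSUM_NONE        = 0,
	SKETCH_CSUM_UNNECESSARY = 1,
	SKETCH_CSUM_COMPLETE    = 2,
	SKETCH_CSUM_PARTIAL     = 3,
};

/* Hypothetical stand-in for what a driver reads from its RX descriptor. */
struct sketch_rx_desc {
	int hw_validated;      /* non-zero if HW verified at least one checksum */
	uint8_t hw_csum_level; /* depth of the last verified checksum */
};

/*
 * Hypothetical combined hint: reports the checksum type and the level in
 * one call, so "not validated" becomes SKETCH_CSUM_NONE instead of an
 * error return.
 */
static int sketch_rx_csum(const struct sketch_rx_desc *desc,
			  uint8_t *type, uint8_t *lvl)
{
	assert(desc && type && lvl);

	if (!desc->hw_validated) {
		*type = SKETCH_CSUM_NONE;
		*lvl = 0;
		return 0;
	}

	/* A physical NIC can simply hardcode the type here. */
	*type = SKETCH_CSUM_UNNECESSARY;
	*lvl = desc->hw_csum_level;
	return 0;
}
```

A caller would check the type first and only look at lvl when the type is
SKETCH_CSUM_UNNECESSARY. COMPLETE and PARTIAL are listed only to show where
they would slot in; as noted above, they would also need the checksum value
or csum_start to be useful.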