From: Stanislav Fomichev <sdf@google.com>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
brouer@redhat.com, bpf@vger.kernel.org, netdev@vger.kernel.org,
martin.lau@kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
yhs@fb.com, john.fastabend@gmail.com, dsahern@gmail.com,
willemb@google.com, void@manifault.com, kuba@kernel.org,
xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [PATCH bpf-next RFC V1] selftests/bpf: xdp_hw_metadata clear metadata when -EOPNOTSUPP
Date: Tue, 31 Jan 2023 11:01:57 -0800
Message-ID: <CAKH8qBtcDqru=_g1h17ogu26FTwRLOgyyNTO-5PY2Ur2o7vhXw@mail.gmail.com>
In-Reply-To: <839c6cbb-1572-b3a8-57eb-2aa2488101dd@redhat.com>
On Tue, Jan 31, 2023 at 5:00 AM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
>
> On 27/01/2023 18.18, Stanislav Fomichev wrote:
> > On Fri, Jan 27, 2023 at 5:58 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> >>
> >>> The AF_XDP userspace part of xdp_hw_metadata sees non-zero as a signal of
> >>> the availability of rx_timestamp and rx_hash in data_meta area. The
> >>> kernel-side BPF-prog code doesn't initialize these members when kernel
> >>> returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to
> >>> be zeroed, and can contain garbage/previous values, which will be read
> >>> and interpreted by AF_XDP userspace side.
> >>>
> >>> Tested this on different drivers. The experiences are that for most
> >>> packets they will have zeroed this data_meta area, but occasionally it
> >>> will contain garbage data.
> >>>
> >>> Example of failure tested on ixgbe:
> >>> poll: 1 (0)
> >>> xsk_ring_cons__peek: 1
> >>> 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
> >>> rx_hash: 3697961069
> >>> rx_timestamp: 9024981991734834796 (sec:9024981991.7348)
> >>> 0x18ec788: complete idx=8 addr=8000
> >>>
> >>> Converting to date:
> >>> date -d @9024981991
> >>> 2255-12-28T20:26:31 CET
> >>>
> >>> I chose a simple fix in this patch. When a kfunc fails or isn't supported,
> >>> assign zero to the corresponding struct meta value.
> >>>
> >>> It's up to the individual BPF-programmer to do something smarter e.g.
> >>> that fits their use-case, like getting a software timestamp and marking
> >>> a flag that gives the type of timestamp.
> >>>
> >>> Another possibility is for the behavior of kfunc's
> >>> bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require
> >>> clearing return value pointer.
> >>
> >> I definitely think we should leave it up to the BPF programmer to react
> >> to failures; that's what the return code is there for, after all :)
> >
> > +1
>
> +1 I agree.
> We should keep these default functions as simple as possible, for future
> "unroll" of BPF-bytecode.
>
> The -EOPNOTSUPP case (default functions for drivers not implementing the
> kfunc) will likely be used at runtime by the BPF-prog to determine if the
> hardware has this offload hint, but it comes with the overhead of a
> function pointer call.
>
> I hope we can somehow BPF-bytecode "unroll" these (default functions) at
> BPF-load time, to remove this overhead, and perhaps even let BPF
> bytecode do const propagation and code elimination?
>
>
> > Maybe we can unconditionally memset(meta, 0, sizeof(*meta)) in
> > tools/testing/selftests/bpf/progs/xdp_hw_metadata.c?
> > Since it's not a performance tool, it should be ok functionality-wise.
>
> I know this isn't a performance test, but IMHO always memsetting the
> metadata area is a misleading example. We know from experience that
> developers simply copy-paste code examples, even quick-n-dirty testing
> example code.
>
> The specific issue in this example can lead to hard-to-find bugs, as my
> testing shows that the data_meta area only occasionally contains
> garbage. We could do a memset, but it deserves a large code comment
> explaining why it is needed, so that people copy-pasting understand. I
> chose the current approach to keep the code close to what people will
> copy-paste.
Sounds good. I don't think it matters, but agreed that having this stated
explicitly could help with a blind copy-paste :-)

Then maybe repost with the TODOs removed from the kfuncs? We seem to
agree that it's the user's job to manage the final buffer.
> --Jesper
>
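
For reference, a rough userspace sketch of the per-field clearing discussed
above. It is plain C rather than BPF so that it compiles standalone.
stub_rx_timestamp() and stub_rx_hash() are hypothetical stand-ins for
bpf_xdp_metadata_rx_timestamp() and bpf_xdp_metadata_rx_hash(), and the
struct only mirrors the two fields mentioned in this thread, which is an
assumption about the selftest's layout:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: only the two fields named in this thread. */
struct xdp_meta {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
};

/* Hypothetical stub: behaves like a driver that lacks the timestamp hint. */
static int stub_rx_timestamp(uint64_t *timestamp)
{
	(void)timestamp;	/* left untouched on error, as in the report */
	return -EOPNOTSUPP;
}

/* Hypothetical stub: behaves like a driver that does provide the hash. */
static int stub_rx_hash(uint32_t *hash)
{
	*hash = 0x1234abcd;
	return 0;
}

/* Per-field handling: clear a member only when its call reports failure. */
static void fill_meta(struct xdp_meta *meta)
{
	if (stub_rx_timestamp(&meta->rx_timestamp) < 0)
		meta->rx_timestamp = 0;
	if (stub_rx_hash(&meta->rx_hash) < 0)
		meta->rx_hash = 0;
}

/* Poison the area first to mimic stale bytes in data_meta, then fill it. */
static void simulate_stale_packet(struct xdp_meta *meta)
{
	memset(meta, 0xa5, sizeof(*meta));	/* memset(ptr, value, size) */
	fill_meta(meta);
}
```

With the timestamp stub returning -EOPNOTSUPP, a consumer that treats
non-zero as "present" sees rx_timestamp == 0 instead of stale bytes, while
rx_hash keeps the value the stub wrote.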