From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-io1-xd30.google.com (mail-io1-xd30.google.com [IPv6:2607:f8b0:4864:20::d30]) by mail.toke.dk (Postfix) with ESMTPS id 406029B024D for ; Fri, 28 Oct 2022 20:46:27 +0200 (CEST) Authentication-Results: mail.toke.dk; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=B4WcZf/o Received: by mail-io1-xd30.google.com with SMTP id e15so5344421iof.2 for ; Fri, 28 Oct 2022 11:46:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Hxm7sQTpDKl9UY4520J3/vp5jchfI3eX96wGKNMDOQs=; b=B4WcZf/oRIkjQLeSHBKGFOty1f5Cfg44oU823gpumf4Yh08/EL8sbaWf1wWbl4DP4k NZq7paMR231hbzbVZi2yMIxsMCH1oJSH42C7n26mGWX1Ggv3Ta6ZhAdzownsUnBDmutE EIiiKDgqEchlSBPNrhSs+fGlc16Q+TBKot8du6YYAfd9YzjuCuR/XmKviA3jHebBNBIC Yj0bCDYZeVNN9OcCsPTgsjO8Tt1R1WGdla2wTLQ2E53V297AUCatAlC0B51hRUq01H/+ h09hG8zI/MfH6BRknnTx6iEh/17dQxcU+/UT1dL1ziXYnjPR4kTzJsY3nCDJQwpOkQVY mMQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Hxm7sQTpDKl9UY4520J3/vp5jchfI3eX96wGKNMDOQs=; b=H9vuIQaa7PaNFt69yl4XiO4X0cRuAZENVIPydk0QqOGyuVG/m1aRm8QZYx32g3mWZd YrdJ0lNriD+v9bLaSQkiEDQb+bF1mhMYnIlo9WkLYAfRtpBoi8dbc/buy8/BdYAY8/gh XMSVSZe3CLMEUl6+VOG8Rrfvx72HyRi0jOnJxi4z00/QehsobIknDx90VJ3so4+N0bXV Ey+Nhv8UWDg3dtTb+1wZ27BHm1ZsGNm75PbCnfanJmdGbO8FPP3mF99ujymlwY7wVziJ Qtr6u8tjAQBCfWbIFD9LJ/xm6YyLCVRyzGMH0kZnYuFSSgCEKjcYI1hzR6ssojMm/DuQ wUvA== X-Gm-Message-State: ACrzQf2T06cNSdgTjfvxp8xnSiklC3Pcpfrl82lSaiX6gk7Kfi2UCu6d 1+KP/Jp9frO3LoGsQmqZ345kUvLrobqQJ8bOkN46zw== X-Google-Smtp-Source: AMsMyM4Om9g03TRpDLXEtZ2L2StTMInxaehwT2GQT1L7utLfww8N9Q+e2e0zkjXrZxj6cPmzj+qsf3EGCKeKa0kA4pg= X-Received: by 2002:a05:6638:4519:b0:372:c7f1:425b with SMTP id bs25-20020a056638451900b00372c7f1425bmr457604jab.106.1666982785511; Fri, 28 Oct 2022 11:46:25 -0700 (PDT) MIME-Version: 1.0 References: <20221027200019.4106375-1-sdf@google.com> <20221027200019.4106375-6-sdf@google.com> <31f3aa18-d368-9738-8bb5-857cd5f2c5bf@linux.dev> <1885bc0c-1929-53ba-b6f8-ace2393a14df@redhat.com> In-Reply-To: <1885bc0c-1929-53ba-b6f8-ace2393a14df@redhat.com> From: Stanislav Fomichev Date: Fri, 28 Oct 2022 11:46:14 -0700 Message-ID: To: Jesper Dangaard Brouer Content-Type: text/plain; charset="UTF-8" Message-ID-Hash: U7PNEPMAYZZESOA4YWRT3S3SOJ6UHTWY X-Message-ID-Hash: U7PNEPMAYZZESOA4YWRT3S3SOJ6UHTWY X-MailFrom: sdf@google.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: Martin KaFai Lau , brouer@redhat.com, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, Jakub Kicinski , Willem de Bruijn , Anatoly Burakov , Alexander Lobakin , Magnus Karlsson , Maryam Tahhan , xdp-hints@xdp-project.net, netdev@vger.kernel.org, bpf@vger.kernel.org X-Mailman-Version: 3.3.5 Precedence: list Subject: [xdp-hints] Re: [RFC bpf-next 5/5] selftests/bpf: Test rx_timestamp metadata in xskxceiver List-Id: XDP hardware hints design discussion Archived-At: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Fri, Oct 28, 2022 at 3:37 AM Jesper Dangaard Brouer wrote: > > > On 28/10/2022 08.22, Martin KaFai Lau wrote: > > On 10/27/22 1:00 PM, Stanislav Fomichev wrote: > >> Example on how the metadata is prepared from the BPF context > >> and consumed by AF_XDP: > >> > >> - bpf_xdp_metadata_have_rx_timestamp to test whether it's supported; > >> if not, I'm assuming verifier will remove this "if (0)" branch > >> - bpf_xdp_metadata_rx_timestamp returns a _copy_ of metadata; > >> the program has to bpf_xdp_adjust_meta+memcpy it; > >> maybe returning a pointer is better? > >> - af_xdp consumer grabs it from data- and > >> makes sure timestamp is not empty > >> - when loading the program, we pass BPF_F_XDP_HAS_METADATA+prog_ifindex > >> > >> Cc: Martin KaFai Lau > >> Cc: Jakub Kicinski > >> Cc: Willem de Bruijn > >> Cc: Jesper Dangaard Brouer > >> Cc: Anatoly Burakov > >> Cc: Alexander Lobakin > >> Cc: Magnus Karlsson > >> Cc: Maryam Tahhan > >> Cc: xdp-hints@xdp-project.net > >> Cc: netdev@vger.kernel.org > >> Signed-off-by: Stanislav Fomichev > >> --- > >> .../testing/selftests/bpf/progs/xskxceiver.c | 22 ++++++++++++++++++ > >> tools/testing/selftests/bpf/xskxceiver.c | 23 ++++++++++++++++++- > >> 2 files changed, 44 insertions(+), 1 deletion(-) > >> > >> diff --git a/tools/testing/selftests/bpf/progs/xskxceiver.c > >> b/tools/testing/selftests/bpf/progs/xskxceiver.c > >> index b135daddad3a..83c879aa3581 100644 > >> --- a/tools/testing/selftests/bpf/progs/xskxceiver.c > >> +++ b/tools/testing/selftests/bpf/progs/xskxceiver.c > >> @@ -12,9 +12,31 @@ struct { > >> __type(value, __u32); > >> } xsk SEC(".maps"); > >> +extern int bpf_xdp_metadata_have_rx_timestamp(struct xdp_md *ctx) > >> __ksym; > >> +extern __u32 bpf_xdp_metadata_rx_timestamp(struct xdp_md *ctx) __ksym; > >> + > >> SEC("xdp") > >> int rx(struct xdp_md *ctx) > >> { > >> + void *data, *data_meta; > >> + __u32 rx_timestamp; > >> + int ret; > >> + > >> + if (bpf_xdp_metadata_have_rx_timestamp(ctx)) { > > In current veth implementation, bpf_xdp_metadata_have_rx_timestamp() > will always return true here. > > In the case of hardware timestamps, not every packet will contain a > hardware timestamp. (See my/Maryam ixgbe patch, where timestamps are > read via HW device register, which isn't fast, and HW only support this > for timesync protocols like PTP). > > How do you imagine we can extend this? I'm always returning true for simplicity. In the real world, this bytecode will look at the descriptors and return true/false depending on whether the info is there or not. > >> + ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(__u32)); > > IMHO sizeof() should come from a struct describing data_meta area see: > > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_kern.c#L62 I guess I should've used pointers for the return type instead, something like: extern __u64 *bpf_xdp_metadata_rx_timestamp(struct xdp_md *ctx) __ksym; { ... __u64 *rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx); if (rx_timestamp) { bpf_xdp_adjust_meta(ctx, -(int)sizeof(*rx_timestamp)); __builtin_memcpy(data_meta, rx_timestamp, sizeof(*rx_timestamp)); } } Does that look better? > >> + if (ret != 0) > >> + return XDP_DROP; > >> + > >> + data = (void *)(long)ctx->data; > >> + data_meta = (void *)(long)ctx->data_meta; > >> + > >> + if (data_meta + sizeof(__u32) > data) > >> + return XDP_DROP; > >> + > >> + rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx); > >> + __builtin_memcpy(data_meta, &rx_timestamp, sizeof(__u32)); > > So, this approach first stores hints on some other memory location, and > then need to copy over information into data_meta area. That isn't good > from a performance perspective. > > My idea is to store it in the final data_meta destination immediately. This approach doesn't have to store the hints in the other memory location. xdp_buff->priv can point to the real hw descriptor and the kfunc can have a bytecode that extracts the data from the hw descriptors. For this particular RFC, we can think that 'skb' is that hw descriptor for veth driver. > Do notice that in my approach, the existing ethtool config setting and > socket options (for timestamps) still apply. Thus, each individual > hardware hint are already configurable. Thus we already have a config > interface. I do acknowledge, that in-case a feature is disabled it still > takes up space in data_meta areas, but importantly it is NOT stored into > the area (for performance reasons). That should be the case with this rfc as well, isn't it? Worst case scenario, that kfunc bytecode can explicitly check ethtool options and return false if it's disabled? > >> + } > > > > Thanks for the patches. I took a quick look at patch 1 and 2 but > > haven't had a chance to look at the implementation details (eg. > > KF_UNROLL...etc), yet. > > > > Yes, thanks for the patches, even-though I don't agree with the > approach, at-least until my concerns/use-case can be resolved. > IMHO the best way to convince people is through code. So, thank you for > the effort. Hopefully we can use some of these ideas and I can also > change/adjust my XDP-hints ideas to incorporate some of this :-) Thank you for the feedback as well, appreciate it! Definitely, looking forward to a v2 from you with some more clarity on how those btf ids are handled by the bpf/af_xdp side! > > Overall (with the example here) looks promising. There is a lot of > > flexibility on whether the xdp prog needs any hint at all, which hint it > > needs, and how to store it. > > > > I do see the advantage that XDP prog only populates metadata it needs. > But how can we use/access this in __xdp_build_skb_from_frame() ? I don't think __xdp_build_skb_from_frame is automagically solved by either proposal? For this proposal, there has to be some expected kernel metadata format that bpf programs will prepare and the kernel will understand? Think of it like xdp_hints_common from your proposal; the program will have to put together xdp_hints_skb into xdp metadata with the parts that can be populated into skb by the kernel. For your btf ids proposal, it seems there has to be some extra kernel code to parse all possible driver btf_if formats and copy the metadata? > >> + > >> return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); > >> } > >> diff --git a/tools/testing/selftests/bpf/xskxceiver.c > >> b/tools/testing/selftests/bpf/xskxceiver.c > >> index 066bd691db13..ce82c89a432e 100644 > >> --- a/tools/testing/selftests/bpf/xskxceiver.c > >> +++ b/tools/testing/selftests/bpf/xskxceiver.c > >> @@ -871,7 +871,9 @@ static bool is_offset_correct(struct xsk_umem_info > >> *umem, struct pkt_stream *pkt > >> static bool is_pkt_valid(struct pkt *pkt, void *buffer, u64 addr, > >> u32 len) > >> { > >> void *data = xsk_umem__get_data(buffer, addr); > >> + void *data_meta = data - sizeof(__u32); > >> struct iphdr *iphdr = (struct iphdr *)(data + sizeof(struct > >> ethhdr)); > >> + __u32 rx_timestamp = 0; > >> if (!pkt) { > >> ksft_print_msg("[%s] too many packets received\n", __func__); > >> @@ -907,6 +909,13 @@ static bool is_pkt_valid(struct pkt *pkt, void > >> *buffer, u64 addr, u32 len) > >> return false; > >> } > >> + memcpy(&rx_timestamp, data_meta, sizeof(rx_timestamp)); > > I acknowledge that it is too extensive to add to this patch, but in my > AF_XDP-interaction example[1], I'm creating a struct xdp_hints_rx_time > that gets BTF exported[1][2] to the userspace application, and userspace > decodes the BTF and gets[3] a xsk_btf_member struct for members that > simply contains a offset+size to read from. > > [1] > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_kern.c#L47-L51 > > [2] > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_kern.c#L80 > > [3] > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_user.c#L123-L129 > > >> + if (rx_timestamp == 0) { > >> + ksft_print_msg("Invalid metadata received: "); > >> + ksft_print_msg("got %08x, expected != 0\n", rx_timestamp); > >> + return false; > >> + } > >> + > >> return true; > >> } > > > > Looking forward to collaborate :-) > --Jesper >