From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jesper D. Brouer"
Date: Fri, 23 Jun 2023 13:12:19 +0200
To: Stanislav Fomichev, bpf@vger.kernel.org
CC: brouer@redhat.com, ast@kernel.org, daniel@iogearbox.net,
 andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
 john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
 jolsa@kernel.org, netdev@vger.kernel.org,
 "xdp-hints@xdp-project.net"
Subject: [xdp-hints] Re: [RFC bpf-next v2 09/11] selftests/bpf: Extend xdp_metadata with devtx kfuncs
References: <20230621170244.1283336-1-sdf@google.com> <20230621170244.1283336-10-sdf@google.com>
In-Reply-To: <20230621170244.1283336-10-sdf@google.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
List-Id: XDP hardware hints design discussion

On 21/06/2023 19.02, Stanislav Fomichev wrote:
> Attach kfuncs that request and report TX timestamp via ringbuf.
> Confirm on the userspace side that the program has triggered
> and the timestamp is non-zero.
>
> Also make sure devtx_frame has sensible pointers and data.
>
[...]
> diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
> index d151d406a123..fc025183d45a 100644
> --- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
[...]
> @@ -19,10 +24,25 @@ struct {
>  	__type(value, __u32);
>  } prog_arr SEC(".maps");
>  
> +struct {
> +	__uint(type, BPF_MAP_TYPE_RINGBUF);
> +	__uint(max_entries, 10);
> +} tx_compl_buf SEC(".maps");
> +
> +__u64 pkts_fail_tx = 0;
> +
> +int ifindex = -1;
> +__u64 net_cookie = -1;
> +
>  extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
>  					 __u64 *timestamp) __ksym;
>  extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
>  				    enum xdp_rss_hash_type *rss_type) __ksym;
> +extern int bpf_devtx_sb_request_timestamp(const struct devtx_frame *ctx) __ksym;
> +extern int bpf_devtx_cp_timestamp(const struct devtx_frame *ctx, __u64 *timestamp) __ksym;
> +
> +extern int bpf_devtx_sb_attach(int ifindex, int prog_fd) __ksym;
> +extern int bpf_devtx_cp_attach(int ifindex, int prog_fd) __ksym;
>  
>  SEC("xdp")
>  int rx(struct xdp_md *ctx)
> @@ -61,4 +81,102 @@ int rx(struct xdp_md *ctx)
>  	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
>  }
>  
> +static inline int verify_frame(const struct devtx_frame *frame)
> +{
> +	struct ethhdr eth = {};
> +
> +	/* all the pointers are set up correctly */
> +	if (!frame->data)
> +		return -1;
> +	if (!frame->sinfo)
> +		return -1;
> +
> +	/* can get to the frags */
> +	if (frame->sinfo->nr_frags != 0)
> +		return -1;
> +	if (frame->sinfo->frags[0].bv_page != 0)
> +		return -1;
> +	if (frame->sinfo->frags[0].bv_len != 0)
> +		return -1;
> +	if (frame->sinfo->frags[0].bv_offset != 0)
> +		return -1;
> +
> +	/* the data has something that looks like ethernet */
> +	if (frame->len != 46)
> +		return -1;
> +	bpf_probe_read_kernel(&eth, sizeof(eth), frame->data);
> +
> +	if (eth.h_proto != bpf_htons(ETH_P_IP))
> +		return -1;
> +
> +	return 0;
> +}
> +
> +SEC("fentry/veth_devtx_submit")
> +int BPF_PROG(tx_submit, const struct devtx_frame *frame)
> +{
> +	struct xdp_tx_meta meta = {};
> +	int ret;
> +
> +	if (frame->netdev->ifindex != ifindex)
> +		return 0;
> +	if (frame->netdev->nd_net.net->net_cookie != net_cookie)
> +		return 0;
> +	if (frame->meta_len != TX_META_LEN)
> +		return 0;
> +
> +	bpf_probe_read_kernel(&meta, sizeof(meta), frame->data - TX_META_LEN);
> +	if (!meta.request_timestamp)
> +		return 0;
> +
> +	ret = verify_frame(frame);
> +	if (ret < 0) {
> +		__sync_add_and_fetch(&pkts_fail_tx, 1);
> +		return 0;
> +	}
> +
> +	ret = bpf_devtx_sb_request_timestamp(frame);

My original design thought was that BPF progs would write into the
metadata area, with the intent that at TX-complete we can access this
metadata area again.

In this case, with request_timestamp, it would make sense to me to
store a sequence number (+ the TX-queue number), such that program code
can correlate on the completion event.

Like the xdp_hw_metadata example, I would likely also add a software
timestamp, which I could check at the TX-complete hook.

> +	if (ret < 0) {
> +		__sync_add_and_fetch(&pkts_fail_tx, 1);
> +		return 0;
> +	}
> +
> +	return 0;
> +}
> +
> +SEC("fentry/veth_devtx_complete")
> +int BPF_PROG(tx_complete, const struct devtx_frame *frame)
> +{
> +	struct xdp_tx_meta meta = {};
> +	struct devtx_sample *sample;
> +	int ret;
> +
> +	if (frame->netdev->ifindex != ifindex)
> +		return 0;
> +	if (frame->netdev->nd_net.net->net_cookie != net_cookie)
> +		return 0;
> +	if (frame->meta_len != TX_META_LEN)
> +		return 0;
> +
> +	bpf_probe_read_kernel(&meta, sizeof(meta), frame->data - TX_META_LEN);
> +	if (!meta.request_timestamp)
> +		return 0;
> +
> +	ret = verify_frame(frame);
> +	if (ret < 0) {
> +		__sync_add_and_fetch(&pkts_fail_tx, 1);
> +		return 0;
> +	}
> +
> +	sample = bpf_ringbuf_reserve(&tx_compl_buf, sizeof(*sample), 0);
> +	if (!sample)
> +		return 0;

Sending this to userspace via a ringbuffer will make it hard to
correlate a sample with the frame it belongs to. (For AF_XDP it would
help a little to add the TX-queue number, as this hook isn't queue
bound but AF_XDP is.)

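As a rough illustration of the correlation idea above, here is a minimal sketch in plain userspace C rather than BPF. Everything in it is an assumption made for illustration: the struct name xdp_tx_meta_corr, every field except request_timestamp, and the two helpers tx_meta_store() and tx_meta_match() do not exist in the patch. It only shows the shape of storing (tx_queue, seq, software timestamp) in front of the packet data at submit and matching on it at completion.

```c
#include <assert.h>	/* static_assert */
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, NOT the struct from the patch: only
 * request_timestamp is taken from it, all other fields are invented.
 */
struct xdp_tx_meta_corr {
	uint8_t  request_timestamp;	/* same meaning as in the patch */
	uint8_t  pad;
	uint16_t tx_queue;		/* the hooks are not queue bound */
	uint32_t seq;			/* sequence number chosen at submit */
	uint64_t sw_submit_ns;		/* software timestamp at submit */
};

static_assert(sizeof(struct xdp_tx_meta_corr) == 16,
	      "layout is expected to have no implicit padding");

/* Submit side: store the metadata directly in front of the packet data. */
static void tx_meta_store(uint8_t *data, uint16_t queue, uint32_t seq,
			  uint64_t now_ns)
{
	struct xdp_tx_meta_corr meta = {
		.request_timestamp = 1,
		.tx_queue = queue,
		.seq = seq,
		.sw_submit_ns = now_ns,
	};

	memcpy(data - sizeof(meta), &meta, sizeof(meta));
}

/* Completion side: re-read the same area and match on (queue, seq).
 * Returns 0 and the submit-to-complete delta on a match, -1 otherwise.
 */
static int tx_meta_match(const uint8_t *data, uint16_t queue, uint32_t seq,
			 uint64_t now_ns, uint64_t *delta_ns)
{
	struct xdp_tx_meta_corr meta;

	memcpy(&meta, data - sizeof(meta), sizeof(meta));
	if (!meta.request_timestamp || meta.tx_queue != queue ||
	    meta.seq != seq)
		return -1;

	*delta_ns = now_ns - meta.sw_submit_ns;
	return 0;
}
```

Assuming the metadata area in front of frame->data survives until completion, a (tx_queue, seq) pair stored at submit can be matched at completion without relying on ordering in the ringbuf, and the software timestamp gives a sanity check on the reported value.
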
> +
> +	sample->timestamp_retval = bpf_devtx_cp_timestamp(frame, &sample->timestamp);
> +

I was expecting to see information being written into the metadata
area of the frame, such that AF_XDP completion-queue handling can
extract the obtained timestamp.

> +	bpf_ringbuf_submit(sample, 0);
> +
> +	return 0;
> +}
> +
>  char _license[] SEC("license") = "GPL";
> diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
> index 938a729bd307..e410f2b95e64 100644
> --- a/tools/testing/selftests/bpf/xdp_metadata.h
> +++ b/tools/testing/selftests/bpf/xdp_metadata.h
> @@ -18,3 +18,17 @@ struct xdp_meta {
>  		__s32 rx_hash_err;
>  	};
>  };
> +
> +struct devtx_sample {
> +	int timestamp_retval;
> +	__u64 timestamp;
> +};
> +
> +#define TX_META_LEN	8

Very static design: the metadata length is a compile-time constant.

> +
> +struct xdp_tx_meta {
> +	__u8 request_timestamp;
> +	__u8 padding0;
> +	__u16 padding1;
> +	__u32 padding2;
> +};

padding2 could be a btf_id, which would allow a more flexible design.

--Jesper