From: Toke Høiland-Jørgensen
To: Jesper Dangaard Brouer, Stanislav Fomichev, Alexander Lobakin
CC: brouer@redhat.com, Jesper Dangaard Brouer, Martin KaFai Lau,
 ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, song@kernel.org,
 yhs@fb.com, John Fastabend, kpsingh@kernel.org, haoluo@google.com,
 jolsa@kernel.org, Jakub Kicinski, Willem de Bruijn, Anatoly Burakov,
 Magnus Karlsson, Maryam Tahhan, xdp-hints@xdp-project.net,
 netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [xdp-hints] Re: [RFC bpf-next 5/5] selftests/bpf: Test rx_timestamp metadata in xskxceiver
Date: Tue, 01 Nov 2022 23:23:27 +0100
Message-ID: <87leou48m8.fsf@toke.dk>
References: <20221027200019.4106375-1-sdf@google.com>
 <20221027200019.4106375-6-sdf@google.com>
 <31f3aa18-d368-9738-8bb5-857cd5f2c5bf@linux.dev>
 <1885bc0c-1929-53ba-b6f8-ace2393a14df@redhat.com>
 <20221031142032.164247-1-alexandr.lobakin@intel.com>
List-Id: XDP hardware hints design discussion

>>>>> So, this approach first stores hints on some other memory location, and
>>>>> then need to copy over information into data_meta area. That isn't good
>>>>> from a performance perspective.
>>>>>
>>>>> My idea is to store it in the final data_meta destination immediately.
>>>>
>>>> This approach doesn't have to store the hints in the other memory
>>>> location. xdp_buff->priv can point to the real hw descriptor and the
>>>> kfunc can have a bytecode that extracts the data from the hw
>>>> descriptors. For this particular RFC, we can think that 'skb' is that
>>>> hw descriptor for veth driver.
>
> Once you point xdp_buff->priv to the real hw descriptor, then we also
> need to have some additional data/pointers to NIC hardware info + HW
> setup state. You will hit some of the same challenges as John, like
> hardware/firmware revisions and chip models, that Jakub pointed out.
> Because your approach stays with the driver code, I guess it will be a
> bit easier code wise. Maybe we can store data/pointer needed for this in
> xdp_rxq_info (xdp->rxq).
>
> I would need to see some code that juggling this HW NCI state from the
> kfunc expansion to be convinced this is the right approach.

+1 on needing to see this working for the actual metadata we want to
support, but I think the kfunc approach otherwise shows promise; see
below.

[...]

> Sure it is super cool if we can create this BPF layer that programmable
> selects individual fields from the descriptor, and maybe we ALSO need that.
> Could this layer could still be added after my patchset(?), as one could
> disable the XDP-hints (via ethtool) and then use kfuncs/kptr to extract
> only fields need by the specific XDP-prog use-case.
> Could they also co-exist(?), kfuncs/kptr could extend the
> xdp_hints_rx_common struct (in data_meta area) with more advanced
> offload-hints and then update the BTF-ID (yes, BPF can already resolve
> its own BTF-IDs from BPF-prog code).

I actually think the two approaches are more similar than they appear
from a user-facing API perspective. Or at least they should be. What I
mean is that with the BTF-ID approach, we still expect people to write
code like (from Stanislav's example in the other xdp_hints thread[0]):

if (ctx_hints_btf_id == xdp_hints_ixgbe_timestamp_btf_id /* supposedly
    populated at runtime by libbpf? */) {
        // do something with rx_timestamp
        // also, handle xdp_hints_ixgbe and then xdp_hints_common ?
} else if (ctx_hints_btf_id == xdp_hints_ixgbe) {
        // do something else
        // plus explicitly handle xdp_hints_common here?
} else {
        // handle xdp_hints_common
}

whereas with kfuncs (from this thread) this becomes:

if (xdp_metadata_rx_timestamp_exists(ctx))
        timestamp = xdp_metadata_rx_timestamp(ctx);

We can hide the former behind CO-RE macros to make it look like the
latter. But because we're just exposing the BTF IDs, people can in fact
just write code like the example above (directly checking the BTF IDs).
That will work fine, but it risks leading to a proliferation of
device-specific XDP programs. Whereas with kfuncs we keep all this stuff
internal to the kernel (inside the kfuncs), making it much easier to
change it later.

Quoting yourself from the other thread[1]:

> In this patchset I'm trying to balance the different users. And via BTF
> I'm trying hard not to create more UAPI (e.g. more fixed fields avail in
> xdp_md that we cannot get rid of). And trying to add driver flexibility
> on-top of the common struct. This flexibility seems to be stalling the
> patchset as we haven't found the perfect way to express this (yet) given
> BTF layout is per driver.

With kfuncs we kinda sidestep this issue because the kernel can handle
the per-driver specialisation by the unrolling trick. The drawback is
that programs will be tied to a particular device if they are using
metadata, but I think that's an acceptable trade-off.

-Toke

[0] https://lore.kernel.org/r/CAKH8qBuYVk7QwVOSYrhMNnaKFKGd7M9bopDyNp6-SnN6hSeTDQ@mail.gmail.com
[1] https://lore.kernel.org/r/ad360933-953a-7a99-5057-4d452a9a6005@redhat.com
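
As a rough illustration of the contrast drawn in the message above, here
is a userspace mock-up in C. It is only a sketch under stated
assumptions, not kernel or BPF code: struct mock_xdp_ctx, the MOCK_BTF_ID_*
values and the two timestamp_via_*() helpers are made up for this
example. Only the names xdp_metadata_rx_timestamp_exists() and
xdp_metadata_rx_timestamp() come from the thread, and their bodies here
are plain stubs standing in for whatever the kernel would unroll per
driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the XDP context; not a kernel structure. */
struct mock_xdp_ctx {
	uint32_t hints_btf_id;  /* BTF ID of the metadata layout in use */
	bool     has_rx_tstamp; /* whether the mock driver supplied a timestamp */
	uint64_t rx_tstamp;     /* timestamp value, valid if has_rx_tstamp */
};

/* Made-up BTF IDs, for illustration only. */
enum {
	MOCK_BTF_ID_IXGBE_TIMESTAMP = 101,
	MOCK_BTF_ID_IXGBE           = 102,
};

/*
 * BTF-ID style: the program itself compares layout IDs, so it only
 * finds the timestamp for layouts it was written to recognise.
 */
static bool timestamp_via_btf_id(const struct mock_xdp_ctx *ctx, uint64_t *ts)
{
	assert(ctx != NULL && ts != NULL);

	if (ctx->hints_btf_id == MOCK_BTF_ID_IXGBE_TIMESTAMP) {
		*ts = ctx->rx_tstamp;
		return true;
	}
	/* MOCK_BTF_ID_IXGBE and unknown layouts: no timestamp handling. */
	return false;
}

/*
 * Stubs for the two kfuncs named in the thread. In the proposal the
 * per-driver logic would live inside the kernel; here they just read
 * the mock context.
 */
static bool xdp_metadata_rx_timestamp_exists(const struct mock_xdp_ctx *ctx)
{
	return ctx->has_rx_tstamp;
}

static uint64_t xdp_metadata_rx_timestamp(const struct mock_xdp_ctx *ctx)
{
	return ctx->rx_tstamp;
}

/* kfunc style: the program never looks at the layout ID. */
static bool timestamp_via_kfunc(const struct mock_xdp_ctx *ctx, uint64_t *ts)
{
	assert(ctx != NULL && ts != NULL);

	if (xdp_metadata_rx_timestamp_exists(ctx)) {
		*ts = xdp_metadata_rx_timestamp(ctx);
		return true;
	}
	return false;
}
```

Feeding both helpers a context whose hints_btf_id is not one of the IDs
the program knows about shows the difference: the BTF-ID variant reports
no timestamp even though the mock driver supplied one, while the kfunc
variant still returns it, because the device-specific part sits behind
the two accessor functions.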