From: Stanislav Fomichev
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
    haoluo@google.com, jolsa@kernel.org, David Ahern, Jakub Kicinski,
    Willem de Bruijn, Jesper Dangaard Brouer, Anatoly Burakov,
    Alexander Lobakin, Magnus Karlsson, Maryam Tahhan,
    xdp-hints@xdp-project.net, netdev@vger.kernel.org
Date: Tue, 20 Dec 2022 14:20:27 -0800
Message-ID: <20221220222043.3348718-2-sdf@google.com>
In-Reply-To: <20221220222043.3348718-1-sdf@google.com>
References: <20221220222043.3348718-1-sdf@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
List-Id: XDP hardware hints design discussion
Subject: [xdp-hints] [PATCH bpf-next v5 01/17] bpf: Document XDP RX metadata

Document all current use-cases and assumptions.

Cc: John Fastabend
Cc: David Ahern
Cc: Martin KaFai Lau
Cc: Jakub Kicinski
Cc: Willem de Bruijn
Cc: Jesper Dangaard Brouer
Cc: Anatoly Burakov
Cc: Alexander Lobakin
Cc: Magnus Karlsson
Cc: Maryam Tahhan
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev
---
 Documentation/networking/index.rst           |   1 +
 Documentation/networking/xdp-rx-metadata.rst | 107 +++++++++++++++++++
 2 files changed, 108 insertions(+)
 create mode 100644 Documentation/networking/xdp-rx-metadata.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 4f2d1f682a18..4ddcae33c336 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -120,6 +120,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
    xfrm_proc
    xfrm_sync
    xfrm_sysctl
+   xdp-rx-metadata
 
 .. only:: subproject and html
 
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
new file mode 100644
index 000000000000..37e8192d9b60
--- /dev/null
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -0,0 +1,107 @@
+===============
+XDP RX Metadata
+===============
+
+This document describes how an XDP program can access hardware metadata
+related to a packet using a set of helper functions, and how it can pass
+that metadata on to other consumers.
+
+General Design
+==============
+
+XDP has access to a set of kfuncs to manipulate the metadata in an XDP frame.
+Every device driver that wishes to expose additional packet metadata can
+implement these kfuncs. The set of kfuncs is declared in ``include/net/xdp.h``
+via ``XDP_METADATA_KFUNC_xxx``.
+
+Currently, the following kfuncs are supported.
+In the future, as more metadata is supported, this set will grow:
+
+- ``bpf_xdp_metadata_rx_timestamp`` returns a packet's RX timestamp
+- ``bpf_xdp_metadata_rx_hash`` returns a packet's RX hash
+
+The XDP program can use these kfuncs to read the metadata into stack
+variables for its own consumption. Or, to pass the metadata on to other
+consumers, an XDP program can store it into the metadata area carried
+ahead of the packet.
+
+Not all kfuncs have to be implemented by the device driver; when not
+implemented, the default ones that return ``-EOPNOTSUPP`` will be used.
+
+Within the XDP frame, the metadata layout is as follows::
+
+  +----------+-----------------+------+
+  | headroom | custom metadata | data |
+  +----------+-----------------+------+
+             ^                 ^
+             |                 |
+   xdp_buff->data_meta   xdp_buff->data
+
+The XDP program can store individual metadata items into this ``data_meta``
+area in whichever format it chooses. Later consumers of the metadata
+will have to agree on the format by some out-of-band contract (as in
+the ``AF_XDP`` use case, see below).
+
+AF_XDP
+======
+
+The ``AF_XDP`` use case implies a contract between the BPF program that
+redirects XDP frames into the ``AF_XDP`` socket (``XSK``) and the final
+consumer. Thus the BPF program manually allocates a fixed number of
+bytes for metadata via ``bpf_xdp_adjust_meta`` and calls a subset
+of kfuncs to populate it. The userspace ``XSK`` consumer computes
+``xsk_umem__get_data() - METADATA_SIZE`` to locate its metadata.
+
+Here is the ``AF_XDP`` consumer layout (note the missing ``data_meta`` pointer)::
+
+  +----------+-----------------+------+
+  | headroom | custom metadata | data |
+  +----------+-----------------+------+
+                               ^
+                               |
+                         rx_desc->address
+
+XDP_PASS
+========
+
+This is the path where the packets processed by the XDP program are passed
+into the kernel. The kernel creates the ``skb`` out of the ``xdp_buff``
+contents.
+Currently, every driver has custom kernel code to parse the descriptors and
+populate ``skb`` metadata when doing this ``xdp_buff->skb`` conversion, and the
+XDP metadata is not used by the kernel when building skbs. However, TC-BPF
+programs can access the XDP metadata area using the ``data_meta`` pointer.
+
+In the future, we'd like to support a case where an XDP program
+can override some of the metadata used for building skbs.
+
+bpf_redirect_map
+================
+
+``bpf_redirect_map`` can redirect the frame to a different device.
+Some devices (like virtual ethernet links) support running a second XDP
+program after the redirect. However, the final consumer doesn't have
+access to the original hardware descriptor and can't access any of
+the original metadata. The same applies to XDP programs installed
+into devmaps and cpumaps.
+
+This means that for redirected packets only custom metadata is
+currently supported, which has to be prepared by the initial XDP program
+before redirect. If the frame is eventually passed to the kernel, the
+skb created from such a frame won't have any hardware metadata populated.
+And if such a packet is later redirected into an ``XSK``, the consumer will
+also only have access to the custom metadata.
+
+
+bpf_tail_call
+=============
+
+Adding programs that access metadata kfuncs to a ``BPF_MAP_TYPE_PROG_ARRAY``
+map is currently not supported.
+
+Example
+=======
+
+See ``tools/testing/selftests/bpf/progs/xdp_metadata.c`` and
+``tools/testing/selftests/bpf/prog_tests/xdp_metadata.c`` for an example of
+a BPF program that handles XDP metadata.
-- 
2.39.0.314.g84b9a713c41-goog
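
For readers who want to see the AF_XDP contract from the document in concrete terms, here is a rough sketch that models the frame layout in plain userspace C. It is not part of the patch and does not call the real kfuncs or ``bpf_xdp_adjust_meta``. The names ``struct xdp_meta``, ``struct frame``, ``frame_reserve_meta``, ``producer_store`` and ``consumer_load``, as well as the buffer sizes, are hypothetical and chosen only for illustration. ``frame_reserve_meta`` merely mimics growing the metadata area into the headroom, and ``consumer_load`` mimics the ``data - METADATA_SIZE`` computation done by the XSK consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical out-of-band contract: both sides agree on this layout. */
struct xdp_meta {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t pad;
};

#define FRAME_SIZE 256 /* assumed buffer size, illustration only */
#define HEADROOM    64 /* assumed headroom, illustration only */

/* Userspace stand-in for a frame: | headroom | custom metadata | data | */
struct frame {
	uint8_t buf[FRAME_SIZE];
	uint8_t *data_meta; /* start of the custom metadata area */
	uint8_t *data;      /* start of the packet data */
};

static void frame_init(struct frame *f)
{
	memset(f->buf, 0, sizeof(f->buf));
	f->data = f->buf + HEADROOM;
	f->data_meta = f->data; /* metadata area is empty at first */
}

/* Grow the metadata area into the headroom; 0 on success, -1 if it
 * does not fit. This only models the idea behind bpf_xdp_adjust_meta.
 */
static int frame_reserve_meta(struct frame *f, size_t size)
{
	if ((size_t)(f->data_meta - f->buf) < size)
		return -1;
	f->data_meta -= size;
	return 0;
}

/* Producer side: reserve a fixed number of bytes and fill them in. */
static int producer_store(struct frame *f, uint64_t ts, uint32_t hash)
{
	struct xdp_meta meta = { .rx_timestamp = ts, .rx_hash = hash };

	if (frame_reserve_meta(f, sizeof(meta)))
		return -1;
	/* The reserved area must now be large enough for the struct. */
	assert(f->data - f->data_meta >= (ptrdiff_t)sizeof(meta));
	memcpy(f->data_meta, &meta, sizeof(meta));
	return 0;
}

/* Consumer side: it only sees the data pointer (no data_meta), so it
 * steps back by the agreed metadata size.
 */
static struct xdp_meta consumer_load(const uint8_t *data)
{
	struct xdp_meta meta;

	memcpy(&meta, data - sizeof(meta), sizeof(meta));
	return meta;
}
```

The consumer never learns where the metadata starts from the frame itself. It relies entirely on the size both sides agreed on, which is why the document calls this a contract between the BPF program and the final consumer.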