From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 Oct 2023 10:49:44 -0700
In-Reply-To: <20231019174944.3376335-1-sdf@google.com>
Mime-Version: 1.0
References: <20231019174944.3376335-1-sdf@google.com>
X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog
Message-ID: <20231019174944.3376335-12-sdf@google.com>
From: Stanislav Fomichev
To: bpf@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"
CC: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
 martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
 john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
 haoluo@google.com, jolsa@kernel.org, kuba@kernel.org, toke@kernel.org,
 willemb@google.com, dsahern@kernel.org, magnus.karlsson@intel.com,
 bjorn@kernel.org, maciej.fijalkowski@intel.com, hawk@kernel.org,
 yoong.siang.song@intel.com, netdev@vger.kernel.org,
 xdp-hints@xdp-project.net
Subject: [xdp-hints] [PATCH bpf-next v4 11/11] xsk: Document tx_metadata_len layout
List-Id: XDP hardware hints design discussion

- how to use
- how to query features
- pointers to the examples

Signed-off-by: Stanislav Fomichev
---
 Documentation/networking/index.rst           |  1 +
 Documentation/networking/xsk-tx-metadata.rst | 77 ++++++++++++++++++++
 2 files changed, 78 insertions(+)
 create mode 100644 Documentation/networking/xsk-tx-metadata.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 2ffc5ad10295..f3c2566d6cad 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -122,6 +122,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
    xfrm_sync
    xfrm_sysctl
    xdp-rx-metadata
+   xsk-tx-metadata
 
 .. only:: subproject and html
 
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
new file mode 100644
index 000000000000..b7289f06745c
--- /dev/null
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -0,0 +1,77 @@
+==================
+AF_XDP TX Metadata
+==================
+
+This document describes how to enable offloads when transmitting packets
+via :doc:`af_xdp`. Refer to :doc:`xdp-rx-metadata` on how to access similar
+metadata on the receive side.
+
+General Design
+==============
+
+The headroom for the metadata is reserved via ``tx_metadata_len`` in
+``struct xdp_umem_reg``. The metadata length is therefore the same for
+every socket that shares the same umem. The metadata layout is a fixed UAPI,
+refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``.
+Thus, generally, the ``tx_metadata_len`` field above should contain
+``sizeof(union xsk_tx_metadata)``.
+
+The headroom and the metadata itself should be located right before
+``xdp_desc->addr`` in the umem frame. Within a frame, the metadata
+layout is as follows::
+
+         tx_metadata_len
+  /                           \
+  +-----------------+---------+----------------------------+
+  | xsk_tx_metadata | padding |          payload           |
+  +-----------------+---------+----------------------------+
+                              ^
+                              |
+                       xdp_desc->addr
+
+An AF_XDP application can request headrooms larger than ``sizeof(union
+xsk_tx_metadata)``. The kernel will ignore the padding (and will still
+use ``xdp_desc->addr - tx_metadata_len`` to locate
+the ``xsk_tx_metadata``). For the frames that shouldn't carry
+any metadata (i.e., the ones that don't have ``XDP_TX_METADATA`` option),
+the metadata area is ignored by the kernel as well.
+
+The flags field enables the particular offload:
+
+- ``XDP_TX_METADATA_TIMESTAMP``: requests the device to put transmission
+  timestamp into ``tx_timestamp`` field of ``union xsk_tx_metadata``.
+- ``XDP_TX_METADATA_CHECKSUM``: requests the device to calculate L4
+  checksum. ``csum_start`` specifies byte offset of where the checksumming
+  should start and ``csum_offset`` specifies byte offset where the
+  device should store the computed checksum.
+- ``XDP_TX_METADATA_CHECKSUM_SW``: requests checksum calculation to
+  be done in software; this mode works only in ``XSK_COPY`` mode and
+  is mostly intended for testing. Do not enable this option in
+  production; it will negatively affect performance.
+
+Besides the flags above, in order to trigger the offloads, the first
+packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
+bit in the ``options`` field. Also note that in a multi-buffer packet
+only the first chunk should carry the metadata.
+
+Querying Device Capabilities
+============================
+
+Every device exports its offload capabilities via netlink netdev family.
+Refer to ``xsk-flags`` features bitmask in
+``Documentation/netlink/specs/netdev.yaml``.
+
+- ``tx-timestamp``: device supports ``XDP_TX_METADATA_TIMESTAMP``
+- ``tx-checksum``: device supports ``XDP_TX_METADATA_CHECKSUM``
+
+Note that every device supports ``XDP_TX_METADATA_CHECKSUM_SW`` when
+running in ``XSK_COPY`` mode.
+
+See ``tools/net/ynl/samples/netdev.c`` on how to query this information.
+
+Example
+=======
+
+See ``tools/testing/selftests/bpf/xdp_hw_metadata.c`` for an example
+program that handles TX metadata. Also see https://github.com/fomichev/xskgen
+for a more bare-bones example.
-- 
2.42.0.655.g421f12c284-goog