From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stanislav Fomichev
X-MailFrom: sdf@google.com
Date: Mon, 13 Nov 2023 06:10:45 -0800
To: Jesper Dangaard Brouer
CC: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
 andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
 john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
 jolsa@kernel.org, kuba@kernel.org, toke@kernel.org, willemb@google.com,
 dsahern@kernel.org, magnus.karlsson@intel.com, bjorn@kernel.org,
 maciej.fijalkowski@intel.com, yoong.siang.song@intel.com,
 netdev@vger.kernel.org, xdp-hints@xdp-project.net
Subject: [xdp-hints] Re: [PATCH bpf-next v5 02/13] xsk: Add TX timestamp and TX checksum offload support
References: <20231102225837.1141915-1-sdf@google.com> <20231102225837.1141915-3-sdf@google.com>
List-Id: XDP hardware hints design discussion
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Mon, Nov 13, 2023 at 5:16 AM Jesper Dangaard Brouer wrote:
>
> On 11/2/23 23:58, Stanislav Fomichev wrote:
> > diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
> > index 2ecf79282c26..b0ee7ad19b51 100644
> > --- a/include/uapi/linux/if_xdp.h
> > +++ b/include/uapi/linux/if_xdp.h
> > @@ -106,6 +106,41 @@ struct xdp_options {
> >  #define XSK_UNALIGNED_BUF_ADDR_MASK \
> >  	((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)
> >
> > +/* Request transmit timestamp. Upon completion, put it into tx_timestamp
> > + * field of struct xsk_tx_metadata.
> > + */
> > +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0)
> > +
> > +/* Request transmit checksum offload. Checksum start position and offset
> > + * are communicated via csum_start and csum_offset fields of struct
> > + * xsk_tx_metadata.
> > + */
> > +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
> > +
> > +/* AF_XDP offloads request. 'request' union member is consumed by the driver
> > + * when the packet is being transmitted. 'completion' union member is
> > + * filled by the driver when the transmit completion arrives.
> > + */
> > +struct xsk_tx_metadata {
> > +	union {
> > +		struct {
> > +			__u32 flags;
> > +
> > +			/* XDP_TXMD_FLAGS_CHECKSUM */
> > +
> > +			/* Offset from desc->addr where checksumming should start. */
> > +			__u16 csum_start;
> > +			/* Offset from csum_start where checksum should be stored. */
> > +			__u16 csum_offset;
> > +		} request;
> > +
> > +		struct {
> > +			/* XDP_TXMD_FLAGS_TIMESTAMP */
> > +			__u64 tx_timestamp;
> > +		} completion;
> > +	};
> > +};
>
> This looks wrong to me. It looks like member @flags is not avail at
> completion time. At completion time, I assume we also want to know if
> someone requested to get the timestamp for this packet (else we could
> read garbage).

I've moved the parts that are preserved across tx and tx completion
into xsk_tx_metadata_compl. This is to address Magnus/Maciej feedback
where userspace might race with the kernel. See:
https://lore.kernel.org/bpf/ZNoJenzKXW5QSR3E@boxer/

> Another thing (I've raised this before): It would be really practical to
> store an u64 opaque value at TX and then read it at Completion time.
> One use-case is a forwarding application storing HW RX-time and
> comparing this to TX completion time to deduce the time spend processing
> the packet.

This can be another member, right? But note that extending
xsk_tx_metadata_compl might be a bit complicated because drivers have
to carry this info somewhere. So we have to balance the amount of
passed data between the tx and the completion.