* [xdp-hints] [PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata
@ 2025-01-06 13:56 Song Yoong Siang
2025-01-07 16:50 ` [xdp-hints] " Stanislav Fomichev
0 siblings, 1 reply; 4+ messages in thread
From: Song Yoong Siang @ 2025-01-06 13:56 UTC
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Florian Bezdeka, Donald Hunter,
Jonathan Corbet, Bjorn Topel, Magnus Karlsson,
Maciej Fijalkowski, Jonathan Lemon, Andrew Lunn,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend, Joe Damato, Stanislav Fomichev, Xuan Zhuo,
Mina Almasry, Daniel Jurgens, Song Yoong Siang, Amritha Nambiar,
Andrii Nakryiko, Eduard Zingerman, Mykola Lysenko,
Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Hao Luo,
Jiri Olsa, Shuah Khan, Alexandre Torgue, Jose Abreu,
Maxime Coquelin, Tony Nguyen, Przemek Kitszel
Cc: netdev, linux-kernel, linux-doc, bpf, linux-kselftest,
linux-stm32, linux-arm-kernel, intel-wired-lan, xdp-hints
Extend the XDP Tx metadata framework so that a user can request launch time
hardware offload, where the Ethernet device schedules the packet for
transmission at a pre-determined time called launch time. The value of
launch time is communicated from user space to the Ethernet driver via the
launch_time field of struct xsk_tx_metadata.
Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
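For illustration only (not part of the patch): a minimal user-space sketch of
how a sender might fill the request described above. The ex_ names and the
local struct are hypothetical mirrors of struct xsk_tx_metadata and
XDP_TXMD_FLAGS_LAUNCH_TIME as proposed in this patch, defined locally so the
snippet builds without the patched UAPI headers. A real application would
include <linux/if_xdp.h> instead and would also set XDP_TX_METADATA on the
descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: mirrors XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2) from this patch. */
#define EX_TXMD_FLAGS_LAUNCH_TIME (1u << 2)

/* Assumption: local mirror of the struct xsk_tx_metadata layout. */
struct ex_tx_metadata {
	uint64_t flags;
	union {
		struct {
			uint16_t csum_start;
			uint16_t csum_offset;
			uint64_t launch_time; /* ns, against the PTP HW clock */
		} request;
		struct {
			uint64_t tx_timestamp;
		} completion;
	};
};

/* Mark the frame as requesting launch time offload at launch_time_ns. */
static void ex_request_launch_time(struct ex_tx_metadata *meta,
				   uint64_t launch_time_ns)
{
	assert(meta != NULL);
	meta->flags |= EX_TXMD_FLAGS_LAUNCH_TIME;
	meta->request.launch_time = launch_time_ns;
}
```

The caller is expected to derive launch_time_ns from the device's PHC, not
from the system clock.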
Documentation/netlink/specs/netdev.yaml | 4 ++
Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
include/net/xdp_sock.h | 10 +++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 +++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 +++
tools/include/uapi/linux/netdev.h | 3 +
10 files changed, 110 insertions(+)
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index cbb544bd6c84..e59c8a14f7d1 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -70,6 +70,10 @@ definitions:
name: tx-checksum
doc:
L3 checksum HW offload is supported by the driver.
+ -
+ name: tx-launch-time
+ doc:
+ Launch time HW offload is supported by the driver.
-
name: queue-type
type: enum
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index e76b0cfc32f7..3cec089747ce 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -50,6 +50,10 @@ The flags field enables the particular offload:
checksum. ``csum_start`` specifies byte offset of where the checksumming
should start and ``csum_offset`` specifies byte offset where the
device should store the computed checksum.
+- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
+ packet for transmission at a pre-determined time called launch time. The
+ value of launch time is indicated by ``launch_time`` field of
+ ``struct xsk_tx_metadata``.
Besides the flags above, in order to trigger the offloads, the first
packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
@@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
is calculated on the CPU. Do not enable this option in production because
it will negatively affect performance.
+Launch Time
+===========
+
+The value of the requested launch time should be based on the device's PTP
+Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
+compared to the ETF queuing discipline, which organizes packets and delays
+their transmission. Instead, AF_XDP immediately hands off the packets to
+the device driver without rearranging their order or holding them prior to
+transmission. In scenarios where the launch time offload feature is
+disabled, the device driver is expected to disregard the launch time
+request. For correct interpretation and meaningful operation, the launch
+time should never be set to a value larger than the farthest programmable
+time in the future (the horizon). Different devices have different hardware
+limitations on the launch time offload feature.
+
+stmmac driver
+-------------
+
+For stmmac, TSO and launch time (TBS) features are mutually exclusive for
+each individual Tx Queue. By default, the driver configures Tx Queue 0 to
+support TSO and the rest of the Tx Queues to support TBS. The launch time
+hardware offload feature can be enabled or disabled by using the tc-etf
+command to call the driver's ndo_setup_tc() callback.
+
+The value of the launch time that is programmed in the Enhanced Normal
+Transmit Descriptors is a 32-bit value, where the most significant 8 bits
+represent the time in seconds and the remaining 24 bits represent the time
+in 256 ns increments. The programmed launch time is compared against the
+PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
+horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
+future.
+
+The stmmac driver maintains FIFO behavior and does not perform packet
+reordering. This means that a packet with a launch time request will block
+other packets in the same Tx Queue until it is transmitted.
+
+igc driver
+----------
+
+For igc, all four Tx Queues support the launch time feature. The launch
+time hardware offload feature can be enabled or disabled by using the
+tc-etf command to call the driver's ndo_setup_tc() callback. When entering
+TSN mode, the igc driver will reset the device and create a default Qbv
+schedule with a 1-second cycle time, with all Tx Queues open at all times.
+
+The value of the launch time that is programmed in the Advanced Transmit
+Context Descriptor is a relative offset to the starting time of the Qbv
+transmission window of the queue. The Frst flag of the descriptor can be
+set to schedule the packet for the next Qbv cycle. Therefore, the horizon
+of the launch time for i225 and i226 is the ending time of the next cycle
+of the Qbv transmission window of the queue. For example, when the Qbv
+cycle time is set to 1 second, the horizon of the launch time ranges
+from 1 second to 2 seconds, depending on where the Qbv cycle is currently
+running.
+
+The igc driver maintains FIFO behavior and does not perform packet
+reordering. This means that a packet with a launch time request will block
+other packets in the same Tx Queue until it is transmitted.
+
Querying Device Capabilities
============================
@@ -74,6 +137,7 @@ Refer to ``xsk-flags`` features bitmask in
- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP``
- ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM``
+- ``tx-launch-time``: device supports ``XDP_TXMD_FLAGS_LAUNCH_TIME``
See ``tools/net/ynl/samples/netdev.c`` on how to query this information.
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index bfe625b55d55..a58ae7589d12 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -110,11 +110,16 @@ struct xdp_sock {
* indicates position where checksumming should start.
* csum_offset indicates position where checksum should be stored.
*
+ * void (*tmo_request_launch_time)(u64 launch_time, void *priv)
+ * Called when AF_XDP frame requested launch time HW offload support.
+ * launch_time indicates the PTP time at which the device can schedule the
+ * packet for transmission.
*/
struct xsk_tx_metadata_ops {
void (*tmo_request_timestamp)(void *priv);
u64 (*tmo_fill_timestamp)(void *priv);
void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv);
+ void (*tmo_request_launch_time)(u64 launch_time, void *priv);
};
#ifdef CONFIG_XDP_SOCKETS
@@ -162,6 +167,11 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta,
if (!meta)
return;
+ if (ops->tmo_request_launch_time)
+ if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME)
+ ops->tmo_request_launch_time(meta->request.launch_time,
+ priv);
+
if (ops->tmo_request_timestamp)
if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP)
ops->tmo_request_timestamp(priv);
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 40085afd9160..78af371bc002 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -198,6 +198,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
#define XDP_TXMD_FLAGS_VALID ( \
XDP_TXMD_FLAGS_TIMESTAMP | \
XDP_TXMD_FLAGS_CHECKSUM | \
+ XDP_TXMD_FLAGS_LAUNCH_TIME | \
0)
static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta)
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index 42ec5ddaab8d..42869770776e 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -127,6 +127,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request launch time hardware offload. The device will schedule the packet for
+ * transmission at a pre-determined time called launch time. The value of
+ * launch time is communicated via launch_time field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -142,6 +148,10 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_LAUNCH_TIME */
+ /* Launch time in nanoseconds against the PTP HW Clock */
+ __u64 launch_time;
} request;
struct {
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e4be227d3ad6..5ab85f4af009 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the
+ * driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_LAUNCH_TIME = 4,
};
enum netdev_queue_type {
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 9527dd46e4dc..e2515cf9190f 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -52,6 +52,8 @@ XDP_METADATA_KFUNC_xxx
xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP;
if (netdev->xsk_tx_metadata_ops->tmo_request_checksum)
xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM;
+ if (netdev->xsk_tx_metadata_ops->tmo_request_launch_time)
+ xsk_features |= NETDEV_XSK_FLAGS_LAUNCH_TIME;
}
if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) ||
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3fa70286c846..8feaa0e86f07 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -743,6 +743,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
goto free_err;
}
}
+
+ if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME)
+ skb->skb_mstamp_ns = meta->request.launch_time;
}
}
diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
index 2f082b01ff22..67719f8966c2 100644
--- a/tools/include/uapi/linux/if_xdp.h
+++ b/tools/include/uapi/linux/if_xdp.h
@@ -127,6 +127,12 @@ struct xdp_options {
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request launch time hardware offload. The device will schedule the packet for
+ * transmission at a pre-determined time called launch time. The value of
+ * launch time is communicated via launch_time field of struct xsk_tx_metadata.
+ */
+#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2)
+
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -142,6 +148,10 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
+
+ /* XDP_TXMD_FLAGS_LAUNCH_TIME */
+ /* Launch time in nanoseconds against the PTP HW Clock */
+ __u64 launch_time;
} request;
struct {
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e4be227d3ad6..5ab85f4af009 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
+ * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the
+ * driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
+ NETDEV_XSK_FLAGS_LAUNCH_TIME = 4,
};
enum netdev_queue_type {
--
2.34.1
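
The stmmac launch time format described in the documentation hunk above can
be sketched as follows. This is an illustration of the documented 32-bit
layout only (8 bits of seconds, 24 bits in 256 ns units); the ex_ helper
names are hypothetical and the way the driver actually splits the value
across descriptor words may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_NSEC_PER_SEC      1000000000ULL
/* Horizon stated in the documentation: 128 seconds into the future. */
#define EX_STMMAC_HORIZON_NS (128ULL * EX_NSEC_PER_SEC)

/* Pack a PHC time in ns into the documented 32-bit launch time value. */
static uint32_t ex_stmmac_encode_launch_time(uint64_t launch_ns)
{
	uint64_t sec = launch_ns / EX_NSEC_PER_SEC;
	uint64_t nsec = launch_ns % EX_NSEC_PER_SEC;

	/* The sub-second part in 256 ns units always fits in 24 bits. */
	assert((nsec >> 8) <= 0xffffffu);

	/* Seconds wrap every 256 s because only the low 8 bits are kept. */
	return (uint32_t)(((sec & 0xffu) << 24) | ((nsec >> 8) & 0xffffffu));
}

/* True if launch_ns is not in the past and not beyond the horizon. */
static bool ex_stmmac_within_horizon(uint64_t now_ns, uint64_t launch_ns)
{
	return launch_ns >= now_ns &&
	       launch_ns - now_ns <= EX_STMMAC_HORIZON_NS;
}
```

Because the seconds field wraps, two launch times that are 256 seconds apart
encode to the same value, which is why a request beyond the horizon cannot be
interpreted correctly by the hardware.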
* [xdp-hints] Re: [PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata
From: Stanislav Fomichev @ 2025-01-07 16:50 UTC
To: Song Yoong Siang
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Florian Bezdeka, Donald Hunter,
Jonathan Corbet, Bjorn Topel, Magnus Karlsson,
Maciej Fijalkowski, Jonathan Lemon, Andrew Lunn,
Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
John Fastabend, Joe Damato, Stanislav Fomichev, Xuan Zhuo,
Mina Almasry, Daniel Jurgens, Amritha Nambiar, Andrii Nakryiko,
Eduard Zingerman, Mykola Lysenko, Martin KaFai Lau, Song Liu,
Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
Alexandre Torgue, Jose Abreu, Maxime Coquelin, Tony Nguyen,
Przemek Kitszel, netdev, linux-kernel, linux-doc, bpf,
linux-kselftest, linux-stm32, linux-arm-kernel, intel-wired-lan,
xdp-hints
On 01/06, Song Yoong Siang wrote:
> Extend the XDP Tx metadata framework so that user can requests launch time
> hardware offload, where the Ethernet device will schedule the packet for
> transmission at a pre-determined time called launch time. The value of
> launch time is communicated from user space to Ethernet driver via
> launch_time field of struct xsk_tx_metadata.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
> ---
> Documentation/netlink/specs/netdev.yaml | 4 ++
> Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
> include/net/xdp_sock.h | 10 +++
> include/net/xdp_sock_drv.h | 1 +
> include/uapi/linux/if_xdp.h | 10 +++
> include/uapi/linux/netdev.h | 3 +
> net/core/netdev-genl.c | 2 +
> net/xdp/xsk.c | 3 +
> tools/include/uapi/linux/if_xdp.h | 10 +++
> tools/include/uapi/linux/netdev.h | 3 +
> 10 files changed, 110 insertions(+)
>
> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
> index cbb544bd6c84..e59c8a14f7d1 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -70,6 +70,10 @@ definitions:
> name: tx-checksum
> doc:
> L3 checksum HW offload is supported by the driver.
> + -
> + name: tx-launch-time
> + doc:
> + Launch time HW offload is supported by the driver.
> -
> name: queue-type
> type: enum
> diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
> index e76b0cfc32f7..3cec089747ce 100644
> --- a/Documentation/networking/xsk-tx-metadata.rst
> +++ b/Documentation/networking/xsk-tx-metadata.rst
> @@ -50,6 +50,10 @@ The flags field enables the particular offload:
> checksum. ``csum_start`` specifies byte offset of where the checksumming
> should start and ``csum_offset`` specifies byte offset where the
> device should store the computed checksum.
> +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
> + packet for transmission at a pre-determined time called launch time. The
> + value of launch time is indicated by ``launch_time`` field of
> + ``union xsk_tx_metadata``.
>
> Besides the flags above, in order to trigger the offloads, the first
> packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
> @@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
> is calculated on the CPU. Do not enable this option in production because
> it will negatively affect performance.
>
> +Launch Time
> +===========
> +
> +The value of the requested launch time should be based on the device's PTP
> +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
> +compared to the ETF queuing discipline, which organizes packets and delays
> +their transmission. Instead, AF_XDP immediately hands off the packets to
> +the device driver without rearranging their order or holding them prior to
> +transmission. In scenarios where the launch time offload feature is
> +disabled, the device driver is expected to disregard the launch time
> +request. For correct interpretation and meaningful operation, the launch
> +time should never be set to a value larger than the farthest programmable
> +time in the future (the horizon). Different devices have different hardware
> +limitations on the launch time offload feature.
> +
> +stmmac driver
> +-------------
> +
> +For stmmac, TSO and launch time (TBS) features are mutually exclusive for
> +each individual Tx Queue. By default, the driver configures Tx Queue 0 to
> +support TSO and the rest of the Tx Queues to support TBS. The launch time
> +hardware offload feature can be enabled or disabled by using the tc-etf
> +command to call the driver's ndo_setup_tc() callback.
> +
> +The value of the launch time that is programmed in the Enhanced Normal
> +Transmit Descriptors is a 32-bit value, where the most significant 8 bits
> +represent the time in seconds and the remaining 24 bits represent the time
> +in 256 ns increments. The programmed launch time is compared against the
> +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
> +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
> +future.
> +
> +The stmmac driver maintains FIFO behavior and does not perform packet
> +reordering. This means that a packet with a launch time request will block
> +other packets in the same Tx Queue until it is transmitted.
> +
> +igc driver
> +----------
> +
> +For igc, all four Tx Queues support the launch time feature. The launch
> +time hardware offload feature can be enabled or disabled by using the
> +tc-etf command to call the driver's ndo_setup_tc() callback. When entering
> +TSN mode, the igc driver will reset the device and create a default Qbv
> +schedule with a 1-second cycle time, with all Tx Queues open at all times.
> +
> +The value of the launch time that is programmed in the Advanced Transmit
> +Context Descriptor is a relative offset to the starting time of the Qbv
> +transmission window of the queue. The Frst flag of the descriptor can be
> +set to schedule the packet for the next Qbv cycle. Therefore, the horizon
> +of the launch time for i225 and i226 is the ending time of the next cycle
> +of the Qbv transmission window of the queue. For example, when the Qbv
> +cycle time is set to 1 second, the horizon of the launch time ranges
> +from 1 second to 2 seconds, depending on where the Qbv cycle is currently
> +running.
> +
> +The igc driver maintains FIFO behavior and does not perform packet
> +reordering. This means that a packet with a launch time request will block
> +other packets in the same Tx Queue until it is transmitted.
Since the two devices we initially support both use FIFO mode, should we
target this case more explicitly? Maybe even name the netdev feature
tx-launch-time-fifo? In the future, if/when we get support for
timing-wheel-like queues, we can export another feature, tx-launch-time-wheel.
It seems important for userspace to know which mode it is running in.
In FIFO mode, it might make sense to allocate separate queues
for scheduling things far into the future, etc.
Thoughts? No code changes required, just state the expectations more
explicitly.
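
To make the FIFO concern concrete, here is a rough model of when each packet
in a single Tx queue can leave the device. It is a simplification under
stated assumptions: wire time is ignored, a launch time of 0 stands for "no
launch time requested", and ex_fifo_tx_times is a hypothetical helper, not
driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * For a strict FIFO queue, compute the earliest time each packet can be
 * transmitted. A packet never leaves before the packet ahead of it, so one
 * far-future launch time delays everything queued behind it.
 */
static void ex_fifo_tx_times(const uint64_t *launch_ns, size_t n,
			     uint64_t now_ns, uint64_t *out_ns)
{
	uint64_t t = now_ns;
	size_t i;

	assert(n == 0 || (launch_ns != NULL && out_ns != NULL));

	for (i = 0; i < n; i++) {
		if (launch_ns[i] > t)
			t = launch_ns[i]; /* head of queue waits for its slot */
		out_ns[i] = t;
	}
}
```

In this model a packet with no launch time request that is queued behind a
packet scheduled 5 us ahead is also held back for 5 us.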
* [xdp-hints] Re: [PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata
From: Song, Yoong Siang @ 2025-01-09 7:19 UTC
To: Stanislav Fomichev
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Bezdeka, Florian, Donald Hunter,
Jonathan Corbet, Bjorn Topel, Karlsson, Magnus, Fijalkowski,
Maciej, Jonathan Lemon, Andrew Lunn, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, Damato,
Joe, Stanislav Fomichev, Xuan Zhuo, Mina Almasry, Daniel Jurgens,
Amritha Nambiar, Andrii Nakryiko, Eduard Zingerman,
Mykola Lysenko, Martin KaFai Lau, Song Liu, Yonghong Song,
KP Singh, Hao Luo, Jiri Olsa, Shuah Khan, Alexandre Torgue,
Jose Abreu, Maxime Coquelin, Nguyen, Anthony L, Kitszel,
Przemyslaw, netdev, linux-kernel, linux-doc, bpf,
linux-kselftest, linux-stm32, linux-arm-kernel, intel-wired-lan,
xdp-hints
On Wednesday, January 8, 2025 12:50 AM, Stanislav Fomichev <stfomichev@gmail.com> wrote:
>On 01/06, Song Yoong Siang wrote:
>> Extend the XDP Tx metadata framework so that user can requests launch time
>> hardware offload, where the Ethernet device will schedule the packet for
>> transmission at a pre-determined time called launch time. The value of
>> launch time is communicated from user space to Ethernet driver via
>> launch_time field of struct xsk_tx_metadata.
>>
>> Suggested-by: Stanislav Fomichev <sdf@google.com>
Hi Stanislav Fomichev,
Thanks for your review comments.
I notice that you have two email addresses:
sdf@google.com & stfomichev@gmail.com
Which one should I use in the Suggested-by tag?
>> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
>> ---
>> Documentation/netlink/specs/netdev.yaml | 4 ++
>> Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
>> include/net/xdp_sock.h | 10 +++
>> include/net/xdp_sock_drv.h | 1 +
>> include/uapi/linux/if_xdp.h | 10 +++
>> include/uapi/linux/netdev.h | 3 +
>> net/core/netdev-genl.c | 2 +
>> net/xdp/xsk.c | 3 +
>> tools/include/uapi/linux/if_xdp.h | 10 +++
>> tools/include/uapi/linux/netdev.h | 3 +
>> 10 files changed, 110 insertions(+)
>>
>> diff --git a/Documentation/netlink/specs/netdev.yaml
>b/Documentation/netlink/specs/netdev.yaml
>> index cbb544bd6c84..e59c8a14f7d1 100644
>> --- a/Documentation/netlink/specs/netdev.yaml
>> +++ b/Documentation/netlink/specs/netdev.yaml
>> @@ -70,6 +70,10 @@ definitions:
>> name: tx-checksum
>> doc:
>> L3 checksum HW offload is supported by the driver.
>> + -
>> + name: tx-launch-time
>> + doc:
>> + Launch time HW offload is supported by the driver.
>> -
>> name: queue-type
>> type: enum
>> diff --git a/Documentation/networking/xsk-tx-metadata.rst
>b/Documentation/networking/xsk-tx-metadata.rst
>> index e76b0cfc32f7..3cec089747ce 100644
>> --- a/Documentation/networking/xsk-tx-metadata.rst
>> +++ b/Documentation/networking/xsk-tx-metadata.rst
>> @@ -50,6 +50,10 @@ The flags field enables the particular offload:
>> checksum. ``csum_start`` specifies byte offset of where the checksumming
>> should start and ``csum_offset`` specifies byte offset where the
>> device should store the computed checksum.
>> +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
>> + packet for transmission at a pre-determined time called launch time. The
>> + value of launch time is indicated by ``launch_time`` field of
>> + ``union xsk_tx_metadata``.
>>
>> Besides the flags above, in order to trigger the offloads, the first
>> packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
>> @@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX
>checksum
>> is calculated on the CPU. Do not enable this option in production because
>> it will negatively affect performance.
>>
>> +Launch Time
>> +===========
>> +
>> +The value of the requested launch time should be based on the device's PTP
>> +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
>> +compared to the ETF queuing discipline, which organizes packets and delays
>> +their transmission. Instead, AF_XDP immediately hands off the packets to
>> +the device driver without rearranging their order or holding them prior to
>> +transmission. In scenarios where the launch time offload feature is
>> +disabled, the device driver is expected to disregard the launch time
>> +request. For correct interpretation and meaningful operation, the launch
>> +time should never be set to a value larger than the farthest programmable
>> +time in the future (the horizon). Different devices have different hardware
>> +limitations on the launch time offload feature.
>> +
>> +stmmac driver
>> +-------------
>> +
>> +For stmmac, TSO and launch time (TBS) features are mutually exclusive for
>> +each individual Tx Queue. By default, the driver configures Tx Queue 0 to
>> +support TSO and the rest of the Tx Queues to support TBS. The launch time
>> +hardware offload feature can be enabled or disabled by using the tc-etf
>> +command to call the driver's ndo_setup_tc() callback.
>> +
>> +The value of the launch time that is programmed in the Enhanced Normal
>> +Transmit Descriptors is a 32-bit value, where the most significant 8 bits
>> +represent the time in seconds and the remaining 24 bits represent the time
>> +in 256 ns increments. The programmed launch time is compared against the
>> +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
>> +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
>> +future.
>> +
>> +The stmmac driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
>> +
>> +igc driver
>> +----------
>> +
>> +For igc, all four Tx Queues support the launch time feature. The launch
>> +time hardware offload feature can be enabled or disabled by using the
>> +tc-etf command to call the driver's ndo_setup_tc() callback. When entering
>> +TSN mode, the igc driver will reset the device and create a default Qbv
>> +schedule with a 1-second cycle time, with all Tx Queues open at all times.
>> +
>> +The value of the launch time that is programmed in the Advanced Transmit
>> +Context Descriptor is a relative offset to the starting time of the Qbv
>> +transmission window of the queue. The Frst flag of the descriptor can be
>> +set to schedule the packet for the next Qbv cycle. Therefore, the horizon
>> +of the launch time for i225 and i226 is the ending time of the next cycle
>> +of the Qbv transmission window of the queue. For example, when the Qbv
>> +cycle time is set to 1 second, the horizon of the launch time ranges
>> +from 1 second to 2 seconds, depending on where the Qbv cycle is currently
>> +running.
>> +
>> +The igc driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
>
>Since two devices we initially support are using FIFO mode, should we more
>explicitly target this case? Maybe even call netdev features
>tx-launch-time-fifo? In the future, if/when we get support timing-wheel-like
>queues, we can export another tx-launch-time-wheel?
>
>It seems important for the userspace to know which mode it's running.
>In a fifo mode, it might make sense to allocate separate queues
>for scheduling things far into the future/etc.
You are right, the user should isolate one queue for scheduling things
far into the future and use the other queues for normal traffic.
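
One possible way an application could apply this, as a sketch only: the
helper name, the threshold, and the queue ids are all hypothetical and would
be chosen by the application, not by the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick a Tx queue for a packet. Packets whose launch time is further away
 * than far_threshold_ns go to a dedicated queue so that they do not block
 * normal traffic in a FIFO queue.
 */
static unsigned int ex_pick_tx_queue(uint64_t now_ns, uint64_t launch_ns,
				     uint64_t far_threshold_ns,
				     unsigned int normal_q,
				     unsigned int far_q)
{
	/* A queue id pair that is identical would defeat the isolation. */
	assert(normal_q != far_q);

	if (launch_ns > now_ns && launch_ns - now_ns > far_threshold_ns)
		return far_q;

	return normal_q;
}
```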
>
>Thoughts? No code changes required, just more explicitly state the
>expectations.
Agreed. I will rename tx-launch-time to tx-launch-time-fifo to state the
FIFO behavior explicitly.
Thanks & Regards
Siang
* [xdp-hints] Re: [PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata
From: Stanislav Fomichev @ 2025-01-09 17:40 UTC
To: Song, Yoong Siang
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Bezdeka, Florian, Donald Hunter,
Jonathan Corbet, Bjorn Topel, Karlsson, Magnus, Fijalkowski,
Maciej, Jonathan Lemon, Andrew Lunn, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, Damato,
Joe, Stanislav Fomichev, Xuan Zhuo, Mina Almasry, Daniel Jurgens,
Amritha Nambiar, Andrii Nakryiko, Eduard Zingerman,
Mykola Lysenko, Martin KaFai Lau, Song Liu, Yonghong Song,
KP Singh, Hao Luo, Jiri Olsa, Shuah Khan, Alexandre Torgue,
Jose Abreu, Maxime Coquelin, Nguyen, Anthony L, Kitszel,
Przemyslaw, netdev, linux-kernel, linux-doc, bpf,
linux-kselftest, linux-stm32, linux-arm-kernel, intel-wired-lan,
xdp-hints
On 01/09, Song, Yoong Siang wrote:
> On Wednesday, January 8, 2025 12:50 AM, Stanislav Fomichev <stfomichev@gmail.com> wrote:
> >On 01/06, Song Yoong Siang wrote:
> >> Extend the XDP Tx metadata framework so that user can requests launch time
> >> hardware offload, where the Ethernet device will schedule the packet for
> >> transmission at a pre-determined time called launch time. The value of
> >> launch time is communicated from user space to Ethernet driver via
> >> launch_time field of struct xsk_tx_metadata.
> >>
> >> Suggested-by: Stanislav Fomichev <sdf@google.com>
>
> Hi Stanislav Fomichev,
>
> Thanks for your review comments.
> I notice that you have two emails:
> sdf@google.com & stfomichev@gmail.com
>
> Which one I should use in the suggested-by tag?
google.com should be bouncing now. sdf@fomichev.me is preferred.