* Re: TXTIME (Launch Time) setting per XDP frame from userspace.
2021-06-21 21:06 ` TXTIME (Launch Time) setting per XDP frame from userspace Jesper Dangaard Brouer
@ 2021-06-21 23:04 ` Maloor, Kishen
0 siblings, 0 replies; 2+ messages in thread
From: Maloor, Kishen @ 2021-06-21 23:04 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Zhang, Jessica, Desouza, Ederson, Joseph, Jithu, Gomes, Vinicius,
XDP-hints working-group, Magnus Karlsson, BPF-dev-list, Tedeschi,
Walfred, Andrii Nakryiko
Really appreciate the quick turnaround! Answers in-line below.
--
Kishen Maloor
Intel Corporation
On 6/21/21, 5:07 PM, "Jesper Dangaard Brouer" <brouer@redhat.com> wrote:
(Answer inlined below, please learn howto answer inline)
On Mon, 21 Jun 2021 19:08:54 +0000
"Maloor, Kishen" <kishen.maloor@intel.com> wrote:
> Hi Jesper,
>
> I wanted to run some patches by you for your comments.
Thanks a lot for reaching out, I really appreciate it.
> Intel has a requirement to support an SO_TXTIME like capability with
> AF_XDP sockets, to set a precise per-packet transmission time from
> user space XDP applications.
This is great! I also have a customer that have this requirement.
Thus, I'm very interested in collaborating on getting this feature
implemented in practice.
Super!
> It is also critical that this support be upstreamed into the Linux
> mainline.
I fully agree "upstream first". For this to happen, we should discuss
the approach in a public forum as early as possible to get upstream
feedback. Cc'ing bpf@vger.kernel.org. Also adding our working group
mailing list as not everybody follow the kernel list Cc'ed
xdp-hints@xdp-project.net (https://lists.xdp-project.net/xdp-hints/).
Multiple of your Intel colleagues (from other departments) have also
reached out to me, and they are also on this xdp-hints mailing list.
It would be great if you can keep this list Cc'ed to keep everybody in
the loop.
Consider subscribing:
https://lists.xdp-project.net/postorius/lists/xdp-hints.xdp-project.net/
Yes, I am aware, although it seems (per my reading) the main focus there is on capturing metadata on the RX path and funneling
that into the XDP program (and possibly higher).
Whereas this change is solely focused on passing down TXTIME (in the TX path obv.) from the XDP application running in userspace.
Said another way, the aim here is to accomplish something equivalent to SO_TXTIME with AF_XDP.
> The implementation is roughly equivalent to that of SO_TXTIME for
> AF_PACKET. It includes:
>
> - a new XSK bind flag,
So, this feature is enabled for all packets on this AF_XDP socket.
Yes, in setting the XDP_TXTIME bind flag while creating the AF_XDP socket, it will apply to all packets sent through that socket.
> - libbpf API to set a TXTIME for a frame,
>
> - leverages the concept of UMEM headroom to store the per-packet
> TXTIME (immediately preceding the XDP buff's data_hard_start),
This sounds like the XDP metadata area?
Or is this top of mem data-area (guess not, as you use +umem_headroom)?
No. More specifically it is stored in the UMEM headroom area (it's default size per XSK_UMEM__DEFAULT_FRAME_HEADROOM is 0). Currently, the code in net/xdp adjusts for
this headroom (pool->headroom) in one or more places so my understanding is that the entirety of the XDP frame (data_hard_start and beyond, including the metadata
area) follows the UMEM headroom.
What will happen when an XDP-prog use this same metadata area, and your
feature (patch below) is enabled (via AF_XDP) ?
So far as I understand, an XDP/BPF program that writes to the defined XDP metadata area will function as expected, i.e. it won't intrude in the UMEM headroom.
E.g. can I set the TXTIME in XDP-prog (RX) and XDP_REDIRECT out an
interface with TXTIME enabled?
I believe that's possible. This is kind of why I wanted to loop in experts such as yourself as you'd have a more complete understanding (than myself) of the XDP code as it
currently stands :) So, I'd appreciate all the feedback I can get to account for anything that's missed in the patches.
> - internally signals the NIC driver via the XDP descriptor when
> there's a TXTIME bound to a frame, and
Looking at patch, if I don't call 'xsk_umem__set_txtime' (libbpf), then
the kernel (xp_raw_get_txtime) will pickup random-data left over by
previous user of the memory, right?
Yes, though the expectation is that an XDP application that enables XDP_TXTIME on the socket will proceed to pass in a TXTIME with every frame it sends.
> - provides a kernel API to retrieve it in the driver so it may proceed
> to set the Launch Time on the NIC.
>
> It uses existing bitmasks and there's no overhead (i.e. not
> allocating any more memory than is already done), and there are no
> other dependencies.
>
> You may review the commit messages for all details.
>
> kernel: https://github.com/kmaloor/bpfnext/commit/b72a336bcb8df4a26646a29962ca95d58172147f
> libbpf: https://github.com/kmaloor/libbpf/commit/eda68c1ad11ed60739837ad6e4fb256768231b55
>
> We're currently testing these changes in conjunction with the igc
> driver.
Can you share the patches for the igc driver as well?
That's currently in the works and being tested internally. We should hopefully share it soon.
I want to see how you convert the "__s64 txtime" into the 30-bit
LaunchTime used[0] by the hardware (i225). It would also be good to see
how this is used for driver igb/i210, that have a 25-bit LaunchTime.
Details on driver+hardware here:
[0] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
> Meanwhile, we'd appreciate any comments you may have, particularly as
> an expert with the kernel XDP subsystem.
I think this TXTIME work is just *one* use-case of "xdp-hints". IMHO we
should use something like BTF to describe the metadata memory area
in-order to support more use-cases.
Being concrete .e.g. your XSK socket bind call could register a BTF-ID
that it tells the kernel about the layout. I guess, we need some
guidance from BTF experts on how kernel can check the driver supports
this LaunchTime feature, Andrii?
Other use-cases: (1) On TX we also want to support asking for TX-csum
by hardware. (2) Support setting VLAN via TX-descriptor (hint TSN can
use VLAN priority bits).
Well, to be perfectly clear, the TXTIME in this case is an application supplied actionable quantity, so I don't interpret it as "metadata".
Whereas the examples you cite above indeed seem like metadata. In this patch, the goal is simply to replicate SO_TXTIME with AF_XDP.
I am aware of ongoing discussions for handling XDP metadata/hints and so far as I understand (correct me if I'm wrong), we're still a ways
off from a solution that the community will uphold.
Again thanks for reaching out, lets find a solution together with
upstream.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
Kernel:
s64 xp_raw_get_txtime(struct xsk_buff_pool *pool, u64 addr)
{
 return *(s64 *)((char *)xp_raw_get_data(pool, addr) - XDP_TXTIME_LEN);
}
EXPORT_SYMBOL(xp_raw_get_txtime);
libbpf:
static inline void xsk_umem__set_txtime(void *umem_area, __u64 addr, __u32 umem_headroom, __s64 txtime)
{
 *(__s64 *)(&((char *)umem_area)[addr] + umem_headroom - XDP_TXTIME_LEN) = txtime;
}
^ permalink raw reply [flat|nested] 2+ messages in thread