From: Toke Høiland-Jørgensen
To: Tom Herbert
References: <875ysqflg1.fsf@toke.dk> <61966ec0722fe_2f3212080@john.notmuch> <871r3cdwng.fsf@toke.dk> <87r1b8cmvf.fsf@toke.dk>
Date: Mon, 22 Nov 2021 19:25:00 +0100
Message-ID: <87fsrocakj.fsf@toke.dk>
CC: Jamal Hadi Salim, John Fastabend, Jesper Dangaard Brouer, "Karlsson, Magnus", "Desouza, Ederson", brouer@redhat.com, "xdp-hints@xdp-project.net", Eelco Chaudron, Andrii Nakryiko, "Fijalkowski, Maciej", "Burakov, Anatoly"
Subject: [xdp-hints] Re: Basic/Dumb question WAS(Re: Re: XDP-hints via local BTF info
List-Id: XDP hardware hints design discussion

Tom Herbert writes:

> On Mon, Nov 22, 2021 at 5:59 AM Toke Høiland-Jørgensen wrote:
>>
>> Jamal Hadi Salim writes:
>>
>> > And it goes something like this:
>> >
>> > Why does the metadata have to go in the DMA descriptors?
>> > Our experience with the XDP metadata is: you start accessing
>> > that there is a performance penalty (extra cache miss(es)).
>> >
>> > Why is the metadata not encapped as part of the data? We
>> > dont have MTU issues on receive since that is entirely a local matter;
>> > meaning the hardware can expand the packet as much as it wants within
>> > the boundaries of alloced DMA buffer space and XDP and any other
>> > subsystem (TC for example) can take advantage of the metadata.
>>
>> Once the hardware learns how to do that, this is absolutely the
>> direction we want to go in. But for existing stuff, the metadata is in
>> the descriptor and needs to be translated somehow...
>>
> Toke,
>
> By existing stuff I think you mean legacy stuff :-)

Well, whatever term you want to use for "the stuff that's in hardware
today" :)

> I tend to agree with Jamal. Descriptor space is exceedingly limited
> and it's unclear to me that the benefits of getting a few bits of
> *hints* from the device outweigh the costs to develop XDP hints and
> perpetually maintain the facility.
> That is to say it doesn't seem to
> address the two fundamental limitations of the venerable 50 year old
> device driver model: 1) the narrow waist of PCIe bus in expressing
> ancillary information 2) the black box nature of devices that prevent
> the stack from having any visibility into what it's actually doing
> (and hence we can only get hints at best and not real operational data
> for primary protocol processing).
>
> I believe CXL and programmable devices are the emerging architectural
> solution. This will enable split plane acceleration. For instance, if
> we do this right, a NIC would be able to perform all the stateless
> processing of a TCP/IP packet such that when the host gets the packet
> it could immediately jump to the TCP receive processing function ala
> TXDP-- or even more than that the device could process the whole
> received TCP datapath and the host just puts the received data on the
> socket (necessary to solve the TLS offload OOO problem). All this
> requires a very tight integration between the stack and the device to
> the extent that the lines are blurred and the boundary of the software
> stack is effectively pushed into the device (but definitely not TOE!).

This is certainly something that xdp-hints should be able to address.
And it does: the XDP metadata area is right before the packet data, and
while today that bit of memory is not covered by DMA, there is
absolutely no reason why it couldn't be for devices that support putting
arbitrary stuff in there. In which case the driver can just export a BTF
description of what the hardware writes into that area, and XDP can
access it directly using xdp-hints.

All the talk about translating from hardware descriptors applies to
existing hardware (or "legacy" if you want to call it that), but if the
hardware can just write the metadata directly, driver translation is
obviously not needed.

-Toke
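
The layout described above, with the metadata sitting directly in front
of the packet data, can be sketched in plain userspace C. This is not
kernel or driver code: struct hw_hints, struct rx_buf, hw_write() and
read_hints() are made-up names for illustration, and the field layout is
an assumption standing in for whatever a driver would describe via BTF.
Only the pointer names mirror the data_meta, data and data_end fields of
struct xdp_md.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hint layout (assumption): stands in for whatever the
 * hardware writes and the driver would describe via BTF. */
struct hw_hints {
    uint32_t rx_hash;
    uint16_t vlan_tci;
    uint16_t flags;
};

/* Userspace stand-in for the pointers an XDP program works with. */
struct rx_buf {
    uint8_t *data_meta; /* start of the metadata area */
    uint8_t *data;      /* start of the packet data */
    uint8_t *data_end;  /* one past the last packet byte */
};

/* Simulates a device that writes hints straight into the headroom in
 * front of the packet, so no descriptor translation is needed.
 * Returns 0 on success, -1 if the buffer cannot hold hints + packet. */
static int hw_write(struct rx_buf *b, uint8_t *mem, size_t mem_len,
                    size_t headroom, const struct hw_hints *h,
                    const uint8_t *pkt, size_t pkt_len)
{
    if (headroom < sizeof(*h) || headroom > mem_len ||
        pkt_len > mem_len - headroom)
        return -1;

    b->data = mem + headroom;
    b->data_meta = b->data - sizeof(*h);
    b->data_end = b->data + pkt_len;
    memcpy(b->data_meta, h, sizeof(*h));
    memcpy(b->data, pkt, pkt_len);
    return 0;
}

/* Consumer side: the same style of bounds check an XDP program has to
 * do, i.e. the hints must fit between data_meta and data. */
static int read_hints(const struct rx_buf *b, struct hw_hints *out)
{
    assert(b != NULL && out != NULL);
    if (b->data_meta + sizeof(*out) > b->data)
        return -1; /* no metadata, or less than we expect */
    memcpy(out, b->data_meta, sizeof(*out));
    return 0;
}
```

In this sketch the reader only needs to know the struct layout and where
data_meta points; that is the role the BTF description would play for a
real device.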