Subject: Re: Some performance measurements on the FreeBSD network stack
From: Luigi Rizzo (riz@iet.unipi.it)
Date: Apr 19, 2012 11:16:37 pm
List: org.freebsd.freebsd-current

On Fri, Apr 20, 2012 at 12:37:21AM +0200, Andre Oppermann wrote:

On 20.04.2012 00:03, Luigi Rizzo wrote:

On Thu, Apr 19, 2012 at 11:20:00PM +0200, Andre Oppermann wrote:

On 19.04.2012 22:46, Luigi Rizzo wrote:

The allocation happens while the code already holds an exclusive lock on so->snd_buf, so a pool of fresh buffers could be attached there.

Ah, there it is not necessary to hold the snd_buf lock while doing the allocate+copyin. With soreceive_stream() (which is

it is not held in the tx path either -- but there is a short section before m_uiotombuf() which does

    ...
    SOCKBUF_LOCK(&so->so_snd);
    // check for pending errors, sbspace, so_state
    SOCKBUF_UNLOCK(&so->so_snd);
    ...

(some of this is slightly dubious, but that's another story)

Indeed the lock isn't held across the m_uiotombuf(). You're talking about filling a sockbuf mbuf cache while holding the lock?

all i am thinking is that when we have a serialization point we could use it for multiple related purposes. In this case yes, we could keep a small mbuf cache attached to so_snd. When the cache is empty, get a new batch (say 10-20 bufs) from the zone allocator, possibly dropping and regaining the lock if so_snd must be a leaf. Besides, for protocols like TCP (does it use the same path ?) the mbufs are already there (released by incoming acks) in the steady state, so it is not even necessary to refill the cache.
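A minimal user-space sketch of the cache idea described above, not kernel code. Everything here is a stand-in: `struct buf` plays the role of an mbuf, `struct sndbuf` the role of so_snd, a pthread mutex the role of the sockbuf lock, and `malloc()` the role of the zone allocator. The names and the batch size of 16 are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define CACHE_BATCH 16          /* hypothetical batch size ("say 10-20 bufs") */

struct buf {                    /* stand-in for struct mbuf */
    struct buf *next;
    char data[256];
};

struct sndbuf {                 /* stand-in for so->so_snd */
    pthread_mutex_t lock;       /* plays the role of the sockbuf lock */
    struct buf *cache;          /* free buffers, protected by lock */
    int ncached;
};

static void
sndbuf_init(struct sndbuf *sb)
{
    pthread_mutex_init(&sb->lock, NULL);
    sb->cache = NULL;
    sb->ncached = 0;
}

/* Get up to n buffers from the general allocator (zone allocator stand-in). */
static struct buf *
batch_alloc(int n, int *got)
{
    struct buf *head = NULL;

    assert(n > 0);
    *got = 0;
    while (*got < n) {
        struct buf *b = malloc(sizeof(*b));
        if (b == NULL)
            break;
        b->next = head;
        head = b;
        (*got)++;
    }
    return head;
}

/* Called with sb->lock held, returns with it held; may drop it to refill. */
static struct buf *
sndbuf_getbuf_locked(struct sndbuf *sb)
{
    struct buf *b;

    if (sb->cache == NULL) {
        struct buf *batch, *tail;
        int got;

        /* Drop the lock around the allocator, as if so_snd were a leaf lock. */
        pthread_mutex_unlock(&sb->lock);
        batch = batch_alloc(CACHE_BATCH, &got);
        pthread_mutex_lock(&sb->lock);

        /* The cache may have been refilled meanwhile; splice ours in front. */
        if (batch != NULL) {
            for (tail = batch; tail->next != NULL; tail = tail->next)
                ;
            tail->next = sb->cache;
            sb->cache = batch;
            sb->ncached += got;
        }
        if (sb->cache == NULL)
            return NULL;
    }
    b = sb->cache;
    sb->cache = b->next;
    sb->ncached--;
    b->next = NULL;
    return b;
}

/* Return a buffer (e.g. one released by an incoming ack) to the cache. */
static void
sndbuf_putbuf_locked(struct sndbuf *sb, struct buf *b)
{
    b->next = sb->cache;
    sb->cache = b;
    sb->ncached++;
}
```

In the steady state described for TCP, `sndbuf_putbuf_locked()` would keep the cache populated, so the refill branch would rarely run.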

This said, i am not 100% sure that the 100ns I am seeing are all spent in the zone allocator. As i said the chain of indirect calls and other ops is rather long on both acquire and release.

But the other consideration is that one could defer the mbuf allocation to a later time when the packet is actually built (or anyways right before the thread returns). What i envision (and this would fit nicely with netmap) is the following:

- have a (possibly readonly) template for the headers (MAC+IP+UDP) attached to the socket, built on demand, and cached and managed with similar invalidation rules as used by fastforward;

That would require to cross-pointer the rtentry and whatnot again.

i was planning to keep a copy, not a reference. If the copy becomes temporarily stale, no big deal, as long as you can detect it reasonably quickly -- routes are not guaranteed to be correct, anyways.
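One way to picture the "copy, not a reference" template with cheap staleness detection is a generation counter. The sketch below is an assumption-laden illustration in user-space C: `route_gen`, `struct hdr_template`, and both functions are hypothetical names, and the invalidation rule (compare a counter) is a simplification, not the rule fastforward actually uses. Lengths and checksums are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN (14 + 20 + 8)   /* Ethernet + IPv4 (no options) + UDP */

/* Hypothetical global counter, bumped whenever routes or interfaces change. */
static unsigned int route_gen = 1;

struct hdr_template {
    unsigned int gen;           /* route_gen value when built; 0 = never built */
    uint8_t bytes[HDR_LEN];     /* private copy, no pointers into route state */
};

/* Stand-in for the slow path (route lookup, address resolution, ...). */
static void
template_build(struct hdr_template *t, const uint8_t dst_mac[6])
{
    memset(t->bytes, 0, sizeof(t->bytes));
    memcpy(t->bytes, dst_mac, 6);       /* destination MAC */
    t->bytes[12] = 0x08;                /* ethertype 0x0800 = IPv4 */
    t->bytes[13] = 0x00;
    t->bytes[14] = 0x45;                /* IPv4, header length 5 words */
    t->bytes[14 + 9] = 17;              /* IP protocol = UDP */
    t->gen = route_gen;
}

/*
 * Copy the cached headers into pkt, rebuilding the template first if it
 * is stale. Returns 1 if a rebuild happened, 0 on the fast path.
 */
static int
template_apply(struct hdr_template *t, const uint8_t dst_mac[6], uint8_t *pkt)
{
    int rebuilt = 0;

    if (t->gen != route_gen) {
        template_build(t, dst_mac);
        rebuilt = 1;
    }
    memcpy(pkt, t->bytes, HDR_LEN);
    return rebuilt;
}
```

A stale template costs at most one packet built from old data between the route change and the counter bump, which matches the "temporarily stale, no big deal" argument.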

Be wary of disappearing interface pointers...

(this reminds me, what prevents a route grabbed from the flowtable from disappearing and releasing the ifp reference ?)

In any case, it seems better to keep a more persistent ifp reference in the socket rather than grab and release one on every single packet transmission.
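As a rough illustration of keeping a persistent interface reference in the socket, here is a single-threaded sketch. All types and functions are hypothetical stand-ins (`struct ifnet_stub` for struct ifnet, `struct sock_stub` for struct socket), the `dying` flag is an assumed way to signal interface departure, and real code would need atomic reference counts and locking.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_stub {             /* stand-in for struct ifnet */
    int refcount;
    int dying;                  /* set when the interface is being detached */
};

struct sock_stub {              /* stand-in for struct socket */
    struct ifnet_stub *cached_ifp;  /* owns one reference while non-NULL */
};

static void
if_ref_stub(struct ifnet_stub *ifp)
{
    ifp->refcount++;
}

static void
if_rele_stub(struct ifnet_stub *ifp)
{
    assert(ifp->refcount > 0);
    ifp->refcount--;
}

/* Drop the socket's cached reference, if any (e.g. on socket close). */
static void
sock_drop_ifp(struct sock_stub *so)
{
    if (so->cached_ifp != NULL) {
        if_rele_stub(so->cached_ifp);
        so->cached_ifp = NULL;
    }
}

/*
 * Return the interface to transmit on. The common case touches no
 * reference count at all; a reference is taken only when the cache is
 * empty or the cached interface is going away. looked_up stands in for
 * the result of a fresh route lookup.
 */
static struct ifnet_stub *
sock_get_ifp(struct sock_stub *so, struct ifnet_stub *looked_up)
{
    struct ifnet_stub *ifp = so->cached_ifp;

    if (ifp != NULL && !ifp->dying)
        return ifp;             /* fast path: no per-packet ref/release */
    sock_drop_ifp(so);
    if (looked_up == NULL || looked_up->dying)
        return NULL;
    if_ref_stub(looked_up);
    so->cached_ifp = looked_up;
    return looked_up;
}
```

The trade-off is that a detaching interface has to wait until every socket caching it has noticed the `dying` flag and released its reference.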

- possibly extend the pru_send interface so one can pass down the uio instead of the mbuf;
- make an opportunistic buffer allocation in some place downstream, where the code already has an x-lock on some resource (could be the snd_buf, the interface, ...) so the allocation comes for free.

ETOOCOMPLEXOVERTIME.

maybe. But i want to investigate this.

I fail to see what passing down the uio would gain you. The snd_buf lock isn't obtained again after the copyin. Not that I want to prevent you from investigating other ways. ;)

maybe it can open the way to other optimizations, such as reducing the number of places where you need to lock, or save some data copies, or reduce fragmentation, etc.

cheers
luigi