|Luigi Rizzo||Apr 19, 2012 6:12 am|
|Slawa Olhovchenkov||Apr 19, 2012 11:53 am|
|Andre Oppermann||Apr 19, 2012 1:05 pm|
|Luigi Rizzo||Apr 19, 2012 1:26 pm|
|K. Macy||Apr 19, 2012 1:34 pm|
|Luigi Rizzo||Apr 19, 2012 2:03 pm|
|K. Macy||Apr 19, 2012 2:06 pm|
|Andre Oppermann||Apr 19, 2012 2:11 pm|
|K. Macy||Apr 19, 2012 2:17 pm|
|Andre Oppermann||Apr 19, 2012 2:19 pm|
|Andre Oppermann||Apr 19, 2012 2:26 pm|
|K. Macy||Apr 19, 2012 2:35 pm|
|K. Macy||Apr 19, 2012 2:36 pm|
|Luigi Rizzo||Apr 19, 2012 2:43 pm|
|Andre Oppermann||Apr 19, 2012 3:36 pm|
|Luigi Rizzo||Apr 19, 2012 11:16 pm|
|Alexander V. Chernikov||Apr 20, 2012 1:26 am|
|Andre Oppermann||Apr 20, 2012 2:00 am|
|Andre Oppermann||Apr 20, 2012 2:25 am|
|John Baldwin||Apr 20, 2012 5:11 am|
|Luigi Rizzo||Apr 20, 2012 7:26 am|
|K. Macy||Apr 20, 2012 9:28 am|
|Luigi Rizzo||Apr 20, 2012 11:46 am|
|Bruce Evans||Apr 20, 2012 11:33 pm|
|Adrian Chadd||Apr 21, 2012 7:14 pm|
|K. Macy||Apr 22, 2012 7:04 am|
|Andre Oppermann||Apr 24, 2012 6:16 am|
|Luigi Rizzo||Apr 24, 2012 6:44 am|
|Li, Qing||Apr 24, 2012 7:15 am|
|K. Macy||Apr 24, 2012 8:03 am|
|K. Macy||Apr 24, 2012 8:05 am|
|Luigi Rizzo||Apr 24, 2012 9:16 am|
|K. Macy||Apr 24, 2012 9:18 am|
|Fabien Thomas||Apr 24, 2012 9:34 am|
|Li, Qing||Apr 24, 2012 10:39 am|
|Li, Qing||Apr 24, 2012 10:42 am|
|Bjoern A. Zeeb||Apr 24, 2012 5:01 pm|
|Maxim Konovalov||Apr 25, 2012 2:21 am|
|Slawa Olhovchenkov||Apr 25, 2012 3:19 am|
|K. Macy||Apr 25, 2012 8:44 am|
|Bjoern A. Zeeb||Apr 25, 2012 11:53 am|
|George Neville-Neil||May 1, 2012 7:27 am|
|Luigi Rizzo||May 1, 2012 8:21 am|
|George Neville-Neil||May 1, 2012 10:33 am|
|Bjoern A. Zeeb||May 1, 2012 2:08 pm|
|Luigi Rizzo||May 1, 2012 2:22 pm|
|Luigi Rizzo||May 3, 2012 9:32 am|
|Subject:||Re: Some performance measurements on the FreeBSD network stack|
|From:||Luigi Rizzo (riz...@iet.unipi.it)|
|Date:||Apr 19, 2012 11:16:37 pm|
On Fri, Apr 20, 2012 at 12:37:21AM +0200, Andre Oppermann wrote:
On 20.04.2012 00:03, Luigi Rizzo wrote:
On Thu, Apr 19, 2012 at 11:20:00PM +0200, Andre Oppermann wrote:
On 19.04.2012 22:46, Luigi Rizzo wrote:
The allocation happens while the code already holds an exclusive lock on so->snd_buf, so a pool of fresh buffers could be attached there.
Ah, there it is not necessary to hold the snd_buf lock while doing the allocate+copyin. With soreceive_stream() (which is
it is not held in the tx path either -- but there is a short section before m_uiotombuf() which does
    ...
    SOCKBUF_LOCK(&so->so_snd);
    // check for pending errors, sbspace, so_state
    SOCKBUF_UNLOCK(&so->so_snd);
    ...
(some of this is slightly dubious, but that's another story)
Indeed the lock isn't held across the m_uiotombuf(). You're talking about filling a sockbuf mbuf cache while holding the lock?
all i am thinking is that when we have a serialization point we could use it for multiple related purposes. In this case, yes, we could keep a small mbuf cache attached to so_snd. When the cache is empty, get a new batch (say 10-20 bufs) from the zone allocator, possibly dropping and regaining the lock if the so_snd lock must be a leaf. Besides, for protocols like TCP (does it use the same path?) the mbufs are already there in the steady state (released by incoming acks), so it is not even necessary to refill the cache.
This said, i am not 100% sure that the 100ns I am seeing are all spent in the zone allocator. As i said the chain of indirect calls and other ops is rather long on both acquire and release.
But the other consideration is that one could defer the mbuf allocation to a later time, when the packet is actually built (or anyways right before the thread returns). What i envision (and this would fit nicely with netmap) is the following:
- have a (possibly readonly) template for the headers (MAC+IP+UDP) attached to the socket, built on demand, cached, and managed with similar invalidation rules as used by fastforward;
That would require to cross-pointer the rtentry and whatnot again.
i was planning to keep a copy, not a reference. If the copy becomes temporarily stale, no big deal, as long as you can detect it reasonably quickly -- routes are not guaranteed to be correct, anyways.
Be wary of disappearing interface pointers...
(this reminds me, what prevents a route grabbed from the flowtable from disappearing and releasing the ifp reference ?)
In any case, it seems better to keep a more persistent ifp reference in the socket rather than grab and release one on every single packet transmission.
- possibly extend the pru_send interface so one can pass down the uio instead of the mbuf;
- make an opportunistic buffer allocation in some place downstream, where the code already has an x-lock on some resource (could be the snd_buf, the interface, ...) so the allocation comes for free.
maybe. But i want to investigate this.
I fail to see what passing down the uio would gain you. The snd_buf lock isn't obtained again after the copyin. Not that I want to prevent you from investigating other ways. ;)
maybe it can open the way to other optimizations, such as reducing the number of places where you need to lock, or save some data copies, or reduce fragmentation, etc.
_______________________________________________
free...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "free...@freebsd.org"