Subject: Re: possible problem with SMP?
From: Nate Williams (na@yogotech.com)
Date: Feb 15, 2001 9:06:29 am
List: org.freebsd.freebsd-smp

There are problems of cpu starvation with a Many to Many model in that usually (unless you manually step up the concurrency) the Many to Many ends up being Many to Number of CPUs. If Number of CPUs is equal to 1, then usually you will only get one LWP beyond the ones that are sleeping. Using the Solaris libpthread, a user-level thread will not give up the LWP it's using until it voluntarily makes a call into libthread which gives libthread an opportunity to deschedule that thread and schedule a different one.

This is the same 'model' that Green Threads use, which isn't very efficient. It is possible to make user-space threads preemptive (setjmp/longjmp), so don't blame the implementation issues solely on user threads.

To illustrate this, I talked to a customer the other day whose application had around 200 threads. He had around 40 LWPs created, but 39 of them were blocked in the kernel, and 1 appeared to be running when looking at the threads in dbx. The application appeared to be hung, though. Finally he figured out that the one thread that was actually running on the one free LWP was in fact running a hugely inefficient code chunk that was simply taking forever to complete, and since it wasn't calling into libthread, it was hogging that cpu and the other 160 or so threads in the program simply were not getting *any* time whatsoever.

This is a thread scheduler problem.
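
The starvation described above can be reproduced with a toy model. The sketch below is not Solaris libthread code; it assumes a single simulated LWP that changes tasks only when the running task voluntarily yields, and the names `struct task` and `run_cooperative` are made up for illustration.

```c
#include <stddef.h>

/* Toy model of a purely cooperative user-level scheduler with one LWP.
 * Hypothetical names; this only illustrates the scheduling policy. */
struct task {
    unsigned yield_every;     /* ticks between voluntary yields; 0 = never yields */
    unsigned long ticks_run;  /* total ticks this task has received */
};

/* Hand out total_ticks ticks of the single LWP. The LWP moves to the next
 * task (round robin) only when the current task yields. Requires n > 0. */
static void run_cooperative(struct task *tasks, size_t n,
                            unsigned long total_ticks)
{
    size_t cur = 0;
    unsigned since_yield = 0;

    for (unsigned long t = 0; t < total_ticks; t++) {
        tasks[cur].ticks_run++;
        since_yield++;
        if (tasks[cur].yield_every != 0 &&
            since_yield >= tasks[cur].yield_every) {
            cur = (cur + 1) % n;   /* voluntary yield: pick the next task */
            since_yield = 0;
        }
    }
}
```

With three tasks where the second one never yields, the second task keeps the LWP from the moment it is first scheduled and the third task never runs at all, which is the hang the customer saw.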

The solution to this was either to attempt to set the concurrency manually to something like "a few" LWPs higher than the number he expected to block, or throw in some sched_yield()s into all of his threads to make sure others got some time, or else simply make them bound. I advised him to just make them bound, run some tests to see how this affected his performance, and then decide from there. In all likelihood the majority of his threads can and do block at some point, and I see very little reason if that is the case why he shouldn't just make his threads bound.
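
Two of those options can be sketched with standard POSIX calls. This is a minimal sketch, not the customer's code: `yielding_worker` and `create_bound_thread` are hypothetical names, and it assumes that requesting `PTHREAD_SCOPE_SYSTEM` is how a thread is made "bound" on the platform in use. The concurrency hint would go through `thr_setconcurrency()` on Solaris or `pthread_setconcurrency()` in POSIX and is not shown.

```c
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical CPU-bound worker. It sums 0..999999 into *arg and calls
 * sched_yield() every 10000 iterations so that a cooperative user-level
 * scheduler gets a chance to run some other thread. */
static void *yielding_worker(void *arg)
{
    uint64_t *sum = arg;

    for (uint64_t i = 0; i < 1000000; i++) {
        *sum += i;
        if (i % 10000 == 0)
            sched_yield();
    }
    return NULL;
}

/* Create a "bound" thread by asking for system contention scope, i.e. a
 * thread scheduled directly by the kernel. Returns 0 on success or an
 * errno-style error code from the pthread calls. */
static int create_bound_thread(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller would pass `yielding_worker` and a pointer to a zeroed `uint64_t` to `create_bound_thread()` and then `pthread_join()` the result. Whether the yield or the bound scope matters more depends on the thread library, so both are worth measuring separately.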

Context switching is now *MUCH* higher. In properly written threaded programs (which are arguably hard to design, but not necessarily any more so than properly written high-performance multi-process software), context switching can be a big deal, especially in the case of *LOTS* of threads.

At some point, the more threads you use, the higher the context switching becomes (this is fairly obvious).

I am starting to believe, based on the number of customers I have talked to writing threaded code, that most real-world applications that use threads use threads that sometimes block in the kernel.

No argument there, see previous email re: I/O.

That being the case, I honestly see no real reason not to make them bound.

Only those threads that block need to be bound. And, in a good design, you can design the system so these threads have their own kernel context.
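
One way to express that per-thread choice with POSIX attributes is sketched below. The helper name `init_attr_for` and the `blocks_in_kernel` flag are assumptions for illustration; the sketch also assumes that a platform may reject one of the two scopes with `ENOTSUP`, in which case it keeps the platform default.

```c
#include <errno.h>
#include <pthread.h>

/* Initialize *attr for a thread, choosing the contention scope from
 * whether the thread is expected to block in the kernel:
 *   blocks_in_kernel != 0 -> PTHREAD_SCOPE_SYSTEM  (bound)
 *   blocks_in_kernel == 0 -> PTHREAD_SCOPE_PROCESS (unbound)
 * If the platform does not support the requested scope (ENOTSUP), the
 * default scope is kept. Returns 0 on success or an error code. */
static int init_attr_for(pthread_attr_t *attr, int blocks_in_kernel)
{
    int rc = pthread_attr_init(attr);

    if (rc != 0)
        return rc;

    int scope = blocks_in_kernel ? PTHREAD_SCOPE_SYSTEM
                                 : PTHREAD_SCOPE_PROCESS;
    rc = pthread_attr_setscope(attr, scope);
    if (rc == ENOTSUP)
        rc = 0;                      /* fall back to the platform default */
    if (rc != 0)
        pthread_attr_destroy(attr);  /* do not leak on unexpected errors */
    return rc;
}
```

The resulting attribute object is passed to `pthread_create()` and released with `pthread_attr_destroy()` afterwards. Threads doing I/O would be created with the flag set, compute-only threads without it.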

However, with the current FreeBSD kernel-thread design, this can be done more 'automatically', since the kernel can make this decision for you. This is the essence of kernel/scheduler activations.

Fact is, once they block, they cause a new LWP to be created anyways

No, they don't. The number of LWPs is fixed at compile time.

and then basically if all your threads block at least once in a relatively short amount of time you end up with most of your threads having their "own" LWP anyhow, plus you have the additional overhead of the more complicated thread library. In short, for many if not most real-world applications, user-level threads don't really buy you anything.

Again, this is pure conjecture, and not based on any real-world benchmarks or experience. I can point to *lots* of both that show the opposite.

Nate

To Unsubscribe: send mail to majo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message