Subject: SSE vs. stack alignment vs. pthread
From: Daniel Eischen (deis@freebsd.org)
Date: Nov 24, 2004 7:36:38 am
List: org.freebsd.freebsd-threads

On Wed, 24 Nov 2004, Craig Boston wrote:

First of all, I'd like to apologize for cross-posting to -hackers and -threads. I'm not sure yet if this is an application bug, a gcc bug, or a pthreads bug, so here goes...

I'm currently working on the audacity port. It's up to 1.2.3, but I want to get a problem I've observed with 1.2.2 resolved to make sure that it doesn't crop up later or affect other software...

Long story short, audacity is a threaded program. A straight compile of 1.2.2 results in a 100% reproducible bus error that happens on multiple Pentium-4 machines (5.3-STABLE). It always happens at this instruction:

0x081807c4: movaps %xmm0,0xffffff68(%ebp)

Now, at that time ebp is 0xbfadc6c0, so ebp+0xffffff68 (-0x98, i.e. -152) is 0xbfadc628. Oops, that's not 16-byte aligned like SSE wants. The offsets vary slightly depending on the compile flags, etc., but the result is always the same -- SIGBUS.
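The address arithmetic above can be checked with a few lines of C. This is only a sketch: the helper names (effective_address, is_aligned16, check_worked_example) are made up for illustration, and the two constants are the register value and displacement quoted in the report. The displacement 0xffffff68 is the two's-complement encoding of -0x98 (152 decimal), so a wrapping 32-bit add yields the operand address directly.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of disp(%ebp) in 32-bit mode. The displacement is a
 * two's-complement value, so a wrapping 32-bit add gives the result. */
static uint32_t effective_address(uint32_t ebp, uint32_t disp)
{
    return (uint32_t)(ebp + disp);
}

/* movaps faults unless its memory operand sits on a 16-byte boundary. */
static int is_aligned16(uint32_t addr)
{
    return (addr & 15u) == 0;
}

/* Values taken from the report: %ebp and the displacement in the movaps. */
static void check_worked_example(void)
{
    uint32_t addr = effective_address(0xbfadc6c0u, 0xffffff68u);

    assert(addr == 0xbfadc628u);  /* 0xbfadc6c0 - 0x98 */
    assert(!is_aligned16(addr));  /* low nibble is 8, not 0 */
}
```

Calling check_worked_example() from a test program should pass silently; the operand ends up 8 bytes past a 16-byte boundary, which matches the observed fault.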

Tor Egge reported a similar problem to me yesterday. I haven't had a chance to test his patch, but it supposedly fixes the problem.

Index: lib/libc/i386/gen/makecontext.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/i386/gen/makecontext.c,v
retrieving revision 1.4
diff -u -r1.4 makecontext.c
--- lib/libc/i386/gen/makecontext.c	2 Jul 2004 14:19:44 -0000	1.4
+++ lib/libc/i386/gen/makecontext.c	22 Nov 2004 22:51:49 -0000
@@ -118,7 +118,9 @@
 	 * address, _ctx_start, and ucp) and argc arguments.
 	 * We allow the arguments to be pointers also.
 	 */
-	stack_top = stack_top - (sizeof(intptr_t) * (3 + argc));
+	stack_top = stack_top - (sizeof(intptr_t) * (1 + argc));
+	stack_top -= ((long) stack_top & 15);	/* 16 bytes alignment */
+	stack_top -= sizeof(intptr_t) * 2;
 	argp = (intptr_t *)stack_top;
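
To see what the three added lines do to the pointer, here is a standalone sketch of the same arithmetic on plain integers. It is not the libc code: the function names and the WORD constant are invented for illustration, and WORD assumes sizeof(intptr_t) is 4, as on i386. The only property it checks is that, after the patch, the address two words above the final stack_top lands on a 16-byte boundary for any argc.

```c
#include <assert.h>
#include <stdint.h>

#define WORD 4u /* assumption: sizeof(intptr_t) on i386 */

/* Before the patch: reserve 3 + argc words, with no alignment step. */
static uintptr_t top_before_patch(uintptr_t stack_top, unsigned argc)
{
    return stack_top - WORD * (3u + argc);
}

/* After the patch: reserve 1 + argc words, round down to a 16-byte
 * boundary, then reserve the remaining 2 words below that boundary. */
static uintptr_t top_after_patch(uintptr_t stack_top, unsigned argc)
{
    stack_top -= WORD * (1u + argc);
    stack_top -= stack_top & 15u;
    stack_top -= WORD * 2u;
    return stack_top;
}

/* For a range of starting addresses and argument counts, the address
 * two words above the patched stack_top is always 16-byte aligned. */
static void check_patch_alignment(void)
{
    for (uintptr_t top = 0x1000; top < 0x1040; top += 4)
        for (unsigned argc = 0; argc < 8; argc++)
            assert(((top_after_patch(top, argc) + 2u * WORD) & 15u) == 0);
}
```

With the old computation the result simply follows the parity of argc: for a starting value of 0x1000, one argument gives 0xff0 and two give 0xfec, whereas the patched computation gives 0xfe8 in both cases.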