Subject: Re: Performance of SheevaPlug on 8-stable
From: Bernd Walter (tic...@cicely7.cicely.de)
Date: Mar 8, 2010 11:54:26 am
On Mon, Mar 08, 2010 at 01:37:23PM -0600, Mark Tinguely wrote:
This puzzled me as well. What is the requirement for such handling of shared pages? I thought handing over shared data is done by cache flushes, barriers or whatever an architecture has for this. Most systems we talk about are single CPU, so it is just DMA and handing over dcache writes to icache, but we don't support self-modifying code, so it is always done in a controlled way. And even for SMP systems handing over data requires using cache coherence mechanisms - e.g. those embedded in mutexes. So what is wrong in my picture and requires us to do special handling for shared pages on ARM?
And if there's only one copy of 'test' running, why does it hit the 'shared' case for this code?
ARMv4/ARMv5 use virtual indexed / virtual tagged level one caches. They may or may not have level two caches. These are the ARM chips that we currently support, and I will explain the rules below.
The newest processors, the ARMv6, can have virtual index / physical tagged or physical index / physical tagged level one caches; the ARMv7 must have physical index / physical tagged level one caches. The ARMv6 and ARMv7 have more pde/pte bits describing the cache status on the "inner" and "outer" caches. The ARMv7 has the more mature cache management; it defines the "level of unification" and "level of coherency" for the caches. There is also a level of snooping for the ARMv7 multi-core, which I will just dance around. A PIPT cache must be synced to the "level of coherency" before DMA and when modified from another process - think of a debugger in another address space modifying instruction code. ARMv6/ARMv7 have special address spaces to avoid tlb flushes. If they are not used, then the tlbs have to be flushed on context switch. This is close to the i386/amd64 with the exception of DMA; the i386/amd64 have self-snooping cache buses.
VIVT cache rules:
1) flush cache and tlb on context change.
2) USER cache must be disabled if a physical page has AT LEAST one writable user mapping AND is also mapped more than one time in the same user address space. (multiple read mappings and no writes are fine, they take up multiple cache entries. Obviously, a single read or a single write is fine. If the mappings are in different user address spaces, we will be okay because the flush on context change will sync things up).
3) KERNEL spaces are global.
a) If the page is mapped writable AT LEAST ONCE to a kernel space AND the page is mapped more than once, no matter if the second mapping is in the user or kernel space, all mappings must not be cached.
I never assumed to be happy without a direct map.
b) If the page has only readable kernel mappings but at least one writable user mapping, the cache must be disabled for the mappings of the page in this address space. This is slightly different from rule 2. Kernel mappings are typically writable, so this is a case that really does not happen.
It gets a little tricky to implement, because we have to catch the transition from cache -> non-cache (change pte and wbinv/inv data or instruction caches) and from non-cache -> cache (change the pte).
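As a rough illustration of rules 2 and 3 and of the two transitions just described, here is a minimal sketch in C. It is not the FreeBSD pmap code: `struct mapping`, `page_must_be_uncached()` and `cache_transition()` are hypothetical names invented for this example, and it assumes the caller can list every current mapping of one physical page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one mapping of a physical page (not a FreeBSD type). */
struct mapping {
    const void *aspace;  /* owning user address space; unused for kernel mappings */
    bool kernel;         /* kernel mappings are global, visible in every space */
    bool writable;
};

/*
 * Decide whether the mappings of one physical page that are visible while
 * user address space `cur` is active must be uncached on a VIVT cache.
 */
bool
page_must_be_uncached(const struct mapping *m, size_t n, const void *cur)
{
    size_t kern = 0, kern_w = 0, user = 0, user_w = 0;

    for (size_t i = 0; i < n; i++) {
        if (m[i].kernel) {
            kern++;
            kern_w += m[i].writable;
        } else if (m[i].aspace == cur) {
            user++;
            user_w += m[i].writable;
        }
        /* User mappings in other address spaces are covered by rule 1:
         * the flush on context change syncs them up. */
    }

    /* Rule 3a: a writable kernel mapping plus any second mapping. */
    if (kern_w > 0 && n > 1)
        return true;
    /* Rule 3b: read-only kernel mapping(s) plus a writable user mapping. */
    if (kern > 0 && user_w > 0)
        return true;
    /* Rule 2: writable user mapping aliased within the same address space. */
    if (user_w > 0 && user > 1)
        return true;
    return false;
}

enum cache_action {
    CACHE_NO_CHANGE,
    CACHE_DISABLE,  /* change the pte, then wbinv/inv the affected caches */
    CACHE_ENABLE    /* change the pte only */
};

/* Compare the decision before and after a mapping is added or removed. */
enum cache_action
cache_transition(bool was_uncached, bool now_uncached)
{
    if (!was_uncached && now_uncached)
        return CACHE_DISABLE;
    if (was_uncached && !now_uncached)
        return CACHE_ENABLE;
    return CACHE_NO_CHANGE;
}
```

A caller would evaluate `page_must_be_uncached()` before and after each mapping change and act on the result of `cache_transition()`. Only the move to uncached needs the write-back/invalidate step.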
Thanks for the detailed explanation. It took a while, but now I got it. My picture wasn't expecting caching of virtual pages.