Subject:[caiman-discuss] ZFS data set division
From:Mike Gerdts (mger@gmail.com)
Date:Aug 24, 2007 9:16:32 am
List:org.opensolaris.caiman-discuss

[Bcc zones-discuss, zfs-discuss as a heads-up of discussion at caiman-discuss] http://mail.opensolaris.org/pipermail/caiman-discuss/2007-August/000678.html

On 8/24/07, Dave Miner <Dave.Miner at sun.com> wrote:

David.Comay at Sun.COM wrote:

Now there might be some good reasons for /export and /export/home to be separate datasets but I don't see it for /home.

That item was from Lori's list, so she'd have to comment, though my presumption now that I think about it was that /home was just a typo and /export/home was what was actually meant. Should have changed that before I sent it out, I suppose.

I would argue that there is no good reason for /export to be in / and that it should be a different fs. Whether /export/home is split to another fs is not of great concern to me.

I expect that at some point in the very near future we will examine dividing /var more finely, as well as possibly /etc. The idea there would be to separate /var/adm or portions of /var/spool (as two examples) to be shared across boot environments where the data is not BE-specific and duplication can cause oddities in behavior of the system (such as duplicate copies of spooled emails) or discontinuities in operational data (such as message logs or audit trails). In Live Upgrade we've handled this problem to some extent with the /etc/lu/synclist and associated script machinery at reboot, but eliminating the need for synchronization where possible seems attractive to me.

In addition, how zones are handled will need further refinement. The current attributes that sparse-root zones have with respect to sharing of storage and VM mappings are important, but they come at a cost with respect to package maintenance. I think those costs need to be reweighed given that we have ZFS now and the sorts of machines we expect zones to be deployed on in the future.

I agree that they do, and so a discussion with the zones team is certainly in order. One of the things which I failed to state originally is that this is the layout proposed to be used by the ZFS boot project, which is primarily targeting Solaris 10 (though our processes require that it show up in Nevada first) and thus limited in the level of change to other areas which is both allowable and feasible; it's scoped to provide a fairly straightforward transition from UFS to ZFS within the Solaris 10 stream. We obviously have more freedom in a minor release to make more fundamental changes, and I think we need to.

Since a lot of work seems to be going into optimizing the use of clones for zones, I would suggest that something along the lines of the following would be the ideal setup.

  root on /
  root/var_share on /var/share
  root/export on /export
  root@install (not mounted)
  root/var_share@install (not mounted)
  root/zones on /zones
  root/zones/SUNWdefault on /zones/SUNWdefault (clone of root@install)
  root/zones/SUNWdefault_var_shared (mounted at zone boot, clone)
  root/zones/SUNWdefault_export (mounted at zone boot, empty)
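For anyone who wants to see the layout above as concrete commands, here is a rough Python sketch that only builds the `zfs` command strings and runs nothing. It makes several assumptions that are not in the message: that `root` already exists and is mounted on /, that "root at install" in the archive is the snapshot `root@install` (the archive rewrites "@" as " at "), that the zone's var_shared clone comes from `root/var_share@install`, and that "mounted at zone boot" can be modelled with `mountpoint=legacy`.

```python
# Datasets that get a normal mountpoint, taken from the layout in the message.
MOUNTED = [
    ("root/var_share", "/var/share"),
    ("root/export", "/export"),
    ("root/zones", "/zones"),
]


def layout_commands():
    """Return the zfs commands for the proposed layout as strings.

    Nothing is executed. "root" is assumed to exist already and to be
    mounted on /, so no command is emitted for it.
    """
    cmds = [
        f"zfs create -o mountpoint={mountpoint} {dataset}"
        for dataset, mountpoint in MOUNTED
    ]
    # Install-time snapshots; snapshots are never mounted.
    cmds.append("zfs snapshot root@install")
    cmds.append("zfs snapshot root/var_share@install")
    # The default zone root is a clone of the install-time root snapshot.
    cmds.append(
        "zfs clone -o mountpoint=/zones/SUNWdefault "
        "root@install root/zones/SUNWdefault"
    )
    # Assumption: the zone's shared /var data is cloned from the global
    # zone's var_share snapshot and left for zone boot to mount.
    cmds.append(
        "zfs clone -o mountpoint=legacy "
        "root/var_share@install root/zones/SUNWdefault_var_shared"
    )
    # The zone's /export starts out empty and is also mounted at zone boot.
    cmds.append("zfs create -o mountpoint=legacy root/zones/SUNWdefault_export")
    return cmds


if __name__ == "__main__":
    print("\n".join(layout_commands()))
```

The ordering matters only in that each snapshot has to exist before anything is cloned from it.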

Things that are destined to be shared between boot environments would be in a /var/share. Symbolic links would point those directories that are shared to the appropriate subdirectory of /var/share.
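A minimal sketch of the symlink arrangement, in Python so it can be tried in a scratch directory. The function name `link_into_share` and the directory names `adm` and `spool` are only illustrative (the latter two are borrowed from the examples quoted earlier in the thread), and the sketch does not migrate any existing data.

```python
import os
import tempfile


def link_into_share(root, names):
    """Create <root>/var/share/<name> and point <root>/var/<name> at it.

    Raises FileExistsError if <root>/var/<name> already exists; moving
    existing contents into the shared file system is out of scope here.
    """
    var = os.path.join(root, "var")
    for name in names:
        shared = os.path.join(var, "share", name)
        os.makedirs(shared, exist_ok=True)
        # Relative target, so the link still resolves when the boot
        # environment is mounted somewhere other than /.
        target = os.path.relpath(shared, var)
        os.symlink(target, os.path.join(var, name))


if __name__ == "__main__":
    scratch = tempfile.mkdtemp()
    link_into_share(scratch, ["adm", "spool"])
    for name in ("adm", "spool"):
        link = os.path.join(scratch, "var", name)
        print(link, "->", os.readlink(link))
```

Using relative link targets is a design choice, not something the message specifies; absolute targets would break when an inactive boot environment is mounted under an alternate root.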

The SUNWdefault zone may have gzonly packages removed and will be sys-unconfig'd. It would have its own /var/shared and /export file systems that are shared between the zone's boot environments. The SUNWdefault zone would be cloned for each zone. This may lead down the path of needing a zonecfg fs property that says whether a file system is shared between BEs and whether it should be cloned or created as empty when the zone is cloned.
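To make the idea of such a property a bit more concrete, here is a hypothetical sketch in Python. Nothing in it is an existing zonecfg feature: the `ZoneFs` record, its `shared_between_bes` and `on_clone` fields, the zone name `web1`, and the choice to name each snapshot after the new zone are all invented for illustration. It only produces `zfs` command strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneFs:
    """Hypothetical per-file-system settings for a zone."""
    suffix: str               # "" for the zone root, else e.g. "_export"
    shared_between_bes: bool  # True: one copy for all of the zone's BEs
    on_clone: str             # "clone" or "empty" when the zone is cloned


# The three SUNWdefault file systems from the proposed layout.
SUNWDEFAULT_FS = [
    ZoneFs("", False, "clone"),
    ZoneFs("_var_shared", True, "clone"),
    ZoneFs("_export", True, "empty"),
]


def plan_zone_clone(template, new_zone, filesystems, parent="root/zones"):
    """Return zfs commands that would create new_zone from template."""
    cmds = []
    for fs in filesystems:
        src = f"{parent}/{template}{fs.suffix}"
        dst = f"{parent}/{new_zone}{fs.suffix}"
        if fs.on_clone == "clone":
            snap = f"{src}@{new_zone}"
            cmds.append(f"zfs snapshot {snap}")
            cmds.append(f"zfs clone {snap} {dst}")
        elif fs.on_clone == "empty":
            cmds.append(f"zfs create {dst}")
        else:
            raise ValueError(f"unknown on_clone value: {fs.on_clone!r}")
    return cmds


def datasets_for_new_be(zone, filesystems, parent="root/zones"):
    """Datasets that need their own copy in a new boot environment."""
    return [
        f"{parent}/{zone}{fs.suffix}"
        for fs in filesystems
        if not fs.shared_between_bes
    ]
```

With these settings, `plan_zone_clone("SUNWdefault", "web1", SUNWDEFAULT_FS)` snapshots and clones the zone root and the var_shared file system and creates an empty export, while `datasets_for_new_be("web1", SUNWDEFAULT_FS)` lists only the zone root.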

Deduplication would be responsible for minimizing the overhead of possibly many clones of SUNWdefault zone snapshots as time + patches move forward. I suspect that an RFE may need to come around to drop uninteresting snapshots that were used only for clone purposes.

Oh, and on Monday 's/SUNW/JAVA/g' (sigh) all of the above. :)

Mike