Subject: Memory usage issues of importxml/exportsysview
From: sbarriba (sbar@yahoo.co.uk)
Date: Oct 5, 2007 12:40:06 am
List: org.apache.jackrabbit.users

Hi all,

Hot backup tools were discussed in a recent thread (see http://www.mail-archive.com/users@jackrabbit.apache.org/msg04255.html).

As an outcome of that we're doing two things:

1) "Low-level" backup
   o Backing up the database
   o Backing up the repository file system

2) "High-level" backup
   o Running exportsysview on each workspace

When migrating between environments or restoring backups, solution 2) is very useful, but the XML files get very large when the content includes lots of binaries. The main issue is that the memory requirements of "importxml" increase linearly with the size of the XML file. I presume this is due to a) the memory required to parse the file, and/or b) the memory required to hold the transient state of the import.

We now need a 1GB heap for some imports, and obviously this will hit a crunch point.

Any suggestions on how to resolve this memory issue? For example, could "importxml" use a SAX event model instead of parsing the XML into a complete DOM? (Note that I don't know the internals of importxml as it stands.)
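
To illustrate what I mean by a SAX event model, here is a minimal sketch that uses only the JDK's SAX parser. It is not Jackrabbit's importer: the class SysViewNodeCounter and its countNodes method are hypothetical names made up for this example. It just counts the sv:node elements in a system view export as the events stream past, so parser memory does not have to grow with the size of the file the way a full DOM would.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Jackrabbit or JCR.
// Counts sv:node elements by reacting to SAX events instead of building a DOM.
class SysViewNodeCounter extends DefaultHandler {
    // Namespace URI used by the JCR system view XML format.
    private static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    private int nodes;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Only the current element is visible here; nothing is retained
        // after the callback returns except the running count.
        if (SV_NS.equals(uri) && "node".equals(localName)) {
            nodes++;
        }
    }

    // Streams the given XML through a namespace-aware SAX parser and
    // returns the number of sv:node elements seen.
    static int countNodes(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        SysViewNodeCounter handler = new SysViewNodeCounter();
        parser.parse(in, handler);
        return handler.nodes;
    }
}
```

A real import would of course have to do more in each callback than count, and this says nothing about b) above: even with streaming parsing, whatever transient state the import holds before it is saved could still grow with the content.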

All suggestions welcome.

Regards,

Shaun