Someone found that live migration of a domain that had been ballooned
down took far longer than one that had not been. (Ballooned down from
3000M to 1000M: 31 seconds vs 89 seconds, real time.) I came up with a
complex theory and asked him to look in xend.log to confirm it. He
didn't, but he mentioned there were a lot of "netbuf race" messages in
the log. In this particular case, live migration generated approximately
512000 "netbuf race" messages. Deleting the DPRINTF reduced the
migration time to 11 seconds.
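
As a rough sanity check on the numbers above, here is a small C sketch.
It assumes "M" means MiB and that pages are 4 KiB; neither is stated
above, and the function names are made up for illustration. Under those
assumptions, the memory ballooned out corresponds to 512000 pages, which
would be consistent with roughly one message per ballooned-out page,
though that is only an inference from the figures. The 78 seconds saved
works out to about 152 microseconds per message.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages released by ballooning from from_mib down to to_mib.
 * ASSUMPTION: 4 KiB pages; the page size is not given in the report. */
static uint64_t ballooned_pages(uint64_t from_mib, uint64_t to_mib)
{
    const uint64_t page_kib = 4;

    assert(from_mib >= to_mib);
    return (from_mib - to_mib) * 1024 / page_kib;
}

/* Average extra wall-clock cost per log message, in microseconds,
 * given the run time with and without the debug output. */
static double usec_per_message(double with_s, double without_s,
                               uint64_t messages)
{
    assert(messages > 0);
    return (with_s - without_s) * 1e6 / (double)messages;
}
```

Calling ballooned_pages(3000, 1000) and usec_per_message(89.0, 11.0,
512000) reproduces the two figures quoted in the paragraph above.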
While it is simple enough to submit a patch to delete this DPRINTF,
perhaps something more subtle is called for, such as modifying the
migrate/save command paths to accept a debug argument and passing it to
xc_save?
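
To make the debug-argument idea concrete, here is a minimal sketch in C
of a debug print that is gated by a runtime flag. Everything in it is
hypothetical: debug_enabled, debug_printf, debug_emitted, scan_pages and
the message text are illustrative names, not the actual xc_save code or
its DPRINTF definition. The flag would be set by whatever parses the
proposed debug argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical runtime flag; a caller would set it from a debug argument. */
static int debug_enabled = 0;

/* Count of messages actually written, kept only so the effect is visible. */
static unsigned long debug_emitted = 0;

/* Write a debug message to stderr, but only when debugging is enabled. */
static void debug_printf(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    if (!debug_enabled)
        return;             /* cheap early exit on the hot path */

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    debug_emitted++;
}

/* Stand-in for a loop that reaches the noisy path once per page.
 * Returns the number of pages visited. */
static unsigned long scan_pages(unsigned long npages)
{
    unsigned long i, visited = 0;

    for (i = 0; i < npages; i++) {
        debug_printf("netbuf race: page %lu\n", i);  /* illustrative text */
        visited++;
    }
    return visited;
}
```

With debug_enabled left at 0, the loop costs one branch per page and
writes nothing; setting it to 1 restores the messages for anyone who
needs them.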
There's nothing much we can do here - there's no easy way for us to
distinguish between pages which are 'ballooned out' and pages which
are temporarily being used for network buffers. I've checked in a fix
to unstable (cset 13185:62ef527eb19f) which simply removes this particular
debug output.
thanks for spotting this!
cheers,
S.