Subject:[virtio-dev] Re: [PATCH v2] balloon: transitional device support
From:Michael S. Tsirkin (ms@redhat.com)
Date:Apr 27, 2015 2:59:03 am
List:org.oasis-open.lists.virtio-dev

Virtio 1.0 cs02 doesn't include a modern balloon device. At some point we'll likely define an incompatible interface with a different ID and different semantics. But for now, it's not a big effort to support a transitional balloon device: this has the advantage of supporting existing drivers, transparently, as well as transports that don't allow mixing virtio 0 and virtio 1 devices. And balloon is an easy device to test, so it's also useful for people to test virtio core handling of transitional devices.

Three issues with legacy hypervisors have been identified:

1. The actual value is in fact used, and is necessary for management to work. Luckily, 4-byte config space writes are now atomic. When running old guests, hypervisors can detect access to the last byte. When running on old hypervisors, drivers can use atomic 4-byte accesses.
2. Hypervisors didn't actually ignore the stats from the first buffer supplied. This means the values there would be incorrect until the hypervisor resends the request. Add a note suggesting hypervisors ignore the first buffer.
3. QEMU simply overwrites stats from each buffer it gets. Thus, if the driver supplies a different subset of stats on each request, stale values will remain. Require drivers to supply the same subset on each request. This also gives us a simple way to figure out which stats are supported.
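For illustration only, the hypervisor side of issue 1 could look roughly like the sketch below. This is not QEMU code: struct balloon_actual and actual_write() are hypothetical names, and the sketch assumes a 4-byte little-endian actual field that an old guest may write one byte at a time. The device only picks up a new value once a write has covered the last, most-significant byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device-side model of the 4-byte `actual` config field. */
struct balloon_actual {
    uint8_t  raw[4];   /* bytes as written by the guest, little-endian */
    uint32_t latched;  /* value the device is allowed to use */
};

/* Guest writes `len` bytes at byte offset `off` within `actual`. */
static void actual_write(struct balloon_actual *a, unsigned off,
                         const uint8_t *data, unsigned len)
{
    assert(off < 4);
    for (unsigned i = 0; i < len && off + i < 4; i++)
        a->raw[off + i] = data[i];

    /* Latch only when the write covered the most-significant byte. */
    if (len > 0 && off + len >= 4)
        a->latched = (uint32_t)a->raw[0] |
                     ((uint32_t)a->raw[1] << 8) |
                     ((uint32_t)a->raw[2] << 16) |
                     ((uint32_t)a->raw[3] << 24);
}
```

With this model, a single 4-byte write and four single-byte writes in ascending order both end with the same latched value, and the device never sees a half-updated one.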

Please note the issues are there in legacy mode only: we need to fix them even if we don't implement a modern interface for balloon.
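To make issue 3 concrete, here is a minimal sketch of a device that overwrites cached stats from each buffer. It is not the QEMU implementation: struct stats_cache and stats_update() are hypothetical, and the sketch assumes packed entries of a 16-bit tag followed by a 64-bit value (10 bytes each), decoded as little-endian for simplicity, whereas the legacy interface uses guest-native endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_NR   6   /* tags 0..5, matching the VIRTIO_BALLOON_S_* defines */
#define STAT_SIZE 10  /* packed entry: 2-byte tag followed by 8-byte value */

/* Hypothetical device-side cache with one slot per known tag. */
struct stats_cache {
    uint64_t val[STAT_NR];
};

/* Overwrite cached values with whatever entries the buffer contains. */
static void stats_update(struct stats_cache *c, const uint8_t *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    for (size_t off = 0; off + STAT_SIZE <= len; off += STAT_SIZE) {
        /* Assumption: little-endian fields, decoded byte by byte. */
        uint16_t tag = (uint16_t)(buf[off] | (buf[off + 1] << 8));
        uint64_t val = 0;
        for (int i = 7; i >= 0; i--)
            val = (val << 8) | buf[off + 2 + i];
        if (tag >= STAT_NR)
            continue;        /* unknown tag: ignore the entry */
        c->val[tag] = val;   /* slots absent from this buffer keep old data */
    }
}
```

If one buffer carries MEMFREE and MEMTOT and the next carries only MEMTOT, the cache keeps reporting the old MEMFREE value. Requiring the same subset in every buffer avoids exactly that.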

Signed-off-by: Michael S. Tsirkin <ms@redhat.com>

---

changes from v1:
    dropped stats format change
    added a bunch of normative statements
    documented work-arounds for legacy hypervisor bugs

 conformance.tex |  24 +++++++-
 content.tex     | 173 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 170 insertions(+), 27 deletions(-)

diff --git a/conformance.tex b/conformance.tex
index 80b333f..c51287a 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -15,7 +15,7 @@ Conformance targets:
 \begin{itemize}
 \item Clause \ref{sec:Conformance / Driver Conformance},
 \item One of clauses \ref{sec:Conformance / Driver Conformance / PCI Driver Conformance}, \ref{sec:Conformance / Driver Conformance / MMIO Driver Conformance} or \ref{sec:Conformance / Driver Conformance / Channel I/O Driver Conformance}.
- \item One of clauses \ref{sec:Conformance / Driver Conformance / Network Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Block Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Console Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Entropy Driver Conformance} or \ref{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}.
+ \item One of clauses \ref{sec:Conformance / Driver Conformance / Network Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Block Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Console Driver Conformance}, \ref{sec:Conformance / Driver Conformance / Entropy Driver Conformance} or \ref{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}.
 \end{itemize}
 \item[Device] A device MUST conform to three conformance clauses:
 \begin{itemize}
@@ -123,6 +123,16 @@ An entropy driver MUST conform to the following normative statements:
 \item \ref{drivernormative:Device Types / Entropy Device / Device Operation}
 \end{itemize}

+\subsection{Traditional Memory Balloon Driver Conformance}\label{sec:Conformance / Driver Conformance / Traditional Memory Balloon Driver Conformance}
+
+A traditional memory balloon driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Memory Balloon Device / Feature bits}
+\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation}
+\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Memory Statistics}
+\end{itemize}
+
 \subsection{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}

 An SCSI host driver MUST conform to the following normative statements:
@@ -230,6 +240,16 @@ An entropy device MUST conform to the following normative statements:
 \item \ref{devicenormative:Device Types / Entropy Device / Device Operation}
 \end{itemize}

+\subsection{Traditional Memory Balloon Device Conformance}\label{sec:Conformance / Device Conformance / Traditional Memory Balloon Device Conformance}
+
+A traditional memory balloon device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Memory Balloon Device / Feature bits}
+\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation}
+\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Memory Statistics}
+\end{itemize}
+
 \subsection{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}

 An SCSI host device MUST conform to the following normative statements:
@@ -292,7 +312,7 @@ Feature Bits / Legacy Interface: A Note on Feature Bits}
 \item Section \ref{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation}
 \item Section \ref{sec:Device Types / Console Device / Device configuration layout / Legacy Interface: Device configuration layout}
 \item Section \ref{sec:Device Types / Console Device / Device Operation / Legacy Interface: Device Operation}
-\item Section \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation}
+\item Section \ref{sec:Device Types / Memory Balloon Device / Feature bits / Legacy Interface: Feature bits}
 \item Section \ref{sec:Device Types / Memory Balloon Device / Device Operation / Legacy Interface: Device Operation}
 \item Section \ref{sec:Device Types / Memory Balloon Device / Device Operation / Memory Statistics / Legacy Interface: Memory Statistics}
 \item Section \ref{sec:Device Types / SCSI Host Device / Device configuration layout / Legacy Interface: Device configuration layout}
diff --git a/content.tex b/content.tex
index 11015a5..3f728e4 100644
--- a/content.tex
+++ b/content.tex
@@ -1,4 +1,4 @@
-\chapter{Basic Facilities of a Virtio Device}\label{sec:Basic Facilities of a Virtio Device}
+{Basic Facilities of a Virtio Device}\label{sec:Basic Facilities of a Virtio Device}

 A virtio device is discovered and identified by a bus-specific method
 (see the bus specific sections: \ref{sec:Virtio Transport Options / Virtio Over PCI Bus}~\nameref{sec:Virtio Transport Options / Virtio Over PCI Bus},
@@ -1046,7 +1046,7 @@ Transitional PCI Device ID & Virtio Device \\
 \hline
 0x1001 & block device \\
 \hline
-0x1002 & memory ballooning (legacy) \\
+0x1002 & memory ballooning (traditional) \\
 \hline
 0x1003 & console \\
 \hline
@@ -2966,7 +2966,7 @@ Device ID & Virtio Device \\
 \hline
 4 & entropy source \\
 \hline
-5 & memory ballooning (legacy) \\
+5 & memory ballooning (traditional) \\
 \hline
 6 & ioMemory \\
 \hline
@@ -4357,14 +4357,13 @@ how many random bytes were received.
 The device MUST place one or more random bytes into the buffer, but it
 MAY use less than the entire buffer length.

-\section{Legacy Interface: Memory Balloon Device}\label{sec:Device Types / Memory Balloon Device}
+\section{Traditional Memory Balloon Device}\label{sec:Device Types / Memory Balloon Device}

-This device is deprecated, and thus only exists as a legacy device
-illustrated here for reference. The device number 13 is reserved for
-a new memory balloon interface which is expected in a future version
-of the standard.
+This is the traditional balloon device. The device number 13 is
+reserved for a new memory balloon interface, with different
+semantics, which is expected in a future version of the standard.

-The virtio memory balloon device is a primitive device for
+The traditional virtio memory balloon device is a primitive device for
 managing guest memory: the device asks for a certain amount of
 memory, and the driver supplies it (or withdraws it, if the device
 has more than it asks for). This allows the guest to adapt to
@@ -4393,6 +4392,26 @@ guest memory statistics to the host.
 memory statistics is present.
 \end{description}

+\drivernormative{\subsubsection}{Feature bits}{Device Types / Memory Balloon Device / Feature bits}
+The driver SHOULD accept the VIRTIO_BALLOON_F_MUST_TELL_HOST
+feature if offered by the device.
+
+\devicenormative{\subsubsection}{Feature bits}{Device Types / Memory Balloon Device / Feature bits}
+If the device offers the VIRTIO_BALLOON_F_MUST_TELL_HOST feature
+bit, and if the driver did not accept this feature bit, the
+device MAY signal failure by failing to set FEATURES_OK
+\field{device status} bit when the driver writes FEATURES_OK into
+\field{device status}.
+\subparagraph{Legacy Interface: Feature bits}\label{sec:Device
+Types / Memory Balloon Device / Feature bits / Legacy Interface:
+Feature bits}
+As the legacy interface does not have a way to gracefully report feature
+negotiation failure, when using the legacy interface,
+transitional devices MUST support guests which do not negotiate
+VIRTIO_BALLOON_F_MUST_TELL_HOST feature, and SHOULD
+allow guest to use memory before notifying host if
+VIRTIO_BALLOON_F_MUST_TELL_HOST is not negotiated.
+
 \subsection{Device configuration layout}\label{sec:Device Types / Memory Balloon Device / Device configuration layout}
 Both fields of this configuration are always available.
@@ -4404,25 +4423,32 @@ struct virtio_balloon_config {
 };
 \end{lstlisting}

-Note that these fields are always little endian, despite convention
+\subparagraph{Legacy Interface: Device configuration layout}\label{sec:Device Types / Memory Balloon Device / Device
+configuration layout / Legacy Interface: Device configuration layout}
+When using the legacy interface, transitional devices and drivers
+MUST format the fields in struct virtio_balloon_config
+according to the little-endian format.
+\begin{note}
+This is unlike the usual convention that was used for other legacy devices
 that legacy device fields are guest endian.
+\end{note}

 \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Device / Device Initialization}

+The device initialization process is outlined below:
+
 \begin{enumerate}
 \item The inflate and deflate virtqueues are identified.

 \item If the VIRTIO_BALLOON_F_STATS_VQ feature bit is negotiated:
 \begin{enumerate}
 \item Identify the stats virtqueue.
-
- \item Add one empty buffer to the stats virtqueue and notify the
- device.
+ \item Add one empty buffer to the stats virtqueue.
+ \item DRIVER_OK is set: device operation begins.
+ \item Notify the device about the stats virtqueue buffer.
 \end{enumerate}
 \end{enumerate}

-Device operation begins immediately.
-
 \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}

 The device is driven by the receipt of a
@@ -4451,16 +4477,16 @@ configuration change interrupt.
 \item If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is negotiated, the
 guest informs the device of pages before it uses them.

- \item Otherwise, the guest MAY begin to re-use pages previously
+ \item Otherwise, the guest is allowed to re-use pages previously
 given to the balloon before the device has acknowledged their
 withdrawal\footnote{In this case, deflation advice is merely a courtesy.
 }.
 \end{enumerate}

-\item In either case, once the device has completed the inflation or
- deflation, the driver updates \field{actual} to reflect the new number of pages in the balloon\footnote{As updates to device-specific configuration space are not atomic, this field
-isn't particularly reliable, but can be used to diagnose buggy guests.
-}.
+\item In either case, the device acknowledges inflate and deflate
+requests by using the descriptor.
+\item Once the device has acknowledged the inflation or
+ deflation, the driver updates \field{actual} to reflect the new number of pages in the balloon.
 \end{enumerate}

 \drivernormative{\subsubsection}{Device Operation}{Device Types / Memory Balloon Device / Device Operation}
@@ -4474,12 +4500,38 @@ The driver MUST use the deflateq to inform the device of
 pages that it wants to use from the balloon.

 If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is negotiated, the
-driver MUST wait until the device has used the deflateq descriptor
-before using the pages.
+driver MUST NOT use pages from the balloon until
+the device has acknowledged the deflate request.
+
+Otherwise, if the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is not
+negotiated, the driver MAY begin to re-use pages previously
+given to the balloon before the device has acknowledged the
+deflate request.
+
+In any case, the driver MUST NOT use pages from the balloon
+after adding the pages to the balloon, but before the device has
+acknowledged the inflate request.
+
+The driver MUST NOT request deflation of pages in
+the balloon before the device has acknowledged the inflate
+request.

 The driver MUST update \field{actual} after changing the number
 of pages in the balloon.

+\devicenormative{\subsubsection}{Device Operation}{Device Types / Memory Balloon Device / Device Operation}
+
+The device MAY modify the contents of a page in the balloon
+after detecting its physical number in an inflate request
+and before acknowledging the inflate request by using the inflateq
+descriptor.
+
+If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is negotiated, the
+device MAY modify the contents of a page in the balloon
+after detecting its physical number in an inflate request
+and before detecting its physical number in a deflate request
+and acknowledging the deflate request.
+
 \paragraph{Legacy Interface: Device Operation}\label{sec:Device
 Types / Memory Balloon Device / Device Operation / Legacy
 Interface: Device Operation}
@@ -4488,7 +4540,27 @@ When using the legacy interface, the driver SHOULD ignore the \field{len} value
 Historically, some devices put the total descriptor length there,
 even though no data was actually written.
 \end{note}
+When using the legacy interface, the driver MUST write out all
+4 bytes each time it updates the \field{actual} value in the
+configuration space, using a single atomic operation.
+\begin{note}
+Historically, devices used the \field{actual} value, even though
+the device-specific configuration space was never guaranteed
+to be atomic.
+\end{note}
+When using the legacy interface, the device MUST NOT use the
+\field{actual} value written by the driver in the configuration
+space, until the last, most-significant byte of the value has been
+written.
+\begin{note}
+Historically, drivers wrote the \field{actual} value
+by using multiple single-byte writes in order, from the
+least-significant to the most-significant value.

Or maybe: most-significant byte of the value.

+\end{note}

+\footnote{As updates to device-specific configuration space are not atomic, this field
+isn't particularly reliable, but can be used to diagnose buggy guests.
+}
 \subsubsection{Memory Statistics}\label{sec:Device Types / Memory Balloon Device / Device Operation / Memory Statistics}

 The stats virtqueue is atypical because communication is driven
@@ -4513,10 +4585,11 @@ as follows:
 subsequent request) and consumes the statistics.
 \end{enumerate}

+
 Within the buffer, statistics are an array of 6-byte entries. Each
 statistic consists of a 16 bit tag and a 64 bit value. All statistics
 are optional and the driver chooses which ones to supply. To guarantee backwards
- compatibility, the driver SHOULD omit unsupported statistics.
+ compatibility, the device ought to omit unsupported statistics.

 \begin{lstlisting}
 struct virtio_balloon_stat {
@@ -4526,17 +4599,67 @@ struct virtio_balloon_stat {
 #define VIRTIO_BALLOON_S_MINFLT 3
 #define VIRTIO_BALLOON_S_MEMFREE 4
 #define VIRTIO_BALLOON_S_MEMTOT 5
- u16 tag;
- u64 val;
+ le16 tag;
+ le64 val;
 } __attribute__((packed));
 \end{lstlisting}

+\drivernormative{\paragraph}{Memory Statistics}{Device Types / Memory Balloon Device / Device Operation / Memory Statistics}
+Normative statements in this section apply if and only if the
+VIRTIO_BALLOON_F_STATS_VQ feature has been negotiated.
+
+The driver MUST make at most one buffer available to the device
+in the statsq, at all times.
+
+After initializing the device, the driver MUST make an output
+buffer available in the statsq.
+
+Upon detecting that device has used a buffer in the statsq, the
+driver MUST make an output buffer available in the statsq.
+
+Before making an output buffer available in the statsq, the
+driver MUST initialize it, including one struct
+virtio_balloon_stat entry for each statistic that it supports.
+
+Driver MUST use an output buffer size which is a multiple of 6
+bytes for all buffers submitted to the statsq.
+
+Driver MAY supply struct virtio_balloon_stat entries in the
+output buffer submitted to the statsq in any order, without
+regard to \field{tag} values.
+
+Driver MAY supply a subset of all statistics in the output buffer
+submitted to the statsq.
+
+Driver MUST supply the same subset of statistics in all buffers
+submitted to the statsq.
+
+\devicenormative{\paragraph}{Memory Statistics}{Device Types / Memory Balloon Device / Device Operation / Memory Statistics}
+Normative statements in this section apply if and only if the
+VIRTIO_BALLOON_F_STATS_VQ feature has been negotiated.
+
+Within an output buffer submitted to the statsq,
+the device MUST ignore entries with \field{tag} values that it does not recognize.
+
+Within an output buffer submitted to the statsq,
+the device MUST accept struct virtio_balloon_stat entries in any
+order without regard to \field{tag} values.
+
 \paragraph{Legacy Interface: Memory Statistics}\label{sec:Device Types / Memory Balloon Device / Device Operation / Memory Statistics / Legacy Interface: Memory Statistics}
+
 When using the legacy interface, transitional devices and drivers
 MUST format the fields in struct virtio_balloon_stat
 according to the native endian of the guest rather than
 (necessarily when not using the legacy interface) little-endian.

+When using the legacy interface,
+the device MUST ignore all values in the first buffer in the
+statsq supplied by the driver after device initialization.
+\begin{note}
+Historically, drivers supplied an uninitialized buffer in the
+first buffer.
+\end{note}
+
 \subsubsection{Memory Statistics Tags}\label{sec:Device Types / Memory Balloon Device / Device Operation / Memory Statistics Tags}

\begin{description}