interrupt.tex

%    Sidebar about panic:
% 	panic is the kernel's last resort: the impossible has happened and the
% 	kernel does not know how to proceed.  In xv6, panic does ...
\chapter{Interrupts and device drivers}
\label{CH:INTERRUPT}

A
\indextext{driver}
is the code in an operating system that manages a particular device:
it configures the device hardware,
tells the device to perform operations,
handles the resulting interrupts,
and interacts with processes that may be waiting
for I/O from the device.
Driver code can be tricky
because a driver executes concurrently with the device that it manages.  In
addition, the driver must understand the device's hardware interface,
which can be complex and poorly documented.

Devices that need attention from the operating system can usually be
configured to generate interrupts, which are one type of trap.
The kernel trap handling code recognizes when a device
has raised an interrupt and calls the driver's interrupt handler;
in xv6, this dispatch happens in {\tt devintr} \lineref{kernel/trap.c:/^devintr/}.

Many device drivers execute code in two contexts: a \indextext{top
half} that runs in a process's kernel thread, and a \indextext{bottom
half} that executes at interrupt time. The top half
is called via system calls such as {\tt read} and {\tt write} that want
the device to perform I/O. This code may ask the hardware to start an
operation (e.g., ask the disk to read a block); then the code waits for
the operation to complete. Eventually the device completes the
operation and raises an interrupt. The driver's interrupt handler,
acting as the bottom half,
figures out what operation has completed, wakes up a waiting
process if appropriate, and tells the hardware to start work
on any waiting next operation.

\section{Code: Console input}

The console driver \fileref{kernel/console.c}
is a simple illustration of driver structure. The
console driver accepts characters typed by a human, via the \indextext{UART}
serial-port hardware attached to the RISC-V. The console driver accumulates a
line of input at a time, processing special input characters such as
backspace and control-u. User processes, such as the shell, use
the {\tt read} system call to fetch lines of input from the console.
When you type input to xv6 in QEMU, your keystrokes are delivered to
xv6 by way of QEMU's simulated UART hardware.

The UART hardware that the driver talks to is a 16550
chip~\cite{ns16550a} emulated by QEMU. On a real computer, a 16550
would manage an RS232 serial link connecting to a terminal or other
computer. When running QEMU, it's connected to your keyboard and
display.

The UART hardware appears to software as a set of \indextext{memory-mapped}
control registers. That is, there are some physical addresses that 
RISC-V hardware connects to the UART device, so that loads and stores
interact with the device hardware rather than RAM.
The memory-mapped addresses for the UART start at 0x10000000, or {\tt UART0}
\lineref{kernel/memlayout.h:/UART0.0x/}.
There are a handful of UART control registers, each the width
of a byte. Their offsets from {\tt UART0} are defined in
\lineref{kernel/uart.c:/define.RHR/}. For example, the
{\tt LSR} register contains bits that indicate whether input
characters are waiting to be read by the software. These
characters (if any) are available for reading from the
{\tt RHR} register. Each time one is read, the UART hardware
deletes it from an internal FIFO of waiting characters, and
clears the ``ready'' bit in {\tt LSR} when the FIFO is empty.
The UART transmit hardware is largely independent of the receive
hardware; if software writes a byte to the {\tt THR},
the UART transmits that byte.

Xv6's {\tt main} calls {\tt consoleinit}
\lineref{kernel/console.c:/^consoleinit/} to initialize the UART
hardware. This code configures the UART to generate 
a receive 
interrupt when the UART receives each byte of input, and
a \indextext{transmit complete} interrupt each time the
UART finishes sending a byte of output \lineref{kernel/uart.c:/^uartinit/}.

The xv6 shell reads from the console by way of a file descriptor
opened by {\tt init.c} \lineref{user/init.c:/open..console/}. Calls to
the {\tt read} system call make their way through the kernel to {\tt
  consoleread} \lineref{kernel/console.c:/^consoleread/}. {\tt
  consoleread} waits for input to arrive (via interrupts) and be
buffered in {\tt cons.buf}, copies the input to user space, and (after
a whole line has arrived) returns to the user process. If the user
hasn't typed a full line yet, any reading processes will wait in the
{\tt sleep} call
\lineref{kernel/console.c:/sleep..cons/}
(Chapter~\ref{CH:SCHED} explains the details of {\tt sleep}).

When the user types a character, the UART hardware asks the RISC-V
to raise an interrupt, which activates
xv6's trap handler.
The trap handler calls {\tt devintr}
\lineref{kernel/trap.c:/^devintr/},
which looks at the RISC-V {\tt scause} register to discover that
the interrupt is from an external device.
Then it asks a hardware unit called the PLIC
\cite{riscv:priv}
to tell it which device interrupted
\lineref{kernel/trap.c:/plic.claim/}.
If it was the UART, {\tt devintr} calls {\tt uartintr}.

{\tt uartintr}
\lineref{kernel/uart.c:/^uartintr/}
reads any waiting input characters from the UART hardware
and hands them to {\tt consoleintr}
\lineref{kernel/console.c:/^consoleintr/}; it doesn't
wait for characters, since future input will raise a new interrupt.
The job of {\tt consoleintr} is to accumulate input characters in
{\tt cons.buf} 
until a whole line arrives.
{\tt consoleintr} treats backspace and a few other characters
specially.
When a newline arrives, {\tt consoleintr} wakes up a
waiting {\tt consoleread} (if there is one).

Once woken, {\tt consoleread} will observe a full line in {\tt
  cons.buf}, copy it to user space, and return (via the system call
machinery) to user space.

\section{Code: Console output}

A {\tt write} system call on a file descriptor connected to the console
eventually arrives at 
{\tt uartputc}
\lineref{kernel/uart.c:/^uartputc/}.
The device driver maintains an output buffer ({\tt uart\_tx\_buf})
so that writing processes do not have to wait for the UART to finish
sending; instead, {\tt uartputc} appends each character to the buffer,
calls {\tt uartstart} to start the device transmitting (if it isn't
already), and returns. The only situation in which {\tt uartputc}
waits is if the buffer is already full.

Each time the UART finishes sending a byte, it generates an interrupt.
{\tt uartintr} calls {\tt uartstart}, which checks that the device
really has finished sending, and hands the device the next buffered
output character. Thus if a process writes multiple bytes to the
console, typically the first byte will be sent by {\tt uartputc}'s
call to {\tt uartstart}, and the remaining buffered bytes will be sent
by {\tt uartstart} calls from {\tt uartintr} as transmit complete
interrupts arrive.

A general pattern to note is the decoupling of device activity from
process activity via buffering and interrupts. The console driver can
process input even when no process is waiting to read it; a subsequent
read will see the input. Similarly, processes can send output without
having to wait for the device. This decoupling can increase
performance by allowing processes to execute concurrently with device
I/O, and is particularly important when the device is slow (as with
the UART) or needs immediate attention (as with echoing typed
characters). This idea is sometimes called \indextext{I/O
  concurrency}.

\section{Concurrency in drivers}

You may have noticed calls to {\tt acquire} in {\tt consoleread}
and in {\tt consoleintr}. These calls acquire a lock, which protects
the console driver's data structures from concurrent access.
There are three concurrency dangers here: two processes on
different CPUs might call {\tt consoleread} at the same time;
the hardware might ask a CPU to deliver a console (really 
UART) interrupt while that CPU is already executing inside
{\tt consoleread};
and the hardware might deliver a console interrupt on
a different CPU while {\tt consoleread} is executing.
Chapter~\ref{CH:LOCK} explains how to use locks
to ensure that these dangers don't lead to incorrect results.

Another way in which concurrency requires care in drivers is that one
process may be waiting for input from a device, but the interrupt
signaling arrival of the input may arrive when a different process (or
no process at all) is running. Thus interrupt handlers are not allowed
to think about the process or code that they have interrupted. For
example, an interrupt handler cannot safely call {\tt copyout} with
the current process's page table. Interrupt handlers typically do
relatively little work (e.g., just copy the input data to a buffer),
and wake up top-half code to do the rest.

\section{Timer interrupts}

Xv6 uses timer interrupts to maintain its idea of the
current time and to switch among compute-bound processes. Timer
interrupts come from clock hardware attached to each RISC-V CPU. Xv6
programs each CPU's clock hardware to interrupt the CPU periodically.

Code in {\tt start.c} 
\lineref{kernel/start.c:/^timerinit/} sets some control bits
that allow supervisor-mode access to the timer control
registers, and then asks for the first timer interrupt.
The {\tt time} control register contains a count that the
hardware increments at a steady rate; this serves as a
notion of the current time. The {\tt stimecmp} register
contains a time at which the the CPU will raise a timer
interrupt; setting {\tt stimecmp} to the current value
of {\tt time} plus {\it x} will schedule an interrupt
{\it x} time units in the future. For {\tt qemu}'s RISC-V
emulation, 1000000 time units is roughly a tenth of second.

Timer interrupts arrive via {\tt usertrap} or {\tt kerneltrap}
and {\tt devintr}, like other device interrupts.
Timer interrupts arrive with {\tt scause}'s low bits set to
five; {\tt devintr} in {\tt trap.c} detects this situation
and calls {\tt clockintr}
\lineref{kernel/trap.c:/clockintr/}.
The latter function increments {\tt ticks},
allowing the kernel to track the
passage of time. The increment occurs on only one CPU, to avoid time
passing faster if there are multiple CPUs.
{\tt clockintr} wakes up any processes waiting in the {\tt sleep}
system call,
and schedules the next timer interrupt by writing
{\tt stimecmp}.

{\tt devintr} returns 2 for a timer interrupt
in order to indicate to {\tt kerneltrap}
or {\tt usertrap} that they should call {\tt yield} so that
CPUs can be multiplexed among runnable processes.

The fact that kernel code can be interrupted by a timer interrupt that
forces a context switch via {\tt yield} is part of the reason why
early code in {\tt usertrap} is careful to save state such as {\tt
  sepc} before enabling interrupts. These context switches also mean
that kernel code must be written in the knowledge that it may move
from one CPU to another without warning.

\section{Real world}

Xv6, like many operating systems, allows interrupts and even context
switches (via {\tt yield}) while executing in the kernel. The reason
for this is to retain quick response times during complex system calls
that run for a long time. However, as noted above, allowing interrupts
in the kernel is the source of some complexity; as a result, a few
operating systems allow interrupts only while executing user code.

Supporting all the devices on a typical computer in its full glory is
much work, because there are many devices, the devices have many
features, and the protocol between device and driver can be complex
and poorly documented. In many operating systems, the drivers account
for more code than the core kernel.

The UART driver retrieves data a byte at a time by reading the UART
control registers; this pattern is called \indextext{programmed I/O}, since
software is driving the data movement. Programmed I/O is simple, but
too slow to be used at high data rates. Devices that need to move lots
of data at high speed typically use \indextext{direct memory access (DMA)}.
DMA device hardware directly writes incoming data to RAM, and reads
outgoing data from RAM. Modern disk and network devices use DMA. A
driver for a DMA device would prepare data in RAM, and then use a
single write to a control register to tell the device to process the
prepared data.

Interrupts make sense when a device needs attention at unpredictable
times, and not too often. But interrupts have high CPU overhead. Thus
high speed devices, such as network and disk controllers, use tricks
that reduce the need for interrupts. One trick is to raise a single
interrupt for a whole batch of incoming or outgoing requests. Another
trick is for the driver to disable interrupts entirely, and to check
the device periodically to see if it needs attention. This technique
is called \indextext{polling}. Polling makes sense if the device performs
operations at a high rate, but it wastes CPU time if the device is mostly
idle. Some drivers dynamically switch between polling and interrupts
depending on the current device load.

The UART driver copies incoming data first to a buffer in the kernel,
and then to user space. This makes sense at low data rates, but such a
double copy can significantly reduce performance for devices that
generate or consume data very quickly. Some operating systems are able
to directly move data between user-space buffers and device hardware,
often with DMA.

As mentioned in Chapter~\ref{CH:UNIX}, the console appears to
applications as a regular file, and applications read input and write
output using the \lstinline{read} and \lstinline{write} system calls.
Applications may want to control aspects of a device that cannot be
expressed through the standard file system calls (e.g.,
enabling/disabling line buffering in the console driver).  Unix
operating systems support the \lstinline{ioctl} system call for such
cases.

Some usages of computers require that the system must respond in a
bounded time.  For example, in safety-critical systems missing a
deadline can lead to disasters.  Xv6 is not suitable for hard
real-time settings. Operating systems for hard real-time tend to be
libraries that link with the application in a way that allows for an
analysis to determine the worst-case response time.  Xv6 is also not
suitable for soft real-time applications, when missing a deadline
occasionally is acceptable, because xv6's scheduler is too simplistic and it has
kernel code path where interrupts are disabled for a long time.

\section{Exercises}

\begin{enumerate}

\item Modify {\tt uart.c} to not use interrupts at all. You may need
to modify {\tt console.c} as well.

\item Add a driver for an Ethernet card.

\end{enumerate}