Linux version: v6.0
Architecture: ARMv8
I have been digging into the details of interrupt processing in Linux over the past few weeks. Although there are quite a few quality articles and videos on the internet explaining the interrupt subsystem, I personally found it hard to get the big picture. Only after I actually traced the code did everything come together. In this post I hope to give an overview of the interrupt architecture by quickly going through how Linux sets up and processes interrupts. First some concepts around interrupt processing are introduced, then a real example is given to show how the code works.
The example shown here is the ARM64 architecture with a GICv2 interrupt controller, setting up the IPI interrupt. To keep the discussion from becoming too long and digressing, the explanation is pretty specific to our example, so some details are omitted. There is a list of related concepts at the end of the article for interested readers to look up.
The post became quite long as I progressed, so I decided to split it into two articles: “Interrupt Handling Initialization” and “Interrupt Handling Process”.
Quick Intro to OS Interrupts
Needless to say, knowing what an interrupt is is necessary to understand how Linux processes them. At a fundamental level, an interrupt is just the outside world changing the voltage on one of the CPU’s pins. The internal circuitry of the CPU detects this change and reacts to it. In a modern computer, there are many peripheral devices that each need to interrupt the CPU to signal some event or request some computation, e.g. the NIC receiving and transmitting packets, keyboard input, IPIs, etc. An Interrupt Controller is the piece of hardware that helps manage these interrupts. Typically, the CPU can use the interrupt controller to:
enable and disable each interrupt
set the destination CPU of each interrupt (CPU affinity)
set priorities of interrupts
invoke a software interrupt
etc.
The CPU also communicates with the interrupt controller when processing interrupts. The following is a typical series of steps involved in processing an interrupt:
The interrupt controller detects that peripheral device number 21 wishes to interrupt the CPU
The interrupt controller observes that device 21’s interrupt is set to be delivered to CPU 0
The interrupt controller writes the number “21” into the register corresponding to the current interrupt (this register is within the interrupt controller, not the CPU)
The interrupt controller changes the voltage of the pin connected to CPU 0 to inform it that an interrupt has arrived
CPU 0 handles the interrupt for device 21
After interrupt handling is complete, CPU 0 writes to the EOI (End-Of-Interrupt) register in the interrupt controller to inform it that handling of the current interrupt has completed
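To make the steps above concrete, here is a minimal user-space sketch of a toy interrupt controller. Everything in it (the `toy_` names, the sizes, the single "current interrupt" register) is hypothetical and chosen only for illustration; it is not the register layout of the GIC or of any real controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_IRQS 32
#define TOY_NR_CPUS 4
#define TOY_NO_IRQ  UINT32_MAX  /* read back when nothing is being signalled */

/* Hypothetical controller state; not the layout of any real chip. */
struct toy_intc {
    bool     enabled[TOY_NR_IRQS];     /* per-interrupt enable bit */
    unsigned target_cpu[TOY_NR_IRQS];  /* destination CPU (affinity) */
    uint32_t current_irq;              /* the "current interrupt" register */
    bool     cpu_line[TOY_NR_CPUS];    /* level of the pin wired to each CPU */
};

void toy_intc_init(struct toy_intc *c)
{
    *c = (struct toy_intc){0};
    c->current_irq = TOY_NO_IRQ;
}

/* A device raises `hwirq`; returns false if the interrupt is disabled. */
bool toy_device_raise(struct toy_intc *c, uint32_t hwirq)
{
    assert(hwirq < TOY_NR_IRQS);
    assert(c->target_cpu[hwirq] < TOY_NR_CPUS);
    if (!c->enabled[hwirq])
        return false;
    c->current_irq = hwirq;                    /* record which interrupt it is */
    c->cpu_line[c->target_cpu[hwirq]] = true;  /* raise the pin of the target CPU */
    return true;
}

/* The CPU asks the controller which interrupt is current. */
uint32_t toy_cpu_read_current(const struct toy_intc *c)
{
    return c->current_irq;
}

/* The CPU signals end-of-interrupt for `hwirq`. */
void toy_cpu_eoi(struct toy_intc *c, uint32_t hwirq)
{
    assert(hwirq < TOY_NR_IRQS && hwirq == c->current_irq);
    c->cpu_line[c->target_cpu[hwirq]] = false;
    c->current_irq = TOY_NO_IRQ;
}
```

With interrupt 21 enabled and targeted at CPU 0, `toy_device_raise(&c, 21)` raises CPU 0's line, `toy_cpu_read_current(&c)` yields 21, and `toy_cpu_eoi(&c, 21)` lowers the line again.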
Interrupt Management and Data Structures used in the Linux Kernel
Linux IRQ number & Hardware IRQ number
Linux allocates an IRQ number for each interrupt in the system, commonly called the Linux IRQ number, and uses it to identify the interrupt. The interrupt numbers shown in /proc/interrupts are Linux IRQ numbers. The hardware IRQ number is a different concept: the number “21” in the example above is a hardware IRQ number, i.e. the number the CPU gets when it asks the interrupt controller for the current interrupt.
You might be wondering why Linux IRQ numbers are necessary, why not just use hardware IRQ numbers for everything?
It would indeed be nice if hardware IRQ numbers could be used everywhere. However, computers these days may contain more than one interrupt controller, and hardware IRQ numbers alone are not enough to identify individual interrupts, as two controllers may use the same hardware IRQ number for two different devices. As a result, the concept of IRQ Domains is introduced to translate between hardware IRQ numbers and Linux IRQ numbers.
IRQ Domain
Domain as defined in mathematics:
Domain, the set of inputs accepted by the function. —wikipedia
In our case, an IRQ domain stands for one way of translating hardware IRQ numbers → Linux IRQ numbers. Each interrupt controller has its own IRQ domain, as each controller has its own way of translating its hardware IRQ numbers, and the hardware IRQ numbers of a controller must be translated by that controller’s IRQ domain.
Linux supports two methods for actually translating the IRQ numbers:
radix tree: use the hardware IRQ number as the key to look up a radix tree and retrieve the Linux IRQ number. (I am not familiar with radix trees :P)
array lookup: index an array with the hardware IRQ number to retrieve the Linux IRQ number
I will use the term “revmap” to refer to the data structure (radix tree or array) that does the translation.
The driver for the interrupt controller can choose whichever option it sees fit; the radix tree is more suitable for sparse hardware IRQ numbers, though.
struct irq_domain {
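As a rough illustration of the array-lookup option, here is a minimal user-space sketch of two domains that both use hardware IRQ number 21 but translate it to different Linux IRQ numbers. The `toy_` names and the fixed array size are hypothetical. For simplicity the array stores Linux IRQ numbers directly, and the value 0 is used to mean "no mapping", which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REVMAP_SIZE 32

/* Hypothetical stand-in for an IRQ domain with an array revmap. */
struct toy_domain {
    const char  *name;
    unsigned int revmap[TOY_REVMAP_SIZE];  /* hwirq -> Linux IRQ, 0 = unmapped */
};

/* Record that `hwirq` on this controller corresponds to Linux IRQ `virq`. */
int toy_domain_map(struct toy_domain *d, unsigned int hwirq, unsigned int virq)
{
    assert(d != NULL);
    if (hwirq >= TOY_REVMAP_SIZE || virq == 0)
        return -1;  /* out of range for the array, or reserved value */
    d->revmap[hwirq] = virq;
    return 0;
}

/* Translate a hardware IRQ number; returns 0 when there is no mapping. */
unsigned int toy_domain_find(const struct toy_domain *d, unsigned int hwirq)
{
    assert(d != NULL);
    if (hwirq >= TOY_REVMAP_SIZE)
        return 0;
    return d->revmap[hwirq];
}
```

The lookup is a single array index, which is why this option suits controllers with a small, dense range of hardware IRQ numbers; a sparse range would waste most of the array.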
IRQ Desc & IRQ Data & IRQ Action & Interrupt Handlers
You can see that revmap[] in irq_domain does not store Linux IRQ numbers directly, but pointers to irq_data instead. Each interrupt is allocated an irq_desc when it is initialized, to store its metadata, and irq_data is embedded within irq_desc, so each interrupt corresponds to exactly one irq_desc and one irq_data. The Linux IRQ number (irq) and the hardware IRQ number (hwirq) can be found in the interrupt’s irq_data, and its interrupt handling function (handle_irq) in the enclosing irq_desc. irq_desc also holds a linked list of irqaction nodes, each of which stores a handler (irqaction→handler).
struct irq_data {
Here is the visualized connection of the data structures:
┌────────────────────┐
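The relationship between the three structures can also be sketched in a few lines of user-space C. The `toy_` types below are hypothetical and keep only the fields discussed in this section; the real kernel structures carry many more members, and the kernel has its own helper for going from irq_data back to irq_desc.

```c
#include <assert.h>
#include <stddef.h>

/* One node per registered device handler. */
struct toy_irqaction {
    int (*handler)(unsigned int irq, void *dev_id);
    void *dev_id;
    struct toy_irqaction *next;
};

/* The part a revmap entry points at. */
struct toy_irq_data {
    unsigned int  irq;    /* Linux IRQ number */
    unsigned long hwirq;  /* hardware IRQ number */
};

/* One descriptor per interrupt, with irq_data embedded in it. */
struct toy_irq_desc {
    struct toy_irq_data   irq_data;
    void                (*handle_irq)(struct toy_irq_desc *desc);
    struct toy_irqaction *action;  /* head of the handler list */
};

/* Recover the enclosing descriptor from a pointer to its embedded irq_data. */
struct toy_irq_desc *toy_irq_data_to_desc(struct toy_irq_data *d)
{
    assert(d != NULL);
    return (struct toy_irq_desc *)((char *)d -
                                   offsetof(struct toy_irq_desc, irq_data));
}

/* Count the handlers currently attached to a descriptor. */
size_t toy_count_actions(const struct toy_irq_desc *desc)
{
    size_t n = 0;
    for (const struct toy_irqaction *a = desc->action; a; a = a->next)
        n++;
    return n;
}
```

Because irq_data is embedded rather than pointed to, a revmap entry that holds a `struct toy_irq_data *` is enough to reach the whole descriptor, including the flow handler and the action list.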
For each interrupt, why do we need irqaction→handler if there is irq_desc→handle_irq already?
irq_desc→handle_irq is responsible for interrupt flow control and the interaction between the CPU and the interrupt controller, e.g. masking/unmasking interrupts, ACK (acknowledging the interrupt), EOI (signaling end-of-interrupt to the interrupt controller), etc. irqaction→handler, on the other hand, is responsible for device interaction, e.g. accessing the NIC, hard disk, etc. Normally, irqaction→handler is supplied by the device driver and irq_desc→handle_irq is set by the interrupt controller driver. To name a few of the latter:
handle_level_irq
handle_edge_irq
handle_fasteoi_irq
handle_percpu_devid_irq
Some call these the IRQ’s high level irq event handlers. Moreover, during initialization the interrupt controller driver has to provide another function for low level interrupt processing, such as reading the hardware IRQ number or looking up the irq_desc for the current interrupt. The global function pointer handle_arch_irq should then point to this function. In GICv2’s case, that function is gic_handle_irq: the driver sets handle_arch_irq to the address of gic_handle_irq during initialization.
Linux Interrupt Processing Flow
Here we go through a simplified interrupt processing flow to help understand all the data structures and interrupt handlers introduced above.
An interrupt arrives, the CPU jumps to the corresponding exception vector to save the execution context, and calls the function pointed to by handle_arch_irq
That function reads the hardware IRQ number from the interrupt controller, translates it to a Linux IRQ number using the controller’s irq_domain, and retrieves the irq_desc of the interrupt
It then calls irq_desc→handle_irq (the high level irq event handler)
The high level irq event handler calls each irq_desc→action→handler in succession
Therefore, the initialization process must set up the three levels of interrupt handlers before interrupts can be handled:
The global handle_arch_irq
Called by the low level exception handling code each time an interrupt is handled. It is set up by the interrupt controller’s driver during controller initialization.
irq_desc→handle_irq
Responsible for interrupt flow control, ACK, EOI, etc. There is one for each interrupt; it is also known as the high level event handler of the interrupt.
irqaction→handler
irq_desc→handle_irq calls each handler in the linked list. There can be many irqaction→handler for each interrupt.
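The three levels can be tied together in a small user-space sketch. All `toy_` names are hypothetical, the "controller" is a single global variable, and the flow handler only counts EOIs instead of touching hardware, so this is a model of the call structure rather than of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_HWIRQS 32

struct toy_action {
    void (*handler)(unsigned int irq, void *dev_id);  /* level 3 */
    void *dev_id;
    struct toy_action *next;
};

struct toy_desc {
    unsigned int irq;                           /* Linux IRQ number */
    void (*handle_irq)(struct toy_desc *desc);  /* level 2 */
    struct toy_action *action;
};

struct toy_desc *toy_revmap[TOY_NR_HWIRQS];  /* hwirq -> descriptor */
unsigned int toy_pending_hwirq;  /* stands in for the controller's register */
unsigned int toy_eoi_count;      /* stands in for writes to the EOI register */

/* Level 2: flow control around the device handlers. */
void toy_flow_handler(struct toy_desc *desc)
{
    for (struct toy_action *a = desc->action; a; a = a->next)
        a->handler(desc->irq, a->dev_id);
    toy_eoi_count++;
}

/* Level 1: read the hwirq, translate it, dispatch to the descriptor. */
void toy_root_handler(void)
{
    unsigned int hwirq = toy_pending_hwirq;
    assert(hwirq < TOY_NR_HWIRQS);
    struct toy_desc *desc = toy_revmap[hwirq];
    if (desc && desc->handle_irq)
        desc->handle_irq(desc);
}

/* The global pointer the low level exception code would call through. */
void (*toy_handle_arch_irq)(void) = toy_root_handler;

/* Example level 3 handler: bumps the counter passed as dev_id. */
void toy_count_handler(unsigned int irq, void *dev_id)
{
    (void)irq;
    ++*(unsigned int *)dev_id;
}
```

Calling `toy_handle_arch_irq()` with `toy_pending_hwirq` set to a mapped number runs every action on that descriptor once and then records one EOI; an unmapped number is silently ignored in this sketch.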
Example: GICv2’s IPI Processing
Let’s now take a look at how GICv2 and its IPIs (inter-processor interrupts) are initialized. Each of the following subsections describes one step of the process. Note that some functions appear in more than one subsection since they do a lot of things.
It might be helpful to draw your own call graph for the functions mentioned below, as there are a lot of them.
1. Setting the global irq handler handle_arch_irq
start_kernel → init_IRQ → irqchip_init → of_irq_init
of_irq_init calls GICv2’s initialization callback gic_of_init → __gic_init_bases → set_handle_irq
static int __init __gic_init_bases(struct gic_chip_data *gic,
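Registering the root handler boils down to storing a function pointer in a global. The user-space sketch below models that with hypothetical `toy_` names. It assumes that a second registration is refused with -EBUSY once a handler has been installed, which is how the arm64 helper appears to behave; treat that detail as an assumption and check the kernel source for the exact rules.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef void (*toy_root_handler_t)(void);

unsigned int toy_default_calls;  /* bumped by the placeholder handler */
unsigned int toy_gic_calls;      /* bumped by the stand-in GIC handler */
unsigned int toy_other_calls;    /* bumped by a second, competing handler */

/* Placeholder installed until a controller driver registers itself. */
void toy_default_handle_irq(void) { toy_default_calls++; }

/* Stand-ins for two controller drivers' root handlers. */
void toy_gic_handle_irq(void)   { toy_gic_calls++; }
void toy_other_handle_irq(void) { toy_other_calls++; }

/* The global pointer called on every interrupt. */
toy_root_handler_t toy_handle_arch_irq = toy_default_handle_irq;

/* Install `handler` as the root handler; only the first caller wins. */
int toy_set_handle_irq(toy_root_handler_t handler)
{
    assert(handler != NULL);
    if (toy_handle_arch_irq != toy_default_handle_irq)
        return -EBUSY;
    toy_handle_arch_irq = handler;
    return 0;
}
```

In this model the primary controller calls `toy_set_handle_irq(toy_gic_handle_irq)` once during its initialization, and every later interrupt enters through `toy_handle_arch_irq()`.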
2. Allocate the irq_domain and its revmap for the interrupt controller
Similarly in __gic_init_bases: __gic_init_bases → gic_init_bases → irq_domain_create_linear → __irq_domain_add
static int gic_init_bases(struct gic_chip_data *gic,
3. Allocate Linux IRQ numbers and irq_descs by checking IPI’s hardware IRQ number
__gic_init_bases → gic_smp_init → __irq_domain_alloc_irqs → irq_domain_alloc_descs → __irq_alloc_descs → alloc_descs → alloc_desc & irq_insert_desc
static __init void gic_smp_init(void)
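Allocating Linux IRQ numbers for a group of interrupts amounts to finding a run of consecutive free numbers and marking them as used. The sketch below shows that idea in user-space C with a plain boolean array; the `toy_` names, the table size and the counts in the usage note are assumptions for illustration, and the kernel uses its own bitmap helpers rather than this loop.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_IRQS 64

bool toy_allocated[TOY_NR_IRQS];  /* true = Linux IRQ number already in use */

/*
 * Find `cnt` consecutive free Linux IRQ numbers at or above `from`,
 * mark them as used and return the first one, or -1 if no run fits.
 */
int toy_alloc_irqs(unsigned int from, unsigned int cnt)
{
    assert(cnt > 0);
    for (unsigned int start = from; start + cnt <= TOY_NR_IRQS; start++) {
        unsigned int i;

        /* Count how many numbers starting at `start` are free. */
        for (i = 0; i < cnt && !toy_allocated[start + i]; i++)
            ;
        if (i == cnt) {
            for (i = 0; i < cnt; i++)
                toy_allocated[start + i] = true;
            return (int)start;
        }
    }
    return -1;
}
```

For example, with numbers 0 and 3 already taken, a request for 8 numbers starting the search at 1 skips the short gap at 1..2 and returns 4, reserving 4..11.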
4. Set up IPI’s irq_desc→handle_irq in GICv2’s callback
The previous step only allocated the irq_desc; the next step is to set irq_desc→handle_irq.
__irq_domain_alloc_irqs → irq_domain_alloc_irqs_hierarchy → gic_irq_domain_alloc → gic_irq_domain_map → irq_domain_set_info → __irq_set_handler
int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
5. Create revmap for translating hardware IRQ number → Linux IRQ number
__irq_domain_alloc_irqs → irq_domain_insert_irq → irq_domain_set_mapping
int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
6. Allocate and insert irqaction in linked list irq_desc→action, in request_percpu_irq
Next, gic_smp_init → set_smp_ipi_range calls request_percpu_irq to allocate an irqaction and insert it in irq_desc→action
void __init set_smp_ipi_range(int ipi_base, int n)
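Inserting an irqaction into the list can be pictured as a plain singly linked list append. The following user-space sketch uses hypothetical `toy_` names and leaves out everything the kernel does around the insertion (allocation, locking, flag checks); it only shows the pointer manipulation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down action node. */
struct toy_irqaction {
    const char *name;
    struct toy_irqaction *next;
};

/* Append `new_action` at the tail of the list whose head pointer is `*head`. */
void toy_add_action(struct toy_irqaction **head,
                    struct toy_irqaction *new_action)
{
    assert(head != NULL && new_action != NULL);

    /* Walk the "pointer to next" links until the terminating NULL. */
    struct toy_irqaction **p = head;
    while (*p)
        p = &(*p)->next;

    new_action->next = NULL;
    *p = new_action;
}
```

Walking a pointer-to-pointer means the empty list needs no special case: for the first action, `p` still points at the head itself.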
Linux Interrupt Handling Concepts not Mentioned
threaded IRQs
chained interrupt controllers
nested interrupt controllers
shared IRQs
spurious interrupts
softirqs
tasklets
workqueues