Linux version: v6.0
Architecture: ARMv8
KVM flavor: NVHE
Introduction
During the 5.10 release cycle, KVM ARM received many code improvements in preparation for Google’s pKVM project, including the EL2 per cpu variables. Because there is quite a lot to discuss, I decided to split this topic into two posts:
Definition & Usage
Per cpu variable initialization
Besides these, there are many related aspects such as barriers, preemption, and interrupts. We’re not going to talk about them here, since I don’t think I have a solid enough understanding of them.
Per CPU Variables
Per CPU variables are simply variables of which each CPU core has its own local copy. An access always goes to the copy belonging to the running CPU, which avoids data races between CPUs on that data.
There are already a lot of resources online about the Linux kernel’s per cpu variables, so this series of posts will focus on KVM ARM’s per cpu variable implementation for the EL2 environment.
Definition
The API to define an EL2 per cpu variable is the same as the normal one. Here’s an example:
// file: arch/arm64/kvm/hyp/nvhe/switch.c
This macro expands into:
__attribute__((section(".data..percpu" ""))) __typeof__(unsigned long) kvm_hyp_vector;
The result is also the same as for a regular per cpu variable. So, do EL2 and EL1 use the same section for per cpu variables? The answer is no: code written for KVM EL2 is linked using a special linker script, arch/arm64/kvm/hyp/nvhe/hyp.lds. This linker script renames the input sections, effectively separating the kernel and the hypervisor.
You’ll find that hyp.lds does not appear in the source tree. This is because hyp.lds is generated at compile time from hyp.lds.S in the same directory.
This is the hyp.lds generated using defconfig:
SECTIONS {
Note that kvm_hyp_vector, used in the previous example, is moved from the .data..percpu section to .hyp.data..percpu.
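To make the role of the section attribute more concrete, here is a minimal userspace sketch, assuming GCC or Clang with GNU ld on an ELF target. The macro DEMO_DEFINE_PER_CPU, the section name demo_percpu, and all demo_* names are hypothetical and not part of the kernel. The sketch relies on a GNU ld convenience: for a section whose name is a valid C identifier, the linker provides __start_<name> and __stop_<name> symbols. The kernel's section names such as .data..percpu do not qualify for this, so its linker scripts have to mark section boundaries themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for DEFINE_PER_CPU(): it only
 * places the variable into a dedicated section. */
#define DEMO_DEFINE_PER_CPU(type, name) \
    __attribute__((section("demo_percpu"), used)) __typeof__(type) name

DEMO_DEFINE_PER_CPU(unsigned long, demo_vector);
DEMO_DEFINE_PER_CPU(int, demo_counter);

/* Provided by GNU ld because "demo_percpu" is a valid C identifier. */
extern char __start_demo_percpu[];
extern char __stop_demo_percpu[];

/* Returns 1 if addr lies inside the demo_percpu section, 0 otherwise. */
int demo_in_percpu_section(const void *addr)
{
    uintptr_t p = (uintptr_t)addr;

    return p >= (uintptr_t)__start_demo_percpu &&
           p < (uintptr_t)__stop_demo_percpu;
}

/* Size in bytes of everything the linker gathered into the section. */
size_t demo_percpu_section_size(void)
{
    return (size_t)(__stop_demo_percpu - __start_demo_percpu);
}
```

Because all variables defined this way end up contiguous in one section, the whole region can be treated as a template and handled as a single block, which is what makes a section-based layout convenient for per cpu data.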
Usage
Here’s a per cpu variable usage example:
// file: arch/arm64/kvm/hyp/nvhe/switch.c
this_cpu_ptr(&var) returns the address of the running CPU’s copy of the per cpu variable var.
This line of code goes through many layers of nested macros. It expands into (just skim through it):
do {
We’re not interested in write_sysreg() and hcr_el2, so remove them:
// macro expansion of this_cpu_ptr(&kvm_init_params):
You can see that the basic mechanism behind per cpu variables is taking a base pointer and then adding a CPU-specific offset to get the address of the local copy.
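The base-plus-offset idea can be illustrated with a small userspace sketch. This is only a simplified analogue, not the kernel's code: all demo_* names are hypothetical, the copies live in a plain array rather than in separately set up per cpu regions, and an ordinary global variable stands in for whatever the real implementation reads the CPU-specific offset from. It assumes GCC or Clang, since it uses __typeof__.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4  /* hypothetical CPU count for the demo */

/* Hypothetical per cpu data, standing in for something like kvm_init_params. */
struct demo_params {
    unsigned long hcr;
};

/* One copy per CPU. Copy 0 plays the role of the "base" variable. */
static struct demo_params demo_copies[DEMO_NR_CPUS];

/* Stand-in for the per cpu offset of the CPU we pretend to run on. */
static unsigned long demo_cpu_offset;

/* Pretend that execution has moved to the given CPU. */
void demo_set_cpu(unsigned int cpu)
{
    assert(cpu < DEMO_NR_CPUS);
    demo_cpu_offset = cpu * sizeof(struct demo_params);
}

/* Simplified analogue of this_cpu_ptr(): base address plus CPU offset. */
#define demo_this_cpu_ptr(base) \
    ((__typeof__(base))((uintptr_t)(base) + demo_cpu_offset))

/* Write to the copy that belongs to the current (pretend) CPU. */
void demo_write_hcr(unsigned long value)
{
    demo_this_cpu_ptr(&demo_copies[0])->hcr = value;
}

/* Read from the copy that belongs to the current (pretend) CPU. */
unsigned long demo_read_hcr(void)
{
    return demo_this_cpu_ptr(&demo_copies[0])->hcr;
}
```

Calling demo_set_cpu(1) followed by demo_write_hcr(0x11) only touches demo_copies[1]. The code that accesses the variable never names a CPU; the offset alone decides which copy is used.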
The implementation of __hyp_my_cpu_offset():
static inline unsigned long __hyp_my_cpu_offset(void)
It reads tpidr_el2 to get the per cpu offset. tpidr_el2 is a system register that the hardware itself does not interpret, so software can use it as it sees fit. KVM ARM uses it to store the per cpu offset of each CPU.
Lastly, let’s look at how per cpu variables are used in assembly:
// file: arch/arm64/kvm/hyp/nvhe/host.S
This is also a macro. It places the address of the per cpu variable struct kvm_host_data kvm_host_data into x0; x1 is a temporary register. The macro expands into:
// the first two instructions read the address of kvm_host_data (base)
It uses tpidr_el2 as the offset, just like the C implementation.