Kernel oops (unhandled page fault) on Ubuntu kernel #338

Open
Kazurin-775 opened this issue Apr 24, 2022 · 1 comment

Comments

@Kazurin-775

On a VM booted from the Ubuntu 20.04 LTS cloud image, an unhandled page fault occurs in the guest kernel when the vmsh kernel library is unloaded from the guest address space:

[   39.862903] BUG: unable to handle page fault for address: ffffffff800012e2
[   39.863866] #PF: supervisor instruction fetch in kernel mode
[   39.864637] #PF: error_code(0x0010) - not-present page
[   39.865365] PGD 7eb80e067 P4D 7eb80e067 PUD 7eb80f063 PMD 0
[   39.866143] Oops: 0010 [#1] SMP PTI
...
[   39.881668] Call Trace:
[   39.882094]  ? process_one_work+0x1eb/0x3b0
[   39.882723]  ? worker_thread+0x4d/0x400
[   39.883314]  ? kthread+0x104/0x140
[   39.883845]  ? process_one_work+0x3b0/0x3b0
[   39.884475]  ? kthread_park+0x90/0x90
[   39.885046]  ? ret_from_fork+0x35/0x40

The fault address 0xffffffff800012e2 points into the code of libstage1.so. The disassembly around that location reads as follows:

    12d3:       48 8d 3d 73 a4 3e 00    lea    0x3ea473(%rip),%rdi        # 3eb74d <_fini+0x3e9979>
    12da:       31 c0                   xor    %eax,%eax
    12dc:       ff 15 a6 bc 3e 00       callq  *0x3ebca6(%rip)        # 3ecf88 <_printk>
--> 12e2:       48 83 c4 58             add    $0x58,%rsp
    12e6:       5b                      pop    %rbx
    12e7:       41 5c                   pop    %r12
    12e9:       41 5d                   pop    %r13
    12eb:       41 5e                   pop    %r14
    12ed:       41 5f                   pop    %r15
    12ef:       5d                      pop    %rbp
    12f0:       c3                      retq

which corresponds to the function tail after the following statement:

vmsh/src/stage1/src/lib.rs

Lines 587 to 588 in cfbb612

printkln!("stage1: finished");
}

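The faulting instruction is the one immediately after the call to _printk, so the fault happens when _printk returns into libstage1.so. It seems that the vmsh kernel library is unmapped before the stage1 kernel worker has run to completion, which looks like a bug.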
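The address arithmetic behind this reading can be checked with a minimal sketch. The load base 0xffffffff80000000 is an assumption inferred from the fault address and the offsets in the listing, not a value taken from the vmsh source, and both helper functions are hypothetical names introduced only for illustration.

```rust
/// Offset of `addr` inside an image loaded at `base`.
/// Returns None if `addr` lies below `base`.
fn image_offset(addr: u64, base: u64) -> Option<u64> {
    addr.checked_sub(base)
}

/// Target of a RIP-relative operand: the address of the *next*
/// instruction plus the sign-extended 32-bit displacement.
fn rip_relative_target(next_insn: u64, disp: i32) -> u64 {
    next_insn.wrapping_add(disp as i64 as u64)
}

fn main() {
    // Fault address from the oops message.
    const FAULT_ADDR: u64 = 0xffff_ffff_8000_12e2;
    // ASSUMPTION: load base of libstage1.so, inferred from the listing.
    const ASSUMED_BASE: u64 = 0xffff_ffff_8000_0000;

    // The fault lands on the instruction right after the callq at 0x12dc.
    let off = image_offset(FAULT_ADDR, ASSUMED_BASE).expect("address below base");
    assert_eq!(off, 0x12e2);

    // callq *0x3ebca6(%rip) at 0x12dc is 6 bytes long, so RIP is 0x12e2;
    // the slot it reads is the one objdump annotates as <_printk>.
    assert_eq!(rip_relative_target(0x12e2, 0x3ebca6), 0x3ecf88);

    println!("fault offset in image: {:#x}", off);
}
```

Under that assumed base, the fault offset equals the return address of the _printk call, which is consistent with "supervisor instruction fetch" and "not-present page" in the oops: the call itself succeeded, and the page holding the caller was gone by the time it returned.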


Commands to reproduce the error:

qemu-system-x86_64 --nographic -m 32G --machine 'q35,accel=kvm' --hda './focal-server-cloudimg-amd64.img'

cargo run attach --stage2-path /tmp/vmsh -f ../linux/nixos.ext4 `pidof qemu-system-x86_64`

# Use Ctrl-C to terminate vmsh when it says "stage1 driver started"

Logs: kernel-oops.log, vmsh.log

@Kazurin-775
Author

Sorry, I forgot an important point: to trigger the oops, the kernel has to actually print the message stage1: finished to the console, e.g. by raising the console log level:

echo 7 | sudo tee /proc/sys/kernel/printk

Even with this setting, the bug cannot be reproduced on the kernel shipped with VMSH.
