Enable support for huge pages #6297

Closed
Botspot wants to merge 4 commits

@Botspot Botspot commented Aug 3, 2024

This is useful for certain applications that need larger chunks of RAM allocated to perform well.
I have tested these changes on a Raspberry Pi 5 and found no drawbacks or regressions.

Huge pages are enabled by default on all x86 kernels I am aware of, so enabling them here for arm64 seems reasonable.

@popcornmix
Collaborator

You'll need to provide information on what the benefits are.
e.g. provide numbers for performance benefits of specific apps, or apps that didn't run without this change.

I suspect this setting won't work well with #6273 which gets a performance benefit by deliberately making the pages more scattered (so unlikely to be able to use huge pages).

@Botspot
Author

Botspot commented Aug 3, 2024

> You'll need to provide information on what the benefits are. e.g. provide numbers for performance benefits of specific apps, or apps that didn't run without this change.

My interest in hugepages is a performance improvement of up to 33% for XMRig, a crypto mining application. See this GitHub gist for more information.
HugeTLB pages are also beneficial to SQL databases and KVM/QEMU.

> I suspect this setting won't work well with #6273 which gets a performance benefit by deliberately making the pages more scattered (so unlikely to be able to use huge pages).

Compiling the kernel with HugeTLB support does not change the default page behavior at all; it simply lets the user configure huge pages without needing to recompile the kernel.
To make use of hugepages, a user would still need to add something like default_hugepagesz=2M hugepages=1280 hugepagesz=1G hugepages=3 to /boot/firmware/cmdline.txt, otherwise there is no change AFAIK.
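
As a rough check on how much RAM that example command line would reserve, here is a minimal Python sketch. It assumes that each hugepages= count applies to the size given by the preceding hugepagesz= or default_hugepagesz= parameter; that is my reading of the kernel's HugeTLB documentation, not something stated in this thread. The function name reserved_mib is made up for illustration.

```python
import re

# Size suffixes accepted by this sketch; values are in MiB.
_UNIT_MIB = {"K": 1 / 1024, "M": 1, "G": 1024}


def reserved_mib(cmdline: str) -> dict:
    """Return {page size string: MiB reserved} for a kernel command line.

    Assumption: a hugepages=N entry applies to the most recent
    hugepagesz= or default_hugepagesz= entry before it.
    """
    reserved = {}
    current_size = None
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in ("hugepagesz", "default_hugepagesz"):
            current_size = value.upper()
        elif key == "hugepages" and current_size is not None:
            match = re.fullmatch(r"(\d+)([KMG])", current_size)
            if match is None:
                continue  # unrecognised size: skip rather than guess
            size_mib = int(match.group(1)) * _UNIT_MIB[match.group(2)]
            reserved[current_size] = reserved.get(current_size, 0) + int(value) * size_mib
    return reserved


example = "default_hugepagesz=2M hugepages=1280 hugepagesz=1G hugepages=3"
totals = reserved_mib(example)
print(totals, "total MiB:", sum(totals.values()))
```

Under that assumption the example reserves 2560 MiB of 2M pages plus 3072 MiB of 1G pages, 5.5 GiB in total, which is worth comparing against the RAM on the board before editing cmdline.txt.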

Bear in mind that if you are not comfortable with potential ramifications of this PR, my specific application would continue to work fine without these lines:

CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
# CONFIG_READ_ONLY_THP_FOR_FS is not set

If you would like for me to remove these lines and only leave behind CONFIG_HUGETLBFS=y and CONFIG_HUGETLB_PAGE=y, I can do that.

@popcornmix
Collaborator

I've done some testing with Geekbench on a Pi 5, using the default 16k page size kernel config.
Scores are reported as single-core/multi-core.

default (current kernel)
780/1457

config HUGE (this PR)
789/1461

config HUGE + cmdline default_hugepagesz=2M hugepages=1280 hugepagesz=1G hugepages=3
783/1436

NUMA (#6273)
898/2026

NUMA+config HUGE
861/1857

NUMA+config HUGE+cmdline default_hugepagesz=2M hugepages=1280 hugepagesz=1G hugepages=3
849/1858

So, this PR is not helping this test case, and it defeats the much more promising (NUMA) option.
Note: it harms the NUMA PR just by enabling the config option - even without adding it to cmdline.txt.
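
To put the scores above into relative terms, here is a small Python sketch that computes percentage changes between configurations. The numbers are copied from the results listed above; the dictionary keys and the helper names pct_change and compare are labels chosen for illustration only.

```python
# Geekbench (single-core, multi-core) scores copied from the results above.
scores = {
    "default": (780, 1457),
    "HUGE": (789, 1461),
    "HUGE+cmdline": (783, 1436),
    "NUMA": (898, 2026),
    "NUMA+HUGE": (861, 1857),
    "NUMA+HUGE+cmdline": (849, 1858),
}


def pct_change(new: float, base: float) -> float:
    """Percentage change of new relative to base."""
    return (new - base) / base * 100.0


def compare(name: str, baseline: str) -> tuple:
    """Return (single-core %, multi-core %) change of name vs baseline."""
    new_single, new_multi = scores[name]
    base_single, base_multi = scores[baseline]
    return pct_change(new_single, base_single), pct_change(new_multi, base_multi)


for name, baseline in [("HUGE", "default"), ("NUMA", "default"), ("NUMA+HUGE", "NUMA")]:
    single, multi = compare(name, baseline)
    print(f"{name} vs {baseline}: single {single:+.1f}%, multi {multi:+.1f}%")
```

On these figures the huge page config alone moves the scores by about 1% or less, while adding it on top of NUMA costs roughly 4% single-core and 8% multi-core relative to NUMA alone.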

@pelwell
Contributor

pelwell commented Aug 6, 2024

Given the amount of compute time involved in crypto mining, a bit extra to compile your own optimised kernel every now and again doesn't seem unreasonable. Based on @popcornmix's findings, this is a "No".

@pelwell pelwell closed this Aug 6, 2024
@theofficialgman

theofficialgman commented Aug 30, 2024

@pelwell Please reconsider

CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

The above options are enabled on pretty much all Linux systems, as @Botspot indicated earlier, because a variety of userspace applications benefit from them. Left at their stock configuration, they have no performance downside, even for the few applications that huge pages affect negatively.

For a detailed explanation on some of the benefits see the below:
https://www.netdata.cloud/blog/understanding-huge-pages/

@popcornmix's tests are invalid because they were run with the transparent huge pages "always" option (TRANSPARENT_HUGEPAGE_ALWAYS) enabled; the article explains this as well. Had this option not been enabled, no regressions would have been observed.

@popcornmix
Collaborator

I've retested and TRANSPARENT_HUGEPAGE_MADVISE doesn't hurt performance for NUMA.
TRANSPARENT_HUGEPAGE_ALWAYS does hurt performance for NUMA.

@Botspot This PR currently enables TRANSPARENT_HUGEPAGE_ALWAYS. What was the reason for that?

Is it correct the only option that you need is CONFIG_HUGETLBFS?
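
For anyone reproducing the MADVISE versus ALWAYS comparison, the mode a running kernel is actually using can be read from sysfs. The sketch below, in Python, parses the usual `always [madvise] never` format of /sys/kernel/mm/transparent_hugepage/enabled, where the bracketed word is the active mode. The helper names are illustrative, and the file is only present when the kernel was built with CONFIG_TRANSPARENT_HUGEPAGE.

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed mode from text such as 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def read_thp_mode(path: Path = THP_ENABLED):
    """Return the running kernel's THP mode, or None if the file cannot be read
    (for example because THP is not built into the kernel)."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None


print(read_thp_mode())
```

The same file accepts writes from root, so the mode can be switched at runtime for a quick A/B benchmark without rebuilding the kernel.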

@Botspot
Author

Botspot commented Sep 5, 2024

> I've retested and TRANSPARENT_HUGEPAGE_MADVISE doesn't hurt performance for NUMA. TRANSPARENT_HUGEPAGE_ALWAYS does hurt performance for NUMA.
>
> @Botspot This PR currently enables TRANSPARENT_HUGEPAGE_ALWAYS. What was the reason for that?
>
> Is it correct the only option that you need is CONFIG_HUGETLBFS?

If I understand the XMRig docs correctly, the bare minimum needed is CONFIG_HUGETLBFS and possibly also CONFIG_HUGETLB_PAGE; I'm not sure.

My comment above did point this out, but the PR was still closed. My intention with the other proposed changes was to make the kernel more similar to those of mainline Linux distros.
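
A quick way to see whether a given kernel already has HugeTLB support, without inspecting its config, is to look at /proc/filesystems and /proc/meminfo. The Python sketch below is a hedged illustration: the function names are hypothetical, and it only reports what those two files expose (a hugetlbfs entry, plus the HugePages_* and Hugepagesize fields).

```python
def has_hugetlbfs(filesystems_text: str) -> bool:
    """True if 'hugetlbfs' is listed as a filesystem name in /proc/filesystems text."""
    for line in filesystems_text.splitlines():
        fields = line.split()
        # Each line is an optional "nodev" flag followed by the filesystem name.
        if fields and fields[-1] == "hugetlbfs":
            return True
    return False


def hugepage_counters(meminfo_text: str) -> dict:
    """Extract the HugePages_* and Hugepagesize entries from /proc/meminfo text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            counters[key] = int(rest.split()[0])
    return counters


if __name__ == "__main__":
    try:
        with open("/proc/filesystems") as fs, open("/proc/meminfo") as mem:
            print("hugetlbfs available:", has_hugetlbfs(fs.read()))
            print(hugepage_counters(mem.read()))
    except OSError as err:
        print("could not read /proc:", err)
```

If hugetlbfs is missing from the first file and no HugePages_* lines appear in the second, the kernel was most likely built without CONFIG_HUGETLBFS.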

popcornmix added a commit to popcornmix/linux that referenced this pull request Sep 26, 2024
Upstream v3d patches are adding support for big (64K) and super (1MB) pages,
which require these options. See:
https://lore.kernel.org/dri-devel/[email protected]/

There are also some potential performance benefits linked from:
raspberrypi#6297

Signed-off-by: Dom Cobley <[email protected]>