Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fan smoothing and syncing #139

Closed

Conversation

curiousercreative
Copy link
Contributor

@curiousercreative curiousercreative commented Jan 19, 2021

NOTES

  • I'm a C newb and very interested in how this could be refactored for elegance, accuracy, etc.
  • First PR on this repo. I developed this off the commit for my current firmware (galp5 2020-12-15_79d6fa7) and later rebased to master. Please advise improvements to the PR itself.

CHANGES

I wrote this for my new galp5 to address a common complaint of the fans jumping from 0 to 100 and back abruptly. This PR adds support for new build flags:

  • SYNC_FANS will read target fan speeds from both CPU and GPU fan curves, take the max and set both fans to this speed.
  • FAN_SMOOTHING sets the number of fan speed event loops that a 0 - 100 speed ramp should be smoothed across. Effectively, this determines the maximum fan speed adjustment for each event loop to prevent audible stepping. A high number yields a slower and smoother ramp while a low number yields a fast response.
  • FAN_SMOOTHING_UP sets the fan smoothing (as described above) for the fan speed increases
  • FAN_SMOOTHING_DOWN sets the fan smoothing (as described above) for the fan speed decreases

See updated galp5 board config for an example of these new values: https://github.com/system76/ec/pull/139/files#diff-198f0231dadddcc1692e1557f98782a282a0e540eb1664fd95402cb0574caab8R43

Attempts to fix system76/firmware-open#149 and system76/firmware-open#148

@curiousercreative
Copy link
Contributor Author

Sensor graphs

2 minutes of prime95 single thread

2 minutes of prime95 single thread

Geekbench 5 CPU benchmark run

Geekbench 5 CPU benchmark run

board.mk values

# sync GPU fan speed to CPU fan speed
CFLAGS+=-DSYNC_FANS=1
# smooth the fan speed updates such that 0 to full speed happens over this period (divide by 4 for seconds)
CFLAGS+=-DFAN_SMOOTHING=50

# Custom fan curve
CFLAGS+=-DBOARD_HEATUP=3
CFLAGS+=-DBOARD_COOLDOWN=10
CFLAGS+=-DBOARD_FAN_POINTS="\
	FAN_POINT(65, 25), \
	FAN_POINT(70, 50), \
	FAN_POINT(75, 75), \
	FAN_POINT(80, 100), \
"

@ZeddieXX
Copy link

I think you have a better solution than mine.

If I'm understanding you correctly, even FAN_POINT(80, 100) shouldn't actually hit 100% fan speeds at 80C if it starts ramping up slowly and smoothly, which may bring the temps back down before it gets too hot.

Along with the boot entries/nvram issue and this improved fan curve, I am looking forward to the next firmware update. :)

@curiousercreative
Copy link
Contributor Author

@ZeddieXX on a machine with an adequate cooling system (like a big tower), this might be able to prevent the CPU from ever hitting 80C, but tiny laptops like our Galago Pros lack the cooling capabilities to do the same. In reality, our CPUs exceed 80C within a second! No joke, 40C to 80C in a second. My fan points do their best to keep the CPU below 80C, but under load they'll still dance around the 88C thermal ceiling of "balanced" power mode.

@curiousercreative
Copy link
Contributor Author

For myself, I've found that I prefer a quicker ramp up and a slower ramp down of fan speeds and this commit allows for that configuration. Happy to include that in this PR if it's not too messy.

@jackpot51
Copy link
Member

I generally like this. Is there a reason to re-enable interpolation? Interpolation was disabled so that fan noises would be more steady under load, with this change I would think it would oscillate upwards and downwards

@curiousercreative
Copy link
Contributor Author

@jackpot51 in my experience, there's nothing more annoying than a constant workload that sits on a fan point, causing the fan speed to infinitely loop between lower speed (thereby increasing temp) and higher speed (decreasing temp). You are of course correct that with interpolation turned on, a non-constant workload runs a high risk of fluctuating but I'd argue that with the default fan points, you're unlikely to have the fans on for an extended duration without the presence of a fairly constant workload.

Ultimately, the interpolation doesn't NEED to be tied to smoothing and could be broken out to its own flag.

@curiousercreative
Copy link
Contributor Author

curiousercreative commented Jan 27, 2021

@jackpot51 revisiting this today as I read other redditors wish they had smoother fans. What do you think about including this commit in the PR: curiousercreative@b39bb7f? These are the values I'm running and liking: https://github.com/curiousercreative/ec/blob/galp5/src/board/system76/galp5/board.mk#L40

@jackpot51
Copy link
Member

@curiousercreative Yes, you can add that commit. I'll take a look today. I'd like to replace SYNC_FANS with a test for the GPU being present. I can do that in another commit

@curiousercreative
Copy link
Contributor Author

@jackpot51 updated

@ZeddieXX
Copy link

ZeddieXX commented Jan 28, 2021

Will this be merged upstream so all GALP5 owners will get it? Or is this considered a custom firmware that needs to be compiled and flashed by those who know to seek it out?

I hope by having @jackpot51 commenting here, there's a good chance us non-dev people will get the benefits of @curiousercreative work.

@curiousercreative
Copy link
Contributor Author

Yes, you can add that commit. I'll take a look today. I'd like to replace SYNC_FANS with a test for the GPU being present. I can do that in another commit

@jackpot51 now with a dGPU galp5 in my hands, I think that I'd prefer fan syncing for both iGPU and dGPU galp5 and here's an idea that COULD be satisfactory or ideal for all laptops actually:

  1. Sync fans always (I believe Apple does this)
  2. Keep separate fan curves for CPU and GPU to allow finer thermal targets
  3. Move fan duty setting into the fan module itself while each GPU and CPU/PECI module are responsible for mapping their thermal sensor to a fan duty (as they currently do)
  4. fan module will take the max fan duty between CPU and GPU. If GPU is not present, the duty will always be 0, so we'll always be syncing fans to whatever the CPU fan duty is

Example

Example GPU fan curve

  • 70C = 25%
  • 80C = 50%
  • 90C = 100%

Example CPU fan curve

  • 65C = 10%
  • 70C = 10%
  • 72C = 20%
  • 80C = 80%
  • 88C = 100%

Mock scenarios

  • While GPU = 50C and CPU = 70C, both fans are set to 10%
  • While GPU = 70C and CPU = 70C, both fans are set to 25%
  • While GPU = 70C and CPU = 60C, both fans are set to 25%
  • While GPU = 0C and CPU = 88C, both fans are set to 100%

@jackpot51
Copy link
Member

I agree, given that all the supported laptops have a shared heatpipe system between the CPU and GPU. Increases in CPU heat should result in spinning the GPU fan a bit more and vice versa.

@curiousercreative
Copy link
Contributor Author

curiousercreative commented Feb 10, 2021

@jackpot51 updated with smarter fan syncing. Feel free to edit for any reason. I've flashed it and so far so good. These are the build flags I'm using (I generally keep that branch updated with what I'm running, rebased off the latest firmware release for galp5). NOTE: I've also cherry-picked this: #142

@curiousercreative
Copy link
Contributor Author

@jackpot51 anything I can do to get this PR ready for review?

@gregor160300
Copy link

gregor160300 commented Feb 20, 2021

@curiousercreative this pull request breaks building for any model without a dgpu (maybe even every model other than galp5)
with:

src/board/system76/common/dgpu.c:106: error 112: function 'PWM_DUTY' implicit declaration
src/board/system76/common/dgpu.c:106: error 101: too many parameters

I am trying to figure out how to fix it, I'll let you know if I find a fix.

EDIT: It is an easy fix, in dgpu.c the fan.h should be included regardless of HAVE_DGPU since that is where PWM_DUTY is defined.

@curiousercreative
Copy link
Contributor Author

@gregor160300 thanks! What model was the build failing for (so that I may verify the fix)?

@gregor160300
Copy link

@gregor160300 thanks! What model was the build failing for (so that I may verify the fix)?

I tried with both darp7 and lemp10 both failed before only tried again with darp7 after the fix, I did add the fansmoothing up and down flags to the makefile of darp7 so it might be worth testing also without adding extra cflags

@curiousercreative
Copy link
Contributor Author

@gregor160300 thanks again for testing this out. I've updated with your fix and another that I found.

@ZeddieXX
Copy link

Does this mean it's ready to be merged? ^_^

@jackpot51
Copy link
Member

I don't think it uses or needs board_detect, that could be removed.

@curiousercreative
Copy link
Contributor Author

curiousercreative commented Feb 26, 2021

@jackpot51 will do. Do we want to continue defaulting to independent fan speeds and unsmoothed or should we default to synced fans and smoothing? I'd say synced and smoothed makes sense and removing the board detect would imply that being a default...

@curiousercreative
Copy link
Contributor Author

@jackpot51 some fun observations regarding the galp5 fans (from psensor):

  • No fan point below 25% spins the fans at all (I tested 10, 15 and 20%)
  • 25% fan point (64 of 255) spins CPU fan at about 1000RPM and dGPU at 2000RPM
  • 50% fan point (128 of 255) spins CPU fan at about 3180RPM and dGPU at 3830TTtRPM
  • 100% fan point (255 of 255) spins CPU fan at about 6178RPM and dGPU fan at about 6304RPM

@ZeddieXX
Copy link

ZeddieXX commented Mar 26, 2021

@jackpot51 Any updates on this? Very excited to see this on the next official EC firmware update.

@ZeddieXX
Copy link

@jackpot51 I just saw a new firmware just dropped for the GALP5. It doesn't list smoother fan curves or syncing, however. Is this implemented? If not, what is preventing this from going into the production firmware?

@jacobgkau
Copy link
Member

Is this implemented?

@ZeddieXX Released firmware is typically based on the master branch at the time of creation. You can tell that this PR has not been merged into master yet because it is still open and in @curiousercreative's fans-rebase branch.

@ZeddieXX
Copy link

Is this implemented?

@ZeddieXX Released firmware is typically based on the master branch at the time of creation. You can tell that this PR has not been merged into master yet because it is still open and in @curiousercreative's fans-rebase branch.

So there's no plan in fan smoothing and syncing for the master branch?

@jackpot51
Copy link
Member

@curiousercreative I'd like to see the change for syncing CPU and DGPU fan speeds in a separate PR, to make it easier to release these changes

@curiousercreative
Copy link
Contributor Author

@jackpot51 sure, I can work on that. I'll update this PR to include just smoothing and I'll open a new PR based off this one (so including fan smoothing)

@jackpot51
Copy link
Member

@curiousercreative I would prefer the other way around. A PR that just implements syncing, nothing else. The smoothing is actually far more complicated for me to approve

@curiousercreative
Copy link
Contributor Author

@jackpot51 sure, I can do it that way as well. I understand it being complicated, but I'd bet the fan smoothing is by far the more important of the two features given what I see on the subreddit.

@curiousercreative
Copy link
Contributor Author

@jackpot51 please see #177 for fan syncing

@curiousercreative
Copy link
Contributor Author

@jackpot51 speaking of fan smoothing. from this review that was just shared on reddit: https://www.reddit.com/r/linuxhardware/comments/ms8h4k/i_ditched_hp_in_favor_of_system76_here_are_my/

The fan is usually silent but when it kicks on, it can get quite loud. I wish that when temps increase, the fan slowly trickles up so the noise doesn't feel as abrupt. Display is great, but I didn't test it for color accuracy or anything.

@MilesBHuff
Copy link

MilesBHuff commented Apr 20, 2021

Warning: This PR broke the CPU and GPU temperature monitors when I merged this to master on my local. Reverting the merge fixed this issue.
Also, interpolation doesn't seem any less broken with this.
EDIT: I forgot to add the interpolation build flags. -_-

@curiousercreative
Copy link
Contributor Author

@MilesBHuff what's your board.mk look like when you observed the issues? This is what my galp5 is running with good results (including psensor CPU and GPU sensor readings): https://github.com/curiousercreative/ec/blob/galp5/src/board/system76/galp5/board.mk#L44

@curiousercreative
Copy link
Contributor Author

@MilesBHuff did you have the fan smoothing build flags in there when things weren't working?

@MilesBHuff
Copy link

@curiousercreative Ah Christ, I'm an idiot. No. But that shouldn't have affected the CPU/GPU temperature sensors.

@curiousercreative
Copy link
Contributor Author

@MilesBHuff shouldn't, but could be a regression. I don't know if I've ever run the build without the fan smoothing flags, so I can give that a shot at some point. But to your previous comment about interpolation not working, it's only enabled with fan smoothing flags set. You can try it again with smoothing flags and report back.

@MilesBHuff
Copy link

@curiousercreative I'm mildly afraid to run it knowing it may break the sensors, like before. The CPU runs very hot, and I'm afraid that the lack of sensor data might impact the ability of the fan curve to function properly. As well, I've already hard-coded a smooth fan curve in my board.mk, so I don't need interpolation anymore.

@curiousercreative
Copy link
Contributor Author

@MilesBHuff ok, no worries. FWIW, I wouldn't worry about running too hot. The CPU package internally has thermal protections, it should be pretty difficult to do it any serious harm, especially in a short time window.

This was referenced Apr 21, 2021
@nbs
Copy link

nbs commented Apr 28, 2021

I've seen no mention of the Serval WS (serw12 I think) which has atrocious fan noise for no apparent reason. Can this please be ported to that series?

@leviport
Copy link
Member

I've seen no mention of the Serval WS (serw12 I think) which has atrocious fan noise for no apparent reason. Can this please be ported to that series?

Sorry, but the Serval WS does not run open EC, so that won't be possible.

@curiousercreative
Copy link
Contributor Author

closing in favor of newer implementation #190

@curiousercreative curiousercreative deleted the fans-rebase branch May 1, 2021 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

laptop fan speed changes should be smoother
9 participants