- cross-posted to:
- [email protected]
For anyone like me:
Mesa, also called Mesa3D and The Mesa 3D Graphics Library, is an open source implementation of OpenGL, Vulkan, and other graphics API specifications as well as OpenCL. Mesa translates these specifications to vendor-specific graphics hardware drivers.
Very rough example of OpenGL, with Nvidia.
On Windows:
Game makes OpenGL calls -> [Nvidia OpenGL Wrapper -> Nvidia internal driver call] -> Nvidia Firmware/Hardware
The [...] section is implemented inside nvoglv64.dll.
On Linux:
Game makes OpenGL calls -> Mesa3D OpenGL Wrapper -> Nouveau internal call -> Nvidia Firmware/Hardware
Here Nouveau is the open source Nvidia driver, but it can be radeonsi for AMD.
This stuff is like the backbone of how games work. Without it there are no pretty pixels on the screen. Really great work being done.
Maybe AMD can put some resources into fixing RDNA3 instability with the latest kernels this year? I’ve had unrecoverable system crashes (no access to TTY, keyboard/mouse locked up) when gaming with RDNA3 on any kernel >6.12, and mostly on wayland, numerous times now. I don’t undervolt/overclock. And I’m only running a 7900XT (a 2+ year old card now).
I’m basically relegated to running either on the 6.12 kernel, or run everything with gamescope just to attempt to guarantee stability and recoverability from crashes. Gamescope “works”, but seems to run best using “-f” for fullscreen mode – anything else seems to have some quirkiness for me.
I’m a 7900 launch day customer who’s been on Linux the whole time.
AMD still has not fixed the VRAM clock getting stuck at 96 or 456MHz (low power states) since day one. It’s an issue that only happens on uncommon screen setups, but every time I want to play a game I have to start the game and then replug my monitor, or change its resolution to something else and back again. Every time.
This was already a reported bug on the 6000 series cards at launch. And they’ve still not fixed it.
Can you elaborate on your display config?
You kind of alluded to part of it there; it’s not so much a bug in sw/fw as it is a hardware limitation at both the adapter and display side. The variables for displays are vertical blanking intervals (and differences between panels), as well as total display bandwidth.
With RDNA2, a feature was implemented in DAL to leverage VRR in order to allow a system with a single connected display to reach a lower mclk, and thus lower idle power draw. With RDNA3, hardware changes (MALL specifically) broadened this capability to two concurrent displays. Even then, it’s not bulletproof.
The display eng team has more or less exhaustively worked towards this over the course of RDNA3’s lifespan; their work is applicable to both Windows and Linux.
Literally anything I’ve had that was “uncommon” has had this issue.
These setups have worked fine: Single, dual, and triple 1080p and 1440p, regardless of Hz. Single 4k, also regardless of Hz.
With all of these setups my MCLK gets stuck at 96MHz or 456MHz and needs a reconnect or a resolution/refresh rate change to “unstick” so I can get a higher MCLK and actually play games: Dual and triple 4K, regardless of Hz. Dual/triple 4K with vertical 1080, regardless of Hz. Dual/triple 4K with dual vertical 1080, regardless of Hz.
It’s strange that you say that they’ve worked specifically to get LOWER MCLK rates for single/dual displays, when my issue is specifically that my MCLK is stuck at low MCLK with “uncommon” display setups? Almost sounds like you’re saying they’ve specifically worked to make my issue even worse lmao.
And ofc the issue is nonexistent on windows.
Oh sorry, I misunderstood, so you actually get locked into a low mclk under specific display configurations? I’ve genuinely never heard of or personally experienced that across a breadth of hw and sw configs.
I’m wondering if it could be worth probing the power play sysfs interface or hwmon the next time this happens to try and understand what’s happening there.
Do you use client apps to interact with tuning settings like LACT? Can you link me to an existing bug report so I can follow up with engineering?
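For anyone who wants to try the sysfs probing suggested above, here is a minimal sketch of reading the memory clock states. It assumes the amdgpu driver exposes `pp_dpm_mclk` under `/sys/class/drm/card*/device/` and that each line looks like `0: 96Mhz *`, with the asterisk marking the active level. The function names and the exact line format are assumptions for illustration, not something taken from this thread.

```python
import glob
import re

# Assumed line format: "<index>: <clock>Mhz", with a trailing "*" on the active level.
_LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*mhz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (index, mhz, is_active) tuples from pp_dpm_* style text."""
    levels = []
    for line in text.splitlines():
        match = _LEVEL_RE.match(line)
        if match:
            levels.append(
                (int(match.group(1)), int(match.group(2)), match.group(3) is not None)
            )
    return levels


def active_mhz(levels):
    """Return the clock of the level marked active, or None if none is marked."""
    for _, mhz, is_active in levels:
        if is_active:
            return mhz
    return None


def read_mclk_levels(pattern="/sys/class/drm/card*/device/pp_dpm_mclk"):
    """Read every matching sysfs file. Returns {path: levels}, empty if none exist."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "r", encoding="ascii", errors="replace") as handle:
                result[path] = parse_dpm_levels(handle.read())
        except OSError:
            # Unreadable file (permissions, card went away): skip it.
            continue
    return result


if __name__ == "__main__":
    for path, levels in read_mclk_levels().items():
        print(path, "active MCLK:", active_mhz(levels), "MHz; levels:", levels)
```

Running this once while the game is stuttering and once after the replug/refresh-rate change would show whether the active level actually moves off the lowest entry.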
So I never personally raised a bug report, but this issue was the one I saw when I first got the gpu: https://gitlab.freedesktop.org/drm/amd/-/issues/2460 Exact same issue I’ve always had, MCLK gets stuck at 96MHz and refuses to budge, but since it was relatively easily solved with a quick script that just changes one of my monitors between 60hz and 59.97hz I just accepted that 3 second operation as part of what I had to do to play games lol.
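A rough sketch of what such a refresh-rate toggle could look like, for anyone who wants to replicate the workaround. This is not the actual script from this comment: it assumes an X11 session where `xrandr` is available (on Wayland you would need your compositor's own tool instead), and the output name `DP-1` and mode `3840x2160` are hypothetical placeholders.

```python
import subprocess
import time


def build_toggle_commands(output, mode, normal_rate, nudge_rate):
    """Build two xrandr invocations: switch to nudge_rate, then back to normal_rate."""
    base = ["xrandr", "--output", output, "--mode", mode, "--rate"]
    return [base + [str(nudge_rate)], base + [str(normal_rate)]]


def toggle_refresh(output, mode, normal_rate=60.0, nudge_rate=59.97,
                   pause=1.0, dry_run=True):
    """Nudge the refresh rate and restore it. With dry_run=True, only print the commands."""
    commands = build_toggle_commands(output, mode, normal_rate, nudge_rate)
    for index, command in enumerate(commands):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
            if index == 0:
                # Give the display a moment to settle before switching back.
                time.sleep(pause)
    return commands


if __name__ == "__main__":
    # "DP-1" and "3840x2160" are placeholders; check `xrandr --query` for real values.
    toggle_refresh("DP-1", "3840x2160", dry_run=True)
```

It defaults to a dry run so nothing changes until you have confirmed the output name and mode for your own setup.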
There’s also a level1techs thread with people with similar issues here: https://forum.level1techs.com/t/linux-really-sucks-right-now/223700 And also this Bazzite issue: https://github.com/ublue-os/bazzite/issues/1070
I can replicate the issue, though right now I’ve got the opposite issue since getting a 4K144Hz monitor that ends with the MCLK being stuck at maximum at all times, which tbh I’m fine with as I don’t care about the efficiency, I need the heating in this -30c weather anyway.
Thanks for these, I’ll discuss with the DAL team when I get the chance
Maybe AMD can put some resources into fixing RDNA3 instability with the latest kernels this year? I’ve had unrecoverable system crashes (no access to TTY, keyboard/mouse locked up) when gaming with RDNA3 on any kernel >6.12, and mostly on wayland, numerous times now.
7900XT
There may be some kind of regression, but as a data point, I run 6.12.48+deb13, which is what Debian stable uses, on a 7900 XTX and haven’t had stability problems with games.
It could be about 1,000 different things, including hardware issues completely unrelated to the OS. I also have a PC with a 7900XTX on Linux 6.18.2, using Plasma/Wayland and I’ve never had an unrecoverable system crash. Two of the other people that I game with are also running the exact same setup (Arch,btw/Linux6.18.2/Plasma/Wayland) without issue.
Blaming graphics cards sometimes feels like a meme. It’s like when someone has any kind of problem and happens to mention that they use NVIDIA: you’ll see a huge portion of commenters, with nothing else to add, jump in to imply that it’s probably the NVIDIA card.
In fairness, distros do include their own patches, and it’s possible that there is some regression that showed up somewhere above 6.12 and he’s running a vanilla kernel and the Arch guys put in some patch that fixes whatever he’s seeing that isn’t in vanilla or something. Or maybe he’s using a distro that includes some kind of kernel patch that introduces the problem, if he’s not building his own kernels. I mean, I’ve got no idea what the guy is running; he doesn’t say.
I agree.
I don’t doubt that it could be the graphics card. It is just that we don’t have the information to say for sure and their description leaves a lot of other possibilities.
Well, when the crash happens and the screen shows an enormous amount of green artifacting, it sure looks like a graphics driver and/or kernel issue to me. But yeah, I’m probably wrong because your experience is pristine. And the many hours of research I’ve done on this subject lead to the conclusion that there are issues with the 6.13+ kernels, the amdgpu driver, and Wayland, especially with more than one display connected (despite setting the refresh rates to the lowest common denominator).
But I’m sure I’m wrong and it’s something completely different. Thanks for the insight.
Gosh, you sure got me. I certainly look like an idiot basing my opinion on what you wrote.
Thanks to your sarcastic reply I’ve learned to read the mind of commenters before replying.
What an idiot I am for not realizing all of the troubleshooting steps that you’ve taken simply because you never mentioned them.
👍
Yeah, same. At first it was just Nioh 2 (and, fair. That game is such a technical mess that it is a miracle to run it under normal circumstances) but I’ve increasingly had crashes in Warframe which makes me REAL sad.
Was aware of the kernel bypass. Wasn’t aware that gamescope had an impact so will definitely try that. Thanks
I had the same experience. Bazzite 41 or 42 (can’t remember) fixed it for me.
I run cachyos. I know that they like to tweak the kernel and change numerous settings which can make gaming less stable in my specific scenario. I think I’m going to just run everything via gamescope now since I seem to have the proper flags to make it just work without oddities for me (I liked being borderless, but that causes issues with some games, so I am avoiding that now). And then see if I get any crashes. If I continue to see crashes, hopefully recoverable this time, then I may stick with linux-lts kernel instead of using cachy’s.
Really? I thought it would be Nvidia! /s







