[Megathread] Ryzen RAM OC + possible limitations

Set your logger to capture everything. Do not trust HWInfo to capture WHEA.
Do not trust the OS to capture WHEA as a fault condition.
Set it to verbose so it actually starts to capture anything.
Hey @Veii, after tweaking all night and getting rid of BCLK as suggested, with Global C-states disabled and no CPU boost, I found no way to run IF 2000 WHEA-free, no matter which PBO limits, negative CO values or voltage combinations I tried 🤕
In TM5 runs I found no errors, but about 10 WHEAs per minute, continuously, every time, and the WHEA party forced me to stop testing. I could only manage straight 3933 WHEA-free, for both GDM on and off at 1T 👍
So far so good. I'll save my work-in-progress setup for now; I have to go back to work for a couple of weeks, away from home. I'm still confident in my best 102.25 BCLK profile anyway, but I'll be back, thx
 

Attachments

  • 5800X3D_3933CL15-16_GearOFF_PBOLimitsOFF_CO-PerCore_LCLK DPM off.png
  • 5800X3D_3933CL14-15_GearON_PBOLimitsOFF_CO-PerCore_LCLK DPM off.png
Hi all,

I swapped my RAM once more, read a lot in this thread over the weekend, and would appreciate your opinions.

The goal was, and still is, a conservative 3800 MT/s, CL16 with good subtimings, without squeezing the last bit out of the RAM. I think that fits quite well so far?! The only thing that puzzles me a bit is the 60.7 ns latency in AIDA64. Others get lower latency with very similar settings. Could of course be down to the latest BIOS. CPU runs with Kombo Strike 3.

Overall I'm satisfied, but maybe I overlooked something. Regards.
 

Attachments

  • Screenshot 2024-04-17 181447.png
Hey @Veii, after tweaking all night and getting rid of BCLK as suggested, with Global C-states disabled and no CPU boost, I found no way to run IF 2000 WHEA-free, no matter which PBO limits, negative CO values or voltage combinations I tried 🤕
Still no data for me to work from. I cant help you with no data.
C-states off and no boost is not a good idea,
But you can see its neither a core boost nor a memory topic.
Combining memOC with searching for the WHEA cause is not a good idea.

MCLK & FCLK are split topics
FCLK and WHEA are split topics.
FCLK doesnt create WHEA, it helps them appear due to the faster internal clock.
They can still come from 8+ sources.

Why it logged nothing on 3933 GDM on (screenshot), i dont know. Maybe OS tweaks have disabled logging.
Verbose logging is needed.

There are good and bad WHEA reports.
Successful and failed corrections can happen without causing a WHEA report to begin with.
Its just an OS logger.
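To make "capture everything" a bit more concrete, here is a minimal sketch of how one could tally WHEA-Logger events per event ID (so #18 and #19 are counted separately) instead of eyeballing Event Viewer. It assumes the events were exported as XML on Windows beforehand, e.g. with `wevtutil qe System /q:"*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]" /f:xml`, and that the export is passed in as text. The helper name `count_whea_events` is made up for illustration, and it only counts what the OS logged, which, as said above, is not the whole picture.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by Windows event log XML.
NS = "{http://schemas.microsoft.com/win/2004/08/events/event}"


def count_whea_events(xml_text: str) -> Counter:
    """Count events per EventID in exported event XML.

    The export is assumed to be a plain sequence of <Event> elements
    without a common root, so one is wrapped around it before parsing.
    """
    root = ET.fromstring(f"<Events>{xml_text}</Events>")
    counts = Counter()
    for event in root.iter(f"{NS}Event"):
        event_id = event.find(f"{NS}System/{NS}EventID")
        if event_id is not None and event_id.text:
            counts[int(event_id.text)] += 1
    return counts
```

Dividing such a count by the length of the test run gives a per-minute rate that can be compared between settings.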
So far so good. I'll save my work-in-progress setup for now; I have to go back to work for a couple of weeks, away from home. I'm still confident in my best 102.25 BCLK profile anyway, but I'll be back, thx
Alright !
Stay safe and have fun :)
 
Say, @Veii, can you please explain to me what this is supposed to be now?

Po81Nif.png


I get 65535, but 65528? lol
 
Say, @Veii, can you please explain to me what this is supposed to be now?
No.
Even "please" won't help with:
"what this is supposed to be now"

The information is somewhere on OCN.
Somewhere between the AMD 24/7 and the Intel 24/7 threads.

Apart from that, you are in the wrong thread.
AM4 does not allow raising tREFI, and what you quoted belongs to DDR5.
 
The only thing that puzzles me a bit is the 60.7 ns latency in AIDA64. Others get lower latency with very similar settings. Could of course be down to the latest BIOS. CPU runs with Kombo Strike 3.
CS 1 = -10 CO
CS 2 = -20 CO
CS 3 = -30 CO
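
As a small illustration of the list above, a sketch that treats the three Kombo Strike (CS) levels as a plain lookup table for the Curve Optimizer offset they imply. The values are the ones listed in the post; the function name is made up, and no other levels are assumed to exist.

```python
# Kombo Strike level -> Curve Optimizer offset, as listed in the post.
KOMBO_STRIKE_CO = {1: -10, 2: -20, 3: -30}


def kombo_strike_co_offset(level: int) -> int:
    """Return the Curve Optimizer offset implied by a Kombo Strike level."""
    if level not in KOMBO_STRIKE_CO:
        raise ValueError(f"unknown Kombo Strike level: {level}")
    return KOMBO_STRIKE_CO[level]
```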

Too much negative CO forces a VID correction in the new AGESAs (more is requested if it would otherwise be too little)
GDM costs latency

Suggestion:
ComboStrike 2, with:
1713423079089.png

Min SOC for 980mV VDDG:
[GET, under load >1.022v] ~ 1.05v BIOS input, since LLC exists.
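
The arithmetic behind that suggestion can be sketched as follows. Both margins are assumptions back-calculated from the single example above (980 mV VDDG, more than 1.022 V SOC under load, about 1.05 V BIOS input); they are not a specification, and the function names are made up.

```python
# Assumption: margins derived from the one example in the post, not from a spec.
LDO_HEADROOM_MV = 42  # SOC under load minus VDDG (1022 - 980)
LLC_DROOP_MV = 28     # BIOS input minus SOC under load (1050 - 1022)


def min_soc_under_load_mv(vddg_mv: int) -> int:
    """SOC voltage that should still be read under load for a given VDDG."""
    return vddg_mv + LDO_HEADROOM_MV


def suggested_bios_soc_mv(vddg_mv: int) -> int:
    """BIOS SOC input that leaves room for droop under load."""
    return min_soc_under_load_mv(vddg_mv) + LLC_DROOP_MV
```

Actual droop depends on board and LLC setting, so the under-load reading is what counts, not the BIOS input.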

Cross-test with ~attached~
If still unstable, please photograph the errors (collect 2-3 so they can be identified)
 

Attachments

  • TM5_0.12.3_1usmus25-CoolCMD.zip
But IOD issues for VDDG mostly relate (here especially different on AM5) to SOC and procODT issues.
...
For the IMC, you neither need SOC nor VDDG.
Sorry, i dont think this is correct.
IMC supply and upkeep voltage on AM4 is solely cLDO_VDDP
I partially disagree, but I was not precise. The topic we were discussing was WHEA 19 coming from higher IF as a result of 1:1:1 RAM OC. According to e.g. this article on HWluxx, the IOD hosts MC + DDR PHY.
  • VDDP is for DDR4 PHY, as stated in BIOS: "VDDP is a voltage for the DDR4 bus signaling (PHY)"
  • MC + PHY sit on IO-Die, connected with IF --> impacted by VDDG IOD voltage: "VDDG IOD represents voltage for the data portion of the Infinity Fabric".
So it is pretty obvious to me why, in my experience as well as that of many others, raising VDDG IOD (+ CCD to some extent) is needed for IF-related WHEA 19 as a result of 1:1:1 RAM OC. Of course vSOC as base voltage needs to be sufficient as well; I did not observe any procODT impact on IF / WHEA 19.
Once you set FCLK, only your sample's strain towards general impedance (procODT) and your IMC's tolerance to cLDO_VDDP
are what define your maximum MCLK
VDDG & SOC need zero touch.
Not for MCLK, but for WHEA 19 which are IF related? And WHEA 19 as a side effect of raising IF for RAM OC was the topic.
ProcODT for MemOC generally needs no touch, but its an old community habit;
Now I am confused. You stated yourself multiple times that ProcODT needs to go down to 30-32 ideally in the context of MemOC? And in my experience, bringing it from 36.9 / 7/3/4 down to 32 7/3/4 finally fixed the TM5 errors for my 3800 1T GDM off set.
I dont have anything negative, bad or forceful in my mind or actions. Zero :-)
All emotes are neither passive nor active aggressive. Absolutely zero.
If i write them i mean it ~ very happy right now to be honest :d

I just wish that , how do i phrase it
I just wish that LUXX delivers better.
You guys have much more accurate research, and Germans are generally known for perfectionism
I want to hold you to that ~ but will never force this onto somebody
Same here, and I enjoy the deep discussions. And I am Austrian, so I don´t feel forced :d
I also wish for precision, and that experts deliver better on building the bridge to the average enthusiast and explain things in a tangible way like "for your specific question, in 80% of cases the solution lies in X. Does not help? Try Y". Most folks don´t have that much time for optimization and ask for 80/20 rules to get to a 95% result without WHEA and TM5 stable, which for Ryzen 5000 is DDR4 3800 CL 14-19 (depending on IC). For those folks I think this forum is doing an excellent job, and for the "super-elite", you + some others come into play ;)
Selective (random) people i try to help, i want them "to not give up" even if its difficult and confusing due to lack of documentation;
Thats about all. It doesnt need to be first place or anything. If person learns and shares that knowledge further, pulling other people up ~ im happy :)
I hope thats understandable :)
Absolutely, and I see in principle we totally agree :)
 
I partially disagree, but I was not precise. The topic we were discussing was WHEA 19 coming from higher IF as a result of 1:1:1 RAM OC. According to e.g. this article on HWluxx, the IOD hosts MC + DDR PHY.
  • VDDP is for DDR4 PHY, as stated in BIOS: "VDDP is a voltage for the DDR4 bus signaling (PHY)"
  • MC + PHY sit on IO-Die, connected with IF --> impacted by VDDG IOD voltage: "VDDG IOD represents voltage for the data portion of the Infinity Fabric".
So it is pretty obvious to me why, in my experience as well as that of many others, raising VDDG IOD (+ CCD to some extent) is needed for IF-related WHEA 19 as a result of 1:1:1 RAM OC. Of course vSOC as base voltage needs to be sufficient as well; I did not observe any procODT impact on IF / WHEA 19.
We both could have phrased it better.
IMC itself yes, Physical interface is powered by LDO VDDP
That is the lower layer

When pushing its clock speed for higher UCLK
Sync or not being irrelevant :)
It will require a bit more VDDP.
Then its SI & required voltage, depends on the procODT supplied & defined;
// how it reacts to voltage, how much it wants and how far it scales with voltage ~ yet having nothing to do with procODT.
// Its just the whole chip influenced by it, like it is with CPU1P8 supply.

The layer on top of it is for MC clock towards IFOP?
Its missing the GMI & xGMI links to illustrate it better
What that upper MC layer clocks to is MCLK speed, as "data path" towards X
That one is still hungry for VDDP, but only partially. The +/- range from the defined procODT is by how much "more" it wants from VDDP.
Soo if you push 5000MT/s, even if UCLK is low ~ it will eat VDDP and require strong ODT impedance.
Yet that procODT impedance has nothing to do with powering specific memory. Thats what i wanted to say.

Its not used for memOC ~ because its not the issue maker.
VDDP at 900mV with procODT range between 28-48ohm, is a pretty wide range and plenty for most capacity reaching substrate max gear 1.
// usually gear1 max should be 2000:2000:2000 on nearly all samples ~ which also does work with VDDP supply at 900mV , but more to that for another day

Then the MC links go towards Fabric, towards SOC, GMI and from the CCD IFOPs into the CCD
If i have this right.
The voltage between them is VDDG, but that is fully load balanced.
The VDDG data rail and the MC data rail may or may not synchronize.
I'm certain there is more in between (inside IOD) but i dont have a good illustration to draw it, neither have seen it well explained;

One layer higher is the SOC supply and soc clock
This SOC supply is what is sent down towards VDDG, and it plus procODT are what manage substrate impedance behavior.
Basically that behavior is what controls how far and how strong layers-down voltages affect the substrate itself.

Because "voltage" by itself has zero meaning !
And impedance needs more factors to "have any weight"

For overclocking X you mess with:
The FCLK ~ VDDG both, the CPU1P8, and the SOC // for whole AM4
The CCD & CO, ~ with VDDG CCD & CPU1P8
The UCLK & MCLK ~ VDD(s) & sometimes VPPM. One usually also messes with CAD_BUS & RTT to reach higher MCLK but that is also a sideproduct, and not their main purpose.
The Memory-DIMM & GearDownMode/Powering ~ the CAD ODT & RTTs. For memory powering thats your part ~ AFTER procODT CPU section is perfect. They are split things and split work

For FCLK push as sideproduct making fast/unstable GMI links, ~ you mess with PCH voltage and with LCLK. With DPM states too.

Soo to resolve WHEAs,
The focus mostly lies in getting GMI stability back
Messing with MC_Data & CCD Data ~ the VDDG blocks, is low priority.
They only have entry supply which could need an increase, together with SOC ~ due to strain increasing and generally higher clock

But the main troublemaker on everything and
"why gear1 max is staying at 1900 and not 2000 1:1 ~ are the GMI & xGMI links"
That and the connection towards PCH ~ soo GMI & LCLK.

Now what actually causes the WHEA report to vanish and the OS sensors pick it up
Thats a different topic.
But before ever bothering with that little issue ~ you need to bother with Package Throttle due to higher FCLK strap.
Something that will automatically happen and for 1CCD samples being easily noticeable on write bus being throttled lower than MCLK-BW_MAX/2
1713439416230.png

This is the first step how you notice throttling.
on dual CCDs thats a bit more difficult due to interleaving between CCDs.
MC<->CCD portion interleaving is a big topic, which is not that much voltage related.
Now I am confused. You stated yourself multiple times that ProcODT needs to go down to 30-32 ideally in the context of MemOC?
Its hard to explain
Memory powering , has zero to do with it
But procODT will affect everything inside the CPU, and also the behavior of CAD_BUS , towards DIMM slots.

On Intel you could explain that procODT will affect VDDQ_CPU & SA
Where CAD_BUS would affect VDDQ_MEM
RTT then controls the VREF section, which already was created by VDDQ.
Same here, and I enjoy the deep discussions. And I am Austrian, so I don´t feel forced :d
🤝


It is very hard to differentiate between
"it affects it" ~ but its a side influence
"it is mainly responsible" ~ it is the key influence and everything around are just rabbit holes

For WHEA itself, you have to get all those parts solid
And it very rarely is voltage related.
Voltage may help if the sample doesnt internally have the correct voltages, because GMI links lose the report ~ soo a WHEA #19 report is created.

But WHEA are just endproduct reports of X amount of active sensors
Like shown on my previous post, 9/10 sensors active for this SKU on current (auto) bios configuration.

Now, just because you dont have WHEA, does it mean the target FCLK or MCLK or CO runs?
Absolutely not.
It will package throttle, and thats how it should have been from the beginning.
Some samples can run 2100 FCLK, even 2167 (2133 is far too hard without package throttle) ~ and dont have the issues all other samples have.

Those specific samples got patched and fixed by a newer , lets call it FW branch
Than any of the samples leaving the specific factories.

Soo those samples are very different in behavior when it comes to error correction and WHEA report.
They rather full crash, and have more margins for GMI ~ compared to lets say 98% of the chips out there.

X3D was supposed to be the fixed sample, and many actually can target 2000 FCLK
But its not that easy, because like said ~ 8 different reasons a WHEA can be created , AFTER package-throttle can not keep up with it.

X3D was supposed to be the fixed sample on improved substrate color, and many actually can target 2000 FCLK
But its not that easy, because like said ~ 8 different reasons a WHEA can be created , AFTER package-throttle can not keep up with it.
Hence i want actual raw data from @Tatilica
before we jump down other rabbit holes

like CCD OC,
BCLK shenanigans that mess even more with the actual low level issue
and onwards.

You need the CPU side perfect, before you start to search for the item that causes the dropped reports ~ somewhere along the GMI connections.
#19 only means "report lost into VOID" ~ it doesnt mean anything else. It doesnt mean throttle, it doesnt mean instability.
It means something was reaching instability margins, may or may not have been corrected, but the report vanished for ~some~ reason.
And the chance for this to happen, is with workloads that have high ring , cache and core to core communications

Soo you will see this happening more frequently with memory bandwidth tests
Or can track it closer with core2core benchmarks
May come up fast with CPU L1+L2 towards L3 through ring, back to CPU ~ y-cruncher FFT, or linX large dataset

And so on.
Post is too big already :-)

EDIT:
This whole thread may have the information you need
Its not that easy - but the biggest issue before was board design and FW issues ~ of which most are resolved already;
Sure the CPUs have fabrication issues, and AMD later finally figured it out (sadly only for AM5)
~ but its not something you can just "patch out" with AGESA blobs.

As rude as that sounds, ppl shouldn't dream about 2100-2167 FCLK results ~ even if its in the realms of possibility for AM4.
But people can finally target up to 2000 FCLK with normal samples ~ IF they balance everything.
Load balanced VDDG and SOC supply being the lowest priority of all.
Neither VDDP, as 900mV is plenty for 2100:2100:2100 ~ even 880mV can work.
But ya, voltage means nothing ;) neither on the DIMMs

EDIT2:
I want to write more* and add the little bits of information that need to be known
Like VDDG tuning sure helping package throttle on high FCLK clock.
Yet those having nothing to do with WHEA itself ;) just basic stability improvements.
* But post is far too big already

Sigh, last post for now, i promise

I for example can never have WHEA #19 & it was not only my sample, neither only 6 cores :)
Sure i need to touch VDDG , SOC & procODT when messing with FCLK
// (that being higher or lower than MCLK, its a good idea to tune FCLK higher than MCLK isolated)
I need to find a balance between CPU1P & procODT without ruining the voltage range of this sample
~ when continuing work on 2133 FCLK.

BUT;
While those voltages will affect my stability and amount of package throttle
// (2167 being unbearable and a lost dream on this sample, unless maybe delidded and watercooled)
They have zero to do with WHEA. :)

Those are two completely separate topics, and i can see that very clearly ~ as i will never get WHEA#19
This sample doesnt have physical issues and this Board doesnt have physical issues. Never had, but its not my only AM4 sample;

Now does it mean the sample is lucky,
Not really. Its very far away from anything like a golden or platinum sample.
Mine is just slightly above average,
but i've degraded one core of the 6 left slightly when pushing 1.55-1.6v into it for XOC contributing towards Project CTR/Hydra.
Yuri did too to figure out the limits and minimum voltage scales;

The reason i can keep 2100:2100 up, is because i worked a lot to resolve all sorts of package throttles.
But throttle or not, stable or not ~ none of this will ever create a WHEA #19

If i do stupid things with cores, i may get a WHEA #18
But all sensors and GMI works how its supposed to. Soo there never will be a WHEA #19
There might be hard shutdowns and crashes, if my voltages are bad ~ yet still, zero to do with #19 :-)


I hope thats now understandable~
 
So what are the specific BIOS settings that are relevant for balancing against WHEA? Not SOC, not VDDG as I understood, which is against my experience. But then again you write "VDDG both, the CPU1P8, and the SOC // for whole AM4" What is it now?
  • CPU1P8 - clear
  • GMI does not have BIOS settings, so it is
    • LCLK / LCLK DPM
    • PCH voltage - clear, but not existing in my MSI B550I BIOS
  • And still vSOC & VDDG CCD + IOD?
 
It basically took ~280-300 continuous days , to reach 2100 FCLK no throttle stability // i think it was ~3 months to reach 2000 1:1 and about 2-3 more for 2067 FCLK
But research on Vermeer took ~2 years , to build ontop of foundation from 1usmus (yuri) , who took over Matisse (it relates ~ VDDG relates)
Sure there are big big people who contributed like The Stilt and Elmor , on Matisse massively !

But i did my own thing and later joined with Yuri privately to continue some stuff which lacked documentations :)
I never got anything gifted, nobody wanted to help or work with me let alone teach;
So what are the specific BIOS settings that are relevant for balancing against WHEA?
I think this sums it up:
But its a "side effect", readout.
Because everything is loadbalanced.

If you ever get WHEA, it needs to be known from what they came.
Early it was Realtek NIC. That should be resolved by now from Drivers and AGESA (DXE) Blobs.
Later Intel's i225-V (revision 1 especially, and 2) had trouble which needed a FW upgrade. (This was supplied too); some boards of the same revision had Rev3 of the NIC.
Further down the road, the chipset itself (supply) and USB/SATA dropouts from the GMI link ~ caused an issue (LCLK topic)
Then came PCIe 4.0 cards requiring different PCI redriver settings, and some boards being built badly in that regard. Soo LCLK got another tuning and people had chipset issues
With chipset issues then came sound dropouts, because thats also PCI (and sometimes current AGESA still has those flukes at 2100 FCLK, if voltage is not perfect and procODT is nicely set)

All those, just one error on them, will cause a WHEA.
Its scenario specific, why the WHEA comes

Many just resolve it with VDDG CCD to 1.075-1.12v
Leaving IOD near 980mV .

IOD does scale a bit with FCLK target thats true, but at worst it will package throttle
This is one of the things you can track

I would not focus too much on "when" you get WHEA
But first try to not lose performance from higher FCLK.
// higher than MCLK, you can track that isolated without issues

The values that change this is starting with 1.88v CPU1P8
And a strong ODT ~ between 28-34ohm procODT

Now this strong ODT will absolutely mess with CAD_BUS and destabilize your mem
Soo pushing those voltages up (SOC till 1.2 as starting point , no worry till 1.28v) ~ weakening at the same time ClkDrvStr once
And VDDG IOD till 1v.
Should be a good foundation to start with

Then you just benchmark
See if you lose performance (super easy to track on 1CCD sample with Aida64)
And work on your voltages (SOC, VDDG, procODT & CPU1P8) till you dont throttle anymore or less.
The changes are drastic, even if something is 10mV off.
// EDIT: I lose around 20% easily if one voltage or procODT (they go together) is a tiny bit off.
// 2133 1:1 so far is 89% there, but it still throttles . Makes no sense to run it :-) ~ changes aida mem lat from 48.5ns to 49.8ns ~ but should be at 48ns once stable.
// from 34133MB/s write i reach about 34122-34125MB/s Write. Its close, but yet so far away. For 2100 G1 it stays consistent at 33599MB/s :d
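
The write figures quoted here line up with roughly FCLK × 16 bytes per clock on a single-CCD sample (2100 gives 33600 MB/s, 2133.33 gives about 34133 MB/s). A rough sketch of the check, with everything in it an assumption for illustration: the 16 bytes per FCLK cycle is inferred from the numbers in this thread, the 0.01% tolerance is picked so that 33599 MB/s at 2100 counts as clean while 34125 MB/s at 2133 counts as throttled, and the function names are made up.

```python
# Assumption inferred from the figures quoted in the thread (single-CCD sample).
BYTES_PER_FCLK_CYCLE = 16


def expected_write_mb_s(fclk_mhz: float) -> float:
    """Write bandwidth the benchmark should reach without throttling."""
    return fclk_mhz * BYTES_PER_FCLK_CYCLE


def write_loss_percent(fclk_mhz: float, measured_mb_s: float) -> float:
    """How far the measured write bandwidth falls short, in percent."""
    expected = expected_write_mb_s(fclk_mhz)
    return (expected - measured_mb_s) / expected * 100.0


def is_throttling(fclk_mhz: float, measured_mb_s: float,
                  tolerance_percent: float = 0.01) -> bool:
    """Flag a run whose write bandwidth misses the target by more than the tolerance."""
    return write_loss_percent(fclk_mhz, measured_mb_s) > tolerance_percent
```

Note that the 2133 strap is really 2133.33 MHz (6400 / 3), so pass that value rather than 2133 when comparing against 34133 MB/s.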

Soo after that is done, you may push back MCLK up to 1:1 sync and make sure that is stable too , because procODT messes with everything.
The remaining WHEA that may or may not arise will be due to the connection with the chipset.

Some people notice stutter in audio or crashes
Some people benchmark it with miners,
There are ways to torture fabric of the sample and make it crash

But first and foremost one must be actually stable at target FCLK
Be it throttling or not, at least stable :)

That lowers chance of WHEA to begin with
Only later you touch DPM states in the bios, and tune LCLK at the very end (sample unique)
One sec, let me screenshot the key options you should change ~ brb

Sorry for the delay;

This is what you want to run:
NfnQO5c.png

But i dont need it, because current bioses already handle it well ~ well at least ASRocks side for sure
// and that my sample doesnt do silly stuff internally, soo no need for option forcing
The main issue usually lies that DPMs overboost beyond their target at:
brave_esuK9GQ5VQ.png

^ Old 2022, i was testing what cuts down my 4.85 boost ~ it mostly was a hidden 65° limiter, which made no sense with tjMax of 95°.
// FIT rating was also vastly different than on bugged samples (nearly all are)

Most issues begin on NBIOs side
I've looked a bit into AMDs sheets and public documents
They removed any data and entries towards HSMP. Charming.

Soo even if i want, i can not tell you too much about LCLK and DPM.
Maybe i can not find it, but i feel its gone.
The main issue usually lies that DPMs overboost beyond their target at:
Those spikes cause the first and most common WHEA for consumer samples
It spikes too high and generates reports. It completely ignores their target range and just boosts into oblivion // same sillyness on RDNA2 tbh, but its only part of the overboost bug
Data gets lost and here we have one of the most common WHEA reasons even if you try run any other strap than 1900MHz

Before there was a similar issue which tried to be prevented, where people couldnt even boot 1900MHz
And with AMD, i remember the years of AGESA bootloader hardlocks against booting anything above 1900MHz.
This silly mess started with Matisse, which also could run higher than 1900MHz. And similar shenanigans were done on Zen+ & Zen2

Later later, finally early Ryzens could push higher ~ although i'm unsure if they still or again hardlocked Matisse down.
For Vermeer it was a long time the case, just our special samples passed through this lock (bizarre ID) . Later it was noticeable when they lied but retracted the false advertisements for CO support on Vermeer (twitter).
"please update to this AGESA to get CO support, yea the one that had a 1900 FCLK hardlock haha"

Ah many back and forth and later gladly the lock was removed.
Mostly due to couple of us pushing above 2000 FCLK easily, and the patches locking us down to 1900 FCLK again
And so on~~
EDIT: There were/still are many lies from AMDs team about FMAX limits and why they exist, but every enthusiast can get drama, lets stay positive ~ everyone makes mistakes and gives their best to improve, on global scale :-)
You can find a lot of information about this [on OCN] from November 2021, till around mid/end 2022.
 
The changes are drastic, even if something is 10mV off.
// EDIT: I lose around 20% easily if one voltage or procODT (they go together) is a tiny bit off.
// 2133 1:1 so far is 89% there, but it still throttles . Makes no sense to run it :-) ~ changes aida mem lat from 48.5ns to 49.8ns ~ but should be at 48ns once stable.
// from 34133MB/s write i reach about 34122-34125MB/s Write. Its close, but yet so far away. For 2100 G1 it stays consistent at 33599MB/s :d
1713446589582.png
aida64_47l0EUpXrO.png

Just tested now
But i dont like to share, 4200C16-16 is embarrassing. 4200 15-15 flat @ 48.5ns daily ~ is what i'm known for :d
Was debugging some new AGESA CO shenanigans, hence dropped down to rule out issues
As A0 b-dies became unstable on last two AGESAs on my 1.68v on them.
// AGESA 1.2.0.8 onwards caused too many changes and issues. Just needs rebalancing again, but couldnt be bothered for mem's side so far

Air cooling issues ~ likely needs RTT & CAD_BUS redo again, but i couldnt be bothered for now (busy in RL)
1713446947593.png

Like i run this atm, and it's fine ~ but its not optimal for 2x8GB b-die. Primaries are not low enough.
Soo no sharing , ignore novice me :d

Needed full stability last 1-2 months, hence dropped down ~ but 1.62v for 4200C16 is embarrassing, when kit does 4267 C15-15 at same 1.68v cap :-)
Sorry~


EDIT:
I probably should study AMDs PPR for Family 19h ~ more & any 1Ah (AM5) document i can find
To get the names & descriptions right ... and if i want any work near this field.
 
Its scenario specific, why the WHEA comes
Many just resolve it with VDDG CCD to 1.075-1.12v
Leaving IOD near 980mV .
Does not work at all, IOD around 0,98 - 1V crashes. Even though I set:
The values that change this is starting with 1.88v CPU1P8
And a strong ODT ~ between 28-34ohm procODT
Now this strong ODT will absolutely mess with CAD_BUS and destabilize your mem
Soo pushing those voltages up (SOC till 1.2 as starting point , no worry till 1.28v) ~ weakening at the same time ClkDrvStr once
And VDDG IOD till 1v.
Should be a good foundation to start with
All done. 5800X3D with IF 2000 / CCD 1,1V+ / IOD ~0,98-1,0V does not boot or crashes to blackscreen. CCD 1V / IOD 1,1V+ boots fine, just throws WHEA. 1900 is best I can get WHEA free, like many others it seems. My experiences were verified again for my setup. Thanks anyway!

straight 3933 WHEA-free, for both GDM on and off at 1T 👍
Btw looking at @Tatilica it´s pretty much the same with him, like most others. For 5800X3D it´s all the time ~1,075-1,15 vSOC / 0,95-1,0V CCD / 1,025 - 1,060 IOD . I don´t know how the other way round works for you - do you actually own 5800X3D? I only know 5600X screenshots from you, which is a pretty different animal.

Even your own screenshot shows CCD below IOD, and IOD beyond 1,1V, so why do you let people test IF 2000 with 0,98 - 1,0V, which immediately crashed for me?
No bad feelings, just curious on the logic.
 
My experiences were verified again for my setup. Thanks anyway!
Is this the last post or how can i read it ?
You give it one attempt ?
Even your own screenshot shows CCD below IOD, and IOD beyond 1,1V, so why do you let people test IF 2000 with 0,98 - 1,0V, which immediately crashed for me?
Date,
2022 testing things, i marked something on the old screenshot i found - about the issue

My current setup doesnt need much on CCD
But people with other CPUs than mine, other core & ccd layout ~ have own issues

Every sample is unique and VDDG is together load balanced.
Did i want you to jump from 1900 to 2000 strap ?

I dont understand.

Why 2000 FCLK with 980mV CCD and 1020mV IOD ?
Because its fine that way.

You push me to 3 answers here:
~ search badly changed OCN, to find you screenshots of it being stable
~ search a stable picture somewhere on my vault to also confirm you this works
~ go now stop my work, run this voltage, waste 2 hours in stability tests and show the same result but with this AGESA ?

I dont even know what you run , nor your Bios settings
Sorry that you fail your first attempt, but like ?
I dont do remote tech support ~ especially not blind

Show me what you run, and go step by step.
Then we can maybe talk about suggestions

Didnt i write that procODT setting strong is the first goal you should target
One that will mess up all your old voltages & you need to retest
Soo if "it will mess up everything old you have"
How do we expect to oneshot new settings ?

I am genuinely confused.


So you set some VDDGs and you push 1.88v on CPU1P8 ~ on an MSI board that may or may not cause trouble with EEPROM after 1.9v
But what about SOC ?
What about CO & procODT ?

Did we even test the higher/lower (whats baseline even?) voltages before changing clock on 2 steps higher
How do we know our VDDG (and required procODT) changes did no harm ?

I suggest to start slow, not jump 2 clock straps
And if you want to work on it, like now now
Give me some data please ~ i work blind here.
 
No, it was ~1000 attempts so far, ~500 of them documented in Excel, and these settings were all tested up and down in various combinations. Not a single value goes lower or stronger, otherwise TM5 errors or WHEA. Well tFAW goes lower, but does not make sense ;)

1713453395884.png
1713453426834.png


I just wanted to test one more suggestion (CCD up, IOD down) / FCLK 2000 / MCLK 1900 , but it just proved what I knew already. IF >1900 does not work in this setup without WHEA, not even with 1,35 vSOC which I tried for fun ;) I am just tired of testing after 2 years with ~1000 tested combinations, and I don´t believe in 5800X3Ds hitting DDR4-4000 with 64GB RAM. So thanks, but no more help needed here. The mediocre 64GB of M16E don´t work stably beyond 1,33V at 3800 anyway, and thus hardly boot beyond 3800 so I think I am well optimized with what I have. Thank you for your suggestions anyway!
 
5800X3Ds hitting DDR4-4000 with 64GB RAM.
Now we talk about 64GB running 4000MT/s
I read that for the very first time.

And yes they are more than plenty:
so why do you let people test IF 2000 with 0,98 - 1,0V, which immediately crashed for me?
1713454632701.png

just run it now~
More than plenty VDDG
I don´t know how the other way round works for you - do you actually own 5800X3D?
We talk in the past. The past was 2021
If you want pitch-perfect data, i forward you to OCN.
AGESA changes, voltage behavior changes.

The values are more than plenty, even for dual CCDs.
Sure SOCCLK is higher on X3D, but thats besides the point - substrate is more efficient.
If anything it would require even lower voltages.
I suggest to start slow, not jump 2 clock straps
And if you want to work on it, like now now
Give me some data please ~ i work blind here.
1713454677889.png

I am wrong, its a 3-step jump
Of course its gonna fail.
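
To spell out the step count: a short sketch assuming the FCLK straps sit 33.33 MHz (100 / 3) apart, which matches the 1933, 2067, 2133 and 2167 values mentioned in this thread. The helper names are made up. It counts how many straps a jump covers and lists the intermediate targets one would test one by one.

```python
# Assumption: FCLK straps are spaced 100/3 MHz apart (1900, 1933, 1967, 2000, ...).
STRAP_STEP_MHZ = 100 / 3


def strap_steps(current_mhz: float, target_mhz: float) -> int:
    """Number of strap steps between two FCLK values (negative when going down)."""
    return round((target_mhz - current_mhz) / STRAP_STEP_MHZ)


def intermediate_straps(current_mhz: float, target_mhz: float) -> list:
    """Straps to test one after another on the way to the target, target included."""
    steps = strap_steps(current_mhz, target_mhz)
    direction = 1 if steps >= 0 else -1
    return [round(current_mhz + direction * i * STRAP_STEP_MHZ)
            for i in range(1, abs(steps) + 1)]
```

For 1900 to 2000 this gives three steps, with 1933 and 1967 as the stops in between.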

And as long as you keep judging WHEA and unstable as the same thing
I dont know how to help any further than what i repeated now 3 times.
1713454739416.png

Throttle or bloated OS, or tested while ZenTimings and other tools were open.
No throttle targets 30399MB/s.
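
As a rough cross-check of the 30399 MB/s figure, here is a minimal Python sketch. It assumes (this is not stated in the thread) that a single CCD can write 16 bytes per FCLK clock toward the IO die, so the write ceiling is roughly FCLK x 16 B. The function names are made up for illustration.

```python
# Assumption: one CCD writes 16 bytes per FCLK clock toward the IO die.
WRITE_BYTES_PER_FCLK = 16


def write_ceiling_mb_s(fclk_mhz: float) -> float:
    """Theoretical single-CCD write ceiling in MB/s for an FCLK given in MHz."""
    return fclk_mhz * WRITE_BYTES_PER_FCLK


def write_efficiency(measured_mb_s: float, fclk_mhz: float) -> float:
    """Measured write bandwidth as a fraction of the theoretical ceiling."""
    return measured_mb_s / write_ceiling_mb_s(fclk_mhz)


if __name__ == "__main__":
    fclk = 1900  # MHz, the FCLK discussed in this exchange
    print("ceiling MB/s:", write_ceiling_mb_s(fclk))
    print("efficiency of 30399 MB/s:", write_efficiency(30399, fclk))
```

Under that assumption, FCLK 1900 gives a ceiling of 30400 MB/s, which is consistent with the target named above. A measured write result clearly below the ceiling would then point to throttling or background load rather than to the timings.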
Post automatically merged:

Now we talk about 64GB running 4000MT/s
I read that the very first time.
Whats your goal , 64gb at 4000MT/s
Or attempting to scale first to 1933 FCLK ~ with 1900 on MCLK/UCLK

Is there any current goal, or do we talk about the past only
Because i have stuff to do , like right now
Post automatically merged:

not even with 1,35 vSOC which I tried for fun
I hope you dont just run SOC without changes on procODT.
They go together.

No, it was ~1000 attempts so far, ~500 of them documented in Excel, and these settings have all been tested up and down in various combinations. Not a single value goes lower or stronger, otherwise TM5 errors or WHEA.
If the working method is isolated single change testing
I'm sorry to say, but you will not make any progress ~ but i dont know about your data. I can not see it and i dont follow this thread much.
I talk with you for , probably the 2nd week and its just now that i see some results from you.
I don't know. Can't comment on this and haven't heard your name or seen your documents outside this place.
Generally haven't seen your sheets, i'm sorry, i dont know.

Tho no timing goes alone, and neither CO per core behaves solo.

No RTT goes alone, no cad_bus value goes alone
All in pairs or triplets.
Changing one value is doomed to fail when you have little margin
 
This was just about trying to help @ApolloX maybe reach 2000 1:1:1; for this (and out of curiosity) I tried to verify your suggestions on "everyone can reach IF 2000". That's all, no goal so far for me
 
I tried to verify your suggestions on "everyone can reach IF 2000"
"Can"
Within the remaining 5-6 paragraphs, that explain why it can and why it doesnt
But why it should and what to focus on.
Matisse "can" too. But Matisse was strictly prevented on ABL~

You "can" , not ~ just take that out of context.
I'm sorry if my post was not understood.
 
I've kicked something off again there... But it's good that the info is coming, even if a bit unstructured. That's how we make progress.

I'm currently going through various BIOS versions (ASUS): 4402/1208, 4501/120A, 4602, and running a quick test with my best settings from 4702.
"Best" means that with Windows booted, HWiNFO running and TM5 started after about 30 seconds, I had only 3 WHEAs within the first 2 minutes. That's rather imprecise, but I don't have years for this. One year at most, because then a 9000X3D goes in and the haunting is over at the latest.

Add/edit
And this is how things currently stand with 4702
1713474676761.png


Compared to the many WHEAs you often see, that's actually not that many ;-)
 
@ApolloX There you go, I see clearly better chances there than on my system ;) VDDG CCD is far too low imho. I already need 0.955 V at 3800; give it 0.975 to 1.0 V, and also test vSOC at 1.175 - 1.2 V, as @Veii says. With that I get at least 1933 almost WHEA-free - but only almost ;)

Why is AddrCmdDrvStr so high with GDM? I almost always see only successful 20 here ;)
 
I've kicked something off again there... But it's good that the info is coming, even if a bit unstructured. That's how we make progress.
All good :d

Step by step, 1900 MCLK/UCLK - 1933 FCLK
Then 1966 FCLK and so on~
Watch intercore latency and bandwidth.
Watch AIDA write bandwidth, whether it reaches MCLK/2, freq-max, or throttles.

You don't have to focus on 1:1:1
AM4 can do 1:1:Auto just like AM5. MCLK+UCLK & desynced FCLK.
The voltages stay high.
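
The step-by-step approach above (1900, then 1933, then 1966 and so on) can be written down as a small ladder. This is only a sketch: it assumes FCLK moves in steps of one third of BCLK (33.33 MHz at 100 MHz BCLK), and `fclk_ladder` is a made-up helper name, not something from a tool mentioned in the thread.

```python
def fclk_ladder(mclk_mhz: float, target_fclk_mhz: float, bclk_mhz: float = 100.0):
    """List the FCLK steps from the current MCLK up to the target FCLK.

    Assumption: FCLK granularity is BCLK / 3 (33.33 MHz at 100 MHz BCLK).
    """
    step = bclk_mhz / 3
    first = round(mclk_mhz / step)         # multiplier matching MCLK (1:1 start)
    last = round(target_fclk_mhz / step)   # multiplier of the target FCLK
    return [round(n * step, 1) for n in range(first, last + 1)]


if __name__ == "__main__":
    # MCLK/UCLK stay at 1900 while FCLK is raised one step at a time.
    for fclk in fclk_ladder(1900, 2000):
        print(f"MCLK/UCLK 1900 - FCLK {fclk}")
```

The idea is to test each rung on its own (latency, bandwidth, WHEA count) before moving to the next, instead of jumping several steps at once.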


sunrise.gif


Soo i manage to find something
But it was already out, Valentines-day 2022
Gif of Gif , because host took it down. Archives~
DiscordCanary_wuOuRlm9ZU.gif

This is how pretty much all CPUs behave
1800 FCLK , works works, spikes to 1830 FCLK ... ok dealable if you optimize stability and never trust auto bios behavior

Now this is how the reality actually was and to some extent still is , before i worked on system after system after system
~ as AMD was slow with the updates and gamers demand stable systems.
brave_yQCLXAtgVG.png

You think you run 1800 FCLK strap - well no , happy 2000+ FCLK spikes and LCLK spikes.
The CPU doesnt even adhere to AMDs own limiters ~ but also ok, bugs , old bugs , we have time to fix it

Here is another CPU i had, when i was working as SI
brave_HtUQIYN49v.png

What about 2600 FCLK spikes with 880MHz LCLK. :geek:
We think its just 1800MHz Gear 1, what is supposed to go wrong , right guys
Absolutely no wonder the system hardcrashes out of random.

Lets have another one
dllhost_KIPgGoHrDY.png

This is gentle, its bearable , single CCD with nice 2000 G1
LCLK still bursts above allowed frequency Max.
Its not like we have limiters or anything, why even have them 🙃

Obviously i'm joking here and mocking for fun 🤭
But do you guys finally get the point :-)

Lets put some more,
Its not like i'm speaking without experience and "neever had XYZ CPU"
Alone reading that way of argument towards me, ugh~

How about 13GHz Boost Spikes :LOL:
1713500055650.png

I'd like to have some 300% IPC & FMAX uplift, yes please. // unfortunately screenshot was not when the spike happened
This was the time when i worked with ManniX and Yuri on powerplans, but created something as broken and fast
That it was too dangerous to share, as it played with controlling the overboost bug and abusing it (which is a serious set of issues, not just frontend)

Unfortunately i couldnt find the same spiking to 1.65v (actual requests) with 22GHz and RDNA spiking to 6GHz ~ picture
Before the system ultimately froze and crashed.
Maybe you'll find it as "a meme" in Elmorlabs Discord ~1-2 years ago.

This is what actually causes your reports inter-cpu to turn into "uncorrected, lost in void" WHEA's by sensors.
NBIO + DRAM & LCLK being completely unstable ~ link to PCH included;


Soo ,
I hope thats enough to get your attention

How do we handle it from here,
Do we want to stay at ~ "we worked 4 digit hours, we did soo well, who are you to tell us that its possible"
Do we want to give up, and just be done with it ?

Or do you actually want to use my time, and work on this - without searching for excuses:
How difficult this is,
how impossible this is
How long we tried and kept failing
How much we know and dont want to bother

Obviously i'm harsh sounding here,
But if you guys just have "give up" as an answer
Sorry but you're wasting time. Time of people who read and try to learn & time of person who may or may not want to help for literally zero reward, just wasting their time arguing on a forum;
Up to how you treat them (y)

Have a joyful and happy Friday + weekend 👋:giggle:
 
Hello everyone,

does anyone here have experience with RAM cooling: heatspreaders yes or no, in the case of a 2-DIMM board?
I'm seeing temperatures of 50+ despite active cooling, since there is zero gap between the DIMMs.

Does it make sense to remove the heatspreaders so that the fan can push enough air between the DIMMs?

IMG20240322232211.jpg
 
@unifyx Yes, I'd do it. Carefully take off these "heat intensifiers" and then blast the sticks with a fan. You rarely see that, but in your case it looks roughly too warm.
Post automatically merged:

@ApolloX There you go, I see clearly better chances there than on my system ;) VDDG CCD is far too low imho. I already need 0.955 V at 3800; give it 0.975 to 1.0 V, and also test vSOC at 1.175 - 1.2 V, as @Veii says. With that I get at least 1933 almost WHEA-free - but only almost ;)

Why is AddrCmdDrvStr so high with GDM? I almost always see only successful 20 here ;)

Yes, I'll go up from CCD 0.9 to 0.95 V, but as I've written several times, on my system this runs against all theory. I'll lower AddrCmdDrvStr to 20.

VDIMM is 1.47 V by the way, and the 1.8 V PLL is at 1.88 V.
-> Now following the theory that the IMC needs more voltage to stop producing WHEAs...
How high can you go with PLL?

Add
-> PLL to 1.95 V
Now I've taken the lighter default TM5 config to experience something positive for once. 1 WHEA right at system start. Cycle 1: 0 WHEAs. Cycle 2: 1 WHEA. Cycle 3: 4 WHEAs.
1713544924858.png

Now I'll see how bad it gets with Extreme@Anta. I just had 7 WHEAs in test 12 there.

Add
Now I've let it run for an hour, which included 3 rounds of the default TM5. 10 WHEAs in total. That's already not bad.
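
To compare runs of different length, the raw WHEA counts can be normalized to a rate. A minimal sketch, using only counts quoted in this thread; `whea_per_hour` is a hypothetical helper, and counting by hand from a log is assumed.

```python
def whea_per_hour(count: int, minutes: float) -> float:
    """Normalize a WHEA count observed over `minutes` to events per hour."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count * 60.0 / minutes


if __name__ == "__main__":
    # (description, WHEA count, observation window in minutes), as quoted in the thread
    runs = [
        ("10 WHEAs per minute under TM5", 10, 1),
        ("3 WHEAs in the first 2 minutes", 3, 2),
        ("10 WHEAs in about one hour", 10, 60),
    ]
    for label, count, minutes in runs:
        print(f"{label}: {whea_per_hour(count, minutes):.0f} per hour")
```

A rate says nothing about which kind of report was logged, so it only helps to judge whether a change moved things in the right direction.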
 
The above was on my oldest system (production system).
Now, after dinner, I booted the same settings on the gaming system, played half an hour of Red Alert 2 (my system idles at 0 load there), then played CP77 for over an hour, and got 0 WHEAs.
That surprises me a bit, because on the other system I had spontaneous WHEAs after 10 minutes at the very latest, and here nothing at all. Now I'll shake it up a bit and see when the first one shows up.

1713555700991.png
 
  • CPU1P8 - clear
  • GMI does not have BIOS settings, so it is
    • LCLK / LCLK DPM
    • PCH voltage - clear, but not existing in my MSI B550I BIOS
  • And still vSOC & VDDG CCD + IOD?
Bread, what are your recommendations here?
- for the 1.8 V CPU voltage?
- What does GMI stand for? What would you set for LCLK and LCLK DPM?

- I know your views on SOC, CCD and IOD and share them in theory, except that the increases you suggest end in a rain of WHEAs on my system.
 
Asus Q-Code C5 error after updating the BIOS to 2007.

The system was working perfectly this morning. I decided to update the BIOS and the chipset driver. After the update the system no longer boots: I get error code 00, and the system hangs at error code C5 during startup. It seems you cannot roll the BIOS back to an older version once 2007 is installed. So it looks like I'll be running single-channel memory until a new BIOS update arrives. The DIMM_A memory slot is dead; the system boots with only one memory stick in slot DIMM_B and runs fine on one channel only.



Capture.PNG Only 1-Channel.PNG Screenshot 2024-04-20 113628.png IMG_3269.JPG IMG_3272.JPG IMG_3285.JPG
 
Still no data for me to work from. I cant help you with no data.
C-states off and no boost is not a good idea,

Why it logged nothing on 3933 GDM on,
Verbose logging is needed

There are good and bad WHEA reports.
Successful and failed correction can happen without causing WHEA report to begin with.
Its just an OS logger.

Alright !
Stay safe and have fun :)
Hey @Veii, great news: I can't wait to thank you again. Just back from the job until Monday, and I figured out what's going on with IF 2000 WHEA-free on my X3D 💪
After tweaking straight IF 2000 all day today and finding nothing but random WHEAs, I went back to my BCLK route: from the previously tested profile 3933 CL14 RCD 15 GearON (no BCLK, as suggested) + 101.75 BCLK, slowly decreasing the PBO limits PPT & EDC step by step in -10 steps. Guess what, the WHEAs are gone 👍
I still can't believe it, still testing & retesting, because EDC did the trick: if I step it up by +10 A, the WHEAs are invited back to the party🎆
No thanks, better to stick with EDC=60 and skip this WHEA party for now. After 1 year I've just barely reached the goal I was looking for: IF 2000 WHEA-free, for gaming only😋
And the funny thing: I can see L1/L2/L3 cache bandwidth has regressed compared to the previous PBO Limits=Disabled, but that doesn't bother me much. The short latency in ns is still there, and I even found more FPS in games too😜
EDIT: LCLK Control=min/max 592 DPM=Disabled
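
For readers wondering how a 3933 profile ends up as "4000" in the screenshot names: the sketch below simply scales the nominal clocks by BCLK. It assumes that memory and fabric clocks scale linearly with BCLK and that FCLK runs 1:1 with MCLK; `scaled` is a made-up helper name.

```python
def scaled(nominal: float, bclk_mhz: float) -> float:
    """Scale a clock that was set for 100 MHz BCLK to the actual BCLK.

    Assumption: the clock is derived linearly from BCLK.
    """
    return nominal * bclk_mhz / 100.0


if __name__ == "__main__":
    bclk = 101.75          # MHz, as in the profile described above
    ddr_nominal = 3933     # MT/s set in the BIOS
    fclk_nominal = ddr_nominal / 2  # MHz, assuming FCLK 1:1 with MCLK
    print(f"effective DDR rate: {scaled(ddr_nominal, bclk):.0f} MT/s")
    print(f"effective FCLK:     {scaled(fclk_nominal, bclk):.0f} MHz")
```

So under these assumptions the 3933 strap at 101.75 BCLK lands at roughly DDR4-4000 with an FCLK of about 2000 MHz. Other BCLK-derived clocks would move by the same factor.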
 

Attachments

  • IMG_20240420_102935.jpg
  • IMG_20240420_103027.jpg
  • IMG_20240404_001727_copy_4523x2072.jpg
  • IMG_20240428_001052.jpg
  • IMG_20240513_213256.png
  • 5800X3D_4000CL14_GearON_101.75_PBO 100-70-60_CO-PerCore_LCLK DPM off #.png
  • CP2077_5800X3D_4000CL14_GearON_101,75_PBO 100-70-60_CO-PerCore_LCLK DPM off #.png
  • CP2077_5800X3D_4000CL15_GearOFF_101,75_PBO100-70-60_CO-PerCore_LCLK592DPMoff_copy_2048x1152.png
Cool @Tatilica
But you haven't faced Doom - what happens if you start extreme1@anta777 instead ;-)

I just copied your settings for the sake of trying things...
 
Cool @Tatilica
But you haven't faced Doom - what happens if you start extreme1@anta777 instead ;-)

I just copied your settings for the sake of trying things...
We'll never find out just by copying, since it's a different X3D sample in a different combo: chipset, mobo, mem, gpu😜
But yeah, you can try, why not. Only for gaming though, because be aware all your benchmark scores will be gone 🧐
Btw I'm playing only CP2077 & Starfield at age 57
 
isn't a simple solution, but it may show directions. And did. With your super high voltage I was able to see no WHEAs in gaming and light browsing - but this isn't a big thing.
And 3 cycles default-TM5 only showed 3 WHEAs. So the direction is not bad, but needs further fine tuning.

Royals_4000_high-volts_3WHEA-default.jpg
+ 3 WHEAs
-----
And slowly I'm coming closer ...
1713652267125.png
 
isn't a simple solution, but it may show directions. And did. With your super high voltage I was able to see no WHEAs in gaming and light browsing - but this isn't a big thing.
And 3 cycles default-TM5 only showed 3 WHEAs. So the direction is not bad, but needs further fine tuning.
Who said it's simple, I was joking with you. Glad to help gamers and Luxx'ers here get rid of WHEAs at high mem freq. Fine tuning the PBO limits down to lower, balanced values is required, especially EDC, plus fine negative per-core Curve values in a perfectly aligned voltage combo soc/vddp/dg/iod/vdimm/1.8v rail. Not that simple. Many, many thanks to @Veii, who routed me to "never give up" all the time💪
 