Originally posted by coobolt
Doesn't seem like many people read what was on Anand's site.
Here is a text I've been saving for a while. I don't know where I originally got it from, so unfortunately I can't give credit to anyone...
Burn-in of overclocked cpus
Why?
If you overclock your cpu, you are basically running it outside its specification. It may run, but that's not guaranteed. To help it run, you might try burn-in and some other measures.
How is the cpu basically working?
A cpu functions with just two different signal states: "high" and "low" (sometimes referred to as "1" and "0" or signal "on" and "off").
To detect the difference between "high" and "low" the cpu uses some kind of reference voltage. If the voltage of the signal is higher than the reference voltage, the signal is detected as "high"; if it is lower, it is detected as "low". The time the cpu has to detect the "high" or "low" state is limited by the cpu clock. If the clock is set to a higher speed, the time for detection, the "decision cycle", becomes shorter.
The signal itself is not at all digital. It's analog. This means it doesn't jump from "low" to "high" and back in no time; it gradually rises from the lower level to the higher and falls back again. Parasitic capacitances and resistances cause this. They are called "parasitic" because you would rather they weren't there, but given the current semiconductor manufacturing process, you can't avoid them.
In the design of the cpu, the parasitics are kept as low as possible. Sometimes you run into a trade-off: if you make the resistance lower (for example, wider metal lines have less resistance), you might increase the capacitance (wider metal lines have more capacitance). This means you will always end up with parasitics in one way or another.
The capacitances "suck" away the rising voltage until they are charged, and the resistances make matters worse by "resisting" the current flow that tries to charge the capacitances. Temperature also makes matters worse, because heat further increases the resistances in the cpu.
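To put rough numbers on this, here is a minimal sketch that treats a single signal line as one resistor charging one capacitor and compares the time needed to reach the reference voltage with the length of one clock cycle. The function names and all component values are made up for illustration, and a real cpu has long chains of gates per cycle rather than a single RC node, so take it as a toy model only.

```python
import math

def rise_time_to_reference(r_ohm, c_farad, v_supply, v_ref):
    """Time for an RC node charging from 0 V toward v_supply to reach v_ref."""
    if not 0 < v_ref < v_supply:
        raise ValueError("v_ref must lie between 0 and v_supply")
    # V(t) = v_supply * (1 - exp(-t / (R*C))), solved for V(t) = v_ref
    return -r_ohm * c_farad * math.log(1.0 - v_ref / v_supply)

def clock_period(freq_hz):
    """Length of one clock cycle in seconds."""
    return 1.0 / freq_hz

# Illustrative values only (assumptions, not taken from any datasheet)
R = 1_000.0   # ohms
C = 1e-12     # farads, so R*C = 1 ns
t_rise = rise_time_to_reference(R, C, v_supply=2.0, v_ref=1.0)  # R*C*ln(2), about 0.693 ns

for f in (500e6, 1000e6, 2000e6):
    verdict = "detected in time" if t_rise < clock_period(f) else "too slow"
    print(f"{f / 1e6:.0f} MHz: {verdict}")
```

In this model a higher supply voltage shortens the rise time and a higher resistance (for example from heat) lengthens it, which matches the reasoning above.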
The transistors in the cpu also have some "internal" resistances. If you look at the transfer characteristics, you will see a non-linear behaviour of the current versus the voltage. If you (or the signal, for that matter) increase the voltage, the current will start near zero and rise exponentially from some point on (the subthreshold region; how steeply it rises is the so-called subthreshold swing). At the so-called "threshold" it will become (almost) linearly dependent on the voltage and then start to saturate at a certain current level.
The driving force in the cpu is the supply voltage of the cpu. Setting it to lower values would result in "slacker" subthreshold swings and lower saturation current, setting it to higher values would steepen the subthreshold swing and increase the saturation current, but since you are dissipating more power, you would also generate more heat in return.
If the combined parasitic and built-in effects slow the signal to the point where the cpu can't detect the change from "high" to "low" or vice versa within the given "decision cycle", the cpu will fail. Sometimes it will fail only once in a while, or only at specific instructions; it can even fail unnoticed, because the internal error correction will step in. That's what makes figuring out whether the system is stable or not so difficult.
How to overcome the problem of the failing cpu.
As stated before, the cpu can fail for some of the following reasons:
Clock cycles too short. Temperature too high. Voltage too low. Transistors switch too slow.
Obviously, switching to a lower clock speed is not so desirable when trying to squeeze the last MHz of performance out of your cpu.
There are several methods for fighting high temperatures, which I don't want to discuss in depth here. Just some basic hints: use the fattest heatsink, throw sufficient air at it and make sure that heatsink and cpu have good thermal contact.
For increased overclockability, you can, very carefully, try bumping the core voltage up in 50...100 mV steps. You have to be careful not to exceed the point where it becomes counterproductive because of the additional heat generation. The power dissipation of a cpu rises nonlinearly with the supply voltage. This means a 5% increase of voltage could lead to a 10% increase of power dissipation, and a 10% increase of voltage could result in 30% more power dissipation. If the voltage is too high, you can reach the point of breakdown in the transistors, which would shorten the life of your transistors significantly, maybe even to zero.
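As a rough cross-check of those percentages, here is a small sketch using the common first-order estimate for CMOS switching power, P = C * V^2 * f. The function names are my own and the model ignores leakage and temperature effects, so it is a lower bound on the increase rather than a prediction for any particular cpu.

```python
def dynamic_power(c_eff, voltage, freq):
    """First-order CMOS switching power: P = C * V^2 * f."""
    return c_eff * voltage ** 2 * freq

def power_increase_percent(voltage_increase_percent, c_eff=1.0, freq=1.0, v0=1.0):
    """Percent change in switching power for a given percent voltage bump.

    c_eff, freq and v0 are placeholders; they cancel out in the ratio.
    """
    before = dynamic_power(c_eff, v0, freq)
    after = dynamic_power(c_eff, v0 * (1.0 + voltage_increase_percent / 100.0), freq)
    return (after / before - 1.0) * 100.0

for bump in (5, 10):
    print(f"+{bump}% voltage -> +{power_increase_percent(bump):.2f}% switching power")
```

The quadratic term alone gives about 10% more power for a 5% voltage bump and about 21% for a 10% bump. The larger 30% figure quoted above would have to include effects this sketch leaves out, such as leakage current that grows with voltage and temperature.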
If you are wondering where the burn-in comes into play, it is at the "transistors switch too slow" point.
The transistors can be made to switch faster with modifications in the semiconductor manufacturing process. This would include scaled-down sizes for channel length or gate oxide thickness, optimization of contacts and wiring, etc. You have no influence on any of this.
One effect of actually using the transistors is hot-electron degradation of the gate oxide. Hot-electron degradation occurs when electrons are accelerated to energy levels which allow them to cross the barrier of the gate oxide. The electrons would then either cross the gate oxide completely or get stuck within it. A stuck electron would incorporate a negative charge into the gate oxide.
This degradation starts as soon as the transistor is used and will eventually lead to the failure of it. Usually, the cpus are designed to last almost forever.
If you can live without that (who wants to use a lame 500 in 20 years anyway?), you can actually make use of the degradation for your overclocking.
The fun part of this kind of degradation is that, with regard to speed, it makes 50% of the transistors in your cpu a bit worse, but the other 50% would get much better.
This is because there are two different flavours of transistors in the CMOS process, NMOS and the complementary PMOS. If your gate oxide has negative charge incorporated in it, the NMOS would get a slacker subthreshold swing, while the PMOS swing, on the contrary, would become steeper. Thus, with the PMOS usually being the speed-limiting factor, the cpu as a whole, which consists of NMOS and PMOS transistors, would be able to run faster.
However, the physical effects are not yet understood completely. And that applies not only to me...
Anyways, you can speed up this degradation process with the burn-in.
During the burn-in you try to get as many hot electrons incorporated in the gate oxide as possible. The hot-electron effect is sensitive to voltage and temperature. The higher the voltage, the higher the effect; the higher the temperature, the lower the effect. Thus, you would run your cpu at minimum clock rate, maximum voltage and minimum temperature (remember, voltage and temperature are dependent on each other). The time needed to incorporate a sufficient number of electrons varies widely. It depends on the specific cpu and what you expect out of it. Due to manufacturing variances, some cpus may be more susceptible to burn-in than others from a different production run. It may even be different with chips from the same wafer.
What to do during the burn-in
Since not every instruction or data will use the whole cpu, you will need to stress your cpu with a wide variety of tasks during the burn-in. If you just let it sit there and idle, only the parts needed for the halt instruction would be stressed...
You can use several programs to stress your cpu. Usually, a high cpu usage is desired. Programs that can do that would be prime95, rc5des, setiathome; loop-demos of 3dmark, quake, unreal; endless recompiling of code, etc.
Ideally, use all of them.
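If you just want to see what such a load generator looks like, here is a minimal sketch that mixes integer, floating-point and hashing work and runs one worker per logical cpu. All names are hypothetical and an interpreted loop like this exercises far less of the chip than tuned tools such as prime95, so treat it as an illustration rather than a replacement for them.

```python
import hashlib
import math
import multiprocessing
import time

def mixed_work(n=20_000):
    """One round of integer, floating-point and hashing work."""
    acc = 0
    for i in range(1, n):
        acc = (acc * 31 + i * i) % 1_000_003                   # integer multiply and modulo
    x = sum(math.sqrt(i) * math.sin(i) for i in range(1, n))   # floating-point math
    return hashlib.sha256(f"{acc}:{x!r}".encode()).hexdigest() # hashing

def burn(seconds):
    """Repeat mixed_work for roughly `seconds`; return the number of rounds done."""
    deadline = time.monotonic() + seconds
    rounds = 0
    while time.monotonic() < deadline:
        mixed_work()
        rounds += 1
    return rounds

def burn_all_cores(seconds):
    """Start one burn() worker process per logical cpu and wait for all of them."""
    workers = [multiprocessing.Process(target=burn, args=(seconds,))
               for _ in range(multiprocessing.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

A call like burn_all_cores(3600) would keep every core busy for about an hour. On platforms where multiprocessing spawns fresh interpreters instead of forking, that call needs to sit under an `if __name__ == "__main__":` guard.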
How long to burn-in and what's next
After a couple of hours or weeks, depending on what your cpu is capable of, you could try the machine at the desired overclocked speed, with lower voltage. If you're lucky, it'll run smoothly. You could test the stability with the same programs you used for burning it in.
If it still hiccups, you may either need further burn-in, or you need to reevaluate other aspects of your machine (cooling, voltage, clockspeed etc).
Simply put, if a couple of weeks of burn-in didn't help, a couple of months probably won't either.
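One simple way to catch the "fails only once in a while" kind of instability is to repeat a computation whose result must always be the same and count how often it differs. The sketch below does that with a made-up workload; the function names, the seed and the workload itself are assumptions, and a clean run only shows that this particular code path did not fail, not that the system is stable.

```python
import hashlib

def workload(seed, n=50_000):
    """Deterministic computation: the same seed must always give the same digest."""
    h = hashlib.sha256()
    value = seed
    for _ in range(n):
        # 64-bit linear congruential step, fed into the hash
        value = (value * 6364136223846793005 + 1442695040888963407) % 2**64
        h.update(value.to_bytes(8, "little"))
    return h.hexdigest()

def count_mismatches(rounds, seed=12345, reference=None):
    """Run the workload `rounds` times and count results that differ from the reference.

    If no reference digest is given, the first result from this machine is used,
    which is weaker than a digest recorded at stock speed.
    """
    if reference is None:
        reference = workload(seed)
    return sum(1 for _ in range(rounds) if workload(seed) != reference)
```

Recording the reference digest at stock clock and voltage and passing it in later makes the check more meaningful, since an unstable machine could otherwise produce a wrong reference in the first place.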
If problems persist, you can either go hardcore and try some funny stuff like submerging your computer in mineral oil or get a can of liquid nitrogen to pour over your cpu, or you may have to face the hard truth of overclocking:
Nothing is guaranteed in overclocking.