James Wilkinson wrote:
I really shouldn't be joining in this thread again. It's obvious Mike
can't quite get the hang of a simple concept...
The simple concept which I missed (according to Gordon Messmer)
is that the software which is consuming the CPU is already
known. I missed that.
What I've been proposing is to find what piece of software was
consuming the CPU. Somehow, I missed the fact that we *already*
know what piece of software is doing so, and it is normal
operation for that.
I have never claimed that he doesn't have a hardware problem
with heat.
[snip]
If, as appears *highly* probable, the system really was overheating,
*whatever* Linux does it logically cannot be responsible.
That is not an a-priori fact. In this case, everyone but me seemingly
knew that the culprit was already known, which put what I wrote
off-base. Sorry about that.
Hardware should not overheat whatever software does, so whatever Linux
is doing, the hardware should still not overheat.
I never claimed that it should.
If it was happening in Windows, Windows would not be responsible. It
*cannot* be a software bug!
It can be a symptom of a software defect, which was my only claim.
There is, however, a related argument you could make: that it is
important to find out exactly what is happening so that whoever is
appropriate can stop it happening again. But for this we need to know
such things as:
* where did the error messages come from;
* how hot does that processor actually get;
* are any temperature probes correctly configured.
But these are different questions, and we'd want to gather different
data. The question "what was going on" is relatively unimportant -- we
know the Original Poster was running yum, and we *know* that stresses
the processor.
But, unfortunately, I missed that point, and caused this big
furor.
For which I apologize.
If thinking we should take every opportunity to investigate
unusual behavior for possible defects is being an "idiot", then
every industry in the world which considers availability and
reliability in software to be important is full of idiots.
This includes telecom, aviation, and power systems at
least.
These industries have the sense to know that reliable software is
dependent on reliable hardware. If the Original Poster's hardware is not
reliable, crashing software is expected. Indeed, fly-by-wire aeroplanes
are designed around the possibility that the computers will fail, and
usually have multiple redundant "voting" systems to identify and
neutralise rogue systems.
Re-read that in light of the fact that I missed that someone
had already identified the software culprit. That was the
only point I was ever trying to make: that we should identify
the software eating the CPU, and ascertain whether that was
considered to be normal behavior for that software.
That I missed this fact is an oversight on my part, for which
I apologize.
Mike
--
p="p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
You have found the bank of Larn.
I can explain it for you, but I can't understand it for you.
I speak only for myself, and I am unanimous in that!