Grub Disk Error

clarjon1 [clarjon1 at gmail.com]
Mon, 27 Nov 2006 08:43:45 -0500

This is a continuation of a discussion begun last month http://linuxgazette.net/133/misc/lg/Grub_Disk_Error.html - Kat

On 11/24/06, Benjamin A. Okopnik <ben@linuxgazette.net> wrote:

>
> Yep. The first thing to look at - and, of course, the first thing that
> Jonathan should have sent in - is the content of his GRUB configuration
> and the output of 'displaymem' and 'geometry' GRUB commands. However, at
> this point, the advice that he got from Neil should be sufficient to fix
> the problem. If it's not, then it'll be time to come back here, to
> checkpoint #1, and try again.
>

Thanks guys. Neil's suggestion worked (with a little tweaking). Here's what I did to fix it:

grub> root (hd0,0)
grub> find /boot/grub/stage1
grub> setup (hd0)
Now it works.

Ben, you are right: I should have brought in the config files. I'll remember that for next time. Thanks for your help! And to clarify, the error from GRUB upon boot was:

Grub Hard Disk Error

Not much help. I was able to boot because the LiveCD I installed from has an option to boot hda1. Very useful. Once again, thanks for the help.




clarjon1 [clarjon1 at gmail.com]
Tue, 28 Nov 2006 08:53:25 -0500

Hello again, the problem won't go away!

I'm not sure, but I think the BIOS could be at fault, or it could be that my HDD is becoming flaky. My data seems to be intact... Anyways, to describe the problem.

I did the whole install back to the MBR, as was in my last post, and that worked...

For a while...

I rebooted, and Grub was there. I thought, all was fixed. So, after a while, I powered down my machine, to save power since I wasn't going to use it for a while. I come back that afternoon, and <slight sarcasm> you'll never guess what was back... </sarcasm>

So I think to myself, it could be the BIOS/CMOS, so I tell it to redetect the drive. It had said it was 20020 MB, so I hit Enter (which, in my CMOS, tells it to detect the drive and save its settings as a 'USER' configured drive). Suddenly, it was a 20019 MB drive. All other values (as best as I can remember) stayed the same. So, I figured that was the problem, no need to worry.

Re-install grub, reboot (Yay! A menu!), turn off, leave for a couple of hours, turn back on, and GRUB Disk Error. CTRL-ALT-DEL, enter CMOS, and try redetecting. Now, it's a 20027MB drive, no changes to heads, platters, etc. Repeat. Still doing it.

Any ideas what could be wrong? It's kinda annoying, but no problems with my data on my HDD.

Still can boot, thanks to LiveCD's Boot from HDA1 option. Any advice would be nice!
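
[ A rough way to put the changing sizes in perspective: under a CHS geometry, capacity is cylinders x heads x sectors-per-track x 512 bytes. The sketch below assumes 16 heads and 63 sectors per track and a "MB" of 10^6 bytes; none of those values are given in the thread, so substitute whatever the CMOS screen actually shows. With these assumed values, one MB is roughly two cylinders, so a size that drifts by a few MB while heads and sectors stay put would suggest the detected cylinder count is not stable. ]

```python
SECTOR_BYTES = 512  # classic sector size for IDE drives of this era


def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Capacity implied by a cylinders/heads/sectors geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES


def cylinders_for_mb(size_mb, heads, sectors_per_track, mb=10**6):
    """Whole cylinders that fit into size_mb (rounded down)."""
    return (size_mb * mb) // chs_capacity_bytes(1, heads, sectors_per_track)


if __name__ == "__main__":
    # Assumed geometry, NOT taken from the thread: 16 heads, 63 sectors/track.
    heads, sectors = 16, 63
    for size_mb in (20020, 20019, 20027):  # the three sizes reported above
        print(size_mb, "MB ->", cylinders_for_mb(size_mb, heads, sectors), "cylinders")
```

If the BIOS reports sizes in units of 2^20 bytes instead, pass mb=2**20; the absolute numbers change but the conclusion (a few cylinders of drift) stays about the same.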




Steve Brown [steve.stevebrown at gmail.com]
Tue, 28 Nov 2006 14:37:48 +0000

Hi,

I'm not much of an expert, but all of my PCs are made up from donations and some of the drives are a bit iffy. Maybe I can help you to troubleshoot?

On 28/11/06, clarjon1 <clarjon1@gmail.com> wrote:

> I'm not sure, but I think the BIOS could be at fault,

Possible, but the problem appears to be that the MBR is either corrupt or gets overwritten at some point.
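
[ One way to test the "MBR gets overwritten" idea is to save a copy of the first sector right after reinstalling GRUB and compare it after the next cold boot. The following is a minimal sketch, not a vetted tool: the device name /dev/hda is an assumption based on the "boot hda1" option mentioned earlier, the snapshot file name is hypothetical, and reading a raw disk needs root (for example from the LiveCD). ]

```python
import hashlib
import os

MBR_BYTES = 512  # the MBR is the first 512-byte sector of the disk


def read_mbr(device_path):
    """Return the first 512 bytes of a disk or disk image."""
    with open(device_path, "rb") as disk:
        return disk.read(MBR_BYTES)


def has_boot_signature(mbr):
    """True if the sector ends with the 0x55 0xAA boot signature."""
    return len(mbr) == MBR_BYTES and mbr[510:512] == b"\x55\xaa"


def mbr_changed(before, after):
    """Compare two MBR snapshots by SHA-256 digest."""
    return hashlib.sha256(before).digest() != hashlib.sha256(after).digest()


def check(device_path, snapshot_path):
    """Save a snapshot on first run; report whether it changed on later runs."""
    current = read_mbr(device_path)
    if not os.path.exists(snapshot_path):
        with open(snapshot_path, "wb") as out:
            out.write(current)
        return "snapshot saved"
    with open(snapshot_path, "rb") as saved:
        changed = mbr_changed(saved.read(), current)
    return "MBR changed" if changed else "MBR unchanged"


# Hypothetical usage, run as root:
#   print(check("/dev/hda", "mbr_snapshot.bin"))
```

If the snapshot still matches after the error has come back, the boot code on disk is intact and the fault more likely lies in how the BIOS addresses the drive.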

> be that my HDD is becoming flaky.  My data seems to be intact...

I'd make sure at this point that you have a very current backup. You may lose the drive suddenly if it's getting really bad.

> I did the whole install back to the MBR, as was in my last post, and
> that worked...
> For a while...
> I rebooted, and Grub was there.  I thought, all was fixed.  So, after
> a while, I powered down my
> machine, to save power since I wasn't going to use it for a while.  I
> come back that afternoon,
> and <slight sarcasm> you'll never guess what was back... </sarcasm>

So, an immediate reboot and all's well, but a longer shutdown and Grub's gone?

> So I think to myself, it could be the bios/CMOS, so I tell it to
> redetect the drive.  It had
> said it was 20020 MB, so I hit enter (which in my CMOS, tells it to
> detect the drive, and save its
> settings as a 'USER' configured drive.)  Suddenly, it was a 20019MB
> drive.  All other values
> (as best as I can remember) stayed the same.  So, I figured that was
> the problem, no need to worry.

Does this process actually change the problem at all? Does re-detecting the drive solve the problem - even for a short time? What if you re-installed grub without detecting the drive?

> Re-install grub, reboot (Yay!  A menu!), turn off, leave for a couple
> of hours, turn back on, and GRUB Disk Error.

Okay so you answered my earlier question.

Some thoughts:

If it's the CMOS battery then the BIOS will reset back to preset defaults. You should be able to confirm this by changing an unrelated setting (unrelated to the drives) and comparing after the problem crops up again - if it has reset to default settings I'd think about changing the battery. Some can be swapped out quite easily, some need soldering; I've no idea how easy they are to obtain, though.
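
[ The same before-and-after comparison can be done from a running system instead of by eye. This is only a sketch under assumptions: on many x86 Linux systems the nvram driver exposes part of the CMOS memory as /dev/nvram, but that device may be absent on a given machine, and the snapshot file names are hypothetical. The comparison itself is plain byte-by-byte diffing. ]

```python
def read_snapshot(path):
    """Read a whole snapshot file (or, as an assumption, /dev/nvram as root)."""
    with open(path, "rb") as source:
        return source.read()


def diff_bytes(before, after):
    """List (offset, old, new) for every byte that differs between snapshots."""
    if len(before) != len(after):
        raise ValueError("snapshots differ in length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Hypothetical usage, run as root:
#   before = read_snapshot("nvram_before.bin")  # copy taken before powering down
#   after = read_snapshot("/dev/nvram")         # current contents after the failure
#   for offset, old, new in diff_bytes(before, after):
#       print(f"offset {offset}: {old:#04x} -> {new:#04x}")
```

An empty result after a failed boot would argue against the battery; many changed bytes would argue for it.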

Also, if there are no changes to the drive data itself, then simply resetting the BIOS to the correct settings should mean that your drive would boot into Grub okay.

Is there an anti-virus setting in the BIOS that re-writes the boot sector in case a virus overwrites it? My flaky memory is tapping me on the shoulder and saying that this is the case on at least one of my machines. Maybe Grub is being treated as an unwelcome virus?

I feel that if it were to be the hard drive, there'd be other data problems.

My Ubuntu system messes up my Grub settings after a kernel update (I've added drives and moved partitions since I installed the systems; it works sweet providing I re-install Grub), but then I know when my system has been changed. Is there a script that runs as a cron job that alters Grub? I'm really clutching at straws now though.

I hope I've been some help and not a hindrance,

Steve




Benjamin A. Okopnik [ben at linuxgazette.net]
Tue, 28 Nov 2006 11:08:00 -0500

On Tue, Nov 28, 2006 at 02:37:48PM +0000, Steve Brown wrote:

> On 28/11/06, clarjon1 <clarjon1@gmail.com> wrote:
> 
> > For a while...
> > I rebooted, and Grub was there.  I thought, all was fixed.  So, after
> > a while, I powered down my
> > machine, to save power since I wasn't going to use it for a while.  I
> > come back that afternoon,
> > and <slight sarcasm> you'll never guess what was back... </sarcasm>

<humor type="mild"> OH-OH. Do you realize what you just did? You opened a '<slight sarcasm>' tag, but you "closed" it with '</sarcasm>' - which didn't match the first tag and therefore left it open. As a result, everything here in TAG will be slightly sarcastic from now on. Man, are you in trouble! </humor>

> So, an immediate reboot and all's well, but a longer shutdown and Grub's gone?

I do believe that you're looking around in the right area. A bit more info from Jon wouldn't hurt, but that's what this is starting to look like to me as well.

> Some thoughts:
> 
> If it's the CMOS battery then the BIOS will reset back to preset
> defaults. You should be able to confirm this by changing an unrelated
> setting (unrelated to the drives) and comparing after the problem
> crops up again - if it has reset to default settings I'd think about
> changing the battery. Some can be swapped out quite easily, some need
> soldering, I've no idea how easy they are to obtain though.

Usually, battery problems produce a series of POST beeps and a diagnostic message from the motherboard ROM - all of which happens before GRUB ever gets a chance at it. However, I can visualize a brain-dead motherboard that would just bypass that part of the POST (I've seen all sorts of weirdness like that in the past) and leave the reporting to the OS - in which case the result would be pretty much what's reported. Replacing the battery (assuming it's the easily removable type, which most are these days) is the smart way to go - particularly if it's one of the widely-available watch battery types.

Just to check, though: Jon, you're not hearing any weird beeps or seeing any odd messages before GRUB freaks out - right? The POST - that is, the bit where you normally get a single beep and a single flash from the keyboard LEDs almost immediately after you turn on the computer - is very important, although (to most people) it's nearly invisible. If it changes in any way from the usual sequence, that's a great big loud message that Something Is Definitely Wrong. (Historical note: my all-time favorite POST message was the famous "Keyboard not found, press any key to continue"; the second place is occupied by the user-panic-inducing "ROM BASIC NOT FOUND" - which latter has sadly gone away now that IBM doesn't do its own BIOS messages anymore.)

> Is there an anti-virus setting on the BIOS that re-writes the boot
> sector in case a virus overwrites it? My flaky memory is tapping me on
> the shoulder and saying that this is the case on at least one of my
> machines. Maybe Grub is being treated as an unwelcome virus?

That would be a seriously weird thing to do. I do know that some BIOSes had an "anti-virus" option, although I never knew exactly what it did, but rewriting the boot sector would be like shooting yourself in the head at the first sign of a cold - it would "fix" the problem, but there would be some minor problems associated with it. (True, disposing of the body would be someone else's problem, but still.) I'm sure that Micr0s0ft would love the idea, but I'm pretty certain that no BIOS manufacturer would tie themselves to a single OS that way.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Steve Brown [steve.stevebrown at gmail.com]
Tue, 28 Nov 2006 16:47:55 +0000

On 28/11/06, Benjamin A. Okopnik <ben@linuxgazette.net> wrote:

> On Tue, Nov 28, 2006 at 02:37:48PM +0000, Steve Brown wrote:
> > On 28/11/06, clarjon1 <clarjon1@gmail.com> wrote:
> >
> > > For a while...
> > > I rebooted, and Grub was there.  I thought, all was fixed.  So, after
> > > a while, I powered down my
> > > machine, to save power since I wasn't going to use it for a while.  I
> > > come back that afternoon,
> > > and <slight sarcasm> you'll never guess what was back... </sarcasm>
>
> <humor type="mild">
> OH-OH. Do you realize what you just did? You opened a '<slight sarcasm>'
> tag, but you "closed" it with '</sarcasm>' - which didn't match the first
> tag and therefore left it open. As a result, everything here in TAG will
> be slightly sarcastic from now on. Man, are you in trouble!
> </humor>

An ideal opportunity to introduce a </slight sarcasm> then?

Or does that only match your <sarcasm> ?

Aaargh, my brain is melting - will this end the madness: </slight sarcasm></slight sarcasm>

Nope, sadly, I'm still mad :(

>
> > Is there an anti-virus setting on the BIOS that re-writes the boot
> > sector in case a virus overwrites it? My flaky memory is tapping me on
> > the shoulder and saying that this is the case on at least one of my
> > machines. Maybe Grub is being treated as an unwelcome virus?
>
> That would be a seriously weird thing to do. I do know that some
> BIOSes had an "anti-virus" option, although I never knew exactly what it
> did, but rewriting the boot sector would be like shooting yourself in
> the head at the first sign of a cold - it would "fix" the problem, but
> there would be some minor problems associated with it. (True, disposing
> of the body would be someone else's problem, but still.) I'm sure that
> Micr0s0ft would love the idea, but I'm pretty certain that no BIOS
> manufacturer would tie themselves to a single OS that way.

Ah yes, you see my flaky memory is also a liar - tells me just enough of a plausible answer to make a fool of myself in public, while it sits in a deckchair laughing. He also enjoys my entering rooms only to have to stand there wondering why I'm there at all, only to remind me once I've sat down again. I thought the notebook would do it - but I've misplaced it somewhere.

Steve




Benjamin A. Okopnik [ben at linuxgazette.net]
Tue, 28 Nov 2006 13:28:23 -0500

On Tue, Nov 28, 2006 at 04:47:55PM +0000, Steve Brown wrote:

> On 28/11/06, Benjamin A. Okopnik <ben@linuxgazette.net> wrote:
> >
> > <humor type="mild">
> > OH-OH. Do you realize what you just did? You opened a '<slight sarcasm>'
> > tag, but you "closed" it with '</sarcasm>' - which didn't match the first
> > tag and therefore left it open. As a result, everything here in TAG will
> > be slightly sarcastic from now on. Man, are you in trouble!
> > </humor>
> 
> An ideal opportunity to introduce a </slight sarcasm> then?

But that wouldn't be any fun. If you can't snark at mis-formatted imaginary tags... well, I ask you - is life worth living at that point?

> Or does that only match your <sarcasm> ?

Nah. Mine is single-quoted in order to defang it. Otherwise, we'd have multiple instances of mild sarcasm running, and they just might be additive. [shudder]

> Aaargh, my brain is melting - will this end the madness: </slight
> sarcasm></slight sarcasm>
> 
> Nope, sadly, I'm still mad :(

It's all about the prions, really. I made an appointment with a physicist and had mine extracted early on; unfortunately, as you've just learned, that just takes away the excuse, not the madness.

> > That would be a seriously weird thing to do. I do know that some
> > BIOSes had an "anti-virus" option, although I never knew exactly what it
> > did, but rewriting the boot sector would be like shooting yourself in
> > the head at the first sign of a cold - it would "fix" the problem, but
> > there would be some minor problems associated with it. (True, disposing
> > of the body would be someone else's problem, but still.) I'm sure that
> > Micr0s0ft would love the idea, but I'm pretty certain that no BIOS
> > manufacturer would tie themselves to a single OS that way.
> 
> Ah yes, you see my flaky memory is also a liar - tells me just enough
> of a plausible answer to make a fool of myself in public, while it
> sits in a deckchair laughing.

Nah. Computers are just weird, that's all - and BIOS writers who put in "anti-virus protection" are weirder still.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Rick Moen [rick at linuxmafia.com]
Tue, 28 Nov 2006 12:01:01 -0800

Quoting Steve Brown (steve.stevebrown@gmail.com):

> Or does that only match your <sarcasm> ?
> 
> Aaargh, my brain is melting - will this end the madness: </slight
> sarcasm></slight sarcasm>
> 
> Nope, sadly, I'm still mad :(

</insanity>

You're welcome.




clarjon1 [clarjon1 at gmail.com]
Tue, 28 Nov 2006 15:27:08 -0500

> Nah. Computers are just weird, that's all - and BIOS writers who put in
> "anti-virus protection" are weirder still.
>

Nope Ben, no AntiVirus thingy. I have seen those antivirus things (a real pain when installing Linux), and it just halts the PC when it detects a change to the MBR.

I'm suspicious of the changing HD size in the CMOS, so I'm putting my money on a somewhat flaky hard drive. It used to drop LILO about, I dunno, once every two months or so, forcing me to find a live CD to chroot. :) I should see about downloading LILO for FreeSpire, and see if that helps.




Benjamin A. Okopnik [ben at linuxgazette.net]
Tue, 28 Nov 2006 16:08:19 -0500

On Tue, Nov 28, 2006 at 03:27:08PM -0500, clarjon1 wrote:

> > Nah. Computers are just weird, that's all - and BIOS writers who put in
> > "anti-virus protection" are weirder still.
> 
> Nope Ben, no AntiVirus thingy.  I have seen those antivirus things
> (real pain when installing Linux), and it just halts the PC when it
> detects a change to the MBR.

So... if it halts the PC... how are you supposed to fix it? I assume that some sort of a "Doom on you" [1] message pops up, and you're grudgingly allowed to get the keys to the kingdom... but if that's what you have to deal with every time you change the MBR, it may be time to change BIOS vendors.

> I'm suspicious of the changing HD size in the CMOS, so I'm putting my
> money on a somewhat flaky hard drive.  

It could be, but I doubt it. HD electronics are usually pretty much binary in operation - and magnetic signatures don't fade like they used to.

[1] This being how Cpt. Dick Marcinko (USN[Ret]), author of the various "Rogue Warrior" books, glosses "du-ma-nhieu" - for which see Google or any unexpurgated Vietnamese dictionary. :)

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Suramya Tomar [security at suramya.com]
Tue, 28 Nov 2006 18:55:48 -0500

Hey,

> So... if it halts the PC... how are you supposed to fix it? I assume
> that some sort of a "Doom on you" [1] message pops up, and you're
> grudgingly allowed to get the keys to the kingdom... but if that's what
> you have to deal with every time you change the MBR, it may be time to
> change BIOS vendors.

It doesn't actually halt the PC. It gives you a flashing screen with a warning message (complete with annoying beeping) telling you that the MBR is being changed, and you have to press Y to continue. At least that was my experience with antivirus in the BIOS. I guess this used to be useful when the only OSes around were DOS and Windows and boot sector viruses were common.

This feature saved my system once when I got hit with the Die Hard2 virus (the first and only time that happened, and from a disk given to me by a friend containing the latest antivirus program, at that). I was installing the software when all of a sudden the system started beeping and I saw flashing lights (scared the crap out of me). It took me about 6 hours to recover my system from that...

> It could be, but I doubt it. HD electronics are usually pretty much
> binary in operation - and magnetic signatures don't fade like they used
> to.

I don't think you should use this drive for any important data... It doesn't look like it's reliable anymore.

Oh, try using a different cable to connect the drive to the IDE controller, and maybe try connecting it to a different controller. It might be the cable or the controller that's bad.

Hope that helps.

- Suramya

-- 
Name : Suramya Tomar
Homepage URL: http://www.suramya.com
************************************************************




clarjon1 [clarjon1 at gmail.com]
Wed, 29 Nov 2006 08:46:57 -0500

On 11/28/06, Suramya Tomar <security@suramya.com> wrote:

> Don't think you should use this drive for any important data... Doesn't
> look like it's reliable anymore.

It's not the drive, it's gotta be the bios/CMOS, or something... I tried disabling the drive in the bios, booted via CD, installed GRUB to my second HDD, then rebooted. All was fine. Of course, I was suspicious of how fine things were, so I turned it off, left it for 10 minutes this time, and...

Grub Hard Disk Error

Grr....

>
> Oh, try using a different cable to connect the drive to the IDE
> controller and maybe try connecting to a different controller. 'cause it
> might be that the cable or controller that's bad.
>

I'll have to try that when I get home tonight... Both HDDs are on the same cable, but only one of them was giving troubles from the CMOS.

Heh, well, I can't complain that Linux is boring, every time I think I've got everything fixed up, and get kinda bored, I seem to either break something or something fails on me.

> Hope that helps.
>




Steve Brown [steve.stevebrown at gmail.com]
Wed, 29 Nov 2006 14:06:53 +0000

On 29/11/06, clarjon1 <clarjon1@gmail.com> wrote:

> On 11/28/06, Suramya Tomar <security@suramya.com> wrote:
> > Oh, try using a different cable to connect the drive to the IDE
> > controller and maybe try connecting to a different controller. 'cause it
> > might be that the cable or controller that's bad.
> >
>
> I'll have to try that when I get home tonight...  Both HDDs are on the
> same cable,
> but only one of them was giving troubles from the CMOS.
>

That's because only one of them had an MBR.

Can you get another battery? It really does sound like there's just enough charge for it to keep the Bios settings for a minute before failing. Or do you have a spare motherboard?




clarjon1 [clarjon1 at gmail.com]
Wed, 29 Nov 2006 11:50:31 -0500

On 11/29/06, Steve Brown <steve.stevebrown@gmail.com> wrote:

> Can you get another battery? It really does sound like there's just
> enough charge for it to keep the Bios settings for a minute before
> failing. Or do you have a spare motherboard?

Oh, I forgot to mention that it doesn't seem to be the battery. It's keeping all of the settings, like date, time, boot options, etc... It's just that the thing can't decide a permanent drive size. I dunno. And I don't have any spare motherboards for this.

Of course, I could just accept this annoyance as something beyond my control, and move on...




Bob van der Poel [bob at mellowood.ca]
Wed, 29 Nov 2006 10:02:24 -0700

clarjon1 wrote:

> On 11/29/06, Steve Brown <steve.stevebrown@gmail.com> wrote:
> 
>> Can you get another battery? It really does sound like there's just
>> enough charge for it to keep the Bios settings for a minute before
>> failing. Or do you have a spare motherboard?
> 
> Oh, I forgot to mention that it doesn't seem to be the battery.  It's keeping
> all of the settings, like date, time, boot options, etc...  It's just that the
> thing can't decide a permanent drive size.  I dunno.  And I don't have any spare
> motherboards for this.
> 
> Of course, I could just accept this annoyance as something beyond my control,
> and move on...

A complete stab in the dark ... but I was having some goofy problems a while ago on my machine after upgrading to a newer kernel/distro. For some reason I had the "spin down HD after 15 minutes" setting enabled in the BIOS power-save settings. After resetting that to "never", a whole bunch of problems went away. I have no idea what really was happening, but I sense that this set up some kind of conflict between the BIOS and Linux power settings.

Which brings up a point ... if I am correct in assuming that the BIOS is disabled after Linux boots, why would a setting like that cause a problem?

-- 
Bob van der Poel ** Wynndel, British Columbia, CANADA **
EMAIL: bob@mellowood.ca
WWW:   http://www.mellowood.ca



Steve Brown [steve.stevebrown at gmail.com]
Wed, 29 Nov 2006 17:22:03 +0000

On 29/11/06, clarjon1 <clarjon1@gmail.com> wrote:

> On 11/29/06, Steve Brown <steve.stevebrown@gmail.com> wrote:
>
> > Can you get another battery? It really does sound like there's just
> > enough charge for it to keep the Bios settings for a minute before
> > failing. Or do you have a spare motherboard?
>
> Oh, I forgot to mention that it doesn't seem to be the battery.  It's keeping
> all of the settings, like date, time, boot options, etc...  It's just that the
> thing can't decide a permanent drive size.  I dunno.  And I don't have any spare
> motherboards for this.

Okay ...

What about the other drives you have installed? You mentioned that you have two drives on one cable, does this mean that you have the CDROM on the other cable? And if so can you swap this hard drive to that other cable? Are the cables seated properly? And are they set up as master/slave correctly or do they use cable select?

If it's only the drive settings that change, and it's not the hard drive (both drives have the same problem), then it must be the controller. If swapping the drives removes the problem then we've proved that idea, but I'd guess your motherboard needs repair/replacement.

I can't think what else it could be, and we're right at the limit of my knowledge. Literally change everything one item at a time: at some point you'll either swap out the fault (drives, cables, controller, etc.) or ultimately be left with it (your motherboard).

> Of course, I could just accept this annoyance as something beyond my control,
> and move on...

Now where's the fun in that?

