Guys,
Please bear with me and share your experiences with me if you would.
I have looked everywhere for hints regarding this problem.
It all started when I saw many entries like this in /var/log/messages:
"smartd[2527]: Device: /dev/sda, 4 currently unreadable"
Then I booted with knoppix so I could run fsck on all partitions
including the root partition /dev/sda1.
I got a suitable backup superblock number (SBN) with
mke2fs -n /dev/sda1
and ran
fsck.ext3 -fvy -b <SBN> -B 4096 /dev/sda1
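For reference, "mke2fs -n" only prints what it would do, including a "Superblock backups stored on blocks:" list, and does not write to the disk. The list is only right if mke2fs is given the same parameters the filesystem was created with. Below is a minimal sketch of pulling the first backup superblock out of that output. The function name first_backup_superblock is made up for illustration, and the parsing assumes the usual layout where the block numbers follow that heading.

```shell
# Hypothetical helper: read `mke2fs -n` output on stdin and print the
# first backup superblock number listed after the
# "Superblock backups stored on blocks:" heading.
first_backup_superblock() {
    awk '
        /Superblock backups stored on blocks:/ { found = 1; sub(/.*blocks:/, "") }
        found {
            # Split on anything that is not a digit and print the first number.
            n = split($0, parts, /[^0-9]+/)
            for (i = 1; i <= n; i++)
                if (parts[i] != "") { print parts[i]; exit }
        }
    '
}

# Intended use (as root from the rescue system, filesystem unmounted):
#   SBN=$(mke2fs -n /dev/sda1 | first_backup_superblock)
#   fsck.ext3 -fvy -b "$SBN" -B 4096 /dev/sda1
```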
fsck performed many corrections in countless inodes, but no bad
blocks were reported. I know this is already bad. And I'm in denial.
The next thing that happened was that I could not log in to my system.
I set the system to boot at runlevel 3. The split-second message
after a failed login attempt was
Login: no shell: permission denied
When the system boots it gets me to the GRUB splash screen, and from
there I can go to single-user mode (SUMode). Once I got to the command
line (CLine) I checked all file and home directory permissions. I made
sure that files such as /etc/shells and /bin/bash were ok, and checked
permissions on /home and "/" itself. Things seemed to be in place, so I
decided to create a new account called "test". Booting and trying to
log in with the test account also failed.
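As far as I understand, login prints "no shell" when it fails to exec the user's shell, so "permission denied" points at the shell binary or a directory on the way to it. The manual checks above can be sketched as a small function. The name check_login_shell and its optional file arguments are made up for illustration, and it only covers the three checks shown.

```shell
# Hypothetical helper: report whether USER's login shell looks usable.
# Usage: check_login_shell USER [PASSWD_FILE] [SHELLS_FILE]
check_login_shell() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    shells_file=${3:-/etc/shells}

    # Field 7 of the passwd entry is the login shell.
    shell=$(awk -F: -v u="$user" '$1 == u { print $7; exit }' "$passwd_file")
    if [ -z "$shell" ]; then
        echo "no passwd entry or empty shell field for $user"; return 1
    fi
    if [ ! -x "$shell" ]; then
        echo "$shell is missing or not executable"; return 1
    fi
    # The shell must appear as a whole line in the shells file.
    if ! grep -qxF "$shell" "$shells_file"; then
        echo "$shell is not listed in $shells_file"; return 1
    fi
    echo "$shell looks ok"
}

# Example: check_login_shell test
```

A missing search (x) bit on any directory in the path gives the same error, so "ls -ld / /bin /bin/bash" is worth a look as well.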
I went back to SUMode and ran strace to check the "test" account:
strace -o /tmp/xout.strace -f su test
Well, I got a ton of messages like this one:
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
I got the same message for many files including /usr/X11R/bin/xauth,
/var/run/utmpx, /usr/share/local/en/tcsh, etc.
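ENOENT results like the /etc/ld.so.preload line also show up on a healthy system, since the dynamic loader and the shell probe for many optional files. The lines more likely to matter are calls failing with some other error, such as EACCES or EIO. Here is a sketch of filtering the trace for those; the function name strace_real_errors is made up for illustration.

```shell
# Hypothetical helper: print failed system calls from an strace log,
# leaving out the (usually harmless) ENOENT probes.
# Usage: strace_real_errors /tmp/xout.strace
strace_real_errors() {
    grep -E '= -1 E[A-Z0-9]+' "$1" | grep -v '= -1 ENOENT'
}
```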
Now, in SUMode I can log in to the test account with "su test" and pwd
shows /home/test. I can do "su -" and do mount, umount, fsck and such
on the other filesystems.
So here are my questions:
Should one not run fsck on the "/" filesystem, to avoid the mess I ran into?
Is giving the -y option to fsck a bad thing?
Did fsck remap blocks of data in a way that wiped important files in my system?
How is it that just enough damage was done that one can only log in from SUMode?
I read somewhere that you can replace bad sectors with
dd if=/dev/zero of=/dev/sda1 bs=4k
Will this bring the disk to a usable state so I can reinstall the OS?
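Note that the dd command as written has no count or seek, so it would overwrite all of /dev/sda1 with zeros, not just the bad sectors. A more targeted write needs the filesystem block that contains a failing sector. Here is a sketch of that arithmetic, assuming 512-byte sectors; the function name lba_to_fs_block and the example numbers are made up for illustration.

```shell
# Hypothetical helper: map a disk LBA to a filesystem block number.
# Usage: lba_to_fs_block LBA PARTITION_START_SECTOR FS_BLOCK_SIZE
# Assumes 512-byte sectors; the partition start sector is the one
# shown by `fdisk -lu`.
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Example with made-up numbers: LBA 1000063, partition starting at
# sector 63, 4096-byte blocks.
lba_to_fs_block 1000063 63 4096   # prints 125000
```

With that block number b, a single-block write would take the form "dd if=/dev/zero of=/dev/sda1 bs=4096 count=1 seek=b", which destroys whatever data the block held.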
I also ran "smartctl -d ata -t long /dev/sda", but saw nothing displayed
to give me a clue about the results of the test. How do I check this?
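The long self-test runs in the background on the drive itself, so the command returns right away; the results end up in the drive's self-test log, which "smartctl -l selftest" prints. Below is a sketch of reading it. The helper name selftest_failed_lbas is made up, and the parsing assumes the usual ATA log table, where LBA_of_first_error is the last column.

```shell
# Print the self-test log once the test has finished:
#   smartctl -d ata -l selftest /dev/sda

# Hypothetical helper: read that log on stdin and print the
# LBA_of_first_error column for every test that ended in a read failure.
selftest_failed_lbas() {
    awk '/^#/ && /read failure/ { print $NF }'
}

# Example: smartctl -d ata -l selftest /dev/sda | selftest_failed_lbas
```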
I know the easy way out is to "replace the hard drive", but I have been trained
in the "Admin Ways". I'd like to understand what's going on.
Thanks in advance.
~Aldo.