On 10/17/07, Les Mikesell <[email protected]> wrote:
> Jacques B. wrote:
> >
> > 3 - How can you effectively troubleshoot an existing problem when a
> > past one was not dealt in such a manner as to ensure that it was
> > corrected
>
> How can you use a system that does not have an effective troubleshooting
> mechanism regardless of how it got into its current state? The simple
> traditional unix mechanism is something you can easily understand and
> verify.

Ok, now we are arguing for the sake of arguing. It depends on how you define a troubleshooting mechanism. Some would argue that a listserv is a troubleshooting mechanism. You post your problem. More experienced people give you advice on what to try or what to post to help them help you. This list has served as a troubleshooting mechanism for many, many people. But like any other troubleshooting mechanism, it will only work if you follow the steps, in this case those recommended to you by more experienced people.

> > (the intrusion incident being the most notable one but I'm
> > sure others on the list could identify other past issues that were
> > potentially not dealt with adequately based on what was posted in
> > those threads). The existing problem could be a domino effect from a
> > past problem and may never be properly dealt with until the underlying
> > issue is dealt with.
>
> Regardless, you should have a way to check and fix it, unless what you
> are running is unimportant and you can abandon it.

There are ways to check and patch an intrusion. But it is beyond the abilities of most people to do so with confidence that no vulnerability has been left behind as a result of the intrusion. On the contrary, if you are running something important, that is exactly when you will most likely wipe and re-install and/or restore from backups. If it's important, you cannot risk trying to clean the system and hoping you didn't miss anything.

Re-read the thread where Karl reported the intrusion and you will see the advice offered (I'm not going to re-post all that here): the things that would have to be checked if he opted not to wipe & re-install. Did he do all that? All indications from his postings are that he did not; he just turned off sshd and changed his account password. Running a rootkit revealer would be one thing among many others he should have done (ideally from a bootable Linux CD). A wipe, on the other hand, guarantees a clean system (of course you back up your user files and possibly some config files that you may use as reference when creating the new ones).

If his system is compromised, how do we know that some info isn't being excluded from the logs? How do we know that some of the running processes are not hidden? How do we know that some open ports are not hidden? All of this, of course, thanks to compromised binaries that purposely exclude that info from their output. So how can you properly troubleshoot an issue when you don't know if you can trust what's in your logs and what's being reported by ls, ps, mount, fdisk, etc.?

You can't honestly suggest that there should be a tool that can check your entire system for any evidence of intrusion and fix it? What if you are using ssh? A web server? sendmail? procmail? mysql? php? telnet? Custom applications that require uncommon ports open? Would it check the config files for those services and know if there is a problem in them? Check all the binaries, including a custom kernel? Check all the shell scripts? Check drive partitions and drive mappings? Check all alias settings? Check for user accounts that don't belong? Check the users' path to ensure it is still right? Check SELinux settings? Check iptables settings? Check the hosts.allow, hosts.deny, and hosts files? Check the routing file? Check for static arp entries? And the list goes on and on and on. Do you honestly believe that such a tool could be written? And I'm sure I've missed lots of other stuff.

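To make the point about untrustworthy binaries concrete, here is a minimal sketch in Python of just one of the checks listed above: comparing files against known-good hashes. It assumes a baseline of SHA-256 hashes was recorded before the compromise and stored off the machine, and that the script is run from trusted media (such as the bootable Linux CD mentioned above) with the suspect disk mounted. The function names, the baseline format, and the mount path in the usage note are hypothetical and only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_to_baseline(root: Path, baseline: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare regular files under root with a {relative path: sha256} baseline.

    Returns lists of modified, missing, and unexpected relative paths.
    Symlinks are skipped to keep the sketch simple.
    """
    report: Dict[str, List[str]] = {"modified": [], "missing": [], "unexpected": []}
    seen = set()
    for path in sorted(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        seen.add(rel)
        expected = baseline.get(rel)
        if expected is None:
            # Present on disk but never recorded in the baseline.
            report["unexpected"].append(rel)
        elif sha256_of(path) != expected:
            # Contents differ from the known-good copy.
            report["modified"].append(rel)
    # Recorded in the baseline but no longer on disk.
    report["missing"] = sorted(set(baseline) - seen)
    return report
```

For example, calling compare_to_baseline(Path("/mnt/suspect"), baseline) would flag a replaced ls or ps, but only if the baseline itself is trustworthy. Even then it says nothing about config files, user accounts, firewall rules, or the rest of the list above, which is why a check like this is no substitute for a wipe and re-install.
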
If a person keeps using a system that was compromised after having only turned off sshd and changed the user's password, then that person likes to live dangerously. And any problems that arise after this intrusion become potentially more difficult to troubleshoot, because you don't know if you can rely on your binaries, on your logs, on anything... That incident was just one more example (along with this one with SELinux and others from the past) where troubleshooting advice was not properly followed, and so properly troubleshooting the problem becomes impossible.

I've said all I can say on this issue. Beyond this I'd be engaging in warmed-over arguments, which I don't wish to do. We have gone off on a tangent from the original topic. We may have to agree to disagree. Clearly there are two camps on this issue and neither is prepared to concede any of its arguments. I'm moving on...

Jacques B.