Hi,
When the main thread of a multi-threaded program calls 'pthread_exit' before
the other threads have exited, the other threads become 'invisible' to
commands like 'ps'. This problem was discussed here:
http://lkml.org/lkml/2004/10/5/234 and http://kerneltrap.org/node/3930, but
I can't find a patch or an explanation for it anywhere. The problem is seen
only with NPTL and not with LinuxThreads, because LinuxThreads does not let
the main thread exit (it puts it to sleep) until all other threads have exited.
The problem can be easily recreated with this simple program:
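#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Worker thread: outlives the main thread by about 40 seconds. */
void *run(void *arg)
{
	sleep(60);
	printf("Thread: Exiting\n");
	pthread_exit(NULL);
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, run, NULL);
	sleep(20);
	printf("Main: exiting\n");
	/* Exit only the main thread; the worker keeps running. */
	pthread_exit(NULL);
}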
After the main thread calls 'pthread_exit', it is shown as defunct. We
can still see the directory /proc/<pid_of_main_thread>/task using 'ls', and
'stat' on it succeeds, but 'open' on that directory fails with ENOENT.
Hence, although the other thread is still running, it can't be seen.
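The stat/open difference described above could be checked from a second
process with something like the sketch below. The helper name
probe_task_dir is hypothetical (it is not part of the report), and it uses
opendir() as a stand-in for the 'open' done by tools like 'ls'; treat it as
an illustration rather than the exact sequence those tools perform.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: checks whether /proc/<pid>/task can be stat'ed
 * and then opened. Returns 0 if both succeed, otherwise the errno of
 * the first call that failed.
 */
int probe_task_dir(long pid)
{
	char path[64];
	struct stat st;
	DIR *dir;

	snprintf(path, sizeof(path), "/proc/%ld/task", pid);

	if (stat(path, &st) != 0)
		return errno;	/* directory entry not visible at all */

	dir = opendir(path);
	if (dir == NULL)
		return errno;	/* the report sees ENOENT at this step */

	closedir(dir);
	return 0;
}
```

Calling probe_task_dir() with the pid of the main thread while the test
program's worker is still running should, per the description above, get
past stat() and return ENOENT from the open step on an affected kernel.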
The reason appears to be the call to __exit_fs from do_exit when the main
thread exits. This sets the 'fs' pointer in the task struct to NULL. It also
drops a reference on the fs structure (via __put_fs_struct), but does not
release the memory because the other thread still holds a reference.
When we do open() on /proc/<pid>/task, proc_root_link() (the flow is
open_namei -> may_open -> proc_permission -> proc_check_root ->
proc_root_link) tries to obtain the task_struct->fs of the main thread,
which is now NULL. So it returns ENOENT.
I think we can fix this problem with the following patch. We set the fs
pointer to NULL only if either the thread is not a thread group leader or
the whole thread group has exited. If the main thread is the last to exit,
it will set the fs pointer to NULL. However, if it is not the last, it won't
set the fs pointer to NULL, so that other threads can still use it. Behavior
of __put_fs_struct is not affected.
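To make the condition easier to reason about, here is a small userspace
model of it. Everything in it (toy_fs, toy_group, toy_task and the toy_*
functions) is a hypothetical stand-in, not kernel code. It assumes the
group's live count is decremented before the fs teardown runs, and it only
models whether a lookup through the leader finds a non-NULL fs pointer; it
does not model when the fs structure's memory is released.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct toy_fs    { int count; };	/* reference count only */
struct toy_group { int live; };		/* threads still alive  */
struct toy_task {
	struct toy_fs    *fs;
	struct toy_group *group;
	bool              is_leader;
};

/* Models __exit_fs(); 'patched' selects the proposed condition. */
static void toy_exit_fs(struct toy_task *t, bool patched)
{
	struct toy_fs *fs = t->fs;

	if (!fs)
		return;
	/* Unpatched: always clear. Patched: leader keeps fs while others live. */
	if (!patched || !t->is_leader || t->group->live == 0)
		t->fs = NULL;
	fs->count--;	/* models the reference drop in __put_fs_struct() */
}

/* Models one thread exiting (assumed order: live count, then fs). */
static void toy_exit(struct toy_task *t, bool patched)
{
	t->group->live--;
	toy_exit_fs(t, patched);
}

/* Models the /proc lookup via the leader; -1 stands in for -ENOENT. */
static int toy_proc_lookup(const struct toy_task *leader)
{
	return leader->fs ? 0 : -1;
}
```

In this model, a leader that exits while another thread is alive fails the
lookup when 'patched' is false and passes it when 'patched' is true; a
leader that exits last clears its pointer in both cases.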
Please let me know if this is reasonable or if there are other ways to fix
the problem.
Thanks and regards,
Sripathi.
Signed-off-by: Sripathi Kodi <[email protected]>
--- linux-2.6.13.1/kernel/exit.c 2005-09-12 02:46:26.000000000 -0500
+++ /home/sripathi/17794/patch_2.6.13.1/exit.c 2005-09-12 02:46:15.000000000 -0500
@@ -463,9 +463,11 @@ static inline void __exit_fs(struct task
 	struct fs_struct * fs = tsk->fs;
 
 	if (fs) {
-		task_lock(tsk);
-		tsk->fs = NULL;
-		task_unlock(tsk);
+		if (!thread_group_leader(tsk) || !atomic_read(&tsk->signal->live)) {
+			task_lock(tsk);
+			tsk->fs = NULL;
+			task_unlock(tsk);
+		}
 		__put_fs_struct(fs);
 	}
 }