Question regarding pthread_cancel and pthread_cond_timedwait

We have a threading library that has been in production for
six years and currently runs on Solaris 2.6-2.9 SPARC,
Solaris 2.7-2.10 x86, HP-UX 11.00, Tru64 5.1(a,b), AIX 4.3.x
and AIX 5.x.

The library starts 5-8 threads within the current process and
the operation runs to completion (with or without error). Depending
on what happened during processing, the threads either complete on
their own or are canceled and then complete.

This sequence is later repeated N times without the main process exiting. The threads are NOT detached.

The problem occurs on Fedora Core 3 when pthread_cancel is called with the thread ID of a thread which has already completed.

If a thread has exited and we call pthread_cancel with that thread ID on Fedora Core 3
( version info
getconf GNU_LIBPTHREAD_VERSION
NPTL 2.3.4
>uname -a
Linux irl-73-26 2.6.10-1.770_FC3 #1 Thu Feb 24 14:00:06 EST 2005 i686 i686 i386 GNU/Linux
)


the application segfaults.  Is this the expected behavior?

I am also getting a segfault when pthread_cond_timedwait is called; I am still determining the
exact state when the segfault occurs. The backtrace shows:


#0 0x005c57a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x00839dbc in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0


The directory listing shows:
ls -l /lib/tls/
total 1936
drwxr-xr-x 2 root root 4096 Mar 23 04:03 i486
drwxr-xr-x 2 root root 4096 Mar 23 04:03 i586
drwxr-xr-x 2 root root 4096 Mar 23 04:03 i686
-rwxr-xr-x 1 root root 1524828 Dec 21 02:04 libc-2.3.4.so
lrwxrwxrwx 1 root root 13 Mar 22 18:42 libc.so.6 -> libc-2.3.4.so
-rwxr-xr-x 1 root root 215272 Dec 21 02:04 libm-2.3.4.so
lrwxrwxrwx 1 root root 13 Mar 22 18:42 libm.so.6 -> libm-2.3.4.so
-rwxr-xr-x 1 root root 108560 Dec 21 02:04 libpthread-2.3.4.so
lrwxrwxrwx 1 root root 19 Mar 22 18:42 libpthread.so.0 -> libpthread-2.3.4.so
-rwxr-xr-x 1 root root 50984 Dec 21 02:04 librt-2.3.4.so
lrwxrwxrwx 1 root root 14 Mar 22 18:42 librt.so.1 -> librt-2.3.4.so
-rwxr-xr-x 1 root root 32308 Dec 21 02:04 libthread_db-1.0.so
lrwxrwxrwx 1 root root 19 Mar 22 18:42 libthread_db.so.1 -> libthread_db-1.0.so



Is this what NPTL on Fedora Core 3 does TODAY? Or is there a problem in the sequence in which our code releases mutexes or condition variables that would cause this behavior on Fedora Core 3?
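
For reference, here is a minimal sketch of a cancel-safe timed wait, not taken from our library. The names gate, gate_wait and gate_open are hypothetical. It assumes deferred cancellation (the default) and relies on the POSIX rule that a thread canceled inside pthread_cond_timedwait re-acquires the mutex before its cleanup handlers run, so a handler has to release it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical condition-variable wrapper, for illustration only. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* predicate protected by lock */
} gate;

/* Cleanup handler: runs if the waiter is canceled while holding the mutex. */
static void unlock_on_cancel(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Wait up to `seconds` for g->ready.
 * Returns 0 if the gate was opened, otherwise the error from
 * pthread_cond_timedwait (typically ETIMEDOUT). */
static int gate_wait(gate *g, int seconds)
{
    struct timespec deadline;
    int rc = 0;
    bool ok = false;

    assert(g != NULL);
    /* pthread_cond_timedwait takes an ABSOLUTE time, not an interval. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += seconds;

    pthread_mutex_lock(&g->lock);
    pthread_cleanup_push(unlock_on_cancel, &g->lock);
    /* Loop on the predicate to cope with spurious wakeups. */
    while (!g->ready && rc == 0)
        rc = pthread_cond_timedwait(&g->cond, &g->lock, &deadline);
    ok = g->ready;
    pthread_cleanup_pop(1);   /* pops the handler and runs it: unlocks */

    return ok ? 0 : rc;
}

/* Open the gate and wake one waiter. */
static void gate_open(gate *g)
{
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    g->ready = true;
    pthread_cond_signal(&g->cond);
    pthread_mutex_unlock(&g->lock);
}
```

If the mutex or condition variable were destroyed, or the memory holding them freed, while another thread could still be inside gate_wait, the behavior would be undefined, which is one sequence worth ruling out.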


We maintain internal thread exit status, so I can skip cancelling the threads which have successfully exited. We normally just cancel everything we started,
as a big hammer to make sure every thread shuts down and exits. We can make the abort function a bit smarter, since it has access to our internal thread status, if need be.
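
A smarter abort function along those lines might look roughly like the sketch below. Everything in it (worker_slot, slots_lock, start_worker, abort_workers) is a hypothetical stand-in, not our actual code. It assumes joinable threads and builds on the POSIX rule that a joinable thread's ID stays valid until the thread has been joined, and that using it afterwards is undefined. So it cancels only threads not yet marked finished and joins each thread exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread bookkeeping record. */
typedef struct {
    pthread_t tid;
    bool started;    /* pthread_create succeeded */
    bool finished;   /* set by the worker just before it returns */
    bool joined;     /* pthread_join already done; tid is no longer valid */
} worker_slot;

static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Placeholder worker: records its own completion under the lock. */
static void *worker_main(void *arg)
{
    worker_slot *slot = arg;
    /* ... the real work would go here ... */
    pthread_mutex_lock(&slots_lock);
    slot->finished = true;
    pthread_mutex_unlock(&slots_lock);
    return NULL;
}

/* Start one joinable worker; returns the pthread_create error number. */
static int start_worker(worker_slot *slot)
{
    assert(slot != NULL);
    slot->finished = false;
    slot->joined = false;
    int rc = pthread_create(&slot->tid, NULL, worker_main, slot);
    slot->started = (rc == 0);
    return rc;
}

/* Cancel workers that are still running, then join every worker once.
 * Returns the number of cancel requests that were accepted. */
static size_t abort_workers(worker_slot *slots, size_t n)
{
    size_t cancels = 0;
    assert(slots != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        pthread_mutex_lock(&slots_lock);
        bool live = slots[i].started && !slots[i].joined;
        bool done = slots[i].finished;
        pthread_mutex_unlock(&slots_lock);

        if (!live)
            continue;            /* never started, or tid already joined */
        if (!done && pthread_cancel(slots[i].tid) == 0)
            cancels++;
        /* Join outside the lock: the worker needs it to mark itself done. */
        if (pthread_join(slots[i].tid, NULL) == 0)
            slots[i].joined = true;
    }
    return cancels;
}
```

Calling abort_workers a second time is then harmless, because every slot is already marked joined and no stale thread ID is touched.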


On the OSes I mentioned above, pthread_cancel returns 0 on success. On failure:

On HP-UX 11.00 pthread_cancel returns the value ESRCH; errno is NOT set.

On Solaris SPARC and x86, same as HP-UX 11.00.

AIX, same as HP-UX and Solaris.

On Tru64 pthread_cancel returns EINVAL or ESRCH; errno is not set.
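
To illustrate the convention described above, here is a small sketch that tests the value returned by pthread_cancel instead of errno. The names sleeper and cancel_and_join are hypothetical, and the thread passed in is assumed to be joinable and not yet joined.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical worker that blocks in sleep(), a cancellation point. */
static void *sleeper(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL;
}

/* Cancel and join a joinable, not-yet-joined thread.
 * Returns 1 if the thread ended by cancellation, 0 if it had already
 * exited on its own, and -1 if the join failed. */
static int cancel_and_join(pthread_t tid, const char *label)
{
    void *result = NULL;
    int rc;

    assert(label != NULL);

    /* The error number is the return value; errno is not consulted. */
    rc = pthread_cancel(tid);
    if (rc != 0 && rc != ESRCH)
        fprintf(stderr, "%s: pthread_cancel: %s\n", label, strerror(rc));

    rc = pthread_join(tid, &result);
    if (rc != 0) {
        fprintf(stderr, "%s: pthread_join: %s\n", label, strerror(rc));
        return -1;
    }
    return result == PTHREAD_CANCELED ? 1 : 0;
}
```

A caller would create the thread with pthread_create(&tid, NULL, sleeper, NULL) and later call cancel_and_join(tid, "worker 1") exactly once for that thread ID.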

Eric Bruno.

