Re: Scheduler: SIGSTOP on multi threaded processes

Hi all


I have been working on my problem over the last few days, and it comes down to these two main questions:

- Can a SIGSTOP be in a pending state in Linux?

- If kill(pid, SIGSTOP) returns, does that mean that the corresponding process is completely suspended?
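
One way to look at the first question experimentally might be to read the pending-signal masks that Linux exposes in /proc/<pid>/status. The sketch below is only an illustration: the helper names mask_has_signal() and signal_is_pending() are made up for this mail, it assumes the "SigPnd:" and "ShdPnd:" lines are present (Linux-specific, and "ShdPnd:" only on kernels that track process-wide pending signals), and it only sees the main thread's private mask plus the shared one. Whether a SIGSTOP ever shows up there long enough to be observed is exactly what I do not know.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Return 1 if bit (signo - 1) is set in a hexadecimal signal mask such as
   the value printed after "ShdPnd:" in /proc/<pid>/status, 0 otherwise. */
int mask_has_signal(const char *hex_mask, int signo)
{
  unsigned long long mask = strtoull(hex_mask, NULL, 16);

  if ( signo < 1 || signo > 64 )
    return 0;
  return (int) ((mask >> (signo - 1)) & 1ULL);
}

/* Hypothetical helper: 1 if signo is listed as pending for pid,
   0 if it is not, -1 if /proc/<pid>/status could not be read. */
int signal_is_pending(pid_t pid, int signo)
{
  char path[64], line[256];
  int pending = 0;
  FILE *f;

  snprintf(path, sizeof path, "/proc/%ld/status", (long) pid);
  f = fopen(path, "r");
  if ( ! f )
    return -1;

  while ( fgets(line, sizeof line, f) ) {
    /* SigPnd: per-thread mask of the main thread, ShdPnd: process-wide mask */
    if ( strncmp(line, "SigPnd:", 7) == 0 || strncmp(line, "ShdPnd:", 7) == 0 )
      pending |= mask_has_signal(line + 7, signo);
  }
  fclose(f);
  return pending;
}
```

The parent could call signal_is_pending(pid, SIGSTOP) right after kill() returns. Such a check is racy by nature, so a 0 result proves nothing; only a 1 would be informative.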


Until now I thought that SIGSTOP was so special that it could never be
pending, and that as soon as:
kill(pid, SIGSTOP)
returned, the corresponding process (and all
its threads) was guaranteed to be suspended.

That would make sense in my opinion, but apparently it is not always the
case, and the POSIX standard does not say anything about it.

Any hint?
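
For comparison, the only portable way I know of to learn that a stop has actually happened is to wait for it rather than to rely on kill() returning. Below is a minimal sketch, assuming the target is a direct child of the caller. stop_and_wait() is a hypothetical helper, not part of the test program in this mail, and whether the report covers every thread at that instant is something I assume rather than know.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: send SIGSTOP to a child and block until the kernel
   reports the stop to us. Returns 0 once the child is reported stopped,
   -1 on error or if the child terminated instead. */
int stop_and_wait(pid_t child)
{
  int status;

  if ( kill(child, SIGSTOP) == -1 )
    return -1;

  for (;;) {
    pid_t r = waitpid(child, &status, WUNTRACED);

    if ( r == -1 ) {
      if ( errno == EINTR )
        continue;            /* interrupted by a signal, retry */
      return -1;
    }
    if ( WIFSTOPPED(status) )
      return 0;              /* the stop has been reported to the parent */
    if ( WIFEXITED(status) || WIFSIGNALED(status) )
      return -1;             /* the child is gone */
  }
}
```

This blocks forever if the child was already stopped and that stop had already been collected, so it only suits a strict stop/continue alternation like the one in my test.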


I also ran some experiments with a program which fork()s into:

- a child which optionally starts threads and does some work

- a parent which regularly sends SIGSTOP to the child, checks whether the activity really stopped, and then sends SIGCONT again

You will find the source code below.

I tried that with different scheduling policies (SCHED_OTHER and SCHED_RR) and different numbers of threads:
- 0: no thread started (i.e. single-threaded child)
- 1: 1 thread started, and the main task just pthread_join()s it
- 2: 2 threads started, and the main task pthread_join()s them
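
Since the results depend on the policy, it may be worth checking which policy a task really ended up with, because sched_setscheduler() can fail (SCHED_RR normally needs privileges). A minimal sketch follows; policy_name() and print_process_sched() are hypothetical helpers, not part of the test program. As far as I understand, on current Linux kernels the scheduling policy is a per-thread attribute, so a call on the child's pid from the parent would only affect the child's main thread. Whether that holds for the kernel tested here is an assumption.

```c
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Map a scheduling policy constant to a printable name. */
const char *policy_name(int policy)
{
  switch ( policy ) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
  }
}

/* Hypothetical helper: print the policy and priority currently in effect
   for pid (0 means the caller). Returns 0 on success, -1 on error. */
int print_process_sched(pid_t pid)
{
  struct sched_param p;
  int policy = sched_getscheduler(pid);

  if ( policy == -1 || sched_getparam(pid, &p) == -1 ) {
    perror("sched_getscheduler()/sched_getparam()");
    return -1;
  }

  fprintf(stderr, "pid %ld: %s, priority %d\n",
          (long) pid, policy_name(policy), p.sched_priority);
  return 0;
}
```

Calling print_process_sched(pid) in the parent right after the two set_process_sched() calls would show whether the child really runs under SCHED_RR.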

I came to the following results:

Threads   SCHED_OTHER   SCHED_RR
0         OK            OK
1         FAIL          OK
2         FAIL          FAIL (1)


- the answers to my 2 questions (see above) seem to be No and Yes respectively when no thread is started

- (1) For RR with 2 threads, there are 2 observed behaviours, apparently occurring at random:

* either the parent's kill() always stops all threads instantaneously (as when no thread is started), and keeps doing so for a long time

* or right at the beginning, we can observe that it fails to do so

I find this behaviour really strange.

Any idea?

Can one rely on SIGSTOP taking effect instantaneously for non-threaded applications?

Would it be possible to guarantee that for all applications?




#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>   /* kill(), SIGSTOP, SIGCONT, SIGKILL */
#include <sched.h>
#include <sys/time.h>

#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#include <pthread.h>


int set_process_sched(pid_t pid, int policy, int priority) {
  struct sched_param p;

  p.sched_priority = priority;

  /* Apply unconditionally, whatever the current policy is. */
  if ( sched_setscheduler(pid,policy,&p) ) {
    perror("sched_setscheduler()");
    return 1;
  }

  return 0;
}

unsigned long long gettime(void ) {

  struct timeval tv;

  if ( gettimeofday(&tv, NULL) ) {
    perror("gettimeofday()");
    return 0;
  }

  return (tv.tv_usec + tv.tv_sec * 1000000LL);
}

typedef struct {
  int         thread_nb; /* id defined by us */
  pthread_t   thread_id; /* system id of the thread */
} thread_data;


int   cont_main_loop = 1;


void sigterm_handler(int dummy) {
  printf("sigterm_handler\n");
  return;
}


/* We use a shared memory segment to communicate between the parent and
   the child. Both only use the first few bytes.
*/
int     shmid;
unsigned long long int     *shared_array;
#define SHM_SIZE 1024

static inline void conf_shmem(void ) {

  shmid = shmget(IPC_PRIVATE, SHM_SIZE, 0666 | IPC_CREAT);
  if (shmid == -1) {
    perror("shmget()");
    exit(EXIT_FAILURE);
  }

  /* shmat() returns (void *) -1 on failure, not NULL */
  shared_array = (unsigned long long int *) shmat(shmid, NULL, 0);
  if ( shared_array == (void *) -1 ) {
    perror("shmat()");
    exit(EXIT_FAILURE);
  }
}


void loop(int marker) {
  unsigned long long int begin = gettime();
  /* run for at most 2 minutes
     (useful in case we end up with a busy loop in SCHED_RR...) */
  while ( gettime() - begin < 120000000LL ) {
    /* write in the shared memory */
    shared_array[0] = marker;
  }
}

void *go_thread(void *dummy) {
  thread_data *data = (thread_data *) dummy;
  loop(data->thread_nb);
  fprintf(stderr,"%llu\tQuitting!\n",gettime());
  return NULL;
}


#define MAX_THREADS 100

int main(int argc, char **argv)
{
  pid_t pid;
  int test_failed = 0;
  unsigned long long exec_begin = gettime();
  int nb_threads = 0;


  conf_shmem();
  shared_array[0] = 0;

  if ( argc > 1 )
    nb_threads = atoi(argv[1]);
  if ( nb_threads > MAX_THREADS )
    nb_threads = MAX_THREADS;

  pid = fork();

  switch ( pid ) {

    case 0: /* child */
    {
      int thread;
      thread_data threads[MAX_THREADS];

      if ( nb_threads == 0 ) {
        /* no multi threading */
        loop(1);
        /* exit here so that the child never runs the parent's report code */
        exit(0);
      }

      /* start the threads */
      for ( thread = 0 ; thread < nb_threads ; thread ++) {
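        int rc;

        threads[thread].thread_nb = thread + 1;
        rc = pthread_create (  & threads[thread].thread_id,
                          NULL,
                          go_thread,
                          (void *)&threads[thread]);
        /* pthread_create() returns the error number; it does not set errno */
        if ( rc )
          fprintf(stderr, "pthread_create: error %d\n", rc);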

      }

      {
        int thread;
        for ( thread = 0 ; thread < nb_threads ; thread ++) {
          pthread_join (  threads[thread].thread_id, NULL);
        }
      }
      exit(0);
    }

    default: /* parent */
    {
      unsigned long long begin = gettime();

      /* depending on whether we set the priorities or not,
         we get different results.
      */

      set_process_sched(0, SCHED_RR, 65);
      set_process_sched(pid, SCHED_RR, 60);


      /* run for 10s */
      while ( gettime() - begin < 10000000 ) {

        /* let the child run a little bit */
        usleep(1000);

        /* stop it */
        kill(pid, SIGSTOP);

        /* Reset our flag */
        shared_array[0] = 0;

        /* Wait to see whether someone dares to overwrite our nice zero */
        usleep(1000);
        if ( shared_array[0] > 0 ) {
          test_failed = shared_array[0];
          break;
        }
        kill(pid, SIGCONT);
      }
      kill(pid, SIGKILL);
      break;
    }

    case -1:
      perror("fork()");
      exit(EXIT_FAILURE);
  }

  system("uname -a");
  printf("%d thread(s)\n",nb_threads);
  if ( ! test_failed )
    printf("test passed");
  else
    printf("test FAILED (%d)",test_failed);
  printf(" after %f s\n\n", ( gettime() - exec_begin) / 1000000.0 );

  return 0;
}

