Eric,
All my commments are inline below.
This patch modifies the fork/exit, signal handling, and pid and
process group manipulating syscalls to support multiple process
spaces, and implements the data for allow multiple instaces of the pid
namespace.
[ ... skipped .... ]
+extern struct pspace init_pspace;
+
+#define INVALID_PID 0x7fffffff
<<<< what is it for?
+
+static inline int pspace_task_visible(struct pspace *pspace, struct task_struct *tsk)
+{
+ return (tsk->pspace == pspace) ||
+ ((tsk->pspace->child_reaper.pspace == pspace) &&
+ (tsk->pspace->child_reaper.task == tsk));
<<< the logic with child_reaper which can be somehow partly inside
pspace... and this check is not that abvious.
Actually I can't say your patch is cleaner somehow.
It is very big and most of the changes are trivial, which creates an
illusion that it is straightforward and clean.
[ ... skipped .... ]
@@ -788,6 +801,16 @@ fastcall NORET_TYPE void do_exit(long co
panic("Attempted to kill the idle task!");
if (unlikely(is_init(tsk)))
panic("Attempted to kill init!");
+
+ /*
+ * If we are the pspace leader it is nonsense for the pspace
+ * to continue so kill everyone else in the pspace.
+ */
+ if (pspace_leader(tsk)) {
+ tsk->pspace->flags = PSPACE_EXIT;
+ kill_pspace_info(SIGKILL, (void *)1, tsk->pspace);
+ }
+
<<<<
1.
flags are neither atomic nor protected with any lock.
So if it will be used later for something else in future, there will be
100% race. You also assigns them here, while everywhere in other places
a bit is checked.
2. due to 1) you code is buggy. in this respect do_exit() is not
serialized with copy_process().
3. due to the same 1) reason
> + kill_pspace_info(SIGKILL, (void *)1, tsk->pspace);
can miss a task being forked. Bang!!!
4.
So you are effectively inserting a code for cleaning up pspace here and
the same is actually required for other subsystems like networking/IPC etc.
I think you suppose that other resources are freed when the last
reference is dropped, but to tell the truth this is a way to deadlocks.
Because refs are put in too many places, where you can't make a real
cleanup due to locking etc. You can't for example call syncronize_net()
from bh, which is required for network cleanup.
Just an another argument why containers are easier/better and why you
will eventually end up with it.
if (tsk->io_context)
exit_io_context();
[ ... skipped ... ]
@@ -1147,11 +1150,55 @@ retry:
}
/*
+ * kill_pspace_info() sends a signal to all processes in a process space.
+ * This is what kill(-1, sig) does.
+ */
+
+int __kill_pspace_info(int sig, struct siginfo *info, struct pspace *pspace)
+{
+ struct task_struct *p = NULL;
+ int retval = 0, count = 0;
+
+ for_each_process(p) {
+ int err;
+ /* Skip the current pspace leader */
+ if (current_pspace_leader(p))
+ continue;
+
+ /* Skip the sender of the signal */
+ if (p->signal == current->signal)
+ continue;
+
+ /* Skip processes outside the target process space */
+ if (!in_pspace(pspace, p))
+ continue;
+
+ /* Finally it is a good process send the signal. */
+ err = group_send_sig_info(sig, info, p);
+ ++count;
+ if (err != -EPERM)
+ retval = err;
<<<<
why EPERM is ok?
do you want to miss some tasks?
+ }
+ return count ? retval : -ESRCH;
+}
+
Thanks,
Kirill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]