Re: [patch 00/11] ANNOUNCE: "Syslets", generic asynchronous system call support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Alan <[email protected]> wrote:

> > A syslet is executed opportunistically: i.e. the syslet subsystem 
> > assumes that the syslet will not block, and it will switch to a 
> > cachemiss kernel thread from the scheduler. This means that even a
> 
> How is scheduler fairness maintained ? and what is done for resource 
> accounting here ?

the async threads are as if the user created user-space threads - and 
it's accounted (and scheduled) accordingly.

> > that the kernel fills and user-space clears. Waiting is done via the 
> > sys_async_wait() system call. Completion can be supressed on a 
> > per-atom
> 
> They should be selectable as well iff possible.

basically arbitrary notification interfaces are supported. For example 
if you add a sys_kill() call as the last syslet atom then this will 
notify any waiter in sigwait().

or if you want to select(), just do it in the fds that you are 
interested in, and the write that the syslet does triggers select() 
completion.

but the fastest one will be by using syslets: to just check the 
notification ring pointer in user-space, and then call into 
sys_async_wait() if the ring is empty.

I just noticed a small bug here: sys_async_wait() should also take the 
ring index userspace checked as a second parameter, and fix up the 
number of events it waits for with the delta between the ring index the 
kernel maintains and the ring index user-space has. The patch below 
fixes this bug.

> > Open issues:
> 
> Let me add some more
> 
> 	sys_setuid/gid/etc need to be synchronous only and not occur 
> while other async syscalls are running in parallel to meet current 
> kernel assumptions.

these should probably be taken out of the 'async syscall table', along 
with fork and the async syscalls themselves.

> 	sys_exec and other security boundaries must be synchronous 
> only and not allow async "spill over" (consider setuid async binary 
> patching)

i've tested sys_exec() and it seems to work, but i might have missed 
some corner-cases. (And what you raise is not academic, it might even 
make sense to do it, in the vfork() way.)

> >  - sys_fork() and sys_async_exec() should be filtered out from the 
> >    syscalls that are allowed - first one only makes sense with ptregs, 
> 
> clone and vfork. async_vfork is a real mindbender actually.

yeah. Also, create_module() perhaps. I'm starting to lean towards an 
async_syscall_table[]. At which point we could reduce the max syslet 
parameter count to 4, and do those few 5 and 6 parameter syscalls (of 
which only splice() and futex() truly matter i suspect) via wrappers. 
This would fit a syslet atom into 32 bytes on x86. Hm?

> >    second one is a nice kernel recursion thing :) I didnt want to 
> >    duplicate the sys_call_table though - maybe others have a better 
> >    idea.
> 
> What are the semantics of async sys_async_wait and async sys_async ?

agreed, that should be forbidden too.

	Ingo

---------------------->
---
 kernel/async.c |   12 +++++++++---
 kernel/async.h |    2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

Index: linux/kernel/async.c
===================================================================
--- linux.orig/kernel/async.c
+++ linux/kernel/async.c
@@ -721,7 +721,8 @@ static void refill_cachemiss_pool(struct
  * to finish or for all async processing to finish (whichever
  * comes first).
  */
-asmlinkage long sys_async_wait(unsigned long min_wait_events)
+asmlinkage long
+sys_async_wait(unsigned long min_wait_events, unsigned long user_curr_ring_idx)
 {
 	struct async_head *ah = current->ah;
 
@@ -730,12 +731,17 @@ asmlinkage long sys_async_wait(unsigned 
 
 	if (min_wait_events) {
 		spin_lock(&ah->lock);
-		ah->events_left = min_wait_events;
+		/*
+		 * Account any completions that happened since user-space
+		 * checked the ring:
+	 	 */
+		ah->events_left = min_wait_events -
+				(ah->curr_ring_idx - user_curr_ring_idx);
 		spin_unlock(&ah->lock);
 	}
 
 	return wait_event_interruptible(ah->wait,
-		list_empty(&ah->busy_async_threads) || !ah->events_left);
+		list_empty(&ah->busy_async_threads) || ah->events_left > 0);
 }
 
 /**
Index: linux/kernel/async.h
===================================================================
--- linux.orig/kernel/async.h
+++ linux/kernel/async.h
@@ -26,7 +26,7 @@ struct async_head {
 	struct list_head			ready_async_threads;
 	struct list_head			busy_async_threads;
 
-	unsigned long				events_left;
+	long					events_left;
 	wait_queue_head_t			wait;
 
 	struct async_head_user	__user		*uah;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux