On 05/12, Tejun Heo wrote:
>
> Oleg Nesterov wrote:
> >
> > Yes I hate this barrier too. That is why changelog explicitly mentions it.
> > Probably we can do something else.
>
> Fortunately, we have one bit left in the flags and can use it to mark
> pointer validity instead of list_empty() test. flags and wq_data live
> in the same atomic_long and thus can be updated together atomically.
Heh. I thought about another bit-in-pointer too. I can't explain this,
but I personally hate these bits even more than barriers.
I must admit, your suggestion is much more clean compared to what I
had in mind.
> * insert_work() sets VALID bit and the cwq pointer using one
> atomic_long_set().
OK, but not very suitable, we need to copy-and-paste set_wq_data(),
because it is also used by queue_delayed_work(). Not a big problem.
> * queue_delayed_work_on() sets the cwq pointer but not the VALID bit.
... and queue_delayed_work(), of course.
> * run_workqueue() clears the cwq pointer and VALID bit while holding
> lock before executing the work.
We should not clear cwq. I guess you meant "does list_del() and clears
VALID bit".
> * try_to_grab_pending() checks VALID && pointers equal after grabbing
> cwq->lock.
We don't even need to check the pointers. VALID bit is always changed
under cwq->lock. So, if we see VALID under cwq->lock, we have a right
pointer.
> What do you think? Is there any hole?
I'd be happy to say I found many nasty unfixable holes in your idea,
but I can't see any :) Something like the patch below, I guess.
Ugh. OK, the barrier is evil. As I said, we can convert smp_wmb() to
smp_bm__before_spinlock() (which we don't currently have, but which we
imho need regardless to workqueue.c), but anyway it is not easy to
understand the code with barriers.
OTOH, the new VALID bit is _only_ needed to implement a special cancel_xxx
functions. And unlike wmb/smp_bm__before_spinlock it is not hidden in
insert_work(), but spreads over the queue/unqueue path. And. the
bit-in-pointer is always a hack, imho. And. It is a bit difficult to
understand why do we need a VALID bit when we have !list_empty(), so
_perhaps_ a barrier is not much worse.
I am looking at the patch below, and I can't undestand whether this is
good change or not. I am biased, of course. Tejun, if you say "I strongly
believe this is better", I'll send the patch.
Anybody else has an opinion?
Oleg.
--- t/kernel/workqueue.c~ 2007-05-13 22:32:45.000000000 +0400
+++ t/kernel/workqueue.c 2007-05-13 22:37:42.000000000 +0400
@@ -120,11 +120,7 @@ static void insert_work(struct cpu_workq
struct work_struct *work, int tail)
{
set_wq_data(work, cwq);
- /*
- * Ensure that we get the right work->data if we see the
- * result of list_add() below, see try_to_grab_pending().
- */
- smp_wmb();
+ __set_bit(WORK_STRUCT_QUEUED, work_data_bits(work));
if (tail)
list_add_tail(&work->entry, &cwq->worklist);
else
@@ -231,6 +227,12 @@ int queue_delayed_work_on(int cpu, struc
}
EXPORT_SYMBOL_GPL(queue_delayed_work_on);
+static inline void dequeue_work(struct work_struct *work)
+{
+ __clear_bit(WORK_STRUCT_QUEUED, work_data_bits(work));
+ list_del_init(&work->entry);
+}
+
static void run_workqueue(struct cpu_workqueue_struct *cwq)
{
spin_lock_irq(&cwq->lock);
@@ -246,8 +248,8 @@ static void run_workqueue(struct cpu_wor
struct work_struct, entry);
work_func_t f = work->func;
+ dequeue_work(work);
cwq->current_work = work;
- list_del_init(cwq->worklist.next);
spin_unlock_irq(&cwq->lock);
BUG_ON(get_wq_data(work) != cwq);
@@ -410,17 +412,9 @@ static int try_to_grab_pending(struct wo
return ret;
spin_lock_irq(&cwq->lock);
- if (!list_empty(&work->entry)) {
- /*
- * This work is queued, but perhaps we locked the wrong cwq.
- * In that case we must see the new value after rmb(), see
- * insert_work()->wmb().
- */
- smp_rmb();
- if (cwq == get_wq_data(work)) {
- list_del_init(&work->entry);
- ret = 1;
- }
+ if (test_bit(WORK_STRUCT_QUEUED, work_data_bits(work))) {
+ dequeue_work(work);
+ ret = 1;
}
spin_unlock_irq(&cwq->lock);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]