Re: [Patch 2/12] IPMI: remove interface number limits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Paul E. McKenney wrote:
On Fri, Dec 01, 2006 at 10:24:22PM -0600, Corey Minyard wrote:
This patch removes the arbitrary limit of number of IPMI interfaces.
This has been tested with 8 interfaces.

I got a bit lost in this patch, so applied it to 2.6.19-rc6 and looked
over the resulting file.  Some of the issues predate this patch, which
I guess goes to show that I have not been paying enough attention.
I do not claim to be an IPMI expert, so may well be missing something.

In general, though, good application of RCU -- SMIs and locks don't
get along so well.  ;-)

All that said, here are my thoughts.  The first two issues (marked with
"!")  seem to need the most urgent attention.

						Thanx, Paul

drivers/char/ipmi/ipmi_msghandler.c:

!	clean_up_interface_data(), line 394: What prevents an RCU reader
	from finding the raw list_head "list" and interpreting stack
	garbage as a semi-valid ipmi_smi_t?  (The list_add_rcu() adds
	an unadorned struct list_head to the list.)

	Unless the RCU readers do something clever to reject this
	entry, should instead link an ipmi_smi_t into the list to
	avoid confusing the readers.  See below, at least some
	RCU readers are not being clever.

!	clean_up_interface_data(), line 395: If RCU readers expect to
	terminate their list traversal by finding the header, they
	could well be severely disappointed when the list_del_rcu()
	removes it...  The header being removed is the cmd_rcvrs
	field of the ipmi_smi_t.

	o	ipmi_destroy_user() line 880 does such a scan, but
		under the cmd_rcvrs_mutex, so OK.  Doesn't really
		need the _rcu suffix here, but doesn't hurt to have it.

	o	find_cmd_rcvr() line 992 does such a scan.  It is
		invoked as follows:

		o	Under cmd_rcvrs_mutex near line 1064 in
			ipmi_unregister_for_cmd().

		!	Under rcu_read_lock() near line 2686 in
			handle_ipmb_get_msg_cmd(), called from
			handle_new_recv_msg(), in turn called from:
			
			o	ipmi_smi_msg_received() line 3287.
				This may be called externally, so
				cannot hold the cmd_rcvrs_mutex.
			
			o	ipmi_timeout_handler() line 3428,
				called from ipmi_timeout() near line
				3493.  This is called from the
				timeout system (setup_timer() and
				mod_timer()), so cannot hold the
				lock.

			This loop also references a number of fields
			outside of the list_head, so is a problem
			for the addition of the raw struct list_head.

	o	is_cmd_rcvr_exclusive() at line 1007.  This is called
		from ipmi_register_for_cmd(), but under cmd_rcvrs_mutex,
		so OK as is.

	o	ipmi_register_smi() line 2494 initializes a newly
		allocated structure, so not yet accessible to readers.

	One way to fix this would be to leave the next pointer referencing
	the old list header, so that readers would find their way home.
	This would require waiting for a grace period after making this
	change, for example (untested, probably broken, but hopefully gets
	the idea across):

		if (list_empty(&intf->cmd_rcvrs))
			INIT_LIST_HEAD(&list);
		else {
			list->next = intf->cmd_rcvrs->next;
			list->prev = intf->cmd_rcvrs->prev;
			intf->cmd_rcvrs->next = &intf->cmd_rcvrs;
			intf->cmd_rcvrs->prev = &intf->cmd_rcvrs;

			/* List body still points to intf->cmd_rcvrs. */

			synchronize_rcu();

			/* All readers have exited list body. */

			list->next->prev = &list;
			list->prev->next = &list;
		}

		/*
		 * Note that we -don't- need the synchronize_rcu()
		 * currently following the mutex_unlock().
		 */

	If this sort of thing happens often, we should make a
	list_privatize_rcu(global_list, local_receiving_list) or some such.
Oops, yes, this is a big problem. I didn't want to delete the items one at a time and wait for each because that could take a long time. But I missed the end of list thing.
!	ipmi_register_smi() near line 2504: how does "i" get assigned
	to intf->intf_num before the struct is visible to RCU
	readers?  Or why doesn't it have to be?  It is initialized
	to -1, so maybe that helps...

	OK, I see the assignment at line 2561 near the end of
	ipmi_register_smi() -- it "turns on" the new ipmi_smi_t structure.
	But what keeps the memory ordering straight???	I believe you
	need an smp_wmb() before the "intf->intf_num = i;" assignment.

	Memory barriers will also be needed in the scan loops, most
	likely.

	But why not simply fully initialize -before- making this globally
	visible to RCU readers???  You do have a bunch of intervening
	setup, but does the entry -really- need to be linked "invisibly"
	into the list for this setup to work?  The /proc stuff seems
	harmless enough.  Besides, ipmi_unregister_smi() removes the
	entry from the list before undoing the /proc stuff.

	Or does the BMC registry somehow require this?	(Can't see
	why, as the "-1" ID won't be found by the search, right?) If
	so, can you instead have a global pointer that references
	the to-be-inserted item?  You would then assign to the global
	pointer using rcu_assign_pointer() -- then wait a grace period
	after adding to the real list before NULLing the global pointer.
This is subtle, but the entry must be in the interfaces list for the timer operations to occur on it (and thus for operations to time out if they fail). Before you make the entry available, you have to talk to the IPMI controller to get configuration information. However, you don't want it to appear as a normal entry yet. This is not really clean, I agree, but it does work with the added memory barriers.

I believe a read barrier is needed in one place in ipmi_create_user(), but that's all I found that looked like it was needed.
?	ipmi_smi_watcher_register(), line 432: list_for_each_entry_rcu()
	protected by lock.  OK, but _rcu suffix unnecessary.
Ok.
?	ipmi_destroy_user() line 880 uses list_for_each_entry_rcu()
	under ->cmd_rcvrs_mutex, not clear why.  (Scanning the
	cmd_rcvrs field of an ipmi_smi_t.)
Well, it's either that or list_for_entry_safe(), because entries are being deleted. I can change it to safe if you think that is more clear.

Thanks for all your help. I have a patch ready to go, expect that in a few minutes.

-Corey

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux