Re: [PATCH (try #3)] mm: avoid unnecessary OOM kills

Dave Peterson wrote:

Below is a 2.6.17-rc4-mm3 patch that fixes a problem where the OOM killer was
unnecessarily killing system daemons in addition to memory-hogging user
processes.  The patch fixes things so that the following assertion is
satisfied:

    If a failed attempt to allocate memory triggers the OOM killer, then the
    failed attempt must have occurred _after_ any process previously shot by
    the OOM killer has cleaned out its mm_struct.

Thus we avoid situations where concurrent invocations of the OOM killer cause
more processes to be shot than necessary to resolve the OOM condition.


Does this fix observed problems on real (or fake) workloads? Can we have
some more information about that?

I still don't quite understand why all this mechanism is needed. Suppose
that we single-thread the oom kill path (which isn't unreasonable, unless
you need really good OOM throughput :P), isn't it enough to find that any
process has TIF_MEMDIE set in order to know that an OOM kill is in progress?

down(&oom_sem);
for each process {
  if TIF_MEMDIE
     goto oom_in_progress;
  else
    calculate badness;
}
up(&oom_sem);

I have one other comment, below

+/* If an OOM kill is not already in progress, try once more to allocate
+ * memory.  If allocation fails this time, invoke the OOM killer.
+ */
+static struct page * oom_alloc(gfp_t gfp_mask, unsigned int order,
+		struct zonelist *zonelist)
+{
+	static DECLARE_MUTEX(sem);
+	struct page *page;
+
+	down(&sem);
+
+	/* Prevent parallel OOM kill operations.  This fixes a problem where
+	 * the OOM killer was observed shooting system daemons in addition to
+	 * memory-hogging user processes.
+	 */
+	if (oom_kill_active()) {
+		up(&sem);
+		goto out_sleep;
+	}
+
+	/* If we get here, we _know_ that any previous OOM killer victim has
+	 * cleaned out its mm_struct.  Therefore we should pick a victim to
+	 * shoot if this allocation fails.
+	 */
+	page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
+				zonelist, ALLOC_WMARK_HIGH | ALLOC_CPUSET);
+
+	if (page) {
+		up(&sem);
+		return page;
+	}
+
+	oom_kill_start();
+	up(&sem);
+
+	/* Try to shoot a process.  Call oom_kill_finish() only if the OOM
+	 * killer did not shoot anything.  If the OOM killer shot something,
+	 * mmput() will call oom_kill_finish() once the mm_users count of the
+	 * victim's mm_struct has reached 0 and the mm_struct has been cleaned
+	 * out.
+	 */
+	if (out_of_memory(zonelist, gfp_mask, order))
+		oom_kill_finish();  /* cancel OOM kill */
+
+out_sleep:
+	/* Did we get shot by the OOM killer?  If not, sleep for a while to
+	 * avoid burning lots of CPU cycles looping in the memory allocator.
+	 * If the OOM killer shot a process, this gives the victim a good
+	 * chance to die before we retry allocation.
+	 */
+	if (!test_thread_flag(TIF_MEMDIE))
+		schedule_timeout_uninterruptible(1);
+
+	return NULL;
+}


Is all this really required? Shouldn't you just have in place the
mechanism to prevent concurrent OOM killings in the OOM code, and
so the page allocator doesn't have to bother with it at all (ie.
it can just call into the OOM killer, which may or may not actually
kill anything).

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com-

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [PATCH (try #3)] mm: avoid unnecessary OOM kills
  - From: Dave Peterson <[email protected]>

References:
- [PATCH (try #3)] mm: avoid unnecessary OOM kills
  - From: Dave Peterson <[email protected]>

Prev by Date: Re: OpenGL-based framebuffer concepts
Next by Date: Re: cpu hotplug sleeping from invalid context
Previous by thread: [PATCH (try #3)] mm: avoid unnecessary OOM kills
Next by thread: Re: [PATCH (try #3)] mm: avoid unnecessary OOM kills
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]