Re: Avoiding external fragmentation with a placement policy Version 12

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>> > Unfortunately, it is a fundemental flaw of the buddy allocator that it
>> > fragments badly. The thing is, other allocators that do not fragment are
>> > also slower.
>> 
>> Do we care? 99.9% of allocations are fronted by the hot/cold page cache
>> now anyway ...
> 
> Very true, but only for order-0 allocations. As it is, higher order
> allocations are a lot less important because Linux has always avoided them
> unless absolutely necessary. I would like to reach the point where we can
> reliably allocate large blocks of memory so we do not have to split large
> amounts of data into page-sized chunks all the time.

Right. I agree that large allocs should be reliable. Whether we care so
much about if they're performant or not, I don't know ... is an interesting
question. I think the answer is maybe not, within reason. The cost of
fishing in the allocator might well be irrelevant compared to the cost
of freeing the necessary memory area?

> I did measure it and there is a slow-down on high order allocations which
> is not very surprising. The following is the result of a micro-benchmark
> comparing the standard and modified allocator for 1500 order-5
> allocations.
> 
> Standard
>      Average          Max          Min       Allocs
>      -------          ---          ---       ------
>         0.73         1.09         0.53         1476
>         1.33         1.87         1.10           23
>         2.10         2.10         2.10            1
> 
> Modified
>      Average          Max          Min       Allocs
>      -------          ---          ---       ------
>         0.82         1.23         0.60         1440
>         1.36         1.96         1.23           57
>         2.42         2.92         2.09            3
> 
> The average, max and min are in 1000's of clock cycles for an allocation
> so there is not a massive difference between the two allocators. Aim9
> still shows that overall, the modified allocator is as fast as the normal
> allocator.

Mmmm. that doesn't look too bad at all to me.
 
> High order allocations do slow down a lot when under memory pressure and
> neither allocator performs very well although the modified allocator
> probably performs worse as it has more lists to search. In the case of the
> placement policy though, I can work on the linear scanning patch to avoid
> using a blunderbuss on memory. With the standard allocator, linear scanning
> will not help significantly because non-reclaimable memory is scattered
> all over the place.
> 
> I have also found that the modified allocator can fairly reliably allocate
> memory on a desktop system which has been running a full day where the
> standard allocator cannot. However, that experience is subjective and
> benchmarks based on loads like kernel compiles will not be anything like a
> desktop system. At the very least, kernel compiles, while they load the
> system, will not pin memory used for PTEs like a desktop running
> long-lived applications would.
> 
> I'll work on reproducing scenarios that show where the standard allocator
> fails to allocate large blocks of memory without paging everything out
> that the placement policy works with.

Sounds great ... would be really valuable to get those testcases.

M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux