>> > Unfortunately, it is a fundamental flaw of the buddy allocator that it
>> > fragments badly. The thing is, other allocators that do not fragment are
>> > also slower.
>>
>> Do we care? 99.9% of allocations are fronted by the hot/cold page cache
>> now anyway ...
>
> Very true, but only for order-0 allocations. As it is, higher order
> allocations are a lot less important because Linux has always avoided them
> unless absolutely necessary. I would like to reach the point where we can
> reliably allocate large blocks of memory so we do not have to split large
> amounts of data into page-sized chunks all the time.
Right. I agree that large allocs should be reliable. Whether we care as
much about how fast they are is an interesting question. I think the
answer is maybe not, within reason. The cost of fishing in the allocator
might well be irrelevant compared to the cost of freeing the necessary
memory area?
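
As an aside on the "fishing" cost: the buddy arithmetic itself is about as
cheap as it gets, and the real damage from fragmentation comes from blocked
coalescing, not searching. Here is a toy userspace sketch (my own
illustration, nothing lifted from mm/page_alloc.c) of why a single
long-lived page hurts so much: every block that contains it, at every
order, can never merge with its buddy, no matter how much of the rest of
that block is free.

/* Toy illustration of buddy coalescing, not kernel code */
#include <stdio.h>

#define MAX_ORDER 6

/* The buddy of a 2^order-aligned block is found by flipping the bit
 * that separates the two halves of its parent block. */
static unsigned long buddy_of(unsigned long idx, unsigned int order)
{
	return idx ^ (1UL << order);
}

int main(void)
{
	unsigned long pinned = 37;	/* one long-lived page, arbitrary index */
	unsigned long idx = pinned;
	unsigned int order;

	for (order = 0; order < MAX_ORDER; order++) {
		/* start of the 2^order-aligned block containing the pinned page */
		idx &= ~((1UL << order) - 1);
		printf("order %u: pages %lu-%lu cannot merge with buddy at %lu\n",
		       order, idx, idx + (1UL << order) - 1,
		       buddy_of(idx, order));
	}
	return 0;
}

So the expensive part is freeing the right pages, not walking the free
lists.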
> I did measure it and there is a slow-down on high order allocations which
> is not very surprising. The following is the result of a micro-benchmark
> comparing the standard and modified allocator for 1500 order-5
> allocations.
>
> Standard
>  Average     Max     Min   Allocs
>  -------     ---     ---   ------
>     0.73    1.09    0.53     1476
>     1.33    1.87    1.10       23
>     2.10    2.10    2.10        1
>
> Modified
>  Average     Max     Min   Allocs
>  -------     ---     ---   ------
>     0.82    1.23    0.60     1440
>     1.36    1.96    1.23       57
>     2.42    2.92    2.09        3
>
> The average, max and min are in thousands of clock cycles per allocation,
> so there is not a massive difference between the two allocators. Aim9
> still shows that, overall, the modified allocator is as fast as the normal
> allocator.
Mmmm, that doesn't look too bad at all to me.
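
To put the numbers in context for myself: order 5 is 2^5 = 32 contiguous
pages, i.e. 128K with 4K pages, per allocation. I'm picturing the harness
as roughly the loop below -- a guess at the shape of it only; the module
name, holding every block until the end and dumping raw cycle counts are
my assumptions, not taken from your code:

/* Hypothetical sketch of an order-5 timing loop; names and structure
 * are my guesses, not the actual benchmark. */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/gfp.h>
#include <asm/timex.h>

#define NR_ALLOCS	1500
#define TEST_ORDER	5	/* 2^5 = 32 pages = 128K with 4K pages */

static struct page *pages[NR_ALLOCS];

static int __init order5_bench_init(void)
{
	int i;

	/* Time each allocation and hold every block, so that later
	 * allocations have to dig deeper into the free lists. */
	for (i = 0; i < NR_ALLOCS; i++) {
		cycles_t start = get_cycles();

		pages[i] = alloc_pages(GFP_KERNEL, TEST_ORDER);
		printk(KERN_DEBUG "alloc %d: %lu cycles%s\n", i,
		       (unsigned long)(get_cycles() - start),
		       pages[i] ? "" : " (failed)");
	}

	for (i = 0; i < NR_ALLOCS; i++)
		if (pages[i])
			__free_pages(pages[i], TEST_ORDER);

	return 0;
}

static void __exit order5_bench_exit(void)
{
}

module_init(order5_bench_init);
module_exit(order5_bench_exit);
MODULE_LICENSE("GPL");

Binning the per-allocation cycle counts afterwards would give a table like
the one above.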
> High-order allocations do slow down a lot under memory pressure, and
> neither allocator performs very well, although the modified allocator
> probably performs worse as it has more lists to search. In the case of the
> placement policy, though, I can work on the linear scanning patch to avoid
> using a blunderbuss on memory. With the standard allocator, linear scanning
> will not help significantly because non-reclaimable memory is scattered
> all over the place.
>
> I have also found that the modified allocator can fairly reliably allocate
> memory on a desktop system that has been running for a full day, where the
> standard allocator cannot. However, that experience is subjective, and
> benchmarks based on loads like kernel compiles will not behave anything
> like a desktop system. At the very least, kernel compiles, while they load
> the system, will not pin memory used for PTEs the way a desktop running
> long-lived applications would.
>
> I'll work on reproducing scenarios where the standard allocator fails to
> allocate large blocks of memory without paging everything out, but where
> the placement policy succeeds.
Sounds great ... would be really valuable to get those testcases.
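
On the linear scanning idea, just so I'm sure I follow it: I picture
something like the sketch below (entirely hypothetical, made-up names,
not from your patch) -- walk the zone in naturally aligned 2^order chunks
and pick the first one where every page is either free or at least on the
LRU, so reclaim can be aimed at a single area rather than sprayed across
the zone:

/* Entirely hypothetical sketch of a "linear scan" -- not from any posted
 * patch.  Walk [start_pfn, end_pfn) in naturally aligned 2^order chunks
 * and return the first chunk in which every page is free or on the LRU,
 * i.e. a chunk that targeted reclaim could plausibly empty out. */
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/page-flags.h>

static unsigned long find_candidate_block(unsigned long start_pfn,
					   unsigned long end_pfn,
					   unsigned int order)
{
	unsigned long nr_pages = 1UL << order;
	unsigned long pfn, i;

	for (pfn = ALIGN(start_pfn, nr_pages);
	     pfn + nr_pages <= end_pfn; pfn += nr_pages) {
		for (i = 0; i < nr_pages; i++) {
			struct page *page;

			if (!pfn_valid(pfn + i))
				break;
			page = pfn_to_page(pfn + i);

			/* Reserved or pinned kernel memory blocks the
			 * whole chunk, so skip to the next one. */
			if (PageReserved(page) ||
			    (page_count(page) && !PageLRU(page)))
				break;
		}
		if (i == nr_pages)
			return pfn;	/* every page free or reclaimable */
	}
	return end_pfn;			/* nothing suitable */
}

With the standard allocator there is rarely such a chunk to aim at, which
I take to be your point about non-reclaimable memory being scattered all
over the place.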
M.