Re: [1/3] Add 4GB DMA32 zone

On 9/12/05, Andi Kleen <[email protected]> wrote:
> Add 4GB DMA32 zone
>
> Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.
>
> As a bit of historical background: when the x86-64 port was
> originally designed, we had some discussion about whether we should
> use a 16MB DMA zone like i386, a 4GB DMA zone like IA64, or both.
> Having both was ruled out at the time because this was in early 2.4,
> when the VM was still quite shaky and had trouble dealing with even
> one DMA zone.  We settled on the 16MB DMA zone mainly because we
> worried about older sound cards and the floppy.
>
> But this has caused problems ever since, because device drivers have
> trouble getting enough DMA-able memory. These days the VM works much
> better, and the wide use of NUMA has proven that it can deal with
> many zones successfully.
>
> So this patch adds both zones.
>
> This helps drivers that need a lot of memory below 4GB because their
> hardware cannot address anything higher (graphics drivers, both
> proprietary and free, video frame buffer drivers, sound drivers,
> etc.). Previously they could only use the IOMMU plus the 16MB
> GFP_DMA zone, which was not enough memory.
>
> Another common problem is hardware that can address the full memory
> range above 4GB for data, but not for some control structures in
> memory (like transmit rings or other metadata). Such drivers tended
> to allocate that memory from the 16MB GFP_DMA zone or through the
> IOMMU/swiotlb using pci_alloc_consistent, but that can tie up a lot
> of precious 16MB GFP_DMA/IOMMU/swiotlb memory (even on AMD systems
> the IOMMU tends to be quite small), especially if you have many
> devices.  With the new zone, pci_alloc_consistent can just put this
> stuff into memory below 4GB, which works better.
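
To make the control-structure case concrete, here is a minimal sketch
of how a driver might allocate its transmit ring.  The struct, sizes
and function names are invented for illustration; only
pci_alloc_consistent() itself is the real interface, and with this
patch its backing memory can come from the new zone instead of eating
16MB GFP_DMA or IOMMU/swiotlb space.

        #include <linux/pci.h>
        #include <linux/types.h>
        #include <linux/string.h>
        #include <linux/errno.h>

        #define MY_TX_RING_ENTRIES 256          /* invented ring size */

        struct my_tx_desc {                     /* invented descriptor layout */
                u64 buf_addr;
                u32 len;
                u32 flags;
        };

        static int my_alloc_tx_ring(struct pci_dev *pdev,
                                    struct my_tx_desc **ring,
                                    dma_addr_t *ring_dma)
        {
                size_t size = MY_TX_RING_ENTRIES * sizeof(struct my_tx_desc);

                /* Coherent memory for the ring.  The bus address returned
                 * in *ring_dma has to respect the device's addressing
                 * limit; with ZONE_DMA32 a <4GB limit no longer has to be
                 * satisfied from the tiny 16MB zone or the IOMMU/swiotlb. */
                *ring = pci_alloc_consistent(pdev, size, ring_dma);
                if (!*ring)
                        return -ENOMEM;
                memset(*ring, 0, size);
                return 0;
        }
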
>
> One remaining argument was whether the zone should be 4GB or 2GB.
> The main motivation for 2GB would be an unnamed, not-so-unpopular
> hardware RAID controller (mostly found in older machines from a
> particular four-letter company) that has a strange 2GB restriction
> in firmware. But that one works OK with swiotlb/IOMMU anyway, so it
> doesn't really need GFP_DMA32. I chose 4GB to be compatible with
> IA64 and because it seems to be the most common restriction.
>
> The new zone is so far added only for x86-64.
>
> For other architectures that don't set up this new zone, nothing
> changes. An architecture can set a compatibility define in Kconfig,
> CONFIG_DMA_IS_DMA32, which will define GFP_DMA32 as GFP_DMA.
> Otherwise it is a no-op, because on 32-bit architectures it is
> normally not needed anyway: GFP_NORMAL (=0) is already DMA-able
> enough.
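
As a rough sketch of the fallback described above (the gfp.h side of
the series is snipped here, so the exact values are my assumption, not
a quote from the patch), the compatibility define could look roughly
like this:

        /* include/linux/gfp.h -- sketch only, not the actual hunk */
        #ifdef CONFIG_DMA_IS_DMA32
        # define __GFP_DMA32    __GFP_DMA   /* compat: arch's DMA zone is already <4GB */
        #elif BITS_PER_LONG < 64
        # define __GFP_DMA32    0u          /* 32 bit: GFP_NORMAL (=0) is DMA-able enough */
        #else
        # define __GFP_DMA32    0x04u       /* has its own ZONE_DMA32 (x86-64 here) */
        #endif

        #define GFP_DMA32       __GFP_DMA32
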
>
> One problem is still that GFP_DMA means different things on
> different architectures. E.g. some drivers used to have "#ifdef ia64:
> use GFP_DMA (trusting it to be 4GB), #elif __x86_64__: use other
> hacks like the swiotlb because 16MB is not enough", and so on. This
> was quite ugly and is now obsolete.
>
> These should now be converted to use GFP_DMA32 unconditionally (I
> haven't done this yet), or better still, to use only
> pci_alloc_consistent/dma_alloc_coherent, which will use GFP_DMA32
> transparently.
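
For illustration, a sketch of what that conversion might look like in
a driver.  The function names here are invented; __get_free_pages(),
get_order() and dma_alloc_coherent() are the real interfaces:

        #include <linux/types.h>
        #include <linux/gfp.h>
        #include <linux/dma-mapping.h>
        #include <asm/page.h>

        /* Old style: per-architecture guesswork about what GFP_DMA means. */
        static void *old_alloc_below_4g(size_t size)
        {
        #if defined(__ia64__)
                /* on ia64, GFP_DMA happens to mean "below 4GB" */
                return (void *)__get_free_pages(GFP_KERNEL | GFP_DMA,
                                                get_order(size));
        #else
                return NULL;    /* x86-64 etc. needed swiotlb/IOMMU tricks instead */
        #endif
        }

        /* New style: one flag with the same meaning wherever the zone exists. */
        static void *new_alloc_below_4g(size_t size)
        {
                return (void *)__get_free_pages(GFP_KERNEL | GFP_DMA32,
                                                get_order(size));
        }

        /* Or, better still, go through the DMA API, which can use GFP_DMA32
         * transparently and also hands back the bus address. */
        static void *new_alloc_coherent(struct device *dev, size_t size,
                                        dma_addr_t *dma_handle)
        {
                return dma_alloc_coherent(dev, size, dma_handle, GFP_KERNEL);
        }
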
>
> Signed-off-by: Andi Kleen <[email protected]>
>

<snip>

> Index: linux/include/linux/mmzone.h
> ===================================================================
> --- linux.orig/include/linux/mmzone.h
> +++ linux/include/linux/mmzone.h
> @@ -70,11 +70,12 @@ struct per_cpu_pageset {
>  #endif
>
>  #define ZONE_DMA               0
> -#define ZONE_NORMAL            1
> -#define ZONE_HIGHMEM           2
> +#define ZONE_DMA32             1
> +#define ZONE_NORMAL            2
> +#define ZONE_HIGHMEM           3
>
> -#define MAX_NR_ZONES           3       /* Sync this with ZONES_SHIFT */
> -#define ZONES_SHIFT            2       /* ceil(log2(MAX_NR_ZONES)) */
> +#define MAX_NR_ZONES           4       /* Sync this with ZONES_SHIFT */
> +#define ZONES_SHIFT            3       /* ceil(log2(MAX_NR_ZONES)) */
>
>
>  /*
> @@ -90,7 +91,7 @@ struct per_cpu_pageset {
>   * be 8 (2 ** 3) zonelists.  GFP_ZONETYPES defines the number of possible
>   * combinations of zone modifiers in "zone modifier space".
>   */
> -#define GFP_ZONEMASK   0x03
> +#define GFP_ZONEMASK   0x07
>  /*
>   * As an optimisation any zone modifier bits which are only valid when
>   * no other zone modifier bits are set (loners) should be placed in
> @@ -110,6 +111,7 @@ struct per_cpu_pageset {
>   * into multiple physical zones. On a PC we have 3 zones:

Now 4 zones, so this comment needs updating.

>   *
>   * ZONE_DMA      < 16 MB       ISA DMA capable memory
> + * ZONE_DMA32       0 MB       Empty
>   * ZONE_NORMAL 16-896 MB       direct mapped by the kernel
>   * ZONE_HIGHMEM         > 896 MB       only page cache and user processes
>   */
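
For anyone following the mask arithmetic in the hunk above: the new
GFP_ZONEMASK value falls out of the three zone-modifier bits.
__GFP_DMA and __GFP_HIGHMEM already exist with the values shown; the
0x04 for the new __GFP_DMA32 bit is my assumption, since the gfp.h
hunk is snipped here:

        #define __GFP_DMA       0x01u
        #define __GFP_HIGHMEM   0x02u
        #define __GFP_DMA32     0x04u   /* new bit added by this patch (assumed value) */

        /* all zone-modifier bits together: 0x01 | 0x02 | 0x04 == 0x07 */
        #define GFP_ZONEMASK    (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32)
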

<snip>

--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/
