(patches against 2.6.20-mm1)

There is a fundamental deadlock associated with paging: writing out a page to
free memory can itself require free memory to complete. The usual solution is
to keep a small amount of memory available at all times so we can overcome this
problem. This, however, assumes the amount of memory needed for writeout is
(constant and) smaller than the provided reserve.

It is this latter assumption that breaks when doing writeout over the network.
The network can take up an unspecified amount of memory while waiting for a
reply to our write request. This re-introduces the deadlock; we might never
complete the writeout, for we might not have enough memory to receive the
completion message.

The proposed solution is simple: only allow traffic servicing the VM to make
use of the reserves.

Since the VM is always present to service, this limited amount of memory can
sustain a full connection; after a packet has been processed its memory can be
re-used for the next packet.

This, however, implies you know which packets are for whom, which generally
speaking you don't. Hence we need to receive all packets but discard them as
soon as we encounter a non-VM-bound packet allocated from the reserves.

Also, knowing that a packet is headed towards the VM needs a little help, hence
we introduce the socket flag SOCK_VMIO to mark sockets with.

Of course, since we are paging, all this has to happen in kernel-space, since
user-space might just not be there.

Since packet processing might also require memory, this all also implies that
those auxiliary allocations may use the reserves when an emergency packet is
processed. This is accomplished by using PF_MEMALLOC.

How much memory is to be reserved is also an issue: enough memory to saturate
both the route cache and IP fragment reassembly, along with various constants.

This patch-set comes in 5 parts:

  1) introduce the memory reserve and make the SLAB allocator play nice with it.
     patches 01-09

  2) add some needed infrastructure to the network code
     patches 10-12

  3) implement the idea outlined above
     patches 13-19

  4) teach the swap machinery to use generic address_spaces
     patches 20-23

  5) implement swap over NFS using all the new stuff
     patches 24-29
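The receive rule described in the cover letter (accept every packet, but drop one that was allocated from the reserves as soon as it turns out not to be bound for a VM socket) can be sketched as a small userspace model. This is only an illustration under stated assumptions, not code from the patches: `reserve_pool`, `sock_model`, `packet_model` and the three functions are hypothetical names, and the `vmio` field merely stands in for the SOCK_VMIO socket flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the emergency reserve: a fixed number of buffers. */
struct reserve_pool {
    size_t capacity;   /* buffers set aside for emergencies */
    size_t in_use;     /* buffers currently handed out */
};

/* Hypothetical socket; 'vmio' stands in for the SOCK_VMIO flag. */
struct sock_model {
    bool vmio;
};

/* Hypothetical packet; 'emergency' records where its memory came from. */
struct packet_model {
    bool emergency;
};

/*
 * Allocate a packet buffer. 'normal_ok' models whether an ordinary
 * allocation would succeed; only when it fails do we dip into the reserve.
 * Returns false when even the reserve is exhausted.
 */
static bool packet_alloc(struct reserve_pool *pool, bool normal_ok,
                         struct packet_model *pkt)
{
    if (normal_ok) {
        pkt->emergency = false;
        return true;
    }
    if (pool->in_use < pool->capacity) {
        pool->in_use++;
        pkt->emergency = true;
        return true;
    }
    return false;
}

/* Release a packet; reserve memory becomes available for the next packet. */
static void packet_free(struct reserve_pool *pool, struct packet_model *pkt)
{
    if (pkt->emergency) {
        assert(pool->in_use > 0);
        pool->in_use--;
        pkt->emergency = false;
    }
}

/*
 * Called once the destination socket is known. An emergency packet is
 * accepted only for a VM socket; everything else is dropped. Either way
 * the buffer is released afterwards. Returns true if the packet was
 * accepted, false if it was dropped.
 */
static bool packet_deliver(struct reserve_pool *pool, struct packet_model *pkt,
                           const struct sock_model *dest)
{
    bool accept = !pkt->emergency || dest->vmio;

    packet_free(pool, pkt);
    return accept;
}
```

In this model a reserve of a single buffer is enough to keep a VM connection moving, because each emergency packet gives its buffer back as soon as it has been processed or dropped. The real patches additionally let auxiliary allocations made during packet processing use the reserves via PF_MEMALLOC, which this sketch does not model.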
- Follow-Ups (all from Peter Zijlstra):
  - [PATCH 01/29] mm: page allocation rank
  - [PATCH 02/29] mm: slab allocation fairness
  - [PATCH 03/29] mm: allow PF_MEMALLOC from softirq context
  - [PATCH 04/29] mm: serialize access to min_free_kbytes
  - [PATCH 05/29] mm: emergency pool
  - [PATCH 06/29] mm: __GFP_EMERGENCY
  - [PATCH 07/29] mm: allow mempool to fall back to memalloc reserves
  - [PATCH 08/29] mm: kmem_cache_objs_to_pages()
  - [PATCH 09/29] selinux: tag avc cache alloc as non-critical
  - [PATCH 10/29] net: wrap sk->sk_backlog_rcv()
  - [PATCH 11/29] net: packet split receive api
  - [PATCH 12/29] net: remove alloc_skb_from_cache
  - [PATCH 13/29] netvm: link network to vm layer
  - [PATCH 14/29] netvm: INET reserves.
  - [PATCH 15/29] netvm: hook skb allocation to reserves
  - [PATCH 16/29] netvm: filter emergency skbs.
  - [PATCH 17/29] netvm: prevent a TCP specific deadlock
  - [PATCH 18/29] netfilter: notify about NF_QUEUE vs emergency skbs
  - [PATCH 19/29] netvm: skb processing
  - [PATCH 20/29] uml: rename arch/um remove_mapping()
  - [PATCH 21/29] mm: prepare swap entry methods for use in page methods
  - [PATCH 22/29] mm: add support for non block device backed swap files
  - [PATCH 23/29] mm: methods for teaching filesystems about PG_swapcache pages
  - [PATCH 24/29] nfs: remove mempools
  - [PATCH 25/29] nfs: only use stable storage for swap
  - [PATCH 26/29] nfs: teach the NFS client how to treat PG_swapcache pages
  - [PATCH 27/29] nfs: disable data cache revalidation for swapfiles
  - [PATCH 28/29] nfs: enable swap on NFS
  - [PATCH 29/29] balance_dirty_pages() vs throttle_vm_writeout() deadlock