Re: [patch] SMP alternatives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 24, 2005 at 08:29:53PM +0100, Andi Kleen wrote:
> > We implemented AMD's reference algorithm, and made it work in the presence
> > of a hardware IO hole.  It seems to work beautifully, but the last step is
> > turning a (node:chip-select) into a (node:dimm).  Simple boards will use
> > simple mappings, but we can't know that without board specific info.
> > Especially with quad-rank DIMMs. :)
> 
> If you get something working it would be good if you could share the code
> (even if it still needs to be tweaked) 

The below code works for us.  Note that I did not implement the
node-interleaving parts of the AMD algorithm.  If that matters, it should
be simple enough to do.  The BKDG has good pseudo-code.  The only thing it
gets absolutely wrong is the IO hole.

Let me know if you see any problems here.  This is a userspace tool, but
should be trivial to adapt.


#include <stdio.h>
#include <stdlib.h>
#include "pci.h"

#define MAX_NODES	8
#define MAX_CS		8
#define NODE_DEV(n)	(24+(n))
#define FOUR_GIG	0x01000000	// shifted to hold address[39..8]

static char *progname;

static void
usage(void)
{
	fprintf(stderr, "usage: %s <address>\n", progname);
}

/*
 * Map a CS to a pair of DIMM slots (for 128 bit operation).  This mapping
 * is board-specific, and has the potential to be very ugly.
 */
static int cs_to_pair[MAX_CS][2] = {
	{ 0, 1 }, { 0, 1 },
	{ 2, 3 }, { 2, 3 },
	{ 4, 5 }, { 4, 5 },
	{ 6, 7 }, { 6, 7 },
};

int
main(int argc, char *argv[])
{
	unsigned long long raw_addr;
	uint32_t address;
	char *endp;
	int node;

	progname = argv[0];

	if (argc != 2) {
		usage();
		exit(1);
	}
	raw_addr = strtoull(argv[1], &endp, 0);
	if (endp == argv[1]) {
		usage();
		exit(2);
	}

	/*
	 * The address space is 40 bits (for now).  We want to use 32 bit
	 * values, so we convert the input address to hold address[39..8].
	 * We'll use this format everywhere.  This loses the
	 * low-order bits, so we keep a raw copy around.
	 */
	address = (raw_addr & 0xffffffffff) >> 8;

	/* find the node that holds this address */
	for (node = 0; node < MAX_NODES; node++) {
		int pci_dev;
		uint32_t tmp;
		int dram_en;
		uint32_t dram_base;
		uint32_t dram_limit;
		int hole_en;
		uint32_t hole_base;
		uint32_t hole_size;
		int cs;

		/*
		 * The DRAM map and IO hole info are in function 1 of each
		 * node.
		 */
		pci_dev = pci_open(0, NODE_DEV(node), 1);
		if (pci_dev < 0) {
			/* node does not exist */
			break;
		}

		/*
		 * DRAM_BASE and DRAM_LIMIT already hold address[39..8].
		 */
		tmp = pci_read32(pci_dev, 0x40+(node*8));
		dram_en = tmp & 0x3;
		dram_base = tmp & 0xffff0000;

		tmp = pci_read32(pci_dev, 0x44+(node*8));
		dram_limit = tmp | 0x0000ffff;

		/*
		 * HOLE_BASE holds address[35..4], so we convert it to hold
		 * address[39..8].
		 */
		tmp = pci_read32(pci_dev, 0xf0);
		hole_en = tmp & 0x1;
		hole_base = (tmp & 0xff000000) >> 8;
		hole_size = FOUR_GIG - hole_base;

		pci_close(pci_dev);

		if (!dram_en) {
			/* no DRAM here */
			continue;
		}

		if (address > dram_limit) {
			/* keep looking */
			continue;
		}

		/*
		 * The address must be on this node.
		 */

		/* is the address in the IO hole? */
		if (hole_en && address >= hole_base && address < FOUR_GIG) {
			/* no DRAM in the IO hole */
			break;
		}

		/* is the address >= 4GB on a node with a hole? */
		if (hole_en && address >= FOUR_GIG) {
			/* adjust address for the IO hole */
			address -= hole_size;
		}

		/* adjust address to be node-relative */
		address -= dram_base;

		/* store addr[35..4] */
		address <<= 4;

		/* The chip-select map is in function 2 of each node. */
		pci_dev = pci_open(0, NODE_DEV(node), 2);

		/* find the chip-select that has this address */
		for (cs = 0; cs < MAX_CS; cs++) {
			uint32_t cs_base;
			uint32_t cs_mask;

			/*
			 * CS_BASE and CS_MASK hold address[35..0], so we
			 * convert them to hold address[39..8].
			 */
			tmp = pci_read32(pci_dev, 0x40+(cs*4));
			if ((tmp & 0x1) == 0) {
				/* this CS is not enabled */
				continue;
			}
			cs_base = tmp & 0xffe0fe00;

			tmp = pci_read32(pci_dev, 0x60+(cs*4));
			cs_mask = (tmp | 0x001f01ff) & 0x3fffffff;

			/* this should handle interleaving, too */
			if ((address & ~cs_mask) == (cs_base & ~cs_mask)) {
				int *pair = cs_to_pair[cs];
				int dimm;

				/* which DIMM is this address on? */
				if ((raw_addr & (1<<3)) == 0) {
					dimm = pair[0];
				} else {
					dimm = pair[1];
				}

				printf("node %d, CS %d, DIMM %d\n",
				       node, cs, dimm);
				break;
			}
		}
		pci_close(pci_dev);
		break;
	}
	return 0;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux