Merge tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra into next/soc

From: Stephen Warren:
ARM: tegra: core SoC enhancements for 3.12

This branch includes a number of enhancements to core SoC support for
Tegra devices. The major new features are:

* Adds a new CPU-power-gated cpuidle state for Tegra114.
* Adds initial system suspend support for Tegra114, initially supporting
  just CPU-power-gating during suspend.
* Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
  both gates CPU power, and places the DRAM into self-refresh mode.
* A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
  from arch/arm/mach-tegra/ to drivers/pci/host/.

The PCIe driver work depends on the following tag from Thomas Petazzoni:
git://git.infradead.org/linux-mvebu.git mis-3.12.2
... which is merged into the middle of this pull request.

* tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra: (33 commits)
  ARM: tegra: disable LP2 cpuidle state if PCIe is enabled
  MAINTAINERS: Add myself as Tegra PCIe maintainer
  PCI: tegra: set up PADS_REFCLK_CFG1
  PCI: tegra: Add Tegra 30 PCIe support
  PCI: tegra: Move PCIe driver to drivers/pci/host
  PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platforms
  ARM: tegra: add LP1 suspend support for Tegra114
  ARM: tegra: add LP1 suspend support for Tegra20
  ARM: tegra: add LP1 suspend support for Tegra30
  ARM: tegra: add common LP1 suspend support
  clk: tegra114: add LP1 suspend/resume support
  ARM: tegra: config the polarity of the request of sys clock
  ARM: tegra: add common resume handling code for LP1 resuming
  ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pci
  of: pci: add registry of MSI chips
  PCI: Introduce new MSI chip infrastructure
  PCI: remove ARCH_SUPPORTS_MSI kconfig option
  PCI: use weak functions for MSI arch-specific functions
  ARM: tegra: unify Tegra's Kconfig a bit more
  ARM: tegra: remove the limitation that Tegra114 can't support suspend
  ...

Signed-off-by: Kevin Hilman <khilman@linaro.org>
diff --git a/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt b/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
new file mode 100644
index 0000000..6b75107
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
@@ -0,0 +1,163 @@
+NVIDIA Tegra PCIe controller
+
+Required properties:
+- compatible: "nvidia,tegra20-pcie" or "nvidia,tegra30-pcie"
+- device_type: Must be "pci"
+- reg: A list of physical base address and length for each set of controller
+  registers. Must contain an entry for each entry in the reg-names property.
+- reg-names: Must include the following entries:
+  "pads": PADS registers
+  "afi": AFI registers
+  "cs": configuration space region
+- interrupts: A list of interrupt outputs of the controller. Must contain an
+  entry for each entry in the interrupt-names property.
+- interrupt-names: Must include the following entries:
+  "intr": The Tegra interrupt that is asserted for controller interrupts
+  "msi": The Tegra interrupt that is asserted when an MSI is received
+- pex-clk-supply: Supply voltage for internal reference clock
+- vdd-supply: Power supply for controller (1.05V)
+- avdd-supply: Power supply for controller (1.05V) (not required for Tegra20)
+- bus-range: Range of bus numbers associated with this controller
+- #address-cells: Address representation for root ports (must be 3)
+  - cell 0 specifies the bus and device numbers of the root port:
+    [23:16]: bus number
+    [15:11]: device number
+  - cell 1 denotes the upper 32 address bits and should be 0
+  - cell 2 contains the lower 32 address bits and is used to translate to the
+    CPU address space
+- #size-cells: Size representation for root ports (must be 2)
+- ranges: Describes the translation of addresses for root ports and standard
+  PCI regions. The entries must be 6 cells each, where the first three cells
+  correspond to the address as described for the #address-cells property
+  above, the fourth cell is the physical CPU address to translate to and the
+  fifth and six cells are as described for the #size-cells property above.
+  - The first two entries are expected to translate the addresses for the root
+    port registers, which are referenced by the assigned-addresses property of
+    the root port nodes (see below).
+  - The remaining entries setup the mapping for the standard I/O, memory and
+    prefetchable PCI regions. The first cell determines the type of region
+    that is setup:
+    - 0x81000000: I/O memory region
+    - 0x82000000: non-prefetchable memory region
+    - 0xc2000000: prefetchable memory region
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- clocks: List of clock inputs of the controller. Must contain an entry for
+  each entry in the clock-names property.
+- clock-names: Must include the following entries:
+  "pex": The Tegra clock of that name
+  "afi": The Tegra clock of that name
+  "pcie_xclk": The Tegra clock of that name
+  "pll_e": The Tegra clock of that name
+  "cml": The Tegra clock of that name (not required for Tegra20)
+
+Root ports are defined as subnodes of the PCIe controller node.
+
+Required properties:
+- device_type: Must be "pci"
+- assigned-addresses: Address and size of the port configuration registers
+- reg: PCI bus address of the root port
+- #address-cells: Must be 3
+- #size-cells: Must be 2
+- ranges: Sub-ranges distributed from the PCIe controller node. An empty
+  property is sufficient.
+- nvidia,num-lanes: Number of lanes to use for this port. Valid combinations
+  are:
+  - Root port 0 uses 4 lanes, root port 1 is unused.
+  - Both root ports use 2 lanes.
+
+Example:
+
+SoC DTSI:
+
+	pcie-controller {
+		compatible = "nvidia,tegra20-pcie";
+		device_type = "pci";
+		reg = <0x80003000 0x00000800   /* PADS registers */
+		       0x80003800 0x00000200   /* AFI registers */
+		       0x90000000 0x10000000>; /* configuration space */
+		reg-names = "pads", "afi", "cs";
+		interrupts = <0 98 0x04   /* controller interrupt */
+		              0 99 0x04>; /* MSI interrupt */
+		interrupt-names = "intr", "msi";
+
+		bus-range = <0x00 0xff>;
+		#address-cells = <3>;
+		#size-cells = <2>;
+
+		ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000   /* port 0 registers */
+			  0x82000000 0 0x80001000 0x80001000 0 0x00001000   /* port 1 registers */
+			  0x81000000 0 0          0x82000000 0 0x00010000   /* downstream I/O */
+			  0x82000000 0 0xa0000000 0xa0000000 0 0x10000000   /* non-prefetchable memory */
+			  0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
+
+		clocks = <&tegra_car 70>, <&tegra_car 72>, <&tegra_car 74>,
+			 <&tegra_car 118>;
+		clock-names = "pex", "afi", "pcie_xclk", "pll_e";
+		status = "disabled";
+
+		pci@1,0 {
+			device_type = "pci";
+			assigned-addresses = <0x82000800 0 0x80000000 0 0x1000>;
+			reg = <0x000800 0 0 0 0>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges;
+
+			nvidia,num-lanes = <2>;
+		};
+
+		pci@2,0 {
+			device_type = "pci";
+			assigned-addresses = <0x82001000 0 0x80001000 0 0x1000>;
+			reg = <0x001000 0 0 0 0>;
+			status = "disabled";
+
+			#address-cells = <3>;
+			#size-cells = <2>;
+
+			ranges;
+
+			nvidia,num-lanes = <2>;
+		};
+	};
+
+
+Board DTS:
+
+	pcie-controller {
+		status = "okay";
+
+		vdd-supply = <&pci_vdd_reg>;
+		pex-clk-supply = <&pci_clk_reg>;
+
+		/* root port 00:01.0 */
+		pci@1,0 {
+			status = "okay";
+
+			/* bridge 01:00.0 (optional) */
+			pci@0,0 {
+				reg = <0x010000 0 0 0 0>;
+
+				#address-cells = <3>;
+				#size-cells = <2>;
+
+				device_type = "pci";
+
+				/* endpoint 02:00.0 */
+				pci@0,0 {
+					reg = <0x020000 0 0 0 0>;
+				};
+			};
+		};
+	};
+
+Note that devices on the PCI bus are dynamically discovered using PCI's bus
+enumeration and therefore don't need corresponding device nodes in DT. However
+if a device on the PCI bus provides a non-probeable bus such as I2C or SPI,
+device nodes need to be added in order to allow the bus' children to be
+instantiated at the proper location in the operating system's device tree (as
+illustrated by the optional nodes in the example above).
diff --git a/MAINTAINERS b/MAINTAINERS
index 7cacc88..8a261b5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6275,6 +6275,13 @@
 F:	drivers/pci/
 F:	include/linux/pci*
 
+PCI DRIVER FOR NVIDIA TEGRA
+M:	Thierry Reding <thierry.reding@gmail.com>
+L:	linux-tegra@vger.kernel.org
+S:	Supported
+F:	Documentation/devicetree/bindings/pci/nvidia,tegra20-pcie.txt
+F:	drivers/pci/host/pci-tegra.c
+
 PCMCIA SUBSYSTEM
 P:	Linux PCMCIA Team
 L:	linux-pcmcia@lists.infradead.org
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 50eada8..a3267d7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -441,7 +441,6 @@
 config ARCH_IOP13XX
 	bool "IOP13xx-based"
 	depends on MMU
-	select ARCH_SUPPORTS_MSI
 	select CPU_XSC3
 	select NEED_MACH_MEMORY_H
 	select NEED_RET_TO_USER
diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h
index a1c90d7..454d642 100644
--- a/arch/arm/include/asm/mach/pci.h
+++ b/arch/arm/include/asm/mach/pci.h
@@ -36,6 +36,8 @@
 					  resource_size_t start,
 					  resource_size_t size,
 					  resource_size_t align);
+	void		(*add_bus)(struct pci_bus *bus);
+	void		(*remove_bus)(struct pci_bus *bus);
 };
 
 /*
@@ -63,6 +65,8 @@
 					  resource_size_t start,
 					  resource_size_t size,
 					  resource_size_t align);
+	void		(*add_bus)(struct pci_bus *bus);
+	void		(*remove_bus)(struct pci_bus *bus);
 	void		*private_data;	/* platform controller private data	*/
 };
 
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 261fcc8..1ec9c87 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -363,6 +363,20 @@
 }
 EXPORT_SYMBOL(pcibios_fixup_bus);
 
+void pcibios_add_bus(struct pci_bus *bus)
+{
+	struct pci_sys_data *sys = bus->sysdata;
+	if (sys->add_bus)
+		sys->add_bus(bus);
+}
+
+void pcibios_remove_bus(struct pci_bus *bus)
+{
+	struct pci_sys_data *sys = bus->sysdata;
+	if (sys->remove_bus)
+		sys->remove_bus(bus);
+}
+
 /*
  * Swizzle the device pin each time we cross a bridge.  If a platform does
  * not provide a swizzle function, we perform the standard PCI swizzling.
@@ -464,6 +478,8 @@
 		sys->swizzle = hw->swizzle;
 		sys->map_irq = hw->map_irq;
 		sys->align_resource = hw->align_resource;
+		sys->add_bus = hw->add_bus;
+		sys->remove_bus = hw->remove_bus;
 		INIT_LIST_HEAD(&sys->resources);
 
 		if (hw->private_data)
diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig
index ef3a8da..def0564 100644
--- a/arch/arm/mach-tegra/Kconfig
+++ b/arch/arm/mach-tegra/Kconfig
@@ -2,19 +2,27 @@
 	bool "NVIDIA Tegra" if ARCH_MULTI_V7
 	select ARCH_HAS_CPUFREQ
 	select ARCH_REQUIRE_GPIOLIB
+	select ARM_GIC
 	select CLKDEV_LOOKUP
 	select CLKSRC_MMIO
 	select CLKSRC_OF
 	select COMMON_CLK
+	select CPU_V7
 	select GENERIC_CLOCKEVENTS
 	select HAVE_ARM_SCU if SMP
 	select HAVE_ARM_TWD if LOCAL_TIMERS
 	select HAVE_CLK
 	select HAVE_SMP
 	select MIGHT_HAVE_CACHE_L2X0
+	select PINCTRL
 	select SOC_BUS
 	select SPARSE_IRQ
+	select USB_ARCH_HAS_EHCI if USB_SUPPORT
+	select USB_ULPI if USB_PHY
+	select USB_ULPI_VIEWPORT if USB_PHY
 	select USE_OF
+	select MIGHT_HAVE_PCI
+	select ARCH_SUPPORTS_MSI
 	help
 	  This enables support for NVIDIA Tegra based systems.
 
@@ -27,15 +35,9 @@
 	select ARM_ERRATA_720789
 	select ARM_ERRATA_754327 if SMP
 	select ARM_ERRATA_764369 if SMP
-	select ARM_GIC
-	select CPU_V7
-	select PINCTRL
 	select PINCTRL_TEGRA20
 	select PL310_ERRATA_727915 if CACHE_L2X0
 	select PL310_ERRATA_769419 if CACHE_L2X0
-	select USB_ARCH_HAS_EHCI if USB_SUPPORT
-	select USB_ULPI if USB_PHY
-	select USB_ULPI_VIEWPORT if USB_PHY
 	help
 	  Support for NVIDIA Tegra AP20 and T20 processors, based on the
 	  ARM CortexA9MP CPU and the ARM PL310 L2 cache controller
@@ -44,14 +46,8 @@
 	bool "Enable support for Tegra30 family"
 	select ARM_ERRATA_754322
 	select ARM_ERRATA_764369 if SMP
-	select ARM_GIC
-	select CPU_V7
-	select PINCTRL
 	select PINCTRL_TEGRA30
 	select PL310_ERRATA_769419 if CACHE_L2X0
-	select USB_ARCH_HAS_EHCI if USB_SUPPORT
-	select USB_ULPI if USB_PHY
-	select USB_ULPI_VIEWPORT if USB_PHY
 	help
 	  Support for NVIDIA Tegra T30 processor family, based on the
 	  ARM CortexA9MP CPU and the ARM PL310 L2 cache controller
@@ -59,20 +55,13 @@
 config ARCH_TEGRA_114_SOC
 	bool "Enable support for Tegra114 family"
 	select HAVE_ARM_ARCH_TIMER
-	select ARM_GIC
+	select ARM_ERRATA_798181
 	select ARM_L1_CACHE_SHIFT_6
-	select CPU_V7
-	select PINCTRL
 	select PINCTRL_TEGRA114
 	help
 	  Support for NVIDIA Tegra T114 processor family, based on the
 	  ARM CortexA15MP CPU
 
-config TEGRA_PCI
-	bool "PCI Express support"
-	depends on ARCH_TEGRA_2x_SOC
-	select PCI
-
 config TEGRA_AHB
 	bool "Enable AHB driver for NVIDIA Tegra SoCs"
 	default y
diff --git a/arch/arm/mach-tegra/Makefile b/arch/arm/mach-tegra/Makefile
index 98b184e..e7e5f45 100644
--- a/arch/arm/mach-tegra/Makefile
+++ b/arch/arm/mach-tegra/Makefile
@@ -17,24 +17,24 @@
 obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= tegra20_speedo.o
 obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= tegra2_emc.o
 obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= sleep-tegra20.o
+obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= pm-tegra20.o
 ifeq ($(CONFIG_CPU_IDLE),y)
 obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= cpuidle-tegra20.o
 endif
 obj-$(CONFIG_ARCH_TEGRA_3x_SOC)		+= tegra30_speedo.o
 obj-$(CONFIG_ARCH_TEGRA_3x_SOC)		+= sleep-tegra30.o
+obj-$(CONFIG_ARCH_TEGRA_3x_SOC)		+= pm-tegra30.o
 ifeq ($(CONFIG_CPU_IDLE),y)
 obj-$(CONFIG_ARCH_TEGRA_3x_SOC)		+= cpuidle-tegra30.o
 endif
 obj-$(CONFIG_SMP)			+= platsmp.o headsmp.o
 obj-$(CONFIG_HOTPLUG_CPU)               += hotplug.o
-obj-$(CONFIG_TEGRA_PCI)			+= pcie.o
 
 obj-$(CONFIG_ARCH_TEGRA_114_SOC)	+= tegra114_speedo.o
 obj-$(CONFIG_ARCH_TEGRA_114_SOC)	+= sleep-tegra30.o
+obj-$(CONFIG_ARCH_TEGRA_114_SOC)	+= pm-tegra30.o
 ifeq ($(CONFIG_CPU_IDLE),y)
 obj-$(CONFIG_ARCH_TEGRA_114_SOC)	+= cpuidle-tegra114.o
 endif
 
-obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= board-harmony-pcie.o
-
 obj-$(CONFIG_ARCH_TEGRA_2x_SOC)		+= board-paz00.o
diff --git a/arch/arm/mach-tegra/board-harmony-pcie.c b/arch/arm/mach-tegra/board-harmony-pcie.c
deleted file mode 100644
index 035b240..0000000
--- a/arch/arm/mach-tegra/board-harmony-pcie.c
+++ /dev/null
@@ -1,89 +0,0 @@
-/*
- * arch/arm/mach-tegra/board-harmony-pcie.c
- *
- * Copyright (C) 2010 CompuLab, Ltd.
- * Mike Rapoport <mike@compulab.co.il>
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#include <linux/kernel.h>
-#include <linux/gpio.h>
-#include <linux/err.h>
-#include <linux/of_gpio.h>
-#include <linux/regulator/consumer.h>
-
-#include <asm/mach-types.h>
-
-#include "board.h"
-
-#ifdef CONFIG_TEGRA_PCI
-
-int __init harmony_pcie_init(void)
-{
-	struct device_node *np;
-	int en_vdd_1v05;
-	struct regulator *regulator = NULL;
-	int err;
-
-	np = of_find_node_by_path("/regulators/regulator@3");
-	if (!np) {
-		pr_err("%s: of_find_node_by_path failed\n", __func__);
-		return -ENODEV;
-	}
-
-	en_vdd_1v05 = of_get_named_gpio(np, "gpio", 0);
-	if (en_vdd_1v05 < 0) {
-		pr_err("%s: of_get_named_gpio failed: %d\n", __func__,
-		       en_vdd_1v05);
-		return en_vdd_1v05;
-	}
-
-	err = gpio_request(en_vdd_1v05, "EN_VDD_1V05");
-	if (err) {
-		pr_err("%s: gpio_request failed: %d\n", __func__, err);
-		return err;
-	}
-
-	gpio_direction_output(en_vdd_1v05, 1);
-
-	regulator = regulator_get(NULL, "vdd_ldo0,vddio_pex_clk");
-	if (IS_ERR(regulator)) {
-		err = PTR_ERR(regulator);
-		pr_err("%s: regulator_get failed: %d\n", __func__, err);
-		goto err_reg;
-	}
-
-	err = regulator_enable(regulator);
-	if (err) {
-		pr_err("%s: regulator_enable failed: %d\n", __func__, err);
-		goto err_en;
-	}
-
-	err = tegra_pcie_init(true, true);
-	if (err) {
-		pr_err("%s: tegra_pcie_init failed: %d\n", __func__, err);
-		goto err_pcie;
-	}
-
-	return 0;
-
-err_pcie:
-	regulator_disable(regulator);
-err_en:
-	regulator_put(regulator);
-err_reg:
-	gpio_free(en_vdd_1v05);
-
-	return err;
-}
-
-#endif
diff --git a/arch/arm/mach-tegra/board.h b/arch/arm/mach-tegra/board.h
index 9a6659f..db6810d 100644
--- a/arch/arm/mach-tegra/board.h
+++ b/arch/arm/mach-tegra/board.h
@@ -31,7 +31,6 @@
 void __init tegra_map_common_io(void);
 void __init tegra_init_irq(void);
 void __init tegra_dt_init_irq(void);
-int __init tegra_pcie_init(bool init_port0, bool init_port1);
 
 void tegra_init_late(void);
 
@@ -48,13 +47,6 @@
 static inline int tegra_powergate_debugfs_init(void) { return 0; }
 #endif
 
-int __init harmony_regulator_init(void);
-#ifdef CONFIG_TEGRA_PCI
-int __init harmony_pcie_init(void);
-#else
-static inline int harmony_pcie_init(void) { return 0; }
-#endif
-
 void __init tegra_paz00_wifikill_init(void);
 
 #endif
diff --git a/arch/arm/mach-tegra/common.h b/arch/arm/mach-tegra/common.h
index 32f8eb3..5900cc4 100644
--- a/arch/arm/mach-tegra/common.h
+++ b/arch/arm/mach-tegra/common.h
@@ -2,4 +2,3 @@
 
 extern int tegra_cpu_kill(unsigned int cpu);
 extern void tegra_cpu_die(unsigned int cpu);
-extern int tegra_cpu_disable(unsigned int cpu);
diff --git a/arch/arm/mach-tegra/cpuidle-tegra114.c b/arch/arm/mach-tegra/cpuidle-tegra114.c
index 1d1c602..e0b8730 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra114.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra114.c
@@ -17,15 +17,64 @@
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/cpuidle.h>
+#include <linux/cpu_pm.h>
+#include <linux/clockchips.h>
 
 #include <asm/cpuidle.h>
+#include <asm/suspend.h>
+#include <asm/smp_plat.h>
+
+#include "pm.h"
+#include "sleep.h"
+
+#ifdef CONFIG_PM_SLEEP
+#define TEGRA114_MAX_STATES 2
+#else
+#define TEGRA114_MAX_STATES 1
+#endif
+
+#ifdef CONFIG_PM_SLEEP
+static int tegra114_idle_power_down(struct cpuidle_device *dev,
+				    struct cpuidle_driver *drv,
+				    int index)
+{
+	local_fiq_disable();
+
+	tegra_set_cpu_in_lp2();
+	cpu_pm_enter();
+
+	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+
+	cpu_suspend(0, tegra30_sleep_cpu_secondary_finish);
+
+	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+
+	cpu_pm_exit();
+	tegra_clear_cpu_in_lp2();
+
+	local_fiq_enable();
+
+	return index;
+}
+#endif
 
 static struct cpuidle_driver tegra_idle_driver = {
 	.name = "tegra_idle",
 	.owner = THIS_MODULE,
-	.state_count = 1,
+	.state_count = TEGRA114_MAX_STATES,
 	.states = {
 		[0] = ARM_CPUIDLE_WFI_STATE_PWR(600),
+#ifdef CONFIG_PM_SLEEP
+		[1] = {
+			.enter			= tegra114_idle_power_down,
+			.exit_latency		= 500,
+			.target_residency	= 1000,
+			.power_usage		= 0,
+			.flags			= CPUIDLE_FLAG_TIME_VALID,
+			.name			= "powered-down",
+			.desc			= "CPU power gated",
+		},
+#endif
 	},
 };
 
diff --git a/arch/arm/mach-tegra/cpuidle-tegra20.c b/arch/arm/mach-tegra/cpuidle-tegra20.c
index 706aa42..b82dcae 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra20.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra20.c
@@ -211,6 +211,18 @@
 }
 #endif
 
+/*
+ * Tegra20 HW appears to have a bug such that PCIe device interrupts, whether
+ * they are legacy IRQs or MSI, are lost when LP2 is enabled. To work around
+ * this, simply disable LP2 if the PCI driver and DT node are both enabled.
+ */
+void tegra20_cpuidle_pcie_irqs_in_use(void)
+{
+	pr_info_once(
+		"Disabling cpuidle LP2 state, since PCIe IRQs are in use\n");
+	tegra_idle_driver.states[1].disabled = true;
+}
+
 int __init tegra20_cpuidle_init(void)
 {
 	return cpuidle_register(&tegra_idle_driver, cpu_possible_mask);
diff --git a/arch/arm/mach-tegra/cpuidle.c b/arch/arm/mach-tegra/cpuidle.c
index e85973c..0961dfc 100644
--- a/arch/arm/mach-tegra/cpuidle.c
+++ b/arch/arm/mach-tegra/cpuidle.c
@@ -44,3 +44,13 @@
 		break;
 	}
 }
+
+void tegra_cpuidle_pcie_irqs_in_use(void)
+{
+	switch (tegra_chip_id) {
+	case TEGRA20:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_2x_SOC))
+			tegra20_cpuidle_pcie_irqs_in_use();
+		break;
+	}
+}
diff --git a/arch/arm/mach-tegra/cpuidle.h b/arch/arm/mach-tegra/cpuidle.h
index 9ec2c1a..c017dab 100644
--- a/arch/arm/mach-tegra/cpuidle.h
+++ b/arch/arm/mach-tegra/cpuidle.h
@@ -19,6 +19,7 @@
 
 #ifdef CONFIG_CPU_IDLE
 int tegra20_cpuidle_init(void);
+void tegra20_cpuidle_pcie_irqs_in_use(void);
 int tegra30_cpuidle_init(void);
 int tegra114_cpuidle_init(void);
 void tegra_cpuidle_init(void);
diff --git a/arch/arm/mach-tegra/flowctrl.c b/arch/arm/mach-tegra/flowctrl.c
index b477ef3..5348543 100644
--- a/arch/arm/mach-tegra/flowctrl.c
+++ b/arch/arm/mach-tegra/flowctrl.c
@@ -86,6 +86,7 @@
 		reg |= TEGRA20_FLOW_CTRL_CSR_WFE_CPU0 << cpuid;
 		break;
 	case TEGRA30:
+	case TEGRA114:
 		/* clear wfe bitmap */
 		reg &= ~TEGRA30_FLOW_CTRL_CSR_WFE_BITMAP;
 		/* clear wfi bitmap */
@@ -123,6 +124,7 @@
 		reg &= ~TEGRA20_FLOW_CTRL_CSR_WFI_BITMAP;
 		break;
 	case TEGRA30:
+	case TEGRA114:
 		/* clear wfe bitmap */
 		reg &= ~TEGRA30_FLOW_CTRL_CSR_WFE_BITMAP;
 		/* clear wfi bitmap */
diff --git a/arch/arm/mach-tegra/flowctrl.h b/arch/arm/mach-tegra/flowctrl.h
index 7a29bae..c89aac6 100644
--- a/arch/arm/mach-tegra/flowctrl.h
+++ b/arch/arm/mach-tegra/flowctrl.h
@@ -28,9 +28,18 @@
 #define FLOW_CTRL_SCLK_RESUME		(1 << 27)
 #define FLOW_CTRL_HALT_CPU_IRQ		(1 << 10)
 #define	FLOW_CTRL_HALT_CPU_FIQ		(1 << 8)
+#define FLOW_CTRL_HALT_LIC_IRQ		(1 << 11)
+#define FLOW_CTRL_HALT_LIC_FIQ		(1 << 10)
+#define FLOW_CTRL_HALT_GIC_IRQ		(1 << 9)
+#define FLOW_CTRL_HALT_GIC_FIQ		(1 << 8)
 #define FLOW_CTRL_CPU0_CSR		0x8
 #define	FLOW_CTRL_CSR_INTR_FLAG		(1 << 15)
 #define FLOW_CTRL_CSR_EVENT_FLAG	(1 << 14)
+#define FLOW_CTRL_CSR_ENABLE_EXT_CRAIL	(1 << 13)
+#define FLOW_CTRL_CSR_ENABLE_EXT_NCPU	(1 << 12)
+#define FLOW_CTRL_CSR_ENABLE_EXT_MASK ( \
+		FLOW_CTRL_CSR_ENABLE_EXT_NCPU | \
+		FLOW_CTRL_CSR_ENABLE_EXT_CRAIL)
 #define FLOW_CTRL_CSR_ENABLE		(1 << 0)
 #define FLOW_CTRL_HALT_CPU1_EVENTS	0x14
 #define FLOW_CTRL_CPU1_CSR		0x18
diff --git a/arch/arm/mach-tegra/headsmp.S b/arch/arm/mach-tegra/headsmp.S
index 045c16f..2072e73 100644
--- a/arch/arm/mach-tegra/headsmp.S
+++ b/arch/arm/mach-tegra/headsmp.S
@@ -6,6 +6,7 @@
         .section ".text.head", "ax"
 
 ENTRY(tegra_secondary_startup)
-        bl      v7_invalidate_l1
+        check_cpu_part_num 0xc09, r8, r9
+        bleq    v7_invalidate_l1
         b       secondary_startup
 ENDPROC(tegra_secondary_startup)
diff --git a/arch/arm/mach-tegra/hotplug.c b/arch/arm/mach-tegra/hotplug.c
index a52c10e..04de2e8 100644
--- a/arch/arm/mach-tegra/hotplug.c
+++ b/arch/arm/mach-tegra/hotplug.c
@@ -37,7 +37,7 @@
 void __ref tegra_cpu_die(unsigned int cpu)
 {
 	/* Clean L1 data cache */
-	tegra_disable_clean_inv_dcache();
+	tegra_disable_clean_inv_dcache(TEGRA_FLUSH_CACHE_LOUIS);
 
 	/* Shut down the current CPU. */
 	tegra_hotplug_shutdown();
@@ -46,17 +46,6 @@
 	BUG();
 }
 
-int tegra_cpu_disable(unsigned int cpu)
-{
-	switch (tegra_chip_id) {
-	case TEGRA20:
-	case TEGRA30:
-		return cpu == 0 ? -EPERM : 0;
-	default:
-		return 0;
-	}
-}
-
 void __init tegra_hotplug_init(void)
 {
 	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
diff --git a/arch/arm/mach-tegra/iomap.h b/arch/arm/mach-tegra/iomap.h
index 399fbca..3f5fa07 100644
--- a/arch/arm/mach-tegra/iomap.h
+++ b/arch/arm/mach-tegra/iomap.h
@@ -24,6 +24,8 @@
 #define TEGRA_IRAM_BASE			0x40000000
 #define TEGRA_IRAM_SIZE			SZ_256K
 
+#define TEGRA_IRAM_CODE_AREA		(TEGRA_IRAM_BASE + SZ_4K)
+
 #define TEGRA_HOST1X_BASE		0x50000000
 #define TEGRA_HOST1X_SIZE		0x24000
 
@@ -237,6 +239,12 @@
 #define TEGRA_KFUSE_BASE		0x7000FC00
 #define TEGRA_KFUSE_SIZE		SZ_1K
 
+#define TEGRA_EMC0_BASE			0x7001A000
+#define TEGRA_EMC0_SIZE			SZ_2K
+
+#define TEGRA_EMC1_BASE			0x7001A800
+#define TEGRA_EMC1_SIZE			SZ_2K
+
 #define TEGRA_CSITE_BASE		0x70040000
 #define TEGRA_CSITE_SIZE		SZ_256K
 
@@ -278,9 +286,6 @@
 #define IO_APB_VIRT	IOMEM(0xFE300000)
 #define IO_APB_SIZE	SZ_1M
 
-#define TEGRA_PCIE_BASE		0x80000000
-#define TEGRA_PCIE_IO_BASE	(TEGRA_PCIE_BASE + SZ_4M)
-
 #define IO_TO_VIRT_BETWEEN(p, st, sz)	((p) >= (st) && (p) < ((st) + (sz)))
 #define IO_TO_VIRT_XLATE(p, pst, vst)	(((p) - (pst) + (vst)))
 
diff --git a/arch/arm/mach-tegra/irq.c b/arch/arm/mach-tegra/irq.c
index 0de4eed..1a74d56 100644
--- a/arch/arm/mach-tegra/irq.c
+++ b/arch/arm/mach-tegra/irq.c
@@ -18,10 +18,12 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/cpu_pm.h>
 #include <linux/interrupt.h>
 #include <linux/irq.h>
 #include <linux/io.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/irqchip/arm-gic.h>
 #include <linux/syscore_ops.h>
 
@@ -65,6 +67,7 @@
 static u32 cpu_iep[TEGRA_MAX_NUM_ICTLRS];
 
 static u32 ictlr_wake_mask[TEGRA_MAX_NUM_ICTLRS];
+static void __iomem *tegra_gic_cpu_base;
 #endif
 
 bool tegra_pending_sgi(void)
@@ -213,8 +216,43 @@
 
 	return 0;
 }
+
+static int tegra_gic_notifier(struct notifier_block *self,
+			      unsigned long cmd, void *v)
+{
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		writel_relaxed(0x1E0, tegra_gic_cpu_base + GIC_CPU_CTRL);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block tegra_gic_notifier_block = {
+	.notifier_call = tegra_gic_notifier,
+};
+
+static const struct of_device_id tegra114_dt_gic_match[] __initconst = {
+	{ .compatible = "arm,cortex-a15-gic" },
+	{ }
+};
+
+static void tegra114_gic_cpu_pm_registration(void)
+{
+	struct device_node *dn;
+
+	dn = of_find_matching_node(NULL, tegra114_dt_gic_match);
+	if (!dn)
+		return;
+
+	tegra_gic_cpu_base = of_iomap(dn, 1);
+
+	cpu_pm_register_notifier(&tegra_gic_notifier_block);
+}
 #else
 #define tegra_set_wake NULL
+static void tegra114_gic_cpu_pm_registration(void) { }
 #endif
 
 void __init tegra_init_irq(void)
@@ -252,4 +290,6 @@
 	if (!of_have_populated_dt())
 		gic_init(0, 29, distbase,
 			IO_ADDRESS(TEGRA_ARM_PERIF_BASE + 0x100));
+
+	tegra114_gic_cpu_pm_registration();
 }
diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
deleted file mode 100644
index 46144a1..0000000
--- a/arch/arm/mach-tegra/pcie.c
+++ /dev/null
@@ -1,886 +0,0 @@
-/*
- * arch/arm/mach-tegra/pci.c
- *
- * PCIe host controller driver for TEGRA(2) SOCs
- *
- * Copyright (c) 2010, CompuLab, Ltd.
- * Author: Mike Rapoport <mike@compulab.co.il>
- *
- * Based on NVIDIA PCIe driver
- * Copyright (c) 2008-2009, NVIDIA Corporation.
- *
- * Bits taken from arch/arm/mach-dove/pcie.c
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along
- * with this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
- */
-
-#include <linux/kernel.h>
-#include <linux/pci.h>
-#include <linux/interrupt.h>
-#include <linux/irq.h>
-#include <linux/clk.h>
-#include <linux/delay.h>
-#include <linux/export.h>
-#include <linux/clk/tegra.h>
-#include <linux/tegra-powergate.h>
-
-#include <asm/sizes.h>
-#include <asm/mach/pci.h>
-
-#include "board.h"
-#include "iomap.h"
-
-/* Hack - need to parse this from DT */
-#define INT_PCIE_INTR 130
-
-/* register definitions */
-#define AFI_OFFSET	0x3800
-#define PADS_OFFSET	0x3000
-#define RP0_OFFSET	0x0000
-#define RP1_OFFSET	0x1000
-
-#define AFI_AXI_BAR0_SZ	0x00
-#define AFI_AXI_BAR1_SZ	0x04
-#define AFI_AXI_BAR2_SZ	0x08
-#define AFI_AXI_BAR3_SZ	0x0c
-#define AFI_AXI_BAR4_SZ	0x10
-#define AFI_AXI_BAR5_SZ	0x14
-
-#define AFI_AXI_BAR0_START	0x18
-#define AFI_AXI_BAR1_START	0x1c
-#define AFI_AXI_BAR2_START	0x20
-#define AFI_AXI_BAR3_START	0x24
-#define AFI_AXI_BAR4_START	0x28
-#define AFI_AXI_BAR5_START	0x2c
-
-#define AFI_FPCI_BAR0	0x30
-#define AFI_FPCI_BAR1	0x34
-#define AFI_FPCI_BAR2	0x38
-#define AFI_FPCI_BAR3	0x3c
-#define AFI_FPCI_BAR4	0x40
-#define AFI_FPCI_BAR5	0x44
-
-#define AFI_CACHE_BAR0_SZ	0x48
-#define AFI_CACHE_BAR0_ST	0x4c
-#define AFI_CACHE_BAR1_SZ	0x50
-#define AFI_CACHE_BAR1_ST	0x54
-
-#define AFI_MSI_BAR_SZ		0x60
-#define AFI_MSI_FPCI_BAR_ST	0x64
-#define AFI_MSI_AXI_BAR_ST	0x68
-
-#define AFI_CONFIGURATION		0xac
-#define  AFI_CONFIGURATION_EN_FPCI	(1 << 0)
-
-#define AFI_FPCI_ERROR_MASKS	0xb0
-
-#define AFI_INTR_MASK		0xb4
-#define  AFI_INTR_MASK_INT_MASK	(1 << 0)
-#define  AFI_INTR_MASK_MSI_MASK	(1 << 8)
-
-#define AFI_INTR_CODE		0xb8
-#define  AFI_INTR_CODE_MASK	0xf
-#define  AFI_INTR_MASTER_ABORT	4
-#define  AFI_INTR_LEGACY	6
-
-#define AFI_INTR_SIGNATURE	0xbc
-#define AFI_SM_INTR_ENABLE	0xc4
-
-#define AFI_AFI_INTR_ENABLE		0xc8
-#define  AFI_INTR_EN_INI_SLVERR		(1 << 0)
-#define  AFI_INTR_EN_INI_DECERR		(1 << 1)
-#define  AFI_INTR_EN_TGT_SLVERR		(1 << 2)
-#define  AFI_INTR_EN_TGT_DECERR		(1 << 3)
-#define  AFI_INTR_EN_TGT_WRERR		(1 << 4)
-#define  AFI_INTR_EN_DFPCI_DECERR	(1 << 5)
-#define  AFI_INTR_EN_AXI_DECERR		(1 << 6)
-#define  AFI_INTR_EN_FPCI_TIMEOUT	(1 << 7)
-
-#define AFI_PCIE_CONFIG					0x0f8
-#define  AFI_PCIE_CONFIG_PCIEC0_DISABLE_DEVICE		(1 << 1)
-#define  AFI_PCIE_CONFIG_PCIEC1_DISABLE_DEVICE		(1 << 2)
-#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK	(0xf << 20)
-#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_SINGLE	(0x0 << 20)
-#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL	(0x1 << 20)
-
-#define AFI_FUSE			0x104
-#define  AFI_FUSE_PCIE_T0_GEN2_DIS	(1 << 2)
-
-#define AFI_PEX0_CTRL			0x110
-#define AFI_PEX1_CTRL			0x118
-#define  AFI_PEX_CTRL_RST		(1 << 0)
-#define  AFI_PEX_CTRL_REFCLK_EN		(1 << 3)
-
-#define RP_VEND_XP	0x00000F00
-#define  RP_VEND_XP_DL_UP	(1 << 30)
-
-#define RP_LINK_CONTROL_STATUS			0x00000090
-#define  RP_LINK_CONTROL_STATUS_LINKSTAT_MASK	0x3fff0000
-
-#define PADS_CTL_SEL		0x0000009C
-
-#define PADS_CTL		0x000000A0
-#define  PADS_CTL_IDDQ_1L	(1 << 0)
-#define  PADS_CTL_TX_DATA_EN_1L	(1 << 6)
-#define  PADS_CTL_RX_DATA_EN_1L	(1 << 10)
-
-#define PADS_PLL_CTL				0x000000B8
-#define  PADS_PLL_CTL_RST_B4SM			(1 << 1)
-#define  PADS_PLL_CTL_LOCKDET			(1 << 8)
-#define  PADS_PLL_CTL_REFCLK_MASK		(0x3 << 16)
-#define  PADS_PLL_CTL_REFCLK_INTERNAL_CML	(0 << 16)
-#define  PADS_PLL_CTL_REFCLK_INTERNAL_CMOS	(1 << 16)
-#define  PADS_PLL_CTL_REFCLK_EXTERNAL		(2 << 16)
-#define  PADS_PLL_CTL_TXCLKREF_MASK		(0x1 << 20)
-#define  PADS_PLL_CTL_TXCLKREF_DIV10		(0 << 20)
-#define  PADS_PLL_CTL_TXCLKREF_DIV5		(1 << 20)
-
-/* PMC access is required for PCIE xclk (un)clamping */
-#define PMC_SCRATCH42		0x144
-#define PMC_SCRATCH42_PCX_CLAMP	(1 << 0)
-
-static void __iomem *reg_pmc_base = IO_ADDRESS(TEGRA_PMC_BASE);
-
-#define pmc_writel(value, reg) \
-	__raw_writel(value, reg_pmc_base + (reg))
-#define pmc_readl(reg) \
-	__raw_readl(reg_pmc_base + (reg))
-
-/*
- * Tegra2 defines 1GB in the AXI address map for PCIe.
- *
- * That address space is split into different regions, with sizes and
- * offsets as follows:
- *
- * 0x80000000 - 0x80003fff - PCI controller registers
- * 0x80004000 - 0x80103fff - PCI configuration space
- * 0x80104000 - 0x80203fff - PCI extended configuration space
- * 0x80203fff - 0x803fffff - unused
- * 0x80400000 - 0x8040ffff - downstream IO
- * 0x80410000 - 0x8fffffff - unused
- * 0x90000000 - 0x9fffffff - non-prefetchable memory
- * 0xa0000000 - 0xbfffffff - prefetchable memory
- */
-#define PCIE_REGS_SZ		SZ_16K
-#define PCIE_CFG_OFF		PCIE_REGS_SZ
-#define PCIE_CFG_SZ		SZ_1M
-#define PCIE_EXT_CFG_OFF	(PCIE_CFG_SZ + PCIE_CFG_OFF)
-#define PCIE_EXT_CFG_SZ		SZ_1M
-#define PCIE_IOMAP_SZ		(PCIE_REGS_SZ + PCIE_CFG_SZ + PCIE_EXT_CFG_SZ)
-
-#define MEM_BASE_0		(TEGRA_PCIE_BASE + SZ_256M)
-#define MEM_SIZE_0		SZ_128M
-#define MEM_BASE_1		(MEM_BASE_0 + MEM_SIZE_0)
-#define MEM_SIZE_1		SZ_128M
-#define PREFETCH_MEM_BASE_0	(MEM_BASE_1 + MEM_SIZE_1)
-#define PREFETCH_MEM_SIZE_0	SZ_128M
-#define PREFETCH_MEM_BASE_1	(PREFETCH_MEM_BASE_0 + PREFETCH_MEM_SIZE_0)
-#define PREFETCH_MEM_SIZE_1	SZ_128M
-
-#define  PCIE_CONF_BUS(b)	((b) << 16)
-#define  PCIE_CONF_DEV(d)	((d) << 11)
-#define  PCIE_CONF_FUNC(f)	((f) << 8)
-#define  PCIE_CONF_REG(r)	\
-	(((r) & ~0x3) | (((r) < 256) ? PCIE_CFG_OFF : PCIE_EXT_CFG_OFF))
-
-struct tegra_pcie_port {
-	int			index;
-	u8			root_bus_nr;
-	void __iomem		*base;
-
-	bool			link_up;
-
-	char			mem_space_name[16];
-	char			prefetch_space_name[20];
-	struct resource		res[2];
-};
-
-struct tegra_pcie_info {
-	struct tegra_pcie_port	port[2];
-	int			num_ports;
-
-	void __iomem		*regs;
-	struct resource		res_mmio;
-
-	struct clk		*pex_clk;
-	struct clk		*afi_clk;
-	struct clk		*pcie_xclk;
-	struct clk		*pll_e;
-};
-
-static struct tegra_pcie_info tegra_pcie;
-
-static inline void afi_writel(u32 value, unsigned long offset)
-{
-	writel(value, offset + AFI_OFFSET + tegra_pcie.regs);
-}
-
-static inline u32 afi_readl(unsigned long offset)
-{
-	return readl(offset + AFI_OFFSET + tegra_pcie.regs);
-}
-
-static inline void pads_writel(u32 value, unsigned long offset)
-{
-	writel(value, offset + PADS_OFFSET + tegra_pcie.regs);
-}
-
-static inline u32 pads_readl(unsigned long offset)
-{
-	return readl(offset + PADS_OFFSET + tegra_pcie.regs);
-}
-
-static struct tegra_pcie_port *bus_to_port(int bus)
-{
-	int i;
-
-	for (i = tegra_pcie.num_ports - 1; i >= 0; i--) {
-		int rbus = tegra_pcie.port[i].root_bus_nr;
-		if (rbus != -1 && rbus == bus)
-			break;
-	}
-
-	return i >= 0 ? tegra_pcie.port + i : NULL;
-}
-
-static int tegra_pcie_read_conf(struct pci_bus *bus, unsigned int devfn,
-				int where, int size, u32 *val)
-{
-	struct tegra_pcie_port *pp = bus_to_port(bus->number);
-	void __iomem *addr;
-
-	if (pp) {
-		if (devfn != 0) {
-			*val = 0xffffffff;
-			return PCIBIOS_DEVICE_NOT_FOUND;
-		}
-
-		addr = pp->base + (where & ~0x3);
-	} else {
-		addr = tegra_pcie.regs + (PCIE_CONF_BUS(bus->number) +
-					  PCIE_CONF_DEV(PCI_SLOT(devfn)) +
-					  PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
-					  PCIE_CONF_REG(where));
-	}
-
-	*val = readl(addr);
-
-	if (size == 1)
-		*val = (*val >> (8 * (where & 3))) & 0xff;
-	else if (size == 2)
-		*val = (*val >> (8 * (where & 3))) & 0xffff;
-
-	return PCIBIOS_SUCCESSFUL;
-}
-
-static int tegra_pcie_write_conf(struct pci_bus *bus, unsigned int devfn,
-				 int where, int size, u32 val)
-{
-	struct tegra_pcie_port *pp = bus_to_port(bus->number);
-	void __iomem *addr;
-
-	u32 mask;
-	u32 tmp;
-
-	if (pp) {
-		if (devfn != 0)
-			return PCIBIOS_DEVICE_NOT_FOUND;
-
-		addr = pp->base + (where & ~0x3);
-	} else {
-		addr = tegra_pcie.regs + (PCIE_CONF_BUS(bus->number) +
-					  PCIE_CONF_DEV(PCI_SLOT(devfn)) +
-					  PCIE_CONF_FUNC(PCI_FUNC(devfn)) +
-					  PCIE_CONF_REG(where));
-	}
-
-	if (size == 4) {
-		writel(val, addr);
-		return PCIBIOS_SUCCESSFUL;
-	}
-
-	if (size == 2)
-		mask = ~(0xffff << ((where & 0x3) * 8));
-	else if (size == 1)
-		mask = ~(0xff << ((where & 0x3) * 8));
-	else
-		return PCIBIOS_BAD_REGISTER_NUMBER;
-
-	tmp = readl(addr) & mask;
-	tmp |= val << ((where & 0x3) * 8);
-	writel(tmp, addr);
-
-	return PCIBIOS_SUCCESSFUL;
-}
-
-static struct pci_ops tegra_pcie_ops = {
-	.read	= tegra_pcie_read_conf,
-	.write	= tegra_pcie_write_conf,
-};
-
-static void tegra_pcie_fixup_bridge(struct pci_dev *dev)
-{
-	u16 reg;
-
-	if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE) {
-		pci_read_config_word(dev, PCI_COMMAND, &reg);
-		reg |= (PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
-			PCI_COMMAND_MASTER | PCI_COMMAND_SERR);
-		pci_write_config_word(dev, PCI_COMMAND, reg);
-	}
-}
-DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_fixup_bridge);
-
-/* Tegra PCIE root complex wrongly reports device class */
-static void tegra_pcie_fixup_class(struct pci_dev *dev)
-{
-	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
-}
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf0, tegra_pcie_fixup_class);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf1, tegra_pcie_fixup_class);
-
-/* Tegra PCIE requires relaxed ordering */
-static void tegra_pcie_relax_enable(struct pci_dev *dev)
-{
-	pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
-}
-DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_relax_enable);
-
-static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
-{
-	struct tegra_pcie_port *pp;
-
-	if (nr >= tegra_pcie.num_ports)
-		return 0;
-
-	pp = tegra_pcie.port + nr;
-	pp->root_bus_nr = sys->busnr;
-
-	pci_ioremap_io(nr * SZ_64K, TEGRA_PCIE_IO_BASE);
-
-	/*
-	 * IORESOURCE_MEM
-	 */
-	snprintf(pp->mem_space_name, sizeof(pp->mem_space_name),
-		 "PCIe %d MEM", pp->index);
-	pp->mem_space_name[sizeof(pp->mem_space_name) - 1] = 0;
-	pp->res[0].name = pp->mem_space_name;
-	if (pp->index == 0) {
-		pp->res[0].start = MEM_BASE_0;
-		pp->res[0].end = pp->res[0].start + MEM_SIZE_0 - 1;
-	} else {
-		pp->res[0].start = MEM_BASE_1;
-		pp->res[0].end = pp->res[0].start + MEM_SIZE_1 - 1;
-	}
-	pp->res[0].flags = IORESOURCE_MEM;
-	if (request_resource(&iomem_resource, &pp->res[0]))
-		panic("Request PCIe Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[0], sys->mem_offset);
-
-	/*
-	 * IORESOURCE_MEM | IORESOURCE_PREFETCH
-	 */
-	snprintf(pp->prefetch_space_name, sizeof(pp->prefetch_space_name),
-		 "PCIe %d PREFETCH MEM", pp->index);
-	pp->prefetch_space_name[sizeof(pp->prefetch_space_name) - 1] = 0;
-	pp->res[1].name = pp->prefetch_space_name;
-	if (pp->index == 0) {
-		pp->res[1].start = PREFETCH_MEM_BASE_0;
-		pp->res[1].end = pp->res[1].start + PREFETCH_MEM_SIZE_0 - 1;
-	} else {
-		pp->res[1].start = PREFETCH_MEM_BASE_1;
-		pp->res[1].end = pp->res[1].start + PREFETCH_MEM_SIZE_1 - 1;
-	}
-	pp->res[1].flags = IORESOURCE_MEM | IORESOURCE_PREFETCH;
-	if (request_resource(&iomem_resource, &pp->res[1]))
-		panic("Request PCIe Prefetch Memory resource failed\n");
-	pci_add_resource_offset(&sys->resources, &pp->res[1], sys->mem_offset);
-
-	return 1;
-}
-
-static int tegra_pcie_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
-{
-	return INT_PCIE_INTR;
-}
-
-static struct pci_bus __init *tegra_pcie_scan_bus(int nr,
-						  struct pci_sys_data *sys)
-{
-	struct tegra_pcie_port *pp;
-
-	if (nr >= tegra_pcie.num_ports)
-		return NULL;
-
-	pp = tegra_pcie.port + nr;
-	pp->root_bus_nr = sys->busnr;
-
-	return pci_scan_root_bus(NULL, sys->busnr, &tegra_pcie_ops, sys,
-				 &sys->resources);
-}
-
-static struct hw_pci tegra_pcie_hw __initdata = {
-	.nr_controllers	= 2,
-	.setup		= tegra_pcie_setup,
-	.scan		= tegra_pcie_scan_bus,
-	.map_irq	= tegra_pcie_map_irq,
-};
-
-
-static irqreturn_t tegra_pcie_isr(int irq, void *arg)
-{
-	const char *err_msg[] = {
-		"Unknown",
-		"AXI slave error",
-		"AXI decode error",
-		"Target abort",
-		"Master abort",
-		"Invalid write",
-		"Response decoding error",
-		"AXI response decoding error",
-		"Transcation timeout",
-	};
-
-	u32 code, signature;
-
-	code = afi_readl(AFI_INTR_CODE) & AFI_INTR_CODE_MASK;
-	signature = afi_readl(AFI_INTR_SIGNATURE);
-	afi_writel(0, AFI_INTR_CODE);
-
-	if (code == AFI_INTR_LEGACY)
-		return IRQ_NONE;
-
-	if (code >= ARRAY_SIZE(err_msg))
-		code = 0;
-
-	/*
-	 * do not pollute kernel log with master abort reports since they
-	 * happen a lot during enumeration
-	 */
-	if (code == AFI_INTR_MASTER_ABORT)
-		pr_debug("PCIE: %s, signature: %08x\n", err_msg[code], signature);
-	else
-		pr_err("PCIE: %s, signature: %08x\n", err_msg[code], signature);
-
-	return IRQ_HANDLED;
-}
-
-static void tegra_pcie_setup_translations(void)
-{
-	u32 fpci_bar;
-	u32 size;
-	u32 axi_address;
-
-	/* Bar 0: config Bar */
-	fpci_bar = ((u32)0xfdff << 16);
-	size = PCIE_CFG_SZ;
-	axi_address = TEGRA_PCIE_BASE + PCIE_CFG_OFF;
-	afi_writel(axi_address, AFI_AXI_BAR0_START);
-	afi_writel(size >> 12, AFI_AXI_BAR0_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR0);
-
-	/* Bar 1: extended config Bar */
-	fpci_bar = ((u32)0xfe1 << 20);
-	size = PCIE_EXT_CFG_SZ;
-	axi_address = TEGRA_PCIE_BASE + PCIE_EXT_CFG_OFF;
-	afi_writel(axi_address, AFI_AXI_BAR1_START);
-	afi_writel(size >> 12, AFI_AXI_BAR1_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR1);
-
-	/* Bar 2: downstream IO bar */
-	fpci_bar = ((__u32)0xfdfc << 16);
-	size = SZ_128K;
-	axi_address = TEGRA_PCIE_IO_BASE;
-	afi_writel(axi_address, AFI_AXI_BAR2_START);
-	afi_writel(size >> 12, AFI_AXI_BAR2_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR2);
-
-	/* Bar 3: prefetchable memory BAR */
-	fpci_bar = (((PREFETCH_MEM_BASE_0 >> 12) & 0x0fffffff) << 4) | 0x1;
-	size =  PREFETCH_MEM_SIZE_0 +  PREFETCH_MEM_SIZE_1;
-	axi_address = PREFETCH_MEM_BASE_0;
-	afi_writel(axi_address, AFI_AXI_BAR3_START);
-	afi_writel(size >> 12, AFI_AXI_BAR3_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR3);
-
-	/* Bar 4: non prefetchable memory BAR */
-	fpci_bar = (((MEM_BASE_0 >> 12)	& 0x0FFFFFFF) << 4) | 0x1;
-	size = MEM_SIZE_0 + MEM_SIZE_1;
-	axi_address = MEM_BASE_0;
-	afi_writel(axi_address, AFI_AXI_BAR4_START);
-	afi_writel(size >> 12, AFI_AXI_BAR4_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR4);
-
-	/* Bar 5: NULL out the remaining BAR as it is not used */
-	fpci_bar = 0;
-	size = 0;
-	axi_address = 0;
-	afi_writel(axi_address, AFI_AXI_BAR5_START);
-	afi_writel(size >> 12, AFI_AXI_BAR5_SZ);
-	afi_writel(fpci_bar, AFI_FPCI_BAR5);
-
-	/* map all upstream transactions as uncached */
-	afi_writel(PHYS_OFFSET, AFI_CACHE_BAR0_ST);
-	afi_writel(0, AFI_CACHE_BAR0_SZ);
-	afi_writel(0, AFI_CACHE_BAR1_ST);
-	afi_writel(0, AFI_CACHE_BAR1_SZ);
-
-	/* No MSI */
-	afi_writel(0, AFI_MSI_FPCI_BAR_ST);
-	afi_writel(0, AFI_MSI_BAR_SZ);
-	afi_writel(0, AFI_MSI_AXI_BAR_ST);
-	afi_writel(0, AFI_MSI_BAR_SZ);
-}
-
-static int tegra_pcie_enable_controller(void)
-{
-	u32 val, reg;
-	int i, timeout;
-
-	/* Enable slot clock and pulse the reset signals */
-	for (i = 0, reg = AFI_PEX0_CTRL; i < 2; i++, reg += 0x8) {
-		val = afi_readl(reg) |  AFI_PEX_CTRL_REFCLK_EN;
-		afi_writel(val, reg);
-		val &= ~AFI_PEX_CTRL_RST;
-		afi_writel(val, reg);
-
-		val = afi_readl(reg) | AFI_PEX_CTRL_RST;
-		afi_writel(val, reg);
-	}
-
-	/* Enable dual controller and both ports */
-	val = afi_readl(AFI_PCIE_CONFIG);
-	val &= ~(AFI_PCIE_CONFIG_PCIEC0_DISABLE_DEVICE |
-		 AFI_PCIE_CONFIG_PCIEC1_DISABLE_DEVICE |
-		 AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK);
-	val |= AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL;
-	afi_writel(val, AFI_PCIE_CONFIG);
-
-	val = afi_readl(AFI_FUSE) & ~AFI_FUSE_PCIE_T0_GEN2_DIS;
-	afi_writel(val, AFI_FUSE);
-
-	/* Initialze internal PHY, enable up to 16 PCIE lanes */
-	pads_writel(0x0, PADS_CTL_SEL);
-
-	/* override IDDQ to 1 on all 4 lanes */
-	val = pads_readl(PADS_CTL) | PADS_CTL_IDDQ_1L;
-	pads_writel(val, PADS_CTL);
-
-	/*
-	 * set up PHY PLL inputs select PLLE output as refclock,
-	 * set TX ref sel to div10 (not div5)
-	 */
-	val = pads_readl(PADS_PLL_CTL);
-	val &= ~(PADS_PLL_CTL_REFCLK_MASK | PADS_PLL_CTL_TXCLKREF_MASK);
-	val |= (PADS_PLL_CTL_REFCLK_INTERNAL_CML | PADS_PLL_CTL_TXCLKREF_DIV10);
-	pads_writel(val, PADS_PLL_CTL);
-
-	/* take PLL out of reset  */
-	val = pads_readl(PADS_PLL_CTL) | PADS_PLL_CTL_RST_B4SM;
-	pads_writel(val, PADS_PLL_CTL);
-
-	/*
-	 * Hack, set the clock voltage to the DEFAULT provided by hw folks.
-	 * This doesn't exist in the documentation
-	 */
-	pads_writel(0xfa5cfa5c, 0xc8);
-
-	/* Wait for the PLL to lock */
-	timeout = 300;
-	do {
-		val = pads_readl(PADS_PLL_CTL);
-		usleep_range(1000, 1000);
-		if (--timeout == 0) {
-			pr_err("Tegra PCIe error: timeout waiting for PLL\n");
-			return -EBUSY;
-		}
-	} while (!(val & PADS_PLL_CTL_LOCKDET));
-
-	/* turn off IDDQ override */
-	val = pads_readl(PADS_CTL) & ~PADS_CTL_IDDQ_1L;
-	pads_writel(val, PADS_CTL);
-
-	/* enable TX/RX data */
-	val = pads_readl(PADS_CTL);
-	val |= (PADS_CTL_TX_DATA_EN_1L | PADS_CTL_RX_DATA_EN_1L);
-	pads_writel(val, PADS_CTL);
-
-	/* Take the PCIe interface module out of reset */
-	tegra_periph_reset_deassert(tegra_pcie.pcie_xclk);
-
-	/* Finally enable PCIe */
-	val = afi_readl(AFI_CONFIGURATION) | AFI_CONFIGURATION_EN_FPCI;
-	afi_writel(val, AFI_CONFIGURATION);
-
-	val = (AFI_INTR_EN_INI_SLVERR | AFI_INTR_EN_INI_DECERR |
-	       AFI_INTR_EN_TGT_SLVERR | AFI_INTR_EN_TGT_DECERR |
-	       AFI_INTR_EN_TGT_WRERR | AFI_INTR_EN_DFPCI_DECERR);
-	afi_writel(val, AFI_AFI_INTR_ENABLE);
-	afi_writel(0xffffffff, AFI_SM_INTR_ENABLE);
-
-	/* FIXME: No MSI for now, only INT */
-	afi_writel(AFI_INTR_MASK_INT_MASK, AFI_INTR_MASK);
-
-	/* Disable all execptions */
-	afi_writel(0, AFI_FPCI_ERROR_MASKS);
-
-	return 0;
-}
-
-static void tegra_pcie_xclk_clamp(bool clamp)
-{
-	u32 reg;
-
-	reg = pmc_readl(PMC_SCRATCH42) & ~PMC_SCRATCH42_PCX_CLAMP;
-
-	if (clamp)
-		reg |= PMC_SCRATCH42_PCX_CLAMP;
-
-	pmc_writel(reg, PMC_SCRATCH42);
-}
-
-static void tegra_pcie_power_off(void)
-{
-	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
-	tegra_periph_reset_assert(tegra_pcie.afi_clk);
-	tegra_periph_reset_assert(tegra_pcie.pex_clk);
-
-	tegra_powergate_power_off(TEGRA_POWERGATE_PCIE);
-	tegra_pcie_xclk_clamp(true);
-}
-
-static int tegra_pcie_power_regate(void)
-{
-	int err;
-
-	tegra_pcie_power_off();
-
-	tegra_pcie_xclk_clamp(true);
-
-	tegra_periph_reset_assert(tegra_pcie.pcie_xclk);
-	tegra_periph_reset_assert(tegra_pcie.afi_clk);
-
-	err = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_PCIE,
-						tegra_pcie.pex_clk);
-	if (err) {
-		pr_err("PCIE: powerup sequence failed: %d\n", err);
-		return err;
-	}
-
-	tegra_periph_reset_deassert(tegra_pcie.afi_clk);
-
-	tegra_pcie_xclk_clamp(false);
-
-	clk_prepare_enable(tegra_pcie.afi_clk);
-	clk_prepare_enable(tegra_pcie.pex_clk);
-	return clk_prepare_enable(tegra_pcie.pll_e);
-}
-
-static int tegra_pcie_clocks_get(void)
-{
-	int err;
-
-	tegra_pcie.pex_clk = clk_get(NULL, "pex");
-	if (IS_ERR(tegra_pcie.pex_clk))
-		return PTR_ERR(tegra_pcie.pex_clk);
-
-	tegra_pcie.afi_clk = clk_get(NULL, "afi");
-	if (IS_ERR(tegra_pcie.afi_clk)) {
-		err = PTR_ERR(tegra_pcie.afi_clk);
-		goto err_afi_clk;
-	}
-
-	tegra_pcie.pcie_xclk = clk_get(NULL, "pcie_xclk");
-	if (IS_ERR(tegra_pcie.pcie_xclk)) {
-		err =  PTR_ERR(tegra_pcie.pcie_xclk);
-		goto err_pcie_xclk;
-	}
-
-	tegra_pcie.pll_e = clk_get_sys(NULL, "pll_e");
-	if (IS_ERR(tegra_pcie.pll_e)) {
-		err = PTR_ERR(tegra_pcie.pll_e);
-		goto err_pll_e;
-	}
-
-	return 0;
-
-err_pll_e:
-	clk_put(tegra_pcie.pcie_xclk);
-err_pcie_xclk:
-	clk_put(tegra_pcie.afi_clk);
-err_afi_clk:
-	clk_put(tegra_pcie.pex_clk);
-
-	return err;
-}
-
-static void tegra_pcie_clocks_put(void)
-{
-	clk_put(tegra_pcie.pll_e);
-	clk_put(tegra_pcie.pcie_xclk);
-	clk_put(tegra_pcie.afi_clk);
-	clk_put(tegra_pcie.pex_clk);
-}
-
-static int __init tegra_pcie_get_resources(void)
-{
-	int err;
-
-	err = tegra_pcie_clocks_get();
-	if (err) {
-		pr_err("PCIE: failed to get clocks: %d\n", err);
-		return err;
-	}
-
-	err = tegra_pcie_power_regate();
-	if (err) {
-		pr_err("PCIE: failed to power up: %d\n", err);
-		goto err_pwr_on;
-	}
-
-	tegra_pcie.regs = ioremap_nocache(TEGRA_PCIE_BASE, PCIE_IOMAP_SZ);
-	if (tegra_pcie.regs == NULL) {
-		pr_err("PCIE: Failed to map PCI/AFI registers\n");
-		err = -ENOMEM;
-		goto err_map_reg;
-	}
-
-	err = request_irq(INT_PCIE_INTR, tegra_pcie_isr,
-			  IRQF_SHARED, "PCIE", &tegra_pcie);
-	if (err) {
-		pr_err("PCIE: Failed to register IRQ: %d\n", err);
-		goto err_req_io;
-	}
-	set_irq_flags(INT_PCIE_INTR, IRQF_VALID);
-
-	return 0;
-
-err_req_io:
-	iounmap(tegra_pcie.regs);
-err_map_reg:
-	tegra_pcie_power_off();
-err_pwr_on:
-	tegra_pcie_clocks_put();
-
-	return err;
-}
-
-/*
- * FIXME: If there are no PCIe cards attached, then calling this function
- * can result in the increase of the bootup time as there are big timeout
- * loops.
- */
-#define TEGRA_PCIE_LINKUP_TIMEOUT	200	/* up to 1.2 seconds */
-static bool tegra_pcie_check_link(struct tegra_pcie_port *pp, int idx,
-				  u32 reset_reg)
-{
-	u32 reg;
-	int retries = 3;
-	int timeout;
-
-	do {
-		timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
-		while (timeout) {
-			reg = readl(pp->base + RP_VEND_XP);
-
-			if (reg & RP_VEND_XP_DL_UP)
-				break;
-
-			mdelay(1);
-			timeout--;
-		}
-
-		if (!timeout)  {
-			pr_err("PCIE: port %d: link down, retrying\n", idx);
-			goto retry;
-		}
-
-		timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
-		while (timeout) {
-			reg = readl(pp->base + RP_LINK_CONTROL_STATUS);
-
-			if (reg & 0x20000000)
-				return true;
-
-			mdelay(1);
-			timeout--;
-		}
-
-retry:
-		/* Pulse the PEX reset */
-		reg = afi_readl(reset_reg) | AFI_PEX_CTRL_RST;
-		afi_writel(reg, reset_reg);
-		mdelay(1);
-		reg = afi_readl(reset_reg) & ~AFI_PEX_CTRL_RST;
-		afi_writel(reg, reset_reg);
-
-		retries--;
-	} while (retries);
-
-	return false;
-}
-
-static void __init tegra_pcie_add_port(int index, u32 offset, u32 reset_reg)
-{
-	struct tegra_pcie_port *pp;
-
-	pp = tegra_pcie.port + tegra_pcie.num_ports;
-
-	pp->index = -1;
-	pp->base = tegra_pcie.regs + offset;
-	pp->link_up = tegra_pcie_check_link(pp, index, reset_reg);
-
-	if (!pp->link_up) {
-		pp->base = NULL;
-		printk(KERN_INFO "PCIE: port %d: link down, ignoring\n", index);
-		return;
-	}
-
-	tegra_pcie.num_ports++;
-	pp->index = index;
-	pp->root_bus_nr = -1;
-	memset(pp->res, 0, sizeof(pp->res));
-}
-
-int __init tegra_pcie_init(bool init_port0, bool init_port1)
-{
-	int err;
-
-	if (!(init_port0 || init_port1))
-		return -ENODEV;
-
-	pcibios_min_mem = 0;
-
-	err = tegra_pcie_get_resources();
-	if (err)
-		return err;
-
-	err = tegra_pcie_enable_controller();
-	if (err)
-		return err;
-
-	/* setup the AFI address translations */
-	tegra_pcie_setup_translations();
-
-	if (init_port0)
-		tegra_pcie_add_port(0, RP0_OFFSET, AFI_PEX0_CTRL);
-
-	if (init_port1)
-		tegra_pcie_add_port(1, RP1_OFFSET, AFI_PEX1_CTRL);
-
-	pci_common_init(&tegra_pcie_hw);
-
-	return 0;
-}
diff --git a/arch/arm/mach-tegra/platsmp.c b/arch/arm/mach-tegra/platsmp.c
index 97b33a2..2d02036 100644
--- a/arch/arm/mach-tegra/platsmp.c
+++ b/arch/arm/mach-tegra/platsmp.c
@@ -196,6 +196,5 @@
 #ifdef CONFIG_HOTPLUG_CPU
 	.cpu_kill		= tegra_cpu_kill,
 	.cpu_die		= tegra_cpu_die,
-	.cpu_disable		= tegra_cpu_disable,
 #endif
 };
diff --git a/arch/arm/mach-tegra/pm-tegra20.c b/arch/arm/mach-tegra/pm-tegra20.c
new file mode 100644
index 0000000..d65e1d7
--- /dev/null
+++ b/arch/arm/mach-tegra/pm-tegra20.c
@@ -0,0 +1,34 @@
+/*
+ * Copyright (c) 2013, NVIDIA Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/kernel.h>
+
+#include "pm.h"
+
+#ifdef CONFIG_PM_SLEEP
+extern u32 tegra20_iram_start, tegra20_iram_end;
+extern void tegra20_sleep_core_finish(unsigned long);
+
+void tegra20_lp1_iram_hook(void)
+{
+	tegra_lp1_iram.start_addr = &tegra20_iram_start;
+	tegra_lp1_iram.end_addr = &tegra20_iram_end;
+}
+
+void tegra20_sleep_core_init(void)
+{
+	tegra_sleep_core_finish = tegra20_sleep_core_finish;
+}
+#endif
diff --git a/arch/arm/mach-tegra/pm-tegra30.c b/arch/arm/mach-tegra/pm-tegra30.c
new file mode 100644
index 0000000..8fa326d
--- /dev/null
+++ b/arch/arm/mach-tegra/pm-tegra30.c
@@ -0,0 +1,34 @@
+/*
+ * Copyright (c) 2013, NVIDIA Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/kernel.h>
+
+#include "pm.h"
+
+#ifdef CONFIG_PM_SLEEP
+extern u32 tegra30_iram_start, tegra30_iram_end;
+extern void tegra30_sleep_core_finish(unsigned long);
+
+void tegra30_lp1_iram_hook(void)
+{
+	tegra_lp1_iram.start_addr = &tegra30_iram_start;
+	tegra_lp1_iram.end_addr = &tegra30_iram_end;
+}
+
+void tegra30_sleep_core_init(void)
+{
+	tegra_sleep_core_finish = tegra30_sleep_core_finish;
+}
+#endif
diff --git a/arch/arm/mach-tegra/pm.c b/arch/arm/mach-tegra/pm.c
index 261fec1..ed294a0 100644
--- a/arch/arm/mach-tegra/pm.c
+++ b/arch/arm/mach-tegra/pm.c
@@ -37,12 +37,18 @@
 #include "reset.h"
 #include "flowctrl.h"
 #include "fuse.h"
+#include "pm.h"
 #include "pmc.h"
 #include "sleep.h"
 
 #ifdef CONFIG_PM_SLEEP
 static DEFINE_SPINLOCK(tegra_lp2_lock);
+static u32 iram_save_size;
+static void *iram_save_addr;
+struct tegra_lp1_iram tegra_lp1_iram;
 void (*tegra_tear_down_cpu)(void);
+void (*tegra_sleep_core_finish)(unsigned long v2p);
+static int (*tegra_sleep_func)(unsigned long v2p);
 
 static void tegra_tear_down_cpu_init(void)
 {
@@ -52,7 +58,9 @@
 			tegra_tear_down_cpu = tegra20_tear_down_cpu;
 		break;
 	case TEGRA30:
-		if (IS_ENABLED(CONFIG_ARCH_TEGRA_3x_SOC))
+	case TEGRA114:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_3x_SOC) ||
+		    IS_ENABLED(CONFIG_ARCH_TEGRA_114_SOC))
 			tegra_tear_down_cpu = tegra30_tear_down_cpu;
 		break;
 	}
@@ -171,19 +179,109 @@
 enum tegra_suspend_mode tegra_pm_validate_suspend_mode(
 				enum tegra_suspend_mode mode)
 {
-	/* Tegra114 didn't support any suspending mode yet. */
-	if (tegra_chip_id == TEGRA114)
-		return TEGRA_SUSPEND_NONE;
-
 	/*
-	 * The Tegra devices only support suspending to LP2 currently.
+	 * The Tegra devices support suspending to LP1 or lower currently.
 	 */
-	if (mode > TEGRA_SUSPEND_LP2)
-		return TEGRA_SUSPEND_LP2;
+	if (mode > TEGRA_SUSPEND_LP1)
+		return TEGRA_SUSPEND_LP1;
 
 	return mode;
 }
 
+static int tegra_sleep_core(unsigned long v2p)
+{
+	setup_mm_for_reboot();
+	tegra_sleep_core_finish(v2p);
+
+	/* should never here */
+	BUG();
+
+	return 0;
+}
+
+/*
+ * tegra_lp1_iram_hook
+ *
+ * Hooking the address of LP1 reset vector and SDRAM self-refresh code in
+ * SDRAM. These codes not be copied to IRAM in this fuction. We need to
+ * copy these code to IRAM before LP0/LP1 suspend and restore the content
+ * of IRAM after resume.
+ */
+static bool tegra_lp1_iram_hook(void)
+{
+	switch (tegra_chip_id) {
+	case TEGRA20:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_2x_SOC))
+			tegra20_lp1_iram_hook();
+		break;
+	case TEGRA30:
+	case TEGRA114:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_3x_SOC) ||
+		    IS_ENABLED(CONFIG_ARCH_TEGRA_114_SOC))
+			tegra30_lp1_iram_hook();
+		break;
+	default:
+		break;
+	}
+
+	if (!tegra_lp1_iram.start_addr || !tegra_lp1_iram.end_addr)
+		return false;
+
+	iram_save_size = tegra_lp1_iram.end_addr - tegra_lp1_iram.start_addr;
+	iram_save_addr = kmalloc(iram_save_size, GFP_KERNEL);
+	if (!iram_save_addr)
+		return false;
+
+	return true;
+}
+
+static bool tegra_sleep_core_init(void)
+{
+	switch (tegra_chip_id) {
+	case TEGRA20:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_2x_SOC))
+			tegra20_sleep_core_init();
+		break;
+	case TEGRA30:
+	case TEGRA114:
+		if (IS_ENABLED(CONFIG_ARCH_TEGRA_3x_SOC) ||
+		    IS_ENABLED(CONFIG_ARCH_TEGRA_114_SOC))
+			tegra30_sleep_core_init();
+		break;
+	default:
+		break;
+	}
+
+	if (!tegra_sleep_core_finish)
+		return false;
+
+	return true;
+}
+
+static void tegra_suspend_enter_lp1(void)
+{
+	tegra_pmc_suspend();
+
+	/* copy the reset vector & SDRAM shutdown code into IRAM */
+	memcpy(iram_save_addr, IO_ADDRESS(TEGRA_IRAM_CODE_AREA),
+		iram_save_size);
+	memcpy(IO_ADDRESS(TEGRA_IRAM_CODE_AREA), tegra_lp1_iram.start_addr,
+		iram_save_size);
+
+	*((u32 *)tegra_cpu_lp1_mask) = 1;
+}
+
+static void tegra_suspend_exit_lp1(void)
+{
+	tegra_pmc_resume();
+
+	/* restore IRAM */
+	memcpy(IO_ADDRESS(TEGRA_IRAM_CODE_AREA), iram_save_addr,
+		iram_save_size);
+
+	*(u32 *)tegra_cpu_lp1_mask = 0;
+}
+
 static const char *lp_state[TEGRA_MAX_SUSPEND_MODE] = {
 	[TEGRA_SUSPEND_NONE] = "none",
 	[TEGRA_SUSPEND_LP2] = "LP2",
@@ -207,6 +305,9 @@
 
 	suspend_cpu_complex();
 	switch (mode) {
+	case TEGRA_SUSPEND_LP1:
+		tegra_suspend_enter_lp1();
+		break;
 	case TEGRA_SUSPEND_LP2:
 		tegra_set_cpu_in_lp2();
 		break;
@@ -214,9 +315,12 @@
 		break;
 	}
 
-	cpu_suspend(PHYS_OFFSET - PAGE_OFFSET, &tegra_sleep_cpu);
+	cpu_suspend(PHYS_OFFSET - PAGE_OFFSET, tegra_sleep_func);
 
 	switch (mode) {
+	case TEGRA_SUSPEND_LP1:
+		tegra_suspend_exit_lp1();
+		break;
 	case TEGRA_SUSPEND_LP2:
 		tegra_clear_cpu_in_lp2();
 		break;
@@ -237,12 +341,36 @@
 
 void __init tegra_init_suspend(void)
 {
-	if (tegra_pmc_get_suspend_mode() == TEGRA_SUSPEND_NONE)
+	enum tegra_suspend_mode mode = tegra_pmc_get_suspend_mode();
+
+	if (mode == TEGRA_SUSPEND_NONE)
 		return;
 
 	tegra_tear_down_cpu_init();
 	tegra_pmc_suspend_init();
 
+	if (mode >= TEGRA_SUSPEND_LP1) {
+		if (!tegra_lp1_iram_hook() || !tegra_sleep_core_init()) {
+			pr_err("%s: unable to allocate memory for SDRAM"
+			       "self-refresh -- LP0/LP1 unavailable\n",
+			       __func__);
+			tegra_pmc_set_suspend_mode(TEGRA_SUSPEND_LP2);
+			mode = TEGRA_SUSPEND_LP2;
+		}
+	}
+
+	/* set up sleep function for cpu_suspend */
+	switch (mode) {
+	case TEGRA_SUSPEND_LP1:
+		tegra_sleep_func = tegra_sleep_core;
+		break;
+	case TEGRA_SUSPEND_LP2:
+		tegra_sleep_func = tegra_sleep_cpu;
+		break;
+	default:
+		break;
+	}
+
 	suspend_set_ops(&tegra_suspend_ops);
 }
 #endif
diff --git a/arch/arm/mach-tegra/pm.h b/arch/arm/mach-tegra/pm.h
index 94c4b9d..fe204e5 100644
--- a/arch/arm/mach-tegra/pm.h
+++ b/arch/arm/mach-tegra/pm.h
@@ -23,6 +23,18 @@
 
 #include "pmc.h"
 
+struct tegra_lp1_iram {
+	void	*start_addr;
+	void	*end_addr;
+};
+extern struct tegra_lp1_iram tegra_lp1_iram;
+extern void (*tegra_sleep_core_finish)(unsigned long v2p);
+
+void tegra20_lp1_iram_hook(void);
+void tegra20_sleep_core_init(void);
+void tegra30_lp1_iram_hook(void);
+void tegra30_sleep_core_init(void);
+
 extern unsigned long l2x0_saved_regs_addr;
 
 void save_cpu_arch_register(void);
diff --git a/arch/arm/mach-tegra/pmc.c b/arch/arm/mach-tegra/pmc.c
index eb3fa4a..8acb881 100644
--- a/arch/arm/mach-tegra/pmc.c
+++ b/arch/arm/mach-tegra/pmc.c
@@ -21,11 +21,14 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 
+#include "flowctrl.h"
 #include "fuse.h"
 #include "pm.h"
 #include "pmc.h"
 #include "sleep.h"
 
+#define TEGRA_POWER_SYSCLK_POLARITY	(1 << 10)  /* sys clk polarity */
+#define TEGRA_POWER_SYSCLK_OE		(1 << 11)  /* system clock enable */
 #define TEGRA_POWER_EFFECT_LP0		(1 << 14)  /* LP0 when CPU pwr gated */
 #define TEGRA_POWER_CPU_PWRREQ_POLARITY	(1 << 15)  /* CPU pwr req polarity */
 #define TEGRA_POWER_CPU_PWRREQ_OE	(1 << 16)  /* CPU pwr req enable */
@@ -193,16 +196,50 @@
 	return pmc_pm_data.suspend_mode;
 }
 
+void tegra_pmc_set_suspend_mode(enum tegra_suspend_mode mode)
+{
+	if (mode < TEGRA_SUSPEND_NONE || mode >= TEGRA_MAX_SUSPEND_MODE)
+		return;
+
+	pmc_pm_data.suspend_mode = mode;
+}
+
+void tegra_pmc_suspend(void)
+{
+	tegra_pmc_writel(virt_to_phys(tegra_resume), PMC_SCRATCH41);
+}
+
+void tegra_pmc_resume(void)
+{
+	tegra_pmc_writel(0x0, PMC_SCRATCH41);
+}
+
 void tegra_pmc_pm_set(enum tegra_suspend_mode mode)
 {
-	u32 reg;
+	u32 reg, csr_reg;
 	unsigned long rate = 0;
 
 	reg = tegra_pmc_readl(PMC_CTRL);
 	reg |= TEGRA_POWER_CPU_PWRREQ_OE;
 	reg &= ~TEGRA_POWER_EFFECT_LP0;
 
+	switch (tegra_chip_id) {
+	case TEGRA20:
+	case TEGRA30:
+		break;
+	default:
+		/* Turn off CRAIL */
+		csr_reg = flowctrl_read_cpu_csr(0);
+		csr_reg &= ~FLOW_CTRL_CSR_ENABLE_EXT_MASK;
+		csr_reg |= FLOW_CTRL_CSR_ENABLE_EXT_CRAIL;
+		flowctrl_write_cpu_csr(0, csr_reg);
+		break;
+	}
+
 	switch (mode) {
+	case TEGRA_SUSPEND_LP1:
+		rate = 32768;
+		break;
 	case TEGRA_SUSPEND_LP2:
 		rate = clk_get_rate(tegra_pclk);
 		break;
@@ -224,6 +261,20 @@
 	reg = tegra_pmc_readl(PMC_CTRL);
 	reg |= TEGRA_POWER_CPU_PWRREQ_OE;
 	tegra_pmc_writel(reg, PMC_CTRL);
+
+	reg = tegra_pmc_readl(PMC_CTRL);
+
+	if (!pmc_pm_data.sysclkreq_high)
+		reg |= TEGRA_POWER_SYSCLK_POLARITY;
+	else
+		reg &= ~TEGRA_POWER_SYSCLK_POLARITY;
+
+	/* configure the output polarity while the request is tristated */
+	tegra_pmc_writel(reg, PMC_CTRL);
+
+	/* now enable the request */
+	reg |= TEGRA_POWER_SYSCLK_OE;
+	tegra_pmc_writel(reg, PMC_CTRL);
 }
 #endif
 
diff --git a/arch/arm/mach-tegra/pmc.h b/arch/arm/mach-tegra/pmc.h
index e1c2df2..549f8c7 100644
--- a/arch/arm/mach-tegra/pmc.h
+++ b/arch/arm/mach-tegra/pmc.h
@@ -28,6 +28,9 @@
 
 #ifdef CONFIG_PM_SLEEP
 enum tegra_suspend_mode tegra_pmc_get_suspend_mode(void);
+void tegra_pmc_set_suspend_mode(enum tegra_suspend_mode mode);
+void tegra_pmc_suspend(void);
+void tegra_pmc_resume(void);
 void tegra_pmc_pm_set(enum tegra_suspend_mode mode);
 void tegra_pmc_suspend_init(void);
 #endif
diff --git a/arch/arm/mach-tegra/reset-handler.S b/arch/arm/mach-tegra/reset-handler.S
index 39dc9e7..f527b2c 100644
--- a/arch/arm/mach-tegra/reset-handler.S
+++ b/arch/arm/mach-tegra/reset-handler.S
@@ -40,9 +40,12 @@
  *	  re-enabling sdram.
  *
  *	r6: SoC ID
+ *	r8: CPU part number
  */
 ENTRY(tegra_resume)
-	bl	v7_invalidate_l1
+	check_cpu_part_num 0xc09, r8, r9
+	bleq	v7_invalidate_l1
+	blne	tegra_init_l2_for_a15
 
 	cpu_id	r0
 	tegra_get_soc_id TEGRA_APB_MISC_BASE, r6
@@ -70,7 +73,8 @@
 	str	r1, [r2]
 1:
 
-	check_cpu_part_num 0xc09, r8, r9
+	mov32	r9, 0xc09
+	cmp	r8, r9
 	bne	not_ca9
 #ifdef CONFIG_HAVE_ARM_SCU
 	/* enable SCU */
@@ -178,6 +182,19 @@
 1:
 #endif
 
+	/* Waking up from LP1? */
+	ldr	r8, [r12, #RESET_DATA(MASK_LP1)]
+	tst	r8, r11				@ if in_lp1
+	beq	__is_not_lp1
+	cmp	r10, #0
+	bne	__die				@ only CPU0 can be here
+	ldr	lr, [r12, #RESET_DATA(STARTUP_LP1)]
+	cmp	lr, #0
+	bleq	__die				@ no LP1 startup handler
+ THUMB(	add	lr, lr, #1 )			@ switch to Thumb mode
+	bx	lr
+__is_not_lp1:
+
 	/* Waking up from LP2? */
 	ldr	r9, [r12, #RESET_DATA(MASK_LP2)]
 	tst	r9, r11				@ if in_lp2
diff --git a/arch/arm/mach-tegra/reset.c b/arch/arm/mach-tegra/reset.c
index 1ac434e..fd0bbf8 100644
--- a/arch/arm/mach-tegra/reset.c
+++ b/arch/arm/mach-tegra/reset.c
@@ -81,6 +81,8 @@
 #endif
 
 #ifdef CONFIG_PM_SLEEP
+	__tegra_cpu_reset_handler_data[TEGRA_RESET_STARTUP_LP1] =
+		TEGRA_IRAM_CODE_AREA;
 	__tegra_cpu_reset_handler_data[TEGRA_RESET_STARTUP_LP2] =
 		virt_to_phys((void *)tegra_resume);
 #endif
diff --git a/arch/arm/mach-tegra/reset.h b/arch/arm/mach-tegra/reset.h
index c90d8e9..76a9343 100644
--- a/arch/arm/mach-tegra/reset.h
+++ b/arch/arm/mach-tegra/reset.h
@@ -39,6 +39,10 @@
 void tegra_secondary_startup(void);
 
 #ifdef CONFIG_PM_SLEEP
+#define tegra_cpu_lp1_mask \
+	(IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \
+	((u32)&__tegra_cpu_reset_handler_data[TEGRA_RESET_MASK_LP1] - \
+	 (u32)__tegra_cpu_reset_handler_start)))
 #define tegra_cpu_lp2_mask \
 	(IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \
 	((u32)&__tegra_cpu_reset_handler_data[TEGRA_RESET_MASK_LP2] - \
diff --git a/arch/arm/mach-tegra/sleep-tegra20.S b/arch/arm/mach-tegra/sleep-tegra20.S
index e3f2417..5c3bd11 100644
--- a/arch/arm/mach-tegra/sleep-tegra20.S
+++ b/arch/arm/mach-tegra/sleep-tegra20.S
@@ -23,10 +23,49 @@
 #include <asm/assembler.h>
 #include <asm/proc-fns.h>
 #include <asm/cp15.h>
+#include <asm/cache.h>
 
 #include "sleep.h"
 #include "flowctrl.h"
 
+#define EMC_CFG				0xc
+#define EMC_ADR_CFG			0x10
+#define EMC_REFRESH			0x70
+#define EMC_NOP				0xdc
+#define EMC_SELF_REF			0xe0
+#define EMC_REQ_CTRL			0x2b0
+#define EMC_EMC_STATUS			0x2b4
+
+#define CLK_RESET_CCLK_BURST		0x20
+#define CLK_RESET_CCLK_DIVIDER		0x24
+#define CLK_RESET_SCLK_BURST		0x28
+#define CLK_RESET_SCLK_DIVIDER		0x2c
+#define CLK_RESET_PLLC_BASE		0x80
+#define CLK_RESET_PLLM_BASE		0x90
+#define CLK_RESET_PLLP_BASE		0xa0
+
+#define APB_MISC_XM2CFGCPADCTRL		0x8c8
+#define APB_MISC_XM2CFGDPADCTRL		0x8cc
+#define APB_MISC_XM2CLKCFGPADCTRL	0x8d0
+#define APB_MISC_XM2COMPPADCTRL		0x8d4
+#define APB_MISC_XM2VTTGENPADCTRL	0x8d8
+#define APB_MISC_XM2CFGCPADCTRL2	0x8e4
+#define APB_MISC_XM2CFGDPADCTRL2	0x8e8
+
+.macro pll_enable, rd, r_car_base, pll_base
+	ldr	\rd, [\r_car_base, #\pll_base]
+	tst	\rd, #(1 << 30)
+	orreq	\rd, \rd, #(1 << 30)
+	streq	\rd, [\r_car_base, #\pll_base]
+.endm
+
+.macro emc_device_mask, rd, base
+	ldr	\rd, [\base, #EMC_ADR_CFG]
+	tst	\rd, #(0x3 << 24)
+	moveq	\rd, #(0x1 << 8)		@ just 1 device
+	movne	\rd, #(0x3 << 8)		@ 2 devices
+.endm
+
 #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PM_SLEEP)
 /*
  * tegra20_hotplug_shutdown(void)
@@ -181,6 +220,28 @@
 ENDPROC(tegra20_cpu_is_resettable_soon)
 
 /*
+ * tegra20_sleep_core_finish(unsigned long v2p)
+ *
+ * Enters suspend in LP0 or LP1 by turning off the mmu and jumping to
+ * tegra20_tear_down_core in IRAM
+ */
+ENTRY(tegra20_sleep_core_finish)
+	/* Flush, disable the L1 data cache and exit SMP */
+	bl	tegra_disable_clean_inv_dcache
+
+	mov32	r3, tegra_shut_off_mmu
+	add	r3, r3, r0
+
+	mov32	r0, tegra20_tear_down_core
+	mov32	r1, tegra20_iram_start
+	sub	r0, r0, r1
+	mov32	r1, TEGRA_IRAM_CODE_AREA
+	add	r0, r0, r1
+
+	mov	pc, r3
+ENDPROC(tegra20_sleep_core_finish)
+
+/*
  * tegra20_sleep_cpu_secondary_finish(unsigned long v2p)
  *
  * Enters WFI on secondary CPU by exiting coherency.
@@ -191,6 +252,7 @@
 	mrc	p15, 0, r11, c1, c0, 1  @ save actlr before exiting coherency
 
 	/* Flush and disable the L1 data cache */
+	mov	r0, #TEGRA_FLUSH_CACHE_LOUIS
 	bl	tegra_disable_clean_inv_dcache
 
 	mov32	r0, TEGRA_PMC_VIRT + PMC_SCRATCH41
@@ -250,6 +312,150 @@
 	b	tegra20_enter_sleep
 ENDPROC(tegra20_tear_down_cpu)
 
+/* START OF ROUTINES COPIED TO IRAM */
+	.align L1_CACHE_SHIFT
+	.globl tegra20_iram_start
+tegra20_iram_start:
+
+/*
+ * tegra20_lp1_reset
+ *
+ * reset vector for LP1 restore; copied into IRAM during suspend.
+ * Brings the system back up to a safe staring point (SDRAM out of
+ * self-refresh, PLLC, PLLM and PLLP reenabled, CPU running on PLLP,
+ * system clock running on the same PLL that it suspended at), and
+ * jumps to tegra_resume to restore virtual addressing and PLLX.
+ * The physical address of tegra_resume expected to be stored in
+ * PMC_SCRATCH41.
+ *
+ * NOTE: THIS *MUST* BE RELOCATED TO TEGRA_IRAM_CODE_AREA.
+ */
+ENTRY(tegra20_lp1_reset)
+	/*
+	 * The CPU and system bus are running at 32KHz and executing from
+	 * IRAM when this code is executed; immediately switch to CLKM and
+	 * enable PLLM, PLLP, PLLC.
+	 */
+	mov32	r0, TEGRA_CLK_RESET_BASE
+
+	mov	r1, #(1 << 28)
+	str	r1, [r0, #CLK_RESET_SCLK_BURST]
+	str	r1, [r0, #CLK_RESET_CCLK_BURST]
+	mov	r1, #0
+	str	r1, [r0, #CLK_RESET_CCLK_DIVIDER]
+	str	r1, [r0, #CLK_RESET_SCLK_DIVIDER]
+
+	pll_enable r1, r0, CLK_RESET_PLLM_BASE
+	pll_enable r1, r0, CLK_RESET_PLLP_BASE
+	pll_enable r1, r0, CLK_RESET_PLLC_BASE
+
+	adr	r2, tegra20_sdram_pad_address
+	adr	r4, tegra20_sdram_pad_save
+	mov	r5, #0
+
+	ldr	r6, tegra20_sdram_pad_size
+padload:
+	ldr	r7, [r2, r5]		@ r7 is the addr in the pad_address
+
+	ldr	r1, [r4, r5]
+	str	r1, [r7]		@ restore the value in pad_save
+
+	add	r5, r5, #4
+	cmp	r6, r5
+	bne	padload
+
+padload_done:
+	/* 255uS delay for PLL stabilization */
+	mov32	r7, TEGRA_TMRUS_BASE
+	ldr	r1, [r7]
+	add	r1, r1, #0xff
+	wait_until r1, r7, r9
+
+	adr	r4, tegra20_sclk_save
+	ldr	r4, [r4]
+	str	r4, [r0, #CLK_RESET_SCLK_BURST]
+	mov32	r4, ((1 << 28) | (4))	@ burst policy is PLLP
+	str	r4, [r0, #CLK_RESET_CCLK_BURST]
+
+	mov32	r0, TEGRA_EMC_BASE
+	ldr	r1, [r0, #EMC_CFG]
+	bic	r1, r1, #(1 << 31)	@ disable DRAM_CLK_STOP
+	str	r1, [r0, #EMC_CFG]
+
+	mov	r1, #0
+	str	r1, [r0, #EMC_SELF_REF]	@ take DRAM out of self refresh
+	mov	r1, #1
+	str	r1, [r0, #EMC_NOP]
+	str	r1, [r0, #EMC_NOP]
+	str	r1, [r0, #EMC_REFRESH]
+
+	emc_device_mask r1, r0
+
+exit_selfrefresh_loop:
+	ldr	r2, [r0, #EMC_EMC_STATUS]
+	ands	r2, r2, r1
+	bne	exit_selfrefresh_loop
+
+	mov	r1, #0			@ unstall all transactions
+	str	r1, [r0, #EMC_REQ_CTRL]
+
+	mov32	r0, TEGRA_PMC_BASE
+	ldr	r0, [r0, #PMC_SCRATCH41]
+	mov	pc, r0			@ jump to tegra_resume
+ENDPROC(tegra20_lp1_reset)
+
+/*
+ * tegra20_tear_down_core
+ *
+ * copied into and executed from IRAM
+ * puts memory in self-refresh for LP0 and LP1
+ */
+tegra20_tear_down_core:
+	bl	tegra20_sdram_self_refresh
+	bl	tegra20_switch_cpu_to_clk32k
+	b	tegra20_enter_sleep
+
+/*
+ * tegra20_switch_cpu_to_clk32k
+ *
+ * In LP0 and LP1 all PLLs will be turned off. Switch the CPU and system clock
+ * to the 32KHz clock.
+ */
+tegra20_switch_cpu_to_clk32k:
+	/*
+	 * start by switching to CLKM to safely disable PLLs, then switch to
+	 * CLKS.
+	 */
+	mov	r0, #(1 << 28)
+	str	r0, [r5, #CLK_RESET_SCLK_BURST]
+	str	r0, [r5, #CLK_RESET_CCLK_BURST]
+	mov	r0, #0
+	str	r0, [r5, #CLK_RESET_CCLK_DIVIDER]
+	str	r0, [r5, #CLK_RESET_SCLK_DIVIDER]
+
+	/* 2uS delay delay between changing SCLK and disabling PLLs */
+	mov32	r7, TEGRA_TMRUS_BASE
+	ldr	r1, [r7]
+	add	r1, r1, #2
+	wait_until r1, r7, r9
+
+	/* disable PLLM, PLLP and PLLC */
+	ldr	r0, [r5, #CLK_RESET_PLLM_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLM_BASE]
+	ldr	r0, [r5, #CLK_RESET_PLLP_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLP_BASE]
+	ldr	r0, [r5, #CLK_RESET_PLLC_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLC_BASE]
+
+	/* switch to CLKS */
+	mov	r0, #0	/* brust policy = 32KHz */
+	str	r0, [r5, #CLK_RESET_SCLK_BURST]
+
+	mov	pc, lr
+
 /*
  * tegra20_enter_sleep
  *
@@ -274,4 +480,95 @@
 	isb
 	b	halted
 
+/*
+ * tegra20_sdram_self_refresh
+ *
+ * called with MMU off and caches disabled
+ * puts sdram in self refresh
+ * must be executed from IRAM
+ */
+tegra20_sdram_self_refresh:
+	mov32	r1, TEGRA_EMC_BASE	@ r1 reserved for emc base addr
+
+	mov	r2, #3
+	str	r2, [r1, #EMC_REQ_CTRL]	@ stall incoming DRAM requests
+
+emcidle:
+	ldr	r2, [r1, #EMC_EMC_STATUS]
+	tst	r2, #4
+	beq	emcidle
+
+	mov	r2, #1
+	str	r2, [r1, #EMC_SELF_REF]
+
+	emc_device_mask r2, r1
+
+emcself:
+	ldr	r3, [r1, #EMC_EMC_STATUS]
+	and	r3, r3, r2
+	cmp	r3, r2
+	bne	emcself			@ loop until DDR in self-refresh
+
+	adr	r2, tegra20_sdram_pad_address
+	adr	r3, tegra20_sdram_pad_safe
+	adr	r4, tegra20_sdram_pad_save
+	mov	r5, #0
+
+	ldr	r6, tegra20_sdram_pad_size
+padsave:
+	ldr	r0, [r2, r5]		@ r0 is the addr in the pad_address
+
+	ldr	r1, [r0]
+	str	r1, [r4, r5]		@ save the content of the addr
+
+	ldr	r1, [r3, r5]
+	str	r1, [r0]		@ set the save val to the addr
+
+	add	r5, r5, #4
+	cmp	r6, r5
+	bne	padsave
+padsave_done:
+
+	mov32	r5, TEGRA_CLK_RESET_BASE
+	ldr	r0, [r5, #CLK_RESET_SCLK_BURST]
+	adr	r2, tegra20_sclk_save
+	str	r0, [r2]
+	dsb
+	mov	pc, lr
+
+tegra20_sdram_pad_address:
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2CFGCPADCTRL
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2CFGDPADCTRL
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2CLKCFGPADCTRL
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2COMPPADCTRL
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2VTTGENPADCTRL
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2CFGCPADCTRL2
+	.word	TEGRA_APB_MISC_BASE + APB_MISC_XM2CFGDPADCTRL2
+
+tegra20_sdram_pad_size:
+	.word	tegra20_sdram_pad_size - tegra20_sdram_pad_address
+
+tegra20_sdram_pad_safe:
+	.word	0x8
+	.word	0x8
+	.word	0x0
+	.word	0x8
+	.word	0x5500
+	.word	0x08080040
+	.word	0x0
+
+tegra20_sclk_save:
+	.word	0x0
+
+tegra20_sdram_pad_save:
+	.rept (tegra20_sdram_pad_size - tegra20_sdram_pad_address) / 4
+	.long	0
+	.endr
+
+	.ltorg
+/* dummy symbol for end of IRAM */
+	.align L1_CACHE_SHIFT
+	.globl tegra20_iram_end
+tegra20_iram_end:
+	b	.
 #endif
diff --git a/arch/arm/mach-tegra/sleep-tegra30.S b/arch/arm/mach-tegra/sleep-tegra30.S
index ada8821..63fa91b 100644
--- a/arch/arm/mach-tegra/sleep-tegra30.S
+++ b/arch/arm/mach-tegra/sleep-tegra30.S
@@ -18,13 +18,118 @@
 
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
+#include <asm/cache.h>
 
 #include "fuse.h"
 #include "sleep.h"
 #include "flowctrl.h"
 
+#define EMC_CFG				0xc
+#define EMC_ADR_CFG			0x10
+#define EMC_TIMING_CONTROL		0x28
+#define EMC_REFRESH			0x70
+#define EMC_NOP				0xdc
+#define EMC_SELF_REF			0xe0
+#define EMC_MRW				0xe8
+#define EMC_FBIO_CFG5			0x104
+#define EMC_AUTO_CAL_CONFIG		0x2a4
+#define EMC_AUTO_CAL_INTERVAL		0x2a8
+#define EMC_AUTO_CAL_STATUS		0x2ac
+#define EMC_REQ_CTRL			0x2b0
+#define EMC_CFG_DIG_DLL			0x2bc
+#define EMC_EMC_STATUS			0x2b4
+#define EMC_ZCAL_INTERVAL		0x2e0
+#define EMC_ZQ_CAL			0x2ec
+#define EMC_XM2VTTGENPADCTRL		0x310
+#define EMC_XM2VTTGENPADCTRL2		0x314
+
+#define PMC_CTRL			0x0
+#define PMC_CTRL_SIDE_EFFECT_LP0 (1 << 14) /* enter LP0 when CPU pwr gated */
+
+#define PMC_PLLP_WB0_OVERRIDE		0xf8
+#define PMC_IO_DPD_REQ			0x1b8
+#define PMC_IO_DPD_STATUS		0x1bc
+
+#define CLK_RESET_CCLK_BURST		0x20
+#define CLK_RESET_CCLK_DIVIDER		0x24
+#define CLK_RESET_SCLK_BURST		0x28
+#define CLK_RESET_SCLK_DIVIDER		0x2c
+
+#define CLK_RESET_PLLC_BASE		0x80
+#define CLK_RESET_PLLC_MISC		0x8c
+#define CLK_RESET_PLLM_BASE		0x90
+#define CLK_RESET_PLLM_MISC		0x9c
+#define CLK_RESET_PLLP_BASE		0xa0
+#define CLK_RESET_PLLP_MISC		0xac
+#define CLK_RESET_PLLA_BASE		0xb0
+#define CLK_RESET_PLLA_MISC		0xbc
+#define CLK_RESET_PLLX_BASE		0xe0
+#define CLK_RESET_PLLX_MISC		0xe4
+#define CLK_RESET_PLLX_MISC3		0x518
+#define CLK_RESET_PLLX_MISC3_IDDQ	3
+#define CLK_RESET_PLLM_MISC_IDDQ	5
+#define CLK_RESET_PLLC_MISC_IDDQ	26
+
+#define CLK_RESET_CLK_SOURCE_MSELECT	0x3b4
+
+#define MSELECT_CLKM			(0x3 << 30)
+
+#define LOCK_DELAY 50 /* safety delay after lock is detected */
+
 #define TEGRA30_POWER_HOTPLUG_SHUTDOWN	(1 << 27) /* Hotplug shutdown */
 
+.macro emc_device_mask, rd, base
+	ldr	\rd, [\base, #EMC_ADR_CFG]
+	tst	\rd, #0x1
+	moveq	\rd, #(0x1 << 8)		@ just 1 device
+	movne	\rd, #(0x3 << 8)		@ 2 devices
+.endm
+
+.macro emc_timing_update, rd, base
+	mov	\rd, #1
+	str	\rd, [\base, #EMC_TIMING_CONTROL]
+1001:
+	ldr	\rd, [\base, #EMC_EMC_STATUS]
+	tst	\rd, #(0x1<<23)	@ wait EMC_STATUS_TIMING_UPDATE_STALLED is clear
+	bne	1001b
+.endm
+
+.macro pll_enable, rd, r_car_base, pll_base, pll_misc
+	ldr	\rd, [\r_car_base, #\pll_base]
+	tst	\rd, #(1 << 30)
+	orreq	\rd, \rd, #(1 << 30)
+	streq	\rd, [\r_car_base, #\pll_base]
+	/* Enable lock detector */
+	.if	\pll_misc
+	ldr	\rd, [\r_car_base, #\pll_misc]
+	bic	\rd, \rd, #(1 << 18)
+	str	\rd, [\r_car_base, #\pll_misc]
+	ldr	\rd, [\r_car_base, #\pll_misc]
+	ldr	\rd, [\r_car_base, #\pll_misc]
+	orr	\rd, \rd, #(1 << 18)
+	str	\rd, [\r_car_base, #\pll_misc]
+	.endif
+.endm
+
+.macro pll_locked, rd, r_car_base, pll_base
+1:
+	ldr	\rd, [\r_car_base, #\pll_base]
+	tst	\rd, #(1 << 27)
+	beq	1b
+.endm
+
+.macro pll_iddq_exit, rd, car, iddq, iddq_bit
+	ldr	\rd, [\car, #\iddq]
+	bic	\rd, \rd, #(1<<\iddq_bit)
+	str	\rd, [\car, #\iddq]
+.endm
+
+.macro pll_iddq_entry, rd, car, iddq, iddq_bit
+	ldr	\rd, [\car, #\iddq]
+	orr	\rd, \rd, #(1<<\iddq_bit)
+	str	\rd, [\car, #\iddq]
+.endm
+
 #if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_PM_SLEEP)
 /*
  * tegra30_hotplug_shutdown(void)
@@ -99,6 +204,8 @@
 	cmp	r10, #TEGRA30
 	moveq   r3, #FLOW_CTRL_WAIT_FOR_INTERRUPT	@ For LP2
 	movne	r3, #FLOW_CTRL_WAITEVENT
+	orrne	r3, r3, #FLOW_CTRL_HALT_GIC_IRQ
+	orrne	r3, r3, #FLOW_CTRL_HALT_GIC_FIQ
 flow_ctrl_done:
 	cmp	r10, #TEGRA30
 	str	r3, [r2]
@@ -127,6 +234,41 @@
 
 #ifdef CONFIG_PM_SLEEP
 /*
+ * tegra30_sleep_core_finish(unsigned long v2p)
+ *
+ * Enters suspend in LP0 or LP1 by turning off the MMU and jumping to
+ * tegra30_tear_down_core in IRAM
+ */
+ENTRY(tegra30_sleep_core_finish)
+	/* Flush, disable the L1 data cache and exit SMP */
+	bl	tegra_disable_clean_inv_dcache
+
+	/*
+	 * Preload all the address literals that are needed for the
+	 * CPU power-gating process, to avoid loading from SDRAM which
+	 * are not supported once SDRAM is put into self-refresh.
+	 * LP0 / LP1 use physical address, since the MMU needs to be
+	 * disabled before putting SDRAM into self-refresh to avoid
+	 * memory access due to page table walks.
+	 */
+	mov32	r4, TEGRA_PMC_BASE
+	mov32	r5, TEGRA_CLK_RESET_BASE
+	mov32	r6, TEGRA_FLOW_CTRL_BASE
+	mov32	r7, TEGRA_TMRUS_BASE
+
+	mov32	r3, tegra_shut_off_mmu
+	add	r3, r3, r0
+
+	mov32	r0, tegra30_tear_down_core
+	mov32	r1, tegra30_iram_start
+	sub	r0, r0, r1
+	mov32	r1, TEGRA_IRAM_CODE_AREA
+	add	r0, r0, r1
+
+	mov	pc, r3
+ENDPROC(tegra30_sleep_core_finish)
+
+/*
  * tegra30_sleep_cpu_secondary_finish(unsigned long v2p)
  *
  * Enters LP2 on secondary CPU by exiting coherency and powergating the CPU.
@@ -135,6 +277,7 @@
 	mov	r7, lr
 
 	/* Flush and disable the L1 data cache */
+	mov 	r0, #TEGRA_FLUSH_CACHE_LOUIS
 	bl	tegra_disable_clean_inv_dcache
 
 	/* Powergate this CPU. */
@@ -155,6 +298,351 @@
 	b	tegra30_enter_sleep
 ENDPROC(tegra30_tear_down_cpu)
 
+/* START OF ROUTINES COPIED TO IRAM */
+	.align L1_CACHE_SHIFT
+	.globl tegra30_iram_start
+tegra30_iram_start:
+
+/*
+ * tegra30_lp1_reset
+ *
+ * reset vector for LP1 restore; copied into IRAM during suspend.
+ * Brings the system back up to a safe staring point (SDRAM out of
+ * self-refresh, PLLC, PLLM and PLLP reenabled, CPU running on PLLX,
+ * system clock running on the same PLL that it suspended at), and
+ * jumps to tegra_resume to restore virtual addressing.
+ * The physical address of tegra_resume expected to be stored in
+ * PMC_SCRATCH41.
+ *
+ * NOTE: THIS *MUST* BE RELOCATED TO TEGRA_IRAM_CODE_AREA.
+ */
+ENTRY(tegra30_lp1_reset)
+	/*
+	 * The CPU and system bus are running at 32KHz and executing from
+	 * IRAM when this code is executed; immediately switch to CLKM and
+	 * enable PLLP, PLLM, PLLC, PLLA and PLLX.
+	 */
+	mov32	r0, TEGRA_CLK_RESET_BASE
+
+	mov	r1, #(1 << 28)
+	str	r1, [r0, #CLK_RESET_SCLK_BURST]
+	str	r1, [r0, #CLK_RESET_CCLK_BURST]
+	mov	r1, #0
+	str	r1, [r0, #CLK_RESET_CCLK_DIVIDER]
+	str	r1, [r0, #CLK_RESET_SCLK_DIVIDER]
+
+	tegra_get_soc_id TEGRA_APB_MISC_BASE, r10
+	cmp	r10, #TEGRA30
+	beq	_no_pll_iddq_exit
+
+	pll_iddq_exit r1, r0, CLK_RESET_PLLM_MISC, CLK_RESET_PLLM_MISC_IDDQ
+	pll_iddq_exit r1, r0, CLK_RESET_PLLC_MISC, CLK_RESET_PLLC_MISC_IDDQ
+	pll_iddq_exit r1, r0, CLK_RESET_PLLX_MISC3, CLK_RESET_PLLX_MISC3_IDDQ
+
+	mov32	r7, TEGRA_TMRUS_BASE
+	ldr	r1, [r7]
+	add	r1, r1, #2
+	wait_until r1, r7, r3
+
+	/* enable PLLM via PMC */
+	mov32	r2, TEGRA_PMC_BASE
+	ldr	r1, [r2, #PMC_PLLP_WB0_OVERRIDE]
+	orr	r1, r1, #(1 << 12)
+	str	r1, [r2, #PMC_PLLP_WB0_OVERRIDE]
+
+	pll_enable r1, r0, CLK_RESET_PLLM_BASE, 0
+	pll_enable r1, r0, CLK_RESET_PLLC_BASE, 0
+	pll_enable r1, r0, CLK_RESET_PLLX_BASE, 0
+
+	b	_pll_m_c_x_done
+
+_no_pll_iddq_exit:
+	/* enable PLLM via PMC */
+	mov32	r2, TEGRA_PMC_BASE
+	ldr	r1, [r2, #PMC_PLLP_WB0_OVERRIDE]
+	orr	r1, r1, #(1 << 12)
+	str	r1, [r2, #PMC_PLLP_WB0_OVERRIDE]
+
+	pll_enable r1, r0, CLK_RESET_PLLM_BASE, CLK_RESET_PLLM_MISC
+	pll_enable r1, r0, CLK_RESET_PLLC_BASE, CLK_RESET_PLLC_MISC
+	pll_enable r1, r0, CLK_RESET_PLLX_BASE, CLK_RESET_PLLX_MISC
+
+_pll_m_c_x_done:
+	pll_enable r1, r0, CLK_RESET_PLLP_BASE, CLK_RESET_PLLP_MISC
+	pll_enable r1, r0, CLK_RESET_PLLA_BASE, CLK_RESET_PLLA_MISC
+
+	pll_locked r1, r0, CLK_RESET_PLLM_BASE
+	pll_locked r1, r0, CLK_RESET_PLLP_BASE
+	pll_locked r1, r0, CLK_RESET_PLLA_BASE
+	pll_locked r1, r0, CLK_RESET_PLLC_BASE
+	pll_locked r1, r0, CLK_RESET_PLLX_BASE
+
+	mov32	r7, TEGRA_TMRUS_BASE
+	ldr	r1, [r7]
+	add	r1, r1, #LOCK_DELAY
+	wait_until r1, r7, r3
+
+	adr	r5, tegra30_sdram_pad_save
+
+	ldr	r4, [r5, #0x18]		@ restore CLK_SOURCE_MSELECT
+	str	r4, [r0, #CLK_RESET_CLK_SOURCE_MSELECT]
+
+	ldr	r4, [r5, #0x1C]		@ restore SCLK_BURST
+	str	r4, [r0, #CLK_RESET_SCLK_BURST]
+
+	cmp	r10, #TEGRA30
+	movweq	r4, #:lower16:((1 << 28) | (0x8))	@ burst policy is PLLX
+	movteq	r4, #:upper16:((1 << 28) | (0x8))
+	movwne	r4, #:lower16:((1 << 28) | (0xe))
+	movtne	r4, #:upper16:((1 << 28) | (0xe))
+	str	r4, [r0, #CLK_RESET_CCLK_BURST]
+
+	/* Restore pad power state to normal */
+	ldr	r1, [r5, #0x14]		@ PMC_IO_DPD_STATUS
+	mvn	r1, r1
+	bic	r1, r1, #(1 << 31)
+	orr	r1, r1, #(1 << 30)
+	str	r1, [r2, #PMC_IO_DPD_REQ]	@ DPD_OFF
+
+	cmp	r10, #TEGRA30
+	movweq	r0, #:lower16:TEGRA_EMC_BASE	@ r0 reserved for emc base
+	movteq	r0, #:upper16:TEGRA_EMC_BASE
+	movwne	r0, #:lower16:TEGRA_EMC0_BASE
+	movtne	r0, #:upper16:TEGRA_EMC0_BASE
+
+exit_self_refresh:
+	ldr	r1, [r5, #0xC]		@ restore EMC_XM2VTTGENPADCTRL
+	str	r1, [r0, #EMC_XM2VTTGENPADCTRL]
+	ldr	r1, [r5, #0x10]		@ restore EMC_XM2VTTGENPADCTRL2
+	str	r1, [r0, #EMC_XM2VTTGENPADCTRL2]
+	ldr	r1, [r5, #0x8]		@ restore EMC_AUTO_CAL_INTERVAL
+	str	r1, [r0, #EMC_AUTO_CAL_INTERVAL]
+
+	/* Relock DLL */
+	ldr	r1, [r0, #EMC_CFG_DIG_DLL]
+	orr	r1, r1, #(1 << 30)	@ set DLL_RESET
+	str	r1, [r0, #EMC_CFG_DIG_DLL]
+
+	emc_timing_update r1, r0
+
+	cmp	r10, #TEGRA114
+	movweq	r1, #:lower16:TEGRA_EMC1_BASE
+	movteq	r1, #:upper16:TEGRA_EMC1_BASE
+	cmpeq	r0, r1
+
+	ldr	r1, [r0, #EMC_AUTO_CAL_CONFIG]
+	orr	r1, r1, #(1 << 31)	@ set AUTO_CAL_ACTIVE
+	orreq	r1, r1, #(1 << 27)	@ set slave mode for channel 1
+	str	r1, [r0, #EMC_AUTO_CAL_CONFIG]
+
+emc_wait_auto_cal_onetime:
+	ldr	r1, [r0, #EMC_AUTO_CAL_STATUS]
+	tst	r1, #(1 << 31)		@ wait until AUTO_CAL_ACTIVE is cleared
+	bne	emc_wait_auto_cal_onetime
+
+	ldr	r1, [r0, #EMC_CFG]
+	bic	r1, r1, #(1 << 31)	@ disable DRAM_CLK_STOP_PD
+	str	r1, [r0, #EMC_CFG]
+
+	mov	r1, #0
+	str	r1, [r0, #EMC_SELF_REF]	@ take DRAM out of self refresh
+	mov	r1, #1
+	cmp	r10, #TEGRA30
+	streq	r1, [r0, #EMC_NOP]
+	streq	r1, [r0, #EMC_NOP]
+	streq	r1, [r0, #EMC_REFRESH]
+
+	emc_device_mask r1, r0
+
+exit_selfrefresh_loop:
+	ldr	r2, [r0, #EMC_EMC_STATUS]
+	ands	r2, r2, r1
+	bne	exit_selfrefresh_loop
+
+	lsr	r1, r1, #8		@ devSel, bit0:dev0, bit1:dev1
+
+	mov32	r7, TEGRA_TMRUS_BASE
+	ldr	r2, [r0, #EMC_FBIO_CFG5]
+
+	and	r2, r2,	#3		@ check DRAM_TYPE
+	cmp	r2, #2
+	beq	emc_lpddr2
+
+	/* Issue a ZQ_CAL for dev0 - DDR3 */
+	mov32	r2, 0x80000011		@ DEV_SELECTION=2, LENGTH=LONG, CMD=1
+	str	r2, [r0, #EMC_ZQ_CAL]
+	ldr	r2, [r7]
+	add	r2, r2, #10
+	wait_until r2, r7, r3
+
+	tst	r1, #2
+	beq	zcal_done
+
+	/* Issue a ZQ_CAL for dev1 - DDR3 */
+	mov32	r2, 0x40000011		@ DEV_SELECTION=1, LENGTH=LONG, CMD=1
+	str	r2, [r0, #EMC_ZQ_CAL]
+	ldr	r2, [r7]
+	add	r2, r2, #10
+	wait_until r2, r7, r3
+	b	zcal_done
+
+emc_lpddr2:
+	/* Issue a ZQ_CAL for dev0 - LPDDR2 */
+	mov32	r2, 0x800A00AB		@ DEV_SELECTION=2, MA=10, OP=0xAB
+	str	r2, [r0, #EMC_MRW]
+	ldr	r2, [r7]
+	add	r2, r2, #1
+	wait_until r2, r7, r3
+
+	tst	r1, #2
+	beq	zcal_done
+
+	/* Issue a ZQ_CAL for dev0 - LPDDR2 */
+	mov32	r2, 0x400A00AB		@ DEV_SELECTION=1, MA=10, OP=0xAB
+	str	r2, [r0, #EMC_MRW]
+	ldr	r2, [r7]
+	add	r2, r2, #1
+	wait_until r2, r7, r3
+
+zcal_done:
+	mov	r1, #0			@ unstall all transactions
+	str	r1, [r0, #EMC_REQ_CTRL]
+	ldr	r1, [r5, #0x4]		@ restore EMC_ZCAL_INTERVAL
+	str	r1, [r0, #EMC_ZCAL_INTERVAL]
+	ldr	r1, [r5, #0x0]		@ restore EMC_CFG
+	str	r1, [r0, #EMC_CFG]
+
+	/* Tegra114 had dual EMC channel, now config the other one */
+	cmp	r10, #TEGRA114
+	bne	__no_dual_emc_chanl
+	mov32	r1, TEGRA_EMC1_BASE
+	cmp	r0, r1
+	movne	r0, r1
+	addne	r5, r5, #0x20
+	bne	exit_self_refresh
+__no_dual_emc_chanl:
+
+	mov32	r0, TEGRA_PMC_BASE
+	ldr	r0, [r0, #PMC_SCRATCH41]
+	mov	pc, r0			@ jump to tegra_resume
+ENDPROC(tegra30_lp1_reset)
+
+	.align	L1_CACHE_SHIFT
+tegra30_sdram_pad_address:
+	.word	TEGRA_EMC_BASE + EMC_CFG				@0x0
+	.word	TEGRA_EMC_BASE + EMC_ZCAL_INTERVAL			@0x4
+	.word	TEGRA_EMC_BASE + EMC_AUTO_CAL_INTERVAL			@0x8
+	.word	TEGRA_EMC_BASE + EMC_XM2VTTGENPADCTRL			@0xc
+	.word	TEGRA_EMC_BASE + EMC_XM2VTTGENPADCTRL2			@0x10
+	.word	TEGRA_PMC_BASE + PMC_IO_DPD_STATUS			@0x14
+	.word	TEGRA_CLK_RESET_BASE + CLK_RESET_CLK_SOURCE_MSELECT	@0x18
+	.word	TEGRA_CLK_RESET_BASE + CLK_RESET_SCLK_BURST		@0x1c
+
+tegra114_sdram_pad_address:
+	.word	TEGRA_EMC0_BASE + EMC_CFG				@0x0
+	.word	TEGRA_EMC0_BASE + EMC_ZCAL_INTERVAL			@0x4
+	.word	TEGRA_EMC0_BASE + EMC_AUTO_CAL_INTERVAL			@0x8
+	.word	TEGRA_EMC0_BASE + EMC_XM2VTTGENPADCTRL			@0xc
+	.word	TEGRA_EMC0_BASE + EMC_XM2VTTGENPADCTRL2			@0x10
+	.word	TEGRA_PMC_BASE + PMC_IO_DPD_STATUS			@0x14
+	.word	TEGRA_CLK_RESET_BASE + CLK_RESET_CLK_SOURCE_MSELECT	@0x18
+	.word	TEGRA_CLK_RESET_BASE + CLK_RESET_SCLK_BURST		@0x1c
+	.word	TEGRA_EMC1_BASE + EMC_CFG				@0x20
+	.word	TEGRA_EMC1_BASE + EMC_ZCAL_INTERVAL			@0x24
+	.word	TEGRA_EMC1_BASE + EMC_AUTO_CAL_INTERVAL			@0x28
+	.word	TEGRA_EMC1_BASE + EMC_XM2VTTGENPADCTRL			@0x2c
+	.word	TEGRA_EMC1_BASE + EMC_XM2VTTGENPADCTRL2			@0x30
+
+tegra30_sdram_pad_size:
+	.word	tegra114_sdram_pad_address - tegra30_sdram_pad_address
+
+tegra114_sdram_pad_size:
+	.word	tegra30_sdram_pad_size - tegra114_sdram_pad_address
+
+	.type	tegra30_sdram_pad_save, %object
+tegra30_sdram_pad_save:
+	.rept (tegra30_sdram_pad_size - tegra114_sdram_pad_address) / 4
+	.long	0
+	.endr
+
+/*
+ * tegra30_tear_down_core
+ *
+ * copied into and executed from IRAM
+ * puts memory in self-refresh for LP0 and LP1
+ */
+tegra30_tear_down_core:
+	bl	tegra30_sdram_self_refresh
+	bl	tegra30_switch_cpu_to_clk32k
+	b	tegra30_enter_sleep
+
+/*
+ * tegra30_switch_cpu_to_clk32k
+ *
+ * In LP0 and LP1 all PLLs will be turned off. Switching the CPU and System CLK
+ * to the 32KHz clock.
+ * r4 = TEGRA_PMC_BASE
+ * r5 = TEGRA_CLK_RESET_BASE
+ * r6 = TEGRA_FLOW_CTRL_BASE
+ * r7 = TEGRA_TMRUS_BASE
+ * r10= SoC ID
+ */
+tegra30_switch_cpu_to_clk32k:
+	/*
+	 * start by jumping to CLKM to safely disable PLLs, then jump to
+	 * CLKS.
+	 */
+	mov	r0, #(1 << 28)
+	str	r0, [r5, #CLK_RESET_SCLK_BURST]
+	/* 2uS delay delay between changing SCLK and CCLK */
+	ldr	r1, [r7]
+	add	r1, r1, #2
+	wait_until r1, r7, r9
+	str	r0, [r5, #CLK_RESET_CCLK_BURST]
+	mov	r0, #0
+	str	r0, [r5, #CLK_RESET_CCLK_DIVIDER]
+	str	r0, [r5, #CLK_RESET_SCLK_DIVIDER]
+
+	/* switch the clock source of mselect to be CLK_M */
+	ldr	r0, [r5, #CLK_RESET_CLK_SOURCE_MSELECT]
+	orr	r0, r0, #MSELECT_CLKM
+	str	r0, [r5, #CLK_RESET_CLK_SOURCE_MSELECT]
+
+	/* 2uS delay delay between changing SCLK and disabling PLLs */
+	ldr	r1, [r7]
+	add	r1, r1, #2
+	wait_until r1, r7, r9
+
+	/* disable PLLM via PMC in LP1 */
+	ldr	r0, [r4, #PMC_PLLP_WB0_OVERRIDE]
+	bic	r0, r0, #(1 << 12)
+	str	r0, [r4, #PMC_PLLP_WB0_OVERRIDE]
+
+	/* disable PLLP, PLLA, PLLC and PLLX */
+	ldr	r0, [r5, #CLK_RESET_PLLP_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLP_BASE]
+	ldr	r0, [r5, #CLK_RESET_PLLA_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLA_BASE]
+	ldr	r0, [r5, #CLK_RESET_PLLC_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLC_BASE]
+	ldr	r0, [r5, #CLK_RESET_PLLX_BASE]
+	bic	r0, r0, #(1 << 30)
+	str	r0, [r5, #CLK_RESET_PLLX_BASE]
+
+	cmp	r10, #TEGRA30
+	beq	_no_pll_in_iddq
+	pll_iddq_entry r1, r5, CLK_RESET_PLLX_MISC3, CLK_RESET_PLLX_MISC3_IDDQ
+_no_pll_in_iddq:
+
+	/* switch to CLKS */
+	mov	r0, #0	/* brust policy = 32KHz */
+	str	r0, [r5, #CLK_RESET_SCLK_BURST]
+
+	mov	pc, lr
+
 /*
  * tegra30_enter_sleep
  *
@@ -172,8 +660,12 @@
 	orr	r0, r0, #FLOW_CTRL_CSR_ENABLE
 	str	r0, [r6, r2]
 
+	tegra_get_soc_id TEGRA_APB_MISC_BASE, r10
+	cmp	r10, #TEGRA30
 	mov	r0, #FLOW_CTRL_WAIT_FOR_INTERRUPT
-	orr	r0, r0, #FLOW_CTRL_HALT_CPU_IRQ | FLOW_CTRL_HALT_CPU_FIQ
+	orreq	r0, r0, #FLOW_CTRL_HALT_CPU_IRQ | FLOW_CTRL_HALT_CPU_FIQ
+	orrne   r0, r0, #FLOW_CTRL_HALT_LIC_IRQ | FLOW_CTRL_HALT_LIC_FIQ
+
 	cpu_to_halt_reg r2, r1
 	str	r0, [r6, r2]
 	dsb
@@ -187,4 +679,126 @@
 	/* !!!FIXME!!! Implement halt failure handler */
 	b	halted
 
+/*
+ * tegra30_sdram_self_refresh
+ *
+ * called with MMU off and caches disabled
+ * must be executed from IRAM
+ * r4 = TEGRA_PMC_BASE
+ * r5 = TEGRA_CLK_RESET_BASE
+ * r6 = TEGRA_FLOW_CTRL_BASE
+ * r7 = TEGRA_TMRUS_BASE
+ * r10= SoC ID
+ */
+tegra30_sdram_self_refresh:
+
+	adr	r8, tegra30_sdram_pad_save
+	tegra_get_soc_id TEGRA_APB_MISC_BASE, r10
+	cmp	r10, #TEGRA30
+	adreq	r2, tegra30_sdram_pad_address
+	ldreq	r3, tegra30_sdram_pad_size
+	adrne	r2, tegra114_sdram_pad_address
+	ldrne	r3, tegra114_sdram_pad_size
+	mov	r9, #0
+
+padsave:
+	ldr	r0, [r2, r9]		@ r0 is the addr in the pad_address
+
+	ldr	r1, [r0]
+	str	r1, [r8, r9]		@ save the content of the addr
+
+	add	r9, r9, #4
+	cmp	r3, r9
+	bne	padsave
+padsave_done:
+
+	dsb
+
+	cmp	r10, #TEGRA30
+	ldreq	r0, =TEGRA_EMC_BASE	@ r0 reserved for emc base addr
+	ldrne	r0, =TEGRA_EMC0_BASE
+
+enter_self_refresh:
+	cmp	r10, #TEGRA30
+	mov	r1, #0
+	str	r1, [r0, #EMC_ZCAL_INTERVAL]
+	str	r1, [r0, #EMC_AUTO_CAL_INTERVAL]
+	ldr	r1, [r0, #EMC_CFG]
+	bic	r1, r1, #(1 << 28)
+	bicne	r1, r1, #(1 << 29)
+	str	r1, [r0, #EMC_CFG]	@ disable DYN_SELF_REF
+
+	emc_timing_update r1, r0
+
+	ldr	r1, [r7]
+	add	r1, r1, #5
+	wait_until r1, r7, r2
+
+emc_wait_auto_cal:
+	ldr	r1, [r0, #EMC_AUTO_CAL_STATUS]
+	tst	r1, #(1 << 31)		@ wait until AUTO_CAL_ACTIVE is cleared
+	bne	emc_wait_auto_cal
+
+	mov	r1, #3
+	str	r1, [r0, #EMC_REQ_CTRL]	@ stall incoming DRAM requests
+
+emcidle:
+	ldr	r1, [r0, #EMC_EMC_STATUS]
+	tst	r1, #4
+	beq	emcidle
+
+	mov	r1, #1
+	str	r1, [r0, #EMC_SELF_REF]
+
+	emc_device_mask r1, r0
+
+emcself:
+	ldr	r2, [r0, #EMC_EMC_STATUS]
+	and	r2, r2, r1
+	cmp	r2, r1
+	bne	emcself			@ loop until DDR in self-refresh
+
+	/* Put VTTGEN in the lowest power mode */
+	ldr	r1, [r0, #EMC_XM2VTTGENPADCTRL]
+	mov32	r2, 0xF8F8FFFF	@ clear XM2VTTGEN_DRVUP and XM2VTTGEN_DRVDN
+	and	r1, r1, r2
+	str	r1, [r0, #EMC_XM2VTTGENPADCTRL]
+	ldr	r1, [r0, #EMC_XM2VTTGENPADCTRL2]
+	cmp	r10, #TEGRA30
+	orreq	r1, r1, #7		@ set E_NO_VTTGEN
+	orrne	r1, r1, #0x3f
+	str	r1, [r0, #EMC_XM2VTTGENPADCTRL2]
+
+	emc_timing_update r1, r0
+
+	/* Tegra114 had dual EMC channel, now config the other one */
+	cmp	r10, #TEGRA114
+	bne	no_dual_emc_chanl
+	mov32	r1, TEGRA_EMC1_BASE
+	cmp	r0, r1
+	movne	r0, r1
+	bne	enter_self_refresh
+no_dual_emc_chanl:
+
+	ldr	r1, [r4, #PMC_CTRL]
+	tst	r1, #PMC_CTRL_SIDE_EFFECT_LP0
+	bne	pmc_io_dpd_skip
+	/*
+	 * Put DDR_DATA, DISC_ADDR_CMD, DDR_ADDR_CMD, POP_ADDR_CMD, POP_CLK
+	 * and COMP in the lowest power mode when LP1.
+	 */
+	mov32	r1, 0x8EC00000
+	str	r1, [r4, #PMC_IO_DPD_REQ]
+pmc_io_dpd_skip:
+
+	dsb
+
+	mov	pc, lr
+
+	.ltorg
+/* dummy symbol for end of IRAM */
+	.align L1_CACHE_SHIFT
+	.global tegra30_iram_end
+tegra30_iram_end:
+	b	.
 #endif
diff --git a/arch/arm/mach-tegra/sleep.S b/arch/arm/mach-tegra/sleep.S
index 9daaef2..8d06213 100644
--- a/arch/arm/mach-tegra/sleep.S
+++ b/arch/arm/mach-tegra/sleep.S
@@ -56,7 +56,9 @@
 	isb
 
 	/* Flush the D-cache */
-	bl	v7_flush_dcache_louis
+	cmp	r0, #TEGRA_FLUSH_CACHE_ALL
+	blne	v7_flush_dcache_louis
+	bleq	v7_flush_dcache_all
 
 	/* Trun off coherency */
 	exit_smp r4, r5
@@ -67,15 +69,40 @@
 
 #ifdef CONFIG_PM_SLEEP
 /*
+ * tegra_init_l2_for_a15
+ *
+ * set up the correct L2 cache data RAM latency
+ */
+ENTRY(tegra_init_l2_for_a15)
+	mrc	p15, 0, r0, c0, c0, 5
+	ubfx	r0, r0, #8, #4
+	tst	r0, #1				@ only need for cluster 0
+	bne	_exit_init_l2_a15
+
+	mrc	p15, 0x1, r0, c9, c0, 2
+	and	r0, r0, #7
+	cmp	r0, #2
+	bicne	r0, r0, #7
+	orrne	r0, r0, #2
+	mcrne	p15, 0x1, r0, c9, c0, 2
+_exit_init_l2_a15:
+
+	mov	pc, lr
+ENDPROC(tegra_init_l2_for_a15)
+
+/*
  * tegra_sleep_cpu_finish(unsigned long v2p)
  *
  * enters suspend in LP2 by turning off the mmu and jumping to
  * tegra?_tear_down_cpu
  */
 ENTRY(tegra_sleep_cpu_finish)
+	mov	r4, r0
 	/* Flush and disable the L1 data cache */
+	mov	r0, #TEGRA_FLUSH_CACHE_ALL
 	bl	tegra_disable_clean_inv_dcache
 
+	mov	r0, r4
 	mov32	r6, tegra_tear_down_cpu
 	ldr	r1, [r6]
 	add	r1, r1, r0
@@ -107,10 +134,10 @@
 #ifdef CONFIG_CACHE_L2X0
 	/* Disable L2 cache */
 	check_cpu_part_num 0xc09, r9, r10
-	movweq	r4, #:lower16:(TEGRA_ARM_PERIF_BASE + 0x3000)
-	movteq	r4, #:upper16:(TEGRA_ARM_PERIF_BASE + 0x3000)
-	moveq	r5, #0
-	streq	r5, [r4, #L2X0_CTRL]
+	movweq	r2, #:lower16:(TEGRA_ARM_PERIF_BASE + 0x3000)
+	movteq	r2, #:upper16:(TEGRA_ARM_PERIF_BASE + 0x3000)
+	moveq	r3, #0
+	streq	r3, [r2, #L2X0_CTRL]
 #endif
 	mov	pc, r0
 ENDPROC(tegra_shut_off_mmu)
diff --git a/arch/arm/mach-tegra/sleep.h b/arch/arm/mach-tegra/sleep.h
index 98b7da6..a4edbb3 100644
--- a/arch/arm/mach-tegra/sleep.h
+++ b/arch/arm/mach-tegra/sleep.h
@@ -41,7 +41,19 @@
 #define CPU_NOT_RESETTABLE	0
 #endif
 
+/* flag of tegra_disable_clean_inv_dcache to do LoUIS or all */
+#define TEGRA_FLUSH_CACHE_LOUIS	0
+#define TEGRA_FLUSH_CACHE_ALL	1
+
 #ifdef __ASSEMBLY__
+/* waits until the microsecond counter (base) is > rn */
+.macro wait_until, rn, base, tmp
+	add	\rn, \rn, #1
+1001:	ldr	\tmp, [\base]
+	cmp	\tmp, \rn
+	bmi	1001b
+.endm
+
 /* returns the offset of the flow controller halt register for a cpu */
 .macro cpu_to_halt_reg rd, rcpu
 	cmp	\rcpu, #0
@@ -144,7 +156,7 @@
 void tegra_pen_unlock(void);
 void tegra_resume(void);
 int tegra_sleep_cpu_finish(unsigned long);
-void tegra_disable_clean_inv_dcache(void);
+void tegra_disable_clean_inv_dcache(u32 flag);
 
 #ifdef CONFIG_HOTPLUG_CPU
 void tegra20_hotplug_shutdown(void);
diff --git a/arch/arm/mach-tegra/tegra.c b/arch/arm/mach-tegra/tegra.c
index 0d1e412..fe56fca 100644
--- a/arch/arm/mach-tegra/tegra.c
+++ b/arch/arm/mach-tegra/tegra.c
@@ -116,28 +116,6 @@
 				tegra20_auxdata_lookup, parent);
 }
 
-static void __init trimslice_init(void)
-{
-#ifdef CONFIG_TEGRA_PCI
-	int ret;
-
-	ret = tegra_pcie_init(true, true);
-	if (ret)
-		pr_err("tegra_pci_init() failed: %d\n", ret);
-#endif
-}
-
-static void __init harmony_init(void)
-{
-#ifdef CONFIG_TEGRA_PCI
-	int ret;
-
-	ret = harmony_pcie_init();
-	if (ret)
-		pr_err("harmony_pcie_init() failed: %d\n", ret);
-#endif
-}
-
 static void __init paz00_init(void)
 {
 	if (IS_ENABLED(CONFIG_ARCH_TEGRA_2x_SOC))
@@ -148,8 +126,6 @@
 	char *machine;
 	void (*init)(void);
 } board_init_funcs[] = {
-	{ "compulab,trimslice", trimslice_init },
-	{ "nvidia,harmony", harmony_init },
 	{ "compal,paz00", paz00_init },
 };
 
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 5a768ad..098602b 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -9,7 +9,6 @@
 	select PCI if (!IA64_HP_SIM)
 	select ACPI if (!IA64_HP_SIM)
 	select PM if (!IA64_HP_SIM)
-	select ARCH_SUPPORTS_MSI
 	select HAVE_UNSTABLE_SCHED_CLOCK
 	select HAVE_IDE
 	select HAVE_OPROFILE
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e12764c..1e31fcb9 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -727,7 +727,6 @@
 	select SYS_HAS_CPU_CAVIUM_OCTEON
 	select SWAP_IO_SPACE
 	select HW_HAS_PCI
-	select ARCH_SUPPORTS_MSI
 	select ZONE_DMA32
 	select USB_ARCH_HAS_OHCI
 	select USB_ARCH_HAS_EHCI
@@ -763,7 +762,6 @@
 	select CEVT_R4K
 	select CSRC_R4K
 	select IRQ_CPU
-	select ARCH_SUPPORTS_MSI
 	select ZONE_DMA32 if 64BIT
 	select SYNC_R4K
 	select SYS_HAS_EARLY_PRINTK
diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
index fa8e0aa..f194c08 100644
--- a/arch/mips/include/asm/pci.h
+++ b/arch/mips/include/asm/pci.h
@@ -136,11 +136,6 @@
 	return channel ? 15 : 14;
 }
 
-#ifdef CONFIG_CPU_CAVIUM_OCTEON
-/* MSI arch hook for OCTEON */
-#define arch_setup_msi_irqs arch_setup_msi_irqs
-#endif
-
 extern char * (*pcibios_plat_setup)(char *str);
 
 #ifdef CONFIG_OF
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index dbd9d3c..3a67869 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -727,7 +727,6 @@
 	default y if !40x && !CPM2 && !8xx && !PPC_83xx \
 		&& !PPC_85xx && !PPC_86xx && !GAMECUBE_COMMON
 	default PCI_QSPAN if !4xx && !CPM2 && 8xx
-	select ARCH_SUPPORTS_MSI
 	select GENERIC_PCI_IOMAP
 	help
 	  Find out whether your system includes a PCI bus. PCI is the name of
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 6653f27..95145a1 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -113,11 +113,6 @@
 /* Decide whether to display the domain number in /proc */
 extern int pci_proc_domain(struct pci_bus *bus);
 
-/* MSI arch hooks */
-#define arch_setup_msi_irqs arch_setup_msi_irqs
-#define arch_teardown_msi_irqs arch_teardown_msi_irqs
-#define arch_msi_check_device arch_msi_check_device
-
 struct vm_area_struct;
 /* Map a range of PCI memory or I/O space for a device into user space */
 int pci_mmap_page_range(struct pci_dev *pdev, struct vm_area_struct *vma,
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 8a4cae7..21e5c16 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -430,7 +430,6 @@
 	bool "PCI support"
 	default n
 	depends on 64BIT
-	select ARCH_SUPPORTS_MSI
 	select PCI_MSI
 	help
 	  Enable PCI support.
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 6e577ba..262b91b 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -21,10 +21,6 @@
 int pci_domain_nr(struct pci_bus *);
 int pci_proc_domain(struct pci_bus *);
 
-/* MSI arch hooks */
-#define arch_setup_msi_irqs	arch_setup_msi_irqs
-#define arch_teardown_msi_irqs	arch_teardown_msi_irqs
-
 #define ZPCI_BUS_NR			0	/* default bus number */
 #define ZPCI_DEVFN			0	/* default device number */
 
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a00cbd3..1570ad2 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -52,7 +52,6 @@
 
 config SPARC64
 	def_bool 64BIT
-	select ARCH_SUPPORTS_MSI
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_GRAPH_FP_TEST
diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig
index 24565a7..74dff90 100644
--- a/arch/tile/Kconfig
+++ b/arch/tile/Kconfig
@@ -380,7 +380,6 @@
 	select PCI_DOMAINS
 	select GENERIC_PCI_IOMAP
 	select TILE_GXIO_TRIO if TILEGX
-	select ARCH_SUPPORTS_MSI if TILEGX
 	select PCI_MSI if TILEGX
 	---help---
 	  Enable PCI root complex support, so PCIe endpoint devices can
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b32ebf9..5db62ef 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2014,7 +2014,6 @@
 config PCI
 	bool "PCI support"
 	default y
-	select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC)
 	---help---
 	  Find out whether you have a PCI motherboard. PCI is the name of a
 	  bus system, i.e. the way the CPU talks to the other stuff inside
diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index d9e9e6c..7d74432 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -100,29 +100,6 @@
 extern void pci_iommu_alloc(void);
 
 #ifdef CONFIG_PCI_MSI
-/* MSI arch specific hooks */
-static inline int x86_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	return x86_msi.setup_msi_irqs(dev, nvec, type);
-}
-
-static inline void x86_teardown_msi_irqs(struct pci_dev *dev)
-{
-	x86_msi.teardown_msi_irqs(dev);
-}
-
-static inline void x86_teardown_msi_irq(unsigned int irq)
-{
-	x86_msi.teardown_msi_irq(irq);
-}
-static inline void x86_restore_msi_irqs(struct pci_dev *dev, int irq)
-{
-	x86_msi.restore_msi_irqs(dev, irq);
-}
-#define arch_setup_msi_irqs x86_setup_msi_irqs
-#define arch_teardown_msi_irqs x86_teardown_msi_irqs
-#define arch_teardown_msi_irq x86_teardown_msi_irq
-#define arch_restore_msi_irqs x86_restore_msi_irqs
 /* implemented in arch/x86/kernel/apic/io_apic. */
 struct msi_desc;
 int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
@@ -130,16 +107,9 @@
 void native_restore_msi_irqs(struct pci_dev *dev, int irq);
 int setup_msi_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 		  unsigned int irq_base, unsigned int irq_offset);
-/* default to the implementation in drivers/lib/msi.c */
-#define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
-#define HAVE_DEFAULT_MSI_RESTORE_IRQS
-void default_teardown_msi_irqs(struct pci_dev *dev);
-void default_restore_msi_irqs(struct pci_dev *dev, int irq);
 #else
 #define native_setup_msi_irqs		NULL
 #define native_teardown_msi_irq		NULL
-#define default_teardown_msi_irqs	NULL
-#define default_restore_msi_irqs	NULL
 #endif
 
 #define PCI_DMA_BUS_IS_PHYS (dma_ops->is_phys)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 5f24c71..8ce0072 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -107,6 +107,8 @@
 };
 
 EXPORT_SYMBOL_GPL(x86_platform);
+
+#if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi = {
 	.setup_msi_irqs		= native_setup_msi_irqs,
 	.compose_msi_msg	= native_compose_msi_msg,
@@ -116,6 +118,28 @@
 	.setup_hpet_msi		= default_setup_hpet_msi,
 };
 
+/* MSI arch specific hooks */
+int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+{
+	return x86_msi.setup_msi_irqs(dev, nvec, type);
+}
+
+void arch_teardown_msi_irqs(struct pci_dev *dev)
+{
+	x86_msi.teardown_msi_irqs(dev);
+}
+
+void arch_teardown_msi_irq(unsigned int irq)
+{
+	x86_msi.teardown_msi_irq(irq);
+}
+
+void arch_restore_msi_irqs(struct pci_dev *dev, int irq)
+{
+	x86_msi.restore_msi_irqs(dev, irq);
+}
+#endif
+
 struct x86_io_apic_ops x86_io_apic_ops = {
 	.init			= native_io_apic_init_mappings,
 	.read			= native_io_apic_read,
diff --git a/drivers/clk/tegra/clk-tegra114.c b/drivers/clk/tegra/clk-tegra114.c
index b6015cb..806d803 100644
--- a/drivers/clk/tegra/clk-tegra114.c
+++ b/drivers/clk/tegra/clk-tegra114.c
@@ -290,6 +290,14 @@
 /* Tegra CPU clock and reset control regs */
 #define CLK_RST_CONTROLLER_CPU_CMPLX_STATUS	0x470
 
+#ifdef CONFIG_PM_SLEEP
+static struct cpu_clk_suspend_context {
+	u32 clk_csite_src;
+	u32 cclkg_burst;
+	u32 cclkg_divider;
+} tegra114_cpu_clk_sctx;
+#endif
+
 static int periph_clk_enb_refcnt[CLK_OUT_ENB_NUM * 32];
 
 static void __iomem *clk_base;
@@ -2142,9 +2150,39 @@
 	/* flow controller would take care in the power sequence. */
 }
 
+#ifdef CONFIG_PM_SLEEP
+static void tegra114_cpu_clock_suspend(void)
+{
+	/* switch coresite to clk_m, save off original source */
+	tegra114_cpu_clk_sctx.clk_csite_src =
+				readl(clk_base + CLK_SOURCE_CSITE);
+	writel(3 << 30, clk_base + CLK_SOURCE_CSITE);
+
+	tegra114_cpu_clk_sctx.cclkg_burst =
+				readl(clk_base + CCLKG_BURST_POLICY);
+	tegra114_cpu_clk_sctx.cclkg_divider =
+				readl(clk_base + CCLKG_BURST_POLICY + 4);
+}
+
+static void tegra114_cpu_clock_resume(void)
+{
+	writel(tegra114_cpu_clk_sctx.clk_csite_src,
+					clk_base + CLK_SOURCE_CSITE);
+
+	writel(tegra114_cpu_clk_sctx.cclkg_burst,
+					clk_base + CCLKG_BURST_POLICY);
+	writel(tegra114_cpu_clk_sctx.cclkg_divider,
+					clk_base + CCLKG_BURST_POLICY + 4);
+}
+#endif
+
 static struct tegra_cpu_car_ops tegra114_cpu_car_ops = {
 	.wait_for_reset	= tegra114_wait_cpu_in_reset,
 	.disable_clock	= tegra114_disable_cpu_clock,
+#ifdef CONFIG_PM_SLEEP
+	.suspend	= tegra114_cpu_clock_suspend,
+	.resume		= tegra114_cpu_clock_resume,
+#endif
 };
 
 static const struct of_device_id pmc_match[] __initconst = {
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 42c687a..e5ca008 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -89,3 +89,48 @@
 	return 0;
 }
 EXPORT_SYMBOL_GPL(of_pci_parse_bus_range);
+
+#ifdef CONFIG_PCI_MSI
+
+static LIST_HEAD(of_pci_msi_chip_list);
+static DEFINE_MUTEX(of_pci_msi_chip_mutex);
+
+int of_pci_msi_chip_add(struct msi_chip *chip)
+{
+	if (!of_property_read_bool(chip->of_node, "msi-controller"))
+		return -EINVAL;
+
+	mutex_lock(&of_pci_msi_chip_mutex);
+	list_add(&chip->list, &of_pci_msi_chip_list);
+	mutex_unlock(&of_pci_msi_chip_mutex);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(of_pci_msi_chip_add);
+
+void of_pci_msi_chip_remove(struct msi_chip *chip)
+{
+	mutex_lock(&of_pci_msi_chip_mutex);
+	list_del(&chip->list);
+	mutex_unlock(&of_pci_msi_chip_mutex);
+}
+EXPORT_SYMBOL_GPL(of_pci_msi_chip_remove);
+
+struct msi_chip *of_pci_find_msi_chip_by_node(struct device_node *of_node)
+{
+	struct msi_chip *c;
+
+	mutex_lock(&of_pci_msi_chip_mutex);
+	list_for_each_entry(c, &of_pci_msi_chip_list, list) {
+		if (c->of_node == of_node) {
+			mutex_unlock(&of_pci_msi_chip_mutex);
+			return c;
+		}
+	}
+	mutex_unlock(&of_pci_msi_chip_mutex);
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(of_pci_find_msi_chip_by_node);
+
+#endif /* CONFIG_PCI_MSI */
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 81944fb..b6a99f7 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -1,13 +1,9 @@
 #
 # PCI configuration
 #
-config ARCH_SUPPORTS_MSI
-	bool
-
 config PCI_MSI
 	bool "Message Signaled Interrupts (MSI and MSI-X)"
 	depends on PCI
-	depends on ARCH_SUPPORTS_MSI
 	help
 	   This allows device drivers to enable MSI (Message Signaled
 	   Interrupts).  Message Signaled Interrupts enable a device to
diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 1184ff6..5f33746 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -14,4 +14,8 @@
 	select PCIEPORTBUS
 	select PCIE_DW
 
+config PCI_TEGRA
+	bool "NVIDIA Tegra PCIe controller"
+	depends on ARCH_TEGRA
+
 endmenu
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 086d850..a733fb0 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_PCI_MVEBU) += pci-mvebu.o
 obj-$(CONFIG_PCIE_DW) += pcie-designware.o
+obj-$(CONFIG_PCI_TEGRA) += pci-tegra.o
diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
new file mode 100644
index 0000000..7356741
--- /dev/null
+++ b/drivers/pci/host/pci-tegra.c
@@ -0,0 +1,1702 @@
+/*
+ * PCIe host controller driver for Tegra SoCs
+ *
+ * Copyright (c) 2010, CompuLab, Ltd.
+ * Author: Mike Rapoport <mike@compulab.co.il>
+ *
+ * Based on NVIDIA PCIe driver
+ * Copyright (c) 2008-2009, NVIDIA Corporation.
+ *
+ * Bits taken from arch/arm/mach-dove/pcie.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+
+#include <linux/clk.h>
+#include <linux/clk/tegra.h>
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/msi.h>
+#include <linux/of_address.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/tegra-cpuidle.h>
+#include <linux/tegra-powergate.h>
+#include <linux/vmalloc.h>
+#include <linux/regulator/consumer.h>
+
+#include <asm/mach/irq.h>
+#include <asm/mach/map.h>
+#include <asm/mach/pci.h>
+
+#define INT_PCI_MSI_NR (8 * 32)
+
+/* register definitions */
+
+#define AFI_AXI_BAR0_SZ	0x00
+#define AFI_AXI_BAR1_SZ	0x04
+#define AFI_AXI_BAR2_SZ	0x08
+#define AFI_AXI_BAR3_SZ	0x0c
+#define AFI_AXI_BAR4_SZ	0x10
+#define AFI_AXI_BAR5_SZ	0x14
+
+#define AFI_AXI_BAR0_START	0x18
+#define AFI_AXI_BAR1_START	0x1c
+#define AFI_AXI_BAR2_START	0x20
+#define AFI_AXI_BAR3_START	0x24
+#define AFI_AXI_BAR4_START	0x28
+#define AFI_AXI_BAR5_START	0x2c
+
+#define AFI_FPCI_BAR0	0x30
+#define AFI_FPCI_BAR1	0x34
+#define AFI_FPCI_BAR2	0x38
+#define AFI_FPCI_BAR3	0x3c
+#define AFI_FPCI_BAR4	0x40
+#define AFI_FPCI_BAR5	0x44
+
+#define AFI_CACHE_BAR0_SZ	0x48
+#define AFI_CACHE_BAR0_ST	0x4c
+#define AFI_CACHE_BAR1_SZ	0x50
+#define AFI_CACHE_BAR1_ST	0x54
+
+#define AFI_MSI_BAR_SZ		0x60
+#define AFI_MSI_FPCI_BAR_ST	0x64
+#define AFI_MSI_AXI_BAR_ST	0x68
+
+#define AFI_MSI_VEC0		0x6c
+#define AFI_MSI_VEC1		0x70
+#define AFI_MSI_VEC2		0x74
+#define AFI_MSI_VEC3		0x78
+#define AFI_MSI_VEC4		0x7c
+#define AFI_MSI_VEC5		0x80
+#define AFI_MSI_VEC6		0x84
+#define AFI_MSI_VEC7		0x88
+
+#define AFI_MSI_EN_VEC0		0x8c
+#define AFI_MSI_EN_VEC1		0x90
+#define AFI_MSI_EN_VEC2		0x94
+#define AFI_MSI_EN_VEC3		0x98
+#define AFI_MSI_EN_VEC4		0x9c
+#define AFI_MSI_EN_VEC5		0xa0
+#define AFI_MSI_EN_VEC6		0xa4
+#define AFI_MSI_EN_VEC7		0xa8
+
+#define AFI_CONFIGURATION		0xac
+#define  AFI_CONFIGURATION_EN_FPCI	(1 << 0)
+
+#define AFI_FPCI_ERROR_MASKS	0xb0
+
+#define AFI_INTR_MASK		0xb4
+#define  AFI_INTR_MASK_INT_MASK	(1 << 0)
+#define  AFI_INTR_MASK_MSI_MASK	(1 << 8)
+
+#define AFI_INTR_CODE			0xb8
+#define  AFI_INTR_CODE_MASK		0xf
+#define  AFI_INTR_AXI_SLAVE_ERROR	1
+#define  AFI_INTR_AXI_DECODE_ERROR	2
+#define  AFI_INTR_TARGET_ABORT		3
+#define  AFI_INTR_MASTER_ABORT		4
+#define  AFI_INTR_INVALID_WRITE		5
+#define  AFI_INTR_LEGACY		6
+#define  AFI_INTR_FPCI_DECODE_ERROR	7
+
+#define AFI_INTR_SIGNATURE	0xbc
+#define AFI_UPPER_FPCI_ADDRESS	0xc0
+#define AFI_SM_INTR_ENABLE	0xc4
+#define  AFI_SM_INTR_INTA_ASSERT	(1 << 0)
+#define  AFI_SM_INTR_INTB_ASSERT	(1 << 1)
+#define  AFI_SM_INTR_INTC_ASSERT	(1 << 2)
+#define  AFI_SM_INTR_INTD_ASSERT	(1 << 3)
+#define  AFI_SM_INTR_INTA_DEASSERT	(1 << 4)
+#define  AFI_SM_INTR_INTB_DEASSERT	(1 << 5)
+#define  AFI_SM_INTR_INTC_DEASSERT	(1 << 6)
+#define  AFI_SM_INTR_INTD_DEASSERT	(1 << 7)
+
+#define AFI_AFI_INTR_ENABLE		0xc8
+#define  AFI_INTR_EN_INI_SLVERR		(1 << 0)
+#define  AFI_INTR_EN_INI_DECERR		(1 << 1)
+#define  AFI_INTR_EN_TGT_SLVERR		(1 << 2)
+#define  AFI_INTR_EN_TGT_DECERR		(1 << 3)
+#define  AFI_INTR_EN_TGT_WRERR		(1 << 4)
+#define  AFI_INTR_EN_DFPCI_DECERR	(1 << 5)
+#define  AFI_INTR_EN_AXI_DECERR		(1 << 6)
+#define  AFI_INTR_EN_FPCI_TIMEOUT	(1 << 7)
+#define  AFI_INTR_EN_PRSNT_SENSE	(1 << 8)
+
+#define AFI_PCIE_CONFIG					0x0f8
+#define  AFI_PCIE_CONFIG_PCIE_DISABLE(x)		(1 << ((x) + 1))
+#define  AFI_PCIE_CONFIG_PCIE_DISABLE_ALL		0xe
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK	(0xf << 20)
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_SINGLE	(0x0 << 20)
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_420	(0x0 << 20)
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL	(0x1 << 20)
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_222	(0x1 << 20)
+#define  AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_411	(0x2 << 20)
+
+#define AFI_FUSE			0x104
+#define  AFI_FUSE_PCIE_T0_GEN2_DIS	(1 << 2)
+
+#define AFI_PEX0_CTRL			0x110
+#define AFI_PEX1_CTRL			0x118
+#define AFI_PEX2_CTRL			0x128
+#define  AFI_PEX_CTRL_RST		(1 << 0)
+#define  AFI_PEX_CTRL_CLKREQ_EN		(1 << 1)
+#define  AFI_PEX_CTRL_REFCLK_EN		(1 << 3)
+
+#define AFI_PEXBIAS_CTRL_0		0x168
+
+#define RP_VEND_XP	0x00000F00
+#define  RP_VEND_XP_DL_UP	(1 << 30)
+
+#define RP_LINK_CONTROL_STATUS			0x00000090
+#define  RP_LINK_CONTROL_STATUS_DL_LINK_ACTIVE	0x20000000
+#define  RP_LINK_CONTROL_STATUS_LINKSTAT_MASK	0x3fff0000
+
+#define PADS_CTL_SEL		0x0000009C
+
+#define PADS_CTL		0x000000A0
+#define  PADS_CTL_IDDQ_1L	(1 << 0)
+#define  PADS_CTL_TX_DATA_EN_1L	(1 << 6)
+#define  PADS_CTL_RX_DATA_EN_1L	(1 << 10)
+
+#define PADS_PLL_CTL_TEGRA20			0x000000B8
+#define PADS_PLL_CTL_TEGRA30			0x000000B4
+#define  PADS_PLL_CTL_RST_B4SM			(1 << 1)
+#define  PADS_PLL_CTL_LOCKDET			(1 << 8)
+#define  PADS_PLL_CTL_REFCLK_MASK		(0x3 << 16)
+#define  PADS_PLL_CTL_REFCLK_INTERNAL_CML	(0 << 16)
+#define  PADS_PLL_CTL_REFCLK_INTERNAL_CMOS	(1 << 16)
+#define  PADS_PLL_CTL_REFCLK_EXTERNAL		(2 << 16)
+#define  PADS_PLL_CTL_TXCLKREF_MASK		(0x1 << 20)
+#define  PADS_PLL_CTL_TXCLKREF_DIV10		(0 << 20)
+#define  PADS_PLL_CTL_TXCLKREF_DIV5		(1 << 20)
+#define  PADS_PLL_CTL_TXCLKREF_BUF_EN		(1 << 22)
+
+#define PADS_REFCLK_CFG0			0x000000C8
+#define PADS_REFCLK_CFG1			0x000000CC
+
+/*
+ * Fields in PADS_REFCLK_CFG*. Those registers form an array of 16-bit
+ * entries, one entry per PCIe port. These field definitions and desired
+ * values aren't in the TRM, but do come from NVIDIA.
+ */
+#define PADS_REFCLK_CFG_TERM_SHIFT		2  /* 6:2 */
+#define PADS_REFCLK_CFG_E_TERM_SHIFT		7
+#define PADS_REFCLK_CFG_PREDI_SHIFT		8  /* 11:8 */
+#define PADS_REFCLK_CFG_DRVI_SHIFT		12 /* 15:12 */
+
+/* Default value provided by HW engineering is 0xfa5c */
+#define PADS_REFCLK_CFG_VALUE \
+	( \
+		(0x17 << PADS_REFCLK_CFG_TERM_SHIFT)   | \
+		(0    << PADS_REFCLK_CFG_E_TERM_SHIFT) | \
+		(0xa  << PADS_REFCLK_CFG_PREDI_SHIFT)  | \
+		(0xf  << PADS_REFCLK_CFG_DRVI_SHIFT)     \
+	)
+
+struct tegra_msi {
+	struct msi_chip chip;
+	DECLARE_BITMAP(used, INT_PCI_MSI_NR);
+	struct irq_domain *domain;
+	unsigned long pages;
+	struct mutex lock;
+	int irq;
+};
+
+/* used to differentiate between Tegra SoC generations */
+struct tegra_pcie_soc_data {
+	unsigned int num_ports;
+	unsigned int msi_base_shift;
+	u32 pads_pll_ctl;
+	u32 tx_ref_sel;
+	bool has_pex_clkreq_en;
+	bool has_pex_bias_ctrl;
+	bool has_intr_prsnt_sense;
+	bool has_avdd_supply;
+	bool has_cml_clk;
+};
+
+static inline struct tegra_msi *to_tegra_msi(struct msi_chip *chip)
+{
+	return container_of(chip, struct tegra_msi, chip);
+}
+
+struct tegra_pcie {
+	struct device *dev;
+
+	void __iomem *pads;
+	void __iomem *afi;
+	int irq;
+
+	struct list_head busses;
+	struct resource *cs;
+
+	struct resource io;
+	struct resource mem;
+	struct resource prefetch;
+	struct resource busn;
+
+	struct clk *pex_clk;
+	struct clk *afi_clk;
+	struct clk *pcie_xclk;
+	struct clk *pll_e;
+	struct clk *cml_clk;
+
+	struct tegra_msi msi;
+
+	struct list_head ports;
+	unsigned int num_ports;
+	u32 xbar_config;
+
+	struct regulator *pex_clk_supply;
+	struct regulator *vdd_supply;
+	struct regulator *avdd_supply;
+
+	const struct tegra_pcie_soc_data *soc_data;
+};
+
+struct tegra_pcie_port {
+	struct tegra_pcie *pcie;
+	struct list_head list;
+	struct resource regs;
+	void __iomem *base;
+	unsigned int index;
+	unsigned int lanes;
+};
+
+struct tegra_pcie_bus {
+	struct vm_struct *area;
+	struct list_head list;
+	unsigned int nr;
+};
+
+static inline struct tegra_pcie *sys_to_pcie(struct pci_sys_data *sys)
+{
+	return sys->private_data;
+}
+
+static inline void afi_writel(struct tegra_pcie *pcie, u32 value,
+			      unsigned long offset)
+{
+	writel(value, pcie->afi + offset);
+}
+
+static inline u32 afi_readl(struct tegra_pcie *pcie, unsigned long offset)
+{
+	return readl(pcie->afi + offset);
+}
+
+static inline void pads_writel(struct tegra_pcie *pcie, u32 value,
+			       unsigned long offset)
+{
+	writel(value, pcie->pads + offset);
+}
+
+static inline u32 pads_readl(struct tegra_pcie *pcie, unsigned long offset)
+{
+	return readl(pcie->pads + offset);
+}
+
+/*
+ * The configuration space mapping on Tegra is somewhat similar to the ECAM
+ * defined by PCIe. However it deviates a bit in how the 4 bits for extended
+ * register accesses are mapped:
+ *
+ *    [27:24] extended register number
+ *    [23:16] bus number
+ *    [15:11] device number
+ *    [10: 8] function number
+ *    [ 7: 0] register number
+ *
+ * Mapping the whole extended configuration space would require 256 MiB of
+ * virtual address space, only a small part of which will actually be used.
+ * To work around this, a 1 MiB of virtual addresses are allocated per bus
+ * when the bus is first accessed. When the physical range is mapped, the
+ * the bus number bits are hidden so that the extended register number bits
+ * appear as bits [19:16]. Therefore the virtual mapping looks like this:
+ *
+ *    [19:16] extended register number
+ *    [15:11] device number
+ *    [10: 8] function number
+ *    [ 7: 0] register number
+ *
+ * This is achieved by stitching together 16 chunks of 64 KiB of physical
+ * address space via the MMU.
+ */
+static unsigned long tegra_pcie_conf_offset(unsigned int devfn, int where)
+{
+	return ((where & 0xf00) << 8) | (PCI_SLOT(devfn) << 11) |
+	       (PCI_FUNC(devfn) << 8) | (where & 0xfc);
+}
+
+static struct tegra_pcie_bus *tegra_pcie_bus_alloc(struct tegra_pcie *pcie,
+						   unsigned int busnr)
+{
+	pgprot_t prot = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY | L_PTE_XN |
+			L_PTE_MT_DEV_SHARED | L_PTE_SHARED;
+	phys_addr_t cs = pcie->cs->start;
+	struct tegra_pcie_bus *bus;
+	unsigned int i;
+	int err;
+
+	bus = kzalloc(sizeof(*bus), GFP_KERNEL);
+	if (!bus)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&bus->list);
+	bus->nr = busnr;
+
+	/* allocate 1 MiB of virtual addresses */
+	bus->area = get_vm_area(SZ_1M, VM_IOREMAP);
+	if (!bus->area) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	/* map each of the 16 chunks of 64 KiB each */
+	for (i = 0; i < 16; i++) {
+		unsigned long virt = (unsigned long)bus->area->addr +
+				     i * SZ_64K;
+		phys_addr_t phys = cs + i * SZ_1M + busnr * SZ_64K;
+
+		err = ioremap_page_range(virt, virt + SZ_64K, phys, prot);
+		if (err < 0) {
+			dev_err(pcie->dev, "ioremap_page_range() failed: %d\n",
+				err);
+			goto unmap;
+		}
+	}
+
+	return bus;
+
+unmap:
+	vunmap(bus->area->addr);
+free:
+	kfree(bus);
+	return ERR_PTR(err);
+}
+
+/*
+ * Look up a virtual address mapping for the specified bus number. If no such
+ * mapping existis, try to create one.
+ */
+static void __iomem *tegra_pcie_bus_map(struct tegra_pcie *pcie,
+					unsigned int busnr)
+{
+	struct tegra_pcie_bus *bus;
+
+	list_for_each_entry(bus, &pcie->busses, list)
+		if (bus->nr == busnr)
+			return bus->area->addr;
+
+	bus = tegra_pcie_bus_alloc(pcie, busnr);
+	if (IS_ERR(bus))
+		return NULL;
+
+	list_add_tail(&bus->list, &pcie->busses);
+
+	return bus->area->addr;
+}
+
+static void __iomem *tegra_pcie_conf_address(struct pci_bus *bus,
+					     unsigned int devfn,
+					     int where)
+{
+	struct tegra_pcie *pcie = sys_to_pcie(bus->sysdata);
+	void __iomem *addr = NULL;
+
+	if (bus->number == 0) {
+		unsigned int slot = PCI_SLOT(devfn);
+		struct tegra_pcie_port *port;
+
+		list_for_each_entry(port, &pcie->ports, list) {
+			if (port->index + 1 == slot) {
+				addr = port->base + (where & ~3);
+				break;
+			}
+		}
+	} else {
+		addr = tegra_pcie_bus_map(pcie, bus->number);
+		if (!addr) {
+			dev_err(pcie->dev,
+				"failed to map cfg. space for bus %u\n",
+				bus->number);
+			return NULL;
+		}
+
+		addr += tegra_pcie_conf_offset(devfn, where);
+	}
+
+	return addr;
+}
+
+static int tegra_pcie_read_conf(struct pci_bus *bus, unsigned int devfn,
+				int where, int size, u32 *value)
+{
+	void __iomem *addr;
+
+	addr = tegra_pcie_conf_address(bus, devfn, where);
+	if (!addr) {
+		*value = 0xffffffff;
+		return PCIBIOS_DEVICE_NOT_FOUND;
+	}
+
+	*value = readl(addr);
+
+	if (size == 1)
+		*value = (*value >> (8 * (where & 3))) & 0xff;
+	else if (size == 2)
+		*value = (*value >> (8 * (where & 3))) & 0xffff;
+
+	return PCIBIOS_SUCCESSFUL;
+}
+
+static int tegra_pcie_write_conf(struct pci_bus *bus, unsigned int devfn,
+				 int where, int size, u32 value)
+{
+	void __iomem *addr;
+	u32 mask, tmp;
+
+	addr = tegra_pcie_conf_address(bus, devfn, where);
+	if (!addr)
+		return PCIBIOS_DEVICE_NOT_FOUND;
+
+	if (size == 4) {
+		writel(value, addr);
+		return PCIBIOS_SUCCESSFUL;
+	}
+
+	if (size == 2)
+		mask = ~(0xffff << ((where & 0x3) * 8));
+	else if (size == 1)
+		mask = ~(0xff << ((where & 0x3) * 8));
+	else
+		return PCIBIOS_BAD_REGISTER_NUMBER;
+
+	tmp = readl(addr) & mask;
+	tmp |= value << ((where & 0x3) * 8);
+	writel(tmp, addr);
+
+	return PCIBIOS_SUCCESSFUL;
+}
+
+static struct pci_ops tegra_pcie_ops = {
+	.read = tegra_pcie_read_conf,
+	.write = tegra_pcie_write_conf,
+};
+
+static unsigned long tegra_pcie_port_get_pex_ctrl(struct tegra_pcie_port *port)
+{
+	unsigned long ret = 0;
+
+	switch (port->index) {
+	case 0:
+		ret = AFI_PEX0_CTRL;
+		break;
+
+	case 1:
+		ret = AFI_PEX1_CTRL;
+		break;
+
+	case 2:
+		ret = AFI_PEX2_CTRL;
+		break;
+	}
+
+	return ret;
+}
+
+static void tegra_pcie_port_reset(struct tegra_pcie_port *port)
+{
+	unsigned long ctrl = tegra_pcie_port_get_pex_ctrl(port);
+	unsigned long value;
+
+	/* pulse reset signal */
+	value = afi_readl(port->pcie, ctrl);
+	value &= ~AFI_PEX_CTRL_RST;
+	afi_writel(port->pcie, value, ctrl);
+
+	usleep_range(1000, 2000);
+
+	value = afi_readl(port->pcie, ctrl);
+	value |= AFI_PEX_CTRL_RST;
+	afi_writel(port->pcie, value, ctrl);
+}
+
+static void tegra_pcie_port_enable(struct tegra_pcie_port *port)
+{
+	const struct tegra_pcie_soc_data *soc = port->pcie->soc_data;
+	unsigned long ctrl = tegra_pcie_port_get_pex_ctrl(port);
+	unsigned long value;
+
+	/* enable reference clock */
+	value = afi_readl(port->pcie, ctrl);
+	value |= AFI_PEX_CTRL_REFCLK_EN;
+
+	if (soc->has_pex_clkreq_en)
+		value |= AFI_PEX_CTRL_CLKREQ_EN;
+
+	afi_writel(port->pcie, value, ctrl);
+
+	tegra_pcie_port_reset(port);
+}
+
+static void tegra_pcie_port_disable(struct tegra_pcie_port *port)
+{
+	unsigned long ctrl = tegra_pcie_port_get_pex_ctrl(port);
+	unsigned long value;
+
+	/* assert port reset */
+	value = afi_readl(port->pcie, ctrl);
+	value &= ~AFI_PEX_CTRL_RST;
+	afi_writel(port->pcie, value, ctrl);
+
+	/* disable reference clock */
+	value = afi_readl(port->pcie, ctrl);
+	value &= ~AFI_PEX_CTRL_REFCLK_EN;
+	afi_writel(port->pcie, value, ctrl);
+}
+
+static void tegra_pcie_port_free(struct tegra_pcie_port *port)
+{
+	struct tegra_pcie *pcie = port->pcie;
+
+	devm_iounmap(pcie->dev, port->base);
+	devm_release_mem_region(pcie->dev, port->regs.start,
+				resource_size(&port->regs));
+	list_del(&port->list);
+	devm_kfree(pcie->dev, port);
+}
+
+static void tegra_pcie_fixup_bridge(struct pci_dev *dev)
+{
+	u16 reg;
+
+	if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE) {
+		pci_read_config_word(dev, PCI_COMMAND, &reg);
+		reg |= (PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
+			PCI_COMMAND_MASTER | PCI_COMMAND_SERR);
+		pci_write_config_word(dev, PCI_COMMAND, reg);
+	}
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_fixup_bridge);
+
+/* Tegra PCIE root complex wrongly reports device class */
+static void tegra_pcie_fixup_class(struct pci_dev *dev)
+{
+	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf0, tegra_pcie_fixup_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0bf1, tegra_pcie_fixup_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1c, tegra_pcie_fixup_class);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_NVIDIA, 0x0e1d, tegra_pcie_fixup_class);
+
+/* Tegra PCIE requires relaxed ordering */
+static void tegra_pcie_relax_enable(struct pci_dev *dev)
+{
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, tegra_pcie_relax_enable);
+
+static int tegra_pcie_setup(int nr, struct pci_sys_data *sys)
+{
+	struct tegra_pcie *pcie = sys_to_pcie(sys);
+
+	pci_add_resource_offset(&sys->resources, &pcie->mem, sys->mem_offset);
+	pci_add_resource_offset(&sys->resources, &pcie->prefetch,
+				sys->mem_offset);
+	pci_add_resource(&sys->resources, &pcie->busn);
+
+	pci_ioremap_io(nr * SZ_64K, pcie->io.start);
+
+	return 1;
+}
+
+static int tegra_pcie_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
+{
+	struct tegra_pcie *pcie = sys_to_pcie(pdev->bus->sysdata);
+
+	tegra_cpuidle_pcie_irqs_in_use();
+
+	return pcie->irq;
+}
+
+static void tegra_pcie_add_bus(struct pci_bus *bus)
+{
+	if (IS_ENABLED(CONFIG_PCI_MSI)) {
+		struct tegra_pcie *pcie = sys_to_pcie(bus->sysdata);
+
+		bus->msi = &pcie->msi.chip;
+	}
+}
+
+static struct pci_bus *tegra_pcie_scan_bus(int nr, struct pci_sys_data *sys)
+{
+	struct tegra_pcie *pcie = sys_to_pcie(sys);
+	struct pci_bus *bus;
+
+	bus = pci_create_root_bus(pcie->dev, sys->busnr, &tegra_pcie_ops, sys,
+				  &sys->resources);
+	if (!bus)
+		return NULL;
+
+	pci_scan_child_bus(bus);
+
+	return bus;
+}
+
+static irqreturn_t tegra_pcie_isr(int irq, void *arg)
+{
+	const char *err_msg[] = {
+		"Unknown",
+		"AXI slave error",
+		"AXI decode error",
+		"Target abort",
+		"Master abort",
+		"Invalid write",
+		"Response decoding error",
+		"AXI response decoding error",
+		"Transaction timeout",
+	};
+	struct tegra_pcie *pcie = arg;
+	u32 code, signature;
+
+	code = afi_readl(pcie, AFI_INTR_CODE) & AFI_INTR_CODE_MASK;
+	signature = afi_readl(pcie, AFI_INTR_SIGNATURE);
+	afi_writel(pcie, 0, AFI_INTR_CODE);
+
+	if (code == AFI_INTR_LEGACY)
+		return IRQ_NONE;
+
+	if (code >= ARRAY_SIZE(err_msg))
+		code = 0;
+
+	/*
+	 * do not pollute kernel log with master abort reports since they
+	 * happen a lot during enumeration
+	 */
+	if (code == AFI_INTR_MASTER_ABORT)
+		dev_dbg(pcie->dev, "%s, signature: %08x\n", err_msg[code],
+			signature);
+	else
+		dev_err(pcie->dev, "%s, signature: %08x\n", err_msg[code],
+			signature);
+
+	if (code == AFI_INTR_TARGET_ABORT || code == AFI_INTR_MASTER_ABORT ||
+	    code == AFI_INTR_FPCI_DECODE_ERROR) {
+		u32 fpci = afi_readl(pcie, AFI_UPPER_FPCI_ADDRESS) & 0xff;
+		u64 address = (u64)fpci << 32 | (signature & 0xfffffffc);
+
+		if (code == AFI_INTR_MASTER_ABORT)
+			dev_dbg(pcie->dev, "  FPCI address: %10llx\n", address);
+		else
+			dev_err(pcie->dev, "  FPCI address: %10llx\n", address);
+	}
+
+	return IRQ_HANDLED;
+}
+
+/*
+ * FPCI map is as follows:
+ * - 0xfdfc000000: I/O space
+ * - 0xfdfe000000: type 0 configuration space
+ * - 0xfdff000000: type 1 configuration space
+ * - 0xfe00000000: type 0 extended configuration space
+ * - 0xfe10000000: type 1 extended configuration space
+ */
+static void tegra_pcie_setup_translations(struct tegra_pcie *pcie)
+{
+	u32 fpci_bar, size, axi_address;
+
+	/* Bar 0: type 1 extended configuration space */
+	fpci_bar = 0xfe100000;
+	size = resource_size(pcie->cs);
+	axi_address = pcie->cs->start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR0_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR0_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR0);
+
+	/* Bar 1: downstream IO bar */
+	fpci_bar = 0xfdfc0000;
+	size = resource_size(&pcie->io);
+	axi_address = pcie->io.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR1_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR1_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR1);
+
+	/* Bar 2: prefetchable memory BAR */
+	fpci_bar = (((pcie->prefetch.start >> 12) & 0x0fffffff) << 4) | 0x1;
+	size = resource_size(&pcie->prefetch);
+	axi_address = pcie->prefetch.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR2_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR2_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR2);
+
+	/* Bar 3: non prefetchable memory BAR */
+	fpci_bar = (((pcie->mem.start >> 12) & 0x0fffffff) << 4) | 0x1;
+	size = resource_size(&pcie->mem);
+	axi_address = pcie->mem.start;
+	afi_writel(pcie, axi_address, AFI_AXI_BAR3_START);
+	afi_writel(pcie, size >> 12, AFI_AXI_BAR3_SZ);
+	afi_writel(pcie, fpci_bar, AFI_FPCI_BAR3);
+
+	/* NULL out the remaining BARs as they are not used */
+	afi_writel(pcie, 0, AFI_AXI_BAR4_START);
+	afi_writel(pcie, 0, AFI_AXI_BAR4_SZ);
+	afi_writel(pcie, 0, AFI_FPCI_BAR4);
+
+	afi_writel(pcie, 0, AFI_AXI_BAR5_START);
+	afi_writel(pcie, 0, AFI_AXI_BAR5_SZ);
+	afi_writel(pcie, 0, AFI_FPCI_BAR5);
+
+	/* map all upstream transactions as uncached */
+	afi_writel(pcie, PHYS_OFFSET, AFI_CACHE_BAR0_ST);
+	afi_writel(pcie, 0, AFI_CACHE_BAR0_SZ);
+	afi_writel(pcie, 0, AFI_CACHE_BAR1_ST);
+	afi_writel(pcie, 0, AFI_CACHE_BAR1_SZ);
+
+	/* MSI translations are setup only when needed */
+	afi_writel(pcie, 0, AFI_MSI_FPCI_BAR_ST);
+	afi_writel(pcie, 0, AFI_MSI_BAR_SZ);
+	afi_writel(pcie, 0, AFI_MSI_AXI_BAR_ST);
+	afi_writel(pcie, 0, AFI_MSI_BAR_SZ);
+}
+
+static int tegra_pcie_enable_controller(struct tegra_pcie *pcie)
+{
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+	struct tegra_pcie_port *port;
+	unsigned int timeout;
+	unsigned long value;
+
+	/* power down PCIe slot clock bias pad */
+	if (soc->has_pex_bias_ctrl)
+		afi_writel(pcie, 0, AFI_PEXBIAS_CTRL_0);
+
+	/* configure mode and disable all ports */
+	value = afi_readl(pcie, AFI_PCIE_CONFIG);
+	value &= ~AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_MASK;
+	value |= AFI_PCIE_CONFIG_PCIE_DISABLE_ALL | pcie->xbar_config;
+
+	list_for_each_entry(port, &pcie->ports, list)
+		value &= ~AFI_PCIE_CONFIG_PCIE_DISABLE(port->index);
+
+	afi_writel(pcie, value, AFI_PCIE_CONFIG);
+
+	value = afi_readl(pcie, AFI_FUSE);
+	value &= ~AFI_FUSE_PCIE_T0_GEN2_DIS;
+	afi_writel(pcie, value, AFI_FUSE);
+
+	/* initialze internal PHY, enable up to 16 PCIE lanes */
+	pads_writel(pcie, 0x0, PADS_CTL_SEL);
+
+	/* override IDDQ to 1 on all 4 lanes */
+	value = pads_readl(pcie, PADS_CTL);
+	value |= PADS_CTL_IDDQ_1L;
+	pads_writel(pcie, value, PADS_CTL);
+
+	/*
+	 * Set up PHY PLL inputs select PLLE output as refclock,
+	 * set TX ref sel to div10 (not div5).
+	 */
+	value = pads_readl(pcie, soc->pads_pll_ctl);
+	value &= ~(PADS_PLL_CTL_REFCLK_MASK | PADS_PLL_CTL_TXCLKREF_MASK);
+	value |= PADS_PLL_CTL_REFCLK_INTERNAL_CML | soc->tx_ref_sel;
+	pads_writel(pcie, value, soc->pads_pll_ctl);
+
+	/* take PLL out of reset  */
+	value = pads_readl(pcie, soc->pads_pll_ctl);
+	value |= PADS_PLL_CTL_RST_B4SM;
+	pads_writel(pcie, value, soc->pads_pll_ctl);
+
+	/* Configure the reference clock driver */
+	value = PADS_REFCLK_CFG_VALUE | (PADS_REFCLK_CFG_VALUE << 16);
+	pads_writel(pcie, value, PADS_REFCLK_CFG0);
+	if (soc->num_ports > 2)
+		pads_writel(pcie, PADS_REFCLK_CFG_VALUE, PADS_REFCLK_CFG1);
+
+	/* wait for the PLL to lock */
+	timeout = 300;
+	do {
+		value = pads_readl(pcie, soc->pads_pll_ctl);
+		usleep_range(1000, 2000);
+		if (--timeout == 0) {
+			pr_err("Tegra PCIe error: timeout waiting for PLL\n");
+			return -EBUSY;
+		}
+	} while (!(value & PADS_PLL_CTL_LOCKDET));
+
+	/* turn off IDDQ override */
+	value = pads_readl(pcie, PADS_CTL);
+	value &= ~PADS_CTL_IDDQ_1L;
+	pads_writel(pcie, value, PADS_CTL);
+
+	/* enable TX/RX data */
+	value = pads_readl(pcie, PADS_CTL);
+	value |= PADS_CTL_TX_DATA_EN_1L | PADS_CTL_RX_DATA_EN_1L;
+	pads_writel(pcie, value, PADS_CTL);
+
+	/* take the PCIe interface module out of reset */
+	tegra_periph_reset_deassert(pcie->pcie_xclk);
+
+	/* finally enable PCIe */
+	value = afi_readl(pcie, AFI_CONFIGURATION);
+	value |= AFI_CONFIGURATION_EN_FPCI;
+	afi_writel(pcie, value, AFI_CONFIGURATION);
+
+	value = AFI_INTR_EN_INI_SLVERR | AFI_INTR_EN_INI_DECERR |
+		AFI_INTR_EN_TGT_SLVERR | AFI_INTR_EN_TGT_DECERR |
+		AFI_INTR_EN_TGT_WRERR | AFI_INTR_EN_DFPCI_DECERR;
+
+	if (soc->has_intr_prsnt_sense)
+		value |= AFI_INTR_EN_PRSNT_SENSE;
+
+	afi_writel(pcie, value, AFI_AFI_INTR_ENABLE);
+	afi_writel(pcie, 0xffffffff, AFI_SM_INTR_ENABLE);
+
+	/* don't enable MSI for now, only when needed */
+	afi_writel(pcie, AFI_INTR_MASK_INT_MASK, AFI_INTR_MASK);
+
+	/* disable all exceptions */
+	afi_writel(pcie, 0, AFI_FPCI_ERROR_MASKS);
+
+	return 0;
+}
+
+static void tegra_pcie_power_off(struct tegra_pcie *pcie)
+{
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+	int err;
+
+	/* TODO: disable and unprepare clocks? */
+
+	tegra_periph_reset_assert(pcie->pcie_xclk);
+	tegra_periph_reset_assert(pcie->afi_clk);
+	tegra_periph_reset_assert(pcie->pex_clk);
+
+	tegra_powergate_power_off(TEGRA_POWERGATE_PCIE);
+
+	if (soc->has_avdd_supply) {
+		err = regulator_disable(pcie->avdd_supply);
+		if (err < 0)
+			dev_warn(pcie->dev,
+				 "failed to disable AVDD regulator: %d\n",
+				 err);
+	}
+
+	err = regulator_disable(pcie->pex_clk_supply);
+	if (err < 0)
+		dev_warn(pcie->dev, "failed to disable pex-clk regulator: %d\n",
+			 err);
+
+	err = regulator_disable(pcie->vdd_supply);
+	if (err < 0)
+		dev_warn(pcie->dev, "failed to disable VDD regulator: %d\n",
+			 err);
+}
+
+static int tegra_pcie_power_on(struct tegra_pcie *pcie)
+{
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+	int err;
+
+	tegra_periph_reset_assert(pcie->pcie_xclk);
+	tegra_periph_reset_assert(pcie->afi_clk);
+	tegra_periph_reset_assert(pcie->pex_clk);
+
+	tegra_powergate_power_off(TEGRA_POWERGATE_PCIE);
+
+	/* enable regulators */
+	err = regulator_enable(pcie->vdd_supply);
+	if (err < 0) {
+		dev_err(pcie->dev, "failed to enable VDD regulator: %d\n", err);
+		return err;
+	}
+
+	err = regulator_enable(pcie->pex_clk_supply);
+	if (err < 0) {
+		dev_err(pcie->dev, "failed to enable pex-clk regulator: %d\n",
+			err);
+		return err;
+	}
+
+	if (soc->has_avdd_supply) {
+		err = regulator_enable(pcie->avdd_supply);
+		if (err < 0) {
+			dev_err(pcie->dev,
+				"failed to enable AVDD regulator: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	err = tegra_powergate_sequence_power_up(TEGRA_POWERGATE_PCIE,
+						pcie->pex_clk);
+	if (err) {
+		dev_err(pcie->dev, "powerup sequence failed: %d\n", err);
+		return err;
+	}
+
+	tegra_periph_reset_deassert(pcie->afi_clk);
+
+	err = clk_prepare_enable(pcie->afi_clk);
+	if (err < 0) {
+		dev_err(pcie->dev, "failed to enable AFI clock: %d\n", err);
+		return err;
+	}
+
+	if (soc->has_cml_clk) {
+		err = clk_prepare_enable(pcie->cml_clk);
+		if (err < 0) {
+			dev_err(pcie->dev, "failed to enable CML clock: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	err = clk_prepare_enable(pcie->pll_e);
+	if (err < 0) {
+		dev_err(pcie->dev, "failed to enable PLLE clock: %d\n", err);
+		return err;
+	}
+
+	return 0;
+}
+
+static int tegra_pcie_clocks_get(struct tegra_pcie *pcie)
+{
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+
+	pcie->pex_clk = devm_clk_get(pcie->dev, "pex");
+	if (IS_ERR(pcie->pex_clk))
+		return PTR_ERR(pcie->pex_clk);
+
+	pcie->afi_clk = devm_clk_get(pcie->dev, "afi");
+	if (IS_ERR(pcie->afi_clk))
+		return PTR_ERR(pcie->afi_clk);
+
+	pcie->pcie_xclk = devm_clk_get(pcie->dev, "pcie_xclk");
+	if (IS_ERR(pcie->pcie_xclk))
+		return PTR_ERR(pcie->pcie_xclk);
+
+	pcie->pll_e = devm_clk_get(pcie->dev, "pll_e");
+	if (IS_ERR(pcie->pll_e))
+		return PTR_ERR(pcie->pll_e);
+
+	if (soc->has_cml_clk) {
+		pcie->cml_clk = devm_clk_get(pcie->dev, "cml");
+		if (IS_ERR(pcie->cml_clk))
+			return PTR_ERR(pcie->cml_clk);
+	}
+
+	return 0;
+}
+
+static int tegra_pcie_get_resources(struct tegra_pcie *pcie)
+{
+	struct platform_device *pdev = to_platform_device(pcie->dev);
+	struct resource *pads, *afi, *res;
+	int err;
+
+	err = tegra_pcie_clocks_get(pcie);
+	if (err) {
+		dev_err(&pdev->dev, "failed to get clocks: %d\n", err);
+		return err;
+	}
+
+	err = tegra_pcie_power_on(pcie);
+	if (err) {
+		dev_err(&pdev->dev, "failed to power up: %d\n", err);
+		return err;
+	}
+
+	/* request and remap controller registers */
+	pads = platform_get_resource_byname(pdev, IORESOURCE_MEM, "pads");
+	if (!pads) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	afi = platform_get_resource_byname(pdev, IORESOURCE_MEM, "afi");
+	if (!afi) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	pcie->pads = devm_request_and_ioremap(&pdev->dev, pads);
+	if (!pcie->pads) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	pcie->afi = devm_request_and_ioremap(&pdev->dev, afi);
+	if (!pcie->afi) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	/* request and remap configuration space */
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "cs");
+	if (!res) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	pcie->cs = devm_request_mem_region(pcie->dev, res->start,
+					   resource_size(res), res->name);
+	if (!pcie->cs) {
+		err = -EADDRNOTAVAIL;
+		goto poweroff;
+	}
+
+	/* request interrupt */
+	err = platform_get_irq_byname(pdev, "intr");
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
+		goto poweroff;
+	}
+
+	pcie->irq = err;
+
+	err = request_irq(pcie->irq, tegra_pcie_isr, IRQF_SHARED, "PCIE", pcie);
+	if (err) {
+		dev_err(&pdev->dev, "failed to register IRQ: %d\n", err);
+		goto poweroff;
+	}
+
+	return 0;
+
+poweroff:
+	tegra_pcie_power_off(pcie);
+	return err;
+}
+
+static int tegra_pcie_put_resources(struct tegra_pcie *pcie)
+{
+	if (pcie->irq > 0)
+		free_irq(pcie->irq, pcie);
+
+	tegra_pcie_power_off(pcie);
+	return 0;
+}
+
+static int tegra_msi_alloc(struct tegra_msi *chip)
+{
+	int msi;
+
+	mutex_lock(&chip->lock);
+
+	msi = find_first_zero_bit(chip->used, INT_PCI_MSI_NR);
+	if (msi < INT_PCI_MSI_NR)
+		set_bit(msi, chip->used);
+	else
+		msi = -ENOSPC;
+
+	mutex_unlock(&chip->lock);
+
+	return msi;
+}
+
+static void tegra_msi_free(struct tegra_msi *chip, unsigned long irq)
+{
+	struct device *dev = chip->chip.dev;
+
+	mutex_lock(&chip->lock);
+
+	if (!test_bit(irq, chip->used))
+		dev_err(dev, "trying to free unused MSI#%lu\n", irq);
+	else
+		clear_bit(irq, chip->used);
+
+	mutex_unlock(&chip->lock);
+}
+
+static irqreturn_t tegra_pcie_msi_irq(int irq, void *data)
+{
+	struct tegra_pcie *pcie = data;
+	struct tegra_msi *msi = &pcie->msi;
+	unsigned int i, processed = 0;
+
+	for (i = 0; i < 8; i++) {
+		unsigned long reg = afi_readl(pcie, AFI_MSI_VEC0 + i * 4);
+
+		while (reg) {
+			unsigned int offset = find_first_bit(&reg, 32);
+			unsigned int index = i * 32 + offset;
+			unsigned int irq;
+
+			/* clear the interrupt */
+			afi_writel(pcie, 1 << offset, AFI_MSI_VEC0 + i * 4);
+
+			irq = irq_find_mapping(msi->domain, index);
+			if (irq) {
+				if (test_bit(index, msi->used))
+					generic_handle_irq(irq);
+				else
+					dev_info(pcie->dev, "unhandled MSI\n");
+			} else {
+				/*
+				 * that's weird who triggered this?
+				 * just clear it
+				 */
+				dev_info(pcie->dev, "unexpected MSI\n");
+			}
+
+			/* see if there's any more pending in this vector */
+			reg = afi_readl(pcie, AFI_MSI_VEC0 + i * 4);
+
+			processed++;
+		}
+	}
+
+	return processed > 0 ? IRQ_HANDLED : IRQ_NONE;
+}
+
+static int tegra_msi_setup_irq(struct msi_chip *chip, struct pci_dev *pdev,
+			       struct msi_desc *desc)
+{
+	struct tegra_msi *msi = to_tegra_msi(chip);
+	struct msi_msg msg;
+	unsigned int irq;
+	int hwirq;
+
+	hwirq = tegra_msi_alloc(msi);
+	if (hwirq < 0)
+		return hwirq;
+
+	irq = irq_create_mapping(msi->domain, hwirq);
+	if (!irq)
+		return -EINVAL;
+
+	irq_set_msi_desc(irq, desc);
+
+	msg.address_lo = virt_to_phys((void *)msi->pages);
+	/* 32 bit address only */
+	msg.address_hi = 0;
+	msg.data = hwirq;
+
+	write_msi_msg(irq, &msg);
+
+	return 0;
+}
+
+static void tegra_msi_teardown_irq(struct msi_chip *chip, unsigned int irq)
+{
+	struct tegra_msi *msi = to_tegra_msi(chip);
+	struct irq_data *d = irq_get_irq_data(irq);
+
+	tegra_msi_free(msi, d->hwirq);
+}
+
+static struct irq_chip tegra_msi_irq_chip = {
+	.name = "Tegra PCIe MSI",
+	.irq_enable = unmask_msi_irq,
+	.irq_disable = mask_msi_irq,
+	.irq_mask = mask_msi_irq,
+	.irq_unmask = unmask_msi_irq,
+};
+
+static int tegra_msi_map(struct irq_domain *domain, unsigned int irq,
+			 irq_hw_number_t hwirq)
+{
+	irq_set_chip_and_handler(irq, &tegra_msi_irq_chip, handle_simple_irq);
+	irq_set_chip_data(irq, domain->host_data);
+	set_irq_flags(irq, IRQF_VALID);
+
+	tegra_cpuidle_pcie_irqs_in_use();
+
+	return 0;
+}
+
+static const struct irq_domain_ops msi_domain_ops = {
+	.map = tegra_msi_map,
+};
+
+static int tegra_pcie_enable_msi(struct tegra_pcie *pcie)
+{
+	struct platform_device *pdev = to_platform_device(pcie->dev);
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+	struct tegra_msi *msi = &pcie->msi;
+	unsigned long base;
+	int err;
+	u32 reg;
+
+	mutex_init(&msi->lock);
+
+	msi->chip.dev = pcie->dev;
+	msi->chip.setup_irq = tegra_msi_setup_irq;
+	msi->chip.teardown_irq = tegra_msi_teardown_irq;
+
+	msi->domain = irq_domain_add_linear(pcie->dev->of_node, INT_PCI_MSI_NR,
+					    &msi_domain_ops, &msi->chip);
+	if (!msi->domain) {
+		dev_err(&pdev->dev, "failed to create IRQ domain\n");
+		return -ENOMEM;
+	}
+
+	err = platform_get_irq_byname(pdev, "msi");
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
+		goto err;
+	}
+
+	msi->irq = err;
+
+	err = request_irq(msi->irq, tegra_pcie_msi_irq, 0,
+			  tegra_msi_irq_chip.name, pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request IRQ: %d\n", err);
+		goto err;
+	}
+
+	/* setup AFI/FPCI range */
+	msi->pages = __get_free_pages(GFP_KERNEL, 0);
+	base = virt_to_phys((void *)msi->pages);
+
+	afi_writel(pcie, base >> soc->msi_base_shift, AFI_MSI_FPCI_BAR_ST);
+	afi_writel(pcie, base, AFI_MSI_AXI_BAR_ST);
+	/* this register is in 4K increments */
+	afi_writel(pcie, 1, AFI_MSI_BAR_SZ);
+
+	/* enable all MSI vectors */
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC0);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC1);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC2);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC3);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC4);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC5);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC6);
+	afi_writel(pcie, 0xffffffff, AFI_MSI_EN_VEC7);
+
+	/* and unmask the MSI interrupt */
+	reg = afi_readl(pcie, AFI_INTR_MASK);
+	reg |= AFI_INTR_MASK_MSI_MASK;
+	afi_writel(pcie, reg, AFI_INTR_MASK);
+
+	return 0;
+
+err:
+	irq_domain_remove(msi->domain);
+	return err;
+}
+
+static int tegra_pcie_disable_msi(struct tegra_pcie *pcie)
+{
+	struct tegra_msi *msi = &pcie->msi;
+	unsigned int i, irq;
+	u32 value;
+
+	/* mask the MSI interrupt */
+	value = afi_readl(pcie, AFI_INTR_MASK);
+	value &= ~AFI_INTR_MASK_MSI_MASK;
+	afi_writel(pcie, value, AFI_INTR_MASK);
+
+	/* disable all MSI vectors */
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC0);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC1);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC2);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC3);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC4);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC5);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC6);
+	afi_writel(pcie, 0, AFI_MSI_EN_VEC7);
+
+	free_pages(msi->pages, 0);
+
+	if (msi->irq > 0)
+		free_irq(msi->irq, pcie);
+
+	for (i = 0; i < INT_PCI_MSI_NR; i++) {
+		irq = irq_find_mapping(msi->domain, i);
+		if (irq > 0)
+			irq_dispose_mapping(irq);
+	}
+
+	irq_domain_remove(msi->domain);
+
+	return 0;
+}
+
+static int tegra_pcie_get_xbar_config(struct tegra_pcie *pcie, u32 lanes,
+				      u32 *xbar)
+{
+	struct device_node *np = pcie->dev->of_node;
+
+	if (of_device_is_compatible(np, "nvidia,tegra30-pcie")) {
+		switch (lanes) {
+		case 0x00000204:
+			dev_info(pcie->dev, "4x1, 2x1 configuration\n");
+			*xbar = AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_420;
+			return 0;
+
+		case 0x00020202:
+			dev_info(pcie->dev, "2x3 configuration\n");
+			*xbar = AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_222;
+			return 0;
+
+		case 0x00010104:
+			dev_info(pcie->dev, "4x1, 1x2 configuration\n");
+			*xbar = AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_411;
+			return 0;
+		}
+	} else if (of_device_is_compatible(np, "nvidia,tegra20-pcie")) {
+		switch (lanes) {
+		case 0x00000004:
+			dev_info(pcie->dev, "single-mode configuration\n");
+			*xbar = AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_SINGLE;
+			return 0;
+
+		case 0x00000202:
+			dev_info(pcie->dev, "dual-mode configuration\n");
+			*xbar = AFI_PCIE_CONFIG_SM2TMS0_XBAR_CONFIG_DUAL;
+			return 0;
+		}
+	}
+
+	return -EINVAL;
+}
+
+static int tegra_pcie_parse_dt(struct tegra_pcie *pcie)
+{
+	const struct tegra_pcie_soc_data *soc = pcie->soc_data;
+	struct device_node *np = pcie->dev->of_node, *port;
+	struct of_pci_range_parser parser;
+	struct of_pci_range range;
+	struct resource res;
+	u32 lanes = 0;
+	int err;
+
+	if (of_pci_range_parser_init(&parser, np)) {
+		dev_err(pcie->dev, "missing \"ranges\" property\n");
+		return -EINVAL;
+	}
+
+	pcie->vdd_supply = devm_regulator_get(pcie->dev, "vdd");
+	if (IS_ERR(pcie->vdd_supply))
+		return PTR_ERR(pcie->vdd_supply);
+
+	pcie->pex_clk_supply = devm_regulator_get(pcie->dev, "pex-clk");
+	if (IS_ERR(pcie->pex_clk_supply))
+		return PTR_ERR(pcie->pex_clk_supply);
+
+	if (soc->has_avdd_supply) {
+		pcie->avdd_supply = devm_regulator_get(pcie->dev, "avdd");
+		if (IS_ERR(pcie->avdd_supply))
+			return PTR_ERR(pcie->avdd_supply);
+	}
+
+	for_each_of_pci_range(&parser, &range) {
+		of_pci_range_to_resource(&range, np, &res);
+
+		switch (res.flags & IORESOURCE_TYPE_BITS) {
+		case IORESOURCE_IO:
+			memcpy(&pcie->io, &res, sizeof(res));
+			pcie->io.name = "I/O";
+			break;
+
+		case IORESOURCE_MEM:
+			if (res.flags & IORESOURCE_PREFETCH) {
+				memcpy(&pcie->prefetch, &res, sizeof(res));
+				pcie->prefetch.name = "PREFETCH";
+			} else {
+				memcpy(&pcie->mem, &res, sizeof(res));
+				pcie->mem.name = "MEM";
+			}
+			break;
+		}
+	}
+
+	err = of_pci_parse_bus_range(np, &pcie->busn);
+	if (err < 0) {
+		dev_err(pcie->dev, "failed to parse ranges property: %d\n",
+			err);
+		pcie->busn.name = np->name;
+		pcie->busn.start = 0;
+		pcie->busn.end = 0xff;
+		pcie->busn.flags = IORESOURCE_BUS;
+	}
+
+	/* parse root ports */
+	for_each_child_of_node(np, port) {
+		struct tegra_pcie_port *rp;
+		unsigned int index;
+		u32 value;
+
+		err = of_pci_get_devfn(port);
+		if (err < 0) {
+			dev_err(pcie->dev, "failed to parse address: %d\n",
+				err);
+			return err;
+		}
+
+		index = PCI_SLOT(err);
+
+		if (index < 1 || index > soc->num_ports) {
+			dev_err(pcie->dev, "invalid port number: %d\n", index);
+			return -EINVAL;
+		}
+
+		index--;
+
+		err = of_property_read_u32(port, "nvidia,num-lanes", &value);
+		if (err < 0) {
+			dev_err(pcie->dev, "failed to parse # of lanes: %d\n",
+				err);
+			return err;
+		}
+
+		if (value > 16) {
+			dev_err(pcie->dev, "invalid # of lanes: %u\n", value);
+			return -EINVAL;
+		}
+
+		lanes |= value << (index << 3);
+
+		if (!of_device_is_available(port))
+			continue;
+
+		rp = devm_kzalloc(pcie->dev, sizeof(*rp), GFP_KERNEL);
+		if (!rp)
+			return -ENOMEM;
+
+		err = of_address_to_resource(port, 0, &rp->regs);
+		if (err < 0) {
+			dev_err(pcie->dev, "failed to parse address: %d\n",
+				err);
+			return err;
+		}
+
+		INIT_LIST_HEAD(&rp->list);
+		rp->index = index;
+		rp->lanes = value;
+		rp->pcie = pcie;
+
+		rp->base = devm_request_and_ioremap(pcie->dev, &rp->regs);
+		if (!rp->base)
+			return -EADDRNOTAVAIL;
+
+		list_add_tail(&rp->list, &pcie->ports);
+	}
+
+	err = tegra_pcie_get_xbar_config(pcie, lanes, &pcie->xbar_config);
+	if (err < 0) {
+		dev_err(pcie->dev, "invalid lane configuration\n");
+		return err;
+	}
+
+	return 0;
+}
+
+/*
+ * FIXME: If there are no PCIe cards attached, then calling this function
+ * can result in the increase of the bootup time as there are big timeout
+ * loops.
+ */
+#define TEGRA_PCIE_LINKUP_TIMEOUT	200	/* up to 1.2 seconds */
+static bool tegra_pcie_port_check_link(struct tegra_pcie_port *port)
+{
+	unsigned int retries = 3;
+	unsigned long value;
+
+	do {
+		unsigned int timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
+
+		do {
+			value = readl(port->base + RP_VEND_XP);
+
+			if (value & RP_VEND_XP_DL_UP)
+				break;
+
+			usleep_range(1000, 2000);
+		} while (--timeout);
+
+		if (!timeout) {
+			dev_err(port->pcie->dev, "link %u down, retrying\n",
+				port->index);
+			goto retry;
+		}
+
+		timeout = TEGRA_PCIE_LINKUP_TIMEOUT;
+
+		do {
+			value = readl(port->base + RP_LINK_CONTROL_STATUS);
+
+			if (value & RP_LINK_CONTROL_STATUS_DL_LINK_ACTIVE)
+				return true;
+
+			usleep_range(1000, 2000);
+		} while (--timeout);
+
+retry:
+		tegra_pcie_port_reset(port);
+	} while (--retries);
+
+	return false;
+}
+
+static int tegra_pcie_enable(struct tegra_pcie *pcie)
+{
+	struct tegra_pcie_port *port, *tmp;
+	struct hw_pci hw;
+
+	list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
+		dev_info(pcie->dev, "probing port %u, using %u lanes\n",
+			 port->index, port->lanes);
+
+		tegra_pcie_port_enable(port);
+
+		if (tegra_pcie_port_check_link(port))
+			continue;
+
+		dev_info(pcie->dev, "link %u down, ignoring\n", port->index);
+
+		tegra_pcie_port_disable(port);
+		tegra_pcie_port_free(port);
+	}
+
+	memset(&hw, 0, sizeof(hw));
+
+	hw.nr_controllers = 1;
+	hw.private_data = (void **)&pcie;
+	hw.setup = tegra_pcie_setup;
+	hw.map_irq = tegra_pcie_map_irq;
+	hw.add_bus = tegra_pcie_add_bus;
+	hw.scan = tegra_pcie_scan_bus;
+	hw.ops = &tegra_pcie_ops;
+
+	pci_common_init_dev(pcie->dev, &hw);
+
+	return 0;
+}
+
+static const struct tegra_pcie_soc_data tegra20_pcie_data = {
+	.num_ports = 2,
+	.msi_base_shift = 0,
+	.pads_pll_ctl = PADS_PLL_CTL_TEGRA20,
+	.tx_ref_sel = PADS_PLL_CTL_TXCLKREF_DIV10,
+	.has_pex_clkreq_en = false,
+	.has_pex_bias_ctrl = false,
+	.has_intr_prsnt_sense = false,
+	.has_avdd_supply = false,
+	.has_cml_clk = false,
+};
+
+static const struct tegra_pcie_soc_data tegra30_pcie_data = {
+	.num_ports = 3,
+	.msi_base_shift = 8,
+	.pads_pll_ctl = PADS_PLL_CTL_TEGRA30,
+	.tx_ref_sel = PADS_PLL_CTL_TXCLKREF_BUF_EN,
+	.has_pex_clkreq_en = true,
+	.has_pex_bias_ctrl = true,
+	.has_intr_prsnt_sense = true,
+	.has_avdd_supply = true,
+	.has_cml_clk = true,
+};
+
+static const struct of_device_id tegra_pcie_of_match[] = {
+	{ .compatible = "nvidia,tegra30-pcie", .data = &tegra30_pcie_data },
+	{ .compatible = "nvidia,tegra20-pcie", .data = &tegra20_pcie_data },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, tegra_pcie_of_match);
+
+static int tegra_pcie_probe(struct platform_device *pdev)
+{
+	const struct of_device_id *match;
+	struct tegra_pcie *pcie;
+	int err;
+
+	match = of_match_device(tegra_pcie_of_match, &pdev->dev);
+	if (!match)
+		return -ENODEV;
+
+	pcie = devm_kzalloc(&pdev->dev, sizeof(*pcie), GFP_KERNEL);
+	if (!pcie)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&pcie->busses);
+	INIT_LIST_HEAD(&pcie->ports);
+	pcie->soc_data = match->data;
+	pcie->dev = &pdev->dev;
+
+	err = tegra_pcie_parse_dt(pcie);
+	if (err < 0)
+		return err;
+
+	pcibios_min_mem = 0;
+
+	err = tegra_pcie_get_resources(pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to request resources: %d\n", err);
+		return err;
+	}
+
+	err = tegra_pcie_enable_controller(pcie);
+	if (err)
+		goto put_resources;
+
+	/* setup the AFI address translations */
+	tegra_pcie_setup_translations(pcie);
+
+	if (IS_ENABLED(CONFIG_PCI_MSI)) {
+		err = tegra_pcie_enable_msi(pcie);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to enable MSI support: %d\n",
+				err);
+			goto put_resources;
+		}
+	}
+
+	err = tegra_pcie_enable(pcie);
+	if (err < 0) {
+		dev_err(&pdev->dev, "failed to enable PCIe ports: %d\n", err);
+		goto disable_msi;
+	}
+
+	platform_set_drvdata(pdev, pcie);
+	return 0;
+
+disable_msi:
+	if (IS_ENABLED(CONFIG_PCI_MSI))
+		tegra_pcie_disable_msi(pcie);
+put_resources:
+	tegra_pcie_put_resources(pcie);
+	return err;
+}
+
+static struct platform_driver tegra_pcie_driver = {
+	.driver = {
+		.name = "tegra-pcie",
+		.owner = THIS_MODULE,
+		.of_match_table = tegra_pcie_of_match,
+		.suppress_bind_attrs = true,
+	},
+	.probe = tegra_pcie_probe,
+};
+module_platform_driver(tegra_pcie_driver);
+
+MODULE_AUTHOR("Thierry Reding <treding@nvidia.com>");
+MODULE_DESCRIPTION("NVIDIA Tegra PCIe driver");
+MODULE_LICENSE("GPLv2");
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index aca7578..b35f93c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -30,20 +30,60 @@
 
 /* Arch hooks */
 
-#ifndef arch_msi_check_device
-int arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
+#if defined(CONFIG_GENERIC_HARDIRQS)
+int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+{
+	struct msi_chip *chip = dev->bus->msi;
+	int err;
+
+	if (!chip || !chip->setup_irq)
+		return -EINVAL;
+
+	err = chip->setup_irq(chip, dev, desc);
+	if (err < 0)
+		return err;
+
+	irq_set_chip_data(desc->irq, chip);
+
+	return 0;
+}
+
+void __weak arch_teardown_msi_irq(unsigned int irq)
+{
+	struct msi_chip *chip = irq_get_chip_data(irq);
+
+	if (!chip || !chip->teardown_irq)
+		return;
+
+	chip->teardown_irq(chip, irq);
+}
+
+int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
+{
+	struct msi_chip *chip = dev->bus->msi;
+
+	if (!chip || !chip->check_device)
+		return 0;
+
+	return chip->check_device(chip, dev, nvec, type);
+}
+#else
+int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
+{
+	return -ENOSYS;
+}
+
+void __weak arch_teardown_msi_irq(unsigned int irq)
+{
+}
+
+int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
 {
 	return 0;
 }
-#endif
+#endif /* CONFIG_GENERIC_HARDIRQS */
 
-#ifndef arch_setup_msi_irqs
-# define arch_setup_msi_irqs default_setup_msi_irqs
-# define HAVE_DEFAULT_MSI_SETUP_IRQS
-#endif
-
-#ifdef HAVE_DEFAULT_MSI_SETUP_IRQS
-int default_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
+int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct msi_desc *entry;
 	int ret;
@@ -65,14 +105,11 @@
 
 	return 0;
 }
-#endif
 
-#ifndef arch_teardown_msi_irqs
-# define arch_teardown_msi_irqs default_teardown_msi_irqs
-# define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
-#endif
-
-#ifdef HAVE_DEFAULT_MSI_TEARDOWN_IRQS
+/*
+ * We have a default implementation available as a separate non-weak
+ * function, as it is used by the Xen x86 PCI code
+ */
 void default_teardown_msi_irqs(struct pci_dev *dev)
 {
 	struct msi_desc *entry;
@@ -89,14 +126,12 @@
 			arch_teardown_msi_irq(entry->irq + i);
 	}
 }
-#endif
 
-#ifndef arch_restore_msi_irqs
-# define arch_restore_msi_irqs default_restore_msi_irqs
-# define HAVE_DEFAULT_MSI_RESTORE_IRQS
-#endif
+void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
+{
+	return default_teardown_msi_irqs(dev);
+}
 
-#ifdef HAVE_DEFAULT_MSI_RESTORE_IRQS
 void default_restore_msi_irqs(struct pci_dev *dev, int irq)
 {
 	struct msi_desc *entry;
@@ -114,7 +149,11 @@
 	if (entry)
 		write_msi_msg(irq, &entry->msg);
 }
-#endif
+
+void __weak arch_restore_msi_irqs(struct pci_dev *dev, int irq)
+{
+	return default_restore_msi_irqs(dev, irq);
+}
 
 static void msi_set_enable(struct pci_dev *dev, int enable)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 46ada5c..b8eaa81 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -666,6 +666,7 @@
 
 	child->parent = parent;
 	child->ops = parent->ops;
+	child->msi = parent->msi;
 	child->sysdata = parent->sysdata;
 	child->bus_flags = parent->bus_flags;
 
diff --git a/include/linux/msi.h b/include/linux/msi.h
index ee66f3a..b17ead8 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -51,12 +51,31 @@
 };
 
 /*
- * The arch hook for setup up msi irqs
+ * The arch hooks to setup up msi irqs. Those functions are
+ * implemented as weak symbols so that they /can/ be overriden by
+ * architecture specific code if needed.
  */
 int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc);
 void arch_teardown_msi_irq(unsigned int irq);
 int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
 void arch_teardown_msi_irqs(struct pci_dev *dev);
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type);
+void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
+
+void default_teardown_msi_irqs(struct pci_dev *dev);
+void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+
+struct msi_chip {
+	struct module *owner;
+	struct device *dev;
+	struct device_node *of_node;
+	struct list_head list;
+
+	int (*setup_irq)(struct msi_chip *chip, struct pci_dev *dev,
+			 struct msi_desc *desc);
+	void (*teardown_irq)(struct msi_chip *chip, unsigned int irq);
+	int (*check_device)(struct msi_chip *chip, struct pci_dev *dev,
+			    int nvec, int type);
+};
 
 #endif /* LINUX_MSI_H */
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 7a04826..fd9c408 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -2,6 +2,7 @@
 #define __OF_PCI_H
 
 #include <linux/pci.h>
+#include <linux/msi.h>
 
 struct pci_dev;
 struct of_irq;
@@ -13,4 +14,15 @@
 int of_pci_get_devfn(struct device_node *np);
 int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
 
+#if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
+int of_pci_msi_chip_add(struct msi_chip *chip);
+void of_pci_msi_chip_remove(struct msi_chip *chip);
+struct msi_chip *of_pci_find_msi_chip_by_node(struct device_node *of_node);
+#else
+static inline int of_pci_msi_chip_add(struct msi_chip *chip) { return -EINVAL; }
+static inline void of_pci_msi_chip_remove(struct msi_chip *chip) { }
+static inline struct msi_chip *
+of_pci_find_msi_chip_by_node(struct device_node *of_node) { return NULL; }
+#endif
+
 #endif
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0fd1f15..4044e3c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -433,6 +433,7 @@
 	struct resource busn_res;	/* bus numbers routed to this bus */
 
 	struct pci_ops	*ops;		/* configuration access functions */
+	struct msi_chip	*msi;		/* MSI controller */
 	void		*sysdata;	/* hook for sys-specific extension */
 	struct proc_dir_entry *procdir;	/* directory entry in /proc/bus/pci */
 
diff --git a/include/linux/tegra-cpuidle.h b/include/linux/tegra-cpuidle.h
new file mode 100644
index 0000000..dda3647
--- /dev/null
+++ b/include/linux/tegra-cpuidle.h
@@ -0,0 +1,19 @@
+/*
+ * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __LINUX_TEGRA_CPUIDLE_H__
+#define __LINUX_TEGRA_CPUIDLE_H__
+
+void tegra_cpuidle_pcie_irqs_in_use(void);
+
+#endif