Merge tag 'wireless-drivers-next-for-davem-2016-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
wireless-drivers-next patches for 4.9
Major changes:
iwlwifi
* work for new hardware support continues
* dynamic queue allocation stabilization
* improvements in the MSIx code
* multiqueue support work continues
* new firmware version support (API 26)
* add 8275 series support
* add 9560 series support
* add support for MU-MIMO sniffer
* add support for RRM by scan
* add support for "reverse" rx packet injection faking hw descriptors
* migrate to devm memory allocation handling
* Remove support for older firmwares (API older than -17 and -22)
wl12xx
* support booting the same rootfs with both wl12xx and wl18xx
hostap
* mark the driver as obsolete
ath9k
* disable RNG by default
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/Documentation/devicetree/bindings/mmc/sdhci-st.txt b/Documentation/devicetree/bindings/mmc/sdhci-st.txt
index 88faa91..3cd4c43 100644
--- a/Documentation/devicetree/bindings/mmc/sdhci-st.txt
+++ b/Documentation/devicetree/bindings/mmc/sdhci-st.txt
@@ -10,7 +10,7 @@
subsystem (mmcss) inside the FlashSS (available in STiH407 SoC
family).
-- clock-names: Should be "mmc".
+- clock-names: Should be "mmc" and "icn". (NB: The latter is not compulsory)
See: Documentation/devicetree/bindings/resource-names.txt
- clocks: Phandle to the clock.
See: Documentation/devicetree/bindings/clock/clock-bindings.txt
diff --git a/Documentation/devicetree/bindings/net/ethernet.txt b/Documentation/devicetree/bindings/net/ethernet.txt
index 5d88f37..e1d7681 100644
--- a/Documentation/devicetree/bindings/net/ethernet.txt
+++ b/Documentation/devicetree/bindings/net/ethernet.txt
@@ -11,8 +11,8 @@
the maximum frame size (there's contradiction in ePAPR).
- phy-mode: string, operation mode of the PHY interface; supported values are
"mii", "gmii", "sgmii", "qsgmii", "tbi", "rev-mii", "rmii", "rgmii", "rgmii-id",
- "rgmii-rxid", "rgmii-txid", "rtbi", "smii", "xgmii"; this is now a de-facto
- standard property;
+ "rgmii-rxid", "rgmii-txid", "rtbi", "smii", "xgmii", "trgmii"; this is now a
+ de-facto standard property;
- phy-connection-type: the same as "phy-mode" property but described in ePAPR;
- phy-handle: phandle, specifies a reference to a node representing a PHY
device; this property is described in ePAPR and so preferred;
diff --git a/Documentation/devicetree/bindings/net/mediatek-net.txt b/Documentation/devicetree/bindings/net/mediatek-net.txt
index 32eaaca..f095257 100644
--- a/Documentation/devicetree/bindings/net/mediatek-net.txt
+++ b/Documentation/devicetree/bindings/net/mediatek-net.txt
@@ -24,14 +24,17 @@
Optional properties:
- interrupt-parent: Should be the phandle for the interrupt controller
that services interrupts for this device
-
+- mediatek,hwlro: the capability if the hardware supports LRO functions
* Ethernet MAC node
Required properties:
- compatible: Should be "mediatek,eth-mac"
- reg: The number of the MAC
-- phy-handle: see ethernet.txt file in the same directory.
+- phy-handle: see ethernet.txt file in the same directory and
+ the phy-mode "trgmii" required being provided when reg
+ is equal to 0 and the MAC uses fixed-link to connect
+ with internal switch such as MT7530.
Example:
@@ -51,6 +54,7 @@
reset-names = "eth";
mediatek,ethsys = <ðsys>;
mediatek,pctl = <&syscfg_pctl_a>;
+ mediatek,hwlro;
#address-cells = <1>;
#size-cells = <0>;
diff --git a/Documentation/devicetree/bindings/net/sh_eth.txt b/Documentation/devicetree/bindings/net/sh_eth.txt
index 2f6ec85..0115c85 100644
--- a/Documentation/devicetree/bindings/net/sh_eth.txt
+++ b/Documentation/devicetree/bindings/net/sh_eth.txt
@@ -5,6 +5,8 @@
Required properties:
- compatible: "renesas,gether-r8a7740" if the device is a part of R8A7740 SoC.
+ "renesas,ether-r8a7743" if the device is a part of R8A7743 SoC.
+ "renesas,ether-r8a7745" if the device is a part of R8A7745 SoC.
"renesas,ether-r8a7778" if the device is a part of R8A7778 SoC.
"renesas,ether-r8a7779" if the device is a part of R8A7779 SoC.
"renesas,ether-r8a7790" if the device is a part of R8A7790 SoC.
diff --git a/Documentation/media/uapi/cec/cec-ioc-adap-g-log-addrs.rst b/Documentation/media/uapi/cec/cec-ioc-adap-g-log-addrs.rst
index 04ee900..201d483 100644
--- a/Documentation/media/uapi/cec/cec-ioc-adap-g-log-addrs.rst
+++ b/Documentation/media/uapi/cec/cec-ioc-adap-g-log-addrs.rst
@@ -144,7 +144,7 @@
- ``flags``
- - Flags. No flags are defined yet, so set this to 0.
+ - Flags. See :ref:`cec-log-addrs-flags` for a list of available flags.
- .. row 7
@@ -201,6 +201,25 @@
give the CEC framework more information about the device type, even
though the framework won't use it directly in the CEC message.
+.. _cec-log-addrs-flags:
+
+.. flat-table:: Flags for struct cec_log_addrs
+ :header-rows: 0
+ :stub-columns: 0
+ :widths: 3 1 4
+
+
+ - .. _`CEC-LOG-ADDRS-FL-ALLOW-UNREG-FALLBACK`:
+
+ - ``CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK``
+
+ - 1
+
+ - By default if no logical address of the requested type can be claimed, then
+ it will go back to the unconfigured state. If this flag is set, then it will
+ fallback to the Unregistered logical address. Note that if the Unregistered
+ logical address was explicitly requested, then this flag has no effect.
+
.. _cec-versions:
.. flat-table:: CEC Versions
diff --git a/Documentation/media/uapi/cec/cec-ioc-dqevent.rst b/Documentation/media/uapi/cec/cec-ioc-dqevent.rst
index 7a6d6d0..2e1e7392 100644
--- a/Documentation/media/uapi/cec/cec-ioc-dqevent.rst
+++ b/Documentation/media/uapi/cec/cec-ioc-dqevent.rst
@@ -64,7 +64,8 @@
- ``phys_addr``
- - The current physical address.
+ - The current physical address. This is ``CEC_PHYS_ADDR_INVALID`` if no
+ valid physical address is set.
- .. row 2
@@ -72,7 +73,10 @@
- ``log_addr_mask``
- - The current set of claimed logical addresses.
+ - The current set of claimed logical addresses. This is 0 if no logical
+ addresses are claimed or if ``phys_addr`` is ``CEC_PHYS_ADDR_INVALID``.
+ If bit 15 is set (``1 << CEC_LOG_ADDR_UNREGISTERED``) then this device
+ has the unregistered logical address. In that case all other bits are 0.
diff --git a/Documentation/networking/ipvlan.txt b/Documentation/networking/ipvlan.txt
index 14422f8..24196ce 100644
--- a/Documentation/networking/ipvlan.txt
+++ b/Documentation/networking/ipvlan.txt
@@ -22,7 +22,7 @@
There are no module parameters for this driver and it can be configured
using IProute2/ip utility.
- ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | L3 }
+ ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | l3 | l3s }
e.g. ip link add link ipvl0 eth0 type ipvlan mode l2
@@ -48,6 +48,11 @@
used before packets are queued on the outbound device. In this mode the slaves
will not receive nor can send multicast / broadcast traffic.
+4.3 L3S mode:
+ This is very similar to the L3 mode except that iptables (conn-tracking)
+works in this mode and hence it is L3-symmetric (L3s). This will have slightly less
+performance but that shouldn't matter since you are choosing this mode over plain-L3
+mode to make conn-tracking work.
5. What to choose (macvlan vs. ipvlan)?
These two devices are very similar in many regards and the specific use
diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
index 44235e8..2bbac05 100644
--- a/Documentation/networking/switchdev.txt
+++ b/Documentation/networking/switchdev.txt
@@ -314,30 +314,29 @@
does a longest prefix match (LPM) on FIB entries matching route prefix and
forwards the packet to the matching FIB entry's nexthop(s) egress ports.
-To program the device, the driver implements support for
-SWITCHDEV_OBJ_IPV[4|6]_FIB object using switchdev_port_obj_xxx ops.
-switchdev_port_obj_add is used for both adding a new FIB entry to the device,
-or modifying an existing entry on the device.
+To program the device, the driver has to register a FIB notifier handler
+using register_fib_notifier. The following events are available:
+FIB_EVENT_ENTRY_ADD: used for both adding a new FIB entry to the device,
+ or modifying an existing entry on the device.
+FIB_EVENT_ENTRY_DEL: used for removing a FIB entry
+FIB_EVENT_RULE_ADD, FIB_EVENT_RULE_DEL: used to propagate FIB rule changes
-XXX: Currently, only SWITCHDEV_OBJ_ID_IPV4_FIB objects are supported.
+FIB_EVENT_ENTRY_ADD and FIB_EVENT_ENTRY_DEL events pass:
-SWITCHDEV_OBJ_ID_IPV4_FIB object passes:
-
- struct switchdev_obj_ipv4_fib { /* IPV4_FIB */
+ struct fib_entry_notifier_info {
+ struct fib_notifier_info info; /* must be first */
u32 dst;
int dst_len;
struct fib_info *fi;
u8 tos;
u8 type;
- u32 nlflags;
u32 tb_id;
- } ipv4_fib;
+ u32 nlflags;
+ };
to add/modify/delete IPv4 dst/dest_len prefix on table tb_id. The *fi
structure holds details on the route and route's nexthops. *dev is one of the
-port netdevs mentioned in the routes next hop list. If the output port netdevs
-referenced in the route's nexthop list don't all have the same switch ID, the
-driver is not called to add/modify/delete the FIB entry.
+port netdevs mentioned in the route's next hop list.
Routes offloaded to the device are labeled with "offload" in the ip route
listing:
@@ -355,6 +354,8 @@
12.0.0.4 via 11.0.0.9 dev sw1p2 proto zebra metric 20 offload
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.15
+The "offload" flag is set in case at least one device offloads the FIB entry.
+
XXX: add/mod/del IPv6 FIB API
Nexthop Resolution
diff --git a/MAINTAINERS b/MAINTAINERS
index 463daa5..20de5f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1634,6 +1634,7 @@
ARM/SAMSUNG EXYNOS ARM ARCHITECTURES
M: Kukjin Kim <kgene@kernel.org>
M: Krzysztof Kozlowski <krzk@kernel.org>
+R: Javier Martinez Canillas <javier@osg.samsung.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: linux-samsung-soc@vger.kernel.org (moderated for non-subscribers)
S: Maintained
@@ -2509,7 +2510,7 @@
F: kernel/bpf/
BROADCOM B44 10/100 ETHERNET DRIVER
-M: Gary Zambrano <zambrano@broadcom.com>
+M: Michael Chan <michael.chan@broadcom.com>
L: netdev@vger.kernel.org
S: Supported
F: drivers/net/ethernet/broadcom/b44.*
@@ -6110,7 +6111,7 @@
F: drivers/cpufreq/intel_pstate.c
INTEL FRAMEBUFFER DRIVER (excluding 810 and 815)
-M: Maik Broemme <mbroemme@plusserver.de>
+M: Maik Broemme <mbroemme@libmpq.org>
L: linux-fbdev@vger.kernel.org
S: Maintained
F: Documentation/fb/intelfb.txt
@@ -8168,6 +8169,15 @@
W: https://fedorahosted.org/dropwatch/
F: net/core/drop_monitor.c
+NETWORKING [DSA]
+M: Andrew Lunn <andrew@lunn.ch>
+M: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
+M: Florian Fainelli <f.fainelli@gmail.com>
+S: Maintained
+F: net/dsa/
+F: include/net/dsa.h
+F: drivers/net/dsa/
+
NETWORKING [GENERAL]
M: "David S. Miller" <davem@davemloft.net>
L: netdev@vger.kernel.org
@@ -12584,7 +12594,7 @@
F: net/8021q/
VLYNQ BUS
-M: Florian Fainelli <florian@openwrt.org>
+M: Florian Fainelli <f.fainelli@gmail.com>
L: openwrt-devel@lists.openwrt.org (subscribers-only)
S: Maintained
F: drivers/vlynq/vlynq.c
diff --git a/Makefile b/Makefile
index 1a8c8dd..74e22c2 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
VERSION = 4
PATCHLEVEL = 8
SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc7
NAME = Psychotic Stoned Sheep
# *DOCUMENTATION*
diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index c419b43..466e42e 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -371,14 +371,6 @@
return __cu_len;
}
-extern inline long
-__copy_tofrom_user(void *to, const void *from, long len, const void __user *validate)
-{
- if (__access_ok((unsigned long)validate, len, get_fs()))
- len = __copy_tofrom_user_nocheck(to, from, len);
- return len;
-}
-
#define __copy_to_user(to, from, n) \
({ \
__chk_user_ptr(to); \
@@ -393,17 +385,22 @@
#define __copy_to_user_inatomic __copy_to_user
#define __copy_from_user_inatomic __copy_from_user
-
extern inline long
copy_to_user(void __user *to, const void *from, long n)
{
- return __copy_tofrom_user((__force void *)to, from, n, to);
+ if (likely(__access_ok((unsigned long)to, n, get_fs())))
+ n = __copy_tofrom_user_nocheck((__force void *)to, from, n);
+ return n;
}
extern inline long
copy_from_user(void *to, const void __user *from, long n)
{
- return __copy_tofrom_user(to, (__force void *)from, n, from);
+ if (likely(__access_ok((unsigned long)from, n, get_fs())))
+ n = __copy_tofrom_user_nocheck(to, (__force void *)from, n);
+ else
+ memset(to, 0, n);
+ return n;
}
extern void __do_clear_user(void);
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index a78d567..41faf17 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -83,7 +83,10 @@
"2: ;nop\n" \
" .section .fixup, \"ax\"\n" \
" .align 4\n" \
- "3: mov %0, %3\n" \
+ "3: # return -EFAULT\n" \
+ " mov %0, %3\n" \
+ " # zero out dst ptr\n" \
+ " mov %1, 0\n" \
" j 2b\n" \
" .previous\n" \
" .section __ex_table, \"a\"\n" \
@@ -101,7 +104,11 @@
"2: ;nop\n" \
" .section .fixup, \"ax\"\n" \
" .align 4\n" \
- "3: mov %0, %3\n" \
+ "3: # return -EFAULT\n" \
+ " mov %0, %3\n" \
+ " # zero out dst ptr\n" \
+ " mov %1, 0\n" \
+ " mov %R1, 0\n" \
" j 2b\n" \
" .previous\n" \
" .section __ex_table, \"a\"\n" \
diff --git a/arch/arm/boot/dts/bcm2835-rpi.dtsi b/arch/arm/boot/dts/bcm2835-rpi.dtsi
index caf2707..e9b47b2 100644
--- a/arch/arm/boot/dts/bcm2835-rpi.dtsi
+++ b/arch/arm/boot/dts/bcm2835-rpi.dtsi
@@ -2,6 +2,7 @@
/ {
memory {
+ device_type = "memory";
reg = <0 0x10000000>;
};
diff --git a/arch/arm/boot/dts/bcm283x.dtsi b/arch/arm/boot/dts/bcm283x.dtsi
index b982522..445624a 100644
--- a/arch/arm/boot/dts/bcm283x.dtsi
+++ b/arch/arm/boot/dts/bcm283x.dtsi
@@ -2,7 +2,6 @@
#include <dt-bindings/clock/bcm2835.h>
#include <dt-bindings/clock/bcm2835-aux.h>
#include <dt-bindings/gpio/gpio.h>
-#include "skeleton.dtsi"
/* This include file covers the common peripherals and configuration between
* bcm2835 and bcm2836 implementations, leaving the CPU configuration to
@@ -13,6 +12,8 @@
compatible = "brcm,bcm2835";
model = "BCM2835";
interrupt-parent = <&intc>;
+ #address-cells = <1>;
+ #size-cells = <1>;
chosen {
bootargs = "earlyprintk console=ttyAMA0";
diff --git a/arch/arm/boot/dts/stih407-family.dtsi b/arch/arm/boot/dts/stih407-family.dtsi
index d294e82..8b063ab 100644
--- a/arch/arm/boot/dts/stih407-family.dtsi
+++ b/arch/arm/boot/dts/stih407-family.dtsi
@@ -550,8 +550,9 @@
interrupt-names = "mmcirq";
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_mmc0>;
- clock-names = "mmc";
- clocks = <&clk_s_c0_flexgen CLK_MMC_0>;
+ clock-names = "mmc", "icn";
+ clocks = <&clk_s_c0_flexgen CLK_MMC_0>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_HVA>;
bus-width = <8>;
non-removable;
};
@@ -565,8 +566,9 @@
interrupt-names = "mmcirq";
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_sd1>;
- clock-names = "mmc";
- clocks = <&clk_s_c0_flexgen CLK_MMC_1>;
+ clock-names = "mmc", "icn";
+ clocks = <&clk_s_c0_flexgen CLK_MMC_1>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_HVA>;
resets = <&softreset STIH407_MMC1_SOFTRESET>;
bus-width = <4>;
};
diff --git a/arch/arm/boot/dts/stih410.dtsi b/arch/arm/boot/dts/stih410.dtsi
index 18ed1ad..4031886 100644
--- a/arch/arm/boot/dts/stih410.dtsi
+++ b/arch/arm/boot/dts/stih410.dtsi
@@ -41,7 +41,8 @@
compatible = "st,st-ohci-300x";
reg = <0x9a03c00 0x100>;
interrupts = <GIC_SPI 180 IRQ_TYPE_NONE>;
- clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>;
+ clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_DISP_0>;
resets = <&powerdown STIH407_USB2_PORT0_POWERDOWN>,
<&softreset STIH407_USB2_PORT0_SOFTRESET>;
reset-names = "power", "softreset";
@@ -57,7 +58,8 @@
interrupts = <GIC_SPI 151 IRQ_TYPE_NONE>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb0>;
- clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>;
+ clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_DISP_0>;
resets = <&powerdown STIH407_USB2_PORT0_POWERDOWN>,
<&softreset STIH407_USB2_PORT0_SOFTRESET>;
reset-names = "power", "softreset";
@@ -71,7 +73,8 @@
compatible = "st,st-ohci-300x";
reg = <0x9a83c00 0x100>;
interrupts = <GIC_SPI 181 IRQ_TYPE_NONE>;
- clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>;
+ clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_DISP_0>;
resets = <&powerdown STIH407_USB2_PORT1_POWERDOWN>,
<&softreset STIH407_USB2_PORT1_SOFTRESET>;
reset-names = "power", "softreset";
@@ -87,7 +90,8 @@
interrupts = <GIC_SPI 153 IRQ_TYPE_NONE>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb1>;
- clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>;
+ clocks = <&clk_s_c0_flexgen CLK_TX_ICN_DISP_0>,
+ <&clk_s_c0_flexgen CLK_RX_ICN_DISP_0>;
resets = <&powerdown STIH407_USB2_PORT1_POWERDOWN>,
<&softreset STIH407_USB2_PORT1_SOFTRESET>;
reset-names = "power", "softreset";
diff --git a/arch/arm/common/locomo.c b/arch/arm/common/locomo.c
index 0e97b4b..6c7b06854 100644
--- a/arch/arm/common/locomo.c
+++ b/arch/arm/common/locomo.c
@@ -140,7 +140,7 @@
static void locomo_handler(struct irq_desc *desc)
{
- struct locomo *lchip = irq_desc_get_chip_data(desc);
+ struct locomo *lchip = irq_desc_get_handler_data(desc);
int req, i;
/* Acknowledge the parent IRQ */
@@ -200,8 +200,7 @@
* Install handler for IRQ_LOCOMO_HW.
*/
irq_set_irq_type(lchip->irq, IRQ_TYPE_EDGE_FALLING);
- irq_set_chip_data(lchip->irq, lchip);
- irq_set_chained_handler(lchip->irq, locomo_handler);
+ irq_set_chained_handler_and_data(lchip->irq, locomo_handler, lchip);
/* Install handlers for IRQ_LOCOMO_* */
for ( ; irq <= lchip->irq_base + 3; irq++) {
diff --git a/arch/arm/common/sa1111.c b/arch/arm/common/sa1111.c
index fb0a0a4..2e076c4 100644
--- a/arch/arm/common/sa1111.c
+++ b/arch/arm/common/sa1111.c
@@ -472,8 +472,8 @@
* specifies that S0ReadyInt and S1ReadyInt should be '1'.
*/
sa1111_writel(0, irqbase + SA1111_INTPOL0);
- sa1111_writel(SA1111_IRQMASK_HI(IRQ_S0_READY_NINT) |
- SA1111_IRQMASK_HI(IRQ_S1_READY_NINT),
+ sa1111_writel(BIT(IRQ_S0_READY_NINT & 31) |
+ BIT(IRQ_S1_READY_NINT & 31),
irqbase + SA1111_INTPOL1);
/* clear all IRQs */
@@ -754,7 +754,7 @@
if (sachip->irq != NO_IRQ) {
ret = sa1111_setup_irq(sachip, pd->irq_base);
if (ret)
- goto err_unmap;
+ goto err_clk;
}
#ifdef CONFIG_ARCH_SA1100
@@ -799,6 +799,8 @@
return 0;
+ err_clk:
+ clk_disable(sachip->clk);
err_unmap:
iounmap(sachip->base);
err_clk_unprep:
@@ -869,9 +871,9 @@
#ifdef CONFIG_PM
-static int sa1111_suspend(struct platform_device *dev, pm_message_t state)
+static int sa1111_suspend_noirq(struct device *dev)
{
- struct sa1111 *sachip = platform_get_drvdata(dev);
+ struct sa1111 *sachip = dev_get_drvdata(dev);
struct sa1111_save_data *save;
unsigned long flags;
unsigned int val;
@@ -934,9 +936,9 @@
* restored by their respective drivers, and must be called
* via LDM after this function.
*/
-static int sa1111_resume(struct platform_device *dev)
+static int sa1111_resume_noirq(struct device *dev)
{
- struct sa1111 *sachip = platform_get_drvdata(dev);
+ struct sa1111 *sachip = dev_get_drvdata(dev);
struct sa1111_save_data *save;
unsigned long flags, id;
void __iomem *base;
@@ -952,7 +954,7 @@
id = sa1111_readl(sachip->base + SA1111_SKID);
if ((id & SKID_ID_MASK) != SKID_SA1111_ID) {
__sa1111_remove(sachip);
- platform_set_drvdata(dev, NULL);
+ dev_set_drvdata(dev, NULL);
kfree(save);
return 0;
}
@@ -1003,8 +1005,8 @@
}
#else
-#define sa1111_suspend NULL
-#define sa1111_resume NULL
+#define sa1111_suspend_noirq NULL
+#define sa1111_resume_noirq NULL
#endif
static int sa1111_probe(struct platform_device *pdev)
@@ -1017,7 +1019,7 @@
return -EINVAL;
irq = platform_get_irq(pdev, 0);
if (irq < 0)
- return -ENXIO;
+ return irq;
return __sa1111_probe(&pdev->dev, mem, irq);
}
@@ -1038,6 +1040,11 @@
return 0;
}
+static struct dev_pm_ops sa1111_pm_ops = {
+ .suspend_noirq = sa1111_suspend_noirq,
+ .resume_noirq = sa1111_resume_noirq,
+};
+
/*
* Not sure if this should be on the system bus or not yet.
* We really want some way to register a system device at
@@ -1050,10 +1057,9 @@
static struct platform_driver sa1111_device_driver = {
.probe = sa1111_probe,
.remove = sa1111_remove,
- .suspend = sa1111_suspend,
- .resume = sa1111_resume,
.driver = {
.name = "sa1111",
+ .pm = &sa1111_pm_ops,
},
};
diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig
index 71b42e6..78cd2f1 100644
--- a/arch/arm/configs/keystone_defconfig
+++ b/arch/arm/configs/keystone_defconfig
@@ -161,6 +161,7 @@
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_DWC3=y
+CONFIG_NOP_USB_XCEIV=y
CONFIG_KEYSTONE_USB_PHY=y
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index 2c8665c..ea3566f 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -781,7 +781,7 @@
CONFIG_DMA_BCM2835=y
CONFIG_DMA_OMAP=y
CONFIG_QCOM_BAM_DMA=y
-CONFIG_XILINX_VDMA=y
+CONFIG_XILINX_DMA=y
CONFIG_DMA_SUN6I=y
CONFIG_STAGING=y
CONFIG_SENSORS_ISL29018=y
diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index da3c042..aef022a 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -284,7 +284,7 @@
err = blkcipher_walk_done(desc, &walk,
walk.nbytes % AES_BLOCK_SIZE);
}
- if (nbytes) {
+ if (walk.nbytes % AES_BLOCK_SIZE) {
u8 *tdst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *tsrc = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
u8 __aligned(8) tail[AES_BLOCK_SIZE];
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
index d0131ee..3f82e9d 100644
--- a/arch/arm/include/asm/pgtable-2level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -47,6 +47,7 @@
#define PMD_SECT_WB (PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
#define PMD_SECT_MINICACHE (PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
#define PMD_SECT_WBWA (PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_CACHE_MASK (PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
#define PMD_SECT_NONSHARED_DEV (PMD_SECT_TEX(2))
/*
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index f8f1cff..4cd664a 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -62,6 +62,7 @@
#define PMD_SECT_WT (_AT(pmdval_t, 2) << 2) /* normal inner write-through */
#define PMD_SECT_WB (_AT(pmdval_t, 3) << 2) /* normal inner write-back */
#define PMD_SECT_WBWA (_AT(pmdval_t, 7) << 2) /* normal inner write-alloc */
+#define PMD_SECT_CACHE_MASK (_AT(pmdval_t, 7) << 2)
/*
* + Level 3 descriptor (PTE)
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 75f130e..c94b90d 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -158,8 +158,6 @@
{
int i;
- kvm_free_stage2_pgd(kvm);
-
for (i = 0; i < KVM_MAX_VCPUS; ++i) {
if (kvm->vcpus[i]) {
kvm_arch_vcpu_free(kvm->vcpus[i]);
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 29d0b23..e9a5c0e 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1714,7 +1714,8 @@
kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
- hyp_idmap_start < kern_hyp_va(~0UL)) {
+ hyp_idmap_start < kern_hyp_va(~0UL) &&
+ hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
/*
* The idmap page is intersecting with the VA space,
* it is not safe to continue further.
@@ -1893,6 +1894,7 @@
void kvm_arch_flush_shadow_all(struct kvm *kvm)
{
+ kvm_free_stage2_pgd(kvm);
}
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
diff --git a/arch/arm/mach-exynos/suspend.c b/arch/arm/mach-exynos/suspend.c
index 3750575..06332f6 100644
--- a/arch/arm/mach-exynos/suspend.c
+++ b/arch/arm/mach-exynos/suspend.c
@@ -255,6 +255,12 @@
return -ENOMEM;
}
+ /*
+ * Clear the OF_POPULATED flag set in of_irq_init so that
+ * later the Exynos PMU platform device won't be skipped.
+ */
+ of_node_clear_flag(node, OF_POPULATED);
+
return 0;
}
diff --git a/arch/arm/mach-pxa/lubbock.c b/arch/arm/mach-pxa/lubbock.c
index 7245f33..d6159f8 100644
--- a/arch/arm/mach-pxa/lubbock.c
+++ b/arch/arm/mach-pxa/lubbock.c
@@ -137,6 +137,18 @@
// no D+ pullup; lubbock can't connect/disconnect in software
};
+static void lubbock_init_pcmcia(void)
+{
+ struct clk *clk;
+
+ /* Add an alias for the SA1111 PCMCIA clock */
+ clk = clk_get_sys("pxa2xx-pcmcia", NULL);
+ if (!IS_ERR(clk)) {
+ clkdev_create(clk, NULL, "1800");
+ clk_put(clk);
+ }
+}
+
static struct resource sa1111_resources[] = {
[0] = {
.start = 0x10000000,
@@ -467,6 +479,8 @@
pxa_set_btuart_info(NULL);
pxa_set_stuart_info(NULL);
+ lubbock_init_pcmcia();
+
clk_add_alias("SA1111_CLK", NULL, "GPIO11_CLK", NULL);
pxa_set_udc_info(&udc_info);
pxa_set_fb_info(NULL, &sharp_lm8v31);
diff --git a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
index 62437b5..73e3adb 100644
--- a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
+++ b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
@@ -41,40 +41,27 @@
#define REGULATOR_IRQ_MASK BIT(2) /* IRQ2, active low */
+/* start of DA9210 System Control and Event Registers */
+#define DA9210_REG_MASK_A 0x54
+
static void __iomem *irqc;
-static const u8 da9063_mask_regs[] = {
- DA9063_REG_IRQ_MASK_A,
- DA9063_REG_IRQ_MASK_B,
- DA9063_REG_IRQ_MASK_C,
- DA9063_REG_IRQ_MASK_D,
+/* first byte sets the memory pointer, following are consecutive reg values */
+static u8 da9063_irq_clr[] = { DA9063_REG_IRQ_MASK_A, 0xff, 0xff, 0xff, 0xff };
+static u8 da9210_irq_clr[] = { DA9210_REG_MASK_A, 0xff, 0xff };
+
+static struct i2c_msg da9xxx_msgs[2] = {
+ {
+ .addr = 0x58,
+ .len = ARRAY_SIZE(da9063_irq_clr),
+ .buf = da9063_irq_clr,
+ }, {
+ .addr = 0x68,
+ .len = ARRAY_SIZE(da9210_irq_clr),
+ .buf = da9210_irq_clr,
+ },
};
-/* DA9210 System Control and Event Registers */
-#define DA9210_REG_MASK_A 0x54
-#define DA9210_REG_MASK_B 0x55
-
-static const u8 da9210_mask_regs[] = {
- DA9210_REG_MASK_A,
- DA9210_REG_MASK_B,
-};
-
-static void da9xxx_mask_irqs(struct i2c_client *client, const u8 regs[],
- unsigned int nregs)
-{
- unsigned int i;
-
- dev_info(&client->dev, "Masking %s interrupt sources\n", client->name);
-
- for (i = 0; i < nregs; i++) {
- int error = i2c_smbus_write_byte_data(client, regs[i], ~0);
- if (error) {
- dev_err(&client->dev, "i2c error %d\n", error);
- return;
- }
- }
-}
-
static int regulator_quirk_notify(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -93,12 +80,15 @@
client = to_i2c_client(dev);
dev_dbg(dev, "Detected %s\n", client->name);
- if ((client->addr == 0x58 && !strcmp(client->name, "da9063")))
- da9xxx_mask_irqs(client, da9063_mask_regs,
- ARRAY_SIZE(da9063_mask_regs));
- else if (client->addr == 0x68 && !strcmp(client->name, "da9210"))
- da9xxx_mask_irqs(client, da9210_mask_regs,
- ARRAY_SIZE(da9210_mask_regs));
+ if ((client->addr == 0x58 && !strcmp(client->name, "da9063")) ||
+ (client->addr == 0x68 && !strcmp(client->name, "da9210"))) {
+ int ret;
+
+ dev_info(&client->dev, "clearing da9063/da9210 interrupts\n");
+ ret = i2c_transfer(client->adapter, da9xxx_msgs, ARRAY_SIZE(da9xxx_msgs));
+ if (ret != ARRAY_SIZE(da9xxx_msgs))
+ dev_err(&client->dev, "i2c error %d\n", ret);
+ }
mon = ioread32(irqc + IRQC_MONITOR);
if (mon & REGULATOR_IRQ_MASK)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 6344913..30fe03f 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -137,7 +137,7 @@
initial_pmd_value = pmd;
- pmd &= PMD_SECT_TEX(1) | PMD_SECT_BUFFERABLE | PMD_SECT_CACHEABLE;
+ pmd &= PMD_SECT_CACHE_MASK;
for (i = 0; i < ARRAY_SIZE(cache_policies); i++)
if (cache_policies[i].pmd == pmd) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 3d2cef6..f193414 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -170,9 +170,6 @@
pr_info("Xen: initializing cpu%d\n", cpu);
vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
- /* Direct vCPU id mapping for ARM guests. */
- per_cpu(xen_vcpu_id, cpu) = cpu;
-
info.mfn = virt_to_gfn(vcpup);
info.offset = xen_offset_in_page(vcpup);
@@ -330,6 +327,7 @@
{
struct xen_add_to_physmap xatp;
struct shared_info *shared_info_page = NULL;
+ int cpu;
if (!xen_domain())
return 0;
@@ -380,7 +378,8 @@
return -ENOMEM;
/* Direct vCPU id mapping for ARM guests. */
- per_cpu(xen_vcpu_id, 0) = 0;
+ for_each_possible_cpu(cpu)
+ per_cpu(xen_vcpu_id, cpu) = cpu;
xen_auto_xlat_grant_frames.count = gnttab_max_grant_frames();
if (xen_xlate_map_ballooned_pages(&xen_auto_xlat_grant_frames.pfn,
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
index 445aa67..c2b9bcb 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
@@ -255,10 +255,10 @@
/* Local timer */
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0xf01>,
- <1 14 0xf01>,
- <1 11 0xf01>,
- <1 10 0xf01>;
+ interrupts = <1 13 0xf08>,
+ <1 14 0xf08>,
+ <1 11 0xf08>,
+ <1 10 0xf08>;
};
timer0: timer0@ffc03000 {
diff --git a/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi b/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi
index e502c24..bf6c8d0 100644
--- a/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi
@@ -102,13 +102,13 @@
timer {
compatible = "arm,armv8-timer";
interrupts = <GIC_PPI 13
- (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_EDGE_RISING)>,
+ (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 14
- (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_EDGE_RISING)>,
+ (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 11
- (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_EDGE_RISING)>,
+ (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 10
- (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_EDGE_RISING)>;
+ (GIC_CPU_MASK_RAW(0xff) | IRQ_TYPE_LEVEL_LOW)>;
};
xtal: xtal-clk {
diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
index d5c3435..31ea70a 100644
--- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
@@ -110,10 +110,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 0 0xff01>, /* Secure Phys IRQ */
- <1 13 0xff01>, /* Non-secure Phys IRQ */
- <1 14 0xff01>, /* Virt IRQ */
- <1 15 0xff01>; /* Hyp IRQ */
+ interrupts = <1 0 0xff08>, /* Secure Phys IRQ */
+ <1 13 0xff08>, /* Non-secure Phys IRQ */
+ <1 14 0xff08>, /* Virt IRQ */
+ <1 15 0xff08>; /* Hyp IRQ */
clock-frequency = <50000000>;
};
diff --git a/arch/arm64/boot/dts/broadcom/bcm2835-rpi.dtsi b/arch/arm64/boot/dts/broadcom/bcm2835-rpi.dtsi
new file mode 120000
index 0000000..3937b77
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/bcm2835-rpi.dtsi
@@ -0,0 +1 @@
+../../../../arm/boot/dts/bcm2835-rpi.dtsi
\ No newline at end of file
diff --git a/arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b.dts b/arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b.dts
index 6f47dd2..7841b72 100644
--- a/arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b.dts
+++ b/arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b.dts
@@ -1,7 +1,7 @@
/dts-v1/;
#include "bcm2837.dtsi"
-#include "../../../../arm/boot/dts/bcm2835-rpi.dtsi"
-#include "../../../../arm/boot/dts/bcm283x-rpi-smsc9514.dtsi"
+#include "bcm2835-rpi.dtsi"
+#include "bcm283x-rpi-smsc9514.dtsi"
/ {
compatible = "raspberrypi,3-model-b", "brcm,bcm2837";
diff --git a/arch/arm64/boot/dts/broadcom/bcm2837.dtsi b/arch/arm64/boot/dts/broadcom/bcm2837.dtsi
index f2a31d0..8216bbb 100644
--- a/arch/arm64/boot/dts/broadcom/bcm2837.dtsi
+++ b/arch/arm64/boot/dts/broadcom/bcm2837.dtsi
@@ -1,4 +1,4 @@
-#include "../../../../arm/boot/dts/bcm283x.dtsi"
+#include "bcm283x.dtsi"
/ {
compatible = "brcm,bcm2836";
diff --git a/arch/arm64/boot/dts/broadcom/bcm283x-rpi-smsc9514.dtsi b/arch/arm64/boot/dts/broadcom/bcm283x-rpi-smsc9514.dtsi
new file mode 120000
index 0000000..dca7c05
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/bcm283x-rpi-smsc9514.dtsi
@@ -0,0 +1 @@
+../../../../arm/boot/dts/bcm283x-rpi-smsc9514.dtsi
\ No newline at end of file
diff --git a/arch/arm64/boot/dts/broadcom/bcm283x.dtsi b/arch/arm64/boot/dts/broadcom/bcm283x.dtsi
new file mode 120000
index 0000000..5f54e4c
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/bcm283x.dtsi
@@ -0,0 +1 @@
+../../../../arm/boot/dts/bcm283x.dtsi
\ No newline at end of file
diff --git a/arch/arm64/boot/dts/broadcom/ns2.dtsi b/arch/arm64/boot/dts/broadcom/ns2.dtsi
index f53b095..d4a12fa 100644
--- a/arch/arm64/boot/dts/broadcom/ns2.dtsi
+++ b/arch/arm64/boot/dts/broadcom/ns2.dtsi
@@ -88,13 +88,13 @@
timer {
compatible = "arm,armv8-timer";
interrupts = <GIC_PPI 13 (GIC_CPU_MASK_RAW(0xff) |
- IRQ_TYPE_EDGE_RISING)>,
+ IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 14 (GIC_CPU_MASK_RAW(0xff) |
- IRQ_TYPE_EDGE_RISING)>,
+ IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 11 (GIC_CPU_MASK_RAW(0xff) |
- IRQ_TYPE_EDGE_RISING)>,
+ IRQ_TYPE_LEVEL_LOW)>,
<GIC_PPI 10 (GIC_CPU_MASK_RAW(0xff) |
- IRQ_TYPE_EDGE_RISING)>;
+ IRQ_TYPE_LEVEL_LOW)>;
};
pmu {
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
index 2eb9b22..04dc8a8 100644
--- a/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx.dtsi
@@ -354,10 +354,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0xff01>,
- <1 14 0xff01>,
- <1 11 0xff01>,
- <1 10 0xff01>;
+ interrupts = <1 13 4>,
+ <1 14 4>,
+ <1 11 4>,
+ <1 10 4>;
};
pmu {
diff --git a/arch/arm64/boot/dts/exynos/exynos7.dtsi b/arch/arm64/boot/dts/exynos/exynos7.dtsi
index ca663df..1628315 100644
--- a/arch/arm64/boot/dts/exynos/exynos7.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynos7.dtsi
@@ -473,10 +473,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0xff01>,
- <1 14 0xff01>,
- <1 11 0xff01>,
- <1 10 0xff01>;
+ interrupts = <1 13 0xff08>,
+ <1 14 0xff08>,
+ <1 11 0xff08>,
+ <1 10 0xff08>;
};
pmu_system_controller: system-controller@105c0000 {
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
index e669fbd..a67e210 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
@@ -119,10 +119,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0x1>, /* Physical Secure PPI */
- <1 14 0x1>, /* Physical Non-Secure PPI */
- <1 11 0x1>, /* Virtual PPI */
- <1 10 0x1>; /* Hypervisor PPI */
+ interrupts = <1 13 0xf08>, /* Physical Secure PPI */
+ <1 14 0xf08>, /* Physical Non-Secure PPI */
+ <1 11 0xf08>, /* Virtual PPI */
+ <1 10 0xf08>; /* Hypervisor PPI */
};
pmu {
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
index 21023a3..e3b6034 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
@@ -191,10 +191,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0x8>, /* Physical Secure PPI, active-low */
- <1 14 0x8>, /* Physical Non-Secure PPI, active-low */
- <1 11 0x8>, /* Virtual PPI, active-low */
- <1 10 0x8>; /* Hypervisor PPI, active-low */
+ interrupts = <1 13 4>, /* Physical Secure PPI, active-low */
+ <1 14 4>, /* Physical Non-Secure PPI, active-low */
+ <1 11 4>, /* Virtual PPI, active-low */
+ <1 10 4>; /* Hypervisor PPI, active-low */
};
pmu {
diff --git a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
index eab1a42..c2a6745 100644
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
@@ -122,10 +122,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_EDGE_RISING)>,
- <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_EDGE_RISING)>,
- <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_EDGE_RISING)>,
- <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_EDGE_RISING)>;
+ interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+ <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+ <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
+ <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
};
odmi: odmi@300000 {
diff --git a/arch/arm64/boot/dts/socionext/uniphier-ph1-ld20.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ph1-ld20.dtsi
index c223915..d73bdc8 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-ph1-ld20.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-ph1-ld20.dtsi
@@ -129,10 +129,10 @@
timer {
compatible = "arm,armv8-timer";
- interrupts = <1 13 0xf01>,
- <1 14 0xf01>,
- <1 11 0xf01>,
- <1 10 0xf01>;
+ interrupts = <1 13 4>,
+ <1 14 4>,
+ <1 11 4>,
+ <1 10 4>;
};
soc {
diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
index e595f22..3e2e51f 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
+++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
@@ -65,10 +65,10 @@
timer {
compatible = "arm,armv8-timer";
interrupt-parent = <&gic>;
- interrupts = <1 13 0xf01>,
- <1 14 0xf01>,
- <1 11 0xf01>,
- <1 10 0xf01>;
+ interrupts = <1 13 0xf08>,
+ <1 14 0xf08>,
+ <1 11 0xf08>,
+ <1 10 0xf08>;
};
amba_apu {
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 5c88804..6b2aa0f 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -216,7 +216,7 @@
err = blkcipher_walk_done(desc, &walk,
walk.nbytes % AES_BLOCK_SIZE);
}
- if (nbytes) {
+ if (walk.nbytes % AES_BLOCK_SIZE) {
u8 *tdst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *tsrc = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
u8 __aligned(8) tail[AES_BLOCK_SIZE];
diff --git a/arch/avr32/include/asm/uaccess.h b/arch/avr32/include/asm/uaccess.h
index 68cf638..b1ec1fa 100644
--- a/arch/avr32/include/asm/uaccess.h
+++ b/arch/avr32/include/asm/uaccess.h
@@ -74,7 +74,7 @@
extern __kernel_size_t copy_to_user(void __user *to, const void *from,
__kernel_size_t n);
-extern __kernel_size_t copy_from_user(void *to, const void __user *from,
+extern __kernel_size_t ___copy_from_user(void *to, const void __user *from,
__kernel_size_t n);
static inline __kernel_size_t __copy_to_user(void __user *to, const void *from,
@@ -88,6 +88,15 @@
{
return __copy_user(to, (const void __force *)from, n);
}
+static inline __kernel_size_t copy_from_user(void *to,
+ const void __user *from,
+ __kernel_size_t n)
+{
+ size_t res = ___copy_from_user(to, from, n);
+ if (unlikely(res))
+ memset(to + (n - res), 0, res);
+ return res;
+}
#define __copy_to_user_inatomic __copy_to_user
#define __copy_from_user_inatomic __copy_from_user
diff --git a/arch/avr32/kernel/avr32_ksyms.c b/arch/avr32/kernel/avr32_ksyms.c
index d93ead0..7c6cf14 100644
--- a/arch/avr32/kernel/avr32_ksyms.c
+++ b/arch/avr32/kernel/avr32_ksyms.c
@@ -36,7 +36,7 @@
/*
* Userspace access stuff.
*/
-EXPORT_SYMBOL(copy_from_user);
+EXPORT_SYMBOL(___copy_from_user);
EXPORT_SYMBOL(copy_to_user);
EXPORT_SYMBOL(__copy_user);
EXPORT_SYMBOL(strncpy_from_user);
diff --git a/arch/avr32/lib/copy_user.S b/arch/avr32/lib/copy_user.S
index ea59c04..0753734 100644
--- a/arch/avr32/lib/copy_user.S
+++ b/arch/avr32/lib/copy_user.S
@@ -23,13 +23,13 @@
*/
.text
.align 1
- .global copy_from_user
- .type copy_from_user, @function
-copy_from_user:
+ .global ___copy_from_user
+ .type ___copy_from_user, @function
+___copy_from_user:
branch_if_kernel r8, __copy_user
ret_if_privileged r8, r11, r10, r10
rjmp __copy_user
- .size copy_from_user, . - copy_from_user
+ .size ___copy_from_user, . - ___copy_from_user
.global copy_to_user
.type copy_to_user, @function
diff --git a/arch/blackfin/include/asm/uaccess.h b/arch/blackfin/include/asm/uaccess.h
index 12f5d68..0a2a700 100644
--- a/arch/blackfin/include/asm/uaccess.h
+++ b/arch/blackfin/include/asm/uaccess.h
@@ -171,11 +171,12 @@
static inline unsigned long __must_check
copy_from_user(void *to, const void __user *from, unsigned long n)
{
- if (access_ok(VERIFY_READ, from, n))
+ if (likely(access_ok(VERIFY_READ, from, n))) {
memcpy(to, (const void __force *)from, n);
- else
- return n;
- return 0;
+ return 0;
+ }
+ memset(to, 0, n);
+ return n;
}
static inline unsigned long __must_check
diff --git a/arch/cris/include/asm/uaccess.h b/arch/cris/include/asm/uaccess.h
index e3530d0..56c7d57 100644
--- a/arch/cris/include/asm/uaccess.h
+++ b/arch/cris/include/asm/uaccess.h
@@ -194,30 +194,6 @@
extern unsigned long __copy_user_zeroing(void *to, const void __user *from, unsigned long n);
extern unsigned long __do_clear_user(void __user *to, unsigned long n);
-static inline unsigned long
-__generic_copy_to_user(void __user *to, const void *from, unsigned long n)
-{
- if (access_ok(VERIFY_WRITE, to, n))
- return __copy_user(to, from, n);
- return n;
-}
-
-static inline unsigned long
-__generic_copy_from_user(void *to, const void __user *from, unsigned long n)
-{
- if (access_ok(VERIFY_READ, from, n))
- return __copy_user_zeroing(to, from, n);
- return n;
-}
-
-static inline unsigned long
-__generic_clear_user(void __user *to, unsigned long n)
-{
- if (access_ok(VERIFY_WRITE, to, n))
- return __do_clear_user(to, n);
- return n;
-}
-
static inline long
__strncpy_from_user(char *dst, const char __user *src, long count)
{
@@ -282,7 +258,7 @@
else if (n == 24)
__asm_copy_from_user_24(to, from, ret);
else
- ret = __generic_copy_from_user(to, from, n);
+ ret = __copy_user_zeroing(to, from, n);
return ret;
}
@@ -333,7 +309,7 @@
else if (n == 24)
__asm_copy_to_user_24(to, from, ret);
else
- ret = __generic_copy_to_user(to, from, n);
+ ret = __copy_user(to, from, n);
return ret;
}
@@ -366,26 +342,43 @@
else if (n == 24)
__asm_clear_24(to, ret);
else
- ret = __generic_clear_user(to, n);
+ ret = __do_clear_user(to, n);
return ret;
}
-#define clear_user(to, n) \
- (__builtin_constant_p(n) ? \
- __constant_clear_user(to, n) : \
- __generic_clear_user(to, n))
+static inline size_t clear_user(void __user *to, size_t n)
+{
+ if (unlikely(!access_ok(VERIFY_WRITE, to, n)))
+ return n;
+ if (__builtin_constant_p(n))
+ return __constant_clear_user(to, n);
+ else
+ return __do_clear_user(to, n);
+}
-#define copy_from_user(to, from, n) \
- (__builtin_constant_p(n) ? \
- __constant_copy_from_user(to, from, n) : \
- __generic_copy_from_user(to, from, n))
+static inline size_t copy_from_user(void *to, const void __user *from, size_t n)
+{
+ if (unlikely(!access_ok(VERIFY_READ, from, n))) {
+ memset(to, 0, n);
+ return n;
+ }
+ if (__builtin_constant_p(n))
+ return __constant_copy_from_user(to, from, n);
+ else
+ return __copy_user_zeroing(to, from, n);
+}
-#define copy_to_user(to, from, n) \
- (__builtin_constant_p(n) ? \
- __constant_copy_to_user(to, from, n) : \
- __generic_copy_to_user(to, from, n))
+static inline size_t copy_to_user(void __user *to, const void *from, size_t n)
+{
+ if (unlikely(!access_ok(VERIFY_WRITE, to, n)))
+ return n;
+ if (__builtin_constant_p(n))
+ return __constant_copy_to_user(to, from, n);
+ else
+ return __copy_user(to, from, n);
+}
/* We let the __ versions of copy_from/to_user inline, because they're often
* used in fast paths and have only a small space overhead.
diff --git a/arch/frv/include/asm/uaccess.h b/arch/frv/include/asm/uaccess.h
index 3ac9a59..87d9e34 100644
--- a/arch/frv/include/asm/uaccess.h
+++ b/arch/frv/include/asm/uaccess.h
@@ -263,19 +263,25 @@
extern long __memset_user(void *dst, unsigned long count);
extern long __memcpy_user(void *dst, const void *src, unsigned long count);
-#define clear_user(dst,count) __memset_user(____force(dst), (count))
+#define __clear_user(dst,count) __memset_user(____force(dst), (count))
#define __copy_from_user_inatomic(to, from, n) __memcpy_user((to), ____force(from), (n))
#define __copy_to_user_inatomic(to, from, n) __memcpy_user(____force(to), (from), (n))
#else
-#define clear_user(dst,count) (memset(____force(dst), 0, (count)), 0)
+#define __clear_user(dst,count) (memset(____force(dst), 0, (count)), 0)
#define __copy_from_user_inatomic(to, from, n) (memcpy((to), ____force(from), (n)), 0)
#define __copy_to_user_inatomic(to, from, n) (memcpy(____force(to), (from), (n)), 0)
#endif
-#define __clear_user clear_user
+static inline unsigned long __must_check
+clear_user(void __user *to, unsigned long n)
+{
+ if (likely(__access_ok(to, n)))
+ n = __clear_user(to, n);
+ return n;
+}
static inline unsigned long __must_check
__copy_to_user(void __user *to, const void *from, unsigned long n)
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index f000a38..f61cfb2 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -103,7 +103,8 @@
{
long res = __strnlen_user(src, n);
- /* return from strnlen can't be zero -- that would be rubbish. */
+ if (unlikely(!res))
+ return -EFAULT;
if (res > n) {
copy_from_user(dst, src, n);
diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index 0472927..bfe1319 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -269,19 +269,16 @@
__cu_len; \
})
-#define copy_from_user(to, from, n) \
-({ \
- void *__cu_to = (to); \
- const void __user *__cu_from = (from); \
- long __cu_len = (n); \
- \
- __chk_user_ptr(__cu_from); \
- if (__access_ok(__cu_from, __cu_len, get_fs())) { \
- check_object_size(__cu_to, __cu_len, false); \
- __cu_len = __copy_user((__force void __user *) __cu_to, __cu_from, __cu_len); \
- } \
- __cu_len; \
-})
+static inline unsigned long
+copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+ check_object_size(to, n, false);
+ if (likely(__access_ok(from, n, get_fs())))
+ n = __copy_user((__force void __user *) to, from, n);
+ else
+ memset(to, 0, n);
+ return n;
+}
#define __copy_in_user(to, from, size) __copy_user((to), (from), (size))
diff --git a/arch/m32r/include/asm/uaccess.h b/arch/m32r/include/asm/uaccess.h
index cac7014..6f89821 100644
--- a/arch/m32r/include/asm/uaccess.h
+++ b/arch/m32r/include/asm/uaccess.h
@@ -219,7 +219,7 @@
#define __get_user_nocheck(x, ptr, size) \
({ \
long __gu_err = 0; \
- unsigned long __gu_val; \
+ unsigned long __gu_val = 0; \
might_fault(); \
__get_user_size(__gu_val, (ptr), (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
diff --git a/arch/metag/include/asm/uaccess.h b/arch/metag/include/asm/uaccess.h
index 8282cbc..273e612 100644
--- a/arch/metag/include/asm/uaccess.h
+++ b/arch/metag/include/asm/uaccess.h
@@ -204,8 +204,9 @@
static inline unsigned long
copy_from_user(void *to, const void __user *from, unsigned long n)
{
- if (access_ok(VERIFY_READ, from, n))
+ if (likely(access_ok(VERIFY_READ, from, n)))
return __copy_user_zeroing(to, from, n);
+ memset(to, 0, n);
return n;
}
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 331b0d3..8266767 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -227,7 +227,7 @@
#define __get_user(x, ptr) \
({ \
- unsigned long __gu_val; \
+ unsigned long __gu_val = 0; \
/*unsigned long __gu_ptr = (unsigned long)(ptr);*/ \
long __gu_err; \
switch (sizeof(*(ptr))) { \
@@ -373,10 +373,13 @@
static inline long copy_from_user(void *to,
const void __user *from, unsigned long n)
{
+ unsigned long res = n;
might_fault();
- if (access_ok(VERIFY_READ, from, n))
- return __copy_from_user(to, from, n);
- return n;
+ if (likely(access_ok(VERIFY_READ, from, n)))
+ res = __copy_from_user(to, from, n);
+ if (unlikely(res))
+ memset(to + (n - res), 0, res);
+ return res;
}
#define __copy_to_user(to, from, n) \
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 11b965f..21a2aab 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -14,6 +14,7 @@
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/thread_info.h>
+#include <linux/string.h>
#include <asm/asm-eva.h>
/*
@@ -1170,6 +1171,8 @@
__cu_len = __invoke_copy_from_user(__cu_to, \
__cu_from, \
__cu_len); \
+ } else { \
+ memset(__cu_to, 0, __cu_len); \
} \
} \
__cu_len; \
diff --git a/arch/mn10300/include/asm/uaccess.h b/arch/mn10300/include/asm/uaccess.h
index 20f7bf6..d012e87 100644
--- a/arch/mn10300/include/asm/uaccess.h
+++ b/arch/mn10300/include/asm/uaccess.h
@@ -166,6 +166,7 @@
"2:\n" \
" .section .fixup,\"ax\"\n" \
"3:\n\t" \
+ " mov 0,%1\n" \
" mov %3,%0\n" \
" jmp 2b\n" \
" .previous\n" \
diff --git a/arch/mn10300/lib/usercopy.c b/arch/mn10300/lib/usercopy.c
index 7826e6c..ce8899e 100644
--- a/arch/mn10300/lib/usercopy.c
+++ b/arch/mn10300/lib/usercopy.c
@@ -9,7 +9,7 @@
* as published by the Free Software Foundation; either version
* 2 of the Licence, or (at your option) any later version.
*/
-#include <asm/uaccess.h>
+#include <linux/uaccess.h>
unsigned long
__generic_copy_to_user(void *to, const void *from, unsigned long n)
@@ -24,6 +24,8 @@
{
if (access_ok(VERIFY_READ, from, n))
__copy_user_zeroing(to, from, n);
+ else
+ memset(to, 0, n);
return n;
}
diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index caa51ff..0ab8232 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -102,9 +102,12 @@
static inline long copy_from_user(void *to, const void __user *from,
unsigned long n)
{
- if (!access_ok(VERIFY_READ, from, n))
- return n;
- return __copy_from_user(to, from, n);
+ unsigned long res = n;
+ if (access_ok(VERIFY_READ, from, n))
+ res = __copy_from_user(to, from, n);
+ if (unlikely(res))
+ memset(to + (n - res), 0, res);
+ return res;
}
static inline long copy_to_user(void __user *to, const void *from,
@@ -139,7 +142,7 @@
#define __get_user_unknown(val, size, ptr, err) do { \
err = 0; \
- if (copy_from_user(&(val), ptr, size)) { \
+ if (__copy_from_user(&(val), ptr, size)) { \
err = -EFAULT; \
} \
} while (0)
@@ -166,7 +169,7 @@
({ \
long __gu_err = -EFAULT; \
const __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \
- unsigned long __gu_val; \
+ unsigned long __gu_val = 0; \
__get_user_common(__gu_val, sizeof(*(ptr)), __gu_ptr, __gu_err);\
(x) = (__force __typeof__(x))__gu_val; \
__gu_err; \
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index a6bd07c..5cc6b4f 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -273,28 +273,20 @@
static inline unsigned long
copy_from_user(void *to, const void *from, unsigned long n)
{
- unsigned long over;
+ unsigned long res = n;
- if (access_ok(VERIFY_READ, from, n))
- return __copy_tofrom_user(to, from, n);
- if ((unsigned long)from < TASK_SIZE) {
- over = (unsigned long)from + n - TASK_SIZE;
- return __copy_tofrom_user(to, from, n - over) + over;
- }
- return n;
+ if (likely(access_ok(VERIFY_READ, from, n)))
+ res = __copy_tofrom_user(to, from, n);
+ if (unlikely(res))
+ memset(to + (n - res), 0, res);
+ return res;
}
static inline unsigned long
copy_to_user(void *to, const void *from, unsigned long n)
{
- unsigned long over;
-
- if (access_ok(VERIFY_WRITE, to, n))
- return __copy_tofrom_user(to, from, n);
- if ((unsigned long)to < TASK_SIZE) {
- over = (unsigned long)to + n - TASK_SIZE;
- return __copy_tofrom_user(to, from, n - over) + over;
- }
+ if (likely(access_ok(VERIFY_WRITE, to, n)))
+ n = __copy_tofrom_user(to, from, n);
return n;
}
@@ -303,13 +295,8 @@
static inline __must_check unsigned long
clear_user(void *addr, unsigned long size)
{
-
- if (access_ok(VERIFY_WRITE, addr, size))
- return __clear_user(addr, size);
- if ((unsigned long)addr < TASK_SIZE) {
- unsigned long over = (unsigned long)addr + size - TASK_SIZE;
- return __clear_user(addr, size - over) + over;
- }
+ if (likely(access_ok(VERIFY_WRITE, addr, size)))
+ size = __clear_user(addr, size);
return size;
}
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index e915048..4828478 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -10,6 +10,7 @@
#include <asm-generic/uaccess-unaligned.h>
#include <linux/bug.h>
+#include <linux/string.h>
#define VERIFY_READ 0
#define VERIFY_WRITE 1
@@ -221,7 +222,7 @@
unsigned long n)
{
int sz = __compiletime_object_size(to);
- int ret = -EFAULT;
+ unsigned long ret = n;
if (likely(sz == -1 || sz >= n))
ret = __copy_from_user(to, from, n);
@@ -230,6 +231,8 @@
else
__bad_copy_user();
+ if (unlikely(ret))
+ memset(to + (n - ret), 0, ret);
return ret;
}
diff --git a/arch/powerpc/include/asm/cpu_has_feature.h b/arch/powerpc/include/asm/cpu_has_feature.h
index 2ef55f8..b312b15 100644
--- a/arch/powerpc/include/asm/cpu_has_feature.h
+++ b/arch/powerpc/include/asm/cpu_has_feature.h
@@ -15,7 +15,7 @@
#ifdef CONFIG_JUMP_LABEL_FEATURE_CHECKS
#include <linux/jump_label.h>
-#define NUM_CPU_FTR_KEYS 64
+#define NUM_CPU_FTR_KEYS BITS_PER_LONG
extern struct static_key_true cpu_feature_keys[NUM_CPU_FTR_KEYS];
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index f1e3824..c266227 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -308,36 +308,21 @@
static inline unsigned long copy_from_user(void *to,
const void __user *from, unsigned long n)
{
- unsigned long over;
-
- if (access_ok(VERIFY_READ, from, n)) {
+ if (likely(access_ok(VERIFY_READ, from, n))) {
check_object_size(to, n, false);
return __copy_tofrom_user((__force void __user *)to, from, n);
}
- if ((unsigned long)from < TASK_SIZE) {
- over = (unsigned long)from + n - TASK_SIZE;
- check_object_size(to, n - over, false);
- return __copy_tofrom_user((__force void __user *)to, from,
- n - over) + over;
- }
+ memset(to, 0, n);
return n;
}
static inline unsigned long copy_to_user(void __user *to,
const void *from, unsigned long n)
{
- unsigned long over;
-
if (access_ok(VERIFY_WRITE, to, n)) {
check_object_size(from, n, true);
return __copy_tofrom_user(to, (__force void __user *)from, n);
}
- if ((unsigned long)to < TASK_SIZE) {
- over = (unsigned long)to + n - TASK_SIZE;
- check_object_size(from, n - over, true);
- return __copy_tofrom_user(to, (__force void __user *)from,
- n - over) + over;
- }
return n;
}
@@ -434,10 +419,6 @@
might_fault();
if (likely(access_ok(VERIFY_WRITE, addr, size)))
return __clear_user(addr, size);
- if ((unsigned long)addr < TASK_SIZE) {
- unsigned long over = (unsigned long)addr + size - TASK_SIZE;
- return __clear_user(addr, size - over) + over;
- }
return size;
}
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 2265c63..bd739fe 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -411,7 +411,7 @@
*
* r13 - PACA
* cr3 - gt if waking up with partial/complete hypervisor state loss
- * cr4 - eq if waking up from complete hypervisor state loss.
+ * cr4 - gt or eq if waking up from complete hypervisor state loss.
*/
_GLOBAL(pnv_wakeup_tb_loss)
ld r1,PACAR1(r13)
@@ -453,7 +453,7 @@
* At this stage
* cr2 - eq if first thread to wakeup in core
* cr3- gt if waking up with partial/complete hypervisor state loss
- * cr4 - eq if waking up from complete hypervisor state loss.
+ * cr4 - gt or eq if waking up from complete hypervisor state loss.
*/
ori r15,r15,PNV_CORE_IDLE_LOCK_BIT
@@ -481,7 +481,7 @@
* If waking up from sleep, subcore state is not lost. Hence
* skip subcore state restore
*/
- bne cr4,subcore_state_restored
+ blt cr4,subcore_state_restored
/* Restore per-subcore state */
ld r4,_SDR1(r1)
@@ -526,7 +526,7 @@
* If waking up from sleep, per core state is not lost, skip to
* clear_lock.
*/
- bne cr4,clear_lock
+ blt cr4,clear_lock
/*
* First thread in the core to wake up and its waking up with
@@ -557,7 +557,7 @@
* If waking up from sleep, hypervisor state is not lost. Hence
* skip hypervisor state restore.
*/
- bne cr4,hypervisor_state_restored
+ blt cr4,hypervisor_state_restored
/* Waking up from winkle */
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index c16d790..bc0c91e 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2217,7 +2217,7 @@
pnv_pci_link_table_and_group(phb->hose->node, num,
tbl, &pe->table_group);
- pnv_pci_phb3_tce_invalidate_pe(pe);
+ pnv_pci_ioda2_tce_invalidate_pe(pe);
return 0;
}
@@ -2355,7 +2355,7 @@
if (ret)
pe_warn(pe, "Unmapping failed, ret = %ld\n", ret);
else
- pnv_pci_phb3_tce_invalidate_pe(pe);
+ pnv_pci_ioda2_tce_invalidate_pe(pe);
pnv_pci_unlink_table_and_group(table_group->tables[num], table_group);
@@ -3426,7 +3426,17 @@
}
}
- pnv_ioda_free_pe(pe);
+ /*
+ * The PE for root bus can be removed because of hotplug in EEH
+ * recovery for fenced PHB error. We need to mark the PE dead so
+ * that it can be populated again in PCI hot add path. The PE
+ * shouldn't be destroyed as it's the global reserved resource.
+ */
+ if (phb->ioda.root_pe_populated &&
+ phb->ioda.root_pe_idx == pe->pe_number)
+ phb->ioda.root_pe_populated = false;
+ else
+ pnv_ioda_free_pe(pe);
}
static void pnv_pci_release_device(struct pci_dev *pdev)
@@ -3442,7 +3452,17 @@
if (!pdn || pdn->pe_number == IODA_INVALID_PE)
return;
+ /*
+ * PCI hotplug can happen as part of EEH error recovery. The @pdn
+ * isn't removed and added afterwards in this scenario. We should
+ * set the PE number in @pdn to an invalid one. Otherwise, the PE's
+ * device count is decreased on removing devices while failing to
+ * be increased on adding devices. It leads to unbalanced PE's device
+ * count and eventually make normal PCI hotplug path broken.
+ */
pe = &phb->ioda.pe_array[pdn->pe_number];
+ pdn->pe_number = IODA_INVALID_PE;
+
WARN_ON(--pe->device_count < 0);
if (pe->device_count == 0)
pnv_ioda_release_pe(pe);
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 95aefdb..52d7c87 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -266,28 +266,28 @@
__chk_user_ptr(ptr); \
switch (sizeof(*(ptr))) { \
case 1: { \
- unsigned char __x; \
+ unsigned char __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 2: { \
- unsigned short __x; \
+ unsigned short __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 4: { \
- unsigned int __x; \
+ unsigned int __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 8: { \
- unsigned long long __x; \
+ unsigned long long __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index f142215..607ec91 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2231,9 +2231,10 @@
return -EINVAL;
current->thread.fpu.fpc = fpu->fpc;
if (MACHINE_HAS_VX)
- convert_fp_to_vx(current->thread.fpu.vxrs, (freg_t *)fpu->fprs);
+ convert_fp_to_vx((__vector128 *) vcpu->run->s.regs.vrs,
+ (freg_t *) fpu->fprs);
else
- memcpy(current->thread.fpu.fprs, &fpu->fprs, sizeof(fpu->fprs));
+ memcpy(vcpu->run->s.regs.fprs, &fpu->fprs, sizeof(fpu->fprs));
return 0;
}
@@ -2242,9 +2243,10 @@
/* make sure we have the latest values */
save_fpu_regs();
if (MACHINE_HAS_VX)
- convert_vx_to_fp((freg_t *)fpu->fprs, current->thread.fpu.vxrs);
+ convert_vx_to_fp((freg_t *) fpu->fprs,
+ (__vector128 *) vcpu->run->s.regs.vrs);
else
- memcpy(fpu->fprs, current->thread.fpu.fprs, sizeof(fpu->fprs));
+ memcpy(fpu->fprs, vcpu->run->s.regs.fprs, sizeof(fpu->fprs));
fpu->fpc = current->thread.fpu.fpc;
return 0;
}
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index c106488b..d8673e2 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -584,7 +584,7 @@
/* Validity 0x0044 will be checked by SIE */
if (rc)
goto unpin;
- scb_s->gvrd = hpa;
+ scb_s->riccbd = hpa;
}
return 0;
unpin:
diff --git a/arch/score/include/asm/uaccess.h b/arch/score/include/asm/uaccess.h
index 20a3591..01aec8c 100644
--- a/arch/score/include/asm/uaccess.h
+++ b/arch/score/include/asm/uaccess.h
@@ -163,7 +163,7 @@
__get_user_asm(val, "lw", ptr); \
break; \
case 8: \
- if ((copy_from_user((void *)&val, ptr, 8)) == 0) \
+ if (__copy_from_user((void *)&val, ptr, 8) == 0) \
__gu_err = 0; \
else \
__gu_err = -EFAULT; \
@@ -188,6 +188,8 @@
\
if (likely(access_ok(VERIFY_READ, __gu_ptr, size))) \
__get_user_common((x), size, __gu_ptr); \
+ else \
+ (x) = 0; \
\
__gu_err; \
})
@@ -201,6 +203,7 @@
"2:\n" \
".section .fixup,\"ax\"\n" \
"3:li %0, %4\n" \
+ "li %1, 0\n" \
"j 2b\n" \
".previous\n" \
".section __ex_table,\"a\"\n" \
@@ -298,35 +301,34 @@
static inline unsigned long
copy_from_user(void *to, const void *from, unsigned long len)
{
- unsigned long over;
+ unsigned long res = len;
- if (access_ok(VERIFY_READ, from, len))
- return __copy_tofrom_user(to, from, len);
+ if (likely(access_ok(VERIFY_READ, from, len)))
+ res = __copy_tofrom_user(to, from, len);
- if ((unsigned long)from < TASK_SIZE) {
- over = (unsigned long)from + len - TASK_SIZE;
- return __copy_tofrom_user(to, from, len - over) + over;
- }
- return len;
+ if (unlikely(res))
+ memset(to + (len - res), 0, res);
+
+ return res;
}
static inline unsigned long
copy_to_user(void *to, const void *from, unsigned long len)
{
- unsigned long over;
+ if (likely(access_ok(VERIFY_WRITE, to, len)))
+ len = __copy_tofrom_user(to, from, len);
- if (access_ok(VERIFY_WRITE, to, len))
- return __copy_tofrom_user(to, from, len);
-
- if ((unsigned long)to < TASK_SIZE) {
- over = (unsigned long)to + len - TASK_SIZE;
- return __copy_tofrom_user(to, from, len - over) + over;
- }
return len;
}
-#define __copy_from_user(to, from, len) \
- __copy_tofrom_user((to), (from), (len))
+static inline unsigned long
+__copy_from_user(void *to, const void *from, unsigned long len)
+{
+ unsigned long left = __copy_tofrom_user(to, from, len);
+ if (unlikely(left))
+ memset(to + (len - left), 0, left);
+ return left;
+}
#define __copy_to_user(to, from, len) \
__copy_tofrom_user((to), (from), (len))
@@ -340,17 +342,17 @@
static inline unsigned long
__copy_from_user_inatomic(void *to, const void *from, unsigned long len)
{
- return __copy_from_user(to, from, len);
+ return __copy_tofrom_user(to, from, len);
}
-#define __copy_in_user(to, from, len) __copy_from_user(to, from, len)
+#define __copy_in_user(to, from, len) __copy_tofrom_user(to, from, len)
static inline unsigned long
copy_in_user(void *to, const void *from, unsigned long len)
{
if (access_ok(VERIFY_READ, from, len) &&
access_ok(VERFITY_WRITE, to, len))
- return copy_from_user(to, from, len);
+ return __copy_tofrom_user(to, from, len);
}
/*
diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h
index a49635c..92ade79 100644
--- a/arch/sh/include/asm/uaccess.h
+++ b/arch/sh/include/asm/uaccess.h
@@ -151,7 +151,10 @@
__kernel_size_t __copy_size = (__kernel_size_t) n;
if (__copy_size && __access_ok(__copy_from, __copy_size))
- return __copy_user(to, from, __copy_size);
+ __copy_size = __copy_user(to, from, __copy_size);
+
+ if (unlikely(__copy_size))
+ memset(to + (n - __copy_size), 0, __copy_size);
return __copy_size;
}
diff --git a/arch/sh/include/asm/uaccess_64.h b/arch/sh/include/asm/uaccess_64.h
index c01376c..ca5073d 100644
--- a/arch/sh/include/asm/uaccess_64.h
+++ b/arch/sh/include/asm/uaccess_64.h
@@ -24,6 +24,7 @@
#define __get_user_size(x,ptr,size,retval) \
do { \
retval = 0; \
+ x = 0; \
switch (size) { \
case 1: \
retval = __get_user_asm_b((void *)&x, \
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index e722c51..ea55f86 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -266,8 +266,10 @@
if (n && __access_ok((unsigned long) from, n)) {
check_object_size(to, n, false);
return __copy_user((__force void __user *) to, from, n);
- } else
+ } else {
+ memset(to, 0, n);
return n;
+ }
}
static inline unsigned long __copy_from_user(void *to, const void __user *from, unsigned long n)
diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index ff574da..94dd4a3 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -1004,79 +1004,87 @@
return status;
}
+struct exit_boot_struct {
+ struct boot_params *boot_params;
+ struct efi_info *efi;
+ struct setup_data *e820ext;
+ __u32 e820ext_size;
+ bool is64;
+};
+
+static efi_status_t exit_boot_func(efi_system_table_t *sys_table_arg,
+ struct efi_boot_memmap *map,
+ void *priv)
+{
+ static bool first = true;
+ const char *signature;
+ __u32 nr_desc;
+ efi_status_t status;
+ struct exit_boot_struct *p = priv;
+
+ if (first) {
+ nr_desc = *map->buff_size / *map->desc_size;
+ if (nr_desc > ARRAY_SIZE(p->boot_params->e820_map)) {
+ u32 nr_e820ext = nr_desc -
+ ARRAY_SIZE(p->boot_params->e820_map);
+
+ status = alloc_e820ext(nr_e820ext, &p->e820ext,
+ &p->e820ext_size);
+ if (status != EFI_SUCCESS)
+ return status;
+ }
+ first = false;
+ }
+
+ signature = p->is64 ? EFI64_LOADER_SIGNATURE : EFI32_LOADER_SIGNATURE;
+ memcpy(&p->efi->efi_loader_signature, signature, sizeof(__u32));
+
+ p->efi->efi_systab = (unsigned long)sys_table_arg;
+ p->efi->efi_memdesc_size = *map->desc_size;
+ p->efi->efi_memdesc_version = *map->desc_ver;
+ p->efi->efi_memmap = (unsigned long)*map->map;
+ p->efi->efi_memmap_size = *map->map_size;
+
+#ifdef CONFIG_X86_64
+ p->efi->efi_systab_hi = (unsigned long)sys_table_arg >> 32;
+ p->efi->efi_memmap_hi = (unsigned long)*map->map >> 32;
+#endif
+
+ return EFI_SUCCESS;
+}
+
static efi_status_t exit_boot(struct boot_params *boot_params,
void *handle, bool is64)
{
- struct efi_info *efi = &boot_params->efi_info;
- unsigned long map_sz, key, desc_size;
+ unsigned long map_sz, key, desc_size, buff_size;
efi_memory_desc_t *mem_map;
struct setup_data *e820ext;
- const char *signature;
__u32 e820ext_size;
- __u32 nr_desc, prev_nr_desc;
efi_status_t status;
__u32 desc_version;
- bool called_exit = false;
- u8 nr_entries;
- int i;
+ struct efi_boot_memmap map;
+ struct exit_boot_struct priv;
- nr_desc = 0;
- e820ext = NULL;
- e820ext_size = 0;
+ map.map = &mem_map;
+ map.map_size = &map_sz;
+ map.desc_size = &desc_size;
+ map.desc_ver = &desc_version;
+ map.key_ptr = &key;
+ map.buff_size = &buff_size;
+ priv.boot_params = boot_params;
+ priv.efi = &boot_params->efi_info;
+ priv.e820ext = NULL;
+ priv.e820ext_size = 0;
+ priv.is64 = is64;
-get_map:
- status = efi_get_memory_map(sys_table, &mem_map, &map_sz, &desc_size,
- &desc_version, &key);
-
+ /* Might as well exit boot services now */
+ status = efi_exit_boot_services(sys_table, handle, &map, &priv,
+ exit_boot_func);
if (status != EFI_SUCCESS)
return status;
- prev_nr_desc = nr_desc;
- nr_desc = map_sz / desc_size;
- if (nr_desc > prev_nr_desc &&
- nr_desc > ARRAY_SIZE(boot_params->e820_map)) {
- u32 nr_e820ext = nr_desc - ARRAY_SIZE(boot_params->e820_map);
-
- status = alloc_e820ext(nr_e820ext, &e820ext, &e820ext_size);
- if (status != EFI_SUCCESS)
- goto free_mem_map;
-
- efi_call_early(free_pool, mem_map);
- goto get_map; /* Allocated memory, get map again */
- }
-
- signature = is64 ? EFI64_LOADER_SIGNATURE : EFI32_LOADER_SIGNATURE;
- memcpy(&efi->efi_loader_signature, signature, sizeof(__u32));
-
- efi->efi_systab = (unsigned long)sys_table;
- efi->efi_memdesc_size = desc_size;
- efi->efi_memdesc_version = desc_version;
- efi->efi_memmap = (unsigned long)mem_map;
- efi->efi_memmap_size = map_sz;
-
-#ifdef CONFIG_X86_64
- efi->efi_systab_hi = (unsigned long)sys_table >> 32;
- efi->efi_memmap_hi = (unsigned long)mem_map >> 32;
-#endif
-
- /* Might as well exit boot services now */
- status = efi_call_early(exit_boot_services, handle, key);
- if (status != EFI_SUCCESS) {
- /*
- * ExitBootServices() will fail if any of the event
- * handlers change the memory map. In which case, we
- * must be prepared to retry, but only once so that
- * we're guaranteed to exit on repeated failures instead
- * of spinning forever.
- */
- if (called_exit)
- goto free_mem_map;
-
- called_exit = true;
- efi_call_early(free_pool, mem_map);
- goto get_map;
- }
-
+ e820ext = priv.e820ext;
+ e820ext_size = priv.e820ext_size;
/* Historic? */
boot_params->alt_mem_k = 32 * 1024;
@@ -1085,10 +1093,6 @@
return status;
return EFI_SUCCESS;
-
-free_mem_map:
- efi_call_early(free_pool, mem_map);
- return status;
}
/*
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index e07a22b..f5f4b3f 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -119,8 +119,8 @@
{
[PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
[PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
- [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0080,
- [PERF_COUNT_HW_CACHE_MISSES] = 0x0081,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x077e,
[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
[PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */
diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index e6131d4..65577f0 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -29,6 +29,8 @@
#define COUNTER_SHIFT 16
+static HLIST_HEAD(uncore_unused_list);
+
struct amd_uncore {
int id;
int refcnt;
@@ -39,7 +41,7 @@
cpumask_t *active_mask;
struct pmu *pmu;
struct perf_event *events[MAX_COUNTERS];
- struct amd_uncore *free_when_cpu_online;
+ struct hlist_node node;
};
static struct amd_uncore * __percpu *amd_uncore_nb;
@@ -306,6 +308,7 @@
uncore_nb->msr_base = MSR_F15H_NB_PERF_CTL;
uncore_nb->active_mask = &amd_nb_active_mask;
uncore_nb->pmu = &amd_nb_pmu;
+ uncore_nb->id = -1;
*per_cpu_ptr(amd_uncore_nb, cpu) = uncore_nb;
}
@@ -319,6 +322,7 @@
uncore_l2->msr_base = MSR_F16H_L2I_PERF_CTL;
uncore_l2->active_mask = &amd_l2_active_mask;
uncore_l2->pmu = &amd_l2_pmu;
+ uncore_l2->id = -1;
*per_cpu_ptr(amd_uncore_l2, cpu) = uncore_l2;
}
@@ -348,7 +352,7 @@
continue;
if (this->id == that->id) {
- that->free_when_cpu_online = this;
+ hlist_add_head(&this->node, &uncore_unused_list);
this = that;
break;
}
@@ -388,13 +392,23 @@
return 0;
}
+static void uncore_clean_online(void)
+{
+ struct amd_uncore *uncore;
+ struct hlist_node *n;
+
+ hlist_for_each_entry_safe(uncore, n, &uncore_unused_list, node) {
+ hlist_del(&uncore->node);
+ kfree(uncore);
+ }
+}
+
static void uncore_online(unsigned int cpu,
struct amd_uncore * __percpu *uncores)
{
struct amd_uncore *uncore = *per_cpu_ptr(uncores, cpu);
- kfree(uncore->free_when_cpu_online);
- uncore->free_when_cpu_online = NULL;
+ uncore_clean_online();
if (cpu == uncore->cpu)
cpumask_set_cpu(cpu, uncore->active_mask);
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 0a6e393..bdcd651 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -31,7 +31,17 @@
struct bts_ctx {
struct perf_output_handle handle;
struct debug_store ds_back;
- int started;
+ int state;
+};
+
+/* BTS context states: */
+enum {
+ /* no ongoing AUX transactions */
+ BTS_STATE_STOPPED = 0,
+ /* AUX transaction is on, BTS tracing is disabled */
+ BTS_STATE_INACTIVE,
+ /* AUX transaction is on, BTS tracing is running */
+ BTS_STATE_ACTIVE,
};
static DEFINE_PER_CPU(struct bts_ctx, bts_ctx);
@@ -204,6 +214,15 @@
static int
bts_buffer_reset(struct bts_buffer *buf, struct perf_output_handle *handle);
+/*
+ * Ordering PMU callbacks wrt themselves and the PMI is done by means
+ * of bts::state, which:
+ * - is set when bts::handle::event is valid, that is, between
+ * perf_aux_output_begin() and perf_aux_output_end();
+ * - is zero otherwise;
+ * - is ordered against bts::handle::event with a compiler barrier.
+ */
+
static void __bts_event_start(struct perf_event *event)
{
struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
@@ -221,10 +240,13 @@
/*
* local barrier to make sure that ds configuration made it
- * before we enable BTS
+ * before we enable BTS and bts::state goes ACTIVE
*/
wmb();
+ /* INACTIVE/STOPPED -> ACTIVE */
+ WRITE_ONCE(bts->state, BTS_STATE_ACTIVE);
+
intel_pmu_enable_bts(config);
}
@@ -251,9 +273,6 @@
__bts_event_start(event);
- /* PMI handler: this counter is running and likely generating PMIs */
- ACCESS_ONCE(bts->started) = 1;
-
return;
fail_end_stop:
@@ -263,30 +282,34 @@
event->hw.state = PERF_HES_STOPPED;
}
-static void __bts_event_stop(struct perf_event *event)
+static void __bts_event_stop(struct perf_event *event, int state)
{
+ struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+
+ /* ACTIVE -> INACTIVE(PMI)/STOPPED(->stop()) */
+ WRITE_ONCE(bts->state, state);
+
/*
* No extra synchronization is mandated by the documentation to have
* BTS data stores globally visible.
*/
intel_pmu_disable_bts();
-
- if (event->hw.state & PERF_HES_STOPPED)
- return;
-
- ACCESS_ONCE(event->hw.state) |= PERF_HES_STOPPED;
}
static void bts_event_stop(struct perf_event *event, int flags)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
- struct bts_buffer *buf = perf_get_aux(&bts->handle);
+ struct bts_buffer *buf = NULL;
+ int state = READ_ONCE(bts->state);
- /* PMI handler: don't restart this counter */
- ACCESS_ONCE(bts->started) = 0;
+ if (state == BTS_STATE_ACTIVE)
+ __bts_event_stop(event, BTS_STATE_STOPPED);
- __bts_event_stop(event);
+ if (state != BTS_STATE_STOPPED)
+ buf = perf_get_aux(&bts->handle);
+
+ event->hw.state |= PERF_HES_STOPPED;
if (flags & PERF_EF_UPDATE) {
bts_update(bts);
@@ -296,6 +319,7 @@
bts->handle.head =
local_xchg(&buf->data_size,
buf->nr_pages << PAGE_SHIFT);
+
perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
!!local_xchg(&buf->lost, 0));
}
@@ -310,8 +334,20 @@
void intel_bts_enable_local(void)
{
struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+ int state = READ_ONCE(bts->state);
- if (bts->handle.event && bts->started)
+ /*
+ * Here we transition from INACTIVE to ACTIVE;
+ * if we instead are STOPPED from the interrupt handler,
+ * stay that way. Can't be ACTIVE here though.
+ */
+ if (WARN_ON_ONCE(state == BTS_STATE_ACTIVE))
+ return;
+
+ if (state == BTS_STATE_STOPPED)
+ return;
+
+ if (bts->handle.event)
__bts_event_start(bts->handle.event);
}
@@ -319,8 +355,15 @@
{
struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
+ /*
+ * Here we transition from ACTIVE to INACTIVE;
+ * do nothing for STOPPED or INACTIVE.
+ */
+ if (READ_ONCE(bts->state) != BTS_STATE_ACTIVE)
+ return;
+
if (bts->handle.event)
- __bts_event_stop(bts->handle.event);
+ __bts_event_stop(bts->handle.event, BTS_STATE_INACTIVE);
}
static int
@@ -335,8 +378,6 @@
return 0;
head = handle->head & ((buf->nr_pages << PAGE_SHIFT) - 1);
- if (WARN_ON_ONCE(head != local_read(&buf->head)))
- return -EINVAL;
phys = &buf->buf[buf->cur_buf];
space = phys->offset + phys->displacement + phys->size - head;
@@ -403,22 +444,37 @@
int intel_bts_interrupt(void)
{
+ struct debug_store *ds = this_cpu_ptr(&cpu_hw_events)->ds;
struct bts_ctx *bts = this_cpu_ptr(&bts_ctx);
struct perf_event *event = bts->handle.event;
struct bts_buffer *buf;
s64 old_head;
- int err;
+ int err = -ENOSPC, handled = 0;
- if (!event || !bts->started)
- return 0;
+ /*
+ * The only surefire way of knowing if this NMI is ours is by checking
+ * the write ptr against the PMI threshold.
+ */
+ if (ds->bts_index >= ds->bts_interrupt_threshold)
+ handled = 1;
+
+ /*
+ * this is wrapped in intel_bts_enable_local/intel_bts_disable_local,
+ * so we can only be INACTIVE or STOPPED
+ */
+ if (READ_ONCE(bts->state) == BTS_STATE_STOPPED)
+ return handled;
buf = perf_get_aux(&bts->handle);
+ if (!buf)
+ return handled;
+
/*
* Skip snapshot counters: they don't use the interrupt, but
* there's no other way of telling, because the pointer will
* keep moving
*/
- if (!buf || buf->snapshot)
+ if (buf->snapshot)
return 0;
old_head = local_read(&buf->head);
@@ -426,18 +482,27 @@
/* no new data */
if (old_head == local_read(&buf->head))
- return 0;
+ return handled;
perf_aux_output_end(&bts->handle, local_xchg(&buf->data_size, 0),
!!local_xchg(&buf->lost, 0));
buf = perf_aux_output_begin(&bts->handle, event);
- if (!buf)
- return 1;
+ if (buf)
+ err = bts_buffer_reset(buf, &bts->handle);
- err = bts_buffer_reset(buf, &bts->handle);
- if (err)
- perf_aux_output_end(&bts->handle, 0, false);
+ if (err) {
+ WRITE_ONCE(bts->state, BTS_STATE_STOPPED);
+
+ if (buf) {
+ /*
+ * BTS_STATE_STOPPED should be visible before
+ * cleared handle::event
+ */
+ barrier();
+ perf_aux_output_end(&bts->handle, 0, false);
+ }
+ }
return 1;
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 2cbde2f..4c9a79b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1730,9 +1730,11 @@
* disabled state if called consecutively.
*
* During consecutive calls, the same disable value will be written to related
- * registers, so the PMU state remains unchanged. hw.state in
- * intel_bts_disable_local will remain PERF_HES_STOPPED too in consecutive
- * calls.
+ * registers, so the PMU state remains unchanged.
+ *
+ * intel_bts events don't coexist with intel PMU's BTS events because of
+ * x86_add_exclusive(x86_lbr_exclusive_lbr); there's no need to keep them
+ * disabled around intel PMU's event batching etc, only inside the PMI handler.
*/
static void __intel_pmu_disable_all(void)
{
@@ -1742,8 +1744,6 @@
if (test_bit(INTEL_PMC_IDX_FIXED_BTS, cpuc->active_mask))
intel_pmu_disable_bts();
- else
- intel_bts_disable_local();
intel_pmu_pebs_disable_all();
}
@@ -1771,8 +1771,7 @@
return;
intel_pmu_enable_bts(event->hw.config);
- } else
- intel_bts_enable_local();
+ }
}
static void intel_pmu_enable_all(int added)
@@ -2073,6 +2072,7 @@
*/
if (!x86_pmu.late_ack)
apic_write(APIC_LVTPC, APIC_DM_NMI);
+ intel_bts_disable_local();
__intel_pmu_disable_all();
handled = intel_pmu_drain_bts_buffer();
handled += intel_bts_interrupt();
@@ -2172,6 +2172,7 @@
/* Only restore PMU state when it's active. See x86_pmu_disable(). */
if (cpuc->enabled)
__intel_pmu_enable_all(0, true);
+ intel_bts_enable_local();
/*
* Only unmask the NMI after the overflow counters
diff --git a/arch/x86/events/intel/cqm.c b/arch/x86/events/intel/cqm.c
index 783c49d..8f82b02 100644
--- a/arch/x86/events/intel/cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -458,6 +458,11 @@
static void init_mbm_sample(u32 rmid, u32 evt_type);
static void __intel_mbm_event_count(void *info);
+static bool is_cqm_event(int e)
+{
+ return (e == QOS_L3_OCCUP_EVENT_ID);
+}
+
static bool is_mbm_event(int e)
{
return (e >= QOS_MBM_TOTAL_EVENT_ID && e <= QOS_MBM_LOCAL_EVENT_ID);
@@ -1366,6 +1371,10 @@
(event->attr.config > QOS_MBM_LOCAL_EVENT_ID))
return -EINVAL;
+ if ((is_cqm_event(event->attr.config) && !cqm_enabled) ||
+ (is_mbm_event(event->attr.config) && !mbm_enabled))
+ return -EINVAL;
+
/* unsupported modes and filters */
if (event->attr.exclude_user ||
event->attr.exclude_kernel ||
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ce9f3f..9b983a4 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1274,18 +1274,18 @@
struct pebs_record_nhm *p = at;
u64 pebs_status;
- /* PEBS v3 has accurate status bits */
+ pebs_status = p->status & cpuc->pebs_enabled;
+ pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
+
+ /* PEBS v3 has more accurate status bits */
if (x86_pmu.intel_cap.pebs_format >= 3) {
- for_each_set_bit(bit, (unsigned long *)&p->status,
- MAX_PEBS_EVENTS)
+ for_each_set_bit(bit, (unsigned long *)&pebs_status,
+ x86_pmu.max_pebs_events)
counts[bit]++;
continue;
}
- pebs_status = p->status & cpuc->pebs_enabled;
- pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
-
/*
* On some CPUs the PEBS status can be zero when PEBS is
* racing with clearing of GLOBAL_STATUS.
@@ -1333,8 +1333,11 @@
continue;
event = cpuc->events[bit];
- WARN_ON_ONCE(!event);
- WARN_ON_ONCE(!event->attr.precise_ip);
+ if (WARN_ON_ONCE(!event))
+ continue;
+
+ if (WARN_ON_ONCE(!event->attr.precise_ip))
+ continue;
/* log dropped samples number */
if (error[bit])
diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 04bb5fb..861a7d9 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1074,6 +1074,11 @@
event->hw.addr_filters = NULL;
}
+static inline bool valid_kernel_ip(unsigned long ip)
+{
+ return virt_addr_valid(ip) && kernel_ip(ip);
+}
+
static int pt_event_addr_filters_validate(struct list_head *filters)
{
struct perf_addr_filter *filter;
@@ -1081,11 +1086,16 @@
list_for_each_entry(filter, filters, entry) {
/* PT doesn't support single address triggers */
- if (!filter->range)
+ if (!filter->range || !filter->size)
return -EOPNOTSUPP;
- if (!filter->inode && !kernel_ip(filter->offset))
- return -EINVAL;
+ if (!filter->inode) {
+ if (!valid_kernel_ip(filter->offset))
+ return -EINVAL;
+
+ if (!valid_kernel_ip(filter->offset + filter->size))
+ return -EINVAL;
+ }
if (++range > pt_cap_get(PT_CAP_num_address_ranges))
return -EOPNOTSUPP;
@@ -1111,7 +1121,7 @@
} else {
/* apply the offset */
msr_a = filter->offset + offs[range];
- msr_b = filter->size + msr_a;
+ msr_b = filter->size + msr_a - 1;
}
filters->filter[range].msr_a = msr_a;
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index e3af86f..2131c4c 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -433,7 +433,11 @@
#define __get_user_asm_ex(x, addr, itype, rtype, ltype) \
asm volatile("1: mov"itype" %1,%"rtype"0\n" \
"2:\n" \
- _ASM_EXTABLE_EX(1b, 2b) \
+ ".section .fixup,\"ax\"\n" \
+ "3:xor"itype" %"rtype"0,%"rtype"0\n" \
+ " jmp 2b\n" \
+ ".previous\n" \
+ _ASM_EXTABLE_EX(1b, 3b) \
: ltype(x) : "m" (__m(addr)))
#define __put_user_nocheck(x, ptr, size) \
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 50c95af..f3e9b2d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2093,7 +2093,6 @@
return -EINVAL;
}
- num_processors++;
if (apicid == boot_cpu_physical_apicid) {
/*
* x86_bios_cpu_apicid is required to have processors listed
@@ -2116,10 +2115,13 @@
pr_warning("APIC: Package limit reached. Processor %d/0x%x ignored.\n",
thiscpu, apicid);
+
disabled_cpus++;
return -ENOSPC;
}
+ num_processors++;
+
/*
* Validate version
*/
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index b816971..620ab06 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -54,6 +54,7 @@
*/
static u8 *container;
static size_t container_size;
+static bool ucode_builtin;
static u32 ucode_new_rev;
static u8 amd_ucode_patch[PATCH_MAX_SIZE];
@@ -281,18 +282,22 @@
void __init load_ucode_amd_bsp(unsigned int family)
{
struct cpio_data cp;
+ bool *builtin;
void **data;
size_t *size;
#ifdef CONFIG_X86_32
data = (void **)__pa_nodebug(&ucode_cpio.data);
size = (size_t *)__pa_nodebug(&ucode_cpio.size);
+ builtin = (bool *)__pa_nodebug(&ucode_builtin);
#else
data = &ucode_cpio.data;
size = &ucode_cpio.size;
+ builtin = &ucode_builtin;
#endif
- if (!load_builtin_amd_microcode(&cp, family))
+ *builtin = load_builtin_amd_microcode(&cp, family);
+ if (!*builtin)
cp = find_ucode_in_initrd();
if (!(cp.data && cp.size))
@@ -373,7 +378,8 @@
return;
/* Add CONFIG_RANDOMIZE_MEMORY offset. */
- cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+ if (!ucode_builtin)
+ cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
eax = cpuid_eax(0x00000001);
eq = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
@@ -439,7 +445,8 @@
container = cont_va;
/* Add CONFIG_RANDOMIZE_MEMORY offset. */
- container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+ if (!ucode_builtin)
+ container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
eax = cpuid_eax(0x00000001);
eax = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1d39bfb..3692249 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -289,6 +289,7 @@
put_cpu();
x86_platform.calibrate_tsc = kvm_get_tsc_khz;
+ x86_platform.calibrate_cpu = kvm_get_tsc_khz;
x86_platform.get_wallclock = kvm_get_wallclock;
x86_platform.set_wallclock = kvm_set_wallclock;
#ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 5f42d03..c7220ba 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -109,6 +109,7 @@
{
bool new_val, old_val;
struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
+ struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
union kvm_ioapic_redirect_entry *e;
e = &ioapic->redirtbl[RTC_GSI];
@@ -117,16 +118,17 @@
return;
new_val = kvm_apic_pending_eoi(vcpu, e->fields.vector);
- old_val = test_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map.map);
+ old_val = test_bit(vcpu->vcpu_id, dest_map->map);
if (new_val == old_val)
return;
if (new_val) {
- __set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map.map);
+ __set_bit(vcpu->vcpu_id, dest_map->map);
+ dest_map->vectors[vcpu->vcpu_id] = e->fields.vector;
ioapic->rtc_status.pending_eoi++;
} else {
- __clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map.map);
+ __clear_bit(vcpu->vcpu_id, dest_map->map);
ioapic->rtc_status.pending_eoi--;
rtc_status_pending_eoi_check_valid(ioapic);
}
diff --git a/arch/x86/kvm/pmu_amd.c b/arch/x86/kvm/pmu_amd.c
index 39b9112..cd94443 100644
--- a/arch/x86/kvm/pmu_amd.c
+++ b/arch/x86/kvm/pmu_amd.c
@@ -23,8 +23,8 @@
static struct kvm_event_hw_type_mapping amd_event_mapping[] = {
[0] = { 0x76, 0x00, PERF_COUNT_HW_CPU_CYCLES },
[1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS },
- [2] = { 0x80, 0x00, PERF_COUNT_HW_CACHE_REFERENCES },
- [3] = { 0x81, 0x00, PERF_COUNT_HW_CACHE_MISSES },
+ [2] = { 0x7d, 0x07, PERF_COUNT_HW_CACHE_REFERENCES },
+ [3] = { 0x7e, 0x07, PERF_COUNT_HW_CACHE_MISSES },
[4] = { 0xc2, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
[5] = { 0xc3, 0x00, PERF_COUNT_HW_BRANCH_MISSES },
[6] = { 0xd0, 0x00, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND },
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19f9f9e..699f872 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2743,16 +2743,16 @@
if (tsc_delta < 0)
mark_tsc_unstable("KVM discovered backwards TSC");
- if (kvm_lapic_hv_timer_in_use(vcpu) &&
- kvm_x86_ops->set_hv_timer(vcpu,
- kvm_get_lapic_tscdeadline_msr(vcpu)))
- kvm_lapic_switch_to_sw_timer(vcpu);
if (check_tsc_unstable()) {
u64 offset = kvm_compute_tsc_offset(vcpu,
vcpu->arch.last_guest_tsc);
kvm_x86_ops->write_tsc_offset(vcpu, offset);
vcpu->arch.tsc_catchup = 1;
}
+ if (kvm_lapic_hv_timer_in_use(vcpu) &&
+ kvm_x86_ops->set_hv_timer(vcpu,
+ kvm_get_lapic_tscdeadline_msr(vcpu)))
+ kvm_lapic_switch_to_sw_timer(vcpu);
/*
* On a host with synchronized TSC, there is no need to update
* kvmclock on vcpu->cpu migration
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 837ea36..6d52b94 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -553,15 +553,21 @@
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27B9, twinhead_reserve_killing_zone);
/*
- * Broadwell EP Home Agent BARs erroneously return non-zero values when read.
+ * Device [8086:2fc0]
+ * Erratum HSE43
+ * CONFIG_TDP_NOMINAL CSR Implemented at Incorrect Offset
+ * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v3-spec-update.html
*
- * See http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html
- * entry BDF2.
+ * Devices [8086:6f60,6fa0,6fc0]
+ * Erratum BDF2
+ * PCI BARs in the Home Agent Will Return Non-Zero Values During Enumeration
+ * http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-v4-spec-update.html
*/
-static void pci_bdwep_bar(struct pci_dev *dev)
+static void pci_invalid_bar(struct pci_dev *dev)
{
dev->non_compliant_bars = 1;
}
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_bdwep_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_bdwep_bar);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_bdwep_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6f60, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fa0, pci_invalid_bar);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, pci_invalid_bar);
diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index 3699995..a832426 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -233,6 +233,8 @@
return blkcipher_walk_done(desc, walk, -EINVAL);
}
+ bsize = min(walk->walk_blocksize, n);
+
walk->flags &= ~(BLKCIPHER_WALK_SLOW | BLKCIPHER_WALK_COPY |
BLKCIPHER_WALK_DIFF);
if (!scatterwalk_aligned(&walk->in, walk->alignmask) ||
@@ -245,7 +247,6 @@
}
}
- bsize = min(walk->walk_blocksize, n);
n = scatterwalk_clamp(&walk->in, n);
n = scatterwalk_clamp(&walk->out, n);
diff --git a/crypto/cryptd.c b/crypto/cryptd.c
index 77207b4..0c654e5 100644
--- a/crypto/cryptd.c
+++ b/crypto/cryptd.c
@@ -631,9 +631,14 @@
static int cryptd_hash_import(struct ahash_request *req, const void *in)
{
- struct cryptd_hash_request_ctx *rctx = ahash_request_ctx(req);
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct cryptd_hash_ctx *ctx = crypto_ahash_ctx(tfm);
+ struct shash_desc *desc = cryptd_shash_desc(req);
- return crypto_shash_import(&rctx->desc, in);
+ desc->tfm = ctx->child;
+ desc->flags = req->base.flags;
+
+ return crypto_shash_import(desc, in);
}
static int cryptd_create_hash(struct crypto_template *tmpl, struct rtattr **tb,
diff --git a/crypto/echainiv.c b/crypto/echainiv.c
index 1b01fe9..e3d889b 100644
--- a/crypto/echainiv.c
+++ b/crypto/echainiv.c
@@ -1,8 +1,8 @@
/*
* echainiv: Encrypted Chain IV Generator
*
- * This generator generates an IV based on a sequence number by xoring it
- * with a salt and then encrypting it with the same key as used to encrypt
+ * This generator generates an IV based on a sequence number by multiplying
+ * it with a salt and then encrypting it with the same key as used to encrypt
* the plain text. This algorithm requires that the block size be equal
* to the IV size. It is mainly useful for CBC.
*
@@ -24,81 +24,17 @@
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kernel.h>
-#include <linux/mm.h>
#include <linux/module.h>
-#include <linux/percpu.h>
-#include <linux/spinlock.h>
+#include <linux/slab.h>
#include <linux/string.h>
-#define MAX_IV_SIZE 16
-
-static DEFINE_PER_CPU(u32 [MAX_IV_SIZE / sizeof(u32)], echainiv_iv);
-
-/* We don't care if we get preempted and read/write IVs from the next CPU. */
-static void echainiv_read_iv(u8 *dst, unsigned size)
-{
- u32 *a = (u32 *)dst;
- u32 __percpu *b = echainiv_iv;
-
- for (; size >= 4; size -= 4) {
- *a++ = this_cpu_read(*b);
- b++;
- }
-}
-
-static void echainiv_write_iv(const u8 *src, unsigned size)
-{
- const u32 *a = (const u32 *)src;
- u32 __percpu *b = echainiv_iv;
-
- for (; size >= 4; size -= 4) {
- this_cpu_write(*b, *a);
- a++;
- b++;
- }
-}
-
-static void echainiv_encrypt_complete2(struct aead_request *req, int err)
-{
- struct aead_request *subreq = aead_request_ctx(req);
- struct crypto_aead *geniv;
- unsigned int ivsize;
-
- if (err == -EINPROGRESS)
- return;
-
- if (err)
- goto out;
-
- geniv = crypto_aead_reqtfm(req);
- ivsize = crypto_aead_ivsize(geniv);
-
- echainiv_write_iv(subreq->iv, ivsize);
-
- if (req->iv != subreq->iv)
- memcpy(req->iv, subreq->iv, ivsize);
-
-out:
- if (req->iv != subreq->iv)
- kzfree(subreq->iv);
-}
-
-static void echainiv_encrypt_complete(struct crypto_async_request *base,
- int err)
-{
- struct aead_request *req = base->data;
-
- echainiv_encrypt_complete2(req, err);
- aead_request_complete(req, err);
-}
-
static int echainiv_encrypt(struct aead_request *req)
{
struct crypto_aead *geniv = crypto_aead_reqtfm(req);
struct aead_geniv_ctx *ctx = crypto_aead_ctx(geniv);
struct aead_request *subreq = aead_request_ctx(req);
- crypto_completion_t compl;
- void *data;
+ __be64 nseqno;
+ u64 seqno;
u8 *info;
unsigned int ivsize = crypto_aead_ivsize(geniv);
int err;
@@ -108,8 +44,6 @@
aead_request_set_tfm(subreq, ctx->child);
- compl = echainiv_encrypt_complete;
- data = req;
info = req->iv;
if (req->src != req->dst) {
@@ -127,29 +61,30 @@
return err;
}
- if (unlikely(!IS_ALIGNED((unsigned long)info,
- crypto_aead_alignmask(geniv) + 1))) {
- info = kmalloc(ivsize, req->base.flags &
- CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL:
- GFP_ATOMIC);
- if (!info)
- return -ENOMEM;
-
- memcpy(info, req->iv, ivsize);
- }
-
- aead_request_set_callback(subreq, req->base.flags, compl, data);
+ aead_request_set_callback(subreq, req->base.flags,
+ req->base.complete, req->base.data);
aead_request_set_crypt(subreq, req->dst, req->dst,
req->cryptlen, info);
aead_request_set_ad(subreq, req->assoclen);
- crypto_xor(info, ctx->salt, ivsize);
- scatterwalk_map_and_copy(info, req->dst, req->assoclen, ivsize, 1);
- echainiv_read_iv(info, ivsize);
+ memcpy(&nseqno, info + ivsize - 8, 8);
+ seqno = be64_to_cpu(nseqno);
+ memset(info, 0, ivsize);
- err = crypto_aead_encrypt(subreq);
- echainiv_encrypt_complete2(req, err);
- return err;
+ scatterwalk_map_and_copy(info, req->dst, req->assoclen, ivsize, 1);
+
+ do {
+ u64 a;
+
+ memcpy(&a, ctx->salt + ivsize - 8, 8);
+
+ a |= 1;
+ a *= seqno;
+
+ memcpy(info + ivsize - 8, &a, 8);
+ } while ((ivsize -= 8));
+
+ return crypto_aead_encrypt(subreq);
}
static int echainiv_decrypt(struct aead_request *req)
@@ -196,8 +131,7 @@
alg = crypto_spawn_aead_alg(spawn);
err = -EINVAL;
- if (inst->alg.ivsize & (sizeof(u32) - 1) ||
- inst->alg.ivsize > MAX_IV_SIZE)
+ if (inst->alg.ivsize & (sizeof(u64) - 1) || !inst->alg.ivsize)
goto free_inst;
inst->alg.encrypt = echainiv_encrypt;
@@ -206,7 +140,6 @@
inst->alg.init = aead_init_geniv;
inst->alg.exit = aead_exit_geniv;
- inst->alg.base.cra_alignmask |= __alignof__(u32) - 1;
inst->alg.base.cra_ctxsize = sizeof(struct aead_geniv_ctx);
inst->alg.base.cra_ctxsize += inst->alg.ivsize;
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 17995fa..82a081e 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -419,7 +419,7 @@
struct device *parent = NULL;
int retval;
- trace_rpm_suspend(dev, rpmflags);
+ trace_rpm_suspend_rcuidle(dev, rpmflags);
repeat:
retval = rpm_check_suspend_allowed(dev);
@@ -549,7 +549,7 @@
}
out:
- trace_rpm_return_int(dev, _THIS_IP_, retval);
+ trace_rpm_return_int_rcuidle(dev, _THIS_IP_, retval);
return retval;
diff --git a/drivers/bluetooth/Kconfig b/drivers/bluetooth/Kconfig
index cf50fd2..3cc9bff 100644
--- a/drivers/bluetooth/Kconfig
+++ b/drivers/bluetooth/Kconfig
@@ -180,6 +180,17 @@
Say Y here to compile support for Intel AG6XX protocol.
+config BT_HCIUART_MRVL
+ bool "Marvell protocol support"
+ depends on BT_HCIUART
+ select BT_HCIUART_H4
+ help
+ Marvell is serial protocol for communication between Bluetooth
+ device and host. This protocol is required for most Marvell Bluetooth
+ devices with UART interface.
+
+ Say Y here to compile support for HCI MRVL protocol.
+
config BT_HCIBCM203X
tristate "HCI BCM203x USB driver"
depends on USB
@@ -331,4 +342,16 @@
Say Y here to compile support for Texas Instrument's WiLink7 driver
into the kernel or say M to compile it as module (btwilink).
+config BT_QCOMSMD
+ tristate "Qualcomm SMD based HCI support"
+ depends on QCOM_SMD && QCOM_WCNSS_CTRL
+ select BT_QCA
+ help
+ Qualcomm SMD based HCI driver.
+ This driver is used to bridge HCI data onto the shared memory
+ channels to the WCNSS core.
+
+ Say Y here to compile support for HCI over Qualcomm SMD into the
+ kernel or say M to compile as a module.
+
endmenu
diff --git a/drivers/bluetooth/Makefile b/drivers/bluetooth/Makefile
index 9c18939..b1fc29a 100644
--- a/drivers/bluetooth/Makefile
+++ b/drivers/bluetooth/Makefile
@@ -20,6 +20,7 @@
obj-$(CONFIG_BT_MRVL) += btmrvl.o
obj-$(CONFIG_BT_MRVL_SDIO) += btmrvl_sdio.o
obj-$(CONFIG_BT_WILINK) += btwilink.o
+obj-$(CONFIG_BT_QCOMSMD) += btqcomsmd.o
obj-$(CONFIG_BT_BCM) += btbcm.o
obj-$(CONFIG_BT_RTL) += btrtl.o
obj-$(CONFIG_BT_QCA) += btqca.o
@@ -37,6 +38,7 @@
hci_uart-$(CONFIG_BT_HCIUART_BCM) += hci_bcm.o
hci_uart-$(CONFIG_BT_HCIUART_QCA) += hci_qca.o
hci_uart-$(CONFIG_BT_HCIUART_AG6XX) += hci_ag6xx.o
+hci_uart-$(CONFIG_BT_HCIUART_MRVL) += hci_mrvl.o
hci_uart-objs := $(hci_uart-y)
ccflags-y += -D__CHECK_ENDIAN__
diff --git a/drivers/bluetooth/bcm203x.c b/drivers/bluetooth/bcm203x.c
index 5b0ef7b..5ce6d41 100644
--- a/drivers/bluetooth/bcm203x.c
+++ b/drivers/bluetooth/bcm203x.c
@@ -185,10 +185,8 @@
data->state = BCM203X_LOAD_MINIDRV;
data->urb = usb_alloc_urb(0, GFP_KERNEL);
- if (!data->urb) {
- BT_ERR("Can't allocate URB");
+ if (!data->urb)
return -ENOMEM;
- }
if (request_firmware(&firmware, "BCM2033-MD.hex", &udev->dev) < 0) {
BT_ERR("Mini driver request failed");
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index 4a62081..28afd5d 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -55,8 +55,8 @@
}
edl = (struct edl_event_hdr *)(skb->data);
- if (!edl || !edl->data) {
- BT_ERR("%s: TLV with no header or no data", hdev->name);
+ if (!edl) {
+ BT_ERR("%s: TLV with no header", hdev->name);
err = -EILSEQ;
goto out;
}
@@ -224,8 +224,8 @@
}
edl = (struct edl_event_hdr *)(skb->data);
- if (!edl || !edl->data) {
- BT_ERR("%s: TLV with no header or no data", hdev->name);
+ if (!edl) {
+ BT_ERR("%s: TLV with no header", hdev->name);
err = -EILSEQ;
goto out;
}
diff --git a/drivers/bluetooth/btqcomsmd.c b/drivers/bluetooth/btqcomsmd.c
new file mode 100644
index 0000000..08c2c93
--- /dev/null
+++ b/drivers/bluetooth/btqcomsmd.c
@@ -0,0 +1,182 @@
+/*
+ * Copyright (c) 2016, Linaro Ltd.
+ * Copyright (c) 2015, Sony Mobile Communications Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/soc/qcom/smd.h>
+#include <linux/soc/qcom/wcnss_ctrl.h>
+#include <linux/platform_device.h>
+
+#include <net/bluetooth/bluetooth.h>
+#include <net/bluetooth/hci_core.h>
+
+#include "btqca.h"
+
+struct btqcomsmd {
+ struct hci_dev *hdev;
+
+ struct qcom_smd_channel *acl_channel;
+ struct qcom_smd_channel *cmd_channel;
+};
+
+static int btqcomsmd_recv(struct hci_dev *hdev, unsigned int type,
+ const void *data, size_t count)
+{
+ struct sk_buff *skb;
+
+ /* Use GFP_ATOMIC as we're in IRQ context */
+ skb = bt_skb_alloc(count, GFP_ATOMIC);
+ if (!skb) {
+ hdev->stat.err_rx++;
+ return -ENOMEM;
+ }
+
+ hci_skb_pkt_type(skb) = type;
+ memcpy(skb_put(skb, count), data, count);
+
+ return hci_recv_frame(hdev, skb);
+}
+
+static int btqcomsmd_acl_callback(struct qcom_smd_channel *channel,
+ const void *data, size_t count)
+{
+ struct btqcomsmd *btq = qcom_smd_get_drvdata(channel);
+
+ btq->hdev->stat.byte_rx += count;
+ return btqcomsmd_recv(btq->hdev, HCI_ACLDATA_PKT, data, count);
+}
+
+static int btqcomsmd_cmd_callback(struct qcom_smd_channel *channel,
+ const void *data, size_t count)
+{
+ struct btqcomsmd *btq = qcom_smd_get_drvdata(channel);
+
+ return btqcomsmd_recv(btq->hdev, HCI_EVENT_PKT, data, count);
+}
+
+static int btqcomsmd_send(struct hci_dev *hdev, struct sk_buff *skb)
+{
+ struct btqcomsmd *btq = hci_get_drvdata(hdev);
+ int ret;
+
+ switch (hci_skb_pkt_type(skb)) {
+ case HCI_ACLDATA_PKT:
+ ret = qcom_smd_send(btq->acl_channel, skb->data, skb->len);
+ hdev->stat.acl_tx++;
+ hdev->stat.byte_tx += skb->len;
+ break;
+ case HCI_COMMAND_PKT:
+ ret = qcom_smd_send(btq->cmd_channel, skb->data, skb->len);
+ hdev->stat.cmd_tx++;
+ break;
+ default:
+ ret = -EILSEQ;
+ break;
+ }
+
+ kfree_skb(skb);
+
+ return ret;
+}
+
+static int btqcomsmd_open(struct hci_dev *hdev)
+{
+ return 0;
+}
+
+static int btqcomsmd_close(struct hci_dev *hdev)
+{
+ return 0;
+}
+
+static int btqcomsmd_probe(struct platform_device *pdev)
+{
+ struct btqcomsmd *btq;
+ struct hci_dev *hdev;
+ void *wcnss;
+ int ret;
+
+ btq = devm_kzalloc(&pdev->dev, sizeof(*btq), GFP_KERNEL);
+ if (!btq)
+ return -ENOMEM;
+
+ wcnss = dev_get_drvdata(pdev->dev.parent);
+
+ btq->acl_channel = qcom_wcnss_open_channel(wcnss, "APPS_RIVA_BT_ACL",
+ btqcomsmd_acl_callback);
+ if (IS_ERR(btq->acl_channel))
+ return PTR_ERR(btq->acl_channel);
+
+ btq->cmd_channel = qcom_wcnss_open_channel(wcnss, "APPS_RIVA_BT_CMD",
+ btqcomsmd_cmd_callback);
+ if (IS_ERR(btq->cmd_channel))
+ return PTR_ERR(btq->cmd_channel);
+
+ qcom_smd_set_drvdata(btq->acl_channel, btq);
+ qcom_smd_set_drvdata(btq->cmd_channel, btq);
+
+ hdev = hci_alloc_dev();
+ if (!hdev)
+ return -ENOMEM;
+
+ hci_set_drvdata(hdev, btq);
+ btq->hdev = hdev;
+ SET_HCIDEV_DEV(hdev, &pdev->dev);
+
+ hdev->bus = HCI_SMD;
+ hdev->open = btqcomsmd_open;
+ hdev->close = btqcomsmd_close;
+ hdev->send = btqcomsmd_send;
+ hdev->set_bdaddr = qca_set_bdaddr_rome;
+
+ ret = hci_register_dev(hdev);
+ if (ret < 0) {
+ hci_free_dev(hdev);
+ return ret;
+ }
+
+ platform_set_drvdata(pdev, btq);
+
+ return 0;
+}
+
+static int btqcomsmd_remove(struct platform_device *pdev)
+{
+ struct btqcomsmd *btq = platform_get_drvdata(pdev);
+
+ hci_unregister_dev(btq->hdev);
+ hci_free_dev(btq->hdev);
+
+ return 0;
+}
+
+static const struct of_device_id btqcomsmd_of_match[] = {
+ { .compatible = "qcom,wcnss-bt", },
+ { },
+};
+
+static struct platform_driver btqcomsmd_driver = {
+ .probe = btqcomsmd_probe,
+ .remove = btqcomsmd_remove,
+ .driver = {
+ .name = "btqcomsmd",
+ .of_match_table = btqcomsmd_of_match,
+ },
+};
+
+module_platform_driver(btqcomsmd_driver);
+
+MODULE_AUTHOR("Bjorn Andersson <bjorn.andersson@sonymobile.com>");
+MODULE_DESCRIPTION("Qualcomm SMD HCI driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/bluetooth/btrtl.c b/drivers/bluetooth/btrtl.c
index 8428893..fc9b257 100644
--- a/drivers/bluetooth/btrtl.c
+++ b/drivers/bluetooth/btrtl.c
@@ -33,6 +33,7 @@
#define RTL_ROM_LMP_8723B 0x8723
#define RTL_ROM_LMP_8821A 0x8821
#define RTL_ROM_LMP_8761A 0x8761
+#define RTL_ROM_LMP_8822B 0x8822
static int rtl_read_rom_version(struct hci_dev *hdev, u8 *version)
{
@@ -78,11 +79,15 @@
const unsigned char *patch_length_base, *patch_offset_base;
u32 patch_offset = 0;
u16 patch_length, num_patches;
- const u16 project_id_to_lmp_subver[] = {
- RTL_ROM_LMP_8723A,
- RTL_ROM_LMP_8723B,
- RTL_ROM_LMP_8821A,
- RTL_ROM_LMP_8761A
+ static const struct {
+ __u16 lmp_subver;
+ __u8 id;
+ } project_id_to_lmp_subver[] = {
+ { RTL_ROM_LMP_8723A, 0 },
+ { RTL_ROM_LMP_8723B, 1 },
+ { RTL_ROM_LMP_8821A, 2 },
+ { RTL_ROM_LMP_8761A, 3 },
+ { RTL_ROM_LMP_8822B, 8 },
};
ret = rtl_read_rom_version(hdev, &rom_version);
@@ -134,14 +139,20 @@
return -EINVAL;
}
- if (project_id >= ARRAY_SIZE(project_id_to_lmp_subver)) {
+ /* Find project_id in table */
+ for (i = 0; i < ARRAY_SIZE(project_id_to_lmp_subver); i++) {
+ if (project_id == project_id_to_lmp_subver[i].id)
+ break;
+ }
+
+ if (i >= ARRAY_SIZE(project_id_to_lmp_subver)) {
BT_ERR("%s: unknown project id %d", hdev->name, project_id);
return -EINVAL;
}
- if (lmp_subver != project_id_to_lmp_subver[project_id]) {
+ if (lmp_subver != project_id_to_lmp_subver[i].lmp_subver) {
BT_ERR("%s: firmware is for %x but this is a %x", hdev->name,
- project_id_to_lmp_subver[project_id], lmp_subver);
+ project_id_to_lmp_subver[i].lmp_subver, lmp_subver);
return -EINVAL;
}
@@ -257,6 +268,26 @@
return ret;
}
+static int rtl_load_config(struct hci_dev *hdev, const char *name, u8 **buff)
+{
+ const struct firmware *fw;
+ int ret;
+
+ BT_INFO("%s: rtl: loading %s", hdev->name, name);
+ ret = request_firmware(&fw, name, &hdev->dev);
+ if (ret < 0) {
+ BT_ERR("%s: Failed to load %s", hdev->name, name);
+ return ret;
+ }
+
+ ret = fw->size;
+ *buff = kmemdup(fw->data, ret, GFP_KERNEL);
+
+ release_firmware(fw);
+
+ return ret;
+}
+
static int btrtl_setup_rtl8723a(struct hci_dev *hdev)
{
const struct firmware *fw;
@@ -296,25 +327,74 @@
unsigned char *fw_data = NULL;
const struct firmware *fw;
int ret;
+ int cfg_sz;
+ u8 *cfg_buff = NULL;
+ u8 *tbuff;
+ char *cfg_name = NULL;
+
+ switch (lmp_subver) {
+ case RTL_ROM_LMP_8723B:
+ cfg_name = "rtl_bt/rtl8723b_config.bin";
+ break;
+ case RTL_ROM_LMP_8821A:
+ cfg_name = "rtl_bt/rtl8821a_config.bin";
+ break;
+ case RTL_ROM_LMP_8761A:
+ cfg_name = "rtl_bt/rtl8761a_config.bin";
+ break;
+ case RTL_ROM_LMP_8822B:
+ cfg_name = "rtl_bt/rtl8822b_config.bin";
+ break;
+ default:
+ BT_ERR("%s: rtl: no config according to lmp_subver %04x",
+ hdev->name, lmp_subver);
+ break;
+ }
+
+ if (cfg_name) {
+ cfg_sz = rtl_load_config(hdev, cfg_name, &cfg_buff);
+ if (cfg_sz < 0)
+ cfg_sz = 0;
+ } else
+ cfg_sz = 0;
BT_INFO("%s: rtl: loading %s", hdev->name, fw_name);
ret = request_firmware(&fw, fw_name, &hdev->dev);
if (ret < 0) {
BT_ERR("%s: Failed to load %s", hdev->name, fw_name);
- return ret;
+ goto err_req_fw;
}
ret = rtl8723b_parse_firmware(hdev, lmp_subver, fw, &fw_data);
if (ret < 0)
goto out;
+ if (cfg_sz) {
+ tbuff = kzalloc(ret + cfg_sz, GFP_KERNEL);
+ if (!tbuff) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ memcpy(tbuff, fw_data, ret);
+ kfree(fw_data);
+
+ memcpy(tbuff + ret, cfg_buff, cfg_sz);
+ ret += cfg_sz;
+
+ fw_data = tbuff;
+ }
+
+ BT_INFO("cfg_sz %d, total size %d", cfg_sz, ret);
+
ret = rtl_download_firmware(hdev, fw_data, ret);
- kfree(fw_data);
- if (ret < 0)
- goto out;
out:
release_firmware(fw);
+ kfree(fw_data);
+err_req_fw:
+ if (cfg_sz)
+ kfree(cfg_buff);
return ret;
}
@@ -377,6 +457,9 @@
case RTL_ROM_LMP_8761A:
return btrtl_setup_rtl8723b(hdev, lmp_subver,
"rtl_bt/rtl8761a_fw.bin");
+ case RTL_ROM_LMP_8822B:
+ return btrtl_setup_rtl8723b(hdev, lmp_subver,
+ "rtl_bt/rtl8822b_fw.bin");
default:
BT_INFO("rtl: assuming no firmware upload needed.");
return 0;
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 811f9b9..6bd63b8 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -62,6 +62,7 @@
#define BTUSB_REALTEK 0x20000
#define BTUSB_BCM2045 0x40000
#define BTUSB_IFNUM_2 0x80000
+#define BTUSB_CW6622 0x100000
static const struct usb_device_id btusb_table[] = {
/* Generic Bluetooth USB device */
@@ -248,9 +249,11 @@
/* QCA ROME chipset */
{ USB_DEVICE(0x0cf3, 0xe007), .driver_info = BTUSB_QCA_ROME },
+ { USB_DEVICE(0x0cf3, 0xe009), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0cf3, 0xe300), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0cf3, 0xe360), .driver_info = BTUSB_QCA_ROME },
{ USB_DEVICE(0x0489, 0xe092), .driver_info = BTUSB_QCA_ROME },
+ { USB_DEVICE(0x04ca, 0x3011), .driver_info = BTUSB_QCA_ROME },
/* Broadcom BCM2035 */
{ USB_DEVICE(0x0a5c, 0x2009), .driver_info = BTUSB_BCM92035 },
@@ -290,7 +293,8 @@
{ USB_DEVICE(0x0400, 0x080a), .driver_info = BTUSB_BROKEN_ISOC },
/* CONWISE Technology based adapters with buggy SCO support */
- { USB_DEVICE(0x0e5e, 0x6622), .driver_info = BTUSB_BROKEN_ISOC },
+ { USB_DEVICE(0x0e5e, 0x6622),
+ .driver_info = BTUSB_BROKEN_ISOC | BTUSB_CW6622},
/* Roper Class 1 Bluetooth Dongle (Silicon Wave based) */
{ USB_DEVICE(0x1310, 0x0001), .driver_info = BTUSB_SWAVE },
@@ -2221,9 +2225,8 @@
err = wait_on_bit_timeout(&data->flags, BTUSB_DOWNLOADING,
TASK_INTERRUPTIBLE,
msecs_to_jiffies(5000));
- if (err == 1) {
+ if (err == -EINTR) {
BT_ERR("%s: Firmware loading interrupted", hdev->name);
- err = -EINTR;
goto done;
}
@@ -2275,7 +2278,7 @@
TASK_INTERRUPTIBLE,
msecs_to_jiffies(1000));
- if (err == 1) {
+ if (err == -EINTR) {
BT_ERR("%s: Device boot interrupted", hdev->name);
return -EINTR;
}
@@ -2845,6 +2848,9 @@
hdev->send = btusb_send_frame;
hdev->notify = btusb_notify;
+ if (id->driver_info & BTUSB_CW6622)
+ set_bit(HCI_QUIRK_BROKEN_STORED_LINK_KEY, &hdev->quirks);
+
if (id->driver_info & BTUSB_BCM2045)
set_bit(HCI_QUIRK_BROKEN_STORED_LINK_KEY, &hdev->quirks);
diff --git a/drivers/bluetooth/btwilink.c b/drivers/bluetooth/btwilink.c
index 485281b..ef51c9c 100644
--- a/drivers/bluetooth/btwilink.c
+++ b/drivers/bluetooth/btwilink.c
@@ -245,6 +245,7 @@
{
struct ti_st *hst;
long len;
+ int pkt_type;
hst = hci_get_drvdata(hdev);
@@ -258,6 +259,7 @@
* Freeing skb memory is taken care in shared transport layer,
* so don't free skb memory here.
*/
+ pkt_type = hci_skb_pkt_type(skb);
len = hst->st_write(skb);
if (len < 0) {
kfree_skb(skb);
@@ -268,7 +270,7 @@
/* ST accepted our skb. So, Go ahead and do rest */
hdev->stat.byte_tx += len;
- ti_st_tx_complete(hst, hci_skb_pkt_type(skb));
+ ti_st_tx_complete(hst, pkt_type);
return 0;
}
diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c
index 1c97eda..5ccb90e 100644
--- a/drivers/bluetooth/hci_bcm.c
+++ b/drivers/bluetooth/hci_bcm.c
@@ -798,7 +798,7 @@
static const struct hci_uart_proto bcm_proto = {
.id = HCI_UART_BCM,
- .name = "BCM",
+ .name = "Broadcom",
.manufacturer = 15,
.init_speed = 115200,
.oper_speed = 4000000,
diff --git a/drivers/bluetooth/hci_bcsp.c b/drivers/bluetooth/hci_bcsp.c
index d7d23ce..a2c921f 100644
--- a/drivers/bluetooth/hci_bcsp.c
+++ b/drivers/bluetooth/hci_bcsp.c
@@ -90,7 +90,8 @@
/* ---- BCSP CRC calculation ---- */
/* Table for calculating CRC for polynomial 0x1021, LSB processed first,
-initial value 0xffff, bits shifted in reverse order. */
+ * initial value 0xffff, bits shifted in reverse order.
+ */
static const u16 crc_table[] = {
0x0000, 0x1081, 0x2102, 0x3183,
@@ -174,7 +175,7 @@
}
static struct sk_buff *bcsp_prepare_pkt(struct bcsp_struct *bcsp, u8 *data,
- int len, int pkt_type)
+ int len, int pkt_type)
{
struct sk_buff *nskb;
u8 hdr[4], chan;
@@ -213,6 +214,7 @@
/* Vendor specific commands */
if (hci_opcode_ogf(__le16_to_cpu(opcode)) == 0x3f) {
u8 desc = *(data + HCI_COMMAND_HDR_SIZE);
+
if ((desc & 0xf0) == 0xc0) {
data += HCI_COMMAND_HDR_SIZE + 1;
len -= HCI_COMMAND_HDR_SIZE + 1;
@@ -271,8 +273,8 @@
/* Put CRC */
if (bcsp->use_crc) {
bcsp_txmsg_crc = bitrev16(bcsp_txmsg_crc);
- bcsp_slip_one_byte(nskb, (u8) ((bcsp_txmsg_crc >> 8) & 0x00ff));
- bcsp_slip_one_byte(nskb, (u8) (bcsp_txmsg_crc & 0x00ff));
+ bcsp_slip_one_byte(nskb, (u8)((bcsp_txmsg_crc >> 8) & 0x00ff));
+ bcsp_slip_one_byte(nskb, (u8)(bcsp_txmsg_crc & 0x00ff));
}
bcsp_slip_msgdelim(nskb);
@@ -287,7 +289,8 @@
struct sk_buff *skb;
/* First of all, check for unreliable messages in the queue,
- since they have priority */
+ * since they have priority
+ */
skb = skb_dequeue(&bcsp->unrel);
if (skb != NULL) {
@@ -414,7 +417,7 @@
/* spot "conf" pkts and reply with a "conf rsp" pkt */
if (bcsp->rx_skb->data[1] >> 4 == 4 && bcsp->rx_skb->data[2] == 0 &&
- !memcmp(&bcsp->rx_skb->data[4], conf_pkt, 4)) {
+ !memcmp(&bcsp->rx_skb->data[4], conf_pkt, 4)) {
struct sk_buff *nskb = alloc_skb(4, GFP_ATOMIC);
BT_DBG("Found a LE conf pkt");
@@ -428,7 +431,7 @@
}
/* Spot "sync" pkts. If we find one...disaster! */
else if (bcsp->rx_skb->data[1] >> 4 == 4 && bcsp->rx_skb->data[2] == 0 &&
- !memcmp(&bcsp->rx_skb->data[4], sync_pkt, 4)) {
+ !memcmp(&bcsp->rx_skb->data[4], sync_pkt, 4)) {
BT_ERR("Found a LE sync pkt, card has reset");
}
}
@@ -446,7 +449,7 @@
default:
memcpy(skb_put(bcsp->rx_skb, 1), &byte, 1);
if ((bcsp->rx_skb->data[0] & 0x40) != 0 &&
- bcsp->rx_state != BCSP_W4_CRC)
+ bcsp->rx_state != BCSP_W4_CRC)
bcsp_crc_update(&bcsp->message_crc, byte);
bcsp->rx_count--;
}
@@ -457,7 +460,7 @@
case 0xdc:
memcpy(skb_put(bcsp->rx_skb, 1), &c0, 1);
if ((bcsp->rx_skb->data[0] & 0x40) != 0 &&
- bcsp->rx_state != BCSP_W4_CRC)
+ bcsp->rx_state != BCSP_W4_CRC)
bcsp_crc_update(&bcsp->message_crc, 0xc0);
bcsp->rx_esc_state = BCSP_ESCSTATE_NOESC;
bcsp->rx_count--;
@@ -466,7 +469,7 @@
case 0xdd:
memcpy(skb_put(bcsp->rx_skb, 1), &db, 1);
if ((bcsp->rx_skb->data[0] & 0x40) != 0 &&
- bcsp->rx_state != BCSP_W4_CRC)
+ bcsp->rx_state != BCSP_W4_CRC)
bcsp_crc_update(&bcsp->message_crc, 0xdb);
bcsp->rx_esc_state = BCSP_ESCSTATE_NOESC;
bcsp->rx_count--;
@@ -485,13 +488,28 @@
static void bcsp_complete_rx_pkt(struct hci_uart *hu)
{
struct bcsp_struct *bcsp = hu->priv;
- int pass_up;
+ int pass_up = 0;
if (bcsp->rx_skb->data[0] & 0x80) { /* reliable pkt */
BT_DBG("Received seqno %u from card", bcsp->rxseq_txack);
- bcsp->rxseq_txack++;
- bcsp->rxseq_txack %= 0x8;
- bcsp->txack_req = 1;
+
+ /* check the rx sequence number is as expected */
+ if ((bcsp->rx_skb->data[0] & 0x07) == bcsp->rxseq_txack) {
+ bcsp->rxseq_txack++;
+ bcsp->rxseq_txack %= 0x8;
+ } else {
+ /* handle re-transmitted packet or
+ * when packet was missed
+ */
+ BT_ERR("Out-of-order packet arrived, got %u expected %u",
+ bcsp->rx_skb->data[0] & 0x07, bcsp->rxseq_txack);
+
+ /* do not process out-of-order packet payload */
+ pass_up = 2;
+ }
+
+ /* send current txack value to all received reliable packets */
+ bcsp->txack_req = 1;
/* If needed, transmit an ack pkt */
hci_uart_tx_wakeup(hu);
@@ -500,26 +518,33 @@
bcsp->rxack = (bcsp->rx_skb->data[0] >> 3) & 0x07;
BT_DBG("Request for pkt %u from card", bcsp->rxack);
+ /* handle received ACK indications,
+ * including those from out-of-order packets
+ */
bcsp_pkt_cull(bcsp);
- if ((bcsp->rx_skb->data[1] & 0x0f) == 6 &&
- bcsp->rx_skb->data[0] & 0x80) {
- hci_skb_pkt_type(bcsp->rx_skb) = HCI_ACLDATA_PKT;
- pass_up = 1;
- } else if ((bcsp->rx_skb->data[1] & 0x0f) == 5 &&
- bcsp->rx_skb->data[0] & 0x80) {
- hci_skb_pkt_type(bcsp->rx_skb) = HCI_EVENT_PKT;
- pass_up = 1;
- } else if ((bcsp->rx_skb->data[1] & 0x0f) == 7) {
- hci_skb_pkt_type(bcsp->rx_skb) = HCI_SCODATA_PKT;
- pass_up = 1;
- } else if ((bcsp->rx_skb->data[1] & 0x0f) == 1 &&
- !(bcsp->rx_skb->data[0] & 0x80)) {
- bcsp_handle_le_pkt(hu);
- pass_up = 0;
- } else
- pass_up = 0;
- if (!pass_up) {
+ if (pass_up != 2) {
+ if ((bcsp->rx_skb->data[1] & 0x0f) == 6 &&
+ (bcsp->rx_skb->data[0] & 0x80)) {
+ hci_skb_pkt_type(bcsp->rx_skb) = HCI_ACLDATA_PKT;
+ pass_up = 1;
+ } else if ((bcsp->rx_skb->data[1] & 0x0f) == 5 &&
+ (bcsp->rx_skb->data[0] & 0x80)) {
+ hci_skb_pkt_type(bcsp->rx_skb) = HCI_EVENT_PKT;
+ pass_up = 1;
+ } else if ((bcsp->rx_skb->data[1] & 0x0f) == 7) {
+ hci_skb_pkt_type(bcsp->rx_skb) = HCI_SCODATA_PKT;
+ pass_up = 1;
+ } else if ((bcsp->rx_skb->data[1] & 0x0f) == 1 &&
+ !(bcsp->rx_skb->data[0] & 0x80)) {
+ bcsp_handle_le_pkt(hu);
+ pass_up = 0;
+ } else {
+ pass_up = 0;
+ }
+ }
+
+ if (pass_up == 0) {
struct hci_event_hdr hdr;
u8 desc = (bcsp->rx_skb->data[1] & 0x0f);
@@ -537,18 +562,23 @@
hci_recv_frame(hu->hdev, bcsp->rx_skb);
} else {
BT_ERR("Packet for unknown channel (%u %s)",
- bcsp->rx_skb->data[1] & 0x0f,
- bcsp->rx_skb->data[0] & 0x80 ?
- "reliable" : "unreliable");
+ bcsp->rx_skb->data[1] & 0x0f,
+ bcsp->rx_skb->data[0] & 0x80 ?
+ "reliable" : "unreliable");
kfree_skb(bcsp->rx_skb);
}
} else
kfree_skb(bcsp->rx_skb);
- } else {
+ } else if (pass_up == 1) {
/* Pull out BCSP hdr */
skb_pull(bcsp->rx_skb, 4);
hci_recv_frame(hu->hdev, bcsp->rx_skb);
+ } else {
+ /* ignore packet payload of already ACKed re-transmitted
+ * packets or when a packet was missed in the BCSP window
+ */
+ kfree_skb(bcsp->rx_skb);
}
bcsp->rx_state = BCSP_W4_PKT_DELIMITER;
@@ -567,7 +597,7 @@
const unsigned char *ptr;
BT_DBG("hu %p count %d rx_state %d rx_count %ld",
- hu, count, bcsp->rx_state, bcsp->rx_count);
+ hu, count, bcsp->rx_state, bcsp->rx_count);
ptr = data;
while (count) {
@@ -586,24 +616,14 @@
switch (bcsp->rx_state) {
case BCSP_W4_BCSP_HDR:
- if ((0xff & (u8) ~ (bcsp->rx_skb->data[0] + bcsp->rx_skb->data[1] +
- bcsp->rx_skb->data[2])) != bcsp->rx_skb->data[3]) {
+ if ((0xff & (u8)~(bcsp->rx_skb->data[0] + bcsp->rx_skb->data[1] +
+ bcsp->rx_skb->data[2])) != bcsp->rx_skb->data[3]) {
BT_ERR("Error in BCSP hdr checksum");
kfree_skb(bcsp->rx_skb);
bcsp->rx_state = BCSP_W4_PKT_DELIMITER;
bcsp->rx_count = 0;
continue;
}
- if (bcsp->rx_skb->data[0] & 0x80 /* reliable pkt */
- && (bcsp->rx_skb->data[0] & 0x07) != bcsp->rxseq_txack) {
- BT_ERR("Out-of-order packet arrived, got %u expected %u",
- bcsp->rx_skb->data[0] & 0x07, bcsp->rxseq_txack);
-
- kfree_skb(bcsp->rx_skb);
- bcsp->rx_state = BCSP_W4_PKT_DELIMITER;
- bcsp->rx_count = 0;
- continue;
- }
bcsp->rx_state = BCSP_W4_DATA;
bcsp->rx_count = (bcsp->rx_skb->data[1] >> 4) +
(bcsp->rx_skb->data[2] << 4); /* May be 0 */
@@ -620,8 +640,8 @@
case BCSP_W4_CRC:
if (bitrev16(bcsp->message_crc) != bscp_get_crc(bcsp)) {
BT_ERR("Checksum failed: computed %04x received %04x",
- bitrev16(bcsp->message_crc),
- bscp_get_crc(bcsp));
+ bitrev16(bcsp->message_crc),
+ bscp_get_crc(bcsp));
kfree_skb(bcsp->rx_skb);
bcsp->rx_state = BCSP_W4_PKT_DELIMITER;
@@ -679,7 +699,7 @@
/* Arrange to retransmit all messages in the relq. */
static void bcsp_timed_event(unsigned long arg)
{
- struct hci_uart *hu = (struct hci_uart *) arg;
+ struct hci_uart *hu = (struct hci_uart *)arg;
struct bcsp_struct *bcsp = hu->priv;
struct sk_buff *skb;
unsigned long flags;
@@ -715,7 +735,7 @@
init_timer(&bcsp->tbcsp);
bcsp->tbcsp.function = bcsp_timed_event;
- bcsp->tbcsp.data = (u_long) hu;
+ bcsp->tbcsp.data = (u_long)hu;
bcsp->rx_state = BCSP_W4_PKT_DELIMITER;
diff --git a/drivers/bluetooth/hci_intel.c b/drivers/bluetooth/hci_intel.c
index ed0a420..9e27128 100644
--- a/drivers/bluetooth/hci_intel.c
+++ b/drivers/bluetooth/hci_intel.c
@@ -128,7 +128,7 @@
TASK_INTERRUPTIBLE,
msecs_to_jiffies(1000));
- if (err == 1) {
+ if (err == -EINTR) {
bt_dev_err(hu->hdev, "Device boot interrupted");
return -EINTR;
}
@@ -151,7 +151,7 @@
TASK_INTERRUPTIBLE,
msecs_to_jiffies(1000));
- if (err == 1) {
+ if (err == -EINTR) {
bt_dev_err(hu->hdev, "LPM transaction interrupted");
return -EINTR;
}
@@ -813,7 +813,7 @@
err = wait_on_bit_timeout(&intel->flags, STATE_DOWNLOADING,
TASK_INTERRUPTIBLE,
msecs_to_jiffies(5000));
- if (err == 1) {
+ if (err == -EINTR) {
bt_dev_err(hdev, "Firmware loading interrupted");
err = -EINTR;
goto done;
diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index dda9739..9497c46 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -697,34 +697,36 @@
case HCIUARTSETPROTO:
if (!test_and_set_bit(HCI_UART_PROTO_SET, &hu->flags)) {
err = hci_uart_set_proto(hu, arg);
- if (err) {
+ if (err)
clear_bit(HCI_UART_PROTO_SET, &hu->flags);
- return err;
- }
} else
- return -EBUSY;
+ err = -EBUSY;
break;
case HCIUARTGETPROTO:
if (test_bit(HCI_UART_PROTO_SET, &hu->flags))
- return hu->proto->id;
- return -EUNATCH;
+ err = hu->proto->id;
+ else
+ err = -EUNATCH;
+ break;
case HCIUARTGETDEVICE:
if (test_bit(HCI_UART_REGISTERED, &hu->flags))
- return hu->hdev->id;
- return -EUNATCH;
+ err = hu->hdev->id;
+ else
+ err = -EUNATCH;
+ break;
case HCIUARTSETFLAGS:
if (test_bit(HCI_UART_PROTO_SET, &hu->flags))
- return -EBUSY;
- err = hci_uart_set_flags(hu, arg);
- if (err)
- return err;
+ err = -EBUSY;
+ else
+ err = hci_uart_set_flags(hu, arg);
break;
case HCIUARTGETFLAGS:
- return hu->hdev_flags;
+ err = hu->hdev_flags;
+ break;
default:
err = n_tty_ioctl_helper(tty, file, cmd, arg);
@@ -810,6 +812,9 @@
#ifdef CONFIG_BT_HCIUART_AG6XX
ag6xx_init();
#endif
+#ifdef CONFIG_BT_HCIUART_MRVL
+ mrvl_init();
+#endif
return 0;
}
@@ -845,6 +850,9 @@
#ifdef CONFIG_BT_HCIUART_AG6XX
ag6xx_deinit();
#endif
+#ifdef CONFIG_BT_HCIUART_MRVL
+ mrvl_deinit();
+#endif
/* Release tty registration of line discipline */
err = tty_unregister_ldisc(N_HCI);
diff --git a/drivers/bluetooth/hci_mrvl.c b/drivers/bluetooth/hci_mrvl.c
new file mode 100644
index 0000000..bbc4b39
--- /dev/null
+++ b/drivers/bluetooth/hci_mrvl.c
@@ -0,0 +1,387 @@
+/*
+ *
+ * Bluetooth HCI UART driver for marvell devices
+ *
+ * Copyright (C) 2016 Marvell International Ltd.
+ * Copyright (C) 2016 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/firmware.h>
+#include <linux/module.h>
+#include <linux/tty.h>
+
+#include <net/bluetooth/bluetooth.h>
+#include <net/bluetooth/hci_core.h>
+
+#include "hci_uart.h"
+
+#define HCI_FW_REQ_PKT 0xA5
+#define HCI_CHIP_VER_PKT 0xAA
+
+#define MRVL_ACK 0x5A
+#define MRVL_NAK 0xBF
+#define MRVL_RAW_DATA 0x1F
+
+enum {
+ STATE_CHIP_VER_PENDING,
+ STATE_FW_REQ_PENDING,
+};
+
+struct mrvl_data {
+ struct sk_buff *rx_skb;
+ struct sk_buff_head txq;
+ struct sk_buff_head rawq;
+ unsigned long flags;
+ unsigned int tx_len;
+ u8 id, rev;
+};
+
+struct hci_mrvl_pkt {
+ __le16 lhs;
+ __le16 rhs;
+} __packed;
+#define HCI_MRVL_PKT_SIZE 4
+
+static int mrvl_open(struct hci_uart *hu)
+{
+ struct mrvl_data *mrvl;
+
+ BT_DBG("hu %p", hu);
+
+ mrvl = kzalloc(sizeof(*mrvl), GFP_KERNEL);
+ if (!mrvl)
+ return -ENOMEM;
+
+ skb_queue_head_init(&mrvl->txq);
+ skb_queue_head_init(&mrvl->rawq);
+
+ set_bit(STATE_CHIP_VER_PENDING, &mrvl->flags);
+
+ hu->priv = mrvl;
+ return 0;
+}
+
+static int mrvl_close(struct hci_uart *hu)
+{
+ struct mrvl_data *mrvl = hu->priv;
+
+ BT_DBG("hu %p", hu);
+
+ skb_queue_purge(&mrvl->txq);
+ skb_queue_purge(&mrvl->rawq);
+ kfree_skb(mrvl->rx_skb);
+ kfree(mrvl);
+
+ hu->priv = NULL;
+ return 0;
+}
+
+static int mrvl_flush(struct hci_uart *hu)
+{
+ struct mrvl_data *mrvl = hu->priv;
+
+ BT_DBG("hu %p", hu);
+
+ skb_queue_purge(&mrvl->txq);
+ skb_queue_purge(&mrvl->rawq);
+
+ return 0;
+}
+
+static struct sk_buff *mrvl_dequeue(struct hci_uart *hu)
+{
+ struct mrvl_data *mrvl = hu->priv;
+ struct sk_buff *skb;
+
+ skb = skb_dequeue(&mrvl->txq);
+ if (!skb) {
+ /* Any raw data ? */
+ skb = skb_dequeue(&mrvl->rawq);
+ } else {
+ /* Prepend skb with frame type */
+ memcpy(skb_push(skb, 1), &bt_cb(skb)->pkt_type, 1);
+ }
+
+ return skb;
+}
+
+static int mrvl_enqueue(struct hci_uart *hu, struct sk_buff *skb)
+{
+ struct mrvl_data *mrvl = hu->priv;
+
+ skb_queue_tail(&mrvl->txq, skb);
+ return 0;
+}
+
+static void mrvl_send_ack(struct hci_uart *hu, unsigned char type)
+{
+ struct mrvl_data *mrvl = hu->priv;
+ struct sk_buff *skb;
+
+ /* No H4 payload, only 1 byte header */
+ skb = bt_skb_alloc(0, GFP_ATOMIC);
+ if (!skb) {
+ bt_dev_err(hu->hdev, "Unable to alloc ack/nak packet");
+ return;
+ }
+ hci_skb_pkt_type(skb) = type;
+
+ skb_queue_tail(&mrvl->txq, skb);
+ hci_uart_tx_wakeup(hu);
+}
+
+static int mrvl_recv_fw_req(struct hci_dev *hdev, struct sk_buff *skb)
+{
+ struct hci_mrvl_pkt *pkt = (void *)skb->data;
+ struct hci_uart *hu = hci_get_drvdata(hdev);
+ struct mrvl_data *mrvl = hu->priv;
+ int ret = 0;
+
+ if ((pkt->lhs ^ pkt->rhs) != 0xffff) {
+ bt_dev_err(hdev, "Corrupted mrvl header");
+ mrvl_send_ack(hu, MRVL_NAK);
+ ret = -EINVAL;
+ goto done;
+ }
+ mrvl_send_ack(hu, MRVL_ACK);
+
+ if (!test_bit(STATE_FW_REQ_PENDING, &mrvl->flags)) {
+ bt_dev_err(hdev, "Received unexpected firmware request");
+ ret = -EINVAL;
+ goto done;
+ }
+
+ mrvl->tx_len = le16_to_cpu(pkt->lhs);
+
+ clear_bit(STATE_FW_REQ_PENDING, &mrvl->flags);
+ smp_mb__after_atomic();
+ wake_up_bit(&mrvl->flags, STATE_FW_REQ_PENDING);
+
+done:
+ kfree_skb(skb);
+ return ret;
+}
+
+static int mrvl_recv_chip_ver(struct hci_dev *hdev, struct sk_buff *skb)
+{
+ struct hci_mrvl_pkt *pkt = (void *)skb->data;
+ struct hci_uart *hu = hci_get_drvdata(hdev);
+ struct mrvl_data *mrvl = hu->priv;
+ u16 version = le16_to_cpu(pkt->lhs);
+ int ret = 0;
+
+ if ((pkt->lhs ^ pkt->rhs) != 0xffff) {
+ bt_dev_err(hdev, "Corrupted mrvl header");
+ mrvl_send_ack(hu, MRVL_NAK);
+ ret = -EINVAL;
+ goto done;
+ }
+ mrvl_send_ack(hu, MRVL_ACK);
+
+ if (!test_bit(STATE_CHIP_VER_PENDING, &mrvl->flags)) {
+ bt_dev_err(hdev, "Received unexpected chip version");
+ goto done;
+ }
+
+ mrvl->id = version;
+ mrvl->rev = version >> 8;
+
+ bt_dev_info(hdev, "Controller id = %x, rev = %x", mrvl->id, mrvl->rev);
+
+ clear_bit(STATE_CHIP_VER_PENDING, &mrvl->flags);
+ smp_mb__after_atomic();
+ wake_up_bit(&mrvl->flags, STATE_CHIP_VER_PENDING);
+
+done:
+ kfree_skb(skb);
+ return ret;
+}
+
+#define HCI_RECV_CHIP_VER \
+ .type = HCI_CHIP_VER_PKT, \
+ .hlen = HCI_MRVL_PKT_SIZE, \
+ .loff = 0, \
+ .lsize = 0, \
+ .maxlen = HCI_MRVL_PKT_SIZE
+
+#define HCI_RECV_FW_REQ \
+ .type = HCI_FW_REQ_PKT, \
+ .hlen = HCI_MRVL_PKT_SIZE, \
+ .loff = 0, \
+ .lsize = 0, \
+ .maxlen = HCI_MRVL_PKT_SIZE
+
+static const struct h4_recv_pkt mrvl_recv_pkts[] = {
+ { H4_RECV_ACL, .recv = hci_recv_frame },
+ { H4_RECV_SCO, .recv = hci_recv_frame },
+ { H4_RECV_EVENT, .recv = hci_recv_frame },
+ { HCI_RECV_FW_REQ, .recv = mrvl_recv_fw_req },
+ { HCI_RECV_CHIP_VER, .recv = mrvl_recv_chip_ver },
+};
+
+static int mrvl_recv(struct hci_uart *hu, const void *data, int count)
+{
+ struct mrvl_data *mrvl = hu->priv;
+
+ if (!test_bit(HCI_UART_REGISTERED, &hu->flags))
+ return -EUNATCH;
+
+ mrvl->rx_skb = h4_recv_buf(hu->hdev, mrvl->rx_skb, data, count,
+ mrvl_recv_pkts,
+ ARRAY_SIZE(mrvl_recv_pkts));
+ if (IS_ERR(mrvl->rx_skb)) {
+ int err = PTR_ERR(mrvl->rx_skb);
+ bt_dev_err(hu->hdev, "Frame reassembly failed (%d)", err);
+ mrvl->rx_skb = NULL;
+ return err;
+ }
+
+ return count;
+}
+
+static int mrvl_load_firmware(struct hci_dev *hdev, const char *name)
+{
+ struct hci_uart *hu = hci_get_drvdata(hdev);
+ struct mrvl_data *mrvl = hu->priv;
+ const struct firmware *fw = NULL;
+ const u8 *fw_ptr, *fw_max;
+ int err;
+
+ err = request_firmware(&fw, name, &hdev->dev);
+ if (err < 0) {
+ bt_dev_err(hdev, "Failed to load firmware file %s", name);
+ return err;
+ }
+
+ fw_ptr = fw->data;
+ fw_max = fw->data + fw->size;
+
+ bt_dev_info(hdev, "Loading %s", name);
+
+ set_bit(STATE_FW_REQ_PENDING, &mrvl->flags);
+
+ while (fw_ptr <= fw_max) {
+ struct sk_buff *skb;
+
+ /* Controller drives the firmware load by sending firmware
+ * request packets containing the expected fragment size.
+ */
+ err = wait_on_bit_timeout(&mrvl->flags, STATE_FW_REQ_PENDING,
+ TASK_INTERRUPTIBLE,
+ msecs_to_jiffies(2000));
+ if (err == 1) {
+ bt_dev_err(hdev, "Firmware load interrupted");
+ err = -EINTR;
+ break;
+ } else if (err) {
+ bt_dev_err(hdev, "Firmware request timeout");
+ err = -ETIMEDOUT;
+ break;
+ }
+
+ bt_dev_dbg(hdev, "Firmware request, expecting %d bytes",
+ mrvl->tx_len);
+
+ if (fw_ptr == fw_max) {
+ /* Controller requests a null size once firmware is
+ * fully loaded. If controller expects more data, there
+ * is an issue.
+ */
+ if (!mrvl->tx_len) {
+ bt_dev_info(hdev, "Firmware loading complete");
+ } else {
+ bt_dev_err(hdev, "Firmware loading failure");
+ err = -EINVAL;
+ }
+ break;
+ }
+
+ if (fw_ptr + mrvl->tx_len > fw_max) {
+ mrvl->tx_len = fw_max - fw_ptr;
+ bt_dev_dbg(hdev, "Adjusting tx_len to %d",
+ mrvl->tx_len);
+ }
+
+ skb = bt_skb_alloc(mrvl->tx_len, GFP_KERNEL);
+ if (!skb) {
+ bt_dev_err(hdev, "Failed to alloc mem for FW packet");
+ err = -ENOMEM;
+ break;
+ }
+ bt_cb(skb)->pkt_type = MRVL_RAW_DATA;
+
+ memcpy(skb_put(skb, mrvl->tx_len), fw_ptr, mrvl->tx_len);
+ fw_ptr += mrvl->tx_len;
+
+ set_bit(STATE_FW_REQ_PENDING, &mrvl->flags);
+
+ skb_queue_tail(&mrvl->rawq, skb);
+ hci_uart_tx_wakeup(hu);
+ }
+
+ release_firmware(fw);
+ return err;
+}
+
+static int mrvl_setup(struct hci_uart *hu)
+{
+ int err;
+
+ hci_uart_set_flow_control(hu, true);
+
+ err = mrvl_load_firmware(hu->hdev, "mrvl/helper_uart_3000000.bin");
+ if (err) {
+ bt_dev_err(hu->hdev, "Unable to download firmware helper");
+ return -EINVAL;
+ }
+
+ hci_uart_set_baudrate(hu, 3000000);
+ hci_uart_set_flow_control(hu, false);
+
+ err = mrvl_load_firmware(hu->hdev, "mrvl/uart8897_bt.bin");
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static const struct hci_uart_proto mrvl_proto = {
+ .id = HCI_UART_MRVL,
+ .name = "Marvell",
+ .init_speed = 115200,
+ .open = mrvl_open,
+ .close = mrvl_close,
+ .flush = mrvl_flush,
+ .setup = mrvl_setup,
+ .recv = mrvl_recv,
+ .enqueue = mrvl_enqueue,
+ .dequeue = mrvl_dequeue,
+};
+
+int __init mrvl_init(void)
+{
+ return hci_uart_register_proto(&mrvl_proto);
+}
+
+int __exit mrvl_deinit(void)
+{
+ return hci_uart_unregister_proto(&mrvl_proto);
+}
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 683c2b6..6c867fb 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -397,7 +397,7 @@
skb_queue_head_init(&qca->txq);
skb_queue_head_init(&qca->tx_wait_q);
spin_lock_init(&qca->hci_ibs_lock);
- qca->workqueue = create_singlethread_workqueue("qca_wq");
+ qca->workqueue = alloc_ordered_workqueue("qca_wq", 0);
if (!qca->workqueue) {
BT_ERR("QCA Workqueue not initialized properly");
kfree(qca);
diff --git a/drivers/bluetooth/hci_uart.h b/drivers/bluetooth/hci_uart.h
index 839bad1..0701395 100644
--- a/drivers/bluetooth/hci_uart.h
+++ b/drivers/bluetooth/hci_uart.h
@@ -35,7 +35,7 @@
#define HCIUARTGETFLAGS _IOR('U', 204, int)
/* UART protocols */
-#define HCI_UART_MAX_PROTO 10
+#define HCI_UART_MAX_PROTO 12
#define HCI_UART_H4 0
#define HCI_UART_BCSP 1
@@ -47,6 +47,8 @@
#define HCI_UART_BCM 7
#define HCI_UART_QCA 8
#define HCI_UART_AG6XX 9
+#define HCI_UART_NOKIA 10
+#define HCI_UART_MRVL 11
#define HCI_UART_RAW_DEVICE 0
#define HCI_UART_RESET_ON_INIT 1
@@ -189,3 +191,8 @@
int ag6xx_init(void);
int ag6xx_deinit(void);
#endif
+
+#ifdef CONFIG_BT_HCIUART_MRVL
+int mrvl_init(void);
+int mrvl_deinit(void);
+#endif
diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
index 9af35954..267f995 100644
--- a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
+++ b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
@@ -783,14 +783,14 @@
[RST_BUS_I2S1] = { 0x2d0, BIT(13) },
[RST_BUS_I2S2] = { 0x2d0, BIT(14) },
- [RST_BUS_I2C0] = { 0x2d4, BIT(0) },
- [RST_BUS_I2C1] = { 0x2d4, BIT(1) },
- [RST_BUS_I2C2] = { 0x2d4, BIT(2) },
- [RST_BUS_UART0] = { 0x2d4, BIT(16) },
- [RST_BUS_UART1] = { 0x2d4, BIT(17) },
- [RST_BUS_UART2] = { 0x2d4, BIT(18) },
- [RST_BUS_UART3] = { 0x2d4, BIT(19) },
- [RST_BUS_SCR] = { 0x2d4, BIT(20) },
+ [RST_BUS_I2C0] = { 0x2d8, BIT(0) },
+ [RST_BUS_I2C1] = { 0x2d8, BIT(1) },
+ [RST_BUS_I2C2] = { 0x2d8, BIT(2) },
+ [RST_BUS_UART0] = { 0x2d8, BIT(16) },
+ [RST_BUS_UART1] = { 0x2d8, BIT(17) },
+ [RST_BUS_UART2] = { 0x2d8, BIT(18) },
+ [RST_BUS_UART3] = { 0x2d8, BIT(19) },
+ [RST_BUS_SCR] = { 0x2d8, BIT(20) },
};
static const struct sunxi_ccu_desc sun8i_h3_ccu_desc = {
diff --git a/drivers/clk/sunxi-ng/ccu_nk.c b/drivers/clk/sunxi-ng/ccu_nk.c
index 4470ffc..d6fafb3 100644
--- a/drivers/clk/sunxi-ng/ccu_nk.c
+++ b/drivers/clk/sunxi-ng/ccu_nk.c
@@ -14,9 +14,9 @@
#include "ccu_gate.h"
#include "ccu_nk.h"
-void ccu_nk_find_best(unsigned long parent, unsigned long rate,
- unsigned int max_n, unsigned int max_k,
- unsigned int *n, unsigned int *k)
+static void ccu_nk_find_best(unsigned long parent, unsigned long rate,
+ unsigned int max_n, unsigned int max_k,
+ unsigned int *n, unsigned int *k)
{
unsigned long best_rate = 0;
unsigned int best_k = 0, best_n = 0;
diff --git a/drivers/clk/sunxi/clk-a10-pll2.c b/drivers/clk/sunxi/clk-a10-pll2.c
index 0ee1f36..d8eab90 100644
--- a/drivers/clk/sunxi/clk-a10-pll2.c
+++ b/drivers/clk/sunxi/clk-a10-pll2.c
@@ -73,7 +73,7 @@
SUN4I_PLL2_PRE_DIV_WIDTH,
CLK_DIVIDER_ONE_BASED | CLK_DIVIDER_ALLOW_ZERO,
&sun4i_a10_pll2_lock);
- if (!prediv_clk) {
+ if (IS_ERR(prediv_clk)) {
pr_err("Couldn't register the prediv clock\n");
goto err_free_array;
}
@@ -106,7 +106,7 @@
&mult->hw, &clk_multiplier_ops,
&gate->hw, &clk_gate_ops,
CLK_SET_RATE_PARENT);
- if (!base_clk) {
+ if (IS_ERR(base_clk)) {
pr_err("Couldn't register the base multiplier clock\n");
goto err_free_multiplier;
}
diff --git a/drivers/clk/sunxi/clk-sun8i-mbus.c b/drivers/clk/sunxi/clk-sun8i-mbus.c
index 411d303..b200ebf 100644
--- a/drivers/clk/sunxi/clk-sun8i-mbus.c
+++ b/drivers/clk/sunxi/clk-sun8i-mbus.c
@@ -48,7 +48,7 @@
return;
reg = of_io_request_and_map(node, 0, of_node_full_name(node));
- if (!reg) {
+ if (IS_ERR(reg)) {
pr_err("Could not get registers for sun8i-mbus-clk\n");
goto err_free_parents;
}
diff --git a/drivers/crypto/chelsio/chcr_core.c b/drivers/crypto/chelsio/chcr_core.c
index 2f6156b..fb5f9bb 100644
--- a/drivers/crypto/chelsio/chcr_core.c
+++ b/drivers/crypto/chelsio/chcr_core.c
@@ -39,12 +39,10 @@
[CPL_FW6_PLD] = cpl_fw6_pld_handler,
};
-static struct cxgb4_pci_uld_info chcr_uld_info = {
+static struct cxgb4_uld_info chcr_uld_info = {
.name = DRV_MODULE_NAME,
- .nrxq = 4,
+ .nrxq = MAX_ULD_QSETS,
.rxq_size = 1024,
- .nciq = 0,
- .ciq_size = 0,
.add = chcr_uld_add,
.state_change = chcr_uld_state_change,
.rx_handler = chcr_uld_rx_handler,
@@ -205,7 +203,7 @@
static int __init chcr_crypto_init(void)
{
- if (cxgb4_register_pci_uld(CXGB4_PCI_ULD1, &chcr_uld_info)) {
+ if (cxgb4_register_uld(CXGB4_ULD_CRYPTO, &chcr_uld_info)) {
pr_err("ULD register fail: No chcr crypto support in cxgb4");
return -1;
}
@@ -228,7 +226,7 @@
kfree(u_ctx);
}
mutex_unlock(&dev_mutex);
- cxgb4_unregister_pci_uld(CXGB4_PCI_ULD1);
+ cxgb4_unregister_uld(CXGB4_ULD_CRYPTO);
}
module_init(chcr_crypto_init);
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 5a2631a..7dd2e2d 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -657,9 +657,12 @@
}
if (subnode) {
- node = of_get_flat_dt_subnode_by_name(node, subnode);
- if (node < 0)
+ int err = of_get_flat_dt_subnode_by_name(node, subnode);
+
+ if (err < 0)
return 0;
+
+ node = err;
}
return __find_uefi_params(node, info, dt_params[i].params);
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index 3bd127f9..aded106 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -41,6 +41,8 @@
#define EFI_ALLOC_ALIGN EFI_PAGE_SIZE
#endif
+#define EFI_MMAP_NR_SLACK_SLOTS 8
+
struct file_info {
efi_file_handle_t *handle;
u64 size;
@@ -63,49 +65,62 @@
}
}
+static inline bool mmap_has_headroom(unsigned long buff_size,
+ unsigned long map_size,
+ unsigned long desc_size)
+{
+ unsigned long slack = buff_size - map_size;
+
+ return slack / desc_size >= EFI_MMAP_NR_SLACK_SLOTS;
+}
+
efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
- efi_memory_desc_t **map,
- unsigned long *map_size,
- unsigned long *desc_size,
- u32 *desc_ver,
- unsigned long *key_ptr)
+ struct efi_boot_memmap *map)
{
efi_memory_desc_t *m = NULL;
efi_status_t status;
unsigned long key;
u32 desc_version;
- *map_size = sizeof(*m) * 32;
+ *map->desc_size = sizeof(*m);
+ *map->map_size = *map->desc_size * 32;
+ *map->buff_size = *map->map_size;
again:
- /*
- * Add an additional efi_memory_desc_t because we're doing an
- * allocation which may be in a new descriptor region.
- */
- *map_size += sizeof(*m);
status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
- *map_size, (void **)&m);
+ *map->map_size, (void **)&m);
if (status != EFI_SUCCESS)
goto fail;
- *desc_size = 0;
+ *map->desc_size = 0;
key = 0;
- status = efi_call_early(get_memory_map, map_size, m,
- &key, desc_size, &desc_version);
- if (status == EFI_BUFFER_TOO_SMALL) {
+ status = efi_call_early(get_memory_map, map->map_size, m,
+ &key, map->desc_size, &desc_version);
+ if (status == EFI_BUFFER_TOO_SMALL ||
+ !mmap_has_headroom(*map->buff_size, *map->map_size,
+ *map->desc_size)) {
efi_call_early(free_pool, m);
+ /*
+ * Make sure there is some entries of headroom so that the
+ * buffer can be reused for a new map after allocations are
+ * no longer permitted. Its unlikely that the map will grow to
+ * exceed this headroom once we are ready to trigger
+ * ExitBootServices()
+ */
+ *map->map_size += *map->desc_size * EFI_MMAP_NR_SLACK_SLOTS;
+ *map->buff_size = *map->map_size;
goto again;
}
if (status != EFI_SUCCESS)
efi_call_early(free_pool, m);
- if (key_ptr && status == EFI_SUCCESS)
- *key_ptr = key;
- if (desc_ver && status == EFI_SUCCESS)
- *desc_ver = desc_version;
+ if (map->key_ptr && status == EFI_SUCCESS)
+ *map->key_ptr = key;
+ if (map->desc_ver && status == EFI_SUCCESS)
+ *map->desc_ver = desc_version;
fail:
- *map = m;
+ *map->map = m;
return status;
}
@@ -113,13 +128,20 @@
unsigned long get_dram_base(efi_system_table_t *sys_table_arg)
{
efi_status_t status;
- unsigned long map_size;
+ unsigned long map_size, buff_size;
unsigned long membase = EFI_ERROR;
struct efi_memory_map map;
efi_memory_desc_t *md;
+ struct efi_boot_memmap boot_map;
- status = efi_get_memory_map(sys_table_arg, (efi_memory_desc_t **)&map.map,
- &map_size, &map.desc_size, NULL, NULL);
+ boot_map.map = (efi_memory_desc_t **)&map.map;
+ boot_map.map_size = &map_size;
+ boot_map.desc_size = &map.desc_size;
+ boot_map.desc_ver = NULL;
+ boot_map.key_ptr = NULL;
+ boot_map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(sys_table_arg, &boot_map);
if (status != EFI_SUCCESS)
return membase;
@@ -144,15 +166,22 @@
unsigned long size, unsigned long align,
unsigned long *addr, unsigned long max)
{
- unsigned long map_size, desc_size;
+ unsigned long map_size, desc_size, buff_size;
efi_memory_desc_t *map;
efi_status_t status;
unsigned long nr_pages;
u64 max_addr = 0;
int i;
+ struct efi_boot_memmap boot_map;
- status = efi_get_memory_map(sys_table_arg, &map, &map_size, &desc_size,
- NULL, NULL);
+ boot_map.map = ↦
+ boot_map.map_size = &map_size;
+ boot_map.desc_size = &desc_size;
+ boot_map.desc_ver = NULL;
+ boot_map.key_ptr = NULL;
+ boot_map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(sys_table_arg, &boot_map);
if (status != EFI_SUCCESS)
goto fail;
@@ -230,14 +259,21 @@
unsigned long size, unsigned long align,
unsigned long *addr)
{
- unsigned long map_size, desc_size;
+ unsigned long map_size, desc_size, buff_size;
efi_memory_desc_t *map;
efi_status_t status;
unsigned long nr_pages;
int i;
+ struct efi_boot_memmap boot_map;
- status = efi_get_memory_map(sys_table_arg, &map, &map_size, &desc_size,
- NULL, NULL);
+ boot_map.map = ↦
+ boot_map.map_size = &map_size;
+ boot_map.desc_size = &desc_size;
+ boot_map.desc_ver = NULL;
+ boot_map.key_ptr = NULL;
+ boot_map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(sys_table_arg, &boot_map);
if (status != EFI_SUCCESS)
goto fail;
@@ -704,3 +740,76 @@
*cmd_line_len = options_bytes;
return (char *)cmdline_addr;
}
+
+/*
+ * Handle calling ExitBootServices according to the requirements set out by the
+ * spec. Obtains the current memory map, and returns that info after calling
+ * ExitBootServices. The client must specify a function to perform any
+ * processing of the memory map data prior to ExitBootServices. A client
+ * specific structure may be passed to the function via priv. The client
+ * function may be called multiple times.
+ */
+efi_status_t efi_exit_boot_services(efi_system_table_t *sys_table_arg,
+ void *handle,
+ struct efi_boot_memmap *map,
+ void *priv,
+ efi_exit_boot_map_processing priv_func)
+{
+ efi_status_t status;
+
+ status = efi_get_memory_map(sys_table_arg, map);
+
+ if (status != EFI_SUCCESS)
+ goto fail;
+
+ status = priv_func(sys_table_arg, map, priv);
+ if (status != EFI_SUCCESS)
+ goto free_map;
+
+ status = efi_call_early(exit_boot_services, handle, *map->key_ptr);
+
+ if (status == EFI_INVALID_PARAMETER) {
+ /*
+ * The memory map changed between efi_get_memory_map() and
+ * exit_boot_services(). Per the UEFI Spec v2.6, Section 6.4:
+ * EFI_BOOT_SERVICES.ExitBootServices we need to get the
+ * updated map, and try again. The spec implies one retry
+ * should be sufficent, which is confirmed against the EDK2
+ * implementation. Per the spec, we can only invoke
+ * get_memory_map() and exit_boot_services() - we cannot alloc
+ * so efi_get_memory_map() cannot be used, and we must reuse
+ * the buffer. For all practical purposes, the headroom in the
+ * buffer should account for any changes in the map so the call
+ * to get_memory_map() is expected to succeed here.
+ */
+ *map->map_size = *map->buff_size;
+ status = efi_call_early(get_memory_map,
+ map->map_size,
+ *map->map,
+ map->key_ptr,
+ map->desc_size,
+ map->desc_ver);
+
+ /* exit_boot_services() was called, thus cannot free */
+ if (status != EFI_SUCCESS)
+ goto fail;
+
+ status = priv_func(sys_table_arg, map, priv);
+ /* exit_boot_services() was called, thus cannot free */
+ if (status != EFI_SUCCESS)
+ goto fail;
+
+ status = efi_call_early(exit_boot_services, handle, *map->key_ptr);
+ }
+
+ /* exit_boot_services() was called, thus cannot free */
+ if (status != EFI_SUCCESS)
+ goto fail;
+
+ return EFI_SUCCESS;
+
+free_map:
+ efi_call_early(free_pool, *map->map);
+fail:
+ return status;
+}
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index e58abfa..a6a9311 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -152,6 +152,27 @@
#define EFI_FDT_ALIGN EFI_PAGE_SIZE
#endif
+struct exit_boot_struct {
+ efi_memory_desc_t *runtime_map;
+ int *runtime_entry_count;
+};
+
+static efi_status_t exit_boot_func(efi_system_table_t *sys_table_arg,
+ struct efi_boot_memmap *map,
+ void *priv)
+{
+ struct exit_boot_struct *p = priv;
+ /*
+ * Update the memory map with virtual addresses. The function will also
+ * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
+ * entries so that we can pass it straight to SetVirtualAddressMap()
+ */
+ efi_get_virtmap(*map->map, *map->map_size, *map->desc_size,
+ p->runtime_map, p->runtime_entry_count);
+
+ return EFI_SUCCESS;
+}
+
/*
* Allocate memory for a new FDT, then add EFI, commandline, and
* initrd related fields to the FDT. This routine increases the
@@ -175,13 +196,22 @@
unsigned long fdt_addr,
unsigned long fdt_size)
{
- unsigned long map_size, desc_size;
+ unsigned long map_size, desc_size, buff_size;
u32 desc_ver;
unsigned long mmap_key;
efi_memory_desc_t *memory_map, *runtime_map;
unsigned long new_fdt_size;
efi_status_t status;
int runtime_entry_count = 0;
+ struct efi_boot_memmap map;
+ struct exit_boot_struct priv;
+
+ map.map = &runtime_map;
+ map.map_size = &map_size;
+ map.desc_size = &desc_size;
+ map.desc_ver = &desc_ver;
+ map.key_ptr = &mmap_key;
+ map.buff_size = &buff_size;
/*
* Get a copy of the current memory map that we will use to prepare
@@ -189,8 +219,7 @@
* subsequent allocations adding entries, since they could not affect
* the number of EFI_MEMORY_RUNTIME regions.
*/
- status = efi_get_memory_map(sys_table, &runtime_map, &map_size,
- &desc_size, &desc_ver, &mmap_key);
+ status = efi_get_memory_map(sys_table, &map);
if (status != EFI_SUCCESS) {
pr_efi_err(sys_table, "Unable to retrieve UEFI memory map.\n");
return status;
@@ -199,6 +228,7 @@
pr_efi(sys_table,
"Exiting boot services and installing virtual address map...\n");
+ map.map = &memory_map;
/*
* Estimate size of new FDT, and allocate memory for it. We
* will allocate a bigger buffer if this ends up being too
@@ -218,8 +248,7 @@
* we can get the memory map key needed for
* exit_boot_services().
*/
- status = efi_get_memory_map(sys_table, &memory_map, &map_size,
- &desc_size, &desc_ver, &mmap_key);
+ status = efi_get_memory_map(sys_table, &map);
if (status != EFI_SUCCESS)
goto fail_free_new_fdt;
@@ -250,16 +279,11 @@
}
}
- /*
- * Update the memory map with virtual addresses. The function will also
- * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME
- * entries so that we can pass it straight into SetVirtualAddressMap()
- */
- efi_get_virtmap(memory_map, map_size, desc_size, runtime_map,
- &runtime_entry_count);
-
- /* Now we are ready to exit_boot_services.*/
- status = sys_table->boottime->exit_boot_services(handle, mmap_key);
+ sys_table->boottime->free_pool(memory_map);
+ priv.runtime_map = runtime_map;
+ priv.runtime_entry_count = &runtime_entry_count;
+ status = efi_exit_boot_services(sys_table, handle, &map, &priv,
+ exit_boot_func);
if (status == EFI_SUCCESS) {
efi_set_virtual_address_map_t *svam;
diff --git a/drivers/firmware/efi/libstub/random.c b/drivers/firmware/efi/libstub/random.c
index 53f6d3f..0c9f58c 100644
--- a/drivers/firmware/efi/libstub/random.c
+++ b/drivers/firmware/efi/libstub/random.c
@@ -73,12 +73,20 @@
unsigned long random_seed)
{
unsigned long map_size, desc_size, total_slots = 0, target_slot;
+ unsigned long buff_size;
efi_status_t status;
efi_memory_desc_t *memory_map;
int map_offset;
+ struct efi_boot_memmap map;
- status = efi_get_memory_map(sys_table_arg, &memory_map, &map_size,
- &desc_size, NULL, NULL);
+ map.map = &memory_map;
+ map.map_size = &map_size;
+ map.desc_size = &desc_size;
+ map.desc_ver = NULL;
+ map.key_ptr = NULL;
+ map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(sys_table_arg, &map);
if (status != EFI_SUCCESS)
return status;
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
index a978381..9b17a66 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
@@ -387,7 +387,7 @@
atmel_hlcdc_crtc_finish_page_flip(drm_crtc_to_atmel_hlcdc_crtc(c));
}
-void atmel_hlcdc_crtc_reset(struct drm_crtc *crtc)
+static void atmel_hlcdc_crtc_reset(struct drm_crtc *crtc)
{
struct atmel_hlcdc_crtc_state *state;
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
index 016c191..52c527f 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
@@ -320,19 +320,19 @@
u32 *coeff_tab = heo_upscaling_ycoef;
u32 max_memsize;
- if (state->crtc_w < state->src_w)
+ if (state->crtc_h < state->src_h)
coeff_tab = heo_downscaling_ycoef;
for (i = 0; i < ARRAY_SIZE(heo_upscaling_ycoef); i++)
atmel_hlcdc_layer_update_cfg(&plane->layer,
33 + i,
0xffffffff,
coeff_tab[i]);
- factor = ((8 * 256 * state->src_w) - (256 * 4)) /
- state->crtc_w;
+ factor = ((8 * 256 * state->src_h) - (256 * 4)) /
+ state->crtc_h;
factor++;
- max_memsize = ((factor * state->crtc_w) + (256 * 4)) /
+ max_memsize = ((factor * state->crtc_h) + (256 * 4)) /
2048;
- if (max_memsize > state->src_w)
+ if (max_memsize > state->src_h)
factor--;
factor_reg |= (factor << 16) | 0x80000000;
}
diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index 57676f8..a628975 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -1015,6 +1015,7 @@
return 0;
}
+#if defined(CONFIG_X86) || defined(CONFIG_IA64)
typedef struct drm_mode_fb_cmd232 {
u32 fb_id;
u32 width;
@@ -1071,6 +1072,7 @@
return 0;
}
+#endif
static drm_ioctl_compat_t *drm_compat_ioctls[] = {
[DRM_IOCTL_NR(DRM_IOCTL_VERSION32)] = compat_drm_version,
@@ -1104,7 +1106,9 @@
[DRM_IOCTL_NR(DRM_IOCTL_UPDATE_DRAW32)] = compat_drm_update_draw,
#endif
[DRM_IOCTL_NR(DRM_IOCTL_WAIT_VBLANK32)] = compat_drm_wait_vblank,
+#if defined(CONFIG_X86) || defined(CONFIG_IA64)
[DRM_IOCTL_NR(DRM_IOCTL_MODE_ADDFB232)] = compat_drm_mode_addfb2,
+#endif
};
/**
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index e016640..40ce841 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -55,11 +55,11 @@
flags = exynos_gem->flags;
/*
- * without iommu support, not support physically non-continuous memory
- * for framebuffer.
+ * Physically non-contiguous memory type for framebuffer is not
+ * supported without IOMMU.
*/
if (IS_NONCONTIG_BUFFER(flags)) {
- DRM_ERROR("cannot use this gem memory type for fb.\n");
+ DRM_ERROR("Non-contiguous GEM memory is not supported.\n");
return -EINVAL;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimc.c b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
index 0525c56..147ef0d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
@@ -1753,32 +1753,6 @@
return 0;
}
-#ifdef CONFIG_PM_SLEEP
-static int fimc_suspend(struct device *dev)
-{
- struct fimc_context *ctx = get_fimc_context(dev);
-
- DRM_DEBUG_KMS("id[%d]\n", ctx->id);
-
- if (pm_runtime_suspended(dev))
- return 0;
-
- return fimc_clk_ctrl(ctx, false);
-}
-
-static int fimc_resume(struct device *dev)
-{
- struct fimc_context *ctx = get_fimc_context(dev);
-
- DRM_DEBUG_KMS("id[%d]\n", ctx->id);
-
- if (!pm_runtime_suspended(dev))
- return fimc_clk_ctrl(ctx, true);
-
- return 0;
-}
-#endif
-
static int fimc_runtime_suspend(struct device *dev)
{
struct fimc_context *ctx = get_fimc_context(dev);
@@ -1799,7 +1773,8 @@
#endif
static const struct dev_pm_ops fimc_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(fimc_suspend, fimc_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(fimc_runtime_suspend, fimc_runtime_resume, NULL)
};
diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index 4bf00f5..6eca8bb 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -1475,8 +1475,8 @@
return 0;
}
-#ifdef CONFIG_PM_SLEEP
-static int g2d_suspend(struct device *dev)
+#ifdef CONFIG_PM
+static int g2d_runtime_suspend(struct device *dev)
{
struct g2d_data *g2d = dev_get_drvdata(dev);
@@ -1490,25 +1490,6 @@
flush_work(&g2d->runqueue_work);
- return 0;
-}
-
-static int g2d_resume(struct device *dev)
-{
- struct g2d_data *g2d = dev_get_drvdata(dev);
-
- g2d->suspended = false;
- g2d_exec_runqueue(g2d);
-
- return 0;
-}
-#endif
-
-#ifdef CONFIG_PM
-static int g2d_runtime_suspend(struct device *dev)
-{
- struct g2d_data *g2d = dev_get_drvdata(dev);
-
clk_disable_unprepare(g2d->gate_clk);
return 0;
@@ -1523,12 +1504,16 @@
if (ret < 0)
dev_warn(dev, "failed to enable clock.\n");
+ g2d->suspended = false;
+ g2d_exec_runqueue(g2d);
+
return ret;
}
#endif
static const struct dev_pm_ops g2d_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(g2d_suspend, g2d_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(g2d_runtime_suspend, g2d_runtime_resume, NULL)
};
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
index 5d20da8..52a9d26 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
@@ -1760,34 +1760,7 @@
return 0;
}
-#ifdef CONFIG_PM_SLEEP
-static int gsc_suspend(struct device *dev)
-{
- struct gsc_context *ctx = get_gsc_context(dev);
-
- DRM_DEBUG_KMS("id[%d]\n", ctx->id);
-
- if (pm_runtime_suspended(dev))
- return 0;
-
- return gsc_clk_ctrl(ctx, false);
-}
-
-static int gsc_resume(struct device *dev)
-{
- struct gsc_context *ctx = get_gsc_context(dev);
-
- DRM_DEBUG_KMS("id[%d]\n", ctx->id);
-
- if (!pm_runtime_suspended(dev))
- return gsc_clk_ctrl(ctx, true);
-
- return 0;
-}
-#endif
-
-#ifdef CONFIG_PM
-static int gsc_runtime_suspend(struct device *dev)
+static int __maybe_unused gsc_runtime_suspend(struct device *dev)
{
struct gsc_context *ctx = get_gsc_context(dev);
@@ -1796,7 +1769,7 @@
return gsc_clk_ctrl(ctx, false);
}
-static int gsc_runtime_resume(struct device *dev)
+static int __maybe_unused gsc_runtime_resume(struct device *dev)
{
struct gsc_context *ctx = get_gsc_context(dev);
@@ -1804,10 +1777,10 @@
return gsc_clk_ctrl(ctx, true);
}
-#endif
static const struct dev_pm_ops gsc_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(gsc_suspend, gsc_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(gsc_runtime_suspend, gsc_runtime_resume, NULL)
};
diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
index 404367a..6591e40 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
@@ -794,29 +794,6 @@
return 0;
}
-
-#ifdef CONFIG_PM_SLEEP
-static int rotator_suspend(struct device *dev)
-{
- struct rot_context *rot = dev_get_drvdata(dev);
-
- if (pm_runtime_suspended(dev))
- return 0;
-
- return rotator_clk_crtl(rot, false);
-}
-
-static int rotator_resume(struct device *dev)
-{
- struct rot_context *rot = dev_get_drvdata(dev);
-
- if (!pm_runtime_suspended(dev))
- return rotator_clk_crtl(rot, true);
-
- return 0;
-}
-#endif
-
static int rotator_runtime_suspend(struct device *dev)
{
struct rot_context *rot = dev_get_drvdata(dev);
@@ -833,7 +810,8 @@
#endif
static const struct dev_pm_ops rotator_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(rotator_suspend, rotator_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
SET_RUNTIME_PM_OPS(rotator_runtime_suspend, rotator_runtime_resume,
NULL)
};
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 95ddd56..5de36d8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1281,6 +1281,11 @@
intel_runtime_pm_enable(dev_priv);
+ /* Everything is in place, we can now relax! */
+ DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
+ driver.name, driver.major, driver.minor, driver.patchlevel,
+ driver.date, pci_name(pdev), dev_priv->drm.primary->index);
+
intel_runtime_pm_put(dev_priv);
return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7a30af7..f38ceff 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -122,8 +122,11 @@
has_full_48bit_ppgtt =
IS_BROADWELL(dev_priv) || INTEL_GEN(dev_priv) >= 9;
- if (intel_vgpu_active(dev_priv))
- has_full_ppgtt = false; /* emulation is too hard */
+ if (intel_vgpu_active(dev_priv)) {
+ /* emulation is too hard */
+ has_full_ppgtt = false;
+ has_full_48bit_ppgtt = false;
+ }
if (!has_aliasing_ppgtt)
return 0;
@@ -158,7 +161,7 @@
return 0;
}
- if (INTEL_GEN(dev_priv) >= 8 && i915.enable_execlists)
+ if (INTEL_GEN(dev_priv) >= 8 && i915.enable_execlists && has_full_ppgtt)
return has_full_48bit_ppgtt ? 3 : 2;
else
return has_aliasing_ppgtt ? 1 : 0;
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index f6acb5a..b81cfb3 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -65,9 +65,6 @@
BUILD_BUG_ON(sizeof(struct vgt_if) != VGT_PVINFO_SIZE);
- if (!IS_HASWELL(dev_priv))
- return;
-
magic = __raw_i915_read64(dev_priv, vgtif_reg(magic));
if (magic != VGT_MAGIC)
return;
diff --git a/drivers/gpu/drm/i915/intel_dvo.c b/drivers/gpu/drm/i915/intel_dvo.c
index 47bdf9d..b9e5a63 100644
--- a/drivers/gpu/drm/i915/intel_dvo.c
+++ b/drivers/gpu/drm/i915/intel_dvo.c
@@ -554,7 +554,6 @@
return;
}
- drm_encoder_cleanup(&intel_encoder->base);
kfree(intel_dvo);
kfree(intel_connector);
}
diff --git a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c
index adca262..7acbbbf 100644
--- a/drivers/gpu/drm/i915/intel_opregion.c
+++ b/drivers/gpu/drm/i915/intel_opregion.c
@@ -1047,6 +1047,23 @@
return err;
}
+static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
+{
+ DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
+ return 1;
+}
+
+static const struct dmi_system_id intel_use_opregion_panel_type[] = {
+ {
+ .callback = intel_use_opregion_panel_type_callback,
+ .ident = "Conrac GmbH IX45GM2",
+ .matches = {DMI_MATCH(DMI_SYS_VENDOR, "Conrac GmbH"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "IX45GM2"),
+ },
+ },
+ { }
+};
+
int
intel_opregion_get_panel_type(struct drm_i915_private *dev_priv)
{
@@ -1073,6 +1090,16 @@
}
/*
+ * So far we know that some machined must use it, others must not use it.
+ * There doesn't seem to be any way to determine which way to go, except
+ * via a quirk list :(
+ */
+ if (!dmi_check_system(intel_use_opregion_panel_type)) {
+ DRM_DEBUG_KMS("Ignoring OpRegion panel type (%d)\n", ret - 1);
+ return -ENODEV;
+ }
+
+ /*
* FIXME On Dell XPS 13 9350 the OpRegion panel type (0) gives us
* low vswing for eDP, whereas the VBT panel type (2) gives us normal
* vswing instead. Low vswing results in some display flickers, so
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 53e13c1..2d24813 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7859,6 +7859,7 @@
case GEN6_PCODE_ILLEGAL_CMD:
return -ENXIO;
case GEN6_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE:
+ case GEN7_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE:
return -EOVERFLOW;
case GEN6_PCODE_TIMEOUT:
return -ETIMEDOUT;
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 2b0d1ba..cf171b4 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -255,14 +255,14 @@
struct drm_i915_private *dev_priv = to_i915(dev);
uint32_t max_sleep_time = 0x1f;
- /* Lately it was identified that depending on panel idle frame count
- * calculated at HW can be off by 1. So let's use what came
- * from VBT + 1.
- * There are also other cases where panel demands at least 4
- * but VBT is not being set. To cover these 2 cases lets use
- * at least 5 when VBT isn't set to be on the safest side.
+ /*
+ * Let's respect VBT in case VBT asks a higher idle_frame value.
+ * Let's use 6 as the minimum to cover all known cases including
+ * the off-by-one issue that HW has in some cases. Also there are
+ * cases where sink should be able to train
+ * with the 5 or 6 idle patterns.
*/
- uint32_t idle_frames = dev_priv->vbt.psr.idle_frames + 1;
+ uint32_t idle_frames = max(6, dev_priv->vbt.psr.idle_frames);
uint32_t val = EDP_PSR_ENABLE;
val |= max_sleep_time << EDP_PSR_MAX_SLEEP_TIME_SHIFT;
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 59adcf8..3f6704c 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -144,7 +144,7 @@
return &vc4->bo_cache.size_list[page_index];
}
-void vc4_bo_cache_purge(struct drm_device *dev)
+static void vc4_bo_cache_purge(struct drm_device *dev)
{
struct vc4_dev *vc4 = to_vc4_dev(dev);
diff --git a/drivers/gpu/drm/vc4/vc4_validate_shaders.c b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
index 46527e9..2543cf5 100644
--- a/drivers/gpu/drm/vc4/vc4_validate_shaders.c
+++ b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
@@ -309,8 +309,14 @@
* of uniforms on each side. However, this scheme is easy to
* validate so it's all we allow for now.
*/
-
- if (QPU_GET_FIELD(inst, QPU_SIG) != QPU_SIG_NONE) {
+ switch (QPU_GET_FIELD(inst, QPU_SIG)) {
+ case QPU_SIG_NONE:
+ case QPU_SIG_SCOREBOARD_UNLOCK:
+ case QPU_SIG_COLOR_LOAD:
+ case QPU_SIG_LOAD_TMU0:
+ case QPU_SIG_LOAD_TMU1:
+ break;
+ default:
DRM_ERROR("uniforms address change must be "
"normal math\n");
return false;
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index 3cbbfbe..71c8867 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -332,6 +332,8 @@
spin_lock_irqsave(&ep->com.dev->lock, flags);
_remove_handle(ep->com.dev, &ep->com.dev->hwtid_idr, ep->hwtid, 0);
+ if (idr_is_empty(&ep->com.dev->hwtid_idr))
+ wake_up(&ep->com.dev->wait);
spin_unlock_irqrestore(&ep->com.dev->lock, flags);
}
@@ -2014,8 +2016,10 @@
}
ep->l2t = cxgb4_l2t_get(cdev->rdev.lldi.l2t,
n, pdev, rt_tos2priority(tos));
- if (!ep->l2t)
+ if (!ep->l2t) {
+ dev_put(pdev);
goto out;
+ }
ep->mtu = pdev->mtu;
ep->tx_chan = cxgb4_port_chan(pdev);
ep->smac_idx = cxgb4_tp_smt_idx(adapter_type,
diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index 071d733..93e3d27 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -872,9 +872,13 @@
static void c4iw_dealloc(struct uld_ctx *ctx)
{
c4iw_rdev_close(&ctx->dev->rdev);
+ WARN_ON_ONCE(!idr_is_empty(&ctx->dev->cqidr));
idr_destroy(&ctx->dev->cqidr);
+ WARN_ON_ONCE(!idr_is_empty(&ctx->dev->qpidr));
idr_destroy(&ctx->dev->qpidr);
+ WARN_ON_ONCE(!idr_is_empty(&ctx->dev->mmidr));
idr_destroy(&ctx->dev->mmidr);
+ wait_event(ctx->dev->wait, idr_is_empty(&ctx->dev->hwtid_idr));
idr_destroy(&ctx->dev->hwtid_idr);
idr_destroy(&ctx->dev->stid_idr);
idr_destroy(&ctx->dev->atid_idr);
@@ -992,6 +996,7 @@
mutex_init(&devp->rdev.stats.lock);
mutex_init(&devp->db_mutex);
INIT_LIST_HEAD(&devp->db_fc_list);
+ init_waitqueue_head(&devp->wait);
devp->avail_ird = devp->rdev.lldi.max_ird_adapter;
if (c4iw_debugfs_root) {
@@ -1475,6 +1480,10 @@
static struct cxgb4_uld_info c4iw_uld_info = {
.name = DRV_NAME,
+ .nrxq = MAX_ULD_QSETS,
+ .rxq_size = 511,
+ .ciq = true,
+ .lro = false,
.add = c4iw_uld_add,
.rx_handler = c4iw_uld_rx_handler,
.state_change = c4iw_uld_state_change,
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 6a9bef1f..cdcf3ee 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -263,6 +263,7 @@
struct idr stid_idr;
struct list_head db_fc_list;
u32 avail_ird;
+ wait_queue_head_t wait;
};
static inline struct c4iw_dev *to_c4iw_dev(struct ib_device *ibdev)
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 9c2e53d..0f21c3a 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -1128,6 +1128,27 @@
/* Generate GUID changed event */
if (changed_attr & MLX4_EQ_PORT_INFO_GID_PFX_CHANGE_MASK) {
+ if (mlx4_is_master(dev->dev)) {
+ union ib_gid gid;
+ int err = 0;
+
+ if (!eqe->event.port_mgmt_change.params.port_info.gid_prefix)
+ err = __mlx4_ib_query_gid(&dev->ib_dev, port, 0, &gid, 1);
+ else
+ gid.global.subnet_prefix =
+ eqe->event.port_mgmt_change.params.port_info.gid_prefix;
+ if (err) {
+ pr_warn("Could not change QP1 subnet prefix for port %d: query_gid error (%d)\n",
+ port, err);
+ } else {
+ pr_debug("Changing QP1 subnet prefix for port %d. old=0x%llx. new=0x%llx\n",
+ port,
+ (u64)atomic64_read(&dev->sriov.demux[port - 1].subnet_prefix),
+ be64_to_cpu(gid.global.subnet_prefix));
+ atomic64_set(&dev->sriov.demux[port - 1].subnet_prefix,
+ be64_to_cpu(gid.global.subnet_prefix));
+ }
+ }
mlx4_ib_dispatch_event(dev, port, IB_EVENT_GID_CHANGE);
/*if master, notify all slaves*/
if (mlx4_is_master(dev->dev))
@@ -2202,6 +2223,8 @@
if (err)
goto demux_err;
dev->sriov.demux[i].guid_cache[0] = gid.global.interface_id;
+ atomic64_set(&dev->sriov.demux[i].subnet_prefix,
+ be64_to_cpu(gid.global.subnet_prefix));
err = alloc_pv_object(dev, mlx4_master_func_num(dev->dev), i + 1,
&dev->sriov.sqps[i]);
if (err)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 2af44c2..87ba9bc 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2202,6 +2202,9 @@
bool per_port = !!(ibdev->dev->caps.flags2 &
MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT);
+ if (mlx4_is_slave(ibdev->dev))
+ return 0;
+
for (i = 0; i < MLX4_DIAG_COUNTERS_TYPES; i++) {
/* i == 1 means we are building port counters */
if (i && !per_port)
diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index 8f7ad07..097bfcc 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -489,7 +489,7 @@
if (!group->members[i])
leave_state |= (1 << i);
- return leave_state & (group->rec.scope_join_state & 7);
+ return leave_state & (group->rec.scope_join_state & 0xf);
}
static int join_group(struct mcast_group *group, int slave, u8 join_mask)
@@ -564,8 +564,8 @@
} else
mcg_warn_group(group, "DRIVER BUG\n");
} else if (group->state == MCAST_LEAVE_SENT) {
- if (group->rec.scope_join_state & 7)
- group->rec.scope_join_state &= 0xf8;
+ if (group->rec.scope_join_state & 0xf)
+ group->rec.scope_join_state &= 0xf0;
group->state = MCAST_IDLE;
mutex_unlock(&group->lock);
if (release_group(group, 1))
@@ -605,7 +605,7 @@
static int handle_join_req(struct mcast_group *group, u8 join_mask,
struct mcast_req *req)
{
- u8 group_join_state = group->rec.scope_join_state & 7;
+ u8 group_join_state = group->rec.scope_join_state & 0xf;
int ref = 0;
u16 status;
struct ib_sa_mcmember_data *sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
@@ -690,8 +690,8 @@
u8 cur_join_state;
resp_join_state = ((struct ib_sa_mcmember_data *)
- group->response_sa_mad.data)->scope_join_state & 7;
- cur_join_state = group->rec.scope_join_state & 7;
+ group->response_sa_mad.data)->scope_join_state & 0xf;
+ cur_join_state = group->rec.scope_join_state & 0xf;
if (method == IB_MGMT_METHOD_GET_RESP) {
/* successfull join */
@@ -710,7 +710,7 @@
req = list_first_entry(&group->pending_list, struct mcast_req,
group_list);
sa_data = (struct ib_sa_mcmember_data *)req->sa_mad.data;
- req_join_state = sa_data->scope_join_state & 0x7;
+ req_join_state = sa_data->scope_join_state & 0xf;
/* For a leave request, we will immediately answer the VF, and
* update our internal counters. The actual leave will be sent
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 7c5832e..686ab48 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -448,7 +448,7 @@
struct workqueue_struct *wq;
struct workqueue_struct *ud_wq;
spinlock_t ud_lock;
- __be64 subnet_prefix;
+ atomic64_t subnet_prefix;
__be64 guid_cache[128];
struct mlx4_ib_dev *dev;
/* the following lock protects both mcg_table and mcg_mgid0_list */
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 768085f..7fb9629 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2493,24 +2493,27 @@
sqp->ud_header.grh.flow_label =
ah->av.ib.sl_tclass_flowlabel & cpu_to_be32(0xfffff);
sqp->ud_header.grh.hop_limit = ah->av.ib.hop_limit;
- if (is_eth)
+ if (is_eth) {
memcpy(sqp->ud_header.grh.source_gid.raw, sgid.raw, 16);
- else {
- if (mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
- /* When multi-function is enabled, the ib_core gid
- * indexes don't necessarily match the hw ones, so
- * we must use our own cache */
- sqp->ud_header.grh.source_gid.global.subnet_prefix =
- to_mdev(ib_dev)->sriov.demux[sqp->qp.port - 1].
- subnet_prefix;
- sqp->ud_header.grh.source_gid.global.interface_id =
- to_mdev(ib_dev)->sriov.demux[sqp->qp.port - 1].
- guid_cache[ah->av.ib.gid_index];
- } else
- ib_get_cached_gid(ib_dev,
- be32_to_cpu(ah->av.ib.port_pd) >> 24,
- ah->av.ib.gid_index,
- &sqp->ud_header.grh.source_gid, NULL);
+ } else {
+ if (mlx4_is_mfunc(to_mdev(ib_dev)->dev)) {
+ /* When multi-function is enabled, the ib_core gid
+ * indexes don't necessarily match the hw ones, so
+ * we must use our own cache
+ */
+ sqp->ud_header.grh.source_gid.global.subnet_prefix =
+ cpu_to_be64(atomic64_read(&(to_mdev(ib_dev)->sriov.
+ demux[sqp->qp.port - 1].
+ subnet_prefix)));
+ sqp->ud_header.grh.source_gid.global.interface_id =
+ to_mdev(ib_dev)->sriov.demux[sqp->qp.port - 1].
+ guid_cache[ah->av.ib.gid_index];
+ } else {
+ ib_get_cached_gid(ib_dev,
+ be32_to_cpu(ah->av.ib.port_pd) >> 24,
+ ah->av.ib.gid_index,
+ &sqp->ud_header.grh.source_gid, NULL);
+ }
}
memcpy(sqp->ud_header.grh.destination_gid.raw,
ah->av.ib.dgid, 16);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e4aecbf..551aa0e 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -284,7 +284,9 @@
static int mlx5_use_mad_ifc(struct mlx5_ib_dev *dev)
{
- return !MLX5_CAP_GEN(dev->mdev, ib_virt);
+ if (MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_IB)
+ return !MLX5_CAP_GEN(dev->mdev, ib_virt);
+ return 0;
}
enum {
@@ -1423,6 +1425,13 @@
dmac_47_16),
ib_spec->eth.val.dst_mac);
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, outer_headers_c,
+ smac_47_16),
+ ib_spec->eth.mask.src_mac);
+ ether_addr_copy(MLX5_ADDR_OF(fte_match_set_lyr_2_4, outer_headers_v,
+ smac_47_16),
+ ib_spec->eth.val.src_mac);
+
if (ib_spec->eth.mask.vlan_tag) {
MLX5_SET(fte_match_set_lyr_2_4, outer_headers_c,
vlan_tag, 1);
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index 80c4b6b..46b6497 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -294,7 +294,7 @@
{
rvt_deinit_mregion(&mr->mr);
rvt_free_lkey(&mr->mr);
- vfree(mr);
+ kfree(mr);
}
/**
diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 55f0e8f..ddd5927 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -362,15 +362,34 @@
return err;
}
- err = rxe_net_init();
+ err = rxe_net_ipv4_init();
if (err) {
- pr_err("rxe: unable to init\n");
+ pr_err("rxe: unable to init ipv4 tunnel\n");
rxe_cache_exit();
- return err;
+ goto exit;
}
+
+ err = rxe_net_ipv6_init();
+ if (err) {
+ pr_err("rxe: unable to init ipv6 tunnel\n");
+ rxe_cache_exit();
+ goto exit;
+ }
+
+ err = register_netdevice_notifier(&rxe_net_notifier);
+ if (err) {
+ pr_err("rxe: Failed to rigister netdev notifier\n");
+ goto exit;
+ }
+
pr_info("rxe: loaded\n");
return 0;
+
+exit:
+ rxe_release_udp_tunnel(recv_sockets.sk4);
+ rxe_release_udp_tunnel(recv_sockets.sk6);
+ return err;
}
static void __exit rxe_module_exit(void)
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 36f67de..1c59ef2 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -689,7 +689,14 @@
qp->req.need_retry = 1;
rxe_run_task(&qp->req.task, 1);
}
+
+ if (pkt) {
+ rxe_drop_ref(pkt->qp);
+ kfree_skb(skb);
+ }
+
goto exit;
+
} else {
wqe->status = IB_WC_RETRY_EXC_ERR;
state = COMPST_ERROR;
@@ -716,6 +723,12 @@
case COMPST_ERROR:
do_complete(qp, wqe);
rxe_qp_error(qp);
+
+ if (pkt) {
+ rxe_drop_ref(pkt->qp);
+ kfree_skb(skb);
+ }
+
goto exit;
}
}
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 0b8d2ea..eedf2f1 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -275,9 +275,10 @@
return sock;
}
-static void rxe_release_udp_tunnel(struct socket *sk)
+void rxe_release_udp_tunnel(struct socket *sk)
{
- udp_tunnel_sock_release(sk);
+ if (sk)
+ udp_tunnel_sock_release(sk);
}
static void prepare_udp_hdr(struct sk_buff *skb, __be16 src_port,
@@ -658,51 +659,45 @@
return NOTIFY_OK;
}
-static struct notifier_block rxe_net_notifier = {
+struct notifier_block rxe_net_notifier = {
.notifier_call = rxe_notify,
};
-int rxe_net_init(void)
+int rxe_net_ipv4_init(void)
{
- int err;
+ spin_lock_init(&dev_list_lock);
+
+ recv_sockets.sk4 = rxe_setup_udp_tunnel(&init_net,
+ htons(ROCE_V2_UDP_DPORT), false);
+ if (IS_ERR(recv_sockets.sk4)) {
+ recv_sockets.sk4 = NULL;
+ pr_err("rxe: Failed to create IPv4 UDP tunnel\n");
+ return -1;
+ }
+
+ return 0;
+}
+
+int rxe_net_ipv6_init(void)
+{
+#if IS_ENABLED(CONFIG_IPV6)
spin_lock_init(&dev_list_lock);
recv_sockets.sk6 = rxe_setup_udp_tunnel(&init_net,
- htons(ROCE_V2_UDP_DPORT), true);
+ htons(ROCE_V2_UDP_DPORT), true);
if (IS_ERR(recv_sockets.sk6)) {
recv_sockets.sk6 = NULL;
pr_err("rxe: Failed to create IPv6 UDP tunnel\n");
return -1;
}
-
- recv_sockets.sk4 = rxe_setup_udp_tunnel(&init_net,
- htons(ROCE_V2_UDP_DPORT), false);
- if (IS_ERR(recv_sockets.sk4)) {
- rxe_release_udp_tunnel(recv_sockets.sk6);
- recv_sockets.sk4 = NULL;
- recv_sockets.sk6 = NULL;
- pr_err("rxe: Failed to create IPv4 UDP tunnel\n");
- return -1;
- }
-
- err = register_netdevice_notifier(&rxe_net_notifier);
- if (err) {
- rxe_release_udp_tunnel(recv_sockets.sk6);
- rxe_release_udp_tunnel(recv_sockets.sk4);
- pr_err("rxe: Failed to rigister netdev notifier\n");
- }
-
- return err;
+#endif
+ return 0;
}
void rxe_net_exit(void)
{
- if (recv_sockets.sk6)
- rxe_release_udp_tunnel(recv_sockets.sk6);
-
- if (recv_sockets.sk4)
- rxe_release_udp_tunnel(recv_sockets.sk4);
-
+ rxe_release_udp_tunnel(recv_sockets.sk6);
+ rxe_release_udp_tunnel(recv_sockets.sk4);
unregister_netdevice_notifier(&rxe_net_notifier);
}
diff --git a/drivers/infiniband/sw/rxe/rxe_net.h b/drivers/infiniband/sw/rxe/rxe_net.h
index 7b06f76..0daf7f0 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.h
+++ b/drivers/infiniband/sw/rxe/rxe_net.h
@@ -44,10 +44,13 @@
};
extern struct rxe_recv_sockets recv_sockets;
+extern struct notifier_block rxe_net_notifier;
+void rxe_release_udp_tunnel(struct socket *sk);
struct rxe_dev *rxe_net_add(struct net_device *ndev);
-int rxe_net_init(void);
+int rxe_net_ipv4_init(void);
+int rxe_net_ipv6_init(void);
void rxe_net_exit(void);
#endif /* RXE_NET_H */
diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
index 3d464c2..144d2f1 100644
--- a/drivers/infiniband/sw/rxe/rxe_recv.c
+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
@@ -312,7 +312,7 @@
* make a copy of the skb to post to the next qp
*/
skb_copy = (mce->qp_list.next != &mcg->qp_list) ?
- skb_clone(skb, GFP_KERNEL) : NULL;
+ skb_clone(skb, GFP_ATOMIC) : NULL;
pkt->qp = qp;
rxe_add_ref(qp);
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 33b2d9d..13a848a 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -511,24 +511,21 @@
}
static void update_wqe_state(struct rxe_qp *qp,
- struct rxe_send_wqe *wqe,
- struct rxe_pkt_info *pkt,
- enum wqe_state *prev_state)
+ struct rxe_send_wqe *wqe,
+ struct rxe_pkt_info *pkt)
{
- enum wqe_state prev_state_ = wqe->state;
-
if (pkt->mask & RXE_END_MASK) {
if (qp_type(qp) == IB_QPT_RC)
wqe->state = wqe_state_pending;
} else {
wqe->state = wqe_state_processing;
}
-
- *prev_state = prev_state_;
}
-static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
- struct rxe_pkt_info *pkt, int payload)
+static void update_wqe_psn(struct rxe_qp *qp,
+ struct rxe_send_wqe *wqe,
+ struct rxe_pkt_info *pkt,
+ int payload)
{
/* number of packets left to send including current one */
int num_pkt = (wqe->dma.resid + payload + qp->mtu - 1) / qp->mtu;
@@ -546,10 +543,35 @@
qp->req.psn = (wqe->first_psn + num_pkt) & BTH_PSN_MASK;
else
qp->req.psn = (qp->req.psn + 1) & BTH_PSN_MASK;
+}
+static void save_state(struct rxe_send_wqe *wqe,
+ struct rxe_qp *qp,
+ struct rxe_send_wqe *rollback_wqe,
+ struct rxe_qp *rollback_qp)
+{
+ rollback_wqe->state = wqe->state;
+ rollback_wqe->first_psn = wqe->first_psn;
+ rollback_wqe->last_psn = wqe->last_psn;
+ rollback_qp->req.psn = qp->req.psn;
+}
+
+static void rollback_state(struct rxe_send_wqe *wqe,
+ struct rxe_qp *qp,
+ struct rxe_send_wqe *rollback_wqe,
+ struct rxe_qp *rollback_qp)
+{
+ wqe->state = rollback_wqe->state;
+ wqe->first_psn = rollback_wqe->first_psn;
+ wqe->last_psn = rollback_wqe->last_psn;
+ qp->req.psn = rollback_qp->req.psn;
+}
+
+static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ struct rxe_pkt_info *pkt, int payload)
+{
qp->req.opcode = pkt->opcode;
-
if (pkt->mask & RXE_END_MASK)
qp->req.wqe_index = next_index(qp->sq.queue, qp->req.wqe_index);
@@ -571,7 +593,8 @@
int mtu;
int opcode;
int ret;
- enum wqe_state prev_state;
+ struct rxe_qp rollback_qp;
+ struct rxe_send_wqe rollback_wqe;
next_wqe:
if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR))
@@ -688,13 +711,21 @@
goto err;
}
- update_wqe_state(qp, wqe, &pkt, &prev_state);
+ /*
+ * To prevent a race on wqe access between requester and completer,
+ * wqe members state and psn need to be set before calling
+ * rxe_xmit_packet().
+ * Otherwise, completer might initiate an unjustified retry flow.
+ */
+ save_state(wqe, qp, &rollback_wqe, &rollback_qp);
+ update_wqe_state(qp, wqe, &pkt);
+ update_wqe_psn(qp, wqe, &pkt, payload);
ret = rxe_xmit_packet(to_rdev(qp->ibqp.device), qp, &pkt, skb);
if (ret) {
qp->need_req_skb = 1;
kfree_skb(skb);
- wqe->state = prev_state;
+ rollback_state(wqe, qp, &rollback_wqe, &rollback_qp);
if (ret == -EAGAIN) {
rxe_run_task(&qp->req.task, 1);
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index ebb03b4..3e0f0f2 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -972,11 +972,13 @@
free_rd_atomic_resource(qp, res);
rxe_advance_resp_resource(qp);
+ memcpy(SKB_TO_PKT(skb), &ack_pkt, sizeof(skb->cb));
+
res->type = RXE_ATOMIC_MASK;
res->atomic.skb = skb;
- res->first_psn = qp->resp.psn;
- res->last_psn = qp->resp.psn;
- res->cur_psn = qp->resp.psn;
+ res->first_psn = ack_pkt.psn;
+ res->last_psn = ack_pkt.psn;
+ res->cur_psn = ack_pkt.psn;
rc = rxe_xmit_packet(rxe, qp, &ack_pkt, skb_copy);
if (rc) {
@@ -1116,8 +1118,7 @@
rc = RESPST_CLEANUP;
goto out;
}
- bth_set_psn(SKB_TO_PKT(skb_copy),
- qp->resp.psn - 1);
+
/* Resend the result. */
rc = rxe_xmit_packet(to_rdev(qp->ibqp.device), qp,
pkt, skb_copy);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index dc6d241..be11d5d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -1161,8 +1161,17 @@
}
if (level == IPOIB_FLUSH_LIGHT) {
+ int oper_up;
ipoib_mark_paths_invalid(dev);
+ /* Set IPoIB operation as down to prevent races between:
+ * the flush flow which leaves MCG and on the fly joins
+ * which can happen during that time. mcast restart task
+ * should deal with join requests we missed.
+ */
+ oper_up = test_and_clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
ipoib_mcast_dev_flush(dev);
+ if (oper_up)
+ set_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
ipoib_flush_ah(dev);
}
diff --git a/drivers/irqchip/irq-atmel-aic.c b/drivers/irqchip/irq-atmel-aic.c
index 112e17c..37f952d 100644
--- a/drivers/irqchip/irq-atmel-aic.c
+++ b/drivers/irqchip/irq-atmel-aic.c
@@ -176,6 +176,7 @@
{
struct irq_domain_chip_generic *dgc = d->gc;
struct irq_chip_generic *gc;
+ unsigned long flags;
unsigned smr;
int idx;
int ret;
@@ -194,11 +195,11 @@
gc = dgc->gc[idx];
- irq_gc_lock(gc);
+ irq_gc_lock_irqsave(gc, flags);
smr = irq_reg_readl(gc, AT91_AIC_SMR(*out_hwirq));
aic_common_set_priority(intspec[2], &smr);
irq_reg_writel(gc, smr, AT91_AIC_SMR(*out_hwirq));
- irq_gc_unlock(gc);
+ irq_gc_unlock_irqrestore(gc, flags);
return ret;
}
diff --git a/drivers/irqchip/irq-atmel-aic5.c b/drivers/irqchip/irq-atmel-aic5.c
index 4f0d068..2a624d8 100644
--- a/drivers/irqchip/irq-atmel-aic5.c
+++ b/drivers/irqchip/irq-atmel-aic5.c
@@ -258,6 +258,7 @@
unsigned int *out_type)
{
struct irq_chip_generic *bgc = irq_get_domain_generic_chip(d, 0);
+ unsigned long flags;
unsigned smr;
int ret;
@@ -269,12 +270,12 @@
if (ret)
return ret;
- irq_gc_lock(bgc);
+ irq_gc_lock_irqsave(bgc, flags);
irq_reg_writel(bgc, *out_hwirq, AT91_AIC5_SSR);
smr = irq_reg_readl(bgc, AT91_AIC5_SMR);
aic_common_set_priority(intspec[2], &smr);
irq_reg_writel(bgc, smr, AT91_AIC5_SMR);
- irq_gc_unlock(bgc);
+ irq_gc_unlock_irqrestore(bgc, flags);
return ret;
}
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 67642ba..915e84d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7610,16 +7610,12 @@
int md_setup_cluster(struct mddev *mddev, int nodes)
{
- int err;
-
- err = request_module("md-cluster");
- if (err) {
- pr_err("md-cluster module not found.\n");
- return -ENOENT;
- }
-
+ if (!md_cluster_ops)
+ request_module("md-cluster");
spin_lock(&pers_lock);
+ /* ensure module won't be unloaded */
if (!md_cluster_ops || !try_module_get(md_cluster_mod)) {
+ pr_err("can't find md-cluster module or get it's reference.\n");
spin_unlock(&pers_lock);
return -ENOENT;
}
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 51f76dd..1b1ab4a 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -96,7 +96,6 @@
spinlock_t no_space_stripes_lock;
bool need_cache_flush;
- bool in_teardown;
};
/*
@@ -704,31 +703,22 @@
mddev = log->rdev->mddev;
/*
- * This is to avoid a deadlock. r5l_quiesce holds reconfig_mutex and
- * wait for this thread to finish. This thread waits for
- * MD_CHANGE_PENDING clear, which is supposed to be done in
- * md_check_recovery(). md_check_recovery() tries to get
- * reconfig_mutex. Since r5l_quiesce already holds the mutex,
- * md_check_recovery() fails, so the PENDING never get cleared. The
- * in_teardown check workaround this issue.
+ * Discard could zero data, so before discard we must make sure
+ * superblock is updated to new log tail. Updating superblock (either
+ * directly call md_update_sb() or depend on md thread) must hold
+ * reconfig mutex. On the other hand, raid5_quiesce is called with
+ * reconfig_mutex hold. The first step of raid5_quiesce() is waitting
+ * for all IO finish, hence waitting for reclaim thread, while reclaim
+ * thread is calling this function and waitting for reconfig mutex. So
+ * there is a deadlock. We workaround this issue with a trylock.
+ * FIXME: we could miss discard if we can't take reconfig mutex
*/
- if (!log->in_teardown) {
- set_mask_bits(&mddev->flags, 0,
- BIT(MD_CHANGE_DEVS) | BIT(MD_CHANGE_PENDING));
- md_wakeup_thread(mddev->thread);
- wait_event(mddev->sb_wait,
- !test_bit(MD_CHANGE_PENDING, &mddev->flags) ||
- log->in_teardown);
- /*
- * r5l_quiesce could run after in_teardown check and hold
- * mutex first. Superblock might get updated twice.
- */
- if (log->in_teardown)
- md_update_sb(mddev, 1);
- } else {
- WARN_ON(!mddev_is_locked(mddev));
- md_update_sb(mddev, 1);
- }
+ set_mask_bits(&mddev->flags, 0,
+ BIT(MD_CHANGE_DEVS) | BIT(MD_CHANGE_PENDING));
+ if (!mddev_trylock(mddev))
+ return;
+ md_update_sb(mddev, 1);
+ mddev_unlock(mddev);
/* discard IO error really doesn't matter, ignore it */
if (log->last_checkpoint < end) {
@@ -827,7 +817,6 @@
if (!log || state == 2)
return;
if (state == 0) {
- log->in_teardown = 0;
/*
* This is a special case for hotadd. In suspend, the array has
* no journal. In resume, journal is initialized as well as the
@@ -838,11 +827,6 @@
log->reclaim_thread = md_register_thread(r5l_reclaim_thread,
log->rdev->mddev, "reclaim");
} else if (state == 1) {
- /*
- * at this point all stripes are finished, so io_unit is at
- * least in STRIPE_END state
- */
- log->in_teardown = 1;
/* make sure r5l_write_super_and_discard_space exits */
mddev = log->rdev->mddev;
wake_up(&mddev->sb_wait);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index da583bb..ee7fc37 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2423,10 +2423,10 @@
}
}
rdev_dec_pending(rdev, conf->mddev);
+ bio_reset(bi);
clear_bit(R5_LOCKED, &sh->dev[i].flags);
set_bit(STRIPE_HANDLE, &sh->state);
raid5_release_stripe(sh);
- bio_reset(bi);
}
static void raid5_end_write_request(struct bio *bi)
@@ -2498,6 +2498,7 @@
if (sh->batch_head && bi->bi_error && !replacement)
set_bit(STRIPE_BATCH_ERR, &sh->batch_head->state);
+ bio_reset(bi);
if (!test_and_clear_bit(R5_DOUBLE_LOCKED, &sh->dev[i].flags))
clear_bit(R5_LOCKED, &sh->dev[i].flags);
set_bit(STRIPE_HANDLE, &sh->state);
@@ -2505,7 +2506,6 @@
if (sh->batch_head && sh != sh->batch_head)
raid5_release_stripe(sh->batch_head);
- bio_reset(bi);
}
static void raid5_build_block(struct stripe_head *sh, int i, int previous)
@@ -6639,6 +6639,16 @@
}
conf->min_nr_stripes = NR_STRIPES;
+ if (mddev->reshape_position != MaxSector) {
+ int stripes = max_t(int,
+ ((mddev->chunk_sectors << 9) / STRIPE_SIZE) * 4,
+ ((mddev->new_chunk_sectors << 9) / STRIPE_SIZE) * 4);
+ conf->min_nr_stripes = max(NR_STRIPES, stripes);
+ if (conf->min_nr_stripes != NR_STRIPES)
+ printk(KERN_INFO
+ "md/raid:%s: force stripe size %d for reshape\n",
+ mdname(mddev), conf->min_nr_stripes);
+ }
memory = conf->min_nr_stripes * (sizeof(struct stripe_head) +
max_disks * ((sizeof(struct bio) + PAGE_SIZE))) / 1024;
atomic_set(&conf->empty_inactive_list_nr, NR_STRIPE_HASH_LOCKS);
diff --git a/drivers/media/cec-edid.c b/drivers/media/cec-edid.c
index 7001824..5719b99 100644
--- a/drivers/media/cec-edid.c
+++ b/drivers/media/cec-edid.c
@@ -70,7 +70,10 @@
u8 tag = edid[i] >> 5;
u8 len = edid[i] & 0x1f;
- if (tag == 3 && len >= 5 && i + len <= end)
+ if (tag == 3 && len >= 5 && i + len <= end &&
+ edid[i + 1] == 0x03 &&
+ edid[i + 2] == 0x0c &&
+ edid[i + 3] == 0x00)
return i + 4;
i += len + 1;
} while (i < end);
diff --git a/drivers/media/pci/cx23885/cx23885-417.c b/drivers/media/pci/cx23885/cx23885-417.c
index efec2d1..4d080da 100644
--- a/drivers/media/pci/cx23885/cx23885-417.c
+++ b/drivers/media/pci/cx23885/cx23885-417.c
@@ -1552,6 +1552,7 @@
q->mem_ops = &vb2_dma_sg_memops;
q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
q->lock = &dev->lock;
+ q->dev = &dev->pci->dev;
err = vb2_queue_init(q);
if (err < 0)
diff --git a/drivers/media/pci/saa7134/saa7134-dvb.c b/drivers/media/pci/saa7134/saa7134-dvb.c
index db987e5..59a4b5f 100644
--- a/drivers/media/pci/saa7134/saa7134-dvb.c
+++ b/drivers/media/pci/saa7134/saa7134-dvb.c
@@ -1238,6 +1238,7 @@
q->buf_struct_size = sizeof(struct saa7134_buf);
q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
q->lock = &dev->lock;
+ q->dev = &dev->pci->dev;
ret = vb2_queue_init(q);
if (ret) {
vb2_dvb_dealloc_frontends(&dev->frontends);
diff --git a/drivers/media/pci/saa7134/saa7134-empress.c b/drivers/media/pci/saa7134/saa7134-empress.c
index ca417a4..791a516 100644
--- a/drivers/media/pci/saa7134/saa7134-empress.c
+++ b/drivers/media/pci/saa7134/saa7134-empress.c
@@ -295,6 +295,7 @@
q->buf_struct_size = sizeof(struct saa7134_buf);
q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
q->lock = &dev->lock;
+ q->dev = &dev->pci->dev;
err = vb2_queue_init(q);
if (err)
return err;
diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index f25344b..552b635 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -169,7 +169,7 @@
config VIDEO_MEDIATEK_VCODEC
tristate "Mediatek Video Codec driver"
depends on MTK_IOMMU || COMPILE_TEST
- depends on VIDEO_DEV && VIDEO_V4L2
+ depends on VIDEO_DEV && VIDEO_V4L2 && HAS_DMA
depends on ARCH_MEDIATEK || COMPILE_TEST
select VIDEOBUF2_DMA_CONTIG
select V4L2_MEM2MEM_DEV
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 94f0a42..3a8e695 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -23,7 +23,6 @@
#include <media/v4l2-ioctl.h>
#include <media/videobuf2-core.h>
-#include "mtk_vcodec_util.h"
#define MTK_VCODEC_DRV_NAME "mtk_vcodec_drv"
#define MTK_VCODEC_ENC_NAME "mtk-vcodec-enc"
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
index 3ed3f2d..2c5719a 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
@@ -487,7 +487,6 @@
struct mtk_q_data *q_data;
int ret, i;
struct mtk_video_fmt *fmt;
- unsigned int pitch_w_div16;
struct v4l2_pix_format_mplane *pix_fmt_mp = &f->fmt.pix_mp;
vq = v4l2_m2m_get_vq(ctx->m2m_ctx, f->type);
@@ -530,15 +529,6 @@
q_data->coded_width = f->fmt.pix_mp.width;
q_data->coded_height = f->fmt.pix_mp.height;
- pitch_w_div16 = DIV_ROUND_UP(q_data->visible_width, 16);
- if (pitch_w_div16 % 8 != 0) {
- /* Adjust returned width/height, so application could correctly
- * allocate hw required memory
- */
- q_data->visible_height += 32;
- vidioc_try_fmt(f, q_data->fmt);
- }
-
q_data->field = f->fmt.pix_mp.field;
ctx->colorspace = f->fmt.pix_mp.colorspace;
ctx->ycbcr_enc = f->fmt.pix_mp.ycbcr_enc;
@@ -878,7 +868,8 @@
{
struct mtk_vcodec_ctx *ctx = priv;
int ret;
- struct vb2_buffer *dst_buf;
+ struct vb2_buffer *src_buf, *dst_buf;
+ struct vb2_v4l2_buffer *dst_vb2_v4l2, *src_vb2_v4l2;
struct mtk_vcodec_mem bs_buf;
struct venc_done_result enc_result;
@@ -911,6 +902,15 @@
mtk_v4l2_err("venc_if_encode failed=%d", ret);
return -EINVAL;
}
+ src_buf = v4l2_m2m_next_src_buf(ctx->m2m_ctx);
+ if (src_buf) {
+ src_vb2_v4l2 = to_vb2_v4l2_buffer(src_buf);
+ dst_vb2_v4l2 = to_vb2_v4l2_buffer(dst_buf);
+ dst_buf->timestamp = src_buf->timestamp;
+ dst_vb2_v4l2->timecode = src_vb2_v4l2->timecode;
+ } else {
+ mtk_v4l2_err("No timestamp for the header buffer.");
+ }
ctx->state = MTK_STATE_HEADER;
dst_buf->planes[0].bytesused = enc_result.bs_size;
@@ -1003,7 +1003,7 @@
struct mtk_vcodec_mem bs_buf;
struct venc_done_result enc_result;
int ret, i;
- struct vb2_v4l2_buffer *vb2_v4l2;
+ struct vb2_v4l2_buffer *dst_vb2_v4l2, *src_vb2_v4l2;
/* check dst_buf, dst_buf may be removed in device_run
* to stored encdoe header so we need check dst_buf and
@@ -1043,9 +1043,14 @@
ret = venc_if_encode(ctx, VENC_START_OPT_ENCODE_FRAME,
&frm_buf, &bs_buf, &enc_result);
- vb2_v4l2 = container_of(dst_buf, struct vb2_v4l2_buffer, vb2_buf);
+ src_vb2_v4l2 = to_vb2_v4l2_buffer(src_buf);
+ dst_vb2_v4l2 = to_vb2_v4l2_buffer(dst_buf);
+
+ dst_buf->timestamp = src_buf->timestamp;
+ dst_vb2_v4l2->timecode = src_vb2_v4l2->timecode;
+
if (enc_result.is_key_frm)
- vb2_v4l2->flags |= V4L2_BUF_FLAG_KEYFRAME;
+ dst_vb2_v4l2->flags |= V4L2_BUF_FLAG_KEYFRAME;
if (ret) {
v4l2_m2m_buf_done(to_vb2_v4l2_buffer(src_buf),
@@ -1217,7 +1222,7 @@
0, V4L2_MPEG_VIDEO_HEADER_MODE_SEPARATE);
v4l2_ctrl_new_std_menu(handler, ops, V4L2_CID_MPEG_VIDEO_H264_PROFILE,
V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
- 0, V4L2_MPEG_VIDEO_H264_PROFILE_MAIN);
+ 0, V4L2_MPEG_VIDEO_H264_PROFILE_HIGH);
v4l2_ctrl_new_std_menu(handler, ops, V4L2_CID_MPEG_VIDEO_H264_LEVEL,
V4L2_MPEG_VIDEO_H264_LEVEL_4_2,
0, V4L2_MPEG_VIDEO_H264_LEVEL_4_0);
@@ -1288,5 +1293,10 @@
void mtk_vcodec_enc_release(struct mtk_vcodec_ctx *ctx)
{
- venc_if_deinit(ctx);
+ int ret = venc_if_deinit(ctx);
+
+ if (ret)
+ mtk_v4l2_err("venc_if_deinit failed=%d", ret);
+
+ ctx->state = MTK_STATE_FREE;
}
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
index c7806ec..5cd2151 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_drv.c
@@ -218,11 +218,15 @@
mtk_v4l2_debug(1, "[%d] encoder", ctx->id);
mutex_lock(&dev->dev_mutex);
+ /*
+ * Call v4l2_m2m_ctx_release to make sure the worker thread is not
+ * running after venc_if_deinit.
+ */
+ v4l2_m2m_ctx_release(ctx->m2m_ctx);
mtk_vcodec_enc_release(ctx);
v4l2_fh_del(&ctx->fh);
v4l2_fh_exit(&ctx->fh);
v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
- v4l2_m2m_ctx_release(ctx->m2m_ctx);
list_del_init(&ctx->list);
dev->num_instances--;
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.h
index 33e890f..1213185 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.h
@@ -16,7 +16,6 @@
#define _MTK_VCODEC_INTR_H_
#define MTK_INST_IRQ_RECEIVED 0x1
-#define MTK_INST_WORK_THREAD_ABORT_DONE 0x2
struct mtk_vcodec_ctx;
diff --git a/drivers/media/platform/mtk-vcodec/venc/venc_h264_if.c b/drivers/media/platform/mtk-vcodec/venc/venc_h264_if.c
index 9a60052..63d4be4 100644
--- a/drivers/media/platform/mtk-vcodec/venc/venc_h264_if.c
+++ b/drivers/media/platform/mtk-vcodec/venc/venc_h264_if.c
@@ -61,6 +61,8 @@
/*
* struct venc_h264_vpu_config - Structure for h264 encoder configuration
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* @input_fourcc: input fourcc
* @bitrate: target bitrate (in bps)
* @pic_w: picture width. Picture size is visible stream resolution, in pixels,
@@ -94,13 +96,13 @@
/*
* struct venc_h264_vpu_buf - Structure for buffer information
- * @align: buffer alignment (in bytes)
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* @iova: IO virtual address
* @vpua: VPU side memory addr which is used by RC_CODE
* @size: buffer size (in bytes)
*/
struct venc_h264_vpu_buf {
- u32 align;
u32 iova;
u32 vpua;
u32 size;
@@ -108,6 +110,8 @@
/*
* struct venc_h264_vsi - Structure for VPU driver control and info share
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* This structure is allocated in VPU side and shared to AP side.
* @config: h264 encoder configuration
* @work_bufs: working buffer information in VPU side
@@ -150,12 +154,6 @@
struct mtk_vcodec_ctx *ctx;
};
-static inline void h264_write_reg(struct venc_h264_inst *inst, u32 addr,
- u32 val)
-{
- writel(val, inst->hw_base + addr);
-}
-
static inline u32 h264_read_reg(struct venc_h264_inst *inst, u32 addr)
{
return readl(inst->hw_base + addr);
@@ -214,6 +212,8 @@
return 40;
case V4L2_MPEG_VIDEO_H264_LEVEL_4_1:
return 41;
+ case V4L2_MPEG_VIDEO_H264_LEVEL_4_2:
+ return 42;
default:
mtk_vcodec_debug(inst, "unsupported level %d", level);
return 31;
diff --git a/drivers/media/platform/mtk-vcodec/venc/venc_vp8_if.c b/drivers/media/platform/mtk-vcodec/venc/venc_vp8_if.c
index 60bbcd2..6d97584 100644
--- a/drivers/media/platform/mtk-vcodec/venc/venc_vp8_if.c
+++ b/drivers/media/platform/mtk-vcodec/venc/venc_vp8_if.c
@@ -56,6 +56,8 @@
/*
* struct venc_vp8_vpu_config - Structure for vp8 encoder configuration
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* @input_fourcc: input fourcc
* @bitrate: target bitrate (in bps)
* @pic_w: picture width. Picture size is visible stream resolution, in pixels,
@@ -83,14 +85,14 @@
};
/*
- * struct venc_vp8_vpu_buf -Structure for buffer information
- * @align: buffer alignment (in bytes)
+ * struct venc_vp8_vpu_buf - Structure for buffer information
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* @iova: IO virtual address
* @vpua: VPU side memory addr which is used by RC_CODE
* @size: buffer size (in bytes)
*/
struct venc_vp8_vpu_buf {
- u32 align;
u32 iova;
u32 vpua;
u32 size;
@@ -98,6 +100,8 @@
/*
* struct venc_vp8_vsi - Structure for VPU driver control and info share
+ * AP-W/R : AP is writer/reader on this item
+ * VPU-W/R: VPU is write/reader on this item
* This structure is allocated in VPU side and shared to AP side.
* @config: vp8 encoder configuration
* @work_bufs: working buffer information in VPU side
@@ -138,12 +142,6 @@
struct mtk_vcodec_ctx *ctx;
};
-static inline void vp8_enc_write_reg(struct venc_vp8_inst *inst, u32 addr,
- u32 val)
-{
- writel(val, inst->hw_base + addr);
-}
-
static inline u32 vp8_enc_read_reg(struct venc_vp8_inst *inst, u32 addr)
{
return readl(inst->hw_base + addr);
diff --git a/drivers/media/platform/rcar-fcp.c b/drivers/media/platform/rcar-fcp.c
index 6a7bcc3..bc50c69 100644
--- a/drivers/media/platform/rcar-fcp.c
+++ b/drivers/media/platform/rcar-fcp.c
@@ -99,10 +99,16 @@
*/
int rcar_fcp_enable(struct rcar_fcp_device *fcp)
{
+ int error;
+
if (!fcp)
return 0;
- return pm_runtime_get_sync(fcp->dev);
+ error = pm_runtime_get_sync(fcp->dev);
+ if (error < 0)
+ return error;
+
+ return 0;
}
EXPORT_SYMBOL_GPL(rcar_fcp_enable);
diff --git a/drivers/mmc/host/omap.c b/drivers/mmc/host/omap.c
index f23d65e..be3c49f 100644
--- a/drivers/mmc/host/omap.c
+++ b/drivers/mmc/host/omap.c
@@ -1016,14 +1016,16 @@
/* Only reconfigure if we have a different burst size */
if (*bp != burst) {
- struct dma_slave_config cfg;
-
- cfg.src_addr = host->phys_base + OMAP_MMC_REG(host, DATA);
- cfg.dst_addr = host->phys_base + OMAP_MMC_REG(host, DATA);
- cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
- cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
- cfg.src_maxburst = burst;
- cfg.dst_maxburst = burst;
+ struct dma_slave_config cfg = {
+ .src_addr = host->phys_base +
+ OMAP_MMC_REG(host, DATA),
+ .dst_addr = host->phys_base +
+ OMAP_MMC_REG(host, DATA),
+ .src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES,
+ .dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES,
+ .src_maxburst = burst,
+ .dst_maxburst = burst,
+ };
if (dmaengine_slave_config(c, &cfg))
goto use_pio;
diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index 24ebc9a..5f2f24a 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -1409,11 +1409,18 @@
static int omap_hsmmc_setup_dma_transfer(struct omap_hsmmc_host *host,
struct mmc_request *req)
{
- struct dma_slave_config cfg;
struct dma_async_tx_descriptor *tx;
int ret = 0, i;
struct mmc_data *data = req->data;
struct dma_chan *chan;
+ struct dma_slave_config cfg = {
+ .src_addr = host->mapbase + OMAP_HSMMC_DATA,
+ .dst_addr = host->mapbase + OMAP_HSMMC_DATA,
+ .src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+ .dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+ .src_maxburst = data->blksz / 4,
+ .dst_maxburst = data->blksz / 4,
+ };
/* Sanity check: all the SG entries must be aligned by block size. */
for (i = 0; i < data->sg_len; i++) {
@@ -1433,13 +1440,6 @@
chan = omap_hsmmc_get_dma_chan(host, data);
- cfg.src_addr = host->mapbase + OMAP_HSMMC_DATA;
- cfg.dst_addr = host->mapbase + OMAP_HSMMC_DATA;
- cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- cfg.src_maxburst = data->blksz / 4;
- cfg.dst_maxburst = data->blksz / 4;
-
ret = dmaengine_slave_config(chan, &cfg);
if (ret)
return ret;
diff --git a/drivers/mmc/host/sdhci-st.c b/drivers/mmc/host/sdhci-st.c
index c95ba83..ed92ce72 100644
--- a/drivers/mmc/host/sdhci-st.c
+++ b/drivers/mmc/host/sdhci-st.c
@@ -28,6 +28,7 @@
struct st_mmc_platform_data {
struct reset_control *rstc;
+ struct clk *icnclk;
void __iomem *top_ioaddr;
};
@@ -353,7 +354,7 @@
struct sdhci_host *host;
struct st_mmc_platform_data *pdata;
struct sdhci_pltfm_host *pltfm_host;
- struct clk *clk;
+ struct clk *clk, *icnclk;
int ret = 0;
u16 host_version;
struct resource *res;
@@ -365,6 +366,11 @@
return PTR_ERR(clk);
}
+ /* ICN clock isn't compulsory, but use it if it's provided. */
+ icnclk = devm_clk_get(&pdev->dev, "icn");
+ if (IS_ERR(icnclk))
+ icnclk = NULL;
+
rstc = devm_reset_control_get(&pdev->dev, NULL);
if (IS_ERR(rstc))
rstc = NULL;
@@ -389,6 +395,7 @@
}
clk_prepare_enable(clk);
+ clk_prepare_enable(icnclk);
/* Configure the FlashSS Top registers for setting eMMC TX/RX delay */
res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
@@ -400,6 +407,7 @@
}
pltfm_host->clk = clk;
+ pdata->icnclk = icnclk;
/* Configure the Arasan HC inside the flashSS */
st_mmcss_cconfig(np, host);
@@ -422,6 +430,7 @@
return 0;
err_out:
+ clk_disable_unprepare(icnclk);
clk_disable_unprepare(clk);
err_of:
sdhci_pltfm_free(pdev);
@@ -442,6 +451,8 @@
ret = sdhci_pltfm_unregister(pdev);
+ clk_disable_unprepare(pdata->icnclk);
+
if (rstc)
reset_control_assert(rstc);
@@ -462,6 +473,7 @@
if (pdata->rstc)
reset_control_assert(pdata->rstc);
+ clk_disable_unprepare(pdata->icnclk);
clk_disable_unprepare(pltfm_host->clk);
out:
return ret;
@@ -475,6 +487,7 @@
struct device_node *np = dev->of_node;
clk_prepare_enable(pltfm_host->clk);
+ clk_prepare_enable(pdata->icnclk);
if (pdata->rstc)
reset_control_deassert(pdata->rstc);
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0c5415b..95c32f2 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -149,6 +149,8 @@
tristate "IP-VLAN support"
depends on INET
depends on IPV6
+ depends on NETFILTER
+ depends on NET_L3_MASTER_DEV
---help---
This allows one to create virtual devices off of a main interface
and packets will be delivered based on the dest L3 (IPv6/IPv4 addr)
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 41c0fc9..16f7cad 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -1268,11 +1268,10 @@
struct flexcan_priv *priv = netdev_priv(dev);
int err;
- err = flexcan_chip_disable(priv);
- if (err)
- return err;
-
if (netif_running(dev)) {
+ err = flexcan_chip_disable(priv);
+ if (err)
+ return err;
netif_stop_queue(dev);
netif_device_detach(dev);
}
@@ -1285,13 +1284,17 @@
{
struct net_device *dev = dev_get_drvdata(device);
struct flexcan_priv *priv = netdev_priv(dev);
+ int err;
priv->can.state = CAN_STATE_ERROR_ACTIVE;
if (netif_running(dev)) {
netif_device_attach(dev);
netif_start_queue(dev);
+ err = flexcan_chip_enable(priv);
+ if (err)
+ return err;
}
- return flexcan_chip_enable(priv);
+ return 0;
}
static SIMPLE_DEV_PM_OPS(flexcan_pm_ops, flexcan_suspend, flexcan_resume);
diff --git a/drivers/net/can/ifi_canfd/ifi_canfd.c b/drivers/net/can/ifi_canfd/ifi_canfd.c
index 2d1d22e..368bb07 100644
--- a/drivers/net/can/ifi_canfd/ifi_canfd.c
+++ b/drivers/net/can/ifi_canfd/ifi_canfd.c
@@ -81,6 +81,10 @@
#define IFI_CANFD_TIME_SET_TIMEA_4_12_6_6 BIT(15)
#define IFI_CANFD_TDELAY 0x1c
+#define IFI_CANFD_TDELAY_DEFAULT 0xb
+#define IFI_CANFD_TDELAY_MASK 0x3fff
+#define IFI_CANFD_TDELAY_ABS BIT(14)
+#define IFI_CANFD_TDELAY_EN BIT(15)
#define IFI_CANFD_ERROR 0x20
#define IFI_CANFD_ERROR_TX_OFFSET 0
@@ -641,7 +645,7 @@
struct ifi_canfd_priv *priv = netdev_priv(ndev);
const struct can_bittiming *bt = &priv->can.bittiming;
const struct can_bittiming *dbt = &priv->can.data_bittiming;
- u16 brp, sjw, tseg1, tseg2;
+ u16 brp, sjw, tseg1, tseg2, tdc;
/* Configure bit timing */
brp = bt->brp - 2;
@@ -664,6 +668,11 @@
(brp << IFI_CANFD_TIME_PRESCALE_OFF) |
(sjw << IFI_CANFD_TIME_SJW_OFF_7_9_8_8),
priv->base + IFI_CANFD_FTIME);
+
+ /* Configure transmitter delay */
+ tdc = (dbt->brp * (dbt->phase_seg1 + 1)) & IFI_CANFD_TDELAY_MASK;
+ writel(IFI_CANFD_TDELAY_EN | IFI_CANFD_TDELAY_ABS | tdc,
+ priv->base + IFI_CANFD_TDELAY);
}
static void ifi_canfd_set_filter(struct net_device *ndev, const u32 id,
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 0afc2e5..7717b19 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -764,11 +764,6 @@
return b53_get_mib_size(dev);
}
-static int b53_set_addr(struct dsa_switch *ds, u8 *addr)
-{
- return 0;
-}
-
static int b53_setup(struct dsa_switch *ds)
{
struct b53_device *dev = ds->priv;
@@ -1407,16 +1402,12 @@
}
}
-static void b53_br_set_stp_state(struct dsa_switch *ds, int port,
- u8 state)
+static void b53_br_set_stp_state(struct dsa_switch *ds, int port, u8 state)
{
struct b53_device *dev = ds->priv;
- u8 hw_state, cur_hw_state;
+ u8 hw_state;
u8 reg;
- b53_read8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(port), ®);
- cur_hw_state = reg & PORT_CTRL_STP_STATE_MASK;
-
switch (state) {
case BR_STATE_DISABLED:
hw_state = PORT_CTRL_DIS_STATE;
@@ -1438,26 +1429,20 @@
return;
}
- /* Fast-age ARL entries if we are moving a port from Learning or
- * Forwarding (cur_hw_state) state to Disabled, Blocking or Listening
- * state (hw_state)
- */
- if (cur_hw_state != hw_state) {
- if (cur_hw_state >= PORT_CTRL_LEARN_STATE &&
- hw_state <= PORT_CTRL_LISTEN_STATE) {
- if (b53_fast_age_port(dev, port)) {
- dev_err(ds->dev, "fast ageing failed\n");
- return;
- }
- }
- }
-
b53_read8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(port), ®);
reg &= ~PORT_CTRL_STP_STATE_MASK;
reg |= hw_state;
b53_write8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(port), reg);
}
+static void b53_br_fast_age(struct dsa_switch *ds, int port)
+{
+ struct b53_device *dev = ds->priv;
+
+ if (b53_fast_age_port(dev, port))
+ dev_err(ds->dev, "fast ageing failed\n");
+}
+
static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds)
{
return DSA_TAG_PROTO_NONE;
@@ -1466,7 +1451,6 @@
static struct dsa_switch_ops b53_switch_ops = {
.get_tag_protocol = b53_get_tag_protocol,
.setup = b53_setup,
- .set_addr = b53_set_addr,
.get_strings = b53_get_strings,
.get_ethtool_stats = b53_get_ethtool_stats,
.get_sset_count = b53_get_sset_count,
@@ -1478,6 +1462,7 @@
.port_bridge_join = b53_br_join,
.port_bridge_leave = b53_br_leave,
.port_stp_state_set = b53_br_set_stp_state,
+ .port_fast_age = b53_br_fast_age,
.port_vlan_filtering = b53_vlan_filtering,
.port_vlan_prepare = b53_vlan_prepare,
.port_vlan_add = b53_vlan_add,
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile
index 6971039..10ce820 100644
--- a/drivers/net/dsa/mv88e6xxx/Makefile
+++ b/drivers/net/dsa/mv88e6xxx/Makefile
@@ -1,3 +1,4 @@
obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o
mv88e6xxx-objs := chip.o
+mv88e6xxx-objs += global1.o
mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_GLOBAL2) += global2.o
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 70a812d..883fd98 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -31,6 +31,7 @@
#include <net/switchdev.h>
#include "mv88e6xxx.h"
+#include "global1.h"
#include "global2.h"
static void assert_reg_lock(struct mv88e6xxx_chip *chip)
@@ -97,7 +98,7 @@
return 0;
}
-static const struct mv88e6xxx_ops mv88e6xxx_smi_single_chip_ops = {
+static const struct mv88e6xxx_bus_ops mv88e6xxx_smi_single_chip_ops = {
.read = mv88e6xxx_smi_single_chip_read,
.write = mv88e6xxx_smi_single_chip_write,
};
@@ -179,7 +180,7 @@
return 0;
}
-static const struct mv88e6xxx_ops mv88e6xxx_smi_multi_chip_ops = {
+static const struct mv88e6xxx_bus_ops mv88e6xxx_smi_multi_chip_ops = {
.read = mv88e6xxx_smi_multi_chip_read,
.write = mv88e6xxx_smi_multi_chip_write,
};
@@ -216,15 +217,31 @@
return 0;
}
+static int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg,
+ u16 *val)
+{
+ int addr = chip->info->port_base_addr + port;
+
+ return mv88e6xxx_read(chip, addr, reg, val);
+}
+
+static int mv88e6xxx_port_write(struct mv88e6xxx_chip *chip, int port, int reg,
+ u16 val)
+{
+ int addr = chip->info->port_base_addr + port;
+
+ return mv88e6xxx_write(chip, addr, reg, val);
+}
+
static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
int reg, u16 *val)
{
int addr = phy; /* PHY devices addresses start at 0x0 */
- if (!chip->phy_ops)
+ if (!chip->info->ops->phy_read)
return -EOPNOTSUPP;
- return chip->phy_ops->read(chip, addr, reg, val);
+ return chip->info->ops->phy_read(chip, addr, reg, val);
}
static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
@@ -232,10 +249,10 @@
{
int addr = phy; /* PHY devices addresses start at 0x0 */
- if (!chip->phy_ops)
+ if (!chip->info->ops->phy_write)
return -EOPNOTSUPP;
- return chip->phy_ops->write(chip, addr, reg, val);
+ return chip->info->ops->phy_write(chip, addr, reg, val);
}
static int mv88e6xxx_phy_page_get(struct mv88e6xxx_chip *chip, int phy, u8 page)
@@ -345,46 +362,27 @@
return mv88e6xxx_write(chip, addr, reg, val);
}
-static int _mv88e6xxx_reg_read(struct mv88e6xxx_chip *chip, int addr, int reg)
+static int mv88e6xxx_ppu_disable(struct mv88e6xxx_chip *chip)
{
u16 val;
- int err;
+ int i, err;
- err = mv88e6xxx_read(chip, addr, reg, &val);
+ err = mv88e6xxx_g1_read(chip, GLOBAL_CONTROL, &val);
if (err)
return err;
- return val;
-}
-
-static int _mv88e6xxx_reg_write(struct mv88e6xxx_chip *chip, int addr,
- int reg, u16 val)
-{
- return mv88e6xxx_write(chip, addr, reg, val);
-}
-
-static int mv88e6xxx_ppu_disable(struct mv88e6xxx_chip *chip)
-{
- int ret;
- int i;
-
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_CONTROL);
- if (ret < 0)
- return ret;
-
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_CONTROL,
- ret & ~GLOBAL_CONTROL_PPU_ENABLE);
- if (ret)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL,
+ val & ~GLOBAL_CONTROL_PPU_ENABLE);
+ if (err)
+ return err;
for (i = 0; i < 16; i++) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATUS);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_STATUS, &val);
+ if (err)
+ return err;
usleep_range(1000, 2000);
- if ((ret & GLOBAL_STATUS_PPU_MASK) !=
- GLOBAL_STATUS_PPU_POLLING)
+ if ((val & GLOBAL_STATUS_PPU_MASK) != GLOBAL_STATUS_PPU_POLLING)
return 0;
}
@@ -393,25 +391,25 @@
static int mv88e6xxx_ppu_enable(struct mv88e6xxx_chip *chip)
{
- int ret, err, i;
+ u16 val;
+ int i, err;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_CONTROL);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_CONTROL, &val);
+ if (err)
+ return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_CONTROL,
- ret | GLOBAL_CONTROL_PPU_ENABLE);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL,
+ val | GLOBAL_CONTROL_PPU_ENABLE);
if (err)
return err;
for (i = 0; i < 16; i++) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATUS);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_STATUS, &val);
+ if (err)
+ return err;
usleep_range(1000, 2000);
- if ((ret & GLOBAL_STATUS_PPU_MASK) ==
- GLOBAL_STATUS_PPU_POLLING)
+ if ((val & GLOBAL_STATUS_PPU_MASK) == GLOBAL_STATUS_PPU_POLLING)
return 0;
}
@@ -517,11 +515,6 @@
return err;
}
-static const struct mv88e6xxx_ops mv88e6xxx_phy_ppu_ops = {
- .read = mv88e6xxx_phy_ppu_read,
- .write = mv88e6xxx_phy_ppu_write,
-};
-
static bool mv88e6xxx_6065_family(struct mv88e6xxx_chip *chip)
{
return chip->info->family == MV88E6XXX_FAMILY_6065;
@@ -562,21 +555,6 @@
return chip->info->family == MV88E6XXX_FAMILY_6352;
}
-static unsigned int mv88e6xxx_num_databases(struct mv88e6xxx_chip *chip)
-{
- return chip->info->num_databases;
-}
-
-static bool mv88e6xxx_has_fid_reg(struct mv88e6xxx_chip *chip)
-{
- /* Does the device have dedicated FID registers for ATU and VTU ops? */
- if (mv88e6xxx_6097_family(chip) || mv88e6xxx_6165_family(chip) ||
- mv88e6xxx_6351_family(chip) || mv88e6xxx_6352_family(chip))
- return true;
-
- return false;
-}
-
/* We expect the switch to perform auto negotiation if there is a real
* phy. However, in the case of a fixed link phy, we force the port
* settings from the fixed link settings.
@@ -585,23 +563,23 @@
struct phy_device *phydev)
{
struct mv88e6xxx_chip *chip = ds->priv;
- u32 reg;
- int ret;
+ u16 reg;
+ int err;
if (!phy_is_pseudo_fixed_link(phydev))
return;
mutex_lock(&chip->reg_lock);
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_PCS_CTRL);
- if (ret < 0)
+ err = mv88e6xxx_port_read(chip, port, PORT_PCS_CTRL, ®);
+ if (err)
goto out;
- reg = ret & ~(PORT_PCS_CTRL_LINK_UP |
- PORT_PCS_CTRL_FORCE_LINK |
- PORT_PCS_CTRL_DUPLEX_FULL |
- PORT_PCS_CTRL_FORCE_DUPLEX |
- PORT_PCS_CTRL_UNFORCED);
+ reg &= ~(PORT_PCS_CTRL_LINK_UP |
+ PORT_PCS_CTRL_FORCE_LINK |
+ PORT_PCS_CTRL_DUPLEX_FULL |
+ PORT_PCS_CTRL_FORCE_DUPLEX |
+ PORT_PCS_CTRL_UNFORCED);
reg |= PORT_PCS_CTRL_FORCE_LINK;
if (phydev->link)
@@ -630,7 +608,7 @@
reg |= PORT_PCS_CTRL_DUPLEX_FULL;
if ((mv88e6xxx_6352_family(chip) || mv88e6xxx_6351_family(chip)) &&
- (port >= chip->info->num_ports - 2)) {
+ (port >= mv88e6xxx_num_ports(chip) - 2)) {
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
reg |= PORT_PCS_CTRL_RGMII_DELAY_RXCLK;
if (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
@@ -639,7 +617,7 @@
reg |= (PORT_PCS_CTRL_RGMII_DELAY_RXCLK |
PORT_PCS_CTRL_RGMII_DELAY_TXCLK);
}
- _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_PCS_CTRL, reg);
+ mv88e6xxx_port_write(chip, port, PORT_PCS_CTRL, reg);
out:
mutex_unlock(&chip->reg_lock);
@@ -647,12 +625,12 @@
static int _mv88e6xxx_stats_wait(struct mv88e6xxx_chip *chip)
{
- int ret;
- int i;
+ u16 val;
+ int i, err;
for (i = 0; i < 10; i++) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATS_OP);
- if ((ret & GLOBAL_STATS_OP_BUSY) == 0)
+ err = mv88e6xxx_g1_read(chip, GLOBAL_STATS_OP, &val);
+ if ((val & GLOBAL_STATS_OP_BUSY) == 0)
return 0;
}
@@ -661,55 +639,52 @@
static int _mv88e6xxx_stats_snapshot(struct mv88e6xxx_chip *chip, int port)
{
- int ret;
+ int err;
if (mv88e6xxx_6320_family(chip) || mv88e6xxx_6352_family(chip))
port = (port + 1) << 5;
/* Snapshot the hardware statistics counters for this port. */
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_STATS_OP,
- GLOBAL_STATS_OP_CAPTURE_PORT |
- GLOBAL_STATS_OP_HIST_RX_TX | port);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_STATS_OP,
+ GLOBAL_STATS_OP_CAPTURE_PORT |
+ GLOBAL_STATS_OP_HIST_RX_TX | port);
+ if (err)
+ return err;
/* Wait for the snapshotting to complete. */
- ret = _mv88e6xxx_stats_wait(chip);
- if (ret < 0)
- return ret;
-
- return 0;
+ return _mv88e6xxx_stats_wait(chip);
}
static void _mv88e6xxx_stats_read(struct mv88e6xxx_chip *chip,
int stat, u32 *val)
{
- u32 _val;
- int ret;
+ u32 value;
+ u16 reg;
+ int err;
*val = 0;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_STATS_OP,
- GLOBAL_STATS_OP_READ_CAPTURED |
- GLOBAL_STATS_OP_HIST_RX_TX | stat);
- if (ret < 0)
+ err = mv88e6xxx_g1_write(chip, GLOBAL_STATS_OP,
+ GLOBAL_STATS_OP_READ_CAPTURED |
+ GLOBAL_STATS_OP_HIST_RX_TX | stat);
+ if (err)
return;
- ret = _mv88e6xxx_stats_wait(chip);
- if (ret < 0)
+ err = _mv88e6xxx_stats_wait(chip);
+ if (err)
return;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATS_COUNTER_32);
- if (ret < 0)
+ err = mv88e6xxx_g1_read(chip, GLOBAL_STATS_COUNTER_32, ®);
+ if (err)
return;
- _val = ret << 16;
+ value = reg << 16;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATS_COUNTER_01);
- if (ret < 0)
+ err = mv88e6xxx_g1_read(chip, GLOBAL_STATS_COUNTER_01, ®);
+ if (err)
return;
- *val = _val | ret;
+ *val = value | reg;
}
static struct mv88e6xxx_hw_stat mv88e6xxx_hw_stats[] = {
@@ -799,22 +774,22 @@
{
u32 low;
u32 high = 0;
- int ret;
+ int err;
+ u16 reg;
u64 value;
switch (s->type) {
case PORT:
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), s->reg);
- if (ret < 0)
+ err = mv88e6xxx_port_read(chip, port, s->reg, ®);
+ if (err)
return UINT64_MAX;
- low = ret;
+ low = reg;
if (s->sizeof_stat == 4) {
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port),
- s->reg + 1);
- if (ret < 0)
+ err = mv88e6xxx_port_read(chip, port, s->reg + 1, ®);
+ if (err)
return UINT64_MAX;
- high = ret;
+ high = reg;
}
break;
case BANK0:
@@ -893,6 +868,8 @@
struct ethtool_regs *regs, void *_p)
{
struct mv88e6xxx_chip *chip = ds->priv;
+ int err;
+ u16 reg;
u16 *p = _p;
int i;
@@ -903,11 +880,10 @@
mutex_lock(&chip->reg_lock);
for (i = 0; i < 32; i++) {
- int ret;
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), i);
- if (ret >= 0)
- p[i] = ret;
+ err = mv88e6xxx_port_read(chip, port, i, ®);
+ if (!err)
+ p[i] = reg;
}
mutex_unlock(&chip->reg_lock);
@@ -915,8 +891,7 @@
static int _mv88e6xxx_atu_wait(struct mv88e6xxx_chip *chip)
{
- return mv88e6xxx_wait(chip, REG_GLOBAL, GLOBAL_ATU_OP,
- GLOBAL_ATU_OP_BUSY);
+ return mv88e6xxx_g1_wait(chip, GLOBAL_ATU_OP, GLOBAL_ATU_OP_BUSY);
}
static int mv88e6xxx_get_eee(struct dsa_switch *ds, int port,
@@ -938,7 +913,7 @@
e->eee_enabled = !!(reg & 0x0200);
e->tx_lpi_enabled = !!(reg & 0x0100);
- err = mv88e6xxx_read(chip, REG_PORT(port), PORT_STATUS, ®);
+ err = mv88e6xxx_port_read(chip, port, PORT_STATUS, ®);
if (err)
goto out;
@@ -980,32 +955,31 @@
static int _mv88e6xxx_atu_cmd(struct mv88e6xxx_chip *chip, u16 fid, u16 cmd)
{
- int ret;
+ u16 val;
+ int err;
- if (mv88e6xxx_has_fid_reg(chip)) {
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_ATU_FID,
- fid);
- if (ret < 0)
- return ret;
+ if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G1_ATU_FID)) {
+ err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_FID, fid);
+ if (err)
+ return err;
} else if (mv88e6xxx_num_databases(chip) == 256) {
/* ATU DBNum[7:4] are located in ATU Control 15:12 */
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_ATU_CONTROL);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_ATU_CONTROL, &val);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_ATU_CONTROL,
- (ret & 0xfff) |
- ((fid << 8) & 0xf000));
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL,
+ (val & 0xfff) | ((fid << 8) & 0xf000));
+ if (err)
+ return err;
/* ATU DBNum[3:0] are located in ATU Operation 3:0 */
cmd |= fid & 0xf;
}
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_ATU_OP, cmd);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_OP, cmd);
+ if (err)
+ return err;
return _mv88e6xxx_atu_wait(chip);
}
@@ -1030,7 +1004,7 @@
data |= (entry->portv_trunkid << shift) & mask;
}
- return _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_ATU_DATA, data);
+ return mv88e6xxx_g1_write(chip, GLOBAL_ATU_DATA, data);
}
static int _mv88e6xxx_atu_flush_move(struct mv88e6xxx_chip *chip,
@@ -1106,57 +1080,45 @@
u8 state)
{
struct dsa_switch *ds = chip->ds;
- int reg, ret = 0;
+ u16 reg;
+ int err;
u8 oldstate;
- reg = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_CONTROL);
- if (reg < 0)
- return reg;
+ err = mv88e6xxx_port_read(chip, port, PORT_CONTROL, ®);
+ if (err)
+ return err;
oldstate = reg & PORT_CONTROL_STATE_MASK;
- if (oldstate != state) {
- /* Flush forwarding database if we're moving a port
- * from Learning or Forwarding state to Disabled or
- * Blocking or Listening state.
- */
- if ((oldstate == PORT_CONTROL_STATE_LEARNING ||
- oldstate == PORT_CONTROL_STATE_FORWARDING) &&
- (state == PORT_CONTROL_STATE_DISABLED ||
- state == PORT_CONTROL_STATE_BLOCKING)) {
- ret = _mv88e6xxx_atu_remove(chip, 0, port, false);
- if (ret)
- return ret;
- }
+ reg &= ~PORT_CONTROL_STATE_MASK;
+ reg |= state;
- reg = (reg & ~PORT_CONTROL_STATE_MASK) | state;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_CONTROL,
- reg);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL, reg);
+ if (err)
+ return err;
- netdev_dbg(ds->ports[port].netdev, "PortState %s (was %s)\n",
- mv88e6xxx_port_state_names[state],
- mv88e6xxx_port_state_names[oldstate]);
- }
+ netdev_dbg(ds->ports[port].netdev, "PortState %s (was %s)\n",
+ mv88e6xxx_port_state_names[state],
+ mv88e6xxx_port_state_names[oldstate]);
- return ret;
+ return 0;
}
static int _mv88e6xxx_port_based_vlan_map(struct mv88e6xxx_chip *chip, int port)
{
struct net_device *bridge = chip->ports[port].bridge_dev;
- const u16 mask = (1 << chip->info->num_ports) - 1;
+ const u16 mask = (1 << mv88e6xxx_num_ports(chip)) - 1;
struct dsa_switch *ds = chip->ds;
u16 output_ports = 0;
- int reg;
+ u16 reg;
+ int err;
int i;
/* allow CPU port or DSA link(s) to send frames to every port */
if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) {
output_ports = mask;
} else {
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
/* allow sending frames to every group member */
if (bridge && chip->ports[i].bridge_dev == bridge)
output_ports |= BIT(i);
@@ -1170,14 +1132,14 @@
/* prevent frames from going back out of the port they came in on */
output_ports &= ~BIT(port);
- reg = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_BASE_VLAN);
- if (reg < 0)
- return reg;
+ err = mv88e6xxx_port_read(chip, port, PORT_BASE_VLAN, ®);
+ if (err)
+ return err;
reg &= ~mask;
reg |= output_ports & mask;
- return _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_BASE_VLAN, reg);
+ return mv88e6xxx_port_write(chip, port, PORT_BASE_VLAN, reg);
}
static void mv88e6xxx_port_stp_state_set(struct dsa_switch *ds, int port,
@@ -1214,27 +1176,39 @@
mv88e6xxx_port_state_names[stp_state]);
}
+static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port)
+{
+ struct mv88e6xxx_chip *chip = ds->priv;
+ int err;
+
+ mutex_lock(&chip->reg_lock);
+ err = _mv88e6xxx_atu_remove(chip, 0, port, false);
+ mutex_unlock(&chip->reg_lock);
+
+ if (err)
+ netdev_err(ds->ports[port].netdev, "failed to flush ATU\n");
+}
+
static int _mv88e6xxx_port_pvid(struct mv88e6xxx_chip *chip, int port,
u16 *new, u16 *old)
{
struct dsa_switch *ds = chip->ds;
- u16 pvid;
- int ret;
+ u16 pvid, reg;
+ int err;
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_DEFAULT_VLAN);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_read(chip, port, PORT_DEFAULT_VLAN, ®);
+ if (err)
+ return err;
- pvid = ret & PORT_DEFAULT_VLAN_MASK;
+ pvid = reg & PORT_DEFAULT_VLAN_MASK;
if (new) {
- ret &= ~PORT_DEFAULT_VLAN_MASK;
- ret |= *new & PORT_DEFAULT_VLAN_MASK;
+ reg &= ~PORT_DEFAULT_VLAN_MASK;
+ reg |= *new & PORT_DEFAULT_VLAN_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_DEFAULT_VLAN, ret);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, reg);
+ if (err)
+ return err;
netdev_dbg(ds->ports[port].netdev,
"DefaultVID %d (was %d)\n", *new, pvid);
@@ -1260,17 +1234,16 @@
static int _mv88e6xxx_vtu_wait(struct mv88e6xxx_chip *chip)
{
- return mv88e6xxx_wait(chip, REG_GLOBAL, GLOBAL_VTU_OP,
- GLOBAL_VTU_OP_BUSY);
+ return mv88e6xxx_g1_wait(chip, GLOBAL_VTU_OP, GLOBAL_VTU_OP_BUSY);
}
static int _mv88e6xxx_vtu_cmd(struct mv88e6xxx_chip *chip, u16 op)
{
- int ret;
+ int err;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_OP, op);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_OP, op);
+ if (err)
+ return err;
return _mv88e6xxx_vtu_wait(chip);
}
@@ -1287,23 +1260,21 @@
}
static int _mv88e6xxx_vtu_stu_data_read(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry,
+ struct mv88e6xxx_vtu_entry *entry,
unsigned int nibble_offset)
{
u16 regs[3];
- int i;
- int ret;
+ int i, err;
for (i = 0; i < 3; ++i) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL,
- GLOBAL_VTU_DATA_0_3 + i);
- if (ret < 0)
- return ret;
+ u16 *reg = ®s[i];
- regs[i] = ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_DATA_0_3 + i, reg);
+ if (err)
+ return err;
}
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
unsigned int shift = (i % 4) * 4 + nibble_offset;
u16 reg = regs[i / 4];
@@ -1314,26 +1285,25 @@
}
static int mv88e6xxx_vtu_data_read(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
return _mv88e6xxx_vtu_stu_data_read(chip, entry, 0);
}
static int mv88e6xxx_stu_data_read(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
return _mv88e6xxx_vtu_stu_data_read(chip, entry, 2);
}
static int _mv88e6xxx_vtu_stu_data_write(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry,
+ struct mv88e6xxx_vtu_entry *entry,
unsigned int nibble_offset)
{
u16 regs[3] = { 0 };
- int i;
- int ret;
+ int i, err;
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
unsigned int shift = (i % 4) * 4 + nibble_offset;
u8 data = entry->data[i];
@@ -1341,86 +1311,85 @@
}
for (i = 0; i < 3; ++i) {
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL,
- GLOBAL_VTU_DATA_0_3 + i, regs[i]);
- if (ret < 0)
- return ret;
+ u16 reg = regs[i];
+
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_DATA_0_3 + i, reg);
+ if (err)
+ return err;
}
return 0;
}
static int mv88e6xxx_vtu_data_write(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
return _mv88e6xxx_vtu_stu_data_write(chip, entry, 0);
}
static int mv88e6xxx_stu_data_write(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
return _mv88e6xxx_vtu_stu_data_write(chip, entry, 2);
}
static int _mv88e6xxx_vtu_vid_write(struct mv88e6xxx_chip *chip, u16 vid)
{
- return _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_VID,
- vid & GLOBAL_VTU_VID_MASK);
+ return mv88e6xxx_g1_write(chip, GLOBAL_VTU_VID,
+ vid & GLOBAL_VTU_VID_MASK);
}
static int _mv88e6xxx_vtu_getnext(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
- struct mv88e6xxx_vtu_stu_entry next = { 0 };
- int ret;
+ struct mv88e6xxx_vtu_entry next = { 0 };
+ u16 val;
+ int err;
- ret = _mv88e6xxx_vtu_wait(chip);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_wait(chip);
+ if (err)
+ return err;
- ret = _mv88e6xxx_vtu_cmd(chip, GLOBAL_VTU_OP_VTU_GET_NEXT);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_cmd(chip, GLOBAL_VTU_OP_VTU_GET_NEXT);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_VTU_VID);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_VID, &val);
+ if (err)
+ return err;
- next.vid = ret & GLOBAL_VTU_VID_MASK;
- next.valid = !!(ret & GLOBAL_VTU_VID_VALID);
+ next.vid = val & GLOBAL_VTU_VID_MASK;
+ next.valid = !!(val & GLOBAL_VTU_VID_VALID);
if (next.valid) {
- ret = mv88e6xxx_vtu_data_read(chip, &next);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_vtu_data_read(chip, &next);
+ if (err)
+ return err;
- if (mv88e6xxx_has_fid_reg(chip)) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL,
- GLOBAL_VTU_FID);
- if (ret < 0)
- return ret;
+ if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G1_VTU_FID)) {
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_FID, &val);
+ if (err)
+ return err;
- next.fid = ret & GLOBAL_VTU_FID_MASK;
+ next.fid = val & GLOBAL_VTU_FID_MASK;
} else if (mv88e6xxx_num_databases(chip) == 256) {
/* VTU DBNum[7:4] are located in VTU Operation 11:8, and
* VTU DBNum[3:0] are located in VTU Operation 3:0
*/
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL,
- GLOBAL_VTU_OP);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_OP, &val);
+ if (err)
+ return err;
- next.fid = (ret & 0xf00) >> 4;
- next.fid |= ret & 0xf;
+ next.fid = (val & 0xf00) >> 4;
+ next.fid |= val & 0xf;
}
if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_STU)) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL,
- GLOBAL_VTU_SID);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_SID, &val);
+ if (err)
+ return err;
- next.sid = ret & GLOBAL_VTU_SID_MASK;
+ next.sid = val & GLOBAL_VTU_SID_MASK;
}
}
@@ -1433,7 +1402,7 @@
int (*cb)(struct switchdev_obj *obj))
{
struct mv88e6xxx_chip *chip = ds->priv;
- struct mv88e6xxx_vtu_stu_entry next;
+ struct mv88e6xxx_vtu_entry next;
u16 pvid;
int err;
@@ -1484,38 +1453,36 @@
}
static int _mv88e6xxx_vtu_loadpurge(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
u16 op = GLOBAL_VTU_OP_VTU_LOAD_PURGE;
u16 reg = 0;
- int ret;
+ int err;
- ret = _mv88e6xxx_vtu_wait(chip);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_wait(chip);
+ if (err)
+ return err;
if (!entry->valid)
goto loadpurge;
/* Write port member tags */
- ret = mv88e6xxx_vtu_data_write(chip, entry);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_vtu_data_write(chip, entry);
+ if (err)
+ return err;
if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_STU)) {
reg = entry->sid & GLOBAL_VTU_SID_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_SID,
- reg);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_SID, reg);
+ if (err)
+ return err;
}
- if (mv88e6xxx_has_fid_reg(chip)) {
+ if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G1_VTU_FID)) {
reg = entry->fid & GLOBAL_VTU_FID_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_FID,
- reg);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_FID, reg);
+ if (err)
+ return err;
} else if (mv88e6xxx_num_databases(chip) == 256) {
/* VTU DBNum[7:4] are located in VTU Operation 11:8, and
* VTU DBNum[3:0] are located in VTU Operation 3:0
@@ -1527,48 +1494,49 @@
reg = GLOBAL_VTU_VID_VALID;
loadpurge:
reg |= entry->vid & GLOBAL_VTU_VID_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_VID, reg);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_VID, reg);
+ if (err)
+ return err;
return _mv88e6xxx_vtu_cmd(chip, op);
}
static int _mv88e6xxx_stu_getnext(struct mv88e6xxx_chip *chip, u8 sid,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
- struct mv88e6xxx_vtu_stu_entry next = { 0 };
- int ret;
+ struct mv88e6xxx_vtu_entry next = { 0 };
+ u16 val;
+ int err;
- ret = _mv88e6xxx_vtu_wait(chip);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_wait(chip);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_SID,
- sid & GLOBAL_VTU_SID_MASK);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_SID,
+ sid & GLOBAL_VTU_SID_MASK);
+ if (err)
+ return err;
- ret = _mv88e6xxx_vtu_cmd(chip, GLOBAL_VTU_OP_STU_GET_NEXT);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_cmd(chip, GLOBAL_VTU_OP_STU_GET_NEXT);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_VTU_SID);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_SID, &val);
+ if (err)
+ return err;
- next.sid = ret & GLOBAL_VTU_SID_MASK;
+ next.sid = val & GLOBAL_VTU_SID_MASK;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_VTU_VID);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_VTU_VID, &val);
+ if (err)
+ return err;
- next.valid = !!(ret & GLOBAL_VTU_VID_VALID);
+ next.valid = !!(val & GLOBAL_VTU_VID_VALID);
if (next.valid) {
- ret = mv88e6xxx_stu_data_read(chip, &next);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_stu_data_read(chip, &next);
+ if (err)
+ return err;
}
*entry = next;
@@ -1576,33 +1544,33 @@
}
static int _mv88e6xxx_stu_loadpurge(struct mv88e6xxx_chip *chip,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
u16 reg = 0;
- int ret;
+ int err;
- ret = _mv88e6xxx_vtu_wait(chip);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_vtu_wait(chip);
+ if (err)
+ return err;
if (!entry->valid)
goto loadpurge;
/* Write port states */
- ret = mv88e6xxx_stu_data_write(chip, entry);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_stu_data_write(chip, entry);
+ if (err)
+ return err;
reg = GLOBAL_VTU_VID_VALID;
loadpurge:
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_VID, reg);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_VID, reg);
+ if (err)
+ return err;
reg = entry->sid & GLOBAL_VTU_SID_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_VTU_SID, reg);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_VTU_SID, reg);
+ if (err)
+ return err;
return _mv88e6xxx_vtu_cmd(chip, GLOBAL_VTU_OP_STU_LOAD_PURGE);
}
@@ -1613,7 +1581,8 @@
struct dsa_switch *ds = chip->ds;
u16 upper_mask;
u16 fid;
- int ret;
+ u16 reg;
+ int err;
if (mv88e6xxx_num_databases(chip) == 4096)
upper_mask = 0xff;
@@ -1623,37 +1592,35 @@
return -EOPNOTSUPP;
/* Port's default FID bits 3:0 are located in reg 0x06, offset 12 */
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_BASE_VLAN);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_read(chip, port, PORT_BASE_VLAN, ®);
+ if (err)
+ return err;
- fid = (ret & PORT_BASE_VLAN_FID_3_0_MASK) >> 12;
+ fid = (reg & PORT_BASE_VLAN_FID_3_0_MASK) >> 12;
if (new) {
- ret &= ~PORT_BASE_VLAN_FID_3_0_MASK;
- ret |= (*new << 12) & PORT_BASE_VLAN_FID_3_0_MASK;
+ reg &= ~PORT_BASE_VLAN_FID_3_0_MASK;
+ reg |= (*new << 12) & PORT_BASE_VLAN_FID_3_0_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_BASE_VLAN,
- ret);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_BASE_VLAN, reg);
+ if (err)
+ return err;
}
/* Port's default FID bits 11:4 are located in reg 0x05, offset 0 */
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_CONTROL_1);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_read(chip, port, PORT_CONTROL_1, ®);
+ if (err)
+ return err;
- fid |= (ret & upper_mask) << 4;
+ fid |= (reg & upper_mask) << 4;
if (new) {
- ret &= ~upper_mask;
- ret |= (*new >> 4) & upper_mask;
+ reg &= ~upper_mask;
+ reg |= (*new >> 4) & upper_mask;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_CONTROL_1,
- ret);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL_1, reg);
+ if (err)
+ return err;
netdev_dbg(ds->ports[port].netdev,
"FID %d (was %d)\n", *new, fid);
@@ -1680,13 +1647,13 @@
static int _mv88e6xxx_fid_new(struct mv88e6xxx_chip *chip, u16 *fid)
{
DECLARE_BITMAP(fid_bitmap, MV88E6XXX_N_FID);
- struct mv88e6xxx_vtu_stu_entry vlan;
+ struct mv88e6xxx_vtu_entry vlan;
int i, err;
bitmap_zero(fid_bitmap, MV88E6XXX_N_FID);
/* Set every FID bit used by the (un)bridged ports */
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
err = _mv88e6xxx_port_fid_get(chip, i, fid);
if (err)
return err;
@@ -1722,10 +1689,10 @@
}
static int _mv88e6xxx_vtu_new(struct mv88e6xxx_chip *chip, u16 vid,
- struct mv88e6xxx_vtu_stu_entry *entry)
+ struct mv88e6xxx_vtu_entry *entry)
{
struct dsa_switch *ds = chip->ds;
- struct mv88e6xxx_vtu_stu_entry vlan = {
+ struct mv88e6xxx_vtu_entry vlan = {
.valid = true,
.vid = vid,
};
@@ -1736,14 +1703,14 @@
return err;
/* exclude all ports except the CPU and DSA ports */
- for (i = 0; i < chip->info->num_ports; ++i)
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i)
vlan.data[i] = dsa_is_cpu_port(ds, i) || dsa_is_dsa_port(ds, i)
? GLOBAL_VTU_DATA_MEMBER_TAG_UNMODIFIED
: GLOBAL_VTU_DATA_MEMBER_TAG_NON_MEMBER;
if (mv88e6xxx_6097_family(chip) || mv88e6xxx_6165_family(chip) ||
mv88e6xxx_6351_family(chip) || mv88e6xxx_6352_family(chip)) {
- struct mv88e6xxx_vtu_stu_entry vstp;
+ struct mv88e6xxx_vtu_entry vstp;
/* Adding a VTU entry requires a valid STU entry. As VSTP is not
* implemented, only one STU entry is needed to cover all VTU
@@ -1770,7 +1737,7 @@
}
static int _mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid,
- struct mv88e6xxx_vtu_stu_entry *entry, bool creat)
+ struct mv88e6xxx_vtu_entry *entry, bool creat)
{
int err;
@@ -1802,7 +1769,7 @@
u16 vid_begin, u16 vid_end)
{
struct mv88e6xxx_chip *chip = ds->priv;
- struct mv88e6xxx_vtu_stu_entry vlan;
+ struct mv88e6xxx_vtu_entry vlan;
int i, err;
if (!vid_begin)
@@ -1825,7 +1792,7 @@
if (vlan.vid > vid_end)
break;
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
if (dsa_is_dsa_port(ds, i) || dsa_is_cpu_port(ds, i))
continue;
@@ -1865,26 +1832,26 @@
struct mv88e6xxx_chip *chip = ds->priv;
u16 old, new = vlan_filtering ? PORT_CONTROL_2_8021Q_SECURE :
PORT_CONTROL_2_8021Q_DISABLED;
- int ret;
+ u16 reg;
+ int err;
if (!mv88e6xxx_has(chip, MV88E6XXX_FLAG_VTU))
return -EOPNOTSUPP;
mutex_lock(&chip->reg_lock);
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_CONTROL_2);
- if (ret < 0)
+ err = mv88e6xxx_port_read(chip, port, PORT_CONTROL_2, ®);
+ if (err)
goto unlock;
- old = ret & PORT_CONTROL_2_8021Q_MASK;
+ old = reg & PORT_CONTROL_2_8021Q_MASK;
if (new != old) {
- ret &= ~PORT_CONTROL_2_8021Q_MASK;
- ret |= new & PORT_CONTROL_2_8021Q_MASK;
+ reg &= ~PORT_CONTROL_2_8021Q_MASK;
+ reg |= new & PORT_CONTROL_2_8021Q_MASK;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_CONTROL_2,
- ret);
- if (ret < 0)
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL_2, reg);
+ if (err)
goto unlock;
netdev_dbg(ds->ports[port].netdev, "802.1Q Mode %s (was %s)\n",
@@ -1892,11 +1859,11 @@
mv88e6xxx_port_8021q_mode_names[old]);
}
- ret = 0;
+ err = 0;
unlock:
mutex_unlock(&chip->reg_lock);
- return ret;
+ return err;
}
static int
@@ -1927,7 +1894,7 @@
static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
u16 vid, bool untagged)
{
- struct mv88e6xxx_vtu_stu_entry vlan;
+ struct mv88e6xxx_vtu_entry vlan;
int err;
err = _mv88e6xxx_vtu_get(chip, vid, &vlan, true);
@@ -1972,7 +1939,7 @@
int port, u16 vid)
{
struct dsa_switch *ds = chip->ds;
- struct mv88e6xxx_vtu_stu_entry vlan;
+ struct mv88e6xxx_vtu_entry vlan;
int i, err;
err = _mv88e6xxx_vtu_get(chip, vid, &vlan, false);
@@ -1987,7 +1954,7 @@
/* keep the VLAN unless all ports are excluded */
vlan.valid = false;
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
if (dsa_is_cpu_port(ds, i) || dsa_is_dsa_port(ds, i))
continue;
@@ -2041,14 +2008,13 @@
static int _mv88e6xxx_atu_mac_write(struct mv88e6xxx_chip *chip,
const unsigned char *addr)
{
- int i, ret;
+ int i, err;
for (i = 0; i < 3; i++) {
- ret = _mv88e6xxx_reg_write(
- chip, REG_GLOBAL, GLOBAL_ATU_MAC_01 + i,
- (addr[i * 2] << 8) | addr[i * 2 + 1]);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_MAC_01 + i,
+ (addr[i * 2] << 8) | addr[i * 2 + 1]);
+ if (err)
+ return err;
}
return 0;
@@ -2057,15 +2023,16 @@
static int _mv88e6xxx_atu_mac_read(struct mv88e6xxx_chip *chip,
unsigned char *addr)
{
- int i, ret;
+ u16 val;
+ int i, err;
for (i = 0; i < 3; i++) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL,
- GLOBAL_ATU_MAC_01 + i);
- if (ret < 0)
- return ret;
- addr[i * 2] = ret >> 8;
- addr[i * 2 + 1] = ret & 0xff;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_ATU_MAC_01 + i, &val);
+ if (err)
+ return err;
+
+ addr[i * 2] = val >> 8;
+ addr[i * 2 + 1] = val & 0xff;
}
return 0;
@@ -2091,12 +2058,48 @@
return _mv88e6xxx_atu_cmd(chip, entry->fid, GLOBAL_ATU_OP_LOAD_DB);
}
+static int _mv88e6xxx_atu_getnext(struct mv88e6xxx_chip *chip, u16 fid,
+ struct mv88e6xxx_atu_entry *entry);
+
+static int mv88e6xxx_atu_get(struct mv88e6xxx_chip *chip, int fid,
+ const u8 *addr, struct mv88e6xxx_atu_entry *entry)
+{
+ struct mv88e6xxx_atu_entry next;
+ int err;
+
+ eth_broadcast_addr(next.mac);
+
+ err = _mv88e6xxx_atu_mac_write(chip, next.mac);
+ if (err)
+ return err;
+
+ do {
+ err = _mv88e6xxx_atu_getnext(chip, fid, &next);
+ if (err)
+ return err;
+
+ if (next.state == GLOBAL_ATU_DATA_STATE_UNUSED)
+ break;
+
+ if (ether_addr_equal(next.mac, addr)) {
+ *entry = next;
+ return 0;
+ }
+ } while (!is_broadcast_ether_addr(next.mac));
+
+ memset(entry, 0, sizeof(*entry));
+ entry->fid = fid;
+ ether_addr_copy(entry->mac, addr);
+
+ return 0;
+}
+
static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
const unsigned char *addr, u16 vid,
u8 state)
{
- struct mv88e6xxx_atu_entry entry = { 0 };
- struct mv88e6xxx_vtu_stu_entry vlan;
+ struct mv88e6xxx_vtu_entry vlan;
+ struct mv88e6xxx_atu_entry entry;
int err;
/* Null VLAN ID corresponds to the port private database */
@@ -2107,12 +2110,18 @@
if (err)
return err;
- entry.fid = vlan.fid;
- entry.state = state;
- ether_addr_copy(entry.mac, addr);
- if (state != GLOBAL_ATU_DATA_STATE_UNUSED) {
- entry.trunk = false;
- entry.portv_trunkid = BIT(port);
+ err = mv88e6xxx_atu_get(chip, vlan.fid, addr, &entry);
+ if (err)
+ return err;
+
+ /* Purge the ATU entry only if no port is using it anymore */
+ if (state == GLOBAL_ATU_DATA_STATE_UNUSED) {
+ entry.portv_trunkid &= ~BIT(port);
+ if (!entry.portv_trunkid)
+ entry.state = GLOBAL_ATU_DATA_STATE_UNUSED;
+ } else {
+ entry.portv_trunkid |= BIT(port);
+ entry.state = state;
}
return _mv88e6xxx_atu_load(chip, &entry);
@@ -2159,31 +2168,32 @@
struct mv88e6xxx_atu_entry *entry)
{
struct mv88e6xxx_atu_entry next = { 0 };
- int ret;
+ u16 val;
+ int err;
next.fid = fid;
- ret = _mv88e6xxx_atu_wait(chip);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_atu_wait(chip);
+ if (err)
+ return err;
- ret = _mv88e6xxx_atu_cmd(chip, fid, GLOBAL_ATU_OP_GET_NEXT_DB);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_atu_cmd(chip, fid, GLOBAL_ATU_OP_GET_NEXT_DB);
+ if (err)
+ return err;
- ret = _mv88e6xxx_atu_mac_read(chip, next.mac);
- if (ret < 0)
- return ret;
+ err = _mv88e6xxx_atu_mac_read(chip, next.mac);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_ATU_DATA);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, GLOBAL_ATU_DATA, &val);
+ if (err)
+ return err;
- next.state = ret & GLOBAL_ATU_DATA_STATE_MASK;
+ next.state = val & GLOBAL_ATU_DATA_STATE_MASK;
if (next.state != GLOBAL_ATU_DATA_STATE_UNUSED) {
unsigned int mask, shift;
- if (ret & GLOBAL_ATU_DATA_TRUNK) {
+ if (val & GLOBAL_ATU_DATA_TRUNK) {
next.trunk = true;
mask = GLOBAL_ATU_DATA_TRUNK_ID_MASK;
shift = GLOBAL_ATU_DATA_TRUNK_ID_SHIFT;
@@ -2193,7 +2203,7 @@
shift = GLOBAL_ATU_DATA_PORT_VECTOR_SHIFT;
}
- next.portv_trunkid = (ret & mask) >> shift;
+ next.portv_trunkid = (val & mask) >> shift;
}
*entry = next;
@@ -2263,7 +2273,7 @@
struct switchdev_obj *obj,
int (*cb)(struct switchdev_obj *obj))
{
- struct mv88e6xxx_vtu_stu_entry vlan = {
+ struct mv88e6xxx_vtu_entry vlan = {
.vid = GLOBAL_VTU_VID_MASK, /* all ones */
};
u16 fid;
@@ -2325,7 +2335,7 @@
/* Assign the bridge and remap each port's VLANTable */
chip->ports[port].bridge_dev = bridge;
- for (i = 0; i < chip->info->num_ports; ++i) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i) {
if (chip->ports[i].bridge_dev == bridge) {
err = _mv88e6xxx_port_based_vlan_map(chip, i);
if (err)
@@ -2349,7 +2359,7 @@
/* Unassign the bridge and remap each port's VLANTable */
chip->ports[port].bridge_dev = NULL;
- for (i = 0; i < chip->info->num_ports; ++i)
+ for (i = 0; i < mv88e6xxx_num_ports(chip); ++i)
if (i == port || chip->ports[i].bridge_dev == bridge)
if (_mv88e6xxx_port_based_vlan_map(chip, i))
netdev_warn(ds->ports[i].netdev,
@@ -2364,19 +2374,20 @@
u16 is_reset = (ppu_active ? 0x8800 : 0xc800);
struct gpio_desc *gpiod = chip->reset;
unsigned long timeout;
- int ret;
+ u16 reg;
+ int err;
int i;
/* Set all ports to the disabled state. */
- for (i = 0; i < chip->info->num_ports; i++) {
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(i), PORT_CONTROL);
- if (ret < 0)
- return ret;
+ for (i = 0; i < mv88e6xxx_num_ports(chip); i++) {
+ err = mv88e6xxx_port_read(chip, i, PORT_CONTROL, ®);
+ if (err)
+ return err;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(i), PORT_CONTROL,
- ret & 0xfffc);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, i, PORT_CONTROL,
+ reg & 0xfffc);
+ if (err)
+ return err;
}
/* Wait for transmit queues to drain. */
@@ -2395,29 +2406,29 @@
* through global registers 0x18 and 0x19.
*/
if (ppu_active)
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, 0x04, 0xc000);
+ err = mv88e6xxx_g1_write(chip, 0x04, 0xc000);
else
- ret = _mv88e6xxx_reg_write(chip, REG_GLOBAL, 0x04, 0xc400);
- if (ret)
- return ret;
+ err = mv88e6xxx_g1_write(chip, 0x04, 0xc400);
+ if (err)
+ return err;
/* Wait up to one second for reset to complete. */
timeout = jiffies + 1 * HZ;
while (time_before(jiffies, timeout)) {
- ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, 0x00);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_g1_read(chip, 0x00, ®);
+ if (err)
+ return err;
- if ((ret & is_reset) == is_reset)
+ if ((reg & is_reset) == is_reset)
break;
usleep_range(1000, 2000);
}
if (time_after(jiffies, timeout))
- ret = -ETIMEDOUT;
+ err = -ETIMEDOUT;
else
- ret = 0;
+ err = 0;
- return ret;
+ return err;
}
static int mv88e6xxx_serdes_power_on(struct mv88e6xxx_chip *chip)
@@ -2438,21 +2449,10 @@
return err;
}
-static int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port,
- int reg, u16 *val)
-{
- int addr = chip->info->port_base_addr + port;
-
- if (port >= chip->info->num_ports)
- return -EINVAL;
-
- return mv88e6xxx_read(chip, addr, reg, val);
-}
-
static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
{
struct dsa_switch *ds = chip->ds;
- int ret;
+ int err;
u16 reg;
if (mv88e6xxx_6352_family(chip) || mv88e6xxx_6351_family(chip) ||
@@ -2465,7 +2465,7 @@
* and all DSA ports to their maximum bandwidth and
* full duplex.
*/
- reg = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_PCS_CTRL);
+ err = mv88e6xxx_port_read(chip, port, PORT_PCS_CTRL, ®);
if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) {
reg &= ~PORT_PCS_CTRL_UNFORCED;
reg |= PORT_PCS_CTRL_FORCE_LINK |
@@ -2480,10 +2480,9 @@
reg |= PORT_PCS_CTRL_UNFORCED;
}
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_PCS_CTRL, reg);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_PCS_CTRL, reg);
+ if (err)
+ return err;
}
/* Port Control: disable Drop-on-Unlock, disable Drop-on-Lock,
@@ -2534,26 +2533,25 @@
PORT_CONTROL_FORWARD_UNKNOWN_MC;
}
if (reg) {
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_CONTROL, reg);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL, reg);
+ if (err)
+ return err;
}
/* If this port is connected to a SerDes, make sure the SerDes is not
* powered down.
*/
if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_SERDES)) {
- ret = _mv88e6xxx_reg_read(chip, REG_PORT(port), PORT_STATUS);
- if (ret < 0)
- return ret;
- ret &= PORT_STATUS_CMODE_MASK;
- if ((ret == PORT_STATUS_CMODE_100BASE_X) ||
- (ret == PORT_STATUS_CMODE_1000BASE_X) ||
- (ret == PORT_STATUS_CMODE_SGMII)) {
- ret = mv88e6xxx_serdes_power_on(chip);
- if (ret < 0)
- return ret;
+ err = mv88e6xxx_port_read(chip, port, PORT_STATUS, ®);
+ if (err)
+ return err;
+ reg &= PORT_STATUS_CMODE_MASK;
+ if ((reg == PORT_STATUS_CMODE_100BASE_X) ||
+ (reg == PORT_STATUS_CMODE_1000BASE_X) ||
+ (reg == PORT_STATUS_CMODE_SGMII)) {
+ err = mv88e6xxx_serdes_power_on(chip);
+ if (err < 0)
+ return err;
}
}
@@ -2587,10 +2585,9 @@
reg |= PORT_CONTROL_2_8021Q_DISABLED;
if (reg) {
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_CONTROL_2, reg);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL_2, reg);
+ if (err)
+ return err;
}
/* Port Association Vector: when learning source addresses
@@ -2603,16 +2600,14 @@
if (dsa_is_cpu_port(ds, port))
reg = 0;
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_ASSOC_VECTOR,
- reg);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_ASSOC_VECTOR, reg);
+ if (err)
+ return err;
/* Egress rate control 2: disable egress rate control. */
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_RATE_CONTROL_2,
- 0x0000);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_RATE_CONTROL_2, 0x0000);
+ if (err)
+ return err;
if (mv88e6xxx_6352_family(chip) || mv88e6xxx_6351_family(chip) ||
mv88e6xxx_6165_family(chip) || mv88e6xxx_6097_family(chip) ||
@@ -2621,114 +2616,108 @@
* be paused for by the remote end or the period of
* time that this port can pause the remote end.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_PAUSE_CTRL, 0x0000);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_PAUSE_CTRL, 0x0000);
+ if (err)
+ return err;
/* Port ATU control: disable limiting the number of
* address database entries that this port is allowed
* to use.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_ATU_CONTROL, 0x0000);
+ err = mv88e6xxx_port_write(chip, port, PORT_ATU_CONTROL,
+ 0x0000);
/* Priority Override: disable DA, SA and VTU priority
* override.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_PRI_OVERRIDE, 0x0000);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_PRI_OVERRIDE,
+ 0x0000);
+ if (err)
+ return err;
/* Port Ethertype: use the Ethertype DSA Ethertype
* value.
*/
if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_EDSA)) {
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_ETH_TYPE, ETH_P_EDSA);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_ETH_TYPE,
+ ETH_P_EDSA);
+ if (err)
+ return err;
}
/* Tag Remap: use an identity 802.1p prio -> switch
* prio mapping.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_TAG_REGMAP_0123, 0x3210);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_TAG_REGMAP_0123,
+ 0x3210);
+ if (err)
+ return err;
/* Tag Remap 2: use an identity 802.1p prio -> switch
* prio mapping.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_TAG_REGMAP_4567, 0x7654);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_TAG_REGMAP_4567,
+ 0x7654);
+ if (err)
+ return err;
}
/* Rate Control: disable ingress rate limiting. */
if (mv88e6xxx_6352_family(chip) || mv88e6xxx_6351_family(chip) ||
mv88e6xxx_6165_family(chip) || mv88e6xxx_6097_family(chip) ||
mv88e6xxx_6320_family(chip)) {
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_RATE_CONTROL, 0x0001);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_RATE_CONTROL,
+ 0x0001);
+ if (err)
+ return err;
} else if (mv88e6xxx_6185_family(chip) || mv88e6xxx_6095_family(chip)) {
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port),
- PORT_RATE_CONTROL, 0x0000);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_RATE_CONTROL,
+ 0x0000);
+ if (err)
+ return err;
}
/* Port Control 1: disable trunking, disable sending
* learning messages to this port.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_CONTROL_1,
- 0x0000);
- if (ret)
- return ret;
+ err = mv88e6xxx_port_write(chip, port, PORT_CONTROL_1, 0x0000);
+ if (err)
+ return err;
/* Port based VLAN map: give each port the same default address
* database, and allow bidirectional communication between the
* CPU and DSA port(s), and the other ports.
*/
- ret = _mv88e6xxx_port_fid_set(chip, port, 0);
- if (ret)
- return ret;
+ err = _mv88e6xxx_port_fid_set(chip, port, 0);
+ if (err)
+ return err;
- ret = _mv88e6xxx_port_based_vlan_map(chip, port);
- if (ret)
- return ret;
+ err = _mv88e6xxx_port_based_vlan_map(chip, port);
+ if (err)
+ return err;
/* Default VLAN ID and priority: don't set a default VLAN
* ID, and set the default packet priority to zero.
*/
- ret = _mv88e6xxx_reg_write(chip, REG_PORT(port), PORT_DEFAULT_VLAN,
- 0x0000);
- if (ret)
- return ret;
-
- return 0;
+ return mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, 0x0000);
}
-static int mv88e6xxx_g1_set_switch_mac(struct mv88e6xxx_chip *chip, u8 *addr)
+int mv88e6xxx_g1_set_switch_mac(struct mv88e6xxx_chip *chip, u8 *addr)
{
int err;
- err = mv88e6xxx_write(chip, REG_GLOBAL, GLOBAL_MAC_01,
- (addr[0] << 8) | addr[1]);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_MAC_01, (addr[0] << 8) | addr[1]);
if (err)
return err;
- err = mv88e6xxx_write(chip, REG_GLOBAL, GLOBAL_MAC_23,
- (addr[2] << 8) | addr[3]);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_MAC_23, (addr[2] << 8) | addr[3]);
if (err)
return err;
- return mv88e6xxx_write(chip, REG_GLOBAL, GLOBAL_MAC_45,
- (addr[4] << 8) | addr[5]);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_MAC_45, (addr[4] << 8) | addr[5]);
+ if (err)
+ return err;
+
+ return 0;
}
static int mv88e6xxx_g1_set_age_time(struct mv88e6xxx_chip *chip,
@@ -2747,7 +2736,7 @@
/* Round to nearest multiple of coeff */
age_time = (msecs + coeff / 2) / coeff;
- err = mv88e6xxx_read(chip, REG_GLOBAL, GLOBAL_ATU_CONTROL, &val);
+ err = mv88e6xxx_g1_read(chip, GLOBAL_ATU_CONTROL, &val);
if (err)
return err;
@@ -2755,7 +2744,7 @@
val &= ~0xff0;
val |= age_time << 4;
- return mv88e6xxx_write(chip, REG_GLOBAL, GLOBAL_ATU_CONTROL, val);
+ return mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL, val);
}
static int mv88e6xxx_set_ageing_time(struct dsa_switch *ds,
@@ -2786,7 +2775,7 @@
mv88e6xxx_has(chip, MV88E6XXX_FLAG_PPU_ACTIVE))
reg |= GLOBAL_CONTROL_PPU_ENABLE;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_CONTROL, reg);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL, reg);
if (err)
return err;
@@ -2796,15 +2785,14 @@
reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_MONITOR_CONTROL,
- reg);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_MONITOR_CONTROL, reg);
if (err)
return err;
/* Disable remote management, and set the switch's DSA device number. */
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_CONTROL_2,
- GLOBAL_CONTROL_2_MULTIPLE_CASCADE |
- (ds->index & 0x1f));
+ err = mv88e6xxx_g1_write(chip, GLOBAL_CONTROL_2,
+ GLOBAL_CONTROL_2_MULTIPLE_CASCADE |
+ (ds->index & 0x1f));
if (err)
return err;
@@ -2817,8 +2805,8 @@
* enable address learn messages to be sent to all message
* ports.
*/
- err = mv88e6xxx_write(chip, REG_GLOBAL, GLOBAL_ATU_CONTROL,
- GLOBAL_ATU_CONTROL_LEARN2ALL);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_ATU_CONTROL,
+ GLOBAL_ATU_CONTROL_LEARN2ALL);
if (err)
return err;
@@ -2832,39 +2820,39 @@
return err;
/* Configure the IP ToS mapping registers. */
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_0, 0x0000);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_0, 0x0000);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_1, 0x0000);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_1, 0x0000);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_2, 0x5555);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_2, 0x5555);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_3, 0x5555);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_3, 0x5555);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_4, 0xaaaa);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_4, 0xaaaa);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_5, 0xaaaa);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_5, 0xaaaa);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_6, 0xffff);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_6, 0xffff);
if (err)
return err;
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IP_PRI_7, 0xffff);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IP_PRI_7, 0xffff);
if (err)
return err;
/* Configure the IEEE 802.1p priority mapping register. */
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_IEEE_PRI, 0xfa41);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_IEEE_PRI, 0xfa41);
if (err)
return err;
/* Clear the statistics counters for all ports */
- err = _mv88e6xxx_reg_write(chip, REG_GLOBAL, GLOBAL_STATS_OP,
- GLOBAL_STATS_OP_FLUSH_ALL);
+ err = mv88e6xxx_g1_write(chip, GLOBAL_STATS_OP,
+ GLOBAL_STATS_OP_FLUSH_ALL);
if (err)
return err;
@@ -2892,7 +2880,7 @@
goto unlock;
/* Setup Switch Port Registers */
- for (i = 0; i < chip->info->num_ports; i++) {
+ for (i = 0; i < mv88e6xxx_num_ports(chip); i++) {
err = mv88e6xxx_setup_port(chip, i);
if (err)
goto unlock;
@@ -2921,14 +2909,11 @@
struct mv88e6xxx_chip *chip = ds->priv;
int err;
+ if (!chip->info->ops->set_switch_mac)
+ return -EOPNOTSUPP;
+
mutex_lock(&chip->reg_lock);
-
- /* Has an indirect Switch MAC/WoL/WoF register in Global 2? */
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_SWITCH_MAC))
- err = mv88e6xxx_g2_set_switch_mac(chip, addr);
- else
- err = mv88e6xxx_g1_set_switch_mac(chip, addr);
-
+ err = chip->info->ops->set_switch_mac(chip, addr);
mutex_unlock(&chip->reg_lock);
return err;
@@ -2940,7 +2925,7 @@
u16 val;
int err;
- if (phy >= chip->info->num_ports)
+ if (phy >= mv88e6xxx_num_ports(chip))
return 0xffff;
mutex_lock(&chip->reg_lock);
@@ -2955,7 +2940,7 @@
struct mv88e6xxx_chip *chip = bus->priv;
int err;
- if (phy >= chip->info->num_ports)
+ if (phy >= mv88e6xxx_num_ports(chip))
return 0xffff;
mutex_lock(&chip->reg_lock);
@@ -3183,13 +3168,11 @@
struct mv88e6xxx_chip *chip = ds->priv;
int err;
+ if (!chip->info->ops->get_eeprom)
+ return -EOPNOTSUPP;
+
mutex_lock(&chip->reg_lock);
-
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_EEPROM16))
- err = mv88e6xxx_g2_get_eeprom16(chip, eeprom, data);
- else
- err = -EOPNOTSUPP;
-
+ err = chip->info->ops->get_eeprom(chip, eeprom, data);
mutex_unlock(&chip->reg_lock);
if (err)
@@ -3206,21 +3189,133 @@
struct mv88e6xxx_chip *chip = ds->priv;
int err;
+ if (!chip->info->ops->set_eeprom)
+ return -EOPNOTSUPP;
+
if (eeprom->magic != 0xc3ec4951)
return -EINVAL;
mutex_lock(&chip->reg_lock);
-
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_EEPROM16))
- err = mv88e6xxx_g2_set_eeprom16(chip, eeprom, data);
- else
- err = -EOPNOTSUPP;
-
+ err = chip->info->ops->set_eeprom(chip, eeprom, data);
mutex_unlock(&chip->reg_lock);
return err;
}
+static const struct mv88e6xxx_ops mv88e6085_ops = {
+ .set_switch_mac = mv88e6xxx_g1_set_switch_mac,
+ .phy_read = mv88e6xxx_phy_ppu_read,
+ .phy_write = mv88e6xxx_phy_ppu_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6095_ops = {
+ .set_switch_mac = mv88e6xxx_g1_set_switch_mac,
+ .phy_read = mv88e6xxx_phy_ppu_read,
+ .phy_write = mv88e6xxx_phy_ppu_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6123_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_read,
+ .phy_write = mv88e6xxx_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6131_ops = {
+ .set_switch_mac = mv88e6xxx_g1_set_switch_mac,
+ .phy_read = mv88e6xxx_phy_ppu_read,
+ .phy_write = mv88e6xxx_phy_ppu_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6161_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_read,
+ .phy_write = mv88e6xxx_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6165_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_read,
+ .phy_write = mv88e6xxx_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6171_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6172_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6175_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6176_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6185_ops = {
+ .set_switch_mac = mv88e6xxx_g1_set_switch_mac,
+ .phy_read = mv88e6xxx_phy_ppu_read,
+ .phy_write = mv88e6xxx_phy_ppu_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6240_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6320_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6321_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6350_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6351_ops = {
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
+static const struct mv88e6xxx_ops mv88e6352_ops = {
+ .get_eeprom = mv88e6xxx_g2_get_eeprom16,
+ .set_eeprom = mv88e6xxx_g2_set_eeprom16,
+ .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+ .phy_read = mv88e6xxx_g2_smi_phy_read,
+ .phy_write = mv88e6xxx_g2_smi_phy_write,
+};
+
static const struct mv88e6xxx_info mv88e6xxx_table[] = {
[MV88E6085] = {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6085,
@@ -3229,8 +3324,10 @@
.num_databases = 4096,
.num_ports = 10,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6097,
+ .ops = &mv88e6085_ops,
},
[MV88E6095] = {
@@ -3240,8 +3337,10 @@
.num_databases = 256,
.num_ports = 11,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6095,
+ .ops = &mv88e6095_ops,
},
[MV88E6123] = {
@@ -3251,8 +3350,10 @@
.num_databases = 4096,
.num_ports = 3,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
+ .ops = &mv88e6123_ops,
},
[MV88E6131] = {
@@ -3262,8 +3363,10 @@
.num_databases = 256,
.num_ports = 8,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6185,
+ .ops = &mv88e6131_ops,
},
[MV88E6161] = {
@@ -3273,8 +3376,10 @@
.num_databases = 4096,
.num_ports = 6,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
+ .ops = &mv88e6161_ops,
},
[MV88E6165] = {
@@ -3284,8 +3389,10 @@
.num_databases = 4096,
.num_ports = 6,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6165,
+ .ops = &mv88e6165_ops,
},
[MV88E6171] = {
@@ -3295,8 +3402,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6351,
+ .ops = &mv88e6171_ops,
},
[MV88E6172] = {
@@ -3306,8 +3415,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6352,
+ .ops = &mv88e6172_ops,
},
[MV88E6175] = {
@@ -3317,8 +3428,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6351,
+ .ops = &mv88e6175_ops,
},
[MV88E6176] = {
@@ -3328,8 +3441,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6352,
+ .ops = &mv88e6176_ops,
},
[MV88E6185] = {
@@ -3339,8 +3454,10 @@
.num_databases = 256,
.num_ports = 10,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6185,
+ .ops = &mv88e6185_ops,
},
[MV88E6240] = {
@@ -3350,8 +3467,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6352,
+ .ops = &mv88e6240_ops,
},
[MV88E6320] = {
@@ -3361,8 +3480,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6320,
+ .ops = &mv88e6320_ops,
},
[MV88E6321] = {
@@ -3372,8 +3493,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6320,
+ .ops = &mv88e6321_ops,
},
[MV88E6350] = {
@@ -3383,8 +3506,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6351,
+ .ops = &mv88e6350_ops,
},
[MV88E6351] = {
@@ -3394,8 +3519,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6351,
+ .ops = &mv88e6351_ops,
},
[MV88E6352] = {
@@ -3405,8 +3532,10 @@
.num_databases = 4096,
.num_ports = 7,
.port_base_addr = 0x10,
+ .global1_addr = 0x1b,
.age_time_coeff = 15000,
.flags = MV88E6XXX_FLAGS_FAMILY_6352,
+ .ops = &mv88e6352_ops,
},
};
@@ -3469,33 +3598,16 @@
return chip;
}
-static const struct mv88e6xxx_ops mv88e6xxx_g2_smi_phy_ops = {
- .read = mv88e6xxx_g2_smi_phy_read,
- .write = mv88e6xxx_g2_smi_phy_write,
-};
-
-static const struct mv88e6xxx_ops mv88e6xxx_phy_ops = {
- .read = mv88e6xxx_read,
- .write = mv88e6xxx_write,
-};
-
static void mv88e6xxx_phy_init(struct mv88e6xxx_chip *chip)
{
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_SMI_PHY)) {
- chip->phy_ops = &mv88e6xxx_g2_smi_phy_ops;
- } else if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_PPU)) {
- chip->phy_ops = &mv88e6xxx_phy_ppu_ops;
+ if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_PPU))
mv88e6xxx_ppu_state_init(chip);
- } else {
- chip->phy_ops = &mv88e6xxx_phy_ops;
- }
}
static void mv88e6xxx_phy_destroy(struct mv88e6xxx_chip *chip)
{
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_PPU)) {
+ if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_PPU))
mv88e6xxx_ppu_state_destroy(chip);
- }
}
static int mv88e6xxx_smi_init(struct mv88e6xxx_chip *chip,
@@ -3648,6 +3760,7 @@
.port_bridge_join = mv88e6xxx_port_bridge_join,
.port_bridge_leave = mv88e6xxx_port_bridge_leave,
.port_stp_state_set = mv88e6xxx_port_stp_state_set,
+ .port_fast_age = mv88e6xxx_port_fast_age,
.port_vlan_filtering = mv88e6xxx_port_vlan_filtering,
.port_vlan_prepare = mv88e6xxx_port_vlan_prepare,
.port_vlan_add = mv88e6xxx_port_vlan_add,
@@ -3720,7 +3833,7 @@
if (IS_ERR(chip->reset))
return PTR_ERR(chip->reset);
- if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_EEPROM16) &&
+ if (chip->info->ops->get_eeprom &&
!of_property_read_u32(np, "eeprom-length", &eeprom_len))
chip->eeprom_len = eeprom_len;
diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c
new file mode 100644
index 0000000..d358720
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/global1.c
@@ -0,0 +1,34 @@
+/*
+ * Marvell 88E6xxx Switch Global (1) Registers support
+ *
+ * Copyright (c) 2008 Marvell Semiconductor
+ *
+ * Copyright (c) 2016 Vivien Didelot <vivien.didelot@savoirfairelinux.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include "mv88e6xxx.h"
+#include "global1.h"
+
+int mv88e6xxx_g1_read(struct mv88e6xxx_chip *chip, int reg, u16 *val)
+{
+ int addr = chip->info->global1_addr;
+
+ return mv88e6xxx_read(chip, addr, reg, val);
+}
+
+int mv88e6xxx_g1_write(struct mv88e6xxx_chip *chip, int reg, u16 val)
+{
+ int addr = chip->info->global1_addr;
+
+ return mv88e6xxx_write(chip, addr, reg, val);
+}
+
+int mv88e6xxx_g1_wait(struct mv88e6xxx_chip *chip, int reg, u16 mask)
+{
+ return mv88e6xxx_wait(chip, chip->info->global1_addr, reg, mask);
+}
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
new file mode 100644
index 0000000..62291e6
--- /dev/null
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -0,0 +1,23 @@
+/*
+ * Marvell 88E6xxx Switch Global (1) Registers support
+ *
+ * Copyright (c) 2008 Marvell Semiconductor
+ *
+ * Copyright (c) 2016 Vivien Didelot <vivien.didelot@savoirfairelinux.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _MV88E6XXX_GLOBAL1_H
+#define _MV88E6XXX_GLOBAL1_H
+
+#include "mv88e6xxx.h"
+
+int mv88e6xxx_g1_read(struct mv88e6xxx_chip *chip, int reg, u16 *val);
+int mv88e6xxx_g1_write(struct mv88e6xxx_chip *chip, int reg, u16 val);
+int mv88e6xxx_g1_wait(struct mv88e6xxx_chip *chip, int reg, u16 mask);
+
+#endif /* _MV88E6XXX_GLOBAL1_H */
diff --git a/drivers/net/dsa/mv88e6xxx/global2.c b/drivers/net/dsa/mv88e6xxx/global2.c
index 99ed028..cf686e7 100644
--- a/drivers/net/dsa/mv88e6xxx/global2.c
+++ b/drivers/net/dsa/mv88e6xxx/global2.c
@@ -14,6 +14,28 @@
#include "mv88e6xxx.h"
#include "global2.h"
+#define ADDR_GLOBAL2 0x1c
+
+static int mv88e6xxx_g2_read(struct mv88e6xxx_chip *chip, int reg, u16 *val)
+{
+ return mv88e6xxx_read(chip, ADDR_GLOBAL2, reg, val);
+}
+
+static int mv88e6xxx_g2_write(struct mv88e6xxx_chip *chip, int reg, u16 val)
+{
+ return mv88e6xxx_write(chip, ADDR_GLOBAL2, reg, val);
+}
+
+static int mv88e6xxx_g2_update(struct mv88e6xxx_chip *chip, int reg, u16 update)
+{
+ return mv88e6xxx_update(chip, ADDR_GLOBAL2, reg, update);
+}
+
+static int mv88e6xxx_g2_wait(struct mv88e6xxx_chip *chip, int reg, u16 mask)
+{
+ return mv88e6xxx_wait(chip, ADDR_GLOBAL2, reg, mask);
+}
+
/* Offset 0x06: Device Mapping Table register */
static int mv88e6xxx_g2_device_mapping_write(struct mv88e6xxx_chip *chip,
@@ -21,7 +43,7 @@
{
u16 val = (target << 8) | (port & 0xf);
- return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_DEVICE_MAPPING, val);
+ return mv88e6xxx_g2_update(chip, GLOBAL2_DEVICE_MAPPING, val);
}
static int mv88e6xxx_g2_set_device_mapping(struct mv88e6xxx_chip *chip)
@@ -52,13 +74,13 @@
static int mv88e6xxx_g2_trunk_mask_write(struct mv88e6xxx_chip *chip, int num,
bool hask, u16 mask)
{
- const u16 port_mask = BIT(chip->info->num_ports) - 1;
+ const u16 port_mask = BIT(mv88e6xxx_num_ports(chip)) - 1;
u16 val = (num << 12) | (mask & port_mask);
if (hask)
val |= GLOBAL2_TRUNK_MASK_HASK;
- return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_TRUNK_MASK, val);
+ return mv88e6xxx_g2_update(chip, GLOBAL2_TRUNK_MASK, val);
}
/* Offset 0x08: Trunk Mapping Table register */
@@ -66,15 +88,15 @@
static int mv88e6xxx_g2_trunk_mapping_write(struct mv88e6xxx_chip *chip, int id,
u16 map)
{
- const u16 port_mask = BIT(chip->info->num_ports) - 1;
+ const u16 port_mask = BIT(mv88e6xxx_num_ports(chip)) - 1;
u16 val = (id << 11) | (map & port_mask);
- return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_TRUNK_MAPPING, val);
+ return mv88e6xxx_g2_update(chip, GLOBAL2_TRUNK_MAPPING, val);
}
static int mv88e6xxx_g2_clear_trunk(struct mv88e6xxx_chip *chip)
{
- const u16 port_mask = BIT(chip->info->num_ports) - 1;
+ const u16 port_mask = BIT(mv88e6xxx_num_ports(chip)) - 1;
int i, err;
/* Clear all eight possible Trunk Mask vectors */
@@ -103,17 +125,17 @@
int port, err;
/* Init all Ingress Rate Limit resources of all ports */
- for (port = 0; port < chip->info->num_ports; ++port) {
+ for (port = 0; port < mv88e6xxx_num_ports(chip); ++port) {
/* XXX newer chips (like 88E6390) have different 2-bit ops */
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_IRL_CMD,
- GLOBAL2_IRL_CMD_OP_INIT_ALL |
- (port << 8));
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_IRL_CMD,
+ GLOBAL2_IRL_CMD_OP_INIT_ALL |
+ (port << 8));
if (err)
break;
/* Wait for the operation to complete */
- err = mv88e6xxx_wait(chip, REG_GLOBAL2, GLOBAL2_IRL_CMD,
- GLOBAL2_IRL_CMD_BUSY);
+ err = mv88e6xxx_g2_wait(chip, GLOBAL2_IRL_CMD,
+ GLOBAL2_IRL_CMD_BUSY);
if (err)
break;
}
@@ -128,7 +150,7 @@
{
u16 val = (pointer << 8) | data;
- return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_SWITCH_MAC, val);
+ return mv88e6xxx_g2_update(chip, GLOBAL2_SWITCH_MAC, val);
}
int mv88e6xxx_g2_set_switch_mac(struct mv88e6xxx_chip *chip, u8 *addr)
@@ -151,7 +173,7 @@
{
u16 val = (pointer << 8) | (data & 0x7);
- return mv88e6xxx_update(chip, REG_GLOBAL2, GLOBAL2_PRIO_OVERRIDE, val);
+ return mv88e6xxx_g2_update(chip, GLOBAL2_PRIO_OVERRIDE, val);
}
static int mv88e6xxx_g2_clear_pot(struct mv88e6xxx_chip *chip)
@@ -174,16 +196,16 @@
static int mv88e6xxx_g2_eeprom_wait(struct mv88e6xxx_chip *chip)
{
- return mv88e6xxx_wait(chip, REG_GLOBAL2, GLOBAL2_EEPROM_CMD,
- GLOBAL2_EEPROM_CMD_BUSY |
- GLOBAL2_EEPROM_CMD_RUNNING);
+ return mv88e6xxx_g2_wait(chip, GLOBAL2_EEPROM_CMD,
+ GLOBAL2_EEPROM_CMD_BUSY |
+ GLOBAL2_EEPROM_CMD_RUNNING);
}
static int mv88e6xxx_g2_eeprom_cmd(struct mv88e6xxx_chip *chip, u16 cmd)
{
int err;
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_EEPROM_CMD, cmd);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_EEPROM_CMD, cmd);
if (err)
return err;
@@ -204,7 +226,7 @@
if (err)
return err;
- return mv88e6xxx_read(chip, REG_GLOBAL2, GLOBAL2_EEPROM_DATA, data);
+ return mv88e6xxx_g2_read(chip, GLOBAL2_EEPROM_DATA, data);
}
static int mv88e6xxx_g2_eeprom_write16(struct mv88e6xxx_chip *chip,
@@ -217,7 +239,7 @@
if (err)
return err;
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_EEPROM_DATA, data);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_EEPROM_DATA, data);
if (err)
return err;
@@ -283,7 +305,7 @@
int err;
/* Ensure the RO WriteEn bit is set */
- err = mv88e6xxx_read(chip, REG_GLOBAL2, GLOBAL2_EEPROM_CMD, &val);
+ err = mv88e6xxx_g2_read(chip, GLOBAL2_EEPROM_CMD, &val);
if (err)
return err;
@@ -346,15 +368,15 @@
static int mv88e6xxx_g2_smi_phy_wait(struct mv88e6xxx_chip *chip)
{
- return mv88e6xxx_wait(chip, REG_GLOBAL2, GLOBAL2_SMI_PHY_CMD,
- GLOBAL2_SMI_PHY_CMD_BUSY);
+ return mv88e6xxx_g2_wait(chip, GLOBAL2_SMI_PHY_CMD,
+ GLOBAL2_SMI_PHY_CMD_BUSY);
}
static int mv88e6xxx_g2_smi_phy_cmd(struct mv88e6xxx_chip *chip, u16 cmd)
{
int err;
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_SMI_PHY_CMD, cmd);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_SMI_PHY_CMD, cmd);
if (err)
return err;
@@ -375,7 +397,7 @@
if (err)
return err;
- return mv88e6xxx_read(chip, REG_GLOBAL2, GLOBAL2_SMI_PHY_DATA, val);
+ return mv88e6xxx_g2_read(chip, GLOBAL2_SMI_PHY_DATA, val);
}
int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, int addr, int reg,
@@ -388,7 +410,7 @@
if (err)
return err;
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_SMI_PHY_DATA, val);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_SMI_PHY_DATA, val);
if (err)
return err;
@@ -404,8 +426,7 @@
/* Consider the frames with reserved multicast destination
* addresses matching 01:80:c2:00:00:2x as MGMT.
*/
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_MGMT_EN_2X,
- 0xffff);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_MGMT_EN_2X, 0xffff);
if (err)
return err;
}
@@ -414,8 +435,7 @@
/* Consider the frames with reserved multicast destination
* addresses matching 01:80:c2:00:00:0x as MGMT.
*/
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_MGMT_EN_0X,
- 0xffff);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_MGMT_EN_0X, 0xffff);
if (err)
return err;
}
@@ -429,7 +449,7 @@
if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_MGMT_EN_0X) ||
mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_MGMT_EN_2X))
reg |= GLOBAL2_SWITCH_MGMT_RSVD2CPU | 0x7;
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_SWITCH_MGMT, reg);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_SWITCH_MGMT, reg);
if (err)
return err;
@@ -454,8 +474,8 @@
if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_PVT)) {
/* Initialize Cross-chip Port VLAN Table to reset defaults */
- err = mv88e6xxx_write(chip, REG_GLOBAL2, GLOBAL2_PVT_ADDR,
- GLOBAL2_PVT_ADDR_OP_INIT_ONES);
+ err = mv88e6xxx_g2_write(chip, GLOBAL2_PVT_ADDR,
+ GLOBAL2_PVT_ADDR_OP_INIT_ONES);
if (err)
return err;
}
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 52f3f52..e572121 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -37,7 +37,6 @@
#define ADDR_SERDES 0x0f
#define SERDES_PAGE_FIBER 0x01
-#define REG_PORT(p) (0x10 + (p))
#define PORT_STATUS 0x00
#define PORT_STATUS_PAUSE_EN BIT(15)
#define PORT_STATUS_MY_PAUSE BIT(14)
@@ -160,7 +159,6 @@
#define PORT_TAG_REGMAP_0123 0x18
#define PORT_TAG_REGMAP_4567 0x19
-#define REG_GLOBAL 0x1b
#define GLOBAL_STATUS 0x00
#define GLOBAL_STATUS_PPU_STATE BIT(15) /* 6351 and 6171 */
/* Two bits for 6165, 6185 etc */
@@ -172,8 +170,8 @@
#define GLOBAL_MAC_01 0x01
#define GLOBAL_MAC_23 0x02
#define GLOBAL_MAC_45 0x03
-#define GLOBAL_ATU_FID 0x01 /* 6097 6165 6351 6352 */
-#define GLOBAL_VTU_FID 0x02 /* 6097 6165 6351 6352 */
+#define GLOBAL_ATU_FID 0x01
+#define GLOBAL_VTU_FID 0x02
#define GLOBAL_VTU_FID_MASK 0xfff
#define GLOBAL_VTU_SID 0x03 /* 6097 6165 6351 6352 */
#define GLOBAL_VTU_SID_MASK 0x3f
@@ -278,7 +276,6 @@
#define GLOBAL_STATS_COUNTER_32 0x1e
#define GLOBAL_STATS_COUNTER_01 0x1f
-#define REG_GLOBAL2 0x1c
#define GLOBAL2_INT_SOURCE 0x00
#define GLOBAL2_INT_MASK 0x01
#define GLOBAL2_MGMT_EN_2X 0x02
@@ -411,6 +408,11 @@
*/
MV88E6XXX_CAP_SERDES,
+ /* Switch Global (1) Registers.
+ */
+ MV88E6XXX_CAP_G1_ATU_FID, /* (0x01) ATU FID Register */
+ MV88E6XXX_CAP_G1_VTU_FID, /* (0x02) VTU FID Register */
+
/* Switch Global 2 Registers.
* The device contains a second set of global 16-bit registers.
*/
@@ -421,12 +423,7 @@
MV88E6XXX_CAP_G2_IRL_DATA, /* (0x0a) Ingress Rate Data */
MV88E6XXX_CAP_G2_PVT_ADDR, /* (0x0b) Cross Chip Port VLAN Addr */
MV88E6XXX_CAP_G2_PVT_DATA, /* (0x0c) Cross Chip Port VLAN Data */
- MV88E6XXX_CAP_G2_SWITCH_MAC, /* (0x0d) Switch MAC/WoL/WoF */
MV88E6XXX_CAP_G2_POT, /* (0x0f) Priority Override Table */
- MV88E6XXX_CAP_G2_EEPROM_CMD, /* (0x14) EEPROM Command */
- MV88E6XXX_CAP_G2_EEPROM_DATA, /* (0x15) EEPROM Data */
- MV88E6XXX_CAP_G2_SMI_PHY_CMD, /* (0x18) SMI PHY Command */
- MV88E6XXX_CAP_G2_SMI_PHY_DATA, /* (0x19) SMI PHY Data */
/* PHY Polling Unit.
* See GLOBAL_CONTROL_PPU_ENABLE and GLOBAL_STATUS_PPU_POLLING.
@@ -453,41 +450,34 @@
};
/* Bitmask of capabilities */
-#define MV88E6XXX_FLAG_EDSA BIT(MV88E6XXX_CAP_EDSA)
-#define MV88E6XXX_FLAG_EEE BIT(MV88E6XXX_CAP_EEE)
+#define MV88E6XXX_FLAG_EDSA BIT_ULL(MV88E6XXX_CAP_EDSA)
+#define MV88E6XXX_FLAG_EEE BIT_ULL(MV88E6XXX_CAP_EEE)
-#define MV88E6XXX_FLAG_SMI_CMD BIT(MV88E6XXX_CAP_SMI_CMD)
-#define MV88E6XXX_FLAG_SMI_DATA BIT(MV88E6XXX_CAP_SMI_DATA)
+#define MV88E6XXX_FLAG_SMI_CMD BIT_ULL(MV88E6XXX_CAP_SMI_CMD)
+#define MV88E6XXX_FLAG_SMI_DATA BIT_ULL(MV88E6XXX_CAP_SMI_DATA)
-#define MV88E6XXX_FLAG_PHY_PAGE BIT(MV88E6XXX_CAP_PHY_PAGE)
+#define MV88E6XXX_FLAG_PHY_PAGE BIT_ULL(MV88E6XXX_CAP_PHY_PAGE)
-#define MV88E6XXX_FLAG_SERDES BIT(MV88E6XXX_CAP_SERDES)
+#define MV88E6XXX_FLAG_SERDES BIT_ULL(MV88E6XXX_CAP_SERDES)
-#define MV88E6XXX_FLAG_GLOBAL2 BIT(MV88E6XXX_CAP_GLOBAL2)
-#define MV88E6XXX_FLAG_G2_MGMT_EN_2X BIT(MV88E6XXX_CAP_G2_MGMT_EN_2X)
-#define MV88E6XXX_FLAG_G2_MGMT_EN_0X BIT(MV88E6XXX_CAP_G2_MGMT_EN_0X)
-#define MV88E6XXX_FLAG_G2_IRL_CMD BIT(MV88E6XXX_CAP_G2_IRL_CMD)
-#define MV88E6XXX_FLAG_G2_IRL_DATA BIT(MV88E6XXX_CAP_G2_IRL_DATA)
-#define MV88E6XXX_FLAG_G2_PVT_ADDR BIT(MV88E6XXX_CAP_G2_PVT_ADDR)
-#define MV88E6XXX_FLAG_G2_PVT_DATA BIT(MV88E6XXX_CAP_G2_PVT_DATA)
-#define MV88E6XXX_FLAG_G2_SWITCH_MAC BIT(MV88E6XXX_CAP_G2_SWITCH_MAC)
-#define MV88E6XXX_FLAG_G2_POT BIT(MV88E6XXX_CAP_G2_POT)
-#define MV88E6XXX_FLAG_G2_EEPROM_CMD BIT(MV88E6XXX_CAP_G2_EEPROM_CMD)
-#define MV88E6XXX_FLAG_G2_EEPROM_DATA BIT(MV88E6XXX_CAP_G2_EEPROM_DATA)
-#define MV88E6XXX_FLAG_G2_SMI_PHY_CMD BIT(MV88E6XXX_CAP_G2_SMI_PHY_CMD)
-#define MV88E6XXX_FLAG_G2_SMI_PHY_DATA BIT(MV88E6XXX_CAP_G2_SMI_PHY_DATA)
+#define MV88E6XXX_FLAG_G1_ATU_FID BIT_ULL(MV88E6XXX_CAP_G1_ATU_FID)
+#define MV88E6XXX_FLAG_G1_VTU_FID BIT_ULL(MV88E6XXX_CAP_G1_VTU_FID)
-#define MV88E6XXX_FLAG_PPU BIT(MV88E6XXX_CAP_PPU)
-#define MV88E6XXX_FLAG_PPU_ACTIVE BIT(MV88E6XXX_CAP_PPU_ACTIVE)
-#define MV88E6XXX_FLAG_STU BIT(MV88E6XXX_CAP_STU)
-#define MV88E6XXX_FLAG_TEMP BIT(MV88E6XXX_CAP_TEMP)
-#define MV88E6XXX_FLAG_TEMP_LIMIT BIT(MV88E6XXX_CAP_TEMP_LIMIT)
-#define MV88E6XXX_FLAG_VTU BIT(MV88E6XXX_CAP_VTU)
+#define MV88E6XXX_FLAG_GLOBAL2 BIT_ULL(MV88E6XXX_CAP_GLOBAL2)
+#define MV88E6XXX_FLAG_G2_MGMT_EN_2X BIT_ULL(MV88E6XXX_CAP_G2_MGMT_EN_2X)
+#define MV88E6XXX_FLAG_G2_MGMT_EN_0X BIT_ULL(MV88E6XXX_CAP_G2_MGMT_EN_0X)
+#define MV88E6XXX_FLAG_G2_IRL_CMD BIT_ULL(MV88E6XXX_CAP_G2_IRL_CMD)
+#define MV88E6XXX_FLAG_G2_IRL_DATA BIT_ULL(MV88E6XXX_CAP_G2_IRL_DATA)
+#define MV88E6XXX_FLAG_G2_PVT_ADDR BIT_ULL(MV88E6XXX_CAP_G2_PVT_ADDR)
+#define MV88E6XXX_FLAG_G2_PVT_DATA BIT_ULL(MV88E6XXX_CAP_G2_PVT_DATA)
+#define MV88E6XXX_FLAG_G2_POT BIT_ULL(MV88E6XXX_CAP_G2_POT)
-/* EEPROM Programming via Global2 with 16-bit data */
-#define MV88E6XXX_FLAGS_EEPROM16 \
- (MV88E6XXX_FLAG_G2_EEPROM_CMD | \
- MV88E6XXX_FLAG_G2_EEPROM_DATA)
+#define MV88E6XXX_FLAG_PPU BIT_ULL(MV88E6XXX_CAP_PPU)
+#define MV88E6XXX_FLAG_PPU_ACTIVE BIT_ULL(MV88E6XXX_CAP_PPU_ACTIVE)
+#define MV88E6XXX_FLAG_STU BIT_ULL(MV88E6XXX_CAP_STU)
+#define MV88E6XXX_FLAG_TEMP BIT_ULL(MV88E6XXX_CAP_TEMP)
+#define MV88E6XXX_FLAG_TEMP_LIMIT BIT_ULL(MV88E6XXX_CAP_TEMP_LIMIT)
+#define MV88E6XXX_FLAG_VTU BIT_ULL(MV88E6XXX_CAP_VTU)
/* Ingress Rate Limit unit */
#define MV88E6XXX_FLAGS_IRL \
@@ -509,11 +499,6 @@
(MV88E6XXX_FLAG_PHY_PAGE | \
MV88E6XXX_FLAG_SERDES)
-/* Indirect PHY access via Global2 SMI PHY registers */
-#define MV88E6XXX_FLAGS_SMI_PHY \
- (MV88E6XXX_FLAG_G2_SMI_PHY_CMD |\
- MV88E6XXX_FLAG_G2_SMI_PHY_DATA)
-
#define MV88E6XXX_FLAGS_FAMILY_6095 \
(MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
@@ -522,7 +507,9 @@
MV88E6XXX_FLAGS_MULTI_CHIP)
#define MV88E6XXX_FLAGS_FAMILY_6097 \
- (MV88E6XXX_FLAG_GLOBAL2 | \
+ (MV88E6XXX_FLAG_G1_ATU_FID | \
+ MV88E6XXX_FLAG_G1_VTU_FID | \
+ MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
MV88E6XXX_FLAG_G2_POT | \
@@ -534,10 +521,11 @@
MV88E6XXX_FLAGS_PVT)
#define MV88E6XXX_FLAGS_FAMILY_6165 \
- (MV88E6XXX_FLAG_GLOBAL2 | \
+ (MV88E6XXX_FLAG_G1_ATU_FID | \
+ MV88E6XXX_FLAG_G1_VTU_FID | \
+ MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
- MV88E6XXX_FLAG_G2_SWITCH_MAC | \
MV88E6XXX_FLAG_G2_POT | \
MV88E6XXX_FLAG_STU | \
MV88E6XXX_FLAG_TEMP | \
@@ -559,24 +547,22 @@
MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
- MV88E6XXX_FLAG_G2_SWITCH_MAC | \
MV88E6XXX_FLAG_G2_POT | \
MV88E6XXX_FLAG_PPU_ACTIVE | \
MV88E6XXX_FLAG_TEMP | \
MV88E6XXX_FLAG_TEMP_LIMIT | \
MV88E6XXX_FLAG_VTU | \
- MV88E6XXX_FLAGS_EEPROM16 | \
MV88E6XXX_FLAGS_IRL | \
MV88E6XXX_FLAGS_MULTI_CHIP | \
- MV88E6XXX_FLAGS_PVT | \
- MV88E6XXX_FLAGS_SMI_PHY)
+ MV88E6XXX_FLAGS_PVT)
#define MV88E6XXX_FLAGS_FAMILY_6351 \
(MV88E6XXX_FLAG_EDSA | \
+ MV88E6XXX_FLAG_G1_ATU_FID | \
+ MV88E6XXX_FLAG_G1_VTU_FID | \
MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
- MV88E6XXX_FLAG_G2_SWITCH_MAC | \
MV88E6XXX_FLAG_G2_POT | \
MV88E6XXX_FLAG_PPU_ACTIVE | \
MV88E6XXX_FLAG_STU | \
@@ -584,28 +570,28 @@
MV88E6XXX_FLAG_VTU | \
MV88E6XXX_FLAGS_IRL | \
MV88E6XXX_FLAGS_MULTI_CHIP | \
- MV88E6XXX_FLAGS_PVT | \
- MV88E6XXX_FLAGS_SMI_PHY)
+ MV88E6XXX_FLAGS_PVT)
#define MV88E6XXX_FLAGS_FAMILY_6352 \
(MV88E6XXX_FLAG_EDSA | \
MV88E6XXX_FLAG_EEE | \
+ MV88E6XXX_FLAG_G1_ATU_FID | \
+ MV88E6XXX_FLAG_G1_VTU_FID | \
MV88E6XXX_FLAG_GLOBAL2 | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
- MV88E6XXX_FLAG_G2_SWITCH_MAC | \
MV88E6XXX_FLAG_G2_POT | \
MV88E6XXX_FLAG_PPU_ACTIVE | \
MV88E6XXX_FLAG_STU | \
MV88E6XXX_FLAG_TEMP | \
MV88E6XXX_FLAG_TEMP_LIMIT | \
MV88E6XXX_FLAG_VTU | \
- MV88E6XXX_FLAGS_EEPROM16 | \
MV88E6XXX_FLAGS_IRL | \
MV88E6XXX_FLAGS_MULTI_CHIP | \
MV88E6XXX_FLAGS_PVT | \
- MV88E6XXX_FLAGS_SERDES | \
- MV88E6XXX_FLAGS_SMI_PHY)
+ MV88E6XXX_FLAGS_SERDES)
+
+struct mv88e6xxx_ops;
struct mv88e6xxx_info {
enum mv88e6xxx_family family;
@@ -614,8 +600,10 @@
unsigned int num_databases;
unsigned int num_ports;
unsigned int port_base_addr;
+ unsigned int global1_addr;
unsigned int age_time_coeff;
- unsigned long flags;
+ unsigned long long flags;
+ const struct mv88e6xxx_ops *ops;
};
struct mv88e6xxx_atu_entry {
@@ -626,18 +614,15 @@
u8 mac[ETH_ALEN];
};
-struct mv88e6xxx_vtu_stu_entry {
- /* VTU only */
+struct mv88e6xxx_vtu_entry {
u16 vid;
u16 fid;
-
- /* VTU and STU */
u8 sid;
bool valid;
u8 data[DSA_MAX_PORTS];
};
-struct mv88e6xxx_ops;
+struct mv88e6xxx_bus_ops;
struct mv88e6xxx_priv_port {
struct net_device *bridge_dev;
@@ -658,14 +643,14 @@
/* The MII bus and the address on the bus that is used to
* communication with the switch
*/
- const struct mv88e6xxx_ops *smi_ops;
+ const struct mv88e6xxx_bus_ops *smi_ops;
struct mii_bus *bus;
int sw_addr;
/* Handles automatic disabling and re-enabling of the PHY
* polling unit.
*/
- const struct mv88e6xxx_ops *phy_ops;
+ const struct mv88e6xxx_bus_ops *phy_ops;
struct mutex ppu_mutex;
int ppu_disabled;
struct work_struct ppu_work;
@@ -694,11 +679,25 @@
struct mii_bus *mdio_bus;
};
-struct mv88e6xxx_ops {
+struct mv88e6xxx_bus_ops {
int (*read)(struct mv88e6xxx_chip *chip, int addr, int reg, u16 *val);
int (*write)(struct mv88e6xxx_chip *chip, int addr, int reg, u16 val);
};
+struct mv88e6xxx_ops {
+ int (*get_eeprom)(struct mv88e6xxx_chip *chip,
+ struct ethtool_eeprom *eeprom, u8 *data);
+ int (*set_eeprom)(struct mv88e6xxx_chip *chip,
+ struct ethtool_eeprom *eeprom, u8 *data);
+
+ int (*set_switch_mac)(struct mv88e6xxx_chip *chip, u8 *addr);
+
+ int (*phy_read)(struct mv88e6xxx_chip *chip, int addr, int reg,
+ u16 *val);
+ int (*phy_write)(struct mv88e6xxx_chip *chip, int addr, int reg,
+ u16 val);
+};
+
enum stat_type {
BANK0,
BANK1,
@@ -718,6 +717,16 @@
return (chip->info->flags & flags) == flags;
}
+static inline unsigned int mv88e6xxx_num_databases(struct mv88e6xxx_chip *chip)
+{
+ return chip->info->num_databases;
+}
+
+static inline unsigned int mv88e6xxx_num_ports(struct mv88e6xxx_chip *chip)
+{
+ return chip->info->num_ports;
+}
+
int mv88e6xxx_read(struct mv88e6xxx_chip *chip, int addr, int reg, u16 *val);
int mv88e6xxx_write(struct mv88e6xxx_chip *chip, int addr, int reg, u16 val);
int mv88e6xxx_update(struct mv88e6xxx_chip *chip, int addr, int reg,
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 7f3f178..b3df70d 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -256,7 +256,7 @@
.n_yes_ranges = ARRAY_SIZE(qca8k_readable_ranges),
};
-struct regmap_config qca8k_regmap_config = {
+static struct regmap_config qca8k_regmap_config = {
.reg_bits = 16,
.val_bits = 32,
.reg_stride = 4,
@@ -586,13 +586,6 @@
}
static int
-qca8k_set_addr(struct dsa_switch *ds, u8 *addr)
-{
- /* The subsystem always calls this function so add an empty stub */
- return 0;
-}
-
-static int
qca8k_phy_read(struct dsa_switch *ds, int phy, int regnum)
{
struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
@@ -921,7 +914,6 @@
static struct dsa_switch_ops qca8k_switch_ops = {
.get_tag_protocol = qca8k_get_tag_protocol,
.setup = qca8k_setup,
- .set_addr = qca8k_set_addr,
.get_strings = qca8k_get_strings,
.phy_read = qca8k_phy_read,
.phy_write = qca8k_phy_write,
@@ -1040,19 +1032,7 @@
},
};
-static int __init
-qca8kmdio_driver_register(void)
-{
- return mdio_driver_register(&qca8kmdio_driver);
-}
-module_init(qca8kmdio_driver_register);
-
-static void __exit
-qca8kmdio_driver_unregister(void)
-{
- mdio_driver_unregister(&qca8kmdio_driver);
-}
-module_exit(qca8kmdio_driver_unregister);
+mdio_module_driver(qca8kmdio_driver);
MODULE_AUTHOR("Mathieu Olivari, John Crispin <john@phrozen.org>");
MODULE_DESCRIPTION("Driver for QCA8K ethernet switch family");
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index 8a8d055..8456337 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -237,6 +237,8 @@
#define TCPHDR_LEN 6
#define IPHDR_POS 6
#define IPHDR_LEN 6
+#define MSS_POS 20
+#define MSS_LEN 2
#define EC_POS 22 /* Enable checksum */
#define EC_LEN 1
#define ET_POS 23 /* Enable TSO */
@@ -253,6 +255,11 @@
#define LAST_BUFFER (0x7800ULL << BUFDATALEN_POS)
+#define TSO_MSS0_POS 0
+#define TSO_MSS0_LEN 14
+#define TSO_MSS1_POS 16
+#define TSO_MSS1_LEN 14
+
struct xgene_enet_raw_desc {
__le64 m0;
__le64 m1;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 522ba92..429f18f 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -137,6 +137,7 @@
static int xgene_enet_tx_completion(struct xgene_enet_desc_ring *cp_ring,
struct xgene_enet_raw_desc *raw_desc)
{
+ struct xgene_enet_pdata *pdata = netdev_priv(cp_ring->ndev);
struct sk_buff *skb;
struct device *dev;
skb_frag_t *frag;
@@ -144,6 +145,7 @@
u16 skb_index;
u8 status;
int i, ret = 0;
+ u8 mss_index;
skb_index = GET_VAL(USERINFO, le64_to_cpu(raw_desc->m0));
skb = cp_ring->cp_skb[skb_index];
@@ -160,6 +162,13 @@
DMA_TO_DEVICE);
}
+ if (GET_BIT(ET, le64_to_cpu(raw_desc->m3))) {
+ mss_index = GET_VAL(MSS, le64_to_cpu(raw_desc->m3));
+ spin_lock(&pdata->mss_lock);
+ pdata->mss_refcnt[mss_index]--;
+ spin_unlock(&pdata->mss_lock);
+ }
+
/* Checking for error */
status = GET_VAL(LERR, le64_to_cpu(raw_desc->m0));
if (unlikely(status > 2)) {
@@ -178,15 +187,53 @@
return ret;
}
-static u64 xgene_enet_work_msg(struct sk_buff *skb)
+static int xgene_enet_setup_mss(struct net_device *ndev, u32 mss)
+{
+ struct xgene_enet_pdata *pdata = netdev_priv(ndev);
+ bool mss_index_found = false;
+ int mss_index;
+ int i;
+
+ spin_lock(&pdata->mss_lock);
+
+ /* Reuse the slot if MSS matches */
+ for (i = 0; !mss_index_found && i < NUM_MSS_REG; i++) {
+ if (pdata->mss[i] == mss) {
+ pdata->mss_refcnt[i]++;
+ mss_index = i;
+ mss_index_found = true;
+ }
+ }
+
+ /* Overwrite the slot with ref_count = 0 */
+ for (i = 0; !mss_index_found && i < NUM_MSS_REG; i++) {
+ if (!pdata->mss_refcnt[i]) {
+ pdata->mss_refcnt[i]++;
+ pdata->mac_ops->set_mss(pdata, mss, i);
+ pdata->mss[i] = mss;
+ mss_index = i;
+ mss_index_found = true;
+ }
+ }
+
+ spin_unlock(&pdata->mss_lock);
+
+ /* No slots with ref_count = 0 available, return busy */
+ if (!mss_index_found)
+ return -EBUSY;
+
+ return mss_index;
+}
+
+static int xgene_enet_work_msg(struct sk_buff *skb, u64 *hopinfo)
{
struct net_device *ndev = skb->dev;
struct iphdr *iph;
u8 l3hlen = 0, l4hlen = 0;
u8 ethhdr, proto = 0, csum_enable = 0;
- u64 hopinfo = 0;
u32 hdr_len, mss = 0;
u32 i, len, nr_frags;
+ int mss_index;
ethhdr = xgene_enet_hdr_len(skb->data);
@@ -226,7 +273,11 @@
if (!mss || ((skb->len - hdr_len) <= mss))
goto out;
- hopinfo |= SET_BIT(ET);
+ mss_index = xgene_enet_setup_mss(ndev, mss);
+ if (unlikely(mss_index < 0))
+ return -EBUSY;
+
+ *hopinfo |= SET_BIT(ET) | SET_VAL(MSS, mss_index);
}
} else if (iph->protocol == IPPROTO_UDP) {
l4hlen = UDP_HDR_SIZE;
@@ -234,15 +285,15 @@
}
out:
l3hlen = ip_hdrlen(skb) >> 2;
- hopinfo |= SET_VAL(TCPHDR, l4hlen) |
- SET_VAL(IPHDR, l3hlen) |
- SET_VAL(ETHHDR, ethhdr) |
- SET_VAL(EC, csum_enable) |
- SET_VAL(IS, proto) |
- SET_BIT(IC) |
- SET_BIT(TYPE_ETH_WORK_MESSAGE);
+ *hopinfo |= SET_VAL(TCPHDR, l4hlen) |
+ SET_VAL(IPHDR, l3hlen) |
+ SET_VAL(ETHHDR, ethhdr) |
+ SET_VAL(EC, csum_enable) |
+ SET_VAL(IS, proto) |
+ SET_BIT(IC) |
+ SET_BIT(TYPE_ETH_WORK_MESSAGE);
- return hopinfo;
+ return 0;
}
static u16 xgene_enet_encode_len(u16 len)
@@ -282,20 +333,22 @@
dma_addr_t dma_addr, pbuf_addr, *frag_dma_addr;
skb_frag_t *frag;
u16 tail = tx_ring->tail;
- u64 hopinfo;
+ u64 hopinfo = 0;
u32 len, hw_len;
u8 ll = 0, nv = 0, idx = 0;
bool split = false;
u32 size, offset, ell_bytes = 0;
u32 i, fidx, nr_frags, count = 1;
+ int ret;
raw_desc = &tx_ring->raw_desc[tail];
tail = (tail + 1) & (tx_ring->slots - 1);
memset(raw_desc, 0, sizeof(struct xgene_enet_raw_desc));
- hopinfo = xgene_enet_work_msg(skb);
- if (!hopinfo)
- return -EINVAL;
+ ret = xgene_enet_work_msg(skb, &hopinfo);
+ if (ret)
+ return ret;
+
raw_desc->m3 = cpu_to_le64(SET_VAL(HENQNUM, tx_ring->dst_ring_num) |
hopinfo);
@@ -435,6 +488,9 @@
return NETDEV_TX_OK;
count = xgene_enet_setup_tx_desc(tx_ring, skb);
+ if (count == -EBUSY)
+ return NETDEV_TX_BUSY;
+
if (count <= 0) {
dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
@@ -1669,7 +1725,7 @@
if (pdata->phy_mode == PHY_INTERFACE_MODE_XGMII) {
ndev->features |= NETIF_F_TSO;
- pdata->mss = XGENE_ENET_MSS;
+ spin_lock_init(&pdata->mss_lock);
}
ndev->hw_features = ndev->features;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index 7735371..0cda58f 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -47,7 +47,7 @@
#define NUM_PKT_BUF 64
#define NUM_BUFPOOL 32
#define MAX_EXP_BUFFS 256
-#define XGENE_ENET_MSS 1448
+#define NUM_MSS_REG 4
#define XGENE_MIN_ENET_FRAME_SIZE 60
#define XGENE_MAX_ENET_IRQ 16
@@ -143,7 +143,7 @@
void (*rx_disable)(struct xgene_enet_pdata *pdata);
void (*set_speed)(struct xgene_enet_pdata *pdata);
void (*set_mac_addr)(struct xgene_enet_pdata *pdata);
- void (*set_mss)(struct xgene_enet_pdata *pdata);
+ void (*set_mss)(struct xgene_enet_pdata *pdata, u16 mss, u8 index);
void (*link_state)(struct work_struct *work);
};
@@ -212,7 +212,9 @@
u8 eth_bufnum;
u8 bp_bufnum;
u16 ring_num;
- u32 mss;
+ u32 mss[NUM_MSS_REG];
+ u32 mss_refcnt[NUM_MSS_REG];
+ spinlock_t mss_lock; /* mss lock */
u8 tx_delay;
u8 rx_delay;
bool mdio_driver;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
index 279ee27..6475f38 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
@@ -232,9 +232,22 @@
xgene_enet_wr_mac(pdata, HSTMACADR_MSW_ADDR, addr1);
}
-static void xgene_xgmac_set_mss(struct xgene_enet_pdata *pdata)
+static void xgene_xgmac_set_mss(struct xgene_enet_pdata *pdata,
+ u16 mss, u8 index)
{
- xgene_enet_wr_csr(pdata, XG_TSIF_MSS_REG0_ADDR, pdata->mss);
+ u8 offset;
+ u32 data;
+
+ offset = (index < 2) ? 0 : 4;
+ xgene_enet_rd_csr(pdata, XG_TSIF_MSS_REG0_ADDR + offset, &data);
+
+ if (!(index & 0x1))
+ data = SET_VAL(TSO_MSS1, data >> TSO_MSS1_POS) |
+ SET_VAL(TSO_MSS0, mss);
+ else
+ data = SET_VAL(TSO_MSS1, mss) | SET_VAL(TSO_MSS0, data);
+
+ xgene_enet_wr_csr(pdata, XG_TSIF_MSS_REG0_ADDR + offset, data);
}
static u32 xgene_enet_link_status(struct xgene_enet_pdata *pdata)
@@ -258,7 +271,6 @@
xgene_enet_wr_mac(pdata, AXGMAC_CONFIG_1, data);
xgene_xgmac_set_mac_addr(pdata);
- xgene_xgmac_set_mss(pdata);
xgene_enet_rd_csr(pdata, XG_RSIF_CONFIG_REG_ADDR, &data);
data |= CFG_RSIF_FPBUFF_TIMEOUT_EN;
diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index 74f0a37..17aa33c 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -1486,7 +1486,7 @@
b44_enable_ints(bp);
if (bp->flags & B44_FLAG_EXTERNAL_PHY)
- phy_start(bp->phydev);
+ phy_start(dev->phydev);
netif_start_queue(dev);
out:
@@ -1651,7 +1651,7 @@
netif_stop_queue(dev);
if (bp->flags & B44_FLAG_EXTERNAL_PHY)
- phy_stop(bp->phydev);
+ phy_stop(dev->phydev);
napi_disable(&bp->napi);
@@ -1832,90 +1832,100 @@
return r;
}
-static int b44_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int b44_get_link_ksettings(struct net_device *dev,
+ struct ethtool_link_ksettings *cmd)
{
struct b44 *bp = netdev_priv(dev);
+ u32 supported, advertising;
if (bp->flags & B44_FLAG_EXTERNAL_PHY) {
- BUG_ON(!bp->phydev);
- return phy_ethtool_gset(bp->phydev, cmd);
+ BUG_ON(!dev->phydev);
+ return phy_ethtool_ksettings_get(dev->phydev, cmd);
}
- cmd->supported = (SUPPORTED_Autoneg);
- cmd->supported |= (SUPPORTED_100baseT_Half |
- SUPPORTED_100baseT_Full |
- SUPPORTED_10baseT_Half |
- SUPPORTED_10baseT_Full |
- SUPPORTED_MII);
+ supported = (SUPPORTED_Autoneg);
+ supported |= (SUPPORTED_100baseT_Half |
+ SUPPORTED_100baseT_Full |
+ SUPPORTED_10baseT_Half |
+ SUPPORTED_10baseT_Full |
+ SUPPORTED_MII);
- cmd->advertising = 0;
+ advertising = 0;
if (bp->flags & B44_FLAG_ADV_10HALF)
- cmd->advertising |= ADVERTISED_10baseT_Half;
+ advertising |= ADVERTISED_10baseT_Half;
if (bp->flags & B44_FLAG_ADV_10FULL)
- cmd->advertising |= ADVERTISED_10baseT_Full;
+ advertising |= ADVERTISED_10baseT_Full;
if (bp->flags & B44_FLAG_ADV_100HALF)
- cmd->advertising |= ADVERTISED_100baseT_Half;
+ advertising |= ADVERTISED_100baseT_Half;
if (bp->flags & B44_FLAG_ADV_100FULL)
- cmd->advertising |= ADVERTISED_100baseT_Full;
- cmd->advertising |= ADVERTISED_Pause | ADVERTISED_Asym_Pause;
- ethtool_cmd_speed_set(cmd, ((bp->flags & B44_FLAG_100_BASE_T) ?
- SPEED_100 : SPEED_10));
- cmd->duplex = (bp->flags & B44_FLAG_FULL_DUPLEX) ?
+ advertising |= ADVERTISED_100baseT_Full;
+ advertising |= ADVERTISED_Pause | ADVERTISED_Asym_Pause;
+ cmd->base.speed = (bp->flags & B44_FLAG_100_BASE_T) ?
+ SPEED_100 : SPEED_10;
+ cmd->base.duplex = (bp->flags & B44_FLAG_FULL_DUPLEX) ?
DUPLEX_FULL : DUPLEX_HALF;
- cmd->port = 0;
- cmd->phy_address = bp->phy_addr;
- cmd->transceiver = (bp->flags & B44_FLAG_EXTERNAL_PHY) ?
- XCVR_EXTERNAL : XCVR_INTERNAL;
- cmd->autoneg = (bp->flags & B44_FLAG_FORCE_LINK) ?
+ cmd->base.port = 0;
+ cmd->base.phy_address = bp->phy_addr;
+ cmd->base.autoneg = (bp->flags & B44_FLAG_FORCE_LINK) ?
AUTONEG_DISABLE : AUTONEG_ENABLE;
- if (cmd->autoneg == AUTONEG_ENABLE)
- cmd->advertising |= ADVERTISED_Autoneg;
+ if (cmd->base.autoneg == AUTONEG_ENABLE)
+ advertising |= ADVERTISED_Autoneg;
+
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
+ supported);
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
+ advertising);
+
if (!netif_running(dev)){
- ethtool_cmd_speed_set(cmd, 0);
- cmd->duplex = 0xff;
+ cmd->base.speed = 0;
+ cmd->base.duplex = 0xff;
}
- cmd->maxtxpkt = 0;
- cmd->maxrxpkt = 0;
+
return 0;
}
-static int b44_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int b44_set_link_ksettings(struct net_device *dev,
+ const struct ethtool_link_ksettings *cmd)
{
struct b44 *bp = netdev_priv(dev);
u32 speed;
int ret;
+ u32 advertising;
if (bp->flags & B44_FLAG_EXTERNAL_PHY) {
- BUG_ON(!bp->phydev);
+ BUG_ON(!dev->phydev);
spin_lock_irq(&bp->lock);
if (netif_running(dev))
b44_setup_phy(bp);
- ret = phy_ethtool_sset(bp->phydev, cmd);
+ ret = phy_ethtool_ksettings_set(dev->phydev, cmd);
spin_unlock_irq(&bp->lock);
return ret;
}
- speed = ethtool_cmd_speed(cmd);
+ speed = cmd->base.speed;
+
+ ethtool_convert_link_mode_to_legacy_u32(&advertising,
+ cmd->link_modes.advertising);
/* We do not support gigabit. */
- if (cmd->autoneg == AUTONEG_ENABLE) {
- if (cmd->advertising &
+ if (cmd->base.autoneg == AUTONEG_ENABLE) {
+ if (advertising &
(ADVERTISED_1000baseT_Half |
ADVERTISED_1000baseT_Full))
return -EINVAL;
} else if ((speed != SPEED_100 &&
speed != SPEED_10) ||
- (cmd->duplex != DUPLEX_HALF &&
- cmd->duplex != DUPLEX_FULL)) {
+ (cmd->base.duplex != DUPLEX_HALF &&
+ cmd->base.duplex != DUPLEX_FULL)) {
return -EINVAL;
}
spin_lock_irq(&bp->lock);
- if (cmd->autoneg == AUTONEG_ENABLE) {
+ if (cmd->base.autoneg == AUTONEG_ENABLE) {
bp->flags &= ~(B44_FLAG_FORCE_LINK |
B44_FLAG_100_BASE_T |
B44_FLAG_FULL_DUPLEX |
@@ -1923,19 +1933,19 @@
B44_FLAG_ADV_10FULL |
B44_FLAG_ADV_100HALF |
B44_FLAG_ADV_100FULL);
- if (cmd->advertising == 0) {
+ if (advertising == 0) {
bp->flags |= (B44_FLAG_ADV_10HALF |
B44_FLAG_ADV_10FULL |
B44_FLAG_ADV_100HALF |
B44_FLAG_ADV_100FULL);
} else {
- if (cmd->advertising & ADVERTISED_10baseT_Half)
+ if (advertising & ADVERTISED_10baseT_Half)
bp->flags |= B44_FLAG_ADV_10HALF;
- if (cmd->advertising & ADVERTISED_10baseT_Full)
+ if (advertising & ADVERTISED_10baseT_Full)
bp->flags |= B44_FLAG_ADV_10FULL;
- if (cmd->advertising & ADVERTISED_100baseT_Half)
+ if (advertising & ADVERTISED_100baseT_Half)
bp->flags |= B44_FLAG_ADV_100HALF;
- if (cmd->advertising & ADVERTISED_100baseT_Full)
+ if (advertising & ADVERTISED_100baseT_Full)
bp->flags |= B44_FLAG_ADV_100FULL;
}
} else {
@@ -1943,7 +1953,7 @@
bp->flags &= ~(B44_FLAG_100_BASE_T | B44_FLAG_FULL_DUPLEX);
if (speed == SPEED_100)
bp->flags |= B44_FLAG_100_BASE_T;
- if (cmd->duplex == DUPLEX_FULL)
+ if (cmd->base.duplex == DUPLEX_FULL)
bp->flags |= B44_FLAG_FULL_DUPLEX;
}
@@ -2110,8 +2120,6 @@
static const struct ethtool_ops b44_ethtool_ops = {
.get_drvinfo = b44_get_drvinfo,
- .get_settings = b44_get_settings,
- .set_settings = b44_set_settings,
.nway_reset = b44_nway_reset,
.get_link = ethtool_op_get_link,
.get_wol = b44_get_wol,
@@ -2125,6 +2133,8 @@
.get_strings = b44_get_strings,
.get_sset_count = b44_get_sset_count,
.get_ethtool_stats = b44_get_ethtool_stats,
+ .get_link_ksettings = b44_get_link_ksettings,
+ .set_link_ksettings = b44_set_link_ksettings,
};
static int b44_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
@@ -2137,8 +2147,8 @@
spin_lock_irq(&bp->lock);
if (bp->flags & B44_FLAG_EXTERNAL_PHY) {
- BUG_ON(!bp->phydev);
- err = phy_mii_ioctl(bp->phydev, ifr, cmd);
+ BUG_ON(!dev->phydev);
+ err = phy_mii_ioctl(dev->phydev, ifr, cmd);
} else {
err = generic_mii_ioctl(&bp->mii_if, if_mii(ifr), cmd, NULL);
}
@@ -2206,7 +2216,7 @@
static void b44_adjust_link(struct net_device *dev)
{
struct b44 *bp = netdev_priv(dev);
- struct phy_device *phydev = bp->phydev;
+ struct phy_device *phydev = dev->phydev;
bool status_changed = 0;
BUG_ON(!phydev);
@@ -2303,7 +2313,6 @@
SUPPORTED_MII);
phydev->advertising = phydev->supported;
- bp->phydev = phydev;
bp->old_link = 0;
bp->phy_addr = phydev->mdio.addr;
@@ -2323,9 +2332,10 @@
static void b44_unregister_phy_one(struct b44 *bp)
{
+ struct net_device *dev = bp->dev;
struct mii_bus *mii_bus = bp->mii_bus;
- phy_disconnect(bp->phydev);
+ phy_disconnect(dev->phydev);
mdiobus_unregister(mii_bus);
mdiobus_free(mii_bus);
}
diff --git a/drivers/net/ethernet/broadcom/b44.h b/drivers/net/ethernet/broadcom/b44.h
index 65d88d7..89d2cf3 100644
--- a/drivers/net/ethernet/broadcom/b44.h
+++ b/drivers/net/ethernet/broadcom/b44.h
@@ -404,7 +404,6 @@
u32 tx_pending;
u8 phy_addr;
u8 force_copybreak;
- struct phy_device *phydev;
struct mii_bus *mii_bus;
int old_link;
struct mii_if_info mii_if;
diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 6c8bc5f..ae364c7 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -791,7 +791,7 @@
int status_changed;
priv = netdev_priv(dev);
- phydev = priv->phydev;
+ phydev = dev->phydev;
status_changed = 0;
if (priv->old_link != phydev->link) {
@@ -913,7 +913,6 @@
priv->old_link = 0;
priv->old_duplex = -1;
priv->old_pause = -1;
- priv->phydev = phydev;
}
/* mask all interrupts and request them */
@@ -1085,7 +1084,7 @@
ENETDMAC_IRMASK, priv->tx_chan);
if (priv->has_phy)
- phy_start(priv->phydev);
+ phy_start(phydev);
else
bcm_enet_adjust_link(dev);
@@ -1127,7 +1126,7 @@
free_irq(dev->irq, dev);
out_phy_disconnect:
- phy_disconnect(priv->phydev);
+ phy_disconnect(phydev);
return ret;
}
@@ -1190,7 +1189,7 @@
netif_stop_queue(dev);
napi_disable(&priv->napi);
if (priv->has_phy)
- phy_stop(priv->phydev);
+ phy_stop(dev->phydev);
del_timer_sync(&priv->rx_timeout);
/* mask all interrupts */
@@ -1234,10 +1233,8 @@
free_irq(dev->irq, dev);
/* release phy */
- if (priv->has_phy) {
- phy_disconnect(priv->phydev);
- priv->phydev = NULL;
- }
+ if (priv->has_phy)
+ phy_disconnect(dev->phydev);
return 0;
}
@@ -1437,64 +1434,68 @@
priv = netdev_priv(dev);
if (priv->has_phy) {
- if (!priv->phydev)
+ if (!dev->phydev)
return -ENODEV;
- return genphy_restart_aneg(priv->phydev);
+ return genphy_restart_aneg(dev->phydev);
}
return -EOPNOTSUPP;
}
-static int bcm_enet_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int bcm_enet_get_link_ksettings(struct net_device *dev,
+ struct ethtool_link_ksettings *cmd)
{
struct bcm_enet_priv *priv;
+ u32 supported, advertising;
priv = netdev_priv(dev);
- cmd->maxrxpkt = 0;
- cmd->maxtxpkt = 0;
-
if (priv->has_phy) {
- if (!priv->phydev)
+ if (!dev->phydev)
return -ENODEV;
- return phy_ethtool_gset(priv->phydev, cmd);
+ return phy_ethtool_ksettings_get(dev->phydev, cmd);
} else {
- cmd->autoneg = 0;
- ethtool_cmd_speed_set(cmd, ((priv->force_speed_100)
- ? SPEED_100 : SPEED_10));
- cmd->duplex = (priv->force_duplex_full) ?
+ cmd->base.autoneg = 0;
+ cmd->base.speed = (priv->force_speed_100) ?
+ SPEED_100 : SPEED_10;
+ cmd->base.duplex = (priv->force_duplex_full) ?
DUPLEX_FULL : DUPLEX_HALF;
- cmd->supported = ADVERTISED_10baseT_Half |
+ supported = ADVERTISED_10baseT_Half |
ADVERTISED_10baseT_Full |
ADVERTISED_100baseT_Half |
ADVERTISED_100baseT_Full;
- cmd->advertising = 0;
- cmd->port = PORT_MII;
- cmd->transceiver = XCVR_EXTERNAL;
+ advertising = 0;
+ ethtool_convert_legacy_u32_to_link_mode(
+ cmd->link_modes.supported, supported);
+ ethtool_convert_legacy_u32_to_link_mode(
+ cmd->link_modes.advertising, advertising);
+ cmd->base.port = PORT_MII;
}
return 0;
}
-static int bcm_enet_set_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int bcm_enet_set_link_ksettings(struct net_device *dev,
+ const struct ethtool_link_ksettings *cmd)
{
struct bcm_enet_priv *priv;
priv = netdev_priv(dev);
if (priv->has_phy) {
- if (!priv->phydev)
+ if (!dev->phydev)
return -ENODEV;
- return phy_ethtool_sset(priv->phydev, cmd);
+ return phy_ethtool_ksettings_set(dev->phydev, cmd);
} else {
- if (cmd->autoneg ||
- (cmd->speed != SPEED_100 && cmd->speed != SPEED_10) ||
- cmd->port != PORT_MII)
+ if (cmd->base.autoneg ||
+ (cmd->base.speed != SPEED_100 &&
+ cmd->base.speed != SPEED_10) ||
+ cmd->base.port != PORT_MII)
return -EINVAL;
- priv->force_speed_100 = (cmd->speed == SPEED_100) ? 1 : 0;
- priv->force_duplex_full = (cmd->duplex == DUPLEX_FULL) ? 1 : 0;
+ priv->force_speed_100 =
+ (cmd->base.speed == SPEED_100) ? 1 : 0;
+ priv->force_duplex_full =
+ (cmd->base.duplex == DUPLEX_FULL) ? 1 : 0;
if (netif_running(dev))
bcm_enet_adjust_link(dev);
@@ -1588,14 +1589,14 @@
.get_sset_count = bcm_enet_get_sset_count,
.get_ethtool_stats = bcm_enet_get_ethtool_stats,
.nway_reset = bcm_enet_nway_reset,
- .get_settings = bcm_enet_get_settings,
- .set_settings = bcm_enet_set_settings,
.get_drvinfo = bcm_enet_get_drvinfo,
.get_link = ethtool_op_get_link,
.get_ringparam = bcm_enet_get_ringparam,
.set_ringparam = bcm_enet_set_ringparam,
.get_pauseparam = bcm_enet_get_pauseparam,
.set_pauseparam = bcm_enet_set_pauseparam,
+ .get_link_ksettings = bcm_enet_get_link_ksettings,
+ .set_link_ksettings = bcm_enet_set_link_ksettings,
};
static int bcm_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
@@ -1604,9 +1605,9 @@
priv = netdev_priv(dev);
if (priv->has_phy) {
- if (!priv->phydev)
+ if (!dev->phydev)
return -ENODEV;
- return phy_mii_ioctl(priv->phydev, rq, cmd);
+ return phy_mii_ioctl(dev->phydev, rq, cmd);
} else {
struct mii_if_info mii;
diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.h b/drivers/net/ethernet/broadcom/bcm63xx_enet.h
index f55af43..0a1b7b2 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.h
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.h
@@ -290,7 +290,6 @@
/* used when a phy is connected (phylib used) */
struct mii_bus *mii_bus;
- struct phy_device *phydev;
int old_link;
int old_duplex;
int old_pause;
diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index ecd357d..27f11a5 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -6356,10 +6356,6 @@
struct bnx2 *bp = netdev_priv(dev);
int rc;
- rc = bnx2_request_firmware(bp);
- if (rc < 0)
- goto out;
-
netif_carrier_off(dev);
bnx2_disable_int(bp);
@@ -6428,7 +6424,6 @@
bnx2_free_irq(bp);
bnx2_free_mem(bp);
bnx2_del_napi(bp);
- bnx2_release_firmware(bp);
goto out;
}
@@ -8575,6 +8570,12 @@
pci_set_drvdata(pdev, dev);
+ rc = bnx2_request_firmware(bp);
+ if (rc < 0)
+ goto error;
+
+
+ bnx2_reset_chip(bp, BNX2_DRV_MSG_CODE_RESET);
memcpy(dev->dev_addr, bp->mac_addr, ETH_ALEN);
dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_SG |
@@ -8607,6 +8608,7 @@
return 0;
error:
+ bnx2_release_firmware(bp);
pci_iounmap(pdev, bp->regview);
pci_release_regions(pdev);
pci_disable_device(pdev);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index 0e68fad..243cb97 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -492,7 +492,8 @@
int bnx2x_get_vf_config(struct net_device *dev, int vf,
struct ifla_vf_info *ivi);
int bnx2x_set_vf_mac(struct net_device *dev, int queue, u8 *mac);
-int bnx2x_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos);
+int bnx2x_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos,
+ __be16 vlan_proto);
/* select_queue callback */
u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index dab61a8..20fe6a8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -12563,41 +12563,62 @@
return 0;
}
-static int bnx2x_init_mcast_macs_list(struct bnx2x *bp,
- struct bnx2x_mcast_ramrod_params *p)
+struct bnx2x_mcast_list_elem_group
{
- int mc_count = netdev_mc_count(bp->dev);
- struct bnx2x_mcast_list_elem *mc_mac =
- kcalloc(mc_count, sizeof(*mc_mac), GFP_ATOMIC);
- struct netdev_hw_addr *ha;
+ struct list_head mcast_group_link;
+ struct bnx2x_mcast_list_elem mcast_elems[];
+};
- if (!mc_mac) {
- BNX2X_ERR("Failed to allocate mc MAC list\n");
- return -ENOMEM;
+#define MCAST_ELEMS_PER_PG \
+ ((PAGE_SIZE - sizeof(struct bnx2x_mcast_list_elem_group)) / \
+ sizeof(struct bnx2x_mcast_list_elem))
+
+static void bnx2x_free_mcast_macs_list(struct list_head *mcast_group_list)
+{
+ struct bnx2x_mcast_list_elem_group *current_mcast_group;
+
+ while (!list_empty(mcast_group_list)) {
+ current_mcast_group = list_first_entry(mcast_group_list,
+ struct bnx2x_mcast_list_elem_group,
+ mcast_group_link);
+ list_del(¤t_mcast_group->mcast_group_link);
+ free_page((unsigned long)current_mcast_group);
}
-
- INIT_LIST_HEAD(&p->mcast_list);
-
- netdev_for_each_mc_addr(ha, bp->dev) {
- mc_mac->mac = bnx2x_mc_addr(ha);
- list_add_tail(&mc_mac->link, &p->mcast_list);
- mc_mac++;
- }
-
- p->mcast_list_len = mc_count;
-
- return 0;
}
-static void bnx2x_free_mcast_macs_list(
- struct bnx2x_mcast_ramrod_params *p)
+static int bnx2x_init_mcast_macs_list(struct bnx2x *bp,
+ struct bnx2x_mcast_ramrod_params *p,
+ struct list_head *mcast_group_list)
{
- struct bnx2x_mcast_list_elem *mc_mac =
- list_first_entry(&p->mcast_list, struct bnx2x_mcast_list_elem,
- link);
+ struct bnx2x_mcast_list_elem *mc_mac;
+ struct netdev_hw_addr *ha;
+ struct bnx2x_mcast_list_elem_group *current_mcast_group = NULL;
+ int mc_count = netdev_mc_count(bp->dev);
+ int offset = 0;
- WARN_ON(!mc_mac);
- kfree(mc_mac);
+ INIT_LIST_HEAD(&p->mcast_list);
+ netdev_for_each_mc_addr(ha, bp->dev) {
+ if (!offset) {
+ current_mcast_group =
+ (struct bnx2x_mcast_list_elem_group *)
+ __get_free_page(GFP_ATOMIC);
+ if (!current_mcast_group) {
+ bnx2x_free_mcast_macs_list(mcast_group_list);
+ BNX2X_ERR("Failed to allocate mc MAC list\n");
+ return -ENOMEM;
+ }
+ list_add(¤t_mcast_group->mcast_group_link,
+ mcast_group_list);
+ }
+ mc_mac = ¤t_mcast_group->mcast_elems[offset];
+ mc_mac->mac = bnx2x_mc_addr(ha);
+ list_add_tail(&mc_mac->link, &p->mcast_list);
+ offset++;
+ if (offset == MCAST_ELEMS_PER_PG)
+ offset = 0;
+ }
+ p->mcast_list_len = mc_count;
+ return 0;
}
/**
@@ -12647,6 +12668,7 @@
static int bnx2x_set_mc_list_e1x(struct bnx2x *bp)
{
+ LIST_HEAD(mcast_group_list);
struct net_device *dev = bp->dev;
struct bnx2x_mcast_ramrod_params rparam = {NULL};
int rc = 0;
@@ -12662,7 +12684,7 @@
/* then, configure a new MACs list */
if (netdev_mc_count(dev)) {
- rc = bnx2x_init_mcast_macs_list(bp, &rparam);
+ rc = bnx2x_init_mcast_macs_list(bp, &rparam, &mcast_group_list);
if (rc)
return rc;
@@ -12673,7 +12695,7 @@
BNX2X_ERR("Failed to set a new multicast configuration: %d\n",
rc);
- bnx2x_free_mcast_macs_list(&rparam);
+ bnx2x_free_mcast_macs_list(&mcast_group_list);
}
return rc;
@@ -12681,6 +12703,7 @@
static int bnx2x_set_mc_list(struct bnx2x *bp)
{
+ LIST_HEAD(mcast_group_list);
struct bnx2x_mcast_ramrod_params rparam = {NULL};
struct net_device *dev = bp->dev;
int rc = 0;
@@ -12692,7 +12715,7 @@
rparam.mcast_obj = &bp->mcast_obj;
if (netdev_mc_count(dev)) {
- rc = bnx2x_init_mcast_macs_list(bp, &rparam);
+ rc = bnx2x_init_mcast_macs_list(bp, &rparam, &mcast_group_list);
if (rc)
return rc;
@@ -12703,7 +12726,7 @@
BNX2X_ERR("Failed to set a new multicast configuration: %d\n",
rc);
- bnx2x_free_mcast_macs_list(&rparam);
+ bnx2x_free_mcast_macs_list(&mcast_group_list);
} else {
/* If no mc addresses are required, flush the configuration */
rc = bnx2x_config_mcast(bp, &rparam, BNX2X_MCAST_CMD_DEL);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index d468380..cea6bdc 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -2606,8 +2606,23 @@
int type; /* BNX2X_MCAST_CMD_SET_{ADD, DEL} */
};
+union bnx2x_mcast_elem {
+ struct bnx2x_mcast_bin_elem bin_elem;
+ struct bnx2x_mcast_mac_elem mac_elem;
+};
+
+struct bnx2x_mcast_elem_group {
+ struct list_head mcast_group_link;
+ union bnx2x_mcast_elem mcast_elems[];
+};
+
+#define MCAST_MAC_ELEMS_PER_PG \
+ ((PAGE_SIZE - sizeof(struct bnx2x_mcast_elem_group)) / \
+ sizeof(union bnx2x_mcast_elem))
+
struct bnx2x_pending_mcast_cmd {
struct list_head link;
+ struct list_head group_head;
int type; /* BNX2X_MCAST_CMD_X */
union {
struct list_head macs_head;
@@ -2638,16 +2653,29 @@
return 0;
}
+static void bnx2x_free_groups(struct list_head *mcast_group_list)
+{
+ struct bnx2x_mcast_elem_group *current_mcast_group;
+
+ while (!list_empty(mcast_group_list)) {
+ current_mcast_group = list_first_entry(mcast_group_list,
+ struct bnx2x_mcast_elem_group,
+ mcast_group_link);
+ list_del(¤t_mcast_group->mcast_group_link);
+ free_page((unsigned long)current_mcast_group);
+ }
+}
+
static int bnx2x_mcast_enqueue_cmd(struct bnx2x *bp,
struct bnx2x_mcast_obj *o,
struct bnx2x_mcast_ramrod_params *p,
enum bnx2x_mcast_cmd cmd)
{
- int total_sz;
struct bnx2x_pending_mcast_cmd *new_cmd;
- struct bnx2x_mcast_mac_elem *cur_mac = NULL;
struct bnx2x_mcast_list_elem *pos;
- int macs_list_len = 0, macs_list_len_size;
+ struct bnx2x_mcast_elem_group *elem_group;
+ struct bnx2x_mcast_mac_elem *mac_elem;
+ int total_elems = 0, macs_list_len = 0, offset = 0;
/* When adding MACs we'll need to store their values */
if (cmd == BNX2X_MCAST_CMD_ADD || cmd == BNX2X_MCAST_CMD_SET)
@@ -2657,50 +2685,61 @@
if (!p->mcast_list_len)
return 0;
- /* For a set command, we need to allocate sufficient memory for all
- * the bins, since we can't analyze at this point how much memory would
- * be required.
- */
- macs_list_len_size = macs_list_len *
- sizeof(struct bnx2x_mcast_mac_elem);
- if (cmd == BNX2X_MCAST_CMD_SET) {
- int bin_size = BNX2X_MCAST_BINS_NUM *
- sizeof(struct bnx2x_mcast_bin_elem);
-
- if (bin_size > macs_list_len_size)
- macs_list_len_size = bin_size;
- }
- total_sz = sizeof(*new_cmd) + macs_list_len_size;
-
/* Add mcast is called under spin_lock, thus calling with GFP_ATOMIC */
- new_cmd = kzalloc(total_sz, GFP_ATOMIC);
-
+ new_cmd = kzalloc(sizeof(*new_cmd), GFP_ATOMIC);
if (!new_cmd)
return -ENOMEM;
+ INIT_LIST_HEAD(&new_cmd->data.macs_head);
+ INIT_LIST_HEAD(&new_cmd->group_head);
+ new_cmd->type = cmd;
+ new_cmd->done = false;
+
DP(BNX2X_MSG_SP, "About to enqueue a new %d command. macs_list_len=%d\n",
cmd, macs_list_len);
- INIT_LIST_HEAD(&new_cmd->data.macs_head);
-
- new_cmd->type = cmd;
- new_cmd->done = false;
-
switch (cmd) {
case BNX2X_MCAST_CMD_ADD:
case BNX2X_MCAST_CMD_SET:
- cur_mac = (struct bnx2x_mcast_mac_elem *)
- ((u8 *)new_cmd + sizeof(*new_cmd));
-
- /* Push the MACs of the current command into the pending command
- * MACs list: FIFO
+ /* For a set command, we need to allocate sufficient memory for
+ * all the bins, since we can't analyze at this point how much
+ * memory would be required.
*/
- list_for_each_entry(pos, &p->mcast_list, link) {
- memcpy(cur_mac->mac, pos->mac, ETH_ALEN);
- list_add_tail(&cur_mac->link, &new_cmd->data.macs_head);
- cur_mac++;
+ total_elems = macs_list_len;
+ if (cmd == BNX2X_MCAST_CMD_SET) {
+ if (total_elems < BNX2X_MCAST_BINS_NUM)
+ total_elems = BNX2X_MCAST_BINS_NUM;
}
-
+ while (total_elems > 0) {
+ elem_group = (struct bnx2x_mcast_elem_group *)
+ __get_free_page(GFP_ATOMIC | __GFP_ZERO);
+ if (!elem_group) {
+ bnx2x_free_groups(&new_cmd->group_head);
+ kfree(new_cmd);
+ return -ENOMEM;
+ }
+ total_elems -= MCAST_MAC_ELEMS_PER_PG;
+ list_add_tail(&elem_group->mcast_group_link,
+ &new_cmd->group_head);
+ }
+ elem_group = list_first_entry(&new_cmd->group_head,
+ struct bnx2x_mcast_elem_group,
+ mcast_group_link);
+ list_for_each_entry(pos, &p->mcast_list, link) {
+ mac_elem = &elem_group->mcast_elems[offset].mac_elem;
+ memcpy(mac_elem->mac, pos->mac, ETH_ALEN);
+ /* Push the MACs of the current command into the pending
+ * command MACs list: FIFO
+ */
+ list_add_tail(&mac_elem->link,
+ &new_cmd->data.macs_head);
+ offset++;
+ if (offset == MCAST_MAC_ELEMS_PER_PG) {
+ offset = 0;
+ elem_group = list_next_entry(elem_group,
+ mcast_group_link);
+ }
+ }
break;
case BNX2X_MCAST_CMD_DEL:
@@ -2978,7 +3017,8 @@
u64 cur[BNX2X_MCAST_VEC_SZ], req[BNX2X_MCAST_VEC_SZ];
struct bnx2x_mcast_mac_elem *pmac_pos, *pmac_pos_n;
struct bnx2x_mcast_bin_elem *p_item;
- int i, cnt = 0, mac_cnt = 0;
+ struct bnx2x_mcast_elem_group *elem_group;
+ int cnt = 0, mac_cnt = 0, offset = 0, i;
memset(req, 0, sizeof(u64) * BNX2X_MCAST_VEC_SZ);
memcpy(cur, o->registry.aprox_match.vec,
@@ -3001,9 +3041,10 @@
* a list that will be used to configure bins.
*/
cmd_pos->set_convert = true;
- p_item = (struct bnx2x_mcast_bin_elem *)(cmd_pos + 1);
INIT_LIST_HEAD(&cmd_pos->data.macs_head);
-
+ elem_group = list_first_entry(&cmd_pos->group_head,
+ struct bnx2x_mcast_elem_group,
+ mcast_group_link);
for (i = 0; i < BNX2X_MCAST_BINS_NUM; i++) {
bool b_current = !!BIT_VEC64_TEST_BIT(cur, i);
bool b_required = !!BIT_VEC64_TEST_BIT(req, i);
@@ -3011,12 +3052,18 @@
if (b_current == b_required)
continue;
+ p_item = &elem_group->mcast_elems[offset].bin_elem;
p_item->bin = i;
p_item->type = b_required ? BNX2X_MCAST_CMD_SET_ADD
: BNX2X_MCAST_CMD_SET_DEL;
list_add_tail(&p_item->link , &cmd_pos->data.macs_head);
- p_item++;
cnt++;
+ offset++;
+ if (offset == MCAST_MAC_ELEMS_PER_PG) {
+ offset = 0;
+ elem_group = list_next_entry(elem_group,
+ mcast_group_link);
+ }
}
/* We now definitely know how many commands are hiding here.
@@ -3103,6 +3150,7 @@
*/
if (cmd_pos->done) {
list_del(&cmd_pos->link);
+ bnx2x_free_groups(&cmd_pos->group_head);
kfree(cmd_pos);
}
@@ -3741,6 +3789,7 @@
}
list_del(&cmd_pos->link);
+ bnx2x_free_groups(&cmd_pos->group_head);
kfree(cmd_pos);
return cnt;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 6c586b0..3f77d08 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -2521,7 +2521,8 @@
for_each_vf(bp, vfidx) {
bulletin = BP_VF_BULLETIN(bp, vfidx);
if (bulletin->valid_bitmap & (1 << VLAN_VALID))
- bnx2x_set_vf_vlan(bp->dev, vfidx, bulletin->vlan, 0);
+ bnx2x_set_vf_vlan(bp->dev, vfidx, bulletin->vlan, 0,
+ htons(ETH_P_8021Q));
}
}
@@ -2781,7 +2782,8 @@
return 0;
}
-int bnx2x_set_vf_vlan(struct net_device *dev, int vfidx, u16 vlan, u8 qos)
+int bnx2x_set_vf_vlan(struct net_device *dev, int vfidx, u16 vlan, u8 qos,
+ __be16 vlan_proto)
{
struct pf_vf_bulletin_content *bulletin = NULL;
struct bnx2x *bp = netdev_priv(dev);
@@ -2796,6 +2798,9 @@
return -EINVAL;
}
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
DP(BNX2X_MSG_IOV, "configuring VF %d with VLAN %d qos %d\n",
vfidx, vlan, 0);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 228c964..a9f9f37 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -32,6 +32,7 @@
#include <linux/mii.h>
#include <linux/if.h>
#include <linux/if_vlan.h>
+#include <linux/rtc.h>
#include <net/ip.h>
#include <net/tcp.h>
#include <net/udp.h>
@@ -93,50 +94,49 @@
BCM57404_NPAR,
BCM57406_NPAR,
BCM57407_SFP,
+ BCM57407_NPAR,
BCM57414_NPAR,
BCM57416_NPAR,
- BCM57304_VF,
- BCM57404_VF,
- BCM57414_VF,
- BCM57314_VF,
+ NETXTREME_E_VF,
+ NETXTREME_C_VF,
};
/* indexed by enum above */
static const struct {
char *name;
} board_info[] = {
- { "Broadcom BCM57301 NetXtreme-C Single-port 10Gb Ethernet" },
- { "Broadcom BCM57302 NetXtreme-C Dual-port 10Gb/25Gb Ethernet" },
- { "Broadcom BCM57304 NetXtreme-C Dual-port 10Gb/25Gb/40Gb/50Gb Ethernet" },
+ { "Broadcom BCM57301 NetXtreme-C 10Gb Ethernet" },
+ { "Broadcom BCM57302 NetXtreme-C 10Gb/25Gb Ethernet" },
+ { "Broadcom BCM57304 NetXtreme-C 10Gb/25Gb/40Gb/50Gb Ethernet" },
{ "Broadcom BCM57417 NetXtreme-E Ethernet Partition" },
- { "Broadcom BCM58700 Nitro 4-port 1Gb/2.5Gb/10Gb Ethernet" },
- { "Broadcom BCM57311 NetXtreme-C Single-port 10Gb Ethernet" },
- { "Broadcom BCM57312 NetXtreme-C Dual-port 10Gb/25Gb Ethernet" },
- { "Broadcom BCM57402 NetXtreme-E Dual-port 10Gb Ethernet" },
- { "Broadcom BCM57404 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
- { "Broadcom BCM57406 NetXtreme-E Dual-port 10GBase-T Ethernet" },
+ { "Broadcom BCM58700 Nitro 1Gb/2.5Gb/10Gb Ethernet" },
+ { "Broadcom BCM57311 NetXtreme-C 10Gb Ethernet" },
+ { "Broadcom BCM57312 NetXtreme-C 10Gb/25Gb Ethernet" },
+ { "Broadcom BCM57402 NetXtreme-E 10Gb Ethernet" },
+ { "Broadcom BCM57404 NetXtreme-E 10Gb/25Gb Ethernet" },
+ { "Broadcom BCM57406 NetXtreme-E 10GBase-T Ethernet" },
{ "Broadcom BCM57402 NetXtreme-E Ethernet Partition" },
- { "Broadcom BCM57407 NetXtreme-E Dual-port 10GBase-T Ethernet" },
- { "Broadcom BCM57412 NetXtreme-E Dual-port 10Gb Ethernet" },
- { "Broadcom BCM57414 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
- { "Broadcom BCM57416 NetXtreme-E Dual-port 10GBase-T Ethernet" },
- { "Broadcom BCM57417 NetXtreme-E Dual-port 10GBase-T Ethernet" },
+ { "Broadcom BCM57407 NetXtreme-E 10GBase-T Ethernet" },
+ { "Broadcom BCM57412 NetXtreme-E 10Gb Ethernet" },
+ { "Broadcom BCM57414 NetXtreme-E 10Gb/25Gb Ethernet" },
+ { "Broadcom BCM57416 NetXtreme-E 10GBase-T Ethernet" },
+ { "Broadcom BCM57417 NetXtreme-E 10GBase-T Ethernet" },
{ "Broadcom BCM57412 NetXtreme-E Ethernet Partition" },
- { "Broadcom BCM57314 NetXtreme-C Dual-port 10Gb/25Gb/40Gb/50Gb Ethernet" },
- { "Broadcom BCM57417 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
- { "Broadcom BCM57416 NetXtreme-E Dual-port 10Gb Ethernet" },
+ { "Broadcom BCM57314 NetXtreme-C 10Gb/25Gb/40Gb/50Gb Ethernet" },
+ { "Broadcom BCM57417 NetXtreme-E 10Gb/25Gb Ethernet" },
+ { "Broadcom BCM57416 NetXtreme-E 10Gb Ethernet" },
{ "Broadcom BCM57404 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM57406 NetXtreme-E Ethernet Partition" },
- { "Broadcom BCM57407 NetXtreme-E Dual-port 25Gb Ethernet" },
+ { "Broadcom BCM57407 NetXtreme-E 25Gb Ethernet" },
+ { "Broadcom BCM57407 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM57414 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM57416 NetXtreme-E Ethernet Partition" },
- { "Broadcom BCM57304 NetXtreme-C Ethernet Virtual Function" },
- { "Broadcom BCM57404 NetXtreme-E Ethernet Virtual Function" },
- { "Broadcom BCM57414 NetXtreme-E Ethernet Virtual Function" },
- { "Broadcom BCM57314 NetXtreme-E Ethernet Virtual Function" },
+ { "Broadcom NetXtreme-E Ethernet Virtual Function" },
+ { "Broadcom NetXtreme-C Ethernet Virtual Function" },
};
static const struct pci_device_id bnxt_pci_tbl[] = {
+ { PCI_VDEVICE(BROADCOM, 0x16c0), .driver_data = BCM57417_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16c8), .driver_data = BCM57301 },
{ PCI_VDEVICE(BROADCOM, 0x16c9), .driver_data = BCM57302 },
{ PCI_VDEVICE(BROADCOM, 0x16ca), .driver_data = BCM57304 },
@@ -160,13 +160,19 @@
{ PCI_VDEVICE(BROADCOM, 0x16e7), .driver_data = BCM57404_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16e8), .driver_data = BCM57406_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16e9), .driver_data = BCM57407_SFP },
+ { PCI_VDEVICE(BROADCOM, 0x16ea), .driver_data = BCM57407_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x16eb), .driver_data = BCM57412_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16ec), .driver_data = BCM57414_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x16ed), .driver_data = BCM57414_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16ee), .driver_data = BCM57416_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x16ef), .driver_data = BCM57416_NPAR },
#ifdef CONFIG_BNXT_SRIOV
- { PCI_VDEVICE(BROADCOM, 0x16cb), .driver_data = BCM57304_VF },
- { PCI_VDEVICE(BROADCOM, 0x16d3), .driver_data = BCM57404_VF },
- { PCI_VDEVICE(BROADCOM, 0x16dc), .driver_data = BCM57414_VF },
- { PCI_VDEVICE(BROADCOM, 0x16e1), .driver_data = BCM57314_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16c1), .driver_data = NETXTREME_E_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16cb), .driver_data = NETXTREME_C_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16d3), .driver_data = NETXTREME_E_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16dc), .driver_data = NETXTREME_E_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16e1), .driver_data = NETXTREME_C_VF },
+ { PCI_VDEVICE(BROADCOM, 0x16e5), .driver_data = NETXTREME_C_VF },
#endif
{ 0 }
};
@@ -189,8 +195,7 @@
static bool bnxt_vf_pciid(enum board_idx idx)
{
- return (idx == BCM57304_VF || idx == BCM57404_VF ||
- idx == BCM57314_VF || idx == BCM57414_VF);
+ return (idx == NETXTREME_C_VF || idx == NETXTREME_E_VF);
}
#define DB_CP_REARM_FLAGS (DB_KEY_CP | DB_IDX_VALID)
@@ -3419,10 +3424,10 @@
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
if (set_rss) {
- vnic->hash_type = BNXT_RSS_HASH_TYPE_FLAG_IPV4 |
- BNXT_RSS_HASH_TYPE_FLAG_TCP_IPV4 |
- BNXT_RSS_HASH_TYPE_FLAG_IPV6 |
- BNXT_RSS_HASH_TYPE_FLAG_TCP_IPV6;
+ vnic->hash_type = VNIC_RSS_CFG_REQ_HASH_TYPE_IPV4 |
+ VNIC_RSS_CFG_REQ_HASH_TYPE_TCP_IPV4 |
+ VNIC_RSS_CFG_REQ_HASH_TYPE_IPV6 |
+ VNIC_RSS_CFG_REQ_HASH_TYPE_TCP_IPV6;
req.hash_type = cpu_to_le32(vnic->hash_type);
@@ -4156,6 +4161,11 @@
if (rc)
goto hwrm_func_qcaps_exit;
+ bp->tx_push_thresh = 0;
+ if (resp->flags &
+ cpu_to_le32(FUNC_QCAPS_RESP_FLAGS_PUSH_MODE_SUPPORTED))
+ bp->tx_push_thresh = BNXT_TX_PUSH_THRESH;
+
if (BNXT_PF(bp)) {
struct bnxt_pf_info *pf = &bp->pf;
@@ -4187,12 +4197,6 @@
struct bnxt_vf_info *vf = &bp->vf;
vf->fw_fid = le16_to_cpu(resp->fid);
- memcpy(vf->mac_addr, resp->mac_address, ETH_ALEN);
- if (is_valid_ether_addr(vf->mac_addr))
- /* overwrite netdev dev_adr with admin VF MAC */
- memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
- else
- random_ether_addr(bp->dev->dev_addr);
vf->max_rsscos_ctxs = le16_to_cpu(resp->max_rsscos_ctx);
vf->max_cp_rings = le16_to_cpu(resp->max_cmpl_rings);
@@ -4204,14 +4208,21 @@
vf->max_l2_ctxs = le16_to_cpu(resp->max_l2_ctxs);
vf->max_vnics = le16_to_cpu(resp->max_vnics);
vf->max_stat_ctxs = le16_to_cpu(resp->max_stat_ctx);
+
+ memcpy(vf->mac_addr, resp->mac_address, ETH_ALEN);
+ mutex_unlock(&bp->hwrm_cmd_lock);
+
+ if (is_valid_ether_addr(vf->mac_addr)) {
+ /* overwrite netdev dev_adr with admin VF MAC */
+ memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
+ } else {
+ random_ether_addr(bp->dev->dev_addr);
+ rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
+ }
+ return rc;
#endif
}
- bp->tx_push_thresh = 0;
- if (resp->flags &
- cpu_to_le32(FUNC_QCAPS_RESP_FLAGS_PUSH_MODE_SUPPORTED))
- bp->tx_push_thresh = BNXT_TX_PUSH_THRESH;
-
hwrm_func_qcaps_exit:
mutex_unlock(&bp->hwrm_cmd_lock);
return rc;
@@ -4249,6 +4260,9 @@
if (bp->max_tc > BNXT_MAX_QUEUE)
bp->max_tc = BNXT_MAX_QUEUE;
+ if (resp->queue_cfg_info & QUEUE_QPORTCFG_RESP_QUEUE_CFG_INFO_ASYM_CFG)
+ bp->max_tc = 1;
+
qptr = &resp->queue_id0;
for (i = 0; i < bp->max_tc; i++) {
bp->q_info[i].queue_id = *qptr++;
@@ -4307,6 +4321,31 @@
return rc;
}
+int bnxt_hwrm_fw_set_time(struct bnxt *bp)
+{
+#if IS_ENABLED(CONFIG_RTC_LIB)
+ struct hwrm_fw_set_time_input req = {0};
+ struct rtc_time tm;
+ struct timeval tv;
+
+ if (bp->hwrm_spec_code < 0x10400)
+ return -EOPNOTSUPP;
+
+ do_gettimeofday(&tv);
+ rtc_time_to_tm(tv.tv_sec, &tm);
+ bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FW_SET_TIME, -1, -1);
+ req.year = cpu_to_le16(1900 + tm.tm_year);
+ req.month = 1 + tm.tm_mon;
+ req.day = tm.tm_mday;
+ req.hour = tm.tm_hour;
+ req.minute = tm.tm_min;
+ req.second = tm.tm_sec;
+ return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
static int bnxt_hwrm_port_qstats(struct bnxt *bp)
{
int rc;
@@ -6804,6 +6843,8 @@
if (rc)
goto init_err;
+ bnxt_hwrm_fw_set_time(bp);
+
dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
NETIF_F_TSO | NETIF_F_TSO6 |
NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 23e04a6..51b164a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -11,10 +11,10 @@
#define BNXT_H
#define DRV_MODULE_NAME "bnxt_en"
-#define DRV_MODULE_VERSION "1.3.0"
+#define DRV_MODULE_VERSION "1.5.0"
#define DRV_VER_MAJ 1
-#define DRV_VER_MIN 3
+#define DRV_VER_MIN 5
#define DRV_VER_UPD 0
struct tx_bd {
@@ -106,11 +106,11 @@
#define CMP_TYPE_REMOTE_DRIVER_REQ 34
#define CMP_TYPE_REMOTE_DRIVER_RESP 36
#define CMP_TYPE_ERROR_STATUS 48
- #define CMPL_BASE_TYPE_STAT_EJECT (0x1aUL << 0)
- #define CMPL_BASE_TYPE_HWRM_DONE (0x20UL << 0)
- #define CMPL_BASE_TYPE_HWRM_FWD_REQ (0x22UL << 0)
- #define CMPL_BASE_TYPE_HWRM_FWD_RESP (0x24UL << 0)
- #define CMPL_BASE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define CMPL_BASE_TYPE_STAT_EJECT 0x1aUL
+ #define CMPL_BASE_TYPE_HWRM_DONE 0x20UL
+ #define CMPL_BASE_TYPE_HWRM_FWD_REQ 0x22UL
+ #define CMPL_BASE_TYPE_HWRM_FWD_RESP 0x24UL
+ #define CMPL_BASE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
#define TX_CMP_FLAGS_ERROR (1 << 6)
#define TX_CMP_FLAGS_PUSH (1 << 7)
@@ -389,11 +389,6 @@
#define INVALID_HW_RING_ID ((u16)-1)
-#define BNXT_RSS_HASH_TYPE_FLAG_IPV4 0x01
-#define BNXT_RSS_HASH_TYPE_FLAG_TCP_IPV4 0x02
-#define BNXT_RSS_HASH_TYPE_FLAG_IPV6 0x04
-#define BNXT_RSS_HASH_TYPE_FLAG_TCP_IPV6 0x08
-
/* The hardware supports certain page sizes. Use the supported page sizes
* to allocate the rings.
*/
@@ -418,7 +413,7 @@
#define BNXT_RX_PAGE_SIZE (1 << BNXT_RX_PAGE_SHIFT)
-#define BNXT_MIN_PKT_SIZE 45
+#define BNXT_MIN_PKT_SIZE 52
#define BNXT_NUM_TESTS(bp) 0
@@ -1225,6 +1220,7 @@
int bnxt_hwrm_func_qcaps(struct bnxt *);
int bnxt_hwrm_set_pause(struct bnxt *);
int bnxt_hwrm_set_link_setting(struct bnxt *, bool, bool);
+int bnxt_hwrm_fw_set_time(struct bnxt *);
int bnxt_open_nic(struct bnxt *, bool, bool);
int bnxt_close_nic(struct bnxt *, bool, bool);
int bnxt_get_max_rings(struct bnxt *, int *, int *, bool);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index b83e174..a7e04ff 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -21,6 +21,8 @@
#include "bnxt_nvm_defs.h" /* NVRAM content constant and structure defs */
#include "bnxt_fw_hdr.h" /* Firmware hdr constant and structure defs */
#define FLASH_NVRAM_TIMEOUT ((HWRM_CMD_TIMEOUT) * 100)
+#define FLASH_PACKAGE_TIMEOUT ((HWRM_CMD_TIMEOUT) * 200)
+#define INSTALL_PACKAGE_TIMEOUT ((HWRM_CMD_TIMEOUT) * 200)
static char *bnxt_get_pkgver(struct net_device *dev, char *buf, size_t buflen);
@@ -346,7 +348,7 @@
int max_rx_rings, max_tx_rings, tcs;
bnxt_get_max_rings(bp, &max_rx_rings, &max_tx_rings, true);
- channel->max_combined = max_rx_rings;
+ channel->max_combined = max_t(int, max_rx_rings, max_tx_rings);
if (bnxt_get_max_rings(bp, &max_rx_rings, &max_tx_rings, false)) {
max_rx_rings = 0;
@@ -404,8 +406,8 @@
if (tcs > 1)
max_tx_rings /= tcs;
- if (sh && (channel->combined_count > max_rx_rings ||
- channel->combined_count > max_tx_rings))
+ if (sh &&
+ channel->combined_count > max_t(int, max_rx_rings, max_tx_rings))
return -ENOMEM;
if (!sh && (channel->rx_count > max_rx_rings ||
@@ -428,8 +430,10 @@
if (sh) {
bp->flags |= BNXT_FLAG_SHARED_RINGS;
- bp->rx_nr_rings = channel->combined_count;
- bp->tx_nr_rings_per_tc = channel->combined_count;
+ bp->rx_nr_rings = min_t(int, channel->combined_count,
+ max_rx_rings);
+ bp->tx_nr_rings_per_tc = min_t(int, channel->combined_count,
+ max_tx_rings);
} else {
bp->flags &= ~BNXT_FLAG_SHARED_RINGS;
bp->rx_nr_rings = channel->rx_count;
@@ -1028,6 +1032,10 @@
return bp->link_info.link_up;
}
+static int bnxt_find_nvram_item(struct net_device *dev, u16 type, u16 ordinal,
+ u16 ext, u16 *index, u32 *item_length,
+ u32 *data_length);
+
static int bnxt_flash_nvram(struct net_device *dev,
u16 dir_type,
u16 dir_ordinal,
@@ -1179,7 +1187,6 @@
(unsigned long)calculated_crc);
return -EINVAL;
}
- /* TODO: Validate digital signature (RSA-encrypted SHA-256 hash) here */
rc = bnxt_flash_nvram(dev, dir_type, BNX_DIR_ORDINAL_FIRST,
0, 0, fw_data, fw_size);
if (rc == 0) /* Firmware update successful */
@@ -1188,6 +1195,57 @@
return rc;
}
+static int bnxt_flash_microcode(struct net_device *dev,
+ u16 dir_type,
+ const u8 *fw_data,
+ size_t fw_size)
+{
+ struct bnxt_ucode_trailer *trailer;
+ u32 calculated_crc;
+ u32 stored_crc;
+ int rc = 0;
+
+ if (fw_size < sizeof(struct bnxt_ucode_trailer)) {
+ netdev_err(dev, "Invalid microcode file size: %u\n",
+ (unsigned int)fw_size);
+ return -EINVAL;
+ }
+ trailer = (struct bnxt_ucode_trailer *)(fw_data + (fw_size -
+ sizeof(*trailer)));
+ if (trailer->sig != cpu_to_le32(BNXT_UCODE_TRAILER_SIGNATURE)) {
+ netdev_err(dev, "Invalid microcode trailer signature: %08X\n",
+ le32_to_cpu(trailer->sig));
+ return -EINVAL;
+ }
+ if (le16_to_cpu(trailer->dir_type) != dir_type) {
+ netdev_err(dev, "Expected microcode type: %d, read: %d\n",
+ dir_type, le16_to_cpu(trailer->dir_type));
+ return -EINVAL;
+ }
+ if (le16_to_cpu(trailer->trailer_length) <
+ sizeof(struct bnxt_ucode_trailer)) {
+ netdev_err(dev, "Invalid microcode trailer length: %d\n",
+ le16_to_cpu(trailer->trailer_length));
+ return -EINVAL;
+ }
+
+ /* Confirm the CRC32 checksum of the file: */
+ stored_crc = le32_to_cpu(*(__le32 *)(fw_data + fw_size -
+ sizeof(stored_crc)));
+ calculated_crc = ~crc32(~0, fw_data, fw_size - sizeof(stored_crc));
+ if (calculated_crc != stored_crc) {
+ netdev_err(dev,
+ "CRC32 (%08lX) does not match calculated: %08lX\n",
+ (unsigned long)stored_crc,
+ (unsigned long)calculated_crc);
+ return -EINVAL;
+ }
+ rc = bnxt_flash_nvram(dev, dir_type, BNX_DIR_ORDINAL_FIRST,
+ 0, 0, fw_data, fw_size);
+
+ return rc;
+}
+
static bool bnxt_dir_type_is_ape_bin_format(u16 dir_type)
{
switch (dir_type) {
@@ -1206,7 +1264,7 @@
return false;
}
-static bool bnxt_dir_type_is_unprotected_exec_format(u16 dir_type)
+static bool bnxt_dir_type_is_other_exec_format(u16 dir_type)
{
switch (dir_type) {
case BNX_DIR_TYPE_AVS:
@@ -1227,7 +1285,7 @@
static bool bnxt_dir_type_is_executable(u16 dir_type)
{
return bnxt_dir_type_is_ape_bin_format(dir_type) ||
- bnxt_dir_type_is_unprotected_exec_format(dir_type);
+ bnxt_dir_type_is_other_exec_format(dir_type);
}
static int bnxt_flash_firmware_from_file(struct net_device *dev,
@@ -1237,10 +1295,6 @@
const struct firmware *fw;
int rc;
- if (dir_type != BNX_DIR_TYPE_UPDATE &&
- bnxt_dir_type_is_executable(dir_type) == false)
- return -EINVAL;
-
rc = request_firmware(&fw, filename, &dev->dev);
if (rc != 0) {
netdev_err(dev, "Error %d requesting firmware file: %s\n",
@@ -1249,6 +1303,8 @@
}
if (bnxt_dir_type_is_ape_bin_format(dir_type) == true)
rc = bnxt_flash_firmware(dev, dir_type, fw->data, fw->size);
+ else if (bnxt_dir_type_is_other_exec_format(dir_type) == true)
+ rc = bnxt_flash_microcode(dev, dir_type, fw->data, fw->size);
else
rc = bnxt_flash_nvram(dev, dir_type, BNX_DIR_ORDINAL_FIRST,
0, 0, fw->data, fw->size);
@@ -1257,10 +1313,83 @@
}
static int bnxt_flash_package_from_file(struct net_device *dev,
- char *filename)
+ char *filename, u32 install_type)
{
- netdev_err(dev, "packages are not yet supported\n");
- return -EINVAL;
+ struct bnxt *bp = netdev_priv(dev);
+ struct hwrm_nvm_install_update_output *resp = bp->hwrm_cmd_resp_addr;
+ struct hwrm_nvm_install_update_input install = {0};
+ const struct firmware *fw;
+ u32 item_len;
+ u16 index;
+ int rc;
+
+ bnxt_hwrm_fw_set_time(bp);
+
+ if (bnxt_find_nvram_item(dev, BNX_DIR_TYPE_UPDATE,
+ BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
+ &index, &item_len, NULL) != 0) {
+ netdev_err(dev, "PKG update area not created in nvram\n");
+ return -ENOBUFS;
+ }
+
+ rc = request_firmware(&fw, filename, &dev->dev);
+ if (rc != 0) {
+ netdev_err(dev, "PKG error %d requesting file: %s\n",
+ rc, filename);
+ return rc;
+ }
+
+ if (fw->size > item_len) {
+ netdev_err(dev, "PKG insufficient update area in nvram: %lu",
+ (unsigned long)fw->size);
+ rc = -EFBIG;
+ } else {
+ dma_addr_t dma_handle;
+ u8 *kmem;
+ struct hwrm_nvm_modify_input modify = {0};
+
+ bnxt_hwrm_cmd_hdr_init(bp, &modify, HWRM_NVM_MODIFY, -1, -1);
+
+ modify.dir_idx = cpu_to_le16(index);
+ modify.len = cpu_to_le32(fw->size);
+
+ kmem = dma_alloc_coherent(&bp->pdev->dev, fw->size,
+ &dma_handle, GFP_KERNEL);
+ if (!kmem) {
+ netdev_err(dev,
+ "dma_alloc_coherent failure, length = %u\n",
+ (unsigned int)fw->size);
+ rc = -ENOMEM;
+ } else {
+ memcpy(kmem, fw->data, fw->size);
+ modify.host_src_addr = cpu_to_le64(dma_handle);
+
+ rc = hwrm_send_message(bp, &modify, sizeof(modify),
+ FLASH_PACKAGE_TIMEOUT);
+ dma_free_coherent(&bp->pdev->dev, fw->size, kmem,
+ dma_handle);
+ }
+ }
+ release_firmware(fw);
+ if (rc)
+ return rc;
+
+ if ((install_type & 0xffff) == 0)
+ install_type >>= 16;
+ bnxt_hwrm_cmd_hdr_init(bp, &install, HWRM_NVM_INSTALL_UPDATE, -1, -1);
+ install.install_type = cpu_to_le32(install_type);
+
+ rc = hwrm_send_message(bp, &install, sizeof(install),
+ INSTALL_PACKAGE_TIMEOUT);
+ if (rc)
+ return -EOPNOTSUPP;
+
+ if (resp->result) {
+ netdev_err(dev, "PKG install error = %d, problem_item = %d\n",
+ (s8)resp->result, (int)resp->problem_item);
+ return -ENOPKG;
+ }
+ return 0;
}
static int bnxt_flash_device(struct net_device *dev,
@@ -1271,8 +1400,10 @@
return -EINVAL;
}
- if (flash->region == ETHTOOL_FLASH_ALL_REGIONS)
- return bnxt_flash_package_from_file(dev, flash->data);
+ if (flash->region == ETHTOOL_FLASH_ALL_REGIONS ||
+ flash->region > 0xffff)
+ return bnxt_flash_package_from_file(dev, flash->data,
+ flash->region);
return bnxt_flash_firmware_from_file(dev, flash->region, flash->data);
}
@@ -1516,7 +1647,7 @@
/* Create or re-write an NVM item: */
if (bnxt_dir_type_is_executable(type) == true)
- return -EINVAL;
+ return -EOPNOTSUPP;
ext = eeprom->magic & 0xffff;
ordinal = eeprom->offset >> 16;
attr = eeprom->offset & 0xffff;
@@ -1718,6 +1849,25 @@
return rc;
}
+static int bnxt_nway_reset(struct net_device *dev)
+{
+ int rc = 0;
+
+ struct bnxt *bp = netdev_priv(dev);
+ struct bnxt_link_info *link_info = &bp->link_info;
+
+ if (!BNXT_SINGLE_PF(bp))
+ return -EOPNOTSUPP;
+
+ if (!(link_info->autoneg & BNXT_AUTONEG_SPEED))
+ return -EINVAL;
+
+ if (netif_running(dev))
+ rc = bnxt_hwrm_set_link_setting(bp, true, false);
+
+ return rc;
+}
+
const struct ethtool_ops bnxt_ethtool_ops = {
.get_link_ksettings = bnxt_get_link_ksettings,
.set_link_ksettings = bnxt_set_link_ksettings,
@@ -1750,4 +1900,5 @@
.set_eee = bnxt_set_eee,
.get_module_info = bnxt_get_module_info,
.get_module_eeprom = bnxt_get_module_eeprom,
+ .nway_reset = bnxt_nway_reset
};
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_fw_hdr.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_fw_hdr.h
index 82bf44a..cad30dd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_fw_hdr.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_fw_hdr.h
@@ -11,6 +11,7 @@
#define __BNXT_FW_HDR_H__
#define BNXT_FIRMWARE_BIN_SIGNATURE 0x1a4d4342 /* "BCM"+0x1a */
+#define BNXT_UCODE_TRAILER_SIGNATURE 0x726c7254 /* "Trlr" */
enum SUPPORTED_FAMILY {
DEVICE_5702_3_4_FAMILY, /* 0 - Denali, Vinson, K2 */
@@ -85,7 +86,7 @@
struct bnxt_fw_header {
__le32 signature; /* constains the constant value of
- * BNXT_Firmware_Bin_Signatures
+ * BNXT_FIRMWARE_BIN_SIGNATURE
*/
u8 flags; /* reserved for ChiMP use */
u8 code_type; /* enum SUPPORTED_CODE */
@@ -102,4 +103,17 @@
u8 major_ver;
};
+/* Microcode and pre-boot software/firmware trailer: */
+struct bnxt_ucode_trailer {
+ u8 rsa_sig[256];
+ __le16 flags;
+ u8 version_format;
+ u8 version_length;
+ u8 version[16];
+ __le16 dir_type;
+ __le16 trailer_length;
+ __le32 sig; /* BNXT_UCODE_TRAILER_SIGNATURE */
+ __le32 chksum; /* CRC-32 */
+};
+
#endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
index 517567f..04a96cc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
@@ -39,7 +39,7 @@
__le16 type;
#define EJECT_CMPL_TYPE_MASK 0x3fUL
#define EJECT_CMPL_TYPE_SFT 0
- #define EJECT_CMPL_TYPE_STAT_EJECT (0x1aUL << 0)
+ #define EJECT_CMPL_TYPE_STAT_EJECT 0x1aUL
__le16 len;
__le32 opaque;
__le32 v;
@@ -52,7 +52,7 @@
__le16 type;
#define HWRM_CMPL_TYPE_MASK 0x3fUL
#define HWRM_CMPL_TYPE_SFT 0
- #define HWRM_CMPL_TYPE_HWRM_DONE (0x20UL << 0)
+ #define HWRM_CMPL_TYPE_HWRM_DONE 0x20UL
__le16 sequence_id;
__le32 unused_1;
__le32 v;
@@ -65,7 +65,7 @@
__le16 req_len_type;
#define HWRM_FWD_REQ_CMPL_TYPE_MASK 0x3fUL
#define HWRM_FWD_REQ_CMPL_TYPE_SFT 0
- #define HWRM_FWD_REQ_CMPL_TYPE_HWRM_FWD_REQ (0x22UL << 0)
+ #define HWRM_FWD_REQ_CMPL_TYPE_HWRM_FWD_REQ 0x22UL
#define HWRM_FWD_REQ_CMPL_REQ_LEN_MASK 0xffc0UL
#define HWRM_FWD_REQ_CMPL_REQ_LEN_SFT 6
__le16 source_id;
@@ -81,7 +81,7 @@
__le16 type;
#define HWRM_FWD_RESP_CMPL_TYPE_MASK 0x3fUL
#define HWRM_FWD_RESP_CMPL_TYPE_SFT 0
- #define HWRM_FWD_RESP_CMPL_TYPE_HWRM_FWD_RESP (0x24UL << 0)
+ #define HWRM_FWD_RESP_CMPL_TYPE_HWRM_FWD_RESP 0x24UL
__le16 source_id;
__le16 resp_len;
__le16 unused_1;
@@ -96,25 +96,26 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_STATUS_CHANGE (0x0UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_MTU_CHANGE (0x1UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CHANGE (0x2UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_DCB_CONFIG_CHANGE (0x3UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PORT_CONN_NOT_ALLOWED (0x4UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_NOT_ALLOWED (0x5UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_CHANGE (0x6UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PORT_PHY_CFG_CHANGE (0x7UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_FUNC_DRVR_UNLOAD (0x10UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_FUNC_DRVR_LOAD (0x11UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_UNLOAD (0x20UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_LOAD (0x21UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_FLR (0x30UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_MAC_ADDR_CHANGE (0x31UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_VF_COMM_STATUS_CHANGE (0x32UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_CFG_CHANGE (0x33UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_HWRM_ERROR (0xffUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_STATUS_CHANGE 0x0UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_MTU_CHANGE 0x1UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CHANGE 0x2UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_DCB_CONFIG_CHANGE 0x3UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PORT_CONN_NOT_ALLOWED 0x4UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_NOT_ALLOWED 0x5UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_CHANGE 0x6UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PORT_PHY_CFG_CHANGE 0x7UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_FUNC_DRVR_UNLOAD 0x10UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_FUNC_DRVR_LOAD 0x11UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_FUNC_FLR_PROC_CMPLT 0x12UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_UNLOAD 0x20UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_DRVR_LOAD 0x21UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_FLR 0x30UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_MAC_ADDR_CHANGE 0x31UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_PF_VF_COMM_STATUS_CHANGE 0x32UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_VF_CFG_CHANGE 0x33UL
+ #define HWRM_ASYNC_EVENT_CMPL_EVENT_ID_HWRM_ERROR 0xffUL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_V 0x1UL
@@ -130,9 +131,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_EVENT_ID_LINK_STATUS_CHANGE (0x0UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_EVENT_ID_LINK_STATUS_CHANGE 0x0UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_LINK_STATUS_CHANGE_V 0x1UL
@@ -156,9 +157,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_EVENT_ID_LINK_MTU_CHANGE (0x1UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_EVENT_ID_LINK_MTU_CHANGE 0x1UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_LINK_MTU_CHANGE_V 0x1UL
@@ -176,9 +177,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_ID_LINK_SPEED_CHANGE (0x2UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_ID_LINK_SPEED_CHANGE 0x2UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_V 0x1UL
@@ -200,8 +201,7 @@
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_40GB (0x190UL << 1)
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_50GB (0x1f4UL << 1)
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_100GB (0x3e8UL << 1)
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_10MB (0xffffUL << 1)
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_LAST HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_10MB
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_LAST HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_NEW_LINK_SPEED_100MBPS_100GB
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_PORT_ID_MASK 0xffff0000UL
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CHANGE_EVENT_DATA1_PORT_ID_SFT 16
};
@@ -211,9 +211,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_EVENT_ID_DCB_CONFIG_CHANGE (0x3UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_EVENT_ID_DCB_CONFIG_CHANGE 0x3UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_DCB_CONFIG_CHANGE_V 0x1UL
@@ -231,9 +231,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_EVENT_ID_PORT_CONN_NOT_ALLOWED (0x4UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_EVENT_ID_PORT_CONN_NOT_ALLOWED 0x4UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_PORT_CONN_NOT_ALLOWED_V 0x1UL
@@ -258,9 +258,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_EVENT_ID_LINK_SPEED_CFG_NOT_ALLOWED (0x5UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_EVENT_ID_LINK_SPEED_CFG_NOT_ALLOWED 0x5UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_NOT_ALLOWED_V 0x1UL
@@ -278,9 +278,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_EVENT_ID_LINK_SPEED_CFG_CHANGE (0x6UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_EVENT_ID_LINK_SPEED_CFG_CHANGE 0x6UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_LINK_SPEED_CFG_CHANGE_V 0x1UL
@@ -300,9 +300,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_EVENT_ID_FUNC_DRVR_UNLOAD (0x10UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_EVENT_ID_FUNC_DRVR_UNLOAD 0x10UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_UNLOAD_V 0x1UL
@@ -320,9 +320,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_EVENT_ID_FUNC_DRVR_LOAD (0x11UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_EVENT_ID_FUNC_DRVR_LOAD 0x11UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_FUNC_DRVR_LOAD_V 0x1UL
@@ -340,9 +340,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_EVENT_ID_PF_DRVR_UNLOAD (0x20UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_EVENT_ID_PF_DRVR_UNLOAD 0x20UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_UNLOAD_V 0x1UL
@@ -362,9 +362,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_EVENT_ID_PF_DRVR_LOAD (0x21UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_EVENT_ID_PF_DRVR_LOAD 0x21UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_PF_DRVR_LOAD_V 0x1UL
@@ -384,9 +384,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_VF_FLR_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_VF_FLR_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_VF_FLR_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_FLR_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_VF_FLR_EVENT_ID_VF_FLR (0x30UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_FLR_EVENT_ID_VF_FLR 0x30UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_VF_FLR_V 0x1UL
@@ -404,9 +404,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_EVENT_ID_VF_MAC_ADDR_CHANGE (0x31UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_EVENT_ID_VF_MAC_ADDR_CHANGE 0x31UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_VF_MAC_ADDR_CHANGE_V 0x1UL
@@ -424,9 +424,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_EVENT_ID_PF_VF_COMM_STATUS_CHANGE (0x32UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_EVENT_ID_PF_VF_COMM_STATUS_CHANGE 0x32UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_PF_VF_COMM_STATUS_CHANGE_V 0x1UL
@@ -443,9 +443,9 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_EVENT_ID_VF_CFG_CHANGE (0x33UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_EVENT_ID_VF_CFG_CHANGE 0x33UL
__le32 event_data2;
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_VF_CFG_CHANGE_V 0x1UL
@@ -465,15 +465,15 @@
__le16 type;
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_TYPE_MASK 0x3fUL
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_TYPE_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_TYPE_HWRM_ASYNC_EVENT (0x2eUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_TYPE_HWRM_ASYNC_EVENT 0x2eUL
__le16 event_id;
- #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_ID_HWRM_ERROR (0xffUL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_ID_HWRM_ERROR 0xffUL
__le32 event_data2;
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_MASK 0xffUL
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_SFT 0
- #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_WARNING (0x0UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_NONFATAL (0x1UL << 0)
- #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_FATAL (0x2UL << 0)
+ #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_WARNING 0x0UL
+ #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_NONFATAL 0x1UL
+ #define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_FATAL 0x2UL
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_LAST HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA2_SEVERITY_FATAL
u8 opaque_v;
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_V 0x1UL
@@ -485,12 +485,12 @@
#define HWRM_ASYNC_EVENT_CMPL_HWRM_ERROR_EVENT_DATA1_TIMESTAMP 0x1UL
};
-/* HW Resource Manager Specification 1.3.0 */
+/* HW Resource Manager Specification 1.5.1 */
#define HWRM_VERSION_MAJOR 1
-#define HWRM_VERSION_MINOR 3
-#define HWRM_VERSION_UPDATE 0
+#define HWRM_VERSION_MINOR 5
+#define HWRM_VERSION_UPDATE 1
-#define HWRM_VERSION_STR "1.3.0"
+#define HWRM_VERSION_STR "1.5.1"
/*
* Following is the signature for HWRM message field that indicates not
* applicable (All F's). Need to cast it the size of the field if needed.
@@ -556,8 +556,8 @@
#define HWRM_QUEUE_QPORTCFG (0x30UL)
#define HWRM_QUEUE_QCFG (0x31UL)
#define HWRM_QUEUE_CFG (0x32UL)
- #define HWRM_QUEUE_BUFFERS_QCFG (0x33UL)
- #define HWRM_QUEUE_BUFFERS_CFG (0x34UL)
+ #define RESERVED2 (0x33UL)
+ #define RESERVED3 (0x34UL)
#define HWRM_QUEUE_PFCENABLE_QCFG (0x35UL)
#define HWRM_QUEUE_PFCENABLE_CFG (0x36UL)
#define HWRM_QUEUE_PRI2COS_QCFG (0x37UL)
@@ -574,6 +574,7 @@
#define HWRM_VNIC_RSS_QCFG (0x47UL)
#define HWRM_VNIC_PLCMODES_CFG (0x48UL)
#define HWRM_VNIC_PLCMODES_QCFG (0x49UL)
+ #define HWRM_VNIC_QCAPS (0x4aUL)
#define HWRM_RING_ALLOC (0x50UL)
#define HWRM_RING_FREE (0x51UL)
#define HWRM_RING_CMPL_RING_QAGGINT_PARAMS (0x52UL)
@@ -581,13 +582,15 @@
#define HWRM_RING_RESET (0x5eUL)
#define HWRM_RING_GRP_ALLOC (0x60UL)
#define HWRM_RING_GRP_FREE (0x61UL)
+ #define RESERVED5 (0x64UL)
+ #define RESERVED6 (0x65UL)
#define HWRM_VNIC_RSS_COS_LB_CTX_ALLOC (0x70UL)
#define HWRM_VNIC_RSS_COS_LB_CTX_FREE (0x71UL)
#define HWRM_CFA_L2_FILTER_ALLOC (0x90UL)
#define HWRM_CFA_L2_FILTER_FREE (0x91UL)
#define HWRM_CFA_L2_FILTER_CFG (0x92UL)
#define HWRM_CFA_L2_SET_RX_MASK (0x93UL)
- #define RESERVED3 (0x94UL)
+ #define RESERVED4 (0x94UL)
#define HWRM_CFA_TUNNEL_FILTER_ALLOC (0x95UL)
#define HWRM_CFA_TUNNEL_FILTER_FREE (0x96UL)
#define HWRM_CFA_ENCAP_RECORD_ALLOC (0x97UL)
@@ -607,6 +610,8 @@
#define HWRM_STAT_CTX_CLR_STATS (0xb3UL)
#define HWRM_FW_RESET (0xc0UL)
#define HWRM_FW_QSTATUS (0xc1UL)
+ #define HWRM_FW_SET_TIME (0xc8UL)
+ #define HWRM_FW_GET_TIME (0xc9UL)
#define HWRM_EXEC_FWD_RESP (0xd0UL)
#define HWRM_REJECT_FWD_RESP (0xd1UL)
#define HWRM_FWD_RESP (0xd2UL)
@@ -615,11 +620,13 @@
#define HWRM_WOL_FILTER_ALLOC (0xf0UL)
#define HWRM_WOL_FILTER_FREE (0xf1UL)
#define HWRM_WOL_FILTER_QCFG (0xf2UL)
+ #define HWRM_WOL_REASON_QCFG (0xf3UL)
#define HWRM_DBG_READ_DIRECT (0xff10UL)
#define HWRM_DBG_READ_INDIRECT (0xff11UL)
#define HWRM_DBG_WRITE_DIRECT (0xff12UL)
#define HWRM_DBG_WRITE_INDIRECT (0xff13UL)
#define HWRM_DBG_DUMP (0xff14UL)
+ #define HWRM_NVM_INSTALL_UPDATE (0xfff3UL)
#define HWRM_NVM_MODIFY (0xfff4UL)
#define HWRM_NVM_VERIFY_UPDATE (0xfff5UL)
#define HWRM_NVM_GET_DEV_INFO (0xfff6UL)
@@ -824,7 +831,9 @@
u8 netctrl_fw_min;
u8 netctrl_fw_bld;
u8 netctrl_fw_rsvd;
- __le32 reserved1;
+ __le32 dev_caps_cfg;
+ #define VER_GET_RESP_DEV_CAPS_CFG_SECURE_FW_UPD_SUPPORTED 0x1UL
+ #define VER_GET_RESP_DEV_CAPS_CFG_FW_DCBX_AGENT_SUPPORTED 0x2UL
u8 roce_fw_maj;
u8 roce_fw_min;
u8 roce_fw_bld;
@@ -839,9 +848,9 @@
u8 chip_metal;
u8 chip_bond_id;
u8 chip_platform_type;
- #define VER_GET_RESP_CHIP_PLATFORM_TYPE_ASIC (0x0UL << 0)
- #define VER_GET_RESP_CHIP_PLATFORM_TYPE_FPGA (0x1UL << 0)
- #define VER_GET_RESP_CHIP_PLATFORM_TYPE_PALLADIUM (0x2UL << 0)
+ #define VER_GET_RESP_CHIP_PLATFORM_TYPE_ASIC 0x0UL
+ #define VER_GET_RESP_CHIP_PLATFORM_TYPE_FPGA 0x1UL
+ #define VER_GET_RESP_CHIP_PLATFORM_TYPE_PALLADIUM 0x2UL
__le16 max_req_win_len;
__le16 max_resp_len;
__le16 def_req_timeout;
@@ -863,10 +872,10 @@
#define FUNC_RESET_REQ_ENABLES_VF_ID_VALID 0x1UL
__le16 vf_id;
u8 func_reset_level;
- #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETALL (0x0UL << 0)
- #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETME (0x1UL << 0)
- #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETCHILDREN (0x2UL << 0)
- #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETVF (0x3UL << 0)
+ #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETALL 0x0UL
+ #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETME 0x1UL
+ #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETCHILDREN 0x2UL
+ #define FUNC_RESET_REQ_FUNC_RESET_LEVEL_RESETVF 0x3UL
u8 unused_0;
};
@@ -1028,6 +1037,10 @@
#define FUNC_QCAPS_RESP_FLAGS_ROCE_V2_SUPPORTED 0x10UL
#define FUNC_QCAPS_RESP_FLAGS_WOL_MAGICPKT_SUPPORTED 0x20UL
#define FUNC_QCAPS_RESP_FLAGS_WOL_BMP_SUPPORTED 0x40UL
+ #define FUNC_QCAPS_RESP_FLAGS_TX_RING_RL_SUPPORTED 0x80UL
+ #define FUNC_QCAPS_RESP_FLAGS_TX_BW_CFG_SUPPORTED 0x100UL
+ #define FUNC_QCAPS_RESP_FLAGS_VF_TX_RING_RL_SUPPORTED 0x200UL
+ #define FUNC_QCAPS_RESP_FLAGS_VF_BW_CFG_SUPPORTED 0x400UL
u8 mac_address[6];
__le16 max_rsscos_ctx;
__le16 max_cmpl_rings;
@@ -1047,9 +1060,8 @@
__le32 max_mcast_filters;
__le32 max_flow_id;
__le32 max_hw_ring_grps;
+ __le16 max_sp_tx_rings;
u8 unused_0;
- u8 unused_1;
- u8 unused_2;
u8 valid;
};
@@ -1077,6 +1089,7 @@
__le16 flags;
#define FUNC_QCFG_RESP_FLAGS_OOB_WOL_MAGICPKT_ENABLED 0x1UL
#define FUNC_QCFG_RESP_FLAGS_OOB_WOL_BMP_ENABLED 0x2UL
+ #define FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED 0x4UL
u8 mac_address[6];
__le16 pci_id;
__le16 alloc_rsscos_ctx;
@@ -1089,29 +1102,46 @@
__le16 mru;
__le16 stat_ctx_id;
u8 port_partition_type;
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_SPF (0x0UL << 0)
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_MPFS (0x1UL << 0)
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_0 (0x2UL << 0)
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_5 (0x3UL << 0)
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR2_0 (0x4UL << 0)
- #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_UNKNOWN (0xffUL << 0)
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_SPF 0x0UL
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_MPFS 0x1UL
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_0 0x2UL
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR1_5 0x3UL
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_NPAR2_0 0x4UL
+ #define FUNC_QCFG_RESP_PORT_PARTITION_TYPE_UNKNOWN 0xffUL
u8 unused_0;
__le16 dflt_vnic_id;
u8 unused_1;
u8 unused_2;
__le32 min_bw;
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_SFT 0
+ #define FUNC_QCFG_RESP_MIN_BW_RSVD 0x10000000UL
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_LAST FUNC_QCFG_RESP_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 max_bw;
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_SFT 0
+ #define FUNC_QCFG_RESP_MAX_BW_RSVD 0x10000000UL
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_LAST FUNC_QCFG_RESP_MAX_BW_BW_VALUE_UNIT_INVALID
u8 evb_mode;
- #define FUNC_QCFG_RESP_EVB_MODE_NO_EVB (0x0UL << 0)
- #define FUNC_QCFG_RESP_EVB_MODE_VEB (0x1UL << 0)
- #define FUNC_QCFG_RESP_EVB_MODE_VEPA (0x2UL << 0)
+ #define FUNC_QCFG_RESP_EVB_MODE_NO_EVB 0x0UL
+ #define FUNC_QCFG_RESP_EVB_MODE_VEB 0x1UL
+ #define FUNC_QCFG_RESP_EVB_MODE_VEPA 0x2UL
u8 unused_3;
- __le16 unused_4;
+ __le16 alloc_vfs;
__le32 alloc_mcast_filters;
__le32 alloc_hw_ring_grps;
- u8 unused_5;
- u8 unused_6;
- u8 unused_7;
+ __le16 alloc_sp_tx_rings;
+ u8 unused_4;
u8 valid;
};
@@ -1171,18 +1201,36 @@
__le16 dflt_vlan;
__be32 dflt_ip_addr[4];
__le32 min_bw;
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_SFT 0
+ #define FUNC_CFG_REQ_MIN_BW_RSVD 0x10000000UL
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_LAST FUNC_CFG_REQ_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 max_bw;
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_SFT 0
+ #define FUNC_CFG_REQ_MAX_BW_RSVD 0x10000000UL
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_LAST FUNC_CFG_REQ_MAX_BW_BW_VALUE_UNIT_INVALID
__le16 async_event_cr;
u8 vlan_antispoof_mode;
- #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_NOCHECK (0x0UL << 0)
- #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_VALIDATE_VLAN (0x1UL << 0)
- #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_INSERT_IF_VLANDNE (0x2UL << 0)
- #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_INSERT_OR_OVERRIDE_VLAN (0x3UL << 0)
+ #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_NOCHECK 0x0UL
+ #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_VALIDATE_VLAN 0x1UL
+ #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_INSERT_IF_VLANDNE 0x2UL
+ #define FUNC_CFG_REQ_VLAN_ANTISPOOF_MODE_INSERT_OR_OVERRIDE_VLAN 0x3UL
u8 allowed_vlan_pris;
u8 evb_mode;
- #define FUNC_CFG_REQ_EVB_MODE_NO_EVB (0x0UL << 0)
- #define FUNC_CFG_REQ_EVB_MODE_VEB (0x1UL << 0)
- #define FUNC_CFG_REQ_EVB_MODE_VEPA (0x2UL << 0)
+ #define FUNC_CFG_REQ_EVB_MODE_NO_EVB 0x0UL
+ #define FUNC_CFG_REQ_EVB_MODE_VEB 0x1UL
+ #define FUNC_CFG_REQ_EVB_MODE_VEPA 0x2UL
u8 unused_2;
__le16 num_mcast_filters;
};
@@ -1341,16 +1389,16 @@
#define FUNC_DRV_RGTR_REQ_ENABLES_VF_REQ_FWD 0x8UL
#define FUNC_DRV_RGTR_REQ_ENABLES_ASYNC_EVENT_FWD 0x10UL
__le16 os_type;
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_UNKNOWN (0x0UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_OTHER (0x1UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_MSDOS (0xeUL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_WINDOWS (0x12UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_SOLARIS (0x1dUL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_LINUX (0x24UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_FREEBSD (0x2aUL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_ESXI (0x68UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_WIN864 (0x73UL << 0)
- #define FUNC_DRV_RGTR_REQ_OS_TYPE_WIN2012R2 (0x74UL << 0)
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_UNKNOWN 0x0UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_OTHER 0x1UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_MSDOS 0xeUL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_WINDOWS 0x12UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_SOLARIS 0x1dUL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_LINUX 0x24UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_FREEBSD 0x2aUL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_ESXI 0x68UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_WIN864 0x73UL
+ #define FUNC_DRV_RGTR_REQ_OS_TYPE_WIN2012R2 0x74UL
u8 ver_maj;
u8 ver_min;
u8 ver_upd;
@@ -1415,13 +1463,13 @@
__le16 vf_id;
__le16 req_buf_num_pages;
__le16 req_buf_page_size;
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_16B (0x4UL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_4K (0xcUL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_8K (0xdUL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_64K (0x10UL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_2M (0x15UL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_4M (0x16UL << 0)
- #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_1G (0x1eUL << 0)
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_16B 0x4UL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_4K 0xcUL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_8K 0xdUL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_64K 0x10UL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_2M 0x15UL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_4M 0x16UL
+ #define FUNC_BUF_RGTR_REQ_REQ_BUF_PAGE_SIZE_1G 0x1eUL
__le16 req_buf_len;
__le16 resp_buf_len;
u8 unused_0;
@@ -1473,16 +1521,16 @@
__le16 seq_id;
__le16 resp_len;
__le16 os_type;
- #define FUNC_DRV_QVER_RESP_OS_TYPE_UNKNOWN (0x0UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_OTHER (0x1UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_MSDOS (0xeUL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_WINDOWS (0x12UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_SOLARIS (0x1dUL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_LINUX (0x24UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_FREEBSD (0x2aUL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_ESXI (0x68UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_WIN864 (0x73UL << 0)
- #define FUNC_DRV_QVER_RESP_OS_TYPE_WIN2012R2 (0x74UL << 0)
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_UNKNOWN 0x0UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_OTHER 0x1UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_MSDOS 0xeUL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_WINDOWS 0x12UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_SOLARIS 0x1dUL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_LINUX 0x24UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_FREEBSD 0x2aUL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_ESXI 0x68UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_WIN864 0x73UL
+ #define FUNC_DRV_QVER_RESP_OS_TYPE_WIN2012R2 0x74UL
u8 ver_maj;
u8 ver_min;
u8 ver_upd;
@@ -1528,44 +1576,44 @@
#define PORT_PHY_CFG_REQ_ENABLES_TX_LPI_TIMER 0x400UL
__le16 port_id;
__le16 force_link_speed;
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_100MB (0x1UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_1GB (0xaUL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_2GB (0x14UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_2_5GB (0x19UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10GB (0x64UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_20GB (0xc8UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_25GB (0xfaUL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_40GB (0x190UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_50GB (0x1f4UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_100GB (0x3e8UL << 0)
- #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10MB (0xffffUL << 0)
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_100MB 0x1UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_1GB 0xaUL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_2GB 0x14UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_2_5GB 0x19UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10GB 0x64UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_20GB 0xc8UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_25GB 0xfaUL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_40GB 0x190UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_50GB 0x1f4UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_100GB 0x3e8UL
+ #define PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10MB 0xffffUL
u8 auto_mode;
- #define PORT_PHY_CFG_REQ_AUTO_MODE_NONE (0x0UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_MODE_ALL_SPEEDS (0x1UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_MODE_ONE_SPEED (0x2UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_MODE_ONE_OR_BELOW (0x3UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_MODE_SPEED_MASK (0x4UL << 0)
+ #define PORT_PHY_CFG_REQ_AUTO_MODE_NONE 0x0UL
+ #define PORT_PHY_CFG_REQ_AUTO_MODE_ALL_SPEEDS 0x1UL
+ #define PORT_PHY_CFG_REQ_AUTO_MODE_ONE_SPEED 0x2UL
+ #define PORT_PHY_CFG_REQ_AUTO_MODE_ONE_OR_BELOW 0x3UL
+ #define PORT_PHY_CFG_REQ_AUTO_MODE_SPEED_MASK 0x4UL
u8 auto_duplex;
- #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_HALF (0x0UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_FULL (0x1UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_BOTH (0x2UL << 0)
+ #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_HALF 0x0UL
+ #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_FULL 0x1UL
+ #define PORT_PHY_CFG_REQ_AUTO_DUPLEX_BOTH 0x2UL
u8 auto_pause;
#define PORT_PHY_CFG_REQ_AUTO_PAUSE_TX 0x1UL
#define PORT_PHY_CFG_REQ_AUTO_PAUSE_RX 0x2UL
#define PORT_PHY_CFG_REQ_AUTO_PAUSE_AUTONEG_PAUSE 0x4UL
u8 unused_0;
__le16 auto_link_speed;
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_100MB (0x1UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_1GB (0xaUL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_2GB (0x14UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_2_5GB (0x19UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_10GB (0x64UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_20GB (0xc8UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_25GB (0xfaUL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_40GB (0x190UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_50GB (0x1f4UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_100GB (0x3e8UL << 0)
- #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_10MB (0xffffUL << 0)
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_100MB 0x1UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_1GB 0xaUL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_2GB 0x14UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_2_5GB 0x19UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_10GB 0x64UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_20GB 0xc8UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_25GB 0xfaUL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_40GB 0x190UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_50GB 0x1f4UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_100GB 0x3e8UL
+ #define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_10MB 0xffffUL
__le16 auto_link_speed_mask;
#define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_MASK_100MBHD 0x1UL
#define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_MASK_100MB 0x2UL
@@ -1582,12 +1630,12 @@
#define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_MASK_10MBHD 0x1000UL
#define PORT_PHY_CFG_REQ_AUTO_LINK_SPEED_MASK_10MB 0x2000UL
u8 wirespeed;
- #define PORT_PHY_CFG_REQ_WIRESPEED_OFF (0x0UL << 0)
- #define PORT_PHY_CFG_REQ_WIRESPEED_ON (0x1UL << 0)
+ #define PORT_PHY_CFG_REQ_WIRESPEED_OFF 0x0UL
+ #define PORT_PHY_CFG_REQ_WIRESPEED_ON 0x1UL
u8 lpbk;
- #define PORT_PHY_CFG_REQ_LPBK_NONE (0x0UL << 0)
- #define PORT_PHY_CFG_REQ_LPBK_LOCAL (0x1UL << 0)
- #define PORT_PHY_CFG_REQ_LPBK_REMOTE (0x2UL << 0)
+ #define PORT_PHY_CFG_REQ_LPBK_NONE 0x0UL
+ #define PORT_PHY_CFG_REQ_LPBK_LOCAL 0x1UL
+ #define PORT_PHY_CFG_REQ_LPBK_REMOTE 0x2UL
u8 force_pause;
#define PORT_PHY_CFG_REQ_FORCE_PAUSE_TX 0x1UL
#define PORT_PHY_CFG_REQ_FORCE_PAUSE_RX 0x2UL
@@ -1641,25 +1689,25 @@
__le16 seq_id;
__le16 resp_len;
u8 link;
- #define PORT_PHY_QCFG_RESP_LINK_NO_LINK (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SIGNAL (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_LINK (0x2UL << 0)
+ #define PORT_PHY_QCFG_RESP_LINK_NO_LINK 0x0UL
+ #define PORT_PHY_QCFG_RESP_LINK_SIGNAL 0x1UL
+ #define PORT_PHY_QCFG_RESP_LINK_LINK 0x2UL
u8 unused_0;
__le16 link_speed;
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_100MB (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_1GB (0xaUL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_2GB (0x14UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_2_5GB (0x19UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_10GB (0x64UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_20GB (0xc8UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_25GB (0xfaUL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_40GB (0x190UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_50GB (0x1f4UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_100GB (0x3e8UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_SPEED_10MB (0xffffUL << 0)
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_100MB 0x1UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_1GB 0xaUL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_2GB 0x14UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_2_5GB 0x19UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_10GB 0x64UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_20GB 0xc8UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_25GB 0xfaUL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_40GB 0x190UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_50GB 0x1f4UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_100GB 0x3e8UL
+ #define PORT_PHY_QCFG_RESP_LINK_SPEED_10MB 0xffffUL
u8 duplex;
- #define PORT_PHY_QCFG_RESP_DUPLEX_HALF (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_DUPLEX_FULL (0x1UL << 0)
+ #define PORT_PHY_QCFG_RESP_DUPLEX_HALF 0x0UL
+ #define PORT_PHY_QCFG_RESP_DUPLEX_FULL 0x1UL
u8 pause;
#define PORT_PHY_QCFG_RESP_PAUSE_TX 0x1UL
#define PORT_PHY_QCFG_RESP_PAUSE_RX 0x2UL
@@ -1679,39 +1727,39 @@
#define PORT_PHY_QCFG_RESP_SUPPORT_SPEEDS_10MBHD 0x1000UL
#define PORT_PHY_QCFG_RESP_SUPPORT_SPEEDS_10MB 0x2000UL
__le16 force_link_speed;
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_100MB (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_1GB (0xaUL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_2GB (0x14UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_2_5GB (0x19UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_10GB (0x64UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_20GB (0xc8UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_25GB (0xfaUL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_40GB (0x190UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_50GB (0x1f4UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_100GB (0x3e8UL << 0)
- #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_10MB (0xffffUL << 0)
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_100MB 0x1UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_1GB 0xaUL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_2GB 0x14UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_2_5GB 0x19UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_10GB 0x64UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_20GB 0xc8UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_25GB 0xfaUL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_40GB 0x190UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_50GB 0x1f4UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_100GB 0x3e8UL
+ #define PORT_PHY_QCFG_RESP_FORCE_LINK_SPEED_10MB 0xffffUL
u8 auto_mode;
- #define PORT_PHY_QCFG_RESP_AUTO_MODE_NONE (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_MODE_ALL_SPEEDS (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_MODE_ONE_SPEED (0x2UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_MODE_ONE_OR_BELOW (0x3UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_MODE_SPEED_MASK (0x4UL << 0)
+ #define PORT_PHY_QCFG_RESP_AUTO_MODE_NONE 0x0UL
+ #define PORT_PHY_QCFG_RESP_AUTO_MODE_ALL_SPEEDS 0x1UL
+ #define PORT_PHY_QCFG_RESP_AUTO_MODE_ONE_SPEED 0x2UL
+ #define PORT_PHY_QCFG_RESP_AUTO_MODE_ONE_OR_BELOW 0x3UL
+ #define PORT_PHY_QCFG_RESP_AUTO_MODE_SPEED_MASK 0x4UL
u8 auto_pause;
#define PORT_PHY_QCFG_RESP_AUTO_PAUSE_TX 0x1UL
#define PORT_PHY_QCFG_RESP_AUTO_PAUSE_RX 0x2UL
#define PORT_PHY_QCFG_RESP_AUTO_PAUSE_AUTONEG_PAUSE 0x4UL
__le16 auto_link_speed;
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_100MB (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_1GB (0xaUL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_2GB (0x14UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_2_5GB (0x19UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_10GB (0x64UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_20GB (0xc8UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_25GB (0xfaUL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_40GB (0x190UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_50GB (0x1f4UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_100GB (0x3e8UL << 0)
- #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_10MB (0xffffUL << 0)
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_100MB 0x1UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_1GB 0xaUL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_2GB 0x14UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_2_5GB 0x19UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_10GB 0x64UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_20GB 0xc8UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_25GB 0xfaUL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_40GB 0x190UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_50GB 0x1f4UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_100GB 0x3e8UL
+ #define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_10MB 0xffffUL
__le16 auto_link_speed_mask;
#define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_MASK_100MBHD 0x1UL
#define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_MASK_100MB 0x2UL
@@ -1728,46 +1776,46 @@
#define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_MASK_10MBHD 0x1000UL
#define PORT_PHY_QCFG_RESP_AUTO_LINK_SPEED_MASK_10MB 0x2000UL
u8 wirespeed;
- #define PORT_PHY_QCFG_RESP_WIRESPEED_OFF (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_WIRESPEED_ON (0x1UL << 0)
+ #define PORT_PHY_QCFG_RESP_WIRESPEED_OFF 0x0UL
+ #define PORT_PHY_QCFG_RESP_WIRESPEED_ON 0x1UL
u8 lpbk;
- #define PORT_PHY_QCFG_RESP_LPBK_NONE (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_LPBK_LOCAL (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_LPBK_REMOTE (0x2UL << 0)
+ #define PORT_PHY_QCFG_RESP_LPBK_NONE 0x0UL
+ #define PORT_PHY_QCFG_RESP_LPBK_LOCAL 0x1UL
+ #define PORT_PHY_QCFG_RESP_LPBK_REMOTE 0x2UL
u8 force_pause;
#define PORT_PHY_QCFG_RESP_FORCE_PAUSE_TX 0x1UL
#define PORT_PHY_QCFG_RESP_FORCE_PAUSE_RX 0x2UL
u8 module_status;
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NONE (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_DISABLETX (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_WARNINGMSG (0x2UL << 0)
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_PWRDOWN (0x3UL << 0)
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NOTINSERTED (0x4UL << 0)
- #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NOTAPPLICABLE (0xffUL << 0)
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NONE 0x0UL
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_DISABLETX 0x1UL
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_WARNINGMSG 0x2UL
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_PWRDOWN 0x3UL
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NOTINSERTED 0x4UL
+ #define PORT_PHY_QCFG_RESP_MODULE_STATUS_NOTAPPLICABLE 0xffUL
__le32 preemphasis;
u8 phy_maj;
u8 phy_min;
u8 phy_bld;
u8 phy_type;
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_UNKNOWN (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASECR (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR4 (0x2UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASELR (0x3UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASESR (0x4UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR2 (0x5UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKX (0x6UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR (0x7UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASET (0x8UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASETE (0x9UL << 0)
- #define PORT_PHY_QCFG_RESP_PHY_TYPE_SGMIIEXTPHY (0xaUL << 0)
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_UNKNOWN 0x0UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASECR 0x1UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR4 0x2UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASELR 0x3UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASESR 0x4UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR2 0x5UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKX 0x6UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASEKR 0x7UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASET 0x8UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_BASETE 0x9UL
+ #define PORT_PHY_QCFG_RESP_PHY_TYPE_SGMIIEXTPHY 0xaUL
u8 media_type;
- #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_UNKNOWN (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_TP (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_DAC (0x2UL << 0)
- #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_FIBRE (0x3UL << 0)
+ #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_UNKNOWN 0x0UL
+ #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_TP 0x1UL
+ #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_DAC 0x2UL
+ #define PORT_PHY_QCFG_RESP_MEDIA_TYPE_FIBRE 0x3UL
u8 xcvr_pkg_type;
- #define PORT_PHY_QCFG_RESP_XCVR_PKG_TYPE_XCVR_INTERNAL (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_XCVR_PKG_TYPE_XCVR_EXTERNAL (0x2UL << 0)
+ #define PORT_PHY_QCFG_RESP_XCVR_PKG_TYPE_XCVR_INTERNAL 0x1UL
+ #define PORT_PHY_QCFG_RESP_XCVR_PKG_TYPE_XCVR_EXTERNAL 0x2UL
u8 eee_config_phy_addr;
#define PORT_PHY_QCFG_RESP_PHY_ADDR_MASK 0x1fUL
#define PORT_PHY_QCFG_RESP_PHY_ADDR_SFT 0
@@ -1796,11 +1844,11 @@
#define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_SPEEDS_10MBHD 0x1000UL
#define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_SPEEDS_10MB 0x2000UL
u8 link_partner_adv_auto_mode;
- #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_NONE (0x0UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ALL_SPEEDS (0x1UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ONE_SPEED (0x2UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ONE_OR_BELOW (0x3UL << 0)
- #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_SPEED_MASK (0x4UL << 0)
+ #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_NONE 0x0UL
+ #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ALL_SPEEDS 0x1UL
+ #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ONE_SPEED 0x2UL
+ #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_ONE_OR_BELOW 0x3UL
+ #define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_AUTO_MODE_SPEED_MASK 0x4UL
u8 link_partner_adv_pause;
#define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_PAUSE_TX 0x1UL
#define PORT_PHY_QCFG_RESP_LINK_PARTNER_ADV_PAUSE_RX 0x2UL
@@ -1859,7 +1907,7 @@
__le64 resp_addr;
__le32 flags;
#define PORT_MAC_CFG_REQ_FLAGS_MATCH_LINK 0x1UL
- #define PORT_MAC_CFG_REQ_FLAGS_COS_ASSIGNMENT_ENABLE 0x2UL
+ #define PORT_MAC_CFG_REQ_FLAGS_VLAN_PRI2COS_ENABLE 0x2UL
#define PORT_MAC_CFG_REQ_FLAGS_TUNNEL_PRI2COS_ENABLE 0x4UL
#define PORT_MAC_CFG_REQ_FLAGS_IP_DSCP2COS_ENABLE 0x8UL
#define PORT_MAC_CFG_REQ_FLAGS_PTP_RX_TS_CAPTURE_ENABLE 0x10UL
@@ -1868,28 +1916,50 @@
#define PORT_MAC_CFG_REQ_FLAGS_PTP_TX_TS_CAPTURE_DISABLE 0x80UL
#define PORT_MAC_CFG_REQ_FLAGS_OOB_WOL_ENABLE 0x100UL
#define PORT_MAC_CFG_REQ_FLAGS_OOB_WOL_DISABLE 0x200UL
+ #define PORT_MAC_CFG_REQ_FLAGS_VLAN_PRI2COS_DISABLE 0x400UL
+ #define PORT_MAC_CFG_REQ_FLAGS_TUNNEL_PRI2COS_DISABLE 0x800UL
+ #define PORT_MAC_CFG_REQ_FLAGS_IP_DSCP2COS_DISABLE 0x1000UL
__le32 enables;
#define PORT_MAC_CFG_REQ_ENABLES_IPG 0x1UL
#define PORT_MAC_CFG_REQ_ENABLES_LPBK 0x2UL
- #define PORT_MAC_CFG_REQ_ENABLES_IVLAN_PRI2COS_MAP_PRI 0x4UL
- #define PORT_MAC_CFG_REQ_ENABLES_LCOS_MAP_PRI 0x8UL
+ #define PORT_MAC_CFG_REQ_ENABLES_VLAN_PRI2COS_MAP_PRI 0x4UL
+ #define PORT_MAC_CFG_REQ_ENABLES_RESERVED1 0x8UL
#define PORT_MAC_CFG_REQ_ENABLES_TUNNEL_PRI2COS_MAP_PRI 0x10UL
#define PORT_MAC_CFG_REQ_ENABLES_DSCP2COS_MAP_PRI 0x20UL
#define PORT_MAC_CFG_REQ_ENABLES_RX_TS_CAPTURE_PTP_MSG_TYPE 0x40UL
#define PORT_MAC_CFG_REQ_ENABLES_TX_TS_CAPTURE_PTP_MSG_TYPE 0x80UL
+ #define PORT_MAC_CFG_REQ_ENABLES_COS_FIELD_CFG 0x100UL
__le16 port_id;
u8 ipg;
u8 lpbk;
- #define PORT_MAC_CFG_REQ_LPBK_NONE (0x0UL << 0)
- #define PORT_MAC_CFG_REQ_LPBK_LOCAL (0x1UL << 0)
- #define PORT_MAC_CFG_REQ_LPBK_REMOTE (0x2UL << 0)
- u8 ivlan_pri2cos_map_pri;
- u8 lcos_map_pri;
+ #define PORT_MAC_CFG_REQ_LPBK_NONE 0x0UL
+ #define PORT_MAC_CFG_REQ_LPBK_LOCAL 0x1UL
+ #define PORT_MAC_CFG_REQ_LPBK_REMOTE 0x2UL
+ u8 vlan_pri2cos_map_pri;
+ u8 reserved1;
u8 tunnel_pri2cos_map_pri;
u8 dscp2pri_map_pri;
__le16 rx_ts_capture_ptp_msg_type;
__le16 tx_ts_capture_ptp_msg_type;
- __le32 unused_0;
+ u8 cos_field_cfg;
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_RSVD1 0x1UL
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_MASK 0x6UL
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_SFT 1
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_INNERMOST (0x0UL << 1)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_OUTER (0x1UL << 1)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_OUTERMOST (0x2UL << 1)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_UNSPECIFIED (0x3UL << 1)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_LAST PORT_MAC_CFG_REQ_COS_FIELD_CFG_VLAN_PRI_SEL_UNSPECIFIED
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_MASK 0x18UL
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_SFT 3
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_INNERMOST (0x0UL << 3)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_OUTER (0x1UL << 3)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_OUTERMOST (0x2UL << 3)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_UNSPECIFIED (0x3UL << 3)
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_LAST PORT_MAC_CFG_REQ_COS_FIELD_CFG_T_VLAN_PRI_SEL_UNSPECIFIED
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_DEFAULT_COS_MASK 0xe0UL
+ #define PORT_MAC_CFG_REQ_COS_FIELD_CFG_DEFAULT_COS_SFT 5
+ u8 unused_0[3];
};
/* Output (16 bytes) */
@@ -1902,9 +1972,9 @@
__le16 mtu;
u8 ipg;
u8 lpbk;
- #define PORT_MAC_CFG_RESP_LPBK_NONE (0x0UL << 0)
- #define PORT_MAC_CFG_RESP_LPBK_LOCAL (0x1UL << 0)
- #define PORT_MAC_CFG_RESP_LPBK_REMOTE (0x2UL << 0)
+ #define PORT_MAC_CFG_RESP_LPBK_NONE 0x0UL
+ #define PORT_MAC_CFG_RESP_LPBK_LOCAL 0x1UL
+ #define PORT_MAC_CFG_RESP_LPBK_REMOTE 0x2UL
u8 unused_0;
u8 valid;
};
@@ -2163,8 +2233,8 @@
__le64 resp_addr;
__le32 flags;
#define QUEUE_QPORTCFG_REQ_FLAGS_PATH 0x1UL
- #define QUEUE_QPORTCFG_REQ_FLAGS_PATH_TX (0x0UL << 0)
- #define QUEUE_QPORTCFG_REQ_FLAGS_PATH_RX (0x1UL << 0)
+ #define QUEUE_QPORTCFG_REQ_FLAGS_PATH_TX 0x0UL
+ #define QUEUE_QPORTCFG_REQ_FLAGS_PATH_RX 0x1UL
#define QUEUE_QPORTCFG_REQ_FLAGS_PATH_LAST QUEUE_QPORTCFG_REQ_FLAGS_PATH_RX
__le16 port_id;
__le16 unused_0;
@@ -2179,50 +2249,51 @@
u8 max_configurable_queues;
u8 max_configurable_lossless_queues;
u8 queue_cfg_allowed;
- u8 queue_buffers_cfg_allowed;
+ u8 queue_cfg_info;
+ #define QUEUE_QPORTCFG_RESP_QUEUE_CFG_INFO_ASYM_CFG 0x1UL
u8 queue_pfcenable_cfg_allowed;
u8 queue_pri2cos_cfg_allowed;
u8 queue_cos2bw_cfg_allowed;
u8 queue_id0;
u8 queue_id0_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id1;
u8 queue_id1_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID1_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id2;
u8 queue_id2_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID2_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id3;
u8 queue_id3_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID3_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id4;
u8 queue_id4_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID4_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id5;
u8 queue_id5_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID5_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id6;
u8 queue_id6_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID6_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 queue_id7;
u8 queue_id7_service_profile;
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_QPORTCFG_RESP_QUEUE_ID7_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 valid;
};
@@ -2235,19 +2306,21 @@
__le16 target_id;
__le64 resp_addr;
__le32 flags;
- #define QUEUE_CFG_REQ_FLAGS_PATH 0x1UL
- #define QUEUE_CFG_REQ_FLAGS_PATH_TX (0x0UL << 0)
- #define QUEUE_CFG_REQ_FLAGS_PATH_RX (0x1UL << 0)
- #define QUEUE_CFG_REQ_FLAGS_PATH_LAST QUEUE_CFG_REQ_FLAGS_PATH_RX
+ #define QUEUE_CFG_REQ_FLAGS_PATH_MASK 0x3UL
+ #define QUEUE_CFG_REQ_FLAGS_PATH_SFT 0
+ #define QUEUE_CFG_REQ_FLAGS_PATH_TX 0x0UL
+ #define QUEUE_CFG_REQ_FLAGS_PATH_RX 0x1UL
+ #define QUEUE_CFG_REQ_FLAGS_PATH_BIDIR 0x2UL
+ #define QUEUE_CFG_REQ_FLAGS_PATH_LAST QUEUE_CFG_REQ_FLAGS_PATH_BIDIR
__le32 enables;
#define QUEUE_CFG_REQ_ENABLES_DFLT_LEN 0x1UL
#define QUEUE_CFG_REQ_ENABLES_SERVICE_PROFILE 0x2UL
__le32 queue_id;
__le32 dflt_len;
u8 service_profile;
- #define QUEUE_CFG_REQ_SERVICE_PROFILE_LOSSY (0x0UL << 0)
- #define QUEUE_CFG_REQ_SERVICE_PROFILE_LOSSLESS (0x1UL << 0)
- #define QUEUE_CFG_REQ_SERVICE_PROFILE_UNKNOWN (0xffUL << 0)
+ #define QUEUE_CFG_REQ_SERVICE_PROFILE_LOSSY 0x0UL
+ #define QUEUE_CFG_REQ_SERVICE_PROFILE_LOSSLESS 0x1UL
+ #define QUEUE_CFG_REQ_SERVICE_PROFILE_UNKNOWN 0xffUL
u8 unused_0[7];
};
@@ -2264,50 +2337,6 @@
u8 valid;
};
-/* hwrm_queue_buffers_cfg */
-/* Input (56 bytes) */
-struct hwrm_queue_buffers_cfg_input {
- __le16 req_type;
- __le16 cmpl_ring;
- __le16 seq_id;
- __le16 target_id;
- __le64 resp_addr;
- __le32 flags;
- #define QUEUE_BUFFERS_CFG_REQ_FLAGS_PATH 0x1UL
- #define QUEUE_BUFFERS_CFG_REQ_FLAGS_PATH_TX (0x0UL << 0)
- #define QUEUE_BUFFERS_CFG_REQ_FLAGS_PATH_RX (0x1UL << 0)
- #define QUEUE_BUFFERS_CFG_REQ_FLAGS_PATH_LAST QUEUE_BUFFERS_CFG_REQ_FLAGS_PATH_RX
- __le32 enables;
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_RESERVED 0x1UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_SHARED 0x2UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_XOFF 0x4UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_XON 0x8UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_FULL 0x10UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_NOTFULL 0x20UL
- #define QUEUE_BUFFERS_CFG_REQ_ENABLES_MAX 0x40UL
- __le32 queue_id;
- __le32 reserved;
- __le32 shared;
- __le32 xoff;
- __le32 xon;
- __le32 full;
- __le32 notfull;
- __le32 max;
-};
-
-/* Output (16 bytes) */
-struct hwrm_queue_buffers_cfg_output {
- __le16 error_code;
- __le16 req_type;
- __le16 seq_id;
- __le16 resp_len;
- __le32 unused_0;
- u8 unused_1;
- u8 unused_2;
- u8 unused_3;
- u8 valid;
-};
-
/* hwrm_queue_pfcenable_cfg */
/* Input (24 bytes) */
struct hwrm_queue_pfcenable_cfg_input {
@@ -2351,12 +2380,22 @@
__le16 target_id;
__le64 resp_addr;
__le32 flags;
- #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH 0x1UL
+ #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_MASK 0x3UL
+ #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_SFT 0
#define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_TX (0x0UL << 0)
#define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_RX (0x1UL << 0)
- #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_LAST QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_RX
- #define QUEUE_PRI2COS_CFG_REQ_FLAGS_IVLAN 0x2UL
+ #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_BIDIR (0x2UL << 0)
+ #define QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_LAST QUEUE_PRI2COS_CFG_REQ_FLAGS_PATH_BIDIR
+ #define QUEUE_PRI2COS_CFG_REQ_FLAGS_IVLAN 0x4UL
__le32 enables;
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI0_COS_QUEUE_ID 0x1UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI1_COS_QUEUE_ID 0x2UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI2_COS_QUEUE_ID 0x4UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI3_COS_QUEUE_ID 0x8UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI4_COS_QUEUE_ID 0x10UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI5_COS_QUEUE_ID 0x20UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI6_COS_QUEUE_ID 0x40UL
+ #define QUEUE_PRI2COS_CFG_REQ_ENABLES_PRI7_COS_QUEUE_ID 0x80UL
u8 port_id;
u8 pri0_cos_queue_id;
u8 pri1_cos_queue_id;
@@ -2404,82 +2443,226 @@
u8 queue_id0;
u8 unused_0;
__le32 queue_id0_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id0_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id0_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID0_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id0_pri_lvl;
u8 queue_id0_bw_weight;
u8 queue_id1;
__le32 queue_id1_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id1_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id1_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID1_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id1_pri_lvl;
u8 queue_id1_bw_weight;
u8 queue_id2;
__le32 queue_id2_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id2_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id2_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID2_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id2_pri_lvl;
u8 queue_id2_bw_weight;
u8 queue_id3;
__le32 queue_id3_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id3_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id3_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID3_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id3_pri_lvl;
u8 queue_id3_bw_weight;
u8 queue_id4;
__le32 queue_id4_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id4_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id4_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID4_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id4_pri_lvl;
u8 queue_id4_bw_weight;
u8 queue_id5;
__le32 queue_id5_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id5_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id5_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID5_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id5_pri_lvl;
u8 queue_id5_bw_weight;
u8 queue_id6;
__le32 queue_id6_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id6_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id6_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID6_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id6_pri_lvl;
u8 queue_id6_bw_weight;
u8 queue_id7;
__le32 queue_id7_min_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MIN_BW_BW_VALUE_UNIT_INVALID
__le32 queue_id7_max_bw;
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_SFT 0
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_RSVD 0x10000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_LAST QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_MAX_BW_BW_VALUE_UNIT_INVALID
u8 queue_id7_tsa_assign;
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_SP (0x0UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_ETS (0x1UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_RESERVED_FIRST (0x2UL << 0)
- #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_RESERVED_LAST (0xffUL << 0)
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_SP 0x0UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_ETS 0x1UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_RESERVED_FIRST 0x2UL
+ #define QUEUE_COS2BW_CFG_REQ_QUEUE_ID7_TSA_ASSIGN_RESERVED_LAST 0xffUL
u8 queue_id7_pri_lvl;
u8 queue_id7_bw_weight;
u8 unused_1[5];
@@ -2563,6 +2746,7 @@
#define VNIC_CFG_REQ_FLAGS_BD_STALL_MODE 0x4UL
#define VNIC_CFG_REQ_FLAGS_ROCE_DUAL_VNIC_MODE 0x8UL
#define VNIC_CFG_REQ_FLAGS_ROCE_ONLY_VNIC_MODE 0x10UL
+ #define VNIC_CFG_REQ_FLAGS_RSS_DFLT_CR_MODE 0x20UL
__le32 enables;
#define VNIC_CFG_REQ_ENABLES_DFLT_RING_GRP 0x1UL
#define VNIC_CFG_REQ_ENABLES_RSS_RULE 0x2UL
@@ -2615,18 +2799,18 @@
#define VNIC_TPA_CFG_REQ_ENABLES_MIN_AGG_LEN 0x8UL
__le16 vnic_id;
__le16 max_agg_segs;
- #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_1 (0x0UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_2 (0x1UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_4 (0x2UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_8 (0x3UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_MAX (0x1fUL << 0)
+ #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_1 0x0UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_2 0x1UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_4 0x2UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_8 0x3UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGG_SEGS_MAX 0x1fUL
__le16 max_aggs;
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_1 (0x0UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_2 (0x1UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_4 (0x2UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_8 (0x3UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_16 (0x4UL << 0)
- #define VNIC_TPA_CFG_REQ_MAX_AGGS_MAX (0x7UL << 0)
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_1 0x0UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_2 0x1UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_4 0x2UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_8 0x3UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_16 0x4UL
+ #define VNIC_TPA_CFG_REQ_MAX_AGGS_MAX 0x7UL
u8 unused_0;
u8 unused_1;
__le32 max_agg_timer;
@@ -2780,15 +2964,15 @@
__le64 resp_addr;
__le32 enables;
#define RING_ALLOC_REQ_ENABLES_RESERVED1 0x1UL
- #define RING_ALLOC_REQ_ENABLES_RESERVED2 0x2UL
+ #define RING_ALLOC_REQ_ENABLES_RING_ARB_CFG 0x2UL
#define RING_ALLOC_REQ_ENABLES_RESERVED3 0x4UL
#define RING_ALLOC_REQ_ENABLES_STAT_CTX_ID_VALID 0x8UL
#define RING_ALLOC_REQ_ENABLES_RESERVED4 0x10UL
#define RING_ALLOC_REQ_ENABLES_MAX_BW_VALID 0x20UL
u8 ring_type;
- #define RING_ALLOC_REQ_RING_TYPE_CMPL (0x0UL << 0)
- #define RING_ALLOC_REQ_RING_TYPE_TX (0x1UL << 0)
- #define RING_ALLOC_REQ_RING_TYPE_RX (0x2UL << 0)
+ #define RING_ALLOC_REQ_RING_TYPE_CMPL 0x0UL
+ #define RING_ALLOC_REQ_RING_TYPE_TX 0x1UL
+ #define RING_ALLOC_REQ_RING_TYPE_RX 0x2UL
u8 unused_0;
__le16 unused_1;
__le64 page_tbl_addr;
@@ -2804,18 +2988,36 @@
u8 unused_4;
u8 unused_5;
__le32 reserved1;
- __le16 reserved2;
+ __le16 ring_arb_cfg;
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_MASK 0xfUL
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_SFT 0
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_SP (0x1UL << 0)
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_WFQ (0x2UL << 0)
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_LAST RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_WFQ
+ #define RING_ALLOC_REQ_RING_ARB_CFG_RSVD_MASK 0xf0UL
+ #define RING_ALLOC_REQ_RING_ARB_CFG_RSVD_SFT 4
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_PARAM_MASK 0xff00UL
+ #define RING_ALLOC_REQ_RING_ARB_CFG_ARB_POLICY_PARAM_SFT 8
u8 unused_6;
u8 unused_7;
__le32 reserved3;
__le32 stat_ctx_id;
__le32 reserved4;
__le32 max_bw;
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_MASK 0xfffffffUL
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_SFT 0
+ #define RING_ALLOC_REQ_MAX_BW_RSVD 0x10000000UL
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_MASK 0xe0000000UL
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_SFT 29
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_MBPS (0x0UL << 29)
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_PERCENT1_100 (0x1UL << 29)
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_INVALID (0x7UL << 29)
+ #define RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_LAST RING_ALLOC_REQ_MAX_BW_BW_VALUE_UNIT_INVALID
u8 int_mode;
- #define RING_ALLOC_REQ_INT_MODE_LEGACY (0x0UL << 0)
- #define RING_ALLOC_REQ_INT_MODE_RSVD (0x1UL << 0)
- #define RING_ALLOC_REQ_INT_MODE_MSIX (0x2UL << 0)
- #define RING_ALLOC_REQ_INT_MODE_POLL (0x3UL << 0)
+ #define RING_ALLOC_REQ_INT_MODE_LEGACY 0x0UL
+ #define RING_ALLOC_REQ_INT_MODE_RSVD 0x1UL
+ #define RING_ALLOC_REQ_INT_MODE_MSIX 0x2UL
+ #define RING_ALLOC_REQ_INT_MODE_POLL 0x3UL
u8 unused_8[3];
};
@@ -2842,9 +3044,9 @@
__le16 target_id;
__le64 resp_addr;
u8 ring_type;
- #define RING_FREE_REQ_RING_TYPE_CMPL (0x0UL << 0)
- #define RING_FREE_REQ_RING_TYPE_TX (0x1UL << 0)
- #define RING_FREE_REQ_RING_TYPE_RX (0x2UL << 0)
+ #define RING_FREE_REQ_RING_TYPE_CMPL 0x0UL
+ #define RING_FREE_REQ_RING_TYPE_TX 0x1UL
+ #define RING_FREE_REQ_RING_TYPE_RX 0x2UL
u8 unused_0;
__le16 ring_id;
__le32 unused_1;
@@ -2942,9 +3144,9 @@
__le16 target_id;
__le64 resp_addr;
u8 ring_type;
- #define RING_RESET_REQ_RING_TYPE_CMPL (0x0UL << 0)
- #define RING_RESET_REQ_RING_TYPE_TX (0x1UL << 0)
- #define RING_RESET_REQ_RING_TYPE_RX (0x2UL << 0)
+ #define RING_RESET_REQ_RING_TYPE_CMPL 0x0UL
+ #define RING_RESET_REQ_RING_TYPE_TX 0x1UL
+ #define RING_RESET_REQ_RING_TYPE_RX 0x2UL
u8 unused_0;
__le16 ring_id;
__le32 unused_1;
@@ -3068,36 +3270,36 @@
__le16 t_l2_ivlan;
__le16 t_l2_ivlan_mask;
u8 src_type;
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_NPORT (0x0UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_PF (0x1UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_VF (0x2UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_VNIC (0x3UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_KONG (0x4UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_APE (0x5UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_BONO (0x6UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_TANG (0x7UL << 0)
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_NPORT 0x0UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_PF 0x1UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_VF 0x2UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_VNIC 0x3UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_KONG 0x4UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_APE 0x5UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_BONO 0x6UL
+ #define CFA_L2_FILTER_ALLOC_REQ_SRC_TYPE_TANG 0x7UL
u8 unused_6;
__le32 src_id;
u8 tunnel_type;
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL (0x0UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE (0x2UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE (0x3UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP (0x4UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS (0x6UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT (0x7UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE (0x8UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL (0xffUL << 0)
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL 0x0UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE 0x2UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE 0x3UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP 0x4UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE 0x5UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS 0x6UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT 0x7UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE 0x8UL
+ #define CFA_L2_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL 0xffUL
u8 unused_7;
__le16 dst_id;
__le16 mirror_vnic_id;
u8 pri_hint;
- #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_NO_PREFER (0x0UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_ABOVE_FILTER (0x1UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_BELOW_FILTER (0x2UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_MAX (0x3UL << 0)
- #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_MIN (0x4UL << 0)
+ #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_NO_PREFER 0x0UL
+ #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_ABOVE_FILTER 0x1UL
+ #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_BELOW_FILTER 0x2UL
+ #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_MAX 0x3UL
+ #define CFA_L2_FILTER_ALLOC_REQ_PRI_HINT_MIN 0x4UL
u8 unused_8;
__le32 unused_9;
__le64 l2_filter_id_hint;
@@ -3246,16 +3448,16 @@
u8 l3_addr_type;
u8 t_l3_addr_type;
u8 tunnel_type;
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL (0x0UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE (0x2UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE (0x3UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP (0x4UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS (0x6UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT (0x7UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE (0x8UL << 0)
- #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL (0xffUL << 0)
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL 0x0UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE 0x2UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE 0x3UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP 0x4UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE 0x5UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS 0x6UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT 0x7UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE 0x8UL
+ #define CFA_TUNNEL_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL 0xffUL
u8 unused_0;
__le32 vni;
__le32 dst_vnic_id;
@@ -3311,14 +3513,14 @@
__le32 flags;
#define CFA_ENCAP_RECORD_ALLOC_REQ_FLAGS_LOOPBACK 0x1UL
u8 encap_type;
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_VXLAN (0x1UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_NVGRE (0x2UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_L2GRE (0x3UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_IPIP (0x4UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_GENEVE (0x5UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_MPLS (0x6UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_VLAN (0x7UL << 0)
- #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_IPGRE (0x8UL << 0)
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_VXLAN 0x1UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_NVGRE 0x2UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_L2GRE 0x3UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_IPIP 0x4UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_GENEVE 0x5UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_MPLS 0x6UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_VLAN 0x7UL
+ #define CFA_ENCAP_RECORD_ALLOC_REQ_ENCAP_TYPE_IPGRE 0x8UL
u8 unused_0;
__le16 unused_1;
__le32 encap_data[16];
@@ -3397,32 +3599,32 @@
u8 src_macaddr[6];
__be16 ethertype;
u8 ip_addr_type;
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_UNKNOWN (0x0UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_IPV4 (0x4UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_IPV6 (0x6UL << 0)
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_UNKNOWN 0x0UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_IPV4 0x4UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_ADDR_TYPE_IPV6 0x6UL
u8 ip_protocol;
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UNKNOWN (0x0UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UDP (0x6UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_TCP (0x11UL << 0)
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UNKNOWN 0x0UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_UDP 0x6UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_IP_PROTOCOL_TCP 0x11UL
__le16 dst_id;
__le16 mirror_vnic_id;
u8 tunnel_type;
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL (0x0UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE (0x2UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE (0x3UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP (0x4UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS (0x6UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT (0x7UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE (0x8UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL (0xffUL << 0)
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_NONTUNNEL 0x0UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_NVGRE 0x2UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_L2GRE 0x3UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPIP 0x4UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_GENEVE 0x5UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_MPLS 0x6UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_STT 0x7UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_IPGRE 0x8UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_TUNNEL_TYPE_ANYTUNNEL 0xffUL
u8 pri_hint;
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_NO_PREFER (0x0UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_ABOVE (0x1UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_BELOW (0x2UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_HIGHEST (0x3UL << 0)
- #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_LOWEST (0x4UL << 0)
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_NO_PREFER 0x0UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_ABOVE 0x1UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_BELOW 0x2UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_HIGHEST 0x3UL
+ #define CFA_NTUPLE_FILTER_ALLOC_REQ_PRI_HINT_LOWEST 0x4UL
__be32 src_ipaddr[4];
__be32 src_ipaddr_mask[4];
__be32 dst_ipaddr[4];
@@ -3511,8 +3713,8 @@
__le16 target_id;
__le64 resp_addr;
u8 tunnel_type;
- #define TUNNEL_DST_PORT_QUERY_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define TUNNEL_DST_PORT_QUERY_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
+ #define TUNNEL_DST_PORT_QUERY_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define TUNNEL_DST_PORT_QUERY_REQ_TUNNEL_TYPE_GENEVE 0x5UL
u8 unused_0[7];
};
@@ -3539,8 +3741,8 @@
__le16 target_id;
__le64 resp_addr;
u8 tunnel_type;
- #define TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
+ #define TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_GENEVE 0x5UL
u8 unused_0;
__be16 tunnel_dst_port_val;
__le32 unused_1;
@@ -3570,8 +3772,8 @@
__le16 target_id;
__le64 resp_addr;
u8 tunnel_type;
- #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
- #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
+ #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_VXLAN 0x1UL
+ #define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_GENEVE 0x5UL
u8 unused_0;
__le16 tunnel_dst_port_id;
__le32 unused_1;
@@ -3720,15 +3922,15 @@
__le16 target_id;
__le64 resp_addr;
u8 embedded_proc_type;
- #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_BOOT (0x0UL << 0)
- #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_MGMT (0x1UL << 0)
- #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_NETCTRL (0x2UL << 0)
- #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_ROCE (0x3UL << 0)
- #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_RSVD (0x4UL << 0)
+ #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_BOOT 0x0UL
+ #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_MGMT 0x1UL
+ #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_NETCTRL 0x2UL
+ #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_ROCE 0x3UL
+ #define FW_RESET_REQ_EMBEDDED_PROC_TYPE_RSVD 0x4UL
u8 selfrst_status;
- #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTNONE (0x0UL << 0)
- #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTASAP (0x1UL << 0)
- #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTPCIERST (0x2UL << 0)
+ #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTNONE 0x0UL
+ #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTASAP 0x1UL
+ #define FW_RESET_REQ_SELFRST_STATUS_SELFRSTPCIERST 0x2UL
__le16 unused_0[3];
};
@@ -3739,9 +3941,9 @@
__le16 seq_id;
__le16 resp_len;
u8 selfrst_status;
- #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTNONE (0x0UL << 0)
- #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTASAP (0x1UL << 0)
- #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTPCIERST (0x2UL << 0)
+ #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTNONE 0x0UL
+ #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTASAP 0x1UL
+ #define FW_RESET_RESP_SELFRST_STATUS_SELFRSTPCIERST 0x2UL
u8 unused_0;
__le16 unused_1;
u8 unused_2;
@@ -3759,11 +3961,11 @@
__le16 target_id;
__le64 resp_addr;
u8 embedded_proc_type;
- #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_BOOT (0x0UL << 0)
- #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_MGMT (0x1UL << 0)
- #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_NETCTRL (0x2UL << 0)
- #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_ROCE (0x3UL << 0)
- #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_RSVD (0x4UL << 0)
+ #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_BOOT 0x0UL
+ #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_MGMT 0x1UL
+ #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_NETCTRL 0x2UL
+ #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_ROCE 0x3UL
+ #define FW_QSTATUS_REQ_EMBEDDED_PROC_TYPE_RSVD 0x4UL
u8 unused_0[7];
};
@@ -3774,9 +3976,9 @@
__le16 seq_id;
__le16 resp_len;
u8 selfrst_status;
- #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTNONE (0x0UL << 0)
- #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTASAP (0x1UL << 0)
- #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTPCIERST (0x2UL << 0)
+ #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTNONE 0x0UL
+ #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTASAP 0x1UL
+ #define FW_QSTATUS_RESP_SELFRST_STATUS_SELFRSTPCIERST 0x2UL
u8 unused_0;
__le16 unused_1;
u8 unused_2;
@@ -3785,6 +3987,42 @@
u8 valid;
};
+/* hwrm_fw_set_time */
+/* Input (32 bytes) */
+struct hwrm_fw_set_time_input {
+ __le16 req_type;
+ __le16 cmpl_ring;
+ __le16 seq_id;
+ __le16 target_id;
+ __le64 resp_addr;
+ __le16 year;
+ #define FW_SET_TIME_REQ_YEAR_UNKNOWN 0x0UL
+ u8 month;
+ u8 day;
+ u8 hour;
+ u8 minute;
+ u8 second;
+ u8 unused_0;
+ __le16 millisecond;
+ __le16 zone;
+ #define FW_SET_TIME_REQ_ZONE_UTC 0x0UL
+ #define FW_SET_TIME_REQ_ZONE_UNKNOWN 0xffffUL
+ __le32 unused_1;
+};
+
+/* Output (16 bytes) */
+struct hwrm_fw_set_time_output {
+ __le16 error_code;
+ __le16 req_type;
+ __le16 seq_id;
+ __le16 resp_len;
+ __le32 unused_0;
+ u8 unused_1;
+ u8 unused_2;
+ u8 unused_3;
+ u8 valid;
+};
+
/* hwrm_exec_fwd_resp */
/* Input (128 bytes) */
struct hwrm_exec_fwd_resp_input {
@@ -3921,32 +4159,6 @@
u8 valid;
};
-/* hwrm_nvm_raw_write_blk */
-/* Input (32 bytes) */
-struct hwrm_nvm_raw_write_blk_input {
- __le16 req_type;
- __le16 cmpl_ring;
- __le16 seq_id;
- __le16 target_id;
- __le64 resp_addr;
- __le64 host_src_addr;
- __le32 dest_addr;
- __le32 len;
-};
-
-/* Output (16 bytes) */
-struct hwrm_nvm_raw_write_blk_output {
- __le16 error_code;
- __le16 req_type;
- __le16 seq_id;
- __le16 resp_len;
- __le32 unused_0;
- u8 unused_1;
- u8 unused_2;
- u8 unused_3;
- u8 valid;
-};
-
/* hwrm_nvm_read */
/* Input (40 bytes) */
struct hwrm_nvm_read_input {
@@ -4132,9 +4344,9 @@
u8 opt_ordinal;
#define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_MASK 0x3UL
#define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_SFT 0
- #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_EQ (0x0UL << 0)
- #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_GE (0x1UL << 0)
- #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_GT (0x2UL << 0)
+ #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_EQ 0x0UL
+ #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_GE 0x1UL
+ #define NVM_FIND_DIR_ENTRY_REQ_OPT_ORDINAL_GT 0x2UL
u8 unused_1[3];
};
@@ -4266,4 +4478,41 @@
u8 valid;
};
+/* hwrm_nvm_install_update */
+/* Input (24 bytes) */
+struct hwrm_nvm_install_update_input {
+ __le16 req_type;
+ __le16 cmpl_ring;
+ __le16 seq_id;
+ __le16 target_id;
+ __le64 resp_addr;
+ __le32 install_type;
+ #define NVM_INSTALL_UPDATE_REQ_INSTALL_TYPE_NORMAL 0x0UL
+ #define NVM_INSTALL_UPDATE_REQ_INSTALL_TYPE_ALL 0xffffffffUL
+ __le32 unused_0;
+};
+
+/* Output (24 bytes) */
+struct hwrm_nvm_install_update_output {
+ __le16 error_code;
+ __le16 req_type;
+ __le16 seq_id;
+ __le16 resp_len;
+ __le64 installed_items;
+ u8 result;
+ #define NVM_INSTALL_UPDATE_RESP_RESULT_SUCCESS 0x0UL
+ u8 problem_item;
+ #define NVM_INSTALL_UPDATE_RESP_PROBLEM_ITEM_NONE 0x0UL
+ #define NVM_INSTALL_UPDATE_RESP_PROBLEM_ITEM_PACKAGE 0xffUL
+ u8 reset_required;
+ #define NVM_INSTALL_UPDATE_RESP_RESET_REQUIRED_NONE 0x0UL
+ #define NVM_INSTALL_UPDATE_RESP_RESET_REQUIRED_PCI 0x1UL
+ #define NVM_INSTALL_UPDATE_RESP_RESET_REQUIRED_POWER 0x2UL
+ u8 unused_0;
+ u8 unused_1;
+ u8 unused_2;
+ u8 unused_3;
+ u8 valid;
+};
+
#endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 50d2007..ec6cd18 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -19,6 +19,45 @@
#include "bnxt_ethtool.h"
#ifdef CONFIG_BNXT_SRIOV
+static int bnxt_hwrm_fwd_async_event_cmpl(struct bnxt *bp,
+ struct bnxt_vf_info *vf, u16 event_id)
+{
+ struct hwrm_fwd_async_event_cmpl_output *resp = bp->hwrm_cmd_resp_addr;
+ struct hwrm_fwd_async_event_cmpl_input req = {0};
+ struct hwrm_async_event_cmpl *async_cmpl;
+ int rc = 0;
+
+ bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FWD_ASYNC_EVENT_CMPL, -1, -1);
+ if (vf)
+ req.encap_async_event_target_id = cpu_to_le16(vf->fw_fid);
+ else
+ /* broadcast this async event to all VFs */
+ req.encap_async_event_target_id = cpu_to_le16(0xffff);
+ async_cmpl = (struct hwrm_async_event_cmpl *)req.encap_async_event_cmpl;
+ async_cmpl->type =
+ cpu_to_le16(HWRM_ASYNC_EVENT_CMPL_TYPE_HWRM_ASYNC_EVENT);
+ async_cmpl->event_id = cpu_to_le16(event_id);
+
+ mutex_lock(&bp->hwrm_cmd_lock);
+ rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+
+ if (rc) {
+ netdev_err(bp->dev, "hwrm_fwd_async_event_cmpl failed. rc:%d\n",
+ rc);
+ goto fwd_async_event_cmpl_exit;
+ }
+
+ if (resp->error_code) {
+ netdev_err(bp->dev, "hwrm_fwd_async_event_cmpl error %d\n",
+ resp->error_code);
+ rc = -1;
+ }
+
+fwd_async_event_cmpl_exit:
+ mutex_unlock(&bp->hwrm_cmd_lock);
+ return rc;
+}
+
static int bnxt_vf_ndo_prep(struct bnxt *bp, int vf_id)
{
if (!test_bit(BNXT_STATE_OPEN, &bp->state)) {
@@ -135,7 +174,8 @@
return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
}
-int bnxt_set_vf_vlan(struct net_device *dev, int vf_id, u16 vlan_id, u8 qos)
+int bnxt_set_vf_vlan(struct net_device *dev, int vf_id, u16 vlan_id, u8 qos,
+ __be16 vlan_proto)
{
struct hwrm_func_cfg_input req = {0};
struct bnxt *bp = netdev_priv(dev);
@@ -146,6 +186,9 @@
if (bp->hwrm_spec_code < 0x10201)
return -ENOTSUPP;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
rc = bnxt_vf_ndo_prep(bp, vf_id);
if (rc)
return rc;
@@ -243,8 +286,9 @@
rc = -EINVAL;
break;
}
- /* CHIMP TODO: send msg to VF to update new link state */
-
+ if (vf->flags & (BNXT_VF_LINK_UP | BNXT_VF_LINK_FORCED))
+ rc = bnxt_hwrm_fwd_async_event_cmpl(bp, vf,
+ HWRM_ASYNC_EVENT_CMPL_EVENT_ID_LINK_STATUS_CHANGE);
return rc;
}
@@ -525,46 +569,6 @@
return rc;
}
-static int bnxt_hwrm_fwd_async_event_cmpl(struct bnxt *bp,
- struct bnxt_vf_info *vf,
- u16 event_id)
-{
- int rc = 0;
- struct hwrm_fwd_async_event_cmpl_input req = {0};
- struct hwrm_fwd_async_event_cmpl_output *resp = bp->hwrm_cmd_resp_addr;
- struct hwrm_async_event_cmpl *async_cmpl;
-
- bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FWD_ASYNC_EVENT_CMPL, -1, -1);
- if (vf)
- req.encap_async_event_target_id = cpu_to_le16(vf->fw_fid);
- else
- /* broadcast this async event to all VFs */
- req.encap_async_event_target_id = cpu_to_le16(0xffff);
- async_cmpl = (struct hwrm_async_event_cmpl *)req.encap_async_event_cmpl;
- async_cmpl->type =
- cpu_to_le16(HWRM_ASYNC_EVENT_CMPL_TYPE_HWRM_ASYNC_EVENT);
- async_cmpl->event_id = cpu_to_le16(event_id);
-
- mutex_lock(&bp->hwrm_cmd_lock);
- rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
-
- if (rc) {
- netdev_err(bp->dev, "hwrm_fwd_async_event_cmpl failed. rc:%d\n",
- rc);
- goto fwd_async_event_cmpl_exit;
- }
-
- if (resp->error_code) {
- netdev_err(bp->dev, "hwrm_fwd_async_event_cmpl error %d\n",
- resp->error_code);
- rc = -1;
- }
-
-fwd_async_event_cmpl_exit:
- mutex_unlock(&bp->hwrm_cmd_lock);
- return rc;
-}
-
void bnxt_sriov_disable(struct bnxt *bp)
{
u16 num_vfs = pci_num_vf(bp->pdev);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
index 0392670..1ab72e4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
@@ -12,7 +12,7 @@
int bnxt_get_vf_config(struct net_device *, int, struct ifla_vf_info *);
int bnxt_set_vf_mac(struct net_device *, int, u8 *);
-int bnxt_set_vf_vlan(struct net_device *, int, u16, u8);
+int bnxt_set_vf_vlan(struct net_device *, int, u16, u8, __be16);
int bnxt_set_vf_bw(struct net_device *, int, int, int);
int bnxt_set_vf_link_state(struct net_device *, int, int);
int bnxt_set_vf_spoofchk(struct net_device *, int, bool);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 46f9043..4464bc5 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -450,28 +450,32 @@
genet_dma_ring_regs[r]);
}
-static int bcmgenet_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int bcmgenet_get_link_ksettings(struct net_device *dev,
+ struct ethtool_link_ksettings *cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
if (!netif_running(dev))
return -EINVAL;
- if (!dev->phydev)
+ if (!priv->phydev)
return -ENODEV;
- return phy_ethtool_gset(dev->phydev, cmd);
+ return phy_ethtool_ksettings_get(priv->phydev, cmd);
}
-static int bcmgenet_set_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int bcmgenet_set_link_ksettings(struct net_device *dev,
+ const struct ethtool_link_ksettings *cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
if (!netif_running(dev))
return -EINVAL;
- if (!dev->phydev)
+ if (!priv->phydev)
return -ENODEV;
- return phy_ethtool_sset(dev->phydev, cmd);
+ return phy_ethtool_ksettings_set(priv->phydev, cmd);
}
static int bcmgenet_set_rx_csum(struct net_device *dev,
@@ -937,7 +941,7 @@
e->eee_active = p->eee_active;
e->tx_lpi_timer = bcmgenet_umac_readl(priv, UMAC_EEE_LPI_TIMER);
- return phy_ethtool_get_eee(dev->phydev, e);
+ return phy_ethtool_get_eee(priv->phydev, e);
}
static int bcmgenet_set_eee(struct net_device *dev, struct ethtool_eee *e)
@@ -954,7 +958,7 @@
if (!p->eee_enabled) {
bcmgenet_eee_enable_set(dev, false);
} else {
- ret = phy_init_eee(dev->phydev, 0);
+ ret = phy_init_eee(priv->phydev, 0);
if (ret) {
netif_err(priv, hw, dev, "EEE initialization failed\n");
return ret;
@@ -964,12 +968,14 @@
bcmgenet_eee_enable_set(dev, true);
}
- return phy_ethtool_set_eee(dev->phydev, e);
+ return phy_ethtool_set_eee(priv->phydev, e);
}
static int bcmgenet_nway_reset(struct net_device *dev)
{
- return genphy_restart_aneg(dev->phydev);
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+
+ return genphy_restart_aneg(priv->phydev);
}
/* standard ethtool support functions. */
@@ -977,8 +983,6 @@
.get_strings = bcmgenet_get_strings,
.get_sset_count = bcmgenet_get_sset_count,
.get_ethtool_stats = bcmgenet_get_ethtool_stats,
- .get_settings = bcmgenet_get_settings,
- .set_settings = bcmgenet_set_settings,
.get_drvinfo = bcmgenet_get_drvinfo,
.get_link = ethtool_op_get_link,
.get_msglevel = bcmgenet_get_msglevel,
@@ -990,19 +994,20 @@
.nway_reset = bcmgenet_nway_reset,
.get_coalesce = bcmgenet_get_coalesce,
.set_coalesce = bcmgenet_set_coalesce,
+ .get_link_ksettings = bcmgenet_get_link_ksettings,
+ .set_link_ksettings = bcmgenet_set_link_ksettings,
};
/* Power down the unimac, based on mode. */
static int bcmgenet_power_down(struct bcmgenet_priv *priv,
enum bcmgenet_power_mode mode)
{
- struct net_device *ndev = priv->dev;
int ret = 0;
u32 reg;
switch (mode) {
case GENET_POWER_CABLE_SENSE:
- phy_detach(ndev->phydev);
+ phy_detach(priv->phydev);
break;
case GENET_POWER_WOL_MAGIC:
@@ -1063,6 +1068,7 @@
/* ioctl handle special commands that are not present in ethtool. */
static int bcmgenet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
int val = 0;
if (!netif_running(dev))
@@ -1072,10 +1078,10 @@
case SIOCGMIIPHY:
case SIOCGMIIREG:
case SIOCSMIIREG:
- if (!dev->phydev)
+ if (!priv->phydev)
val = -ENODEV;
else
- val = phy_mii_ioctl(dev->phydev, rq, cmd);
+ val = phy_mii_ioctl(priv->phydev, rq, cmd);
break;
default:
@@ -2458,7 +2464,6 @@
{
struct bcmgenet_priv *priv = container_of(
work, struct bcmgenet_priv, bcmgenet_irq_work);
- struct net_device *ndev = priv->dev;
netif_dbg(priv, intr, priv->dev, "%s\n", __func__);
@@ -2471,7 +2476,7 @@
/* Link UP/DOWN event */
if (priv->irq0_stat & UMAC_IRQ_LINK_EVENT) {
- phy_mac_interrupt(ndev->phydev,
+ phy_mac_interrupt(priv->phydev,
!!(priv->irq0_stat & UMAC_IRQ_LINK_UP));
priv->irq0_stat &= ~UMAC_IRQ_LINK_EVENT;
}
@@ -2664,128 +2669,6 @@
bcmgenet_tdma_writel(priv, reg, DMA_CTRL);
}
-static bool bcmgenet_hfb_is_filter_enabled(struct bcmgenet_priv *priv,
- u32 f_index)
-{
- u32 offset;
- u32 reg;
-
- offset = HFB_FLT_ENABLE_V3PLUS + (f_index < 32) * sizeof(u32);
- reg = bcmgenet_hfb_reg_readl(priv, offset);
- return !!(reg & (1 << (f_index % 32)));
-}
-
-static void bcmgenet_hfb_enable_filter(struct bcmgenet_priv *priv, u32 f_index)
-{
- u32 offset;
- u32 reg;
-
- offset = HFB_FLT_ENABLE_V3PLUS + (f_index < 32) * sizeof(u32);
- reg = bcmgenet_hfb_reg_readl(priv, offset);
- reg |= (1 << (f_index % 32));
- bcmgenet_hfb_reg_writel(priv, reg, offset);
-}
-
-static void bcmgenet_hfb_set_filter_rx_queue_mapping(struct bcmgenet_priv *priv,
- u32 f_index, u32 rx_queue)
-{
- u32 offset;
- u32 reg;
-
- offset = f_index / 8;
- reg = bcmgenet_rdma_readl(priv, DMA_INDEX2RING_0 + offset);
- reg &= ~(0xF << (4 * (f_index % 8)));
- reg |= ((rx_queue & 0xF) << (4 * (f_index % 8)));
- bcmgenet_rdma_writel(priv, reg, DMA_INDEX2RING_0 + offset);
-}
-
-static void bcmgenet_hfb_set_filter_length(struct bcmgenet_priv *priv,
- u32 f_index, u32 f_length)
-{
- u32 offset;
- u32 reg;
-
- offset = HFB_FLT_LEN_V3PLUS +
- ((priv->hw_params->hfb_filter_cnt - 1 - f_index) / 4) *
- sizeof(u32);
- reg = bcmgenet_hfb_reg_readl(priv, offset);
- reg &= ~(0xFF << (8 * (f_index % 4)));
- reg |= ((f_length & 0xFF) << (8 * (f_index % 4)));
- bcmgenet_hfb_reg_writel(priv, reg, offset);
-}
-
-static int bcmgenet_hfb_find_unused_filter(struct bcmgenet_priv *priv)
-{
- u32 f_index;
-
- for (f_index = 0; f_index < priv->hw_params->hfb_filter_cnt; f_index++)
- if (!bcmgenet_hfb_is_filter_enabled(priv, f_index))
- return f_index;
-
- return -ENOMEM;
-}
-
-/* bcmgenet_hfb_add_filter
- *
- * Add new filter to Hardware Filter Block to match and direct Rx traffic to
- * desired Rx queue.
- *
- * f_data is an array of unsigned 32-bit integers where each 32-bit integer
- * provides filter data for 2 bytes (4 nibbles) of Rx frame:
- *
- * bits 31:20 - unused
- * bit 19 - nibble 0 match enable
- * bit 18 - nibble 1 match enable
- * bit 17 - nibble 2 match enable
- * bit 16 - nibble 3 match enable
- * bits 15:12 - nibble 0 data
- * bits 11:8 - nibble 1 data
- * bits 7:4 - nibble 2 data
- * bits 3:0 - nibble 3 data
- *
- * Example:
- * In order to match:
- * - Ethernet frame type = 0x0800 (IP)
- * - IP version field = 4
- * - IP protocol field = 0x11 (UDP)
- *
- * The following filter is needed:
- * u32 hfb_filter_ipv4_udp[] = {
- * Rx frame offset 0x00: 0x00000000, 0x00000000, 0x00000000, 0x00000000,
- * Rx frame offset 0x08: 0x00000000, 0x00000000, 0x000F0800, 0x00084000,
- * Rx frame offset 0x10: 0x00000000, 0x00000000, 0x00000000, 0x00030011,
- * };
- *
- * To add the filter to HFB and direct the traffic to Rx queue 0, call:
- * bcmgenet_hfb_add_filter(priv, hfb_filter_ipv4_udp,
- * ARRAY_SIZE(hfb_filter_ipv4_udp), 0);
- */
-int bcmgenet_hfb_add_filter(struct bcmgenet_priv *priv, u32 *f_data,
- u32 f_length, u32 rx_queue)
-{
- int f_index;
- u32 i;
-
- f_index = bcmgenet_hfb_find_unused_filter(priv);
- if (f_index < 0)
- return -ENOMEM;
-
- if (f_length > priv->hw_params->hfb_filter_size)
- return -EINVAL;
-
- for (i = 0; i < f_length; i++)
- bcmgenet_hfb_writel(priv, f_data[i],
- (f_index * priv->hw_params->hfb_filter_size + i) *
- sizeof(u32));
-
- bcmgenet_hfb_set_filter_length(priv, f_index, 2 * f_length);
- bcmgenet_hfb_set_filter_rx_queue_mapping(priv, f_index, rx_queue);
- bcmgenet_hfb_enable_filter(priv, f_index);
- bcmgenet_hfb_reg_writel(priv, 0x1, HFB_CTRL);
-
- return 0;
-}
-
/* bcmgenet_hfb_clear
*
* Clear Hardware Filter Block and disable all filtering.
@@ -2833,7 +2716,7 @@
/* Monitor link interrupts now */
bcmgenet_link_intr_enable(priv);
- phy_start(dev->phydev);
+ phy_start(priv->phydev);
}
static int bcmgenet_open(struct net_device *dev)
@@ -2932,7 +2815,7 @@
struct bcmgenet_priv *priv = netdev_priv(dev);
netif_tx_stop_all_queues(dev);
- phy_stop(dev->phydev);
+ phy_stop(priv->phydev);
bcmgenet_intr_disable(priv);
bcmgenet_disable_rx_napi(priv);
bcmgenet_disable_tx_napi(priv);
@@ -2958,7 +2841,7 @@
bcmgenet_netif_stop(dev);
/* Really kill the PHY state machine and disconnect from it */
- phy_disconnect(dev->phydev);
+ phy_disconnect(priv->phydev);
/* Disable MAC receive */
umac_enable_set(priv, CMD_RX_EN, false);
@@ -3517,7 +3400,7 @@
bcmgenet_netif_stop(dev);
- phy_suspend(dev->phydev);
+ phy_suspend(priv->phydev);
netif_device_detach(dev);
@@ -3581,7 +3464,7 @@
if (priv->wolopts)
clk_disable_unprepare(priv->clk_wol);
- phy_init_hw(dev->phydev);
+ phy_init_hw(priv->phydev);
/* Speed settings must be restored */
bcmgenet_mii_config(priv->dev);
@@ -3614,7 +3497,7 @@
netif_device_attach(dev);
- phy_resume(dev->phydev);
+ phy_resume(priv->phydev);
if (priv->eee.eee_enabled)
bcmgenet_eee_enable_set(dev, true);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
index 0f0868c..1e2dc34 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
@@ -597,6 +597,7 @@
/* MDIO bus variables */
wait_queue_head_t wq;
+ struct phy_device *phydev;
bool internal_phy;
struct device_node *phy_dn;
struct device_node *mdio_dn;
diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
index e907acd..457c3bc 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
@@ -86,7 +86,7 @@
void bcmgenet_mii_setup(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
- struct phy_device *phydev = dev->phydev;
+ struct phy_device *phydev = priv->phydev;
u32 reg, cmd_bits = 0;
bool status_changed = false;
@@ -183,9 +183,9 @@
if (GENET_IS_V4(priv))
return;
- if (dev->phydev) {
- phy_init_hw(dev->phydev);
- phy_start_aneg(dev->phydev);
+ if (priv->phydev) {
+ phy_init_hw(priv->phydev);
+ phy_start_aneg(priv->phydev);
}
}
@@ -236,7 +236,6 @@
static void bcmgenet_moca_phy_setup(struct bcmgenet_priv *priv)
{
- struct net_device *ndev = priv->dev;
u32 reg;
/* Speed settings are set in bcmgenet_mii_setup() */
@@ -245,14 +244,14 @@
bcmgenet_sys_writel(priv, reg, SYS_PORT_CTRL);
if (priv->hw_params->flags & GENET_HAS_MOCA_LINK_DET)
- fixed_phy_set_link_update(ndev->phydev,
+ fixed_phy_set_link_update(priv->phydev,
bcmgenet_fixed_phy_link_update);
}
int bcmgenet_mii_config(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
- struct phy_device *phydev = dev->phydev;
+ struct phy_device *phydev = priv->phydev;
struct device *kdev = &priv->pdev->dev;
const char *phy_name = NULL;
u32 id_mode_dis = 0;
@@ -303,7 +302,7 @@
* capabilities, use that knowledge to also configure the
* Reverse MII interface correctly.
*/
- if ((phydev->supported & PHY_BASIC_FEATURES) ==
+ if ((priv->phydev->supported & PHY_BASIC_FEATURES) ==
PHY_BASIC_FEATURES)
port_ctrl = PORT_MODE_EXT_RVMII_25;
else
@@ -372,7 +371,7 @@
return -ENODEV;
}
} else {
- phydev = dev->phydev;
+ phydev = priv->phydev;
phydev->dev_flags = phy_flags;
ret = phy_connect_direct(dev, phydev, bcmgenet_mii_setup,
@@ -383,6 +382,8 @@
}
}
+ priv->phydev = phydev;
+
/* Configure port multiplexer based on what the probed PHY device since
* reading the 'max-speed' property determines the maximum supported
* PHY speed which is needed for bcmgenet_mii_config() to configure
@@ -390,7 +391,7 @@
*/
ret = bcmgenet_mii_config(dev);
if (ret) {
- phy_disconnect(phydev);
+ phy_disconnect(priv->phydev);
return ret;
}
@@ -400,7 +401,7 @@
* Ethernet MAC ISRs
*/
if (priv->internal_phy)
- phydev->irq = PHY_IGNORE_INTERRUPT;
+ priv->phydev->irq = PHY_IGNORE_INTERRUPT;
return 0;
}
@@ -605,6 +606,7 @@
}
+ priv->phydev = phydev;
priv->phy_interface = pd->phy_interface;
return 0;
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index a2551bc..2726f03 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -12079,95 +12079,107 @@
return ret;
}
-static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int tg3_get_link_ksettings(struct net_device *dev,
+ struct ethtool_link_ksettings *cmd)
{
struct tg3 *tp = netdev_priv(dev);
+ u32 supported, advertising;
if (tg3_flag(tp, USE_PHYLIB)) {
struct phy_device *phydev;
if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
return -EAGAIN;
phydev = mdiobus_get_phy(tp->mdio_bus, tp->phy_addr);
- return phy_ethtool_gset(phydev, cmd);
+ return phy_ethtool_ksettings_get(phydev, cmd);
}
- cmd->supported = (SUPPORTED_Autoneg);
+ supported = (SUPPORTED_Autoneg);
if (!(tp->phy_flags & TG3_PHYFLG_10_100_ONLY))
- cmd->supported |= (SUPPORTED_1000baseT_Half |
- SUPPORTED_1000baseT_Full);
+ supported |= (SUPPORTED_1000baseT_Half |
+ SUPPORTED_1000baseT_Full);
if (!(tp->phy_flags & TG3_PHYFLG_ANY_SERDES)) {
- cmd->supported |= (SUPPORTED_100baseT_Half |
- SUPPORTED_100baseT_Full |
- SUPPORTED_10baseT_Half |
- SUPPORTED_10baseT_Full |
- SUPPORTED_TP);
- cmd->port = PORT_TP;
+ supported |= (SUPPORTED_100baseT_Half |
+ SUPPORTED_100baseT_Full |
+ SUPPORTED_10baseT_Half |
+ SUPPORTED_10baseT_Full |
+ SUPPORTED_TP);
+ cmd->base.port = PORT_TP;
} else {
- cmd->supported |= SUPPORTED_FIBRE;
- cmd->port = PORT_FIBRE;
+ supported |= SUPPORTED_FIBRE;
+ cmd->base.port = PORT_FIBRE;
}
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
+ supported);
- cmd->advertising = tp->link_config.advertising;
+ advertising = tp->link_config.advertising;
if (tg3_flag(tp, PAUSE_AUTONEG)) {
if (tp->link_config.flowctrl & FLOW_CTRL_RX) {
if (tp->link_config.flowctrl & FLOW_CTRL_TX) {
- cmd->advertising |= ADVERTISED_Pause;
+ advertising |= ADVERTISED_Pause;
} else {
- cmd->advertising |= ADVERTISED_Pause |
- ADVERTISED_Asym_Pause;
+ advertising |= ADVERTISED_Pause |
+ ADVERTISED_Asym_Pause;
}
} else if (tp->link_config.flowctrl & FLOW_CTRL_TX) {
- cmd->advertising |= ADVERTISED_Asym_Pause;
+ advertising |= ADVERTISED_Asym_Pause;
}
}
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
+ advertising);
+
if (netif_running(dev) && tp->link_up) {
- ethtool_cmd_speed_set(cmd, tp->link_config.active_speed);
- cmd->duplex = tp->link_config.active_duplex;
- cmd->lp_advertising = tp->link_config.rmt_adv;
+ cmd->base.speed = tp->link_config.active_speed;
+ cmd->base.duplex = tp->link_config.active_duplex;
+ ethtool_convert_legacy_u32_to_link_mode(
+ cmd->link_modes.lp_advertising,
+ tp->link_config.rmt_adv);
+
if (!(tp->phy_flags & TG3_PHYFLG_ANY_SERDES)) {
if (tp->phy_flags & TG3_PHYFLG_MDIX_STATE)
- cmd->eth_tp_mdix = ETH_TP_MDI_X;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI_X;
else
- cmd->eth_tp_mdix = ETH_TP_MDI;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI;
}
} else {
- ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
- cmd->duplex = DUPLEX_UNKNOWN;
- cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
+ cmd->base.speed = SPEED_UNKNOWN;
+ cmd->base.duplex = DUPLEX_UNKNOWN;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI_INVALID;
}
- cmd->phy_address = tp->phy_addr;
- cmd->transceiver = XCVR_INTERNAL;
- cmd->autoneg = tp->link_config.autoneg;
- cmd->maxtxpkt = 0;
- cmd->maxrxpkt = 0;
+ cmd->base.phy_address = tp->phy_addr;
+ cmd->base.autoneg = tp->link_config.autoneg;
return 0;
}
-static int tg3_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
+static int tg3_set_link_ksettings(struct net_device *dev,
+ const struct ethtool_link_ksettings *cmd)
{
struct tg3 *tp = netdev_priv(dev);
- u32 speed = ethtool_cmd_speed(cmd);
+ u32 speed = cmd->base.speed;
+ u32 advertising;
if (tg3_flag(tp, USE_PHYLIB)) {
struct phy_device *phydev;
if (!(tp->phy_flags & TG3_PHYFLG_IS_CONNECTED))
return -EAGAIN;
phydev = mdiobus_get_phy(tp->mdio_bus, tp->phy_addr);
- return phy_ethtool_sset(phydev, cmd);
+ return phy_ethtool_ksettings_set(phydev, cmd);
}
- if (cmd->autoneg != AUTONEG_ENABLE &&
- cmd->autoneg != AUTONEG_DISABLE)
+ if (cmd->base.autoneg != AUTONEG_ENABLE &&
+ cmd->base.autoneg != AUTONEG_DISABLE)
return -EINVAL;
- if (cmd->autoneg == AUTONEG_DISABLE &&
- cmd->duplex != DUPLEX_FULL &&
- cmd->duplex != DUPLEX_HALF)
+ if (cmd->base.autoneg == AUTONEG_DISABLE &&
+ cmd->base.duplex != DUPLEX_FULL &&
+ cmd->base.duplex != DUPLEX_HALF)
return -EINVAL;
- if (cmd->autoneg == AUTONEG_ENABLE) {
+ ethtool_convert_link_mode_to_legacy_u32(&advertising,
+ cmd->link_modes.advertising);
+
+ if (cmd->base.autoneg == AUTONEG_ENABLE) {
u32 mask = ADVERTISED_Autoneg |
ADVERTISED_Pause |
ADVERTISED_Asym_Pause;
@@ -12185,7 +12197,7 @@
else
mask |= ADVERTISED_FIBRE;
- if (cmd->advertising & ~mask)
+ if (advertising & ~mask)
return -EINVAL;
mask &= (ADVERTISED_1000baseT_Half |
@@ -12195,13 +12207,13 @@
ADVERTISED_10baseT_Half |
ADVERTISED_10baseT_Full);
- cmd->advertising &= mask;
+ advertising &= mask;
} else {
if (tp->phy_flags & TG3_PHYFLG_ANY_SERDES) {
if (speed != SPEED_1000)
return -EINVAL;
- if (cmd->duplex != DUPLEX_FULL)
+ if (cmd->base.duplex != DUPLEX_FULL)
return -EINVAL;
} else {
if (speed != SPEED_100 &&
@@ -12212,16 +12224,16 @@
tg3_full_lock(tp, 0);
- tp->link_config.autoneg = cmd->autoneg;
- if (cmd->autoneg == AUTONEG_ENABLE) {
- tp->link_config.advertising = (cmd->advertising |
+ tp->link_config.autoneg = cmd->base.autoneg;
+ if (cmd->base.autoneg == AUTONEG_ENABLE) {
+ tp->link_config.advertising = (advertising |
ADVERTISED_Autoneg);
tp->link_config.speed = SPEED_UNKNOWN;
tp->link_config.duplex = DUPLEX_UNKNOWN;
} else {
tp->link_config.advertising = 0;
tp->link_config.speed = speed;
- tp->link_config.duplex = cmd->duplex;
+ tp->link_config.duplex = cmd->base.duplex;
}
tp->phy_flags |= TG3_PHYFLG_USER_CONFIGURED;
@@ -14094,8 +14106,6 @@
}
static const struct ethtool_ops tg3_ethtool_ops = {
- .get_settings = tg3_get_settings,
- .set_settings = tg3_set_settings,
.get_drvinfo = tg3_get_drvinfo,
.get_regs_len = tg3_get_regs_len,
.get_regs = tg3_get_regs,
@@ -14128,6 +14138,8 @@
.get_ts_info = tg3_get_ts_info,
.get_eee = tg3_get_eee,
.set_eee = tg3_set_eee,
+ .get_link_ksettings = tg3_get_link_ksettings,
+ .set_link_ksettings = tg3_set_link_ksettings,
};
static struct rtnl_link_stats64 *tg3_get_stats64(struct net_device *dev,
diff --git a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
index 0e4fdc3..31f61a7 100644
--- a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
+++ b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
@@ -31,15 +31,10 @@
#define BNAD_NUM_TXF_COUNTERS 12
#define BNAD_NUM_RXF_COUNTERS 10
#define BNAD_NUM_CQ_COUNTERS (3 + 5)
-#define BNAD_NUM_RXQ_COUNTERS 6
+#define BNAD_NUM_RXQ_COUNTERS 7
#define BNAD_NUM_TXQ_COUNTERS 5
-#define BNAD_ETHTOOL_STATS_NUM \
- (sizeof(struct rtnl_link_stats64) / sizeof(u64) + \
- sizeof(struct bnad_drv_stats) / sizeof(u64) + \
- offsetof(struct bfi_enet_stats, rxf_stats[0]) / sizeof(u64))
-
-static const char *bnad_net_stats_strings[BNAD_ETHTOOL_STATS_NUM] = {
+static const char *bnad_net_stats_strings[] = {
"rx_packets",
"tx_packets",
"rx_bytes",
@@ -50,22 +45,10 @@
"tx_dropped",
"multicast",
"collisions",
-
"rx_length_errors",
- "rx_over_errors",
"rx_crc_errors",
"rx_frame_errors",
- "rx_fifo_errors",
- "rx_missed_errors",
-
- "tx_aborted_errors",
- "tx_carrier_errors",
"tx_fifo_errors",
- "tx_heartbeat_errors",
- "tx_window_errors",
-
- "rx_compressed",
- "tx_compressed",
"netif_queue_stop",
"netif_queue_wakeup",
@@ -254,6 +237,8 @@
"fc_tx_fid_parity_errors",
};
+#define BNAD_ETHTOOL_STATS_NUM ARRAY_SIZE(bnad_net_stats_strings)
+
static int
bnad_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd)
{
@@ -658,6 +643,8 @@
string += ETH_GSTRING_LEN;
sprintf(string, "rxq%d_allocbuf_failed", q_num);
string += ETH_GSTRING_LEN;
+ sprintf(string, "rxq%d_mapbuf_failed", q_num);
+ string += ETH_GSTRING_LEN;
sprintf(string, "rxq%d_producer_index", q_num);
string += ETH_GSTRING_LEN;
sprintf(string, "rxq%d_consumer_index", q_num);
@@ -678,6 +665,9 @@
sprintf(string, "rxq%d_allocbuf_failed",
q_num);
string += ETH_GSTRING_LEN;
+ sprintf(string, "rxq%d_mapbuf_failed",
+ q_num);
+ string += ETH_GSTRING_LEN;
sprintf(string, "rxq%d_producer_index",
q_num);
string += ETH_GSTRING_LEN;
@@ -854,9 +844,9 @@
u64 *buf)
{
struct bnad *bnad = netdev_priv(netdev);
- int i, j, bi;
+ int i, j, bi = 0;
unsigned long flags;
- struct rtnl_link_stats64 *net_stats64;
+ struct rtnl_link_stats64 net_stats64;
u64 *stats64;
u32 bmap;
@@ -871,14 +861,25 @@
* under the same lock
*/
spin_lock_irqsave(&bnad->bna_lock, flags);
- bi = 0;
- memset(buf, 0, stats->n_stats * sizeof(u64));
- net_stats64 = (struct rtnl_link_stats64 *)buf;
- bnad_netdev_qstats_fill(bnad, net_stats64);
- bnad_netdev_hwstats_fill(bnad, net_stats64);
+ memset(&net_stats64, 0, sizeof(net_stats64));
+ bnad_netdev_qstats_fill(bnad, &net_stats64);
+ bnad_netdev_hwstats_fill(bnad, &net_stats64);
- bi = sizeof(*net_stats64) / sizeof(u64);
+ buf[bi++] = net_stats64.rx_packets;
+ buf[bi++] = net_stats64.tx_packets;
+ buf[bi++] = net_stats64.rx_bytes;
+ buf[bi++] = net_stats64.tx_bytes;
+ buf[bi++] = net_stats64.rx_errors;
+ buf[bi++] = net_stats64.tx_errors;
+ buf[bi++] = net_stats64.rx_dropped;
+ buf[bi++] = net_stats64.tx_dropped;
+ buf[bi++] = net_stats64.multicast;
+ buf[bi++] = net_stats64.collisions;
+ buf[bi++] = net_stats64.rx_length_errors;
+ buf[bi++] = net_stats64.rx_crc_errors;
+ buf[bi++] = net_stats64.rx_frame_errors;
+ buf[bi++] = net_stats64.tx_fifo_errors;
/* Get netif_queue_stopped from stack */
bnad->stats.drv_stats.netif_queue_stopped = netif_queue_stopped(netdev);
diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 18d12d3..3042610 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -305,7 +305,7 @@
bool msix_enabled;
u8 num_vec;
struct msix_entry msix_entries[NIC_VF_MSIX_VECTORS];
- char irq_name[NIC_VF_MSIX_VECTORS][20];
+ char irq_name[NIC_VF_MSIX_VECTORS][IFNAMSIZ + 15];
bool irq_allocated[NIC_VF_MSIX_VECTORS];
cpumask_var_t affinity_mask[NIC_VF_MSIX_VECTORS];
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 7d00162..45a13f7 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -516,7 +516,8 @@
static void nicvf_snd_pkt_handler(struct net_device *netdev,
struct cmp_queue *cq,
struct cqe_send_t *cqe_tx,
- int cqe_type, int budget)
+ int cqe_type, int budget,
+ unsigned int *tx_pkts, unsigned int *tx_bytes)
{
struct sk_buff *skb = NULL;
struct nicvf *nic = netdev_priv(netdev);
@@ -547,6 +548,8 @@
}
nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
prefetch(skb);
+ (*tx_pkts)++;
+ *tx_bytes += skb->len;
napi_consume_skb(skb, budget);
sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
} else {
@@ -662,6 +665,7 @@
struct cmp_queue *cq = &qs->cq[cq_idx];
struct cqe_rx_t *cq_desc;
struct netdev_queue *txq;
+ unsigned int tx_pkts = 0, tx_bytes = 0;
spin_lock_bh(&cq->lock);
loop:
@@ -701,7 +705,7 @@
case CQE_TYPE_SEND:
nicvf_snd_pkt_handler(netdev, cq,
(void *)cq_desc, CQE_TYPE_SEND,
- budget);
+ budget, &tx_pkts, &tx_bytes);
tx_done++;
break;
case CQE_TYPE_INVALID:
@@ -730,6 +734,9 @@
netdev = nic->pnicvf->netdev;
txq = netdev_get_tx_queue(netdev,
nicvf_netdev_qidx(nic, cq_idx));
+ if (tx_pkts)
+ netdev_tx_completed_queue(txq, tx_pkts, tx_bytes);
+
nic = nic->pnicvf;
if (netif_tx_queue_stopped(txq) && netif_carrier_ok(netdev)) {
netif_tx_start_queue(txq);
@@ -1160,6 +1167,9 @@
netif_tx_disable(netdev);
+ for (qidx = 0; qidx < netdev->num_tx_queues; qidx++)
+ netdev_tx_reset_queue(netdev_get_tx_queue(netdev, qidx));
+
/* Free resources */
nicvf_config_data_transfer(nic, false);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 178c5c7..a4fc501 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1082,6 +1082,24 @@
imm->len = 1;
}
+static inline void nicvf_sq_doorbell(struct nicvf *nic, struct sk_buff *skb,
+ int sq_num, int desc_cnt)
+{
+ struct netdev_queue *txq;
+
+ txq = netdev_get_tx_queue(nic->pnicvf->netdev,
+ skb_get_queue_mapping(skb));
+
+ netdev_tx_sent_queue(txq, skb->len);
+
+ /* make sure all memory stores are done before ringing doorbell */
+ smp_wmb();
+
+ /* Inform HW to xmit all TSO segments */
+ nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
+ sq_num, desc_cnt);
+}
+
/* Segment a TSO packet into 'gso_size' segments and append
* them to SQ for transfer
*/
@@ -1141,12 +1159,8 @@
/* Save SKB in the last segment for freeing */
sq->skbuff[hdr_qentry] = (u64)skb;
- /* make sure all memory stores are done before ringing doorbell */
- smp_wmb();
+ nicvf_sq_doorbell(nic, skb, sq_num, desc_cnt);
- /* Inform HW to xmit all TSO segments */
- nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
- sq_num, desc_cnt);
nic->drv_stats.tx_tso++;
return 1;
}
@@ -1219,12 +1233,8 @@
nicvf_sq_add_cqe_subdesc(sq, qentry, tso_sqe, skb);
}
- /* make sure all memory stores are done before ringing doorbell */
- smp_wmb();
+ nicvf_sq_doorbell(nic, skb, sq_num, subdesc_cnt);
- /* Inform HW to xmit new packet */
- nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
- sq_num, subdesc_cnt);
return 1;
append_fail:
diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index 2461296..c6b71f6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,7 +4,7 @@
obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o
cxgb4-$(CONFIG_CHELSIO_T4_DCB) += cxgb4_dcb.o
cxgb4-$(CONFIG_CHELSIO_T4_FCOE) += cxgb4_fcoe.o
cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 4595569..28e653e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -422,8 +422,8 @@
unsigned short supported; /* link capabilities */
unsigned short advertising; /* advertised capabilities */
unsigned short lp_advertising; /* peer advertised capabilities */
- unsigned short requested_speed; /* speed user has requested */
- unsigned short speed; /* actual link speed */
+ unsigned int requested_speed; /* speed user has requested */
+ unsigned int speed; /* actual link speed */
unsigned char requested_fc; /* flow control user has requested */
unsigned char fc; /* actual link flow control */
unsigned char autoneg; /* autonegotiating? */
@@ -437,11 +437,6 @@
MAX_ETH_QSETS = 32, /* # of Ethernet Tx/Rx queue sets */
MAX_OFLD_QSETS = 16, /* # of offload Tx, iscsi Rx queue sets */
MAX_CTRL_QUEUES = NCHAN, /* # of control Tx queues */
- MAX_RDMA_QUEUES = NCHAN, /* # of streaming RDMA Rx queues */
- MAX_RDMA_CIQS = 32, /* # of RDMA concentrator IQs */
-
- /* # of streaming iSCSIT Rx queues */
- MAX_ISCSIT_QUEUES = MAX_OFLD_QSETS,
};
enum {
@@ -458,8 +453,7 @@
enum {
INGQ_EXTRAS = 2, /* firmware event queue and */
/* forwarded interrupts */
- MAX_INGQ = MAX_ETH_QSETS + MAX_OFLD_QSETS + MAX_RDMA_QUEUES +
- MAX_RDMA_CIQS + MAX_ISCSIT_QUEUES + INGQ_EXTRAS,
+ MAX_INGQ = MAX_ETH_QSETS + INGQ_EXTRAS,
};
struct adapter;
@@ -704,10 +698,6 @@
struct sge_ctrl_txq ctrlq[MAX_CTRL_QUEUES];
struct sge_eth_rxq ethrxq[MAX_ETH_QSETS];
- struct sge_ofld_rxq iscsirxq[MAX_OFLD_QSETS];
- struct sge_ofld_rxq iscsitrxq[MAX_ISCSIT_QUEUES];
- struct sge_ofld_rxq rdmarxq[MAX_RDMA_QUEUES];
- struct sge_ofld_rxq rdmaciq[MAX_RDMA_CIQS];
struct sge_rspq fw_evtq ____cacheline_aligned_in_smp;
struct sge_uld_rxq_info **uld_rxq_info;
@@ -717,15 +707,8 @@
u16 max_ethqsets; /* # of available Ethernet queue sets */
u16 ethqsets; /* # of active Ethernet queue sets */
u16 ethtxq_rover; /* Tx queue to clean up next */
- u16 iscsiqsets; /* # of active iSCSI queue sets */
- u16 niscsitq; /* # of available iSCST Rx queues */
- u16 rdmaqs; /* # of available RDMA Rx queues */
- u16 rdmaciqs; /* # of available RDMA concentrator IQs */
+ u16 ofldqsets; /* # of active ofld queue sets */
u16 nqs_per_uld; /* # of Rx queues per ULD */
- u16 iscsi_rxq[MAX_OFLD_QSETS];
- u16 iscsit_rxq[MAX_ISCSIT_QUEUES];
- u16 rdma_rxq[MAX_RDMA_QUEUES];
- u16 rdma_ciq[MAX_RDMA_CIQS];
u16 timer_val[SGE_NTIMERS];
u8 counter_val[SGE_NCOUNTERS];
u32 fl_pg_order; /* large page allocation size */
@@ -749,10 +732,7 @@
};
#define for_each_ethrxq(sge, i) for (i = 0; i < (sge)->ethqsets; i++)
-#define for_each_iscsirxq(sge, i) for (i = 0; i < (sge)->iscsiqsets; i++)
-#define for_each_iscsitrxq(sge, i) for (i = 0; i < (sge)->niscsitq; i++)
-#define for_each_rdmarxq(sge, i) for (i = 0; i < (sge)->rdmaqs; i++)
-#define for_each_rdmaciq(sge, i) for (i = 0; i < (sge)->rdmaciqs; i++)
+#define for_each_ofldtxq(sge, i) for (i = 0; i < (sge)->ofldqsets; i++)
struct l2t_data;
@@ -786,6 +766,7 @@
struct uld_msix_info {
unsigned short vec;
char desc[IFNAMSIZ + 10];
+ unsigned int idx;
};
struct vf_info {
@@ -818,7 +799,7 @@
} msix_info[MAX_INGQ + 1];
struct uld_msix_info *msix_info_ulds; /* msix info for uld's */
struct uld_msix_bmap msix_bmap_ulds; /* msix bitmap for all uld */
- unsigned int msi_idx;
+ int msi_idx;
struct doorbell_stats db_stats;
struct sge sge;
@@ -836,9 +817,10 @@
unsigned int clipt_start;
unsigned int clipt_end;
struct clip_tbl *clipt;
- struct cxgb4_pci_uld_info *uld;
+ struct cxgb4_uld_info *uld;
void *uld_handle[CXGB4_ULD_MAX];
unsigned int num_uld;
+ unsigned int num_ofld_uld;
struct list_head list_node;
struct list_head rcu_node;
struct list_head mac_hlist; /* list of MAC addresses in MPS Hash */
@@ -858,6 +840,8 @@
#define T4_OS_LOG_MBOX_CMDS 256
struct mbox_cmd_log *mbox_log;
+ struct mutex uld_mutex;
+
struct dentry *debugfs_root;
bool use_bd; /* Use SGE Back Door intfc for reading SGE Contexts */
bool trace_rss; /* 1 implies that different RSS flit per filter is
@@ -867,6 +851,9 @@
spinlock_t stats_lock;
spinlock_t win0_lock ____cacheline_aligned_in_smp;
+
+ /* TC u32 offload */
+ struct cxgb4_tc_u32_table *tc_u32;
};
/* Support for "sched-class" command to allow a TX Scheduling Class to be
@@ -1041,6 +1028,32 @@
VLAN_REWRITE
};
+/* Host shadow copy of ingress filter entry. This is in host native format
+ * and doesn't match the ordering or bit order, etc. of the hardware of the
+ * firmware command. The use of bit-field structure elements is purely to
+ * remind ourselves of the field size limitations and save memory in the case
+ * where the filter table is large.
+ */
+struct filter_entry {
+ /* Administrative fields for filter. */
+ u32 valid:1; /* filter allocated and valid */
+ u32 locked:1; /* filter is administratively locked */
+
+ u32 pending:1; /* filter action is pending firmware reply */
+ u32 smtidx:8; /* Source MAC Table index for smac */
+ struct filter_ctx *ctx; /* Caller's completion hook */
+ struct l2t_entry *l2t; /* Layer Two Table entry for dmac */
+ struct net_device *dev; /* Associated net device */
+ u32 tid; /* This will store the actual tid */
+
+ /* The filter itself. Most of this is a straight copy of information
+ * provided by the extended ioctl(). Some fields are translated to
+ * internal forms -- for instance the Ingress Queue ID passed in from
+ * the ioctl() is translated into the Absolute Ingress Queue ID.
+ */
+ struct ch_filter_specification fs;
+};
+
static inline int is_offload(const struct adapter *adap)
{
return adap->params.offload;
@@ -1051,6 +1064,11 @@
return adap->params.crypto;
}
+static inline int is_uld(const struct adapter *adap)
+{
+ return (adap->params.offload || adap->params.crypto);
+}
+
static inline u32 t4_read_reg(struct adapter *adap, u32 reg_addr)
{
return readl(adap->regs + reg_addr);
@@ -1277,6 +1295,8 @@
int t4_sge_alloc_ctrl_txq(struct adapter *adap, struct sge_ctrl_txq *txq,
struct net_device *dev, unsigned int iqid,
unsigned int cmplqid);
+int t4_sge_mod_ctrl_txq(struct adapter *adap, unsigned int eqid,
+ unsigned int cmplqid);
int t4_sge_alloc_ofld_txq(struct adapter *adap, struct sge_ofld_txq *txq,
struct net_device *dev, unsigned int iqid);
irqreturn_t t4_sge_intr_msix(int irq, void *cookie);
@@ -1635,7 +1655,9 @@
int hz, int ticks);
int t4_set_vf_mac_acl(struct adapter *adapter, unsigned int vf,
unsigned int naddr, u8 *addr);
-void uld_mem_free(struct adapter *adap);
-int uld_mem_alloc(struct adapter *adap);
+void t4_uld_mem_free(struct adapter *adap);
+int t4_uld_mem_alloc(struct adapter *adap);
+void t4_uld_clean_up(struct adapter *adap);
+void t4_register_netevent_notifier(void);
void free_rspq_fl(struct adapter *adap, struct sge_rspq *rq, struct sge_fl *fl);
#endif /* __CXGB4_H__ */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 91fb508..20455d0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -2432,17 +2432,11 @@
{
struct adapter *adap = seq->private;
int eth_entries = DIV_ROUND_UP(adap->sge.ethqsets, 4);
- int iscsi_entries = DIV_ROUND_UP(adap->sge.iscsiqsets, 4);
- int iscsit_entries = DIV_ROUND_UP(adap->sge.niscsitq, 4);
- int rdma_entries = DIV_ROUND_UP(adap->sge.rdmaqs, 4);
- int ciq_entries = DIV_ROUND_UP(adap->sge.rdmaciqs, 4);
+ int ofld_entries = DIV_ROUND_UP(adap->sge.ofldqsets, 4);
int ctrl_entries = DIV_ROUND_UP(MAX_CTRL_QUEUES, 4);
int i, r = (uintptr_t)v - 1;
- int iscsi_idx = r - eth_entries;
- int iscsit_idx = iscsi_idx - iscsi_entries;
- int rdma_idx = iscsit_idx - iscsit_entries;
- int ciq_idx = rdma_idx - rdma_entries;
- int ctrl_idx = ciq_idx - ciq_entries;
+ int ofld_idx = r - eth_entries;
+ int ctrl_idx = ofld_idx - ofld_entries;
int fq_idx = ctrl_idx - ctrl_entries;
if (r)
@@ -2518,119 +2512,17 @@
RL("FLLow:", fl.low);
RL("FLStarving:", fl.starving);
- } else if (iscsi_idx < iscsi_entries) {
- const struct sge_ofld_rxq *rx =
- &adap->sge.iscsirxq[iscsi_idx * 4];
+ } else if (ofld_idx < ofld_entries) {
const struct sge_ofld_txq *tx =
- &adap->sge.ofldtxq[iscsi_idx * 4];
- int n = min(4, adap->sge.iscsiqsets - 4 * iscsi_idx);
+ &adap->sge.ofldtxq[ofld_idx * 4];
+ int n = min(4, adap->sge.ofldqsets - 4 * ofld_idx);
- S("QType:", "iSCSI");
+ S("QType:", "OFLD-Txq");
T("TxQ ID:", q.cntxt_id);
T("TxQ size:", q.size);
T("TxQ inuse:", q.in_use);
T("TxQ CIDX:", q.cidx);
T("TxQ PIDX:", q.pidx);
- R("RspQ ID:", rspq.abs_id);
- R("RspQ size:", rspq.size);
- R("RspQE size:", rspq.iqe_len);
- R("RspQ CIDX:", rspq.cidx);
- R("RspQ Gen:", rspq.gen);
- S3("u", "Intr delay:", qtimer_val(adap, &rx[i].rspq));
- S3("u", "Intr pktcnt:",
- adap->sge.counter_val[rx[i].rspq.pktcnt_idx]);
- R("FL ID:", fl.cntxt_id);
- R("FL size:", fl.size - 8);
- R("FL pend:", fl.pend_cred);
- R("FL avail:", fl.avail);
- R("FL PIDX:", fl.pidx);
- R("FL CIDX:", fl.cidx);
- RL("RxPackets:", stats.pkts);
- RL("RxImmPkts:", stats.imm);
- RL("RxNoMem:", stats.nomem);
- RL("FLAllocErr:", fl.alloc_failed);
- RL("FLLrgAlcErr:", fl.large_alloc_failed);
- RL("FLMapErr:", fl.mapping_err);
- RL("FLLow:", fl.low);
- RL("FLStarving:", fl.starving);
-
- } else if (iscsit_idx < iscsit_entries) {
- const struct sge_ofld_rxq *rx =
- &adap->sge.iscsitrxq[iscsit_idx * 4];
- int n = min(4, adap->sge.niscsitq - 4 * iscsit_idx);
-
- S("QType:", "iSCSIT");
- R("RspQ ID:", rspq.abs_id);
- R("RspQ size:", rspq.size);
- R("RspQE size:", rspq.iqe_len);
- R("RspQ CIDX:", rspq.cidx);
- R("RspQ Gen:", rspq.gen);
- S3("u", "Intr delay:", qtimer_val(adap, &rx[i].rspq));
- S3("u", "Intr pktcnt:",
- adap->sge.counter_val[rx[i].rspq.pktcnt_idx]);
- R("FL ID:", fl.cntxt_id);
- R("FL size:", fl.size - 8);
- R("FL pend:", fl.pend_cred);
- R("FL avail:", fl.avail);
- R("FL PIDX:", fl.pidx);
- R("FL CIDX:", fl.cidx);
- RL("RxPackets:", stats.pkts);
- RL("RxImmPkts:", stats.imm);
- RL("RxNoMem:", stats.nomem);
- RL("FLAllocErr:", fl.alloc_failed);
- RL("FLLrgAlcErr:", fl.large_alloc_failed);
- RL("FLMapErr:", fl.mapping_err);
- RL("FLLow:", fl.low);
- RL("FLStarving:", fl.starving);
-
- } else if (rdma_idx < rdma_entries) {
- const struct sge_ofld_rxq *rx =
- &adap->sge.rdmarxq[rdma_idx * 4];
- int n = min(4, adap->sge.rdmaqs - 4 * rdma_idx);
-
- S("QType:", "RDMA-CPL");
- S("Interface:",
- rx[i].rspq.netdev ? rx[i].rspq.netdev->name : "N/A");
- R("RspQ ID:", rspq.abs_id);
- R("RspQ size:", rspq.size);
- R("RspQE size:", rspq.iqe_len);
- R("RspQ CIDX:", rspq.cidx);
- R("RspQ Gen:", rspq.gen);
- S3("u", "Intr delay:", qtimer_val(adap, &rx[i].rspq));
- S3("u", "Intr pktcnt:",
- adap->sge.counter_val[rx[i].rspq.pktcnt_idx]);
- R("FL ID:", fl.cntxt_id);
- R("FL size:", fl.size - 8);
- R("FL pend:", fl.pend_cred);
- R("FL avail:", fl.avail);
- R("FL PIDX:", fl.pidx);
- R("FL CIDX:", fl.cidx);
- RL("RxPackets:", stats.pkts);
- RL("RxImmPkts:", stats.imm);
- RL("RxNoMem:", stats.nomem);
- RL("FLAllocErr:", fl.alloc_failed);
- RL("FLLrgAlcErr:", fl.large_alloc_failed);
- RL("FLMapErr:", fl.mapping_err);
- RL("FLLow:", fl.low);
- RL("FLStarving:", fl.starving);
-
- } else if (ciq_idx < ciq_entries) {
- const struct sge_ofld_rxq *rx = &adap->sge.rdmaciq[ciq_idx * 4];
- int n = min(4, adap->sge.rdmaciqs - 4 * ciq_idx);
-
- S("QType:", "RDMA-CIQ");
- S("Interface:",
- rx[i].rspq.netdev ? rx[i].rspq.netdev->name : "N/A");
- R("RspQ ID:", rspq.abs_id);
- R("RspQ size:", rspq.size);
- R("RspQE size:", rspq.iqe_len);
- R("RspQ CIDX:", rspq.cidx);
- R("RspQ Gen:", rspq.gen);
- S3("u", "Intr delay:", qtimer_val(adap, &rx[i].rspq));
- S3("u", "Intr pktcnt:",
- adap->sge.counter_val[rx[i].rspq.pktcnt_idx]);
- RL("RxAN:", stats.an);
- RL("RxNoMem:", stats.nomem);
} else if (ctrl_idx < ctrl_entries) {
const struct sge_ctrl_txq *tx = &adap->sge.ctrlq[ctrl_idx * 4];
@@ -2672,10 +2564,7 @@
static int sge_queue_entries(const struct adapter *adap)
{
return DIV_ROUND_UP(adap->sge.ethqsets, 4) +
- DIV_ROUND_UP(adap->sge.iscsiqsets, 4) +
- DIV_ROUND_UP(adap->sge.niscsitq, 4) +
- DIV_ROUND_UP(adap->sge.rdmaqs, 4) +
- DIV_ROUND_UP(adap->sge.rdmaciqs, 4) +
+ DIV_ROUND_UP(adap->sge.ofldqsets, 4) +
DIV_ROUND_UP(MAX_CTRL_QUEUES, 4) + 1;
}
@@ -2859,12 +2748,6 @@
size_mb << 20);
}
-static int blocked_fl_open(struct inode *inode, struct file *file)
-{
- file->private_data = inode->i_private;
- return 0;
-}
-
static ssize_t blocked_fl_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
@@ -2908,7 +2791,7 @@
static const struct file_operations blocked_fl_fops = {
.owner = THIS_MODULE,
- .open = blocked_fl_open,
+ .open = simple_open,
.read = blocked_fl_read,
.write = blocked_fl_write,
.llseek = generic_file_llseek,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
new file mode 100644
index 0000000..1073673
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -0,0 +1,721 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "cxgb4.h"
+#include "t4_regs.h"
+#include "l2t.h"
+#include "t4fw_api.h"
+#include "cxgb4_filter.h"
+
+static inline bool is_field_set(u32 val, u32 mask)
+{
+ return val || mask;
+}
+
+static inline bool unsupported(u32 conf, u32 conf_mask, u32 val, u32 mask)
+{
+ return !(conf & conf_mask) && is_field_set(val, mask);
+}
+
+/* Validate filter spec against configuration done on the card. */
+static int validate_filter(struct net_device *dev,
+ struct ch_filter_specification *fs)
+{
+ struct adapter *adapter = netdev2adap(dev);
+ u32 fconf, iconf;
+
+ /* Check for unconfigured fields being used. */
+ fconf = adapter->params.tp.vlan_pri_map;
+ iconf = adapter->params.tp.ingress_config;
+
+ if (unsupported(fconf, FCOE_F, fs->val.fcoe, fs->mask.fcoe) ||
+ unsupported(fconf, PORT_F, fs->val.iport, fs->mask.iport) ||
+ unsupported(fconf, TOS_F, fs->val.tos, fs->mask.tos) ||
+ unsupported(fconf, ETHERTYPE_F, fs->val.ethtype,
+ fs->mask.ethtype) ||
+ unsupported(fconf, MACMATCH_F, fs->val.macidx, fs->mask.macidx) ||
+ unsupported(fconf, MPSHITTYPE_F, fs->val.matchtype,
+ fs->mask.matchtype) ||
+ unsupported(fconf, FRAGMENTATION_F, fs->val.frag, fs->mask.frag) ||
+ unsupported(fconf, PROTOCOL_F, fs->val.proto, fs->mask.proto) ||
+ unsupported(fconf, VNIC_ID_F, fs->val.pfvf_vld,
+ fs->mask.pfvf_vld) ||
+ unsupported(fconf, VNIC_ID_F, fs->val.ovlan_vld,
+ fs->mask.ovlan_vld) ||
+ unsupported(fconf, VLAN_F, fs->val.ivlan_vld, fs->mask.ivlan_vld))
+ return -EOPNOTSUPP;
+
+ /* T4 inconveniently uses the same FT_VNIC_ID_W bits for both the Outer
+ * VLAN Tag and PF/VF/VFvld fields based on VNIC_F being set
+ * in TP_INGRESS_CONFIG. Hense the somewhat crazy checks
+ * below. Additionally, since the T4 firmware interface also
+ * carries that overlap, we need to translate any PF/VF
+ * specification into that internal format below.
+ */
+ if (is_field_set(fs->val.pfvf_vld, fs->mask.pfvf_vld) &&
+ is_field_set(fs->val.ovlan_vld, fs->mask.ovlan_vld))
+ return -EOPNOTSUPP;
+ if (unsupported(iconf, VNIC_F, fs->val.pfvf_vld, fs->mask.pfvf_vld) ||
+ (is_field_set(fs->val.ovlan_vld, fs->mask.ovlan_vld) &&
+ (iconf & VNIC_F)))
+ return -EOPNOTSUPP;
+ if (fs->val.pf > 0x7 || fs->val.vf > 0x7f)
+ return -ERANGE;
+ fs->mask.pf &= 0x7;
+ fs->mask.vf &= 0x7f;
+
+ /* If the user is requesting that the filter action loop
+ * matching packets back out one of our ports, make sure that
+ * the egress port is in range.
+ */
+ if (fs->action == FILTER_SWITCH &&
+ fs->eport >= adapter->params.nports)
+ return -ERANGE;
+
+ /* Don't allow various trivially obvious bogus out-of-range values... */
+ if (fs->val.iport >= adapter->params.nports)
+ return -ERANGE;
+
+ /* T4 doesn't support removing VLAN Tags for loop back filters. */
+ if (is_t4(adapter->params.chip) &&
+ fs->action == FILTER_SWITCH &&
+ (fs->newvlan == VLAN_REMOVE ||
+ fs->newvlan == VLAN_REWRITE))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+static int get_filter_steerq(struct net_device *dev,
+ struct ch_filter_specification *fs)
+{
+ struct adapter *adapter = netdev2adap(dev);
+ int iq;
+
+ /* If the user has requested steering matching Ingress Packets
+ * to a specific Queue Set, we need to make sure it's in range
+ * for the port and map that into the Absolute Queue ID of the
+ * Queue Set's Response Queue.
+ */
+ if (!fs->dirsteer) {
+ if (fs->iq)
+ return -EINVAL;
+ iq = 0;
+ } else {
+ struct port_info *pi = netdev_priv(dev);
+
+ /* If the iq id is greater than the number of qsets,
+ * then assume it is an absolute qid.
+ */
+ if (fs->iq < pi->nqsets)
+ iq = adapter->sge.ethrxq[pi->first_qset +
+ fs->iq].rspq.abs_id;
+ else
+ iq = fs->iq;
+ }
+
+ return iq;
+}
+
+static int cxgb4_set_ftid(struct tid_info *t, int fidx, int family)
+{
+ spin_lock_bh(&t->ftid_lock);
+
+ if (test_bit(fidx, t->ftid_bmap)) {
+ spin_unlock_bh(&t->ftid_lock);
+ return -EBUSY;
+ }
+
+ if (family == PF_INET)
+ __set_bit(fidx, t->ftid_bmap);
+ else
+ bitmap_allocate_region(t->ftid_bmap, fidx, 2);
+
+ spin_unlock_bh(&t->ftid_lock);
+ return 0;
+}
+
+static void cxgb4_clear_ftid(struct tid_info *t, int fidx, int family)
+{
+ spin_lock_bh(&t->ftid_lock);
+ if (family == PF_INET)
+ __clear_bit(fidx, t->ftid_bmap);
+ else
+ bitmap_release_region(t->ftid_bmap, fidx, 2);
+ spin_unlock_bh(&t->ftid_lock);
+}
+
+/* Delete the filter at a specified index. */
+static int del_filter_wr(struct adapter *adapter, int fidx)
+{
+ struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+ struct fw_filter_wr *fwr;
+ struct sk_buff *skb;
+ unsigned int len;
+
+ len = sizeof(*fwr);
+
+ skb = alloc_skb(len, GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ fwr = (struct fw_filter_wr *)__skb_put(skb, len);
+ t4_mk_filtdelwr(f->tid, fwr, adapter->sge.fw_evtq.abs_id);
+
+ /* Mark the filter as "pending" and ship off the Filter Work Request.
+ * When we get the Work Request Reply we'll clear the pending status.
+ */
+ f->pending = 1;
+ t4_mgmt_tx(adapter, skb);
+ return 0;
+}
+
+/* Send a Work Request to write the filter at a specified index. We construct
+ * a Firmware Filter Work Request to have the work done and put the indicated
+ * filter into "pending" mode which will prevent any further actions against
+ * it till we get a reply from the firmware on the completion status of the
+ * request.
+ */
+int set_filter_wr(struct adapter *adapter, int fidx)
+{
+ struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+ struct fw_filter_wr *fwr;
+ struct sk_buff *skb;
+
+ skb = alloc_skb(sizeof(*fwr), GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ /* If the new filter requires loopback Destination MAC and/or VLAN
+ * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
+ * the filter.
+ */
+ if (f->fs.newdmac || f->fs.newvlan) {
+ /* allocate L2T entry for new filter */
+ f->l2t = t4_l2t_alloc_switching(adapter, f->fs.vlan,
+ f->fs.eport, f->fs.dmac);
+ if (!f->l2t) {
+ kfree_skb(skb);
+ return -ENOMEM;
+ }
+ }
+
+ fwr = (struct fw_filter_wr *)__skb_put(skb, sizeof(*fwr));
+ memset(fwr, 0, sizeof(*fwr));
+
+ /* It would be nice to put most of the following in t4_hw.c but most
+ * of the work is translating the cxgbtool ch_filter_specification
+ * into the Work Request and the definition of that structure is
+ * currently in cxgbtool.h which isn't appropriate to pull into the
+ * common code. We may eventually try to come up with a more neutral
+ * filter specification structure but for now it's easiest to simply
+ * put this fairly direct code in line ...
+ */
+ fwr->op_pkd = htonl(FW_WR_OP_V(FW_FILTER_WR));
+ fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr) / 16));
+ fwr->tid_to_iq =
+ htonl(FW_FILTER_WR_TID_V(f->tid) |
+ FW_FILTER_WR_RQTYPE_V(f->fs.type) |
+ FW_FILTER_WR_NOREPLY_V(0) |
+ FW_FILTER_WR_IQ_V(f->fs.iq));
+ fwr->del_filter_to_l2tix =
+ htonl(FW_FILTER_WR_RPTTID_V(f->fs.rpttid) |
+ FW_FILTER_WR_DROP_V(f->fs.action == FILTER_DROP) |
+ FW_FILTER_WR_DIRSTEER_V(f->fs.dirsteer) |
+ FW_FILTER_WR_MASKHASH_V(f->fs.maskhash) |
+ FW_FILTER_WR_DIRSTEERHASH_V(f->fs.dirsteerhash) |
+ FW_FILTER_WR_LPBK_V(f->fs.action == FILTER_SWITCH) |
+ FW_FILTER_WR_DMAC_V(f->fs.newdmac) |
+ FW_FILTER_WR_SMAC_V(f->fs.newsmac) |
+ FW_FILTER_WR_INSVLAN_V(f->fs.newvlan == VLAN_INSERT ||
+ f->fs.newvlan == VLAN_REWRITE) |
+ FW_FILTER_WR_RMVLAN_V(f->fs.newvlan == VLAN_REMOVE ||
+ f->fs.newvlan == VLAN_REWRITE) |
+ FW_FILTER_WR_HITCNTS_V(f->fs.hitcnts) |
+ FW_FILTER_WR_TXCHAN_V(f->fs.eport) |
+ FW_FILTER_WR_PRIO_V(f->fs.prio) |
+ FW_FILTER_WR_L2TIX_V(f->l2t ? f->l2t->idx : 0));
+ fwr->ethtype = htons(f->fs.val.ethtype);
+ fwr->ethtypem = htons(f->fs.mask.ethtype);
+ fwr->frag_to_ovlan_vldm =
+ (FW_FILTER_WR_FRAG_V(f->fs.val.frag) |
+ FW_FILTER_WR_FRAGM_V(f->fs.mask.frag) |
+ FW_FILTER_WR_IVLAN_VLD_V(f->fs.val.ivlan_vld) |
+ FW_FILTER_WR_OVLAN_VLD_V(f->fs.val.ovlan_vld) |
+ FW_FILTER_WR_IVLAN_VLDM_V(f->fs.mask.ivlan_vld) |
+ FW_FILTER_WR_OVLAN_VLDM_V(f->fs.mask.ovlan_vld));
+ fwr->smac_sel = 0;
+ fwr->rx_chan_rx_rpl_iq =
+ htons(FW_FILTER_WR_RX_CHAN_V(0) |
+ FW_FILTER_WR_RX_RPL_IQ_V(adapter->sge.fw_evtq.abs_id));
+ fwr->maci_to_matchtypem =
+ htonl(FW_FILTER_WR_MACI_V(f->fs.val.macidx) |
+ FW_FILTER_WR_MACIM_V(f->fs.mask.macidx) |
+ FW_FILTER_WR_FCOE_V(f->fs.val.fcoe) |
+ FW_FILTER_WR_FCOEM_V(f->fs.mask.fcoe) |
+ FW_FILTER_WR_PORT_V(f->fs.val.iport) |
+ FW_FILTER_WR_PORTM_V(f->fs.mask.iport) |
+ FW_FILTER_WR_MATCHTYPE_V(f->fs.val.matchtype) |
+ FW_FILTER_WR_MATCHTYPEM_V(f->fs.mask.matchtype));
+ fwr->ptcl = f->fs.val.proto;
+ fwr->ptclm = f->fs.mask.proto;
+ fwr->ttyp = f->fs.val.tos;
+ fwr->ttypm = f->fs.mask.tos;
+ fwr->ivlan = htons(f->fs.val.ivlan);
+ fwr->ivlanm = htons(f->fs.mask.ivlan);
+ fwr->ovlan = htons(f->fs.val.ovlan);
+ fwr->ovlanm = htons(f->fs.mask.ovlan);
+ memcpy(fwr->lip, f->fs.val.lip, sizeof(fwr->lip));
+ memcpy(fwr->lipm, f->fs.mask.lip, sizeof(fwr->lipm));
+ memcpy(fwr->fip, f->fs.val.fip, sizeof(fwr->fip));
+ memcpy(fwr->fipm, f->fs.mask.fip, sizeof(fwr->fipm));
+ fwr->lp = htons(f->fs.val.lport);
+ fwr->lpm = htons(f->fs.mask.lport);
+ fwr->fp = htons(f->fs.val.fport);
+ fwr->fpm = htons(f->fs.mask.fport);
+ if (f->fs.newsmac)
+ memcpy(fwr->sma, f->fs.smac, sizeof(fwr->sma));
+
+ /* Mark the filter as "pending" and ship off the Filter Work Request.
+ * When we get the Work Request Reply we'll clear the pending status.
+ */
+ f->pending = 1;
+ set_wr_txq(skb, CPL_PRIORITY_CONTROL, f->fs.val.iport & 0x3);
+ t4_ofld_send(adapter, skb);
+ return 0;
+}
+
+/* Return an error number if the indicated filter isn't writable ... */
+int writable_filter(struct filter_entry *f)
+{
+ if (f->locked)
+ return -EPERM;
+ if (f->pending)
+ return -EBUSY;
+
+ return 0;
+}
+
+/* Delete the filter at the specified index (if valid). The checks for all
+ * the common problems with doing this like the filter being locked, currently
+ * pending in another operation, etc.
+ */
+int delete_filter(struct adapter *adapter, unsigned int fidx)
+{
+ struct filter_entry *f;
+ int ret;
+
+ if (fidx >= adapter->tids.nftids + adapter->tids.nsftids)
+ return -EINVAL;
+
+ f = &adapter->tids.ftid_tab[fidx];
+ ret = writable_filter(f);
+ if (ret)
+ return ret;
+ if (f->valid)
+ return del_filter_wr(adapter, fidx);
+
+ return 0;
+}
+
+/* Clear a filter and release any of its resources that we own. This also
+ * clears the filter's "pending" status.
+ */
+void clear_filter(struct adapter *adap, struct filter_entry *f)
+{
+ /* If the new or old filter have loopback rewriteing rules then we'll
+ * need to free any existing Layer Two Table (L2T) entries of the old
+ * filter rule. The firmware will handle freeing up any Source MAC
+ * Table (SMT) entries used for rewriting Source MAC Addresses in
+ * loopback rules.
+ */
+ if (f->l2t)
+ cxgb4_l2t_release(f->l2t);
+
+ /* The zeroing of the filter rule below clears the filter valid,
+ * pending, locked flags, l2t pointer, etc. so it's all we need for
+ * this operation.
+ */
+ memset(f, 0, sizeof(*f));
+}
+
+void clear_all_filters(struct adapter *adapter)
+{
+ unsigned int i;
+
+ if (adapter->tids.ftid_tab) {
+ struct filter_entry *f = &adapter->tids.ftid_tab[0];
+ unsigned int max_ftid = adapter->tids.nftids +
+ adapter->tids.nsftids;
+
+ for (i = 0; i < max_ftid; i++, f++)
+ if (f->valid || f->pending)
+ clear_filter(adapter, f);
+ }
+}
+
+/* Fill up default masks for set match fields. */
+static void fill_default_mask(struct ch_filter_specification *fs)
+{
+ unsigned int lip = 0, lip_mask = 0;
+ unsigned int fip = 0, fip_mask = 0;
+ unsigned int i;
+
+ if (fs->val.iport && !fs->mask.iport)
+ fs->mask.iport |= ~0;
+ if (fs->val.fcoe && !fs->mask.fcoe)
+ fs->mask.fcoe |= ~0;
+ if (fs->val.matchtype && !fs->mask.matchtype)
+ fs->mask.matchtype |= ~0;
+ if (fs->val.macidx && !fs->mask.macidx)
+ fs->mask.macidx |= ~0;
+ if (fs->val.ethtype && !fs->mask.ethtype)
+ fs->mask.ethtype |= ~0;
+ if (fs->val.ivlan && !fs->mask.ivlan)
+ fs->mask.ivlan |= ~0;
+ if (fs->val.ovlan && !fs->mask.ovlan)
+ fs->mask.ovlan |= ~0;
+ if (fs->val.frag && !fs->mask.frag)
+ fs->mask.frag |= ~0;
+ if (fs->val.tos && !fs->mask.tos)
+ fs->mask.tos |= ~0;
+ if (fs->val.proto && !fs->mask.proto)
+ fs->mask.proto |= ~0;
+
+ for (i = 0; i < ARRAY_SIZE(fs->val.lip); i++) {
+ lip |= fs->val.lip[i];
+ lip_mask |= fs->mask.lip[i];
+ fip |= fs->val.fip[i];
+ fip_mask |= fs->mask.fip[i];
+ }
+
+ if (lip && !lip_mask)
+ memset(fs->mask.lip, ~0, sizeof(fs->mask.lip));
+
+ if (fip && !fip_mask)
+ memset(fs->mask.fip, ~0, sizeof(fs->mask.lip));
+
+ if (fs->val.lport && !fs->mask.lport)
+ fs->mask.lport = ~0;
+ if (fs->val.fport && !fs->mask.fport)
+ fs->mask.fport = ~0;
+}
+
+/* Check a Chelsio Filter Request for validity, convert it into our internal
+ * format and send it to the hardware. Return 0 on success, an error number
+ * otherwise. We attach any provided filter operation context to the internal
+ * filter specification in order to facilitate signaling completion of the
+ * operation.
+ */
+int __cxgb4_set_filter(struct net_device *dev, int filter_id,
+ struct ch_filter_specification *fs,
+ struct filter_ctx *ctx)
+{
+ struct adapter *adapter = netdev2adap(dev);
+ unsigned int max_fidx, fidx;
+ struct filter_entry *f;
+ u32 iconf;
+ int iq, ret;
+
+ max_fidx = adapter->tids.nftids;
+ if (filter_id != (max_fidx + adapter->tids.nsftids - 1) &&
+ filter_id >= max_fidx)
+ return -E2BIG;
+
+ fill_default_mask(fs);
+
+ ret = validate_filter(dev, fs);
+ if (ret)
+ return ret;
+
+ iq = get_filter_steerq(dev, fs);
+ if (iq < 0)
+ return iq;
+
+ /* IPv6 filters occupy four slots and must be aligned on
+ * four-slot boundaries. IPv4 filters only occupy a single
+ * slot and have no alignment requirements but writing a new
+ * IPv4 filter into the middle of an existing IPv6 filter
+ * requires clearing the old IPv6 filter and hence we prevent
+ * insertion.
+ */
+ if (fs->type == 0) { /* IPv4 */
+ /* If our IPv4 filter isn't being written to a
+ * multiple of four filter index and there's an IPv6
+ * filter at the multiple of 4 base slot, then we
+ * prevent insertion.
+ */
+ fidx = filter_id & ~0x3;
+ if (fidx != filter_id &&
+ adapter->tids.ftid_tab[fidx].fs.type) {
+ f = &adapter->tids.ftid_tab[fidx];
+ if (f->valid) {
+ dev_err(adapter->pdev_dev,
+ "Invalid location. IPv6 requires 4 slots and is occupying slots %u to %u\n",
+ fidx, fidx + 3);
+ return -EINVAL;
+ }
+ }
+ } else { /* IPv6 */
+ /* Ensure that the IPv6 filter is aligned on a
+ * multiple of 4 boundary.
+ */
+ if (filter_id & 0x3) {
+ dev_err(adapter->pdev_dev,
+ "Invalid location. IPv6 must be aligned on a 4-slot boundary\n");
+ return -EINVAL;
+ }
+
+ /* Check all except the base overlapping IPv4 filter slots. */
+ for (fidx = filter_id + 1; fidx < filter_id + 4; fidx++) {
+ f = &adapter->tids.ftid_tab[fidx];
+ if (f->valid) {
+ dev_err(adapter->pdev_dev,
+ "Invalid location. IPv6 requires 4 slots and an IPv4 filter exists at %u\n",
+ fidx);
+ return -EINVAL;
+ }
+ }
+ }
+
+ /* Check to make sure that provided filter index is not
+ * already in use by someone else
+ */
+ f = &adapter->tids.ftid_tab[filter_id];
+ if (f->valid)
+ return -EBUSY;
+
+ fidx = filter_id + adapter->tids.ftid_base;
+ ret = cxgb4_set_ftid(&adapter->tids, filter_id,
+ fs->type ? PF_INET6 : PF_INET);
+ if (ret)
+ return ret;
+
+ /* Check to make sure the filter requested is writable ... */
+ ret = writable_filter(f);
+ if (ret) {
+ /* Clear the bits we have set above */
+ cxgb4_clear_ftid(&adapter->tids, filter_id,
+ fs->type ? PF_INET6 : PF_INET);
+ return ret;
+ }
+
+ /* Clear out any old resources being used by the filter before
+ * we start constructing the new filter.
+ */
+ if (f->valid)
+ clear_filter(adapter, f);
+
+ /* Convert the filter specification into our internal format.
+ * We copy the PF/VF specification into the Outer VLAN field
+ * here so the rest of the code -- including the interface to
+ * the firmware -- doesn't have to constantly do these checks.
+ */
+ f->fs = *fs;
+ f->fs.iq = iq;
+ f->dev = dev;
+
+ iconf = adapter->params.tp.ingress_config;
+ if (iconf & VNIC_F) {
+ f->fs.val.ovlan = (fs->val.pf << 13) | fs->val.vf;
+ f->fs.mask.ovlan = (fs->mask.pf << 13) | fs->mask.vf;
+ f->fs.val.ovlan_vld = fs->val.pfvf_vld;
+ f->fs.mask.ovlan_vld = fs->mask.pfvf_vld;
+ }
+
+ /* Attempt to set the filter. If we don't succeed, we clear
+ * it and return the failure.
+ */
+ f->ctx = ctx;
+ f->tid = fidx; /* Save the actual tid */
+ ret = set_filter_wr(adapter, filter_id);
+ if (ret) {
+ cxgb4_clear_ftid(&adapter->tids, filter_id,
+ fs->type ? PF_INET6 : PF_INET);
+ clear_filter(adapter, f);
+ }
+
+ return ret;
+}
+
+/* Check a delete filter request for validity and send it to the hardware.
+ * Return 0 on success, an error number otherwise. We attach any provided
+ * filter operation context to the internal filter specification in order to
+ * facilitate signaling completion of the operation.
+ */
+int __cxgb4_del_filter(struct net_device *dev, int filter_id,
+ struct filter_ctx *ctx)
+{
+ struct adapter *adapter = netdev2adap(dev);
+ struct filter_entry *f;
+ unsigned int max_fidx;
+ int ret;
+
+ max_fidx = adapter->tids.nftids;
+ if (filter_id != (max_fidx + adapter->tids.nsftids - 1) &&
+ filter_id >= max_fidx)
+ return -E2BIG;
+
+ f = &adapter->tids.ftid_tab[filter_id];
+ ret = writable_filter(f);
+ if (ret)
+ return ret;
+
+ if (f->valid) {
+ f->ctx = ctx;
+ cxgb4_clear_ftid(&adapter->tids, filter_id,
+ f->fs.type ? PF_INET6 : PF_INET);
+ return del_filter_wr(adapter, filter_id);
+ }
+
+ /* If the caller has passed in a Completion Context then we need to
+ * mark it as a successful completion so they don't stall waiting
+ * for it.
+ */
+ if (ctx) {
+ ctx->result = 0;
+ complete(&ctx->completion);
+ }
+ return ret;
+}
+
+int cxgb4_set_filter(struct net_device *dev, int filter_id,
+ struct ch_filter_specification *fs)
+{
+ struct filter_ctx ctx;
+ int ret;
+
+ init_completion(&ctx.completion);
+
+ ret = __cxgb4_set_filter(dev, filter_id, fs, &ctx);
+ if (ret)
+ goto out;
+
+ /* Wait for reply */
+ ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+ if (!ret)
+ return -ETIMEDOUT;
+
+ ret = ctx.result;
+out:
+ return ret;
+}
+
+int cxgb4_del_filter(struct net_device *dev, int filter_id)
+{
+ struct filter_ctx ctx;
+ int ret;
+
+ init_completion(&ctx.completion);
+
+ ret = __cxgb4_del_filter(dev, filter_id, &ctx);
+ if (ret)
+ goto out;
+
+ /* Wait for reply */
+ ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+ if (!ret)
+ return -ETIMEDOUT;
+
+ ret = ctx.result;
+out:
+ return ret;
+}
+
+/* Handle a filter write/deletion reply. */
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
+{
+ unsigned int tid = GET_TID(rpl);
+ struct filter_entry *f = NULL;
+ unsigned int max_fidx;
+ int idx;
+
+ max_fidx = adap->tids.nftids + adap->tids.nsftids;
+ /* Get the corresponding filter entry for this tid */
+ if (adap->tids.ftid_tab) {
+ /* Check this in normal filter region */
+ idx = tid - adap->tids.ftid_base;
+ if (idx >= max_fidx)
+ return;
+ f = &adap->tids.ftid_tab[idx];
+ if (f->tid != tid)
+ return;
+ }
+
+ /* We found the filter entry for this tid */
+ if (f) {
+ unsigned int ret = TCB_COOKIE_G(rpl->cookie);
+ struct filter_ctx *ctx;
+
+ /* Pull off any filter operation context attached to the
+ * filter.
+ */
+ ctx = f->ctx;
+ f->ctx = NULL;
+
+ if (ret == FW_FILTER_WR_FLT_DELETED) {
+ /* Clear the filter when we get confirmation from the
+ * hardware that the filter has been deleted.
+ */
+ clear_filter(adap, f);
+ if (ctx)
+ ctx->result = 0;
+ } else if (ret == FW_FILTER_WR_SMT_TBL_FULL) {
+ dev_err(adap->pdev_dev, "filter %u setup failed due to full SMT\n",
+ idx);
+ clear_filter(adap, f);
+ if (ctx)
+ ctx->result = -ENOMEM;
+ } else if (ret == FW_FILTER_WR_FLT_ADDED) {
+ f->smtidx = (be64_to_cpu(rpl->oldval) >> 24) & 0xff;
+ f->pending = 0; /* asynchronous setup completed */
+ f->valid = 1;
+ if (ctx) {
+ ctx->result = 0;
+ ctx->tid = idx;
+ }
+ } else {
+ /* Something went wrong. Issue a warning about the
+ * problem and clear everything out.
+ */
+ dev_err(adap->pdev_dev, "filter %u setup failed with error %u\n",
+ idx, ret);
+ clear_filter(adap, f);
+ if (ctx)
+ ctx->result = -EINVAL;
+ }
+ if (ctx)
+ complete(&ctx->completion);
+ }
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
new file mode 100644
index 0000000..23742cb
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
@@ -0,0 +1,48 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_FILTER_H
+#define __CXGB4_FILTER_H
+
+#include "t4_msg.h"
+
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl);
+void clear_filter(struct adapter *adap, struct filter_entry *f);
+
+int set_filter_wr(struct adapter *adapter, int fidx);
+int delete_filter(struct adapter *adapter, unsigned int fidx);
+
+int writable_filter(struct filter_entry *f);
+void clear_all_filters(struct adapter *adapter);
+#endif /* __CXGB4_FILTER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 44cc976..c8a5c43 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -67,6 +67,7 @@
#include <linux/crash_dump.h>
#include "cxgb4.h"
+#include "cxgb4_filter.h"
#include "t4_regs.h"
#include "t4_values.h"
#include "t4_msg.h"
@@ -77,6 +78,7 @@
#include "clip_tbl.h"
#include "l2t.h"
#include "sched.h"
+#include "cxgb4_tc_u32.h"
char cxgb4_driver_name[] = KBUILD_MODNAME;
@@ -87,30 +89,6 @@
const char cxgb4_driver_version[] = DRV_VERSION;
#define DRV_DESC "Chelsio T4/T5/T6 Network Driver"
-/* Host shadow copy of ingress filter entry. This is in host native format
- * and doesn't match the ordering or bit order, etc. of the hardware of the
- * firmware command. The use of bit-field structure elements is purely to
- * remind ourselves of the field size limitations and save memory in the case
- * where the filter table is large.
- */
-struct filter_entry {
- /* Administrative fields for filter.
- */
- u32 valid:1; /* filter allocated and valid */
- u32 locked:1; /* filter is administratively locked */
-
- u32 pending:1; /* filter action is pending firmware reply */
- u32 smtidx:8; /* Source MAC Table index for smac */
- struct l2t_entry *l2t; /* Layer Two Table entry for dmac */
-
- /* The filter itself. Most of this is a straight copy of information
- * provided by the extended ioctl(). Some fields are translated to
- * internal forms -- for instance the Ingress Queue ID passed in from
- * the ioctl() is translated into the Absolute Ingress Queue ID.
- */
- struct ch_filter_specification fs;
-};
-
#define DFLT_MSG_ENABLE (NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK | \
NETIF_MSG_TIMER | NETIF_MSG_IFDOWN | NETIF_MSG_IFUP |\
NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
@@ -226,11 +204,6 @@
LIST_HEAD(adapter_list);
DEFINE_MUTEX(uld_mutex);
-/* Adapter list to be accessed from atomic context */
-static LIST_HEAD(adap_rcu_list);
-static DEFINE_SPINLOCK(adap_rcu_lock);
-static struct cxgb4_uld_info ulds[CXGB4_ULD_MAX];
-static const char *const uld_str[] = { "RDMA", "iSCSI", "iSCSIT" };
static void link_report(struct net_device *dev)
{
@@ -306,7 +279,7 @@
}
#endif /* CONFIG_CHELSIO_T4_DCB */
-int cxgb4_dcb_enabled(const struct net_device *dev)
+static int cxgb4_dcb_enabled(const struct net_device *dev)
{
#ifdef CONFIG_CHELSIO_T4_DCB
struct port_info *pi = netdev_priv(dev);
@@ -532,66 +505,6 @@
}
#endif /* CONFIG_CHELSIO_T4_DCB */
-/* Clear a filter and release any of its resources that we own. This also
- * clears the filter's "pending" status.
- */
-static void clear_filter(struct adapter *adap, struct filter_entry *f)
-{
- /* If the new or old filter have loopback rewriteing rules then we'll
- * need to free any existing Layer Two Table (L2T) entries of the old
- * filter rule. The firmware will handle freeing up any Source MAC
- * Table (SMT) entries used for rewriting Source MAC Addresses in
- * loopback rules.
- */
- if (f->l2t)
- cxgb4_l2t_release(f->l2t);
-
- /* The zeroing of the filter rule below clears the filter valid,
- * pending, locked flags, l2t pointer, etc. so it's all we need for
- * this operation.
- */
- memset(f, 0, sizeof(*f));
-}
-
-/* Handle a filter write/deletion reply.
- */
-static void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
-{
- unsigned int idx = GET_TID(rpl);
- unsigned int nidx = idx - adap->tids.ftid_base;
- unsigned int ret;
- struct filter_entry *f;
-
- if (idx >= adap->tids.ftid_base && nidx <
- (adap->tids.nftids + adap->tids.nsftids)) {
- idx = nidx;
- ret = TCB_COOKIE_G(rpl->cookie);
- f = &adap->tids.ftid_tab[idx];
-
- if (ret == FW_FILTER_WR_FLT_DELETED) {
- /* Clear the filter when we get confirmation from the
- * hardware that the filter has been deleted.
- */
- clear_filter(adap, f);
- } else if (ret == FW_FILTER_WR_SMT_TBL_FULL) {
- dev_err(adap->pdev_dev, "filter %u setup failed due to full SMT\n",
- idx);
- clear_filter(adap, f);
- } else if (ret == FW_FILTER_WR_FLT_ADDED) {
- f->smtidx = (be64_to_cpu(rpl->oldval) >> 24) & 0xff;
- f->pending = 0; /* asynchronous setup completed */
- f->valid = 1;
- } else {
- /* Something went wrong. Issue a warning about the
- * problem and clear everything out.
- */
- dev_err(adap->pdev_dev, "filter %u setup failed with error %u\n",
- idx, ret);
- clear_filter(adap, f);
- }
- }
-}
-
/* Response queue handler for the FW event queue.
*/
static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
@@ -678,56 +591,6 @@
return 0;
}
-/* Flush the aggregated lro sessions */
-static void uldrx_flush_handler(struct sge_rspq *q)
-{
- if (ulds[q->uld].lro_flush)
- ulds[q->uld].lro_flush(&q->lro_mgr);
-}
-
-/**
- * uldrx_handler - response queue handler for ULD queues
- * @q: the response queue that received the packet
- * @rsp: the response queue descriptor holding the offload message
- * @gl: the gather list of packet fragments
- *
- * Deliver an ingress offload packet to a ULD. All processing is done by
- * the ULD, we just maintain statistics.
- */
-static int uldrx_handler(struct sge_rspq *q, const __be64 *rsp,
- const struct pkt_gl *gl)
-{
- struct sge_ofld_rxq *rxq = container_of(q, struct sge_ofld_rxq, rspq);
- int ret;
-
- /* FW can send CPLs encapsulated in a CPL_FW4_MSG.
- */
- if (((const struct rss_header *)rsp)->opcode == CPL_FW4_MSG &&
- ((const struct cpl_fw4_msg *)(rsp + 1))->type == FW_TYPE_RSSCPL)
- rsp += 2;
-
- if (q->flush_handler)
- ret = ulds[q->uld].lro_rx_handler(q->adap->uld_handle[q->uld],
- rsp, gl, &q->lro_mgr,
- &q->napi);
- else
- ret = ulds[q->uld].rx_handler(q->adap->uld_handle[q->uld],
- rsp, gl);
-
- if (ret) {
- rxq->stats.nomem++;
- return -1;
- }
-
- if (gl == NULL)
- rxq->stats.imm++;
- else if (gl == CXGB4_MSG_AN)
- rxq->stats.an++;
- else
- rxq->stats.pkts++;
- return 0;
-}
-
static void disable_msi(struct adapter *adapter)
{
if (adapter->flags & USING_MSIX) {
@@ -779,30 +642,12 @@
snprintf(adap->msix_info[msi_idx].desc, n, "%s-Rx%d",
d->name, i);
}
-
- /* offload queues */
- for_each_iscsirxq(&adap->sge, i)
- snprintf(adap->msix_info[msi_idx++].desc, n, "%s-iscsi%d",
- adap->port[0]->name, i);
-
- for_each_iscsitrxq(&adap->sge, i)
- snprintf(adap->msix_info[msi_idx++].desc, n, "%s-iSCSIT%d",
- adap->port[0]->name, i);
-
- for_each_rdmarxq(&adap->sge, i)
- snprintf(adap->msix_info[msi_idx++].desc, n, "%s-rdma%d",
- adap->port[0]->name, i);
-
- for_each_rdmaciq(&adap->sge, i)
- snprintf(adap->msix_info[msi_idx++].desc, n, "%s-rdma-ciq%d",
- adap->port[0]->name, i);
}
static int request_msix_queue_irqs(struct adapter *adap)
{
struct sge *s = &adap->sge;
- int err, ethqidx, iscsiqidx = 0, rdmaqidx = 0, rdmaciqqidx = 0;
- int iscsitqidx = 0;
+ int err, ethqidx;
int msi_index = 2;
err = request_irq(adap->msix_info[1].vec, t4_sge_intr_msix, 0,
@@ -819,57 +664,9 @@
goto unwind;
msi_index++;
}
- for_each_iscsirxq(s, iscsiqidx) {
- err = request_irq(adap->msix_info[msi_index].vec,
- t4_sge_intr_msix, 0,
- adap->msix_info[msi_index].desc,
- &s->iscsirxq[iscsiqidx].rspq);
- if (err)
- goto unwind;
- msi_index++;
- }
- for_each_iscsitrxq(s, iscsitqidx) {
- err = request_irq(adap->msix_info[msi_index].vec,
- t4_sge_intr_msix, 0,
- adap->msix_info[msi_index].desc,
- &s->iscsitrxq[iscsitqidx].rspq);
- if (err)
- goto unwind;
- msi_index++;
- }
- for_each_rdmarxq(s, rdmaqidx) {
- err = request_irq(adap->msix_info[msi_index].vec,
- t4_sge_intr_msix, 0,
- adap->msix_info[msi_index].desc,
- &s->rdmarxq[rdmaqidx].rspq);
- if (err)
- goto unwind;
- msi_index++;
- }
- for_each_rdmaciq(s, rdmaciqqidx) {
- err = request_irq(adap->msix_info[msi_index].vec,
- t4_sge_intr_msix, 0,
- adap->msix_info[msi_index].desc,
- &s->rdmaciq[rdmaciqqidx].rspq);
- if (err)
- goto unwind;
- msi_index++;
- }
return 0;
unwind:
- while (--rdmaciqqidx >= 0)
- free_irq(adap->msix_info[--msi_index].vec,
- &s->rdmaciq[rdmaciqqidx].rspq);
- while (--rdmaqidx >= 0)
- free_irq(adap->msix_info[--msi_index].vec,
- &s->rdmarxq[rdmaqidx].rspq);
- while (--iscsitqidx >= 0)
- free_irq(adap->msix_info[--msi_index].vec,
- &s->iscsitrxq[iscsitqidx].rspq);
- while (--iscsiqidx >= 0)
- free_irq(adap->msix_info[--msi_index].vec,
- &s->iscsirxq[iscsiqidx].rspq);
while (--ethqidx >= 0)
free_irq(adap->msix_info[--msi_index].vec,
&s->ethrxq[ethqidx].rspq);
@@ -885,16 +682,6 @@
free_irq(adap->msix_info[1].vec, &s->fw_evtq);
for_each_ethrxq(s, i)
free_irq(adap->msix_info[msi_index++].vec, &s->ethrxq[i].rspq);
- for_each_iscsirxq(s, i)
- free_irq(adap->msix_info[msi_index++].vec,
- &s->iscsirxq[i].rspq);
- for_each_iscsitrxq(s, i)
- free_irq(adap->msix_info[msi_index++].vec,
- &s->iscsitrxq[i].rspq);
- for_each_rdmarxq(s, i)
- free_irq(adap->msix_info[msi_index++].vec, &s->rdmarxq[i].rspq);
- for_each_rdmaciq(s, i)
- free_irq(adap->msix_info[msi_index++].vec, &s->rdmaciq[i].rspq);
}
/**
@@ -1033,42 +820,11 @@
}
}
-static int alloc_ofld_rxqs(struct adapter *adap, struct sge_ofld_rxq *q,
- unsigned int nq, unsigned int per_chan, int msi_idx,
- u16 *ids, bool lro)
-{
- int i, err;
- for (i = 0; i < nq; i++, q++) {
- if (msi_idx > 0)
- msi_idx++;
- err = t4_sge_alloc_rxq(adap, &q->rspq, false,
- adap->port[i / per_chan],
- msi_idx, q->fl.size ? &q->fl : NULL,
- uldrx_handler,
- lro ? uldrx_flush_handler : NULL,
- 0);
- if (err)
- return err;
- memset(&q->stats, 0, sizeof(q->stats));
- if (ids)
- ids[i] = q->rspq.abs_id;
- }
- return 0;
-}
-
-/**
- * setup_sge_queues - configure SGE Tx/Rx/response queues
- * @adap: the adapter
- *
- * Determines how many sets of SGE queues to use and initializes them.
- * We support multiple queue sets per port if we have MSI-X, otherwise
- * just one queue set per port.
- */
-static int setup_sge_queues(struct adapter *adap)
+static int setup_fw_sge_queues(struct adapter *adap)
{
- int err, i, j;
struct sge *s = &adap->sge;
+ int err = 0;
bitmap_zero(s->starving_fl, s->egr_sz);
bitmap_zero(s->txq_maperr, s->egr_sz);
@@ -1083,25 +839,27 @@
adap->msi_idx = -((int)s->intrq.abs_id + 1);
}
- /* NOTE: If you add/delete any Ingress/Egress Queue allocations in here,
- * don't forget to update the following which need to be
- * synchronized to and changes here.
- *
- * 1. The calculations of MAX_INGQ in cxgb4.h.
- *
- * 2. Update enable_msix/name_msix_vecs/request_msix_queue_irqs
- * to accommodate any new/deleted Ingress Queues
- * which need MSI-X Vectors.
- *
- * 3. Update sge_qinfo_show() to include information on the
- * new/deleted queues.
- */
err = t4_sge_alloc_rxq(adap, &s->fw_evtq, true, adap->port[0],
adap->msi_idx, NULL, fwevtq_handler, NULL, -1);
- if (err) {
-freeout: t4_free_sge_resources(adap);
- return err;
- }
+ if (err)
+ t4_free_sge_resources(adap);
+ return err;
+}
+
+/**
+ * setup_sge_queues - configure SGE Tx/Rx/response queues
+ * @adap: the adapter
+ *
+ * Determines how many sets of SGE queues to use and initializes them.
+ * We support multiple queue sets per port if we have MSI-X, otherwise
+ * just one queue set per port.
+ */
+static int setup_sge_queues(struct adapter *adap)
+{
+ int err, i, j;
+ struct sge *s = &adap->sge;
+ struct sge_uld_rxq_info *rxq_info = s->uld_rxq_info[CXGB4_ULD_RDMA];
+ unsigned int cmplqid = 0;
for_each_port(adap, i) {
struct net_device *dev = adap->port[i];
@@ -1132,8 +890,8 @@
}
}
- j = s->iscsiqsets / adap->params.nports; /* iscsi queues per channel */
- for_each_iscsirxq(s, i) {
+ j = s->ofldqsets / adap->params.nports; /* iscsi queues per channel */
+ for_each_ofldtxq(s, i) {
err = t4_sge_alloc_ofld_txq(adap, &s->ofldtxq[i],
adap->port[i / j],
s->fw_evtq.cntxt_id);
@@ -1141,30 +899,15 @@
goto freeout;
}
-#define ALLOC_OFLD_RXQS(firstq, nq, per_chan, ids, lro) do { \
- err = alloc_ofld_rxqs(adap, firstq, nq, per_chan, adap->msi_idx, ids, lro); \
- if (err) \
- goto freeout; \
- if (adap->msi_idx > 0) \
- adap->msi_idx += nq; \
-} while (0)
-
- ALLOC_OFLD_RXQS(s->iscsirxq, s->iscsiqsets, j, s->iscsi_rxq, false);
- ALLOC_OFLD_RXQS(s->iscsitrxq, s->niscsitq, j, s->iscsit_rxq, true);
- ALLOC_OFLD_RXQS(s->rdmarxq, s->rdmaqs, 1, s->rdma_rxq, false);
- j = s->rdmaciqs / adap->params.nports; /* rdmaq queues per channel */
- ALLOC_OFLD_RXQS(s->rdmaciq, s->rdmaciqs, j, s->rdma_ciq, false);
-
-#undef ALLOC_OFLD_RXQS
-
for_each_port(adap, i) {
- /*
- * Note that ->rdmarxq[i].rspq.cntxt_id below is 0 if we don't
+ /* Note that cmplqid below is 0 if we don't
* have RDMA queues, and that's the right value.
*/
+ if (rxq_info)
+ cmplqid = rxq_info->uldrxq[i].rspq.cntxt_id;
+
err = t4_sge_alloc_ctrl_txq(adap, &s->ctrlq[i], adap->port[i],
- s->fw_evtq.cntxt_id,
- s->rdmarxq[i].rspq.cntxt_id);
+ s->fw_evtq.cntxt_id, cmplqid);
if (err)
goto freeout;
}
@@ -1175,6 +918,9 @@
RSSCONTROL_V(netdev2pinfo(adap->port[0])->tx_chan) |
QUEUENUMBER_V(s->ethrxq[0].rspq.abs_id));
return 0;
+freeout:
+ t4_free_sge_resources(adap);
+ return err;
}
/*
@@ -1198,151 +944,6 @@
kvfree(addr);
}
-/* Send a Work Request to write the filter at a specified index. We construct
- * a Firmware Filter Work Request to have the work done and put the indicated
- * filter into "pending" mode which will prevent any further actions against
- * it till we get a reply from the firmware on the completion status of the
- * request.
- */
-static int set_filter_wr(struct adapter *adapter, int fidx)
-{
- struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
- struct sk_buff *skb;
- struct fw_filter_wr *fwr;
- unsigned int ftid;
-
- skb = alloc_skb(sizeof(*fwr), GFP_KERNEL);
- if (!skb)
- return -ENOMEM;
-
- /* If the new filter requires loopback Destination MAC and/or VLAN
- * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
- * the filter.
- */
- if (f->fs.newdmac || f->fs.newvlan) {
- /* allocate L2T entry for new filter */
- f->l2t = t4_l2t_alloc_switching(adapter, f->fs.vlan,
- f->fs.eport, f->fs.dmac);
- if (f->l2t == NULL) {
- kfree_skb(skb);
- return -ENOMEM;
- }
- }
-
- ftid = adapter->tids.ftid_base + fidx;
-
- fwr = (struct fw_filter_wr *)__skb_put(skb, sizeof(*fwr));
- memset(fwr, 0, sizeof(*fwr));
-
- /* It would be nice to put most of the following in t4_hw.c but most
- * of the work is translating the cxgbtool ch_filter_specification
- * into the Work Request and the definition of that structure is
- * currently in cxgbtool.h which isn't appropriate to pull into the
- * common code. We may eventually try to come up with a more neutral
- * filter specification structure but for now it's easiest to simply
- * put this fairly direct code in line ...
- */
- fwr->op_pkd = htonl(FW_WR_OP_V(FW_FILTER_WR));
- fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr)/16));
- fwr->tid_to_iq =
- htonl(FW_FILTER_WR_TID_V(ftid) |
- FW_FILTER_WR_RQTYPE_V(f->fs.type) |
- FW_FILTER_WR_NOREPLY_V(0) |
- FW_FILTER_WR_IQ_V(f->fs.iq));
- fwr->del_filter_to_l2tix =
- htonl(FW_FILTER_WR_RPTTID_V(f->fs.rpttid) |
- FW_FILTER_WR_DROP_V(f->fs.action == FILTER_DROP) |
- FW_FILTER_WR_DIRSTEER_V(f->fs.dirsteer) |
- FW_FILTER_WR_MASKHASH_V(f->fs.maskhash) |
- FW_FILTER_WR_DIRSTEERHASH_V(f->fs.dirsteerhash) |
- FW_FILTER_WR_LPBK_V(f->fs.action == FILTER_SWITCH) |
- FW_FILTER_WR_DMAC_V(f->fs.newdmac) |
- FW_FILTER_WR_SMAC_V(f->fs.newsmac) |
- FW_FILTER_WR_INSVLAN_V(f->fs.newvlan == VLAN_INSERT ||
- f->fs.newvlan == VLAN_REWRITE) |
- FW_FILTER_WR_RMVLAN_V(f->fs.newvlan == VLAN_REMOVE ||
- f->fs.newvlan == VLAN_REWRITE) |
- FW_FILTER_WR_HITCNTS_V(f->fs.hitcnts) |
- FW_FILTER_WR_TXCHAN_V(f->fs.eport) |
- FW_FILTER_WR_PRIO_V(f->fs.prio) |
- FW_FILTER_WR_L2TIX_V(f->l2t ? f->l2t->idx : 0));
- fwr->ethtype = htons(f->fs.val.ethtype);
- fwr->ethtypem = htons(f->fs.mask.ethtype);
- fwr->frag_to_ovlan_vldm =
- (FW_FILTER_WR_FRAG_V(f->fs.val.frag) |
- FW_FILTER_WR_FRAGM_V(f->fs.mask.frag) |
- FW_FILTER_WR_IVLAN_VLD_V(f->fs.val.ivlan_vld) |
- FW_FILTER_WR_OVLAN_VLD_V(f->fs.val.ovlan_vld) |
- FW_FILTER_WR_IVLAN_VLDM_V(f->fs.mask.ivlan_vld) |
- FW_FILTER_WR_OVLAN_VLDM_V(f->fs.mask.ovlan_vld));
- fwr->smac_sel = 0;
- fwr->rx_chan_rx_rpl_iq =
- htons(FW_FILTER_WR_RX_CHAN_V(0) |
- FW_FILTER_WR_RX_RPL_IQ_V(adapter->sge.fw_evtq.abs_id));
- fwr->maci_to_matchtypem =
- htonl(FW_FILTER_WR_MACI_V(f->fs.val.macidx) |
- FW_FILTER_WR_MACIM_V(f->fs.mask.macidx) |
- FW_FILTER_WR_FCOE_V(f->fs.val.fcoe) |
- FW_FILTER_WR_FCOEM_V(f->fs.mask.fcoe) |
- FW_FILTER_WR_PORT_V(f->fs.val.iport) |
- FW_FILTER_WR_PORTM_V(f->fs.mask.iport) |
- FW_FILTER_WR_MATCHTYPE_V(f->fs.val.matchtype) |
- FW_FILTER_WR_MATCHTYPEM_V(f->fs.mask.matchtype));
- fwr->ptcl = f->fs.val.proto;
- fwr->ptclm = f->fs.mask.proto;
- fwr->ttyp = f->fs.val.tos;
- fwr->ttypm = f->fs.mask.tos;
- fwr->ivlan = htons(f->fs.val.ivlan);
- fwr->ivlanm = htons(f->fs.mask.ivlan);
- fwr->ovlan = htons(f->fs.val.ovlan);
- fwr->ovlanm = htons(f->fs.mask.ovlan);
- memcpy(fwr->lip, f->fs.val.lip, sizeof(fwr->lip));
- memcpy(fwr->lipm, f->fs.mask.lip, sizeof(fwr->lipm));
- memcpy(fwr->fip, f->fs.val.fip, sizeof(fwr->fip));
- memcpy(fwr->fipm, f->fs.mask.fip, sizeof(fwr->fipm));
- fwr->lp = htons(f->fs.val.lport);
- fwr->lpm = htons(f->fs.mask.lport);
- fwr->fp = htons(f->fs.val.fport);
- fwr->fpm = htons(f->fs.mask.fport);
- if (f->fs.newsmac)
- memcpy(fwr->sma, f->fs.smac, sizeof(fwr->sma));
-
- /* Mark the filter as "pending" and ship off the Filter Work Request.
- * When we get the Work Request Reply we'll clear the pending status.
- */
- f->pending = 1;
- set_wr_txq(skb, CPL_PRIORITY_CONTROL, f->fs.val.iport & 0x3);
- t4_ofld_send(adapter, skb);
- return 0;
-}
-
-/* Delete the filter at a specified index.
- */
-static int del_filter_wr(struct adapter *adapter, int fidx)
-{
- struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
- struct sk_buff *skb;
- struct fw_filter_wr *fwr;
- unsigned int len, ftid;
-
- len = sizeof(*fwr);
- ftid = adapter->tids.ftid_base + fidx;
-
- skb = alloc_skb(len, GFP_KERNEL);
- if (!skb)
- return -ENOMEM;
-
- fwr = (struct fw_filter_wr *)__skb_put(skb, len);
- t4_mk_filtdelwr(ftid, fwr, adapter->sge.fw_evtq.abs_id);
-
- /* Mark the filter as "pending" and ship off the Filter Work Request.
- * When we get the Work Request Reply we'll clear the pending status.
- */
- f->pending = 1;
- t4_mgmt_tx(adapter, skb);
- return 0;
-}
-
static u16 cxgb_select_queue(struct net_device *dev, struct sk_buff *skb,
void *accel_priv, select_queue_fallback_t fallback)
{
@@ -1724,19 +1325,22 @@
*/
static int tid_init(struct tid_info *t)
{
- size_t size;
- unsigned int stid_bmap_size;
- unsigned int natids = t->natids;
struct adapter *adap = container_of(t, struct adapter, tids);
+ unsigned int max_ftids = t->nftids + t->nsftids;
+ unsigned int natids = t->natids;
+ unsigned int stid_bmap_size;
+ unsigned int ftid_bmap_size;
+ size_t size;
stid_bmap_size = BITS_TO_LONGS(t->nstids + t->nsftids);
+ ftid_bmap_size = BITS_TO_LONGS(t->nftids);
size = t->ntids * sizeof(*t->tid_tab) +
natids * sizeof(*t->atid_tab) +
t->nstids * sizeof(*t->stid_tab) +
t->nsftids * sizeof(*t->stid_tab) +
stid_bmap_size * sizeof(long) +
- t->nftids * sizeof(*t->ftid_tab) +
- t->nsftids * sizeof(*t->ftid_tab);
+ max_ftids * sizeof(*t->ftid_tab) +
+ ftid_bmap_size * sizeof(long);
t->tid_tab = t4_alloc_mem(size);
if (!t->tid_tab)
@@ -1746,8 +1350,10 @@
t->stid_tab = (struct serv_entry *)&t->atid_tab[natids];
t->stid_bmap = (unsigned long *)&t->stid_tab[t->nstids + t->nsftids];
t->ftid_tab = (struct filter_entry *)&t->stid_bmap[stid_bmap_size];
+ t->ftid_bmap = (unsigned long *)&t->ftid_tab[max_ftids];
spin_lock_init(&t->stid_lock);
spin_lock_init(&t->atid_lock);
+ spin_lock_init(&t->ftid_lock);
t->stids_in_use = 0;
t->sftids_in_use = 0;
@@ -1762,12 +1368,16 @@
t->atid_tab[natids - 1].next = &t->atid_tab[natids];
t->afree = t->atid_tab;
}
- bitmap_zero(t->stid_bmap, t->nstids + t->nsftids);
- /* Reserve stid 0 for T4/T5 adapters */
- if (!t->stid_base &&
- (CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5))
- __set_bit(0, t->stid_bmap);
+ if (is_offload(adap)) {
+ bitmap_zero(t->stid_bmap, t->nstids + t->nsftids);
+ /* Reserve stid 0 for T4/T5 adapters */
+ if (!t->stid_base &&
+ CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5)
+ __set_bit(0, t->stid_bmap);
+ }
+
+ bitmap_zero(t->ftid_bmap, t->nftids);
return 0;
}
@@ -2317,7 +1927,7 @@
for_each_ethrxq(&adap->sge, i)
disable_txq_db(&adap->sge.ethtxq[i].q);
- for_each_iscsirxq(&adap->sge, i)
+ for_each_ofldtxq(&adap->sge, i)
disable_txq_db(&adap->sge.ofldtxq[i].q);
for_each_port(adap, i)
disable_txq_db(&adap->sge.ctrlq[i].q);
@@ -2329,7 +1939,7 @@
for_each_ethrxq(&adap->sge, i)
enable_txq_db(adap, &adap->sge.ethtxq[i].q);
- for_each_iscsirxq(&adap->sge, i)
+ for_each_ofldtxq(&adap->sge, i)
enable_txq_db(adap, &adap->sge.ofldtxq[i].q);
for_each_port(adap, i)
enable_txq_db(adap, &adap->sge.ctrlq[i].q);
@@ -2337,9 +1947,10 @@
static void notify_rdma_uld(struct adapter *adap, enum cxgb4_control cmd)
{
- if (adap->uld_handle[CXGB4_ULD_RDMA])
- ulds[CXGB4_ULD_RDMA].control(adap->uld_handle[CXGB4_ULD_RDMA],
- cmd);
+ enum cxgb4_uld type = CXGB4_ULD_RDMA;
+
+ if (adap->uld && adap->uld[type].handle)
+ adap->uld[type].control(adap->uld[type].handle, cmd);
}
static void process_db_full(struct work_struct *work)
@@ -2393,13 +2004,14 @@
if (ret)
CH_WARN(adap, "DB drop recovery failed.\n");
}
+
static void recover_all_queues(struct adapter *adap)
{
int i;
for_each_ethrxq(&adap->sge, i)
sync_txq_pidx(adap, &adap->sge.ethtxq[i].q);
- for_each_iscsirxq(&adap->sge, i)
+ for_each_ofldtxq(&adap->sge, i)
sync_txq_pidx(adap, &adap->sge.ofldtxq[i].q);
for_each_port(adap, i)
sync_txq_pidx(adap, &adap->sge.ctrlq[i].q);
@@ -2464,94 +2076,12 @@
queue_work(adap->workq, &adap->db_drop_task);
}
-static void uld_attach(struct adapter *adap, unsigned int uld)
+void t4_register_netevent_notifier(void)
{
- void *handle;
- struct cxgb4_lld_info lli;
- unsigned short i;
-
- lli.pdev = adap->pdev;
- lli.pf = adap->pf;
- lli.l2t = adap->l2t;
- lli.tids = &adap->tids;
- lli.ports = adap->port;
- lli.vr = &adap->vres;
- lli.mtus = adap->params.mtus;
- if (uld == CXGB4_ULD_RDMA) {
- lli.rxq_ids = adap->sge.rdma_rxq;
- lli.ciq_ids = adap->sge.rdma_ciq;
- lli.nrxq = adap->sge.rdmaqs;
- lli.nciq = adap->sge.rdmaciqs;
- } else if (uld == CXGB4_ULD_ISCSI) {
- lli.rxq_ids = adap->sge.iscsi_rxq;
- lli.nrxq = adap->sge.iscsiqsets;
- } else if (uld == CXGB4_ULD_ISCSIT) {
- lli.rxq_ids = adap->sge.iscsit_rxq;
- lli.nrxq = adap->sge.niscsitq;
- }
- lli.ntxq = adap->sge.iscsiqsets;
- lli.nchan = adap->params.nports;
- lli.nports = adap->params.nports;
- lli.wr_cred = adap->params.ofldq_wr_cred;
- lli.adapter_type = adap->params.chip;
- lli.iscsi_iolen = MAXRXDATA_G(t4_read_reg(adap, TP_PARA_REG2_A));
- lli.iscsi_tagmask = t4_read_reg(adap, ULP_RX_ISCSI_TAGMASK_A);
- lli.iscsi_pgsz_order = t4_read_reg(adap, ULP_RX_ISCSI_PSZ_A);
- lli.iscsi_llimit = t4_read_reg(adap, ULP_RX_ISCSI_LLIMIT_A);
- lli.iscsi_ppm = &adap->iscsi_ppm;
- lli.cclk_ps = 1000000000 / adap->params.vpd.cclk;
- lli.udb_density = 1 << adap->params.sge.eq_qpp;
- lli.ucq_density = 1 << adap->params.sge.iq_qpp;
- lli.filt_mode = adap->params.tp.vlan_pri_map;
- /* MODQ_REQ_MAP sets queues 0-3 to chan 0-3 */
- for (i = 0; i < NCHAN; i++)
- lli.tx_modq[i] = i;
- lli.gts_reg = adap->regs + MYPF_REG(SGE_PF_GTS_A);
- lli.db_reg = adap->regs + MYPF_REG(SGE_PF_KDOORBELL_A);
- lli.fw_vers = adap->params.fw_vers;
- lli.dbfifo_int_thresh = dbfifo_int_thresh;
- lli.sge_ingpadboundary = adap->sge.fl_align;
- lli.sge_egrstatuspagesize = adap->sge.stat_len;
- lli.sge_pktshift = adap->sge.pktshift;
- lli.enable_fw_ofld_conn = adap->flags & FW_OFLD_CONN;
- lli.max_ordird_qp = adap->params.max_ordird_qp;
- lli.max_ird_adapter = adap->params.max_ird_adapter;
- lli.ulptx_memwrite_dsgl = adap->params.ulptx_memwrite_dsgl;
- lli.nodeid = dev_to_node(adap->pdev_dev);
-
- handle = ulds[uld].add(&lli);
- if (IS_ERR(handle)) {
- dev_warn(adap->pdev_dev,
- "could not attach to the %s driver, error %ld\n",
- uld_str[uld], PTR_ERR(handle));
- return;
- }
-
- adap->uld_handle[uld] = handle;
-
if (!netevent_registered) {
register_netevent_notifier(&cxgb4_netevent_nb);
netevent_registered = true;
}
-
- if (adap->flags & FULL_INIT_DONE)
- ulds[uld].state_change(handle, CXGB4_STATE_UP);
-}
-
-static void attach_ulds(struct adapter *adap)
-{
- unsigned int i;
-
- spin_lock(&adap_rcu_lock);
- list_add_tail_rcu(&adap->rcu_node, &adap_rcu_list);
- spin_unlock(&adap_rcu_lock);
-
- mutex_lock(&uld_mutex);
- list_add_tail(&adap->list_node, &adapter_list);
- for (i = 0; i < CXGB4_ULD_MAX; i++)
- if (ulds[i].add)
- uld_attach(adap, i);
- mutex_unlock(&uld_mutex);
}
static void detach_ulds(struct adapter *adap)
@@ -2561,12 +2091,6 @@
mutex_lock(&uld_mutex);
list_del(&adap->list_node);
for (i = 0; i < CXGB4_ULD_MAX; i++)
- if (adap->uld_handle[i]) {
- ulds[i].state_change(adap->uld_handle[i],
- CXGB4_STATE_DETACH);
- adap->uld_handle[i] = NULL;
- }
- for (i = 0; i < CXGB4_PCI_ULD_MAX; i++)
if (adap->uld && adap->uld[i].handle) {
adap->uld[i].state_change(adap->uld[i].handle,
CXGB4_STATE_DETACH);
@@ -2577,10 +2101,6 @@
netevent_registered = false;
}
mutex_unlock(&uld_mutex);
-
- spin_lock(&adap_rcu_lock);
- list_del_rcu(&adap->rcu_node);
- spin_unlock(&adap_rcu_lock);
}
static void notify_ulds(struct adapter *adap, enum cxgb4_state new_state)
@@ -2589,65 +2109,12 @@
mutex_lock(&uld_mutex);
for (i = 0; i < CXGB4_ULD_MAX; i++)
- if (adap->uld_handle[i])
- ulds[i].state_change(adap->uld_handle[i], new_state);
- for (i = 0; i < CXGB4_PCI_ULD_MAX; i++)
if (adap->uld && adap->uld[i].handle)
adap->uld[i].state_change(adap->uld[i].handle,
new_state);
mutex_unlock(&uld_mutex);
}
-/**
- * cxgb4_register_uld - register an upper-layer driver
- * @type: the ULD type
- * @p: the ULD methods
- *
- * Registers an upper-layer driver with this driver and notifies the ULD
- * about any presently available devices that support its type. Returns
- * %-EBUSY if a ULD of the same type is already registered.
- */
-int cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p)
-{
- int ret = 0;
- struct adapter *adap;
-
- if (type >= CXGB4_ULD_MAX)
- return -EINVAL;
- mutex_lock(&uld_mutex);
- if (ulds[type].add) {
- ret = -EBUSY;
- goto out;
- }
- ulds[type] = *p;
- list_for_each_entry(adap, &adapter_list, list_node)
- uld_attach(adap, type);
-out: mutex_unlock(&uld_mutex);
- return ret;
-}
-EXPORT_SYMBOL(cxgb4_register_uld);
-
-/**
- * cxgb4_unregister_uld - unregister an upper-layer driver
- * @type: the ULD type
- *
- * Unregisters an existing upper-layer driver.
- */
-int cxgb4_unregister_uld(enum cxgb4_uld type)
-{
- struct adapter *adap;
-
- if (type >= CXGB4_ULD_MAX)
- return -EINVAL;
- mutex_lock(&uld_mutex);
- list_for_each_entry(adap, &adapter_list, list_node)
- adap->uld_handle[type] = NULL;
- ulds[type].add = NULL;
- mutex_unlock(&uld_mutex);
- return 0;
-}
-EXPORT_SYMBOL(cxgb4_unregister_uld);
-
#if IS_ENABLED(CONFIG_IPV6)
static int cxgb4_inet6addr_handler(struct notifier_block *this,
unsigned long event, void *data)
@@ -2752,7 +2219,6 @@
adap->msix_info[0].desc, adap);
if (err)
goto irq_err;
-
err = request_msix_queue_irqs(adap);
if (err) {
free_irq(adap->msix_info[0].vec, adap);
@@ -2830,40 +2296,6 @@
return t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
}
-/* Return an error number if the indicated filter isn't writable ...
- */
-static int writable_filter(struct filter_entry *f)
-{
- if (f->locked)
- return -EPERM;
- if (f->pending)
- return -EBUSY;
-
- return 0;
-}
-
-/* Delete the filter at the specified index (if valid). The checks for all
- * the common problems with doing this like the filter being locked, currently
- * pending in another operation, etc.
- */
-static int delete_filter(struct adapter *adapter, unsigned int fidx)
-{
- struct filter_entry *f;
- int ret;
-
- if (fidx >= adapter->tids.nftids + adapter->tids.nsftids)
- return -EINVAL;
-
- f = &adapter->tids.ftid_tab[fidx];
- ret = writable_filter(f);
- if (ret)
- return ret;
- if (f->valid)
- return del_filter_wr(adapter, fidx);
-
- return 0;
-}
-
int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid,
__be32 sip, __be16 sport, __be16 vlan,
unsigned int queue, unsigned char port, unsigned char mask)
@@ -3280,6 +2712,35 @@
return err;
}
+int cxgb_setup_tc(struct net_device *dev, u32 handle, __be16 proto,
+ struct tc_to_netdev *tc)
+{
+ struct port_info *pi = netdev2pinfo(dev);
+ struct adapter *adap = netdev2adap(dev);
+
+ if (!(adap->flags & FULL_INIT_DONE)) {
+ dev_err(adap->pdev_dev,
+ "Failed to setup tc on port %d. Link Down?\n",
+ pi->port_id);
+ return -EINVAL;
+ }
+
+ if (TC_H_MAJ(handle) == TC_H_MAJ(TC_H_INGRESS) &&
+ tc->type == TC_SETUP_CLSU32) {
+ switch (tc->cls_u32->command) {
+ case TC_CLSU32_NEW_KNODE:
+ case TC_CLSU32_REPLACE_KNODE:
+ return cxgb4_config_knode(dev, proto, tc->cls_u32);
+ case TC_CLSU32_DELETE_KNODE:
+ return cxgb4_delete_knode(dev, proto, tc->cls_u32);
+ default:
+ return -EOPNOTSUPP;
+ }
+ }
+
+ return -EOPNOTSUPP;
+}
+
static const struct net_device_ops cxgb4_netdev_ops = {
.ndo_open = cxgb_open,
.ndo_stop = cxgb_close,
@@ -3303,6 +2764,7 @@
.ndo_busy_poll = cxgb_busy_poll,
#endif
.ndo_set_tx_maxrate = cxgb_set_tx_maxrate,
+ .ndo_setup_tc = cxgb_setup_tc,
};
#ifdef CONFIG_PCI_IOV
@@ -4262,6 +3724,7 @@
adap->params.ofldq_wr_cred = val[5];
adap->params.offload = 1;
+ adap->num_ofld_uld += 1;
}
if (caps_cmd.rdmacaps) {
params[0] = FW_PARAM_PFVF(STAG_START);
@@ -4314,6 +3777,7 @@
"max_ordird_qp %d max_ird_adapter %d\n",
adap->params.max_ordird_qp,
adap->params.max_ird_adapter);
+ adap->num_ofld_uld += 2;
}
if (caps_cmd.iscsicaps) {
params[0] = FW_PARAM_PFVF(ISCSI_START);
@@ -4324,6 +3788,8 @@
goto bye;
adap->vres.iscsi.start = val[0];
adap->vres.iscsi.size = val[1] - val[0] + 1;
+ /* LIO target and cxgb4i initiaitor */
+ adap->num_ofld_uld += 2;
}
if (caps_cmd.cryptocaps) {
/* Should query params here...TODO */
@@ -4505,10 +3971,17 @@
.resume = eeh_resume,
};
+/* Return true if the Link Configuration supports "High Speeds" (those greater
+ * than 1Gb/s).
+ */
static inline bool is_x_10g_port(const struct link_config *lc)
{
- return (lc->supported & FW_PORT_CAP_SPEED_10G) != 0 ||
- (lc->supported & FW_PORT_CAP_SPEED_40G) != 0;
+ unsigned int speeds, high_speeds;
+
+ speeds = FW_PORT_CAP_SPEED_V(FW_PORT_CAP_SPEED_G(lc->supported));
+ high_speeds = speeds & ~(FW_PORT_CAP_SPEED_100M | FW_PORT_CAP_SPEED_1G);
+
+ return high_speeds != 0;
}
/*
@@ -4523,14 +3996,14 @@
#ifndef CONFIG_CHELSIO_T4_DCB
int q10g = 0;
#endif
- int ciq_size;
/* Reduce memory usage in kdump environment, disable all offload.
*/
if (is_kdump_kernel()) {
adap->params.offload = 0;
adap->params.crypto = 0;
- } else if (adap->num_uld && uld_mem_alloc(adap)) {
+ } else if (is_uld(adap) && t4_uld_mem_alloc(adap)) {
+ adap->params.offload = 0;
adap->params.crypto = 0;
}
@@ -4576,33 +4049,18 @@
s->ethqsets = qidx;
s->max_ethqsets = qidx; /* MSI-X may lower it later */
- if (is_offload(adap)) {
+ if (is_uld(adap)) {
/*
* For offload we use 1 queue/channel if all ports are up to 1G,
* otherwise we divide all available queues amongst the channels
* capped by the number of available cores.
*/
if (n10g) {
- i = min_t(int, ARRAY_SIZE(s->iscsirxq),
- num_online_cpus());
- s->iscsiqsets = roundup(i, adap->params.nports);
- } else
- s->iscsiqsets = adap->params.nports;
- /* For RDMA one Rx queue per channel suffices */
- s->rdmaqs = adap->params.nports;
- /* Try and allow at least 1 CIQ per cpu rounding down
- * to the number of ports, with a minimum of 1 per port.
- * A 2 port card in a 6 cpu system: 6 CIQs, 3 / port.
- * A 4 port card in a 6 cpu system: 4 CIQs, 1 / port.
- * A 4 port card in a 2 cpu system: 4 CIQs, 1 / port.
- */
- s->rdmaciqs = min_t(int, MAX_RDMA_CIQS, num_online_cpus());
- s->rdmaciqs = (s->rdmaciqs / adap->params.nports) *
- adap->params.nports;
- s->rdmaciqs = max_t(int, s->rdmaciqs, adap->params.nports);
-
- if (!is_t4(adap->params.chip))
- s->niscsitq = s->iscsiqsets;
+ i = num_online_cpus();
+ s->ofldqsets = roundup(i, adap->params.nports);
+ } else {
+ s->ofldqsets = adap->params.nports;
+ }
}
for (i = 0; i < ARRAY_SIZE(s->ethrxq); i++) {
@@ -4621,47 +4079,8 @@
for (i = 0; i < ARRAY_SIZE(s->ofldtxq); i++)
s->ofldtxq[i].q.size = 1024;
- for (i = 0; i < ARRAY_SIZE(s->iscsirxq); i++) {
- struct sge_ofld_rxq *r = &s->iscsirxq[i];
-
- init_rspq(adap, &r->rspq, 5, 1, 1024, 64);
- r->rspq.uld = CXGB4_ULD_ISCSI;
- r->fl.size = 72;
- }
-
- if (!is_t4(adap->params.chip)) {
- for (i = 0; i < ARRAY_SIZE(s->iscsitrxq); i++) {
- struct sge_ofld_rxq *r = &s->iscsitrxq[i];
-
- init_rspq(adap, &r->rspq, 5, 1, 1024, 64);
- r->rspq.uld = CXGB4_ULD_ISCSIT;
- r->fl.size = 72;
- }
- }
-
- for (i = 0; i < ARRAY_SIZE(s->rdmarxq); i++) {
- struct sge_ofld_rxq *r = &s->rdmarxq[i];
-
- init_rspq(adap, &r->rspq, 5, 1, 511, 64);
- r->rspq.uld = CXGB4_ULD_RDMA;
- r->fl.size = 72;
- }
-
- ciq_size = 64 + adap->vres.cq.size + adap->tids.nftids;
- if (ciq_size > SGE_MAX_IQ_SIZE) {
- CH_WARN(adap, "CIQ size too small for available IQs\n");
- ciq_size = SGE_MAX_IQ_SIZE;
- }
-
- for (i = 0; i < ARRAY_SIZE(s->rdmaciq); i++) {
- struct sge_ofld_rxq *r = &s->rdmaciq[i];
-
- init_rspq(adap, &r->rspq, 5, 1, ciq_size, 64);
- r->rspq.uld = CXGB4_ULD_RDMA;
- }
-
init_rspq(adap, &s->fw_evtq, 0, 1, 1024, 64);
- init_rspq(adap, &s->intrq, 0, 1, 2 * MAX_INGQ, 64);
+ init_rspq(adap, &s->intrq, 0, 1, 512, 64);
}
/*
@@ -4695,7 +4114,15 @@
static int get_msix_info(struct adapter *adap)
{
struct uld_msix_info *msix_info;
- int max_ingq = (MAX_OFLD_QSETS * adap->num_uld);
+ unsigned int max_ingq = 0;
+
+ if (is_offload(adap))
+ max_ingq += MAX_OFLD_QSETS * adap->num_ofld_uld;
+ if (is_pci_uld(adap))
+ max_ingq += MAX_OFLD_QSETS * adap->num_uld;
+
+ if (!max_ingq)
+ goto out;
msix_info = kcalloc(max_ingq, sizeof(*msix_info), GFP_KERNEL);
if (!msix_info)
@@ -4709,12 +4136,13 @@
}
spin_lock_init(&adap->msix_bmap_ulds.lock);
adap->msix_info_ulds = msix_info;
+out:
return 0;
}
static void free_msix_info(struct adapter *adap)
{
- if (!adap->num_uld)
+ if (!(adap->num_uld && adap->num_ofld_uld))
return;
kfree(adap->msix_info_ulds);
@@ -4733,32 +4161,32 @@
struct msix_entry *entries;
int max_ingq = MAX_INGQ;
- max_ingq += (MAX_OFLD_QSETS * adap->num_uld);
+ if (is_pci_uld(adap))
+ max_ingq += (MAX_OFLD_QSETS * adap->num_uld);
+ if (is_offload(adap))
+ max_ingq += (MAX_OFLD_QSETS * adap->num_ofld_uld);
entries = kmalloc(sizeof(*entries) * (max_ingq + 1),
GFP_KERNEL);
if (!entries)
return -ENOMEM;
/* map for msix */
- if (is_pci_uld(adap) && get_msix_info(adap))
+ if (get_msix_info(adap)) {
+ adap->params.offload = 0;
adap->params.crypto = 0;
+ }
for (i = 0; i < max_ingq + 1; ++i)
entries[i].entry = i;
want = s->max_ethqsets + EXTRA_VECS;
if (is_offload(adap)) {
- want += s->rdmaqs + s->rdmaciqs + s->iscsiqsets +
- s->niscsitq;
- /* need nchan for each possible ULD */
- if (is_t4(adap->params.chip))
- ofld_need = 3 * nchan;
- else
- ofld_need = 4 * nchan;
+ want += adap->num_ofld_uld * s->ofldqsets;
+ ofld_need = adap->num_ofld_uld * nchan;
}
if (is_pci_uld(adap)) {
- want += netif_get_num_default_rss_queues() * nchan;
- uld_need = nchan;
+ want += adap->num_uld * s->ofldqsets;
+ uld_need = adap->num_uld * nchan;
}
#ifdef CONFIG_CHELSIO_T4_DCB
/* For Data Center Bridging we need 8 Ethernet TX Priority Queues for
@@ -4786,43 +4214,25 @@
if (i < s->ethqsets)
reduce_ethqs(adap, i);
}
- if (is_pci_uld(adap)) {
+ if (is_uld(adap)) {
if (allocated < want)
s->nqs_per_uld = nchan;
else
- s->nqs_per_uld = netif_get_num_default_rss_queues() *
- nchan;
+ s->nqs_per_uld = s->ofldqsets;
}
- if (is_offload(adap)) {
- if (allocated < want) {
- s->rdmaqs = nchan;
- s->rdmaciqs = nchan;
-
- if (!is_t4(adap->params.chip))
- s->niscsitq = nchan;
- }
-
- /* leftovers go to OFLD */
- i = allocated - EXTRA_VECS - s->max_ethqsets -
- s->rdmaqs - s->rdmaciqs - s->niscsitq;
- if (is_pci_uld(adap))
- i -= s->nqs_per_uld * adap->num_uld;
- s->iscsiqsets = (i / nchan) * nchan; /* round down */
-
- }
-
- for (i = 0; i < (allocated - (s->nqs_per_uld * adap->num_uld)); ++i)
+ for (i = 0; i < (s->max_ethqsets + EXTRA_VECS); ++i)
adap->msix_info[i].vec = entries[i].vector;
- if (is_pci_uld(adap)) {
- for (j = 0 ; i < allocated; ++i, j++)
+ if (is_uld(adap)) {
+ for (j = 0 ; i < allocated; ++i, j++) {
adap->msix_info_ulds[j].vec = entries[i].vector;
+ adap->msix_info_ulds[j].idx = i;
+ }
adap->msix_bmap_ulds.mapsize = j;
}
dev_info(adap->pdev_dev, "%d MSI-X vectors allocated, "
- "nic %d iscsi %d rdma cpl %d rdma ciq %d uld %d\n",
- allocated, s->max_ethqsets, s->iscsiqsets, s->rdmaqs,
- s->rdmaciqs, s->nqs_per_uld);
+ "nic %d per uld %d\n",
+ allocated, s->max_ethqsets, s->nqs_per_uld);
kfree(entries);
return 0;
@@ -5005,8 +4415,12 @@
bufp += sprintf(bufp, "1000/");
if (pi->link_cfg.supported & FW_PORT_CAP_SPEED_10G)
bufp += sprintf(bufp, "10G/");
+ if (pi->link_cfg.supported & FW_PORT_CAP_SPEED_25G)
+ bufp += sprintf(bufp, "25G/");
if (pi->link_cfg.supported & FW_PORT_CAP_SPEED_40G)
bufp += sprintf(bufp, "40G/");
+ if (pi->link_cfg.supported & FW_PORT_CAP_SPEED_100G)
+ bufp += sprintf(bufp, "100G/");
if (bufp != buf)
--bufp;
sprintf(bufp, "BASE-%s", t4_get_port_type_description(pi->port_type));
@@ -5034,6 +4448,7 @@
t4_free_mem(adapter->l2t);
t4_cleanup_sched(adapter);
t4_free_mem(adapter->tids.tid_tab);
+ cxgb4_cleanup_tc_u32(adapter);
kfree(adapter->sge.egr_map);
kfree(adapter->sge.ingr_map);
kfree(adapter->sge.starving_fl);
@@ -5378,7 +4793,8 @@
netdev->hw_features = NETIF_F_SG | TSO_FLAGS |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
NETIF_F_RXCSUM | NETIF_F_RXHASH |
- NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
+ NETIF_F_HW_TC;
if (highdma)
netdev->hw_features |= NETIF_F_HIGHDMA;
netdev->features |= netdev->hw_features;
@@ -5462,10 +4878,16 @@
i);
}
- if (is_offload(adapter) && tid_init(&adapter->tids) < 0) {
+ if (tid_init(&adapter->tids) < 0) {
dev_warn(&pdev->dev, "could not allocate TID table, "
"continuing\n");
adapter->params.offload = 0;
+ } else {
+ adapter->tc_u32 = cxgb4_init_tc_u32(adapter,
+ CXGB4_MAX_LINK_HANDLE);
+ if (!adapter->tc_u32)
+ dev_warn(&pdev->dev,
+ "could not offload tc u32, continuing\n");
}
if (is_offload(adapter)) {
@@ -5535,10 +4957,14 @@
/* PCIe EEH recovery on powerpc platforms needs fundamental reset */
pdev->needs_freset = 1;
- if (is_offload(adapter))
- attach_ulds(adapter);
+ if (is_uld(adapter)) {
+ mutex_lock(&uld_mutex);
+ list_add_tail(&adapter->list_node, &adapter_list);
+ mutex_unlock(&uld_mutex);
+ }
print_adapter_info(adapter);
+ setup_fw_sge_queues(adapter);
return 0;
sriov:
@@ -5593,8 +5019,8 @@
free_some_resources(adapter);
if (adapter->flags & USING_MSIX)
free_msix_info(adapter);
- if (adapter->num_uld)
- uld_mem_free(adapter);
+ if (adapter->num_uld || adapter->num_ofld_uld)
+ t4_uld_mem_free(adapter);
out_unmap_bar:
if (!is_t4(adapter->params.chip))
iounmap(adapter->bar2);
@@ -5631,7 +5057,7 @@
*/
destroy_workqueue(adapter->workq);
- if (is_offload(adapter))
+ if (is_uld(adapter))
detach_ulds(adapter);
disable_interrupts(adapter);
@@ -5645,21 +5071,15 @@
/* If we allocated filters, free up state associated with any
* valid filters ...
*/
- if (adapter->tids.ftid_tab) {
- struct filter_entry *f = &adapter->tids.ftid_tab[0];
- for (i = 0; i < (adapter->tids.nftids +
- adapter->tids.nsftids); i++, f++)
- if (f->valid)
- clear_filter(adapter, f);
- }
+ clear_all_filters(adapter);
if (adapter->flags & FULL_INIT_DONE)
cxgb_down(adapter);
if (adapter->flags & USING_MSIX)
free_msix_info(adapter);
- if (adapter->num_uld)
- uld_mem_free(adapter);
+ if (adapter->num_uld || adapter->num_ofld_uld)
+ t4_uld_mem_free(adapter);
free_some_resources(adapter);
#if IS_ENABLED(CONFIG_IPV6)
t4_cleanup_clip_tbl(adapter);
@@ -5690,12 +5110,58 @@
#endif
}
+/* "Shutdown" quiesces the device, stopping Ingress Packet and Interrupt
+ * delivery. This is essentially a stripped down version of the PCI remove()
+ * function where we do the minimal amount of work necessary to shutdown any
+ * further activity.
+ */
+static void shutdown_one(struct pci_dev *pdev)
+{
+ struct adapter *adapter = pci_get_drvdata(pdev);
+
+ /* As with remove_one() above (see extended comment), we only want do
+ * do cleanup on PCI Devices which went all the way through init_one()
+ * ...
+ */
+ if (!adapter) {
+ pci_release_regions(pdev);
+ return;
+ }
+
+ if (adapter->pf == 4) {
+ int i;
+
+ for_each_port(adapter, i)
+ if (adapter->port[i]->reg_state == NETREG_REGISTERED)
+ cxgb_close(adapter->port[i]);
+
+ t4_uld_clean_up(adapter);
+ disable_interrupts(adapter);
+ disable_msi(adapter);
+
+ t4_sge_stop(adapter);
+ if (adapter->flags & FW_OK)
+ t4_fw_bye(adapter, adapter->mbox);
+ }
+#ifdef CONFIG_PCI_IOV
+ else {
+ if (adapter->port[0])
+ unregister_netdev(adapter->port[0]);
+ iounmap(adapter->regs);
+ kfree(adapter->vfinfo);
+ kfree(adapter);
+ pci_disable_sriov(pdev);
+ pci_release_regions(pdev);
+ }
+#endif
+}
+
static struct pci_driver cxgb4_driver = {
.name = KBUILD_MODNAME,
.id_table = cxgb4_pci_tbl,
.probe = init_one,
.remove = remove_one,
- .shutdown = remove_one,
+ .shutdown = shutdown_one,
#ifdef CONFIG_PCI_IOV
.sriov_configure = cxgb4_iov_configure,
#endif
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
new file mode 100644
index 0000000..49d2deb
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
@@ -0,0 +1,483 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
+#include "cxgb4.h"
+#include "cxgb4_tc_u32_parse.h"
+#include "cxgb4_tc_u32.h"
+
+/* Fill ch_filter_specification with parsed match value/mask pair. */
+static int fill_match_fields(struct adapter *adap,
+ struct ch_filter_specification *fs,
+ struct tc_cls_u32_offload *cls,
+ const struct cxgb4_match_field *entry,
+ bool next_header)
+{
+ unsigned int i, j;
+ u32 val, mask;
+ int off, err;
+ bool found;
+
+ for (i = 0; i < cls->knode.sel->nkeys; i++) {
+ off = cls->knode.sel->keys[i].off;
+ val = cls->knode.sel->keys[i].val;
+ mask = cls->knode.sel->keys[i].mask;
+
+ if (next_header) {
+ /* For next headers, parse only keys with offmask */
+ if (!cls->knode.sel->keys[i].offmask)
+ continue;
+ } else {
+ /* For the remaining, parse only keys without offmask */
+ if (cls->knode.sel->keys[i].offmask)
+ continue;
+ }
+
+ found = false;
+
+ for (j = 0; entry[j].val; j++) {
+ if (off == entry[j].off) {
+ found = true;
+ err = entry[j].val(fs, val, mask);
+ if (err)
+ return err;
+ break;
+ }
+ }
+
+ if (!found)
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/* Fill ch_filter_specification with parsed action. */
+static int fill_action_fields(struct adapter *adap,
+ struct ch_filter_specification *fs,
+ struct tc_cls_u32_offload *cls)
+{
+ unsigned int num_actions = 0;
+ const struct tc_action *a;
+ struct tcf_exts *exts;
+ LIST_HEAD(actions);
+
+ exts = cls->knode.exts;
+ if (tc_no_actions(exts))
+ return -EINVAL;
+
+ tcf_exts_to_list(exts, &actions);
+ list_for_each_entry(a, &actions, list) {
+ /* Don't allow more than one action per rule. */
+ if (num_actions)
+ return -EINVAL;
+
+ /* Drop in hardware. */
+ if (is_tcf_gact_shot(a)) {
+ fs->action = FILTER_DROP;
+ num_actions++;
+ continue;
+ }
+
+ /* Re-direct to specified port in hardware. */
+ if (is_tcf_mirred_redirect(a)) {
+ struct net_device *n_dev;
+ unsigned int i, index;
+ bool found = false;
+
+ index = tcf_mirred_ifindex(a);
+ for_each_port(adap, i) {
+ n_dev = adap->port[i];
+ if (index == n_dev->ifindex) {
+ fs->action = FILTER_SWITCH;
+ fs->eport = i;
+ found = true;
+ break;
+ }
+ }
+
+ /* Interface doesn't belong to any port of
+ * the underlying hardware.
+ */
+ if (!found)
+ return -EINVAL;
+
+ num_actions++;
+ continue;
+ }
+
+ /* Un-supported action. */
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
+ struct tc_cls_u32_offload *cls)
+{
+ const struct cxgb4_match_field *start, *link_start = NULL;
+ struct adapter *adapter = netdev2adap(dev);
+ struct ch_filter_specification fs;
+ struct cxgb4_tc_u32_table *t;
+ struct cxgb4_link *link;
+ unsigned int filter_id;
+ u32 uhtid, link_uhtid;
+ bool is_ipv6 = false;
+ int ret;
+
+ if (!can_tc_u32_offload(dev))
+ return -EOPNOTSUPP;
+
+ if (protocol != htons(ETH_P_IP) && protocol != htons(ETH_P_IPV6))
+ return -EOPNOTSUPP;
+
+ /* Fetch the location to insert the filter. */
+ filter_id = cls->knode.handle & 0xFFFFF;
+
+ if (filter_id > adapter->tids.nftids) {
+ dev_err(adapter->pdev_dev,
+ "Location %d out of range for insertion. Max: %d\n",
+ filter_id, adapter->tids.nftids);
+ return -ERANGE;
+ }
+
+ t = adapter->tc_u32;
+ uhtid = TC_U32_USERHTID(cls->knode.handle);
+ link_uhtid = TC_U32_USERHTID(cls->knode.link_handle);
+
+ /* Ensure that uhtid is either root u32 (i.e. 0x800)
+ * or a a valid linked bucket.
+ */
+ if (uhtid != 0x800 && uhtid >= t->size)
+ return -EINVAL;
+
+ /* Ensure link handle uhtid is sane, if specified. */
+ if (link_uhtid >= t->size)
+ return -EINVAL;
+
+ memset(&fs, 0, sizeof(fs));
+
+ if (protocol == htons(ETH_P_IPV6)) {
+ start = cxgb4_ipv6_fields;
+ is_ipv6 = true;
+ } else {
+ start = cxgb4_ipv4_fields;
+ is_ipv6 = false;
+ }
+
+ if (uhtid != 0x800) {
+ /* Link must exist from root node before insertion. */
+ if (!t->table[uhtid - 1].link_handle)
+ return -EINVAL;
+
+ /* Link must have a valid supported next header. */
+ link_start = t->table[uhtid - 1].match_field;
+ if (!link_start)
+ return -EINVAL;
+ }
+
+ /* Parse links and record them for subsequent jumps to valid
+ * next headers.
+ */
+ if (link_uhtid) {
+ const struct cxgb4_next_header *next;
+ bool found = false;
+ unsigned int i, j;
+ u32 val, mask;
+ int off;
+
+ if (t->table[link_uhtid - 1].link_handle) {
+ dev_err(adapter->pdev_dev,
+ "Link handle exists for: 0x%x\n",
+ link_uhtid);
+ return -EINVAL;
+ }
+
+ next = is_ipv6 ? cxgb4_ipv6_jumps : cxgb4_ipv4_jumps;
+
+ /* Try to find matches that allow jumps to next header. */
+ for (i = 0; next[i].jump; i++) {
+ if (next[i].offoff != cls->knode.sel->offoff ||
+ next[i].shift != cls->knode.sel->offshift ||
+ next[i].mask != cls->knode.sel->offmask ||
+ next[i].offset != cls->knode.sel->off)
+ continue;
+
+ /* Found a possible candidate. Find a key that
+ * matches the corresponding offset, value, and
+ * mask to jump to next header.
+ */
+ for (j = 0; j < cls->knode.sel->nkeys; j++) {
+ off = cls->knode.sel->keys[j].off;
+ val = cls->knode.sel->keys[j].val;
+ mask = cls->knode.sel->keys[j].mask;
+
+ if (next[i].match_off == off &&
+ next[i].match_val == val &&
+ next[i].match_mask == mask) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found)
+ continue; /* Try next candidate. */
+
+ /* Candidate to jump to next header found.
+ * Translate all keys to internal specification
+ * and store them in jump table. This spec is copied
+ * later to set the actual filters.
+ */
+ ret = fill_match_fields(adapter, &fs, cls,
+ start, false);
+ if (ret)
+ goto out;
+
+ link = &t->table[link_uhtid - 1];
+ link->match_field = next[i].jump;
+ link->link_handle = cls->knode.handle;
+ memcpy(&link->fs, &fs, sizeof(fs));
+ break;
+ }
+
+ /* No candidate found to jump to next header. */
+ if (!found)
+ return -EINVAL;
+
+ return 0;
+ }
+
+ /* Fill ch_filter_specification match fields to be shipped to hardware.
+ * Copy the linked spec (if any) first. And then update the spec as
+ * needed.
+ */
+ if (uhtid != 0x800 && t->table[uhtid - 1].link_handle) {
+ /* Copy linked ch_filter_specification */
+ memcpy(&fs, &t->table[uhtid - 1].fs, sizeof(fs));
+ ret = fill_match_fields(adapter, &fs, cls,
+ link_start, true);
+ if (ret)
+ goto out;
+ }
+
+ ret = fill_match_fields(adapter, &fs, cls, start, false);
+ if (ret)
+ goto out;
+
+ /* Fill ch_filter_specification action fields to be shipped to
+ * hardware.
+ */
+ ret = fill_action_fields(adapter, &fs, cls);
+ if (ret)
+ goto out;
+
+ /* The filter spec has been completely built from the info
+ * provided from u32. We now set some default fields in the
+ * spec for sanity.
+ */
+
+ /* Match only packets coming from the ingress port where this
+ * filter will be created.
+ */
+ fs.val.iport = netdev2pinfo(dev)->port_id;
+ fs.mask.iport = ~0;
+
+ /* Enable filter hit counts. */
+ fs.hitcnts = 1;
+
+ /* Set type of filter - IPv6 or IPv4 */
+ fs.type = is_ipv6 ? 1 : 0;
+
+ /* Set the filter */
+ ret = cxgb4_set_filter(dev, filter_id, &fs);
+ if (ret)
+ goto out;
+
+ /* If this is a linked bucket, then set the corresponding
+ * entry in the bitmap to mark it as belonging to this linked
+ * bucket.
+ */
+ if (uhtid != 0x800 && t->table[uhtid - 1].link_handle)
+ set_bit(filter_id, t->table[uhtid - 1].tid_map);
+
+out:
+ return ret;
+}
+
+int cxgb4_delete_knode(struct net_device *dev, __be16 protocol,
+ struct tc_cls_u32_offload *cls)
+{
+ struct adapter *adapter = netdev2adap(dev);
+ unsigned int filter_id, max_tids, i, j;
+ struct cxgb4_link *link = NULL;
+ struct cxgb4_tc_u32_table *t;
+ u32 handle, uhtid;
+ int ret;
+
+ if (!can_tc_u32_offload(dev))
+ return -EOPNOTSUPP;
+
+ /* Fetch the location to delete the filter. */
+ filter_id = cls->knode.handle & 0xFFFFF;
+
+ if (filter_id > adapter->tids.nftids) {
+ dev_err(adapter->pdev_dev,
+ "Location %d out of range for deletion. Max: %d\n",
+ filter_id, adapter->tids.nftids);
+ return -ERANGE;
+ }
+
+ t = adapter->tc_u32;
+ handle = cls->knode.handle;
+ uhtid = TC_U32_USERHTID(cls->knode.handle);
+
+ /* Ensure that uhtid is either root u32 (i.e. 0x800)
+ * or a a valid linked bucket.
+ */
+ if (uhtid != 0x800 && uhtid >= t->size)
+ return -EINVAL;
+
+ /* Delete the specified filter */
+ if (uhtid != 0x800) {
+ link = &t->table[uhtid - 1];
+ if (!link->link_handle)
+ return -EINVAL;
+
+ if (!test_bit(filter_id, link->tid_map))
+ return -EINVAL;
+ }
+
+ ret = cxgb4_del_filter(dev, filter_id);
+ if (ret)
+ goto out;
+
+ if (link)
+ clear_bit(filter_id, link->tid_map);
+
+ /* If a link is being deleted, then delete all filters
+ * associated with the link.
+ */
+ max_tids = adapter->tids.nftids;
+ for (i = 0; i < t->size; i++) {
+ link = &t->table[i];
+
+ if (link->link_handle == handle) {
+ for (j = 0; j < max_tids; j++) {
+ if (!test_bit(j, link->tid_map))
+ continue;
+
+ ret = __cxgb4_del_filter(dev, j, NULL);
+ if (ret)
+ goto out;
+
+ clear_bit(j, link->tid_map);
+ }
+
+ /* Clear the link state */
+ link->match_field = NULL;
+ link->link_handle = 0;
+ memset(&link->fs, 0, sizeof(link->fs));
+ break;
+ }
+ }
+
+out:
+ return ret;
+}
+
+void cxgb4_cleanup_tc_u32(struct adapter *adap)
+{
+ struct cxgb4_tc_u32_table *t;
+ unsigned int i;
+
+ if (!adap->tc_u32)
+ return;
+
+ /* Free up all allocated memory. */
+ t = adap->tc_u32;
+ for (i = 0; i < t->size; i++) {
+ struct cxgb4_link *link = &t->table[i];
+
+ t4_free_mem(link->tid_map);
+ }
+ t4_free_mem(adap->tc_u32);
+}
+
+struct cxgb4_tc_u32_table *cxgb4_init_tc_u32(struct adapter *adap,
+ unsigned int size)
+{
+ struct cxgb4_tc_u32_table *t;
+ unsigned int i;
+
+ if (!size)
+ return NULL;
+
+ t = t4_alloc_mem(sizeof(*t) +
+ (size * sizeof(struct cxgb4_link)));
+ if (!t)
+ return NULL;
+
+ t->size = size;
+
+ for (i = 0; i < t->size; i++) {
+ struct cxgb4_link *link = &t->table[i];
+ unsigned int bmap_size;
+ unsigned int max_tids;
+
+ max_tids = adap->tids.nftids;
+ bmap_size = BITS_TO_LONGS(max_tids);
+ link->tid_map = t4_alloc_mem(sizeof(unsigned long) * bmap_size);
+ if (!link->tid_map)
+ goto out_no_mem;
+ bitmap_zero(link->tid_map, max_tids);
+ }
+
+ return t;
+
+out_no_mem:
+ for (i = 0; i < t->size; i++) {
+ struct cxgb4_link *link = &t->table[i];
+
+ if (link->tid_map)
+ t4_free_mem(link->tid_map);
+ }
+
+ if (t)
+ t4_free_mem(t);
+
+ return NULL;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h
new file mode 100644
index 0000000..6bdc885
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h
@@ -0,0 +1,57 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_U32_H
+#define __CXGB4_TC_U32_H
+
+#include <net/pkt_cls.h>
+
+#define CXGB4_MAX_LINK_HANDLE 32
+
+static inline bool can_tc_u32_offload(struct net_device *dev)
+{
+ struct adapter *adap = netdev2adap(dev);
+
+ return (dev->features & NETIF_F_HW_TC) && adap->tc_u32 ? true : false;
+}
+
+int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
+ struct tc_cls_u32_offload *cls);
+int cxgb4_delete_knode(struct net_device *dev, __be16 protocol,
+ struct tc_cls_u32_offload *cls);
+
+void cxgb4_cleanup_tc_u32(struct adapter *adapter);
+struct cxgb4_tc_u32_table *cxgb4_init_tc_u32(struct adapter *adap,
+ unsigned int size);
+#endif /* __CXGB4_TC_U32_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
new file mode 100644
index 0000000..a4b99ed
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
@@ -0,0 +1,294 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_U32_PARSE_H
+#define __CXGB4_TC_U32_PARSE_H
+
+struct cxgb4_match_field {
+ int off; /* Offset from the beginning of the header to match */
+ /* Fill the value/mask pair in the spec if matched */
+ int (*val)(struct ch_filter_specification *f, u32 val, u32 mask);
+};
+
+/* IPv4 match fields */
+static inline int cxgb4_fill_ipv4_tos(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ f->val.tos = (ntohl(val) >> 16) & 0x000000FF;
+ f->mask.tos = (ntohl(mask) >> 16) & 0x000000FF;
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv4_frag(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ u32 mask_val;
+ u8 frag_val;
+
+ frag_val = (ntohl(val) >> 13) & 0x00000007;
+ mask_val = ntohl(mask) & 0x0000FFFF;
+
+ if (frag_val == 0x1 && mask_val != 0x3FFF) { /* MF set */
+ f->val.frag = 1;
+ f->mask.frag = 1;
+ } else if (frag_val == 0x2 && mask_val != 0x3FFF) { /* DF set */
+ f->val.frag = 0;
+ f->mask.frag = 1;
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv4_proto(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ f->val.proto = (ntohl(val) >> 16) & 0x000000FF;
+ f->mask.proto = (ntohl(mask) >> 16) & 0x000000FF;
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv4_src_ip(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.fip[0], &val, sizeof(u32));
+ memcpy(&f->mask.fip[0], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv4_dst_ip(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.lip[0], &val, sizeof(u32));
+ memcpy(&f->mask.lip[0], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static const struct cxgb4_match_field cxgb4_ipv4_fields[] = {
+ { .off = 0, .val = cxgb4_fill_ipv4_tos },
+ { .off = 4, .val = cxgb4_fill_ipv4_frag },
+ { .off = 8, .val = cxgb4_fill_ipv4_proto },
+ { .off = 12, .val = cxgb4_fill_ipv4_src_ip },
+ { .off = 16, .val = cxgb4_fill_ipv4_dst_ip },
+ { .val = NULL }
+};
+
+/* IPv6 match fields */
+static inline int cxgb4_fill_ipv6_tos(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ f->val.tos = (ntohl(val) >> 20) & 0x000000FF;
+ f->mask.tos = (ntohl(mask) >> 20) & 0x000000FF;
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_proto(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ f->val.proto = (ntohl(val) >> 8) & 0x000000FF;
+ f->mask.proto = (ntohl(mask) >> 8) & 0x000000FF;
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip0(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.fip[0], &val, sizeof(u32));
+ memcpy(&f->mask.fip[0], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip1(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.fip[4], &val, sizeof(u32));
+ memcpy(&f->mask.fip[4], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip2(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.fip[8], &val, sizeof(u32));
+ memcpy(&f->mask.fip[8], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip3(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.fip[12], &val, sizeof(u32));
+ memcpy(&f->mask.fip[12], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip0(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.lip[0], &val, sizeof(u32));
+ memcpy(&f->mask.lip[0], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip1(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.lip[4], &val, sizeof(u32));
+ memcpy(&f->mask.lip[4], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip2(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.lip[8], &val, sizeof(u32));
+ memcpy(&f->mask.lip[8], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip3(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ memcpy(&f->val.lip[12], &val, sizeof(u32));
+ memcpy(&f->mask.lip[12], &mask, sizeof(u32));
+
+ return 0;
+}
+
+static const struct cxgb4_match_field cxgb4_ipv6_fields[] = {
+ { .off = 0, .val = cxgb4_fill_ipv6_tos },
+ { .off = 4, .val = cxgb4_fill_ipv6_proto },
+ { .off = 8, .val = cxgb4_fill_ipv6_src_ip0 },
+ { .off = 12, .val = cxgb4_fill_ipv6_src_ip1 },
+ { .off = 16, .val = cxgb4_fill_ipv6_src_ip2 },
+ { .off = 20, .val = cxgb4_fill_ipv6_src_ip3 },
+ { .off = 24, .val = cxgb4_fill_ipv6_dst_ip0 },
+ { .off = 28, .val = cxgb4_fill_ipv6_dst_ip1 },
+ { .off = 32, .val = cxgb4_fill_ipv6_dst_ip2 },
+ { .off = 36, .val = cxgb4_fill_ipv6_dst_ip3 },
+ { .val = NULL }
+};
+
+/* TCP/UDP match */
+static inline int cxgb4_fill_l4_ports(struct ch_filter_specification *f,
+ u32 val, u32 mask)
+{
+ f->val.fport = ntohl(val) >> 16;
+ f->mask.fport = ntohl(mask) >> 16;
+ f->val.lport = ntohl(val) & 0x0000FFFF;
+ f->mask.lport = ntohl(mask) & 0x0000FFFF;
+
+ return 0;
+};
+
+static const struct cxgb4_match_field cxgb4_tcp_fields[] = {
+ { .off = 0, .val = cxgb4_fill_l4_ports },
+ { .val = NULL }
+};
+
+static const struct cxgb4_match_field cxgb4_udp_fields[] = {
+ { .off = 0, .val = cxgb4_fill_l4_ports },
+ { .val = NULL }
+};
+
+struct cxgb4_next_header {
+ unsigned int offset; /* Offset to next header */
+ /* offset, shift, and mask added to offset above
+ * to get to next header. Useful when using a header
+ * field's value to jump to next header such as IHL field
+ * in IPv4 header.
+ */
+ unsigned int offoff;
+ u32 shift;
+ u32 mask;
+ /* match criteria to make this jump */
+ unsigned int match_off;
+ u32 match_val;
+ u32 match_mask;
+ /* location of jump to make */
+ const struct cxgb4_match_field *jump;
+};
+
+/* Accept a rule with a jump to transport layer header based on IHL field in
+ * IPv4 header.
+ */
+static const struct cxgb4_next_header cxgb4_ipv4_jumps[] = {
+ { .offset = 0, .offoff = 0, .shift = 6, .mask = 0xF,
+ .match_off = 8, .match_val = 0x600, .match_mask = 0xFF00,
+ .jump = cxgb4_tcp_fields },
+ { .offset = 0, .offoff = 0, .shift = 6, .mask = 0xF,
+ .match_off = 8, .match_val = 0x1100, .match_mask = 0xFF00,
+ .jump = cxgb4_udp_fields },
+ { .jump = NULL }
+};
+
+/* Accept a rule with a jump directly past the 40 Bytes of IPv6 fixed header
+ * to get to transport layer header.
+ */
+static const struct cxgb4_next_header cxgb4_ipv6_jumps[] = {
+ { .offset = 0x28, .offoff = 0, .shift = 0, .mask = 0,
+ .match_off = 4, .match_val = 0x60000, .match_mask = 0xFF0000,
+ .jump = cxgb4_tcp_fields },
+ { .offset = 0x28, .offoff = 0, .shift = 0, .mask = 0,
+ .match_off = 4, .match_val = 0x110000, .match_mask = 0xFF0000,
+ .jump = cxgb4_udp_fields },
+ { .jump = NULL }
+};
+
+struct cxgb4_link {
+ const struct cxgb4_match_field *match_field; /* Next header */
+ struct ch_filter_specification fs; /* Match spec associated with link */
+ u32 link_handle; /* Knode handle associated with the link */
+ unsigned long *tid_map; /* Bitmap for filter tids */
+};
+
+struct cxgb4_tc_u32_table {
+ unsigned int size; /* number of entries in table */
+ struct cxgb4_link table[0]; /* Jump table */
+};
+#endif /* __CXGB4_TC_U32_PARSE_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
index 5d402ba..b4b2d20 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
@@ -82,6 +82,24 @@
spin_unlock_irqrestore(&bmap->lock, flags);
}
+/* Flush the aggregated lro sessions */
+static void uldrx_flush_handler(struct sge_rspq *q)
+{
+ struct adapter *adap = q->adap;
+
+ if (adap->uld[q->uld].lro_flush)
+ adap->uld[q->uld].lro_flush(&q->lro_mgr);
+}
+
+/**
+ * uldrx_handler - response queue handler for ULD queues
+ * @q: the response queue that received the packet
+ * @rsp: the response queue descriptor holding the offload message
+ * @gl: the gather list of packet fragments
+ *
+ * Deliver an ingress offload packet to a ULD. All processing is done by
+ * the ULD, we just maintain statistics.
+ */
static int uldrx_handler(struct sge_rspq *q, const __be64 *rsp,
const struct pkt_gl *gl)
{
@@ -124,8 +142,8 @@
struct sge_ofld_rxq *q = rxq_info->uldrxq + offset;
unsigned short *ids = rxq_info->rspq_id + offset;
unsigned int per_chan = nq / adap->params.nports;
- unsigned int msi_idx, bmap_idx;
- int i, err;
+ unsigned int bmap_idx = 0;
+ int i, err, msi_idx;
if (adap->flags & USING_MSIX)
msi_idx = 1;
@@ -135,14 +153,14 @@
for (i = 0; i < nq; i++, q++) {
if (msi_idx >= 0) {
bmap_idx = get_msix_idx_from_bmap(adap);
- adap->msi_idx++;
+ msi_idx = adap->msix_info_ulds[bmap_idx].idx;
}
err = t4_sge_alloc_rxq(adap, &q->rspq, false,
adap->port[i / per_chan],
- adap->msi_idx,
+ msi_idx,
q->fl.size ? &q->fl : NULL,
uldrx_handler,
- NULL,
+ lro ? uldrx_flush_handler : NULL,
0);
if (err)
goto freeout;
@@ -159,7 +177,6 @@
if (q->rspq.desc)
free_rspq_fl(adap, &q->rspq,
q->fl.size ? &q->fl : NULL);
- adap->msi_idx--;
}
/* We need to free rxq also in case of ciq allocation failure */
@@ -169,26 +186,47 @@
if (q->rspq.desc)
free_rspq_fl(adap, &q->rspq,
q->fl.size ? &q->fl : NULL);
- adap->msi_idx--;
}
}
return err;
}
-int setup_sge_queues_uld(struct adapter *adap, unsigned int uld_type, bool lro)
+static int
+setup_sge_queues_uld(struct adapter *adap, unsigned int uld_type, bool lro)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
+ int i, ret = 0;
if (adap->flags & USING_MSIX) {
- rxq_info->msix_tbl = kzalloc(rxq_info->nrxq + rxq_info->nciq,
+ rxq_info->msix_tbl = kcalloc((rxq_info->nrxq + rxq_info->nciq),
+ sizeof(unsigned short),
GFP_KERNEL);
if (!rxq_info->msix_tbl)
return -ENOMEM;
}
- return !(!alloc_uld_rxqs(adap, rxq_info, rxq_info->nrxq, 0, lro) &&
+ ret = !(!alloc_uld_rxqs(adap, rxq_info, rxq_info->nrxq, 0, lro) &&
!alloc_uld_rxqs(adap, rxq_info, rxq_info->nciq,
rxq_info->nrxq, lro));
+
+ /* Tell uP to route control queue completions to rdma rspq */
+ if (adap->flags & FULL_INIT_DONE &&
+ !ret && uld_type == CXGB4_ULD_RDMA) {
+ struct sge *s = &adap->sge;
+ unsigned int cmplqid;
+ u32 param, cmdop;
+
+ cmdop = FW_PARAMS_PARAM_DMAQ_EQ_CMPLIQID_CTRL;
+ for_each_port(adap, i) {
+ cmplqid = rxq_info->uldrxq[i].rspq.cntxt_id;
+ param = (FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DMAQ) |
+ FW_PARAMS_PARAM_X_V(cmdop) |
+ FW_PARAMS_PARAM_YZ_V(s->ctrlq[i].q.cntxt_id));
+ ret = t4_set_params(adap, adap->mbox, adap->pf,
+ 0, 1, ¶m, &cmplqid);
+ }
+ }
+ return ret;
}
static void t4_free_uld_rxqs(struct adapter *adap, int n,
@@ -198,14 +236,28 @@
if (q->rspq.desc)
free_rspq_fl(adap, &q->rspq,
q->fl.size ? &q->fl : NULL);
- adap->msi_idx--;
}
}
-void free_sge_queues_uld(struct adapter *adap, unsigned int uld_type)
+static void free_sge_queues_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
+ if (adap->flags & FULL_INIT_DONE && uld_type == CXGB4_ULD_RDMA) {
+ struct sge *s = &adap->sge;
+ u32 param, cmdop, cmplqid = 0;
+ int i;
+
+ cmdop = FW_PARAMS_PARAM_DMAQ_EQ_CMPLIQID_CTRL;
+ for_each_port(adap, i) {
+ param = (FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DMAQ) |
+ FW_PARAMS_PARAM_X_V(cmdop) |
+ FW_PARAMS_PARAM_YZ_V(s->ctrlq[i].q.cntxt_id));
+ t4_set_params(adap, adap->mbox, adap->pf,
+ 0, 1, ¶m, &cmplqid);
+ }
+ }
+
if (rxq_info->nciq)
t4_free_uld_rxqs(adap, rxq_info->nciq,
rxq_info->uldrxq + rxq_info->nrxq);
@@ -214,27 +266,39 @@
kfree(rxq_info->msix_tbl);
}
-int cfg_queues_uld(struct adapter *adap, unsigned int uld_type,
- const struct cxgb4_pci_uld_info *uld_info)
+static int cfg_queues_uld(struct adapter *adap, unsigned int uld_type,
+ const struct cxgb4_uld_info *uld_info)
{
struct sge *s = &adap->sge;
struct sge_uld_rxq_info *rxq_info;
- int i, nrxq;
+ int i, nrxq, ciq_size;
rxq_info = kzalloc(sizeof(*rxq_info), GFP_KERNEL);
if (!rxq_info)
return -ENOMEM;
- if (uld_info->nrxq > s->nqs_per_uld)
- rxq_info->nrxq = s->nqs_per_uld;
- else
- rxq_info->nrxq = uld_info->nrxq;
- if (!uld_info->nciq)
+ if (adap->flags & USING_MSIX && uld_info->nrxq > s->nqs_per_uld) {
+ i = s->nqs_per_uld;
+ rxq_info->nrxq = roundup(i, adap->params.nports);
+ } else {
+ i = min_t(int, uld_info->nrxq,
+ num_online_cpus());
+ rxq_info->nrxq = roundup(i, adap->params.nports);
+ }
+ if (!uld_info->ciq) {
rxq_info->nciq = 0;
- else if (uld_info->nciq && uld_info->nciq > s->nqs_per_uld)
- rxq_info->nciq = s->nqs_per_uld;
- else
- rxq_info->nciq = uld_info->nciq;
+ } else {
+ if (adap->flags & USING_MSIX)
+ rxq_info->nciq = min_t(int, s->nqs_per_uld,
+ num_online_cpus());
+ else
+ rxq_info->nciq = min_t(int, MAX_OFLD_QSETS,
+ num_online_cpus());
+ rxq_info->nciq = ((rxq_info->nciq / adap->params.nports) *
+ adap->params.nports);
+ rxq_info->nciq = max_t(int, rxq_info->nciq,
+ adap->params.nports);
+ }
nrxq = rxq_info->nrxq + rxq_info->nciq; /* total rxq's */
rxq_info->uldrxq = kcalloc(nrxq, sizeof(struct sge_ofld_rxq),
@@ -245,7 +309,7 @@
}
rxq_info->rspq_id = kcalloc(nrxq, sizeof(unsigned short), GFP_KERNEL);
- if (!rxq_info->uldrxq) {
+ if (!rxq_info->rspq_id) {
kfree(rxq_info->uldrxq);
kfree(rxq_info);
return -ENOMEM;
@@ -259,12 +323,17 @@
r->fl.size = 72;
}
+ ciq_size = 64 + adap->vres.cq.size + adap->tids.nftids;
+ if (ciq_size > SGE_MAX_IQ_SIZE) {
+ dev_warn(adap->pdev_dev, "CIQ size too small for available IQs\n");
+ ciq_size = SGE_MAX_IQ_SIZE;
+ }
+
for (i = rxq_info->nrxq; i < nrxq; i++) {
struct sge_ofld_rxq *r = &rxq_info->uldrxq[i];
- init_rspq(adap, &r->rspq, 5, 1, uld_info->ciq_size, 64);
+ init_rspq(adap, &r->rspq, 5, 1, ciq_size, 64);
r->rspq.uld = uld_type;
- r->fl.size = 72;
}
memcpy(rxq_info->name, uld_info->name, IFNAMSIZ);
@@ -273,7 +342,7 @@
return 0;
}
-void free_queues_uld(struct adapter *adap, unsigned int uld_type)
+static void free_queues_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
@@ -282,10 +351,12 @@
kfree(rxq_info);
}
-int request_msix_queue_irqs_uld(struct adapter *adap, unsigned int uld_type)
+static int
+request_msix_queue_irqs_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
- int idx, bmap_idx, err = 0;
+ int err = 0;
+ unsigned int idx, bmap_idx;
for_each_uldrxq(rxq_info, idx) {
bmap_idx = rxq_info->msix_tbl[idx];
@@ -298,7 +369,7 @@
}
return 0;
unwind:
- while (--idx >= 0) {
+ while (idx-- > 0) {
bmap_idx = rxq_info->msix_tbl[idx];
free_msix_idx_in_bmap(adap, bmap_idx);
free_irq(adap->msix_info_ulds[bmap_idx].vec,
@@ -307,13 +378,14 @@
return err;
}
-void free_msix_queue_irqs_uld(struct adapter *adap, unsigned int uld_type)
+static void
+free_msix_queue_irqs_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
- int idx;
+ unsigned int idx, bmap_idx;
for_each_uldrxq(rxq_info, idx) {
- unsigned int bmap_idx = rxq_info->msix_tbl[idx];
+ bmap_idx = rxq_info->msix_tbl[idx];
free_msix_idx_in_bmap(adap, bmap_idx);
free_irq(adap->msix_info_ulds[bmap_idx].vec,
@@ -321,14 +393,14 @@
}
}
-void name_msix_vecs_uld(struct adapter *adap, unsigned int uld_type)
+static void name_msix_vecs_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
int n = sizeof(adap->msix_info_ulds[0].desc);
- int idx;
+ unsigned int idx, bmap_idx;
for_each_uldrxq(rxq_info, idx) {
- unsigned int bmap_idx = rxq_info->msix_tbl[idx];
+ bmap_idx = rxq_info->msix_tbl[idx];
snprintf(adap->msix_info_ulds[bmap_idx].desc, n, "%s-%s%d",
adap->port[0]->name, rxq_info->name, idx);
@@ -361,7 +433,7 @@
}
}
-void enable_rx_uld(struct adapter *adap, unsigned int uld_type)
+static void enable_rx_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
int idx;
@@ -370,7 +442,7 @@
enable_rx(adap, &rxq_info->uldrxq[idx].rspq);
}
-void quiesce_rx_uld(struct adapter *adap, unsigned int uld_type)
+static void quiesce_rx_uld(struct adapter *adap, unsigned int uld_type)
{
struct sge_uld_rxq_info *rxq_info = adap->sge.uld_rxq_info[uld_type];
int idx;
@@ -390,15 +462,15 @@
lli->nciq = rxq_info->nciq;
}
-int uld_mem_alloc(struct adapter *adap)
+int t4_uld_mem_alloc(struct adapter *adap)
{
struct sge *s = &adap->sge;
- adap->uld = kcalloc(adap->num_uld, sizeof(*adap->uld), GFP_KERNEL);
+ adap->uld = kcalloc(CXGB4_ULD_MAX, sizeof(*adap->uld), GFP_KERNEL);
if (!adap->uld)
return -ENOMEM;
- s->uld_rxq_info = kzalloc(adap->num_uld *
+ s->uld_rxq_info = kzalloc(CXGB4_ULD_MAX *
sizeof(struct sge_uld_rxq_info *),
GFP_KERNEL);
if (!s->uld_rxq_info)
@@ -410,7 +482,7 @@
return -ENOMEM;
}
-void uld_mem_free(struct adapter *adap)
+void t4_uld_mem_free(struct adapter *adap)
{
struct sge *s = &adap->sge;
@@ -418,6 +490,26 @@
kfree(adap->uld);
}
+void t4_uld_clean_up(struct adapter *adap)
+{
+ struct sge_uld_rxq_info *rxq_info;
+ unsigned int i;
+
+ if (!adap->uld)
+ return;
+ for (i = 0; i < CXGB4_ULD_MAX; i++) {
+ if (!adap->uld[i].handle)
+ continue;
+ rxq_info = adap->sge.uld_rxq_info[i];
+ if (adap->flags & FULL_INIT_DONE)
+ quiesce_rx_uld(adap, i);
+ if (adap->flags & USING_MSIX)
+ free_msix_queue_irqs_uld(adap, i);
+ free_sge_queues_uld(adap, i);
+ free_queues_uld(adap, i);
+ }
+}
+
static void uld_init(struct adapter *adap, struct cxgb4_lld_info *lld)
{
int i;
@@ -429,10 +521,15 @@
lld->ports = adap->port;
lld->vr = &adap->vres;
lld->mtus = adap->params.mtus;
- lld->ntxq = adap->sge.iscsiqsets;
+ lld->ntxq = adap->sge.ofldqsets;
lld->nchan = adap->params.nports;
lld->nports = adap->params.nports;
lld->wr_cred = adap->params.ofldq_wr_cred;
+ lld->iscsi_iolen = MAXRXDATA_G(t4_read_reg(adap, TP_PARA_REG2_A));
+ lld->iscsi_tagmask = t4_read_reg(adap, ULP_RX_ISCSI_TAGMASK_A);
+ lld->iscsi_pgsz_order = t4_read_reg(adap, ULP_RX_ISCSI_PSZ_A);
+ lld->iscsi_llimit = t4_read_reg(adap, ULP_RX_ISCSI_LLIMIT_A);
+ lld->iscsi_ppm = &adap->iscsi_ppm;
lld->adapter_type = adap->params.chip;
lld->cclk_ps = 1000000000 / adap->params.vpd.cclk;
lld->udb_density = 1 << adap->params.sge.eq_qpp;
@@ -472,23 +569,37 @@
}
adap->uld[uld].handle = handle;
+ t4_register_netevent_notifier();
if (adap->flags & FULL_INIT_DONE)
adap->uld[uld].state_change(handle, CXGB4_STATE_UP);
}
-int cxgb4_register_pci_uld(enum cxgb4_pci_uld type,
- struct cxgb4_pci_uld_info *p)
+/**
+ * cxgb4_register_uld - register an upper-layer driver
+ * @type: the ULD type
+ * @p: the ULD methods
+ *
+ * Registers an upper-layer driver with this driver and notifies the ULD
+ * about any presently available devices that support its type. Returns
+ * %-EBUSY if a ULD of the same type is already registered.
+ */
+int cxgb4_register_uld(enum cxgb4_uld type,
+ const struct cxgb4_uld_info *p)
{
int ret = 0;
+ unsigned int adap_idx = 0;
struct adapter *adap;
- if (type >= CXGB4_PCI_ULD_MAX)
+ if (type >= CXGB4_ULD_MAX)
return -EINVAL;
mutex_lock(&uld_mutex);
list_for_each_entry(adap, &adapter_list, list_node) {
- if (!is_pci_uld(adap))
+ if ((type == CXGB4_ULD_CRYPTO && !is_pci_uld(adap)) ||
+ (type != CXGB4_ULD_CRYPTO && !is_offload(adap)))
+ continue;
+ if (type == CXGB4_ULD_ISCSIT && is_t4(adap->params.chip))
continue;
ret = cfg_queues_uld(adap, type, p);
if (ret)
@@ -510,11 +621,14 @@
}
adap->uld[type] = *p;
uld_attach(adap, type);
+ adap_idx++;
}
mutex_unlock(&uld_mutex);
return 0;
free_irq:
+ if (adap->flags & FULL_INIT_DONE)
+ quiesce_rx_uld(adap, type);
if (adap->flags & USING_MSIX)
free_msix_queue_irqs_uld(adap, type);
free_rxq:
@@ -522,21 +636,49 @@
free_queues:
free_queues_uld(adap, type);
out:
+
+ list_for_each_entry(adap, &adapter_list, list_node) {
+ if ((type == CXGB4_ULD_CRYPTO && !is_pci_uld(adap)) ||
+ (type != CXGB4_ULD_CRYPTO && !is_offload(adap)))
+ continue;
+ if (type == CXGB4_ULD_ISCSIT && is_t4(adap->params.chip))
+ continue;
+ if (!adap_idx)
+ break;
+ adap->uld[type].handle = NULL;
+ adap->uld[type].add = NULL;
+ if (adap->flags & FULL_INIT_DONE)
+ quiesce_rx_uld(adap, type);
+ if (adap->flags & USING_MSIX)
+ free_msix_queue_irqs_uld(adap, type);
+ free_sge_queues_uld(adap, type);
+ free_queues_uld(adap, type);
+ adap_idx--;
+ }
mutex_unlock(&uld_mutex);
return ret;
}
-EXPORT_SYMBOL(cxgb4_register_pci_uld);
+EXPORT_SYMBOL(cxgb4_register_uld);
-int cxgb4_unregister_pci_uld(enum cxgb4_pci_uld type)
+/**
+ * cxgb4_unregister_uld - unregister an upper-layer driver
+ * @type: the ULD type
+ *
+ * Unregisters an existing upper-layer driver.
+ */
+int cxgb4_unregister_uld(enum cxgb4_uld type)
{
struct adapter *adap;
- if (type >= CXGB4_PCI_ULD_MAX)
+ if (type >= CXGB4_ULD_MAX)
return -EINVAL;
mutex_lock(&uld_mutex);
list_for_each_entry(adap, &adapter_list, list_node) {
- if (!is_pci_uld(adap))
+ if ((type == CXGB4_ULD_CRYPTO && !is_pci_uld(adap)) ||
+ (type != CXGB4_ULD_CRYPTO && !is_offload(adap)))
+ continue;
+ if (type == CXGB4_ULD_ISCSIT && is_t4(adap->params.chip))
continue;
adap->uld[type].handle = NULL;
adap->uld[type].add = NULL;
@@ -551,4 +693,4 @@
return 0;
}
-EXPORT_SYMBOL(cxgb4_unregister_pci_uld);
+EXPORT_SYMBOL(cxgb4_unregister_uld);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index ab40372..47bd14f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -1,7 +1,7 @@
/*
* This file is part of the Chelsio T4 Ethernet driver for Linux.
*
- * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved.
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -42,6 +42,8 @@
#include <linux/atomic.h>
#include "cxgb4.h"
+#define MAX_ULD_QSETS 16
+
/* CPL message priority levels */
enum {
CPL_PRIORITY_DATA = 0, /* data messages */
@@ -104,6 +106,7 @@
unsigned int atid_base;
struct filter_entry *ftid_tab;
+ unsigned long *ftid_bmap;
unsigned int nftids;
unsigned int ftid_base;
unsigned int aftid_base;
@@ -124,6 +127,8 @@
atomic_t tids_in_use;
/* TIDs in the HASH */
atomic_t hash_tids_in_use;
+ /* lock for setting/clearing filter bitmap */
+ spinlock_t ftid_lock;
};
static inline void *lookup_tid(const struct tid_info *t, unsigned int tid)
@@ -183,15 +188,38 @@
int cxgb4_remove_server_filter(const struct net_device *dev, unsigned int stid,
unsigned int queue, bool ipv6);
+/* Filter operation context to allow callers of cxgb4_set_filter() and
+ * cxgb4_del_filter() to wait for an asynchronous completion.
+ */
+struct filter_ctx {
+ struct completion completion; /* completion rendezvous */
+ void *closure; /* caller's opaque information */
+ int result; /* result of operation */
+ u32 tid; /* to store tid */
+};
+
+struct ch_filter_specification;
+
+int __cxgb4_set_filter(struct net_device *dev, int filter_id,
+ struct ch_filter_specification *fs,
+ struct filter_ctx *ctx);
+int __cxgb4_del_filter(struct net_device *dev, int filter_id,
+ struct filter_ctx *ctx);
+int cxgb4_set_filter(struct net_device *dev, int filter_id,
+ struct ch_filter_specification *fs);
+int cxgb4_del_filter(struct net_device *dev, int filter_id);
+
static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
{
skb_set_queue_mapping(skb, (queue << 1) | prio);
}
enum cxgb4_uld {
+ CXGB4_ULD_INIT,
CXGB4_ULD_RDMA,
CXGB4_ULD_ISCSI,
CXGB4_ULD_ISCSIT,
+ CXGB4_ULD_CRYPTO,
CXGB4_ULD_MAX
};
@@ -284,31 +312,11 @@
struct cxgb4_uld_info {
const char *name;
- void *(*add)(const struct cxgb4_lld_info *p);
- int (*rx_handler)(void *handle, const __be64 *rsp,
- const struct pkt_gl *gl);
- int (*state_change)(void *handle, enum cxgb4_state new_state);
- int (*control)(void *handle, enum cxgb4_control control, ...);
- int (*lro_rx_handler)(void *handle, const __be64 *rsp,
- const struct pkt_gl *gl,
- struct t4_lro_mgr *lro_mgr,
- struct napi_struct *napi);
- void (*lro_flush)(struct t4_lro_mgr *);
-};
-
-enum cxgb4_pci_uld {
- CXGB4_PCI_ULD1,
- CXGB4_PCI_ULD_MAX
-};
-
-struct cxgb4_pci_uld_info {
- const char *name;
- bool lro;
void *handle;
unsigned int nrxq;
- unsigned int nciq;
unsigned int rxq_size;
- unsigned int ciq_size;
+ bool ciq;
+ bool lro;
void *(*add)(const struct cxgb4_lld_info *p);
int (*rx_handler)(void *handle, const __be64 *rsp,
const struct pkt_gl *gl);
@@ -323,9 +331,6 @@
int cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p);
int cxgb4_unregister_uld(enum cxgb4_uld type);
-int cxgb4_register_pci_uld(enum cxgb4_pci_uld type,
- struct cxgb4_pci_uld_info *p);
-int cxgb4_unregister_pci_uld(enum cxgb4_pci_uld type);
int cxgb4_ofld_send(struct net_device *dev, struct sk_buff *skb);
unsigned int cxgb4_dbfifo_count(const struct net_device *dev, int lpfifo);
unsigned int cxgb4_port_chan(const struct net_device *dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 9a607db..1e74fd6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2860,6 +2860,18 @@
return 0;
}
+int t4_sge_mod_ctrl_txq(struct adapter *adap, unsigned int eqid,
+ unsigned int cmplqid)
+{
+ u32 param, val;
+
+ param = (FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DMAQ) |
+ FW_PARAMS_PARAM_X_V(FW_PARAMS_PARAM_DMAQ_EQ_CMPLIQID_CTRL) |
+ FW_PARAMS_PARAM_YZ_V(eqid));
+ val = cmplqid;
+ return t4_set_params(adap, adap->mbox, adap->pf, 0, 1, ¶m, &val);
+}
+
int t4_sge_alloc_ofld_txq(struct adapter *adap, struct sge_ofld_txq *txq,
struct net_device *dev, unsigned int iqid)
{
@@ -3014,12 +3026,6 @@
}
}
- /* clean up RDMA and iSCSI Rx queues */
- t4_free_ofld_rxqs(adap, adap->sge.iscsiqsets, adap->sge.iscsirxq);
- t4_free_ofld_rxqs(adap, adap->sge.niscsitq, adap->sge.iscsitrxq);
- t4_free_ofld_rxqs(adap, adap->sge.rdmaqs, adap->sge.rdmarxq);
- t4_free_ofld_rxqs(adap, adap->sge.rdmaciqs, adap->sge.rdmaciq);
-
/* clean up offload Tx queues */
for (i = 0; i < ARRAY_SIZE(adap->sge.ofldtxq); i++) {
struct sge_ofld_txq *q = &adap->sge.ofldtxq[i];
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 15be5432..20dec85 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -3627,7 +3627,8 @@
}
#define ADVERT_MASK (FW_PORT_CAP_SPEED_100M | FW_PORT_CAP_SPEED_1G |\
- FW_PORT_CAP_SPEED_10G | FW_PORT_CAP_SPEED_40G | \
+ FW_PORT_CAP_SPEED_10G | FW_PORT_CAP_SPEED_25G | \
+ FW_PORT_CAP_SPEED_40G | FW_PORT_CAP_SPEED_100G | \
FW_PORT_CAP_ANEG)
/**
@@ -7196,8 +7197,12 @@
speed = 1000;
else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_10G))
speed = 10000;
+ else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_25G))
+ speed = 25000;
else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_40G))
speed = 40000;
+ else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_100G))
+ speed = 100000;
lc = &pi->link_cfg;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
index ffe4bf4..4b58b32 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
@@ -2267,6 +2267,12 @@
FW_PORT_CAP_802_3_ASM_DIR = 0x8000,
};
+#define FW_PORT_CAP_SPEED_S 0
+#define FW_PORT_CAP_SPEED_M 0x3f
+#define FW_PORT_CAP_SPEED_V(x) ((x) << FW_PORT_CAP_SPEED_S)
+#define FW_PORT_CAP_SPEED_G(x) \
+ (((x) >> FW_PORT_CAP_SPEED_S) & FW_PORT_CAP_SPEED_M)
+
enum fw_port_mdi {
FW_PORT_CAP_MDI_UNCHANGED,
FW_PORT_CAP_MDI_AUTO,
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_common.h b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_common.h
index 8067424..b3903fe 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_common.h
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_common.h
@@ -108,8 +108,8 @@
unsigned int supported; /* link capabilities */
unsigned int advertising; /* advertised capabilities */
unsigned short lp_advertising; /* peer advertised capabilities */
- unsigned short requested_speed; /* speed user has requested */
- unsigned short speed; /* actual link speed */
+ unsigned int requested_speed; /* speed user has requested */
+ unsigned int speed; /* actual link speed */
unsigned char requested_fc; /* flow control user has requested */
unsigned char fc; /* actual link flow control */
unsigned char autoneg; /* autonegotiating? */
@@ -271,10 +271,17 @@
return (lc->supported & FW_PORT_CAP_SPEED_10G) != 0;
}
+/* Return true if the Link Configuration supports "High Speeds" (those greater
+ * than 1Gb/s).
+ */
static inline bool is_x_10g_port(const struct link_config *lc)
{
- return (lc->supported & FW_PORT_CAP_SPEED_10G) != 0 ||
- (lc->supported & FW_PORT_CAP_SPEED_40G) != 0;
+ unsigned int speeds, high_speeds;
+
+ speeds = FW_PORT_CAP_SPEED_V(FW_PORT_CAP_SPEED_G(lc->supported));
+ high_speeds = speeds & ~(FW_PORT_CAP_SPEED_100M | FW_PORT_CAP_SPEED_1G);
+
+ return high_speeds != 0;
}
static inline unsigned int core_ticks_per_usec(const struct adapter *adapter)
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
index 879f4c5..e98248f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/t4vf_hw.c
@@ -314,8 +314,9 @@
}
#define ADVERT_MASK (FW_PORT_CAP_SPEED_100M | FW_PORT_CAP_SPEED_1G |\
- FW_PORT_CAP_SPEED_10G | FW_PORT_CAP_SPEED_40G | \
- FW_PORT_CAP_SPEED_100G | FW_PORT_CAP_ANEG)
+ FW_PORT_CAP_SPEED_10G | FW_PORT_CAP_SPEED_25G | \
+ FW_PORT_CAP_SPEED_40G | FW_PORT_CAP_SPEED_100G | \
+ FW_PORT_CAP_ANEG)
/**
* init_link_config - initialize a link's SW state
@@ -1716,8 +1717,12 @@
speed = 1000;
else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_10G))
speed = 10000;
+ else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_25G))
+ speed = 25000;
else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_40G))
speed = 40000;
+ else if (stat & FW_PORT_CMD_LSPEED_V(FW_PORT_CAP_SPEED_100G))
+ speed = 100000;
/*
* Scan all of our "ports" (Virtual Interfaces) looking for
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 15d02da..9cffe48 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -4382,7 +4382,7 @@
}
/* This routine returns a list of all the NIC PF_nums in the adapter */
-u16 be_get_nic_pf_num_list(u8 *buf, u32 desc_count, u16 *nic_pf_nums)
+static u16 be_get_nic_pf_num_list(u8 *buf, u32 desc_count, u16 *nic_pf_nums)
{
struct be_res_desc_hdr *hdr = (struct be_res_desc_hdr *)buf;
struct be_pcie_res_desc *pcie = NULL;
@@ -4534,7 +4534,7 @@
}
/* Mark all fields invalid */
-void be_reset_nic_desc(struct be_nic_res_desc *nic)
+static void be_reset_nic_desc(struct be_nic_res_desc *nic)
{
memset(nic, 0, sizeof(*nic));
nic->unicast_mac_count = 0xFFFF;
@@ -4907,8 +4907,9 @@
return status;
}
-int __be_cmd_set_logical_link_config(struct be_adapter *adapter,
- int link_state, int version, u8 domain)
+static int
+__be_cmd_set_logical_link_config(struct be_adapter *adapter,
+ int link_state, int version, u8 domain)
{
struct be_mcc_wrb *wrb;
struct be_cmd_req_set_ll_link *req;
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 34f63ef..dcb930a 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -44,7 +44,7 @@
/* Per-module error detection/recovery workq shared across all functions.
* Each function schedules its own work request on this shared workq.
*/
-struct workqueue_struct *be_err_recovery_workq;
+static struct workqueue_struct *be_err_recovery_workq;
static const struct pci_device_id be_dev_ids[] = {
{ PCI_DEVICE(BE_VENDOR_ID, BE_DEVICE_ID1) },
@@ -60,7 +60,7 @@
MODULE_DEVICE_TABLE(pci, be_dev_ids);
/* Workqueue used by all functions for defering cmd calls to the adapter */
-struct workqueue_struct *be_wq;
+static struct workqueue_struct *be_wq;
/* UE Status Low CSR */
static const char * const ue_status_low_desc[] = {
@@ -1895,7 +1895,8 @@
return 0;
}
-static int be_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos)
+static int be_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos,
+ __be16 vlan_proto)
{
struct be_adapter *adapter = netdev_priv(netdev);
struct be_vf_cfg *vf_cfg = &adapter->vf_cfg[vf];
@@ -1907,6 +1908,9 @@
if (vf >= adapter->num_vfs || vlan > 4095 || qos > 7)
return -EINVAL;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
if (vlan || qos) {
vlan |= qos << VLAN_PRIO_SHIFT;
status = be_set_vf_tvt(adapter, vf, vlan);
@@ -4365,7 +4369,7 @@
* for distribution between the VFs. This self-imposed limit will determine the
* no: of VFs for which RSS can be enabled.
*/
-void be_calculate_pf_pool_rss_tables(struct be_adapter *adapter)
+static void be_calculate_pf_pool_rss_tables(struct be_adapter *adapter)
{
struct be_port_resources port_res = {0};
u8 rss_tables_on_port;
diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index 36361f8..90f9c54 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -60,6 +60,8 @@
struct ftgmac100_descs *descs;
dma_addr_t descs_dma_addr;
+ struct page *rx_pages[RX_QUEUE_ENTRIES];
+
unsigned int rx_pointer;
unsigned int tx_clean_pointer;
unsigned int tx_pointer;
@@ -77,6 +79,9 @@
int int_mask_all;
bool use_ncsi;
bool enabled;
+
+ u32 rxdes0_edorr_mask;
+ u32 txdes0_edotr_mask;
};
static int ftgmac100_alloc_rx_page(struct ftgmac100 *priv,
@@ -257,10 +262,11 @@
return rxdes->rxdes0 & cpu_to_le32(FTGMAC100_RXDES0_RXPKT_RDY);
}
-static void ftgmac100_rxdes_set_dma_own(struct ftgmac100_rxdes *rxdes)
+static void ftgmac100_rxdes_set_dma_own(const struct ftgmac100 *priv,
+ struct ftgmac100_rxdes *rxdes)
{
/* clear status bits */
- rxdes->rxdes0 &= cpu_to_le32(FTGMAC100_RXDES0_EDORR);
+ rxdes->rxdes0 &= cpu_to_le32(priv->rxdes0_edorr_mask);
}
static bool ftgmac100_rxdes_rx_error(struct ftgmac100_rxdes *rxdes)
@@ -298,9 +304,10 @@
return rxdes->rxdes0 & cpu_to_le32(FTGMAC100_RXDES0_MULTICAST);
}
-static void ftgmac100_rxdes_set_end_of_ring(struct ftgmac100_rxdes *rxdes)
+static void ftgmac100_rxdes_set_end_of_ring(const struct ftgmac100 *priv,
+ struct ftgmac100_rxdes *rxdes)
{
- rxdes->rxdes0 |= cpu_to_le32(FTGMAC100_RXDES0_EDORR);
+ rxdes->rxdes0 |= cpu_to_le32(priv->rxdes0_edorr_mask);
}
static void ftgmac100_rxdes_set_dma_addr(struct ftgmac100_rxdes *rxdes,
@@ -341,18 +348,27 @@
return rxdes->rxdes1 & cpu_to_le32(FTGMAC100_RXDES1_IP_CHKSUM_ERR);
}
+static inline struct page **ftgmac100_rxdes_page_slot(struct ftgmac100 *priv,
+ struct ftgmac100_rxdes *rxdes)
+{
+ return &priv->rx_pages[rxdes - priv->descs->rxdes];
+}
+
/*
* rxdes2 is not used by hardware. We use it to keep track of page.
* Since hardware does not touch it, we can skip cpu_to_le32()/le32_to_cpu().
*/
-static void ftgmac100_rxdes_set_page(struct ftgmac100_rxdes *rxdes, struct page *page)
+static void ftgmac100_rxdes_set_page(struct ftgmac100 *priv,
+ struct ftgmac100_rxdes *rxdes,
+ struct page *page)
{
- rxdes->rxdes2 = (unsigned int)page;
+ *ftgmac100_rxdes_page_slot(priv, rxdes) = page;
}
-static struct page *ftgmac100_rxdes_get_page(struct ftgmac100_rxdes *rxdes)
+static struct page *ftgmac100_rxdes_get_page(struct ftgmac100 *priv,
+ struct ftgmac100_rxdes *rxdes)
{
- return (struct page *)rxdes->rxdes2;
+ return *ftgmac100_rxdes_page_slot(priv, rxdes);
}
/******************************************************************************
@@ -382,7 +398,7 @@
if (ftgmac100_rxdes_first_segment(rxdes))
return rxdes;
- ftgmac100_rxdes_set_dma_own(rxdes);
+ ftgmac100_rxdes_set_dma_own(priv, rxdes);
ftgmac100_rx_pointer_advance(priv);
rxdes = ftgmac100_current_rxdes(priv);
}
@@ -453,7 +469,7 @@
if (ftgmac100_rxdes_last_segment(rxdes))
done = true;
- ftgmac100_rxdes_set_dma_own(rxdes);
+ ftgmac100_rxdes_set_dma_own(priv, rxdes);
ftgmac100_rx_pointer_advance(priv);
rxdes = ftgmac100_current_rxdes(priv);
} while (!done && ftgmac100_rxdes_packet_ready(rxdes));
@@ -501,7 +517,7 @@
do {
dma_addr_t map = ftgmac100_rxdes_get_dma_addr(rxdes);
- struct page *page = ftgmac100_rxdes_get_page(rxdes);
+ struct page *page = ftgmac100_rxdes_get_page(priv, rxdes);
unsigned int size;
dma_unmap_page(priv->dev, map, RX_BUF_SIZE, DMA_FROM_DEVICE);
@@ -545,10 +561,11 @@
/******************************************************************************
* internal functions (transmit descriptor)
*****************************************************************************/
-static void ftgmac100_txdes_reset(struct ftgmac100_txdes *txdes)
+static void ftgmac100_txdes_reset(const struct ftgmac100 *priv,
+ struct ftgmac100_txdes *txdes)
{
/* clear all except end of ring bit */
- txdes->txdes0 &= cpu_to_le32(FTGMAC100_TXDES0_EDOTR);
+ txdes->txdes0 &= cpu_to_le32(priv->txdes0_edotr_mask);
txdes->txdes1 = 0;
txdes->txdes2 = 0;
txdes->txdes3 = 0;
@@ -569,9 +586,10 @@
txdes->txdes0 |= cpu_to_le32(FTGMAC100_TXDES0_TXDMA_OWN);
}
-static void ftgmac100_txdes_set_end_of_ring(struct ftgmac100_txdes *txdes)
+static void ftgmac100_txdes_set_end_of_ring(const struct ftgmac100 *priv,
+ struct ftgmac100_txdes *txdes)
{
- txdes->txdes0 |= cpu_to_le32(FTGMAC100_TXDES0_EDOTR);
+ txdes->txdes0 |= cpu_to_le32(priv->txdes0_edotr_mask);
}
static void ftgmac100_txdes_set_first_segment(struct ftgmac100_txdes *txdes)
@@ -690,7 +708,7 @@
dev_kfree_skb(skb);
- ftgmac100_txdes_reset(txdes);
+ ftgmac100_txdes_reset(priv, txdes);
ftgmac100_tx_clean_pointer_advance(priv);
@@ -779,9 +797,9 @@
return -ENOMEM;
}
- ftgmac100_rxdes_set_page(rxdes, page);
+ ftgmac100_rxdes_set_page(priv, rxdes, page);
ftgmac100_rxdes_set_dma_addr(rxdes, map);
- ftgmac100_rxdes_set_dma_own(rxdes);
+ ftgmac100_rxdes_set_dma_own(priv, rxdes);
return 0;
}
@@ -791,7 +809,7 @@
for (i = 0; i < RX_QUEUE_ENTRIES; i++) {
struct ftgmac100_rxdes *rxdes = &priv->descs->rxdes[i];
- struct page *page = ftgmac100_rxdes_get_page(rxdes);
+ struct page *page = ftgmac100_rxdes_get_page(priv, rxdes);
dma_addr_t map = ftgmac100_rxdes_get_dma_addr(rxdes);
if (!page)
@@ -828,7 +846,8 @@
return -ENOMEM;
/* initialize RX ring */
- ftgmac100_rxdes_set_end_of_ring(&priv->descs->rxdes[RX_QUEUE_ENTRIES - 1]);
+ ftgmac100_rxdes_set_end_of_ring(priv,
+ &priv->descs->rxdes[RX_QUEUE_ENTRIES - 1]);
for (i = 0; i < RX_QUEUE_ENTRIES; i++) {
struct ftgmac100_rxdes *rxdes = &priv->descs->rxdes[i];
@@ -838,7 +857,8 @@
}
/* initialize TX ring */
- ftgmac100_txdes_set_end_of_ring(&priv->descs->txdes[TX_QUEUE_ENTRIES - 1]);
+ ftgmac100_txdes_set_end_of_ring(priv,
+ &priv->descs->txdes[TX_QUEUE_ENTRIES - 1]);
return 0;
err:
@@ -1055,14 +1075,12 @@
}
if (status & priv->int_mask_all & (FTGMAC100_INT_NO_RXBUF |
- FTGMAC100_INT_RPKT_LOST | FTGMAC100_INT_AHB_ERR |
- FTGMAC100_INT_PHYSTS_CHG)) {
+ FTGMAC100_INT_RPKT_LOST | FTGMAC100_INT_AHB_ERR)) {
if (net_ratelimit())
- netdev_info(netdev, "[ISR] = 0x%x: %s%s%s%s\n", status,
+ netdev_info(netdev, "[ISR] = 0x%x: %s%s%s\n", status,
status & FTGMAC100_INT_NO_RXBUF ? "NO_RXBUF " : "",
status & FTGMAC100_INT_RPKT_LOST ? "RPKT_LOST " : "",
- status & FTGMAC100_INT_AHB_ERR ? "AHB_ERR " : "",
- status & FTGMAC100_INT_PHYSTS_CHG ? "PHYSTS_CHG" : "");
+ status & FTGMAC100_INT_AHB_ERR ? "AHB_ERR " : "");
if (status & FTGMAC100_INT_NO_RXBUF) {
/* RX buffer unavailable */
@@ -1092,6 +1110,7 @@
static int ftgmac100_open(struct net_device *netdev)
{
struct ftgmac100 *priv = netdev_priv(netdev);
+ unsigned int status;
int err;
err = ftgmac100_alloc_buffers(priv);
@@ -1117,6 +1136,11 @@
ftgmac100_init_hw(priv);
ftgmac100_start_hw(priv, priv->use_ncsi ? 100 : 10);
+
+ /* Clear stale interrupts */
+ status = ioread32(priv->base + FTGMAC100_OFFSET_ISR);
+ iowrite32(status, priv->base + FTGMAC100_OFFSET_ISR);
+
if (netdev->phydev)
phy_start(netdev->phydev);
else if (priv->use_ncsi)
@@ -1226,12 +1250,21 @@
struct ftgmac100 *priv = netdev_priv(netdev);
struct platform_device *pdev = to_platform_device(priv->dev);
int i, err = 0;
+ u32 reg;
/* initialize mdio bus */
priv->mii_bus = mdiobus_alloc();
if (!priv->mii_bus)
return -EIO;
+ if (of_machine_is_compatible("aspeed,ast2400") ||
+ of_machine_is_compatible("aspeed,ast2500")) {
+ /* This driver supports the old MDIO interface */
+ reg = ioread32(priv->base + FTGMAC100_OFFSET_REVR);
+ reg &= ~FTGMAC100_REVR_NEW_MDIO_INTERFACE;
+ iowrite32(reg, priv->base + FTGMAC100_OFFSET_REVR);
+ };
+
priv->mii_bus->name = "ftgmac100_mdio";
snprintf(priv->mii_bus->id, MII_BUS_ID_SIZE, "%s-%d",
pdev->name, pdev->id);
@@ -1355,9 +1388,18 @@
FTGMAC100_INT_XPKT_ETH |
FTGMAC100_INT_XPKT_LOST |
FTGMAC100_INT_AHB_ERR |
- FTGMAC100_INT_PHYSTS_CHG |
FTGMAC100_INT_RPKT_BUF |
FTGMAC100_INT_NO_RXBUF);
+
+ if (of_machine_is_compatible("aspeed,ast2400") ||
+ of_machine_is_compatible("aspeed,ast2500")) {
+ priv->rxdes0_edorr_mask = BIT(30);
+ priv->txdes0_edotr_mask = BIT(30);
+ } else {
+ priv->rxdes0_edorr_mask = BIT(15);
+ priv->txdes0_edotr_mask = BIT(15);
+ }
+
if (pdev->dev.of_node &&
of_get_property(pdev->dev.of_node, "use-ncsi", NULL)) {
if (!IS_ENABLED(CONFIG_NET_NCSI)) {
@@ -1367,7 +1409,6 @@
dev_info(&pdev->dev, "Using NCSI interface\n");
priv->use_ncsi = true;
- priv->int_mask_all &= ~FTGMAC100_INT_PHYSTS_CHG;
priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);
if (!priv->ndev)
goto err_ncsi_dev;
diff --git a/drivers/net/ethernet/faraday/ftgmac100.h b/drivers/net/ethernet/faraday/ftgmac100.h
index 13408d4..a7ce0ac 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.h
+++ b/drivers/net/ethernet/faraday/ftgmac100.h
@@ -134,6 +134,11 @@
#define FTGMAC100_DMAFIFOS_TXDMA_REQ (1 << 31)
/*
+ * Feature Register
+ */
+#define FTGMAC100_REVR_NEW_MDIO_INTERFACE BIT(31)
+
+/*
* Receive buffer size register
*/
#define FTGMAC100_RBSR_SIZE(x) ((x) & 0x3fff)
@@ -152,6 +157,7 @@
#define FTGMAC100_MACCR_FULLDUP (1 << 8)
#define FTGMAC100_MACCR_GIGA_MODE (1 << 9)
#define FTGMAC100_MACCR_CRC_APD (1 << 10)
+#define FTGMAC100_MACCR_PHY_LINK_LEVEL (1 << 11)
#define FTGMAC100_MACCR_RX_RUNT (1 << 12)
#define FTGMAC100_MACCR_JUMBO_LF (1 << 13)
#define FTGMAC100_MACCR_RX_ALL (1 << 14)
@@ -189,7 +195,6 @@
} __attribute__ ((aligned(16)));
#define FTGMAC100_TXDES0_TXBUF_SIZE(x) ((x) & 0x3fff)
-#define FTGMAC100_TXDES0_EDOTR (1 << 15)
#define FTGMAC100_TXDES0_CRC_ERR (1 << 19)
#define FTGMAC100_TXDES0_LTS (1 << 28)
#define FTGMAC100_TXDES0_FTS (1 << 29)
@@ -215,7 +220,6 @@
} __attribute__ ((aligned(16)));
#define FTGMAC100_RXDES0_VDBC 0x3fff
-#define FTGMAC100_RXDES0_EDORR (1 << 15)
#define FTGMAC100_RXDES0_MULTICAST (1 << 16)
#define FTGMAC100_RXDES0_BROADCAST (1 << 17)
#define FTGMAC100_RXDES0_RX_ERR (1 << 18)
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index 415ffa1..3977889 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -600,7 +600,7 @@
return IRQ_HANDLED;
}
-enum hrtimer_restart tx_done(struct hrtimer *hrtimer)
+static enum hrtimer_restart tx_done(struct hrtimer *hrtimer)
{
struct hip04_priv *priv;
diff --git a/drivers/net/ethernet/hisilicon/hisi_femac.c b/drivers/net/ethernet/hisilicon/hisi_femac.c
index ca68e22..ced1859 100644
--- a/drivers/net/ethernet/hisilicon/hisi_femac.c
+++ b/drivers/net/ethernet/hisilicon/hisi_femac.c
@@ -940,8 +940,8 @@
}
#ifdef CONFIG_PM
-int hisi_femac_drv_suspend(struct platform_device *pdev,
- pm_message_t state)
+static int hisi_femac_drv_suspend(struct platform_device *pdev,
+ pm_message_t state)
{
struct net_device *ndev = platform_get_drvdata(pdev);
struct hisi_femac_priv *priv = netdev_priv(ndev);
@@ -957,7 +957,7 @@
return 0;
}
-int hisi_femac_drv_resume(struct platform_device *pdev)
+static int hisi_femac_drv_resume(struct platform_device *pdev)
{
struct net_device *ndev = platform_get_drvdata(pdev);
struct hisi_femac_priv *priv = netdev_priv(ndev);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index d7e1f8c..059aaed 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -994,10 +994,10 @@
struct hnae_handle *h = priv->ae_handle;
int state = 1;
- if (priv->phy) {
+ if (ndev->phydev) {
h->dev->ops->adjust_link(h, ndev->phydev->speed,
ndev->phydev->duplex);
- state = priv->phy->link;
+ state = ndev->phydev->link;
}
state = state && h->dev->ops->get_status(h);
@@ -1022,7 +1022,6 @@
*/
int hns_nic_init_phy(struct net_device *ndev, struct hnae_handle *h)
{
- struct hns_nic_priv *priv = netdev_priv(ndev);
struct phy_device *phy_dev = h->phy_dev;
int ret;
@@ -1046,8 +1045,6 @@
if (h->phy_if == PHY_INTERFACE_MODE_XGMII)
phy_dev->autoneg = false;
- priv->phy = phy_dev;
-
return 0;
}
@@ -1224,8 +1221,8 @@
if (ret)
goto out_start_err;
- if (priv->phy)
- phy_start(priv->phy);
+ if (ndev->phydev)
+ phy_start(ndev->phydev);
clear_bit(NIC_STATE_DOWN, &priv->state);
(void)mod_timer(&priv->service_timer, jiffies + SERVICE_TIMER_HZ);
@@ -1259,8 +1256,8 @@
netif_tx_disable(ndev);
priv->link = 0;
- if (priv->phy)
- phy_stop(priv->phy);
+ if (ndev->phydev)
+ phy_stop(ndev->phydev);
ops = priv->ae_handle->dev->ops;
@@ -1359,8 +1356,7 @@
static int hns_nic_do_ioctl(struct net_device *netdev, struct ifreq *ifr,
int cmd)
{
- struct hns_nic_priv *priv = netdev_priv(netdev);
- struct phy_device *phy_dev = priv->phy;
+ struct phy_device *phy_dev = netdev->phydev;
if (!netif_running(netdev))
return -EINVAL;
@@ -2017,9 +2013,8 @@
hns_nic_uninit_ring_data(priv);
priv->ring_data = NULL;
- if (priv->phy)
- phy_disconnect(priv->phy);
- priv->phy = NULL;
+ if (ndev->phydev)
+ phy_disconnect(ndev->phydev);
if (!IS_ERR_OR_NULL(priv->ae_handle))
hnae_put_handle(priv->ae_handle);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.h b/drivers/net/ethernet/hisilicon/hns/hns_enet.h
index 44bb301..5b412de 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.h
@@ -59,7 +59,6 @@
u32 port_id;
int phy_mode;
int phy_led_val;
- struct phy_device *phy;
struct net_device *netdev;
struct device *dev;
struct hnae_handle *ae_handle;
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index 5eb3245..47e59bb 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -48,9 +48,9 @@
h = priv->ae_handle;
- if (priv->phy) {
- if (!genphy_read_status(priv->phy))
- link_stat = priv->phy->link;
+ if (net_dev->phydev) {
+ if (!genphy_read_status(net_dev->phydev))
+ link_stat = net_dev->phydev->link;
else
link_stat = 0;
}
@@ -64,15 +64,14 @@
}
static void hns_get_mdix_mode(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
+ struct ethtool_link_ksettings *cmd)
{
int mdix_ctrl, mdix, retval, is_resolved;
- struct hns_nic_priv *priv = netdev_priv(net_dev);
- struct phy_device *phy_dev = priv->phy;
+ struct phy_device *phy_dev = net_dev->phydev;
if (!phy_dev || !phy_dev->mdio.bus) {
- cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_INVALID;
- cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
+ cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI_INVALID;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI_INVALID;
return;
}
@@ -89,35 +88,35 @@
switch (mdix_ctrl) {
case 0x0:
- cmd->eth_tp_mdix_ctrl = ETH_TP_MDI;
+ cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI;
break;
case 0x1:
- cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_X;
+ cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI_X;
break;
case 0x3:
- cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_AUTO;
+ cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI_AUTO;
break;
default:
- cmd->eth_tp_mdix_ctrl = ETH_TP_MDI_INVALID;
+ cmd->base.eth_tp_mdix_ctrl = ETH_TP_MDI_INVALID;
break;
}
if (!is_resolved)
- cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI_INVALID;
else if (mdix)
- cmd->eth_tp_mdix = ETH_TP_MDI_X;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI_X;
else
- cmd->eth_tp_mdix = ETH_TP_MDI;
+ cmd->base.eth_tp_mdix = ETH_TP_MDI;
}
/**
- *hns_nic_get_settings - implement ethtool get settings
+ *hns_nic_get_link_ksettings - implement ethtool get link ksettings
*@net_dev: net_device
- *@cmd: ethtool_cmd
+ *@cmd: ethtool_link_ksettings
*retuen 0 - success , negative --fail
*/
-static int hns_nic_get_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
+static int hns_nic_get_link_ksettings(struct net_device *net_dev,
+ struct ethtool_link_ksettings *cmd)
{
struct hns_nic_priv *priv = netdev_priv(net_dev);
struct hnae_handle *h;
@@ -125,6 +124,7 @@
int ret;
u8 duplex;
u16 speed;
+ u32 supported, advertising;
if (!priv || !priv->ae_handle)
return -ESRCH;
@@ -139,38 +139,43 @@
return -EINVAL;
}
- /* When there is no phy, autoneg is off. */
- cmd->autoneg = false;
- ethtool_cmd_speed_set(cmd, speed);
- cmd->duplex = duplex;
+ ethtool_convert_link_mode_to_legacy_u32(&supported,
+ cmd->link_modes.supported);
+ ethtool_convert_link_mode_to_legacy_u32(&advertising,
+ cmd->link_modes.advertising);
- if (priv->phy)
- (void)phy_ethtool_gset(priv->phy, cmd);
+ /* When there is no phy, autoneg is off. */
+ cmd->base.autoneg = false;
+ cmd->base.cmd = speed;
+ cmd->base.duplex = duplex;
+
+ if (net_dev->phydev)
+ (void)phy_ethtool_ksettings_get(net_dev->phydev, cmd);
link_stat = hns_nic_get_link(net_dev);
if (!link_stat) {
- ethtool_cmd_speed_set(cmd, (u32)SPEED_UNKNOWN);
- cmd->duplex = DUPLEX_UNKNOWN;
+ cmd->base.speed = (u32)SPEED_UNKNOWN;
+ cmd->base.duplex = DUPLEX_UNKNOWN;
}
- if (cmd->autoneg)
- cmd->advertising |= ADVERTISED_Autoneg;
+ if (cmd->base.autoneg)
+ advertising |= ADVERTISED_Autoneg;
- cmd->supported |= h->if_support;
+ supported |= h->if_support;
if (h->phy_if == PHY_INTERFACE_MODE_SGMII) {
- cmd->supported |= SUPPORTED_TP;
- cmd->advertising |= ADVERTISED_1000baseT_Full;
+ supported |= SUPPORTED_TP;
+ advertising |= ADVERTISED_1000baseT_Full;
} else if (h->phy_if == PHY_INTERFACE_MODE_XGMII) {
- cmd->supported |= SUPPORTED_FIBRE;
- cmd->advertising |= ADVERTISED_10000baseKR_Full;
+ supported |= SUPPORTED_FIBRE;
+ advertising |= ADVERTISED_10000baseKR_Full;
}
switch (h->media_type) {
case HNAE_MEDIA_TYPE_FIBER:
- cmd->port = PORT_FIBRE;
+ cmd->base.port = PORT_FIBRE;
break;
case HNAE_MEDIA_TYPE_COPPER:
- cmd->port = PORT_TP;
+ cmd->base.port = PORT_TP;
break;
case HNAE_MEDIA_TYPE_UNKNOWN:
default:
@@ -178,23 +183,27 @@
}
if (!(AE_IS_VER1(priv->enet_ver) && h->port_type == HNAE_PORT_DEBUG))
- cmd->supported |= SUPPORTED_Pause;
+ supported |= SUPPORTED_Pause;
- cmd->transceiver = XCVR_EXTERNAL;
- cmd->mdio_support = (ETH_MDIO_SUPPORTS_C45 | ETH_MDIO_SUPPORTS_C22);
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
+ supported);
+ ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
+ advertising);
+
+ cmd->base.mdio_support = ETH_MDIO_SUPPORTS_C45 | ETH_MDIO_SUPPORTS_C22;
hns_get_mdix_mode(net_dev, cmd);
return 0;
}
/**
- *hns_nic_set_settings - implement ethtool set settings
+ *hns_nic_set_link_settings - implement ethtool set link ksettings
*@net_dev: net_device
- *@cmd: ethtool_cmd
+ *@cmd: ethtool_link_ksettings
*retuen 0 - success , negative --fail
*/
-static int hns_nic_set_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
+static int hns_nic_set_link_ksettings(struct net_device *net_dev,
+ const struct ethtool_link_ksettings *cmd)
{
struct hns_nic_priv *priv = netdev_priv(net_dev);
struct hnae_handle *h;
@@ -208,24 +217,25 @@
return -ENODEV;
h = priv->ae_handle;
- speed = ethtool_cmd_speed(cmd);
+ speed = cmd->base.speed;
if (h->phy_if == PHY_INTERFACE_MODE_XGMII) {
- if (cmd->autoneg == AUTONEG_ENABLE || speed != SPEED_10000 ||
- cmd->duplex != DUPLEX_FULL)
+ if (cmd->base.autoneg == AUTONEG_ENABLE ||
+ speed != SPEED_10000 ||
+ cmd->base.duplex != DUPLEX_FULL)
return -EINVAL;
} else if (h->phy_if == PHY_INTERFACE_MODE_SGMII) {
- if (!priv->phy && cmd->autoneg == AUTONEG_ENABLE)
+ if (!net_dev->phydev && cmd->base.autoneg == AUTONEG_ENABLE)
return -EINVAL;
- if (speed == SPEED_1000 && cmd->duplex == DUPLEX_HALF)
+ if (speed == SPEED_1000 && cmd->base.duplex == DUPLEX_HALF)
return -EINVAL;
- if (priv->phy)
- return phy_ethtool_sset(priv->phy, cmd);
+ if (net_dev->phydev)
+ return phy_ethtool_ksettings_set(net_dev->phydev, cmd);
if ((speed != SPEED_10 && speed != SPEED_100 &&
- speed != SPEED_1000) || (cmd->duplex != DUPLEX_HALF &&
- cmd->duplex != DUPLEX_FULL))
+ speed != SPEED_1000) || (cmd->base.duplex != DUPLEX_HALF &&
+ cmd->base.duplex != DUPLEX_FULL))
return -EINVAL;
} else {
netdev_err(net_dev, "Not supported!");
@@ -233,7 +243,7 @@
}
if (h->dev->ops->adjust_link) {
- h->dev->ops->adjust_link(h, (int)speed, cmd->duplex);
+ h->dev->ops->adjust_link(h, (int)speed, cmd->base.duplex);
return 0;
}
@@ -305,7 +315,7 @@
{
int ret = 0;
struct hns_nic_priv *priv = netdev_priv(ndev);
- struct phy_device *phy_dev = priv->phy;
+ struct phy_device *phy_dev = ndev->phydev;
struct hnae_handle *h = priv->ae_handle;
switch (loop) {
@@ -910,7 +920,7 @@
memcpy(buff, hns_nic_test_strs[MAC_INTERNALLOOP_SERDES],
ETH_GSTRING_LEN);
buff += ETH_GSTRING_LEN;
- if ((priv->phy) && (!priv->phy->is_c45))
+ if ((netdev->phydev) && (!netdev->phydev->is_c45))
memcpy(buff, hns_nic_test_strs[MAC_INTERNALLOOP_PHY],
ETH_GSTRING_LEN);
@@ -996,7 +1006,7 @@
if (priv->ae_handle->phy_if == PHY_INTERFACE_MODE_XGMII)
cnt--;
- if ((!priv->phy) || (priv->phy->is_c45))
+ if ((!netdev->phydev) || (netdev->phydev->is_c45))
cnt--;
return cnt;
@@ -1015,8 +1025,7 @@
int hns_phy_led_set(struct net_device *netdev, int value)
{
int retval;
- struct hns_nic_priv *priv = netdev_priv(netdev);
- struct phy_device *phy_dev = priv->phy;
+ struct phy_device *phy_dev = netdev->phydev;
retval = phy_write(phy_dev, HNS_PHY_PAGE_REG, HNS_PHY_PAGE_LED);
retval |= phy_write(phy_dev, HNS_LED_FC_REG, value);
@@ -1039,7 +1048,7 @@
{
struct hns_nic_priv *priv = netdev_priv(netdev);
struct hnae_handle *h = priv->ae_handle;
- struct phy_device *phy_dev = priv->phy;
+ struct phy_device *phy_dev = netdev->phydev;
int ret;
if (phy_dev)
@@ -1159,8 +1168,7 @@
static int hns_nic_nway_reset(struct net_device *netdev)
{
int ret = 0;
- struct hns_nic_priv *priv = netdev_priv(netdev);
- struct phy_device *phy = priv->phy;
+ struct phy_device *phy = netdev->phydev;
if (netif_running(netdev)) {
if (phy)
@@ -1267,8 +1275,6 @@
static const struct ethtool_ops hns_ethtool_ops = {
.get_drvinfo = hns_nic_get_drvinfo,
.get_link = hns_nic_get_link,
- .get_settings = hns_nic_get_settings,
- .set_settings = hns_nic_set_settings,
.get_ringparam = hns_get_ringparam,
.get_pauseparam = hns_get_pauseparam,
.set_pauseparam = hns_set_pauseparam,
@@ -1288,6 +1294,8 @@
.get_rxfh = hns_get_rss,
.set_rxfh = hns_set_rss,
.get_rxnfc = hns_get_rxnfc,
+ .get_link_ksettings = hns_nic_get_link_ksettings,
+ .set_link_ksettings = hns_nic_set_link_ksettings,
};
void hns_ethtool_set_ops(struct net_device *ndev)
diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index ec4d0f3..8f13919 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -977,7 +977,37 @@
dev->mcast_pending = 1;
return;
}
+
+ mutex_lock(&dev->link_lock);
__emac_set_multicast_list(dev);
+ mutex_unlock(&dev->link_lock);
+}
+
+static int emac_set_mac_address(struct net_device *ndev, void *sa)
+{
+ struct emac_instance *dev = netdev_priv(ndev);
+ struct sockaddr *addr = sa;
+ struct emac_regs __iomem *p = dev->emacp;
+
+ if (!is_valid_ether_addr(addr->sa_data))
+ return -EADDRNOTAVAIL;
+
+ mutex_lock(&dev->link_lock);
+
+ memcpy(ndev->dev_addr, addr->sa_data, ndev->addr_len);
+
+ emac_rx_disable(dev);
+ emac_tx_disable(dev);
+ out_be32(&p->iahr, (ndev->dev_addr[0] << 8) | ndev->dev_addr[1]);
+ out_be32(&p->ialr, (ndev->dev_addr[2] << 24) |
+ (ndev->dev_addr[3] << 16) | (ndev->dev_addr[4] << 8) |
+ ndev->dev_addr[5]);
+ emac_tx_enable(dev);
+ emac_rx_enable(dev);
+
+ mutex_unlock(&dev->link_lock);
+
+ return 0;
}
static int emac_resize_rx_ring(struct emac_instance *dev, int new_mtu)
@@ -2686,7 +2716,7 @@
.ndo_do_ioctl = emac_ioctl,
.ndo_tx_timeout = emac_tx_timeout,
.ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_mac_address = emac_set_mac_address,
.ndo_start_xmit = emac_start_xmit,
.ndo_change_mtu = eth_change_mtu,
};
@@ -2699,7 +2729,7 @@
.ndo_do_ioctl = emac_ioctl,
.ndo_tx_timeout = emac_tx_timeout,
.ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_mac_address = emac_set_mac_address,
.ndo_start_xmit = emac_start_xmit_sg,
.ndo_change_mtu = emac_change_mtu,
};
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 2e1b17a..ad03763 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -334,7 +334,7 @@
if (IS_ERR(adapter->ptp_clock)) {
adapter->ptp_clock = NULL;
e_err("ptp_clock_register failed\n");
- } else {
+ } else if (adapter->ptp_clock) {
e_info("registered PHC clock\n");
}
}
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
index 67ff01a..4d19e46 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k.h
+++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
@@ -507,7 +507,7 @@
s32 fm10k_iov_update_pvid(struct fm10k_intfc *interface, u16 glort, u16 pvid);
int fm10k_ndo_set_vf_mac(struct net_device *netdev, int vf_idx, u8 *mac);
int fm10k_ndo_set_vf_vlan(struct net_device *netdev,
- int vf_idx, u16 vid, u8 qos);
+ int vf_idx, u16 vid, u8 qos, __be16 vlan_proto);
int fm10k_ndo_set_vf_bw(struct net_device *netdev, int vf_idx, int rate,
int unused);
int fm10k_ndo_get_vf_config(struct net_device *netdev,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index d9dec81..5f4dac0 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -445,7 +445,7 @@
}
int fm10k_ndo_set_vf_vlan(struct net_device *netdev, int vf_idx, u16 vid,
- u8 qos)
+ u8 qos, __be16 vlan_proto)
{
struct fm10k_intfc *interface = netdev_priv(netdev);
struct fm10k_iov_data *iov_data = interface->iov_data;
@@ -460,6 +460,10 @@
if (qos || (vid > (VLAN_VID_MASK - 1)))
return -EINVAL;
+ /* VF VLAN Protocol part to default is unsupported */
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
vf_info = &iov_data->vf_info[vf_idx];
/* exit if there is nothing to do */
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 19103a6..2030d7c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -74,7 +74,7 @@
#define I40E_MIN_NUM_DESCRIPTORS 64
#define I40E_MIN_MSIX 2
#define I40E_DEFAULT_NUM_VMDQ_VSI 8 /* max 256 VSIs */
-#define I40E_MIN_VSI_ALLOC 51 /* LAN, ATR, FCOE, 32 VF, 16 VMDQ */
+#define I40E_MIN_VSI_ALLOC 83 /* LAN, ATR, FCOE, 64 VF */
/* max 16 qps */
#define i40e_default_queues_per_vmdq(pf) \
(((pf)->flags & I40E_FLAG_RSS_AQ_CAPABLE) ? 4 : 1)
@@ -701,6 +701,8 @@
void i40e_do_reset(struct i40e_pf *pf, u32 reset_flags);
int i40e_config_rss(struct i40e_vsi *vsi, u8 *seed, u8 *lut, u16 lut_size);
int i40e_get_rss(struct i40e_vsi *vsi, u8 *seed, u8 *lut, u16 lut_size);
+void i40e_fill_rss_lut(struct i40e_pf *pf, u8 *lut,
+ u16 rss_table_size, u16 rss_size);
struct i40e_vsi *i40e_find_vsi_from_id(struct i40e_pf *pf, u16 id);
void i40e_update_stats(struct i40e_vsi *vsi);
void i40e_update_eth_stats(struct i40e_vsi *vsi);
@@ -708,8 +710,6 @@
int i40e_fetch_switch_configuration(struct i40e_pf *pf,
bool printconfig);
-int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data, u8 *raw_packet,
- struct i40e_pf *pf, bool add);
int i40e_add_del_fdir(struct i40e_vsi *vsi,
struct i40e_fdir_filter *input, bool add);
void i40e_fdir_check_and_reenable(struct i40e_pf *pf);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 05cf9a7..0c1875b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -1054,6 +1054,7 @@
struct i40e_dcbx_config *r_cfg =
&pf->hw.remote_dcbx_config;
int i, ret;
+ u32 switch_id;
bw_data = kzalloc(sizeof(
struct i40e_aqc_query_port_ets_config_resp),
@@ -1063,8 +1064,12 @@
goto command_write_done;
}
+ vsi = pf->vsi[pf->lan_vsi];
+ switch_id =
+ vsi->info.switch_id & I40E_AQ_VSI_SW_ID_MASK;
+
ret = i40e_aq_query_port_ets_config(&pf->hw,
- pf->mac_seid,
+ switch_id,
bw_data, NULL);
if (ret) {
dev_info(&pf->pdev->dev,
@@ -1425,84 +1430,6 @@
buff = NULL;
kfree(desc);
desc = NULL;
- } else if ((strncmp(cmd_buf, "add fd_filter", 13) == 0) ||
- (strncmp(cmd_buf, "rem fd_filter", 13) == 0)) {
- struct i40e_fdir_filter fd_data;
- u16 packet_len, i, j = 0;
- char *asc_packet;
- u8 *raw_packet;
- bool add = false;
- int ret;
-
- if (!(pf->flags & I40E_FLAG_FD_SB_ENABLED))
- goto command_write_done;
-
- if (strncmp(cmd_buf, "add", 3) == 0)
- add = true;
-
- if (add && (pf->auto_disable_flags & I40E_FLAG_FD_SB_ENABLED))
- goto command_write_done;
-
- asc_packet = kzalloc(I40E_FDIR_MAX_RAW_PACKET_SIZE,
- GFP_KERNEL);
- if (!asc_packet)
- goto command_write_done;
-
- raw_packet = kzalloc(I40E_FDIR_MAX_RAW_PACKET_SIZE,
- GFP_KERNEL);
-
- if (!raw_packet) {
- kfree(asc_packet);
- asc_packet = NULL;
- goto command_write_done;
- }
-
- cnt = sscanf(&cmd_buf[13],
- "%hx %2hhx %2hhx %hx %2hhx %2hhx %hx %x %hd %511s",
- &fd_data.q_index,
- &fd_data.flex_off, &fd_data.pctype,
- &fd_data.dest_vsi, &fd_data.dest_ctl,
- &fd_data.fd_status, &fd_data.cnt_index,
- &fd_data.fd_id, &packet_len, asc_packet);
- if (cnt != 10) {
- dev_info(&pf->pdev->dev,
- "program fd_filter: bad command string, cnt=%d\n",
- cnt);
- kfree(asc_packet);
- asc_packet = NULL;
- kfree(raw_packet);
- goto command_write_done;
- }
-
- /* fix packet length if user entered 0 */
- if (packet_len == 0)
- packet_len = I40E_FDIR_MAX_RAW_PACKET_SIZE;
-
- /* make sure to check the max as well */
- packet_len = min_t(u16,
- packet_len, I40E_FDIR_MAX_RAW_PACKET_SIZE);
-
- for (i = 0; i < packet_len; i++) {
- cnt = sscanf(&asc_packet[j], "%2hhx ", &raw_packet[i]);
- if (!cnt)
- break;
- j += 3;
- }
- dev_info(&pf->pdev->dev, "FD raw packet dump\n");
- print_hex_dump(KERN_INFO, "FD raw packet: ",
- DUMP_PREFIX_OFFSET, 16, 1,
- raw_packet, packet_len, true);
- ret = i40e_program_fdir_filter(&fd_data, raw_packet, pf, add);
- if (!ret) {
- dev_info(&pf->pdev->dev, "Filter command send Status : Success\n");
- } else {
- dev_info(&pf->pdev->dev,
- "Filter command send failed %d\n", ret);
- }
- kfree(raw_packet);
- raw_packet = NULL;
- kfree(asc_packet);
- asc_packet = NULL;
} else if (strncmp(cmd_buf, "fd current cnt", 14) == 0) {
dev_info(&pf->pdev->dev, "FD current total filter count for this interface: %d\n",
i40e_get_current_fd_count(pf));
@@ -1727,8 +1654,6 @@
dev_info(&pf->pdev->dev, " globr\n");
dev_info(&pf->pdev->dev, " send aq_cmd <flags> <opcode> <datalen> <retval> <cookie_h> <cookie_l> <param0> <param1> <param2> <param3>\n");
dev_info(&pf->pdev->dev, " send indirect aq_cmd <flags> <opcode> <datalen> <retval> <cookie_h> <cookie_l> <param0> <param1> <param2> <param3> <buffer_len>\n");
- dev_info(&pf->pdev->dev, " add fd_filter <dest q_index> <flex_off> <pctype> <dest_vsi> <dest_ctl> <fd_status> <cnt_index> <fd_id> <packet_len> <packet>\n");
- dev_info(&pf->pdev->dev, " rem fd_filter <dest q_index> <flex_off> <pctype> <dest_vsi> <dest_ctl> <fd_status> <cnt_index> <fd_id> <packet_len> <packet>\n");
dev_info(&pf->pdev->dev, " fd current cnt");
dev_info(&pf->pdev->dev, " lldp start\n");
dev_info(&pf->pdev->dev, " lldp stop\n");
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 1835186..92bc884 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1970,11 +1970,22 @@
* 125us (8000 interrupts per second) == ITR(62)
*/
+/**
+ * __i40e_get_coalesce - get per-queue coalesce settings
+ * @netdev: the netdev to check
+ * @ec: ethtool coalesce data structure
+ * @queue: which queue to pick
+ *
+ * Gets the per-queue settings for coalescence. Specifically Rx and Tx usecs
+ * are per queue. If queue is <0 then we default to queue 0 as the
+ * representative value.
+ **/
static int __i40e_get_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec,
int queue)
{
struct i40e_netdev_priv *np = netdev_priv(netdev);
+ struct i40e_ring *rx_ring, *tx_ring;
struct i40e_vsi *vsi = np->vsi;
ec->tx_max_coalesced_frames_irq = vsi->work_limit;
@@ -1989,14 +2000,18 @@
return -EINVAL;
}
- if (ITR_IS_DYNAMIC(vsi->rx_rings[queue]->rx_itr_setting))
+ rx_ring = vsi->rx_rings[queue];
+ tx_ring = vsi->tx_rings[queue];
+
+ if (ITR_IS_DYNAMIC(rx_ring->rx_itr_setting))
ec->use_adaptive_rx_coalesce = 1;
- if (ITR_IS_DYNAMIC(vsi->tx_rings[queue]->tx_itr_setting))
+ if (ITR_IS_DYNAMIC(tx_ring->tx_itr_setting))
ec->use_adaptive_tx_coalesce = 1;
- ec->rx_coalesce_usecs = vsi->rx_rings[queue]->rx_itr_setting & ~I40E_ITR_DYNAMIC;
- ec->tx_coalesce_usecs = vsi->tx_rings[queue]->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+ ec->rx_coalesce_usecs = rx_ring->rx_itr_setting & ~I40E_ITR_DYNAMIC;
+ ec->tx_coalesce_usecs = tx_ring->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+
/* we use the _usecs_high to store/set the interrupt rate limit
* that the hardware supports, that almost but not quite
@@ -2010,18 +2025,44 @@
return 0;
}
+/**
+ * i40e_get_coalesce - get a netdev's coalesce settings
+ * @netdev: the netdev to check
+ * @ec: ethtool coalesce data structure
+ *
+ * Gets the coalesce settings for a particular netdev. Note that if user has
+ * modified per-queue settings, this only guarantees to represent queue 0. See
+ * __i40e_get_coalesce for more details.
+ **/
static int i40e_get_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec)
{
return __i40e_get_coalesce(netdev, ec, -1);
}
+/**
+ * i40e_get_per_queue_coalesce - gets coalesce settings for particular queue
+ * @netdev: netdev structure
+ * @ec: ethtool's coalesce settings
+ * @queue: the particular queue to read
+ *
+ * Will read a specific queue's coalesce settings
+ **/
static int i40e_get_per_queue_coalesce(struct net_device *netdev, u32 queue,
struct ethtool_coalesce *ec)
{
return __i40e_get_coalesce(netdev, ec, queue);
}
+/**
+ * i40e_set_itr_per_queue - set ITR values for specific queue
+ * @vsi: the VSI to set values for
+ * @ec: coalesce settings from ethtool
+ * @queue: the queue to modify
+ *
+ * Change the ITR settings for a specific queue.
+ **/
+
static void i40e_set_itr_per_queue(struct i40e_vsi *vsi,
struct ethtool_coalesce *ec,
int queue)
@@ -2060,6 +2101,14 @@
i40e_flush(hw);
}
+/**
+ * __i40e_set_coalesce - set coalesce settings for particular queue
+ * @netdev: the netdev to change
+ * @ec: ethtool coalesce settings
+ * @queue: the queue to change
+ *
+ * Sets the coalesce settings for a particular queue.
+ **/
static int __i40e_set_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec,
int queue)
@@ -2120,12 +2169,27 @@
return 0;
}
+/**
+ * i40e_set_coalesce - set coalesce settings for every queue on the netdev
+ * @netdev: the netdev to change
+ * @ec: ethtool coalesce settings
+ *
+ * This will set each queue to the same coalesce settings.
+ **/
static int i40e_set_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec)
{
return __i40e_set_coalesce(netdev, ec, -1);
}
+/**
+ * i40e_set_per_queue_coalesce - set specific queue's coalesce settings
+ * @netdev: the netdev to change
+ * @ec: ethtool's coalesce settings
+ * @queue: the queue to change
+ *
+ * Sets the specified queue's coalesce settings.
+ **/
static int i40e_set_per_queue_coalesce(struct net_device *netdev, u32 queue,
struct ethtool_coalesce *ec)
{
@@ -2922,15 +2986,13 @@
{
struct i40e_netdev_priv *np = netdev_priv(netdev);
struct i40e_vsi *vsi = np->vsi;
+ struct i40e_pf *pf = vsi->back;
u8 *seed = NULL;
u16 i;
if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
return -EOPNOTSUPP;
- if (!indir)
- return 0;
-
if (key) {
if (!vsi->rss_hkey_user) {
vsi->rss_hkey_user = kzalloc(I40E_HKEY_ARRAY_SIZE,
@@ -2948,8 +3010,12 @@
}
/* Each 32 bits pointed by 'indir' is stored with a lut entry */
- for (i = 0; i < I40E_HLUT_ARRAY_SIZE; i++)
- vsi->rss_lut_user[i] = (u8)(indir[i]);
+ if (indir)
+ for (i = 0; i < I40E_HLUT_ARRAY_SIZE; i++)
+ vsi->rss_lut_user[i] = (u8)(indir[i]);
+ else
+ i40e_fill_rss_lut(pf, vsi->rss_lut_user, I40E_HLUT_ARRAY_SIZE,
+ vsi->rss_size);
return i40e_config_rss(vsi, seed, vsi->rss_lut_user,
I40E_HLUT_ARRAY_SIZE);
@@ -3019,6 +3085,9 @@
} else {
pf->flags &= ~I40E_FLAG_FD_ATR_ENABLED;
pf->auto_disable_flags |= I40E_FLAG_FD_ATR_ENABLED;
+
+ /* flush current ATR settings */
+ set_bit(__I40E_FD_FLUSH_REQUESTED, &pf->state);
}
if ((flags & I40E_PRIV_FLAGS_VEB_STATS) &&
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 61b0fc4..8176596 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -41,7 +41,7 @@
#define DRV_VERSION_MAJOR 1
#define DRV_VERSION_MINOR 6
-#define DRV_VERSION_BUILD 12
+#define DRV_VERSION_BUILD 16
#define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
__stringify(DRV_VERSION_MINOR) "." \
__stringify(DRV_VERSION_BUILD) DRV_KERN
@@ -57,8 +57,6 @@
static int i40e_setup_misc_vector(struct i40e_pf *pf);
static void i40e_determine_queue_usage(struct i40e_pf *pf);
static int i40e_setup_pf_filter_control(struct i40e_pf *pf);
-static void i40e_fill_rss_lut(struct i40e_pf *pf, u8 *lut,
- u16 rss_table_size, u16 rss_size);
static void i40e_fdir_sb_setup(struct i40e_pf *pf);
static int i40e_veb_get_bw_info(struct i40e_veb *veb);
@@ -1317,7 +1315,7 @@
element.vlan_tag = 0;
/* ...and some firmware does it this way. */
element.flags = I40E_AQC_MACVLAN_DEL_PERFECT_MATCH |
- I40E_AQC_MACVLAN_ADD_IGNORE_VLAN;
+ I40E_AQC_MACVLAN_DEL_IGNORE_VLAN;
i40e_aq_remove_macvlan(&pf->hw, vsi->seid, &element, 1, NULL);
}
@@ -1910,7 +1908,7 @@
ether_addr_copy(del_list[num_del].mac_addr, f->macaddr);
if (f->vlan == I40E_VLAN_ANY) {
del_list[num_del].vlan_tag = 0;
- cmd_flags |= I40E_AQC_MACVLAN_ADD_IGNORE_VLAN;
+ cmd_flags |= I40E_AQC_MACVLAN_DEL_IGNORE_VLAN;
} else {
del_list[num_del].vlan_tag =
cpu_to_le16((u16)(f->vlan));
@@ -5244,7 +5242,7 @@
/* reset fd counters */
pf->fd_add_err = pf->fd_atr_cnt = 0;
if (pf->fd_tcp_rule > 0) {
- pf->flags &= ~I40E_FLAG_FD_ATR_ENABLED;
+ pf->auto_disable_flags |= I40E_FLAG_FD_ATR_ENABLED;
if (I40E_DEBUG_FD & pf->hw.debug_mask)
dev_info(&pf->pdev->dev, "Forcing ATR off, sideband rules for TCP/IPv4 exist\n");
pf->fd_tcp_rule = 0;
@@ -5941,13 +5939,17 @@
dev_info(&pf->pdev->dev, "FD Sideband/ntuple is being enabled since we have space in the table now\n");
}
}
- /* Wait for some more space to be available to turn on ATR */
+
+ /* Wait for some more space to be available to turn on ATR. We also
+ * must check that no existing ntuple rules for TCP are in effect
+ */
if (fcnt_prog < (fcnt_avail - I40E_FDIR_BUFFER_HEAD_ROOM * 2)) {
if ((pf->flags & I40E_FLAG_FD_ATR_ENABLED) &&
- (pf->auto_disable_flags & I40E_FLAG_FD_ATR_ENABLED)) {
+ (pf->auto_disable_flags & I40E_FLAG_FD_ATR_ENABLED) &&
+ (pf->fd_tcp_rule == 0)) {
pf->auto_disable_flags &= ~I40E_FLAG_FD_ATR_ENABLED;
if (I40E_DEBUG_FD & pf->hw.debug_mask)
- dev_info(&pf->pdev->dev, "ATR is being enabled since we have space in the table now\n");
+ dev_info(&pf->pdev->dev, "ATR is being enabled since we have space in the table and there are no conflicting ntuple rules\n");
}
}
@@ -5978,9 +5980,6 @@
int fd_room;
int reg;
- if (!(pf->flags & (I40E_FLAG_FD_SB_ENABLED | I40E_FLAG_FD_ATR_ENABLED)))
- return;
-
if (!time_after(jiffies, pf->fd_flush_timestamp +
(I40E_MIN_FD_FLUSH_INTERVAL * HZ)))
return;
@@ -6000,7 +5999,7 @@
}
pf->fd_flush_timestamp = jiffies;
- pf->flags &= ~I40E_FLAG_FD_ATR_ENABLED;
+ pf->auto_disable_flags |= I40E_FLAG_FD_ATR_ENABLED;
/* flush all filters */
wr32(&pf->hw, I40E_PFQF_CTL_1,
I40E_PFQF_CTL_1_CLEARFDTABLE_MASK);
@@ -6020,7 +6019,7 @@
/* replay sideband filters */
i40e_fdir_filter_restore(pf->vsi[pf->lan_vsi]);
if (!disable_atr)
- pf->flags |= I40E_FLAG_FD_ATR_ENABLED;
+ pf->auto_disable_flags &= ~I40E_FLAG_FD_ATR_ENABLED;
clear_bit(__I40E_FD_FLUSH_REQUESTED, &pf->state);
if (I40E_DEBUG_FD & pf->hw.debug_mask)
dev_info(&pf->pdev->dev, "FD Filter table flushed and FD-SB replayed.\n");
@@ -6054,9 +6053,6 @@
if (test_bit(__I40E_DOWN, &pf->state))
return;
- if (!(pf->flags & (I40E_FLAG_FD_SB_ENABLED | I40E_FLAG_FD_ATR_ENABLED)))
- return;
-
if (test_bit(__I40E_FD_FLUSH_REQUESTED, &pf->state))
i40e_fdir_flush_and_replay(pf);
@@ -7156,9 +7152,9 @@
pf->pending_udp_bitmap &= ~BIT_ULL(i);
port = pf->udp_ports[i].index;
if (port)
- ret = i40e_aq_add_udp_tunnel(hw, ntohs(port),
- pf->udp_ports[i].type,
- NULL, NULL);
+ ret = i40e_aq_add_udp_tunnel(hw, port,
+ pf->udp_ports[i].type,
+ NULL, NULL);
else
ret = i40e_aq_del_udp_tunnel(hw, i, NULL);
@@ -8244,8 +8240,8 @@
* @rss_table_size: Lookup table size
* @rss_size: Range of queue number for hashing
*/
-static void i40e_fill_rss_lut(struct i40e_pf *pf, u8 *lut,
- u16 rss_table_size, u16 rss_size)
+void i40e_fill_rss_lut(struct i40e_pf *pf, u8 *lut,
+ u16 rss_table_size, u16 rss_size)
{
u16 i;
@@ -8286,6 +8282,8 @@
if (!vsi->rss_size)
vsi->rss_size = min_t(int, pf->alloc_rss_size,
vsi->num_queue_pairs);
+ if (!vsi->rss_size)
+ return -EINVAL;
lut = kzalloc(vsi->rss_table_size, GFP_KERNEL);
if (!lut)
@@ -8610,7 +8608,6 @@
I40E_FLAG_WB_ON_ITR_CAPABLE |
I40E_FLAG_MULTIPLE_TCP_UDP_RSS_PCTYPE |
I40E_FLAG_NO_PCI_LINK_CHECK |
- I40E_FLAG_100M_SGMII_CAPABLE |
I40E_FLAG_USE_SET_LLDP_MIB |
I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
} else if ((pf->hw.aq.api_maj_ver > 1) ||
@@ -8685,13 +8682,13 @@
/* reset fd counters */
pf->fd_add_err = pf->fd_atr_cnt = pf->fd_tcp_rule = 0;
pf->fdir_pf_active_filters = 0;
- pf->flags |= I40E_FLAG_FD_ATR_ENABLED;
- if (I40E_DEBUG_FD & pf->hw.debug_mask)
- dev_info(&pf->pdev->dev, "ATR re-enabled.\n");
/* if ATR was auto disabled it can be re-enabled. */
if ((pf->flags & I40E_FLAG_FD_ATR_ENABLED) &&
- (pf->auto_disable_flags & I40E_FLAG_FD_ATR_ENABLED))
+ (pf->auto_disable_flags & I40E_FLAG_FD_ATR_ENABLED)) {
pf->auto_disable_flags &= ~I40E_FLAG_FD_ATR_ENABLED;
+ if (I40E_DEBUG_FD & pf->hw.debug_mask)
+ dev_info(&pf->pdev->dev, "ATR re-enabled.\n");
+ }
}
return need_reset;
}
@@ -11338,11 +11335,7 @@
}
/* shutdown the adminq */
- ret_code = i40e_shutdown_adminq(hw);
- if (ret_code)
- dev_warn(&pdev->dev,
- "Failed to destroy the Admin Queue resources: %d\n",
- ret_code);
+ i40e_shutdown_adminq(hw);
/* destroy the locks only once, here */
mutex_destroy(&hw->aq.arq_mutex);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index ed39cba..f1fecea 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -669,7 +669,7 @@
pf->ptp_clock = NULL;
dev_err(&pf->pdev->dev, "%s: ptp_clock_register failed\n",
__func__);
- } else {
+ } else if (pf->ptp_clock) {
struct timespec64 ts;
u32 regval;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index f8d6623..6287bf6 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -40,6 +40,69 @@
}
#define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
+/**
+ * i40e_fdir - Generate a Flow Director descriptor based on fdata
+ * @tx_ring: Tx ring to send buffer on
+ * @fdata: Flow director filter data
+ * @add: Indicate if we are adding a rule or deleting one
+ *
+ **/
+static void i40e_fdir(struct i40e_ring *tx_ring,
+ struct i40e_fdir_filter *fdata, bool add)
+{
+ struct i40e_filter_program_desc *fdir_desc;
+ struct i40e_pf *pf = tx_ring->vsi->back;
+ u32 flex_ptype, dtype_cmd;
+ u16 i;
+
+ /* grab the next descriptor */
+ i = tx_ring->next_to_use;
+ fdir_desc = I40E_TX_FDIRDESC(tx_ring, i);
+
+ i++;
+ tx_ring->next_to_use = (i < tx_ring->count) ? i : 0;
+
+ flex_ptype = I40E_TXD_FLTR_QW0_QINDEX_MASK &
+ (fdata->q_index << I40E_TXD_FLTR_QW0_QINDEX_SHIFT);
+
+ flex_ptype |= I40E_TXD_FLTR_QW0_FLEXOFF_MASK &
+ (fdata->flex_off << I40E_TXD_FLTR_QW0_FLEXOFF_SHIFT);
+
+ flex_ptype |= I40E_TXD_FLTR_QW0_PCTYPE_MASK &
+ (fdata->pctype << I40E_TXD_FLTR_QW0_PCTYPE_SHIFT);
+
+ /* Use LAN VSI Id if not programmed by user */
+ flex_ptype |= I40E_TXD_FLTR_QW0_DEST_VSI_MASK &
+ ((u32)(fdata->dest_vsi ? : pf->vsi[pf->lan_vsi]->id) <<
+ I40E_TXD_FLTR_QW0_DEST_VSI_SHIFT);
+
+ dtype_cmd = I40E_TX_DESC_DTYPE_FILTER_PROG;
+
+ dtype_cmd |= add ?
+ I40E_FILTER_PROGRAM_DESC_PCMD_ADD_UPDATE <<
+ I40E_TXD_FLTR_QW1_PCMD_SHIFT :
+ I40E_FILTER_PROGRAM_DESC_PCMD_REMOVE <<
+ I40E_TXD_FLTR_QW1_PCMD_SHIFT;
+
+ dtype_cmd |= I40E_TXD_FLTR_QW1_DEST_MASK &
+ (fdata->dest_ctl << I40E_TXD_FLTR_QW1_DEST_SHIFT);
+
+ dtype_cmd |= I40E_TXD_FLTR_QW1_FD_STATUS_MASK &
+ (fdata->fd_status << I40E_TXD_FLTR_QW1_FD_STATUS_SHIFT);
+
+ if (fdata->cnt_index) {
+ dtype_cmd |= I40E_TXD_FLTR_QW1_CNT_ENA_MASK;
+ dtype_cmd |= I40E_TXD_FLTR_QW1_CNTINDEX_MASK &
+ ((u32)fdata->cnt_index <<
+ I40E_TXD_FLTR_QW1_CNTINDEX_SHIFT);
+ }
+
+ fdir_desc->qindex_flex_ptype_vsi = cpu_to_le32(flex_ptype);
+ fdir_desc->rsvd = cpu_to_le32(0);
+ fdir_desc->dtype_cmd_cntindex = cpu_to_le32(dtype_cmd);
+ fdir_desc->fd_id = cpu_to_le32(fdata->fd_id);
+}
+
#define I40E_FD_CLEAN_DELAY 10
/**
* i40e_program_fdir_filter - Program a Flow Director filter
@@ -48,14 +111,13 @@
* @pf: The PF pointer
* @add: True for add/update, False for remove
**/
-int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data, u8 *raw_packet,
- struct i40e_pf *pf, bool add)
+static int i40e_program_fdir_filter(struct i40e_fdir_filter *fdir_data,
+ u8 *raw_packet, struct i40e_pf *pf,
+ bool add)
{
- struct i40e_filter_program_desc *fdir_desc;
struct i40e_tx_buffer *tx_buf, *first;
struct i40e_tx_desc *tx_desc;
struct i40e_ring *tx_ring;
- unsigned int fpt, dcc;
struct i40e_vsi *vsi;
struct device *dev;
dma_addr_t dma;
@@ -92,56 +154,8 @@
/* grab the next descriptor */
i = tx_ring->next_to_use;
- fdir_desc = I40E_TX_FDIRDESC(tx_ring, i);
first = &tx_ring->tx_bi[i];
- memset(first, 0, sizeof(struct i40e_tx_buffer));
-
- tx_ring->next_to_use = ((i + 1) < tx_ring->count) ? i + 1 : 0;
-
- fpt = (fdir_data->q_index << I40E_TXD_FLTR_QW0_QINDEX_SHIFT) &
- I40E_TXD_FLTR_QW0_QINDEX_MASK;
-
- fpt |= (fdir_data->flex_off << I40E_TXD_FLTR_QW0_FLEXOFF_SHIFT) &
- I40E_TXD_FLTR_QW0_FLEXOFF_MASK;
-
- fpt |= (fdir_data->pctype << I40E_TXD_FLTR_QW0_PCTYPE_SHIFT) &
- I40E_TXD_FLTR_QW0_PCTYPE_MASK;
-
- /* Use LAN VSI Id if not programmed by user */
- if (fdir_data->dest_vsi == 0)
- fpt |= (pf->vsi[pf->lan_vsi]->id) <<
- I40E_TXD_FLTR_QW0_DEST_VSI_SHIFT;
- else
- fpt |= ((u32)fdir_data->dest_vsi <<
- I40E_TXD_FLTR_QW0_DEST_VSI_SHIFT) &
- I40E_TXD_FLTR_QW0_DEST_VSI_MASK;
-
- dcc = I40E_TX_DESC_DTYPE_FILTER_PROG;
-
- if (add)
- dcc |= I40E_FILTER_PROGRAM_DESC_PCMD_ADD_UPDATE <<
- I40E_TXD_FLTR_QW1_PCMD_SHIFT;
- else
- dcc |= I40E_FILTER_PROGRAM_DESC_PCMD_REMOVE <<
- I40E_TXD_FLTR_QW1_PCMD_SHIFT;
-
- dcc |= (fdir_data->dest_ctl << I40E_TXD_FLTR_QW1_DEST_SHIFT) &
- I40E_TXD_FLTR_QW1_DEST_MASK;
-
- dcc |= (fdir_data->fd_status << I40E_TXD_FLTR_QW1_FD_STATUS_SHIFT) &
- I40E_TXD_FLTR_QW1_FD_STATUS_MASK;
-
- if (fdir_data->cnt_index != 0) {
- dcc |= I40E_TXD_FLTR_QW1_CNT_ENA_MASK;
- dcc |= ((u32)fdir_data->cnt_index <<
- I40E_TXD_FLTR_QW1_CNTINDEX_SHIFT) &
- I40E_TXD_FLTR_QW1_CNTINDEX_MASK;
- }
-
- fdir_desc->qindex_flex_ptype_vsi = cpu_to_le32(fpt);
- fdir_desc->rsvd = cpu_to_le32(0);
- fdir_desc->dtype_cmd_cntindex = cpu_to_le32(dcc);
- fdir_desc->fd_id = cpu_to_le32(fdir_data->fd_id);
+ i40e_fdir(tx_ring, fdir_data, add);
/* Now program a dummy descriptor */
i = tx_ring->next_to_use;
@@ -282,18 +296,18 @@
if (add) {
pf->fd_tcp_rule++;
- if (pf->flags & I40E_FLAG_FD_ATR_ENABLED) {
- if (I40E_DEBUG_FD & pf->hw.debug_mask)
- dev_info(&pf->pdev->dev, "Forcing ATR off, sideband rules for TCP/IPv4 flow being applied\n");
- pf->flags &= ~I40E_FLAG_FD_ATR_ENABLED;
- }
+ if ((pf->flags & I40E_FLAG_FD_ATR_ENABLED) &&
+ I40E_DEBUG_FD & pf->hw.debug_mask)
+ dev_info(&pf->pdev->dev, "Forcing ATR off, sideband rules for TCP/IPv4 flow being applied\n");
+ pf->auto_disable_flags |= I40E_FLAG_FD_ATR_ENABLED;
} else {
pf->fd_tcp_rule = (pf->fd_tcp_rule > 0) ?
(pf->fd_tcp_rule - 1) : 0;
if (pf->fd_tcp_rule == 0) {
- pf->flags |= I40E_FLAG_FD_ATR_ENABLED;
- if (I40E_DEBUG_FD & pf->hw.debug_mask)
+ if ((pf->flags & I40E_FLAG_FD_ATR_ENABLED) &&
+ I40E_DEBUG_FD & pf->hw.debug_mask)
dev_info(&pf->pdev->dev, "ATR re-enabled due to no sideband TCP/IPv4 rules\n");
+ pf->auto_disable_flags &= ~I40E_FLAG_FD_ATR_ENABLED;
}
}
@@ -532,7 +546,10 @@
struct i40e_tx_buffer *tx_buffer)
{
if (tx_buffer->skb) {
- dev_kfree_skb_any(tx_buffer->skb);
+ if (tx_buffer->tx_flags & I40E_TX_FLAGS_FD_SB)
+ kfree(tx_buffer->raw_buf);
+ else
+ dev_kfree_skb_any(tx_buffer->skb);
if (dma_unmap_len(tx_buffer, len))
dma_unmap_single(ring->dev,
dma_unmap_addr(tx_buffer, dma),
@@ -545,9 +562,6 @@
DMA_TO_DEVICE);
}
- if (tx_buffer->tx_flags & I40E_TX_FLAGS_FD_SB)
- kfree(tx_buffer->raw_buf);
-
tx_buffer->next_to_watch = NULL;
tx_buffer->skb = NULL;
dma_unmap_len_set(tx_buffer, len, 0);
@@ -584,8 +598,7 @@
return;
/* cleanup Tx queue statistics */
- netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index));
+ netdev_tx_reset_queue(txring_txq(tx_ring));
}
/**
@@ -754,8 +767,8 @@
tx_ring->arm_wb = true;
}
- netdev_tx_completed_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index),
+ /* notify netdev of completed buffers */
+ netdev_tx_completed_queue(txring_txq(tx_ring),
total_packets, total_bytes);
#define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
@@ -1864,6 +1877,15 @@
/* a small macro to shorten up some long lines */
#define INTREG I40E_PFINT_DYN_CTLN
+static inline int get_rx_itr_enabled(struct i40e_vsi *vsi, int idx)
+{
+ return !!(vsi->rx_rings[idx]->rx_itr_setting);
+}
+
+static inline int get_tx_itr_enabled(struct i40e_vsi *vsi, int idx)
+{
+ return !!(vsi->tx_rings[idx]->tx_itr_setting);
+}
/**
* i40e_update_enable_itr - Update itr and re-enable MSIX interrupt
@@ -1879,6 +1901,7 @@
u32 rxval, txval;
int vector;
int idx = q_vector->v_idx;
+ int rx_itr_setting, tx_itr_setting;
vector = (q_vector->v_idx + vsi->base_vector);
@@ -1887,18 +1910,21 @@
*/
rxval = txval = i40e_buildreg_itr(I40E_ITR_NONE, 0);
+ rx_itr_setting = get_rx_itr_enabled(vsi, idx);
+ tx_itr_setting = get_tx_itr_enabled(vsi, idx);
+
if (q_vector->itr_countdown > 0 ||
- (!ITR_IS_DYNAMIC(vsi->rx_rings[idx]->rx_itr_setting) &&
- !ITR_IS_DYNAMIC(vsi->tx_rings[idx]->tx_itr_setting))) {
+ (!ITR_IS_DYNAMIC(rx_itr_setting) &&
+ !ITR_IS_DYNAMIC(tx_itr_setting))) {
goto enable_int;
}
- if (ITR_IS_DYNAMIC(vsi->rx_rings[idx]->rx_itr_setting)) {
+ if (ITR_IS_DYNAMIC(tx_itr_setting)) {
rx = i40e_set_new_dynamic_itr(&q_vector->rx);
rxval = i40e_buildreg_itr(I40E_RX_ITR, q_vector->rx.itr);
}
- if (ITR_IS_DYNAMIC(vsi->tx_rings[idx]->tx_itr_setting)) {
+ if (ITR_IS_DYNAMIC(tx_itr_setting)) {
tx = i40e_set_new_dynamic_itr(&q_vector->tx);
txval = i40e_buildreg_itr(I40E_TX_ITR, q_vector->tx.itr);
}
@@ -2621,9 +2647,7 @@
return false;
/* We need to walk through the list and validate that each group
- * of 6 fragments totals at least gso_size. However we don't need
- * to perform such validation on the last 6 since the last 6 cannot
- * inherit any data from a descriptor after them.
+ * of 6 fragments totals at least gso_size.
*/
nr_frags -= I40E_MAX_BUFFER_TXD - 2;
frag = &skb_shinfo(skb)->frags[0];
@@ -2654,8 +2678,7 @@
if (sum < 0)
return true;
- /* use pre-decrement to avoid processing last fragment */
- if (!--nr_frags)
+ if (!nr_frags--)
break;
sum -= skb_frag_size(stale++);
@@ -2787,9 +2810,7 @@
tx_ring->next_to_use = i;
- netdev_tx_sent_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index),
- first->bytecount);
+ netdev_tx_sent_queue(txring_txq(tx_ring), first->bytecount);
i40e_maybe_stop_tx(tx_ring, DESC_NEEDED);
/* Algorithm to optimize tail and RS bit setting:
@@ -2814,13 +2835,11 @@
* trigger a force WB.
*/
if (skb->xmit_more &&
- !netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index))) {
+ !netif_xmit_stopped(txring_txq(tx_ring))) {
tx_ring->flags |= I40E_TXR_FLAGS_LAST_XMIT_MORE_SET;
tail_bump = false;
} else if (!skb->xmit_more &&
- !netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index)) &&
+ !netif_xmit_stopped(txring_txq(tx_ring)) &&
(!(tx_ring->flags & I40E_TXR_FLAGS_LAST_XMIT_MORE_SET)) &&
(tx_ring->packet_stride < WB_STRIDE) &&
(desc_count < WB_STRIDE)) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index b78c810..5088405 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -463,4 +463,13 @@
return (ptype >= I40E_RX_PTYPE_L2_FCOE_PAY3) &&
(ptype <= I40E_RX_PTYPE_L2_FCOE_VFT_FCOTHER);
}
+
+/**
+ * txring_txq - Find the netdev Tx ring based on the i40e Tx ring
+ * @ring: Tx ring to find the netdev equivalent of
+ **/
+static inline struct netdev_queue *txring_txq(const struct i40e_ring *ring)
+{
+ return netdev_get_tx_queue(ring->netdev, ring->queue_index);
+}
#endif /* _I40E_TXRX_H_ */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl.h
index c92a3bd..f861d31 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl.h
@@ -163,6 +163,7 @@
#define I40E_VIRTCHNL_VF_OFFLOAD_RX_POLLING 0x00020000
#define I40E_VIRTCHNL_VF_OFFLOAD_RSS_PCTYPE_V2 0x00040000
#define I40E_VIRTCHNL_VF_OFFLOAD_RSS_PF 0X00080000
+#define I40E_VIRTCHNL_VF_OFFLOAD_ENCAP_CSUM 0X00100000
struct i40e_virtchnl_vf_resource {
u16 num_vsis;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index da34235..54b8ee2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -502,8 +502,16 @@
u32 qtx_ctl;
int ret = 0;
+ if (!i40e_vc_isvalid_vsi_id(vf, info->vsi_id)) {
+ ret = -ENOENT;
+ goto error_context;
+ }
pf_queue_id = i40e_vc_get_pf_queue_id(vf, vsi_id, vsi_queue_id);
vsi = i40e_find_vsi_from_id(pf, vsi_id);
+ if (!vsi) {
+ ret = -ENOENT;
+ goto error_context;
+ }
/* clear the context structure first */
memset(&tx_ctx, 0, sizeof(struct i40e_hmc_obj_txq));
@@ -1476,7 +1484,8 @@
vsi = i40e_find_vsi_from_id(pf, info->vsi_id);
if (!test_bit(I40E_VF_STAT_ACTIVE, &vf->vf_states) ||
- !i40e_vc_isvalid_vsi_id(vf, info->vsi_id)) {
+ !i40e_vc_isvalid_vsi_id(vf, info->vsi_id) ||
+ !vsi) {
aq_ret = I40E_ERR_PARAM;
goto error_param;
}
@@ -2217,8 +2226,8 @@
error_param:
/* send the response to the VF */
return i40e_vc_send_resp_to_vf(vf,
- config ? I40E_VIRTCHNL_OP_RELEASE_IWARP_IRQ_MAP :
- I40E_VIRTCHNL_OP_CONFIG_IWARP_IRQ_MAP,
+ config ? I40E_VIRTCHNL_OP_CONFIG_IWARP_IRQ_MAP :
+ I40E_VIRTCHNL_OP_RELEASE_IWARP_IRQ_MAP,
aq_ret);
}
@@ -2747,11 +2756,12 @@
* @vf_id: VF identifier
* @vlan_id: mac address
* @qos: priority setting
+ * @vlan_proto: vlan protocol
*
* program VF vlan id and/or qos
**/
-int i40e_ndo_set_vf_port_vlan(struct net_device *netdev,
- int vf_id, u16 vlan_id, u8 qos)
+int i40e_ndo_set_vf_port_vlan(struct net_device *netdev, int vf_id,
+ u16 vlan_id, u8 qos, __be16 vlan_proto)
{
u16 vlanprio = vlan_id | (qos << I40E_VLAN_PRIORITY_SHIFT);
struct i40e_netdev_priv *np = netdev_priv(netdev);
@@ -2774,6 +2784,12 @@
goto error_pvid;
}
+ if (vlan_proto != htons(ETH_P_8021Q)) {
+ dev_err(&pf->pdev->dev, "VF VLAN protocol is not supported\n");
+ ret = -EPROTONOSUPPORT;
+ goto error_pvid;
+ }
+
vf = &(pf->vf[vf_id]);
vsi = pf->vsi[vf->lan_vsi_idx];
if (!test_bit(I40E_VF_STAT_INIT, &vf->vf_states)) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
index 8751741..4012d06 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.h
@@ -129,8 +129,8 @@
/* VF configuration related iplink handlers */
int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac);
-int i40e_ndo_set_vf_port_vlan(struct net_device *netdev,
- int vf_id, u16 vlan_id, u8 qos);
+int i40e_ndo_set_vf_port_vlan(struct net_device *netdev, int vf_id,
+ u16 vlan_id, u8 qos, __be16 vlan_proto);
int i40e_ndo_set_vf_bw(struct net_device *netdev, int vf_id, int min_tx_rate,
int max_tx_rate);
int i40e_ndo_set_vf_trust(struct net_device *netdev, int vf_id, bool setting);
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_common.c b/drivers/net/ethernet/intel/i40evf/i40e_common.c
index 4db0c03..7953c13 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_common.c
@@ -302,7 +302,6 @@
void *buffer, u16 buf_len)
{
struct i40e_aq_desc *aq_desc = (struct i40e_aq_desc *)desc;
- u16 len = le16_to_cpu(aq_desc->datalen);
u8 *buf = (u8 *)buffer;
u16 i = 0;
@@ -326,6 +325,8 @@
le32_to_cpu(aq_desc->params.external.addr_low));
if ((buffer != NULL) && (aq_desc->datalen != 0)) {
+ u16 len = le16_to_cpu(aq_desc->datalen);
+
i40e_debug(hw, mask, "AQ CMD Buffer:\n");
if (buf_len < len)
len = buf_len;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 0130458..75f2a2c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -51,7 +51,10 @@
struct i40e_tx_buffer *tx_buffer)
{
if (tx_buffer->skb) {
- dev_kfree_skb_any(tx_buffer->skb);
+ if (tx_buffer->tx_flags & I40E_TX_FLAGS_FD_SB)
+ kfree(tx_buffer->raw_buf);
+ else
+ dev_kfree_skb_any(tx_buffer->skb);
if (dma_unmap_len(tx_buffer, len))
dma_unmap_single(ring->dev,
dma_unmap_addr(tx_buffer, dma),
@@ -64,9 +67,6 @@
DMA_TO_DEVICE);
}
- if (tx_buffer->tx_flags & I40E_TX_FLAGS_FD_SB)
- kfree(tx_buffer->raw_buf);
-
tx_buffer->next_to_watch = NULL;
tx_buffer->skb = NULL;
dma_unmap_len_set(tx_buffer, len, 0);
@@ -103,8 +103,7 @@
return;
/* cleanup Tx queue statistics */
- netdev_tx_reset_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index));
+ netdev_tx_reset_queue(txring_txq(tx_ring));
}
/**
@@ -273,8 +272,8 @@
tx_ring->arm_wb = true;
}
- netdev_tx_completed_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index),
+ /* notify netdev of completed buffers */
+ netdev_tx_completed_queue(txring_txq(tx_ring),
total_packets, total_bytes);
#define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
@@ -1312,6 +1311,19 @@
/* a small macro to shorten up some long lines */
#define INTREG I40E_VFINT_DYN_CTLN1
+static inline int get_rx_itr_enabled(struct i40e_vsi *vsi, int idx)
+{
+ struct i40evf_adapter *adapter = vsi->back;
+
+ return !!(adapter->rx_rings[idx].rx_itr_setting);
+}
+
+static inline int get_tx_itr_enabled(struct i40e_vsi *vsi, int idx)
+{
+ struct i40evf_adapter *adapter = vsi->back;
+
+ return !!(adapter->tx_rings[idx].tx_itr_setting);
+}
/**
* i40e_update_enable_itr - Update itr and re-enable MSIX interrupt
@@ -1326,6 +1338,8 @@
bool rx = false, tx = false;
u32 rxval, txval;
int vector;
+ int idx = q_vector->v_idx;
+ int rx_itr_setting, tx_itr_setting;
vector = (q_vector->v_idx + vsi->base_vector);
@@ -1334,18 +1348,21 @@
*/
rxval = txval = i40e_buildreg_itr(I40E_ITR_NONE, 0);
+ rx_itr_setting = get_rx_itr_enabled(vsi, idx);
+ tx_itr_setting = get_tx_itr_enabled(vsi, idx);
+
if (q_vector->itr_countdown > 0 ||
- (!ITR_IS_DYNAMIC(vsi->rx_itr_setting) &&
- !ITR_IS_DYNAMIC(vsi->tx_itr_setting))) {
+ (!ITR_IS_DYNAMIC(rx_itr_setting) &&
+ !ITR_IS_DYNAMIC(tx_itr_setting))) {
goto enable_int;
}
- if (ITR_IS_DYNAMIC(vsi->rx_itr_setting)) {
+ if (ITR_IS_DYNAMIC(rx_itr_setting)) {
rx = i40e_set_new_dynamic_itr(&q_vector->rx);
rxval = i40e_buildreg_itr(I40E_RX_ITR, q_vector->rx.itr);
}
- if (ITR_IS_DYNAMIC(vsi->tx_itr_setting)) {
+ if (ITR_IS_DYNAMIC(tx_itr_setting)) {
tx = i40e_set_new_dynamic_itr(&q_vector->tx);
txval = i40e_buildreg_itr(I40E_TX_ITR, q_vector->tx.itr);
}
@@ -1832,9 +1849,7 @@
return false;
/* We need to walk through the list and validate that each group
- * of 6 fragments totals at least gso_size. However we don't need
- * to perform such validation on the last 6 since the last 6 cannot
- * inherit any data from a descriptor after them.
+ * of 6 fragments totals at least gso_size.
*/
nr_frags -= I40E_MAX_BUFFER_TXD - 2;
frag = &skb_shinfo(skb)->frags[0];
@@ -1865,8 +1880,7 @@
if (sum < 0)
return true;
- /* use pre-decrement to avoid processing last fragment */
- if (!--nr_frags)
+ if (!nr_frags--)
break;
sum -= skb_frag_size(stale++);
@@ -2015,9 +2029,7 @@
tx_ring->next_to_use = i;
- netdev_tx_sent_queue(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index),
- first->bytecount);
+ netdev_tx_sent_queue(txring_txq(tx_ring), first->bytecount);
i40e_maybe_stop_tx(tx_ring, DESC_NEEDED);
/* Algorithm to optimize tail and RS bit setting:
@@ -2042,13 +2054,11 @@
* trigger a force WB.
*/
if (skb->xmit_more &&
- !netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index))) {
+ !netif_xmit_stopped(txring_txq(tx_ring))) {
tx_ring->flags |= I40E_TXR_FLAGS_LAST_XMIT_MORE_SET;
tail_bump = false;
} else if (!skb->xmit_more &&
- !netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
- tx_ring->queue_index)) &&
+ !netif_xmit_stopped(txring_txq(tx_ring)) &&
(!(tx_ring->flags & I40E_TXR_FLAGS_LAST_XMIT_MORE_SET)) &&
(tx_ring->packet_stride < WB_STRIDE) &&
(desc_count < WB_STRIDE)) {
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
index 0112277..abcdeca 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
@@ -287,6 +287,14 @@
u8 dcb_tc; /* Traffic class of ring */
u8 __iomem *tail;
+ /* high bit set means dynamic, use accessors routines to read/write.
+ * hardware only supports 2us resolution for the ITR registers.
+ * these values always store the USER setting, and must be converted
+ * before programming to a register.
+ */
+ u16 rx_itr_setting;
+ u16 tx_itr_setting;
+
u16 count; /* Number of descriptors */
u16 reg_idx; /* HW register index of the ring */
u16 rx_buf_len;
@@ -445,4 +453,13 @@
return (ptype >= I40E_RX_PTYPE_L2_FCOE_PAY3) &&
(ptype <= I40E_RX_PTYPE_L2_FCOE_VFT_FCOTHER);
}
+
+/**
+ * txring_txq - Find the netdev Tx ring based on the i40e Tx ring
+ * @ring: Tx ring to find the netdev equivalent of
+ **/
+static inline struct netdev_queue *txring_txq(const struct i40e_ring *ring)
+{
+ return netdev_get_tx_queue(ring->netdev, ring->queue_index);
+}
#endif /* _I40E_TXRX_H_ */
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h b/drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h
index f04ce6c..bd691ad 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_virtchnl.h
@@ -160,6 +160,7 @@
#define I40E_VIRTCHNL_VF_OFFLOAD_RX_POLLING 0x00020000
#define I40E_VIRTCHNL_VF_OFFLOAD_RSS_PCTYPE_V2 0x00040000
#define I40E_VIRTCHNL_VF_OFFLOAD_RSS_PF 0X00080000
+#define I40E_VIRTCHNL_VF_OFFLOAD_ENCAP_CSUM 0X00100000
struct i40e_virtchnl_vf_resource {
u16 num_vsis;
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf.h b/drivers/net/ethernet/intel/i40evf/i40evf.h
index dc00aaf..c5fd724 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf.h
+++ b/drivers/net/ethernet/intel/i40evf/i40evf.h
@@ -59,13 +59,6 @@
unsigned long state;
int base_vector;
u16 work_limit;
- /* high bit set means dynamic, use accessor routines to read/write.
- * hardware only supports 2us resolution for the ITR registers.
- * these values always store the USER setting, and must be converted
- * before programming to a register.
- */
- u16 rx_itr_setting;
- u16 tx_itr_setting;
u16 qs_handle;
};
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index e17a154..a994015 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -296,31 +296,174 @@
}
/**
+ * __i40evf_get_coalesce - get per-queue coalesce settings
+ * @netdev: the netdev to check
+ * @ec: ethtool coalesce data structure
+ * @queue: which queue to pick
+ *
+ * Gets the per-queue settings for coalescence. Specifically Rx and Tx usecs
+ * are per queue. If queue is <0 then we default to queue 0 as the
+ * representative value.
+ **/
+static int __i40evf_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec,
+ int queue)
+{
+ struct i40evf_adapter *adapter = netdev_priv(netdev);
+ struct i40e_vsi *vsi = &adapter->vsi;
+ struct i40e_ring *rx_ring, *tx_ring;
+
+ ec->tx_max_coalesced_frames = vsi->work_limit;
+ ec->rx_max_coalesced_frames = vsi->work_limit;
+
+ /* Rx and Tx usecs per queue value. If user doesn't specify the
+ * queue, return queue 0's value to represent.
+ */
+ if (queue < 0)
+ queue = 0;
+ else if (queue >= adapter->num_active_queues)
+ return -EINVAL;
+
+ rx_ring = &adapter->rx_rings[queue];
+ tx_ring = &adapter->tx_rings[queue];
+
+ if (ITR_IS_DYNAMIC(rx_ring->rx_itr_setting))
+ ec->use_adaptive_rx_coalesce = 1;
+
+ if (ITR_IS_DYNAMIC(tx_ring->tx_itr_setting))
+ ec->use_adaptive_tx_coalesce = 1;
+
+ ec->rx_coalesce_usecs = rx_ring->rx_itr_setting & ~I40E_ITR_DYNAMIC;
+ ec->tx_coalesce_usecs = tx_ring->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+
+ return 0;
+}
+
+/**
* i40evf_get_coalesce - Get interrupt coalescing settings
* @netdev: network interface device structure
* @ec: ethtool coalesce structure
*
* Returns current coalescing settings. This is referred to elsewhere in the
* driver as Interrupt Throttle Rate, as this is how the hardware describes
- * this functionality.
+ * this functionality. Note that if per-queue settings have been modified this
+ * only represents the settings of queue 0.
**/
static int i40evf_get_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec)
{
+ return __i40evf_get_coalesce(netdev, ec, -1);
+}
+
+/**
+ * i40evf_get_per_queue_coalesce - get coalesce values for specific queue
+ * @netdev: netdev to read
+ * @ec: coalesce settings from ethtool
+ * @queue: the queue to read
+ *
+ * Read specific queue's coalesce settings.
+ **/
+static int i40evf_get_per_queue_coalesce(struct net_device *netdev,
+ u32 queue,
+ struct ethtool_coalesce *ec)
+{
+ return __i40evf_get_coalesce(netdev, ec, queue);
+}
+
+/**
+ * i40evf_set_itr_per_queue - set ITR values for specific queue
+ * @vsi: the VSI to set values for
+ * @ec: coalesce settings from ethtool
+ * @queue: the queue to modify
+ *
+ * Change the ITR settings for a specific queue.
+ **/
+static void i40evf_set_itr_per_queue(struct i40evf_adapter *adapter,
+ struct ethtool_coalesce *ec,
+ int queue)
+{
+ struct i40e_vsi *vsi = &adapter->vsi;
+ struct i40e_hw *hw = &adapter->hw;
+ struct i40e_q_vector *q_vector;
+ u16 vector;
+
+ adapter->rx_rings[queue].rx_itr_setting = ec->rx_coalesce_usecs;
+ adapter->tx_rings[queue].tx_itr_setting = ec->tx_coalesce_usecs;
+
+ if (ec->use_adaptive_rx_coalesce)
+ adapter->rx_rings[queue].rx_itr_setting |= I40E_ITR_DYNAMIC;
+ else
+ adapter->rx_rings[queue].rx_itr_setting &= ~I40E_ITR_DYNAMIC;
+
+ if (ec->use_adaptive_tx_coalesce)
+ adapter->tx_rings[queue].tx_itr_setting |= I40E_ITR_DYNAMIC;
+ else
+ adapter->tx_rings[queue].tx_itr_setting &= ~I40E_ITR_DYNAMIC;
+
+ q_vector = adapter->rx_rings[queue].q_vector;
+ q_vector->rx.itr = ITR_TO_REG(adapter->rx_rings[queue].rx_itr_setting);
+ vector = vsi->base_vector + q_vector->v_idx;
+ wr32(hw, I40E_VFINT_ITRN1(I40E_RX_ITR, vector - 1), q_vector->rx.itr);
+
+ q_vector = adapter->tx_rings[queue].q_vector;
+ q_vector->tx.itr = ITR_TO_REG(adapter->tx_rings[queue].tx_itr_setting);
+ vector = vsi->base_vector + q_vector->v_idx;
+ wr32(hw, I40E_VFINT_ITRN1(I40E_TX_ITR, vector - 1), q_vector->tx.itr);
+
+ i40e_flush(hw);
+}
+
+/**
+ * __i40evf_set_coalesce - set coalesce settings for particular queue
+ * @netdev: the netdev to change
+ * @ec: ethtool coalesce settings
+ * @queue: the queue to change
+ *
+ * Sets the coalesce settings for a particular queue.
+ **/
+static int __i40evf_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec,
+ int queue)
+{
struct i40evf_adapter *adapter = netdev_priv(netdev);
struct i40e_vsi *vsi = &adapter->vsi;
+ int i;
- ec->tx_max_coalesced_frames = vsi->work_limit;
- ec->rx_max_coalesced_frames = vsi->work_limit;
+ if (ec->tx_max_coalesced_frames_irq || ec->rx_max_coalesced_frames_irq)
+ vsi->work_limit = ec->tx_max_coalesced_frames_irq;
- if (ITR_IS_DYNAMIC(vsi->rx_itr_setting))
- ec->use_adaptive_rx_coalesce = 1;
+ if (ec->rx_coalesce_usecs == 0) {
+ if (ec->use_adaptive_rx_coalesce)
+ netif_info(adapter, drv, netdev, "rx-usecs=0, need to disable adaptive-rx for a complete disable\n");
+ } else if ((ec->rx_coalesce_usecs < (I40E_MIN_ITR << 1)) ||
+ (ec->rx_coalesce_usecs > (I40E_MAX_ITR << 1))) {
+ netif_info(adapter, drv, netdev, "Invalid value, rx-usecs range is 0-8160\n");
+ return -EINVAL;
+ }
- if (ITR_IS_DYNAMIC(vsi->tx_itr_setting))
- ec->use_adaptive_tx_coalesce = 1;
+ else
+ if (ec->tx_coalesce_usecs == 0) {
+ if (ec->use_adaptive_tx_coalesce)
+ netif_info(adapter, drv, netdev, "tx-usecs=0, need to disable adaptive-tx for a complete disable\n");
+ } else if ((ec->tx_coalesce_usecs < (I40E_MIN_ITR << 1)) ||
+ (ec->tx_coalesce_usecs > (I40E_MAX_ITR << 1))) {
+ netif_info(adapter, drv, netdev, "Invalid value, tx-usecs range is 0-8160\n");
+ return -EINVAL;
+ }
- ec->rx_coalesce_usecs = vsi->rx_itr_setting & ~I40E_ITR_DYNAMIC;
- ec->tx_coalesce_usecs = vsi->tx_itr_setting & ~I40E_ITR_DYNAMIC;
+ /* Rx and Tx usecs has per queue value. If user doesn't specify the
+ * queue, apply to all queues.
+ */
+ if (queue < 0) {
+ for (i = 0; i < adapter->num_active_queues; i++)
+ i40evf_set_itr_per_queue(adapter, ec, i);
+ } else if (queue < adapter->num_active_queues) {
+ i40evf_set_itr_per_queue(adapter, ec, queue);
+ } else {
+ netif_info(adapter, drv, netdev, "Invalid queue value, queue range is 0 - %d\n",
+ adapter->num_active_queues - 1);
+ return -EINVAL;
+ }
return 0;
}
@@ -330,56 +473,27 @@
* @netdev: network interface device structure
* @ec: ethtool coalesce structure
*
- * Change current coalescing settings.
+ * Change current coalescing settings for every queue.
**/
static int i40evf_set_coalesce(struct net_device *netdev,
struct ethtool_coalesce *ec)
{
- struct i40evf_adapter *adapter = netdev_priv(netdev);
- struct i40e_hw *hw = &adapter->hw;
- struct i40e_vsi *vsi = &adapter->vsi;
- struct i40e_q_vector *q_vector;
- int i;
+ return __i40evf_set_coalesce(netdev, ec, -1);
+}
- if (ec->tx_max_coalesced_frames_irq || ec->rx_max_coalesced_frames_irq)
- vsi->work_limit = ec->tx_max_coalesced_frames_irq;
-
- if ((ec->rx_coalesce_usecs >= (I40E_MIN_ITR << 1)) &&
- (ec->rx_coalesce_usecs <= (I40E_MAX_ITR << 1)))
- vsi->rx_itr_setting = ec->rx_coalesce_usecs;
-
- else
- return -EINVAL;
-
- if ((ec->tx_coalesce_usecs >= (I40E_MIN_ITR << 1)) &&
- (ec->tx_coalesce_usecs <= (I40E_MAX_ITR << 1)))
- vsi->tx_itr_setting = ec->tx_coalesce_usecs;
- else if (ec->use_adaptive_tx_coalesce)
- vsi->tx_itr_setting = (I40E_ITR_DYNAMIC |
- ITR_REG_TO_USEC(I40E_ITR_RX_DEF));
- else
- return -EINVAL;
-
- if (ec->use_adaptive_rx_coalesce)
- vsi->rx_itr_setting |= I40E_ITR_DYNAMIC;
- else
- vsi->rx_itr_setting &= ~I40E_ITR_DYNAMIC;
-
- if (ec->use_adaptive_tx_coalesce)
- vsi->tx_itr_setting |= I40E_ITR_DYNAMIC;
- else
- vsi->tx_itr_setting &= ~I40E_ITR_DYNAMIC;
-
- for (i = 0; i < adapter->num_msix_vectors - NONQ_VECS; i++) {
- q_vector = &adapter->q_vectors[i];
- q_vector->rx.itr = ITR_TO_REG(vsi->rx_itr_setting);
- wr32(hw, I40E_VFINT_ITRN1(0, i), q_vector->rx.itr);
- q_vector->tx.itr = ITR_TO_REG(vsi->tx_itr_setting);
- wr32(hw, I40E_VFINT_ITRN1(1, i), q_vector->tx.itr);
- i40e_flush(hw);
- }
-
- return 0;
+/**
+ * i40evf_set_per_queue_coalesce - set specific queue's coalesce settings
+ * @netdev: the netdev to change
+ * @ec: ethtool's coalesce settings
+ * @queue: the queue to modify
+ *
+ * Modifies a specific queue's coalesce settings.
+ */
+static int i40evf_set_per_queue_coalesce(struct net_device *netdev,
+ u32 queue,
+ struct ethtool_coalesce *ec)
+{
+ return __i40evf_set_coalesce(netdev, ec, queue);
}
/**
@@ -533,6 +647,8 @@
.set_msglevel = i40evf_set_msglevel,
.get_coalesce = i40evf_get_coalesce,
.set_coalesce = i40evf_set_coalesce,
+ .get_per_queue_coalesce = i40evf_get_per_queue_coalesce,
+ .set_per_queue_coalesce = i40evf_set_per_queue_coalesce,
.get_rxnfc = i40evf_get_rxnfc,
.get_rxfh_indir_size = i40evf_get_rxfh_indir_size,
.get_rxfh = i40evf_get_rxfh,
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index f751f7b..1437281 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -38,7 +38,7 @@
#define DRV_VERSION_MAJOR 1
#define DRV_VERSION_MINOR 6
-#define DRV_VERSION_BUILD 12
+#define DRV_VERSION_BUILD 16
#define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
__stringify(DRV_VERSION_MINOR) "." \
__stringify(DRV_VERSION_BUILD) \
@@ -370,6 +370,7 @@
{
struct i40e_q_vector *q_vector = &adapter->q_vectors[v_idx];
struct i40e_ring *rx_ring = &adapter->rx_rings[r_idx];
+ struct i40e_hw *hw = &adapter->hw;
rx_ring->q_vector = q_vector;
rx_ring->next = q_vector->rx.ring;
@@ -377,7 +378,10 @@
q_vector->rx.ring = rx_ring;
q_vector->rx.count++;
q_vector->rx.latency_range = I40E_LOW_LATENCY;
+ q_vector->rx.itr = ITR_TO_REG(rx_ring->rx_itr_setting);
+ q_vector->ring_mask |= BIT(r_idx);
q_vector->itr_countdown = ITR_COUNTDOWN_START;
+ wr32(hw, I40E_VFINT_ITRN1(I40E_RX_ITR, v_idx - 1), q_vector->rx.itr);
}
/**
@@ -391,6 +395,7 @@
{
struct i40e_q_vector *q_vector = &adapter->q_vectors[v_idx];
struct i40e_ring *tx_ring = &adapter->tx_rings[t_idx];
+ struct i40e_hw *hw = &adapter->hw;
tx_ring->q_vector = q_vector;
tx_ring->next = q_vector->tx.ring;
@@ -398,9 +403,10 @@
q_vector->tx.ring = tx_ring;
q_vector->tx.count++;
q_vector->tx.latency_range = I40E_LOW_LATENCY;
+ q_vector->tx.itr = ITR_TO_REG(tx_ring->tx_itr_setting);
q_vector->itr_countdown = ITR_COUNTDOWN_START;
q_vector->num_ringpairs++;
- q_vector->ring_mask |= BIT(t_idx);
+ wr32(hw, I40E_VFINT_ITRN1(I40E_TX_ITR, v_idx - 1), q_vector->tx.itr);
}
/**
@@ -1007,7 +1013,7 @@
* i40evf_up_complete - Finish the last steps of bringing up a connection
* @adapter: board private structure
**/
-static int i40evf_up_complete(struct i40evf_adapter *adapter)
+static void i40evf_up_complete(struct i40evf_adapter *adapter)
{
adapter->state = __I40EVF_RUNNING;
clear_bit(__I40E_DOWN, &adapter->vsi.state);
@@ -1016,7 +1022,6 @@
adapter->aq_required |= I40EVF_FLAG_AQ_ENABLE_QUEUES;
mod_timer_pending(&adapter->watchdog_timer, jiffies + 1);
- return 0;
}
/**
@@ -1037,6 +1042,7 @@
netif_carrier_off(netdev);
netif_tx_disable(netdev);
+ adapter->link_up = false;
i40evf_napi_disable_all(adapter);
i40evf_irq_disable(adapter);
@@ -1154,6 +1160,7 @@
tx_ring->netdev = adapter->netdev;
tx_ring->dev = &adapter->pdev->dev;
tx_ring->count = adapter->tx_desc_count;
+ tx_ring->tx_itr_setting = (I40E_ITR_DYNAMIC | I40E_ITR_TX_DEF);
if (adapter->flags & I40E_FLAG_WB_ON_ITR_CAPABLE)
tx_ring->flags |= I40E_TXR_FLAGS_WB_ON_ITR;
@@ -1162,6 +1169,7 @@
rx_ring->netdev = adapter->netdev;
rx_ring->dev = &adapter->pdev->dev;
rx_ring->count = adapter->rx_desc_count;
+ rx_ring->rx_itr_setting = (I40E_ITR_DYNAMIC | I40E_ITR_RX_DEF);
}
return 0;
@@ -1731,6 +1739,7 @@
set_bit(__I40E_DOWN, &adapter->vsi.state);
netif_carrier_off(netdev);
netif_tx_disable(netdev);
+ adapter->link_up = false;
i40evf_napi_disable_all(adapter);
i40evf_irq_disable(adapter);
i40evf_free_traffic_irqs(adapter);
@@ -1769,6 +1778,7 @@
if (netif_running(adapter->netdev)) {
netif_carrier_off(netdev);
netif_tx_stop_all_queues(netdev);
+ adapter->link_up = false;
i40evf_napi_disable_all(adapter);
}
i40evf_irq_disable(adapter);
@@ -1783,8 +1793,7 @@
i40evf_free_all_tx_resources(adapter);
/* kill and reinit the admin queue */
- if (i40evf_shutdown_adminq(hw))
- dev_warn(&adapter->pdev->dev, "Failed to shut down adminq\n");
+ i40evf_shutdown_adminq(hw);
adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;
err = i40evf_init_adminq(hw);
if (err)
@@ -1824,9 +1833,7 @@
i40evf_configure(adapter);
- err = i40evf_up_complete(adapter);
- if (err)
- goto reset_err;
+ i40evf_up_complete(adapter);
i40evf_irq_enable(adapter, true);
} else {
@@ -2056,9 +2063,7 @@
i40evf_add_filter(adapter, adapter->hw.mac.addr);
i40evf_configure(adapter);
- err = i40evf_up_complete(adapter);
- if (err)
- goto err_req_irq;
+ i40evf_up_complete(adapter);
i40evf_irq_enable(adapter, true);
@@ -2272,10 +2277,6 @@
adapter->vsi.back = adapter;
adapter->vsi.base_vector = 1;
adapter->vsi.work_limit = I40E_DEFAULT_IRQ_WORK;
- adapter->vsi.rx_itr_setting = (I40E_ITR_DYNAMIC |
- ITR_REG_TO_USEC(I40E_ITR_RX_DEF));
- adapter->vsi.tx_itr_setting = (I40E_ITR_DYNAMIC |
- ITR_REG_TO_USEC(I40E_ITR_TX_DEF));
vsi->netdev = adapter->netdev;
vsi->qs_handle = adapter->vsi_res->qset_handle;
if (vfres->vf_offload_flags & I40E_VIRTCHNL_VF_OFFLOAD_RSS_PF) {
@@ -2457,6 +2458,7 @@
goto err_sw_init;
netif_carrier_off(netdev);
+ adapter->link_up = false;
if (!adapter->netdev_registered) {
err = register_netdev(netdev);
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index cc6cb30..ddf478d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -898,8 +898,14 @@
vpe->event_data.link_event.link_status) {
adapter->link_up =
vpe->event_data.link_event.link_status;
+ if (adapter->link_up) {
+ netif_tx_start_all_queues(netdev);
+ netif_carrier_on(netdev);
+ } else {
+ netif_tx_stop_all_queues(netdev);
+ netif_carrier_off(netdev);
+ }
i40evf_print_link_message(adapter);
- netif_tx_stop_all_queues(netdev);
}
break;
case I40E_VIRTCHNL_EVENT_RESET_IMPENDING:
@@ -974,8 +980,6 @@
case I40E_VIRTCHNL_OP_ENABLE_QUEUES:
/* enable transmits */
i40evf_irq_enable(adapter, true);
- netif_tx_start_all_queues(adapter->netdev);
- netif_carrier_on(adapter->netdev);
break;
case I40E_VIRTCHNL_OP_DISABLE_QUEUES:
i40evf_free_all_tx_resources(adapter);
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 03fbe4b..d11093d 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -489,6 +489,7 @@
struct timecounter tc;
u32 tx_hwtstamp_timeouts;
u32 rx_hwtstamp_cleared;
+ bool pps_sys_wrap_on;
struct ptp_pin_desc sdp_config[IGB_N_SDP];
struct {
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 0c33eca..737b664 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2704,8 +2704,8 @@
return 0;
}
-int igb_rxnfc_write_vlan_prio_filter(struct igb_adapter *adapter,
- struct igb_nfc_filter *input)
+static int igb_rxnfc_write_vlan_prio_filter(struct igb_adapter *adapter,
+ struct igb_nfc_filter *input)
{
struct e1000_hw *hw = &adapter->hw;
u8 vlan_priority;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index af75eac..edc9a6a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -58,7 +58,7 @@
#include "igb.h"
#define MAJ 5
-#define MIN 3
+#define MIN 4
#define BUILD 0
#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN) "." \
__stringify(BUILD) "-k"
@@ -169,7 +169,7 @@
static void igb_restore_vf_multicasts(struct igb_adapter *adapter);
static int igb_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac);
static int igb_ndo_set_vf_vlan(struct net_device *netdev,
- int vf, u16 vlan, u8 qos);
+ int vf, u16 vlan, u8 qos, __be16 vlan_proto);
static int igb_ndo_set_vf_bw(struct net_device *, int, int, int);
static int igb_ndo_set_vf_spoofchk(struct net_device *netdev, int vf,
bool setting);
@@ -6222,14 +6222,17 @@
return 0;
}
-static int igb_ndo_set_vf_vlan(struct net_device *netdev,
- int vf, u16 vlan, u8 qos)
+static int igb_ndo_set_vf_vlan(struct net_device *netdev, int vf,
+ u16 vlan, u8 qos, __be16 vlan_proto)
{
struct igb_adapter *adapter = netdev_priv(netdev);
if ((vf >= adapter->vfs_allocated_count) || (vlan > 4095) || (qos > 7))
return -EINVAL;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
return (vlan || qos) ? igb_enable_port_vlan(adapter, vf, vlan, qos) :
igb_disable_port_vlan(adapter, vf);
}
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 66dfa20..a7895c4 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -591,6 +591,7 @@
tsim |= TSINTR_SYS_WRAP;
else
tsim &= ~TSINTR_SYS_WRAP;
+ igb->pps_sys_wrap_on = !!on;
wr32(E1000_TSIM, tsim);
spin_unlock_irqrestore(&igb->tmreg_lock, flags);
return 0;
@@ -1159,7 +1160,7 @@
if (IS_ERR(adapter->ptp_clock)) {
adapter->ptp_clock = NULL;
dev_err(&adapter->pdev->dev, "ptp_clock_register failed\n");
- } else {
+ } else if (adapter->ptp_clock) {
dev_info(&adapter->pdev->dev, "added PHC on %s\n",
adapter->netdev->name);
adapter->ptp_flags |= IGB_PTP_ENABLED;
@@ -1235,7 +1236,9 @@
case e1000_i211:
wr32(E1000_TSAUXC, 0x0);
wr32(E1000_TSSDP, 0x0);
- wr32(E1000_TSIM, TSYNC_INTERRUPTS);
+ wr32(E1000_TSIM,
+ TSYNC_INTERRUPTS |
+ (adapter->pps_sys_wrap_on ? TSINTR_SYS_WRAP : 0));
wr32(E1000_IMS, E1000_IMS_TS);
break;
default:
diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index b0778ba..12bb877 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -47,7 +47,7 @@
#include "igbvf.h"
-#define DRV_VERSION "2.0.2-k"
+#define DRV_VERSION "2.4.0-k"
char igbvf_driver_name[] = "igbvf";
const char igbvf_driver_version[] = DRV_VERSION;
static const char igbvf_driver_string[] =
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 9547191..f49f803 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -313,6 +313,25 @@
break;
}
+ /* Indicate pause support */
+ ecmd->supported |= SUPPORTED_Pause;
+
+ switch (hw->fc.requested_mode) {
+ case ixgbe_fc_full:
+ ecmd->advertising |= ADVERTISED_Pause;
+ break;
+ case ixgbe_fc_rx_pause:
+ ecmd->advertising |= ADVERTISED_Pause |
+ ADVERTISED_Asym_Pause;
+ break;
+ case ixgbe_fc_tx_pause:
+ ecmd->advertising |= ADVERTISED_Asym_Pause;
+ break;
+ default:
+ ecmd->advertising &= ~(ADVERTISED_Pause |
+ ADVERTISED_Asym_Pause);
+ }
+
if (netif_carrier_ok(netdev)) {
switch (adapter->link_speed) {
case IXGBE_LINK_SPEED_10GB_FULL:
@@ -2928,9 +2947,13 @@
static void ixgbe_get_reta(struct ixgbe_adapter *adapter, u32 *indir)
{
int i, reta_size = ixgbe_rss_indir_tbl_entries(adapter);
+ u16 rss_m = adapter->ring_feature[RING_F_RSS].mask;
+
+ if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
+ rss_m = adapter->ring_feature[RING_F_RSS].indices - 1;
for (i = 0; i < reta_size; i++)
- indir[i] = adapter->rss_indir_tbl[i];
+ indir[i] = adapter->rss_indir_tbl[i] & rss_m;
}
static int ixgbe_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
@@ -3041,8 +3064,8 @@
/* We only support one q_vector without MSI-X */
max_combined = 1;
} else if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
- /* SR-IOV currently only allows one queue on the PF */
- max_combined = 1;
+ /* Limit value based on the queue mask */
+ max_combined = adapter->ring_feature[RING_F_RSS].mask + 1;
} else if (tcs > 1) {
/* For DCB report channels per traffic class */
if (adapter->hw.mac.type == ixgbe_mac_82598EB) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index bcdc884..15ab337 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -515,15 +515,16 @@
vmdq_i = min_t(u16, IXGBE_MAX_VMDQ_INDICES, vmdq_i);
/* 64 pool mode with 2 queues per pool */
- if ((vmdq_i > 32) || (rss_i < 4) || (vmdq_i > 16 && pools)) {
+ if ((vmdq_i > 32) || (vmdq_i > 16 && pools)) {
vmdq_m = IXGBE_82599_VMDQ_2Q_MASK;
rss_m = IXGBE_RSS_2Q_MASK;
rss_i = min_t(u16, rss_i, 2);
- /* 32 pool mode with 4 queues per pool */
+ /* 32 pool mode with up to 4 queues per pool */
} else {
vmdq_m = IXGBE_82599_VMDQ_4Q_MASK;
rss_m = IXGBE_RSS_4Q_MASK;
- rss_i = 4;
+ /* We can support 4, 2, or 1 queues */
+ rss_i = (rss_i > 3) ? 4 : (rss_i > 1) ? 2 : 1;
}
#ifdef IXGBE_FCOE
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d76bc1a..a244d9a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3248,7 +3248,8 @@
mtqc |= IXGBE_MTQC_RT_ENA | IXGBE_MTQC_8TC_8TQ;
else if (tcs > 1)
mtqc |= IXGBE_MTQC_RT_ENA | IXGBE_MTQC_4TC_4TQ;
- else if (adapter->ring_feature[RING_F_RSS].indices == 4)
+ else if (adapter->ring_feature[RING_F_VMDQ].mask ==
+ IXGBE_82599_VMDQ_4Q_MASK)
mtqc |= IXGBE_MTQC_32VF;
else
mtqc |= IXGBE_MTQC_64VF;
@@ -3475,12 +3476,12 @@
u32 reta_entries = ixgbe_rss_indir_tbl_entries(adapter);
u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
- /* Program table for at least 2 queues w/ SR-IOV so that VFs can
+ /* Program table for at least 4 queues w/ SR-IOV so that VFs can
* make full use of any rings they may have. We will use the
* PSRTYPE register to control how many rings we use within the PF.
*/
- if ((adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) && (rss_i < 2))
- rss_i = 2;
+ if ((adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) && (rss_i < 4))
+ rss_i = 4;
/* Fill out hash function seeds */
for (i = 0; i < 10; i++)
@@ -3544,7 +3545,8 @@
mrqc = IXGBE_MRQC_VMDQRT8TCEN; /* 8 TCs */
else if (tcs > 1)
mrqc = IXGBE_MRQC_VMDQRT4TCEN; /* 4 TCs */
- else if (adapter->ring_feature[RING_F_RSS].indices == 4)
+ else if (adapter->ring_feature[RING_F_VMDQ].mask ==
+ IXGBE_82599_VMDQ_4Q_MASK)
mrqc = IXGBE_MRQC_VMDQRSS32EN;
else
mrqc = IXGBE_MRQC_VMDQRSS64EN;
@@ -4105,23 +4107,20 @@
vlnctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
- switch (hw->mac.type) {
- case ixgbe_mac_82599EB:
- case ixgbe_mac_X540:
- case ixgbe_mac_X550:
- case ixgbe_mac_X550EM_x:
- case ixgbe_mac_x550em_a:
- default:
- if (adapter->flags & IXGBE_FLAG_VMDQ_ENABLED)
- break;
- /* fall through */
- case ixgbe_mac_82598EB:
- /* legacy case, we can just disable VLAN filtering */
+ if (adapter->flags & IXGBE_FLAG_VMDQ_ENABLED) {
+ /* For VMDq and SR-IOV we must leave VLAN filtering enabled */
+ vlnctrl |= IXGBE_VLNCTRL_VFE;
+ IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
+ } else {
vlnctrl &= ~IXGBE_VLNCTRL_VFE;
IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
return;
}
+ /* Nothing to do for 82598 */
+ if (hw->mac.type == ixgbe_mac_82598EB)
+ return;
+
/* We are already in VLAN promisc, nothing to do */
if (adapter->flags2 & IXGBE_FLAG2_VLAN_PROMISC)
return;
@@ -4129,10 +4128,6 @@
/* Set flag so we don't redo unnecessary work */
adapter->flags2 |= IXGBE_FLAG2_VLAN_PROMISC;
- /* For VMDq and SR-IOV we must leave VLAN filtering enabled */
- vlnctrl |= IXGBE_VLNCTRL_VFE;
- IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
-
/* Add PF to all active pools */
for (i = IXGBE_VLVF_ENTRIES; --i;) {
u32 reg_offset = IXGBE_VLVFB(i * 2 + VMDQ_P(0) / 32);
@@ -4204,19 +4199,9 @@
vlnctrl |= IXGBE_VLNCTRL_VFE;
IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, vlnctrl);
- switch (hw->mac.type) {
- case ixgbe_mac_82599EB:
- case ixgbe_mac_X540:
- case ixgbe_mac_X550:
- case ixgbe_mac_X550EM_x:
- case ixgbe_mac_x550em_a:
- default:
- if (adapter->flags & IXGBE_FLAG_VMDQ_ENABLED)
- break;
- /* fall through */
- case ixgbe_mac_82598EB:
+ if (!(adapter->flags & IXGBE_FLAG_VMDQ_ENABLED) ||
+ hw->mac.type == ixgbe_mac_82598EB)
return;
- }
/* We are not in VLAN promisc, nothing to do */
if (!(adapter->flags2 & IXGBE_FLAG2_VLAN_PROMISC))
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
index db0731e..021ab9b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
@@ -346,8 +346,8 @@
return 0;
}
}
- /* clear value if nothing found */
- hw->phy.mdio.prtad = 0;
+ /* indicate no PHY found */
+ hw->phy.mdio.prtad = MDIO_PRTAD_NONE;
return IXGBE_ERR_PHY_ADDR_INVALID;
}
return 0;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index e5431bf..a922776 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -1254,7 +1254,7 @@
adapter->ptp_clock = NULL;
e_dev_err("ptp_clock_register failed\n");
return err;
- } else
+ } else if (adapter->ptp_clock)
e_dev_info("registered PHC device on %s\n", netdev->name);
/* set default timestamp mode to disabled here. We do this in
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 8618599..7e5d985 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -329,13 +329,15 @@
for (i = 0; i < adapter->num_vfs; i++)
ixgbe_vf_configuration(dev, (i | 0x10000000));
+ /* reset before enabling SRIOV to avoid mailbox issues */
+ ixgbe_sriov_reinit(adapter);
+
err = pci_enable_sriov(dev, num_vfs);
if (err) {
e_dev_warn("Failed to enable PCI sriov: %d\n", err);
return err;
}
ixgbe_get_vfs(adapter);
- ixgbe_sriov_reinit(adapter);
return num_vfs;
#else
@@ -1354,13 +1356,16 @@
return err;
}
-int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos)
+int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan,
+ u8 qos, __be16 vlan_proto)
{
int err = 0;
struct ixgbe_adapter *adapter = netdev_priv(netdev);
if ((vf >= adapter->num_vfs) || (vlan > 4095) || (qos > 7))
return -EINVAL;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
if (vlan || qos) {
/* Check if there is already a port VLAN set, if so
* we have to delete the old one first before we
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
index 47e65e2..0c7977d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
@@ -43,7 +43,7 @@
void ixgbe_ping_all_vfs(struct ixgbe_adapter *adapter);
int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int queue, u8 *mac);
int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int queue, u16 vlan,
- u8 qos);
+ u8 qos, __be16 vlan_proto);
int ixgbe_link_mbps(struct ixgbe_adapter *adapter);
int ixgbe_ndo_set_vf_bw(struct net_device *netdev, int vf, int min_tx_rate,
int max_tx_rate);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index e092a89..7e6b926 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -1459,7 +1459,7 @@
/* Configure internal PHY for KR/KX. */
ixgbe_setup_kr_speed_x550em(hw, speed);
- if (!hw->phy.mdio.prtad || hw->phy.mdio.prtad == 0xFFFF)
+ if (hw->phy.mdio.prtad == MDIO_PRTAD_NONE)
return IXGBE_ERR_PHY_ADDR_INVALID;
/* Get external PHY device id */
@@ -2125,7 +2125,7 @@
* @hw: pointer to hardware structure
* @led_idx: led number to turn on
**/
-s32 ixgbe_led_on_t_x550em(struct ixgbe_hw *hw, u32 led_idx)
+static s32 ixgbe_led_on_t_x550em(struct ixgbe_hw *hw, u32 led_idx)
{
u16 phy_data;
@@ -2147,7 +2147,7 @@
* @hw: pointer to hardware structure
* @led_idx: led number to turn off
**/
-s32 ixgbe_led_off_t_x550em(struct ixgbe_hw *hw, u32 led_idx)
+static s32 ixgbe_led_off_t_x550em(struct ixgbe_hw *hw, u32 led_idx)
{
u16 phy_data;
@@ -3049,6 +3049,8 @@
.identify = &ixgbe_identify_phy_x550em,
.read_reg = &ixgbe_read_phy_reg_x550a,
.write_reg = &ixgbe_write_phy_reg_x550a,
+ .read_reg_mdi = &ixgbe_read_phy_reg_mdi,
+ .write_reg_mdi = &ixgbe_write_phy_reg_mdi,
};
static const u32 ixgbe_mvals_X550[IXGBE_MVALS_IDX_LIMIT] = {
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 4044608..7eaac32 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1810,8 +1810,10 @@
if (hw->mac.type >= ixgbe_mac_X550_vf)
ixgbevf_setup_vfmrqc(adapter);
+ spin_lock_bh(&adapter->mbx_lock);
/* notify the PF of our intent to use this size of frame */
ret = hw->mac.ops.set_rlpml(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN);
+ spin_unlock_bh(&adapter->mbx_lock);
if (ret)
dev_err(&adapter->pdev->dev,
"Failed to set MTU at %d\n", netdev->mtu);
@@ -3758,8 +3760,10 @@
if ((new_mtu < 68) || (max_frame > max_possible_frame))
return -EINVAL;
+ spin_lock_bh(&adapter->mbx_lock);
/* notify the PF of our intent to use this size of frame */
ret = hw->mac.ops.set_rlpml(hw, max_frame);
+ spin_unlock_bh(&adapter->mbx_lock);
if (ret)
return -EINVAL;
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 8e4252d..bfada88 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -634,8 +634,9 @@
}
/* Get System Network Statistics */
-struct rtnl_link_stats64 *mvneta_get_stats64(struct net_device *dev,
- struct rtnl_link_stats64 *stats)
+static struct rtnl_link_stats64 *
+mvneta_get_stats64(struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
{
struct mvneta_port *pp = netdev_priv(dev);
unsigned int start;
@@ -3506,8 +3507,9 @@
/* Ethtool methods */
/* Set link ksettings (phy address, speed) for ethtools */
-int mvneta_ethtool_set_link_ksettings(struct net_device *ndev,
- const struct ethtool_link_ksettings *cmd)
+static int
+mvneta_ethtool_set_link_ksettings(struct net_device *ndev,
+ const struct ethtool_link_ksettings *cmd)
{
struct mvneta_port *pp = netdev_priv(ndev);
struct phy_device *phydev = ndev->phydev;
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 522fe8d..ddf20a0 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -52,7 +52,7 @@
};
static const char * const mtk_clks_source_name[] = {
- "ethif", "esw", "gp1", "gp2"
+ "ethif", "esw", "gp1", "gp2", "trgpll"
};
void mtk_w32(struct mtk_eth *eth, u32 val, unsigned reg)
@@ -135,6 +135,33 @@
return _mtk_mdio_read(eth, phy_addr, phy_reg);
}
+static void mtk_gmac0_rgmii_adjust(struct mtk_eth *eth, int speed)
+{
+ u32 val;
+ int ret;
+
+ val = (speed == SPEED_1000) ?
+ INTF_MODE_RGMII_1000 : INTF_MODE_RGMII_10_100;
+ mtk_w32(eth, val, INTF_MODE);
+
+ regmap_update_bits(eth->ethsys, ETHSYS_CLKCFG0,
+ ETHSYS_TRGMII_CLK_SEL362_5,
+ ETHSYS_TRGMII_CLK_SEL362_5);
+
+ val = (speed == SPEED_1000) ? 250000000 : 500000000;
+ ret = clk_set_rate(eth->clks[MTK_CLK_TRGPLL], val);
+ if (ret)
+ dev_err(eth->dev, "Failed to set trgmii pll: %d\n", ret);
+
+ val = (speed == SPEED_1000) ?
+ RCK_CTRL_RGMII_1000 : RCK_CTRL_RGMII_10_100;
+ mtk_w32(eth, val, TRGMII_RCK_CTRL);
+
+ val = (speed == SPEED_1000) ?
+ TCK_CTRL_RGMII_1000 : TCK_CTRL_RGMII_10_100;
+ mtk_w32(eth, val, TRGMII_TCK_CTRL);
+}
+
static void mtk_phy_link_adjust(struct net_device *dev)
{
struct mtk_mac *mac = netdev_priv(dev);
@@ -148,7 +175,7 @@
if (unlikely(test_bit(MTK_RESETTING, &mac->hw->state)))
return;
- switch (mac->phy_dev->speed) {
+ switch (dev->phydev->speed) {
case SPEED_1000:
mcr |= MAC_MCR_SPEED_1000;
break;
@@ -157,20 +184,23 @@
break;
};
- if (mac->phy_dev->link)
+ if (mac->id == 0 && !mac->trgmii)
+ mtk_gmac0_rgmii_adjust(mac->hw, dev->phydev->speed);
+
+ if (dev->phydev->link)
mcr |= MAC_MCR_FORCE_LINK;
- if (mac->phy_dev->duplex) {
+ if (dev->phydev->duplex) {
mcr |= MAC_MCR_FORCE_DPX;
- if (mac->phy_dev->pause)
+ if (dev->phydev->pause)
rmt_adv = LPA_PAUSE_CAP;
- if (mac->phy_dev->asym_pause)
+ if (dev->phydev->asym_pause)
rmt_adv |= LPA_PAUSE_ASYM;
- if (mac->phy_dev->advertising & ADVERTISED_Pause)
+ if (dev->phydev->advertising & ADVERTISED_Pause)
lcl_adv |= ADVERTISE_PAUSE_CAP;
- if (mac->phy_dev->advertising & ADVERTISED_Asym_Pause)
+ if (dev->phydev->advertising & ADVERTISED_Asym_Pause)
lcl_adv |= ADVERTISE_PAUSE_ASYM;
flowctrl = mii_resolve_flowctrl_fdx(lcl_adv, rmt_adv);
@@ -187,7 +217,7 @@
mtk_w32(mac->hw, mcr, MTK_MAC_MCR(mac->id));
- if (mac->phy_dev->link)
+ if (dev->phydev->link)
netif_carrier_on(dev);
else
netif_carrier_off(dev);
@@ -196,17 +226,9 @@
static int mtk_phy_connect_node(struct mtk_eth *eth, struct mtk_mac *mac,
struct device_node *phy_node)
{
- const __be32 *_addr = NULL;
struct phy_device *phydev;
- int phy_mode, addr;
+ int phy_mode;
- _addr = of_get_property(phy_node, "reg", NULL);
-
- if (!_addr || (be32_to_cpu(*_addr) >= 0x20)) {
- pr_err("%s: invalid phy address\n", phy_node->name);
- return -EINVAL;
- }
- addr = be32_to_cpu(*_addr);
phy_mode = of_get_phy_mode(phy_node);
if (phy_mode < 0) {
dev_err(eth->dev, "incorrect phy-mode %d\n", phy_mode);
@@ -225,17 +247,17 @@
mac->id, phydev_name(phydev), phydev->phy_id,
phydev->drv->name);
- mac->phy_dev = phydev;
-
return 0;
}
-static int mtk_phy_connect(struct mtk_mac *mac)
+static int mtk_phy_connect(struct net_device *dev)
{
- struct mtk_eth *eth = mac->hw;
+ struct mtk_mac *mac = netdev_priv(dev);
+ struct mtk_eth *eth;
struct device_node *np;
u32 val;
+ eth = mac->hw;
np = of_parse_phandle(mac->of_node, "phy-handle", 0);
if (!np && of_phy_is_fixed_link(mac->of_node))
if (!of_phy_register_fixed_link(mac->of_node))
@@ -244,6 +266,8 @@
return -ENODEV;
switch (of_get_phy_mode(np)) {
+ case PHY_INTERFACE_MODE_TRGMII:
+ mac->trgmii = true;
case PHY_INTERFACE_MODE_RGMII_TXID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_ID:
@@ -271,20 +295,23 @@
val |= SYSCFG0_GE_MODE(mac->ge_mode, mac->id);
regmap_write(eth->ethsys, ETHSYS_SYSCFG0, val);
- mtk_phy_connect_node(eth, mac, np);
- mac->phy_dev->autoneg = AUTONEG_ENABLE;
- mac->phy_dev->speed = 0;
- mac->phy_dev->duplex = 0;
+ /* couple phydev to net_device */
+ if (mtk_phy_connect_node(eth, mac, np))
+ goto err_phy;
+
+ dev->phydev->autoneg = AUTONEG_ENABLE;
+ dev->phydev->speed = 0;
+ dev->phydev->duplex = 0;
if (of_phy_is_fixed_link(mac->of_node))
- mac->phy_dev->supported |=
+ dev->phydev->supported |=
SUPPORTED_Pause | SUPPORTED_Asym_Pause;
- mac->phy_dev->supported &= PHY_GBIT_FEATURES | SUPPORTED_Pause |
+ dev->phydev->supported &= PHY_GBIT_FEATURES | SUPPORTED_Pause |
SUPPORTED_Asym_Pause;
- mac->phy_dev->advertising = mac->phy_dev->supported |
+ dev->phydev->advertising = dev->phydev->supported |
ADVERTISED_Autoneg;
- phy_start_aneg(mac->phy_dev);
+ phy_start_aneg(dev->phydev);
of_node_put(np);
@@ -292,7 +319,7 @@
err_phy:
of_node_put(np);
- dev_err(eth->dev, "invalid phy_mode\n");
+ dev_err(eth->dev, "%s: invalid phy\n", __func__);
return -EINVAL;
}
@@ -820,11 +847,51 @@
return NETDEV_TX_OK;
}
+static struct mtk_rx_ring *mtk_get_rx_ring(struct mtk_eth *eth)
+{
+ int i;
+ struct mtk_rx_ring *ring;
+ int idx;
+
+ if (!eth->hwlro)
+ return ð->rx_ring[0];
+
+ for (i = 0; i < MTK_MAX_RX_RING_NUM; i++) {
+ ring = ð->rx_ring[i];
+ idx = NEXT_RX_DESP_IDX(ring->calc_idx, ring->dma_size);
+ if (ring->dma[idx].rxd2 & RX_DMA_DONE) {
+ ring->calc_idx_update = true;
+ return ring;
+ }
+ }
+
+ return NULL;
+}
+
+static void mtk_update_rx_cpu_idx(struct mtk_eth *eth)
+{
+ struct mtk_rx_ring *ring;
+ int i;
+
+ if (!eth->hwlro) {
+ ring = ð->rx_ring[0];
+ mtk_w32(eth, ring->calc_idx, ring->crx_idx_reg);
+ } else {
+ for (i = 0; i < MTK_MAX_RX_RING_NUM; i++) {
+ ring = ð->rx_ring[i];
+ if (ring->calc_idx_update) {
+ ring->calc_idx_update = false;
+ mtk_w32(eth, ring->calc_idx, ring->crx_idx_reg);
+ }
+ }
+ }
+}
+
static int mtk_poll_rx(struct napi_struct *napi, int budget,
struct mtk_eth *eth)
{
- struct mtk_rx_ring *ring = ð->rx_ring;
- int idx = ring->calc_idx;
+ struct mtk_rx_ring *ring;
+ int idx;
struct sk_buff *skb;
u8 *data, *new_data;
struct mtk_rx_dma *rxd, trxd;
@@ -836,7 +903,11 @@
dma_addr_t dma_addr;
int mac = 0;
- idx = NEXT_RX_DESP_IDX(idx);
+ ring = mtk_get_rx_ring(eth);
+ if (unlikely(!ring))
+ goto rx_done;
+
+ idx = NEXT_RX_DESP_IDX(ring->calc_idx, ring->dma_size);
rxd = &ring->dma[idx];
data = ring->data[idx];
@@ -907,12 +978,13 @@
done++;
}
+rx_done:
if (done) {
/* make sure that all changes to the dma ring are flushed before
* we continue
*/
wmb();
- mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0);
+ mtk_update_rx_cpu_idx(eth);
}
return done;
@@ -1135,32 +1207,41 @@
}
}
-static int mtk_rx_alloc(struct mtk_eth *eth)
+static int mtk_rx_alloc(struct mtk_eth *eth, int ring_no, int rx_flag)
{
- struct mtk_rx_ring *ring = ð->rx_ring;
+ struct mtk_rx_ring *ring = ð->rx_ring[ring_no];
+ int rx_data_len, rx_dma_size;
int i;
- ring->frag_size = mtk_max_frag_size(ETH_DATA_LEN);
+ if (rx_flag == MTK_RX_FLAGS_HWLRO) {
+ rx_data_len = MTK_MAX_LRO_RX_LENGTH;
+ rx_dma_size = MTK_HW_LRO_DMA_SIZE;
+ } else {
+ rx_data_len = ETH_DATA_LEN;
+ rx_dma_size = MTK_DMA_SIZE;
+ }
+
+ ring->frag_size = mtk_max_frag_size(rx_data_len);
ring->buf_size = mtk_max_buf_size(ring->frag_size);
- ring->data = kcalloc(MTK_DMA_SIZE, sizeof(*ring->data),
+ ring->data = kcalloc(rx_dma_size, sizeof(*ring->data),
GFP_KERNEL);
if (!ring->data)
return -ENOMEM;
- for (i = 0; i < MTK_DMA_SIZE; i++) {
+ for (i = 0; i < rx_dma_size; i++) {
ring->data[i] = netdev_alloc_frag(ring->frag_size);
if (!ring->data[i])
return -ENOMEM;
}
ring->dma = dma_alloc_coherent(eth->dev,
- MTK_DMA_SIZE * sizeof(*ring->dma),
+ rx_dma_size * sizeof(*ring->dma),
&ring->phys,
GFP_ATOMIC | __GFP_ZERO);
if (!ring->dma)
return -ENOMEM;
- for (i = 0; i < MTK_DMA_SIZE; i++) {
+ for (i = 0; i < rx_dma_size; i++) {
dma_addr_t dma_addr = dma_map_single(eth->dev,
ring->data[i] + NET_SKB_PAD,
ring->buf_size,
@@ -1171,27 +1252,30 @@
ring->dma[i].rxd2 = RX_DMA_PLEN0(ring->buf_size);
}
- ring->calc_idx = MTK_DMA_SIZE - 1;
+ ring->dma_size = rx_dma_size;
+ ring->calc_idx_update = false;
+ ring->calc_idx = rx_dma_size - 1;
+ ring->crx_idx_reg = MTK_PRX_CRX_IDX_CFG(ring_no);
/* make sure that all changes to the dma ring are flushed before we
* continue
*/
wmb();
- mtk_w32(eth, eth->rx_ring.phys, MTK_PRX_BASE_PTR0);
- mtk_w32(eth, MTK_DMA_SIZE, MTK_PRX_MAX_CNT0);
- mtk_w32(eth, eth->rx_ring.calc_idx, MTK_PRX_CRX_IDX0);
- mtk_w32(eth, MTK_PST_DRX_IDX0, MTK_PDMA_RST_IDX);
+ mtk_w32(eth, ring->phys, MTK_PRX_BASE_PTR_CFG(ring_no));
+ mtk_w32(eth, rx_dma_size, MTK_PRX_MAX_CNT_CFG(ring_no));
+ mtk_w32(eth, ring->calc_idx, ring->crx_idx_reg);
+ mtk_w32(eth, MTK_PST_DRX_IDX_CFG(ring_no), MTK_PDMA_RST_IDX);
return 0;
}
-static void mtk_rx_clean(struct mtk_eth *eth)
+static void mtk_rx_clean(struct mtk_eth *eth, int ring_no)
{
- struct mtk_rx_ring *ring = ð->rx_ring;
+ struct mtk_rx_ring *ring = ð->rx_ring[ring_no];
int i;
if (ring->data && ring->dma) {
- for (i = 0; i < MTK_DMA_SIZE; i++) {
+ for (i = 0; i < ring->dma_size; i++) {
if (!ring->data[i])
continue;
if (!ring->dma[i].rxd1)
@@ -1208,13 +1292,275 @@
if (ring->dma) {
dma_free_coherent(eth->dev,
- MTK_DMA_SIZE * sizeof(*ring->dma),
+ ring->dma_size * sizeof(*ring->dma),
ring->dma,
ring->phys);
ring->dma = NULL;
}
}
+static int mtk_hwlro_rx_init(struct mtk_eth *eth)
+{
+ int i;
+ u32 ring_ctrl_dw1 = 0, ring_ctrl_dw2 = 0, ring_ctrl_dw3 = 0;
+ u32 lro_ctrl_dw0 = 0, lro_ctrl_dw3 = 0;
+
+ /* set LRO rings to auto-learn modes */
+ ring_ctrl_dw2 |= MTK_RING_AUTO_LERAN_MODE;
+
+ /* validate LRO ring */
+ ring_ctrl_dw2 |= MTK_RING_VLD;
+
+ /* set AGE timer (unit: 20us) */
+ ring_ctrl_dw2 |= MTK_RING_AGE_TIME_H;
+ ring_ctrl_dw1 |= MTK_RING_AGE_TIME_L;
+
+ /* set max AGG timer (unit: 20us) */
+ ring_ctrl_dw2 |= MTK_RING_MAX_AGG_TIME;
+
+ /* set max LRO AGG count */
+ ring_ctrl_dw2 |= MTK_RING_MAX_AGG_CNT_L;
+ ring_ctrl_dw3 |= MTK_RING_MAX_AGG_CNT_H;
+
+ for (i = 1; i < MTK_MAX_RX_RING_NUM; i++) {
+ mtk_w32(eth, ring_ctrl_dw1, MTK_LRO_CTRL_DW1_CFG(i));
+ mtk_w32(eth, ring_ctrl_dw2, MTK_LRO_CTRL_DW2_CFG(i));
+ mtk_w32(eth, ring_ctrl_dw3, MTK_LRO_CTRL_DW3_CFG(i));
+ }
+
+ /* IPv4 checksum update enable */
+ lro_ctrl_dw0 |= MTK_L3_CKS_UPD_EN;
+
+ /* switch priority comparison to packet count mode */
+ lro_ctrl_dw0 |= MTK_LRO_ALT_PKT_CNT_MODE;
+
+ /* bandwidth threshold setting */
+ mtk_w32(eth, MTK_HW_LRO_BW_THRE, MTK_PDMA_LRO_CTRL_DW2);
+
+ /* auto-learn score delta setting */
+ mtk_w32(eth, MTK_HW_LRO_REPLACE_DELTA, MTK_PDMA_LRO_ALT_SCORE_DELTA);
+
+ /* set refresh timer for altering flows to 1 sec. (unit: 20us) */
+ mtk_w32(eth, (MTK_HW_LRO_TIMER_UNIT << 16) | MTK_HW_LRO_REFRESH_TIME,
+ MTK_PDMA_LRO_ALT_REFRESH_TIMER);
+
+ /* set HW LRO mode & the max aggregation count for rx packets */
+ lro_ctrl_dw3 |= MTK_ADMA_MODE | (MTK_HW_LRO_MAX_AGG_CNT & 0xff);
+
+ /* the minimal remaining room of SDL0 in RXD for lro aggregation */
+ lro_ctrl_dw3 |= MTK_LRO_MIN_RXD_SDL;
+
+ /* enable HW LRO */
+ lro_ctrl_dw0 |= MTK_LRO_EN;
+
+ mtk_w32(eth, lro_ctrl_dw3, MTK_PDMA_LRO_CTRL_DW3);
+ mtk_w32(eth, lro_ctrl_dw0, MTK_PDMA_LRO_CTRL_DW0);
+
+ return 0;
+}
+
+static void mtk_hwlro_rx_uninit(struct mtk_eth *eth)
+{
+ int i;
+ u32 val;
+
+ /* relinquish lro rings, flush aggregated packets */
+ mtk_w32(eth, MTK_LRO_RING_RELINQUISH_REQ, MTK_PDMA_LRO_CTRL_DW0);
+
+ /* wait for relinquishments done */
+ for (i = 0; i < 10; i++) {
+ val = mtk_r32(eth, MTK_PDMA_LRO_CTRL_DW0);
+ if (val & MTK_LRO_RING_RELINQUISH_DONE) {
+ msleep(20);
+ continue;
+ }
+ break;
+ }
+
+ /* invalidate lro rings */
+ for (i = 1; i < MTK_MAX_RX_RING_NUM; i++)
+ mtk_w32(eth, 0, MTK_LRO_CTRL_DW2_CFG(i));
+
+ /* disable HW LRO */
+ mtk_w32(eth, 0, MTK_PDMA_LRO_CTRL_DW0);
+}
+
+static void mtk_hwlro_val_ipaddr(struct mtk_eth *eth, int idx, __be32 ip)
+{
+ u32 reg_val;
+
+ reg_val = mtk_r32(eth, MTK_LRO_CTRL_DW2_CFG(idx));
+
+ /* invalidate the IP setting */
+ mtk_w32(eth, (reg_val & ~MTK_RING_MYIP_VLD), MTK_LRO_CTRL_DW2_CFG(idx));
+
+ mtk_w32(eth, ip, MTK_LRO_DIP_DW0_CFG(idx));
+
+ /* validate the IP setting */
+ mtk_w32(eth, (reg_val | MTK_RING_MYIP_VLD), MTK_LRO_CTRL_DW2_CFG(idx));
+}
+
+static void mtk_hwlro_inval_ipaddr(struct mtk_eth *eth, int idx)
+{
+ u32 reg_val;
+
+ reg_val = mtk_r32(eth, MTK_LRO_CTRL_DW2_CFG(idx));
+
+ /* invalidate the IP setting */
+ mtk_w32(eth, (reg_val & ~MTK_RING_MYIP_VLD), MTK_LRO_CTRL_DW2_CFG(idx));
+
+ mtk_w32(eth, 0, MTK_LRO_DIP_DW0_CFG(idx));
+}
+
+static int mtk_hwlro_get_ip_cnt(struct mtk_mac *mac)
+{
+ int cnt = 0;
+ int i;
+
+ for (i = 0; i < MTK_MAX_LRO_IP_CNT; i++) {
+ if (mac->hwlro_ip[i])
+ cnt++;
+ }
+
+ return cnt;
+}
+
+static int mtk_hwlro_add_ipaddr(struct net_device *dev,
+ struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp =
+ (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct mtk_mac *mac = netdev_priv(dev);
+ struct mtk_eth *eth = mac->hw;
+ int hwlro_idx;
+
+ if ((fsp->flow_type != TCP_V4_FLOW) ||
+ (!fsp->h_u.tcp_ip4_spec.ip4dst) ||
+ (fsp->location > 1))
+ return -EINVAL;
+
+ mac->hwlro_ip[fsp->location] = htonl(fsp->h_u.tcp_ip4_spec.ip4dst);
+ hwlro_idx = (mac->id * MTK_MAX_LRO_IP_CNT) + fsp->location;
+
+ mac->hwlro_ip_cnt = mtk_hwlro_get_ip_cnt(mac);
+
+ mtk_hwlro_val_ipaddr(eth, hwlro_idx, mac->hwlro_ip[fsp->location]);
+
+ return 0;
+}
+
+static int mtk_hwlro_del_ipaddr(struct net_device *dev,
+ struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp =
+ (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct mtk_mac *mac = netdev_priv(dev);
+ struct mtk_eth *eth = mac->hw;
+ int hwlro_idx;
+
+ if (fsp->location > 1)
+ return -EINVAL;
+
+ mac->hwlro_ip[fsp->location] = 0;
+ hwlro_idx = (mac->id * MTK_MAX_LRO_IP_CNT) + fsp->location;
+
+ mac->hwlro_ip_cnt = mtk_hwlro_get_ip_cnt(mac);
+
+ mtk_hwlro_inval_ipaddr(eth, hwlro_idx);
+
+ return 0;
+}
+
+static void mtk_hwlro_netdev_disable(struct net_device *dev)
+{
+ struct mtk_mac *mac = netdev_priv(dev);
+ struct mtk_eth *eth = mac->hw;
+ int i, hwlro_idx;
+
+ for (i = 0; i < MTK_MAX_LRO_IP_CNT; i++) {
+ mac->hwlro_ip[i] = 0;
+ hwlro_idx = (mac->id * MTK_MAX_LRO_IP_CNT) + i;
+
+ mtk_hwlro_inval_ipaddr(eth, hwlro_idx);
+ }
+
+ mac->hwlro_ip_cnt = 0;
+}
+
+static int mtk_hwlro_get_fdir_entry(struct net_device *dev,
+ struct ethtool_rxnfc *cmd)
+{
+ struct mtk_mac *mac = netdev_priv(dev);
+ struct ethtool_rx_flow_spec *fsp =
+ (struct ethtool_rx_flow_spec *)&cmd->fs;
+
+ /* only tcp dst ipv4 is meaningful, others are meaningless */
+ fsp->flow_type = TCP_V4_FLOW;
+ fsp->h_u.tcp_ip4_spec.ip4dst = ntohl(mac->hwlro_ip[fsp->location]);
+ fsp->m_u.tcp_ip4_spec.ip4dst = 0;
+
+ fsp->h_u.tcp_ip4_spec.ip4src = 0;
+ fsp->m_u.tcp_ip4_spec.ip4src = 0xffffffff;
+ fsp->h_u.tcp_ip4_spec.psrc = 0;
+ fsp->m_u.tcp_ip4_spec.psrc = 0xffff;
+ fsp->h_u.tcp_ip4_spec.pdst = 0;
+ fsp->m_u.tcp_ip4_spec.pdst = 0xffff;
+ fsp->h_u.tcp_ip4_spec.tos = 0;
+ fsp->m_u.tcp_ip4_spec.tos = 0xff;
+
+ return 0;
+}
+
+static int mtk_hwlro_get_fdir_all(struct net_device *dev,
+ struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct mtk_mac *mac = netdev_priv(dev);
+ int cnt = 0;
+ int i;
+
+ for (i = 0; i < MTK_MAX_LRO_IP_CNT; i++) {
+ if (mac->hwlro_ip[i]) {
+ rule_locs[cnt] = i;
+ cnt++;
+ }
+ }
+
+ cmd->rule_cnt = cnt;
+
+ return 0;
+}
+
+static netdev_features_t mtk_fix_features(struct net_device *dev,
+ netdev_features_t features)
+{
+ if (!(features & NETIF_F_LRO)) {
+ struct mtk_mac *mac = netdev_priv(dev);
+ int ip_cnt = mtk_hwlro_get_ip_cnt(mac);
+
+ if (ip_cnt) {
+ netdev_info(dev, "RX flow is programmed, LRO should keep on\n");
+
+ features |= NETIF_F_LRO;
+ }
+ }
+
+ return features;
+}
+
+static int mtk_set_features(struct net_device *dev, netdev_features_t features)
+{
+ int err = 0;
+
+ if (!((dev->features ^ features) & NETIF_F_LRO))
+ return 0;
+
+ if (!(features & NETIF_F_LRO))
+ mtk_hwlro_netdev_disable(dev);
+
+ return err;
+}
+
/* wait for DMA to finish whatever it is doing before we start using it again */
static int mtk_dma_busy_wait(struct mtk_eth *eth)
{
@@ -1235,6 +1581,7 @@
static int mtk_dma_init(struct mtk_eth *eth)
{
int err;
+ u32 i;
if (mtk_dma_busy_wait(eth))
return -EBUSY;
@@ -1250,10 +1597,21 @@
if (err)
return err;
- err = mtk_rx_alloc(eth);
+ err = mtk_rx_alloc(eth, 0, MTK_RX_FLAGS_NORMAL);
if (err)
return err;
+ if (eth->hwlro) {
+ for (i = 1; i < MTK_MAX_RX_RING_NUM; i++) {
+ err = mtk_rx_alloc(eth, i, MTK_RX_FLAGS_HWLRO);
+ if (err)
+ return err;
+ }
+ err = mtk_hwlro_rx_init(eth);
+ if (err)
+ return err;
+ }
+
/* Enable random early drop and set drop threshold automatically */
mtk_w32(eth, FC_THRES_DROP_MODE | FC_THRES_DROP_EN | FC_THRES_MIN,
MTK_QDMA_FC_THRES);
@@ -1278,7 +1636,14 @@
eth->phy_scratch_ring = 0;
}
mtk_tx_clean(eth);
- mtk_rx_clean(eth);
+ mtk_rx_clean(eth, 0);
+
+ if (eth->hwlro) {
+ mtk_hwlro_rx_uninit(eth);
+ for (i = 1; i < MTK_MAX_RX_RING_NUM; i++)
+ mtk_rx_clean(eth, i);
+ }
+
kfree(eth->scratch_head);
}
@@ -1373,7 +1738,7 @@
}
atomic_inc(ð->dma_refcnt);
- phy_start(mac->phy_dev);
+ phy_start(dev->phydev);
netif_start_queue(dev);
return 0;
@@ -1408,7 +1773,7 @@
struct mtk_eth *eth = mac->hw;
netif_tx_disable(dev);
- phy_stop(mac->phy_dev);
+ phy_stop(dev->phydev);
/* only shutdown DMA if this is the last user */
if (!atomic_dec_and_test(ð->dma_refcnt))
@@ -1420,6 +1785,7 @@
napi_disable(ð->rx_napi);
mtk_stop_dma(eth, MTK_QDMA_GLO_CFG);
+ mtk_stop_dma(eth, MTK_PDMA_GLO_CFG);
mtk_dma_free(eth);
@@ -1548,7 +1914,7 @@
dev->addr_assign_type = NET_ADDR_RANDOM;
}
- return mtk_phy_connect(mac);
+ return mtk_phy_connect(dev);
}
static void mtk_uninit(struct net_device *dev)
@@ -1556,23 +1922,18 @@
struct mtk_mac *mac = netdev_priv(dev);
struct mtk_eth *eth = mac->hw;
- phy_disconnect(mac->phy_dev);
- mtk_mdio_cleanup(eth);
+ phy_disconnect(dev->phydev);
mtk_irq_disable(eth, MTK_QDMA_INT_MASK, ~0);
mtk_irq_disable(eth, MTK_PDMA_INT_MASK, ~0);
- free_irq(eth->irq[1], dev);
- free_irq(eth->irq[2], dev);
}
static int mtk_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
{
- struct mtk_mac *mac = netdev_priv(dev);
-
switch (cmd) {
case SIOCGMIIPHY:
case SIOCGMIIREG:
case SIOCSMIIREG:
- return phy_mii_ioctl(mac->phy_dev, ifr, cmd);
+ return phy_mii_ioctl(dev->phydev, ifr, cmd);
default:
break;
}
@@ -1617,7 +1978,7 @@
if (!eth->mac[i] ||
of_phy_is_fixed_link(eth->mac[i]->of_node))
continue;
- err = phy_init_hw(eth->mac[i]->phy_dev);
+ err = phy_init_hw(eth->netdev[i]->phydev);
if (err)
dev_err(eth->dev, "%s: PHY init failed.\n",
eth->netdev[i]->name);
@@ -1677,35 +2038,26 @@
return 0;
}
-static int mtk_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+int mtk_get_link_ksettings(struct net_device *ndev,
+ struct ethtool_link_ksettings *cmd)
{
- struct mtk_mac *mac = netdev_priv(dev);
- int err;
+ struct mtk_mac *mac = netdev_priv(ndev);
if (unlikely(test_bit(MTK_RESETTING, &mac->hw->state)))
return -EBUSY;
- err = phy_read_status(mac->phy_dev);
- if (err)
- return -ENODEV;
-
- return phy_ethtool_gset(mac->phy_dev, cmd);
+ return phy_ethtool_ksettings_get(ndev->phydev, cmd);
}
-static int mtk_set_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+int mtk_set_link_ksettings(struct net_device *ndev,
+ const struct ethtool_link_ksettings *cmd)
{
- struct mtk_mac *mac = netdev_priv(dev);
+ struct mtk_mac *mac = netdev_priv(ndev);
- if (cmd->phy_address != mac->phy_dev->mdio.addr) {
- mac->phy_dev = mdiobus_get_phy(mac->hw->mii_bus,
- cmd->phy_address);
- if (!mac->phy_dev)
- return -ENODEV;
- }
+ if (unlikely(test_bit(MTK_RESETTING, &mac->hw->state)))
+ return -EBUSY;
- return phy_ethtool_sset(mac->phy_dev, cmd);
+ return phy_ethtool_ksettings_set(ndev->phydev, cmd);
}
static void mtk_get_drvinfo(struct net_device *dev,
@@ -1739,7 +2091,7 @@
if (unlikely(test_bit(MTK_RESETTING, &mac->hw->state)))
return -EBUSY;
- return genphy_restart_aneg(mac->phy_dev);
+ return genphy_restart_aneg(dev->phydev);
}
static u32 mtk_get_link(struct net_device *dev)
@@ -1750,11 +2102,11 @@
if (unlikely(test_bit(MTK_RESETTING, &mac->hw->state)))
return -EBUSY;
- err = genphy_update_link(mac->phy_dev);
+ err = genphy_update_link(dev->phydev);
if (err)
return ethtool_op_get_link(dev);
- return mac->phy_dev->link;
+ return dev->phydev->link;
}
static void mtk_get_strings(struct net_device *dev, u32 stringset, u8 *data)
@@ -1800,8 +2152,9 @@
}
}
+ data_src = (u64 *)hwstats;
+
do {
- data_src = (u64 *)hwstats;
data_dst = data;
start = u64_stats_fetch_begin_irq(&hwstats->syncp);
@@ -1810,9 +2163,65 @@
} while (u64_stats_fetch_retry_irq(&hwstats->syncp, start));
}
+static int mtk_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ int ret = -EOPNOTSUPP;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ if (dev->features & NETIF_F_LRO) {
+ cmd->data = MTK_MAX_RX_RING_NUM;
+ ret = 0;
+ }
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ if (dev->features & NETIF_F_LRO) {
+ struct mtk_mac *mac = netdev_priv(dev);
+
+ cmd->rule_cnt = mac->hwlro_ip_cnt;
+ ret = 0;
+ }
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ if (dev->features & NETIF_F_LRO)
+ ret = mtk_hwlro_get_fdir_entry(dev, cmd);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ if (dev->features & NETIF_F_LRO)
+ ret = mtk_hwlro_get_fdir_all(dev, cmd,
+ rule_locs);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int mtk_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+{
+ int ret = -EOPNOTSUPP;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ if (dev->features & NETIF_F_LRO)
+ ret = mtk_hwlro_add_ipaddr(dev, cmd);
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ if (dev->features & NETIF_F_LRO)
+ ret = mtk_hwlro_del_ipaddr(dev, cmd);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
static const struct ethtool_ops mtk_ethtool_ops = {
- .get_settings = mtk_get_settings,
- .set_settings = mtk_set_settings,
+ .get_link_ksettings = mtk_get_link_ksettings,
+ .set_link_ksettings = mtk_set_link_ksettings,
.get_drvinfo = mtk_get_drvinfo,
.get_msglevel = mtk_get_msglevel,
.set_msglevel = mtk_set_msglevel,
@@ -1821,6 +2230,8 @@
.get_strings = mtk_get_strings,
.get_sset_count = mtk_get_sset_count,
.get_ethtool_stats = mtk_get_ethtool_stats,
+ .get_rxnfc = mtk_get_rxnfc,
+ .set_rxnfc = mtk_set_rxnfc,
};
static const struct net_device_ops mtk_netdev_ops = {
@@ -1835,6 +2246,8 @@
.ndo_change_mtu = eth_change_mtu,
.ndo_tx_timeout = mtk_tx_timeout,
.ndo_get_stats64 = mtk_get_stats64,
+ .ndo_fix_features = mtk_fix_features,
+ .ndo_set_features = mtk_set_features,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = mtk_poll_controller,
#endif
@@ -1873,6 +2286,9 @@
mac->hw = eth;
mac->of_node = np;
+ memset(mac->hwlro_ip, 0, sizeof(mac->hwlro_ip));
+ mac->hwlro_ip_cnt = 0;
+
mac->hw_stats = devm_kzalloc(eth->dev,
sizeof(*mac->hw_stats),
GFP_KERNEL);
@@ -1889,6 +2305,11 @@
eth->netdev[id]->watchdog_timeo = 5 * HZ;
eth->netdev[id]->netdev_ops = &mtk_netdev_ops;
eth->netdev[id]->base_addr = (unsigned long)eth->base;
+
+ eth->netdev[id]->hw_features = MTK_HW_FEATURES;
+ if (eth->hwlro)
+ eth->netdev[id]->hw_features |= NETIF_F_LRO;
+
eth->netdev[id]->vlan_features = MTK_HW_FEATURES &
~(NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX);
eth->netdev[id]->features |= MTK_HW_FEATURES;
@@ -1941,6 +2362,8 @@
return PTR_ERR(eth->pctl);
}
+ eth->hwlro = of_property_read_bool(pdev->dev.of_node, "mediatek,hwlro");
+
for (i = 0; i < 3; i++) {
eth->irq[i] = platform_get_irq(pdev, i);
if (eth->irq[i] < 0) {
@@ -2046,6 +2469,7 @@
netif_napi_del(ð->tx_napi);
netif_napi_del(ð->rx_napi);
mtk_cleanup(eth);
+ mtk_mdio_cleanup(eth);
return 0;
}
@@ -2054,6 +2478,7 @@
{ .compatible = "mediatek,mt7623-eth" },
{},
};
+MODULE_DEVICE_TABLE(of, of_mtk_match);
static struct platform_driver mtk_driver = {
.probe = mtk_probe,
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 79954b4..3003195 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -39,7 +39,21 @@
NETIF_F_SG | NETIF_F_TSO | \
NETIF_F_TSO6 | \
NETIF_F_IPV6_CSUM)
-#define NEXT_RX_DESP_IDX(X) (((X) + 1) & (MTK_DMA_SIZE - 1))
+#define NEXT_RX_DESP_IDX(X, Y) (((X) + 1) & ((Y) - 1))
+
+#define MTK_MAX_RX_RING_NUM 4
+#define MTK_HW_LRO_DMA_SIZE 8
+
+#define MTK_MAX_LRO_RX_LENGTH (4096 * 3)
+#define MTK_MAX_LRO_IP_CNT 2
+#define MTK_HW_LRO_TIMER_UNIT 1 /* 20 us */
+#define MTK_HW_LRO_REFRESH_TIME 50000 /* 1 sec. */
+#define MTK_HW_LRO_AGG_TIME 10 /* 200us */
+#define MTK_HW_LRO_AGE_TIME 50 /* 1ms */
+#define MTK_HW_LRO_MAX_AGG_CNT 64
+#define MTK_HW_LRO_BW_THRE 3000
+#define MTK_HW_LRO_REPLACE_DELTA 1000
+#define MTK_HW_LRO_SDL_REMAIN_ROOM 1522
/* Frame Engine Global Reset Register */
#define MTK_RST_GL 0x04
@@ -50,6 +64,9 @@
#define MTK_GDM1_AF BIT(28)
#define MTK_GDM2_AF BIT(29)
+/* PDMA HW LRO Alter Flow Timer Register */
+#define MTK_PDMA_LRO_ALT_REFRESH_TIMER 0x1c
+
/* Frame Engine Interrupt Grouping Register */
#define MTK_FE_INT_GRP 0x20
@@ -70,12 +87,29 @@
/* PDMA RX Base Pointer Register */
#define MTK_PRX_BASE_PTR0 0x900
+#define MTK_PRX_BASE_PTR_CFG(x) (MTK_PRX_BASE_PTR0 + (x * 0x10))
/* PDMA RX Maximum Count Register */
#define MTK_PRX_MAX_CNT0 0x904
+#define MTK_PRX_MAX_CNT_CFG(x) (MTK_PRX_MAX_CNT0 + (x * 0x10))
/* PDMA RX CPU Pointer Register */
#define MTK_PRX_CRX_IDX0 0x908
+#define MTK_PRX_CRX_IDX_CFG(x) (MTK_PRX_CRX_IDX0 + (x * 0x10))
+
+/* PDMA HW LRO Control Registers */
+#define MTK_PDMA_LRO_CTRL_DW0 0x980
+#define MTK_LRO_EN BIT(0)
+#define MTK_L3_CKS_UPD_EN BIT(7)
+#define MTK_LRO_ALT_PKT_CNT_MODE BIT(21)
+#define MTK_LRO_RING_RELINQUISH_REQ (0x7 << 26)
+#define MTK_LRO_RING_RELINQUISH_DONE (0x7 << 29)
+
+#define MTK_PDMA_LRO_CTRL_DW1 0x984
+#define MTK_PDMA_LRO_CTRL_DW2 0x988
+#define MTK_PDMA_LRO_CTRL_DW3 0x98c
+#define MTK_ADMA_MODE BIT(15)
+#define MTK_LRO_MIN_RXD_SDL (MTK_HW_LRO_SDL_REMAIN_ROOM << 16)
/* PDMA Global Configuration Register */
#define MTK_PDMA_GLO_CFG 0xa04
@@ -84,6 +118,7 @@
/* PDMA Reset Index Register */
#define MTK_PDMA_RST_IDX 0xa08
#define MTK_PST_DRX_IDX0 BIT(16)
+#define MTK_PST_DRX_IDX_CFG(x) (MTK_PST_DRX_IDX0 << (x))
/* PDMA Delay Interrupt Register */
#define MTK_PDMA_DELAY_INT 0xa0c
@@ -94,10 +129,33 @@
/* PDMA Interrupt Mask Register */
#define MTK_PDMA_INT_MASK 0xa28
+/* PDMA HW LRO Alter Flow Delta Register */
+#define MTK_PDMA_LRO_ALT_SCORE_DELTA 0xa4c
+
/* PDMA Interrupt grouping registers */
#define MTK_PDMA_INT_GRP1 0xa50
#define MTK_PDMA_INT_GRP2 0xa54
+/* PDMA HW LRO IP Setting Registers */
+#define MTK_LRO_RX_RING0_DIP_DW0 0xb04
+#define MTK_LRO_DIP_DW0_CFG(x) (MTK_LRO_RX_RING0_DIP_DW0 + (x * 0x40))
+#define MTK_RING_MYIP_VLD BIT(9)
+
+/* PDMA HW LRO Ring Control Registers */
+#define MTK_LRO_RX_RING0_CTRL_DW1 0xb28
+#define MTK_LRO_RX_RING0_CTRL_DW2 0xb2c
+#define MTK_LRO_RX_RING0_CTRL_DW3 0xb30
+#define MTK_LRO_CTRL_DW1_CFG(x) (MTK_LRO_RX_RING0_CTRL_DW1 + (x * 0x40))
+#define MTK_LRO_CTRL_DW2_CFG(x) (MTK_LRO_RX_RING0_CTRL_DW2 + (x * 0x40))
+#define MTK_LRO_CTRL_DW3_CFG(x) (MTK_LRO_RX_RING0_CTRL_DW3 + (x * 0x40))
+#define MTK_RING_AGE_TIME_L ((MTK_HW_LRO_AGE_TIME & 0x3ff) << 22)
+#define MTK_RING_AGE_TIME_H ((MTK_HW_LRO_AGE_TIME >> 10) & 0x3f)
+#define MTK_RING_AUTO_LERAN_MODE (3 << 6)
+#define MTK_RING_VLD BIT(8)
+#define MTK_RING_MAX_AGG_TIME ((MTK_HW_LRO_AGG_TIME & 0xffff) << 10)
+#define MTK_RING_MAX_AGG_CNT_L ((MTK_HW_LRO_MAX_AGG_CNT & 0x3f) << 26)
+#define MTK_RING_MAX_AGG_CNT_H ((MTK_HW_LRO_MAX_AGG_CNT >> 6) & 0x3)
+
/* QDMA TX Queue Configuration Registers */
#define MTK_QTX_CFG(x) (0x1800 + (x * 0x10))
#define QDMA_RES_THRES 4
@@ -132,7 +190,6 @@
/* QDMA Reset Index Register */
#define MTK_QDMA_RST_IDX 0x1A08
-#define MTK_PST_DRX_IDX0 BIT(16)
/* QDMA Delay Interrupt Register */
#define MTK_QDMA_DELAY_INT 0x1A0C
@@ -256,6 +313,30 @@
MAC_MCR_FORCE_TX_FC | MAC_MCR_SPEED_1000 | \
MAC_MCR_FORCE_DPX | MAC_MCR_FORCE_LINK)
+/* TRGMII RXC control register */
+#define TRGMII_RCK_CTRL 0x10300
+#define DQSI0(x) ((x << 0) & GENMASK(6, 0))
+#define DQSI1(x) ((x << 8) & GENMASK(14, 8))
+#define RXCTL_DMWTLAT(x) ((x << 16) & GENMASK(18, 16))
+#define RXC_DQSISEL BIT(30)
+#define RCK_CTRL_RGMII_1000 (RXC_DQSISEL | RXCTL_DMWTLAT(2) | DQSI1(16))
+#define RCK_CTRL_RGMII_10_100 RXCTL_DMWTLAT(2)
+
+/* TRGMII RXC control register */
+#define TRGMII_TCK_CTRL 0x10340
+#define TXCTL_DMWTLAT(x) ((x << 16) & GENMASK(18, 16))
+#define TXC_INV BIT(30)
+#define TCK_CTRL_RGMII_1000 TXCTL_DMWTLAT(2)
+#define TCK_CTRL_RGMII_10_100 (TXC_INV | TXCTL_DMWTLAT(2))
+
+/* TRGMII Interface mode register */
+#define INTF_MODE 0x10390
+#define TRGMII_INTF_DIS BIT(0)
+#define TRGMII_MODE BIT(1)
+#define TRGMII_CENTRAL_ALIGNED BIT(2)
+#define INTF_MODE_RGMII_1000 (TRGMII_MODE | TRGMII_CENTRAL_ALIGNED)
+#define INTF_MODE_RGMII_10_100 0
+
/* GPIO port control registers for GMAC 2*/
#define GPIO_OD33_CTRL8 0x4c0
#define GPIO_BIAS_CTRL 0xed0
@@ -266,7 +347,11 @@
#define SYSCFG0_GE_MASK 0x3
#define SYSCFG0_GE_MODE(x, y) (x << (12 + (y * 2)))
-/*ethernet reset control register*/
+/* ethernet subsystem clock register */
+#define ETHSYS_CLKCFG0 0x2c
+#define ETHSYS_TRGMII_CLK_SEL362_5 BIT(11)
+
+/* ethernet reset control register */
#define ETHSYS_RSTCTRL 0x34
#define RSTCTRL_FE BIT(6)
#define RSTCTRL_PPE BIT(31)
@@ -332,6 +417,7 @@
MTK_CLK_ESW,
MTK_CLK_GP1,
MTK_CLK_GP2,
+ MTK_CLK_TRGPLL,
MTK_CLK_MAX
};
@@ -377,6 +463,12 @@
atomic_t free_count;
};
+/* PDMA rx ring mode */
+enum mtk_rx_flags {
+ MTK_RX_FLAGS_NORMAL = 0,
+ MTK_RX_FLAGS_HWLRO,
+};
+
/* struct mtk_rx_ring - This struct holds info describing a RX ring
* @dma: The descriptor ring
* @data: The memory pointed at by the ring
@@ -391,7 +483,10 @@
dma_addr_t phys;
u16 frag_size;
u16 buf_size;
+ u16 dma_size;
+ bool calc_idx_update;
u16 calc_idx;
+ u32 crx_idx_reg;
};
/* currently no SoC has more than 2 macs */
@@ -439,9 +534,10 @@
unsigned long sysclk;
struct regmap *ethsys;
struct regmap *pctl;
+ bool hwlro;
atomic_t dma_refcnt;
struct mtk_tx_ring tx_ring;
- struct mtk_rx_ring rx_ring;
+ struct mtk_rx_ring rx_ring[MTK_MAX_RX_RING_NUM];
struct napi_struct tx_napi;
struct napi_struct rx_napi;
struct mtk_tx_dma *scratch_ring;
@@ -461,7 +557,8 @@
* @of_node: Our devicetree node
* @hw: Backpointer to our main datastruture
* @hw_stats: Packet statistics counter
- * @phy_dev: The attached PHY if available
+ * @trgmii Indicate if the MAC uses TRGMII connected to internal
+ switch
*/
struct mtk_mac {
int id;
@@ -469,7 +566,9 @@
struct device_node *of_node;
struct mtk_eth *hw;
struct mtk_hw_stats *hw_stats;
- struct phy_device *phy_dev;
+ __be32 hwlro_ip[MTK_MAX_LRO_IP_CNT];
+ int hwlro_ip_cnt;
+ bool trgmii;
};
/* the struct describing the SoC. these are declared in the soc_xyz.c files */
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index f04a423..b1cef7a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -785,17 +785,23 @@
return mlx4_cmd_reset_flow(dev, op, op_modifier, -EIO);
if (!mlx4_is_mfunc(dev) || (native && mlx4_is_master(dev))) {
+ int ret;
+
if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
return mlx4_internal_err_ret_value(dev, op,
op_modifier);
+ down_read(&mlx4_priv(dev)->cmd.switch_sem);
if (mlx4_priv(dev)->cmd.use_events)
- return mlx4_cmd_wait(dev, in_param, out_param,
- out_is_imm, in_modifier,
- op_modifier, op, timeout);
+ ret = mlx4_cmd_wait(dev, in_param, out_param,
+ out_is_imm, in_modifier,
+ op_modifier, op, timeout);
else
- return mlx4_cmd_poll(dev, in_param, out_param,
- out_is_imm, in_modifier,
- op_modifier, op, timeout);
+ ret = mlx4_cmd_poll(dev, in_param, out_param,
+ out_is_imm, in_modifier,
+ op_modifier, op, timeout);
+
+ up_read(&mlx4_priv(dev)->cmd.switch_sem);
+ return ret;
}
return mlx4_slave_cmd(dev, in_param, out_param, out_is_imm,
in_modifier, op_modifier, op, timeout);
@@ -1845,6 +1851,7 @@
if (vp_oper->state.default_vlan == vp_admin->default_vlan &&
vp_oper->state.default_qos == vp_admin->default_qos &&
+ vp_oper->state.vlan_proto == vp_admin->vlan_proto &&
vp_oper->state.link_state == vp_admin->link_state &&
vp_oper->state.qos_vport == vp_admin->qos_vport)
return 0;
@@ -1903,6 +1910,7 @@
vp_oper->state.default_vlan = vp_admin->default_vlan;
vp_oper->state.default_qos = vp_admin->default_qos;
+ vp_oper->state.vlan_proto = vp_admin->vlan_proto;
vp_oper->state.link_state = vp_admin->link_state;
vp_oper->state.qos_vport = vp_admin->qos_vport;
@@ -1916,6 +1924,7 @@
work->qos_vport = vp_oper->state.qos_vport;
work->vlan_id = vp_oper->state.default_vlan;
work->vlan_ix = vp_oper->vlan_idx;
+ work->vlan_proto = vp_oper->state.vlan_proto;
work->priv = priv;
INIT_WORK(&work->work, mlx4_vf_immed_vlan_work_handler);
queue_work(priv->mfunc.master.comm_wq, &work->work);
@@ -1986,6 +1995,8 @@
int port, err;
struct mlx4_vport_state *vp_admin;
struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_slave_state *slave_state =
+ &priv->mfunc.master.slave_state[slave];
struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
&priv->dev, slave);
int min_port = find_first_bit(actv_ports.ports,
@@ -2000,12 +2011,26 @@
priv->mfunc.master.vf_admin[slave].enable_smi[port];
vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
vp_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
- vp_oper->state = *vp_admin;
+ if (vp_admin->vlan_proto != htons(ETH_P_8021AD) ||
+ slave_state->vst_qinq_supported) {
+ vp_oper->state.vlan_proto = vp_admin->vlan_proto;
+ vp_oper->state.default_vlan = vp_admin->default_vlan;
+ vp_oper->state.default_qos = vp_admin->default_qos;
+ }
+ vp_oper->state.link_state = vp_admin->link_state;
+ vp_oper->state.mac = vp_admin->mac;
+ vp_oper->state.spoofchk = vp_admin->spoofchk;
+ vp_oper->state.tx_rate = vp_admin->tx_rate;
+ vp_oper->state.qos_vport = vp_admin->qos_vport;
+ vp_oper->state.guid = vp_admin->guid;
+
if (MLX4_VGT != vp_admin->default_vlan) {
err = __mlx4_register_vlan(&priv->dev, port,
vp_admin->default_vlan, &(vp_oper->vlan_idx));
if (err) {
vp_oper->vlan_idx = NO_INDX;
+ vp_oper->state.default_vlan = MLX4_VGT;
+ vp_oper->state.vlan_proto = htons(ETH_P_8021Q);
mlx4_warn(&priv->dev,
"No vlan resources slave %d, port %d\n",
slave, port);
@@ -2086,6 +2111,7 @@
mlx4_warn(dev, "Received reset from slave:%d\n", slave);
slave_state[slave].active = false;
slave_state[slave].old_vlan_api = false;
+ slave_state[slave].vst_qinq_supported = false;
mlx4_master_deactivate_admin_state(priv, slave);
for (i = 0; i < MLX4_EVENT_TYPES_NUM; ++i) {
slave_state[slave].event_eq[i].eqn = -1;
@@ -2353,6 +2379,7 @@
vf_oper = &priv->mfunc.master.vf_oper[i];
s_state = &priv->mfunc.master.slave_state[i];
s_state->last_cmd = MLX4_COMM_CMD_RESET;
+ s_state->vst_qinq_supported = false;
mutex_init(&priv->mfunc.master.gen_eqe_mutex[i]);
for (j = 0; j < MLX4_EVENT_TYPES_NUM; ++j)
s_state->event_eq[j].eqn = -1;
@@ -2382,6 +2409,8 @@
admin_vport->qos_vport =
MLX4_VPP_DEFAULT_VPORT;
oper_vport->qos_vport = MLX4_VPP_DEFAULT_VPORT;
+ admin_vport->vlan_proto = htons(ETH_P_8021Q);
+ oper_vport->vlan_proto = htons(ETH_P_8021Q);
vf_oper->vport[port].vlan_idx = NO_INDX;
vf_oper->vport[port].mac_idx = NO_INDX;
mlx4_set_random_admin_guid(dev, i, port);
@@ -2454,6 +2483,7 @@
int flags = 0;
if (!priv->cmd.initialized) {
+ init_rwsem(&priv->cmd.switch_sem);
mutex_init(&priv->cmd.slave_cmd_mutex);
sema_init(&priv->cmd.poll_sem, 1);
priv->cmd.use_events = 0;
@@ -2583,6 +2613,7 @@
if (!priv->cmd.context)
return -ENOMEM;
+ down_write(&priv->cmd.switch_sem);
for (i = 0; i < priv->cmd.max_cmds; ++i) {
priv->cmd.context[i].token = i;
priv->cmd.context[i].next = i + 1;
@@ -2606,6 +2637,7 @@
down(&priv->cmd.poll_sem);
priv->cmd.use_events = 1;
+ up_write(&priv->cmd.switch_sem);
return err;
}
@@ -2618,6 +2650,7 @@
struct mlx4_priv *priv = mlx4_priv(dev);
int i;
+ down_write(&priv->cmd.switch_sem);
priv->cmd.use_events = 0;
for (i = 0; i < priv->cmd.max_cmds; ++i)
@@ -2626,6 +2659,7 @@
kfree(priv->cmd.context);
up(&priv->cmd.poll_sem);
+ up_write(&priv->cmd.switch_sem);
}
struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
@@ -2937,10 +2971,13 @@
EXPORT_SYMBOL_GPL(mlx4_set_vf_mac);
-int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan, u8 qos)
+int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan, u8 qos,
+ __be16 proto)
{
struct mlx4_priv *priv = mlx4_priv(dev);
struct mlx4_vport_state *vf_admin;
+ struct mlx4_slave_state *slave_state;
+ struct mlx4_vport_oper_state *vf_oper;
int slave;
if ((!mlx4_is_master(dev)) ||
@@ -2950,12 +2987,31 @@
if ((vlan > 4095) || (qos > 7))
return -EINVAL;
+ if (proto == htons(ETH_P_8021AD) &&
+ !(dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP))
+ return -EPROTONOSUPPORT;
+
+ if (proto != htons(ETH_P_8021Q) &&
+ proto != htons(ETH_P_8021AD))
+ return -EINVAL;
+
+ if ((proto == htons(ETH_P_8021AD)) &&
+ ((vlan == 0) || (vlan == MLX4_VGT)))
+ return -EINVAL;
+
slave = mlx4_get_slave_indx(dev, vf);
if (slave < 0)
return -EINVAL;
+ slave_state = &priv->mfunc.master.slave_state[slave];
+ if ((proto == htons(ETH_P_8021AD)) && (slave_state->active) &&
+ (!slave_state->vst_qinq_supported)) {
+ mlx4_err(dev, "vf %d does not support VST QinQ mode\n", vf);
+ return -EPROTONOSUPPORT;
+ }
port = mlx4_slaves_closest_port(dev, slave, port);
vf_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+ vf_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
if (!mlx4_valid_vf_state_change(dev, port, vf_admin, vlan, qos))
return -EPERM;
@@ -2965,6 +3021,7 @@
else
vf_admin->default_vlan = vlan;
vf_admin->default_qos = qos;
+ vf_admin->vlan_proto = proto;
/* If rate was configured prior to VST, we saved the configured rate
* in vf_admin->rate and now, if priority supported we enforce the QoS
@@ -2973,7 +3030,12 @@
vf_admin->tx_rate)
vf_admin->qos_vport = slave;
- if (mlx4_master_immediate_activate_vlan_qos(priv, slave, port))
+ /* Try to activate new vf state without restart,
+ * this option is not supported while moving to VST QinQ mode.
+ */
+ if ((proto == htons(ETH_P_8021AD) &&
+ vf_oper->state.vlan_proto != proto) ||
+ mlx4_master_immediate_activate_vlan_qos(priv, slave, port))
mlx4_info(dev,
"updating vf %d port %d config will take effect on next VF restart\n",
vf, port);
@@ -3117,6 +3179,7 @@
ivf->vlan = s_info->default_vlan;
ivf->qos = s_info->default_qos;
+ ivf->vlan_proto = s_info->vlan_proto;
if (mlx4_is_vf_vst_and_prio_qos(dev, port, s_info))
ivf->max_tx_rate = s_info->tx_rate;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
index 1494997..08fc5fc 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
@@ -298,7 +298,7 @@
if (IS_ERR(mdev->ptp_clock)) {
mdev->ptp_clock = NULL;
mlx4_err(mdev, "ptp_clock_register failed\n");
- } else {
+ } else if (mdev->ptp_clock) {
mlx4_info(mdev, "registered PHC clock\n");
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 62516f8..7e703be 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2400,12 +2400,14 @@
return mlx4_set_vf_mac(mdev->dev, en_priv->port, queue, mac_u64);
}
-static int mlx4_en_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos)
+static int mlx4_en_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos,
+ __be16 vlan_proto)
{
struct mlx4_en_priv *en_priv = netdev_priv(dev);
struct mlx4_en_dev *mdev = en_priv->mdev;
- return mlx4_set_vf_vlan(mdev->dev, en_priv->port, vf, vlan, qos);
+ return mlx4_set_vf_vlan(mdev->dev, en_priv->port, vf, vlan, qos,
+ vlan_proto);
}
static int mlx4_en_set_vf_rate(struct net_device *dev, int vf, int min_tx_rate,
@@ -3224,6 +3226,7 @@
}
if (mlx4_is_slave(mdev->dev)) {
+ bool vlan_offload_disabled;
int phv;
err = get_phv_bit(mdev->dev, port, &phv);
@@ -3231,6 +3234,18 @@
dev->hw_features |= NETIF_F_HW_VLAN_STAG_TX;
priv->pflags |= MLX4_EN_PRIV_FLAGS_PHV;
}
+ err = mlx4_get_is_vlan_offload_disabled(mdev->dev, port,
+ &vlan_offload_disabled);
+ if (!err && vlan_offload_disabled) {
+ dev->hw_features &= ~(NETIF_F_HW_VLAN_CTAG_TX |
+ NETIF_F_HW_VLAN_CTAG_RX |
+ NETIF_F_HW_VLAN_STAG_TX |
+ NETIF_F_HW_VLAN_STAG_RX);
+ dev->features &= ~(NETIF_F_HW_VLAN_CTAG_TX |
+ NETIF_F_HW_VLAN_CTAG_RX |
+ NETIF_F_HW_VLAN_STAG_TX |
+ NETIF_F_HW_VLAN_STAG_RX);
+ }
} else {
if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_PHV_EN &&
!(mdev->dev->caps.flags2 &
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 6758292..f2e8bed 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -72,7 +72,7 @@
}
dma = dma_map_page(priv->ddev, page, 0, PAGE_SIZE << order,
frag_info->dma_dir);
- if (dma_mapping_error(priv->ddev, dma)) {
+ if (unlikely(dma_mapping_error(priv->ddev, dma))) {
put_page(page);
return -ENOMEM;
}
@@ -108,7 +108,8 @@
ring_alloc[i].page_size)
continue;
- if (mlx4_alloc_pages(priv, &page_alloc[i], frag_info, gfp))
+ if (unlikely(mlx4_alloc_pages(priv, &page_alloc[i],
+ frag_info, gfp)))
goto out;
}
@@ -585,7 +586,7 @@
frag_info = &priv->frag_info[nr];
if (length <= frag_info->frag_prefix_size)
break;
- if (!frags[nr].page)
+ if (unlikely(!frags[nr].page))
goto fail;
dma = be64_to_cpu(rx_desc->data[nr].addr);
@@ -625,7 +626,7 @@
dma_addr_t dma;
skb = netdev_alloc_skb(priv->dev, SMALL_PACKET_SIZE + NET_IP_ALIGN);
- if (!skb) {
+ if (unlikely(!skb)) {
en_dbg(RX_ERR, priv, "Failed allocating skb\n");
return NULL;
}
@@ -736,7 +737,8 @@
{
__wsum csum_pseudo_hdr = 0;
- if (ipv6h->nexthdr == IPPROTO_FRAGMENT || ipv6h->nexthdr == IPPROTO_HOPOPTS)
+ if (unlikely(ipv6h->nexthdr == IPPROTO_FRAGMENT ||
+ ipv6h->nexthdr == IPPROTO_HOPOPTS))
return -1;
hw_checksum = csum_add(hw_checksum, (__force __wsum)htons(ipv6h->nexthdr));
@@ -769,7 +771,7 @@
get_fixed_ipv4_csum(hw_checksum, skb, hdr);
#if IS_ENABLED(CONFIG_IPV6)
else if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV6))
- if (get_fixed_ipv6_csum(hw_checksum, skb, hdr))
+ if (unlikely(get_fixed_ipv6_csum(hw_checksum, skb, hdr)))
return -1;
#endif
return 0;
@@ -796,10 +798,10 @@
u64 timestamp;
bool l2_tunnel;
- if (!priv->port_up)
+ if (unlikely(!priv->port_up))
return 0;
- if (budget <= 0)
+ if (unlikely(budget <= 0))
return polled;
/* Protect accesses to: ring->xdp_prog, priv->mac_hash list */
@@ -902,16 +904,17 @@
case XDP_PASS:
break;
case XDP_TX:
- if (!mlx4_en_xmit_frame(frags, dev,
+ if (likely(!mlx4_en_xmit_frame(frags, dev,
length, tx_index,
- &doorbell_pending))
+ &doorbell_pending)))
goto consumed;
- break;
+ goto xdp_drop; /* Drop on xmit failure */
default:
bpf_warn_invalid_xdp_action(act);
case XDP_ABORTED:
case XDP_DROP:
- if (mlx4_en_rx_recycle(ring, frags))
+xdp_drop:
+ if (likely(mlx4_en_rx_recycle(ring, frags)))
goto consumed;
goto next;
}
@@ -1015,12 +1018,12 @@
/* GRO not possible, complete processing here */
skb = mlx4_en_rx_skb(priv, rx_desc, frags, length);
- if (!skb) {
+ if (unlikely(!skb)) {
ring->dropped++;
goto next;
}
- if (unlikely(priv->validate_loopback)) {
+ if (unlikely(priv->validate_loopback)) {
validate_loopback(priv, skb);
goto next;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index f613977..cf8f8a7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1305,8 +1305,8 @@
return 0;
err_out_unmap:
- while (i >= 0)
- mlx4_free_eq(dev, &priv->eq_table.eq[i--]);
+ while (i > 0)
+ mlx4_free_eq(dev, &priv->eq_table.eq[--i]);
#ifdef CONFIG_RFS_ACCEL
for (i = 1; i <= dev->caps.num_ports; i++) {
if (mlx4_priv(dev)->port[i].rmap) {
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index d728704..090bf81 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -158,7 +158,8 @@
[31] = "Modifying loopback source checks using UPDATE_QP support",
[32] = "Loopback source checks support",
[33] = "RoCEv2 support",
- [34] = "DMFS Sniffer support (UC & MC)"
+ [34] = "DMFS Sniffer support (UC & MC)",
+ [35] = "QinQ VST mode support",
};
int i;
@@ -248,6 +249,72 @@
return err;
}
+static int mlx4_activate_vst_qinq(struct mlx4_priv *priv, int slave, int port)
+{
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_vport_state *vp_admin;
+ int err;
+
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ vp_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+
+ if (vp_admin->default_vlan != vp_oper->state.default_vlan) {
+ err = __mlx4_register_vlan(&priv->dev, port,
+ vp_admin->default_vlan,
+ &vp_oper->vlan_idx);
+ if (err) {
+ vp_oper->vlan_idx = NO_INDX;
+ mlx4_warn(&priv->dev,
+ "No vlan resources slave %d, port %d\n",
+ slave, port);
+ return err;
+ }
+ mlx4_dbg(&priv->dev, "alloc vlan %d idx %d slave %d port %d\n",
+ (int)(vp_oper->state.default_vlan),
+ vp_oper->vlan_idx, slave, port);
+ }
+ vp_oper->state.vlan_proto = vp_admin->vlan_proto;
+ vp_oper->state.default_vlan = vp_admin->default_vlan;
+ vp_oper->state.default_qos = vp_admin->default_qos;
+
+ return 0;
+}
+
+static int mlx4_handle_vst_qinq(struct mlx4_priv *priv, int slave, int port)
+{
+ struct mlx4_vport_oper_state *vp_oper;
+ struct mlx4_slave_state *slave_state;
+ struct mlx4_vport_state *vp_admin;
+ int err;
+
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ vp_admin = &priv->mfunc.master.vf_admin[slave].vport[port];
+ slave_state = &priv->mfunc.master.slave_state[slave];
+
+ if ((vp_admin->vlan_proto != htons(ETH_P_8021AD)) ||
+ (!slave_state->active))
+ return 0;
+
+ if (vp_oper->state.vlan_proto == vp_admin->vlan_proto &&
+ vp_oper->state.default_vlan == vp_admin->default_vlan &&
+ vp_oper->state.default_qos == vp_admin->default_qos)
+ return 0;
+
+ if (!slave_state->vst_qinq_supported) {
+ /* Warn and revert the request to set vst QinQ mode */
+ vp_admin->vlan_proto = vp_oper->state.vlan_proto;
+ vp_admin->default_vlan = vp_oper->state.default_vlan;
+ vp_admin->default_qos = vp_oper->state.default_qos;
+
+ mlx4_warn(&priv->dev,
+ "Slave %d does not support VST QinQ mode\n", slave);
+ return 0;
+ }
+
+ err = mlx4_activate_vst_qinq(priv, slave, port);
+ return err;
+}
+
int mlx4_QUERY_FUNC_CAP_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
@@ -311,14 +378,18 @@
#define QUERY_FUNC_CAP_VF_ENABLE_QP0 0x08
#define QUERY_FUNC_CAP_FLAGS0_FORCE_PHY_WQE_GID 0x80
-#define QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS (1 << 31)
#define QUERY_FUNC_CAP_PHV_BIT 0x40
+#define QUERY_FUNC_CAP_VLAN_OFFLOAD_DISABLE 0x20
+
+#define QUERY_FUNC_CAP_SUPPORTS_VST_QINQ BIT(30)
+#define QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS BIT(31)
if (vhcr->op_modifier == 1) {
struct mlx4_active_ports actv_ports =
mlx4_get_active_ports(dev, slave);
int converted_port = mlx4_slave_convert_port(
dev, slave, vhcr->in_modifier);
+ struct mlx4_vport_oper_state *vp_oper;
if (converted_port < 0)
return -EINVAL;
@@ -357,15 +428,24 @@
MLX4_PUT(outbox->buf, dev->caps.phys_port_id[vhcr->in_modifier],
QUERY_FUNC_CAP_PHYS_PORT_ID);
- if (dev->caps.phv_bit[port]) {
- field = QUERY_FUNC_CAP_PHV_BIT;
- MLX4_PUT(outbox->buf, field,
- QUERY_FUNC_CAP_FLAGS0_OFFSET);
- }
+ vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
+ err = mlx4_handle_vst_qinq(priv, slave, port);
+ if (err)
+ return err;
+
+ field = 0;
+ if (dev->caps.phv_bit[port])
+ field |= QUERY_FUNC_CAP_PHV_BIT;
+ if (vp_oper->state.vlan_proto == htons(ETH_P_8021AD))
+ field |= QUERY_FUNC_CAP_VLAN_OFFLOAD_DISABLE;
+ MLX4_PUT(outbox->buf, field, QUERY_FUNC_CAP_FLAGS0_OFFSET);
} else if (vhcr->op_modifier == 0) {
struct mlx4_active_ports actv_ports =
mlx4_get_active_ports(dev, slave);
+ struct mlx4_slave_state *slave_state =
+ &priv->mfunc.master.slave_state[slave];
+
/* enable rdma and ethernet interfaces, new quota locations,
* and reserved lkey
*/
@@ -439,6 +519,10 @@
size = dev->caps.reserved_lkey + ((slave << 8) & 0xFF00);
MLX4_PUT(outbox->buf, size, QUERY_FUNC_CAP_QP_RESD_LKEY_OFFSET);
+
+ if (vhcr->in_modifier & QUERY_FUNC_CAP_SUPPORTS_VST_QINQ)
+ slave_state->vst_qinq_supported = true;
+
} else
err = -EINVAL;
@@ -454,10 +538,12 @@
u32 size, qkey;
int err = 0, quotas = 0;
u32 in_modifier;
+ u32 slave_caps;
op_modifier = !!gen_or_port; /* 0 = general, 1 = logical port */
- in_modifier = op_modifier ? gen_or_port :
+ slave_caps = QUERY_FUNC_CAP_SUPPORTS_VST_QINQ |
QUERY_FUNC_CAP_SUPPORTS_NON_POWER_OF_2_NUM_EQS;
+ in_modifier = op_modifier ? gen_or_port : slave_caps;
mailbox = mlx4_alloc_cmd_mailbox(dev);
if (IS_ERR(mailbox))
@@ -612,8 +698,7 @@
MLX4_GET(func_cap->phys_port_id, outbox,
QUERY_FUNC_CAP_PHYS_PORT_ID);
- MLX4_GET(field, outbox, QUERY_FUNC_CAP_FLAGS0_OFFSET);
- func_cap->flags |= (field & QUERY_FUNC_CAP_PHV_BIT);
+ MLX4_GET(func_cap->flags0, outbox, QUERY_FUNC_CAP_FLAGS0_OFFSET);
/* All other resources are allocated by the master, but we still report
* 'num' and 'reserved' capabilities as follows:
@@ -690,6 +775,7 @@
#define QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET 0x52
#define QUERY_DEV_CAP_MAX_SG_RQ_OFFSET 0x55
#define QUERY_DEV_CAP_MAX_DESC_SZ_RQ_OFFSET 0x56
+#define QUERY_DEV_CAP_SVLAN_BY_QP_OFFSET 0x5D
#define QUERY_DEV_CAP_MAX_QP_MCG_OFFSET 0x61
#define QUERY_DEV_CAP_RSVD_MCG_OFFSET 0x62
#define QUERY_DEV_CAP_MAX_MCG_OFFSET 0x63
@@ -857,6 +943,9 @@
MLX4_GET(size, outbox, QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET);
dev_cap->max_sq_desc_sz = size;
+ MLX4_GET(field, outbox, QUERY_DEV_CAP_SVLAN_BY_QP_OFFSET);
+ if (field & 0x1)
+ dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP;
MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_QP_MCG_OFFSET);
dev_cap->max_qp_per_mcg = 1 << field;
MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_MCG_OFFSET);
@@ -2914,7 +3003,7 @@
memset(&func_cap, 0, sizeof(func_cap));
err = mlx4_QUERY_FUNC_CAP(dev, port, &func_cap);
if (!err)
- *phv = func_cap.flags & QUERY_FUNC_CAP_PHV_BIT;
+ *phv = func_cap.flags0 & QUERY_FUNC_CAP_PHV_BIT;
return err;
}
EXPORT_SYMBOL(get_phv_bit);
@@ -2938,6 +3027,22 @@
}
EXPORT_SYMBOL(set_phv_bit);
+int mlx4_get_is_vlan_offload_disabled(struct mlx4_dev *dev, u8 port,
+ bool *vlan_offload_disabled)
+{
+ struct mlx4_func_cap func_cap;
+ int err;
+
+ memset(&func_cap, 0, sizeof(func_cap));
+ err = mlx4_QUERY_FUNC_CAP(dev, port, &func_cap);
+ if (!err)
+ *vlan_offload_disabled =
+ !!(func_cap.flags0 &
+ QUERY_FUNC_CAP_VLAN_OFFLOAD_DISABLE);
+ return err;
+}
+EXPORT_SYMBOL(mlx4_get_is_vlan_offload_disabled);
+
void mlx4_replace_zero_macs(struct mlx4_dev *dev)
{
int i;
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.h b/drivers/net/ethernet/mellanox/mlx4/fw.h
index cdbd76f..f11614f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.h
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.h
@@ -152,7 +152,7 @@
u32 qp1_proxy_qpn;
u32 reserved_lkey;
u8 physical_port;
- u8 port_flags;
+ u8 flags0;
u8 flags1;
u64 phys_port_id;
u32 extra_flags;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 75dd2e3..7183ac4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2970,6 +2970,7 @@
mlx4_err(dev, "Failed to create mtu file for port %d\n", port);
device_remove_file(&info->dev->persist->pdev->dev,
&info->port_attr);
+ devlink_port_unregister(&info->devlink_port);
info->port = -1;
}
@@ -2984,6 +2985,8 @@
device_remove_file(&info->dev->persist->pdev->dev, &info->port_attr);
device_remove_file(&info->dev->persist->pdev->dev,
&info->port_mtu_attr);
+ devlink_port_unregister(&info->devlink_port);
+
#ifdef CONFIG_RFS_ACCEL
free_irq_cpu_rmap(info->rmap);
info->rmap = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index c9d7fc51..e4878f3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -46,6 +46,7 @@
#include <linux/interrupt.h>
#include <linux/spinlock.h>
#include <net/devlink.h>
+#include <linux/rwsem.h>
#include <linux/mlx4/device.h>
#include <linux/mlx4/driver.h>
@@ -482,6 +483,7 @@
u8 init_port_mask;
bool active;
bool old_vlan_api;
+ bool vst_qinq_supported;
u8 function;
dma_addr_t vhcr_dma;
u16 mtu[MLX4_MAX_PORTS + 1];
@@ -507,6 +509,7 @@
u64 mac;
u16 default_vlan;
u8 default_qos;
+ __be16 vlan_proto;
u32 tx_rate;
bool spoofchk;
u32 link_state;
@@ -627,6 +630,7 @@
struct mutex slave_cmd_mutex;
struct semaphore poll_sem;
struct semaphore event_sem;
+ struct rw_semaphore switch_sem;
int max_cmds;
spinlock_t context_lock;
int free_head;
@@ -655,6 +659,7 @@
u8 qos_vport;
u16 vlan_id;
u16 orig_vlan_id;
+ __be16 vlan_proto;
};
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 8b81114..84d7857 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -790,10 +790,22 @@
MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
} else if (0 != vp_oper->state.default_vlan) {
- qpc->pri_path.vlan_control |=
- MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
- MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
- MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+ if (vp_oper->state.vlan_proto == htons(ETH_P_8021AD)) {
+ /* vst QinQ should block untagged on TX,
+ * but cvlan is in payload and phv is set so
+ * hw see it as untagged. Block tagged instead.
+ */
+ qpc->pri_path.vlan_control |=
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+ } else { /* vst 802.1Q */
+ qpc->pri_path.vlan_control |=
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+ }
} else { /* priority tagged */
qpc->pri_path.vlan_control |=
MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
@@ -802,7 +814,11 @@
qpc->pri_path.fvl_rx |= MLX4_FVL_RX_FORCE_ETH_VLAN;
qpc->pri_path.vlan_index = vp_oper->vlan_idx;
- qpc->pri_path.fl |= MLX4_FL_CV | MLX4_FL_ETH_HIDE_CQE_VLAN;
+ qpc->pri_path.fl |= MLX4_FL_ETH_HIDE_CQE_VLAN;
+ if (vp_oper->state.vlan_proto == htons(ETH_P_8021AD))
+ qpc->pri_path.fl |= MLX4_FL_SV;
+ else
+ qpc->pri_path.fl |= MLX4_FL_CV;
qpc->pri_path.feup |= MLX4_FEUP_FORCE_ETH_UP | MLX4_FVL_FORCE_ETH_VLAN;
qpc->pri_path.sched_queue &= 0xC7;
qpc->pri_path.sched_queue |= (vp_oper->state.default_qos) << 3;
@@ -5238,6 +5254,7 @@
u64 qp_path_mask = ((1ULL << MLX4_UPD_QP_PATH_MASK_VLAN_INDEX) |
(1ULL << MLX4_UPD_QP_PATH_MASK_FVL) |
(1ULL << MLX4_UPD_QP_PATH_MASK_CV) |
+ (1ULL << MLX4_UPD_QP_PATH_MASK_SV) |
(1ULL << MLX4_UPD_QP_PATH_MASK_ETH_HIDE_CQE_VLAN) |
(1ULL << MLX4_UPD_QP_PATH_MASK_FEUP) |
(1ULL << MLX4_UPD_QP_PATH_MASK_FVL_RX) |
@@ -5266,7 +5283,12 @@
else if (!work->vlan_id)
vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED;
- else
+ else if (work->vlan_proto == htons(ETH_P_8021AD))
+ vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
+ MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
+ else /* vst 802.1Q */
vlan_control = MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED |
MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED;
@@ -5311,7 +5333,11 @@
upd_context->qp_context.pri_path.fvl_rx =
qp->fvl_rx | MLX4_FVL_RX_FORCE_ETH_VLAN;
upd_context->qp_context.pri_path.fl =
- qp->pri_path_fl | MLX4_FL_CV | MLX4_FL_ETH_HIDE_CQE_VLAN;
+ qp->pri_path_fl | MLX4_FL_ETH_HIDE_CQE_VLAN;
+ if (work->vlan_proto == htons(ETH_P_8021AD))
+ upd_context->qp_context.pri_path.fl |= MLX4_FL_SV;
+ else
+ upd_context->qp_context.pri_path.fl |= MLX4_FL_CV;
upd_context->qp_context.pri_path.feup =
qp->feup | MLX4_FEUP_FORCE_ETH_UP | MLX4_FVL_FORCE_ETH_VLAN;
upd_context->qp_context.pri_path.sched_queue =
diff --git a/drivers/net/ethernet/mellanox/mlx4/srq.c b/drivers/net/ethernet/mellanox/mlx4/srq.c
index 6714662..f44d089 100644
--- a/drivers/net/ethernet/mellanox/mlx4/srq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/srq.c
@@ -45,15 +45,12 @@
struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
struct mlx4_srq *srq;
- spin_lock(&srq_table->lock);
-
+ rcu_read_lock();
srq = radix_tree_lookup(&srq_table->tree, srqn & (dev->caps.num_srqs - 1));
+ rcu_read_unlock();
if (srq)
atomic_inc(&srq->refcount);
-
- spin_unlock(&srq_table->lock);
-
- if (!srq) {
+ else {
mlx4_warn(dev, "Async event for bogus SRQ %08x\n", srqn);
return;
}
@@ -301,12 +298,11 @@
{
struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
struct mlx4_srq *srq;
- unsigned long flags;
- spin_lock_irqsave(&srq_table->lock, flags);
+ rcu_read_lock();
srq = radix_tree_lookup(&srq_table->tree,
srqn & (dev->caps.num_srqs - 1));
- spin_unlock_irqrestore(&srq_table->lock, flags);
+ rcu_read_unlock();
return srq;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 7dd4763..460363b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -65,6 +65,8 @@
#define MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW 0x3
#define MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE_MPW 0x6
+#define MLX5_RX_HEADROOM NET_SKB_PAD
+
#define MLX5_MPWRQ_LOG_STRIDE_SIZE 6 /* >= 6, HW restriction */
#define MLX5_MPWRQ_LOG_STRIDE_SIZE_CQE_COMPRESS 8 /* >= 6, HW restriction */
#define MLX5_MPWRQ_LOG_WQE_SZ 18
@@ -99,6 +101,18 @@
#define MLX5E_UPDATE_STATS_INTERVAL 200 /* msecs */
#define MLX5E_SQ_BF_BUDGET 16
+#define MLX5E_ICOSQ_MAX_WQEBBS \
+ (DIV_ROUND_UP(sizeof(struct mlx5e_umr_wqe), MLX5_SEND_WQE_BB))
+
+#define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
+#define MLX5E_XDP_IHS_DS_COUNT \
+ DIV_ROUND_UP(MLX5E_XDP_MIN_INLINE - 2, MLX5_SEND_WQE_DS)
+#define MLX5E_XDP_TX_DS_COUNT \
+ (MLX5E_XDP_IHS_DS_COUNT + \
+ (sizeof(struct mlx5e_tx_wqe) / MLX5_SEND_WQE_DS) + 1 /* SG DS */)
+#define MLX5E_XDP_TX_WQEBBS \
+ DIV_ROUND_UP(MLX5E_XDP_TX_DS_COUNT, MLX5_SEND_WQEBB_NUM_DS)
+
#define MLX5E_NUM_MAIN_GROUPS 9
static inline u16 mlx5_min_rx_wqes(int wq_type, u32 wq_size)
@@ -302,10 +316,20 @@
struct mlx5e_rq {
/* data path */
struct mlx5_wq_ll wq;
- u32 wqe_sz;
- struct sk_buff **skb;
- struct mlx5e_mpw_info *wqe_info;
- void *mtt_no_align;
+
+ union {
+ struct mlx5e_dma_info *dma_info;
+ struct {
+ struct mlx5e_mpw_info *info;
+ void *mtt_no_align;
+ u32 mtt_offset;
+ } mpwqe;
+ };
+ struct {
+ u8 page_order;
+ u32 wqe_sz; /* wqe data buffer size */
+ u8 map_dir; /* dma map direction */
+ } buff;
__be32 mkey_be;
struct device *pdev;
@@ -321,9 +345,9 @@
unsigned long state;
int ix;
- u32 mpwqe_mtt_offset;
struct mlx5e_rx_am am; /* Adaptive Moderation */
+ struct bpf_prog *xdp_prog;
/* control */
struct mlx5_wq_ctrl wq_ctrl;
@@ -370,11 +394,17 @@
MLX5E_SQ_STATE_BF_ENABLE,
};
-struct mlx5e_ico_wqe_info {
+struct mlx5e_sq_wqe_info {
u8 opcode;
u8 num_wqebbs;
};
+enum mlx5e_sq_type {
+ MLX5E_SQ_TXQ,
+ MLX5E_SQ_ICO,
+ MLX5E_SQ_XDP
+};
+
struct mlx5e_sq {
/* data path */
@@ -392,10 +422,20 @@
struct mlx5e_cq cq;
- /* pointers to per packet info: write@xmit, read@completion */
- struct sk_buff **skb;
- struct mlx5e_sq_dma *dma_fifo;
- struct mlx5e_tx_wqe_info *wqe_info;
+ /* pointers to per tx element info: write@xmit, read@completion */
+ union {
+ struct {
+ struct sk_buff **skb;
+ struct mlx5e_sq_dma *dma_fifo;
+ struct mlx5e_tx_wqe_info *wqe_info;
+ } txq;
+ struct mlx5e_sq_wqe_info *ico_wqe;
+ struct {
+ struct mlx5e_sq_wqe_info *wqe_info;
+ struct mlx5e_dma_info *di;
+ bool doorbell;
+ } xdp;
+ } db;
/* read only */
struct mlx5_wq_cyc wq;
@@ -417,8 +457,8 @@
struct mlx5_uar uar;
struct mlx5e_channel *channel;
int tc;
- struct mlx5e_ico_wqe_info *ico_wqe_info;
u32 rate_limit;
+ u8 type;
} ____cacheline_aligned_in_smp;
static inline bool mlx5e_sq_has_room_for(struct mlx5e_sq *sq, u16 n)
@@ -434,8 +474,10 @@
struct mlx5e_channel {
/* data path */
struct mlx5e_rq rq;
+ struct mlx5e_sq xdp_sq;
struct mlx5e_sq sq[MLX5E_MAX_NUM_TC];
struct mlx5e_sq icosq; /* internal control operations */
+ bool xdp;
struct napi_struct napi;
struct device *pdev;
struct net_device *netdev;
@@ -617,6 +659,7 @@
/* priv data path fields - start */
struct mlx5e_sq **txq_to_sq_map;
int channeltc_to_txq_map[MLX5E_MAX_NUM_CHANNELS][MLX5E_MAX_NUM_TC];
+ struct bpf_prog *xdp_prog;
/* priv data path fields - end */
unsigned long state;
@@ -663,7 +706,7 @@
int mlx5e_napi_poll(struct napi_struct *napi, int budget);
bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget);
int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget);
-void mlx5e_free_tx_descs(struct mlx5e_sq *sq);
+void mlx5e_free_sq_descs(struct mlx5e_sq *sq);
void mlx5e_page_release(struct mlx5e_rq *rq, struct mlx5e_dma_info *dma_info,
bool recycle);
@@ -764,7 +807,7 @@
static inline u32 mlx5e_get_wqe_mtt_offset(struct mlx5e_rq *rq, u16 wqe_ix)
{
- return rq->mpwqe_mtt_offset +
+ return rq->mpwqe.mtt_offset +
wqe_ix * ALIGN(MLX5_MPWRQ_PAGES_PER_WQE, 8);
}
@@ -826,6 +869,7 @@
int mlx5e_add_sqs_fwd_rules(struct mlx5e_priv *priv);
void mlx5e_remove_sqs_fwd_rules(struct mlx5e_priv *priv);
int mlx5e_attr_get(struct net_device *dev, struct switchdev_attr *attr);
+void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe);
int mlx5e_create_direct_rqts(struct mlx5e_priv *priv);
void mlx5e_destroy_rqt(struct mlx5e_priv *priv, struct mlx5e_rqt *rqt);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
index 847a8f3..13dc388 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
@@ -273,7 +273,7 @@
tstamp->ptp = ptp_clock_register(&tstamp->ptp_info,
&priv->mdev->pdev->dev);
- if (IS_ERR_OR_NULL(tstamp->ptp)) {
+ if (IS_ERR(tstamp->ptp)) {
mlx5_core_warn(priv->mdev, "ptp_clock_register failed %ld\n",
PTR_ERR(tstamp->ptp));
tstamp->ptp = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8595b50..b58cfe3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -34,6 +34,7 @@
#include <net/pkt_cls.h>
#include <linux/mlx5/fs.h>
#include <net/vxlan.h>
+#include <linux/bpf.h>
#include "en.h"
#include "en_tc.h"
#include "eswitch.h"
@@ -50,7 +51,7 @@
struct mlx5_wq_param wq;
u16 max_inline;
u8 min_inline_mode;
- bool icosq;
+ enum mlx5e_sq_type type;
};
struct mlx5e_cq_param {
@@ -63,12 +64,55 @@
struct mlx5e_channel_param {
struct mlx5e_rq_param rq;
struct mlx5e_sq_param sq;
+ struct mlx5e_sq_param xdp_sq;
struct mlx5e_sq_param icosq;
struct mlx5e_cq_param rx_cq;
struct mlx5e_cq_param tx_cq;
struct mlx5e_cq_param icosq_cq;
};
+static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
+{
+ return MLX5_CAP_GEN(mdev, striding_rq) &&
+ MLX5_CAP_GEN(mdev, umr_ptr_rlky) &&
+ MLX5_CAP_ETH(mdev, reg_umr_sq);
+}
+
+static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
+{
+ priv->params.rq_wq_type = rq_type;
+ switch (priv->params.rq_wq_type) {
+ case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
+ priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
+ priv->params.mpwqe_log_stride_sz = priv->params.rx_cqe_compress ?
+ MLX5_MPWRQ_LOG_STRIDE_SIZE_CQE_COMPRESS :
+ MLX5_MPWRQ_LOG_STRIDE_SIZE;
+ priv->params.mpwqe_log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ -
+ priv->params.mpwqe_log_stride_sz;
+ break;
+ default: /* MLX5_WQ_TYPE_LINKED_LIST */
+ priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
+ }
+ priv->params.min_rx_wqes = mlx5_min_rx_wqes(priv->params.rq_wq_type,
+ BIT(priv->params.log_rq_size));
+
+ mlx5_core_info(priv->mdev,
+ "MLX5E: StrdRq(%d) RqSz(%ld) StrdSz(%ld) RxCqeCmprss(%d)\n",
+ priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ,
+ BIT(priv->params.log_rq_size),
+ BIT(priv->params.mpwqe_log_stride_sz),
+ priv->params.rx_cqe_compress_admin);
+}
+
+static void mlx5e_set_rq_priv_params(struct mlx5e_priv *priv)
+{
+ u8 rq_type = mlx5e_check_fragmented_striding_rq_cap(priv->mdev) &&
+ !priv->xdp_prog ?
+ MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ :
+ MLX5_WQ_TYPE_LINKED_LIST;
+ mlx5e_set_rq_type_params(priv, rq_type);
+}
+
static void mlx5e_update_carrier(struct mlx5e_priv *priv)
{
struct mlx5_core_dev *mdev = priv->mdev;
@@ -136,6 +180,9 @@
s->rx_csum_none += rq_stats->csum_none;
s->rx_csum_complete += rq_stats->csum_complete;
s->rx_csum_unnecessary_inner += rq_stats->csum_unnecessary_inner;
+ s->rx_xdp_drop += rq_stats->xdp_drop;
+ s->rx_xdp_tx += rq_stats->xdp_tx;
+ s->rx_xdp_tx_full += rq_stats->xdp_tx_full;
s->rx_wqe_err += rq_stats->wqe_err;
s->rx_mpwqe_filler += rq_stats->mpwqe_filler;
s->rx_buff_alloc_err += rq_stats->buff_alloc_err;
@@ -314,7 +361,7 @@
struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
struct mlx5_wqe_umr_ctrl_seg *ucseg = &wqe->uctrl;
struct mlx5_wqe_data_seg *dseg = &wqe->data;
- struct mlx5e_mpw_info *wi = &rq->wqe_info[ix];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[ix];
u8 ds_cnt = DIV_ROUND_UP(sizeof(*wqe), MLX5_SEND_WQE_DS);
u32 umr_wqe_mtt_offset = mlx5e_get_wqe_mtt_offset(rq, ix);
@@ -342,21 +389,21 @@
int mtt_alloc = mtt_sz + MLX5_UMR_ALIGN - 1;
int i;
- rq->wqe_info = kzalloc_node(wq_sz * sizeof(*rq->wqe_info),
- GFP_KERNEL, cpu_to_node(c->cpu));
- if (!rq->wqe_info)
+ rq->mpwqe.info = kzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
+ GFP_KERNEL, cpu_to_node(c->cpu));
+ if (!rq->mpwqe.info)
goto err_out;
/* We allocate more than mtt_sz as we will align the pointer */
- rq->mtt_no_align = kzalloc_node(mtt_alloc * wq_sz, GFP_KERNEL,
+ rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz, GFP_KERNEL,
cpu_to_node(c->cpu));
- if (unlikely(!rq->mtt_no_align))
+ if (unlikely(!rq->mpwqe.mtt_no_align))
goto err_free_wqe_info;
for (i = 0; i < wq_sz; i++) {
- struct mlx5e_mpw_info *wi = &rq->wqe_info[i];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[i];
- wi->umr.mtt = PTR_ALIGN(rq->mtt_no_align + i * mtt_alloc,
+ wi->umr.mtt = PTR_ALIGN(rq->mpwqe.mtt_no_align + i * mtt_alloc,
MLX5_UMR_ALIGN);
wi->umr.mtt_addr = dma_map_single(c->pdev, wi->umr.mtt, mtt_sz,
PCI_DMA_TODEVICE);
@@ -370,14 +417,14 @@
err_unmap_mtts:
while (--i >= 0) {
- struct mlx5e_mpw_info *wi = &rq->wqe_info[i];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[i];
dma_unmap_single(c->pdev, wi->umr.mtt_addr, mtt_sz,
PCI_DMA_TODEVICE);
}
- kfree(rq->mtt_no_align);
+ kfree(rq->mpwqe.mtt_no_align);
err_free_wqe_info:
- kfree(rq->wqe_info);
+ kfree(rq->mpwqe.info);
err_out:
return -ENOMEM;
@@ -390,13 +437,23 @@
int i;
for (i = 0; i < wq_sz; i++) {
- struct mlx5e_mpw_info *wi = &rq->wqe_info[i];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[i];
dma_unmap_single(rq->pdev, wi->umr.mtt_addr, mtt_sz,
PCI_DMA_TODEVICE);
}
- kfree(rq->mtt_no_align);
- kfree(rq->wqe_info);
+ kfree(rq->mpwqe.mtt_no_align);
+ kfree(rq->mpwqe.info);
+}
+
+static bool mlx5e_is_vf_vport_rep(struct mlx5e_priv *priv)
+{
+ struct mlx5_eswitch_rep *rep = (struct mlx5_eswitch_rep *)priv->ppriv;
+
+ if (rep && rep->vport != FDB_UPLINK_VPORT)
+ return true;
+
+ return false;
}
static int mlx5e_create_rq(struct mlx5e_channel *c,
@@ -408,6 +465,8 @@
void *rqc = param->rqc;
void *rqc_wq = MLX5_ADDR_OF(rqc, rqc, wq);
u32 byte_count;
+ u32 frag_sz;
+ int npages;
int wq_sz;
int err;
int i;
@@ -430,41 +489,66 @@
rq->channel = c;
rq->ix = c->ix;
rq->priv = c->priv;
+ rq->xdp_prog = priv->xdp_prog;
+
+ rq->buff.map_dir = DMA_FROM_DEVICE;
+ if (rq->xdp_prog)
+ rq->buff.map_dir = DMA_BIDIRECTIONAL;
switch (priv->params.rq_wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
+ if (mlx5e_is_vf_vport_rep(priv)) {
+ err = -EINVAL;
+ goto err_rq_wq_destroy;
+ }
+
rq->handle_rx_cqe = mlx5e_handle_rx_cqe_mpwrq;
rq->alloc_wqe = mlx5e_alloc_rx_mpwqe;
rq->dealloc_wqe = mlx5e_dealloc_rx_mpwqe;
- rq->mpwqe_mtt_offset = c->ix *
+ rq->mpwqe.mtt_offset = c->ix *
MLX5E_REQUIRED_MTTS(1, BIT(priv->params.log_rq_size));
rq->mpwqe_stride_sz = BIT(priv->params.mpwqe_log_stride_sz);
rq->mpwqe_num_strides = BIT(priv->params.mpwqe_log_num_strides);
- rq->wqe_sz = rq->mpwqe_stride_sz * rq->mpwqe_num_strides;
- byte_count = rq->wqe_sz;
+
+ rq->buff.wqe_sz = rq->mpwqe_stride_sz * rq->mpwqe_num_strides;
+ byte_count = rq->buff.wqe_sz;
rq->mkey_be = cpu_to_be32(c->priv->umr_mkey.key);
err = mlx5e_rq_alloc_mpwqe_info(rq, c);
if (err)
goto err_rq_wq_destroy;
break;
default: /* MLX5_WQ_TYPE_LINKED_LIST */
- rq->skb = kzalloc_node(wq_sz * sizeof(*rq->skb), GFP_KERNEL,
- cpu_to_node(c->cpu));
- if (!rq->skb) {
+ rq->dma_info = kzalloc_node(wq_sz * sizeof(*rq->dma_info),
+ GFP_KERNEL, cpu_to_node(c->cpu));
+ if (!rq->dma_info) {
err = -ENOMEM;
goto err_rq_wq_destroy;
}
- rq->handle_rx_cqe = mlx5e_handle_rx_cqe;
+
+ if (mlx5e_is_vf_vport_rep(priv))
+ rq->handle_rx_cqe = mlx5e_handle_rx_cqe_rep;
+ else
+ rq->handle_rx_cqe = mlx5e_handle_rx_cqe;
+
rq->alloc_wqe = mlx5e_alloc_rx_wqe;
rq->dealloc_wqe = mlx5e_dealloc_rx_wqe;
- rq->wqe_sz = (priv->params.lro_en) ?
+ rq->buff.wqe_sz = (priv->params.lro_en) ?
priv->params.lro_wqe_sz :
MLX5E_SW2HW_MTU(priv->netdev->mtu);
- rq->wqe_sz = SKB_DATA_ALIGN(rq->wqe_sz);
- byte_count = rq->wqe_sz;
+ byte_count = rq->buff.wqe_sz;
+
+ /* calc the required page order */
+ frag_sz = MLX5_RX_HEADROOM +
+ byte_count /* packet data */ +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ frag_sz = SKB_DATA_ALIGN(frag_sz);
+
+ npages = DIV_ROUND_UP(frag_sz, PAGE_SIZE);
+ rq->buff.page_order = order_base_2(npages);
+
byte_count |= MLX5_HW_START_PADDING;
rq->mkey_be = c->mkey_be;
}
@@ -482,6 +566,9 @@
rq->page_cache.head = 0;
rq->page_cache.tail = 0;
+ if (rq->xdp_prog)
+ bpf_prog_add(rq->xdp_prog, 1);
+
return 0;
err_rq_wq_destroy:
@@ -494,12 +581,15 @@
{
int i;
+ if (rq->xdp_prog)
+ bpf_prog_put(rq->xdp_prog);
+
switch (rq->wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
mlx5e_rq_free_mpwqe_info(rq);
break;
default: /* MLX5_WQ_TYPE_LINKED_LIST */
- kfree(rq->skb);
+ kfree(rq->dma_info);
}
for (i = rq->page_cache.head; i != rq->page_cache.tail;
@@ -641,7 +731,7 @@
/* UMR WQE (if in progress) is always at wq->head */
if (test_bit(MLX5E_RQ_STATE_UMR_WQE_IN_PROGRESS, &rq->state))
- mlx5e_free_rx_mpwqe(rq, &rq->wqe_info[wq->head]);
+ mlx5e_free_rx_mpwqe(rq, &rq->mpwqe.info[wq->head]);
while (!mlx5_wq_ll_is_empty(wq)) {
wqe_ix_be = *wq->tail_next;
@@ -676,8 +766,8 @@
if (param->am_enabled)
set_bit(MLX5E_RQ_STATE_AM, &c->rq.state);
- sq->ico_wqe_info[pi].opcode = MLX5_OPCODE_NOP;
- sq->ico_wqe_info[pi].num_wqebbs = 1;
+ sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_NOP;
+ sq->db.ico_wqe[pi].num_wqebbs = 1;
mlx5e_send_nop(sq, true); /* trigger mlx5e_post_rx_wqes() */
return 0;
@@ -701,26 +791,65 @@
mlx5e_destroy_rq(rq);
}
-static void mlx5e_free_sq_db(struct mlx5e_sq *sq)
+static void mlx5e_free_sq_xdp_db(struct mlx5e_sq *sq)
{
- kfree(sq->wqe_info);
- kfree(sq->dma_fifo);
- kfree(sq->skb);
+ kfree(sq->db.xdp.di);
+ kfree(sq->db.xdp.wqe_info);
}
-static int mlx5e_alloc_sq_db(struct mlx5e_sq *sq, int numa)
+static int mlx5e_alloc_sq_xdp_db(struct mlx5e_sq *sq, int numa)
+{
+ int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+
+ sq->db.xdp.di = kzalloc_node(sizeof(*sq->db.xdp.di) * wq_sz,
+ GFP_KERNEL, numa);
+ sq->db.xdp.wqe_info = kzalloc_node(sizeof(*sq->db.xdp.wqe_info) * wq_sz,
+ GFP_KERNEL, numa);
+ if (!sq->db.xdp.di || !sq->db.xdp.wqe_info) {
+ mlx5e_free_sq_xdp_db(sq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void mlx5e_free_sq_ico_db(struct mlx5e_sq *sq)
+{
+ kfree(sq->db.ico_wqe);
+}
+
+static int mlx5e_alloc_sq_ico_db(struct mlx5e_sq *sq, int numa)
+{
+ u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+
+ sq->db.ico_wqe = kzalloc_node(sizeof(*sq->db.ico_wqe) * wq_sz,
+ GFP_KERNEL, numa);
+ if (!sq->db.ico_wqe)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void mlx5e_free_sq_txq_db(struct mlx5e_sq *sq)
+{
+ kfree(sq->db.txq.wqe_info);
+ kfree(sq->db.txq.dma_fifo);
+ kfree(sq->db.txq.skb);
+}
+
+static int mlx5e_alloc_sq_txq_db(struct mlx5e_sq *sq, int numa)
{
int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
int df_sz = wq_sz * MLX5_SEND_WQEBB_NUM_DS;
- sq->skb = kzalloc_node(wq_sz * sizeof(*sq->skb), GFP_KERNEL, numa);
- sq->dma_fifo = kzalloc_node(df_sz * sizeof(*sq->dma_fifo), GFP_KERNEL,
- numa);
- sq->wqe_info = kzalloc_node(wq_sz * sizeof(*sq->wqe_info), GFP_KERNEL,
- numa);
-
- if (!sq->skb || !sq->dma_fifo || !sq->wqe_info) {
- mlx5e_free_sq_db(sq);
+ sq->db.txq.skb = kzalloc_node(wq_sz * sizeof(*sq->db.txq.skb),
+ GFP_KERNEL, numa);
+ sq->db.txq.dma_fifo = kzalloc_node(df_sz * sizeof(*sq->db.txq.dma_fifo),
+ GFP_KERNEL, numa);
+ sq->db.txq.wqe_info = kzalloc_node(wq_sz * sizeof(*sq->db.txq.wqe_info),
+ GFP_KERNEL, numa);
+ if (!sq->db.txq.skb || !sq->db.txq.dma_fifo || !sq->db.txq.wqe_info) {
+ mlx5e_free_sq_txq_db(sq);
return -ENOMEM;
}
@@ -729,6 +858,46 @@
return 0;
}
+static void mlx5e_free_sq_db(struct mlx5e_sq *sq)
+{
+ switch (sq->type) {
+ case MLX5E_SQ_TXQ:
+ mlx5e_free_sq_txq_db(sq);
+ break;
+ case MLX5E_SQ_ICO:
+ mlx5e_free_sq_ico_db(sq);
+ break;
+ case MLX5E_SQ_XDP:
+ mlx5e_free_sq_xdp_db(sq);
+ break;
+ }
+}
+
+static int mlx5e_alloc_sq_db(struct mlx5e_sq *sq, int numa)
+{
+ switch (sq->type) {
+ case MLX5E_SQ_TXQ:
+ return mlx5e_alloc_sq_txq_db(sq, numa);
+ case MLX5E_SQ_ICO:
+ return mlx5e_alloc_sq_ico_db(sq, numa);
+ case MLX5E_SQ_XDP:
+ return mlx5e_alloc_sq_xdp_db(sq, numa);
+ }
+
+ return 0;
+}
+
+static int mlx5e_sq_get_max_wqebbs(u8 sq_type)
+{
+ switch (sq_type) {
+ case MLX5E_SQ_ICO:
+ return MLX5E_ICOSQ_MAX_WQEBBS;
+ case MLX5E_SQ_XDP:
+ return MLX5E_XDP_TX_WQEBBS;
+ }
+ return MLX5_SEND_WQE_MAX_WQEBBS;
+}
+
static int mlx5e_create_sq(struct mlx5e_channel *c,
int tc,
struct mlx5e_sq_param *param,
@@ -741,6 +910,13 @@
void *sqc_wq = MLX5_ADDR_OF(sqc, sqc, wq);
int err;
+ sq->type = param->type;
+ sq->pdev = c->pdev;
+ sq->tstamp = &priv->tstamp;
+ sq->mkey_be = c->mkey_be;
+ sq->channel = c;
+ sq->tc = tc;
+
err = mlx5_alloc_map_uar(mdev, &sq->uar, !!MLX5_CAP_GEN(mdev, bf));
if (err)
return err;
@@ -769,18 +945,7 @@
if (err)
goto err_sq_wq_destroy;
- if (param->icosq) {
- u8 wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
-
- sq->ico_wqe_info = kzalloc_node(sizeof(*sq->ico_wqe_info) *
- wq_sz,
- GFP_KERNEL,
- cpu_to_node(c->cpu));
- if (!sq->ico_wqe_info) {
- err = -ENOMEM;
- goto err_free_sq_db;
- }
- } else {
+ if (sq->type == MLX5E_SQ_TXQ) {
int txq_ix;
txq_ix = c->ix + tc * priv->params.num_channels;
@@ -788,19 +953,11 @@
priv->txq_to_sq_map[txq_ix] = sq;
}
- sq->pdev = c->pdev;
- sq->tstamp = &priv->tstamp;
- sq->mkey_be = c->mkey_be;
- sq->channel = c;
- sq->tc = tc;
- sq->edge = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+ sq->edge = (sq->wq.sz_m1 + 1) - mlx5e_sq_get_max_wqebbs(sq->type);
sq->bf_budget = MLX5E_SQ_BF_BUDGET;
return 0;
-err_free_sq_db:
- mlx5e_free_sq_db(sq);
-
err_sq_wq_destroy:
mlx5_wq_destroy(&sq->wq_ctrl);
@@ -815,7 +972,6 @@
struct mlx5e_channel *c = sq->channel;
struct mlx5e_priv *priv = c->priv;
- kfree(sq->ico_wqe_info);
mlx5e_free_sq_db(sq);
mlx5_wq_destroy(&sq->wq_ctrl);
mlx5_unmap_free_uar(priv->mdev, &sq->uar);
@@ -844,11 +1000,12 @@
memcpy(sqc, param->sqc, sizeof(param->sqc));
- MLX5_SET(sqc, sqc, tis_num_0, param->icosq ? 0 : priv->tisn[sq->tc]);
+ MLX5_SET(sqc, sqc, tis_num_0, param->type == MLX5E_SQ_ICO ?
+ 0 : priv->tisn[sq->tc]);
MLX5_SET(sqc, sqc, cqn, sq->cq.mcq.cqn);
MLX5_SET(sqc, sqc, min_wqe_inline_mode, sq->min_inline_mode);
MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RST);
- MLX5_SET(sqc, sqc, tis_lst_sz, param->icosq ? 0 : 1);
+ MLX5_SET(sqc, sqc, tis_lst_sz, param->type == MLX5E_SQ_ICO ? 0 : 1);
MLX5_SET(sqc, sqc, flush_in_error_en, 1);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
@@ -963,12 +1120,14 @@
netif_tx_disable_queue(sq->txq);
/* last doorbell out, godspeed .. */
- if (mlx5e_sq_has_room_for(sq, 1))
+ if (mlx5e_sq_has_room_for(sq, 1)) {
+ sq->db.txq.skb[(sq->pc & sq->wq.sz_m1)] = NULL;
mlx5e_send_nop(sq, true);
+ }
}
mlx5e_disable_sq(sq);
- mlx5e_free_tx_descs(sq);
+ mlx5e_free_sq_descs(sq);
mlx5e_destroy_sq(sq);
}
@@ -1329,14 +1488,31 @@
}
}
+ if (priv->xdp_prog) {
+ /* XDP SQ CQ params are same as normal TXQ sq CQ params */
+ err = mlx5e_open_cq(c, &cparam->tx_cq, &c->xdp_sq.cq,
+ priv->params.tx_cq_moderation);
+ if (err)
+ goto err_close_sqs;
+
+ err = mlx5e_open_sq(c, 0, &cparam->xdp_sq, &c->xdp_sq);
+ if (err) {
+ mlx5e_close_cq(&c->xdp_sq.cq);
+ goto err_close_sqs;
+ }
+ }
+
+ c->xdp = !!priv->xdp_prog;
err = mlx5e_open_rq(c, &cparam->rq, &c->rq);
if (err)
- goto err_close_sqs;
+ goto err_close_xdp_sq;
netif_set_xps_queue(netdev, get_cpu_mask(c->cpu), ix);
*cp = c;
return 0;
+err_close_xdp_sq:
+ mlx5e_close_sq(&c->xdp_sq);
err_close_sqs:
mlx5e_close_sqs(c);
@@ -1365,9 +1541,13 @@
static void mlx5e_close_channel(struct mlx5e_channel *c)
{
mlx5e_close_rq(&c->rq);
+ if (c->xdp)
+ mlx5e_close_sq(&c->xdp_sq);
mlx5e_close_sqs(c);
mlx5e_close_sq(&c->icosq);
napi_disable(&c->napi);
+ if (c->xdp)
+ mlx5e_close_cq(&c->xdp_sq.cq);
mlx5e_close_cq(&c->rq.cq);
mlx5e_close_tx_cqs(c);
mlx5e_close_cq(&c->icosq.cq);
@@ -1441,6 +1621,7 @@
param->max_inline = priv->params.tx_max_inline;
param->min_inline_mode = priv->params.tx_min_inline_mode;
+ param->type = MLX5E_SQ_TXQ;
}
static void mlx5e_build_common_cq_param(struct mlx5e_priv *priv,
@@ -1514,7 +1695,22 @@
MLX5_SET(wq, wq, log_wq_sz, log_wq_size);
MLX5_SET(sqc, sqc, reg_umr, MLX5_CAP_ETH(priv->mdev, reg_umr_sq));
- param->icosq = true;
+ param->type = MLX5E_SQ_ICO;
+}
+
+static void mlx5e_build_xdpsq_param(struct mlx5e_priv *priv,
+ struct mlx5e_sq_param *param)
+{
+ void *sqc = param->sqc;
+ void *wq = MLX5_ADDR_OF(sqc, sqc, wq);
+
+ mlx5e_build_sq_param_common(priv, param);
+ MLX5_SET(wq, wq, log_wq_sz, priv->params.log_sq_size);
+
+ param->max_inline = priv->params.tx_max_inline;
+ /* FOR XDP SQs will support only L2 inline mode */
+ param->min_inline_mode = MLX5_INLINE_MODE_NONE;
+ param->type = MLX5E_SQ_XDP;
}
static void mlx5e_build_channel_param(struct mlx5e_priv *priv, struct mlx5e_channel_param *cparam)
@@ -1523,6 +1719,7 @@
mlx5e_build_rq_param(priv, &cparam->rq);
mlx5e_build_sq_param(priv, &cparam->sq);
+ mlx5e_build_xdpsq_param(priv, &cparam->xdp_sq);
mlx5e_build_icosq_param(priv, &cparam->icosq, icosq_log_wq_sz);
mlx5e_build_rx_cq_param(priv, &cparam->rx_cq);
mlx5e_build_tx_cq_param(priv, &cparam->tx_cq);
@@ -2720,11 +2917,15 @@
return mlx5_eswitch_set_vport_mac(mdev->priv.eswitch, vf + 1, mac);
}
-static int mlx5e_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos)
+static int mlx5e_set_vf_vlan(struct net_device *dev, int vf, u16 vlan, u8 qos,
+ __be16 vlan_proto)
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
return mlx5_eswitch_set_vport_vlan(mdev->priv.eswitch, vf + 1,
vlan, qos);
}
@@ -2901,6 +3102,92 @@
schedule_work(&priv->tx_timeout_work);
}
+static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
+{
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct bpf_prog *old_prog;
+ int err = 0;
+ bool reset, was_opened;
+ int i;
+
+ mutex_lock(&priv->state_lock);
+
+ if ((netdev->features & NETIF_F_LRO) && prog) {
+ netdev_warn(netdev, "can't set XDP while LRO is on, disable LRO first\n");
+ err = -EINVAL;
+ goto unlock;
+ }
+
+ was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state);
+ /* no need for full reset when exchanging programs */
+ reset = (!priv->xdp_prog || !prog);
+
+ if (was_opened && reset)
+ mlx5e_close_locked(netdev);
+
+ /* exchange programs */
+ old_prog = xchg(&priv->xdp_prog, prog);
+ if (prog)
+ bpf_prog_add(prog, 1);
+ if (old_prog)
+ bpf_prog_put(old_prog);
+
+ if (reset) /* change RQ type according to priv->xdp_prog */
+ mlx5e_set_rq_priv_params(priv);
+
+ if (was_opened && reset)
+ mlx5e_open_locked(netdev);
+
+ if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset)
+ goto unlock;
+
+ /* exchanging programs w/o reset, we update ref counts on behalf
+ * of the channels RQs here.
+ */
+ bpf_prog_add(prog, priv->params.num_channels);
+ for (i = 0; i < priv->params.num_channels; i++) {
+ struct mlx5e_channel *c = priv->channel[i];
+
+ set_bit(MLX5E_RQ_STATE_FLUSH, &c->rq.state);
+ napi_synchronize(&c->napi);
+ /* prevent mlx5e_poll_rx_cq from accessing rq->xdp_prog */
+
+ old_prog = xchg(&c->rq.xdp_prog, prog);
+
+ clear_bit(MLX5E_RQ_STATE_FLUSH, &c->rq.state);
+ /* napi_schedule in case we have missed anything */
+ set_bit(MLX5E_CHANNEL_NAPI_SCHED, &c->flags);
+ napi_schedule(&c->napi);
+
+ if (old_prog)
+ bpf_prog_put(old_prog);
+ }
+
+unlock:
+ mutex_unlock(&priv->state_lock);
+ return err;
+}
+
+static bool mlx5e_xdp_attached(struct net_device *dev)
+{
+ struct mlx5e_priv *priv = netdev_priv(dev);
+
+ return !!priv->xdp_prog;
+}
+
+static int mlx5e_xdp(struct net_device *dev, struct netdev_xdp *xdp)
+{
+ switch (xdp->command) {
+ case XDP_SETUP_PROG:
+ return mlx5e_xdp_set(dev, xdp->prog);
+ case XDP_QUERY_PROG:
+ xdp->prog_attached = mlx5e_xdp_attached(dev);
+ return 0;
+ default:
+ return -EINVAL;
+ }
+}
+
static const struct net_device_ops mlx5e_netdev_ops_basic = {
.ndo_open = mlx5e_open,
.ndo_stop = mlx5e_close,
@@ -2920,6 +3207,7 @@
.ndo_rx_flow_steer = mlx5e_rx_flow_steer,
#endif
.ndo_tx_timeout = mlx5e_tx_timeout,
+ .ndo_xdp = mlx5e_xdp,
};
static const struct net_device_ops mlx5e_netdev_ops_sriov = {
@@ -2951,6 +3239,7 @@
.ndo_set_vf_link_state = mlx5e_set_vf_link_state,
.ndo_get_vf_stats = mlx5e_get_vf_stats,
.ndo_tx_timeout = mlx5e_tx_timeout,
+ .ndo_xdp = mlx5e_xdp,
};
static int mlx5e_check_required_hca_cap(struct mlx5_core_dev *mdev)
@@ -3025,13 +3314,6 @@
indirection_rqt[i] = i % num_channels;
}
-static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
-{
- return MLX5_CAP_GEN(mdev, striding_rq) &&
- MLX5_CAP_GEN(mdev, umr_ptr_rlky) &&
- MLX5_CAP_ETH(mdev, reg_umr_sq);
-}
-
static int mlx5e_get_pci_bw(struct mlx5_core_dev *mdev, u32 *pci_bw)
{
enum pcie_link_width width;
@@ -3111,11 +3393,13 @@
MLX5_CQ_PERIOD_MODE_START_FROM_CQE :
MLX5_CQ_PERIOD_MODE_START_FROM_EQE;
- priv->params.log_sq_size =
- MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
- priv->params.rq_wq_type = mlx5e_check_fragmented_striding_rq_cap(mdev) ?
- MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ :
- MLX5_WQ_TYPE_LINKED_LIST;
+ priv->mdev = mdev;
+ priv->netdev = netdev;
+ priv->params.num_channels = profile->max_nch(mdev);
+ priv->profile = profile;
+ priv->ppriv = ppriv;
+
+ priv->params.log_sq_size = MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
/* set CQE compression */
priv->params.rx_cqe_compress_admin = false;
@@ -3128,33 +3412,11 @@
priv->params.rx_cqe_compress_admin =
cqe_compress_heuristic(link_speed, pci_bw);
}
-
priv->params.rx_cqe_compress = priv->params.rx_cqe_compress_admin;
- switch (priv->params.rq_wq_type) {
- case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
- priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
- priv->params.mpwqe_log_stride_sz =
- priv->params.rx_cqe_compress ?
- MLX5_MPWRQ_LOG_STRIDE_SIZE_CQE_COMPRESS :
- MLX5_MPWRQ_LOG_STRIDE_SIZE;
- priv->params.mpwqe_log_num_strides = MLX5_MPWRQ_LOG_WQE_SZ -
- priv->params.mpwqe_log_stride_sz;
+ mlx5e_set_rq_priv_params(priv);
+ if (priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
priv->params.lro_en = true;
- break;
- default: /* MLX5_WQ_TYPE_LINKED_LIST */
- priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
- }
-
- mlx5_core_info(mdev,
- "MLX5E: StrdRq(%d) RqSz(%ld) StrdSz(%ld) RxCqeCmprss(%d)\n",
- priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ,
- BIT(priv->params.log_rq_size),
- BIT(priv->params.mpwqe_log_stride_sz),
- priv->params.rx_cqe_compress_admin);
-
- priv->params.min_rx_wqes = mlx5_min_rx_wqes(priv->params.rq_wq_type,
- BIT(priv->params.log_rq_size));
priv->params.rx_am_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
mlx5e_set_rx_cq_mode_params(&priv->params, cq_period_mode);
@@ -3174,19 +3436,16 @@
mlx5e_build_default_indir_rqt(mdev, priv->params.indirection_rqt,
MLX5E_INDIR_RQT_SIZE, profile->max_nch(mdev));
- priv->params.lro_wqe_sz =
- MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
+ priv->params.lro_wqe_sz =
+ MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ -
+ /* Extra room needed for build_skb */
+ MLX5_RX_HEADROOM -
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
/* Initialize pflags */
MLX5E_SET_PRIV_FLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
- priv->mdev = mdev;
- priv->netdev = netdev;
- priv->params.num_channels = profile->max_nch(mdev);
- priv->profile = profile;
- priv->ppriv = ppriv;
-
#ifdef CONFIG_MLX5_CORE_EN_DCB
mlx5e_ets_init(priv);
#endif
@@ -3490,9 +3749,9 @@
mlx5_query_nic_vport_mac_address(mdev, 0, rep.hw_id);
rep.load = mlx5e_nic_rep_load;
rep.unload = mlx5e_nic_rep_unload;
- rep.vport = 0;
+ rep.vport = FDB_UPLINK_VPORT;
rep.priv_data = priv;
- mlx5_eswitch_register_vport_rep(esw, &rep);
+ mlx5_eswitch_register_vport_rep(esw, 0, &rep);
}
}
@@ -3631,7 +3890,7 @@
rep.unload = mlx5e_vport_rep_unload;
rep.vport = vport;
ether_addr_copy(rep.hw_id, mac);
- mlx5_eswitch_register_vport_rep(esw, &rep);
+ mlx5_eswitch_register_vport_rep(esw, vport, &rep);
}
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index dc86779..c6de6fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -36,6 +36,7 @@
#include <net/busy_poll.h>
#include "en.h"
#include "en_tc.h"
+#include "eswitch.h"
static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
{
@@ -179,50 +180,99 @@
mutex_unlock(&priv->state_lock);
}
-int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe *wqe, u16 ix)
-{
- struct sk_buff *skb;
- dma_addr_t dma_addr;
+#define RQ_PAGE_SIZE(rq) ((1 << rq->buff.page_order) << PAGE_SHIFT)
- skb = napi_alloc_skb(rq->cq.napi, rq->wqe_sz);
- if (unlikely(!skb))
+static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq,
+ struct mlx5e_dma_info *dma_info)
+{
+ struct mlx5e_page_cache *cache = &rq->page_cache;
+ u32 tail_next = (cache->tail + 1) & (MLX5E_CACHE_SIZE - 1);
+
+ if (tail_next == cache->head) {
+ rq->stats.cache_full++;
+ return false;
+ }
+
+ cache->page_cache[cache->tail] = *dma_info;
+ cache->tail = tail_next;
+ return true;
+}
+
+static inline bool mlx5e_rx_cache_get(struct mlx5e_rq *rq,
+ struct mlx5e_dma_info *dma_info)
+{
+ struct mlx5e_page_cache *cache = &rq->page_cache;
+
+ if (unlikely(cache->head == cache->tail)) {
+ rq->stats.cache_empty++;
+ return false;
+ }
+
+ if (page_ref_count(cache->page_cache[cache->head].page) != 1) {
+ rq->stats.cache_busy++;
+ return false;
+ }
+
+ *dma_info = cache->page_cache[cache->head];
+ cache->head = (cache->head + 1) & (MLX5E_CACHE_SIZE - 1);
+ rq->stats.cache_reuse++;
+
+ dma_sync_single_for_device(rq->pdev, dma_info->addr,
+ RQ_PAGE_SIZE(rq),
+ DMA_FROM_DEVICE);
+ return true;
+}
+
+static inline int mlx5e_page_alloc_mapped(struct mlx5e_rq *rq,
+ struct mlx5e_dma_info *dma_info)
+{
+ struct page *page;
+
+ if (mlx5e_rx_cache_get(rq, dma_info))
+ return 0;
+
+ page = dev_alloc_pages(rq->buff.page_order);
+ if (unlikely(!page))
return -ENOMEM;
- dma_addr = dma_map_single(rq->pdev,
- /* hw start padding */
- skb->data,
- /* hw end padding */
- rq->wqe_sz,
- DMA_FROM_DEVICE);
-
- if (unlikely(dma_mapping_error(rq->pdev, dma_addr)))
- goto err_free_skb;
-
- *((dma_addr_t *)skb->cb) = dma_addr;
- wqe->data.addr = cpu_to_be64(dma_addr);
-
- rq->skb[ix] = skb;
+ dma_info->page = page;
+ dma_info->addr = dma_map_page(rq->pdev, page, 0,
+ RQ_PAGE_SIZE(rq), rq->buff.map_dir);
+ if (unlikely(dma_mapping_error(rq->pdev, dma_info->addr))) {
+ put_page(page);
+ return -ENOMEM;
+ }
return 0;
+}
-err_free_skb:
- dev_kfree_skb(skb);
+void mlx5e_page_release(struct mlx5e_rq *rq, struct mlx5e_dma_info *dma_info,
+ bool recycle)
+{
+ if (likely(recycle) && mlx5e_rx_cache_put(rq, dma_info))
+ return;
- return -ENOMEM;
+ dma_unmap_page(rq->pdev, dma_info->addr, RQ_PAGE_SIZE(rq),
+ rq->buff.map_dir);
+ put_page(dma_info->page);
+}
+
+int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe *wqe, u16 ix)
+{
+ struct mlx5e_dma_info *di = &rq->dma_info[ix];
+
+ if (unlikely(mlx5e_page_alloc_mapped(rq, di)))
+ return -ENOMEM;
+
+ wqe->data.addr = cpu_to_be64(di->addr + MLX5_RX_HEADROOM);
+ return 0;
}
void mlx5e_dealloc_rx_wqe(struct mlx5e_rq *rq, u16 ix)
{
- struct sk_buff *skb = rq->skb[ix];
+ struct mlx5e_dma_info *di = &rq->dma_info[ix];
- if (skb) {
- rq->skb[ix] = NULL;
- dma_unmap_single(rq->pdev,
- *((dma_addr_t *)skb->cb),
- rq->wqe_sz,
- DMA_FROM_DEVICE);
- dev_kfree_skb(skb);
- }
+ mlx5e_page_release(rq, di, true);
}
static inline int mlx5e_mpwqe_strides_per_page(struct mlx5e_rq *rq)
@@ -279,7 +329,7 @@
static inline void mlx5e_post_umr_wqe(struct mlx5e_rq *rq, u16 ix)
{
- struct mlx5e_mpw_info *wi = &rq->wqe_info[ix];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[ix];
struct mlx5e_sq *sq = &rq->channel->icosq;
struct mlx5_wq_cyc *wq = &sq->wq;
struct mlx5e_umr_wqe *wqe;
@@ -288,8 +338,8 @@
/* fill sq edge with nops to avoid wqe wrap around */
while ((pi = (sq->pc & wq->sz_m1)) > sq->edge) {
- sq->ico_wqe_info[pi].opcode = MLX5_OPCODE_NOP;
- sq->ico_wqe_info[pi].num_wqebbs = 1;
+ sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_NOP;
+ sq->db.ico_wqe[pi].num_wqebbs = 1;
mlx5e_send_nop(sq, true);
}
@@ -299,90 +349,17 @@
cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) |
MLX5_OPCODE_UMR);
- sq->ico_wqe_info[pi].opcode = MLX5_OPCODE_UMR;
- sq->ico_wqe_info[pi].num_wqebbs = num_wqebbs;
+ sq->db.ico_wqe[pi].opcode = MLX5_OPCODE_UMR;
+ sq->db.ico_wqe[pi].num_wqebbs = num_wqebbs;
sq->pc += num_wqebbs;
mlx5e_tx_notify_hw(sq, &wqe->ctrl, 0);
}
-static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq,
- struct mlx5e_dma_info *dma_info)
-{
- struct mlx5e_page_cache *cache = &rq->page_cache;
- u32 tail_next = (cache->tail + 1) & (MLX5E_CACHE_SIZE - 1);
-
- if (tail_next == cache->head) {
- rq->stats.cache_full++;
- return false;
- }
-
- cache->page_cache[cache->tail] = *dma_info;
- cache->tail = tail_next;
- return true;
-}
-
-static inline bool mlx5e_rx_cache_get(struct mlx5e_rq *rq,
- struct mlx5e_dma_info *dma_info)
-{
- struct mlx5e_page_cache *cache = &rq->page_cache;
-
- if (unlikely(cache->head == cache->tail)) {
- rq->stats.cache_empty++;
- return false;
- }
-
- if (page_ref_count(cache->page_cache[cache->head].page) != 1) {
- rq->stats.cache_busy++;
- return false;
- }
-
- *dma_info = cache->page_cache[cache->head];
- cache->head = (cache->head + 1) & (MLX5E_CACHE_SIZE - 1);
- rq->stats.cache_reuse++;
-
- dma_sync_single_for_device(rq->pdev, dma_info->addr, PAGE_SIZE,
- DMA_FROM_DEVICE);
- return true;
-}
-
-static inline int mlx5e_page_alloc_mapped(struct mlx5e_rq *rq,
- struct mlx5e_dma_info *dma_info)
-{
- struct page *page;
-
- if (mlx5e_rx_cache_get(rq, dma_info))
- return 0;
-
- page = dev_alloc_page();
- if (unlikely(!page))
- return -ENOMEM;
-
- dma_info->page = page;
- dma_info->addr = dma_map_page(rq->pdev, page, 0, PAGE_SIZE,
- DMA_FROM_DEVICE);
- if (unlikely(dma_mapping_error(rq->pdev, dma_info->addr))) {
- put_page(page);
- return -ENOMEM;
- }
-
- return 0;
-}
-
-void mlx5e_page_release(struct mlx5e_rq *rq, struct mlx5e_dma_info *dma_info,
- bool recycle)
-{
- if (likely(recycle) && mlx5e_rx_cache_put(rq, dma_info))
- return;
-
- dma_unmap_page(rq->pdev, dma_info->addr, PAGE_SIZE, DMA_FROM_DEVICE);
- put_page(dma_info->page);
-}
-
static int mlx5e_alloc_rx_umr_mpwqe(struct mlx5e_rq *rq,
struct mlx5e_rx_wqe *wqe,
u16 ix)
{
- struct mlx5e_mpw_info *wi = &rq->wqe_info[ix];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[ix];
u64 dma_offset = (u64)mlx5e_get_wqe_mtt_offset(rq, ix) << PAGE_SHIFT;
int pg_strides = mlx5e_mpwqe_strides_per_page(rq);
int err;
@@ -436,7 +413,7 @@
clear_bit(MLX5E_RQ_STATE_UMR_WQE_IN_PROGRESS, &rq->state);
if (unlikely(test_bit(MLX5E_RQ_STATE_FLUSH, &rq->state))) {
- mlx5e_free_rx_mpwqe(rq, &rq->wqe_info[wq->head]);
+ mlx5e_free_rx_mpwqe(rq, &rq->mpwqe.info[wq->head]);
return;
}
@@ -448,7 +425,7 @@
mlx5_wq_ll_update_db_record(wq);
}
-int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe *wqe, u16 ix)
+int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe *wqe, u16 ix)
{
int err;
@@ -462,7 +439,7 @@
void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
{
- struct mlx5e_mpw_info *wi = &rq->wqe_info[ix];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[ix];
mlx5e_free_rx_mpwqe(rq, wi);
}
@@ -653,12 +630,186 @@
rq->stats.packets++;
rq->stats.bytes += cqe_bcnt;
mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb);
- napi_gro_receive(rq->cq.napi, skb);
+}
+
+static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_sq *sq)
+{
+ struct mlx5_wq_cyc *wq = &sq->wq;
+ struct mlx5e_tx_wqe *wqe;
+ u16 pi = (sq->pc - MLX5E_XDP_TX_WQEBBS) & wq->sz_m1; /* last pi */
+
+ wqe = mlx5_wq_cyc_get_wqe(wq, pi);
+
+ wqe->ctrl.fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
+ mlx5e_tx_notify_hw(sq, &wqe->ctrl, 0);
+}
+
+static inline void mlx5e_xmit_xdp_frame(struct mlx5e_rq *rq,
+ struct mlx5e_dma_info *di,
+ unsigned int data_offset,
+ int len)
+{
+ struct mlx5e_sq *sq = &rq->channel->xdp_sq;
+ struct mlx5_wq_cyc *wq = &sq->wq;
+ u16 pi = sq->pc & wq->sz_m1;
+ struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
+ struct mlx5e_sq_wqe_info *wi = &sq->db.xdp.wqe_info[pi];
+
+ struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
+ struct mlx5_wqe_eth_seg *eseg = &wqe->eth;
+ struct mlx5_wqe_data_seg *dseg;
+
+ dma_addr_t dma_addr = di->addr + data_offset + MLX5E_XDP_MIN_INLINE;
+ unsigned int dma_len = len - MLX5E_XDP_MIN_INLINE;
+ void *data = page_address(di->page) + data_offset;
+
+ if (unlikely(!mlx5e_sq_has_room_for(sq, MLX5E_XDP_TX_WQEBBS))) {
+ if (sq->db.xdp.doorbell) {
+ /* SQ is full, ring doorbell */
+ mlx5e_xmit_xdp_doorbell(sq);
+ sq->db.xdp.doorbell = false;
+ }
+ rq->stats.xdp_tx_full++;
+ mlx5e_page_release(rq, di, true);
+ return;
+ }
+
+ dma_sync_single_for_device(sq->pdev, dma_addr, dma_len,
+ PCI_DMA_TODEVICE);
+
+ memset(wqe, 0, sizeof(*wqe));
+
+ /* copy the inline part */
+ memcpy(eseg->inline_hdr_start, data, MLX5E_XDP_MIN_INLINE);
+ eseg->inline_hdr_sz = cpu_to_be16(MLX5E_XDP_MIN_INLINE);
+
+ dseg = (struct mlx5_wqe_data_seg *)cseg + (MLX5E_XDP_TX_DS_COUNT - 1);
+
+ /* write the dma part */
+ dseg->addr = cpu_to_be64(dma_addr);
+ dseg->byte_count = cpu_to_be32(dma_len);
+ dseg->lkey = sq->mkey_be;
+
+ cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | MLX5_OPCODE_SEND);
+ cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | MLX5E_XDP_TX_DS_COUNT);
+
+ sq->db.xdp.di[pi] = *di;
+ wi->opcode = MLX5_OPCODE_SEND;
+ wi->num_wqebbs = MLX5E_XDP_TX_WQEBBS;
+ sq->pc += MLX5E_XDP_TX_WQEBBS;
+
+ sq->db.xdp.doorbell = true;
+ rq->stats.xdp_tx++;
+}
+
+/* returns true if packet was consumed by xdp */
+static inline bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
+ const struct bpf_prog *prog,
+ struct mlx5e_dma_info *di,
+ void *data, u16 len)
+{
+ struct xdp_buff xdp;
+ u32 act;
+
+ if (!prog)
+ return false;
+
+ xdp.data = data;
+ xdp.data_end = xdp.data + len;
+ act = bpf_prog_run_xdp(prog, &xdp);
+ switch (act) {
+ case XDP_PASS:
+ return false;
+ case XDP_TX:
+ mlx5e_xmit_xdp_frame(rq, di, MLX5_RX_HEADROOM, len);
+ return true;
+ default:
+ bpf_warn_invalid_xdp_action(act);
+ case XDP_ABORTED:
+ case XDP_DROP:
+ rq->stats.xdp_drop++;
+ mlx5e_page_release(rq, di, true);
+ return true;
+ }
+}
+
+static inline
+struct sk_buff *skb_from_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
+ u16 wqe_counter, u32 cqe_bcnt)
+{
+ struct bpf_prog *xdp_prog = READ_ONCE(rq->xdp_prog);
+ struct mlx5e_dma_info *di;
+ struct sk_buff *skb;
+ void *va, *data;
+
+ di = &rq->dma_info[wqe_counter];
+ va = page_address(di->page);
+ data = va + MLX5_RX_HEADROOM;
+
+ dma_sync_single_range_for_cpu(rq->pdev,
+ di->addr,
+ MLX5_RX_HEADROOM,
+ rq->buff.wqe_sz,
+ DMA_FROM_DEVICE);
+ prefetch(data);
+
+ if (unlikely((cqe->op_own >> 4) != MLX5_CQE_RESP_SEND)) {
+ rq->stats.wqe_err++;
+ mlx5e_page_release(rq, di, true);
+ return NULL;
+ }
+
+ if (mlx5e_xdp_handle(rq, xdp_prog, di, data, cqe_bcnt))
+ return NULL; /* page/packet was consumed by XDP */
+
+ skb = build_skb(va, RQ_PAGE_SIZE(rq));
+ if (unlikely(!skb)) {
+ rq->stats.buff_alloc_err++;
+ mlx5e_page_release(rq, di, true);
+ return NULL;
+ }
+
+ /* queue up for recycling ..*/
+ page_ref_inc(di->page);
+ mlx5e_page_release(rq, di, true);
+
+ skb_reserve(skb, MLX5_RX_HEADROOM);
+ skb_put(skb, cqe_bcnt);
+
+ return skb;
}
void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
{
struct mlx5e_rx_wqe *wqe;
+ __be16 wqe_counter_be;
+ struct sk_buff *skb;
+ u16 wqe_counter;
+ u32 cqe_bcnt;
+
+ wqe_counter_be = cqe->wqe_counter;
+ wqe_counter = be16_to_cpu(wqe_counter_be);
+ wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
+ cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
+
+ skb = skb_from_cqe(rq, cqe, wqe_counter, cqe_bcnt);
+ if (!skb)
+ goto wq_ll_pop;
+
+ mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+ napi_gro_receive(rq->cq.napi, skb);
+
+wq_ll_pop:
+ mlx5_wq_ll_pop(&rq->wq, wqe_counter_be,
+ &wqe->next.next_wqe_index);
+}
+
+void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+{
+ struct net_device *netdev = rq->netdev;
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_eswitch_rep *rep = priv->ppriv;
+ struct mlx5e_rx_wqe *wqe;
struct sk_buff *skb;
__be16 wqe_counter_be;
u16 wqe_counter;
@@ -667,26 +818,19 @@
wqe_counter_be = cqe->wqe_counter;
wqe_counter = be16_to_cpu(wqe_counter_be);
wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
- skb = rq->skb[wqe_counter];
- prefetch(skb->data);
- rq->skb[wqe_counter] = NULL;
+ cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
- dma_unmap_single(rq->pdev,
- *((dma_addr_t *)skb->cb),
- rq->wqe_sz,
- DMA_FROM_DEVICE);
-
- if (unlikely((cqe->op_own >> 4) != MLX5_CQE_RESP_SEND)) {
- rq->stats.wqe_err++;
- dev_kfree_skb(skb);
+ skb = skb_from_cqe(rq, cqe, wqe_counter, cqe_bcnt);
+ if (!skb)
goto wq_ll_pop;
- }
-
- cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
- skb_put(skb, cqe_bcnt);
mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+ if (rep->vlan && skb_vlan_tag_present(skb))
+ skb_vlan_pop(skb);
+
+ napi_gro_receive(rq->cq.napi, skb);
+
wq_ll_pop:
mlx5_wq_ll_pop(&rq->wq, wqe_counter_be,
&wqe->next.next_wqe_index);
@@ -734,7 +878,7 @@
{
u16 cstrides = mpwrq_get_cqe_consumed_strides(cqe);
u16 wqe_id = be16_to_cpu(cqe->wqe_id);
- struct mlx5e_mpw_info *wi = &rq->wqe_info[wqe_id];
+ struct mlx5e_mpw_info *wi = &rq->mpwqe.info[wqe_id];
struct mlx5e_rx_wqe *wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_id);
struct sk_buff *skb;
u16 cqe_bcnt;
@@ -764,6 +908,7 @@
mlx5e_mpwqe_fill_rx_skb(rq, cqe, wi, cqe_bcnt, skb);
mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+ napi_gro_receive(rq->cq.napi, skb);
mpwrq_cqe_out:
if (likely(wi->consumed_strides < rq->mpwqe_num_strides))
@@ -776,6 +921,7 @@
int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
{
struct mlx5e_rq *rq = container_of(cq, struct mlx5e_rq, cq);
+ struct mlx5e_sq *xdp_sq = &rq->channel->xdp_sq;
int work_done = 0;
if (unlikely(test_bit(MLX5E_RQ_STATE_FLUSH, &rq->state)))
@@ -802,6 +948,11 @@
rq->handle_rx_cqe(rq, cqe);
}
+ if (xdp_sq->db.xdp.doorbell) {
+ mlx5e_xmit_xdp_doorbell(xdp_sq);
+ xdp_sq->db.xdp.doorbell = false;
+ }
+
mlx5_cqwq_update_db_record(&cq->wq);
/* ensure cq space is freed before enabling more cqes */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 6af8d79..57452fd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -65,6 +65,9 @@
u64 rx_csum_none;
u64 rx_csum_complete;
u64 rx_csum_unnecessary_inner;
+ u64 rx_xdp_drop;
+ u64 rx_xdp_tx;
+ u64 rx_xdp_tx_full;
u64 tx_csum_partial;
u64 tx_csum_partial_inner;
u64 tx_queue_stopped;
@@ -100,6 +103,9 @@
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_none) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_complete) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_unnecessary_inner) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_drop) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx) },
+ { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_xdp_tx_full) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_partial) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_csum_partial_inner) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, tx_queue_stopped) },
@@ -278,6 +284,9 @@
u64 csum_none;
u64 lro_packets;
u64 lro_bytes;
+ u64 xdp_drop;
+ u64 xdp_tx;
+ u64 xdp_tx_full;
u64 wqe_err;
u64 mpwqe_filler;
u64 buff_alloc_err;
@@ -295,6 +304,9 @@
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, csum_complete) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, csum_unnecessary_inner) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, csum_none) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_drop) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx) },
+ { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_tx_full) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_packets) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 22cfc4a..a350b71 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -39,6 +39,7 @@
#include <linux/rhashtable.h>
#include <net/switchdev.h>
#include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_vlan.h>
#include "en.h"
#include "en_tc.h"
#include "eswitch.h"
@@ -47,6 +48,7 @@
struct rhash_head node;
u64 cookie;
struct mlx5_flow_rule *rule;
+ struct mlx5_esw_flow_attr *attr;
};
#define MLX5E_TC_TABLE_NUM_ENTRIES 1024
@@ -114,27 +116,30 @@
static struct mlx5_flow_rule *mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
struct mlx5_flow_spec *spec,
- u32 action, u32 dst_vport)
+ struct mlx5_esw_flow_attr *attr)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
- struct mlx5_eswitch_rep *rep = priv->ppriv;
- u32 src_vport;
+ int err;
- if (rep->vport) /* set source vport for the flow */
- src_vport = rep->vport;
- else
- src_vport = FDB_UPLINK_VPORT;
+ err = mlx5_eswitch_add_vlan_action(esw, attr);
+ if (err)
+ return ERR_PTR(err);
- return mlx5_eswitch_add_offloaded_rule(esw, spec, action, src_vport, dst_vport);
+ return mlx5_eswitch_add_offloaded_rule(esw, spec, attr);
}
static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
- struct mlx5_flow_rule *rule)
+ struct mlx5_flow_rule *rule,
+ struct mlx5_esw_flow_attr *attr)
{
+ struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct mlx5_fc *counter = NULL;
counter = mlx5_flow_rule_counter(rule);
+ if (esw && esw->mode == SRIOV_OFFLOADS)
+ mlx5_eswitch_del_vlan_action(esw, attr);
+
mlx5_del_flow_rule(rule);
mlx5_fc_destroy(priv->mdev, counter);
@@ -159,6 +164,7 @@
~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
BIT(FLOW_DISSECTOR_KEY_BASIC) |
BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) |
+ BIT(FLOW_DISSECTOR_KEY_VLAN) |
BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
BIT(FLOW_DISSECTOR_KEY_PORTS))) {
@@ -222,6 +228,24 @@
key->src);
}
+ if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_VLAN)) {
+ struct flow_dissector_key_vlan *key =
+ skb_flow_dissector_target(f->dissector,
+ FLOW_DISSECTOR_KEY_VLAN,
+ f->key);
+ struct flow_dissector_key_vlan *mask =
+ skb_flow_dissector_target(f->dissector,
+ FLOW_DISSECTOR_KEY_VLAN,
+ f->mask);
+ if (mask->vlan_id) {
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, vlan_tag, 1);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, vlan_tag, 1);
+
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, first_vid, mask->vlan_id);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, first_vid, key->vlan_id);
+ }
+ }
+
if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
struct flow_dissector_key_ipv4_addrs *key =
skb_flow_dissector_target(f->dissector,
@@ -361,7 +385,7 @@
}
static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
- u32 *action, u32 *dest_vport)
+ struct mlx5_esw_flow_attr *attr)
{
const struct tc_action *a;
LIST_HEAD(actions);
@@ -369,17 +393,14 @@
if (tc_no_actions(exts))
return -EINVAL;
- *action = 0;
+ memset(attr, 0, sizeof(*attr));
+ attr->in_rep = priv->ppriv;
tcf_exts_to_list(exts, &actions);
list_for_each_entry(a, &actions, list) {
- /* Only support a single action per rule */
- if (*action)
- return -EINVAL;
-
if (is_tcf_gact_shot(a)) {
- *action = MLX5_FLOW_CONTEXT_ACTION_DROP |
- MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_DROP |
+ MLX5_FLOW_CONTEXT_ACTION_COUNT;
continue;
}
@@ -387,7 +408,6 @@
int ifindex = tcf_mirred_ifindex(a);
struct net_device *out_dev;
struct mlx5e_priv *out_priv;
- struct mlx5_eswitch_rep *out_rep;
out_dev = __dev_get_by_index(dev_net(priv->netdev), ifindex);
@@ -397,13 +417,22 @@
return -EINVAL;
}
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
out_priv = netdev_priv(out_dev);
- out_rep = out_priv->ppriv;
- if (out_rep->vport == 0)
- *dest_vport = FDB_UPLINK_VPORT;
- else
- *dest_vport = out_rep->vport;
- *action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+ attr->out_rep = out_priv->ppriv;
+ continue;
+ }
+
+ if (is_tcf_vlan(a)) {
+ if (tcf_vlan_action(a) == VLAN_F_POP) {
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_VLAN_POP;
+ } else if (tcf_vlan_action(a) == VLAN_F_PUSH) {
+ if (tcf_vlan_push_proto(a) != htons(ETH_P_8021Q))
+ return -EOPNOTSUPP;
+
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH;
+ attr->vlan = tcf_vlan_push_vid(a);
+ }
continue;
}
@@ -417,18 +446,29 @@
{
struct mlx5e_tc_table *tc = &priv->fs.tc;
int err = 0;
- u32 flow_tag, action, dest_vport = 0;
+ bool fdb_flow = false;
+ u32 flow_tag, action;
struct mlx5e_tc_flow *flow;
struct mlx5_flow_spec *spec;
struct mlx5_flow_rule *old = NULL;
+ struct mlx5_esw_flow_attr *old_attr;
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+ if (esw && esw->mode == SRIOV_OFFLOADS)
+ fdb_flow = true;
+
flow = rhashtable_lookup_fast(&tc->ht, &f->cookie,
tc->ht_params);
- if (flow)
+ if (flow) {
old = flow->rule;
- else
- flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+ old_attr = flow->attr;
+ } else {
+ if (fdb_flow)
+ flow = kzalloc(sizeof(*flow) + sizeof(struct mlx5_esw_flow_attr),
+ GFP_KERNEL);
+ else
+ flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+ }
spec = mlx5_vzalloc(sizeof(*spec));
if (!spec || !flow) {
@@ -442,11 +482,12 @@
if (err < 0)
goto err_free;
- if (esw && esw->mode == SRIOV_OFFLOADS) {
- err = parse_tc_fdb_actions(priv, f->exts, &action, &dest_vport);
+ if (fdb_flow) {
+ flow->attr = (struct mlx5_esw_flow_attr *)(flow + 1);
+ err = parse_tc_fdb_actions(priv, f->exts, flow->attr);
if (err < 0)
goto err_free;
- flow->rule = mlx5e_tc_add_fdb_flow(priv, spec, action, dest_vport);
+ flow->rule = mlx5e_tc_add_fdb_flow(priv, spec, flow->attr);
} else {
err = parse_tc_nic_actions(priv, f->exts, &action, &flow_tag);
if (err < 0)
@@ -465,7 +506,7 @@
goto err_del_rule;
if (old)
- mlx5e_tc_del_flow(priv, old);
+ mlx5e_tc_del_flow(priv, old, old_attr);
goto out;
@@ -493,7 +534,7 @@
rhashtable_remove_fast(&tc->ht, &flow->node, tc->ht_params);
- mlx5e_tc_del_flow(priv, flow->rule);
+ mlx5e_tc_del_flow(priv, flow->rule, flow->attr);
kfree(flow);
@@ -550,7 +591,7 @@
struct mlx5e_tc_flow *flow = ptr;
struct mlx5e_priv *priv = arg;
- mlx5e_tc_del_flow(priv, flow->rule);
+ mlx5e_tc_del_flow(priv, flow->rule, flow->attr);
kfree(flow);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index eb0e725..70a7173 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -52,7 +52,6 @@
cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | MLX5_OPCODE_NOP);
cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | 0x01);
- sq->skb[pi] = NULL;
sq->pc++;
sq->stats.nop++;
@@ -82,15 +81,17 @@
u32 size,
enum mlx5e_dma_map_type map_type)
{
- sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].addr = addr;
- sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].size = size;
- sq->dma_fifo[sq->dma_fifo_pc & sq->dma_fifo_mask].type = map_type;
+ u32 i = sq->dma_fifo_pc & sq->dma_fifo_mask;
+
+ sq->db.txq.dma_fifo[i].addr = addr;
+ sq->db.txq.dma_fifo[i].size = size;
+ sq->db.txq.dma_fifo[i].type = map_type;
sq->dma_fifo_pc++;
}
static inline struct mlx5e_sq_dma *mlx5e_dma_get(struct mlx5e_sq *sq, u32 i)
{
- return &sq->dma_fifo[i & sq->dma_fifo_mask];
+ return &sq->db.txq.dma_fifo[i & sq->dma_fifo_mask];
}
static void mlx5e_dma_unmap_wqe_err(struct mlx5e_sq *sq, u8 num_dma)
@@ -221,7 +222,7 @@
u16 pi = sq->pc & wq->sz_m1;
struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
- struct mlx5e_tx_wqe_info *wi = &sq->wqe_info[pi];
+ struct mlx5e_tx_wqe_info *wi = &sq->db.txq.wqe_info[pi];
struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
struct mlx5_wqe_eth_seg *eseg = &wqe->eth;
@@ -341,7 +342,7 @@
cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | opcode);
cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_cnt);
- sq->skb[pi] = skb;
+ sq->db.txq.skb[pi] = skb;
wi->num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
sq->pc += wi->num_wqebbs;
@@ -368,8 +369,10 @@
}
/* fill sq edge with nops to avoid wqe wrap around */
- while ((sq->pc & wq->sz_m1) > sq->edge)
+ while ((pi = (sq->pc & wq->sz_m1)) > sq->edge) {
+ sq->db.txq.skb[pi] = NULL;
mlx5e_send_nop(sq, false);
+ }
if (bf)
sq->bf_budget--;
@@ -442,8 +445,8 @@
last_wqe = (sqcc == wqe_counter);
ci = sqcc & sq->wq.sz_m1;
- skb = sq->skb[ci];
- wi = &sq->wqe_info[ci];
+ skb = sq->db.txq.skb[ci];
+ wi = &sq->db.txq.wqe_info[ci];
if (unlikely(!skb)) { /* nop */
sqcc++;
@@ -492,7 +495,7 @@
return (i == MLX5E_TX_CQ_POLL_BUDGET);
}
-void mlx5e_free_tx_descs(struct mlx5e_sq *sq)
+static void mlx5e_free_txq_sq_descs(struct mlx5e_sq *sq)
{
struct mlx5e_tx_wqe_info *wi;
struct sk_buff *skb;
@@ -501,8 +504,8 @@
while (sq->cc != sq->pc) {
ci = sq->cc & sq->wq.sz_m1;
- skb = sq->skb[ci];
- wi = &sq->wqe_info[ci];
+ skb = sq->db.txq.skb[ci];
+ wi = &sq->db.txq.wqe_info[ci];
if (!skb) { /* nop */
sq->cc++;
@@ -520,3 +523,37 @@
sq->cc += wi->num_wqebbs;
}
}
+
+static void mlx5e_free_xdp_sq_descs(struct mlx5e_sq *sq)
+{
+ struct mlx5e_sq_wqe_info *wi;
+ struct mlx5e_dma_info *di;
+ u16 ci;
+
+ while (sq->cc != sq->pc) {
+ ci = sq->cc & sq->wq.sz_m1;
+ di = &sq->db.xdp.di[ci];
+ wi = &sq->db.xdp.wqe_info[ci];
+
+ if (wi->opcode == MLX5_OPCODE_NOP) {
+ sq->cc++;
+ continue;
+ }
+
+ sq->cc += wi->num_wqebbs;
+
+ mlx5e_page_release(&sq->channel->rq, di, false);
+ }
+}
+
+void mlx5e_free_sq_descs(struct mlx5e_sq *sq)
+{
+ switch (sq->type) {
+ case MLX5E_SQ_TXQ:
+ mlx5e_free_txq_sq_descs(sq);
+ break;
+ case MLX5E_SQ_XDP:
+ mlx5e_free_xdp_sq_descs(sq);
+ break;
+ }
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 08d8b0c..5703f19 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -72,7 +72,7 @@
do {
u16 ci = be16_to_cpu(cqe->wqe_counter) & wq->sz_m1;
- struct mlx5e_ico_wqe_info *icowi = &sq->ico_wqe_info[ci];
+ struct mlx5e_sq_wqe_info *icowi = &sq->db.ico_wqe[ci];
mlx5_cqwq_pop(&cq->wq);
sqcc += icowi->num_wqebbs;
@@ -105,6 +105,66 @@
sq->cc = sqcc;
}
+static inline bool mlx5e_poll_xdp_tx_cq(struct mlx5e_cq *cq)
+{
+ struct mlx5e_sq *sq;
+ u16 sqcc;
+ int i;
+
+ sq = container_of(cq, struct mlx5e_sq, cq);
+
+ if (unlikely(test_bit(MLX5E_SQ_STATE_FLUSH, &sq->state)))
+ return false;
+
+ /* sq->cc must be updated only after mlx5_cqwq_update_db_record(),
+ * otherwise a cq overrun may occur
+ */
+ sqcc = sq->cc;
+
+ for (i = 0; i < MLX5E_TX_CQ_POLL_BUDGET; i++) {
+ struct mlx5_cqe64 *cqe;
+ u16 wqe_counter;
+ bool last_wqe;
+
+ cqe = mlx5e_get_cqe(cq);
+ if (!cqe)
+ break;
+
+ mlx5_cqwq_pop(&cq->wq);
+
+ wqe_counter = be16_to_cpu(cqe->wqe_counter);
+
+ do {
+ struct mlx5e_sq_wqe_info *wi;
+ struct mlx5e_dma_info *di;
+ u16 ci;
+
+ last_wqe = (sqcc == wqe_counter);
+
+ ci = sqcc & sq->wq.sz_m1;
+ di = &sq->db.xdp.di[ci];
+ wi = &sq->db.xdp.wqe_info[ci];
+
+ if (unlikely(wi->opcode == MLX5_OPCODE_NOP)) {
+ sqcc++;
+ continue;
+ }
+
+ sqcc += wi->num_wqebbs;
+ /* Recycle RX page */
+ mlx5e_page_release(&sq->channel->rq, di, true);
+ } while (!last_wqe);
+ }
+
+ mlx5_cqwq_update_db_record(&cq->wq);
+
+ /* ensure cq space is freed before enabling more cqes */
+ wmb();
+
+ sq->cc = sqcc;
+ return (i == MLX5E_TX_CQ_POLL_BUDGET);
+}
+
int mlx5e_napi_poll(struct napi_struct *napi, int budget)
{
struct mlx5e_channel *c = container_of(napi, struct mlx5e_channel,
@@ -121,6 +181,9 @@
work_done = mlx5e_poll_rx_cq(&c->rq.cq, budget);
busy |= work_done == budget;
+ if (c->xdp)
+ busy |= mlx5e_poll_xdp_tx_cq(&c->xdp_sq.cq);
+
mlx5e_poll_ico_cq(&c->icosq.cq);
busy |= mlx5e_post_rx_wqes(&c->rq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 654b76f..abbf2c3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -81,9 +81,6 @@
MC_ADDR_CHANGE | \
PROMISC_CHANGE)
-int esw_offloads_init(struct mlx5_eswitch *esw, int nvports);
-void esw_offloads_cleanup(struct mlx5_eswitch *esw, int nvports);
-
static int arm_vport_context_events_cmd(struct mlx5_core_dev *dev, u16 vport,
u32 events_mask)
{
@@ -130,7 +127,7 @@
}
static int modify_esw_vport_cvlan(struct mlx5_core_dev *dev, u32 vport,
- u16 vlan, u8 qos, bool set)
+ u16 vlan, u8 qos, u8 set_flags)
{
u32 in[MLX5_ST_SZ_DW(modify_esw_vport_context_in)] = {0};
@@ -138,14 +135,18 @@
!MLX5_CAP_ESW(dev, vport_cvlan_insert_if_not_exist))
return -ENOTSUPP;
- esw_debug(dev, "Set Vport[%d] VLAN %d qos %d set=%d\n",
- vport, vlan, qos, set);
- if (set) {
+ esw_debug(dev, "Set Vport[%d] VLAN %d qos %d set=%x\n",
+ vport, vlan, qos, set_flags);
+
+ if (set_flags & SET_VLAN_STRIP)
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.vport_cvlan_strip, 1);
+
+ if (set_flags & SET_VLAN_INSERT) {
/* insert only if no vlan in packet */
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.vport_cvlan_insert, 1);
+
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.cvlan_pcp, qos);
MLX5_SET(modify_esw_vport_context_in, in,
@@ -1492,6 +1493,7 @@
abort:
esw_enable_vport(esw, 0, UC_ADDR_CHANGE);
+ esw->mode = SRIOV_NONE;
return err;
}
@@ -1780,25 +1782,21 @@
return 0;
}
-int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
- int vport, u16 vlan, u8 qos)
+int __mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos, u8 set_flags)
{
struct mlx5_vport *evport;
int err = 0;
- int set = 0;
if (!ESW_ALLOWED(esw))
return -EPERM;
if (!LEGAL_VPORT(esw, vport) || (vlan > 4095) || (qos > 7))
return -EINVAL;
- if (vlan || qos)
- set = 1;
-
mutex_lock(&esw->state_lock);
evport = &esw->vports[vport];
- err = modify_esw_vport_cvlan(esw->dev, vport, vlan, qos, set);
+ err = modify_esw_vport_cvlan(esw->dev, vport, vlan, qos, set_flags);
if (err)
goto unlock;
@@ -1816,6 +1814,17 @@
return err;
}
+int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos)
+{
+ u8 set_flags = 0;
+
+ if (vlan || qos)
+ set_flags = SET_VLAN_STRIP | SET_VLAN_INSERT;
+
+ return __mlx5_eswitch_set_vport_vlan(esw, vport, vlan, qos, set_flags);
+}
+
int mlx5_eswitch_set_vport_spoofchk(struct mlx5_eswitch *esw,
int vport, bool spoofchk)
{
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 6855783..2e2938e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -157,6 +157,7 @@
struct mlx5_flow_group *send_to_vport_grp;
struct mlx5_flow_group *miss_grp;
struct mlx5_flow_rule *miss_rule;
+ int vlan_push_pop_refcount;
} offloads;
};
};
@@ -178,11 +179,14 @@
void (*unload)(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep);
u16 vport;
- struct mlx5_flow_rule *vport_rx_rule;
- void *priv_data;
- struct list_head vport_sqs_list;
- bool valid;
u8 hw_id[ETH_ALEN];
+ void *priv_data;
+
+ struct mlx5_flow_rule *vport_rx_rule;
+ struct list_head vport_sqs_list;
+ u16 vlan;
+ u32 vlan_refcount;
+ bool valid;
};
struct mlx5_esw_offload {
@@ -209,6 +213,9 @@
int mode;
};
+void esw_offloads_cleanup(struct mlx5_eswitch *esw, int nvports);
+int esw_offloads_init(struct mlx5_eswitch *esw, int nvports);
+
/* E-Switch API */
int mlx5_eswitch_init(struct mlx5_core_dev *dev);
void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw);
@@ -234,14 +241,32 @@
struct ifla_vf_stats *vf_stats);
struct mlx5_flow_spec;
+struct mlx5_esw_flow_attr;
struct mlx5_flow_rule *
mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
struct mlx5_flow_spec *spec,
- u32 action, u32 src_vport, u32 dst_vport);
+ struct mlx5_esw_flow_attr *attr);
struct mlx5_flow_rule *
mlx5_eswitch_create_vport_rx_rule(struct mlx5_eswitch *esw, int vport, u32 tirn);
+enum {
+ SET_VLAN_STRIP = BIT(0),
+ SET_VLAN_INSERT = BIT(1)
+};
+
+#define MLX5_FLOW_CONTEXT_ACTION_VLAN_POP 0x40
+#define MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH 0x80
+
+struct mlx5_esw_flow_attr {
+ struct mlx5_eswitch_rep *in_rep;
+ struct mlx5_eswitch_rep *out_rep;
+
+ int action;
+ u16 vlan;
+ bool vlan_handled;
+};
+
int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep,
u16 *sqns_array, int sqns_num);
@@ -251,9 +276,17 @@
int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode);
int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode);
void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
+ int vport_index,
struct mlx5_eswitch_rep *rep);
void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
- int vport);
+ int vport_index);
+
+int mlx5_eswitch_add_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr);
+int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr);
+int __mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos, u8 set_flags);
#define MLX5_DEBUG_ESWITCH_MASK BIT(3)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 3dc83a9..c55ad8d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -46,19 +46,22 @@
struct mlx5_flow_rule *
mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
struct mlx5_flow_spec *spec,
- u32 action, u32 src_vport, u32 dst_vport)
+ struct mlx5_esw_flow_attr *attr)
{
struct mlx5_flow_destination dest = { 0 };
struct mlx5_fc *counter = NULL;
struct mlx5_flow_rule *rule;
void *misc;
+ int action;
if (esw->mode != SRIOV_OFFLOADS)
return ERR_PTR(-EOPNOTSUPP);
+ action = attr->action;
+
if (action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
- dest.vport_num = dst_vport;
+ dest.vport_num = attr->out_rep->vport;
action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
} else if (action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
counter = mlx5_fc_create(esw->dev, true);
@@ -69,7 +72,7 @@
}
misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters);
- MLX5_SET(fte_match_set_misc, misc, source_port, src_vport);
+ MLX5_SET(fte_match_set_misc, misc, source_port, attr->in_rep->vport);
misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters);
MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port);
@@ -86,6 +89,186 @@
return rule;
}
+static int esw_set_global_vlan_pop(struct mlx5_eswitch *esw, u8 val)
+{
+ struct mlx5_eswitch_rep *rep;
+ int vf_vport, err = 0;
+
+ esw_debug(esw->dev, "%s applying global %s policy\n", __func__, val ? "pop" : "none");
+ for (vf_vport = 1; vf_vport < esw->enabled_vports; vf_vport++) {
+ rep = &esw->offloads.vport_reps[vf_vport];
+ if (!rep->valid)
+ continue;
+
+ err = __mlx5_eswitch_set_vport_vlan(esw, rep->vport, 0, 0, val);
+ if (err)
+ goto out;
+ }
+
+out:
+ return err;
+}
+
+static struct mlx5_eswitch_rep *
+esw_vlan_action_get_vport(struct mlx5_esw_flow_attr *attr, bool push, bool pop)
+{
+ struct mlx5_eswitch_rep *in_rep, *out_rep, *vport = NULL;
+
+ in_rep = attr->in_rep;
+ out_rep = attr->out_rep;
+
+ if (push)
+ vport = in_rep;
+ else if (pop)
+ vport = out_rep;
+ else
+ vport = in_rep;
+
+ return vport;
+}
+
+static int esw_add_vlan_action_check(struct mlx5_esw_flow_attr *attr,
+ bool push, bool pop, bool fwd)
+{
+ struct mlx5_eswitch_rep *in_rep, *out_rep;
+
+ if ((push || pop) && !fwd)
+ goto out_notsupp;
+
+ in_rep = attr->in_rep;
+ out_rep = attr->out_rep;
+
+ if (push && in_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ if (pop && out_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ /* vport has vlan push configured, can't offload VF --> wire rules w.o it */
+ if (!push && !pop && fwd)
+ if (in_rep->vlan && out_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ /* protects against (1) setting rules with different vlans to push and
+ * (2) setting rules w.o vlans (attr->vlan = 0) && w. vlans to push (!= 0)
+ */
+ if (push && in_rep->vlan_refcount && (in_rep->vlan != attr->vlan))
+ goto out_notsupp;
+
+ return 0;
+
+out_notsupp:
+ return -ENOTSUPP;
+}
+
+int mlx5_eswitch_add_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr)
+{
+ struct offloads_fdb *offloads = &esw->fdb_table.offloads;
+ struct mlx5_eswitch_rep *vport = NULL;
+ bool push, pop, fwd;
+ int err = 0;
+
+ push = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH);
+ pop = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
+ fwd = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+
+ err = esw_add_vlan_action_check(attr, push, pop, fwd);
+ if (err)
+ return err;
+
+ attr->vlan_handled = false;
+
+ vport = esw_vlan_action_get_vport(attr, push, pop);
+
+ if (!push && !pop && fwd) {
+ /* tracks VF --> wire rules without vlan push action */
+ if (attr->out_rep->vport == FDB_UPLINK_VPORT) {
+ vport->vlan_refcount++;
+ attr->vlan_handled = true;
+ }
+
+ return 0;
+ }
+
+ if (!push && !pop)
+ return 0;
+
+ if (!(offloads->vlan_push_pop_refcount)) {
+ /* it's the 1st vlan rule, apply global vlan pop policy */
+ err = esw_set_global_vlan_pop(esw, SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ }
+ offloads->vlan_push_pop_refcount++;
+
+ if (push) {
+ if (vport->vlan_refcount)
+ goto skip_set_push;
+
+ err = __mlx5_eswitch_set_vport_vlan(esw, vport->vport, attr->vlan, 0,
+ SET_VLAN_INSERT | SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ vport->vlan = attr->vlan;
+skip_set_push:
+ vport->vlan_refcount++;
+ }
+out:
+ if (!err)
+ attr->vlan_handled = true;
+ return err;
+}
+
+int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr)
+{
+ struct offloads_fdb *offloads = &esw->fdb_table.offloads;
+ struct mlx5_eswitch_rep *vport = NULL;
+ bool push, pop, fwd;
+ int err = 0;
+
+ if (!attr->vlan_handled)
+ return 0;
+
+ push = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH);
+ pop = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
+ fwd = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+
+ vport = esw_vlan_action_get_vport(attr, push, pop);
+
+ if (!push && !pop && fwd) {
+ /* tracks VF --> wire rules without vlan push action */
+ if (attr->out_rep->vport == FDB_UPLINK_VPORT)
+ vport->vlan_refcount--;
+
+ return 0;
+ }
+
+ if (push) {
+ vport->vlan_refcount--;
+ if (vport->vlan_refcount)
+ goto skip_unset_push;
+
+ vport->vlan = 0;
+ err = __mlx5_eswitch_set_vport_vlan(esw, vport->vport,
+ 0, 0, SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ }
+
+skip_unset_push:
+ offloads->vlan_push_pop_refcount--;
+ if (offloads->vlan_push_pop_refcount)
+ return 0;
+
+ /* no more vlan rules, stop global vlan pop policy */
+ err = esw_set_global_vlan_pop(esw, 0);
+
+out:
+ return err;
+}
+
static struct mlx5_flow_rule *
mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw, int vport, u32 sqn)
{
@@ -144,16 +327,12 @@
{
struct mlx5_flow_rule *flow_rule;
struct mlx5_esw_sq *esw_sq;
- int vport;
int err;
int i;
if (esw->mode != SRIOV_OFFLOADS)
return 0;
- vport = rep->vport == 0 ?
- FDB_UPLINK_VPORT : rep->vport;
-
for (i = 0; i < sqns_num; i++) {
esw_sq = kzalloc(sizeof(*esw_sq), GFP_KERNEL);
if (!esw_sq) {
@@ -163,7 +342,7 @@
/* Add re-inject rule to the PF/representor sqs */
flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw,
- vport,
+ rep->vport,
sqns_array[i]);
if (IS_ERR(flow_rule)) {
err = PTR_ERR(flow_rule);
@@ -446,7 +625,7 @@
static int esw_offloads_start(struct mlx5_eswitch *esw)
{
- int err, num_vfs = esw->dev->priv.sriov.num_vfs;
+ int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
if (esw->mode != SRIOV_LEGACY) {
esw_warn(esw->dev, "Can't set offloads mode, SRIOV legacy not enabled\n");
@@ -455,8 +634,12 @@
mlx5_eswitch_disable_sriov(esw);
err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_OFFLOADS);
- if (err)
- esw_warn(esw->dev, "Failed set eswitch to offloads, err %d\n", err);
+ if (err) {
+ esw_warn(esw->dev, "Failed setting eswitch to offloads, err %d\n", err);
+ err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
+ if (err1)
+ esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err);
+ }
return err;
}
@@ -508,12 +691,16 @@
static int esw_offloads_stop(struct mlx5_eswitch *esw)
{
- int err, num_vfs = esw->dev->priv.sriov.num_vfs;
+ int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
mlx5_eswitch_disable_sriov(esw);
err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
- if (err)
- esw_warn(esw->dev, "Failed set eswitch legacy mode. err %d\n", err);
+ if (err) {
+ esw_warn(esw->dev, "Failed setting eswitch to legacy, err %d\n", err);
+ err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_OFFLOADS);
+ if (err1)
+ esw_warn(esw->dev, "Failed setting eswitch back to offloads, err %d\n", err);
+ }
return err;
}
@@ -612,27 +799,36 @@
}
void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
- struct mlx5_eswitch_rep *rep)
-{
- struct mlx5_esw_offload *offloads = &esw->offloads;
-
- memcpy(&offloads->vport_reps[rep->vport], rep,
- sizeof(struct mlx5_eswitch_rep));
-
- INIT_LIST_HEAD(&offloads->vport_reps[rep->vport].vport_sqs_list);
- offloads->vport_reps[rep->vport].valid = true;
-}
-
-void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
- int vport)
+ int vport_index,
+ struct mlx5_eswitch_rep *__rep)
{
struct mlx5_esw_offload *offloads = &esw->offloads;
struct mlx5_eswitch_rep *rep;
- rep = &offloads->vport_reps[vport];
+ rep = &offloads->vport_reps[vport_index];
- if (esw->mode == SRIOV_OFFLOADS && esw->vports[vport].enabled)
+ memset(rep, 0, sizeof(*rep));
+
+ rep->load = __rep->load;
+ rep->unload = __rep->unload;
+ rep->vport = __rep->vport;
+ rep->priv_data = __rep->priv_data;
+ ether_addr_copy(rep->hw_id, __rep->hw_id);
+
+ INIT_LIST_HEAD(&rep->vport_sqs_list);
+ rep->valid = true;
+}
+
+void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
+ int vport_index)
+{
+ struct mlx5_esw_offload *offloads = &esw->offloads;
+ struct mlx5_eswitch_rep *rep;
+
+ rep = &offloads->vport_reps[vport_index];
+
+ if (esw->mode == SRIOV_OFFLOADS && esw->vports[vport_index].enabled)
rep->unload(esw, rep);
- offloads->vport_reps[vport].valid = false;
+ rep->valid = false;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 7a0415e..113c323 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -401,11 +401,11 @@
mlx5_cmd_fc_bulk_alloc(struct mlx5_core_dev *dev, u16 id, int num)
{
struct mlx5_cmd_fc_bulk *b;
- int outlen = sizeof(*b) +
+ int outlen =
MLX5_ST_SZ_BYTES(query_flow_counter_out) +
MLX5_ST_SZ_BYTES(traffic_counter) * num;
- b = kzalloc(outlen, GFP_KERNEL);
+ b = kzalloc(sizeof(*b) + outlen, GFP_KERNEL);
if (!b)
return NULL;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 068ee65a..aa33d58 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1100,10 +1100,15 @@
goto err_alloc_stats;
}
- if (mlxsw_driver->profile->used_max_lag &&
- mlxsw_driver->profile->used_max_port_per_lag) {
- alloc_size = sizeof(u8) * mlxsw_driver->profile->max_lag *
- mlxsw_driver->profile->max_port_per_lag;
+ err = mlxsw_bus->init(bus_priv, mlxsw_core, mlxsw_driver->profile,
+ &mlxsw_core->resources);
+ if (err)
+ goto err_bus_init;
+
+ if (mlxsw_core->resources.max_lag_valid &&
+ mlxsw_core->resources.max_ports_in_lag_valid) {
+ alloc_size = sizeof(u8) * mlxsw_core->resources.max_lag *
+ mlxsw_core->resources.max_ports_in_lag;
mlxsw_core->lag.mapping = kzalloc(alloc_size, GFP_KERNEL);
if (!mlxsw_core->lag.mapping) {
err = -ENOMEM;
@@ -1111,11 +1116,6 @@
}
}
- err = mlxsw_bus->init(bus_priv, mlxsw_core, mlxsw_driver->profile,
- &mlxsw_core->resources);
- if (err)
- goto err_bus_init;
-
err = mlxsw_emad_init(mlxsw_core);
if (err)
goto err_emad_init;
@@ -1146,10 +1146,10 @@
err_devlink_register:
mlxsw_emad_fini(mlxsw_core);
err_emad_init:
- mlxsw_bus->fini(bus_priv);
-err_bus_init:
kfree(mlxsw_core->lag.mapping);
err_alloc_lag_mapping:
+ mlxsw_bus->fini(bus_priv);
+err_bus_init:
free_percpu(mlxsw_core->pcpu_stats);
err_alloc_stats:
devlink_free(devlink);
@@ -1615,7 +1615,7 @@
static int mlxsw_core_lag_mapping_index(struct mlxsw_core *mlxsw_core,
u16 lag_id, u8 port_index)
{
- return mlxsw_core->driver->profile->max_port_per_lag * lag_id +
+ return mlxsw_core->resources.max_ports_in_lag * lag_id +
port_index;
}
@@ -1644,7 +1644,7 @@
{
int i;
- for (i = 0; i < mlxsw_core->driver->profile->max_port_per_lag; i++) {
+ for (i = 0; i < mlxsw_core->resources.max_ports_in_lag; i++) {
int index = mlxsw_core_lag_mapping_index(mlxsw_core,
lag_id, i);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index d2e3297..c4f550b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -179,8 +179,6 @@
struct mlxsw_config_profile {
u16 used_max_vepa_channels:1,
- used_max_lag:1,
- used_max_port_per_lag:1,
used_max_mid:1,
used_max_pgt:1,
used_max_system_port:1,
@@ -192,10 +190,9 @@
used_max_pkey:1,
used_ar_sec:1,
used_adaptive_routing_group_cap:1,
- used_kvd_sizes:1;
+ used_kvd_split_data:1; /* indicate for the kvd's values */
+
u8 max_vepa_channels;
- u16 max_lag;
- u16 max_port_per_lag;
u16 max_mid;
u16 max_pgt;
u16 max_system_port;
@@ -214,8 +211,9 @@
u16 adaptive_routing_group_cap;
u8 arn;
u32 kvd_linear_size;
- u32 kvd_hash_single_size;
- u32 kvd_hash_double_size;
+ u16 kvd_hash_granularity;
+ u8 kvd_hash_single_parts;
+ u8 kvd_hash_double_parts;
u8 resource_query_enable;
struct mlxsw_swid_config swid_config[MLXSW_CONFIG_PROFILE_SWID_COUNT];
};
@@ -269,8 +267,35 @@
};
struct mlxsw_resources {
- u8 max_span_valid:1;
+ u32 max_span_valid:1,
+ max_lag_valid:1,
+ max_ports_in_lag_valid:1,
+ kvd_size_valid:1,
+ kvd_single_min_size_valid:1,
+ kvd_double_min_size_valid:1,
+ max_virtual_routers_valid:1,
+ max_system_ports_valid:1,
+ max_vlan_groups_valid:1,
+ max_regions_valid:1,
+ max_rif_valid:1;
u8 max_span;
+ u8 max_lag;
+ u8 max_ports_in_lag;
+ u32 kvd_size;
+ u32 kvd_single_min_size;
+ u32 kvd_double_min_size;
+ u16 max_virtual_routers;
+ u16 max_system_ports;
+ u16 max_vlan_groups;
+ u16 max_regions;
+ u16 max_rif;
+
+ /* Internal resources.
+ * Determined by the SW, not queried from the HW.
+ */
+ u32 kvd_single_size;
+ u32 kvd_double_size;
+ u32 kvd_linear_size;
};
struct mlxsw_resources *mlxsw_core_resources_get(struct mlxsw_core *mlxsw_core);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 1d1360c..e742bd4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1156,6 +1156,16 @@
#define MLXSW_RESOURCES_TABLE_END_ID 0xffff
#define MLXSW_MAX_SPAN_ID 0x2420
+#define MLXSW_MAX_LAG_ID 0x2520
+#define MLXSW_MAX_PORTS_IN_LAG_ID 0x2521
+#define MLXSW_KVD_SIZE_ID 0x1001
+#define MLXSW_KVD_SINGLE_MIN_SIZE_ID 0x1002
+#define MLXSW_KVD_DOUBLE_MIN_SIZE_ID 0x1003
+#define MLXSW_MAX_VIRTUAL_ROUTERS_ID 0x2C01
+#define MLXSW_MAX_SYSTEM_PORT_ID 0x2502
+#define MLXSW_MAX_VLAN_GROUPS_ID 0x2906
+#define MLXSW_MAX_REGIONS_ID 0x2901
+#define MLXSW_MAX_RIF_ID 0x2C02
#define MLXSW_RESOURCES_QUERY_MAX_QUERIES 100
#define MLXSW_RESOURCES_PER_QUERY 32
@@ -1167,6 +1177,46 @@
resources->max_span = val;
resources->max_span_valid = 1;
break;
+ case MLXSW_MAX_LAG_ID:
+ resources->max_lag = val;
+ resources->max_lag_valid = 1;
+ break;
+ case MLXSW_MAX_PORTS_IN_LAG_ID:
+ resources->max_ports_in_lag = val;
+ resources->max_ports_in_lag_valid = 1;
+ break;
+ case MLXSW_KVD_SIZE_ID:
+ resources->kvd_size = val;
+ resources->kvd_size_valid = 1;
+ break;
+ case MLXSW_KVD_SINGLE_MIN_SIZE_ID:
+ resources->kvd_single_min_size = val;
+ resources->kvd_single_min_size_valid = 1;
+ break;
+ case MLXSW_KVD_DOUBLE_MIN_SIZE_ID:
+ resources->kvd_double_min_size = val;
+ resources->kvd_double_min_size_valid = 1;
+ break;
+ case MLXSW_MAX_VIRTUAL_ROUTERS_ID:
+ resources->max_virtual_routers = val;
+ resources->max_virtual_routers_valid = 1;
+ break;
+ case MLXSW_MAX_SYSTEM_PORT_ID:
+ resources->max_system_ports = val;
+ resources->max_system_ports_valid = 1;
+ break;
+ case MLXSW_MAX_VLAN_GROUPS_ID:
+ resources->max_vlan_groups = val;
+ resources->max_vlan_groups_valid = 1;
+ break;
+ case MLXSW_MAX_REGIONS_ID:
+ resources->max_regions = val;
+ resources->max_regions_valid = 1;
+ break;
+ case MLXSW_MAX_RIF_ID:
+ resources->max_rif = val;
+ resources->max_rif_valid = 1;
+ break;
default:
break;
}
@@ -1209,10 +1259,52 @@
return -EIO;
}
+static int mlxsw_pci_profile_get_kvd_sizes(const struct mlxsw_config_profile *profile,
+ struct mlxsw_resources *resources)
+{
+ u32 singles_size, doubles_size, linear_size;
+
+ if (!resources->kvd_single_min_size_valid ||
+ !resources->kvd_double_min_size_valid ||
+ !profile->used_kvd_split_data)
+ return -EIO;
+
+ linear_size = profile->kvd_linear_size;
+
+ /* The hash part is what left of the kvd without the
+ * linear part. It is split to the single size and
+ * double size by the parts ratio from the profile.
+ * Both sizes must be a multiplications of the
+ * granularity from the profile.
+ */
+ doubles_size = (resources->kvd_size - linear_size);
+ doubles_size *= profile->kvd_hash_double_parts;
+ doubles_size /= (profile->kvd_hash_double_parts +
+ profile->kvd_hash_single_parts);
+ doubles_size /= profile->kvd_hash_granularity;
+ doubles_size *= profile->kvd_hash_granularity;
+ singles_size = resources->kvd_size - doubles_size -
+ linear_size;
+
+ /* Check results are legal. */
+ if (singles_size < resources->kvd_single_min_size ||
+ doubles_size < resources->kvd_double_min_size ||
+ resources->kvd_size < linear_size)
+ return -EIO;
+
+ resources->kvd_single_size = singles_size;
+ resources->kvd_double_size = doubles_size;
+ resources->kvd_linear_size = linear_size;
+
+ return 0;
+}
+
static int mlxsw_pci_config_profile(struct mlxsw_pci *mlxsw_pci, char *mbox,
- const struct mlxsw_config_profile *profile)
+ const struct mlxsw_config_profile *profile,
+ struct mlxsw_resources *resources)
{
int i;
+ int err;
mlxsw_cmd_mbox_zero(mbox);
@@ -1222,18 +1314,6 @@
mlxsw_cmd_mbox_config_profile_max_vepa_channels_set(
mbox, profile->max_vepa_channels);
}
- if (profile->used_max_lag) {
- mlxsw_cmd_mbox_config_profile_set_max_lag_set(
- mbox, 1);
- mlxsw_cmd_mbox_config_profile_max_lag_set(
- mbox, profile->max_lag);
- }
- if (profile->used_max_port_per_lag) {
- mlxsw_cmd_mbox_config_profile_set_max_port_per_lag_set(
- mbox, 1);
- mlxsw_cmd_mbox_config_profile_max_port_per_lag_set(
- mbox, profile->max_port_per_lag);
- }
if (profile->used_max_mid) {
mlxsw_cmd_mbox_config_profile_set_max_mid_set(
mbox, 1);
@@ -1310,19 +1390,22 @@
mlxsw_cmd_mbox_config_profile_adaptive_routing_group_cap_set(
mbox, profile->adaptive_routing_group_cap);
}
- if (profile->used_kvd_sizes) {
- mlxsw_cmd_mbox_config_profile_set_kvd_linear_size_set(
- mbox, 1);
- mlxsw_cmd_mbox_config_profile_kvd_linear_size_set(
- mbox, profile->kvd_linear_size);
- mlxsw_cmd_mbox_config_profile_set_kvd_hash_single_size_set(
- mbox, 1);
- mlxsw_cmd_mbox_config_profile_kvd_hash_single_size_set(
- mbox, profile->kvd_hash_single_size);
+ if (resources->kvd_size_valid) {
+ err = mlxsw_pci_profile_get_kvd_sizes(profile, resources);
+ if (err)
+ return err;
+
+ mlxsw_cmd_mbox_config_profile_set_kvd_linear_size_set(mbox, 1);
+ mlxsw_cmd_mbox_config_profile_kvd_linear_size_set(mbox,
+ resources->kvd_linear_size);
+ mlxsw_cmd_mbox_config_profile_set_kvd_hash_single_size_set(mbox,
+ 1);
+ mlxsw_cmd_mbox_config_profile_kvd_hash_single_size_set(mbox,
+ resources->kvd_single_size);
mlxsw_cmd_mbox_config_profile_set_kvd_hash_double_size_set(
- mbox, 1);
- mlxsw_cmd_mbox_config_profile_kvd_hash_double_size_set(
- mbox, profile->kvd_hash_double_size);
+ mbox, 1);
+ mlxsw_cmd_mbox_config_profile_kvd_hash_double_size_set(mbox,
+ resources->kvd_double_size);
}
for (i = 0; i < MLXSW_CONFIG_PROFILE_SWID_COUNT; i++)
@@ -1524,7 +1607,7 @@
if (err)
goto err_query_resources;
- err = mlxsw_pci_config_profile(mlxsw_pci, mbox, profile);
+ err = mlxsw_pci_config_profile(mlxsw_pci, mbox, profile, resources);
if (err)
goto err_config_profile;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 4e2354c..6460c72 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -1392,7 +1392,7 @@
{
MLXSW_REG_ZERO(slcr, payload);
mlxsw_reg_slcr_pp_set(payload, MLXSW_REG_SLCR_PP_GLOBAL);
- mlxsw_reg_slcr_type_set(payload, MLXSW_REG_SLCR_TYPE_XOR);
+ mlxsw_reg_slcr_type_set(payload, MLXSW_REG_SLCR_TYPE_CRC);
mlxsw_reg_slcr_lag_hash_set(payload, lag_hash);
}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 27bbcaf..fd74d10 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -248,7 +248,8 @@
span_entry->used = false;
}
-struct mlxsw_sp_span_entry *mlxsw_sp_span_entry_find(struct mlxsw_sp_port *port)
+static struct mlxsw_sp_span_entry *
+mlxsw_sp_span_entry_find(struct mlxsw_sp_port *port)
{
struct mlxsw_sp *mlxsw_sp = port->mlxsw_sp;
int i;
@@ -262,7 +263,8 @@
return NULL;
}
-struct mlxsw_sp_span_entry *mlxsw_sp_span_entry_get(struct mlxsw_sp_port *port)
+static struct mlxsw_sp_span_entry
+*mlxsw_sp_span_entry_get(struct mlxsw_sp_port *port)
{
struct mlxsw_sp_span_entry *span_entry;
@@ -364,7 +366,8 @@
}
/* bind the port to the SPAN entry */
- mlxsw_reg_mpar_pack(mpar_pl, port->local_port, type, true, pa_id);
+ mlxsw_reg_mpar_pack(mpar_pl, port->local_port,
+ (enum mlxsw_reg_mpar_i_e) type, true, pa_id);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(mpar), mpar_pl);
if (err)
goto err_mpar_reg_write;
@@ -405,7 +408,8 @@
return;
/* remove the inspected port */
- mlxsw_reg_mpar_pack(mpar_pl, port->local_port, type, false, pa_id);
+ mlxsw_reg_mpar_pack(mpar_pl, port->local_port,
+ (enum mlxsw_reg_mpar_i_e) type, false, pa_id);
mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(mpar), mpar_pl);
/* remove the SBIB buffer if it was egress SPAN */
@@ -819,9 +823,9 @@
return err;
}
-static struct rtnl_link_stats64 *
-mlxsw_sp_port_get_stats64(struct net_device *dev,
- struct rtnl_link_stats64 *stats)
+static int
+mlxsw_sp_port_get_sw_stats64(const struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
{
struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
struct mlxsw_sp_port_pcpu_stats *p;
@@ -848,6 +852,107 @@
tx_dropped += p->tx_dropped;
}
stats->tx_dropped = tx_dropped;
+ return 0;
+}
+
+static bool mlxsw_sp_port_has_offload_stats(int attr_id)
+{
+ switch (attr_id) {
+ case IFLA_OFFLOAD_XSTATS_CPU_HIT:
+ return true;
+ }
+
+ return false;
+}
+
+static int mlxsw_sp_port_get_offload_stats(int attr_id, const struct net_device *dev,
+ void *sp)
+{
+ switch (attr_id) {
+ case IFLA_OFFLOAD_XSTATS_CPU_HIT:
+ return mlxsw_sp_port_get_sw_stats64(dev, sp);
+ }
+
+ return -EINVAL;
+}
+
+static int mlxsw_sp_port_get_stats_raw(struct net_device *dev, int grp,
+ int prio, char *ppcnt_pl)
+{
+ struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
+ struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
+
+ mlxsw_reg_ppcnt_pack(ppcnt_pl, mlxsw_sp_port->local_port, grp, prio);
+ return mlxsw_reg_query(mlxsw_sp->core, MLXSW_REG(ppcnt), ppcnt_pl);
+}
+
+static int mlxsw_sp_port_get_hw_stats(struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
+{
+ char ppcnt_pl[MLXSW_REG_PPCNT_LEN];
+ int err;
+
+ err = mlxsw_sp_port_get_stats_raw(dev, MLXSW_REG_PPCNT_IEEE_8023_CNT,
+ 0, ppcnt_pl);
+ if (err)
+ goto out;
+
+ stats->tx_packets =
+ mlxsw_reg_ppcnt_a_frames_transmitted_ok_get(ppcnt_pl);
+ stats->rx_packets =
+ mlxsw_reg_ppcnt_a_frames_received_ok_get(ppcnt_pl);
+ stats->tx_bytes =
+ mlxsw_reg_ppcnt_a_octets_transmitted_ok_get(ppcnt_pl);
+ stats->rx_bytes =
+ mlxsw_reg_ppcnt_a_octets_received_ok_get(ppcnt_pl);
+ stats->multicast =
+ mlxsw_reg_ppcnt_a_multicast_frames_received_ok_get(ppcnt_pl);
+
+ stats->rx_crc_errors =
+ mlxsw_reg_ppcnt_a_frame_check_sequence_errors_get(ppcnt_pl);
+ stats->rx_frame_errors =
+ mlxsw_reg_ppcnt_a_alignment_errors_get(ppcnt_pl);
+
+ stats->rx_length_errors = (
+ mlxsw_reg_ppcnt_a_in_range_length_errors_get(ppcnt_pl) +
+ mlxsw_reg_ppcnt_a_out_of_range_length_field_get(ppcnt_pl) +
+ mlxsw_reg_ppcnt_a_frame_too_long_errors_get(ppcnt_pl));
+
+ stats->rx_errors = (stats->rx_crc_errors +
+ stats->rx_frame_errors + stats->rx_length_errors);
+
+out:
+ return err;
+}
+
+static void update_stats_cache(struct work_struct *work)
+{
+ struct mlxsw_sp_port *mlxsw_sp_port =
+ container_of(work, struct mlxsw_sp_port,
+ hw_stats.update_dw.work);
+
+ if (!netif_carrier_ok(mlxsw_sp_port->dev))
+ goto out;
+
+ mlxsw_sp_port_get_hw_stats(mlxsw_sp_port->dev,
+ mlxsw_sp_port->hw_stats.cache);
+
+out:
+ mlxsw_core_schedule_dw(&mlxsw_sp_port->hw_stats.update_dw,
+ MLXSW_HW_STATS_UPDATE_TIME);
+}
+
+/* Return the stats from a cache that is updated periodically,
+ * as this function might get called in an atomic context.
+ */
+static struct rtnl_link_stats64 *
+mlxsw_sp_port_get_stats64(struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
+
+ memcpy(stats, mlxsw_sp_port->hw_stats.cache, sizeof(*stats));
+
return stats;
}
@@ -1209,6 +1314,8 @@
.ndo_set_mac_address = mlxsw_sp_port_set_mac_address,
.ndo_change_mtu = mlxsw_sp_port_change_mtu,
.ndo_get_stats64 = mlxsw_sp_port_get_stats64,
+ .ndo_has_offload_stats = mlxsw_sp_port_has_offload_stats,
+ .ndo_get_offload_stats = mlxsw_sp_port_get_offload_stats,
.ndo_vlan_rx_add_vid = mlxsw_sp_port_add_vid,
.ndo_vlan_rx_kill_vid = mlxsw_sp_port_kill_vid,
.ndo_neigh_construct = mlxsw_sp_router_neigh_construct,
@@ -1547,8 +1654,6 @@
enum mlxsw_reg_ppcnt_grp grp, int prio,
u64 *data, int data_index)
{
- struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
struct mlxsw_sp_port_hw_stats *hw_stats;
char ppcnt_pl[MLXSW_REG_PPCNT_LEN];
int i, len;
@@ -1557,10 +1662,9 @@
err = mlxsw_sp_get_hw_stats_by_group(&hw_stats, &len, grp);
if (err)
return;
- mlxsw_reg_ppcnt_pack(ppcnt_pl, mlxsw_sp_port->local_port, grp, prio);
- err = mlxsw_reg_query(mlxsw_sp->core, MLXSW_REG(ppcnt), ppcnt_pl);
+ mlxsw_sp_port_get_stats_raw(dev, grp, prio, ppcnt_pl);
for (i = 0; i < len; i++)
- data[data_index + i] = !err ? hw_stats[i].getter(ppcnt_pl) : 0;
+ data[data_index + i] = hw_stats[i].getter(ppcnt_pl);
}
static void mlxsw_sp_port_get_stats(struct net_device *dev,
@@ -2145,6 +2249,16 @@
goto err_alloc_stats;
}
+ mlxsw_sp_port->hw_stats.cache =
+ kzalloc(sizeof(*mlxsw_sp_port->hw_stats.cache), GFP_KERNEL);
+
+ if (!mlxsw_sp_port->hw_stats.cache) {
+ err = -ENOMEM;
+ goto err_alloc_hw_stats;
+ }
+ INIT_DELAYED_WORK(&mlxsw_sp_port->hw_stats.update_dw,
+ &update_stats_cache);
+
dev->netdev_ops = &mlxsw_sp_port_netdev_ops;
dev->ethtool_ops = &mlxsw_sp_port_ethtool_ops;
@@ -2245,6 +2359,7 @@
goto err_core_port_init;
}
+ mlxsw_core_schedule_dw(&mlxsw_sp_port->hw_stats.update_dw, 0);
return 0;
err_core_port_init:
@@ -2265,6 +2380,8 @@
err_dev_addr_init:
mlxsw_sp_port_swid_set(mlxsw_sp_port, MLXSW_PORT_SWID_DISABLED_PORT);
err_port_swid_set:
+ kfree(mlxsw_sp_port->hw_stats.cache);
+err_alloc_hw_stats:
free_percpu(mlxsw_sp_port->pcpu_stats);
err_alloc_stats:
kfree(mlxsw_sp_port->untagged_vlans);
@@ -2281,6 +2398,7 @@
if (!mlxsw_sp_port)
return;
+ cancel_delayed_work_sync(&mlxsw_sp_port->hw_stats.update_dw);
mlxsw_core_port_fini(&mlxsw_sp_port->core_port);
unregister_netdev(mlxsw_sp_port->dev); /* This calls ndo_stop */
mlxsw_sp->ports[local_port] = NULL;
@@ -2290,6 +2408,7 @@
mlxsw_sp_port_swid_set(mlxsw_sp_port, MLXSW_PORT_SWID_DISABLED_PORT);
mlxsw_sp_port_module_unmap(mlxsw_sp, mlxsw_sp_port->local_port);
free_percpu(mlxsw_sp_port->pcpu_stats);
+ kfree(mlxsw_sp_port->hw_stats.cache);
kfree(mlxsw_sp_port->untagged_vlans);
kfree(mlxsw_sp_port->active_vlans);
WARN_ON_ONCE(!list_empty(&mlxsw_sp_port->vports_list));
@@ -2768,7 +2887,9 @@
static int mlxsw_sp_lag_init(struct mlxsw_sp *mlxsw_sp)
{
+ struct mlxsw_resources *resources;
char slcr_pl[MLXSW_REG_SLCR_LEN];
+ int err;
mlxsw_reg_slcr_pack(slcr_pl, MLXSW_REG_SLCR_LAG_HASH_SMAC |
MLXSW_REG_SLCR_LAG_HASH_DMAC |
@@ -2779,7 +2900,26 @@
MLXSW_REG_SLCR_LAG_HASH_SPORT |
MLXSW_REG_SLCR_LAG_HASH_DPORT |
MLXSW_REG_SLCR_LAG_HASH_IPPROTO);
- return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(slcr), slcr_pl);
+ err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(slcr), slcr_pl);
+ if (err)
+ return err;
+
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ if (!(resources->max_lag_valid && resources->max_ports_in_lag_valid))
+ return -EIO;
+
+ mlxsw_sp->lags = kcalloc(resources->max_lag,
+ sizeof(struct mlxsw_sp_upper),
+ GFP_KERNEL);
+ if (!mlxsw_sp->lags)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void mlxsw_sp_lag_fini(struct mlxsw_sp *mlxsw_sp)
+{
+ kfree(mlxsw_sp->lags);
}
static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
@@ -2863,6 +3003,7 @@
err_router_init:
mlxsw_sp_switchdev_fini(mlxsw_sp);
err_switchdev_init:
+ mlxsw_sp_lag_fini(mlxsw_sp);
err_lag_init:
mlxsw_sp_buffers_fini(mlxsw_sp);
err_buffers_init:
@@ -2876,38 +3017,26 @@
static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
{
struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
- int i;
mlxsw_sp_ports_remove(mlxsw_sp);
mlxsw_sp_span_fini(mlxsw_sp);
mlxsw_sp_router_fini(mlxsw_sp);
mlxsw_sp_switchdev_fini(mlxsw_sp);
+ mlxsw_sp_lag_fini(mlxsw_sp);
mlxsw_sp_buffers_fini(mlxsw_sp);
mlxsw_sp_traps_fini(mlxsw_sp);
mlxsw_sp_event_unregister(mlxsw_sp, MLXSW_TRAP_ID_PUDE);
WARN_ON(!list_empty(&mlxsw_sp->vfids.list));
WARN_ON(!list_empty(&mlxsw_sp->fids));
- for (i = 0; i < MLXSW_SP_RIF_MAX; i++)
- WARN_ON_ONCE(mlxsw_sp->rifs[i]);
}
static struct mlxsw_config_profile mlxsw_sp_config_profile = {
.used_max_vepa_channels = 1,
.max_vepa_channels = 0,
- .used_max_lag = 1,
- .max_lag = MLXSW_SP_LAG_MAX,
- .used_max_port_per_lag = 1,
- .max_port_per_lag = MLXSW_SP_PORT_PER_LAG_MAX,
.used_max_mid = 1,
.max_mid = MLXSW_SP_MID_MAX,
.used_max_pgt = 1,
.max_pgt = 0,
- .used_max_system_port = 1,
- .max_system_port = 64,
- .used_max_vlan_groups = 1,
- .max_vlan_groups = 127,
- .used_max_regions = 1,
- .max_regions = 400,
.used_flood_tables = 1,
.used_flood_mode = 1,
.flood_mode = 3,
@@ -2919,10 +3048,11 @@
.max_ib_mc = 0,
.used_max_pkey = 1,
.max_pkey = 0,
- .used_kvd_sizes = 1,
+ .used_kvd_split_data = 1,
+ .kvd_hash_granularity = MLXSW_SP_KVD_GRANULARITY,
+ .kvd_hash_single_parts = 2,
+ .kvd_hash_double_parts = 1,
.kvd_linear_size = MLXSW_SP_KVD_LINEAR_SIZE,
- .kvd_hash_single_size = MLXSW_SP_KVD_HASH_SINGLE_SIZE,
- .kvd_hash_double_size = MLXSW_SP_KVD_HASH_DOUBLE_SIZE,
.swid_config = {
{
.used_type = 1,
@@ -3039,13 +3169,15 @@
static int mlxsw_sp_avail_rif_get(struct mlxsw_sp *mlxsw_sp)
{
+ struct mlxsw_resources *resources;
int i;
- for (i = 0; i < MLXSW_SP_RIF_MAX; i++)
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_rif; i++)
if (!mlxsw_sp->rifs[i])
return i;
- return MLXSW_SP_RIF_MAX;
+ return MLXSW_SP_INVALID_RIF;
}
static void mlxsw_sp_vport_rif_sp_attr_get(struct mlxsw_sp_port *mlxsw_sp_vport,
@@ -3125,7 +3257,7 @@
int err;
rif = mlxsw_sp_avail_rif_get(mlxsw_sp);
- if (rif == MLXSW_SP_RIF_MAX)
+ if (rif == MLXSW_SP_INVALID_RIF)
return ERR_PTR(-ERANGE);
err = mlxsw_sp_vport_rif_sp_op(mlxsw_sp_vport, l3_dev, rif, true);
@@ -3357,7 +3489,7 @@
int err;
rif = mlxsw_sp_avail_rif_get(mlxsw_sp);
- if (rif == MLXSW_SP_RIF_MAX)
+ if (rif == MLXSW_SP_INVALID_RIF)
return -ERANGE;
err = mlxsw_sp_router_port_flood_set(mlxsw_sp, f->fid, true);
@@ -3564,12 +3696,14 @@
struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
u8 local_port = mlxsw_sp_port->local_port;
u16 lag_id = mlxsw_sp_port->lag_id;
+ struct mlxsw_resources *resources;
int i, count = 0;
if (!mlxsw_sp_port->lagged)
return true;
- for (i = 0; i < MLXSW_SP_PORT_PER_LAG_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_ports_in_lag; i++) {
struct mlxsw_sp_port *lag_port;
lag_port = mlxsw_sp_port_lagged_get(mlxsw_sp, lag_id, i);
@@ -3775,11 +3909,13 @@
struct net_device *lag_dev,
u16 *p_lag_id)
{
+ struct mlxsw_resources *resources;
struct mlxsw_sp_upper *lag;
int free_lag_id = -1;
int i;
- for (i = 0; i < MLXSW_SP_LAG_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_lag; i++) {
lag = mlxsw_sp_lag_get(mlxsw_sp, i);
if (lag->ref_count) {
if (lag->dev == lag_dev) {
@@ -3813,9 +3949,11 @@
static int mlxsw_sp_port_lag_index_get(struct mlxsw_sp *mlxsw_sp,
u16 lag_id, u8 *p_port_index)
{
+ struct mlxsw_resources *resources;
int i;
- for (i = 0; i < MLXSW_SP_PORT_PER_LAG_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_ports_in_lag; i++) {
if (!mlxsw_sp_port_lagged_get(mlxsw_sp, lag_id, i)) {
*p_port_index = i;
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 969c250..9b22863 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -45,7 +45,7 @@
#include <linux/list.h>
#include <linux/dcbnl.h>
#include <linux/in6.h>
-#include <net/switchdev.h>
+#include <linux/notifier.h>
#include "port.h"
#include "core.h"
@@ -54,10 +54,7 @@
#define MLXSW_SP_VFID_MAX 6656 /* Bridged VLAN interfaces */
#define MLXSW_SP_RFID_BASE 15360
-#define MLXSW_SP_RIF_MAX 800
-
-#define MLXSW_SP_LAG_MAX 64
-#define MLXSW_SP_PORT_PER_LAG_MAX 16
+#define MLXSW_SP_INVALID_RIF 0xffff
#define MLXSW_SP_MID_MAX 7000
@@ -67,8 +64,6 @@
#define MLXSW_SP_LPM_TREE_MAX 22
#define MLXSW_SP_LPM_TREE_COUNT (MLXSW_SP_LPM_TREE_MAX - MLXSW_SP_LPM_TREE_MIN)
-#define MLXSW_SP_VIRTUAL_ROUTER_MAX 256
-
#define MLXSW_SP_PORT_BASE_SPEED 25000 /* Mb/s */
#define MLXSW_SP_BYTES_PER_CELL 96
@@ -77,8 +72,7 @@
#define MLXSW_SP_CELLS_TO_BYTES(c) (c * MLXSW_SP_BYTES_PER_CELL)
#define MLXSW_SP_KVD_LINEAR_SIZE 65536 /* entries */
-#define MLXSW_SP_KVD_HASH_SINGLE_SIZE 163840 /* entries */
-#define MLXSW_SP_KVD_HASH_DOUBLE_SIZE 32768 /* entries */
+#define MLXSW_SP_KVD_GRANULARITY 128
/* Maximum delay buffer needed in case of PAUSE frames, in cells.
* Assumes 100m cable and maximum MTU.
@@ -253,7 +247,7 @@
struct mlxsw_sp_router {
struct mlxsw_sp_lpm_tree lpm_trees[MLXSW_SP_LPM_TREE_COUNT];
- struct mlxsw_sp_vr vrs[MLXSW_SP_VIRTUAL_ROUTER_MAX];
+ struct mlxsw_sp_vr *vrs;
struct rhashtable neigh_ht;
struct {
struct delayed_work dw;
@@ -263,6 +257,7 @@
#define MLXSW_SP_UNRESOLVED_NH_PROBE_INTERVAL 5000 /* ms */
struct list_head nexthop_group_list;
struct list_head nexthop_neighs_list;
+ bool aborted;
};
struct mlxsw_sp {
@@ -275,7 +270,7 @@
DECLARE_BITMAP(mapped, MLXSW_SP_MID_MAX);
} br_mids;
struct list_head fids; /* VLAN-aware bridge FIDs */
- struct mlxsw_sp_rif *rifs[MLXSW_SP_RIF_MAX];
+ struct mlxsw_sp_rif **rifs;
struct mlxsw_sp_port **ports;
struct mlxsw_core *core;
const struct mlxsw_bus_info *bus_info;
@@ -290,7 +285,7 @@
#define MLXSW_SP_DEFAULT_AGEING_TIME 300
u32 ageing_time;
struct mlxsw_sp_upper master_bridge;
- struct mlxsw_sp_upper lags[MLXSW_SP_LAG_MAX];
+ struct mlxsw_sp_upper *lags;
u8 port_to_module[MLXSW_PORT_MAX_PORTS];
struct mlxsw_sp_sb sb;
struct mlxsw_sp_router router;
@@ -302,6 +297,7 @@
struct mlxsw_sp_span_entry *entries;
int entries_count;
} span;
+ struct notifier_block fib_nb;
};
static inline struct mlxsw_sp_upper *
@@ -361,6 +357,11 @@
struct list_head vports_list;
/* TC handles */
struct list_head mall_tc_list;
+ struct {
+ #define MLXSW_HW_STATS_UPDATE_TIME HZ
+ struct rtnl_link_stats64 *cache;
+ struct delayed_work update_dw;
+ } hw_stats;
};
struct mlxsw_sp_port *mlxsw_sp_port_lower_dev_hold(struct net_device *dev);
@@ -478,9 +479,12 @@
mlxsw_sp_rif_find_by_dev(const struct mlxsw_sp *mlxsw_sp,
const struct net_device *dev)
{
+ struct mlxsw_resources *resources;
int i;
- for (i = 0; i < MLXSW_SP_RIF_MAX; i++)
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+
+ for (i = 0; i < resources->max_rif; i++)
if (mlxsw_sp->rifs[i] && mlxsw_sp->rifs[i]->dev == dev)
return mlxsw_sp->rifs[i];
@@ -582,11 +586,6 @@
int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp);
void mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp);
-int mlxsw_sp_router_fib4_add(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans);
-int mlxsw_sp_router_fib4_del(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4);
int mlxsw_sp_router_neigh_construct(struct net_device *dev,
struct neighbour *n);
void mlxsw_sp_router_neigh_destroy(struct net_device *dev,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
index 953b214..bcaed8a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c
@@ -595,9 +595,9 @@
enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
struct mlxsw_sp_sb_pr *pr = mlxsw_sp_sb_pr_get(mlxsw_sp, pool, dir);
- pool_info->pool_type = dir;
+ pool_info->pool_type = (enum devlink_sb_pool_type) dir;
pool_info->size = MLXSW_SP_CELLS_TO_BYTES(pr->size);
- pool_info->threshold_type = pr->mode;
+ pool_info->threshold_type = (enum devlink_sb_threshold_type) pr->mode;
return 0;
}
@@ -608,9 +608,10 @@
struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
u8 pool = pool_get(pool_index);
enum mlxsw_reg_sbxx_dir dir = dir_get(pool_index);
- enum mlxsw_reg_sbpr_mode mode = threshold_type;
u32 pool_size = MLXSW_SP_BYTES_TO_CELLS(size);
+ enum mlxsw_reg_sbpr_mode mode;
+ mode = (enum mlxsw_reg_sbpr_mode) threshold_type;
return mlxsw_sp_sb_pr_write(mlxsw_sp, pool, dir, mode, pool_size);
}
@@ -696,13 +697,13 @@
struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
u8 local_port = mlxsw_sp_port->local_port;
u8 pg_buff = tc_index;
- enum mlxsw_reg_sbxx_dir dir = pool_type;
+ enum mlxsw_reg_sbxx_dir dir = (enum mlxsw_reg_sbxx_dir) pool_type;
struct mlxsw_sp_sb_cm *cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port,
pg_buff, dir);
*p_threshold = mlxsw_sp_sb_threshold_out(mlxsw_sp, cm->pool, dir,
cm->max_buff);
- *p_pool_index = pool_index_get(cm->pool, pool_type);
+ *p_pool_index = pool_index_get(cm->pool, dir);
return 0;
}
@@ -716,7 +717,7 @@
struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
u8 local_port = mlxsw_sp_port->local_port;
u8 pg_buff = tc_index;
- enum mlxsw_reg_sbxx_dir dir = pool_type;
+ enum mlxsw_reg_sbxx_dir dir = (enum mlxsw_reg_sbxx_dir) pool_type;
u8 pool = pool_get(pool_index);
u32 max_buff;
int err;
@@ -943,7 +944,7 @@
struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
u8 local_port = mlxsw_sp_port->local_port;
u8 pg_buff = tc_index;
- enum mlxsw_reg_sbxx_dir dir = pool_type;
+ enum mlxsw_reg_sbxx_dir dir = (enum mlxsw_reg_sbxx_dir) pool_type;
struct mlxsw_sp_sb_cm *cm = mlxsw_sp_sb_cm_get(mlxsw_sp, local_port,
pg_buff, dir);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 3f5c51d..48d50ef 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -43,6 +43,7 @@
#include <net/netevent.h>
#include <net/neighbour.h>
#include <net/arp.h>
+#include <net/ip_fib.h>
#include "spectrum.h"
#include "core.h"
@@ -122,17 +123,20 @@
struct mlxsw_sp_fib_entry {
struct rhash_head ht_node;
+ struct list_head list;
struct mlxsw_sp_fib_key key;
enum mlxsw_sp_fib_entry_type type;
unsigned int ref_count;
u16 rif; /* used for action local */
struct mlxsw_sp_vr *vr;
+ struct fib_info *fi;
struct list_head nexthop_group_node;
struct mlxsw_sp_nexthop_group *nh_group;
};
struct mlxsw_sp_fib {
struct rhashtable ht;
+ struct list_head entry_list;
unsigned long prefix_ref_count[MLXSW_SP_PREFIX_COUNT];
struct mlxsw_sp_prefix_usage prefix_usage;
};
@@ -154,6 +158,7 @@
mlxsw_sp_fib_ht_params);
if (err)
return err;
+ list_add_tail(&fib_entry->list, &fib->entry_list);
if (fib->prefix_ref_count[prefix_len]++ == 0)
mlxsw_sp_prefix_usage_set(&fib->prefix_usage, prefix_len);
return 0;
@@ -166,6 +171,7 @@
if (--fib->prefix_ref_count[prefix_len] == 0)
mlxsw_sp_prefix_usage_clear(&fib->prefix_usage, prefix_len);
+ list_del(&fib_entry->list);
rhashtable_remove_fast(&fib->ht, &fib_entry->ht_node,
mlxsw_sp_fib_ht_params);
}
@@ -216,6 +222,7 @@
err = rhashtable_init(&fib->ht, &mlxsw_sp_fib_ht_params);
if (err)
goto err_rhashtable_init;
+ INIT_LIST_HEAD(&fib->entry_list);
return fib;
err_rhashtable_init:
@@ -252,7 +259,9 @@
{
char ralta_pl[MLXSW_REG_RALTA_LEN];
- mlxsw_reg_ralta_pack(ralta_pl, true, lpm_tree->proto, lpm_tree->id);
+ mlxsw_reg_ralta_pack(ralta_pl, true,
+ (enum mlxsw_reg_ralxx_protocol) lpm_tree->proto,
+ lpm_tree->id);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralta), ralta_pl);
}
@@ -261,7 +270,9 @@
{
char ralta_pl[MLXSW_REG_RALTA_LEN];
- mlxsw_reg_ralta_pack(ralta_pl, false, lpm_tree->proto, lpm_tree->id);
+ mlxsw_reg_ralta_pack(ralta_pl, false,
+ (enum mlxsw_reg_ralxx_protocol) lpm_tree->proto,
+ lpm_tree->id);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralta), ralta_pl);
}
@@ -368,10 +379,12 @@
static struct mlxsw_sp_vr *mlxsw_sp_vr_find_unused(struct mlxsw_sp *mlxsw_sp)
{
+ struct mlxsw_resources *resources;
struct mlxsw_sp_vr *vr;
int i;
- for (i = 0; i < MLXSW_SP_VIRTUAL_ROUTER_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_virtual_routers; i++) {
vr = &mlxsw_sp->router.vrs[i];
if (!vr->used)
return vr;
@@ -384,7 +397,9 @@
{
char raltb_pl[MLXSW_REG_RALTB_LEN];
- mlxsw_reg_raltb_pack(raltb_pl, vr->id, vr->proto, vr->lpm_tree->id);
+ mlxsw_reg_raltb_pack(raltb_pl, vr->id,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto,
+ vr->lpm_tree->id);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(raltb), raltb_pl);
}
@@ -394,7 +409,8 @@
char raltb_pl[MLXSW_REG_RALTB_LEN];
/* Bind to tree 0 which is default */
- mlxsw_reg_raltb_pack(raltb_pl, vr->id, vr->proto, 0);
+ mlxsw_reg_raltb_pack(raltb_pl, vr->id,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto, 0);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(raltb), raltb_pl);
}
@@ -410,11 +426,14 @@
u32 tb_id,
enum mlxsw_sp_l3proto proto)
{
+ struct mlxsw_resources *resources;
struct mlxsw_sp_vr *vr;
int i;
tb_id = mlxsw_sp_fix_tb_id(tb_id);
- for (i = 0; i < MLXSW_SP_VIRTUAL_ROUTER_MAX; i++) {
+
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_virtual_routers; i++) {
vr = &mlxsw_sp->router.vrs[i];
if (vr->used && vr->proto == proto && vr->tb_id == tb_id)
return vr;
@@ -548,15 +567,33 @@
&vr->fib->prefix_usage);
}
-static void mlxsw_sp_vrs_init(struct mlxsw_sp *mlxsw_sp)
+static int mlxsw_sp_vrs_init(struct mlxsw_sp *mlxsw_sp)
{
+ struct mlxsw_resources *resources;
struct mlxsw_sp_vr *vr;
int i;
- for (i = 0; i < MLXSW_SP_VIRTUAL_ROUTER_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ if (!resources->max_virtual_routers_valid)
+ return -EIO;
+
+ mlxsw_sp->router.vrs = kcalloc(resources->max_virtual_routers,
+ sizeof(struct mlxsw_sp_vr),
+ GFP_KERNEL);
+ if (!mlxsw_sp->router.vrs)
+ return -ENOMEM;
+
+ for (i = 0; i < resources->max_virtual_routers; i++) {
vr = &mlxsw_sp->router.vrs[i];
vr->id = i;
}
+
+ return 0;
+}
+
+static void mlxsw_sp_vrs_fini(struct mlxsw_sp *mlxsw_sp)
+{
+ kfree(mlxsw_sp->router.vrs);
}
struct mlxsw_sp_neigh_key {
@@ -1081,9 +1118,10 @@
{
char raleu_pl[MLXSW_REG_RALEU_LEN];
- mlxsw_reg_raleu_pack(raleu_pl, vr->proto, vr->id,
- adj_index, ecmp_size,
- new_adj_index, new_ecmp_size);
+ mlxsw_reg_raleu_pack(raleu_pl,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto, vr->id,
+ adj_index, ecmp_size, new_adj_index,
+ new_ecmp_size);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(raleu), raleu_pl);
}
@@ -1489,50 +1527,6 @@
mlxsw_sp_nexthop_group_destroy(mlxsw_sp, nh_grp);
}
-static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
-{
- char rgcr_pl[MLXSW_REG_RGCR_LEN];
-
- mlxsw_reg_rgcr_pack(rgcr_pl, true);
- mlxsw_reg_rgcr_max_router_interfaces_set(rgcr_pl, MLXSW_SP_RIF_MAX);
- return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rgcr), rgcr_pl);
-}
-
-static void __mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp)
-{
- char rgcr_pl[MLXSW_REG_RGCR_LEN];
-
- mlxsw_reg_rgcr_pack(rgcr_pl, false);
- mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rgcr), rgcr_pl);
-}
-
-int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
-{
- int err;
-
- INIT_LIST_HEAD(&mlxsw_sp->router.nexthop_neighs_list);
- INIT_LIST_HEAD(&mlxsw_sp->router.nexthop_group_list);
- err = __mlxsw_sp_router_init(mlxsw_sp);
- if (err)
- return err;
- mlxsw_sp_lpm_init(mlxsw_sp);
- mlxsw_sp_vrs_init(mlxsw_sp);
- err = mlxsw_sp_neigh_init(mlxsw_sp);
- if (err)
- goto err_neigh_init;
- return 0;
-
-err_neigh_init:
- __mlxsw_sp_router_fini(mlxsw_sp);
- return err;
-}
-
-void mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp)
-{
- mlxsw_sp_neigh_fini(mlxsw_sp);
- __mlxsw_sp_router_fini(mlxsw_sp);
-}
-
static int mlxsw_sp_fib_entry_op4_remote(struct mlxsw_sp *mlxsw_sp,
struct mlxsw_sp_fib_entry *fib_entry,
enum mlxsw_reg_ralue_op op)
@@ -1558,8 +1552,9 @@
trap_id = MLXSW_TRAP_ID_RTR_INGRESS0;
}
- mlxsw_reg_ralue_pack4(ralue_pl, vr->proto, op, vr->id,
- fib_entry->key.prefix_len, *p_dip);
+ mlxsw_reg_ralue_pack4(ralue_pl,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto, op,
+ vr->id, fib_entry->key.prefix_len, *p_dip);
mlxsw_reg_ralue_act_remote_pack(ralue_pl, trap_action, trap_id,
adjacency_index, ecmp_size);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralue), ralue_pl);
@@ -1573,8 +1568,9 @@
u32 *p_dip = (u32 *) fib_entry->key.addr;
struct mlxsw_sp_vr *vr = fib_entry->vr;
- mlxsw_reg_ralue_pack4(ralue_pl, vr->proto, op, vr->id,
- fib_entry->key.prefix_len, *p_dip);
+ mlxsw_reg_ralue_pack4(ralue_pl,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto, op,
+ vr->id, fib_entry->key.prefix_len, *p_dip);
mlxsw_reg_ralue_act_local_pack(ralue_pl,
MLXSW_REG_RALUE_TRAP_ACTION_NOP, 0,
fib_entry->rif);
@@ -1589,8 +1585,9 @@
u32 *p_dip = (u32 *) fib_entry->key.addr;
struct mlxsw_sp_vr *vr = fib_entry->vr;
- mlxsw_reg_ralue_pack4(ralue_pl, vr->proto, op, vr->id,
- fib_entry->key.prefix_len, *p_dip);
+ mlxsw_reg_ralue_pack4(ralue_pl,
+ (enum mlxsw_reg_ralxx_protocol) vr->proto, op,
+ vr->id, fib_entry->key.prefix_len, *p_dip);
mlxsw_reg_ralue_act_ip2me_pack(ralue_pl);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralue), ralue_pl);
}
@@ -1637,94 +1634,98 @@
MLXSW_REG_RALUE_OP_WRITE_DELETE);
}
-struct mlxsw_sp_router_fib4_add_info {
- struct switchdev_trans_item tritem;
- struct mlxsw_sp *mlxsw_sp;
- struct mlxsw_sp_fib_entry *fib_entry;
-};
-
-static void mlxsw_sp_router_fib4_add_info_destroy(void const *data)
-{
- const struct mlxsw_sp_router_fib4_add_info *info = data;
- struct mlxsw_sp_fib_entry *fib_entry = info->fib_entry;
- struct mlxsw_sp *mlxsw_sp = info->mlxsw_sp;
- struct mlxsw_sp_vr *vr = fib_entry->vr;
-
- mlxsw_sp_fib_entry_destroy(fib_entry);
- mlxsw_sp_vr_put(mlxsw_sp, vr);
- kfree(info);
-}
-
static int
mlxsw_sp_router_fib4_entry_init(struct mlxsw_sp *mlxsw_sp,
- const struct switchdev_obj_ipv4_fib *fib4,
+ const struct fib_entry_notifier_info *fen_info,
struct mlxsw_sp_fib_entry *fib_entry)
{
- struct fib_info *fi = fib4->fi;
+ struct fib_info *fi = fen_info->fi;
+ struct mlxsw_sp_rif *r;
+ int nhsel;
+ int err;
- if (fib4->type == RTN_LOCAL || fib4->type == RTN_BROADCAST) {
+ if (fen_info->type == RTN_LOCAL || fen_info->type == RTN_BROADCAST) {
fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_TRAP;
return 0;
}
- if (fib4->type != RTN_UNICAST)
+ if (fen_info->type != RTN_UNICAST)
return -EINVAL;
- if (fi->fib_scope != RT_SCOPE_UNIVERSE) {
- struct mlxsw_sp_rif *r;
+ for (nhsel = 0; nhsel < fi->fib_nhs; nhsel++) {
+ const struct fib_nh *nh = &fi->fib_nh[nhsel];
- fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_LOCAL;
- r = mlxsw_sp_rif_find_by_dev(mlxsw_sp, fi->fib_dev);
- if (!r)
- return -EINVAL;
- fib_entry->rif = r->rif;
- return 0;
+ if (!nh->nh_dev)
+ continue;
+ r = mlxsw_sp_rif_find_by_dev(mlxsw_sp, nh->nh_dev);
+ if (!r) {
+ /* In case router interface is not found for
+ * at least one of the nexthops, that means
+ * the nexthop points to some device unrelated
+ * to us. Set trap and pass the packets for
+ * this prefix to kernel.
+ */
+ fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_TRAP;
+ return 0;
+ }
}
- fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_REMOTE;
- return mlxsw_sp_nexthop_group_get(mlxsw_sp, fib_entry, fi);
+
+ if (fi->fib_scope != RT_SCOPE_UNIVERSE) {
+ fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_LOCAL;
+ fib_entry->rif = r->rif;
+ } else {
+ fib_entry->type = MLXSW_SP_FIB_ENTRY_TYPE_REMOTE;
+ err = mlxsw_sp_nexthop_group_get(mlxsw_sp, fib_entry, fi);
+ if (err)
+ return err;
+ }
+ fib_info_offload_inc(fen_info->fi);
+ return 0;
}
static void
mlxsw_sp_router_fib4_entry_fini(struct mlxsw_sp *mlxsw_sp,
struct mlxsw_sp_fib_entry *fib_entry)
{
- if (fib_entry->type != MLXSW_SP_FIB_ENTRY_TYPE_REMOTE)
- return;
- mlxsw_sp_nexthop_group_put(mlxsw_sp, fib_entry);
+ if (fib_entry->type != MLXSW_SP_FIB_ENTRY_TYPE_TRAP)
+ fib_info_offload_dec(fib_entry->fi);
+ if (fib_entry->type == MLXSW_SP_FIB_ENTRY_TYPE_REMOTE)
+ mlxsw_sp_nexthop_group_put(mlxsw_sp, fib_entry);
}
static struct mlxsw_sp_fib_entry *
mlxsw_sp_fib_entry_get(struct mlxsw_sp *mlxsw_sp,
- const struct switchdev_obj_ipv4_fib *fib4)
+ const struct fib_entry_notifier_info *fen_info)
{
struct mlxsw_sp_fib_entry *fib_entry;
- struct fib_info *fi = fib4->fi;
+ struct fib_info *fi = fen_info->fi;
struct mlxsw_sp_vr *vr;
int err;
- vr = mlxsw_sp_vr_get(mlxsw_sp, fib4->dst_len, fib4->tb_id,
+ vr = mlxsw_sp_vr_get(mlxsw_sp, fen_info->dst_len, fen_info->tb_id,
MLXSW_SP_L3_PROTO_IPV4);
if (IS_ERR(vr))
return ERR_CAST(vr);
- fib_entry = mlxsw_sp_fib_entry_lookup(vr->fib, &fib4->dst,
- sizeof(fib4->dst),
- fib4->dst_len, fi->fib_dev);
+ fib_entry = mlxsw_sp_fib_entry_lookup(vr->fib, &fen_info->dst,
+ sizeof(fen_info->dst),
+ fen_info->dst_len, fi->fib_dev);
if (fib_entry) {
/* Already exists, just take a reference */
fib_entry->ref_count++;
return fib_entry;
}
- fib_entry = mlxsw_sp_fib_entry_create(vr->fib, &fib4->dst,
- sizeof(fib4->dst),
- fib4->dst_len, fi->fib_dev);
+ fib_entry = mlxsw_sp_fib_entry_create(vr->fib, &fen_info->dst,
+ sizeof(fen_info->dst),
+ fen_info->dst_len, fi->fib_dev);
if (!fib_entry) {
err = -ENOMEM;
goto err_fib_entry_create;
}
fib_entry->vr = vr;
+ fib_entry->fi = fi;
fib_entry->ref_count = 1;
- err = mlxsw_sp_router_fib4_entry_init(mlxsw_sp, fib4, fib_entry);
+ err = mlxsw_sp_router_fib4_entry_init(mlxsw_sp, fen_info, fib_entry);
if (err)
goto err_fib4_entry_init;
@@ -1740,21 +1741,23 @@
static struct mlxsw_sp_fib_entry *
mlxsw_sp_fib_entry_find(struct mlxsw_sp *mlxsw_sp,
- const struct switchdev_obj_ipv4_fib *fib4)
+ const struct fib_entry_notifier_info *fen_info)
{
struct mlxsw_sp_vr *vr;
- vr = mlxsw_sp_vr_find(mlxsw_sp, fib4->tb_id, MLXSW_SP_L3_PROTO_IPV4);
+ vr = mlxsw_sp_vr_find(mlxsw_sp, fen_info->tb_id,
+ MLXSW_SP_L3_PROTO_IPV4);
if (!vr)
return NULL;
- return mlxsw_sp_fib_entry_lookup(vr->fib, &fib4->dst,
- sizeof(fib4->dst), fib4->dst_len,
- fib4->fi->fib_dev);
+ return mlxsw_sp_fib_entry_lookup(vr->fib, &fen_info->dst,
+ sizeof(fen_info->dst),
+ fen_info->dst_len,
+ fen_info->fi->fib_dev);
}
-void mlxsw_sp_fib_entry_put(struct mlxsw_sp *mlxsw_sp,
- struct mlxsw_sp_fib_entry *fib_entry)
+static void mlxsw_sp_fib_entry_put(struct mlxsw_sp *mlxsw_sp,
+ struct mlxsw_sp_fib_entry *fib_entry)
{
struct mlxsw_sp_vr *vr = fib_entry->vr;
@@ -1765,60 +1768,43 @@
mlxsw_sp_vr_put(mlxsw_sp, vr);
}
-static int
-mlxsw_sp_router_fib4_add_prepare(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans)
+static void mlxsw_sp_fib_entry_put_all(struct mlxsw_sp *mlxsw_sp,
+ struct mlxsw_sp_fib_entry *fib_entry)
{
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
- struct mlxsw_sp_router_fib4_add_info *info;
- struct mlxsw_sp_fib_entry *fib_entry;
- int err;
+ unsigned int last_ref_count;
- fib_entry = mlxsw_sp_fib_entry_get(mlxsw_sp, fib4);
- if (IS_ERR(fib_entry))
- return PTR_ERR(fib_entry);
-
- info = kmalloc(sizeof(*info), GFP_KERNEL);
- if (!info) {
- err = -ENOMEM;
- goto err_alloc_info;
- }
- info->mlxsw_sp = mlxsw_sp;
- info->fib_entry = fib_entry;
- switchdev_trans_item_enqueue(trans, info,
- mlxsw_sp_router_fib4_add_info_destroy,
- &info->tritem);
- return 0;
-
-err_alloc_info:
- mlxsw_sp_fib_entry_put(mlxsw_sp, fib_entry);
- return err;
+ do {
+ last_ref_count = fib_entry->ref_count;
+ mlxsw_sp_fib_entry_put(mlxsw_sp, fib_entry);
+ } while (last_ref_count != 1);
}
-static int
-mlxsw_sp_router_fib4_add_commit(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans)
+static int mlxsw_sp_router_fib4_add(struct mlxsw_sp *mlxsw_sp,
+ struct fib_entry_notifier_info *fen_info)
{
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
- struct mlxsw_sp_router_fib4_add_info *info;
struct mlxsw_sp_fib_entry *fib_entry;
struct mlxsw_sp_vr *vr;
int err;
- info = switchdev_trans_item_dequeue(trans);
- fib_entry = info->fib_entry;
- kfree(info);
+ if (mlxsw_sp->router.aborted)
+ return 0;
+
+ fib_entry = mlxsw_sp_fib_entry_get(mlxsw_sp, fen_info);
+ if (IS_ERR(fib_entry)) {
+ dev_warn(mlxsw_sp->bus_info->dev, "Failed to get FIB4 entry being added.\n");
+ return PTR_ERR(fib_entry);
+ }
if (fib_entry->ref_count != 1)
return 0;
vr = fib_entry->vr;
err = mlxsw_sp_fib_entry_insert(vr->fib, fib_entry);
- if (err)
+ if (err) {
+ dev_warn(mlxsw_sp->bus_info->dev, "Failed to insert FIB4 entry being added.\n");
goto err_fib_entry_insert;
- err = mlxsw_sp_fib_entry_update(mlxsw_sp_port->mlxsw_sp, fib_entry);
+ }
+ err = mlxsw_sp_fib_entry_update(mlxsw_sp, fib_entry);
if (err)
goto err_fib_entry_add;
return 0;
@@ -1830,24 +1816,15 @@
return err;
}
-int mlxsw_sp_router_fib4_add(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans)
+static int mlxsw_sp_router_fib4_del(struct mlxsw_sp *mlxsw_sp,
+ struct fib_entry_notifier_info *fen_info)
{
- if (switchdev_trans_ph_prepare(trans))
- return mlxsw_sp_router_fib4_add_prepare(mlxsw_sp_port,
- fib4, trans);
- return mlxsw_sp_router_fib4_add_commit(mlxsw_sp_port,
- fib4, trans);
-}
-
-int mlxsw_sp_router_fib4_del(struct mlxsw_sp_port *mlxsw_sp_port,
- const struct switchdev_obj_ipv4_fib *fib4)
-{
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
struct mlxsw_sp_fib_entry *fib_entry;
- fib_entry = mlxsw_sp_fib_entry_find(mlxsw_sp, fib4);
+ if (mlxsw_sp->router.aborted)
+ return 0;
+
+ fib_entry = mlxsw_sp_fib_entry_find(mlxsw_sp, fen_info);
if (!fib_entry) {
dev_warn(mlxsw_sp->bus_info->dev, "Failed to find FIB4 entry being removed.\n");
return -ENOENT;
@@ -1861,3 +1838,172 @@
mlxsw_sp_fib_entry_put(mlxsw_sp, fib_entry);
return 0;
}
+
+static int mlxsw_sp_router_set_abort_trap(struct mlxsw_sp *mlxsw_sp)
+{
+ char ralta_pl[MLXSW_REG_RALTA_LEN];
+ char ralst_pl[MLXSW_REG_RALST_LEN];
+ char raltb_pl[MLXSW_REG_RALTB_LEN];
+ char ralue_pl[MLXSW_REG_RALUE_LEN];
+ int err;
+
+ mlxsw_reg_ralta_pack(ralta_pl, true, MLXSW_REG_RALXX_PROTOCOL_IPV4,
+ MLXSW_SP_LPM_TREE_MIN);
+ err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralta), ralta_pl);
+ if (err)
+ return err;
+
+ mlxsw_reg_ralst_pack(ralst_pl, 0xff, MLXSW_SP_LPM_TREE_MIN);
+ err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralst), ralst_pl);
+ if (err)
+ return err;
+
+ mlxsw_reg_raltb_pack(raltb_pl, 0, MLXSW_REG_RALXX_PROTOCOL_IPV4, 0);
+ err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(raltb), raltb_pl);
+ if (err)
+ return err;
+
+ mlxsw_reg_ralue_pack4(ralue_pl, MLXSW_SP_L3_PROTO_IPV4,
+ MLXSW_REG_RALUE_OP_WRITE_WRITE, 0, 0, 0);
+ mlxsw_reg_ralue_act_ip2me_pack(ralue_pl);
+ return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(ralue), ralue_pl);
+}
+
+static void mlxsw_sp_router_fib4_abort(struct mlxsw_sp *mlxsw_sp)
+{
+ struct mlxsw_resources *resources;
+ struct mlxsw_sp_fib_entry *fib_entry;
+ struct mlxsw_sp_fib_entry *tmp;
+ struct mlxsw_sp_vr *vr;
+ int i;
+ int err;
+
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_virtual_routers; i++) {
+ vr = &mlxsw_sp->router.vrs[i];
+ if (!vr->used)
+ continue;
+
+ list_for_each_entry_safe(fib_entry, tmp,
+ &vr->fib->entry_list, list) {
+ bool do_break = &tmp->list == &vr->fib->entry_list;
+
+ mlxsw_sp_fib_entry_del(mlxsw_sp, fib_entry);
+ mlxsw_sp_fib_entry_remove(fib_entry->vr->fib,
+ fib_entry);
+ mlxsw_sp_fib_entry_put_all(mlxsw_sp, fib_entry);
+ if (do_break)
+ break;
+ }
+ }
+ mlxsw_sp->router.aborted = true;
+ err = mlxsw_sp_router_set_abort_trap(mlxsw_sp);
+ if (err)
+ dev_warn(mlxsw_sp->bus_info->dev, "Failed to set abort trap.\n");
+}
+
+static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
+{
+ struct mlxsw_resources *resources;
+ char rgcr_pl[MLXSW_REG_RGCR_LEN];
+ int err;
+
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ if (!resources->max_rif_valid)
+ return -EIO;
+
+ mlxsw_sp->rifs = kcalloc(resources->max_rif,
+ sizeof(struct mlxsw_sp_rif *), GFP_KERNEL);
+ if (!mlxsw_sp->rifs)
+ return -ENOMEM;
+
+ mlxsw_reg_rgcr_pack(rgcr_pl, true);
+ mlxsw_reg_rgcr_max_router_interfaces_set(rgcr_pl, resources->max_rif);
+ err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rgcr), rgcr_pl);
+ if (err)
+ goto err_rgcr_fail;
+
+ return 0;
+
+err_rgcr_fail:
+ kfree(mlxsw_sp->rifs);
+ return err;
+}
+
+static void __mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp)
+{
+ struct mlxsw_resources *resources;
+ char rgcr_pl[MLXSW_REG_RGCR_LEN];
+ int i;
+
+ mlxsw_reg_rgcr_pack(rgcr_pl, false);
+ mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(rgcr), rgcr_pl);
+
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_rif; i++)
+ WARN_ON_ONCE(mlxsw_sp->rifs[i]);
+
+ kfree(mlxsw_sp->rifs);
+}
+
+static int mlxsw_sp_router_fib_event(struct notifier_block *nb,
+ unsigned long event, void *ptr)
+{
+ struct mlxsw_sp *mlxsw_sp = container_of(nb, struct mlxsw_sp, fib_nb);
+ struct fib_entry_notifier_info *fen_info = ptr;
+ int err;
+
+ switch (event) {
+ case FIB_EVENT_ENTRY_ADD:
+ err = mlxsw_sp_router_fib4_add(mlxsw_sp, fen_info);
+ if (err)
+ mlxsw_sp_router_fib4_abort(mlxsw_sp);
+ break;
+ case FIB_EVENT_ENTRY_DEL:
+ mlxsw_sp_router_fib4_del(mlxsw_sp, fen_info);
+ break;
+ case FIB_EVENT_RULE_ADD: /* fall through */
+ case FIB_EVENT_RULE_DEL:
+ mlxsw_sp_router_fib4_abort(mlxsw_sp);
+ break;
+ }
+ return NOTIFY_DONE;
+}
+
+int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
+{
+ int err;
+
+ INIT_LIST_HEAD(&mlxsw_sp->router.nexthop_neighs_list);
+ INIT_LIST_HEAD(&mlxsw_sp->router.nexthop_group_list);
+ err = __mlxsw_sp_router_init(mlxsw_sp);
+ if (err)
+ return err;
+
+ mlxsw_sp_lpm_init(mlxsw_sp);
+ err = mlxsw_sp_vrs_init(mlxsw_sp);
+ if (err)
+ goto err_vrs_init;
+
+ err = mlxsw_sp_neigh_init(mlxsw_sp);
+ if (err)
+ goto err_neigh_init;
+
+ mlxsw_sp->fib_nb.notifier_call = mlxsw_sp_router_fib_event;
+ register_fib_notifier(&mlxsw_sp->fib_nb);
+ return 0;
+
+err_neigh_init:
+ mlxsw_sp_vrs_fini(mlxsw_sp);
+err_vrs_init:
+ __mlxsw_sp_router_fini(mlxsw_sp);
+ return err;
+}
+
+void mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp)
+{
+ unregister_fib_notifier(&mlxsw_sp->fib_nb);
+ mlxsw_sp_neigh_fini(mlxsw_sp);
+ mlxsw_sp_vrs_fini(mlxsw_sp);
+ __mlxsw_sp_router_fini(mlxsw_sp);
+}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 7186c48..5e00c79 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1044,11 +1044,6 @@
SWITCHDEV_OBJ_PORT_VLAN(obj),
trans);
break;
- case SWITCHDEV_OBJ_ID_IPV4_FIB:
- err = mlxsw_sp_router_fib4_add(mlxsw_sp_port,
- SWITCHDEV_OBJ_IPV4_FIB(obj),
- trans);
- break;
case SWITCHDEV_OBJ_ID_PORT_FDB:
err = mlxsw_sp_port_fdb_static_add(mlxsw_sp_port,
SWITCHDEV_OBJ_PORT_FDB(obj),
@@ -1181,10 +1176,6 @@
err = mlxsw_sp_port_vlans_del(mlxsw_sp_port,
SWITCHDEV_OBJ_PORT_VLAN(obj));
break;
- case SWITCHDEV_OBJ_ID_IPV4_FIB:
- err = mlxsw_sp_router_fib4_del(mlxsw_sp_port,
- SWITCHDEV_OBJ_IPV4_FIB(obj));
- break;
case SWITCHDEV_OBJ_ID_PORT_FDB:
err = mlxsw_sp_port_fdb_static_del(mlxsw_sp_port,
SWITCHDEV_OBJ_PORT_FDB(obj));
@@ -1205,9 +1196,11 @@
u16 lag_id)
{
struct mlxsw_sp_port *mlxsw_sp_port;
+ struct mlxsw_resources *resources;
int i;
- for (i = 0; i < MLXSW_SP_PORT_PER_LAG_MAX; i++) {
+ resources = mlxsw_core_resources_get(mlxsw_sp->core);
+ for (i = 0; i < resources->max_ports_in_lag; i++) {
mlxsw_sp_port = mlxsw_sp_port_lagged_get(mlxsw_sp, lag_id, i);
if (mlxsw_sp_port)
return mlxsw_sp_port;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index 377daa4..8b15bf0 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -1512,10 +1512,6 @@
static struct mlxsw_config_profile mlxsw_sx_config_profile = {
.used_max_vepa_channels = 1,
.max_vepa_channels = 0,
- .used_max_lag = 1,
- .max_lag = 64,
- .used_max_port_per_lag = 1,
- .max_port_per_lag = 16,
.used_max_mid = 1,
.max_mid = 7000,
.used_max_pgt = 1,
diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile
index 6817881..0efb2ba 100644
--- a/drivers/net/ethernet/netronome/nfp/Makefile
+++ b/drivers/net/ethernet/netronome/nfp/Makefile
@@ -3,6 +3,13 @@
nfp_netvf-objs := \
nfp_net_common.o \
nfp_net_ethtool.o \
+ nfp_net_offload.o \
nfp_netvf_main.o
+ifeq ($(CONFIG_BPF_SYSCALL),y)
+nfp_netvf-objs += \
+ nfp_bpf_verifier.o \
+ nfp_bpf_jit.o
+endif
+
nfp_netvf-$(CONFIG_NFP_NET_DEBUG) += nfp_net_debugfs.o
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
new file mode 100644
index 0000000..22484b6
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -0,0 +1,233 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below. You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __NFP_ASM_H__
+#define __NFP_ASM_H__ 1
+
+#include "nfp_bpf.h"
+
+#define REG_NONE 0
+
+#define RE_REG_NO_DST 0x020
+#define RE_REG_IMM 0x020
+#define RE_REG_IMM_encode(x) \
+ (RE_REG_IMM | ((x) & 0x1f) | (((x) & 0x60) << 1))
+#define RE_REG_IMM_MAX 0x07fULL
+#define RE_REG_XFR 0x080
+
+#define UR_REG_XFR 0x180
+#define UR_REG_NN 0x280
+#define UR_REG_NO_DST 0x300
+#define UR_REG_IMM UR_REG_NO_DST
+#define UR_REG_IMM_encode(x) (UR_REG_IMM | (x))
+#define UR_REG_IMM_MAX 0x0ffULL
+
+#define OP_BR_BASE 0x0d800000020ULL
+#define OP_BR_BASE_MASK 0x0f8000c3ce0ULL
+#define OP_BR_MASK 0x0000000001fULL
+#define OP_BR_EV_PIP 0x00000000300ULL
+#define OP_BR_CSS 0x0000003c000ULL
+#define OP_BR_DEFBR 0x00000300000ULL
+#define OP_BR_ADDR_LO 0x007ffc00000ULL
+#define OP_BR_ADDR_HI 0x10000000000ULL
+
+#define nfp_is_br(_insn) \
+ (((_insn) & OP_BR_BASE_MASK) == OP_BR_BASE)
+
+enum br_mask {
+ BR_BEQ = 0x00,
+ BR_BNE = 0x01,
+ BR_BHS = 0x04,
+ BR_BLO = 0x05,
+ BR_BGE = 0x08,
+ BR_UNC = 0x18,
+};
+
+enum br_ev_pip {
+ BR_EV_PIP_UNCOND = 0,
+ BR_EV_PIP_COND = 1,
+};
+
+enum br_ctx_signal_state {
+ BR_CSS_NONE = 2,
+};
+
+#define OP_BBYTE_BASE 0x0c800000000ULL
+#define OP_BB_A_SRC 0x000000000ffULL
+#define OP_BB_BYTE 0x00000000300ULL
+#define OP_BB_B_SRC 0x0000003fc00ULL
+#define OP_BB_I8 0x00000040000ULL
+#define OP_BB_EQ 0x00000080000ULL
+#define OP_BB_DEFBR 0x00000300000ULL
+#define OP_BB_ADDR_LO 0x007ffc00000ULL
+#define OP_BB_ADDR_HI 0x10000000000ULL
+
+#define OP_BALU_BASE 0x0e800000000ULL
+#define OP_BA_A_SRC 0x000000003ffULL
+#define OP_BA_B_SRC 0x000000ffc00ULL
+#define OP_BA_DEFBR 0x00000300000ULL
+#define OP_BA_ADDR_HI 0x0007fc00000ULL
+
+#define OP_IMMED_A_SRC 0x000000003ffULL
+#define OP_IMMED_B_SRC 0x000000ffc00ULL
+#define OP_IMMED_IMM 0x0000ff00000ULL
+#define OP_IMMED_WIDTH 0x00060000000ULL
+#define OP_IMMED_INV 0x00080000000ULL
+#define OP_IMMED_SHIFT 0x00600000000ULL
+#define OP_IMMED_BASE 0x0f000000000ULL
+#define OP_IMMED_WR_AB 0x20000000000ULL
+
+enum immed_width {
+ IMMED_WIDTH_ALL = 0,
+ IMMED_WIDTH_BYTE = 1,
+ IMMED_WIDTH_WORD = 2,
+};
+
+enum immed_shift {
+ IMMED_SHIFT_0B = 0,
+ IMMED_SHIFT_1B = 1,
+ IMMED_SHIFT_2B = 2,
+};
+
+#define OP_SHF_BASE 0x08000000000ULL
+#define OP_SHF_A_SRC 0x000000000ffULL
+#define OP_SHF_SC 0x00000000300ULL
+#define OP_SHF_B_SRC 0x0000003fc00ULL
+#define OP_SHF_I8 0x00000040000ULL
+#define OP_SHF_SW 0x00000080000ULL
+#define OP_SHF_DST 0x0000ff00000ULL
+#define OP_SHF_SHIFT 0x001f0000000ULL
+#define OP_SHF_OP 0x00e00000000ULL
+#define OP_SHF_DST_AB 0x01000000000ULL
+#define OP_SHF_WR_AB 0x20000000000ULL
+
+enum shf_op {
+ SHF_OP_NONE = 0,
+ SHF_OP_AND = 2,
+ SHF_OP_OR = 5,
+};
+
+enum shf_sc {
+ SHF_SC_R_ROT = 0,
+ SHF_SC_R_SHF = 1,
+ SHF_SC_L_SHF = 2,
+ SHF_SC_R_DSHF = 3,
+};
+
+#define OP_ALU_A_SRC 0x000000003ffULL
+#define OP_ALU_B_SRC 0x000000ffc00ULL
+#define OP_ALU_DST 0x0003ff00000ULL
+#define OP_ALU_SW 0x00040000000ULL
+#define OP_ALU_OP 0x00f80000000ULL
+#define OP_ALU_DST_AB 0x01000000000ULL
+#define OP_ALU_BASE 0x0a000000000ULL
+#define OP_ALU_WR_AB 0x20000000000ULL
+
+enum alu_op {
+ ALU_OP_NONE = 0x00,
+ ALU_OP_ADD = 0x01,
+ ALU_OP_NEG = 0x04,
+ ALU_OP_AND = 0x08,
+ ALU_OP_SUB_C = 0x0d,
+ ALU_OP_ADD_C = 0x11,
+ ALU_OP_OR = 0x14,
+ ALU_OP_SUB = 0x15,
+ ALU_OP_XOR = 0x18,
+};
+
+enum alu_dst_ab {
+ ALU_DST_A = 0,
+ ALU_DST_B = 1,
+};
+
+#define OP_LDF_BASE 0x0c000000000ULL
+#define OP_LDF_A_SRC 0x000000000ffULL
+#define OP_LDF_SC 0x00000000300ULL
+#define OP_LDF_B_SRC 0x0000003fc00ULL
+#define OP_LDF_I8 0x00000040000ULL
+#define OP_LDF_SW 0x00000080000ULL
+#define OP_LDF_ZF 0x00000100000ULL
+#define OP_LDF_BMASK 0x0000f000000ULL
+#define OP_LDF_SHF 0x001f0000000ULL
+#define OP_LDF_WR_AB 0x20000000000ULL
+
+#define OP_CMD_A_SRC 0x000000000ffULL
+#define OP_CMD_CTX 0x00000000300ULL
+#define OP_CMD_B_SRC 0x0000003fc00ULL
+#define OP_CMD_TOKEN 0x000000c0000ULL
+#define OP_CMD_XFER 0x00001f00000ULL
+#define OP_CMD_CNT 0x0000e000000ULL
+#define OP_CMD_SIG 0x000f0000000ULL
+#define OP_CMD_TGT_CMD 0x07f00000000ULL
+#define OP_CMD_MODE 0x1c0000000000ULL
+
+struct cmd_tgt_act {
+ u8 token;
+ u8 tgt_cmd;
+};
+
+enum cmd_tgt_map {
+ CMD_TGT_READ8,
+ CMD_TGT_WRITE8,
+ CMD_TGT_READ_LE,
+ CMD_TGT_READ_SWAP_LE,
+ __CMD_TGT_MAP_SIZE,
+};
+
+enum cmd_mode {
+ CMD_MODE_40b_AB = 0,
+ CMD_MODE_40b_BA = 1,
+ CMD_MODE_32b = 4,
+};
+
+enum cmd_ctx_swap {
+ CMD_CTX_SWAP = 0,
+ CMD_CTX_NO_SWAP = 3,
+};
+
+#define OP_LCSR_BASE 0x0fc00000000ULL
+#define OP_LCSR_A_SRC 0x000000003ffULL
+#define OP_LCSR_B_SRC 0x000000ffc00ULL
+#define OP_LCSR_WRITE 0x00000200000ULL
+#define OP_LCSR_ADDR 0x001ffc00000ULL
+
+enum lcsr_wr_src {
+ LCSR_WR_AREG,
+ LCSR_WR_BREG,
+ LCSR_WR_IMM,
+};
+
+#define OP_CARB_BASE 0x0e000000000ULL
+#define OP_CARB_OR 0x00000010000ULL
+
+#endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf.h b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
new file mode 100644
index 0000000..87aa8a3
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf.h
@@ -0,0 +1,202 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below. You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __NFP_BPF_H__
+#define __NFP_BPF_H__ 1
+
+#include <linux/bitfield.h>
+#include <linux/bpf.h>
+#include <linux/list.h>
+#include <linux/types.h>
+
+#define FIELD_FIT(mask, val) (!((((u64)val) << __bf_shf(mask)) & ~(mask)))
+
+/* For branch fixup logic use up-most byte of branch instruction as scratch
+ * area. Remember to clear this before sending instructions to HW!
+ */
+#define OP_BR_SPECIAL 0xff00000000000000ULL
+
+enum br_special {
+ OP_BR_NORMAL = 0,
+ OP_BR_GO_OUT,
+ OP_BR_GO_ABORT,
+};
+
+enum static_regs {
+ STATIC_REG_PKT = 1,
+#define REG_PKT_BANK ALU_DST_A
+ STATIC_REG_IMM = 2, /* Bank AB */
+};
+
+enum nfp_bpf_action_type {
+ NN_ACT_TC_DROP,
+ NN_ACT_TC_REDIR,
+ NN_ACT_DIRECT,
+};
+
+/* Software register representation, hardware encoding in asm.h */
+#define NN_REG_TYPE GENMASK(31, 24)
+#define NN_REG_VAL GENMASK(7, 0)
+
+enum nfp_bpf_reg_type {
+ NN_REG_GPR_A = BIT(0),
+ NN_REG_GPR_B = BIT(1),
+ NN_REG_NNR = BIT(2),
+ NN_REG_XFER = BIT(3),
+ NN_REG_IMM = BIT(4),
+ NN_REG_NONE = BIT(5),
+};
+
+#define NN_REG_GPR_BOTH (NN_REG_GPR_A | NN_REG_GPR_B)
+
+#define reg_both(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_BOTH))
+#define reg_a(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_A))
+#define reg_b(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_GPR_B))
+#define reg_nnr(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_NNR))
+#define reg_xfer(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_XFER))
+#define reg_imm(x) ((x) | FIELD_PREP(NN_REG_TYPE, NN_REG_IMM))
+#define reg_none() (FIELD_PREP(NN_REG_TYPE, NN_REG_NONE))
+
+#define pkt_reg(np) reg_a((np)->regs_per_thread - STATIC_REG_PKT)
+#define imm_a(np) reg_a((np)->regs_per_thread - STATIC_REG_IMM)
+#define imm_b(np) reg_b((np)->regs_per_thread - STATIC_REG_IMM)
+#define imm_both(np) reg_both((np)->regs_per_thread - STATIC_REG_IMM)
+
+#define NFP_BPF_ABI_FLAGS reg_nnr(0)
+#define NFP_BPF_ABI_FLAG_MARK 1
+#define NFP_BPF_ABI_MARK reg_nnr(1)
+#define NFP_BPF_ABI_PKT reg_nnr(2)
+#define NFP_BPF_ABI_LEN reg_nnr(3)
+
+struct nfp_prog;
+struct nfp_insn_meta;
+typedef int (*instr_cb_t)(struct nfp_prog *, struct nfp_insn_meta *);
+
+#define nfp_prog_first_meta(nfp_prog) \
+ list_first_entry(&(nfp_prog)->insns, struct nfp_insn_meta, l)
+#define nfp_prog_last_meta(nfp_prog) \
+ list_last_entry(&(nfp_prog)->insns, struct nfp_insn_meta, l)
+#define nfp_meta_next(meta) list_next_entry(meta, l)
+#define nfp_meta_prev(meta) list_prev_entry(meta, l)
+
+/**
+ * struct nfp_insn_meta - BPF instruction wrapper
+ * @insn: BPF instruction
+ * @off: index of first generated machine instruction (in nfp_prog.prog)
+ * @n: eBPF instruction number
+ * @skip: skip this instruction (optimized out)
+ * @double_cb: callback for second part of the instruction
+ * @l: link on nfp_prog->insns list
+ */
+struct nfp_insn_meta {
+ struct bpf_insn insn;
+ unsigned int off;
+ unsigned short n;
+ bool skip;
+ instr_cb_t double_cb;
+
+ struct list_head l;
+};
+
+#define BPF_SIZE_MASK 0x18
+
+static inline u8 mbpf_class(const struct nfp_insn_meta *meta)
+{
+ return BPF_CLASS(meta->insn.code);
+}
+
+static inline u8 mbpf_src(const struct nfp_insn_meta *meta)
+{
+ return BPF_SRC(meta->insn.code);
+}
+
+static inline u8 mbpf_op(const struct nfp_insn_meta *meta)
+{
+ return BPF_OP(meta->insn.code);
+}
+
+static inline u8 mbpf_mode(const struct nfp_insn_meta *meta)
+{
+ return BPF_MODE(meta->insn.code);
+}
+
+/**
+ * struct nfp_prog - nfp BPF program
+ * @prog: machine code
+ * @prog_len: number of valid instructions in @prog array
+ * @__prog_alloc_len: alloc size of @prog array
+ * @act: BPF program/action type (TC DA, TC with action, XDP etc.)
+ * @num_regs: number of registers used by this program
+ * @regs_per_thread: number of basic registers allocated per thread
+ * @start_off: address of the first instruction in the memory
+ * @tgt_out: jump target for normal exit
+ * @tgt_abort: jump target for abort (e.g. access outside of packet buffer)
+ * @tgt_done: jump target to get the next packet
+ * @n_translated: number of successfully translated instructions (for errors)
+ * @error: error code if something went wrong
+ * @insns: list of BPF instruction wrappers (struct nfp_insn_meta)
+ */
+struct nfp_prog {
+ u64 *prog;
+ unsigned int prog_len;
+ unsigned int __prog_alloc_len;
+
+ enum nfp_bpf_action_type act;
+
+ unsigned int num_regs;
+ unsigned int regs_per_thread;
+
+ unsigned int start_off;
+ unsigned int tgt_out;
+ unsigned int tgt_abort;
+ unsigned int tgt_done;
+
+ unsigned int n_translated;
+ int error;
+
+ struct list_head insns;
+};
+
+struct nfp_bpf_result {
+ unsigned int n_instr;
+ bool dense_mode;
+};
+
+int
+nfp_bpf_jit(struct bpf_prog *filter, void *prog, enum nfp_bpf_action_type act,
+ unsigned int prog_start, unsigned int prog_done,
+ unsigned int prog_sz, struct nfp_bpf_result *res);
+
+int nfp_prog_verify(struct nfp_prog *nfp_prog, struct bpf_prog *prog);
+
+#endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
new file mode 100644
index 0000000..3de819a
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -0,0 +1,1811 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below. You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define pr_fmt(fmt) "NFP net bpf: " fmt
+
+#include <linux/kernel.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/pkt_cls.h>
+#include <linux/unistd.h>
+
+#include "nfp_asm.h"
+#include "nfp_bpf.h"
+
+/* --- NFP prog --- */
+/* Foreach "multiple" entries macros provide pos and next<n> pointers.
+ * It's safe to modify the next pointers (but not pos).
+ */
+#define nfp_for_each_insn_walk2(nfp_prog, pos, next) \
+ for (pos = list_first_entry(&(nfp_prog)->insns, typeof(*pos), l), \
+ next = list_next_entry(pos, l); \
+ &(nfp_prog)->insns != &pos->l && \
+ &(nfp_prog)->insns != &next->l; \
+ pos = nfp_meta_next(pos), \
+ next = nfp_meta_next(pos))
+
+#define nfp_for_each_insn_walk3(nfp_prog, pos, next, next2) \
+ for (pos = list_first_entry(&(nfp_prog)->insns, typeof(*pos), l), \
+ next = list_next_entry(pos, l), \
+ next2 = list_next_entry(next, l); \
+ &(nfp_prog)->insns != &pos->l && \
+ &(nfp_prog)->insns != &next->l && \
+ &(nfp_prog)->insns != &next2->l; \
+ pos = nfp_meta_next(pos), \
+ next = nfp_meta_next(pos), \
+ next2 = nfp_meta_next(next))
+
+static bool
+nfp_meta_has_next(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return meta->l.next != &nfp_prog->insns;
+}
+
+static bool
+nfp_meta_has_prev(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return meta->l.prev != &nfp_prog->insns;
+}
+
+static void nfp_prog_free(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta, *tmp;
+
+ list_for_each_entry_safe(meta, tmp, &nfp_prog->insns, l) {
+ list_del(&meta->l);
+ kfree(meta);
+ }
+ kfree(nfp_prog);
+}
+
+static void nfp_prog_push(struct nfp_prog *nfp_prog, u64 insn)
+{
+ if (nfp_prog->__prog_alloc_len == nfp_prog->prog_len) {
+ nfp_prog->error = -ENOSPC;
+ return;
+ }
+
+ nfp_prog->prog[nfp_prog->prog_len] = insn;
+ nfp_prog->prog_len++;
+}
+
+static unsigned int nfp_prog_current_offset(struct nfp_prog *nfp_prog)
+{
+ return nfp_prog->start_off + nfp_prog->prog_len;
+}
+
+static unsigned int
+nfp_prog_offset_to_index(struct nfp_prog *nfp_prog, unsigned int offset)
+{
+ return offset - nfp_prog->start_off;
+}
+
+/* --- SW reg --- */
+struct nfp_insn_ur_regs {
+ enum alu_dst_ab dst_ab;
+ u16 dst;
+ u16 areg, breg;
+ bool swap;
+ bool wr_both;
+};
+
+struct nfp_insn_re_regs {
+ enum alu_dst_ab dst_ab;
+ u8 dst;
+ u8 areg, breg;
+ bool swap;
+ bool wr_both;
+ bool i8;
+};
+
+static u16 nfp_swreg_to_unreg(u32 swreg, bool is_dst)
+{
+ u16 val = FIELD_GET(NN_REG_VAL, swreg);
+
+ switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+ case NN_REG_GPR_A:
+ case NN_REG_GPR_B:
+ case NN_REG_GPR_BOTH:
+ return val;
+ case NN_REG_NNR:
+ return UR_REG_NN | val;
+ case NN_REG_XFER:
+ return UR_REG_XFR | val;
+ case NN_REG_IMM:
+ if (val & ~0xff) {
+ pr_err("immediate too large\n");
+ return 0;
+ }
+ return UR_REG_IMM_encode(val);
+ case NN_REG_NONE:
+ return is_dst ? UR_REG_NO_DST : REG_NONE;
+ default:
+ pr_err("unrecognized reg encoding %08x\n", swreg);
+ return 0;
+ }
+}
+
+static int
+swreg_to_unrestricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_ur_regs *reg)
+{
+ memset(reg, 0, sizeof(*reg));
+
+ /* Decode destination */
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+ return -EFAULT;
+
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+ reg->dst_ab = ALU_DST_B;
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+ reg->wr_both = true;
+ reg->dst = nfp_swreg_to_unreg(dst, true);
+
+ /* Decode source operands */
+ if (FIELD_GET(NN_REG_TYPE, lreg) == FIELD_GET(NN_REG_TYPE, rreg))
+ return -EFAULT;
+
+ if (FIELD_GET(NN_REG_TYPE, lreg) == NN_REG_GPR_B ||
+ FIELD_GET(NN_REG_TYPE, rreg) == NN_REG_GPR_A) {
+ reg->areg = nfp_swreg_to_unreg(rreg, false);
+ reg->breg = nfp_swreg_to_unreg(lreg, false);
+ reg->swap = true;
+ } else {
+ reg->areg = nfp_swreg_to_unreg(lreg, false);
+ reg->breg = nfp_swreg_to_unreg(rreg, false);
+ }
+
+ return 0;
+}
+
+static u16 nfp_swreg_to_rereg(u32 swreg, bool is_dst, bool has_imm8, bool *i8)
+{
+ u16 val = FIELD_GET(NN_REG_VAL, swreg);
+
+ switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+ case NN_REG_GPR_A:
+ case NN_REG_GPR_B:
+ case NN_REG_GPR_BOTH:
+ return val;
+ case NN_REG_XFER:
+ return RE_REG_XFR | val;
+ case NN_REG_IMM:
+ if (val & ~(0x7f | has_imm8 << 7)) {
+ pr_err("immediate too large\n");
+ return 0;
+ }
+ *i8 = val & 0x80;
+ return RE_REG_IMM_encode(val & 0x7f);
+ case NN_REG_NONE:
+ return is_dst ? RE_REG_NO_DST : REG_NONE;
+ default:
+ pr_err("unrecognized reg encoding\n");
+ return 0;
+ }
+}
+
+static int
+swreg_to_restricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_re_regs *reg,
+ bool has_imm8)
+{
+ memset(reg, 0, sizeof(*reg));
+
+ /* Decode destination */
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+ return -EFAULT;
+
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+ reg->dst_ab = ALU_DST_B;
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+ reg->wr_both = true;
+ reg->dst = nfp_swreg_to_rereg(dst, true, false, NULL);
+
+ /* Decode source operands */
+ if (FIELD_GET(NN_REG_TYPE, lreg) == FIELD_GET(NN_REG_TYPE, rreg))
+ return -EFAULT;
+
+ if (FIELD_GET(NN_REG_TYPE, lreg) == NN_REG_GPR_B ||
+ FIELD_GET(NN_REG_TYPE, rreg) == NN_REG_GPR_A) {
+ reg->areg = nfp_swreg_to_rereg(rreg, false, has_imm8, ®->i8);
+ reg->breg = nfp_swreg_to_rereg(lreg, false, has_imm8, ®->i8);
+ reg->swap = true;
+ } else {
+ reg->areg = nfp_swreg_to_rereg(lreg, false, has_imm8, ®->i8);
+ reg->breg = nfp_swreg_to_rereg(rreg, false, has_imm8, ®->i8);
+ }
+
+ return 0;
+}
+
+/* --- Emitters --- */
+static const struct cmd_tgt_act cmd_tgt_act[__CMD_TGT_MAP_SIZE] = {
+ [CMD_TGT_WRITE8] = { 0x00, 0x42 },
+ [CMD_TGT_READ8] = { 0x01, 0x43 },
+ [CMD_TGT_READ_LE] = { 0x01, 0x40 },
+ [CMD_TGT_READ_SWAP_LE] = { 0x03, 0x40 },
+};
+
+static void
+__emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
+ u8 mode, u8 xfer, u8 areg, u8 breg, u8 size, bool sync)
+{
+ enum cmd_ctx_swap ctx;
+ u64 insn;
+
+ if (sync)
+ ctx = CMD_CTX_SWAP;
+ else
+ ctx = CMD_CTX_NO_SWAP;
+
+ insn = FIELD_PREP(OP_CMD_A_SRC, areg) |
+ FIELD_PREP(OP_CMD_CTX, ctx) |
+ FIELD_PREP(OP_CMD_B_SRC, breg) |
+ FIELD_PREP(OP_CMD_TOKEN, cmd_tgt_act[op].token) |
+ FIELD_PREP(OP_CMD_XFER, xfer) |
+ FIELD_PREP(OP_CMD_CNT, size) |
+ FIELD_PREP(OP_CMD_SIG, sync) |
+ FIELD_PREP(OP_CMD_TGT_CMD, cmd_tgt_act[op].tgt_cmd) |
+ FIELD_PREP(OP_CMD_MODE, mode);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
+ u8 mode, u8 xfer, u32 lreg, u32 rreg, u8 size, bool sync)
+{
+ struct nfp_insn_re_regs reg;
+ int err;
+
+ err = swreg_to_restricted(reg_none(), lreg, rreg, ®, false);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+ if (reg.swap) {
+ pr_err("cmd can't swap arguments\n");
+ nfp_prog->error = -EFAULT;
+ return;
+ }
+
+ __emit_cmd(nfp_prog, op, mode, xfer, reg.areg, reg.breg, size, sync);
+}
+
+static void
+__emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, enum br_ev_pip ev_pip,
+ enum br_ctx_signal_state css, u16 addr, u8 defer)
+{
+ u16 addr_lo, addr_hi;
+ u64 insn;
+
+ addr_lo = addr & (OP_BR_ADDR_LO >> __bf_shf(OP_BR_ADDR_LO));
+ addr_hi = addr != addr_lo;
+
+ insn = OP_BR_BASE |
+ FIELD_PREP(OP_BR_MASK, mask) |
+ FIELD_PREP(OP_BR_EV_PIP, ev_pip) |
+ FIELD_PREP(OP_BR_CSS, css) |
+ FIELD_PREP(OP_BR_DEFBR, defer) |
+ FIELD_PREP(OP_BR_ADDR_LO, addr_lo) |
+ FIELD_PREP(OP_BR_ADDR_HI, addr_hi);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void emit_br_def(struct nfp_prog *nfp_prog, u16 addr, u8 defer)
+{
+ if (defer > 2) {
+ pr_err("BUG: branch defer out of bounds %d\n", defer);
+ nfp_prog->error = -EFAULT;
+ return;
+ }
+ __emit_br(nfp_prog, BR_UNC, BR_EV_PIP_UNCOND, BR_CSS_NONE, addr, defer);
+}
+
+static void
+emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, u16 addr, u8 defer)
+{
+ __emit_br(nfp_prog, mask,
+ mask != BR_UNC ? BR_EV_PIP_COND : BR_EV_PIP_UNCOND,
+ BR_CSS_NONE, addr, defer);
+}
+
+static void
+__emit_br_byte(struct nfp_prog *nfp_prog, u8 areg, u8 breg, bool imm8,
+ u8 byte, bool equal, u16 addr, u8 defer)
+{
+ u16 addr_lo, addr_hi;
+ u64 insn;
+
+ addr_lo = addr & (OP_BB_ADDR_LO >> __bf_shf(OP_BB_ADDR_LO));
+ addr_hi = addr != addr_lo;
+
+ insn = OP_BBYTE_BASE |
+ FIELD_PREP(OP_BB_A_SRC, areg) |
+ FIELD_PREP(OP_BB_BYTE, byte) |
+ FIELD_PREP(OP_BB_B_SRC, breg) |
+ FIELD_PREP(OP_BB_I8, imm8) |
+ FIELD_PREP(OP_BB_EQ, equal) |
+ FIELD_PREP(OP_BB_DEFBR, defer) |
+ FIELD_PREP(OP_BB_ADDR_LO, addr_lo) |
+ FIELD_PREP(OP_BB_ADDR_HI, addr_hi);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_br_byte_neq(struct nfp_prog *nfp_prog,
+ u32 dst, u8 imm, u8 byte, u16 addr, u8 defer)
+{
+ struct nfp_insn_re_regs reg;
+ int err;
+
+ err = swreg_to_restricted(reg_none(), dst, reg_imm(imm), ®, true);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+
+ __emit_br_byte(nfp_prog, reg.areg, reg.breg, reg.i8, byte, false, addr,
+ defer);
+}
+
+static void
+__emit_immed(struct nfp_prog *nfp_prog, u16 areg, u16 breg, u16 imm_hi,
+ enum immed_width width, bool invert,
+ enum immed_shift shift, bool wr_both)
+{
+ u64 insn;
+
+ insn = OP_IMMED_BASE |
+ FIELD_PREP(OP_IMMED_A_SRC, areg) |
+ FIELD_PREP(OP_IMMED_B_SRC, breg) |
+ FIELD_PREP(OP_IMMED_IMM, imm_hi) |
+ FIELD_PREP(OP_IMMED_WIDTH, width) |
+ FIELD_PREP(OP_IMMED_INV, invert) |
+ FIELD_PREP(OP_IMMED_SHIFT, shift) |
+ FIELD_PREP(OP_IMMED_WR_AB, wr_both);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_immed(struct nfp_prog *nfp_prog, u32 dst, u16 imm,
+ enum immed_width width, bool invert, enum immed_shift shift)
+{
+ struct nfp_insn_ur_regs reg;
+ int err;
+
+ if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM) {
+ nfp_prog->error = -EFAULT;
+ return;
+ }
+
+ err = swreg_to_unrestricted(dst, dst, reg_imm(imm & 0xff), ®);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+
+ __emit_immed(nfp_prog, reg.areg, reg.breg, imm >> 8, width,
+ invert, shift, reg.wr_both);
+}
+
+static void
+__emit_shf(struct nfp_prog *nfp_prog, u16 dst, enum alu_dst_ab dst_ab,
+ enum shf_sc sc, u8 shift,
+ u16 areg, enum shf_op op, u16 breg, bool i8, bool sw, bool wr_both)
+{
+ u64 insn;
+
+ if (!FIELD_FIT(OP_SHF_SHIFT, shift)) {
+ nfp_prog->error = -EFAULT;
+ return;
+ }
+
+ if (sc == SHF_SC_L_SHF)
+ shift = 32 - shift;
+
+ insn = OP_SHF_BASE |
+ FIELD_PREP(OP_SHF_A_SRC, areg) |
+ FIELD_PREP(OP_SHF_SC, sc) |
+ FIELD_PREP(OP_SHF_B_SRC, breg) |
+ FIELD_PREP(OP_SHF_I8, i8) |
+ FIELD_PREP(OP_SHF_SW, sw) |
+ FIELD_PREP(OP_SHF_DST, dst) |
+ FIELD_PREP(OP_SHF_SHIFT, shift) |
+ FIELD_PREP(OP_SHF_OP, op) |
+ FIELD_PREP(OP_SHF_DST_AB, dst_ab) |
+ FIELD_PREP(OP_SHF_WR_AB, wr_both);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_shf(struct nfp_prog *nfp_prog, u32 dst, u32 lreg, enum shf_op op, u32 rreg,
+ enum shf_sc sc, u8 shift)
+{
+ struct nfp_insn_re_regs reg;
+ int err;
+
+ err = swreg_to_restricted(dst, lreg, rreg, ®, true);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+
+ __emit_shf(nfp_prog, reg.dst, reg.dst_ab, sc, shift,
+ reg.areg, op, reg.breg, reg.i8, reg.swap, reg.wr_both);
+}
+
+static void
+__emit_alu(struct nfp_prog *nfp_prog, u16 dst, enum alu_dst_ab dst_ab,
+ u16 areg, enum alu_op op, u16 breg, bool swap, bool wr_both)
+{
+ u64 insn;
+
+ insn = OP_ALU_BASE |
+ FIELD_PREP(OP_ALU_A_SRC, areg) |
+ FIELD_PREP(OP_ALU_B_SRC, breg) |
+ FIELD_PREP(OP_ALU_DST, dst) |
+ FIELD_PREP(OP_ALU_SW, swap) |
+ FIELD_PREP(OP_ALU_OP, op) |
+ FIELD_PREP(OP_ALU_DST_AB, dst_ab) |
+ FIELD_PREP(OP_ALU_WR_AB, wr_both);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_alu(struct nfp_prog *nfp_prog, u32 dst, u32 lreg, enum alu_op op, u32 rreg)
+{
+ struct nfp_insn_ur_regs reg;
+ int err;
+
+ err = swreg_to_unrestricted(dst, lreg, rreg, ®);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+
+ __emit_alu(nfp_prog, reg.dst, reg.dst_ab,
+ reg.areg, op, reg.breg, reg.swap, reg.wr_both);
+}
+
+static void
+__emit_ld_field(struct nfp_prog *nfp_prog, enum shf_sc sc,
+ u8 areg, u8 bmask, u8 breg, u8 shift, bool imm8,
+ bool zero, bool swap, bool wr_both)
+{
+ u64 insn;
+
+ insn = OP_LDF_BASE |
+ FIELD_PREP(OP_LDF_A_SRC, areg) |
+ FIELD_PREP(OP_LDF_SC, sc) |
+ FIELD_PREP(OP_LDF_B_SRC, breg) |
+ FIELD_PREP(OP_LDF_I8, imm8) |
+ FIELD_PREP(OP_LDF_SW, swap) |
+ FIELD_PREP(OP_LDF_ZF, zero) |
+ FIELD_PREP(OP_LDF_BMASK, bmask) |
+ FIELD_PREP(OP_LDF_SHF, shift) |
+ FIELD_PREP(OP_LDF_WR_AB, wr_both);
+
+ nfp_prog_push(nfp_prog, insn);
+}
+
+static void
+emit_ld_field_any(struct nfp_prog *nfp_prog, enum shf_sc sc, u8 shift,
+ u32 dst, u8 bmask, u32 src, bool zero)
+{
+ struct nfp_insn_re_regs reg;
+ int err;
+
+ err = swreg_to_restricted(reg_none(), dst, src, ®, true);
+ if (err) {
+ nfp_prog->error = err;
+ return;
+ }
+
+ __emit_ld_field(nfp_prog, sc, reg.areg, bmask, reg.breg, shift,
+ reg.i8, zero, reg.swap, reg.wr_both);
+}
+
+static void
+emit_ld_field(struct nfp_prog *nfp_prog, u32 dst, u8 bmask, u32 src,
+ enum shf_sc sc, u8 shift)
+{
+ emit_ld_field_any(nfp_prog, sc, shift, dst, bmask, src, false);
+}
+
+/* --- Wrappers --- */
+static bool pack_immed(u32 imm, u16 *val, enum immed_shift *shift)
+{
+ if (!(imm & 0xffff0000)) {
+ *val = imm;
+ *shift = IMMED_SHIFT_0B;
+ } else if (!(imm & 0xff0000ff)) {
+ *val = imm >> 8;
+ *shift = IMMED_SHIFT_1B;
+ } else if (!(imm & 0x0000ffff)) {
+ *val = imm >> 16;
+ *shift = IMMED_SHIFT_2B;
+ } else {
+ return false;
+ }
+
+ return true;
+}
+
+static void wrp_immed(struct nfp_prog *nfp_prog, u32 dst, u32 imm)
+{
+ enum immed_shift shift;
+ u16 val;
+
+ if (pack_immed(imm, &val, &shift)) {
+ emit_immed(nfp_prog, dst, val, IMMED_WIDTH_ALL, false, shift);
+ } else if (pack_immed(~imm, &val, &shift)) {
+ emit_immed(nfp_prog, dst, val, IMMED_WIDTH_ALL, true, shift);
+ } else {
+ emit_immed(nfp_prog, dst, imm & 0xffff, IMMED_WIDTH_ALL,
+ false, IMMED_SHIFT_0B);
+ emit_immed(nfp_prog, dst, imm >> 16, IMMED_WIDTH_WORD,
+ false, IMMED_SHIFT_2B);
+ }
+}
+
+/* ur_load_imm_any() - encode immediate or use tmp register (unrestricted)
+ * If the @imm is small enough encode it directly in operand and return
+ * otherwise load @imm to a spare register and return its encoding.
+ */
+static u32 ur_load_imm_any(struct nfp_prog *nfp_prog, u32 imm, u32 tmp_reg)
+{
+ if (FIELD_FIT(UR_REG_IMM_MAX, imm))
+ return reg_imm(imm);
+
+ wrp_immed(nfp_prog, tmp_reg, imm);
+ return tmp_reg;
+}
+
+/* re_load_imm_any() - encode immediate or use tmp register (restricted)
+ * If the @imm is small enough encode it directly in operand and return
+ * otherwise load @imm to a spare register and return its encoding.
+ */
+static u32 re_load_imm_any(struct nfp_prog *nfp_prog, u32 imm, u32 tmp_reg)
+{
+ if (FIELD_FIT(RE_REG_IMM_MAX, imm))
+ return reg_imm(imm);
+
+ wrp_immed(nfp_prog, tmp_reg, imm);
+ return tmp_reg;
+}
+
+static void
+wrp_br_special(struct nfp_prog *nfp_prog, enum br_mask mask,
+ enum br_special special)
+{
+ emit_br(nfp_prog, mask, 0, 0);
+
+ nfp_prog->prog[nfp_prog->prog_len - 1] |=
+ FIELD_PREP(OP_BR_SPECIAL, special);
+}
+
+static void wrp_reg_mov(struct nfp_prog *nfp_prog, u16 dst, u16 src)
+{
+ emit_alu(nfp_prog, reg_both(dst), reg_none(), ALU_OP_NONE, reg_b(src));
+}
+
+static int
+construct_data_ind_ld(struct nfp_prog *nfp_prog, u16 offset,
+ u16 src, bool src_valid, u8 size)
+{
+ unsigned int i;
+ u16 shift, sz;
+ u32 tmp_reg;
+
+ /* We load the value from the address indicated in @offset and then
+ * shift out the data we don't need. Note: this is big endian!
+ */
+ sz = size < 4 ? 4 : size;
+ shift = size < 4 ? 4 - size : 0;
+
+ if (src_valid) {
+ /* Calculate the true offset (src_reg + imm) */
+ tmp_reg = ur_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
+ emit_alu(nfp_prog, imm_both(nfp_prog),
+ reg_a(src), ALU_OP_ADD, tmp_reg);
+ /* Check packet length (size guaranteed to fit b/c it's u8) */
+ emit_alu(nfp_prog, imm_a(nfp_prog),
+ imm_a(nfp_prog), ALU_OP_ADD, reg_imm(size));
+ emit_alu(nfp_prog, reg_none(),
+ NFP_BPF_ABI_LEN, ALU_OP_SUB, imm_a(nfp_prog));
+ wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
+ /* Load data */
+ emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
+ pkt_reg(nfp_prog), imm_b(nfp_prog), sz - 1, true);
+ } else {
+ /* Check packet length */
+ tmp_reg = ur_load_imm_any(nfp_prog, offset + size,
+ imm_a(nfp_prog));
+ emit_alu(nfp_prog, reg_none(),
+ NFP_BPF_ABI_LEN, ALU_OP_SUB, tmp_reg);
+ wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
+ /* Load data */
+ tmp_reg = re_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
+ emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
+ pkt_reg(nfp_prog), tmp_reg, sz - 1, true);
+ }
+
+ i = 0;
+ if (shift)
+ emit_shf(nfp_prog, reg_both(0), reg_none(), SHF_OP_NONE,
+ reg_xfer(0), SHF_SC_R_SHF, shift * 8);
+ else
+ for (; i * 4 < size; i++)
+ emit_alu(nfp_prog, reg_both(i),
+ reg_none(), ALU_OP_NONE, reg_xfer(i));
+
+ if (i < 2)
+ wrp_immed(nfp_prog, reg_both(1), 0);
+
+ return 0;
+}
+
+static int construct_data_ld(struct nfp_prog *nfp_prog, u16 offset, u8 size)
+{
+ return construct_data_ind_ld(nfp_prog, offset, 0, false, size);
+}
+
+static int wrp_set_mark(struct nfp_prog *nfp_prog, u8 src)
+{
+ emit_alu(nfp_prog, NFP_BPF_ABI_MARK,
+ reg_none(), ALU_OP_NONE, reg_b(src));
+ emit_alu(nfp_prog, NFP_BPF_ABI_FLAGS,
+ NFP_BPF_ABI_FLAGS, ALU_OP_OR, reg_imm(NFP_BPF_ABI_FLAG_MARK));
+
+ return 0;
+}
+
+static void
+wrp_alu_imm(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u32 imm)
+{
+ u32 tmp_reg;
+
+ if (alu_op == ALU_OP_AND) {
+ if (!imm)
+ wrp_immed(nfp_prog, reg_both(dst), 0);
+ if (!imm || !~imm)
+ return;
+ }
+ if (alu_op == ALU_OP_OR) {
+ if (!~imm)
+ wrp_immed(nfp_prog, reg_both(dst), ~0U);
+ if (!imm || !~imm)
+ return;
+ }
+ if (alu_op == ALU_OP_XOR) {
+ if (!~imm)
+ emit_alu(nfp_prog, reg_both(dst), reg_none(),
+ ALU_OP_NEG, reg_b(dst));
+ if (!imm || !~imm)
+ return;
+ }
+
+ tmp_reg = ur_load_imm_any(nfp_prog, imm, imm_b(nfp_prog));
+ emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, tmp_reg);
+}
+
+static int
+wrp_alu64_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum alu_op alu_op, bool skip)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+
+ if (skip) {
+ meta->skip = true;
+ return 0;
+ }
+
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, imm & ~0U);
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, alu_op, imm >> 32);
+
+ return 0;
+}
+
+static int
+wrp_alu64_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum alu_op alu_op)
+{
+ u8 dst = meta->insn.dst_reg * 2, src = meta->insn.src_reg * 2;
+
+ emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, reg_b(src));
+ emit_alu(nfp_prog, reg_both(dst + 1),
+ reg_a(dst + 1), alu_op, reg_b(src + 1));
+
+ return 0;
+}
+
+static int
+wrp_alu32_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum alu_op alu_op, bool skip)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (skip) {
+ meta->skip = true;
+ return 0;
+ }
+
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, insn->imm);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static int
+wrp_alu32_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum alu_op alu_op)
+{
+ u8 dst = meta->insn.dst_reg * 2, src = meta->insn.src_reg * 2;
+
+ emit_alu(nfp_prog, reg_both(dst), reg_a(dst), alu_op, reg_b(src));
+ wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static void
+wrp_test_reg_one(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u8 src,
+ enum br_mask br_mask, u16 off)
+{
+ emit_alu(nfp_prog, reg_none(), reg_a(dst), alu_op, reg_b(src));
+ emit_br(nfp_prog, br_mask, off, 0);
+}
+
+static int
+wrp_test_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum alu_op alu_op, enum br_mask br_mask)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ wrp_test_reg_one(nfp_prog, insn->dst_reg * 2, alu_op,
+ insn->src_reg * 2, br_mask, insn->off);
+ wrp_test_reg_one(nfp_prog, insn->dst_reg * 2 + 1, alu_op,
+ insn->src_reg * 2 + 1, br_mask, insn->off);
+
+ return 0;
+}
+
+static int
+wrp_cmp_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum br_mask br_mask, bool swap)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+ u8 reg = insn->dst_reg * 2;
+ u32 tmp_reg;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+ if (!swap)
+ emit_alu(nfp_prog, reg_none(), reg_a(reg), ALU_OP_SUB, tmp_reg);
+ else
+ emit_alu(nfp_prog, reg_none(), tmp_reg, ALU_OP_SUB, reg_a(reg));
+
+ tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+ if (!swap)
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(reg + 1), ALU_OP_SUB_C, tmp_reg);
+ else
+ emit_alu(nfp_prog, reg_none(),
+ tmp_reg, ALU_OP_SUB_C, reg_a(reg + 1));
+
+ emit_br(nfp_prog, br_mask, insn->off, 0);
+
+ return 0;
+}
+
+static int
+wrp_cmp_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ enum br_mask br_mask, bool swap)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u8 areg = insn->src_reg * 2, breg = insn->dst_reg * 2;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ if (swap) {
+ areg ^= breg;
+ breg ^= areg;
+ areg ^= breg;
+ }
+
+ emit_alu(nfp_prog, reg_none(), reg_a(areg), ALU_OP_SUB, reg_b(breg));
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(areg + 1), ALU_OP_SUB_C, reg_b(breg + 1));
+ emit_br(nfp_prog, br_mask, insn->off, 0);
+
+ return 0;
+}
+
+/* --- Callbacks --- */
+static int mov_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->src_reg * 2);
+ wrp_reg_mov(nfp_prog, insn->dst_reg * 2 + 1, insn->src_reg * 2 + 1);
+
+ return 0;
+}
+
+static int mov_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ u64 imm = meta->insn.imm; /* sign extend */
+
+ wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2), imm & ~0U);
+ wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), imm >> 32);
+
+ return 0;
+}
+
+static int xor_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_reg(nfp_prog, meta, ALU_OP_XOR);
+}
+
+static int xor_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_imm(nfp_prog, meta, ALU_OP_XOR, !meta->insn.imm);
+}
+
+static int and_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_reg(nfp_prog, meta, ALU_OP_AND);
+}
+
+static int and_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_imm(nfp_prog, meta, ALU_OP_AND, !~meta->insn.imm);
+}
+
+static int or_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_reg(nfp_prog, meta, ALU_OP_OR);
+}
+
+static int or_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu64_imm(nfp_prog, meta, ALU_OP_OR, !meta->insn.imm);
+}
+
+static int add_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ emit_alu(nfp_prog, reg_both(insn->dst_reg * 2),
+ reg_a(insn->dst_reg * 2), ALU_OP_ADD,
+ reg_b(insn->src_reg * 2));
+ emit_alu(nfp_prog, reg_both(insn->dst_reg * 2 + 1),
+ reg_a(insn->dst_reg * 2 + 1), ALU_OP_ADD_C,
+ reg_b(insn->src_reg * 2 + 1));
+
+ return 0;
+}
+
+static int add_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2, ALU_OP_ADD, imm & ~0U);
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, ALU_OP_ADD_C, imm >> 32);
+
+ return 0;
+}
+
+static int sub_reg64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ emit_alu(nfp_prog, reg_both(insn->dst_reg * 2),
+ reg_a(insn->dst_reg * 2), ALU_OP_SUB,
+ reg_b(insn->src_reg * 2));
+ emit_alu(nfp_prog, reg_both(insn->dst_reg * 2 + 1),
+ reg_a(insn->dst_reg * 2 + 1), ALU_OP_SUB_C,
+ reg_b(insn->src_reg * 2 + 1));
+
+ return 0;
+}
+
+static int sub_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2, ALU_OP_SUB, imm & ~0U);
+ wrp_alu_imm(nfp_prog, insn->dst_reg * 2 + 1, ALU_OP_SUB_C, imm >> 32);
+
+ return 0;
+}
+
+static int shl_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (insn->imm != 32)
+ return 1; /* TODO */
+
+ wrp_reg_mov(nfp_prog, insn->dst_reg * 2 + 1, insn->dst_reg * 2);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), 0);
+
+ return 0;
+}
+
+static int shr_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (insn->imm != 32)
+ return 1; /* TODO */
+
+ wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->dst_reg * 2 + 1);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static int mov_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->src_reg * 2);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static int mov_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), insn->imm);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static int xor_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_reg(nfp_prog, meta, ALU_OP_XOR);
+}
+
+static int xor_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_XOR, !~meta->insn.imm);
+}
+
+static int and_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_reg(nfp_prog, meta, ALU_OP_AND);
+}
+
+static int and_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_AND, !~meta->insn.imm);
+}
+
+static int or_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_reg(nfp_prog, meta, ALU_OP_OR);
+}
+
+static int or_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_OR, !meta->insn.imm);
+}
+
+static int add_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_reg(nfp_prog, meta, ALU_OP_ADD);
+}
+
+static int add_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_ADD, !meta->insn.imm);
+}
+
+static int sub_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_reg(nfp_prog, meta, ALU_OP_SUB);
+}
+
+static int sub_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_SUB, !meta->insn.imm);
+}
+
+static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (!insn->imm)
+ return 1; /* TODO: zero shift means indirect */
+
+ emit_shf(nfp_prog, reg_both(insn->dst_reg * 2),
+ reg_none(), SHF_OP_NONE, reg_b(insn->dst_reg * 2),
+ SHF_SC_L_SHF, insn->imm);
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+
+ return 0;
+}
+
+static int imm_ld8_part2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ wrp_immed(nfp_prog, reg_both(nfp_meta_prev(meta)->insn.dst_reg * 2 + 1),
+ meta->insn.imm);
+
+ return 0;
+}
+
+static int imm_ld8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ meta->double_cb = imm_ld8_part2;
+ wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), insn->imm);
+
+ return 0;
+}
+
+static int data_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ld(nfp_prog, meta->insn.imm, 1);
+}
+
+static int data_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ld(nfp_prog, meta->insn.imm, 2);
+}
+
+static int data_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ld(nfp_prog, meta->insn.imm, 4);
+}
+
+static int data_ind_ld1(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+ meta->insn.src_reg * 2, true, 1);
+}
+
+static int data_ind_ld2(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+ meta->insn.src_reg * 2, true, 2);
+}
+
+static int data_ind_ld4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return construct_data_ind_ld(nfp_prog, meta->insn.imm,
+ meta->insn.src_reg * 2, true, 4);
+}
+
+static int mem_ldx4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ if (meta->insn.off == offsetof(struct sk_buff, len))
+ emit_alu(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+ reg_none(), ALU_OP_NONE, NFP_BPF_ABI_LEN);
+ else
+ return -ENOTSUPP;
+
+ return 0;
+}
+
+static int mem_stx4(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ if (meta->insn.off == offsetof(struct sk_buff, mark))
+ return wrp_set_mark(nfp_prog, meta->insn.src_reg * 2);
+
+ return -ENOTSUPP;
+}
+
+static int jump(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ if (meta->insn.off < 0) /* TODO */
+ return -ENOTSUPP;
+ emit_br(nfp_prog, BR_UNC, meta->insn.off, 0);
+
+ return 0;
+}
+
+static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+ u32 or1 = reg_a(insn->dst_reg * 2), or2 = reg_b(insn->dst_reg * 2 + 1);
+ u32 tmp_reg;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ if (imm & ~0U) {
+ tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+ emit_alu(nfp_prog, imm_a(nfp_prog),
+ reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
+ or1 = imm_a(nfp_prog);
+ }
+
+ if (imm >> 32) {
+ tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+ emit_alu(nfp_prog, imm_b(nfp_prog),
+ reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR, tmp_reg);
+ or2 = imm_b(nfp_prog);
+ }
+
+ emit_alu(nfp_prog, reg_none(), or1, ALU_OP_OR, or2);
+ emit_br(nfp_prog, BR_BEQ, insn->off, 0);
+
+ return 0;
+}
+
+static int jgt_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_cmp_imm(nfp_prog, meta, BR_BLO, false);
+}
+
+static int jge_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_cmp_imm(nfp_prog, meta, BR_BHS, true);
+}
+
+static int jset_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+ u32 tmp_reg;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ if (!imm) {
+ meta->skip = true;
+ return 0;
+ }
+
+ if (imm & ~0U) {
+ tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(insn->dst_reg * 2), ALU_OP_AND, tmp_reg);
+ emit_br(nfp_prog, BR_BNE, insn->off, 0);
+ }
+
+ if (imm >> 32) {
+ tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(insn->dst_reg * 2 + 1), ALU_OP_AND, tmp_reg);
+ emit_br(nfp_prog, BR_BNE, insn->off, 0);
+ }
+
+ return 0;
+}
+
+static int jne_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+ u64 imm = insn->imm; /* sign extend */
+ u32 tmp_reg;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ if (!imm) {
+ emit_alu(nfp_prog, reg_none(), reg_a(insn->dst_reg * 2),
+ ALU_OP_OR, reg_b(insn->dst_reg * 2 + 1));
+ emit_br(nfp_prog, BR_BNE, insn->off, 0);
+ }
+
+ tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(insn->dst_reg * 2), ALU_OP_XOR, tmp_reg);
+ emit_br(nfp_prog, BR_BNE, insn->off, 0);
+
+ tmp_reg = ur_load_imm_any(nfp_prog, imm >> 32, imm_b(nfp_prog));
+ emit_alu(nfp_prog, reg_none(),
+ reg_a(insn->dst_reg * 2 + 1), ALU_OP_XOR, tmp_reg);
+ emit_br(nfp_prog, BR_BNE, insn->off, 0);
+
+ return 0;
+}
+
+static int jeq_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ const struct bpf_insn *insn = &meta->insn;
+
+ if (insn->off < 0) /* TODO */
+ return -ENOTSUPP;
+
+ emit_alu(nfp_prog, imm_a(nfp_prog), reg_a(insn->dst_reg * 2),
+ ALU_OP_XOR, reg_b(insn->src_reg * 2));
+ emit_alu(nfp_prog, imm_b(nfp_prog), reg_a(insn->dst_reg * 2 + 1),
+ ALU_OP_XOR, reg_b(insn->src_reg * 2 + 1));
+ emit_alu(nfp_prog, reg_none(),
+ imm_a(nfp_prog), ALU_OP_OR, imm_b(nfp_prog));
+ emit_br(nfp_prog, BR_BEQ, insn->off, 0);
+
+ return 0;
+}
+
+static int jgt_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_cmp_reg(nfp_prog, meta, BR_BLO, false);
+}
+
+static int jge_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_cmp_reg(nfp_prog, meta, BR_BHS, true);
+}
+
+static int jset_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_test_reg(nfp_prog, meta, ALU_OP_AND, BR_BNE);
+}
+
+static int jne_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ return wrp_test_reg(nfp_prog, meta, ALU_OP_XOR, BR_BNE);
+}
+
+static int goto_out(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+ wrp_br_special(nfp_prog, BR_UNC, OP_BR_GO_OUT);
+
+ return 0;
+}
+
+static const instr_cb_t instr_cb[256] = {
+ [BPF_ALU64 | BPF_MOV | BPF_X] = mov_reg64,
+ [BPF_ALU64 | BPF_MOV | BPF_K] = mov_imm64,
+ [BPF_ALU64 | BPF_XOR | BPF_X] = xor_reg64,
+ [BPF_ALU64 | BPF_XOR | BPF_K] = xor_imm64,
+ [BPF_ALU64 | BPF_AND | BPF_X] = and_reg64,
+ [BPF_ALU64 | BPF_AND | BPF_K] = and_imm64,
+ [BPF_ALU64 | BPF_OR | BPF_X] = or_reg64,
+ [BPF_ALU64 | BPF_OR | BPF_K] = or_imm64,
+ [BPF_ALU64 | BPF_ADD | BPF_X] = add_reg64,
+ [BPF_ALU64 | BPF_ADD | BPF_K] = add_imm64,
+ [BPF_ALU64 | BPF_SUB | BPF_X] = sub_reg64,
+ [BPF_ALU64 | BPF_SUB | BPF_K] = sub_imm64,
+ [BPF_ALU64 | BPF_LSH | BPF_K] = shl_imm64,
+ [BPF_ALU64 | BPF_RSH | BPF_K] = shr_imm64,
+ [BPF_ALU | BPF_MOV | BPF_X] = mov_reg,
+ [BPF_ALU | BPF_MOV | BPF_K] = mov_imm,
+ [BPF_ALU | BPF_XOR | BPF_X] = xor_reg,
+ [BPF_ALU | BPF_XOR | BPF_K] = xor_imm,
+ [BPF_ALU | BPF_AND | BPF_X] = and_reg,
+ [BPF_ALU | BPF_AND | BPF_K] = and_imm,
+ [BPF_ALU | BPF_OR | BPF_X] = or_reg,
+ [BPF_ALU | BPF_OR | BPF_K] = or_imm,
+ [BPF_ALU | BPF_ADD | BPF_X] = add_reg,
+ [BPF_ALU | BPF_ADD | BPF_K] = add_imm,
+ [BPF_ALU | BPF_SUB | BPF_X] = sub_reg,
+ [BPF_ALU | BPF_SUB | BPF_K] = sub_imm,
+ [BPF_ALU | BPF_LSH | BPF_K] = shl_imm,
+ [BPF_LD | BPF_IMM | BPF_DW] = imm_ld8,
+ [BPF_LD | BPF_ABS | BPF_B] = data_ld1,
+ [BPF_LD | BPF_ABS | BPF_H] = data_ld2,
+ [BPF_LD | BPF_ABS | BPF_W] = data_ld4,
+ [BPF_LD | BPF_IND | BPF_B] = data_ind_ld1,
+ [BPF_LD | BPF_IND | BPF_H] = data_ind_ld2,
+ [BPF_LD | BPF_IND | BPF_W] = data_ind_ld4,
+ [BPF_LDX | BPF_MEM | BPF_W] = mem_ldx4,
+ [BPF_STX | BPF_MEM | BPF_W] = mem_stx4,
+ [BPF_JMP | BPF_JA | BPF_K] = jump,
+ [BPF_JMP | BPF_JEQ | BPF_K] = jeq_imm,
+ [BPF_JMP | BPF_JGT | BPF_K] = jgt_imm,
+ [BPF_JMP | BPF_JGE | BPF_K] = jge_imm,
+ [BPF_JMP | BPF_JSET | BPF_K] = jset_imm,
+ [BPF_JMP | BPF_JNE | BPF_K] = jne_imm,
+ [BPF_JMP | BPF_JEQ | BPF_X] = jeq_reg,
+ [BPF_JMP | BPF_JGT | BPF_X] = jgt_reg,
+ [BPF_JMP | BPF_JGE | BPF_X] = jge_reg,
+ [BPF_JMP | BPF_JSET | BPF_X] = jset_reg,
+ [BPF_JMP | BPF_JNE | BPF_X] = jne_reg,
+ [BPF_JMP | BPF_EXIT] = goto_out,
+};
+
+/* --- Misc code --- */
+static void br_set_offset(u64 *instr, u16 offset)
+{
+ u16 addr_lo, addr_hi;
+
+ addr_lo = offset & (OP_BR_ADDR_LO >> __bf_shf(OP_BR_ADDR_LO));
+ addr_hi = offset != addr_lo;
+ *instr &= ~(OP_BR_ADDR_HI | OP_BR_ADDR_LO);
+ *instr |= FIELD_PREP(OP_BR_ADDR_HI, addr_hi);
+ *instr |= FIELD_PREP(OP_BR_ADDR_LO, addr_lo);
+}
+
+/* --- Assembler logic --- */
+static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta, *next;
+ u32 off, br_idx;
+ u32 idx;
+
+ nfp_for_each_insn_walk2(nfp_prog, meta, next) {
+ if (meta->skip)
+ continue;
+ if (BPF_CLASS(meta->insn.code) != BPF_JMP)
+ continue;
+
+ br_idx = nfp_prog_offset_to_index(nfp_prog, next->off) - 1;
+ if (!nfp_is_br(nfp_prog->prog[br_idx])) {
+ pr_err("Fixup found block not ending in branch %d %02x %016llx!!\n",
+ br_idx, meta->insn.code, nfp_prog->prog[br_idx]);
+ return -ELOOP;
+ }
+ /* Leave special branches for later */
+ if (FIELD_GET(OP_BR_SPECIAL, nfp_prog->prog[br_idx]))
+ continue;
+
+ /* Find the target offset in assembler realm */
+ off = meta->insn.off;
+ if (!off) {
+ pr_err("Fixup found zero offset!!\n");
+ return -ELOOP;
+ }
+
+ while (off && nfp_meta_has_next(nfp_prog, next)) {
+ next = nfp_meta_next(next);
+ off--;
+ }
+ if (off) {
+ pr_err("Fixup found too large jump!! %d\n", off);
+ return -ELOOP;
+ }
+
+ if (next->skip) {
+ pr_err("Branch landing on removed instruction!!\n");
+ return -ELOOP;
+ }
+
+ for (idx = nfp_prog_offset_to_index(nfp_prog, meta->off);
+ idx <= br_idx; idx++) {
+ if (!nfp_is_br(nfp_prog->prog[idx]))
+ continue;
+ br_set_offset(&nfp_prog->prog[idx], next->off);
+ }
+ }
+
+ /* Fixup 'goto out's separately, they can be scattered around */
+ for (br_idx = 0; br_idx < nfp_prog->prog_len; br_idx++) {
+ enum br_special special;
+
+ if ((nfp_prog->prog[br_idx] & OP_BR_BASE_MASK) != OP_BR_BASE)
+ continue;
+
+ special = FIELD_GET(OP_BR_SPECIAL, nfp_prog->prog[br_idx]);
+ switch (special) {
+ case OP_BR_NORMAL:
+ break;
+ case OP_BR_GO_OUT:
+ br_set_offset(&nfp_prog->prog[br_idx],
+ nfp_prog->tgt_out);
+ break;
+ case OP_BR_GO_ABORT:
+ br_set_offset(&nfp_prog->prog[br_idx],
+ nfp_prog->tgt_abort);
+ break;
+ }
+
+ nfp_prog->prog[br_idx] &= ~OP_BR_SPECIAL;
+ }
+
+ return 0;
+}
+
+static void nfp_intro(struct nfp_prog *nfp_prog)
+{
+ emit_alu(nfp_prog, pkt_reg(nfp_prog),
+ reg_none(), ALU_OP_NONE, NFP_BPF_ABI_PKT);
+}
+
+static void nfp_outro_tc_legacy(struct nfp_prog *nfp_prog)
+{
+ const u8 act2code[] = {
+ [NN_ACT_TC_DROP] = 0x22,
+ [NN_ACT_TC_REDIR] = 0x24
+ };
+ /* Target for aborts */
+ nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
+ wrp_immed(nfp_prog, reg_both(0), 0);
+
+ /* Target for normal exits */
+ nfp_prog->tgt_out = nfp_prog_current_offset(nfp_prog);
+ /* Legacy TC mode:
+ * 0 0x11 -> pass, count as stat0
+ * -1 drop 0x22 -> drop, count as stat1
+ * redir 0x24 -> redir, count as stat1
+ * ife mark 0x21 -> pass, count as stat1
+ * ife + tx 0x24 -> redir, count as stat1
+ */
+ emit_br_byte_neq(nfp_prog, reg_b(0), 0xff, 0, nfp_prog->tgt_done, 2);
+ emit_alu(nfp_prog, reg_a(0),
+ reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+ emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(0x11), SHF_SC_L_SHF, 16);
+
+ emit_br(nfp_prog, BR_UNC, nfp_prog->tgt_done, 1);
+ emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(act2code[nfp_prog->act]),
+ SHF_SC_L_SHF, 16);
+}
+
+static void nfp_outro_tc_da(struct nfp_prog *nfp_prog)
+{
+ /* TC direct-action mode:
+ * 0,1 ok NOT SUPPORTED[1]
+ * 2 drop 0x22 -> drop, count as stat1
+ * 4,5 nuke 0x02 -> drop
+ * 7 redir 0x44 -> redir, count as stat2
+ * * unspec 0x11 -> pass, count as stat0
+ *
+ * [1] We can't support OK and RECLASSIFY because we can't tell TC
+ * the exact decision made. We are forced to support UNSPEC
+ * to handle aborts so that's the only one we handle for passing
+ * packets up the stack.
+ */
+ /* Target for aborts */
+ nfp_prog->tgt_abort = nfp_prog_current_offset(nfp_prog);
+
+ emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
+
+ emit_alu(nfp_prog, reg_a(0),
+ reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+ emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_imm(0x11), SHF_SC_L_SHF, 16);
+
+ /* Target for normal exits */
+ nfp_prog->tgt_out = nfp_prog_current_offset(nfp_prog);
+
+ /* if R0 > 7 jump to abort */
+ emit_alu(nfp_prog, reg_none(), reg_imm(7), ALU_OP_SUB, reg_b(0));
+ emit_br(nfp_prog, BR_BLO, nfp_prog->tgt_abort, 0);
+ emit_alu(nfp_prog, reg_a(0),
+ reg_none(), ALU_OP_NONE, NFP_BPF_ABI_FLAGS);
+
+ wrp_immed(nfp_prog, reg_b(2), 0x41221211);
+ wrp_immed(nfp_prog, reg_b(3), 0x41001211);
+
+ emit_shf(nfp_prog, reg_a(1),
+ reg_none(), SHF_OP_NONE, reg_b(0), SHF_SC_L_SHF, 2);
+
+ emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
+ emit_shf(nfp_prog, reg_a(2),
+ reg_imm(0xf), SHF_OP_AND, reg_b(2), SHF_SC_R_SHF, 0);
+
+ emit_alu(nfp_prog, reg_none(), reg_a(1), ALU_OP_OR, reg_imm(0));
+ emit_shf(nfp_prog, reg_b(2),
+ reg_imm(0xf), SHF_OP_AND, reg_b(3), SHF_SC_R_SHF, 0);
+
+ emit_br_def(nfp_prog, nfp_prog->tgt_done, 2);
+
+ emit_shf(nfp_prog, reg_b(2),
+ reg_a(2), SHF_OP_OR, reg_b(2), SHF_SC_L_SHF, 4);
+ emit_ld_field(nfp_prog, reg_a(0), 0xc, reg_b(2), SHF_SC_L_SHF, 16);
+}
+
+static void nfp_outro(struct nfp_prog *nfp_prog)
+{
+ switch (nfp_prog->act) {
+ case NN_ACT_DIRECT:
+ nfp_outro_tc_da(nfp_prog);
+ break;
+ case NN_ACT_TC_DROP:
+ case NN_ACT_TC_REDIR:
+ nfp_outro_tc_legacy(nfp_prog);
+ break;
+ }
+}
+
+static int nfp_translate(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta;
+ int err;
+
+ nfp_intro(nfp_prog);
+ if (nfp_prog->error)
+ return nfp_prog->error;
+
+ list_for_each_entry(meta, &nfp_prog->insns, l) {
+ instr_cb_t cb = instr_cb[meta->insn.code];
+
+ meta->off = nfp_prog_current_offset(nfp_prog);
+
+ if (meta->skip) {
+ nfp_prog->n_translated++;
+ continue;
+ }
+
+ if (nfp_meta_has_prev(nfp_prog, meta) &&
+ nfp_meta_prev(meta)->double_cb)
+ cb = nfp_meta_prev(meta)->double_cb;
+ if (!cb)
+ return -ENOENT;
+ err = cb(nfp_prog, meta);
+ if (err)
+ return err;
+
+ nfp_prog->n_translated++;
+ }
+
+ nfp_outro(nfp_prog);
+ if (nfp_prog->error)
+ return nfp_prog->error;
+
+ return nfp_fixup_branches(nfp_prog);
+}
+
+static int
+nfp_prog_prepare(struct nfp_prog *nfp_prog, const struct bpf_insn *prog,
+ unsigned int cnt)
+{
+ unsigned int i;
+
+ for (i = 0; i < cnt; i++) {
+ struct nfp_insn_meta *meta;
+
+ meta = kzalloc(sizeof(*meta), GFP_KERNEL);
+ if (!meta)
+ return -ENOMEM;
+
+ meta->insn = prog[i];
+ meta->n = i;
+
+ list_add_tail(&meta->l, &nfp_prog->insns);
+ }
+
+ return 0;
+}
+
+/* --- Optimizations --- */
+static void nfp_bpf_opt_reg_init(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta;
+
+ list_for_each_entry(meta, &nfp_prog->insns, l) {
+ struct bpf_insn insn = meta->insn;
+
+ /* Programs converted from cBPF start with register xoring */
+ if (insn.code == (BPF_ALU64 | BPF_XOR | BPF_X) &&
+ insn.src_reg == insn.dst_reg)
+ continue;
+
+ /* Programs start with R6 = R1 but we ignore the skb pointer */
+ if (insn.code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
+ insn.src_reg == 1 && insn.dst_reg == 6)
+ meta->skip = true;
+
+ /* Return as soon as something doesn't match */
+ if (!meta->skip)
+ return;
+ }
+}
+
+/* Try to rename registers so that program uses only low ones */
+static int nfp_bpf_opt_reg_rename(struct nfp_prog *nfp_prog)
+{
+ bool reg_used[MAX_BPF_REG] = {};
+ u8 tgt_reg[MAX_BPF_REG] = {};
+ struct nfp_insn_meta *meta;
+ unsigned int i, j;
+
+ list_for_each_entry(meta, &nfp_prog->insns, l) {
+ if (meta->skip)
+ continue;
+
+ reg_used[meta->insn.src_reg] = true;
+ reg_used[meta->insn.dst_reg] = true;
+ }
+
+ for (i = 0, j = 0; i < ARRAY_SIZE(tgt_reg); i++) {
+ if (!reg_used[i])
+ continue;
+
+ tgt_reg[i] = j++;
+ }
+ nfp_prog->num_regs = j;
+
+ list_for_each_entry(meta, &nfp_prog->insns, l) {
+ meta->insn.src_reg = tgt_reg[meta->insn.src_reg];
+ meta->insn.dst_reg = tgt_reg[meta->insn.dst_reg];
+ }
+
+ return 0;
+}
+
+/* Remove masking after load since our load guarantees this is not needed */
+static void nfp_bpf_opt_ld_mask(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta1, *meta2;
+ const s32 exp_mask[] = {
+ [BPF_B] = 0x000000ffU,
+ [BPF_H] = 0x0000ffffU,
+ [BPF_W] = 0xffffffffU,
+ };
+
+ nfp_for_each_insn_walk2(nfp_prog, meta1, meta2) {
+ struct bpf_insn insn, next;
+
+ insn = meta1->insn;
+ next = meta2->insn;
+
+ if (BPF_CLASS(insn.code) != BPF_LD)
+ continue;
+ if (BPF_MODE(insn.code) != BPF_ABS &&
+ BPF_MODE(insn.code) != BPF_IND)
+ continue;
+
+ if (next.code != (BPF_ALU64 | BPF_AND | BPF_K))
+ continue;
+
+ if (!exp_mask[BPF_SIZE(insn.code)])
+ continue;
+ if (exp_mask[BPF_SIZE(insn.code)] != next.imm)
+ continue;
+
+ if (next.src_reg || next.dst_reg)
+ continue;
+
+ meta2->skip = true;
+ }
+}
+
+static void nfp_bpf_opt_ld_shift(struct nfp_prog *nfp_prog)
+{
+ struct nfp_insn_meta *meta1, *meta2, *meta3;
+
+ nfp_for_each_insn_walk3(nfp_prog, meta1, meta2, meta3) {
+ struct bpf_insn insn, next1, next2;
+
+ insn = meta1->insn;
+ next1 = meta2->insn;
+ next2 = meta3->insn;
+
+ if (BPF_CLASS(insn.code) != BPF_LD)
+ continue;
+ if (BPF_MODE(insn.code) != BPF_ABS &&
+ BPF_MODE(insn.code) != BPF_IND)
+ continue;
+ if (BPF_SIZE(insn.code) != BPF_W)
+ continue;
+
+ if (!(next1.code == (BPF_LSH | BPF_K | BPF_ALU64) &&
+ next2.code == (BPF_RSH | BPF_K | BPF_ALU64)) &&
+ !(next1.code == (BPF_RSH | BPF_K | BPF_ALU64) &&
+ next2.code == (BPF_LSH | BPF_K | BPF_ALU64)))
+ continue;
+
+ if (next1.src_reg || next1.dst_reg ||
+ next2.src_reg || next2.dst_reg)
+ continue;
+
+ if (next1.imm != 0x20 || next2.imm != 0x20)
+ continue;
+
+ meta2->skip = true;
+ meta3->skip = true;
+ }
+}
+
+static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
+{
+ int ret;
+
+ nfp_bpf_opt_reg_init(nfp_prog);
+
+ ret = nfp_bpf_opt_reg_rename(nfp_prog);
+ if (ret)
+ return ret;
+
+ nfp_bpf_opt_ld_mask(nfp_prog);
+ nfp_bpf_opt_ld_shift(nfp_prog);
+
+ return 0;
+}
+
+/**
+ * nfp_bpf_jit() - translate BPF code into NFP assembly
+ * @filter: kernel BPF filter struct
+ * @prog_mem: memory to store assembler instructions
+ * @act: action attached to this eBPF program
+ * @prog_start: offset of the first instruction when loaded
+ * @prog_done: where to jump on exit
+ * @prog_sz: size of @prog_mem in instructions
+ * @res: achieved parameters of translation results
+ */
+int
+nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
+ enum nfp_bpf_action_type act,
+ unsigned int prog_start, unsigned int prog_done,
+ unsigned int prog_sz, struct nfp_bpf_result *res)
+{
+ struct nfp_prog *nfp_prog;
+ int ret;
+
+ nfp_prog = kzalloc(sizeof(*nfp_prog), GFP_KERNEL);
+ if (!nfp_prog)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&nfp_prog->insns);
+ nfp_prog->act = act;
+ nfp_prog->start_off = prog_start;
+ nfp_prog->tgt_done = prog_done;
+
+ ret = nfp_prog_prepare(nfp_prog, filter->insnsi, filter->len);
+ if (ret)
+ goto out;
+
+ ret = nfp_prog_verify(nfp_prog, filter);
+ if (ret)
+ goto out;
+
+ ret = nfp_bpf_optimize(nfp_prog);
+ if (ret)
+ goto out;
+
+ if (nfp_prog->num_regs <= 7)
+ nfp_prog->regs_per_thread = 16;
+ else
+ nfp_prog->regs_per_thread = 32;
+
+ nfp_prog->prog = prog_mem;
+ nfp_prog->__prog_alloc_len = prog_sz;
+
+ ret = nfp_translate(nfp_prog);
+ if (ret) {
+ pr_err("Translation failed with error %d (translated: %u)\n",
+ ret, nfp_prog->n_translated);
+ ret = -EINVAL;
+ }
+
+ res->n_instr = nfp_prog->prog_len;
+ res->dense_mode = nfp_prog->num_regs <= 7;
+out:
+ nfp_prog_free(nfp_prog);
+
+ return ret;
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
new file mode 100644
index 0000000..144cae8
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_verifier.c
@@ -0,0 +1,171 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below. You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#define pr_fmt(fmt) "NFP net bpf: " fmt
+
+#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
+#include <linux/kernel.h>
+#include <linux/pkt_cls.h>
+
+#include "nfp_bpf.h"
+
+/* Analyzer/verifier definitions */
+struct nfp_bpf_analyzer_priv {
+ struct nfp_prog *prog;
+ struct nfp_insn_meta *meta;
+};
+
+static struct nfp_insn_meta *
+nfp_bpf_goto_meta(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+ unsigned int insn_idx, unsigned int n_insns)
+{
+ unsigned int forward, backward, i;
+
+ backward = meta->n - insn_idx;
+ forward = insn_idx - meta->n;
+
+ if (min(forward, backward) > n_insns - insn_idx - 1) {
+ backward = n_insns - insn_idx - 1;
+ meta = nfp_prog_last_meta(nfp_prog);
+ }
+ if (min(forward, backward) > insn_idx && backward > insn_idx) {
+ forward = insn_idx;
+ meta = nfp_prog_first_meta(nfp_prog);
+ }
+
+ if (forward < backward)
+ for (i = 0; i < forward; i++)
+ meta = nfp_meta_next(meta);
+ else
+ for (i = 0; i < backward; i++)
+ meta = nfp_meta_prev(meta);
+
+ return meta;
+}
+
+static int
+nfp_bpf_check_exit(struct nfp_prog *nfp_prog,
+ const struct bpf_verifier_env *env)
+{
+ const struct bpf_reg_state *reg0 = &env->cur_state.regs[0];
+
+ if (reg0->type != CONST_IMM) {
+ pr_info("unsupported exit state: %d, imm: %llx\n",
+ reg0->type, reg0->imm);
+ return -EINVAL;
+ }
+
+ if (nfp_prog->act != NN_ACT_DIRECT &&
+ reg0->imm != 0 && (reg0->imm & ~0U) != ~0U) {
+ pr_info("unsupported exit state: %d, imm: %llx\n",
+ reg0->type, reg0->imm);
+ return -EINVAL;
+ }
+
+ if (nfp_prog->act == NN_ACT_DIRECT && reg0->imm <= TC_ACT_REDIRECT &&
+ reg0->imm != TC_ACT_SHOT && reg0->imm != TC_ACT_STOLEN &&
+ reg0->imm != TC_ACT_QUEUED) {
+ pr_info("unsupported exit state: %d, imm: %llx\n",
+ reg0->type, reg0->imm);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int
+nfp_bpf_check_ctx_ptr(struct nfp_prog *nfp_prog,
+ const struct bpf_verifier_env *env, u8 reg)
+{
+ if (env->cur_state.regs[reg].type != PTR_TO_CTX)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int
+nfp_verify_insn(struct bpf_verifier_env *env, int insn_idx, int prev_insn_idx)
+{
+ struct nfp_bpf_analyzer_priv *priv = env->analyzer_priv;
+ struct nfp_insn_meta *meta = priv->meta;
+
+ meta = nfp_bpf_goto_meta(priv->prog, meta, insn_idx, env->prog->len);
+ priv->meta = meta;
+
+ if (meta->insn.src_reg == BPF_REG_10 ||
+ meta->insn.dst_reg == BPF_REG_10) {
+ pr_err("stack not yet supported\n");
+ return -EINVAL;
+ }
+ if (meta->insn.src_reg >= MAX_BPF_REG ||
+ meta->insn.dst_reg >= MAX_BPF_REG) {
+ pr_err("program uses extended registers - jit hardening?\n");
+ return -EINVAL;
+ }
+
+ if (meta->insn.code == (BPF_JMP | BPF_EXIT))
+ return nfp_bpf_check_exit(priv->prog, env);
+
+ if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM))
+ return nfp_bpf_check_ctx_ptr(priv->prog, env,
+ meta->insn.src_reg);
+ if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM))
+ return nfp_bpf_check_ctx_ptr(priv->prog, env,
+ meta->insn.dst_reg);
+
+ return 0;
+}
+
+static const struct bpf_ext_analyzer_ops nfp_bpf_analyzer_ops = {
+ .insn_hook = nfp_verify_insn,
+};
+
+int nfp_prog_verify(struct nfp_prog *nfp_prog, struct bpf_prog *prog)
+{
+ struct nfp_bpf_analyzer_priv *priv;
+ int ret;
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ priv->prog = nfp_prog;
+ priv->meta = nfp_prog_first_meta(nfp_prog);
+
+ ret = bpf_analyzer(prog, &nfp_bpf_analyzer_ops, priv);
+
+ kfree(priv);
+
+ return ret;
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 6906356..ed824e1 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -62,6 +62,9 @@
/* Max time to wait for NFP to respond on updates (in seconds) */
#define NFP_NET_POLL_TIMEOUT 5
+/* Interval for reading offloaded filter stats */
+#define NFP_NET_STAT_POLL_IVL msecs_to_jiffies(100)
+
/* Bar allocation */
#define NFP_NET_CTRL_BAR 0
#define NFP_NET_Q0_BAR 2
@@ -220,7 +223,7 @@
#define PCIE_DESC_RX_I_TCP_CSUM_OK cpu_to_le16(BIT(11))
#define PCIE_DESC_RX_I_UDP_CSUM cpu_to_le16(BIT(10))
#define PCIE_DESC_RX_I_UDP_CSUM_OK cpu_to_le16(BIT(9))
-#define PCIE_DESC_RX_SPARE cpu_to_le16(BIT(8))
+#define PCIE_DESC_RX_BPF cpu_to_le16(BIT(8))
#define PCIE_DESC_RX_EOP cpu_to_le16(BIT(7))
#define PCIE_DESC_RX_IP4_CSUM cpu_to_le16(BIT(6))
#define PCIE_DESC_RX_IP4_CSUM_OK cpu_to_le16(BIT(5))
@@ -266,6 +269,8 @@
};
};
+#define NFP_NET_META_FIELD_MASK GENMASK(NFP_NET_META_FIELD_SIZE - 1, 0)
+
struct nfp_net_rx_hash {
__be32 hash_type;
__be32 hash;
@@ -405,6 +410,11 @@
fw_ver->minor == minor;
}
+struct nfp_stat_pair {
+ u64 pkts;
+ u64 bytes;
+};
+
/**
* struct nfp_net - NFP network device structure
* @pdev: Backpointer to PCI device
@@ -413,6 +423,7 @@
* @is_vf: Is the driver attached to a VF?
* @is_nfp3200: Is the driver for a NFP-3200 card?
* @fw_loaded: Is the firmware loaded?
+ * @bpf_offload_skip_sw: Offloaded BPF program will not be rerun by cls_bpf
* @ctrl: Local copy of the control register/word.
* @fl_bufsz: Currently configured size of the freelist buffers
* @rx_offset: Offset in the RX buffers where packet data starts
@@ -427,6 +438,11 @@
* @rss_cfg: RSS configuration
* @rss_key: RSS secret key
* @rss_itbl: RSS indirection table
+ * @rx_filter: Filter offload statistics - dropped packets/bytes
+ * @rx_filter_prev: Filter offload statistics - values from previous update
+ * @rx_filter_change: Jiffies when statistics last changed
+ * @rx_filter_stats_timer: Timer for polling filter offload statistics
+ * @rx_filter_lock: Lock protecting timer state changes (teardown)
* @max_tx_rings: Maximum number of TX rings supported by the Firmware
* @max_rx_rings: Maximum number of RX rings supported by the Firmware
* @num_tx_rings: Currently configured number of TX rings
@@ -473,6 +489,7 @@
unsigned is_vf:1;
unsigned is_nfp3200:1;
unsigned fw_loaded:1;
+ unsigned bpf_offload_skip_sw:1;
u32 ctrl;
u32 fl_bufsz;
@@ -502,6 +519,11 @@
u8 rss_key[NFP_NET_CFG_RSS_KEY_SZ];
u8 rss_itbl[NFP_NET_CFG_RSS_ITBL_SZ];
+ struct nfp_stat_pair rx_filter, rx_filter_prev;
+ unsigned long rx_filter_change;
+ struct timer_list rx_filter_stats_timer;
+ spinlock_t rx_filter_lock;
+
int max_tx_rings;
int max_rx_rings;
@@ -561,12 +583,28 @@
/* Functions to read/write from/to a BAR
* Performs any endian conversion necessary.
*/
+static inline u16 nn_readb(struct nfp_net *nn, int off)
+{
+ return readb(nn->ctrl_bar + off);
+}
+
static inline void nn_writeb(struct nfp_net *nn, int off, u8 val)
{
writeb(val, nn->ctrl_bar + off);
}
-/* NFP-3200 can't handle 16-bit accesses too well - hence no readw/writew */
+/* NFP-3200 can't handle 16-bit accesses too well */
+static inline u16 nn_readw(struct nfp_net *nn, int off)
+{
+ WARN_ON_ONCE(nn->is_nfp3200);
+ return readw(nn->ctrl_bar + off);
+}
+
+static inline void nn_writew(struct nfp_net *nn, int off, u16 val)
+{
+ WARN_ON_ONCE(nn->is_nfp3200);
+ writew(val, nn->ctrl_bar + off);
+}
static inline u32 nn_readl(struct nfp_net *nn, int off)
{
@@ -757,4 +795,9 @@
}
#endif /* CONFIG_NFP_NET_DEBUG */
+void nfp_net_filter_stats_timer(unsigned long data);
+int
+nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
+ struct tc_cls_bpf_offload *cls_bpf);
+
#endif /* _NFP_NET_H_ */
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 252e492..aee3fd2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -60,6 +60,7 @@
#include <linux/ktime.h>
+#include <net/pkt_cls.h>
#include <net/vxlan.h>
#include "nfp_net_ctrl.h"
@@ -1292,36 +1293,70 @@
}
}
-/**
- * nfp_net_set_hash() - Set SKB hash data
- * @netdev: adapter's net_device structure
- * @skb: SKB to set the hash data on
- * @rxd: RX descriptor
- *
- * The RSS hash and hash-type are pre-pended to the packet data.
- * Extract and decode it and set the skb fields.
- */
static void nfp_net_set_hash(struct net_device *netdev, struct sk_buff *skb,
- struct nfp_net_rx_desc *rxd)
+ unsigned int type, __be32 *hash)
+{
+ if (!(netdev->features & NETIF_F_RXHASH))
+ return;
+
+ switch (type) {
+ case NFP_NET_RSS_IPV4:
+ case NFP_NET_RSS_IPV6:
+ case NFP_NET_RSS_IPV6_EX:
+ skb_set_hash(skb, get_unaligned_be32(hash), PKT_HASH_TYPE_L3);
+ break;
+ default:
+ skb_set_hash(skb, get_unaligned_be32(hash), PKT_HASH_TYPE_L4);
+ break;
+ }
+}
+
+static void
+nfp_net_set_hash_desc(struct net_device *netdev, struct sk_buff *skb,
+ struct nfp_net_rx_desc *rxd)
{
struct nfp_net_rx_hash *rx_hash;
- if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS) ||
- !(netdev->features & NETIF_F_RXHASH))
+ if (!(rxd->rxd.flags & PCIE_DESC_RX_RSS))
return;
rx_hash = (struct nfp_net_rx_hash *)(skb->data - sizeof(*rx_hash));
- switch (be32_to_cpu(rx_hash->hash_type)) {
- case NFP_NET_RSS_IPV4:
- case NFP_NET_RSS_IPV6:
- case NFP_NET_RSS_IPV6_EX:
- skb_set_hash(skb, be32_to_cpu(rx_hash->hash), PKT_HASH_TYPE_L3);
- break;
- default:
- skb_set_hash(skb, be32_to_cpu(rx_hash->hash), PKT_HASH_TYPE_L4);
- break;
+ nfp_net_set_hash(netdev, skb, get_unaligned_be32(&rx_hash->hash_type),
+ &rx_hash->hash);
+}
+
+static void *
+nfp_net_parse_meta(struct net_device *netdev, struct sk_buff *skb,
+ int meta_len)
+{
+ u8 *data = skb->data - meta_len;
+ u32 meta_info;
+
+ meta_info = get_unaligned_be32(data);
+ data += 4;
+
+ while (meta_info) {
+ switch (meta_info & NFP_NET_META_FIELD_MASK) {
+ case NFP_NET_META_HASH:
+ meta_info >>= NFP_NET_META_FIELD_SIZE;
+ nfp_net_set_hash(netdev, skb,
+ meta_info & NFP_NET_META_FIELD_MASK,
+ (__be32 *)data);
+ data += 4;
+ break;
+ case NFP_NET_META_MARK:
+ skb->mark = get_unaligned_be32(data);
+ data += 4;
+ break;
+ default:
+ return NULL;
+ }
+
+ meta_info >>= NFP_NET_META_FIELD_SIZE;
}
+
+ return data;
}
/**
@@ -1438,14 +1473,29 @@
skb_reserve(skb, nn->rx_offset);
skb_put(skb, data_len - meta_len);
- nfp_net_set_hash(nn->netdev, skb, rxd);
-
/* Stats update */
u64_stats_update_begin(&r_vec->rx_sync);
r_vec->rx_pkts++;
r_vec->rx_bytes += skb->len;
u64_stats_update_end(&r_vec->rx_sync);
+ if (nn->fw_ver.major <= 3) {
+ nfp_net_set_hash_desc(nn->netdev, skb, rxd);
+ } else if (meta_len) {
+ void *end;
+
+ end = nfp_net_parse_meta(nn->netdev, skb, meta_len);
+ if (unlikely(end != skb->data)) {
+ u64_stats_update_begin(&r_vec->rx_sync);
+ r_vec->rx_drops++;
+ u64_stats_update_end(&r_vec->rx_sync);
+
+ dev_kfree_skb_any(skb);
+ nn_warn_ratelimit(nn, "invalid RX packet metadata\n");
+ continue;
+ }
+ }
+
skb_record_rx_queue(skb, rx_ring->idx);
skb->protocol = eth_type_trans(skb, nn->netdev);
@@ -2044,12 +2094,16 @@
nn->rx_rings = kcalloc(nn->num_rx_rings, sizeof(*nn->rx_rings),
GFP_KERNEL);
- if (!nn->rx_rings)
+ if (!nn->rx_rings) {
+ err = -ENOMEM;
goto err_free_lsc;
+ }
nn->tx_rings = kcalloc(nn->num_tx_rings, sizeof(*nn->tx_rings),
GFP_KERNEL);
- if (!nn->tx_rings)
+ if (!nn->tx_rings) {
+ err = -ENOMEM;
goto err_free_rx_rings;
+ }
for (r = 0; r < nn->num_r_vecs; r++) {
err = nfp_net_prepare_vector(nn, &nn->r_vecs[r], r);
@@ -2382,6 +2436,31 @@
return stats;
}
+static bool nfp_net_ebpf_capable(struct nfp_net *nn)
+{
+ if (nn->cap & NFP_NET_CFG_CTRL_BPF &&
+ nn_readb(nn, NFP_NET_CFG_BPF_ABI) == NFP_NET_BPF_ABI)
+ return true;
+ return false;
+}
+
+static int
+nfp_net_setup_tc(struct net_device *netdev, u32 handle, __be16 proto,
+ struct tc_to_netdev *tc)
+{
+ struct nfp_net *nn = netdev_priv(netdev);
+
+ if (TC_H_MAJ(handle) != TC_H_MAJ(TC_H_INGRESS))
+ return -ENOTSUPP;
+ if (proto != htons(ETH_P_ALL))
+ return -ENOTSUPP;
+
+ if (tc->type == TC_SETUP_CLSBPF && nfp_net_ebpf_capable(nn))
+ return nfp_net_bpf_offload(nn, handle, proto, tc->cls_bpf);
+
+ return -EINVAL;
+}
+
static int nfp_net_set_features(struct net_device *netdev,
netdev_features_t features)
{
@@ -2436,6 +2515,11 @@
new_ctrl &= ~NFP_NET_CFG_CTRL_GATHER;
}
+ if (changed & NETIF_F_HW_TC && nn->ctrl & NFP_NET_CFG_CTRL_BPF) {
+ nn_err(nn, "Cannot disable HW TC offload while in use\n");
+ return -EBUSY;
+ }
+
nn_dbg(nn, "Feature change 0x%llx -> 0x%llx (changed=0x%llx)\n",
netdev->features, features, changed);
@@ -2585,6 +2669,7 @@
.ndo_stop = nfp_net_netdev_close,
.ndo_start_xmit = nfp_net_tx,
.ndo_get_stats64 = nfp_net_stat64,
+ .ndo_setup_tc = nfp_net_setup_tc,
.ndo_tx_timeout = nfp_net_tx_timeout,
.ndo_set_rx_mode = nfp_net_set_rx_mode,
.ndo_change_mtu = nfp_net_change_mtu,
@@ -2610,7 +2695,7 @@
nn->fw_ver.resv, nn->fw_ver.class,
nn->fw_ver.major, nn->fw_ver.minor,
nn->max_mtu);
- nn_info(nn, "CAP: %#x %s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
+ nn_info(nn, "CAP: %#x %s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n",
nn->cap,
nn->cap & NFP_NET_CFG_CTRL_PROMISC ? "PROMISC " : "",
nn->cap & NFP_NET_CFG_CTRL_L2BC ? "L2BCFILT " : "",
@@ -2627,7 +2712,8 @@
nn->cap & NFP_NET_CFG_CTRL_MSIXAUTO ? "AUTOMASK " : "",
nn->cap & NFP_NET_CFG_CTRL_IRQMOD ? "IRQMOD " : "",
nn->cap & NFP_NET_CFG_CTRL_VXLAN ? "VXLAN " : "",
- nn->cap & NFP_NET_CFG_CTRL_NVGRE ? "NVGRE " : "");
+ nn->cap & NFP_NET_CFG_CTRL_NVGRE ? "NVGRE " : "",
+ nfp_net_ebpf_capable(nn) ? "BPF " : "");
}
/**
@@ -2670,10 +2756,13 @@
nn->rxd_cnt = NFP_NET_RX_DESCS_DEFAULT;
spin_lock_init(&nn->reconfig_lock);
+ spin_lock_init(&nn->rx_filter_lock);
spin_lock_init(&nn->link_status_lock);
setup_timer(&nn->reconfig_timer,
nfp_net_reconfig_timer, (unsigned long)nn);
+ setup_timer(&nn->rx_filter_stats_timer,
+ nfp_net_filter_stats_timer, (unsigned long)nn);
return nn;
}
@@ -2795,6 +2884,9 @@
netdev->features = netdev->hw_features;
+ if (nfp_net_ebpf_capable(nn))
+ netdev->hw_features |= NETIF_F_HW_TC;
+
/* Advertise but disable TSO by default. */
netdev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
index ad6c4e3..93b10b4 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h
@@ -66,6 +66,13 @@
#define NFP_NET_LSO_MAX_HDR_SZ 255
/**
+ * Prepend field types
+ */
+#define NFP_NET_META_FIELD_SIZE 4
+#define NFP_NET_META_HASH 1 /* next field carries hash type */
+#define NFP_NET_META_MARK 2
+
+/**
* Hash type pre-pended when a RSS hash was computed
*/
#define NFP_NET_RSS_NONE 0
@@ -123,6 +130,7 @@
#define NFP_NET_CFG_CTRL_L2SWITCH_LOCAL (0x1 << 23) /* Switch to local */
#define NFP_NET_CFG_CTRL_VXLAN (0x1 << 24) /* VXLAN tunnel support */
#define NFP_NET_CFG_CTRL_NVGRE (0x1 << 25) /* NVGRE tunnel support */
+#define NFP_NET_CFG_CTRL_BPF (0x1 << 27) /* BPF offload capable */
#define NFP_NET_CFG_UPDATE 0x0004
#define NFP_NET_CFG_UPDATE_GEN (0x1 << 0) /* General update */
#define NFP_NET_CFG_UPDATE_RING (0x1 << 1) /* Ring config change */
@@ -134,6 +142,7 @@
#define NFP_NET_CFG_UPDATE_RESET (0x1 << 7) /* Update due to FLR */
#define NFP_NET_CFG_UPDATE_IRQMOD (0x1 << 8) /* IRQ mod change */
#define NFP_NET_CFG_UPDATE_VXLAN (0x1 << 9) /* VXLAN port change */
+#define NFP_NET_CFG_UPDATE_BPF (0x1 << 10) /* BPF program load */
#define NFP_NET_CFG_UPDATE_ERR (0x1 << 31) /* A error occurred */
#define NFP_NET_CFG_TXRS_ENABLE 0x0008
#define NFP_NET_CFG_RXRS_ENABLE 0x0010
@@ -196,10 +205,37 @@
#define NFP_NET_CFG_VXLAN_SZ 0x0008
/**
- * 64B reserved for future use (0x0080 - 0x00c0)
+ * NFP6000 - BPF section
+ * @NFP_NET_CFG_BPF_ABI: BPF ABI version
+ * @NFP_NET_CFG_BPF_CAP: BPF capabilities
+ * @NFP_NET_CFG_BPF_MAX_LEN: Maximum size of JITed BPF code in bytes
+ * @NFP_NET_CFG_BPF_START: Offset at which BPF will be loaded
+ * @NFP_NET_CFG_BPF_DONE: Offset to jump to on exit
+ * @NFP_NET_CFG_BPF_STACK_SZ: Total size of stack area in 64B chunks
+ * @NFP_NET_CFG_BPF_INL_MTU: Packet data split offset in 64B chunks
+ * @NFP_NET_CFG_BPF_SIZE: Size of the JITed BPF code in instructions
+ * @NFP_NET_CFG_BPF_ADDR: DMA address of the buffer with JITed BPF code
*/
-#define NFP_NET_CFG_RESERVED 0x0080
-#define NFP_NET_CFG_RESERVED_SZ 0x0040
+#define NFP_NET_CFG_BPF_ABI 0x0080
+#define NFP_NET_BPF_ABI 1
+#define NFP_NET_CFG_BPF_CAP 0x0081
+#define NFP_NET_BPF_CAP_RELO (1 << 0) /* seamless reload */
+#define NFP_NET_CFG_BPF_MAX_LEN 0x0082
+#define NFP_NET_CFG_BPF_START 0x0084
+#define NFP_NET_CFG_BPF_DONE 0x0086
+#define NFP_NET_CFG_BPF_STACK_SZ 0x0088
+#define NFP_NET_CFG_BPF_INL_MTU 0x0089
+#define NFP_NET_CFG_BPF_SIZE 0x008e
+#define NFP_NET_CFG_BPF_ADDR 0x0090
+#define NFP_NET_CFG_BPF_CFG_8CTX (1 << 0) /* 8ctx mode */
+#define NFP_NET_CFG_BPF_CFG_MASK 7ULL
+#define NFP_NET_CFG_BPF_ADDR_MASK (~NFP_NET_CFG_BPF_CFG_MASK)
+
+/**
+ * 40B reserved for future use (0x0098 - 0x00c0)
+ */
+#define NFP_NET_CFG_RESERVED 0x0098
+#define NFP_NET_CFG_RESERVED_SZ 0x0028
/**
* RSS configuration (0x0100 - 0x01ac):
@@ -303,6 +339,15 @@
#define NFP_NET_CFG_STATS_TX_MC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x80)
#define NFP_NET_CFG_STATS_TX_BC_FRAMES (NFP_NET_CFG_STATS_BASE + 0x88)
+#define NFP_NET_CFG_STATS_APP0_FRAMES (NFP_NET_CFG_STATS_BASE + 0x90)
+#define NFP_NET_CFG_STATS_APP0_BYTES (NFP_NET_CFG_STATS_BASE + 0x98)
+#define NFP_NET_CFG_STATS_APP1_FRAMES (NFP_NET_CFG_STATS_BASE + 0xa0)
+#define NFP_NET_CFG_STATS_APP1_BYTES (NFP_NET_CFG_STATS_BASE + 0xa8)
+#define NFP_NET_CFG_STATS_APP2_FRAMES (NFP_NET_CFG_STATS_BASE + 0xb0)
+#define NFP_NET_CFG_STATS_APP2_BYTES (NFP_NET_CFG_STATS_BASE + 0xb8)
+#define NFP_NET_CFG_STATS_APP3_FRAMES (NFP_NET_CFG_STATS_BASE + 0xc0)
+#define NFP_NET_CFG_STATS_APP3_BYTES (NFP_NET_CFG_STATS_BASE + 0xc8)
+
/**
* Per ring stats (0x1000 - 0x1800)
* options, 64bit per entry
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 4c98972..3418f22 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -106,6 +106,18 @@
{"dev_tx_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_TX_FRAMES)},
{"dev_tx_mc_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_TX_MC_FRAMES)},
{"dev_tx_bc_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_TX_BC_FRAMES)},
+
+ {"bpf_pass_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP0_FRAMES)},
+ {"bpf_pass_bytes", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP0_BYTES)},
+ /* see comments in outro functions in nfp_bpf_jit.c to find out
+ * how different BPF modes use app-specific counters
+ */
+ {"bpf_app1_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP1_FRAMES)},
+ {"bpf_app1_bytes", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP1_BYTES)},
+ {"bpf_app2_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP2_FRAMES)},
+ {"bpf_app2_bytes", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP2_BYTES)},
+ {"bpf_app3_pkts", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP3_FRAMES)},
+ {"bpf_app3_bytes", NN_ET_DEV_STAT(NFP_NET_CFG_STATS_APP3_BYTES)},
};
#define NN_ET_GLOBAL_STATS_LEN ARRAY_SIZE(nfp_net_et_stats)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
new file mode 100644
index 0000000..8acfb63
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_offload.c
@@ -0,0 +1,294 @@
+/*
+ * Copyright (C) 2016 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below. You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * nfp_net_offload.c
+ * Netronome network device driver: TC offload functions for PF and VF
+ */
+
+#include <linux/kernel.h>
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+#include <linux/jiffies.h>
+#include <linux/timer.h>
+#include <linux/list.h>
+
+#include <net/pkt_cls.h>
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
+#include "nfp_bpf.h"
+#include "nfp_net_ctrl.h"
+#include "nfp_net.h"
+
+void nfp_net_filter_stats_timer(unsigned long data)
+{
+ struct nfp_net *nn = (void *)data;
+ struct nfp_stat_pair latest;
+
+ spin_lock_bh(&nn->rx_filter_lock);
+
+ if (nn->ctrl & NFP_NET_CFG_CTRL_BPF)
+ mod_timer(&nn->rx_filter_stats_timer,
+ jiffies + NFP_NET_STAT_POLL_IVL);
+
+ spin_unlock_bh(&nn->rx_filter_lock);
+
+ latest.pkts = nn_readq(nn, NFP_NET_CFG_STATS_APP1_FRAMES);
+ latest.bytes = nn_readq(nn, NFP_NET_CFG_STATS_APP1_BYTES);
+
+ if (latest.pkts != nn->rx_filter.pkts)
+ nn->rx_filter_change = jiffies;
+
+ nn->rx_filter = latest;
+}
+
+static void nfp_net_bpf_stats_reset(struct nfp_net *nn)
+{
+ nn->rx_filter.pkts = nn_readq(nn, NFP_NET_CFG_STATS_APP1_FRAMES);
+ nn->rx_filter.bytes = nn_readq(nn, NFP_NET_CFG_STATS_APP1_BYTES);
+ nn->rx_filter_prev = nn->rx_filter;
+ nn->rx_filter_change = jiffies;
+}
+
+static int
+nfp_net_bpf_stats_update(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
+{
+ struct tc_action *a;
+ LIST_HEAD(actions);
+ u64 bytes, pkts;
+
+ pkts = nn->rx_filter.pkts - nn->rx_filter_prev.pkts;
+ bytes = nn->rx_filter.bytes - nn->rx_filter_prev.bytes;
+ bytes -= pkts * ETH_HLEN;
+
+ nn->rx_filter_prev = nn->rx_filter;
+
+ preempt_disable();
+
+ tcf_exts_to_list(cls_bpf->exts, &actions);
+ list_for_each_entry(a, &actions, list)
+ tcf_action_stats_update(a, bytes, pkts, nn->rx_filter_change);
+
+ preempt_enable();
+
+ return 0;
+}
+
+static int
+nfp_net_bpf_get_act(struct nfp_net *nn, struct tc_cls_bpf_offload *cls_bpf)
+{
+ const struct tc_action *a;
+ LIST_HEAD(actions);
+
+ /* TC direct action */
+ if (cls_bpf->exts_integrated) {
+ if (tc_no_actions(cls_bpf->exts))
+ return NN_ACT_DIRECT;
+
+ return -ENOTSUPP;
+ }
+
+ /* TC legacy mode */
+ if (!tc_single_action(cls_bpf->exts))
+ return -ENOTSUPP;
+
+ tcf_exts_to_list(cls_bpf->exts, &actions);
+ list_for_each_entry(a, &actions, list) {
+ if (is_tcf_gact_shot(a))
+ return NN_ACT_TC_DROP;
+
+ if (is_tcf_mirred_redirect(a) &&
+ tcf_mirred_ifindex(a) == nn->netdev->ifindex)
+ return NN_ACT_TC_REDIR;
+ }
+
+ return -ENOTSUPP;
+}
+
+static int
+nfp_net_bpf_offload_prepare(struct nfp_net *nn,
+ struct tc_cls_bpf_offload *cls_bpf,
+ struct nfp_bpf_result *res,
+ void **code, dma_addr_t *dma_addr, u16 max_instr)
+{
+ unsigned int code_sz = max_instr * sizeof(u64);
+ enum nfp_bpf_action_type act;
+ u16 start_off, done_off;
+ unsigned int max_mtu;
+ int ret;
+
+ if (!IS_ENABLED(CONFIG_BPF_SYSCALL))
+ return -ENOTSUPP;
+
+ ret = nfp_net_bpf_get_act(nn, cls_bpf);
+ if (ret < 0)
+ return ret;
+ act = ret;
+
+ max_mtu = nn_readb(nn, NFP_NET_CFG_BPF_INL_MTU) * 64 - 32;
+ if (max_mtu < nn->netdev->mtu) {
+ nn_info(nn, "BPF offload not supported with MTU larger than HW packet split boundary\n");
+ return -ENOTSUPP;
+ }
+
+ start_off = nn_readw(nn, NFP_NET_CFG_BPF_START);
+ done_off = nn_readw(nn, NFP_NET_CFG_BPF_DONE);
+
+ *code = dma_zalloc_coherent(&nn->pdev->dev, code_sz, dma_addr,
+ GFP_KERNEL);
+ if (!*code)
+ return -ENOMEM;
+
+ ret = nfp_bpf_jit(cls_bpf->prog, *code, act, start_off, done_off,
+ max_instr, res);
+ if (ret)
+ goto out;
+
+ return 0;
+
+out:
+ dma_free_coherent(&nn->pdev->dev, code_sz, *code, *dma_addr);
+ return ret;
+}
+
+static void
+nfp_net_bpf_load_and_start(struct nfp_net *nn, u32 tc_flags,
+ void *code, dma_addr_t dma_addr,
+ unsigned int code_sz, unsigned int n_instr,
+ bool dense_mode)
+{
+ u64 bpf_addr = dma_addr;
+ int err;
+
+ nn->bpf_offload_skip_sw = !!(tc_flags & TCA_CLS_FLAGS_SKIP_SW);
+
+ if (dense_mode)
+ bpf_addr |= NFP_NET_CFG_BPF_CFG_8CTX;
+
+ nn_writew(nn, NFP_NET_CFG_BPF_SIZE, n_instr);
+ nn_writeq(nn, NFP_NET_CFG_BPF_ADDR, bpf_addr);
+
+ /* Load up the JITed code */
+ err = nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_BPF);
+ if (err)
+ nn_err(nn, "FW command error while loading BPF: %d\n", err);
+
+ /* Enable passing packets through BPF function */
+ nn->ctrl |= NFP_NET_CFG_CTRL_BPF;
+ nn_writel(nn, NFP_NET_CFG_CTRL, nn->ctrl);
+ err = nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_GEN);
+ if (err)
+ nn_err(nn, "FW command error while enabling BPF: %d\n", err);
+
+ dma_free_coherent(&nn->pdev->dev, code_sz, code, dma_addr);
+
+ nfp_net_bpf_stats_reset(nn);
+ mod_timer(&nn->rx_filter_stats_timer, jiffies + NFP_NET_STAT_POLL_IVL);
+}
+
+static int nfp_net_bpf_stop(struct nfp_net *nn)
+{
+ if (!(nn->ctrl & NFP_NET_CFG_CTRL_BPF))
+ return 0;
+
+ spin_lock_bh(&nn->rx_filter_lock);
+ nn->ctrl &= ~NFP_NET_CFG_CTRL_BPF;
+ spin_unlock_bh(&nn->rx_filter_lock);
+ nn_writel(nn, NFP_NET_CFG_CTRL, nn->ctrl);
+
+ del_timer_sync(&nn->rx_filter_stats_timer);
+ nn->bpf_offload_skip_sw = 0;
+
+ return nfp_net_reconfig(nn, NFP_NET_CFG_UPDATE_GEN);
+}
+
+int
+nfp_net_bpf_offload(struct nfp_net *nn, u32 handle, __be16 proto,
+ struct tc_cls_bpf_offload *cls_bpf)
+{
+ struct nfp_bpf_result res;
+ dma_addr_t dma_addr;
+ u16 max_instr;
+ void *code;
+ int err;
+
+ max_instr = nn_readw(nn, NFP_NET_CFG_BPF_MAX_LEN);
+
+ switch (cls_bpf->command) {
+ case TC_CLSBPF_REPLACE:
+ /* There is nothing stopping us from implementing seamless
+ * replace but the simple method of loading I adopted in
+ * the firmware does not handle atomic replace (i.e. we have to
+ * stop the BPF offload and re-enable it). Leaking-in a few
+ * frames which didn't have BPF applied in the hardware should
+ * be fine if software fallback is available, though.
+ */
+ if (nn->bpf_offload_skip_sw)
+ return -EBUSY;
+
+ err = nfp_net_bpf_offload_prepare(nn, cls_bpf, &res, &code,
+ &dma_addr, max_instr);
+ if (err)
+ return err;
+
+ nfp_net_bpf_stop(nn);
+ nfp_net_bpf_load_and_start(nn, cls_bpf->gen_flags, code,
+ dma_addr, max_instr * sizeof(u64),
+ res.n_instr, res.dense_mode);
+ return 0;
+
+ case TC_CLSBPF_ADD:
+ if (nn->ctrl & NFP_NET_CFG_CTRL_BPF)
+ return -EBUSY;
+
+ err = nfp_net_bpf_offload_prepare(nn, cls_bpf, &res, &code,
+ &dma_addr, max_instr);
+ if (err)
+ return err;
+
+ nfp_net_bpf_load_and_start(nn, cls_bpf->gen_flags, code,
+ dma_addr, max_instr * sizeof(u64),
+ res.n_instr, res.dense_mode);
+ return 0;
+
+ case TC_CLSBPF_DESTROY:
+ return nfp_net_bpf_stop(nn);
+
+ case TC_CLSBPF_STATS:
+ return nfp_net_bpf_stats_update(nn, cls_bpf);
+
+ default:
+ return -ENOTSUPP;
+ }
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c b/drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c
index f7062cb..2800bbf 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c
@@ -148,7 +148,7 @@
dev_warn(&pdev->dev, "OBSOLETE Firmware detected - VF isolation not available\n");
} else {
switch (fw_ver.major) {
- case 1 ... 3:
+ case 1 ... 4:
if (is_nfp3200) {
stride = 2;
tx_bar_no = NFP_NET_Q0_BAR;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 7d39cb9..bdc9ba9 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -1181,8 +1181,8 @@
p_drv_version = &union_data.drv_version;
p_drv_version->version = p_ver->version;
- for (i = 0; i < MCP_DRV_VER_STR_SIZE - 1; i += 4) {
- val = cpu_to_be32(p_ver->name[i]);
+ for (i = 0; i < (MCP_DRV_VER_STR_SIZE - 4) / sizeof(u32); i++) {
+ val = cpu_to_be32(*((u32 *)&p_ver->name[i * sizeof(u32)]));
*(__be32 *)&p_drv_version->name[i * sizeof(u32)] = val;
}
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index cd23a29..0e198fe 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -100,7 +100,8 @@
static void qede_link_update(void *dev, struct qed_link_output *link);
#ifdef CONFIG_QED_SRIOV
-static int qede_set_vf_vlan(struct net_device *ndev, int vf, u16 vlan, u8 qos)
+static int qede_set_vf_vlan(struct net_device *ndev, int vf, u16 vlan, u8 qos,
+ __be16 vlan_proto)
{
struct qede_dev *edev = netdev_priv(ndev);
@@ -109,6 +110,9 @@
return -EINVAL;
}
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
DP_VERBOSE(edev, QED_MSG_IOV, "Setting Vlan 0x%04x to VF [%d]\n",
vlan, vf);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
index 24061b9..5f32765 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
@@ -238,7 +238,7 @@
int qlcnic_sriov_set_vf_tx_rate(struct net_device *, int, int, int);
int qlcnic_sriov_get_vf_config(struct net_device *, int ,
struct ifla_vf_info *);
-int qlcnic_sriov_set_vf_vlan(struct net_device *, int, u16, u8);
+int qlcnic_sriov_set_vf_vlan(struct net_device *, int, u16, u8, __be16);
int qlcnic_sriov_set_vf_spoofchk(struct net_device *, int, bool);
#else
static inline void qlcnic_sriov_pf_disable(struct qlcnic_adapter *adapter) {}
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
index afd687e..50eaafa 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
@@ -1915,7 +1915,7 @@
}
int qlcnic_sriov_set_vf_vlan(struct net_device *netdev, int vf,
- u16 vlan, u8 qos)
+ u16 vlan, u8 qos, __be16 vlan_proto)
{
struct qlcnic_adapter *adapter = netdev_priv(netdev);
struct qlcnic_sriov *sriov = adapter->ahw->sriov;
@@ -1928,6 +1928,9 @@
if (vf >= sriov->num_vfs || qos > 7)
return -EINVAL;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
if (vlan > MAX_VLAN_ID) {
netdev_err(netdev,
"Invalid VLAN ID, allowed range is [0 - %d]\n",
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index c412ba9..da4e90d 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -19,6 +19,7 @@
#include <linux/of_mdio.h>
#include <linux/phy.h>
#include <linux/iopoll.h>
+#include <linux/acpi.h>
#include "emac.h"
#include "emac-mac.h"
#include "emac-phy.h"
@@ -167,7 +168,6 @@
int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
{
struct device_node *np = pdev->dev.of_node;
- struct device_node *phy_np;
struct mii_bus *mii_bus;
int ret;
@@ -183,14 +183,37 @@
mii_bus->parent = &pdev->dev;
mii_bus->priv = adpt;
- ret = of_mdiobus_register(mii_bus, np);
- if (ret) {
- dev_err(&pdev->dev, "could not register mdio bus\n");
- return ret;
+ if (has_acpi_companion(&pdev->dev)) {
+ u32 phy_addr;
+
+ ret = mdiobus_register(mii_bus);
+ if (ret) {
+ dev_err(&pdev->dev, "could not register mdio bus\n");
+ return ret;
+ }
+ ret = device_property_read_u32(&pdev->dev, "phy-channel",
+ &phy_addr);
+ if (ret)
+ /* If we can't read a valid phy address, then assume
+ * that there is only one phy on this mdio bus.
+ */
+ adpt->phydev = phy_find_first(mii_bus);
+ else
+ adpt->phydev = mdiobus_get_phy(mii_bus, phy_addr);
+
+ } else {
+ struct device_node *phy_np;
+
+ ret = of_mdiobus_register(mii_bus, np);
+ if (ret) {
+ dev_err(&pdev->dev, "could not register mdio bus\n");
+ return ret;
+ }
+
+ phy_np = of_parse_phandle(np, "phy-handle", 0);
+ adpt->phydev = of_phy_find_device(phy_np);
}
- phy_np = of_parse_phandle(np, "phy-handle", 0);
- adpt->phydev = of_phy_find_device(phy_np);
if (!adpt->phydev) {
dev_err(&pdev->dev, "could not find external phy\n");
mdiobus_unregister(mii_bus);
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index 6ab0a3c..3d2c05a 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -14,6 +14,7 @@
*/
#include <linux/iopoll.h>
+#include <linux/acpi.h>
#include <linux/of_device.h>
#include "emac.h"
#include "emac-mac.h"
@@ -662,6 +663,24 @@
clk_set_rate(adpt->clk[EMAC_CLK_HIGH_SPEED], 125000000);
}
+static int emac_sgmii_acpi_match(struct device *dev, void *data)
+{
+ static const struct acpi_device_id match_table[] = {
+ {
+ .id = "QCOM8071",
+ .driver_data = (kernel_ulong_t)emac_sgmii_init_v2,
+ },
+ {}
+ };
+ const struct acpi_device_id *id = acpi_match_device(match_table, dev);
+ emac_sgmii_initialize *initialize = data;
+
+ if (id)
+ *initialize = (emac_sgmii_initialize)id->driver_data;
+
+ return !!id;
+}
+
static const struct of_device_id emac_sgmii_dt_match[] = {
{
.compatible = "qcom,fsm9900-emac-sgmii",
@@ -679,43 +698,82 @@
struct platform_device *sgmii_pdev = NULL;
struct emac_phy *phy = &adpt->phy;
struct resource *res;
- const struct of_device_id *match;
- struct device_node *np;
+ int ret;
- np = of_parse_phandle(pdev->dev.of_node, "internal-phy", 0);
- if (!np) {
- dev_err(&pdev->dev, "missing internal-phy property\n");
- return -ENODEV;
+ if (has_acpi_companion(&pdev->dev)) {
+ struct device *dev;
+
+ dev = device_find_child(&pdev->dev, &phy->initialize,
+ emac_sgmii_acpi_match);
+
+ if (!dev) {
+ dev_err(&pdev->dev, "cannot find internal phy node\n");
+ return -ENODEV;
+ }
+
+ sgmii_pdev = to_platform_device(dev);
+ } else {
+ const struct of_device_id *match;
+ struct device_node *np;
+
+ np = of_parse_phandle(pdev->dev.of_node, "internal-phy", 0);
+ if (!np) {
+ dev_err(&pdev->dev, "missing internal-phy property\n");
+ return -ENODEV;
+ }
+
+ sgmii_pdev = of_find_device_by_node(np);
+ if (!sgmii_pdev) {
+ dev_err(&pdev->dev, "invalid internal-phy property\n");
+ return -ENODEV;
+ }
+
+ match = of_match_device(emac_sgmii_dt_match, &sgmii_pdev->dev);
+ if (!match) {
+ dev_err(&pdev->dev, "unrecognized internal phy node\n");
+ ret = -ENODEV;
+ goto error_put_device;
+ }
+
+ phy->initialize = (emac_sgmii_initialize)match->data;
}
- sgmii_pdev = of_find_device_by_node(np);
- if (!sgmii_pdev) {
- dev_err(&pdev->dev, "invalid internal-phy property\n");
- return -ENODEV;
- }
-
- match = of_match_device(emac_sgmii_dt_match, &sgmii_pdev->dev);
- if (!match) {
- dev_err(&pdev->dev, "unrecognized internal phy node\n");
- return -ENODEV;
- }
-
- phy->initialize = (emac_sgmii_initialize)match->data;
-
/* Base address is the first address */
res = platform_get_resource(sgmii_pdev, IORESOURCE_MEM, 0);
- phy->base = devm_ioremap_resource(&sgmii_pdev->dev, res);
- if (IS_ERR(phy->base))
- return PTR_ERR(phy->base);
+ phy->base = ioremap(res->start, resource_size(res));
+ if (IS_ERR(phy->base)) {
+ ret = PTR_ERR(phy->base);
+ goto error_put_device;
+ }
/* v2 SGMII has a per-lane digital digital, so parse it if it exists */
res = platform_get_resource(sgmii_pdev, IORESOURCE_MEM, 1);
if (res) {
- phy->digital = devm_ioremap_resource(&sgmii_pdev->dev, res);
- if (IS_ERR(phy->base))
- return PTR_ERR(phy->base);
-
+ phy->digital = ioremap(res->start, resource_size(res));
+ if (IS_ERR(phy->digital)) {
+ ret = PTR_ERR(phy->digital);
+ goto error_unmap_base;
+ }
}
- return phy->initialize(adpt);
+ ret = phy->initialize(adpt);
+ if (ret)
+ goto error;
+
+ /* We've remapped the addresses, so we don't need the device any
+ * more. of_find_device_by_node() says we should release it.
+ */
+ put_device(&sgmii_pdev->dev);
+
+ return 0;
+
+error:
+ if (phy->digital)
+ iounmap(phy->digital);
+error_unmap_base:
+ iounmap(phy->base);
+error_put_device:
+ put_device(&sgmii_pdev->dev);
+
+ return ret;
}
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c
index e47d387..9bf3b2b 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -22,6 +22,7 @@
#include <linux/of_device.h>
#include <linux/phy.h>
#include <linux/platform_device.h>
+#include <linux/acpi.h>
#include "emac.h"
#include "emac-mac.h"
#include "emac-phy.h"
@@ -531,18 +532,16 @@
static int emac_probe_resources(struct platform_device *pdev,
struct emac_adapter *adpt)
{
- struct device_node *node = pdev->dev.of_node;
struct net_device *netdev = adpt->netdev;
struct resource *res;
- const void *maddr;
+ char maddr[ETH_ALEN];
int ret = 0;
/* get mac address */
- maddr = of_get_mac_address(node);
- if (!maddr)
- eth_hw_addr_random(netdev);
- else
+ if (device_get_mac_address(&pdev->dev, maddr, ETH_ALEN))
ether_addr_copy(netdev->dev_addr, maddr);
+ else
+ eth_hw_addr_random(netdev);
/* Core 0 interrupt */
ret = platform_get_irq(pdev, 0);
@@ -577,6 +576,16 @@
{}
};
+#if IS_ENABLED(CONFIG_ACPI)
+static const struct acpi_device_id emac_acpi_match[] = {
+ {
+ .id = "QCOM8070",
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(acpi, emac_acpi_match);
+#endif
+
static int emac_probe(struct platform_device *pdev)
{
struct net_device *netdev;
@@ -723,6 +732,10 @@
mdiobus_unregister(adpt->mii_bus);
free_netdev(netdev);
+ if (adpt->phy.digital)
+ iounmap(adpt->phy.digital);
+ iounmap(adpt->phy.base);
+
return 0;
}
@@ -732,6 +745,7 @@
.driver = {
.name = "qcom-emac",
.of_match_table = emac_dt_match,
+ .acpi_match_table = ACPI_PTR(emac_acpi_match),
},
};
diff --git a/drivers/net/ethernet/rdc/r6040.c b/drivers/net/ethernet/rdc/r6040.c
index cb29ee2..5ef5d72 100644
--- a/drivers/net/ethernet/rdc/r6040.c
+++ b/drivers/net/ethernet/rdc/r6040.c
@@ -1062,14 +1062,12 @@
/* this should always be supported */
err = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
if (err) {
- dev_err(&pdev->dev, "32-bit PCI DMA addresses"
- "not supported by the card\n");
+ dev_err(&pdev->dev, "32-bit PCI DMA addresses not supported by the card\n");
goto err_out_disable_dev;
}
err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
if (err) {
- dev_err(&pdev->dev, "32-bit PCI DMA addresses"
- "not supported by the card\n");
+ dev_err(&pdev->dev, "32-bit PCI DMA addresses not supported by the card\n");
goto err_out_disable_dev;
}
diff --git a/drivers/net/ethernet/renesas/Kconfig b/drivers/net/ethernet/renesas/Kconfig
index 4f132cf..85ec447 100644
--- a/drivers/net/ethernet/renesas/Kconfig
+++ b/drivers/net/ethernet/renesas/Kconfig
@@ -27,7 +27,7 @@
Renesas SuperH Ethernet device driver.
This driver supporting CPUs are:
- SH7619, SH7710, SH7712, SH7724, SH7734, SH7763, SH7757,
- R8A7740, R8A777x and R8A779x.
+ R8A7740, R8A774x, R8A777x and R8A779x.
config RAVB
tristate "Renesas Ethernet AVB support"
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 440ae27..05b0dc5 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -2959,6 +2959,8 @@
static const struct of_device_id sh_eth_match_table[] = {
{ .compatible = "renesas,gether-r8a7740", .data = &r8a7740_data },
+ { .compatible = "renesas,ether-r8a7743", .data = &r8a779x_data },
+ { .compatible = "renesas,ether-r8a7745", .data = &r8a779x_data },
{ .compatible = "renesas,ether-r8a7778", .data = &r8a777x_data },
{ .compatible = "renesas,ether-r8a7779", .data = &r8a777x_data },
{ .compatible = "renesas,ether-r8a7790", .data = &r8a779x_data },
diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
index 1ab995f..2eb9b49 100644
--- a/drivers/net/ethernet/rocker/rocker.h
+++ b/drivers/net/ethernet/rocker/rocker.h
@@ -15,6 +15,7 @@
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/netdevice.h>
+#include <linux/notifier.h>
#include <net/neighbour.h>
#include <net/switchdev.h>
@@ -52,6 +53,9 @@
struct rocker_dma_ring_info rx_ring;
};
+struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev,
+ struct rocker *rocker);
+
struct rocker_world_ops;
struct rocker {
@@ -66,6 +70,7 @@
spinlock_t cmd_ring_lock; /* for cmd ring accesses */
struct rocker_dma_ring_info cmd_ring;
struct rocker_dma_ring_info event_ring;
+ struct notifier_block fib_nb;
struct rocker_world_ops *wops;
void *wpriv;
};
@@ -117,11 +122,6 @@
int (*port_obj_vlan_dump)(const struct rocker_port *rocker_port,
struct switchdev_obj_port_vlan *vlan,
switchdev_obj_dump_cb_t *cb);
- int (*port_obj_fib4_add)(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans);
- int (*port_obj_fib4_del)(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4);
int (*port_obj_fdb_add)(struct rocker_port *rocker_port,
const struct switchdev_obj_port_fdb *fdb,
struct switchdev_trans *trans);
@@ -141,6 +141,11 @@
int (*port_ev_mac_vlan_seen)(struct rocker_port *rocker_port,
const unsigned char *addr,
__be16 vlan_id);
+ int (*fib4_add)(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info);
+ int (*fib4_del)(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info);
+ void (*fib4_abort)(struct rocker *rocker);
};
extern struct rocker_world_ops rocker_ofdpa_ops;
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 1f0c086..5424fb3 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -1625,29 +1625,6 @@
}
static int
-rocker_world_port_obj_fib4_add(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans)
-{
- struct rocker_world_ops *wops = rocker_port->rocker->wops;
-
- if (!wops->port_obj_fib4_add)
- return -EOPNOTSUPP;
- return wops->port_obj_fib4_add(rocker_port, fib4, trans);
-}
-
-static int
-rocker_world_port_obj_fib4_del(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4)
-{
- struct rocker_world_ops *wops = rocker_port->rocker->wops;
-
- if (!wops->port_obj_fib4_del)
- return -EOPNOTSUPP;
- return wops->port_obj_fib4_del(rocker_port, fib4);
-}
-
-static int
rocker_world_port_obj_fdb_add(struct rocker_port *rocker_port,
const struct switchdev_obj_port_fdb *fdb,
struct switchdev_trans *trans)
@@ -1733,6 +1710,34 @@
return wops->port_ev_mac_vlan_seen(rocker_port, addr, vlan_id);
}
+static int rocker_world_fib4_add(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info)
+{
+ struct rocker_world_ops *wops = rocker->wops;
+
+ if (!wops->fib4_add)
+ return 0;
+ return wops->fib4_add(rocker, fen_info);
+}
+
+static int rocker_world_fib4_del(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info)
+{
+ struct rocker_world_ops *wops = rocker->wops;
+
+ if (!wops->fib4_del)
+ return 0;
+ return wops->fib4_del(rocker, fen_info);
+}
+
+static void rocker_world_fib4_abort(struct rocker *rocker)
+{
+ struct rocker_world_ops *wops = rocker->wops;
+
+ if (wops->fib4_abort)
+ wops->fib4_abort(rocker);
+}
+
/*****************
* Net device ops
*****************/
@@ -2096,11 +2101,6 @@
SWITCHDEV_OBJ_PORT_VLAN(obj),
trans);
break;
- case SWITCHDEV_OBJ_ID_IPV4_FIB:
- err = rocker_world_port_obj_fib4_add(rocker_port,
- SWITCHDEV_OBJ_IPV4_FIB(obj),
- trans);
- break;
case SWITCHDEV_OBJ_ID_PORT_FDB:
err = rocker_world_port_obj_fdb_add(rocker_port,
SWITCHDEV_OBJ_PORT_FDB(obj),
@@ -2125,10 +2125,6 @@
err = rocker_world_port_obj_vlan_del(rocker_port,
SWITCHDEV_OBJ_PORT_VLAN(obj));
break;
- case SWITCHDEV_OBJ_ID_IPV4_FIB:
- err = rocker_world_port_obj_fib4_del(rocker_port,
- SWITCHDEV_OBJ_IPV4_FIB(obj));
- break;
case SWITCHDEV_OBJ_ID_PORT_FDB:
err = rocker_world_port_obj_fdb_del(rocker_port,
SWITCHDEV_OBJ_PORT_FDB(obj));
@@ -2175,6 +2171,31 @@
.switchdev_port_obj_dump = rocker_port_obj_dump,
};
+static int rocker_router_fib_event(struct notifier_block *nb,
+ unsigned long event, void *ptr)
+{
+ struct rocker *rocker = container_of(nb, struct rocker, fib_nb);
+ struct fib_entry_notifier_info *fen_info = ptr;
+ int err;
+
+ switch (event) {
+ case FIB_EVENT_ENTRY_ADD:
+ err = rocker_world_fib4_add(rocker, fen_info);
+ if (err)
+ rocker_world_fib4_abort(rocker);
+ else
+ break;
+ case FIB_EVENT_ENTRY_DEL:
+ rocker_world_fib4_del(rocker, fen_info);
+ break;
+ case FIB_EVENT_RULE_ADD: /* fall through */
+ case FIB_EVENT_RULE_DEL:
+ rocker_world_fib4_abort(rocker);
+ break;
+ }
+ return NOTIFY_DONE;
+}
+
/********************
* ethtool interface
********************/
@@ -2740,6 +2761,9 @@
goto err_probe_ports;
}
+ rocker->fib_nb.notifier_call = rocker_router_fib_event;
+ register_fib_notifier(&rocker->fib_nb);
+
dev_info(&pdev->dev, "Rocker switch with id %*phN\n",
(int)sizeof(rocker->hw.id), &rocker->hw.id);
@@ -2771,6 +2795,7 @@
{
struct rocker *rocker = pci_get_drvdata(pdev);
+ unregister_fib_notifier(&rocker->fib_nb);
rocker_write32(rocker, CONTROL, ROCKER_CONTROL_RESET);
rocker_remove_ports(rocker);
free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
@@ -2799,6 +2824,37 @@
return dev->netdev_ops == &rocker_port_netdev_ops;
}
+static bool rocker_port_dev_check_under(const struct net_device *dev,
+ struct rocker *rocker)
+{
+ struct rocker_port *rocker_port;
+
+ if (!rocker_port_dev_check(dev))
+ return false;
+
+ rocker_port = netdev_priv(dev);
+ if (rocker_port->rocker != rocker)
+ return false;
+
+ return true;
+}
+
+struct rocker_port *rocker_port_dev_lower_find(struct net_device *dev,
+ struct rocker *rocker)
+{
+ struct net_device *lower_dev;
+ struct list_head *iter;
+
+ if (rocker_port_dev_check_under(dev, rocker))
+ return netdev_priv(dev);
+
+ netdev_for_each_all_lower_dev(dev, lower_dev, iter) {
+ if (rocker_port_dev_check_under(lower_dev, rocker))
+ return netdev_priv(lower_dev);
+ }
+ return NULL;
+}
+
static int rocker_netdevice_event(struct notifier_block *unused,
unsigned long event, void *ptr)
{
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index fcad907..431a608 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -99,6 +99,7 @@
struct ofdpa_flow_tbl_key key;
size_t key_len;
u32 key_crc32; /* key */
+ struct fib_info *fi;
};
struct ofdpa_group_tbl_entry {
@@ -189,6 +190,7 @@
spinlock_t neigh_tbl_lock; /* for neigh tbl accesses */
u32 neigh_tbl_next_index;
unsigned long ageing_time;
+ bool fib_aborted;
};
struct ofdpa_port {
@@ -1043,7 +1045,8 @@
__be16 eth_type, __be32 dst,
__be32 dst_mask, u32 priority,
enum rocker_of_dpa_table_id goto_tbl,
- u32 group_id, int flags)
+ u32 group_id, struct fib_info *fi,
+ int flags)
{
struct ofdpa_flow_tbl_entry *entry;
@@ -1060,6 +1063,7 @@
entry->key.ucast_routing.group_id = group_id;
entry->key_len = offsetof(struct ofdpa_flow_tbl_key,
ucast_routing.group_id);
+ entry->fi = fi;
return ofdpa_flow_tbl_do(ofdpa_port, trans, flags, entry);
}
@@ -1425,7 +1429,7 @@
eth_type, ip_addr,
inet_make_mask(32),
priority, goto_tbl,
- group_id, flags);
+ group_id, NULL, flags);
if (err)
netdev_err(ofdpa_port->dev, "Error (%d) /32 unicast route %pI4 group 0x%08x\n",
@@ -2390,7 +2394,7 @@
static int ofdpa_port_fib_ipv4(struct ofdpa_port *ofdpa_port,
struct switchdev_trans *trans, __be32 dst,
- int dst_len, const struct fib_info *fi,
+ int dst_len, struct fib_info *fi,
u32 tb_id, int flags)
{
const struct fib_nh *nh;
@@ -2426,7 +2430,7 @@
err = ofdpa_flow_tbl_ucast4_routing(ofdpa_port, trans, eth_type, dst,
dst_mask, priority, goto_tbl,
- group_id, flags);
+ group_id, fi, flags);
if (err)
netdev_err(ofdpa_port->dev, "Error (%d) IPv4 route %pI4\n",
err, &dst);
@@ -2718,28 +2722,6 @@
return err;
}
-static int ofdpa_port_obj_fib4_add(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4,
- struct switchdev_trans *trans)
-{
- struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
-
- return ofdpa_port_fib_ipv4(ofdpa_port, trans,
- htonl(fib4->dst), fib4->dst_len,
- fib4->fi, fib4->tb_id, 0);
-}
-
-static int ofdpa_port_obj_fib4_del(struct rocker_port *rocker_port,
- const struct switchdev_obj_ipv4_fib *fib4)
-{
- struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
-
- return ofdpa_port_fib_ipv4(ofdpa_port, NULL,
- htonl(fib4->dst), fib4->dst_len,
- fib4->fi, fib4->tb_id,
- OFDPA_OP_FLAG_REMOVE);
-}
-
static int ofdpa_port_obj_fdb_add(struct rocker_port *rocker_port,
const struct switchdev_obj_port_fdb *fdb,
struct switchdev_trans *trans)
@@ -2922,6 +2904,82 @@
return ofdpa_port_fdb(ofdpa_port, NULL, addr, vlan_id, flags);
}
+static struct ofdpa_port *ofdpa_port_dev_lower_find(struct net_device *dev,
+ struct rocker *rocker)
+{
+ struct rocker_port *rocker_port;
+
+ rocker_port = rocker_port_dev_lower_find(dev, rocker);
+ return rocker_port ? rocker_port->wpriv : NULL;
+}
+
+static int ofdpa_fib4_add(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info)
+{
+ struct ofdpa *ofdpa = rocker->wpriv;
+ struct ofdpa_port *ofdpa_port;
+ int err;
+
+ if (ofdpa->fib_aborted)
+ return 0;
+ ofdpa_port = ofdpa_port_dev_lower_find(fen_info->fi->fib_dev, rocker);
+ if (!ofdpa_port)
+ return 0;
+ err = ofdpa_port_fib_ipv4(ofdpa_port, NULL, htonl(fen_info->dst),
+ fen_info->dst_len, fen_info->fi,
+ fen_info->tb_id, 0);
+ if (err)
+ return err;
+ fib_info_offload_inc(fen_info->fi);
+ return 0;
+}
+
+static int ofdpa_fib4_del(struct rocker *rocker,
+ const struct fib_entry_notifier_info *fen_info)
+{
+ struct ofdpa *ofdpa = rocker->wpriv;
+ struct ofdpa_port *ofdpa_port;
+
+ if (ofdpa->fib_aborted)
+ return 0;
+ ofdpa_port = ofdpa_port_dev_lower_find(fen_info->fi->fib_dev, rocker);
+ if (!ofdpa_port)
+ return 0;
+ fib_info_offload_dec(fen_info->fi);
+ return ofdpa_port_fib_ipv4(ofdpa_port, NULL, htonl(fen_info->dst),
+ fen_info->dst_len, fen_info->fi,
+ fen_info->tb_id, OFDPA_OP_FLAG_REMOVE);
+}
+
+static void ofdpa_fib4_abort(struct rocker *rocker)
+{
+ struct ofdpa *ofdpa = rocker->wpriv;
+ struct ofdpa_port *ofdpa_port;
+ struct ofdpa_flow_tbl_entry *flow_entry;
+ struct hlist_node *tmp;
+ unsigned long flags;
+ int bkt;
+
+ if (ofdpa->fib_aborted)
+ return;
+
+ spin_lock_irqsave(&ofdpa->flow_tbl_lock, flags);
+ hash_for_each_safe(ofdpa->flow_tbl, bkt, tmp, flow_entry, entry) {
+ if (flow_entry->key.tbl_id !=
+ ROCKER_OF_DPA_TABLE_ID_UNICAST_ROUTING)
+ continue;
+ ofdpa_port = ofdpa_port_dev_lower_find(flow_entry->fi->fib_dev,
+ rocker);
+ if (!ofdpa_port)
+ continue;
+ fib_info_offload_dec(flow_entry->fi);
+ ofdpa_flow_tbl_del(ofdpa_port, NULL, OFDPA_OP_FLAG_REMOVE,
+ flow_entry);
+ }
+ spin_unlock_irqrestore(&ofdpa->flow_tbl_lock, flags);
+ ofdpa->fib_aborted = true;
+}
+
struct rocker_world_ops rocker_ofdpa_ops = {
.kind = "ofdpa",
.priv_size = sizeof(struct ofdpa),
@@ -2941,8 +2999,6 @@
.port_obj_vlan_add = ofdpa_port_obj_vlan_add,
.port_obj_vlan_del = ofdpa_port_obj_vlan_del,
.port_obj_vlan_dump = ofdpa_port_obj_vlan_dump,
- .port_obj_fib4_add = ofdpa_port_obj_fib4_add,
- .port_obj_fib4_del = ofdpa_port_obj_fib4_del,
.port_obj_fdb_add = ofdpa_port_obj_fdb_add,
.port_obj_fdb_del = ofdpa_port_obj_fdb_del,
.port_obj_fdb_dump = ofdpa_port_obj_fdb_dump,
@@ -2951,4 +3007,7 @@
.port_neigh_update = ofdpa_port_neigh_update,
.port_neigh_destroy = ofdpa_port_neigh_destroy,
.port_ev_mac_vlan_seen = ofdpa_port_ev_mac_vlan_seen,
+ .fib4_add = ofdpa_fib4_add,
+ .fib4_del = ofdpa_fib4_del,
+ .fib4_abort = ofdpa_fib4_abort,
};
diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 9fbc12a..2415209 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -1156,7 +1156,8 @@
* acquired locks in the wrong order.
*/
list_for_each_entry_safe(async, next, &mcdi->async_list, list) {
- async->complete(efx, async->cookie, -ENETDOWN, NULL, 0);
+ if (async->complete)
+ async->complete(efx, async->cookie, -ENETDOWN, NULL, 0);
list_del(&async->list);
kfree(async);
}
diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
index dd204d9..77a5364 100644
--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -1269,13 +1269,13 @@
if (IS_ERR(ptp->phc_clock)) {
rc = PTR_ERR(ptp->phc_clock);
goto fail3;
- }
-
- INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
- ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
- if (!ptp->pps_workwq) {
- rc = -ENOMEM;
- goto fail4;
+ } else if (ptp->phc_clock) {
+ INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
+ ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
+ if (!ptp->pps_workwq) {
+ rc = -ENOMEM;
+ goto fail4;
+ }
}
}
ptp->nic_ts_enabled = false;
diff --git a/drivers/net/ethernet/sfc/sriov.c b/drivers/net/ethernet/sfc/sriov.c
index 816c446..9abcf4a 100644
--- a/drivers/net/ethernet/sfc/sriov.c
+++ b/drivers/net/ethernet/sfc/sriov.c
@@ -22,7 +22,7 @@
}
int efx_sriov_set_vf_vlan(struct net_device *net_dev, int vf_i, u16 vlan,
- u8 qos)
+ u8 qos, __be16 vlan_proto)
{
struct efx_nic *efx = netdev_priv(net_dev);
@@ -31,6 +31,9 @@
(qos & ~(VLAN_PRIO_MASK >> VLAN_PRIO_SHIFT)))
return -EINVAL;
+ if (vlan_proto != htons(ETH_P_8021Q))
+ return -EPROTONOSUPPORT;
+
return efx->type->sriov_set_vf_vlan(efx, vf_i, vlan, qos);
} else {
return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/sfc/sriov.h b/drivers/net/ethernet/sfc/sriov.h
index 400df52..ba1762e 100644
--- a/drivers/net/ethernet/sfc/sriov.h
+++ b/drivers/net/ethernet/sfc/sriov.h
@@ -16,7 +16,7 @@
int efx_sriov_set_vf_mac(struct net_device *net_dev, int vf_i, u8 *mac);
int efx_sriov_set_vf_vlan(struct net_device *net_dev, int vf_i, u16 vlan,
- u8 qos);
+ u8 qos, __be16 vlan_proto);
int efx_sriov_set_vf_spoofchk(struct net_device *net_dev, int vf_i,
bool spoofchk);
int efx_sriov_get_vf_config(struct net_device *net_dev, int vf_i,
diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c
index 503a3b6..7321259 100644
--- a/drivers/net/ethernet/smsc/smc91x.c
+++ b/drivers/net/ethernet/smsc/smc91x.c
@@ -2323,6 +2323,9 @@
} else {
lp->cfg.flags |= SMC91X_USE_16BIT;
}
+ if (!device_property_read_u32(&pdev->dev, "reg-shift",
+ &val))
+ lp->io_shift = val;
}
#endif
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index 6f6bbc5..7df4ff1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -261,7 +261,7 @@
}
if (mode & WAKE_UCAST) {
pr_debug("GMAC: WOL on global unicast\n");
- pmt |= global_unicast;
+ pmt |= power_down | global_unicast | wake_up_frame_en;
}
writel(pmt, ioaddr + GMAC_PMT);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index df5580d..51019b7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -102,7 +102,7 @@
}
if (mode & WAKE_UCAST) {
pr_debug("GMAC: WOL on global unicast\n");
- pmt |= global_unicast;
+ pmt |= power_down | global_unicast | wake_up_frame_en;
}
writel(pmt, ioaddr + GMAC_PMT);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
index 170a18b..6e3b829 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
@@ -187,7 +187,7 @@
if (IS_ERR(priv->ptp_clock)) {
priv->ptp_clock = NULL;
pr_err("ptp_clock_register() failed on %s\n", priv->dev->name);
- } else
+ } else if (priv->ptp_clock)
pr_debug("Added PTP HW clock successfully on %s\n",
priv->dev->name);
diff --git a/drivers/net/hamradio/6pack.c b/drivers/net/hamradio/6pack.c
index 5a1e985..470b3dc 100644
--- a/drivers/net/hamradio/6pack.c
+++ b/drivers/net/hamradio/6pack.c
@@ -127,7 +127,7 @@
#define AX25_6PACK_HEADER_LEN 0
-static void sixpack_decode(struct sixpack *, unsigned char[], int);
+static void sixpack_decode(struct sixpack *, const unsigned char[], int);
static int encode_sixpack(unsigned char *, unsigned char *, int, unsigned char);
/*
@@ -428,7 +428,7 @@
/*
* Handle the 'receiver data ready' interrupt.
- * This function is called by the 'tty_io' module in the kernel when
+ * This function is called by the tty module in the kernel when
* a block of 6pack data has been received, which can now be decapsulated
* and sent on to some IP layer for further processing.
*/
@@ -436,7 +436,6 @@
const unsigned char *cp, char *fp, int count)
{
struct sixpack *sp;
- unsigned char buf[512];
int count1;
if (!count)
@@ -446,10 +445,7 @@
if (!sp)
return;
- memcpy(buf, cp, count < sizeof(buf) ? count : sizeof(buf));
-
/* Read the characters out of the buffer */
-
count1 = count;
while (count) {
count--;
@@ -459,7 +455,7 @@
continue;
}
}
- sixpack_decode(sp, buf, count1);
+ sixpack_decode(sp, cp, count1);
sp_put(sp);
tty_unthrottle(tty);
@@ -992,7 +988,7 @@
/* decode a 6pack packet */
static void
-sixpack_decode(struct sixpack *sp, unsigned char *pre_rbuff, int count)
+sixpack_decode(struct sixpack *sp, const unsigned char *pre_rbuff, int count)
{
unsigned char inbyte;
int count1;
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 284b97b..f4fbcb5 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -433,7 +433,7 @@
*/
struct nvsp_1_message_send_rndis_packet {
/*
- * This field is specified by RNIDS. They assume there's two different
+ * This field is specified by RNDIS. They assume there's two different
* channels of communication. However, the Network VSP only has one.
* Therefore, the channel travels with the RNDIS packet.
*/
@@ -578,7 +578,7 @@
/* The number of entries in the send indirection table */
u32 count;
- /* The offset of the send indireciton table from top of this struct.
+ /* The offset of the send indirection table from top of this struct.
* The send indirection table tells which channel to put the send
* traffic on. Each entry is a channel number.
*/
@@ -649,6 +649,8 @@
struct netvsc_stats {
u64 packets;
u64 bytes;
+ u64 broadcast;
+ u64 multicast;
struct u64_stats_sync syncp;
};
@@ -695,9 +697,8 @@
bool start_remove;
/* State to manage the associated VF interface. */
- struct net_device *vf_netdev;
- bool vf_inject;
- atomic_t vf_use_cnt;
+ struct net_device __rcu *vf_netdev;
+
/* 1: allocated, serial number is valid. 0: not allocated */
u32 vf_alloc;
/* Serial number of the VF to team with */
@@ -733,7 +734,6 @@
struct nvsp_message channel_init_pkt;
struct nvsp_message revoke_packet;
- /* unsigned char HwMacAddr[HW_MACADDR_LEN]; */
struct vmbus_channel *chn_table[VRSS_CHANNEL_MAX];
u32 send_table[VRSS_SEND_TAB_SIZE];
@@ -1238,7 +1238,7 @@
u32 ndis_msg_type;
/* Total length of this message, from the beginning */
- /* of the sruct rndis_message, in bytes. */
+ /* of the struct rndis_message, in bytes. */
u32 msg_len;
/* Actual message */
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index ff05b9b..720b5fa 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -635,7 +635,7 @@
q_idx = nvsc_packet->q_idx;
channel = incoming_channel;
- dev_kfree_skb_any(skb);
+ dev_consume_skb_any(skb);
}
num_outstanding_sends =
@@ -944,7 +944,7 @@
}
if (msdp->skb)
- dev_kfree_skb_any(msdp->skb);
+ dev_consume_skb_any(msdp->skb);
if (xmit_more && !packet->cp_partial) {
msdp->skb = skb;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 2360e70..52eeb2f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -667,51 +667,23 @@
{
struct net_device *net = hv_get_drvdata(device_obj);
struct net_device_context *net_device_ctx = netdev_priv(net);
+ struct net_device *vf_netdev;
struct sk_buff *skb;
- struct sk_buff *vf_skb;
struct netvsc_stats *rx_stats;
- u32 bytes_recvd = packet->total_data_buflen;
- int ret = 0;
- if (!net || net->reg_state != NETREG_REGISTERED)
+ if (net->reg_state != NETREG_REGISTERED)
return NVSP_STAT_FAIL;
- if (READ_ONCE(net_device_ctx->vf_inject)) {
- atomic_inc(&net_device_ctx->vf_use_cnt);
- if (!READ_ONCE(net_device_ctx->vf_inject)) {
- /*
- * We raced; just move on.
- */
- atomic_dec(&net_device_ctx->vf_use_cnt);
- goto vf_injection_done;
- }
-
- /*
- * Inject this packet into the VF inerface.
- * On Hyper-V, multicast and brodcast packets
- * are only delivered on the synthetic interface
- * (after subjecting these to policy filters on
- * the host). Deliver these via the VF interface
- * in the guest.
- */
- vf_skb = netvsc_alloc_recv_skb(net_device_ctx->vf_netdev,
- packet, csum_info, *data,
- vlan_tci);
- if (vf_skb != NULL) {
- ++net_device_ctx->vf_netdev->stats.rx_packets;
- net_device_ctx->vf_netdev->stats.rx_bytes +=
- bytes_recvd;
- netif_receive_skb(vf_skb);
- } else {
- ++net->stats.rx_dropped;
- ret = NVSP_STAT_FAIL;
- }
- atomic_dec(&net_device_ctx->vf_use_cnt);
- return ret;
- }
-
-vf_injection_done:
- rx_stats = this_cpu_ptr(net_device_ctx->rx_stats);
+ /*
+ * If necessary, inject this packet into the VF interface.
+ * On Hyper-V, multicast and brodcast packets are only delivered
+ * to the synthetic interface (after subjecting these to
+ * policy filters on the host). Deliver these via the VF
+ * interface in the guest.
+ */
+ vf_netdev = rcu_dereference(net_device_ctx->vf_netdev);
+ if (vf_netdev && (vf_netdev->flags & IFF_UP))
+ net = vf_netdev;
/* Allocate a skb - TODO direct I/O to pages? */
skb = netvsc_alloc_recv_skb(net, packet, csum_info, *data, vlan_tci);
@@ -719,12 +691,25 @@
++net->stats.rx_dropped;
return NVSP_STAT_FAIL;
}
- skb_record_rx_queue(skb, channel->
- offermsg.offer.sub_channel_index);
+ if (net != vf_netdev)
+ skb_record_rx_queue(skb,
+ channel->offermsg.offer.sub_channel_index);
+
+ /*
+ * Even if injecting the packet, record the statistics
+ * on the synthetic device because modifying the VF device
+ * statistics will not work correctly.
+ */
+ rx_stats = this_cpu_ptr(net_device_ctx->rx_stats);
u64_stats_update_begin(&rx_stats->syncp);
rx_stats->packets++;
rx_stats->bytes += packet->total_data_buflen;
+
+ if (skb->pkt_type == PACKET_BROADCAST)
+ ++rx_stats->broadcast;
+ else if (skb->pkt_type == PACKET_MULTICAST)
+ ++rx_stats->multicast;
u64_stats_update_end(&rx_stats->syncp);
/*
@@ -967,7 +952,7 @@
cpu);
struct netvsc_stats *rx_stats = per_cpu_ptr(ndev_ctx->rx_stats,
cpu);
- u64 tx_packets, tx_bytes, rx_packets, rx_bytes;
+ u64 tx_packets, tx_bytes, rx_packets, rx_bytes, rx_multicast;
unsigned int start;
do {
@@ -980,12 +965,14 @@
start = u64_stats_fetch_begin_irq(&rx_stats->syncp);
rx_packets = rx_stats->packets;
rx_bytes = rx_stats->bytes;
+ rx_multicast = rx_stats->multicast + rx_stats->broadcast;
} while (u64_stats_fetch_retry_irq(&rx_stats->syncp, start));
t->tx_bytes += tx_bytes;
t->tx_packets += tx_packets;
t->rx_bytes += rx_bytes;
t->rx_packets += rx_packets;
+ t->multicast += rx_multicast;
}
t->tx_dropped = net->stats.tx_dropped;
@@ -1215,22 +1202,44 @@
free_netdev(netdev);
}
-static struct net_device *get_netvsc_net_device(char *mac)
+static struct net_device *get_netvsc_bymac(const u8 *mac)
{
- struct net_device *dev, *found = NULL;
+ struct net_device *dev;
ASSERT_RTNL();
for_each_netdev(&init_net, dev) {
- if (memcmp(dev->dev_addr, mac, ETH_ALEN) == 0) {
- if (dev->netdev_ops != &device_ops)
- continue;
- found = dev;
- break;
- }
+ if (dev->netdev_ops != &device_ops)
+ continue; /* not a netvsc device */
+
+ if (ether_addr_equal(mac, dev->perm_addr))
+ return dev;
}
- return found;
+ return NULL;
+}
+
+static struct net_device *get_netvsc_byref(struct net_device *vf_netdev)
+{
+ struct net_device *dev;
+
+ ASSERT_RTNL();
+
+ for_each_netdev(&init_net, dev) {
+ struct net_device_context *net_device_ctx;
+
+ if (dev->netdev_ops != &device_ops)
+ continue; /* not a netvsc device */
+
+ net_device_ctx = netdev_priv(dev);
+ if (net_device_ctx->nvdev == NULL)
+ continue; /* device is removed */
+
+ if (rtnl_dereference(net_device_ctx->vf_netdev) == vf_netdev)
+ return dev; /* a match */
+ }
+
+ return NULL;
}
static int netvsc_register_vf(struct net_device *vf_netdev)
@@ -1238,9 +1247,8 @@
struct net_device *ndev;
struct net_device_context *net_device_ctx;
struct netvsc_device *netvsc_dev;
- const struct ethtool_ops *eth_ops = vf_netdev->ethtool_ops;
- if (eth_ops == NULL || eth_ops == ðtool_ops)
+ if (vf_netdev->addr_len != ETH_ALEN)
return NOTIFY_DONE;
/*
@@ -1248,13 +1256,13 @@
* associate with the VF interface. If we don't find a matching
* synthetic interface, move on.
*/
- ndev = get_netvsc_net_device(vf_netdev->dev_addr);
+ ndev = get_netvsc_bymac(vf_netdev->perm_addr);
if (!ndev)
return NOTIFY_DONE;
net_device_ctx = netdev_priv(ndev);
netvsc_dev = net_device_ctx->nvdev;
- if (!netvsc_dev || net_device_ctx->vf_netdev)
+ if (!netvsc_dev || rtnl_dereference(net_device_ctx->vf_netdev))
return NOTIFY_DONE;
netdev_info(ndev, "VF registering: %s\n", vf_netdev->name);
@@ -1262,46 +1270,26 @@
* Take a reference on the module.
*/
try_module_get(THIS_MODULE);
- net_device_ctx->vf_netdev = vf_netdev;
+
+ dev_hold(vf_netdev);
+ rcu_assign_pointer(net_device_ctx->vf_netdev, vf_netdev);
return NOTIFY_OK;
}
-static void netvsc_inject_enable(struct net_device_context *net_device_ctx)
-{
- net_device_ctx->vf_inject = true;
-}
-
-static void netvsc_inject_disable(struct net_device_context *net_device_ctx)
-{
- net_device_ctx->vf_inject = false;
-
- /* Wait for currently active users to drain out. */
- while (atomic_read(&net_device_ctx->vf_use_cnt) != 0)
- udelay(50);
-}
-
static int netvsc_vf_up(struct net_device *vf_netdev)
{
struct net_device *ndev;
struct netvsc_device *netvsc_dev;
- const struct ethtool_ops *eth_ops = vf_netdev->ethtool_ops;
struct net_device_context *net_device_ctx;
- if (eth_ops == ðtool_ops)
- return NOTIFY_DONE;
-
- ndev = get_netvsc_net_device(vf_netdev->dev_addr);
+ ndev = get_netvsc_byref(vf_netdev);
if (!ndev)
return NOTIFY_DONE;
net_device_ctx = netdev_priv(ndev);
netvsc_dev = net_device_ctx->nvdev;
- if (!netvsc_dev || !net_device_ctx->vf_netdev)
- return NOTIFY_DONE;
-
netdev_info(ndev, "VF up: %s\n", vf_netdev->name);
- netvsc_inject_enable(net_device_ctx);
/*
* Open the device before switching data path.
@@ -1327,23 +1315,15 @@
struct net_device *ndev;
struct netvsc_device *netvsc_dev;
struct net_device_context *net_device_ctx;
- const struct ethtool_ops *eth_ops = vf_netdev->ethtool_ops;
- if (eth_ops == ðtool_ops)
- return NOTIFY_DONE;
-
- ndev = get_netvsc_net_device(vf_netdev->dev_addr);
+ ndev = get_netvsc_byref(vf_netdev);
if (!ndev)
return NOTIFY_DONE;
net_device_ctx = netdev_priv(ndev);
netvsc_dev = net_device_ctx->nvdev;
- if (!netvsc_dev || !net_device_ctx->vf_netdev)
- return NOTIFY_DONE;
-
netdev_info(ndev, "VF down: %s\n", vf_netdev->name);
- netvsc_inject_disable(net_device_ctx);
netvsc_switch_datapath(ndev, false);
netdev_info(ndev, "Data path switched from VF: %s\n", vf_netdev->name);
rndis_filter_close(netvsc_dev);
@@ -1359,23 +1339,19 @@
{
struct net_device *ndev;
struct netvsc_device *netvsc_dev;
- const struct ethtool_ops *eth_ops = vf_netdev->ethtool_ops;
struct net_device_context *net_device_ctx;
- if (eth_ops == ðtool_ops)
- return NOTIFY_DONE;
-
- ndev = get_netvsc_net_device(vf_netdev->dev_addr);
+ ndev = get_netvsc_byref(vf_netdev);
if (!ndev)
return NOTIFY_DONE;
net_device_ctx = netdev_priv(ndev);
netvsc_dev = net_device_ctx->nvdev;
- if (!netvsc_dev || !net_device_ctx->vf_netdev)
- return NOTIFY_DONE;
+
netdev_info(ndev, "VF unregistering: %s\n", vf_netdev->name);
- netvsc_inject_disable(net_device_ctx);
- net_device_ctx->vf_netdev = NULL;
+
+ RCU_INIT_POINTER(net_device_ctx->vf_netdev, NULL);
+ dev_put(vf_netdev);
module_put(THIS_MODULE);
return NOTIFY_OK;
}
@@ -1427,10 +1403,6 @@
spin_lock_init(&net_device_ctx->lock);
INIT_LIST_HEAD(&net_device_ctx->reconfig_events);
- atomic_set(&net_device_ctx->vf_use_cnt, 0);
- net_device_ctx->vf_netdev = NULL;
- net_device_ctx->vf_inject = false;
-
net->netdev_ops = &device_ops;
net->hw_features = NETVSC_HW_FEATURES;
@@ -1539,13 +1511,21 @@
{
struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
+ /* Skip our own events */
+ if (event_dev->netdev_ops == &device_ops)
+ return NOTIFY_DONE;
+
+ /* Avoid non-Ethernet type devices */
+ if (event_dev->type != ARPHRD_ETHER)
+ return NOTIFY_DONE;
+
/* Avoid Vlan dev with same MAC registering as VF */
if (event_dev->priv_flags & IFF_802_1Q_VLAN)
return NOTIFY_DONE;
/* Avoid Bonding master dev with same MAC registering as VF */
- if (event_dev->priv_flags & IFF_BONDING &&
- event_dev->flags & IFF_MASTER)
+ if ((event_dev->priv_flags & IFF_BONDING) &&
+ (event_dev->flags & IFF_MASTER))
return NOTIFY_DONE;
switch (event) {
diff --git a/drivers/net/ieee802154/fakelb.c b/drivers/net/ieee802154/fakelb.c
index 0becf0a..ec387ef 100644
--- a/drivers/net/ieee802154/fakelb.c
+++ b/drivers/net/ieee802154/fakelb.c
@@ -30,7 +30,7 @@
static int numlbs = 2;
static LIST_HEAD(fakelb_phys);
-static DEFINE_SPINLOCK(fakelb_phys_lock);
+static DEFINE_MUTEX(fakelb_phys_lock);
static LIST_HEAD(fakelb_ifup_phys);
static DEFINE_RWLOCK(fakelb_ifup_phys_lock);
@@ -188,9 +188,9 @@
if (err)
goto err_reg;
- spin_lock(&fakelb_phys_lock);
+ mutex_lock(&fakelb_phys_lock);
list_add_tail(&phy->list, &fakelb_phys);
- spin_unlock(&fakelb_phys_lock);
+ mutex_unlock(&fakelb_phys_lock);
return 0;
@@ -222,10 +222,10 @@
return 0;
err_slave:
- spin_lock(&fakelb_phys_lock);
+ mutex_lock(&fakelb_phys_lock);
list_for_each_entry_safe(phy, tmp, &fakelb_phys, list)
fakelb_del(phy);
- spin_unlock(&fakelb_phys_lock);
+ mutex_unlock(&fakelb_phys_lock);
return err;
}
@@ -233,10 +233,10 @@
{
struct fakelb_phy *phy, *tmp;
- spin_lock(&fakelb_phys_lock);
+ mutex_lock(&fakelb_phys_lock);
list_for_each_entry_safe(phy, tmp, &fakelb_phys, list)
fakelb_del(phy);
- spin_unlock(&fakelb_phys_lock);
+ mutex_unlock(&fakelb_phys_lock);
return 0;
}
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 695a5dc..7e0732f 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -23,11 +23,13 @@
#include <linux/if_vlan.h>
#include <linux/ip.h>
#include <linux/inetdevice.h>
+#include <linux/netfilter.h>
#include <net/ip.h>
#include <net/ip6_route.h>
#include <net/rtnetlink.h>
#include <net/route.h>
#include <net/addrconf.h>
+#include <net/l3mdev.h>
#define IPVLAN_DRV "ipvlan"
#define IPV_DRV_VER "0.1"
@@ -124,4 +126,8 @@
const void *iaddr, bool is_v6);
bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6);
void ipvlan_ht_addr_del(struct ipvl_addr *addr);
+struct sk_buff *ipvlan_l3_rcv(struct net_device *dev, struct sk_buff *skb,
+ u16 proto);
+unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
+ const struct nf_hook_state *state);
#endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index b5f9511..b4e9907 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -560,6 +560,7 @@
case IPVLAN_MODE_L2:
return ipvlan_xmit_mode_l2(skb, dev);
case IPVLAN_MODE_L3:
+ case IPVLAN_MODE_L3S:
return ipvlan_xmit_mode_l3(skb, dev);
}
@@ -664,6 +665,8 @@
return ipvlan_handle_mode_l2(pskb, port);
case IPVLAN_MODE_L3:
return ipvlan_handle_mode_l3(pskb, port);
+ case IPVLAN_MODE_L3S:
+ return RX_HANDLER_PASS;
}
/* Should not reach here */
@@ -672,3 +675,94 @@
kfree_skb(skb);
return RX_HANDLER_CONSUMED;
}
+
+static struct ipvl_addr *ipvlan_skb_to_addr(struct sk_buff *skb,
+ struct net_device *dev)
+{
+ struct ipvl_addr *addr = NULL;
+ struct ipvl_port *port;
+ void *lyr3h;
+ int addr_type;
+
+ if (!dev || !netif_is_ipvlan_port(dev))
+ goto out;
+
+ port = ipvlan_port_get_rcu(dev);
+ if (!port || port->mode != IPVLAN_MODE_L3S)
+ goto out;
+
+ lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+ if (!lyr3h)
+ goto out;
+
+ addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
+out:
+ return addr;
+}
+
+struct sk_buff *ipvlan_l3_rcv(struct net_device *dev, struct sk_buff *skb,
+ u16 proto)
+{
+ struct ipvl_addr *addr;
+ struct net_device *sdev;
+
+ addr = ipvlan_skb_to_addr(skb, dev);
+ if (!addr)
+ goto out;
+
+ sdev = addr->master->dev;
+ switch (proto) {
+ case AF_INET:
+ {
+ int err;
+ struct iphdr *ip4h = ip_hdr(skb);
+
+ err = ip_route_input_noref(skb, ip4h->daddr, ip4h->saddr,
+ ip4h->tos, sdev);
+ if (unlikely(err))
+ goto out;
+ break;
+ }
+ case AF_INET6:
+ {
+ struct dst_entry *dst;
+ struct ipv6hdr *ip6h = ipv6_hdr(skb);
+ int flags = RT6_LOOKUP_F_HAS_SADDR;
+ struct flowi6 fl6 = {
+ .flowi6_iif = sdev->ifindex,
+ .daddr = ip6h->daddr,
+ .saddr = ip6h->saddr,
+ .flowlabel = ip6_flowinfo(ip6h),
+ .flowi6_mark = skb->mark,
+ .flowi6_proto = ip6h->nexthdr,
+ };
+
+ skb_dst_drop(skb);
+ dst = ip6_route_input_lookup(dev_net(sdev), sdev, &fl6, flags);
+ skb_dst_set(skb, dst);
+ break;
+ }
+ default:
+ break;
+ }
+
+out:
+ return skb;
+}
+
+unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ struct ipvl_addr *addr;
+ unsigned int len;
+
+ addr = ipvlan_skb_to_addr(skb, skb->dev);
+ if (!addr)
+ goto out;
+
+ skb->dev = addr->master->dev;
+ len = skb->len + ETH_HLEN;
+ ipvlan_count_rx(addr->master, len, true, false);
+out:
+ return NF_ACCEPT;
+}
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 18b4e8c..f442eb3 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -9,24 +9,87 @@
#include "ipvlan.h"
+static u32 ipvl_nf_hook_refcnt = 0;
+
+static struct nf_hook_ops ipvl_nfops[] __read_mostly = {
+ {
+ .hook = ipvlan_nf_input,
+ .pf = NFPROTO_IPV4,
+ .hooknum = NF_INET_LOCAL_IN,
+ .priority = INT_MAX,
+ },
+ {
+ .hook = ipvlan_nf_input,
+ .pf = NFPROTO_IPV6,
+ .hooknum = NF_INET_LOCAL_IN,
+ .priority = INT_MAX,
+ },
+};
+
+static struct l3mdev_ops ipvl_l3mdev_ops __read_mostly = {
+ .l3mdev_l3_rcv = ipvlan_l3_rcv,
+};
+
static void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev)
{
ipvlan->dev->mtu = dev->mtu - ipvlan->mtu_adj;
}
-static void ipvlan_set_port_mode(struct ipvl_port *port, u16 nval)
+static int ipvlan_register_nf_hook(void)
+{
+ int err = 0;
+
+ if (!ipvl_nf_hook_refcnt) {
+ err = _nf_register_hooks(ipvl_nfops, ARRAY_SIZE(ipvl_nfops));
+ if (!err)
+ ipvl_nf_hook_refcnt = 1;
+ } else {
+ ipvl_nf_hook_refcnt++;
+ }
+
+ return err;
+}
+
+static void ipvlan_unregister_nf_hook(void)
+{
+ WARN_ON(!ipvl_nf_hook_refcnt);
+
+ ipvl_nf_hook_refcnt--;
+ if (!ipvl_nf_hook_refcnt)
+ _nf_unregister_hooks(ipvl_nfops, ARRAY_SIZE(ipvl_nfops));
+}
+
+static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval)
{
struct ipvl_dev *ipvlan;
+ struct net_device *mdev = port->dev;
+ int err = 0;
+ ASSERT_RTNL();
if (port->mode != nval) {
+ if (nval == IPVLAN_MODE_L3S) {
+ /* New mode is L3S */
+ err = ipvlan_register_nf_hook();
+ if (!err) {
+ mdev->l3mdev_ops = &ipvl_l3mdev_ops;
+ mdev->priv_flags |= IFF_L3MDEV_MASTER;
+ } else
+ return err;
+ } else if (port->mode == IPVLAN_MODE_L3S) {
+ /* Old mode was L3S */
+ mdev->priv_flags &= ~IFF_L3MDEV_MASTER;
+ ipvlan_unregister_nf_hook();
+ mdev->l3mdev_ops = NULL;
+ }
list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
- if (nval == IPVLAN_MODE_L3)
+ if (nval == IPVLAN_MODE_L3 || nval == IPVLAN_MODE_L3S)
ipvlan->dev->flags |= IFF_NOARP;
else
ipvlan->dev->flags &= ~IFF_NOARP;
}
port->mode = nval;
}
+ return err;
}
static int ipvlan_port_create(struct net_device *dev)
@@ -74,6 +137,11 @@
struct ipvl_port *port = ipvlan_port_get_rtnl(dev);
dev->priv_flags &= ~IFF_IPVLAN_MASTER;
+ if (port->mode == IPVLAN_MODE_L3S) {
+ dev->priv_flags &= ~IFF_L3MDEV_MASTER;
+ ipvlan_unregister_nf_hook();
+ dev->l3mdev_ops = NULL;
+ }
netdev_rx_handler_unregister(dev);
cancel_work_sync(&port->wq);
__skb_queue_purge(&port->backlog);
@@ -132,7 +200,8 @@
struct net_device *phy_dev = ipvlan->phy_dev;
struct ipvl_addr *addr;
- if (ipvlan->port->mode == IPVLAN_MODE_L3)
+ if (ipvlan->port->mode == IPVLAN_MODE_L3 ||
+ ipvlan->port->mode == IPVLAN_MODE_L3S)
dev->flags |= IFF_NOARP;
else
dev->flags &= ~IFF_NOARP;
@@ -372,13 +441,14 @@
{
struct ipvl_dev *ipvlan = netdev_priv(dev);
struct ipvl_port *port = ipvlan_port_get_rtnl(ipvlan->phy_dev);
+ int err = 0;
if (data && data[IFLA_IPVLAN_MODE]) {
u16 nmode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
- ipvlan_set_port_mode(port, nmode);
+ err = ipvlan_set_port_mode(port, nmode);
}
- return 0;
+ return err;
}
static size_t ipvlan_nl_getsize(const struct net_device *dev)
@@ -473,10 +543,13 @@
unregister_netdevice(dev);
return err;
}
+ err = ipvlan_set_port_mode(port, mode);
+ if (err) {
+ unregister_netdevice(dev);
+ return err;
+ }
list_add_tail_rcu(&ipvlan->pnode, &port->ipvlans);
- ipvlan_set_port_mode(port, mode);
-
netif_stacked_transfer_operstate(phy_dev, dev);
return 0;
}
diff --git a/drivers/net/phy/mdio-xgene.c b/drivers/net/phy/mdio-xgene.c
index 7756748..92af182 100644
--- a/drivers/net/phy/mdio-xgene.c
+++ b/drivers/net/phy/mdio-xgene.c
@@ -424,10 +424,8 @@
mdiobus_unregister(mdio_bus);
mdiobus_free(mdio_bus);
- if (dev->of_node) {
- if (IS_ERR(pdata->clk))
- clk_disable_unprepare(pdata->clk);
- }
+ if (dev->of_node)
+ clk_disable_unprepare(pdata->clk);
return 0;
}
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c
index 15f8206..7c00e50 100644
--- a/drivers/net/phy/microchip.c
+++ b/drivers/net/phy/microchip.c
@@ -55,7 +55,7 @@
return rc < 0 ? rc : 0;
}
-int lan88xx_suspend(struct phy_device *phydev)
+static int lan88xx_suspend(struct phy_device *phydev)
{
struct lan88xx_priv *priv = phydev->priv;
diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index c09cc4a..d350deb 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -23,6 +23,16 @@
RGMII_RX_CLK_DELAY_3_4_NS = 7
};
+/* Microsemi VSC85xx PHY registers */
+/* IEEE 802. Std Registers */
+#define MSCC_PHY_EXT_PHY_CNTL_1 23
+#define MAC_IF_SELECTION_MASK 0x1800
+#define MAC_IF_SELECTION_GMII 0
+#define MAC_IF_SELECTION_RMII 1
+#define MAC_IF_SELECTION_RGMII 2
+#define MAC_IF_SELECTION_POS 11
+#define FAR_END_LOOPBACK_MODE_MASK 0x0008
+
#define MII_VSC85XX_INT_MASK 25
#define MII_VSC85XX_INT_MASK_MASK 0xa000
#define MII_VSC85XX_INT_STATUS 26
@@ -48,6 +58,42 @@
return rc;
}
+static int vsc85xx_mac_if_set(struct phy_device *phydev,
+ phy_interface_t interface)
+{
+ int rc;
+ u16 reg_val;
+
+ mutex_lock(&phydev->lock);
+ reg_val = phy_read(phydev, MSCC_PHY_EXT_PHY_CNTL_1);
+ reg_val &= ~(MAC_IF_SELECTION_MASK);
+ switch (interface) {
+ case PHY_INTERFACE_MODE_RGMII:
+ reg_val |= (MAC_IF_SELECTION_RGMII << MAC_IF_SELECTION_POS);
+ break;
+ case PHY_INTERFACE_MODE_RMII:
+ reg_val |= (MAC_IF_SELECTION_RMII << MAC_IF_SELECTION_POS);
+ break;
+ case PHY_INTERFACE_MODE_MII:
+ case PHY_INTERFACE_MODE_GMII:
+ reg_val |= (MAC_IF_SELECTION_GMII << MAC_IF_SELECTION_POS);
+ break;
+ default:
+ rc = -EINVAL;
+ goto out_unlock;
+ }
+ rc = phy_write(phydev, MSCC_PHY_EXT_PHY_CNTL_1, reg_val);
+ if (rc != 0)
+ goto out_unlock;
+
+ rc = genphy_soft_reset(phydev);
+
+out_unlock:
+ mutex_unlock(&phydev->lock);
+
+ return rc;
+}
+
static int vsc85xx_default_config(struct phy_device *phydev)
{
int rc;
@@ -77,6 +123,11 @@
rc = vsc85xx_default_config(phydev);
if (rc)
return rc;
+
+ rc = vsc85xx_mac_if_set(phydev, phydev->interface);
+ if (rc)
+ return rc;
+
rc = genphy_config_init(phydev);
return rc;
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 9338f58..44d439f 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -32,7 +32,7 @@
#define NETNEXT_VERSION "08"
/* Information for net */
-#define NET_VERSION "5"
+#define NET_VERSION "6"
#define DRIVER_VERSION "v1." NETNEXT_VERSION "." NET_VERSION
#define DRIVER_AUTHOR "Realtek linux nic maintainers <nic_swsd@realtek.com>"
@@ -2551,6 +2551,77 @@
}
}
+static inline void r8152_mmd_indirect(struct r8152 *tp, u16 dev, u16 reg)
+{
+ ocp_reg_write(tp, OCP_EEE_AR, FUN_ADDR | dev);
+ ocp_reg_write(tp, OCP_EEE_DATA, reg);
+ ocp_reg_write(tp, OCP_EEE_AR, FUN_DATA | dev);
+}
+
+static u16 r8152_mmd_read(struct r8152 *tp, u16 dev, u16 reg)
+{
+ u16 data;
+
+ r8152_mmd_indirect(tp, dev, reg);
+ data = ocp_reg_read(tp, OCP_EEE_DATA);
+ ocp_reg_write(tp, OCP_EEE_AR, 0x0000);
+
+ return data;
+}
+
+static void r8152_mmd_write(struct r8152 *tp, u16 dev, u16 reg, u16 data)
+{
+ r8152_mmd_indirect(tp, dev, reg);
+ ocp_reg_write(tp, OCP_EEE_DATA, data);
+ ocp_reg_write(tp, OCP_EEE_AR, 0x0000);
+}
+
+static void r8152_eee_en(struct r8152 *tp, bool enable)
+{
+ u16 config1, config2, config3;
+ u32 ocp_data;
+
+ ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EEE_CR);
+ config1 = ocp_reg_read(tp, OCP_EEE_CONFIG1) & ~sd_rise_time_mask;
+ config2 = ocp_reg_read(tp, OCP_EEE_CONFIG2);
+ config3 = ocp_reg_read(tp, OCP_EEE_CONFIG3) & ~fast_snr_mask;
+
+ if (enable) {
+ ocp_data |= EEE_RX_EN | EEE_TX_EN;
+ config1 |= EEE_10_CAP | EEE_NWAY_EN | TX_QUIET_EN | RX_QUIET_EN;
+ config1 |= sd_rise_time(1);
+ config2 |= RG_DACQUIET_EN | RG_LDVQUIET_EN;
+ config3 |= fast_snr(42);
+ } else {
+ ocp_data &= ~(EEE_RX_EN | EEE_TX_EN);
+ config1 &= ~(EEE_10_CAP | EEE_NWAY_EN | TX_QUIET_EN |
+ RX_QUIET_EN);
+ config1 |= sd_rise_time(7);
+ config2 &= ~(RG_DACQUIET_EN | RG_LDVQUIET_EN);
+ config3 |= fast_snr(511);
+ }
+
+ ocp_write_word(tp, MCU_TYPE_PLA, PLA_EEE_CR, ocp_data);
+ ocp_reg_write(tp, OCP_EEE_CONFIG1, config1);
+ ocp_reg_write(tp, OCP_EEE_CONFIG2, config2);
+ ocp_reg_write(tp, OCP_EEE_CONFIG3, config3);
+}
+
+static void r8152b_enable_eee(struct r8152 *tp)
+{
+ r8152_eee_en(tp, true);
+ r8152_mmd_write(tp, MDIO_MMD_AN, MDIO_AN_EEE_ADV, MDIO_EEE_100TX);
+}
+
+static void r8152b_enable_fc(struct r8152 *tp)
+{
+ u16 anar;
+
+ anar = r8152_mdio_read(tp, MII_ADVERTISE);
+ anar |= ADVERTISE_PAUSE_CAP | ADVERTISE_PAUSE_ASYM;
+ r8152_mdio_write(tp, MII_ADVERTISE, anar);
+}
+
static void rtl8152_disable(struct r8152 *tp)
{
r8152_aldps_en(tp, false);
@@ -2560,13 +2631,9 @@
static void r8152b_hw_phy_cfg(struct r8152 *tp)
{
- u16 data;
-
- data = r8152_mdio_read(tp, MII_BMCR);
- if (data & BMCR_PDOWN) {
- data &= ~BMCR_PDOWN;
- r8152_mdio_write(tp, MII_BMCR, data);
- }
+ r8152b_enable_eee(tp);
+ r8152_aldps_en(tp, true);
+ r8152b_enable_fc(tp);
set_bit(PHY_RESET, &tp->flags);
}
@@ -2700,20 +2767,52 @@
ocp_write_dword(tp, MCU_TYPE_PLA, PLA_RCR, ocp_data);
}
+static void r8153_aldps_en(struct r8152 *tp, bool enable)
+{
+ u16 data;
+
+ data = ocp_reg_read(tp, OCP_POWER_CFG);
+ if (enable) {
+ data |= EN_ALDPS;
+ ocp_reg_write(tp, OCP_POWER_CFG, data);
+ } else {
+ data &= ~EN_ALDPS;
+ ocp_reg_write(tp, OCP_POWER_CFG, data);
+ msleep(20);
+ }
+}
+
+static void r8153_eee_en(struct r8152 *tp, bool enable)
+{
+ u32 ocp_data;
+ u16 config;
+
+ ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EEE_CR);
+ config = ocp_reg_read(tp, OCP_EEE_CFG);
+
+ if (enable) {
+ ocp_data |= EEE_RX_EN | EEE_TX_EN;
+ config |= EEE10_EN;
+ } else {
+ ocp_data &= ~(EEE_RX_EN | EEE_TX_EN);
+ config &= ~EEE10_EN;
+ }
+
+ ocp_write_word(tp, MCU_TYPE_PLA, PLA_EEE_CR, ocp_data);
+ ocp_reg_write(tp, OCP_EEE_CFG, config);
+}
+
static void r8153_hw_phy_cfg(struct r8152 *tp)
{
u32 ocp_data;
u16 data;
- if (tp->version == RTL_VER_03 || tp->version == RTL_VER_04 ||
- tp->version == RTL_VER_05)
- ocp_reg_write(tp, OCP_ADC_CFG, CKADSEL_L | ADC_EN | EN_EMI_L);
+ /* disable ALDPS before updating the PHY parameters */
+ r8153_aldps_en(tp, false);
- data = r8152_mdio_read(tp, MII_BMCR);
- if (data & BMCR_PDOWN) {
- data &= ~BMCR_PDOWN;
- r8152_mdio_write(tp, MII_BMCR, data);
- }
+ /* disable EEE before updating the PHY parameters */
+ r8153_eee_en(tp, false);
+ ocp_reg_write(tp, OCP_EEE_ADV, 0);
if (tp->version == RTL_VER_03) {
data = ocp_reg_read(tp, OCP_EEE_CFG);
@@ -2744,6 +2843,12 @@
sram_write(tp, SRAM_10M_AMP1, 0x00af);
sram_write(tp, SRAM_10M_AMP2, 0x0208);
+ r8153_eee_en(tp, true);
+ ocp_reg_write(tp, OCP_EEE_ADV, MDIO_EEE_1000T | MDIO_EEE_100TX);
+
+ r8153_aldps_en(tp, true);
+ r8152b_enable_fc(tp);
+
set_bit(PHY_RESET, &tp->flags);
}
@@ -2865,21 +2970,6 @@
ocp_write_dword(tp, MCU_TYPE_PLA, PLA_RCR, ocp_data);
}
-static void r8153_aldps_en(struct r8152 *tp, bool enable)
-{
- u16 data;
-
- data = ocp_reg_read(tp, OCP_POWER_CFG);
- if (enable) {
- data |= EN_ALDPS;
- ocp_reg_write(tp, OCP_POWER_CFG, data);
- } else {
- data &= ~EN_ALDPS;
- ocp_reg_write(tp, OCP_POWER_CFG, data);
- msleep(20);
- }
-}
-
static void rtl8153_disable(struct r8152 *tp)
{
r8153_aldps_en(tp, false);
@@ -3245,103 +3335,6 @@
return res;
}
-static inline void r8152_mmd_indirect(struct r8152 *tp, u16 dev, u16 reg)
-{
- ocp_reg_write(tp, OCP_EEE_AR, FUN_ADDR | dev);
- ocp_reg_write(tp, OCP_EEE_DATA, reg);
- ocp_reg_write(tp, OCP_EEE_AR, FUN_DATA | dev);
-}
-
-static u16 r8152_mmd_read(struct r8152 *tp, u16 dev, u16 reg)
-{
- u16 data;
-
- r8152_mmd_indirect(tp, dev, reg);
- data = ocp_reg_read(tp, OCP_EEE_DATA);
- ocp_reg_write(tp, OCP_EEE_AR, 0x0000);
-
- return data;
-}
-
-static void r8152_mmd_write(struct r8152 *tp, u16 dev, u16 reg, u16 data)
-{
- r8152_mmd_indirect(tp, dev, reg);
- ocp_reg_write(tp, OCP_EEE_DATA, data);
- ocp_reg_write(tp, OCP_EEE_AR, 0x0000);
-}
-
-static void r8152_eee_en(struct r8152 *tp, bool enable)
-{
- u16 config1, config2, config3;
- u32 ocp_data;
-
- ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EEE_CR);
- config1 = ocp_reg_read(tp, OCP_EEE_CONFIG1) & ~sd_rise_time_mask;
- config2 = ocp_reg_read(tp, OCP_EEE_CONFIG2);
- config3 = ocp_reg_read(tp, OCP_EEE_CONFIG3) & ~fast_snr_mask;
-
- if (enable) {
- ocp_data |= EEE_RX_EN | EEE_TX_EN;
- config1 |= EEE_10_CAP | EEE_NWAY_EN | TX_QUIET_EN | RX_QUIET_EN;
- config1 |= sd_rise_time(1);
- config2 |= RG_DACQUIET_EN | RG_LDVQUIET_EN;
- config3 |= fast_snr(42);
- } else {
- ocp_data &= ~(EEE_RX_EN | EEE_TX_EN);
- config1 &= ~(EEE_10_CAP | EEE_NWAY_EN | TX_QUIET_EN |
- RX_QUIET_EN);
- config1 |= sd_rise_time(7);
- config2 &= ~(RG_DACQUIET_EN | RG_LDVQUIET_EN);
- config3 |= fast_snr(511);
- }
-
- ocp_write_word(tp, MCU_TYPE_PLA, PLA_EEE_CR, ocp_data);
- ocp_reg_write(tp, OCP_EEE_CONFIG1, config1);
- ocp_reg_write(tp, OCP_EEE_CONFIG2, config2);
- ocp_reg_write(tp, OCP_EEE_CONFIG3, config3);
-}
-
-static void r8152b_enable_eee(struct r8152 *tp)
-{
- r8152_eee_en(tp, true);
- r8152_mmd_write(tp, MDIO_MMD_AN, MDIO_AN_EEE_ADV, MDIO_EEE_100TX);
-}
-
-static void r8153_eee_en(struct r8152 *tp, bool enable)
-{
- u32 ocp_data;
- u16 config;
-
- ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EEE_CR);
- config = ocp_reg_read(tp, OCP_EEE_CFG);
-
- if (enable) {
- ocp_data |= EEE_RX_EN | EEE_TX_EN;
- config |= EEE10_EN;
- } else {
- ocp_data &= ~(EEE_RX_EN | EEE_TX_EN);
- config &= ~EEE10_EN;
- }
-
- ocp_write_word(tp, MCU_TYPE_PLA, PLA_EEE_CR, ocp_data);
- ocp_reg_write(tp, OCP_EEE_CFG, config);
-}
-
-static void r8153_enable_eee(struct r8152 *tp)
-{
- r8153_eee_en(tp, true);
- ocp_reg_write(tp, OCP_EEE_ADV, MDIO_EEE_1000T | MDIO_EEE_100TX);
-}
-
-static void r8152b_enable_fc(struct r8152 *tp)
-{
- u16 anar;
-
- anar = r8152_mdio_read(tp, MII_ADVERTISE);
- anar |= ADVERTISE_PAUSE_CAP | ADVERTISE_PAUSE_ASYM;
- r8152_mdio_write(tp, MII_ADVERTISE, anar);
-}
-
static void rtl_tally_reset(struct r8152 *tp)
{
u32 ocp_data;
@@ -3354,10 +3347,17 @@
static void r8152b_init(struct r8152 *tp)
{
u32 ocp_data;
+ u16 data;
if (test_bit(RTL8152_UNPLUG, &tp->flags))
return;
+ data = r8152_mdio_read(tp, MII_BMCR);
+ if (data & BMCR_PDOWN) {
+ data &= ~BMCR_PDOWN;
+ r8152_mdio_write(tp, MII_BMCR, data);
+ }
+
r8152_aldps_en(tp, false);
if (tp->version == RTL_VER_01) {
@@ -3379,9 +3379,6 @@
SPDWN_RXDV_MSK | SPDWN_LINKCHG_MSK;
ocp_write_word(tp, MCU_TYPE_PLA, PLA_GPHY_INTR_IMR, ocp_data);
- r8152b_enable_eee(tp);
- r8152_aldps_en(tp, true);
- r8152b_enable_fc(tp);
rtl_tally_reset(tp);
/* enable rx aggregation */
@@ -3393,12 +3390,12 @@
static void r8153_init(struct r8152 *tp)
{
u32 ocp_data;
+ u16 data;
int i;
if (test_bit(RTL8152_UNPLUG, &tp->flags))
return;
- r8153_aldps_en(tp, false);
r8153_u1u2en(tp, false);
for (i = 0; i < 500; i++) {
@@ -3415,6 +3412,23 @@
msleep(20);
}
+ if (tp->version == RTL_VER_03 || tp->version == RTL_VER_04 ||
+ tp->version == RTL_VER_05)
+ ocp_reg_write(tp, OCP_ADC_CFG, CKADSEL_L | ADC_EN | EN_EMI_L);
+
+ data = r8152_mdio_read(tp, MII_BMCR);
+ if (data & BMCR_PDOWN) {
+ data &= ~BMCR_PDOWN;
+ r8152_mdio_write(tp, MII_BMCR, data);
+ }
+
+ for (i = 0; i < 500; i++) {
+ ocp_data = ocp_reg_read(tp, OCP_PHY_STATUS) & PHY_STAT_MASK;
+ if (ocp_data == PHY_STAT_LAN_ON)
+ break;
+ msleep(20);
+ }
+
usb_disable_lpm(tp->udev);
r8153_u2p3en(tp, false);
@@ -3482,9 +3496,6 @@
ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, 0);
ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL4, 0);
- r8153_enable_eee(tp);
- r8153_aldps_en(tp, true);
- r8152b_enable_fc(tp);
rtl_tally_reset(tp);
r8153_u2p3en(tp, true);
}
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
index a3fe73b..66957ac 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
@@ -531,6 +531,15 @@
int hdrlen = ieee80211_hdrlen(hdr->frame_control);
int queue;
+ /* IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used
+ * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel
+ * queue. STATION (HS2.0) uses the auxiliary context of the FW,
+ * and hence needs to be sent on the aux queue
+ */
+ if (IEEE80211_SKB_CB(skb)->hw_queue == IWL_MVM_OFFCHANNEL_QUEUE &&
+ skb_info->control.vif->type == NL80211_IFTYPE_STATION)
+ IEEE80211_SKB_CB(skb)->hw_queue = mvm->aux_queue;
+
memcpy(&info, skb->cb, sizeof(info));
if (WARN_ON_ONCE(info.flags & IEEE80211_TX_CTL_AMPDU))
@@ -544,16 +553,6 @@
/* This holds the amsdu headers length */
skb_info->driver_data[0] = (void *)(uintptr_t)0;
- /*
- * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used
- * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel
- * queue. STATION (HS2.0) uses the auxiliary context of the FW,
- * and hence needs to be sent on the aux queue
- */
- if (IEEE80211_SKB_CB(skb)->hw_queue == IWL_MVM_OFFCHANNEL_QUEUE &&
- info.control.vif->type == NL80211_IFTYPE_STATION)
- IEEE80211_SKB_CB(skb)->hw_queue = mvm->aux_queue;
-
queue = info.hw_queue;
/*
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index 8c35ac8..431f13b 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -487,7 +487,7 @@
};
static spinlock_t hwsim_radio_lock;
-static struct list_head hwsim_radios;
+static LIST_HEAD(hwsim_radios);
static int hwsim_radio_idx;
static struct platform_driver mac80211_hwsim_driver = {
@@ -3376,7 +3376,6 @@
mac80211_hwsim_unassign_vif_chanctx;
spin_lock_init(&hwsim_radio_lock);
- INIT_LIST_HEAD(&hwsim_radios);
err = register_pernet_device(&hwsim_net_ops);
if (err)
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 3a56268..b38fb2c 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -292,8 +292,6 @@
#endif
struct xen_netif_ctrl_back_ring ctrl;
- struct task_struct *ctrl_task;
- wait_queue_head_t ctrl_wq;
unsigned int ctrl_irq;
/* Miscellaneous private stuff. */
@@ -359,7 +357,7 @@
int xenvif_dealloc_kthread(void *data);
-int xenvif_ctrl_kthread(void *data);
+irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data);
void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
@@ -412,8 +410,4 @@
void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
-#ifdef CONFIG_DEBUG_FS
-void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m);
-#endif
-
#endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
index e8c5ddd..613bac0 100644
--- a/drivers/net/xen-netback/hash.c
+++ b/drivers/net/xen-netback/hash.c
@@ -360,74 +360,6 @@
return XEN_NETIF_CTRL_STATUS_SUCCESS;
}
-#ifdef CONFIG_DEBUG_FS
-void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m)
-{
- unsigned int i;
-
- switch (vif->hash.alg) {
- case XEN_NETIF_CTRL_HASH_ALGORITHM_TOEPLITZ:
- seq_puts(m, "Hash Algorithm: TOEPLITZ\n");
- break;
-
- case XEN_NETIF_CTRL_HASH_ALGORITHM_NONE:
- seq_puts(m, "Hash Algorithm: NONE\n");
- /* FALLTHRU */
- default:
- return;
- }
-
- if (vif->hash.flags) {
- seq_puts(m, "\nHash Flags:\n");
-
- if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4)
- seq_puts(m, "- IPv4\n");
- if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP)
- seq_puts(m, "- IPv4 + TCP\n");
- if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6)
- seq_puts(m, "- IPv6\n");
- if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP)
- seq_puts(m, "- IPv6 + TCP\n");
- }
-
- seq_puts(m, "\nHash Key:\n");
-
- for (i = 0; i < XEN_NETBK_MAX_HASH_KEY_SIZE; ) {
- unsigned int j, n;
-
- n = 8;
- if (i + n >= XEN_NETBK_MAX_HASH_KEY_SIZE)
- n = XEN_NETBK_MAX_HASH_KEY_SIZE - i;
-
- seq_printf(m, "[%2u - %2u]: ", i, i + n - 1);
-
- for (j = 0; j < n; j++, i++)
- seq_printf(m, "%02x ", vif->hash.key[i]);
-
- seq_puts(m, "\n");
- }
-
- if (vif->hash.size != 0) {
- seq_puts(m, "\nHash Mapping:\n");
-
- for (i = 0; i < vif->hash.size; ) {
- unsigned int j, n;
-
- n = 8;
- if (i + n >= vif->hash.size)
- n = vif->hash.size - i;
-
- seq_printf(m, "[%4u - %4u]: ", i, i + n - 1);
-
- for (j = 0; j < n; j++, i++)
- seq_printf(m, "%4u ", vif->hash.mapping[i]);
-
- seq_puts(m, "\n");
- }
- }
-}
-#endif /* CONFIG_DEBUG_FS */
-
void xenvif_init_hash(struct xenvif *vif)
{
if (xenvif_hash_cache_size == 0)
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 83deeeb..fb50c6d 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -128,15 +128,6 @@
return IRQ_HANDLED;
}
-irqreturn_t xenvif_ctrl_interrupt(int irq, void *dev_id)
-{
- struct xenvif *vif = dev_id;
-
- wake_up(&vif->ctrl_wq);
-
- return IRQ_HANDLED;
-}
-
int xenvif_queue_stopped(struct xenvif_queue *queue)
{
struct net_device *dev = queue->vif->dev;
@@ -570,8 +561,7 @@
struct net_device *dev = vif->dev;
void *addr;
struct xen_netif_ctrl_sring *shared;
- struct task_struct *task;
- int err = -ENOMEM;
+ int err;
err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
&ring_ref, 1, &addr);
@@ -581,11 +571,7 @@
shared = (struct xen_netif_ctrl_sring *)addr;
BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE);
- init_waitqueue_head(&vif->ctrl_wq);
-
- err = bind_interdomain_evtchn_to_irqhandler(vif->domid, evtchn,
- xenvif_ctrl_interrupt,
- 0, dev->name, vif);
+ err = bind_interdomain_evtchn_to_irq(vif->domid, evtchn);
if (err < 0)
goto err_unmap;
@@ -593,19 +579,13 @@
xenvif_init_hash(vif);
- task = kthread_create(xenvif_ctrl_kthread, (void *)vif,
- "%s-control", dev->name);
- if (IS_ERR(task)) {
- pr_warn("Could not allocate kthread for %s\n", dev->name);
- err = PTR_ERR(task);
+ err = request_threaded_irq(vif->ctrl_irq, NULL, xenvif_ctrl_irq_fn,
+ IRQF_ONESHOT, "xen-netback-ctrl", vif);
+ if (err) {
+ pr_warn("Could not setup irq handler for %s\n", dev->name);
goto err_deinit;
}
- get_task_struct(task);
- vif->ctrl_task = task;
-
- wake_up_process(vif->ctrl_task);
-
return 0;
err_deinit:
@@ -774,12 +754,6 @@
void xenvif_disconnect_ctrl(struct xenvif *vif)
{
- if (vif->ctrl_task) {
- kthread_stop(vif->ctrl_task);
- put_task_struct(vif->ctrl_task);
- vif->ctrl_task = NULL;
- }
-
if (vif->ctrl_irq) {
xenvif_deinit_hash(vif);
unbind_from_irqhandler(vif->ctrl_irq, vif);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index edbae0b..3d0c989 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -2359,24 +2359,14 @@
return 0;
}
-int xenvif_ctrl_kthread(void *data)
+irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data)
{
struct xenvif *vif = data;
- for (;;) {
- wait_event_interruptible(vif->ctrl_wq,
- xenvif_ctrl_work_todo(vif) ||
- kthread_should_stop());
- if (kthread_should_stop())
- break;
+ while (xenvif_ctrl_work_todo(vif))
+ xenvif_ctrl_action(vif);
- while (xenvif_ctrl_work_todo(vif))
- xenvif_ctrl_action(vif);
-
- cond_resched();
- }
-
- return 0;
+ return IRQ_HANDLED;
}
static int __init netback_init(void)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index bacf6e0..daf4c78 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -165,7 +165,7 @@
return count;
}
-static int xenvif_io_ring_open(struct inode *inode, struct file *filp)
+static int xenvif_dump_open(struct inode *inode, struct file *filp)
{
int ret;
void *queue = NULL;
@@ -179,35 +179,13 @@
static const struct file_operations xenvif_dbg_io_ring_ops_fops = {
.owner = THIS_MODULE,
- .open = xenvif_io_ring_open,
+ .open = xenvif_dump_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
.write = xenvif_write_io_ring,
};
-static int xenvif_read_ctrl(struct seq_file *m, void *v)
-{
- struct xenvif *vif = m->private;
-
- xenvif_dump_hash_info(vif, m);
-
- return 0;
-}
-
-static int xenvif_ctrl_open(struct inode *inode, struct file *filp)
-{
- return single_open(filp, xenvif_read_ctrl, inode->i_private);
-}
-
-static const struct file_operations xenvif_dbg_ctrl_ops_fops = {
- .owner = THIS_MODULE,
- .open = xenvif_ctrl_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
-
static void xenvif_debugfs_addif(struct xenvif *vif)
{
struct dentry *pfile;
@@ -232,17 +210,6 @@
pr_warn("Creation of io_ring file returned %ld!\n",
PTR_ERR(pfile));
}
-
- if (vif->ctrl_task) {
- pfile = debugfs_create_file("ctrl",
- S_IRUSR,
- vif->xenvif_dbg_root,
- vif,
- &xenvif_dbg_ctrl_ops_fops);
- if (IS_ERR_OR_NULL(pfile))
- pr_warn("Creation of ctrl file returned %ld!\n",
- PTR_ERR(pfile));
- }
} else
netdev_warn(vif->dev,
"Creation of vif debugfs dir returned %ld!\n",
@@ -304,6 +271,11 @@
be->dev = dev;
dev_set_drvdata(&dev->dev, be);
+ be->state = XenbusStateInitialising;
+ err = xenbus_switch_state(dev, XenbusStateInitialising);
+ if (err)
+ goto fail;
+
sg = 1;
do {
@@ -416,11 +388,6 @@
be->hotplug_script = script;
- err = xenbus_switch_state(dev, XenbusStateInitWait);
- if (err)
- goto fail;
-
- be->state = XenbusStateInitWait;
/* This kicks hotplug scripts, so do it immediately. */
err = backend_create_xenvif(be);
@@ -525,20 +492,20 @@
/* Handle backend state transitions:
*
- * The backend state starts in InitWait and the following transitions are
+ * The backend state starts in Initialising and the following transitions are
* allowed.
*
- * InitWait -> Connected
+ * Initialising -> InitWait -> Connected
+ * \
+ * \ ^ \ |
+ * \ | \ |
+ * \ | \ |
+ * \ | \ |
+ * \ | \ |
+ * \ | \ |
+ * V | V V
*
- * ^ \ |
- * | \ |
- * | \ |
- * | \ |
- * | \ |
- * | \ |
- * | V V
- *
- * Closed <-> Closing
+ * Closed <-> Closing
*
* The state argument specifies the eventual state of the backend and the
* function transitions to that state via the shortest path.
@@ -548,6 +515,20 @@
{
while (be->state != state) {
switch (be->state) {
+ case XenbusStateInitialising:
+ switch (state) {
+ case XenbusStateInitWait:
+ case XenbusStateConnected:
+ case XenbusStateClosing:
+ backend_switch_state(be, XenbusStateInitWait);
+ break;
+ case XenbusStateClosed:
+ backend_switch_state(be, XenbusStateClosed);
+ break;
+ default:
+ BUG();
+ }
+ break;
case XenbusStateClosed:
switch (state) {
case XenbusStateInitWait:
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 96ccd4e..e17879d 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -565,6 +565,7 @@
struct netfront_queue *queue = NULL;
unsigned int num_queues = dev->real_num_tx_queues;
u16 queue_index;
+ struct sk_buff *nskb;
/* Drop the packet if no queues are set up */
if (num_queues < 1)
@@ -593,6 +594,20 @@
page = virt_to_page(skb->data);
offset = offset_in_page(skb->data);
+
+ /* The first req should be at least ETH_HLEN size or the packet will be
+ * dropped by netback.
+ */
+ if (unlikely(PAGE_SIZE - offset < ETH_HLEN)) {
+ nskb = skb_copy(skb, GFP_ATOMIC);
+ if (!nskb)
+ goto drop;
+ dev_kfree_skb_any(skb);
+ skb = nskb;
+ page = virt_to_page(skb->data);
+ offset = offset_in_page(skb->data);
+ }
+
len = skb_headlen(skb);
spin_lock_irqsave(&queue->tx_lock, flags);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8dcf5a9..60f7eab 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1693,7 +1693,12 @@
nvme_suspend_queue(dev->queues[i]);
if (csts & NVME_CSTS_CFS || !(csts & NVME_CSTS_RDY)) {
- nvme_suspend_queue(dev->queues[0]);
+ /* A device might become IO incapable very soon during
+ * probe, before the admin queue is configured. Thus,
+ * queue_count can be 0 here.
+ */
+ if (dev->queue_count)
+ nvme_suspend_queue(dev->queues[0]);
} else {
nvme_disable_io_queues(dev);
nvme_disable_admin_queue(dev, shutdown);
@@ -2112,6 +2117,8 @@
.driver_data = NVME_QUIRK_IDENTIFY_CNS, },
{ PCI_DEVICE(0x1c58, 0x0003), /* HGST adapter */
.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
+ { PCI_DEVICE(0x1c5f, 0x0540), /* Memblaze Pblaze4 adapter */
+ .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) },
{ 0, }
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index ab545fb..c2c2c28 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -82,6 +82,8 @@
enum nvme_rdma_queue_flags {
NVME_RDMA_Q_CONNECTED = (1 << 0),
+ NVME_RDMA_IB_QUEUE_ALLOCATED = (1 << 1),
+ NVME_RDMA_Q_DELETING = (1 << 2),
};
struct nvme_rdma_queue {
@@ -291,6 +293,7 @@
if (IS_ERR(req->mr)) {
ret = PTR_ERR(req->mr);
req->mr = NULL;
+ goto out;
}
req->mr->need_inval = false;
@@ -480,9 +483,14 @@
static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
{
- struct nvme_rdma_device *dev = queue->device;
- struct ib_device *ibdev = dev->dev;
+ struct nvme_rdma_device *dev;
+ struct ib_device *ibdev;
+ if (!test_and_clear_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags))
+ return;
+
+ dev = queue->device;
+ ibdev = dev->dev;
rdma_destroy_qp(queue->cm_id);
ib_free_cq(queue->ib_cq);
@@ -533,6 +541,7 @@
ret = -ENOMEM;
goto out_destroy_qp;
}
+ set_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags);
return 0;
@@ -552,6 +561,7 @@
queue = &ctrl->queues[idx];
queue->ctrl = ctrl;
+ queue->flags = 0;
init_completion(&queue->cm_done);
if (idx > 0)
@@ -590,6 +600,7 @@
return 0;
out_destroy_cm_id:
+ nvme_rdma_destroy_queue_ib(queue);
rdma_destroy_id(queue->cm_id);
return ret;
}
@@ -608,7 +619,7 @@
static void nvme_rdma_stop_and_free_queue(struct nvme_rdma_queue *queue)
{
- if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
+ if (test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags))
return;
nvme_rdma_stop_queue(queue);
nvme_rdma_free_queue(queue);
@@ -652,7 +663,7 @@
return 0;
out_free_queues:
- for (; i >= 1; i--)
+ for (i--; i >= 1; i--)
nvme_rdma_stop_and_free_queue(&ctrl->queues[i]);
return ret;
@@ -761,8 +772,13 @@
{
struct nvme_rdma_ctrl *ctrl = container_of(work,
struct nvme_rdma_ctrl, err_work);
+ int i;
nvme_stop_keep_alive(&ctrl->ctrl);
+
+ for (i = 0; i < ctrl->queue_count; i++)
+ clear_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[i].flags);
+
if (ctrl->queue_count > 1)
nvme_stop_queues(&ctrl->ctrl);
blk_mq_stop_hw_queues(ctrl->ctrl.admin_q);
@@ -1305,58 +1321,6 @@
return ret;
}
-/**
- * nvme_rdma_device_unplug() - Handle RDMA device unplug
- * @queue: Queue that owns the cm_id that caught the event
- *
- * DEVICE_REMOVAL event notifies us that the RDMA device is about
- * to unplug so we should take care of destroying our RDMA resources.
- * This event will be generated for each allocated cm_id.
- *
- * In our case, the RDMA resources are managed per controller and not
- * only per queue. So the way we handle this is we trigger an implicit
- * controller deletion upon the first DEVICE_REMOVAL event we see, and
- * hold the event inflight until the controller deletion is completed.
- *
- * One exception that we need to handle is the destruction of the cm_id
- * that caught the event. Since we hold the callout until the controller
- * deletion is completed, we'll deadlock if the controller deletion will
- * call rdma_destroy_id on this queue's cm_id. Thus, we claim ownership
- * of destroying this queue before-hand, destroy the queue resources,
- * then queue the controller deletion which won't destroy this queue and
- * we destroy the cm_id implicitely by returning a non-zero rc to the callout.
- */
-static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
-{
- struct nvme_rdma_ctrl *ctrl = queue->ctrl;
- int ret = 0;
-
- /* Own the controller deletion */
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
- return 0;
-
- dev_warn(ctrl->ctrl.device,
- "Got rdma device removal event, deleting ctrl\n");
-
- /* Get rid of reconnect work if its running */
- cancel_delayed_work_sync(&ctrl->reconnect_work);
-
- /* Disable the queue so ctrl delete won't free it */
- if (test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
- /* Free this queue ourselves */
- nvme_rdma_stop_queue(queue);
- nvme_rdma_destroy_queue_ib(queue);
-
- /* Return non-zero so the cm_id will destroy implicitly */
- ret = 1;
- }
-
- /* Queue controller deletion */
- queue_work(nvme_rdma_wq, &ctrl->delete_work);
- flush_work(&ctrl->delete_work);
- return ret;
-}
-
static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
struct rdma_cm_event *ev)
{
@@ -1398,8 +1362,8 @@
nvme_rdma_error_recovery(queue->ctrl);
break;
case RDMA_CM_EVENT_DEVICE_REMOVAL:
- /* return 1 means impliciy CM ID destroy */
- return nvme_rdma_device_unplug(queue);
+ /* device removal is handled via the ib_client API */
+ break;
default:
dev_err(queue->ctrl->ctrl.device,
"Unexpected RDMA CM event (%d)\n", ev->event);
@@ -1700,15 +1664,19 @@
static int nvme_rdma_del_ctrl(struct nvme_ctrl *nctrl)
{
struct nvme_rdma_ctrl *ctrl = to_rdma_ctrl(nctrl);
- int ret;
+ int ret = 0;
+ /*
+ * Keep a reference until all work is flushed since
+ * __nvme_rdma_del_ctrl can free the ctrl mem
+ */
+ if (!kref_get_unless_zero(&ctrl->ctrl.kref))
+ return -EBUSY;
ret = __nvme_rdma_del_ctrl(ctrl);
- if (ret)
- return ret;
-
- flush_work(&ctrl->delete_work);
-
- return 0;
+ if (!ret)
+ flush_work(&ctrl->delete_work);
+ nvme_put_ctrl(&ctrl->ctrl);
+ return ret;
}
static void nvme_rdma_remove_ctrl_work(struct work_struct *work)
@@ -2005,27 +1973,57 @@
.create_ctrl = nvme_rdma_create_ctrl,
};
+static void nvme_rdma_add_one(struct ib_device *ib_device)
+{
+}
+
+static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
+{
+ struct nvme_rdma_ctrl *ctrl;
+
+ /* Delete all controllers using this device */
+ mutex_lock(&nvme_rdma_ctrl_mutex);
+ list_for_each_entry(ctrl, &nvme_rdma_ctrl_list, list) {
+ if (ctrl->device->dev != ib_device)
+ continue;
+ dev_info(ctrl->ctrl.device,
+ "Removing ctrl: NQN \"%s\", addr %pISp\n",
+ ctrl->ctrl.opts->subsysnqn, &ctrl->addr);
+ __nvme_rdma_del_ctrl(ctrl);
+ }
+ mutex_unlock(&nvme_rdma_ctrl_mutex);
+
+ flush_workqueue(nvme_rdma_wq);
+}
+
+static struct ib_client nvme_rdma_ib_client = {
+ .name = "nvme_rdma",
+ .add = nvme_rdma_add_one,
+ .remove = nvme_rdma_remove_one
+};
+
static int __init nvme_rdma_init_module(void)
{
+ int ret;
+
nvme_rdma_wq = create_workqueue("nvme_rdma_wq");
if (!nvme_rdma_wq)
return -ENOMEM;
+ ret = ib_register_client(&nvme_rdma_ib_client);
+ if (ret) {
+ destroy_workqueue(nvme_rdma_wq);
+ return ret;
+ }
+
nvmf_register_transport(&nvme_rdma_transport);
return 0;
}
static void __exit nvme_rdma_cleanup_module(void)
{
- struct nvme_rdma_ctrl *ctrl;
-
nvmf_unregister_transport(&nvme_rdma_transport);
-
- mutex_lock(&nvme_rdma_ctrl_mutex);
- list_for_each_entry(ctrl, &nvme_rdma_ctrl_list, list)
- __nvme_rdma_del_ctrl(ctrl);
- mutex_unlock(&nvme_rdma_ctrl_mutex);
-
+ ib_unregister_client(&nvme_rdma_ib_client);
destroy_workqueue(nvme_rdma_wq);
}
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index d1ef7ac..f9357e0 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -40,6 +40,7 @@
list_del(&dev->bus_list);
up_write(&pci_bus_sem);
+ pci_bridge_d3_device_removed(dev);
pci_free_resources(dev);
put_device(&dev->dev);
}
@@ -96,8 +97,6 @@
dev->subordinate = NULL;
}
- pci_bridge_d3_device_removed(dev);
-
pci_destroy_dev(dev);
}
diff --git a/drivers/pcmcia/ds.c b/drivers/pcmcia/ds.c
index 489ea10..69b5e81 100644
--- a/drivers/pcmcia/ds.c
+++ b/drivers/pcmcia/ds.c
@@ -977,7 +977,7 @@
/************************ runtime PM support ***************************/
-static int pcmcia_dev_suspend(struct device *dev, pm_message_t state);
+static int pcmcia_dev_suspend(struct device *dev);
static int pcmcia_dev_resume(struct device *dev);
static int runtime_suspend(struct device *dev)
@@ -985,7 +985,7 @@
int rc;
device_lock(dev);
- rc = pcmcia_dev_suspend(dev, PMSG_SUSPEND);
+ rc = pcmcia_dev_suspend(dev);
device_unlock(dev);
return rc;
}
@@ -1135,7 +1135,7 @@
/* PM support, also needed for reset */
-static int pcmcia_dev_suspend(struct device *dev, pm_message_t state)
+static int pcmcia_dev_suspend(struct device *dev)
{
struct pcmcia_device *p_dev = to_pcmcia_dev(dev);
struct pcmcia_driver *p_drv = NULL;
@@ -1410,6 +1410,9 @@
.remove_dev = &pcmcia_bus_remove_socket,
};
+static const struct dev_pm_ops pcmcia_bus_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(pcmcia_dev_suspend, pcmcia_dev_resume)
+};
struct bus_type pcmcia_bus_type = {
.name = "pcmcia",
@@ -1418,8 +1421,7 @@
.dev_groups = pcmcia_dev_groups,
.probe = pcmcia_device_probe,
.remove = pcmcia_device_remove,
- .suspend = pcmcia_dev_suspend,
- .resume = pcmcia_dev_resume,
+ .pm = &pcmcia_bus_pm_ops,
};
diff --git a/drivers/pcmcia/pxa2xx_base.c b/drivers/pcmcia/pxa2xx_base.c
index 483f919..91b5f57 100644
--- a/drivers/pcmcia/pxa2xx_base.c
+++ b/drivers/pcmcia/pxa2xx_base.c
@@ -214,9 +214,8 @@
}
#endif
-void pxa2xx_configure_sockets(struct device *dev)
+void pxa2xx_configure_sockets(struct device *dev, struct pcmcia_low_level *ops)
{
- struct pcmcia_low_level *ops = dev->platform_data;
/*
* We have at least one socket, so set MECR:CIT
* (Card Is There)
@@ -322,7 +321,7 @@
goto err1;
}
- pxa2xx_configure_sockets(&dev->dev);
+ pxa2xx_configure_sockets(&dev->dev, ops);
dev_set_drvdata(&dev->dev, sinfo);
return 0;
@@ -348,7 +347,9 @@
static int pxa2xx_drv_pcmcia_resume(struct device *dev)
{
- pxa2xx_configure_sockets(dev);
+ struct pcmcia_low_level *ops = (struct pcmcia_low_level *)dev->platform_data;
+
+ pxa2xx_configure_sockets(dev, ops);
return 0;
}
diff --git a/drivers/pcmcia/pxa2xx_base.h b/drivers/pcmcia/pxa2xx_base.h
index b609b45..e58c7a4 100644
--- a/drivers/pcmcia/pxa2xx_base.h
+++ b/drivers/pcmcia/pxa2xx_base.h
@@ -1,4 +1,4 @@
int pxa2xx_drv_pcmcia_add_one(struct soc_pcmcia_socket *skt);
void pxa2xx_drv_pcmcia_ops(struct pcmcia_low_level *ops);
-void pxa2xx_configure_sockets(struct device *dev);
+void pxa2xx_configure_sockets(struct device *dev, struct pcmcia_low_level *ops);
diff --git a/drivers/pcmcia/sa1111_badge4.c b/drivers/pcmcia/sa1111_badge4.c
index 12f0dd0..2f49093 100644
--- a/drivers/pcmcia/sa1111_badge4.c
+++ b/drivers/pcmcia/sa1111_badge4.c
@@ -134,20 +134,14 @@
int pcmcia_badge4_init(struct sa1111_dev *dev)
{
- int ret = -ENODEV;
+ printk(KERN_INFO
+ "%s: badge4_pcmvcc=%d, badge4_pcmvpp=%d, badge4_cfvcc=%d\n",
+ __func__,
+ badge4_pcmvcc, badge4_pcmvpp, badge4_cfvcc);
- if (machine_is_badge4()) {
- printk(KERN_INFO
- "%s: badge4_pcmvcc=%d, badge4_pcmvpp=%d, badge4_cfvcc=%d\n",
- __func__,
- badge4_pcmvcc, badge4_pcmvpp, badge4_cfvcc);
-
- sa11xx_drv_pcmcia_ops(&badge4_pcmcia_ops);
- ret = sa1111_pcmcia_add(dev, &badge4_pcmcia_ops,
- sa11xx_drv_pcmcia_add_one);
- }
-
- return ret;
+ sa11xx_drv_pcmcia_ops(&badge4_pcmcia_ops);
+ return sa1111_pcmcia_add(dev, &badge4_pcmcia_ops,
+ sa11xx_drv_pcmcia_add_one);
}
static int __init pcmv_setup(char *s)
diff --git a/drivers/pcmcia/sa1111_generic.c b/drivers/pcmcia/sa1111_generic.c
index a1531fe..3d95dff 100644
--- a/drivers/pcmcia/sa1111_generic.c
+++ b/drivers/pcmcia/sa1111_generic.c
@@ -18,6 +18,7 @@
#include <mach/hardware.h>
#include <asm/hardware/sa1111.h>
+#include <asm/mach-types.h>
#include <asm/irq.h>
#include "sa1111_generic.h"
@@ -203,19 +204,30 @@
sa1111_writel(PCSSR_S0_SLEEP | PCSSR_S1_SLEEP, base + PCSSR);
sa1111_writel(PCCR_S0_FLT | PCCR_S1_FLT, base + PCCR);
+ ret = -ENODEV;
#ifdef CONFIG_SA1100_BADGE4
- pcmcia_badge4_init(dev);
+ if (machine_is_badge4())
+ ret = pcmcia_badge4_init(dev);
#endif
#ifdef CONFIG_SA1100_JORNADA720
- pcmcia_jornada720_init(dev);
+ if (machine_is_jornada720())
+ ret = pcmcia_jornada720_init(dev);
#endif
#ifdef CONFIG_ARCH_LUBBOCK
- pcmcia_lubbock_init(dev);
+ if (machine_is_lubbock())
+ ret = pcmcia_lubbock_init(dev);
#endif
#ifdef CONFIG_ASSABET_NEPONSET
- pcmcia_neponset_init(dev);
+ if (machine_is_assabet())
+ ret = pcmcia_neponset_init(dev);
#endif
- return 0;
+
+ if (ret) {
+ release_mem_region(dev->res.start, 512);
+ sa1111_disable_device(dev);
+ }
+
+ return ret;
}
static int pcmcia_remove(struct sa1111_dev *dev)
diff --git a/drivers/pcmcia/sa1111_jornada720.c b/drivers/pcmcia/sa1111_jornada720.c
index c2c3058..480a3ed 100644
--- a/drivers/pcmcia/sa1111_jornada720.c
+++ b/drivers/pcmcia/sa1111_jornada720.c
@@ -94,22 +94,17 @@
int pcmcia_jornada720_init(struct sa1111_dev *sadev)
{
- int ret = -ENODEV;
+ unsigned int pin = GPIO_A0 | GPIO_A1 | GPIO_A2 | GPIO_A3;
- if (machine_is_jornada720()) {
- unsigned int pin = GPIO_A0 | GPIO_A1 | GPIO_A2 | GPIO_A3;
+ /* Fixme: why messing around with SA11x0's GPIO1? */
+ GRER |= 0x00000002;
- GRER |= 0x00000002;
+ /* Set GPIO_A<3:1> to be outputs for PCMCIA/CF power controller: */
+ sa1111_set_io_dir(sadev, pin, 0, 0);
+ sa1111_set_io(sadev, pin, 0);
+ sa1111_set_sleep_io(sadev, pin, 0);
- /* Set GPIO_A<3:1> to be outputs for PCMCIA/CF power controller: */
- sa1111_set_io_dir(sadev, pin, 0, 0);
- sa1111_set_io(sadev, pin, 0);
- sa1111_set_sleep_io(sadev, pin, 0);
-
- sa11xx_drv_pcmcia_ops(&jornada720_pcmcia_ops);
- ret = sa1111_pcmcia_add(sadev, &jornada720_pcmcia_ops,
- sa11xx_drv_pcmcia_add_one);
- }
-
- return ret;
+ sa11xx_drv_pcmcia_ops(&jornada720_pcmcia_ops);
+ return sa1111_pcmcia_add(sadev, &jornada720_pcmcia_ops,
+ sa11xx_drv_pcmcia_add_one);
}
diff --git a/drivers/pcmcia/sa1111_lubbock.c b/drivers/pcmcia/sa1111_lubbock.c
index c5caf57..e741f49 100644
--- a/drivers/pcmcia/sa1111_lubbock.c
+++ b/drivers/pcmcia/sa1111_lubbock.c
@@ -210,27 +210,21 @@
int pcmcia_lubbock_init(struct sa1111_dev *sadev)
{
- int ret = -ENODEV;
+ /*
+ * Set GPIO_A<3:0> to be outputs for the MAX1600,
+ * and switch to standby mode.
+ */
+ sa1111_set_io_dir(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0, 0);
+ sa1111_set_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
+ sa1111_set_sleep_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
- if (machine_is_lubbock()) {
- /*
- * Set GPIO_A<3:0> to be outputs for the MAX1600,
- * and switch to standby mode.
- */
- sa1111_set_io_dir(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0, 0);
- sa1111_set_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
- sa1111_set_sleep_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
+ /* Set CF Socket 1 power to standby mode. */
+ lubbock_set_misc_wr((1 << 15) | (1 << 14), 0);
- /* Set CF Socket 1 power to standby mode. */
- lubbock_set_misc_wr((1 << 15) | (1 << 14), 0);
-
- pxa2xx_drv_pcmcia_ops(&lubbock_pcmcia_ops);
- pxa2xx_configure_sockets(&sadev->dev);
- ret = sa1111_pcmcia_add(sadev, &lubbock_pcmcia_ops,
- pxa2xx_drv_pcmcia_add_one);
- }
-
- return ret;
+ pxa2xx_drv_pcmcia_ops(&lubbock_pcmcia_ops);
+ pxa2xx_configure_sockets(&sadev->dev, &lubbock_pcmcia_ops);
+ return sa1111_pcmcia_add(sadev, &lubbock_pcmcia_ops,
+ pxa2xx_drv_pcmcia_add_one);
}
MODULE_LICENSE("GPL");
diff --git a/drivers/pcmcia/sa1111_neponset.c b/drivers/pcmcia/sa1111_neponset.c
index 1d78739..019c395 100644
--- a/drivers/pcmcia/sa1111_neponset.c
+++ b/drivers/pcmcia/sa1111_neponset.c
@@ -110,20 +110,14 @@
int pcmcia_neponset_init(struct sa1111_dev *sadev)
{
- int ret = -ENODEV;
-
- if (machine_is_assabet()) {
- /*
- * Set GPIO_A<3:0> to be outputs for the MAX1600,
- * and switch to standby mode.
- */
- sa1111_set_io_dir(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0, 0);
- sa1111_set_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
- sa1111_set_sleep_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
- sa11xx_drv_pcmcia_ops(&neponset_pcmcia_ops);
- ret = sa1111_pcmcia_add(sadev, &neponset_pcmcia_ops,
- sa11xx_drv_pcmcia_add_one);
- }
-
- return ret;
+ /*
+ * Set GPIO_A<3:0> to be outputs for the MAX1600,
+ * and switch to standby mode.
+ */
+ sa1111_set_io_dir(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0, 0);
+ sa1111_set_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
+ sa1111_set_sleep_io(sadev, GPIO_A0|GPIO_A1|GPIO_A2|GPIO_A3, 0);
+ sa11xx_drv_pcmcia_ops(&neponset_pcmcia_ops);
+ return sa1111_pcmcia_add(sadev, &neponset_pcmcia_ops,
+ sa11xx_drv_pcmcia_add_one);
}
diff --git a/drivers/pcmcia/sa11xx_base.c b/drivers/pcmcia/sa11xx_base.c
index 9f6ec87..48140ac 100644
--- a/drivers/pcmcia/sa11xx_base.c
+++ b/drivers/pcmcia/sa11xx_base.c
@@ -144,19 +144,19 @@
sa1100_pcmcia_show_timing(struct soc_pcmcia_socket *skt, char *buf)
{
struct soc_pcmcia_timing timing;
- unsigned int clock = clk_get_rate(skt->clk);
+ unsigned int clock = clk_get_rate(skt->clk) / 1000;
unsigned long mecr = MECR;
char *p = buf;
soc_common_pcmcia_get_timing(skt, &timing);
- p+=sprintf(p, "I/O : %u (%u)\n", timing.io,
+ p+=sprintf(p, "I/O : %uns (%uns)\n", timing.io,
sa1100_pcmcia_cmd_time(clock, MECR_BSIO_GET(mecr, skt->nr)));
- p+=sprintf(p, "attribute: %u (%u)\n", timing.attr,
+ p+=sprintf(p, "attribute: %uns (%uns)\n", timing.attr,
sa1100_pcmcia_cmd_time(clock, MECR_BSA_GET(mecr, skt->nr)));
- p+=sprintf(p, "common : %u (%u)\n", timing.mem,
+ p+=sprintf(p, "common : %uns (%uns)\n", timing.mem,
sa1100_pcmcia_cmd_time(clock, MECR_BSM_GET(mecr, skt->nr)));
return p - buf;
diff --git a/drivers/pcmcia/soc_common.c b/drivers/pcmcia/soc_common.c
index eed5e9c..d5ca760 100644
--- a/drivers/pcmcia/soc_common.c
+++ b/drivers/pcmcia/soc_common.c
@@ -235,7 +235,7 @@
stat |= skt->cs_state.Vcc ? SS_POWERON : 0;
if (skt->cs_state.flags & SS_IOCARD)
- stat |= state.bvd1 ? SS_STSCHG : 0;
+ stat |= state.bvd1 ? 0 : SS_STSCHG;
else {
if (state.bvd1 == 0)
stat |= SS_BATDEAD;
diff --git a/drivers/rapidio/rio_cm.c b/drivers/rapidio/rio_cm.c
index 3fa17ac..cebc296 100644
--- a/drivers/rapidio/rio_cm.c
+++ b/drivers/rapidio/rio_cm.c
@@ -2247,17 +2247,30 @@
{
struct rio_channel *ch;
unsigned int i;
+ LIST_HEAD(list);
riocm_debug(EXIT, ".");
+ /*
+ * If there are any channels left in connected state send
+ * close notification to the connection partner.
+ * First build a list of channels that require a closing
+ * notification because function riocm_send_close() should
+ * be called outside of spinlock protected code.
+ */
spin_lock_bh(&idr_lock);
idr_for_each_entry(&ch_idr, ch, i) {
- riocm_debug(EXIT, "close ch %d", ch->id);
- if (ch->state == RIO_CM_CONNECTED)
- riocm_send_close(ch);
+ if (ch->state == RIO_CM_CONNECTED) {
+ riocm_debug(EXIT, "close ch %d", ch->id);
+ idr_remove(&ch_idr, ch->id);
+ list_add(&ch->ch_node, &list);
+ }
}
spin_unlock_bh(&idr_lock);
+ list_for_each_entry(ch, &list, ch_node)
+ riocm_send_close(ch);
+
return NOTIFY_DONE;
}
diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index bf40063..6d4b68c4 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -999,6 +999,7 @@
__u16, __u16,
enum qeth_prot_versions);
int qeth_set_features(struct net_device *, netdev_features_t);
+int qeth_recover_features(struct net_device *);
netdev_features_t qeth_fix_features(struct net_device *, netdev_features_t);
/* exports for OSN */
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 7dba6c8..20cf296 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -3619,7 +3619,8 @@
int e;
e = 0;
- while (buffer->element[e].addr) {
+ while ((e < QDIO_MAX_ELEMENTS_PER_BUFFER) &&
+ buffer->element[e].addr) {
unsigned long phys_aob_addr;
phys_aob_addr = (unsigned long) buffer->element[e].addr;
@@ -6131,6 +6132,35 @@
return rc;
}
+/* try to restore device features on a device after recovery */
+int qeth_recover_features(struct net_device *dev)
+{
+ struct qeth_card *card = dev->ml_priv;
+ netdev_features_t recover = dev->features;
+
+ if (recover & NETIF_F_IP_CSUM) {
+ if (qeth_set_ipa_csum(card, 1, IPA_OUTBOUND_CHECKSUM))
+ recover ^= NETIF_F_IP_CSUM;
+ }
+ if (recover & NETIF_F_RXCSUM) {
+ if (qeth_set_ipa_csum(card, 1, IPA_INBOUND_CHECKSUM))
+ recover ^= NETIF_F_RXCSUM;
+ }
+ if (recover & NETIF_F_TSO) {
+ if (qeth_set_ipa_tso(card, 1))
+ recover ^= NETIF_F_TSO;
+ }
+
+ if (recover == dev->features)
+ return 0;
+
+ dev_warn(&card->gdev->dev,
+ "Device recovery failed to restore all offload features\n");
+ dev->features = recover;
+ return -EIO;
+}
+EXPORT_SYMBOL_GPL(qeth_recover_features);
+
int qeth_set_features(struct net_device *dev, netdev_features_t features)
{
struct qeth_card *card = dev->ml_priv;
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 7bc20c5..bb27058 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1124,14 +1124,11 @@
card->dev->hw_features |= NETIF_F_RXCSUM;
card->dev->vlan_features |= NETIF_F_RXCSUM;
}
- /* Turn on SG per default */
- card->dev->features |= NETIF_F_SG;
}
card->info.broadcast_capable = 1;
qeth_l2_request_initial_mac(card);
card->dev->gso_max_size = (QETH_MAX_BUFFER_ELEMENTS(card) - 1) *
PAGE_SIZE;
- card->dev->gso_max_segs = (QETH_MAX_BUFFER_ELEMENTS(card) - 1);
SET_NETDEV_DEV(card->dev, &card->gdev->dev);
netif_napi_add(card->dev, &card->napi, qeth_l2_poll, QETH_NAPI_WEIGHT);
netif_carrier_off(card->dev);
@@ -1246,6 +1243,9 @@
}
/* this also sets saved unicast addresses */
qeth_l2_set_rx_mode(card->dev);
+ rtnl_lock();
+ qeth_recover_features(card->dev);
+ rtnl_unlock();
}
/* let user_space know that device is online */
kobject_uevent(&gdev->dev.kobj, KOBJ_CHANGE);
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index 7293466..272d9e7 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -257,6 +257,11 @@
if (addr->in_progress)
return -EINPROGRESS;
+ if (!qeth_card_hw_is_reachable(card)) {
+ addr->disp_flag = QETH_DISP_ADDR_DELETE;
+ return 0;
+ }
+
rc = qeth_l3_deregister_addr_entry(card, addr);
hash_del(&addr->hnode);
@@ -296,6 +301,11 @@
hash_add(card->ip_htable, &addr->hnode,
qeth_l3_ipaddr_hash(addr));
+ if (!qeth_card_hw_is_reachable(card)) {
+ addr->disp_flag = QETH_DISP_ADDR_ADD;
+ return 0;
+ }
+
/* qeth_l3_register_addr_entry can go to sleep
* if we add a IPV4 addr. It is caused by the reason
* that SETIP ipa cmd starts ARP staff for IPV4 addr.
@@ -390,12 +400,16 @@
int i;
int rc;
- QETH_CARD_TEXT(card, 4, "recoverip");
+ QETH_CARD_TEXT(card, 4, "recovrip");
spin_lock_bh(&card->ip_lock);
hash_for_each_safe(card->ip_htable, i, tmp, addr, hnode) {
- if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
+ if (addr->disp_flag == QETH_DISP_ADDR_DELETE) {
+ qeth_l3_deregister_addr_entry(card, addr);
+ hash_del(&addr->hnode);
+ kfree(addr);
+ } else if (addr->disp_flag == QETH_DISP_ADDR_ADD) {
if (addr->proto == QETH_PROT_IPV4) {
addr->in_progress = 1;
spin_unlock_bh(&card->ip_lock);
@@ -407,10 +421,8 @@
if (!rc) {
addr->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
- if (addr->ref_counter < 1) {
+ if (addr->ref_counter < 1)
qeth_l3_delete_ip(card, addr);
- kfree(addr);
- }
} else {
hash_del(&addr->hnode);
kfree(addr);
@@ -689,7 +701,7 @@
spin_lock_bh(&card->ip_lock);
- if (!qeth_l3_ip_from_hash(card, ipaddr))
+ if (qeth_l3_ip_from_hash(card, ipaddr))
rc = -EEXIST;
else
qeth_l3_add_ip(card, ipaddr);
@@ -757,7 +769,7 @@
spin_lock_bh(&card->ip_lock);
- if (!qeth_l3_ip_from_hash(card, ipaddr))
+ if (qeth_l3_ip_from_hash(card, ipaddr))
rc = -EEXIST;
else
qeth_l3_add_ip(card, ipaddr);
@@ -3108,7 +3120,6 @@
card->dev->vlan_features = NETIF_F_SG |
NETIF_F_RXCSUM | NETIF_F_IP_CSUM |
NETIF_F_TSO;
- card->dev->features = NETIF_F_SG;
}
}
} else if (card->info.type == QETH_CARD_TYPE_IQD) {
@@ -3136,7 +3147,6 @@
netif_keep_dst(card->dev);
card->dev->gso_max_size = (QETH_MAX_BUFFER_ELEMENTS(card) - 1) *
PAGE_SIZE;
- card->dev->gso_max_segs = (QETH_MAX_BUFFER_ELEMENTS(card) - 1);
SET_NETDEV_DEV(card->dev, &card->gdev->dev);
netif_napi_add(card->dev, &card->napi, qeth_l3_poll, QETH_NAPI_WEIGHT);
@@ -3269,6 +3279,7 @@
else
dev_open(card->dev);
qeth_l3_set_multicast_list(card->dev);
+ qeth_recover_features(card->dev);
rtnl_unlock();
}
qeth_trace_features(card);
diff --git a/drivers/s390/net/qeth_l3_sys.c b/drivers/s390/net/qeth_l3_sys.c
index 65645b1..0e00a5c 100644
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -297,7 +297,9 @@
addr->u.a6.pfxlen = 0;
addr->type = QETH_IP_TYPE_NORMAL;
+ spin_lock_bh(&card->ip_lock);
qeth_l3_delete_ip(card, addr);
+ spin_unlock_bh(&card->ip_lock);
kfree(addr);
}
@@ -329,7 +331,10 @@
addr->type = QETH_IP_TYPE_NORMAL;
} else
return -ENOMEM;
+
+ spin_lock_bh(&card->ip_lock);
qeth_l3_add_ip(card, addr);
+ spin_unlock_bh(&card->ip_lock);
kfree(addr);
return count;
diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
index e4ba2d2..7c0d7af 100644
--- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
+++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
@@ -84,6 +84,9 @@
static const struct cxgb4_uld_info cxgb4i_uld_info = {
.name = DRV_MODULE_NAME,
+ .nrxq = MAX_ULD_QSETS,
+ .rxq_size = 1024,
+ .lro = false,
.add = t4_uld_add,
.rx_handler = t4_uld_rx_handler,
.state_change = t4_uld_state_change,
diff --git a/drivers/staging/media/cec/TODO b/drivers/staging/media/cec/TODO
index a10d4f8..1322469 100644
--- a/drivers/staging/media/cec/TODO
+++ b/drivers/staging/media/cec/TODO
@@ -12,6 +12,7 @@
Other TODOs:
+- There are two possible replies to CEC_MSG_INITIATE_ARC. How to handle that?
- Add a flag to inhibit passing CEC RC messages to the rc subsystem.
Applications should be able to choose this when calling S_LOG_ADDRS.
- If the reply field of cec_msg is set then when the reply arrives it
diff --git a/drivers/staging/media/cec/cec-adap.c b/drivers/staging/media/cec/cec-adap.c
index b2393bb..946986f 100644
--- a/drivers/staging/media/cec/cec-adap.c
+++ b/drivers/staging/media/cec/cec-adap.c
@@ -124,10 +124,10 @@
u64 ts = ktime_get_ns();
struct cec_fh *fh;
- mutex_lock(&adap->devnode.fhs_lock);
+ mutex_lock(&adap->devnode.lock);
list_for_each_entry(fh, &adap->devnode.fhs, list)
cec_queue_event_fh(fh, ev, ts);
- mutex_unlock(&adap->devnode.fhs_lock);
+ mutex_unlock(&adap->devnode.lock);
}
/*
@@ -191,12 +191,12 @@
u32 monitor_mode = valid_la ? CEC_MODE_MONITOR :
CEC_MODE_MONITOR_ALL;
- mutex_lock(&adap->devnode.fhs_lock);
+ mutex_lock(&adap->devnode.lock);
list_for_each_entry(fh, &adap->devnode.fhs, list) {
if (fh->mode_follower >= monitor_mode)
cec_queue_msg_fh(fh, msg);
}
- mutex_unlock(&adap->devnode.fhs_lock);
+ mutex_unlock(&adap->devnode.lock);
}
/*
@@ -207,12 +207,12 @@
{
struct cec_fh *fh;
- mutex_lock(&adap->devnode.fhs_lock);
+ mutex_lock(&adap->devnode.lock);
list_for_each_entry(fh, &adap->devnode.fhs, list) {
if (fh->mode_follower == CEC_MODE_FOLLOWER)
cec_queue_msg_fh(fh, msg);
}
- mutex_unlock(&adap->devnode.fhs_lock);
+ mutex_unlock(&adap->devnode.lock);
}
/* Notify userspace of an adapter state change. */
@@ -851,6 +851,9 @@
if (!valid_la || msg->len <= 1)
return;
+ if (adap->log_addrs.log_addr_mask == 0)
+ return;
+
/*
* Process the message on the protocol level. If is_reply is true,
* then cec_receive_notify() won't pass on the reply to the listener(s)
@@ -1047,11 +1050,17 @@
dprintk(1, "could not claim LA %d\n", i);
}
+ if (adap->log_addrs.log_addr_mask == 0 &&
+ !(las->flags & CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK))
+ goto unconfigure;
+
configured:
if (adap->log_addrs.log_addr_mask == 0) {
/* Fall back to unregistered */
las->log_addr[0] = CEC_LOG_ADDR_UNREGISTERED;
las->log_addr_mask = 1 << las->log_addr[0];
+ for (i = 1; i < las->num_log_addrs; i++)
+ las->log_addr[i] = CEC_LOG_ADDR_INVALID;
}
adap->is_configured = true;
adap->is_configuring = false;
@@ -1070,6 +1079,8 @@
cec_report_features(adap, i);
cec_report_phys_addr(adap, i);
}
+ for (i = las->num_log_addrs; i < CEC_MAX_LOG_ADDRS; i++)
+ las->log_addr[i] = CEC_LOG_ADDR_INVALID;
mutex_lock(&adap->lock);
adap->kthread_config = NULL;
mutex_unlock(&adap->lock);
@@ -1398,7 +1409,6 @@
u8 init_laddr = cec_msg_initiator(msg);
u8 devtype = cec_log_addr2dev(adap, dest_laddr);
int la_idx = cec_log_addr2idx(adap, dest_laddr);
- bool is_directed = la_idx >= 0;
bool from_unregistered = init_laddr == 0xf;
struct cec_msg tx_cec_msg = { };
@@ -1560,7 +1570,7 @@
* Unprocessed messages are aborted if userspace isn't doing
* any processing either.
*/
- if (is_directed && !is_reply && !adap->follower_cnt &&
+ if (!is_broadcast && !is_reply && !adap->follower_cnt &&
!adap->cec_follower && msg->msg[1] != CEC_MSG_FEATURE_ABORT)
return cec_feature_abort(adap, msg);
break;
diff --git a/drivers/staging/media/cec/cec-api.c b/drivers/staging/media/cec/cec-api.c
index 7be7615..e274e2f 100644
--- a/drivers/staging/media/cec/cec-api.c
+++ b/drivers/staging/media/cec/cec-api.c
@@ -162,7 +162,7 @@
return -ENOTTY;
if (copy_from_user(&log_addrs, parg, sizeof(log_addrs)))
return -EFAULT;
- log_addrs.flags = 0;
+ log_addrs.flags &= CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK;
mutex_lock(&adap->lock);
if (!adap->is_configuring &&
(!log_addrs.num_log_addrs || !adap->is_configured) &&
@@ -435,7 +435,7 @@
void __user *parg = (void __user *)arg;
if (!devnode->registered)
- return -EIO;
+ return -ENODEV;
switch (cmd) {
case CEC_ADAP_G_CAPS:
@@ -508,14 +508,14 @@
filp->private_data = fh;
- mutex_lock(&devnode->fhs_lock);
+ mutex_lock(&devnode->lock);
/* Queue up initial state events */
ev_state.state_change.phys_addr = adap->phys_addr;
ev_state.state_change.log_addr_mask = adap->log_addrs.log_addr_mask;
cec_queue_event_fh(fh, &ev_state, 0);
list_add(&fh->list, &devnode->fhs);
- mutex_unlock(&devnode->fhs_lock);
+ mutex_unlock(&devnode->lock);
return 0;
}
@@ -540,9 +540,9 @@
cec_monitor_all_cnt_dec(adap);
mutex_unlock(&adap->lock);
- mutex_lock(&devnode->fhs_lock);
+ mutex_lock(&devnode->lock);
list_del(&fh->list);
- mutex_unlock(&devnode->fhs_lock);
+ mutex_unlock(&devnode->lock);
/* Unhook pending transmits from this filehandle. */
mutex_lock(&adap->lock);
diff --git a/drivers/staging/media/cec/cec-core.c b/drivers/staging/media/cec/cec-core.c
index 112a5fa..3b1e4d2 100644
--- a/drivers/staging/media/cec/cec-core.c
+++ b/drivers/staging/media/cec/cec-core.c
@@ -51,31 +51,29 @@
{
/*
* Check if the cec device is available. This needs to be done with
- * the cec_devnode_lock held to prevent an open/unregister race:
+ * the devnode->lock held to prevent an open/unregister race:
* without the lock, the device could be unregistered and freed between
* the devnode->registered check and get_device() calls, leading to
* a crash.
*/
- mutex_lock(&cec_devnode_lock);
+ mutex_lock(&devnode->lock);
/*
* return ENXIO if the cec device has been removed
* already or if it is not registered anymore.
*/
if (!devnode->registered) {
- mutex_unlock(&cec_devnode_lock);
+ mutex_unlock(&devnode->lock);
return -ENXIO;
}
/* and increase the device refcount */
get_device(&devnode->dev);
- mutex_unlock(&cec_devnode_lock);
+ mutex_unlock(&devnode->lock);
return 0;
}
void cec_put_device(struct cec_devnode *devnode)
{
- mutex_lock(&cec_devnode_lock);
put_device(&devnode->dev);
- mutex_unlock(&cec_devnode_lock);
}
/* Called when the last user of the cec device exits. */
@@ -84,11 +82,10 @@
struct cec_devnode *devnode = to_cec_devnode(cd);
mutex_lock(&cec_devnode_lock);
-
/* Mark device node number as free */
clear_bit(devnode->minor, cec_devnode_nums);
-
mutex_unlock(&cec_devnode_lock);
+
cec_delete_adapter(to_cec_adapter(devnode));
}
@@ -117,7 +114,7 @@
/* Initialization */
INIT_LIST_HEAD(&devnode->fhs);
- mutex_init(&devnode->fhs_lock);
+ mutex_init(&devnode->lock);
/* Part 1: Find a free minor number */
mutex_lock(&cec_devnode_lock);
@@ -160,7 +157,9 @@
cdev_del:
cdev_del(&devnode->cdev);
clr_bit:
+ mutex_lock(&cec_devnode_lock);
clear_bit(devnode->minor, cec_devnode_nums);
+ mutex_unlock(&cec_devnode_lock);
return ret;
}
@@ -177,17 +176,21 @@
{
struct cec_fh *fh;
- /* Check if devnode was never registered or already unregistered */
- if (!devnode->registered || devnode->unregistered)
- return;
+ mutex_lock(&devnode->lock);
- mutex_lock(&devnode->fhs_lock);
+ /* Check if devnode was never registered or already unregistered */
+ if (!devnode->registered || devnode->unregistered) {
+ mutex_unlock(&devnode->lock);
+ return;
+ }
+
list_for_each_entry(fh, &devnode->fhs, list)
wake_up_interruptible(&fh->wait);
- mutex_unlock(&devnode->fhs_lock);
devnode->registered = false;
devnode->unregistered = true;
+ mutex_unlock(&devnode->lock);
+
device_del(&devnode->dev);
cdev_del(&devnode->cdev);
put_device(&devnode->dev);
diff --git a/drivers/staging/media/pulse8-cec/pulse8-cec.c b/drivers/staging/media/pulse8-cec/pulse8-cec.c
index 94f8590..ed8bd95 100644
--- a/drivers/staging/media/pulse8-cec/pulse8-cec.c
+++ b/drivers/staging/media/pulse8-cec/pulse8-cec.c
@@ -114,14 +114,11 @@
cec_transmit_done(pulse8->adap, CEC_TX_STATUS_OK,
0, 0, 0, 0);
break;
- case MSGCODE_TRANSMIT_FAILED_LINE:
- cec_transmit_done(pulse8->adap, CEC_TX_STATUS_ARB_LOST,
- 1, 0, 0, 0);
- break;
case MSGCODE_TRANSMIT_FAILED_ACK:
cec_transmit_done(pulse8->adap, CEC_TX_STATUS_NACK,
0, 1, 0, 0);
break;
+ case MSGCODE_TRANSMIT_FAILED_LINE:
case MSGCODE_TRANSMIT_FAILED_TIMEOUT_DATA:
case MSGCODE_TRANSMIT_FAILED_TIMEOUT_LINE:
cec_transmit_done(pulse8->adap, CEC_TX_STATUS_ERROR,
@@ -170,6 +167,9 @@
case MSGCODE_TRANSMIT_FAILED_TIMEOUT_LINE:
schedule_work(&pulse8->work);
break;
+ case MSGCODE_HIGH_ERROR:
+ case MSGCODE_LOW_ERROR:
+ case MSGCODE_RECEIVE_FAILED:
case MSGCODE_TIMEOUT_ERROR:
break;
case MSGCODE_COMMAND_ACCEPTED:
@@ -388,7 +388,7 @@
int err;
cmd[0] = MSGCODE_TRANSMIT_IDLETIME;
- cmd[1] = 3;
+ cmd[1] = signal_free_time;
err = pulse8_send_and_wait(pulse8, cmd, 2,
MSGCODE_COMMAND_ACCEPTED, 1);
cmd[0] = MSGCODE_TRANSMIT_ACK_POLARITY;
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_main.c b/drivers/target/iscsi/cxgbit/cxgbit_main.c
index 27dd11a..ad26b93 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_main.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_main.c
@@ -652,6 +652,9 @@
static struct cxgb4_uld_info cxgbit_uld_info = {
.name = DRV_NAME,
+ .nrxq = MAX_ULD_QSETS,
+ .rxq_size = 1024,
+ .lro = true,
.add = cxgbit_uld_add,
.state_change = cxgbit_uld_state_change,
.lro_rx_handler = cxgbit_uld_lro_rx_handler,
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
index 15ce4ab..a2d90ac 100644
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -240,8 +240,10 @@
memcpy(&endpoint->desc, d, n);
INIT_LIST_HEAD(&endpoint->urb_list);
- /* Fix up bInterval values outside the legal range. Use 32 ms if no
- * proper value can be guessed. */
+ /*
+ * Fix up bInterval values outside the legal range.
+ * Use 10 or 8 ms if no proper value can be guessed.
+ */
i = 0; /* i = min, j = max, n = default */
j = 255;
if (usb_endpoint_xfer_int(d)) {
@@ -250,13 +252,15 @@
case USB_SPEED_SUPER_PLUS:
case USB_SPEED_SUPER:
case USB_SPEED_HIGH:
- /* Many device manufacturers are using full-speed
+ /*
+ * Many device manufacturers are using full-speed
* bInterval values in high-speed interrupt endpoint
- * descriptors. Try to fix those and fall back to a
- * 32 ms default value otherwise. */
+ * descriptors. Try to fix those and fall back to an
+ * 8-ms default value otherwise.
+ */
n = fls(d->bInterval*8);
if (n == 0)
- n = 9; /* 32 ms = 2^(9-1) uframes */
+ n = 7; /* 8 ms = 2^(7-1) uframes */
j = 16;
/*
@@ -271,10 +275,12 @@
}
break;
default: /* USB_SPEED_FULL or _LOW */
- /* For low-speed, 10 ms is the official minimum.
+ /*
+ * For low-speed, 10 ms is the official minimum.
* But some "overclocked" devices might want faster
- * polling so we'll allow it. */
- n = 32;
+ * polling so we'll allow it.
+ */
+ n = 10;
break;
}
} else if (usb_endpoint_xfer_isoc(d)) {
@@ -282,10 +288,10 @@
j = 16;
switch (to_usb_device(ddev)->speed) {
case USB_SPEED_HIGH:
- n = 9; /* 32 ms = 2^(9-1) uframes */
+ n = 7; /* 8 ms = 2^(7-1) uframes */
break;
default: /* USB_SPEED_FULL */
- n = 6; /* 32 ms = 2^(6-1) frames */
+ n = 4; /* 8 ms = 2^(4-1) frames */
break;
}
}
diff --git a/drivers/usb/musb/Kconfig b/drivers/usb/musb/Kconfig
index 886526b..73cfa13 100644
--- a/drivers/usb/musb/Kconfig
+++ b/drivers/usb/musb/Kconfig
@@ -87,7 +87,7 @@
config USB_MUSB_TUSB6010
tristate "TUSB6010"
depends on HAS_IOMEM
- depends on ARCH_OMAP2PLUS || COMPILE_TEST
+ depends on (ARCH_OMAP2PLUS || COMPILE_TEST) && !BLACKFIN
depends on NOP_USB_XCEIV = USB_MUSB_HDRC # both built-in or both modules
config USB_MUSB_OMAP2PLUS
diff --git a/drivers/usb/serial/usb-serial-simple.c b/drivers/usb/serial/usb-serial-simple.c
index a204782..e98b6e5 100644
--- a/drivers/usb/serial/usb-serial-simple.c
+++ b/drivers/usb/serial/usb-serial-simple.c
@@ -54,7 +54,8 @@
/* Infineon Flashloader driver */
#define FLASHLOADER_IDS() \
{ USB_DEVICE_INTERFACE_CLASS(0x058b, 0x0041, USB_CLASS_CDC_DATA) }, \
- { USB_DEVICE(0x8087, 0x0716) }
+ { USB_DEVICE(0x8087, 0x0716) }, \
+ { USB_DEVICE(0x8087, 0x0801) }
DEVICE(flashloader, FLASHLOADER_IDS);
/* Google Serial USB SubClass */
diff --git a/fs/aio.c b/fs/aio.c
index fb8e45b..4fe81d1 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -239,7 +239,12 @@
static const struct dentry_operations ops = {
.d_dname = simple_dname,
};
- return mount_pseudo(fs_type, "aio:", NULL, &ops, AIO_RING_MAGIC);
+ struct dentry *root = mount_pseudo(fs_type, "aio:", NULL, &ops,
+ AIO_RING_MAGIC);
+
+ if (!IS_ERR(root))
+ root->d_sb->s_iflags |= SB_I_NOEXEC;
+ return root;
}
/* aio_setup
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index b493909..d8e6d42 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -417,6 +417,7 @@
}
return NULL;
}
+
/*
* Find an eligible tree to time-out
* A tree is eligible if :-
@@ -432,6 +433,7 @@
struct dentry *root = sb->s_root;
struct dentry *dentry;
struct dentry *expired;
+ struct dentry *found;
struct autofs_info *ino;
if (!root)
@@ -442,31 +444,46 @@
dentry = NULL;
while ((dentry = get_next_positive_subdir(dentry, root))) {
+ int flags = how;
+
spin_lock(&sbi->fs_lock);
ino = autofs4_dentry_ino(dentry);
- if (ino->flags & AUTOFS_INF_WANT_EXPIRE)
- expired = NULL;
- else
- expired = should_expire(dentry, mnt, timeout, how);
- if (!expired) {
+ if (ino->flags & AUTOFS_INF_WANT_EXPIRE) {
spin_unlock(&sbi->fs_lock);
continue;
}
+ spin_unlock(&sbi->fs_lock);
+
+ expired = should_expire(dentry, mnt, timeout, flags);
+ if (!expired)
+ continue;
+
+ spin_lock(&sbi->fs_lock);
ino = autofs4_dentry_ino(expired);
ino->flags |= AUTOFS_INF_WANT_EXPIRE;
spin_unlock(&sbi->fs_lock);
synchronize_rcu();
- spin_lock(&sbi->fs_lock);
- if (should_expire(expired, mnt, timeout, how)) {
- if (expired != dentry)
- dput(dentry);
- goto found;
- }
+ /* Make sure a reference is not taken on found if
+ * things have changed.
+ */
+ flags &= ~AUTOFS_EXP_LEAVES;
+ found = should_expire(expired, mnt, timeout, how);
+ if (!found || found != expired)
+ /* Something has changed, continue */
+ goto next;
+
+ if (expired != dentry)
+ dput(dentry);
+
+ spin_lock(&sbi->fs_lock);
+ goto found;
+next:
+ spin_lock(&sbi->fs_lock);
ino->flags &= ~AUTOFS_INF_WANT_EXPIRE;
+ spin_unlock(&sbi->fs_lock);
if (expired != dentry)
dput(expired);
- spin_unlock(&sbi->fs_lock);
}
return NULL;
@@ -483,6 +500,7 @@
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry);
int status;
+ int state;
/* Block on any pending expire */
if (!(ino->flags & AUTOFS_INF_WANT_EXPIRE))
@@ -490,8 +508,19 @@
if (rcu_walk)
return -ECHILD;
+retry:
spin_lock(&sbi->fs_lock);
- if (ino->flags & AUTOFS_INF_EXPIRING) {
+ state = ino->flags & (AUTOFS_INF_WANT_EXPIRE | AUTOFS_INF_EXPIRING);
+ if (state == AUTOFS_INF_WANT_EXPIRE) {
+ spin_unlock(&sbi->fs_lock);
+ /*
+ * Possibly being selected for expire, wait until
+ * it's selected or not.
+ */
+ schedule_timeout_uninterruptible(HZ/10);
+ goto retry;
+ }
+ if (state & AUTOFS_INF_EXPIRING) {
spin_unlock(&sbi->fs_lock);
pr_debug("waiting for expire %p name=%pd\n", dentry, dentry);
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 6bbec5e..14ae4b8 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -609,6 +609,9 @@
char *s, *p;
char sep;
+ if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH)
+ return dget(sb->s_root);
+
full_path = cifs_build_path_to_root(vol, cifs_sb,
cifs_sb_master_tcon(cifs_sb));
if (full_path == NULL)
@@ -686,26 +689,22 @@
cifs_sb->mountdata = kstrndup(data, PAGE_SIZE, GFP_KERNEL);
if (cifs_sb->mountdata == NULL) {
root = ERR_PTR(-ENOMEM);
- goto out_cifs_sb;
+ goto out_free;
}
- if (volume_info->prepath) {
- cifs_sb->prepath = kstrdup(volume_info->prepath, GFP_KERNEL);
- if (cifs_sb->prepath == NULL) {
- root = ERR_PTR(-ENOMEM);
- goto out_cifs_sb;
- }
+ rc = cifs_setup_cifs_sb(volume_info, cifs_sb);
+ if (rc) {
+ root = ERR_PTR(rc);
+ goto out_free;
}
- cifs_setup_cifs_sb(volume_info, cifs_sb);
-
rc = cifs_mount(cifs_sb, volume_info);
if (rc) {
if (!(flags & MS_SILENT))
cifs_dbg(VFS, "cifs_mount failed w/return code = %d\n",
rc);
root = ERR_PTR(rc);
- goto out_mountdata;
+ goto out_free;
}
mnt_data.vol = volume_info;
@@ -735,11 +734,7 @@
sb->s_flags |= MS_ACTIVE;
}
- if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH)
- root = dget(sb->s_root);
- else
- root = cifs_get_root(volume_info, sb);
-
+ root = cifs_get_root(volume_info, sb);
if (IS_ERR(root))
goto out_super;
@@ -752,9 +747,9 @@
cifs_cleanup_volume_info(volume_info);
return root;
-out_mountdata:
+out_free:
+ kfree(cifs_sb->prepath);
kfree(cifs_sb->mountdata);
-out_cifs_sb:
kfree(cifs_sb);
out_nls:
unload_nls(volume_info->local_nls);
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 1243bd3..95dab43 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -184,7 +184,7 @@
unsigned int to_read);
extern int cifs_read_page_from_socket(struct TCP_Server_Info *server,
struct page *page, unsigned int to_read);
-extern void cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
+extern int cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
struct cifs_sb_info *cifs_sb);
extern int cifs_match_super(struct super_block *, void *);
extern void cifs_cleanup_volume_info(struct smb_vol *pvolume_info);
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 7ae0328..2e4f4ba 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2781,6 +2781,24 @@
return 1;
}
+static int
+match_prepath(struct super_block *sb, struct cifs_mnt_data *mnt_data)
+{
+ struct cifs_sb_info *old = CIFS_SB(sb);
+ struct cifs_sb_info *new = mnt_data->cifs_sb;
+
+ if (old->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH) {
+ if (!(new->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH))
+ return 0;
+ /* The prepath should be null terminated strings */
+ if (strcmp(new->prepath, old->prepath))
+ return 0;
+
+ return 1;
+ }
+ return 0;
+}
+
int
cifs_match_super(struct super_block *sb, void *data)
{
@@ -2808,7 +2826,8 @@
if (!match_server(tcp_srv, volume_info) ||
!match_session(ses, volume_info) ||
- !match_tcon(tcon, volume_info->UNC)) {
+ !match_tcon(tcon, volume_info->UNC) ||
+ !match_prepath(sb, mnt_data)) {
rc = 0;
goto out;
}
@@ -3222,7 +3241,7 @@
}
}
-void cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
+int cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
struct cifs_sb_info *cifs_sb)
{
INIT_DELAYED_WORK(&cifs_sb->prune_tlinks, cifs_prune_tlinks);
@@ -3316,6 +3335,14 @@
if ((pvolume_info->cifs_acl) && (pvolume_info->dynperm))
cifs_dbg(VFS, "mount option dynperm ignored if cifsacl mount option supported\n");
+
+ if (pvolume_info->prepath) {
+ cifs_sb->prepath = kstrdup(pvolume_info->prepath, GFP_KERNEL);
+ if (cifs_sb->prepath == NULL)
+ return -ENOMEM;
+ }
+
+ return 0;
}
static void
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 0f56deb..c415668 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -568,7 +568,7 @@
return thaw_super(sb);
}
-static long ioctl_file_dedupe_range(struct file *file, void __user *arg)
+static int ioctl_file_dedupe_range(struct file *file, void __user *arg)
{
struct file_dedupe_range __user *argp = arg;
struct file_dedupe_range *same = NULL;
@@ -582,6 +582,10 @@
}
size = offsetof(struct file_dedupe_range __user, info[count]);
+ if (size > PAGE_SIZE) {
+ ret = -ENOMEM;
+ goto out;
+ }
same = memdup_user(argp, size);
if (IS_ERR(same)) {
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 7d62097..ca699dd 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -657,7 +657,10 @@
if (result <= 0)
goto out;
- written = generic_write_sync(iocb, result);
+ result = generic_write_sync(iocb, result);
+ if (result < 0)
+ goto out;
+ written = result;
iocb->ki_pos += written;
/* Return error values */
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f5aecaa..a9dec32 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7570,12 +7570,20 @@
status = rpc_call_sync(session->clp->cl_rpcclient, &msg, RPC_TASK_TIMEOUT);
trace_nfs4_create_session(clp, status);
+ switch (status) {
+ case -NFS4ERR_STALE_CLIENTID:
+ case -NFS4ERR_DELAY:
+ case -ETIMEDOUT:
+ case -EACCES:
+ case -EAGAIN:
+ goto out;
+ };
+
+ clp->cl_seqid++;
if (!status) {
/* Verify the session's negotiated channel_attrs values */
status = nfs4_verify_channel_attrs(&args, &res);
/* Increment the clientid slot sequence id */
- if (clp->cl_seqid == res.seqid)
- clp->cl_seqid++;
if (status)
goto out;
nfs4_update_session(session, &res);
@@ -8190,10 +8198,13 @@
dprintk("--> %s\n", __func__);
spin_lock(&lo->plh_inode->i_lock);
- pnfs_mark_matching_lsegs_invalid(lo, &freeme, &lrp->args.range,
- be32_to_cpu(lrp->args.stateid.seqid));
- if (lrp->res.lrs_present && pnfs_layout_is_valid(lo))
+ if (lrp->res.lrs_present) {
+ pnfs_mark_matching_lsegs_invalid(lo, &freeme,
+ &lrp->args.range,
+ be32_to_cpu(lrp->args.stateid.seqid));
pnfs_set_layout_stateid(lo, &lrp->res.stateid, true);
+ } else
+ pnfs_mark_layout_stateid_invalid(lo, &freeme);
pnfs_clear_layoutreturn_waitbit(lo);
spin_unlock(&lo->plh_inode->i_lock);
nfs4_sequence_free_slot(&lrp->res.seq_res);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 6daf034..2c93a85 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -365,7 +365,8 @@
/* Matched by pnfs_get_layout_hdr in pnfs_layout_insert_lseg */
atomic_dec(&lo->plh_refcount);
if (list_empty(&lo->plh_segs)) {
- set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
+ if (atomic_read(&lo->plh_outstanding) == 0)
+ set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
}
rpc_wake_up(&NFS_SERVER(inode)->roc_rpcwaitq);
@@ -768,17 +769,32 @@
pnfs_destroy_layouts_byclid(clp, false);
}
+static void
+pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
+{
+ lo->plh_return_iomode = 0;
+ lo->plh_return_seq = 0;
+ clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+}
+
/* update lo->plh_stateid with new if is more recent */
void
pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
bool update_barrier)
{
u32 oldseq, newseq, new_barrier = 0;
- bool invalid = !pnfs_layout_is_valid(lo);
oldseq = be32_to_cpu(lo->plh_stateid.seqid);
newseq = be32_to_cpu(new->seqid);
- if (invalid || pnfs_seqid_is_newer(newseq, oldseq)) {
+
+ if (!pnfs_layout_is_valid(lo)) {
+ nfs4_stateid_copy(&lo->plh_stateid, new);
+ lo->plh_barrier = newseq;
+ pnfs_clear_layoutreturn_info(lo);
+ clear_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
+ return;
+ }
+ if (pnfs_seqid_is_newer(newseq, oldseq)) {
nfs4_stateid_copy(&lo->plh_stateid, new);
/*
* Because of wraparound, we want to keep the barrier
@@ -790,7 +806,7 @@
new_barrier = be32_to_cpu(new->seqid);
else if (new_barrier == 0)
return;
- if (invalid || pnfs_seqid_is_newer(new_barrier, lo->plh_barrier))
+ if (pnfs_seqid_is_newer(new_barrier, lo->plh_barrier))
lo->plh_barrier = new_barrier;
}
@@ -886,19 +902,14 @@
rpc_wake_up(&NFS_SERVER(lo->plh_inode)->roc_rpcwaitq);
}
-static void
-pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
-{
- lo->plh_return_iomode = 0;
- lo->plh_return_seq = 0;
- clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
-}
-
static bool
pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
nfs4_stateid *stateid,
enum pnfs_iomode *iomode)
{
+ /* Serialise LAYOUTGET/LAYOUTRETURN */
+ if (atomic_read(&lo->plh_outstanding) != 0)
+ return false;
if (test_and_set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags))
return false;
pnfs_get_layout_hdr(lo);
@@ -1798,16 +1809,11 @@
*/
pnfs_mark_layout_stateid_invalid(lo, &free_me);
- nfs4_stateid_copy(&lo->plh_stateid, &res->stateid);
- lo->plh_barrier = be32_to_cpu(res->stateid.seqid);
+ pnfs_set_layout_stateid(lo, &res->stateid, true);
}
pnfs_get_lseg(lseg);
pnfs_layout_insert_lseg(lo, lseg, &free_me);
- if (!pnfs_layout_is_valid(lo)) {
- pnfs_clear_layoutreturn_info(lo);
- clear_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
- }
if (res->return_on_close)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index d2f97ec..e0e5f7c 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -67,18 +67,7 @@
pr_debug("%s: group=%p event=%p\n", __func__, group, event);
- wait_event(group->fanotify_data.access_waitq, event->response ||
- atomic_read(&group->fanotify_data.bypass_perm));
-
- if (!event->response) { /* bypass_perm set */
- /*
- * Event was canceled because group is being destroyed. Remove
- * it from group's event list because we are responsible for
- * freeing the permission event.
- */
- fsnotify_remove_event(group, &event->fae.fse);
- return 0;
- }
+ wait_event(group->fanotify_data.access_waitq, event->response);
/* userspace responded, convert to something usable */
switch (event->response) {
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 8e8e6bc..a643138 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -358,16 +358,20 @@
#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
struct fanotify_perm_event_info *event, *next;
+ struct fsnotify_event *fsn_event;
/*
- * There may be still new events arriving in the notification queue
- * but since userspace cannot use fanotify fd anymore, no event can
- * enter or leave access_list by now.
+ * Stop new events from arriving in the notification queue. since
+ * userspace cannot use fanotify fd anymore, no event can enter or
+ * leave access_list by now either.
+ */
+ fsnotify_group_stop_queueing(group);
+
+ /*
+ * Process all permission events on access_list and notification queue
+ * and simulate reply from userspace.
*/
spin_lock(&group->fanotify_data.access_lock);
-
- atomic_inc(&group->fanotify_data.bypass_perm);
-
list_for_each_entry_safe(event, next, &group->fanotify_data.access_list,
fae.fse.list) {
pr_debug("%s: found group=%p event=%p\n", __func__, group,
@@ -379,12 +383,21 @@
spin_unlock(&group->fanotify_data.access_lock);
/*
- * Since bypass_perm is set, newly queued events will not wait for
- * access response. Wake up the already sleeping ones now.
- * synchronize_srcu() in fsnotify_destroy_group() will wait for all
- * processes sleeping in fanotify_handle_event() waiting for access
- * response and thus also for all permission events to be freed.
+ * Destroy all non-permission events. For permission events just
+ * dequeue them and set the response. They will be freed once the
+ * response is consumed and fanotify_get_response() returns.
*/
+ mutex_lock(&group->notification_mutex);
+ while (!fsnotify_notify_queue_is_empty(group)) {
+ fsn_event = fsnotify_remove_first_event(group);
+ if (!(fsn_event->mask & FAN_ALL_PERM_EVENTS))
+ fsnotify_destroy_event(group, fsn_event);
+ else
+ FANOTIFY_PE(fsn_event)->response = FAN_ALLOW;
+ }
+ mutex_unlock(&group->notification_mutex);
+
+ /* Response for all permission events it set, wakeup waiters */
wake_up(&group->fanotify_data.access_waitq);
#endif
@@ -755,7 +768,6 @@
spin_lock_init(&group->fanotify_data.access_lock);
init_waitqueue_head(&group->fanotify_data.access_waitq);
INIT_LIST_HEAD(&group->fanotify_data.access_list);
- atomic_set(&group->fanotify_data.bypass_perm, 0);
#endif
switch (flags & FAN_ALL_CLASS_BITS) {
case FAN_CLASS_NOTIF:
diff --git a/fs/notify/group.c b/fs/notify/group.c
index 3e2dd85..b47f7cf 100644
--- a/fs/notify/group.c
+++ b/fs/notify/group.c
@@ -40,6 +40,17 @@
}
/*
+ * Stop queueing new events for this group. Once this function returns
+ * fsnotify_add_event() will not add any new events to the group's queue.
+ */
+void fsnotify_group_stop_queueing(struct fsnotify_group *group)
+{
+ mutex_lock(&group->notification_mutex);
+ group->shutdown = true;
+ mutex_unlock(&group->notification_mutex);
+}
+
+/*
* Trying to get rid of a group. Remove all marks, flush all events and release
* the group reference.
* Note that another thread calling fsnotify_clear_marks_by_group() may still
@@ -47,6 +58,14 @@
*/
void fsnotify_destroy_group(struct fsnotify_group *group)
{
+ /*
+ * Stop queueing new events. The code below is careful enough to not
+ * require this but fanotify needs to stop queuing events even before
+ * fsnotify_destroy_group() is called and this makes the other callers
+ * of fsnotify_destroy_group() to see the same behavior.
+ */
+ fsnotify_group_stop_queueing(group);
+
/* clear all inode marks for this group, attach them to destroy_list */
fsnotify_detach_group_marks(group);
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index a95d8e0..e455e83 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -82,7 +82,8 @@
* Add an event to the group notification queue. The group can later pull this
* event off the queue to deal with. The function returns 0 if the event was
* added to the queue, 1 if the event was merged with some other queued event,
- * 2 if the queue of events has overflown.
+ * 2 if the event was not queued - either the queue of events has overflown
+ * or the group is shutting down.
*/
int fsnotify_add_event(struct fsnotify_group *group,
struct fsnotify_event *event,
@@ -96,6 +97,11 @@
mutex_lock(&group->notification_mutex);
+ if (group->shutdown) {
+ mutex_unlock(&group->notification_mutex);
+ return 2;
+ }
+
if (group->q_len >= group->max_events) {
ret = 2;
/* Queue overflow event only if it isn't already queued */
@@ -126,21 +132,6 @@
}
/*
- * Remove @event from group's notification queue. It is the responsibility of
- * the caller to destroy the event.
- */
-void fsnotify_remove_event(struct fsnotify_group *group,
- struct fsnotify_event *event)
-{
- mutex_lock(&group->notification_mutex);
- if (!list_empty(&event->list)) {
- list_del_init(&event->list);
- group->q_len--;
- }
- mutex_unlock(&group->notification_mutex);
-}
-
-/*
* Remove and return the first event from the notification list. It is the
* responsibility of the caller to destroy the obtained event
*/
diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 7dabbc3..f165f86 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -5922,7 +5922,6 @@
}
static int ocfs2_replay_truncate_records(struct ocfs2_super *osb,
- handle_t *handle,
struct inode *data_alloc_inode,
struct buffer_head *data_alloc_bh)
{
@@ -5935,11 +5934,19 @@
struct ocfs2_truncate_log *tl;
struct inode *tl_inode = osb->osb_tl_inode;
struct buffer_head *tl_bh = osb->osb_tl_bh;
+ handle_t *handle;
di = (struct ocfs2_dinode *) tl_bh->b_data;
tl = &di->id2.i_dealloc;
i = le16_to_cpu(tl->tl_used) - 1;
while (i >= 0) {
+ handle = ocfs2_start_trans(osb, OCFS2_TRUNCATE_LOG_FLUSH_ONE_REC);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto bail;
+ }
+
/* Caller has given us at least enough credits to
* update the truncate log dinode */
status = ocfs2_journal_access_di(handle, INODE_CACHE(tl_inode), tl_bh,
@@ -5974,12 +5981,7 @@
}
}
- status = ocfs2_extend_trans(handle,
- OCFS2_TRUNCATE_LOG_FLUSH_ONE_REC);
- if (status < 0) {
- mlog_errno(status);
- goto bail;
- }
+ ocfs2_commit_trans(osb, handle);
i--;
}
@@ -5994,7 +5996,6 @@
{
int status;
unsigned int num_to_flush;
- handle_t *handle;
struct inode *tl_inode = osb->osb_tl_inode;
struct inode *data_alloc_inode = NULL;
struct buffer_head *tl_bh = osb->osb_tl_bh;
@@ -6038,21 +6039,11 @@
goto out_mutex;
}
- handle = ocfs2_start_trans(osb, OCFS2_TRUNCATE_LOG_FLUSH_ONE_REC);
- if (IS_ERR(handle)) {
- status = PTR_ERR(handle);
- mlog_errno(status);
- goto out_unlock;
- }
-
- status = ocfs2_replay_truncate_records(osb, handle, data_alloc_inode,
+ status = ocfs2_replay_truncate_records(osb, data_alloc_inode,
data_alloc_bh);
if (status < 0)
mlog_errno(status);
- ocfs2_commit_trans(osb, handle);
-
-out_unlock:
brelse(data_alloc_bh);
ocfs2_inode_unlock(data_alloc_inode, 1);
@@ -6413,43 +6404,34 @@
goto out_mutex;
}
- handle = ocfs2_start_trans(osb, OCFS2_SUBALLOC_FREE);
- if (IS_ERR(handle)) {
- ret = PTR_ERR(handle);
- mlog_errno(ret);
- goto out_unlock;
- }
-
while (head) {
if (head->free_bg)
bg_blkno = head->free_bg;
else
bg_blkno = ocfs2_which_suballoc_group(head->free_blk,
head->free_bit);
+ handle = ocfs2_start_trans(osb, OCFS2_SUBALLOC_FREE);
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ mlog_errno(ret);
+ goto out_unlock;
+ }
+
trace_ocfs2_free_cached_blocks(
(unsigned long long)head->free_blk, head->free_bit);
ret = ocfs2_free_suballoc_bits(handle, inode, di_bh,
head->free_bit, bg_blkno, 1);
- if (ret) {
+ if (ret)
mlog_errno(ret);
- goto out_journal;
- }
- ret = ocfs2_extend_trans(handle, OCFS2_SUBALLOC_FREE);
- if (ret) {
- mlog_errno(ret);
- goto out_journal;
- }
+ ocfs2_commit_trans(osb, handle);
tmp = head;
head = head->free_next;
kfree(tmp);
}
-out_journal:
- ocfs2_commit_trans(osb, handle);
-
out_unlock:
ocfs2_inode_unlock(inode, 1);
brelse(di_bh);
diff --git a/fs/ocfs2/cluster/tcp_internal.h b/fs/ocfs2/cluster/tcp_internal.h
index 94b1836..b95e7df 100644
--- a/fs/ocfs2/cluster/tcp_internal.h
+++ b/fs/ocfs2/cluster/tcp_internal.h
@@ -44,9 +44,6 @@
* version here in tcp_internal.h should not need to be bumped for
* filesystem locking changes.
*
- * New in version 12
- * - Negotiate hb timeout when storage is down.
- *
* New in version 11
* - Negotiation of filesystem locking in the dlm join.
*
@@ -78,7 +75,7 @@
* - full 64 bit i_size in the metadata lock lvbs
* - introduction of "rw" lock and pushing meta/data locking down
*/
-#define O2NET_PROTOCOL_VERSION 12ULL
+#define O2NET_PROTOCOL_VERSION 11ULL
struct o2net_handshake {
__be64 protocol_version;
__be64 connector_id;
diff --git a/fs/ocfs2/dlm/dlmconvert.c b/fs/ocfs2/dlm/dlmconvert.c
index cdeafb4..0bb1286 100644
--- a/fs/ocfs2/dlm/dlmconvert.c
+++ b/fs/ocfs2/dlm/dlmconvert.c
@@ -268,7 +268,6 @@
struct dlm_lock *lock, int flags, int type)
{
enum dlm_status status;
- u8 old_owner = res->owner;
mlog(0, "type=%d, convert_type=%d, busy=%d\n", lock->ml.type,
lock->ml.convert_type, res->state & DLM_LOCK_RES_IN_PROGRESS);
@@ -335,7 +334,6 @@
spin_lock(&res->spinlock);
res->state &= ~DLM_LOCK_RES_IN_PROGRESS;
- lock->convert_pending = 0;
/* if it failed, move it back to granted queue.
* if master returns DLM_NORMAL and then down before sending ast,
* it may have already been moved to granted queue, reset to
@@ -344,12 +342,14 @@
if (status != DLM_NOTQUEUED)
dlm_error(status);
dlm_revert_pending_convert(res, lock);
- } else if ((res->state & DLM_LOCK_RES_RECOVERING) ||
- (old_owner != res->owner)) {
- mlog(0, "res %.*s is in recovering or has been recovered.\n",
- res->lockname.len, res->lockname.name);
+ } else if (!lock->convert_pending) {
+ mlog(0, "%s: res %.*s, owner died and lock has been moved back "
+ "to granted list, retry convert.\n",
+ dlm->name, res->lockname.len, res->lockname.name);
status = DLM_RECOVERING;
}
+
+ lock->convert_pending = 0;
bail:
spin_unlock(&res->spinlock);
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 4e7b0dc..0b055bf 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1506,7 +1506,8 @@
u64 start, u64 len)
{
int ret = 0;
- u64 tmpend, end = start + len;
+ u64 tmpend = 0;
+ u64 end = start + len;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
unsigned int csize = osb->s_clustersize;
handle_t *handle;
@@ -1538,18 +1539,31 @@
}
/*
- * We want to get the byte offset of the end of the 1st cluster.
+ * If start is on a cluster boundary and end is somewhere in another
+ * cluster, we have not COWed the cluster starting at start, unless
+ * end is also within the same cluster. So, in this case, we skip this
+ * first call to ocfs2_zero_range_for_truncate() truncate and move on
+ * to the next one.
*/
- tmpend = (u64)osb->s_clustersize + (start & ~(osb->s_clustersize - 1));
- if (tmpend > end)
- tmpend = end;
+ if ((start & (csize - 1)) != 0) {
+ /*
+ * We want to get the byte offset of the end of the 1st
+ * cluster.
+ */
+ tmpend = (u64)osb->s_clustersize +
+ (start & ~(osb->s_clustersize - 1));
+ if (tmpend > end)
+ tmpend = end;
- trace_ocfs2_zero_partial_clusters_range1((unsigned long long)start,
- (unsigned long long)tmpend);
+ trace_ocfs2_zero_partial_clusters_range1(
+ (unsigned long long)start,
+ (unsigned long long)tmpend);
- ret = ocfs2_zero_range_for_truncate(inode, handle, start, tmpend);
- if (ret)
- mlog_errno(ret);
+ ret = ocfs2_zero_range_for_truncate(inode, handle, start,
+ tmpend);
+ if (ret)
+ mlog_errno(ret);
+ }
if (tmpend < end) {
/*
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index ea47120..6ad3533 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1199,14 +1199,24 @@
inode_unlock((*ac)->ac_inode);
ret = ocfs2_try_to_free_truncate_log(osb, bits_wanted);
- if (ret == 1)
+ if (ret == 1) {
+ iput((*ac)->ac_inode);
+ (*ac)->ac_inode = NULL;
goto retry;
+ }
if (ret < 0)
mlog_errno(ret);
inode_lock((*ac)->ac_inode);
- ocfs2_inode_lock((*ac)->ac_inode, NULL, 1);
+ ret = ocfs2_inode_lock((*ac)->ac_inode, NULL, 1);
+ if (ret < 0) {
+ mlog_errno(ret);
+ inode_unlock((*ac)->ac_inode);
+ iput((*ac)->ac_inode);
+ (*ac)->ac_inode = NULL;
+ goto bail;
+ }
}
if (status < 0) {
if (status != -ENOSPC)
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index a939f5e..5c89a07 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -430,6 +430,7 @@
static ssize_t
read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
{
+ char *buf = file->private_data;
ssize_t acc = 0;
size_t size, tsz;
size_t elf_buflen;
@@ -500,23 +501,20 @@
if (clear_user(buffer, tsz))
return -EFAULT;
} else if (is_vmalloc_or_module_addr((void *)start)) {
- char * elf_buf;
-
- elf_buf = kzalloc(tsz, GFP_KERNEL);
- if (!elf_buf)
- return -ENOMEM;
- vread(elf_buf, (char *)start, tsz);
+ vread(buf, (char *)start, tsz);
/* we have to zero-fill user buffer even if no read */
- if (copy_to_user(buffer, elf_buf, tsz)) {
- kfree(elf_buf);
+ if (copy_to_user(buffer, buf, tsz))
return -EFAULT;
- }
- kfree(elf_buf);
} else {
if (kern_addr_valid(start)) {
unsigned long n;
- n = copy_to_user(buffer, (char *)start, tsz);
+ /*
+ * Using bounce buffer to bypass the
+ * hardened user copy kernel text checks.
+ */
+ memcpy(buf, (char *) start, tsz);
+ n = copy_to_user(buffer, buf, tsz);
/*
* We cannot distinguish between fault on source
* and fault on destination. When this happens
@@ -549,6 +547,11 @@
{
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
+
+ filp->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ if (!filp->private_data)
+ return -ENOMEM;
+
if (kcore_need_update)
kcore_update_ram();
if (i_size_read(inode) != proc_root_kcore->size) {
@@ -559,10 +562,16 @@
return 0;
}
+static int release_kcore(struct inode *inode, struct file *file)
+{
+ kfree(file->private_data);
+ return 0;
+}
static const struct file_operations proc_kcore_operations = {
.read = read_kcore,
.open = open_kcore,
+ .release = release_kcore,
.llseek = default_llseek,
};
diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index 183a212..12af049 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -27,9 +27,17 @@
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/ramfs.h>
+#include <linux/sched.h>
#include "internal.h"
+static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
+ unsigned long addr, unsigned long len, unsigned long pgoff,
+ unsigned long flags)
+{
+ return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+}
+
const struct file_operations ramfs_file_operations = {
.read_iter = generic_file_read_iter,
.write_iter = generic_file_write_iter,
@@ -38,6 +46,7 @@
.splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.llseek = generic_file_llseek,
+ .get_unmapped_area = ramfs_mmu_get_unmapped_area,
};
const struct inode_operations ramfs_file_inode_operations = {
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index 5dea1fb..6df9b07 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -231,14 +231,18 @@
might_fault(); \
access_ok(VERIFY_READ, __p, sizeof(*ptr)) ? \
__get_user((x), (__typeof__(*(ptr)) *)__p) : \
- -EFAULT; \
+ ((x) = (__typeof__(*(ptr)))0,-EFAULT); \
})
#ifndef __get_user_fn
static inline int __get_user_fn(size_t size, const void __user *ptr, void *x)
{
- size = __copy_from_user(x, ptr, size);
- return size ? -EFAULT : size;
+ size_t n = __copy_from_user(x, ptr, size);
+ if (unlikely(n)) {
+ memset(x + (size - n), 0, n);
+ return -EFAULT;
+ }
+ return 0;
}
#define __get_user_fn(sz, u, k) __get_user_fn(sz, u, k)
@@ -258,11 +262,13 @@
static inline long copy_from_user(void *to,
const void __user * from, unsigned long n)
{
+ unsigned long res = n;
might_fault();
- if (access_ok(VERIFY_READ, from, n))
- return __copy_from_user(to, from, n);
- else
- return n;
+ if (likely(access_ok(VERIFY_READ, from, n)))
+ res = __copy_from_user(to, from, n);
+ if (unlikely(res))
+ memset(to + (n - res), 0, res);
+ return res;
}
static inline long copy_to_user(void __user *to,
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9a904f6..c201017 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -96,6 +96,7 @@
struct bpf_func_proto {
u64 (*func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
bool gpl_only;
+ bool pkt_access;
enum bpf_return_type ret_type;
enum bpf_arg_type arg1_type;
enum bpf_arg_type arg2_type;
@@ -138,6 +139,13 @@
*/
PTR_TO_PACKET,
PTR_TO_PACKET_END, /* skb->data + headlen */
+
+ /* PTR_TO_MAP_VALUE_ADJ is used for doing pointer math inside of a map
+ * elem value. We only allow this if we can statically verify that
+ * access from this register are going to fall within the size of the
+ * map element.
+ */
+ PTR_TO_MAP_VALUE_ADJ,
};
struct bpf_prog;
@@ -151,7 +159,8 @@
*/
bool (*is_valid_access)(int off, int size, enum bpf_access_type type,
enum bpf_reg_type *reg_type);
-
+ int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
+ const struct bpf_prog *prog);
u32 (*convert_ctx_access)(enum bpf_access_type type, int dst_reg,
int src_reg, int ctx_off,
struct bpf_insn *insn, struct bpf_prog *prog);
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
new file mode 100644
index 0000000..7035b99
--- /dev/null
+++ b/include/linux/bpf_verifier.h
@@ -0,0 +1,102 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _LINUX_BPF_VERIFIER_H
+#define _LINUX_BPF_VERIFIER_H 1
+
+#include <linux/bpf.h> /* for enum bpf_reg_type */
+#include <linux/filter.h> /* for MAX_BPF_STACK */
+
+ /* Just some arbitrary values so we can safely do math without overflowing and
+ * are obviously wrong for any sort of memory access.
+ */
+#define BPF_REGISTER_MAX_RANGE (1024 * 1024 * 1024)
+#define BPF_REGISTER_MIN_RANGE -(1024 * 1024 * 1024)
+
+struct bpf_reg_state {
+ enum bpf_reg_type type;
+ /*
+ * Used to determine if any memory access using this register will
+ * result in a bad access.
+ */
+ u64 min_value, max_value;
+ union {
+ /* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
+ s64 imm;
+
+ /* valid when type == PTR_TO_PACKET* */
+ struct {
+ u32 id;
+ u16 off;
+ u16 range;
+ };
+
+ /* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE |
+ * PTR_TO_MAP_VALUE_OR_NULL
+ */
+ struct bpf_map *map_ptr;
+ };
+};
+
+enum bpf_stack_slot_type {
+ STACK_INVALID, /* nothing was stored in this stack slot */
+ STACK_SPILL, /* register spilled into stack */
+ STACK_MISC /* BPF program wrote some data into this slot */
+};
+
+#define BPF_REG_SIZE 8 /* size of eBPF register in bytes */
+
+/* state of the program:
+ * type of all registers and stack info
+ */
+struct bpf_verifier_state {
+ struct bpf_reg_state regs[MAX_BPF_REG];
+ u8 stack_slot_type[MAX_BPF_STACK];
+ struct bpf_reg_state spilled_regs[MAX_BPF_STACK / BPF_REG_SIZE];
+};
+
+/* linked list of verifier states used to prune search */
+struct bpf_verifier_state_list {
+ struct bpf_verifier_state state;
+ struct bpf_verifier_state_list *next;
+};
+
+struct bpf_insn_aux_data {
+ enum bpf_reg_type ptr_type; /* pointer type for load/store insns */
+};
+
+#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
+
+struct bpf_verifier_env;
+struct bpf_ext_analyzer_ops {
+ int (*insn_hook)(struct bpf_verifier_env *env,
+ int insn_idx, int prev_insn_idx);
+};
+
+/* single container for all structs
+ * one verifier_env per bpf_check() call
+ */
+struct bpf_verifier_env {
+ struct bpf_prog *prog; /* eBPF program being verified */
+ struct bpf_verifier_stack_elem *head; /* stack of verifier states to be processed */
+ int stack_size; /* number of states to be processed */
+ struct bpf_verifier_state cur_state; /* current verifier state */
+ struct bpf_verifier_state_list **explored_states; /* search pruning optimization */
+ const struct bpf_ext_analyzer_ops *analyzer_ops; /* external analyzer ops */
+ void *analyzer_priv; /* pointer to external analyzer's private data */
+ struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
+ u32 used_map_cnt; /* number of used maps */
+ u32 id_gen; /* used to generate unique reg IDs */
+ bool allow_ptr_leaks;
+ bool seen_direct_write;
+ bool varlen_map_value_access;
+ struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
+};
+
+int bpf_analyzer(struct bpf_prog *prog, const struct bpf_ext_analyzer_ops *ops,
+ void *priv);
+
+#endif /* _LINUX_BPF_VERIFIER_H */
diff --git a/include/linux/cec-funcs.h b/include/linux/cec-funcs.h
index 82c3d3b..138bbf7 100644
--- a/include/linux/cec-funcs.h
+++ b/include/linux/cec-funcs.h
@@ -162,10 +162,11 @@
/* One Touch Record Feature */
-static inline void cec_msg_record_off(struct cec_msg *msg)
+static inline void cec_msg_record_off(struct cec_msg *msg, bool reply)
{
msg->len = 2;
msg->msg[1] = CEC_MSG_RECORD_OFF;
+ msg->reply = reply ? CEC_MSG_RECORD_STATUS : 0;
}
struct cec_op_arib_data {
@@ -227,7 +228,7 @@
if (digital->service_id_method == CEC_OP_SERVICE_ID_METHOD_BY_CHANNEL) {
*msg++ = (digital->channel.channel_number_fmt << 2) |
(digital->channel.major >> 8);
- *msg++ = digital->channel.major && 0xff;
+ *msg++ = digital->channel.major & 0xff;
*msg++ = digital->channel.minor >> 8;
*msg++ = digital->channel.minor & 0xff;
*msg++ = 0;
@@ -323,6 +324,7 @@
}
static inline void cec_msg_record_on(struct cec_msg *msg,
+ bool reply,
const struct cec_op_record_src *rec_src)
{
switch (rec_src->type) {
@@ -346,6 +348,7 @@
rec_src->ext_phys_addr.phys_addr);
break;
}
+ msg->reply = reply ? CEC_MSG_RECORD_STATUS : 0;
}
static inline void cec_ops_record_on(const struct cec_msg *msg,
@@ -1141,6 +1144,75 @@
msg->reply = reply ? CEC_MSG_DEVICE_VENDOR_ID : 0;
}
+static inline void cec_msg_vendor_command(struct cec_msg *msg,
+ __u8 size, const __u8 *vendor_cmd)
+{
+ if (size > 14)
+ size = 14;
+ msg->len = 2 + size;
+ msg->msg[1] = CEC_MSG_VENDOR_COMMAND;
+ memcpy(msg->msg + 2, vendor_cmd, size);
+}
+
+static inline void cec_ops_vendor_command(const struct cec_msg *msg,
+ __u8 *size,
+ const __u8 **vendor_cmd)
+{
+ *size = msg->len - 2;
+
+ if (*size > 14)
+ *size = 14;
+ *vendor_cmd = msg->msg + 2;
+}
+
+static inline void cec_msg_vendor_command_with_id(struct cec_msg *msg,
+ __u32 vendor_id, __u8 size,
+ const __u8 *vendor_cmd)
+{
+ if (size > 11)
+ size = 11;
+ msg->len = 5 + size;
+ msg->msg[1] = CEC_MSG_VENDOR_COMMAND_WITH_ID;
+ msg->msg[2] = vendor_id >> 16;
+ msg->msg[3] = (vendor_id >> 8) & 0xff;
+ msg->msg[4] = vendor_id & 0xff;
+ memcpy(msg->msg + 5, vendor_cmd, size);
+}
+
+static inline void cec_ops_vendor_command_with_id(const struct cec_msg *msg,
+ __u32 *vendor_id, __u8 *size,
+ const __u8 **vendor_cmd)
+{
+ *size = msg->len - 5;
+
+ if (*size > 11)
+ *size = 11;
+ *vendor_id = (msg->msg[2] << 16) | (msg->msg[3] << 8) | msg->msg[4];
+ *vendor_cmd = msg->msg + 5;
+}
+
+static inline void cec_msg_vendor_remote_button_down(struct cec_msg *msg,
+ __u8 size,
+ const __u8 *rc_code)
+{
+ if (size > 14)
+ size = 14;
+ msg->len = 2 + size;
+ msg->msg[1] = CEC_MSG_VENDOR_REMOTE_BUTTON_DOWN;
+ memcpy(msg->msg + 2, rc_code, size);
+}
+
+static inline void cec_ops_vendor_remote_button_down(const struct cec_msg *msg,
+ __u8 *size,
+ const __u8 **rc_code)
+{
+ *size = msg->len - 2;
+
+ if (*size > 14)
+ *size = 14;
+ *rc_code = msg->msg + 2;
+}
+
static inline void cec_msg_vendor_remote_button_up(struct cec_msg *msg)
{
msg->len = 2;
@@ -1277,7 +1349,7 @@
msg->len += 4;
msg->msg[3] = (ui_cmd->channel_identifier.channel_number_fmt << 2) |
(ui_cmd->channel_identifier.major >> 8);
- msg->msg[4] = ui_cmd->channel_identifier.major && 0xff;
+ msg->msg[4] = ui_cmd->channel_identifier.major & 0xff;
msg->msg[5] = ui_cmd->channel_identifier.minor >> 8;
msg->msg[6] = ui_cmd->channel_identifier.minor & 0xff;
break;
diff --git a/include/linux/cec.h b/include/linux/cec.h
index b3e2289..851968e 100644
--- a/include/linux/cec.h
+++ b/include/linux/cec.h
@@ -364,7 +364,7 @@
* @num_log_addrs: how many logical addresses should be claimed. Set by the
* caller.
* @vendor_id: the vendor ID of the device. Set by the caller.
- * @flags: set to 0.
+ * @flags: flags.
* @osd_name: the OSD name of the device. Set by the caller.
* @primary_device_type: the primary device type for each logical address.
* Set by the caller.
@@ -389,6 +389,9 @@
__u8 features[CEC_MAX_LOG_ADDRS][12];
};
+/* Allow a fallback to unregistered */
+#define CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK (1 << 0)
+
/* Events */
/* Event that occurs when the adapter state changes */
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 436aa4e..6685698 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -527,13 +527,14 @@
* object's lifetime is managed by something other than RCU. That
* "something other" might be reference counting or simple immortality.
*
- * The seemingly unused size_t variable is to validate @p is indeed a pointer
- * type by making sure it can be dereferenced.
+ * The seemingly unused variable ___typecheck_p validates that @p is
+ * indeed a pointer type by using a pointer to typeof(*p) as the type.
+ * Taking a pointer to typeof(*p) again is needed in case p is void *.
*/
#define lockless_dereference(p) \
({ \
typeof(p) _________p1 = READ_ONCE(p); \
- size_t __maybe_unused __size_of_ptr = sizeof(*(p)); \
+ typeof(*(p)) *___typecheck_p __maybe_unused; \
smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_________p1); \
})
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 242bf53..34bd805 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -1,6 +1,8 @@
#ifndef __CPUHOTPLUG_H
#define __CPUHOTPLUG_H
+#include <linux/types.h>
+
enum cpuhp_state {
CPUHP_OFFLINE,
CPUHP_CREATE_THREADS,
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 7f5a582..0148a30 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -118,6 +118,15 @@
u32 imagesize;
} efi_capsule_header_t;
+struct efi_boot_memmap {
+ efi_memory_desc_t **map;
+ unsigned long *map_size;
+ unsigned long *desc_size;
+ u32 *desc_ver;
+ unsigned long *key_ptr;
+ unsigned long *buff_size;
+};
+
/*
* EFI capsule flags
*/
@@ -946,7 +955,7 @@
/* Iterate through an efi_memory_map */
#define for_each_efi_memory_desc_in_map(m, md) \
for ((md) = (m)->map; \
- ((void *)(md) + (m)->desc_size) <= (m)->map_end; \
+ (md) && ((void *)(md) + (m)->desc_size) <= (m)->map_end; \
(md) = (void *)(md) + (m)->desc_size)
/**
@@ -1371,11 +1380,7 @@
efi_loaded_image_t *image, int *cmd_line_len);
efi_status_t efi_get_memory_map(efi_system_table_t *sys_table_arg,
- efi_memory_desc_t **map,
- unsigned long *map_size,
- unsigned long *desc_size,
- u32 *desc_ver,
- unsigned long *key_ptr);
+ struct efi_boot_memmap *map);
efi_status_t efi_low_alloc(efi_system_table_t *sys_table_arg,
unsigned long size, unsigned long align,
@@ -1457,4 +1462,14 @@
arch_efi_call_virt_teardown(); \
})
+typedef efi_status_t (*efi_exit_boot_map_processing)(
+ efi_system_table_t *sys_table_arg,
+ struct efi_boot_memmap *map,
+ void *priv);
+
+efi_status_t efi_exit_boot_services(efi_system_table_t *sys_table,
+ void *handle,
+ struct efi_boot_memmap *map,
+ void *priv,
+ efi_exit_boot_map_processing priv_func);
#endif /* _LINUX_EFI_H */
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 58205f33..7268ed0 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -148,6 +148,7 @@
#define FS_PRIO_1 1 /* fanotify content based access control */
#define FS_PRIO_2 2 /* fanotify pre-content access */
unsigned int priority;
+ bool shutdown; /* group is being shut down, don't queue more events */
/* stores all fastpath marks assoc with this group so they can be cleaned on unregister */
struct mutex mark_mutex; /* protect marks_list */
@@ -179,7 +180,6 @@
spinlock_t access_lock;
struct list_head access_list;
wait_queue_head_t access_waitq;
- atomic_t bypass_perm;
#endif /* CONFIG_FANOTIFY_ACCESS_PERMISSIONS */
int f_flags;
unsigned int max_marks;
@@ -292,6 +292,8 @@
extern void fsnotify_get_group(struct fsnotify_group *group);
/* drop reference on a group from fsnotify_alloc_group */
extern void fsnotify_put_group(struct fsnotify_group *group);
+/* group destruction begins, stop queuing new events */
+extern void fsnotify_group_stop_queueing(struct fsnotify_group *group);
/* destroy group */
extern void fsnotify_destroy_group(struct fsnotify_group *group);
/* fasync handler function */
@@ -304,8 +306,6 @@
struct fsnotify_event *event,
int (*merge)(struct list_head *,
struct fsnotify_event *));
-/* Remove passed event from groups notification queue */
-extern void fsnotify_remove_event(struct fsnotify_group *group, struct fsnotify_event *event);
/* true if the group notification queue is empty */
extern bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group);
/* return, but do not dequeue the first event on the notification queue */
diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index f923d15..0b17c58 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -25,5 +25,6 @@
__u32 max_tx_rate;
__u32 rss_query_en;
__u32 trusted;
+ __be16 vlan_proto;
};
#endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/irq.h b/include/linux/irq.h
index b52424e..0ac26c8 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -945,6 +945,16 @@
static inline void irq_gc_unlock(struct irq_chip_generic *gc) { }
#endif
+/*
+ * The irqsave variants are for usage in non interrupt code. Do not use
+ * them in irq_chip callbacks. Use irq_gc_lock() instead.
+ */
+#define irq_gc_lock_irqsave(gc, flags) \
+ raw_spin_lock_irqsave(&(gc)->lock, flags)
+
+#define irq_gc_unlock_irqrestore(gc, flags) \
+ raw_spin_unlock_irqrestore(&(gc)->lock, flags)
+
static inline void irq_reg_writel(struct irq_chip_generic *gc,
u32 val, int reg_offset)
{
diff --git a/include/linux/ktime.h b/include/linux/ktime.h
index 2b6a204..aa118ba 100644
--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -231,6 +231,11 @@
return ktime_sub_ns(kt, usec * NSEC_PER_USEC);
}
+static inline ktime_t ktime_sub_ms(const ktime_t kt, const u64 msec)
+{
+ return ktime_sub_ns(kt, msec * NSEC_PER_MSEC);
+}
+
extern ktime_t ktime_add_safe(const ktime_t lhs, const ktime_t rhs);
/**
diff --git a/include/linux/mlx4/cmd.h b/include/linux/mlx4/cmd.h
index 116b284..1f35686 100644
--- a/include/linux/mlx4/cmd.h
+++ b/include/linux/mlx4/cmd.h
@@ -309,7 +309,8 @@
struct ifla_vf_stats *vf_stats);
u32 mlx4_comm_get_version(void);
int mlx4_set_vf_mac(struct mlx4_dev *dev, int port, int vf, u64 mac);
-int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan, u8 qos);
+int mlx4_set_vf_vlan(struct mlx4_dev *dev, int port, int vf, u16 vlan,
+ u8 qos, __be16 proto);
int mlx4_set_vf_rate(struct mlx4_dev *dev, int port, int vf, int min_tx_rate,
int max_tx_rate);
int mlx4_set_vf_spoofchk(struct mlx4_dev *dev, int port, int vf, bool setting);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 42da355..59b50d3 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -221,6 +221,7 @@
MLX4_DEV_CAP_FLAG2_ROCE_V1_V2 = 1ULL << 33,
MLX4_DEV_CAP_FLAG2_DMFS_UC_MC_SNIFFER = 1ULL << 34,
MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT = 1ULL << 35,
+ MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP = 1ULL << 36,
};
enum {
@@ -1371,6 +1372,8 @@
int mlx4_SET_PORT_VXLAN(struct mlx4_dev *dev, u8 port, u8 steering, int enable);
int set_phv_bit(struct mlx4_dev *dev, u8 port, int new_val);
int get_phv_bit(struct mlx4_dev *dev, u8 port, int *phv);
+int mlx4_get_is_vlan_offload_disabled(struct mlx4_dev *dev, u8 port,
+ bool *vlan_offload_disabled);
int mlx4_find_cached_mac(struct mlx4_dev *dev, u8 port, u64 mac, int *idx);
int mlx4_find_cached_vlan(struct mlx4_dev *dev, u8 port, u16 vid, int *idx);
int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h
index deaa221..b4ee8f6 100644
--- a/include/linux/mlx4/qp.h
+++ b/include/linux/mlx4/qp.h
@@ -160,6 +160,7 @@
enum { /* fl */
MLX4_FL_CV = 1 << 6,
+ MLX4_FL_SV = 1 << 5,
MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2,
MLX4_FL_ETH_SRC_CHECK_MC_LB = 1 << 1,
MLX4_FL_ETH_SRC_CHECK_UC_LB = 1 << 0,
@@ -267,6 +268,7 @@
MLX4_UPD_QP_PATH_MASK_FVL_RX = 16 + 32,
MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_UC_LB = 18 + 32,
MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB = 19 + 32,
+ MLX4_UPD_QP_PATH_MASK_SV = 22 + 32,
};
enum { /* param3 */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2095b6a..136ae6bb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -789,6 +789,7 @@
TC_SETUP_CLSU32,
TC_SETUP_CLSFLOWER,
TC_SETUP_MATCHALL,
+ TC_SETUP_CLSBPF,
};
struct tc_cls_u32_offload;
@@ -800,6 +801,7 @@
struct tc_cls_u32_offload *cls_u32;
struct tc_cls_flower_offload *cls_flower;
struct tc_cls_matchall_offload *cls_mall;
+ struct tc_cls_bpf_offload *cls_bpf;
};
};
@@ -924,6 +926,14 @@
* 3. Update dev->stats asynchronously and atomically, and define
* neither operation.
*
+ * bool (*ndo_has_offload_stats)(int attr_id)
+ * Return true if this device supports offload stats of this attr_id.
+ *
+ * int (*ndo_get_offload_stats)(int attr_id, const struct net_device *dev,
+ * void *attr_data)
+ * Get statistics for offload operations by attr_id. Write it into the
+ * attr_data pointer.
+ *
* int (*ndo_vlan_rx_add_vid)(struct net_device *dev, __be16 proto, u16 vid);
* If device supports VLAN filtering this function is called when a
* VLAN id is registered.
@@ -936,7 +946,8 @@
*
* SR-IOV management functions.
* int (*ndo_set_vf_mac)(struct net_device *dev, int vf, u8* mac);
- * int (*ndo_set_vf_vlan)(struct net_device *dev, int vf, u16 vlan, u8 qos);
+ * int (*ndo_set_vf_vlan)(struct net_device *dev, int vf, u16 vlan,
+ * u8 qos, __be16 proto);
* int (*ndo_set_vf_rate)(struct net_device *dev, int vf, int min_tx_rate,
* int max_tx_rate);
* int (*ndo_set_vf_spoofchk)(struct net_device *dev, int vf, bool setting);
@@ -1155,6 +1166,10 @@
struct rtnl_link_stats64* (*ndo_get_stats64)(struct net_device *dev,
struct rtnl_link_stats64 *storage);
+ bool (*ndo_has_offload_stats)(int attr_id);
+ int (*ndo_get_offload_stats)(int attr_id,
+ const struct net_device *dev,
+ void *attr_data);
struct net_device_stats* (*ndo_get_stats)(struct net_device *dev);
int (*ndo_vlan_rx_add_vid)(struct net_device *dev,
@@ -1173,7 +1188,8 @@
int (*ndo_set_vf_mac)(struct net_device *dev,
int queue, u8 *mac);
int (*ndo_set_vf_vlan)(struct net_device *dev,
- int queue, u16 vlan, u8 qos);
+ int queue, u16 vlan,
+ u8 qos, __be16 proto);
int (*ndo_set_vf_rate)(struct net_device *dev,
int vf, int min_tx_rate,
int max_tx_rate);
@@ -1783,7 +1799,7 @@
#endif
struct netdev_queue __rcu *ingress_queue;
#ifdef CONFIG_NETFILTER_INGRESS
- struct list_head nf_hooks_ingress;
+ struct nf_hook_entry __rcu *nf_hooks_ingress;
#endif
unsigned char broadcast[MAX_ADDR_LEN];
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 9230f9a..abc7fdc 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -55,12 +55,34 @@
struct net_device *out;
struct sock *sk;
struct net *net;
- struct list_head *hook_list;
+ struct nf_hook_entry __rcu *hook_entries;
int (*okfn)(struct net *, struct sock *, struct sk_buff *);
};
+typedef unsigned int nf_hookfn(void *priv,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state);
+struct nf_hook_ops {
+ struct list_head list;
+
+ /* User fills in from here down. */
+ nf_hookfn *hook;
+ struct net_device *dev;
+ void *priv;
+ u_int8_t pf;
+ unsigned int hooknum;
+ /* Hooks are ordered in ascending priority. */
+ int priority;
+};
+
+struct nf_hook_entry {
+ struct nf_hook_entry __rcu *next;
+ struct nf_hook_ops ops;
+ const struct nf_hook_ops *orig_ops;
+};
+
static inline void nf_hook_state_init(struct nf_hook_state *p,
- struct list_head *hook_list,
+ struct nf_hook_entry *hook_entry,
unsigned int hook,
int thresh, u_int8_t pf,
struct net_device *indev,
@@ -76,26 +98,11 @@
p->out = outdev;
p->sk = sk;
p->net = net;
- p->hook_list = hook_list;
+ RCU_INIT_POINTER(p->hook_entries, hook_entry);
p->okfn = okfn;
}
-typedef unsigned int nf_hookfn(void *priv,
- struct sk_buff *skb,
- const struct nf_hook_state *state);
-struct nf_hook_ops {
- struct list_head list;
-
- /* User fills in from here down. */
- nf_hookfn *hook;
- struct net_device *dev;
- void *priv;
- u_int8_t pf;
- unsigned int hooknum;
- /* Hooks are ordered in ascending priority. */
- int priority;
-};
struct nf_sockopt_ops {
struct list_head list;
@@ -133,6 +140,8 @@
void nf_unregister_hook(struct nf_hook_ops *reg);
int nf_register_hooks(struct nf_hook_ops *reg, unsigned int n);
void nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n);
+int _nf_register_hooks(struct nf_hook_ops *reg, unsigned int n);
+void _nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n);
/* Functions to register get/setsockopt ranges (non-inclusive). You
need to check permissions yourself! */
@@ -161,7 +170,8 @@
int (*okfn)(struct net *, struct sock *, struct sk_buff *),
int thresh)
{
- struct list_head *hook_list;
+ struct nf_hook_entry *hook_head;
+ int ret = 1;
#ifdef HAVE_JUMP_LABEL
if (__builtin_constant_p(pf) &&
@@ -170,16 +180,19 @@
return 1;
#endif
- hook_list = &net->nf.hooks[pf][hook];
-
- if (!list_empty(hook_list)) {
+ rcu_read_lock();
+ hook_head = rcu_dereference(net->nf.hooks[pf][hook]);
+ if (hook_head) {
struct nf_hook_state state;
- nf_hook_state_init(&state, hook_list, hook, thresh,
+ nf_hook_state_init(&state, hook_head, hook, thresh,
pf, indev, outdev, sk, net, okfn);
- return nf_hook_slow(skb, &state);
+
+ ret = nf_hook_slow(skb, &state);
}
- return 1;
+ rcu_read_unlock();
+
+ return ret;
}
static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net,
diff --git a/include/linux/netfilter/nf_conntrack_common.h b/include/linux/netfilter/nf_conntrack_common.h
index 2755057..1d1ef4e 100644
--- a/include/linux/netfilter/nf_conntrack_common.h
+++ b/include/linux/netfilter/nf_conntrack_common.h
@@ -4,13 +4,9 @@
#include <uapi/linux/netfilter/nf_conntrack_common.h>
struct ip_conntrack_stat {
- unsigned int searched;
unsigned int found;
- unsigned int new;
unsigned int invalid;
unsigned int ignore;
- unsigned int delete;
- unsigned int delete_list;
unsigned int insert;
unsigned int insert_failed;
unsigned int drop;
diff --git a/include/linux/netfilter/nf_conntrack_proto_gre.h b/include/linux/netfilter/nf_conntrack_proto_gre.h
index df78dc2..dee0acd 100644
--- a/include/linux/netfilter/nf_conntrack_proto_gre.h
+++ b/include/linux/netfilter/nf_conntrack_proto_gre.h
@@ -1,68 +1,8 @@
#ifndef _CONNTRACK_PROTO_GRE_H
#define _CONNTRACK_PROTO_GRE_H
#include <asm/byteorder.h>
-
-/* GRE PROTOCOL HEADER */
-
-/* GRE Version field */
-#define GRE_VERSION_1701 0x0
-#define GRE_VERSION_PPTP 0x1
-
-/* GRE Protocol field */
-#define GRE_PROTOCOL_PPTP 0x880B
-
-/* GRE Flags */
-#define GRE_FLAG_C 0x80
-#define GRE_FLAG_R 0x40
-#define GRE_FLAG_K 0x20
-#define GRE_FLAG_S 0x10
-#define GRE_FLAG_A 0x80
-
-#define GRE_IS_C(f) ((f)&GRE_FLAG_C)
-#define GRE_IS_R(f) ((f)&GRE_FLAG_R)
-#define GRE_IS_K(f) ((f)&GRE_FLAG_K)
-#define GRE_IS_S(f) ((f)&GRE_FLAG_S)
-#define GRE_IS_A(f) ((f)&GRE_FLAG_A)
-
-/* GRE is a mess: Four different standards */
-struct gre_hdr {
-#if defined(__LITTLE_ENDIAN_BITFIELD)
- __u16 rec:3,
- srr:1,
- seq:1,
- key:1,
- routing:1,
- csum:1,
- version:3,
- reserved:4,
- ack:1;
-#elif defined(__BIG_ENDIAN_BITFIELD)
- __u16 csum:1,
- routing:1,
- key:1,
- seq:1,
- srr:1,
- rec:3,
- ack:1,
- reserved:4,
- version:3;
-#else
-#error "Adjust your <asm/byteorder.h> defines"
-#endif
- __be16 protocol;
-};
-
-/* modified GRE header for PPTP */
-struct gre_hdr_pptp {
- __u8 flags; /* bitfield */
- __u8 version; /* should be GRE_VERSION_PPTP */
- __be16 protocol; /* should be GRE_PROTOCOL_PPTP */
- __be16 payload_len; /* size of ppp payload, not inc. gre header */
- __be16 call_id; /* peer's call_id for this session */
- __be32 seq; /* sequence number. Present if S==1 */
- __be32 ack; /* seq number of highest packet received by */
- /* sender in this session */
-};
+#include <net/gre.h>
+#include <net/pptp.h>
struct nf_ct_gre {
unsigned int stream_timeout;
diff --git a/include/linux/netfilter_ingress.h b/include/linux/netfilter_ingress.h
index 5fcd375..33e37fb 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -11,22 +11,30 @@
if (!static_key_false(&nf_hooks_needed[NFPROTO_NETDEV][NF_NETDEV_INGRESS]))
return false;
#endif
- return !list_empty(&skb->dev->nf_hooks_ingress);
+ return rcu_access_pointer(skb->dev->nf_hooks_ingress);
}
+/* caller must hold rcu_read_lock */
static inline int nf_hook_ingress(struct sk_buff *skb)
{
+ struct nf_hook_entry *e = rcu_dereference(skb->dev->nf_hooks_ingress);
struct nf_hook_state state;
- nf_hook_state_init(&state, &skb->dev->nf_hooks_ingress,
- NF_NETDEV_INGRESS, INT_MIN, NFPROTO_NETDEV,
- skb->dev, NULL, NULL, dev_net(skb->dev), NULL);
+ /* Must recheck the ingress hook head, in the event it became NULL
+ * after the check in nf_hook_ingress_active evaluated to true.
+ */
+ if (unlikely(!e))
+ return 0;
+
+ nf_hook_state_init(&state, e, NF_NETDEV_INGRESS, INT_MIN,
+ NFPROTO_NETDEV, skb->dev, NULL, NULL,
+ dev_net(skb->dev), NULL);
return nf_hook_slow(skb, &state);
}
static inline void nf_hook_ingress_init(struct net_device *dev)
{
- INIT_LIST_HEAD(&dev->nf_hooks_ingress);
+ RCU_INIT_POINTER(dev->nf_hooks_ingress, NULL);
}
#else /* CONFIG_NETFILTER_INGRESS */
static inline int nf_hook_ingress_active(struct sk_buff *skb)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 66a1260..7e3d537 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -571,56 +571,56 @@
*/
static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
- int ret = 0;
char __user *end = uaddr + size - 1;
if (unlikely(size == 0))
- return ret;
+ return 0;
+ if (unlikely(uaddr > end))
+ return -EFAULT;
/*
* Writing zeroes into userspace here is OK, because we know that if
* the zero gets there, we'll be overwriting it.
*/
- while (uaddr <= end) {
- ret = __put_user(0, uaddr);
- if (ret != 0)
- return ret;
+ do {
+ if (unlikely(__put_user(0, uaddr) != 0))
+ return -EFAULT;
uaddr += PAGE_SIZE;
- }
+ } while (uaddr <= end);
/* Check whether the range spilled into the next page. */
if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK))
- ret = __put_user(0, end);
+ return __put_user(0, end);
- return ret;
+ return 0;
}
static inline int fault_in_multipages_readable(const char __user *uaddr,
int size)
{
volatile char c;
- int ret = 0;
const char __user *end = uaddr + size - 1;
if (unlikely(size == 0))
- return ret;
+ return 0;
- while (uaddr <= end) {
- ret = __get_user(c, uaddr);
- if (ret != 0)
- return ret;
+ if (unlikely(uaddr > end))
+ return -EFAULT;
+
+ do {
+ if (unlikely(__get_user(c, uaddr) != 0))
+ return -EFAULT;
uaddr += PAGE_SIZE;
- }
+ } while (uaddr <= end);
/* Check whether the range spilled into the next page. */
if (((unsigned long)uaddr & PAGE_MASK) ==
((unsigned long)end & PAGE_MASK)) {
- ret = __get_user(c, end);
- (void)c;
+ return __get_user(c, end);
}
- return ret;
+ return 0;
}
int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 2d24b28..e25f183 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -80,6 +80,7 @@
PHY_INTERFACE_MODE_XGMII,
PHY_INTERFACE_MODE_MOCA,
PHY_INTERFACE_MODE_QSGMII,
+ PHY_INTERFACE_MODE_TRGMII,
PHY_INTERFACE_MODE_MAX,
} phy_interface_t;
@@ -123,6 +124,8 @@
return "moca";
case PHY_INTERFACE_MODE_QSGMII:
return "qsgmii";
+ case PHY_INTERFACE_MODE_TRGMII:
+ return "trgmii";
default:
return "unknown";
}
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index 6b15e16..5ad54fc 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -127,6 +127,11 @@
*
* @info: Structure describing the new clock.
* @parent: Pointer to the parent device of the new clock.
+ *
+ * Returns a valid pointer on success or PTR_ERR on failure. If PHC
+ * support is missing at the configuration level, this function
+ * returns NULL, and drivers are expected to gracefully handle that
+ * case separately.
*/
extern struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index fd82584..5c132d3 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -1,7 +1,7 @@
/*
* Resizable, Scalable, Concurrent Hash Table
*
- * Copyright (c) 2015 Herbert Xu <herbert@gondor.apana.org.au>
+ * Copyright (c) 2015-2016 Herbert Xu <herbert@gondor.apana.org.au>
* Copyright (c) 2014-2015 Thomas Graf <tgraf@suug.ch>
* Copyright (c) 2008-2014 Patrick McHardy <kaber@trash.net>
*
@@ -53,6 +53,11 @@
struct rhash_head __rcu *next;
};
+struct rhlist_head {
+ struct rhash_head rhead;
+ struct rhlist_head __rcu *next;
+};
+
/**
* struct bucket_table - Table of hash buckets
* @size: Number of hash buckets
@@ -137,6 +142,7 @@
* @key_len: Key length for hashfn
* @elasticity: Maximum chain length before rehash
* @p: Configuration parameters
+ * @rhlist: True if this is an rhltable
* @run_work: Deferred worker to expand/shrink asynchronously
* @mutex: Mutex to protect current/future table swapping
* @lock: Spin lock to protect walker list
@@ -147,12 +153,21 @@
unsigned int key_len;
unsigned int elasticity;
struct rhashtable_params p;
+ bool rhlist;
struct work_struct run_work;
struct mutex mutex;
spinlock_t lock;
};
/**
+ * struct rhltable - Hash table with duplicate objects in a list
+ * @ht: Underlying rhtable
+ */
+struct rhltable {
+ struct rhashtable ht;
+};
+
+/**
* struct rhashtable_walker - Hash table walker
* @list: List entry on list of walkers
* @tbl: The table that we were walking over
@@ -163,9 +178,10 @@
};
/**
- * struct rhashtable_iter - Hash table iterator, fits into netlink cb
+ * struct rhashtable_iter - Hash table iterator
* @ht: Table to iterate through
* @p: Current pointer
+ * @list: Current hash list pointer
* @walker: Associated rhashtable walker
* @slot: Current slot
* @skip: Number of entries to skip in slot
@@ -173,6 +189,7 @@
struct rhashtable_iter {
struct rhashtable *ht;
struct rhash_head *p;
+ struct rhlist_head *list;
struct rhashtable_walker walker;
unsigned int slot;
unsigned int skip;
@@ -339,13 +356,11 @@
int rhashtable_init(struct rhashtable *ht,
const struct rhashtable_params *params);
+int rhltable_init(struct rhltable *hlt,
+ const struct rhashtable_params *params);
-struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
- const void *key,
- struct rhash_head *obj,
- struct bucket_table *old_tbl,
- void **data);
-int rhashtable_insert_rehash(struct rhashtable *ht, struct bucket_table *tbl);
+void *rhashtable_insert_slow(struct rhashtable *ht, const void *key,
+ struct rhash_head *obj);
void rhashtable_walk_enter(struct rhashtable *ht,
struct rhashtable_iter *iter);
@@ -507,6 +522,31 @@
rht_for_each_entry_rcu_continue(tpos, pos, (tbl)->buckets[hash],\
tbl, hash, member)
+/**
+ * rhl_for_each_rcu - iterate over rcu hash table list
+ * @pos: the &struct rlist_head to use as a loop cursor.
+ * @list: the head of the list
+ *
+ * This hash chain list-traversal primitive should be used on the
+ * list returned by rhltable_lookup.
+ */
+#define rhl_for_each_rcu(pos, list) \
+ for (pos = list; pos; pos = rcu_dereference_raw(pos->next))
+
+/**
+ * rhl_for_each_entry_rcu - iterate over rcu hash table list of given type
+ * @tpos: the type * to use as a loop cursor.
+ * @pos: the &struct rlist_head to use as a loop cursor.
+ * @list: the head of the list
+ * @member: name of the &struct rlist_head within the hashable struct.
+ *
+ * This hash chain list-traversal primitive should be used on the
+ * list returned by rhltable_lookup.
+ */
+#define rhl_for_each_entry_rcu(tpos, pos, list, member) \
+ for (pos = list; pos && rht_entry(tpos, pos, member); \
+ pos = rcu_dereference_raw(pos->next))
+
static inline int rhashtable_compare(struct rhashtable_compare_arg *arg,
const void *obj)
{
@@ -516,18 +556,8 @@
return memcmp(ptr + ht->p.key_offset, arg->key, ht->p.key_len);
}
-/**
- * rhashtable_lookup_fast - search hash table, inlined version
- * @ht: hash table
- * @key: the pointer to the key
- * @params: hash table parameters
- *
- * Computes the hash value for the key and traverses the bucket chain looking
- * for a entry with an identical key. The first matching entry is returned.
- *
- * Returns the first entry on which the compare function returned true.
- */
-static inline void *rhashtable_lookup_fast(
+/* Internal function, do not use. */
+static inline struct rhash_head *__rhashtable_lookup(
struct rhashtable *ht, const void *key,
const struct rhashtable_params params)
{
@@ -539,8 +569,6 @@
struct rhash_head *he;
unsigned int hash;
- rcu_read_lock();
-
tbl = rht_dereference_rcu(ht->tbl, ht);
restart:
hash = rht_key_hashfn(ht, tbl, key, params);
@@ -549,8 +577,7 @@
params.obj_cmpfn(&arg, rht_obj(ht, he)) :
rhashtable_compare(&arg, rht_obj(ht, he)))
continue;
- rcu_read_unlock();
- return rht_obj(ht, he);
+ return he;
}
/* Ensure we see any new tables. */
@@ -559,96 +586,165 @@
tbl = rht_dereference_rcu(tbl->future_tbl, ht);
if (unlikely(tbl))
goto restart;
- rcu_read_unlock();
return NULL;
}
+/**
+ * rhashtable_lookup - search hash table
+ * @ht: hash table
+ * @key: the pointer to the key
+ * @params: hash table parameters
+ *
+ * Computes the hash value for the key and traverses the bucket chain looking
+ * for a entry with an identical key. The first matching entry is returned.
+ *
+ * This must only be called under the RCU read lock.
+ *
+ * Returns the first entry on which the compare function returned true.
+ */
+static inline void *rhashtable_lookup(
+ struct rhashtable *ht, const void *key,
+ const struct rhashtable_params params)
+{
+ struct rhash_head *he = __rhashtable_lookup(ht, key, params);
+
+ return he ? rht_obj(ht, he) : NULL;
+}
+
+/**
+ * rhashtable_lookup_fast - search hash table, without RCU read lock
+ * @ht: hash table
+ * @key: the pointer to the key
+ * @params: hash table parameters
+ *
+ * Computes the hash value for the key and traverses the bucket chain looking
+ * for a entry with an identical key. The first matching entry is returned.
+ *
+ * Only use this function when you have other mechanisms guaranteeing
+ * that the object won't go away after the RCU read lock is released.
+ *
+ * Returns the first entry on which the compare function returned true.
+ */
+static inline void *rhashtable_lookup_fast(
+ struct rhashtable *ht, const void *key,
+ const struct rhashtable_params params)
+{
+ void *obj;
+
+ rcu_read_lock();
+ obj = rhashtable_lookup(ht, key, params);
+ rcu_read_unlock();
+
+ return obj;
+}
+
+/**
+ * rhltable_lookup - search hash list table
+ * @hlt: hash table
+ * @key: the pointer to the key
+ * @params: hash table parameters
+ *
+ * Computes the hash value for the key and traverses the bucket chain looking
+ * for a entry with an identical key. All matching entries are returned
+ * in a list.
+ *
+ * This must only be called under the RCU read lock.
+ *
+ * Returns the list of entries that match the given key.
+ */
+static inline struct rhlist_head *rhltable_lookup(
+ struct rhltable *hlt, const void *key,
+ const struct rhashtable_params params)
+{
+ struct rhash_head *he = __rhashtable_lookup(&hlt->ht, key, params);
+
+ return he ? container_of(he, struct rhlist_head, rhead) : NULL;
+}
+
/* Internal function, please use rhashtable_insert_fast() instead. This
* function returns the existing element already in hashes in there is a clash,
* otherwise it returns an error via ERR_PTR().
*/
static inline void *__rhashtable_insert_fast(
struct rhashtable *ht, const void *key, struct rhash_head *obj,
- const struct rhashtable_params params)
+ const struct rhashtable_params params, bool rhlist)
{
struct rhashtable_compare_arg arg = {
.ht = ht,
.key = key,
};
- struct bucket_table *tbl, *new_tbl;
+ struct rhash_head __rcu **pprev;
+ struct bucket_table *tbl;
struct rhash_head *head;
spinlock_t *lock;
- unsigned int elasticity;
unsigned int hash;
- void *data = NULL;
- int err;
+ int elasticity;
+ void *data;
-restart:
rcu_read_lock();
tbl = rht_dereference_rcu(ht->tbl, ht);
+ hash = rht_head_hashfn(ht, tbl, obj, params);
+ lock = rht_bucket_lock(tbl, hash);
+ spin_lock_bh(lock);
- /* All insertions must grab the oldest table containing
- * the hashed bucket that is yet to be rehashed.
- */
- for (;;) {
- hash = rht_head_hashfn(ht, tbl, obj, params);
- lock = rht_bucket_lock(tbl, hash);
- spin_lock_bh(lock);
-
- if (tbl->rehash <= hash)
- break;
-
+ if (unlikely(rht_dereference_bucket(tbl->future_tbl, tbl, hash))) {
+slow_path:
spin_unlock_bh(lock);
- tbl = rht_dereference_rcu(tbl->future_tbl, ht);
+ rcu_read_unlock();
+ return rhashtable_insert_slow(ht, key, obj);
}
- new_tbl = rht_dereference_rcu(tbl->future_tbl, ht);
- if (unlikely(new_tbl)) {
- tbl = rhashtable_insert_slow(ht, key, obj, new_tbl, &data);
- if (!IS_ERR_OR_NULL(tbl))
- goto slow_path;
+ elasticity = ht->elasticity;
+ pprev = &tbl->buckets[hash];
+ rht_for_each(head, tbl, hash) {
+ struct rhlist_head *plist;
+ struct rhlist_head *list;
- err = PTR_ERR(tbl);
- if (err == -EEXIST)
- err = 0;
+ elasticity--;
+ if (!key ||
+ (params.obj_cmpfn ?
+ params.obj_cmpfn(&arg, rht_obj(ht, head)) :
+ rhashtable_compare(&arg, rht_obj(ht, head))))
+ continue;
- goto out;
+ data = rht_obj(ht, head);
+
+ if (!rhlist)
+ goto out;
+
+
+ list = container_of(obj, struct rhlist_head, rhead);
+ plist = container_of(head, struct rhlist_head, rhead);
+
+ RCU_INIT_POINTER(list->next, plist);
+ head = rht_dereference_bucket(head->next, tbl, hash);
+ RCU_INIT_POINTER(list->rhead.next, head);
+ rcu_assign_pointer(*pprev, obj);
+
+ goto good;
}
- err = -E2BIG;
+ if (elasticity <= 0)
+ goto slow_path;
+
+ data = ERR_PTR(-E2BIG);
if (unlikely(rht_grow_above_max(ht, tbl)))
goto out;
- if (unlikely(rht_grow_above_100(ht, tbl))) {
-slow_path:
- spin_unlock_bh(lock);
- err = rhashtable_insert_rehash(ht, tbl);
- rcu_read_unlock();
- if (err)
- return ERR_PTR(err);
-
- goto restart;
- }
-
- err = 0;
- elasticity = ht->elasticity;
- rht_for_each(head, tbl, hash) {
- if (key &&
- unlikely(!(params.obj_cmpfn ?
- params.obj_cmpfn(&arg, rht_obj(ht, head)) :
- rhashtable_compare(&arg, rht_obj(ht, head))))) {
- data = rht_obj(ht, head);
- goto out;
- }
- if (!--elasticity)
- goto slow_path;
- }
+ if (unlikely(rht_grow_above_100(ht, tbl)))
+ goto slow_path;
head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
RCU_INIT_POINTER(obj->next, head);
+ if (rhlist) {
+ struct rhlist_head *list;
+
+ list = container_of(obj, struct rhlist_head, rhead);
+ RCU_INIT_POINTER(list->next, NULL);
+ }
rcu_assign_pointer(tbl->buckets[hash], obj);
@@ -656,11 +752,14 @@
if (rht_grow_above_75(ht, tbl))
schedule_work(&ht->run_work);
+good:
+ data = NULL;
+
out:
spin_unlock_bh(lock);
rcu_read_unlock();
- return err ? ERR_PTR(err) : data;
+ return data;
}
/**
@@ -685,7 +784,7 @@
{
void *ret;
- ret = __rhashtable_insert_fast(ht, NULL, obj, params);
+ ret = __rhashtable_insert_fast(ht, NULL, obj, params, false);
if (IS_ERR(ret))
return PTR_ERR(ret);
@@ -693,6 +792,58 @@
}
/**
+ * rhltable_insert_key - insert object into hash list table
+ * @hlt: hash list table
+ * @key: the pointer to the key
+ * @list: pointer to hash list head inside object
+ * @params: hash table parameters
+ *
+ * Will take a per bucket spinlock to protect against mutual mutations
+ * on the same bucket. Multiple insertions may occur in parallel unless
+ * they map to the same bucket lock.
+ *
+ * It is safe to call this function from atomic context.
+ *
+ * Will trigger an automatic deferred table resizing if the size grows
+ * beyond the watermark indicated by grow_decision() which can be passed
+ * to rhashtable_init().
+ */
+static inline int rhltable_insert_key(
+ struct rhltable *hlt, const void *key, struct rhlist_head *list,
+ const struct rhashtable_params params)
+{
+ return PTR_ERR(__rhashtable_insert_fast(&hlt->ht, key, &list->rhead,
+ params, true));
+}
+
+/**
+ * rhltable_insert - insert object into hash list table
+ * @hlt: hash list table
+ * @list: pointer to hash list head inside object
+ * @params: hash table parameters
+ *
+ * Will take a per bucket spinlock to protect against mutual mutations
+ * on the same bucket. Multiple insertions may occur in parallel unless
+ * they map to the same bucket lock.
+ *
+ * It is safe to call this function from atomic context.
+ *
+ * Will trigger an automatic deferred table resizing if the size grows
+ * beyond the watermark indicated by grow_decision() which can be passed
+ * to rhashtable_init().
+ */
+static inline int rhltable_insert(
+ struct rhltable *hlt, struct rhlist_head *list,
+ const struct rhashtable_params params)
+{
+ const char *key = rht_obj(&hlt->ht, &list->rhead);
+
+ key += params.key_offset;
+
+ return rhltable_insert_key(hlt, key, list, params);
+}
+
+/**
* rhashtable_lookup_insert_fast - lookup and insert object into hash table
* @ht: hash table
* @obj: pointer to hash head inside object
@@ -722,7 +873,8 @@
BUG_ON(ht->p.obj_hashfn);
- ret = __rhashtable_insert_fast(ht, key + ht->p.key_offset, obj, params);
+ ret = __rhashtable_insert_fast(ht, key + ht->p.key_offset, obj, params,
+ false);
if (IS_ERR(ret))
return PTR_ERR(ret);
@@ -759,7 +911,7 @@
BUG_ON(!ht->p.obj_hashfn || !key);
- ret = __rhashtable_insert_fast(ht, key, obj, params);
+ ret = __rhashtable_insert_fast(ht, key, obj, params, false);
if (IS_ERR(ret))
return PTR_ERR(ret);
@@ -783,13 +935,14 @@
{
BUG_ON(!ht->p.obj_hashfn || !key);
- return __rhashtable_insert_fast(ht, key, obj, params);
+ return __rhashtable_insert_fast(ht, key, obj, params, false);
}
/* Internal function, please use rhashtable_remove_fast() instead */
-static inline int __rhashtable_remove_fast(
+static inline int __rhashtable_remove_fast_one(
struct rhashtable *ht, struct bucket_table *tbl,
- struct rhash_head *obj, const struct rhashtable_params params)
+ struct rhash_head *obj, const struct rhashtable_params params,
+ bool rhlist)
{
struct rhash_head __rcu **pprev;
struct rhash_head *he;
@@ -804,18 +957,86 @@
pprev = &tbl->buckets[hash];
rht_for_each(he, tbl, hash) {
+ struct rhlist_head *list;
+
+ list = container_of(he, struct rhlist_head, rhead);
+
if (he != obj) {
+ struct rhlist_head __rcu **lpprev;
+
pprev = &he->next;
- continue;
+
+ if (!rhlist)
+ continue;
+
+ do {
+ lpprev = &list->next;
+ list = rht_dereference_bucket(list->next,
+ tbl, hash);
+ } while (list && obj != &list->rhead);
+
+ if (!list)
+ continue;
+
+ list = rht_dereference_bucket(list->next, tbl, hash);
+ RCU_INIT_POINTER(*lpprev, list);
+ err = 0;
+ break;
}
- rcu_assign_pointer(*pprev, obj->next);
- err = 0;
+ obj = rht_dereference_bucket(obj->next, tbl, hash);
+ err = 1;
+
+ if (rhlist) {
+ list = rht_dereference_bucket(list->next, tbl, hash);
+ if (list) {
+ RCU_INIT_POINTER(list->rhead.next, obj);
+ obj = &list->rhead;
+ err = 0;
+ }
+ }
+
+ rcu_assign_pointer(*pprev, obj);
break;
}
spin_unlock_bh(lock);
+ if (err > 0) {
+ atomic_dec(&ht->nelems);
+ if (unlikely(ht->p.automatic_shrinking &&
+ rht_shrink_below_30(ht, tbl)))
+ schedule_work(&ht->run_work);
+ err = 0;
+ }
+
+ return err;
+}
+
+/* Internal function, please use rhashtable_remove_fast() instead */
+static inline int __rhashtable_remove_fast(
+ struct rhashtable *ht, struct rhash_head *obj,
+ const struct rhashtable_params params, bool rhlist)
+{
+ struct bucket_table *tbl;
+ int err;
+
+ rcu_read_lock();
+
+ tbl = rht_dereference_rcu(ht->tbl, ht);
+
+ /* Because we have already taken (and released) the bucket
+ * lock in old_tbl, if we find that future_tbl is not yet
+ * visible then that guarantees the entry to still be in
+ * the old tbl if it exists.
+ */
+ while ((err = __rhashtable_remove_fast_one(ht, tbl, obj, params,
+ rhlist)) &&
+ (tbl = rht_dereference_rcu(tbl->future_tbl, ht)))
+ ;
+
+ rcu_read_unlock();
+
return err;
}
@@ -838,34 +1059,29 @@
struct rhashtable *ht, struct rhash_head *obj,
const struct rhashtable_params params)
{
- struct bucket_table *tbl;
- int err;
+ return __rhashtable_remove_fast(ht, obj, params, false);
+}
- rcu_read_lock();
-
- tbl = rht_dereference_rcu(ht->tbl, ht);
-
- /* Because we have already taken (and released) the bucket
- * lock in old_tbl, if we find that future_tbl is not yet
- * visible then that guarantees the entry to still be in
- * the old tbl if it exists.
- */
- while ((err = __rhashtable_remove_fast(ht, tbl, obj, params)) &&
- (tbl = rht_dereference_rcu(tbl->future_tbl, ht)))
- ;
-
- if (err)
- goto out;
-
- atomic_dec(&ht->nelems);
- if (unlikely(ht->p.automatic_shrinking &&
- rht_shrink_below_30(ht, tbl)))
- schedule_work(&ht->run_work);
-
-out:
- rcu_read_unlock();
-
- return err;
+/**
+ * rhltable_remove - remove object from hash list table
+ * @hlt: hash list table
+ * @list: pointer to hash list head inside object
+ * @params: hash table parameters
+ *
+ * Since the hash chain is single linked, the removal operation needs to
+ * walk the bucket chain upon removal. The removal operation is thus
+ * considerable slow if the hash table is not correctly sized.
+ *
+ * Will automatically shrink the table via rhashtable_expand() if the
+ * shrink_decision function specified at rhashtable_init() returns true.
+ *
+ * Returns zero on success, -ENOENT if the entry could not be found.
+ */
+static inline int rhltable_remove(
+ struct rhltable *hlt, struct rhlist_head *list,
+ const struct rhashtable_params params)
+{
+ return __rhashtable_remove_fast(&hlt->ht, &list->rhead, params, true);
}
/* Internal function, please use rhashtable_replace_fast() instead */
@@ -958,4 +1174,51 @@
return 0;
}
+/**
+ * rhltable_walk_enter - Initialise an iterator
+ * @hlt: Table to walk over
+ * @iter: Hash table Iterator
+ *
+ * This function prepares a hash table walk.
+ *
+ * Note that if you restart a walk after rhashtable_walk_stop you
+ * may see the same object twice. Also, you may miss objects if
+ * there are removals in between rhashtable_walk_stop and the next
+ * call to rhashtable_walk_start.
+ *
+ * For a completely stable walk you should construct your own data
+ * structure outside the hash table.
+ *
+ * This function may sleep so you must not call it from interrupt
+ * context or with spin locks held.
+ *
+ * You must call rhashtable_walk_exit after this function returns.
+ */
+static inline void rhltable_walk_enter(struct rhltable *hlt,
+ struct rhashtable_iter *iter)
+{
+ return rhashtable_walk_enter(&hlt->ht, iter);
+}
+
+/**
+ * rhltable_free_and_destroy - free elements and destroy hash list table
+ * @hlt: the hash list table to destroy
+ * @free_fn: callback to release resources of element
+ * @arg: pointer passed to free_fn
+ *
+ * See documentation for rhashtable_free_and_destroy.
+ */
+static inline void rhltable_free_and_destroy(struct rhltable *hlt,
+ void (*free_fn)(void *ptr,
+ void *arg),
+ void *arg)
+{
+ return rhashtable_free_and_destroy(&hlt->ht, free_fn, arg);
+}
+
+static inline void rhltable_destroy(struct rhltable *hlt)
+{
+ return rhltable_free_and_destroy(hlt, NULL, NULL);
+}
+
#endif /* _LINUX_RHASHTABLE_H */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4c5662f..9bf60b5 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -676,13 +676,23 @@
*/
kmemcheck_bitfield_begin(flags1);
__u16 queue_mapping;
+
+/* if you move cloned around you also must adapt those constants */
+#ifdef __BIG_ENDIAN_BITFIELD
+#define CLONED_MASK (1 << 7)
+#else
+#define CLONED_MASK 1
+#endif
+#define CLONED_OFFSET() offsetof(struct sk_buff, __cloned_offset)
+
+ __u8 __cloned_offset[0];
__u8 cloned:1,
nohdr:1,
fclone:2,
peeked:1,
head_frag:1,
- xmit_more:1;
- /* one bit hole */
+ xmit_more:1,
+ __unused:1; /* one bit hole */
kmemcheck_bitfield_end(flags1);
/* fields enclosed in headers_start/headers_end are copied
@@ -3075,6 +3085,7 @@
struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
int skb_ensure_writable(struct sk_buff *skb, int write_len);
+int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci);
int skb_vlan_pop(struct sk_buff *skb);
int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
struct sk_buff *pskb_extract(struct sk_buff *skb, int off, int to_copy,
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index c723a46..a17ae7b 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -19,6 +19,7 @@
#include <linux/skbuff.h>
+#include <linux/win_minmax.h>
#include <net/sock.h>
#include <net/inet_connection_sock.h>
#include <net/inet_timewait_sock.h>
@@ -212,7 +213,8 @@
u8 reord; /* reordering detected */
} rack;
u16 advmss; /* Advertised MSS */
- u8 unused;
+ u8 rate_app_limited:1, /* rate_{delivered,interval_us} limited? */
+ unused:7;
u8 nonagle : 4,/* Disable Nagle algorithm? */
thin_lto : 1,/* Use linear timeouts for thin streams */
thin_dupack : 1,/* Fast retransmit on first dupack */
@@ -234,9 +236,7 @@
u32 mdev_max_us; /* maximal mdev for the last rtt period */
u32 rttvar_us; /* smoothed mdev_max */
u32 rtt_seq; /* sequence number to update rttvar */
- struct rtt_meas {
- u32 rtt, ts; /* RTT in usec and sampling time in jiffies. */
- } rtt_min[3];
+ struct minmax rtt_min;
u32 packets_out; /* Packets which are "in flight" */
u32 retrans_out; /* Retransmitted packets out */
@@ -268,6 +268,12 @@
* receiver in Recovery. */
u32 prr_out; /* Total number of pkts sent during Recovery. */
u32 delivered; /* Total data packets delivered incl. rexmits */
+ u32 lost; /* Total data packets lost incl. rexmits */
+ u32 app_limited; /* limited until "delivered" reaches this val */
+ struct skb_mstamp first_tx_mstamp; /* start of window send phase */
+ struct skb_mstamp delivered_mstamp; /* time we reached "delivered" */
+ u32 rate_delivered; /* saved rate sample: packets delivered */
+ u32 rate_interval_us; /* saved rate sample: time elapsed */
u32 rcv_wnd; /* Current receiver window */
u32 write_seq; /* Tail(+1) of data held in tcp send buffer */
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 1b5d1cd..75b4aaf 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -76,7 +76,7 @@
struct iov_iter *i, unsigned long offset, size_t bytes);
void iov_iter_advance(struct iov_iter *i, size_t bytes);
int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes);
-int iov_iter_fault_in_multipages_readable(struct iov_iter *i, size_t bytes);
+#define iov_iter_fault_in_multipages_readable iov_iter_fault_in_readable
size_t iov_iter_single_seg_count(const struct iov_iter *i);
size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
struct iov_iter *i);
diff --git a/include/linux/win_minmax.h b/include/linux/win_minmax.h
new file mode 100644
index 0000000..5656960
--- /dev/null
+++ b/include/linux/win_minmax.h
@@ -0,0 +1,37 @@
+/**
+ * lib/minmax.c: windowed min/max tracker by Kathleen Nichols.
+ *
+ */
+#ifndef MINMAX_H
+#define MINMAX_H
+
+#include <linux/types.h>
+
+/* A single data point for our parameterized min-max tracker */
+struct minmax_sample {
+ u32 t; /* time measurement was taken */
+ u32 v; /* value measured */
+};
+
+/* State for the parameterized min-max tracker */
+struct minmax {
+ struct minmax_sample s[3];
+};
+
+static inline u32 minmax_get(const struct minmax *m)
+{
+ return m->s[0].v;
+}
+
+static inline u32 minmax_reset(struct minmax *m, u32 t, u32 meas)
+{
+ struct minmax_sample val = { .t = t, .v = meas };
+
+ m->s[2] = m->s[1] = m->s[0] = val;
+ return m->s[0].v;
+}
+
+u32 minmax_running_max(struct minmax *m, u32 win, u32 t, u32 meas);
+u32 minmax_running_min(struct minmax *m, u32 win, u32 t, u32 meas);
+
+#endif
diff --git a/include/media/cec.h b/include/media/cec.h
index dc7854b..fdb5d60 100644
--- a/include/media/cec.h
+++ b/include/media/cec.h
@@ -57,8 +57,8 @@
int minor;
bool registered;
bool unregistered;
- struct mutex fhs_lock;
struct list_head fhs;
+ struct mutex lock;
};
struct cec_adapter;
diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
index bfd1590..0a1e21d 100644
--- a/include/net/bluetooth/bluetooth.h
+++ b/include/net/bluetooth/bluetooth.h
@@ -29,7 +29,8 @@
#include <net/sock.h>
#include <linux/seq_file.h>
-#define BT_SUBSYS_VERSION "2.21"
+#define BT_SUBSYS_VERSION 2
+#define BT_SUBSYS_REVISION 22
#ifndef AF_BLUETOOTH
#define AF_BLUETOOTH 31
@@ -371,6 +372,7 @@
void hci_sock_clear_flag(struct sock *sk, int nr);
int hci_sock_test_flag(struct sock *sk, int nr);
unsigned short hci_sock_get_channel(struct sock *sk);
+u32 hci_sock_get_cookie(struct sock *sk);
int hci_sock_init(void);
void hci_sock_cleanup(void);
diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 003b252..99aa5e5 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -63,6 +63,7 @@
#define HCI_SDIO 6
#define HCI_SPI 7
#define HCI_I2C 8
+#define HCI_SMD 9
/* HCI controller types */
#define HCI_PRIMARY 0x00
@@ -207,7 +208,11 @@
HCI_MGMT_INDEX_EVENTS,
HCI_MGMT_UNCONF_INDEX_EVENTS,
HCI_MGMT_EXT_INDEX_EVENTS,
- HCI_MGMT_GENERIC_EVENTS,
+ HCI_MGMT_EXT_INFO_EVENTS,
+ HCI_MGMT_OPTION_EVENTS,
+ HCI_MGMT_SETTING_EVENTS,
+ HCI_MGMT_DEV_CLASS_EVENTS,
+ HCI_MGMT_LOCAL_NAME_EVENTS,
HCI_MGMT_OOB_DATA_EVENTS,
};
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index ee7fc47..f00bf66 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -211,6 +211,7 @@
__u8 dev_name[HCI_MAX_NAME_LENGTH];
__u8 short_name[HCI_MAX_SHORT_NAME_LENGTH];
__u8 eir[HCI_MAX_EIR_LENGTH];
+ __u16 appearance;
__u8 dev_class[3];
__u8 major_class;
__u8 minor_class;
@@ -399,7 +400,9 @@
struct delayed_work rpa_expired;
bdaddr_t rpa;
+#if IS_ENABLED(CONFIG_BT_LEDS)
struct led_trigger *power_led;
+#endif
int (*open)(struct hci_dev *hdev);
int (*close)(struct hci_dev *hdev);
@@ -1026,8 +1029,8 @@
int hci_reset_dev(struct hci_dev *hdev);
int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb);
int hci_recv_diag(struct hci_dev *hdev, struct sk_buff *skb);
-void hci_set_hw_info(struct hci_dev *hdev, const char *fmt, ...);
-void hci_set_fw_info(struct hci_dev *hdev, const char *fmt, ...);
+__printf(2, 3) void hci_set_hw_info(struct hci_dev *hdev, const char *fmt, ...);
+__printf(2, 3) void hci_set_fw_info(struct hci_dev *hdev, const char *fmt, ...);
int hci_dev_open(__u16 dev);
int hci_dev_close(__u16 dev);
int hci_dev_do_close(struct hci_dev *hdev);
@@ -1404,6 +1407,9 @@
void hci_send_to_channel(unsigned short channel, struct sk_buff *skb,
int flag, struct sock *skip_sk);
void hci_send_to_monitor(struct hci_dev *hdev, struct sk_buff *skb);
+void hci_send_monitor_ctrl_event(struct hci_dev *hdev, u16 event,
+ void *data, u16 data_len, ktime_t tstamp,
+ int flag, struct sock *skip_sk);
void hci_sock_dev_event(struct hci_dev *hdev, int event);
@@ -1449,6 +1455,7 @@
#define DISCOV_BREDR_INQUIRY_LEN 0x08
#define DISCOV_LE_RESTART_DELAY msecs_to_jiffies(200) /* msec */
+void mgmt_fill_version_info(void *ver);
int mgmt_new_settings(struct hci_dev *hdev);
void mgmt_index_added(struct hci_dev *hdev);
void mgmt_index_removed(struct hci_dev *hdev);
diff --git a/include/net/bluetooth/hci_mon.h b/include/net/bluetooth/hci_mon.h
index 587d013..240786b 100644
--- a/include/net/bluetooth/hci_mon.h
+++ b/include/net/bluetooth/hci_mon.h
@@ -45,6 +45,10 @@
#define HCI_MON_VENDOR_DIAG 11
#define HCI_MON_SYSTEM_NOTE 12
#define HCI_MON_USER_LOGGING 13
+#define HCI_MON_CTRL_OPEN 14
+#define HCI_MON_CTRL_CLOSE 15
+#define HCI_MON_CTRL_COMMAND 16
+#define HCI_MON_CTRL_EVENT 17
struct hci_mon_new_index {
__u8 type;
diff --git a/include/net/bluetooth/mgmt.h b/include/net/bluetooth/mgmt.h
index 7647964..72a456b 100644
--- a/include/net/bluetooth/mgmt.h
+++ b/include/net/bluetooth/mgmt.h
@@ -586,6 +586,24 @@
#define MGMT_OP_START_LIMITED_DISCOVERY 0x0041
+#define MGMT_OP_READ_EXT_INFO 0x0042
+#define MGMT_READ_EXT_INFO_SIZE 0
+struct mgmt_rp_read_ext_info {
+ bdaddr_t bdaddr;
+ __u8 version;
+ __le16 manufacturer;
+ __le32 supported_settings;
+ __le32 current_settings;
+ __le16 eir_len;
+ __u8 eir[0];
+} __packed;
+
+#define MGMT_OP_SET_APPEARANCE 0x0043
+struct mgmt_cp_set_appearance {
+ __u16 appearance;
+} __packed;
+#define MGMT_SET_APPEARANCE_SIZE 2
+
#define MGMT_EV_CMD_COMPLETE 0x0001
struct mgmt_ev_cmd_complete {
__le16 opcode;
@@ -800,3 +818,9 @@
struct mgmt_ev_advertising_removed {
__u8 instance;
} __packed;
+
+#define MGMT_EV_EXT_INFO_CHANGED 0x0025
+struct mgmt_ev_ext_info_changed {
+ __le16 eir_len;
+ __u8 eir[0];
+} __packed;
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index beb7610..bd26cc6 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -2432,7 +2432,8 @@
* cases, the result of roaming is indicated with a call to
* cfg80211_roamed() or cfg80211_roamed_bss().
* (invoked with the wireless_dev mutex held)
- * @disconnect: Disconnect from the BSS/ESS.
+ * @disconnect: Disconnect from the BSS/ESS. Once done, call
+ * cfg80211_disconnected().
* (invoked with the wireless_dev mutex held)
*
* @join_ibss: Join the specified IBSS (or create if necessary). Once done, call
@@ -3955,6 +3956,34 @@
struct cfg80211_qos_map *qos_map);
/**
+ * cfg80211_find_ie_match - match information element and byte array in data
+ *
+ * @eid: element ID
+ * @ies: data consisting of IEs
+ * @len: length of data
+ * @match: byte array to match
+ * @match_len: number of bytes in the match array
+ * @match_offset: offset in the IE where the byte array should match.
+ * If match_len is zero, this must also be set to zero.
+ * Otherwise this must be set to 2 or more, because the first
+ * byte is the element id, which is already compared to eid, and
+ * the second byte is the IE length.
+ *
+ * Return: %NULL if the element ID could not be found or if
+ * the element is invalid (claims to be longer than the given
+ * data) or if the byte array doesn't match, or a pointer to the first
+ * byte of the requested element, that is the byte containing the
+ * element ID.
+ *
+ * Note: There are no checks on the element length other than
+ * having to fit into the given data and being large enough for the
+ * byte array to match.
+ */
+const u8 *cfg80211_find_ie_match(u8 eid, const u8 *ies, int len,
+ const u8 *match, int match_len,
+ int match_offset);
+
+/**
* cfg80211_find_ie - find information element in data
*
* @eid: element ID
@@ -3969,7 +3998,10 @@
* Note: There are no checks on the element length other than
* having to fit into the given data.
*/
-const u8 *cfg80211_find_ie(u8 eid, const u8 *ies, int len);
+static inline const u8 *cfg80211_find_ie(u8 eid, const u8 *ies, int len)
+{
+ return cfg80211_find_ie_match(eid, ies, len, NULL, 0, 0);
+}
/**
* cfg80211_find_vendor_ie - find vendor specific information element in data
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 7556646..b122196 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -143,6 +143,7 @@
struct net_device *netdev;
struct device_node *dn;
unsigned int ageing_time;
+ u8 stp_state;
};
struct dsa_switch {
@@ -339,6 +340,7 @@
void (*port_bridge_leave)(struct dsa_switch *ds, int port);
void (*port_stp_state_set)(struct dsa_switch *ds, int port,
u8 state);
+ void (*port_fast_age)(struct dsa_switch *ds, int port);
/*
* VLAN support
diff --git a/include/net/ieee80211_radiotap.h b/include/net/ieee80211_radiotap.h
index b0fd947..ba07b9d 100644
--- a/include/net/ieee80211_radiotap.h
+++ b/include/net/ieee80211_radiotap.h
@@ -190,6 +190,10 @@
* IEEE80211_RADIOTAP_VHT u16, u8, u8, u8[4], u8, u8, u16
*
* Contains VHT information about this frame.
+ *
+ * IEEE80211_RADIOTAP_TIMESTAMP u64, u16, u8, u8 variable
+ *
+ * Contains timestamp information for this frame.
*/
enum ieee80211_radiotap_type {
IEEE80211_RADIOTAP_TSFT = 0,
@@ -214,6 +218,7 @@
IEEE80211_RADIOTAP_MCS = 19,
IEEE80211_RADIOTAP_AMPDU_STATUS = 20,
IEEE80211_RADIOTAP_VHT = 21,
+ IEEE80211_RADIOTAP_TIMESTAMP = 22,
/* valid in every it_present bitmap, even vendor namespaces */
IEEE80211_RADIOTAP_RADIOTAP_NAMESPACE = 29,
@@ -321,6 +326,22 @@
#define IEEE80211_RADIOTAP_CODING_LDPC_USER2 0x04
#define IEEE80211_RADIOTAP_CODING_LDPC_USER3 0x08
+/* For IEEE80211_RADIOTAP_TIMESTAMP */
+#define IEEE80211_RADIOTAP_TIMESTAMP_UNIT_MASK 0x000F
+#define IEEE80211_RADIOTAP_TIMESTAMP_UNIT_MS 0x0000
+#define IEEE80211_RADIOTAP_TIMESTAMP_UNIT_US 0x0001
+#define IEEE80211_RADIOTAP_TIMESTAMP_UNIT_NS 0x0003
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_MASK 0x00F0
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_BEGIN_MDPU 0x0000
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_EO_MPDU 0x0010
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_EO_PPDU 0x0020
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_PLCP_SIG_ACQ 0x0030
+#define IEEE80211_RADIOTAP_TIMESTAMP_SPOS_UNKNOWN 0x00F0
+
+#define IEEE80211_RADIOTAP_TIMESTAMP_FLAG_64BIT 0x00
+#define IEEE80211_RADIOTAP_TIMESTAMP_FLAG_32BIT 0x01
+#define IEEE80211_RADIOTAP_TIMESTAMP_FLAG_ACCURACY 0x02
+
/* helpers */
static inline int ieee80211_get_radiotap_len(unsigned char *data)
{
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 49dcad4..197a30d 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -134,8 +134,8 @@
} icsk_mtup;
u32 icsk_user_timeout;
- u64 icsk_ca_priv[64 / sizeof(u64)];
-#define ICSK_CA_PRIV_SIZE (8 * sizeof(u64))
+ u64 icsk_ca_priv[88 / sizeof(u64)];
+#define ICSK_CA_PRIV_SIZE (11 * sizeof(u64))
};
#define ICSK_TIME_RETRANS 1 /* Retransmit timer */
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index d97305d..e0cd318 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -64,6 +64,9 @@
}
void ip6_route_input(struct sk_buff *skb);
+struct dst_entry *ip6_route_input_lookup(struct net *net,
+ struct net_device *dev,
+ struct flowi6 *fl6, int flags);
struct dst_entry *ip6_route_output_flags(struct net *net, const struct sock *sk,
struct flowi6 *fl6, int flags);
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 7d4a72e..b9314b4 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -22,6 +22,7 @@
#include <net/fib_rules.h>
#include <net/inetpeer.h>
#include <linux/percpu.h>
+#include <linux/notifier.h>
struct fib_config {
u8 fc_dst_len;
@@ -122,6 +123,7 @@
#ifdef CONFIG_IP_ROUTE_MULTIPATH
int fib_weight;
#endif
+ unsigned int fib_offload_cnt;
struct rcu_head rcu;
struct fib_nh fib_nh[0];
#define fib_dev fib_nh[0].nh_dev
@@ -173,6 +175,18 @@
__be32 fib_info_update_nh_saddr(struct net *net, struct fib_nh *nh);
+static inline void fib_info_offload_inc(struct fib_info *fi)
+{
+ fi->fib_offload_cnt++;
+ fi->fib_flags |= RTNH_F_OFFLOAD;
+}
+
+static inline void fib_info_offload_dec(struct fib_info *fi)
+{
+ if (--fi->fib_offload_cnt == 0)
+ fi->fib_flags &= ~RTNH_F_OFFLOAD;
+}
+
#define FIB_RES_SADDR(net, res) \
((FIB_RES_NH(res).nh_saddr_genid == \
atomic_read(&(net)->ipv4.dev_addr_genid)) ? \
@@ -185,6 +199,33 @@
#define FIB_RES_PREFSRC(net, res) ((res).fi->fib_prefsrc ? : \
FIB_RES_SADDR(net, res))
+struct fib_notifier_info {
+ struct net *net;
+};
+
+struct fib_entry_notifier_info {
+ struct fib_notifier_info info; /* must be first */
+ u32 dst;
+ int dst_len;
+ struct fib_info *fi;
+ u8 tos;
+ u8 type;
+ u32 tb_id;
+ u32 nlflags;
+};
+
+enum fib_event_type {
+ FIB_EVENT_ENTRY_ADD,
+ FIB_EVENT_ENTRY_DEL,
+ FIB_EVENT_RULE_ADD,
+ FIB_EVENT_RULE_DEL,
+};
+
+int register_fib_notifier(struct notifier_block *nb);
+int unregister_fib_notifier(struct notifier_block *nb);
+int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
+ struct fib_notifier_info *info);
+
struct fib_table {
struct hlist_node tb_hlist;
u32 tb_id;
@@ -196,13 +237,12 @@
int fib_table_lookup(struct fib_table *tb, const struct flowi4 *flp,
struct fib_result *res, int fib_flags);
-int fib_table_insert(struct fib_table *, struct fib_config *);
-int fib_table_delete(struct fib_table *, struct fib_config *);
+int fib_table_insert(struct net *, struct fib_table *, struct fib_config *);
+int fib_table_delete(struct net *, struct fib_table *, struct fib_config *);
int fib_table_dump(struct fib_table *table, struct sk_buff *skb,
struct netlink_callback *cb);
-int fib_table_flush(struct fib_table *table);
+int fib_table_flush(struct net *net, struct fib_table *table);
struct fib_table *fib_trie_unmerge(struct fib_table *main_tb);
-void fib_table_flush_external(struct fib_table *table);
void fib_free_table(struct fib_table *tb);
#ifndef CONFIG_IP_MULTIPLE_TABLES
@@ -315,7 +355,6 @@
}
#endif
int fib_unmerge(struct net *net);
-void fib_flush_external(struct net *net);
/* Exported by fib_semantics.c */
int ip_fib_check_default(__be32 gw, struct net_device *dev);
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index cca510a..5296100 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1735,6 +1735,9 @@
* @supp_rates: Bitmap of supported rates (per band)
* @ht_cap: HT capabilities of this STA; restricted to our own capabilities
* @vht_cap: VHT capabilities of this STA; restricted to our own capabilities
+ * @max_rx_aggregation_subframes: maximal amount of frames in a single AMPDU
+ * that this station is allowed to transmit to us.
+ * Can be modified by driver.
* @wme: indicates whether the STA supports QoS/WME (if local devices does,
* otherwise always false)
* @drv_priv: data area for driver use, will always be aligned to
@@ -1775,6 +1778,7 @@
u16 aid;
struct ieee80211_sta_ht_cap ht_cap;
struct ieee80211_sta_vht_cap vht_cap;
+ u8 max_rx_aggregation_subframes;
bool wme;
u8 uapsd_queues;
u8 max_sp;
@@ -2014,6 +2018,11 @@
* @IEEE80211_HW_TX_FRAG_LIST: Hardware (or driver) supports sending frag_list
* skbs, needed for zero-copy software A-MSDU.
*
+ * @IEEE80211_HW_REPORTS_LOW_ACK: The driver (or firmware) reports low ack event
+ * by ieee80211_report_low_ack() based on its own algorithm. For such
+ * drivers, mac80211 packet loss mechanism will not be triggered and driver
+ * is completely depending on firmware event for station kickout.
+ *
* @NUM_IEEE80211_HW_FLAGS: number of hardware flags, used for sizing arrays
*/
enum ieee80211_hw_flags {
@@ -2054,6 +2063,7 @@
IEEE80211_HW_USES_RSS,
IEEE80211_HW_TX_AMSDU,
IEEE80211_HW_TX_FRAG_LIST,
+ IEEE80211_HW_REPORTS_LOW_ACK,
/* keep last, obviously */
NUM_IEEE80211_HW_FLAGS
@@ -2141,6 +2151,14 @@
* the default is _GI | _BANDWIDTH.
* Use the %IEEE80211_RADIOTAP_VHT_KNOWN_* values.
*
+ * @radiotap_timestamp: Information for the radiotap timestamp field; if the
+ * 'units_pos' member is set to a non-negative value it must be set to
+ * a combination of a IEEE80211_RADIOTAP_TIMESTAMP_UNIT_* and a
+ * IEEE80211_RADIOTAP_TIMESTAMP_SPOS_* value, and then the timestamp
+ * field will be added and populated from the &struct ieee80211_rx_status
+ * device_timestamp. If the 'accuracy' member is non-negative, it's put
+ * into the accuracy radiotap field and the accuracy known flag is set.
+ *
* @netdev_features: netdev features to be set in each netdev created
* from this HW. Note that not all features are usable with mac80211,
* other features will be rejected during HW registration.
@@ -2184,6 +2202,10 @@
u8 offchannel_tx_hw_queue;
u8 radiotap_mcs_details;
u16 radiotap_vht_details;
+ struct {
+ int units_pos;
+ s16 accuracy;
+ } radiotap_timestamp;
netdev_features_t netdev_features;
u8 uapsd_queues;
u8 uapsd_max_sp_len;
@@ -3085,11 +3107,8 @@
*
* @sta_add_debugfs: Drivers can use this callback to add debugfs files
* when a station is added to mac80211's station list. This callback
- * and @sta_remove_debugfs should be within a CONFIG_MAC80211_DEBUGFS
- * conditional. This callback can sleep.
- *
- * @sta_remove_debugfs: Remove the debugfs files which were added using
- * @sta_add_debugfs. This callback can sleep.
+ * should be within a CONFIG_MAC80211_DEBUGFS conditional. This
+ * callback can sleep.
*
* @sta_notify: Notifies low level driver about power state transition of an
* associated station, AP, IBSS/WDS/mesh peer etc. For a VIF operating
@@ -3485,10 +3504,6 @@
struct ieee80211_vif *vif,
struct ieee80211_sta *sta,
struct dentry *dir);
- void (*sta_remove_debugfs)(struct ieee80211_hw *hw,
- struct ieee80211_vif *vif,
- struct ieee80211_sta *sta,
- struct dentry *dir);
#endif
void (*sta_notify)(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
enum sta_notify_cmd, struct ieee80211_sta *sta);
diff --git a/include/net/netfilter/br_netfilter.h b/include/net/netfilter/br_netfilter.h
index e8d1448..0b0c35c 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -15,6 +15,12 @@
void nf_bridge_update_protocol(struct sk_buff *skb);
+int br_nf_hook_thresh(unsigned int hook, struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *));
+
static inline struct nf_bridge_info *
nf_bridge_info_get(const struct sk_buff *skb)
{
diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h
index cdc920b..8992e42 100644
--- a/include/net/netfilter/nf_conntrack_l3proto.h
+++ b/include/net/netfilter/nf_conntrack_l3proto.h
@@ -63,10 +63,6 @@
size_t nla_size;
-#ifdef CONFIG_SYSCTL
- const char *ctl_table_path;
-#endif /* CONFIG_SYSCTL */
-
/* Init l3proto pernet data */
int (*init_net)(struct net *net);
diff --git a/include/net/netfilter/nf_conntrack_synproxy.h b/include/net/netfilter/nf_conntrack_synproxy.h
index 6793614..e693731 100644
--- a/include/net/netfilter/nf_conntrack_synproxy.h
+++ b/include/net/netfilter/nf_conntrack_synproxy.h
@@ -27,6 +27,20 @@
#endif
}
+static inline bool nf_ct_add_synproxy(struct nf_conn *ct,
+ const struct nf_conn *tmpl)
+{
+ if (tmpl && nfct_synproxy(tmpl)) {
+ if (!nfct_seqadj_ext_add(ct))
+ return false;
+
+ if (!nfct_synproxy_ext_add(ct))
+ return false;
+ }
+
+ return true;
+}
+
struct synproxy_stats {
unsigned int syn_received;
unsigned int cookie_invalid;
diff --git a/include/net/netfilter/nf_log.h b/include/net/netfilter/nf_log.h
index ee07dc8..309cd26 100644
--- a/include/net/netfilter/nf_log.h
+++ b/include/net/netfilter/nf_log.h
@@ -2,15 +2,10 @@
#define _NF_LOG_H
#include <linux/netfilter.h>
+#include <linux/netfilter/nf_log.h>
-/* those NF_LOG_* defines and struct nf_loginfo are legacy definitios that will
- * disappear once iptables is replaced with pkttables. Please DO NOT use them
- * for any new code! */
-#define NF_LOG_TCPSEQ 0x01 /* Log TCP sequence numbers */
-#define NF_LOG_TCPOPT 0x02 /* Log TCP options */
-#define NF_LOG_IPOPT 0x04 /* Log IP options */
-#define NF_LOG_UID 0x08 /* Log UID owning local socket */
-#define NF_LOG_MASK 0x0f
+/* Log tcp sequence, tcp options, ip options and uid owning local socket */
+#define NF_LOG_DEFAULT_MASK 0x0f
/* This flag indicates that copy_len field in nf_loginfo is set */
#define NF_LOG_F_COPY_LEN 0x1
diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
index 0dbce55..2280cfe 100644
--- a/include/net/netfilter/nf_queue.h
+++ b/include/net/netfilter/nf_queue.h
@@ -11,7 +11,6 @@
struct sk_buff *skb;
unsigned int id;
- struct nf_hook_ops *elem;
struct nf_hook_state state;
u16 size; /* sizeof(entry) + saved route keys */
@@ -22,10 +21,10 @@
/* Packet queuing */
struct nf_queue_handler {
- int (*outfn)(struct nf_queue_entry *entry,
- unsigned int queuenum);
- void (*nf_hook_drop)(struct net *net,
- struct nf_hook_ops *ops);
+ int (*outfn)(struct nf_queue_entry *entry,
+ unsigned int queuenum);
+ void (*nf_hook_drop)(struct net *net,
+ const struct nf_hook_entry *hooks);
};
void nf_register_queue_handler(struct net *net, const struct nf_queue_handler *qh);
@@ -41,23 +40,19 @@
*jhash_initval = prandom_u32();
}
-static inline u32 hash_v4(const struct sk_buff *skb, u32 jhash_initval)
+static inline u32 hash_v4(const struct iphdr *iph, u32 initval)
{
- const struct iphdr *iph = ip_hdr(skb);
-
/* packets in either direction go into same queue */
if ((__force u32)iph->saddr < (__force u32)iph->daddr)
return jhash_3words((__force u32)iph->saddr,
- (__force u32)iph->daddr, iph->protocol, jhash_initval);
+ (__force u32)iph->daddr, iph->protocol, initval);
return jhash_3words((__force u32)iph->daddr,
- (__force u32)iph->saddr, iph->protocol, jhash_initval);
+ (__force u32)iph->saddr, iph->protocol, initval);
}
-#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
-static inline u32 hash_v6(const struct sk_buff *skb, u32 jhash_initval)
+static inline u32 hash_v6(const struct ipv6hdr *ip6h, u32 initval)
{
- const struct ipv6hdr *ip6h = ipv6_hdr(skb);
u32 a, b, c;
if ((__force u32)ip6h->saddr.s6_addr32[3] <
@@ -75,20 +70,50 @@
else
c = (__force u32) ip6h->daddr.s6_addr32[1];
- return jhash_3words(a, b, c, jhash_initval);
+ return jhash_3words(a, b, c, initval);
}
-#endif
+
+static inline u32 hash_bridge(const struct sk_buff *skb, u32 initval)
+{
+ struct ipv6hdr *ip6h, _ip6h;
+ struct iphdr *iph, _iph;
+
+ switch (eth_hdr(skb)->h_proto) {
+ case htons(ETH_P_IP):
+ iph = skb_header_pointer(skb, skb_network_offset(skb),
+ sizeof(*iph), &_iph);
+ if (iph)
+ return hash_v4(iph, initval);
+ break;
+ case htons(ETH_P_IPV6):
+ ip6h = skb_header_pointer(skb, skb_network_offset(skb),
+ sizeof(*ip6h), &_ip6h);
+ if (ip6h)
+ return hash_v6(ip6h, initval);
+ break;
+ }
+
+ return 0;
+}
static inline u32
nfqueue_hash(const struct sk_buff *skb, u16 queue, u16 queues_total, u8 family,
- u32 jhash_initval)
+ u32 initval)
{
- if (family == NFPROTO_IPV4)
- queue += ((u64) hash_v4(skb, jhash_initval) * queues_total) >> 32;
-#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
- else if (family == NFPROTO_IPV6)
- queue += ((u64) hash_v6(skb, jhash_initval) * queues_total) >> 32;
-#endif
+ switch (family) {
+ case NFPROTO_IPV4:
+ queue += reciprocal_scale(hash_v4(ip_hdr(skb), initval),
+ queues_total);
+ break;
+ case NFPROTO_IPV6:
+ queue += reciprocal_scale(hash_v6(ipv6_hdr(skb), initval),
+ queues_total);
+ break;
+ case NFPROTO_BRIDGE:
+ queue += reciprocal_scale(hash_bridge(skb, initval),
+ queues_total);
+ break;
+ }
return queue;
}
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 8972468..5031e07 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -19,6 +19,7 @@
const struct net_device *out;
u8 pf;
u8 hook;
+ bool tprot_set;
u8 tprot;
/* for x_tables compatibility */
struct xt_action_param xt;
@@ -36,6 +37,23 @@
pkt->pf = pkt->xt.family = state->pf;
}
+static inline void nft_set_pktinfo_proto_unspec(struct nft_pktinfo *pkt,
+ struct sk_buff *skb)
+{
+ pkt->tprot_set = false;
+ pkt->tprot = 0;
+ pkt->xt.thoff = 0;
+ pkt->xt.fragoff = 0;
+}
+
+static inline void nft_set_pktinfo_unspec(struct nft_pktinfo *pkt,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ nft_set_pktinfo(pkt, skb, state);
+ nft_set_pktinfo_proto_unspec(pkt, skb);
+}
+
/**
* struct nft_verdict - nf_tables verdict
*
@@ -127,6 +145,7 @@
return type == NFT_DATA_VERDICT ? NFT_REG_VERDICT : NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE;
}
+unsigned int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest);
unsigned int nft_parse_register(const struct nlattr *attr);
int nft_dump_register(struct sk_buff *skb, unsigned int attr, unsigned int reg);
diff --git a/include/net/netfilter/nf_tables_bridge.h b/include/net/netfilter/nf_tables_bridge.h
deleted file mode 100644
index 511fb79..0000000
--- a/include/net/netfilter/nf_tables_bridge.h
+++ /dev/null
@@ -1,7 +0,0 @@
-#ifndef _NET_NF_TABLES_BRIDGE_H
-#define _NET_NF_TABLES_BRIDGE_H
-
-int nft_bridge_iphdr_validate(struct sk_buff *skb);
-int nft_bridge_ip6hdr_validate(struct sk_buff *skb);
-
-#endif /* _NET_NF_TABLES_BRIDGE_H */
diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index a9060dd..00f4f6b 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -28,6 +28,9 @@
int nft_cmp_module_init(void);
void nft_cmp_module_exit(void);
+int nft_range_module_init(void);
+void nft_range_module_exit(void);
+
int nft_lookup_module_init(void);
void nft_lookup_module_exit(void);
diff --git a/include/net/netfilter/nf_tables_ipv4.h b/include/net/netfilter/nf_tables_ipv4.h
index ca6ef6b..968f00b 100644
--- a/include/net/netfilter/nf_tables_ipv4.h
+++ b/include/net/netfilter/nf_tables_ipv4.h
@@ -14,11 +14,54 @@
nft_set_pktinfo(pkt, skb, state);
ip = ip_hdr(pkt->skb);
+ pkt->tprot_set = true;
pkt->tprot = ip->protocol;
pkt->xt.thoff = ip_hdrlen(pkt->skb);
pkt->xt.fragoff = ntohs(ip->frag_off) & IP_OFFSET;
}
+static inline int
+__nft_set_pktinfo_ipv4_validate(struct nft_pktinfo *pkt,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ struct iphdr *iph, _iph;
+ u32 len, thoff;
+
+ iph = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*iph),
+ &_iph);
+ if (!iph)
+ return -1;
+
+ iph = ip_hdr(skb);
+ if (iph->ihl < 5 || iph->version != 4)
+ return -1;
+
+ len = ntohs(iph->tot_len);
+ thoff = iph->ihl * 4;
+ if (skb->len < len)
+ return -1;
+ else if (len < thoff)
+ return -1;
+
+ pkt->tprot_set = true;
+ pkt->tprot = iph->protocol;
+ pkt->xt.thoff = thoff;
+ pkt->xt.fragoff = ntohs(iph->frag_off) & IP_OFFSET;
+
+ return 0;
+}
+
+static inline void
+nft_set_pktinfo_ipv4_validate(struct nft_pktinfo *pkt,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ nft_set_pktinfo(pkt, skb, state);
+ if (__nft_set_pktinfo_ipv4_validate(pkt, skb, state) < 0)
+ nft_set_pktinfo_proto_unspec(pkt, skb);
+}
+
extern struct nft_af_info nft_af_ipv4;
#endif
diff --git a/include/net/netfilter/nf_tables_ipv6.h b/include/net/netfilter/nf_tables_ipv6.h
index 8ad39a6..d150b50 100644
--- a/include/net/netfilter/nf_tables_ipv6.h
+++ b/include/net/netfilter/nf_tables_ipv6.h
@@ -4,7 +4,7 @@
#include <linux/netfilter_ipv6/ip6_tables.h>
#include <net/ipv6.h>
-static inline int
+static inline void
nft_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
struct sk_buff *skb,
const struct nf_hook_state *state)
@@ -15,15 +15,64 @@
nft_set_pktinfo(pkt, skb, state);
protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
- /* If malformed, drop it */
+ if (protohdr < 0) {
+ nft_set_pktinfo_proto_unspec(pkt, skb);
+ return;
+ }
+
+ pkt->tprot_set = true;
+ pkt->tprot = protohdr;
+ pkt->xt.thoff = thoff;
+ pkt->xt.fragoff = frag_off;
+}
+
+static inline int
+__nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+ struct ipv6hdr *ip6h, _ip6h;
+ unsigned int thoff = 0;
+ unsigned short frag_off;
+ int protohdr;
+ u32 pkt_len;
+
+ ip6h = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*ip6h),
+ &_ip6h);
+ if (!ip6h)
+ return -1;
+
+ if (ip6h->version != 6)
+ return -1;
+
+ pkt_len = ntohs(ip6h->payload_len);
+ if (pkt_len + sizeof(*ip6h) > skb->len)
+ return -1;
+
+ protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
if (protohdr < 0)
return -1;
+ pkt->tprot_set = true;
pkt->tprot = protohdr;
pkt->xt.thoff = thoff;
pkt->xt.fragoff = frag_off;
return 0;
+#else
+ return -1;
+#endif
+}
+
+static inline void
+nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+ nft_set_pktinfo(pkt, skb, state);
+ if (__nft_set_pktinfo_ipv6_validate(pkt, skb, state) < 0)
+ nft_set_pktinfo_proto_unspec(pkt, skb);
}
extern struct nft_af_info nft_af_ipv6;
diff --git a/include/net/netns/netfilter.h b/include/net/netns/netfilter.h
index 36d7235..58487b1 100644
--- a/include/net/netns/netfilter.h
+++ b/include/net/netns/netfilter.h
@@ -16,6 +16,6 @@
#ifdef CONFIG_SYSCTL
struct ctl_table_header *nf_log_dir_header;
#endif
- struct list_head hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
+ struct nf_hook_entry __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
};
#endif
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a459be5..767b03a 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -123,7 +123,7 @@
for (i = 0; i < exts->nr_actions; i++) {
struct tc_action *a = exts->actions[i];
- list_add(&a->list, actions);
+ list_add_tail(&a->list, actions);
}
#endif
}
@@ -486,4 +486,20 @@
unsigned long cookie;
};
+enum tc_clsbpf_command {
+ TC_CLSBPF_ADD,
+ TC_CLSBPF_REPLACE,
+ TC_CLSBPF_DESTROY,
+ TC_CLSBPF_STATS,
+};
+
+struct tc_cls_bpf_offload {
+ enum tc_clsbpf_command command;
+ struct tcf_exts *exts;
+ struct bpf_prog *prog;
+ const char *name;
+ bool exts_integrated;
+ u32 gen_flags;
+};
+
#endif
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 52a2015..e6aa0a2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -36,6 +36,14 @@
u16 data[];
};
+/* similar to sk_buff_head, but skb->prev pointer is undefined. */
+struct qdisc_skb_head {
+ struct sk_buff *head;
+ struct sk_buff *tail;
+ __u32 qlen;
+ spinlock_t lock;
+};
+
struct Qdisc {
int (*enqueue)(struct sk_buff *skb,
struct Qdisc *sch,
@@ -76,7 +84,7 @@
* For performance sake on SMP, we put highly modified fields at the end
*/
struct sk_buff *gso_skb ____cacheline_aligned_in_smp;
- struct sk_buff_head q;
+ struct qdisc_skb_head q;
struct gnet_stats_basic_packed bstats;
seqcount_t running;
struct gnet_stats_queue qstats;
@@ -600,10 +608,27 @@
sch->qstats.overlimits++;
}
-static inline int __qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
- struct sk_buff_head *list)
+static inline void qdisc_skb_head_init(struct qdisc_skb_head *qh)
{
- __skb_queue_tail(list, skb);
+ qh->head = NULL;
+ qh->tail = NULL;
+ qh->qlen = 0;
+}
+
+static inline int __qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
+ struct qdisc_skb_head *qh)
+{
+ struct sk_buff *last = qh->tail;
+
+ if (last) {
+ skb->next = NULL;
+ last->next = skb;
+ qh->tail = skb;
+ } else {
+ qh->tail = skb;
+ qh->head = skb;
+ }
+ qh->qlen++;
qdisc_qstats_backlog_inc(sch, skb);
return NET_XMIT_SUCCESS;
@@ -614,14 +639,16 @@
return __qdisc_enqueue_tail(skb, sch, &sch->q);
}
-static inline struct sk_buff *__qdisc_dequeue_head(struct Qdisc *sch,
- struct sk_buff_head *list)
+static inline struct sk_buff *__qdisc_dequeue_head(struct qdisc_skb_head *qh)
{
- struct sk_buff *skb = __skb_dequeue(list);
+ struct sk_buff *skb = qh->head;
if (likely(skb != NULL)) {
- qdisc_qstats_backlog_dec(sch, skb);
- qdisc_bstats_update(sch, skb);
+ qh->head = skb->next;
+ qh->qlen--;
+ if (qh->head == NULL)
+ qh->tail = NULL;
+ skb->next = NULL;
}
return skb;
@@ -629,7 +656,14 @@
static inline struct sk_buff *qdisc_dequeue_head(struct Qdisc *sch)
{
- return __qdisc_dequeue_head(sch, &sch->q);
+ struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
+
+ if (likely(skb != NULL)) {
+ qdisc_qstats_backlog_dec(sch, skb);
+ qdisc_bstats_update(sch, skb);
+ }
+
+ return skb;
}
/* Instead of calling kfree_skb() while root qdisc lock is held,
@@ -642,10 +676,10 @@
}
static inline unsigned int __qdisc_queue_drop_head(struct Qdisc *sch,
- struct sk_buff_head *list,
+ struct qdisc_skb_head *qh,
struct sk_buff **to_free)
{
- struct sk_buff *skb = __skb_dequeue(list);
+ struct sk_buff *skb = __qdisc_dequeue_head(qh);
if (likely(skb != NULL)) {
unsigned int len = qdisc_pkt_len(skb);
@@ -666,7 +700,9 @@
static inline struct sk_buff *qdisc_peek_head(struct Qdisc *sch)
{
- return skb_peek(&sch->q);
+ const struct qdisc_skb_head *qh = &sch->q;
+
+ return qh->head;
}
/* generic pseudo peek method for non-work-conserving qdisc */
@@ -701,15 +737,19 @@
return skb;
}
-static inline void __qdisc_reset_queue(struct sk_buff_head *list)
+static inline void __qdisc_reset_queue(struct qdisc_skb_head *qh)
{
/*
* We do not know the backlog in bytes of this list, it
* is up to the caller to correct it
*/
- if (!skb_queue_empty(list)) {
- rtnl_kfree_skbs(list->next, list->prev);
- __skb_queue_head_init(list);
+ ASSERT_RTNL();
+ if (qh->qlen) {
+ rtnl_kfree_skbs(qh->head, qh->tail);
+
+ qh->head = NULL;
+ qh->tail = NULL;
+ qh->qlen = 0;
}
}
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 632e205..87a7f42 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -83,9 +83,9 @@
#endif
/* Round an int up to the next multiple of 4. */
-#define WORD_ROUND(s) (((s)+3)&~3)
+#define SCTP_PAD4(s) (((s)+3)&~3)
/* Truncate to the previous multiple of 4. */
-#define WORD_TRUNC(s) ((s)&~3)
+#define SCTP_TRUNC4(s) ((s)&~3)
/*
* Function declarations.
@@ -433,7 +433,7 @@
if (asoc->user_frag)
frag = min_t(int, frag, asoc->user_frag);
- frag = WORD_TRUNC(min_t(int, frag, SCTP_MAX_CHUNK_LEN));
+ frag = SCTP_TRUNC4(min_t(int, frag, SCTP_MAX_CHUNK_LEN));
return frag;
}
@@ -462,7 +462,7 @@
for (pos.v = chunk->member;\
pos.v <= (void *)chunk + end - ntohs(pos.p->length) &&\
ntohs(pos.p->length) >= sizeof(sctp_paramhdr_t);\
- pos.v += WORD_ROUND(ntohs(pos.p->length)))
+ pos.v += SCTP_PAD4(ntohs(pos.p->length)))
#define sctp_walk_errors(err, chunk_hdr)\
_sctp_walk_errors((err), (chunk_hdr), ntohs((chunk_hdr)->length))
@@ -472,7 +472,7 @@
sizeof(sctp_chunkhdr_t));\
(void *)err <= (void *)chunk_hdr + end - ntohs(err->length) &&\
ntohs(err->length) >= sizeof(sctp_errhdr_t); \
- err = (sctp_errhdr_t *)((void *)err + WORD_ROUND(ntohs(err->length))))
+ err = (sctp_errhdr_t *)((void *)err + SCTP_PAD4(ntohs(err->length))))
#define sctp_walk_fwdtsn(pos, chunk)\
_sctp_walk_fwdtsn((pos), (chunk), ntohs((chunk)->chunk_hdr->length) - sizeof(struct sctp_fwdtsn_chunk))
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index efc0174..ca6c971 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -307,85 +307,27 @@
}
/* Compare two TSNs */
+#define TSN_lt(a,b) \
+ (typecheck(__u32, a) && \
+ typecheck(__u32, b) && \
+ ((__s32)((a) - (b)) < 0))
-/* RFC 1982 - Serial Number Arithmetic
- *
- * 2. Comparison
- * Then, s1 is said to be equal to s2 if and only if i1 is equal to i2,
- * in all other cases, s1 is not equal to s2.
- *
- * s1 is said to be less than s2 if, and only if, s1 is not equal to s2,
- * and
- *
- * (i1 < i2 and i2 - i1 < 2^(SERIAL_BITS - 1)) or
- * (i1 > i2 and i1 - i2 > 2^(SERIAL_BITS - 1))
- *
- * s1 is said to be greater than s2 if, and only if, s1 is not equal to
- * s2, and
- *
- * (i1 < i2 and i2 - i1 > 2^(SERIAL_BITS - 1)) or
- * (i1 > i2 and i1 - i2 < 2^(SERIAL_BITS - 1))
- */
-
-/*
- * RFC 2960
- * 1.6 Serial Number Arithmetic
- *
- * Comparisons and arithmetic on TSNs in this document SHOULD use Serial
- * Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32.
- */
-
-enum {
- TSN_SIGN_BIT = (1<<31)
-};
-
-static inline int TSN_lt(__u32 s, __u32 t)
-{
- return ((s) - (t)) & TSN_SIGN_BIT;
-}
-
-static inline int TSN_lte(__u32 s, __u32 t)
-{
- return ((s) == (t)) || (((s) - (t)) & TSN_SIGN_BIT);
-}
+#define TSN_lte(a,b) \
+ (typecheck(__u32, a) && \
+ typecheck(__u32, b) && \
+ ((__s32)((a) - (b)) <= 0))
/* Compare two SSNs */
+#define SSN_lt(a,b) \
+ (typecheck(__u16, a) && \
+ typecheck(__u16, b) && \
+ ((__s16)((a) - (b)) < 0))
-/*
- * RFC 2960
- * 1.6 Serial Number Arithmetic
- *
- * Comparisons and arithmetic on Stream Sequence Numbers in this document
- * SHOULD use Serial Number Arithmetic as defined in [RFC1982] where
- * SERIAL_BITS = 16.
- */
-enum {
- SSN_SIGN_BIT = (1<<15)
-};
-
-static inline int SSN_lt(__u16 s, __u16 t)
-{
- return ((s) - (t)) & SSN_SIGN_BIT;
-}
-
-static inline int SSN_lte(__u16 s, __u16 t)
-{
- return ((s) == (t)) || (((s) - (t)) & SSN_SIGN_BIT);
-}
-
-/*
- * ADDIP 3.1.1
- * The valid range of Serial Number is from 0 to 4294967295 (2**32 - 1). Serial
- * Numbers wrap back to 0 after reaching 4294967295.
- */
-enum {
- ADDIP_SERIAL_SIGN_BIT = (1<<31)
-};
-
-static inline int ADDIP_SERIAL_gte(__u16 s, __u16 t)
-{
- return ((s) == (t)) || (((t) - (s)) & ADDIP_SERIAL_SIGN_BIT);
-}
+/* ADDIP 3.1.1 */
+#define ADDIP_SERIAL_gte(a,b) \
+ (typecheck(__u32, a) && \
+ typecheck(__u32, b) && \
+ ((__s32)((b) - (a)) <= 0))
/* Check VTAG of the packet matches the sender's own tag. */
static inline int
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index ce93c4b..8693dc4 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -537,6 +537,7 @@
struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *,
struct sctp_sndrcvinfo *,
struct iov_iter *);
+void sctp_datamsg_free(struct sctp_datamsg *);
void sctp_datamsg_put(struct sctp_datamsg *);
void sctp_chunk_fail(struct sctp_chunk *, int error);
int sctp_chunk_abandoned(struct sctp_chunk *);
@@ -1076,7 +1077,7 @@
void sctp_outq_init(struct sctp_association *, struct sctp_outq *);
void sctp_outq_teardown(struct sctp_outq *);
void sctp_outq_free(struct sctp_outq*);
-int sctp_outq_tail(struct sctp_outq *, struct sctp_chunk *chunk, gfp_t);
+void sctp_outq_tail(struct sctp_outq *, struct sctp_chunk *chunk, gfp_t);
int sctp_outq_sack(struct sctp_outq *, struct sctp_chunk *);
int sctp_outq_is_empty(const struct sctp_outq *);
void sctp_outq_restart(struct sctp_outq *);
@@ -1084,7 +1085,7 @@
void sctp_retransmit(struct sctp_outq *, struct sctp_transport *,
sctp_retransmit_reason_t);
void sctp_retransmit_mark(struct sctp_outq *, struct sctp_transport *, __u8);
-int sctp_outq_uncork(struct sctp_outq *, gfp_t gfp);
+void sctp_outq_uncork(struct sctp_outq *, gfp_t gfp);
void sctp_prsctp_prune(struct sctp_association *asoc,
struct sctp_sndrcvinfo *sinfo, int msg_len);
/* Uncork and flush an outqueue. */
diff --git a/include/net/sock.h b/include/net/sock.h
index c797c57..ebf75db 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1339,6 +1339,16 @@
if (!sk_has_account(sk))
return;
sk->sk_forward_alloc += size;
+
+ /* Avoid a possible overflow.
+ * TCP send queues can make this happen, if sk_mem_reclaim()
+ * is not called and more than 2 GBytes are released at once.
+ *
+ * If we reach 2 MBytes, reclaim 1 MBytes right now, there is
+ * no need to hold that much forward allocation anyway.
+ */
+ if (unlikely(sk->sk_forward_alloc >= 1 << 21))
+ __sk_mem_reclaim(sk, 1 << 20);
}
static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 729fe15..eba80c4 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -68,7 +68,6 @@
enum switchdev_obj_id {
SWITCHDEV_OBJ_ID_UNDEFINED,
SWITCHDEV_OBJ_ID_PORT_VLAN,
- SWITCHDEV_OBJ_ID_IPV4_FIB,
SWITCHDEV_OBJ_ID_PORT_FDB,
SWITCHDEV_OBJ_ID_PORT_MDB,
};
@@ -92,21 +91,6 @@
#define SWITCHDEV_OBJ_PORT_VLAN(obj) \
container_of(obj, struct switchdev_obj_port_vlan, obj)
-/* SWITCHDEV_OBJ_ID_IPV4_FIB */
-struct switchdev_obj_ipv4_fib {
- struct switchdev_obj obj;
- u32 dst;
- int dst_len;
- struct fib_info *fi;
- u8 tos;
- u8 type;
- u32 nlflags;
- u32 tb_id;
-};
-
-#define SWITCHDEV_OBJ_IPV4_FIB(obj) \
- container_of(obj, struct switchdev_obj_ipv4_fib, obj)
-
/* SWITCHDEV_OBJ_ID_PORT_FDB */
struct switchdev_obj_port_fdb {
struct switchdev_obj obj;
@@ -209,11 +193,6 @@
struct nlmsghdr *nlh, u16 flags);
int switchdev_port_bridge_dellink(struct net_device *dev,
struct nlmsghdr *nlh, u16 flags);
-int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
- u8 tos, u8 type, u32 nlflags, u32 tb_id);
-int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
- u8 tos, u8 type, u32 tb_id);
-void switchdev_fib_ipv4_abort(struct fib_info *fi);
int switchdev_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev, const unsigned char *addr,
u16 vid, u16 nlm_flags);
@@ -304,25 +283,6 @@
return -EOPNOTSUPP;
}
-static inline int switchdev_fib_ipv4_add(u32 dst, int dst_len,
- struct fib_info *fi,
- u8 tos, u8 type,
- u32 nlflags, u32 tb_id)
-{
- return 0;
-}
-
-static inline int switchdev_fib_ipv4_del(u32 dst, int dst_len,
- struct fib_info *fi,
- u8 tos, u8 type, u32 tb_id)
-{
- return 0;
-}
-
-static inline void switchdev_fib_ipv4_abort(struct fib_info *fi)
-{
-}
-
static inline int switchdev_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
const unsigned char *addr,
diff --git a/include/net/tc_act/tc_ife.h b/include/net/tc_act/tc_ife.h
index 5164bd7..9fd2bea0 100644
--- a/include/net/tc_act/tc_ife.h
+++ b/include/net/tc_act/tc_ife.h
@@ -50,9 +50,11 @@
int ife_alloc_meta_u32(struct tcf_meta_info *mi, void *metaval, gfp_t gfp);
int ife_alloc_meta_u16(struct tcf_meta_info *mi, void *metaval, gfp_t gfp);
int ife_check_meta_u32(u32 metaval, struct tcf_meta_info *mi);
+int ife_check_meta_u16(u16 metaval, struct tcf_meta_info *mi);
int ife_encode_meta_u32(u32 metaval, void *skbdata, struct tcf_meta_info *mi);
int ife_validate_meta_u32(void *val, int len);
int ife_validate_meta_u16(void *val, int len);
+int ife_encode_meta_u16(u16 metaval, void *skbdata, struct tcf_meta_info *mi);
void ife_release_meta_gen(struct tcf_meta_info *mi);
int register_ife_op(struct tcf_meta_ops *mops);
int unregister_ife_op(struct tcf_meta_ops *mops);
diff --git a/include/net/tc_act/tc_vlan.h b/include/net/tc_act/tc_vlan.h
index 6b83588..48cca32 100644
--- a/include/net/tc_act/tc_vlan.h
+++ b/include/net/tc_act/tc_vlan.h
@@ -11,6 +11,7 @@
#define __NET_TC_VLAN_H
#include <net/act_api.h>
+#include <linux/tc_act/tc_vlan.h>
#define VLAN_F_POP 0x1
#define VLAN_F_PUSH 0x2
@@ -24,4 +25,28 @@
};
#define to_vlan(a) ((struct tcf_vlan *)a)
+static inline bool is_tcf_vlan(const struct tc_action *a)
+{
+#ifdef CONFIG_NET_CLS_ACT
+ if (a->ops && a->ops->type == TCA_ACT_VLAN)
+ return true;
+#endif
+ return false;
+}
+
+static inline u32 tcf_vlan_action(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_action;
+}
+
+static inline u16 tcf_vlan_push_vid(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_push_vid;
+}
+
+static inline __be16 tcf_vlan_push_proto(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_push_proto;
+}
+
#endif /* __NET_TC_VLAN_H */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index fdfbedd..f83b7f2 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -533,6 +533,8 @@
#endif
/* tcp_output.c */
+u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
+ int min_tso_segs);
void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss,
int nonagle);
bool tcp_may_send_now(struct sock *sk);
@@ -671,7 +673,7 @@
/* Minimum RTT in usec. ~0 means not available. */
static inline u32 tcp_min_rtt(const struct tcp_sock *tp)
{
- return tp->rtt_min[0].rtt;
+ return minmax_get(&tp->rtt_min);
}
/* Compute the actual receive window we are currently advertising.
@@ -763,8 +765,16 @@
__u32 ack_seq; /* Sequence number ACK'd */
union {
struct {
- /* There is space for up to 20 bytes */
- __u32 in_flight;/* Bytes in flight when packet sent */
+ /* There is space for up to 24 bytes */
+ __u32 in_flight:30,/* Bytes in flight at transmit */
+ is_app_limited:1, /* cwnd not fully used? */
+ unused:1;
+ /* pkts S/ACKed so far upon tx of skb, incl retrans: */
+ __u32 delivered;
+ /* start of send pipeline phase */
+ struct skb_mstamp first_tx_mstamp;
+ /* when we reached the "delivered" count */
+ struct skb_mstamp delivered_mstamp;
} tx; /* only used for outgoing skbs */
union {
struct inet_skb_parm h4;
@@ -860,6 +870,27 @@
u32 in_flight;
};
+/* A rate sample measures the number of (original/retransmitted) data
+ * packets delivered "delivered" over an interval of time "interval_us".
+ * The tcp_rate.c code fills in the rate sample, and congestion
+ * control modules that define a cong_control function to run at the end
+ * of ACK processing can optionally chose to consult this sample when
+ * setting cwnd and pacing rate.
+ * A sample is invalid if "delivered" or "interval_us" is negative.
+ */
+struct rate_sample {
+ struct skb_mstamp prior_mstamp; /* starting timestamp for interval */
+ u32 prior_delivered; /* tp->delivered at "prior_mstamp" */
+ s32 delivered; /* number of packets delivered over interval */
+ long interval_us; /* time for tp->delivered to incr "delivered" */
+ long rtt_us; /* RTT of last (S)ACKed packet (or -1) */
+ int losses; /* number of packets marked lost upon ACK */
+ u32 acked_sacked; /* number of packets newly (S)ACKed upon ACK */
+ u32 prior_in_flight; /* in flight before this ACK */
+ bool is_app_limited; /* is sample from packet with bubble in pipe? */
+ bool is_retrans; /* is sample from retransmission? */
+};
+
struct tcp_congestion_ops {
struct list_head list;
u32 key;
@@ -884,6 +915,14 @@
u32 (*undo_cwnd)(struct sock *sk);
/* hook for packet ack accounting (optional) */
void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
+ /* suggest number of segments for each skb to transmit (optional) */
+ u32 (*tso_segs_goal)(struct sock *sk);
+ /* returns the multiplier used in tcp_sndbuf_expand (optional) */
+ u32 (*sndbuf_expand)(struct sock *sk);
+ /* call when packets are delivered to update cwnd and pacing rate,
+ * after all the ca_state processing. (optional)
+ */
+ void (*cong_control)(struct sock *sk, const struct rate_sample *rs);
/* get info for inet_diag (optional) */
size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
union tcp_cc_info *info);
@@ -946,6 +985,14 @@
icsk->icsk_ca_ops->cwnd_event(sk, event);
}
+/* From tcp_rate.c */
+void tcp_rate_skb_sent(struct sock *sk, struct sk_buff *skb);
+void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
+ struct rate_sample *rs);
+void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
+ struct skb_mstamp *now, struct rate_sample *rs);
+void tcp_rate_check_app_limited(struct sock *sk);
+
/* These functions determine how the current flow behaves in respect of SACK
* handling. SACK is negotiated with the peer, and therefore it can vary
* between different flows.
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index d2fdd6d..31947b9 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1540,8 +1540,10 @@
void xfrm4_local_error(struct sk_buff *skb, u32 mtu);
int xfrm6_extract_header(struct sk_buff *skb);
int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb);
-int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi);
+int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi,
+ struct ip6_tnl *t);
int xfrm6_transport_finish(struct sk_buff *skb, int async);
+int xfrm6_rcv_tnl(struct sk_buff *skb, struct ip6_tnl *t);
int xfrm6_rcv(struct sk_buff *skb);
int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
xfrm_address_t *saddr, u8 proto);
diff --git a/include/rxrpc/packet.h b/include/rxrpc/packet.h
index fd6eb3a..703a64b 100644
--- a/include/rxrpc/packet.h
+++ b/include/rxrpc/packet.h
@@ -123,6 +123,7 @@
#define RXRPC_ACK_PING_RESPONSE 7 /* response to RXRPC_ACK_PING */
#define RXRPC_ACK_DELAY 8 /* nothing happened since received packet */
#define RXRPC_ACK_IDLE 9 /* ACK due to fully received ACK window */
+#define RXRPC_ACK__INVALID 10 /* Representation of invalid ACK reason */
uint8_t nAcks; /* number of ACKs */
#define RXRPC_MAXACKS 255
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index ea3b10e..ada12d0 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -16,6 +16,66 @@
#include <linux/tracepoint.h>
+TRACE_EVENT(rxrpc_conn,
+ TP_PROTO(struct rxrpc_connection *conn, enum rxrpc_conn_trace op,
+ int usage, const void *where),
+
+ TP_ARGS(conn, op, usage, where),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_connection *, conn )
+ __field(int, op )
+ __field(int, usage )
+ __field(const void *, where )
+ ),
+
+ TP_fast_assign(
+ __entry->conn = conn;
+ __entry->op = op;
+ __entry->usage = usage;
+ __entry->where = where;
+ ),
+
+ TP_printk("C=%p %s u=%d sp=%pSR",
+ __entry->conn,
+ rxrpc_conn_traces[__entry->op],
+ __entry->usage,
+ __entry->where)
+ );
+
+TRACE_EVENT(rxrpc_client,
+ TP_PROTO(struct rxrpc_connection *conn, int channel,
+ enum rxrpc_client_trace op),
+
+ TP_ARGS(conn, channel, op),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_connection *, conn )
+ __field(u32, cid )
+ __field(int, channel )
+ __field(int, usage )
+ __field(enum rxrpc_client_trace, op )
+ __field(enum rxrpc_conn_cache_state, cs )
+ ),
+
+ TP_fast_assign(
+ __entry->conn = conn;
+ __entry->channel = channel;
+ __entry->usage = atomic_read(&conn->usage);
+ __entry->op = op;
+ __entry->cid = conn->proto.cid;
+ __entry->cs = conn->cache_state;
+ ),
+
+ TP_printk("C=%p h=%2d %s %s i=%08x u=%d",
+ __entry->conn,
+ __entry->channel,
+ rxrpc_client_traces[__entry->op],
+ rxrpc_conn_cache_states[__entry->cs],
+ __entry->cid,
+ __entry->usage)
+ );
+
TRACE_EVENT(rxrpc_call,
TP_PROTO(struct rxrpc_call *call, enum rxrpc_call_trace op,
int usage, const void *where, const void *aux),
@@ -47,14 +107,14 @@
);
TRACE_EVENT(rxrpc_skb,
- TP_PROTO(struct sk_buff *skb, int op, int usage, int mod_count,
- const void *where),
+ TP_PROTO(struct sk_buff *skb, enum rxrpc_skb_trace op,
+ int usage, int mod_count, const void *where),
TP_ARGS(skb, op, usage, mod_count, where),
TP_STRUCT__entry(
__field(struct sk_buff *, skb )
- __field(int, op )
+ __field(enum rxrpc_skb_trace, op )
__field(int, usage )
__field(int, mod_count )
__field(const void *, where )
@@ -70,11 +130,7 @@
TP_printk("s=%p %s u=%d m=%d p=%pSR",
__entry->skb,
- (__entry->op == 0 ? "NEW" :
- __entry->op == 1 ? "SEE" :
- __entry->op == 2 ? "GET" :
- __entry->op == 3 ? "FRE" :
- "PUR"),
+ rxrpc_skb_traces[__entry->op],
__entry->usage,
__entry->mod_count,
__entry->where)
@@ -93,11 +149,12 @@
memcpy(&__entry->hdr, &sp->hdr, sizeof(__entry->hdr));
),
- TP_printk("%08x:%08x:%08x:%04x %08x %08x %02x %02x",
+ TP_printk("%08x:%08x:%08x:%04x %08x %08x %02x %02x %s",
__entry->hdr.epoch, __entry->hdr.cid,
__entry->hdr.callNumber, __entry->hdr.serviceId,
__entry->hdr.serial, __entry->hdr.seq,
- __entry->hdr.type, __entry->hdr.flags)
+ __entry->hdr.type, __entry->hdr.flags,
+ __entry->hdr.type <= 15 ? rxrpc_pkts[__entry->hdr.type] : "?UNK")
);
TRACE_EVENT(rxrpc_rx_done,
@@ -147,6 +204,417 @@
__entry->abort_code, __entry->error, __entry->why)
);
+TRACE_EVENT(rxrpc_transmit,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_transmit_trace why),
+
+ TP_ARGS(call, why),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_transmit_trace, why )
+ __field(rxrpc_seq_t, tx_hard_ack )
+ __field(rxrpc_seq_t, tx_top )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->tx_hard_ack = call->tx_hard_ack;
+ __entry->tx_top = call->tx_top;
+ ),
+
+ TP_printk("c=%p %s f=%08x n=%u",
+ __entry->call,
+ rxrpc_transmit_traces[__entry->why],
+ __entry->tx_hard_ack + 1,
+ __entry->tx_top - __entry->tx_hard_ack)
+ );
+
+TRACE_EVENT(rxrpc_rx_ack,
+ TP_PROTO(struct rxrpc_call *call, rxrpc_seq_t first, u8 reason, u8 n_acks),
+
+ TP_ARGS(call, first, reason, n_acks),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(rxrpc_seq_t, first )
+ __field(u8, reason )
+ __field(u8, n_acks )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->first = first;
+ __entry->reason = reason;
+ __entry->n_acks = n_acks;
+ ),
+
+ TP_printk("c=%p %s f=%08x n=%u",
+ __entry->call,
+ rxrpc_ack_names[__entry->reason],
+ __entry->first,
+ __entry->n_acks)
+ );
+
+TRACE_EVENT(rxrpc_tx_data,
+ TP_PROTO(struct rxrpc_call *call, rxrpc_seq_t seq,
+ rxrpc_serial_t serial, u8 flags, bool lose),
+
+ TP_ARGS(call, seq, serial, flags, lose),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(rxrpc_seq_t, seq )
+ __field(rxrpc_serial_t, serial )
+ __field(u8, flags )
+ __field(bool, lose )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->seq = seq;
+ __entry->serial = serial;
+ __entry->flags = flags;
+ __entry->lose = lose;
+ ),
+
+ TP_printk("c=%p DATA %08x q=%08x fl=%02x%s",
+ __entry->call,
+ __entry->serial,
+ __entry->seq,
+ __entry->flags,
+ __entry->lose ? " *LOSE*" : "")
+ );
+
+TRACE_EVENT(rxrpc_tx_ack,
+ TP_PROTO(struct rxrpc_call *call, rxrpc_serial_t serial,
+ rxrpc_seq_t ack_first, rxrpc_serial_t ack_serial,
+ u8 reason, u8 n_acks),
+
+ TP_ARGS(call, serial, ack_first, ack_serial, reason, n_acks),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(rxrpc_serial_t, serial )
+ __field(rxrpc_seq_t, ack_first )
+ __field(rxrpc_serial_t, ack_serial )
+ __field(u8, reason )
+ __field(u8, n_acks )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->serial = serial;
+ __entry->ack_first = ack_first;
+ __entry->ack_serial = ack_serial;
+ __entry->reason = reason;
+ __entry->n_acks = n_acks;
+ ),
+
+ TP_printk(" c=%p ACK %08x %s f=%08x r=%08x n=%u",
+ __entry->call,
+ __entry->serial,
+ rxrpc_ack_names[__entry->reason],
+ __entry->ack_first,
+ __entry->ack_serial,
+ __entry->n_acks)
+ );
+
+TRACE_EVENT(rxrpc_receive,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_receive_trace why,
+ rxrpc_serial_t serial, rxrpc_seq_t seq),
+
+ TP_ARGS(call, why, serial, seq),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_receive_trace, why )
+ __field(rxrpc_serial_t, serial )
+ __field(rxrpc_seq_t, seq )
+ __field(rxrpc_seq_t, hard_ack )
+ __field(rxrpc_seq_t, top )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->serial = serial;
+ __entry->seq = seq;
+ __entry->hard_ack = call->rx_hard_ack;
+ __entry->top = call->rx_top;
+ ),
+
+ TP_printk("c=%p %s r=%08x q=%08x w=%08x-%08x",
+ __entry->call,
+ rxrpc_receive_traces[__entry->why],
+ __entry->serial,
+ __entry->seq,
+ __entry->hard_ack,
+ __entry->top)
+ );
+
+TRACE_EVENT(rxrpc_recvmsg,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_recvmsg_trace why,
+ rxrpc_seq_t seq, unsigned int offset, unsigned int len,
+ int ret),
+
+ TP_ARGS(call, why, seq, offset, len, ret),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_recvmsg_trace, why )
+ __field(rxrpc_seq_t, seq )
+ __field(unsigned int, offset )
+ __field(unsigned int, len )
+ __field(int, ret )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->seq = seq;
+ __entry->offset = offset;
+ __entry->len = len;
+ __entry->ret = ret;
+ ),
+
+ TP_printk("c=%p %s q=%08x o=%u l=%u ret=%d",
+ __entry->call,
+ rxrpc_recvmsg_traces[__entry->why],
+ __entry->seq,
+ __entry->offset,
+ __entry->len,
+ __entry->ret)
+ );
+
+TRACE_EVENT(rxrpc_rtt_tx,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_rtt_tx_trace why,
+ rxrpc_serial_t send_serial),
+
+ TP_ARGS(call, why, send_serial),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_rtt_tx_trace, why )
+ __field(rxrpc_serial_t, send_serial )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->send_serial = send_serial;
+ ),
+
+ TP_printk("c=%p %s sr=%08x",
+ __entry->call,
+ rxrpc_rtt_tx_traces[__entry->why],
+ __entry->send_serial)
+ );
+
+TRACE_EVENT(rxrpc_rtt_rx,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_rtt_rx_trace why,
+ rxrpc_serial_t send_serial, rxrpc_serial_t resp_serial,
+ s64 rtt, u8 nr, s64 avg),
+
+ TP_ARGS(call, why, send_serial, resp_serial, rtt, nr, avg),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_rtt_rx_trace, why )
+ __field(u8, nr )
+ __field(rxrpc_serial_t, send_serial )
+ __field(rxrpc_serial_t, resp_serial )
+ __field(s64, rtt )
+ __field(u64, avg )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->send_serial = send_serial;
+ __entry->resp_serial = resp_serial;
+ __entry->rtt = rtt;
+ __entry->nr = nr;
+ __entry->avg = avg;
+ ),
+
+ TP_printk("c=%p %s sr=%08x rr=%08x rtt=%lld nr=%u avg=%lld",
+ __entry->call,
+ rxrpc_rtt_rx_traces[__entry->why],
+ __entry->send_serial,
+ __entry->resp_serial,
+ __entry->rtt,
+ __entry->nr,
+ __entry->avg)
+ );
+
+TRACE_EVENT(rxrpc_timer,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_timer_trace why,
+ unsigned long now),
+
+ TP_ARGS(call, why, now),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_timer_trace, why )
+ __field(unsigned long, now )
+ __field(unsigned long, expire_at )
+ __field(unsigned long, ack_at )
+ __field(unsigned long, resend_at )
+ __field(unsigned long, timer )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->now = now;
+ __entry->expire_at = call->expire_at;
+ __entry->ack_at = call->ack_at;
+ __entry->resend_at = call->resend_at;
+ __entry->timer = call->timer.expires;
+ ),
+
+ TP_printk("c=%p %s now=%lx x=%ld a=%ld r=%ld t=%ld",
+ __entry->call,
+ rxrpc_timer_traces[__entry->why],
+ __entry->now,
+ __entry->expire_at - __entry->now,
+ __entry->ack_at - __entry->now,
+ __entry->resend_at - __entry->now,
+ __entry->timer - __entry->now)
+ );
+
+TRACE_EVENT(rxrpc_rx_lose,
+ TP_PROTO(struct rxrpc_skb_priv *sp),
+
+ TP_ARGS(sp),
+
+ TP_STRUCT__entry(
+ __field_struct(struct rxrpc_host_header, hdr )
+ ),
+
+ TP_fast_assign(
+ memcpy(&__entry->hdr, &sp->hdr, sizeof(__entry->hdr));
+ ),
+
+ TP_printk("%08x:%08x:%08x:%04x %08x %08x %02x %02x %s *LOSE*",
+ __entry->hdr.epoch, __entry->hdr.cid,
+ __entry->hdr.callNumber, __entry->hdr.serviceId,
+ __entry->hdr.serial, __entry->hdr.seq,
+ __entry->hdr.type, __entry->hdr.flags,
+ __entry->hdr.type <= 15 ? rxrpc_pkts[__entry->hdr.type] : "?UNK")
+ );
+
+TRACE_EVENT(rxrpc_propose_ack,
+ TP_PROTO(struct rxrpc_call *call, enum rxrpc_propose_ack_trace why,
+ u8 ack_reason, rxrpc_serial_t serial, bool immediate,
+ bool background, enum rxrpc_propose_ack_outcome outcome),
+
+ TP_ARGS(call, why, ack_reason, serial, immediate, background,
+ outcome),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_propose_ack_trace, why )
+ __field(rxrpc_serial_t, serial )
+ __field(u8, ack_reason )
+ __field(bool, immediate )
+ __field(bool, background )
+ __field(enum rxrpc_propose_ack_outcome, outcome )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->why = why;
+ __entry->serial = serial;
+ __entry->ack_reason = ack_reason;
+ __entry->immediate = immediate;
+ __entry->background = background;
+ __entry->outcome = outcome;
+ ),
+
+ TP_printk("c=%p %s %s r=%08x i=%u b=%u%s",
+ __entry->call,
+ rxrpc_propose_ack_traces[__entry->why],
+ rxrpc_ack_names[__entry->ack_reason],
+ __entry->serial,
+ __entry->immediate,
+ __entry->background,
+ rxrpc_propose_ack_outcomes[__entry->outcome])
+ );
+
+TRACE_EVENT(rxrpc_retransmit,
+ TP_PROTO(struct rxrpc_call *call, rxrpc_seq_t seq, u8 annotation,
+ s64 expiry),
+
+ TP_ARGS(call, seq, annotation, expiry),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(rxrpc_seq_t, seq )
+ __field(u8, annotation )
+ __field(s64, expiry )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->seq = seq;
+ __entry->annotation = annotation;
+ __entry->expiry = expiry;
+ ),
+
+ TP_printk("c=%p q=%x a=%02x xp=%lld",
+ __entry->call,
+ __entry->seq,
+ __entry->annotation,
+ __entry->expiry)
+ );
+
+TRACE_EVENT(rxrpc_congest,
+ TP_PROTO(struct rxrpc_call *call, struct rxrpc_ack_summary *summary,
+ rxrpc_serial_t ack_serial, enum rxrpc_congest_change change),
+
+ TP_ARGS(call, summary, ack_serial, change),
+
+ TP_STRUCT__entry(
+ __field(struct rxrpc_call *, call )
+ __field(enum rxrpc_congest_change, change )
+ __field(rxrpc_seq_t, hard_ack )
+ __field(rxrpc_seq_t, top )
+ __field(rxrpc_seq_t, lowest_nak )
+ __field(rxrpc_serial_t, ack_serial )
+ __field_struct(struct rxrpc_ack_summary, sum )
+ ),
+
+ TP_fast_assign(
+ __entry->call = call;
+ __entry->change = change;
+ __entry->hard_ack = call->tx_hard_ack;
+ __entry->top = call->tx_top;
+ __entry->lowest_nak = call->acks_lowest_nak;
+ __entry->ack_serial = ack_serial;
+ memcpy(&__entry->sum, summary, sizeof(__entry->sum));
+ ),
+
+ TP_printk("c=%p %08x %s %08x %s cw=%u ss=%u nr=%u,%u nw=%u,%u r=%u b=%u u=%u d=%u l=%x%s%s%s",
+ __entry->call,
+ __entry->ack_serial,
+ rxrpc_ack_names[__entry->sum.ack_reason],
+ __entry->hard_ack,
+ rxrpc_congest_modes[__entry->sum.mode],
+ __entry->sum.cwnd,
+ __entry->sum.ssthresh,
+ __entry->sum.nr_acks, __entry->sum.nr_nacks,
+ __entry->sum.nr_new_acks, __entry->sum.nr_new_nacks,
+ __entry->sum.nr_rot_new_acks,
+ __entry->top - __entry->hard_ack,
+ __entry->sum.cumulative_acks,
+ __entry->sum.dup_acks,
+ __entry->lowest_nak, __entry->sum.new_low_nack ? "!" : "",
+ rxrpc_congest_changes[__entry->change],
+ __entry->sum.retrans_timeo ? " rTxTo" : "")
+ );
+
#endif /* _TRACE_RXRPC_H */
/* This part must be outside protection */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f896dfa..f09c70b 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -398,6 +398,34 @@
*/
BPF_FUNC_skb_change_tail,
+ /**
+ * bpf_skb_pull_data(skb, len)
+ * The helper will pull in non-linear data in case the
+ * skb is non-linear and not all of len are part of the
+ * linear section. Only needed for read/write with direct
+ * packet access.
+ * @skb: pointer to skb
+ * @len: len to make read/writeable
+ * Return: 0 on success or negative error
+ */
+ BPF_FUNC_skb_pull_data,
+
+ /**
+ * bpf_csum_update(skb, csum)
+ * Adds csum into skb->csum in case of CHECKSUM_COMPLETE.
+ * @skb: pointer to skb
+ * @csum: csum to add
+ * Return: csum on success or negative error
+ */
+ BPF_FUNC_csum_update,
+
+ /**
+ * bpf_set_hash_invalid(skb)
+ * Invalidate current skb>hash.
+ * @skb: pointer to skb
+ */
+ BPF_FUNC_set_hash_invalid,
+
__BPF_FUNC_MAX_ID,
};
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 9bf3aec..b4fba66 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -464,6 +464,7 @@
enum ipvlan_mode {
IPVLAN_MODE_L2 = 0,
IPVLAN_MODE_L3,
+ IPVLAN_MODE_L3S,
IPVLAN_MODE_MAX
};
@@ -618,7 +619,7 @@
enum {
IFLA_VF_UNSPEC,
IFLA_VF_MAC, /* Hardware queue specific attributes */
- IFLA_VF_VLAN,
+ IFLA_VF_VLAN, /* VLAN ID and QoS */
IFLA_VF_TX_RATE, /* Max TX Bandwidth Allocation */
IFLA_VF_SPOOFCHK, /* Spoof Checking on/off switch */
IFLA_VF_LINK_STATE, /* link state enable/disable/auto switch */
@@ -630,6 +631,7 @@
IFLA_VF_TRUST, /* Trust VF */
IFLA_VF_IB_NODE_GUID, /* VF Infiniband node GUID */
IFLA_VF_IB_PORT_GUID, /* VF Infiniband port GUID */
+ IFLA_VF_VLAN_LIST, /* nested list of vlans, option for QinQ */
__IFLA_VF_MAX,
};
@@ -646,6 +648,22 @@
__u32 qos;
};
+enum {
+ IFLA_VF_VLAN_INFO_UNSPEC,
+ IFLA_VF_VLAN_INFO, /* VLAN ID, QoS and VLAN protocol */
+ __IFLA_VF_VLAN_INFO_MAX,
+};
+
+#define IFLA_VF_VLAN_INFO_MAX (__IFLA_VF_VLAN_INFO_MAX - 1)
+#define MAX_VLAN_LIST_LEN 1
+
+struct ifla_vf_vlan_info {
+ __u32 vf;
+ __u32 vlan; /* 0 - 4095, 0 disables VLAN filter */
+ __u32 qos;
+ __be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
+};
+
struct ifla_vf_tx_rate {
__u32 vf;
__u32 rate; /* Max TX bandwidth in Mbps, 0 disables throttling */
@@ -826,6 +844,7 @@
IFLA_STATS_LINK_64,
IFLA_STATS_LINK_XSTATS,
IFLA_STATS_LINK_XSTATS_SLAVE,
+ IFLA_STATS_LINK_OFFLOAD_XSTATS,
__IFLA_STATS_MAX,
};
@@ -845,6 +864,14 @@
};
#define LINK_XSTATS_TYPE_MAX (__LINK_XSTATS_TYPE_MAX - 1)
+/* These are stats embedded into IFLA_STATS_LINK_OFFLOAD_XSTATS */
+enum {
+ IFLA_OFFLOAD_XSTATS_UNSPEC,
+ IFLA_OFFLOAD_XSTATS_CPU_HIT, /* struct rtnl_link_stats64 */
+ __IFLA_OFFLOAD_XSTATS_MAX
+};
+#define IFLA_OFFLOAD_XSTATS_MAX (__IFLA_OFFLOAD_XSTATS_MAX - 1)
+
/* XDP section */
enum {
diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 18d5dc1..92f3c86 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -39,6 +39,7 @@
#define GRE_IS_REC(f) ((f) & GRE_REC)
#define GRE_IS_ACK(f) ((f) & GRE_ACK)
+#define GRE_VERSION_0 __cpu_to_be16(0x0000)
#define GRE_VERSION_1 __cpu_to_be16(0x0001)
#define GRE_PROTO_PPP __cpu_to_be16(0x880b)
#define GRE_PPTP_KEY_MASK __cpu_to_be32(0xffff)
diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h
index b5c366f..509cd96 100644
--- a/include/uapi/linux/inet_diag.h
+++ b/include/uapi/linux/inet_diag.h
@@ -124,6 +124,7 @@
INET_DIAG_PEERS,
INET_DIAG_PAD,
INET_DIAG_MARK,
+ INET_DIAG_BBRINFO,
__INET_DIAG_MAX,
};
@@ -157,8 +158,20 @@
__u32 dctcp_ab_tot;
};
+/* INET_DIAG_BBRINFO */
+
+struct tcp_bbr_info {
+ /* u64 bw: max-filtered BW (app throughput) estimate in Byte per sec: */
+ __u32 bbr_bw_lo; /* lower 32 bits of bw */
+ __u32 bbr_bw_hi; /* upper 32 bits of bw */
+ __u32 bbr_min_rtt; /* min-filtered RTT in uSec */
+ __u32 bbr_pacing_gain; /* pacing gain shifted left 8 bits */
+ __u32 bbr_cwnd_gain; /* cwnd gain shifted left 8 bits */
+};
+
union tcp_cc_info {
struct tcpvegas_info vegas;
struct tcp_dctcp_info dctcp;
+ struct tcp_bbr_info bbr;
};
#endif /* _UAPI_INET_DIAG_H_ */
diff --git a/include/uapi/linux/netfilter/nf_log.h b/include/uapi/linux/netfilter/nf_log.h
new file mode 100644
index 0000000..8be21e0
--- /dev/null
+++ b/include/uapi/linux/netfilter/nf_log.h
@@ -0,0 +1,12 @@
+#ifndef _NETFILTER_NF_LOG_H
+#define _NETFILTER_NF_LOG_H
+
+#define NF_LOG_TCPSEQ 0x01 /* Log TCP sequence numbers */
+#define NF_LOG_TCPOPT 0x02 /* Log TCP options */
+#define NF_LOG_IPOPT 0x04 /* Log IP options */
+#define NF_LOG_UID 0x08 /* Log UID owning local socket */
+#define NF_LOG_NFLOG 0x10 /* Unsupported, don't reuse */
+#define NF_LOG_MACDECODE 0x20 /* Decode MAC header */
+#define NF_LOG_MASK 0x2f
+
+#endif /* _NETFILTER_NF_LOG_H */
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 28ce01d..c6c4477 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -546,6 +546,35 @@
};
#define NFTA_CMP_MAX (__NFTA_CMP_MAX - 1)
+/**
+ * enum nft_range_ops - nf_tables range operator
+ *
+ * @NFT_RANGE_EQ: equal
+ * @NFT_RANGE_NEQ: not equal
+ */
+enum nft_range_ops {
+ NFT_RANGE_EQ,
+ NFT_RANGE_NEQ,
+};
+
+/**
+ * enum nft_range_attributes - nf_tables range expression netlink attributes
+ *
+ * @NFTA_RANGE_SREG: source register of data to compare (NLA_U32: nft_registers)
+ * @NFTA_RANGE_OP: cmp operation (NLA_U32: nft_cmp_ops)
+ * @NFTA_RANGE_FROM_DATA: data range from (NLA_NESTED: nft_data_attributes)
+ * @NFTA_RANGE_TO_DATA: data range to (NLA_NESTED: nft_data_attributes)
+ */
+enum nft_range_attributes {
+ NFTA_RANGE_UNSPEC,
+ NFTA_RANGE_SREG,
+ NFTA_RANGE_OP,
+ NFTA_RANGE_FROM_DATA,
+ NFTA_RANGE_TO_DATA,
+ __NFTA_RANGE_MAX
+};
+#define NFTA_RANGE_MAX (__NFTA_RANGE_MAX - 1)
+
enum nft_lookup_flags {
NFT_LOOKUP_F_INV = (1 << 0),
};
@@ -575,6 +604,10 @@
NFT_DYNSET_OP_UPDATE,
};
+enum nft_dynset_flags {
+ NFT_DYNSET_F_INV = (1 << 0),
+};
+
/**
* enum nft_dynset_attributes - dynset expression attributes
*
@@ -585,6 +618,7 @@
* @NFTA_DYNSET_SREG_DATA: source register of the data (NLA_U32)
* @NFTA_DYNSET_TIMEOUT: timeout value for the new element (NLA_U64)
* @NFTA_DYNSET_EXPR: expression (NLA_NESTED: nft_expr_attributes)
+ * @NFTA_DYNSET_FLAGS: flags (NLA_U32)
*/
enum nft_dynset_attributes {
NFTA_DYNSET_UNSPEC,
@@ -596,6 +630,7 @@
NFTA_DYNSET_TIMEOUT,
NFTA_DYNSET_EXPR,
NFTA_DYNSET_PAD,
+ NFTA_DYNSET_FLAGS,
__NFTA_DYNSET_MAX,
};
#define NFTA_DYNSET_MAX (__NFTA_DYNSET_MAX - 1)
@@ -731,6 +766,7 @@
* @NFTA_HASH_LEN: source data length (NLA_U32)
* @NFTA_HASH_MODULUS: modulus value (NLA_U32)
* @NFTA_HASH_SEED: seed value (NLA_U32)
+ * @NFTA_HASH_OFFSET: add this offset value to hash result (NLA_U32)
*/
enum nft_hash_attributes {
NFTA_HASH_UNSPEC,
@@ -739,6 +775,7 @@
NFTA_HASH_LEN,
NFTA_HASH_MODULUS,
NFTA_HASH_SEED,
+ NFTA_HASH_OFFSET,
__NFTA_HASH_MAX,
};
#define NFTA_HASH_MAX (__NFTA_HASH_MAX - 1)
@@ -886,12 +923,14 @@
* @NFTA_QUEUE_NUM: netlink queue to send messages to (NLA_U16)
* @NFTA_QUEUE_TOTAL: number of queues to load balance packets on (NLA_U16)
* @NFTA_QUEUE_FLAGS: various flags (NLA_U16)
+ * @NFTA_QUEUE_SREG_QNUM: source register of queue number (NLA_U32: nft_registers)
*/
enum nft_queue_attributes {
NFTA_QUEUE_UNSPEC,
NFTA_QUEUE_NUM,
NFTA_QUEUE_TOTAL,
NFTA_QUEUE_FLAGS,
+ NFTA_QUEUE_SREG_QNUM,
__NFTA_QUEUE_MAX
};
#define NFTA_QUEUE_MAX (__NFTA_QUEUE_MAX - 1)
@@ -1126,14 +1165,16 @@
* enum nft_ng_attributes - nf_tables number generator expression netlink attributes
*
* @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: source value to increment the counter until reset (NLA_U32)
+ * @NFTA_NG_MODULUS: maximum counter value (NLA_U32)
* @NFTA_NG_TYPE: operation type (NLA_U32)
+ * @NFTA_NG_OFFSET: offset to be added to the counter (NLA_U32)
*/
enum nft_ng_attributes {
NFTA_NG_UNSPEC,
NFTA_NG_DREG,
- NFTA_NG_UNTIL,
+ NFTA_NG_MODULUS,
NFTA_NG_TYPE,
+ NFTA_NG_OFFSET,
__NFTA_NG_MAX
};
#define NFTA_NG_MAX (__NFTA_NG_MAX - 1)
diff --git a/include/uapi/linux/netfilter/nfnetlink_conntrack.h b/include/uapi/linux/netfilter/nfnetlink_conntrack.h
index 9df78970..6deb886 100644
--- a/include/uapi/linux/netfilter/nfnetlink_conntrack.h
+++ b/include/uapi/linux/netfilter/nfnetlink_conntrack.h
@@ -231,13 +231,13 @@
enum ctattr_stats_cpu {
CTA_STATS_UNSPEC,
- CTA_STATS_SEARCHED,
+ CTA_STATS_SEARCHED, /* no longer used */
CTA_STATS_FOUND,
- CTA_STATS_NEW,
+ CTA_STATS_NEW, /* no longer used */
CTA_STATS_INVALID,
CTA_STATS_IGNORE,
- CTA_STATS_DELETE,
- CTA_STATS_DELETE_LIST,
+ CTA_STATS_DELETE, /* no longer used */
+ CTA_STATS_DELETE_LIST, /* no longer used */
CTA_STATS_INSERT,
CTA_STATS_INSERT_FAILED,
CTA_STATS_DROP,
diff --git a/include/uapi/linux/netfilter/xt_hashlimit.h b/include/uapi/linux/netfilter/xt_hashlimit.h
index 6db9037..3efc0ca 100644
--- a/include/uapi/linux/netfilter/xt_hashlimit.h
+++ b/include/uapi/linux/netfilter/xt_hashlimit.h
@@ -6,6 +6,7 @@
/* timings are in milliseconds. */
#define XT_HASHLIMIT_SCALE 10000
+#define XT_HASHLIMIT_SCALE_v2 1000000llu
/* 1/10,000 sec period => max of 10,000/sec. Min rate is then 429490
* seconds, or one packet every 59 hours.
*/
@@ -63,6 +64,20 @@
__u8 srcmask, dstmask;
};
+struct hashlimit_cfg2 {
+ __u64 avg; /* Average secs between packets * scale */
+ __u64 burst; /* Period multiplier for upper limit. */
+ __u32 mode; /* bitmask of XT_HASHLIMIT_HASH_* */
+
+ /* user specified */
+ __u32 size; /* how many buckets */
+ __u32 max; /* max number of entries */
+ __u32 gc_interval; /* gc interval */
+ __u32 expire; /* when do entries expire? */
+
+ __u8 srcmask, dstmask;
+};
+
struct xt_hashlimit_mtinfo1 {
char name[IFNAMSIZ];
struct hashlimit_cfg1 cfg;
@@ -71,4 +86,12 @@
struct xt_hashlimit_htable *hinfo __attribute__((aligned(8)));
};
+struct xt_hashlimit_mtinfo2 {
+ char name[NAME_MAX];
+ struct hashlimit_cfg2 cfg;
+
+ /* Used internally by the kernel */
+ struct xt_hashlimit_htable *hinfo __attribute__((aligned(8)));
+};
+
#endif /* _UAPI_XT_HASHLIMIT_H */
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 8915b61..8fd715f 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -396,6 +396,7 @@
TCA_BPF_FD,
TCA_BPF_NAME,
TCA_BPF_FLAGS,
+ TCA_BPF_FLAGS_GEN,
__TCA_BPF_MAX,
};
diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 2382eed..df7451d 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -792,6 +792,8 @@
TCA_FQ_ORPHAN_MASK, /* mask applied to orphaned skb hashes */
+ TCA_FQ_LOW_RATE_THRESHOLD, /* per packet delay under this rate */
+
__TCA_FQ_MAX
};
@@ -809,7 +811,7 @@
__u32 flows;
__u32 inactive_flows;
__u32 throttled_flows;
- __u32 pad;
+ __u32 unthrottle_latency_ns;
};
/* Heavy-Hitter Filter */
diff --git a/include/uapi/linux/tc_act/tc_ife.h b/include/uapi/linux/tc_act/tc_ife.h
index 4ece02a..cd18360 100644
--- a/include/uapi/linux/tc_act/tc_ife.h
+++ b/include/uapi/linux/tc_act/tc_ife.h
@@ -32,8 +32,9 @@
#define IFE_META_HASHID 2
#define IFE_META_PRIO 3
#define IFE_META_QMAP 4
+#define IFE_META_TCINDEX 5
/*Can be overridden at runtime by module option*/
-#define __IFE_META_MAX 5
+#define __IFE_META_MAX 6
#define IFE_META_MAX (__IFE_META_MAX - 1)
#endif
diff --git a/include/uapi/linux/tc_act/tc_vlan.h b/include/uapi/linux/tc_act/tc_vlan.h
index be72b6e..bddb272 100644
--- a/include/uapi/linux/tc_act/tc_vlan.h
+++ b/include/uapi/linux/tc_act/tc_vlan.h
@@ -16,6 +16,7 @@
#define TCA_VLAN_ACT_POP 1
#define TCA_VLAN_ACT_PUSH 2
+#define TCA_VLAN_ACT_MODIFY 3
struct tc_vlan {
tc_gen;
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 482898f..73ac0db 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -167,6 +167,7 @@
__u8 tcpi_backoff;
__u8 tcpi_options;
__u8 tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;
+ __u8 tcpi_delivery_rate_app_limited:1;
__u32 tcpi_rto;
__u32 tcpi_ato;
@@ -211,6 +212,8 @@
__u32 tcpi_min_rtt;
__u32 tcpi_data_segs_in; /* RFC4898 tcpEStatsDataSegsIn */
__u32 tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */
+
+ __u64 tcpi_delivery_rate;
};
/* for TCP_MD5SIG socket option */
diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index 1433389..1fc62b2 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -298,7 +298,7 @@
XFRMA_ALG_AUTH_TRUNC, /* struct xfrm_algo_auth */
XFRMA_MARK, /* struct xfrm_mark */
XFRMA_TFCPAD, /* __u32 */
- XFRMA_REPLAY_ESN_VAL, /* struct xfrm_replay_esn */
+ XFRMA_REPLAY_ESN_VAL, /* struct xfrm_replay_state_esn */
XFRMA_SA_EXTRA_FLAGS, /* __u32 */
XFRMA_PROTO, /* __u8 */
XFRMA_ADDRESS_FILTER, /* struct xfrm_address_filter */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 7b7baae..aa6d981 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1031,7 +1031,7 @@
state = &get_cpu_var(bpf_user_rnd_state);
res = prandom_u32_state(state);
- put_cpu_var(state);
+ put_cpu_var(bpf_user_rnd_state);
return res;
}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a5b8bf8..3991840 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -36,6 +36,7 @@
const struct bpf_func_proto bpf_map_lookup_elem_proto = {
.func = bpf_map_lookup_elem,
.gpl_only = false,
+ .pkt_access = true,
.ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_PTR_TO_MAP_KEY,
@@ -51,6 +52,7 @@
const struct bpf_func_proto bpf_map_update_elem_proto = {
.func = bpf_map_update_elem,
.gpl_only = false,
+ .pkt_access = true,
.ret_type = RET_INTEGER,
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_PTR_TO_MAP_KEY,
@@ -67,6 +69,7 @@
const struct bpf_func_proto bpf_map_delete_elem_proto = {
.func = bpf_map_delete_elem,
.gpl_only = false,
+ .pkt_access = true,
.ret_type = RET_INTEGER,
.arg1_type = ARG_CONST_MAP_PTR,
.arg2_type = ARG_PTR_TO_MAP_KEY,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 90493a6..99a7e5b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14,6 +14,7 @@
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
#include <linux/filter.h>
#include <net/netlink.h>
#include <linux/file.h>
@@ -126,76 +127,16 @@
* are set to NOT_INIT to indicate that they are no longer readable.
*/
-struct reg_state {
- enum bpf_reg_type type;
- union {
- /* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
- s64 imm;
-
- /* valid when type == PTR_TO_PACKET* */
- struct {
- u32 id;
- u16 off;
- u16 range;
- };
-
- /* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE |
- * PTR_TO_MAP_VALUE_OR_NULL
- */
- struct bpf_map *map_ptr;
- };
-};
-
-enum bpf_stack_slot_type {
- STACK_INVALID, /* nothing was stored in this stack slot */
- STACK_SPILL, /* register spilled into stack */
- STACK_MISC /* BPF program wrote some data into this slot */
-};
-
-#define BPF_REG_SIZE 8 /* size of eBPF register in bytes */
-
-/* state of the program:
- * type of all registers and stack info
- */
-struct verifier_state {
- struct reg_state regs[MAX_BPF_REG];
- u8 stack_slot_type[MAX_BPF_STACK];
- struct reg_state spilled_regs[MAX_BPF_STACK / BPF_REG_SIZE];
-};
-
-/* linked list of verifier states used to prune search */
-struct verifier_state_list {
- struct verifier_state state;
- struct verifier_state_list *next;
-};
-
/* verifier_state + insn_idx are pushed to stack when branch is encountered */
-struct verifier_stack_elem {
+struct bpf_verifier_stack_elem {
/* verifer state is 'st'
* before processing instruction 'insn_idx'
* and after processing instruction 'prev_insn_idx'
*/
- struct verifier_state st;
+ struct bpf_verifier_state st;
int insn_idx;
int prev_insn_idx;
- struct verifier_stack_elem *next;
-};
-
-#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
-
-/* single container for all structs
- * one verifier_env per bpf_check() call
- */
-struct verifier_env {
- struct bpf_prog *prog; /* eBPF program being verified */
- struct verifier_stack_elem *head; /* stack of verifier states to be processed */
- int stack_size; /* number of states to be processed */
- struct verifier_state cur_state; /* current verifier state */
- struct verifier_state_list **explored_states; /* search pruning optimization */
- struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
- u32 used_map_cnt; /* number of used maps */
- u32 id_gen; /* used to generate unique reg IDs */
- bool allow_ptr_leaks;
+ struct bpf_verifier_stack_elem *next;
};
#define BPF_COMPLEXITY_LIMIT_INSNS 65536
@@ -204,6 +145,7 @@
struct bpf_call_arg_meta {
struct bpf_map *map_ptr;
bool raw_mode;
+ bool pkt_access;
int regno;
int access_size;
};
@@ -240,6 +182,7 @@
[CONST_PTR_TO_MAP] = "map_ptr",
[PTR_TO_MAP_VALUE] = "map_value",
[PTR_TO_MAP_VALUE_OR_NULL] = "map_value_or_null",
+ [PTR_TO_MAP_VALUE_ADJ] = "map_value_adj",
[FRAME_PTR] = "fp",
[PTR_TO_STACK] = "fp",
[CONST_IMM] = "imm",
@@ -247,9 +190,9 @@
[PTR_TO_PACKET_END] = "pkt_end",
};
-static void print_verifier_state(struct verifier_state *state)
+static void print_verifier_state(struct bpf_verifier_state *state)
{
- struct reg_state *reg;
+ struct bpf_reg_state *reg;
enum bpf_reg_type t;
int i;
@@ -267,10 +210,17 @@
else if (t == UNKNOWN_VALUE && reg->imm)
verbose("%lld", reg->imm);
else if (t == CONST_PTR_TO_MAP || t == PTR_TO_MAP_VALUE ||
- t == PTR_TO_MAP_VALUE_OR_NULL)
+ t == PTR_TO_MAP_VALUE_OR_NULL ||
+ t == PTR_TO_MAP_VALUE_ADJ)
verbose("(ks=%d,vs=%d)",
reg->map_ptr->key_size,
reg->map_ptr->value_size);
+ if (reg->min_value != BPF_REGISTER_MIN_RANGE)
+ verbose(",min_value=%llu",
+ (unsigned long long)reg->min_value);
+ if (reg->max_value != BPF_REGISTER_MAX_RANGE)
+ verbose(",max_value=%llu",
+ (unsigned long long)reg->max_value);
}
for (i = 0; i < MAX_BPF_STACK; i += BPF_REG_SIZE) {
if (state->stack_slot_type[i] == STACK_SPILL)
@@ -425,9 +375,9 @@
}
}
-static int pop_stack(struct verifier_env *env, int *prev_insn_idx)
+static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx)
{
- struct verifier_stack_elem *elem;
+ struct bpf_verifier_stack_elem *elem;
int insn_idx;
if (env->head == NULL)
@@ -444,12 +394,12 @@
return insn_idx;
}
-static struct verifier_state *push_stack(struct verifier_env *env, int insn_idx,
- int prev_insn_idx)
+static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
+ int insn_idx, int prev_insn_idx)
{
- struct verifier_stack_elem *elem;
+ struct bpf_verifier_stack_elem *elem;
- elem = kmalloc(sizeof(struct verifier_stack_elem), GFP_KERNEL);
+ elem = kmalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL);
if (!elem)
goto err;
@@ -475,13 +425,15 @@
BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
};
-static void init_reg_state(struct reg_state *regs)
+static void init_reg_state(struct bpf_reg_state *regs)
{
int i;
for (i = 0; i < MAX_BPF_REG; i++) {
regs[i].type = NOT_INIT;
regs[i].imm = 0;
+ regs[i].min_value = BPF_REGISTER_MIN_RANGE;
+ regs[i].max_value = BPF_REGISTER_MAX_RANGE;
}
/* frame pointer */
@@ -491,20 +443,26 @@
regs[BPF_REG_1].type = PTR_TO_CTX;
}
-static void mark_reg_unknown_value(struct reg_state *regs, u32 regno)
+static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
{
BUG_ON(regno >= MAX_BPF_REG);
regs[regno].type = UNKNOWN_VALUE;
regs[regno].imm = 0;
}
+static void reset_reg_range_values(struct bpf_reg_state *regs, u32 regno)
+{
+ regs[regno].min_value = BPF_REGISTER_MIN_RANGE;
+ regs[regno].max_value = BPF_REGISTER_MAX_RANGE;
+}
+
enum reg_arg_type {
SRC_OP, /* register is used as source operand */
DST_OP, /* register is used as destination operand */
DST_OP_NO_MARK /* same as above, check only, don't mark */
};
-static int check_reg_arg(struct reg_state *regs, u32 regno,
+static int check_reg_arg(struct bpf_reg_state *regs, u32 regno,
enum reg_arg_type t)
{
if (regno >= MAX_BPF_REG) {
@@ -564,8 +522,8 @@
/* check_stack_read/write functions track spill/fill of registers,
* stack boundary and alignment are checked in check_mem_access()
*/
-static int check_stack_write(struct verifier_state *state, int off, int size,
- int value_regno)
+static int check_stack_write(struct bpf_verifier_state *state, int off,
+ int size, int value_regno)
{
int i;
/* caller checked that off % size == 0 and -MAX_BPF_STACK <= off < 0,
@@ -590,7 +548,7 @@
} else {
/* regular write of data into stack */
state->spilled_regs[(MAX_BPF_STACK + off) / BPF_REG_SIZE] =
- (struct reg_state) {};
+ (struct bpf_reg_state) {};
for (i = 0; i < size; i++)
state->stack_slot_type[MAX_BPF_STACK + off + i] = STACK_MISC;
@@ -598,7 +556,7 @@
return 0;
}
-static int check_stack_read(struct verifier_state *state, int off, int size,
+static int check_stack_read(struct bpf_verifier_state *state, int off, int size,
int value_regno)
{
u8 *slot_type;
@@ -639,7 +597,7 @@
}
/* check read/write into map element returned by bpf_map_lookup_elem() */
-static int check_map_access(struct verifier_env *env, u32 regno, int off,
+static int check_map_access(struct bpf_verifier_env *env, u32 regno, int off,
int size)
{
struct bpf_map *map = env->cur_state.regs[regno].map_ptr;
@@ -654,24 +612,31 @@
#define MAX_PACKET_OFF 0xffff
-static bool may_write_pkt_data(enum bpf_prog_type type)
+static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
+ const struct bpf_call_arg_meta *meta)
{
- switch (type) {
+ switch (env->prog->type) {
+ case BPF_PROG_TYPE_SCHED_CLS:
+ case BPF_PROG_TYPE_SCHED_ACT:
case BPF_PROG_TYPE_XDP:
+ if (meta)
+ return meta->pkt_access;
+
+ env->seen_direct_write = true;
return true;
default:
return false;
}
}
-static int check_packet_access(struct verifier_env *env, u32 regno, int off,
+static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
int size)
{
- struct reg_state *regs = env->cur_state.regs;
- struct reg_state *reg = ®s[regno];
+ struct bpf_reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *reg = ®s[regno];
off += reg->off;
- if (off < 0 || off + size > reg->range) {
+ if (off < 0 || size <= 0 || off + size > reg->range) {
verbose("invalid access to packet, off=%d size=%d, R%d(id=%d,off=%d,r=%d)\n",
off, size, regno, reg->id, reg->off, reg->range);
return -EACCES;
@@ -680,9 +645,13 @@
}
/* check access to 'struct bpf_context' fields */
-static int check_ctx_access(struct verifier_env *env, int off, int size,
+static int check_ctx_access(struct bpf_verifier_env *env, int off, int size,
enum bpf_access_type t, enum bpf_reg_type *reg_type)
{
+ /* for analyzer ctx accesses are already validated and converted */
+ if (env->analyzer_ops)
+ return 0;
+
if (env->prog->aux->ops->is_valid_access &&
env->prog->aux->ops->is_valid_access(off, size, t, reg_type)) {
/* remember the offset of last byte accessed in ctx */
@@ -695,7 +664,7 @@
return -EACCES;
}
-static bool is_pointer_value(struct verifier_env *env, int regno)
+static bool is_pointer_value(struct bpf_verifier_env *env, int regno)
{
if (env->allow_ptr_leaks)
return false;
@@ -709,28 +678,19 @@
}
}
-static int check_ptr_alignment(struct verifier_env *env, struct reg_state *reg,
- int off, int size)
+static int check_ptr_alignment(struct bpf_verifier_env *env,
+ struct bpf_reg_state *reg, int off, int size)
{
- if (reg->type != PTR_TO_PACKET) {
+ if (reg->type != PTR_TO_PACKET && reg->type != PTR_TO_MAP_VALUE_ADJ) {
if (off % size != 0) {
- verbose("misaligned access off %d size %d\n", off, size);
+ verbose("misaligned access off %d size %d\n",
+ off, size);
return -EACCES;
} else {
return 0;
}
}
- switch (env->prog->type) {
- case BPF_PROG_TYPE_SCHED_CLS:
- case BPF_PROG_TYPE_SCHED_ACT:
- case BPF_PROG_TYPE_XDP:
- break;
- default:
- verbose("verifier is misconfigured\n");
- return -EACCES;
- }
-
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
/* misaligned access to packet is ok on x86,arm,arm64 */
return 0;
@@ -741,7 +701,8 @@
}
/* skb->data is NET_IP_ALIGN-ed */
- if ((NET_IP_ALIGN + reg->off + off) % size != 0) {
+ if (reg->type == PTR_TO_PACKET &&
+ (NET_IP_ALIGN + reg->off + off) % size != 0) {
verbose("misaligned packet access off %d+%d+%d size %d\n",
NET_IP_ALIGN, reg->off, off, size);
return -EACCES;
@@ -755,12 +716,12 @@
* if t==write && value_regno==-1, some unknown value is stored into memory
* if t==read && value_regno==-1, don't care what we read from memory
*/
-static int check_mem_access(struct verifier_env *env, u32 regno, int off,
+static int check_mem_access(struct bpf_verifier_env *env, u32 regno, int off,
int bpf_size, enum bpf_access_type t,
int value_regno)
{
- struct verifier_state *state = &env->cur_state;
- struct reg_state *reg = &state->regs[regno];
+ struct bpf_verifier_state *state = &env->cur_state;
+ struct bpf_reg_state *reg = &state->regs[regno];
int size, err = 0;
if (reg->type == PTR_TO_STACK)
@@ -774,12 +735,52 @@
if (err)
return err;
- if (reg->type == PTR_TO_MAP_VALUE) {
+ if (reg->type == PTR_TO_MAP_VALUE ||
+ reg->type == PTR_TO_MAP_VALUE_ADJ) {
if (t == BPF_WRITE && value_regno >= 0 &&
is_pointer_value(env, value_regno)) {
verbose("R%d leaks addr into map\n", value_regno);
return -EACCES;
}
+
+ /* If we adjusted the register to this map value at all then we
+ * need to change off and size to min_value and max_value
+ * respectively to make sure our theoretical access will be
+ * safe.
+ */
+ if (reg->type == PTR_TO_MAP_VALUE_ADJ) {
+ if (log_level)
+ print_verifier_state(state);
+ env->varlen_map_value_access = true;
+ /* The minimum value is only important with signed
+ * comparisons where we can't assume the floor of a
+ * value is 0. If we are using signed variables for our
+ * index'es we need to make sure that whatever we use
+ * will have a set floor within our range.
+ */
+ if ((s64)reg->min_value < 0) {
+ verbose("R%d min value is negative, either use unsigned index or do a if (index >=0) check.\n",
+ regno);
+ return -EACCES;
+ }
+ err = check_map_access(env, regno, reg->min_value + off,
+ size);
+ if (err) {
+ verbose("R%d min value is outside of the array range\n",
+ regno);
+ return err;
+ }
+
+ /* If we haven't set a max value then we need to bail
+ * since we can't be sure we won't do bad things.
+ */
+ if (reg->max_value == BPF_REGISTER_MAX_RANGE) {
+ verbose("R%d unbounded memory access, make sure to bounds check any array access into a map\n",
+ regno);
+ return -EACCES;
+ }
+ off += reg->max_value;
+ }
err = check_map_access(env, regno, off, size);
if (!err && t == BPF_READ && value_regno >= 0)
mark_reg_unknown_value(state->regs, value_regno);
@@ -795,9 +796,8 @@
err = check_ctx_access(env, off, size, t, ®_type);
if (!err && t == BPF_READ && value_regno >= 0) {
mark_reg_unknown_value(state->regs, value_regno);
- if (env->allow_ptr_leaks)
- /* note that reg.[id|off|range] == 0 */
- state->regs[value_regno].type = reg_type;
+ /* note that reg.[id|off|range] == 0 */
+ state->regs[value_regno].type = reg_type;
}
} else if (reg->type == FRAME_PTR || reg->type == PTR_TO_STACK) {
@@ -817,7 +817,7 @@
err = check_stack_read(state, off, size, value_regno);
}
} else if (state->regs[regno].type == PTR_TO_PACKET) {
- if (t == BPF_WRITE && !may_write_pkt_data(env->prog->type)) {
+ if (t == BPF_WRITE && !may_access_direct_pkt_data(env, NULL)) {
verbose("cannot write into packet\n");
return -EACCES;
}
@@ -846,9 +846,9 @@
return err;
}
-static int check_xadd(struct verifier_env *env, struct bpf_insn *insn)
+static int check_xadd(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *regs = env->cur_state.regs;
int err;
if ((BPF_SIZE(insn->code) != BPF_W && BPF_SIZE(insn->code) != BPF_DW) ||
@@ -882,12 +882,12 @@
* bytes from that pointer, make sure that it's within stack boundary
* and all elements of stack are initialized
*/
-static int check_stack_boundary(struct verifier_env *env, int regno,
+static int check_stack_boundary(struct bpf_verifier_env *env, int regno,
int access_size, bool zero_size_allowed,
struct bpf_call_arg_meta *meta)
{
- struct verifier_state *state = &env->cur_state;
- struct reg_state *regs = state->regs;
+ struct bpf_verifier_state *state = &env->cur_state;
+ struct bpf_reg_state *regs = state->regs;
int off, i;
if (regs[regno].type != PTR_TO_STACK) {
@@ -926,11 +926,11 @@
return 0;
}
-static int check_func_arg(struct verifier_env *env, u32 regno,
+static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
enum bpf_arg_type arg_type,
struct bpf_call_arg_meta *meta)
{
- struct reg_state *regs = env->cur_state.regs, *reg = ®s[regno];
+ struct bpf_reg_state *regs = env->cur_state.regs, *reg = ®s[regno];
enum bpf_reg_type expected_type, type = reg->type;
int err = 0;
@@ -950,8 +950,8 @@
return 0;
}
- if (type == PTR_TO_PACKET && !may_write_pkt_data(env->prog->type)) {
- verbose("helper access to the packet is not allowed for clsact\n");
+ if (type == PTR_TO_PACKET && !may_access_direct_pkt_data(env, meta)) {
+ verbose("helper access to the packet is not allowed\n");
return -EACCES;
}
@@ -1135,10 +1135,10 @@
return count > 1 ? -EINVAL : 0;
}
-static void clear_all_pkt_pointers(struct verifier_env *env)
+static void clear_all_pkt_pointers(struct bpf_verifier_env *env)
{
- struct verifier_state *state = &env->cur_state;
- struct reg_state *regs = state->regs, *reg;
+ struct bpf_verifier_state *state = &env->cur_state;
+ struct bpf_reg_state *regs = state->regs, *reg;
int i;
for (i = 0; i < MAX_BPF_REG; i++)
@@ -1158,12 +1158,12 @@
}
}
-static int check_call(struct verifier_env *env, int func_id)
+static int check_call(struct bpf_verifier_env *env, int func_id)
{
- struct verifier_state *state = &env->cur_state;
+ struct bpf_verifier_state *state = &env->cur_state;
const struct bpf_func_proto *fn = NULL;
- struct reg_state *regs = state->regs;
- struct reg_state *reg;
+ struct bpf_reg_state *regs = state->regs;
+ struct bpf_reg_state *reg;
struct bpf_call_arg_meta meta;
bool changes_data;
int i, err;
@@ -1191,6 +1191,7 @@
changes_data = bpf_helper_changes_skb_data(fn->func);
memset(&meta, 0, sizeof(meta));
+ meta.pkt_access = fn->pkt_access;
/* We only support one arg being in raw mode at the moment, which
* is sufficient for the helper functions we have right now.
@@ -1241,6 +1242,7 @@
regs[BPF_REG_0].type = NOT_INIT;
} else if (fn->ret_type == RET_PTR_TO_MAP_VALUE_OR_NULL) {
regs[BPF_REG_0].type = PTR_TO_MAP_VALUE_OR_NULL;
+ regs[BPF_REG_0].max_value = regs[BPF_REG_0].min_value = 0;
/* remember map_ptr, so that check_map_access()
* can check 'value_size' boundary of memory access
* to map element returned from bpf_map_lookup_elem()
@@ -1265,12 +1267,13 @@
return 0;
}
-static int check_packet_ptr_add(struct verifier_env *env, struct bpf_insn *insn)
+static int check_packet_ptr_add(struct bpf_verifier_env *env,
+ struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
- struct reg_state *dst_reg = ®s[insn->dst_reg];
- struct reg_state *src_reg = ®s[insn->src_reg];
- struct reg_state tmp_reg;
+ struct bpf_reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *dst_reg = ®s[insn->dst_reg];
+ struct bpf_reg_state *src_reg = ®s[insn->src_reg];
+ struct bpf_reg_state tmp_reg;
s32 imm;
if (BPF_SRC(insn->code) == BPF_K) {
@@ -1338,10 +1341,10 @@
return 0;
}
-static int evaluate_reg_alu(struct verifier_env *env, struct bpf_insn *insn)
+static int evaluate_reg_alu(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
- struct reg_state *dst_reg = ®s[insn->dst_reg];
+ struct bpf_reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *dst_reg = ®s[insn->dst_reg];
u8 opcode = BPF_OP(insn->code);
s64 imm_log2;
@@ -1351,7 +1354,7 @@
*/
if (BPF_SRC(insn->code) == BPF_X) {
- struct reg_state *src_reg = ®s[insn->src_reg];
+ struct bpf_reg_state *src_reg = ®s[insn->src_reg];
if (src_reg->type == UNKNOWN_VALUE && src_reg->imm > 0 &&
dst_reg->imm && opcode == BPF_ADD) {
@@ -1440,11 +1443,12 @@
return 0;
}
-static int evaluate_reg_imm_alu(struct verifier_env *env, struct bpf_insn *insn)
+static int evaluate_reg_imm_alu(struct bpf_verifier_env *env,
+ struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
- struct reg_state *dst_reg = ®s[insn->dst_reg];
- struct reg_state *src_reg = ®s[insn->src_reg];
+ struct bpf_reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *dst_reg = ®s[insn->dst_reg];
+ struct bpf_reg_state *src_reg = ®s[insn->src_reg];
u8 opcode = BPF_OP(insn->code);
/* dst_reg->type == CONST_IMM here, simulate execution of 'add' insn.
@@ -1460,10 +1464,110 @@
return 0;
}
-/* check validity of 32-bit and 64-bit arithmetic operations */
-static int check_alu_op(struct verifier_env *env, struct bpf_insn *insn)
+static void check_reg_overflow(struct bpf_reg_state *reg)
{
- struct reg_state *regs = env->cur_state.regs, *dst_reg;
+ if (reg->max_value > BPF_REGISTER_MAX_RANGE)
+ reg->max_value = BPF_REGISTER_MAX_RANGE;
+ if ((s64)reg->min_value < BPF_REGISTER_MIN_RANGE)
+ reg->min_value = BPF_REGISTER_MIN_RANGE;
+}
+
+static void adjust_reg_min_max_vals(struct bpf_verifier_env *env,
+ struct bpf_insn *insn)
+{
+ struct bpf_reg_state *regs = env->cur_state.regs, *dst_reg;
+ u64 min_val = BPF_REGISTER_MIN_RANGE, max_val = BPF_REGISTER_MAX_RANGE;
+ bool min_set = false, max_set = false;
+ u8 opcode = BPF_OP(insn->code);
+
+ dst_reg = ®s[insn->dst_reg];
+ if (BPF_SRC(insn->code) == BPF_X) {
+ check_reg_overflow(®s[insn->src_reg]);
+ min_val = regs[insn->src_reg].min_value;
+ max_val = regs[insn->src_reg].max_value;
+
+ /* If the source register is a random pointer then the
+ * min_value/max_value values represent the range of the known
+ * accesses into that value, not the actual min/max value of the
+ * register itself. In this case we have to reset the reg range
+ * values so we know it is not safe to look at.
+ */
+ if (regs[insn->src_reg].type != CONST_IMM &&
+ regs[insn->src_reg].type != UNKNOWN_VALUE) {
+ min_val = BPF_REGISTER_MIN_RANGE;
+ max_val = BPF_REGISTER_MAX_RANGE;
+ }
+ } else if (insn->imm < BPF_REGISTER_MAX_RANGE &&
+ (s64)insn->imm > BPF_REGISTER_MIN_RANGE) {
+ min_val = max_val = insn->imm;
+ min_set = max_set = true;
+ }
+
+ /* We don't know anything about what was done to this register, mark it
+ * as unknown.
+ */
+ if (min_val == BPF_REGISTER_MIN_RANGE &&
+ max_val == BPF_REGISTER_MAX_RANGE) {
+ reset_reg_range_values(regs, insn->dst_reg);
+ return;
+ }
+
+ switch (opcode) {
+ case BPF_ADD:
+ dst_reg->min_value += min_val;
+ dst_reg->max_value += max_val;
+ break;
+ case BPF_SUB:
+ dst_reg->min_value -= min_val;
+ dst_reg->max_value -= max_val;
+ break;
+ case BPF_MUL:
+ dst_reg->min_value *= min_val;
+ dst_reg->max_value *= max_val;
+ break;
+ case BPF_AND:
+ /* & is special since it could end up with 0 bits set. */
+ dst_reg->min_value &= min_val;
+ dst_reg->max_value = max_val;
+ break;
+ case BPF_LSH:
+ /* Gotta have special overflow logic here, if we're shifting
+ * more than MAX_RANGE then just assume we have an invalid
+ * range.
+ */
+ if (min_val > ilog2(BPF_REGISTER_MAX_RANGE))
+ dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+ else
+ dst_reg->min_value <<= min_val;
+
+ if (max_val > ilog2(BPF_REGISTER_MAX_RANGE))
+ dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+ else
+ dst_reg->max_value <<= max_val;
+ break;
+ case BPF_RSH:
+ dst_reg->min_value >>= min_val;
+ dst_reg->max_value >>= max_val;
+ break;
+ case BPF_MOD:
+ /* % is special since it is an unsigned modulus, so the floor
+ * will always be 0.
+ */
+ dst_reg->min_value = 0;
+ dst_reg->max_value = max_val - 1;
+ break;
+ default:
+ reset_reg_range_values(regs, insn->dst_reg);
+ break;
+ }
+
+ check_reg_overflow(dst_reg);
+}
+
+/* check validity of 32-bit and 64-bit arithmetic operations */
+static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+ struct bpf_reg_state *regs = env->cur_state.regs, *dst_reg;
u8 opcode = BPF_OP(insn->code);
int err;
@@ -1523,6 +1627,11 @@
if (err)
return err;
+ /* we are setting our register to something new, we need to
+ * reset its range values.
+ */
+ reset_reg_range_values(regs, insn->dst_reg);
+
if (BPF_SRC(insn->code) == BPF_X) {
if (BPF_CLASS(insn->code) == BPF_ALU64) {
/* case: R1 = R2
@@ -1544,6 +1653,8 @@
*/
regs[insn->dst_reg].type = CONST_IMM;
regs[insn->dst_reg].imm = insn->imm;
+ regs[insn->dst_reg].max_value = insn->imm;
+ regs[insn->dst_reg].min_value = insn->imm;
}
} else if (opcode > BPF_END) {
@@ -1596,6 +1707,9 @@
dst_reg = ®s[insn->dst_reg];
+ /* first we want to adjust our ranges. */
+ adjust_reg_min_max_vals(env, insn);
+
/* pattern match 'bpf_add Rx, imm' instruction */
if (opcode == BPF_ADD && BPF_CLASS(insn->code) == BPF_ALU64 &&
dst_reg->type == FRAME_PTR && BPF_SRC(insn->code) == BPF_K) {
@@ -1630,17 +1744,26 @@
return -EACCES;
}
- /* mark dest operand */
- mark_reg_unknown_value(regs, insn->dst_reg);
+ /* If we did pointer math on a map value then just set it to our
+ * PTR_TO_MAP_VALUE_ADJ type so we can deal with any stores or
+ * loads to this register appropriately, otherwise just mark the
+ * register as unknown.
+ */
+ if (env->allow_ptr_leaks &&
+ (dst_reg->type == PTR_TO_MAP_VALUE ||
+ dst_reg->type == PTR_TO_MAP_VALUE_ADJ))
+ dst_reg->type = PTR_TO_MAP_VALUE_ADJ;
+ else
+ mark_reg_unknown_value(regs, insn->dst_reg);
}
return 0;
}
-static void find_good_pkt_pointers(struct verifier_state *state,
- const struct reg_state *dst_reg)
+static void find_good_pkt_pointers(struct bpf_verifier_state *state,
+ struct bpf_reg_state *dst_reg)
{
- struct reg_state *regs = state->regs, *reg;
+ struct bpf_reg_state *regs = state->regs, *reg;
int i;
/* LLVM can generate two kind of checks:
@@ -1686,11 +1809,109 @@
}
}
-static int check_cond_jmp_op(struct verifier_env *env,
+/* Adjusts the register min/max values in the case that the dst_reg is the
+ * variable register that we are working on, and src_reg is a constant or we're
+ * simply doing a BPF_K check.
+ */
+static void reg_set_min_max(struct bpf_reg_state *true_reg,
+ struct bpf_reg_state *false_reg, u64 val,
+ u8 opcode)
+{
+ switch (opcode) {
+ case BPF_JEQ:
+ /* If this is false then we know nothing Jon Snow, but if it is
+ * true then we know for sure.
+ */
+ true_reg->max_value = true_reg->min_value = val;
+ break;
+ case BPF_JNE:
+ /* If this is true we know nothing Jon Snow, but if it is false
+ * we know the value for sure;
+ */
+ false_reg->max_value = false_reg->min_value = val;
+ break;
+ case BPF_JGT:
+ /* Unsigned comparison, the minimum value is 0. */
+ false_reg->min_value = 0;
+ case BPF_JSGT:
+ /* If this is false then we know the maximum val is val,
+ * otherwise we know the min val is val+1.
+ */
+ false_reg->max_value = val;
+ true_reg->min_value = val + 1;
+ break;
+ case BPF_JGE:
+ /* Unsigned comparison, the minimum value is 0. */
+ false_reg->min_value = 0;
+ case BPF_JSGE:
+ /* If this is false then we know the maximum value is val - 1,
+ * otherwise we know the mimimum value is val.
+ */
+ false_reg->max_value = val - 1;
+ true_reg->min_value = val;
+ break;
+ default:
+ break;
+ }
+
+ check_reg_overflow(false_reg);
+ check_reg_overflow(true_reg);
+}
+
+/* Same as above, but for the case that dst_reg is a CONST_IMM reg and src_reg
+ * is the variable reg.
+ */
+static void reg_set_min_max_inv(struct bpf_reg_state *true_reg,
+ struct bpf_reg_state *false_reg, u64 val,
+ u8 opcode)
+{
+ switch (opcode) {
+ case BPF_JEQ:
+ /* If this is false then we know nothing Jon Snow, but if it is
+ * true then we know for sure.
+ */
+ true_reg->max_value = true_reg->min_value = val;
+ break;
+ case BPF_JNE:
+ /* If this is true we know nothing Jon Snow, but if it is false
+ * we know the value for sure;
+ */
+ false_reg->max_value = false_reg->min_value = val;
+ break;
+ case BPF_JGT:
+ /* Unsigned comparison, the minimum value is 0. */
+ true_reg->min_value = 0;
+ case BPF_JSGT:
+ /*
+ * If this is false, then the val is <= the register, if it is
+ * true the register <= to the val.
+ */
+ false_reg->min_value = val;
+ true_reg->max_value = val - 1;
+ break;
+ case BPF_JGE:
+ /* Unsigned comparison, the minimum value is 0. */
+ true_reg->min_value = 0;
+ case BPF_JSGE:
+ /* If this is false then constant < register, if it is true then
+ * the register < constant.
+ */
+ false_reg->min_value = val + 1;
+ true_reg->max_value = val;
+ break;
+ default:
+ break;
+ }
+
+ check_reg_overflow(false_reg);
+ check_reg_overflow(true_reg);
+}
+
+static int check_cond_jmp_op(struct bpf_verifier_env *env,
struct bpf_insn *insn, int *insn_idx)
{
- struct verifier_state *other_branch, *this_branch = &env->cur_state;
- struct reg_state *regs = this_branch->regs, *dst_reg;
+ struct bpf_verifier_state *other_branch, *this_branch = &env->cur_state;
+ struct bpf_reg_state *regs = this_branch->regs, *dst_reg;
u8 opcode = BPF_OP(insn->code);
int err;
@@ -1752,7 +1973,24 @@
if (!other_branch)
return -EFAULT;
- /* detect if R == 0 where R is returned value from bpf_map_lookup_elem() */
+ /* detect if we are comparing against a constant value so we can adjust
+ * our min/max values for our dst register.
+ */
+ if (BPF_SRC(insn->code) == BPF_X) {
+ if (regs[insn->src_reg].type == CONST_IMM)
+ reg_set_min_max(&other_branch->regs[insn->dst_reg],
+ dst_reg, regs[insn->src_reg].imm,
+ opcode);
+ else if (dst_reg->type == CONST_IMM)
+ reg_set_min_max_inv(&other_branch->regs[insn->src_reg],
+ ®s[insn->src_reg], dst_reg->imm,
+ opcode);
+ } else {
+ reg_set_min_max(&other_branch->regs[insn->dst_reg],
+ dst_reg, insn->imm, opcode);
+ }
+
+ /* detect if R == 0 where R is returned from bpf_map_lookup_elem() */
if (BPF_SRC(insn->code) == BPF_K &&
insn->imm == 0 && (opcode == BPF_JEQ || opcode == BPF_JNE) &&
dst_reg->type == PTR_TO_MAP_VALUE_OR_NULL) {
@@ -1794,9 +2032,9 @@
}
/* verify BPF_LD_IMM64 instruction */
-static int check_ld_imm(struct verifier_env *env, struct bpf_insn *insn)
+static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *regs = env->cur_state.regs;
int err;
if (BPF_SIZE(insn->code) != BPF_DW) {
@@ -1812,9 +2050,19 @@
if (err)
return err;
- if (insn->src_reg == 0)
- /* generic move 64-bit immediate into a register */
+ if (insn->src_reg == 0) {
+ /* generic move 64-bit immediate into a register,
+ * only analyzer needs to collect the ld_imm value.
+ */
+ u64 imm = ((u64)(insn + 1)->imm << 32) | (u32)insn->imm;
+
+ if (!env->analyzer_ops)
+ return 0;
+
+ regs[insn->dst_reg].type = CONST_IMM;
+ regs[insn->dst_reg].imm = imm;
return 0;
+ }
/* replace_map_fd_with_map_ptr() should have caught bad ld_imm64 */
BUG_ON(insn->src_reg != BPF_PSEUDO_MAP_FD);
@@ -1851,11 +2099,11 @@
* Output:
* R0 - 8/16/32-bit skb data converted to cpu endianness
*/
-static int check_ld_abs(struct verifier_env *env, struct bpf_insn *insn)
+static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
- struct reg_state *regs = env->cur_state.regs;
+ struct bpf_reg_state *regs = env->cur_state.regs;
u8 mode = BPF_MODE(insn->code);
- struct reg_state *reg;
+ struct bpf_reg_state *reg;
int i, err;
if (!may_access_skb(env->prog->type)) {
@@ -1941,7 +2189,7 @@
BRANCH = 2,
};
-#define STATE_LIST_MARK ((struct verifier_state_list *) -1L)
+#define STATE_LIST_MARK ((struct bpf_verifier_state_list *) -1L)
static int *insn_stack; /* stack of insns to process */
static int cur_stack; /* current stack index */
@@ -1952,7 +2200,7 @@
* w - next instruction
* e - edge
*/
-static int push_insn(int t, int w, int e, struct verifier_env *env)
+static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
{
if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
return 0;
@@ -1993,7 +2241,7 @@
/* non-recursive depth-first-search to detect loops in BPF program
* loop == back-edge in directed graph
*/
-static int check_cfg(struct verifier_env *env)
+static int check_cfg(struct bpf_verifier_env *env)
{
struct bpf_insn *insns = env->prog->insnsi;
int insn_cnt = env->prog->len;
@@ -2102,7 +2350,8 @@
/* the following conditions reduce the number of explored insns
* from ~140k to ~80k for ultra large programs that use a lot of ptr_to_packet
*/
-static bool compare_ptrs_to_packet(struct reg_state *old, struct reg_state *cur)
+static bool compare_ptrs_to_packet(struct bpf_reg_state *old,
+ struct bpf_reg_state *cur)
{
if (old->id != cur->id)
return false;
@@ -2177,9 +2426,11 @@
* whereas register type in current state is meaningful, it means that
* the current state will reach 'bpf_exit' instruction safely
*/
-static bool states_equal(struct verifier_state *old, struct verifier_state *cur)
+static bool states_equal(struct bpf_verifier_env *env,
+ struct bpf_verifier_state *old,
+ struct bpf_verifier_state *cur)
{
- struct reg_state *rold, *rcur;
+ struct bpf_reg_state *rold, *rcur;
int i;
for (i = 0; i < MAX_BPF_REG; i++) {
@@ -2189,6 +2440,13 @@
if (memcmp(rold, rcur, sizeof(*rold)) == 0)
continue;
+ /* If the ranges were not the same, but everything else was and
+ * we didn't do a variable access into a map then we are a-ok.
+ */
+ if (!env->varlen_map_value_access &&
+ rold->type == rcur->type && rold->imm == rcur->imm)
+ continue;
+
if (rold->type == NOT_INIT ||
(rold->type == UNKNOWN_VALUE && rcur->type != NOT_INIT))
continue;
@@ -2219,9 +2477,9 @@
* the same, check that stored pointers types
* are the same as well.
* Ex: explored safe path could have stored
- * (struct reg_state) {.type = PTR_TO_STACK, .imm = -8}
+ * (bpf_reg_state) {.type = PTR_TO_STACK, .imm = -8}
* but current path has stored:
- * (struct reg_state) {.type = PTR_TO_STACK, .imm = -16}
+ * (bpf_reg_state) {.type = PTR_TO_STACK, .imm = -16}
* such verifier states are not equivalent.
* return false to continue verification of this path
*/
@@ -2232,10 +2490,10 @@
return true;
}
-static int is_state_visited(struct verifier_env *env, int insn_idx)
+static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
{
- struct verifier_state_list *new_sl;
- struct verifier_state_list *sl;
+ struct bpf_verifier_state_list *new_sl;
+ struct bpf_verifier_state_list *sl;
sl = env->explored_states[insn_idx];
if (!sl)
@@ -2245,7 +2503,7 @@
return 0;
while (sl != STATE_LIST_MARK) {
- if (states_equal(&sl->state, &env->cur_state))
+ if (states_equal(env, &sl->state, &env->cur_state))
/* reached equivalent register/stack state,
* prune the search
*/
@@ -2259,7 +2517,7 @@
* it will be rejected. Since there are no loops, we won't be
* seeing this 'insn_idx' instruction again on the way to bpf_exit
*/
- new_sl = kmalloc(sizeof(struct verifier_state_list), GFP_USER);
+ new_sl = kmalloc(sizeof(struct bpf_verifier_state_list), GFP_USER);
if (!new_sl)
return -ENOMEM;
@@ -2270,11 +2528,20 @@
return 0;
}
-static int do_check(struct verifier_env *env)
+static int ext_analyzer_insn_hook(struct bpf_verifier_env *env,
+ int insn_idx, int prev_insn_idx)
{
- struct verifier_state *state = &env->cur_state;
+ if (!env->analyzer_ops || !env->analyzer_ops->insn_hook)
+ return 0;
+
+ return env->analyzer_ops->insn_hook(env, insn_idx, prev_insn_idx);
+}
+
+static int do_check(struct bpf_verifier_env *env)
+{
+ struct bpf_verifier_state *state = &env->cur_state;
struct bpf_insn *insns = env->prog->insnsi;
- struct reg_state *regs = state->regs;
+ struct bpf_reg_state *regs = state->regs;
int insn_cnt = env->prog->len;
int insn_idx, prev_insn_idx = 0;
int insn_processed = 0;
@@ -2282,6 +2549,7 @@
init_reg_state(regs);
insn_idx = 0;
+ env->varlen_map_value_access = false;
for (;;) {
struct bpf_insn *insn;
u8 class;
@@ -2328,13 +2596,17 @@
print_bpf_insn(insn);
}
+ err = ext_analyzer_insn_hook(env, insn_idx, prev_insn_idx);
+ if (err)
+ return err;
+
if (class == BPF_ALU || class == BPF_ALU64) {
err = check_alu_op(env, insn);
if (err)
return err;
} else if (class == BPF_LDX) {
- enum bpf_reg_type src_reg_type;
+ enum bpf_reg_type *prev_src_type, src_reg_type;
/* check for reserved fields is already done */
@@ -2358,22 +2630,25 @@
if (err)
return err;
+ reset_reg_range_values(regs, insn->dst_reg);
if (BPF_SIZE(insn->code) != BPF_W &&
BPF_SIZE(insn->code) != BPF_DW) {
insn_idx++;
continue;
}
- if (insn->imm == 0) {
+ prev_src_type = &env->insn_aux_data[insn_idx].ptr_type;
+
+ if (*prev_src_type == NOT_INIT) {
/* saw a valid insn
* dst_reg = *(u32 *)(src_reg + off)
- * use reserved 'imm' field to mark this insn
+ * save type to validate intersecting paths
*/
- insn->imm = src_reg_type;
+ *prev_src_type = src_reg_type;
- } else if (src_reg_type != insn->imm &&
+ } else if (src_reg_type != *prev_src_type &&
(src_reg_type == PTR_TO_CTX ||
- insn->imm == PTR_TO_CTX)) {
+ *prev_src_type == PTR_TO_CTX)) {
/* ABuser program is trying to use the same insn
* dst_reg = *(u32*) (src_reg + off)
* with different pointer types:
@@ -2386,7 +2661,7 @@
}
} else if (class == BPF_STX) {
- enum bpf_reg_type dst_reg_type;
+ enum bpf_reg_type *prev_dst_type, dst_reg_type;
if (BPF_MODE(insn->code) == BPF_XADD) {
err = check_xadd(env, insn);
@@ -2414,11 +2689,13 @@
if (err)
return err;
- if (insn->imm == 0) {
- insn->imm = dst_reg_type;
- } else if (dst_reg_type != insn->imm &&
+ prev_dst_type = &env->insn_aux_data[insn_idx].ptr_type;
+
+ if (*prev_dst_type == NOT_INIT) {
+ *prev_dst_type = dst_reg_type;
+ } else if (dst_reg_type != *prev_dst_type &&
(dst_reg_type == PTR_TO_CTX ||
- insn->imm == PTR_TO_CTX)) {
+ *prev_dst_type == PTR_TO_CTX)) {
verbose("same insn cannot be used with different pointers\n");
return -EINVAL;
}
@@ -2524,6 +2801,7 @@
verbose("invalid BPF_LD mode\n");
return -EINVAL;
}
+ reset_reg_range_values(regs, insn->dst_reg);
} else {
verbose("unknown insn class %d\n", class);
return -EINVAL;
@@ -2553,7 +2831,7 @@
/* look for pseudo eBPF instructions that access map FDs and
* replace them with actual map pointers
*/
-static int replace_map_fd_with_map_ptr(struct verifier_env *env)
+static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
{
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
@@ -2650,7 +2928,7 @@
}
/* drop refcnt of maps used by the rejected program */
-static void release_maps(struct verifier_env *env)
+static void release_maps(struct bpf_verifier_env *env)
{
int i;
@@ -2659,7 +2937,7 @@
}
/* convert pseudo BPF_LD_IMM64 into generic BPF_LD_IMM64 */
-static void convert_pseudo_ld_imm64(struct verifier_env *env)
+static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
{
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
@@ -2673,21 +2951,37 @@
/* convert load instructions that access fields of 'struct __sk_buff'
* into sequence of instructions that access fields of 'struct sk_buff'
*/
-static int convert_ctx_accesses(struct verifier_env *env)
+static int convert_ctx_accesses(struct bpf_verifier_env *env)
{
- struct bpf_insn *insn = env->prog->insnsi;
- int insn_cnt = env->prog->len;
- struct bpf_insn insn_buf[16];
+ const struct bpf_verifier_ops *ops = env->prog->aux->ops;
+ const int insn_cnt = env->prog->len;
+ struct bpf_insn insn_buf[16], *insn;
struct bpf_prog *new_prog;
enum bpf_access_type type;
- int i;
+ int i, cnt, delta = 0;
- if (!env->prog->aux->ops->convert_ctx_access)
+ if (ops->gen_prologue) {
+ cnt = ops->gen_prologue(insn_buf, env->seen_direct_write,
+ env->prog);
+ if (cnt >= ARRAY_SIZE(insn_buf)) {
+ verbose("bpf verifier is misconfigured\n");
+ return -EINVAL;
+ } else if (cnt) {
+ new_prog = bpf_patch_insn_single(env->prog, 0,
+ insn_buf, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+ env->prog = new_prog;
+ delta += cnt - 1;
+ }
+ }
+
+ if (!ops->convert_ctx_access)
return 0;
- for (i = 0; i < insn_cnt; i++, insn++) {
- u32 insn_delta, cnt;
+ insn = env->prog->insnsi + delta;
+ for (i = 0; i < insn_cnt; i++, insn++) {
if (insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
type = BPF_READ;
@@ -2697,40 +2991,34 @@
else
continue;
- if (insn->imm != PTR_TO_CTX) {
- /* clear internal mark */
- insn->imm = 0;
+ if (env->insn_aux_data[i].ptr_type != PTR_TO_CTX)
continue;
- }
- cnt = env->prog->aux->ops->
- convert_ctx_access(type, insn->dst_reg, insn->src_reg,
- insn->off, insn_buf, env->prog);
+ cnt = ops->convert_ctx_access(type, insn->dst_reg, insn->src_reg,
+ insn->off, insn_buf, env->prog);
if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) {
verbose("bpf verifier is misconfigured\n");
return -EINVAL;
}
- new_prog = bpf_patch_insn_single(env->prog, i, insn_buf, cnt);
+ new_prog = bpf_patch_insn_single(env->prog, i + delta, insn_buf,
+ cnt);
if (!new_prog)
return -ENOMEM;
- insn_delta = cnt - 1;
+ delta += cnt - 1;
/* keep walking new program and skip insns we just inserted */
env->prog = new_prog;
- insn = new_prog->insnsi + i + insn_delta;
-
- insn_cnt += insn_delta;
- i += insn_delta;
+ insn = new_prog->insnsi + i + delta;
}
return 0;
}
-static void free_states(struct verifier_env *env)
+static void free_states(struct bpf_verifier_env *env)
{
- struct verifier_state_list *sl, *sln;
+ struct bpf_verifier_state_list *sl, *sln;
int i;
if (!env->explored_states)
@@ -2753,19 +3041,24 @@
int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
{
char __user *log_ubuf = NULL;
- struct verifier_env *env;
+ struct bpf_verifier_env *env;
int ret = -EINVAL;
if ((*prog)->len <= 0 || (*prog)->len > BPF_MAXINSNS)
return -E2BIG;
- /* 'struct verifier_env' can be global, but since it's not small,
+ /* 'struct bpf_verifier_env' can be global, but since it's not small,
* allocate/free it every time bpf_check() is called
*/
- env = kzalloc(sizeof(struct verifier_env), GFP_KERNEL);
+ env = kzalloc(sizeof(struct bpf_verifier_env), GFP_KERNEL);
if (!env)
return -ENOMEM;
+ env->insn_aux_data = vzalloc(sizeof(struct bpf_insn_aux_data) *
+ (*prog)->len);
+ ret = -ENOMEM;
+ if (!env->insn_aux_data)
+ goto err_free_env;
env->prog = *prog;
/* grab the mutex to protect few globals used by verifier */
@@ -2784,12 +3077,12 @@
/* log_* values have to be sane */
if (log_size < 128 || log_size > UINT_MAX >> 8 ||
log_level == 0 || log_ubuf == NULL)
- goto free_env;
+ goto err_unlock;
ret = -ENOMEM;
log_buf = vmalloc(log_size);
if (!log_buf)
- goto free_env;
+ goto err_unlock;
} else {
log_level = 0;
}
@@ -2799,7 +3092,7 @@
goto skip_full_check;
env->explored_states = kcalloc(env->prog->len,
- sizeof(struct verifier_state_list *),
+ sizeof(struct bpf_verifier_state_list *),
GFP_USER);
ret = -ENOMEM;
if (!env->explored_states)
@@ -2858,14 +3151,67 @@
free_log_buf:
if (log_level)
vfree(log_buf);
-free_env:
if (!env->prog->aux->used_maps)
/* if we didn't copy map pointers into bpf_prog_info, release
* them now. Otherwise free_bpf_prog_info() will release them.
*/
release_maps(env);
*prog = env->prog;
- kfree(env);
+err_unlock:
mutex_unlock(&bpf_verifier_lock);
+ vfree(env->insn_aux_data);
+err_free_env:
+ kfree(env);
return ret;
}
+
+int bpf_analyzer(struct bpf_prog *prog, const struct bpf_ext_analyzer_ops *ops,
+ void *priv)
+{
+ struct bpf_verifier_env *env;
+ int ret;
+
+ env = kzalloc(sizeof(struct bpf_verifier_env), GFP_KERNEL);
+ if (!env)
+ return -ENOMEM;
+
+ env->insn_aux_data = vzalloc(sizeof(struct bpf_insn_aux_data) *
+ prog->len);
+ ret = -ENOMEM;
+ if (!env->insn_aux_data)
+ goto err_free_env;
+ env->prog = prog;
+ env->analyzer_ops = ops;
+ env->analyzer_priv = priv;
+
+ /* grab the mutex to protect few globals used by verifier */
+ mutex_lock(&bpf_verifier_lock);
+
+ log_level = 0;
+
+ env->explored_states = kcalloc(env->prog->len,
+ sizeof(struct bpf_verifier_state_list *),
+ GFP_KERNEL);
+ ret = -ENOMEM;
+ if (!env->explored_states)
+ goto skip_full_check;
+
+ ret = check_cfg(env);
+ if (ret < 0)
+ goto skip_full_check;
+
+ env->allow_ptr_leaks = capable(CAP_SYS_ADMIN);
+
+ ret = do_check(env);
+
+skip_full_check:
+ while (pop_stack(env, NULL) >= 0);
+ free_states(env);
+
+ mutex_unlock(&bpf_verifier_lock);
+ vfree(env->insn_aux_data);
+err_free_env:
+ kfree(env);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(bpf_analyzer);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index d1c51b7..5e8dab5 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6270,6 +6270,12 @@
if (cgroup_sk_alloc_disabled)
return;
+ /* Socket clone path */
+ if (skcd->val) {
+ cgroup_get(sock_cgroup_ptr(skcd));
+ return;
+ }
+
rcu_read_lock();
while (true) {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a7b8c1c..9fc3be0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2496,11 +2496,11 @@
return 0;
}
-static int perf_event_restart(struct perf_event *event)
+static int perf_event_stop(struct perf_event *event, int restart)
{
struct stop_event_data sd = {
.event = event,
- .restart = 1,
+ .restart = restart,
};
int ret = 0;
@@ -3549,10 +3549,18 @@
.group = group,
.ret = 0,
};
- ret = smp_call_function_single(event->oncpu, __perf_event_read, &data, 1);
- /* The event must have been read from an online CPU: */
- WARN_ON_ONCE(ret);
- ret = ret ? : data.ret;
+ /*
+ * Purposely ignore the smp_call_function_single() return
+ * value.
+ *
+ * If event->oncpu isn't a valid CPU it means the event got
+ * scheduled out and that will have updated the event count.
+ *
+ * Therefore, either way, we'll have an up-to-date event count
+ * after this.
+ */
+ (void)smp_call_function_single(event->oncpu, __perf_event_read, &data, 1);
+ ret = data.ret;
} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
struct perf_event_context *ctx = event->ctx;
unsigned long flags;
@@ -4837,6 +4845,19 @@
spin_unlock_irqrestore(&rb->event_lock, flags);
}
+ /*
+ * Avoid racing with perf_mmap_close(AUX): stop the event
+ * before swizzling the event::rb pointer; if it's getting
+ * unmapped, its aux_mmap_count will be 0 and it won't
+ * restart. See the comment in __perf_pmu_output_stop().
+ *
+ * Data will inevitably be lost when set_output is done in
+ * mid-air, but then again, whoever does it like this is
+ * not in for the data anyway.
+ */
+ if (has_aux(event))
+ perf_event_stop(event, 0);
+
rcu_assign_pointer(event->rb, rb);
if (old_rb) {
@@ -6112,7 +6133,7 @@
raw_spin_unlock_irqrestore(&ifh->lock, flags);
if (restart)
- perf_event_restart(event);
+ perf_event_stop(event, 1);
}
void perf_event_exec(void)
@@ -6156,7 +6177,13 @@
/*
* In case of inheritance, it will be the parent that links to the
- * ring-buffer, but it will be the child that's actually using it:
+ * ring-buffer, but it will be the child that's actually using it.
+ *
+ * We are using event::rb to determine if the event should be stopped,
+ * however this may race with ring_buffer_attach() (through set_output),
+ * which will make us skip the event that actually needs to be stopped.
+ * So ring_buffer_attach() has to stop an aux event before re-assigning
+ * its rb pointer.
*/
if (rcu_dereference(parent->rb) == rb)
ro->err = __perf_event_stop(&sd);
@@ -6670,7 +6697,7 @@
raw_spin_unlock_irqrestore(&ifh->lock, flags);
if (restart)
- perf_event_restart(event);
+ perf_event_stop(event, 1);
}
/*
@@ -7933,7 +7960,7 @@
mmput(mm);
restart:
- perf_event_restart(event);
+ perf_event_stop(event, 1);
}
/*
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index ae9b90d..257fa46 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -330,15 +330,22 @@
if (!rb)
return NULL;
- if (!rb_has_aux(rb) || !atomic_inc_not_zero(&rb->aux_refcount))
+ if (!rb_has_aux(rb))
goto err;
/*
- * If rb::aux_mmap_count is zero (and rb_has_aux() above went through),
- * the aux buffer is in perf_mmap_close(), about to get freed.
+ * If aux_mmap_count is zero, the aux buffer is in perf_mmap_close(),
+ * about to get freed, so we leave immediately.
+ *
+ * Checking rb::aux_mmap_count and rb::refcount has to be done in
+ * the same order, see perf_mmap_close. Otherwise we end up freeing
+ * aux pages in this path, which is a bug, because in_atomic().
*/
if (!atomic_read(&rb->aux_mmap_count))
- goto err_put;
+ goto err;
+
+ if (!atomic_inc_not_zero(&rb->aux_refcount))
+ goto err;
/*
* Nesting is not supported for AUX area, make sure nested
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2a906f2..44817c6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2016,6 +2016,28 @@
success = 1; /* we're going to change ->state */
cpu = task_cpu(p);
+ /*
+ * Ensure we load p->on_rq _after_ p->state, otherwise it would
+ * be possible to, falsely, observe p->on_rq == 0 and get stuck
+ * in smp_cond_load_acquire() below.
+ *
+ * sched_ttwu_pending() try_to_wake_up()
+ * [S] p->on_rq = 1; [L] P->state
+ * UNLOCK rq->lock -----.
+ * \
+ * +--- RMB
+ * schedule() /
+ * LOCK rq->lock -----'
+ * UNLOCK rq->lock
+ *
+ * [task p]
+ * [S] p->state = UNINTERRUPTIBLE [L] p->on_rq
+ *
+ * Pairs with the UNLOCK+LOCK on rq->lock from the
+ * last wakeup of our task and the schedule that got our task
+ * current.
+ */
+ smp_rmb();
if (p->on_rq && ttwu_remote(p, wake_flags))
goto stat;
diff --git a/lib/Makefile b/lib/Makefile
index 5dc77a8..df747e5 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,7 @@
sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
flex_proportions.o ratelimit.o show_mem.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
- earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o
+ earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
lib-$(CONFIG_MMU) += ioremap.o
lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 9e8c738..7e3138c 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -291,33 +291,13 @@
}
/*
- * Fault in the first iovec of the given iov_iter, to a maximum length
- * of bytes. Returns 0 on success, or non-zero if the memory could not be
- * accessed (ie. because it is an invalid address).
- *
- * writev-intensive code may want this to prefault several iovecs -- that
- * would be possible (callers must not rely on the fact that _only_ the
- * first iovec will be faulted with the current implementation).
- */
-int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
-{
- if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
- char __user *buf = i->iov->iov_base + i->iov_offset;
- bytes = min(bytes, i->iov->iov_len - i->iov_offset);
- return fault_in_pages_readable(buf, bytes);
- }
- return 0;
-}
-EXPORT_SYMBOL(iov_iter_fault_in_readable);
-
-/*
* Fault in one or more iovecs of the given iov_iter, to a maximum length of
* bytes. For each iovec, fault in each page that constitutes the iovec.
*
* Return 0 on success, or non-zero if the memory could not be accessed (i.e.
* because it is an invalid address).
*/
-int iov_iter_fault_in_multipages_readable(struct iov_iter *i, size_t bytes)
+int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
{
size_t skip = i->iov_offset;
const struct iovec *iov;
@@ -334,7 +314,7 @@
}
return 0;
}
-EXPORT_SYMBOL(iov_iter_fault_in_multipages_readable);
+EXPORT_SYMBOL(iov_iter_fault_in_readable);
void iov_iter_init(struct iov_iter *i, int direction,
const struct iovec *iov, unsigned long nr_segs,
diff --git a/lib/random32.c b/lib/random32.c
index 69ed593..915982b 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -81,7 +81,7 @@
u32 res;
res = prandom_u32_state(state);
- put_cpu_var(state);
+ put_cpu_var(net_rand_state);
return res;
}
@@ -128,7 +128,7 @@
struct rnd_state *state = &get_cpu_var(net_rand_state);
prandom_bytes_state(state, buf, bytes);
- put_cpu_var(state);
+ put_cpu_var(net_rand_state);
}
EXPORT_SYMBOL(prandom_bytes);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 06c2872..32d0ad0 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -378,22 +378,8 @@
schedule_work(&ht->run_work);
}
-static bool rhashtable_check_elasticity(struct rhashtable *ht,
- struct bucket_table *tbl,
- unsigned int hash)
-{
- unsigned int elasticity = ht->elasticity;
- struct rhash_head *head;
-
- rht_for_each(head, tbl, hash)
- if (!--elasticity)
- return true;
-
- return false;
-}
-
-int rhashtable_insert_rehash(struct rhashtable *ht,
- struct bucket_table *tbl)
+static int rhashtable_insert_rehash(struct rhashtable *ht,
+ struct bucket_table *tbl)
{
struct bucket_table *old_tbl;
struct bucket_table *new_tbl;
@@ -439,57 +425,165 @@
return err;
}
-EXPORT_SYMBOL_GPL(rhashtable_insert_rehash);
-struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
- const void *key,
- struct rhash_head *obj,
- struct bucket_table *tbl,
- void **data)
+static void *rhashtable_lookup_one(struct rhashtable *ht,
+ struct bucket_table *tbl, unsigned int hash,
+ const void *key, struct rhash_head *obj)
{
+ struct rhashtable_compare_arg arg = {
+ .ht = ht,
+ .key = key,
+ };
+ struct rhash_head __rcu **pprev;
struct rhash_head *head;
- unsigned int hash;
- int err;
+ int elasticity;
- tbl = rhashtable_last_table(ht, tbl);
- hash = head_hashfn(ht, tbl, obj);
- spin_lock_nested(rht_bucket_lock(tbl, hash), SINGLE_DEPTH_NESTING);
+ elasticity = ht->elasticity;
+ pprev = &tbl->buckets[hash];
+ rht_for_each(head, tbl, hash) {
+ struct rhlist_head *list;
+ struct rhlist_head *plist;
- err = -EEXIST;
- if (key) {
- *data = rhashtable_lookup_fast(ht, key, ht->p);
- if (*data)
- goto exit;
+ elasticity--;
+ if (!key ||
+ (ht->p.obj_cmpfn ?
+ ht->p.obj_cmpfn(&arg, rht_obj(ht, head)) :
+ rhashtable_compare(&arg, rht_obj(ht, head))))
+ continue;
+
+ if (!ht->rhlist)
+ return rht_obj(ht, head);
+
+ list = container_of(obj, struct rhlist_head, rhead);
+ plist = container_of(head, struct rhlist_head, rhead);
+
+ RCU_INIT_POINTER(list->next, plist);
+ head = rht_dereference_bucket(head->next, tbl, hash);
+ RCU_INIT_POINTER(list->rhead.next, head);
+ rcu_assign_pointer(*pprev, obj);
+
+ return NULL;
}
- err = -E2BIG;
+ if (elasticity <= 0)
+ return ERR_PTR(-EAGAIN);
+
+ return ERR_PTR(-ENOENT);
+}
+
+static struct bucket_table *rhashtable_insert_one(struct rhashtable *ht,
+ struct bucket_table *tbl,
+ unsigned int hash,
+ struct rhash_head *obj,
+ void *data)
+{
+ struct bucket_table *new_tbl;
+ struct rhash_head *head;
+
+ if (!IS_ERR_OR_NULL(data))
+ return ERR_PTR(-EEXIST);
+
+ if (PTR_ERR(data) != -EAGAIN && PTR_ERR(data) != -ENOENT)
+ return ERR_CAST(data);
+
+ new_tbl = rcu_dereference(tbl->future_tbl);
+ if (new_tbl)
+ return new_tbl;
+
+ if (PTR_ERR(data) != -ENOENT)
+ return ERR_CAST(data);
+
if (unlikely(rht_grow_above_max(ht, tbl)))
- goto exit;
+ return ERR_PTR(-E2BIG);
- err = -EAGAIN;
- if (rhashtable_check_elasticity(ht, tbl, hash) ||
- rht_grow_above_100(ht, tbl))
- goto exit;
-
- err = 0;
+ if (unlikely(rht_grow_above_100(ht, tbl)))
+ return ERR_PTR(-EAGAIN);
head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
RCU_INIT_POINTER(obj->next, head);
+ if (ht->rhlist) {
+ struct rhlist_head *list;
+
+ list = container_of(obj, struct rhlist_head, rhead);
+ RCU_INIT_POINTER(list->next, NULL);
+ }
rcu_assign_pointer(tbl->buckets[hash], obj);
atomic_inc(&ht->nelems);
+ if (rht_grow_above_75(ht, tbl))
+ schedule_work(&ht->run_work);
-exit:
- spin_unlock(rht_bucket_lock(tbl, hash));
+ return NULL;
+}
- if (err == 0)
- return NULL;
- else if (err == -EAGAIN)
- return tbl;
- else
- return ERR_PTR(err);
+static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
+ struct rhash_head *obj)
+{
+ struct bucket_table *new_tbl;
+ struct bucket_table *tbl;
+ unsigned int hash;
+ spinlock_t *lock;
+ void *data;
+
+ tbl = rcu_dereference(ht->tbl);
+
+ /* All insertions must grab the oldest table containing
+ * the hashed bucket that is yet to be rehashed.
+ */
+ for (;;) {
+ hash = rht_head_hashfn(ht, tbl, obj, ht->p);
+ lock = rht_bucket_lock(tbl, hash);
+ spin_lock_bh(lock);
+
+ if (tbl->rehash <= hash)
+ break;
+
+ spin_unlock_bh(lock);
+ tbl = rcu_dereference(tbl->future_tbl);
+ }
+
+ data = rhashtable_lookup_one(ht, tbl, hash, key, obj);
+ new_tbl = rhashtable_insert_one(ht, tbl, hash, obj, data);
+ if (PTR_ERR(new_tbl) != -EEXIST)
+ data = ERR_CAST(new_tbl);
+
+ while (!IS_ERR_OR_NULL(new_tbl)) {
+ tbl = new_tbl;
+ hash = rht_head_hashfn(ht, tbl, obj, ht->p);
+ spin_lock_nested(rht_bucket_lock(tbl, hash),
+ SINGLE_DEPTH_NESTING);
+
+ data = rhashtable_lookup_one(ht, tbl, hash, key, obj);
+ new_tbl = rhashtable_insert_one(ht, tbl, hash, obj, data);
+ if (PTR_ERR(new_tbl) != -EEXIST)
+ data = ERR_CAST(new_tbl);
+
+ spin_unlock(rht_bucket_lock(tbl, hash));
+ }
+
+ spin_unlock_bh(lock);
+
+ if (PTR_ERR(data) == -EAGAIN)
+ data = ERR_PTR(rhashtable_insert_rehash(ht, tbl) ?:
+ -EAGAIN);
+
+ return data;
+}
+
+void *rhashtable_insert_slow(struct rhashtable *ht, const void *key,
+ struct rhash_head *obj)
+{
+ void *data;
+
+ do {
+ rcu_read_lock();
+ data = rhashtable_try_insert(ht, key, obj);
+ rcu_read_unlock();
+ } while (PTR_ERR(data) == -EAGAIN);
+
+ return data;
}
EXPORT_SYMBOL_GPL(rhashtable_insert_slow);
@@ -593,11 +687,16 @@
void *rhashtable_walk_next(struct rhashtable_iter *iter)
{
struct bucket_table *tbl = iter->walker.tbl;
+ struct rhlist_head *list = iter->list;
struct rhashtable *ht = iter->ht;
struct rhash_head *p = iter->p;
+ bool rhlist = ht->rhlist;
if (p) {
- p = rht_dereference_bucket_rcu(p->next, tbl, iter->slot);
+ if (!rhlist || !(list = rcu_dereference(list->next))) {
+ p = rcu_dereference(p->next);
+ list = container_of(p, struct rhlist_head, rhead);
+ }
goto next;
}
@@ -605,6 +704,18 @@
int skip = iter->skip;
rht_for_each_rcu(p, tbl, iter->slot) {
+ if (rhlist) {
+ list = container_of(p, struct rhlist_head,
+ rhead);
+ do {
+ if (!skip)
+ goto next;
+ skip--;
+ list = rcu_dereference(list->next);
+ } while (list);
+
+ continue;
+ }
if (!skip)
break;
skip--;
@@ -614,7 +725,8 @@
if (!rht_is_a_nulls(p)) {
iter->skip++;
iter->p = p;
- return rht_obj(ht, p);
+ iter->list = list;
+ return rht_obj(ht, rhlist ? &list->rhead : p);
}
iter->skip = 0;
@@ -803,6 +915,48 @@
EXPORT_SYMBOL_GPL(rhashtable_init);
/**
+ * rhltable_init - initialize a new hash list table
+ * @hlt: hash list table to be initialized
+ * @params: configuration parameters
+ *
+ * Initializes a new hash list table.
+ *
+ * See documentation for rhashtable_init.
+ */
+int rhltable_init(struct rhltable *hlt, const struct rhashtable_params *params)
+{
+ int err;
+
+ /* No rhlist NULLs marking for now. */
+ if (params->nulls_base)
+ return -EINVAL;
+
+ err = rhashtable_init(&hlt->ht, params);
+ hlt->ht.rhlist = true;
+ return err;
+}
+EXPORT_SYMBOL_GPL(rhltable_init);
+
+static void rhashtable_free_one(struct rhashtable *ht, struct rhash_head *obj,
+ void (*free_fn)(void *ptr, void *arg),
+ void *arg)
+{
+ struct rhlist_head *list;
+
+ if (!ht->rhlist) {
+ free_fn(rht_obj(ht, obj), arg);
+ return;
+ }
+
+ list = container_of(obj, struct rhlist_head, rhead);
+ do {
+ obj = &list->rhead;
+ list = rht_dereference(list->next, ht);
+ free_fn(rht_obj(ht, obj), arg);
+ } while (list);
+}
+
+/**
* rhashtable_free_and_destroy - free elements and destroy hash table
* @ht: the hash table to destroy
* @free_fn: callback to release resources of element
@@ -839,7 +993,7 @@
pos = next,
next = !rht_is_a_nulls(pos) ?
rht_dereference(pos->next, ht) : NULL)
- free_fn(rht_obj(ht, pos), arg);
+ rhashtable_free_one(ht, pos, free_fn, arg);
}
}
diff --git a/lib/win_minmax.c b/lib/win_minmax.c
new file mode 100644
index 0000000..c8420d4
--- /dev/null
+++ b/lib/win_minmax.c
@@ -0,0 +1,98 @@
+/**
+ * lib/minmax.c: windowed min/max tracker
+ *
+ * Kathleen Nichols' algorithm for tracking the minimum (or maximum)
+ * value of a data stream over some fixed time interval. (E.g.,
+ * the minimum RTT over the past five minutes.) It uses constant
+ * space and constant time per update yet almost always delivers
+ * the same minimum as an implementation that has to keep all the
+ * data in the window.
+ *
+ * The algorithm keeps track of the best, 2nd best & 3rd best min
+ * values, maintaining an invariant that the measurement time of
+ * the n'th best >= n-1'th best. It also makes sure that the three
+ * values are widely separated in the time window since that bounds
+ * the worse case error when that data is monotonically increasing
+ * over the window.
+ *
+ * Upon getting a new min, we can forget everything earlier because
+ * it has no value - the new min is <= everything else in the window
+ * by definition and it's the most recent. So we restart fresh on
+ * every new min and overwrites 2nd & 3rd choices. The same property
+ * holds for 2nd & 3rd best.
+ */
+#include <linux/module.h>
+#include <linux/win_minmax.h>
+
+/* As time advances, update the 1st, 2nd, and 3rd choices. */
+static u32 minmax_subwin_update(struct minmax *m, u32 win,
+ const struct minmax_sample *val)
+{
+ u32 dt = val->t - m->s[0].t;
+
+ if (unlikely(dt > win)) {
+ /*
+ * Passed entire window without a new val so make 2nd
+ * choice the new val & 3rd choice the new 2nd choice.
+ * we may have to iterate this since our 2nd choice
+ * may also be outside the window (we checked on entry
+ * that the third choice was in the window).
+ */
+ m->s[0] = m->s[1];
+ m->s[1] = m->s[2];
+ m->s[2] = *val;
+ if (unlikely(val->t - m->s[0].t > win)) {
+ m->s[0] = m->s[1];
+ m->s[1] = m->s[2];
+ m->s[2] = *val;
+ }
+ } else if (unlikely(m->s[1].t == m->s[0].t) && dt > win/4) {
+ /*
+ * We've passed a quarter of the window without a new val
+ * so take a 2nd choice from the 2nd quarter of the window.
+ */
+ m->s[2] = m->s[1] = *val;
+ } else if (unlikely(m->s[2].t == m->s[1].t) && dt > win/2) {
+ /*
+ * We've passed half the window without finding a new val
+ * so take a 3rd choice from the last half of the window
+ */
+ m->s[2] = *val;
+ }
+ return m->s[0].v;
+}
+
+/* Check if new measurement updates the 1st, 2nd or 3rd choice max. */
+u32 minmax_running_max(struct minmax *m, u32 win, u32 t, u32 meas)
+{
+ struct minmax_sample val = { .t = t, .v = meas };
+
+ if (unlikely(val.v >= m->s[0].v) || /* found new max? */
+ unlikely(val.t - m->s[2].t > win)) /* nothing left in window? */
+ return minmax_reset(m, t, meas); /* forget earlier samples */
+
+ if (unlikely(val.v >= m->s[1].v))
+ m->s[2] = m->s[1] = val;
+ else if (unlikely(val.v >= m->s[2].v))
+ m->s[2] = val;
+
+ return minmax_subwin_update(m, win, &val);
+}
+EXPORT_SYMBOL(minmax_running_max);
+
+/* Check if new measurement updates the 1st, 2nd or 3rd choice min. */
+u32 minmax_running_min(struct minmax *m, u32 win, u32 t, u32 meas)
+{
+ struct minmax_sample val = { .t = t, .v = meas };
+
+ if (unlikely(val.v <= m->s[0].v) || /* found new min? */
+ unlikely(val.t - m->s[2].t > win)) /* nothing left in window? */
+ return minmax_reset(m, t, meas); /* forget earlier samples */
+
+ if (unlikely(val.v <= m->s[1].v))
+ m->s[2] = m->s[1] = val;
+ else if (unlikely(val.v <= m->s[2].v))
+ m->s[2] = val;
+
+ return minmax_subwin_update(m, win, &val);
+}
diff --git a/mm/debug.c b/mm/debug.c
index 8865bfb..74c7cae 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -42,9 +42,11 @@
void __dump_page(struct page *page, const char *reason)
{
+ int mapcount = PageSlab(page) ? 0 : page_mapcount(page);
+
pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx",
- page, page_ref_count(page), page_mapcount(page),
- page->mapping, page->index);
+ page, page_ref_count(page), mapcount,
+ page->mapping, page_to_pgoff(page));
if (PageCompound(page))
pr_cont(" compound_mapcount: %d", compound_mapcount(page));
pr_cont("\n");
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79c52d0..728d779 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -838,7 +838,8 @@
* value (scan code).
*/
-static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address)
+static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
+ struct vm_area_struct **vmap)
{
struct vm_area_struct *vma;
unsigned long hstart, hend;
@@ -846,7 +847,7 @@
if (unlikely(khugepaged_test_exit(mm)))
return SCAN_ANY_PROCESS;
- vma = find_vma(mm, address);
+ *vmap = vma = find_vma(mm, address);
if (!vma)
return SCAN_VMA_NULL;
@@ -881,6 +882,11 @@
.pmd = pmd,
};
+ /* we only decide to swapin, if there is enough young ptes */
+ if (referenced < HPAGE_PMD_NR/2) {
+ trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
+ return false;
+ }
fe.pte = pte_offset_map(pmd, address);
for (; fe.address < address + HPAGE_PMD_NR*PAGE_SIZE;
fe.pte++, fe.address += PAGE_SIZE) {
@@ -888,17 +894,12 @@
if (!is_swap_pte(pteval))
continue;
swapped_in++;
- /* we only decide to swapin, if there is enough young ptes */
- if (referenced < HPAGE_PMD_NR/2) {
- trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
- return false;
- }
ret = do_swap_page(&fe, pteval);
/* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
if (ret & VM_FAULT_RETRY) {
down_read(&mm->mmap_sem);
- if (hugepage_vma_revalidate(mm, address)) {
+ if (hugepage_vma_revalidate(mm, address, &fe.vma)) {
/* vma is no longer available, don't continue to swapin */
trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
return false;
@@ -923,7 +924,6 @@
static void collapse_huge_page(struct mm_struct *mm,
unsigned long address,
struct page **hpage,
- struct vm_area_struct *vma,
int node, int referenced)
{
pmd_t *pmd, _pmd;
@@ -933,6 +933,7 @@
spinlock_t *pmd_ptl, *pte_ptl;
int isolated = 0, result = 0;
struct mem_cgroup *memcg;
+ struct vm_area_struct *vma;
unsigned long mmun_start; /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
gfp_t gfp;
@@ -961,7 +962,7 @@
}
down_read(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result) {
mem_cgroup_cancel_charge(new_page, memcg, true);
up_read(&mm->mmap_sem);
@@ -994,7 +995,7 @@
* handled by the anon_vma lock + PG_lock.
*/
down_write(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result)
goto out;
/* check if the pmd is still valid */
@@ -1202,7 +1203,7 @@
if (ret) {
node = khugepaged_find_target_node();
/* collapse_huge_page will return with the mmap_sem released */
- collapse_huge_page(mm, address, hpage, vma, node, referenced);
+ collapse_huge_page(mm, address, hpage, node, referenced);
}
out:
trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9a6a51a..4be518d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1740,17 +1740,22 @@
static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
struct memcg_stock_pcp *stock;
+ unsigned long flags;
bool ret = false;
if (nr_pages > CHARGE_BATCH)
return ret;
- stock = &get_cpu_var(memcg_stock);
+ local_irq_save(flags);
+
+ stock = this_cpu_ptr(&memcg_stock);
if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
stock->nr_pages -= nr_pages;
ret = true;
}
- put_cpu_var(memcg_stock);
+
+ local_irq_restore(flags);
+
return ret;
}
@@ -1771,15 +1776,18 @@
stock->cached = NULL;
}
-/*
- * This must be called under preempt disabled or must be called by
- * a thread which is pinned to local cpu.
- */
static void drain_local_stock(struct work_struct *dummy)
{
- struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
+ struct memcg_stock_pcp *stock;
+ unsigned long flags;
+
+ local_irq_save(flags);
+
+ stock = this_cpu_ptr(&memcg_stock);
drain_stock(stock);
clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
+
+ local_irq_restore(flags);
}
/*
@@ -1788,14 +1796,19 @@
*/
static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
{
- struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
+ struct memcg_stock_pcp *stock;
+ unsigned long flags;
+ local_irq_save(flags);
+
+ stock = this_cpu_ptr(&memcg_stock);
if (stock->cached != memcg) { /* reset if necessary */
drain_stock(stock);
stock->cached = memcg;
}
stock->nr_pages += nr_pages;
- put_cpu_var(memcg_stock);
+
+ local_irq_restore(flags);
}
/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 41266dc..b58906b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1567,7 +1567,9 @@
return alloc_huge_page_node(page_hstate(compound_head(page)),
next_node_in(nid, nmask));
- node_clear(nid, nmask);
+ if (nid != next_node_in(nid, nmask))
+ node_clear(nid, nmask);
+
if (PageHighMem(page)
|| (zone_idx(page_zone(page)) == ZONE_MOVABLE))
gfp_mask |= __GFP_HIGHMEM;
diff --git a/mm/page_io.c b/mm/page_io.c
index 16bd82fa..eafe5dd 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -264,6 +264,7 @@
int ret;
struct swap_info_struct *sis = page_swap_info(page);
+ BUG_ON(!PageSwapCache(page));
if (sis->flags & SWP_FILE) {
struct kiocb kiocb;
struct file *swap_file = sis->swap_file;
@@ -337,6 +338,7 @@
int ret = 0;
struct swap_info_struct *sis = page_swap_info(page);
+ BUG_ON(!PageSwapCache(page));
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(PageUptodate(page), page);
if (frontswap_load(page) == 0) {
@@ -386,6 +388,7 @@
if (sis->flags & SWP_FILE) {
struct address_space *mapping = sis->swap_file->f_mapping;
+ BUG_ON(!PageSwapCache(page));
return mapping->a_ops->set_page_dirty(page);
} else {
return __set_page_dirty_no_writeback(page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 78cfa29..2657acc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2724,7 +2724,6 @@
struct swap_info_struct *page_swap_info(struct page *page)
{
swp_entry_t swap = { .val = page_private(page) };
- BUG_ON(!PageSwapCache(page));
return swap_info[swp_type(swap)];
}
diff --git a/mm/usercopy.c b/mm/usercopy.c
index 089328f..3c8da0a 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -207,8 +207,11 @@
* Some architectures (arm64) return true for virt_addr_valid() on
* vmalloced addresses. Work around this by checking for vmalloc
* first.
+ *
+ * We also need to check for module addresses explicitly since we
+ * may copy static data from modules to userspace
*/
- if (is_vmalloc_addr(ptr))
+ if (is_vmalloc_or_module_addr(ptr))
return NULL;
if (!virt_addr_valid(ptr))
diff --git a/net/6lowpan/ndisc.c b/net/6lowpan/ndisc.c
index 86450b7..941df2f 100644
--- a/net/6lowpan/ndisc.c
+++ b/net/6lowpan/ndisc.c
@@ -101,8 +101,6 @@
ieee802154_be16_to_le16(&neigh->short_addr, lladdr_short);
if (!lowpan_802154_is_valid_src_short_addr(neigh->short_addr))
neigh->short_addr = cpu_to_le16(IEEE802154_ADDR_SHORT_UNSPEC);
- } else {
- neigh->short_addr = cpu_to_le16(IEEE802154_ADDR_SHORT_UNSPEC);
}
write_unlock_bh(&n->lock);
}
diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 7d17001..ee08540 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -335,7 +335,7 @@
goto out;
skb_reserve(hard_iface->bat_v.elp_skb, ETH_HLEN + NET_IP_ALIGN);
- elp_buff = skb_push(hard_iface->bat_v.elp_skb, BATADV_ELP_HLEN);
+ elp_buff = skb_put(hard_iface->bat_v.elp_skb, BATADV_ELP_HLEN);
elp_packet = (struct batadv_elp_packet *)elp_buff;
memset(elp_packet, 0, BATADV_ELP_HLEN);
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 610f2c4..7e8dc64 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -461,6 +461,29 @@
}
/**
+ * batadv_last_bonding_get - Get last_bonding_candidate of orig_node
+ * @orig_node: originator node whose last bonding candidate should be retrieved
+ *
+ * Return: last bonding candidate of router or NULL if not found
+ *
+ * The object is returned with refcounter increased by 1.
+ */
+static struct batadv_orig_ifinfo *
+batadv_last_bonding_get(struct batadv_orig_node *orig_node)
+{
+ struct batadv_orig_ifinfo *last_bonding_candidate;
+
+ spin_lock_bh(&orig_node->neigh_list_lock);
+ last_bonding_candidate = orig_node->last_bonding_candidate;
+
+ if (last_bonding_candidate)
+ kref_get(&last_bonding_candidate->refcount);
+ spin_unlock_bh(&orig_node->neigh_list_lock);
+
+ return last_bonding_candidate;
+}
+
+/**
* batadv_last_bonding_replace - Replace last_bonding_candidate of orig_node
* @orig_node: originator node whose bonding candidates should be replaced
* @new_candidate: new bonding candidate or NULL
@@ -530,7 +553,7 @@
* router - obviously there are no other candidates.
*/
rcu_read_lock();
- last_candidate = orig_node->last_bonding_candidate;
+ last_candidate = batadv_last_bonding_get(orig_node);
if (last_candidate)
last_cand_router = rcu_dereference(last_candidate->router);
@@ -622,6 +645,9 @@
batadv_orig_ifinfo_put(next_candidate);
}
+ if (last_candidate)
+ batadv_orig_ifinfo_put(last_candidate);
+
return router;
}
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 0b5f729..1aff2da 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -26,11 +26,13 @@
#include <linux/module.h>
#include <linux/debugfs.h>
+#include <linux/stringify.h>
#include <asm/ioctls.h>
#include <net/bluetooth/bluetooth.h>
#include <linux/proc_fs.h>
+#include "leds.h"
#include "selftest.h"
/* Bluetooth sockets */
@@ -712,13 +714,16 @@
struct dentry *bt_debugfs;
EXPORT_SYMBOL_GPL(bt_debugfs);
+#define VERSION __stringify(BT_SUBSYS_VERSION) "." \
+ __stringify(BT_SUBSYS_REVISION)
+
static int __init bt_init(void)
{
int err;
sock_skb_cb_check_size(sizeof(struct bt_skb_cb));
- BT_INFO("Core ver %s", BT_SUBSYS_VERSION);
+ BT_INFO("Core ver %s", VERSION);
err = bt_selftest();
if (err < 0)
@@ -726,6 +731,8 @@
bt_debugfs = debugfs_create_dir("bluetooth", NULL);
+ bt_leds_init();
+
err = bt_sysfs_init();
if (err < 0)
return err;
@@ -785,6 +792,8 @@
bt_sysfs_cleanup();
+ bt_leds_cleanup();
+
debugfs_remove_recursive(bt_debugfs);
}
@@ -792,7 +801,7 @@
module_exit(bt_exit);
MODULE_AUTHOR("Marcel Holtmann <marcel@holtmann.org>");
-MODULE_DESCRIPTION("Bluetooth Core ver " BT_SUBSYS_VERSION);
-MODULE_VERSION(BT_SUBSYS_VERSION);
+MODULE_DESCRIPTION("Bluetooth Core ver " VERSION);
+MODULE_VERSION(VERSION);
MODULE_LICENSE("GPL");
MODULE_ALIAS_NETPROTO(PF_BLUETOOTH);
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index ddf8432..3ac89e9 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -1562,6 +1562,7 @@
auto_off = hci_dev_test_and_clear_flag(hdev, HCI_AUTO_OFF);
if (!auto_off && hdev->dev_type == HCI_PRIMARY &&
+ !hci_dev_test_flag(hdev, HCI_USER_CHANNEL) &&
hci_dev_test_flag(hdev, HCI_MGMT))
__mgmt_power_off(hdev);
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index b0e23df..c813568 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -971,14 +971,14 @@
hci_req_add(req, HCI_OP_LE_SET_ADV_ENABLE, sizeof(enable), &enable);
}
-static u8 create_default_scan_rsp_data(struct hci_dev *hdev, u8 *ptr)
+static u8 append_local_name(struct hci_dev *hdev, u8 *ptr, u8 ad_len)
{
- u8 ad_len = 0;
size_t name_len;
+ int max_len;
+ max_len = HCI_MAX_AD_LENGTH - ad_len - 2;
name_len = strlen(hdev->dev_name);
- if (name_len > 0) {
- size_t max_len = HCI_MAX_AD_LENGTH - ad_len - 2;
+ if (name_len > 0 && max_len > 0) {
if (name_len > max_len) {
name_len = max_len;
@@ -997,22 +997,42 @@
return ad_len;
}
+static u8 create_default_scan_rsp_data(struct hci_dev *hdev, u8 *ptr)
+{
+ return append_local_name(hdev, ptr, 0);
+}
+
static u8 create_instance_scan_rsp_data(struct hci_dev *hdev, u8 instance,
u8 *ptr)
{
struct adv_info *adv_instance;
+ u32 instance_flags;
+ u8 scan_rsp_len = 0;
adv_instance = hci_find_adv_instance(hdev, instance);
if (!adv_instance)
return 0;
- /* TODO: Set the appropriate entries based on advertising instance flags
- * here once flags other than 0 are supported.
- */
+ instance_flags = adv_instance->flags;
+
+ if ((instance_flags & MGMT_ADV_FLAG_APPEARANCE) && hdev->appearance) {
+ ptr[0] = 3;
+ ptr[1] = EIR_APPEARANCE;
+ put_unaligned_le16(hdev->appearance, ptr + 2);
+ scan_rsp_len += 4;
+ ptr += 4;
+ }
+
memcpy(ptr, adv_instance->scan_rsp_data,
adv_instance->scan_rsp_len);
- return adv_instance->scan_rsp_len;
+ scan_rsp_len += adv_instance->scan_rsp_len;
+ ptr += adv_instance->scan_rsp_len;
+
+ if (instance_flags & MGMT_ADV_FLAG_LOCAL_NAME)
+ scan_rsp_len = append_local_name(hdev, ptr, scan_rsp_len);
+
+ return scan_rsp_len;
}
void __hci_req_update_scan_rsp_data(struct hci_request *req, u8 instance)
@@ -1194,7 +1214,7 @@
hci_req_init(&req, hdev);
- hci_req_clear_adv_instance(hdev, &req, instance, false);
+ hci_req_clear_adv_instance(hdev, NULL, &req, instance, false);
if (list_empty(&hdev->adv_instances))
__hci_req_disable_advertising(&req);
@@ -1284,8 +1304,9 @@
* setting.
* - force == false: Only instances that have a timeout will be removed.
*/
-void hci_req_clear_adv_instance(struct hci_dev *hdev, struct hci_request *req,
- u8 instance, bool force)
+void hci_req_clear_adv_instance(struct hci_dev *hdev, struct sock *sk,
+ struct hci_request *req, u8 instance,
+ bool force)
{
struct adv_info *adv_instance, *n, *next_instance = NULL;
int err;
@@ -1311,7 +1332,7 @@
rem_inst = adv_instance->instance;
err = hci_remove_adv_instance(hdev, rem_inst);
if (!err)
- mgmt_advertising_removed(NULL, hdev, rem_inst);
+ mgmt_advertising_removed(sk, hdev, rem_inst);
}
} else {
adv_instance = hci_find_adv_instance(hdev, instance);
@@ -1325,7 +1346,7 @@
err = hci_remove_adv_instance(hdev, instance);
if (!err)
- mgmt_advertising_removed(NULL, hdev, instance);
+ mgmt_advertising_removed(sk, hdev, instance);
}
}
@@ -1716,7 +1737,7 @@
* function. To be safe hard-code one of the
* values that's suitable for SCO.
*/
- rej.reason = HCI_ERROR_REMOTE_LOW_RESOURCES;
+ rej.reason = HCI_ERROR_REJ_LIMITED_RESOURCES;
hci_req_add(req, HCI_OP_REJECT_SYNC_CONN_REQ,
sizeof(rej), &rej);
diff --git a/net/bluetooth/hci_request.h b/net/bluetooth/hci_request.h
index b2d044b..ac1e110 100644
--- a/net/bluetooth/hci_request.h
+++ b/net/bluetooth/hci_request.h
@@ -73,8 +73,9 @@
int __hci_req_schedule_adv_instance(struct hci_request *req, u8 instance,
bool force);
-void hci_req_clear_adv_instance(struct hci_dev *hdev, struct hci_request *req,
- u8 instance, bool force);
+void hci_req_clear_adv_instance(struct hci_dev *hdev, struct sock *sk,
+ struct hci_request *req, u8 instance,
+ bool force);
void __hci_req_update_class(struct hci_request *req);
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 96f04b7..48f9471 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -26,6 +26,7 @@
#include <linux/export.h>
#include <linux/utsname.h>
+#include <linux/sched.h>
#include <asm/unaligned.h>
#include <net/bluetooth/bluetooth.h>
@@ -38,6 +39,8 @@
static LIST_HEAD(mgmt_chan_list);
static DEFINE_MUTEX(mgmt_chan_list_lock);
+static DEFINE_IDA(sock_cookie_ida);
+
static atomic_t monitor_promisc = ATOMIC_INIT(0);
/* ----- HCI socket interface ----- */
@@ -52,6 +55,8 @@
__u32 cmsg_mask;
unsigned short channel;
unsigned long flags;
+ __u32 cookie;
+ char comm[TASK_COMM_LEN];
};
void hci_sock_set_flag(struct sock *sk, int nr)
@@ -74,6 +79,38 @@
return hci_pi(sk)->channel;
}
+u32 hci_sock_get_cookie(struct sock *sk)
+{
+ return hci_pi(sk)->cookie;
+}
+
+static bool hci_sock_gen_cookie(struct sock *sk)
+{
+ int id = hci_pi(sk)->cookie;
+
+ if (!id) {
+ id = ida_simple_get(&sock_cookie_ida, 1, 0, GFP_KERNEL);
+ if (id < 0)
+ id = 0xffffffff;
+
+ hci_pi(sk)->cookie = id;
+ get_task_comm(hci_pi(sk)->comm, current);
+ return true;
+ }
+
+ return false;
+}
+
+static void hci_sock_free_cookie(struct sock *sk)
+{
+ int id = hci_pi(sk)->cookie;
+
+ if (id) {
+ hci_pi(sk)->cookie = 0xffffffff;
+ ida_simple_remove(&sock_cookie_ida, id);
+ }
+}
+
static inline int hci_test_bit(int nr, const void *addr)
{
return *((const __u32 *) addr + (nr >> 5)) & ((__u32) 1 << (nr & 31));
@@ -305,6 +342,60 @@
kfree_skb(skb_copy);
}
+void hci_send_monitor_ctrl_event(struct hci_dev *hdev, u16 event,
+ void *data, u16 data_len, ktime_t tstamp,
+ int flag, struct sock *skip_sk)
+{
+ struct sock *sk;
+ __le16 index;
+
+ if (hdev)
+ index = cpu_to_le16(hdev->id);
+ else
+ index = cpu_to_le16(MGMT_INDEX_NONE);
+
+ read_lock(&hci_sk_list.lock);
+
+ sk_for_each(sk, &hci_sk_list.head) {
+ struct hci_mon_hdr *hdr;
+ struct sk_buff *skb;
+
+ if (hci_pi(sk)->channel != HCI_CHANNEL_CONTROL)
+ continue;
+
+ /* Ignore socket without the flag set */
+ if (!hci_sock_test_flag(sk, flag))
+ continue;
+
+ /* Skip the original socket */
+ if (sk == skip_sk)
+ continue;
+
+ skb = bt_skb_alloc(6 + data_len, GFP_ATOMIC);
+ if (!skb)
+ continue;
+
+ put_unaligned_le32(hci_pi(sk)->cookie, skb_put(skb, 4));
+ put_unaligned_le16(event, skb_put(skb, 2));
+
+ if (data)
+ memcpy(skb_put(skb, data_len), data, data_len);
+
+ skb->tstamp = tstamp;
+
+ hdr = (void *)skb_push(skb, HCI_MON_HDR_SIZE);
+ hdr->opcode = cpu_to_le16(HCI_MON_CTRL_EVENT);
+ hdr->index = index;
+ hdr->len = cpu_to_le16(skb->len - HCI_MON_HDR_SIZE);
+
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+
+ read_unlock(&hci_sk_list.lock);
+}
+
static struct sk_buff *create_monitor_event(struct hci_dev *hdev, int event)
{
struct hci_mon_hdr *hdr;
@@ -384,6 +475,129 @@
return skb;
}
+static struct sk_buff *create_monitor_ctrl_open(struct sock *sk)
+{
+ struct hci_mon_hdr *hdr;
+ struct sk_buff *skb;
+ u16 format;
+ u8 ver[3];
+ u32 flags;
+
+ /* No message needed when cookie is not present */
+ if (!hci_pi(sk)->cookie)
+ return NULL;
+
+ switch (hci_pi(sk)->channel) {
+ case HCI_CHANNEL_RAW:
+ format = 0x0000;
+ ver[0] = BT_SUBSYS_VERSION;
+ put_unaligned_le16(BT_SUBSYS_REVISION, ver + 1);
+ break;
+ case HCI_CHANNEL_USER:
+ format = 0x0001;
+ ver[0] = BT_SUBSYS_VERSION;
+ put_unaligned_le16(BT_SUBSYS_REVISION, ver + 1);
+ break;
+ case HCI_CHANNEL_CONTROL:
+ format = 0x0002;
+ mgmt_fill_version_info(ver);
+ break;
+ default:
+ /* No message for unsupported format */
+ return NULL;
+ }
+
+ skb = bt_skb_alloc(14 + TASK_COMM_LEN , GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ flags = hci_sock_test_flag(sk, HCI_SOCK_TRUSTED) ? 0x1 : 0x0;
+
+ put_unaligned_le32(hci_pi(sk)->cookie, skb_put(skb, 4));
+ put_unaligned_le16(format, skb_put(skb, 2));
+ memcpy(skb_put(skb, sizeof(ver)), ver, sizeof(ver));
+ put_unaligned_le32(flags, skb_put(skb, 4));
+ *skb_put(skb, 1) = TASK_COMM_LEN;
+ memcpy(skb_put(skb, TASK_COMM_LEN), hci_pi(sk)->comm, TASK_COMM_LEN);
+
+ __net_timestamp(skb);
+
+ hdr = (void *)skb_push(skb, HCI_MON_HDR_SIZE);
+ hdr->opcode = cpu_to_le16(HCI_MON_CTRL_OPEN);
+ if (hci_pi(sk)->hdev)
+ hdr->index = cpu_to_le16(hci_pi(sk)->hdev->id);
+ else
+ hdr->index = cpu_to_le16(HCI_DEV_NONE);
+ hdr->len = cpu_to_le16(skb->len - HCI_MON_HDR_SIZE);
+
+ return skb;
+}
+
+static struct sk_buff *create_monitor_ctrl_close(struct sock *sk)
+{
+ struct hci_mon_hdr *hdr;
+ struct sk_buff *skb;
+
+ /* No message needed when cookie is not present */
+ if (!hci_pi(sk)->cookie)
+ return NULL;
+
+ switch (hci_pi(sk)->channel) {
+ case HCI_CHANNEL_RAW:
+ case HCI_CHANNEL_USER:
+ case HCI_CHANNEL_CONTROL:
+ break;
+ default:
+ /* No message for unsupported format */
+ return NULL;
+ }
+
+ skb = bt_skb_alloc(4, GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ put_unaligned_le32(hci_pi(sk)->cookie, skb_put(skb, 4));
+
+ __net_timestamp(skb);
+
+ hdr = (void *)skb_push(skb, HCI_MON_HDR_SIZE);
+ hdr->opcode = cpu_to_le16(HCI_MON_CTRL_CLOSE);
+ if (hci_pi(sk)->hdev)
+ hdr->index = cpu_to_le16(hci_pi(sk)->hdev->id);
+ else
+ hdr->index = cpu_to_le16(HCI_DEV_NONE);
+ hdr->len = cpu_to_le16(skb->len - HCI_MON_HDR_SIZE);
+
+ return skb;
+}
+
+static struct sk_buff *create_monitor_ctrl_command(struct sock *sk, u16 index,
+ u16 opcode, u16 len,
+ const void *buf)
+{
+ struct hci_mon_hdr *hdr;
+ struct sk_buff *skb;
+
+ skb = bt_skb_alloc(6 + len, GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ put_unaligned_le32(hci_pi(sk)->cookie, skb_put(skb, 4));
+ put_unaligned_le16(opcode, skb_put(skb, 2));
+
+ if (buf)
+ memcpy(skb_put(skb, len), buf, len);
+
+ __net_timestamp(skb);
+
+ hdr = (void *)skb_push(skb, HCI_MON_HDR_SIZE);
+ hdr->opcode = cpu_to_le16(HCI_MON_CTRL_COMMAND);
+ hdr->index = cpu_to_le16(index);
+ hdr->len = cpu_to_le16(skb->len - HCI_MON_HDR_SIZE);
+
+ return skb;
+}
+
static void __printf(2, 3)
send_monitor_note(struct sock *sk, const char *fmt, ...)
{
@@ -458,6 +672,26 @@
read_unlock(&hci_dev_list_lock);
}
+static void send_monitor_control_replay(struct sock *mon_sk)
+{
+ struct sock *sk;
+
+ read_lock(&hci_sk_list.lock);
+
+ sk_for_each(sk, &hci_sk_list.head) {
+ struct sk_buff *skb;
+
+ skb = create_monitor_ctrl_open(sk);
+ if (!skb)
+ continue;
+
+ if (sock_queue_rcv_skb(mon_sk, skb))
+ kfree_skb(skb);
+ }
+
+ read_unlock(&hci_sk_list.lock);
+}
+
/* Generate internal stack event */
static void hci_si_event(struct hci_dev *hdev, int type, int dlen, void *data)
{
@@ -585,6 +819,7 @@
{
struct sock *sk = sock->sk;
struct hci_dev *hdev;
+ struct sk_buff *skb;
BT_DBG("sock %p sk %p", sock, sk);
@@ -593,8 +828,24 @@
hdev = hci_pi(sk)->hdev;
- if (hci_pi(sk)->channel == HCI_CHANNEL_MONITOR)
+ switch (hci_pi(sk)->channel) {
+ case HCI_CHANNEL_MONITOR:
atomic_dec(&monitor_promisc);
+ break;
+ case HCI_CHANNEL_RAW:
+ case HCI_CHANNEL_USER:
+ case HCI_CHANNEL_CONTROL:
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_close(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+
+ hci_sock_free_cookie(sk);
+ break;
+ }
bt_sock_unlink(&hci_sk_list, sk);
@@ -721,6 +972,27 @@
goto done;
}
+ /* When calling an ioctl on an unbound raw socket, then ensure
+ * that the monitor gets informed. Ensure that the resulting event
+ * is only send once by checking if the cookie exists or not. The
+ * socket cookie will be only ever generated once for the lifetime
+ * of a given socket.
+ */
+ if (hci_sock_gen_cookie(sk)) {
+ struct sk_buff *skb;
+
+ if (capable(CAP_NET_ADMIN))
+ hci_sock_set_flag(sk, HCI_SOCK_TRUSTED);
+
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_open(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+ }
+
release_sock(sk);
switch (cmd) {
@@ -784,6 +1056,7 @@
struct sockaddr_hci haddr;
struct sock *sk = sock->sk;
struct hci_dev *hdev = NULL;
+ struct sk_buff *skb;
int len, err = 0;
BT_DBG("sock %p sk %p", sock, sk);
@@ -822,7 +1095,35 @@
atomic_inc(&hdev->promisc);
}
+ hci_pi(sk)->channel = haddr.hci_channel;
+
+ if (!hci_sock_gen_cookie(sk)) {
+ /* In the case when a cookie has already been assigned,
+ * then there has been already an ioctl issued against
+ * an unbound socket and with that triggerd an open
+ * notification. Send a close notification first to
+ * allow the state transition to bounded.
+ */
+ skb = create_monitor_ctrl_close(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+ }
+
+ if (capable(CAP_NET_ADMIN))
+ hci_sock_set_flag(sk, HCI_SOCK_TRUSTED);
+
hci_pi(sk)->hdev = hdev;
+
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_open(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
break;
case HCI_CHANNEL_USER:
@@ -884,9 +1185,38 @@
}
}
- atomic_inc(&hdev->promisc);
+ hci_pi(sk)->channel = haddr.hci_channel;
+
+ if (!hci_sock_gen_cookie(sk)) {
+ /* In the case when a cookie has already been assigned,
+ * this socket will transition from a raw socket into
+ * an user channel socket. For a clean transition, send
+ * the close notification first.
+ */
+ skb = create_monitor_ctrl_close(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+ }
+
+ /* The user channel is restricted to CAP_NET_ADMIN
+ * capabilities and with that implicitly trusted.
+ */
+ hci_sock_set_flag(sk, HCI_SOCK_TRUSTED);
hci_pi(sk)->hdev = hdev;
+
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_open(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+
+ atomic_inc(&hdev->promisc);
break;
case HCI_CHANNEL_MONITOR:
@@ -900,6 +1230,8 @@
goto done;
}
+ hci_pi(sk)->channel = haddr.hci_channel;
+
/* The monitor interface is restricted to CAP_NET_RAW
* capabilities and with that implicitly trusted.
*/
@@ -908,9 +1240,10 @@
send_monitor_note(sk, "Linux version %s (%s)",
init_utsname()->release,
init_utsname()->machine);
- send_monitor_note(sk, "Bluetooth subsystem version %s",
- BT_SUBSYS_VERSION);
+ send_monitor_note(sk, "Bluetooth subsystem version %u.%u",
+ BT_SUBSYS_VERSION, BT_SUBSYS_REVISION);
send_monitor_replay(sk);
+ send_monitor_control_replay(sk);
atomic_inc(&monitor_promisc);
break;
@@ -925,6 +1258,8 @@
err = -EPERM;
goto done;
}
+
+ hci_pi(sk)->channel = haddr.hci_channel;
break;
default:
@@ -946,6 +1281,8 @@
if (capable(CAP_NET_ADMIN))
hci_sock_set_flag(sk, HCI_SOCK_TRUSTED);
+ hci_pi(sk)->channel = haddr.hci_channel;
+
/* At the moment the index and unconfigured index events
* are enabled unconditionally. Setting them on each
* socket when binding keeps this functionality. They
@@ -956,16 +1293,40 @@
* received by untrusted users. Example for such events
* are changes to settings, class of device, name etc.
*/
- if (haddr.hci_channel == HCI_CHANNEL_CONTROL) {
+ if (hci_pi(sk)->channel == HCI_CHANNEL_CONTROL) {
+ if (!hci_sock_gen_cookie(sk)) {
+ /* In the case when a cookie has already been
+ * assigned, this socket will transtion from
+ * a raw socket into a control socket. To
+ * allow for a clean transtion, send the
+ * close notification first.
+ */
+ skb = create_monitor_ctrl_close(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+ }
+
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_open(sk);
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+
hci_sock_set_flag(sk, HCI_MGMT_INDEX_EVENTS);
hci_sock_set_flag(sk, HCI_MGMT_UNCONF_INDEX_EVENTS);
- hci_sock_set_flag(sk, HCI_MGMT_GENERIC_EVENTS);
+ hci_sock_set_flag(sk, HCI_MGMT_OPTION_EVENTS);
+ hci_sock_set_flag(sk, HCI_MGMT_SETTING_EVENTS);
+ hci_sock_set_flag(sk, HCI_MGMT_DEV_CLASS_EVENTS);
+ hci_sock_set_flag(sk, HCI_MGMT_LOCAL_NAME_EVENTS);
}
break;
}
-
- hci_pi(sk)->channel = haddr.hci_channel;
sk->sk_state = BT_BOUND;
done:
@@ -1133,6 +1494,19 @@
goto done;
}
+ if (chan->channel == HCI_CHANNEL_CONTROL) {
+ struct sk_buff *skb;
+
+ /* Send event to monitor */
+ skb = create_monitor_ctrl_command(sk, index, opcode, len,
+ buf + sizeof(*hdr));
+ if (skb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, skb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(skb);
+ }
+ }
+
if (opcode >= chan->handler_count ||
chan->handlers[opcode].func == NULL) {
BT_DBG("Unknown op %u", opcode);
@@ -1440,6 +1814,9 @@
BT_DBG("sk %p, opt %d", sk, optname);
+ if (level != SOL_HCI)
+ return -ENOPROTOOPT;
+
lock_sock(sk);
if (hci_pi(sk)->channel != HCI_CHANNEL_RAW) {
@@ -1523,6 +1900,9 @@
BT_DBG("sk %p, opt %d", sk, optname);
+ if (level != SOL_HCI)
+ return -ENOPROTOOPT;
+
if (get_user(len, optlen))
return -EFAULT;
diff --git a/net/bluetooth/leds.c b/net/bluetooth/leds.c
index 8319c84..cb670b5 100644
--- a/net/bluetooth/leds.c
+++ b/net/bluetooth/leds.c
@@ -11,6 +11,8 @@
#include "leds.h"
+DEFINE_LED_TRIGGER(bt_power_led_trigger);
+
struct hci_basic_led_trigger {
struct led_trigger led_trigger;
struct hci_dev *hdev;
@@ -24,6 +26,21 @@
if (hdev->power_led)
led_trigger_event(hdev->power_led,
enabled ? LED_FULL : LED_OFF);
+
+ if (!enabled) {
+ struct hci_dev *d;
+
+ read_lock(&hci_dev_list_lock);
+
+ list_for_each_entry(d, &hci_dev_list, list) {
+ if (test_bit(HCI_UP, &d->flags))
+ enabled = true;
+ }
+
+ read_unlock(&hci_dev_list_lock);
+ }
+
+ led_trigger_event(bt_power_led_trigger, enabled ? LED_FULL : LED_OFF);
}
static void power_activate(struct led_classdev *led_cdev)
@@ -72,3 +89,13 @@
/* initialize power_led */
hdev->power_led = led_allocate_basic(hdev, power_activate, "power");
}
+
+void bt_leds_init(void)
+{
+ led_trigger_register_simple("bluetooth-power", &bt_power_led_trigger);
+}
+
+void bt_leds_cleanup(void)
+{
+ led_trigger_unregister_simple(bt_power_led_trigger);
+}
diff --git a/net/bluetooth/leds.h b/net/bluetooth/leds.h
index a9c4d6e..08725a2 100644
--- a/net/bluetooth/leds.h
+++ b/net/bluetooth/leds.h
@@ -7,10 +7,20 @@
*/
#if IS_ENABLED(CONFIG_BT_LEDS)
+
void hci_leds_update_powered(struct hci_dev *hdev, bool enabled);
void hci_leds_init(struct hci_dev *hdev);
+
+void bt_leds_init(void);
+void bt_leds_cleanup(void);
+
#else
+
static inline void hci_leds_update_powered(struct hci_dev *hdev,
bool enabled) {}
static inline void hci_leds_init(struct hci_dev *hdev) {}
+
+static inline void bt_leds_init(void) {}
+static inline void bt_leds_cleanup(void) {}
+
#endif
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index 7639290..19b8a5e 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -38,7 +38,7 @@
#include "mgmt_util.h"
#define MGMT_VERSION 1
-#define MGMT_REVISION 13
+#define MGMT_REVISION 14
static const u16 mgmt_commands[] = {
MGMT_OP_READ_INDEX_LIST,
@@ -104,6 +104,8 @@
MGMT_OP_REMOVE_ADVERTISING,
MGMT_OP_GET_ADV_SIZE_INFO,
MGMT_OP_START_LIMITED_DISCOVERY,
+ MGMT_OP_READ_EXT_INFO,
+ MGMT_OP_SET_APPEARANCE,
};
static const u16 mgmt_events[] = {
@@ -141,6 +143,7 @@
MGMT_EV_LOCAL_OOB_DATA_UPDATED,
MGMT_EV_ADVERTISING_ADDED,
MGMT_EV_ADVERTISING_REMOVED,
+ MGMT_EV_EXT_INFO_CHANGED,
};
static const u16 mgmt_untrusted_commands[] = {
@@ -149,6 +152,7 @@
MGMT_OP_READ_UNCONF_INDEX_LIST,
MGMT_OP_READ_CONFIG_INFO,
MGMT_OP_READ_EXT_INDEX_LIST,
+ MGMT_OP_READ_EXT_INFO,
};
static const u16 mgmt_untrusted_events[] = {
@@ -162,6 +166,7 @@
MGMT_EV_NEW_CONFIG_OPTIONS,
MGMT_EV_EXT_INDEX_ADDED,
MGMT_EV_EXT_INDEX_REMOVED,
+ MGMT_EV_EXT_INFO_CHANGED,
};
#define CACHE_TIMEOUT msecs_to_jiffies(2 * 1000)
@@ -256,13 +261,6 @@
flag, skip_sk);
}
-static int mgmt_generic_event(u16 event, struct hci_dev *hdev, void *data,
- u16 len, struct sock *skip_sk)
-{
- return mgmt_send_event(event, hdev, HCI_CHANNEL_CONTROL, data, len,
- HCI_MGMT_GENERIC_EVENTS, skip_sk);
-}
-
static int mgmt_event(u16 event, struct hci_dev *hdev, void *data, u16 len,
struct sock *skip_sk)
{
@@ -278,6 +276,14 @@
return ADDR_LE_DEV_RANDOM;
}
+void mgmt_fill_version_info(void *ver)
+{
+ struct mgmt_rp_read_version *rp = ver;
+
+ rp->version = MGMT_VERSION;
+ rp->revision = cpu_to_le16(MGMT_REVISION);
+}
+
static int read_version(struct sock *sk, struct hci_dev *hdev, void *data,
u16 data_len)
{
@@ -285,8 +291,7 @@
BT_DBG("sock %p", sk);
- rp.version = MGMT_VERSION;
- rp.revision = cpu_to_le16(MGMT_REVISION);
+ mgmt_fill_version_info(&rp);
return mgmt_cmd_complete(sk, MGMT_INDEX_NONE, MGMT_OP_READ_VERSION, 0,
&rp, sizeof(rp));
@@ -572,8 +577,8 @@
{
__le32 options = get_missing_options(hdev);
- return mgmt_generic_event(MGMT_EV_NEW_CONFIG_OPTIONS, hdev, &options,
- sizeof(options), skip);
+ return mgmt_limited_event(MGMT_EV_NEW_CONFIG_OPTIONS, hdev, &options,
+ sizeof(options), HCI_MGMT_OPTION_EVENTS, skip);
}
static int send_options_rsp(struct sock *sk, u16 opcode, struct hci_dev *hdev)
@@ -862,6 +867,107 @@
sizeof(rp));
}
+static inline u16 eir_append_data(u8 *eir, u16 eir_len, u8 type, u8 *data,
+ u8 data_len)
+{
+ eir[eir_len++] = sizeof(type) + data_len;
+ eir[eir_len++] = type;
+ memcpy(&eir[eir_len], data, data_len);
+ eir_len += data_len;
+
+ return eir_len;
+}
+
+static inline u16 eir_append_le16(u8 *eir, u16 eir_len, u8 type, u16 data)
+{
+ eir[eir_len++] = sizeof(type) + sizeof(data);
+ eir[eir_len++] = type;
+ put_unaligned_le16(data, &eir[eir_len]);
+ eir_len += sizeof(data);
+
+ return eir_len;
+}
+
+static u16 append_eir_data_to_buf(struct hci_dev *hdev, u8 *eir)
+{
+ u16 eir_len = 0;
+ size_t name_len;
+
+ if (hci_dev_test_flag(hdev, HCI_BREDR_ENABLED))
+ eir_len = eir_append_data(eir, eir_len, EIR_CLASS_OF_DEV,
+ hdev->dev_class, 3);
+
+ if (hci_dev_test_flag(hdev, HCI_LE_ENABLED))
+ eir_len = eir_append_le16(eir, eir_len, EIR_APPEARANCE,
+ hdev->appearance);
+
+ name_len = strlen(hdev->dev_name);
+ eir_len = eir_append_data(eir, eir_len, EIR_NAME_COMPLETE,
+ hdev->dev_name, name_len);
+
+ name_len = strlen(hdev->short_name);
+ eir_len = eir_append_data(eir, eir_len, EIR_NAME_SHORT,
+ hdev->short_name, name_len);
+
+ return eir_len;
+}
+
+static int read_ext_controller_info(struct sock *sk, struct hci_dev *hdev,
+ void *data, u16 data_len)
+{
+ char buf[512];
+ struct mgmt_rp_read_ext_info *rp = (void *)buf;
+ u16 eir_len;
+
+ BT_DBG("sock %p %s", sk, hdev->name);
+
+ memset(&buf, 0, sizeof(buf));
+
+ hci_dev_lock(hdev);
+
+ bacpy(&rp->bdaddr, &hdev->bdaddr);
+
+ rp->version = hdev->hci_ver;
+ rp->manufacturer = cpu_to_le16(hdev->manufacturer);
+
+ rp->supported_settings = cpu_to_le32(get_supported_settings(hdev));
+ rp->current_settings = cpu_to_le32(get_current_settings(hdev));
+
+
+ eir_len = append_eir_data_to_buf(hdev, rp->eir);
+ rp->eir_len = cpu_to_le16(eir_len);
+
+ hci_dev_unlock(hdev);
+
+ /* If this command is called at least once, then the events
+ * for class of device and local name changes are disabled
+ * and only the new extended controller information event
+ * is used.
+ */
+ hci_sock_set_flag(sk, HCI_MGMT_EXT_INFO_EVENTS);
+ hci_sock_clear_flag(sk, HCI_MGMT_DEV_CLASS_EVENTS);
+ hci_sock_clear_flag(sk, HCI_MGMT_LOCAL_NAME_EVENTS);
+
+ return mgmt_cmd_complete(sk, hdev->id, MGMT_OP_READ_EXT_INFO, 0, rp,
+ sizeof(*rp) + eir_len);
+}
+
+static int ext_info_changed(struct hci_dev *hdev, struct sock *skip)
+{
+ char buf[512];
+ struct mgmt_ev_ext_info_changed *ev = (void *)buf;
+ u16 eir_len;
+
+ memset(buf, 0, sizeof(buf));
+
+ eir_len = append_eir_data_to_buf(hdev, ev->eir);
+ ev->eir_len = cpu_to_le16(eir_len);
+
+ return mgmt_limited_event(MGMT_EV_EXT_INFO_CHANGED, hdev, ev,
+ sizeof(*ev) + eir_len,
+ HCI_MGMT_EXT_INFO_EVENTS, skip);
+}
+
static int send_settings_rsp(struct sock *sk, u16 opcode, struct hci_dev *hdev)
{
__le32 settings = cpu_to_le32(get_current_settings(hdev));
@@ -922,7 +1028,7 @@
hci_req_add(&req, HCI_OP_WRITE_SCAN_ENABLE, 1, &scan);
}
- hci_req_clear_adv_instance(hdev, NULL, 0x00, false);
+ hci_req_clear_adv_instance(hdev, NULL, NULL, 0x00, false);
if (hci_dev_test_flag(hdev, HCI_LE_ADV))
__hci_req_disable_advertising(&req);
@@ -1000,8 +1106,8 @@
{
__le32 ev = cpu_to_le32(get_current_settings(hdev));
- return mgmt_generic_event(MGMT_EV_NEW_SETTINGS, hdev, &ev,
- sizeof(ev), skip);
+ return mgmt_limited_event(MGMT_EV_NEW_SETTINGS, hdev, &ev,
+ sizeof(ev), HCI_MGMT_SETTING_EVENTS, skip);
}
int mgmt_new_settings(struct hci_dev *hdev)
@@ -1690,7 +1796,7 @@
enabled = lmp_host_le_capable(hdev);
if (!val)
- hci_req_clear_adv_instance(hdev, NULL, 0x00, true);
+ hci_req_clear_adv_instance(hdev, NULL, NULL, 0x00, true);
if (!hdev_is_powered(hdev) || val == enabled) {
bool changed = false;
@@ -2435,6 +2541,8 @@
if (!cmd)
return -ENOMEM;
+ cmd->cmd_complete = addr_cmd_complete;
+
err = hci_send_cmd(hdev, HCI_OP_PIN_CODE_NEG_REPLY,
sizeof(cp->addr.bdaddr), &cp->addr.bdaddr);
if (err < 0)
@@ -2513,8 +2621,8 @@
BT_DBG("");
if (cp->io_capability > SMP_IO_KEYBOARD_DISPLAY)
- return mgmt_cmd_complete(sk, hdev->id, MGMT_OP_SET_IO_CAPABILITY,
- MGMT_STATUS_INVALID_PARAMS, NULL, 0);
+ return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_IO_CAPABILITY,
+ MGMT_STATUS_INVALID_PARAMS);
hci_dev_lock(hdev);
@@ -2932,6 +3040,35 @@
HCI_OP_USER_PASSKEY_NEG_REPLY, 0);
}
+static void adv_expire(struct hci_dev *hdev, u32 flags)
+{
+ struct adv_info *adv_instance;
+ struct hci_request req;
+ int err;
+
+ adv_instance = hci_find_adv_instance(hdev, hdev->cur_adv_instance);
+ if (!adv_instance)
+ return;
+
+ /* stop if current instance doesn't need to be changed */
+ if (!(adv_instance->flags & flags))
+ return;
+
+ cancel_adv_timeout(hdev);
+
+ adv_instance = hci_get_next_instance(hdev, adv_instance->instance);
+ if (!adv_instance)
+ return;
+
+ hci_req_init(&req, hdev);
+ err = __hci_req_schedule_adv_instance(&req, adv_instance->instance,
+ true);
+ if (err)
+ return;
+
+ hci_req_run(&req, NULL);
+}
+
static void set_name_complete(struct hci_dev *hdev, u8 status, u16 opcode)
{
struct mgmt_cp_set_local_name *cp;
@@ -2947,13 +3084,17 @@
cp = cmd->param;
- if (status)
+ if (status) {
mgmt_cmd_status(cmd->sk, hdev->id, MGMT_OP_SET_LOCAL_NAME,
mgmt_status(status));
- else
+ } else {
mgmt_cmd_complete(cmd->sk, hdev->id, MGMT_OP_SET_LOCAL_NAME, 0,
cp, sizeof(*cp));
+ if (hci_dev_test_flag(hdev, HCI_LE_ADV))
+ adv_expire(hdev, MGMT_ADV_FLAG_LOCAL_NAME);
+ }
+
mgmt_pending_remove(cmd);
unlock:
@@ -2993,8 +3134,9 @@
if (err < 0)
goto failed;
- err = mgmt_generic_event(MGMT_EV_LOCAL_NAME_CHANGED, hdev,
- data, len, sk);
+ err = mgmt_limited_event(MGMT_EV_LOCAL_NAME_CHANGED, hdev, data,
+ len, HCI_MGMT_LOCAL_NAME_EVENTS, sk);
+ ext_info_changed(hdev, sk);
goto failed;
}
@@ -3017,7 +3159,7 @@
/* The name is stored in the scan response data and so
* no need to udpate the advertising data here.
*/
- if (lmp_le_capable(hdev))
+ if (lmp_le_capable(hdev) && hci_dev_test_flag(hdev, HCI_ADVERTISING))
__hci_req_update_scan_rsp_data(&req, hdev->cur_adv_instance);
err = hci_req_run(&req, set_name_complete);
@@ -3029,6 +3171,40 @@
return err;
}
+static int set_appearance(struct sock *sk, struct hci_dev *hdev, void *data,
+ u16 len)
+{
+ struct mgmt_cp_set_appearance *cp = data;
+ u16 apperance;
+ int err;
+
+ BT_DBG("");
+
+ if (!lmp_le_capable(hdev))
+ return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_APPEARANCE,
+ MGMT_STATUS_NOT_SUPPORTED);
+
+ apperance = le16_to_cpu(cp->appearance);
+
+ hci_dev_lock(hdev);
+
+ if (hdev->appearance != apperance) {
+ hdev->appearance = apperance;
+
+ if (hci_dev_test_flag(hdev, HCI_LE_ADV))
+ adv_expire(hdev, MGMT_ADV_FLAG_APPEARANCE);
+
+ ext_info_changed(hdev, sk);
+ }
+
+ err = mgmt_cmd_complete(sk, hdev->id, MGMT_OP_SET_APPEARANCE, 0, NULL,
+ 0);
+
+ hci_dev_unlock(hdev);
+
+ return err;
+}
+
static void read_local_oob_data_complete(struct hci_dev *hdev, u8 status,
u16 opcode, struct sk_buff *skb)
{
@@ -4869,7 +5045,7 @@
int err;
memset(&rp, 0, sizeof(rp));
- memcpy(&rp.addr, &cmd->param, sizeof(rp.addr));
+ memcpy(&rp.addr, cmd->param, sizeof(rp.addr));
if (status)
goto complete;
@@ -5501,17 +5677,6 @@
return err;
}
-static inline u16 eir_append_data(u8 *eir, u16 eir_len, u8 type, u8 *data,
- u8 data_len)
-{
- eir[eir_len++] = sizeof(type) + data_len;
- eir[eir_len++] = type;
- memcpy(&eir[eir_len], data, data_len);
- eir_len += data_len;
-
- return eir_len;
-}
-
static void read_local_oob_ext_data_complete(struct hci_dev *hdev, u8 status,
u16 opcode, struct sk_buff *skb)
{
@@ -5815,6 +5980,8 @@
flags |= MGMT_ADV_FLAG_DISCOV;
flags |= MGMT_ADV_FLAG_LIMITED_DISCOV;
flags |= MGMT_ADV_FLAG_MANAGED_FLAGS;
+ flags |= MGMT_ADV_FLAG_APPEARANCE;
+ flags |= MGMT_ADV_FLAG_LOCAL_NAME;
if (hdev->adv_tx_power != HCI_TX_POWER_INVALID)
flags |= MGMT_ADV_FLAG_TX_POWER;
@@ -5871,28 +6038,59 @@
return err;
}
-static bool tlv_data_is_valid(struct hci_dev *hdev, u32 adv_flags, u8 *data,
- u8 len, bool is_adv_data)
+static u8 tlv_data_max_len(u32 adv_flags, bool is_adv_data)
{
u8 max_len = HCI_MAX_AD_LENGTH;
- int i, cur_len;
- bool flags_managed = false;
- bool tx_power_managed = false;
if (is_adv_data) {
if (adv_flags & (MGMT_ADV_FLAG_DISCOV |
MGMT_ADV_FLAG_LIMITED_DISCOV |
- MGMT_ADV_FLAG_MANAGED_FLAGS)) {
- flags_managed = true;
+ MGMT_ADV_FLAG_MANAGED_FLAGS))
max_len -= 3;
- }
- if (adv_flags & MGMT_ADV_FLAG_TX_POWER) {
- tx_power_managed = true;
+ if (adv_flags & MGMT_ADV_FLAG_TX_POWER)
max_len -= 3;
- }
+ } else {
+ /* at least 1 byte of name should fit in */
+ if (adv_flags & MGMT_ADV_FLAG_LOCAL_NAME)
+ max_len -= 3;
+
+ if (adv_flags & (MGMT_ADV_FLAG_APPEARANCE))
+ max_len -= 4;
}
+ return max_len;
+}
+
+static bool flags_managed(u32 adv_flags)
+{
+ return adv_flags & (MGMT_ADV_FLAG_DISCOV |
+ MGMT_ADV_FLAG_LIMITED_DISCOV |
+ MGMT_ADV_FLAG_MANAGED_FLAGS);
+}
+
+static bool tx_power_managed(u32 adv_flags)
+{
+ return adv_flags & MGMT_ADV_FLAG_TX_POWER;
+}
+
+static bool name_managed(u32 adv_flags)
+{
+ return adv_flags & MGMT_ADV_FLAG_LOCAL_NAME;
+}
+
+static bool appearance_managed(u32 adv_flags)
+{
+ return adv_flags & MGMT_ADV_FLAG_APPEARANCE;
+}
+
+static bool tlv_data_is_valid(u32 adv_flags, u8 *data, u8 len, bool is_adv_data)
+{
+ int i, cur_len;
+ u8 max_len;
+
+ max_len = tlv_data_max_len(adv_flags, is_adv_data);
+
if (len > max_len)
return false;
@@ -5900,10 +6098,21 @@
for (i = 0, cur_len = 0; i < len; i += (cur_len + 1)) {
cur_len = data[i];
- if (flags_managed && data[i + 1] == EIR_FLAGS)
+ if (data[i + 1] == EIR_FLAGS &&
+ (!is_adv_data || flags_managed(adv_flags)))
return false;
- if (tx_power_managed && data[i + 1] == EIR_TX_POWER)
+ if (data[i + 1] == EIR_TX_POWER && tx_power_managed(adv_flags))
+ return false;
+
+ if (data[i + 1] == EIR_NAME_COMPLETE && name_managed(adv_flags))
+ return false;
+
+ if (data[i + 1] == EIR_NAME_SHORT && name_managed(adv_flags))
+ return false;
+
+ if (data[i + 1] == EIR_APPEARANCE &&
+ appearance_managed(adv_flags))
return false;
/* If the current field length would exceed the total data
@@ -6027,8 +6236,8 @@
goto unlock;
}
- if (!tlv_data_is_valid(hdev, flags, cp->data, cp->adv_data_len, true) ||
- !tlv_data_is_valid(hdev, flags, cp->data + cp->adv_data_len,
+ if (!tlv_data_is_valid(flags, cp->data, cp->adv_data_len, true) ||
+ !tlv_data_is_valid(flags, cp->data + cp->adv_data_len,
cp->scan_rsp_len, false)) {
err = mgmt_cmd_status(sk, hdev->id, MGMT_OP_ADD_ADVERTISING,
MGMT_STATUS_INVALID_PARAMS);
@@ -6175,7 +6384,7 @@
hci_req_init(&req, hdev);
- hci_req_clear_adv_instance(hdev, &req, cp->instance, true);
+ hci_req_clear_adv_instance(hdev, sk, &req, cp->instance, true);
if (list_empty(&hdev->adv_instances))
__hci_req_disable_advertising(&req);
@@ -6211,23 +6420,6 @@
return err;
}
-static u8 tlv_data_max_len(u32 adv_flags, bool is_adv_data)
-{
- u8 max_len = HCI_MAX_AD_LENGTH;
-
- if (is_adv_data) {
- if (adv_flags & (MGMT_ADV_FLAG_DISCOV |
- MGMT_ADV_FLAG_LIMITED_DISCOV |
- MGMT_ADV_FLAG_MANAGED_FLAGS))
- max_len -= 3;
-
- if (adv_flags & MGMT_ADV_FLAG_TX_POWER)
- max_len -= 3;
- }
-
- return max_len;
-}
-
static int get_adv_size_info(struct sock *sk, struct hci_dev *hdev,
void *data, u16 data_len)
{
@@ -6356,6 +6548,9 @@
{ remove_advertising, MGMT_REMOVE_ADVERTISING_SIZE },
{ get_adv_size_info, MGMT_GET_ADV_SIZE_INFO_SIZE },
{ start_limited_discovery, MGMT_START_DISCOVERY_SIZE },
+ { read_ext_controller_info,MGMT_READ_EXT_INFO_SIZE,
+ HCI_MGMT_UNTRUSTED },
+ { set_appearance, MGMT_SET_APPEARANCE_SIZE },
};
void mgmt_index_added(struct hci_dev *hdev)
@@ -6494,9 +6689,12 @@
mgmt_pending_foreach(0, hdev, cmd_complete_rsp, &status);
- if (memcmp(hdev->dev_class, zero_cod, sizeof(zero_cod)) != 0)
- mgmt_generic_event(MGMT_EV_CLASS_OF_DEV_CHANGED, hdev,
- zero_cod, sizeof(zero_cod), NULL);
+ if (memcmp(hdev->dev_class, zero_cod, sizeof(zero_cod)) != 0) {
+ mgmt_limited_event(MGMT_EV_CLASS_OF_DEV_CHANGED, hdev,
+ zero_cod, sizeof(zero_cod),
+ HCI_MGMT_DEV_CLASS_EVENTS, NULL);
+ ext_info_changed(hdev, NULL);
+ }
new_settings(hdev, match.sk);
@@ -7092,9 +7290,11 @@
mgmt_pending_foreach(MGMT_OP_ADD_UUID, hdev, sk_lookup, &match);
mgmt_pending_foreach(MGMT_OP_REMOVE_UUID, hdev, sk_lookup, &match);
- if (!status)
- mgmt_generic_event(MGMT_EV_CLASS_OF_DEV_CHANGED, hdev,
- dev_class, 3, NULL);
+ if (!status) {
+ mgmt_limited_event(MGMT_EV_CLASS_OF_DEV_CHANGED, hdev, dev_class,
+ 3, HCI_MGMT_DEV_CLASS_EVENTS, NULL);
+ ext_info_changed(hdev, NULL);
+ }
if (match.sk)
sock_put(match.sk);
@@ -7123,8 +7323,9 @@
return;
}
- mgmt_generic_event(MGMT_EV_LOCAL_NAME_CHANGED, hdev, &ev, sizeof(ev),
- cmd ? cmd->sk : NULL);
+ mgmt_limited_event(MGMT_EV_LOCAL_NAME_CHANGED, hdev, &ev, sizeof(ev),
+ HCI_MGMT_LOCAL_NAME_EVENTS, cmd ? cmd->sk : NULL);
+ ext_info_changed(hdev, cmd ? cmd->sk : NULL);
}
static inline bool has_uuid(u8 *uuid, u16 uuid_count, u8 (*uuids)[16])
diff --git a/net/bluetooth/mgmt_util.c b/net/bluetooth/mgmt_util.c
index 8c30c7e..c933bd0 100644
--- a/net/bluetooth/mgmt_util.c
+++ b/net/bluetooth/mgmt_util.c
@@ -21,12 +21,41 @@
SOFTWARE IS DISCLAIMED.
*/
+#include <asm/unaligned.h>
+
#include <net/bluetooth/bluetooth.h>
#include <net/bluetooth/hci_core.h>
+#include <net/bluetooth/hci_mon.h>
#include <net/bluetooth/mgmt.h>
#include "mgmt_util.h"
+static struct sk_buff *create_monitor_ctrl_event(__le16 index, u32 cookie,
+ u16 opcode, u16 len, void *buf)
+{
+ struct hci_mon_hdr *hdr;
+ struct sk_buff *skb;
+
+ skb = bt_skb_alloc(6 + len, GFP_ATOMIC);
+ if (!skb)
+ return NULL;
+
+ put_unaligned_le32(cookie, skb_put(skb, 4));
+ put_unaligned_le16(opcode, skb_put(skb, 2));
+
+ if (buf)
+ memcpy(skb_put(skb, len), buf, len);
+
+ __net_timestamp(skb);
+
+ hdr = (void *)skb_push(skb, HCI_MON_HDR_SIZE);
+ hdr->opcode = cpu_to_le16(HCI_MON_CTRL_EVENT);
+ hdr->index = index;
+ hdr->len = cpu_to_le16(skb->len - HCI_MON_HDR_SIZE);
+
+ return skb;
+}
+
int mgmt_send_event(u16 event, struct hci_dev *hdev, unsigned short channel,
void *data, u16 data_len, int flag, struct sock *skip_sk)
{
@@ -52,14 +81,18 @@
__net_timestamp(skb);
hci_send_to_channel(channel, skb, flag, skip_sk);
- kfree_skb(skb);
+ if (channel == HCI_CHANNEL_CONTROL)
+ hci_send_monitor_ctrl_event(hdev, event, data, data_len,
+ skb_get_ktime(skb), flag, skip_sk);
+
+ kfree_skb(skb);
return 0;
}
int mgmt_cmd_status(struct sock *sk, u16 index, u16 cmd, u8 status)
{
- struct sk_buff *skb;
+ struct sk_buff *skb, *mskb;
struct mgmt_hdr *hdr;
struct mgmt_ev_cmd_status *ev;
int err;
@@ -80,17 +113,30 @@
ev->status = status;
ev->opcode = cpu_to_le16(cmd);
+ mskb = create_monitor_ctrl_event(hdr->index, hci_sock_get_cookie(sk),
+ MGMT_EV_CMD_STATUS, sizeof(*ev), ev);
+ if (mskb)
+ skb->tstamp = mskb->tstamp;
+ else
+ __net_timestamp(skb);
+
err = sock_queue_rcv_skb(sk, skb);
if (err < 0)
kfree_skb(skb);
+ if (mskb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, mskb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(mskb);
+ }
+
return err;
}
int mgmt_cmd_complete(struct sock *sk, u16 index, u16 cmd, u8 status,
void *rp, size_t rp_len)
{
- struct sk_buff *skb;
+ struct sk_buff *skb, *mskb;
struct mgmt_hdr *hdr;
struct mgmt_ev_cmd_complete *ev;
int err;
@@ -114,10 +160,24 @@
if (rp)
memcpy(ev->data, rp, rp_len);
+ mskb = create_monitor_ctrl_event(hdr->index, hci_sock_get_cookie(sk),
+ MGMT_EV_CMD_COMPLETE,
+ sizeof(*ev) + rp_len, ev);
+ if (mskb)
+ skb->tstamp = mskb->tstamp;
+ else
+ __net_timestamp(skb);
+
err = sock_queue_rcv_skb(sk, skb);
if (err < 0)
kfree_skb(skb);
+ if (mskb) {
+ hci_send_to_channel(HCI_CHANNEL_MONITOR, mskb,
+ HCI_SOCK_TRUSTED, NULL);
+ kfree_skb(mskb);
+ }
+
return err;
}
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 4c1a16a..43faf2a 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -3387,7 +3387,10 @@
if (!lmp_sc_capable(hdev)) {
debugfs_create_file("force_bredr_smp", 0644, hdev->debugfs,
hdev, &force_bredr_smp_fops);
- return 0;
+
+ /* Flag can be already set here (due to power toggle) */
+ if (!hci_dev_test_flag(hdev, HCI_FORCE_BREDR_SMP))
+ return 0;
}
if (WARN_ON(hdev->smp_bredr_data)) {
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 77e7f69..2fe9345 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -30,6 +30,7 @@
#include <linux/netfilter_ipv6.h>
#include <linux/netfilter_arp.h>
#include <linux/in_route.h>
+#include <linux/rculist.h>
#include <linux/inetdevice.h>
#include <net/ip.h>
@@ -395,11 +396,10 @@
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
- NF_HOOK_THRESH(NFPROTO_BRIDGE,
- NF_BR_PRE_ROUTING,
- net, sk, skb, skb->dev, NULL,
- br_nf_pre_routing_finish_bridge,
- 1);
+ br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev,
+ NULL,
+ br_nf_pre_routing_finish);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -417,10 +417,8 @@
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
- NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
- skb->dev, NULL,
- br_handle_frame_finish, 1);
-
+ br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb, skb->dev, NULL,
+ br_handle_frame_finish);
return 0;
}
@@ -992,6 +990,43 @@
.notifier_call = brnf_device_event,
};
+/* recursively invokes nf_hook_slow (again), skipping already-called
+ * hooks (< NF_BR_PRI_BRNF).
+ *
+ * Called with rcu read lock held.
+ */
+int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ struct sock *sk, struct sk_buff *skb,
+ struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *))
+{
+ struct nf_hook_entry *elem;
+ struct nf_hook_state state;
+ int ret;
+
+ elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
+
+ while (elem && (elem->ops.priority <= NF_BR_PRI_BRNF))
+ elem = rcu_dereference(elem->next);
+
+ if (!elem)
+ return okfn(net, sk, skb);
+
+ /* We may already have this, but read-locks nest anyway */
+ rcu_read_lock();
+ nf_hook_state_init(&state, elem, hook, NF_BR_PRI_BRNF + 1,
+ NFPROTO_BRIDGE, indev, outdev, sk, net, okfn);
+
+ ret = nf_hook_slow(skb, &state);
+ rcu_read_unlock();
+ if (ret == 1)
+ ret = okfn(net, sk, skb);
+
+ return ret;
+}
+
#ifdef CONFIG_SYSCTL
static
int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c
index 5e59a84..5989661 100644
--- a/net/bridge/br_netfilter_ipv6.c
+++ b/net/bridge/br_netfilter_ipv6.c
@@ -187,10 +187,9 @@
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
- NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING,
- net, sk, skb, skb->dev, NULL,
- br_nf_pre_routing_finish_bridge,
- 1);
+ br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev, NULL,
+ br_nf_pre_routing_finish_bridge);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -207,9 +206,8 @@
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
- NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
- skb->dev, NULL,
- br_handle_frame_finish, 1);
+ br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb,
+ skb->dev, NULL, br_handle_frame_finish);
return 0;
}
diff --git a/net/bridge/netfilter/ebt_log.c b/net/bridge/netfilter/ebt_log.c
index 152300d..9a11086 100644
--- a/net/bridge/netfilter/ebt_log.c
+++ b/net/bridge/netfilter/ebt_log.c
@@ -91,7 +91,7 @@
if (loginfo->type == NF_LOG_TYPE_LOG)
bitmask = loginfo->u.log.logflags;
else
- bitmask = NF_LOG_MASK;
+ bitmask = NF_LOG_DEFAULT_MASK;
if ((bitmask & EBT_LOG_IP) && eth_hdr(skb)->h_proto ==
htons(ETH_P_IP)) {
diff --git a/net/bridge/netfilter/ebt_redirect.c b/net/bridge/netfilter/ebt_redirect.c
index 20396499..2e7c4f9 100644
--- a/net/bridge/netfilter/ebt_redirect.c
+++ b/net/bridge/netfilter/ebt_redirect.c
@@ -24,7 +24,7 @@
return EBT_DROP;
if (par->hooknum != NF_BR_BROUTING)
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
ether_addr_copy(eth_hdr(skb)->h_dest,
br_port_get_rcu(par->in)->br->dev->dev_addr);
else
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 0833c25..f5c11bb 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -146,7 +146,7 @@
return 1;
if (NF_INVF(e, EBT_IOUT, ebt_dev_check(e->out, out)))
return 1;
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
if (in && (p = br_port_get_rcu(in)) != NULL &&
NF_INVF(e, EBT_ILOGICALIN,
ebt_dev_check(e->logical_in, p->br->dev)))
diff --git a/net/bridge/netfilter/nf_tables_bridge.c b/net/bridge/netfilter/nf_tables_bridge.c
index a78c4e2..97afdc0 100644
--- a/net/bridge/netfilter/nf_tables_bridge.c
+++ b/net/bridge/netfilter/nf_tables_bridge.c
@@ -13,79 +13,11 @@
#include <linux/module.h>
#include <linux/netfilter_bridge.h>
#include <net/netfilter/nf_tables.h>
-#include <net/netfilter/nf_tables_bridge.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <net/netfilter/nf_tables_ipv4.h>
#include <net/netfilter/nf_tables_ipv6.h>
-int nft_bridge_iphdr_validate(struct sk_buff *skb)
-{
- struct iphdr *iph;
- u32 len;
-
- if (!pskb_may_pull(skb, sizeof(struct iphdr)))
- return 0;
-
- iph = ip_hdr(skb);
- if (iph->ihl < 5 || iph->version != 4)
- return 0;
-
- len = ntohs(iph->tot_len);
- if (skb->len < len)
- return 0;
- else if (len < (iph->ihl*4))
- return 0;
-
- if (!pskb_may_pull(skb, iph->ihl*4))
- return 0;
-
- return 1;
-}
-EXPORT_SYMBOL_GPL(nft_bridge_iphdr_validate);
-
-int nft_bridge_ip6hdr_validate(struct sk_buff *skb)
-{
- struct ipv6hdr *hdr;
- u32 pkt_len;
-
- if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
- return 0;
-
- hdr = ipv6_hdr(skb);
- if (hdr->version != 6)
- return 0;
-
- pkt_len = ntohs(hdr->payload_len);
- if (pkt_len + sizeof(struct ipv6hdr) > skb->len)
- return 0;
-
- return 1;
-}
-EXPORT_SYMBOL_GPL(nft_bridge_ip6hdr_validate);
-
-static inline void nft_bridge_set_pktinfo_ipv4(struct nft_pktinfo *pkt,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
- if (nft_bridge_iphdr_validate(skb))
- nft_set_pktinfo_ipv4(pkt, skb, state);
- else
- nft_set_pktinfo(pkt, skb, state);
-}
-
-static inline void nft_bridge_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
-#if IS_ENABLED(CONFIG_IPV6)
- if (nft_bridge_ip6hdr_validate(skb) &&
- nft_set_pktinfo_ipv6(pkt, skb, state) == 0)
- return;
-#endif
- nft_set_pktinfo(pkt, skb, state);
-}
-
static unsigned int
nft_do_chain_bridge(void *priv,
struct sk_buff *skb,
@@ -95,13 +27,13 @@
switch (eth_hdr(skb)->h_proto) {
case htons(ETH_P_IP):
- nft_bridge_set_pktinfo_ipv4(&pkt, skb, state);
+ nft_set_pktinfo_ipv4_validate(&pkt, skb, state);
break;
case htons(ETH_P_IPV6):
- nft_bridge_set_pktinfo_ipv6(&pkt, skb, state);
+ nft_set_pktinfo_ipv6_validate(&pkt, skb, state);
break;
default:
- nft_set_pktinfo(&pkt, skb, state);
+ nft_set_pktinfo_unspec(&pkt, skb, state);
break;
}
@@ -207,12 +139,20 @@
int ret;
nf_register_afinfo(&nf_br_afinfo);
- nft_register_chain_type(&filter_bridge);
+ ret = nft_register_chain_type(&filter_bridge);
+ if (ret < 0)
+ goto err1;
+
ret = register_pernet_subsys(&nf_tables_bridge_net_ops);
- if (ret < 0) {
- nft_unregister_chain_type(&filter_bridge);
- nf_unregister_afinfo(&nf_br_afinfo);
- }
+ if (ret < 0)
+ goto err2;
+
+ return ret;
+
+err2:
+ nft_unregister_chain_type(&filter_bridge);
+err1:
+ nf_unregister_afinfo(&nf_br_afinfo);
return ret;
}
diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 0b77ffb..4b3df6b 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -14,7 +14,6 @@
#include <linux/netfilter/nf_tables.h>
#include <net/netfilter/nf_tables.h>
#include <net/netfilter/nft_reject.h>
-#include <net/netfilter/nf_tables_bridge.h>
#include <net/netfilter/ipv4/nf_reject.h>
#include <net/netfilter/ipv6/nf_reject.h>
#include <linux/ip.h>
@@ -37,6 +36,30 @@
skb_pull(nskb, ETH_HLEN);
}
+static int nft_bridge_iphdr_validate(struct sk_buff *skb)
+{
+ struct iphdr *iph;
+ u32 len;
+
+ if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+ return 0;
+
+ iph = ip_hdr(skb);
+ if (iph->ihl < 5 || iph->version != 4)
+ return 0;
+
+ len = ntohs(iph->tot_len);
+ if (skb->len < len)
+ return 0;
+ else if (len < (iph->ihl*4))
+ return 0;
+
+ if (!pskb_may_pull(skb, iph->ihl*4))
+ return 0;
+
+ return 1;
+}
+
/* We cannot use oldskb->dev, it can be either bridge device (NF_BRIDGE INPUT)
* or the bridge port (NF_BRIDGE PREROUTING).
*/
@@ -143,6 +166,25 @@
br_forward(br_port_get_rcu(dev), nskb, false, true);
}
+static int nft_bridge_ip6hdr_validate(struct sk_buff *skb)
+{
+ struct ipv6hdr *hdr;
+ u32 pkt_len;
+
+ if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
+ return 0;
+
+ hdr = ipv6_hdr(skb);
+ if (hdr->version != 6)
+ return 0;
+
+ pkt_len = ntohs(hdr->payload_len);
+ if (pkt_len + sizeof(struct ipv6hdr) > skb->len)
+ return 0;
+
+ return 1;
+}
+
static void nft_reject_br_send_v6_tcp_reset(struct net *net,
struct sk_buff *oldskb,
const struct net_device *dev,
diff --git a/net/core/dev.c b/net/core/dev.c
index 9dbece2..c0c291f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4055,12 +4055,17 @@
{
#ifdef CONFIG_NETFILTER_INGRESS
if (nf_hook_ingress_active(skb)) {
+ int ingress_retval;
+
if (*pt_prev) {
*ret = deliver_skb(skb, *pt_prev, orig_dev);
*pt_prev = NULL;
}
- return nf_hook_ingress(skb);
+ rcu_read_lock();
+ ingress_retval = nf_hook_ingress(skb);
+ rcu_read_unlock();
+ return ingress_retval;
}
#endif /* CONFIG_NETFILTER_INGRESS */
return 0;
diff --git a/net/core/filter.c b/net/core/filter.c
index 298b146..00351cd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1362,6 +1362,11 @@
return err;
}
+static int bpf_try_make_head_writable(struct sk_buff *skb)
+{
+ return bpf_try_make_writable(skb, skb_headlen(skb));
+}
+
static inline void bpf_push_mac_rcsum(struct sk_buff *skb)
{
if (skb_at_tc_ingress(skb))
@@ -1441,6 +1446,28 @@
.arg4_type = ARG_CONST_STACK_SIZE,
};
+BPF_CALL_2(bpf_skb_pull_data, struct sk_buff *, skb, u32, len)
+{
+ /* Idea is the following: should the needed direct read/write
+ * test fail during runtime, we can pull in more data and redo
+ * again, since implicitly, we invalidate previous checks here.
+ *
+ * Or, since we know how much we need to make read/writeable,
+ * this can be done once at the program beginning for direct
+ * access case. By this we overcome limitations of only current
+ * headroom being accessible.
+ */
+ return bpf_try_make_writable(skb, len ? : skb_headlen(skb));
+}
+
+static const struct bpf_func_proto bpf_skb_pull_data_proto = {
+ .func = bpf_skb_pull_data,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+ .arg2_type = ARG_ANYTHING,
+};
+
BPF_CALL_5(bpf_l3_csum_replace, struct sk_buff *, skb, u32, offset,
u64, from, u64, to, u64, flags)
{
@@ -1567,6 +1594,7 @@
static const struct bpf_func_proto bpf_csum_diff_proto = {
.func = bpf_csum_diff,
.gpl_only = false,
+ .pkt_access = true,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_STACK,
.arg2_type = ARG_CONST_STACK_SIZE_OR_ZERO,
@@ -1575,6 +1603,26 @@
.arg5_type = ARG_ANYTHING,
};
+BPF_CALL_2(bpf_csum_update, struct sk_buff *, skb, __wsum, csum)
+{
+ /* The interface is to be used in combination with bpf_csum_diff()
+ * for direct packet writes. csum rotation for alignment as well
+ * as emulating csum_sub() can be done from the eBPF program.
+ */
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ return (skb->csum = csum_add(skb->csum, csum));
+
+ return -ENOTSUPP;
+}
+
+static const struct bpf_func_proto bpf_csum_update_proto = {
+ .func = bpf_csum_update,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+ .arg2_type = ARG_ANYTHING,
+};
+
static inline int __bpf_rx_skb(struct net_device *dev, struct sk_buff *skb)
{
return dev_forward_skb(dev, skb);
@@ -1602,6 +1650,8 @@
BPF_CALL_3(bpf_clone_redirect, struct sk_buff *, skb, u32, ifindex, u64, flags)
{
struct net_device *dev;
+ struct sk_buff *clone;
+ int ret;
if (unlikely(flags & ~(BPF_F_INGRESS)))
return -EINVAL;
@@ -1610,14 +1660,25 @@
if (unlikely(!dev))
return -EINVAL;
- skb = skb_clone(skb, GFP_ATOMIC);
- if (unlikely(!skb))
+ clone = skb_clone(skb, GFP_ATOMIC);
+ if (unlikely(!clone))
return -ENOMEM;
- bpf_push_mac_rcsum(skb);
+ /* For direct write, we need to keep the invariant that the skbs
+ * we're dealing with need to be uncloned. Should uncloning fail
+ * here, we need to free the just generated clone to unclone once
+ * again.
+ */
+ ret = bpf_try_make_head_writable(skb);
+ if (unlikely(ret)) {
+ kfree_skb(clone);
+ return -ENOMEM;
+ }
+
+ bpf_push_mac_rcsum(clone);
return flags & BPF_F_INGRESS ?
- __bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb);
+ __bpf_rx_skb(dev, clone) : __bpf_tx_skb(dev, clone);
}
static const struct bpf_func_proto bpf_clone_redirect_proto = {
@@ -1716,6 +1777,22 @@
.arg1_type = ARG_PTR_TO_CTX,
};
+BPF_CALL_1(bpf_set_hash_invalid, struct sk_buff *, skb)
+{
+ /* After all direct packet write, this can be used once for
+ * triggering a lazy recalc on next skb_get_hash() invocation.
+ */
+ skb_clear_hash(skb);
+ return 0;
+}
+
+static const struct bpf_func_proto bpf_set_hash_invalid_proto = {
+ .func = bpf_set_hash_invalid,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_CTX,
+};
+
BPF_CALL_3(bpf_skb_vlan_push, struct sk_buff *, skb, __be16, vlan_proto,
u16, vlan_tci)
{
@@ -2063,19 +2140,14 @@
bool bpf_helper_changes_skb_data(void *func)
{
- if (func == bpf_skb_vlan_push)
- return true;
- if (func == bpf_skb_vlan_pop)
- return true;
- if (func == bpf_skb_store_bytes)
- return true;
- if (func == bpf_skb_change_proto)
- return true;
- if (func == bpf_skb_change_tail)
- return true;
- if (func == bpf_l3_csum_replace)
- return true;
- if (func == bpf_l4_csum_replace)
+ if (func == bpf_skb_vlan_push ||
+ func == bpf_skb_vlan_pop ||
+ func == bpf_skb_store_bytes ||
+ func == bpf_skb_change_proto ||
+ func == bpf_skb_change_tail ||
+ func == bpf_skb_pull_data ||
+ func == bpf_l3_csum_replace ||
+ func == bpf_l4_csum_replace)
return true;
return false;
@@ -2352,7 +2424,7 @@
struct cgroup *cgrp;
struct sock *sk;
- sk = skb->sk;
+ sk = skb_to_full_sk(skb);
if (!sk || !sk_fullsock(sk))
return -ENOENT;
if (unlikely(idx >= array->map.max_entries))
@@ -2440,8 +2512,12 @@
return &bpf_skb_store_bytes_proto;
case BPF_FUNC_skb_load_bytes:
return &bpf_skb_load_bytes_proto;
+ case BPF_FUNC_skb_pull_data:
+ return &bpf_skb_pull_data_proto;
case BPF_FUNC_csum_diff:
return &bpf_csum_diff_proto;
+ case BPF_FUNC_csum_update:
+ return &bpf_csum_update_proto;
case BPF_FUNC_l3_csum_replace:
return &bpf_l3_csum_replace_proto;
case BPF_FUNC_l4_csum_replace:
@@ -2474,6 +2550,8 @@
return &bpf_get_route_realm_proto;
case BPF_FUNC_get_hash_recalc:
return &bpf_get_hash_recalc_proto;
+ case BPF_FUNC_set_hash_invalid:
+ return &bpf_set_hash_invalid_proto;
case BPF_FUNC_perf_event_output:
return &bpf_skb_event_output_proto;
case BPF_FUNC_get_smp_processor_id:
@@ -2491,6 +2569,8 @@
switch (func_id) {
case BPF_FUNC_perf_event_output:
return &bpf_xdp_event_output_proto;
+ case BPF_FUNC_get_smp_processor_id:
+ return &bpf_get_smp_processor_id_proto;
default:
return sk_filter_func_proto(func_id);
}
@@ -2533,6 +2613,45 @@
return __is_valid_access(off, size, type);
}
+static int tc_cls_act_prologue(struct bpf_insn *insn_buf, bool direct_write,
+ const struct bpf_prog *prog)
+{
+ struct bpf_insn *insn = insn_buf;
+
+ if (!direct_write)
+ return 0;
+
+ /* if (!skb->cloned)
+ * goto start;
+ *
+ * (Fast-path, otherwise approximation that we might be
+ * a clone, do the rest in helper.)
+ */
+ *insn++ = BPF_LDX_MEM(BPF_B, BPF_REG_6, BPF_REG_1, CLONED_OFFSET());
+ *insn++ = BPF_ALU32_IMM(BPF_AND, BPF_REG_6, CLONED_MASK);
+ *insn++ = BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 7);
+
+ /* ret = bpf_skb_pull_data(skb, 0); */
+ *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
+ *insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2);
+ *insn++ = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+ BPF_FUNC_skb_pull_data);
+ /* if (!ret)
+ * goto restore;
+ * return TC_ACT_SHOT;
+ */
+ *insn++ = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2);
+ *insn++ = BPF_ALU32_IMM(BPF_MOV, BPF_REG_0, TC_ACT_SHOT);
+ *insn++ = BPF_EXIT_INSN();
+
+ /* restore: */
+ *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
+ /* start: */
+ *insn++ = prog->insnsi[0];
+
+ return insn - insn_buf;
+}
+
static bool tc_cls_act_is_valid_access(int off, int size,
enum bpf_access_type type,
enum bpf_reg_type *reg_type)
@@ -2810,6 +2929,7 @@
.get_func_proto = tc_cls_act_func_proto,
.is_valid_access = tc_cls_act_is_valid_access,
.convert_ctx_access = tc_cls_act_convert_ctx_access,
+ .gen_prologue = tc_cls_act_prologue,
};
static const struct bpf_verifier_ops xdp_ops = {
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 937e459..3ac8946 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -843,7 +843,10 @@
size += nla_total_size(num_vfs * sizeof(struct nlattr));
size += num_vfs *
(nla_total_size(sizeof(struct ifla_vf_mac)) +
- nla_total_size(sizeof(struct ifla_vf_vlan)) +
+ nla_total_size(MAX_VLAN_LIST_LEN *
+ sizeof(struct nlattr)) +
+ nla_total_size(MAX_VLAN_LIST_LEN *
+ sizeof(struct ifla_vf_vlan_info)) +
nla_total_size(sizeof(struct ifla_vf_spoofchk)) +
nla_total_size(sizeof(struct ifla_vf_rate)) +
nla_total_size(sizeof(struct ifla_vf_link_state)) +
@@ -1111,14 +1114,15 @@
struct nlattr *vfinfo)
{
struct ifla_vf_rss_query_en vf_rss_query_en;
+ struct nlattr *vf, *vfstats, *vfvlanlist;
struct ifla_vf_link_state vf_linkstate;
+ struct ifla_vf_vlan_info vf_vlan_info;
struct ifla_vf_spoofchk vf_spoofchk;
struct ifla_vf_tx_rate vf_tx_rate;
struct ifla_vf_stats vf_stats;
struct ifla_vf_trust vf_trust;
struct ifla_vf_vlan vf_vlan;
struct ifla_vf_rate vf_rate;
- struct nlattr *vf, *vfstats;
struct ifla_vf_mac vf_mac;
struct ifla_vf_info ivi;
@@ -1135,11 +1139,14 @@
* IFLA_VF_LINK_STATE_AUTO which equals zero
*/
ivi.linkstate = 0;
+ /* VLAN Protocol by default is 802.1Q */
+ ivi.vlan_proto = htons(ETH_P_8021Q);
if (dev->netdev_ops->ndo_get_vf_config(dev, vfs_num, &ivi))
return 0;
vf_mac.vf =
vf_vlan.vf =
+ vf_vlan_info.vf =
vf_rate.vf =
vf_tx_rate.vf =
vf_spoofchk.vf =
@@ -1150,6 +1157,9 @@
memcpy(vf_mac.mac, ivi.mac, sizeof(ivi.mac));
vf_vlan.vlan = ivi.vlan;
vf_vlan.qos = ivi.qos;
+ vf_vlan_info.vlan = ivi.vlan;
+ vf_vlan_info.qos = ivi.qos;
+ vf_vlan_info.vlan_proto = ivi.vlan_proto;
vf_tx_rate.rate = ivi.max_tx_rate;
vf_rate.min_tx_rate = ivi.min_tx_rate;
vf_rate.max_tx_rate = ivi.max_tx_rate;
@@ -1158,10 +1168,8 @@
vf_rss_query_en.setting = ivi.rss_query_en;
vf_trust.setting = ivi.trusted;
vf = nla_nest_start(skb, IFLA_VF_INFO);
- if (!vf) {
- nla_nest_cancel(skb, vfinfo);
- return -EMSGSIZE;
- }
+ if (!vf)
+ goto nla_put_vfinfo_failure;
if (nla_put(skb, IFLA_VF_MAC, sizeof(vf_mac), &vf_mac) ||
nla_put(skb, IFLA_VF_VLAN, sizeof(vf_vlan), &vf_vlan) ||
nla_put(skb, IFLA_VF_RATE, sizeof(vf_rate),
@@ -1177,17 +1185,23 @@
&vf_rss_query_en) ||
nla_put(skb, IFLA_VF_TRUST,
sizeof(vf_trust), &vf_trust))
- return -EMSGSIZE;
+ goto nla_put_vf_failure;
+ vfvlanlist = nla_nest_start(skb, IFLA_VF_VLAN_LIST);
+ if (!vfvlanlist)
+ goto nla_put_vf_failure;
+ if (nla_put(skb, IFLA_VF_VLAN_INFO, sizeof(vf_vlan_info),
+ &vf_vlan_info)) {
+ nla_nest_cancel(skb, vfvlanlist);
+ goto nla_put_vf_failure;
+ }
+ nla_nest_end(skb, vfvlanlist);
memset(&vf_stats, 0, sizeof(vf_stats));
if (dev->netdev_ops->ndo_get_vf_stats)
dev->netdev_ops->ndo_get_vf_stats(dev, vfs_num,
&vf_stats);
vfstats = nla_nest_start(skb, IFLA_VF_STATS);
- if (!vfstats) {
- nla_nest_cancel(skb, vf);
- nla_nest_cancel(skb, vfinfo);
- return -EMSGSIZE;
- }
+ if (!vfstats)
+ goto nla_put_vf_failure;
if (nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_PACKETS,
vf_stats.rx_packets, IFLA_VF_STATS_PAD) ||
nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_PACKETS,
@@ -1199,11 +1213,19 @@
nla_put_u64_64bit(skb, IFLA_VF_STATS_BROADCAST,
vf_stats.broadcast, IFLA_VF_STATS_PAD) ||
nla_put_u64_64bit(skb, IFLA_VF_STATS_MULTICAST,
- vf_stats.multicast, IFLA_VF_STATS_PAD))
- return -EMSGSIZE;
+ vf_stats.multicast, IFLA_VF_STATS_PAD)) {
+ nla_nest_cancel(skb, vfstats);
+ goto nla_put_vf_failure;
+ }
nla_nest_end(skb, vfstats);
nla_nest_end(skb, vf);
return 0;
+
+nla_put_vf_failure:
+ nla_nest_cancel(skb, vf);
+nla_put_vfinfo_failure:
+ nla_nest_cancel(skb, vfinfo);
+ return -EMSGSIZE;
}
static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
@@ -1448,6 +1470,7 @@
static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
[IFLA_VF_MAC] = { .len = sizeof(struct ifla_vf_mac) },
[IFLA_VF_VLAN] = { .len = sizeof(struct ifla_vf_vlan) },
+ [IFLA_VF_VLAN_LIST] = { .type = NLA_NESTED },
[IFLA_VF_TX_RATE] = { .len = sizeof(struct ifla_vf_tx_rate) },
[IFLA_VF_SPOOFCHK] = { .len = sizeof(struct ifla_vf_spoofchk) },
[IFLA_VF_RATE] = { .len = sizeof(struct ifla_vf_rate) },
@@ -1704,7 +1727,34 @@
err = -EOPNOTSUPP;
if (ops->ndo_set_vf_vlan)
err = ops->ndo_set_vf_vlan(dev, ivv->vf, ivv->vlan,
- ivv->qos);
+ ivv->qos,
+ htons(ETH_P_8021Q));
+ if (err < 0)
+ return err;
+ }
+
+ if (tb[IFLA_VF_VLAN_LIST]) {
+ struct ifla_vf_vlan_info *ivvl[MAX_VLAN_LIST_LEN];
+ struct nlattr *attr;
+ int rem, len = 0;
+
+ err = -EOPNOTSUPP;
+ if (!ops->ndo_set_vf_vlan)
+ return err;
+
+ nla_for_each_nested(attr, tb[IFLA_VF_VLAN_LIST], rem) {
+ if (nla_type(attr) != IFLA_VF_VLAN_INFO ||
+ nla_len(attr) < NLA_HDRLEN) {
+ return -EINVAL;
+ }
+ if (len >= MAX_VLAN_LIST_LEN)
+ return -EOPNOTSUPP;
+ ivvl[len] = nla_data(attr);
+
+ len++;
+ }
+ err = ops->ndo_set_vf_vlan(dev, ivvl[0]->vf, ivvl[0]->vlan,
+ ivvl[0]->qos, ivvl[0]->vlan_proto);
if (err < 0)
return err;
}
@@ -3577,6 +3627,91 @@
(!idxattr || idxattr == attrid);
}
+#define IFLA_OFFLOAD_XSTATS_FIRST (IFLA_OFFLOAD_XSTATS_UNSPEC + 1)
+static int rtnl_get_offload_stats_attr_size(int attr_id)
+{
+ switch (attr_id) {
+ case IFLA_OFFLOAD_XSTATS_CPU_HIT:
+ return sizeof(struct rtnl_link_stats64);
+ }
+
+ return 0;
+}
+
+static int rtnl_get_offload_stats(struct sk_buff *skb, struct net_device *dev,
+ int *prividx)
+{
+ struct nlattr *attr = NULL;
+ int attr_id, size;
+ void *attr_data;
+ int err;
+
+ if (!(dev->netdev_ops && dev->netdev_ops->ndo_has_offload_stats &&
+ dev->netdev_ops->ndo_get_offload_stats))
+ return -ENODATA;
+
+ for (attr_id = IFLA_OFFLOAD_XSTATS_FIRST;
+ attr_id <= IFLA_OFFLOAD_XSTATS_MAX; attr_id++) {
+ if (attr_id < *prividx)
+ continue;
+
+ size = rtnl_get_offload_stats_attr_size(attr_id);
+ if (!size)
+ continue;
+
+ if (!dev->netdev_ops->ndo_has_offload_stats(attr_id))
+ continue;
+
+ attr = nla_reserve_64bit(skb, attr_id, size,
+ IFLA_OFFLOAD_XSTATS_UNSPEC);
+ if (!attr)
+ goto nla_put_failure;
+
+ attr_data = nla_data(attr);
+ memset(attr_data, 0, size);
+ err = dev->netdev_ops->ndo_get_offload_stats(attr_id, dev,
+ attr_data);
+ if (err)
+ goto get_offload_stats_failure;
+ }
+
+ if (!attr)
+ return -ENODATA;
+
+ *prividx = 0;
+ return 0;
+
+nla_put_failure:
+ err = -EMSGSIZE;
+get_offload_stats_failure:
+ *prividx = attr_id;
+ return err;
+}
+
+static int rtnl_get_offload_stats_size(const struct net_device *dev)
+{
+ int nla_size = 0;
+ int attr_id;
+ int size;
+
+ if (!(dev->netdev_ops && dev->netdev_ops->ndo_has_offload_stats &&
+ dev->netdev_ops->ndo_get_offload_stats))
+ return 0;
+
+ for (attr_id = IFLA_OFFLOAD_XSTATS_FIRST;
+ attr_id <= IFLA_OFFLOAD_XSTATS_MAX; attr_id++) {
+ if (!dev->netdev_ops->ndo_has_offload_stats(attr_id))
+ continue;
+ size = rtnl_get_offload_stats_attr_size(attr_id);
+ nla_size += nla_total_size_64bit(size);
+ }
+
+ if (nla_size != 0)
+ nla_size += nla_total_size(0);
+
+ return nla_size;
+}
+
static int rtnl_fill_statsinfo(struct sk_buff *skb, struct net_device *dev,
int type, u32 pid, u32 seq, u32 change,
unsigned int flags, unsigned int filter_mask,
@@ -3586,6 +3721,7 @@
struct nlmsghdr *nlh;
struct nlattr *attr;
int s_prividx = *prividx;
+ int err;
ASSERT_RTNL();
@@ -3614,8 +3750,6 @@
const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
if (ops && ops->fill_linkxstats) {
- int err;
-
*idxattr = IFLA_STATS_LINK_XSTATS;
attr = nla_nest_start(skb,
IFLA_STATS_LINK_XSTATS);
@@ -3639,8 +3773,6 @@
if (master)
ops = master->rtnl_link_ops;
if (ops && ops->fill_linkxstats) {
- int err;
-
*idxattr = IFLA_STATS_LINK_XSTATS_SLAVE;
attr = nla_nest_start(skb,
IFLA_STATS_LINK_XSTATS_SLAVE);
@@ -3655,6 +3787,24 @@
}
}
+ if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_OFFLOAD_XSTATS,
+ *idxattr)) {
+ *idxattr = IFLA_STATS_LINK_OFFLOAD_XSTATS;
+ attr = nla_nest_start(skb, IFLA_STATS_LINK_OFFLOAD_XSTATS);
+ if (!attr)
+ goto nla_put_failure;
+
+ err = rtnl_get_offload_stats(skb, dev, prividx);
+ if (err == -ENODATA)
+ nla_nest_cancel(skb, attr);
+ else
+ nla_nest_end(skb, attr);
+
+ if (err && err != -ENODATA)
+ goto nla_put_failure;
+ *idxattr = 0;
+ }
+
nlmsg_end(skb, nlh);
return 0;
@@ -3708,6 +3858,9 @@
}
}
+ if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_OFFLOAD_XSTATS, 0))
+ size += rtnl_get_offload_stats_size(dev);
+
return size;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 1e329d4..d36c754 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3097,11 +3097,31 @@
sg = !!(features & NETIF_F_SG);
csum = !!can_checksum_protocol(features, proto);
- /* GSO partial only requires that we trim off any excess that
- * doesn't fit into an MSS sized block, so take care of that
- * now.
- */
- if (sg && csum && (features & NETIF_F_GSO_PARTIAL)) {
+ if (sg && csum && (mss != GSO_BY_FRAGS)) {
+ if (!(features & NETIF_F_GSO_PARTIAL)) {
+ struct sk_buff *iter;
+
+ if (!list_skb ||
+ !net_gso_ok(features, skb_shinfo(head_skb)->gso_type))
+ goto normal;
+
+ /* Split the buffer at the frag_list pointer.
+ * This is based on the assumption that all
+ * buffers in the chain excluding the last
+ * containing the same amount of data.
+ */
+ skb_walk_frags(head_skb, iter) {
+ if (skb_headlen(iter))
+ goto normal;
+
+ len -= iter->len;
+ }
+ }
+
+ /* GSO partial only requires that we trim off any excess that
+ * doesn't fit into an MSS sized block, so take care of that
+ * now.
+ */
partial_segs = len / mss;
if (partial_segs > 1)
mss *= partial_segs;
@@ -3109,6 +3129,7 @@
partial_segs = 0;
}
+normal:
headroom = skb_headroom(head_skb);
pos = skb_headlen(head_skb);
@@ -3300,21 +3321,29 @@
*/
segs->prev = tail;
- /* Update GSO info on first skb in partial sequence. */
if (partial_segs) {
+ struct sk_buff *iter;
int type = skb_shinfo(head_skb)->gso_type;
+ unsigned short gso_size = skb_shinfo(head_skb)->gso_size;
/* Update type to add partial and then remove dodgy if set */
- type |= SKB_GSO_PARTIAL;
+ type |= (features & NETIF_F_GSO_PARTIAL) / NETIF_F_GSO_PARTIAL * SKB_GSO_PARTIAL;
type &= ~SKB_GSO_DODGY;
/* Update GSO info and prepare to start updating headers on
* our way back down the stack of protocols.
*/
- skb_shinfo(segs)->gso_size = skb_shinfo(head_skb)->gso_size;
- skb_shinfo(segs)->gso_segs = partial_segs;
- skb_shinfo(segs)->gso_type = type;
- SKB_GSO_CB(segs)->data_offset = skb_headroom(segs) + doffset;
+ for (iter = segs; iter; iter = iter->next) {
+ skb_shinfo(iter)->gso_size = gso_size;
+ skb_shinfo(iter)->gso_segs = partial_segs;
+ skb_shinfo(iter)->gso_type = type;
+ SKB_GSO_CB(iter)->data_offset = skb_headroom(iter) + doffset;
+ }
+
+ if (tail->len - doffset <= gso_size)
+ skb_shinfo(tail)->gso_size = 0;
+ else if (tail != segs)
+ skb_shinfo(tail)->gso_segs = DIV_ROUND_UP(tail->len - doffset, gso_size);
}
/* Following permits correct backpressure, for protocols
@@ -4493,8 +4522,10 @@
}
EXPORT_SYMBOL(skb_ensure_writable);
-/* remove VLAN header from packet and update csum accordingly. */
-static int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci)
+/* remove VLAN header from packet and update csum accordingly.
+ * expects a non skb_vlan_tag_present skb with a vlan tag payload
+ */
+int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci)
{
struct vlan_hdr *vhdr;
unsigned int offset = skb->data - skb_mac_header(skb);
@@ -4525,6 +4556,7 @@
return err;
}
+EXPORT_SYMBOL(__skb_vlan_pop);
int skb_vlan_pop(struct sk_buff *skb)
{
@@ -4535,9 +4567,7 @@
if (likely(skb_vlan_tag_present(skb))) {
skb->vlan_tci = 0;
} else {
- if (unlikely((skb->protocol != htons(ETH_P_8021Q) &&
- skb->protocol != htons(ETH_P_8021AD)) ||
- skb->len < VLAN_ETH_HLEN))
+ if (unlikely(!eth_type_vlan(skb->protocol)))
return 0;
err = __skb_vlan_pop(skb, &vlan_tci);
@@ -4545,9 +4575,7 @@
return err;
}
/* move next vlan tag to hw accel tag */
- if (likely((skb->protocol != htons(ETH_P_8021Q) &&
- skb->protocol != htons(ETH_P_8021AD)) ||
- skb->len < VLAN_ETH_HLEN))
+ if (likely(!eth_type_vlan(skb->protocol)))
return 0;
vlan_proto = skb->protocol;
diff --git a/net/core/sock.c b/net/core/sock.c
index 51a7304..038e660 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1340,7 +1340,6 @@
if (!try_module_get(prot->owner))
goto out_free_sec;
sk_tx_queue_clear(sk);
- cgroup_sk_alloc(&sk->sk_cgrp_data);
}
return sk;
@@ -1400,6 +1399,7 @@
sock_net_set(sk, net);
atomic_set(&sk->sk_wmem_alloc, 1);
+ cgroup_sk_alloc(&sk->sk_cgrp_data);
sock_update_classid(&sk->sk_cgrp_data);
sock_update_netprioidx(&sk->sk_cgrp_data);
}
@@ -1544,6 +1544,9 @@
newsk->sk_priority = 0;
newsk->sk_incoming_cpu = raw_smp_processor_id();
atomic64_set(&newsk->sk_cookie, 0);
+
+ cgroup_sk_alloc(&newsk->sk_cgrp_data);
+
/*
* Before updating sk_refcnt, we must commit prior changes to memory
* (Documentation/RCU/rculist_nulls.txt for details)
diff --git a/net/core/stream.c b/net/core/stream.c
index 159516a..1086c8b 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -43,7 +43,6 @@
rcu_read_unlock();
}
}
-EXPORT_SYMBOL(sk_stream_write_space);
/**
* sk_stream_wait_connect - Wait for a socket to get into the connected state
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 66e31ac..a6902c1 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -378,9 +378,11 @@
if (ret < 0)
goto out;
- ret = ops->set_addr(ds, dst->master_netdev->dev_addr);
- if (ret < 0)
- goto out;
+ if (ops->set_addr) {
+ ret = ops->set_addr(ds, dst->master_netdev->dev_addr);
+ if (ret < 0)
+ goto out;
+ }
if (!ds->slave_mii_bus && ops->phy_read) {
ds->slave_mii_bus = devm_mdiobus_alloc(parent);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 8278385..f8a7d9a 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -304,13 +304,11 @@
if (err < 0)
return err;
- err = ds->ops->set_addr(ds, dst->master_netdev->dev_addr);
- if (err < 0)
- return err;
-
- err = ds->ops->set_addr(ds, dst->master_netdev->dev_addr);
- if (err < 0)
- return err;
+ if (ds->ops->set_addr) {
+ err = ds->ops->set_addr(ds, dst->master_netdev->dev_addr);
+ if (err < 0)
+ return err;
+ }
if (!ds->slave_mii_bus && ds->ops->phy_read) {
ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9ecbe78..6b1282c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -69,6 +69,30 @@
return !!p->bridge_dev;
}
+static void dsa_port_set_stp_state(struct dsa_switch *ds, int port, u8 state)
+{
+ struct dsa_port *dp = &ds->ports[port];
+
+ if (ds->ops->port_stp_state_set)
+ ds->ops->port_stp_state_set(ds, port, state);
+
+ if (ds->ops->port_fast_age) {
+ /* Fast age FDB entries or flush appropriate forwarding database
+ * for the given port, if we are moving it from Learning or
+ * Forwarding state, to Disabled or Blocking or Listening state.
+ */
+
+ if ((dp->stp_state == BR_STATE_LEARNING ||
+ dp->stp_state == BR_STATE_FORWARDING) &&
+ (state == BR_STATE_DISABLED ||
+ state == BR_STATE_BLOCKING ||
+ state == BR_STATE_LISTENING))
+ ds->ops->port_fast_age(ds, port);
+ }
+
+ dp->stp_state = state;
+}
+
static int dsa_slave_open(struct net_device *dev)
{
struct dsa_slave_priv *p = netdev_priv(dev);
@@ -104,8 +128,7 @@
goto clear_promisc;
}
- if (ds->ops->port_stp_state_set)
- ds->ops->port_stp_state_set(ds, p->port, stp_state);
+ dsa_port_set_stp_state(ds, p->port, stp_state);
if (p->phy)
phy_start(p->phy);
@@ -147,8 +170,7 @@
if (ds->ops->port_disable)
ds->ops->port_disable(ds, p->port, p->phy);
- if (ds->ops->port_stp_state_set)
- ds->ops->port_stp_state_set(ds, p->port, BR_STATE_DISABLED);
+ dsa_port_set_stp_state(ds, p->port, BR_STATE_DISABLED);
return 0;
}
@@ -354,7 +376,7 @@
if (switchdev_trans_ph_prepare(trans))
return ds->ops->port_stp_state_set ? 0 : -EOPNOTSUPP;
- ds->ops->port_stp_state_set(ds, p->port, attr->u.stp_state);
+ dsa_port_set_stp_state(ds, p->port, attr->u.stp_state);
return 0;
}
@@ -556,8 +578,7 @@
/* Port left the bridge, put in BR_STATE_DISABLED by the bridge layer,
* so allow it to be in BR_STATE_FORWARDING to be kept functional
*/
- if (ds->ops->port_stp_state_set)
- ds->ops->port_stp_state_set(ds, p->port, BR_STATE_FORWARDING);
+ dsa_port_set_stp_state(ds, p->port, BR_STATE_FORWARDING);
}
static int dsa_slave_port_attr_get(struct net_device *dev,
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 50d6a9b..300b068 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -640,6 +640,21 @@
D.A. Hayes and G. Armitage. "Revisiting TCP congestion control using
delay gradients." In Networking 2011. Preprint: http://goo.gl/No3vdg
+config TCP_CONG_BBR
+ tristate "BBR TCP"
+ default n
+ ---help---
+
+ BBR (Bottleneck Bandwidth and RTT) TCP congestion control aims to
+ maximize network utilization and minimize queues. It builds an explicit
+ model of the the bottleneck delivery rate and path round-trip
+ propagation delay. It tolerates packet loss and delay unrelated to
+ congestion. It can operate over LAN, WAN, cellular, wifi, or cable
+ modem links. It can coexist with flows that use loss-based congestion
+ control, and can operate with shallow buffers, deep buffers,
+ bufferbloat, policers, or AQM schemes that do not provide a delay
+ signal. It requires the fq ("Fair Queue") pacing packet scheduler.
+
choice
prompt "Default TCP congestion control"
default DEFAULT_CUBIC
@@ -674,6 +689,9 @@
config DEFAULT_CDG
bool "CDG" if TCP_CONG_CDG=y
+ config DEFAULT_BBR
+ bool "BBR" if TCP_CONG_BBR=y
+
config DEFAULT_RENO
bool "Reno"
endchoice
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index 24629b6..bc6a6c8 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -8,7 +8,7 @@
inet_timewait_sock.o inet_connection_sock.o \
tcp.o tcp_input.o tcp_output.o tcp_timer.o tcp_ipv4.o \
tcp_minisocks.o tcp_cong.o tcp_metrics.o tcp_fastopen.o \
- tcp_recovery.o \
+ tcp_rate.o tcp_recovery.o \
tcp_offload.o datagram.o raw.o udp.o udplite.o \
udp_offload.o arp.o icmp.o devinet.o af_inet.o igmp.o \
fib_frontend.o fib_semantics.o fib_trie.o \
@@ -41,6 +41,7 @@
obj-$(CONFIG_INET_TCP_DIAG) += tcp_diag.o
obj-$(CONFIG_INET_UDP_DIAG) += udp_diag.o
obj-$(CONFIG_NET_TCPPROBE) += tcp_probe.o
+obj-$(CONFIG_TCP_CONG_BBR) += tcp_bbr.o
obj-$(CONFIG_TCP_CONG_BIC) += tcp_bic.o
obj-$(CONFIG_TCP_CONG_CDG) += tcp_cdg.o
obj-$(CONFIG_TCP_CONG_CUBIC) += tcp_cubic.o
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index e94b47b..1effc98 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1192,7 +1192,7 @@
struct sk_buff *inet_gso_segment(struct sk_buff *skb,
netdev_features_t features)
{
- bool udpfrag = false, fixedid = false, encap;
+ bool udpfrag = false, fixedid = false, gso_partial, encap;
struct sk_buff *segs = ERR_PTR(-EINVAL);
const struct net_offload *ops;
unsigned int offset = 0;
@@ -1245,6 +1245,8 @@
if (IS_ERR_OR_NULL(segs))
goto out;
+ gso_partial = !!(skb_shinfo(segs)->gso_type & SKB_GSO_PARTIAL);
+
skb = segs;
do {
iph = (struct iphdr *)(skb_mac_header(skb) + nhoff);
@@ -1259,9 +1261,13 @@
iph->id = htons(id);
id += skb_shinfo(skb)->gso_segs;
}
- tot_len = skb_shinfo(skb)->gso_size +
- SKB_GSO_CB(skb)->data_offset +
- skb->head - (unsigned char *)iph;
+
+ if (gso_partial)
+ tot_len = skb_shinfo(skb)->gso_size +
+ SKB_GSO_CB(skb)->data_offset +
+ skb->head - (unsigned char *)iph;
+ else
+ tot_len = skb->len - nhoff;
} else {
if (!fixedid)
iph->id = htons(id++);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4e56a4c..c3b8047 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -182,26 +182,13 @@
struct fib_table *tb;
hlist_for_each_entry_safe(tb, tmp, head, tb_hlist)
- flushed += fib_table_flush(tb);
+ flushed += fib_table_flush(net, tb);
}
if (flushed)
rt_cache_flush(net);
}
-void fib_flush_external(struct net *net)
-{
- struct fib_table *tb;
- struct hlist_head *head;
- unsigned int h;
-
- for (h = 0; h < FIB_TABLE_HASHSZ; h++) {
- head = &net->ipv4.fib_table_hash[h];
- hlist_for_each_entry(tb, head, tb_hlist)
- fib_table_flush_external(tb);
- }
-}
-
/*
* Find address type as if only "dev" was present in the system. If
* on_dev is NULL then all interfaces are taken into consideration.
@@ -590,13 +577,13 @@
if (cmd == SIOCDELRT) {
tb = fib_get_table(net, cfg.fc_table);
if (tb)
- err = fib_table_delete(tb, &cfg);
+ err = fib_table_delete(net, tb, &cfg);
else
err = -ESRCH;
} else {
tb = fib_new_table(net, cfg.fc_table);
if (tb)
- err = fib_table_insert(tb, &cfg);
+ err = fib_table_insert(net, tb, &cfg);
else
err = -ENOBUFS;
}
@@ -719,7 +706,7 @@
goto errout;
}
- err = fib_table_delete(tb, &cfg);
+ err = fib_table_delete(net, tb, &cfg);
errout:
return err;
}
@@ -741,7 +728,7 @@
goto errout;
}
- err = fib_table_insert(tb, &cfg);
+ err = fib_table_insert(net, tb, &cfg);
errout:
return err;
}
@@ -828,9 +815,9 @@
cfg.fc_scope = RT_SCOPE_HOST;
if (cmd == RTM_NEWROUTE)
- fib_table_insert(tb, &cfg);
+ fib_table_insert(net, tb, &cfg);
else
- fib_table_delete(tb, &cfg);
+ fib_table_delete(net, tb, &cfg);
}
void fib_add_ifaddr(struct in_ifaddr *ifa)
@@ -1254,7 +1241,7 @@
hlist_for_each_entry_safe(tb, tmp, head, tb_hlist) {
hlist_del(&tb->tb_hlist);
- fib_table_flush(tb);
+ fib_table_flush(net, tb);
fib_free_table(tb);
}
}
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 770bebe..2e50062 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -164,6 +164,14 @@
return NULL;
}
+static int call_fib_rule_notifiers(struct net *net,
+ enum fib_event_type event_type)
+{
+ struct fib_notifier_info info;
+
+ return call_fib_notifiers(net, event_type, &info);
+}
+
static const struct nla_policy fib4_rule_policy[FRA_MAX+1] = {
FRA_GENERIC_POLICY,
[FRA_FLOW] = { .type = NLA_U32 },
@@ -220,7 +228,7 @@
rule4->tos = frh->tos;
net->ipv4.fib_has_custom_rules = true;
- fib_flush_external(rule->fr_net);
+ call_fib_rule_notifiers(net, FIB_EVENT_RULE_ADD);
err = 0;
errout:
@@ -242,7 +250,7 @@
net->ipv4.fib_num_tclassid_users--;
#endif
net->ipv4.fib_has_custom_rules = true;
- fib_flush_external(rule->fr_net);
+ call_fib_rule_notifiers(net, FIB_EVENT_RULE_DEL);
errout:
return err;
}
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 241f27b..31cef36 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -73,6 +73,7 @@
#include <linux/slab.h>
#include <linux/export.h>
#include <linux/vmalloc.h>
+#include <linux/notifier.h>
#include <net/net_namespace.h>
#include <net/ip.h>
#include <net/protocol.h>
@@ -80,10 +81,47 @@
#include <net/tcp.h>
#include <net/sock.h>
#include <net/ip_fib.h>
-#include <net/switchdev.h>
#include <trace/events/fib.h>
#include "fib_lookup.h"
+static BLOCKING_NOTIFIER_HEAD(fib_chain);
+
+int register_fib_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_register(&fib_chain, nb);
+}
+EXPORT_SYMBOL(register_fib_notifier);
+
+int unregister_fib_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_unregister(&fib_chain, nb);
+}
+EXPORT_SYMBOL(unregister_fib_notifier);
+
+int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
+ struct fib_notifier_info *info)
+{
+ info->net = net;
+ return blocking_notifier_call_chain(&fib_chain, event_type, info);
+}
+
+static int call_fib_entry_notifiers(struct net *net,
+ enum fib_event_type event_type, u32 dst,
+ int dst_len, struct fib_info *fi,
+ u8 tos, u8 type, u32 tb_id, u32 nlflags)
+{
+ struct fib_entry_notifier_info info = {
+ .dst = dst,
+ .dst_len = dst_len,
+ .fi = fi,
+ .tos = tos,
+ .type = type,
+ .tb_id = tb_id,
+ .nlflags = nlflags,
+ };
+ return call_fib_notifiers(net, event_type, &info.info);
+}
+
#define MAX_STAT_DEPTH 32
#define KEYLENGTH (8*sizeof(t_key))
@@ -1076,7 +1114,8 @@
}
/* Caller must hold RTNL. */
-int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
+int fib_table_insert(struct net *net, struct fib_table *tb,
+ struct fib_config *cfg)
{
struct trie *t = (struct trie *)tb->tb_data;
struct fib_alias *fa, *new_fa;
@@ -1175,17 +1214,6 @@
new_fa->tb_id = tb->tb_id;
new_fa->fa_default = -1;
- err = switchdev_fib_ipv4_add(key, plen, fi,
- new_fa->fa_tos,
- cfg->fc_type,
- cfg->fc_nlflags,
- tb->tb_id);
- if (err) {
- switchdev_fib_ipv4_abort(fi);
- kmem_cache_free(fn_alias_kmem, new_fa);
- goto out;
- }
-
hlist_replace_rcu(&fa->fa_list, &new_fa->fa_list);
alias_free_mem_rcu(fa);
@@ -1193,6 +1221,11 @@
fib_release_info(fi_drop);
if (state & FA_S_ACCESSED)
rt_cache_flush(cfg->fc_nlinfo.nl_net);
+
+ call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_ADD,
+ key, plen, fi,
+ new_fa->fa_tos, cfg->fc_type,
+ tb->tb_id, cfg->fc_nlflags);
rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen,
tb->tb_id, &cfg->fc_nlinfo, nlflags);
@@ -1228,30 +1261,22 @@
new_fa->tb_id = tb->tb_id;
new_fa->fa_default = -1;
- /* (Optionally) offload fib entry to switch hardware. */
- err = switchdev_fib_ipv4_add(key, plen, fi, tos, cfg->fc_type,
- cfg->fc_nlflags, tb->tb_id);
- if (err) {
- switchdev_fib_ipv4_abort(fi);
- goto out_free_new_fa;
- }
-
/* Insert new entry to the list. */
err = fib_insert_alias(t, tp, l, new_fa, fa, key);
if (err)
- goto out_sw_fib_del;
+ goto out_free_new_fa;
if (!plen)
tb->tb_num_default++;
rt_cache_flush(cfg->fc_nlinfo.nl_net);
+ call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_ADD, key, plen, fi, tos,
+ cfg->fc_type, tb->tb_id, cfg->fc_nlflags);
rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, new_fa->tb_id,
&cfg->fc_nlinfo, nlflags);
succeeded:
return 0;
-out_sw_fib_del:
- switchdev_fib_ipv4_del(key, plen, fi, tos, cfg->fc_type, tb->tb_id);
out_free_new_fa:
kmem_cache_free(fn_alias_kmem, new_fa);
out:
@@ -1490,7 +1515,8 @@
}
/* Caller must hold RTNL. */
-int fib_table_delete(struct fib_table *tb, struct fib_config *cfg)
+int fib_table_delete(struct net *net, struct fib_table *tb,
+ struct fib_config *cfg)
{
struct trie *t = (struct trie *) tb->tb_data;
struct fib_alias *fa, *fa_to_delete;
@@ -1543,9 +1569,9 @@
if (!fa_to_delete)
return -ESRCH;
- switchdev_fib_ipv4_del(key, plen, fa_to_delete->fa_info, tos,
- cfg->fc_type, tb->tb_id);
-
+ call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, key, plen,
+ fa_to_delete->fa_info, tos, cfg->fc_type,
+ tb->tb_id, 0);
rtmsg_fib(RTM_DELROUTE, htonl(key), fa_to_delete, plen, tb->tb_id,
&cfg->fc_nlinfo, 0);
@@ -1734,82 +1760,8 @@
return NULL;
}
-/* Caller must hold RTNL */
-void fib_table_flush_external(struct fib_table *tb)
-{
- struct trie *t = (struct trie *)tb->tb_data;
- struct key_vector *pn = t->kv;
- unsigned long cindex = 1;
- struct hlist_node *tmp;
- struct fib_alias *fa;
-
- /* walk trie in reverse order */
- for (;;) {
- unsigned char slen = 0;
- struct key_vector *n;
-
- if (!(cindex--)) {
- t_key pkey = pn->key;
-
- /* cannot resize the trie vector */
- if (IS_TRIE(pn))
- break;
-
- /* resize completed node */
- pn = resize(t, pn);
- cindex = get_index(pkey, pn);
-
- continue;
- }
-
- /* grab the next available node */
- n = get_child(pn, cindex);
- if (!n)
- continue;
-
- if (IS_TNODE(n)) {
- /* record pn and cindex for leaf walking */
- pn = n;
- cindex = 1ul << n->bits;
-
- continue;
- }
-
- hlist_for_each_entry_safe(fa, tmp, &n->leaf, fa_list) {
- struct fib_info *fi = fa->fa_info;
-
- /* if alias was cloned to local then we just
- * need to remove the local copy from main
- */
- if (tb->tb_id != fa->tb_id) {
- hlist_del_rcu(&fa->fa_list);
- alias_free_mem_rcu(fa);
- continue;
- }
-
- /* record local slen */
- slen = fa->fa_slen;
-
- if (!fi || !(fi->fib_flags & RTNH_F_OFFLOAD))
- continue;
-
- switchdev_fib_ipv4_del(n->key, KEYLENGTH - fa->fa_slen,
- fi, fa->fa_tos, fa->fa_type,
- tb->tb_id);
- }
-
- /* update leaf slen */
- n->slen = slen;
-
- if (hlist_empty(&n->leaf)) {
- put_child_root(pn, n->key, NULL);
- node_free(n);
- }
- }
-}
-
/* Caller must hold RTNL. */
-int fib_table_flush(struct fib_table *tb)
+int fib_table_flush(struct net *net, struct fib_table *tb)
{
struct trie *t = (struct trie *)tb->tb_data;
struct key_vector *pn = t->kv;
@@ -1858,9 +1810,11 @@
continue;
}
- switchdev_fib_ipv4_del(n->key, KEYLENGTH - fa->fa_slen,
- fi, fa->fa_tos, fa->fa_type,
- tb->tb_id);
+ call_fib_entry_notifiers(net, FIB_EVENT_ENTRY_DEL,
+ n->key,
+ KEYLENGTH - fa->fa_slen,
+ fi, fa->fa_tos, fa->fa_type,
+ tb->tb_id, 0);
hlist_del_rcu(&fa->fa_list);
fib_release_info(fa->fa_info);
alias_free_mem_rcu(fa);
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index ecd1e09..96e0efe 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -24,7 +24,7 @@
__be16 protocol = skb->protocol;
u16 mac_len = skb->mac_len;
int gre_offset, outer_hlen;
- bool need_csum, ufo;
+ bool need_csum, ufo, gso_partial;
if (!skb->encapsulation)
goto out;
@@ -69,6 +69,8 @@
goto out;
}
+ gso_partial = !!(skb_shinfo(segs)->gso_type & SKB_GSO_PARTIAL);
+
outer_hlen = skb_tnl_header_len(skb);
gre_offset = outer_hlen - tnl_hlen;
skb = segs;
@@ -96,7 +98,7 @@
greh = (struct gre_base_hdr *)skb_transport_header(skb);
pcsum = (__sum16 *)(greh + 1);
- if (skb_is_gso(skb)) {
+ if (gso_partial) {
unsigned int partial_adj;
/* Adjust checksum to account for the fact that
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 4b351af..d6feabb 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -312,6 +312,7 @@
{
const struct iphdr *iph = ip_hdr(skb);
struct rtable *rt;
+ struct net_device *dev = skb->dev;
/* if ingress device is enslaved to an L3 master device pass the
* skb to its handler for processing
@@ -341,7 +342,7 @@
*/
if (!skb_valid_dst(skb)) {
int err = ip_route_input_noref(skb, iph->daddr, iph->saddr,
- iph->tos, skb->dev);
+ iph->tos, dev);
if (unlikely(err)) {
if (err == -EXDEV)
__NET_INC_STATS(net, LINUX_MIB_IPRPFILTER);
@@ -370,7 +371,7 @@
__IP_UPD_PO_STATS(net, IPSTATS_MIB_INBCAST, skb->len);
} else if (skb->pkt_type == PACKET_BROADCAST ||
skb->pkt_type == PACKET_MULTICAST) {
- struct in_device *in_dev = __in_dev_get_rcu(skb->dev);
+ struct in_device *in_dev = __in_dev_get_rcu(dev);
/* RFC 1122 3.3.6:
*
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index cc701fa..5d7944f 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -88,6 +88,7 @@
struct net_device *dev;
struct pcpu_sw_netstats *tstats;
struct xfrm_state *x;
+ struct xfrm_mode *inner_mode;
struct ip_tunnel *tunnel = XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4;
u32 orig_mark = skb->mark;
int ret;
@@ -105,7 +106,19 @@
}
x = xfrm_input_state(skb);
- family = x->inner_mode->afinfo->family;
+
+ inner_mode = x->inner_mode;
+
+ if (x->sel.family == AF_UNSPEC) {
+ inner_mode = xfrm_ip2inner_mode(x, XFRM_MODE_SKB_CB(skb)->protocol);
+ if (inner_mode == NULL) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINSTATEMODEERROR);
+ return -EINVAL;
+ }
+ }
+
+ family = inner_mode->afinfo->family;
skb->mark = be32_to_cpu(tunnel->parms.i_key);
ret = xfrm_policy_check(NULL, XFRM_POLICY_IN, skb, family);
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 2625332..a87bcd2 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -2076,6 +2076,7 @@
struct rta_mfc_stats mfcs;
struct nlattr *mp_attr;
struct rtnexthop *nhp;
+ unsigned long lastuse;
int ct;
/* If cache is unresolved, don't try to parse IIF and OIF */
@@ -2105,12 +2106,14 @@
nla_nest_end(skb, mp_attr);
+ lastuse = READ_ONCE(c->mfc_un.res.lastuse);
+ lastuse = time_after_eq(jiffies, lastuse) ? jiffies - lastuse : 0;
+
mfcs.mfcs_packets = c->mfc_un.res.pkt;
mfcs.mfcs_bytes = c->mfc_un.res.bytes;
mfcs.mfcs_wrong_if = c->mfc_un.res.wrong_if;
if (nla_put_64bit(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs, RTA_PAD) ||
- nla_put_u64_64bit(skb, RTA_EXPIRES,
- jiffies_to_clock_t(c->mfc_un.res.lastuse),
+ nla_put_u64_64bit(skb, RTA_EXPIRES, jiffies_to_clock_t(lastuse),
RTA_PAD))
return -EMSGSIZE;
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index f993545..7c00ce9 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -156,7 +156,7 @@
.u = {
.log = {
.level = 4,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 870aebd..713c09a 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -110,7 +110,7 @@
if (!help)
return NF_ACCEPT;
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index 4b5904b..d075b3c 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -149,7 +149,7 @@
return -NF_ACCEPT;
}
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
innerproto = __nf_ct_l4proto_find(PF_INET, origtuple.dst.protonum);
/* Ordinarily, we'd expect the inverted tupleproto, but it's
diff --git a/net/ipv4/netfilter/nf_log_arp.c b/net/ipv4/netfilter/nf_log_arp.c
index 8945c26..b24795e 100644
--- a/net/ipv4/netfilter/nf_log_arp.c
+++ b/net/ipv4/netfilter/nf_log_arp.c
@@ -30,7 +30,7 @@
.u = {
.log = {
.level = LOGLEVEL_NOTICE,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
diff --git a/net/ipv4/netfilter/nf_log_ipv4.c b/net/ipv4/netfilter/nf_log_ipv4.c
index 20f2255..8566489 100644
--- a/net/ipv4/netfilter/nf_log_ipv4.c
+++ b/net/ipv4/netfilter/nf_log_ipv4.c
@@ -29,7 +29,7 @@
.u = {
.log = {
.level = LOGLEVEL_NOTICE,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
@@ -46,7 +46,7 @@
if (info->type == NF_LOG_TYPE_LOG)
logflags = info->u.log.logflags;
else
- logflags = NF_LOG_MASK;
+ logflags = NF_LOG_DEFAULT_MASK;
ih = skb_header_pointer(skb, iphoff, sizeof(_iph), &_iph);
if (ih == NULL) {
@@ -76,7 +76,7 @@
if (ntohs(ih->frag_off) & IP_OFFSET)
nf_log_buf_add(m, "FRAG:%u ", ntohs(ih->frag_off) & IP_OFFSET);
- if ((logflags & XT_LOG_IPOPT) &&
+ if ((logflags & NF_LOG_IPOPT) &&
ih->ihl * 4 > sizeof(struct iphdr)) {
const unsigned char *op;
unsigned char _opt[4 * 15 - sizeof(struct iphdr)];
@@ -250,7 +250,7 @@
}
/* Max length: 15 "UID=4294967295 " */
- if ((logflags & XT_LOG_UID) && !iphoff)
+ if ((logflags & NF_LOG_UID) && !iphoff)
nf_log_dump_sk_uid_gid(m, skb->sk);
/* Max length: 16 "MARK=0xFFFFFFFF " */
@@ -282,7 +282,7 @@
if (info->type == NF_LOG_TYPE_LOG)
logflags = info->u.log.logflags;
- if (!(logflags & XT_LOG_MACDECODE))
+ if (!(logflags & NF_LOG_MACDECODE))
goto fallback;
switch (dev->type) {
diff --git a/net/ipv4/netfilter/nf_nat_proto_gre.c b/net/ipv4/netfilter/nf_nat_proto_gre.c
index 9414923..edf0500 100644
--- a/net/ipv4/netfilter/nf_nat_proto_gre.c
+++ b/net/ipv4/netfilter/nf_nat_proto_gre.c
@@ -88,8 +88,8 @@
const struct nf_conntrack_tuple *tuple,
enum nf_nat_manip_type maniptype)
{
- const struct gre_hdr *greh;
- struct gre_hdr_pptp *pgreh;
+ const struct gre_base_hdr *greh;
+ struct pptp_gre_header *pgreh;
/* pgreh includes two optional 32bit fields which are not required
* to be there. That's where the magic '8' comes from */
@@ -97,18 +97,19 @@
return false;
greh = (void *)skb->data + hdroff;
- pgreh = (struct gre_hdr_pptp *)greh;
+ pgreh = (struct pptp_gre_header *)greh;
/* we only have destination manip of a packet, since 'source key'
* is not present in the packet itself */
if (maniptype != NF_NAT_MANIP_DST)
return true;
- switch (greh->version) {
- case GRE_VERSION_1701:
+
+ switch (greh->flags & GRE_VERSION) {
+ case GRE_VERSION_0:
/* We do not currently NAT any GREv0 packets.
* Try to behave like "nf_nat_proto_unknown" */
break;
- case GRE_VERSION_PPTP:
+ case GRE_VERSION_1:
pr_debug("call_id -> 0x%04x\n", ntohs(tuple->dst.u.gre.key));
pgreh->call_id = tuple->dst.u.gre.key;
break;
diff --git a/net/ipv4/netfilter/nf_tables_arp.c b/net/ipv4/netfilter/nf_tables_arp.c
index cd84d42..805c8dd 100644
--- a/net/ipv4/netfilter/nf_tables_arp.c
+++ b/net/ipv4/netfilter/nf_tables_arp.c
@@ -21,7 +21,7 @@
{
struct nft_pktinfo pkt;
- nft_set_pktinfo(&pkt, skb, state);
+ nft_set_pktinfo_unspec(&pkt, skb, state);
return nft_do_chain(&pkt, priv);
}
@@ -80,7 +80,10 @@
{
int ret;
- nft_register_chain_type(&filter_arp);
+ ret = nft_register_chain_type(&filter_arp);
+ if (ret < 0)
+ return ret;
+
ret = register_pernet_subsys(&nf_tables_arp_net_ops);
if (ret < 0)
nft_unregister_chain_type(&filter_arp);
diff --git a/net/ipv4/netfilter/nf_tables_ipv4.c b/net/ipv4/netfilter/nf_tables_ipv4.c
index e44ba3b..2840a29 100644
--- a/net/ipv4/netfilter/nf_tables_ipv4.c
+++ b/net/ipv4/netfilter/nf_tables_ipv4.c
@@ -103,7 +103,10 @@
{
int ret;
- nft_register_chain_type(&filter_ipv4);
+ ret = nft_register_chain_type(&filter_ipv4);
+ if (ret < 0)
+ return ret;
+
ret = register_pernet_subsys(&nf_tables_ipv4_net_ops);
if (ret < 0)
nft_unregister_chain_type(&filter_ipv4);
diff --git a/net/ipv4/netfilter/nft_chain_route_ipv4.c b/net/ipv4/netfilter/nft_chain_route_ipv4.c
index 2375b0a..30493be 100644
--- a/net/ipv4/netfilter/nft_chain_route_ipv4.c
+++ b/net/ipv4/netfilter/nft_chain_route_ipv4.c
@@ -31,6 +31,7 @@
__be32 saddr, daddr;
u_int8_t tos;
const struct iphdr *iph;
+ int err;
/* root is playing with raw sockets. */
if (skb->len < sizeof(struct iphdr) ||
@@ -46,15 +47,17 @@
tos = iph->tos;
ret = nft_do_chain(&pkt, priv);
- if (ret != NF_DROP && ret != NF_QUEUE) {
+ if (ret != NF_DROP && ret != NF_STOLEN) {
iph = ip_hdr(skb);
if (iph->saddr != saddr ||
iph->daddr != daddr ||
skb->mark != mark ||
- iph->tos != tos)
- if (ip_route_me_harder(state->net, skb, RTN_UNSPEC))
- ret = NF_DROP;
+ iph->tos != tos) {
+ err = ip_route_me_harder(state->net, skb, RTN_UNSPEC);
+ if (err < 0)
+ ret = NF_DROP_ERR(err);
+ }
}
return ret;
}
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index b52496f..654a9af 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -476,12 +476,18 @@
atomic_t *p_id = ip_idents + hash % IP_IDENTS_SZ;
u32 old = ACCESS_ONCE(*p_tstamp);
u32 now = (u32)jiffies;
- u32 delta = 0;
+ u32 new, delta = 0;
if (old != now && cmpxchg(p_tstamp, old, now) == old)
delta = prandom_u32_max(now - old);
- return atomic_add_return(segs + delta, p_id) - segs;
+ /* Do not use atomic_add_return() as it makes UBSAN unhappy */
+ do {
+ old = (u32)atomic_read(p_id);
+ new = old + delta + segs;
+ } while (atomic_cmpxchg(p_id, old, new) != old);
+
+ return new - segs;
}
EXPORT_SYMBOL(ip_idents_reserve);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7dae800..f253e50 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -387,7 +387,7 @@
icsk->icsk_rto = TCP_TIMEOUT_INIT;
tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
- tp->rtt_min[0].rtt = ~0U;
+ minmax_reset(&tp->rtt_min, tcp_time_stamp, ~0U);
/* So many TCP implementations out there (incorrectly) count the
* initial SYN frame in their delayed-ACK and congestion control
@@ -396,6 +396,9 @@
*/
tp->snd_cwnd = TCP_INIT_CWND;
+ /* There's a bubble in the pipe until at least the first ACK. */
+ tp->app_limited = ~0U;
+
/* See draft-stevens-tcpca-spec-01 for discussion of the
* initialization of these values.
*/
@@ -1014,6 +1017,9 @@
flags);
lock_sock(sk);
+
+ tcp_rate_check_app_limited(sk); /* is sending application-limited? */
+
res = do_tcp_sendpages(sk, page, offset, size, flags);
release_sock(sk);
return res;
@@ -1115,6 +1121,8 @@
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+ tcp_rate_check_app_limited(sk); /* is sending application-limited? */
+
/* Wait for a connection to finish. One exception is TCP Fast Open
* (passive side) where data is allowed to be sent before a connection
* is fully established.
@@ -2704,7 +2712,7 @@
{
const struct tcp_sock *tp = tcp_sk(sk); /* iff sk_type == SOCK_STREAM */
const struct inet_connection_sock *icsk = inet_csk(sk);
- u32 now = tcp_time_stamp;
+ u32 now = tcp_time_stamp, intv;
unsigned int start;
int notsent_bytes;
u64 rate64;
@@ -2794,6 +2802,15 @@
info->tcpi_min_rtt = tcp_min_rtt(tp);
info->tcpi_data_segs_in = tp->data_segs_in;
info->tcpi_data_segs_out = tp->data_segs_out;
+
+ info->tcpi_delivery_rate_app_limited = tp->rate_app_limited ? 1 : 0;
+ rate = READ_ONCE(tp->rate_delivered);
+ intv = READ_ONCE(tp->rate_interval_us);
+ if (rate && intv) {
+ rate64 = (u64)rate * tp->mss_cache * USEC_PER_SEC;
+ do_div(rate64, intv);
+ put_unaligned(rate64, &info->tcpi_delivery_rate);
+ }
}
EXPORT_SYMBOL_GPL(tcp_get_info);
@@ -3261,11 +3278,12 @@
void __init tcp_init(void)
{
- unsigned long limit;
int max_rshare, max_wshare, cnt;
+ unsigned long limit;
unsigned int i;
- sock_skb_cb_check_size(sizeof(struct tcp_skb_cb));
+ BUILD_BUG_ON(sizeof(struct tcp_skb_cb) >
+ FIELD_SIZEOF(struct sk_buff, cb));
percpu_counter_init(&tcp_sockets_allocated, 0, GFP_KERNEL);
percpu_counter_init(&tcp_orphan_count, 0, GFP_KERNEL);
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
new file mode 100644
index 0000000..0ea66c2
--- /dev/null
+++ b/net/ipv4/tcp_bbr.c
@@ -0,0 +1,896 @@
+/* Bottleneck Bandwidth and RTT (BBR) congestion control
+ *
+ * BBR congestion control computes the sending rate based on the delivery
+ * rate (throughput) estimated from ACKs. In a nutshell:
+ *
+ * On each ACK, update our model of the network path:
+ * bottleneck_bandwidth = windowed_max(delivered / elapsed, 10 round trips)
+ * min_rtt = windowed_min(rtt, 10 seconds)
+ * pacing_rate = pacing_gain * bottleneck_bandwidth
+ * cwnd = max(cwnd_gain * bottleneck_bandwidth * min_rtt, 4)
+ *
+ * The core algorithm does not react directly to packet losses or delays,
+ * although BBR may adjust the size of next send per ACK when loss is
+ * observed, or adjust the sending rate if it estimates there is a
+ * traffic policer, in order to keep the drop rate reasonable.
+ *
+ * BBR is described in detail in:
+ * "BBR: Congestion-Based Congestion Control",
+ * Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh,
+ * Van Jacobson. ACM Queue, Vol. 14 No. 5, September-October 2016.
+ *
+ * There is a public e-mail list for discussing BBR development and testing:
+ * https://groups.google.com/forum/#!forum/bbr-dev
+ *
+ * NOTE: BBR *must* be used with the fq qdisc ("man tc-fq") with pacing enabled,
+ * since pacing is integral to the BBR design and implementation.
+ * BBR without pacing would not function properly, and may incur unnecessary
+ * high packet loss rates.
+ */
+#include <linux/module.h>
+#include <net/tcp.h>
+#include <linux/inet_diag.h>
+#include <linux/inet.h>
+#include <linux/random.h>
+#include <linux/win_minmax.h>
+
+/* Scale factor for rate in pkt/uSec unit to avoid truncation in bandwidth
+ * estimation. The rate unit ~= (1500 bytes / 1 usec / 2^24) ~= 715 bps.
+ * This handles bandwidths from 0.06pps (715bps) to 256Mpps (3Tbps) in a u32.
+ * Since the minimum window is >=4 packets, the lower bound isn't
+ * an issue. The upper bound isn't an issue with existing technologies.
+ */
+#define BW_SCALE 24
+#define BW_UNIT (1 << BW_SCALE)
+
+#define BBR_SCALE 8 /* scaling factor for fractions in BBR (e.g. gains) */
+#define BBR_UNIT (1 << BBR_SCALE)
+
+/* BBR has the following modes for deciding how fast to send: */
+enum bbr_mode {
+ BBR_STARTUP, /* ramp up sending rate rapidly to fill pipe */
+ BBR_DRAIN, /* drain any queue created during startup */
+ BBR_PROBE_BW, /* discover, share bw: pace around estimated bw */
+ BBR_PROBE_RTT, /* cut cwnd to min to probe min_rtt */
+};
+
+/* BBR congestion control block */
+struct bbr {
+ u32 min_rtt_us; /* min RTT in min_rtt_win_sec window */
+ u32 min_rtt_stamp; /* timestamp of min_rtt_us */
+ u32 probe_rtt_done_stamp; /* end time for BBR_PROBE_RTT mode */
+ struct minmax bw; /* Max recent delivery rate in pkts/uS << 24 */
+ u32 rtt_cnt; /* count of packet-timed rounds elapsed */
+ u32 next_rtt_delivered; /* scb->tx.delivered at end of round */
+ struct skb_mstamp cycle_mstamp; /* time of this cycle phase start */
+ u32 mode:3, /* current bbr_mode in state machine */
+ prev_ca_state:3, /* CA state on previous ACK */
+ packet_conservation:1, /* use packet conservation? */
+ restore_cwnd:1, /* decided to revert cwnd to old value */
+ round_start:1, /* start of packet-timed tx->ack round? */
+ tso_segs_goal:7, /* segments we want in each skb we send */
+ idle_restart:1, /* restarting after idle? */
+ probe_rtt_round_done:1, /* a BBR_PROBE_RTT round at 4 pkts? */
+ unused:5,
+ lt_is_sampling:1, /* taking long-term ("LT") samples now? */
+ lt_rtt_cnt:7, /* round trips in long-term interval */
+ lt_use_bw:1; /* use lt_bw as our bw estimate? */
+ u32 lt_bw; /* LT est delivery rate in pkts/uS << 24 */
+ u32 lt_last_delivered; /* LT intvl start: tp->delivered */
+ u32 lt_last_stamp; /* LT intvl start: tp->delivered_mstamp */
+ u32 lt_last_lost; /* LT intvl start: tp->lost */
+ u32 pacing_gain:10, /* current gain for setting pacing rate */
+ cwnd_gain:10, /* current gain for setting cwnd */
+ full_bw_cnt:3, /* number of rounds without large bw gains */
+ cycle_idx:3, /* current index in pacing_gain cycle array */
+ unused_b:6;
+ u32 prior_cwnd; /* prior cwnd upon entering loss recovery */
+ u32 full_bw; /* recent bw, to estimate if pipe is full */
+};
+
+#define CYCLE_LEN 8 /* number of phases in a pacing gain cycle */
+
+/* Window length of bw filter (in rounds): */
+static const int bbr_bw_rtts = CYCLE_LEN + 2;
+/* Window length of min_rtt filter (in sec): */
+static const u32 bbr_min_rtt_win_sec = 10;
+/* Minimum time (in ms) spent at bbr_cwnd_min_target in BBR_PROBE_RTT mode: */
+static const u32 bbr_probe_rtt_mode_ms = 200;
+/* Skip TSO below the following bandwidth (bits/sec): */
+static const int bbr_min_tso_rate = 1200000;
+
+/* We use a high_gain value of 2/ln(2) because it's the smallest pacing gain
+ * that will allow a smoothly increasing pacing rate that will double each RTT
+ * and send the same number of packets per RTT that an un-paced, slow-starting
+ * Reno or CUBIC flow would:
+ */
+static const int bbr_high_gain = BBR_UNIT * 2885 / 1000 + 1;
+/* The pacing gain of 1/high_gain in BBR_DRAIN is calculated to typically drain
+ * the queue created in BBR_STARTUP in a single round:
+ */
+static const int bbr_drain_gain = BBR_UNIT * 1000 / 2885;
+/* The gain for deriving steady-state cwnd tolerates delayed/stretched ACKs: */
+static const int bbr_cwnd_gain = BBR_UNIT * 2;
+/* The pacing_gain values for the PROBE_BW gain cycle, to discover/share bw: */
+static const int bbr_pacing_gain[] = {
+ BBR_UNIT * 5 / 4, /* probe for more available bw */
+ BBR_UNIT * 3 / 4, /* drain queue and/or yield bw to other flows */
+ BBR_UNIT, BBR_UNIT, BBR_UNIT, /* cruise at 1.0*bw to utilize pipe, */
+ BBR_UNIT, BBR_UNIT, BBR_UNIT /* without creating excess queue... */
+};
+/* Randomize the starting gain cycling phase over N phases: */
+static const u32 bbr_cycle_rand = 7;
+
+/* Try to keep at least this many packets in flight, if things go smoothly. For
+ * smooth functioning, a sliding window protocol ACKing every other packet
+ * needs at least 4 packets in flight:
+ */
+static const u32 bbr_cwnd_min_target = 4;
+
+/* To estimate if BBR_STARTUP mode (i.e. high_gain) has filled pipe... */
+/* If bw has increased significantly (1.25x), there may be more bw available: */
+static const u32 bbr_full_bw_thresh = BBR_UNIT * 5 / 4;
+/* But after 3 rounds w/o significant bw growth, estimate pipe is full: */
+static const u32 bbr_full_bw_cnt = 3;
+
+/* "long-term" ("LT") bandwidth estimator parameters... */
+/* The minimum number of rounds in an LT bw sampling interval: */
+static const u32 bbr_lt_intvl_min_rtts = 4;
+/* If lost/delivered ratio > 20%, interval is "lossy" and we may be policed: */
+static const u32 bbr_lt_loss_thresh = 50;
+/* If 2 intervals have a bw ratio <= 1/8, their bw is "consistent": */
+static const u32 bbr_lt_bw_ratio = BBR_UNIT / 8;
+/* If 2 intervals have a bw diff <= 4 Kbit/sec their bw is "consistent": */
+static const u32 bbr_lt_bw_diff = 4000 / 8;
+/* If we estimate we're policed, use lt_bw for this many round trips: */
+static const u32 bbr_lt_bw_max_rtts = 48;
+
+/* Do we estimate that STARTUP filled the pipe? */
+static bool bbr_full_bw_reached(const struct sock *sk)
+{
+ const struct bbr *bbr = inet_csk_ca(sk);
+
+ return bbr->full_bw_cnt >= bbr_full_bw_cnt;
+}
+
+/* Return the windowed max recent bandwidth sample, in pkts/uS << BW_SCALE. */
+static u32 bbr_max_bw(const struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ return minmax_get(&bbr->bw);
+}
+
+/* Return the estimated bandwidth of the path, in pkts/uS << BW_SCALE. */
+static u32 bbr_bw(const struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ return bbr->lt_use_bw ? bbr->lt_bw : bbr_max_bw(sk);
+}
+
+/* Return rate in bytes per second, optionally with a gain.
+ * The order here is chosen carefully to avoid overflow of u64. This should
+ * work for input rates of up to 2.9Tbit/sec and gain of 2.89x.
+ */
+static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
+{
+ rate *= tcp_mss_to_mtu(sk, tcp_sk(sk)->mss_cache);
+ rate *= gain;
+ rate >>= BBR_SCALE;
+ rate *= USEC_PER_SEC;
+ return rate >> BW_SCALE;
+}
+
+/* Pace using current bw estimate and a gain factor. In order to help drive the
+ * network toward lower queues while maintaining high utilization and low
+ * latency, the average pacing rate aims to be slightly (~1%) lower than the
+ * estimated bandwidth. This is an important aspect of the design. In this
+ * implementation this slightly lower pacing rate is achieved implicitly by not
+ * including link-layer headers in the packet size used for the pacing rate.
+ */
+static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+ u64 rate = bw;
+
+ rate = bbr_rate_bytes_per_sec(sk, rate, gain);
+ rate = min_t(u64, rate, sk->sk_max_pacing_rate);
+ if (bbr->mode != BBR_STARTUP || rate > sk->sk_pacing_rate)
+ sk->sk_pacing_rate = rate;
+}
+
+/* Return count of segments we want in the skbs we send, or 0 for default. */
+static u32 bbr_tso_segs_goal(struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ return bbr->tso_segs_goal;
+}
+
+static void bbr_set_tso_segs_goal(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 min_segs;
+
+ min_segs = sk->sk_pacing_rate < (bbr_min_tso_rate >> 3) ? 1 : 2;
+ bbr->tso_segs_goal = min(tcp_tso_autosize(sk, tp->mss_cache, min_segs),
+ 0x7FU);
+}
+
+/* Save "last known good" cwnd so we can restore it after losses or PROBE_RTT */
+static void bbr_save_cwnd(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if (bbr->prev_ca_state < TCP_CA_Recovery && bbr->mode != BBR_PROBE_RTT)
+ bbr->prior_cwnd = tp->snd_cwnd; /* this cwnd is good enough */
+ else /* loss recovery or BBR_PROBE_RTT have temporarily cut cwnd */
+ bbr->prior_cwnd = max(bbr->prior_cwnd, tp->snd_cwnd);
+}
+
+static void bbr_cwnd_event(struct sock *sk, enum tcp_ca_event event)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if (event == CA_EVENT_TX_START && tp->app_limited) {
+ bbr->idle_restart = 1;
+ /* Avoid pointless buffer overflows: pace at est. bw if we don't
+ * need more speed (we're restarting from idle and app-limited).
+ */
+ if (bbr->mode == BBR_PROBE_BW)
+ bbr_set_pacing_rate(sk, bbr_bw(sk), BBR_UNIT);
+ }
+}
+
+/* Find target cwnd. Right-size the cwnd based on min RTT and the
+ * estimated bottleneck bandwidth:
+ *
+ * cwnd = bw * min_rtt * gain = BDP * gain
+ *
+ * The key factor, gain, controls the amount of queue. While a small gain
+ * builds a smaller queue, it becomes more vulnerable to noise in RTT
+ * measurements (e.g., delayed ACKs or other ACK compression effects). This
+ * noise may cause BBR to under-estimate the rate.
+ *
+ * To achieve full performance in high-speed paths, we budget enough cwnd to
+ * fit full-sized skbs in-flight on both end hosts to fully utilize the path:
+ * - one skb in sending host Qdisc,
+ * - one skb in sending host TSO/GSO engine
+ * - one skb being received by receiver host LRO/GRO/delayed-ACK engine
+ * Don't worry, at low rates (bbr_min_tso_rate) this won't bloat cwnd because
+ * in such cases tso_segs_goal is 1. The minimum cwnd is 4 packets,
+ * which allows 2 outstanding 2-packet sequences, to try to keep pipe
+ * full even with ACK-every-other-packet delayed ACKs.
+ */
+static u32 bbr_target_cwnd(struct sock *sk, u32 bw, int gain)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 cwnd;
+ u64 w;
+
+ /* If we've never had a valid RTT sample, cap cwnd at the initial
+ * default. This should only happen when the connection is not using TCP
+ * timestamps and has retransmitted all of the SYN/SYNACK/data packets
+ * ACKed so far. In this case, an RTO can cut cwnd to 1, in which
+ * case we need to slow-start up toward something safe: TCP_INIT_CWND.
+ */
+ if (unlikely(bbr->min_rtt_us == ~0U)) /* no valid RTT samples yet? */
+ return TCP_INIT_CWND; /* be safe: cap at default initial cwnd*/
+
+ w = (u64)bw * bbr->min_rtt_us;
+
+ /* Apply a gain to the given value, then remove the BW_SCALE shift. */
+ cwnd = (((w * gain) >> BBR_SCALE) + BW_UNIT - 1) / BW_UNIT;
+
+ /* Allow enough full-sized skbs in flight to utilize end systems. */
+ cwnd += 3 * bbr->tso_segs_goal;
+
+ /* Reduce delayed ACKs by rounding up cwnd to the next even number. */
+ cwnd = (cwnd + 1) & ~1U;
+
+ return cwnd;
+}
+
+/* An optimization in BBR to reduce losses: On the first round of recovery, we
+ * follow the packet conservation principle: send P packets per P packets acked.
+ * After that, we slow-start and send at most 2*P packets per P packets acked.
+ * After recovery finishes, or upon undo, we restore the cwnd we had when
+ * recovery started (capped by the target cwnd based on estimated BDP).
+ *
+ * TODO(ycheng/ncardwell): implement a rate-based approach.
+ */
+static bool bbr_set_cwnd_to_recover_or_restore(
+ struct sock *sk, const struct rate_sample *rs, u32 acked, u32 *new_cwnd)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u8 prev_state = bbr->prev_ca_state, state = inet_csk(sk)->icsk_ca_state;
+ u32 cwnd = tp->snd_cwnd;
+
+ /* An ACK for P pkts should release at most 2*P packets. We do this
+ * in two steps. First, here we deduct the number of lost packets.
+ * Then, in bbr_set_cwnd() we slow start up toward the target cwnd.
+ */
+ if (rs->losses > 0)
+ cwnd = max_t(s32, cwnd - rs->losses, 1);
+
+ if (state == TCP_CA_Recovery && prev_state != TCP_CA_Recovery) {
+ /* Starting 1st round of Recovery, so do packet conservation. */
+ bbr->packet_conservation = 1;
+ bbr->next_rtt_delivered = tp->delivered; /* start round now */
+ /* Cut unused cwnd from app behavior, TSQ, or TSO deferral: */
+ cwnd = tcp_packets_in_flight(tp) + acked;
+ } else if (prev_state >= TCP_CA_Recovery && state < TCP_CA_Recovery) {
+ /* Exiting loss recovery; restore cwnd saved before recovery. */
+ bbr->restore_cwnd = 1;
+ bbr->packet_conservation = 0;
+ }
+ bbr->prev_ca_state = state;
+
+ if (bbr->restore_cwnd) {
+ /* Restore cwnd after exiting loss recovery or PROBE_RTT. */
+ cwnd = max(cwnd, bbr->prior_cwnd);
+ bbr->restore_cwnd = 0;
+ }
+
+ if (bbr->packet_conservation) {
+ *new_cwnd = max(cwnd, tcp_packets_in_flight(tp) + acked);
+ return true; /* yes, using packet conservation */
+ }
+ *new_cwnd = cwnd;
+ return false;
+}
+
+/* Slow-start up toward target cwnd (if bw estimate is growing, or packet loss
+ * has drawn us down below target), or snap down to target if we're above it.
+ */
+static void bbr_set_cwnd(struct sock *sk, const struct rate_sample *rs,
+ u32 acked, u32 bw, int gain)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 cwnd = 0, target_cwnd = 0;
+
+ if (!acked)
+ return;
+
+ if (bbr_set_cwnd_to_recover_or_restore(sk, rs, acked, &cwnd))
+ goto done;
+
+ /* If we're below target cwnd, slow start cwnd toward target cwnd. */
+ target_cwnd = bbr_target_cwnd(sk, bw, gain);
+ if (bbr_full_bw_reached(sk)) /* only cut cwnd if we filled the pipe */
+ cwnd = min(cwnd + acked, target_cwnd);
+ else if (cwnd < target_cwnd || tp->delivered < TCP_INIT_CWND)
+ cwnd = cwnd + acked;
+ cwnd = max(cwnd, bbr_cwnd_min_target);
+
+done:
+ tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp); /* apply global cap */
+ if (bbr->mode == BBR_PROBE_RTT) /* drain queue, refresh min_rtt */
+ tp->snd_cwnd = min(tp->snd_cwnd, bbr_cwnd_min_target);
+}
+
+/* End cycle phase if it's time and/or we hit the phase's in-flight target. */
+static bool bbr_is_next_cycle_phase(struct sock *sk,
+ const struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ bool is_full_length =
+ skb_mstamp_us_delta(&tp->delivered_mstamp, &bbr->cycle_mstamp) >
+ bbr->min_rtt_us;
+ u32 inflight, bw;
+
+ /* The pacing_gain of 1.0 paces at the estimated bw to try to fully
+ * use the pipe without increasing the queue.
+ */
+ if (bbr->pacing_gain == BBR_UNIT)
+ return is_full_length; /* just use wall clock time */
+
+ inflight = rs->prior_in_flight; /* what was in-flight before ACK? */
+ bw = bbr_max_bw(sk);
+
+ /* A pacing_gain > 1.0 probes for bw by trying to raise inflight to at
+ * least pacing_gain*BDP; this may take more than min_rtt if min_rtt is
+ * small (e.g. on a LAN). We do not persist if packets are lost, since
+ * a path with small buffers may not hold that much.
+ */
+ if (bbr->pacing_gain > BBR_UNIT)
+ return is_full_length &&
+ (rs->losses || /* perhaps pacing_gain*BDP won't fit */
+ inflight >= bbr_target_cwnd(sk, bw, bbr->pacing_gain));
+
+ /* A pacing_gain < 1.0 tries to drain extra queue we added if bw
+ * probing didn't find more bw. If inflight falls to match BDP then we
+ * estimate queue is drained; persisting would underutilize the pipe.
+ */
+ return is_full_length ||
+ inflight <= bbr_target_cwnd(sk, bw, BBR_UNIT);
+}
+
+static void bbr_advance_cycle_phase(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ bbr->cycle_idx = (bbr->cycle_idx + 1) & (CYCLE_LEN - 1);
+ bbr->cycle_mstamp = tp->delivered_mstamp;
+ bbr->pacing_gain = bbr_pacing_gain[bbr->cycle_idx];
+}
+
+/* Gain cycling: cycle pacing gain to converge to fair share of available bw. */
+static void bbr_update_cycle_phase(struct sock *sk,
+ const struct rate_sample *rs)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if ((bbr->mode == BBR_PROBE_BW) && !bbr->lt_use_bw &&
+ bbr_is_next_cycle_phase(sk, rs))
+ bbr_advance_cycle_phase(sk);
+}
+
+static void bbr_reset_startup_mode(struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ bbr->mode = BBR_STARTUP;
+ bbr->pacing_gain = bbr_high_gain;
+ bbr->cwnd_gain = bbr_high_gain;
+}
+
+static void bbr_reset_probe_bw_mode(struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ bbr->mode = BBR_PROBE_BW;
+ bbr->pacing_gain = BBR_UNIT;
+ bbr->cwnd_gain = bbr_cwnd_gain;
+ bbr->cycle_idx = CYCLE_LEN - 1 - prandom_u32_max(bbr_cycle_rand);
+ bbr_advance_cycle_phase(sk); /* flip to next phase of gain cycle */
+}
+
+static void bbr_reset_mode(struct sock *sk)
+{
+ if (!bbr_full_bw_reached(sk))
+ bbr_reset_startup_mode(sk);
+ else
+ bbr_reset_probe_bw_mode(sk);
+}
+
+/* Start a new long-term sampling interval. */
+static void bbr_reset_lt_bw_sampling_interval(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ bbr->lt_last_stamp = tp->delivered_mstamp.stamp_jiffies;
+ bbr->lt_last_delivered = tp->delivered;
+ bbr->lt_last_lost = tp->lost;
+ bbr->lt_rtt_cnt = 0;
+}
+
+/* Completely reset long-term bandwidth sampling. */
+static void bbr_reset_lt_bw_sampling(struct sock *sk)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ bbr->lt_bw = 0;
+ bbr->lt_use_bw = 0;
+ bbr->lt_is_sampling = false;
+ bbr_reset_lt_bw_sampling_interval(sk);
+}
+
+/* Long-term bw sampling interval is done. Estimate whether we're policed. */
+static void bbr_lt_bw_interval_done(struct sock *sk, u32 bw)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 diff;
+
+ if (bbr->lt_bw) { /* do we have bw from a previous interval? */
+ /* Is new bw close to the lt_bw from the previous interval? */
+ diff = abs(bw - bbr->lt_bw);
+ if ((diff * BBR_UNIT <= bbr_lt_bw_ratio * bbr->lt_bw) ||
+ (bbr_rate_bytes_per_sec(sk, diff, BBR_UNIT) <=
+ bbr_lt_bw_diff)) {
+ /* All criteria are met; estimate we're policed. */
+ bbr->lt_bw = (bw + bbr->lt_bw) >> 1; /* avg 2 intvls */
+ bbr->lt_use_bw = 1;
+ bbr->pacing_gain = BBR_UNIT; /* try to avoid drops */
+ bbr->lt_rtt_cnt = 0;
+ return;
+ }
+ }
+ bbr->lt_bw = bw;
+ bbr_reset_lt_bw_sampling_interval(sk);
+}
+
+/* Token-bucket traffic policers are common (see "An Internet-Wide Analysis of
+ * Traffic Policing", SIGCOMM 2016). BBR detects token-bucket policers and
+ * explicitly models their policed rate, to reduce unnecessary losses. We
+ * estimate that we're policed if we see 2 consecutive sampling intervals with
+ * consistent throughput and high packet loss. If we think we're being policed,
+ * set lt_bw to the "long-term" average delivery rate from those 2 intervals.
+ */
+static void bbr_lt_bw_sampling(struct sock *sk, const struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 lost, delivered;
+ u64 bw;
+ s32 t;
+
+ if (bbr->lt_use_bw) { /* already using long-term rate, lt_bw? */
+ if (bbr->mode == BBR_PROBE_BW && bbr->round_start &&
+ ++bbr->lt_rtt_cnt >= bbr_lt_bw_max_rtts) {
+ bbr_reset_lt_bw_sampling(sk); /* stop using lt_bw */
+ bbr_reset_probe_bw_mode(sk); /* restart gain cycling */
+ }
+ return;
+ }
+
+ /* Wait for the first loss before sampling, to let the policer exhaust
+ * its tokens and estimate the steady-state rate allowed by the policer.
+ * Starting samples earlier includes bursts that over-estimate the bw.
+ */
+ if (!bbr->lt_is_sampling) {
+ if (!rs->losses)
+ return;
+ bbr_reset_lt_bw_sampling_interval(sk);
+ bbr->lt_is_sampling = true;
+ }
+
+ /* To avoid underestimates, reset sampling if we run out of data. */
+ if (rs->is_app_limited) {
+ bbr_reset_lt_bw_sampling(sk);
+ return;
+ }
+
+ if (bbr->round_start)
+ bbr->lt_rtt_cnt++; /* count round trips in this interval */
+ if (bbr->lt_rtt_cnt < bbr_lt_intvl_min_rtts)
+ return; /* sampling interval needs to be longer */
+ if (bbr->lt_rtt_cnt > 4 * bbr_lt_intvl_min_rtts) {
+ bbr_reset_lt_bw_sampling(sk); /* interval is too long */
+ return;
+ }
+
+ /* End sampling interval when a packet is lost, so we estimate the
+ * policer tokens were exhausted. Stopping the sampling before the
+ * tokens are exhausted under-estimates the policed rate.
+ */
+ if (!rs->losses)
+ return;
+
+ /* Calculate packets lost and delivered in sampling interval. */
+ lost = tp->lost - bbr->lt_last_lost;
+ delivered = tp->delivered - bbr->lt_last_delivered;
+ /* Is loss rate (lost/delivered) >= lt_loss_thresh? If not, wait. */
+ if (!delivered || (lost << BBR_SCALE) < bbr_lt_loss_thresh * delivered)
+ return;
+
+ /* Find average delivery rate in this sampling interval. */
+ t = (s32)(tp->delivered_mstamp.stamp_jiffies - bbr->lt_last_stamp);
+ if (t < 1)
+ return; /* interval is less than one jiffy, so wait */
+ t = jiffies_to_usecs(t);
+ /* Interval long enough for jiffies_to_usecs() to return a bogus 0? */
+ if (t < 1) {
+ bbr_reset_lt_bw_sampling(sk); /* interval too long; reset */
+ return;
+ }
+ bw = (u64)delivered * BW_UNIT;
+ do_div(bw, t);
+ bbr_lt_bw_interval_done(sk, bw);
+}
+
+/* Estimate the bandwidth based on how fast packets are delivered */
+static void bbr_update_bw(struct sock *sk, const struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u64 bw;
+
+ bbr->round_start = 0;
+ if (rs->delivered < 0 || rs->interval_us <= 0)
+ return; /* Not a valid observation */
+
+ /* See if we've reached the next RTT */
+ if (!before(rs->prior_delivered, bbr->next_rtt_delivered)) {
+ bbr->next_rtt_delivered = tp->delivered;
+ bbr->rtt_cnt++;
+ bbr->round_start = 1;
+ bbr->packet_conservation = 0;
+ }
+
+ bbr_lt_bw_sampling(sk, rs);
+
+ /* Divide delivered by the interval to find a (lower bound) bottleneck
+ * bandwidth sample. Delivered is in packets and interval_us in uS and
+ * ratio will be <<1 for most connections. So delivered is first scaled.
+ */
+ bw = (u64)rs->delivered * BW_UNIT;
+ do_div(bw, rs->interval_us);
+
+ /* If this sample is application-limited, it is likely to have a very
+ * low delivered count that represents application behavior rather than
+ * the available network rate. Such a sample could drag down estimated
+ * bw, causing needless slow-down. Thus, to continue to send at the
+ * last measured network rate, we filter out app-limited samples unless
+ * they describe the path bw at least as well as our bw model.
+ *
+ * So the goal during app-limited phase is to proceed with the best
+ * network rate no matter how long. We automatically leave this
+ * phase when app writes faster than the network can deliver :)
+ */
+ if (!rs->is_app_limited || bw >= bbr_max_bw(sk)) {
+ /* Incorporate new sample into our max bw filter. */
+ minmax_running_max(&bbr->bw, bbr_bw_rtts, bbr->rtt_cnt, bw);
+ }
+}
+
+/* Estimate when the pipe is full, using the change in delivery rate: BBR
+ * estimates that STARTUP filled the pipe if the estimated bw hasn't changed by
+ * at least bbr_full_bw_thresh (25%) after bbr_full_bw_cnt (3) non-app-limited
+ * rounds. Why 3 rounds: 1: rwin autotuning grows the rwin, 2: we fill the
+ * higher rwin, 3: we get higher delivery rate samples. Or transient
+ * cross-traffic or radio noise can go away. CUBIC Hystart shares a similar
+ * design goal, but uses delay and inter-ACK spacing instead of bandwidth.
+ */
+static void bbr_check_full_bw_reached(struct sock *sk,
+ const struct rate_sample *rs)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 bw_thresh;
+
+ if (bbr_full_bw_reached(sk) || !bbr->round_start || rs->is_app_limited)
+ return;
+
+ bw_thresh = (u64)bbr->full_bw * bbr_full_bw_thresh >> BBR_SCALE;
+ if (bbr_max_bw(sk) >= bw_thresh) {
+ bbr->full_bw = bbr_max_bw(sk);
+ bbr->full_bw_cnt = 0;
+ return;
+ }
+ ++bbr->full_bw_cnt;
+}
+
+/* If pipe is probably full, drain the queue and then enter steady-state. */
+static void bbr_check_drain(struct sock *sk, const struct rate_sample *rs)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if (bbr->mode == BBR_STARTUP && bbr_full_bw_reached(sk)) {
+ bbr->mode = BBR_DRAIN; /* drain queue we created */
+ bbr->pacing_gain = bbr_drain_gain; /* pace slow to drain */
+ bbr->cwnd_gain = bbr_high_gain; /* maintain cwnd */
+ } /* fall through to check if in-flight is already small: */
+ if (bbr->mode == BBR_DRAIN &&
+ tcp_packets_in_flight(tcp_sk(sk)) <=
+ bbr_target_cwnd(sk, bbr_max_bw(sk), BBR_UNIT))
+ bbr_reset_probe_bw_mode(sk); /* we estimate queue is drained */
+}
+
+/* The goal of PROBE_RTT mode is to have BBR flows cooperatively and
+ * periodically drain the bottleneck queue, to converge to measure the true
+ * min_rtt (unloaded propagation delay). This allows the flows to keep queues
+ * small (reducing queuing delay and packet loss) and achieve fairness among
+ * BBR flows.
+ *
+ * The min_rtt filter window is 10 seconds. When the min_rtt estimate expires,
+ * we enter PROBE_RTT mode and cap the cwnd at bbr_cwnd_min_target=4 packets.
+ * After at least bbr_probe_rtt_mode_ms=200ms and at least one packet-timed
+ * round trip elapsed with that flight size <= 4, we leave PROBE_RTT mode and
+ * re-enter the previous mode. BBR uses 200ms to approximately bound the
+ * performance penalty of PROBE_RTT's cwnd capping to roughly 2% (200ms/10s).
+ *
+ * Note that flows need only pay 2% if they are busy sending over the last 10
+ * seconds. Interactive applications (e.g., Web, RPCs, video chunks) often have
+ * natural silences or low-rate periods within 10 seconds where the rate is low
+ * enough for long enough to drain its queue in the bottleneck. We pick up
+ * these min RTT measurements opportunistically with our min_rtt filter. :-)
+ */
+static void bbr_update_min_rtt(struct sock *sk, const struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ bool filter_expired;
+
+ /* Track min RTT seen in the min_rtt_win_sec filter window: */
+ filter_expired = after(tcp_time_stamp,
+ bbr->min_rtt_stamp + bbr_min_rtt_win_sec * HZ);
+ if (rs->rtt_us >= 0 &&
+ (rs->rtt_us <= bbr->min_rtt_us || filter_expired)) {
+ bbr->min_rtt_us = rs->rtt_us;
+ bbr->min_rtt_stamp = tcp_time_stamp;
+ }
+
+ if (bbr_probe_rtt_mode_ms > 0 && filter_expired &&
+ !bbr->idle_restart && bbr->mode != BBR_PROBE_RTT) {
+ bbr->mode = BBR_PROBE_RTT; /* dip, drain queue */
+ bbr->pacing_gain = BBR_UNIT;
+ bbr->cwnd_gain = BBR_UNIT;
+ bbr_save_cwnd(sk); /* note cwnd so we can restore it */
+ bbr->probe_rtt_done_stamp = 0;
+ }
+
+ if (bbr->mode == BBR_PROBE_RTT) {
+ /* Ignore low rate samples during this mode. */
+ tp->app_limited =
+ (tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
+ /* Maintain min packets in flight for max(200 ms, 1 round). */
+ if (!bbr->probe_rtt_done_stamp &&
+ tcp_packets_in_flight(tp) <= bbr_cwnd_min_target) {
+ bbr->probe_rtt_done_stamp = tcp_time_stamp +
+ msecs_to_jiffies(bbr_probe_rtt_mode_ms);
+ bbr->probe_rtt_round_done = 0;
+ bbr->next_rtt_delivered = tp->delivered;
+ } else if (bbr->probe_rtt_done_stamp) {
+ if (bbr->round_start)
+ bbr->probe_rtt_round_done = 1;
+ if (bbr->probe_rtt_round_done &&
+ after(tcp_time_stamp, bbr->probe_rtt_done_stamp)) {
+ bbr->min_rtt_stamp = tcp_time_stamp;
+ bbr->restore_cwnd = 1; /* snap to prior_cwnd */
+ bbr_reset_mode(sk);
+ }
+ }
+ }
+ bbr->idle_restart = 0;
+}
+
+static void bbr_update_model(struct sock *sk, const struct rate_sample *rs)
+{
+ bbr_update_bw(sk, rs);
+ bbr_update_cycle_phase(sk, rs);
+ bbr_check_full_bw_reached(sk, rs);
+ bbr_check_drain(sk, rs);
+ bbr_update_min_rtt(sk, rs);
+}
+
+static void bbr_main(struct sock *sk, const struct rate_sample *rs)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+ u32 bw;
+
+ bbr_update_model(sk, rs);
+
+ bw = bbr_bw(sk);
+ bbr_set_pacing_rate(sk, bw, bbr->pacing_gain);
+ bbr_set_tso_segs_goal(sk);
+ bbr_set_cwnd(sk, rs, rs->acked_sacked, bw, bbr->cwnd_gain);
+}
+
+static void bbr_init(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u64 bw;
+
+ bbr->prior_cwnd = 0;
+ bbr->tso_segs_goal = 0; /* default segs per skb until first ACK */
+ bbr->rtt_cnt = 0;
+ bbr->next_rtt_delivered = 0;
+ bbr->prev_ca_state = TCP_CA_Open;
+ bbr->packet_conservation = 0;
+
+ bbr->probe_rtt_done_stamp = 0;
+ bbr->probe_rtt_round_done = 0;
+ bbr->min_rtt_us = tcp_min_rtt(tp);
+ bbr->min_rtt_stamp = tcp_time_stamp;
+
+ minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */
+
+ /* Initialize pacing rate to: high_gain * init_cwnd / RTT. */
+ bw = (u64)tp->snd_cwnd * BW_UNIT;
+ do_div(bw, (tp->srtt_us >> 3) ? : USEC_PER_MSEC);
+ sk->sk_pacing_rate = 0; /* force an update of sk_pacing_rate */
+ bbr_set_pacing_rate(sk, bw, bbr_high_gain);
+
+ bbr->restore_cwnd = 0;
+ bbr->round_start = 0;
+ bbr->idle_restart = 0;
+ bbr->full_bw = 0;
+ bbr->full_bw_cnt = 0;
+ bbr->cycle_mstamp.v64 = 0;
+ bbr->cycle_idx = 0;
+ bbr_reset_lt_bw_sampling(sk);
+ bbr_reset_startup_mode(sk);
+}
+
+static u32 bbr_sndbuf_expand(struct sock *sk)
+{
+ /* Provision 3 * cwnd since BBR may slow-start even during recovery. */
+ return 3;
+}
+
+/* In theory BBR does not need to undo the cwnd since it does not
+ * always reduce cwnd on losses (see bbr_main()). Keep it for now.
+ */
+static u32 bbr_undo_cwnd(struct sock *sk)
+{
+ return tcp_sk(sk)->snd_cwnd;
+}
+
+/* Entering loss recovery, so save cwnd for when we exit or undo recovery. */
+static u32 bbr_ssthresh(struct sock *sk)
+{
+ bbr_save_cwnd(sk);
+ return TCP_INFINITE_SSTHRESH; /* BBR does not use ssthresh */
+}
+
+static size_t bbr_get_info(struct sock *sk, u32 ext, int *attr,
+ union tcp_cc_info *info)
+{
+ if (ext & (1 << (INET_DIAG_BBRINFO - 1)) ||
+ ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u64 bw = bbr_bw(sk);
+
+ bw = bw * tp->mss_cache * USEC_PER_SEC >> BW_SCALE;
+ memset(&info->bbr, 0, sizeof(info->bbr));
+ info->bbr.bbr_bw_lo = (u32)bw;
+ info->bbr.bbr_bw_hi = (u32)(bw >> 32);
+ info->bbr.bbr_min_rtt = bbr->min_rtt_us;
+ info->bbr.bbr_pacing_gain = bbr->pacing_gain;
+ info->bbr.bbr_cwnd_gain = bbr->cwnd_gain;
+ *attr = INET_DIAG_BBRINFO;
+ return sizeof(info->bbr);
+ }
+ return 0;
+}
+
+static void bbr_set_state(struct sock *sk, u8 new_state)
+{
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if (new_state == TCP_CA_Loss) {
+ struct rate_sample rs = { .losses = 1 };
+
+ bbr->prev_ca_state = TCP_CA_Loss;
+ bbr->full_bw = 0;
+ bbr->round_start = 1; /* treat RTO like end of a round */
+ bbr_lt_bw_sampling(sk, &rs);
+ }
+}
+
+static struct tcp_congestion_ops tcp_bbr_cong_ops __read_mostly = {
+ .flags = TCP_CONG_NON_RESTRICTED,
+ .name = "bbr",
+ .owner = THIS_MODULE,
+ .init = bbr_init,
+ .cong_control = bbr_main,
+ .sndbuf_expand = bbr_sndbuf_expand,
+ .undo_cwnd = bbr_undo_cwnd,
+ .cwnd_event = bbr_cwnd_event,
+ .ssthresh = bbr_ssthresh,
+ .tso_segs_goal = bbr_tso_segs_goal,
+ .get_info = bbr_get_info,
+ .set_state = bbr_set_state,
+};
+
+static int __init bbr_register(void)
+{
+ BUILD_BUG_ON(sizeof(struct bbr) > ICSK_CA_PRIV_SIZE);
+ return tcp_register_congestion_control(&tcp_bbr_cong_ops);
+}
+
+static void __exit bbr_unregister(void)
+{
+ tcp_unregister_congestion_control(&tcp_bbr_cong_ops);
+}
+
+module_init(bbr_register);
+module_exit(bbr_unregister);
+
+MODULE_AUTHOR("Van Jacobson <vanj@google.com>");
+MODULE_AUTHOR("Neal Cardwell <ncardwell@google.com>");
+MODULE_AUTHOR("Yuchung Cheng <ycheng@google.com>");
+MODULE_AUTHOR("Soheil Hassas Yeganeh <soheil@google.com>");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("TCP BBR (Bottleneck Bandwidth and RTT)");
diff --git a/net/ipv4/tcp_cdg.c b/net/ipv4/tcp_cdg.c
index 03725b2..35b2803 100644
--- a/net/ipv4/tcp_cdg.c
+++ b/net/ipv4/tcp_cdg.c
@@ -56,7 +56,7 @@
module_param(use_tolerance, bool, 0644);
MODULE_PARM_DESC(use_tolerance, "use loss tolerance heuristic");
-struct minmax {
+struct cdg_minmax {
union {
struct {
s32 min;
@@ -74,10 +74,10 @@
};
struct cdg {
- struct minmax rtt;
- struct minmax rtt_prev;
- struct minmax *gradients;
- struct minmax gsum;
+ struct cdg_minmax rtt;
+ struct cdg_minmax rtt_prev;
+ struct cdg_minmax *gradients;
+ struct cdg_minmax gsum;
bool gfilled;
u8 tail;
u8 state;
@@ -353,7 +353,7 @@
{
struct cdg *ca = inet_csk_ca(sk);
struct tcp_sock *tp = tcp_sk(sk);
- struct minmax *gradients;
+ struct cdg_minmax *gradients;
switch (ev) {
case CA_EVENT_CWND_RESTART:
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 882caa4..1294af4 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -69,7 +69,7 @@
int ret = 0;
/* all algorithms must implement ssthresh and cong_avoid ops */
- if (!ca->ssthresh || !ca->cong_avoid) {
+ if (!ca->ssthresh || !(ca->cong_avoid || ca->cong_control)) {
pr_err("%s does not implement required ops\n", ca->name);
return -EINVAL;
}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index dad3e7e..8c6ad2d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -289,6 +289,7 @@
static void tcp_sndbuf_expand(struct sock *sk)
{
const struct tcp_sock *tp = tcp_sk(sk);
+ const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
int sndmem, per_mss;
u32 nr_segs;
@@ -309,7 +310,8 @@
* Cubic needs 1.7 factor, rounded to 2 to include
* extra cushion (application might react slowly to POLLOUT)
*/
- sndmem = 2 * nr_segs * per_mss;
+ sndmem = ca_ops->sndbuf_expand ? ca_ops->sndbuf_expand(sk) : 2;
+ sndmem *= nr_segs * per_mss;
if (sk->sk_sndbuf < sndmem)
sk->sk_sndbuf = min(sndmem, sysctl_tcp_wmem[2]);
@@ -899,12 +901,29 @@
tp->retransmit_high = TCP_SKB_CB(skb)->end_seq;
}
+/* Sum the number of packets on the wire we have marked as lost.
+ * There are two cases we care about here:
+ * a) Packet hasn't been marked lost (nor retransmitted),
+ * and this is the first loss.
+ * b) Packet has been marked both lost and retransmitted,
+ * and this means we think it was lost again.
+ */
+static void tcp_sum_lost(struct tcp_sock *tp, struct sk_buff *skb)
+{
+ __u8 sacked = TCP_SKB_CB(skb)->sacked;
+
+ if (!(sacked & TCPCB_LOST) ||
+ ((sacked & TCPCB_LOST) && (sacked & TCPCB_SACKED_RETRANS)))
+ tp->lost += tcp_skb_pcount(skb);
+}
+
static void tcp_skb_mark_lost(struct tcp_sock *tp, struct sk_buff *skb)
{
if (!(TCP_SKB_CB(skb)->sacked & (TCPCB_LOST|TCPCB_SACKED_ACKED))) {
tcp_verify_retransmit_hint(tp, skb);
tp->lost_out += tcp_skb_pcount(skb);
+ tcp_sum_lost(tp, skb);
TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
}
}
@@ -913,6 +932,7 @@
{
tcp_verify_retransmit_hint(tp, skb);
+ tcp_sum_lost(tp, skb);
if (!(TCP_SKB_CB(skb)->sacked & (TCPCB_LOST|TCPCB_SACKED_ACKED))) {
tp->lost_out += tcp_skb_pcount(skb);
TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
@@ -1094,6 +1114,7 @@
*/
struct skb_mstamp first_sackt;
struct skb_mstamp last_sackt;
+ struct rate_sample *rate;
int flag;
};
@@ -1261,6 +1282,7 @@
tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked,
start_seq, end_seq, dup_sack, pcount,
&skb->skb_mstamp);
+ tcp_rate_skb_delivered(sk, skb, state->rate);
if (skb == tp->lost_skb_hint)
tp->lost_cnt_hint += pcount;
@@ -1311,6 +1333,9 @@
tcp_advance_highest_sack(sk, skb);
tcp_skb_collapse_tstamp(prev, skb);
+ if (unlikely(TCP_SKB_CB(prev)->tx.delivered_mstamp.v64))
+ TCP_SKB_CB(prev)->tx.delivered_mstamp.v64 = 0;
+
tcp_unlink_write_queue(skb, sk);
sk_wmem_free_skb(sk, skb);
@@ -1540,6 +1565,7 @@
dup_sack,
tcp_skb_pcount(skb),
&skb->skb_mstamp);
+ tcp_rate_skb_delivered(sk, skb, state->rate);
if (!before(TCP_SKB_CB(skb)->seq,
tcp_highest_sack_seq(tp)))
@@ -1622,8 +1648,10 @@
found_dup_sack = tcp_check_dsack(sk, ack_skb, sp_wire,
num_sacks, prior_snd_una);
- if (found_dup_sack)
+ if (found_dup_sack) {
state->flag |= FLAG_DSACKING_ACK;
+ tp->delivered++; /* A spurious retransmission is delivered */
+ }
/* Eliminate too old ACKs, but take into
* account more or less fresh ones, they can
@@ -1890,6 +1918,7 @@
struct sk_buff *skb;
bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
bool is_reneg; /* is receiver reneging on SACKs? */
+ bool mark_lost;
/* Reduce ssthresh if it has not yet been made inside this window. */
if (icsk->icsk_ca_state <= TCP_CA_Disorder ||
@@ -1923,8 +1952,12 @@
if (skb == tcp_send_head(sk))
break;
+ mark_lost = (!(TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) ||
+ is_reneg);
+ if (mark_lost)
+ tcp_sum_lost(tp, skb);
TCP_SKB_CB(skb)->sacked &= (~TCPCB_TAGBITS)|TCPCB_SACKED_ACKED;
- if (!(TCP_SKB_CB(skb)->sacked&TCPCB_SACKED_ACKED) || is_reneg) {
+ if (mark_lost) {
TCP_SKB_CB(skb)->sacked &= ~TCPCB_SACKED_ACKED;
TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
tp->lost_out += tcp_skb_pcount(skb);
@@ -2503,6 +2536,9 @@
{
struct tcp_sock *tp = tcp_sk(sk);
+ if (inet_csk(sk)->icsk_ca_ops->cong_control)
+ return;
+
/* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
(tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
@@ -2879,67 +2915,13 @@
*rexmit = REXMIT_LOST;
}
-/* Kathleen Nichols' algorithm for tracking the minimum value of
- * a data stream over some fixed time interval. (E.g., the minimum
- * RTT over the past five minutes.) It uses constant space and constant
- * time per update yet almost always delivers the same minimum as an
- * implementation that has to keep all the data in the window.
- *
- * The algorithm keeps track of the best, 2nd best & 3rd best min
- * values, maintaining an invariant that the measurement time of the
- * n'th best >= n-1'th best. It also makes sure that the three values
- * are widely separated in the time window since that bounds the worse
- * case error when that data is monotonically increasing over the window.
- *
- * Upon getting a new min, we can forget everything earlier because it
- * has no value - the new min is <= everything else in the window by
- * definition and it's the most recent. So we restart fresh on every new min
- * and overwrites 2nd & 3rd choices. The same property holds for 2nd & 3rd
- * best.
- */
static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us)
{
- const u32 now = tcp_time_stamp, wlen = sysctl_tcp_min_rtt_wlen * HZ;
- struct rtt_meas *m = tcp_sk(sk)->rtt_min;
- struct rtt_meas rttm = {
- .rtt = likely(rtt_us) ? rtt_us : jiffies_to_usecs(1),
- .ts = now,
- };
- u32 elapsed;
+ struct tcp_sock *tp = tcp_sk(sk);
+ u32 wlen = sysctl_tcp_min_rtt_wlen * HZ;
- /* Check if the new measurement updates the 1st, 2nd, or 3rd choices */
- if (unlikely(rttm.rtt <= m[0].rtt))
- m[0] = m[1] = m[2] = rttm;
- else if (rttm.rtt <= m[1].rtt)
- m[1] = m[2] = rttm;
- else if (rttm.rtt <= m[2].rtt)
- m[2] = rttm;
-
- elapsed = now - m[0].ts;
- if (unlikely(elapsed > wlen)) {
- /* Passed entire window without a new min so make 2nd choice
- * the new min & 3rd choice the new 2nd. So forth and so on.
- */
- m[0] = m[1];
- m[1] = m[2];
- m[2] = rttm;
- if (now - m[0].ts > wlen) {
- m[0] = m[1];
- m[1] = rttm;
- if (now - m[0].ts > wlen)
- m[0] = rttm;
- }
- } else if (m[1].ts == m[0].ts && elapsed > wlen / 4) {
- /* Passed a quarter of the window without a new min so
- * take 2nd choice from the 2nd quarter of the window.
- */
- m[2] = m[1] = rttm;
- } else if (m[2].ts == m[1].ts && elapsed > wlen / 2) {
- /* Passed half the window without a new min so take the 3rd
- * choice from the last half of the window.
- */
- m[2] = rttm;
- }
+ minmax_running_min(&tp->rtt_min, wlen, tcp_time_stamp,
+ rtt_us ? : jiffies_to_usecs(1));
}
static inline bool tcp_ack_update_rtt(struct sock *sk, const int flag,
@@ -3102,10 +3084,11 @@
*/
static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
u32 prior_snd_una, int *acked,
- struct tcp_sacktag_state *sack)
+ struct tcp_sacktag_state *sack,
+ struct skb_mstamp *now)
{
const struct inet_connection_sock *icsk = inet_csk(sk);
- struct skb_mstamp first_ackt, last_ackt, now;
+ struct skb_mstamp first_ackt, last_ackt;
struct tcp_sock *tp = tcp_sk(sk);
u32 prior_sacked = tp->sacked_out;
u32 reord = tp->packets_out;
@@ -3137,7 +3120,6 @@
acked_pcount = tcp_tso_acked(sk, skb);
if (!acked_pcount)
break;
-
fully_acked = false;
} else {
/* Speedup tcp_unlink_write_queue() and next loop */
@@ -3173,6 +3155,7 @@
tp->packets_out -= acked_pcount;
pkts_acked += acked_pcount;
+ tcp_rate_skb_delivered(sk, skb, sack->rate);
/* Initial outgoing SYN's get put onto the write_queue
* just like anything else we transmit. It is not
@@ -3205,16 +3188,15 @@
if (skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
flag |= FLAG_SACK_RENEGING;
- skb_mstamp_get(&now);
if (likely(first_ackt.v64) && !(flag & FLAG_RETRANS_DATA_ACKED)) {
- seq_rtt_us = skb_mstamp_us_delta(&now, &first_ackt);
- ca_rtt_us = skb_mstamp_us_delta(&now, &last_ackt);
+ seq_rtt_us = skb_mstamp_us_delta(now, &first_ackt);
+ ca_rtt_us = skb_mstamp_us_delta(now, &last_ackt);
}
if (sack->first_sackt.v64) {
- sack_rtt_us = skb_mstamp_us_delta(&now, &sack->first_sackt);
- ca_rtt_us = skb_mstamp_us_delta(&now, &sack->last_sackt);
+ sack_rtt_us = skb_mstamp_us_delta(now, &sack->first_sackt);
+ ca_rtt_us = skb_mstamp_us_delta(now, &sack->last_sackt);
}
-
+ sack->rate->rtt_us = ca_rtt_us; /* RTT of last (S)ACKed packet, or -1 */
rtt_update = tcp_ack_update_rtt(sk, flag, seq_rtt_us, sack_rtt_us,
ca_rtt_us);
@@ -3242,7 +3224,7 @@
tp->fackets_out -= min(pkts_acked, tp->fackets_out);
} else if (skb && rtt_update && sack_rtt_us >= 0 &&
- sack_rtt_us > skb_mstamp_us_delta(&now, &skb->skb_mstamp)) {
+ sack_rtt_us > skb_mstamp_us_delta(now, &skb->skb_mstamp)) {
/* Do not re-arm RTO if the sack RTT is measured from data sent
* after when the head was last (re)transmitted. Otherwise the
* timeout may continue to extend in loss recovery.
@@ -3333,8 +3315,15 @@
* information. All transmission or retransmission are delayed afterwards.
*/
static void tcp_cong_control(struct sock *sk, u32 ack, u32 acked_sacked,
- int flag)
+ int flag, const struct rate_sample *rs)
{
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+ if (icsk->icsk_ca_ops->cong_control) {
+ icsk->icsk_ca_ops->cong_control(sk, rs);
+ return;
+ }
+
if (tcp_in_cwnd_reduction(sk)) {
/* Reduce cwnd if state mandates */
tcp_cwnd_reduction(sk, acked_sacked, flag);
@@ -3579,17 +3568,21 @@
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_sacktag_state sack_state;
+ struct rate_sample rs = { .prior_delivered = 0 };
u32 prior_snd_una = tp->snd_una;
u32 ack_seq = TCP_SKB_CB(skb)->seq;
u32 ack = TCP_SKB_CB(skb)->ack_seq;
bool is_dupack = false;
u32 prior_fackets;
int prior_packets = tp->packets_out;
- u32 prior_delivered = tp->delivered;
+ u32 delivered = tp->delivered;
+ u32 lost = tp->lost;
int acked = 0; /* Number of packets newly acked */
int rexmit = REXMIT_NONE; /* Flag to (re)transmit to recover losses */
+ struct skb_mstamp now;
sack_state.first_sackt.v64 = 0;
+ sack_state.rate = &rs;
/* We very likely will need to access write queue head. */
prefetchw(sk->sk_write_queue.next);
@@ -3612,6 +3605,8 @@
if (after(ack, tp->snd_nxt))
goto invalid_ack;
+ skb_mstamp_get(&now);
+
if (icsk->icsk_pending == ICSK_TIME_EARLY_RETRANS ||
icsk->icsk_pending == ICSK_TIME_LOSS_PROBE)
tcp_rearm_rto(sk);
@@ -3622,6 +3617,7 @@
}
prior_fackets = tp->fackets_out;
+ rs.prior_in_flight = tcp_packets_in_flight(tp);
/* ts_recent update must be made after we are sure that the packet
* is in window.
@@ -3677,7 +3673,7 @@
/* See if we can take anything off of the retransmit queue. */
flag |= tcp_clean_rtx_queue(sk, prior_fackets, prior_snd_una, &acked,
- &sack_state);
+ &sack_state, &now);
if (tcp_ack_is_dubious(sk, flag)) {
is_dupack = !(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP));
@@ -3694,7 +3690,10 @@
if (icsk->icsk_pending == ICSK_TIME_RETRANS)
tcp_schedule_loss_probe(sk);
- tcp_cong_control(sk, ack, tp->delivered - prior_delivered, flag);
+ delivered = tp->delivered - delivered; /* freshly ACKed or SACKed */
+ lost = tp->lost - lost; /* freshly marked lost */
+ tcp_rate_gen(sk, delivered, lost, &now, &rs);
+ tcp_cong_control(sk, ack, delivered, flag, &rs);
tcp_xmit_recovery(sk, rexmit);
return 1;
@@ -5951,7 +5950,7 @@
* so release it.
*/
if (req) {
- tp->total_retrans = req->num_retrans;
+ inet_csk(sk)->icsk_retransmits = 0;
reqsk_fastopen_remove(sk, req, false);
} else {
/* Make sure socket is routed, for correct metrics. */
@@ -5993,7 +5992,8 @@
} else
tcp_init_metrics(sk);
- tcp_update_pacing_rate(sk);
+ if (!inet_csk(sk)->icsk_ca_ops->cong_control)
+ tcp_update_pacing_rate(sk);
/* Prevent spurious tcp_cwnd_restart() on first data packet */
tp->lsndtime = tcp_time_stamp;
@@ -6326,6 +6326,7 @@
tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
tcp_openreq_init(req, &tmp_opt, skb, sk);
+ inet_rsk(req)->no_srccheck = inet_sk(sk)->transparent;
/* Note: tcp_v6_init_req() might override ir_iif for link locals */
inet_rsk(req)->ir_iif = inet_request_bound_dev_if(sk, skb);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 04b9893..7ac37c3 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1196,7 +1196,6 @@
sk_rcv_saddr_set(req_to_sk(req), ip_hdr(skb)->daddr);
sk_daddr_set(req_to_sk(req), ip_hdr(skb)->saddr);
- ireq->no_srccheck = inet_sk(sk_listener)->transparent;
ireq->opt = tcp_v4_save_options(skb);
}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index f63c73d..6234eba 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -464,7 +464,7 @@
newtp->srtt_us = 0;
newtp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
- newtp->rtt_min[0].rtt = ~0U;
+ minmax_reset(&newtp->rtt_min, tcp_time_stamp, ~0U);
newicsk->icsk_rto = TCP_TIMEOUT_INIT;
newtp->packets_out = 0;
@@ -487,6 +487,9 @@
newtp->snd_cwnd = TCP_INIT_CWND;
newtp->snd_cwnd_cnt = 0;
+ /* There's a bubble in the pipe until at least the first ACK. */
+ newtp->app_limited = ~0U;
+
tcp_init_xmit_timers(newsk);
newtp->write_seq = newtp->pushed_seq = treq->snt_isn + 1;
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 5c59649..bc68da3 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -90,12 +90,6 @@
goto out;
}
- /* GSO partial only requires splitting the frame into an MSS
- * multiple and possibly a remainder. So update the mss now.
- */
- if (features & NETIF_F_GSO_PARTIAL)
- mss = skb->len - (skb->len % mss);
-
copy_destructor = gso_skb->destructor == tcp_wfree;
ooo_okay = gso_skb->ooo_okay;
/* All segments but the first should have ooo_okay cleared */
@@ -108,6 +102,13 @@
/* Only first segment might have ooo_okay set */
segs->ooo_okay = ooo_okay;
+ /* GSO partial and frag_list segmentation only requires splitting
+ * the frame into an MSS multiple and possibly a remainder, both
+ * cases return a GSO skb. So update the mss now.
+ */
+ if (skb_is_gso(segs))
+ mss *= skb_shinfo(segs)->gso_segs;
+
delta = htonl(oldlen + (thlen + mss));
skb = segs;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8b45794..7c777089 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -734,9 +734,16 @@
{
if ((1 << sk->sk_state) &
(TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_CLOSING |
- TCPF_CLOSE_WAIT | TCPF_LAST_ACK))
- tcp_write_xmit(sk, tcp_current_mss(sk), tcp_sk(sk)->nonagle,
+ TCPF_CLOSE_WAIT | TCPF_LAST_ACK)) {
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (tp->lost_out > tp->retrans_out &&
+ tp->snd_cwnd > tcp_packets_in_flight(tp))
+ tcp_xmit_retransmit_queue(sk);
+
+ tcp_write_xmit(sk, tcp_current_mss(sk), tp->nonagle,
0, GFP_ATOMIC);
+ }
}
/*
* One tasklet per cpu tries to send more skbs.
@@ -918,6 +925,7 @@
skb_mstamp_get(&skb->skb_mstamp);
TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
- tp->snd_una;
+ tcp_rate_skb_sent(sk, skb);
if (unlikely(skb_cloned(skb)))
skb = pskb_copy(skb, gfp_mask);
@@ -1213,6 +1221,9 @@
tcp_set_skb_tso_segs(skb, mss_now);
tcp_set_skb_tso_segs(buff, mss_now);
+ /* Update delivered info for the new segment */
+ TCP_SKB_CB(buff)->tx = TCP_SKB_CB(skb)->tx;
+
/* If this packet has been sent out already, we must
* adjust the various packet counters.
*/
@@ -1358,6 +1369,7 @@
}
return mtu;
}
+EXPORT_SYMBOL(tcp_mss_to_mtu);
/* MTU probing init per socket */
void tcp_mtup_init(struct sock *sk)
@@ -1545,7 +1557,8 @@
/* Return how many segs we'd like on a TSO packet,
* to send one TSO packet per ms
*/
-static u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now)
+u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
+ int min_tso_segs)
{
u32 bytes, segs;
@@ -1557,10 +1570,23 @@
* This preserves ACK clocking and is consistent
* with tcp_tso_should_defer() heuristic.
*/
- segs = max_t(u32, bytes / mss_now, sysctl_tcp_min_tso_segs);
+ segs = max_t(u32, bytes / mss_now, min_tso_segs);
return min_t(u32, segs, sk->sk_gso_max_segs);
}
+EXPORT_SYMBOL(tcp_tso_autosize);
+
+/* Return the number of segments we want in the skb we are transmitting.
+ * See if congestion control module wants to decide; otherwise, autosize.
+ */
+static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now)
+{
+ const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
+ u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0;
+
+ return tso_segs ? :
+ tcp_tso_autosize(sk, mss_now, sysctl_tcp_min_tso_segs);
+}
/* Returns the portion of skb which can be sent right away */
static unsigned int tcp_mss_split_point(const struct sock *sk,
@@ -2020,6 +2046,39 @@
return -1;
}
+/* TCP Small Queues :
+ * Control number of packets in qdisc/devices to two packets / or ~1 ms.
+ * (These limits are doubled for retransmits)
+ * This allows for :
+ * - better RTT estimation and ACK scheduling
+ * - faster recovery
+ * - high rates
+ * Alas, some drivers / subsystems require a fair amount
+ * of queued bytes to ensure line rate.
+ * One example is wifi aggregation (802.11 AMPDU)
+ */
+static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
+ unsigned int factor)
+{
+ unsigned int limit;
+
+ limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
+ limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
+ limit <<= factor;
+
+ if (atomic_read(&sk->sk_wmem_alloc) > limit) {
+ set_bit(TSQ_THROTTLED, &tcp_sk(sk)->tsq_flags);
+ /* It is possible TX completion already happened
+ * before we set TSQ_THROTTLED, so we must
+ * test again the condition.
+ */
+ smp_mb__after_atomic();
+ if (atomic_read(&sk->sk_wmem_alloc) > limit)
+ return true;
+ }
+ return false;
+}
+
/* This routine writes packets to the network. It advances the
* send_head. This happens as incoming acks open up the remote
* window for us.
@@ -2057,7 +2116,7 @@
}
}
- max_segs = tcp_tso_autosize(sk, mss_now);
+ max_segs = tcp_tso_segs(sk, mss_now);
while ((skb = tcp_send_head(sk))) {
unsigned int limit;
@@ -2106,29 +2165,8 @@
unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
break;
- /* TCP Small Queues :
- * Control number of packets in qdisc/devices to two packets / or ~1 ms.
- * This allows for :
- * - better RTT estimation and ACK scheduling
- * - faster recovery
- * - high rates
- * Alas, some drivers / subsystems require a fair amount
- * of queued bytes to ensure line rate.
- * One example is wifi aggregation (802.11 AMPDU)
- */
- limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
- limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
-
- if (atomic_read(&sk->sk_wmem_alloc) > limit) {
- set_bit(TSQ_THROTTLED, &tp->tsq_flags);
- /* It is possible TX completion already happened
- * before we set TSQ_THROTTLED, so we must
- * test again the condition.
- */
- smp_mb__after_atomic();
- if (atomic_read(&sk->sk_wmem_alloc) > limit)
- break;
- }
+ if (tcp_small_queue_check(sk, skb, 0))
+ break;
if (unlikely(tcp_transmit_skb(sk, skb, 1, gfp)))
break;
@@ -2605,7 +2643,8 @@
* copying overhead: fragmentation, tunneling, mangling etc.
*/
if (atomic_read(&sk->sk_wmem_alloc) >
- min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
+ min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
+ sk->sk_sndbuf))
return -EAGAIN;
if (skb_still_in_host_queue(sk, skb))
@@ -2774,7 +2813,7 @@
last_lost = tp->snd_una;
}
- max_segs = tcp_tso_autosize(sk, tcp_current_mss(sk));
+ max_segs = tcp_tso_segs(sk, tcp_current_mss(sk));
tcp_for_write_queue_from(skb, sk) {
__u8 sacked;
int segs;
@@ -2828,10 +2867,13 @@
if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
continue;
+ if (tcp_small_queue_check(sk, skb, 1))
+ return;
+
if (tcp_retransmit_skb(sk, skb, segs))
return;
- NET_INC_STATS(sock_net(sk), mib_idx);
+ NET_ADD_STATS(sock_net(sk), mib_idx, tcp_skb_pcount(skb));
if (tcp_in_cwnd_reduction(sk))
tp->prr_out += tcp_skb_pcount(skb);
@@ -3568,6 +3610,8 @@
if (!res) {
__TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+ if (unlikely(tcp_passive_fastopen(sk)))
+ tcp_sk(sk)->total_retrans++;
}
return res;
}
diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
new file mode 100644
index 0000000..9be1581
--- /dev/null
+++ b/net/ipv4/tcp_rate.c
@@ -0,0 +1,186 @@
+#include <net/tcp.h>
+
+/* The bandwidth estimator estimates the rate at which the network
+ * can currently deliver outbound data packets for this flow. At a high
+ * level, it operates by taking a delivery rate sample for each ACK.
+ *
+ * A rate sample records the rate at which the network delivered packets
+ * for this flow, calculated over the time interval between the transmission
+ * of a data packet and the acknowledgment of that packet.
+ *
+ * Specifically, over the interval between each transmit and corresponding ACK,
+ * the estimator generates a delivery rate sample. Typically it uses the rate
+ * at which packets were acknowledged. However, the approach of using only the
+ * acknowledgment rate faces a challenge under the prevalent ACK decimation or
+ * compression: packets can temporarily appear to be delivered much quicker
+ * than the bottleneck rate. Since it is physically impossible to do that in a
+ * sustained fashion, when the estimator notices that the ACK rate is faster
+ * than the transmit rate, it uses the latter:
+ *
+ * send_rate = #pkts_delivered/(last_snd_time - first_snd_time)
+ * ack_rate = #pkts_delivered/(last_ack_time - first_ack_time)
+ * bw = min(send_rate, ack_rate)
+ *
+ * Notice the estimator essentially estimates the goodput, not always the
+ * network bottleneck link rate when the sending or receiving is limited by
+ * other factors like applications or receiver window limits. The estimator
+ * deliberately avoids using the inter-packet spacing approach because that
+ * approach requires a large number of samples and sophisticated filtering.
+ *
+ * TCP flows can often be application-limited in request/response workloads.
+ * The estimator marks a bandwidth sample as application-limited if there
+ * was some moment during the sampled window of packets when there was no data
+ * ready to send in the write queue.
+ */
+
+/* Snapshot the current delivery information in the skb, to generate
+ * a rate sample later when the skb is (s)acked in tcp_rate_skb_delivered().
+ */
+void tcp_rate_skb_sent(struct sock *sk, struct sk_buff *skb)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ /* In general we need to start delivery rate samples from the
+ * time we received the most recent ACK, to ensure we include
+ * the full time the network needs to deliver all in-flight
+ * packets. If there are no packets in flight yet, then we
+ * know that any ACKs after now indicate that the network was
+ * able to deliver those packets completely in the sampling
+ * interval between now and the next ACK.
+ *
+ * Note that we use packets_out instead of tcp_packets_in_flight(tp)
+ * because the latter is a guess based on RTO and loss-marking
+ * heuristics. We don't want spurious RTOs or loss markings to cause
+ * a spuriously small time interval, causing a spuriously high
+ * bandwidth estimate.
+ */
+ if (!tp->packets_out) {
+ tp->first_tx_mstamp = skb->skb_mstamp;
+ tp->delivered_mstamp = skb->skb_mstamp;
+ }
+
+ TCP_SKB_CB(skb)->tx.first_tx_mstamp = tp->first_tx_mstamp;
+ TCP_SKB_CB(skb)->tx.delivered_mstamp = tp->delivered_mstamp;
+ TCP_SKB_CB(skb)->tx.delivered = tp->delivered;
+ TCP_SKB_CB(skb)->tx.is_app_limited = tp->app_limited ? 1 : 0;
+}
+
+/* When an skb is sacked or acked, we fill in the rate sample with the (prior)
+ * delivery information when the skb was last transmitted.
+ *
+ * If an ACK (s)acks multiple skbs (e.g., stretched-acks), this function is
+ * called multiple times. We favor the information from the most recently
+ * sent skb, i.e., the skb with the highest prior_delivered count.
+ */
+void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
+ struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct tcp_skb_cb *scb = TCP_SKB_CB(skb);
+
+ if (!scb->tx.delivered_mstamp.v64)
+ return;
+
+ if (!rs->prior_delivered ||
+ after(scb->tx.delivered, rs->prior_delivered)) {
+ rs->prior_delivered = scb->tx.delivered;
+ rs->prior_mstamp = scb->tx.delivered_mstamp;
+ rs->is_app_limited = scb->tx.is_app_limited;
+ rs->is_retrans = scb->sacked & TCPCB_RETRANS;
+
+ /* Find the duration of the "send phase" of this window: */
+ rs->interval_us = skb_mstamp_us_delta(
+ &skb->skb_mstamp,
+ &scb->tx.first_tx_mstamp);
+
+ /* Record send time of most recently ACKed packet: */
+ tp->first_tx_mstamp = skb->skb_mstamp;
+ }
+ /* Mark off the skb delivered once it's sacked to avoid being
+ * used again when it's cumulatively acked. For acked packets
+ * we don't need to reset since it'll be freed soon.
+ */
+ if (scb->sacked & TCPCB_SACKED_ACKED)
+ scb->tx.delivered_mstamp.v64 = 0;
+}
+
+/* Update the connection delivery information and generate a rate sample. */
+void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
+ struct skb_mstamp *now, struct rate_sample *rs)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ u32 snd_us, ack_us;
+
+ /* Clear app limited if bubble is acked and gone. */
+ if (tp->app_limited && after(tp->delivered, tp->app_limited))
+ tp->app_limited = 0;
+
+ /* TODO: there are multiple places throughout tcp_ack() to get
+ * current time. Refactor the code using a new "tcp_acktag_state"
+ * to carry current time, flags, stats like "tcp_sacktag_state".
+ */
+ if (delivered)
+ tp->delivered_mstamp = *now;
+
+ rs->acked_sacked = delivered; /* freshly ACKed or SACKed */
+ rs->losses = lost; /* freshly marked lost */
+ /* Return an invalid sample if no timing information is available. */
+ if (!rs->prior_mstamp.v64) {
+ rs->delivered = -1;
+ rs->interval_us = -1;
+ return;
+ }
+ rs->delivered = tp->delivered - rs->prior_delivered;
+
+ /* Model sending data and receiving ACKs as separate pipeline phases
+ * for a window. Usually the ACK phase is longer, but with ACK
+ * compression the send phase can be longer. To be safe we use the
+ * longer phase.
+ */
+ snd_us = rs->interval_us; /* send phase */
+ ack_us = skb_mstamp_us_delta(now, &rs->prior_mstamp); /* ack phase */
+ rs->interval_us = max(snd_us, ack_us);
+
+ /* Normally we expect interval_us >= min-rtt.
+ * Note that rate may still be over-estimated when a spuriously
+ * retransmistted skb was first (s)acked because "interval_us"
+ * is under-estimated (up to an RTT). However continuously
+ * measuring the delivery rate during loss recovery is crucial
+ * for connections suffer heavy or prolonged losses.
+ */
+ if (unlikely(rs->interval_us < tcp_min_rtt(tp))) {
+ if (!rs->is_retrans)
+ pr_debug("tcp rate: %ld %d %u %u %u\n",
+ rs->interval_us, rs->delivered,
+ inet_csk(sk)->icsk_ca_state,
+ tp->rx_opt.sack_ok, tcp_min_rtt(tp));
+ rs->interval_us = -1;
+ return;
+ }
+
+ /* Record the last non-app-limited or the highest app-limited bw */
+ if (!rs->is_app_limited ||
+ ((u64)rs->delivered * tp->rate_interval_us >=
+ (u64)tp->rate_delivered * rs->interval_us)) {
+ tp->rate_delivered = rs->delivered;
+ tp->rate_interval_us = rs->interval_us;
+ tp->rate_app_limited = rs->is_app_limited;
+ }
+}
+
+/* If a gap is detected between sends, mark the socket application-limited. */
+void tcp_rate_check_app_limited(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (/* We have less than one packet to send. */
+ tp->write_seq - tp->snd_nxt < tp->mss_cache &&
+ /* Nothing in sending host's qdisc queues or NIC tx queue. */
+ sk_wmem_alloc_get(sk) < SKB_TRUESIZE(1) &&
+ /* We are not limited by CWND. */
+ tcp_packets_in_flight(tp) < tp->snd_cwnd &&
+ /* All lost packets have been retransmitted. */
+ tp->lost_out <= tp->retrans_out)
+ tp->app_limited =
+ (tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
+}
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index d84930b..3ea1cf8 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -192,6 +192,8 @@
if (tp->syn_data && icsk->icsk_retransmits == 1)
NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPFASTOPENACTIVEFAIL);
+ } else if (!tp->syn_data && !tp->syn_fastopen) {
+ sk_rethink_txhash(sk);
}
retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
syn_set = true;
@@ -213,6 +215,8 @@
tcp_mtu_probing(icsk, sk);
dst_negative_advice(sk);
+ } else {
+ sk_rethink_txhash(sk);
}
retry_until = net->ipv4.sysctl_tcp_retries2;
@@ -384,6 +388,7 @@
*/
inet_rtx_syn_ack(sk, req);
req->num_timeout++;
+ icsk->icsk_retransmits++;
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
TCP_TIMEOUT_INIT << req->num_timeout, TCP_RTO_MAX);
}
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 81f253b..f9333c9 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -21,7 +21,7 @@
__be16 new_protocol, bool is_ipv6)
{
int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
- bool remcsum, need_csum, offload_csum, ufo;
+ bool remcsum, need_csum, offload_csum, ufo, gso_partial;
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct udphdr *uh = udp_hdr(skb);
u16 mac_offset = skb->mac_header;
@@ -88,6 +88,8 @@
goto out;
}
+ gso_partial = !!(skb_shinfo(segs)->gso_type & SKB_GSO_PARTIAL);
+
outer_hlen = skb_tnl_header_len(skb);
udp_offset = outer_hlen - tnl_hlen;
skb = segs;
@@ -117,7 +119,7 @@
* will be using a length value equal to only one MSS sized
* segment instead of the entire frame.
*/
- if (skb_is_gso(skb)) {
+ if (gso_partial) {
uh->len = htons(skb_shinfo(skb)->gso_size +
SKB_GSO_CB(skb)->data_offset +
skb->head - (unsigned char *)uh);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 397e1ed..4ce74f8 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1239,7 +1239,7 @@
parms->encap_limit = nla_get_u8(data[IFLA_GRE_ENCAP_LIMIT]);
if (data[IFLA_GRE_FLOWINFO])
- parms->flowinfo = nla_get_u32(data[IFLA_GRE_FLOWINFO]);
+ parms->flowinfo = nla_get_be32(data[IFLA_GRE_FLOWINFO]);
if (data[IFLA_GRE_FLAGS])
parms->flags = nla_get_u32(data[IFLA_GRE_FLAGS]);
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 22e90e5..e7bfd55 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -69,6 +69,7 @@
int offset = 0;
bool encap, udpfrag;
int nhoff;
+ bool gso_partial;
skb_reset_network_header(skb);
nhoff = skb_network_header(skb) - skb_mac_header(skb);
@@ -101,9 +102,11 @@
if (IS_ERR(segs))
goto out;
+ gso_partial = !!(skb_shinfo(segs)->gso_type & SKB_GSO_PARTIAL);
+
for (skb = segs; skb; skb = skb->next) {
ipv6h = (struct ipv6hdr *)(skb_mac_header(skb) + nhoff);
- if (skb_is_gso(skb))
+ if (gso_partial)
payload_len = skb_shinfo(skb)->gso_size +
SKB_GSO_CB(skb)->data_offset +
skb->head - (unsigned char *)(ipv6h + 1);
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index cc7e058..8a02ca8 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -321,11 +321,9 @@
goto discard;
}
- XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = t;
-
rcu_read_unlock();
- return xfrm6_rcv(skb);
+ return xfrm6_rcv_tnl(skb, t);
}
rcu_read_unlock();
return -EINVAL;
@@ -340,6 +338,7 @@
struct net_device *dev;
struct pcpu_sw_netstats *tstats;
struct xfrm_state *x;
+ struct xfrm_mode *inner_mode;
struct ip6_tnl *t = XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6;
u32 orig_mark = skb->mark;
int ret;
@@ -357,7 +356,19 @@
}
x = xfrm_input_state(skb);
- family = x->inner_mode->afinfo->family;
+
+ inner_mode = x->inner_mode;
+
+ if (x->sel.family == AF_UNSPEC) {
+ inner_mode = xfrm_ip2inner_mode(x, XFRM_MODE_SKB_CB(skb)->protocol);
+ if (inner_mode == NULL) {
+ XFRM_INC_STATS(dev_net(skb->dev),
+ LINUX_MIB_XFRMINSTATEMODEERROR);
+ return -EINVAL;
+ }
+ }
+
+ family = inner_mode->afinfo->family;
skb->mark = be32_to_cpu(t->parms.i_key);
ret = xfrm_policy_check(NULL, XFRM_POLICY_IN, skb, family);
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 6122f9c..fccb5dd 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -2239,6 +2239,7 @@
struct rta_mfc_stats mfcs;
struct nlattr *mp_attr;
struct rtnexthop *nhp;
+ unsigned long lastuse;
int ct;
/* If cache is unresolved, don't try to parse IIF and OIF */
@@ -2269,12 +2270,14 @@
nla_nest_end(skb, mp_attr);
+ lastuse = READ_ONCE(c->mfc_un.res.lastuse);
+ lastuse = time_after_eq(jiffies, lastuse) ? jiffies - lastuse : 0;
+
mfcs.mfcs_packets = c->mfc_un.res.pkt;
mfcs.mfcs_bytes = c->mfc_un.res.bytes;
mfcs.mfcs_wrong_if = c->mfc_un.res.wrong_if;
if (nla_put_64bit(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs, RTA_PAD) ||
- nla_put_u64_64bit(skb, RTA_EXPIRES,
- jiffies_to_clock_t(c->mfc_un.res.lastuse),
+ nla_put_u64_64bit(skb, RTA_EXPIRES, jiffies_to_clock_t(lastuse),
RTA_PAD))
return -EMSGSIZE;
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 552fac2..55aacea 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -190,7 +190,7 @@
.u = {
.log = {
.level = LOGLEVEL_WARNING,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index 1aa5848..963ee38 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -115,7 +115,7 @@
help = nfct_help(ct);
if (!help)
return NF_ACCEPT;
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index 660bc10..f5a61bc 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -165,7 +165,7 @@
return -NF_ACCEPT;
}
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
inproto = __nf_ct_l4proto_find(PF_INET6, origtuple.dst.protonum);
/* Ordinarily, we'd expect the inverted tupleproto, but it's
diff --git a/net/ipv6/netfilter/nf_log_ipv6.c b/net/ipv6/netfilter/nf_log_ipv6.c
index c1bcf69..57d8606 100644
--- a/net/ipv6/netfilter/nf_log_ipv6.c
+++ b/net/ipv6/netfilter/nf_log_ipv6.c
@@ -30,7 +30,7 @@
.u = {
.log = {
.level = LOGLEVEL_NOTICE,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
@@ -52,7 +52,7 @@
if (info->type == NF_LOG_TYPE_LOG)
logflags = info->u.log.logflags;
else
- logflags = NF_LOG_MASK;
+ logflags = NF_LOG_DEFAULT_MASK;
ih = skb_header_pointer(skb, ip6hoff, sizeof(_ip6h), &_ip6h);
if (ih == NULL) {
@@ -84,7 +84,7 @@
}
/* Max length: 48 "OPT (...) " */
- if (logflags & XT_LOG_IPOPT)
+ if (logflags & NF_LOG_IPOPT)
nf_log_buf_add(m, "OPT ( ");
switch (currenthdr) {
@@ -121,7 +121,7 @@
case IPPROTO_ROUTING:
case IPPROTO_HOPOPTS:
if (fragment) {
- if (logflags & XT_LOG_IPOPT)
+ if (logflags & NF_LOG_IPOPT)
nf_log_buf_add(m, ")");
return;
}
@@ -129,7 +129,7 @@
break;
/* Max Length */
case IPPROTO_AH:
- if (logflags & XT_LOG_IPOPT) {
+ if (logflags & NF_LOG_IPOPT) {
struct ip_auth_hdr _ahdr;
const struct ip_auth_hdr *ah;
@@ -161,7 +161,7 @@
hdrlen = (hp->hdrlen+2)<<2;
break;
case IPPROTO_ESP:
- if (logflags & XT_LOG_IPOPT) {
+ if (logflags & NF_LOG_IPOPT) {
struct ip_esp_hdr _esph;
const struct ip_esp_hdr *eh;
@@ -194,7 +194,7 @@
nf_log_buf_add(m, "Unknown Ext Hdr %u", currenthdr);
return;
}
- if (logflags & XT_LOG_IPOPT)
+ if (logflags & NF_LOG_IPOPT)
nf_log_buf_add(m, ") ");
currenthdr = hp->nexthdr;
@@ -277,7 +277,7 @@
}
/* Max length: 15 "UID=4294967295 " */
- if ((logflags & XT_LOG_UID) && recurse)
+ if ((logflags & NF_LOG_UID) && recurse)
nf_log_dump_sk_uid_gid(m, skb->sk);
/* Max length: 16 "MARK=0xFFFFFFFF " */
@@ -295,7 +295,7 @@
if (info->type == NF_LOG_TYPE_LOG)
logflags = info->u.log.logflags;
- if (!(logflags & XT_LOG_MACDECODE))
+ if (!(logflags & NF_LOG_MACDECODE))
goto fallback;
switch (dev->type) {
diff --git a/net/ipv6/netfilter/nf_tables_ipv6.c b/net/ipv6/netfilter/nf_tables_ipv6.c
index 30b22f4..d6e4ba5 100644
--- a/net/ipv6/netfilter/nf_tables_ipv6.c
+++ b/net/ipv6/netfilter/nf_tables_ipv6.c
@@ -22,9 +22,7 @@
{
struct nft_pktinfo pkt;
- /* malformed packet, drop it */
- if (nft_set_pktinfo_ipv6(&pkt, skb, state) < 0)
- return NF_DROP;
+ nft_set_pktinfo_ipv6(&pkt, skb, state);
return nft_do_chain(&pkt, priv);
}
@@ -102,7 +100,10 @@
{
int ret;
- nft_register_chain_type(&filter_ipv6);
+ ret = nft_register_chain_type(&filter_ipv6);
+ if (ret < 0)
+ return ret;
+
ret = register_pernet_subsys(&nf_tables_ipv6_net_ops);
if (ret < 0)
nft_unregister_chain_type(&filter_ipv6);
diff --git a/net/ipv6/netfilter/nft_chain_route_ipv6.c b/net/ipv6/netfilter/nft_chain_route_ipv6.c
index 71d995f..f272747 100644
--- a/net/ipv6/netfilter/nft_chain_route_ipv6.c
+++ b/net/ipv6/netfilter/nft_chain_route_ipv6.c
@@ -31,10 +31,9 @@
struct in6_addr saddr, daddr;
u_int8_t hop_limit;
u32 mark, flowlabel;
+ int err;
- /* malformed packet, drop it */
- if (nft_set_pktinfo_ipv6(&pkt, skb, state) < 0)
- return NF_DROP;
+ nft_set_pktinfo_ipv6(&pkt, skb, state);
/* save source/dest address, mark, hoplimit, flowlabel, priority */
memcpy(&saddr, &ipv6_hdr(skb)->saddr, sizeof(saddr));
@@ -46,13 +45,16 @@
flowlabel = *((u32 *)ipv6_hdr(skb));
ret = nft_do_chain(&pkt, priv);
- if (ret != NF_DROP && ret != NF_QUEUE &&
+ if (ret != NF_DROP && ret != NF_STOLEN &&
(memcmp(&ipv6_hdr(skb)->saddr, &saddr, sizeof(saddr)) ||
memcmp(&ipv6_hdr(skb)->daddr, &daddr, sizeof(daddr)) ||
skb->mark != mark ||
ipv6_hdr(skb)->hop_limit != hop_limit ||
- flowlabel != *((u_int32_t *)ipv6_hdr(skb))))
- return ip6_route_me_harder(state->net, skb) == 0 ? ret : NF_DROP;
+ flowlabel != *((u_int32_t *)ipv6_hdr(skb)))) {
+ err = ip6_route_me_harder(state->net, skb);
+ if (err < 0)
+ ret = NF_DROP_ERR(err);
+ }
return ret;
}
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index ad4a7ff..5a5aeb9 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1147,15 +1147,16 @@
return ip6_pol_route(net, table, fl6->flowi6_iif, fl6, flags);
}
-static struct dst_entry *ip6_route_input_lookup(struct net *net,
- struct net_device *dev,
- struct flowi6 *fl6, int flags)
+struct dst_entry *ip6_route_input_lookup(struct net *net,
+ struct net_device *dev,
+ struct flowi6 *fl6, int flags)
{
if (rt6_need_strict(&fl6->daddr) && dev->type != ARPHRD_PIMREG)
flags |= RT6_LOOKUP_F_IFACE;
return fib6_rule_lookup(net, fl6, flags, ip6_pol_route_input);
}
+EXPORT_SYMBOL_GPL(ip6_route_input_lookup);
void ip6_route_input(struct sk_buff *skb)
{
@@ -1991,9 +1992,18 @@
if (!(gwa_type & IPV6_ADDR_UNICAST))
goto out;
- if (cfg->fc_table)
+ if (cfg->fc_table) {
grt = ip6_nh_lookup_table(net, cfg, gw_addr);
+ if (grt) {
+ if (grt->rt6i_flags & RTF_GATEWAY ||
+ (dev && dev != grt->dst.dev)) {
+ ip6_rt_put(grt);
+ grt = NULL;
+ }
+ }
+ }
+
if (!grt)
grt = rt6_lookup(net, gw_addr, NULL,
cfg->fc_ifindex, 1);
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index 00a2d40..b578956 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -21,9 +21,10 @@
return xfrm6_extract_header(skb);
}
-int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi)
+int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi,
+ struct ip6_tnl *t)
{
- XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = NULL;
+ XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 = t;
XFRM_SPI_SKB_CB(skb)->family = AF_INET6;
XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct ipv6hdr, daddr);
return xfrm_input(skb, nexthdr, spi, 0);
@@ -49,13 +50,18 @@
return -1;
}
-int xfrm6_rcv(struct sk_buff *skb)
+int xfrm6_rcv_tnl(struct sk_buff *skb, struct ip6_tnl *t)
{
return xfrm6_rcv_spi(skb, skb_network_header(skb)[IP6CB(skb)->nhoff],
- 0);
+ 0, t);
+}
+EXPORT_SYMBOL(xfrm6_rcv_tnl);
+
+int xfrm6_rcv(struct sk_buff *skb)
+{
+ return xfrm6_rcv_tnl(skb, NULL);
}
EXPORT_SYMBOL(xfrm6_rcv);
-
int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
xfrm_address_t *saddr, u8 proto)
{
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index 5743044..e1c0bbe 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -236,7 +236,7 @@
__be32 spi;
spi = xfrm6_tunnel_spi_lookup(net, (const xfrm_address_t *)&iph->saddr);
- return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi);
+ return xfrm6_rcv_spi(skb, IPPROTO_IPV6, spi, NULL);
}
static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index db63969..391c3cb 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -832,7 +832,7 @@
struct sock *sk = sock->sk;
struct irda_sock *new, *self = irda_sk(sk);
struct sock *newsk;
- struct sk_buff *skb;
+ struct sk_buff *skb = NULL;
int err;
err = irda_create(sock_net(sk), newsock, sk->sk_protocol, 0);
@@ -897,7 +897,6 @@
err = -EPERM; /* value does not seem to make sense. -arnd */
if (!new->tsap) {
pr_debug("%s(), dup failed!\n", __func__);
- kfree_skb(skb);
goto out;
}
@@ -916,7 +915,6 @@
/* Clean up the original one to keep it in listen state */
irttp_listen(self->tsap);
- kfree_skb(skb);
sk->sk_ack_backlog--;
newsock->state = SS_CONNECTED;
@@ -924,6 +922,7 @@
irda_connect_response(new);
err = 0;
out:
+ kfree_skb(skb);
release_sock(sk);
return err;
}
diff --git a/net/mac80211/agg-rx.c b/net/mac80211/agg-rx.c
index a9aff60..f6749dc 100644
--- a/net/mac80211/agg-rx.c
+++ b/net/mac80211/agg-rx.c
@@ -261,10 +261,16 @@
.timeout = timeout,
.ssn = start_seq_num,
};
-
int i, ret = -EOPNOTSUPP;
u16 status = WLAN_STATUS_REQUEST_DECLINED;
+ if (tid >= IEEE80211_FIRST_TSPEC_TSID) {
+ ht_dbg(sta->sdata,
+ "STA %pM requests BA session on unsupported tid %d\n",
+ sta->sta.addr, tid);
+ goto end_no_lock;
+ }
+
if (!sta->sta.ht_cap.ht_supported) {
ht_dbg(sta->sdata,
"STA %pM erroneously requests BA session on tid %d w/o QoS\n",
@@ -298,10 +304,13 @@
buf_size = IEEE80211_MAX_AMPDU_BUF;
/* make sure the size doesn't exceed the maximum supported by the hw */
- if (buf_size > local->hw.max_rx_aggregation_subframes)
- buf_size = local->hw.max_rx_aggregation_subframes;
+ if (buf_size > sta->sta.max_rx_aggregation_subframes)
+ buf_size = sta->sta.max_rx_aggregation_subframes;
params.buf_size = buf_size;
+ ht_dbg(sta->sdata, "AddBA Req buf_size=%d for %pM\n",
+ buf_size, sta->sta.addr);
+
/* examine state machine */
mutex_lock(&sta->ampdu_mlme.mtx);
@@ -406,8 +415,10 @@
}
end:
- if (status == WLAN_STATUS_SUCCESS)
+ if (status == WLAN_STATUS_SUCCESS) {
__set_bit(tid, sta->ampdu_mlme.agg_session_valid);
+ __clear_bit(tid, sta->ampdu_mlme.unexpected_agg);
+ }
mutex_unlock(&sta->ampdu_mlme.mtx);
end_no_lock:
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index 5650c46..45319cc 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -584,6 +584,9 @@
ieee80211_hw_check(&local->hw, TX_AMPDU_SETUP_IN_HW))
return -EINVAL;
+ if (WARN_ON(tid >= IEEE80211_FIRST_TSPEC_TSID))
+ return -EINVAL;
+
ht_dbg(sdata, "Open BA session requested for %pM tid %u\n",
pubsta->addr, tid);
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 543b1d4..e29ff57 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -39,7 +39,7 @@
if (type == NL80211_IFTYPE_MONITOR && flags) {
sdata = IEEE80211_WDEV_TO_SUB_IF(wdev);
- sdata->u.mntr_flags = *flags;
+ sdata->u.mntr.flags = *flags;
}
return wdev;
@@ -73,8 +73,29 @@
sdata->u.mgd.use_4addr = params->use_4addr;
}
- if (sdata->vif.type == NL80211_IFTYPE_MONITOR && flags) {
+ if (sdata->vif.type == NL80211_IFTYPE_MONITOR) {
struct ieee80211_local *local = sdata->local;
+ struct ieee80211_sub_if_data *monitor_sdata;
+ u32 mu_mntr_cap_flag = NL80211_EXT_FEATURE_MU_MIMO_AIR_SNIFFER;
+
+ monitor_sdata = rtnl_dereference(local->monitor_sdata);
+ if (monitor_sdata &&
+ wiphy_ext_feature_isset(wiphy, mu_mntr_cap_flag)) {
+ memcpy(monitor_sdata->vif.bss_conf.mu_group.membership,
+ params->vht_mumimo_groups, WLAN_MEMBERSHIP_LEN);
+ memcpy(monitor_sdata->vif.bss_conf.mu_group.position,
+ params->vht_mumimo_groups + WLAN_MEMBERSHIP_LEN,
+ WLAN_USER_POSITION_LEN);
+ monitor_sdata->vif.mu_mimo_owner = true;
+ ieee80211_bss_info_change_notify(monitor_sdata,
+ BSS_CHANGED_MU_GROUPS);
+
+ ether_addr_copy(monitor_sdata->u.mntr.mu_follow_addr,
+ params->macaddr);
+ }
+
+ if (!flags)
+ return 0;
if (ieee80211_sdata_running(sdata)) {
u32 mask = MONITOR_FLAG_COOK_FRAMES |
@@ -89,11 +110,11 @@
* cooked_mntrs, monitor and all fif_* counters
* reconfigure hardware
*/
- if ((*flags & mask) != (sdata->u.mntr_flags & mask))
+ if ((*flags & mask) != (sdata->u.mntr.flags & mask))
return -EBUSY;
ieee80211_adjust_monitor_flags(sdata, -1);
- sdata->u.mntr_flags = *flags;
+ sdata->u.mntr.flags = *flags;
ieee80211_adjust_monitor_flags(sdata, 1);
ieee80211_configure_filter(local);
@@ -103,7 +124,7 @@
* and ieee80211_do_open take care of "everything"
* mentioned in the comment above.
*/
- sdata->u.mntr_flags = *flags;
+ sdata->u.mntr.flags = *flags;
}
}
@@ -2940,10 +2961,6 @@
}
chanctx = container_of(conf, struct ieee80211_chanctx, conf);
- if (!chanctx) {
- err = -EBUSY;
- goto out;
- }
ch_switch.timestamp = 0;
ch_switch.device_timestamp = 0;
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 2906c10..8ca62b6 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -71,138 +71,39 @@
DEBUGFS_READONLY_FILE(rate_ctrl_alg, "%s",
local->rate_ctrl ? local->rate_ctrl->ops->name : "hw/driver");
-struct aqm_info {
- struct ieee80211_local *local;
- size_t size;
- size_t len;
- unsigned char buf[0];
-};
-
-#define AQM_HDR_LEN 200
-#define AQM_HW_ENTRY_LEN 40
-#define AQM_TXQ_ENTRY_LEN 110
-
-static int aqm_open(struct inode *inode, struct file *file)
-{
- struct ieee80211_local *local = inode->i_private;
- struct ieee80211_sub_if_data *sdata;
- struct sta_info *sta;
- struct txq_info *txqi;
- struct fq *fq = &local->fq;
- struct aqm_info *info = NULL;
- int len = 0;
- int i;
-
- if (!local->ops->wake_tx_queue)
- return -EOPNOTSUPP;
-
- len += AQM_HDR_LEN;
- len += 6 * AQM_HW_ENTRY_LEN;
-
- rcu_read_lock();
- list_for_each_entry_rcu(sdata, &local->interfaces, list)
- len += AQM_TXQ_ENTRY_LEN;
- list_for_each_entry_rcu(sta, &local->sta_list, list)
- len += AQM_TXQ_ENTRY_LEN * ARRAY_SIZE(sta->sta.txq);
- rcu_read_unlock();
-
- info = vmalloc(len);
- if (!info)
- return -ENOMEM;
-
- spin_lock_bh(&local->fq.lock);
- rcu_read_lock();
-
- file->private_data = info;
- info->local = local;
- info->size = len;
- len = 0;
-
- len += scnprintf(info->buf + len, info->size - len,
- "* hw\n"
- "access name value\n"
- "R fq_flows_cnt %u\n"
- "R fq_backlog %u\n"
- "R fq_overlimit %u\n"
- "R fq_collisions %u\n"
- "RW fq_limit %u\n"
- "RW fq_quantum %u\n",
- fq->flows_cnt,
- fq->backlog,
- fq->overlimit,
- fq->collisions,
- fq->limit,
- fq->quantum);
-
- len += scnprintf(info->buf + len,
- info->size - len,
- "* vif\n"
- "ifname addr ac backlog-bytes backlog-packets flows overlimit collisions tx-bytes tx-packets\n");
-
- list_for_each_entry_rcu(sdata, &local->interfaces, list) {
- txqi = to_txq_info(sdata->vif.txq);
- len += scnprintf(info->buf + len, info->size - len,
- "%s %pM %u %u %u %u %u %u %u %u\n",
- sdata->name,
- sdata->vif.addr,
- txqi->txq.ac,
- txqi->tin.backlog_bytes,
- txqi->tin.backlog_packets,
- txqi->tin.flows,
- txqi->tin.overlimit,
- txqi->tin.collisions,
- txqi->tin.tx_bytes,
- txqi->tin.tx_packets);
- }
-
- len += scnprintf(info->buf + len,
- info->size - len,
- "* sta\n"
- "ifname addr tid ac backlog-bytes backlog-packets flows overlimit collisions tx-bytes tx-packets\n");
-
- list_for_each_entry_rcu(sta, &local->sta_list, list) {
- sdata = sta->sdata;
- for (i = 0; i < ARRAY_SIZE(sta->sta.txq); i++) {
- txqi = to_txq_info(sta->sta.txq[i]);
- len += scnprintf(info->buf + len, info->size - len,
- "%s %pM %d %d %u %u %u %u %u %u %u\n",
- sdata->name,
- sta->sta.addr,
- txqi->txq.tid,
- txqi->txq.ac,
- txqi->tin.backlog_bytes,
- txqi->tin.backlog_packets,
- txqi->tin.flows,
- txqi->tin.overlimit,
- txqi->tin.collisions,
- txqi->tin.tx_bytes,
- txqi->tin.tx_packets);
- }
- }
-
- info->len = len;
-
- rcu_read_unlock();
- spin_unlock_bh(&local->fq.lock);
-
- return 0;
-}
-
-static int aqm_release(struct inode *inode, struct file *file)
-{
- vfree(file->private_data);
- return 0;
-}
-
static ssize_t aqm_read(struct file *file,
char __user *user_buf,
size_t count,
loff_t *ppos)
{
- struct aqm_info *info = file->private_data;
+ struct ieee80211_local *local = file->private_data;
+ struct fq *fq = &local->fq;
+ char buf[200];
+ int len = 0;
+
+ spin_lock_bh(&local->fq.lock);
+ rcu_read_lock();
+
+ len = scnprintf(buf, sizeof(buf),
+ "access name value\n"
+ "R fq_flows_cnt %u\n"
+ "R fq_backlog %u\n"
+ "R fq_overlimit %u\n"
+ "R fq_collisions %u\n"
+ "RW fq_limit %u\n"
+ "RW fq_quantum %u\n",
+ fq->flows_cnt,
+ fq->backlog,
+ fq->overlimit,
+ fq->collisions,
+ fq->limit,
+ fq->quantum);
+
+ rcu_read_unlock();
+ spin_unlock_bh(&local->fq.lock);
return simple_read_from_buffer(user_buf, count, ppos,
- info->buf, info->len);
+ buf, len);
}
static ssize_t aqm_write(struct file *file,
@@ -210,8 +111,7 @@
size_t count,
loff_t *ppos)
{
- struct aqm_info *info = file->private_data;
- struct ieee80211_local *local = info->local;
+ struct ieee80211_local *local = file->private_data;
char buf[100];
size_t len;
@@ -237,8 +137,7 @@
static const struct file_operations aqm_ops = {
.write = aqm_write,
.read = aqm_read,
- .open = aqm_open,
- .release = aqm_release,
+ .open = simple_open,
.llseek = default_llseek,
};
@@ -302,6 +201,7 @@
FLAG(USES_RSS),
FLAG(TX_AMSDU),
FLAG(TX_FRAG_LIST),
+ FLAG(REPORTS_LOW_ACK),
#undef FLAG
};
@@ -428,7 +328,9 @@
DEBUGFS_ADD(hwflags);
DEBUGFS_ADD(user_power);
DEBUGFS_ADD(power);
- DEBUGFS_ADD_MODE(aqm, 0600);
+
+ if (local->ops->wake_tx_queue)
+ DEBUGFS_ADD_MODE(aqm, 0600);
statsd = debugfs_create_dir("statistics", phyd);
diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c
index a5ba739..5d35c0f 100644
--- a/net/mac80211/debugfs_netdev.c
+++ b/net/mac80211/debugfs_netdev.c
@@ -30,7 +30,7 @@
size_t count, loff_t *ppos,
ssize_t (*format)(const struct ieee80211_sub_if_data *, char *, int))
{
- char buf[70];
+ char buf[200];
ssize_t ret = -EINVAL;
read_lock(&dev_base_lock);
@@ -486,6 +486,38 @@
}
IEEE80211_IF_FILE_R(num_buffered_multicast);
+static ssize_t ieee80211_if_fmt_aqm(
+ const struct ieee80211_sub_if_data *sdata, char *buf, int buflen)
+{
+ struct ieee80211_local *local = sdata->local;
+ struct txq_info *txqi = to_txq_info(sdata->vif.txq);
+ int len;
+
+ spin_lock_bh(&local->fq.lock);
+ rcu_read_lock();
+
+ len = scnprintf(buf,
+ buflen,
+ "ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets\n"
+ "%u %u %u %u %u %u %u %u %u %u\n",
+ txqi->txq.ac,
+ txqi->tin.backlog_bytes,
+ txqi->tin.backlog_packets,
+ txqi->tin.flows,
+ txqi->cstats.drop_count,
+ txqi->cstats.ecn_mark,
+ txqi->tin.overlimit,
+ txqi->tin.collisions,
+ txqi->tin.tx_bytes,
+ txqi->tin.tx_packets);
+
+ rcu_read_unlock();
+ spin_unlock_bh(&local->fq.lock);
+
+ return len;
+}
+IEEE80211_IF_FILE_R(aqm);
+
/* IBSS attributes */
static ssize_t ieee80211_if_fmt_tsf(
const struct ieee80211_sub_if_data *sdata, char *buf, int buflen)
@@ -618,6 +650,9 @@
DEBUGFS_ADD(rc_rateidx_vht_mcs_mask_2ghz);
DEBUGFS_ADD(rc_rateidx_vht_mcs_mask_5ghz);
DEBUGFS_ADD(hw_queues);
+
+ if (sdata->local->ops->wake_tx_queue)
+ DEBUGFS_ADD(aqm);
}
static void add_sta_files(struct ieee80211_sub_if_data *sdata)
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index fd33413..a2fcdb4 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -133,6 +133,55 @@
}
STA_OPS(last_seq_ctrl);
+#define AQM_TXQ_ENTRY_LEN 130
+
+static ssize_t sta_aqm_read(struct file *file, char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ struct sta_info *sta = file->private_data;
+ struct ieee80211_local *local = sta->local;
+ size_t bufsz = AQM_TXQ_ENTRY_LEN*(IEEE80211_NUM_TIDS+1);
+ char *buf = kzalloc(bufsz, GFP_KERNEL), *p = buf;
+ struct txq_info *txqi;
+ ssize_t rv;
+ int i;
+
+ if (!buf)
+ return -ENOMEM;
+
+ spin_lock_bh(&local->fq.lock);
+ rcu_read_lock();
+
+ p += scnprintf(p,
+ bufsz+buf-p,
+ "tid ac backlog-bytes backlog-packets new-flows drops marks overlimit collisions tx-bytes tx-packets\n");
+
+ for (i = 0; i < IEEE80211_NUM_TIDS; i++) {
+ txqi = to_txq_info(sta->sta.txq[i]);
+ p += scnprintf(p, bufsz+buf-p,
+ "%d %d %u %u %u %u %u %u %u %u %u\n",
+ txqi->txq.tid,
+ txqi->txq.ac,
+ txqi->tin.backlog_bytes,
+ txqi->tin.backlog_packets,
+ txqi->tin.flows,
+ txqi->cstats.drop_count,
+ txqi->cstats.ecn_mark,
+ txqi->tin.overlimit,
+ txqi->tin.collisions,
+ txqi->tin.tx_bytes,
+ txqi->tin.tx_packets);
+ }
+
+ rcu_read_unlock();
+ spin_unlock_bh(&local->fq.lock);
+
+ rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
+ kfree(buf);
+ return rv;
+}
+STA_OPS(aqm);
+
static ssize_t sta_agg_status_read(struct file *file, char __user *userbuf,
size_t count, loff_t *ppos)
{
@@ -478,6 +527,9 @@
DEBUGFS_ADD_COUNTER(rx_fragments, rx_stats.fragments);
DEBUGFS_ADD_COUNTER(tx_filtered, status_stats.filtered);
+ if (local->ops->wake_tx_queue)
+ DEBUGFS_ADD(aqm);
+
if (sizeof(sta->driver_buffered_tids) == sizeof(u32))
debugfs_create_x32("driver_buffered_tids", 0400,
sta->debugfs_dir,
@@ -492,10 +544,6 @@
void ieee80211_sta_debugfs_remove(struct sta_info *sta)
{
- struct ieee80211_local *local = sta->local;
- struct ieee80211_sub_if_data *sdata = sta->sdata;
-
- drv_sta_remove_debugfs(local, sdata, &sta->sta, sta->debugfs_dir);
debugfs_remove_recursive(sta->debugfs_dir);
sta->debugfs_dir = NULL;
}
diff --git a/net/mac80211/driver-ops.c b/net/mac80211/driver-ops.c
index c258f10..c701b64 100644
--- a/net/mac80211/driver-ops.c
+++ b/net/mac80211/driver-ops.c
@@ -62,7 +62,7 @@
if (WARN_ON(sdata->vif.type == NL80211_IFTYPE_AP_VLAN ||
(sdata->vif.type == NL80211_IFTYPE_MONITOR &&
!ieee80211_hw_check(&local->hw, WANT_MONITOR_VIF) &&
- !(sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE))))
+ !(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE))))
return -EINVAL;
trace_drv_add_interface(local, sdata);
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 42a41ae..fe35a1c 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -162,7 +162,8 @@
return;
if (WARN_ON_ONCE(sdata->vif.type == NL80211_IFTYPE_P2P_DEVICE ||
- sdata->vif.type == NL80211_IFTYPE_MONITOR))
+ (sdata->vif.type == NL80211_IFTYPE_MONITOR &&
+ !sdata->vif.mu_mimo_owner)))
return;
if (!check_sdata_in_driver(sdata))
@@ -498,21 +499,6 @@
local->ops->sta_add_debugfs(&local->hw, &sdata->vif,
sta, dir);
}
-
-static inline void drv_sta_remove_debugfs(struct ieee80211_local *local,
- struct ieee80211_sub_if_data *sdata,
- struct ieee80211_sta *sta,
- struct dentry *dir)
-{
- might_sleep();
-
- sdata = get_bss_sdata(sdata);
- check_sdata_in_driver(sdata);
-
- if (local->ops->sta_remove_debugfs)
- local->ops->sta_remove_debugfs(&local->hw, &sdata->vif,
- sta, dir);
-}
#endif
static inline void drv_sta_pre_rcu_remove(struct ieee80211_local *local,
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index f56d342..e496dee 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -3,7 +3,7 @@
* Copyright 2005, Devicescape Software, Inc.
* Copyright 2006-2007 Jiri Benc <jbenc@suse.cz>
* Copyright 2007-2010 Johannes Berg <johannes@sipsolutions.net>
- * Copyright 2013-2014 Intel Mobile Communications GmbH
+ * Copyright 2013-2015 Intel Mobile Communications GmbH
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
@@ -818,12 +818,18 @@
struct fq_tin tin;
struct fq_flow def_flow;
struct codel_vars def_cvars;
+ struct codel_stats cstats;
unsigned long flags;
/* keep last! */
struct ieee80211_txq txq;
};
+struct ieee80211_if_mntr {
+ u32 flags;
+ u8 mu_follow_addr[ETH_ALEN] __aligned(2);
+};
+
struct ieee80211_sub_if_data {
struct list_head list;
@@ -922,7 +928,7 @@
struct ieee80211_if_ibss ibss;
struct ieee80211_if_mesh mesh;
struct ieee80211_if_ocb ocb;
- u32 mntr_flags;
+ struct ieee80211_if_mntr mntr;
} u;
#ifdef CONFIG_MAC80211_DEBUGFS
@@ -1112,7 +1118,6 @@
struct fq fq;
struct codel_vars *cvars;
struct codel_params cparams;
- struct codel_stats cstats;
const struct ieee80211_ops *ops;
@@ -1208,7 +1213,7 @@
spinlock_t tim_lock;
unsigned long num_sta;
struct list_head sta_list;
- struct rhashtable sta_hash;
+ struct rhltable sta_hash;
struct timer_list sta_cleanup;
int sta_generation;
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b123a9e..b0abddc7 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -43,6 +43,8 @@
* by either the RTNL, the iflist_mtx or RCU.
*/
+static void ieee80211_iface_work(struct work_struct *work);
+
bool __ieee80211_recalc_txpower(struct ieee80211_sub_if_data *sdata)
{
struct ieee80211_chanctx_conf *chanctx_conf;
@@ -188,7 +190,7 @@
continue;
if (iter->vif.type == NL80211_IFTYPE_MONITOR &&
- !(iter->u.mntr_flags & MONITOR_FLAG_ACTIVE))
+ !(iter->u.mntr.flags & MONITOR_FLAG_ACTIVE))
continue;
m = iter->vif.addr;
@@ -217,7 +219,7 @@
return -EBUSY;
if (sdata->vif.type == NL80211_IFTYPE_MONITOR &&
- !(sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE))
+ !(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE))
check_dup = false;
ret = ieee80211_verify_mac(sdata, sa->sa_data, check_dup);
@@ -357,7 +359,7 @@
const int offset)
{
struct ieee80211_local *local = sdata->local;
- u32 flags = sdata->u.mntr_flags;
+ u32 flags = sdata->u.mntr.flags;
#define ADJUST(_f, _s) do { \
if (flags & MONITOR_FLAG_##_f) \
@@ -448,6 +450,9 @@
return ret;
}
+ skb_queue_head_init(&sdata->skb_queue);
+ INIT_WORK(&sdata->work, ieee80211_iface_work);
+
return 0;
}
@@ -589,12 +594,12 @@
}
break;
case NL80211_IFTYPE_MONITOR:
- if (sdata->u.mntr_flags & MONITOR_FLAG_COOK_FRAMES) {
+ if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) {
local->cooked_mntrs++;
break;
}
- if (sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE) {
+ if (sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE) {
res = drv_add_interface(local, sdata);
if (res)
goto err_stop;
@@ -926,7 +931,7 @@
/* no need to tell driver */
break;
case NL80211_IFTYPE_MONITOR:
- if (sdata->u.mntr_flags & MONITOR_FLAG_COOK_FRAMES) {
+ if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) {
local->cooked_mntrs--;
break;
}
@@ -1012,7 +1017,7 @@
ieee80211_recalc_idle(local);
mutex_unlock(&local->mtx);
- if (!(sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE))
+ if (!(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE))
break;
/* fall through */
@@ -1444,7 +1449,7 @@
case NL80211_IFTYPE_MONITOR:
sdata->dev->type = ARPHRD_IEEE80211_RADIOTAP;
sdata->dev->netdev_ops = &ieee80211_monitorif_ops;
- sdata->u.mntr_flags = MONITOR_FLAG_CONTROL |
+ sdata->u.mntr.flags = MONITOR_FLAG_CONTROL |
MONITOR_FLAG_OTHER_BSS;
break;
case NL80211_IFTYPE_WDS:
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index d00ea9b..ac053a9 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -660,6 +660,9 @@
ieee80211_roc_setup(local);
+ local->hw.radiotap_timestamp.units_pos = -1;
+ local->hw.radiotap_timestamp.accuracy = -1;
+
return &local->hw;
err_free:
wiphy_free(wiphy);
diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index fa7d37c..b747c96 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -757,6 +757,7 @@
sta = next_hop_deref_protected(mpath);
if (mpath->flags & MESH_PATH_ACTIVE &&
ether_addr_equal(ta, sta->sta.addr) &&
+ !(mpath->flags & MESH_PATH_FIXED) &&
(!(mpath->flags & MESH_PATH_SN_VALID) ||
SN_GT(target_sn, mpath->sn) || target_sn == 0)) {
mpath->flags &= ~MESH_PATH_ACTIVE;
@@ -1023,7 +1024,7 @@
goto enddiscovery;
spin_lock_bh(&mpath->state_lock);
- if (mpath->flags & MESH_PATH_DELETED) {
+ if (mpath->flags & (MESH_PATH_DELETED | MESH_PATH_FIXED)) {
spin_unlock_bh(&mpath->state_lock);
goto enddiscovery;
}
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index 6db2ddf..f0e6175 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -826,7 +826,7 @@
mpath->metric = 0;
mpath->hop_count = 0;
mpath->exp_time = 0;
- mpath->flags |= MESH_PATH_FIXED;
+ mpath->flags = MESH_PATH_FIXED | MESH_PATH_SN_VALID;
mesh_path_activate(mpath);
spin_unlock_bh(&mpath->state_lock);
mesh_path_tx_pending(mpath);
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 8d426f6..7486f2d 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -1672,11 +1672,15 @@
non_acm_ac++)
if (!(sdata->wmm_acm & BIT(7 - 2 * non_acm_ac)))
break;
- /* The loop will result in using BK even if it requires
- * admission control, such configuration makes no sense
- * and we have to transmit somehow - the AC selection
- * does the same thing.
+ /* Usually the loop will result in using BK even if it
+ * requires admission control, but such a configuration
+ * makes no sense and we have to transmit somehow - the
+ * AC selection does the same thing.
+ * If we started out trying to downgrade from BK, then
+ * the extra condition here might be needed.
*/
+ if (non_acm_ac >= IEEE80211_NUM_ACS)
+ non_acm_ac = IEEE80211_AC_BK;
if (drv_conf_tx(local, sdata, ac,
&sdata->tx_conf[non_acm_ac]))
sdata_err(sdata,
diff --git a/net/mac80211/pm.c b/net/mac80211/pm.c
index 00a43a7..28a3a09 100644
--- a/net/mac80211/pm.c
+++ b/net/mac80211/pm.c
@@ -178,8 +178,7 @@
WARN_ON(!list_empty(&local->chanctx_list));
/* stop hardware - this must stop RX */
- if (local->open_count)
- ieee80211_stop_device(local);
+ ieee80211_stop_device(local);
suspend:
local->suspended = true;
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 9dce3b1..f7cf342 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -180,6 +180,11 @@
len += 12;
}
+ if (local->hw.radiotap_timestamp.units_pos >= 0) {
+ len = ALIGN(len, 8);
+ len += 12;
+ }
+
if (status->chains) {
/* antenna and antenna signal fields */
len += 2 * hweight8(status->chains);
@@ -447,6 +452,31 @@
pos += 2;
}
+ if (local->hw.radiotap_timestamp.units_pos >= 0) {
+ u16 accuracy = 0;
+ u8 flags = IEEE80211_RADIOTAP_TIMESTAMP_FLAG_32BIT;
+
+ rthdr->it_present |=
+ cpu_to_le32(1 << IEEE80211_RADIOTAP_TIMESTAMP);
+
+ /* ensure 8 byte alignment */
+ while ((pos - (u8 *)rthdr) & 7)
+ pos++;
+
+ put_unaligned_le64(status->device_timestamp, pos);
+ pos += sizeof(u64);
+
+ if (local->hw.radiotap_timestamp.accuracy >= 0) {
+ accuracy = local->hw.radiotap_timestamp.accuracy;
+ flags |= IEEE80211_RADIOTAP_TIMESTAMP_FLAG_ACCURACY;
+ }
+ put_unaligned_le16(accuracy, pos);
+ pos += sizeof(u16);
+
+ *pos++ = local->hw.radiotap_timestamp.units_pos;
+ *pos++ = flags;
+ }
+
for_each_set_bit(chain, &chains, IEEE80211_MAX_CHAINS) {
*pos++ = status->chain_signal[chain];
*pos++ = chain;
@@ -485,6 +515,9 @@
struct net_device *prev_dev = NULL;
int present_fcs_len = 0;
unsigned int rtap_vendor_space = 0;
+ struct ieee80211_mgmt *mgmt;
+ struct ieee80211_sub_if_data *monitor_sdata =
+ rcu_dereference(local->monitor_sdata);
if (unlikely(status->flag & RX_FLAG_RADIOTAP_VENDOR_DATA)) {
struct ieee80211_vendor_radiotap *rtap = (void *)origskb->data;
@@ -567,7 +600,7 @@
if (sdata->vif.type != NL80211_IFTYPE_MONITOR)
continue;
- if (sdata->u.mntr_flags & MONITOR_FLAG_COOK_FRAMES)
+ if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES)
continue;
if (!ieee80211_sdata_running(sdata))
@@ -585,6 +618,23 @@
ieee80211_rx_stats(sdata->dev, skb->len);
}
+ mgmt = (void *)skb->data;
+ if (monitor_sdata &&
+ skb->len >= IEEE80211_MIN_ACTION_SIZE + 1 + VHT_MUMIMO_GROUPS_DATA_LEN &&
+ ieee80211_is_action(mgmt->frame_control) &&
+ mgmt->u.action.category == WLAN_CATEGORY_VHT &&
+ mgmt->u.action.u.vht_group_notif.action_code == WLAN_VHT_ACTION_GROUPID_MGMT &&
+ is_valid_ether_addr(monitor_sdata->u.mntr.mu_follow_addr) &&
+ ether_addr_equal(mgmt->da, monitor_sdata->u.mntr.mu_follow_addr)) {
+ struct sk_buff *mu_skb = skb_copy(skb, GFP_ATOMIC);
+
+ if (mu_skb) {
+ mu_skb->pkt_type = IEEE80211_SDATA_QUEUE_TYPE_FRAME;
+ skb_queue_tail(&monitor_sdata->skb_queue, mu_skb);
+ ieee80211_queue_work(&local->hw, &monitor_sdata->work);
+ }
+ }
+
if (prev_dev) {
skb->dev = prev_dev;
netif_receive_skb(skb);
@@ -1072,8 +1122,15 @@
tid = *ieee80211_get_qos_ctl(hdr) & IEEE80211_QOS_CTL_TID_MASK;
tid_agg_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
- if (!tid_agg_rx)
+ if (!tid_agg_rx) {
+ if (ack_policy == IEEE80211_QOS_CTL_ACK_POLICY_BLOCKACK &&
+ !test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) &&
+ !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg))
+ ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid,
+ WLAN_BACK_RECIPIENT,
+ WLAN_REASON_QSTA_REQUIRE_SETUP);
goto dont_reorder;
+ }
/* qos null data frames are excluded */
if (unlikely(hdr->frame_control & cpu_to_le16(IEEE80211_STYPE_NULLFUNC)))
@@ -2535,6 +2592,12 @@
tid = le16_to_cpu(bar_data.control) >> 12;
+ if (!test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) &&
+ !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg))
+ ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid,
+ WLAN_BACK_RECIPIENT,
+ WLAN_REASON_QSTA_REQUIRE_SETUP);
+
tid_agg_rx = rcu_dereference(rx->sta->ampdu_mlme.tid_rx[tid]);
if (!tid_agg_rx)
return RX_DROP_MONITOR;
@@ -3147,7 +3210,7 @@
continue;
if (sdata->vif.type != NL80211_IFTYPE_MONITOR ||
- !(sdata->u.mntr_flags & MONITOR_FLAG_COOK_FRAMES))
+ !(sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES))
continue;
if (prev_dev) {
@@ -3940,7 +4003,7 @@
__le16 fc;
struct ieee80211_rx_data rx;
struct ieee80211_sub_if_data *prev;
- struct rhash_head *tmp;
+ struct rhlist_head *tmp;
int err = 0;
fc = ((struct ieee80211_hdr *)skb->data)->frame_control;
@@ -3983,13 +4046,10 @@
goto out;
} else if (ieee80211_is_data(fc)) {
struct sta_info *sta, *prev_sta;
- const struct bucket_table *tbl;
prev_sta = NULL;
- tbl = rht_dereference_rcu(local->sta_hash.tbl, &local->sta_hash);
-
- for_each_sta_info(local, tbl, hdr->addr2, sta, tmp) {
+ for_each_sta_info(local, hdr->addr2, sta, tmp) {
if (!prev_sta) {
prev_sta = sta;
continue;
diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
index 070b40f..23d8ac8 100644
--- a/net/mac80211/scan.c
+++ b/net/mac80211/scan.c
@@ -420,7 +420,7 @@
{
struct ieee80211_local *local = hw_to_local(hw);
- trace_api_scan_completed(local, info);
+ trace_api_scan_completed(local, info->aborted);
set_bit(SCAN_COMPLETED, &local->scanning);
if (info->aborted)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 19f14c9..011880d 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -67,12 +67,10 @@
static const struct rhashtable_params sta_rht_params = {
.nelem_hint = 3, /* start small */
- .insecure_elasticity = true, /* Disable chain-length checks. */
.automatic_shrinking = true,
.head_offset = offsetof(struct sta_info, hash_node),
.key_offset = offsetof(struct sta_info, addr),
.key_len = ETH_ALEN,
- .hashfn = sta_addr_hash,
.max_size = CONFIG_MAC80211_STA_HASH_MAX_SIZE,
};
@@ -80,8 +78,8 @@
static int sta_info_hash_del(struct ieee80211_local *local,
struct sta_info *sta)
{
- return rhashtable_remove_fast(&local->sta_hash, &sta->hash_node,
- sta_rht_params);
+ return rhltable_remove(&local->sta_hash, &sta->hash_node,
+ sta_rht_params);
}
static void __cleanup_single_sta(struct sta_info *sta)
@@ -157,19 +155,22 @@
sta_info_free(local, sta);
}
+struct rhlist_head *sta_info_hash_lookup(struct ieee80211_local *local,
+ const u8 *addr)
+{
+ return rhltable_lookup(&local->sta_hash, addr, sta_rht_params);
+}
+
/* protected by RCU */
struct sta_info *sta_info_get(struct ieee80211_sub_if_data *sdata,
const u8 *addr)
{
struct ieee80211_local *local = sdata->local;
+ struct rhlist_head *tmp;
struct sta_info *sta;
- struct rhash_head *tmp;
- const struct bucket_table *tbl;
rcu_read_lock();
- tbl = rht_dereference_rcu(local->sta_hash.tbl, &local->sta_hash);
-
- for_each_sta_info(local, tbl, addr, sta, tmp) {
+ for_each_sta_info(local, addr, sta, tmp) {
if (sta->sdata == sdata) {
rcu_read_unlock();
/* this is safe as the caller must already hold
@@ -190,14 +191,11 @@
const u8 *addr)
{
struct ieee80211_local *local = sdata->local;
+ struct rhlist_head *tmp;
struct sta_info *sta;
- struct rhash_head *tmp;
- const struct bucket_table *tbl;
rcu_read_lock();
- tbl = rht_dereference_rcu(local->sta_hash.tbl, &local->sta_hash);
-
- for_each_sta_info(local, tbl, addr, sta, tmp) {
+ for_each_sta_info(local, addr, sta, tmp) {
if (sta->sdata == sdata ||
(sta->sdata->bss && sta->sdata->bss == sdata->bss)) {
rcu_read_unlock();
@@ -263,8 +261,8 @@
static int sta_info_hash_add(struct ieee80211_local *local,
struct sta_info *sta)
{
- return rhashtable_insert_fast(&local->sta_hash, &sta->hash_node,
- sta_rht_params);
+ return rhltable_insert(&local->sta_hash, &sta->hash_node,
+ sta_rht_params);
}
static void sta_deliver_ps_frames(struct work_struct *wk)
@@ -340,6 +338,9 @@
memcpy(sta->addr, addr, ETH_ALEN);
memcpy(sta->sta.addr, addr, ETH_ALEN);
+ sta->sta.max_rx_aggregation_subframes =
+ local->hw.max_rx_aggregation_subframes;
+
sta->local = local;
sta->sdata = sdata;
sta->rx_stats.last_rx = jiffies;
@@ -450,9 +451,9 @@
is_multicast_ether_addr(sta->sta.addr)))
return -EINVAL;
- /* Strictly speaking this isn't necessary as we hold the mutex, but
- * the rhashtable code can't really deal with that distinction. We
- * do require the mutex for correctness though.
+ /* The RCU read lock is required by rhashtable due to
+ * asynchronous resize/rehash. We also require the mutex
+ * for correctness.
*/
rcu_read_lock();
lockdep_assert_held(&sdata->local->sta_mtx);
@@ -687,7 +688,7 @@
}
/* No need to do anything if the driver does all */
- if (ieee80211_hw_check(&local->hw, AP_LINK_PS))
+ if (!local->ops->set_tim)
return;
if (sta->dead)
@@ -1040,16 +1041,11 @@
round_jiffies(jiffies + STA_INFO_CLEANUP_INTERVAL));
}
-u32 sta_addr_hash(const void *key, u32 length, u32 seed)
-{
- return jhash(key, ETH_ALEN, seed);
-}
-
int sta_info_init(struct ieee80211_local *local)
{
int err;
- err = rhashtable_init(&local->sta_hash, &sta_rht_params);
+ err = rhltable_init(&local->sta_hash, &sta_rht_params);
if (err)
return err;
@@ -1065,7 +1061,7 @@
void sta_info_stop(struct ieee80211_local *local)
{
del_timer_sync(&local->sta_cleanup);
- rhashtable_destroy(&local->sta_hash);
+ rhltable_destroy(&local->sta_hash);
}
@@ -1135,17 +1131,14 @@
const u8 *localaddr)
{
struct ieee80211_local *local = hw_to_local(hw);
+ struct rhlist_head *tmp;
struct sta_info *sta;
- struct rhash_head *tmp;
- const struct bucket_table *tbl;
-
- tbl = rht_dereference_rcu(local->sta_hash.tbl, &local->sta_hash);
/*
* Just return a random station if localaddr is NULL
* ... first in list.
*/
- for_each_sta_info(local, tbl, addr, sta, tmp) {
+ for_each_sta_info(local, addr, sta, tmp) {
if (localaddr &&
!ether_addr_equal(sta->sdata->vif.addr, localaddr))
continue;
@@ -1616,7 +1609,6 @@
sta_info_recalc_tim(sta);
} else {
- unsigned long tids = sta->txq_buffered_tids & driver_release_tids;
int tid;
/*
@@ -1648,7 +1640,8 @@
for (tid = 0; tid < ARRAY_SIZE(sta->sta.txq); tid++) {
struct txq_info *txqi = to_txq_info(sta->sta.txq[tid]);
- if (!(tids & BIT(tid)) || txqi->tin.backlog_packets)
+ if (!(driver_release_tids & BIT(tid)) ||
+ txqi->tin.backlog_packets)
continue;
sta_info_recalc_tim(sta);
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 0556be3..ed5fcb9 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -230,6 +230,8 @@
* @tid_rx_stop_requested: bitmap indicating which BA sessions per TID the
* driver requested to close until the work for it runs
* @agg_session_valid: bitmap indicating which TID has a rx BA session open on
+ * @unexpected_agg: bitmap indicating which TID already sent a delBA due to
+ * unexpected aggregation related frames outside a session
* @work: work struct for starting/stopping aggregation
* @tid_tx: aggregation info for Tx per TID
* @tid_start_tx: sessions where start was requested
@@ -244,6 +246,7 @@
unsigned long tid_rx_timer_expired[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
unsigned long tid_rx_stop_requested[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
unsigned long agg_session_valid[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
+ unsigned long unexpected_agg[BITS_TO_LONGS(IEEE80211_NUM_TIDS)];
/* tx */
struct work_struct work;
struct tid_ampdu_tx __rcu *tid_tx[IEEE80211_NUM_TIDS];
@@ -452,7 +455,7 @@
/* General information, mostly static */
struct list_head list, free_list;
struct rcu_head rcu_head;
- struct rhash_head hash_node;
+ struct rhlist_head hash_node;
u8 addr[ETH_ALEN];
struct ieee80211_local *local;
struct ieee80211_sub_if_data *sdata;
@@ -635,6 +638,9 @@
*/
#define STA_INFO_CLEANUP_INTERVAL (10 * HZ)
+struct rhlist_head *sta_info_hash_lookup(struct ieee80211_local *local,
+ const u8 *addr);
+
/*
* Get a STA info, must be under RCU read lock.
*/
@@ -644,17 +650,9 @@
struct sta_info *sta_info_get_bss(struct ieee80211_sub_if_data *sdata,
const u8 *addr);
-u32 sta_addr_hash(const void *key, u32 length, u32 seed);
-
-#define _sta_bucket_idx(_tbl, _a) \
- rht_bucket_index(_tbl, sta_addr_hash(_a, ETH_ALEN, (_tbl)->hash_rnd))
-
-#define for_each_sta_info(local, tbl, _addr, _sta, _tmp) \
- rht_for_each_entry_rcu(_sta, _tmp, tbl, \
- _sta_bucket_idx(tbl, _addr), \
- hash_node) \
- /* compare address and run code only if it matches */ \
- if (ether_addr_equal(_sta->addr, (_addr)))
+#define for_each_sta_info(local, _addr, _sta, _tmp) \
+ rhl_for_each_entry_rcu(_sta, _tmp, \
+ sta_info_hash_lookup(local, _addr), hash_node)
/*
* Get STA info by index, BROKEN!
diff --git a/net/mac80211/status.c b/net/mac80211/status.c
index a2a6826..ddf71c6 100644
--- a/net/mac80211/status.c
+++ b/net/mac80211/status.c
@@ -557,6 +557,12 @@
static void ieee80211_lost_packet(struct sta_info *sta,
struct ieee80211_tx_info *info)
{
+ /* If driver relies on its own algorithm for station kickout, skip
+ * mac80211 packet loss mechanism.
+ */
+ if (ieee80211_hw_check(&sta->local->hw, REPORTS_LOW_ACK))
+ return;
+
/* This packet was aggregated but doesn't carry status info */
if ((info->flags & IEEE80211_TX_CTL_AMPDU) &&
!(info->flags & IEEE80211_TX_STAT_AMPDU))
@@ -709,7 +715,7 @@
if (!ieee80211_sdata_running(sdata))
continue;
- if ((sdata->u.mntr_flags & MONITOR_FLAG_COOK_FRAMES) &&
+ if ((sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) &&
!send_to_cooked)
continue;
@@ -740,8 +746,8 @@
struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
__le16 fc;
struct ieee80211_supported_band *sband;
+ struct rhlist_head *tmp;
struct sta_info *sta;
- struct rhash_head *tmp;
int retry_count;
int rates_idx;
bool send_to_cooked;
@@ -749,7 +755,6 @@
struct ieee80211_bar *bar;
int shift = 0;
int tid = IEEE80211_NUM_TIDS;
- const struct bucket_table *tbl;
rates_idx = ieee80211_tx_get_rates(hw, info, &retry_count);
@@ -758,9 +763,7 @@
sband = local->hw.wiphy->bands[info->band];
fc = hdr->frame_control;
- tbl = rht_dereference_rcu(local->sta_hash.tbl, &local->sta_hash);
-
- for_each_sta_info(local, tbl, hdr->addr1, sta, tmp) {
+ for_each_sta_info(local, hdr->addr1, sta, tmp) {
/* skip wrong virtual interface */
if (!ether_addr_equal(hdr->addr2, sta->sdata->vif.addr))
continue;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 1d0746d..1ff08be 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -796,6 +796,36 @@
return ret;
}
+static struct txq_info *ieee80211_get_txq(struct ieee80211_local *local,
+ struct ieee80211_vif *vif,
+ struct ieee80211_sta *pubsta,
+ struct sk_buff *skb)
+{
+ struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
+ struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+ struct ieee80211_txq *txq = NULL;
+
+ if ((info->flags & IEEE80211_TX_CTL_SEND_AFTER_DTIM) ||
+ (info->control.flags & IEEE80211_TX_CTRL_PS_RESPONSE))
+ return NULL;
+
+ if (!ieee80211_is_data(hdr->frame_control))
+ return NULL;
+
+ if (pubsta) {
+ u8 tid = skb->priority & IEEE80211_QOS_CTL_TID_MASK;
+
+ txq = pubsta->txq[tid];
+ } else if (vif) {
+ txq = vif->txq;
+ }
+
+ if (!txq)
+ return NULL;
+
+ return to_txq_info(txq);
+}
+
static ieee80211_tx_result debug_noinline
ieee80211_tx_h_sequence(struct ieee80211_tx_data *tx)
{
@@ -853,7 +883,8 @@
tid = *qc & IEEE80211_QOS_CTL_TID_MASK;
tx->sta->tx_stats.msdu[tid]++;
- if (!tx->sta->sta.txq[0])
+ if (!ieee80211_get_txq(tx->local, info->control.vif, &tx->sta->sta,
+ tx->skb))
hdr->seq_ctrl = ieee80211_tx_next_seq(tx->sta, tid);
return TX_CONTINUE;
@@ -1243,36 +1274,6 @@
return TX_CONTINUE;
}
-static struct txq_info *ieee80211_get_txq(struct ieee80211_local *local,
- struct ieee80211_vif *vif,
- struct ieee80211_sta *pubsta,
- struct sk_buff *skb)
-{
- struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
- struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
- struct ieee80211_txq *txq = NULL;
-
- if ((info->flags & IEEE80211_TX_CTL_SEND_AFTER_DTIM) ||
- (info->control.flags & IEEE80211_TX_CTRL_PS_RESPONSE))
- return NULL;
-
- if (!ieee80211_is_data(hdr->frame_control))
- return NULL;
-
- if (pubsta) {
- u8 tid = skb->priority & IEEE80211_QOS_CTL_TID_MASK;
-
- txq = pubsta->txq[tid];
- } else if (vif) {
- txq = vif->txq;
- }
-
- if (!txq)
- return NULL;
-
- return to_txq_info(txq);
-}
-
static void ieee80211_set_skb_enqueue_time(struct sk_buff *skb)
{
IEEE80211_SKB_CB(skb)->control.enqueue_time = codel_get_time();
@@ -1343,7 +1344,7 @@
local = container_of(fq, struct ieee80211_local, fq);
txqi = container_of(tin, struct txq_info, tin);
cparams = &local->cparams;
- cstats = &local->cstats;
+ cstats = &txqi->cstats;
if (flow == &txqi->def_flow)
cvars = &txqi->def_cvars;
@@ -1403,6 +1404,7 @@
fq_tin_init(&txqi->tin);
fq_flow_init(&txqi->def_flow);
codel_vars_init(&txqi->def_cvars);
+ codel_stats_init(&txqi->cstats);
txqi->txq.vif = &sdata->vif;
@@ -1441,7 +1443,6 @@
return ret;
codel_params_init(&local->cparams);
- codel_stats_init(&local->cstats);
local->cparams.interval = MS2TIME(100);
local->cparams.target = MS2TIME(20);
local->cparams.ecn = true;
@@ -1514,8 +1515,12 @@
spin_unlock_bh(&fq->lock);
if (skb && skb_has_frag_list(skb) &&
- !ieee80211_hw_check(&local->hw, TX_FRAG_LIST))
- skb_linearize(skb);
+ !ieee80211_hw_check(&local->hw, TX_FRAG_LIST)) {
+ if (skb_linearize(skb)) {
+ ieee80211_free_txskb(&local->hw, skb);
+ return NULL;
+ }
+ }
return skb;
}
@@ -1643,7 +1648,7 @@
switch (sdata->vif.type) {
case NL80211_IFTYPE_MONITOR:
- if (sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE) {
+ if (sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE) {
vif = &sdata->vif;
break;
}
@@ -2263,15 +2268,9 @@
case NL80211_IFTYPE_STATION:
if (sdata->wdev.wiphy->flags & WIPHY_FLAG_SUPPORTS_TDLS) {
sta = sta_info_get(sdata, skb->data);
- if (sta) {
- bool tdls_peer, tdls_auth;
-
- tdls_peer = test_sta_flag(sta,
- WLAN_STA_TDLS_PEER);
- tdls_auth = test_sta_flag(sta,
- WLAN_STA_TDLS_PEER_AUTH);
-
- if (tdls_peer && tdls_auth) {
+ if (sta && test_sta_flag(sta, WLAN_STA_TDLS_PEER)) {
+ if (test_sta_flag(sta,
+ WLAN_STA_TDLS_PEER_AUTH)) {
*sta_out = sta;
return 0;
}
@@ -2283,8 +2282,7 @@
* after a TDLS sta is removed due to being
* unreachable.
*/
- if (tdls_peer && !tdls_auth &&
- !ieee80211_is_tdls_setup(skb))
+ if (!ieee80211_is_tdls_setup(skb))
return -EINVAL;
}
@@ -3243,7 +3241,7 @@
if (hdr->frame_control & cpu_to_le16(IEEE80211_STYPE_QOS_DATA)) {
*ieee80211_get_qos_ctl(hdr) = tid;
- if (!sta->sta.txq[0])
+ if (!ieee80211_get_txq(local, &sdata->vif, &sta->sta, skb))
hdr->seq_ctrl = ieee80211_tx_next_seq(sta, tid);
} else {
info->flags |= IEEE80211_TX_CTL_ASSIGN_SEQ;
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 42bf0b6..b6865d8 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -598,7 +598,7 @@
list_for_each_entry_rcu(sdata, &local->interfaces, list) {
switch (sdata->vif.type) {
case NL80211_IFTYPE_MONITOR:
- if (!(sdata->u.mntr_flags & MONITOR_FLAG_ACTIVE))
+ if (!(sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE))
continue;
break;
case NL80211_IFTYPE_AP_VLAN:
@@ -2555,7 +2555,6 @@
if (need_basic && basic_rates & BIT(i))
basic = 0x80;
- rate = sband->bitrates[i].bitrate;
rate = DIV_ROUND_UP(sband->bitrates[i].bitrate,
5 * (1 << shift));
*pos++ = basic | (u8) rate;
diff --git a/net/mac802154/iface.c b/net/mac802154/iface.c
index 7079cd3..06019db 100644
--- a/net/mac802154/iface.c
+++ b/net/mac802154/iface.c
@@ -663,6 +663,7 @@
/* TODO check this */
SET_NETDEV_DEV(ndev, &local->phy->dev);
+ dev_net_set(ndev, wpan_phy_net(local->hw.phy));
sdata = netdev_priv(ndev);
ndev->ieee802154_ptr = &sdata->wpan_dev;
memcpy(sdata->name, ndev->name, IFNAMSIZ);
diff --git a/net/mac802154/rx.c b/net/mac802154/rx.c
index 446e130..4dcf6e1 100644
--- a/net/mac802154/rx.c
+++ b/net/mac802154/rx.c
@@ -101,11 +101,16 @@
sdata->dev->stats.rx_bytes += skb->len;
switch (mac_cb(skb)->type) {
+ case IEEE802154_FC_TYPE_BEACON:
+ case IEEE802154_FC_TYPE_ACK:
+ case IEEE802154_FC_TYPE_MAC_CMD:
+ goto fail;
+
case IEEE802154_FC_TYPE_DATA:
return ieee802154_deliver_skb(skb);
default:
- pr_warn("ieee802154: bad frame received (type = %d)\n",
- mac_cb(skb)->type);
+ pr_warn_ratelimited("ieee802154: bad frame received "
+ "(type = %d)\n", mac_cb(skb)->type);
goto fail;
}
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 0c858110..c23c3c8 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -71,8 +71,9 @@
# nf_tables
nf_tables-objs += nf_tables_core.o nf_tables_api.o nf_tables_trace.o
-nf_tables-objs += nft_immediate.o nft_cmp.o nft_lookup.o nft_dynset.o
+nf_tables-objs += nft_immediate.o nft_cmp.o nft_range.o
nf_tables-objs += nft_bitwise.o nft_byteorder.o nft_payload.o
+nf_tables-objs += nft_lookup.o nft_dynset.o
obj-$(CONFIG_NF_TABLES) += nf_tables.o
obj-$(CONFIG_NF_TABLES_INET) += nf_tables_inet.o
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index f39276d..fa6715d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -22,6 +22,7 @@
#include <linux/proc_fs.h>
#include <linux/mutex.h>
#include <linux/slab.h>
+#include <linux/rcupdate.h>
#include <net/net_namespace.h>
#include <net/sock.h>
@@ -61,33 +62,55 @@
#endif
static DEFINE_MUTEX(nf_hook_mutex);
+#define nf_entry_dereference(e) \
+ rcu_dereference_protected(e, lockdep_is_held(&nf_hook_mutex))
-static struct list_head *nf_find_hook_list(struct net *net,
- const struct nf_hook_ops *reg)
+static struct nf_hook_entry *nf_hook_entry_head(struct net *net,
+ const struct nf_hook_ops *reg)
{
- struct list_head *hook_list = NULL;
+ struct nf_hook_entry *hook_head = NULL;
if (reg->pf != NFPROTO_NETDEV)
- hook_list = &net->nf.hooks[reg->pf][reg->hooknum];
+ hook_head = nf_entry_dereference(net->nf.hooks[reg->pf]
+ [reg->hooknum]);
else if (reg->hooknum == NF_NETDEV_INGRESS) {
#ifdef CONFIG_NETFILTER_INGRESS
if (reg->dev && dev_net(reg->dev) == net)
- hook_list = ®->dev->nf_hooks_ingress;
+ hook_head =
+ nf_entry_dereference(
+ reg->dev->nf_hooks_ingress);
#endif
}
- return hook_list;
+ return hook_head;
}
-struct nf_hook_entry {
- const struct nf_hook_ops *orig_ops;
- struct nf_hook_ops ops;
-};
+/* must hold nf_hook_mutex */
+static void nf_set_hooks_head(struct net *net, const struct nf_hook_ops *reg,
+ struct nf_hook_entry *entry)
+{
+ switch (reg->pf) {
+ case NFPROTO_NETDEV:
+ /* We already checked in nf_register_net_hook() that this is
+ * used from ingress.
+ */
+ rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
+ break;
+ default:
+ rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
+ entry);
+ break;
+ }
+}
int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
{
- struct list_head *hook_list;
+ struct nf_hook_entry *hooks_entry;
struct nf_hook_entry *entry;
- struct nf_hook_ops *elem;
+
+ if (reg->pf == NFPROTO_NETDEV &&
+ (reg->hooknum != NF_NETDEV_INGRESS ||
+ !reg->dev || dev_net(reg->dev) != net))
+ return -EINVAL;
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
@@ -95,19 +118,30 @@
entry->orig_ops = reg;
entry->ops = *reg;
-
- hook_list = nf_find_hook_list(net, reg);
- if (!hook_list) {
- kfree(entry);
- return -ENOENT;
- }
+ entry->next = NULL;
mutex_lock(&nf_hook_mutex);
- list_for_each_entry(elem, hook_list, list) {
- if (reg->priority < elem->priority)
- break;
+ hooks_entry = nf_hook_entry_head(net, reg);
+
+ if (hooks_entry && hooks_entry->orig_ops->priority > reg->priority) {
+ /* This is the case where we need to insert at the head */
+ entry->next = hooks_entry;
+ hooks_entry = NULL;
}
- list_add_rcu(&entry->ops.list, elem->list.prev);
+
+ while (hooks_entry &&
+ reg->priority >= hooks_entry->orig_ops->priority &&
+ nf_entry_dereference(hooks_entry->next)) {
+ hooks_entry = nf_entry_dereference(hooks_entry->next);
+ }
+
+ if (hooks_entry) {
+ entry->next = nf_entry_dereference(hooks_entry->next);
+ rcu_assign_pointer(hooks_entry->next, entry);
+ } else {
+ nf_set_hooks_head(net, reg, entry);
+ }
+
mutex_unlock(&nf_hook_mutex);
#ifdef CONFIG_NETFILTER_INGRESS
if (reg->pf == NFPROTO_NETDEV && reg->hooknum == NF_NETDEV_INGRESS)
@@ -122,24 +156,33 @@
void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
{
- struct list_head *hook_list;
- struct nf_hook_entry *entry;
- struct nf_hook_ops *elem;
-
- hook_list = nf_find_hook_list(net, reg);
- if (!hook_list)
- return;
+ struct nf_hook_entry *hooks_entry;
mutex_lock(&nf_hook_mutex);
- list_for_each_entry(elem, hook_list, list) {
- entry = container_of(elem, struct nf_hook_entry, ops);
- if (entry->orig_ops == reg) {
- list_del_rcu(&entry->ops.list);
- break;
- }
+ hooks_entry = nf_hook_entry_head(net, reg);
+ if (hooks_entry->orig_ops == reg) {
+ nf_set_hooks_head(net, reg,
+ nf_entry_dereference(hooks_entry->next));
+ goto unlock;
}
+ while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
+ struct nf_hook_entry *next =
+ nf_entry_dereference(hooks_entry->next);
+ struct nf_hook_entry *nnext;
+
+ if (next->orig_ops != reg) {
+ hooks_entry = next;
+ continue;
+ }
+ nnext = nf_entry_dereference(next->next);
+ rcu_assign_pointer(hooks_entry->next, nnext);
+ hooks_entry = next;
+ break;
+ }
+
+unlock:
mutex_unlock(&nf_hook_mutex);
- if (&elem->list == hook_list) {
+ if (!hooks_entry) {
WARN(1, "nf_unregister_net_hook: hook not found!\n");
return;
}
@@ -151,10 +194,10 @@
static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
#endif
synchronize_net();
- nf_queue_nf_hook_drop(net, &entry->ops);
+ nf_queue_nf_hook_drop(net, hooks_entry);
/* other cpu might still process nfqueue verdict that used reg */
synchronize_net();
- kfree(entry);
+ kfree(hooks_entry);
}
EXPORT_SYMBOL(nf_unregister_net_hook);
@@ -188,19 +231,17 @@
static LIST_HEAD(nf_hook_list);
-int nf_register_hook(struct nf_hook_ops *reg)
+static int _nf_register_hook(struct nf_hook_ops *reg)
{
struct net *net, *last;
int ret;
- rtnl_lock();
for_each_net(net) {
ret = nf_register_net_hook(net, reg);
if (ret && ret != -ENOENT)
goto rollback;
}
list_add_tail(®->list, &nf_hook_list);
- rtnl_unlock();
return 0;
rollback:
@@ -210,19 +251,34 @@
break;
nf_unregister_net_hook(net, reg);
}
+ return ret;
+}
+
+int nf_register_hook(struct nf_hook_ops *reg)
+{
+ int ret;
+
+ rtnl_lock();
+ ret = _nf_register_hook(reg);
rtnl_unlock();
+
return ret;
}
EXPORT_SYMBOL(nf_register_hook);
-void nf_unregister_hook(struct nf_hook_ops *reg)
+static void _nf_unregister_hook(struct nf_hook_ops *reg)
{
struct net *net;
- rtnl_lock();
list_del(®->list);
for_each_net(net)
nf_unregister_net_hook(net, reg);
+}
+
+void nf_unregister_hook(struct nf_hook_ops *reg)
+{
+ rtnl_lock();
+ _nf_unregister_hook(reg);
rtnl_unlock();
}
EXPORT_SYMBOL(nf_unregister_hook);
@@ -246,6 +302,26 @@
}
EXPORT_SYMBOL(nf_register_hooks);
+/* Caller MUST take rtnl_lock() */
+int _nf_register_hooks(struct nf_hook_ops *reg, unsigned int n)
+{
+ unsigned int i;
+ int err = 0;
+
+ for (i = 0; i < n; i++) {
+ err = _nf_register_hook(®[i]);
+ if (err)
+ goto err;
+ }
+ return err;
+
+err:
+ if (i > 0)
+ _nf_unregister_hooks(reg, i);
+ return err;
+}
+EXPORT_SYMBOL(_nf_register_hooks);
+
void nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n)
{
while (n-- > 0)
@@ -253,10 +329,17 @@
}
EXPORT_SYMBOL(nf_unregister_hooks);
-unsigned int nf_iterate(struct list_head *head,
- struct sk_buff *skb,
+/* Caller MUST take rtnl_lock */
+void _nf_unregister_hooks(struct nf_hook_ops *reg, unsigned int n)
+{
+ while (n-- > 0)
+ _nf_unregister_hook(®[n]);
+}
+EXPORT_SYMBOL(_nf_unregister_hooks);
+
+unsigned int nf_iterate(struct sk_buff *skb,
struct nf_hook_state *state,
- struct nf_hook_ops **elemp)
+ struct nf_hook_entry **entryp)
{
unsigned int verdict;
@@ -264,20 +347,23 @@
* The caller must not block between calls to this
* function because of risk of continuing from deleted element.
*/
- list_for_each_entry_continue_rcu((*elemp), head, list) {
- if (state->thresh > (*elemp)->priority)
+ while (*entryp) {
+ if (state->thresh > (*entryp)->ops.priority) {
+ *entryp = rcu_dereference((*entryp)->next);
continue;
+ }
/* Optimization: we don't need to hold module
reference here, since function can't sleep. --RR */
repeat:
- verdict = (*elemp)->hook((*elemp)->priv, skb, state);
+ verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
if (verdict != NF_ACCEPT) {
#ifdef CONFIG_NETFILTER_DEBUG
if (unlikely((verdict & NF_VERDICT_MASK)
> NF_MAX_VERDICT)) {
NFDEBUG("Evil return from %p(%u).\n",
- (*elemp)->hook, state->hook);
+ (*entryp)->ops.hook, state->hook);
+ *entryp = rcu_dereference((*entryp)->next);
continue;
}
#endif
@@ -285,25 +371,23 @@
return verdict;
goto repeat;
}
+ *entryp = rcu_dereference((*entryp)->next);
}
return NF_ACCEPT;
}
/* Returns 1 if okfn() needs to be executed by the caller,
- * -EPERM for NF_DROP, 0 otherwise. */
+ * -EPERM for NF_DROP, 0 otherwise. Caller must hold rcu_read_lock. */
int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state)
{
- struct nf_hook_ops *elem;
+ struct nf_hook_entry *entry;
unsigned int verdict;
int ret = 0;
- /* We may already have this, but read-locks nest anyway */
- rcu_read_lock();
-
- elem = list_entry_rcu(state->hook_list, struct nf_hook_ops, list);
+ entry = rcu_dereference(state->hook_entries);
next_hook:
- verdict = nf_iterate(state->hook_list, skb, state, &elem);
+ verdict = nf_iterate(skb, state, &entry);
if (verdict == NF_ACCEPT || verdict == NF_STOP) {
ret = 1;
} else if ((verdict & NF_VERDICT_MASK) == NF_DROP) {
@@ -312,8 +396,10 @@
if (ret == 0)
ret = -EPERM;
} else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
- int err = nf_queue(skb, elem, state,
- verdict >> NF_VERDICT_QBITS);
+ int err;
+
+ RCU_INIT_POINTER(state->hook_entries, entry);
+ err = nf_queue(skb, state, verdict >> NF_VERDICT_QBITS);
if (err < 0) {
if (err == -ESRCH &&
(verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
@@ -321,7 +407,6 @@
kfree_skb(skb);
}
}
- rcu_read_unlock();
return ret;
}
EXPORT_SYMBOL(nf_hook_slow);
@@ -441,7 +526,7 @@
for (i = 0; i < ARRAY_SIZE(net->nf.hooks); i++) {
for (h = 0; h < NF_MAX_HOOKS; h++)
- INIT_LIST_HEAD(&net->nf.hooks[i][h]);
+ RCU_INIT_POINTER(net->nf.hooks[i][h], NULL);
}
#ifdef CONFIG_PROC_FS
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index ac1db40..ba6a1d4 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -379,7 +379,6 @@
destroy_conntrack(struct nf_conntrack *nfct)
{
struct nf_conn *ct = (struct nf_conn *)nfct;
- struct net *net = nf_ct_net(ct);
struct nf_conntrack_l4proto *l4proto;
pr_debug("destroy_conntrack(%p)\n", ct);
@@ -406,7 +405,6 @@
nf_ct_del_from_dying_or_unconfirmed_list(ct);
- NF_CT_STAT_INC(net, delete);
local_bh_enable();
if (ct->master)
@@ -438,7 +436,6 @@
nf_ct_add_to_dying_list(ct);
- NF_CT_STAT_INC(net, delete_list);
local_bh_enable();
}
@@ -529,11 +526,8 @@
if (nf_ct_is_dying(ct))
continue;
- if (nf_ct_key_equal(h, tuple, zone, net)) {
- NF_CT_STAT_INC_ATOMIC(net, found);
+ if (nf_ct_key_equal(h, tuple, zone, net))
return h;
- }
- NF_CT_STAT_INC_ATOMIC(net, searched);
}
/*
* if the nulls value we got at the end of this lookup is
@@ -798,7 +792,6 @@
*/
__nf_conntrack_hash_insert(ct, hash, reply_hash);
nf_conntrack_double_unlock(hash, reply_hash);
- NF_CT_STAT_INC(net, insert);
local_bh_enable();
help = nfct_help(ct);
@@ -857,7 +850,6 @@
rcu_read_unlock();
return 1;
}
- NF_CT_STAT_INC_ATOMIC(net, searched);
}
if (get_nulls_value(n) != hash) {
@@ -1116,9 +1108,9 @@
if (IS_ERR(ct))
return (struct nf_conntrack_tuple_hash *)ct;
- if (tmpl && nfct_synproxy(tmpl)) {
- nfct_seqadj_ext_add(ct);
- nfct_synproxy_ext_add(ct);
+ if (!nf_ct_add_synproxy(ct, tmpl)) {
+ nf_conntrack_free(ct);
+ return ERR_PTR(-ENOMEM);
}
timeout_ext = tmpl ? nf_ct_timeout_find(tmpl) : NULL;
@@ -1177,10 +1169,8 @@
}
spin_unlock(&nf_conntrack_expect_lock);
}
- if (!exp) {
+ if (!exp)
__nf_ct_try_assign_helper(ct, tmpl, GFP_ATOMIC);
- NF_CT_STAT_INC(net, new);
- }
/* Now it is inserted into the unconfirmed list, bump refcount */
nf_conntrack_get(&ct->ct_general);
@@ -1285,7 +1275,7 @@
skb->nfct = NULL;
}
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
l3proto = __nf_ct_l3proto_find(pf);
ret = l3proto->get_l4proto(skb, skb_network_offset(skb),
&dataoff, &protonum);
diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c
index b6934b5..e3ed200 100644
--- a/net/netfilter/nf_conntrack_ftp.c
+++ b/net/netfilter/nf_conntrack_ftp.c
@@ -301,8 +301,6 @@
size_t i = plen;
pr_debug("find_pattern `%s': dlen = %Zu\n", pattern, dlen);
- if (dlen == 0)
- return 0;
if (dlen <= plen) {
/* Short packet: try for partial? */
@@ -311,19 +309,8 @@
else return 0;
}
- if (strncasecmp(data, pattern, plen) != 0) {
-#if 0
- size_t i;
-
- pr_debug("ftp: string mismatch\n");
- for (i = 0; i < plen; i++) {
- pr_debug("ftp:char %u `%c'(%u) vs `%c'(%u)\n",
- i, data[i], data[i],
- pattern[i], pattern[i]);
- }
-#endif
+ if (strncasecmp(data, pattern, plen) != 0)
return 0;
- }
pr_debug("Pattern matches!\n");
/* Now we've found the constant string, try to skip
diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c
index 5c0db5c..f65d9363 100644
--- a/net/netfilter/nf_conntrack_h323_main.c
+++ b/net/netfilter/nf_conntrack_h323_main.c
@@ -736,7 +736,7 @@
const struct nf_afinfo *afinfo;
int ret = 0;
- /* rcu_read_lock()ed by nf_hook_slow() */
+ /* rcu_read_lock()ed by nf_hook_thresh */
afinfo = nf_get_afinfo(family);
if (!afinfo)
return 0;
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index b989b81..336e215 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -189,7 +189,6 @@
struct nf_conntrack_helper *helper = NULL;
struct nf_conn_help *help;
struct net *net = nf_ct_net(ct);
- int ret = 0;
/* We already got a helper explicitly attached. The function
* nf_conntrack_alter_reply - in case NAT is in use - asks for looking
@@ -223,15 +222,13 @@
if (helper == NULL) {
if (help)
RCU_INIT_POINTER(help->helper, NULL);
- goto out;
+ return 0;
}
if (help == NULL) {
help = nf_ct_helper_ext_add(ct, helper, flags);
- if (help == NULL) {
- ret = -ENOMEM;
- goto out;
- }
+ if (help == NULL)
+ return -ENOMEM;
} else {
/* We only allow helper re-assignment of the same sort since
* we cannot reallocate the helper extension area.
@@ -240,13 +237,13 @@
if (tmp && tmp->help != helper->help) {
RCU_INIT_POINTER(help->helper, NULL);
- goto out;
+ return 0;
}
}
rcu_assign_pointer(help->helper, helper);
-out:
- return ret;
+
+ return 0;
}
EXPORT_SYMBOL_GPL(__nf_ct_try_assign_helper);
@@ -349,7 +346,7 @@
/* Called from the helper function, this call never fails */
help = nfct_help(ct);
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
nf_log_packet(nf_ct_net(ct), nf_ct_l3num(ct), 0, skb, NULL, NULL, NULL,
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index c052b71..2754045 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1984,13 +1984,9 @@
nfmsg->version = NFNETLINK_V0;
nfmsg->res_id = htons(cpu);
- if (nla_put_be32(skb, CTA_STATS_SEARCHED, htonl(st->searched)) ||
- nla_put_be32(skb, CTA_STATS_FOUND, htonl(st->found)) ||
- nla_put_be32(skb, CTA_STATS_NEW, htonl(st->new)) ||
+ if (nla_put_be32(skb, CTA_STATS_FOUND, htonl(st->found)) ||
nla_put_be32(skb, CTA_STATS_INVALID, htonl(st->invalid)) ||
nla_put_be32(skb, CTA_STATS_IGNORE, htonl(st->ignore)) ||
- nla_put_be32(skb, CTA_STATS_DELETE, htonl(st->delete)) ||
- nla_put_be32(skb, CTA_STATS_DELETE_LIST, htonl(st->delete_list)) ||
nla_put_be32(skb, CTA_STATS_INSERT, htonl(st->insert)) ||
nla_put_be32(skb, CTA_STATS_INSERT_FAILED,
htonl(st->insert_failed)) ||
diff --git a/net/netfilter/nf_conntrack_proto_gre.c b/net/netfilter/nf_conntrack_proto_gre.c
index a96451a..9a715f8 100644
--- a/net/netfilter/nf_conntrack_proto_gre.c
+++ b/net/netfilter/nf_conntrack_proto_gre.c
@@ -192,15 +192,15 @@
static bool gre_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff,
struct net *net, struct nf_conntrack_tuple *tuple)
{
- const struct gre_hdr_pptp *pgrehdr;
- struct gre_hdr_pptp _pgrehdr;
+ const struct pptp_gre_header *pgrehdr;
+ struct pptp_gre_header _pgrehdr;
__be16 srckey;
- const struct gre_hdr *grehdr;
- struct gre_hdr _grehdr;
+ const struct gre_base_hdr *grehdr;
+ struct gre_base_hdr _grehdr;
/* first only delinearize old RFC1701 GRE header */
grehdr = skb_header_pointer(skb, dataoff, sizeof(_grehdr), &_grehdr);
- if (!grehdr || grehdr->version != GRE_VERSION_PPTP) {
+ if (!grehdr || (grehdr->flags & GRE_VERSION) != GRE_VERSION_1) {
/* try to behave like "nf_conntrack_proto_generic" */
tuple->src.u.all = 0;
tuple->dst.u.all = 0;
@@ -212,8 +212,8 @@
if (!pgrehdr)
return true;
- if (ntohs(grehdr->protocol) != GRE_PROTOCOL_PPTP) {
- pr_debug("GRE_VERSION_PPTP but unknown proto\n");
+ if (grehdr->protocol != GRE_PROTO_PPP) {
+ pr_debug("Unsupported GRE proto(0x%x)\n", ntohs(grehdr->protocol));
return false;
}
diff --git a/net/netfilter/nf_conntrack_seqadj.c b/net/netfilter/nf_conntrack_seqadj.c
index dff0f0c..ef7063e 100644
--- a/net/netfilter/nf_conntrack_seqadj.c
+++ b/net/netfilter/nf_conntrack_seqadj.c
@@ -169,7 +169,7 @@
s32 seqoff, ackoff;
struct nf_conn_seqadj *seqadj = nfct_seqadj(ct);
struct nf_ct_seqadj *this_way, *other_way;
- int res;
+ int res = 1;
this_way = &seqadj->seq[dir];
other_way = &seqadj->seq[!dir];
@@ -184,27 +184,31 @@
else
seqoff = this_way->offset_before;
+ newseq = htonl(ntohl(tcph->seq) + seqoff);
+ inet_proto_csum_replace4(&tcph->check, skb, tcph->seq, newseq, false);
+ pr_debug("Adjusting sequence number from %u->%u\n",
+ ntohl(tcph->seq), ntohl(newseq));
+ tcph->seq = newseq;
+
+ if (!tcph->ack)
+ goto out;
+
if (after(ntohl(tcph->ack_seq) - other_way->offset_before,
other_way->correction_pos))
ackoff = other_way->offset_after;
else
ackoff = other_way->offset_before;
- newseq = htonl(ntohl(tcph->seq) + seqoff);
newack = htonl(ntohl(tcph->ack_seq) - ackoff);
-
- inet_proto_csum_replace4(&tcph->check, skb, tcph->seq, newseq, false);
inet_proto_csum_replace4(&tcph->check, skb, tcph->ack_seq, newack,
false);
-
- pr_debug("Adjusting sequence number from %u->%u, ack from %u->%u\n",
+ pr_debug("Adjusting ack number from %u->%u, ack from %u->%u\n",
ntohl(tcph->seq), ntohl(newseq), ntohl(tcph->ack_seq),
ntohl(newack));
-
- tcph->seq = newseq;
tcph->ack_seq = newack;
res = nf_ct_sack_adjust(skb, protoff, tcph, ct, ctinfo);
+out:
spin_unlock_bh(&ct->lock);
return res;
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 7d77217..621b81c 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -83,9 +83,10 @@
static int iswordc(const char c)
{
if (isalnum(c) || c == '!' || c == '"' || c == '%' ||
- (c >= '(' && c <= '/') || c == ':' || c == '<' || c == '>' ||
+ (c >= '(' && c <= '+') || c == ':' || c == '<' || c == '>' ||
c == '?' || (c >= '[' && c <= ']') || c == '_' || c == '`' ||
- c == '{' || c == '}' || c == '~')
+ c == '{' || c == '}' || c == '~' || (c >= '-' && c <= '/') ||
+ c == '\'')
return 1;
return 0;
}
@@ -329,13 +330,12 @@
static const char *sip_skip_whitespace(const char *dptr, const char *limit)
{
for (; dptr < limit; dptr++) {
- if (*dptr == ' ')
+ if (*dptr == ' ' || *dptr == '\t')
continue;
if (*dptr != '\r' && *dptr != '\n')
break;
dptr = sip_follow_continuation(dptr, limit);
- if (dptr == NULL)
- return NULL;
+ break;
}
return dptr;
}
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 3d9a316..5f446cd 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -212,6 +212,11 @@
if (unlikely(!atomic_inc_not_zero(&ct->ct_general.use)))
return 0;
+ if (nf_ct_should_gc(ct)) {
+ nf_ct_kill(ct);
+ goto release;
+ }
+
/* we only want to print DIR_ORIGINAL */
if (NF_CT_DIRECTION(hash))
goto release;
@@ -352,13 +357,13 @@
seq_printf(seq, "%08x %08x %08x %08x %08x %08x %08x %08x "
"%08x %08x %08x %08x %08x %08x %08x %08x %08x\n",
nr_conntracks,
- st->searched,
+ 0,
st->found,
- st->new,
+ 0,
st->invalid,
st->ignore,
- st->delete,
- st->delete_list,
+ 0,
+ 0,
st->insert,
st->insert_failed,
st->drop,
diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h
index 0655225..e0adb59 100644
--- a/net/netfilter/nf_internals.h
+++ b/net/netfilter/nf_internals.h
@@ -13,13 +13,13 @@
/* core.c */
-unsigned int nf_iterate(struct list_head *head, struct sk_buff *skb,
- struct nf_hook_state *state, struct nf_hook_ops **elemp);
+unsigned int nf_iterate(struct sk_buff *skb, struct nf_hook_state *state,
+ struct nf_hook_entry **entryp);
/* nf_queue.c */
-int nf_queue(struct sk_buff *skb, struct nf_hook_ops *elem,
- struct nf_hook_state *state, unsigned int queuenum);
-void nf_queue_nf_hook_drop(struct net *net, struct nf_hook_ops *ops);
+int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
+ unsigned int queuenum);
+void nf_queue_nf_hook_drop(struct net *net, const struct nf_hook_entry *entry);
int __init netfilter_queue_init(void);
/* nf_log.c */
diff --git a/net/netfilter/nf_log_common.c b/net/netfilter/nf_log_common.c
index a5aa596..119fe1c 100644
--- a/net/netfilter/nf_log_common.c
+++ b/net/netfilter/nf_log_common.c
@@ -77,7 +77,7 @@
nf_log_buf_add(m, "SPT=%u DPT=%u ",
ntohs(th->source), ntohs(th->dest));
/* Max length: 30 "SEQ=4294967295 ACK=4294967295 " */
- if (logflags & XT_LOG_TCPSEQ) {
+ if (logflags & NF_LOG_TCPSEQ) {
nf_log_buf_add(m, "SEQ=%u ACK=%u ",
ntohl(th->seq), ntohl(th->ack_seq));
}
@@ -107,7 +107,7 @@
/* Max length: 11 "URGP=65535 " */
nf_log_buf_add(m, "URGP=%u ", ntohs(th->urg_ptr));
- if ((logflags & XT_LOG_TCPOPT) && th->doff*4 > sizeof(struct tcphdr)) {
+ if ((logflags & NF_LOG_TCPOPT) && th->doff*4 > sizeof(struct tcphdr)) {
u_int8_t _opt[60 - sizeof(struct tcphdr)];
const u_int8_t *op;
unsigned int i;
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 81ae41f..bbb8f3d 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -441,7 +441,8 @@
ct->status |= IPS_DST_NAT;
if (nfct_help(ct))
- nfct_seqadj_ext_add(ct);
+ if (!nfct_seqadj_ext_add(ct))
+ return NF_DROP;
}
if (maniptype == NF_NAT_MANIP_SRC) {
@@ -801,7 +802,7 @@
if (err < 0)
return err;
- return nf_nat_setup_info(ct, &range, manip);
+ return nf_nat_setup_info(ct, &range, manip) == NF_DROP ? -ENOMEM : 0;
}
#else
static int
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index b19ad20..96964a0 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -96,14 +96,14 @@
}
EXPORT_SYMBOL_GPL(nf_queue_entry_get_refs);
-void nf_queue_nf_hook_drop(struct net *net, struct nf_hook_ops *ops)
+void nf_queue_nf_hook_drop(struct net *net, const struct nf_hook_entry *entry)
{
const struct nf_queue_handler *qh;
rcu_read_lock();
qh = rcu_dereference(net->nf.queue_handler);
if (qh)
- qh->nf_hook_drop(net, ops);
+ qh->nf_hook_drop(net, entry);
rcu_read_unlock();
}
@@ -112,7 +112,6 @@
* through nf_reinject().
*/
int nf_queue(struct sk_buff *skb,
- struct nf_hook_ops *elem,
struct nf_hook_state *state,
unsigned int queuenum)
{
@@ -141,7 +140,6 @@
*entry = (struct nf_queue_entry) {
.skb = skb,
- .elem = elem,
.state = *state,
.size = sizeof(*entry) + afinfo->route_key_size,
};
@@ -165,11 +163,15 @@
void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
{
+ struct nf_hook_entry *hook_entry;
struct sk_buff *skb = entry->skb;
- struct nf_hook_ops *elem = entry->elem;
const struct nf_afinfo *afinfo;
+ struct nf_hook_ops *elem;
int err;
+ hook_entry = rcu_dereference(entry->state.hook_entries);
+ elem = &hook_entry->ops;
+
nf_queue_entry_release_refs(entry);
/* Continue traversal iff userspace said ok... */
@@ -186,8 +188,7 @@
if (verdict == NF_ACCEPT) {
next_hook:
- verdict = nf_iterate(entry->state.hook_list,
- skb, &entry->state, &elem);
+ verdict = nf_iterate(skb, &entry->state, &hook_entry);
}
switch (verdict & NF_VERDICT_MASK) {
@@ -198,7 +199,8 @@
local_bh_enable();
break;
case NF_QUEUE:
- err = nf_queue(skb, elem, &entry->state,
+ RCU_INIT_POINTER(entry->state.hook_entries, hook_entry);
+ err = nf_queue(skb, &entry->state,
verdict >> NF_VERDICT_QBITS);
if (err < 0) {
if (err == -ESRCH &&
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index bd9715e..b70d3ea 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4410,6 +4410,31 @@
}
/**
+ * nft_parse_u32_check - fetch u32 attribute and check for maximum value
+ *
+ * @attr: netlink attribute to fetch value from
+ * @max: maximum value to be stored in dest
+ * @dest: pointer to the variable
+ *
+ * Parse, check and store a given u32 netlink attribute into variable.
+ * This function returns -ERANGE if the value goes over maximum value.
+ * Otherwise a 0 is returned and the attribute value is stored in the
+ * destination variable.
+ */
+unsigned int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest)
+{
+ int val;
+
+ val = ntohl(nla_get_be32(attr));
+ if (val > max)
+ return -ERANGE;
+
+ *dest = val;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(nft_parse_u32_check);
+
+/**
* nft_parse_register - parse a register value from a netlink attribute
*
* @attr: netlink attribute
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index fb8b589..0dd5c69 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -34,7 +34,7 @@
.u = {
.log = {
.level = LOGLEVEL_WARNING,
- .logflags = NF_LOG_MASK,
+ .logflags = NF_LOG_DEFAULT_MASK,
},
},
};
@@ -93,12 +93,15 @@
if (priv->base == NFT_PAYLOAD_NETWORK_HEADER)
ptr = skb_network_header(skb);
- else
+ else {
+ if (!pkt->tprot_set)
+ return false;
ptr = skb_network_header(skb) + pkt->xt.thoff;
+ }
ptr += priv->offset;
- if (unlikely(ptr + priv->len >= skb_tail_pointer(skb)))
+ if (unlikely(ptr + priv->len > skb_tail_pointer(skb)))
return false;
*dest = 0;
@@ -260,8 +263,13 @@
if (err < 0)
goto err7;
- return 0;
+ err = nft_range_module_init();
+ if (err < 0)
+ goto err8;
+ return 0;
+err8:
+ nft_dynset_module_exit();
err7:
nft_payload_module_exit();
err6:
diff --git a/net/netfilter/nf_tables_inet.c b/net/netfilter/nf_tables_inet.c
index 6b5f762..f713cc2 100644
--- a/net/netfilter/nf_tables_inet.c
+++ b/net/netfilter/nf_tables_inet.c
@@ -82,7 +82,10 @@
{
int ret;
- nft_register_chain_type(&filter_inet);
+ ret = nft_register_chain_type(&filter_inet);
+ if (ret < 0)
+ return ret;
+
ret = register_pernet_subsys(&nf_tables_inet_net_ops);
if (ret < 0)
nft_unregister_chain_type(&filter_inet);
diff --git a/net/netfilter/nf_tables_netdev.c b/net/netfilter/nf_tables_netdev.c
index 75d696f..9e2ae42 100644
--- a/net/netfilter/nf_tables_netdev.c
+++ b/net/netfilter/nf_tables_netdev.c
@@ -15,78 +15,6 @@
#include <net/netfilter/nf_tables_ipv4.h>
#include <net/netfilter/nf_tables_ipv6.h>
-static inline void
-nft_netdev_set_pktinfo_ipv4(struct nft_pktinfo *pkt,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
- struct iphdr *iph, _iph;
- u32 len, thoff;
-
- nft_set_pktinfo(pkt, skb, state);
-
- iph = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*iph),
- &_iph);
- if (!iph)
- return;
-
- if (iph->ihl < 5 || iph->version != 4)
- return;
-
- len = ntohs(iph->tot_len);
- thoff = iph->ihl * 4;
- if (skb->len < len)
- return;
- else if (len < thoff)
- return;
-
- pkt->tprot = iph->protocol;
- pkt->xt.thoff = thoff;
- pkt->xt.fragoff = ntohs(iph->frag_off) & IP_OFFSET;
-}
-
-static inline void
-__nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
-#if IS_ENABLED(CONFIG_IPV6)
- struct ipv6hdr *ip6h, _ip6h;
- unsigned int thoff = 0;
- unsigned short frag_off;
- int protohdr;
- u32 pkt_len;
-
- ip6h = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*ip6h),
- &_ip6h);
- if (!ip6h)
- return;
-
- if (ip6h->version != 6)
- return;
-
- pkt_len = ntohs(ip6h->payload_len);
- if (pkt_len + sizeof(*ip6h) > skb->len)
- return;
-
- protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
- if (protohdr < 0)
- return;
-
- pkt->tprot = protohdr;
- pkt->xt.thoff = thoff;
- pkt->xt.fragoff = frag_off;
-#endif
-}
-
-static inline void nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
- nft_set_pktinfo(pkt, skb, state);
- __nft_netdev_set_pktinfo_ipv6(pkt, skb, state);
-}
-
static unsigned int
nft_do_chain_netdev(void *priv, struct sk_buff *skb,
const struct nf_hook_state *state)
@@ -95,13 +23,13 @@
switch (skb->protocol) {
case htons(ETH_P_IP):
- nft_netdev_set_pktinfo_ipv4(&pkt, skb, state);
+ nft_set_pktinfo_ipv4_validate(&pkt, skb, state);
break;
case htons(ETH_P_IPV6):
- nft_netdev_set_pktinfo_ipv6(&pkt, skb, state);
+ nft_set_pktinfo_ipv6_validate(&pkt, skb, state);
break;
default:
- nft_set_pktinfo(&pkt, skb, state);
+ nft_set_pktinfo_unspec(&pkt, skb, state);
break;
}
@@ -221,14 +149,25 @@
{
int ret;
- nft_register_chain_type(&nft_filter_chain_netdev);
- ret = register_pernet_subsys(&nf_tables_netdev_net_ops);
- if (ret < 0) {
- nft_unregister_chain_type(&nft_filter_chain_netdev);
+ ret = nft_register_chain_type(&nft_filter_chain_netdev);
+ if (ret)
return ret;
- }
- register_netdevice_notifier(&nf_tables_netdev_notifier);
+
+ ret = register_pernet_subsys(&nf_tables_netdev_net_ops);
+ if (ret)
+ goto err1;
+
+ ret = register_netdevice_notifier(&nf_tables_netdev_notifier);
+ if (ret)
+ goto err2;
+
return 0;
+
+err2:
+ unregister_pernet_subsys(&nf_tables_netdev_net_ops);
+err1:
+ nft_unregister_chain_type(&nft_filter_chain_netdev);
+ return ret;
}
static void __exit nf_tables_netdev_exit(void)
diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
index 39eb1cc..ab695f8 100644
--- a/net/netfilter/nf_tables_trace.c
+++ b/net/netfilter/nf_tables_trace.c
@@ -113,20 +113,22 @@
const struct nft_pktinfo *pkt)
{
const struct sk_buff *skb = pkt->skb;
- unsigned int len = min_t(unsigned int,
- pkt->xt.thoff - skb_network_offset(skb),
- NFT_TRACETYPE_NETWORK_HSIZE);
int off = skb_network_offset(skb);
+ unsigned int len, nh_end;
+ nh_end = pkt->tprot_set ? pkt->xt.thoff : skb->len;
+ len = min_t(unsigned int, nh_end - skb_network_offset(skb),
+ NFT_TRACETYPE_NETWORK_HSIZE);
if (trace_fill_header(nlskb, NFTA_TRACE_NETWORK_HEADER, skb, off, len))
return -1;
- len = min_t(unsigned int, skb->len - pkt->xt.thoff,
- NFT_TRACETYPE_TRANSPORT_HSIZE);
-
- if (trace_fill_header(nlskb, NFTA_TRACE_TRANSPORT_HEADER, skb,
- pkt->xt.thoff, len))
- return -1;
+ if (pkt->tprot_set) {
+ len = min_t(unsigned int, skb->len - pkt->xt.thoff,
+ NFT_TRACETYPE_TRANSPORT_HSIZE);
+ if (trace_fill_header(nlskb, NFTA_TRACE_TRANSPORT_HEADER, skb,
+ pkt->xt.thoff, len))
+ return -1;
+ }
if (!skb_mac_header_was_set(skb))
return 0;
@@ -237,7 +239,7 @@
break;
case NFT_TRACETYPE_POLICY:
if (nla_put_be32(skb, NFTA_TRACE_POLICY,
- info->basechain->policy))
+ htonl(info->basechain->policy)))
goto nla_put_failure;
break;
}
diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index e924e95..3b79f34 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -43,7 +43,7 @@
if (help == NULL)
return NF_DROP;
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (helper == NULL)
return NF_DROP;
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 6577db5..eb086a1 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -442,7 +442,9 @@
if (nla_put_be32(inst->skb, NFULA_IFINDEX_PHYSINDEV,
htonl(indev->ifindex)) ||
/* this is the bridge group "brX" */
- /* rcu_read_lock()ed by nf_hook_slow or nf_log_packet */
+ /* rcu_read_lock()ed by nf_hook_thresh or
+ * nf_log_packet.
+ */
nla_put_be32(inst->skb, NFULA_IFINDEX_INDEV,
htonl(br_port_get_rcu(indev)->br->dev->ifindex)))
goto nla_put_failure;
@@ -477,7 +479,9 @@
if (nla_put_be32(inst->skb, NFULA_IFINDEX_PHYSOUTDEV,
htonl(outdev->ifindex)) ||
/* this is the bridge group "brX" */
- /* rcu_read_lock()ed by nf_hook_slow or nf_log_packet */
+ /* rcu_read_lock()ed by nf_hook_thresh or
+ * nf_log_packet.
+ */
nla_put_be32(inst->skb, NFULA_IFINDEX_OUTDEV,
htonl(br_port_get_rcu(outdev)->br->dev->ifindex)))
goto nla_put_failure;
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index f49f450..af832c5 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -740,7 +740,7 @@
struct net *net = entry->state.net;
struct nfnl_queue_net *q = nfnl_queue_pernet(net);
- /* rcu_read_lock()ed by nf_hook_slow() */
+ /* rcu_read_lock()ed by nf_hook_thresh */
queue = instance_lookup(q, queuenum);
if (!queue)
return -ESRCH;
@@ -917,12 +917,14 @@
.notifier_call = nfqnl_rcv_dev_event,
};
-static int nf_hook_cmp(struct nf_queue_entry *entry, unsigned long ops_ptr)
+static int nf_hook_cmp(struct nf_queue_entry *entry, unsigned long entry_ptr)
{
- return entry->elem == (struct nf_hook_ops *)ops_ptr;
+ return rcu_access_pointer(entry->state.hook_entries) ==
+ (struct nf_hook_entry *)entry_ptr;
}
-static void nfqnl_nf_hook_drop(struct net *net, struct nf_hook_ops *hook)
+static void nfqnl_nf_hook_drop(struct net *net,
+ const struct nf_hook_entry *hook)
{
struct nfnl_queue_net *q = nfnl_queue_pernet(net);
int i;
@@ -1522,9 +1524,16 @@
goto cleanup_netlink_notifier;
}
- register_netdevice_notifier(&nfqnl_dev_notifier);
+ status = register_netdevice_notifier(&nfqnl_dev_notifier);
+ if (status < 0) {
+ pr_err("nf_queue: failed to register netdevice notifier\n");
+ goto cleanup_netlink_subsys;
+ }
+
return status;
+cleanup_netlink_subsys:
+ nfnetlink_subsys_unregister(&nfqnl_subsys);
cleanup_netlink_notifier:
netlink_unregister_notifier(&nfqnl_rtnl_notifier);
unregister_pernet_subsys(&nfnl_queue_net_ops);
diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c
index d71cc18..31c15ed 100644
--- a/net/netfilter/nft_bitwise.c
+++ b/net/netfilter/nft_bitwise.c
@@ -52,6 +52,7 @@
{
struct nft_bitwise *priv = nft_expr_priv(expr);
struct nft_data_desc d1, d2;
+ u32 len;
int err;
if (tb[NFTA_BITWISE_SREG] == NULL ||
@@ -61,7 +62,12 @@
tb[NFTA_BITWISE_XOR] == NULL)
return -EINVAL;
- priv->len = ntohl(nla_get_be32(tb[NFTA_BITWISE_LEN]));
+ err = nft_parse_u32_check(tb[NFTA_BITWISE_LEN], U8_MAX, &len);
+ if (err < 0)
+ return err;
+
+ priv->len = len;
+
priv->sreg = nft_parse_register(tb[NFTA_BITWISE_SREG]);
err = nft_validate_register_load(priv->sreg, priv->len);
if (err < 0)
diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c
index b78c28b..ee63d98 100644
--- a/net/netfilter/nft_byteorder.c
+++ b/net/netfilter/nft_byteorder.c
@@ -99,6 +99,7 @@
const struct nlattr * const tb[])
{
struct nft_byteorder *priv = nft_expr_priv(expr);
+ u32 size, len;
int err;
if (tb[NFTA_BYTEORDER_SREG] == NULL ||
@@ -117,7 +118,12 @@
return -EINVAL;
}
- priv->size = ntohl(nla_get_be32(tb[NFTA_BYTEORDER_SIZE]));
+ err = nft_parse_u32_check(tb[NFTA_BYTEORDER_SIZE], U8_MAX, &size);
+ if (err < 0)
+ return err;
+
+ priv->size = size;
+
switch (priv->size) {
case 2:
case 4:
@@ -128,7 +134,12 @@
}
priv->sreg = nft_parse_register(tb[NFTA_BYTEORDER_SREG]);
- priv->len = ntohl(nla_get_be32(tb[NFTA_BYTEORDER_LEN]));
+ err = nft_parse_u32_check(tb[NFTA_BYTEORDER_LEN], U8_MAX, &len);
+ if (err < 0)
+ return err;
+
+ priv->len = len;
+
err = nft_validate_register_load(priv->sreg, priv->len);
if (err < 0)
return err;
diff --git a/net/netfilter/nft_cmp.c b/net/netfilter/nft_cmp.c
index e25b35d..2e53739 100644
--- a/net/netfilter/nft_cmp.c
+++ b/net/netfilter/nft_cmp.c
@@ -84,6 +84,9 @@
if (err < 0)
return err;
+ if (desc.len > U8_MAX)
+ return -ERANGE;
+
priv->op = ntohl(nla_get_be32(tb[NFTA_CMP_OP]));
priv->len = desc.len;
return 0;
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 51e180f..d7b0d171 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -128,15 +128,18 @@
memcpy(dest, &count, sizeof(count));
return;
}
+ case NFT_CT_L3PROTOCOL:
+ *dest = nf_ct_l3num(ct);
+ return;
+ case NFT_CT_PROTOCOL:
+ *dest = nf_ct_protonum(ct);
+ return;
default:
break;
}
tuple = &ct->tuplehash[priv->dir].tuple;
switch (priv->key) {
- case NFT_CT_L3PROTOCOL:
- *dest = nf_ct_l3num(ct);
- return;
case NFT_CT_SRC:
memcpy(dest, tuple->src.u3.all,
nf_ct_l3num(ct) == NFPROTO_IPV4 ? 4 : 16);
@@ -145,9 +148,6 @@
memcpy(dest, tuple->dst.u3.all,
nf_ct_l3num(ct) == NFPROTO_IPV4 ? 4 : 16);
return;
- case NFT_CT_PROTOCOL:
- *dest = nf_ct_protonum(ct);
- return;
case NFT_CT_PROTO_SRC:
*dest = (__force __u16)tuple->src.u.all;
return;
@@ -283,8 +283,9 @@
case NFT_CT_L3PROTOCOL:
case NFT_CT_PROTOCOL:
- if (tb[NFTA_CT_DIRECTION] == NULL)
- return -EINVAL;
+ /* For compatibility, do not report error if NFTA_CT_DIRECTION
+ * attribute is specified.
+ */
len = sizeof(u8);
break;
case NFT_CT_SRC:
@@ -363,6 +364,8 @@
switch (priv->key) {
#ifdef CONFIG_NF_CONNTRACK_MARK
case NFT_CT_MARK:
+ if (tb[NFTA_CT_DIRECTION])
+ return -EINVAL;
len = FIELD_SIZEOF(struct nf_conn, mark);
break;
#endif
@@ -432,8 +435,6 @@
goto nla_put_failure;
switch (priv->key) {
- case NFT_CT_L3PROTOCOL:
- case NFT_CT_PROTOCOL:
case NFT_CT_SRC:
case NFT_CT_DST:
case NFT_CT_PROTO_SRC:
diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index 0af2669..e3b83c3 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -22,6 +22,7 @@
enum nft_dynset_ops op:8;
enum nft_registers sreg_key:8;
enum nft_registers sreg_data:8;
+ bool invert;
u64 timeout;
struct nft_expr *expr;
struct nft_set_binding binding;
@@ -82,10 +83,14 @@
if (sexpr != NULL)
sexpr->ops->eval(sexpr, regs, pkt);
+
+ if (priv->invert)
+ regs->verdict.code = NFT_BREAK;
return;
}
out:
- regs->verdict.code = NFT_BREAK;
+ if (!priv->invert)
+ regs->verdict.code = NFT_BREAK;
}
static const struct nla_policy nft_dynset_policy[NFTA_DYNSET_MAX + 1] = {
@@ -96,6 +101,7 @@
[NFTA_DYNSET_SREG_DATA] = { .type = NLA_U32 },
[NFTA_DYNSET_TIMEOUT] = { .type = NLA_U64 },
[NFTA_DYNSET_EXPR] = { .type = NLA_NESTED },
+ [NFTA_DYNSET_FLAGS] = { .type = NLA_U32 },
};
static int nft_dynset_init(const struct nft_ctx *ctx,
@@ -113,6 +119,15 @@
tb[NFTA_DYNSET_SREG_KEY] == NULL)
return -EINVAL;
+ if (tb[NFTA_DYNSET_FLAGS]) {
+ u32 flags = ntohl(nla_get_be32(tb[NFTA_DYNSET_FLAGS]));
+
+ if (flags & ~NFT_DYNSET_F_INV)
+ return -EINVAL;
+ if (flags & NFT_DYNSET_F_INV)
+ priv->invert = true;
+ }
+
set = nf_tables_set_lookup(ctx->table, tb[NFTA_DYNSET_SET_NAME],
genmask);
if (IS_ERR(set)) {
@@ -220,6 +235,7 @@
static int nft_dynset_dump(struct sk_buff *skb, const struct nft_expr *expr)
{
const struct nft_dynset *priv = nft_expr_priv(expr);
+ u32 flags = priv->invert ? NFT_DYNSET_F_INV : 0;
if (nft_dump_register(skb, NFTA_DYNSET_SREG_KEY, priv->sreg_key))
goto nla_put_failure;
@@ -235,6 +251,8 @@
goto nla_put_failure;
if (priv->expr && nft_expr_dump(skb, NFTA_DYNSET_EXPR, priv->expr))
goto nla_put_failure;
+ if (nla_put_be32(skb, NFTA_DYNSET_FLAGS, htonl(flags)))
+ goto nla_put_failure;
return 0;
nla_put_failure:
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 82c264e..a84cf3d 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -59,7 +59,7 @@
const struct nlattr * const tb[])
{
struct nft_exthdr *priv = nft_expr_priv(expr);
- u32 offset, len;
+ u32 offset, len, err;
if (tb[NFTA_EXTHDR_DREG] == NULL ||
tb[NFTA_EXTHDR_TYPE] == NULL ||
@@ -67,11 +67,13 @@
tb[NFTA_EXTHDR_LEN] == NULL)
return -EINVAL;
- offset = ntohl(nla_get_be32(tb[NFTA_EXTHDR_OFFSET]));
- len = ntohl(nla_get_be32(tb[NFTA_EXTHDR_LEN]));
+ err = nft_parse_u32_check(tb[NFTA_EXTHDR_OFFSET], U8_MAX, &offset);
+ if (err < 0)
+ return err;
- if (offset > U8_MAX || len > U8_MAX)
- return -ERANGE;
+ err = nft_parse_u32_check(tb[NFTA_EXTHDR_LEN], U8_MAX, &len);
+ if (err < 0)
+ return err;
priv->type = nla_get_u8(tb[NFTA_EXTHDR_TYPE]);
priv->offset = offset;
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 764251d..09473b4 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -23,6 +23,7 @@
u8 len;
u32 modulus;
u32 seed;
+ u32 offset;
};
static void nft_hash_eval(const struct nft_expr *expr,
@@ -31,10 +32,10 @@
{
struct nft_hash *priv = nft_expr_priv(expr);
const void *data = ®s->data[priv->sreg];
+ u32 h;
- regs->data[priv->dreg] =
- reciprocal_scale(jhash(data, priv->len, priv->seed),
- priv->modulus);
+ h = reciprocal_scale(jhash(data, priv->len, priv->seed), priv->modulus);
+ regs->data[priv->dreg] = h + priv->offset;
}
static const struct nla_policy nft_hash_policy[NFTA_HASH_MAX + 1] = {
@@ -59,6 +60,9 @@
!tb[NFTA_HASH_MODULUS])
return -EINVAL;
+ if (tb[NFTA_HASH_OFFSET])
+ priv->offset = ntohl(nla_get_be32(tb[NFTA_HASH_OFFSET]));
+
priv->sreg = nft_parse_register(tb[NFTA_HASH_SREG]);
priv->dreg = nft_parse_register(tb[NFTA_HASH_DREG]);
@@ -72,6 +76,9 @@
if (priv->modulus <= 1)
return -ERANGE;
+ if (priv->offset + priv->modulus - 1 < priv->offset)
+ return -EOVERFLOW;
+
priv->seed = ntohl(nla_get_be32(tb[NFTA_HASH_SEED]));
return nft_validate_register_load(priv->sreg, len) &&
@@ -94,7 +101,9 @@
goto nla_put_failure;
if (nla_put_be32(skb, NFTA_HASH_SEED, htonl(priv->seed)))
goto nla_put_failure;
-
+ if (priv->offset != 0)
+ if (nla_put_be32(skb, NFTA_HASH_OFFSET, htonl(priv->offset)))
+ goto nla_put_failure;
return 0;
nla_put_failure:
diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c
index db3b746..d17018f 100644
--- a/net/netfilter/nft_immediate.c
+++ b/net/netfilter/nft_immediate.c
@@ -53,6 +53,10 @@
tb[NFTA_IMMEDIATE_DATA]);
if (err < 0)
return err;
+
+ if (desc.len > U8_MAX)
+ return -ERANGE;
+
priv->dlen = desc.len;
priv->dreg = nft_parse_register(tb[NFTA_IMMEDIATE_DREG]);
diff --git a/net/netfilter/nft_log.c b/net/netfilter/nft_log.c
index 24a73bb..1b01404 100644
--- a/net/netfilter/nft_log.c
+++ b/net/netfilter/nft_log.c
@@ -58,8 +58,11 @@
if (tb[NFTA_LOG_LEVEL] != NULL &&
tb[NFTA_LOG_GROUP] != NULL)
return -EINVAL;
- if (tb[NFTA_LOG_GROUP] != NULL)
+ if (tb[NFTA_LOG_GROUP] != NULL) {
li->type = NF_LOG_TYPE_ULOG;
+ if (tb[NFTA_LOG_FLAGS] != NULL)
+ return -EINVAL;
+ }
nla = tb[NFTA_LOG_PREFIX];
if (nla != NULL) {
@@ -87,6 +90,10 @@
if (tb[NFTA_LOG_FLAGS] != NULL) {
li->u.log.logflags =
ntohl(nla_get_be32(tb[NFTA_LOG_FLAGS]));
+ if (li->u.log.logflags & ~NF_LOG_MASK) {
+ err = -EINVAL;
+ goto err1;
+ }
}
break;
case NF_LOG_TYPE_ULOG:
diff --git a/net/netfilter/nft_lookup.c b/net/netfilter/nft_lookup.c
index e164325..8166b69 100644
--- a/net/netfilter/nft_lookup.c
+++ b/net/netfilter/nft_lookup.c
@@ -43,7 +43,7 @@
return;
}
- if (found && set->flags & NFT_SET_MAP)
+ if (set->flags & NFT_SET_MAP)
nft_data_copy(®s->data[priv->dreg],
nft_set_ext_data(ext), set->dlen);
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 8a6bc76..6c1e024 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -52,6 +52,8 @@
*dest = pkt->pf;
break;
case NFT_META_L4PROTO:
+ if (!pkt->tprot_set)
+ goto err;
*dest = pkt->tprot;
break;
case NFT_META_PRIORITY:
diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
index 294745e..55bc5ab 100644
--- a/net/netfilter/nft_numgen.c
+++ b/net/netfilter/nft_numgen.c
@@ -21,8 +21,9 @@
struct nft_ng_inc {
enum nft_registers dreg:8;
- u32 until;
+ u32 modulus;
atomic_t counter;
+ u32 offset;
};
static void nft_ng_inc_eval(const struct nft_expr *expr,
@@ -34,16 +35,17 @@
do {
oval = atomic_read(&priv->counter);
- nval = (oval + 1 < priv->until) ? oval + 1 : 0;
+ nval = (oval + 1 < priv->modulus) ? oval + 1 : 0;
} while (atomic_cmpxchg(&priv->counter, oval, nval) != oval);
- memcpy(®s->data[priv->dreg], &priv->counter, sizeof(u32));
+ regs->data[priv->dreg] = nval + priv->offset;
}
static const struct nla_policy nft_ng_policy[NFTA_NG_MAX + 1] = {
[NFTA_NG_DREG] = { .type = NLA_U32 },
- [NFTA_NG_UNTIL] = { .type = NLA_U32 },
+ [NFTA_NG_MODULUS] = { .type = NLA_U32 },
[NFTA_NG_TYPE] = { .type = NLA_U32 },
+ [NFTA_NG_OFFSET] = { .type = NLA_U32 },
};
static int nft_ng_inc_init(const struct nft_ctx *ctx,
@@ -52,10 +54,16 @@
{
struct nft_ng_inc *priv = nft_expr_priv(expr);
- priv->until = ntohl(nla_get_be32(tb[NFTA_NG_UNTIL]));
- if (priv->until == 0)
+ if (tb[NFTA_NG_OFFSET])
+ priv->offset = ntohl(nla_get_be32(tb[NFTA_NG_OFFSET]));
+
+ priv->modulus = ntohl(nla_get_be32(tb[NFTA_NG_MODULUS]));
+ if (priv->modulus == 0)
return -ERANGE;
+ if (priv->offset + priv->modulus - 1 < priv->offset)
+ return -EOVERFLOW;
+
priv->dreg = nft_parse_register(tb[NFTA_NG_DREG]);
atomic_set(&priv->counter, 0);
@@ -64,14 +72,16 @@
}
static int nft_ng_dump(struct sk_buff *skb, enum nft_registers dreg,
- u32 until, enum nft_ng_types type)
+ u32 modulus, enum nft_ng_types type, u32 offset)
{
if (nft_dump_register(skb, NFTA_NG_DREG, dreg))
goto nla_put_failure;
- if (nla_put_be32(skb, NFTA_NG_UNTIL, htonl(until)))
+ if (nla_put_be32(skb, NFTA_NG_MODULUS, htonl(modulus)))
goto nla_put_failure;
if (nla_put_be32(skb, NFTA_NG_TYPE, htonl(type)))
goto nla_put_failure;
+ if (nla_put_be32(skb, NFTA_NG_OFFSET, htonl(offset)))
+ goto nla_put_failure;
return 0;
@@ -83,12 +93,14 @@
{
const struct nft_ng_inc *priv = nft_expr_priv(expr);
- return nft_ng_dump(skb, priv->dreg, priv->until, NFT_NG_INCREMENTAL);
+ return nft_ng_dump(skb, priv->dreg, priv->modulus, NFT_NG_INCREMENTAL,
+ priv->offset);
}
struct nft_ng_random {
enum nft_registers dreg:8;
- u32 until;
+ u32 modulus;
+ u32 offset;
};
static void nft_ng_random_eval(const struct nft_expr *expr,
@@ -97,9 +109,10 @@
{
struct nft_ng_random *priv = nft_expr_priv(expr);
struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
+ u32 val;
- regs->data[priv->dreg] = reciprocal_scale(prandom_u32_state(state),
- priv->until);
+ val = reciprocal_scale(prandom_u32_state(state), priv->modulus);
+ regs->data[priv->dreg] = val + priv->offset;
}
static int nft_ng_random_init(const struct nft_ctx *ctx,
@@ -108,10 +121,16 @@
{
struct nft_ng_random *priv = nft_expr_priv(expr);
- priv->until = ntohl(nla_get_be32(tb[NFTA_NG_UNTIL]));
- if (priv->until == 0)
+ if (tb[NFTA_NG_OFFSET])
+ priv->offset = ntohl(nla_get_be32(tb[NFTA_NG_OFFSET]));
+
+ priv->modulus = ntohl(nla_get_be32(tb[NFTA_NG_MODULUS]));
+ if (priv->modulus == 0)
return -ERANGE;
+ if (priv->offset + priv->modulus - 1 < priv->offset)
+ return -EOVERFLOW;
+
prandom_init_once(&nft_numgen_prandom_state);
priv->dreg = nft_parse_register(tb[NFTA_NG_DREG]);
@@ -124,7 +143,8 @@
{
const struct nft_ng_random *priv = nft_expr_priv(expr);
- return nft_ng_dump(skb, priv->dreg, priv->until, NFT_NG_RANDOM);
+ return nft_ng_dump(skb, priv->dreg, priv->modulus, NFT_NG_RANDOM,
+ priv->offset);
}
static struct nft_expr_type nft_ng_type;
@@ -149,8 +169,8 @@
{
u32 type;
- if (!tb[NFTA_NG_DREG] ||
- !tb[NFTA_NG_UNTIL] ||
+ if (!tb[NFTA_NG_DREG] ||
+ !tb[NFTA_NG_MODULUS] ||
!tb[NFTA_NG_TYPE])
return ERR_PTR(-EINVAL);
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 12cd4bf..b2f8861 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -92,6 +92,8 @@
offset = skb_network_offset(skb);
break;
case NFT_PAYLOAD_TRANSPORT_HEADER:
+ if (!pkt->tprot_set)
+ goto err;
offset = pkt->xt.thoff;
break;
default:
@@ -184,6 +186,8 @@
offset = skb_network_offset(skb);
break;
case NFT_PAYLOAD_TRANSPORT_HEADER:
+ if (!pkt->tprot_set)
+ goto err;
offset = pkt->xt.thoff;
break;
default:
diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
index 61d216e..393d359 100644
--- a/net/netfilter/nft_queue.c
+++ b/net/netfilter/nft_queue.c
@@ -22,9 +22,10 @@
static u32 jhash_initval __read_mostly;
struct nft_queue {
- u16 queuenum;
- u16 queues_total;
- u16 flags;
+ enum nft_registers sreg_qnum:8;
+ u16 queuenum;
+ u16 queues_total;
+ u16 flags;
};
static void nft_queue_eval(const struct nft_expr *expr,
@@ -54,27 +55,51 @@
regs->verdict.code = ret;
}
+static void nft_queue_sreg_eval(const struct nft_expr *expr,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ struct nft_queue *priv = nft_expr_priv(expr);
+ u32 queue, ret;
+
+ queue = regs->data[priv->sreg_qnum];
+
+ ret = NF_QUEUE_NR(queue);
+ if (priv->flags & NFT_QUEUE_FLAG_BYPASS)
+ ret |= NF_VERDICT_FLAG_QUEUE_BYPASS;
+
+ regs->verdict.code = ret;
+}
+
static const struct nla_policy nft_queue_policy[NFTA_QUEUE_MAX + 1] = {
[NFTA_QUEUE_NUM] = { .type = NLA_U16 },
[NFTA_QUEUE_TOTAL] = { .type = NLA_U16 },
[NFTA_QUEUE_FLAGS] = { .type = NLA_U16 },
+ [NFTA_QUEUE_SREG_QNUM] = { .type = NLA_U32 },
};
static int nft_queue_init(const struct nft_ctx *ctx,
- const struct nft_expr *expr,
- const struct nlattr * const tb[])
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
{
struct nft_queue *priv = nft_expr_priv(expr);
+ u32 maxid;
- if (tb[NFTA_QUEUE_NUM] == NULL)
- return -EINVAL;
-
- init_hashrandom(&jhash_initval);
priv->queuenum = ntohs(nla_get_be16(tb[NFTA_QUEUE_NUM]));
- if (tb[NFTA_QUEUE_TOTAL] != NULL)
+ if (tb[NFTA_QUEUE_TOTAL])
priv->queues_total = ntohs(nla_get_be16(tb[NFTA_QUEUE_TOTAL]));
- if (tb[NFTA_QUEUE_FLAGS] != NULL) {
+ else
+ priv->queues_total = 1;
+
+ if (priv->queues_total == 0)
+ return -EINVAL;
+
+ maxid = priv->queues_total - 1 + priv->queuenum;
+ if (maxid > U16_MAX)
+ return -ERANGE;
+
+ if (tb[NFTA_QUEUE_FLAGS]) {
priv->flags = ntohs(nla_get_be16(tb[NFTA_QUEUE_FLAGS]));
if (priv->flags & ~NFT_QUEUE_FLAG_MASK)
return -EINVAL;
@@ -82,6 +107,29 @@
return 0;
}
+static int nft_queue_sreg_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_queue *priv = nft_expr_priv(expr);
+ int err;
+
+ priv->sreg_qnum = nft_parse_register(tb[NFTA_QUEUE_SREG_QNUM]);
+ err = nft_validate_register_load(priv->sreg_qnum, sizeof(u32));
+ if (err < 0)
+ return err;
+
+ if (tb[NFTA_QUEUE_FLAGS]) {
+ priv->flags = ntohs(nla_get_be16(tb[NFTA_QUEUE_FLAGS]));
+ if (priv->flags & ~NFT_QUEUE_FLAG_MASK)
+ return -EINVAL;
+ if (priv->flags & NFT_QUEUE_FLAG_CPU_FANOUT)
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
static int nft_queue_dump(struct sk_buff *skb, const struct nft_expr *expr)
{
const struct nft_queue *priv = nft_expr_priv(expr);
@@ -97,6 +145,21 @@
return -1;
}
+static int
+nft_queue_sreg_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ const struct nft_queue *priv = nft_expr_priv(expr);
+
+ if (nft_dump_register(skb, NFTA_QUEUE_SREG_QNUM, priv->sreg_qnum) ||
+ nla_put_be16(skb, NFTA_QUEUE_FLAGS, htons(priv->flags)))
+ goto nla_put_failure;
+
+ return 0;
+
+nla_put_failure:
+ return -1;
+}
+
static struct nft_expr_type nft_queue_type;
static const struct nft_expr_ops nft_queue_ops = {
.type = &nft_queue_type,
@@ -106,9 +169,35 @@
.dump = nft_queue_dump,
};
+static const struct nft_expr_ops nft_queue_sreg_ops = {
+ .type = &nft_queue_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_queue)),
+ .eval = nft_queue_sreg_eval,
+ .init = nft_queue_sreg_init,
+ .dump = nft_queue_sreg_dump,
+};
+
+static const struct nft_expr_ops *
+nft_queue_select_ops(const struct nft_ctx *ctx,
+ const struct nlattr * const tb[])
+{
+ if (tb[NFTA_QUEUE_NUM] && tb[NFTA_QUEUE_SREG_QNUM])
+ return ERR_PTR(-EINVAL);
+
+ init_hashrandom(&jhash_initval);
+
+ if (tb[NFTA_QUEUE_NUM])
+ return &nft_queue_ops;
+
+ if (tb[NFTA_QUEUE_SREG_QNUM])
+ return &nft_queue_sreg_ops;
+
+ return ERR_PTR(-EINVAL);
+}
+
static struct nft_expr_type nft_queue_type __read_mostly = {
.name = "queue",
- .ops = &nft_queue_ops,
+ .select_ops = &nft_queue_select_ops,
.policy = nft_queue_policy,
.maxattr = NFTA_QUEUE_MAX,
.owner = THIS_MODULE,
diff --git a/net/netfilter/nft_quota.c b/net/netfilter/nft_quota.c
index 6eafbf9..c00104c 100644
--- a/net/netfilter/nft_quota.c
+++ b/net/netfilter/nft_quota.c
@@ -21,10 +21,10 @@
atomic64_t remain;
};
-static inline long nft_quota(struct nft_quota *priv,
- const struct nft_pktinfo *pkt)
+static inline bool nft_overquota(struct nft_quota *priv,
+ const struct nft_pktinfo *pkt)
{
- return atomic64_sub_return(pkt->skb->len, &priv->remain);
+ return atomic64_sub_return(pkt->skb->len, &priv->remain) < 0;
}
static void nft_quota_eval(const struct nft_expr *expr,
@@ -33,7 +33,7 @@
{
struct nft_quota *priv = nft_expr_priv(expr);
- if (nft_quota(priv, pkt) < 0 && !priv->invert)
+ if (nft_overquota(priv, pkt) ^ priv->invert)
regs->verdict.code = NFT_BREAK;
}
diff --git a/net/netfilter/nft_range.c b/net/netfilter/nft_range.c
new file mode 100644
index 0000000..c6d5358
--- /dev/null
+++ b/net/netfilter/nft_range.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) 2016 Pablo Neira Ayuso <pablo@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/netlink.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_core.h>
+#include <net/netfilter/nf_tables.h>
+
+struct nft_range_expr {
+ struct nft_data data_from;
+ struct nft_data data_to;
+ enum nft_registers sreg:8;
+ u8 len;
+ enum nft_range_ops op:8;
+};
+
+static void nft_range_eval(const struct nft_expr *expr,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ const struct nft_range_expr *priv = nft_expr_priv(expr);
+ bool mismatch;
+ int d1, d2;
+
+ d1 = memcmp(®s->data[priv->sreg], &priv->data_from, priv->len);
+ d2 = memcmp(®s->data[priv->sreg], &priv->data_to, priv->len);
+ switch (priv->op) {
+ case NFT_RANGE_EQ:
+ mismatch = (d1 < 0 || d2 > 0);
+ break;
+ case NFT_RANGE_NEQ:
+ mismatch = (d1 >= 0 && d2 <= 0);
+ break;
+ }
+
+ if (mismatch)
+ regs->verdict.code = NFT_BREAK;
+}
+
+static const struct nla_policy nft_range_policy[NFTA_RANGE_MAX + 1] = {
+ [NFTA_RANGE_SREG] = { .type = NLA_U32 },
+ [NFTA_RANGE_OP] = { .type = NLA_U32 },
+ [NFTA_RANGE_FROM_DATA] = { .type = NLA_NESTED },
+ [NFTA_RANGE_TO_DATA] = { .type = NLA_NESTED },
+};
+
+static int nft_range_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_range_expr *priv = nft_expr_priv(expr);
+ struct nft_data_desc desc_from, desc_to;
+ int err;
+
+ err = nft_data_init(NULL, &priv->data_from, sizeof(priv->data_from),
+ &desc_from, tb[NFTA_RANGE_FROM_DATA]);
+ if (err < 0)
+ return err;
+
+ err = nft_data_init(NULL, &priv->data_to, sizeof(priv->data_to),
+ &desc_to, tb[NFTA_RANGE_TO_DATA]);
+ if (err < 0)
+ goto err1;
+
+ if (desc_from.len != desc_to.len) {
+ err = -EINVAL;
+ goto err2;
+ }
+
+ priv->sreg = nft_parse_register(tb[NFTA_RANGE_SREG]);
+ err = nft_validate_register_load(priv->sreg, desc_from.len);
+ if (err < 0)
+ goto err2;
+
+ priv->op = ntohl(nla_get_be32(tb[NFTA_RANGE_OP]));
+ priv->len = desc_from.len;
+ return 0;
+err2:
+ nft_data_uninit(&priv->data_to, desc_to.type);
+err1:
+ nft_data_uninit(&priv->data_from, desc_from.type);
+ return err;
+}
+
+static int nft_range_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ const struct nft_range_expr *priv = nft_expr_priv(expr);
+
+ if (nft_dump_register(skb, NFTA_RANGE_SREG, priv->sreg))
+ goto nla_put_failure;
+ if (nla_put_be32(skb, NFTA_RANGE_OP, htonl(priv->op)))
+ goto nla_put_failure;
+
+ if (nft_data_dump(skb, NFTA_RANGE_FROM_DATA, &priv->data_from,
+ NFT_DATA_VALUE, priv->len) < 0 ||
+ nft_data_dump(skb, NFTA_RANGE_TO_DATA, &priv->data_to,
+ NFT_DATA_VALUE, priv->len) < 0)
+ goto nla_put_failure;
+ return 0;
+
+nla_put_failure:
+ return -1;
+}
+
+static struct nft_expr_type nft_range_type;
+static const struct nft_expr_ops nft_range_ops = {
+ .type = &nft_range_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_range_expr)),
+ .eval = nft_range_eval,
+ .init = nft_range_init,
+ .dump = nft_range_dump,
+};
+
+static struct nft_expr_type nft_range_type __read_mostly = {
+ .name = "range",
+ .ops = &nft_range_ops,
+ .policy = nft_range_policy,
+ .maxattr = NFTA_RANGE_MAX,
+ .owner = THIS_MODULE,
+};
+
+int __init nft_range_module_init(void)
+{
+ return nft_register_expr(&nft_range_type);
+}
+
+void nft_range_module_exit(void)
+{
+ nft_unregister_expr(&nft_range_type);
+}
diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
index 515131f..dbd6c4a 100644
--- a/net/netfilter/xt_RATEEST.c
+++ b/net/netfilter/xt_RATEEST.c
@@ -24,7 +24,6 @@
#define RATEEST_HSIZE 16
static struct hlist_head rateest_hash[RATEEST_HSIZE] __read_mostly;
static unsigned int jhash_rnd __read_mostly;
-static bool rnd_inited __read_mostly;
static unsigned int xt_rateest_hash(const char *name)
{
@@ -99,10 +98,7 @@
} cfg;
int ret;
- if (unlikely(!rnd_inited)) {
- get_random_bytes(&jhash_rnd, sizeof(jhash_rnd));
- rnd_inited = true;
- }
+ net_get_random_once(&jhash_rnd, sizeof(jhash_rnd));
est = xt_rateest_lookup(info->name);
if (est) {
diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c
index e118397..872db2d 100644
--- a/net/netfilter/xt_TCPMSS.c
+++ b/net/netfilter/xt_TCPMSS.c
@@ -110,18 +110,14 @@
if (info->mss == XT_TCPMSS_CLAMP_PMTU) {
struct net *net = par->net;
unsigned int in_mtu = tcpmss_reverse_mtu(net, skb, family);
+ unsigned int min_mtu = min(dst_mtu(skb_dst(skb)), in_mtu);
- if (dst_mtu(skb_dst(skb)) <= minlen) {
+ if (min_mtu <= minlen) {
net_err_ratelimited("unknown or invalid path-MTU (%u)\n",
- dst_mtu(skb_dst(skb)));
+ min_mtu);
return -1;
}
- if (in_mtu <= minlen) {
- net_err_ratelimited("unknown or invalid path-MTU (%u)\n",
- in_mtu);
- return -1;
- }
- newmss = min(dst_mtu(skb_dst(skb)), in_mtu) - minlen;
+ newmss = min_mtu - minlen;
} else
newmss = info->mss;
diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 6e57a39..0471db4 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -89,6 +89,8 @@
return -EINVAL;
if (info->oif[0]) {
+ int ret;
+
if (info->oif[sizeof(info->oif)-1] != '\0')
return -EINVAL;
@@ -101,7 +103,11 @@
priv->notifier.notifier_call = tee_netdev_event;
info->priv = priv;
- register_netdevice_notifier(&priv->notifier);
+ ret = register_netdevice_notifier(&priv->notifier);
+ if (ret) {
+ kfree(priv);
+ return ret;
+ }
} else
info->priv = NULL;
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index 99bbc82..b6dc322 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -366,14 +366,8 @@
unsigned int i;
int ret;
- if (unlikely(!connlimit_rnd)) {
- u_int32_t rand;
+ net_get_random_once(&connlimit_rnd, sizeof(connlimit_rnd));
- do {
- get_random_bytes(&rand, sizeof(rand));
- } while (!rand);
- cmpxchg(&connlimit_rnd, 0, rand);
- }
ret = nf_ct_l3proto_try_module_get(par->family);
if (ret < 0) {
pr_info("cannot load conntrack support for "
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 1786968..44a095e 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -56,6 +56,7 @@
}
/* need to declare this at the top */
+static const struct file_operations dl_file_ops_v1;
static const struct file_operations dl_file_ops;
/* hash table crap */
@@ -86,8 +87,8 @@
unsigned long expires; /* precalculated expiry time */
struct {
unsigned long prev; /* last modification */
- u_int32_t credit;
- u_int32_t credit_cap, cost;
+ u_int64_t credit;
+ u_int64_t credit_cap, cost;
} rateinfo;
struct rcu_head rcu;
};
@@ -98,7 +99,7 @@
u_int8_t family;
bool rnd_initialized;
- struct hashlimit_cfg1 cfg; /* config */
+ struct hashlimit_cfg2 cfg; /* config */
/* used internally */
spinlock_t lock; /* lock for list_head */
@@ -114,6 +115,30 @@
struct hlist_head hash[0]; /* hashtable itself */
};
+static int
+cfg_copy(struct hashlimit_cfg2 *to, void *from, int revision)
+{
+ if (revision == 1) {
+ struct hashlimit_cfg1 *cfg = (struct hashlimit_cfg1 *)from;
+
+ to->mode = cfg->mode;
+ to->avg = cfg->avg;
+ to->burst = cfg->burst;
+ to->size = cfg->size;
+ to->max = cfg->max;
+ to->gc_interval = cfg->gc_interval;
+ to->expire = cfg->expire;
+ to->srcmask = cfg->srcmask;
+ to->dstmask = cfg->dstmask;
+ } else if (revision == 2) {
+ memcpy(to, from, sizeof(struct hashlimit_cfg2));
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static DEFINE_MUTEX(hashlimit_mutex); /* protects htables list */
static struct kmem_cache *hashlimit_cachep __read_mostly;
@@ -215,16 +240,18 @@
}
static void htable_gc(struct work_struct *work);
-static int htable_create(struct net *net, struct xt_hashlimit_mtinfo1 *minfo,
- u_int8_t family)
+static int htable_create(struct net *net, struct hashlimit_cfg2 *cfg,
+ const char *name, u_int8_t family,
+ struct xt_hashlimit_htable **out_hinfo,
+ int revision)
{
struct hashlimit_net *hashlimit_net = hashlimit_pernet(net);
struct xt_hashlimit_htable *hinfo;
- unsigned int size;
- unsigned int i;
+ unsigned int size, i;
+ int ret;
- if (minfo->cfg.size) {
- size = minfo->cfg.size;
+ if (cfg->size) {
+ size = cfg->size;
} else {
size = (totalram_pages << PAGE_SHIFT) / 16384 /
sizeof(struct list_head);
@@ -238,10 +265,14 @@
sizeof(struct list_head) * size);
if (hinfo == NULL)
return -ENOMEM;
- minfo->hinfo = hinfo;
+ *out_hinfo = hinfo;
/* copy match config into hashtable config */
- memcpy(&hinfo->cfg, &minfo->cfg, sizeof(hinfo->cfg));
+ ret = cfg_copy(&hinfo->cfg, (void *)cfg, 2);
+
+ if (ret)
+ return ret;
+
hinfo->cfg.size = size;
if (hinfo->cfg.max == 0)
hinfo->cfg.max = 8 * hinfo->cfg.size;
@@ -255,17 +286,18 @@
hinfo->count = 0;
hinfo->family = family;
hinfo->rnd_initialized = false;
- hinfo->name = kstrdup(minfo->name, GFP_KERNEL);
+ hinfo->name = kstrdup(name, GFP_KERNEL);
if (!hinfo->name) {
vfree(hinfo);
return -ENOMEM;
}
spin_lock_init(&hinfo->lock);
- hinfo->pde = proc_create_data(minfo->name, 0,
+ hinfo->pde = proc_create_data(name, 0,
(family == NFPROTO_IPV4) ?
hashlimit_net->ipt_hashlimit : hashlimit_net->ip6t_hashlimit,
- &dl_file_ops, hinfo);
+ (revision == 1) ? &dl_file_ops_v1 : &dl_file_ops,
+ hinfo);
if (hinfo->pde == NULL) {
kfree(hinfo->name);
vfree(hinfo);
@@ -398,7 +430,8 @@
(slowest userspace tool allows), which means
CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32 ie.
*/
-#define MAX_CPJ (0xFFFFFFFF / (HZ*60*60*24))
+#define MAX_CPJ_v1 (0xFFFFFFFF / (HZ*60*60*24))
+#define MAX_CPJ (0xFFFFFFFFFFFFFFFF / (HZ*60*60*24))
/* Repeated shift and or gives us all 1s, final shift and add 1 gives
* us the power of 2 below the theoretical max, so GCC simply does a
@@ -408,9 +441,12 @@
#define _POW2_BELOW8(x) (_POW2_BELOW4(x)|_POW2_BELOW4((x)>>4))
#define _POW2_BELOW16(x) (_POW2_BELOW8(x)|_POW2_BELOW8((x)>>8))
#define _POW2_BELOW32(x) (_POW2_BELOW16(x)|_POW2_BELOW16((x)>>16))
+#define _POW2_BELOW64(x) (_POW2_BELOW32(x)|_POW2_BELOW32((x)>>32))
#define POW2_BELOW32(x) ((_POW2_BELOW32(x)>>1) + 1)
+#define POW2_BELOW64(x) ((_POW2_BELOW64(x)>>1) + 1)
-#define CREDITS_PER_JIFFY POW2_BELOW32(MAX_CPJ)
+#define CREDITS_PER_JIFFY POW2_BELOW64(MAX_CPJ)
+#define CREDITS_PER_JIFFY_v1 POW2_BELOW32(MAX_CPJ_v1)
/* in byte mode, the lowest possible rate is one packet/second.
* credit_cap is used as a counter that tells us how many times we can
@@ -425,14 +461,24 @@
}
/* Precision saver. */
-static u32 user2credits(u32 user)
+static u64 user2credits(u64 user, int revision)
{
- /* If multiplying would overflow... */
- if (user > 0xFFFFFFFF / (HZ*CREDITS_PER_JIFFY))
- /* Divide first. */
- return (user / XT_HASHLIMIT_SCALE) * HZ * CREDITS_PER_JIFFY;
+ if (revision == 1) {
+ /* If multiplying would overflow... */
+ if (user > 0xFFFFFFFF / (HZ*CREDITS_PER_JIFFY_v1))
+ /* Divide first. */
+ return (user / XT_HASHLIMIT_SCALE) *\
+ HZ * CREDITS_PER_JIFFY_v1;
- return (user * HZ * CREDITS_PER_JIFFY) / XT_HASHLIMIT_SCALE;
+ return (user * HZ * CREDITS_PER_JIFFY_v1) \
+ / XT_HASHLIMIT_SCALE;
+ } else {
+ if (user > 0xFFFFFFFFFFFFFFFF / (HZ*CREDITS_PER_JIFFY))
+ return (user / XT_HASHLIMIT_SCALE_v2) *\
+ HZ * CREDITS_PER_JIFFY;
+
+ return (user * HZ * CREDITS_PER_JIFFY) / XT_HASHLIMIT_SCALE_v2;
+ }
}
static u32 user2credits_byte(u32 user)
@@ -442,10 +488,11 @@
return (u32) (us >> 32);
}
-static void rateinfo_recalc(struct dsthash_ent *dh, unsigned long now, u32 mode)
+static void rateinfo_recalc(struct dsthash_ent *dh, unsigned long now,
+ u32 mode, int revision)
{
unsigned long delta = now - dh->rateinfo.prev;
- u32 cap;
+ u64 cap, cpj;
if (delta == 0)
return;
@@ -453,7 +500,7 @@
dh->rateinfo.prev = now;
if (mode & XT_HASHLIMIT_BYTES) {
- u32 tmp = dh->rateinfo.credit;
+ u64 tmp = dh->rateinfo.credit;
dh->rateinfo.credit += CREDITS_PER_JIFFY_BYTES * delta;
cap = CREDITS_PER_JIFFY_BYTES * HZ;
if (tmp >= dh->rateinfo.credit) {/* overflow */
@@ -461,7 +508,9 @@
return;
}
} else {
- dh->rateinfo.credit += delta * CREDITS_PER_JIFFY;
+ cpj = (revision == 1) ?
+ CREDITS_PER_JIFFY_v1 : CREDITS_PER_JIFFY;
+ dh->rateinfo.credit += delta * cpj;
cap = dh->rateinfo.credit_cap;
}
if (dh->rateinfo.credit > cap)
@@ -469,7 +518,7 @@
}
static void rateinfo_init(struct dsthash_ent *dh,
- struct xt_hashlimit_htable *hinfo)
+ struct xt_hashlimit_htable *hinfo, int revision)
{
dh->rateinfo.prev = jiffies;
if (hinfo->cfg.mode & XT_HASHLIMIT_BYTES) {
@@ -478,8 +527,8 @@
dh->rateinfo.credit_cap = hinfo->cfg.burst;
} else {
dh->rateinfo.credit = user2credits(hinfo->cfg.avg *
- hinfo->cfg.burst);
- dh->rateinfo.cost = user2credits(hinfo->cfg.avg);
+ hinfo->cfg.burst, revision);
+ dh->rateinfo.cost = user2credits(hinfo->cfg.avg, revision);
dh->rateinfo.credit_cap = dh->rateinfo.credit;
}
}
@@ -603,15 +652,15 @@
}
static bool
-hashlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
+hashlimit_mt_common(const struct sk_buff *skb, struct xt_action_param *par,
+ struct xt_hashlimit_htable *hinfo,
+ const struct hashlimit_cfg2 *cfg, int revision)
{
- const struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
- struct xt_hashlimit_htable *hinfo = info->hinfo;
unsigned long now = jiffies;
struct dsthash_ent *dh;
struct dsthash_dst dst;
bool race = false;
- u32 cost;
+ u64 cost;
if (hashlimit_init_dst(hinfo, &dst, skb, par->thoff) < 0)
goto hotdrop;
@@ -626,18 +675,18 @@
} else if (race) {
/* Already got an entry, update expiration timeout */
dh->expires = now + msecs_to_jiffies(hinfo->cfg.expire);
- rateinfo_recalc(dh, now, hinfo->cfg.mode);
+ rateinfo_recalc(dh, now, hinfo->cfg.mode, revision);
} else {
dh->expires = jiffies + msecs_to_jiffies(hinfo->cfg.expire);
- rateinfo_init(dh, hinfo);
+ rateinfo_init(dh, hinfo, revision);
}
} else {
/* update expiration timeout */
dh->expires = now + msecs_to_jiffies(hinfo->cfg.expire);
- rateinfo_recalc(dh, now, hinfo->cfg.mode);
+ rateinfo_recalc(dh, now, hinfo->cfg.mode, revision);
}
- if (info->cfg.mode & XT_HASHLIMIT_BYTES)
+ if (cfg->mode & XT_HASHLIMIT_BYTES)
cost = hashlimit_byte_cost(skb->len, dh);
else
cost = dh->rateinfo.cost;
@@ -647,73 +696,136 @@
dh->rateinfo.credit -= cost;
spin_unlock(&dh->lock);
rcu_read_unlock_bh();
- return !(info->cfg.mode & XT_HASHLIMIT_INVERT);
+ return !(cfg->mode & XT_HASHLIMIT_INVERT);
}
spin_unlock(&dh->lock);
rcu_read_unlock_bh();
/* default match is underlimit - so over the limit, we need to invert */
- return info->cfg.mode & XT_HASHLIMIT_INVERT;
+ return cfg->mode & XT_HASHLIMIT_INVERT;
hotdrop:
par->hotdrop = true;
return false;
}
-static int hashlimit_mt_check(const struct xt_mtchk_param *par)
+static bool
+hashlimit_mt_v1(const struct sk_buff *skb, struct xt_action_param *par)
{
- struct net *net = par->net;
- struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
+ const struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
+ struct xt_hashlimit_htable *hinfo = info->hinfo;
+ struct hashlimit_cfg2 cfg = {};
int ret;
- if (info->cfg.gc_interval == 0 || info->cfg.expire == 0)
- return -EINVAL;
- if (info->name[sizeof(info->name)-1] != '\0')
+ ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
+
+ if (ret)
+ return ret;
+
+ return hashlimit_mt_common(skb, par, hinfo, &cfg, 1);
+}
+
+static bool
+hashlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
+{
+ const struct xt_hashlimit_mtinfo2 *info = par->matchinfo;
+ struct xt_hashlimit_htable *hinfo = info->hinfo;
+
+ return hashlimit_mt_common(skb, par, hinfo, &info->cfg, 2);
+}
+
+static int hashlimit_mt_check_common(const struct xt_mtchk_param *par,
+ struct xt_hashlimit_htable **hinfo,
+ struct hashlimit_cfg2 *cfg,
+ const char *name, int revision)
+{
+ struct net *net = par->net;
+ int ret;
+
+ if (cfg->gc_interval == 0 || cfg->expire == 0)
return -EINVAL;
if (par->family == NFPROTO_IPV4) {
- if (info->cfg.srcmask > 32 || info->cfg.dstmask > 32)
+ if (cfg->srcmask > 32 || cfg->dstmask > 32)
return -EINVAL;
} else {
- if (info->cfg.srcmask > 128 || info->cfg.dstmask > 128)
+ if (cfg->srcmask > 128 || cfg->dstmask > 128)
return -EINVAL;
}
- if (info->cfg.mode & ~XT_HASHLIMIT_ALL) {
+ if (cfg->mode & ~XT_HASHLIMIT_ALL) {
pr_info("Unknown mode mask %X, kernel too old?\n",
- info->cfg.mode);
+ cfg->mode);
return -EINVAL;
}
/* Check for overflow. */
- if (info->cfg.mode & XT_HASHLIMIT_BYTES) {
- if (user2credits_byte(info->cfg.avg) == 0) {
- pr_info("overflow, rate too high: %u\n", info->cfg.avg);
+ if (cfg->mode & XT_HASHLIMIT_BYTES) {
+ if (user2credits_byte(cfg->avg) == 0) {
+ pr_info("overflow, rate too high: %llu\n", cfg->avg);
return -EINVAL;
}
- } else if (info->cfg.burst == 0 ||
- user2credits(info->cfg.avg * info->cfg.burst) <
- user2credits(info->cfg.avg)) {
- pr_info("overflow, try lower: %u/%u\n",
- info->cfg.avg, info->cfg.burst);
+ } else if (cfg->burst == 0 ||
+ user2credits(cfg->avg * cfg->burst, revision) <
+ user2credits(cfg->avg, revision)) {
+ pr_info("overflow, try lower: %llu/%llu\n",
+ cfg->avg, cfg->burst);
return -ERANGE;
}
mutex_lock(&hashlimit_mutex);
- info->hinfo = htable_find_get(net, info->name, par->family);
- if (info->hinfo == NULL) {
- ret = htable_create(net, info, par->family);
+ *hinfo = htable_find_get(net, name, par->family);
+ if (*hinfo == NULL) {
+ ret = htable_create(net, cfg, name, par->family,
+ hinfo, revision);
if (ret < 0) {
mutex_unlock(&hashlimit_mutex);
return ret;
}
}
mutex_unlock(&hashlimit_mutex);
+
return 0;
}
+static int hashlimit_mt_check_v1(const struct xt_mtchk_param *par)
+{
+ struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
+ struct hashlimit_cfg2 cfg = {};
+ int ret;
+
+ if (info->name[sizeof(info->name) - 1] != '\0')
+ return -EINVAL;
+
+ ret = cfg_copy(&cfg, (void *)&info->cfg, 1);
+
+ if (ret)
+ return ret;
+
+ return hashlimit_mt_check_common(par, &info->hinfo,
+ &cfg, info->name, 1);
+}
+
+static int hashlimit_mt_check(const struct xt_mtchk_param *par)
+{
+ struct xt_hashlimit_mtinfo2 *info = par->matchinfo;
+
+ if (info->name[sizeof(info->name) - 1] != '\0')
+ return -EINVAL;
+
+ return hashlimit_mt_check_common(par, &info->hinfo, &info->cfg,
+ info->name, 2);
+}
+
+static void hashlimit_mt_destroy_v1(const struct xt_mtdtor_param *par)
+{
+ const struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
+
+ htable_put(info->hinfo);
+}
+
static void hashlimit_mt_destroy(const struct xt_mtdtor_param *par)
{
- const struct xt_hashlimit_mtinfo1 *info = par->matchinfo;
+ const struct xt_hashlimit_mtinfo2 *info = par->matchinfo;
htable_put(info->hinfo);
}
@@ -723,8 +835,18 @@
.name = "hashlimit",
.revision = 1,
.family = NFPROTO_IPV4,
- .match = hashlimit_mt,
+ .match = hashlimit_mt_v1,
.matchsize = sizeof(struct xt_hashlimit_mtinfo1),
+ .checkentry = hashlimit_mt_check_v1,
+ .destroy = hashlimit_mt_destroy_v1,
+ .me = THIS_MODULE,
+ },
+ {
+ .name = "hashlimit",
+ .revision = 2,
+ .family = NFPROTO_IPV4,
+ .match = hashlimit_mt,
+ .matchsize = sizeof(struct xt_hashlimit_mtinfo2),
.checkentry = hashlimit_mt_check,
.destroy = hashlimit_mt_destroy,
.me = THIS_MODULE,
@@ -734,8 +856,18 @@
.name = "hashlimit",
.revision = 1,
.family = NFPROTO_IPV6,
- .match = hashlimit_mt,
+ .match = hashlimit_mt_v1,
.matchsize = sizeof(struct xt_hashlimit_mtinfo1),
+ .checkentry = hashlimit_mt_check_v1,
+ .destroy = hashlimit_mt_destroy_v1,
+ .me = THIS_MODULE,
+ },
+ {
+ .name = "hashlimit",
+ .revision = 2,
+ .family = NFPROTO_IPV6,
+ .match = hashlimit_mt,
+ .matchsize = sizeof(struct xt_hashlimit_mtinfo2),
.checkentry = hashlimit_mt_check,
.destroy = hashlimit_mt_destroy,
.me = THIS_MODULE,
@@ -786,18 +918,12 @@
spin_unlock_bh(&htable->lock);
}
-static int dl_seq_real_show(struct dsthash_ent *ent, u_int8_t family,
- struct seq_file *s)
+static void dl_seq_print(struct dsthash_ent *ent, u_int8_t family,
+ struct seq_file *s)
{
- const struct xt_hashlimit_htable *ht = s->private;
-
- spin_lock(&ent->lock);
- /* recalculate to show accurate numbers */
- rateinfo_recalc(ent, jiffies, ht->cfg.mode);
-
switch (family) {
case NFPROTO_IPV4:
- seq_printf(s, "%ld %pI4:%u->%pI4:%u %u %u %u\n",
+ seq_printf(s, "%ld %pI4:%u->%pI4:%u %llu %llu %llu\n",
(long)(ent->expires - jiffies)/HZ,
&ent->dst.ip.src,
ntohs(ent->dst.src_port),
@@ -808,7 +934,7 @@
break;
#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
case NFPROTO_IPV6:
- seq_printf(s, "%ld %pI6:%u->%pI6:%u %u %u %u\n",
+ seq_printf(s, "%ld %pI6:%u->%pI6:%u %llu %llu %llu\n",
(long)(ent->expires - jiffies)/HZ,
&ent->dst.ip6.src,
ntohs(ent->dst.src_port),
@@ -821,10 +947,52 @@
default:
BUG();
}
+}
+
+static int dl_seq_real_show_v1(struct dsthash_ent *ent, u_int8_t family,
+ struct seq_file *s)
+{
+ const struct xt_hashlimit_htable *ht = s->private;
+
+ spin_lock(&ent->lock);
+ /* recalculate to show accurate numbers */
+ rateinfo_recalc(ent, jiffies, ht->cfg.mode, 1);
+
+ dl_seq_print(ent, family, s);
+
spin_unlock(&ent->lock);
return seq_has_overflowed(s);
}
+static int dl_seq_real_show(struct dsthash_ent *ent, u_int8_t family,
+ struct seq_file *s)
+{
+ const struct xt_hashlimit_htable *ht = s->private;
+
+ spin_lock(&ent->lock);
+ /* recalculate to show accurate numbers */
+ rateinfo_recalc(ent, jiffies, ht->cfg.mode, 2);
+
+ dl_seq_print(ent, family, s);
+
+ spin_unlock(&ent->lock);
+ return seq_has_overflowed(s);
+}
+
+static int dl_seq_show_v1(struct seq_file *s, void *v)
+{
+ struct xt_hashlimit_htable *htable = s->private;
+ unsigned int *bucket = (unsigned int *)v;
+ struct dsthash_ent *ent;
+
+ if (!hlist_empty(&htable->hash[*bucket])) {
+ hlist_for_each_entry(ent, &htable->hash[*bucket], node)
+ if (dl_seq_real_show_v1(ent, htable->family, s))
+ return -1;
+ }
+ return 0;
+}
+
static int dl_seq_show(struct seq_file *s, void *v)
{
struct xt_hashlimit_htable *htable = s->private;
@@ -839,6 +1007,13 @@
return 0;
}
+static const struct seq_operations dl_seq_ops_v1 = {
+ .start = dl_seq_start,
+ .next = dl_seq_next,
+ .stop = dl_seq_stop,
+ .show = dl_seq_show_v1
+};
+
static const struct seq_operations dl_seq_ops = {
.start = dl_seq_start,
.next = dl_seq_next,
@@ -846,9 +1021,9 @@
.show = dl_seq_show
};
-static int dl_proc_open(struct inode *inode, struct file *file)
+static int dl_proc_open_v1(struct inode *inode, struct file *file)
{
- int ret = seq_open(file, &dl_seq_ops);
+ int ret = seq_open(file, &dl_seq_ops_v1);
if (!ret) {
struct seq_file *sf = file->private_data;
@@ -857,6 +1032,26 @@
return ret;
}
+static int dl_proc_open(struct inode *inode, struct file *file)
+{
+ int ret = seq_open(file, &dl_seq_ops);
+
+ if (!ret) {
+ struct seq_file *sf = file->private_data;
+
+ sf->private = PDE_DATA(inode);
+ }
+ return ret;
+}
+
+static const struct file_operations dl_file_ops_v1 = {
+ .owner = THIS_MODULE,
+ .open = dl_proc_open_v1,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release
+};
+
static const struct file_operations dl_file_ops = {
.owner = THIS_MODULE,
.open = dl_proc_open,
diff --git a/net/netfilter/xt_helper.c b/net/netfilter/xt_helper.c
index 9f4ab00..f679dd4 100644
--- a/net/netfilter/xt_helper.c
+++ b/net/netfilter/xt_helper.c
@@ -41,7 +41,7 @@
if (!master_help)
return ret;
- /* rcu_read_lock()ed by nf_hook_slow */
+ /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(master_help->helper);
if (!helper)
return ret;
@@ -65,7 +65,7 @@
par->family);
return ret;
}
- info->name[29] = '\0';
+ info->name[sizeof(info->name) - 1] = '\0';
return 0;
}
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index d725a27..e3b7a09 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -110,7 +110,6 @@
#endif
static u_int32_t hash_rnd __read_mostly;
-static bool hash_rnd_inited __read_mostly;
static inline unsigned int recent_entry_hash4(const union nf_inet_addr *addr)
{
@@ -340,10 +339,8 @@
int ret = -EINVAL;
size_t sz;
- if (unlikely(!hash_rnd_inited)) {
- get_random_bytes(&hash_rnd, sizeof(hash_rnd));
- hash_rnd_inited = true;
- }
+ net_get_random_once(&hash_rnd, sizeof(hash_rnd));
+
if (info->check_set & ~XT_RECENT_VALID_FLAGS) {
pr_info("Unsupported user space flags (%08x)\n",
info->check_set);
diff --git a/net/netfilter/xt_sctp.c b/net/netfilter/xt_sctp.c
index ef36a56..4dedb96 100644
--- a/net/netfilter/xt_sctp.c
+++ b/net/netfilter/xt_sctp.c
@@ -68,7 +68,7 @@
++i, offset, sch->type, htons(sch->length),
sch->flags);
#endif
- offset += WORD_ROUND(ntohs(sch->length));
+ offset += SCTP_PAD4(ntohs(sch->length));
pr_debug("skb->len: %d\toffset: %d\n", skb->len, offset);
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 0536ab3..4d67ea8 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -928,7 +928,6 @@
struct sw_flow_mask mask;
struct sk_buff *reply;
struct datapath *dp;
- struct sw_flow_key key;
struct sw_flow_actions *acts;
struct sw_flow_match match;
u32 ufid_flags = ovs_nla_get_ufid_flags(a[OVS_FLOW_ATTR_UFID_FLAGS]);
@@ -956,20 +955,24 @@
}
/* Extract key. */
- ovs_match_init(&match, &key, &mask);
+ ovs_match_init(&match, &new_flow->key, false, &mask);
error = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY],
a[OVS_FLOW_ATTR_MASK], log);
if (error)
goto err_kfree_flow;
- ovs_flow_mask_key(&new_flow->key, &key, true, &mask);
-
/* Extract flow identifier. */
error = ovs_nla_get_identifier(&new_flow->id, a[OVS_FLOW_ATTR_UFID],
- &key, log);
+ &new_flow->key, log);
if (error)
goto err_kfree_flow;
+ /* unmasked key is needed to match when ufid is not used. */
+ if (ovs_identifier_is_key(&new_flow->id))
+ match.key = new_flow->id.unmasked_key;
+
+ ovs_flow_mask_key(&new_flow->key, &new_flow->key, true, &mask);
+
/* Validate actions. */
error = ovs_nla_copy_actions(net, a[OVS_FLOW_ATTR_ACTIONS],
&new_flow->key, &acts, log);
@@ -996,7 +999,7 @@
if (ovs_identifier_is_ufid(&new_flow->id))
flow = ovs_flow_tbl_lookup_ufid(&dp->table, &new_flow->id);
if (!flow)
- flow = ovs_flow_tbl_lookup(&dp->table, &key);
+ flow = ovs_flow_tbl_lookup(&dp->table, &new_flow->key);
if (likely(!flow)) {
rcu_assign_pointer(new_flow->sf_acts, acts);
@@ -1121,7 +1124,7 @@
ufid_present = ovs_nla_get_ufid(&sfid, a[OVS_FLOW_ATTR_UFID], log);
if (a[OVS_FLOW_ATTR_KEY]) {
- ovs_match_init(&match, &key, &mask);
+ ovs_match_init(&match, &key, true, &mask);
error = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY],
a[OVS_FLOW_ATTR_MASK], log);
} else if (!ufid_present) {
@@ -1238,7 +1241,7 @@
ufid_present = ovs_nla_get_ufid(&ufid, a[OVS_FLOW_ATTR_UFID], log);
if (a[OVS_FLOW_ATTR_KEY]) {
- ovs_match_init(&match, &key, NULL);
+ ovs_match_init(&match, &key, true, NULL);
err = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY], NULL,
log);
} else if (!ufid_present) {
@@ -1297,7 +1300,7 @@
ufid_present = ovs_nla_get_ufid(&ufid, a[OVS_FLOW_ATTR_UFID], log);
if (a[OVS_FLOW_ATTR_KEY]) {
- ovs_match_init(&match, &key, NULL);
+ ovs_match_init(&match, &key, true, NULL);
err = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY],
NULL, log);
if (unlikely(err))
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 1240ae3..634cc10 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/in.h>
#include <linux/rcupdate.h>
+#include <linux/cpumask.h>
#include <linux/if_arp.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
@@ -72,32 +73,33 @@
{
struct flow_stats *stats;
int node = numa_node_id();
+ int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
- stats = rcu_dereference(flow->stats[node]);
+ stats = rcu_dereference(flow->stats[cpu]);
- /* Check if already have node-specific stats. */
+ /* Check if already have CPU-specific stats. */
if (likely(stats)) {
spin_lock(&stats->lock);
/* Mark if we write on the pre-allocated stats. */
- if (node == 0 && unlikely(flow->stats_last_writer != node))
- flow->stats_last_writer = node;
+ if (cpu == 0 && unlikely(flow->stats_last_writer != cpu))
+ flow->stats_last_writer = cpu;
} else {
stats = rcu_dereference(flow->stats[0]); /* Pre-allocated. */
spin_lock(&stats->lock);
- /* If the current NUMA-node is the only writer on the
+ /* If the current CPU is the only writer on the
* pre-allocated stats keep using them.
*/
- if (unlikely(flow->stats_last_writer != node)) {
+ if (unlikely(flow->stats_last_writer != cpu)) {
/* A previous locker may have already allocated the
- * stats, so we need to check again. If node-specific
+ * stats, so we need to check again. If CPU-specific
* stats were already allocated, we update the pre-
* allocated stats as we have already locked them.
*/
- if (likely(flow->stats_last_writer != NUMA_NO_NODE)
- && likely(!rcu_access_pointer(flow->stats[node]))) {
- /* Try to allocate node-specific stats. */
+ if (likely(flow->stats_last_writer != -1) &&
+ likely(!rcu_access_pointer(flow->stats[cpu]))) {
+ /* Try to allocate CPU-specific stats. */
struct flow_stats *new_stats;
new_stats =
@@ -114,12 +116,12 @@
new_stats->tcp_flags = tcp_flags;
spin_lock_init(&new_stats->lock);
- rcu_assign_pointer(flow->stats[node],
+ rcu_assign_pointer(flow->stats[cpu],
new_stats);
goto unlock;
}
}
- flow->stats_last_writer = node;
+ flow->stats_last_writer = cpu;
}
}
@@ -136,14 +138,15 @@
struct ovs_flow_stats *ovs_stats,
unsigned long *used, __be16 *tcp_flags)
{
- int node;
+ int cpu;
*used = 0;
*tcp_flags = 0;
memset(ovs_stats, 0, sizeof(*ovs_stats));
- for_each_node(node) {
- struct flow_stats *stats = rcu_dereference_ovsl(flow->stats[node]);
+ /* We open code this to make sure cpu 0 is always considered */
+ for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+ struct flow_stats *stats = rcu_dereference_ovsl(flow->stats[cpu]);
if (stats) {
/* Local CPU may write on non-local stats, so we must
@@ -163,10 +166,11 @@
/* Called with ovs_mutex. */
void ovs_flow_stats_clear(struct sw_flow *flow)
{
- int node;
+ int cpu;
- for_each_node(node) {
- struct flow_stats *stats = ovsl_dereference(flow->stats[node]);
+ /* We open code this to make sure cpu 0 is always considered */
+ for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+ struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
if (stats) {
spin_lock_bh(&stats->lock);
@@ -763,8 +767,6 @@
{
int err;
- memset(key, 0, OVS_SW_FLOW_KEY_METADATA_SIZE);
-
/* Extract metadata from netlink attributes. */
err = ovs_nla_get_flow_metadata(net, attr, key, log);
if (err)
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index 156a302..ae783f5 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -178,14 +178,14 @@
struct hlist_node node[2];
u32 hash;
} flow_table, ufid_table;
- int stats_last_writer; /* NUMA-node id of the last writer on
+ int stats_last_writer; /* CPU id of the last writer on
* 'stats[0]'.
*/
struct sw_flow_key key;
struct sw_flow_id id;
struct sw_flow_mask *mask;
struct sw_flow_actions __rcu *sf_acts;
- struct flow_stats __rcu *stats[]; /* One for each NUMA node. First one
+ struct flow_stats __rcu *stats[]; /* One for each CPU. First one
* is allocated at flow creation time,
* the rest are allocated on demand
* while holding the 'stats[0].lock'.
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 8efa718..ae25ded 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -1996,13 +1996,15 @@
void ovs_match_init(struct sw_flow_match *match,
struct sw_flow_key *key,
+ bool reset_key,
struct sw_flow_mask *mask)
{
memset(match, 0, sizeof(*match));
match->key = key;
match->mask = mask;
- memset(key, 0, sizeof(*key));
+ if (reset_key)
+ memset(key, 0, sizeof(*key));
if (mask) {
memset(&mask->key, 0, sizeof(mask->key));
@@ -2049,7 +2051,7 @@
struct nlattr *a;
int err = 0, start, opts_type;
- ovs_match_init(&match, &key, NULL);
+ ovs_match_init(&match, &key, true, NULL);
opts_type = ip_tun_from_nlattr(nla_data(attr), &match, false, log);
if (opts_type < 0)
return opts_type;
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 47dd142..45f9769 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -41,7 +41,8 @@
size_t ovs_key_attr_size(void);
void ovs_match_init(struct sw_flow_match *match,
- struct sw_flow_key *key, struct sw_flow_mask *mask);
+ struct sw_flow_key *key, bool reset_key,
+ struct sw_flow_mask *mask);
int ovs_nla_put_key(const struct sw_flow_key *, const struct sw_flow_key *,
int attr, bool is_mask, struct sk_buff *);
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index d073fff..ea7a807 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -32,6 +32,7 @@
#include <linux/module.h>
#include <linux/in.h>
#include <linux/rcupdate.h>
+#include <linux/cpumask.h>
#include <linux/if_arp.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
@@ -79,17 +80,12 @@
{
struct sw_flow *flow;
struct flow_stats *stats;
- int node;
- flow = kmem_cache_alloc(flow_cache, GFP_KERNEL);
+ flow = kmem_cache_zalloc(flow_cache, GFP_KERNEL);
if (!flow)
return ERR_PTR(-ENOMEM);
- flow->sf_acts = NULL;
- flow->mask = NULL;
- flow->id.unmasked_key = NULL;
- flow->id.ufid_len = 0;
- flow->stats_last_writer = NUMA_NO_NODE;
+ flow->stats_last_writer = -1;
/* Initialize the default stat node. */
stats = kmem_cache_alloc_node(flow_stats_cache,
@@ -102,10 +98,6 @@
RCU_INIT_POINTER(flow->stats[0], stats);
- for_each_node(node)
- if (node != 0)
- RCU_INIT_POINTER(flow->stats[node], NULL);
-
return flow;
err:
kmem_cache_free(flow_cache, flow);
@@ -142,16 +134,17 @@
static void flow_free(struct sw_flow *flow)
{
- int node;
+ int cpu;
if (ovs_identifier_is_key(&flow->id))
kfree(flow->id.unmasked_key);
if (flow->sf_acts)
ovs_nla_free_flow_actions((struct sw_flow_actions __force *)flow->sf_acts);
- for_each_node(node)
- if (flow->stats[node])
+ /* We open code this to make sure cpu 0 is always considered */
+ for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask))
+ if (flow->stats[cpu])
kmem_cache_free(flow_stats_cache,
- (struct flow_stats __force *)flow->stats[node]);
+ (struct flow_stats __force *)flow->stats[cpu]);
kmem_cache_free(flow_cache, flow);
}
@@ -756,7 +749,7 @@
BUILD_BUG_ON(sizeof(struct sw_flow_key) % sizeof(long));
flow_cache = kmem_cache_create("sw_flow", sizeof(struct sw_flow)
- + (nr_node_ids
+ + (nr_cpu_ids
* sizeof(struct flow_stats *)),
0, 0, NULL);
if (flow_cache == NULL)
diff --git a/net/rxrpc/Kconfig b/net/rxrpc/Kconfig
index 13396c7..86f8853 100644
--- a/net/rxrpc/Kconfig
+++ b/net/rxrpc/Kconfig
@@ -26,6 +26,13 @@
Say Y here to allow AF_RXRPC to use IPV6 UDP as well as IPV4 UDP as
its network transport.
+config AF_RXRPC_INJECT_LOSS
+ bool "Inject packet loss into RxRPC packet stream"
+ depends on AF_RXRPC
+ help
+ Say Y here to inject packet loss by discarding some received and some
+ transmitted packets.
+
config AF_RXRPC_DEBUG
bool "RxRPC dynamic debugging"
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 09f81bef..8dbf7be 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -45,7 +45,7 @@
atomic_t rxrpc_debug_id;
/* count of skbs currently in use */
-atomic_t rxrpc_n_skbs;
+atomic_t rxrpc_n_tx_skbs, rxrpc_n_rx_skbs;
struct workqueue_struct *rxrpc_workqueue;
@@ -867,7 +867,8 @@
proto_unregister(&rxrpc_proto);
rxrpc_destroy_all_calls();
rxrpc_destroy_all_connections();
- ASSERTCMP(atomic_read(&rxrpc_n_skbs), ==, 0);
+ ASSERTCMP(atomic_read(&rxrpc_n_tx_skbs), ==, 0);
+ ASSERTCMP(atomic_read(&rxrpc_n_rx_skbs), ==, 0);
rxrpc_destroy_all_locals();
remove_proc_entry("rxrpc_conns", init_net.proc_net);
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index e78c40b..ca96e54 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -142,10 +142,7 @@
*/
struct rxrpc_skb_priv {
union {
- unsigned long resend_at; /* time in jiffies at which to resend */
- struct {
- u8 nr_jumbo; /* Number of jumbo subpackets */
- };
+ u8 nr_jumbo; /* Number of jumbo subpackets */
};
union {
unsigned int offset; /* offset into buffer of next read */
@@ -258,10 +255,12 @@
/* calculated RTT cache */
#define RXRPC_RTT_CACHE_SIZE 32
- suseconds_t rtt; /* current RTT estimate (in uS) */
- unsigned int rtt_point; /* next entry at which to insert */
- unsigned int rtt_usage; /* amount of cache actually used */
- suseconds_t rtt_cache[RXRPC_RTT_CACHE_SIZE]; /* calculated RTT cache */
+ ktime_t rtt_last_req; /* Time of last RTT request */
+ u64 rtt; /* Current RTT estimate (in nS) */
+ u64 rtt_sum; /* Sum of cache contents */
+ u64 rtt_cache[RXRPC_RTT_CACHE_SIZE]; /* Determined RTT cache */
+ u8 rtt_cursor; /* next entry at which to insert */
+ u8 rtt_usage; /* amount of cache actually used */
};
/*
@@ -314,6 +313,7 @@
RXRPC_CONN_CLIENT_ACTIVE, /* Conn is on active list, doing calls */
RXRPC_CONN_CLIENT_CULLED, /* Conn is culled and delisted, doing calls */
RXRPC_CONN_CLIENT_IDLE, /* Conn is on idle list, doing mostly nothing */
+ RXRPC_CONN__NR_CACHE_STATES
};
/*
@@ -384,10 +384,9 @@
int debug_id; /* debug ID for printks */
atomic_t serial; /* packet serial number counter */
unsigned int hi_serial; /* highest serial number received */
- u8 size_align; /* data size alignment (for security) */
- u8 header_size; /* rxrpc + security header size */
- u8 security_size; /* security header size */
u32 security_nonce; /* response re-use preventer */
+ u8 size_align; /* data size alignment (for security) */
+ u8 security_size; /* security header size */
u8 security_ix; /* security type */
u8 out_clientflag; /* RXRPC_CLIENT_INITIATED if we are client */
};
@@ -402,6 +401,8 @@
RXRPC_CALL_EXPOSED, /* The call was exposed to the world */
RXRPC_CALL_RX_LAST, /* Received the last packet (at rxtx_top) */
RXRPC_CALL_TX_LAST, /* Last packet in Tx buffer (at rxtx_top) */
+ RXRPC_CALL_PINGING, /* Ping in process */
+ RXRPC_CALL_RETRANS_TIMEOUT, /* Retransmission due to timeout occurred */
};
/*
@@ -447,6 +448,17 @@
};
/*
+ * Call Tx congestion management modes.
+ */
+enum rxrpc_congest_mode {
+ RXRPC_CALL_SLOW_START,
+ RXRPC_CALL_CONGEST_AVOIDANCE,
+ RXRPC_CALL_PACKET_LOSS,
+ RXRPC_CALL_FAST_RETRANSMIT,
+ NR__RXRPC_CONGEST_MODES
+};
+
+/*
* RxRPC call definition
* - matched by { connection, call_id }
*/
@@ -486,6 +498,8 @@
u32 call_id; /* call ID on connection */
u32 cid; /* connection ID plus channel index */
int debug_id; /* debug ID for printks */
+ unsigned short rx_pkt_offset; /* Current recvmsg packet offset */
+ unsigned short rx_pkt_len; /* Current recvmsg packet len */
/* Rx/Tx circular buffer, depending on phase.
*
@@ -505,6 +519,10 @@
#define RXRPC_TX_ANNO_UNACK 1
#define RXRPC_TX_ANNO_NAK 2
#define RXRPC_TX_ANNO_RETRANS 3
+#define RXRPC_TX_ANNO_MASK 0x03
+#define RXRPC_TX_ANNO_LAST 0x04
+#define RXRPC_TX_ANNO_RESENT 0x08
+
#define RXRPC_RX_ANNO_JUMBO 0x3f /* Jumbo subpacket number + 1 if not zero */
#define RXRPC_RX_ANNO_JLAST 0x40 /* Set if last element of a jumbo packet */
#define RXRPC_RX_ANNO_VERIFIED 0x80 /* Set if verified and decrypted */
@@ -512,6 +530,20 @@
* not hard-ACK'd packet follows this.
*/
rxrpc_seq_t tx_top; /* Highest Tx slot allocated. */
+
+ /* TCP-style slow-start congestion control [RFC5681]. Since the SMSS
+ * is fixed, we keep these numbers in terms of segments (ie. DATA
+ * packets) rather than bytes.
+ */
+#define RXRPC_TX_SMSS RXRPC_JUMBO_DATALEN
+ u8 cong_cwnd; /* Congestion window size */
+ u8 cong_extra; /* Extra to send for congestion management */
+ u8 cong_ssthresh; /* Slow-start threshold */
+ enum rxrpc_congest_mode cong_mode:8; /* Congestion management mode */
+ u8 cong_dup_acks; /* Count of ACKs showing missing packets */
+ u8 cong_cumul_acks; /* Cumulative ACK count */
+ ktime_t cong_tstamp; /* Last time cwnd was changed */
+
rxrpc_seq_t rx_hard_ack; /* Dead slot in buffer; the first received but not
* consumed packet follows this.
*/
@@ -519,6 +551,7 @@
rxrpc_seq_t rx_expect_next; /* Expected next packet sequence number */
u8 rx_winsize; /* Size of Rx window */
u8 tx_winsize; /* Maximum size of Tx window */
+ bool tx_phase; /* T if transmission phase, F if receive phase */
u8 nr_jumbo_bad; /* Number of jumbo dups/exceeds-windows */
/* receive-phase ACK management */
@@ -526,19 +559,105 @@
u16 ackr_skew; /* skew on packet being ACK'd */
rxrpc_serial_t ackr_serial; /* serial of packet being ACK'd */
rxrpc_seq_t ackr_prev_seq; /* previous sequence number received */
- unsigned short rx_pkt_offset; /* Current recvmsg packet offset */
- unsigned short rx_pkt_len; /* Current recvmsg packet len */
+ rxrpc_seq_t ackr_consumed; /* Highest packet shown consumed */
+ rxrpc_seq_t ackr_seen; /* Highest packet shown seen */
+ rxrpc_serial_t ackr_ping; /* Last ping sent */
+ ktime_t ackr_ping_time; /* Time last ping sent */
/* transmission-phase ACK management */
+ ktime_t acks_latest_ts; /* Timestamp of latest ACK received */
rxrpc_serial_t acks_latest; /* serial number of latest ACK received */
+ rxrpc_seq_t acks_lowest_nak; /* Lowest NACK in the buffer (or ==tx_hard_ack) */
};
+/*
+ * Summary of a new ACK and the changes it made to the Tx buffer packet states.
+ */
+struct rxrpc_ack_summary {
+ u8 ack_reason;
+ u8 nr_acks; /* Number of ACKs in packet */
+ u8 nr_nacks; /* Number of NACKs in packet */
+ u8 nr_new_acks; /* Number of new ACKs in packet */
+ u8 nr_new_nacks; /* Number of new NACKs in packet */
+ u8 nr_rot_new_acks; /* Number of rotated new ACKs */
+ bool new_low_nack; /* T if new low NACK found */
+ bool retrans_timeo; /* T if reTx due to timeout happened */
+ u8 flight_size; /* Number of unreceived transmissions */
+ /* Place to stash values for tracing */
+ enum rxrpc_congest_mode mode:8;
+ u8 cwnd;
+ u8 ssthresh;
+ u8 dup_acks;
+ u8 cumulative_acks;
+};
+
+enum rxrpc_skb_trace {
+ rxrpc_skb_rx_cleaned,
+ rxrpc_skb_rx_freed,
+ rxrpc_skb_rx_got,
+ rxrpc_skb_rx_lost,
+ rxrpc_skb_rx_received,
+ rxrpc_skb_rx_rotated,
+ rxrpc_skb_rx_purged,
+ rxrpc_skb_rx_seen,
+ rxrpc_skb_tx_cleaned,
+ rxrpc_skb_tx_freed,
+ rxrpc_skb_tx_got,
+ rxrpc_skb_tx_lost,
+ rxrpc_skb_tx_new,
+ rxrpc_skb_tx_rotated,
+ rxrpc_skb_tx_seen,
+ rxrpc_skb__nr_trace
+};
+
+extern const char rxrpc_skb_traces[rxrpc_skb__nr_trace][7];
+
+enum rxrpc_conn_trace {
+ rxrpc_conn_new_client,
+ rxrpc_conn_new_service,
+ rxrpc_conn_queued,
+ rxrpc_conn_seen,
+ rxrpc_conn_got,
+ rxrpc_conn_put_client,
+ rxrpc_conn_put_service,
+ rxrpc_conn__nr_trace
+};
+
+extern const char rxrpc_conn_traces[rxrpc_conn__nr_trace][4];
+
+enum rxrpc_client_trace {
+ rxrpc_client_activate_chans,
+ rxrpc_client_alloc,
+ rxrpc_client_chan_activate,
+ rxrpc_client_chan_disconnect,
+ rxrpc_client_chan_pass,
+ rxrpc_client_chan_unstarted,
+ rxrpc_client_cleanup,
+ rxrpc_client_count,
+ rxrpc_client_discard,
+ rxrpc_client_duplicate,
+ rxrpc_client_exposed,
+ rxrpc_client_replace,
+ rxrpc_client_to_active,
+ rxrpc_client_to_culled,
+ rxrpc_client_to_idle,
+ rxrpc_client_to_inactive,
+ rxrpc_client_to_waiting,
+ rxrpc_client_uncount,
+ rxrpc_client__nr_trace
+};
+
+extern const char rxrpc_client_traces[rxrpc_client__nr_trace][7];
+extern const char rxrpc_conn_cache_states[RXRPC_CONN__NR_CACHE_STATES][5];
+
enum rxrpc_call_trace {
rxrpc_call_new_client,
rxrpc_call_new_service,
rxrpc_call_queued,
rxrpc_call_queued_ref,
rxrpc_call_seen,
+ rxrpc_call_connected,
+ rxrpc_call_release,
rxrpc_call_got,
rxrpc_call_got_userid,
rxrpc_call_got_kernel,
@@ -546,17 +665,130 @@
rxrpc_call_put_userid,
rxrpc_call_put_kernel,
rxrpc_call_put_noqueue,
+ rxrpc_call_error,
rxrpc_call__nr_trace
};
extern const char rxrpc_call_traces[rxrpc_call__nr_trace][4];
+enum rxrpc_transmit_trace {
+ rxrpc_transmit_wait,
+ rxrpc_transmit_queue,
+ rxrpc_transmit_queue_last,
+ rxrpc_transmit_rotate,
+ rxrpc_transmit_rotate_last,
+ rxrpc_transmit_await_reply,
+ rxrpc_transmit_end,
+ rxrpc_transmit__nr_trace
+};
+
+extern const char rxrpc_transmit_traces[rxrpc_transmit__nr_trace][4];
+
+enum rxrpc_receive_trace {
+ rxrpc_receive_incoming,
+ rxrpc_receive_queue,
+ rxrpc_receive_queue_last,
+ rxrpc_receive_front,
+ rxrpc_receive_rotate,
+ rxrpc_receive_end,
+ rxrpc_receive__nr_trace
+};
+
+extern const char rxrpc_receive_traces[rxrpc_receive__nr_trace][4];
+
+enum rxrpc_recvmsg_trace {
+ rxrpc_recvmsg_enter,
+ rxrpc_recvmsg_wait,
+ rxrpc_recvmsg_dequeue,
+ rxrpc_recvmsg_hole,
+ rxrpc_recvmsg_next,
+ rxrpc_recvmsg_cont,
+ rxrpc_recvmsg_full,
+ rxrpc_recvmsg_data_return,
+ rxrpc_recvmsg_terminal,
+ rxrpc_recvmsg_to_be_accepted,
+ rxrpc_recvmsg_return,
+ rxrpc_recvmsg__nr_trace
+};
+
+extern const char rxrpc_recvmsg_traces[rxrpc_recvmsg__nr_trace][5];
+
+enum rxrpc_rtt_tx_trace {
+ rxrpc_rtt_tx_ping,
+ rxrpc_rtt_tx_data,
+ rxrpc_rtt_tx__nr_trace
+};
+
+extern const char rxrpc_rtt_tx_traces[rxrpc_rtt_tx__nr_trace][5];
+
+enum rxrpc_rtt_rx_trace {
+ rxrpc_rtt_rx_ping_response,
+ rxrpc_rtt_rx_requested_ack,
+ rxrpc_rtt_rx__nr_trace
+};
+
+extern const char rxrpc_rtt_rx_traces[rxrpc_rtt_rx__nr_trace][5];
+
+enum rxrpc_timer_trace {
+ rxrpc_timer_begin,
+ rxrpc_timer_init_for_reply,
+ rxrpc_timer_expired,
+ rxrpc_timer_set_for_ack,
+ rxrpc_timer_set_for_resend,
+ rxrpc_timer_set_for_send,
+ rxrpc_timer__nr_trace
+};
+
+extern const char rxrpc_timer_traces[rxrpc_timer__nr_trace][8];
+
+enum rxrpc_propose_ack_trace {
+ rxrpc_propose_ack_client_tx_end,
+ rxrpc_propose_ack_input_data,
+ rxrpc_propose_ack_ping_for_lost_ack,
+ rxrpc_propose_ack_ping_for_lost_reply,
+ rxrpc_propose_ack_ping_for_params,
+ rxrpc_propose_ack_respond_to_ack,
+ rxrpc_propose_ack_respond_to_ping,
+ rxrpc_propose_ack_retry_tx,
+ rxrpc_propose_ack_rotate_rx,
+ rxrpc_propose_ack_terminal_ack,
+ rxrpc_propose_ack__nr_trace
+};
+
+enum rxrpc_propose_ack_outcome {
+ rxrpc_propose_ack_use,
+ rxrpc_propose_ack_update,
+ rxrpc_propose_ack_subsume,
+ rxrpc_propose_ack__nr_outcomes
+};
+
+extern const char rxrpc_propose_ack_traces[rxrpc_propose_ack__nr_trace][8];
+extern const char *const rxrpc_propose_ack_outcomes[rxrpc_propose_ack__nr_outcomes];
+
+enum rxrpc_congest_change {
+ rxrpc_cong_begin_retransmission,
+ rxrpc_cong_cleared_nacks,
+ rxrpc_cong_new_low_nack,
+ rxrpc_cong_no_change,
+ rxrpc_cong_progress,
+ rxrpc_cong_retransmit_again,
+ rxrpc_cong_rtt_window_end,
+ rxrpc_cong_saw_nack,
+ rxrpc_congest__nr_change
+};
+
+extern const char rxrpc_congest_modes[NR__RXRPC_CONGEST_MODES][10];
+extern const char rxrpc_congest_changes[rxrpc_congest__nr_change][9];
+
+extern const char *const rxrpc_pkts[];
+extern const char const rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4];
+
#include <trace/events/rxrpc.h>
/*
* af_rxrpc.c
*/
-extern atomic_t rxrpc_n_skbs;
+extern atomic_t rxrpc_n_tx_skbs, rxrpc_n_rx_skbs;
extern u32 rxrpc_epoch;
extern atomic_t rxrpc_debug_id;
extern struct workqueue_struct *rxrpc_workqueue;
@@ -577,7 +809,9 @@
/*
* call_event.c
*/
-void rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool, bool);
+void rxrpc_set_timer(struct rxrpc_call *, enum rxrpc_timer_trace);
+void rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool, bool,
+ enum rxrpc_propose_ack_trace);
void rxrpc_process_call(struct work_struct *);
/*
@@ -631,6 +865,7 @@
call->error = error;
call->completion = compl,
call->state = RXRPC_CALL_COMPLETE;
+ wake_up(&call->waitq);
return true;
}
return false;
@@ -728,7 +963,11 @@
void __rxrpc_disconnect_call(struct rxrpc_connection *, struct rxrpc_call *);
void rxrpc_disconnect_call(struct rxrpc_call *);
void rxrpc_kill_connection(struct rxrpc_connection *);
-void __rxrpc_put_connection(struct rxrpc_connection *);
+bool rxrpc_queue_conn(struct rxrpc_connection *);
+void rxrpc_see_connection(struct rxrpc_connection *);
+void rxrpc_get_connection(struct rxrpc_connection *);
+struct rxrpc_connection *rxrpc_get_connection_maybe(struct rxrpc_connection *);
+void rxrpc_put_service_conn(struct rxrpc_connection *);
void __exit rxrpc_destroy_all_connections(void);
static inline bool rxrpc_conn_is_client(const struct rxrpc_connection *conn)
@@ -741,38 +980,15 @@
return !rxrpc_conn_is_client(conn);
}
-static inline void rxrpc_get_connection(struct rxrpc_connection *conn)
-{
- atomic_inc(&conn->usage);
-}
-
-static inline
-struct rxrpc_connection *rxrpc_get_connection_maybe(struct rxrpc_connection *conn)
-{
- return atomic_inc_not_zero(&conn->usage) ? conn : NULL;
-}
-
static inline void rxrpc_put_connection(struct rxrpc_connection *conn)
{
if (!conn)
return;
- if (rxrpc_conn_is_client(conn)) {
- if (atomic_dec_and_test(&conn->usage))
- rxrpc_put_client_conn(conn);
- } else {
- if (atomic_dec_return(&conn->usage) == 1)
- __rxrpc_put_connection(conn);
- }
-}
-
-static inline bool rxrpc_queue_conn(struct rxrpc_connection *conn)
-{
- if (!rxrpc_get_connection_maybe(conn))
- return false;
- if (!rxrpc_queue_work(&conn->processor))
- rxrpc_put_connection(conn);
- return true;
+ if (rxrpc_conn_is_client(conn))
+ rxrpc_put_client_conn(conn);
+ else
+ rxrpc_put_service_conn(conn);
}
/*
@@ -851,16 +1067,13 @@
extern unsigned int rxrpc_rx_jumbo_max;
extern unsigned int rxrpc_resend_timeout;
-extern const char *const rxrpc_pkts[];
extern const s8 rxrpc_ack_priority[];
-extern const char *rxrpc_acks(u8 reason);
-
/*
* output.c
*/
int rxrpc_send_call_packet(struct rxrpc_call *, u8);
-int rxrpc_send_data_packet(struct rxrpc_connection *, struct sk_buff *);
+int rxrpc_send_data_packet(struct rxrpc_call *, struct sk_buff *);
void rxrpc_reject_packets(struct rxrpc_local *);
/*
@@ -868,6 +1081,8 @@
*/
void rxrpc_error_report(struct sock *);
void rxrpc_peer_error_distributor(struct work_struct *);
+void rxrpc_peer_add_rtt(struct rxrpc_call *, enum rxrpc_rtt_rx_trace,
+ rxrpc_serial_t, rxrpc_serial_t, ktime_t, ktime_t);
/*
* peer_object.c
@@ -936,10 +1151,11 @@
*/
void rxrpc_kernel_data_consumed(struct rxrpc_call *, struct sk_buff *);
void rxrpc_packet_destructor(struct sk_buff *);
-void rxrpc_new_skb(struct sk_buff *);
-void rxrpc_see_skb(struct sk_buff *);
-void rxrpc_get_skb(struct sk_buff *);
-void rxrpc_free_skb(struct sk_buff *);
+void rxrpc_new_skb(struct sk_buff *, enum rxrpc_skb_trace);
+void rxrpc_see_skb(struct sk_buff *, enum rxrpc_skb_trace);
+void rxrpc_get_skb(struct sk_buff *, enum rxrpc_skb_trace);
+void rxrpc_free_skb(struct sk_buff *, enum rxrpc_skb_trace);
+void rxrpc_lose_skb(struct sk_buff *, enum rxrpc_skb_trace);
void rxrpc_purge_queue(struct sk_buff_head *);
/*
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 26c293e..a8d39d7 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -85,6 +85,9 @@
b->conn_backlog[head] = conn;
smp_store_release(&b->conn_backlog_head,
(head + 1) & (size - 1));
+
+ trace_rxrpc_conn(conn, rxrpc_conn_new_service,
+ atomic_read(&conn->usage), here);
}
/* Now it gets complicated, because calls get registered with the
@@ -290,6 +293,7 @@
rxrpc_get_local(local);
conn->params.local = local;
conn->params.peer = peer;
+ rxrpc_see_connection(conn);
rxrpc_new_incoming_connection(conn, skb);
} else {
rxrpc_get_connection(conn);
@@ -363,12 +367,17 @@
goto out;
}
+ trace_rxrpc_receive(call, rxrpc_receive_incoming,
+ sp->hdr.serial, sp->hdr.seq);
+
/* Make the call live. */
rxrpc_incoming_call(rx, call, skb);
conn = call->conn;
if (rx->notify_new_call)
rx->notify_new_call(&rx->sk, call, call->user_call_ID);
+ else
+ sk_acceptq_added(&rx->sk);
spin_lock(&conn->state_lock);
switch (conn->state) {
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index 6143204..0e84780 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -24,28 +24,32 @@
/*
* Set the timer
*/
-static void rxrpc_set_timer(struct rxrpc_call *call)
+void rxrpc_set_timer(struct rxrpc_call *call, enum rxrpc_timer_trace why)
{
unsigned long t, now = jiffies;
- _enter("{%ld,%ld,%ld:%ld}",
- call->ack_at - now, call->resend_at - now, call->expire_at - now,
- call->timer.expires - now);
-
read_lock_bh(&call->state_lock);
if (call->state < RXRPC_CALL_COMPLETE) {
- t = call->ack_at;
- if (time_before(call->resend_at, t))
+ t = call->expire_at;
+ if (time_before_eq(t, now))
+ goto out;
+
+ if (time_after(call->resend_at, now) &&
+ time_before(call->resend_at, t))
t = call->resend_at;
- if (time_before(call->expire_at, t))
- t = call->expire_at;
- if (!timer_pending(&call->timer) ||
- time_before(t, call->timer.expires)) {
- _debug("set timer %ld", t - now);
+
+ if (time_after(call->ack_at, now) &&
+ time_before(call->ack_at, t))
+ t = call->ack_at;
+
+ if (call->timer.expires != t || !timer_pending(&call->timer)) {
mod_timer(&call->timer, t);
+ trace_rxrpc_timer(call, why, now);
}
}
+
+out:
read_unlock_bh(&call->state_lock);
}
@@ -54,14 +58,13 @@
*/
static void __rxrpc_propose_ACK(struct rxrpc_call *call, u8 ack_reason,
u16 skew, u32 serial, bool immediate,
- bool background)
+ bool background,
+ enum rxrpc_propose_ack_trace why)
{
+ enum rxrpc_propose_ack_outcome outcome = rxrpc_propose_ack_use;
unsigned long now, ack_at, expiry = rxrpc_soft_ack_delay;
s8 prior = rxrpc_ack_priority[ack_reason];
- _enter("{%d},%s,%%%x,%u",
- call->debug_id, rxrpc_acks(ack_reason), serial, immediate);
-
/* Update DELAY, IDLE, REQUESTED and PING_RESPONSE ACK serial
* numbers, but we don't alter the timeout.
*/
@@ -70,15 +73,18 @@
call->ackr_reason, rxrpc_ack_priority[call->ackr_reason]);
if (ack_reason == call->ackr_reason) {
if (RXRPC_ACK_UPDATEABLE & (1 << ack_reason)) {
+ outcome = rxrpc_propose_ack_update;
call->ackr_serial = serial;
call->ackr_skew = skew;
}
if (!immediate)
- return;
+ goto trace;
} else if (prior > rxrpc_ack_priority[call->ackr_reason]) {
call->ackr_reason = ack_reason;
call->ackr_serial = serial;
call->ackr_skew = skew;
+ } else {
+ outcome = rxrpc_propose_ack_subsume;
}
switch (ack_reason) {
@@ -94,6 +100,7 @@
expiry = rxrpc_soft_ack_delay;
break;
+ case RXRPC_ACK_PING:
case RXRPC_ACK_IDLE:
if (rxrpc_idle_ack_delay < expiry)
expiry = rxrpc_idle_ack_delay;
@@ -117,38 +124,52 @@
_debug("deferred ACK %ld < %ld", expiry, call->ack_at - now);
if (time_before(ack_at, call->ack_at)) {
call->ack_at = ack_at;
- rxrpc_set_timer(call);
+ rxrpc_set_timer(call, rxrpc_timer_set_for_ack);
}
}
+
+trace:
+ trace_rxrpc_propose_ack(call, why, ack_reason, serial, immediate,
+ background, outcome);
}
/*
* propose an ACK be sent, locking the call structure
*/
void rxrpc_propose_ACK(struct rxrpc_call *call, u8 ack_reason,
- u16 skew, u32 serial, bool immediate, bool background)
+ u16 skew, u32 serial, bool immediate, bool background,
+ enum rxrpc_propose_ack_trace why)
{
spin_lock_bh(&call->lock);
__rxrpc_propose_ACK(call, ack_reason, skew, serial,
- immediate, background);
+ immediate, background, why);
spin_unlock_bh(&call->lock);
}
/*
+ * Handle congestion being detected by the retransmit timeout.
+ */
+static void rxrpc_congestion_timeout(struct rxrpc_call *call)
+{
+ set_bit(RXRPC_CALL_RETRANS_TIMEOUT, &call->flags);
+}
+
+/*
* Perform retransmission of NAK'd and unack'd packets.
*/
static void rxrpc_resend(struct rxrpc_call *call)
{
- struct rxrpc_wire_header *whdr;
struct rxrpc_skb_priv *sp;
struct sk_buff *skb;
rxrpc_seq_t cursor, seq, top;
- unsigned long resend_at, now;
+ ktime_t now = ktime_get_real(), max_age, oldest, resend_at, ack_ts;
int ix;
- u8 annotation;
+ u8 annotation, anno_type, retrans = 0, unacked = 0;
_enter("{%d,%d}", call->tx_hard_ack, call->tx_top);
+ max_age = ktime_sub_ms(now, rxrpc_resend_timeout);
+
spin_lock_bh(&call->lock);
cursor = call->tx_hard_ack;
@@ -161,68 +182,89 @@
* the packets in the Tx buffer we're going to resend and what the new
* resend timeout will be.
*/
- now = jiffies;
- resend_at = now + rxrpc_resend_timeout;
- seq = cursor + 1;
- do {
+ oldest = now;
+ for (seq = cursor + 1; before_eq(seq, top); seq++) {
ix = seq & RXRPC_RXTX_BUFF_MASK;
annotation = call->rxtx_annotations[ix];
- if (annotation == RXRPC_TX_ANNO_ACK)
+ anno_type = annotation & RXRPC_TX_ANNO_MASK;
+ annotation &= ~RXRPC_TX_ANNO_MASK;
+ if (anno_type == RXRPC_TX_ANNO_ACK)
continue;
skb = call->rxtx_buffer[ix];
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_tx_seen);
sp = rxrpc_skb(skb);
- if (annotation == RXRPC_TX_ANNO_UNACK) {
- if (time_after(sp->resend_at, now)) {
- if (time_before(sp->resend_at, resend_at))
- resend_at = sp->resend_at;
+ if (anno_type == RXRPC_TX_ANNO_UNACK) {
+ if (ktime_after(skb->tstamp, max_age)) {
+ if (ktime_before(skb->tstamp, oldest))
+ oldest = skb->tstamp;
continue;
}
+ if (!(annotation & RXRPC_TX_ANNO_RESENT))
+ unacked++;
}
/* Okay, we need to retransmit a packet. */
- call->rxtx_annotations[ix] = RXRPC_TX_ANNO_RETRANS;
- seq++;
- } while (before_eq(seq, top));
+ call->rxtx_annotations[ix] = RXRPC_TX_ANNO_RETRANS | annotation;
+ retrans++;
+ trace_rxrpc_retransmit(call, seq, annotation | anno_type,
+ ktime_to_ns(ktime_sub(skb->tstamp, max_age)));
+ }
- call->resend_at = resend_at;
+ resend_at = ktime_add_ms(oldest, rxrpc_resend_timeout);
+ call->resend_at = jiffies +
+ nsecs_to_jiffies(ktime_to_ns(ktime_sub(resend_at, now))) +
+ 1; /* We have to make sure that the calculated jiffies value
+ * falls at or after the nsec value, or we shall loop
+ * ceaselessly because the timer times out, but we haven't
+ * reached the nsec timeout yet.
+ */
+
+ if (unacked)
+ rxrpc_congestion_timeout(call);
+
+ /* If there was nothing that needed retransmission then it's likely
+ * that an ACK got lost somewhere. Send a ping to find out instead of
+ * retransmitting data.
+ */
+ if (!retrans) {
+ rxrpc_set_timer(call, rxrpc_timer_set_for_resend);
+ spin_unlock_bh(&call->lock);
+ ack_ts = ktime_sub(now, call->acks_latest_ts);
+ if (ktime_to_ns(ack_ts) < call->peer->rtt)
+ goto out;
+ rxrpc_propose_ACK(call, RXRPC_ACK_PING, 0, 0, true, false,
+ rxrpc_propose_ack_ping_for_lost_ack);
+ rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
+ goto out;
+ }
/* Now go through the Tx window and perform the retransmissions. We
* have to drop the lock for each send. If an ACK comes in whilst the
* lock is dropped, it may clear some of the retransmission markers for
* packets that it soft-ACKs.
*/
- seq = cursor + 1;
- do {
+ for (seq = cursor + 1; before_eq(seq, top); seq++) {
ix = seq & RXRPC_RXTX_BUFF_MASK;
annotation = call->rxtx_annotations[ix];
- if (annotation != RXRPC_TX_ANNO_RETRANS)
+ anno_type = annotation & RXRPC_TX_ANNO_MASK;
+ if (anno_type != RXRPC_TX_ANNO_RETRANS)
continue;
skb = call->rxtx_buffer[ix];
- rxrpc_get_skb(skb);
+ rxrpc_get_skb(skb, rxrpc_skb_tx_got);
spin_unlock_bh(&call->lock);
- sp = rxrpc_skb(skb);
- /* Each Tx packet needs a new serial number */
- sp->hdr.serial = atomic_inc_return(&call->conn->serial);
-
- whdr = (struct rxrpc_wire_header *)skb->head;
- whdr->serial = htonl(sp->hdr.serial);
-
- if (rxrpc_send_data_packet(call->conn, skb) < 0) {
- call->resend_at = now + 2;
- rxrpc_free_skb(skb);
+ if (rxrpc_send_data_packet(call, skb) < 0) {
+ rxrpc_free_skb(skb, rxrpc_skb_tx_freed);
return;
}
if (rxrpc_is_client_call(call))
rxrpc_expose_client_call(call);
- sp->resend_at = now + rxrpc_resend_timeout;
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_tx_freed);
spin_lock_bh(&call->lock);
/* We need to clear the retransmit state, but there are two
@@ -230,18 +272,25 @@
* received and the packet might have been hard-ACK'd (in which
* case it will no longer be in the buffer).
*/
- if (after(seq, call->tx_hard_ack) &&
- (call->rxtx_annotations[ix] == RXRPC_TX_ANNO_RETRANS ||
- call->rxtx_annotations[ix] == RXRPC_TX_ANNO_NAK))
- call->rxtx_annotations[ix] = RXRPC_TX_ANNO_UNACK;
+ if (after(seq, call->tx_hard_ack)) {
+ annotation = call->rxtx_annotations[ix];
+ anno_type = annotation & RXRPC_TX_ANNO_MASK;
+ if (anno_type == RXRPC_TX_ANNO_RETRANS ||
+ anno_type == RXRPC_TX_ANNO_NAK) {
+ annotation &= ~RXRPC_TX_ANNO_MASK;
+ annotation |= RXRPC_TX_ANNO_UNACK;
+ }
+ annotation |= RXRPC_TX_ANNO_RESENT;
+ call->rxtx_annotations[ix] = annotation;
+ }
if (after(call->tx_hard_ack, seq))
seq = call->tx_hard_ack;
- seq++;
- } while (before_eq(seq, top));
+ }
out_unlock:
spin_unlock_bh(&call->lock);
+out:
_leave("");
}
@@ -275,6 +324,7 @@
if (time_after_eq(now, call->expire_at)) {
rxrpc_abort_call("EXP", call, 0, RX_CALL_TIMEOUT, ETIME);
set_bit(RXRPC_CALL_EV_ABORT, &call->events);
+ goto recheck_state;
}
if (test_and_clear_bit(RXRPC_CALL_EV_ACK, &call->events) ||
@@ -292,7 +342,7 @@
goto recheck_state;
}
- rxrpc_set_timer(call);
+ rxrpc_set_timer(call, rxrpc_timer_set_for_resend);
/* other events may have been raised since we started checking */
if (call->events && call->state < RXRPC_CALL_COMPLETE) {
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 22f9b0d..d4b3293 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -53,6 +53,8 @@
[rxrpc_call_new_service] = "NWs",
[rxrpc_call_queued] = "QUE",
[rxrpc_call_queued_ref] = "QUR",
+ [rxrpc_call_connected] = "CON",
+ [rxrpc_call_release] = "RLS",
[rxrpc_call_seen] = "SEE",
[rxrpc_call_got] = "GOT",
[rxrpc_call_got_userid] = "Gus",
@@ -61,6 +63,7 @@
[rxrpc_call_put_userid] = "Pus",
[rxrpc_call_put_kernel] = "Pke",
[rxrpc_call_put_noqueue] = "PNQ",
+ [rxrpc_call_error] = "*E*",
};
struct kmem_cache *rxrpc_call_jar;
@@ -73,8 +76,10 @@
_enter("%d", call->debug_id);
- if (call->state < RXRPC_CALL_COMPLETE)
+ if (call->state < RXRPC_CALL_COMPLETE) {
+ trace_rxrpc_timer(call, rxrpc_timer_expired, jiffies);
rxrpc_queue_call(call);
+ }
}
/*
@@ -155,6 +160,14 @@
call->rx_winsize = rxrpc_rx_window_size;
call->tx_winsize = 16;
call->rx_expect_next = 1;
+
+ if (RXRPC_TX_SMSS > 2190)
+ call->cong_cwnd = 2;
+ else if (RXRPC_TX_SMSS > 1095)
+ call->cong_cwnd = 3;
+ else
+ call->cong_cwnd = 4;
+ call->cong_ssthresh = RXRPC_RXTX_BUFF_SIZE - 1;
return call;
nomem_2:
@@ -171,6 +184,7 @@
gfp_t gfp)
{
struct rxrpc_call *call;
+ ktime_t now;
_enter("");
@@ -179,6 +193,10 @@
return ERR_PTR(-ENOMEM);
call->state = RXRPC_CALL_CLIENT_AWAIT_CONN;
call->service_id = srx->srx_service;
+ call->tx_phase = true;
+ now = ktime_get_real();
+ call->acks_latest_ts = now;
+ call->cong_tstamp = now;
_leave(" = %p", call);
return call;
@@ -195,8 +213,8 @@
call->expire_at = expire_at;
call->ack_at = expire_at;
call->resend_at = expire_at;
- call->timer.expires = expire_at;
- add_timer(&call->timer);
+ call->timer.expires = expire_at + 1;
+ rxrpc_set_timer(call, rxrpc_timer_begin);
}
/*
@@ -222,13 +240,10 @@
return call;
}
- trace_rxrpc_call(call, 0, atomic_read(&call->usage), here,
- (const void *)user_call_ID);
+ trace_rxrpc_call(call, rxrpc_call_new_client, atomic_read(&call->usage),
+ here, (const void *)user_call_ID);
/* Publish the call, even though it is incompletely set up as yet */
- call->user_call_ID = user_call_ID;
- __set_bit(RXRPC_CALL_HAS_USERID, &call->flags);
-
write_lock(&rx->call_lock);
pp = &rx->calls.rb_node;
@@ -242,10 +257,12 @@
else if (user_call_ID > xcall->user_call_ID)
pp = &(*pp)->rb_right;
else
- goto found_user_ID_now_present;
+ goto error_dup_user_ID;
}
rcu_assign_pointer(call->socket, rx);
+ call->user_call_ID = user_call_ID;
+ __set_bit(RXRPC_CALL_HAS_USERID, &call->flags);
rxrpc_get_call(call, rxrpc_call_got_userid);
rb_link_node(&call->sock_node, parent, pp);
rb_insert_color(&call->sock_node, &rx->calls);
@@ -264,6 +281,9 @@
if (ret < 0)
goto error;
+ trace_rxrpc_call(call, rxrpc_call_connected, atomic_read(&call->usage),
+ here, ERR_PTR(ret));
+
spin_lock_bh(&call->conn->params.peer->lock);
hlist_add_head(&call->error_link,
&call->conn->params.peer->error_targets);
@@ -276,33 +296,24 @@
_leave(" = %p [new]", call);
return call;
-error:
- write_lock(&rx->call_lock);
- rb_erase(&call->sock_node, &rx->calls);
- write_unlock(&rx->call_lock);
- rxrpc_put_call(call, rxrpc_call_put_userid);
-
- write_lock(&rxrpc_call_lock);
- list_del_init(&call->link);
- write_unlock(&rxrpc_call_lock);
-
-error_out:
- __rxrpc_set_call_completion(call, RXRPC_CALL_LOCAL_ERROR,
- RX_CALL_DEAD, ret);
- set_bit(RXRPC_CALL_RELEASED, &call->flags);
- rxrpc_put_call(call, rxrpc_call_put);
- _leave(" = %d", ret);
- return ERR_PTR(ret);
-
/* We unexpectedly found the user ID in the list after taking
* the call_lock. This shouldn't happen unless the user races
* with itself and tries to add the same user ID twice at the
* same time in different threads.
*/
-found_user_ID_now_present:
+error_dup_user_ID:
write_unlock(&rx->call_lock);
ret = -EEXIST;
- goto error_out;
+
+error:
+ __rxrpc_set_call_completion(call, RXRPC_CALL_LOCAL_ERROR,
+ RX_CALL_DEAD, ret);
+ trace_rxrpc_call(call, rxrpc_call_error, atomic_read(&call->usage),
+ here, ERR_PTR(ret));
+ rxrpc_release_call(rx, call);
+ rxrpc_put_call(call, rxrpc_call_put);
+ _leave(" = %d", ret);
+ return ERR_PTR(ret);
}
/*
@@ -326,6 +337,7 @@
call->state = RXRPC_CALL_SERVER_ACCEPTING;
if (sp->hdr.securityIndex > 0)
call->state = RXRPC_CALL_SERVER_SECURING;
+ call->cong_tstamp = skb->tstamp;
/* Set the channel for this call. We don't get channel_lock as we're
* only defending against the data_ready handler (which we're called
@@ -408,15 +420,17 @@
*/
void rxrpc_release_call(struct rxrpc_sock *rx, struct rxrpc_call *call)
{
+ const void *here = __builtin_return_address(0);
struct rxrpc_connection *conn = call->conn;
bool put = false;
int i;
_enter("{%d,%d}", call->debug_id, atomic_read(&call->usage));
- ASSERTCMP(call->state, ==, RXRPC_CALL_COMPLETE);
+ trace_rxrpc_call(call, rxrpc_call_release, atomic_read(&call->usage),
+ here, (const void *)call->flags);
- rxrpc_see_call(call);
+ ASSERTCMP(call->state, ==, RXRPC_CALL_COMPLETE);
spin_lock_bh(&call->lock);
if (test_and_set_bit(RXRPC_CALL_RELEASED, &call->flags))
@@ -460,7 +474,9 @@
rxrpc_disconnect_call(call);
for (i = 0; i < RXRPC_RXTX_BUFF_SIZE; i++) {
- rxrpc_free_skb(call->rxtx_buffer[i]);
+ rxrpc_free_skb(call->rxtx_buffer[i],
+ (call->tx_phase ? rxrpc_skb_tx_cleaned :
+ rxrpc_skb_rx_cleaned));
call->rxtx_buffer[i] = NULL;
}
@@ -476,6 +492,14 @@
_enter("%p", rx);
+ while (!list_empty(&rx->to_be_accepted)) {
+ call = list_entry(rx->to_be_accepted.next,
+ struct rxrpc_call, accept_link);
+ list_del(&call->accept_link);
+ rxrpc_abort_call("SKR", call, 0, RX_CALL_DEAD, ECONNRESET);
+ rxrpc_put_call(call, rxrpc_call_put);
+ }
+
while (!list_empty(&rx->sock_calls)) {
call = list_entry(rx->sock_calls.next,
struct rxrpc_call, sock_link);
@@ -546,9 +570,11 @@
/* Clean up the Rx/Tx buffer */
for (i = 0; i < RXRPC_RXTX_BUFF_SIZE; i++)
- rxrpc_free_skb(call->rxtx_buffer[i]);
+ rxrpc_free_skb(call->rxtx_buffer[i],
+ (call->tx_phase ? rxrpc_skb_tx_cleaned :
+ rxrpc_skb_rx_cleaned));
- rxrpc_free_skb(call->tx_pending);
+ rxrpc_free_skb(call->tx_pending, rxrpc_skb_tx_cleaned);
call_rcu(&call->rcu, rxrpc_rcu_destroy_call);
}
diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c
index 9344a84..c76a125 100644
--- a/net/rxrpc/conn_client.c
+++ b/net/rxrpc/conn_client.c
@@ -105,6 +105,14 @@
static DECLARE_DELAYED_WORK(rxrpc_client_conn_reap,
rxrpc_discard_expired_client_conns);
+const char rxrpc_conn_cache_states[RXRPC_CONN__NR_CACHE_STATES][5] = {
+ [RXRPC_CONN_CLIENT_INACTIVE] = "Inac",
+ [RXRPC_CONN_CLIENT_WAITING] = "Wait",
+ [RXRPC_CONN_CLIENT_ACTIVE] = "Actv",
+ [RXRPC_CONN_CLIENT_CULLED] = "Cull",
+ [RXRPC_CONN_CLIENT_IDLE] = "Idle",
+};
+
/*
* Get a connection ID and epoch for a client connection from the global pool.
* The connection struct pointer is then recorded in the idr radix tree. The
@@ -220,6 +228,9 @@
rxrpc_get_local(conn->params.local);
key_get(conn->params.key);
+ trace_rxrpc_conn(conn, rxrpc_conn_new_client, atomic_read(&conn->usage),
+ __builtin_return_address(0));
+ trace_rxrpc_client(conn, -1, rxrpc_client_alloc);
_leave(" = %p", conn);
return conn;
@@ -385,6 +396,7 @@
rb_replace_node(&conn->client_node,
&candidate->client_node,
&local->client_conns);
+ trace_rxrpc_client(conn, -1, rxrpc_client_replace);
goto candidate_published;
}
}
@@ -409,8 +421,11 @@
_debug("found conn");
spin_unlock(&local->client_conns_lock);
- rxrpc_put_connection(candidate);
- candidate = NULL;
+ if (candidate) {
+ trace_rxrpc_client(candidate, -1, rxrpc_client_duplicate);
+ rxrpc_put_connection(candidate);
+ candidate = NULL;
+ }
spin_lock(&conn->channel_lock);
call->conn = conn;
@@ -433,6 +448,7 @@
*/
static void rxrpc_activate_conn(struct rxrpc_connection *conn)
{
+ trace_rxrpc_client(conn, -1, rxrpc_client_to_active);
conn->cache_state = RXRPC_CONN_CLIENT_ACTIVE;
rxrpc_nr_active_client_conns++;
list_move_tail(&conn->cache_link, &rxrpc_active_client_conns);
@@ -462,8 +478,10 @@
spin_lock(&rxrpc_client_conn_cache_lock);
nr_conns = rxrpc_nr_client_conns;
- if (!test_and_set_bit(RXRPC_CONN_COUNTED, &conn->flags))
+ if (!test_and_set_bit(RXRPC_CONN_COUNTED, &conn->flags)) {
+ trace_rxrpc_client(conn, -1, rxrpc_client_count);
rxrpc_nr_client_conns = nr_conns + 1;
+ }
switch (conn->cache_state) {
case RXRPC_CONN_CLIENT_ACTIVE:
@@ -494,6 +512,7 @@
wait_for_capacity:
_debug("wait");
+ trace_rxrpc_client(conn, -1, rxrpc_client_to_waiting);
conn->cache_state = RXRPC_CONN_CLIENT_WAITING;
list_move_tail(&conn->cache_link, &rxrpc_waiting_client_conns);
goto out_unlock;
@@ -524,6 +543,8 @@
struct rxrpc_call, chan_wait_link);
u32 call_id = chan->call_counter + 1;
+ trace_rxrpc_client(conn, channel, rxrpc_client_chan_activate);
+
write_lock_bh(&call->state_lock);
call->state = RXRPC_CALL_CLIENT_SEND_REQUEST;
write_unlock_bh(&call->state_lock);
@@ -563,6 +584,8 @@
_enter("%d", conn->debug_id);
+ trace_rxrpc_client(conn, -1, rxrpc_client_activate_chans);
+
if (conn->cache_state != RXRPC_CONN_CLIENT_ACTIVE ||
conn->active_chans == RXRPC_ACTIVE_CHANS_MASK)
return;
@@ -657,10 +680,13 @@
* had a chance at re-use (the per-connection security negotiation is
* expensive).
*/
-static void rxrpc_expose_client_conn(struct rxrpc_connection *conn)
+static void rxrpc_expose_client_conn(struct rxrpc_connection *conn,
+ unsigned int channel)
{
- if (!test_and_set_bit(RXRPC_CONN_EXPOSED, &conn->flags))
+ if (!test_and_set_bit(RXRPC_CONN_EXPOSED, &conn->flags)) {
+ trace_rxrpc_client(conn, channel, rxrpc_client_exposed);
rxrpc_get_connection(conn);
+ }
}
/*
@@ -669,9 +695,9 @@
*/
void rxrpc_expose_client_call(struct rxrpc_call *call)
{
+ unsigned int channel = call->cid & RXRPC_CHANNELMASK;
struct rxrpc_connection *conn = call->conn;
- struct rxrpc_channel *chan =
- &conn->channels[call->cid & RXRPC_CHANNELMASK];
+ struct rxrpc_channel *chan = &conn->channels[channel];
if (!test_and_set_bit(RXRPC_CALL_EXPOSED, &call->flags)) {
/* Mark the call ID as being used. If the callNumber counter
@@ -682,7 +708,7 @@
chan->call_counter++;
if (chan->call_counter >= INT_MAX)
set_bit(RXRPC_CONN_DONT_REUSE, &conn->flags);
- rxrpc_expose_client_conn(conn);
+ rxrpc_expose_client_conn(conn, channel);
}
}
@@ -695,6 +721,7 @@
struct rxrpc_connection *conn = call->conn;
struct rxrpc_channel *chan = &conn->channels[channel];
+ trace_rxrpc_client(conn, channel, rxrpc_client_chan_disconnect);
call->conn = NULL;
spin_lock(&conn->channel_lock);
@@ -709,6 +736,8 @@
ASSERT(!test_bit(RXRPC_CALL_EXPOSED, &call->flags));
list_del_init(&call->chan_wait_link);
+ trace_rxrpc_client(conn, channel, rxrpc_client_chan_unstarted);
+
/* We must deactivate or idle the connection if it's now
* waiting for nothing.
*/
@@ -721,7 +750,6 @@
}
ASSERTCMP(rcu_access_pointer(chan->call), ==, call);
- ASSERTCMP(atomic_read(&conn->usage), >=, 2);
/* If a client call was exposed to the world, we save the result for
* retransmission.
@@ -740,7 +768,7 @@
/* See if we can pass the channel directly to another call. */
if (conn->cache_state == RXRPC_CONN_CLIENT_ACTIVE &&
!list_empty(&conn->waiting_calls)) {
- _debug("pass chan");
+ trace_rxrpc_client(conn, channel, rxrpc_client_chan_pass);
rxrpc_activate_one_channel(conn, channel);
goto out_2;
}
@@ -763,7 +791,7 @@
goto out;
}
- _debug("pass chan 2");
+ trace_rxrpc_client(conn, channel, rxrpc_client_chan_pass);
rxrpc_activate_one_channel(conn, channel);
goto out;
@@ -795,7 +823,7 @@
* immediately or moved to the idle list for a short while.
*/
if (test_bit(RXRPC_CONN_EXPOSED, &conn->flags)) {
- _debug("make idle");
+ trace_rxrpc_client(conn, channel, rxrpc_client_to_idle);
conn->idle_timestamp = jiffies;
conn->cache_state = RXRPC_CONN_CLIENT_IDLE;
list_move_tail(&conn->cache_link, &rxrpc_idle_client_conns);
@@ -805,7 +833,7 @@
&rxrpc_client_conn_reap,
rxrpc_conn_idle_client_expiry);
} else {
- _debug("make inactive");
+ trace_rxrpc_client(conn, channel, rxrpc_client_to_inactive);
conn->cache_state = RXRPC_CONN_CLIENT_INACTIVE;
list_del_init(&conn->cache_link);
}
@@ -818,10 +846,12 @@
static struct rxrpc_connection *
rxrpc_put_one_client_conn(struct rxrpc_connection *conn)
{
- struct rxrpc_connection *next;
+ struct rxrpc_connection *next = NULL;
struct rxrpc_local *local = conn->params.local;
unsigned int nr_conns;
+ trace_rxrpc_client(conn, -1, rxrpc_client_cleanup);
+
if (test_bit(RXRPC_CONN_IN_CLIENT_CONNS, &conn->flags)) {
spin_lock(&local->client_conns_lock);
if (test_and_clear_bit(RXRPC_CONN_IN_CLIENT_CONNS,
@@ -834,24 +864,23 @@
ASSERTCMP(conn->cache_state, ==, RXRPC_CONN_CLIENT_INACTIVE);
- if (!test_bit(RXRPC_CONN_COUNTED, &conn->flags))
- return NULL;
+ if (test_bit(RXRPC_CONN_COUNTED, &conn->flags)) {
+ trace_rxrpc_client(conn, -1, rxrpc_client_uncount);
+ spin_lock(&rxrpc_client_conn_cache_lock);
+ nr_conns = --rxrpc_nr_client_conns;
- spin_lock(&rxrpc_client_conn_cache_lock);
- nr_conns = --rxrpc_nr_client_conns;
+ if (nr_conns < rxrpc_max_client_connections &&
+ !list_empty(&rxrpc_waiting_client_conns)) {
+ next = list_entry(rxrpc_waiting_client_conns.next,
+ struct rxrpc_connection, cache_link);
+ rxrpc_get_connection(next);
+ rxrpc_activate_conn(next);
+ }
- next = NULL;
- if (nr_conns < rxrpc_max_client_connections &&
- !list_empty(&rxrpc_waiting_client_conns)) {
- next = list_entry(rxrpc_waiting_client_conns.next,
- struct rxrpc_connection, cache_link);
- rxrpc_get_connection(next);
- rxrpc_activate_conn(next);
+ spin_unlock(&rxrpc_client_conn_cache_lock);
}
- spin_unlock(&rxrpc_client_conn_cache_lock);
rxrpc_kill_connection(conn);
-
if (next)
rxrpc_activate_channels(next);
@@ -866,20 +895,18 @@
*/
void rxrpc_put_client_conn(struct rxrpc_connection *conn)
{
- struct rxrpc_connection *next;
+ const void *here = __builtin_return_address(0);
+ int n;
do {
- _enter("%p{u=%d,d=%d}",
- conn, atomic_read(&conn->usage), conn->debug_id);
+ n = atomic_dec_return(&conn->usage);
+ trace_rxrpc_conn(conn, rxrpc_conn_put_client, n, here);
+ if (n > 0)
+ return;
+ ASSERTCMP(n, >=, 0);
- next = rxrpc_put_one_client_conn(conn);
-
- if (!next)
- break;
- conn = next;
- } while (atomic_dec_and_test(&conn->usage));
-
- _leave("");
+ conn = rxrpc_put_one_client_conn(conn);
+ } while (conn);
}
/*
@@ -910,9 +937,11 @@
ASSERTCMP(conn->cache_state, ==, RXRPC_CONN_CLIENT_ACTIVE);
if (list_empty(&conn->waiting_calls)) {
+ trace_rxrpc_client(conn, -1, rxrpc_client_to_culled);
conn->cache_state = RXRPC_CONN_CLIENT_CULLED;
list_del_init(&conn->cache_link);
} else {
+ trace_rxrpc_client(conn, -1, rxrpc_client_to_waiting);
conn->cache_state = RXRPC_CONN_CLIENT_WAITING;
list_move_tail(&conn->cache_link,
&rxrpc_waiting_client_conns);
@@ -986,7 +1015,7 @@
goto not_yet_expired;
}
- _debug("discard conn %d", conn->debug_id);
+ trace_rxrpc_client(conn, -1, rxrpc_client_discard);
if (!test_and_clear_bit(RXRPC_CONN_EXPOSED, &conn->flags))
BUG();
conn->cache_state = RXRPC_CONN_CLIENT_INACTIVE;
diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c
index 0691007..37609ce 100644
--- a/net/rxrpc/conn_event.c
+++ b/net/rxrpc/conn_event.c
@@ -97,6 +97,7 @@
pkt.info.maxMTU = htonl(mtu);
pkt.info.rwind = htonl(rxrpc_rx_window_size);
pkt.info.jumbo_max = htonl(rxrpc_rx_jumbo_max);
+ pkt.whdr.flags |= RXRPC_SLOW_START_OK;
len += sizeof(pkt.ack) + sizeof(pkt.info);
break;
}
@@ -119,6 +120,8 @@
_proto("Tx ABORT %%%u { %d } [re]", serial, conn->local_abort);
break;
case RXRPC_PACKET_TYPE_ACK:
+ trace_rxrpc_tx_ack(NULL, serial, chan->last_seq, 0,
+ RXRPC_ACK_DUPLICATE, 0);
_proto("Tx ACK %%%u [re]", serial);
break;
}
@@ -377,7 +380,7 @@
u32 abort_code = RX_PROTOCOL_ERROR;
int ret;
- _enter("{%d}", conn->debug_id);
+ rxrpc_see_connection(conn);
if (test_and_clear_bit(RXRPC_CONN_EV_CHALLENGE, &conn->events))
rxrpc_secure_connection(conn);
@@ -385,7 +388,7 @@
/* go through the conn-level event packets, releasing the ref on this
* connection that each one has when we've finished with it */
while ((skb = skb_dequeue(&conn->rx_queue))) {
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_rx_seen);
ret = rxrpc_process_event(conn, skb, &abort_code);
switch (ret) {
case -EPROTO:
@@ -396,7 +399,7 @@
goto requeue_and_leave;
case -ECONNABORTED:
default:
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
break;
}
}
@@ -413,7 +416,7 @@
protocol_error:
if (rxrpc_abort_connection(conn, -ret, abort_code) < 0)
goto requeue_and_leave;
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
_leave(" [EPROTO]");
goto out;
}
diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c
index bb1f292..e1e83af 100644
--- a/net/rxrpc/conn_object.c
+++ b/net/rxrpc/conn_object.c
@@ -53,7 +53,6 @@
spin_lock_init(&conn->state_lock);
conn->debug_id = atomic_inc_return(&rxrpc_debug_id);
conn->size_align = 4;
- conn->header_size = sizeof(struct rxrpc_wire_header);
conn->idle_timestamp = jiffies;
}
@@ -246,11 +245,77 @@
}
/*
- * release a virtual connection
+ * Queue a connection's work processor, getting a ref to pass to the work
+ * queue.
*/
-void __rxrpc_put_connection(struct rxrpc_connection *conn)
+bool rxrpc_queue_conn(struct rxrpc_connection *conn)
{
- rxrpc_queue_delayed_work(&rxrpc_connection_reap, 0);
+ const void *here = __builtin_return_address(0);
+ int n = __atomic_add_unless(&conn->usage, 1, 0);
+ if (n == 0)
+ return false;
+ if (rxrpc_queue_work(&conn->processor))
+ trace_rxrpc_conn(conn, rxrpc_conn_queued, n + 1, here);
+ else
+ rxrpc_put_connection(conn);
+ return true;
+}
+
+/*
+ * Note the re-emergence of a connection.
+ */
+void rxrpc_see_connection(struct rxrpc_connection *conn)
+{
+ const void *here = __builtin_return_address(0);
+ if (conn) {
+ int n = atomic_read(&conn->usage);
+
+ trace_rxrpc_conn(conn, rxrpc_conn_seen, n, here);
+ }
+}
+
+/*
+ * Get a ref on a connection.
+ */
+void rxrpc_get_connection(struct rxrpc_connection *conn)
+{
+ const void *here = __builtin_return_address(0);
+ int n = atomic_inc_return(&conn->usage);
+
+ trace_rxrpc_conn(conn, rxrpc_conn_got, n, here);
+}
+
+/*
+ * Try to get a ref on a connection.
+ */
+struct rxrpc_connection *
+rxrpc_get_connection_maybe(struct rxrpc_connection *conn)
+{
+ const void *here = __builtin_return_address(0);
+
+ if (conn) {
+ int n = __atomic_add_unless(&conn->usage, 1, 0);
+ if (n > 0)
+ trace_rxrpc_conn(conn, rxrpc_conn_got, n + 1, here);
+ else
+ conn = NULL;
+ }
+ return conn;
+}
+
+/*
+ * Release a service connection
+ */
+void rxrpc_put_service_conn(struct rxrpc_connection *conn)
+{
+ const void *here = __builtin_return_address(0);
+ int n;
+
+ n = atomic_dec_return(&conn->usage);
+ trace_rxrpc_conn(conn, rxrpc_conn_put_service, n, here);
+ ASSERTCMP(n, >=, 0);
+ if (n == 0)
+ rxrpc_queue_delayed_work(&rxrpc_connection_reap, 0);
}
/*
diff --git a/net/rxrpc/conn_service.c b/net/rxrpc/conn_service.c
index 83d54da..eef551f 100644
--- a/net/rxrpc/conn_service.c
+++ b/net/rxrpc/conn_service.c
@@ -136,6 +136,10 @@
list_add_tail(&conn->link, &rxrpc_connections);
list_add_tail(&conn->proc_link, &rxrpc_connection_proc_list);
write_unlock(&rxrpc_connection_lock);
+
+ trace_rxrpc_conn(conn, rxrpc_conn_new_service,
+ atomic_read(&conn->usage),
+ __builtin_return_address(0));
}
return conn;
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 75af0bd..094720d 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -37,12 +37,198 @@
}
/*
+ * Do TCP-style congestion management [RFC 5681].
+ */
+static void rxrpc_congestion_management(struct rxrpc_call *call,
+ struct sk_buff *skb,
+ struct rxrpc_ack_summary *summary)
+{
+ enum rxrpc_congest_change change = rxrpc_cong_no_change;
+ struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+ unsigned int cumulative_acks = call->cong_cumul_acks;
+ unsigned int cwnd = call->cong_cwnd;
+ bool resend = false;
+
+ summary->flight_size =
+ (call->tx_top - call->tx_hard_ack) - summary->nr_acks;
+
+ if (test_and_clear_bit(RXRPC_CALL_RETRANS_TIMEOUT, &call->flags)) {
+ summary->retrans_timeo = true;
+ call->cong_ssthresh = max_t(unsigned int,
+ summary->flight_size / 2, 2);
+ cwnd = 1;
+ if (cwnd > call->cong_ssthresh &&
+ call->cong_mode == RXRPC_CALL_SLOW_START) {
+ call->cong_mode = RXRPC_CALL_CONGEST_AVOIDANCE;
+ call->cong_tstamp = skb->tstamp;
+ cumulative_acks = 0;
+ }
+ }
+
+ cumulative_acks += summary->nr_new_acks;
+ cumulative_acks += summary->nr_rot_new_acks;
+ if (cumulative_acks > 255)
+ cumulative_acks = 255;
+
+ summary->mode = call->cong_mode;
+ summary->cwnd = call->cong_cwnd;
+ summary->ssthresh = call->cong_ssthresh;
+ summary->cumulative_acks = cumulative_acks;
+ summary->dup_acks = call->cong_dup_acks;
+
+ switch (call->cong_mode) {
+ case RXRPC_CALL_SLOW_START:
+ if (summary->nr_nacks > 0)
+ goto packet_loss_detected;
+ if (summary->cumulative_acks > 0)
+ cwnd += 1;
+ if (cwnd > call->cong_ssthresh) {
+ call->cong_mode = RXRPC_CALL_CONGEST_AVOIDANCE;
+ call->cong_tstamp = skb->tstamp;
+ }
+ goto out;
+
+ case RXRPC_CALL_CONGEST_AVOIDANCE:
+ if (summary->nr_nacks > 0)
+ goto packet_loss_detected;
+
+ /* We analyse the number of packets that get ACK'd per RTT
+ * period and increase the window if we managed to fill it.
+ */
+ if (call->peer->rtt_usage == 0)
+ goto out;
+ if (ktime_before(skb->tstamp,
+ ktime_add_ns(call->cong_tstamp,
+ call->peer->rtt)))
+ goto out_no_clear_ca;
+ change = rxrpc_cong_rtt_window_end;
+ call->cong_tstamp = skb->tstamp;
+ if (cumulative_acks >= cwnd)
+ cwnd++;
+ goto out;
+
+ case RXRPC_CALL_PACKET_LOSS:
+ if (summary->nr_nacks == 0)
+ goto resume_normality;
+
+ if (summary->new_low_nack) {
+ change = rxrpc_cong_new_low_nack;
+ call->cong_dup_acks = 1;
+ if (call->cong_extra > 1)
+ call->cong_extra = 1;
+ goto send_extra_data;
+ }
+
+ call->cong_dup_acks++;
+ if (call->cong_dup_acks < 3)
+ goto send_extra_data;
+
+ change = rxrpc_cong_begin_retransmission;
+ call->cong_mode = RXRPC_CALL_FAST_RETRANSMIT;
+ call->cong_ssthresh = max_t(unsigned int,
+ summary->flight_size / 2, 2);
+ cwnd = call->cong_ssthresh + 3;
+ call->cong_extra = 0;
+ call->cong_dup_acks = 0;
+ resend = true;
+ goto out;
+
+ case RXRPC_CALL_FAST_RETRANSMIT:
+ if (!summary->new_low_nack) {
+ if (summary->nr_new_acks == 0)
+ cwnd += 1;
+ call->cong_dup_acks++;
+ if (call->cong_dup_acks == 2) {
+ change = rxrpc_cong_retransmit_again;
+ call->cong_dup_acks = 0;
+ resend = true;
+ }
+ } else {
+ change = rxrpc_cong_progress;
+ cwnd = call->cong_ssthresh;
+ if (summary->nr_nacks == 0)
+ goto resume_normality;
+ }
+ goto out;
+
+ default:
+ BUG();
+ goto out;
+ }
+
+resume_normality:
+ change = rxrpc_cong_cleared_nacks;
+ call->cong_dup_acks = 0;
+ call->cong_extra = 0;
+ call->cong_tstamp = skb->tstamp;
+ if (cwnd <= call->cong_ssthresh)
+ call->cong_mode = RXRPC_CALL_SLOW_START;
+ else
+ call->cong_mode = RXRPC_CALL_CONGEST_AVOIDANCE;
+out:
+ cumulative_acks = 0;
+out_no_clear_ca:
+ if (cwnd >= RXRPC_RXTX_BUFF_SIZE - 1)
+ cwnd = RXRPC_RXTX_BUFF_SIZE - 1;
+ call->cong_cwnd = cwnd;
+ call->cong_cumul_acks = cumulative_acks;
+ trace_rxrpc_congest(call, summary, sp->hdr.serial, change);
+ if (resend && !test_and_set_bit(RXRPC_CALL_EV_RESEND, &call->events))
+ rxrpc_queue_call(call);
+ return;
+
+packet_loss_detected:
+ change = rxrpc_cong_saw_nack;
+ call->cong_mode = RXRPC_CALL_PACKET_LOSS;
+ call->cong_dup_acks = 0;
+ goto send_extra_data;
+
+send_extra_data:
+ /* Send some previously unsent DATA if we have some to advance the ACK
+ * state.
+ */
+ if (call->rxtx_annotations[call->tx_top & RXRPC_RXTX_BUFF_MASK] &
+ RXRPC_TX_ANNO_LAST ||
+ summary->nr_acks != call->tx_top - call->tx_hard_ack) {
+ call->cong_extra++;
+ wake_up(&call->waitq);
+ }
+ goto out_no_clear_ca;
+}
+
+/*
+ * Ping the other end to fill our RTT cache and to retrieve the rwind
+ * and MTU parameters.
+ */
+static void rxrpc_send_ping(struct rxrpc_call *call, struct sk_buff *skb,
+ int skew)
+{
+ struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+ ktime_t now = skb->tstamp;
+
+ if (call->peer->rtt_usage < 3 ||
+ ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000), now))
+ rxrpc_propose_ACK(call, RXRPC_ACK_PING, skew, sp->hdr.serial,
+ true, true,
+ rxrpc_propose_ack_ping_for_params);
+}
+
+/*
* Apply a hard ACK by advancing the Tx window.
*/
-static void rxrpc_rotate_tx_window(struct rxrpc_call *call, rxrpc_seq_t to)
+static void rxrpc_rotate_tx_window(struct rxrpc_call *call, rxrpc_seq_t to,
+ struct rxrpc_ack_summary *summary)
{
struct sk_buff *skb, *list = NULL;
int ix;
+ u8 annotation;
+
+ if (call->acks_lowest_nak == call->tx_hard_ack) {
+ call->acks_lowest_nak = to;
+ } else if (before_eq(call->acks_lowest_nak, to)) {
+ summary->new_low_nack = true;
+ call->acks_lowest_nak = to;
+ }
spin_lock(&call->lock);
@@ -50,22 +236,31 @@
call->tx_hard_ack++;
ix = call->tx_hard_ack & RXRPC_RXTX_BUFF_MASK;
skb = call->rxtx_buffer[ix];
- rxrpc_see_skb(skb);
+ annotation = call->rxtx_annotations[ix];
+ rxrpc_see_skb(skb, rxrpc_skb_tx_rotated);
call->rxtx_buffer[ix] = NULL;
call->rxtx_annotations[ix] = 0;
skb->next = list;
list = skb;
+
+ if (annotation & RXRPC_TX_ANNO_LAST)
+ set_bit(RXRPC_CALL_TX_LAST, &call->flags);
+ if ((annotation & RXRPC_TX_ANNO_MASK) != RXRPC_TX_ANNO_ACK)
+ summary->nr_rot_new_acks++;
}
spin_unlock(&call->lock);
+ trace_rxrpc_transmit(call, (test_bit(RXRPC_CALL_TX_LAST, &call->flags) ?
+ rxrpc_transmit_rotate_last :
+ rxrpc_transmit_rotate));
wake_up(&call->waitq);
while (list) {
skb = list;
list = skb->next;
skb->next = NULL;
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_tx_freed);
}
}
@@ -75,40 +270,77 @@
* This occurs when we get an ACKALL packet, the first DATA packet of a reply,
* or a final ACK packet.
*/
-static bool rxrpc_end_tx_phase(struct rxrpc_call *call, const char *abort_why)
+static bool rxrpc_end_tx_phase(struct rxrpc_call *call, bool reply_begun,
+ const char *abort_why)
{
- _enter("");
- switch (call->state) {
- case RXRPC_CALL_CLIENT_RECV_REPLY:
- return true;
- case RXRPC_CALL_CLIENT_AWAIT_REPLY:
- case RXRPC_CALL_SERVER_AWAIT_ACK:
- break;
- default:
- rxrpc_proto_abort(abort_why, call, call->tx_top);
- return false;
- }
-
- rxrpc_rotate_tx_window(call, call->tx_top);
+ ASSERT(test_bit(RXRPC_CALL_TX_LAST, &call->flags));
write_lock(&call->state_lock);
switch (call->state) {
- default:
- break;
+ case RXRPC_CALL_CLIENT_SEND_REQUEST:
case RXRPC_CALL_CLIENT_AWAIT_REPLY:
- call->state = RXRPC_CALL_CLIENT_RECV_REPLY;
+ if (reply_begun)
+ call->state = RXRPC_CALL_CLIENT_RECV_REPLY;
+ else
+ call->state = RXRPC_CALL_CLIENT_AWAIT_REPLY;
break;
+
case RXRPC_CALL_SERVER_AWAIT_ACK:
__rxrpc_call_completed(call);
rxrpc_notify_socket(call);
break;
+
+ default:
+ goto bad_state;
}
write_unlock(&call->state_lock);
+ if (call->state == RXRPC_CALL_CLIENT_AWAIT_REPLY) {
+ rxrpc_propose_ACK(call, RXRPC_ACK_IDLE, 0, 0, false, true,
+ rxrpc_propose_ack_client_tx_end);
+ trace_rxrpc_transmit(call, rxrpc_transmit_await_reply);
+ } else {
+ trace_rxrpc_transmit(call, rxrpc_transmit_end);
+ }
_leave(" = ok");
return true;
+
+bad_state:
+ write_unlock(&call->state_lock);
+ kdebug("end_tx %s", rxrpc_call_states[call->state]);
+ rxrpc_proto_abort(abort_why, call, call->tx_top);
+ return false;
+}
+
+/*
+ * Begin the reply reception phase of a call.
+ */
+static bool rxrpc_receiving_reply(struct rxrpc_call *call)
+{
+ struct rxrpc_ack_summary summary = { 0 };
+ rxrpc_seq_t top = READ_ONCE(call->tx_top);
+
+ if (call->ackr_reason) {
+ spin_lock_bh(&call->lock);
+ call->ackr_reason = 0;
+ call->resend_at = call->expire_at;
+ call->ack_at = call->expire_at;
+ spin_unlock_bh(&call->lock);
+ rxrpc_set_timer(call, rxrpc_timer_init_for_reply);
+ }
+
+ if (!test_bit(RXRPC_CALL_TX_LAST, &call->flags))
+ rxrpc_rotate_tx_window(call, top, &summary);
+ if (!test_bit(RXRPC_CALL_TX_LAST, &call->flags)) {
+ rxrpc_proto_abort("TXL", call, top);
+ return false;
+ }
+ if (!rxrpc_end_tx_phase(call, true, "ETD"))
+ return false;
+ call->tx_phase = false;
+ return true;
}
/*
@@ -207,8 +439,9 @@
/* Received data implicitly ACKs all of the request packets we sent
* when we're acting as a client.
*/
- if (call->state == RXRPC_CALL_CLIENT_AWAIT_REPLY &&
- !rxrpc_end_tx_phase(call, "ETD"))
+ if ((call->state == RXRPC_CALL_CLIENT_SEND_REQUEST ||
+ call->state == RXRPC_CALL_CLIENT_AWAIT_REPLY) &&
+ !rxrpc_receiving_reply(call))
return;
call->ackr_prev_seq = seq;
@@ -238,7 +471,7 @@
len = RXRPC_JUMBO_DATALEN;
if (flags & RXRPC_LAST_PACKET) {
- if (test_and_set_bit(RXRPC_CALL_RX_LAST, &call->flags) &&
+ if (test_bit(RXRPC_CALL_RX_LAST, &call->flags) &&
seq != call->rx_top)
return rxrpc_proto_abort("LSN", call, seq);
} else {
@@ -276,12 +509,26 @@
* Barriers against rxrpc_recvmsg_data() and rxrpc_rotate_rx_window()
* and also rxrpc_fill_out_ack().
*/
- rxrpc_get_skb(skb);
+ rxrpc_get_skb(skb, rxrpc_skb_rx_got);
call->rxtx_annotations[ix] = annotation;
smp_wmb();
call->rxtx_buffer[ix] = skb;
- if (after(seq, call->rx_top))
+ if (after(seq, call->rx_top)) {
smp_store_release(&call->rx_top, seq);
+ } else if (before(seq, call->rx_top)) {
+ /* Send an immediate ACK if we fill in a hole */
+ if (!ack) {
+ ack = RXRPC_ACK_DELAY;
+ ack_serial = serial;
+ }
+ immediate_ack = true;
+ }
+ if (flags & RXRPC_LAST_PACKET) {
+ set_bit(RXRPC_CALL_RX_LAST, &call->flags);
+ trace_rxrpc_receive(call, rxrpc_receive_queue_last, serial, seq);
+ } else {
+ trace_rxrpc_receive(call, rxrpc_receive_queue, serial, seq);
+ }
queued = true;
if (after_eq(seq, call->rx_expect_next)) {
@@ -326,7 +573,8 @@
ack:
if (ack)
rxrpc_propose_ACK(call, ack, skew, ack_serial,
- immediate_ack, true);
+ immediate_ack, true,
+ rxrpc_propose_ack_input_data);
if (sp->hdr.seq == READ_ONCE(call->rx_hard_ack) + 1)
rxrpc_notify_socket(call);
@@ -334,6 +582,64 @@
}
/*
+ * Process a requested ACK.
+ */
+static void rxrpc_input_requested_ack(struct rxrpc_call *call,
+ ktime_t resp_time,
+ rxrpc_serial_t orig_serial,
+ rxrpc_serial_t ack_serial)
+{
+ struct rxrpc_skb_priv *sp;
+ struct sk_buff *skb;
+ ktime_t sent_at;
+ int ix;
+
+ for (ix = 0; ix < RXRPC_RXTX_BUFF_SIZE; ix++) {
+ skb = call->rxtx_buffer[ix];
+ if (!skb)
+ continue;
+
+ sp = rxrpc_skb(skb);
+ if (sp->hdr.serial != orig_serial)
+ continue;
+ smp_rmb();
+ sent_at = skb->tstamp;
+ goto found;
+ }
+ return;
+
+found:
+ rxrpc_peer_add_rtt(call, rxrpc_rtt_rx_requested_ack,
+ orig_serial, ack_serial, sent_at, resp_time);
+}
+
+/*
+ * Process a ping response.
+ */
+static void rxrpc_input_ping_response(struct rxrpc_call *call,
+ ktime_t resp_time,
+ rxrpc_serial_t orig_serial,
+ rxrpc_serial_t ack_serial)
+{
+ rxrpc_serial_t ping_serial;
+ ktime_t ping_time;
+
+ ping_time = call->ackr_ping_time;
+ smp_rmb();
+ ping_serial = call->ackr_ping;
+
+ if (!test_bit(RXRPC_CALL_PINGING, &call->flags) ||
+ before(orig_serial, ping_serial))
+ return;
+ clear_bit(RXRPC_CALL_PINGING, &call->flags);
+ if (after(orig_serial, ping_serial))
+ return;
+
+ rxrpc_peer_add_rtt(call, rxrpc_rtt_rx_ping_response,
+ orig_serial, ack_serial, ping_time, resp_time);
+}
+
+/*
* Process the extra information that may be appended to an ACK packet
*/
static void rxrpc_input_ackinfo(struct rxrpc_call *call, struct sk_buff *skb,
@@ -375,31 +681,45 @@
* the time the ACK was sent.
*/
static void rxrpc_input_soft_acks(struct rxrpc_call *call, u8 *acks,
- rxrpc_seq_t seq, int nr_acks)
+ rxrpc_seq_t seq, int nr_acks,
+ struct rxrpc_ack_summary *summary)
{
- bool resend = false;
int ix;
+ u8 annotation, anno_type;
for (; nr_acks > 0; nr_acks--, seq++) {
ix = seq & RXRPC_RXTX_BUFF_MASK;
- switch (*acks) {
+ annotation = call->rxtx_annotations[ix];
+ anno_type = annotation & RXRPC_TX_ANNO_MASK;
+ annotation &= ~RXRPC_TX_ANNO_MASK;
+ switch (*acks++) {
case RXRPC_ACK_TYPE_ACK:
- call->rxtx_annotations[ix] = RXRPC_TX_ANNO_ACK;
+ summary->nr_acks++;
+ if (anno_type == RXRPC_TX_ANNO_ACK)
+ continue;
+ summary->nr_new_acks++;
+ call->rxtx_annotations[ix] =
+ RXRPC_TX_ANNO_ACK | annotation;
break;
case RXRPC_ACK_TYPE_NACK:
- if (call->rxtx_annotations[ix] == RXRPC_TX_ANNO_NAK)
+ if (!summary->nr_nacks &&
+ call->acks_lowest_nak != seq) {
+ call->acks_lowest_nak = seq;
+ summary->new_low_nack = true;
+ }
+ summary->nr_nacks++;
+ if (anno_type == RXRPC_TX_ANNO_NAK)
continue;
- call->rxtx_annotations[ix] = RXRPC_TX_ANNO_NAK;
- resend = true;
+ summary->nr_new_nacks++;
+ if (anno_type == RXRPC_TX_ANNO_RETRANS)
+ continue;
+ call->rxtx_annotations[ix] =
+ RXRPC_TX_ANNO_NAK | annotation;
break;
default:
return rxrpc_proto_abort("SFT", call, 0);
}
}
-
- if (resend &&
- !test_and_set_bit(RXRPC_CALL_EV_RESEND, &call->events))
- rxrpc_queue_call(call);
}
/*
@@ -415,12 +735,14 @@
static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb,
u16 skew)
{
+ struct rxrpc_ack_summary summary = { 0 };
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
union {
struct rxrpc_ackpacket ack;
struct rxrpc_ackinfo info;
u8 acks[RXRPC_MAXACKS];
} buf;
+ rxrpc_serial_t acked_serial;
rxrpc_seq_t first_soft_ack, hard_ack;
int nr_acks, offset;
@@ -432,26 +754,40 @@
}
sp->offset += sizeof(buf.ack);
+ acked_serial = ntohl(buf.ack.serial);
first_soft_ack = ntohl(buf.ack.firstPacket);
hard_ack = first_soft_ack - 1;
nr_acks = buf.ack.nAcks;
+ summary.ack_reason = (buf.ack.reason < RXRPC_ACK__INVALID ?
+ buf.ack.reason : RXRPC_ACK__INVALID);
+
+ trace_rxrpc_rx_ack(call, first_soft_ack, summary.ack_reason, nr_acks);
_proto("Rx ACK %%%u { m=%hu f=#%u p=#%u s=%%%u r=%s n=%u }",
sp->hdr.serial,
ntohs(buf.ack.maxSkew),
first_soft_ack,
ntohl(buf.ack.previousPacket),
- ntohl(buf.ack.serial),
- rxrpc_acks(buf.ack.reason),
+ acked_serial,
+ rxrpc_ack_names[summary.ack_reason],
buf.ack.nAcks);
+ if (buf.ack.reason == RXRPC_ACK_PING_RESPONSE)
+ rxrpc_input_ping_response(call, skb->tstamp, acked_serial,
+ sp->hdr.serial);
+ if (buf.ack.reason == RXRPC_ACK_REQUESTED)
+ rxrpc_input_requested_ack(call, skb->tstamp, acked_serial,
+ sp->hdr.serial);
+
if (buf.ack.reason == RXRPC_ACK_PING) {
_proto("Rx ACK %%%u PING Request", sp->hdr.serial);
rxrpc_propose_ACK(call, RXRPC_ACK_PING_RESPONSE,
- skew, sp->hdr.serial, true, true);
+ skew, sp->hdr.serial, true, true,
+ rxrpc_propose_ack_respond_to_ping);
} else if (sp->hdr.flags & RXRPC_REQUEST_ACK) {
rxrpc_propose_ACK(call, RXRPC_ACK_REQUESTED,
- skew, sp->hdr.serial, true, true);
+ skew, sp->hdr.serial, true, true,
+ rxrpc_propose_ack_respond_to_ack);
}
offset = sp->offset + nr_acks + 3;
@@ -476,34 +812,43 @@
}
/* Discard any out-of-order or duplicate ACKs. */
- if ((int)sp->hdr.serial - (int)call->acks_latest <= 0) {
+ if (before_eq(sp->hdr.serial, call->acks_latest)) {
_debug("discard ACK %d <= %d",
sp->hdr.serial, call->acks_latest);
return;
}
+ call->acks_latest_ts = skb->tstamp;
call->acks_latest = sp->hdr.serial;
- if (test_bit(RXRPC_CALL_TX_LAST, &call->flags) &&
- hard_ack == call->tx_top) {
- rxrpc_end_tx_phase(call, "ETA");
- return;
- }
-
if (before(hard_ack, call->tx_hard_ack) ||
after(hard_ack, call->tx_top))
return rxrpc_proto_abort("AKW", call, 0);
+ if (nr_acks > call->tx_top - hard_ack)
+ return rxrpc_proto_abort("AKN", call, 0);
if (after(hard_ack, call->tx_hard_ack))
- rxrpc_rotate_tx_window(call, hard_ack);
+ rxrpc_rotate_tx_window(call, hard_ack, &summary);
- if (after(first_soft_ack, call->tx_top))
+ if (nr_acks > 0) {
+ if (skb_copy_bits(skb, sp->offset, buf.acks, nr_acks) < 0)
+ return rxrpc_proto_abort("XSA", call, 0);
+ rxrpc_input_soft_acks(call, buf.acks, first_soft_ack, nr_acks,
+ &summary);
+ }
+
+ if (test_bit(RXRPC_CALL_TX_LAST, &call->flags)) {
+ rxrpc_end_tx_phase(call, false, "ETA");
return;
+ }
- if (nr_acks > call->tx_top - first_soft_ack + 1)
- nr_acks = first_soft_ack - call->tx_top + 1;
- if (skb_copy_bits(skb, sp->offset, buf.acks, nr_acks) < 0)
- return rxrpc_proto_abort("XSA", call, 0);
- rxrpc_input_soft_acks(call, buf.acks, first_soft_ack, nr_acks);
+ if (call->rxtx_annotations[call->tx_top & RXRPC_RXTX_BUFF_MASK] &
+ RXRPC_TX_ANNO_LAST &&
+ summary.nr_acks == call->tx_top - hard_ack)
+ rxrpc_propose_ACK(call, RXRPC_ACK_PING, skew, sp->hdr.serial,
+ false, true,
+ rxrpc_propose_ack_ping_for_lost_reply);
+
+ return rxrpc_congestion_management(call, skb, &summary);
}
/*
@@ -511,11 +856,14 @@
*/
static void rxrpc_input_ackall(struct rxrpc_call *call, struct sk_buff *skb)
{
+ struct rxrpc_ack_summary summary = { 0 };
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
_proto("Rx ACKALL %%%u", sp->hdr.serial);
- rxrpc_end_tx_phase(call, "ETL");
+ rxrpc_rotate_tx_window(call, call->tx_top, &summary);
+ if (test_bit(RXRPC_CALL_TX_LAST, &call->flags))
+ rxrpc_end_tx_phase(call, false, "ETL");
}
/*
@@ -681,13 +1029,13 @@
return;
}
- rxrpc_new_skb(skb);
+ rxrpc_new_skb(skb, rxrpc_skb_rx_received);
_net("recv skb %p", skb);
/* we'll probably need to checksum it (didn't call sock_recvmsg) */
if (skb_checksum_complete(skb)) {
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
__UDP_INC_STATS(&init_net, UDP_MIB_INERRORS, 0);
_leave(" [CSUM failed]");
return;
@@ -701,12 +1049,19 @@
skb_orphan(skb);
sp = rxrpc_skb(skb);
- _net("Rx UDP packet from %08x:%04hu",
- ntohl(ip_hdr(skb)->saddr), ntohs(udp_hdr(skb)->source));
-
/* dig out the RxRPC connection details */
if (rxrpc_extract_header(sp, skb) < 0)
goto bad_message;
+
+ if (IS_ENABLED(CONFIG_AF_RXRPC_INJECT_LOSS)) {
+ static int lose;
+ if ((lose++ & 7) == 7) {
+ trace_rxrpc_rx_lose(sp);
+ rxrpc_lose_skb(skb, rxrpc_skb_rx_lost);
+ return;
+ }
+ }
+
trace_rxrpc_rx_packet(sp);
_net("Rx RxRPC %s ep=%x call=%x:%x",
@@ -803,6 +1158,7 @@
rcu_read_unlock();
goto reject_packet;
}
+ rxrpc_send_ping(call, skb, skew);
}
rxrpc_input_call_packet(call, skb, skew);
@@ -811,7 +1167,7 @@
discard_unlock:
rcu_read_unlock();
discard:
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
out:
trace_rxrpc_rx_done(0, 0);
return;
diff --git a/net/rxrpc/local_event.c b/net/rxrpc/local_event.c
index f073e93..190f68b 100644
--- a/net/rxrpc/local_event.c
+++ b/net/rxrpc/local_event.c
@@ -90,7 +90,7 @@
if (skb) {
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_rx_seen);
_debug("{%d},{%u}", local->debug_id, sp->hdr.type);
switch (sp->hdr.type) {
@@ -107,7 +107,7 @@
break;
}
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
}
_leave("");
diff --git a/net/rxrpc/misc.c b/net/rxrpc/misc.c
index 8b91078..aedb897 100644
--- a/net/rxrpc/misc.c
+++ b/net/rxrpc/misc.c
@@ -68,9 +68,9 @@
unsigned int rxrpc_rx_jumbo_max = 4;
/*
- * Time till packet resend (in jiffies).
+ * Time till packet resend (in milliseconds).
*/
-unsigned int rxrpc_resend_timeout = 4 * HZ;
+unsigned int rxrpc_resend_timeout = 4 * 1000;
const char *const rxrpc_pkts[] = {
"?00",
@@ -83,21 +83,153 @@
[RXRPC_ACK_DELAY] = 1,
[RXRPC_ACK_REQUESTED] = 2,
[RXRPC_ACK_IDLE] = 3,
- [RXRPC_ACK_PING_RESPONSE] = 4,
- [RXRPC_ACK_DUPLICATE] = 5,
- [RXRPC_ACK_OUT_OF_SEQUENCE] = 6,
- [RXRPC_ACK_EXCEEDS_WINDOW] = 7,
- [RXRPC_ACK_NOSPACE] = 8,
+ [RXRPC_ACK_DUPLICATE] = 4,
+ [RXRPC_ACK_OUT_OF_SEQUENCE] = 5,
+ [RXRPC_ACK_EXCEEDS_WINDOW] = 6,
+ [RXRPC_ACK_NOSPACE] = 7,
+ [RXRPC_ACK_PING_RESPONSE] = 8,
+ [RXRPC_ACK_PING] = 9,
};
-const char *rxrpc_acks(u8 reason)
-{
- static const char *const str[] = {
- "---", "REQ", "DUP", "OOS", "WIN", "MEM", "PNG", "PNR", "DLY",
- "IDL", "-?-"
- };
+const char const rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4] = {
+ "---", "REQ", "DUP", "OOS", "WIN", "MEM", "PNG", "PNR", "DLY",
+ "IDL", "-?-"
+};
- if (reason >= ARRAY_SIZE(str))
- reason = ARRAY_SIZE(str) - 1;
- return str[reason];
-}
+const char rxrpc_skb_traces[rxrpc_skb__nr_trace][7] = {
+ [rxrpc_skb_rx_cleaned] = "Rx CLN",
+ [rxrpc_skb_rx_freed] = "Rx FRE",
+ [rxrpc_skb_rx_got] = "Rx GOT",
+ [rxrpc_skb_rx_lost] = "Rx *L*",
+ [rxrpc_skb_rx_received] = "Rx RCV",
+ [rxrpc_skb_rx_purged] = "Rx PUR",
+ [rxrpc_skb_rx_rotated] = "Rx ROT",
+ [rxrpc_skb_rx_seen] = "Rx SEE",
+ [rxrpc_skb_tx_cleaned] = "Tx CLN",
+ [rxrpc_skb_tx_freed] = "Tx FRE",
+ [rxrpc_skb_tx_got] = "Tx GOT",
+ [rxrpc_skb_tx_lost] = "Tx *L*",
+ [rxrpc_skb_tx_new] = "Tx NEW",
+ [rxrpc_skb_tx_rotated] = "Tx ROT",
+ [rxrpc_skb_tx_seen] = "Tx SEE",
+};
+
+const char rxrpc_conn_traces[rxrpc_conn__nr_trace][4] = {
+ [rxrpc_conn_new_client] = "NWc",
+ [rxrpc_conn_new_service] = "NWs",
+ [rxrpc_conn_queued] = "QUE",
+ [rxrpc_conn_seen] = "SEE",
+ [rxrpc_conn_got] = "GOT",
+ [rxrpc_conn_put_client] = "PTc",
+ [rxrpc_conn_put_service] = "PTs",
+};
+
+const char rxrpc_client_traces[rxrpc_client__nr_trace][7] = {
+ [rxrpc_client_activate_chans] = "Activa",
+ [rxrpc_client_alloc] = "Alloc ",
+ [rxrpc_client_chan_activate] = "ChActv",
+ [rxrpc_client_chan_disconnect] = "ChDisc",
+ [rxrpc_client_chan_pass] = "ChPass",
+ [rxrpc_client_chan_unstarted] = "ChUnst",
+ [rxrpc_client_cleanup] = "Clean ",
+ [rxrpc_client_count] = "Count ",
+ [rxrpc_client_discard] = "Discar",
+ [rxrpc_client_duplicate] = "Duplic",
+ [rxrpc_client_exposed] = "Expose",
+ [rxrpc_client_replace] = "Replac",
+ [rxrpc_client_to_active] = "->Actv",
+ [rxrpc_client_to_culled] = "->Cull",
+ [rxrpc_client_to_idle] = "->Idle",
+ [rxrpc_client_to_inactive] = "->Inac",
+ [rxrpc_client_to_waiting] = "->Wait",
+ [rxrpc_client_uncount] = "Uncoun",
+};
+
+const char rxrpc_transmit_traces[rxrpc_transmit__nr_trace][4] = {
+ [rxrpc_transmit_wait] = "WAI",
+ [rxrpc_transmit_queue] = "QUE",
+ [rxrpc_transmit_queue_last] = "QLS",
+ [rxrpc_transmit_rotate] = "ROT",
+ [rxrpc_transmit_rotate_last] = "RLS",
+ [rxrpc_transmit_await_reply] = "AWR",
+ [rxrpc_transmit_end] = "END",
+};
+
+const char rxrpc_receive_traces[rxrpc_receive__nr_trace][4] = {
+ [rxrpc_receive_incoming] = "INC",
+ [rxrpc_receive_queue] = "QUE",
+ [rxrpc_receive_queue_last] = "QLS",
+ [rxrpc_receive_front] = "FRN",
+ [rxrpc_receive_rotate] = "ROT",
+ [rxrpc_receive_end] = "END",
+};
+
+const char rxrpc_recvmsg_traces[rxrpc_recvmsg__nr_trace][5] = {
+ [rxrpc_recvmsg_enter] = "ENTR",
+ [rxrpc_recvmsg_wait] = "WAIT",
+ [rxrpc_recvmsg_dequeue] = "DEQU",
+ [rxrpc_recvmsg_hole] = "HOLE",
+ [rxrpc_recvmsg_next] = "NEXT",
+ [rxrpc_recvmsg_cont] = "CONT",
+ [rxrpc_recvmsg_full] = "FULL",
+ [rxrpc_recvmsg_data_return] = "DATA",
+ [rxrpc_recvmsg_terminal] = "TERM",
+ [rxrpc_recvmsg_to_be_accepted] = "TBAC",
+ [rxrpc_recvmsg_return] = "RETN",
+};
+
+const char rxrpc_rtt_tx_traces[rxrpc_rtt_tx__nr_trace][5] = {
+ [rxrpc_rtt_tx_ping] = "PING",
+ [rxrpc_rtt_tx_data] = "DATA",
+};
+
+const char rxrpc_rtt_rx_traces[rxrpc_rtt_rx__nr_trace][5] = {
+ [rxrpc_rtt_rx_ping_response] = "PONG",
+ [rxrpc_rtt_rx_requested_ack] = "RACK",
+};
+
+const char rxrpc_timer_traces[rxrpc_timer__nr_trace][8] = {
+ [rxrpc_timer_begin] = "Begin ",
+ [rxrpc_timer_expired] = "*EXPR*",
+ [rxrpc_timer_init_for_reply] = "IniRpl",
+ [rxrpc_timer_set_for_ack] = "SetAck",
+ [rxrpc_timer_set_for_send] = "SetTx ",
+ [rxrpc_timer_set_for_resend] = "SetRTx",
+};
+
+const char rxrpc_propose_ack_traces[rxrpc_propose_ack__nr_trace][8] = {
+ [rxrpc_propose_ack_client_tx_end] = "ClTxEnd",
+ [rxrpc_propose_ack_input_data] = "DataIn ",
+ [rxrpc_propose_ack_ping_for_lost_ack] = "LostAck",
+ [rxrpc_propose_ack_ping_for_lost_reply] = "LostRpl",
+ [rxrpc_propose_ack_ping_for_params] = "Params ",
+ [rxrpc_propose_ack_respond_to_ack] = "Rsp2Ack",
+ [rxrpc_propose_ack_respond_to_ping] = "Rsp2Png",
+ [rxrpc_propose_ack_retry_tx] = "RetryTx",
+ [rxrpc_propose_ack_rotate_rx] = "RxAck ",
+ [rxrpc_propose_ack_terminal_ack] = "ClTerm ",
+};
+
+const char *const rxrpc_propose_ack_outcomes[rxrpc_propose_ack__nr_outcomes] = {
+ [rxrpc_propose_ack_use] = "",
+ [rxrpc_propose_ack_update] = " Update",
+ [rxrpc_propose_ack_subsume] = " Subsume",
+};
+
+const char rxrpc_congest_modes[NR__RXRPC_CONGEST_MODES][10] = {
+ [RXRPC_CALL_SLOW_START] = "SlowStart",
+ [RXRPC_CALL_CONGEST_AVOIDANCE] = "CongAvoid",
+ [RXRPC_CALL_PACKET_LOSS] = "PktLoss ",
+ [RXRPC_CALL_FAST_RETRANSMIT] = "FastReTx ",
+};
+
+const char rxrpc_congest_changes[rxrpc_congest__nr_change][9] = {
+ [rxrpc_cong_begin_retransmission] = " Retrans",
+ [rxrpc_cong_cleared_nacks] = " Cleared",
+ [rxrpc_cong_new_low_nack] = " NewLowN",
+ [rxrpc_cong_no_change] = "",
+ [rxrpc_cong_progress] = " Progres",
+ [rxrpc_cong_retransmit_again] = " ReTxAgn",
+ [rxrpc_cong_rtt_window_end] = " RttWinE",
+ [rxrpc_cong_saw_nack] = " SawNack",
+};
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 06a9aca..cf43a71 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -36,25 +36,34 @@
* Fill out an ACK packet.
*/
static size_t rxrpc_fill_out_ack(struct rxrpc_call *call,
- struct rxrpc_pkt_buffer *pkt)
+ struct rxrpc_pkt_buffer *pkt,
+ rxrpc_seq_t *_hard_ack,
+ rxrpc_seq_t *_top)
{
+ rxrpc_serial_t serial;
rxrpc_seq_t hard_ack, top, seq;
int ix;
u32 mtu, jmax;
u8 *ackp = pkt->acks;
/* Barrier against rxrpc_input_data(). */
+ serial = call->ackr_serial;
hard_ack = READ_ONCE(call->rx_hard_ack);
top = smp_load_acquire(&call->rx_top);
+ *_hard_ack = hard_ack;
+ *_top = top;
pkt->ack.bufferSpace = htons(8);
pkt->ack.maxSkew = htons(call->ackr_skew);
pkt->ack.firstPacket = htonl(hard_ack + 1);
pkt->ack.previousPacket = htonl(call->ackr_prev_seq);
- pkt->ack.serial = htonl(call->ackr_serial);
+ pkt->ack.serial = htonl(serial);
pkt->ack.reason = call->ackr_reason;
pkt->ack.nAcks = top - hard_ack;
+ if (pkt->ack.reason == RXRPC_ACK_PING)
+ pkt->whdr.flags |= RXRPC_REQUEST_ACK;
+
if (after(top, hard_ack)) {
seq = hard_ack + 1;
do {
@@ -91,7 +100,9 @@
struct msghdr msg;
struct kvec iov[2];
rxrpc_serial_t serial;
+ rxrpc_seq_t hard_ack, top;
size_t len, n;
+ bool ping = false;
int ioc, ret;
u32 abort_code;
@@ -110,8 +121,6 @@
return -ENOMEM;
}
- serial = atomic_inc_return(&conn->serial);
-
msg.msg_name = &call->peer->srx.transport;
msg.msg_namelen = call->peer->srx.transport_len;
msg.msg_control = NULL;
@@ -122,7 +131,6 @@
pkt->whdr.cid = htonl(call->cid);
pkt->whdr.callNumber = htonl(call->call_id);
pkt->whdr.seq = 0;
- pkt->whdr.serial = htonl(serial);
pkt->whdr.type = type;
pkt->whdr.flags = conn->out_clientflag;
pkt->whdr.userStatus = 0;
@@ -137,19 +145,19 @@
switch (type) {
case RXRPC_PACKET_TYPE_ACK:
spin_lock_bh(&call->lock);
- n = rxrpc_fill_out_ack(call, pkt);
+ if (!call->ackr_reason) {
+ spin_unlock_bh(&call->lock);
+ ret = 0;
+ goto out;
+ }
+ ping = (call->ackr_reason == RXRPC_ACK_PING);
+ n = rxrpc_fill_out_ack(call, pkt, &hard_ack, &top);
call->ackr_reason = 0;
spin_unlock_bh(&call->lock);
- _proto("Tx ACK %%%u { m=%hu f=#%u p=#%u s=%%%u r=%s n=%u }",
- serial,
- ntohs(pkt->ack.maxSkew),
- ntohl(pkt->ack.firstPacket),
- ntohl(pkt->ack.previousPacket),
- ntohl(pkt->ack.serial),
- rxrpc_acks(pkt->ack.reason),
- pkt->ack.nAcks);
+
+ pkt->whdr.flags |= RXRPC_SLOW_START_OK;
iov[0].iov_len += sizeof(pkt->ack) + n;
iov[1].iov_base = &pkt->ackinfo;
@@ -161,7 +169,6 @@
case RXRPC_PACKET_TYPE_ABORT:
abort_code = call->abort_code;
pkt->abort_code = htonl(abort_code);
- _proto("Tx ABORT %%%u { %d }", serial, abort_code);
iov[0].iov_len += sizeof(pkt->abort_code);
len += sizeof(pkt->abort_code);
ioc = 1;
@@ -173,19 +180,52 @@
goto out;
}
+ serial = atomic_inc_return(&conn->serial);
+ pkt->whdr.serial = htonl(serial);
+ switch (type) {
+ case RXRPC_PACKET_TYPE_ACK:
+ trace_rxrpc_tx_ack(call, serial,
+ ntohl(pkt->ack.firstPacket),
+ ntohl(pkt->ack.serial),
+ pkt->ack.reason, pkt->ack.nAcks);
+ break;
+ }
+
+ if (ping) {
+ call->ackr_ping = serial;
+ smp_wmb();
+ /* We need to stick a time in before we send the packet in case
+ * the reply gets back before kernel_sendmsg() completes - but
+ * asking UDP to send the packet can take a relatively long
+ * time, so we update the time after, on the assumption that
+ * the packet transmission is more likely to happen towards the
+ * end of the kernel_sendmsg() call.
+ */
+ call->ackr_ping_time = ktime_get_real();
+ set_bit(RXRPC_CALL_PINGING, &call->flags);
+ trace_rxrpc_rtt_tx(call, rxrpc_rtt_tx_ping, serial);
+ }
ret = kernel_sendmsg(conn->params.local->socket,
&msg, iov, ioc, len);
+ if (ping)
+ call->ackr_ping_time = ktime_get_real();
- if (ret < 0 && call->state < RXRPC_CALL_COMPLETE) {
- switch (pkt->whdr.type) {
- case RXRPC_PACKET_TYPE_ACK:
+ if (type == RXRPC_PACKET_TYPE_ACK &&
+ call->state < RXRPC_CALL_COMPLETE) {
+ if (ret < 0) {
+ clear_bit(RXRPC_CALL_PINGING, &call->flags);
rxrpc_propose_ACK(call, pkt->ack.reason,
ntohs(pkt->ack.maxSkew),
ntohl(pkt->ack.serial),
- true, true);
- break;
- case RXRPC_PACKET_TYPE_ABORT:
- break;
+ true, true,
+ rxrpc_propose_ack_retry_tx);
+ } else {
+ spin_lock_bh(&call->lock);
+ if (after(hard_ack, call->ackr_consumed))
+ call->ackr_consumed = hard_ack;
+ if (after(top, call->ackr_seen))
+ call->ackr_seen = top;
+ spin_unlock_bh(&call->lock);
}
}
@@ -198,43 +238,100 @@
/*
* send a packet through the transport endpoint
*/
-int rxrpc_send_data_packet(struct rxrpc_connection *conn, struct sk_buff *skb)
+int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb)
{
- struct kvec iov[1];
+ struct rxrpc_connection *conn = call->conn;
+ struct rxrpc_wire_header whdr;
+ struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct msghdr msg;
+ struct kvec iov[2];
+ rxrpc_serial_t serial;
+ size_t len;
int ret, opt;
_enter(",{%d}", skb->len);
- iov[0].iov_base = skb->head;
- iov[0].iov_len = skb->len;
+ /* Each transmission of a Tx packet needs a new serial number */
+ serial = atomic_inc_return(&conn->serial);
- msg.msg_name = &conn->params.peer->srx.transport;
- msg.msg_namelen = conn->params.peer->srx.transport_len;
+ whdr.epoch = htonl(conn->proto.epoch);
+ whdr.cid = htonl(call->cid);
+ whdr.callNumber = htonl(call->call_id);
+ whdr.seq = htonl(sp->hdr.seq);
+ whdr.serial = htonl(serial);
+ whdr.type = RXRPC_PACKET_TYPE_DATA;
+ whdr.flags = sp->hdr.flags;
+ whdr.userStatus = 0;
+ whdr.securityIndex = call->security_ix;
+ whdr._rsvd = htons(sp->hdr._rsvd);
+ whdr.serviceId = htons(call->service_id);
+
+ iov[0].iov_base = &whdr;
+ iov[0].iov_len = sizeof(whdr);
+ iov[1].iov_base = skb->head;
+ iov[1].iov_len = skb->len;
+ len = iov[0].iov_len + iov[1].iov_len;
+
+ msg.msg_name = &call->peer->srx.transport;
+ msg.msg_namelen = call->peer->srx.transport_len;
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = 0;
+ /* If our RTT cache needs working on, request an ACK. Also request
+ * ACKs if a DATA packet appears to have been lost.
+ */
+ if (call->cong_mode == RXRPC_CALL_FAST_RETRANSMIT ||
+ (call->peer->rtt_usage < 3 && sp->hdr.seq & 1) ||
+ ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000),
+ ktime_get_real()))
+ whdr.flags |= RXRPC_REQUEST_ACK;
+
+ if (IS_ENABLED(CONFIG_AF_RXRPC_INJECT_LOSS)) {
+ static int lose;
+ if ((lose++ & 7) == 7) {
+ trace_rxrpc_tx_data(call, sp->hdr.seq, serial,
+ whdr.flags, true);
+ rxrpc_lose_skb(skb, rxrpc_skb_tx_lost);
+ _leave(" = 0 [lose]");
+ return 0;
+ }
+ }
+
+ _proto("Tx DATA %%%u { #%u }", serial, sp->hdr.seq);
+
/* send the packet with the don't fragment bit set if we currently
* think it's small enough */
- if (skb->len - sizeof(struct rxrpc_wire_header) < conn->params.peer->maxdata) {
- down_read(&conn->params.local->defrag_sem);
- /* send the packet by UDP
- * - returns -EMSGSIZE if UDP would have to fragment the packet
- * to go out of the interface
- * - in which case, we'll have processed the ICMP error
- * message and update the peer record
- */
- ret = kernel_sendmsg(conn->params.local->socket, &msg, iov, 1,
- iov[0].iov_len);
+ if (iov[1].iov_len >= call->peer->maxdata)
+ goto send_fragmentable;
- up_read(&conn->params.local->defrag_sem);
- if (ret == -EMSGSIZE)
- goto send_fragmentable;
+ down_read(&conn->params.local->defrag_sem);
+ /* send the packet by UDP
+ * - returns -EMSGSIZE if UDP would have to fragment the packet
+ * to go out of the interface
+ * - in which case, we'll have processed the ICMP error
+ * message and update the peer record
+ */
+ ret = kernel_sendmsg(conn->params.local->socket, &msg, iov, 2, len);
- _leave(" = %d [%u]", ret, conn->params.peer->maxdata);
- return ret;
+ up_read(&conn->params.local->defrag_sem);
+ if (ret == -EMSGSIZE)
+ goto send_fragmentable;
+
+done:
+ trace_rxrpc_tx_data(call, sp->hdr.seq, serial, whdr.flags, false);
+ if (ret >= 0) {
+ ktime_t now = ktime_get_real();
+ skb->tstamp = now;
+ smp_wmb();
+ sp->hdr.serial = serial;
+ if (whdr.flags & RXRPC_REQUEST_ACK) {
+ call->peer->rtt_last_req = now;
+ trace_rxrpc_rtt_tx(call, rxrpc_rtt_tx_data, serial);
+ }
}
+ _leave(" = %d [%u]", ret, call->peer->maxdata);
+ return ret;
send_fragmentable:
/* attempt to send this message with fragmentation enabled */
@@ -249,8 +346,8 @@
SOL_IP, IP_MTU_DISCOVER,
(char *)&opt, sizeof(opt));
if (ret == 0) {
- ret = kernel_sendmsg(conn->params.local->socket, &msg, iov, 1,
- iov[0].iov_len);
+ ret = kernel_sendmsg(conn->params.local->socket, &msg,
+ iov, 2, len);
opt = IP_PMTUDISC_DO;
kernel_setsockopt(conn->params.local->socket, SOL_IP,
@@ -279,8 +376,7 @@
}
up_write(&conn->params.local->defrag_sem);
- _leave(" = %d [frag %u]", ret, conn->params.peer->maxdata);
- return ret;
+ goto done;
}
/*
@@ -314,7 +410,7 @@
whdr.type = RXRPC_PACKET_TYPE_ABORT;
while ((skb = skb_dequeue(&local->reject_queue))) {
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_rx_seen);
sp = rxrpc_skb(skb);
if (rxrpc_extract_addr_from_skb(&srx, skb) == 0) {
@@ -333,7 +429,7 @@
kernel_sendmsg(local->socket, &msg, iov, 2, size);
}
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
}
_leave("");
diff --git a/net/rxrpc/peer_event.c b/net/rxrpc/peer_event.c
index 9e0725f..bf13b84 100644
--- a/net/rxrpc/peer_event.c
+++ b/net/rxrpc/peer_event.c
@@ -155,11 +155,11 @@
_leave("UDP socket errqueue empty");
return;
}
- rxrpc_new_skb(skb);
+ rxrpc_new_skb(skb, rxrpc_skb_rx_received);
serr = SKB_EXT_ERR(skb);
if (!skb->len && serr->ee.ee_origin == SO_EE_ORIGIN_TIMESTAMPING) {
_leave("UDP empty message");
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
return;
}
@@ -169,7 +169,7 @@
peer = NULL;
if (!peer) {
rcu_read_unlock();
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
_leave(" [no peer]");
return;
}
@@ -179,7 +179,7 @@
serr->ee.ee_code == ICMP_FRAG_NEEDED)) {
rxrpc_adjust_mtu(peer, serr);
rcu_read_unlock();
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
rxrpc_put_peer(peer);
_leave(" [MTU update]");
return;
@@ -187,7 +187,7 @@
rxrpc_store_error(peer, serr);
rcu_read_unlock();
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
/* The ref we obtained is passed off to the work item */
rxrpc_queue_work(&peer->error_distributor);
@@ -305,3 +305,44 @@
rxrpc_put_peer(peer);
_leave("");
}
+
+/*
+ * Add RTT information to cache. This is called in softirq mode and has
+ * exclusive access to the peer RTT data.
+ */
+void rxrpc_peer_add_rtt(struct rxrpc_call *call, enum rxrpc_rtt_rx_trace why,
+ rxrpc_serial_t send_serial, rxrpc_serial_t resp_serial,
+ ktime_t send_time, ktime_t resp_time)
+{
+ struct rxrpc_peer *peer = call->peer;
+ s64 rtt;
+ u64 sum = peer->rtt_sum, avg;
+ u8 cursor = peer->rtt_cursor, usage = peer->rtt_usage;
+
+ rtt = ktime_to_ns(ktime_sub(resp_time, send_time));
+ if (rtt < 0)
+ return;
+
+ /* Replace the oldest datum in the RTT buffer */
+ sum -= peer->rtt_cache[cursor];
+ sum += rtt;
+ peer->rtt_cache[cursor] = rtt;
+ peer->rtt_cursor = (cursor + 1) & (RXRPC_RTT_CACHE_SIZE - 1);
+ peer->rtt_sum = sum;
+ if (usage < RXRPC_RTT_CACHE_SIZE) {
+ usage++;
+ peer->rtt_usage = usage;
+ }
+
+ /* Now recalculate the average */
+ if (usage == RXRPC_RTT_CACHE_SIZE) {
+ avg = sum / RXRPC_RTT_CACHE_SIZE;
+ } else {
+ avg = sum;
+ do_div(avg, usage);
+ }
+
+ peer->rtt = avg;
+ trace_rxrpc_rtt_rx(call, why, send_serial, resp_serial, rtt,
+ usage, avg);
+}
diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c
index f3e5766..941b724 100644
--- a/net/rxrpc/peer_object.c
+++ b/net/rxrpc/peer_object.c
@@ -244,6 +244,7 @@
peer->hash_key = hash_key;
rxrpc_assess_MTU_size(peer);
peer->mtu = peer->if_mtu;
+ peer->rtt_last_req = ktime_get_real();
switch (peer->srx.transport.family) {
case AF_INET:
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index a284205..038ae62 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -94,6 +94,8 @@
break;
}
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_terminal, call->rx_hard_ack,
+ call->rx_pkt_offset, call->rx_pkt_len, ret);
return ret;
}
@@ -124,21 +126,24 @@
write_unlock(&rx->call_lock);
}
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_to_be_accepted, 1, 0, 0, ret);
return ret;
}
/*
* End the packet reception phase.
*/
-static void rxrpc_end_rx_phase(struct rxrpc_call *call)
+static void rxrpc_end_rx_phase(struct rxrpc_call *call, rxrpc_serial_t serial)
{
_enter("%d,%s", call->debug_id, rxrpc_call_states[call->state]);
+ trace_rxrpc_receive(call, rxrpc_receive_end, 0, call->rx_top);
+ ASSERTCMP(call->rx_hard_ack, ==, call->rx_top);
+
if (call->state == RXRPC_CALL_CLIENT_RECV_REPLY) {
- rxrpc_propose_ACK(call, RXRPC_ACK_IDLE, 0, 0, true, false);
+ rxrpc_propose_ACK(call, RXRPC_ACK_IDLE, 0, serial, true, false,
+ rxrpc_propose_ack_terminal_ack);
rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
- } else {
- rxrpc_propose_ACK(call, RXRPC_ACK_IDLE, 0, 0, false, false);
}
write_lock_bh(&call->state_lock);
@@ -149,6 +154,7 @@
break;
case RXRPC_CALL_SERVER_RECV_REQUEST:
+ call->tx_phase = true;
call->state = RXRPC_CALL_SERVER_ACK_REQUEST;
break;
default:
@@ -163,8 +169,11 @@
*/
static void rxrpc_rotate_rx_window(struct rxrpc_call *call)
{
+ struct rxrpc_skb_priv *sp;
struct sk_buff *skb;
+ rxrpc_serial_t serial;
rxrpc_seq_t hard_ack, top;
+ u8 flags;
int ix;
_enter("%d", call->debug_id);
@@ -176,17 +185,35 @@
hard_ack++;
ix = hard_ack & RXRPC_RXTX_BUFF_MASK;
skb = call->rxtx_buffer[ix];
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_rx_rotated);
+ sp = rxrpc_skb(skb);
+ flags = sp->hdr.flags;
+ serial = sp->hdr.serial;
+ if (call->rxtx_annotations[ix] & RXRPC_RX_ANNO_JUMBO)
+ serial += (call->rxtx_annotations[ix] & RXRPC_RX_ANNO_JUMBO) - 1;
+
call->rxtx_buffer[ix] = NULL;
call->rxtx_annotations[ix] = 0;
/* Barrier against rxrpc_input_data(). */
smp_store_release(&call->rx_hard_ack, hard_ack);
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_rx_freed);
- _debug("%u,%u,%lx", hard_ack, top, call->flags);
- if (hard_ack == top && test_bit(RXRPC_CALL_RX_LAST, &call->flags))
- rxrpc_end_rx_phase(call);
+ _debug("%u,%u,%02x", hard_ack, top, flags);
+ trace_rxrpc_receive(call, rxrpc_receive_rotate, serial, hard_ack);
+ if (flags & RXRPC_LAST_PACKET) {
+ rxrpc_end_rx_phase(call, serial);
+ } else {
+ /* Check to see if there's an ACK that needs sending. */
+ if (after_eq(hard_ack, call->ackr_consumed + 2) ||
+ after_eq(top, call->ackr_seen + 2) ||
+ (hard_ack == top && after(hard_ack, call->ackr_consumed)))
+ rxrpc_propose_ACK(call, RXRPC_ACK_DELAY, 0, serial,
+ true, false,
+ rxrpc_propose_ack_rotate_rx);
+ if (call->ackr_reason)
+ rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
+ }
}
/*
@@ -240,9 +267,6 @@
int ret;
u8 annotation = *_annotation;
- if (offset > 0)
- return 0;
-
/* Locate the subpacket */
offset = sp->offset;
len = skb->len - sp->offset;
@@ -281,32 +305,53 @@
size_t remain;
bool last;
unsigned int rx_pkt_offset, rx_pkt_len;
- int ix, copy, ret = 0;
-
- _enter("");
+ int ix, copy, ret = -EAGAIN, ret2;
rx_pkt_offset = call->rx_pkt_offset;
rx_pkt_len = call->rx_pkt_len;
+ if (call->state >= RXRPC_CALL_SERVER_ACK_REQUEST) {
+ seq = call->rx_hard_ack;
+ ret = 1;
+ goto done;
+ }
+
/* Barriers against rxrpc_input_data(). */
hard_ack = call->rx_hard_ack;
top = smp_load_acquire(&call->rx_top);
for (seq = hard_ack + 1; before_eq(seq, top); seq++) {
ix = seq & RXRPC_RXTX_BUFF_MASK;
skb = call->rxtx_buffer[ix];
- if (!skb)
+ if (!skb) {
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_hole, seq,
+ rx_pkt_offset, rx_pkt_len, 0);
break;
+ }
smp_rmb();
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_rx_seen);
sp = rxrpc_skb(skb);
+ if (!(flags & MSG_PEEK))
+ trace_rxrpc_receive(call, rxrpc_receive_front,
+ sp->hdr.serial, seq);
+
if (msg)
sock_recv_timestamp(msg, sock->sk, skb);
- ret = rxrpc_locate_data(call, skb, &call->rxtx_annotations[ix],
- &rx_pkt_offset, &rx_pkt_len);
- _debug("recvmsg %x DATA #%u { %d, %d }",
- sp->hdr.callNumber, seq, rx_pkt_offset, rx_pkt_len);
+ if (rx_pkt_offset == 0) {
+ ret2 = rxrpc_locate_data(call, skb,
+ &call->rxtx_annotations[ix],
+ &rx_pkt_offset, &rx_pkt_len);
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_next, seq,
+ rx_pkt_offset, rx_pkt_len, ret2);
+ if (ret2 < 0) {
+ ret = ret2;
+ goto out;
+ }
+ } else {
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_cont, seq,
+ rx_pkt_offset, rx_pkt_len, 0);
+ }
/* We have to handle short, empty and used-up DATA packets. */
remain = len - *_offset;
@@ -314,22 +359,24 @@
if (copy > remain)
copy = remain;
if (copy > 0) {
- ret = skb_copy_datagram_iter(skb, rx_pkt_offset, iter,
- copy);
- if (ret < 0)
+ ret2 = skb_copy_datagram_iter(skb, rx_pkt_offset, iter,
+ copy);
+ if (ret2 < 0) {
+ ret = ret2;
goto out;
+ }
/* handle piecemeal consumption of data packets */
- _debug("copied %d @%zu", copy, *_offset);
-
rx_pkt_offset += copy;
rx_pkt_len -= copy;
*_offset += copy;
}
if (rx_pkt_len > 0) {
- _debug("buffer full");
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_full, seq,
+ rx_pkt_offset, rx_pkt_len, 0);
ASSERTCMP(*_offset, ==, len);
+ ret = 0;
break;
}
@@ -340,20 +387,21 @@
rx_pkt_offset = 0;
rx_pkt_len = 0;
- ASSERTIFCMP(last, seq, ==, top);
+ if (last) {
+ ASSERTCMP(seq, ==, READ_ONCE(call->rx_top));
+ ret = 1;
+ goto out;
+ }
}
- if (after(seq, top)) {
- ret = -EAGAIN;
- if (test_bit(RXRPC_CALL_RX_LAST, &call->flags))
- ret = 1;
- }
out:
if (!(flags & MSG_PEEK)) {
call->rx_pkt_offset = rx_pkt_offset;
call->rx_pkt_len = rx_pkt_len;
}
- _leave(" = %d [%u/%u]", ret, seq, top);
+done:
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_data_return, seq,
+ rx_pkt_offset, rx_pkt_len, ret);
return ret;
}
@@ -374,7 +422,7 @@
DEFINE_WAIT(wait);
- _enter(",,,%zu,%d", len, flags);
+ trace_rxrpc_recvmsg(NULL, rxrpc_recvmsg_enter, 0, 0, 0, 0);
if (flags & (MSG_OOB | MSG_TRUNC))
return -EOPNOTSUPP;
@@ -394,8 +442,10 @@
if (list_empty(&rx->recvmsg_q)) {
ret = -EWOULDBLOCK;
- if (timeo == 0)
+ if (timeo == 0) {
+ call = NULL;
goto error_no_call;
+ }
release_sock(&rx->sk);
@@ -409,6 +459,8 @@
if (list_empty(&rx->recvmsg_q)) {
if (signal_pending(current))
goto wait_interrupted;
+ trace_rxrpc_recvmsg(NULL, rxrpc_recvmsg_wait,
+ 0, 0, 0, 0);
timeo = schedule_timeout(timeo);
}
finish_wait(sk_sleep(&rx->sk), &wait);
@@ -427,7 +479,7 @@
rxrpc_get_call(call, rxrpc_call_got);
write_unlock_bh(&rx->recvmsg_lock);
- _debug("recvmsg call %p", call);
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_dequeue, 0, 0, 0, 0);
if (test_bit(RXRPC_CALL_RELEASED, &call->flags))
BUG();
@@ -497,16 +549,15 @@
rxrpc_put_call(call, rxrpc_call_put);
error_no_call:
release_sock(&rx->sk);
- _leave(" = %d", ret);
+ trace_rxrpc_recvmsg(call, rxrpc_recvmsg_return, 0, 0, 0, ret);
return ret;
wait_interrupted:
ret = sock_intr_errno(timeo);
wait_error:
finish_wait(sk_sleep(&rx->sk), &wait);
- release_sock(&rx->sk);
- _leave(" = %d [wait]", ret);
- return ret;
+ call = NULL;
+ goto error_no_call;
}
/**
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index ae39255..88d080a 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -80,12 +80,10 @@
case RXRPC_SECURITY_AUTH:
conn->size_align = 8;
conn->security_size = sizeof(struct rxkad_level1_hdr);
- conn->header_size += sizeof(struct rxkad_level1_hdr);
break;
case RXRPC_SECURITY_ENCRYPT:
conn->size_align = 8;
conn->security_size = sizeof(struct rxkad_level2_hdr);
- conn->header_size += sizeof(struct rxkad_level2_hdr);
break;
default:
ret = -EKEYREJECTED;
@@ -161,7 +159,7 @@
_enter("");
- check = sp->hdr.seq ^ sp->hdr.callNumber;
+ check = sp->hdr.seq ^ call->call_id;
data_size |= (u32)check << 16;
hdr.data_size = htonl(data_size);
@@ -205,7 +203,7 @@
_enter("");
- check = sp->hdr.seq ^ sp->hdr.callNumber;
+ check = sp->hdr.seq ^ call->call_id;
rxkhdr.data_size = htonl(data_size | (u32)check << 16);
rxkhdr.checksum = 0;
@@ -277,7 +275,7 @@
/* calculate the security checksum */
x = (call->cid & RXRPC_CHANNELMASK) << (32 - RXRPC_CIDSHIFT);
x |= sp->hdr.seq & 0x3fffffff;
- call->crypto_buf[0] = htonl(sp->hdr.callNumber);
+ call->crypto_buf[0] = htonl(call->call_id);
call->crypto_buf[1] = htonl(x);
sg_init_one(&sg, call->crypto_buf, 8);
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index cba2365..1f8040d 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -45,7 +45,9 @@
for (;;) {
set_current_state(TASK_INTERRUPTIBLE);
ret = 0;
- if (call->tx_top - call->tx_hard_ack < call->tx_winsize)
+ if (call->tx_top - call->tx_hard_ack <
+ min_t(unsigned int, call->tx_winsize,
+ call->cong_cwnd + call->cong_extra))
break;
if (call->state >= RXRPC_CALL_COMPLETE) {
ret = -call->error;
@@ -56,6 +58,7 @@
break;
}
+ trace_rxrpc_transmit(call, rxrpc_transmit_wait);
release_sock(&rx->sk);
*timeo = schedule_timeout(*timeo);
lock_sock(&rx->sk);
@@ -93,19 +96,30 @@
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
rxrpc_seq_t seq = sp->hdr.seq;
int ret, ix;
+ u8 annotation = RXRPC_TX_ANNO_UNACK;
_net("queue skb %p [%d]", skb, seq);
ASSERTCMP(seq, ==, call->tx_top + 1);
+ if (last)
+ annotation |= RXRPC_TX_ANNO_LAST;
+
+ /* We have to set the timestamp before queueing as the retransmit
+ * algorithm can see the packet as soon as we queue it.
+ */
+ skb->tstamp = ktime_get_real();
+
ix = seq & RXRPC_RXTX_BUFF_MASK;
- rxrpc_get_skb(skb);
- call->rxtx_annotations[ix] = RXRPC_TX_ANNO_UNACK;
+ rxrpc_get_skb(skb, rxrpc_skb_tx_got);
+ call->rxtx_annotations[ix] = annotation;
smp_wmb();
call->rxtx_buffer[ix] = skb;
call->tx_top = seq;
if (last)
- set_bit(RXRPC_CALL_TX_LAST, &call->flags);
+ trace_rxrpc_transmit(call, rxrpc_transmit_queue_last);
+ else
+ trace_rxrpc_transmit(call, rxrpc_transmit_queue);
if (last || call->state == RXRPC_CALL_SERVER_ACK_REQUEST) {
_debug("________awaiting reply/ACK__________");
@@ -127,46 +141,29 @@
write_unlock_bh(&call->state_lock);
}
- _proto("Tx DATA %%%u { #%u }", sp->hdr.serial, sp->hdr.seq);
-
if (seq == 1 && rxrpc_is_client_call(call))
rxrpc_expose_client_call(call);
- sp->resend_at = jiffies + rxrpc_resend_timeout;
- ret = rxrpc_send_data_packet(call->conn, skb);
+ ret = rxrpc_send_data_packet(call, skb);
if (ret < 0) {
_debug("need instant resend %d", ret);
rxrpc_instant_resend(call, ix);
+ } else {
+ unsigned long resend_at;
+
+ resend_at = jiffies + msecs_to_jiffies(rxrpc_resend_timeout);
+
+ if (time_before(resend_at, call->resend_at)) {
+ call->resend_at = resend_at;
+ rxrpc_set_timer(call, rxrpc_timer_set_for_send);
+ }
}
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_tx_freed);
_leave("");
}
/*
- * Convert a host-endian header into a network-endian header.
- */
-static void rxrpc_insert_header(struct sk_buff *skb)
-{
- struct rxrpc_wire_header whdr;
- struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
-
- whdr.epoch = htonl(sp->hdr.epoch);
- whdr.cid = htonl(sp->hdr.cid);
- whdr.callNumber = htonl(sp->hdr.callNumber);
- whdr.seq = htonl(sp->hdr.seq);
- whdr.serial = htonl(sp->hdr.serial);
- whdr.type = sp->hdr.type;
- whdr.flags = sp->hdr.flags;
- whdr.userStatus = sp->hdr.userStatus;
- whdr.securityIndex = sp->hdr.securityIndex;
- whdr._rsvd = htons(sp->hdr._rsvd);
- whdr.serviceId = htons(sp->hdr.serviceId);
-
- memcpy(skb->head, &whdr, sizeof(whdr));
-}
-
-/*
* send data through a socket
* - must be called in process context
* - caller holds the socket locked
@@ -194,17 +191,22 @@
skb = call->tx_pending;
call->tx_pending = NULL;
- rxrpc_see_skb(skb);
+ rxrpc_see_skb(skb, rxrpc_skb_tx_seen);
copied = 0;
do {
+ /* Check to see if there's a ping ACK to reply to. */
+ if (call->ackr_reason == RXRPC_ACK_PING_RESPONSE)
+ rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
+
if (!skb) {
size_t size, chunk, max, space;
_debug("alloc");
if (call->tx_top - call->tx_hard_ack >=
- call->tx_winsize) {
+ min_t(unsigned int, call->tx_winsize,
+ call->cong_cwnd + call->cong_extra)) {
ret = -EAGAIN;
if (msg->msg_flags & MSG_DONTWAIT)
goto maybe_error;
@@ -214,7 +216,7 @@
goto maybe_error;
}
- max = call->conn->params.peer->maxdata;
+ max = RXRPC_JUMBO_DATALEN;
max -= call->conn->security_size;
max &= ~(call->conn->size_align - 1UL);
@@ -225,7 +227,7 @@
space = chunk + call->conn->size_align;
space &= ~(call->conn->size_align - 1UL);
- size = space + call->conn->header_size;
+ size = space + call->conn->security_size;
_debug("SIZE: %zu/%zu/%zu", chunk, space, size);
@@ -235,15 +237,15 @@
if (!skb)
goto maybe_error;
- rxrpc_new_skb(skb);
+ rxrpc_new_skb(skb, rxrpc_skb_tx_new);
_debug("ALLOC SEND %p", skb);
ASSERTCMP(skb->mark, ==, 0);
- _debug("HS: %u", call->conn->header_size);
- skb_reserve(skb, call->conn->header_size);
- skb->len += call->conn->header_size;
+ _debug("HS: %u", call->conn->security_size);
+ skb_reserve(skb, call->conn->security_size);
+ skb->len += call->conn->security_size;
sp = rxrpc_skb(skb);
sp->remain = chunk;
@@ -305,33 +307,21 @@
seq = call->tx_top + 1;
- sp->hdr.epoch = conn->proto.epoch;
- sp->hdr.cid = call->cid;
- sp->hdr.callNumber = call->call_id;
sp->hdr.seq = seq;
- sp->hdr.serial = atomic_inc_return(&conn->serial);
- sp->hdr.type = RXRPC_PACKET_TYPE_DATA;
- sp->hdr.userStatus = 0;
- sp->hdr.securityIndex = call->security_ix;
sp->hdr._rsvd = 0;
- sp->hdr.serviceId = call->service_id;
+ sp->hdr.flags = conn->out_clientflag;
- sp->hdr.flags = conn->out_clientflag;
if (msg_data_left(msg) == 0 && !more)
sp->hdr.flags |= RXRPC_LAST_PACKET;
else if (call->tx_top - call->tx_hard_ack <
call->tx_winsize)
sp->hdr.flags |= RXRPC_MORE_PACKETS;
- if (more && seq & 1)
- sp->hdr.flags |= RXRPC_REQUEST_ACK;
ret = conn->security->secure_packet(
- call, skb, skb->mark,
- skb->head + sizeof(struct rxrpc_wire_header));
+ call, skb, skb->mark, skb->head);
if (ret < 0)
goto out;
- rxrpc_insert_header(skb);
rxrpc_queue_packet(call, skb, !msg_data_left(msg) && !more);
skb = NULL;
}
@@ -345,7 +335,7 @@
return ret;
call_terminated:
- rxrpc_free_skb(skb);
+ rxrpc_free_skb(skb, rxrpc_skb_tx_freed);
_leave(" = %d", -call->error);
return -call->error;
diff --git a/net/rxrpc/skbuff.c b/net/rxrpc/skbuff.c
index 620d9cc..5154cbf 100644
--- a/net/rxrpc/skbuff.c
+++ b/net/rxrpc/skbuff.c
@@ -18,55 +18,77 @@
#include <net/af_rxrpc.h>
#include "ar-internal.h"
+#define select_skb_count(op) (op >= rxrpc_skb_tx_cleaned ? &rxrpc_n_tx_skbs : &rxrpc_n_rx_skbs)
+
/*
- * Note the existence of a new-to-us socket buffer (allocated or dequeued).
+ * Note the allocation or reception of a socket buffer.
*/
-void rxrpc_new_skb(struct sk_buff *skb)
+void rxrpc_new_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
- int n = atomic_inc_return(&rxrpc_n_skbs);
- trace_rxrpc_skb(skb, 0, atomic_read(&skb->users), n, here);
+ int n = atomic_inc_return(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
}
/*
* Note the re-emergence of a socket buffer from a queue or buffer.
*/
-void rxrpc_see_skb(struct sk_buff *skb)
+void rxrpc_see_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
if (skb) {
- int n = atomic_read(&rxrpc_n_skbs);
- trace_rxrpc_skb(skb, 1, atomic_read(&skb->users), n, here);
+ int n = atomic_read(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
}
}
/*
* Note the addition of a ref on a socket buffer.
*/
-void rxrpc_get_skb(struct sk_buff *skb)
+void rxrpc_get_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
- int n = atomic_inc_return(&rxrpc_n_skbs);
- trace_rxrpc_skb(skb, 2, atomic_read(&skb->users), n, here);
+ int n = atomic_inc_return(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
skb_get(skb);
}
/*
* Note the destruction of a socket buffer.
*/
-void rxrpc_free_skb(struct sk_buff *skb)
+void rxrpc_free_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
if (skb) {
int n;
CHECK_SLAB_OKAY(&skb->users);
- n = atomic_dec_return(&rxrpc_n_skbs);
- trace_rxrpc_skb(skb, 3, atomic_read(&skb->users), n, here);
+ n = atomic_dec_return(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
kfree_skb(skb);
}
}
/*
+ * Note the injected loss of a socket buffer.
+ */
+void rxrpc_lose_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
+{
+ const void *here = __builtin_return_address(0);
+ if (skb) {
+ int n;
+ CHECK_SLAB_OKAY(&skb->users);
+ if (op == rxrpc_skb_tx_lost) {
+ n = atomic_read(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ } else {
+ n = atomic_dec_return(select_skb_count(op));
+ trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ kfree_skb(skb);
+ }
+ }
+}
+
+/*
* Clear a queue of socket buffers.
*/
void rxrpc_purge_queue(struct sk_buff_head *list)
@@ -74,8 +96,9 @@
const void *here = __builtin_return_address(0);
struct sk_buff *skb;
while ((skb = skb_dequeue((list))) != NULL) {
- int n = atomic_dec_return(&rxrpc_n_skbs);
- trace_rxrpc_skb(skb, 4, atomic_read(&skb->users), n, here);
+ int n = atomic_dec_return(select_skb_count(rxrpc_skb_rx_purged));
+ trace_rxrpc_skb(skb, rxrpc_skb_rx_purged,
+ atomic_read(&skb->users), n, here);
kfree_skb(skb);
}
}
diff --git a/net/rxrpc/sysctl.c b/net/rxrpc/sysctl.c
index a03c61c..13d1df03 100644
--- a/net/rxrpc/sysctl.c
+++ b/net/rxrpc/sysctl.c
@@ -59,7 +59,7 @@
.data = &rxrpc_resend_timeout,
.maxlen = sizeof(unsigned int),
.mode = 0644,
- .proc_handler = proc_dointvec_ms_jiffies,
+ .proc_handler = proc_dointvec,
.extra1 = (void *)&one,
},
{
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 7795d5a..87956a7 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -793,6 +793,11 @@
depends on NET_ACT_IFE
---help---
+config NET_IFE_SKBTCINDEX
+ tristate "Support to encoding decoding skb tcindex on IFE action"
+ depends on NET_ACT_IFE
+ ---help---
+
config NET_CLS_IND
bool "Incoming device classification"
depends on NET_CLS_U32 || NET_CLS_FW
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 148ae0d..4bdda36 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -23,6 +23,7 @@
obj-$(CONFIG_NET_ACT_IFE) += act_ife.o
obj-$(CONFIG_NET_IFE_SKBMARK) += act_meta_mark.o
obj-$(CONFIG_NET_IFE_SKBPRIO) += act_meta_skbprio.o
+obj-$(CONFIG_NET_IFE_SKBTCINDEX) += act_meta_skbtcindex.o
obj-$(CONFIG_NET_ACT_TUNNEL_KEY)+= act_tunnel_key.o
obj-$(CONFIG_NET_SCH_FIFO) += sch_fifo.o
obj-$(CONFIG_NET_SCH_CBQ) += sch_cbq.o
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index d09d068..c910217 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -592,9 +592,19 @@
return ERR_PTR(err);
}
-int tcf_action_init(struct net *net, struct nlattr *nla,
- struct nlattr *est, char *name, int ovr,
- int bind, struct list_head *actions)
+static void cleanup_a(struct list_head *actions, int ovr)
+{
+ struct tc_action *a;
+
+ if (!ovr)
+ return;
+
+ list_for_each_entry(a, actions, list)
+ a->tcfa_refcnt--;
+}
+
+int tcf_action_init(struct net *net, struct nlattr *nla, struct nlattr *est,
+ char *name, int ovr, int bind, struct list_head *actions)
{
struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
struct tc_action *act;
@@ -612,8 +622,15 @@
goto err;
}
act->order = i;
+ if (ovr)
+ act->tcfa_refcnt++;
list_add_tail(&act->list, actions);
}
+
+ /* Remove the temp refcnt which was necessary to protect against
+ * destroying an existing action which was being replaced
+ */
+ cleanup_a(actions, ovr);
return 0;
err:
@@ -883,6 +900,8 @@
goto err;
}
act->order = i;
+ if (event == RTM_GETACTION)
+ act->tcfa_refcnt++;
list_add_tail(&act->list, &actions);
}
@@ -923,9 +942,8 @@
return err;
}
-static int
-tcf_action_add(struct net *net, struct nlattr *nla, struct nlmsghdr *n,
- u32 portid, int ovr)
+static int tcf_action_add(struct net *net, struct nlattr *nla,
+ struct nlmsghdr *n, u32 portid, int ovr)
{
int ret = 0;
LIST_HEAD(actions);
@@ -988,8 +1006,7 @@
return ret;
}
-static struct nlattr *
-find_dump_kind(const struct nlmsghdr *n)
+static struct nlattr *find_dump_kind(const struct nlmsghdr *n)
{
struct nlattr *tb1, *tb2[TCA_ACT_MAX + 1];
struct nlattr *tb[TCA_ACT_MAX_PRIO + 1];
@@ -1016,8 +1033,7 @@
return kind;
}
-static int
-tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
+static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
{
struct net *net = sock_net(skb->sk);
struct nlmsghdr *nlh;
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index b5dbf63..e0defce 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -116,8 +116,8 @@
return (void *)(skb_network_header(skb) + ihl);
}
-static int tcf_csum_ipv4_icmp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl)
+static int tcf_csum_ipv4_icmp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl)
{
struct icmphdr *icmph;
@@ -152,8 +152,8 @@
return 1;
}
-static int tcf_csum_ipv6_icmp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl)
+static int tcf_csum_ipv6_icmp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl)
{
struct icmp6hdr *icmp6h;
const struct ipv6hdr *ip6h;
@@ -174,8 +174,8 @@
return 1;
}
-static int tcf_csum_ipv4_tcp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl)
+static int tcf_csum_ipv4_tcp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl)
{
struct tcphdr *tcph;
const struct iphdr *iph;
@@ -195,8 +195,8 @@
return 1;
}
-static int tcf_csum_ipv6_tcp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl)
+static int tcf_csum_ipv6_tcp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl)
{
struct tcphdr *tcph;
const struct ipv6hdr *ip6h;
@@ -217,8 +217,8 @@
return 1;
}
-static int tcf_csum_ipv4_udp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl, int udplite)
+static int tcf_csum_ipv4_udp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl, int udplite)
{
struct udphdr *udph;
const struct iphdr *iph;
@@ -270,8 +270,8 @@
return 1;
}
-static int tcf_csum_ipv6_udp(struct sk_buff *skb,
- unsigned int ihl, unsigned int ipl, int udplite)
+static int tcf_csum_ipv6_udp(struct sk_buff *skb, unsigned int ihl,
+ unsigned int ipl, int udplite)
{
struct udphdr *udph;
const struct ipv6hdr *ip6h;
@@ -380,8 +380,8 @@
return 0;
}
-static int tcf_csum_ipv6_hopopts(struct ipv6_opt_hdr *ip6xh,
- unsigned int ixhl, unsigned int *pl)
+static int tcf_csum_ipv6_hopopts(struct ipv6_opt_hdr *ip6xh, unsigned int ixhl,
+ unsigned int *pl)
{
int off, len, optlen;
unsigned char *xh = (void *)ip6xh;
@@ -494,8 +494,8 @@
return 0;
}
-static int tcf_csum(struct sk_buff *skb,
- const struct tc_action *a, struct tcf_result *res)
+static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
+ struct tcf_result *res)
{
struct tcf_csum *p = to_tcf_csum(a);
int action;
@@ -531,8 +531,8 @@
return TC_ACT_SHOT;
}
-static int tcf_csum_dump(struct sk_buff *skb,
- struct tc_action *a, int bind, int ref)
+static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
+ int ref)
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_csum *p = to_tcf_csum(a);
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index e24a409..e0aa30f 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -156,7 +156,8 @@
int action = READ_ONCE(gact->tcf_action);
struct tcf_t *tm = &gact->tcf_tm;
- _bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), bytes, packets);
+ _bstats_cpu_update(this_cpu_ptr(gact->common.cpu_bstats), bytes,
+ packets);
if (action == TC_ACT_SHOT)
this_cpu_ptr(gact->common.cpu_qstats)->drops += packets;
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index e87cd81..ccf7b4b 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -63,6 +63,23 @@
}
EXPORT_SYMBOL_GPL(ife_tlv_meta_encode);
+int ife_encode_meta_u16(u16 metaval, void *skbdata, struct tcf_meta_info *mi)
+{
+ u16 edata = 0;
+
+ if (mi->metaval)
+ edata = *(u16 *)mi->metaval;
+ else if (metaval)
+ edata = metaval;
+
+ if (!edata) /* will not encode */
+ return 0;
+
+ edata = htons(edata);
+ return ife_tlv_meta_encode(skbdata, mi->metaid, 2, &edata);
+}
+EXPORT_SYMBOL_GPL(ife_encode_meta_u16);
+
int ife_get_meta_u32(struct sk_buff *skb, struct tcf_meta_info *mi)
{
if (mi->metaval)
@@ -81,6 +98,15 @@
}
EXPORT_SYMBOL_GPL(ife_check_meta_u32);
+int ife_check_meta_u16(u16 metaval, struct tcf_meta_info *mi)
+{
+ if (metaval || mi->metaval)
+ return 8; /* T+L+(V) == 2+2+(2+2bytepad) */
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(ife_check_meta_u16);
+
int ife_encode_meta_u32(u32 metaval, void *skbdata, struct tcf_meta_info *mi)
{
u32 edata = metaval;
diff --git a/net/sched/act_meta_skbtcindex.c b/net/sched/act_meta_skbtcindex.c
new file mode 100644
index 0000000..3b35774
--- /dev/null
+++ b/net/sched/act_meta_skbtcindex.c
@@ -0,0 +1,79 @@
+/*
+ * net/sched/act_meta_tc_index.c IFE skb->tc_index metadata module
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * copyright Jamal Hadi Salim (2016)
+ *
+*/
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <linux/rtnetlink.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <uapi/linux/tc_act/tc_ife.h>
+#include <net/tc_act/tc_ife.h>
+#include <linux/rtnetlink.h>
+
+static int skbtcindex_encode(struct sk_buff *skb, void *skbdata,
+ struct tcf_meta_info *e)
+{
+ u32 ifetc_index = skb->tc_index;
+
+ return ife_encode_meta_u16(ifetc_index, skbdata, e);
+}
+
+static int skbtcindex_decode(struct sk_buff *skb, void *data, u16 len)
+{
+ u16 ifetc_index = *(u16 *)data;
+
+ skb->tc_index = ntohs(ifetc_index);
+ return 0;
+}
+
+static int skbtcindex_check(struct sk_buff *skb, struct tcf_meta_info *e)
+{
+ return ife_check_meta_u16(skb->tc_index, e);
+}
+
+static struct tcf_meta_ops ife_skbtcindex_ops = {
+ .metaid = IFE_META_TCINDEX,
+ .metatype = NLA_U16,
+ .name = "tc_index",
+ .synopsis = "skb tc_index 16 bit metadata",
+ .check_presence = skbtcindex_check,
+ .encode = skbtcindex_encode,
+ .decode = skbtcindex_decode,
+ .get = ife_get_meta_u16,
+ .alloc = ife_alloc_meta_u16,
+ .release = ife_release_meta_gen,
+ .validate = ife_validate_meta_u16,
+ .owner = THIS_MODULE,
+};
+
+static int __init ifetc_index_init_module(void)
+{
+ return register_ife_op(&ife_skbtcindex_ops);
+}
+
+static void __exit ifetc_index_cleanup_module(void)
+{
+ unregister_ife_op(&ife_skbtcindex_ops);
+}
+
+module_init(ifetc_index_init_module);
+module_exit(ifetc_index_cleanup_module);
+
+MODULE_AUTHOR("Jamal Hadi Salim(2016)");
+MODULE_DESCRIPTION("Inter-FE skb tc_index metadata module");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_IFE_META(IFE_META_SKBTCINDEX);
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6038c85..667dc38 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -204,7 +204,15 @@
return retval;
}
-static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
+static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets,
+ u64 lastuse)
+{
+ tcf_lastuse_update(&a->tcfa_tm);
+ _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
+}
+
+static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
+ int ref)
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_mirred *m = to_mirred(a);
@@ -280,6 +288,7 @@
.type = TCA_ACT_MIRRED,
.owner = THIS_MODULE,
.act = tcf_mirred,
+ .stats_update = tcf_stats_update,
.dump = tcf_mirred_dump,
.cleanup = tcf_mirred_release,
.init = tcf_mirred_init,
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 8a3be1d..d1bd248 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -249,6 +249,8 @@
police->tcfp_t_c = now;
police->tcfp_toks = toks;
police->tcfp_ptoks = ptoks;
+ if (police->tcfp_result == TC_ACT_SHOT)
+ police->tcf_qstats.drops++;
spin_unlock(&police->tcf_lock);
return police->tcfp_result;
}
@@ -261,8 +263,8 @@
return police->tcf_action;
}
-static int
-tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
+static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a,
+ int bind, int ref)
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_police *police = to_police(a);
@@ -347,14 +349,12 @@
.size = sizeof(struct tc_action_net),
};
-static int __init
-police_init_module(void)
+static int __init police_init_module(void)
{
return tcf_register_action(&act_police_ops, &police_net_ops);
}
-static void __exit
-police_cleanup_module(void)
+static void __exit police_cleanup_module(void)
{
tcf_unregister_action(&act_police_ops, &police_net_ops);
}
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 59a8d31..a95c00b 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -30,6 +30,7 @@
struct tcf_vlan *v = to_vlan(a);
int action;
int err;
+ u16 tci;
spin_lock(&v->tcf_lock);
tcf_lastuse_update(&v->tcf_tm);
@@ -48,6 +49,30 @@
if (err)
goto drop;
break;
+ case TCA_VLAN_ACT_MODIFY:
+ /* No-op if no vlan tag (either hw-accel or in-payload) */
+ if (!skb_vlan_tagged(skb))
+ goto unlock;
+ /* extract existing tag (and guarantee no hw-accel tag) */
+ if (skb_vlan_tag_present(skb)) {
+ tci = skb_vlan_tag_get(skb);
+ skb->vlan_tci = 0;
+ } else {
+ /* in-payload vlan tag, pop it */
+ err = __skb_vlan_pop(skb, &tci);
+ if (err)
+ goto drop;
+ }
+ /* replace the vid */
+ tci = (tci & ~VLAN_VID_MASK) | v->tcfv_push_vid;
+ /* replace prio bits, if tcfv_push_prio specified */
+ if (v->tcfv_push_prio) {
+ tci &= ~VLAN_PRIO_MASK;
+ tci |= v->tcfv_push_prio << VLAN_PRIO_SHIFT;
+ }
+ /* put updated tci as hwaccel tag */
+ __vlan_hwaccel_put_tag(skb, v->tcfv_push_proto, tci);
+ break;
default:
BUG();
}
@@ -102,6 +127,7 @@
case TCA_VLAN_ACT_POP:
break;
case TCA_VLAN_ACT_PUSH:
+ case TCA_VLAN_ACT_MODIFY:
if (!tb[TCA_VLAN_PUSH_VLAN_ID]) {
if (exists)
tcf_hash_release(*a, bind);
@@ -185,7 +211,8 @@
if (nla_put(skb, TCA_VLAN_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
- if (v->tcfv_action == TCA_VLAN_ACT_PUSH &&
+ if ((v->tcfv_action == TCA_VLAN_ACT_PUSH ||
+ v->tcfv_action == TCA_VLAN_ACT_MODIFY) &&
(nla_put_u16(skb, TCA_VLAN_PUSH_VLAN_ID, v->tcfv_push_vid) ||
nla_put_be16(skb, TCA_VLAN_PUSH_VLAN_PROTOCOL,
v->tcfv_push_proto) ||
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a7c5645..11da7da 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -344,13 +344,15 @@
if (err == 0) {
struct tcf_proto *next = rtnl_dereference(tp->next);
- tfilter_notify(net, skb, n, tp, fh, RTM_DELTFILTER);
+ tfilter_notify(net, skb, n, tp, fh,
+ RTM_DELTFILTER);
if (tcf_destroy(tp, false))
RCU_INIT_POINTER(*back, next);
}
goto errout;
case RTM_GETTFILTER:
- err = tfilter_notify(net, skb, n, tp, fh, RTM_NEWTFILTER);
+ err = tfilter_notify(net, skb, n, tp, fh,
+ RTM_NEWTFILTER);
goto errout;
default:
err = -EINVAL;
@@ -448,7 +450,8 @@
struct net *net = sock_net(a->skb->sk);
return tcf_fill_node(net, a->skb, tp, n, NETLINK_CB(a->cb->skb).portid,
- a->cb->nlh->nlmsg_seq, NLM_F_MULTI, RTM_NEWTFILTER);
+ a->cb->nlh->nlmsg_seq, NLM_F_MULTI,
+ RTM_NEWTFILTER);
}
/* called with RTNL */
@@ -552,7 +555,7 @@
EXPORT_SYMBOL(tcf_exts_destroy);
int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb,
- struct nlattr *rate_tlv, struct tcf_exts *exts, bool ovr)
+ struct nlattr *rate_tlv, struct tcf_exts *exts, bool ovr)
{
#ifdef CONFIG_NET_CLS_ACT
{
@@ -560,8 +563,7 @@
if (exts->police && tb[exts->police]) {
act = tcf_action_init_1(net, tb[exts->police], rate_tlv,
- "police", ovr,
- TCA_ACT_BIND);
+ "police", ovr, TCA_ACT_BIND);
if (IS_ERR(act))
return PTR_ERR(act);
@@ -573,8 +575,8 @@
int err, i = 0;
err = tcf_action_init(net, tb[exts->action], rate_tlv,
- NULL, ovr,
- TCA_ACT_BIND, &actions);
+ NULL, ovr, TCA_ACT_BIND,
+ &actions);
if (err)
return err;
list_for_each_entry(act, &actions, list)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 1d92d4d..bb1d5a4 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -27,6 +27,8 @@
MODULE_DESCRIPTION("TC BPF based classifier");
#define CLS_BPF_NAME_LEN 256
+#define CLS_BPF_SUPPORTED_GEN_FLAGS \
+ (TCA_CLS_FLAGS_SKIP_HW | TCA_CLS_FLAGS_SKIP_SW)
struct cls_bpf_head {
struct list_head plist;
@@ -39,6 +41,8 @@
struct list_head link;
struct tcf_result res;
bool exts_integrated;
+ bool offloaded;
+ u32 gen_flags;
struct tcf_exts exts;
u32 handle;
union {
@@ -54,8 +58,10 @@
static const struct nla_policy bpf_policy[TCA_BPF_MAX + 1] = {
[TCA_BPF_CLASSID] = { .type = NLA_U32 },
[TCA_BPF_FLAGS] = { .type = NLA_U32 },
+ [TCA_BPF_FLAGS_GEN] = { .type = NLA_U32 },
[TCA_BPF_FD] = { .type = NLA_U32 },
- [TCA_BPF_NAME] = { .type = NLA_NUL_STRING, .len = CLS_BPF_NAME_LEN },
+ [TCA_BPF_NAME] = { .type = NLA_NUL_STRING,
+ .len = CLS_BPF_NAME_LEN },
[TCA_BPF_OPS_LEN] = { .type = NLA_U16 },
[TCA_BPF_OPS] = { .type = NLA_BINARY,
.len = sizeof(struct sock_filter) * BPF_MAXINSNS },
@@ -90,7 +96,9 @@
qdisc_skb_cb(skb)->tc_classid = prog->res.classid;
- if (at_ingress) {
+ if (tc_skip_sw(prog->gen_flags)) {
+ filter_res = prog->exts_integrated ? TC_ACT_UNSPEC : 0;
+ } else if (at_ingress) {
/* It is safe to push/pull even if skb_shared() */
__skb_push(skb, skb->mac_len);
bpf_compute_data_end(skb);
@@ -137,6 +145,91 @@
return !prog->bpf_ops;
}
+static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
+ enum tc_clsbpf_command cmd)
+{
+ struct net_device *dev = tp->q->dev_queue->dev;
+ struct tc_cls_bpf_offload bpf_offload = {};
+ struct tc_to_netdev offload;
+
+ offload.type = TC_SETUP_CLSBPF;
+ offload.cls_bpf = &bpf_offload;
+
+ bpf_offload.command = cmd;
+ bpf_offload.exts = &prog->exts;
+ bpf_offload.prog = prog->filter;
+ bpf_offload.name = prog->bpf_name;
+ bpf_offload.exts_integrated = prog->exts_integrated;
+ bpf_offload.gen_flags = prog->gen_flags;
+
+ return dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
+ tp->protocol, &offload);
+}
+
+static int cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
+ struct cls_bpf_prog *oldprog)
+{
+ struct net_device *dev = tp->q->dev_queue->dev;
+ struct cls_bpf_prog *obj = prog;
+ enum tc_clsbpf_command cmd;
+ bool skip_sw;
+ int ret;
+
+ skip_sw = tc_skip_sw(prog->gen_flags) ||
+ (oldprog && tc_skip_sw(oldprog->gen_flags));
+
+ if (oldprog && oldprog->offloaded) {
+ if (tc_should_offload(dev, tp, prog->gen_flags)) {
+ cmd = TC_CLSBPF_REPLACE;
+ } else if (!tc_skip_sw(prog->gen_flags)) {
+ obj = oldprog;
+ cmd = TC_CLSBPF_DESTROY;
+ } else {
+ return -EINVAL;
+ }
+ } else {
+ if (!tc_should_offload(dev, tp, prog->gen_flags))
+ return skip_sw ? -EINVAL : 0;
+ cmd = TC_CLSBPF_ADD;
+ }
+
+ ret = cls_bpf_offload_cmd(tp, obj, cmd);
+ if (ret)
+ return skip_sw ? ret : 0;
+
+ obj->offloaded = true;
+ if (oldprog)
+ oldprog->offloaded = false;
+
+ return 0;
+}
+
+static void cls_bpf_stop_offload(struct tcf_proto *tp,
+ struct cls_bpf_prog *prog)
+{
+ int err;
+
+ if (!prog->offloaded)
+ return;
+
+ err = cls_bpf_offload_cmd(tp, prog, TC_CLSBPF_DESTROY);
+ if (err) {
+ pr_err("Stopping hardware offload failed: %d\n", err);
+ return;
+ }
+
+ prog->offloaded = false;
+}
+
+static void cls_bpf_offload_update_stats(struct tcf_proto *tp,
+ struct cls_bpf_prog *prog)
+{
+ if (!prog->offloaded)
+ return;
+
+ cls_bpf_offload_cmd(tp, prog, TC_CLSBPF_STATS);
+}
+
static int cls_bpf_init(struct tcf_proto *tp)
{
struct cls_bpf_head *head;
@@ -176,6 +269,7 @@
{
struct cls_bpf_prog *prog = (struct cls_bpf_prog *) arg;
+ cls_bpf_stop_offload(tp, prog);
list_del_rcu(&prog->link);
tcf_unbind_filter(tp, &prog->res);
call_rcu(&prog->rcu, __cls_bpf_delete_prog);
@@ -192,6 +286,7 @@
return false;
list_for_each_entry_safe(prog, tmp, &head->plist, link) {
+ cls_bpf_stop_offload(tp, prog);
list_del_rcu(&prog->link);
tcf_unbind_filter(tp, &prog->res);
call_rcu(&prog->rcu, __cls_bpf_delete_prog);
@@ -301,6 +396,7 @@
{
bool is_bpf, is_ebpf, have_exts = false;
struct tcf_exts exts;
+ u32 gen_flags = 0;
int ret;
is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
@@ -325,8 +421,17 @@
have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
}
+ if (tb[TCA_BPF_FLAGS_GEN]) {
+ gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
+ if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS ||
+ !tc_flags_valid(gen_flags)) {
+ ret = -EINVAL;
+ goto errout;
+ }
+ }
prog->exts_integrated = have_exts;
+ prog->gen_flags = gen_flags;
ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) :
cls_bpf_prog_from_efd(tb, prog, tp);
@@ -409,10 +514,17 @@
goto errout;
}
- ret = cls_bpf_modify_existing(net, tp, prog, base, tb, tca[TCA_RATE], ovr);
+ ret = cls_bpf_modify_existing(net, tp, prog, base, tb, tca[TCA_RATE],
+ ovr);
if (ret < 0)
goto errout;
+ ret = cls_bpf_offload(tp, prog, oldprog);
+ if (ret) {
+ cls_bpf_delete_prog(tp, prog);
+ return ret;
+ }
+
if (oldprog) {
list_replace_rcu(&oldprog->link, &prog->link);
tcf_unbind_filter(tp, &oldprog->res);
@@ -474,6 +586,8 @@
tm->tcm_handle = prog->handle;
+ cls_bpf_offload_update_stats(tp, prog);
+
nest = nla_nest_start(skb, TCA_OPTIONS);
if (nest == NULL)
goto nla_put_failure;
@@ -496,6 +610,9 @@
bpf_flags |= TCA_BPF_FLAG_ACT_DIRECT;
if (bpf_flags && nla_put_u32(skb, TCA_BPF_FLAGS, bpf_flags))
goto nla_put_failure;
+ if (prog->gen_flags &&
+ nla_put_u32(skb, TCA_BPF_FLAGS_GEN, prog->gen_flags))
+ goto nla_put_failure;
nla_nest_end(skb, nest);
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index a379bae..e396723 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -87,12 +87,14 @@
return addr_fold(skb_dst(skb)) ^ (__force u16) tc_skb_protocol(skb);
}
-static u32 flow_get_proto(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_proto(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
return flow->basic.ip_proto;
}
-static u32 flow_get_proto_src(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_proto_src(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
if (flow->ports.ports)
return ntohs(flow->ports.src);
@@ -100,7 +102,8 @@
return addr_fold(skb->sk);
}
-static u32 flow_get_proto_dst(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_proto_dst(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
if (flow->ports.ports)
return ntohs(flow->ports.dst);
@@ -149,7 +152,8 @@
})
#endif
-static u32 flow_get_nfct_src(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_nfct_src(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
switch (tc_skb_protocol(skb)) {
case htons(ETH_P_IP):
@@ -161,7 +165,8 @@
return flow_get_src(skb, flow);
}
-static u32 flow_get_nfct_dst(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_nfct_dst(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
switch (tc_skb_protocol(skb)) {
case htons(ETH_P_IP):
@@ -173,14 +178,16 @@
return flow_get_dst(skb, flow);
}
-static u32 flow_get_nfct_proto_src(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_nfct_proto_src(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
return ntohs(CTTUPLE(skb, src.u.all));
fallback:
return flow_get_proto_src(skb, flow);
}
-static u32 flow_get_nfct_proto_dst(const struct sk_buff *skb, const struct flow_keys *flow)
+static u32 flow_get_nfct_proto_dst(const struct sk_buff *skb,
+ const struct flow_keys *flow)
{
return ntohs(CTTUPLE(skb, dst.u.all));
fallback:
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index a3f4c70..f6f40fb 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -241,7 +241,8 @@
tc.type = TC_SETUP_CLSFLOWER;
tc.cls_flower = &offload;
- err = dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol, &tc);
+ err = dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, tp->protocol,
+ &tc);
if (tc_skip_sw(flags))
return err;
@@ -480,7 +481,7 @@
}
fl_set_key_val(tb, &key->enc_key_id.keyid, TCA_FLOWER_KEY_ENC_KEY_ID,
- &mask->enc_key_id.keyid, TCA_FLOWER_KEY_ENC_KEY_ID,
+ &mask->enc_key_id.keyid, TCA_FLOWER_UNSPEC,
sizeof(key->enc_key_id.keyid));
return 0;
@@ -918,7 +919,7 @@
goto nla_put_failure;
if (fl_dump_key_val(skb, &key->enc_key_id, TCA_FLOWER_KEY_ENC_KEY_ID,
- &mask->enc_key_id, TCA_FLOWER_KEY_ENC_KEY_ID,
+ &mask->enc_key_id, TCA_FLOWER_UNSPEC,
sizeof(key->enc_key_id)))
goto nla_put_failure;
diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
index cc0bda9..9dc63d5 100644
--- a/net/sched/cls_fw.c
+++ b/net/sched/cls_fw.c
@@ -57,7 +57,7 @@
}
static int fw_classify(struct sk_buff *skb, const struct tcf_proto *tp,
- struct tcf_result *res)
+ struct tcf_result *res)
{
struct fw_head *head = rcu_dereference_bh(tp->root);
struct fw_filter *f;
@@ -188,7 +188,8 @@
static int
fw_change_attrs(struct net *net, struct tcf_proto *tp, struct fw_filter *f,
- struct nlattr **tb, struct nlattr **tca, unsigned long base, bool ovr)
+ struct nlattr **tb, struct nlattr **tca, unsigned long base,
+ bool ovr)
{
struct fw_head *head = rtnl_dereference(tp->root);
struct tcf_exts e;
@@ -237,9 +238,8 @@
static int fw_change(struct net *net, struct sk_buff *in_skb,
struct tcf_proto *tp, unsigned long base,
- u32 handle,
- struct nlattr **tca,
- unsigned long *arg, bool ovr)
+ u32 handle, struct nlattr **tca, unsigned long *arg,
+ bool ovr)
{
struct fw_head *head = rtnl_dereference(tp->root);
struct fw_filter *f = (struct fw_filter *) *arg;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index c91e65d..455fc8f 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -268,8 +268,7 @@
return 0;
}
-static void
-route4_delete_filter(struct rcu_head *head)
+static void route4_delete_filter(struct rcu_head *head)
{
struct route4_filter *f = container_of(head, struct route4_filter, rcu);
@@ -474,10 +473,8 @@
}
static int route4_change(struct net *net, struct sk_buff *in_skb,
- struct tcf_proto *tp, unsigned long base,
- u32 handle,
- struct nlattr **tca,
- unsigned long *arg, bool ovr)
+ struct tcf_proto *tp, unsigned long base, u32 handle,
+ struct nlattr **tca, unsigned long *arg, bool ovr)
{
struct route4_head *head = rtnl_dereference(tp->root);
struct route4_filter __rcu **fp;
@@ -562,7 +559,8 @@
return 0;
errout:
- tcf_exts_destroy(&f->exts);
+ if (f)
+ tcf_exts_destroy(&f->exts);
kfree(f);
return err;
}
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
index d950070..96144bd 100644
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@ -50,14 +50,13 @@
struct rcu_head rcu;
};
-static inline int
-tcindex_filter_is_set(struct tcindex_filter_result *r)
+static inline int tcindex_filter_is_set(struct tcindex_filter_result *r)
{
return tcf_exts_is_predicative(&r->exts) || r->res.classid;
}
-static struct tcindex_filter_result *
-tcindex_lookup(struct tcindex_data *p, u16 key)
+static struct tcindex_filter_result *tcindex_lookup(struct tcindex_data *p,
+ u16 key)
{
if (p->perfect) {
struct tcindex_filter_result *f = p->perfect + key;
@@ -144,7 +143,8 @@
static void tcindex_destroy_fexts(struct rcu_head *head)
{
- struct tcindex_filter *f = container_of(head, struct tcindex_filter, rcu);
+ struct tcindex_filter *f = container_of(head, struct tcindex_filter,
+ rcu);
tcf_exts_destroy(&f->result.exts);
kfree(f);
@@ -550,7 +550,7 @@
static int tcindex_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
- struct sk_buff *skb, struct tcmsg *t)
+ struct sk_buff *skb, struct tcmsg *t)
{
struct tcindex_data *p = rtnl_dereference(tp->root);
struct tcindex_filter_result *r = (struct tcindex_filter_result *) fh;
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index a29263a..ae83c3ae 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -104,7 +104,8 @@
return h;
}
-static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp, struct tcf_result *res)
+static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp,
+ struct tcf_result *res)
{
struct {
struct tc_u_knode *knode;
@@ -256,8 +257,7 @@
return -1;
}
-static struct tc_u_hnode *
-u32_lookup_ht(struct tc_u_common *tp_c, u32 handle)
+static struct tc_u_hnode *u32_lookup_ht(struct tc_u_common *tp_c, u32 handle)
{
struct tc_u_hnode *ht;
@@ -270,8 +270,7 @@
return ht;
}
-static struct tc_u_knode *
-u32_lookup_key(struct tc_u_hnode *ht, u32 handle)
+static struct tc_u_knode *u32_lookup_key(struct tc_u_hnode *ht, u32 handle)
{
unsigned int sel;
struct tc_u_knode *n = NULL;
@@ -360,8 +359,7 @@
return 0;
}
-static int u32_destroy_key(struct tcf_proto *tp,
- struct tc_u_knode *n,
+static int u32_destroy_key(struct tcf_proto *tp, struct tc_u_knode *n,
bool free_pf)
{
tcf_exts_destroy(&n->exts);
@@ -448,9 +446,8 @@
}
}
-static int u32_replace_hw_hnode(struct tcf_proto *tp,
- struct tc_u_hnode *h,
- u32 flags)
+static int u32_replace_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
+ u32 flags)
{
struct net_device *dev = tp->q->dev_queue->dev;
struct tc_cls_u32_offload u32_offload = {0};
@@ -496,9 +493,8 @@
}
}
-static int u32_replace_hw_knode(struct tcf_proto *tp,
- struct tc_u_knode *n,
- u32 flags)
+static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
+ u32 flags)
{
struct net_device *dev = tp->q->dev_queue->dev;
struct tc_cls_u32_offload u32_offload = {0};
@@ -763,8 +759,7 @@
return err;
}
-static void u32_replace_knode(struct tcf_proto *tp,
- struct tc_u_common *tp_c,
+static void u32_replace_knode(struct tcf_proto *tp, struct tc_u_common *tp_c,
struct tc_u_knode *n)
{
struct tc_u_knode __rcu **ins;
@@ -845,8 +840,7 @@
static int u32_change(struct net *net, struct sk_buff *in_skb,
struct tcf_proto *tp, unsigned long base, u32 handle,
- struct nlattr **tca,
- unsigned long *arg, bool ovr)
+ struct nlattr **tca, unsigned long *arg, bool ovr)
{
struct tc_u_common *tp_c = tp->data;
struct tc_u_hnode *ht;
@@ -1088,7 +1082,7 @@
}
static int u32_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
- struct sk_buff *skb, struct tcmsg *t)
+ struct sk_buff *skb, struct tcmsg *t)
{
struct tc_u_knode *n = (struct tc_u_knode *)fh;
struct tc_u_hnode *ht_up, *ht_down;
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index d677b34..206dc24 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -389,7 +389,8 @@
static struct qdisc_rate_table *qdisc_rtab_list;
-struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r, struct nlattr *tab)
+struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
+ struct nlattr *tab)
{
struct qdisc_rate_table *rtab;
@@ -541,7 +542,8 @@
return -1;
}
-void __qdisc_calculate_pkt_len(struct sk_buff *skb, const struct qdisc_size_table *stab)
+void __qdisc_calculate_pkt_len(struct sk_buff *skb,
+ const struct qdisc_size_table *stab)
{
int pkt_len, slot;
@@ -888,10 +890,10 @@
Parameters are passed via opt.
*/
-static struct Qdisc *
-qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
- struct Qdisc *p, u32 parent, u32 handle,
- struct nlattr **tca, int *errp)
+static struct Qdisc *qdisc_create(struct net_device *dev,
+ struct netdev_queue *dev_queue,
+ struct Qdisc *p, u32 parent, u32 handle,
+ struct nlattr **tca, int *errp)
{
int err;
struct nlattr *kind = tca[TCA_KIND];
@@ -1073,7 +1075,8 @@
int depth;
};
-static int check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w);
+static int check_loop_fn(struct Qdisc *q, unsigned long cl,
+ struct qdisc_walker *w);
static int check_loop(struct Qdisc *q, struct Qdisc *p, int depth)
{
@@ -1450,7 +1453,8 @@
} else {
if (!tc_qdisc_dump_ignore(q) &&
tc_fill_qdisc(skb, q, q->parent, NETLINK_CB(cb->skb).portid,
- cb->nlh->nlmsg_seq, NLM_F_MULTI, RTM_NEWQDISC) <= 0)
+ cb->nlh->nlmsg_seq, NLM_F_MULTI,
+ RTM_NEWQDISC) <= 0)
goto done;
q_idx++;
}
@@ -1471,7 +1475,8 @@
}
if (!tc_qdisc_dump_ignore(q) &&
tc_fill_qdisc(skb, q, q->parent, NETLINK_CB(cb->skb).portid,
- cb->nlh->nlmsg_seq, NLM_F_MULTI, RTM_NEWQDISC) <= 0)
+ cb->nlh->nlmsg_seq, NLM_F_MULTI,
+ RTM_NEWQDISC) <= 0)
goto done;
q_idx++;
}
@@ -1505,7 +1510,8 @@
s_q_idx = 0;
q_idx = 0;
- if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx, true) < 0)
+ if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx,
+ true) < 0)
goto done;
dev_queue = dev_ingress_queue(dev);
@@ -1640,7 +1646,8 @@
if (cops->delete)
err = cops->delete(q, cl);
if (err == 0)
- tclass_notify(net, skb, n, q, cl, RTM_DELTCLASS);
+ tclass_notify(net, skb, n, q, cl,
+ RTM_DELTCLASS);
goto out;
case RTM_GETTCLASS:
err = tclass_notify(net, skb, n, q, cl, RTM_NEWTCLASS);
@@ -1738,12 +1745,14 @@
struct netlink_callback *cb;
};
-static int qdisc_class_dump(struct Qdisc *q, unsigned long cl, struct qdisc_walker *arg)
+static int qdisc_class_dump(struct Qdisc *q, unsigned long cl,
+ struct qdisc_walker *arg)
{
struct qdisc_dump_args *a = (struct qdisc_dump_args *)arg;
return tc_fill_tclass(a->skb, q, cl, NETLINK_CB(a->cb->skb).portid,
- a->cb->nlh->nlmsg_seq, NLM_F_MULTI, RTM_NEWTCLASS);
+ a->cb->nlh->nlmsg_seq, NLM_F_MULTI,
+ RTM_NEWTCLASS);
}
static int tc_dump_tclass_qdisc(struct Qdisc *q, struct sk_buff *skb,
@@ -1976,10 +1985,12 @@
rtnl_register(PF_UNSPEC, RTM_NEWQDISC, tc_modify_qdisc, NULL, NULL);
rtnl_register(PF_UNSPEC, RTM_DELQDISC, tc_get_qdisc, NULL, NULL);
- rtnl_register(PF_UNSPEC, RTM_GETQDISC, tc_get_qdisc, tc_dump_qdisc, NULL);
+ rtnl_register(PF_UNSPEC, RTM_GETQDISC, tc_get_qdisc, tc_dump_qdisc,
+ NULL);
rtnl_register(PF_UNSPEC, RTM_NEWTCLASS, tc_ctl_tclass, NULL, NULL);
rtnl_register(PF_UNSPEC, RTM_DELTCLASS, tc_ctl_tclass, NULL, NULL);
- rtnl_register(PF_UNSPEC, RTM_GETTCLASS, tc_ctl_tclass, tc_dump_tclass, NULL);
+ rtnl_register(PF_UNSPEC, RTM_GETTCLASS, tc_ctl_tclass, tc_dump_tclass,
+ NULL);
return 0;
}
diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c
index 4002df3..5bfa79e 100644
--- a/net/sched/sch_codel.c
+++ b/net/sched/sch_codel.c
@@ -69,7 +69,7 @@
static struct sk_buff *dequeue_func(struct codel_vars *vars, void *ctx)
{
struct Qdisc *sch = ctx;
- struct sk_buff *skb = __skb_dequeue(&sch->q);
+ struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
if (skb)
sch->qstats.backlog -= qdisc_pkt_len(skb);
@@ -172,7 +172,7 @@
qlen = sch->q.qlen;
while (sch->q.qlen > sch->limit) {
- struct sk_buff *skb = __skb_dequeue(&sch->q);
+ struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
dropped += qdisc_pkt_len(skb);
qdisc_qstats_backlog_dec(sch, skb);
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index baeed6a..1e37247 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -31,7 +31,7 @@
static int pfifo_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
- if (likely(skb_queue_len(&sch->q) < sch->limit))
+ if (likely(sch->q.qlen < sch->limit))
return qdisc_enqueue_tail(skb, sch);
return qdisc_drop(skb, sch, to_free);
@@ -42,7 +42,7 @@
{
unsigned int prev_backlog;
- if (likely(skb_queue_len(&sch->q) < sch->limit))
+ if (likely(sch->q.qlen < sch->limit))
return qdisc_enqueue_tail(skb, sch);
prev_backlog = sch->qstats.backlog;
diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index e5458b9..18e7524 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -86,6 +86,7 @@
struct rb_root delayed; /* for rate limited flows */
u64 time_next_delayed_flow;
+ unsigned long unthrottle_latency_ns;
struct fq_flow internal; /* for non classified or high prio packets */
u32 quantum;
@@ -94,6 +95,7 @@
u32 flow_max_rate; /* optional max rate per flow */
u32 flow_plimit; /* max packets per flow */
u32 orphan_mask; /* mask for orphaned skb */
+ u32 low_rate_threshold;
struct rb_root *fq_root;
u8 rate_enable;
u8 fq_trees_log;
@@ -407,11 +409,19 @@
static void fq_check_throttled(struct fq_sched_data *q, u64 now)
{
+ unsigned long sample;
struct rb_node *p;
if (q->time_next_delayed_flow > now)
return;
+ /* Update unthrottle latency EWMA.
+ * This is cheap and can help diagnosing timer/latency problems.
+ */
+ sample = (unsigned long)(now - q->time_next_delayed_flow);
+ q->unthrottle_latency_ns -= q->unthrottle_latency_ns >> 3;
+ q->unthrottle_latency_ns += sample >> 3;
+
q->time_next_delayed_flow = ~0ULL;
while ((p = rb_first(&q->delayed)) != NULL) {
struct fq_flow *f = container_of(p, struct fq_flow, rate_node);
@@ -433,7 +443,7 @@
struct fq_flow_head *head;
struct sk_buff *skb;
struct fq_flow *f;
- u32 rate;
+ u32 rate, plen;
skb = fq_dequeue_head(sch, &q->internal);
if (skb)
@@ -482,7 +492,7 @@
prefetch(&skb->end);
f->credit -= qdisc_pkt_len(skb);
- if (f->credit > 0 || !q->rate_enable)
+ if (!q->rate_enable)
goto out;
/* Do not pace locally generated ack packets */
@@ -493,8 +503,15 @@
if (skb->sk)
rate = min(skb->sk->sk_pacing_rate, rate);
+ if (rate <= q->low_rate_threshold) {
+ f->credit = 0;
+ plen = qdisc_pkt_len(skb);
+ } else {
+ plen = max(qdisc_pkt_len(skb), q->quantum);
+ if (f->credit > 0)
+ goto out;
+ }
if (rate != ~0U) {
- u32 plen = max(qdisc_pkt_len(skb), q->quantum);
u64 len = (u64)plen * NSEC_PER_SEC;
if (likely(rate))
@@ -507,7 +524,12 @@
len = NSEC_PER_SEC;
q->stat_pkts_too_long++;
}
-
+ /* Account for schedule/timers drifts.
+ * f->time_next_packet was set when prior packet was sent,
+ * and current time (@now) can be too late by tens of us.
+ */
+ if (f->time_next_packet)
+ len -= min(len/2, now - f->time_next_packet);
f->time_next_packet = now + len;
}
out:
@@ -662,6 +684,7 @@
[TCA_FQ_FLOW_MAX_RATE] = { .type = NLA_U32 },
[TCA_FQ_BUCKETS_LOG] = { .type = NLA_U32 },
[TCA_FQ_FLOW_REFILL_DELAY] = { .type = NLA_U32 },
+ [TCA_FQ_LOW_RATE_THRESHOLD] = { .type = NLA_U32 },
};
static int fq_change(struct Qdisc *sch, struct nlattr *opt)
@@ -716,6 +739,10 @@
if (tb[TCA_FQ_FLOW_MAX_RATE])
q->flow_max_rate = nla_get_u32(tb[TCA_FQ_FLOW_MAX_RATE]);
+ if (tb[TCA_FQ_LOW_RATE_THRESHOLD])
+ q->low_rate_threshold =
+ nla_get_u32(tb[TCA_FQ_LOW_RATE_THRESHOLD]);
+
if (tb[TCA_FQ_RATE_ENABLE]) {
u32 enable = nla_get_u32(tb[TCA_FQ_RATE_ENABLE]);
@@ -774,6 +801,7 @@
q->initial_quantum = 10 * psched_mtu(qdisc_dev(sch));
q->flow_refill_delay = msecs_to_jiffies(40);
q->flow_max_rate = ~0U;
+ q->time_next_delayed_flow = ~0ULL;
q->rate_enable = 1;
q->new_flows.first = NULL;
q->old_flows.first = NULL;
@@ -781,6 +809,7 @@
q->fq_root = NULL;
q->fq_trees_log = ilog2(1024);
q->orphan_mask = 1024 - 1;
+ q->low_rate_threshold = 550000 / 8;
qdisc_watchdog_init(&q->watchdog, sch);
if (opt)
@@ -811,6 +840,8 @@
nla_put_u32(skb, TCA_FQ_FLOW_REFILL_DELAY,
jiffies_to_usecs(q->flow_refill_delay)) ||
nla_put_u32(skb, TCA_FQ_ORPHAN_MASK, q->orphan_mask) ||
+ nla_put_u32(skb, TCA_FQ_LOW_RATE_THRESHOLD,
+ q->low_rate_threshold) ||
nla_put_u32(skb, TCA_FQ_BUCKETS_LOG, q->fq_trees_log))
goto nla_put_failure;
@@ -823,20 +854,24 @@
static int fq_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
{
struct fq_sched_data *q = qdisc_priv(sch);
- u64 now = ktime_get_ns();
- struct tc_fq_qd_stats st = {
- .gc_flows = q->stat_gc_flows,
- .highprio_packets = q->stat_internal_packets,
- .tcp_retrans = q->stat_tcp_retrans,
- .throttled = q->stat_throttled,
- .flows_plimit = q->stat_flows_plimit,
- .pkts_too_long = q->stat_pkts_too_long,
- .allocation_errors = q->stat_allocation_errors,
- .flows = q->flows,
- .inactive_flows = q->inactive_flows,
- .throttled_flows = q->throttled_flows,
- .time_next_delayed_flow = q->time_next_delayed_flow - now,
- };
+ struct tc_fq_qd_stats st;
+
+ sch_tree_lock(sch);
+
+ st.gc_flows = q->stat_gc_flows;
+ st.highprio_packets = q->stat_internal_packets;
+ st.tcp_retrans = q->stat_tcp_retrans;
+ st.throttled = q->stat_throttled;
+ st.flows_plimit = q->stat_flows_plimit;
+ st.pkts_too_long = q->stat_pkts_too_long;
+ st.allocation_errors = q->stat_allocation_errors;
+ st.time_next_delayed_flow = q->time_next_delayed_flow - ktime_get_ns();
+ st.flows = q->flows;
+ st.inactive_flows = q->inactive_flows;
+ st.throttled_flows = q->throttled_flows;
+ st.unthrottle_latency_ns = min_t(unsigned long,
+ q->unthrottle_latency_ns, ~0U);
+ sch_tree_unlock(sch);
return gnet_stats_copy_app(d, &st, sizeof(st));
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 0d21b56..6cfb6e9 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -466,7 +466,7 @@
*/
struct pfifo_fast_priv {
u32 bitmap;
- struct sk_buff_head q[PFIFO_FAST_BANDS];
+ struct qdisc_skb_head q[PFIFO_FAST_BANDS];
};
/*
@@ -477,7 +477,7 @@
*/
static const int bitmap2band[] = {-1, 0, 1, 0, 2, 0, 1, 0};
-static inline struct sk_buff_head *band2list(struct pfifo_fast_priv *priv,
+static inline struct qdisc_skb_head *band2list(struct pfifo_fast_priv *priv,
int band)
{
return priv->q + band;
@@ -486,10 +486,10 @@
static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
struct sk_buff **to_free)
{
- if (skb_queue_len(&qdisc->q) < qdisc_dev(qdisc)->tx_queue_len) {
+ if (qdisc->q.qlen < qdisc_dev(qdisc)->tx_queue_len) {
int band = prio2band[skb->priority & TC_PRIO_MAX];
struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
- struct sk_buff_head *list = band2list(priv, band);
+ struct qdisc_skb_head *list = band2list(priv, band);
priv->bitmap |= (1 << band);
qdisc->q.qlen++;
@@ -505,11 +505,16 @@
int band = bitmap2band[priv->bitmap];
if (likely(band >= 0)) {
- struct sk_buff_head *list = band2list(priv, band);
- struct sk_buff *skb = __qdisc_dequeue_head(qdisc, list);
+ struct qdisc_skb_head *qh = band2list(priv, band);
+ struct sk_buff *skb = __qdisc_dequeue_head(qh);
+
+ if (likely(skb != NULL)) {
+ qdisc_qstats_backlog_dec(qdisc, skb);
+ qdisc_bstats_update(qdisc, skb);
+ }
qdisc->q.qlen--;
- if (skb_queue_empty(list))
+ if (qh->qlen == 0)
priv->bitmap &= ~(1 << band);
return skb;
@@ -524,9 +529,9 @@
int band = bitmap2band[priv->bitmap];
if (band >= 0) {
- struct sk_buff_head *list = band2list(priv, band);
+ struct qdisc_skb_head *qh = band2list(priv, band);
- return skb_peek(list);
+ return qh->head;
}
return NULL;
@@ -564,7 +569,7 @@
struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
- __skb_queue_head_init(band2list(priv, prio));
+ qdisc_skb_head_init(band2list(priv, prio));
/* Can by-pass the queue discipline */
qdisc->flags |= TCQ_F_CAN_BYPASS;
@@ -612,7 +617,8 @@
sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
sch->padded = (char *) sch - (char *) p;
}
- skb_queue_head_init(&sch->q);
+ qdisc_skb_head_init(&sch->q);
+ spin_lock_init(&sch->q.lock);
spin_lock_init(&sch->busylock);
lockdep_set_class(&sch->busylock,
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 53dbfa1..c798d0d 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -162,7 +162,7 @@
struct work_struct work;
/* non shaped skbs; let them go directly thru */
- struct sk_buff_head direct_queue;
+ struct qdisc_skb_head direct_queue;
long direct_pkts;
struct qdisc_watchdog watchdog;
@@ -570,6 +570,22 @@
list_del_init(&cl->un.leaf.drop_list);
}
+static void htb_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
+ struct qdisc_skb_head *qh)
+{
+ struct sk_buff *last = qh->tail;
+
+ if (last) {
+ skb->next = NULL;
+ last->next = skb;
+ qh->tail = skb;
+ } else {
+ qh->tail = skb;
+ qh->head = skb;
+ }
+ qh->qlen++;
+}
+
static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
@@ -580,7 +596,7 @@
if (cl == HTB_DIRECT) {
/* enqueue to helper queue */
if (q->direct_queue.qlen < q->direct_qlen) {
- __skb_queue_tail(&q->direct_queue, skb);
+ htb_enqueue_tail(skb, sch, &q->direct_queue);
q->direct_pkts++;
} else {
return qdisc_drop(skb, sch, to_free);
@@ -888,7 +904,7 @@
unsigned long start_at;
/* try to dequeue direct packets as high prio (!) to minimize cpu work */
- skb = __skb_dequeue(&q->direct_queue);
+ skb = __qdisc_dequeue_head(&q->direct_queue);
if (skb != NULL) {
ok:
qdisc_bstats_update(sch, skb);
@@ -1019,7 +1035,7 @@
qdisc_watchdog_init(&q->watchdog, sch);
INIT_WORK(&q->work, htb_work_func);
- __skb_queue_head_init(&q->direct_queue);
+ qdisc_skb_head_init(&q->direct_queue);
if (tb[TCA_HTB_DIRECT_QLEN])
q->direct_qlen = nla_get_u32(tb[TCA_HTB_DIRECT_QLEN]);
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index aaaf021..9f7b380 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -413,6 +413,16 @@
return segs;
}
+static void netem_enqueue_skb_head(struct qdisc_skb_head *qh, struct sk_buff *skb)
+{
+ skb->next = qh->head;
+
+ if (!qh->head)
+ qh->tail = skb;
+ qh->head = skb;
+ qh->qlen++;
+}
+
/*
* Insert one skb into qdisc.
* Note: parent depends on return value to account for queue length.
@@ -502,7 +512,7 @@
1<<(prandom_u32() % 8);
}
- if (unlikely(skb_queue_len(&sch->q) >= sch->limit))
+ if (unlikely(sch->q.qlen >= sch->limit))
return qdisc_drop(skb, sch, to_free);
qdisc_qstats_backlog_inc(sch, skb);
@@ -522,8 +532,8 @@
if (q->rate) {
struct sk_buff *last;
- if (!skb_queue_empty(&sch->q))
- last = skb_peek_tail(&sch->q);
+ if (sch->q.qlen)
+ last = sch->q.tail;
else
last = netem_rb_to_skb(rb_last(&q->t_root));
if (last) {
@@ -552,7 +562,7 @@
cb->time_to_send = psched_get_time();
q->counter = 0;
- __skb_queue_head(&sch->q, skb);
+ netem_enqueue_skb_head(&sch->q, skb);
sch->qstats.requeues++;
}
@@ -587,7 +597,7 @@
struct rb_node *p;
tfifo_dequeue:
- skb = __skb_dequeue(&sch->q);
+ skb = __qdisc_dequeue_head(&sch->q);
if (skb) {
qdisc_qstats_backlog_dec(sch, skb);
deliver:
diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c
index a570b0b..5c3a99d 100644
--- a/net/sched/sch_pie.c
+++ b/net/sched/sch_pie.c
@@ -231,7 +231,7 @@
/* Drop excess packets if new limit is lower */
qlen = sch->q.qlen;
while (sch->q.qlen > sch->limit) {
- struct sk_buff *skb = __skb_dequeue(&sch->q);
+ struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
dropped += qdisc_pkt_len(skb);
qdisc_qstats_backlog_dec(sch, skb);
@@ -511,7 +511,7 @@
static struct sk_buff *pie_qdisc_dequeue(struct Qdisc *sch)
{
struct sk_buff *skb;
- skb = __qdisc_dequeue_head(sch, &sch->q);
+ skb = qdisc_dequeue_head(sch);
if (!skb)
return NULL;
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 1c23060..f10d339 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -1408,7 +1408,7 @@
transports) {
if (t->pmtu_pending && t->dst) {
sctp_transport_update_pmtu(sk, t,
- WORD_TRUNC(dst_mtu(t->dst)));
+ SCTP_TRUNC4(dst_mtu(t->dst)));
t->pmtu_pending = 0;
}
if (!pmtu || (t->pathmtu < pmtu))
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index a55e547..8afe2e9 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -70,6 +70,19 @@
return msg;
}
+void sctp_datamsg_free(struct sctp_datamsg *msg)
+{
+ struct sctp_chunk *chunk;
+
+ /* This doesn't have to be a _safe vairant because
+ * sctp_chunk_free() only drops the refs.
+ */
+ list_for_each_entry(chunk, &msg->chunks, frag_list)
+ sctp_chunk_free(chunk);
+
+ sctp_datamsg_put(msg);
+}
+
/* Final destructruction of datamsg memory. */
static void sctp_datamsg_destroy(struct sctp_datamsg *msg)
{
@@ -182,9 +195,10 @@
/* This is the biggest possible DATA chunk that can fit into
* the packet
*/
- max_data = (asoc->pathmtu -
- sctp_sk(asoc->base.sk)->pf->af->net_header_len -
- sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk)) & ~3;
+ max_data = asoc->pathmtu -
+ sctp_sk(asoc->base.sk)->pf->af->net_header_len -
+ sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+ max_data = SCTP_TRUNC4(max_data);
max = asoc->frag_point;
/* If the the peer requested that we authenticate DATA chunks
@@ -195,8 +209,8 @@
struct sctp_hmac *hmac_desc = sctp_auth_asoc_get_hmac(asoc);
if (hmac_desc)
- max_data -= WORD_ROUND(sizeof(sctp_auth_chunk_t) +
- hmac_desc->hmac_len);
+ max_data -= SCTP_PAD4(sizeof(sctp_auth_chunk_t) +
+ hmac_desc->hmac_len);
}
/* Now, check if we need to reduce our max */
@@ -216,7 +230,7 @@
asoc->outqueue.out_qlen == 0 &&
list_empty(&asoc->outqueue.retransmit) &&
msg_len > max)
- max_data -= WORD_ROUND(sizeof(sctp_sack_chunk_t));
+ max_data -= SCTP_PAD4(sizeof(sctp_sack_chunk_t));
/* Encourage Cookie-ECHO bundling. */
if (asoc->state < SCTP_STATE_COOKIE_ECHOED)
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 69444d3..a2ea1d1 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -605,7 +605,7 @@
/* PMTU discovery (RFC1191) */
if (ICMP_FRAG_NEEDED == code) {
sctp_icmp_frag_needed(sk, asoc, transport,
- WORD_TRUNC(info));
+ SCTP_TRUNC4(info));
goto out_unlock;
} else {
if (ICMP_PROT_UNREACH == code) {
@@ -673,7 +673,7 @@
if (ntohs(ch->length) < sizeof(sctp_chunkhdr_t))
break;
- ch_end = offset + WORD_ROUND(ntohs(ch->length));
+ ch_end = offset + SCTP_PAD4(ntohs(ch->length));
if (ch_end > skb->len)
break;
@@ -796,27 +796,34 @@
static inline int sctp_hash_cmp(struct rhashtable_compare_arg *arg,
const void *ptr)
{
+ struct sctp_transport *t = (struct sctp_transport *)ptr;
const struct sctp_hash_cmp_arg *x = arg->key;
- const struct sctp_transport *t = ptr;
- struct sctp_association *asoc = t->asoc;
- const struct net *net = x->net;
+ struct sctp_association *asoc;
+ int err = 1;
if (!sctp_cmp_addr_exact(&t->ipaddr, x->paddr))
- return 1;
- if (!net_eq(sock_net(asoc->base.sk), net))
- return 1;
+ return err;
+ if (!sctp_transport_hold(t))
+ return err;
+
+ asoc = t->asoc;
+ if (!net_eq(sock_net(asoc->base.sk), x->net))
+ goto out;
if (x->ep) {
if (x->ep != asoc->ep)
- return 1;
+ goto out;
} else {
if (x->laddr->v4.sin_port != htons(asoc->base.bind_addr.port))
- return 1;
+ goto out;
if (!sctp_bind_addr_match(&asoc->base.bind_addr,
x->laddr, sctp_sk(asoc->base.sk)))
- return 1;
+ goto out;
}
- return 0;
+ err = 0;
+out:
+ sctp_transport_put(t);
+ return err;
}
static inline u32 sctp_hash_obj(const void *data, u32 len, u32 seed)
@@ -1121,7 +1128,7 @@
if (ntohs(ch->length) < sizeof(sctp_chunkhdr_t))
break;
- ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+ ch_end = ((__u8 *)ch) + SCTP_PAD4(ntohs(ch->length));
if (ch_end > skb_tail_pointer(skb))
break;
@@ -1190,7 +1197,7 @@
* that the chunk length doesn't cause overflow. Otherwise, we'll
* walk off the end.
*/
- if (WORD_ROUND(ntohs(ch->length)) > skb->len)
+ if (SCTP_PAD4(ntohs(ch->length)) > skb->len)
return NULL;
/* If this is INIT/INIT-ACK look inside the chunk too. */
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
index 6437aa9..f731de3 100644
--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -213,7 +213,7 @@
}
chunk->chunk_hdr = ch;
- chunk->chunk_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+ chunk->chunk_end = ((__u8 *)ch) + SCTP_PAD4(ntohs(ch->length));
skb_pull(chunk->skb, sizeof(sctp_chunkhdr_t));
chunk->subh.v = NULL; /* Subheader is no longer valid. */
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 31b7bc3..2a5c189 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -180,7 +180,6 @@
int one_packet, gfp_t gfp)
{
sctp_xmit_t retval;
- int error = 0;
pr_debug("%s: packet:%p size:%Zu chunk:%p size:%d\n", __func__,
packet, packet->size, chunk, chunk->skb ? chunk->skb->len : -1);
@@ -188,6 +187,8 @@
switch ((retval = (sctp_packet_append_chunk(packet, chunk)))) {
case SCTP_XMIT_PMTU_FULL:
if (!packet->has_cookie_echo) {
+ int error = 0;
+
error = sctp_packet_transmit(packet, gfp);
if (error < 0)
chunk->skb->sk->sk_err = -error;
@@ -296,7 +297,7 @@
struct sctp_chunk *chunk)
{
sctp_xmit_t retval = SCTP_XMIT_OK;
- __u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
+ __u16 chunk_len = SCTP_PAD4(ntohs(chunk->chunk_hdr->length));
/* Check to see if this chunk will fit into the packet */
retval = sctp_packet_will_fit(packet, chunk, chunk_len);
@@ -441,14 +442,14 @@
* time. Application may notice this error.
*/
pr_err_once("Trying to GSO but underlying device doesn't support it.");
- goto nomem;
+ goto err;
}
} else {
pkt_size = packet->size;
}
head = alloc_skb(pkt_size + MAX_HEADER, gfp);
if (!head)
- goto nomem;
+ goto err;
if (gso) {
NAPI_GRO_CB(head)->last = head;
skb_shinfo(head)->gso_type = sk->sk_gso_type;
@@ -469,8 +470,12 @@
}
}
dst = dst_clone(tp->dst);
- if (!dst)
- goto no_route;
+ if (!dst) {
+ if (asoc)
+ IP_INC_STATS(sock_net(asoc->base.sk),
+ IPSTATS_MIB_OUTNOROUTES);
+ goto nodst;
+ }
skb_dst_set(head, dst);
/* Build the SCTP header. */
@@ -503,7 +508,7 @@
if (gso) {
pkt_size = packet->overhead;
list_for_each_entry(chunk, &packet->chunk_list, list) {
- int padded = WORD_ROUND(chunk->skb->len);
+ int padded = SCTP_PAD4(chunk->skb->len);
if (pkt_size + padded > tp->pathmtu)
break;
@@ -533,7 +538,7 @@
* included in the chunk length field. The sender should
* never pad with more than 3 bytes.
*
- * [This whole comment explains WORD_ROUND() below.]
+ * [This whole comment explains SCTP_PAD4() below.]
*/
pkt_size -= packet->overhead;
@@ -555,7 +560,7 @@
has_data = 1;
}
- padding = WORD_ROUND(chunk->skb->len) - chunk->skb->len;
+ padding = SCTP_PAD4(chunk->skb->len) - chunk->skb->len;
if (padding)
memset(skb_put(chunk->skb, padding), 0, padding);
@@ -582,7 +587,7 @@
* acknowledged or have failed.
* Re-queue auth chunks if needed.
*/
- pkt_size -= WORD_ROUND(chunk->skb->len);
+ pkt_size -= SCTP_PAD4(chunk->skb->len);
if (!sctp_chunk_is_data(chunk) && chunk != packet->auth)
sctp_chunk_free(chunk);
@@ -621,8 +626,10 @@
if (!gso)
break;
- if (skb_gro_receive(&head, nskb))
+ if (skb_gro_receive(&head, nskb)) {
+ kfree_skb(nskb);
goto nomem;
+ }
nskb = NULL;
if (WARN_ON_ONCE(skb_shinfo(head)->gso_segs >=
sk->sk_gso_max_segs))
@@ -716,18 +723,13 @@
}
head->ignore_df = packet->ipfragok;
tp->af_specific->sctp_xmit(head, tp);
+ goto out;
-out:
- sctp_packet_reset(packet);
- return err;
-no_route:
- kfree_skb(head);
- if (nskb != head)
- kfree_skb(nskb);
+nomem:
+ if (packet->auth && list_empty(&packet->auth->list))
+ sctp_chunk_free(packet->auth);
- if (asoc)
- IP_INC_STATS(sock_net(asoc->base.sk), IPSTATS_MIB_OUTNOROUTES);
-
+nodst:
/* FIXME: Returning the 'err' will effect all the associations
* associated with a socket, although only one of the paths of the
* association is unreachable.
@@ -736,22 +738,18 @@
* required.
*/
/* err = -EHOSTUNREACH; */
-err:
- /* Control chunks are unreliable so just drop them. DATA chunks
- * will get resent or dropped later.
- */
+ kfree_skb(head);
+err:
list_for_each_entry_safe(chunk, tmp, &packet->chunk_list, list) {
list_del_init(&chunk->list);
if (!sctp_chunk_is_data(chunk))
sctp_chunk_free(chunk);
}
- goto out;
-nomem:
- if (packet->auth && list_empty(&packet->auth->list))
- sctp_chunk_free(packet->auth);
- err = -ENOMEM;
- goto err;
+
+out:
+ sctp_packet_reset(packet);
+ return err;
}
/********************************************************************
@@ -913,7 +911,7 @@
*/
maxsize = pmtu - packet->overhead;
if (packet->auth)
- maxsize -= WORD_ROUND(packet->auth->skb->len);
+ maxsize -= SCTP_PAD4(packet->auth->skb->len);
if (chunk_len > maxsize)
retval = SCTP_XMIT_PMTU_FULL;
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 72e54a4..3ec6da8b 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -68,7 +68,7 @@
static void sctp_generate_fwdtsn(struct sctp_outq *q, __u32 sack_ctsn);
-static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp);
+static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp);
/* Add data to the front of the queue. */
static inline void sctp_outq_head_data(struct sctp_outq *q,
@@ -285,10 +285,9 @@
}
/* Put a new chunk in an sctp_outq. */
-int sctp_outq_tail(struct sctp_outq *q, struct sctp_chunk *chunk, gfp_t gfp)
+void sctp_outq_tail(struct sctp_outq *q, struct sctp_chunk *chunk, gfp_t gfp)
{
struct net *net = sock_net(q->asoc->base.sk);
- int error = 0;
pr_debug("%s: outq:%p, chunk:%p[%s]\n", __func__, q, chunk,
chunk && chunk->chunk_hdr ?
@@ -299,54 +298,26 @@
* immediately.
*/
if (sctp_chunk_is_data(chunk)) {
- /* Is it OK to queue data chunks? */
- /* From 9. Termination of Association
- *
- * When either endpoint performs a shutdown, the
- * association on each peer will stop accepting new
- * data from its user and only deliver data in queue
- * at the time of sending or receiving the SHUTDOWN
- * chunk.
- */
- switch (q->asoc->state) {
- case SCTP_STATE_CLOSED:
- case SCTP_STATE_SHUTDOWN_PENDING:
- case SCTP_STATE_SHUTDOWN_SENT:
- case SCTP_STATE_SHUTDOWN_RECEIVED:
- case SCTP_STATE_SHUTDOWN_ACK_SENT:
- /* Cannot send after transport endpoint shutdown */
- error = -ESHUTDOWN;
- break;
+ pr_debug("%s: outqueueing: outq:%p, chunk:%p[%s])\n",
+ __func__, q, chunk, chunk && chunk->chunk_hdr ?
+ sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)) :
+ "illegal chunk");
- default:
- pr_debug("%s: outqueueing: outq:%p, chunk:%p[%s])\n",
- __func__, q, chunk, chunk && chunk->chunk_hdr ?
- sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)) :
- "illegal chunk");
-
- sctp_chunk_hold(chunk);
- sctp_outq_tail_data(q, chunk);
- if (chunk->asoc->prsctp_enable &&
- SCTP_PR_PRIO_ENABLED(chunk->sinfo.sinfo_flags))
- chunk->asoc->sent_cnt_removable++;
- if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
- SCTP_INC_STATS(net, SCTP_MIB_OUTUNORDERCHUNKS);
- else
- SCTP_INC_STATS(net, SCTP_MIB_OUTORDERCHUNKS);
- break;
- }
+ sctp_outq_tail_data(q, chunk);
+ if (chunk->asoc->prsctp_enable &&
+ SCTP_PR_PRIO_ENABLED(chunk->sinfo.sinfo_flags))
+ chunk->asoc->sent_cnt_removable++;
+ if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
+ SCTP_INC_STATS(net, SCTP_MIB_OUTUNORDERCHUNKS);
+ else
+ SCTP_INC_STATS(net, SCTP_MIB_OUTORDERCHUNKS);
} else {
list_add_tail(&chunk->list, &q->control_chunk_list);
SCTP_INC_STATS(net, SCTP_MIB_OUTCTRLCHUNKS);
}
- if (error < 0)
- return error;
-
if (!q->cork)
- error = sctp_outq_flush(q, 0, gfp);
-
- return error;
+ sctp_outq_flush(q, 0, gfp);
}
/* Insert a chunk into the sorted list based on the TSNs. The retransmit list
@@ -559,7 +530,6 @@
sctp_retransmit_reason_t reason)
{
struct net *net = sock_net(q->asoc->base.sk);
- int error = 0;
switch (reason) {
case SCTP_RTXR_T3_RTX:
@@ -603,10 +573,7 @@
* will be flushed at the end.
*/
if (reason != SCTP_RTXR_FAST_RTX)
- error = sctp_outq_flush(q, /* rtx_timeout */ 1, GFP_ATOMIC);
-
- if (error)
- q->asoc->base.sk->sk_err = -error;
+ sctp_outq_flush(q, /* rtx_timeout */ 1, GFP_ATOMIC);
}
/*
@@ -778,12 +745,12 @@
}
/* Cork the outqueue so queued chunks are really queued. */
-int sctp_outq_uncork(struct sctp_outq *q, gfp_t gfp)
+void sctp_outq_uncork(struct sctp_outq *q, gfp_t gfp)
{
if (q->cork)
q->cork = 0;
- return sctp_outq_flush(q, 0, gfp);
+ sctp_outq_flush(q, 0, gfp);
}
@@ -796,7 +763,7 @@
* locking concerns must be made. Today we use the sock lock to protect
* this function.
*/
-static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
+static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
{
struct sctp_packet *packet;
struct sctp_packet singleton;
@@ -919,8 +886,10 @@
sctp_packet_config(&singleton, vtag, 0);
sctp_packet_append_chunk(&singleton, chunk);
error = sctp_packet_transmit(&singleton, gfp);
- if (error < 0)
- return error;
+ if (error < 0) {
+ asoc->base.sk->sk_err = -error;
+ return;
+ }
break;
case SCTP_CID_ABORT:
@@ -1018,6 +987,8 @@
retran:
error = sctp_outq_flush_rtx(q, packet,
rtx_timeout, &start_timer);
+ if (error < 0)
+ asoc->base.sk->sk_err = -error;
if (start_timer) {
sctp_transport_reset_t3_rtx(transport);
@@ -1192,14 +1163,15 @@
struct sctp_transport,
send_ready);
packet = &t->packet;
- if (!sctp_packet_empty(packet))
+ if (!sctp_packet_empty(packet)) {
error = sctp_packet_transmit(packet, gfp);
+ if (error < 0)
+ asoc->base.sk->sk_err = -error;
+ }
/* Clear the burst limited state, if any */
sctp_transport_burst_reset(t);
}
-
- return error;
}
/* Update unack_data based on the incoming SACK chunk */
@@ -1747,7 +1719,7 @@
{
int i;
sctp_sack_variable_t *frags;
- __u16 gap;
+ __u16 tsn_offset, blocks;
__u32 ctsn = ntohl(sack->cum_tsn_ack);
if (TSN_lte(tsn, ctsn))
@@ -1766,10 +1738,11 @@
*/
frags = sack->variable;
- gap = tsn - ctsn;
- for (i = 0; i < ntohs(sack->num_gap_ack_blocks); ++i) {
- if (TSN_lte(ntohs(frags[i].gab.start), gap) &&
- TSN_lte(gap, ntohs(frags[i].gab.end)))
+ blocks = ntohs(sack->num_gap_ack_blocks);
+ tsn_offset = tsn - ctsn;
+ for (i = 0; i < blocks; ++i) {
+ if (tsn_offset >= ntohs(frags[i].gab.start) &&
+ tsn_offset <= ntohs(frags[i].gab.end))
goto pass;
}
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 8c77b87..79dd660 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -253,7 +253,7 @@
num_types = sp->pf->supported_addrs(sp, types);
chunksize = sizeof(init) + addrs_len;
- chunksize += WORD_ROUND(SCTP_SAT_LEN(num_types));
+ chunksize += SCTP_PAD4(SCTP_SAT_LEN(num_types));
chunksize += sizeof(ecap_param);
if (asoc->prsctp_enable)
@@ -283,14 +283,14 @@
/* Add HMACS parameter length if any were defined */
auth_hmacs = (sctp_paramhdr_t *)asoc->c.auth_hmacs;
if (auth_hmacs->length)
- chunksize += WORD_ROUND(ntohs(auth_hmacs->length));
+ chunksize += SCTP_PAD4(ntohs(auth_hmacs->length));
else
auth_hmacs = NULL;
/* Add CHUNKS parameter length */
auth_chunks = (sctp_paramhdr_t *)asoc->c.auth_chunks;
if (auth_chunks->length)
- chunksize += WORD_ROUND(ntohs(auth_chunks->length));
+ chunksize += SCTP_PAD4(ntohs(auth_chunks->length));
else
auth_chunks = NULL;
@@ -300,8 +300,8 @@
/* If we have any extensions to report, account for that */
if (num_ext)
- chunksize += WORD_ROUND(sizeof(sctp_supported_ext_param_t) +
- num_ext);
+ chunksize += SCTP_PAD4(sizeof(sctp_supported_ext_param_t) +
+ num_ext);
/* RFC 2960 3.3.2 Initiation (INIT) (1)
*
@@ -443,13 +443,13 @@
auth_hmacs = (sctp_paramhdr_t *)asoc->c.auth_hmacs;
if (auth_hmacs->length)
- chunksize += WORD_ROUND(ntohs(auth_hmacs->length));
+ chunksize += SCTP_PAD4(ntohs(auth_hmacs->length));
else
auth_hmacs = NULL;
auth_chunks = (sctp_paramhdr_t *)asoc->c.auth_chunks;
if (auth_chunks->length)
- chunksize += WORD_ROUND(ntohs(auth_chunks->length));
+ chunksize += SCTP_PAD4(ntohs(auth_chunks->length));
else
auth_chunks = NULL;
@@ -458,8 +458,8 @@
}
if (num_ext)
- chunksize += WORD_ROUND(sizeof(sctp_supported_ext_param_t) +
- num_ext);
+ chunksize += SCTP_PAD4(sizeof(sctp_supported_ext_param_t) +
+ num_ext);
/* Now allocate and fill out the chunk. */
retval = sctp_make_control(asoc, SCTP_CID_INIT_ACK, 0, chunksize, gfp);
@@ -1390,7 +1390,7 @@
struct sock *sk;
/* No need to allocate LL here, as this is only a chunk. */
- skb = alloc_skb(WORD_ROUND(sizeof(sctp_chunkhdr_t) + paylen), gfp);
+ skb = alloc_skb(SCTP_PAD4(sizeof(sctp_chunkhdr_t) + paylen), gfp);
if (!skb)
goto nodata;
@@ -1482,7 +1482,7 @@
void *target;
void *padding;
int chunklen = ntohs(chunk->chunk_hdr->length);
- int padlen = WORD_ROUND(chunklen) - chunklen;
+ int padlen = SCTP_PAD4(chunklen) - chunklen;
padding = skb_put(chunk->skb, padlen);
target = skb_put(chunk->skb, len);
@@ -1900,7 +1900,7 @@
struct __sctp_missing report;
__u16 len;
- len = WORD_ROUND(sizeof(report));
+ len = SCTP_PAD4(sizeof(report));
/* Make an ERROR chunk, preparing enough room for
* returning multiple unknown parameters.
@@ -2098,9 +2098,9 @@
if (*errp) {
if (!sctp_init_cause_fixed(*errp, SCTP_ERROR_UNKNOWN_PARAM,
- WORD_ROUND(ntohs(param.p->length))))
+ SCTP_PAD4(ntohs(param.p->length))))
sctp_addto_chunk_fixed(*errp,
- WORD_ROUND(ntohs(param.p->length)),
+ SCTP_PAD4(ntohs(param.p->length)),
param.v);
} else {
/* If there is no memory for generating the ERROR
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 12d4519..c345bf1 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1020,19 +1020,13 @@
* This way the whole message is queued up and bundling if
* encouraged for small fragments.
*/
-static int sctp_cmd_send_msg(struct sctp_association *asoc,
- struct sctp_datamsg *msg, gfp_t gfp)
+static void sctp_cmd_send_msg(struct sctp_association *asoc,
+ struct sctp_datamsg *msg, gfp_t gfp)
{
struct sctp_chunk *chunk;
- int error = 0;
- list_for_each_entry(chunk, &msg->chunks, frag_list) {
- error = sctp_outq_tail(&asoc->outqueue, chunk, gfp);
- if (error)
- break;
- }
-
- return error;
+ list_for_each_entry(chunk, &msg->chunks, frag_list)
+ sctp_outq_tail(&asoc->outqueue, chunk, gfp);
}
@@ -1427,8 +1421,7 @@
local_cork = 1;
}
/* Send a chunk to our peer. */
- error = sctp_outq_tail(&asoc->outqueue, cmd->obj.chunk,
- gfp);
+ sctp_outq_tail(&asoc->outqueue, cmd->obj.chunk, gfp);
break;
case SCTP_CMD_SEND_PKT:
@@ -1682,7 +1675,7 @@
case SCTP_CMD_FORCE_PRIM_RETRAN:
t = asoc->peer.retran_path;
asoc->peer.retran_path = asoc->peer.primary_path;
- error = sctp_outq_uncork(&asoc->outqueue, gfp);
+ sctp_outq_uncork(&asoc->outqueue, gfp);
local_cork = 0;
asoc->peer.retran_path = t;
break;
@@ -1709,7 +1702,7 @@
sctp_outq_cork(&asoc->outqueue);
local_cork = 1;
}
- error = sctp_cmd_send_msg(asoc, cmd->obj.msg, gfp);
+ sctp_cmd_send_msg(asoc, cmd->obj.msg, gfp);
break;
case SCTP_CMD_SEND_NEXT_ASCONF:
sctp_cmd_send_asconf(asoc);
@@ -1739,9 +1732,9 @@
*/
if (asoc && SCTP_EVENT_T_CHUNK == event_type && chunk) {
if (chunk->end_of_packet || chunk->singleton)
- error = sctp_outq_uncork(&asoc->outqueue, gfp);
+ sctp_outq_uncork(&asoc->outqueue, gfp);
} else if (local_cork)
- error = sctp_outq_uncork(&asoc->outqueue, gfp);
+ sctp_outq_uncork(&asoc->outqueue, gfp);
if (sp->data_ready_signalled)
sp->data_ready_signalled = 0;
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index d88bb2b..026e3bc 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3454,7 +3454,7 @@
}
/* Report violation if chunk len overflows */
- ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+ ch_end = ((__u8 *)ch) + SCTP_PAD4(ntohs(ch->length));
if (ch_end > skb_tail_pointer(skb))
return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
commands);
@@ -4185,7 +4185,7 @@
hdr = unk_chunk->chunk_hdr;
err_chunk = sctp_make_op_error(asoc, unk_chunk,
SCTP_ERROR_UNKNOWN_CHUNK, hdr,
- WORD_ROUND(ntohs(hdr->length)),
+ SCTP_PAD4(ntohs(hdr->length)),
0);
if (err_chunk) {
sctp_add_cmd_sf(commands, SCTP_CMD_REPLY,
@@ -4203,7 +4203,7 @@
hdr = unk_chunk->chunk_hdr;
err_chunk = sctp_make_op_error(asoc, unk_chunk,
SCTP_ERROR_UNKNOWN_CHUNK, hdr,
- WORD_ROUND(ntohs(hdr->length)),
+ SCTP_PAD4(ntohs(hdr->length)),
0);
if (err_chunk) {
sctp_add_cmd_sf(commands, SCTP_CMD_REPLY,
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 9fc417a..6cdc61c 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1958,6 +1958,8 @@
/* Now send the (possibly) fragmented message. */
list_for_each_entry(chunk, &datamsg->chunks, frag_list) {
+ sctp_chunk_hold(chunk);
+
/* Do accounting for the write space. */
sctp_set_owner_w(chunk);
@@ -1970,13 +1972,15 @@
* breaks.
*/
err = sctp_primitive_SEND(net, asoc, datamsg);
- sctp_datamsg_put(datamsg);
/* Did the lower layer accept the chunk? */
- if (err)
+ if (err) {
+ sctp_datamsg_free(datamsg);
goto out_free;
+ }
pr_debug("%s: we sent primitively\n", __func__);
+ sctp_datamsg_put(datamsg);
err = msg_len;
if (unlikely(wait_connect)) {
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index 81b8667..ce54dce 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -233,7 +233,7 @@
}
if (transport->dst) {
- transport->pathmtu = WORD_TRUNC(dst_mtu(transport->dst));
+ transport->pathmtu = SCTP_TRUNC4(dst_mtu(transport->dst));
} else
transport->pathmtu = SCTP_DEFAULT_MAXSEGMENT;
}
@@ -287,7 +287,7 @@
return;
}
if (transport->dst) {
- transport->pathmtu = WORD_TRUNC(dst_mtu(transport->dst));
+ transport->pathmtu = SCTP_TRUNC4(dst_mtu(transport->dst));
/* Initialize sk->sk_rcv_saddr, if the transport is the
* association's active path for getsockname().
diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index d85b803..bea0005 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -383,7 +383,7 @@
ch = (sctp_errhdr_t *)(chunk->skb->data);
cause = ch->cause;
- elen = WORD_ROUND(ntohs(ch->length)) - sizeof(sctp_errhdr_t);
+ elen = SCTP_PAD4(ntohs(ch->length)) - sizeof(sctp_errhdr_t);
/* Pull off the ERROR header. */
skb_pull(chunk->skb, sizeof(sctp_errhdr_t));
@@ -688,7 +688,7 @@
* MUST ignore the padding bytes.
*/
len = ntohs(chunk->chunk_hdr->length);
- padding = WORD_ROUND(len) - len;
+ padding = SCTP_PAD4(len) - len;
/* Fixup cloned skb with just this chunks data. */
skb_trim(skb, chunk->chunk_end - padding - skb->data);
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 877e550..84d0fda 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -140,11 +140,8 @@
* we can go ahead and clear out the lobby in one shot
*/
if (!skb_queue_empty(&sp->pd_lobby)) {
- struct list_head *list;
skb_queue_splice_tail_init(&sp->pd_lobby,
&sk->sk_receive_queue);
- list = (struct list_head *)&sctp_sk(sk)->pd_lobby;
- INIT_LIST_HEAD(list);
return 1;
}
} else {
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 1d28181..d858202 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -569,9 +569,10 @@
struct rsc *found;
memset(&rsci, 0, sizeof(rsci));
- rsci.handle.data = handle->data;
- rsci.handle.len = handle->len;
+ if (dup_to_netobj(&rsci.handle, handle->data, handle->len))
+ return NULL;
found = rsc_lookup(cd, &rsci);
+ rsc_free(&rsci);
if (!found)
return NULL;
if (cache_check(cd, &found->h, NULL))
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 536d0be..799cce6 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -51,6 +51,7 @@
#include <linux/slab.h>
#include <linux/prefetch.h>
#include <linux/sunrpc/addr.h>
+#include <linux/sunrpc/svc_rdma.h>
#include <asm/bitops.h>
#include <linux/module.h> /* try_module_get()/module_put() */
@@ -923,7 +924,7 @@
}
INIT_LIST_HEAD(&buf->rb_recv_bufs);
- for (i = 0; i < buf->rb_max_requests; i++) {
+ for (i = 0; i < buf->rb_max_requests + RPCRDMA_MAX_BC_REQUESTS; i++) {
struct rpcrdma_rep *rep;
rep = rpcrdma_create_rep(r_xprt);
@@ -1018,6 +1019,7 @@
rep = rpcrdma_buffer_get_rep_locked(buf);
rpcrdma_destroy_rep(ia, rep);
}
+ buf->rb_send_count = 0;
spin_lock(&buf->rb_reqslock);
while (!list_empty(&buf->rb_allreqs)) {
@@ -1032,6 +1034,7 @@
spin_lock(&buf->rb_reqslock);
}
spin_unlock(&buf->rb_reqslock);
+ buf->rb_recv_count = 0;
rpcrdma_destroy_mrs(buf);
}
@@ -1074,8 +1077,27 @@
spin_unlock(&buf->rb_mwlock);
}
+static struct rpcrdma_rep *
+rpcrdma_buffer_get_rep(struct rpcrdma_buffer *buffers)
+{
+ /* If an RPC previously completed without a reply (say, a
+ * credential problem or a soft timeout occurs) then hold off
+ * on supplying more Receive buffers until the number of new
+ * pending RPCs catches up to the number of posted Receives.
+ */
+ if (unlikely(buffers->rb_send_count < buffers->rb_recv_count))
+ return NULL;
+
+ if (unlikely(list_empty(&buffers->rb_recv_bufs)))
+ return NULL;
+ buffers->rb_recv_count++;
+ return rpcrdma_buffer_get_rep_locked(buffers);
+}
+
/*
* Get a set of request/reply buffers.
+ *
+ * Reply buffer (if available) is attached to send buffer upon return.
*/
struct rpcrdma_req *
rpcrdma_buffer_get(struct rpcrdma_buffer *buffers)
@@ -1085,21 +1107,15 @@
spin_lock(&buffers->rb_lock);
if (list_empty(&buffers->rb_send_bufs))
goto out_reqbuf;
+ buffers->rb_send_count++;
req = rpcrdma_buffer_get_req_locked(buffers);
- if (list_empty(&buffers->rb_recv_bufs))
- goto out_repbuf;
- req->rl_reply = rpcrdma_buffer_get_rep_locked(buffers);
+ req->rl_reply = rpcrdma_buffer_get_rep(buffers);
spin_unlock(&buffers->rb_lock);
return req;
out_reqbuf:
spin_unlock(&buffers->rb_lock);
- pr_warn("rpcrdma: out of request buffers (%p)\n", buffers);
- return NULL;
-out_repbuf:
- list_add(&req->rl_free, &buffers->rb_send_bufs);
- spin_unlock(&buffers->rb_lock);
- pr_warn("rpcrdma: out of reply buffers (%p)\n", buffers);
+ pr_warn("RPC: %s: out of request buffers\n", __func__);
return NULL;
}
@@ -1117,9 +1133,12 @@
req->rl_reply = NULL;
spin_lock(&buffers->rb_lock);
+ buffers->rb_send_count--;
list_add_tail(&req->rl_free, &buffers->rb_send_bufs);
- if (rep)
+ if (rep) {
+ buffers->rb_recv_count--;
list_add_tail(&rep->rr_list, &buffers->rb_recv_bufs);
+ }
spin_unlock(&buffers->rb_lock);
}
@@ -1133,8 +1152,7 @@
struct rpcrdma_buffer *buffers = req->rl_buffer;
spin_lock(&buffers->rb_lock);
- if (!list_empty(&buffers->rb_recv_bufs))
- req->rl_reply = rpcrdma_buffer_get_rep_locked(buffers);
+ req->rl_reply = rpcrdma_buffer_get_rep(buffers);
spin_unlock(&buffers->rb_lock);
}
@@ -1148,6 +1166,7 @@
struct rpcrdma_buffer *buffers = &rep->rr_rxprt->rx_buf;
spin_lock(&buffers->rb_lock);
+ buffers->rb_recv_count--;
list_add_tail(&rep->rr_list, &buffers->rb_recv_bufs);
spin_unlock(&buffers->rb_lock);
}
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 670fad5..a71b0f5 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -321,6 +321,7 @@
char *rb_pool;
spinlock_t rb_lock; /* protect buf lists */
+ int rb_send_count, rb_recv_count;
struct list_head rb_send_bufs;
struct list_head rb_recv_bufs;
u32 rb_max_requests;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 8ede3bc..bf16883 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1074,7 +1074,7 @@
skb = skb_recv_datagram(sk, 0, 1, &err);
if (skb != NULL) {
xs_udp_data_read_skb(&transport->xprt, sk, skb);
- skb_free_datagram(sk, skb);
+ skb_free_datagram_locked(sk, skb);
continue;
}
if (!test_and_clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state))
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 10b8193..02beb35 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -21,7 +21,6 @@
#include <linux/workqueue.h>
#include <linux/if_vlan.h>
#include <linux/rtnetlink.h>
-#include <net/ip_fib.h>
#include <net/switchdev.h>
/**
@@ -344,8 +343,6 @@
switch (obj->id) {
case SWITCHDEV_OBJ_ID_PORT_VLAN:
return sizeof(struct switchdev_obj_port_vlan);
- case SWITCHDEV_OBJ_ID_IPV4_FIB:
- return sizeof(struct switchdev_obj_ipv4_fib);
case SWITCHDEV_OBJ_ID_PORT_FDB:
return sizeof(struct switchdev_obj_port_fdb);
case SWITCHDEV_OBJ_ID_PORT_MDB:
@@ -1108,184 +1105,6 @@
}
EXPORT_SYMBOL_GPL(switchdev_port_fdb_dump);
-static struct net_device *switchdev_get_lowest_dev(struct net_device *dev)
-{
- const struct switchdev_ops *ops = dev->switchdev_ops;
- struct net_device *lower_dev;
- struct net_device *port_dev;
- struct list_head *iter;
-
- /* Recusively search down until we find a sw port dev.
- * (A sw port dev supports switchdev_port_attr_get).
- */
-
- if (ops && ops->switchdev_port_attr_get)
- return dev;
-
- netdev_for_each_lower_dev(dev, lower_dev, iter) {
- port_dev = switchdev_get_lowest_dev(lower_dev);
- if (port_dev)
- return port_dev;
- }
-
- return NULL;
-}
-
-static struct net_device *switchdev_get_dev_by_nhs(struct fib_info *fi)
-{
- struct switchdev_attr attr = {
- .id = SWITCHDEV_ATTR_ID_PORT_PARENT_ID,
- };
- struct switchdev_attr prev_attr;
- struct net_device *dev = NULL;
- int nhsel;
-
- ASSERT_RTNL();
-
- /* For this route, all nexthop devs must be on the same switch. */
-
- for (nhsel = 0; nhsel < fi->fib_nhs; nhsel++) {
- const struct fib_nh *nh = &fi->fib_nh[nhsel];
-
- if (!nh->nh_dev)
- return NULL;
-
- dev = switchdev_get_lowest_dev(nh->nh_dev);
- if (!dev)
- return NULL;
-
- attr.orig_dev = dev;
- if (switchdev_port_attr_get(dev, &attr))
- return NULL;
-
- if (nhsel > 0 &&
- !netdev_phys_item_id_same(&prev_attr.u.ppid, &attr.u.ppid))
- return NULL;
-
- prev_attr = attr;
- }
-
- return dev;
-}
-
-/**
- * switchdev_fib_ipv4_add - Add/modify switch IPv4 route entry
- *
- * @dst: route's IPv4 destination address
- * @dst_len: destination address length (prefix length)
- * @fi: route FIB info structure
- * @tos: route TOS
- * @type: route type
- * @nlflags: netlink flags passed in (NLM_F_*)
- * @tb_id: route table ID
- *
- * Add/modify switch IPv4 route entry.
- */
-int switchdev_fib_ipv4_add(u32 dst, int dst_len, struct fib_info *fi,
- u8 tos, u8 type, u32 nlflags, u32 tb_id)
-{
- struct switchdev_obj_ipv4_fib ipv4_fib = {
- .obj.id = SWITCHDEV_OBJ_ID_IPV4_FIB,
- .dst = dst,
- .dst_len = dst_len,
- .fi = fi,
- .tos = tos,
- .type = type,
- .nlflags = nlflags,
- .tb_id = tb_id,
- };
- struct net_device *dev;
- int err = 0;
-
- /* Don't offload route if using custom ip rules or if
- * IPv4 FIB offloading has been disabled completely.
- */
-
-#ifdef CONFIG_IP_MULTIPLE_TABLES
- if (fi->fib_net->ipv4.fib_has_custom_rules)
- return 0;
-#endif
-
- if (fi->fib_net->ipv4.fib_offload_disabled)
- return 0;
-
- dev = switchdev_get_dev_by_nhs(fi);
- if (!dev)
- return 0;
-
- ipv4_fib.obj.orig_dev = dev;
- err = switchdev_port_obj_add(dev, &ipv4_fib.obj);
- if (!err)
- fi->fib_flags |= RTNH_F_OFFLOAD;
-
- return err == -EOPNOTSUPP ? 0 : err;
-}
-EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_add);
-
-/**
- * switchdev_fib_ipv4_del - Delete IPv4 route entry from switch
- *
- * @dst: route's IPv4 destination address
- * @dst_len: destination address length (prefix length)
- * @fi: route FIB info structure
- * @tos: route TOS
- * @type: route type
- * @tb_id: route table ID
- *
- * Delete IPv4 route entry from switch device.
- */
-int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
- u8 tos, u8 type, u32 tb_id)
-{
- struct switchdev_obj_ipv4_fib ipv4_fib = {
- .obj.id = SWITCHDEV_OBJ_ID_IPV4_FIB,
- .dst = dst,
- .dst_len = dst_len,
- .fi = fi,
- .tos = tos,
- .type = type,
- .nlflags = 0,
- .tb_id = tb_id,
- };
- struct net_device *dev;
- int err = 0;
-
- if (!(fi->fib_flags & RTNH_F_OFFLOAD))
- return 0;
-
- dev = switchdev_get_dev_by_nhs(fi);
- if (!dev)
- return 0;
-
- ipv4_fib.obj.orig_dev = dev;
- err = switchdev_port_obj_del(dev, &ipv4_fib.obj);
- if (!err)
- fi->fib_flags &= ~RTNH_F_OFFLOAD;
-
- return err == -EOPNOTSUPP ? 0 : err;
-}
-EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_del);
-
-/**
- * switchdev_fib_ipv4_abort - Abort an IPv4 FIB operation
- *
- * @fi: route FIB info structure
- */
-void switchdev_fib_ipv4_abort(struct fib_info *fi)
-{
- /* There was a problem installing this route to the offload
- * device. For now, until we come up with more refined
- * policy handling, abruptly end IPv4 fib offloading for
- * for entire net by flushing offload device(s) of all
- * IPv4 routes, and mark IPv4 fib offloading broken from
- * this point forward.
- */
-
- fib_flush_external(fi->fib_net);
- fi->fib_net->ipv4.fib_offload_disabled = true;
-}
-EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_abort);
-
bool switchdev_port_same_parent_id(struct net_device *a,
struct net_device *b)
{
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 2029b49..4911cd9 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1252,7 +1252,7 @@
if (err)
goto out_fail_reg;
- cfg80211_wq = create_singlethread_workqueue("cfg80211");
+ cfg80211_wq = alloc_ordered_workqueue("cfg80211", WQ_MEM_RECLAIM);
if (!cfg80211_wq) {
err = -ENOMEM;
goto out_fail_wq;
diff --git a/net/wireless/core.h b/net/wireless/core.h
index eee9144..5555e3c 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -249,9 +249,9 @@
};
struct cfg80211_cached_keys {
- struct key_params params[6];
- u8 data[6][WLAN_MAX_KEY_LEN];
- int def, defmgmt;
+ struct key_params params[4];
+ u8 data[4][WLAN_KEY_LEN_WEP104];
+ int def;
};
enum cfg80211_chan_mode {
diff --git a/net/wireless/ibss.c b/net/wireless/ibss.c
index 4a4dda5..eafdfa5 100644
--- a/net/wireless/ibss.c
+++ b/net/wireless/ibss.c
@@ -114,6 +114,9 @@
}
}
+ if (WARN_ON(connkeys && connkeys->def < 0))
+ return -EINVAL;
+
if (WARN_ON(wdev->connect_keys))
kzfree(wdev->connect_keys);
wdev->connect_keys = connkeys;
@@ -284,18 +287,16 @@
if (!netif_running(wdev->netdev))
return 0;
- if (wdev->wext.keys) {
+ if (wdev->wext.keys)
wdev->wext.keys->def = wdev->wext.default_key;
- wdev->wext.keys->defmgmt = wdev->wext.default_mgmt_key;
- }
wdev->wext.ibss.privacy = wdev->wext.default_key != -1;
- if (wdev->wext.keys) {
+ if (wdev->wext.keys && wdev->wext.keys->def != -1) {
ck = kmemdup(wdev->wext.keys, sizeof(*ck), GFP_KERNEL);
if (!ck)
return -ENOMEM;
- for (i = 0; i < 6; i++)
+ for (i = 0; i < 4; i++)
ck->params[i].key = ck->data[i];
}
err = __cfg80211_join_ibss(rdev, wdev->netdev,
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index c284d88..d6abb07 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -222,7 +222,7 @@
ASSERT_WDEV_LOCK(wdev);
if (auth_type == NL80211_AUTHTYPE_SHARED_KEY)
- if (!key || !key_len || key_idx < 0 || key_idx > 4)
+ if (!key || !key_len || key_idx < 0 || key_idx > 3)
return -EINVAL;
if (wdev->current_bss &&
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 4997857..fd111e2 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -848,13 +848,21 @@
struct nlattr *key;
struct cfg80211_cached_keys *result;
int rem, err, def = 0;
+ bool have_key = false;
+
+ nla_for_each_nested(key, keys, rem) {
+ have_key = true;
+ break;
+ }
+
+ if (!have_key)
+ return NULL;
result = kzalloc(sizeof(*result), GFP_KERNEL);
if (!result)
return ERR_PTR(-ENOMEM);
result->def = -1;
- result->defmgmt = -1;
nla_for_each_nested(key, keys, rem) {
memset(&parse, 0, sizeof(parse));
@@ -866,7 +874,7 @@
err = -EINVAL;
if (!parse.p.key)
goto error;
- if (parse.idx < 0 || parse.idx > 4)
+ if (parse.idx < 0 || parse.idx > 3)
goto error;
if (parse.def) {
if (def)
@@ -881,16 +889,24 @@
parse.idx, false, NULL);
if (err)
goto error;
+ if (parse.p.cipher != WLAN_CIPHER_SUITE_WEP40 &&
+ parse.p.cipher != WLAN_CIPHER_SUITE_WEP104) {
+ err = -EINVAL;
+ goto error;
+ }
result->params[parse.idx].cipher = parse.p.cipher;
result->params[parse.idx].key_len = parse.p.key_len;
result->params[parse.idx].key = result->data[parse.idx];
memcpy(result->data[parse.idx], parse.p.key, parse.p.key_len);
- if (parse.p.cipher == WLAN_CIPHER_SUITE_WEP40 ||
- parse.p.cipher == WLAN_CIPHER_SUITE_WEP104) {
- if (no_ht)
- *no_ht = true;
- }
+ /* must be WEP key if we got here */
+ if (no_ht)
+ *no_ht = true;
+ }
+
+ if (result->def < 0) {
+ err = -EINVAL;
+ goto error;
}
return result;
@@ -2525,10 +2541,35 @@
int if_idx = 0;
int wp_start = cb->args[0];
int if_start = cb->args[1];
+ int filter_wiphy = -1;
struct cfg80211_registered_device *rdev;
struct wireless_dev *wdev;
rtnl_lock();
+ if (!cb->args[2]) {
+ struct nl80211_dump_wiphy_state state = {
+ .filter_wiphy = -1,
+ };
+ int ret;
+
+ ret = nl80211_dump_wiphy_parse(skb, cb, &state);
+ if (ret)
+ return ret;
+
+ filter_wiphy = state.filter_wiphy;
+
+ /*
+ * if filtering, set cb->args[2] to +1 since 0 is the default
+ * value needed to determine that parsing is necessary.
+ */
+ if (filter_wiphy >= 0)
+ cb->args[2] = filter_wiphy + 1;
+ else
+ cb->args[2] = -1;
+ } else if (cb->args[2] > 0) {
+ filter_wiphy = cb->args[2] - 1;
+ }
+
list_for_each_entry(rdev, &cfg80211_rdev_list, list) {
if (!net_eq(wiphy_net(&rdev->wiphy), sock_net(skb->sk)))
continue;
@@ -2536,6 +2577,10 @@
wp_idx++;
continue;
}
+
+ if (filter_wiphy >= 0 && filter_wiphy != rdev->wiphy_idx)
+ continue;
+
if_idx = 0;
list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
@@ -6969,7 +7014,7 @@
params.n_counter_offsets_presp = len / sizeof(u16);
if (rdev->wiphy.max_num_csa_counters &&
- (params.n_counter_offsets_beacon >
+ (params.n_counter_offsets_presp >
rdev->wiphy.max_num_csa_counters))
return -EINVAL;
@@ -7359,7 +7404,7 @@
(key.p.cipher != WLAN_CIPHER_SUITE_WEP104 ||
key.p.key_len != WLAN_KEY_LEN_WEP104))
return -EINVAL;
- if (key.idx > 4)
+ if (key.idx > 3)
return -EINVAL;
} else {
key.p.key_len = 0;
@@ -7977,6 +8022,8 @@
}
data = nla_nest_start(skb, attr);
+ if (!data)
+ goto nla_put_failure;
((void **)skb->cb)[0] = rdev;
((void **)skb->cb)[1] = hdr;
@@ -9406,18 +9453,27 @@
if (!freqs)
return -ENOBUFS;
- for (i = 0; i < req->n_channels; i++)
- nla_put_u32(msg, i, req->channels[i]->center_freq);
+ for (i = 0; i < req->n_channels; i++) {
+ if (nla_put_u32(msg, i, req->channels[i]->center_freq))
+ return -ENOBUFS;
+ }
nla_nest_end(msg, freqs);
if (req->n_match_sets) {
matches = nla_nest_start(msg, NL80211_ATTR_SCHED_SCAN_MATCH);
+ if (!matches)
+ return -ENOBUFS;
+
for (i = 0; i < req->n_match_sets; i++) {
match = nla_nest_start(msg, i);
- nla_put(msg, NL80211_SCHED_SCAN_MATCH_ATTR_SSID,
- req->match_sets[i].ssid.ssid_len,
- req->match_sets[i].ssid.ssid);
+ if (!match)
+ return -ENOBUFS;
+
+ if (nla_put(msg, NL80211_SCHED_SCAN_MATCH_ATTR_SSID,
+ req->match_sets[i].ssid.ssid_len,
+ req->match_sets[i].ssid.ssid))
+ return -ENOBUFS;
nla_nest_end(msg, match);
}
nla_nest_end(msg, matches);
@@ -9429,6 +9485,9 @@
for (i = 0; i < req->n_scan_plans; i++) {
scan_plan = nla_nest_start(msg, i + 1);
+ if (!scan_plan)
+ return -ENOBUFS;
+
if (!scan_plan ||
nla_put_u32(msg, NL80211_SCHED_SCAN_PLAN_INTERVAL,
req->scan_plans[i].interval) ||
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index 0358e12..b5bd58d 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -352,52 +352,48 @@
__cfg80211_bss_expire(rdev, jiffies - IEEE80211_SCAN_RESULT_EXPIRE);
}
-const u8 *cfg80211_find_ie(u8 eid, const u8 *ies, int len)
+const u8 *cfg80211_find_ie_match(u8 eid, const u8 *ies, int len,
+ const u8 *match, int match_len,
+ int match_offset)
{
- while (len > 2 && ies[0] != eid) {
+ /* match_offset can't be smaller than 2, unless match_len is
+ * zero, in which case match_offset must be zero as well.
+ */
+ if (WARN_ON((match_len && match_offset < 2) ||
+ (!match_len && match_offset)))
+ return NULL;
+
+ while (len >= 2 && len >= ies[1] + 2) {
+ if ((ies[0] == eid) &&
+ (ies[1] + 2 >= match_offset + match_len) &&
+ !memcmp(ies + match_offset, match, match_len))
+ return ies;
+
len -= ies[1] + 2;
ies += ies[1] + 2;
}
- if (len < 2)
- return NULL;
- if (len < 2 + ies[1])
- return NULL;
- return ies;
+
+ return NULL;
}
-EXPORT_SYMBOL(cfg80211_find_ie);
+EXPORT_SYMBOL(cfg80211_find_ie_match);
const u8 *cfg80211_find_vendor_ie(unsigned int oui, int oui_type,
const u8 *ies, int len)
{
- struct ieee80211_vendor_ie *ie;
- const u8 *pos = ies, *end = ies + len;
- int ie_oui;
+ const u8 *ie;
+ u8 match[] = { oui >> 16, oui >> 8, oui, oui_type };
+ int match_len = (oui_type < 0) ? 3 : sizeof(match);
if (WARN_ON(oui_type > 0xff))
return NULL;
- while (pos < end) {
- pos = cfg80211_find_ie(WLAN_EID_VENDOR_SPECIFIC, pos,
- end - pos);
- if (!pos)
- return NULL;
+ ie = cfg80211_find_ie_match(WLAN_EID_VENDOR_SPECIFIC, ies, len,
+ match, match_len, 2);
- ie = (struct ieee80211_vendor_ie *)pos;
+ if (ie && (ie[1] < 4))
+ return NULL;
- /* make sure we can access ie->len */
- BUILD_BUG_ON(offsetof(struct ieee80211_vendor_ie, len) != 1);
-
- if (ie->len < sizeof(*ie))
- goto cont;
-
- ie_oui = ie->oui[0] << 16 | ie->oui[1] << 8 | ie->oui[2];
- if (ie_oui == oui &&
- (oui_type < 0 || ie->oui_type == oui_type))
- return pos;
-cont:
- pos += 2 + ie->len;
- }
- return NULL;
+ return ie;
}
EXPORT_SYMBOL(cfg80211_find_vendor_ie);
diff --git a/net/wireless/sme.c b/net/wireless/sme.c
index add6824..c08a3b5 100644
--- a/net/wireless/sme.c
+++ b/net/wireless/sme.c
@@ -1043,6 +1043,9 @@
connect->crypto.ciphers_pairwise[0] = cipher;
}
}
+ } else {
+ if (WARN_ON(connkeys))
+ return -EINVAL;
}
wdev->connect_keys = connkeys;
diff --git a/net/wireless/sysfs.c b/net/wireless/sysfs.c
index e46469b..0082f4b 100644
--- a/net/wireless/sysfs.c
+++ b/net/wireless/sysfs.c
@@ -57,7 +57,7 @@
return sprintf(buf, "%pM\n", wiphy->perm_addr);
for (i = 0; i < wiphy->n_addresses; i++)
- buf += sprintf(buf, "%pM\n", &wiphy->addresses[i].addr);
+ buf += sprintf(buf, "%pM\n", wiphy->addresses[i].addr);
return buf - start;
}
diff --git a/net/wireless/util.c b/net/wireless/util.c
index 0675f51..9e6e2aa 100644
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -218,7 +218,7 @@
struct key_params *params, int key_idx,
bool pairwise, const u8 *mac_addr)
{
- if (key_idx > 5)
+ if (key_idx < 0 || key_idx > 5)
return -EINVAL;
if (!pairwise && mac_addr && !(rdev->wiphy.flags & WIPHY_FLAG_IBSS_RSN))
@@ -249,7 +249,13 @@
/* Disallow BIP (group-only) cipher as pairwise cipher */
if (pairwise)
return -EINVAL;
+ if (key_idx < 4)
+ return -EINVAL;
break;
+ case WLAN_CIPHER_SUITE_WEP40:
+ case WLAN_CIPHER_SUITE_WEP104:
+ if (key_idx > 3)
+ return -EINVAL;
default:
break;
}
@@ -906,7 +912,7 @@
if (!wdev->connect_keys)
return;
- for (i = 0; i < 6; i++) {
+ for (i = 0; i < 4; i++) {
if (!wdev->connect_keys->params[i].cipher)
continue;
if (rdev_add_key(rdev, dev, i, false, NULL,
@@ -919,9 +925,6 @@
netdev_err(dev, "failed to set defkey %d\n", i);
continue;
}
- if (wdev->connect_keys->defmgmt == i)
- if (rdev_set_default_mgmt_key(rdev, dev, i))
- netdev_err(dev, "failed to set mgtdef %d\n", i);
}
kzfree(wdev->connect_keys);
diff --git a/net/wireless/wext-compat.c b/net/wireless/wext-compat.c
index 9f27221..7b97d43 100644
--- a/net/wireless/wext-compat.c
+++ b/net/wireless/wext-compat.c
@@ -408,10 +408,10 @@
if (!wdev->wext.keys) {
wdev->wext.keys = kzalloc(sizeof(*wdev->wext.keys),
- GFP_KERNEL);
+ GFP_KERNEL);
if (!wdev->wext.keys)
return -ENOMEM;
- for (i = 0; i < 6; i++)
+ for (i = 0; i < 4; i++)
wdev->wext.keys->params[i].key =
wdev->wext.keys->data[i];
}
@@ -460,7 +460,7 @@
if (err == -ENOENT)
err = 0;
if (!err) {
- if (!addr) {
+ if (!addr && idx < 4) {
memset(wdev->wext.keys->data[idx], 0,
sizeof(wdev->wext.keys->data[idx]));
wdev->wext.keys->params[idx].key_len = 0;
@@ -487,6 +487,9 @@
err = 0;
if (wdev->current_bss)
err = rdev_add_key(rdev, dev, idx, pairwise, addr, params);
+ else if (params->cipher != WLAN_CIPHER_SUITE_WEP40 &&
+ params->cipher != WLAN_CIPHER_SUITE_WEP104)
+ return -EINVAL;
if (err)
return err;
diff --git a/net/wireless/wext-sme.c b/net/wireless/wext-sme.c
index a4e8af3..88f1f69 100644
--- a/net/wireless/wext-sme.c
+++ b/net/wireless/wext-sme.c
@@ -35,7 +35,6 @@
if (wdev->wext.keys) {
wdev->wext.keys->def = wdev->wext.default_key;
- wdev->wext.keys->defmgmt = wdev->wext.default_mgmt_key;
if (wdev->wext.default_key != -1)
wdev->wext.connect.privacy = true;
}
@@ -43,11 +42,11 @@
if (!wdev->wext.connect.ssid_len)
return 0;
- if (wdev->wext.keys) {
+ if (wdev->wext.keys && wdev->wext.keys->def != -1) {
ck = kmemdup(wdev->wext.keys, sizeof(*ck), GFP_KERNEL);
if (!ck)
return -ENOMEM;
- for (i = 0; i < 6; i++)
+ for (i = 0; i < 4; i++)
ck->params[i].key = ck->data[i];
}
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ba8bf51..419bf5d 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -350,6 +350,7 @@
{
tasklet_hrtimer_cancel(&x->mtimer);
del_timer_sync(&x->rtimer);
+ kfree(x->aead);
kfree(x->aalg);
kfree(x->ealg);
kfree(x->calg);
@@ -1431,9 +1432,9 @@
{
struct xfrm_state *x;
- spin_lock_bh(&net->xfrm.xfrm_state_lock);
+ rcu_read_lock();
x = __xfrm_state_lookup(net, mark, daddr, spi, proto, family);
- spin_unlock_bh(&net->xfrm.xfrm_state_lock);
+ rcu_read_unlock();
return x;
}
EXPORT_SYMBOL(xfrm_state_lookup);
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index cb65d91..0889209 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -581,9 +581,12 @@
if (err)
goto error;
- if (attrs[XFRMA_SEC_CTX] &&
- security_xfrm_state_alloc(x, nla_data(attrs[XFRMA_SEC_CTX])))
- goto error;
+ if (attrs[XFRMA_SEC_CTX]) {
+ err = security_xfrm_state_alloc(x,
+ nla_data(attrs[XFRMA_SEC_CTX]));
+ if (err)
+ goto error;
+ }
if ((err = xfrm_alloc_replay_state_esn(&x->replay_esn, &x->preplay_esn,
attrs[XFRMA_REPLAY_ESN_VAL])))
diff --git a/samples/bpf/libbpf.h b/samples/bpf/libbpf.h
index 364582b..ac6edb6 100644
--- a/samples/bpf/libbpf.h
+++ b/samples/bpf/libbpf.h
@@ -85,6 +85,14 @@
.off = 0, \
.imm = IMM })
+#define BPF_MOV32_IMM(DST, IMM) \
+ ((struct bpf_insn) { \
+ .code = BPF_ALU | BPF_MOV | BPF_K, \
+ .dst_reg = DST, \
+ .src_reg = 0, \
+ .off = 0, \
+ .imm = IMM })
+
/* BPF_LD_IMM64 macro encodes single 'load 64-bit immediate' insn */
#define BPF_LD_IMM64(DST, IMM) \
BPF_LD_IMM64_RAW(DST, 0, IMM)
diff --git a/samples/bpf/sockex2_kern.c b/samples/bpf/sockex2_kern.c
index ba0e177..44e5846 100644
--- a/samples/bpf/sockex2_kern.c
+++ b/samples/bpf/sockex2_kern.c
@@ -14,7 +14,7 @@
__be16 h_vlan_encapsulated_proto;
};
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -59,7 +59,7 @@
}
static inline __u64 parse_ip(struct __sk_buff *skb, __u64 nhoff, __u64 *ip_proto,
- struct flow_keys *flow)
+ struct bpf_flow_keys *flow)
{
__u64 verlen;
@@ -83,7 +83,7 @@
}
static inline __u64 parse_ipv6(struct __sk_buff *skb, __u64 nhoff, __u64 *ip_proto,
- struct flow_keys *flow)
+ struct bpf_flow_keys *flow)
{
*ip_proto = load_byte(skb,
nhoff + offsetof(struct ipv6hdr, nexthdr));
@@ -96,7 +96,7 @@
return nhoff;
}
-static inline bool flow_dissector(struct __sk_buff *skb, struct flow_keys *flow)
+static inline bool flow_dissector(struct __sk_buff *skb, struct bpf_flow_keys *flow)
{
__u64 nhoff = ETH_HLEN;
__u64 ip_proto;
@@ -198,7 +198,7 @@
SEC("socket2")
int bpf_prog2(struct __sk_buff *skb)
{
- struct flow_keys flow;
+ struct bpf_flow_keys flow;
struct pair *value;
u32 key;
diff --git a/samples/bpf/sockex3_kern.c b/samples/bpf/sockex3_kern.c
index 41ae2fd..95907f8 100644
--- a/samples/bpf/sockex3_kern.c
+++ b/samples/bpf/sockex3_kern.c
@@ -61,7 +61,7 @@
__be16 h_vlan_encapsulated_proto;
};
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -88,7 +88,7 @@
}
struct globals {
- struct flow_keys flow;
+ struct bpf_flow_keys flow;
};
struct bpf_map_def SEC("maps") percpu_map = {
@@ -114,14 +114,14 @@
struct bpf_map_def SEC("maps") hash_map = {
.type = BPF_MAP_TYPE_HASH,
- .key_size = sizeof(struct flow_keys),
+ .key_size = sizeof(struct bpf_flow_keys),
.value_size = sizeof(struct pair),
.max_entries = 1024,
};
static void update_stats(struct __sk_buff *skb, struct globals *g)
{
- struct flow_keys key = g->flow;
+ struct bpf_flow_keys key = g->flow;
struct pair *value;
value = bpf_map_lookup_elem(&hash_map, &key);
diff --git a/samples/bpf/sockex3_user.c b/samples/bpf/sockex3_user.c
index d4184ab..3fcfd8c 100644
--- a/samples/bpf/sockex3_user.c
+++ b/samples/bpf/sockex3_user.c
@@ -7,7 +7,7 @@
#include <arpa/inet.h>
#include <sys/resource.h>
-struct flow_keys {
+struct bpf_flow_keys {
__be32 src;
__be32 dst;
union {
@@ -49,7 +49,7 @@
(void) f;
for (i = 0; i < 5; i++) {
- struct flow_keys key = {}, next_key;
+ struct bpf_flow_keys key = {}, next_key;
struct pair value;
sleep(1);
diff --git a/samples/bpf/test_verifier.c b/samples/bpf/test_verifier.c
index 1f6cc9b..369ffaa 100644
--- a/samples/bpf/test_verifier.c
+++ b/samples/bpf/test_verifier.c
@@ -29,6 +29,7 @@
struct bpf_insn insns[MAX_INSNS];
int fixup[MAX_FIXUPS];
int prog_array_fixup[MAX_FIXUPS];
+ int test_val_map_fixup[MAX_FIXUPS];
const char *errstr;
const char *errstr_unpriv;
enum {
@@ -39,6 +40,19 @@
enum bpf_prog_type prog_type;
};
+/* Note we want this to be 64 bit aligned so that the end of our array is
+ * actually the end of the structure.
+ */
+#define MAX_ENTRIES 11
+struct test_val {
+ unsigned index;
+ int foo[MAX_ENTRIES];
+};
+
+struct other_val {
+ unsigned int action[32];
+};
+
static struct bpf_test tests[] = {
{
"add+sub+mul",
@@ -291,6 +305,29 @@
.result = REJECT,
},
{
+ "invalid argument register",
+ .insns = {
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_cgroup_classid),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_cgroup_classid),
+ BPF_EXIT_INSN(),
+ },
+ .errstr = "R1 !read_ok",
+ .result = REJECT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "non-invalid argument register",
+ .insns = {
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_1),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_cgroup_classid),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_1, BPF_REG_6),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_cgroup_classid),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
"check valid spill/fill",
.insns = {
/* spill R1(ctx) into stack */
@@ -1210,6 +1247,54 @@
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
},
{
+ "raw_stack: skb_load_bytes, negative len",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, -8),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_4, -8),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_load_bytes),
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_6, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid stack type R3",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "raw_stack: skb_load_bytes, negative len 2",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, -8),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_4, ~0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_load_bytes),
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_6, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid stack type R3",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "raw_stack: skb_load_bytes, zero len",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, -8),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_load_bytes),
+ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_6, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid stack type R3",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
"raw_stack: skb_load_bytes, no init",
.insns = {
BPF_MOV64_IMM(BPF_REG_2, 4),
@@ -1511,7 +1596,7 @@
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
},
{
- "direct packet access: test4",
+ "direct packet access: test4 (write)",
.insns = {
BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
offsetof(struct __sk_buff, data)),
@@ -1524,8 +1609,7 @@
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
},
- .errstr = "cannot write",
- .result = REJECT,
+ .result = ACCEPT,
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
},
{
@@ -1631,6 +1715,26 @@
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
},
{
+ "direct packet access: test10 (write invalid)",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_STX_MEM(BPF_B, BPF_REG_2, BPF_REG_2, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .errstr = "invalid access to packet",
+ .result = REJECT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
"helper access to packet: test1, valid packet_ptr range",
.insns = {
BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
@@ -1736,6 +1840,549 @@
.errstr = "invalid access to packet",
.prog_type = BPF_PROG_TYPE_XDP,
},
+ {
+ "helper access to packet: test6, cls valid packet_ptr range",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 5),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_2),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_update_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup = {5},
+ .result = ACCEPT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test7, cls unchecked packet_ptr",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup = {1},
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test8, cls variable add",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_MOV64_REG(BPF_REG_4, BPF_REG_2),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_3, 10),
+ BPF_LDX_MEM(BPF_B, BPF_REG_5, BPF_REG_2, 0),
+ BPF_MOV64_REG(BPF_REG_4, BPF_REG_2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_5),
+ BPF_MOV64_REG(BPF_REG_5, BPF_REG_4),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_5, BPF_REG_3, 4),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_4),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup = {11},
+ .result = ACCEPT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test9, cls packet_ptr with bad range",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_MOV64_REG(BPF_REG_4, BPF_REG_2),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 4),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_3, 2),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup = {7},
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test10, cls packet_ptr with too short range",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 1),
+ BPF_MOV64_REG(BPF_REG_4, BPF_REG_2),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_3, 3),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup = {6},
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test11, cls unsuitable helper 1",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_7, 4),
+ BPF_MOV64_IMM(BPF_REG_2, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 42),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_store_bytes),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "helper access to the packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test12, cls unsuitable helper 2",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_6, BPF_REG_7, 3),
+ BPF_MOV64_IMM(BPF_REG_2, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 4),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_skb_load_bytes),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "helper access to the packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test13, cls helper ok",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = ACCEPT,
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test14, cls helper fail sub",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 4),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "type=inv expected=fp",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test15, cls helper fail range 1",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_2, 8),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test16, cls helper fail range 2",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_2, -9),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test17, cls helper fail range 3",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_2, ~0),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test18, cls helper fail range zero",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_MOV64_IMM(BPF_REG_2, 0),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test19, pkt end as input",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_7),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "R1 type=pkt_end expected=fp",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "helper access to packet: test20, wrong reg",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 1),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 7),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_7, 6),
+ BPF_MOV64_IMM(BPF_REG_2, 4),
+ BPF_MOV64_IMM(BPF_REG_3, 0),
+ BPF_MOV64_IMM(BPF_REG_4, 0),
+ BPF_MOV64_IMM(BPF_REG_5, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_csum_diff),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .result = REJECT,
+ .errstr = "invalid access to packet",
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ },
+ {
+ "valid map access into an array with a constant",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr_unpriv = "R0 leaks addr",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+ },
+ {
+ "valid map access into an array with a register",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_MOV64_IMM(BPF_REG_1, 4),
+ BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr_unpriv = "R0 leaks addr",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+ },
+ {
+ "valid map access into an array with a variable",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 5),
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
+ BPF_JMP_IMM(BPF_JGE, BPF_REG_1, MAX_ENTRIES, 3),
+ BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr_unpriv = "R0 leaks addr",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+ },
+ {
+ "valid map access into an array with a signed variable",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
+ BPF_JMP_IMM(BPF_JSGT, BPF_REG_1, 0xffffffff, 1),
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ BPF_MOV32_IMM(BPF_REG_2, MAX_ENTRIES),
+ BPF_JMP_REG(BPF_JSGT, BPF_REG_2, BPF_REG_1, 1),
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ BPF_ALU32_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr_unpriv = "R0 leaks addr",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+ },
+ {
+ "invalid map access into an array with a constant",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, (MAX_ENTRIES + 1) << 2,
+ offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr = "invalid access to map value, value_size=48 off=48 size=8",
+ .result = REJECT,
+ },
+ {
+ "invalid map access into an array with a register",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_MOV64_IMM(BPF_REG_1, MAX_ENTRIES + 1),
+ BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr = "R0 min value is outside of the array range",
+ .result = REJECT,
+ },
+ {
+ "invalid map access into an array with a variable",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 4),
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
+ BPF_ALU64_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr = "R0 min value is negative, either use unsigned index or do a if (index >=0) check.",
+ .result = REJECT,
+ },
+ {
+ "invalid map access into an array with no floor check",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7),
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV32_IMM(BPF_REG_2, MAX_ENTRIES),
+ BPF_JMP_REG(BPF_JSGT, BPF_REG_2, BPF_REG_1, 1),
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ BPF_ALU32_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr = "R0 min value is negative, either use unsigned index or do a if (index >=0) check.",
+ .result = REJECT,
+ },
+ {
+ "invalid map access into an array with a invalid max check",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 7),
+ BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0),
+ BPF_MOV32_IMM(BPF_REG_2, MAX_ENTRIES + 1),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_2, BPF_REG_1, 1),
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ BPF_ALU32_IMM(BPF_LSH, BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
+ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3},
+ .errstr = "invalid access to map value, value_size=48 off=44 size=8",
+ .result = REJECT,
+ },
+ {
+ "invalid map access into an array with a invalid max check",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 10),
+ BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_8),
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, offsetof(struct test_val, foo)),
+ BPF_EXIT_INSN(),
+ },
+ .test_val_map_fixup = {3, 11},
+ .errstr = "R0 min value is negative, either use unsigned index or do a if (index >=0) check.",
+ .result = REJECT,
+ },
};
static int probe_filter_length(struct bpf_insn *fp)
@@ -1749,12 +2396,12 @@
return len + 1;
}
-static int create_map(void)
+static int create_map(size_t val_size, int num)
{
int map_fd;
map_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
- sizeof(long long), sizeof(long long), 1024, 0);
+ sizeof(long long), val_size, num, 0);
if (map_fd < 0)
printf("failed to create map '%s'\n", strerror(errno));
@@ -1784,12 +2431,13 @@
int prog_len = probe_filter_length(prog);
int *fixup = tests[i].fixup;
int *prog_array_fixup = tests[i].prog_array_fixup;
+ int *test_val_map_fixup = tests[i].test_val_map_fixup;
int expected_result;
const char *expected_errstr;
- int map_fd = -1, prog_array_fd = -1;
+ int map_fd = -1, prog_array_fd = -1, test_val_map_fd = -1;
if (*fixup) {
- map_fd = create_map();
+ map_fd = create_map(sizeof(long long), 1024);
do {
prog[*fixup].imm = map_fd;
@@ -1804,6 +2452,18 @@
prog_array_fixup++;
} while (*prog_array_fixup);
}
+ if (*test_val_map_fixup) {
+ /* Unprivileged can't create a hash map.*/
+ if (unpriv)
+ continue;
+ test_val_map_fd = create_map(sizeof(struct test_val),
+ 256);
+ do {
+ prog[*test_val_map_fixup].imm = test_val_map_fd;
+ test_val_map_fixup++;
+ } while (*test_val_map_fixup);
+ }
+
printf("#%d %s ", i, tests[i].descr);
prog_fd = bpf_prog_load(prog_type ?: BPF_PROG_TYPE_SOCKET_FILTER,
@@ -1850,6 +2510,8 @@
close(map_fd);
if (prog_array_fd >= 0)
close(prog_array_fd);
+ if (test_val_map_fd >= 0)
+ close(test_val_map_fd);
close(prog_fd);
}
diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c
index f95f232..fd12d71 100644
--- a/samples/bpf/tracex5_kern.c
+++ b/samples/bpf/tracex5_kern.c
@@ -19,20 +19,18 @@
.max_entries = 1024,
};
-SEC("kprobe/seccomp_phase1")
+SEC("kprobe/__seccomp_filter")
int bpf_prog1(struct pt_regs *ctx)
{
- struct seccomp_data sd;
-
- bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+ int sc_nr = (int)PT_REGS_PARM1(ctx);
/* dispatch into next BPF program depending on syscall number */
- bpf_tail_call(ctx, &progs, sd.nr);
+ bpf_tail_call(ctx, &progs, sc_nr);
/* fall through -> unknown syscall */
- if (sd.nr >= __NR_getuid && sd.nr <= __NR_getsid) {
+ if (sc_nr >= __NR_getuid && sc_nr <= __NR_getsid) {
char fmt[] = "syscall=%d (one of get/set uid/pid/gid)\n";
- bpf_trace_printk(fmt, sizeof(fmt), sd.nr);
+ bpf_trace_printk(fmt, sizeof(fmt), sc_nr);
}
return 0;
}
@@ -42,7 +40,7 @@
{
struct seccomp_data sd;
- bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+ bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] == 512) {
char fmt[] = "write(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
@@ -55,7 +53,7 @@
{
struct seccomp_data sd;
- bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM1(ctx));
+ bpf_probe_read(&sd, sizeof(sd), (void *)PT_REGS_PARM2(ctx));
if (sd.args[2] > 128 && sd.args[2] <= 1024) {
char fmt[] = "read(fd=%d, buf=%p, size=%d)\n";
bpf_trace_printk(fmt, sizeof(fmt),
diff --git a/samples/bpf/tracex5_user.c b/samples/bpf/tracex5_user.c
index a04dd3c..36b5925 100644
--- a/samples/bpf/tracex5_user.c
+++ b/samples/bpf/tracex5_user.c
@@ -6,6 +6,7 @@
#include <sys/prctl.h>
#include "libbpf.h"
#include "bpf_load.h"
+#include <sys/resource.h>
/* install fake seccomp program to enable seccomp code path inside the kernel,
* so that our kprobe attached to seccomp_phase1() can be triggered
@@ -27,8 +28,10 @@
{
FILE *f;
char filename[256];
+ struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+ setrlimit(RLIMIT_MEMLOCK, &r);
if (load_bpf_file(filename)) {
printf("%s", bpf_log_buf);
diff --git a/scripts/faddr2line b/scripts/faddr2line
new file mode 100755
index 0000000..450b332
--- /dev/null
+++ b/scripts/faddr2line
@@ -0,0 +1,177 @@
+#!/bin/bash
+#
+# Translate stack dump function offsets.
+#
+# addr2line doesn't work with KASLR addresses. This works similarly to
+# addr2line, but instead takes the 'func+0x123' format as input:
+#
+# $ ./scripts/faddr2line ~/k/vmlinux meminfo_proc_show+0x5/0x568
+# meminfo_proc_show+0x5/0x568:
+# meminfo_proc_show at fs/proc/meminfo.c:27
+#
+# If the address is part of an inlined function, the full inline call chain is
+# printed:
+#
+# $ ./scripts/faddr2line ~/k/vmlinux native_write_msr+0x6/0x27
+# native_write_msr+0x6/0x27:
+# arch_static_branch at arch/x86/include/asm/msr.h:121
+# (inlined by) static_key_false at include/linux/jump_label.h:125
+# (inlined by) native_write_msr at arch/x86/include/asm/msr.h:125
+#
+# The function size after the '/' in the input is optional, but recommended.
+# It's used to help disambiguate any duplicate symbol names, which can occur
+# rarely. If the size is omitted for a duplicate symbol then it's possible for
+# multiple code sites to be printed:
+#
+# $ ./scripts/faddr2line ~/k/vmlinux raw_ioctl+0x5
+# raw_ioctl+0x5/0x20:
+# raw_ioctl at drivers/char/raw.c:122
+#
+# raw_ioctl+0x5/0xb1:
+# raw_ioctl at net/ipv4/raw.c:876
+#
+# Multiple addresses can be specified on a single command line:
+#
+# $ ./scripts/faddr2line ~/k/vmlinux type_show+0x10/45 free_reserved_area+0x90
+# type_show+0x10/0x2d:
+# type_show at drivers/video/backlight/backlight.c:213
+#
+# free_reserved_area+0x90/0x123:
+# free_reserved_area at mm/page_alloc.c:6429 (discriminator 2)
+
+
+set -o errexit
+set -o nounset
+
+command -v awk >/dev/null 2>&1 || die "awk isn't installed"
+command -v readelf >/dev/null 2>&1 || die "readelf isn't installed"
+command -v addr2line >/dev/null 2>&1 || die "addr2line isn't installed"
+
+usage() {
+ echo "usage: faddr2line <object file> <func+offset> <func+offset>..." >&2
+ exit 1
+}
+
+warn() {
+ echo "$1" >&2
+}
+
+die() {
+ echo "ERROR: $1" >&2
+ exit 1
+}
+
+# Try to figure out the source directory prefix so we can remove it from the
+# addr2line output. HACK ALERT: This assumes that start_kernel() is in
+# kernel/init.c! This only works for vmlinux. Otherwise it falls back to
+# printing the absolute path.
+find_dir_prefix() {
+ local objfile=$1
+
+ local start_kernel_addr=$(readelf -sW $objfile | awk '$8 == "start_kernel" {printf "0x%s", $2}')
+ [[ -z $start_kernel_addr ]] && return
+
+ local file_line=$(addr2line -e $objfile $start_kernel_addr)
+ [[ -z $file_line ]] && return
+
+ local prefix=${file_line%init/main.c:*}
+ if [[ -z $prefix ]] || [[ $prefix = $file_line ]]; then
+ return
+ fi
+
+ DIR_PREFIX=$prefix
+ return 0
+}
+
+__faddr2line() {
+ local objfile=$1
+ local func_addr=$2
+ local dir_prefix=$3
+ local print_warnings=$4
+
+ local func=${func_addr%+*}
+ local offset=${func_addr#*+}
+ offset=${offset%/*}
+ local size=
+ [[ $func_addr =~ "/" ]] && size=${func_addr#*/}
+
+ if [[ -z $func ]] || [[ -z $offset ]] || [[ $func = $func_addr ]]; then
+ warn "bad func+offset $func_addr"
+ DONE=1
+ return
+ fi
+
+ # Go through each of the object's symbols which match the func name.
+ # In rare cases there might be duplicates.
+ while read symbol; do
+ local fields=($symbol)
+ local sym_base=0x${fields[1]}
+ local sym_size=${fields[2]}
+ local sym_type=${fields[3]}
+
+ # calculate the address
+ local addr=$(($sym_base + $offset))
+ if [[ -z $addr ]] || [[ $addr = 0 ]]; then
+ warn "bad address: $sym_base + $offset"
+ DONE=1
+ return
+ fi
+ local hexaddr=0x$(printf %x $addr)
+
+ # weed out non-function symbols
+ if [[ $sym_type != "FUNC" ]]; then
+ [[ $print_warnings = 1 ]] &&
+ echo "skipping $func address at $hexaddr due to non-function symbol"
+ continue
+ fi
+
+ # if the user provided a size, make sure it matches the symbol's size
+ if [[ -n $size ]] && [[ $size -ne $sym_size ]]; then
+ [[ $print_warnings = 1 ]] &&
+ echo "skipping $func address at $hexaddr due to size mismatch ($size != $sym_size)"
+ continue;
+ fi
+
+ # make sure the provided offset is within the symbol's range
+ if [[ $offset -gt $sym_size ]]; then
+ [[ $print_warnings = 1 ]] &&
+ echo "skipping $func address at $hexaddr due to size mismatch ($offset > $sym_size)"
+ continue
+ fi
+
+ # separate multiple entries with a blank line
+ [[ $FIRST = 0 ]] && echo
+ FIRST=0
+
+ local hexsize=0x$(printf %x $sym_size)
+ echo "$func+$offset/$hexsize:"
+ addr2line -fpie $objfile $hexaddr | sed "s; $dir_prefix\(\./\)*; ;"
+ DONE=1
+
+ done < <(readelf -sW $objfile | awk -v f=$func '$8 == f {print}')
+}
+
+[[ $# -lt 2 ]] && usage
+
+objfile=$1
+[[ ! -f $objfile ]] && die "can't find objfile $objfile"
+shift
+
+DIR_PREFIX=supercalifragilisticexpialidocious
+find_dir_prefix $objfile
+
+FIRST=1
+while [[ $# -gt 0 ]]; do
+ func_addr=$1
+ shift
+
+ # print any matches found
+ DONE=0
+ __faddr2line $objfile $func_addr $DIR_PREFIX 0
+
+ # if no match was found, print warnings
+ if [[ $DONE = 0 ]]; then
+ __faddr2line $objfile $func_addr $DIR_PREFIX 1
+ warn "no match for $func_addr"
+ fi
+done
diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index d9836c5..11c8d9b 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -3266,6 +3266,9 @@
}
}
+ /* If we exit via err(), this kills all the threads, restores tty. */
+ atexit(cleanup_devices);
+
/* We always have a console device, and it's always device 1. */
setup_console();
@@ -3369,9 +3372,6 @@
/* Ensure that we terminate if a device-servicing child dies. */
signal(SIGCHLD, kill_launcher);
- /* If we exit via err(), this kills all the threads, restores tty. */
- atexit(cleanup_devices);
-
/* If requested, chroot to a directory */
if (chroot_path) {
if (chroot(chroot_path) != 0)