linux/drivers/net/ethernet/mscc/ocelot_net.c

// SPDX-License-Identifier: (GPL-2.0 OR MIT)
/* Microsemi Ocelot Switch driver
*
* This contains glue logic between the switchdev driver operations and the
* mscc_ocelot_switch_lib.
*
* Copyright (c) 2017, 2019 Microsemi Corporation
* Copyright 2020-2021 NXP Semiconductors
*/
#include <linux/if_bridge.h>
#include <net/pkt_cls.h>
#include "ocelot.h"
#include "ocelot_vcap.h"
static struct ocelot *devlink_port_to_ocelot(struct devlink_port *dlp)
{
return devlink_priv(dlp->devlink);
}
static int devlink_port_to_port(struct devlink_port *dlp)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
return dlp - ocelot->devlink_ports;
}
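The port index in devlink_port_to_port() comes from pointer subtraction: since dlp points into the devlink_ports array, subtracting the array base yields the element index. A minimal userspace sketch of the same idiom follows. The names struct sketch_port, struct sketch_switch and sketch_port_index() are hypothetical stand-ins, not kernel types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct devlink_port and struct ocelot. */
struct sketch_port {
	int dummy;
};

struct sketch_switch {
	struct sketch_port ports[8];
};

/* Recover the array index of @p, which must point into sw->ports. */
static int sketch_port_index(const struct sketch_switch *sw,
			     const struct sketch_port *p)
{
	/* Pointer subtraction counts elements, not bytes. */
	ptrdiff_t idx = p - sw->ports;

	return (int)idx;
}
```

The idiom is only valid when the pointer really belongs to the array; the driver relies on every devlink_port it receives having been registered from ocelot->devlink_ports.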
static int ocelot_devlink_sb_pool_get(struct devlink *dl,
unsigned int sb_index, u16 pool_index,
struct devlink_sb_pool_info *pool_info)
{
struct ocelot *ocelot = devlink_priv(dl);
return ocelot_sb_pool_get(ocelot, sb_index, pool_index, pool_info);
}
static int ocelot_devlink_sb_pool_set(struct devlink *dl, unsigned int sb_index,
u16 pool_index, u32 size,
enum devlink_sb_threshold_type threshold_type,
struct netlink_ext_ack *extack)
{
struct ocelot *ocelot = devlink_priv(dl);
return ocelot_sb_pool_set(ocelot, sb_index, pool_index, size,
threshold_type, extack);
}
static int ocelot_devlink_sb_port_pool_get(struct devlink_port *dlp,
unsigned int sb_index, u16 pool_index,
u32 *p_threshold)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_port_pool_get(ocelot, port, sb_index, pool_index,
p_threshold);
}
static int ocelot_devlink_sb_port_pool_set(struct devlink_port *dlp,
unsigned int sb_index, u16 pool_index,
u32 threshold,
struct netlink_ext_ack *extack)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_port_pool_set(ocelot, port, sb_index, pool_index,
threshold, extack);
}
static int
ocelot_devlink_sb_tc_pool_bind_get(struct devlink_port *dlp,
unsigned int sb_index, u16 tc_index,
enum devlink_sb_pool_type pool_type,
u16 *p_pool_index, u32 *p_threshold)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_tc_pool_bind_get(ocelot, port, sb_index, tc_index,
pool_type, p_pool_index,
p_threshold);
}
static int
ocelot_devlink_sb_tc_pool_bind_set(struct devlink_port *dlp,
unsigned int sb_index, u16 tc_index,
enum devlink_sb_pool_type pool_type,
u16 pool_index, u32 threshold,
struct netlink_ext_ack *extack)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_tc_pool_bind_set(ocelot, port, sb_index, tc_index,
pool_type, pool_index, threshold,
extack);
}
static int ocelot_devlink_sb_occ_snapshot(struct devlink *dl,
unsigned int sb_index)
{
struct ocelot *ocelot = devlink_priv(dl);
return ocelot_sb_occ_snapshot(ocelot, sb_index);
}
static int ocelot_devlink_sb_occ_max_clear(struct devlink *dl,
unsigned int sb_index)
{
struct ocelot *ocelot = devlink_priv(dl);
return ocelot_sb_occ_max_clear(ocelot, sb_index);
}
static int ocelot_devlink_sb_occ_port_pool_get(struct devlink_port *dlp,
unsigned int sb_index,
u16 pool_index, u32 *p_cur,
u32 *p_max)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_occ_port_pool_get(ocelot, port, sb_index, pool_index,
p_cur, p_max);
}
static int
ocelot_devlink_sb_occ_tc_port_bind_get(struct devlink_port *dlp,
unsigned int sb_index, u16 tc_index,
enum devlink_sb_pool_type pool_type,
u32 *p_cur, u32 *p_max)
{
struct ocelot *ocelot = devlink_port_to_ocelot(dlp);
int port = devlink_port_to_port(dlp);
return ocelot_sb_occ_tc_port_bind_get(ocelot, port, sb_index,
tc_index, pool_type,
p_cur, p_max);
}
const struct devlink_ops ocelot_devlink_ops = {
.sb_pool_get = ocelot_devlink_sb_pool_get,
.sb_pool_set = ocelot_devlink_sb_pool_set,
.sb_port_pool_get = ocelot_devlink_sb_port_pool_get,
.sb_port_pool_set = ocelot_devlink_sb_port_pool_set,
.sb_tc_pool_bind_get = ocelot_devlink_sb_tc_pool_bind_get,
.sb_tc_pool_bind_set = ocelot_devlink_sb_tc_pool_bind_set,
.sb_occ_snapshot = ocelot_devlink_sb_occ_snapshot,
.sb_occ_max_clear = ocelot_devlink_sb_occ_max_clear,
.sb_occ_port_pool_get = ocelot_devlink_sb_occ_port_pool_get,
.sb_occ_tc_port_bind_get = ocelot_devlink_sb_occ_tc_port_bind_get,
};
int ocelot_port_devlink_init(struct ocelot *ocelot, int port,
enum devlink_port_flavour flavour)
{
struct devlink_port *dlp = &ocelot->devlink_ports[port];
int id_len = sizeof(ocelot->base_mac);
struct devlink *dl = ocelot->devlink;
struct devlink_port_attrs attrs = {};
memcpy(attrs.switch_id.id, &ocelot->base_mac, id_len);
attrs.switch_id.id_len = id_len;
attrs.phys.port_number = port;
attrs.flavour = flavour;
devlink_port_attrs_set(dlp, &attrs);
return devlink_port_register(dl, dlp, port);
}
void ocelot_port_devlink_teardown(struct ocelot *ocelot, int port)
{
struct devlink_port *dlp = &ocelot->devlink_ports[port];
devlink_port_unregister(dlp);
}
static struct devlink_port *ocelot_get_devlink_port(struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
return &ocelot->devlink_ports[port];
}
int ocelot_setup_tc_cls_flower(struct ocelot_port_private *priv,
struct flow_cls_offload *f,
bool ingress)
{
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
if (!ingress)
return -EOPNOTSUPP;
switch (f->command) {
case FLOW_CLS_REPLACE:
return ocelot_cls_flower_replace(ocelot, port, f, ingress);
case FLOW_CLS_DESTROY:
return ocelot_cls_flower_destroy(ocelot, port, f, ingress);
case FLOW_CLS_STATS:
return ocelot_cls_flower_stats(ocelot, port, f, ingress);
default:
return -EOPNOTSUPP;
}
}
static int ocelot_setup_tc_cls_matchall(struct ocelot_port_private *priv,
struct tc_cls_matchall_offload *f,
bool ingress)
{
struct netlink_ext_ack *extack = f->common.extack;
struct ocelot *ocelot = priv->port.ocelot;
struct ocelot_policer pol = { 0 };
struct flow_action_entry *action;
int port = priv->chip_port;
int err;
if (!ingress) {
NL_SET_ERR_MSG_MOD(extack, "Only ingress is supported");
return -EOPNOTSUPP;
}
switch (f->command) {
case TC_CLSMATCHALL_REPLACE:
if (!flow_offload_has_one_action(&f->rule->action)) {
NL_SET_ERR_MSG_MOD(extack,
"Only one action is supported");
return -EOPNOTSUPP;
}
if (priv->tc.block_shared) {
NL_SET_ERR_MSG_MOD(extack,
"Rate limit is not supported on shared blocks");
return -EOPNOTSUPP;
}
action = &f->rule->action.entries[0];
if (action->id != FLOW_ACTION_POLICE) {
NL_SET_ERR_MSG_MOD(extack, "Unsupported action");
return -EOPNOTSUPP;
}
if (priv->tc.police_id && priv->tc.police_id != f->cookie) {
NL_SET_ERR_MSG_MOD(extack,
"Only one policer per port is supported");
return -EEXIST;
}
pol.rate = (u32)div_u64(action->police.rate_bytes_ps, 1000) * 8;
pol.burst = action->police.burst;
err = ocelot_port_policer_add(ocelot, port, &pol);
if (err) {
NL_SET_ERR_MSG_MOD(extack, "Could not add policer");
return err;
}
priv->tc.police_id = f->cookie;
priv->tc.offload_cnt++;
return 0;
case TC_CLSMATCHALL_DESTROY:
if (priv->tc.police_id != f->cookie)
return -ENOENT;
err = ocelot_port_policer_del(ocelot, port);
if (err) {
NL_SET_ERR_MSG_MOD(extack,
"Could not delete policer");
return err;
}
priv->tc.police_id = 0;
priv->tc.offload_cnt--;
return 0;
case TC_CLSMATCHALL_STATS:
default:
return -EOPNOTSUPP;
}
}
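The policer rate above is computed by dividing the bytes-per-second value by 1000 before multiplying by 8, so the integer division truncates first. The sketch below reproduces only that arithmetic in userspace; sketch_rate_kbps() is a hypothetical helper, and reading the result as kilobits per second is an interpretation of the expression, not something stated in this file.

```c
#include <stdint.h>

/*
 * Mirror of the driver's expression: divide bytes/s by 1000 first,
 * then multiply by 8. The integer division truncates, so any remainder
 * below 1000 bytes/s is dropped before scaling to bits.
 */
static uint32_t sketch_rate_kbps(uint64_t rate_bytes_ps)
{
	return (uint32_t)(rate_bytes_ps / 1000) * 8;
}
```

For example, 125000 bytes/s maps to 1000, while 1500 bytes/s maps to 8 rather than 12, because the 500-byte remainder is discarded before the multiplication.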
static int ocelot_setup_tc_block_cb(enum tc_setup_type type,
void *type_data,
void *cb_priv, bool ingress)
{
struct ocelot_port_private *priv = cb_priv;
if (!tc_cls_can_offload_and_chain0(priv->dev, type_data))
return -EOPNOTSUPP;
switch (type) {
case TC_SETUP_CLSMATCHALL:
return ocelot_setup_tc_cls_matchall(priv, type_data, ingress);
case TC_SETUP_CLSFLOWER:
return ocelot_setup_tc_cls_flower(priv, type_data, ingress);
default:
return -EOPNOTSUPP;
}
}
static int ocelot_setup_tc_block_cb_ig(enum tc_setup_type type,
void *type_data,
void *cb_priv)
{
return ocelot_setup_tc_block_cb(type, type_data,
cb_priv, true);
}
static int ocelot_setup_tc_block_cb_eg(enum tc_setup_type type,
void *type_data,
void *cb_priv)
{
return ocelot_setup_tc_block_cb(type, type_data,
cb_priv, false);
}
static LIST_HEAD(ocelot_block_cb_list);
static int ocelot_setup_tc_block(struct ocelot_port_private *priv,
struct flow_block_offload *f)
{
struct flow_block_cb *block_cb;
flow_setup_cb_t *cb;
if (f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS) {
cb = ocelot_setup_tc_block_cb_ig;
priv->tc.block_shared = f->block_shared;
} else if (f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS) {
cb = ocelot_setup_tc_block_cb_eg;
} else {
return -EOPNOTSUPP;
}
f->driver_block_list = &ocelot_block_cb_list;
switch (f->command) {
case FLOW_BLOCK_BIND:
if (flow_block_cb_is_busy(cb, priv, &ocelot_block_cb_list))
return -EBUSY;
block_cb = flow_block_cb_alloc(cb, priv, priv, NULL);
if (IS_ERR(block_cb))
return PTR_ERR(block_cb);
flow_block_cb_add(block_cb, f);
list_add_tail(&block_cb->driver_list, f->driver_block_list);
return 0;
case FLOW_BLOCK_UNBIND:
block_cb = flow_block_cb_lookup(f->block, cb, priv);
if (!block_cb)
return -ENOENT;
flow_block_cb_remove(block_cb, f);
list_del(&block_cb->driver_list);
return 0;
default:
return -EOPNOTSUPP;
}
}
static int ocelot_setup_tc(struct net_device *dev, enum tc_setup_type type,
void *type_data)
{
struct ocelot_port_private *priv = netdev_priv(dev);
switch (type) {
case TC_SETUP_BLOCK:
return ocelot_setup_tc_block(priv, type_data);
default:
return -EOPNOTSUPP;
}
return 0;
}
static void ocelot_port_adjust_link(struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
ocelot_adjust_link(ocelot, port, dev->phydev);
}
static int ocelot_vlan_vid_prepare(struct net_device *dev, u16 vid, bool pvid,
bool untagged)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
return ocelot_vlan_prepare(ocelot, port, vid, pvid, untagged);
}
static int ocelot_vlan_vid_add(struct net_device *dev, u16 vid, bool pvid,
bool untagged)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
int ret;
ret = ocelot_vlan_add(ocelot, port, vid, pvid, untagged);
if (ret)
return ret;
/* Add the port MAC address with the right VLAN information */
ocelot_mact_learn(ocelot, PGID_CPU, dev->dev_addr, vid,
ENTRYTYPE_LOCKED);
return 0;
}
static int ocelot_vlan_vid_del(struct net_device *dev, u16 vid)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
int ret;
/* 8021q removes VID 0 on module unload for all interfaces
* with VLAN filtering feature. We need to keep it to receive
* untagged traffic.
*/
if (vid == 0)
return 0;
ret = ocelot_vlan_del(ocelot, port, vid);
if (ret)
return ret;
/* Delete the port MAC address with the right VLAN information */
ocelot_mact_forget(ocelot, dev->dev_addr, vid);
return 0;
}
static int ocelot_port_open(struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
int err;
if (priv->serdes) {
err = phy_set_mode_ext(priv->serdes, PHY_MODE_ETHERNET,
ocelot_port->phy_mode);
if (err) {
netdev_err(dev, "Could not set mode of SerDes\n");
return err;
}
}
err = phy_connect_direct(dev, priv->phy, &ocelot_port_adjust_link,
ocelot_port->phy_mode);
if (err) {
netdev_err(dev, "Could not attach to PHY\n");
return err;
}
dev->phydev = priv->phy;
phy_attached_info(priv->phy);
phy_start(priv->phy);
ocelot_port_enable(ocelot, port, priv->phy);
return 0;
}
static int ocelot_port_stop(struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
phy_disconnect(priv->phy);
dev->phydev = NULL;
ocelot_port_disable(ocelot, port);
return 0;
}
/* Generate the IFH for frame injection
*
* The IFH is a 128-bit value
* bit 127: bypass the analyzer processing
* bit 56-67: destination mask
* bit 28-29: pop_cnt: 3 disables all rewriting of the frame
* bit 20-27: cpu extraction queue mask
* bit 16: tag type 0: C-tag, 1: S-tag
* bit 0-11: VID
*/
static int ocelot_gen_ifh(u32 *ifh, struct frame_info *info)
{
ifh[0] = IFH_INJ_BYPASS | ((0x1ff & info->rew_op) << 21);
ifh[1] = (0xf00 & info->port) >> 8;
ifh[2] = (0xff & info->port) << 24;
ifh[3] = (info->tag_type << 16) | info->vid;
return 0;
}
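The packing in ocelot_gen_ifh() splits the destination mask across two 32-bit words: mask bits 8-11 land in the low bits of word 1 and mask bits 0-7 in the top byte of word 2, matching IFH bits 56-67. The userspace sketch below replays that layout. struct sketch_frame_info and sketch_gen_ifh() are hypothetical, and the value of SKETCH_IFH_INJ_BYPASS (bit 31 of word 0, i.e. IFH bit 127) is an assumption derived from the comment above, not taken from the driver headers.

```c
#include <stdint.h>

#define SKETCH_IFH_INJ_BYPASS (1u << 31) /* assumed: IFH bit 127 */

/* Hypothetical stand-in for struct frame_info. */
struct sketch_frame_info {
	uint32_t port;     /* destination port mask */
	uint8_t tag_type;  /* 0: C-tag, 1: S-tag */
	uint16_t vid;
	uint32_t rew_op;
};

/* Fill the four IFH words, most significant word first. */
static void sketch_gen_ifh(uint32_t ifh[4],
			   const struct sketch_frame_info *info)
{
	ifh[0] = SKETCH_IFH_INJ_BYPASS | ((0x1ffu & info->rew_op) << 21);
	ifh[1] = (0xf00u & info->port) >> 8;   /* mask bits 8-11 */
	ifh[2] = (0xffu & info->port) << 24;   /* mask bits 0-7 */
	ifh[3] = ((uint32_t)info->tag_type << 16) | info->vid;
}
```

With a destination mask of 1 << 9 only word 1 carries a mask bit (value 2), and with 1 << 3 only word 2 does (value 0x08000000).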
static int ocelot_port_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct skb_shared_info *shinfo = skb_shinfo(skb);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
u32 val, ifh[OCELOT_TAG_LEN / 4];
struct frame_info info = {};
u8 grp = 0; /* Send everything on CPU group 0 */
unsigned int i, count, last;
int port = priv->chip_port;
val = ocelot_read(ocelot, QS_INJ_STATUS);
if (!(val & QS_INJ_STATUS_FIFO_RDY(BIT(grp))) ||
(val & QS_INJ_STATUS_WMARK_REACHED(BIT(grp))))
return NETDEV_TX_BUSY;
ocelot_write_rix(ocelot, QS_INJ_CTRL_GAP_SIZE(1) |
QS_INJ_CTRL_SOF, QS_INJ_CTRL, grp);
info.port = BIT(port);
info.tag_type = IFH_TAG_TYPE_C;
info.vid = skb_vlan_tag_get(skb);
/* Check if timestamping is needed */
if (ocelot->ptp && (shinfo->tx_flags & SKBTX_HW_TSTAMP)) {
info.rew_op = ocelot_port->ptp_cmd;
if (ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
struct sk_buff *clone;
clone = skb_clone_sk(skb);
if (!clone) {
kfree_skb(skb);
return NETDEV_TX_OK;
}
ocelot_port_add_txtstamp_skb(ocelot, port, clone);
info.rew_op |= clone->cb[0] << 3;
}
}
ocelot_gen_ifh(ifh, &info);
for (i = 0; i < OCELOT_TAG_LEN / 4; i++)
ocelot_write_rix(ocelot, (__force u32)cpu_to_be32(ifh[i]),
QS_INJ_WR, grp);
count = (skb->len + 3) / 4;
last = skb->len % 4;
for (i = 0; i < count; i++)
ocelot_write_rix(ocelot, ((u32 *)skb->data)[i], QS_INJ_WR, grp);
/* Add padding */
while (i < (OCELOT_BUFFER_CELL_SZ / 4)) {
ocelot_write_rix(ocelot, 0, QS_INJ_WR, grp);
i++;
}
/* Indicate EOF and valid bytes in last word */
ocelot_write_rix(ocelot, QS_INJ_CTRL_GAP_SIZE(1) |
QS_INJ_CTRL_VLD_BYTES(skb->len < OCELOT_BUFFER_CELL_SZ ? 0 : last) |
QS_INJ_CTRL_EOF,
QS_INJ_CTRL, grp);
/* Add dummy CRC */
ocelot_write_rix(ocelot, 0, QS_INJ_WR, grp);
skb_tx_timestamp(skb);
dev->stats.tx_packets++;
dev->stats.tx_bytes += skb->len;
kfree_skb(skb);
return NETDEV_TX_OK;
}
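The injection loop in ocelot_port_xmit() writes the frame as 32-bit words, pads short frames up to one buffer cell, and reports how many bytes of the last word are valid. The sketch below isolates that arithmetic. SKETCH_CELL_SZ is an assumed stand-in for OCELOT_BUFFER_CELL_SZ (taken to be 60 bytes here), and struct sketch_inj_layout and sketch_inj_compute() are hypothetical names.

```c
#define SKETCH_CELL_SZ 60 /* assumed stand-in for OCELOT_BUFFER_CELL_SZ */

struct sketch_inj_layout {
	unsigned int data_words; /* words carrying frame data */
	unsigned int pad_words;  /* zero words appended as padding */
	unsigned int vld_bytes;  /* value written to the VLD_BYTES field */
};

/* Compute the word counts used when injecting a frame of @len bytes. */
static struct sketch_inj_layout sketch_inj_compute(unsigned int len)
{
	struct sketch_inj_layout l;
	unsigned int min_words = SKETCH_CELL_SZ / 4;

	/* Round the length up to whole 32-bit words. */
	l.data_words = (len + 3) / 4;
	/* Short frames are padded with zero words up to one cell. */
	l.pad_words = l.data_words < min_words ?
		      min_words - l.data_words : 0;
	/* Padded frames report 0; otherwise the remainder of len / 4. */
	l.vld_bytes = len < SKETCH_CELL_SZ ? 0 : len % 4;
	return l;
}
```

Under these assumptions a 42-byte frame uses 11 data words plus 4 padding words, and a 61-byte frame uses 16 data words with one valid byte in the last word.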
enum ocelot_action_type {
OCELOT_MACT_LEARN,
OCELOT_MACT_FORGET,
};
struct ocelot_mact_work_ctx {
struct work_struct work;
struct ocelot *ocelot;
enum ocelot_action_type type;
union {
/* OCELOT_MACT_LEARN */
struct {
unsigned char addr[ETH_ALEN];
u16 vid;
enum macaccess_entry_type entry_type;
int pgid;
} learn;
/* OCELOT_MACT_FORGET */
struct {
unsigned char addr[ETH_ALEN];
u16 vid;
} forget;
};
};
#define ocelot_work_to_ctx(x) \
container_of((x), struct ocelot_mact_work_ctx, work)
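ocelot_work_to_ctx() relies on container_of() to walk back from the embedded work_struct to the context that contains it. A simplified userspace version of the idiom is sketched below. sketch_container_of() omits the type checking of the kernel macro, and struct sketch_work, struct sketch_ctx and sketch_work_to_ctx() are hypothetical.

```c
#include <stddef.h>

/* Simplified container_of: subtract the member offset from the pointer. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct sketch_work {
	int pending;
};

/* Hypothetical stand-in for struct ocelot_mact_work_ctx. */
struct sketch_ctx {
	int id;
	struct sketch_work work;
};

/* Map a pointer to the embedded work item back to its context. */
static struct sketch_ctx *sketch_work_to_ctx(struct sketch_work *w)
{
	return sketch_container_of(w, struct sketch_ctx, work);
}
```

This is why the work handler can receive only a work pointer and still reach the per-request data, which it frees once the action has run.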
static void ocelot_mact_work(struct work_struct *work)
{
struct ocelot_mact_work_ctx *w = ocelot_work_to_ctx(work);
struct ocelot *ocelot = w->ocelot;
switch (w->type) {
case OCELOT_MACT_LEARN:
ocelot_mact_learn(ocelot, w->learn.pgid, w->learn.addr,
w->learn.vid, w->learn.entry_type);
break;
case OCELOT_MACT_FORGET:
ocelot_mact_forget(ocelot, w->forget.addr, w->forget.vid);
break;
default:
break;
}
kfree(w);
}
static int ocelot_enqueue_mact_action(struct ocelot *ocelot,
const struct ocelot_mact_work_ctx *ctx)
{
struct ocelot_mact_work_ctx *w = kmemdup(ctx, sizeof(*w), GFP_ATOMIC);
if (!w)
return -ENOMEM;
w->ocelot = ocelot;
INIT_WORK(&w->work, ocelot_mact_work);
queue_work(ocelot->owq, &w->work);
return 0;
}
static int ocelot_mc_unsync(struct net_device *dev, const unsigned char *addr)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
struct ocelot_mact_work_ctx w;
ether_addr_copy(w.forget.addr, addr);
w.forget.vid = ocelot_port->pvid_vlan.vid;
w.type = OCELOT_MACT_FORGET;
return ocelot_enqueue_mact_action(ocelot, &w);
}
static int ocelot_mc_sync(struct net_device *dev, const unsigned char *addr)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
struct ocelot_mact_work_ctx w;
ether_addr_copy(w.learn.addr, addr);
w.learn.vid = ocelot_port->pvid_vlan.vid;
w.learn.pgid = PGID_CPU;
w.learn.entry_type = ENTRYTYPE_LOCKED;
w.type = OCELOT_MACT_LEARN;
return ocelot_enqueue_mact_action(ocelot, &w);
}
static void ocelot_set_rx_mode(struct net_device *dev)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
u32 val;
int i;
/* This doesn't handle promiscuous mode because the bridge core is
* setting IFF_PROMISC on all slave interfaces and all frames would be
* forwarded to the CPU port.
*/
val = GENMASK(ocelot->num_phys_ports - 1, 0);
for_each_nonreserved_multicast_dest_pgid(ocelot, i)
ocelot_write_rix(ocelot, val, ANA_PGID_PGID, i);
__dev_mc_sync(dev, ocelot_mc_sync, ocelot_mc_unsync);
}
static int ocelot_port_set_mac_address(struct net_device *dev, void *p)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
const struct sockaddr *addr = p;
/* Learn the new net device MAC address in the mac table. */
ocelot_mact_learn(ocelot, PGID_CPU, addr->sa_data,
ocelot_port->pvid_vlan.vid, ENTRYTYPE_LOCKED);
/* Then forget the previous one. */
ocelot_mact_forget(ocelot, dev->dev_addr, ocelot_port->pvid_vlan.vid);
ether_addr_copy(dev->dev_addr, addr->sa_data);
return 0;
}
static void ocelot_get_stats64(struct net_device *dev,
struct rtnl_link_stats64 *stats)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
/* Configure the port to read the stats from */
ocelot_write(ocelot, SYS_STAT_CFG_STAT_VIEW(port),
SYS_STAT_CFG);
/* Get Rx stats */
stats->rx_bytes = ocelot_read(ocelot, SYS_COUNT_RX_OCTETS);
stats->rx_packets = ocelot_read(ocelot, SYS_COUNT_RX_SHORTS) +
ocelot_read(ocelot, SYS_COUNT_RX_FRAGMENTS) +
ocelot_read(ocelot, SYS_COUNT_RX_JABBERS) +
ocelot_read(ocelot, SYS_COUNT_RX_LONGS) +
ocelot_read(ocelot, SYS_COUNT_RX_64) +
ocelot_read(ocelot, SYS_COUNT_RX_65_127) +
ocelot_read(ocelot, SYS_COUNT_RX_128_255) +
ocelot_read(ocelot, SYS_COUNT_RX_256_1023) +
ocelot_read(ocelot, SYS_COUNT_RX_1024_1526) +
ocelot_read(ocelot, SYS_COUNT_RX_1527_MAX);
stats->multicast = ocelot_read(ocelot, SYS_COUNT_RX_MULTICAST);
stats->rx_dropped = dev->stats.rx_dropped;
/* Get Tx stats */
stats->tx_bytes = ocelot_read(ocelot, SYS_COUNT_TX_OCTETS);
stats->tx_packets = ocelot_read(ocelot, SYS_COUNT_TX_64) +
ocelot_read(ocelot, SYS_COUNT_TX_65_127) +
ocelot_read(ocelot, SYS_COUNT_TX_128_511) +
ocelot_read(ocelot, SYS_COUNT_TX_512_1023) +
ocelot_read(ocelot, SYS_COUNT_TX_1024_1526) +
ocelot_read(ocelot, SYS_COUNT_TX_1527_MAX);
stats->tx_dropped = ocelot_read(ocelot, SYS_COUNT_TX_DROPS) +
ocelot_read(ocelot, SYS_COUNT_TX_AGING);
stats->collisions = ocelot_read(ocelot, SYS_COUNT_TX_COLLISION);
}
static int ocelot_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
const unsigned char *addr,
u16 vid, u16 flags,
struct netlink_ext_ack *extack)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
return ocelot_fdb_add(ocelot, port, addr, vid);
}
static int ocelot_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
struct net_device *dev,
const unsigned char *addr, u16 vid)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
return ocelot_fdb_del(ocelot, port, addr, vid);
}
static int ocelot_port_fdb_dump(struct sk_buff *skb,
struct netlink_callback *cb,
struct net_device *dev,
struct net_device *filter_dev, int *idx)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
struct ocelot_dump_ctx dump = {
.dev = dev,
.skb = skb,
.cb = cb,
.idx = *idx,
};
int port = priv->chip_port;
int ret;
ret = ocelot_fdb_dump(ocelot, port, ocelot_port_fdb_do_dump, &dump);
*idx = dump.idx;
return ret;
}
static int ocelot_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
u16 vid)
{
return ocelot_vlan_vid_add(dev, vid, false, false);
}
static int ocelot_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
u16 vid)
{
return ocelot_vlan_vid_del(dev, vid);
}
static void ocelot_vlan_mode(struct ocelot *ocelot, int port,
netdev_features_t features)
{
u32 val;
/* Filtering */
val = ocelot_read(ocelot, ANA_VLANMASK);
if (features & NETIF_F_HW_VLAN_CTAG_FILTER)
val |= BIT(port);
else
val &= ~BIT(port);
ocelot_write(ocelot, val, ANA_VLANMASK);
}
static int ocelot_set_features(struct net_device *dev,
netdev_features_t features)
{
netdev_features_t changed = dev->features ^ features;
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
if ((dev->features & NETIF_F_HW_TC) > (features & NETIF_F_HW_TC) &&
priv->tc.offload_cnt) {
netdev_err(dev,
"Cannot disable HW TC offload while offloads active\n");
return -EBUSY;
}
if (changed & NETIF_F_HW_VLAN_CTAG_FILTER)
ocelot_vlan_mode(ocelot, port, features);
return 0;
}
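ocelot_set_features() uses two bit tests: an XOR of the old and requested feature sets to find flags that changed, and a masked greater-than comparison to detect a flag being switched off. The sketch below shows both in userspace. The flag values and the helper names are hypothetical and do not correspond to the real NETIF_F_* bit positions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_F_VLAN_FILTER (1ull << 0) /* hypothetical flag bits */
#define SKETCH_F_HW_TC       (1ull << 1)

/* True if @flag differs between the old and the requested feature set. */
static bool sketch_feature_changed(uint64_t old, uint64_t requested,
				   uint64_t flag)
{
	return ((old ^ requested) & flag) != 0;
}

/* True if @flag is set in @old but cleared in @requested. */
static bool sketch_feature_disabled(uint64_t old, uint64_t requested,
				    uint64_t flag)
{
	return (old & flag) > (requested & flag);
}
```

The comparison form works for a single-bit flag because the masked value is either zero or the flag itself, so "greater than" can only mean a transition from set to cleared.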
static int ocelot_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot *ocelot = priv->port.ocelot;
int port = priv->chip_port;
/* If the attached PHY device isn't capable of timestamping operations,
* use our own (when possible).
*/
if (!phy_has_hwtstamp(dev->phydev) && ocelot->ptp) {
switch (cmd) {
case SIOCSHWTSTAMP:
return ocelot_hwstamp_set(ocelot, port, ifr);
case SIOCGHWTSTAMP:
return ocelot_hwstamp_get(ocelot, port, ifr);
}
}
return phy_mii_ioctl(dev->phydev, ifr, cmd);
}
static const struct net_device_ops ocelot_port_netdev_ops = {
.ndo_open = ocelot_port_open,
.ndo_stop = ocelot_port_stop,
.ndo_start_xmit = ocelot_port_xmit,
.ndo_set_rx_mode = ocelot_set_rx_mode,
.ndo_set_mac_address = ocelot_port_set_mac_address,
.ndo_get_stats64 = ocelot_get_stats64,
.ndo_fdb_add = ocelot_port_fdb_add,
.ndo_fdb_del = ocelot_port_fdb_del,
.ndo_fdb_dump = ocelot_port_fdb_dump,
.ndo_vlan_rx_add_vid = ocelot_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = ocelot_vlan_rx_kill_vid,
.ndo_set_features = ocelot_set_features,
.ndo_setup_tc = ocelot_setup_tc,
.ndo_do_ioctl = ocelot_ioctl,
.ndo_get_devlink_port = ocelot_get_devlink_port,
};

struct net_device *ocelot_port_to_netdev(struct ocelot *ocelot, int port)
{
	struct ocelot_port *ocelot_port = ocelot->ports[port];
	struct ocelot_port_private *priv;

	if (!ocelot_port)
		return NULL;

	priv = container_of(ocelot_port, struct ocelot_port_private, port);

	return priv->dev;
}

/* Checks if the net_device instance given to us originates from our driver */
static bool ocelot_netdevice_dev_check(const struct net_device *dev)
{
	return dev->netdev_ops == &ocelot_port_netdev_ops;
}

int ocelot_netdev_to_port(struct net_device *dev)
{
	struct ocelot_port_private *priv;

	if (!dev || !ocelot_netdevice_dev_check(dev))
		return -EINVAL;

	priv = netdev_priv(dev);

	return priv->chip_port;
}

static void ocelot_port_get_strings(struct net_device *netdev, u32 sset,
				    u8 *data)
{
	struct ocelot_port_private *priv = netdev_priv(netdev);
	struct ocelot *ocelot = priv->port.ocelot;
	int port = priv->chip_port;

	ocelot_get_strings(ocelot, port, sset, data);
}

static void ocelot_port_get_ethtool_stats(struct net_device *dev,
					  struct ethtool_stats *stats,
					  u64 *data)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot *ocelot = priv->port.ocelot;
	int port = priv->chip_port;

	ocelot_get_ethtool_stats(ocelot, port, data);
}

static int ocelot_port_get_sset_count(struct net_device *dev, int sset)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot *ocelot = priv->port.ocelot;
	int port = priv->chip_port;

	return ocelot_get_sset_count(ocelot, port, sset);
}

static int ocelot_port_get_ts_info(struct net_device *dev,
				   struct ethtool_ts_info *info)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot *ocelot = priv->port.ocelot;
	int port = priv->chip_port;

	if (!ocelot->ptp)
		return ethtool_op_get_ts_info(dev, info);

	return ocelot_get_ts_info(ocelot, port, info);
}

static const struct ethtool_ops ocelot_ethtool_ops = {
	.get_strings = ocelot_port_get_strings,
	.get_ethtool_stats = ocelot_port_get_ethtool_stats,
	.get_sset_count = ocelot_port_get_sset_count,
	.get_link_ksettings = phy_ethtool_get_link_ksettings,
	.set_link_ksettings = phy_ethtool_set_link_ksettings,
	.get_ts_info = ocelot_port_get_ts_info,
};

static void ocelot_port_attr_stp_state_set(struct ocelot *ocelot, int port,
					   u8 state)
{
	ocelot_bridge_stp_state_set(ocelot, port, state);
}

static void ocelot_port_attr_ageing_set(struct ocelot *ocelot, int port,
					unsigned long ageing_clock_t)
{
	unsigned long ageing_jiffies = clock_t_to_jiffies(ageing_clock_t);
	u32 ageing_time = jiffies_to_msecs(ageing_jiffies);

	ocelot_set_ageing_time(ocelot, ageing_time);
}

static void ocelot_port_attr_mc_set(struct ocelot *ocelot, int port, bool mc)
{
	u32 cpu_fwd_mcast = ANA_PORT_CPU_FWD_CFG_CPU_IGMP_REDIR_ENA |
			    ANA_PORT_CPU_FWD_CFG_CPU_MLD_REDIR_ENA |
			    ANA_PORT_CPU_FWD_CFG_CPU_IPMC_CTRL_COPY_ENA;
	u32 val = 0;

	if (mc)
		val = cpu_fwd_mcast;

	ocelot_rmw_gix(ocelot, val, cpu_fwd_mcast,
		       ANA_PORT_CPU_FWD_CFG, port);
}

static int ocelot_port_attr_set(struct net_device *dev,
				const struct switchdev_attr *attr)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot *ocelot = priv->port.ocelot;
	int port = priv->chip_port;
	int err = 0;

	switch (attr->id) {
	case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
		ocelot_port_attr_stp_state_set(ocelot, port, attr->u.stp_state);
		break;
	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
		ocelot_port_attr_ageing_set(ocelot, port, attr->u.ageing_time);
		break;
	case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING:
		ocelot_port_vlan_filtering(ocelot, port, attr->u.vlan_filtering);
		break;
	case SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED:
		ocelot_port_attr_mc_set(ocelot, port, !attr->u.mc_disabled);
		break;
	default:
		err = -EOPNOTSUPP;
		break;
	}

	return err;
}

static int ocelot_port_obj_add_vlan(struct net_device *dev,
				    const struct switchdev_obj_port_vlan *vlan)
{
	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
	int ret;

	ret = ocelot_vlan_vid_prepare(dev, vlan->vid, pvid, untagged);
	if (ret)
		return ret;

	return ocelot_vlan_vid_add(dev, vlan->vid, pvid, untagged);
}

static int ocelot_port_obj_add_mdb(struct net_device *dev,
				   const struct switchdev_obj_port_mdb *mdb)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot_port *ocelot_port = &priv->port;
	struct ocelot *ocelot = ocelot_port->ocelot;
	int port = priv->chip_port;

	return ocelot_port_mdb_add(ocelot, port, mdb);
}

static int ocelot_port_obj_del_mdb(struct net_device *dev,
				   const struct switchdev_obj_port_mdb *mdb)
{
	struct ocelot_port_private *priv = netdev_priv(dev);
	struct ocelot_port *ocelot_port = &priv->port;
	struct ocelot *ocelot = ocelot_port->ocelot;
	int port = priv->chip_port;

	return ocelot_port_mdb_del(ocelot, port, mdb);
}

static int ocelot_port_obj_add(struct net_device *dev,
			       const struct switchdev_obj *obj,
			       struct netlink_ext_ack *extack)
{
	int ret = 0;

	switch (obj->id) {
	case SWITCHDEV_OBJ_ID_PORT_VLAN:
		ret = ocelot_port_obj_add_vlan(dev,
					       SWITCHDEV_OBJ_PORT_VLAN(obj));
		break;
	case SWITCHDEV_OBJ_ID_PORT_MDB:
		ret = ocelot_port_obj_add_mdb(dev, SWITCHDEV_OBJ_PORT_MDB(obj));
		break;
	default:
		return -EOPNOTSUPP;
	}

	return ret;
}

static int ocelot_port_obj_del(struct net_device *dev,
			       const struct switchdev_obj *obj)
{
	int ret = 0;

	switch (obj->id) {
	case SWITCHDEV_OBJ_ID_PORT_VLAN:
		ret = ocelot_vlan_vid_del(dev,
					  SWITCHDEV_OBJ_PORT_VLAN(obj)->vid);
		break;
	case SWITCHDEV_OBJ_ID_PORT_MDB:
		ret = ocelot_port_obj_del_mdb(dev, SWITCHDEV_OBJ_PORT_MDB(obj));
		break;
	default:
		return -EOPNOTSUPP;
	}

	return ret;
}
static int ocelot_netdevice_changeupper(struct net_device *dev,
struct netdev_notifier_changeupper_info *info)
{
struct ocelot_port_private *priv = netdev_priv(dev);
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
int err = 0;
if (netif_is_bridge_master(info->upper_dev)) {
if (info->linking) {
err = ocelot_port_bridge_join(ocelot, port,
info->upper_dev);
} else {
err = ocelot_port_bridge_leave(ocelot, port,
info->upper_dev);
}
}
if (netif_is_lag_master(info->upper_dev)) {
if (info->linking) {
err = ocelot_port_lag_join(ocelot, port,
info->upper_dev,
info->upper_info);
if (err == -EOPNOTSUPP) {
/* Not fatal: report it via extack and let the LAG run without offload. */
NL_SET_ERR_MSG_MOD(info->info.extack,
"Offloading not supported");
err = 0;
}
} else {
ocelot_port_lag_leave(ocelot, port,
info->upper_dev);
}
}
return notifier_from_errno(err);
}
static int
ocelot_netdevice_lag_changeupper(struct net_device *dev,
struct netdev_notifier_changeupper_info *info)
{
struct net_device *lower;
struct list_head *iter;
int err = NOTIFY_DONE;
netdev_for_each_lower_dev(dev, lower, iter) {
err = ocelot_netdevice_changeupper(lower, info);
/* err is already a notifier value; stop only if it encodes an errno. */
if (notifier_to_errno(err))
return err;
}
return NOTIFY_DONE;
}
static int
ocelot_netdevice_changelowerstate(struct net_device *dev,
struct netdev_lag_lower_state_info *info)
{
struct ocelot_port_private *priv = netdev_priv(dev);
bool is_active = info->link_up && info->tx_enabled;
struct ocelot_port *ocelot_port = &priv->port;
struct ocelot *ocelot = ocelot_port->ocelot;
int port = priv->chip_port;
if (!ocelot_port->bond)
return NOTIFY_DONE;
if (ocelot_port->lag_tx_active == is_active)
return NOTIFY_DONE;
ocelot_port_lag_change(ocelot, port, is_active);
return NOTIFY_OK;
}
static int ocelot_netdevice_event(struct notifier_block *unused,
unsigned long event, void *ptr)
{
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
switch (event) {
case NETDEV_CHANGEUPPER: {
struct netdev_notifier_changeupper_info *info = ptr;
if (ocelot_netdevice_dev_check(dev))
return ocelot_netdevice_changeupper(dev, info);
if (netif_is_lag_master(dev))
return ocelot_netdevice_lag_changeupper(dev, info);
break;
}
case NETDEV_CHANGELOWERSTATE: {
struct netdev_notifier_changelowerstate_info *info = ptr;
if (!ocelot_netdevice_dev_check(dev))
break;
return ocelot_netdevice_changelowerstate(dev,
info->lower_state_info);
}
default:
break;
}
return NOTIFY_DONE;
}
struct notifier_block ocelot_netdevice_nb __read_mostly = {
.notifier_call = ocelot_netdevice_event,
};
static int ocelot_switchdev_event(struct notifier_block *unused,
unsigned long event, void *ptr)
{
struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
int err;
switch (event) {
case SWITCHDEV_PORT_ATTR_SET:
err = switchdev_handle_port_attr_set(dev, ptr,
ocelot_netdevice_dev_check,
ocelot_port_attr_set);
return notifier_from_errno(err);
}
return NOTIFY_DONE;
}
struct notifier_block ocelot_switchdev_nb __read_mostly = {
.notifier_call = ocelot_switchdev_event,
};
static int ocelot_switchdev_blocking_event(struct notifier_block *unused,
unsigned long event, void *ptr)
{
struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
int err;
switch (event) {
/* Blocking events. */
case SWITCHDEV_PORT_OBJ_ADD:
err = switchdev_handle_port_obj_add(dev, ptr,
ocelot_netdevice_dev_check,
ocelot_port_obj_add);
return notifier_from_errno(err);
case SWITCHDEV_PORT_OBJ_DEL:
err = switchdev_handle_port_obj_del(dev, ptr,
ocelot_netdevice_dev_check,
ocelot_port_obj_del);
return notifier_from_errno(err);
case SWITCHDEV_PORT_ATTR_SET:
err = switchdev_handle_port_attr_set(dev, ptr,
ocelot_netdevice_dev_check,
ocelot_port_attr_set);
return notifier_from_errno(err);
}
return NOTIFY_DONE;
}
struct notifier_block ocelot_switchdev_blocking_nb __read_mostly = {
.notifier_call = ocelot_switchdev_blocking_event,
};
int ocelot_probe_port(struct ocelot *ocelot, int port, struct regmap *target,
struct phy_device *phy)
{
struct ocelot_port_private *priv;
struct ocelot_port *ocelot_port;
struct net_device *dev;
int err;
dev = alloc_etherdev(sizeof(struct ocelot_port_private));
if (!dev)
return -ENOMEM;
SET_NETDEV_DEV(dev, ocelot->dev);
priv = netdev_priv(dev);
priv->dev = dev;
priv->phy = phy;
priv->chip_port = port;
ocelot_port = &priv->port;
ocelot_port->ocelot = ocelot;
ocelot_port->target = target;
ocelot->ports[port] = ocelot_port;
dev->netdev_ops = &ocelot_port_netdev_ops;
dev->ethtool_ops = &ocelot_ethtool_ops;
dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXFCS |
NETIF_F_HW_TC;
dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_TC;
/* Derive the port MAC from the switch base MAC, offset by the port index. */
memcpy(dev->dev_addr, ocelot->base_mac, ETH_ALEN);
dev->dev_addr[ETH_ALEN - 1] += port;
ocelot_mact_learn(ocelot, PGID_CPU, dev->dev_addr,
ocelot_port->pvid_vlan.vid, ENTRYTYPE_LOCKED);
ocelot_init_port(ocelot, port);
err = register_netdev(dev);
if (err) {
dev_err(ocelot->dev, "register_netdev failed\n");
free_netdev(dev);
/* Clear the slot so later cleanup does not free this netdev a second time. */
ocelot->ports[port] = NULL;
return err;
}
return 0;
}
void ocelot_release_port(struct ocelot_port *ocelot_port)
{
struct ocelot_port_private *priv = container_of(ocelot_port,
struct ocelot_port_private,
port);
unregister_netdev(priv->dev);
free_netdev(priv->dev);
}