linux/drivers/gpu/drm/i915/display/intel_bw.c

1787 lines
48 KiB
C
Raw Permalink Normal View History

drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
// SPDX-License-Identifier: MIT
/*
* Copyright © 2019 Intel Corporation
*/
#include <drm/drm_atomic_state_helper.h>
#include "soc/intel_dram.h"
#include "i915_drv.h"
#include "i915_reg.h"
#include "i915_utils.h"
#include "intel_atomic.h"
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
#include "intel_bw.h"
#include "intel_cdclk.h"
#include "intel_display_core.h"
#include "intel_display_regs.h"
#include "intel_display_types.h"
#include "intel_mchbar_regs.h"
#include "intel_pcode.h"
#include "intel_uncore.h"
#include "skl_watermark.h"
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
struct intel_dbuf_bw {
unsigned int max_bw[I915_MAX_DBUF_SLICES];
u8 active_planes[I915_MAX_DBUF_SLICES];
};
struct intel_bw_state {
struct intel_global_state base;
struct intel_dbuf_bw dbuf_bw[I915_MAX_PIPES];
/*
* Contains a bit mask, used to determine, whether correspondent
* pipe allows SAGV or not.
*/
u8 pipe_sagv_reject;
/* bitmask of active pipes */
u8 active_pipes;
/*
* From MTL onwards, to lock a QGV point, punit expects the peak BW of
* the selected QGV point as the parameter in multiples of 100MB/s
*/
u16 qgv_point_peakbw;
/*
* Current QGV points mask, which restricts
* some particular SAGV states, not to confuse
* with pipe_sagv_mask.
*/
u16 qgv_points_mask;
unsigned int data_rate[I915_MAX_PIPES];
u8 num_active_planes[I915_MAX_PIPES];
};
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
/* Parameters for Qclk Geyserville (QGV) */
struct intel_qgv_point {
u16 dclk, t_rp, t_rdpre, t_rc, t_ras, t_rcd;
};
#define DEPROGBWPCLIMIT 60
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
struct intel_psf_gv_point {
u8 clk; /* clock in multiples of 16.6666 MHz */
};
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
struct intel_qgv_info {
struct intel_qgv_point points[I915_NUM_QGV_POINTS];
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
struct intel_psf_gv_point psf_points[I915_NUM_PSF_GV_POINTS];
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
u8 num_points;
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
u8 num_psf_points;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
u8 t_bl;
2021-10-15 14:00:41 -07:00
u8 max_numchannels;
u8 channel_width;
u8 deinterleave;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
};
static int dg1_mchbar_read_qgv_point_info(struct intel_display *display,
struct intel_qgv_point *sp,
int point)
{
struct drm_i915_private *i915 = to_i915(display->drm);
u32 dclk_ratio, dclk_reference;
u32 val;
val = intel_uncore_read(&i915->uncore, SA_PERF_STATUS_0_0_0_MCHBAR_PC);
dclk_ratio = REG_FIELD_GET(DG1_QCLK_RATIO_MASK, val);
if (val & DG1_QCLK_REFERENCE)
dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */
else
dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
2021-10-15 14:00:41 -07:00
sp->dclk = DIV_ROUND_UP((16667 * dclk_ratio * dclk_reference) + 500, 1000);
val = intel_uncore_read(&i915->uncore, SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
if (val & DG1_GEAR_TYPE)
sp->dclk *= 2;
if (sp->dclk == 0)
return -EINVAL;
val = intel_uncore_read(&i915->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR);
sp->t_rp = REG_FIELD_GET(DG1_DRAM_T_RP_MASK, val);
sp->t_rdpre = REG_FIELD_GET(DG1_DRAM_T_RDPRE_MASK, val);
val = intel_uncore_read(&i915->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH);
sp->t_rcd = REG_FIELD_GET(DG1_DRAM_T_RCD_MASK, val);
sp->t_ras = REG_FIELD_GET(DG1_DRAM_T_RAS_MASK, val);
sp->t_rc = sp->t_rp + sp->t_ras;
return 0;
}
static int icl_pcode_read_qgv_point_info(struct intel_display *display,
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
struct intel_qgv_point *sp,
int point)
{
u32 val = 0, val2 = 0;
2021-10-15 14:00:41 -07:00
u16 dclk;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
int ret;
ret = intel_pcode_read(display->drm, ICL_PCODE_MEM_SUBSYSYSTEM_INFO |
ICL_PCODE_MEM_SS_READ_QGV_POINT_INFO(point),
&val, &val2);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
if (ret)
return ret;
2021-10-15 14:00:41 -07:00
dclk = val & 0xffff;
sp->dclk = DIV_ROUND_UP((16667 * dclk) + (DISPLAY_VER(display) >= 12 ? 500 : 0),
1000);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
sp->t_rp = (val & 0xff0000) >> 16;
sp->t_rcd = (val & 0xff000000) >> 24;
sp->t_rdpre = val2 & 0xff;
sp->t_ras = (val2 & 0xff00) >> 8;
sp->t_rc = sp->t_rp + sp->t_ras;
return 0;
}
static int adls_pcode_read_psf_gv_point_info(struct intel_display *display,
struct intel_psf_gv_point *points)
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
{
u32 val = 0;
int ret;
int i;
ret = intel_pcode_read(display->drm, ICL_PCODE_MEM_SUBSYSYSTEM_INFO |
ADL_PCODE_MEM_SS_READ_PSF_GV_INFO, &val, NULL);
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
if (ret)
return ret;
for (i = 0; i < I915_NUM_PSF_GV_POINTS; i++) {
points[i].clk = val & 0xff;
val >>= 8;
}
return 0;
}
static u16 icl_qgv_points_mask(struct intel_display *display)
{
unsigned int num_psf_gv_points = display->bw.max[0].num_psf_gv_points;
unsigned int num_qgv_points = display->bw.max[0].num_qgv_points;
u16 qgv_points = 0, psf_points = 0;
/*
* We can _not_ use the whole ADLS_QGV_PT_MASK here, as PCode rejects
* it with failure if we try masking any unadvertised points.
* So need to operate only with those returned from PCode.
*/
if (num_qgv_points > 0)
qgv_points = GENMASK(num_qgv_points - 1, 0);
if (num_psf_gv_points > 0)
psf_points = GENMASK(num_psf_gv_points - 1, 0);
return ICL_PCODE_REQ_QGV_PT(qgv_points) | ADLS_PCODE_REQ_PSF_PT(psf_points);
}
static bool is_sagv_enabled(struct intel_display *display, u16 points_mask)
{
return !is_power_of_2(~points_mask & icl_qgv_points_mask(display) &
ICL_PCODE_REQ_QGV_PT_MASK);
}
static int icl_pcode_restrict_qgv_points(struct intel_display *display,
u32 points_mask)
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
{
int ret;
if (DISPLAY_VER(display) >= 14)
return 0;
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
/* bspec says to keep retrying for at least 1 ms */
ret = intel_pcode_request(display->drm, ICL_PCODE_SAGV_DE_MEM_SS_CONFIG,
points_mask,
ICL_PCODE_REP_QGV_MASK | ADLS_PCODE_REP_PSF_MASK,
ICL_PCODE_REP_QGV_SAFE | ADLS_PCODE_REP_PSF_SAFE,
1);
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
if (ret < 0) {
drm_err(display->drm,
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
"Failed to disable qgv points (0x%x) points: 0x%x\n",
ret, points_mask);
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
return ret;
}
display->sagv.status = is_sagv_enabled(display, points_mask) ?
I915_SAGV_ENABLED : I915_SAGV_DISABLED;
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
return 0;
}
static int mtl_read_qgv_point_info(struct intel_display *display,
struct intel_qgv_point *sp, int point)
{
struct drm_i915_private *i915 = to_i915(display->drm);
u32 val, val2;
u16 dclk;
val = intel_uncore_read(&i915->uncore,
MTL_MEM_SS_INFO_QGV_POINT_LOW(point));
val2 = intel_uncore_read(&i915->uncore,
MTL_MEM_SS_INFO_QGV_POINT_HIGH(point));
dclk = REG_FIELD_GET(MTL_DCLK_MASK, val);
sp->dclk = DIV_ROUND_CLOSEST(16667 * dclk, 1000);
sp->t_rp = REG_FIELD_GET(MTL_TRP_MASK, val);
sp->t_rcd = REG_FIELD_GET(MTL_TRCD_MASK, val);
sp->t_rdpre = REG_FIELD_GET(MTL_TRDPRE_MASK, val2);
sp->t_ras = REG_FIELD_GET(MTL_TRAS_MASK, val2);
sp->t_rc = sp->t_rp + sp->t_ras;
return 0;
}
static int
intel_read_qgv_point_info(struct intel_display *display,
struct intel_qgv_point *sp,
int point)
{
if (DISPLAY_VER(display) >= 14)
return mtl_read_qgv_point_info(display, sp, point);
else if (display->platform.dg1)
return dg1_mchbar_read_qgv_point_info(display, sp, point);
else
return icl_pcode_read_qgv_point_info(display, sp, point);
}
static int icl_get_qgv_points(struct intel_display *display,
const struct dram_info *dram_info,
2021-10-15 14:00:41 -07:00
struct intel_qgv_info *qi,
bool is_y_tile)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
int i, ret;
qi->num_points = dram_info->num_qgv_points;
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
qi->num_psf_points = dram_info->num_psf_gv_points;
if (DISPLAY_VER(display) >= 14) {
switch (dram_info->type) {
case INTEL_DRAM_DDR4:
qi->t_bl = 4;
qi->max_numchannels = 2;
qi->channel_width = 64;
qi->deinterleave = 2;
break;
case INTEL_DRAM_DDR5:
qi->t_bl = 8;
qi->max_numchannels = 4;
qi->channel_width = 32;
qi->deinterleave = 2;
break;
case INTEL_DRAM_LPDDR4:
case INTEL_DRAM_LPDDR5:
qi->t_bl = 16;
qi->max_numchannels = 8;
qi->channel_width = 16;
qi->deinterleave = 4;
break;
case INTEL_DRAM_GDDR:
case INTEL_DRAM_GDDR_ECC:
qi->channel_width = 32;
break;
default:
MISSING_CASE(dram_info->type);
return -EINVAL;
}
} else if (DISPLAY_VER(display) >= 12) {
switch (dram_info->type) {
case INTEL_DRAM_DDR4:
2021-10-15 14:00:41 -07:00
qi->t_bl = is_y_tile ? 8 : 4;
qi->max_numchannels = 2;
qi->channel_width = 64;
qi->deinterleave = is_y_tile ? 1 : 2;
break;
case INTEL_DRAM_DDR5:
2021-10-15 14:00:41 -07:00
qi->t_bl = is_y_tile ? 16 : 8;
qi->max_numchannels = 4;
qi->channel_width = 32;
qi->deinterleave = is_y_tile ? 1 : 2;
break;
case INTEL_DRAM_LPDDR4:
if (display->platform.rocketlake) {
2021-10-15 14:00:41 -07:00
qi->t_bl = 8;
qi->max_numchannels = 4;
qi->channel_width = 32;
qi->deinterleave = 2;
break;
}
fallthrough;
case INTEL_DRAM_LPDDR5:
qi->t_bl = 16;
qi->max_numchannels = 8;
qi->channel_width = 16;
qi->deinterleave = is_y_tile ? 2 : 4;
break;
default:
qi->t_bl = 16;
2021-10-15 14:00:41 -07:00
qi->max_numchannels = 1;
break;
}
} else if (DISPLAY_VER(display) == 11) {
qi->t_bl = dram_info->type == INTEL_DRAM_DDR4 ? 4 : 8;
2021-10-15 14:00:41 -07:00
qi->max_numchannels = 1;
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
if (drm_WARN_ON(display->drm,
drm/i915/display: Make WARN* drm specific where drm_device ptr is available drm specific WARN* calls include device information in the backtrace, so we know what device the warnings originate from. Covert all the calls of WARN* with device specific drm_WARN* variants in functions where drm_device or drm_i915_private struct pointer is readily available. The conversion was done automatically with below coccinelle semantic patch. checkpatch errors/warnings are fixed manually. @rule1@ identifier func, T; @@ func(...) { ... struct drm_device *T = ...; <... ( -WARN( +drm_WARN(T, ...) | -WARN_ON( +drm_WARN_ON(T, ...) | -WARN_ONCE( +drm_WARN_ONCE(T, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(T, ...) ) ...> } @rule2@ identifier func, T; @@ func(struct drm_device *T,...) { <... ( -WARN( +drm_WARN(T, ...) | -WARN_ON( +drm_WARN_ON(T, ...) | -WARN_ONCE( +drm_WARN_ONCE(T, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(T, ...) ) ...> } @rule3@ identifier func, T; @@ func(...) { ... struct drm_i915_private *T = ...; <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } @rule4@ identifier func, T; @@ func(struct drm_i915_private *T,...) { <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200128181603.27767-20-pankaj.laxminarayan.bharadiya@intel.com
2020-01-28 23:46:01 +05:30
qi->num_points > ARRAY_SIZE(qi->points)))
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
qi->num_points = ARRAY_SIZE(qi->points);
for (i = 0; i < qi->num_points; i++) {
struct intel_qgv_point *sp = &qi->points[i];
ret = intel_read_qgv_point_info(display, sp, i);
if (ret) {
drm_dbg_kms(display->drm, "Could not read QGV %d info\n", i);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return ret;
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
drm_dbg_kms(display->drm,
"QGV %d: DCLK=%d tRP=%d tRDPRE=%d tRAS=%d tRCD=%d tRC=%d\n",
i, sp->dclk, sp->t_rp, sp->t_rdpre, sp->t_ras,
sp->t_rcd, sp->t_rc);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
if (qi->num_psf_points > 0) {
ret = adls_pcode_read_psf_gv_point_info(display, qi->psf_points);
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
if (ret) {
drm_err(display->drm, "Failed to read PSF point data; PSF points will not be considered in bandwidth calculations.\n");
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
qi->num_psf_points = 0;
}
for (i = 0; i < qi->num_psf_points; i++)
drm_dbg_kms(display->drm,
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
"PSF GV %d: CLK=%d \n",
i, qi->psf_points[i].clk);
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return 0;
}
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
static int adl_calc_psf_bw(int clk)
{
/*
* clk is multiples of 16.666MHz (100/6)
* According to BSpec PSF GV bandwidth is
* calculated as BW = 64 * clk * 16.666Mhz
*/
return DIV_ROUND_CLOSEST(64 * clk * 100, 6);
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
static int icl_sagv_max_dclk(const struct intel_qgv_info *qi)
{
u16 dclk = 0;
int i;
for (i = 0; i < qi->num_points; i++)
dclk = max(dclk, qi->points[i].dclk);
return dclk;
}
struct intel_sa_info {
u16 displayrtids;
u8 deburst, deprogbwlimit, derating;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
};
static const struct intel_sa_info icl_sa_info = {
.deburst = 8,
.deprogbwlimit = 25, /* GB/s */
.displayrtids = 128,
.derating = 10,
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
};
static const struct intel_sa_info tgl_sa_info = {
.deburst = 16,
.deprogbwlimit = 34, /* GB/s */
.displayrtids = 256,
.derating = 10,
};
static const struct intel_sa_info rkl_sa_info = {
2021-10-15 14:00:41 -07:00
.deburst = 8,
.deprogbwlimit = 20, /* GB/s */
.displayrtids = 128,
.derating = 10,
};
static const struct intel_sa_info adls_sa_info = {
.deburst = 16,
.deprogbwlimit = 38, /* GB/s */
.displayrtids = 256,
.derating = 10,
};
static const struct intel_sa_info adlp_sa_info = {
.deburst = 16,
.deprogbwlimit = 38, /* GB/s */
.displayrtids = 256,
.derating = 20,
};
static const struct intel_sa_info mtl_sa_info = {
.deburst = 32,
.deprogbwlimit = 38, /* GB/s */
.displayrtids = 256,
.derating = 10,
};
static const struct intel_sa_info xe2_hpd_sa_info = {
.derating = 30,
.deprogbwlimit = 53,
/* Other values not used by simplified algorithm */
};
static const struct intel_sa_info xe2_hpd_ecc_sa_info = {
.derating = 45,
.deprogbwlimit = 53,
/* Other values not used by simplified algorithm */
};
static const struct intel_sa_info xe3lpd_sa_info = {
.deburst = 32,
.deprogbwlimit = 65, /* GB/s */
.displayrtids = 256,
.derating = 10,
};
static const struct intel_sa_info xe3lpd_3002_sa_info = {
.deburst = 32,
.deprogbwlimit = 22, /* GB/s */
.displayrtids = 256,
.derating = 10,
};
static int icl_get_bw_info(struct intel_display *display,
const struct dram_info *dram_info,
const struct intel_sa_info *sa)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
struct intel_qgv_info qi = {};
bool is_y_tile = true; /* assume y tile may be used */
int num_channels = max_t(u8, 1, dram_info->num_channels);
2021-10-15 14:00:41 -07:00
int ipqdepth, ipqdepthpch = 16;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
int dclk_max;
int maxdebw;
int num_groups = ARRAY_SIZE(display->bw.max);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
int i, ret;
ret = icl_get_qgv_points(display, dram_info, &qi, is_y_tile);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
if (ret) {
drm_dbg_kms(display->drm,
"Failed to get memory subsystem information, ignoring bandwidth limits");
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return ret;
}
dclk_max = icl_sagv_max_dclk(&qi);
2021-10-15 14:00:41 -07:00
maxdebw = min(sa->deprogbwlimit * 1000, dclk_max * 16 * 6 / 10);
ipqdepth = min(ipqdepthpch, sa->displayrtids / num_channels);
qi.deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
for (i = 0; i < num_groups; i++) {
struct intel_bw_info *bi = &display->bw.max[i];
2021-10-15 14:00:41 -07:00
int clpchgroup;
int j;
clpchgroup = (sa->deburst * qi.deinterleave / num_channels) << i;
bi->num_planes = (ipqdepth - clpchgroup) / clpchgroup + 1;
bi->num_qgv_points = qi.num_points;
bi->num_psf_gv_points = qi.num_psf_points;
for (j = 0; j < qi.num_points; j++) {
const struct intel_qgv_point *sp = &qi.points[j];
int ct, bw;
/*
* Max row cycle time
*
* FIXME what is the logic behind the
* assumed burst length?
*/
ct = max_t(int, sp->t_rc, sp->t_rp + sp->t_rcd +
(clpchgroup - 1) * qi.t_bl + sp->t_rdpre);
bw = DIV_ROUND_UP(sp->dclk * clpchgroup * 32 * num_channels, ct);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
2021-10-15 14:00:41 -07:00
bi->deratedbw[j] = min(maxdebw,
bw * (100 - sa->derating) / 100);
drm_dbg_kms(display->drm,
2021-10-15 14:00:41 -07:00
"BW%d / QGV %d: num_planes=%d deratedbw=%u\n",
i, j, bi->num_planes, bi->deratedbw[j]);
}
}
/*
* In case if SAGV is disabled in BIOS, we always get 1
* SAGV point, but we can't send PCode commands to restrict it
* as it will fail and pointless anyway.
*/
if (qi.num_points == 1)
display->sagv.status = I915_SAGV_NOT_CONTROLLED;
2021-10-15 14:00:41 -07:00
else
display->sagv.status = I915_SAGV_ENABLED;
2021-10-15 14:00:41 -07:00
return 0;
}
static int tgl_get_bw_info(struct intel_display *display,
const struct dram_info *dram_info,
const struct intel_sa_info *sa)
2021-10-15 14:00:41 -07:00
{
struct intel_qgv_info qi = {};
bool is_y_tile = true; /* assume y tile may be used */
int num_channels = max_t(u8, 1, dram_info->num_channels);
2021-10-15 14:00:41 -07:00
int ipqdepth, ipqdepthpch = 16;
int dclk_max;
int maxdebw, peakbw;
int clperchgroup;
int num_groups = ARRAY_SIZE(display->bw.max);
2021-10-15 14:00:41 -07:00
int i, ret;
ret = icl_get_qgv_points(display, dram_info, &qi, is_y_tile);
2021-10-15 14:00:41 -07:00
if (ret) {
drm_dbg_kms(display->drm,
2021-10-15 14:00:41 -07:00
"Failed to get memory subsystem information, ignoring bandwidth limits");
return ret;
}
if (DISPLAY_VER(display) < 14 &&
(dram_info->type == INTEL_DRAM_LPDDR4 || dram_info->type == INTEL_DRAM_LPDDR5))
2021-10-15 14:00:41 -07:00
num_channels *= 2;
qi.deinterleave = qi.deinterleave ? : DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
if (num_channels < qi.max_numchannels && DISPLAY_VER(display) >= 12)
2021-10-15 14:00:41 -07:00
qi.deinterleave = max(DIV_ROUND_UP(qi.deinterleave, 2), 1);
if (DISPLAY_VER(display) >= 12 && num_channels > qi.max_numchannels)
drm_warn(display->drm, "Number of channels exceeds max number of channels.");
2021-10-15 14:00:41 -07:00
if (qi.max_numchannels != 0)
num_channels = min_t(u8, num_channels, qi.max_numchannels);
dclk_max = icl_sagv_max_dclk(&qi);
peakbw = num_channels * DIV_ROUND_UP(qi.channel_width, 8) * dclk_max;
maxdebw = min(sa->deprogbwlimit * 1000, peakbw * DEPROGBWPCLIMIT / 100);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
ipqdepth = min(ipqdepthpch, sa->displayrtids / num_channels);
2021-10-15 14:00:41 -07:00
/*
* clperchgroup = 4kpagespermempage * clperchperblock,
* clperchperblock = 8 / num_channels * interleave
*/
clperchgroup = 4 * DIV_ROUND_UP(8, num_channels) * qi.deinterleave;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
2021-10-15 14:00:41 -07:00
for (i = 0; i < num_groups; i++) {
struct intel_bw_info *bi = &display->bw.max[i];
2021-10-15 14:00:41 -07:00
struct intel_bw_info *bi_next;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
int clpchgroup;
int j;
2021-10-15 14:00:41 -07:00
clpchgroup = (sa->deburst * qi.deinterleave / num_channels) << i;
drm/i915: fix null pointer dereference Asus chromebook CX550 crashes during boot on v5.17-rc1 kernel. The root cause is null pointer defeference of bi_next in tgl_get_bw_info() in drivers/gpu/drm/i915/display/intel_bw.c. BUG: kernel NULL pointer dereference, address: 000000000000002e PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 1 Comm: swapper/0 Tainted: G U 5.17.0-rc1 Hardware name: Google Delbin/Delbin, BIOS Google_Delbin.13672.156.3 05/14/2021 RIP: 0010:tgl_get_bw_info+0x2de/0x510 ... [ 2.554467] Call Trace: [ 2.554467] <TASK> [ 2.554467] intel_bw_init_hw+0x14a/0x434 [ 2.554467] ? _printk+0x59/0x73 [ 2.554467] ? _dev_err+0x77/0x91 [ 2.554467] i915_driver_hw_probe+0x329/0x33e [ 2.554467] i915_driver_probe+0x4c8/0x638 [ 2.554467] i915_pci_probe+0xf8/0x14e [ 2.554467] ? _raw_spin_unlock_irqrestore+0x12/0x2c [ 2.554467] pci_device_probe+0xaa/0x142 [ 2.554467] really_probe+0x13f/0x2f4 [ 2.554467] __driver_probe_device+0x9e/0xd3 [ 2.554467] driver_probe_device+0x24/0x7c [ 2.554467] __driver_attach+0xba/0xcf [ 2.554467] ? driver_attach+0x1f/0x1f [ 2.554467] bus_for_each_dev+0x8c/0xc0 [ 2.554467] bus_add_driver+0x11b/0x1f7 [ 2.554467] driver_register+0x60/0xea [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16 [ 2.554467] i915_init+0x2c/0xb9 [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16 [ 2.554467] do_one_initcall+0x12e/0x2b3 [ 2.554467] do_initcall_level+0xd6/0xf3 [ 2.554467] do_initcalls+0x4e/0x79 [ 2.554467] kernel_init_freeable+0xed/0x14d [ 2.554467] ? rest_init+0xc1/0xc1 [ 2.554467] kernel_init+0x1a/0x120 [ 2.554467] ret_from_fork+0x1f/0x30 [ 2.554467] </TASK> ... Kernel panic - not syncing: Fatal exception Fixes: c64a9a7c05be ("drm/i915: Update memory bandwidth formulae") Signed-off-by: Łukasz Bartosik <lb@semihalf.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220201153354.11971-1-lukasz.bartosik@semihalf.com
2022-02-01 16:33:54 +01:00
if (i < num_groups - 1) {
bi_next = &display->bw.max[i + 1];
drm/i915: fix null pointer dereference Asus chromebook CX550 crashes during boot on v5.17-rc1 kernel. The root cause is null pointer defeference of bi_next in tgl_get_bw_info() in drivers/gpu/drm/i915/display/intel_bw.c. BUG: kernel NULL pointer dereference, address: 000000000000002e PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 1 Comm: swapper/0 Tainted: G U 5.17.0-rc1 Hardware name: Google Delbin/Delbin, BIOS Google_Delbin.13672.156.3 05/14/2021 RIP: 0010:tgl_get_bw_info+0x2de/0x510 ... [ 2.554467] Call Trace: [ 2.554467] <TASK> [ 2.554467] intel_bw_init_hw+0x14a/0x434 [ 2.554467] ? _printk+0x59/0x73 [ 2.554467] ? _dev_err+0x77/0x91 [ 2.554467] i915_driver_hw_probe+0x329/0x33e [ 2.554467] i915_driver_probe+0x4c8/0x638 [ 2.554467] i915_pci_probe+0xf8/0x14e [ 2.554467] ? _raw_spin_unlock_irqrestore+0x12/0x2c [ 2.554467] pci_device_probe+0xaa/0x142 [ 2.554467] really_probe+0x13f/0x2f4 [ 2.554467] __driver_probe_device+0x9e/0xd3 [ 2.554467] driver_probe_device+0x24/0x7c [ 2.554467] __driver_attach+0xba/0xcf [ 2.554467] ? driver_attach+0x1f/0x1f [ 2.554467] bus_for_each_dev+0x8c/0xc0 [ 2.554467] bus_add_driver+0x11b/0x1f7 [ 2.554467] driver_register+0x60/0xea [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16 [ 2.554467] i915_init+0x2c/0xb9 [ 2.554467] ? mipi_dsi_bus_init+0x16/0x16 [ 2.554467] do_one_initcall+0x12e/0x2b3 [ 2.554467] do_initcall_level+0xd6/0xf3 [ 2.554467] do_initcalls+0x4e/0x79 [ 2.554467] kernel_init_freeable+0xed/0x14d [ 2.554467] ? rest_init+0xc1/0xc1 [ 2.554467] kernel_init+0x1a/0x120 [ 2.554467] ret_from_fork+0x1f/0x30 [ 2.554467] </TASK> ... Kernel panic - not syncing: Fatal exception Fixes: c64a9a7c05be ("drm/i915: Update memory bandwidth formulae") Signed-off-by: Łukasz Bartosik <lb@semihalf.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220201153354.11971-1-lukasz.bartosik@semihalf.com
2022-02-01 16:33:54 +01:00
if (clpchgroup < clperchgroup)
bi_next->num_planes = (ipqdepth - clpchgroup) /
clpchgroup + 1;
else
bi_next->num_planes = 0;
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
bi->num_qgv_points = qi.num_points;
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
bi->num_psf_gv_points = qi.num_psf_points;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
for (j = 0; j < qi.num_points; j++) {
const struct intel_qgv_point *sp = &qi.points[j];
int ct, bw;
/*
* Max row cycle time
*
* FIXME what is the logic behind the
* assumed burst length?
*/
ct = max_t(int, sp->t_rc, sp->t_rp + sp->t_rcd +
(clpchgroup - 1) * qi.t_bl + sp->t_rdpre);
2021-10-15 14:00:41 -07:00
bw = DIV_ROUND_UP(sp->dclk * clpchgroup * 32 * num_channels, ct);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
bi->deratedbw[j] = min(maxdebw,
bw * (100 - sa->derating) / 100);
bi->peakbw[j] = DIV_ROUND_CLOSEST(sp->dclk *
num_channels *
qi.channel_width, 8);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
drm_dbg_kms(display->drm,
"BW%d / QGV %d: num_planes=%d deratedbw=%u peakbw: %u\n",
i, j, bi->num_planes, bi->deratedbw[j],
bi->peakbw[j]);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
for (j = 0; j < qi.num_psf_points; j++) {
const struct intel_psf_gv_point *sp = &qi.psf_points[j];
bi->psf_bw[j] = adl_calc_psf_bw(sp->clk);
drm_dbg_kms(display->drm,
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
"BW%d / PSF GV %d: num_planes=%d bw=%u\n",
i, j, bi->num_planes, bi->psf_bw[j]);
}
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
/*
* In case if SAGV is disabled in BIOS, we always get 1
* SAGV point, but we can't send PCode commands to restrict it
* as it will fail and pointless anyway.
*/
if (qi.num_points == 1)
display->sagv.status = I915_SAGV_NOT_CONTROLLED;
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
else
display->sagv.status = I915_SAGV_ENABLED;
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return 0;
}
static void dg2_get_bw_info(struct intel_display *display)
{
unsigned int deratedbw = display->platform.dg2_g11 ? 38000 : 50000;
int num_groups = ARRAY_SIZE(display->bw.max);
int i;
/*
* DG2 doesn't have SAGV or QGV points, just a constant max bandwidth
* that doesn't depend on the number of planes enabled. So fill all the
* plane group with constant bw information for uniformity with other
* platforms. DG2-G10 platforms have a constant 50 GB/s bandwidth,
* whereas DG2-G11 platforms have 38 GB/s.
*/
for (i = 0; i < num_groups; i++) {
struct intel_bw_info *bi = &display->bw.max[i];
bi->num_planes = 1;
/* Need only one dummy QGV point per group */
bi->num_qgv_points = 1;
bi->deratedbw[0] = deratedbw;
}
display->sagv.status = I915_SAGV_NOT_CONTROLLED;
}
static int xe2_hpd_get_bw_info(struct intel_display *display,
const struct dram_info *dram_info,
const struct intel_sa_info *sa)
{
struct intel_qgv_info qi = {};
int num_channels = dram_info->num_channels;
int peakbw, maxdebw;
int ret, i;
ret = icl_get_qgv_points(display, dram_info, &qi, true);
if (ret) {
drm_dbg_kms(display->drm,
"Failed to get memory subsystem information, ignoring bandwidth limits");
return ret;
}
peakbw = num_channels * qi.channel_width / 8 * icl_sagv_max_dclk(&qi);
maxdebw = min(sa->deprogbwlimit * 1000, peakbw * DEPROGBWPCLIMIT / 10);
for (i = 0; i < qi.num_points; i++) {
const struct intel_qgv_point *point = &qi.points[i];
int bw = num_channels * (qi.channel_width / 8) * point->dclk;
display->bw.max[0].deratedbw[i] =
min(maxdebw, (100 - sa->derating) * bw / 100);
display->bw.max[0].peakbw[i] = bw;
drm_dbg_kms(display->drm, "QGV %d: deratedbw=%u peakbw: %u\n",
i, display->bw.max[0].deratedbw[i],
display->bw.max[0].peakbw[i]);
}
/* Bandwidth does not depend on # of planes; set all groups the same */
display->bw.max[0].num_planes = 1;
display->bw.max[0].num_qgv_points = qi.num_points;
for (i = 1; i < ARRAY_SIZE(display->bw.max); i++)
memcpy(&display->bw.max[i], &display->bw.max[0],
sizeof(display->bw.max[0]));
/*
* Xe2_HPD should always have exactly two QGV points representing
* battery and plugged-in operation.
*/
drm_WARN_ON(display->drm, qi.num_points != 2);
display->sagv.status = I915_SAGV_ENABLED;
return 0;
}
static unsigned int icl_max_bw_index(struct intel_display *display,
int num_planes, int qgv_point)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
int i;
drm/i915: Restrict qgv points which don't have enough bandwidth. According to BSpec 53998, we should try to restrict qgv points, which can't provide enough bandwidth for desired display configuration. Currently we are just comparing against all of those and take minimum(worst case). v2: Fixed wrong PCode reply mask, removed hardcoded values. v3: Forbid simultaneous legacy SAGV PCode requests and restricting qgv points. Put the actual restriction to commit function, added serialization(thanks to Ville) to prevent commit being applied out of order in case of nonblocking and/or nomodeset commits. v4: - Minor code refactoring, fixed few typos(thanks to James Ausmus) - Change the naming of qgv point masking/unmasking functions(James Ausmus). - Simplify the masking/unmasking operation itself, as we don't need to mask only single point per request(James Ausmus) - Reject and stick to highest bandwidth point if SAGV can't be enabled(BSpec) v5: - Add new mailbox reply codes, which seems to happen during boot time for TGL and indicate that QGV setting is not yet available. v6: - Increase number of supported QGV points to be in sync with BSpec. v7: - Rebased and resolved conflict to fix build failure. - Fix NUM_QGV_POINTS to 8 and moved that to header file(James Ausmus) v8: - Don't report an error if we can't restrict qgv points, as SAGV can be disabled by BIOS, which is completely legal. So don't make CI panic. Instead if we detect that there is only 1 QGV point accessible just analyze if we can fit the required bandwidth requirements, but no need in restricting. v9: - Fix wrong QGV transition if we have 0 planes and no SAGV simultaneously. v10: - Fix CDCLK corruption, because of global state getting serialized without modeset, which caused copying of non-calculated cdclk to be copied to dev_priv(thanks to Ville for the hint). v11: - Remove unneeded headers and spaces(Matthew Roper) - Remove unneeded intel_qgv_info qi struct from bw check and zero out the needed one(Matthew Roper) - Changed QGV error message to have more clear meaning(Matthew Roper) - Use state->modeset_set instead of any_ms(Matthew Roper) - Moved NUM_SAGV_POINTS from i915_reg.h to i915_drv.h where it's used - Keep using crtc_state->hw.active instead of .enable(Matthew Roper) - Moved unrelated changes to other patch(using latency as parameter for plane wm calculation, moved to SAGV refactoring patch) v12: - Fix rebase conflict with own temporary SAGV/QGV fix. - Remove unnecessary mask being zero check when unmasking qgv points as this is completely legal(Matt Roper) - Check if we are setting the same mask as already being set in hardware to prevent error from PCode. - Fix error message when restricting/unrestricting qgv points to "mask/unmask" which sounds more accurate(Matt Roper) - Move sagv status setting to icl_get_bw_info from atomic check as this should be calculated only once.(Matt Roper) - Edited comments for the case when we can't enable SAGV and use only 1 QGV point with highest bandwidth to be more understandable.(Matt Roper) v13: - Moved max_data_rate in bw check to closer scope(Ville Syrjälä) - Changed comment for zero new_mask in qgv points masking function to better reflect reality(Ville Syrjälä) - Simplified bit mask operation in qgv points masking function (Ville Syrjälä) - Moved intel_qgv_points_mask closer to gen11 SAGV disabling, however this still can't be under modeset condition(Ville Syrjälä) - Packed qgv_points_mask as u8 and moved closer to pipe_sagv_mask (Ville Syrjälä) - Extracted PCode changes to separate patch.(Ville Syrjälä) - Now treat num_planes 0 same as 1 to avoid confusion and returning max_bw as 0, which would prevent choosing QGV point having max bandwidth in case if SAGV is not allowed, as per BSpec(Ville Syrjälä) - Do the actual qgv_points_mask swap in the same place as all other global state parts like cdclk are swapped. In the next patch, this all will be moved to bw state as global state, once new global state patch series from Ville lands v14: - Now using global state to serialize access to qgv points - Added global state locking back, otherwise we seem to read bw state in a wrong way. v15: - Added TODO comment for near atomic global state locking in bw code. v16: - Fixed intel_atomic_bw_* functions to be intel_bw_* as discussed with Jani Nikula. - Take bw_state_changed flag into use. v17: - Moved qgv point related manipulations next to SAGV code, as those are semantically related(Ville Syrjälä) - Renamed those into intel_sagv_(pre)|(post)_plane_update (Ville Syrjälä) v18: - Move sagv related calls from commit tail into intel_sagv_(pre)|(post)_plane_update(Ville Syrjälä) v19: - Use intel_atomic_get_bw_(old)|(new)_state which is intended for commit tail stage. v20: - Return max bandwidth for 0 planes(Ville) - Constify old_bw_state in bw_atomic_check(Ville) - Removed some debugs(Ville) - Added data rate to debug print when no QGV points(Ville) - Removed some comments(Ville) v21, v22, v23: - Fixed rebase conflict v24: - Changed PCode mask to use ICL_ prefix v25: - Resolved rebase conflict v26: - Removed redundant NULL checks(Ville) - Removed redundant error prints(Ville) v27: - Use device specific drm_err(Ville) - Fixed parenthesis ident reported by checkpatch Line over 100 warns to be fixed together with existing code style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> [vsyrjala: Drop duplicate intel_sagv_{pre,post}_plane_update() prototypes and drop unused NUM_SAGV_POINTS define] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514074853.9508-3-stanislav.lisovskiy@intel.com
2020-05-14 10:48:52 +03:00
/*
* Let's return max bw for 0 planes
*/
num_planes = max(1, num_planes);
for (i = 0; i < ARRAY_SIZE(display->bw.max); i++) {
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
const struct intel_bw_info *bi =
&display->bw.max[i];
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
/*
* Pcode will not expose all QGV points when
* SAGV is forced to off/min/med/max.
*/
if (qgv_point >= bi->num_qgv_points)
return UINT_MAX;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
if (num_planes >= bi->num_planes)
return i;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
return UINT_MAX;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
static unsigned int tgl_max_bw_index(struct intel_display *display,
int num_planes, int qgv_point)
2021-10-15 14:00:41 -07:00
{
int i;
/*
* Let's return max bw for 0 planes
*/
num_planes = max(1, num_planes);
for (i = ARRAY_SIZE(display->bw.max) - 1; i >= 0; i--) {
2021-10-15 14:00:41 -07:00
const struct intel_bw_info *bi =
&display->bw.max[i];
2021-10-15 14:00:41 -07:00
/*
* Pcode will not expose all QGV points when
* SAGV is forced to off/min/med/max.
*/
if (qgv_point >= bi->num_qgv_points)
return UINT_MAX;
if (num_planes <= bi->num_planes)
return i;
2021-10-15 14:00:41 -07:00
}
return 0;
2021-10-15 14:00:41 -07:00
}
static unsigned int adl_psf_bw(struct intel_display *display,
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
int psf_gv_point)
{
const struct intel_bw_info *bi =
&display->bw.max[0];
drm/i915: Implement PSF GV point support PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-05-31 09:48:45 +03:00
return bi->psf_bw[psf_gv_point];
}
static unsigned int icl_qgv_bw(struct intel_display *display,
int num_active_planes, int qgv_point)
{
unsigned int idx;
if (DISPLAY_VER(display) >= 12)
idx = tgl_max_bw_index(display, num_active_planes, qgv_point);
else
idx = icl_max_bw_index(display, num_active_planes, qgv_point);
if (idx >= ARRAY_SIZE(display->bw.max))
return 0;
return display->bw.max[idx].deratedbw[qgv_point];
}
void intel_bw_init_hw(struct intel_display *display)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
const struct dram_info *dram_info = intel_dram_info(display->drm);
if (!HAS_DISPLAY(display))
return;
if (DISPLAY_VERx100(display) >= 3002)
tgl_get_bw_info(display, dram_info, &xe3lpd_3002_sa_info);
else if (DISPLAY_VER(display) >= 30)
tgl_get_bw_info(display, dram_info, &xe3lpd_sa_info);
else if (DISPLAY_VERx100(display) >= 1401 && display->platform.dgfx &&
dram_info->type == INTEL_DRAM_GDDR_ECC)
xe2_hpd_get_bw_info(display, dram_info, &xe2_hpd_ecc_sa_info);
else if (DISPLAY_VERx100(display) >= 1401 && display->platform.dgfx)
xe2_hpd_get_bw_info(display, dram_info, &xe2_hpd_sa_info);
else if (DISPLAY_VER(display) >= 14)
tgl_get_bw_info(display, dram_info, &mtl_sa_info);
else if (display->platform.dg2)
dg2_get_bw_info(display);
else if (display->platform.alderlake_p)
tgl_get_bw_info(display, dram_info, &adlp_sa_info);
else if (display->platform.alderlake_s)
tgl_get_bw_info(display, dram_info, &adls_sa_info);
else if (display->platform.rocketlake)
tgl_get_bw_info(display, dram_info, &rkl_sa_info);
else if (DISPLAY_VER(display) == 12)
tgl_get_bw_info(display, dram_info, &tgl_sa_info);
else if (DISPLAY_VER(display) == 11)
icl_get_bw_info(display, dram_info, &icl_sa_info);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
static unsigned int intel_bw_crtc_num_active_planes(const struct intel_crtc_state *crtc_state)
{
/*
* We assume cursors are small enough
* to not not cause bandwidth problems.
*/
return hweight8(crtc_state->active_planes & ~BIT(PLANE_CURSOR));
}
static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_state)
{
struct intel_display *display = to_intel_display(crtc_state);
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
unsigned int data_rate = 0;
enum plane_id plane_id;
for_each_plane_id_on_crtc(crtc, plane_id) {
/*
* We assume cursors are small enough
* to not not cause bandwidth problems.
*/
if (plane_id == PLANE_CURSOR)
continue;
data_rate += crtc_state->data_rate[plane_id];
if (DISPLAY_VER(display) < 11)
data_rate += crtc_state->data_rate_y[plane_id];
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
return data_rate;
}
/* "Maximum Pipe Read Bandwidth" */
static int intel_bw_crtc_min_cdclk(struct intel_display *display,
unsigned int data_rate)
{
if (DISPLAY_VER(display) < 12)
return 0;
return DIV_ROUND_UP_ULL(mul_u32_u32(data_rate, 10), 512);
}
static unsigned int intel_bw_num_active_planes(struct intel_display *display,
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
const struct intel_bw_state *bw_state)
{
unsigned int num_active_planes = 0;
enum pipe pipe;
for_each_pipe(display, pipe)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
num_active_planes += bw_state->num_active_planes[pipe];
return num_active_planes;
}
static unsigned int intel_bw_data_rate(struct intel_display *display,
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
const struct intel_bw_state *bw_state)
{
struct drm_i915_private *i915 = to_i915(display->drm);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
unsigned int data_rate = 0;
enum pipe pipe;
for_each_pipe(display, pipe)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
data_rate += bw_state->data_rate[pipe];
if (DISPLAY_VER(display) >= 13 && i915_vtd_active(i915))
data_rate = DIV_ROUND_UP(data_rate * 105, 100);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return data_rate;
}
struct intel_bw_state *to_intel_bw_state(struct intel_global_state *obj_state)
{
return container_of(obj_state, struct intel_bw_state, base);
}
struct intel_bw_state *
intel_atomic_get_old_bw_state(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
struct intel_global_state *bw_state;
bw_state = intel_atomic_get_old_global_obj_state(state, &display->bw.obj);
return to_intel_bw_state(bw_state);
}
struct intel_bw_state *
intel_atomic_get_new_bw_state(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
struct intel_global_state *bw_state;
bw_state = intel_atomic_get_new_global_obj_state(state, &display->bw.obj);
return to_intel_bw_state(bw_state);
}
struct intel_bw_state *
intel_atomic_get_bw_state(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
struct intel_global_state *bw_state;
bw_state = intel_atomic_get_global_obj_state(state, &display->bw.obj);
if (IS_ERR(bw_state))
return ERR_CAST(bw_state);
return to_intel_bw_state(bw_state);
}
static unsigned int icl_max_bw_qgv_point_mask(struct intel_display *display,
int num_active_planes)
{
unsigned int num_qgv_points = display->bw.max[0].num_qgv_points;
unsigned int max_bw_point = 0;
unsigned int max_bw = 0;
int i;
for (i = 0; i < num_qgv_points; i++) {
unsigned int max_data_rate =
icl_qgv_bw(display, num_active_planes, i);
/*
* We need to know which qgv point gives us
* maximum bandwidth in order to disable SAGV
* if we find that we exceed SAGV block time
* with watermarks. By that moment we already
* have those, as it is calculated earlier in
* intel_atomic_check,
*/
if (max_data_rate > max_bw) {
max_bw_point = BIT(i);
max_bw = max_data_rate;
}
}
return max_bw_point;
}
static u16 icl_prepare_qgv_points_mask(struct intel_display *display,
unsigned int qgv_points,
unsigned int psf_points)
{
return ~(ICL_PCODE_REQ_QGV_PT(qgv_points) |
ADLS_PCODE_REQ_PSF_PT(psf_points)) & icl_qgv_points_mask(display);
}
static unsigned int icl_max_bw_psf_gv_point_mask(struct intel_display *display)
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
{
unsigned int num_psf_gv_points = display->bw.max[0].num_psf_gv_points;
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
unsigned int max_bw_point_mask = 0;
unsigned int max_bw = 0;
int i;
for (i = 0; i < num_psf_gv_points; i++) {
unsigned int max_data_rate = adl_psf_bw(display, i);
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
if (max_data_rate > max_bw) {
max_bw_point_mask = BIT(i);
max_bw = max_data_rate;
} else if (max_data_rate == max_bw) {
max_bw_point_mask |= BIT(i);
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
}
}
return max_bw_point_mask;
}
static void icl_force_disable_sagv(struct intel_display *display,
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
struct intel_bw_state *bw_state)
{
unsigned int qgv_points = icl_max_bw_qgv_point_mask(display, 0);
unsigned int psf_points = icl_max_bw_psf_gv_point_mask(display);
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
bw_state->qgv_points_mask = icl_prepare_qgv_points_mask(display,
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
qgv_points,
psf_points);
drm_dbg_kms(display->drm, "Forcing SAGV disable: mask 0x%x\n",
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
bw_state->qgv_points_mask);
icl_pcode_restrict_qgv_points(display, bw_state->qgv_points_mask);
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
}
void icl_sagv_pre_plane_update(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
const struct intel_bw_state *old_bw_state =
intel_atomic_get_old_bw_state(state);
const struct intel_bw_state *new_bw_state =
intel_atomic_get_new_bw_state(state);
u16 old_mask, new_mask;
if (!new_bw_state)
return;
old_mask = old_bw_state->qgv_points_mask;
new_mask = old_bw_state->qgv_points_mask | new_bw_state->qgv_points_mask;
if (old_mask == new_mask)
return;
WARN_ON(!new_bw_state->base.changed);
drm_dbg_kms(display->drm, "Restricting QGV points: 0x%x -> 0x%x\n",
old_mask, new_mask);
/*
* Restrict required qgv points before updating the configuration.
* According to BSpec we can't mask and unmask qgv points at the same
* time. Also masking should be done before updating the configuration
* and unmasking afterwards.
*/
icl_pcode_restrict_qgv_points(display, new_mask);
}
void icl_sagv_post_plane_update(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
const struct intel_bw_state *old_bw_state =
intel_atomic_get_old_bw_state(state);
const struct intel_bw_state *new_bw_state =
intel_atomic_get_new_bw_state(state);
u16 old_mask, new_mask;
if (!new_bw_state)
return;
old_mask = old_bw_state->qgv_points_mask | new_bw_state->qgv_points_mask;
new_mask = new_bw_state->qgv_points_mask;
if (old_mask == new_mask)
return;
WARN_ON(!new_bw_state->base.changed);
drm_dbg_kms(display->drm, "Relaxing QGV points: 0x%x -> 0x%x\n",
old_mask, new_mask);
/*
* Allow required qgv points after updating the configuration.
* According to BSpec we can't mask and unmask qgv points at the same
* time. Also masking should be done before updating the configuration
* and unmasking afterwards.
*/
icl_pcode_restrict_qgv_points(display, new_mask);
}
static int mtl_find_qgv_points(struct intel_display *display,
unsigned int data_rate,
unsigned int num_active_planes,
struct intel_bw_state *new_bw_state)
{
unsigned int best_rate = UINT_MAX;
unsigned int num_qgv_points = display->bw.max[0].num_qgv_points;
unsigned int qgv_peak_bw = 0;
int i;
int ret;
ret = intel_atomic_lock_global_state(&new_bw_state->base);
if (ret)
return ret;
/*
* If SAGV cannot be enabled, disable the pcode SAGV by passing all 1's
* for qgv peak bw in PM Demand request. So assign UINT_MAX if SAGV is
* not enabled. PM Demand code will clamp the value for the register
*/
if (!intel_bw_can_enable_sagv(display, new_bw_state)) {
new_bw_state->qgv_point_peakbw = U16_MAX;
drm_dbg_kms(display->drm, "No SAGV, use UINT_MAX as peak bw.");
return 0;
}
/*
* Find the best QGV point by comparing the data_rate with max data rate
* offered per plane group
*/
for (i = 0; i < num_qgv_points; i++) {
unsigned int bw_index =
tgl_max_bw_index(display, num_active_planes, i);
unsigned int max_data_rate;
if (bw_index >= ARRAY_SIZE(display->bw.max))
continue;
max_data_rate = display->bw.max[bw_index].deratedbw[i];
if (max_data_rate < data_rate)
continue;
if (max_data_rate - data_rate < best_rate) {
best_rate = max_data_rate - data_rate;
qgv_peak_bw = display->bw.max[bw_index].peakbw[i];
}
drm_dbg_kms(display->drm, "QGV point %d: max bw %d required %d qgv_peak_bw: %d\n",
i, max_data_rate, data_rate, qgv_peak_bw);
}
drm_dbg_kms(display->drm, "Matching peaks QGV bw: %d for required data rate: %d\n",
qgv_peak_bw, data_rate);
/*
* The display configuration cannot be supported if no QGV point
* satisfying the required data rate is found
*/
if (qgv_peak_bw == 0) {
drm_dbg_kms(display->drm, "No QGV points for bw %d for display configuration(%d active planes).\n",
data_rate, num_active_planes);
return -EINVAL;
}
/* MTL PM DEMAND expects QGV BW parameter in multiples of 100 mbps */
new_bw_state->qgv_point_peakbw = DIV_ROUND_CLOSEST(qgv_peak_bw, 100);
return 0;
}
static int icl_find_qgv_points(struct intel_display *display,
unsigned int data_rate,
unsigned int num_active_planes,
const struct intel_bw_state *old_bw_state,
struct intel_bw_state *new_bw_state)
{
unsigned int num_psf_gv_points = display->bw.max[0].num_psf_gv_points;
unsigned int num_qgv_points = display->bw.max[0].num_qgv_points;
u16 psf_points = 0;
u16 qgv_points = 0;
int i;
int ret;
ret = intel_atomic_lock_global_state(&new_bw_state->base);
if (ret)
return ret;
for (i = 0; i < num_qgv_points; i++) {
unsigned int max_data_rate = icl_qgv_bw(display,
num_active_planes, i);
if (max_data_rate >= data_rate)
qgv_points |= BIT(i);
drm_dbg_kms(display->drm, "QGV point %d: max bw %d required %d\n",
i, max_data_rate, data_rate);
}
for (i = 0; i < num_psf_gv_points; i++) {
unsigned int max_data_rate = adl_psf_bw(display, i);
if (max_data_rate >= data_rate)
psf_points |= BIT(i);
drm_dbg_kms(display->drm, "PSF GV point %d: max bw %d"
" required %d\n",
i, max_data_rate, data_rate);
}
/*
* BSpec states that we always should have at least one allowed point
* left, so if we couldn't - simply reject the configuration for obvious
* reasons.
*/
if (qgv_points == 0) {
drm_dbg_kms(display->drm, "No QGV points provide sufficient memory"
" bandwidth %d for display configuration(%d active planes).\n",
data_rate, num_active_planes);
return -EINVAL;
}
if (num_psf_gv_points > 0 && psf_points == 0) {
drm_dbg_kms(display->drm, "No PSF GV points provide sufficient memory"
" bandwidth %d for display configuration(%d active planes).\n",
data_rate, num_active_planes);
return -EINVAL;
}
/*
* Leave only single point with highest bandwidth, if
* we can't enable SAGV due to the increased memory latency it may
* cause.
*/
if (!intel_bw_can_enable_sagv(display, new_bw_state)) {
qgv_points = icl_max_bw_qgv_point_mask(display, num_active_planes);
drm_dbg_kms(display->drm, "No SAGV, using single QGV point mask 0x%x\n",
qgv_points);
}
/*
* We store the ones which need to be masked as that is what PCode
* actually accepts as a parameter.
*/
new_bw_state->qgv_points_mask = icl_prepare_qgv_points_mask(display,
qgv_points,
psf_points);
/*
* If the actual mask had changed we need to make sure that
* the commits are serialized(in case this is a nomodeset, nonblocking)
*/
if (new_bw_state->qgv_points_mask != old_bw_state->qgv_points_mask) {
ret = intel_atomic_serialize_global_state(&new_bw_state->base);
if (ret)
return ret;
}
return 0;
}
static int intel_bw_check_qgv_points(struct intel_display *display,
const struct intel_bw_state *old_bw_state,
struct intel_bw_state *new_bw_state)
{
unsigned int data_rate = intel_bw_data_rate(display, new_bw_state);
unsigned int num_active_planes =
intel_bw_num_active_planes(display, new_bw_state);
data_rate = DIV_ROUND_UP(data_rate, 1000);
if (DISPLAY_VER(display) >= 14)
return mtl_find_qgv_points(display, data_rate, num_active_planes,
new_bw_state);
else
return icl_find_qgv_points(display, data_rate, num_active_planes,
old_bw_state, new_bw_state);
}
static bool intel_dbuf_bw_changed(struct intel_display *display,
const struct intel_dbuf_bw *old_dbuf_bw,
const struct intel_dbuf_bw *new_dbuf_bw)
{
enum dbuf_slice slice;
for_each_dbuf_slice(display, slice) {
if (old_dbuf_bw->max_bw[slice] != new_dbuf_bw->max_bw[slice] ||
old_dbuf_bw->active_planes[slice] != new_dbuf_bw->active_planes[slice])
return true;
}
return false;
}
static bool intel_bw_state_changed(struct intel_display *display,
const struct intel_bw_state *old_bw_state,
const struct intel_bw_state *new_bw_state)
{
enum pipe pipe;
for_each_pipe(display, pipe) {
const struct intel_dbuf_bw *old_dbuf_bw =
&old_bw_state->dbuf_bw[pipe];
const struct intel_dbuf_bw *new_dbuf_bw =
&new_bw_state->dbuf_bw[pipe];
if (intel_dbuf_bw_changed(display, old_dbuf_bw, new_dbuf_bw))
return true;
if (intel_bw_crtc_min_cdclk(display, old_bw_state->data_rate[pipe]) !=
intel_bw_crtc_min_cdclk(display, new_bw_state->data_rate[pipe]))
return true;
}
return false;
}
static void skl_plane_calc_dbuf_bw(struct intel_dbuf_bw *dbuf_bw,
struct intel_crtc *crtc,
enum plane_id plane_id,
const struct skl_ddb_entry *ddb,
unsigned int data_rate)
{
struct intel_display *display = to_intel_display(crtc);
unsigned int dbuf_mask = skl_ddb_dbuf_slice_mask(display, ddb);
enum dbuf_slice slice;
/*
* The arbiter can only really guarantee an
* equal share of the total bw to each plane.
*/
for_each_dbuf_slice_in_mask(display, slice, dbuf_mask) {
dbuf_bw->max_bw[slice] = max(dbuf_bw->max_bw[slice], data_rate);
dbuf_bw->active_planes[slice] |= BIT(plane_id);
}
}
static void skl_crtc_calc_dbuf_bw(struct intel_dbuf_bw *dbuf_bw,
const struct intel_crtc_state *crtc_state)
{
struct intel_display *display = to_intel_display(crtc_state);
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
enum plane_id plane_id;
memset(dbuf_bw, 0, sizeof(*dbuf_bw));
if (!crtc_state->hw.active)
return;
for_each_plane_id_on_crtc(crtc, plane_id) {
/*
* We assume cursors are small enough
* to not cause bandwidth problems.
*/
if (plane_id == PLANE_CURSOR)
continue;
skl_plane_calc_dbuf_bw(dbuf_bw, crtc, plane_id,
&crtc_state->wm.skl.plane_ddb[plane_id],
crtc_state->data_rate[plane_id]);
if (DISPLAY_VER(display) < 11)
skl_plane_calc_dbuf_bw(dbuf_bw, crtc, plane_id,
&crtc_state->wm.skl.plane_ddb_y[plane_id],
crtc_state->data_rate[plane_id]);
}
}
/* "Maximum Data Buffer Bandwidth" */
static int
intel_bw_dbuf_min_cdclk(struct intel_display *display,
const struct intel_bw_state *bw_state)
{
unsigned int total_max_bw = 0;
enum dbuf_slice slice;
for_each_dbuf_slice(display, slice) {
int num_active_planes = 0;
unsigned int max_bw = 0;
enum pipe pipe;
/*
* The arbiter can only really guarantee an
* equal share of the total bw to each plane.
*/
for_each_pipe(display, pipe) {
const struct intel_dbuf_bw *dbuf_bw = &bw_state->dbuf_bw[pipe];
max_bw = max(dbuf_bw->max_bw[slice], max_bw);
num_active_planes += hweight8(dbuf_bw->active_planes[slice]);
}
max_bw *= num_active_planes;
total_max_bw = max(total_max_bw, max_bw);
}
return DIV_ROUND_UP(total_max_bw, 64);
}
int intel_bw_min_cdclk(struct intel_display *display,
const struct intel_bw_state *bw_state)
{
enum pipe pipe;
int min_cdclk;
min_cdclk = intel_bw_dbuf_min_cdclk(display, bw_state);
for_each_pipe(display, pipe)
min_cdclk = max(min_cdclk,
intel_bw_crtc_min_cdclk(display,
bw_state->data_rate[pipe]));
return min_cdclk;
}
int intel_bw_calc_min_cdclk(struct intel_atomic_state *state,
bool *need_cdclk_calc)
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
{
struct intel_display *display = to_intel_display(state);
struct intel_bw_state *new_bw_state = NULL;
const struct intel_bw_state *old_bw_state = NULL;
const struct intel_cdclk_state *cdclk_state;
const struct intel_crtc_state *old_crtc_state;
const struct intel_crtc_state *new_crtc_state;
int old_min_cdclk, new_min_cdclk;
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
struct intel_crtc *crtc;
int i;
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
if (DISPLAY_VER(display) < 9)
return 0;
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
struct intel_dbuf_bw old_dbuf_bw, new_dbuf_bw;
skl_crtc_calc_dbuf_bw(&old_dbuf_bw, old_crtc_state);
skl_crtc_calc_dbuf_bw(&new_dbuf_bw, new_crtc_state);
if (!intel_dbuf_bw_changed(display, &old_dbuf_bw, &new_dbuf_bw))
continue;
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
new_bw_state = intel_atomic_get_bw_state(state);
if (IS_ERR(new_bw_state))
return PTR_ERR(new_bw_state);
old_bw_state = intel_atomic_get_old_bw_state(state);
new_bw_state->dbuf_bw[crtc->pipe] = new_dbuf_bw;
}
if (!old_bw_state)
return 0;
if (intel_bw_state_changed(display, old_bw_state, new_bw_state)) {
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
int ret = intel_atomic_lock_global_state(&new_bw_state->base);
if (ret)
return ret;
}
old_min_cdclk = intel_bw_min_cdclk(display, old_bw_state);
new_min_cdclk = intel_bw_min_cdclk(display, new_bw_state);
/*
* No need to check against the cdclk state if
* the min cdclk doesn't increase.
*
* Ie. we only ever increase the cdclk due to bandwidth
* requirements. This can reduce back and forth
* display blinking due to constant cdclk changes.
*/
if (new_min_cdclk <= old_min_cdclk)
return 0;
cdclk_state = intel_atomic_get_cdclk_state(state);
if (IS_ERR(cdclk_state))
return PTR_ERR(cdclk_state);
/*
* No need to recalculate the cdclk state if
* the min cdclk doesn't increase.
*
* Ie. we only ever increase the cdclk due to bandwidth
* requirements. This can reduce back and forth
* display blinking due to constant cdclk changes.
*/
if (new_min_cdclk <= intel_cdclk_bw_min_cdclk(cdclk_state))
return 0;
drm_dbg_kms(display->drm,
"new bandwidth min cdclk (%d kHz) > old min cdclk (%d kHz)\n",
new_min_cdclk, intel_cdclk_bw_min_cdclk(cdclk_state));
*need_cdclk_calc = true;
drm/i915: Adjust CDCLK accordingly to our DBuf bw needs According to BSpec max BW per slice is calculated using formula Max BW = CDCLK * 64. Currently when calculating min CDCLK we account only per plane requirements, however in order to avoid FIFO underruns we need to estimate accumulated BW consumed by all planes(ddb entries basically) residing on that particular DBuf slice. This will allow us to put CDCLK lower and save power when we don't need that much bandwidth or gain additional performance once plane consumption grows. v2: - Fix long line warning - Limited new DBuf bw checks to only gens >= 11 v3: - Lets track used Dbuf bw per slice and per crtc in bw state (or may be in DBuf state in future), that way we don't need to have all crtcs in state and those only if we detect if are actually going to change cdclk, just same way as we do with other stuff, i.e intel_atomic_serialize_global_state and co. Just as per Ville's paradigm. - Made dbuf bw calculation procedure look nicer by introducing for_each_dbuf_slice_in_mask - we often will now need to iterate slices using mask. - According to experimental results CDCLK * 64 accounts for overall bandwidth across all dbufs, not per dbuf. v4: - Fixed missing const(Ville) - Removed spurious whitespaces(Ville) - Fixed local variable init(reduced scope where not needed) - Added some comments about data rate for planar formats - Changed struct intel_crtc_bw to intel_dbuf_bw - Moved dbuf bw calculation to intel_compute_min_cdclk(Ville) v5: - Removed unneeded macro v6: - Prevent too frequent CDCLK switching back and forth: Always switch to higher CDCLK when needed to prevent bandwidth issues, however don't switch to lower CDCLK earlier than once in 30 minutes in order to prevent constant modeset blinking. We could of course not switch back at all, however this is bad from power consumption point of view. v7: - Fixed to track cdclk using bw_state, modeset will be now triggered only when CDCLK change is really needed. v8: - Lock global state if bw_state->min_cdclk is changed. - Try getting bw_state only if there are crtcs in the commit (need to have read-locked global state) v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN as ddb_size is 0. v10: - Lock global state for older gens as well. v11: - Define new bw_calc_min_cdclk hook, instead of using a condition(Manasi Navare) v12: - Fixed rebase conflict v13: - Added spaces after declarations to make checkpatch happy. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
2020-05-20 18:00:58 +03:00
return 0;
}
static int intel_bw_check_data_rate(struct intel_atomic_state *state, bool *changed)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
struct intel_display *display = to_intel_display(state);
const struct intel_crtc_state *new_crtc_state, *old_crtc_state;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
struct intel_crtc *crtc;
int i;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
unsigned int old_data_rate =
intel_bw_crtc_data_rate(old_crtc_state);
unsigned int new_data_rate =
intel_bw_crtc_data_rate(new_crtc_state);
unsigned int old_active_planes =
intel_bw_crtc_num_active_planes(old_crtc_state);
unsigned int new_active_planes =
intel_bw_crtc_num_active_planes(new_crtc_state);
struct intel_bw_state *new_bw_state;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
/*
* Avoid locking the bw state when
* nothing significant has changed.
*/
if (old_data_rate == new_data_rate &&
old_active_planes == new_active_planes)
continue;
new_bw_state = intel_atomic_get_bw_state(state);
if (IS_ERR(new_bw_state))
return PTR_ERR(new_bw_state);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
new_bw_state->data_rate[crtc->pipe] = new_data_rate;
new_bw_state->num_active_planes[crtc->pipe] = new_active_planes;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
*changed = true;
drm_dbg_kms(display->drm,
"[CRTC:%d:%s] data rate %u num active planes %u\n",
crtc->base.base.id, crtc->base.name,
new_bw_state->data_rate[crtc->pipe],
new_bw_state->num_active_planes[crtc->pipe]);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
}
return 0;
}
static int intel_bw_modeset_checks(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
const struct intel_bw_state *old_bw_state;
struct intel_bw_state *new_bw_state;
if (DISPLAY_VER(display) < 9)
return 0;
new_bw_state = intel_atomic_get_bw_state(state);
if (IS_ERR(new_bw_state))
return PTR_ERR(new_bw_state);
old_bw_state = intel_atomic_get_old_bw_state(state);
new_bw_state->active_pipes =
intel_calc_active_pipes(state, old_bw_state->active_pipes);
if (new_bw_state->active_pipes != old_bw_state->active_pipes) {
int ret;
ret = intel_atomic_lock_global_state(&new_bw_state->base);
if (ret)
return ret;
}
return 0;
}
static int intel_bw_check_sagv_mask(struct intel_atomic_state *state)
{
struct intel_display *display = to_intel_display(state);
const struct intel_crtc_state *old_crtc_state;
const struct intel_crtc_state *new_crtc_state;
const struct intel_bw_state *old_bw_state = NULL;
struct intel_bw_state *new_bw_state = NULL;
struct intel_crtc *crtc;
int ret, i;
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
if (intel_crtc_can_enable_sagv(old_crtc_state) ==
intel_crtc_can_enable_sagv(new_crtc_state))
continue;
new_bw_state = intel_atomic_get_bw_state(state);
if (IS_ERR(new_bw_state))
return PTR_ERR(new_bw_state);
old_bw_state = intel_atomic_get_old_bw_state(state);
if (intel_crtc_can_enable_sagv(new_crtc_state))
new_bw_state->pipe_sagv_reject &= ~BIT(crtc->pipe);
else
new_bw_state->pipe_sagv_reject |= BIT(crtc->pipe);
}
if (!new_bw_state)
return 0;
if (intel_bw_can_enable_sagv(display, new_bw_state) !=
intel_bw_can_enable_sagv(display, old_bw_state)) {
ret = intel_atomic_serialize_global_state(&new_bw_state->base);
if (ret)
return ret;
} else if (new_bw_state->pipe_sagv_reject != old_bw_state->pipe_sagv_reject) {
ret = intel_atomic_lock_global_state(&new_bw_state->base);
if (ret)
return ret;
}
return 0;
}
int intel_bw_atomic_check(struct intel_atomic_state *state, bool any_ms)
{
struct intel_display *display = to_intel_display(state);
bool changed = false;
struct intel_bw_state *new_bw_state;
const struct intel_bw_state *old_bw_state;
int ret;
if (DISPLAY_VER(display) < 9)
return 0;
if (any_ms) {
ret = intel_bw_modeset_checks(state);
if (ret)
return ret;
}
ret = intel_bw_check_sagv_mask(state);
if (ret)
return ret;
/* FIXME earlier gens need some checks too */
if (DISPLAY_VER(display) < 11)
return 0;
ret = intel_bw_check_data_rate(state, &changed);
if (ret)
return ret;
old_bw_state = intel_atomic_get_old_bw_state(state);
new_bw_state = intel_atomic_get_new_bw_state(state);
if (new_bw_state &&
intel_bw_can_enable_sagv(display, old_bw_state) !=
intel_bw_can_enable_sagv(display, new_bw_state))
changed = true;
/*
* If none of our inputs (data rates, number of active
* planes, SAGV yes/no) changed then nothing to do here.
*/
if (!changed)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return 0;
ret = intel_bw_check_qgv_points(display, old_bw_state, new_bw_state);
if (ret)
return ret;
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return 0;
}
static void intel_bw_crtc_update(struct intel_bw_state *bw_state,
const struct intel_crtc_state *crtc_state)
{
struct intel_display *display = to_intel_display(crtc_state);
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
bw_state->data_rate[crtc->pipe] =
intel_bw_crtc_data_rate(crtc_state);
bw_state->num_active_planes[crtc->pipe] =
intel_bw_crtc_num_active_planes(crtc_state);
drm_dbg_kms(display->drm, "pipe %c data rate %u num active planes %u\n",
pipe_name(crtc->pipe),
bw_state->data_rate[crtc->pipe],
bw_state->num_active_planes[crtc->pipe]);
}
void intel_bw_update_hw_state(struct intel_display *display)
{
struct intel_bw_state *bw_state =
to_intel_bw_state(display->bw.obj.state);
struct intel_crtc *crtc;
if (DISPLAY_VER(display) < 9)
return;
bw_state->active_pipes = 0;
bw_state->pipe_sagv_reject = 0;
for_each_intel_crtc(display->drm, crtc) {
const struct intel_crtc_state *crtc_state =
to_intel_crtc_state(crtc->base.state);
enum pipe pipe = crtc->pipe;
if (crtc_state->hw.active)
bw_state->active_pipes |= BIT(pipe);
if (DISPLAY_VER(display) >= 11)
intel_bw_crtc_update(bw_state, crtc_state);
skl_crtc_calc_dbuf_bw(&bw_state->dbuf_bw[pipe], crtc_state);
/* initially SAGV has been forced off */
bw_state->pipe_sagv_reject |= BIT(pipe);
}
}
void intel_bw_crtc_disable_noatomic(struct intel_crtc *crtc)
{
struct intel_display *display = to_intel_display(crtc);
struct intel_bw_state *bw_state =
to_intel_bw_state(display->bw.obj.state);
enum pipe pipe = crtc->pipe;
if (DISPLAY_VER(display) < 9)
return;
bw_state->data_rate[pipe] = 0;
bw_state->num_active_planes[pipe] = 0;
memset(&bw_state->dbuf_bw[pipe], 0, sizeof(bw_state->dbuf_bw[pipe]));
}
static struct intel_global_state *
intel_bw_duplicate_state(struct intel_global_obj *obj)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
struct intel_bw_state *state;
state = kmemdup(obj->state, sizeof(*state), GFP_KERNEL);
if (!state)
return NULL;
return &state->base;
}
static void intel_bw_destroy_state(struct intel_global_obj *obj,
struct intel_global_state *state)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
kfree(state);
}
static const struct intel_global_state_funcs intel_bw_funcs = {
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
.atomic_duplicate_state = intel_bw_duplicate_state,
.atomic_destroy_state = intel_bw_destroy_state,
};
int intel_bw_init(struct intel_display *display)
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
{
struct intel_bw_state *state;
state = kzalloc(sizeof(*state), GFP_KERNEL);
if (!state)
return -ENOMEM;
intel_atomic_global_obj_init(display, &display->bw.obj,
&state->base, &intel_bw_funcs);
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
/*
* Limit this only if we have SAGV. And for Display version 14 onwards
* sagv is handled though pmdemand requests
*/
if (intel_has_sagv(display) && IS_DISPLAY_VER(display, 11, 13))
icl_force_disable_sagv(display, state);
drm/i915/display: Disable SAGV on bw init, to force QGV point recalculation Problem is that on some platforms, we do get QGV point mask in wrong state on boot. However driver assumes it is set to 0 (i.e all points allowed), however in reality we might get them all restricted, causing issues. Lets disable SAGV initially to force proper QGV point state. If more QGV points are available, driver will recalculate and update those then after next commit. v2: - Added trace to see which QGV/PSF GV point is used when SAGV is disabled. v3: - Move force disable function to intel_bw_init in order to initialize bw state as well, so that hw/sw are immediately in sync after init. v4: - Don't try sending PCode request, seems like it is not possible at intel_bw_init, however assigning bw->state to be restricted as if SAGV is off, still forces driveer to send PCode request anyway on next modeset, so the solution still works. However we still need to address the case, when no display is connected, which anyway requires much more changes. v5: - Put PCode request back and apply temporary hack to make the request succeed(in case if there 2 PSF GV points with same BW, PCode accepts only if both points are restricted/unrestricted same time) - Fix argument sequence for adl_qgv_bw(Ville Syrjälä) v6: - Fix wrong platform checks, not to break everything else. v7: - Split the handling of quplicate QGV/PSF GV points (Vinod) Restrict force disable to display version below 14 (Vinod) v8: - Simplify icl_force_disable_sagv (Vinod) Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240405113533.338553-5-vinod.govindapillai@intel.com
2024-04-05 14:35:31 +03:00
drm/i915: Make sure we have enough memory bandwidth on ICL ICL has so many planes that it can easily exceed the maximum effective memory bandwidth of the system. We must therefore check that we don't exceed that limit. The algorithm is very magic number heavy and lacks sufficient explanation for now. We also have no sane way to query the memory clock and timings, so we must rely on a combination of raw readout from the memory controller and hardcoded assumptions. The memory controller values obviously change as the system jumps between the different SAGV points, so we try to stabilize it first by disabling SAGV for the duration of the readout. The utilized bandwidth is tracked via a device wide atomic private object. That is actually not robust because we can't afford to enforce strict global ordering between the pipes. Thus I think I'll need to change this to simply chop up the available bandwidth between all the active pipes. Each pipe can then do whatever it wants as long as it doesn't exceed its budget. That scheme will also require that we assume that any number of planes could be active at any time. TODO: make it robust and deal with all the open questions v2: Sleep longer after disabling SAGV v3: Poll for the dclk to get raised (seen it take 250ms!) If the system has 2133MT/s memory then we pointlessly wait one full second :( v4: Use the new pcode interface to get the qgv points rather that using hardcoded numbers v5: Move the pcode stuff into intel_bw.c (Matt) s/intel_sagv_info/intel_qgv_info/ Do the NV12/P010 as per spec for now (Matt) s/IS_ICELAKE/IS_GEN11/ v6: Ignore bandwidth limits if the pcode query fails Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
2019-05-24 18:36:14 +03:00
return 0;
}
bool intel_bw_pmdemand_needs_update(struct intel_atomic_state *state)
{
const struct intel_bw_state *new_bw_state, *old_bw_state;
new_bw_state = intel_atomic_get_new_bw_state(state);
old_bw_state = intel_atomic_get_old_bw_state(state);
if (new_bw_state &&
new_bw_state->qgv_point_peakbw != old_bw_state->qgv_point_peakbw)
return true;
return false;
}
bool intel_bw_can_enable_sagv(struct intel_display *display,
const struct intel_bw_state *bw_state)
{
if (DISPLAY_VER(display) < 11 &&
bw_state->active_pipes && !is_power_of_2(bw_state->active_pipes))
return false;
return bw_state->pipe_sagv_reject == 0;
}
int intel_bw_qgv_point_peakbw(const struct intel_bw_state *bw_state)
{
return bw_state->qgv_point_peakbw;
}