Commit graph

2587 commits

Author SHA1 Message Date
Yang Wei
a947a45f05 iommu/mediatek: Fix semicolon code style issue
Delete a superfluous semicolon in mtk_iommu_add_device().

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-03-01 10:17:56 +01:00
Lan Tianyu
29217a4746 iommu/hyper-v: Add Hyper-V stub IOMMU driver
On the bare metal, enabling X2APIC mode requires interrupt remapping
function which helps to deliver irq to cpu with 32-bit APIC ID.
Hyper-V doesn't provide interrupt remapping function so far and Hyper-V
MSI protocol already supports to deliver interrupt to the CPU whose
virtual processor index is more than 255. IO-APIC interrupt still has
8-bit APIC ID limitation.

This patch is to add Hyper-V stub IOMMU driver in order to enable
X2APIC mode successfully in Hyper-V Linux guest. The driver returns X2APIC
interrupt remapping capability when X2APIC mode is available. Otherwise,
it creates a Hyper-V irq domain to limit IO-APIC interrupts' affinity
and make sure cpus assigned with IO-APIC interrupt have 8-bit APIC ID.

Define 24 IO-APIC remapping entries because Hyper-V only expose one
single IO-APIC and one IO-APIC has 24 pins according IO-APIC spec(
https://pdos.csail.mit.edu/6.828/2016/readings/ia32/ioapic.pdf).

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-28 11:12:16 +01:00
Lu Baolu
117266fd59 iommu/vt-d: Check identity map for hot-added devices
The Intel IOMMU driver will put devices into a static identity
mapped domain during boot if the kernel parameter "iommu=pt" is
used. That means the IOMMU hardware will translate a DMA address
into the same memory address.

Unfortunately, hot-added devices are not subject to this. That
results in some devices not working properly after hot added. A
quick way to reproduce this issue is to boot a system with

    iommu=pt

and, remove then readd the pci device with

    echo 1 > /sys/bus/pci/devices/[pci_source_id]/remove
    echo 1 > /sys/bus/pci/rescan

You will find the identity mapped domain was replaced with a
normal domain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Reported-by: Jis Ben <jisben@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: James Dong <xmdong@google.com>
Fixes: 99dcadede4 ('intel-iommu: Support PCIe hot-plug')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:55:14 +01:00
Julia Cartwright
cffaaf0c81 iommu/dmar: Fix buffer overflow during PCI bus notification
Commit 57384592c4 ("iommu/vt-d: Store bus information in RMRR PCI
device path") changed the type of the path data, however, the change in
path type was not reflected in size calculations.  Update to use the
correct type and prevent a buffer overflow.

This bug manifests in systems with deep PCI hierarchies, and can lead to
an overflow of the static allocated buffer (dmar_pci_notify_info_buf),
or can lead to overflow of slab-allocated data.

   BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
   Write of size 1 at addr ffffffff90445d80 by task swapper/0/1
   CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.87-rt49-02406-gd0a0e96 #1
   Call Trace:
    ? dump_stack+0x46/0x59
    ? print_address_description+0x1df/0x290
    ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
    ? kasan_report+0x256/0x340
    ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
    ? e820__memblock_setup+0xb0/0xb0
    ? dmar_dev_scope_init+0x424/0x48f
    ? __down_write_common+0x1ec/0x230
    ? dmar_dev_scope_init+0x48f/0x48f
    ? dmar_free_unused_resources+0x109/0x109
    ? cpumask_next+0x16/0x20
    ? __kmem_cache_create+0x392/0x430
    ? kmem_cache_create+0x135/0x2f0
    ? e820__memblock_setup+0xb0/0xb0
    ? intel_iommu_init+0x170/0x1848
    ? _raw_spin_unlock_irqrestore+0x32/0x60
    ? migrate_enable+0x27a/0x5b0
    ? sched_setattr+0x20/0x20
    ? migrate_disable+0x1fc/0x380
    ? task_rq_lock+0x170/0x170
    ? try_to_run_init_process+0x40/0x40
    ? locks_remove_file+0x85/0x2f0
    ? dev_prepare_static_identity_mapping+0x78/0x78
    ? rt_spin_unlock+0x39/0x50
    ? lockref_put_or_lock+0x2a/0x40
    ? dput+0x128/0x2f0
    ? __rcu_read_unlock+0x66/0x80
    ? __fput+0x250/0x300
    ? __rcu_read_lock+0x1b/0x30
    ? mntput_no_expire+0x38/0x290
    ? e820__memblock_setup+0xb0/0xb0
    ? pci_iommu_init+0x25/0x63
    ? pci_iommu_init+0x25/0x63
    ? do_one_initcall+0x7e/0x1c0
    ? initcall_blacklisted+0x120/0x120
    ? kernel_init_freeable+0x27b/0x307
    ? rest_init+0xd0/0xd0
    ? kernel_init+0xf/0x120
    ? rest_init+0xd0/0xd0
    ? ret_from_fork+0x1f/0x40
   The buggy address belongs to the variable:
    dmar_pci_notify_info_buf+0x40/0x60

Fixes: 57384592c4 ("iommu/vt-d: Store bus information in RMRR PCI device path")
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:24:37 +01:00
Geert Uytterhoeven
18b3af4492 iommu: Fix IOMMU debugfs fallout
A change made in the final version of IOMMU debugfs support replaced the
public function iommu_debugfs_new_driver_dir() by the public dentry
iommu_debugfs_dir in <linux/iommu.h>, but forgot to update both the
implementation in iommu-debugfs.c, and the patch description.

Fix this by exporting iommu_debugfs_dir, and removing the reference to
and implementation of iommu_debugfs_new_driver_dir().

Fixes: bad614b242 ("iommu: Enable debugfs exposure of IOMMU driver internals")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:15:58 +01:00
Kuppuswamy Sathyanarayanan
61363c1474 iommu/vt-d: Enable ATS only if the device uses page aligned address.
As per Intel vt-d specification, Rev 3.0 (section 7.5.1.1, title "Page
Request Descriptor"), Intel IOMMU page request descriptor only uses
bits[63:12] of the page address. Hence Intel IOMMU driver would only
permit devices that advertise they would only send Page Aligned Requests
to participate in ATS service.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:08:07 +01:00
Kuppuswamy Sathyanarayanan
1b84778a62 iommu/vt-d: Fix PRI/PASID dependency issue.
In Intel IOMMU, if the Page Request Queue (PRQ) is full, it will
automatically respond to the device with a success message as a keep
alive. And when sending the success message, IOMMU will include PASID in
the Response Message when the Page Request has a PASID in Request
Message and it does not check against the PRG Response PASID requirement
of the device before sending the response. Also, if the device receives
the PRG response with PASID when its not expecting it the device behavior
is undefined. So if PASID is enabled in the device, enable PRI only if
device expects PASID in PRG Response Message.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 11:07:05 +01:00
Logan Gunthorpe
3f0c625c6a iommu/vt-d: Allow interrupts from the entire bus for aliased devices
When a device has multiple aliases that all are from the same bus,
we program the IRTE to accept requests from any matching device on the
bus.

This is so NTB devices which can have requests from multiple bus-devfns
can pass MSI interrupts through across the bridge.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 10:34:03 +01:00
Logan Gunthorpe
9ca8261173 iommu/vt-d: Add helper to set an IRTE to verify only the bus number
The current code uses set_irte_sid() with SVT_VERIFY_BUS and PCI_DEVID
to set the SID value. However, this is very confusing because, with
SVT_VERIFY_BUS, the SID value is not a PCI devfn address, but the start
and end bus numbers to match against.

According to the Intel Virtualization Technology for Directed I/O
Architecture Specification, Rev. 3.0, page 9-36:

  The most significant 8-bits of the SID field contains the Startbus#,
  and the least significant 8-bits of the SID field contains the Endbus#.
  Interrupt requests that reference this IRTE must have a requester-id
  whose bus# (most significant 8-bits of requester-id) has a value equal
  to or within the Startbus# to Endbus# range.

So to make this more clear, introduce a new set_irte_verify_bus() that
explicitly takes a start bus and end bus so that we can stop abusing
the PCI_DEVID macro.

This helper function will be called a second time in an subsequent patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 10:34:03 +01:00
Nicolas Boichat
032ebd8548 iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tables
L1 tables are allocated with __get_dma_pages, and therefore already
ignored by kmemleak.

Without this, the kernel would print this error message on boot,
when the first L1 table is allocated:

[    2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black
[    2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S                4.19.16 #8
[    2.831227] Workqueue: events deferred_probe_work_func
[    2.836353] Call trace:
...
[    2.852532]  paint_ptr+0xa0/0xa8
[    2.855750]  kmemleak_ignore+0x38/0x6c
[    2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
[    2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
[    2.868354]  alloc_io_pgtable_ops+0x3c/0x78
...

Fixes: e5fc9753b1 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-26 10:16:06 +01:00
Bjorn Helgaas
f096d6657a iommu/vt-d: Remove misleading "domain 0" test from domain_exit()
The "Domain 0 is reserved, so dont process it" comment suggests that a NULL
pointer corresponds to domain 0.  I don't think that's true, and in any
case, every caller supplies a non-NULL domain pointer that has already been
dereferenced, so the test is unnecessary.

Remove the test for a null "domain" pointer.  No functional change
intended.

This null pointer check was added by 5e98c4b1d6 ("Allocation and free
functions of virtual machine domain").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:07:06 +01:00
Bjorn Helgaas
717532394c iommu/vt-d: Remove unused dmar_remove_one_dev_info() argument
domain_remove_dev_info() takes a struct dmar_domain * argument, but doesn't
use it.  Remove it.  No functional change intended.

The last use of this argument was removed by 127c761598 ("iommu/vt-d:
Pass device_domain_info to __dmar_remove_one_dev_info").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:07:06 +01:00
Bjorn Helgaas
e083ea5b02 iommu/vt-d: Remove unnecessary local variable initializations
A local variable initialization is a hint that the variable will be used in
an unusual way.  If the initialization is unnecessary, that hint becomes a
distraction.

Remove unnecessary initializations.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:07:05 +01:00
Bjorn Helgaas
932a6523ce iommu/vt-d: Use dev_printk() when possible
Use dev_printk() when possible so the IOMMU messages are more consistent
with other messages related to the device.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:07:05 +01:00
Bjorn Helgaas
5f226da1b1 iommu/amd: Use dev_printk() when possible
Use dev_printk() when possible so the IOMMU messages are more consistent
with other messages related to the device.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:06:39 +01:00
Bjorn Helgaas
780da9e4f5 iommu: Use dev_printk() when possible
Use dev_printk() when possible so the IOMMU messages are more consistent
with other messages related to the device.

E.g., I think these messages related to surprise hotplug:

  pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
  iommu: Removing device 0000:87:00.0 from group 12
  pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec

would be easier to read as these (also requires some PCI changes not
included here):

  pci 0000:80:10.0: Slot(36): Link Down
  pci 0000:87:00.0: Removing from iommu group 12
  pci 0000:80:10.0: Slot(36): Card present
  pci 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 12:02:14 +01:00
Rob Herring
b77cf11f09 iommu: Allow io-pgtable to be used outside of drivers/iommu/
Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops
and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to
use the page table library. Specifically, some ARM Mali GPUs use the
ARM page table formats.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 11:26:48 +01:00
Greg Kroah-Hartman
9481caf39b Merge 5.0-rc6 into driver-core-next
We need the debugfs fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:09:02 +01:00
Linus Torvalds
2e277fa089 IOMMU Fix for Linux v5.0-rc5:
* Intel decided to leave the newly added Scalable Mode Feature
 	  default-disabled for now. The patch here accomplishes that.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJcXbITAAoJECvwRC2XARrjuyEQALi6rRLaXXTqGdSDNAC5no7U
 EMHg3u/Ezm6ynLfZVbbCVih4rDy2gXh9dwacFjoWAgodU58wGUVJuGgMok+anax1
 VJPxcHfn0yltb767p4hF65kqcWEKKRssnZLw43E5Tkw/JjyrBwDQ95StdOJIlmd8
 qDiEPIKfTyiZn9QUErCHp+JbrCnkVvxFcFazHh2f8bzv3IjLuN1dA3Uof4mkyqgn
 tt7mmwAi1MdQAaSRUDlnKB9Qqi+TvR3JGII37v54gkmz4BC7xCnAxBZPZHzCJcp8
 JMODezy/ZX2HiaWqWYMf+Idrkd0fxpnfChtkl8SX33UZQcwNS57kBr/L0AshqAjS
 hWUoyOHTF9VWYtA6KgbULpUtlCpWImCwu2tkdYk1fl/OzAWqhoZ11YbSarsbQPC9
 RM8Ear5EhoWIyjqUDm0iKTYifbga2oJ9wBglRC8/QielZdbva3Yh2OKHXNWwhlml
 hUdIUkXHoTLaI0nyhUVTxOs3ixVzv/gZM7HxCXmnNVoPa+jvgFuMXTYPfuFDdX5/
 BHke5FZHGKCTTqRxfcipzdiBcKD71+iGIHMHPw2hBgn026dCCHVW8baFHmXDs5Sv
 nFxzRru0XtxiN9cCXv1IXasxWizrSnVkdy+jIAqwY3r0iO0j9CtKdh3SYd9vpQ/q
 3YbTsBTdgVrJ4GdFoFD+
 =4Nrv
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fix from Joerg Roedel:
 "Intel decided to leave the newly added Scalable Mode Feature
  default-disabled for now. The patch here accomplishes that"

* tag 'iommu-fixes-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Leave scalable mode default off
2019-02-08 15:34:10 -08:00
Rafael J. Wysocki
ea4f640025 IOMMU: Make dwo drivers use stateless device links
The device links used by rockchip-iommu and exynos-iommu are
completely managed by these drivers within the IOMMU framework,
so there is no reason to involve the driver core in the management
of these links.

For this reason, make rockchip-iommu and exynos-iommu pass
DL_FLAG_STATELESS in flags to device_link_add(), so that the device
links used by them are stateless.

[Note that this change is requisite for a subsequent one that will
 rework the management of stateful device links in the driver core
 and it will not be compatible with the two drivers in question any
 more.]

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-01 10:04:08 +01:00
Linus Torvalds
1c0490ce90 IOMMU Fixes for Linux v5.0-rc4
A few more fixes this time:
 
 	- Two patches to fix the error path of the map_sg implementation
 	  of the AMD IOMMU driver.
 
 	- Also a missing IOTLB flush is fixed in the AMD IOMMU driver.
 
 	- Memory leak fix for the Intel IOMMU driver.
 
 	- Fix a regression in the Mediatek IOMMU driver which caused
 	  device initialization to fail (seen as broken HDMI output).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJcUbw6AAoJECvwRC2XARrjlhQP/1tvg9nam673Otx45FnmvKUk
 7Bu5oLRXo67zBA9NqYZKaENFLTzb9TneyalSoiMwWfZTSaLTFgleieeT6iij1uU+
 D4TEpXF7Jc87Zm7pPASuWHGEu3XR0dKja4pukVHnH0vRXlOhKsP6MrmEUj2+5ZrJ
 RBXSX4a9Q6Ros2OxjnxJNxo8oekJQV0TiKtafzSUqPHnF4QLHLisuCe3z2DLwtsg
 NHwis0Fgrb9ljM+pxEBYmeG9UXxfdvG2wlmYwrJvhoK+lmsjq1HjG5afxyMYvHSU
 daK+mBvZ4HHLCe5oVY+BaMo8De1g1spqT2klWZecgr0FDXQdovdkYipSun6TZO/i
 2dv8QvMkCwFwLfReJj1AV6qf83zR3Sn/rb4MKqo0/K9xlHc3WxVoN20Tcikwg6wN
 5bPucgNkpavJxiODjfd6iiBC0K7SAOnvkiACySSXe5daL/Oi9c9q6izy7Z1z1D7q
 UomvUCGyIj01drG+YC9m1eH4dqILTiDJGA5mrdtoAEDFYwYtp+354fF3u0x2sCsb
 g87KV4RdAMuXRKWdxdsfw1BFNliHo4QcGDQk54bwN2t4X6hkOiq9jLMVcm4R+Fwy
 IcCoS0BXVdbD0PZXeb2M4CHkxsV7AIU7Drj2/fb4pmjuMb22Z7228yRCsIIYzGcM
 qq2AnNS1J0Z9BsxIItWO
 =kSY5
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
 "A few more fixes this time:

   - Two patches to fix the error path of the map_sg implementation of
     the AMD IOMMU driver.

   - Also a missing IOTLB flush is fixed in the AMD IOMMU driver.

   - Memory leak fix for the Intel IOMMU driver.

   - Fix a regression in the Mediatek IOMMU driver which caused device
     initialization to fail (seen as broken HDMI output)"

* tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix IOMMU page flush when detach device from a domain
  iommu/mediatek: Use correct fwspec in mtk_iommu_add_device()
  iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions()
  iommu/amd: Unmap all mapped pages in error path of map_sg
  iommu/amd: Call free_iova_fast with pfn in map_sg
2019-01-30 09:30:03 -08:00
Peter Xu
1a9eb9b98f iommu/vt-d: Remove change_pte notifier
The change_pte() interface is tailored for PFN updates, while the
other notifier invalidate_range() should be enough for Intel IOMMU
cache flushing.  Actually we've done similar thing for AMD IOMMU
already in 8301da53fb ("iommu/amd: Remove change_pte mmu_notifier
call-back", 2014-07-30) but the Intel IOMMU driver still have it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-30 17:31:36 +01:00
Peter Xu
5a63f0adeb iommu/amd: Remove clear_flush_young notifier
AMD IOMMU driver is using the clear_flush_young() to do cache flushing
but that's actually already covered by invalidate_range().  Remove the
extra notifier and the chunks.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-30 17:30:46 +01:00
Jerry Snitselaar
2e6c6a8657 iommu/amd: Print reason for iommu_map_page failure in map_sg
Since there are multiple possible failures in iommu_map_page
it would be useful to know which case is being hit when the
error message is printed in map_sg. While here, fix up checkpatch
complaint about using function name in a string instead of
__func__.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-30 17:26:24 +01:00
Lu Baolu
8950dcd83a iommu/vt-d: Leave scalable mode default off
Commit 765b6a98c1 ("iommu/vt-d: Enumerate the scalable
mode capability") enables VT-d scalable mode if hardware
advertises the capability. As we will bring up different
features and use cases to upstream in different patch
series, it will leave some intermediate kernel versions
which support partial features. Hence, end user might run
into problems when they use such kernels on bare metals
or virtualization environments.

This leaves scalable mode default off and end users could
turn it on with "intel-iommu=sm_on" only when they have
clear ideas about which scalable features are supported
in the kernel.

Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-30 17:23:58 +01:00
Logan Gunthorpe
21d5d27c04 iommu/vt-d: Implement dma_[un]map_resource()
Currently the Intel IOMMU uses the default dma_[un]map_resource()
implementations does nothing and simply returns the physical address
unmodified.

However, this doesn't create the IOVA entries necessary for addresses
mapped this way to work when the IOMMU is enabled. Thus, when the
IOMMU is enabled, drivers relying on dma_map_resource() will trigger
DMAR errors. We see this when running ntb_transport with the IOMMU
enabled, DMA, and switchtec hardware.

The implementation for intel_map_resource() is nearly identical to
intel_map_page(), we just have to re-create __intel_map_single().
dma_unmap_resource() uses intel_unmap_page() directly as the
functions are identical.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-30 17:23:04 +01:00
Suravee Suthikulpanit
9825bd94e3 iommu/amd: Fix IOMMU page flush when detach device from a domain
When a VM is terminated, the VFIO driver detaches all pass-through
devices from VFIO domain by clearing domain id and page table root
pointer from each device table entry (DTE), and then invalidates
the DTE. Then, the VFIO driver unmap pages and invalidate IOMMU pages.

Currently, the IOMMU driver keeps track of which IOMMU and how many
devices are attached to the domain. When invalidate IOMMU pages,
the driver checks if the IOMMU is still attached to the domain before
issuing the invalidate page command.

However, since VFIO has already detached all devices from the domain,
the subsequent INVALIDATE_IOMMU_PAGES commands are being skipped as
there is no IOMMU attached to the domain. This results in data
corruption and could cause the PCI device to end up in indeterministic
state.

Fix this by invalidate IOMMU pages when detach a device, and
before decrementing the per-domain device reference counts.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Joerg Roedel <joro@8bytes.org>
Co-developed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: 6de8ad9b9e ('x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-24 15:24:49 +01:00
Shaokun Zhang
c61a4633a5 iommu/dma: Remove unused variable
end_pfn is never used after commit <aa3ac9469c18> ('iommu/iova: Make dma
32bit pfn implicit'), cleanup it.

Cc: Joerg Roedel <jroedel@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-24 15:14:35 +01:00
Joerg Roedel
da5d2748e4 iommu/mediatek: Use correct fwspec in mtk_iommu_add_device()
The mtk_iommu_add_device() function keeps the fwspec in an
on-stack pointer and calls mtk_iommu_create_mapping(), which
might change its source, dev->iommu_fwspec. This causes the
on-stack pointer to be obsoleted and the device
initialization to fail. Update the on-stack fwspec pointer
after mtk_iommu_create_mapping() has been called.

Reported-by: Frank Wunderlich <frank-w@public-files.de>
Fixes: a9bf2eec5a ('iommu/mediatek: Use helper functions to access dev->iommu_fwspec')
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-23 13:51:05 +01:00
Gerald Schaefer
198bc3252e iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions()
Commit 9d3a4de4cb ("iommu: Disambiguate MSI region types") changed
the reserved region type in intel_iommu_get_resv_regions() from
IOMMU_RESV_RESERVED to IOMMU_RESV_MSI, but it forgot to also change
the type in intel_iommu_put_resv_regions().

This leads to a memory leak, because now the check in
intel_iommu_put_resv_regions() for IOMMU_RESV_RESERVED will never
be true, and no allocated regions will be freed.

Fix this by changing the region type in intel_iommu_put_resv_regions()
to IOMMU_RESV_MSI, matching the type of the allocated regions.

Fixes: 9d3a4de4cb ("iommu: Disambiguate MSI region types")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-22 14:01:38 +01:00
Jerry Snitselaar
f1724c0883 iommu/amd: Unmap all mapped pages in error path of map_sg
In the error path of map_sg there is an incorrect if condition
for breaking out of the loop that searches the scatterlist
for mapped pages to unmap. Instead of breaking out of the
loop once all the pages that were mapped have been unmapped,
it will break out of the loop after it has unmapped 1 page.
Fix the condition, so it breaks out of the loop only after
all the mapped pages have been unmapped.

Fixes: 80187fd39d ("iommu/amd: Optimize map_sg and unmap_sg")
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-22 12:03:26 +01:00
Jerry Snitselaar
51d8838d66 iommu/amd: Call free_iova_fast with pfn in map_sg
In the error path of map_sg, free_iova_fast is being called with
address instead of the pfn. This results in a bad value getting into
the rcache, and can result in hitting a BUG_ON when
iova_magazine_free_pfns is called.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Fixes: 80187fd39d ("iommu/amd: Optimize map_sg and unmap_sg")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-22 11:55:31 +01:00
Linus Torvalds
52e60b7544 IOMMU Fix for Linux v5.0-rc3
One fix only for now:
 
 	- Fix probe deferral in iommu/of code (broke with recent changes
 	  to iommu_ops->add_device invocation)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJcRZMlAAoJECvwRC2XARrjk2MQAIvf8Ze/T9xlCG+ouzdrnkfs
 oUeta4cdrfsHc5kk+RPOHSQzPJxYeAGYFQHi2W0Vkh0f7vOj+Cwr/vaMbRQS5I/+
 dfQSj7nwSmLEaGlC2OIzWmJ50Mj4L4M2rV5q8IXQqP5plPM22s7u5p9pJOiNIqhX
 K/EBlA+movhnh3RqrzjJ1Vl5aKl0RQhrl1vJfxgTeoem42f9nWMEflifbddrQsbm
 G12RQdjTVI7w/IVc+xwPPAimhZX2we9MowXmVB7zdVA1MUa3i1rwLYNIkdJ56lRN
 F/OC/WyMIkln8SLP1A2VOb1M7Vkkao2gaSwLw3XP92oxzimQMHHTMalI1C2Pv4/e
 yrrSPzg4A0QadeS68loEOIT/zneGfd4EX3P7J2DysSLrwqkiiE/Ivgn737TcsleW
 xLnknIgkaTx3pxW7p2rZDWPQcQVAk60IOU4o6tSyCwfFVAk4teJxKZF98fyMhmf6
 LR2GPYmXTAdc+s1TVTotYX9/g1InRcPbwOCbnlsPQ/iykYLlF3kn+fAJiliv6bNj
 PZ39B0gLmKPF5jqvnxNeXy94nF5IgGek+tGdmR0j1Y0i3MuvJnqvq09NCnwxXY6I
 SReu1R66C24m4uLkHATTovWBvGJ4sFBPzDl4OpbB248I4cboYcAfaxwbitc7JMf/
 79KXXR3tiNUY1cDmt7u2
 =+93K
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fix from Joerg Roedel:
 "One fix only for now: Fix probe deferral in iommu/of code (broke with
  recent changes to iommu_ops->add_device invocation)"

* tag 'iommu-fixes-v5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/of: Fix probe-deferral
2019-01-22 07:27:17 +13:00
Dmitry Osipenko
707223095c iommu/tegra: gart: Perform code refactoring
Removed redundant safety-checks in the code and some debug code that
isn't actually very useful for debugging, like enormous pagetable dump
on each fault. The majority of the changes are code reshuffling,
variables/whitespaces clean up and removal of debug messages that
duplicate messages of the IOMMU-core. Now the GART translation is kept
disabled while GART is suspended.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:16 +01:00
Dmitry Osipenko
e7e2367041 iommu/tegra: gart: Simplify clients-tracking code
GART is a simple IOMMU provider that has single address space. There is
no need to setup global clients list and manage it for tracking of the
active domain, hence lot's of code could be safely removed and replaced
with a simpler alternative.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
cc0e120576 iommu/tegra: gart: Don't detach devices from inactive domains
There could be unlimited number of allocated domains, but only one domain
can be active at a time. Hence devices must be detached only from the
active domain.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
5dd82cdb36 iommu/tegra: gart: Prepend error/debug messages with "gart:"
GART became a part of Memory Controller, hence now the drivers device
is Memory Controller and not GART. As a result all printed messages are
prepended with the "tegra-mc 7000f000.memory-controller:", so let's
prepend GART's messages with "gart:" in order to differentiate them
from the MC.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
167d67d550 iommu/tegra: gart: Don't use managed resources
GART is a part of the Memory Controller driver that is always built-in,
hence there is no benefit from the use of managed resources.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
7d849b7b40 iommu/tegra: gart: Allow only one active domain at a time
GART has a single address space that is shared by all devices, hence only
one domain could be active at a time.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
8e924910dd iommu/tegra: gart: Fix NULL pointer dereference
Fix NULL pointer dereference on IOMMU domain destruction that happens
because clients list is being iterated unsafely and its elements are
getting deleted during the iteration.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
c3086fad27 iommu/tegra: gart: Fix spinlock recursion
Fix spinlock recursion bug that happens on IOMMU domain destruction if
any of the allocated domains have devices attached to them.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:14 +01:00
Dmitry Osipenko
568ece5bab memory: tegra: Do not try to probe SMMU on Tegra20
Tegra20 doesn't have SMMU. Move out checking of the SMMU presence from
the SMMU driver into the Memory Controller driver. This change makes code
consistent in regards to how GART/SMMU presence checking is performed.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:13 +01:00
Dmitry Osipenko
ce2785a75d iommu/tegra: gart: Integrate with Memory Controller driver
The device-tree binding has been changed. There is no separate GART device
anymore, it is squashed into the Memory Controller. Integrate GART module
with the MC in a way it is done for the SMMU on Tegra30+.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:12 +01:00
Dmitry Osipenko
2fc0ac180d iommu/tegra: gart: Optimize mapping / unmapping performance
Currently GART writes one page entry at a time. More optimal would be to
aggregate the writes and flush BUS buffer in the end, this gives map/unmap
10-40% performance boost (depending on size of mapping) in comparison to
flushing after each page entry update.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:09 +01:00
Dmitry Osipenko
1d7ae53b15 iommu: Introduce iotlb_sync_map callback
Introduce iotlb_sync_map() callback that is invoked in the end of
iommu_map(). This new callback allows IOMMU drivers to avoid syncing
after mapping of each contiguous chunk and sync only when the whole
mapping is completed, optimizing performance of the mapping operation.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:09 +01:00
Dmitry Osipenko
4b6f0ea384 iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
GART can't handle all devices, hence ignore devices that aren't related
to GART. IOMMU phandle must be explicitly assign to devices in the device
tree.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:09 +01:00
Dmitry Osipenko
ae95c46dbe iommu/tegra: gart: Clean up driver probe errors handling
Properly clean up allocated resources on the drivers probe failure and
remove unneeded checks.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:09 +01:00
Dmitry Osipenko
4f821c1002 iommu/tegra: gart: Remove pr_fmt and clean up includes
Remove unneeded headers inclusion and sort the headers in alphabet order.
Remove pr_fmt macro since there is no pr_*() in the code and it doesn't
affect dev_*() functions.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-16 13:54:09 +01:00
Jacob Pan
5b438f4ba3 iommu/vt-d: Support page request in scalable mode
VT-d Rev3.0 has made a few changes to the page request interface,

1. widened PRQ descriptor from 128 bits to 256 bits;
2. removed streaming response type;
3. introduced private data that requires page response even the
   request is not last request in group (LPIG).

This is a supplement to commit 1c4f88b7f1 ("iommu/vt-d: Shared
virtual address in scalable mode") and makes the svm code compliant
with VT-d Rev3.0.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Fixes: 1c4f88b7f1 ("iommu/vt-d: Shared virtual address in scalable mode")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-11 13:10:03 +01:00
Robin Murphy
e8e683ae9a iommu/of: Fix probe-deferral
Whilst iommu_probe_device() does check for non-NULL ops as the previous
code did, it does not do so in the same order relative to the other
checks, and as a result means that -EPROBE_DEFER returned by of_xlate()
(plus any real error condition too) gets overwritten with -EINVAL and
leads to various misbehaviour.

Reinstate the original logic, but without implicitly relying on ops
being set to infer !err as the initial condition (now that the validity
of ops for its own sake is checked elsewhere).

Fixes: 641fb0efbf ("iommu/of: Don't call iommu_ops->add_device directly")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-01-11 12:28:24 +01:00