linux-dmdedup.git
11 years agodrm/radeon: Make classic pageflip completion path less racy.
Mario Kleiner [Thu, 17 Jul 2014 00:24:45 +0000 (02:24 +0200)]
drm/radeon: Make classic pageflip completion path less racy.

Need to protect mmio flip programming by event lock as well.

Need to also first enable pflip irq, then mmio program,
otherwise a flip completion may get unnoticed in the vblank
of actual completion if the flip is programmed, but
radeon_flip_work_func gets preempted immediately after
mmio programming and before vblank. In that case the
vblank irq handler wouldn't run radeon_crtc_handle_vblank()
with the completion check routine, miss the completed flip,
and only notice one vblank after actual completion, causing
a false/delayed report of flip completion.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: Add missing vblank_put in pageflip ioctl error path.
Mario Kleiner [Wed, 16 Jul 2014 23:37:53 +0000 (01:37 +0200)]
drm/radeon: Add missing vblank_put in pageflip ioctl error path.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: Remove redundant fence unref in pageflip path.
Mario Kleiner [Wed, 16 Jul 2014 23:27:25 +0000 (01:27 +0200)]
drm/radeon: Remove redundant fence unref in pageflip path.

Not needed anymore, as it is already unreffed within
radeon_flip_work_func() after its only use.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: Complete page flip even if waiting on the BO fence fails
Michel Dänzer [Mon, 14 Jul 2014 06:58:03 +0000 (15:58 +0900)]
drm/radeon: Complete page flip even if waiting on the BO fence fails

Otherwise the DRM core and userspace will be confused about which BO the
CRTC is scanning out.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
Michel Dänzer [Mon, 14 Jul 2014 06:48:42 +0000 (15:48 +0900)]
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()

As well as enabling the vblank interrupt. These shouldn't take any
significant amount of time, but at least pinning the BO has actually been
seen to fail in practice before, in which case we need to let userspace
know about it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: Prevent too early kms-pageflips triggered by vblank.
Mario Kleiner [Thu, 3 Jul 2014 01:45:02 +0000 (03:45 +0200)]
drm/radeon: Prevent too early kms-pageflips triggered by vblank.

Since 3.16-rc1 we have this new failure:

When the userspace XOrg ddx schedules vblank events to
trigger deferred kms-pageflips, e.g., via the OML_sync_control
extension call glXSwapBuffersMscOML(), or if a glXSwapBuffers()
is called immediately after completion of a previous swapbuffers
call, e.g., in a tight rendering loop with minimal rendering,
it happens frequently that the pageflip ioctl() is executed
within the same vblank in which a previous kms-pageflip completed,
or - for deferred swaps - always one vblank earlier than requested
by the client app.

This causes premature pageflips and detection of failure by
the ddx, e.g., XOrg log warnings like...

"(WW) RADEON(1): radeon_dri2_flip_event_handler: Pageflip
completion event has impossible msc 201025 < target_msc 201026"

... and error/invalid return values of glXWaitForSbcOML() and
Intel_swap_events extension.

Reason is the new way in which kms-pageflips are programmed
since 3.16.

This commit changes the time window in which the hw can
execute pending programmed pageflips. Before, a pending flip
would get executed anywhere within the vblank interval. Now
a pending flip only gets executed at the leading edge of
vblank (start of front porch), making sure that a invocation
of the pageflip ioctl() within a given vblank interval will
only lead to pageflip completion in the following vblank.

Tested to death on a DCE-4 card.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agodrm/radeon: set default bl level to something reasonable
Alex Deucher [Tue, 15 Jul 2014 13:48:53 +0000 (09:48 -0400)]
drm/radeon: set default bl level to something reasonable

If the value in the scratch register is 0, set it to the
max level.  This fixes an issue where the console fb blanking
code calls back into the backlight driver on unblank and then
sets the backlight level to 0 after the driver has already
set the mode and enabled the backlight.

bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=81382
https://bugs.freedesktop.org/show_bug.cgi?id=70207

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: David Heidelberger <david.heidelberger@ixit.cz>
Cc: stable@vger.kernel.org
11 years agodrm/radeon: avoid leaking edid data
Alex Deucher [Mon, 14 Jul 2014 21:57:19 +0000 (17:57 -0400)]
drm/radeon: avoid leaking edid data

In some cases we fetch the edid in the detect() callback
in order to determine what sort of monitor is connected.
If that happens, don't fetch the edid again in the get_modes()
callback or we will leak the edid.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
11 years agoirqchip: gic: Add binding probe for ARM GIC400
Suravee Suthikulpanit [Mon, 14 Jul 2014 22:03:03 +0000 (00:03 +0200)]
irqchip: gic: Add binding probe for ARM GIC400

Commit 3ab72f9156bb "dt-bindings: add GIC-400 binding" added the
"arm,gic-400" compatible string, but the corresponding IRQCHIP_DECLARE
was never added to the gic driver.

Therefore add the missing irqchip declaration for it.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Removed additional empty line and adapted commit message to mark it
as fixing an issue.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Fixes: 3ab72f9156bb ("dt-bindings: add GIC-400 binding")
Cc: <stable@vger.kernel.org> # v3.14+
Link: https://lkml.kernel.org/r/2621565.f5eISveXXJ@diego
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
11 years agoKVM: nVMX: Fix virtual interrupt delivery injection
Wanpeng Li [Thu, 17 Jul 2014 11:03:00 +0000 (19:03 +0800)]
KVM: nVMX: Fix virtual interrupt delivery injection

This patch fix bug reported in https://bugzilla.kernel.org/show_bug.cgi?id=73331,
after the patch http://www.spinics.net/lists/kvm/msg105230.html applied, there is
some progress and the L2 can boot up, however, slowly. The original idea of this
fix vid injection patch is from "Zhang, Yang Z" <yang.z.zhang@intel.com>.

Interrupt which delivered by vid should be injected to L1 by L0 if current is in
L1, or should be injected to L2 by L0 through the old injection way if L1 doesn't
have set External-interrupt exiting bit. The current logic doen't consider these
cases. This patch fix it by vid intr to L1 if current is L1 or L2 through old
injection way if L1 doen't have External-interrupt exiting bit set.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agodrivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
Andrey Utkin [Thu, 17 Jul 2014 12:13:23 +0000 (15:13 +0300)]
drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()

[linux-3.16-rc5/drivers/ata/pata_ep93xx.c:929]: (style) Checking if unsigned
variable 'irq' is less than zero.

Source code is

    irq = platform_get_irq(pdev, 0);
    if (irq < 0) {

but

    unsigned int irq;

$ fgrep platform_get_irq `find . -name \*.h -print`
./include/linux/platform_device.h:extern int platform_get_irq(struct
platform_device *, unsigned int);

Now using "int" type instead of "unsigned int" for "irq" variable.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=80401
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
11 years agocpufreq: move policy kobj to policy->cpu at resume
Viresh Kumar [Thu, 17 Jul 2014 05:18:25 +0000 (10:48 +0530)]
cpufreq: move policy kobj to policy->cpu at resume

This is only relevant to implementations with multiple clusters, where clusters
have separate clock lines but all CPUs within a cluster share it.

Consider a dual cluster platform with 2 cores per cluster. During suspend we
start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj
would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its
kobj as we want to retain permissions/values/etc.

Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
We will recover the old policy and update policy->cpu from 3 to 2 from
update_policy_cpu().

But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a
link for CPU2, but would try that for CPU3 while bringing it online. Which will
report errors as CPU3 already has kobj assigned to it.

This bug got introduced with commit 42f921a, which overlooked this scenario.

To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a
cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here
only for the first CPU of a non-boot cluster. And we can't recover from this
situation, if kobject_move() fails.

Fixes: 42f921a6f10c (cpufreq: remove sysfs files for CPUs which failed to come back after resume)
Cc: 3.13+ <stable@vger.kernel.org> # 3.13+
Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com>
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agonet: ppp: fix creating PPP pass and active filters
Christoph Schulz [Wed, 16 Jul 2014 20:10:29 +0000 (22:10 +0200)]
net: ppp: fix creating PPP pass and active filters

Commit 568f194e8bd16c353ad50f9ab95d98b20578a39d ("net: ppp: use
sk_unattached_filter api") inadvertently changed the logic when setting
PPP pass and active filters. This applies to both the generic PPP subsystem
implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem
implemented by drivers/isdn/i4l/isdn_ppp.c. The original code in ppp_ioctl()
(or isdn_ppp_ioctl(), resp.) handling PPPIOCSPASS and PPPIOCSACTIVE allowed to
remove a pass/active filter previously set by using a filter of length zero.
However, with the new code this is not possible anymore as this case is not
explicitly checked for, which leads to passing NULL as a filter to
sk_unattached_filter_create(). This results in returning EINVAL to the caller.

Additionally, the variables ppp->pass_filter and ppp->active_filter (or
is->pass_filter and is->active_filter, resp.) are not reset to NULL, although
the filters they point to may have been destroyed by
sk_unattached_filter_destroy(), so in this EINVAL case dangling pointers are
left behind (provided the pointers were previously non-NULL).

This patch corrects both problems by checking whether the filter passed is
empty or non-empty, and prevents sk_unattached_filter_create() from being
called in the first case. Moreover, the pointers are always reset to NULL
as soon as sk_unattached_filter_destroy() returns.

Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
Amir Vadai [Wed, 16 Jul 2014 10:33:50 +0000 (13:33 +0300)]
net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's

Fix a regression introduced by commit 35f6f45 ("net/mlx4_en: Don't use
irq_affinity_notifier to track changes in IRQ affinity map").
When core is started in legacy EQ's (number of IRQ's < rx rings), cq->irq_desc
was NULL.  This caused a kernel crash under heavy traffic - when having more
than rx NAPI budget completions.
Fixed to have it set for both EQ modes.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branches 'cxgb4' and 'mlx5' into for-next
Roland Dreier [Thu, 17 Jul 2014 06:14:42 +0000 (23:14 -0700)]
Merge branches 'cxgb4' and 'mlx5' into for-next

11 years agoIB/mlx5: Enable "block multicast loopback" for kernel consumers
Or Gerlitz [Wed, 25 Jun 2014 13:44:14 +0000 (16:44 +0300)]
IB/mlx5: Enable "block multicast loopback" for kernel consumers

In commit f360d88a2efd, we advertise blocking multicast loopback to both
kernel and userspace consumers, but don't allow kernel consumers (e.g IPoIB)
to use it with their UD QPs.  Fix that.

Fixes: f360d88a2efd ("IB/mlx5: Add block multicast loopback support")
Reported-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
11 years agosunvnet: clean up objects created in vnet_new() on vnet_exit()
Sowmini Varadhan [Wed, 16 Jul 2014 14:02:26 +0000 (10:02 -0400)]
sunvnet: clean up objects created in vnet_new() on vnet_exit()

Nothing cleans up the objects created by
vnet_new(), they are completely leaked.

vnet_exit(), after doing the vio_unregister_driver() to clean
up ports, should call a helper function that iterates over vnet_list
and cleans up those objects. This includes unregister_netdevice()
as well as free_netdev().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agor8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
Michel Dänzer [Thu, 17 Jul 2014 03:55:40 +0000 (12:55 +0900)]
r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40

The ethernet port on my ASUS A88X Pro mainboard stopped working
several times a day, with messages like these in dmesg:

AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x001e address=0x0000000000003000 flags=0x0050]

Searching the web for these messages led me to similar reports about
different hardware supported by r8169, and eventually to commits
3ced8c955e74d319f3e3997f7169c79d524dfd06 ('r8169: enforce RX_MULTI_EN
for the 8168f.') and eb2dc35d99028b698cdedba4f5522bc43e576bd2 ('r8169:
RxConfig hack for the 8168evl'). So I tried this change, and it fixes
the problem for me.

Signed-off-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agohwmon: (adt7470) Fix writes to temperature limit registers
Guenter Roeck [Thu, 17 Jul 2014 00:40:31 +0000 (17:40 -0700)]
hwmon: (adt7470) Fix writes to temperature limit registers

Temperature limit registers are signed. Limits therefore need
to be clamped to (-128, 127) degrees C and not to (0, 255)
degrees C.

Without this fix, writing a limit of 128 degrees C sets the
actual limit to -128 degrees C.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Reviewed-by: Axel Lin <axel.lin@ingics.com>
11 years agoMerge branches 'pci/host-generic', 'pci/host-mvebu', 'pci/host-rcar', 'pci/host-tegra...
Bjorn Helgaas [Wed, 16 Jul 2014 23:09:47 +0000 (17:09 -0600)]
Merge branches 'pci/host-generic', 'pci/host-mvebu', 'pci/host-rcar', 'pci/host-tegra', 'pci/msi', 'pci/misc', 'pci/resource' and 'pci/virtualization' into next

* pci/host-generic:
  PCI: generic: Fix GPL v2 license string typo

* pci/host-mvebu:
  PCI: mvebu: Fix GPL v2 license string typo

* pci/host-rcar:
  PCI: rcar: Fix GPL v2 license string typo

* pci/host-tegra:
  PCI: tegra: Fix GPL v2 license string typo

* pci/msi:
  PCI/MSI: Use irq_get_msi_desc() to simplify code
  PCI/MSI: Remove unused list access in __pci_restore_msix_state()
  PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev
  PCI/MSI: Remove unused function msi_remove_pci_irq_vectors()
  PCI/MSI: Add msi_setup_entry() to clean up MSI initialization

* pci/misc:
  PCI: Configure ASPM when enabling device
  x86: don't exclude low BIOS area when allocating address space for non-PCI cards
  PCI: Add include guard to include/linux/pci_ids.h
  x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()

* pci/resource:
  PCI: Tidy resource assignment messages
  PCI: Return conventional error values from pci_revert_fw_address()
  PCI: Cleanup control flow
  PCI: Support BAR sizes up to 128GB
  PCI: Keep original resource if we fail to expand it

* pci/virtualization:
  powerpc/pci: Remove duplicate logic
  PCI: Make resetting secondary bus logic common

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller [Wed, 16 Jul 2014 22:27:16 +0000 (15:27 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter/nf_tables fixes

The following patchset contains nf_tables fixes, they are:

1) Fix wrong transaction handling when the table flags are not
   modified.

2) Fix missing rcu read_lock section in the netlink dump path, which
   is not protected by the nfnl_lock.

3) Set NLM_F_DUMP_INTR in the netlink dump path to indicate
   interferences with updates.

4) Fix 64 bits chain counters when they are retrieved from a 32 bits
   arch, from Eric Dumazet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agodrm/qxl: return IRQ_NONE if it was not our irq
Jason Wang [Mon, 12 May 2014 08:35:39 +0000 (16:35 +0800)]
drm/qxl: return IRQ_NONE if it was not our irq

Return IRQ_NONE if it was not our irq. This is necessary for the case
when qxl is sharing irq line with a device A in a crash kernel. If qxl
is initialized before A and A's irq was raised during this gap,
returning IRQ_HANDLED in this case will cause this irq to be raised
again after EOI since kernel think it was handled but in fact it was
not.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agonet-gre-gro: Fix a bug that breaks the forwarding path
Jerry Chu [Mon, 14 Jul 2014 22:54:46 +0000 (15:54 -0700)]
net-gre-gro: Fix a bug that breaks the forwarding path

Fixed a bug that was introduced by my GRE-GRO patch
(bf5a755f5e9186406bbf50f4087100af5bd68e40 net-gre-gro: Add GRE
support to the GRO stack) that breaks the forwarding path
because various GSO related fields were not set. The bug will
cause on the egress path either the GSO code to fail, or a
GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
fix has been tested for both cases.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoPCI/MSI: Use irq_get_msi_desc() to simplify code
Yijing Wang [Tue, 8 Jul 2014 02:09:19 +0000 (10:09 +0800)]
PCI/MSI: Use irq_get_msi_desc() to simplify code

Use irq_get_msi_desc() to get MSI IRQ related msi_desc directly instead of
searching the dev->msi_list.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI/MSI: Remove unused list access in __pci_restore_msix_state()
Yijing Wang [Tue, 8 Jul 2014 02:09:08 +0000 (10:09 +0800)]
PCI/MSI: Remove unused list access in __pci_restore_msix_state()

In __pci_restore_msix_state(), we get the first element from msi_list, but
we never use it.  Remove this useless code.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev
Yijing Wang [Tue, 8 Jul 2014 02:08:55 +0000 (10:08 +0800)]
PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev

Retrieve the first MSI IRQ to compute the MSI index from struct msi_desc
rather than the struct pci_dev to avoid an additional memory access.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI/MSI: Remove unused function msi_remove_pci_irq_vectors()
Yijing Wang [Tue, 8 Jul 2014 02:08:36 +0000 (10:08 +0800)]
PCI/MSI: Remove unused function msi_remove_pci_irq_vectors()

msi_remove_pci_irq_vectors() is unused, so remove it.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI/MSI: Add msi_setup_entry() to clean up MSI initialization
Yijing Wang [Tue, 8 Jul 2014 02:07:23 +0000 (10:07 +0800)]
PCI/MSI: Add msi_setup_entry() to clean up MSI initialization

Move MSI entry stuff to a new function, msi_setup_entry(), to simplify
msi_capability_init() as MSI-X does.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI: Configure ASPM when enabling device
Vidya Sagar [Wed, 16 Jul 2014 10:03:42 +0000 (15:33 +0530)]
PCI: Configure ASPM when enabling device

We can't do ASPM configuration at enumeration-time because enabling it
makes some defective hardware unresponsive, even if ASPM is disabled later
(see 41cd766b0659 ("PCI: Don't enable aspm before drivers have had a chance
to veto it").  Therefore, we have to do it after a driver claims the
device.

We previously configured ASPM in pci_set_power_state(), but that's not a
very good place because it's not really related to setting the PCI device
power state, and doing it there means:

  - We incorrectly skipped ASPM config when setting a device that's
    already in D0 to D0.

  - We unnecessarily configured ASPM when setting a device to a low-power
    state (the ASPM feature only applies when the device is in D0).

  - We unnecessarily configured ASPM when called from a .resume() method
    (ASPM configuration needs to be restored during resume, but
    pci_restore_pcie_state() should already do this).

Move ASPM configuration from pci_set_power_state() to
do_pci_enable_device() so we do it when a driver enables a device.

[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79621
Fixes: db288c9c5f9d ("PCI / PM: restore the original behavior of pci_set_power_state()")
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Vidya Sagar <sagar.tv@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.6+
11 years agoMerge tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd
Linus Torvalds [Wed, 16 Jul 2014 20:11:42 +0000 (10:11 -1000)]
Merge tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd

Pull MTD fixes from Brian Norris:

 - Fix ELM suspend/resume

 - Reduce warnings if NAND ECC is too weak

 - Add CFI support for Sharp LH28F640BF NOR

   The last fix is coming in because other commits in the 3.16 cycle
   depended on this support.

* tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd:
  mtd: cfi_cmdset_0001.c: add support for Sharp LH28F640BF NOR
  mtd: nand: reduce the warning noise when the ECC is too weak
  mtd: devices: elm: fix elm_context_save() and elm_context_restore() functions

11 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 16 Jul 2014 20:11:02 +0000 (10:11 -1000)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "A cpufreq lockup fix and a compiler warning fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix compiler warnings
  x86, tsc: Fix cpufreq lockup

11 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 16 Jul 2014 20:10:27 +0000 (10:10 -1000)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Tooling fixes and an Intel PMU driver fixlet"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Do not allow optimized switch for non-cloned events
  perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
  perf symbols: Get kernel start address by symbol name
  perf tools: Fix segfault in cumulative.callchain report

11 years agotracing: Kill "filter_string" arg of replace_preds()
Oleg Nesterov [Tue, 15 Jul 2014 18:48:32 +0000 (20:48 +0200)]
tracing: Kill "filter_string" arg of replace_preds()

Cosmetic, but replace_preds() doesn't need/use "char *filter_string".
Remove it to microsimplify the code.

Link: http://lkml.kernel.org/p/20140715184832.GA20519@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agotracing: Change apply_subsystem_event_filter() paths to check file->system == dir
Oleg Nesterov [Tue, 15 Jul 2014 18:48:29 +0000 (20:48 +0200)]
tracing: Change apply_subsystem_event_filter() paths to check file->system == dir

filter_free_subsystem_preds(), filter_free_subsystem_filters() and
replace_system_preds() can simply check file->system->subsystem and
avoid strcmp(call->class->system).

Better yet, we can pass "struct ftrace_subsystem_dir *dir" instead of
event_subsystem and just check file->system == dir.

Thanks to Namhyung Kim who pointed out that replace_system_preds() can
be changed too.

Link: http://lkml.kernel.org/p/20140715184829.GA20516@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agox86: don't exclude low BIOS area when allocating address space for non-PCI cards
Christoph Schulz [Wed, 16 Jul 2014 08:00:57 +0000 (10:00 +0200)]
x86: don't exclude low BIOS area when allocating address space for non-PCI cards

Commit 30919b0bf356 ("x86: avoid low BIOS area when allocating address
space") moved the test for resource allocations that fall within the first
1MB of address space from the PCI-specific path to a generic path, such
that all resource allocations will avoid this area.  However, this breaks
ISA cards which need to allocate a memory region within the first 1MB.  An
example is the i82365 PCMCIA controller and derivatives like the Ricoh
RF5C296/396 which map part of the PCMCIA socket memory address space into
the first 1MB of system memory address space.  They do not work anymore as
no usable memory region exists due to this change:

  Intel ISA PCIC probe: Ricoh RF5C296/396 ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
  host opts [0]: none
  host opts [1]: none
  ISA irqs (scanned) = 3,4,5,9,10 status change on irq 10
  pcmcia_socket pcmcia_socket1: pccard: PCMCIA card inserted into slot 1
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0cc000-0x0effff: excluding 0xe0000-0xeffff
  pcmcia_socket pcmcia_socket1: cs: unable to map card memory!

If filtering out the first 1MB is reverted, everything works as expected.

Tested-by: Robert Resch <fli4l@robert.reschpara.de>
Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v2.6.37+
11 years agotracing: Kill ftrace_event_call->files
Oleg Nesterov [Tue, 15 Jul 2014 18:48:27 +0000 (20:48 +0200)]
tracing: Kill ftrace_event_call->files

Remove ftrace_event_call->files. It has no users, and in fact even
the commit ae63b31e4d0e "tracing: Separate out trace events from
global variables" which added this member did not use it.

Link: http://lkml.kernel.org/p/20140715184827.GA20508@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agotracing/uprobes: Kill the dead TRACE_EVENT_FL_USE_CALL_FILTER logic
Oleg Nesterov [Tue, 15 Jul 2014 18:48:24 +0000 (20:48 +0200)]
tracing/uprobes: Kill the dead TRACE_EVENT_FL_USE_CALL_FILTER logic

alloc_trace_uprobe() sets TRACE_EVENT_FL_USE_CALL_FILTER for unknown
reason and this is simply wrong. Fortunately this has no effect because
register_uprobe_event() clears call->flags after that.

Kill both. This trace_uprobe was kzalloc'ed and we rely on this fact
anyway.

Link: http://lkml.kernel.org/p/20140715184824.GA20505@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agotracing: Kill call_filter_disable()
Oleg Nesterov [Tue, 15 Jul 2014 18:48:21 +0000 (20:48 +0200)]
tracing: Kill call_filter_disable()

It seems that the only purpose of call_filter_disable() is to
make filter_disable() less clear and symmetrical, remove it.

Link: http://lkml.kernel.org/p/20140715184821.GA20498@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agotracing: Kill destroy_call_preds()
Oleg Nesterov [Tue, 15 Jul 2014 18:48:16 +0000 (20:48 +0200)]
tracing: Kill destroy_call_preds()

Remove destroy_call_preds(). Its only caller, __trace_remove_event_call(),
can use free_event_filter() and nullify ->filter by hand.

Perhaps we could keep this trivial helper although imo it is pointless, but
then it should be static in trace_events.c.

Link: http://lkml.kernel.org/p/20140715184816.GA20495@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agotracing: Kill destroy_preds() and destroy_file_preds()
Oleg Nesterov [Tue, 15 Jul 2014 18:48:13 +0000 (20:48 +0200)]
tracing: Kill destroy_preds() and destroy_file_preds()

destroy_preds() makes no sense.

The only caller, event_remove(), actually wants destroy_file_preds().
__trace_remove_event_call() does destroy_call_preds() which takes care
of call->filter.

And after the previous change we can simply remove destroy_preds() from
event_remove(), we are going to call remove_event_from_tracers() which
in turn calls remove_event_file_dir()->free_event_filter().

Link: http://lkml.kernel.org/p/20140715184813.GA20488@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agoMerge tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Wed, 16 Jul 2014 16:48:08 +0000 (06:48 -1000)]
Merge tag 'sound-3.16-rc6' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Things seem to calm down so far, just a small few HD-audio fixes
  (regression fixes and a new codec ID addition) popping up"

* tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix broken PM due to incomplete i915 initialization
  ALSA: hda - Revert stream assignment order for Intel controllers
  ALSA: hda - Add new GPU codec ID 0x10de0070 to snd-hda
  ALSA: hda: Fix build warning

11 years agoftrace: Allow archs to specify if they need a separate function graph trampoline
Steven Rostedt (Red Hat) [Fri, 11 Jul 2014 18:39:10 +0000 (14:39 -0400)]
ftrace: Allow archs to specify if they need a separate function graph trampoline

Currently if an arch supports function graph tracing, the core code will
just assign the function graph trampoline to the function graph addr that
gets called.

But as the old method for function graph tracing always calls the function
trampoline first and that calls the function graph trampoline, some
archs may have the function graph trampoline dependent on operations that
were done in the function trampoline. This causes function graph tracer
to break on those archs.

Instead of having the default be to set the function graph ftrace_ops
to the function graph trampoline, have it instead just set it to zero
which will keep it from jumping to a trampoline that is not set up
to be jumped directly too.

Link: http://lkml.kernel.org/r/53BED155.9040607@nvidia.com
Reported-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com>
Tested-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agocpufreq: cpu0: OPPs can be populated at runtime
Viresh Kumar [Fri, 11 Jul 2014 14:54:19 +0000 (20:24 +0530)]
cpufreq: cpu0: OPPs can be populated at runtime

OPPs can be populated statically, via DT, or added at run time with
dev_pm_opp_add().

While this driver handles the first case correctly, it would fail to populate
OPPs added at runtime. Because call to of_init_opp_table() would fail as there
are no OPPs in DT and probe will return early.

To fix this, remove error checking and call dev_pm_opp_init_cpufreq_table()
unconditionally.

Update bindings as well.

Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agocpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD
Quentin Armitage [Thu, 10 Jul 2014 10:15:55 +0000 (11:15 +0100)]
cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD

Commit ff1f0018cf66080d8e6f59791e552615648a033a ("drivers: Enable
building of Kirkwood drivers for mach-mvebu") added Kirkwood into
mach-mvebu, adding MACH_KIRKWOOD to ARCH_KIRKWOOD in the KConfig files.

The change for ARM_KIRKWOOD_CPUFREQ replaced ARCH_KIRKWOOD with
MACH_KIRKWOOD, whereas all the other changes were ARCH_KIRKWOOD ||
MACH_KIRKWOOD.

As a consequence of this change, the cpufreq driver is no longer enabled
for ARCH_KIRKWOOD. This patch reinstates ARM_KIRKWOOD_CPUFREQ for
ARCH_KIRKWOOD.

Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agoMerge branch 'liblockdep-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 16 Jul 2014 12:57:27 +0000 (14:57 +0200)]
Merge branch 'liblockdep-fixes' of git://git./linux/kernel/git/sashal/linux into locking/urgent

Pull liblockdep fixes from Sasha Levin.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER
Davidlohr Bueso [Fri, 11 Jul 2014 21:00:06 +0000 (14:00 -0700)]
locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER

Just like with mutexes (CONFIG_MUTEX_SPIN_ON_OWNER),
encapsulate the dependencies for rwsem optimistic spinning.
No logical changes here as it continues to depend on both
SMP and the XADD algorithm variant.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Jason Low <jason.low2@hp.com>
[ Also make it depend on ARCH_SUPPORTS_ATOMIC_RMW. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1405112406-13052-2-git-send-email-davidlohr@hp.com
Cc: aswin@hp.com
Cc: Chris Mason <clm@fb.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/mutex: Disable optimistic spinning on some architectures
Peter Zijlstra [Fri, 6 Jun 2014 17:53:16 +0000 (19:53 +0200)]
locking/mutex: Disable optimistic spinning on some architectures

The optimistic spin code assumes regular stores and cmpxchg() play nice;
this is found to not be true for at least: parisc, sparc32, tile32,
metag-lock1, arc-!llsc and hexagon.

There is further wreckage, but this in particular seemed easy to
trigger, so blacklist this.

Opt in for known good archs.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: John David Anglin <dave.anglin@bell.net>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: stable@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/20140606175316.GV13930@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/rwsem: Reduce the size of struct rw_semaphore
Jason Low [Mon, 14 Jul 2014 17:27:52 +0000 (10:27 -0700)]
locking/rwsem: Reduce the size of struct rw_semaphore

Recent optimistic spinning additions to rwsem provide significant performance
benefits on many workloads on large machines. The cost of it was increasing
the size of the rwsem structure by up to 128 bits.

However, now that the previous patches in this series bring the overhead of
struct optimistic_spin_queue to 32 bits, this patch reorders some fields in
struct rw_semaphore such that we can reduce the overhead of the rwsem structure
by 64 bits (on 64 bit systems).

The extra overhead required for rwsem optimistic spinning would now be up
to 8 additional bytes instead of up to 16 bytes. Additionally, the size of
rwsem would now be more in line with mutexes.

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Link: http://lkml.kernel.org/r/1405358872-3732-6-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/rwsem: Rename 'activity' to 'count'
Peter Zijlstra [Wed, 16 Jul 2014 12:54:55 +0000 (14:54 +0200)]
locking/rwsem: Rename 'activity' to 'count'

There are two definitions of struct rw_semaphore, one in linux/rwsem.h
and one in linux/rwsem-spinlock.h.

For some reason they have different names for the initial field. This
makes it impossible to use C99 named initialization for
__RWSEM_INITIALIZER() -- or we have to duplicate that entire thing
along with the structure definitions.

The simpler patch is renaming the rwsem-spinlock variant to match the
regular rwsem.

This allows us to switch to C99 named initialization.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-bmrZolsbGmautmzrerog27io@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agocpufreq: imx6q: Select PM_OPP
Nicolas Del Piano [Sun, 13 Jul 2014 21:59:00 +0000 (18:59 -0300)]
cpufreq: imx6q: Select PM_OPP

PM_OPP is a library used by several of the existing cpufreq drivers.
ARM IMX6Q cpufreq driver uses this library for its functionality.
Thus, it should be selected in Kconfig.

Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Nicolas Del Piano <ndel314@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agocpufreq: sa1110: set memory type for h3600
Linus Walleij [Sat, 12 Jul 2014 09:43:33 +0000 (11:43 +0200)]
cpufreq: sa1110: set memory type for h3600

The Compaq iPAQ h3600 also has the K4S281632b-1H memory type.
Verified by prying apart a broken board.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agokprobes/x86: Don't try to resolve kprobe faults from userspace
Andy Lutomirski [Fri, 11 Jul 2014 17:27:01 +0000 (10:27 -0700)]
kprobes/x86: Don't try to resolve kprobe faults from userspace

This commit:

    commit 6f6343f53d133bae516caf3d254bce37d8774625
    Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Date:   Thu Apr 17 17:17:33 2014 +0900

        kprobes/x86: Call exception handlers directly from do_int3/do_debug

appears to have inadvertently dropped a check that the int3 came
from kernel mode.  Trying to dereference addr when addr is
user-controlled is completely bogus.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/c4e339882c121aa76254f2adde3fcbdf502faec2.1405099506.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoACPI / video: Add use_native_backlight quirk for HP ProBook 4540s
Hans de Goede [Wed, 16 Jul 2014 11:28:34 +0000 (13:28 +0200)]
ACPI / video: Add use_native_backlight quirk for HP ProBook 4540s

As reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=1025690
This is yet another model which needs this quirk.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1025690
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
11 years agosched: Fix possible divide by zero in avg_atom() calculation
Mateusz Guzik [Sat, 14 Jun 2014 13:00:09 +0000 (15:00 +0200)]
sched: Fix possible divide by zero in avg_atom() calculation

proc_sched_show_task() does:

  if (nr_switches)
do_div(avg_atom, nr_switches);

nr_switches is unsigned long and do_div truncates it to 32 bits, which
means it can test non-zero on e.g. x86-64 and be truncated to zero for
division.

Fix the problem by using div64_ul() instead.

As a side effect calculations of avg_atom for big nr_switches are now correct.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1402750809-31991-1-git-send-email-mguzik@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf/x86/intel: Avoid spamming kernel log for BTS buffer failure
David Rientjes [Mon, 30 Jun 2014 23:04:08 +0000 (16:04 -0700)]
perf/x86/intel: Avoid spamming kernel log for BTS buffer failure

It's unnecessary to excessively spam the kernel log anytime the BTS buffer
cannot be allocated, so make this allocation __GFP_NOWARN.

The user probably will want to at least find some artifact that the
allocation has failed in the past, probably due to fragmentation because
of its large size, when it's not allocated at bootstrap.  Thus, add a
WARN_ONCE() so something is left behind for them to understand why perf
commnads that require PEBS is not working properly.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/spinlocks/mcs: Micro-optimize osq_unlock()
Jason Low [Mon, 14 Jul 2014 17:27:51 +0000 (10:27 -0700)]
locking/spinlocks/mcs: Micro-optimize osq_unlock()

In the unlock function of the cancellable MCS spinlock, the first
thing we do is to retrive the current CPU's osq node. However, due to
the changes made in the previous patch, in the common case where the
lock is not contended, we wouldn't need to access the current CPU's
osq node anymore.

This patch optimizes this by only retriving this CPU's osq node
after we attempt the initial cmpxchg to unlock the osq and found
that its contended.

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1405358872-3732-5-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/spinlocks/mcs: Introduce and use init macro and function for osq locks
Jason Low [Mon, 14 Jul 2014 17:27:50 +0000 (10:27 -0700)]
locking/spinlocks/mcs: Introduce and use init macro and function for osq locks

Currently, we initialize the osq lock by directly setting the lock's values. It
would be preferable if we use an init macro to do the initialization like we do
with other locks.

This patch introduces and uses a macro and function for initializing the osq lock.

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Link: http://lkml.kernel.org/r/1405358872-3732-4-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead
Jason Low [Mon, 14 Jul 2014 17:27:49 +0000 (10:27 -0700)]
locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead

The cancellable MCS spinlock is currently used to queue threads that are
doing optimistic spinning. It uses per-cpu nodes, where a thread obtaining
the lock would access and queue the local node corresponding to the CPU that
it's running on. Currently, the cancellable MCS lock is implemented by using
pointers to these nodes.

In this patch, instead of operating on pointers to the per-cpu nodes, we
store the CPU numbers in which the per-cpu nodes correspond to in atomic_t.
A similar concept is used with the qspinlock.

By operating on the CPU # of the nodes using atomic_t instead of pointers
to those nodes, this can reduce the overhead of the cancellable MCS spinlock
by 32 bits (on 64 bit systems).

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Link: http://lkml.kernel.org/r/1405358872-3732-3-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()
Jason Low [Mon, 14 Jul 2014 17:27:48 +0000 (10:27 -0700)]
locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()

Currently, the per-cpu nodes structure for the cancellable MCS spinlock is
named "optimistic_spin_queue". However, in a follow up patch in the series
we will be introducing a new structure that serves as the new "handle" for
the lock. It would make more sense if that structure is named
"optimistic_spin_queue". Additionally, since the current use of the
"optimistic_spin_queue" structure are  "nodes", it might be better if we
rename them to "node" anyway.

This preparatory patch renames all current "optimistic_spin_queue"
to "optimistic_spin_node".

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Scott Norton <scott.norton@hp.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Mason <clm@fb.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Link: http://lkml.kernel.org/r/1405358872-3732-2-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolocking/rwsem: Allow conservative optimistic spinning when readers have lock
Jason Low [Sat, 5 Jul 2014 03:49:32 +0000 (20:49 -0700)]
locking/rwsem: Allow conservative optimistic spinning when readers have lock

Commit 4fc828e24cd9 ("locking/rwsem: Support optimistic spinning")
introduced a major performance regression for workloads such as
xfs_repair which mix read and write locking of the mmap_sem across
many threads. The result was xfs_repair ran 5x slower on 3.16-rc2
than on 3.15 and using 20x more system CPU time.

Perf profiles indicate in some workloads that significant time can
be spent spinning on !owner. This is because we don't set the lock
owner when readers(s) obtain the rwsem.

In this patch, we'll modify rwsem_can_spin_on_owner() such that we'll
return false if there is no lock owner. The rationale is that if we
just entered the slowpath, yet there is no lock owner, then there is
a possibility that a reader has the lock. To be conservative, we'll
avoid spinning in these situations.

This patch reduced the total run time of the xfs_repair workload from
about 4 minutes 24 seconds down to approximately 1 minute 26 seconds,
back to close to the same performance as on 3.15.

Retesting of AIM7, which were some of the workloads used to test the
original optimistic spinning code, confirmed that we still get big
performance gains with optimistic spinning, even with this additional
regression fix. Davidlohr found that while the 'custom' workload took
a performance hit of ~-14% to throughput for >300 users with this
additional patch, the overall gain with optimistic spinning is
still ~+45%. The 'disk' workload even improved by ~+15% at >1000 users.

Tested-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1404532172.2572.30.camel@j-VirtualBox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf/x86/intel: Protect LBR and extra_regs against KVM lying
Kan Liang [Mon, 14 Jul 2014 19:25:56 +0000 (12:25 -0700)]
perf/x86/intel: Protect LBR and extra_regs against KVM lying

With -cpu host, KVM reports LBR and extra_regs support, if the host has
support.

When the guest perf driver tries to access LBR or extra_regs MSR,
it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support.
So check the related MSRs access right once at initialization time to avoid
the error access at runtime.

For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
(for host kernel).
And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
Start the guest with -cpu host.
Run perf record with --branch-any or --branch-filter in guest to trigger LBR
Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
trigger offcore_rsp #GP

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf: Fix lockdep warning on process exit
Peter Zijlstra [Mon, 23 Jun 2014 14:12:42 +0000 (16:12 +0200)]
perf: Fix lockdep warning on process exit

Sasha Levin reported:

> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:
>
> ======================================================
> [ INFO: possible circular locking dependency detected ]
3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted
> -------------------------------------------------------
> trinity-c578/9725 is trying to acquire lock:
> (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346)
>
> but task is already holding lock:
> (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
>
> which lock already depends on the new lock.

> 1 lock held by trinity-c578/9725:
> #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533)
>
>  Call Trace:
>  dump_stack (lib/dump_stack.c:52)
>  print_circular_bug (kernel/locking/lockdep.c:1216)
>  __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182)
>  lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
>  _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
>  __queue_work (kernel/workqueue.c:1346)
>  queue_work_on (kernel/workqueue.c:1424)
>  free_object (lib/debugobjects.c:209)
>  __debug_check_no_obj_freed (lib/debugobjects.c:715)
>  debug_check_no_obj_freed (lib/debugobjects.c:727)
>  kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711)
>  free_task (kernel/fork.c:221)
>  __put_task_struct (kernel/fork.c:250)
>  put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898)
>  perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533)
>  do_exit (kernel/exit.c:766)
>  do_group_exit (kernel/exit.c:884)
>  get_signal_to_deliver (kernel/signal.c:2347)
>  do_signal (arch/x86/kernel/signal.c:698)
>  do_notify_resume (arch/x86/kernel/signal.c:751)
>  int_signal (arch/x86/kernel/entry_64.S:600)

Urgh.. so the only way I can make that happen is through:

  perf_event_exit_task_context()
    raw_spin_lock(&child_ctx->lock);
    unclone_ctx(child_ctx)
      put_ctx(ctx->parent_ctx);
    raw_spin_unlock_irqrestore(&child_ctx->lock);

And we can avoid this by doing the change below.

I can't immediately see how this changed recently, but given that you
say it's easy to reproduce, lets fix this.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140623141242.GB19860@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings
Stephane Eranian [Mon, 30 Jun 2014 14:46:24 +0000 (16:46 +0200)]
perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings

This patch fixes the SNB-EP and IVT Cbox filter mapping
table. The table controls which filters are supported by
which events. There were several mistakes in those tables
causing some filters to be ignored, such as NID on
TOR_INSERTS.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: zheng.z.yan@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140630144624.GA2604@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf/x86/intel: Use proper dTLB-load-misses event on IvyBridge
Vince Weaver [Mon, 14 Jul 2014 19:33:25 +0000 (15:33 -0400)]
perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge

This was discussed back in February:

https://lkml.org/lkml/2014/2/18/956

But I never saw a patch come out of it.

On IvyBridge we share the SandyBridge cache event tables, but the
dTLB-load-miss event is not compatible.  Patch it up after
the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agoperf: Revert ("perf: Always destroy groups on exit")
Peter Zijlstra [Tue, 15 Jul 2014 15:27:27 +0000 (17:27 +0200)]
perf: Revert ("perf: Always destroy groups on exit")

Vince reported that commit 15a2d4de0eab5 ("perf: Always destroy groups
on exit") causes a regression with grouped events. In particular his
read_group_attached.c test fails.

  https://github.com/deater/perf_event_tests/blob/master/tests/bugs/read_group_attached.c

Because of the context switch optimization in
perf_event_context_sched_out() the 'original' event may end up in the
child process and when that exits the change in the patch in question
destroys the actual grouping.

Therefore revert that change and only destroy inherited groups.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-zedy3uktcp753q8fw8dagx7a@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agox86: Remove unused variable "polling"
Paul Bolle [Mon, 30 Jun 2014 14:32:29 +0000 (16:32 +0200)]
x86: Remove unused variable "polling"

Compile tested. "polling" is unused since commit f80c5b39b80a
("sched/idle, x86: Switch from TS_POLLING to TIF_POLLING_NRFLAG").

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Link: http://lkml.kernel.org/r/1404138749.2978.6.camel@x41
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agos390: fix restore of invalid floating-point-control
Martin Schwidefsky [Tue, 15 Jul 2014 08:41:37 +0000 (10:41 +0200)]
s390: fix restore of invalid floating-point-control

The fixup of the inline assembly to restore the floating-point-control
register needs to check for instruction address *after* the lfcp
instruction as the specification and data exceptions are suppresssing.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agos390/zcrypt: improve device probing for zcrypt adapter cards
Ingo Tuchscherer [Mon, 14 Jul 2014 17:11:48 +0000 (19:11 +0200)]
s390/zcrypt: improve device probing for zcrypt adapter cards

Improve device probing process for zcrypt adapters to
transmit service request during registration process.

Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agos390/ptrace: fix PSW mask check
Martin Schwidefsky [Mon, 23 Jun 2014 13:29:40 +0000 (15:29 +0200)]
s390/ptrace: fix PSW mask check

The PSW mask check of the PTRACE_POKEUSR_AREA command is incorrect.
The PSW_MASK_USER define contains the PSW_MASK_ASC bits, the ptrace
interface accepts all combinations for the address-space-control
bits. To protect the kernel space the PSW mask check in ptrace needs
to reject the address-space-control bit combination for home space.

Fixes CVE-2014-3534

Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agos390/MSI: Use standard mask and unmask funtions
Yijing Wang [Tue, 8 Jul 2014 02:08:05 +0000 (10:08 +0800)]
s390/MSI: Use standard mask and unmask funtions

MSI irqchip in s390 has its own mask and unmask MSI irq
functions, zpci_enable_irq() and zpci_disable_irq().
They mask and unmask MSI irq in standard ways, no arch
special. MSI driver provides two global standard functions
mask_msi_irq() and unmask_msi_irq(). Local zpci_enable_irq()
and zpci_disable_irq() are almost the same as the standard
two. the difference is local mask/unmask functions
read the mask status before mask and unmask everytime.
Then change the value and rewrite to hardware. In standard
functions, save the mask status after mask and unmask msi
irq, and use the cached status to change the mask status.
When we mask or unmask a MSI irq, we always cache its
mask status except we know need not to cache it, like in
pci_msi_shutdown. So use the standard functions to replace
the local is safe.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
[sebott: fixed inverted function pointers]
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agos390/3270: correct size detection with the read-partition command
Martin Schwidefsky [Thu, 26 Jun 2014 08:48:56 +0000 (10:48 +0200)]
s390/3270: correct size detection with the read-partition command

The size detection for 3270 terminals with the read-partition command is
broken. The raw3270_reset_device_cb function clears the init_data array,
but if raw3270_writesf_readpart has been called the read-partition command
is queued which needs the init_data array. In this case the size detection
will fail and the invalid command does funny things to the terminal.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agos390: require mvcos facility, not tod clock steering facility
David Hildenbrand [Wed, 18 Jun 2014 10:32:19 +0000 (12:32 +0200)]
s390: require mvcos facility, not tod clock steering facility

Inlined uaccess functions require the mvcos facility (bit 27), not the tod
clock steering facility (bit 28) for z10 and newer machines.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
11 years agoUBI: fastmap: do not miss bit-flips
Brian Norris [Wed, 21 May 2014 05:35:38 +0000 (22:35 -0700)]
UBI: fastmap: do not miss bit-flips

The return value from 'ubi_io_read_ec_hdr()' was stored in 'err', not in 'ret'.
This fix makes sure Fastmap-enabled UBI does not miss bit-flip while reading EC
headers, events and scrubs the affected PEBs.

This issue was reported by Coverity Scan.

Artem: improved the commit message.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
11 years agoMerge tag 'xtensa-for-next-20140715' of git://github.com/jcmvbkbc/linux-xtensa into...
Chris Zankel [Wed, 16 Jul 2014 05:18:30 +0000 (22:18 -0700)]
Merge tag 'xtensa-for-next-20140715' of git://github.com/jcmvbkbc/linux-xtensa into for_next

Xtensa fixes for 3.16:
- resolve FIXMEs in double exception handler for window overflow. This
  fix makes native building of linux on xtensa host possible;
- fix sysmem region removal issue introduced in 3.15.

11 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Wed, 16 Jul 2014 03:47:42 +0000 (17:47 -1000)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull quota fix from Jan Kara:
 "Fix locking of dquot shrinker"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: missing lock in dqcache_shrink_scan()

11 years agoMerge tag 'gpio-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Wed, 16 Jul 2014 03:46:51 +0000 (17:46 -1000)]
Merge tag 'gpio-v3.16-3' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fix from Linus Walleij:
 "Fix up some merge confusion from the merge window"

* tag 'gpio-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: mcp23s08: Eliminates redundant checking.

11 years agoipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack
Julian Anastasov [Thu, 10 Jul 2014 06:24:01 +0000 (09:24 +0300)]
ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack

commit 8f4e0a18682d91 ("IPVS netns exit causes crash in conntrack")
added second ip_vs_conn_drop_conntrack call instead of just adding
the needed check. As result, the first call still can cause
crash on netns exit. Remove it.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years agoclk: qcom: Add support for APQ8064 multimedia clocks
Stephen Boyd [Tue, 15 Jul 2014 21:52:22 +0000 (14:52 -0700)]
clk: qcom: Add support for APQ8064 multimedia clocks

The APQ8064 multimedia clock controller is fairly similar to the
8960 multimedia clock controller, except that gfx2d0/1 has been
removed and the gfx3d frequency is slightly faster when using the
newly introduced PLL15. We also add vcap clocks and a couple new
TV clocks.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: pll: Add support for configuring SR PLLs
Stephen Boyd [Tue, 15 Jul 2014 21:48:41 +0000 (14:48 -0700)]
clk: qcom: pll: Add support for configuring SR PLLs

Some SR type PLLs need to be configured for a certain rate when
linux boots. Add support for these types of PLLs so that we can
program PLL15's rate on apq8064.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: mdp_lut_clk is a child of mdp_src
Stephen Boyd [Wed, 9 Jul 2014 01:36:06 +0000 (18:36 -0700)]
clk: qcom: mdp_lut_clk is a child of mdp_src

The mdp_lut_clk isn't a child of the mdp_clk. Instead it's the
child of the mdp_src clock. Fix it.

Fixes: 6d00b56fe "clk: qcom: Add support for MSM8960's multimedia clock controller (MMCC)"
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: Fix PLL rate configurations
Stephen Boyd [Tue, 15 Jul 2014 21:59:21 +0000 (14:59 -0700)]
clk: qcom: Fix PLL rate configurations

Sometimes we need to program PLLs with a fixed rate
configuration during driver probe. Doing this after we register
the PLLs with the clock framework causes the common clock
framework to assume the rate of the PLLs are 0. This causes all
sorts of problems for rate recalculations because the common
clock framework caches the rate once at registration time unless
a flag is set to always recalculate the rates.

Split the qcom_cc_probe() function into two pieces, map and
everything else, so that drivers which need to configure some
PLL rates or otherwise twiddle bits in the clock controller can
do so before registering clocks. This allows us to properly
detect the rates of PLLs that are programmed at boot.

Fixes: 49fc825f0cc2 "clk: qcom: Consolidate common probe code"
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: Fix MN frequency tables, parent map, and jpegd
Stephen Boyd [Wed, 9 Jul 2014 01:36:06 +0000 (18:36 -0700)]
clk: qcom: Fix MN frequency tables, parent map, and jpegd

Clocks that don't have a pre-divider don't list any pre-divider
in their frequency tables, but their tables are initialized using
aggregate initializers. Use tagged initializers so we properly
assign the m and n values for each frequency. Furthermore, the
mmcc_pxo_pll8_pll2_pll3 array improperly mapped the second
element to pll2 instead of pll8, causing the clock driver to
recalculate the wrong rate for any clocks using this array along
with a rate that uses pll2. Plus the .num_parents field is 3
instead of 4 so you can't even switch the parent to pll3. Finally
I noticed that the jpegd clock improperly indicates that the
pre-divider width is only 2, when it's actually 4 bits wide.

Fixes: 6d00b56fe "clk: qcom: Add support for MSM8960's multimedia clock controller (MMCC)"
Tested-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: Support bypass RCG configuration
Stephen Boyd [Fri, 11 Jul 2014 19:55:27 +0000 (12:55 -0700)]
clk: qcom: Support bypass RCG configuration

In the case of HDMI clocks, we want to bypass the RCG's ability
to divide the output clock and pass through the parent HDMI PLL
rate. Add a simple set of clk_ops to configure the RCG to do
this. This removes the need to keep adding more frequency entries
to the tv_src clock whenever we want to support a new rate.

Tested-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: Add support for IPQ8064's global clock controller (GCC)
Kumar Gala [Tue, 17 Jun 2014 19:46:51 +0000 (14:46 -0500)]
clk: qcom: Add support for IPQ8064's global clock controller (GCC)

Add a driver for the global clock controller found on IPQ8064 based
platforms. This should allow most non-multimedia device drivers to probe
and control their clocks.

This is currently missing clocks for USB HSIC and networking devices.

Signed-off-by: Kumar Gala <galak@codeaurora.org>
Signed-off-by: Andy Gross <agross@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoclk: qcom: Add APQ8084 Multimedia Clock Controller (MMCC) support
Georgi Djakov [Thu, 12 Jun 2014 16:41:42 +0000 (19:41 +0300)]
clk: qcom: Add APQ8084 Multimedia Clock Controller (MMCC) support

Add support for the multimedia clock controller found on the APQ8084
based platforms. This will allow the multimedia device drivers to
control their clocks.

Signed-off-by: Georgi Djakov <gdjakov@mm-sol.com>
[sboyd: Rework parent mapping to avoid conflicts]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
11 years agoring-buffer: Fix polling on trace_pipe
Martin Lau [Tue, 10 Jun 2014 06:06:42 +0000 (23:06 -0700)]
ring-buffer: Fix polling on trace_pipe

ring_buffer_poll_wait() should always put the poll_table to its wait_queue
even there is immediate data available.  Otherwise, the following epoll and
read sequence will eventually hang forever:

1. Put some data to make the trace_pipe ring_buffer read ready first
2. epoll_ctl(efd, EPOLL_CTL_ADD, trace_pipe_fd, ee)
3. epoll_wait()
4. read(trace_pipe_fd) till EAGAIN
5. Add some more data to the trace_pipe ring_buffer
6. epoll_wait() -> this epoll_wait() will block forever

~ During the epoll_ctl(efd, EPOLL_CTL_ADD,...) call in step 2,
  ring_buffer_poll_wait() returns immediately without adding poll_table,
  which has poll_table->_qproc pointing to ep_poll_callback(), to its
  wait_queue.
~ During the epoll_wait() call in step 3 and step 6,
  ring_buffer_poll_wait() cannot add ep_poll_callback() to its wait_queue
  because the poll_table->_qproc is NULL and it is how epoll works.
~ When there is new data available in step 6, ring_buffer does not know
  it has to call ep_poll_callback() because it is not in its wait queue.
  Hence, block forever.

Other poll implementation seems to call poll_wait() unconditionally as the very
first thing to do.  For example, tcp_poll() in tcp.c.

Link: http://lkml.kernel.org/p/20140610060637.GA14045@devbig242.prn2.facebook.com
Cc: stable@vger.kernel.org # 2.6.27
Fixes: 2a2cc8f7c4d0 "ftrace: allow the event pipe to be polled"
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Martin Lau <kafai@fb.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agoPCI: generic: Fix GPL v2 license string typo
Bjorn Helgaas [Tue, 15 Jul 2014 21:07:46 +0000 (15:07 -0600)]
PCI: generic: Fix GPL v2 license string typo

Per license_is_gpl_compatible(), the MODULE_LICENSE() string for GPL v2 is
"GPL v2", not "GPLv2".  Use "GPL v2" so this module doesn't taint the
kernel.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
11 years agoPCI: rcar: Fix GPL v2 license string typo
Bjorn Helgaas [Tue, 15 Jul 2014 21:06:12 +0000 (15:06 -0600)]
PCI: rcar: Fix GPL v2 license string typo

Per license_is_gpl_compatible(), the MODULE_LICENSE() string for GPL v2 is
"GPL v2", not "GPLv2".  Use "GPL v2" so this module doesn't taint the
kernel.

Based-on-work-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
11 years agoPCI: tegra: Fix GPL v2 license string typo
Thierry Reding [Fri, 11 Jul 2014 06:58:58 +0000 (08:58 +0200)]
PCI: tegra: Fix GPL v2 license string typo

Per license_is_gpl_compatible(), the MODULE_LICENSE() string for GPL v2 is
"GPL v2", not "GPLv2".  Use "GPL v2" so this module doesn't taint the
kernel.

[bhelgaas: changelog]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
11 years agoPCI: mvebu: Fix GPL v2 license string typo
Thierry Reding [Fri, 11 Jul 2014 06:58:57 +0000 (08:58 +0200)]
PCI: mvebu: Fix GPL v2 license string typo

Per license_is_gpl_compatible(), the MODULE_LICENSE() string for GPL v2 is
"GPL v2", not "GPLv2".  Use "GPL v2" so this module doesn't taint the
kernel.

[bhelgaas: changelog]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
11 years agoquota: missing lock in dqcache_shrink_scan()
Niu Yawei [Wed, 4 Jun 2014 04:22:13 +0000 (12:22 +0800)]
quota: missing lock in dqcache_shrink_scan()

Commit 1ab6c4997e04 (fs: convert fs shrinkers to new scan/count API)
accidentally removed locking from quota shrinker. Fix it -
dqcache_shrink_scan() should use dq_list_lock to protect the
scan on free_dquots list.

CC: stable@vger.kernel.org
Fixes: 1ab6c4997e04a00c50c6d786c2f046adc0d1f5de
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
11 years agogpio: rcar: Add support for DT IRQ flags
Laurent Pinchart [Tue, 8 Jul 2014 10:46:46 +0000 (12:46 +0200)]
gpio: rcar: Add support for DT IRQ flags

The gpio-rcar driver has no IRQ domain OF xlate function and thus
ignores IRQ flags specified in DT. Fix this by using the two-cell xlate
function.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
11 years agoMAINTAINERS: Add entry for the Renesas pin controller driver
Laurent Pinchart [Wed, 2 Jul 2014 14:33:31 +0000 (16:33 +0200)]
MAINTAINERS: Add entry for the Renesas pin controller driver

I'm actively maintaining the driver, let's document that.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
11 years agopinctrl: st: Fix irqmux handler
Maxime COQUELIN [Fri, 20 Jun 2014 11:34:54 +0000 (13:34 +0200)]
pinctrl: st: Fix irqmux handler

st_gpio_irqmux_handler() reads the status register to find out
which banks inside the controller have pending IRQs.
For each banks having pending IRQs, it calls the corresponding handler.

Problem is that current code restricts the number of possible banks inside the
controller to ST_GPIO_PINS_PER_BANK. This define represents the number of pins
inside a bank, so it shouldn't be used here.

On STiH407, PIO_FRONT0 controller has 10 banks, so IRQs pending in the two
last banks (PIO18 & PIO19) aren't handled.

This patch replace ST_GPIO_PINS_PER_BANK by the number of banks inside the
controller.

Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@vger.kernel.org> #v3.15+
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
11 years agodm cache metadata: do not allow the data block size to change for-linus
Mike Snitzer [Mon, 14 Jul 2014 20:59:39 +0000 (16:59 -0400)]
dm cache metadata: do not allow the data block size to change

The block size for the dm-cache's data device must remained fixed for
the life of the cache.  Disallow any attempt to change the cache's data
block size.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
11 years agodm thin metadata: do not allow the data block size to change
Mike Snitzer [Mon, 14 Jul 2014 20:35:54 +0000 (16:35 -0400)]
dm thin metadata: do not allow the data block size to change

The block size for the thin-pool's data device must remained fixed for
the life of the thin-pool.  Disallow any attempt to change the
thin-pool's data block size.

It should be noted that attempting to change the data block size via
thin-pool table reload will be ignored as a side-effect of the thin-pool
handover that the thin-pool target does during thin-pool table reload.

Here is an example outcome of attempting to load a thin-pool table that
reduced the thin-pool's data block size from 1024K to 512K.

Before:
kernel: device-mapper: thin: 253:4: growing the data device from 204800 to 409600 blocks

After:
kernel: device-mapper: thin metadata: changing the data block size (from 2048 to 1024) is not supported
kernel: device-mapper: table: 253:4: thin-pool: Error creating metadata object
kernel: device-mapper: ioctl: error adding target to table

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Cc: stable@vger.kernel.org
11 years agotracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs
zhangwei(Jovi) [Thu, 18 Jul 2013 08:31:18 +0000 (16:31 +0800)]
tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs

The TRACE_ITER_PRINTK check in __trace_puts/__trace_bputs is missing,
so add it, to be consistent with __trace_printk/__trace_bprintk.
Those functions are all called by the same function: trace_printk().

Link: http://lkml.kernel.org/p/51E7A7D6.8090900@huawei.com
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Tue, 15 Jul 2014 15:57:17 +0000 (08:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:
 "This contains miscellaneous fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: replace count*size kzalloc by kcalloc
  fuse: release temporary page if fuse_writepage_locked() failed
  fuse: restructure ->rename2()
  fuse: avoid scheduling while atomic
  fuse: handle large user and group ID
  fuse: inode: drop cast
  fuse: ignore entry-timeout on LOOKUP_REVAL
  fuse: timeout comparison fix

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 15 Jul 2014 15:42:52 +0000 (08:42 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Bluetooth pairing fixes from Johan Hedberg.

 2) ieee80211_send_auth() doesn't allocate enough tail room for the SKB,
    from Max Stepanov.

 3) New iwlwifi chip IDs, from Oren Givon.

 4) bnx2x driver reads wrong PCI config space MSI register, from Yijing
    Wang.

 5) IPV6 MLD Query validation isn't strong enough, from Hangbin Liu.

 6) Fix double SKB free in openvswitch, from Andy Zhou.

 7) Fix sk_dst_set() being racey with UDP sockets, leading to strange
    crashes, from Eric Dumazet.

 8) Interpret the NAPI budget correctly in the new systemport driver,
    from Florian Fainelli.

 9) VLAN code frees percpu stats in the wrong place, leading to crashes
    in the get stats handler.  From Eric Dumazet.

10) TCP sockets doing a repair can crash with a divide by zero, because
    we invoke tcp_push() with an MSS value of zero.  Just skip that part
    of the sendmsg paths in repair mode.  From Christoph Paasch.

11) IRQ affinity bug fixes in mlx4 driver from Amir Vadai.

12) Don't ignore path MTU icmp messages with a zero mtu, machines out
    there still spit them out, and all of our per-protocol handlers for
    PMTU can cope with it just fine.  From Edward Allcutt.

13) Some NETDEV_CHANGE notifier invocations were not passing in the
    correct kind of cookie as the argument, from Loic Prylli.

14) Fix crashes in long multicast/broadcast reassembly, from Jon Paul
    Maloy.

15) ip_tunnel_lookup() doesn't interpret wildcard keys correctly, fix
    from Dmitry Popov.

16) Fix skb->sk assigned without taking a reference to 'sk' in
    appletalk, from Andrey Utkin.

17) Fix some info leaks in ULP event signalling to userspace in SCTP,
    from Daniel Borkmann.

18) Fix deadlocks in HSO driver, from Olivier Sobrie.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits)
  hso: fix deadlock when receiving bursts of data
  hso: remove unused workqueue
  net: ppp: don't call sk_chk_filter twice
  mlx4: mark napi id for gro_skb
  bonding: fix ad_select module param check
  net: pppoe: use correct channel MTU when using Multilink PPP
  neigh: sysctl - simplify address calculation of gc_* variables
  net: sctp: fix information leaks in ulpevent layer
  MAINTAINERS: update r8169 maintainer
  net: bcmgenet: fix RGMII_MODE_EN bit
  tipc: clear 'next'-pointer of message fragments before reassembly
  r8152: fix r8152_csum_workaround function
  be2net: set EQ DB clear-intr bit in be_open()
  GRE: enable offloads for GRE
  farsync: fix invalid memory accesses in fst_add_one() and fst_init_card()
  igb: do a reset on SR-IOV re-init if device is down
  igb: Workaround for i210 Errata 25: Slow System Clock
  usbnet: smsc95xx: add reset_resume function with reset operation
  dp83640: Always decode received status frames
  r8169: disable L23
  ...

11 years agolibata: EH should handle AMNF error condition as a media error
Alexey Asemov [Tue, 15 Jul 2014 06:28:42 +0000 (10:28 +0400)]
libata: EH should handle AMNF error condition as a media error

libata-eh.c should handle AMNF error condition (error byte bit 0,
usually code 0x01) in libata-eh.c along with UNC as a media error so
SCSI stack can handle it properly (translation code 0x01 is already
present in libata-scsi.c) but was never passed down due to lack of
handling in EH.

While using linux-based machine (AMD 6550M-based notebook, PCI IDs for the
controller are 1022:7801 subsys 1025:059d) and ddrescue to salvage data
from failing hard drive (WD7500BPVT 2.5" 750G SATA2), I've found that pure
AMNF 0x01 error code generates generic "device error" that is retried
several times by SCSI stack instead of "media error" that is passed up to
software.

So we may assume deprecated AMNF error code is surely not dead yet, and
it's better for it to be handled properly. As we may see it is used by
modern enough devices, and used properly: drive returned AMNF only when IDs
for track cannot be read completely due to dying head or positioning,
otherwise it returned UNC(orrectables).

Not handling it causes wrong generic error code ("device error") reporting
down the stack, can damage failing drives further because of excessive
retries, and slows salvaging down a lot. Also, there is handling code in
libata-scsi.c for 0x01 AMNF error already.

https://bugzilla.kernel.org/show_bug.cgi?id=80031

tj: Shortened $SUBJ and moved its content to the first paragraph.

Signed-off-by: Alexey Asemov <alex@alex-at.ru>
Signed-off-by: Tejun Heo <tj@kernel.org>