- 13 Jul, 2021 1 commit
-
-
Eric Huang authored
This reverts commit 31f33243. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 30 Jun, 2021 3 commits
-
-
Philip Yang authored
This is part of the SVM profiling API: export sysfs counters for per-process, per-GPU VM retry faults and for pages migrated into and out of GPU VRAM. The counters are not updated in parallel by the GPU retry fault handler and the VRAM/RAM migration paths, so plain updates suffice; use READ_ONCE when reading them to avoid compiler optimization (a sketch of the read side follows below).
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
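A minimal sketch of that read side, assuming an illustrative stats structure (the real kfd fields differ): the fault handler bumps a plain u64, and the sysfs show callback wraps the load in READ_ONCE().

  #include <linux/kobject.h>
  #include <linux/sysfs.h>

  /* Hypothetical container; the actual kfd structures differ. */
  struct svm_stats_kobj {
          struct kobject kobj;
          u64 faults;     /* bumped only by the retry fault handler */
  };

  static ssize_t faults_show(struct kobject *kobj, struct attribute *attr,
                             char *buf)
  {
          struct svm_stats_kobj *s =
                  container_of(kobj, struct svm_stats_kobj, kobj);

          /* READ_ONCE() stops the compiler from caching or tearing
           * the load while the fault handler updates the counter. */
          return sysfs_emit(buf, "%llu\n", READ_ONCE(s->faults));
  }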
-
Philip Yang authored
Fix three cases of kobject leaks that cause memory leaks: the kobj_type must have a release() method so memory is freed from the release callback (the pattern is sketched below); a NULL default_attrs is not needed to initialize the kobj; and sysfs files created under kobj_status should be removed with kobj_status as the parent kobject. Also remove queue sysfs files when releasing a queue from the process MMU notifier release callback.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
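The first fix follows the standard kobject pattern; a hedged sketch with illustrative names (the kfd types are more involved):

  #include <linux/kobject.h>
  #include <linux/slab.h>

  struct queue_kobj {
          struct kobject kobj;
          /* ... queue state ... */
  };

  static void queue_kobj_release(struct kobject *kobj)
  {
          /* Freeing from the release callback, not at the
           * kobject_put() call site, is what closes the leak: the
           * memory must live until the last reference is dropped. */
          kfree(container_of(kobj, struct queue_kobj, kobj));
  }

  static struct kobj_type queue_ktype = {
          .release   = queue_kobj_release,
          .sysfs_ops = &kobj_sysfs_ops,
  };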
-
Philip Yang authored
No functional change. Modify kfd_sysfs_create_file to take a kobject as parameter, so it becomes a common helper that removes duplicated code and simplifies creating new kfd sysfs files in the future. Move the pr_warn on sysfs file creation failure into the helper. Make the helper return void, because no caller uses its return value (the resulting helper shape is sketched below).
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
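A sketch of that helper, based on the description above (the mode value and warning text are assumptions, not the verbatim kfd code):

  static void kfd_sysfs_create_file(struct kobject *kobj,
                                    struct attribute *attr,
                                    const char *name)
  {
          int ret;

          attr->name = name;
          attr->mode = 0444;      /* assumed read-only mode */
          sysfs_attr_init(attr);

          ret = sysfs_create_file(kobj, attr);
          if (ret)
                  pr_warn("Create sysfs %s/%s failed %d\n",
                          kobj->name, name, ret);
  }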
-
- 15 Jun, 2021 1 commit
-
-
Felix Kuehling authored
When some GPUs don't support SVM, don't disable it for the entire process. That would be inconsistent with the information the process got from the topology, which indicates SVM support per GPU. Instead disable SVM support only for the unsupported GPUs. This is done by checking any per-device attributes against the bitmap of supported GPUs. Also use the supported-GPU bitmap to initialize access bitmaps for new SVM address ranges (both uses are sketched below). Don't handle recoverable page faults from unsupported GPUs. (I don't think there will be unsupported GPUs that can generate recoverable page faults. But better safe than sorry.)
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
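A hedged sketch of the two bitmap uses described above; svms->bitmap_supported and the range fields are named after the commit text, and the exact kfd definitions may differ:

  /* Reject a per-device attribute for a GPU without SVM support */
  static int svm_check_gpu_supported(struct svm_range_list *svms,
                                     unsigned int gpuidx)
  {
          return test_bit(gpuidx, svms->bitmap_supported) ? 0 : -EINVAL;
  }

  /* New ranges start with access restricted to supported GPUs */
  static void svm_init_access_bitmap(struct svm_range_list *svms,
                                     struct svm_range *prange)
  {
          bitmap_copy(prange->bitmap_access, svms->bitmap_supported,
                      MAX_GPU_INSTANCE);
  }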
-
- 07 Jun, 2021 1 commit
-
-
Wan Jiabing authored
kfd_svm.h has been included twice since commit 42de677f ("drm/amdkfd: register svm range"). After checking the possibly related header files, remove the former include to keep the code format reasonable.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 04 Jun, 2021 2 commits
-
-
Eric Huang authored
This optimizes memory-mapping latency and also avoids a page fault in a corner case where a valid PDE is changed into a PTE.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
This provides more TLB flush type options for different scenarios.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 May, 2021 1 commit
-
-
Guenter Roeck authored
The first parameter passed to container_of() is the pointer to the work structure passed to the worker and is never NULL. The NULL check on the result of container_of() is therefore unnecessary and misleading. Remove it. This change was made automatically with the following Coccinelle script:

  @@
  type t;
  identifier v;
  statement s;
  @@

  <+...
  (
    t v = container_of(...);
  |
    v = container_of(...);
  )
  ... when != v
  - if (\( !v \| v == NULL \) ) s
  ...+>

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 20 May, 2021 1 commit
-
-
Philip Yang authored
We need to do a heavy-weight TLB flush to make sure there is no more dirty data in the cache for the unmapped pages. Define enum TLB_FLUSH_TYPE and add a flush_type parameter to amdgpu_amdkfd_flush_gpu_tlb_pasid (a sketch of the likely interface follows).
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
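The commit text pins down the interface change; a hedged sketch of what it likely looks like (the enumerator names are an assumption):

  enum TLB_FLUSH_TYPE {
          TLB_FLUSH_LEGACY = 0,
          TLB_FLUSH_LIGHTWEIGHT,
          TLB_FLUSH_HEAVYWEIGHT,  /* also writes back dirty cache lines */
  };

  int amdgpu_amdkfd_flush_gpu_tlb_pasid(struct kgd_dev *kgd,
                                        uint16_t pasid,
                                        enum TLB_FLUSH_TYPE flush_type);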
-
- 29 Apr, 2021 1 commit
-
-
Fabio M. De Francesco authored
Fix a kernel-doc error in the documentation of a function.
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 Apr, 2021 8 commits
-
-
Felix Kuehling authored
With XNACK on, the GPU VM fault handler decides the best restore location, then migrates the range to that location and updates the GPU mapping to recover from the GPU VM fault.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Sierra authored
XNACK mode controls the SQ RETRY_DISABLE setting that determines whether recoverable page faults can be supported on GFXv9 hardware. Only Aldebaran can support different processes running with different XNACK modes. On older chips all processes must use the same RETRY_DISABLE setting. However, processes not relying on recoverable page faults can work with RETRY enabled. This means XNACK off is always available as a fallback, so we can use the same mode on all GPUs in a process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
The HMM interval notifier callback notifies us that the CPU page table will be updated; stop the process queues if the updated address belongs to an svm range registered in the process's svms object tree. Schedule restore work to update the GPU page table using the new page addresses in the updated svm range. The restore worker flushes any deferred work to make sure it restores an up-to-date svm_range_list.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
The svm range structure stores the range start address, size, attributes, flags, prefetch location, and a GPU bitmap indicating which GPUs this range maps to. The same virtual address is shared by the CPU and GPUs. Each process has an svm range list that uses both an interval tree and a list to store all svm ranges registered by the process. The interval tree is used by the GPU VM fault handler and the CPU page fault handler to get the svm range structure for a specific address. The list is used to scan all ranges in the eviction restore work. No overlapping intervals [start, last] exist in the svms object's interval tree; if a process registers a new range that overlaps an old range, the old range is split into two ranges, depending on whether the overlap is at the head or the tail of the old range. Apply the preferred location, prefetch location, mapping flags, and migration granularity attributes to the svm range, and store the mapping GPU index in the bitmap (a condensed struct sketch follows below).
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
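Condensed into a struct, the description above maps to roughly the following (a hedged sketch; field names only approximate the real struct svm_range):

  struct svm_range {
          struct svm_range_list     *svms;
          struct interval_tree_node it_node; /* [start, last] lookups */
          struct list_head          list;    /* walked by restore work */
          uint64_t                  start;
          uint64_t                  last;
          uint32_t                  flags;
          uint32_t                  preferred_loc;
          uint32_t                  prefetch_loc;
          uint32_t                  granularity;
          /* which GPUs may access the range */
          DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE);
  };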
-
Philip Yang authored
Add the svm (shared virtual memory) ioctl data structure and API definition. The svm ioctl API is designed to be extensible in the future. All operations are provided by a single ioctl to preserve ioctl number space. The arguments structure ends with a variable-size array of attributes that can be used to set or get one or multiple attributes (the layout is sketched below).
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
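A sketch of that extensible layout (close to the kfd uapi shape, but treat the field names as approximate):

  struct kfd_ioctl_svm_attribute {
          __u32 type;
          __u32 value;
  };

  struct kfd_ioctl_svm_args {
          __u64 start_addr;
          __u64 size;
          __u32 op;       /* set or get attributes */
          __u32 nattr;
          /* variable-size tail keeps the single ioctl extensible */
          struct kfd_ioctl_svm_attribute attrs[];
  };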
-
Alex Sierra authored
The svm range uses a GPU bitmap to store which GPUs the range maps to. Applications pass the driver GPU ID to specify a GPU, so a helper is needed to convert a GPU ID to a GPU bitmap index (a sketch of the helper follows). Access goes through the kfd_process_device pointers array in kfd_process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
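A hedged sketch of the lookup, using the pdds array from the kfd_process rework (see the 09 Apr, 2021 commit below); the field names are assumptions:

  static int kfd_process_gpuidx_from_gpuid(struct kfd_process *p,
                                           uint32_t gpu_id)
  {
          int i;

          for (i = 0; i < p->n_pdds; i++)
                  if (p->pdds[i] && gpu_id == p->pdds[i]->dev->id)
                          return i;  /* index doubles as the bitmap bit */

          return -EINVAL;
  }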
-
Felix Kuehling authored
DRM render node file handles are used by the Thunk for CPU mapping of BOs via mmap; it uses the DRM render node of the GPU where the BO was allocated. DRM allows mmap access automatically when it creates a GEM handle for a BO. KFD BOs don't have GEM handles, so KFD needs to manage access manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was used in the kfd_ioctl_acquire_vm call for the same GPU (a minimal allow/revoke sketch follows).
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
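A minimal sketch of the allow/revoke pairing, assuming the BO is reachable as a drm_gem_object and drm_priv is the render node's drm_file (the wrapper name is illustrative):

  #include <drm/drm_gem.h>
  #include <drm/drm_vma_manager.h>

  static int kfd_bo_allow_mmap(struct drm_gem_object *obj,
                               struct drm_file *drm_priv)
  {
          /* KFD BOs have no GEM handle, so grant mmap access
           * explicitly for this render node file. */
          return drm_vma_node_allow(&obj->vma_node, drm_priv);
  }

  /* On free, the grant must be paired with:
   *      drm_vma_node_revoke(&obj->vma_node, drm_priv);
   */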
-
Felix Kuehling authored
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 15 Apr, 2021 1 commit
-
-
Felix Kuehling authored
ROCm user mode has acquired VMs from DRM file descriptors for as long as it has supported the upstream KFD. Legacy code to support older versions of ROCm is not needed any more.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 09 Apr, 2021 1 commit
-
-
Alex Sierra authored
Remove per_device_list from kfd_process and replace it with an array of kfd_process_device pointers of MAX_GPU_INSTANCES size. This helps manage the kfd_process_devices bound to a specific kfd_process. The functions used by kfd_chardev to iterate over the list were removed, since they are no longer valid; they are replaced by local loops iterating over the array (sketched below).
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
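The replacement iteration pattern looks roughly like this (a sketch; pdds and n_pdds follow the commit's description, not verbatim kfd code):

  static void for_each_pdd_example(struct kfd_process *p)
  {
          int i;

          for (i = 0; i < p->n_pdds; i++) {
                  struct kfd_process_device *pdd = p->pdds[i];

                  /* per-device work that used to walk per_device_list */
                  (void)pdd;
          }
  }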
-
- 05 Mar, 2021 1 commit
-
-
Jay Cornwall authored
The trap handler is set per-process, per-device and is unrelated to queue management. Move the implementation closer to the TMA setup code.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 22 Feb, 2021 2 commits
-
-
Felix Kuehling authored
If init_cwsr_apu fails, we currently leave the kfd_process structure in place anyway. The next kfd_open will then succeed, using the existing kfd_process structure. Fix that by cleaning up the kfd_process after a failure in init_cwsr_apu.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
We use mmu_notifier_put to free the MMU notifier. That needs to be paired with mmu_notifier_get to work correctly (the pairing is sketched below). Otherwise the next patch would cause a kernel oops.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
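The pairing requirement in sketch form, with illustrative names: mmu_notifier_put() frees through ops->free_notifier, which is only wired up when the notifier came from mmu_notifier_get().

  #include <linux/mmu_notifier.h>
  #include <linux/slab.h>

  struct example_ctx {
          struct mmu_notifier mn;
  };

  static struct mmu_notifier *example_alloc_notifier(struct mm_struct *mm)
  {
          struct example_ctx *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);

          return ctx ? &ctx->mn : ERR_PTR(-ENOMEM);
  }

  static void example_free_notifier(struct mmu_notifier *mn)
  {
          /* called from mmu_notifier_put() once the last ref drops */
          kfree(container_of(mn, struct example_ctx, mn));
  }

  static const struct mmu_notifier_ops example_ops = {
          .alloc_notifier = example_alloc_notifier,
          .free_notifier  = example_free_notifier,
  };

  static int example_attach(struct mm_struct *mm)
  {
          struct mmu_notifier *mn = mmu_notifier_get(&example_ops, mm);

          if (IS_ERR(mn))
                  return PTR_ERR(mn);
          /* ... later, release with: mmu_notifier_put(mn); */
          return 0;
  }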
-
- 30 Sep, 2020 1 commit
-
-
Ramesh Errabolu authored
[Why] Allow the user to know how many compute units (CUs) are in use at any given moment. [How] Surface files in sysfs that allow the user to determine the number of compute units in use for a given process, with one sysfs file per device.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 22 Sep, 2020 2 commits
-
-
Philip Cox authored
amdkfd dumps a stack trace during initialization because kfd_procfs_add_sysfs_stats is called twice. Remove one of the calls.
Fixes: 4327bed2 ("drm/amdkfd: Add process eviction counters to sysfs")
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Mukul Joshi authored
Move doorbell allocation for a process into the kfd device and allocate doorbell space in each PDD during process creation. Currently, KFD manages its own doorbell space, but for some devices amdgpu allocates the complete doorbell space instead of leaving a chunk for KFD to manage. In a system with a mix of such devices, KFD needs to request process doorbell space based on the type of device, either from amdgpu or from its own doorbell space.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 Sep, 2020 2 commits
-
-
Philip Cox authored
Add per-process eviction counters to sysfs to keep track of how many eviction events have happened for each process.
v2: rename the stats dir, and track all evictions per process, per device.
v3: simplify the stats kobject handling and cleanup.
v4: more code cleanup
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Philip Cox authored
Extend the debug_evictions module parameter to also print a stack trace when the eviction code path is called.
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 17 Sep, 2020 1 commit
-
-
Fenghua Yu authored
PASID is defined as a few different types in iommu, including "int", "u32", and "unsigned int". To be consistent and to match the uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type is changed in uapi, although it defines PASID as __u64 in some places.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
-
- 24 Aug, 2020 1 commit
-
-
Mukul Joshi authored
Add the __user annotation to fix a related sparse warning triggered while reading SDMA counters from userland. Also rework the SDMA counter read function by removing redundant checks.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 18 Aug, 2020 1 commit
-
-
Mukul Joshi authored
To prevent reporting erroneous SDMA usage, initialize the SDMA activity counter to 0 before using it.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 14 Aug, 2020 1 commit
-
-
Christian König authored
The whole approach wasn't thought through to the end. We already had a reset lock like this in the past, and it caused the same problems as this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1a.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 27 Jul, 2020 1 commit
-
-
Dennis Li authored
When the GPU hangs, the driver has multiple paths into amdgpu_device_gpu_recover; the atomics adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe for other threads to access the GPU, which may cause the GPU reset to fail. Therefore the new rw_semaphore adev->reset_sem is introduced, which protects the GPU from being accessed by external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and the file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in the kfd driver.
3. remove try_lock and change adev->in_gpu_reset to atomic, to avoid re-entering GPU recovery for the same GPU hang.

v3:
1. change back to using adev->reset_sem to protect the kfd callback functions, because dqm_lock couldn't protect all code paths, for example free_mqd must be called outside of dqm_lock:

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce the atomic hive->in_reset, to avoid re-entering GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove commented-out code in amdgpu_device.c
3. add a more detailed comment in the commit message
4. define a wrapper function amdgpu_in_reset

v5:
1. fix some style issues.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 Jul, 2020 4 commits
-
-
Mukul Joshi authored
[ 150.887733] ======================================================
[ 150.893903] WARNING: possible circular locking dependency detected
[ 150.905917] ------------------------------------------------------
[ 150.912129] kfdtest/4081 is trying to acquire lock:
[ 150.917002] ffff8f7f3762e118 (&mm->mmap_sem#2){++++}, at: __might_fault+0x3e/0x90
[ 150.924490] but task is already holding lock:
[ 150.930320] ffff8f7f49d229e8 (&dqm->lock_hidden){+.+.}, at: destroy_queue_cpsch+0x29/0x210 [amdgpu]
[ 150.939432] which lock already depends on the new lock.
[ 150.947603] the existing dependency chain (in reverse order) is:
[ 150.955074] -> #3 (&dqm->lock_hidden){+.+.}:
[ 150.960822]        __mutex_lock+0xa1/0x9f0
[ 150.964996]        evict_process_queues_cpsch+0x22/0x120 [amdgpu]
[ 150.971155]        kfd_process_evict_queues+0x3b/0xc0 [amdgpu]
[ 150.977054]        kgd2kfd_quiesce_mm+0x25/0x60 [amdgpu]
[ 150.982442]        amdgpu_amdkfd_evict_userptr+0x35/0x70 [amdgpu]
[ 150.988615]        amdgpu_mn_invalidate_hsa+0x41/0x60 [amdgpu]
[ 150.994448]        __mmu_notifier_invalidate_range_start+0xa4/0x240
[ 151.000714]        copy_page_range+0xd70/0xd80
[ 151.005159]        dup_mm+0x3ca/0x550
[ 151.008816]        copy_process+0x1bdc/0x1c70
[ 151.013183]        _do_fork+0x76/0x6c0
[ 151.016929]        __x64_sys_clone+0x8c/0xb0
[ 151.021201]        do_syscall_64+0x4a/0x1d0
[ 151.025404]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 151.030977] -> #2 (&adev->notifier_lock){+.+.}:
[ 151.036993]        __mutex_lock+0xa1/0x9f0
[ 151.041168]        amdgpu_mn_invalidate_hsa+0x30/0x60 [amdgpu]
[ 151.047019]        __mmu_notifier_invalidate_range_start+0xa4/0x240
[ 151.053277]        copy_page_range+0xd70/0xd80
[ 151.057722]        dup_mm+0x3ca/0x550
[ 151.061388]        copy_process+0x1bdc/0x1c70
[ 151.065748]        _do_fork+0x76/0x6c0
[ 151.069499]        __x64_sys_clone+0x8c/0xb0
[ 151.073765]        do_syscall_64+0x4a/0x1d0
[ 151.077952]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 151.083523] -> #1 (mmu_notifier_invalidate_range_start){+.+.}:
[ 151.090833]        change_protection+0x802/0xab0
[ 151.095448]        mprotect_fixup+0x187/0x2d0
[ 151.099801]        setup_arg_pages+0x124/0x250
[ 151.104251]        load_elf_binary+0x3a4/0x1464
[ 151.108781]        search_binary_handler+0x6c/0x210
[ 151.113656]        __do_execve_file.isra.40+0x7f7/0xa50
[ 151.118875]        do_execve+0x21/0x30
[ 151.122632]        call_usermodehelper_exec_async+0x17e/0x190
[ 151.128393]        ret_from_fork+0x24/0x30
[ 151.132489] -> #0 (&mm->mmap_sem#2){++++}:
[ 151.138064]        __lock_acquire+0x11a1/0x1490
[ 151.142597]        lock_acquire+0x90/0x180
[ 151.146694]        __might_fault+0x68/0x90
[ 151.150879]        read_sdma_queue_counter+0x5f/0xb0 [amdgpu]
[ 151.156693]        update_sdma_queue_past_activity_stats+0x3b/0x90 [amdgpu]
[ 151.163725]        destroy_queue_cpsch+0x1ae/0x210 [amdgpu]
[ 151.169373]        pqm_destroy_queue+0xf0/0x250 [amdgpu]
[ 151.174762]        kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
[ 151.180577]        kfd_ioctl+0x223/0x400 [amdgpu]
[ 151.185284]        ksys_ioctl+0x8f/0xb0
[ 151.189118]        __x64_sys_ioctl+0x16/0x20
[ 151.193389]        do_syscall_64+0x4a/0x1d0
[ 151.197569]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 151.203141] other info that might help us debug this:
[ 151.211140] Chain exists of: &mm->mmap_sem#2 --> &adev->notifier_lock --> &dqm->lock_hidden
[ 151.222535] Possible unsafe locking scenario:
[ 151.228447]        CPU0                    CPU1
[ 151.232971]        ----                    ----
[ 151.237502]   lock(&dqm->lock_hidden);
[ 151.241254]                          lock(&adev->notifier_lock);
[ 151.247774]                          lock(&dqm->lock_hidden);
[ 151.254038]   lock(&mm->mmap_sem#2);

This commit fixes the warning by ensuring get_user() is not called while reading SDMA stats with dqm_lock held, as get_user() could cause a page fault, which leads to the circular locking scenario.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Bernard Zhao authored
The function kobject_init_and_add allocates memory along the path kobject_init_and_add -> kobject_add_varg -> kobject_set_name_vargs -> kvasprintf_const -> kstrdup_const -> kstrdup -> kmalloc_track_caller -> kmalloc_slab; in the error branch this memory is not freed. With kmemleak, this path may be caught. Add kobject_put in the kobject_init_and_add failure branch to fix the potential memory leak (the generic pattern is sketched below).
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
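The fix pattern in generic form (a sketch, not the kfd code): on failure, kobject_init_and_add() has already initialized the kobject and allocated its name, so the error path must drop the reference with kobject_put() rather than kfree().

  static int example_add_kobject(struct kobject *kobj,
                                 struct kobj_type *ktype,
                                 struct kobject *parent,
                                 const char *name)
  {
          int ret;

          ret = kobject_init_and_add(kobj, ktype, parent, "%s", name);
          if (ret)
                  /* the name was already allocated; only
                   * kobject_put() releases it correctly */
                  kobject_put(kobj);

          return ret;
  }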
-
Nirmoy Das authored
Used sparse (make C=1) to find these loose ends. v2: removed an unwanted extra line.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
The call to pm_runtime_get_sync increments the usage counter even in case of failure, leading to an incorrect ref count. In case of failure, decrement the ref count before returning (the standard pattern is sketched below).
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
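The canonical fix for this pm_runtime_get_sync quirk (a generic sketch, not the amdgpu code):

  #include <linux/pm_runtime.h>

  static int example_resume_and_use(struct device *dev)
  {
          int ret = pm_runtime_get_sync(dev);

          if (ret < 0) {
                  /* get_sync bumped the usage count even though it
                   * failed, so rebalance before returning */
                  pm_runtime_put_noidle(dev);
                  return ret;
          }

          /* ... use the device, then pm_runtime_put(dev); ... */
          return 0;
  }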
-
- 24 Jun, 2020 1 commit
-
-
Bernard Zhao authored
The function kobject_init_and_add allocates memory along the path kobject_init_and_add -> kobject_add_varg -> kobject_set_name_vargs -> kvasprintf_const -> kstrdup_const -> kstrdup -> kmalloc_track_caller -> kmalloc_slab; in the error branch this memory is not freed. With kmemleak, this path may be caught. Add kobject_put in the kobject_init_and_add failure branch to fix the potential memory leak.
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
-
- 29 May, 2020 1 commit
-
-
Colin Ian King authored
Currently, pointer pdd is dereferenced when pointer dpm is assigned, and only afterwards is pdd null-checked. Fix this by checking pdd for null before the dereference occurs.
Addresses-Coverity: ("Dereference before null check")
Fixes: 32cb59f3 ("drm/amdkfd: Track SDMA utilization per process")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-