Find the kth smallest number in an array
This is an classical question, the general solution to this question is first using sorting algorithm to sort the array, then get the kth number in the array, the complexity is O(nlogn), we can also use selective sort or head sort to, the complexity of selective sort is O(kn) and heap sort is O(nlogk). The better solution is to use quick sort to find the kth smallest number, the complexity is O(n), the worst case cost is O(n^2). But today we introduce one more solution which has the worst case cost O(n).
First let's check the quick sort solution, quick sort is to find one number, then put all numbers in the array which is less than the number chosen on one side and then put the other numbers on the other side. Then recursively sort them again. Then to find the kth smallest number in the array, we only need to find it in one side of the array.
-
import random
-
-
def partition(arr, left, right, pivot):
-
v = arr[pivot]
-
arr[pivot], arr[right-1] = arr[right-1], arr[pivot]
-
index = left
-
for i in xrange(left, right):
-
if arr[i] <= v:
-
arr[i], arr[index] = arr[index], arr[i]
-
index += 1
-
return index-1
-
-
def select(arr, left, right, k):
-
while right - left > 1:
-
index = partition(arr, left, right, random.randint(left, right-1))
-
dist = index - left + 1
-
if dist == k:
-
return arr[index]
-
if dist < k:
-
k -= dist
-
left = index + 1
-
else:
-
right = index
-
return arr[left]
Here arr is the array to search, we can call select to find the kth smallest number, if we cannot find the pivot element correctly, the worst case cost will be O(n^2).
Now we discuss the solution which has worst case cost O(n), we can divide all these numbers into some subarrays which have 5 elements in each subarray, then there will be n/5 subarrays. For each subarray, we can find the middle number very fast, then we can find the middle number of these n/5 middle numbers. This algorithm is called Median of Medians algorithm.
If we use the median of the medians as the pivot, then there will be 3/5*1/2 which is 3/10 numbers which are less or equal to pivot, also there will be 3/10 numbers which are greater than pivot, so the worst case cost is the array will be divided into 30%,70% or 70%,30% two parts.
The cost is :
T(n)<=T(n/5)+T(7/10*n)+O(n)<=c*n*(1+9/10+(9/10)^2....)
So T(n)=O(n)
The worst case cost is O(n).
-
import heapq
-
-
def partition(arr, left, right, pivot):
-
v = arr[pivot]
-
arr[pivot], arr[right-1] = arr[right-1], arr[pivot]
-
index = left
-
for i in xrange(left, right):
-
if arr[i] <= v:
-
arr[i], arr[index] = arr[index], arr[i]
-
index += 1
-
return index-1
-
-
def select_heap(arr, left, right, k):
-
tmp = arr[left:right]
-
heapq.heapify(tmp)
-
[heapq.heappop(tmp) for i in xrange(k-1)]
-
return heapq.heappop(tmp)
-
-
def median(arr, left, right):
-
num = (right - left - 1) / 5
-
for i in xrange(num+1):
-
sub_left = left + i*5
-
sub_right = sub_left + 5
-
if sub_right > right:
-
sub_right = right
-
m_index = select_heap(arr, sub_left, sub_right, (sub_right-sub_left)/2)
-
arr[left+i], arr[m_index] = arr[m_index], arr[left+i]
-
return select(arr, left, left+num+1, (num+1)/2)
-
-
def select(arr, left, right, k):
-
while right - left > 1:
-
pivot = median(arr, left, right)
-
index = partition(arr, left, right, pivot)
-
dist = index - left + 1
-
if dist == k:
-
return arr[index]
-
if dist < k:
-
k -= dist
-
left = index + 1
-
else:
-
right = index
-
return arr[left]
Source : http://www.isnowfy.com/top-k-number/
By clicking the "Mark as important" button, this article will be put to your important article list which you can find in "Amin->Article important list". Later when you want reread this article, it's easier for you to find it by checking the "Article important list".
Tags:Sort, Quick sort, Search,Smallest Read(1507) Comment(0)
Previous : Output a file with HTTP range header in PHP Next : Big file transfer in Linux
::Related Articles
::Comment Zone
No comment for this article.
::Comment
:: Other versions
No other versions available yet.
:: Recent articles
- Huawei is open to acquire Nokia
- HTML Email Guide
- Use of Erlang in WhatsApp
- Write high quality JavaScript and CSS with SublimeLinter
- Is Facebook developing RSS reader?
- About tmpfs
- Differences among Enter,F5 and Ctrl+F5 in webpage refresh
- China to expand 4G network across the country
- Yahoo is going to recycle inactive user IDs
- The big four in Taiwan can compete with Samsung
- more>>
:: Most read
- TIOBE : C overtakes Java as the No.1 programming language
- Sony is to release PlayStation4 in 2015
- Which programming language should I learn first?
- Disposable Email address
- 5 Free Open Source Chat Applications For Developers
- Never ever touch a programmer
- Hacking Vs. Programming
- TIOBE : Where is that next big programming language?
- Multitasking vs multiprogramming
- Google is developing advanced programming technology to simplify Web application development
- more>>
:: Most commented
- Hacking Vs. Programming
- Which programming language should I learn first?
- Error handling style in C
- 10 controversial programming opinions?
- C vs Java Complete Comparison
- Sony is to release PlayStation4 in 2015
- Google+ is sick
- Unix Philosophy
- Disposable Email address
- Will Google+ be a "Ghost City"
- more>>
:: Contribute
Write article:: Find us
