https://leetcode.com/explore/learn/card/heap/
In many CS applications, we only need to
access the largest or smallest element
in the dataset. We
DO NOT care about the order of other data in the data set
.
How do we efficiently access the largest or smallest element in the
current dataset? -> The answer would be Heap
.
Heap
is a “complete binary tree”
Priority Queue (PQ)
abstract data type
similar to a
regular queue or stack data structure in which each element additionally
has a "priority"
associated with it. In a priority queue,
an element with high priority is served before an element with low
priority.Heap != Priority Queue
completed binary tree
(heap is binary
tree)O( log N)
O( log N)
O(1)
time complexity.
<=
or >=
C.min
heap
max
heap
heap sort
- `priority queue`
- LC 295, 787, 1492
- K th biggest integer
- LC 215
heap queue
AKA priority queue
) (Py api)MIN heap
-1 * val
index start from 0
pop()
will return min
element (instead of
max element)#------------------------
# PY API examples
#------------------------
#----------------------
# 1) build heapq
#----------------------
43]: import heapq
In [
...:
...:= [10, 17, 50, 7, 30, 24, 27, 45, 15, 5, 36, 21]
...: array = []
...: heap for num in array:
...:
...: heapq.heappush(heap, num)print("array:", array)
...: print("heap: ", heap)
...:
...:
...: heapq.heapify(array)print("array:", array)
...: 10, 17, 50, 7, 30, 24, 27, 45, 15, 5, 36, 21]
array: [5, 7, 21, 15, 10, 24, 27, 45, 17, 30, 36, 50]
heap: [5, 7, 21, 10, 17, 24, 27, 45, 15, 30, 36, 50]
array: [
# NOTE : there are 2 ways create heap (in py)
# 1) heappush(heap, num)
# 2) heapify(array)
#
# -> we can see above results are a bit different. However this not affect the "min heap" property in py. We can still get min element, and heap will get updated accordingly.
#----------------------
# 1') build heapq V2
#----------------------
# https://leetcode.com/explore/learn/card/heap/644/common-applications-of-heap/4022/
import heapq
# Construct an empty Min Heap
= []
minHeap
heapq.heapify(minHeap)
# Construct an empty Max Heap
# As there are no internal functions to construct a Max Heap in Python,
# So, we will not construct a Max Heap.
# Construct a Heap with Initial values
# this process is called "Heapify"
# The Heap is a Min Heap
= [3,1,2]
heapWithValues
heapq.heapify(heapWithValues)
# Trick in constructing a Max Heap
# As there are no internal functions to construct a Max Heap
# We can multiply each element by -1, then heapify with these modified elements.
# The top element will be the smallest element in the modified set,
# It can also be converted to the maximum value in the original dataset.
# Example
= [1,2,3]
maxHeap = [-x for x in maxHeap]
maxHeap
heapq.heapify(maxHeap)# The top element of maxHeap is -3
# Convert -3 to 3, which is the maximum value in the original maxHeap
#----------------------
# 2) insert into element
#----------------------
# https://leetcode.com/explore/learn/card/heap/644/common-applications-of-heap/4023/
# Insert an element to the Min Heap
5)
heapq.heappush(minHeap,
# Insert an element to the Max Heap
# Multiply the element by -1
# As we are converting the Min Heap to a Max Heap
-1 * 5)
heapq.heappush(maxHeap,
#----------------------
# 3) delete the top element
#----------------------
# https://leetcode.com/explore/learn/card/heap/644/common-applications-of-heap/4025/
# Delete top element from the Min Heap
heapq.heappop(minHeap)
# Delete top element from the Max Heap
heapq.heappop(maxHeap)
#----------------------
# 3) get top element
#----------------------
# https://leetcode.com/explore/learn/card/heap/644/common-applications-of-heap/4024/
# Get top element from the Min Heap
# i.e. the smallest element
0]
minHeap[# Get top element from the Max Heap
# i.e. the largest element
# When inserting an element, we multiplied it by -1
# Therefore, we need to multiply the element by -1 to revert it back
-1 * maxHeap[0]
#----------------------
# 2) sorting via heapq
#----------------------
44]: array = [10, 17, 50, 7, 30, 24, 27, 45, 15, 5, 36, 21]
In [= []
...: heap for num in array:
...:
...: heapq.heappush(heap, num)print(heap[0])
...: 5
45]: heap_sort = [heapq.heappop(heap) for _ in range(len(heap))]
In [print("heap sort result: ", heap_sort)
...: 5, 7, 10, 15, 17, 21, 24, 27, 30, 36, 45, 50]
heap sort result: [
#----------------------
# 3) get Min or Max from heap
#----------------------
48]: array = [10, 17, 50, 7, 30, 24, 27, 45, 15, 5, 36, 21]
In [
...: heapq.heapify(array)print(heapq.nlargest(2, array))
...: print(heapq.nsmallest(3, array))
...: 50, 45]
[5, 7, 10]
[
#----------------------
# 4) merge 2 sorted list via heap
#----------------------
49]: array_a = [10, 7, 15, 8]
In [= [17, 3, 8, 20, 13]
...: array_b = heapq.merge(sorted(array_a), sorted(array_b))
...: array_merge print("merge result:", list(array_merge))
...: 3, 7, 8, 8, 10, 13, 15, 17, 20]
merge result: [
#----------------------
# 5) heap replace element
#----------------------
50]: array_c = [10, 7, 15, 8]
In [
...: heapq.heapify(array_c)print("before:", array_c)
...: # heappushpop : push first, then pop
...: = heapq.heappushpop(array_c, 5)
...: item print("after: ", array_c)
...: print(item)
...:
...:7, 8, 15, 10]
before: [7, 8, 15, 10]
after: [5
51]: array_d = [10, 7, 15, 8]
In [
...: heapq.heapify(array_d)print("before:", array_d)
...: # pop first, then push
...: = heapq.heapreplace(array_d, 5)
...: item print("after: ", array_d)
...: print(item)
...: 7, 8, 15, 10]
before: [5, 8, 15, 10]
after: [7
#----------------------
# 5) make a MAX heapq
#----------------------
54]: numbers = [4,1,24,2,1]
In [
...:# invert numbers so that the largest values are now the smalles
...:
...:= [-1 * n for n in numbers]
...: numbers
...:# turn numbers into min heap
...:
...: heapq.heapify(numbers)
...:# pop out 5 times
...: = []
...: klargest for i in range(len(numbers)):
...: # multiply by -1 to get our inital number back
...: -1 * heapq.heappop(numbers))
...: klargest.append(
...:
55]: klargest
In [55]: [24, 4, 2, 1, 1] Out[
# https://docs.python.org/zh-tw/3/library/heapq.html
def heapsort(iterable):
= []
h for value in iterable:
heappush(h, value)return [heappop(h) for i in range(len(h))]
# heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
# >>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
// java
PriorityQueue pq
// random insert
for i in {2,4,1,9,6}:
.add(i)
pq
while pq not empty:
// every time get the one minimum element
print(pq.pop())
// the output should be in order (small -> big)
// 1,2,4,6,9
# 355 Design Twitter
# V0
# https://github.com/labuladong/fucking-algorithm/blob/master/%E6%95%B0%E6%8D%AE%E7%BB%93%E6%9E%84%E7%B3%BB%E5%88%97/%E8%AE%BE%E8%AE%A1Twitter.md
from collections import defaultdict
from heapq import merge
class Twitter(object):
def __init__(self):
self.follower_followees_map = defaultdict(set)
self.user_tweets_map = defaultdict(list)
self.time_stamp = 0
def postTweet(self, userId, tweetId):
self.user_tweets_map[userId].append((self.time_stamp, tweetId))
self.time_stamp -= 1
def getNewsFeed(self, userId):
# get the followees list
= self.follower_followees_map[userId]
followees # add userId as well, since he/she can also see his/her post in the timeline
followees.add(userId)
# reversed(.) returns a listreverseiterator, so the complexity is O(1) not O(n)
= [reversed(self.user_tweets_map[u]) for u in followees]
candidate_tweets
= []
tweets """
python starred expression :
-> will extend Iterable Unpacking
example 1 : *candidate_tweets
exmaple 2 : a, *b, c = range(5)
ref :
https://www.python.org/dev/peps/pep-3132/
https://blog.csdn.net/weixin_41521681/article/details/103528136
http://swaywang.blogspot.com/2012/01/pythonstarred-expression.html
https://github.com/yennanliu/CS_basics/blob/master/doc/cheatsheet/python_trick.md
"""
# complexity is 10*log(n), n is twitter's user number in worst case
for t in merge(*candidate_tweets):
1])
tweets.append(t[if len(tweets) == 10:
break
return tweets
def follow(self, followerId, followeeId):
self.follower_followees_map[followerId].add(followeeId)
def unfollow(self, followerId, followeeId):
self.follower_followees_map[followerId].discard(followeeId)
// java
// LC 347
// V1
// IDEA : HEAP
// https://leetcode.com/problems/top-k-frequent-elements/editorial/
public int[] topKFrequent_2(int[] nums, int k) {
// O(1) time
if (k == nums.length) {
return nums;
}
// 1. build hash map : character and how often it appears
// O(N) time
Map<Integer, Integer> count = new HashMap();
for (int n: nums) {
.put(n, count.getOrDefault(n, 0) + 1);
count}
// init heap 'the less frequent element first'
Queue<Integer> heap = new PriorityQueue<>(
(n1, n2) -> count.get(n1) - count.get(n2));
// 2. keep k top frequent elements in the heap
// O(N log k) < O(N log N) time
for (int n: count.keySet()) {
.add(n);
heapif (heap.size() > k) heap.poll();
}
// 3. build an output array
// O(k log k) time
int[] top = new int[k];
for(int i = k - 1; i >= 0; --i) {
[i] = heap.poll();
top}
return top;
}
# 703 Kth Largest Element in a Stream
# V0
# IDEA : HEAP
# NOTE !!! : we ONLY need to return k biggest element
# -> we ONLY need to keep at most k element
# -> if element more than k, then pop element out
# -> then return 0 element directly
import heapq
class KthLargest:
def __init__(self, k, nums):
self.k = k
heapq.heapify(nums)self.heap = nums
while len(self.heap) > k:
self.heap)
heapq.heappop(
def add(self, val):
if len(self.heap) < self.k:
self.heap, val)
heapq.heappush(else:
self.heap, val)
heapq.heappushpop(
return self.heap[0]
# V0'
# IDEA : heap op
class KthLargest(object):
def __init__(self, k, nums):
self.pool = nums
self.size = len(self.pool)
self.k = k
self.pool)
heapq.heapify(while self.size > k:
# heappop : get minimum element out from heap and return it
self.pool)
heapq.heappop(self.size -= 1
def add(self, val):
if self.size < self.k:
# heappush : put item input heap, and make heap unchanged
self.pool, val)
heapq.heappush(self.size += 1
elif val > self.pool[0]:
# get minimum value from heap and return it, and put new item into heap
# heapreplace : pop first, then push
self.pool, val)
heapq.heapreplace(return self.pool[0]
# 264 Ugly Number II
# V0
# IDEA : HEAP
# using brute force is too slow -> time out error
# -> so here we generate "ugly number" by ourself, and order them via heap (heappush)
# -> and return the i-th element as request
class Solution(object):
def nthUglyNumber(self, n):
= [1]
heap = set([1])
visited for i in range(n):
= heapq.heappop(heap)
val for factor in [2,3,5]:
if val*factor not in visited:
*factor)
heapq.heappush(heap, val*factor)
visited.add(valreturn val
# V1
import heapq
class Solution(object):
def nthUglyNumber(self, n):
= 0
ugly_number
= []
heap 1)
heapq.heappush(heap, for _ in range(n):
= heapq.heappop(heap)
ugly_number if ugly_number % 2 == 0:
* 2)
heapq.heappush(heap, ugly_number elif ugly_number % 3 == 0:
* 2)
heapq.heappush(heap, ugly_number * 3)
heapq.heappush(heap, ugly_number else:
* 2)
heapq.heappush(heap, ugly_number * 3)
heapq.heappush(heap, ugly_number * 5)
heapq.heappush(heap, ugly_number
return ugly_number
# 295 Find Median from Data Stream
# V0
# https://docs.python.org/zh-tw/3/library/heapq.html
# https://github.com/python/cpython/blob/3.10/Lib/heapq.py
# Note !!!
# -> that the heapq in python is a min heap, thus we need to invert the values in the smaller half to mimic a "max heap".
# IDEA : python heapq (heap queue AKA priority queue)
# -> Step 1) init 2 heap : small, large
# -> small : stack storage "smaller than half value" elements
# -> large : stack storage "bigger than half value" elements
# -> Step 2) check if len(self.small) == len(self.large)
# -> Step 3-1) add num: (if len(self.small) == len(self.large))
# -> since heapq in python is "min" heap, so we need to add minus to smaller stack for "max" heap simulation
# -> e.g. :
# "-num" in -heappushpop(self.small, -num)
# "-heappushpop" is for balacing the "-" back (e.g. -(-value) == value)
# and pop "biggest" elment in small stack to big stack
# -> Step 3-2) add num: (if len(self.small) != len(self.large))
# -> pop smallest element from large heap to small heap
# -> e.g. heappush(self.small, -heappushpop(self.large, num))
# -> Step 4) return median
# -> if even length (len(self.small) == len(self.large))
# -> return float(self.large[0] - self.small[0]) / 2.0
# -> if odd length ((len(self.small) != len(self.large)))
# -> return float(self.large[0])
from heapq import *
class MedianFinder:
def __init__(self):
self.small = [] # the smaller half of the list, max heap (invert min-heap)
self.large = [] # the larger half of the list, min heap
def addNum(self, num):
"""
doc : https://docs.python.org/3/library/heapq.html
src code : https://github.com/python/cpython/blob/3.10/Lib/heapq.py
* heappush(heap, item)
-> Push the value item onto the heap, maintaining the heap invariant.
* heappop(heap)
-> Pop and return the smallest item from the heap, maintaining the heap invariant. If the heap is empty, IndexError is raised. To access the smallest item without popping it, use heap[0].
* heappushpop(heap, item)
-> Push item on the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush() followed by a separate call to heappop().
"""
if len(self.small) == len(self.large):
self.large, -heappushpop(self.small, -num))
heappush(else:
self.small, -heappushpop(self.large, num))
heappush(
def findMedian(self):
# even length
if len(self.small) == len(self.large):
return float(self.large[0] - self.small[0]) / 2.0
# odd length
else:
return float(self.large[0])
# LC 1167 Minimum Cost to Connect Sticks
# V0
# IDEA : heapq
class Solution(object):
def connectSticks(self, sticks):
from heapq import *
heapify(sticks)= 0
res while len(sticks) > 1:
= heappop(sticks)
s1 = heappop(sticks)
s2 += s1 + s2 # merge 2 shortest sticks
res + s2)
heappush(sticks, s1 return res
# LC 1492 The kth Factor of n
# note : there is also brute force, math approaches
# V1
# IDEA : HEAP
# Initialize max heap. Use PriorityQueue in Java and heap in Python. heap is a min-heap. Hence, to implement max heap, change the sign of divisor before pushing it into the heap.
# https://leetcode.com/problems/the-kth-factor-of-n/solution/
class Solution:
def kthFactor(self, n, k):
# push into heap
# by limiting size of heap to k
def heappush_k(num):
- num)
heappush(heap, if len(heap) > k:
heappop(heap)
# Python heap is min heap
# -> to keep max element always on top,
# one has to push negative values
= []
heap for x in range(1, int(n**0.5) + 1):
if n % x == 0:
heappush_k(x)if x != n // x:
// x)
heappush_k(n
return -heappop(heap) if k == len(heap) else -1
# LC 1481. Least Number of Unique Integers after K Removals
# NOTE : there's also Counter approaches
# V0
# IDEA : Counter
from collections import Counter
class Solution:
def findLeastNumOfUniqueInts(self, arr, k):
# edge case
if not arr:
return 0
= dict(Counter(arr))
cnt = sorted(cnt.items(), key = lambda x : x[1])
cnt_sorted #print ("cnt_sorted = " + str(cnt_sorted))
= 0
removed for key, freq in cnt_sorted:
"""
NOTE !!!
-> we need to remove exactly k elements and make remain unique integers as less as possible
-> since we ALREADY sort num_counter,
-> so the elements NOW are ordering with their count
-> so we need to remove ALL element while k still > 0
-> so k -= freq, since for element key, there are freq count for it in arr
"""
if freq <= k:
-= freq
k += 1
removed
return len(cnt.keys()) - removed
# V1'''''
# IDEA : Counter + heapq
# https://leetcode.com/problems/least-number-of-unique-integers-after-k-removals/discuss/704179/python-solution%3A-Counter-and-Priority-Queue
# IDEA
# -> Count the occurence of each number.
# -> We want to delete the number with lowest occurence thus we can use minimum steps to reduce the total unique numbers in the list. For example,[4,3,1,1,3,3,2]. The Counter of this array will be: {3:3, 1:2, 4:1, 2:1}. Given k = 3, the greedy approach is to delete 2 and 4 first because both of them are appearing once. We need an ordering data structure to give us the lowest occurence of number each time. As you may know, Priority Queue comes to play
# -> Use heap to build PQ for the counter. We store each member as a tuple: (count, number) Python heap module will sort it based on the first member of the tuple.
# -> loop through k times to pop member out of heap and check if we need to push it back
class Solution(object):
def findLeastNumOfUniqueInts(self, arr, k):
# use counter, and heap (priority queue)
from collections import Counter
import heapq
= []
h for key, val in Counter(arr).items():
heapq.heappush(h,(val,key))
while k > 0:
= heapq.heappop(h)
item if item[0] != 1:
0]-1, item[1]))
heapq.heappush(h, (item[-=1
k
return len(h)
# V1'''''''
# IDEA : Counter + heapq
# https://leetcode.com/problems/least-number-of-unique-integers-after-k-removals/discuss/1542356/Python-MinHeap-Solution
class Solution:
def findLeastNumOfUniqueInts(self, arr, k):
= collections.Counter(arr)
counter = []
minHeap for key, val in counter.items():
heapq.heappush(minHeap, val)
while k:
0] -= 1
minHeap[if minHeap[0] == 0:
heapq.heappop(minHeap)-= 1
k
return len(minHeap)
# LC 1353. Maximum Number of Events That Can Be Attended
# V0
# IDEA : PRIORITY QUEUE
# NOTE !!!
# We just need to attend d where startTimei <= d <= endTimei, then we CAN attend the meeting
# startTimei <= d <= endTimei. You can only attend one event at any time d.
class Solution:
def maxEvents(self, events: List[List[int]]) -> int:
# algorithm: greedy+heap
# step1: loop from min to max day
# step2: each iteration put the candidates in the heap
# step3: each iteration eliminate the ineligibility ones from the heap
# step4: each iteration choose one event attend if it is possible
# time complexity: O(max(n1logn1, n2))
# space complexity: O(n1)
= lambda x: -x[0])
events.sort(key = []
h = 0
ans = 1 #events[-1][0]
minDay = 100001 #max(x[1] for x in events) + 1
maxDay for day in range(minDay, maxDay):
# add all days that can start today
while events and events[-1][0] <= day:
1])
heapq.heappush(h, events.pop()[
# remove all days that cannot start
while h and h[0]<day:
heapq.heappop(h)
# if can attend meeting
if h:
heapq.heappop(h)+= 1
ans return ans
# V0'
# IDEA : PRIORITY QUEUE
# NOTE !!!
# We just need to attend d where startTimei <= d <= endTimei, then we CAN attend the meeting
# startTimei <= d <= endTimei. You can only attend one event at any time d.
class Solution:
def maxEvents(self, events):
= lambda x: (-x[0], -x[1]))
events.sort(key = []
endday = 0
ans for day in range(1, 100001, 1):
# check if events is not null and events start day = day (events[-1][0] == day)
# if above conditions are True, we insert "events.pop()[1]" to endday
while events and events[-1][0] == day:
1])
heapq.heappush(endday, events.pop()[# check if endday is not null, if first day in endday < day, then we pop its element
while endday and endday[0] < day:
heapq.heappop(endday)# if there is still remaining elements in endday -> means we CAN atten the meeting, so ans += 1
if endday:
+= 1
ans
heapq.heappop(endday)return ans
# LC 895. Maximum Frequency Stack
# V1'
# IDEA : STACK
# https://leetcode.com/problems/maximum-frequency-stack/solution/
class FreqStack(object):
def __init__(self):
self.freq = collections.Counter()
self.group = collections.defaultdict(list)
self.maxfreq = 0
def push(self, x):
= self.freq[x] + 1
f self.freq[x] = f
if f > self.maxfreq:
self.maxfreq = f
self.group[f].append(x)
def pop(self):
= self.group[self.maxfreq].pop()
x self.freq[x] -= 1
if not self.group[self.maxfreq]:
self.maxfreq -= 1
return x
# LC 264 Ugly Number II
# V0
# IDEA : HEAP
# using brute force is too slow -> time out error
# -> so here we generate "ugly number" by ourself, and order them via heap (heappush)
# -> and return the i-th element as request
import heapq
class Solution(object):
def nthUglyNumber(self, n):
# NOTE : we init heap as [1], visited = set([1])
= [1]
heap = set([1])
visited for i in range(n):
# NOTE !!! trick here, we use last element via heappop
= heapq.heappop(heap)
val # and we genrate ugly by ourself
for factor in [2,3,5]:
if val*factor not in visited:
*factor)
heapq.heappush(heap, val*factor)
visited.add(valreturn val
// java
// LC 373
// V0-1
// IDEA: PQ (fixed by gpt)
/**
* IDEA:
*
* ✅ Use a min-heap (priority queue) to:
*
* - Always retrieve the next smallest sum pair
*
* - Efficiently keep track of candidates
*
*/
public List<List<Integer>> kSmallestPairs_0_1(int[] nums1, int[] nums2, int k) {
List<List<Integer>> res = new ArrayList<>();
if (nums1 == null || nums2 == null || nums1.length == 0 || nums2.length == 0 || k <= 0) {
return res;
}
// Min-heap to store [sum, index in nums1, index in nums2]
/**
* NOTE !!!
*
* min PQ structure:
*
* [ sum, nums_1_idx, nums_2_idx ]
*
*
* - Heap stores: int[] {sum, index in nums1, index in nums2}
*
* - It's sorted by sum = nums1[i] + nums2[j]
*
*/
PriorityQueue<int[]> minHeap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
// Add the first k pairs (nums1[0] + nums2[0...k])
/** NOTE !!!
*
* we init PQ as below:
*
* - We insert first k pairs: (nums1[i], nums2[0])
*
* - Why nums2[0]?
* -> Because nums2 is sorted,
* so (nums1[i], nums2[0]) is the smallest possible for that row.
*
*
* -> so, we insert `nums_1[i] + nums_2[0]` to PQ for now
*
*
*/
for (int i = 0; i < nums1.length && i < k; i++) {
.offer(new int[] { nums1[i] + nums2[0], i, 0 });
minHeap}
/** NOTE !!! Pop from Heap and Expand
*
* - Poll the `smallest` sum pair (i, j) and add it to result.
*
* - You now consider the next element in that row, which is (i, j + 1).
*
*/
while (k > 0 && !minHeap.isEmpty()) {
// current smallest val from PQ
int[] current = minHeap.poll();
int i = current[1]; // index in nums1
int j = current[2]; // index in nums2
.add(Arrays.asList(nums1[i], nums2[j]));
res
/**
* NOTE !!! Push the Next Pair in the Same Row
*
* - This ensures you're exploring pairs in increasing sum order:
*
* - From (i, 0) → (i, 1) → (i, 2) ...
*
* - Since the arrays are sorted, this gives increasing sums
*
*
*/
if (j + 1 < nums2.length) {
.offer(new int[] { nums1[i] + nums2[j + 1], i, j + 1 });
minHeap}
--;
k}
return res;
}