Post

Backtracking - Subsets II

All diagrams presented herein are original creations, meticulously designed to enhance comprehension and recall. Crafting these aids required considerable effort, and I kindly request attribution if this content is reused elsewhere.

Difficulty : Medium

DFS, Pruning

Problem

Given an integer array nums that may contain duplicates, return all possible subsets (the power set).

The solution set must not contain duplicate subsets. Return the solution in any order.

Example 1:

1
2
Input: nums = [1,2,2]
Output: [[],[1],[1,2],[1,2,2],[2],[2,2]]

Example 2:

1
2
Input: nums = [0]
Output: [[],[0]]

Solution

If we follow the exact same solution as Subsets, you will find two duplicates for the input array [1,2,2]. I have highlighted all the 2’s which were 3 in the Subsets problem as the input array changed from [1,2,3] to [1,2,2]

subsets2

The solution we want is the following. The issue is after popping the value 2 from the subset we are incrementing the index again however since there is another 2 in the new index our results now has duplicates. We still want the previous results if the input is [1,2,3] instead of [1,2,2].

image-20240513103516342

The way to resolve this is to increment index such a way it does not land in another 2 after popping 2 from the subset. The first step is to make sure the input array is sorted. This way all duplicates will be continuously repeated.

1
nums.sort()

Now, we can keep incrementing index until it finds a new value other than current nums[index] or reaches len(nums)-1. So in case of [1,2,2], after we pop 2 from subset and instead of calling dfs(index+1)= dfs(1+1) we keep incrementing index until it reaches len(nums)-1 and then we call dfs(index+1)= dfs(2+1). This way the terminating condition is triggered and [1] will be added to the output.

This way the highlighted traversal does not execute and the duplicate entry does not get added to the subset array.

image-20240513105943011

Here is another example of input array [1,2,2,3], when we come back to index=0, instead of incrementing it by +1 we shift it to 2+1.

image-20240513110644501

Here is just 2 additional lines of code to add to the existing Subsets solution. The reason we are using ` < len(nums)-1 is, we have to compare with the next value so setting < len(nums) will cause out of bound exception and we are going to increment index when calling dfs()` anyway.

1
2
while index < len(nums)-1 and nums[index]==nums[index+1]:
  index+=1

And thats all the changes needed for this to work. If it’s not yet clear, my suggestion is to work out an example using on paper.

Final Code

Here is the full code.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
def subsets(nums):
    output = []
    subset = []
    
    # Sort the values
    nums.sort()

    def dfs(index):
        if index >= len(nums):
            output.append(subset.copy())
            return

        subset.append(nums[index])
        dfs(index + 1)

        subset.pop()
        
        # Skip all the duplicates
        while index < len(nums)-1 and nums[index] == nums[index+1]:
            index+=1
            
        dfs(index + 1)

    dfs(0)
    return output
This post is licensed under CC BY 4.0 by the author.