Finding the union of two sorted arrays is a common programming problem that tests a developer’s ability to merge sorted data efficiently. It comes up frequently in coding interviews and competitive programming. In this article, we will discuss the concept, the main approaches, and optimized solutions for finding the union of two sorted arrays, with implementations in Python.
What Is the Union of Two Sorted Arrays?
The union of two arrays refers to a set that contains all unique elements from both arrays. Since the given arrays are already sorted, the task is to merge them while ensuring that duplicates are removed.
Example
Input:
arr1 = [1, 2, 3, 4, 5]
arr2 = [2, 3, 5, 6]
Output:
[1, 2, 3, 4, 5, 6]
Here, elements 2, 3, and 5 are common to both arrays, but they appear only once in the final output.
Approaches to Solve the Problem
There are multiple ways to solve this problem efficiently. Let’s explore some of the best approaches.
1. Using Set Data Structure (Brute Force Approach)
One of the simplest ways to find the union is by inserting all elements into a set. Since sets do not allow duplicates, this method automatically removes redundant values.
Algorithm:
- Insert all elements from the first array into a set.
- Insert all elements from the second array into the same set.
- Convert the set back to a sorted list and return it.
Time Complexity:
- O((N + M) log(N + M)), where N and M are the array lengths. Building the set takes O(N + M) on average, and sorting the result dominates.
Implementation in Python:
    def union_of_sorted_arrays(arr1, arr2):
        result_set = set(arr1 + arr2)  # Using set to store unique elements
        return sorted(result_set)  # Sorting to maintain order

    # Example usage
    arr1 = [1, 2, 3, 4, 5]
    arr2 = [2, 3, 5, 6]
    print(union_of_sorted_arrays(arr1, arr2))
2. Two-Pointer Approach (Optimal Solution)
Since the arrays are already sorted, we can use the two-pointer technique to efficiently merge them while avoiding duplicates.
Algorithm:
- Use two pointers, one for each array.
- Compare the elements at both pointers and insert the smaller one into the result.
- If elements are the same, add one and move both pointers.
- Continue until both arrays are fully traversed.
Time Complexity:
- O(N + M) (since each element is processed once).
Implementation in Python:
    def union_sorted_arrays(arr1, arr2):
        i, j = 0, 0
        result = []

        def add(value):
            # Append only if it differs from the last value written,
            # so repeats within a single array are skipped as well
            if not result or result[-1] != value:
                result.append(value)

        while i < len(arr1) and j < len(arr2):
            if arr1[i] < arr2[j]:
                add(arr1[i])
                i += 1
            elif arr1[i] > arr2[j]:
                add(arr2[j])
                j += 1
            else:
                add(arr1[i])
                i += 1
                j += 1  # Move both pointers to skip duplicates
        while i < len(arr1):  # If elements are left in arr1
            add(arr1[i])
            i += 1
        while j < len(arr2):  # If elements are left in arr2
            add(arr2[j])
            j += 1
        return result

    # Example usage
    arr1 = [1, 2, 3, 4, 5]
    arr2 = [2, 3, 5, 6]
    print(union_sorted_arrays(arr1, arr2))  # [1, 2, 3, 4, 5, 6]
3. Merging Without Extra Space (In-Place Approach)
If memory optimization is a concern, we can modify one of the existing arrays in-place and avoid using extra space. However, this approach is complex and can affect performance.
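The article does not spell out an in-place algorithm, so the following is only a sketch of one building block such an approach would likely need: removing duplicates from a single sorted list in place with a read pointer and a write pointer. The function name `dedupe_sorted_in_place` is hypothetical, and this is not a full in-place union of two arrays.

```python
def dedupe_sorted_in_place(arr):
    """Remove duplicates from a sorted list in place; return the new length."""
    if not arr:
        return 0
    write = 1  # next position to receive a new distinct value
    for read in range(1, len(arr)):
        # A value is new only if it differs from the last one kept
        if arr[read] != arr[write - 1]:
            arr[write] = arr[read]
            write += 1
    del arr[write:]  # drop the leftover tail
    return write


data = [1, 1, 2, 3, 3, 3, 4]
dedupe_sorted_in_place(data)
print(data)  # [1, 2, 3, 4]
```

Only a constant number of extra variables are used here; the list itself is overwritten from the front as distinct values are found.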
Comparison of Different Approaches
Approach | Time Complexity | Space Complexity | Notes |
---|---|---|---|
Using Set | O((N + M) log(N + M)) | O(N + M) | Easy to implement but not optimal |
Two-Pointer | O(N + M) | O(N + M) | Optimal for sorted arrays |
In-Place Merge | O(N + M) | O(1) | Space-efficient but complex |
Use Cases of This Problem
The problem of finding the union of two sorted arrays has many real-world applications, including:
- Database Merging: When combining sorted datasets from two different sources.
- Search Engine Indexing: Merging search results from multiple sources without duplicates.
- Big Data Processing: Handling large-scale data in sorted format efficiently.
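When more than two sorted sources are involved, as in the merging scenarios above, the same idea generalizes. The sketch below is one possible way to do it with Python's standard `heapq.merge` and `itertools.groupby`; the function name `union_sorted_sources` and the sample `shard_*` lists are made up for illustration.

```python
import heapq
from itertools import groupby


def union_sorted_sources(*sources):
    """Lazily yield the sorted union of any number of sorted iterables."""
    merged = heapq.merge(*sources)  # k-way merge that preserves sorted order
    for value, _group in groupby(merged):  # collapse runs of equal values
        yield value


shard_a = [1, 4, 4, 9]
shard_b = [2, 4, 10]
shard_c = [1, 9, 11]
print(list(union_sorted_sources(shard_a, shard_b, shard_c)))
# [1, 2, 4, 9, 10, 11]
```

Because the function is a generator, values are produced one at a time, which can help when the inputs are too large to hold as one combined list.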
Common Mistakes and How to Avoid Them
- Not Handling Duplicates Properly:
  - Ensure that duplicates are removed while merging, including values repeated within a single array. A two-pointer merge handles this if each value is compared with the last one added to the result.
- Using Sorting After Merging:
  - Sorting after merging is unnecessary and increases time complexity. Instead, use the two-pointer method.
- Not Considering Edge Cases:
  - Handle cases where one array is empty or both arrays contain the same elements.
Edge Cases to Consider
Case | Example Input | Expected Output |
---|---|---|
One array is empty | [1, 2, 3], [] | [1, 2, 3] |
Both arrays have no common elements | [1, 3, 5], [2, 4, 6] | [1, 2, 3, 4, 5, 6] |
All elements are the same | [1, 1, 1], [1, 1] | [1] |
Large input size | [1, 2, ..., 100000], [50000, ..., 150000] | Efficient merging needed |
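The cases in the table can be turned into quick checks. The sketch below uses a compact, self-contained two-pointer variant named `sorted_union` (a hypothetical name) that compares each value with the last one written, so repeats within one array are skipped too. For the large-input row it assumes the elided ranges are consecutive integers.

```python
def sorted_union(a, b):
    """Two-pointer union that also skips repeats within one array."""
    i = j = 0
    out = []
    while i < len(a) or j < len(b):
        # Take from a when b is exhausted or a's current value is not larger
        if j == len(b) or (i < len(a) and a[i] <= b[j]):
            value = a[i]
            i += 1
        else:
            value = b[j]
            j += 1
        # Append only values that differ from the last one written
        if not out or out[-1] != value:
            out.append(value)
    return out


assert sorted_union([1, 2, 3], []) == [1, 2, 3]
assert sorted_union([1, 3, 5], [2, 4, 6]) == [1, 2, 3, 4, 5, 6]
assert sorted_union([1, 1, 1], [1, 1]) == [1]

# Large input, assuming consecutive integers in both ranges
big_a = list(range(1, 100001))
big_b = list(range(50000, 150001))
assert sorted_union(big_a, big_b) == list(range(1, 150001))
print("all edge cases pass")
```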
Finding the union of two sorted arrays is a fundamental concept in programming. The two-pointer approach is the most efficient solution, as it processes both arrays in a single pass in O(N + M) time. Other methods, such as using a set, are easier to implement but slower, because the result has to be sorted again.
This problem is frequently asked in coding interviews and competitive programming. Understanding its optimal solution can help in solving more complex data merging and processing tasks efficiently.