721. Accounts Merge

Problem Statement

Given a list of accounts where each element accounts[i] is a list of strings, where the first element accounts[i][0] is a name, and the rest of the elements are emails representing emails of the account.

Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.

After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.

Examples

Example 1:

Input: accounts = [["John","johnsmith@mail.com","john_newyork@mail.com"],["John","johnsmith@mail.com","john00@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Output: [["John","john00@mail.com","john_newyork@mail.com","johnsmith@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Explanation:
The first and second John's accounts are merged because they share the email "johnsmith@mail.com".
The third John's account is not merged because it doesn't share any email with the other accounts.

Example 2:

Input: accounts = [["Gabe","Gabe0@m.co","Gabe3@m.co","Gabe1@m.co"],["Kevin","Kevin3@m.co","Kevin5@m.co","Kevin0@m.co"],["Ethan","Ethan5@m.co","Ethan4@m.co","Ethan0@m.co"],["Hanzo","Hanzo3@m.co","Hanzo1@m.co","Hanzo0@m.co"],["Fern","Fern5@m.co","Fern1@m.co","Fern0@m.co"]]
Output: [["Ethan","Ethan0@m.co","Ethan4@m.co","Ethan5@m.co"],["Gabe","Gabe0@m.co","Gabe1@m.co","Gabe3@m.co"],["Hanzo","Hanzo0@m.co","Hanzo1@m.co","Hanzo3@m.co"],["Kevin","Kevin0@m.co","Kevin3@m.co","Kevin5@m.co"],["Fern","Fern0@m.co","Fern1@m.co","Fern5@m.co"]]

Constraints

1 <= accounts.length <= 1000
2 <= accounts[i].length <= 10
1 <= accounts[i][0].length <= 10
accounts[i][j] for j > 0 is a valid email.

Clarification Questions

Before diving into the solution, here are 5 important clarifications and assumptions to discuss during an interview:

Merging rule: When should accounts be merged? (Assumption: Two accounts belong to the same person if they share at least one common email)
Name handling: What if accounts have different names but share emails? (Assumption: If emails match, they’re the same person - use the name from first account or clarify)
Email uniqueness: Can the same email appear in multiple accounts? (Assumption: Yes - that’s how we identify accounts to merge)
Output format: How should merged accounts be formatted? (Assumption: List with name first, then sorted unique emails)
Transitive merging: If A shares email with B, and B shares email with C, should A, B, C all be merged? (Assumption: Yes - transitive closure - use Union-Find to handle this)

Interview Deduction Process (20 minutes)

Step 1: Brute-Force Approach (5 minutes)

For each account, check against all other accounts to see if they share any email. If they do, merge them by combining their email lists. Repeat until no more merges are possible. This approach has O(n² × m) complexity where n is the number of accounts and m is the average number of emails per account, which is too slow. The transitive closure requirement makes this even more complex, as we need multiple passes.

Step 2: Semi-Optimized Approach (7 minutes)

Use a hash map from email to account index. For each account, check if any of its emails already exist in the map. If yes, merge the current account into the existing one. This handles direct matches but doesn’t handle transitive relationships efficiently. For example, if account A shares email with B, and B shares email with C, we might not merge A and C correctly without additional passes or a more sophisticated data structure.

Step 3: Optimized Solution (8 minutes)

Use Union-Find (Disjoint Set Union) to group emails that belong to the same person. Create a mapping from email to a unique index, and use Union-Find to union all emails within the same account, and also union emails that appear in multiple accounts. After all union operations, group emails by their root parent. For each group, retrieve the account name (from the first account containing any email in that group) and sort the emails. This achieves O(n × m × α(n)) complexity where α is the inverse Ackermann function (effectively constant). The key insight is using Union-Find to handle transitive relationships automatically - if email1 and email2 are in account A, and email2 and email3 are in account B, Union-Find will correctly group all three emails together.

Solution Approach

This is a Union-Find (Disjoint Set Union) problem. The key insight is that emails belonging to the same account should be grouped together, and if two accounts share any email, they should be merged.

Key Insights:

Email as Identifier: Use emails as unique identifiers (not names, since names can be duplicated)
Union-Find: Group emails that belong to the same account using Union-Find
Email Mapping: Map each email to a unique index and to its owner’s name
Grouping: After union operations, group emails by their root parent
Sorting: Sort emails within each merged account

Algorithm:

Assign IDs: Map each unique email to a unique index
Union Emails: For each account, union all emails in that account
Group by Root: Group emails by their root parent (connected component)
Build Result: For each group, create an account with name and sorted emails

Solution

Solution: Union-Find (Disjoint Set Union)

class UnionFind {
public:
    UnionFind(int n) {
        parent.resize(n);
        for(int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    void unionSet(int idx1, int idx2) {
        parent[find(idx2)] = find(idx1);
    }

    int find(int idx) {
        if(parent[idx] != idx) {
            parent[idx] = find(parent[idx]);
        }
        return parent[idx];
    }

private:
    vector<int> parent;
};

class Solution {
public:
    vector<vector<string>> accountsMerge(vector<vector<string>>& accounts) {
        unordered_map<string, int> emailToIdx;
        unordered_map<string, string> emailToName;

        // 1: Assign an ID to each unique email
        int id = 0;
        for(auto& account: accounts) {
            string& name = account[0];
            int size = account.size();
            for(int i = 1; i < size; i++) {
                string& email = account[i];
                if(!emailToIdx.contains(email)) {
                    emailToIdx[email] = id++;
                    emailToName[email] = name;
                }
            }
        }

        // 2: Union emails belonging to the same account
        UnionFind uf(id);
        for(auto& account: accounts) {
            string& firstEmail = account[1];
            int firstIdx = emailToIdx[firstEmail];
            int size = account.size();
            for(int i = 2; i < size; i++) {
                string& nextEmail = account[i];
                int nextIndex = emailToIdx[nextEmail];
                uf.unionSet(firstIdx, nextIndex);
            }
        }

        // 3: Group emails by their root parent        
        unordered_map<int, vector<string>> idxToEmails;
        for(auto& [email, _]: emailToIdx) {
            int idx = uf.find(emailToIdx[email]);
            vector<string>& account = idxToEmails[idx];;
            account.emplace_back(email);
            idxToEmails[idx] = account;
        }

        // 4: Build the merged account list
        vector<vector<string>> merged;
        for(auto& [_, emails]: idxToEmails) {
            sort(emails.begin(), emails.end());
            string& name = emailToName[emails[0]];
            vector<string> account;
            account.emplace_back(name);
            for(auto& email:emails) {
                account.emplace_back(email);
            }
            merged.emplace_back(account);
        }
        return merged;
    }
};

Algorithm Explanation:

Union-Find Class (Lines 1-23):
- Constructor: Initialize parent array where each element is its own parent
- unionSet: Union two sets by making idx2’s root point to idx1’s root
- find: Find root with path compression (flattens the tree for efficiency)
Step 1: Assign IDs (Lines 30-41):
- Iterate through all accounts
- For each email in an account:
  - If email not seen before, assign a unique id and map to name
  - Store mappings: emailToIdx[email] = id and emailToName[email] = name
Step 2: Union Emails (Lines 43-53):
- For each account, union all emails in that account
- Strategy: Union all emails to the first email in the account
- This groups all emails from the same account together
Step 3: Group by Root (Lines 55-61):
- For each email, find its root parent using uf.find()
- Group emails by their root parent in idxToEmails
- Emails with the same root belong to the same merged account
Step 4: Build Result (Lines 63-73):
- For each group of emails:
  - Sort emails alphabetically
  - Get the name from the first email (all emails in group have same name)
  - Create account: [name, email1, email2, ...]
  - Add to result

Why This Works:

Union-Find Groups: Emails in the same account are unioned, so they share the same root
Transitive Merging: If account A shares email with B, and B shares with C, then A, B, C are all merged (transitive property)
Path Compression: Makes find operations efficient (nearly O(1) amortized)
Email as Key: Using emails (not names) correctly handles duplicate names

Example Walkthrough:

For accounts = [["John","a@mail.com","b@mail.com"],["John","a@mail.com","c@mail.com"],["Mary","d@mail.com"]]:

Step 1: Assign IDs
emailToIdx: {
    "a@mail.com": 0,
    "b@mail.com": 1,
    "c@mail.com": 2,
    "d@mail.com": 3
}
emailToName: {
    "a@mail.com": "John",
    "b@mail.com": "John",
    "c@mail.com": "John",
    "d@mail.com": "Mary"
}

Step 2: Union emails
Account 1: ["John","a@mail.com","b@mail.com"]
  - Union(0, 1): parent[1] = 0
  - parent = [0, 0, 2, 3]

Account 2: ["John","a@mail.com","c@mail.com"]
  - Union(0, 2): parent[2] = 0
  - parent = [0, 0, 0, 3]

Account 3: ["Mary","d@mail.com"]
  - No union needed (only one email)
  - parent = [0, 0, 0, 3]

Step 3: Group by root
idxToEmails: {
    0: ["a@mail.com", "b@mail.com", "c@mail.com"],
    3: ["d@mail.com"]
}

Step 4: Build result
Group 0: 
  - Sort: ["a@mail.com", "b@mail.com", "c@mail.com"]
  - Name: "John"
  - Account: ["John", "a@mail.com", "b@mail.com", "c@mail.com"]

Group 3:
  - Sort: ["d@mail.com"]
  - Name: "Mary"
  - Account: ["Mary", "d@mail.com"]

Result: [
    ["John", "a@mail.com", "b@mail.com", "c@mail.com"],
    ["Mary", "d@mail.com"]
]

Complexity Analysis:

Time Complexity: O(n × k × α(n × k) + n × k × log(n × k))
- n = number of accounts, k = average emails per account
- O(n × k) for assigning IDs
- O(n × k × α(n × k)) for union operations (α is inverse Ackermann, nearly constant)
- O(n × k × log(n × k)) for sorting emails in each group
- Total: O(n × k × log(n × k)) dominated by sorting
Space Complexity: O(n × k)
- emailToIdx: O(n × k) for all emails
- emailToName: O(n × k) for all emails
- idxToEmails: O(n × k) for grouped emails
- parent array: O(n × k)

Key Insights

Email as Unique Identifier: Names can be duplicated, but emails are unique
Union-Find for Grouping: Efficiently groups emails that belong together
Transitive Merging: If A shares email with B, and B shares with C, then A, B, C are merged
Path Compression: Critical for efficiency in Union-Find operations

Edge Cases

Single email per account: [["John","a@mail.com"]] → return [["John","a@mail.com"]]
No shared emails: All accounts remain separate
All emails shared: All accounts merge into one
Duplicate emails in same account: Handled correctly (only assigned one ID)

Common Mistakes

Using names as keys: Names can be duplicated, causing incorrect merging
Not sorting emails: Result must have sorted emails
Wrong union logic: Must union all emails in an account, not just pairs
Missing path compression: Without it, find can be O(n) instead of O(α(n))
Name mismatch: Assuming all emails in merged account have same name (they do, but need to verify)

Alternative Approaches

Approach 2: DFS (Depth-First Search)

Build a graph where emails are nodes and edges connect emails in the same account. Use DFS to find connected components.

class Solution {
public:
    vector<vector<string>> accountsMerge(vector<vector<string>>& accounts) {
        unordered_map<string, vector<string>> graph;
        unordered_map<string, string> emailToName;
        
        // Build graph
        for(auto& account: accounts) {
            string name = account[0];
            for(int i = 1; i < account.size(); i++) {
                graph[account[1]].push_back(account[i]);
                graph[account[i]].push_back(account[1]);
                emailToName[account[i]] = name;
            }
        }
        
        // DFS to find connected components
        vector<vector<string>> result;
        unordered_set<string> visited;
        
        for(auto& [email, _]: graph) {
            if(!visited.count(email)) {
                vector<string> component;
                dfs(graph, email, visited, component);
                sort(component.begin(), component.end());
                component.insert(component.begin(), emailToName[email]);
                result.push_back(component);
            }
        }
        
        return result;
    }
    
private:
    void dfs(unordered_map<string, vector<string>>& graph, 
             string email, unordered_set<string>& visited, 
             vector<string>& component) {
        visited.insert(email);
        component.push_back(email);
        for(string& neighbor: graph[email]) {
            if(!visited.count(neighbor)) {
                dfs(graph, neighbor, visited, component);
            }
        }
    }
};

Time Complexity: O(n × k × log(n × k))
Space Complexity: O(n × k)

LC 323: Number of Connected Components - Count connected components
LC 547: Number of Provinces - Similar connected components problem
LC 684: Redundant Connection - Union-Find for cycle detection
LC 990: Satisfiability of Equality Equations - Union-Find for equality constraints

This problem demonstrates Union-Find for grouping related entities. The key insight is using emails (not names) as unique identifiers and leveraging Union-Find’s transitive property to merge accounts that share any email.

Robina Li

721. Accounts Merge

721. Accounts Merge

Problem Statement

Examples

Constraints

Clarification Questions

Interview Deduction Process (20 minutes)

Solution Approach

Key Insights:

Algorithm:

Solution

Solution: Union-Find (Disjoint Set Union)

Algorithm Explanation:

Why This Works:

Example Walkthrough:

Complexity Analysis:

Key Insights

Edge Cases

Common Mistakes

Alternative Approaches

Approach 2: DFS (Depth-First Search)

Related Posts

Recent Posts

721. Accounts Merge

Problem Statement

Examples

Constraints

Clarification Questions

Interview Deduction Process (20 minutes)

Solution Approach

Key Insights:

Algorithm:

Solution

Solution: Union-Find (Disjoint Set Union)

Algorithm Explanation:

Why This Works:

Example Walkthrough:

Complexity Analysis:

Key Insights

Edge Cases

Common Mistakes

Alternative Approaches

Approach 2: DFS (Depth-First Search)

Related Problems

Related Posts

Recent Posts