Java HashSet

Master the art of storing unique elements with HashSet - the collection that automatically eliminates duplicates in Java.

What is HashSet?

HashSet is a collection that stores only unique elements, meaning no duplicates are allowed. It's part of the java.util package and implements the Set interface. Think of it like a bag of colored balls where you can't have two balls of the exact same color!

HashSet is perfect when you need to ensure all items in your collection are different from each other, like storing unique email addresses, removing duplicate entries, or maintaining a list of unique tags.

  • No Duplicates: Automatically rejects duplicate elements
  • Fast Operations: Add, remove, and search run in O(1) time on average, assuming elements have well-distributed hash codes
  • Unordered: Does not maintain insertion order or any specific order
  • Null Allowed: Can store one null element
  • Not Thread-Safe: Not synchronized for multi-threaded environments
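
If several threads need to share one set, a plain HashSet is not safe to use without external locking. The sketch below shows two standard-library alternatives, Collections.synchronizedSet and ConcurrentHashMap.newKeySet. The class name ThreadSafeSets and its helper methods are made up for this illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; only the java.util calls are real library API.
class ThreadSafeSets {
    // Wraps a HashSet so that every individual call is synchronized.
    static Set<String> synchronizedSet() {
        return Collections.synchronizedSet(new HashSet<>());
    }

    // A concurrent set backed by ConcurrentHashMap (it does not accept null).
    static Set<String> concurrentSet() {
        return ConcurrentHashMap.newKeySet();
    }

    // Adds the same 1000 values from two threads and returns the final size.
    static int fillFromTwoThreads(Set<String> set) {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                set.add("item-" + i);
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(synchronizedSet())); // 1000
        System.out.println(fillFromTwoThreads(concurrentSet()));   // 1000
    }
}
```

With a synchronized wrapper you still need to synchronize manually on the set while iterating over it; the concurrent set avoids that but rejects null elements.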

💡 Key Point: HashSet uses a HashMap internally to store elements. Each element you add becomes a key in the internal HashMap with a dummy value.
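
To make the "keys with a dummy value" idea concrete, here is a minimal sketch of a set built on top of a HashMap. It is a simplified illustration, not the actual java.util.HashSet source; the class name SimpleHashSet and the PRESENT constant are assumptions made for this example.

```java
import java.util.HashMap;

// Hypothetical, simplified set; shows only the idea of storing elements as map keys.
class SimpleHashSet<E> {
    // One shared placeholder object is stored as the value for every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key was not there before.
    public boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    public boolean contains(Object element) {
        return map.containsKey(element);
    }

    // remove() on the map returns the old value, which is PRESENT if the key existed.
    public boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("Red"));      // true
        System.out.println(set.add("Red"));      // false
        System.out.println(set.contains("Red")); // true
        System.out.println(set.size());          // 1
    }
}
```

Because map keys are unique by definition, the set gets duplicate rejection for free, and add() can report whether the element was new.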

Creating and Using HashSet

Here's how to create a HashSet and perform basic operations:

Example: HashSet Basic Operations

import java.util.HashSet;

public class HashSetExample {
    public static void main(String[] args) {
        // Creating a HashSet of Strings
        HashSet<String> colors = new HashSet<>();
        
        // Adding elements
        colors.add("Red");
        colors.add("Blue");
        colors.add("Green");
        colors.add("Yellow");
        
        // Trying to add a duplicate
        boolean added = colors.add("Red");
        System.out.println("Added duplicate Red? " + added);
        
        // Display HashSet
        System.out.println("Colors: " + colors);
        
        // Checking if element exists
        System.out.println("Contains Blue? " + colors.contains("Blue"));
        System.out.println("Contains Purple? " + colors.contains("Purple"));
        
        // Removing an element
        colors.remove("Green");
        System.out.println("After removing Green: " + colors);
        
        // Size of HashSet
        System.out.println("Total colors: " + colors.size());
        
        // Check if empty
        System.out.println("Is empty? " + colors.isEmpty());
    }
}
Output (element order may vary):
Added duplicate Red? false
Colors: [Red, Yellow, Blue, Green]
Contains Blue? true
Contains Purple? false
After removing Green: [Red, Yellow, Blue]
Total colors: 3
Is empty? false

Explanation:

  • HashSet<String> - Creates a HashSet that stores String objects
  • add(element) - Adds element if not already present, returns false if duplicate
  • contains(element) - Checks if the element exists in the set
  • remove(element) - Removes the specified element
  • size() - Returns the number of elements
  • isEmpty() - Checks if the set is empty
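
For your own classes, add() and contains() decide whether two objects are "the same" by calling hashCode() and equals(). The sketch below uses a made-up Tag class (an assumption for this example) to show that two separately created objects count as one element once both methods are overridden consistently.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualityDemo {
    // Hypothetical value class used only for this illustration.
    static final class Tag {
        private final String name;

        Tag(String name) {
            this.name = name;
        }

        // Two tags are equal when their names are equal.
        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Tag)) return false;
            return name.equals(((Tag) other).name);
        }

        // Equal tags must produce the same hash code.
        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    // Adds two equal tags and returns how many elements the set ends up with.
    static int countDistinct() {
        Set<Tag> tags = new HashSet<>();
        tags.add(new Tag("java"));
        tags.add(new Tag("java")); // rejected: equal to the first Tag
        return tags.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct()); // 1
    }
}
```

Without these overrides, the default identity-based equals() would treat the two Tag objects as different, and the set would hold both.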

Removing Duplicates from ArrayList

One of the most common uses of HashSet is to remove duplicates from a list:

Example: Remove Duplicates

import java.util.ArrayList;
import java.util.HashSet;

public class RemoveDuplicates {
    public static void main(String[] args) {
        // ArrayList with duplicates
        ArrayList<Integer> numbers = new ArrayList<>();
        numbers.add(10);
        numbers.add(20);
        numbers.add(10);
        numbers.add(30);
        numbers.add(20);
        numbers.add(40);
        numbers.add(10);
        
        System.out.println("Original List: " + numbers);
        System.out.println("Size: " + numbers.size());
        
        // Remove duplicates using HashSet
        HashSet<Integer> uniqueNumbers = new HashSet<>(numbers);
        
        System.out.println("\nUnique Numbers: " + uniqueNumbers);
        System.out.println("Size: " + uniqueNumbers.size());
        
        // Convert back to ArrayList if needed
        ArrayList<Integer> cleanedList = new ArrayList<>(uniqueNumbers);
        System.out.println("\nCleaned List: " + cleanedList);
    }
}
Output (the order of the unique numbers may vary):
Original List: [10, 20, 10, 30, 20, 40, 10]
Size: 7

Unique Numbers: [40, 20, 10, 30]
Size: 4

Cleaned List: [40, 20, 10, 30]

Explanation:

  • Original ArrayList has 7 elements including duplicates
  • Creating HashSet from ArrayList automatically removes duplicates
  • Result has only 4 unique elements
  • You can convert the HashSet back to an ArrayList when you need index-based access, but the original order of the list is not preserved
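
If the deduplicated list should keep the order of first appearance, one common option is LinkedHashSet, a Set implementation from java.util that remembers insertion order. The following is a minimal sketch; the class name OrderedDedup and the helper dedupKeepingOrder are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {
    // Removes duplicates while keeping the first occurrence of each element in place.
    static <T> List<T> dedupKeepingOrder(List<T> input) {
        LinkedHashSet<T> seen = new LinkedHashSet<>(input);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(10, 20, 10, 30, 20, 40, 10);
        System.out.println(dedupKeepingOrder(numbers)); // [10, 20, 30, 40]
    }
}
```

LinkedHashSet costs a little extra memory per element compared with HashSet, so it is worth using only when the order matters.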

Looping Through HashSet

There are several ways to iterate through a HashSet:

Example: Different Ways to Loop

import java.util.HashSet;
import java.util.Iterator;

public class HashSetLoopExample {
    public static void main(String[] args) {
        HashSet<String> fruits = new HashSet<>();
        fruits.add("Apple");
        fruits.add("Banana");
        fruits.add("Orange");
        fruits.add("Mango");
        
        // Method 1: For-each loop (Most common)
        System.out.println("Method 1: For-each loop");
        for (String fruit : fruits) {
            System.out.println("- " + fruit);
        }
        
        // Method 2: Iterator
        System.out.println("\nMethod 2: Using Iterator");
        Iterator<String> iterator = fruits.iterator();
        while (iterator.hasNext()) {
            System.out.println("- " + iterator.next());
        }
        
        // Method 3: forEach with Lambda (Java 8+)
        System.out.println("\nMethod 3: forEach method");
        fruits.forEach(fruit -> System.out.println("- " + fruit));
    }
}
Output:
Method 1: For-each loop
- Apple
- Mango
- Orange
- Banana

Method 2: Using Iterator
- Apple
- Mango
- Orange
- Banana

Method 3: forEach method
- Apple
- Mango
- Orange
- Banana

Explanation:

  • For-each loop: Simplest and most readable way to iterate
  • Iterator: Useful when you need to remove elements while iterating
  • forEach: Modern Java 8+ approach with clean syntax
  • Note: Order of elements may differ from insertion order
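
The point about removing elements during iteration deserves a short illustration. Calling the set's own remove() inside a for-each loop typically fails with a ConcurrentModificationException, so removal should go through the iterator, or through removeIf on Java 8 and later. This is a sketch; the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class RemoveWhileIterating {
    // Removes every fruit whose name starts with the prefix, using Iterator.remove().
    static Set<String> removeWithIterator(Set<String> fruits, String prefix) {
        Iterator<String> iterator = fruits.iterator();
        while (iterator.hasNext()) {
            if (iterator.next().startsWith(prefix)) {
                iterator.remove(); // safe: the removal goes through the iterator
            }
        }
        return fruits;
    }

    // Same effect with Collection.removeIf (Java 8+).
    static Set<String> removeWithRemoveIf(Set<String> fruits, String prefix) {
        fruits.removeIf(fruit -> fruit.startsWith(prefix));
        return fruits;
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("Apple", "Banana", "Orange", "Mango"));
        Set<String> second = new HashSet<>(first);
        System.out.println(removeWithIterator(first, "B").contains("Banana")); // false
        System.out.println(removeWithRemoveIf(second, "M").contains("Mango")); // false
    }
}
```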

HashSet Operations

1. Basic Operations

Add, remove, and check elements in the HashSet.

Methods: add(element), remove(element), contains(element), clear()

2. Set Operations

Perform mathematical set operations like union, intersection, and difference.

Methods: addAll(collection) (union), retainAll(collection) (intersection), removeAll(collection) (difference)

3. Bulk Operations

Work with multiple elements at once.

Methods: addAll(), removeAll(), containsAll()

4. Utility Methods

Get information about the HashSet.

Methods: size(), isEmpty(), toArray()
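
As a quick illustration of the utility methods, the sketch below copies a set into an array with toArray(), sorts the copy so the display order is predictable, and then empties the set with clear(). The class name UtilityMethodsDemo and the helper toSortedArray are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class UtilityMethodsDemo {
    // Copies the set into a String array and sorts the copy alphabetically.
    static String[] toSortedArray(Set<String> set) {
        String[] items = set.toArray(new String[0]);
        Arrays.sort(items);
        return items;
    }

    public static void main(String[] args) {
        Set<String> tags = new HashSet<>(Arrays.asList("java", "sets", "hashing"));
        System.out.println(Arrays.toString(toSortedArray(tags))); // [hashing, java, sets]

        tags.clear(); // removes every element
        System.out.println(tags.isEmpty()); // true
    }
}
```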

Set Operations Example

HashSet supports mathematical set operations. Let's see union, intersection, and difference:

Example: Mathematical Set Operations

import java.util.HashSet;

public class SetOperations {
    public static void main(String[] args) {
        HashSet<Integer> setA = new HashSet<>();
        setA.add(1);
        setA.add(2);
        setA.add(3);
        setA.add(4);
        
        HashSet<Integer> setB = new HashSet<>();
        setB.add(3);
        setB.add(4);
        setB.add(5);
        setB.add(6);
        
        System.out.println("Set A: " + setA);
        System.out.println("Set B: " + setB);
        
        // Union: All elements from both sets
        HashSet<Integer> union = new HashSet<>(setA);
        union.addAll(setB);
        System.out.println("\nUnion (A ∪ B): " + union);
        
        // Intersection: Common elements
        HashSet<Integer> intersection = new HashSet<>(setA);
        intersection.retainAll(setB);
        System.out.println("Intersection (A ∩ B): " + intersection);
        
        // Difference: Elements in A but not in B
        HashSet<Integer> difference = new HashSet<>(setA);
        difference.removeAll(setB);
        System.out.println("Difference (A - B): " + difference);
        
        // Check subset
        HashSet<Integer> subset = new HashSet<>();
        subset.add(1);
        subset.add(2);
        System.out.println("\nIs {1, 2} subset of A? " + 
                           setA.containsAll(subset));
    }
}
Output:
Set A: [1, 2, 3, 4]
Set B: [3, 4, 5, 6]

Union (A ∪ B): [1, 2, 3, 4, 5, 6]
Intersection (A ∩ B): [3, 4]
Difference (A - B): [1, 2]

Is {1, 2} subset of A? true

Explanation:

  • Union: Combines all unique elements from both sets
  • Intersection: Keeps only elements that exist in both sets
  • Difference: Keeps only the elements of setA that are not in setB
  • Subset: Checks if all elements of one set exist in another
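
Keep in mind that addAll, retainAll, and removeAll modify the set they are called on, which is why the example copies setA before each operation. One way to package that pattern is a set of small helpers that always work on a copy. This is a sketch; the class name SetMath and its method names are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetMath {
    // Returns a new set with every element of a and b; the inputs stay unchanged.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Returns a new set with the elements found in both a and b.
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Returns a new set with the elements of a that are not in b.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> setA = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> setB = new HashSet<>(Arrays.asList(3, 4, 5, 6));
        System.out.println(union(setA, setB).size());        // 6
        System.out.println(intersection(setA, setB).size()); // 2
        System.out.println(difference(setA, setB).size());   // 2
        System.out.println(setA.size());                     // 4 (setA was not modified)
    }
}
```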

Practical Example: Unique Visitor Tracker

Let's create a real-world application that tracks unique website visitors:

Example: Visitor Tracking System

import java.util.HashSet;

public class VisitorTracker {
    public static void main(String[] args) {
        HashSet<String> uniqueVisitors = new HashSet<>();
        
        // Simulating visitor IPs (some visit multiple times)
        String[] visits = {
            "192.168.1.1",
            "192.168.1.2",
            "192.168.1.1",  // Duplicate
            "192.168.1.3",
            "192.168.1.2",  // Duplicate
            "192.168.1.4",
            "192.168.1.1",  // Duplicate
            "192.168.1.5"
        };
        
        System.out.println("Processing website visits...\n");
        
        int totalVisits = 0;
        for (String ip : visits) {
            totalVisits++;
            boolean isNewVisitor = uniqueVisitors.add(ip);
            
            if (isNewVisitor) {
                System.out.println("✓ New visitor: " + ip);
            } else {
                System.out.println("↻ Returning visitor: " + ip);
            }
        }
        
        System.out.println("\n=== Visit Statistics ===");
        System.out.println("Total visits: " + totalVisits);
        System.out.println("Unique visitors: " + uniqueVisitors.size());
        System.out.println("Repeat visits: " + 
                           (totalVisits - uniqueVisitors.size()));
        
        System.out.println("\nAll unique visitors:");
        for (String visitor : uniqueVisitors) {
            System.out.println("- " + visitor);
        }
    }
}
Output:
Processing website visits...

✓ New visitor: 192.168.1.1
✓ New visitor: 192.168.1.2
↻ Returning visitor: 192.168.1.1
✓ New visitor: 192.168.1.3
↻ Returning visitor: 192.168.1.2
✓ New visitor: 192.168.1.4
↻ Returning visitor: 192.168.1.1
✓ New visitor: 192.168.1.5

=== Visit Statistics ===
Total visits: 8
Unique visitors: 5
Repeat visits: 3

All unique visitors:
- 192.168.1.1
- 192.168.1.2
- 192.168.1.3
- 192.168.1.4
- 192.168.1.5

Explanation:

  • HashSet automatically tracks unique visitors without manual duplicate checking
  • add() returns true for new visitors and false for returning ones
  • Can easily calculate statistics like total vs unique visits
  • Perfect for tracking unique items like emails, usernames, or session IDs
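
The same trick with the return value of add() can also answer the opposite question: which items showed up more than once? Here is a minimal sketch; the class name DuplicateFinder and the method findDuplicates are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateFinder {
    // Returns the elements that occur more than once in the input list.
    static <T> Set<T> findDuplicates(List<T> items) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new HashSet<>();
        for (T item : items) {
            // add() returns false when the item was already in 'seen'.
            if (!seen.add(item)) {
                duplicates.add(item);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> visits = Arrays.asList(
            "192.168.1.1", "192.168.1.2", "192.168.1.1", "192.168.1.3", "192.168.1.2");
        Set<String> returning = findDuplicates(visits);
        System.out.println(returning.size());                  // 2
        System.out.println(returning.contains("192.168.1.3")); // false
    }
}
```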

HashSet vs Other Collections

Understanding when to use HashSet compared to other collections:

Quick Comparison

Feature        HashSet               ArrayList              HashMap
──────────────────────────────────────────────────────────────────────────────────────────
Duplicates     Not allowed           Allowed                Keys: no, values: yes
Ordering       No order              Insertion order        No order
Null values    One null              Multiple nulls         One null key, many null values
Access method  Contains check        By index               By key
Performance    O(1) avg add/remove   O(1) access by index   O(1) avg get/put
Use case       Unique items          Ordered list           Key-value pairs

🎯 When to Use HashSet: Choose HashSet when you need to ensure uniqueness of elements and don't care about order. Perfect for: unique tags, removing duplicates, membership testing, mathematical set operations.