Alternative approaches for Maps with primitive values

While profiling the current state of OpenPatrician I noted that there are maps that are quite memory intensive and they have primitive values. As the map cannot hold primitive values they need to be boxed into their corresponding object, which adds overhead. Therefore I asked about alternative approaches. To figure out what meets my requirement best I did a small benchmark of my own comparing some of the available collection libraries.

The contestants

The contestants were the HashMap and ConcurrentHashMap from the JDK mainly for the baseline to compare to. Then there is the MutableObjectDoubleMap from the Eclipse Collection framework (formerly Goldman Sachs (GS)), then the TObjectDoubleHashMap from the Trove library. Further the ObjectDoubleScatterMap of the High Performance Primitive Collections by Carrotsearch and finally from FastUtil the Object2DoubleLinkedOpenHashMap.

The setup and meassure requirements

Into each map like structure 300’000 key-value pairs are inserted. The keys are the index of the insertion into the map to ensure uniqueness of the keys. Values are random numbers in the range [0,1).

Measured is the total time it takes to insert all the 300’000 entries into the map, which includes the generation of the random numbers. Insertion of new value does not have a high importance as the bulk is inserted during initialisation and very few new entries are added later on. More important is the lookup which happens quite often and should therefore be performant. The timing here happend by doing 1’000 lookups of keys that are present in the map and then averaging the time it took.

Finally the memory consumption is looked at using VisualVM by doing a heap dump and taking a look at the retained memory for the maps.

The Code

public class CollectionComparison {

    private static final int MAX_ENTRIES = 300000;
    private Random rnd = new Random();

    private Map<String, Double> baseJDKHashMap = new HashMap<>();
    private Map<String, Double> concurrentJDKHashMap = new ConcurrentHashMap<>();
    private MutableObjectDoubleMap<String> mutableEclipseMap = ObjectDoubleMaps.mutable.empty();
    private TObjectDoubleMap<String> troveMap = new TObjectDoubleHashMap<>();
    private ObjectDoubleMap<String> hppcMap = new ObjectDoubleScatterMap<>();
    private Object2DoubleMap<String> fastUtilMap = new Object2DoubleLinkedOpenHashMap();

    public static void main(String[] args) {
        CollectionComparison comparison = new CollectionComparison();

        comparison.fillJDKHashMap();
        comparison.fillJDKConcurrentHashMap();
        comparison.fillEclipseMap();
        comparison.fillTroveMap();
        comparison.fillHPPCMap();
        comparison.fillFastUtilMap();

        comparison.lookupBenchmark();

        System.out.println("Please take heap dump");

        try {
            Thread.sleep(5 * 60 * 1000l);   // Wait so a heap dump can be taken
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }



    private void fillJDKHashMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            baseJDKHashMap.put(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the JDK HashMap took "+took+"ms");
    }

    private void fillJDKConcurrentHashMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            concurrentJDKHashMap.put(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the JDK ConcurrentHashMap took "+took+"ms");
    }
    private void fillEclipseMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            mutableEclipseMap.addToValue(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the Eclipse map took "+took+"ms");
    }
    private void fillTroveMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            troveMap.put(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the Trove map took "+took+"ms");
    }

    private void fillHPPCMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            hppcMap.addTo(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the HPPC map took "+took+"ms");
    }

    private void fillFastUtilMap() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < MAX_ENTRIES; i++) {
            double next = rnd.nextDouble();
            String key = String.valueOf(i); // ensure unique keys
            fastUtilMap.put(key, next);
        }
        long took = System.currentTimeMillis() - start;
        System.out.println("Filling in "+MAX_ENTRIES+" into the FastUtil map took "+took+"ms");
    }

    private void lookupBenchmark() {
        long jdkHashMapTime = 0;
        long jdkConcurrentMapTime = 0;
        long eclipseMapTime = 0;
        long troveMapTime = 0;
        long hppcMapTime = 0;
        long fastUtilMapTime = 0;
        final int nbLookups = 1000;
        for (int i = 0; i < nbLookups; i++) {
            int nextIndex = rnd.nextInt(MAX_ENTRIES);
            String key = String.valueOf(nextIndex);
            long start = System.nanoTime();
            boolean contained = baseJDKHashMap.containsKey(key);
            long took = System.nanoTime() - start;
            jdkHashMapTime += took;
            assertTrue(contained);

            start = System.nanoTime();
            contained = concurrentJDKHashMap.containsKey(key);
            took = System.nanoTime() - start;
            jdkConcurrentMapTime += took;
            assertTrue(contained);

            start = System.nanoTime();
            contained = mutableEclipseMap.containsKey(key);
            took = System.nanoTime() - start;
            eclipseMapTime += took;
            assertTrue(contained);

            start = System.nanoTime();
            contained = troveMap.containsKey(key);
            took = System.nanoTime() - start;
            troveMapTime += took;
            assertTrue(contained);

            start = System.nanoTime();
            contained = hppcMap.containsKey(key);
            took = System.nanoTime() - start;
            hppcMapTime += took;
            assertTrue(contained);

            start = System.nanoTime();
            contained = fastUtilMap.containsKey(key);
            took = System.nanoTime() - start;
            fastUtilMapTime += took;
            assertTrue(contained);
        }
        System.out.println(nbLookups+" lookups average in JDK HashMap took: "+ (jdkHashMapTime/nbLookups)+"ns");
        System.out.println(nbLookups+" lookups average in JDK Concurrent HashMap took: "+ (jdkConcurrentMapTime/nbLookups)+"ns");
        System.out.println(nbLookups+" lookups average in Eclipse Map took: "+ (eclipseMapTime/nbLookups)+"ns");
        System.out.println(nbLookups+" lookups average in Trove Map took: "+ (troveMapTime/nbLookups)+"ns");
        System.out.println(nbLookups+" lookups average in HPPC Map took: "+ (hppcMapTime/nbLookups)+"ns");
        System.out.println(nbLookups+" lookups average in FastUtil Map took: "+ (fastUtilMapTime/nbLookups)+"ns");
    }
}

The Results

Comparison of the various maps delivered the following results:

Filling in 300000 into the JDK HashMap took 107ms
Filling in 300000 into the JDK ConcurrentHashMap took 152ms
Filling in 300000 into the Eclipse map took 107ms
Filling in 300000 into the Trove map took 855ms
Filling in 300000 into the HPPC map took 93ms
Filling in 300000 into the FastUtil map took 163ms
1000 lookups average in JDK HashMap took: 550ns
1000 lookups average in JDK Concurrent HashMap took: 748ns
1000 lookups average in Eclipse Map took: 894ns
1000 lookups average in Trove Map took: 1033ns
1000 lookups average in HPPC Map took: 523ns
1000 lookups average in FastUtil Map took: 680ns
JDK HashMap:            43'809'895B
JDK Concurrent HashMap: 43'653'740B => save  0.36%
Eclipse Map:            35'755'084B => save 18.39%
Trove Map:              32'147'798B => save 26.62%
HPPC Map:               27'366'533B => save 37.53%
FastUtil Map:           31'560'889B => save 27.96%

Conclusion

Surprisingly the ConcurrentHashMap requires slightly less memory than the HashMap. While the Trove library is memory wise in the middle it’s performance is abominable: On insertions it’s 8x slower than the HashMap and on the lookup it still requires the double time. Here the tradeof for memory versus performance just does not hold up especially when comparing with other libraries that range similarly.

The one Map implementation that clearly swings out above the others is the ObjectDoubleScatterMap of HPPC: Insertion seems to be at least as good as with a HashMap. The performance on insertion is very dependent on the inserted keys, because when there are a lot of collisions during insertion, this has a negative effect. The lookup time is also comparable to the one of the HashMap, which no other framework can claim. And finally the saved memory for a map with primitive double values for types is about 1/3.

If you are interested in a closer look at the performance, I can suggest this comparison.

Ein Gedanke zu „Alternative approaches for Maps with primitive values“

  1. Pingback: Y-O-Y? (Part II) |

Schreibe einen Kommentar