Hash buckets and % vs. & ops
FROM SEAJUG MAILING LIST
-------------------------------> Re: hashmap in java7? Stuart Maclean (<<redacted&rt;&rt;) Schedule cleanup 11:39 AM [Keep this message at the top of your inbox] Groups To: seajug@yahoogroups.com Picture of Stuart Maclean In the old days, the bucket count for a map was advised to be prime, e.g. C = 101, so that after the hash() method was called on the object to be either inserted or lookup, you would get a hopefully 'uniform' distribution of bucketIDs, since the hash value mod C and bucket count C would be co-prime. 101 was chosen i think because if you had to increase capacity C, then C = 2C + 1 would still be prime for C = 203, 407. Then, in Java 7 I think, it was realised that the expensive operation is the mod operation needed in identifying the bucket. It would be MUCH cheaper if C was a power of 2, since it reduces this int bucket = hash % C to this int bucket = hash & (C-1) which is is much more 'machine-oriented'. I can vouch for this since one search problem I had was reduced from 2.5 hours to 1.5 hours solely by replacing a 'mod' with an 'and', where the operation was called millions of times in an inner loop. Of course this may not be relevant to the current discussion but I thought it worth sharing... Stuart