How to migrate Oracle Coherence applications to Hazelcast

If you are thinking about migrating your Oracle Coherence application to Hazelcast, for whatever reason, we may have something to make the process a little less painful.

First of all, go ahead and read the excellent introduction to the migration process prepared by Hazelcast: the Oracle Coherence to Hazelcast Migration Guide.

Then, if your Coherence application is based on XML configuration, the migration can be partially automated with our small Oracle Coherence to Hazelcast configuration migration tool, now available as open source on GitHub.

The cohe2hazel project transforms the Coherence operational and cache configuration files into a hazelcast.xml (hazelcast-config-3.6) configuration file using an XSLT processor. This may be useful when you need to generate an initial version of the configuration for the new cache system, which you can then amend and customize further. You can also use the tool in quick PoC projects to demonstrate the ability to run your existing Coherence project on Hazelcast.

During the transformation, the Coherence operational config file is turned into the Hazelcast network section, and the Coherence cache config file into the Hazelcast map section. This is especially helpful when your cache configuration contains a lot of cache definitions (including parameterized ones).
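To give you a flavour of the mapping, a Coherence cache mapping such as orders-* backed by a distributed scheme becomes a Hazelcast map definition. Expressed with Hazelcast's programmatic API for illustration (the tool itself emits XML), the result would look roughly like the sketch below; the cache name and values are hypothetical:

import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;

public class MigratedMapSketch {
    public static void main(String[] args) {
        Config config = new Config();

        // Roughly what a Coherence distributed-scheme cache mapping becomes:
        MapConfig orders = new MapConfig("orders-*"); // Coherence <cache-name>orders-*</cache-name>
        orders.setBackupCount(1);                     // Coherence <backup-count>1</backup-count>
        orders.setTimeToLiveSeconds(0);               // Coherence <expiry-delay> of zero, i.e. no expiry
        config.addMapConfig(orders);
    }
}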

Quick start

Usage:

java -jar cohe2hazel-1.0-SNAPSHOT.jar ${cache}.xml ${operational}.xml

For example:

java -jar cohe2hazel-1.0-SNAPSHOT.jar coherence-cache.xml tangosol-coherence.xml

Output: the generated output.xml file, a base for your main Hazelcast configuration.

The tool performs three operations:

  1. Transforms the Coherence operational XML and creates network.xml as an output.
  2. Transforms the Coherence cache XML and creates cache.xml as an output.
  3. Merges the previously created XML files into a single output.xml.

For more information, please refer to the configuration section of the Hazelcast documentation.
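For example, here is a minimal sketch of how the generated file could be loaded when starting a Hazelcast 3.x member; the output.xml path and the "orders" map name are illustrative assumptions:

import com.hazelcast.config.Config;
import com.hazelcast.config.XmlConfigBuilder;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class StartMigratedMember {
    public static void main(String[] args) throws Exception {
        // Build the configuration from the file generated by cohe2hazel
        Config config = new XmlConfigBuilder("output.xml").build();

        // Start a Hazelcast member with the migrated network and map settings
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

        // Maps defined in the migrated map sections are available as usual
        IMap<String, String> orders = hz.getMap("orders");
        orders.put("1", "first order");
    }
}

Starting a member like this is a quick sanity check that the migrated network and map sections are accepted by Hazelcast.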

Obviously, some limitations apply. Because the configuration models of the two products differ, it is not possible to migrate the whole configuration automatically, so please be aware that only a limited scope of the configuration will be migrated.

You can find more on requirements, limitations and practical use in our github repo.

Migrating your Coherence project now?

If you are interested in any changes to this tool, or in our experience in migrating Oracle Coherence projects to Hazelcast, you may contact us at

How to find missing foreign key indexes in PostgreSQL

If you have ever wondered how to almost immediately speed up a PostgreSQL database that was created without proper foreign key indexes, e.g. with a schema generated by JPA, you may want to use the following query to collect candidate indexes. Remember that PostgreSQL requires an index on the referenced (primary key) side of a foreign key, but never creates one on the referencing column automatically:

select 'CREATE INDEX idx_'||conname||' ON '||conrelid||' ('||attname||');' from (
 SELECT conname, conrelid::regclass, reltuples::bigint, unnest(conkey) AS coloid
 FROM pg_constraint
 JOIN pg_class ON (conrelid = pg_class.oid)
 WHERE contype = 'f'
 AND NOT EXISTS (
  SELECT 1 FROM pg_index
  WHERE indrelid = conrelid
  AND conkey @> indkey::int2[] AND conkey <@ indkey::int2[]
 )
 ORDER BY reltuples DESC
) as cols
join pg_attribute pga on pga.attrelid = cols.conrelid and pga.attnum = cols.coloid;

It returns a list of DDL statements that create the missing foreign key indexes, ready for execution after your review (note that a composite foreign key will produce one single-column statement per column, so review those before running them):

"CREATE INDEX idx_fk_4ywoftpugx4dycijy8i9tyhwb ON profile (project_id);"
"CREATE INDEX idx_fk_o4al1lv1rgjw8m8xvrwnv797f ON profile_user_data (profile_id);"
"CREATE INDEX idx_fk_euwbenmq4r06p1k5cjlypng8s ON profile_user_data (user_id);"
"CREATE INDEX idx_fk_b7i81l1tk1ph95xnhtoftyv53 ON task (project_id);"
"CREATE INDEX idx_fk_1ytmg44m35ff160cofosaum8h ON user_projects (project_id);"
"CREATE INDEX idx_fk_8xi9c7nbxov8bdubwdcgahnjp ON user_projects (user_id);"

Why latency matters

We care a lot about latency at Codedose, and not only when building high-performance computing processes but for any type of application, really.

Why is that?

We have observed a major shift in hardware availability and its cost/performance ratio over the last 10 years, and while this also applies to processors, the real impact comes from RAM and SSD storage devices.

Memory has become cheap, large and fast, and you can order real memory monsters with terabytes of RAM with a few clicks right now. Well, your credit card limit applies, but jokes aside – just look at the Dell PowerEdge R920 or HP ProLiant DL580. Yes, 6 TB of DDR4 RAM in one physical box at your service.

SSD storage devices are amazingly fast, typically at least 20x faster than good old spinning disks. You can easily extend your desktop or laptop with Samsung 850 PRO disks, or go with specialised SSD cards for your servers.

What does it mean to us, Java devs?

Just take a look at this brilliant summary of the latencies involved in transferring data between the different physical units of your computer system: Latency Numbers Every Programmer Should Know:

Latency Comparison Numbers
 L1 cache reference 0.5 ns
 Branch mispredict 5 ns
 L2 cache reference 7 ns 14x L1 cache
 Mutex lock/unlock 25 ns
 Main memory reference 100 ns 20x L2 cache, 200x L1 cache
 Compress 1K bytes with Zippy 3,000 ns 3 us
 Send 1K bytes over 1 Gbps network 10,000 ns 10 us
 Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD
 Read 1 MB sequentially from memory 250,000 ns 250 us
 Round trip within same datacenter 500,000 ns 500 us
 Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory
 Disk seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip
 Read 1 MB sequentially from disk 20,000,000 ns 20,000 us 20 ms 80x memory, 20X SSD
 Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms

 1 ns = 10^-9 seconds
 1 us = 10^-6 seconds = 1,000 ns
 1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns

 By Jeff Dean, originally by Peter Norvig.

Yes, main memory is 4x faster than SSD, and SSD is 20x faster than a spinning hard drive (compare the 1 MB sequential reads: 250 us from memory vs 1,000 us from SSD vs 20,000 us from disk).

So what are the lessons for us?

  1. “Your data fits in RAM. Yes, it does.” – see RAM is the New SSD.
  2. Design your system to store all data in memory or cache aggressively otherwise. Observe and eliminate I/O hits.
  3. Process data in memory and leave the results there.
  4. Worried about persistence and transactional control? Try restructuring your processing to write data asynchronously to a persistent storage device, and measure your throughput/latency twice – it may turn out that you can simply reprocess the whole batch in memory again in case of any error (see the sketch after this list).
  5. Really not enough memory to store your data? Or more CPU power needed to crunch it? Look at distributed caches and grid-computing platforms like Oracle Coherence, Hazelcast or Spark.
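To illustrate point 4, here is a minimal write-behind sketch in plain Java – the hot path touches only RAM, while persistence happens asynchronously. The store() method is a hypothetical stand-in for your actual JDBC or file write:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WriteBehindCache {

    private final Map<String, String> hot = new ConcurrentHashMap<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    public void put(String key, String value) {
        hot.put(key, value);                    // in-memory write, ~100 ns territory
        writer.submit(() -> store(key, value)); // disk write leaves the hot path
    }

    public String get(String key) {
        return hot.get(key); // served from RAM, no I/O hit
    }

    private void store(String key, String value) {
        // hypothetical persistent write (JDBC batch, append-only log, ...)
    }
}

If the process crashes before a write lands, the batch can simply be reprocessed in memory – exactly the trade-off point 4 describes.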