Unlike the buffer cache, which holds entire rows, the In-Memory option allows the DBA to hold specific columns in memory, making more efficient use of this expensive resource. This can be extended to specific partitions, which means, for example, a system could cache the most frequently accessed columns from the most recent (and most actively queried) data in memory.
OIM also implements the Storage Index technology used in the Exadata hardware appliance. Unlike traditional indexes, which are deployed by the DBA, these are transparently built and maintained by Oracle itself, and they speed in-memory data scans by skipping over data based upon the query's WHERE clause.
Like many Oracle features, the In-Memory option is transparent to the developer, and data can be marked for addition to or removal from memory with a simple ALTER statement, with no application changes required.
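As a minimal sketch of what that looks like (the table name `sales` is hypothetical):

```sql
-- Mark a table for population into the In-Memory column store
ALTER TABLE sales INMEMORY;

-- Later, evict it again - no application code changes either way
ALTER TABLE sales NO INMEMORY;
```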
Queries can, however, also combine data from both the In-Memory store and the Buffer Cache. For example, the most recent results can be held in memory, while less frequently accessed (older) data or dimensions are read from disk as needed.
What's the Benefit?
Normal INSERT and UPDATE operations receive no benefit at all, as these are performed against the Buffer Cache. However, selected columns or entire tables can be pinned in memory and automatically compressed using columnar compression. I found that on dimension tables (with frequently repeated text values) compression rates were around 2-7 times, while fact tables typically compressed by around 50%. It's also possible to tune the level of compression and to prioritise the speed at which data is refreshed as it's changed.
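Both the compression level and the repopulation priority can be set in the same INMEMORY clause. A sketch, again using a hypothetical `sales` table:

```sql
-- MEMCOMPRESS FOR QUERY HIGH favours scan speed with good compression;
-- FOR CAPACITY HIGH would trade scan speed for a smaller footprint.
-- PRIORITY HIGH repopulates this table ahead of lower-priority segments.
ALTER TABLE sales
  INMEMORY MEMCOMPRESS FOR QUERY HIGH
  PRIORITY HIGH;
```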
What are the Challenges?
1. What data to hold in Memory
While many DBAs and application developers are aware of frequently accessed tables, choosing which partitions or even columns to pin in memory is a challenge. As memory is a finite and valuable resource, it makes sense to spend time tuning usage, and it's sensible to use the Oracle Diagnostics and Tuning Pack to help. Be aware that if a query includes a column which is not pinned in memory, the default Buffer Cache method will be used.
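Selection can go down to individual columns and partitions, and the column store's contents can be checked from the data dictionary. A sketch, assuming a hypothetical partitioned `sales` table:

```sql
-- Pin only the hot columns; exclude a wide, rarely queried one
ALTER TABLE sales
  INMEMORY MEMCOMPRESS FOR QUERY HIGH (amount_sold, time_id)
  NO INMEMORY (seller_notes);

-- Pin just the most recent partition
ALTER TABLE sales MODIFY PARTITION sales_2015 INMEMORY;

-- Check what is actually populated, and how well it compressed
SELECT segment_name, populate_status, bytes, inmemory_size
FROM   v$im_segments;
```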
2. Data Sorts can also produce Disk Reads
Although data fetches are often the primary driver of disk traffic, performing large sort operations with an insufficient sort area size will also lead to disk accesses, which may eliminate the performance gains. As most analytic queries include a GROUP BY or ORDER BY clause, it's therefore worth checking that all sort operations run entirely in memory, and tuning the PGA sort area size accordingly.
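One way to check is to compare memory sorts against disk sorts, then raise the PGA target if disk sorts appear. A sketch (the 8G figure is an illustrative value, not a recommendation):

```sql
-- Any non-zero 'sorts (disk)' count means sorts are spilling to disk
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)');

-- Give the automatic PGA manager more room to sort in memory
ALTER SYSTEM SET pga_aggregate_target = 8G;
```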
3. Oracle Parallel Query Side Effects
During a series of test queries, I carefully checked for physical disk operations, and was surprised to find Oracle Parallel Query had a significant negative impact upon query performance. In fact, query performance went from three times faster to over 12 times faster when I switched to serial processing.
It seems that using Oracle Parallel Query on very large data sets can produce unexpected physical IOs, as each parallel slave process needs its own memory area to collate and sort the data, and this in turn magnified the problem of large sort operations.
Switching off Oracle Parallel Query for In-Memory operations has a huge scalability benefit, as each user executes on a single core instead of flooding the machine with parallel processes. It does, however, suggest an architectural hardware decision: buy machines with lots of very fast processors and lots of memory to maximise the benefits.
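Parallel query can be switched off at the session level, or per statement with a hint. A sketch (the query and table name are hypothetical):

```sql
-- Disable parallel execution for everything in this session
ALTER SESSION DISABLE PARALLEL QUERY;

-- Or force serial execution for a single statement
SELECT /*+ NO_PARALLEL(s) */ time_id, SUM(amount_sold)
FROM   sales s
GROUP  BY time_id;
```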
4. It's not a panacea for performance
Keep in mind that a poorly designed application may not work any faster than before. In one case, another team was surprised to find zero benefit from using OIM to join two massive (multi-gigabyte) fact tables in memory. In reality, they probably hit the limits of in-memory sorting, and would have benefited from a table redesign producing a single fact table with simple dimensional joins.
Likewise, OIM works best for analytic-type queries which summarise or calculate averages over large data sets. It's not a solution for improving OLTP applications, although it could be deployed on the same platform as an OLTP system to deliver fast operational reports.
Oracle now faces some serious competition in the Data Warehouse and Business Intelligence space. Dedicated column-based data stores like Sybase IQ and HP Vertica are now joined by Hadoop-based products including Impala from Cloudera and Apache Spark, both of which aim to provide sub-second query performance against massive data volumes.
Although, architecturally, Oracle 12c is rather a bolt-on answer to the challenge from faster column-based data stores, it does have the advantage of being a simple upgrade to existing systems. With performance gains of between 4 and 27 times over the traditional solution on a relatively inexpensive Oracle Data Appliance, it certainly provides a compelling solution. The fact that this is entirely transparent to the application is a massive advantage, and it certainly gets my vote.