Oracle Vs Snowflake

Image by Keith Johnston from Pixabay 

Image by Keith Johnston from Pixabay 

 

10 minute read

I first started working with Oracle in 1987 on Oracle Version 5. That was not only before PL/SQL and row-level locking, but The Cloud and even The Internet itself had yet to be invented.

Back then, Oracle was an amazing product. The competition was hierarchical and network databases or Index Sequential (ISAM) files on big iron IBM mainframes. Using Oracle on a DEC VAX was like going from a model T-Ford to a Porsche 911 at half the cost.

Fast forward 30 years, and Oracle is barely clinging to the top spot on the database ranking table, at a mere 5% ahead of the MySQL database, while MongoDB and PostgreSQL both grew by over 60% in the past year.

Oracle clinging to the top position in the DBMS Ranking

Oracle clinging to the top position in the DBMS Ranking

Maybe it was because of the Oracle aggressive sales tactics or because of a selfish and arrogant attitude where customers are always put last.

The massive growth in popularity of Hadoop, and open source alternatives including MySQL, PostgreSQL have been eating into the Oracle market, and niche database alternatives from NoSQL vendors including MongoDB, Elasticsearch and Redis have become increasingly popular as business deal with web-scale data volumes and millisecond performance.

Finally, as I have indicated before Oracle is no longer a really sought after skill. It doesn’t even make it in the top 10 most sought-after database skills in 2019, as illustrated below.

Most in-demand database skills.  2020

Most in-demand database skills. 2020

But Enterprise grade products like Oracle don’t just fall out of fashion. They either fail to innovate or like the dinosaurs, they die because they refuse to accept, let alone adapt to a rapidly changing world. One in which The Cloud is already dominating the IT landscape.

Where did it all go wrong?

As early as 2005, the Turing award winner, professor Michael Stonebraker of MIT was predicting the demise of the dominance of Oracle, Microsoft and IBM in his seminal paper “One Size Fits All” – An idea who’s time has come and Gone. In this paper he accurately predicted the database market would fragment, and that the data warehouse market would be dominated by new players with the emergence of column-oriented database solutions.

Three years later he published OLTP through the looking glass which demonstrated the H-Store database, a stripped back open source database with a pure OLTP focus. This was an astonishing achievement, as the TCP-C world record was around 1,000 transactions per second per CPU core, and yet he managed over 35,000 on an Intel 2.8GHz desktop.

In total the prototype delivered an incredible 70,000 transactions per second, and he went on to build a highly successful commercial venture VoltDB, which recent benchmarks demonstrate single digit millisecond latency while maintaining a serializable isolation level and complete transactional consistency that is simply not possible on Oracle.

The architecture of most DBMSs is essentially identical to that of System R [designed in the 1970s]
— Dr Michael Stonebraker. MIT.

The underlying reason the team at MIT were able to achieve such an astonishing result, is illustrated in the chart below, and highlights the fact that the Oracle architecture is based upon a design conceived in the 1970s for the “System R” database.

What does Oracle spend most elapsed time working on?  93% is pure overhead!

What does Oracle spend most elapsed time working on? 93% is pure overhead!

In summary, they found that every commercial database product (ie. Oracle, SQL Server and DB2), spent around 93% of the time on overhead tasks, and just 7% doing any useful work. They discovered the database spent the majority of the time coordinating the in-memory buffer cache, writing to log files, and in-memory latching and locking.

Essentially the architecture was formulated at a time when memory was about 13 million times more expensive than today, and the entire architecture was based around the careful management of a historically expensive resource.

Even the release of Oracle in Memory which I personally demonstrated a 27 times performance improvement is still a sticking plaster on the problem, and I was unable to deploy this for a customer because of the massive licence cost per CPU core.

Data Warehouse Alternative?

Firstly, any modern data warehouse needs to be cloud based to gain the huge benefits of agility, scalability, elasticity and end-to-end security. Snowflake stands out as the only data warehouse designed specifically for the cloud. So how does it compare to Oracle on premise?

Main Differences

Oracle Vs Snowflake - Main Differences

Technical Differences

Oracle Vs. Snowflake - Technical Differences

What don’t you need with Snowflake?

In discussing the differences between Snowflake and Oracle, rather than criticising Oracle, let’s consider what you don’t need with Snowflake:

  • Installation:  There is no hardware to install, no operating system or database software to deploy. There are no patches to apply or database upgrades to schedule.  In fact, switching to Snowflake may well be the last upgrade you ever need to do, and you can run with zero down-time.

  • Be Locked In:  As Snowflake currently runs on Amazon AWS, Microsoft Azure platform and Google Cloud Platform.  You can even seamlessly share data with customers, partners and third parties across all three cloud platforms using the Snowflake Data Exchange.

  • Database Management:  There are no indexes to manage, no physical data partitioning needed, no statistics to capture or fear of a query performance cliff-edge if they are not correctly captured.  In fact, Snowflake demonstrates near zero database administration effort.

  • Tug of War:  Snowflake have abolished the tug of war for machine resources using workload isolation.  Instead of a single, massive multi-core database server you can deploy an unlimited number of completely independent virtual warehouses.  This means you can be loading terabytes of data on one warehouse, while transforming the data on another, and analysing the results on yet another.

  • Disk Upgrades:  Never run out of disk space ever again.  Snowflake supports literally unlimited data volumes on both Amazon AWS, Microsoft Azure and soon Google Cloud Platform.  On Snowflake, some customers hold as much as a petabyte in a single table.

  • Data Compression:  There is no need to pay the licence cost of the OLTP option or carefully load data to maximise data compression using insert append on Oracle. With Snowflake, all your data is automatically compressed using columnar compression, often to a factor of between 3 and 6 times.

  • Migrate to a Bigger Server:  As the chart below illustrates, Snowflake is incrementally scalable, with a simple set of T-Shirt sizes, and can be increased from an extra-small to a 4X-Large server within milliseconds.  To put this in context, that is increasing the size of the machine from one to 128 database servers. The chart demonstrates the reduction in elapsed time of a query to copy 1.3TB of data. This simple CTAS (Create table as select) is the mainstay of most ELT procesesses.

Snowflake benchmark.  Effect on query elapsed time and throughput

Snowflake benchmark. Effect on query elapsed time and throughput

  • Deploy for High Availability:  There is no requirement to deploy an expensive hot-standby data centre, with data replication and fail-over for high availability.  Snowflake transparently writes data to three Availability Zones within a region and can automatically survive the loss of any two.  Indeed, they recently announced the launch of cross-region and cross-cloud replication and automatic fail-over.  This means a customer in the US West Coast will be able provide an automatic fail-over to the East Coast, and even fail-over from Amazon AWS to Microsoft Azure or Google Cloud.

  • Spend hours producing backups:  As Snowflake provides up to 90 days of time travel including the ability to undrop a table, schema or even an entire database within seconds.  This means you can query the data as it was up to 90 days ago, and with zero copy clones, take a transactionally consistent backup of a terabyte of data within five seconds, with recovery equally fast.

  • Worry about Security:  As Snowflake includes end-to-end encryption with automatic key rotation, Multi-Factor Authentication, and even the option of dedicated cloud hardware with the Virtual Private Snowflake option.

Impact of Cloud Computing

What the Hell is Cloud Computing? …. I’ve no idea what everyone’s talking about
— Larry Ellison

Of course, despite the protests of Larry Ellison back in 2008 that cloud computing was “complete gibberish” it now seems Oracle has completely embraced a future in the cloud which is just as well since according to a RightScale survey Cloud services are currently witnessing growth rates of up to 50% per year.

oracle-vs-snowflake-cloud-computing-survey.png

Conclusion

In conclusion, while the rest of the world was looking towards Hadoop to resolve the challenges of “Big Data”, Snowflake quietly redefined what’s possible. The solution can be deployed within minutes, with nothing more complex than a credit card, and an extra-small server running for an hour costs less than a cup of coffee. As a platform it’s remarkably flexible and will scale up performance to execute a massive terabyte sized queries, reducing elapsed time from 5.5 hours down to just two minutes. Equally, as the UK based food delivery service Deliveroo found, it can automatically scale out, to cope with a huge number of concurrent users, then silently scale back when not needed.

Unlike Oracle, it is incredibly simple to manage with no statistics to capture, no indexes to manage and data partitioning, compression and encryption which are automatic and completely transparent.

If you’d like to try Snowflake for yourself, you can start a free trial with $400 of credit. There’s no hardware to configure or software to install, and the online documentation is equally easy to follow with lots of getting started videos.

Notice Anything Missing?

No annoying pop-ups or adverts. No bull, just facts, insights and opinions. Sign up below and I will ping you a mail when new content is available. I will never spam you or abuse your trust. Alternatively, you can leave a comment below.

Disclaimer: The opinions expressed on this site are entirely my own, and will not necessarily reflect those of my employer.

John A. Ryan

John A. Ryan

Previous
Previous

When should I use Data Vault?

Next
Next

What is Snow flake ?