July 15, 2013 § 3 Comments
What are the Differences Between Relational and Graph Databases?
Say “database” to most people, and what comes to mind is an organized collection of data, neatly stored in the rows and columns of tables. This concept of a fixed schema, where each row is a collection of attributes, has been the basis for relational databases, and for the query languages such as SQL (Structured Query Language) used to interact with the stored data, since the early 1970s.
Over the last 15 years, however, several industries have begun generating far more data than relational databases were designed to handle. Companies like Google and Amazon have long generated massive amounts of data across very large numbers of servers. With the resulting data spread across multiple machines, traditional relational SQL JOIN operations become impractical.
Enter graph databases, which are defined as any storage system that provides index-free adjacency. This means that every element in the database contains a direct link to its adjacent elements. No index lookups are required: every element (or node) knows which node or nodes it is connected to, and each such connection is called an edge. This allows graph database systems to use graph theory to examine the connections between nodes very rapidly, which is how Netflix can recommend videos for you.
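To make index-free adjacency concrete, here is a minimal in-memory sketch in Python. It is not how any particular graph database or recommendation service is implemented; the `Node` class, the `recommend` function, and the sample viewers and videos are all hypothetical, chosen only to show nodes that hold direct references to their neighbours.

```python
from collections import Counter


class Node:
    """A graph node that stores direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.edges = []  # adjacent Node objects; no index lookup needed

    def connect(self, other):
        # Record an undirected edge on both endpoints.
        self.edges.append(other)
        other.edges.append(self)


def recommend(viewer):
    """Rank videos watched by people who share a video with `viewer`.

    Hypothetical scoring: one point per neighbouring viewer who watched it.
    """
    seen = set(viewer.edges)
    scores = Counter()
    for video in viewer.edges:            # hop 1: videos this viewer watched
        for other in video.edges:         # hop 2: other viewers of that video
            if other is viewer:
                continue
            for candidate in other.edges:  # hop 3: what they watched
                if candidate not in seen:
                    scores[candidate.name] += 1
    return [name for name, _ in scores.most_common()]


# Illustrative data only.
alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
video_a, video_b, video_c = Node("Video A"), Node("Video B"), Node("Video C")

alice.connect(video_a)
bob.connect(video_a)
bob.connect(video_b)
carol.connect(video_a)
carol.connect(video_b)
carol.connect(video_c)

print(recommend(alice))  # ['Video B', 'Video C']
```

Each hop in `recommend` simply follows the references stored on the node, so the cost of a traversal depends on how many edges are visited rather than on the total size of the data set.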
The power of the edge allows a graph database to find results in associative data sets (data where the information lies in the aggregation of links between nodes) faster than a relational database can. Graph databases can also scale more naturally to large data sets and to data sets with changing or on-the-fly schemas. On the other hand, relational databases are still better at performing the same operation on large numbers of identically structured records. When you want your bank balance, you don’t want a rapid list of all your transactions, just your bottom line.
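The bank balance case can be sketched with Python's built-in `sqlite3` module. The `transactions` table, its columns, and the sample amounts are assumptions made up for illustration; the point is only that a single aggregate query applies the same operation to every matching row.

```python
import sqlite3

# In-memory database with a hypothetical transactions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (account_id INTEGER, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 10000), (1, -2550), (1, -1000), (2, 5000)],
)


def balance_cents(conn, account_id):
    """Return the sum of all transaction amounts for one account."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) "
        "FROM transactions WHERE account_id = ?",
        (account_id,),
    ).fetchone()
    return row[0]


print(balance_cents(conn, 1))  # 6450
```

Amounts are stored as integer cents here to avoid floating-point rounding; that is a choice made for this sketch, not something the discussion above prescribes.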
The use of graph databases is rapidly spreading to many applications through the use of mixed-database approaches, where a graph search is used to identify the extent of the data, and a subsequent relational search is used to provide the detailed analytics. While this approach presently involves developing (and supporting) two database structures, it yields rapid response and targeted data analysis. Some solutions present the graph results to users while the analytics are being pulled and crunched; other systems serve up old results while the new results are being calculated. How analytics and associations coexist is one of the considerations that must be made when architecting your solution.
So is a graph database in your future? If a third or more of your relational tables describe links between data elements, your database is heavily associative and may be a candidate for a graph database. The final decision requires a complete analysis of how the data is used and of its volume and growth patterns, not just a review of table structures. If your data is used for statistical analysis, data mining and exploration, or operations research, the relational database approach is still at least part of the architectural solution.
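As a rough first pass at the one-third rule of thumb, the sketch below counts link tables in a SQLite schema using the `PRAGMA foreign_key_list` command. The definition of a link table used here (a table with two or more foreign-key constraints) is an assumption of this sketch, as are the function name and the sample schema; real schemas often leave relationships undeclared, so the result is only a starting point for the fuller analysis described above.

```python
import sqlite3


def link_table_ratio(conn):
    """Return (ratio, link_tables) for the user tables in a SQLite database.

    Assumption: a table counts as a link table when it declares at least
    two foreign-key constraints.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    link_tables = []
    for table in tables:
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        # Column 0 of each row is the constraint id; count distinct ids.
        if len({fk[0] for fk in fks}) >= 2:
            link_tables.append(table)
    ratio = len(link_tables) / len(tables) if tables else 0.0
    return ratio, link_tables


# Hypothetical schema: two entity tables and one table linking them.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE orders (
        customer_id INTEGER REFERENCES customers(id),
        product_id  INTEGER REFERENCES products(id)
    );
    """
)

ratio, link_tables = link_table_ratio(conn)
print(link_tables)   # ['orders']
print(ratio >= 1 / 3)  # True
```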
July 4, 2013 § 1 Comment
Ian Plosker shares a number of techniques for establishing the data query patterns from the outset of application development and designing a data model to fit those patterns.
Interesting discussion on choosing a data model. It’s really important to understand what our application wants. Very good presentation.