The unprecedented explosion in the amount of information we are generating and collecting, thanks to the arrival of the internet and the always-online society, powers all the incredible advances we see today in the field of artificial intelligence (AI) and Big Data.
With this in mind, a great deal of thought and research has gone into working out the best way to store and organise information during the digital age. The relational database model was developed in the 1970s and organises data into tables consisting of rows and columns – meaning the relationship between different data points can be determined at a glance.
This worked very well in the early days of business computing, where information volumes grew slowly. For more complicated operations, however – such as establishing a relationship between data points stored in many different tables – the necessary operations quickly become complex, slow and cumbersome.
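To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `knows` table, its columns and the names in it are invented purely for illustration. It shows that finding "contacts of contacts" already requires the table to be joined to itself, and each further hop would need another join.

```python
import sqlite3

# Hypothetical table for illustration: each row says "src knows dst".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE knows (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO knows VALUES (?, ?)",
    [("Alice", "Bob"), ("Bob", "Charlie"), ("Charlie", "Dana")],
)


def two_hop_contacts(conn, person):
    """Return the people exactly two hops away from `person`.

    One hop is a single lookup; two hops need the table joined to itself.
    A third hop would need a third copy of the table, and so on.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT k2.dst
        FROM knows AS k1
        JOIN knows AS k2 ON k1.dst = k2.src
        WHERE k1.src = ?
        ORDER BY k2.dst
        """,
        (person,),
    ).fetchall()
    return [row[0] for row in rows]


print(two_hop_contacts(conn, "Alice"))
```

The query has to be rewritten whenever the number of hops changes, which is part of why deep relationship questions become cumbersome in a purely tabular model.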
Machine learning algorithms – self-teaching algorithms designed to become more accurate at generating predictions as they are fed increasingly large volumes of information – often need to draw data from vast and disparate datasets. It quickly became apparent that a new approach was necessary.
The Knowledge Graph
There have been many attempts to improve on the functionality of the relational database since the model was first developed. One that is quickly growing in popularity, thanks to its flexibility and potential for dealing with complex, interrelated data, is the knowledge graph (sometimes known as a graph database).
The meaning of the term is not set in stone – for example, Google has a specific feature which it calls the Knowledge Graph, which powers the section of its search results page that displays factual information drawn from recognised sources of authority.
While this is built with several of the same ideas that feed into the broader concept of the knowledge graph, it’s not the be-all and end-all of the technology.
In basic terms, a knowledge graph is a database which stores information as a graph – a network of data points and the relationships between them – and, importantly, can be used to generate a visual representation of the relationships between any of its data points.
Its apparent advantage over the older, relational-style database is that the relationships between any data points can be calculated far more quickly and with less computing overhead, regardless of whether the data points fit neatly together into a table.
Oracle – which released the first commercially available relational database management system in 1979 – is now leading the field in making knowledge graph systems available to the wider business community. Currently, you’re more likely to find them being used by tech giants and research institutions, but this is set to change in the very near future.
Hassan Chafi, senior director of research and advanced development at Oracle Labs, describes the difference between relational and graph databases to me in this way: “With a relational database … it just deals with tables … it allows you to find a row in a table, or take two tables and combine them … and with every join, you’re traversing one hop in these graphs, but you’re reasoning about it in these tabular ways.
“So now what we’re saying is, what if you were to rearrange that same information as a graph? Now it’s visual, and instead of having these tables representing connexions, you have vertices which represent people, or accounts, and you have ‘edges’ which represent relationships. Now I can more quickly say, ‘OK, are Bob and Charlie related?’ And I can see easily that they are.”
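The vertices-and-edges picture can be sketched in a few lines of Python. This is only an illustrative sketch, not how Oracle's or any other product works internally: the edge list and the `are_related` helper are assumptions made up for the example, and the search used is a plain breadth-first traversal.

```python
from collections import deque


def are_related(edges, start, goal):
    """Return True if `goal` can be reached from `start` by following edges.

    `edges` is a list of (person, person) pairs treated as undirected
    relationships. The search is a standard breadth-first traversal.
    """
    # Build an adjacency map: each vertex points to its direct neighbours.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            return True
        for nxt in neighbours.get(vertex, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical relationships, invented for illustration.
edges = [("Alice", "Bob"), ("Alice", "Charlie"), ("Dana", "Erin")]
print(are_related(edges, "Bob", "Charlie"))
```

The same function answers the question however many hops separate the two people, with no change to the code, whereas the tabular approach needs one more join for every extra hop.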
Who uses knowledge graphs?
At the moment, knowledge graphs are widely used by the tech giants that have made gathering and analysing huge volumes of messy, complex data their core business. They power Google’s search engine: the original PageRank algorithm is based on a form of graph (the network of links between web pages), as are later additions to its search technology such as the Knowledge Graph.
Facebook also relies on this form of information organisation to keep track of networks of people and the connexions between them, as well as every other data point it uses to build a picture of its users, such as favourite artists and movies, events attended and geographical locations. One of its most significant breakthroughs is considered to be the realisation that the relationships between data points are as valuable as the data points themselves when it comes to building social networks.
Netflix uses knowledge graph technology to organise information on its vast catalogue of content, drawing connexions between movies and TV shows and the actors, directors or producers who put them together. This helps it to predict what customers might like to watch next, and to foster the “binge-watching” model of consumption it has built its business around.
Electronics and manufacturing giant Siemens uses knowledge graphs to build accessible models of all of the data it generates and stores, and to use that data for risk management, process monitoring and building “digital twins” – simulated versions of real-world systems which can be used for design, prototyping and training.
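As a rough illustration of how connexions between titles and people could feed a "what to watch next" suggestion, the sketch below ranks titles by how many cast or crew members they share with something already watched. It is not Netflix's actual method: the `recommend` function, the scoring rule and the catalogue entries are all invented for the example.

```python
from collections import Counter


def recommend(credits, watched):
    """Rank other titles by the number of people they share with `watched`.

    `credits` is a list of (title, person) edges. Ties are broken
    alphabetically so the ordering is deterministic.
    """
    people_by_title = {}
    titles_by_person = {}
    for title, person in credits:
        people_by_title.setdefault(title, set()).add(person)
        titles_by_person.setdefault(person, set()).add(title)

    # Walk title -> person -> other title, counting each shared person.
    scores = Counter()
    for person in people_by_title.get(watched, ()):
        for title in titles_by_person[person]:
            if title != watched:
                scores[title] += 1
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical catalogue, invented for illustration.
credits = [
    ("Show A", "Actor X"), ("Show A", "Director Y"),
    ("Show B", "Actor X"), ("Show B", "Director Y"),
    ("Show C", "Actor X"),
    ("Show D", "Actor Z"),
]
print(recommend(credits, "Show A"))
```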
In supply chain logistics, knowledge graphs can be used to keep track of inventories of different components and parts, allowing manufacturers to understand the crossover between materials that are used in different products.
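A minimal sketch of that crossover idea, assuming a simple list of (product, component) pairs. The product and part names are hypothetical, as is the `component_crossover` helper.

```python
def component_crossover(uses):
    """Map each component used by more than one product to those products.

    `uses` is a list of (product, component) edges.
    """
    products_by_component = {}
    for product, component in uses:
        products_by_component.setdefault(component, set()).add(product)
    # Keep only components that appear in at least two different products.
    return {
        component: sorted(products)
        for component, products in products_by_component.items()
        if len(products) > 1
    }


# Hypothetical bill of materials, invented for illustration.
uses = [
    ("Drill", "Motor M1"), ("Drill", "Battery B2"),
    ("Saw", "Motor M1"), ("Saw", "Blade K9"),
    ("Torch", "Battery B2"),
]
print(component_crossover(uses))
```

A result like this tells a manufacturer at a glance which products would be affected if a shared part ran short.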
They are also being quickly adopted by the financial services industry, where they are useful for assessing whether or not transactions are fraudulent, as well as many other functions such as marketing and investment analytics.
Chafi tells me: “In particular for crime and compliance … for money laundering, one can think of money moving around as a graph, and you need to think about whether those movements are risky or not … If you were to follow the trail that starts from one place, does it come back to the same place after an indefinite number of hops?”
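The question Chafi poses, whether a trail of money ever returns to where it started, is a cycle check on a directed graph. The sketch below is one simple way to express it, using a depth-first traversal. The account names, the transfer list and the `returns_to_origin` helper are assumptions for illustration, and a real compliance system would weigh many more signals than this.

```python
def returns_to_origin(transfers, origin):
    """Return True if money leaving `origin` can flow back to `origin`.

    `transfers` is a list of (sender, receiver) pairs treated as directed
    edges. The number of hops is not limited.
    """
    outgoing = {}
    for sender, receiver in transfers:
        outgoing.setdefault(sender, []).append(receiver)

    seen = set()
    # Start from the accounts that receive money directly from the origin.
    stack = list(outgoing.get(origin, []))
    while stack:
        account = stack.pop()
        if account == origin:
            return True
        if account in seen:
            continue
        seen.add(account)
        stack.extend(outgoing.get(account, []))
    return False


# Hypothetical transfers, invented for illustration.
transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(returns_to_origin(transfers, "A"))
```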
With industries increasingly adopting machine learning, it seems likely that knowledge graph technology will evolve hand in hand with it. Knowledge graphs are a useful format for feeding training data to algorithms, and machine learning can in turn quickly build and structure graph databases, drawing connexions between data points that would otherwise go unnoticed.
Machine learning is great for answering questions, and knowledge graphs are a step towards enabling machines to more deeply understand data such as video, audio and text that don’t fit neatly into the rows and columns of a relational database.
This could revolutionise fields such as healthcare and law, where the technology has applications that have not yet been fully explored.
As with machine learning itself, what started as an academic exercise before being adopted by the most cutting-edge tech companies will no doubt “trickle down”, as tools and frameworks designed to make it accessible become more widely available.