Overview
Graph database is introduced in SQL Server 2017 version and it is a collection of Nodes and Edges. Both the relational and graph databases are data management system software. In Graph database building blocks are Nodes and Edges whereas in relational database building blocks are tables.
Graph database is introduced in SQL Server 2017 version and it is a collection of Nodes and Edges. Both the relational and graph databases are data management system software. In Graph database building blocks are Nodes and Edges whereas in relational database building blocks are tables.
The term “graph” comes from mathematics. There it’s used to describe a collection of nodes (or vertices), each containing information (properties), and with labeled relationships (or edges) between the nodes.
A
graph is composed of two elements: a NODE (vertices) and an EDGE
(relationship). Each node represents entities such as a person or location, and the nodes are connected to
one another with edges; these provide details on the relationship
between two nodes with their own set of attributes and properties. For example, a relationship might exist between a location such as Hyderabad and a person named Mahesh, who lives in Hyderabad. Mahesh and Hyderabad are the entities, and ‘lives in’ is the relationship between the two.
Graph database has nodes and edges, So in SQL server two new table types NODE and EDGE. And a new function called MATCH(). Node or edge tables can be created under any schema in a database, but they all belong to one logical graph.
Here are some features that make a graph database unique:
Graph database has nodes and edges, So in SQL server two new table types NODE and EDGE. And a new function called MATCH(). Node or edge tables can be created under any schema in a database, but they all belong to one logical graph.
Here are some features that make a graph database unique:
- Edges or relationships are first class entities in a Graph Database and can have attributes or properties associated with them.
- A single edge can flexibly connect multiple nodes in a Graph Database.
- You can express pattern matching and multi-hop navigation queries easily.
- You can express transitive closure and polymorphic queries easily.
Why graph database?
Some
data models do not fit the traditional relationships. These scenarios include
hierarchical data, complex many-to-many relationships, or complex
interconnected data and relationships. Even it is always possible to store and
query data for these types of scenarios in SQL Server, but the number of tables
and relationships among these objects will be more and querying this tables
will be more complex. So, in SQL Server 2017 introduces a new feature that
makes it far easier to navigate these types of data relationships, called graph
database.
Relational databases do well when there are only one or two levels of relationships; graph databases handle far greater depths.
When to consider graph database?
Edge Table
Relational databases do well when there are only one or two levels of relationships; graph databases handle far greater depths.
When to consider graph database?
- When you have lots of relationships in your data
- When you want to do real-time analytical queries
- When your current system is not working out because of slow joins in RDBMS systems.
Node Table
A node table represents an entity in a graph schema. Whenever we create a node table, along with the user defined columns, an implicit $node_id column is created, which uniquely identifies a given node in the database. The values in the $node_id column are automatically generated and are a combination of object_id of that node table and an internally generated bigint value. However, when the $node_id column is selected, a computed value in the form of a JSON string is displayed. Also, $node_id is a pseudo column, that maps to an internal name with hex string in it. When we select $node_id from the table, the column name will appear as $node_id_\hex_string.
It is recommended that users create a unique constraint or index on the $node_id column at the time of creation of node table, but if one is not created, a default unique, non-clustered index is automatically created.
Syntax for creating the Node table:
CREATE TABLE
{ database_name.schema_name.table_name. | schema_name.table_name | table_name }
( { } [ ,...n ] ) AS [NODE]
[ ; ]
Edge Table
An edge table represents a relationship in a graph. Edges are always directed and connect two nodes. An edge table enables users to model many-to-many relationships in the graph. An edge table may or may not have any user-defined attributes in it.
Syntax for creating the Edge table:
CREATE TABLE
{ database_name.schema_name.table_name. | schema_name.table_name | table_name }
( { } [ ,...n ] ) AS [EDGE]
[ ; ]
Every time an edge table is created, along with the user-defined attributes, three implicit columns are created in the edge table:
Column Name | Description |
---|---|
$edge_id | Uniquely identifies a given edge in the database. It is a generated column and the value is a combination of object_id of the edge table and a internally generated bigint value. However, when the $edge_id column is selected, a computed value in the form of a JSON string is displayed. $edge_id is a pseudo-column, that maps to an internal name with hex string in it. When you select $edge_id from the table, the column name will appear as $edge_id_\ |
$from_id | Stores the $node_id of the node, from where the edge originates. |
$to_id | Stores the $node_id of the node, at which the edge terminates. |
Similar to the $node_id column, it is recommended that users create a unique index or constraint on the $edge_id column at the time of creation of the edge table, but if one is not created, a default unique, nonclustered index is automatically created on this column.It is also recommended, for OLTP scenarios, that users create an index on ($from_id, $to_id) columns, for faster lookups in the direction of the edge.
Metadata for Graph Databases
Metadata for Graph Databases
- sys.tables : If is_node is set to 1, that indicates that the table is a node table and if is_edge is set to 1, that indicates that the table is an edge table.
- sys.columns : The sys.columns view contains additional columns graph_type and graph_type_desc, that indicate the type of the column in node and edge tables.
Value | Description |
---|---|
1 | GRAPH_ID |
2 | GRAPH_ID_COMPUTED |
3 | GRAPH_FROM_ID |
4 | GRAPH_FROM_OBJ_ID |
5 | GRAPH_FROM_ID_COMPUTED |
6 | GRAPH_TO_ID |
7 | GRAPH_TO_OBJ_ID |
8 | GRAPH_TO_ID_COMPUTED |
System Functions
Built-in | Description |
---|---|
OBJECT_ID_FROM_NODE_ID | Extract the object_id from a node_id.We need to pass node_id to this function and it will return the object_id. Eg: select OBJECT_ID_FROM_NODE_ID('{"type":"node","schema":"dbo","table":"Person","id":0}') |
GRAPH_ID_FROM_NODE_ID | Extract the graph_id from a node_id. We need to pass node_id to this function and it will return the object_id. Eg: select GRAPH_ID_FROM_NODE_ID('{"type":"node","schema":"dbo","table":"Person","id":1}') |
NODE_ID_FROM_PARTS | Construct a node_id from an object_id and a graph_id. Eg: select NODE_ID_FROM_PARTS(1221579390,2) |
OBJECT_ID_FROM_EDGE_ID | Extract object_id from edge_id. Eg: select OBJECT_ID_FROM_EDGE_ID('{"type":"edge","schema":"dbo","table":"friendOf","id":1}') |
GRAPH_ID_FROM_EDGE_ID | Extract identity from edge_id. Eg: select GRAPH_ID_FROM_EDGE_ID('{"type":"edge","schema":"dbo","table":"friendOf","id":1}') |
EDGE_ID_FROM_PARTS | Construct edge_id from object_id and identity. Eg: select EDGE_ID_FROM_PARTS(1525580473,1) |
MATCH clause
MATCH clause is introduced to support pattern matching and multi-hop navigation through the graph. The MATCH function uses ASCII-art style syntax for pattern matching.For example to find friends of mahesh
Sample Data Model
In our data model, the rectangle boxes are referred to as nodes. The lines connecting them are relationships or Edges. So basically Nodes are the entity which are Employee, Company, Project and Location. These entities are connected to each other with a relationship that is called Edges which are LivesIn, Project_Works, Company_Works, LocatedIn, Knows, Owns. So the basic flows goes something like Node > Relationship > Node. Here Employee has 4 relations and Company has 2 relations.
Sample Queries
MATCH clause is introduced to support pattern matching and multi-hop navigation through the graph. The MATCH function uses ASCII-art style syntax for pattern matching.For example to find friends of mahesh
SELECT E2.ENAME FROM employee E1,know,employee E2 WHERE MATCH(E1-(know)->E2) AND E1.ENAME='MAHESH'
DemonstrationSample Data Model
In our data model, the rectangle boxes are referred to as nodes. The lines connecting them are relationships or Edges. So basically Nodes are the entity which are Employee, Company, Project and Location. These entities are connected to each other with a relationship that is called Edges which are LivesIn, Project_Works, Company_Works, LocatedIn, Knows, Owns. So the basic flows goes something like Node > Relationship > Node. Here Employee has 4 relations and Company has 2 relations.
- Know: Employee know employees
- Company_Works: Employee works in the company
- Project_Works : Employee works in the project
- LivesIn: Employee lives in location
- Owns: Company owns the project
- LocatedIn: Company located in the location
Schema
CREATE TABLE company(ID INTEGER PRIMARY KEY,
CNAME VARCHAR(45)) AS NODE
CREATE TABLE [location](ID INTEGER PRIMARY KEY,
LNAME VARCHAR(45),
STATENAME VARCHAR(45)) AS NODE
CREATE TABLE employee(ID INTEGER PRIMARY KEY,
ENAME VARCHAR(45)) AS NODE
CREATE TABLE project(ID INT PRIMARY KEY,
PNAME VARCHAR(45)) AS NODE
CREATE TABLE know AS EDGE
CREATE TABLE livesin AS EDGE
CREATE TABLE project_works AS EDGE
CREATE TABLE locatedin AS EDGE
CREATE TABLE company_works AS EDGE
CREATE TABLE owns AS EDGE
INSERT INTO company(ID,CNAME) VALUES (1,'MICROSOFT'),(2,'IBM'),(3,'GOOGLE')
INSERT INTO project(ID,PNAME) VALUES (1,'CMS'),(2,'TMS'),(3,'PMS'),(4,'LMS'),(5,'CRMS')
INSERT INTO [location](ID,LNAME,STATENAME) VALUES (1,'HYDERABAD','TG'),(2,'CHENNAI','TN'),(3,'BANGALORE','KA')
INSERT INTO employee(ID,ENAME) VALUES (1,'MAHESH'),(2,'SURESH'),(3,'RAKESH'),(4,'PRAGESH'),(5,'RAMESH'),(6,'VENKATESH'),(7,'SATHEESH'),(8,'SAMPORNESH')
-- know/friendof
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=2))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=3))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=3),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=2))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=4),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=5))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=6),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=7))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=7),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=6))
INSERT INTO know VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8))
-- project_works
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM PROJECT WHERE ID=1))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=2),(SELECT $NODE_ID FROM PROJECT WHERE ID=2))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=3),(SELECT $NODE_ID FROM PROJECT WHERE ID=1))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=4),(SELECT $NODE_ID FROM PROJECT WHERE ID=5))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=5),(SELECT $NODE_ID FROM PROJECT WHERE ID=5))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=6),(SELECT $NODE_ID FROM PROJECT WHERE ID=3))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=7),(SELECT $NODE_ID FROM PROJECT WHERE ID=4))
INSERT INTO project_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8),(SELECT $NODE_ID FROM PROJECT WHERE ID=4))
-- company_works
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM COMPANY WHERE ID=1))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=2),(SELECT $NODE_ID FROM COMPANY WHERE ID=1))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=3),(SELECT $NODE_ID FROM COMPANY WHERE ID=1))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=4),(SELECT $NODE_ID FROM COMPANY WHERE ID=2))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=5),(SELECT $NODE_ID FROM COMPANY WHERE ID=2))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=6),(SELECT $NODE_ID FROM COMPANY WHERE ID=3))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=7),(SELECT $NODE_ID FROM COMPANY WHERE ID=3))
INSERT INTO company_works VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8),(SELECT $NODE_ID FROM COMPANY WHERE ID=3))
-- locatedin
INSERT INTO locatedin VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=1),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO locatedin VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=1),(SELECT $NODE_ID FROM LOCATION WHERE ID=2))
INSERT INTO locatedin VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=2),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO locatedin VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=3),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
-- livesin
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=1),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=2),(SELECT $NODE_ID FROM LOCATION WHERE ID=2))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=3),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=4),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=5),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=6),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=7),(SELECT $NODE_ID FROM LOCATION WHERE ID=1))
INSERT INTO livesin VALUES ((SELECT $NODE_ID FROM EMPLOYEE WHERE ID=8),(SELECT $NODE_ID FROM LOCATION WHERE ID=3))
-- owns
INSERT INTO owns VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=1),(SELECT $NODE_ID FROM PROJECT WHERE ID=1))
INSERT INTO owns VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=1),(SELECT $NODE_ID FROM PROJECT WHERE ID=2))
INSERT INTO owns VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=2),(SELECT $NODE_ID FROM PROJECT WHERE ID=5))
INSERT INTO owns VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=3),(SELECT $NODE_ID FROM PROJECT WHERE ID=3))
INSERT INTO owns VALUES ((SELECT $NODE_ID FROM COMPANY WHERE ID=3),(SELECT $NODE_ID FROM PROJECT WHERE ID=4))
- Query to list employee names who work on project 'CMS'
SELECT ENAME FROM employee,project_works,project WHERE MATCH(employee-(project_works)->project) AND PNAME='CMS'
- Query to list company name,project name, employee name who work on project 'CMS' in company 'MICROSOFT'
SELECT CNAME,PNAME,ENAME FROM company,owns,project,project_works,employee WHERE MATCH(company-(owns)->project AND employee-(project_works)->project) AND PNAME='CMS' AND CNAME='MICROSOFT'
- Query to list companies that MAHESH friends works
SELECT CNAME,E2.ENAME FROM company,company_works,employee E1,know,employee E2 WHERE MATCH(E1-(know)->E2-(company_works)->company) AND E1.ENAME='MAHESH'
- List employees who work in the same city they live in
SELECT ENAME,LNAME FROM employee,livesin,locatedin,company_works,company,[location] WHERE MATCH(employee-(company_works)->company-(locatedin)->[location] AND employee-(livesin)->[location])
Let me know if you have any questions, suggestions or comments.
References:
https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-sample
https://medium.com/@dmccreary/a-taxonomy-of-graph-use-cases-2ba34618cf78
https://www.youtube.com/watch?v=_KEmK_4ZBYQ
0 comments:
Post a Comment