How Graph Data Science can turbocharge your Knowledge Graph
Conference (BEGINNER level)
Room 3

Knowledge Graphs are becoming mission-critical across many industries. More recently, we are witnessing the application of Graph Data Science to Knowledge Graphs, offering powerful outcomes. But what does Graph Data Science stand for, and how does it turbocharge Knowledge Graphs?

In this talk, we will illustrate the various methods and models of Graph Data Science being applied to Knowledge Graphs and how the Neo4j platform allows you to find implicit relationships in your graph, which are impossible to detect in any other way. You will learn how centrality algorithms as well as node embeddings uniquely capture the topology of your graph and how they are being used in drug discovery as well as various other industries.

We will end the talk by showcasing how you can get started with Neo4j.

Kristof Neys
Neo4j

Kristof is Director Graph Data Science Technology in the Field Engineering team at Neo4j, the leading graph technology platform, where he advises on and implements graph data science solutions for Neo4j’s clients. He is also currently pursuing a PhD in Graph Machine Learning at Birkbeck, University of London. Kristof holds an MSc in Mathematics and an MSc in Data Science from the University of London. Prior to joining Neo4j, Kristof had a 20-year career in Fixed Income Sales & Trading at some of the major investment banks in London.

Generated Summary
WARNING: This summary was generated using GPT based on the transcript. As a result, spelling mistakes and, more importantly, hallucinations can be present.

Knowledge Graphs, Graph Data Science, Graph Embeddings, and Graph Machine Learning
Introduction
Knowledge graphs are becoming increasingly ubiquitous in industry, but the concept is not fully understood. Neo4j provides an open source platform for users to explore and understand these concepts. At the end of the talk, attendees should have an understanding of knowledge graphs, graph data science, graph embeddings and machine learning. Graph databases are a natural way of representing reality and are best used for exploring relationships between data points. A knowledge graph is a graph with additional semantics, which add a layer of meaning to the data. This allows for more complex queries and analysis. A basic example of a knowledge graph is three nodes (two people and one car) that have relationships between them and unique properties that define them.
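As a rough illustration of the three-node example, the sketch below models two people and one car as a small property graph in plain Python. The names, labels, relationship types, and properties are all hypothetical and were chosen only for illustration; a graph database such as Neo4j stores and queries this kind of structure natively.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                      # the node's type, e.g. Person or Car
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str                      # id of the source node
    rel_type: str                   # the meaning of the connection
    end: str                        # id of the target node
    properties: dict = field(default_factory=dict)


# Hypothetical data: two people and one car.
nodes = {
    "ann": Node("ann", "Person", {"name": "Ann", "born": 1985}),
    "dan": Node("dan", "Person", {"name": "Dan", "born": 1982}),
    "car": Node("car", "Car", {"brand": "Volvo", "model": "V70"}),
}
relationships = [
    Relationship("ann", "KNOWS", "dan", {"since": 2011}),
    Relationship("ann", "DRIVES", "car"),
    Relationship("dan", "OWNS", "car"),
]


def neighbours(node_id, rel_type=None):
    """Return the ids reachable from node_id via outgoing relationships."""
    return [
        r.end
        for r in relationships
        if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
    ]


def people_connected_to(target_id):
    """Return the sorted names of Person nodes with a relationship to target_id."""
    return sorted(
        nodes[r.start].properties["name"]
        for r in relationships
        if r.end == target_id and nodes[r.start].label == "Person"
    )


print(neighbours("ann"))
print(people_connected_to("car"))
```

The typed relationships (KNOWS, DRIVES, OWNS) are what carry the extra layer of meaning: the same two nodes can be connected in different ways, and queries can filter on that.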
What is a Knowledge Graph?
This part of the talk explains the concept of a knowledge graph and how it can be used in data science. A knowledge graph is a collection of nodes and the relationships between them, and it can be used to detect implicit relationships in large datasets. Once the data is loaded into a graph database, it can be queried quickly, and explicit relationships between nodes can be created to form a subgraph. More advanced methods can then detect less obvious, implicit relationships, and the graph database's query language can be used to explore the data further. Knowledge Graphs are becoming increasingly popular in industry and are used for purposes such as fraud detection and production process improvement. Neo4j has been working with companies such as JPMorgan, Siemens, Caterpillar, and NASA to establish Knowledge Graphs. Gartner predicts that graph technology will become a standard in the next few years. Neo4j's database is designed for scalability, and its Cypher query language is likely to become the standard for graph databases.
Querying a Knowledge Graph
This part of the talk focuses on using the query language to run algorithms and models. Learning the language is an investment, but it pays off because the same language is used to run those algorithms and models. The talk also mentions a visualization tool, analytics, and how to structure data as a graph. It then covers graph algorithms such as centrality metrics and community detection, which can be used to detect implicit relationships and create semantics.
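To make community detection concrete, here is a minimal sketch of one of the simplest algorithms in that family, weakly connected components, written in plain Python. It is not Neo4j's implementation. The account names and edges are hypothetical, loosely inspired by the fraud detection use case mentioned earlier. Each node ends up with a component id, which is the kind of result that could be written back to the graph as a node property.

```python
from collections import deque


def weakly_connected_components(nodes, edges):
    """Assign a component id to every node, ignoring edge direction."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    component_of = {}
    next_id = 0
    for start in nodes:
        if start in component_of:
            continue  # already reached from an earlier start node
        # Breadth-first search labels everything reachable from start.
        component_of[start] = next_id
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in component_of:
                    component_of[neighbour] = next_id
                    queue.append(neighbour)
        next_id += 1
    return component_of


# Hypothetical accounts; an edge means two accounts share a device.
accounts = ["acct1", "acct2", "acct3", "acct4", "acct5"]
shared_device = [("acct1", "acct2"), ("acct2", "acct3"), ("acct4", "acct5")]

components = weakly_connected_components(accounts, shared_device)
print(components)
```

Here acct1 and acct3 never share a device directly, yet they land in the same component through acct2, which is a small example of an implicit relationship becoming explicit.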
Graph Machine Learning
Finally, the talk covers graph machine learning, which involves embedding techniques. It discusses how to use graph algorithms to analyze networks and learn from their structure. It explains how to compute scores and write them back as properties on each node, and how to use embeddings and similarity algorithms to represent nodes as vectors and compare them. It also explains how the scalability of the database extends to the scalability of the algorithms. To get the audience thinking in graph terms, the talk includes a quiz that asks them to choose the most important of three nodes. Routers have the highest betweenness centrality score, which is computed from the number of shortest paths that run through a particular node. This makes them important to preserve in order to keep the network running. Graph machine learning converts a node, its neighborhood, and its features into a numerical vector representation.
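The betweenness centrality idea can be sketched in a few lines of Python. The version below uses Brandes' algorithm for unweighted, undirected graphs, which is a standard way to compute the score and not necessarily what Neo4j runs internally. The small network and its node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Phase 1: breadth-first search counting shortest paths from source.
        visit_order = []
        predecessors = {node: [] for node in adjacency}
        path_counts = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_counts[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:            # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_counts[w] += path_counts[v]
                    predecessors[w].append(v)
        # Phase 2: walk back from the farthest nodes, accumulating the share
        # of shortest paths that pass through each intermediate node.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(visit_order):
            for v in predecessors[w]:
                share = path_counts[v] / path_counts[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every pair was counted from both endpoints, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical network: four devices that communicate through one router.
network = {
    "a": ["b", "router"],
    "b": ["a", "router"],
    "router": ["a", "b", "c", "d"],
    "c": ["router"],
    "d": ["router"],
}

scores = betweenness_centrality(network)
print(max(scores, key=scores.get))
```

In this toy network every shortest path between the two sides runs through the router, so removing it would disconnect the graph, which is why a high betweenness score marks a node worth protecting.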
GraphSAGE, FastRP, and Node2Vec
Neo4j offers three models for this: GraphSAGE, FastRP, and Node2Vec, which is an extension of Word2Vec. GraphSAGE and FastRP are powerful algorithms that take feature information into account. FastRP was developed to improve upon Node2Vec and is much faster. GraphSAGE is a more full-blown graph neural network that can predict vector representations for new nodes added to the graph. With ground truth labels, traditional machine learning methods can be used to learn the association between vectors and labels. Neo4j supports three workflows: node classification, link prediction, and node regression. A pipeline architecture is used for tuning parameters, computing features, and training models.
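To give a feel for how a node and its neighborhood can become a vector, the following is a deliberately simplified sketch of the random-projection idea behind this family of embeddings. It is not Neo4j's FastRP implementation: it starts from dense Gaussian vectors, includes each node in its own average, and skips the normalization and iteration weights a real implementation would use. The function names, parameters, and the toy graph are all assumptions made for illustration.

```python
import math
import random


def random_projection_embeddings(adjacency, dimension=16, iterations=2, seed=42):
    """Simplified random-projection node embeddings (illustrative only)."""
    rng = random.Random(seed)
    # Start from a random vector per node.
    current = {
        node: [rng.gauss(0.0, 1.0) for _ in range(dimension)]
        for node in adjacency
    }
    embeddings = {node: [0.0] * dimension for node in adjacency}
    for _ in range(iterations):
        propagated = {}
        for node, neighbours in adjacency.items():
            # Average over the node itself and its direct neighbours.
            group = [node, *neighbours]
            propagated[node] = [
                sum(current[member][i] for member in group) / len(group)
                for i in range(dimension)
            ]
        current = propagated
        # Accumulate every iteration so nearby and wider context both count.
        for node in adjacency:
            embeddings[node] = [
                total + value
                for total, value in zip(embeddings[node], current[node])
            ]
    return embeddings


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical graph: two triangles joined by a single edge between c and d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}

emb = random_projection_embeddings(graph)
same_cluster = cosine_similarity(emb["a"], emb["b"])
other_cluster = cosine_similarity(emb["a"], emb["e"])
print(same_cluster > other_cluster)
```

Nodes a and b share exactly the same neighborhood, so their vectors end up nearly identical, while a node from the other triangle lands elsewhere in the vector space. Vectors like these are what a downstream classifier would consume in a node classification workflow.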
Graph Neural Networks
Graph Neural Networks (GNNs) have become one of the top three topics in Machine Learning, as seen in submissions to major conferences such as ICLR. GNNs have been used in a variety of applications such as protein folding, fighting Ebola, and digital assistants like KBC's Kate. GNNs can detect implicit relationships, create semantics, and perform Knowledge Graph completion. They differ from conventional neural networks in that GNNs work with embeddings and learn weights for both the embeddings and the neural network. Research into GNNs began in 2003-2009, but it was not until 2016-2018, with the development of Graph Convolutional Networks, that the field saw explosive growth.
Chatbot with Knowledge Graph
This part of the talk describes how a chatbot with a Knowledge Graph underneath it achieves a 90% retention rate. It provides a toy example of a graph with nodes about Elon Musk related to solar power, cars, machines, and Mars. It also explains how link prediction and Knowledge Graph completion can be used to answer questions from the graph. Lastly, it introduces Transformers, a type of language model used to create Knowledge Graphs for biomedical applications, and provides guidance on getting started with Neo4j.
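Link prediction can be illustrated with a classic heuristic: two unconnected nodes that share many neighbours are likely candidates for a missing link. The sketch below applies that common-neighbours score to a toy graph loosely based on the Elon Musk example. The edges are hypothetical, and the heuristic is a stand-in chosen for simplicity; the talk does not say which method was used.

```python
def build_adjacency(edges):
    """Turn a list of undirected edges into a dict of neighbour sets."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def common_neighbour_scores(adjacency, node):
    """Rank nodes not yet linked to `node` by how many neighbours they share."""
    scores = {}
    for candidate in adjacency:
        if candidate == node or candidate in adjacency[node]:
            continue  # skip the node itself and existing links
        shared = adjacency[node] & adjacency[candidate]
        if shared:
            scores[candidate] = len(shared)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy graph, for illustration only.
edges = [
    ("Elon Musk", "Tesla"),
    ("Elon Musk", "SpaceX"),
    ("Elon Musk", "SolarCity"),
    ("Tesla", "Cars"),
    ("Tesla", "Solar power"),
    ("SolarCity", "Solar power"),
    ("SpaceX", "Mars"),
]

adjacency = build_adjacency(edges)
predictions = common_neighbour_scores(adjacency, "Elon Musk")
print(predictions)
```

The top-ranked candidate links "Elon Musk" to "Solar power" because two existing neighbours connect them. Knowledge Graph completion follows the same pattern at scale, usually with learned embeddings in place of a hand-written score.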
Conclusion
This presentation discussed how to use graph technology with Python to run algorithms and get results. Centrality algorithms such as closeness centrality have been improved for scalability and are open source. Additionally, users can modify the code and use the Pregel API to run their own version of an algorithm in Neo4j. Finally, resources are available to help users on their graph journey. This talk provided a comprehensive overview of knowledge graphs, graph data science, graph embeddings, and graph machine learning. Attendees now have a better understanding of how to use graph technology to analyze relationships and solve problems.