Martin Junghanns - Databricks

Martin Junghanns

Software Engineer, Neo4j

Martin Junghanns is part of the Cypher for Apache Spark engineering team at Neo4j. He is also the main developer of Gradoop, a system for graph analytics on distributed data flow systems. Martin holds an MSc in Computer Science from the University of Leipzig.

PAST SESSIONS

Extending Spark Graph for the Enterprise with Morpheus and Neo4j
Summit Europe 2019

Spark 3.0 introduces a new module: Spark Graph. Spark Graph adds the popular query language Cypher, its accompanying Property Graph Model and graph algorithms to the data science toolbox. Graphs have a plethora of useful applications in recommendation, fraud detection and research.

Morpheus is an open-source library that is API-compatible with Spark Graph and extends its functionality with:

  • A Property Graph catalog to manage multiple Property Graphs and Views 
  • Property Graph Data Sources that connect Spark Graph to Neo4j and SQL databases
  • Extended Cypher capabilities including multiple graph support and graph construction
  • Built-in support for the Neo4j Graph Algorithms library

In this talk, we will walk you through the new Spark Graph module and demonstrate how we extend it with Morpheus to help enterprise users integrate Spark Graph into their existing Spark and Neo4j installations.

We will demonstrate how to explore data in Spark, use Morpheus to transform data into a Property Graph, and then build a Graph Solution in Neo4j.

Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apache Spark
Summit 2019

Graph data and graph analytics are increasingly important in data science and engineering. Cypher, an open language for querying and updating graph databases and analytics platforms, is now available in the Apache Spark environment. Neo4j Morpheus leverages the open source graph language project to integrate data from Neo4j operational graph databases with Hive and JDBC SQL data sources, using new Cypher features like the Property Graph Catalog, named graphs, graph projection, parameterized graph view functions, and graph/table views. Input and output graphs can be loaded and stored as structured collections of DataFrames with strong graph schemas to ensure data consistency and enable graph query optimization. Property graphs can also be analyzed and transformed using graph algorithms such as those in the GraphFrames project. Besides describing and demonstrating these capabilities, this talk also discusses the Spark Project Improvement Proposal to bring Cypher into Spark 3.0, and outlines current work to unify Cypher with other graph query languages to form a new ISO standard Graph Query Language.

Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apache Spark (continues)
Summit 2019

Matching Patterns and Constructing Graphs with Cypher for Apache Spark
Summit Europe 2018

Graph pattern matching is one of the most interesting and challenging operations in analytics. Uncovering patterns of relationships in real-world networks helps us reveal their inner structures and infer or predict their dynamic behavior. The Cypher graph query language was originally designed for transactional graph databases like Neo4j. Spark developers and analysts can benefit from having this straightforward language available for analytic and data wrangling workloads. Cypher targets the property graph data model, making it easy to analyze highly connected datasets in a natural, uncomplicated way. Using a composable, declarative language for graphs reduces program complexity and allows complex data transformations.

Under the umbrella of the openCypher project, Cypher is the first industrial language to provide composable property graph querying with multiple named graphs. Graph construction is new in Cypher and a critical feature for the Spark world of immutable datasets and function chains. Neo4j initiated the Apache-licensed OSS project, Cypher for Apache Spark (CAPS), joining other Cypher language implementations like Neo4j, SAP HANA Graph, RedisGraph, Agens Graph and the OSS Cypher for Gremlin project. The language allows the intuitive definition of graph patterns including structural and semantic predicates.

Cypher for Apache Spark is a graph mirror of SparkSQL, with a graph catalog, graph data sources, graph schemas, graph operations functions, and textual Cypher queries. Graph querying and SQL querying can be interwoven at will, as Cypher can project graphs and tables, and process driving table inputs. We'll explain the importance of graphs and Cypher within Big Data applications and the main challenges of implementing a schema-flexible data model and graph-specific operators, e.g. for path computation, using DataFrames.

Takeaways:

  • Intro to the Cypher graph query language
  • Understand the benefits of graph-based data integration and analytics
  • Insights into Cypher for Apache Spark and how it parallels SparkSQL

Session hashtag: #SAISDev8