Neo4j Morpheus: Interweaving Documents, Tables and Graph Data in Spark


Fuse graph, document and relational data from transactional and analytic data sources into a property graph “bird’s eye view”. The property graph data model is Chen’s “entity relationship” model, without the clutter. Use “ASCII Art” visual property graph schemas to define “graph data lifts”, which map data from a data lake, RDBMS, RDF or graph data cloud services into Spark. Graphs in Spark can draw on multiple data sources. Leverage the Cypher query language to combine, split, and project graphs in Spark memory. Graph data is “woven” in Spark without altering or copying the original source.

The results of graph workloads can be written back into HDFS or other file systems. Graphs can be read from, stored in, and merged into a Neo4j transactional database, and tabular datasets can be extracted from graphs. Data scientists and engineers load, wrangle and analyze mixed-model data through Morpheus transformations. Enterprises use graphs to catalogue their disparate data assets and processes. They store graph datasets in the data lake. In a world of concern about data protection, see how graph data lifts allow tailored, canonical data views to be realized in Spark, without remodeling or moving data.

Morpheus combines SparkSQL and Cypher queries, and table/graph functions. Choose the right language for the job: eliminate cumbersome multi-joins for connected-data traversals by using super-concise Cypher patterns for sub-graph detection and graph projection, and use the power of table projection, grouping and aggregation in SparkSQL, all in one application. Feel free to “dismantle your graph”: expose your graph nodes or relationships as dataframes, or as Hive tables.
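To make the contrast between multi-joins and Cypher patterns concrete, here is a minimal sketch in plain Python. It does not use Morpheus or Spark. It only illustrates the idea of a graph "dismantled" into a node table and a relationship table, and the join work that a two-hop Cypher pattern would express in a single line. The data, the field names (`id`, `label`, `source`, `target`, `type`) and the function `friends_of_friends` are all hypothetical, chosen for illustration.

```python
# Hypothetical toy data: a property graph "dismantled" into two tables,
# one of nodes and one of relationships. All names and values are illustrative.
nodes = [
    {"id": 1, "label": "Person", "name": "Alice"},
    {"id": 2, "label": "Person", "name": "Bob"},
    {"id": 3, "label": "Person", "name": "Carol"},
    {"id": 4, "label": "Person", "name": "Dave"},
]
rels = [
    {"source": 1, "target": 2, "type": "KNOWS"},
    {"source": 2, "target": 3, "type": "KNOWS"},
    {"source": 3, "target": 4, "type": "KNOWS"},
]

# The Cypher pattern this stands in for:
#   MATCH (a:Person)-[:KNOWS]->(b:Person)-[:KNOWS]->(c:Person)
#   RETURN a.name, c.name


def friends_of_friends(nodes, rels):
    """Two-hop traversal written out as explicit table joins."""
    # Lookup from node id to name, restricted to Person nodes.
    name_by_id = {n["id"]: n["name"] for n in nodes if n["label"] == "Person"}

    # Keep only KNOWS relationships, remembering each one's position so the
    # same relationship is never used for both hops of one match.
    knows = [(i, r["source"], r["target"])
             for i, r in enumerate(rels) if r["type"] == "KNOWS"]

    # Index the second hop by its source node (the "join key").
    outgoing = {}
    for i, source, target in knows:
        outgoing.setdefault(source, []).append((i, target))

    pairs = []
    for i, a, b in knows:
        for j, c in outgoing.get(b, []):
            if i == j:
                continue  # a single relationship cannot serve as both hops
            if a in name_by_id and b in name_by_id and c in name_by_id:
                pairs.append((name_by_id[a], name_by_id[c]))
    return sorted(pairs)


result = friends_of_friends(nodes, rels)
print(result)
```

Each extra hop in the pattern adds another join of the relationship table against itself in the tabular version, whereas the Cypher pattern in the comment only grows by one more `-[:KNOWS]->(...)` segment.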
Key Takeaways

- Graph technology meets Big Data and Spark Analytics
- Property graphs: the superset data model
- Graph, relational and document data, interwoven
- Lift, split, combine, and create new graphs, from any data source
- Get your data fit to exploit graph compute, without losing any of your existing tools

Session hashtag: #SAISDD9



About Alastair Green

- openCypher and SQL Property Graphs standards contributor
- Lead, Query Languages Standards and Research, Neo4j Inc.
- Product Manager, Neo4j Morpheus/Cypher for Apache Spark
- Head of Enterprise Data Distribution Infrastructure, Barclays Investment Bank, 2011-2015
- Co-author, OASIS Business Transaction Protocol 1.1

About Mats Rydberg

Mats has worked with Neo4j for more than four years, with a focus on graph query language design and implementation. He leads the development of the Cypher for Apache Spark (CAPS) project, now called Morpheus, which has been accepted as a Spark 3.0 major feature under the name Spark Graph and will bring Cypher, the leading graph query language, to Apache Spark. Mats holds a Master's degree in Computer Science, specializing in graph algorithms.