Amy manages the Neo4j graph analytics programs and marketing. She loves seeing how our ecosystem uses graph analytics to reveal structures within real-world networks and infer dynamic behavior. Amy has consistently helped teams break into new markets at startups and at large companies including EDS, Microsoft, and Hewlett-Packard (HP). She comes most recently from Cray Inc., where she was the analytics and artificial intelligence market manager. Amy loves science and art and has a deep fascination with complexity science. When the weather is good, you’re likely to find her cycling the passes of beautiful Eastern Washington.
Relationships are among the most predictive indicators of behavior and preferences. Community detection based on relationships is a powerful tool for inferring similar preferences in peer groups, anticipating future behavior, estimating group resiliency, finding hierarchies, and preparing data for further analysis. Centrality measures based on relationships identify the most important items in a network and help us understand group dynamics such as influence, accessibility, the speed at which things spread, and bridges between groups. Data scientists use graph algorithms to identify groups and estimate important entities based on their interactions.

In this session, we'll cover the common uses of community detection and centrality measures and how some of the iconic graph algorithms compute values. We'll show examples of how to run community detection and centrality algorithms in Apache Spark, including using the AggregateMessages function to add your own algorithms. You'll learn best practices and tips for tricky situations. For those who want to run graph algorithms in a graph platform, we'll also illustrate a few examples in Neo4j.

Some of the community detection and centrality algorithms included:

* Triangle Count and Clustering Coefficient to estimate network cohesiveness
* Strongly Connected Components and Connected Components to find clusters
* Label Propagation to quickly infer groups and clean data with semi-supervised learning
* Louvain Modularity to uncover group hierarchies
* Balanced Triads to identify unstable groups
* PageRank to reveal influencers
* Betweenness Centrality to predict bottlenecks and bridges
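To give a flavor of how two of these iconic algorithms compute values, here is a minimal plain-Python sketch of Label Propagation (community detection) and PageRank (centrality). These are teaching versions on a made-up graph, not the Spark or Neo4j implementations; the tie-breaking rule and iteration counts are illustrative choices.

```python
def label_propagation(adj, max_iters=10):
    """Each node repeatedly adopts the most frequent label among its
    neighbors (ties broken by the larger label, for determinism)."""
    labels = {node: node for node in adj}          # start with unique labels
    for _ in range(max_iters):
        changed = False
        for node, neighbors in adj.items():        # asynchronous updates
            if not neighbors:
                continue
            counts = {}
            for nb in neighbors:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            best = max(counts, key=lambda lab: (counts[lab], lab))
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:                            # converged
            break
    return labels

def pagerank(adj, damping=0.85, iters=30):
    """Power iteration: each round, rank flows along out-edges;
    dangling nodes spread their rank evenly over the whole graph."""
    n = len(adj)
    ranks = {node: 1.0 / n for node in adj}
    for _ in range(iters):
        new = {node: (1.0 - damping) / n for node in adj}
        for node, out in adj.items():
            if out:
                share = damping * ranks[node] / len(out)
                for target in out:
                    new[target] += share
            else:
                for target in adj:
                    new[target] += damping * ranks[node] / n
        ranks = new
    return ranks

# Example: two 4-cliques joined by a single bridge edge (4 <-> 5),
# stored as an undirected graph with both edge directions.
adj = {
    1: [2, 3, 4], 2: [1, 3, 4], 3: [1, 2, 4], 4: [1, 2, 3, 5],
    5: [4, 6, 7, 8], 6: [5, 7, 8], 7: [5, 6, 8], 8: [5, 6, 7],
}
communities = label_propagation(adj)
ranks = pagerank(adj)
```

On this graph, label propagation splits the nodes into the two cliques, and the bridge endpoints (4 and 5) earn the highest PageRank because they receive rank from the most neighbors.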
One of the most practical ways to improve machine learning predictions right away is to use graph algorithms for connected feature extraction. We’ll quickly dive into creating a machine learning pipeline, with tips on training and evaluating a model for link prediction, integrating Neo4j and Spark in our workflow. We’ll look at an example that uses several models to predict future collaborations and shows measurable improvements from graph-based features.
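As a hedged illustration of what "connected feature extraction" means for link prediction, the sketch below computes three classic topological features for a candidate node pair on a toy co-authorship graph. The graph, names, and feature set here are hypothetical examples; in the workflow described above these features would be computed in Neo4j and Spark and then fed to the models.

```python
import math

def link_features(adj, u, v):
    """Classic graph features for scoring a candidate link (u, v)."""
    nu, nv = set(adj[u]), set(adj[v])
    common = nu & nv
    return {
        # How many collaborators the pair already shares
        "common_neighbors": len(common),
        # Product of degrees: well-connected pairs link more often
        "preferential_attachment": len(nu) * len(nv),
        # Adamic-Adar: shared neighbors weighted by how selective they are
        "adamic_adar": sum(1.0 / math.log(len(adj[w]))
                           for w in common if len(adj[w]) > 1),
    }

# Toy co-authorship graph: each author maps to past collaborators.
adj = {
    "ann": ["bob", "cal"],
    "bob": ["ann", "cal", "dee"],
    "cal": ["ann", "bob", "dee"],
    "dee": ["bob", "cal", "eve"],
    "eve": ["dee"],
}
features = link_features(adj, "ann", "dee")
```

Each candidate pair becomes one training row of numeric features, labeled by whether the collaboration actually occurred later; that labeled table is what the link-prediction models train on.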
Graphs are a powerful and flexible way of doing data analysis and integration. The Cypher query language (akin to SQL for graphs), already implemented in systems like Neo4j, SAP HANA Graph, and Redis Graph, allows for the intuitive definition of graph patterns, including structural and semantic predicates. Cypher can also be used to express complex data integration patterns within many Spark programs. To bring the benefits of Cypher from the graph database realm into the world of Big Data, Neo4j has developed an exciting new product for Apache Spark. This leading-edge product is a hybrid workbench that fuses graph and relational data from different sources. It uses a single abstraction model and streaming integrations to weave together data in Spark without altering or copying the original sources.

Join us for an overview of the new graph-based data integration and graph analytical query workloads in Spark across many different sources, such as Neo4j, SQL databases, or HDFS. You’ll see how analysts can lift graph structures out of many disparate formats and systems, unify them into a single graph layer, and query an organization's hypergraph with Cypher running on top of Spark. We’ll show you how recent changes to support multiple graphs in the Cypher language unlock new data integration potential for all Spark developers. Attend this session to learn about the impact of graphs in Big Data applications and use cases for graph data integration. We’ll also provide a short demo of how this new product works and how you can get started!
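To make the "lift and unify" idea concrete, here is a plain-Python sketch of the concept only: rows from two hypothetical tabular sources are lifted into one typed edge list, and a pattern is then matched against the unified graph. This is not the product's API (there, the lifting and the Cypher pattern matching happen inside Spark); the table contents and relationship names are invented for illustration.

```python
# Source 1: an "employees" table (person, department)
employees = [("ann", "eng"), ("bob", "eng"), ("cal", "sales")]
# Source 2: a "tickets" log from another system (person, system touched)
tickets = [("ann", "billing"), ("cal", "billing")]

# Lift both sources into a single edge list with typed relationships,
# without altering the original rows.
graph = [("person:" + p, "WORKS_IN", "dept:" + d) for p, d in employees]
graph += [("person:" + p, "TOUCHED", "sys:" + s) for p, s in tickets]

def colleagues(edges):
    """Match pairs of people who work in the same department, in the
    spirit of a Cypher pattern like
    (a)-[:WORKS_IN]->(d)<-[:WORKS_IN]-(b)."""
    works = [(s, t) for s, r, t in edges if r == "WORKS_IN"]
    return sorted({(a, b) for a, d1 in works for b, d2 in works
                   if d1 == d2 and a < b})

pairs = colleagues(graph)
```

The point of the unified layer is that the pattern query never needs to know which source system each relationship originally came from.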