Large Scale Topic Modeling: Improvements to LDA on Apache Spark

This blog was written by Feynman Liang and Joseph Bradley from Databricks, and Yuhao Yang from Intel. To get started using LDA, download...

Apache Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time Intervals, and UDAFs

To try the new features highlighted in this blog post, download Spark 1.5 or sign up for a 14-day free trial of Databricks today...

Announcing Apache Spark 1.5

September 8, 2015 by Reynold Xin and Patrick Wendell
The inaugural Spark Summit Europe will be held in Amsterdam this October. Check out the full agenda and get your ticket before it...

From Pandas to Apache Spark's DataFrame

August 12, 2015 by Olivier Girardot
This is a cross-post from the blog of Olivier Girardot. Olivier is a software engineer and the co-founder of Lateral Thoughts, where he...

Diving into Apache Spark Streaming's Execution Model

With so many distributed stream processing engines available, people often ask us about the unique benefits of Apache Spark Streaming. From early...

New Features in Machine Learning Pipelines in Apache Spark 1.4

Apache Spark 1.2 introduced Machine Learning (ML) Pipelines to facilitate the creation, tuning, and inspection of practical ML workflows. Spark’s latest release, Spark...

Joint Blog Post: Bringing ORC Support into Apache Spark

This is a joint blog post with our partner Hortonworks. Zhan Zhang is a member of technical staff at Hortonworks, where he collaborated...

Introducing Window Functions in Spark SQL

In this blog post...

New Visualizations for Understanding Apache Spark Streaming Applications

Earlier, we presented new visualizations introduced in Apache Spark 1.4.0 to understand the behavior of Spark applications. Continuing the theme, this blog highlights...

Guest blog: PMML Support in Apache Spark's MLlib

This is a guest blog from our friend Vincenzo Selvaggio who contributed this feature. He is a Senior Java Technical Architect and Project...