Russell received his Ph.D. from UCSF after performing a lot of comparisons of protein binding sites. He then joined DataStax to get more involved in the big data and distributed computing scene. He currently spends the majority of his time developing the integration of Cassandra with Hadoop, Spark, Solr, and other open source technologies.
Learn from someone who has made just about every basic Apache Spark mistake possible so you don't have to! We'll go over some of the most common things users do that end up causing unnecessary pain, and explain how to avoid them. Confused about serialization? Not sure what is meant by "use a singleton to share connections"? Together we will walk through concrete examples of how to handle these situations. Learn how to do all your work remotely, avoid breaking your Catalyst optimizations, use all your resources, and much more! Let's learn how to make our Spark applications better!