Kontextfrei: A new approach to testable Spark applications

Apache Spark has become the de facto standard for writing big data processing pipelines. While the business logic of Spark applications is often at least as complex as what we dealt with in the pre-big-data world, enabling developers to write comprehensive, fast unit test suites has not been a priority in the design of Spark. The main problem is that you cannot test your code without running at least a local SparkContext. Such tests are not really unit tests, and they are too slow for a test-driven development approach. In this talk, I will introduce the kontextfrei library, which aims to liberate you from the chains of the SparkContext. I will show how it helps restore the fast feedback loop we otherwise take for granted. In addition, I will explain how kontextfrei is implemented, discuss some of the design decisions made, and look at alternative approaches and current limitations.

Date: 2017-04-08
Time: 15:00 - 15:30
Conference / Event: Scalar 2017
Venue: Museum of the History of Polish Jews, Warsaw
