Awesome Spark

A curated list of awesome Apache Spark packages and resources.

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance (Wikipedia 2017).

Users of Apache Spark may choose between the Python, R, Scala and Java programming languages to interface with the Apache Spark APIs.

Contents

Packages

Language Bindings

Notebooks and IDEs

General Purpose Libraries

SQL Data Sources

Bioinformatics

GIS

Time Series Analytics

Graph Processing

Machine Learning Extension

Middleware

Utilities

Natural Language Processing

Streaming

Interfaces

Testing

Workflow Management

Resources

Books

Papers

MOOCs

Workshops

Projects Using Spark

Blogs

Docker Images

Miscellaneous

References

License

Apache Spark, Spark, Apache, and the Spark logo are trademarks of The Apache Software Foundation. This compilation is not endorsed by The Apache Software Foundation.

Inspired by sindresorhus/awesome.