Apache Spark
Unified engine for large-scale data processing.
Contents
Packages
Language Bindings
Notebooks and IDEs
Apache Zeppelin 6.6k
updated 1mo ago
Web-based notebook that enables interactive data analytics with pluggable backends, integrated plotting, and extensive Spark support out of the box.
General Purpose Libraries
SQL Data Sources
Storage
Bioinformatics
GIS
Graph Processing
Machine Learning Extension
ModelDB 1.7k
updated 1y ago
A system to manage machine learning models for spark.ml and scikit-learn.
MLeap 1.5k
updated 1mo ago
Execution engine and serialization format that supports deploying org.apache.spark.ml (o.a.s.ml) models without a dependency on SparkSession.