Python > Data Science

Data analysis and machine learning.

Machine Learning

General Purpose Machine Learning

PyCaret

An open-source, low-code machine learning library in Python.

Shogun 3.1k updated 2y ago

Machine learning toolbox.

xLearn 3.1k updated 2y ago

High Performance, Easy-to-use, and Scalable Machine Learning Package.

cuML 5.2k updated 2d ago

RAPIDS Machine Learning Library.

modAL 2.3k updated 2y ago

Modular active learning framework for Python3.

Sparkit-learn 1.1k updated 5y ago

PySpark + scikit-learn = Sparkit-learn.

mlpack 5.6k updated yesterday

A scalable C++ machine learning library (Python bindings).

dlib

Toolkit for making real-world machine learning and data analysis applications in C++ (Python bindings).

MLxtend 5.1k updated 2mo ago

Extension and helper modules for Python's data analysis and machine learning libraries.

hyperlearn

Rewritten Sklearn and Statsmodels: 50%+ faster, 50%+ less RAM usage, with GPU support.

Reproducible Experiment Platform (REP) 700 updated 1y ago

Machine Learning toolbox for Humans.

scikit-multilearn 955 updated 2y ago

Multi-label classification for Python.

seqlearn 704 updated 3y ago

Sequence classification toolkit for Python.

pystruct 670 updated 4y ago

Simple structured learning framework for Python.

sklearn-expertsys 489 updated 8y ago

Highly interpretable classifiers for scikit-learn.

RuleFit 442 updated 2y ago

Implementation of the RuleFit algorithm.

metric-learn 1.4k updated 5d ago

Metric learning algorithms in Python.

pyGAM 989 updated 2mo ago

Generalized Additive Models in Python.

causalml 5.8k updated 4d ago

Uplift modeling and causal inference with machine learning algorithms.

Deep Learning

Time Series

sktime 9.7k updated 4d ago

A unified framework for machine learning with time series. <img height="20" src="img/sklearn_big.png" alt="sklearn">

skforecast

Time series forecasting with machine learning models.

darts 9.3k updated 2d ago

A Python library for easy manipulation and forecasting of time series.

statsforecast 4.7k updated today

Lightning fast forecasting with statistical and econometric models.

mlforecast 1.2k updated 2d ago

Scalable machine learning-based time series forecasting.

neuralforecast 4.0k updated 2d ago

Scalable neural network-based time series forecasting.

tslearn 3.1k updated 2d ago

Machine learning toolkit dedicated to time-series data. <img height="20" src="img/sklearn_big.png" alt="sklearn">

tick 540 updated 1y ago

Module for statistical learning, with a particular emphasis on time-dependent modeling. <img height="20" src="img/sklearn_big.png" alt="sklearn">

greykite 1.9k updated 1y ago

A flexible, intuitive, and fast forecasting library.

Prophet 20.1k updated 23d ago

Automatic Forecasting Procedure.

PyFlux 2.1k updated 2y ago

Open source time series library for Python.

bayesloop 169 updated 1mo ago

Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.

luminol 1.2k updated 7mo ago

Anomaly Detection and Correlation library.

maya 3.4k updated 1y ago

Makes it easy to parse date strings and change timezones.

Chaos Genius 775 (archived)

ML-powered analytics engine for outlier/anomaly detection and root cause analysis.

Reinforcement Learning

Gymnasium 11.6k updated 2d ago

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym).

PettingZoo 3.4k updated 1mo ago

An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities.

MAgent2 325 updated 4mo ago

An engine for high performance multi-agent environments with very large numbers of agents, along with a set of reference environments.

Stable Baselines3 13.0k updated 7d ago

A set of improved implementations of reinforcement learning algorithms based on OpenAI Baselines.

Shimmy 205 updated 3mo ago

An API conversion tool for popular external reinforcement learning environments.

EnvPool 1.3k updated 2d ago

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

Tianshou 10.4k updated 3mo ago

An elegant PyTorch deep reinforcement learning library. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

Acme 3.9k updated 13d ago

A library of reinforcement learning components and agents.

Catalyst-RL 48 updated 4y ago

PyTorch framework for RL research. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

d3rlpy 1.6k updated 6mo ago

An offline deep reinforcement learning library.

DI-engine 3.6k updated 3mo ago

OpenDILab Decision AI Engine. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

TF-Agents 3.0k updated 2mo ago

A library for Reinforcement Learning in TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

TensorForce 3.3k updated 1y ago

A TensorFlow library for applied reinforcement learning. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

TRFL 3.1k updated 3y ago

TensorFlow Reinforcement Learning. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

Dopamine 10.9k updated 1y ago

A research framework for fast prototyping of reinforcement learning algorithms.

keras-rl 5.6k updated 2y ago

Deep Reinforcement Learning for Keras. <img height="20" src="img/keras_big.png" alt="Keras compatible">

garage 2.1k updated 2y ago

A toolkit for reproducible reinforcement learning research.

Horizon 3.7k updated 5d ago

A platform for Applied Reinforcement Learning.

rlpyt 2.3k updated 5y ago

Reinforcement Learning in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

cleanrl 9.4k updated 8mo ago

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG).

Machin 418 updated 4y ago

A reinforcement learning library designed for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

SKRL 1.0k updated 8d ago

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Isaac Orbit and Omniverse Isaac Gym. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

Imitation 1.7k updated 1y ago

Clean PyTorch implementations of imitation and reward learning algorithms. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

Graph Machine Learning

pytorch_geometric 23.6k updated 3d ago

Geometric Deep Learning Extension Library for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

pytorch_geometric_temporal 3.0k updated 6mo ago

Temporal Extension Library for PyTorch Geometric. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

PyTorch Geometric Signed Directed 146 updated 1y ago

A signed/directed graph neural network extension library for PyTorch Geometric. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

dgl 14.3k updated 7mo ago

Python package built to ease deep learning on graphs, on top of existing DL frameworks. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> <img height="20" src="img/tf_big2.png" alt="TensorFlow"> <img height="20" src="img/mxnet_big.png" alt="MXNet based">

GRAPE 623 updated 2y ago

GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations.

Spektral 2.4k updated 2y ago

Deep learning on graphs. <img height="20" src="img/keras_big.png" alt="Keras compatible">

StellarGraph 3.0k updated 1y ago

Machine Learning on Graphs. <img height="20" src="img/tf_big2.png" alt="TensorFlow"> <img height="20" src="img/keras_big.png" alt="Keras compatible">

Graph Nets 5.4k updated 3y ago

Build Graph Nets in TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

TensorFlow GNN 1.5k updated 5d ago

A library to build Graph Neural Networks on the TensorFlow platform. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

Auto Graph Learning 1.1k updated 4mo ago

An AutoML framework and toolkit for machine learning on graphs.

PyTorch-BigGraph 3.5k (archived)

Generate embeddings from large-scale graph-structured data. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

Karate Club 2.3k updated 1y ago

An unsupervised machine learning library for graph-structured data.

Little Ball of Fur 713 updated 3mo ago

A library for sampling graph structured data.

GreatX 89 updated 1y ago

A graph reliability toolbox based on PyTorch and PyTorch Geometric (PyG). <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

Jraph 1.5k (archived)

A Graph Neural Network Library in JAX.

TRL 17.8k updated 2d ago

Train transformer language models with reinforcement learning.

Probabilistic Methods

pyro 9.0k updated 8mo ago

A flexible, scalable deep probabilistic programming library built on PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

PyMC 9.5k updated 2d ago

Bayesian Stochastic Modelling in Python.

InferPy 148 updated 1y ago

Deep Probabilistic Modelling Made Easy. <img height="20" src="img/tf_big2.png" alt="TensorFlow">

PyStan 362 updated 13d ago

Bayesian inference using the No-U-Turn sampler (Python interface).

sklearn-bayes 523 updated 4y ago

Python package for Bayesian Machine Learning with scikit-learn API. <img height="20" src="img/sklearn_big.png" alt="sklearn">

skpro 317 updated 3d ago

Supervised domain-agnostic prediction framework for probabilistic modelling by The Alan Turing Institute. <img height="20" src="img/sklearn_big.png" alt="sklearn">

PyVarInf 362 updated 6y ago

Bayesian Deep Learning methods with Variational Inference for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

emcee 1.6k updated 8d ago

The Python ensemble sampling toolkit for affine-invariant MCMC.

hsmmlearn 86 (archived)

A library for hidden semi-Markov models with explicit durations.

pyhsmm 575 updated 1y ago

Bayesian inference in HSMMs and HMMs.

GPyTorch 3.9k updated 13d ago

A highly efficient and modular implementation of Gaussian Processes in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">

sklearn-crfsuite 434 updated 1mo ago

A scikit-learn-inspired API for CRFsuite. <img height="20" src="img/sklearn_big.png" alt="sklearn">

Model Explanation

dalex

moDel Agnostic Language for Exploration and explanation.

Shapley 224 updated 2mo ago

A data-driven framework to quantify the value of classifiers in a machine learning ensemble.

Alibi 2.6k updated 5mo ago

Algorithms for monitoring and explaining machine learning models.

anchor 812 updated 3y ago

Code for "High-Precision Model-Agnostic Explanations" paper.

aequitas 755 updated 1mo ago

Bias and Fairness Audit Toolkit.

Contrastive Explanation 45 updated 3y ago

Contrastive Explanation (Foil Trees).

yellowbrick 4.4k updated 1y ago

Visual analysis and diagnostic tools to facilitate machine learning model selection.

scikit-plot 2.4k updated 1y ago

An intuitive library to add plotting functionality to scikit-learn objects.

shap 25.2k updated 13d ago

A unified approach to explain the output of any machine learning model.

InterpretML 6.8k updated 2d ago

InterpretML implements the Explainable Boosting Machine (EBM), a modern, fully interpretable machine learning model based on Generalized Additive Models (GAMs). This open-source package also provides visualization tools for EBMs, other glass-box models, and black-box explanations.

ELI5 2.8k updated 1mo ago

A library for debugging/inspecting machine learning classifiers and explaining their predictions.

Lime 12.1k updated 1y ago

Explaining the predictions of any machine learning classifier.

FairML 366 updated 4y ago

FairML is a Python toolbox for auditing machine learning models for bias.

L2X 124 updated 4y ago

Code for replicating the experiments in the paper Learning to Explain: An Information-Theoretic Perspective on Model Interpretation.

PDPbox 861 updated 1y ago

Partial dependence plot toolbox.

PyCEbox 163 updated 5y ago

Python Individual Conditional Expectation Plot Toolbox.

Skater

Python Library for Model Interpretation.

model-analysis 1.3k updated 7mo ago

Model analysis tools for TensorFlow.

themis-ml 126 (archived)

A library that implements fairness-aware machine learning algorithms.

treeinterpreter 761 updated 2y ago

Interpreting scikit-learn's decision tree and random forest predictions.

AI Explainability 360 1.8k updated 6d ago

Interpretability and explainability of data and machine learning models.

Auralisation 42 updated 9y ago

Auralisation of learned features in CNN (for audio).

CapsNet-Visualization 395 updated 4y ago

A visualization of the CapsNet layers to better understand how it works.

lucid 4.7k (archived)

A collection of infrastructure and tools for research in neural network interpretability.

Netron 32.6k updated 2d ago

Visualizer for deep learning and machine learning models (no Python code, but visualizes models from most Python Deep Learning frameworks).

FlashLight

Visualization tool for your neural network.

tensorboard-pytorch 8.0k updated 1mo ago

TensorBoard for PyTorch (and Chainer, MXNet, NumPy, ...).

Optimization

Optuna 13.8k updated today

A hyperparameter optimization framework.

pymoo 2.8k updated 1mo ago

Multi-objective Optimization in Python.

pycma 1.3k updated 27d ago

Python implementation of CMA-ES.

Spearmint 1.6k updated 6y ago

Bayesian optimization.

BoTorch 3.5k updated 5d ago

Bayesian optimization in PyTorch.

scikit-opt 6.4k updated 6mo ago

Heuristic Algorithms for optimization.

sklearn-genetic-opt 357 updated 6mo ago

Hyperparameters tuning and feature selection using evolutionary algorithms.

SMAC3 1.2k updated 4d ago

Sequential Model-based Algorithm Configuration.

Optunity 425 updated 2y ago

A library containing various optimizers for hyperparameter tuning.

hyperopt 7.6k updated 8d ago

Distributed Asynchronous Hyperparameter Optimization in Python.

hyperopt-sklearn 1.6k updated 11mo ago

Hyper-parameter optimization for sklearn.

sklearn-deap 773 updated 2y ago

Use evolutionary algorithms instead of grid search in scikit-learn.

sigopt_sklearn 75 (archived)

SigOpt wrappers for scikit-learn methods.

Bayesian Optimization 8.6k updated 9d ago

A Python implementation of global optimization with Gaussian processes.

SafeOpt 150 updated 3y ago

Safe Bayesian Optimization.

scikit-optimize 2.8k (archived)

Sequential model-based optimization with a scipy.optimize interface.

Solid 584 updated 6y ago

A comprehensive gradient-free optimization framework written in Python.

PySwarms 1.4k updated 1y ago

A research toolkit for particle swarm optimization in Python.

Platypus 647 updated 2mo ago

A Free and Open Source Python Library for Multiobjective Optimization.

GPflowOpt 274 updated 5y ago

Bayesian Optimization using GPflow.

POT 2.8k updated 14d ago

Python Optimal Transport library.

Talos 1.6k updated 1y ago

Hyperparameter Optimization for Keras Models.

nlopt 2.2k updated 11d ago

Library for nonlinear optimization (global and local, constrained or unconstrained).

Feature Engineering

Data Manipulation

Data Frames

pandas_profiling 13.4k updated 22d ago

Create HTML profiling reports from pandas DataFrame objects.

polars 37.8k updated 2d ago

A fast multi-threaded, hybrid-out-of-core DataFrame library.

Arctic 3.1k updated 1y ago

High-performance datastore for time series and tick data.

datatable 1.9k updated 1y ago

Data.table for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib">

cuDF 9.6k updated today

GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated">

blaze 3.2k updated 2y ago

NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible">

pandasql 1.3k (archived)

Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible">

pandas-gbq

pandas integration with Google BigQuery. <img height="20" src="img/pandas_big.png" alt="pandas compatible">

xpandas 26 updated 3y ago

Universal 1d/2d data containers with Transformers functionality for data analysis by The Alan Turing Institute.

pysparkling 271 updated 1y ago

A pure Python implementation of Apache Spark's RDD and DStream interfaces. <img height="20" src="img/spark_big.png" alt="Apache Spark based">

modin 10.4k updated 1mo ago

Speed up your pandas workflows by changing a single line of code. <img height="20" src="img/pandas_big.png" alt="pandas compatible">

swifter 2.6k updated 2y ago

A package that efficiently applies any function to a pandas dataframe or series in the fastest available manner.

pandas-log 218 updated 4y ago

A package that allows providing feedback about basic pandas operations and finds both business logic and performance issues.

vaex 8.5k updated 24d ago

Out-of-core DataFrames for Python and ML: visualize and explore big tabular data at a billion rows per second.

xarray 4.1k updated 2d ago

Xarray combines the best features of NumPy and pandas for multidimensional data selection by supplementing numerical axis labels with named dimensions for more intuitive, concise, and less error-prone indexing routines.
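
As a rough illustration of what "named dimensions" means for xarray, the sketch below builds a small labeled array and indexes it by name rather than by position. The city names and values are made up for the example; only `xarray.DataArray`, `.sel`, and `.mean(dim=...)` come from the library itself.

```python
import numpy as np
import xarray as xr

# Hypothetical data: one reading per city per day (values are illustrative only).
temps = xr.DataArray(
    np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]),
    dims=("city", "day"),
    coords={"city": ["Oslo", "Cairo"], "day": [1, 2, 3]},
)

# Select by dimension name and coordinate label instead of integer position.
cairo_day2 = temps.sel(city="Cairo", day=2).item()

# Reduce over a named dimension rather than a numeric axis index.
mean_per_city = temps.mean(dim="day")
```

With plain NumPy the same operations would be written as `temps[1, 1]` and `temps.mean(axis=1)`, which only work if the reader remembers which axis is which.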