Python > Data Science
Data analysis and machine learning.
Contents
Machine Learning
General Purpose Machine Learning
An open-source, low-code machine learning library in Python.
Machine learning toolbox.
High Performance, Easy-to-use, and Scalable Machine Learning Package.
RAPIDS Machine Learning Library.
Modular active learning framework for Python3.
PySpark + scikit-learn = Sparkit-learn.
A scalable C++ machine learning library (Python bindings).
Toolkit for making real-world machine learning and data analysis applications in C++ (Python bindings).
Extension and helper modules for Python's data analysis and machine learning libraries.
Re-written Sklearn and Statsmodels: 50%+ faster, 50%+ less RAM usage, with GPU support.
Machine Learning toolbox for Humans.
Multi-label classification for Python.
Sequence classification toolkit for Python.
Simple structured learning framework for Python.
Highly interpretable classifiers for scikit-learn.
Implementation of the RuleFit algorithm.
Metric learning algorithms in Python.
Generalized Additive Models in Python.
Uplift modeling and causal inference with machine learning algorithms.
Gradient Boosting
Scalable, Portable, and Distributed Gradient Boosting.
A fast, distributed, high-performance gradient boosting.
An open-source gradient boosting on decision trees library.
Fast GBDTs and Random Forests on GPUs.
Natural Gradient Boosting for Probabilistic Prediction.
A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Ensemble Methods
Simple and useful stacking library written in Python. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Library for machine learning stacking generalization. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Python package for stacking (machine learning technique). <img height="20" src="img/sklearn_big.png" alt="sklearn">
Imbalanced Datasets
Module to perform under-sampling and over-sampling with various techniques. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Python-based implementations of algorithms for learning on imbalanced data. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/tf_big2.png" alt="TensorFlow">
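
To give a rough idea of what "over-sampling" means in the entries above, here is a minimal sketch of random over-sampling using only the standard library. It is not the API of any library listed here; `random_oversample` and the toy data are hypothetical names chosen for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: four samples of class 0, one sample of class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X, y)
print(Counter(y_res))  # both classes now have 4 samples
```

Real libraries usually offer more than plain duplication (for example, synthetic sample generation), so treat this only as the simplest baseline.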
Random Forests
A forest of random projection trees. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Wrapper of the Random Bits Forest program written by Wang et al. (2016). <img height="20" src="img/sklearn_big.png" alt="sklearn">
Python Wrapper of Regularized Greedy Forest. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Kernel Methods
Factorization machines in Python. <img height="20" src="img/sklearn_big.png" alt="sklearn">
A library for Factorization Machines. <img height="20" src="img/sklearn_big.png" alt="sklearn">
TensorFlow implementation of an arbitrary order Factorization Machine. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/tf_big2.png" alt="TensorFlow">
Relevance Vector Machine implementation using the scikit-learn API. <img height="20" src="img/sklearn_big.png" alt="sklearn">
A fast SVM Library on GPUs and CPUs. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated">
Deep Learning
PyTorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
PyTorch Lightning is just organized PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
High-level library to help with training neural networks in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A scikit-learn compatible neural network library that wraps PyTorch. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
High-level utils for PyTorch DL & RL research. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A PyTorch-based deep learning library for drug pair scoring. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
TensorFlow
Computation using data flow graphs for scalable machine learning by Google.
Deep Learning and Reinforcement Learning Library for Researchers and Engineers.
Deep learning library featuring a higher-level API for TensorFlow.
TensorFlow-based neural network library.
A Neural Net Training Interface on TensorFlow.
A platform that helps you build, manage and monitor deep learning models.
Deploy TensorFlow graphs for fast evaluation and export to TensorFlow-less environments running numpy.
TensorFlow ROCm port.
Deep learning with dynamic computation graphs in TensorFlow.
A high-level framework for TensorFlow.
Model Parallelism Made Easier.
A toolbox that allows one to train and test deep learning models without the need to write code.
Keras community contributions.
Keras + Hyperopt: A straightforward wrapper for convenient hyperparameter optimization.
Distributed Deep learning with Keras & Spark.
A quantization deep learning library.
JAX
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
A neural network library for JAX that is designed for flexibility.
A gradient processing and optimization library for JAX.
Others
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Source-to-Source Debuggable Derivatives in Pure Python.
Efficiently computes derivatives of numpy code.
A fast open framework for deep learning.
Neural Network Libraries by Sony.
Automated Machine Learning
An AutoML toolkit and a drop-in replacement for a scikit-learn estimator.
Automatic architecture search and hyperparameter optimization for PyTorch.
AutoML library for deep learning.
AutoML for Image, Text, Tabular, Time-Series, and MultiModal Data.
AutoML tool that optimizes machine learning pipelines using genetic programming.
A powerful Automated Machine Learning Python library.
Natural Language Processing
Data loaders and abstractions for text and NLP.
Modular Natural Language Processing workflows with Keras.
Modules, data sets, and tutorials supporting research and development in Natural Language Processing.
The Classical Language Toolkit.
Python binding for Morfologik.
Scikit-learn wrappers for Python fastText.
Simple text-to-phonemes converter for multiple languages.
Very simple framework for state-of-the-art NLP.
Computer Audition
An audio library for PyTorch.
Python library for audio and music analysis.
Audio features extraction.
A library for audio and music analysis.
Library for audio and music analysis, description, and synthesis.
A simple, portable, lightweight library of audio feature extraction functions.
Music Analysis, Retrieval, and Synthesis for Audio Signals.
A library for augmenting annotated audio data.
Python audio and music signal processing library.
Computer Vision
Datasets, Transforms, and Models specific to Computer Vision.
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data.
Industry-strength Computer Vision workflows with Keras.
Open Source Computer Vision Library.
An efficient video loader for deep learning with smart shuffling that's super easy to digest.
OpenMMLab Foundational Library for Training Deep Learning Models.
Image Processing SciKit (Toolbox for SciPy).
Image augmentation for machine learning experiments.
Additional augmentations for imgaug.
Image augmentation library in Python for machine learning.
Fast image augmentation library and easy-to-use wrapper around other libraries.
A One-stop Library for Language-Vision Intelligence.
Time Series
A unified framework for machine learning with time series. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Time series forecasting with machine learning models.
A Python library for easy manipulation and forecasting of time series.
Lightning fast forecasting with statistical and econometric models.
Scalable machine learning-based time series forecasting.
Scalable machine learning-based time series forecasting.
Machine learning toolkit dedicated to time-series data. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Module for statistical learning, with a particular emphasis on time-dependent modeling. <img height="20" src="img/sklearn_big.png" alt="sklearn">
A flexible, intuitive, and fast forecasting library.
Automatic Forecasting Procedure.
Open source time series library for Python.
Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.
Anomaly Detection and Correlation library.
Makes it very easy to parse a string and to change timezones.
ML-powered analytics engine for outlier/anomaly detection and root cause analysis.
Reinforcement Learning
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym).
An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities.
An engine for high performance multi-agent environments with very large numbers of agents, along with a set of reference environments.
A set of improved implementations of reinforcement learning algorithms based on OpenAI Baselines.
An API conversion tool for popular external reinforcement learning environments.
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
An elegant PyTorch deep reinforcement learning library. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A library of reinforcement learning components and agents.
PyTorch framework for RL research. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
An offline deep reinforcement learning library.
OpenDILab Decision AI Engine. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A library for Reinforcement Learning in TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
A TensorFlow library for applied reinforcement learning. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
TensorFlow Reinforcement Learning. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
A research framework for fast prototyping of reinforcement learning algorithms.
Deep Reinforcement Learning for Keras. <img height="20" src="img/keras_big.png" alt="Keras compatible">
A toolkit for reproducible reinforcement learning research.
A platform for Applied Reinforcement Learning.
Reinforcement Learning in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG).
A reinforcement learning library designed for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Isaac Orbit and Omniverse Isaac Gym. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Clean PyTorch implementations of imitation and reward learning algorithms. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Graph Machine Learning
Geometric Deep Learning Extension Library for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Temporal Extension Library for PyTorch Geometric. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A signed/directed graph neural network extension library for PyTorch Geometric. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Python package built to ease deep learning on graphs, on top of existing DL frameworks. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> <img height="20" src="img/tf_big2.png" alt="TensorFlow"> <img height="20" src="img/mxnet_big.png" alt="MXNet based">
GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations.
Deep learning on graphs. <img height="20" src="img/keras_big.png" alt="Keras compatible">
Machine Learning on Graphs. <img height="20" src="img/tf_big2.png" alt="TensorFlow"> <img height="20" src="img/keras_big.png" alt="Keras compatible">
Build Graph Nets in TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
A library to build Graph Neural Networks on the TensorFlow platform. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
An autoML framework & toolkit for machine learning on graphs.
Generate embeddings from large-scale graph-structured data. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
An unsupervised machine learning library for graph-structured data.
A library for sampling graph structured data.
A graph reliability toolbox based on PyTorch and PyTorch Geometric (PyG). <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A Graph Neural Network Library in JAX.
Train transformer language models with reinforcement learning.
Graph Manipulation
Learning-to-Rank & Recommender Systems
A Python implementation of LightFM, a hybrid recommendation algorithm.
Deep recommender models using PyTorch.
A Python scikit for building and analyzing recommender systems.
A unified, comprehensive and efficient recommendation library. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
allRank is a framework for training learning-to-rank neural models based on PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A library for building recommender system models using TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow"> <img height="20" src="img/keras_big.png" alt="Keras compatible">
Learning to Rank in TensorFlow. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
Probabilistic Graphical Models
Probabilistic Methods
A flexible, scalable deep probabilistic programming library built on PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
Bayesian Stochastic Modelling in Python.
Deep Probabilistic Modelling Made Easy. <img height="20" src="img/tf_big2.png" alt="TensorFlow">
Bayesian inference using the No-U-Turn sampler (Python interface).
Python package for Bayesian Machine Learning with scikit-learn API. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Supervised domain-agnostic prediction framework for probabilistic modelling by The Alan Turing Institute. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Bayesian Deep Learning methods with Variational Inference for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
The Python ensemble sampling toolkit for affine-invariant MCMC.
A library for hidden semi-Markov models with explicit durations.
Bayesian inference in HSMMs and HMMs.
A highly efficient and modular implementation of Gaussian Processes in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
A scikit-learn-inspired API for CRFsuite. <img height="20" src="img/sklearn_big.png" alt="sklearn">
Model Explanation
moDel Agnostic Language for Exploration and explanation.
A data-driven framework to quantify the value of classifiers in a machine learning ensemble.
Algorithms for monitoring and explaining machine learning models.
Code for "High-Precision Model-Agnostic Explanations" paper.
Bias and Fairness Audit Toolkit.
Contrastive Explanation (Foil Trees).
Visual analysis and diagnostic tools to facilitate machine learning model selection.
An intuitive library to add plotting functionality to scikit-learn objects.
A unified approach to explain the output of any machine learning model.
InterpretML implements the Explainable Boosting Machine (EBM), a modern, fully interpretable machine learning model based on Generalized Additive Models (GAMs). This open-source package also provides visualization tools for EBMs, other glass-box models, and black-box explanations.
A library for debugging/inspecting machine learning classifiers and explaining their predictions.
Explaining the predictions of any machine learning classifier.
FairML is a Python toolbox for auditing machine learning models for bias.
Code for replicating the experiments in the paper Learning to Explain: An Information-Theoretic Perspective on Model Interpretation.
Partial dependence plot toolbox.
Python Individual Conditional Expectation Plot Toolbox.
Python Library for Model Interpretation.
Model analysis tools for TensorFlow.
A library that implements fairness-aware machine learning algorithms.
Interpreting scikit-learn's decision tree and random forest predictions.
Interpretability and explainability of data and machine learning models.
Auralisation of learned features in CNN (for audio).
A visualization of the CapsNet layers to better understand how it works.
A collection of infrastructure and tools for research in neural network interpretability.
Visualizer for deep learning and machine learning models (no Python code, but visualizes models from most Python Deep Learning frameworks).
Visualization tool for your neural network.
Tensorboard for PyTorch (and chainer, mxnet, numpy, ...).
Genetic Programming
Genetic Programming in Python.
Genetic Algorithm in Python.
Distributed Evolutionary Algorithms in Python.
A Genetic Programming platform for Python with GPU support.
A strongly-typed genetic programming framework for Python.
Genetic feature selection module for scikit-learn.
Optimization
A hyperparameter optimization framework.
Multi-objective Optimization in Python.
Python implementation of CMA-ES.
Bayesian optimization.
Bayesian optimization in PyTorch.
Heuristic Algorithms for optimization.
Hyperparameters tuning and feature selection using evolutionary algorithms.
Sequential Model-based Algorithm Configuration.
A library containing various optimizers for hyperparameter tuning.
Distributed Asynchronous Hyperparameter Optimization in Python.
Hyper-parameter optimization for sklearn.
Use evolutionary algorithms instead of gridsearch in scikit-learn.
SigOpt wrappers for scikit-learn methods.
A Python implementation of global optimization with Gaussian processes.
Safe Bayesian Optimization.
Sequential model-based optimization with a scipy.optimize interface.
A comprehensive gradient-free optimization framework written in Python.
A research toolkit for particle swarm optimization in Python.
A Free and Open Source Python Library for Multiobjective Optimization.
Bayesian Optimization using GPflow.
Python Optimal Transport library.
Hyperparameter Optimization for Keras Models.
Library for nonlinear optimization (global and local, constrained or unconstrained).
Feature Engineering
General
Automated feature engineering.
Feature engineering package with sklearn-like functionality.
Automated feature generation with expert-level performance.
A scikit-learn addon to operate on set/"group"-based features.
A set of tools for creating and testing machine learning features.
A feature engineering wrapper for sklearn.
A sklearn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.
Automatic extraction of relevant features from time series.
Machine learning on dirty tabular data (especially: string-based variables for classification and regression).
Moving window features.
A collection of various pandas & scikit-learn compatible transformers for all kinds of preprocessing and feature engineering steps.
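
As a small illustration of the "moving window features" idea mentioned in this list, the sketch below computes a rolling mean with the standard library only. It is not taken from any of the listed packages; `rolling_mean` and its `window` parameter are hypothetical names used for this example.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the mean of the last `window` values at each position.

    Positions where fewer than `window` values have been seen yield None.
    Hypothetical helper for illustration only.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")

    buf = deque(maxlen=window)  # holds the current window
    total = 0.0                 # running sum of the values in `buf`
    out = []
    for v in values:
        if len(buf) == window:
            # The oldest value is about to be evicted by append().
            total -= buf[0]
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


features = rolling_mean([1, 2, 3, 4, 5], window=3)
print(features)  # [None, None, 2.0, 3.0, 4.0]
```

Keeping a running sum avoids re-adding the whole window at every step, which matters for long series; dedicated libraries typically add many more window statistics on top of this pattern.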
Feature Selection
Feature selection repository in Python.
Implementations of the Boruta all-relevant feature selection method.
A fast xgboost feature selection algorithm.
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
A feature selection library based on evolutionary algorithms.
Visualization
General Purposes
Plotting with Python.
Statistical data visualization using matplotlib.
Painlessly create beautiful matplotlib plots.
Ternary plotting library for Python with matplotlib.
Missing data visualization module for Python.
Python library that makes it easy for data scientists to create charts.
Improved histograms.
Interactive plots
A Python package for animating plots built on matplotlib.
Interactive Web Plotting for Python.
Plotting library for IPython/Jupyter notebooks.
A Python interactive charting library ported from Echarts, a charting and visualization library.
Map
Automatic Plotting
Stop plotting your data - annotate your data and let it visualize itself.
Visualize data automatically with 1 line of code (ideal for machine learning).
Visualize and compare datasets, target values and associations, with one line of code.
Deployment
No-code in the front, Python in the back. An open-source framework for creating data apps.
Create UIs for your machine learning model in Python in 3 minutes.
A toolkit for creating modular data visualization applications.
Deepnote is a drop-in replacement for Jupyter with an AI-first design, sleek UI, new blocks, and native data integrations. Use Python, R, and SQL locally in your favorite IDE, then scale to Deepnote cloud for real-time collaboration, Deepnote agent, and deployable data apps.
Statistics
Extension to the pandas DataFrame `describe` function.
Statistical modeling and econometrics in Python.
Supplies a wrapper `StockDataFrame` based on `pandas.DataFrame` with inline stock statistics/indicators support.
A pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.
Pairwise Multiple Comparisons Post-hoc Tests.
Performance analysis of predictive (alpha) stock factors.
Data Manipulation
Data Frames
Create HTML profiling reports from pandas DataFrame objects.
A fast multi-threaded, hybrid-out-of-core DataFrame library.
High-performance datastore for time series and tick data.
Data.table for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib">
GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated">
NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
pandas Google Big Query. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
Universal 1d/2d data containers with Transformers functionality for data analysis by The Alan Turing Institute.
A pure Python implementation of Apache Spark's RDD and DStream interfaces. <img height="20" src="img/spark_big.png" alt="Apache Spark based">
Speed up your pandas workflows by changing a single line of code. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
A package that efficiently applies any function to a pandas dataframe or series in the fastest available manner.
A package that allows providing feedback about basic pandas operations and finds both business logic and performance issues.
Out-of-Core DataFrames for Python and ML: visualize and explore big tabular data at a billion rows per second.
Xarray combines the best features of NumPy and pandas for multidimensional data selection by supplementing numerical axis labels with named dimensions for more intuitive, concise, and less error-prone indexing routines.
Pipelines
Easy pipelines for pandas DataFrames.
Functional data manipulation for pandas. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
Dplyr for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib">
pandas integration with sklearn. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pandas_big.png" alt="pandas compatible">
Helps you conveniently work with random or sequential batches of your data and define data processing.
Clean APIs for data cleaning. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
A Python toolkit for processing tabular data.
Build system for data science pipelines.
Hints and tips for using pandas in an analysis environment. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
A microframework for dataframe generation that applies Directed Acyclic Graphs specified by a flow of lazily evaluated Python functions.
Data-centric AI
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
A system for quickly generating training data with weak supervision.
Collect, clean, and visualize your data in Python with a few lines of code.
A system for quickly generating training data with weak supervision.
Distributed Computing
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Distributed machine learning platform.
Framework and Library for Distributed Online Machine Learning.
Microsoft Distributed Machine Learning Toolkit.
PArallel Distributed Deep LEarning.
Distributed and parallel machine learning.
Distributed computation in Python.
Experimentation
Open source platform for the machine learning lifecycle.
Data Version Control | Git for Data & Models | ML Experiments Management.
Machine learning development environment for data science and AI/ML engineering teams.
A tool to help you configure, organize, log, and reproduce experiments.
Adaptive Experimentation Platform.
Data Validation
Always know what to expect from your data.
A lightweight, flexible, and expressive statistical data testing library.
Validation & testing of ML models and data during model development, deployment, and production.
Evaluate and monitor ML models from validation to production.
Library for exploring and validating machine learning data.
A library to compare Pandas, Polars, and Spark data frames. It provides stats and lets users adjust for match accuracy.
Evaluation
Library of useful metrics and plots for evaluating recommender systems.
Machine learning evaluation metric.
Model evaluation made easy: plots, tables, and markdown reports.
Fairness metrics for datasets and ML models, explanations, and algorithms to mitigate bias in datasets and models.
Algorithms for outlier, adversarial and drift detection.
Computations
Parallel computing with task scheduling.
Fast NumPy array functions written in C.
NumPy-like API accelerated with CUDA.
Python library for multilinear algebra and tensor factorizations.
Solve automatic numerical differentiation problems in one or more variables.
Add built-in support for quaternions to numpy.
Tools for adaptive and parallel sampling of mathematical functions.
A fast numerical expression evaluator for NumPy that comes with an integrated computing virtual machine to speed calculations up by avoiding memory allocation for intermediate results.
Web Scraping
Spatial Analysis
Quantum Computing
Qiskit is an open-source SDK for working with quantum computers at the level of circuits, algorithms, and application modules.
A Python framework for creating, editing, and invoking Noisy Intermediate Scale Quantum (NISQ) circuits.
Quantum machine learning, automatic differentiation, and optimization of hybrid quantum-classical computations.
A Python Toolkit for Quantum Machine Learning.
Conversion
Transpile trained scikit-learn estimators to C, Java, JavaScript, and others.
Open Neural Network Exchange.
A set of tools to help users inter-operate among different deep learning frameworks.
Universal model exchange and serialization format for decision tree forests.