NLP with Ruby
Contents
NLP Pipeline Subtasks
Pipeline Generation
Multipurpose Engines
Language Identification
Segmentation
Stemming
Lexical Statistics: Counting Types and Tokens
Filtering Stop Words
Phrasal Level Processing
Constituency Parsing
Semantic Analysis
Set of five distance types between strings (including Levenshtein, Sellers, Jaro-Winkler, 'pair distance').
Calculates edit distance using the Damerau-Levenshtein algorithm.
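As a rough illustration of what such an edit-distance library computes, here is a minimal pure-Ruby sketch. It implements the restricted variant of Damerau-Levenshtein (often called optimal string alignment), which counts adjacent transpositions as one edit but never edits the same substring twice. The method name `osa_distance` is hypothetical and is not the API of any gem listed here.

```ruby
# Restricted Damerau-Levenshtein (optimal string alignment) distance.
# Hypothetical helper for illustration; not taken from any listed gem.
def osa_distance(a, b)
  a = a.chars
  b = b.chars
  # d[i][j] = distance between the first i chars of a and the first j chars of b
  d = Array.new(a.size + 1) { |i| [i] + [0] * b.size }
  (0..b.size).each { |j| d[0][j] = j }

  (1..a.size).each do |i|
    (1..b.size).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,        # deletion
        d[i][j - 1] + 1,        # insertion
        d[i - 1][j - 1] + cost  # substitution (or match when cost is 0)
      ].min
      # Adjacent transposition, e.g. "ab" -> "ba", counted as a single edit
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        d[i][j] = [d[i][j], d[i - 2][j - 2] + 1].min
      end
    end
  end
  d[a.size][b.size]
end
```

With this sketch, `osa_distance("abcd", "acbd")` is 1 because the swap of "b" and "c" is a single transposition, whereas plain Levenshtein would count it as 2.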
Pragmatic Analysis
Projects and Code Examples
High Level Tasks
Full Text Search, Information Retrieval, Indexing
elasticsearch-ruby/tree/master/elasticsearch
Dialog Agents, Assistants, and Chatbots
Linguistic Resources
Machine Learning Libraries
General classifier module to allow Bayesian and other types of classifications.
Ruby implementation of LDA (Latent Dirichlet Allocation) for automatic topic modelling and document clustering.
Ruby interface to LIBLINEAR (much more efficient than LIBSVM for text classification).
JRuby maximum entropy classifier for string data, based on the OpenNLP Maxent framework.
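To give a feel for the Bayesian classification mentioned above, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing, in plain Ruby. The class name `TinyBayes`, its methods, and the simple regex tokenizer are all assumptions made for this example; they do not reproduce the interface of any library in this list.

```ruby
# Minimal multinomial naive Bayes text classifier with add-one smoothing.
# Class and method names are hypothetical, chosen for this sketch only.
class TinyBayes
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # category => word => count
    @doc_counts  = Hash.new(0)                            # category => number of documents
    @vocabulary  = {}                                     # set of all words seen in training
  end

  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each do |word|
      @word_counts[category][word] += 1
      @vocabulary[word] = true
    end
  end

  # Returns the category with the highest log-probability, or nil if untrained.
  def classify(text)
    total_docs = @doc_counts.values.sum.to_f
    words = tokenize(text)
    scores = @doc_counts.keys.map do |category|
      counts = @word_counts[category]
      denom = counts.values.sum + @vocabulary.size
      log_prob = Math.log(@doc_counts[category] / total_docs) # class prior
      words.each { |w| log_prob += Math.log((counts[w] + 1.0) / denom) }
      [category, log_prob]
    end
    scores.max_by { |_, score| score }&.first
  end

  private

  # Naive tokenizer (an assumption): lowercase ASCII words only.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end
```

A classifier is built by calling `train(:spam, "buy cheap pills")` style examples for each category and then `classify` on new text. Working in log space avoids numeric underflow when many word probabilities are multiplied together.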
Optical Character Recognition
Text Extraction
Language Aware String Manipulation
Fuzzy string comparison with distance measures and regular expressions.
The Rails ActiveSupport gem provides various string extensions for handling case.
Articles, Posts, Talks, and Presentations
by Todd Schneider (video); via github.com/toddwschneider, by Todd Schneider (code)