AGI & CoCoSci

The reciprocation of Artificial General Intelligence (AGI) and Computational Cognitive Sciences (CoCoSci).

Collection 375 stars GitHub

Generative Model

AI Concept Representation

Human Concept Representation

DSL Program Synthesis

Explainable Deep Learning

AI Assisted Research

Imperative DSL Applications

Declarative DSL Applications

Design Automation

Generative Model

Generative Modeling Explained 64 updated 3y ago

This tutorial on generative modeling is part of the Statistical Machine Learning Tutorial by Ying Nian Wu at UCLA Statistics. The tutorial goes over the key equations and algorithms for learning recent generative models, including energy-based models, diffusion/score-based models, autoregressive/flow-based models, VAEs, and GANs, and explains the connections between these models.

Learning Latent Space Energy-Based Prior Model 37 updated 5y ago

A milestone paper on Latent Energy-Based Model.

Learning Energy-Based Models by Diffusion Recovery Likelihood 54 updated 5y ago

Code

AI Concept Representation

ImageBind: One Embedding Space To Bind Them All 9.0k updated 6mo ago

This work presents ImageBind, an approach to learn a joint embedding across six different modalities - images, text, audio, depth, thermal, and IMU data. The authors show that all combinations of paired data are not necessary to train such a joint embedding, and only image-paired data is sufficient to bind the modalities together. ImageBind can leverage recent large scale vision-language models, and extends their zero-shot capabilities to new modalities just by using their natural pairing with images. It enables novel emergent applications 'out-of-the-box' including cross-modal retrieval, composing modalities with arithmetic, cross-modal detection and generation. The emergent capabilities improve with the strength of the image encoder and this work sets a new state-of-the-art on emergent zero-shot recognition tasks across modalities, outperforming specialist supervised models. Finally, the authors show strong few-shot recognition results outperforming prior work, and that ImageBind serves as a new way to evaluate vision models for visual and non-visual tasks.

Metabolic activity organizes olfactory representations 30 updated 2y ago

Odorous compounds with similar POM representations are more likely to co-occur within a substance and be metabolically closely related; metabolic reaction sequences also follow smooth paths in POM despite large jumps in molecular structure.

Connecting Touch and Vision via Cross-Modal Prediction 77 updated 7y ago

Humans perceive the world using multi-modal sensory inputs such as vision, audition, and touch. This work investigates the cross-modal connection between vision and touch. The main challenge in this cross-domain modeling task lies in the significant scale discrepancy between the two: while our eyes perceive an entire visual scene at once, humans can only feel a small region of an object at any given moment. To connect vision and touch, this work introduces new tasks of synthesizing plausible tactile signals from visual inputs as well as imagining how we interact with objects given tactile data as input. To accomplish the goals, the authors first equip robots with both visual and tactile sensors and collect a large-scale dataset of corresponding vision and tactile image sequences. To close the scale gap, the authors present a new conditional adversarial model that incorporates the scale and location information of the touch. Human perceptual studies demonstrate that the model can produce realistic visual images from tactile data and vice versa.

Human Concept Representation

Natural speech reveals the semantic maps that tile human cerebral cortex 40 updated 1y ago

The meaning of language is represented in regions of the cerebral cortex collectively known as the ‘semantic system’. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. This work systematically maps semantic selectivity across the cortex using voxel-wise modelling of functional MRI (fMRI) data collected while subjects listened to hours of narrative stories. This work shows that the semantic system is organized into intricate patterns that seem to be consistent across individuals. The authors then use a novel generative model to create a detailed semantic atlas. The results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and the atlas shows which domains are represented in each area. This study demonstrates that data-driven methods, commonplace in studies of human neuroanatomy and functional connectivity, provide a powerful and efficient means for mapping functional representations in the brain.

Papers

Theory

On the Complexity of Bayesian Generalization 8 updated 2y ago

This work examines concept generalization at a large scale in the natural visual spectrum. Established computational modes (i.e., rule-based or similarity-based) are primarily studied in isolation, focusing on confined and abstract problem spaces. This work studies these two modes when the problem space scales up and when the complexity of concepts becomes diverse. At the representational level, the authors investigate how the complexity varies when a visual concept is mapped to the representation space. Prior literature has shown that two types of complexities build an inverted-U relation. Leveraging Representativeness of Attribute (RoA), the authors computationally confirm: Models use attributes with high RoA to describe visual concepts, and the description length falls in an inverted-U relation with the increment in visual complexity. At the computational level, the authors examine how the complexity of representation affects the shift between the rule- and similarity-based generalization. The authors hypothesize that category-conditioned visual modeling estimates the co-occurrence frequency between visual and categorical attributes, thus potentially serving as the prior for the natural visual world. Experimental results show that representations with relatively high subjective complexity outperform those with relatively low subjective complexity in rule-based generalization, while the trend is the opposite in similarity-based generalization.

Non-Verbal Communication

Pixelor: A Competitive Sketching AI Agent. So you think you can beat me? updated 5y ago

ACM SIGGRAPH'20, 2020. [All Versions]. [Project]. Rationality in feature sketching.

Pragmatics

OSMnx Tool 5.6k updated 2mo ago

[OpenStreetMap Website]. [OSMnx Tool]. [All Versions]. [Exploring Urban Form Through Openstreetmap Data: A Visual Introduction].

Dialogue Experimental Toolkit (DiET) 18 updated 8mo ago

The present study sets out to experimentally investigate how environmental factors come to shape the emergence of linguistic conventions. To this end, the authors adapt the classical Maze Game task to test the hypothesis that participants routinise different linguistic strategies to communicate positions in the maze contingent on particular environmental affordances (i.e. structure of the mazes). The results confirm that subtle environmental motivations drive the emergence of different communicative conventions in an otherwise identical task, suggesting that linguistic adaptations are highly sensitive to factors of the shared task environment.

The SocialAI School: Insights from Developmental Psychology Towards Artificial Socio-Cultural Agents

ICML'23 Workshop on Theory-of-Mind, 2023. [All Versions]. [Project].

Coordination

HLSMAC: A New StarCraft Multi-Agent Challenge for High-Level Strategic Decision-Making

AAMAS'26, 2026. [All Versions]. Benchmarks are crucial for assessing multi-agent reinforcement learning (MARL) algorithms. While StarCraft II-related environments have driven significant advances in MARL, existing benchmarks like SMAC focus primarily on micromanagement, limiting comprehensive evaluation of high-level strategic intelligence. To address this, this work introduces HLSMAC, a new cooperative MARL benchmark with 12 carefully designed StarCraft II scenarios based on classical stratagems from the Thirty-Six Stratagems. Each scenario corresponds to a specific stratagem and is designed to challenge agents with diverse strategic elements, including tactical maneuvering, timing coordination, and deception, thereby opening up avenues for evaluating high-level strategic decision-making capabilities. The authors also propose novel metrics across multiple dimensions beyond conventional win rate, such as ability utilization and advancement efficiency, to assess agents' overall performance within the HLSMAC environment. The authors conduct a large-scale evaluation of 21 state-of-the-art MARL algorithms and LLM-based agents, with additional multi-seed analysis for relatively better-performing methods. The results demonstrate that HLSMAC serves as a robust testbed for advancing multi-agent strategic decision-making.

HLSMAC: A New StarCraft Multi-Agent Challenge for High-Level Strategic Decision-Making

The original paper on Abductive Learning, a derivative-free approach for neuro-symbolic learning.

Grounded Language Learning Fast and Slow 54 updated 4y ago

[Project].

Learn to explain efficiently via neural logic inductive learning 44 updated 3y ago

[Project].

Explainable Deep Learning

pytorch-grad-cam 12.8k updated 1mo ago

2021. Class Activation Map methods implemented in PyTorch, with many elegant features.

Compositional Explanations of Neurons 25 updated 5y ago

NeurIPS'20, 2020. [All Versions]. [Project]. A concept-composition version of network dissection.

Noise or Signal: The Role of Backgrounds in Image Classification 143 updated 5y ago

ICLR'21, 2021. [All Versions]. [Code & Data]. [Project]. A perspective that image backgrounds provide strong clues for foreground classification.

Methodologies for Experiments

Identification of Causal Effects Using Instrumental Variables

Journal of the American Statistical Association, 1996. [All Versions]. The original paper on Instrumental Variables for natural sociology studies.
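
As a rough illustration of the instrumental-variables idea, the sketch below computes the textbook Wald estimate for a binary instrument on simulated data. It is a minimal sketch, not the paper's procedure: the data-generating process, the coefficients, and the function names `simulate`, `wald_estimate`, and `naive_difference` are all assumptions made up for this example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n=100_000, true_effect=2.0, seed=0):
    """Simulated data with an unobserved confounder (all numbers are made up)."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)      # unobserved confounder
        zi = rng.randint(0, 1)   # randomly assigned binary instrument
        # Treatment uptake depends on both the instrument and the confounder.
        di = 1 if 0.8 * zi + 0.5 * u + rng.gauss(0, 1) > 0.4 else 0
        # The instrument affects the outcome only through the treatment.
        yi = true_effect * di + 1.5 * u + rng.gauss(0, 1)
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


def wald_estimate(z, d, y):
    """Effect of the instrument on Y divided by its effect on D."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    return (y1 - y0) / (d1 - d0)


def naive_difference(d, y):
    """Treated-minus-untreated mean outcome, which ignores confounding."""
    treated = mean([yi for di, yi in zip(d, y) if di == 1])
    untreated = mean([yi for di, yi in zip(d, y) if di == 0])
    return treated - untreated


z, d, y = simulate()
iv = wald_estimate(z, d, y)
naive = naive_difference(d, y)
print(f"Wald/IV estimate: {iv:.2f}, naive difference: {naive:.2f}")
```

Because the simulated treatment effect is constant, the Wald ratio should land near the assumed `true_effect`, while the naive comparison is pulled upward by the confounder.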

Meta Learning

Meta-Learning

A survey of meta-learning in the context of reinforcement learning.

Marr's Levels of Analysis

Vision: A Computational Investigation into the Human Representation and Processing of Visual Information

Levels of Analysis in Computational Social Science

CogSci'18, 2018. [All Versions]. A Marr's paradigm account on computational social science.

Levels of Analysis for Machine Learning

ICLR'20 Bridging AI and Cognitive Science Workshop, 2020. [All Versions]. A Marr's paradigm account on machine learning.

Gestalt

Gestalt theory

The original book on Gestalt psychology.

The Aha! Moment

A computational model of scientific insight

A computational account on insights for scientific discovery.

What Makes an Insight Problem? The Roles of Heuristics, Goal Conception, and Solution Recoding in Knowledge-Lean Problems

Constraint relaxation and chunk decomposition in insight problem solving

Dynamics and constraints in insight problem solving

Insight solutions are correct more often than analytic solutions

Human Performance on Insight Problem Solving: A Review

Insight Is Not in the Problem: Investigating Insight in Problem Solving across Task Types

Multiple Causes of Difficulty in Insight: The Case of the Nine-Dot Problem

Investigating the effect of Mental Set on Insight Problem Solving

Rationality

The Adaptive Nature of Human Categorization Behavior

The original paper that relates cognitive resource limitation with Bayesian rational analysis, in the case of categorization behavior.

Task switching

The original paper on "switch cost", where subjects' responses are substantially slower and, usually, more error-prone immediately after a task switch.

Computational Rationality: Linking Mechanism and Behavior Through Bounded Utility Maximization

Introducing the computational rationality framework for including information-processing bounds in rational analyses, which emphasizes the incorporation of computational mechanism into the definition of rational action.

Computational rationality: A converging paradigm for intelligence in brains, minds, and machines

A comprehensive review on the rationality of Bayesian computational models.

Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources

A resource-rational account on interpreting human intelligence.

Rational Use of Cognitive Resources: Levels of Analysis Between the Computational and the Algorithmic

An earlier version of the paper above.

Understanding Human Intelligence through Human Limitations

Recent progress in artificial intelligence provides the opportunity to ask the question of what is unique about human intelligence, but with a new comparison class. The author argues that we can understand human intelligence, and the ways in which it may differ from artificial intelligence, by considering the characteristics of the kind of computational problems that human minds have to solve. The author claims that these problems acquire their structure from three fundamental limitations that apply to human beings: limited time, limited computation, and limited communication. From these limitations we can derive many of the properties we associate with human intelligence, such as rapid learning, the ability to break down problems into parts, and the capacity for cumulative cultural evolution.

Foundations of intuitive power analyses in children and adults

Evidence suggests that people have some of the foundations for 'intuitive power analyses', which help people use intuitive statistical reasoning and metacognitive strategies to estimate how much information they might need to solve different discrimination problems.

Cognitive Science as a Source of Forward and Inverse Models of Human Decisions for Robotics and Control

The review focuses on how cognitive science can provide forward models of human decision-making and inverse models of how humans think about others’ decision-making. The authors highlight relevant recent developments, including approaches that synthesize black box and theory-driven modeling, accounts that recast heuristics and biases as forms of bounded optimality, and models that characterize human theory of mind and communication in decision-theoretic terms.

Cognitive Architecture

The secret life of predictive brains: what's spontaneous activity for?

A neuroscience account on brain as a generative model.

Metacognition in computation: A selected research review

Basic functional trade-offs in cognition: An integrative framework

SOAR: An architecture for general intelligence 3.0k updated 29d ago

Artificial Intelligence, 1987. [All Versions].

Is human cognition adaptive?

Behavioral and Brain Sciences, 1991. [All Versions]. The original paper introducing the adaptation perspective of human intelligence, the theoretical basis of the ACT cognitive architecture.

A Theoretical Computer Science Perspective on Consciousness

Journal of Artificial Intelligence and Consciousness, 2020. [All Versions].

Theory of Mind

The Naïve Utility Calculus as a unified, quantitative framework for action understanding 15 updated 4y ago

This paper presents a formal theory of the Naïve Utility Calculus as a probabilistic generative model, which highlights the role of cost and reward tradeoffs in a Bayesian framework for action-understanding. The model predicts with quantitative accuracy how people infer agents’ subjective costs and rewards based on their observable actions. By distinguishing between desires, goals, and intentions, the model extends to complex action scenarios unfolding over space and time in scenes with multiple objects and multiple action episodes.

PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception

AAAI'21, 2021. [Project]. This project implements a system for perceiving social events that are grounded in physical dynamics. It leverages a combination of computer vision and deep learning techniques to extract meaningful representations of social interactions from video data. The system aims to provide a more comprehensive understanding of social perception by considering the underlying physical principles that govern these interactions.

Intuitive Physics

Intuitive Physics Reading List 19 updated 1y ago

GitHub. A reading list on intuitive physics, maintained actively by Shiqian Li.

Knowledge Representation

Connecting perceptual and procedural abstractions in physical construction 2 updated 2mo ago

CogSci'21, 2021. [All Versions].

Language Compositionality

War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars

2023. [All Versions].

Scaling Up Behavioral Studies

Scaling up experimental social, behavioral, and economic science

A white paper on scaling up social, behavioral, and economic experiments.

Causality

Theory-Based Causal Transfer: Integrating Instance-Level Induction and Abstract-Level Structure Learning

A computational account on causal transfer.

Do New Caledonian crows solve physical problems through causal reasoning?

A piece of evidence for the capability of causal reasoning in intelligent animals.

Do six-month-old infants perceive causality?

Cognition, 1987. [All Versions].

AI Commonsense Reasoning

PIQA: Reasoning about Physical Commonsense in Natural Language

AAAI'20, 2020. [All Versions].

VisualCOMET: Reasoning About the Dynamic Context of a Still Image

ECCV'20, 2020. [All Versions].

The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning

ECCV'22, 2022. [All Versions]. [Preprint]. This paper presents Sherlock, an annotated corpus of 103K images for testing machine capacity for abductive reasoning beyond literal image contents. The corpus construction process adopts a free-viewing paradigm: participants first observe and identify salient clues within images (e.g., objects, actions) and then provide a plausible inference about the scene, given the clue.

From Recognition to Cognition: Visual Commonsense Reasoning

CVPR'19, 2019. [All Versions]. [Project].

Visual Commonsense R-CNN

CVPR'20, 2020. [All Versions].

Abductive Commonsense Reasoning

ICLR'20, 2020. [All Versions]. Abductive commonsense reasoning on large language models.

UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

NAACL'24, 2024. [All Versions]. This paper explores the task of uncommonsense abductive reasoning. Given a piece of context with an unexpected outcome, this task requires reasoning abductively to generate an explanation that makes the unexpected outcome more likely in the context.

SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

NeurIPS'23, 2023. [All Versions]. [Project].

Program Synthesis

Foundations and Trends in Programming Languages, 2017. [All Versions]. Sumit Gulwani's comprehensive review on program synthesis.

The Discovery of the Equator or Concept Driven Learning

The original paper on second-order metarules.

Towards combining inductive logic programming with Bayesian networks

Learning Efficient Logical Robot Strategies Involving Composable Objects

Learning Higher-Order Logic Programs through Abstraction and Invention

How Much Can Experimental Cost Be Reduced in Active Learning of Agent Strategies?

Meta-Interpretive Learning from noisy images

Learning efficient logic programs

Learning higher-order logic programs

Logical reduction of metarules

Playgol: Learning Programs Through Play

Machine Discovery of Comprehensible Strategies for Simple Games Using Meta-interpretive Learning

Forgetting to Learn Logic Programs

Turning 30: New Ideas in Inductive Logic Programming

Inductive logic programming at 30: a new introduction

A 30-year comprehensive review on Inductive Logic Programming.

Learning programs by learning from failures

Complete Bottom-Up Predicate Invention in Meta-Interpretive Learning

Meta-Interpretive Learning as Metarule Specialisation

Qualitative choice logic

Derivative-free optimization of high-dimensional non-convex functions by sequential random embeddings

Finitely Generated Groups and First-Order Logic

Program Synthesis Guided Reinforcement Learning

Program Synthesis with Large Language Models

This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages.

From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought

Rational meaning construction, a computational framework for language-informed thinking that combines neural language models with probabilistic models for rational inference. Linguistic meaning is framed as a context-sensitive mapping from natural language into a probabilistic language of thought (PLoT), a general-purpose symbolic substrate for generative world modeling.

Large Language Models for Software Engineering: A Systematic Literature Review

2023. [All Versions]. A systematic literature review on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes.

Commonsense reasoning about causality: Deriving behavior from structure

Logics for Epistemic Programs

A Translation Approach to Portable Ontology Specifications

The Symbolic Grounding Problem

Learning overhypotheses with hierarchical Bayesian models

Learning Causal Schemata

The Perception of Relations

A computational philosophy account on scientific representation, focusing on how scientific models represent their target systems.

Meta-interpretive learning: application to grammatical inference

Machine Learning, 2014. [All Versions]. Stephen Muggleton's original paper on Meta-Interpretive Learning (MIL).

Synthesizing theories of human language with Bayesian program induction

Nature Communications, 2022. [All Versions].

Latent Programmer: Discrete Latent Codes for Program Synthesis

ICML'21, 2021. [All Versions]. Paper introducing the Latent Programmer, a two-level program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language.

Large Language Models Meet NL2Code: A Survey

ACL'23, 2023. [All Versions]. [NL2Code Website]. A comprehensive survey of 27 existing large language models for NL2Code, which also reviews benchmarks and metrics and suggests that the key factors contributing to the success of large language models for NL2Code are “Large Size, Premium Data, Expert Tuning”.

The Perception of Relations

Chaz Firestone's review on the perception of relations, in contrast to the conventional reasoning view.

Design Automation

Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs

Self-driving laboratories have begun to replace human experimenters in performing single experimental skills or predetermined experimental protocols. However, as the pace of idea iteration in scientific research has been intensified by Artificial Intelligence, the demand for rapid design of new protocols for new discoveries has become evident. Efforts to automate protocol design have been initiated, but the capabilities of knowledge-based machine designers, such as Large Language Models, have not been fully elicited, probably owing to the absence of a systematic representation of experimental knowledge, as opposed to isolated, flattened pieces of information. To tackle this issue, this work proposes a multi-faceted, multi-scale representation, where instance actions, generalized operations, and product flow models are hierarchically encapsulated using Domain-Specific Languages. The authors further develop a data-driven algorithm based on non-parametric modeling that autonomously customizes these representations for specific domains. The proposed representation is equipped with various machine designers to manage protocol design tasks, including planning, modification, and adjustment. The results demonstrate that the proposed method could effectively complement Large Language Models in the protocol design process, serving as an auxiliary module in the realm of machine-assisted scientific exploration.

Evolutionary Intelligence

From language development to language evolution: A unified view of human lexical creativity

This work supports a unified foundation for human lexical creativity underlying both the fleeting products of individual ontogeny and the evolutionary products of phylogeny across languages.

Meta-Level Considerations

On Effective Scheduling of Model-based Reinforcement Learning

NeurIPS'21, 2021. [All Versions].

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

ICML'17, 2017. [All Versions]. [Post]. Chelsea Finn's original paper on Model-Agnostic Meta-Learning (MAML).
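
To make the inner/outer loop structure of MAML concrete, here is a minimal sketch on a toy problem where the gradients can be written by hand. The per-task loss `(theta - a) ** 2`, the fixed task list, the learning rates, and the name `maml_step` are assumptions chosen for illustration, not details from the paper.

```python
def maml_step(theta, task_targets, inner_lr=0.1, outer_lr=0.5):
    """One meta-update for toy tasks with loss L_a(theta) = (theta - a) ** 2."""
    meta_grad = 0.0
    for a in task_targets:
        # Inner loop: one gradient step adapts theta to task a.
        adapted = theta - inner_lr * 2.0 * (theta - a)
        # Outer gradient: differentiate L_a(adapted) with respect to the
        # original theta, through the inner step.
        # d(adapted)/d(theta) = 1 - 2 * inner_lr for this quadratic loss.
        meta_grad += 2.0 * (adapted - a) * (1.0 - 2.0 * inner_lr)
    return theta - outer_lr * meta_grad / len(task_targets)


tasks = [-1.0, 0.0, 4.0]   # hypothetical task optima
theta = 5.0                # initial meta-parameter
for _ in range(200):
    theta = maml_step(theta, tasks)
print(f"meta-learned initialization: {theta:.4f}")
```

For this symmetric quadratic family the meta-learned initialization settles at the mean of the task optima, the point from which a single inner step reduces the average post-adaptation loss the most. In practice the outer gradient is obtained by automatic differentiation rather than by hand.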

Bayesian Model-Agnostic Meta-Learning 1.6k (archived)

NeurIPS'18, 2018. [All Versions]. A Bayesian account on MAML.

Meta-Q-Learning

ICLR'20, 2020. [All Versions]. The milestone paper on context Meta-RL.

Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables

ICML'19, 2019. [All Versions].

Balancing Constraints and Rewards with Meta-Gradient D4PG

ICLR'21, 2021. [All Versions].

Metacontrol for Adaptive Imagination-Based Optimization

ICLR'17, 2017. [All Versions].

Commonsense Knowledgebase

ConceptNet 5.5: An Open Multilingual Graph of General Knowledge 2.9k updated 3y ago

AAAI'17, 2017. [All Versions]. Latest version of ConceptNet.

Strong Machine Learning

Deep Forest: Towards An Alternative to Deep Neural Networks 964 updated 8mo ago

IJCAI'17, 2017. [All Versions]. [Project]. This paper proposes gcForest, a decision tree ensemble approach with performance highly competitive to deep neural networks in a broad range of tasks. In contrast to deep neural networks which require great effort in hyper-parameter tuning, gcForest is much easier to train; even when it is applied to different data across different domains in the experiments, excellent performance can be achieved by almost the same settings of hyper-parameters. The training process of gcForest is efficient, and users can control training cost according to the computational resources available. The efficiency may be further enhanced because gcForest is naturally apt to parallel implementation. Furthermore, in contrast to deep neural networks which require large-scale training data, gcForest can work well even when there are only small-scale training data.

NBDT: Neural-Backed Decision Trees 624 updated 3y ago

ICLR'21, 2021. [All Versions]. [Code]. Machine learning applications such as finance and medicine demand accurate and justifiable predictions, barring most deep learning methods from use. In response, previous work combines decision trees with deep learning, yielding models that (1) sacrifice interpretability for accuracy or (2) sacrifice accuracy for interpretability. This work forgoes this dilemma by jointly improving accuracy and interpretability using Neural-Backed Decision Trees (NBDTs). NBDTs replace a neural network's final linear layer with a differentiable sequence of decisions and a surrogate loss. This forces the model to learn high-level concepts and lessens reliance on highly-uncertain decisions, yielding (1) accuracy: NBDTs match or outperform modern neural networks on CIFAR, ImageNet and better generalize to unseen classes by up to 16%. Furthermore, the surrogate loss improves the original model's accuracy by up to 2%. NBDTs also afford (2) interpretability: improving human trust by clearly identifying model mistakes and assisting in dataset debugging.

DSL Program Synthesis

pix2code: Generating Code from a Graphical User Interface Screenshot 12.0k updated 2y ago

This paper shows that deep learning methods can be leveraged to train a model end-to-end to automatically reverse engineer user interfaces and generate code from a single input image with over 77% accuracy for three different platforms (i.e. iOS, Android and web-based technologies).

Expert-level protocol translation for self-driving labs

Recent development in Artificial Intelligence (AI) models has propelled their application in scientific discovery, but the validation and exploration of these discoveries require subsequent empirical experimentation. The concept of self-driving laboratories promises to automate and thus boost the experimental process following AI-driven discoveries. However, the transition of experimental protocols, originally crafted for human comprehension, into formats interpretable by machines presents significant challenges, which, within the context of a specific expert domain, encompass the necessity for structured as opposed to natural language, the imperative for explicit rather than tacit knowledge, and the preservation of causality and consistency throughout protocol steps. Presently, the task of protocol translation predominantly requires the manual and labor-intensive involvement of domain experts and information technology specialists, rendering the process time-intensive. To address these issues, this work proposes a framework that automates the protocol translation process through a three-stage workflow, which incrementally constructs Protocol Dependence Graphs (PDGs) that are structured at the syntax level, completed at the semantics level, and linked at the execution level. Quantitative and qualitative evaluations have demonstrated its performance on par with that of human experts, underscoring its potential to significantly expedite and democratize the process of scientific discovery by elevating the automation capabilities within self-driving laboratories.

Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting 47 updated 1y ago

This paper proposes CLAIRIFY, an approach that combines automatic iterative prompting with program verification to ensure programs written in data-scarce domain-specific language are syntactically valid and incorporate environment constraints.
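
The verify-and-reprompt idea summarized above can be sketched roughly as follows. This is a minimal illustration and not CLAIRIFY's actual implementation: the toy DSL (`add`, `stir`, `heat`), the volume limit, the `verify` rules, and the `fake_llm` stand-in for a language model are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical toy DSL: one command per line, e.g. "add water 50".
ALLOWED = {"add": 2, "stir": 1, "heat": 1}  # command -> number of arguments
MAX_VOLUME_ML = 100  # hypothetical environment constraint (container capacity)


def verify(program: str) -> List[str]:
    """Return error messages; an empty list means the program passed."""
    errors: List[str] = []
    volume = 0
    for n, line in enumerate(program.strip().splitlines(), start=1):
        parts = line.split()
        if not parts or parts[0] not in ALLOWED:
            errors.append(f"line {n}: unknown command '{line}'")
            continue
        if len(parts) - 1 != ALLOWED[parts[0]]:
            errors.append(f"line {n}: '{parts[0]}' takes {ALLOWED[parts[0]]} argument(s)")
            continue
        if parts[0] == "add":
            if not parts[2].isdigit():
                errors.append(f"line {n}: amount '{parts[2]}' is not a number")
                continue
            volume += int(parts[2])
            if volume > MAX_VOLUME_ML:
                errors.append(f"line {n}: total volume {volume} exceeds {MAX_VOLUME_ML}")
    return errors


def generate_with_verifier(
    task: str, llm: Callable[[str], str], max_rounds: int = 3
) -> Tuple[str, List[str]]:
    """Ask the model for a program, feeding verifier errors back as the next prompt."""
    prompt = task
    program, errors = "", ["not generated"]
    for _ in range(max_rounds):
        program = llm(prompt)
        errors = verify(program)
        if not errors:
            break
        prompt = task + "\nFix these errors:\n" + "\n".join(errors)
    return program, errors


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: wrong on the first try, right once it sees errors."""
    if "Fix these errors" not in prompt:
        return "pour water 50\nstir 10"
    return "add water 50\nstir 10"


program, errors = generate_with_verifier("Mix 50 ml of water", fake_llm)
```

The point of the sketch is that the error messages, rather than extra training data, carry the DSL's syntax and the environment constraints back to the generator.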

Synthesizing theories of human language with Bayesian program induction

Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. This work presents a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sounds. The authors integrate Bayesian inference with program synthesis and representations inspired by linguistic theory and cognitive models of learning and discovery. Across 70 datasets from 58 diverse languages, the system synthesizes human-interpretable models for core aspects of each language’s morpho-phonology, sometimes approaching models posited by human linguists. Joint inference across all 70 datasets automatically synthesizes a meta-model encoding interpretable cross-language typological tendencies. Finally, the same algorithm captures few-shot learning dynamics, acquiring new morpho-phonological rules from just one or a few examples. These results suggest routes to more powerful machine-enabled discovery of interpretable models in linguistics and other scientific domains.

Explainable Deep Learning

Network dissection: Quantifying interpretability of deep visual representations

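This paper proposes Network Dissection, a framework for quantifying the interpretability of convolutional neural networks by measuring how well individual hidden units align with a set of semantic concepts.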

AI Assisted Research

The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4 77 updated 2y ago

A survey on the performance of LLMs within the context of scientific discovery, focusing on GPT-4.

BioPlanner: Automatic Evaluation of LLMs on Protocol Planning in Biology 28 updated 1y ago

This paper presents an automatic evaluation framework for the task of planning experimental protocols, and introduces BioProt: a dataset of biology protocols with corresponding pseudocode representations.

DrBioRight 2.0: an LLM-powered bioinformatics chatbot for large-scale cancer functional proteomics analysis

Functional proteomics provides critical insights into cancer mechanisms, facilitating the discovery of novel biomarkers and therapeutic targets. The authors have developed a comprehensive cancer functional proteomics resource using reverse phase protein arrays, incorporating data from nearly 8000 patient samples from The Cancer Genome Atlas and approximately 900 samples from the Cancer Cell Line Encyclopedia. The dataset includes a curated panel of nearly 500 high-quality antibodies, covering all major cancer hallmark pathways. To enhance the accessibility and analytic power of this resource, this work introduces DrBioRight 2.0, an intuitive bioinformatic platform powered by state-of-the-art large language models. DrBioRight enables researchers to explore protein-centric cancer omics data, perform advanced analyses, visualize results, and engage in interactive discussions using natural language. By streamlining complex proteogenomic analyses, this tool accelerates the translation of large-scale functional proteomics data into meaningful biomedical insights.

Imperative DSL Applications

Biocoder: A programming language for standardizing and automating biology protocols 1 updated 7y ago

This paper introduces BioCoder, a C++ library that enables biologists to express the exact steps needed to execute a protocol. In addition to being suitable for automation, BioCoder converts the code into a readable, English-language description for use by biologists.

Corel: A DSL for Cooking Recipes

The Corel DSL for cooking recipes enables understanding of and computation with ingredients, and can construct a nutrition label for the recipe.

Infinite Photorealistic Worlds Using Procedural Generation

This paper introduces Infinigen, a procedural generator of photorealistic 3D scenes of the natural world. Infinigen is entirely procedural: every asset, from shape to texture, is generated from scratch via randomized mathematical rules, using no external source and allowing infinite variation and composition.

Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation

This work introduces Infinigen Indoors, a Blender-based procedural generator of photorealistic indoor scenes. It builds upon the existing Infinigen system, which focuses on natural scenes, but expands its coverage to indoor scenes by introducing a diverse library of procedural indoor assets, including furniture, architecture elements, appliances, and other day-to-day objects. It also introduces a constraint-based arrangement system, which consists of a domain-specific language for expressing diverse constraints on scene composition, and a solver that generates scene compositions that maximally satisfy the constraints. The authors provide an export tool that allows the generated 3D objects and scenes to be directly used for training embodied agents in real-time simulators such as Omniverse and Unreal. Infinigen Indoors is open-sourced under the BSD license.

"We Need Structured Output": Towards User-centered Constraints on Large Language Model Output

Large language models can produce creative and diverse responses. However, to integrate them into current developer workflows, it is essential to constrain their outputs to follow specific formats or standards. This work surveyed 51 experienced industry professionals to understand the range of scenarios and motivations driving the need for output constraints from a user-centered perspective. The authors identified 134 concrete use cases for constraints at two levels: low-level, which ensures the output adheres to a structured format and an appropriate length, and high-level, which requires the output to follow semantic and stylistic guidelines without hallucination. Critically, applying output constraints could not only streamline the currently repetitive process of developing, testing, and integrating LLM prompts for developers, but also enhance the user experience of LLM-powered features and applications. The authors conclude with a discussion on user preferences and needs towards articulating intended constraints for LLMs, alongside an initial design for a constraint prototyping tool.
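
As a rough illustration of the low-level constraints mentioned above (structured format and appropriate length), the check below validates a model response after the fact. It is only a sketch under assumptions: the function name `check_low_level`, the required keys, and the character limit are hypothetical and do not come from the paper or its prototyping tool.

```python
import json
from typing import Dict, List


def check_low_level(output: str, required_keys: Dict[str, type], max_chars: int) -> List[str]:
    """Return a list of constraint violations; an empty list means the output passes."""
    violations: List[str] = []
    # Length constraint.
    if len(output) > max_chars:
        violations.append(f"too long: {len(output)} > {max_chars} characters")
    # Format constraint: the output must be a JSON object.
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return violations + ["not valid JSON"]
    if not isinstance(data, dict):
        return violations + ["top-level value is not an object"]
    # Schema constraint: each required key must exist with the expected type.
    for key, expected in required_keys.items():
        if key not in data:
            violations.append(f"missing key '{key}'")
        elif not isinstance(data[key], expected):
            violations.append(f"key '{key}' is not {expected.__name__}")
    return violations


schema = {"title": str, "tags": list}
ok = check_low_level('{"title": "x", "tags": ["a"]}', schema, max_chars=100)
chatty = check_low_level("Sure! Here is the JSON", schema, max_chars=100)
wrong = check_low_level('{"title": 3}', schema, max_chars=100)
```

High-level constraints (semantic or stylistic guidelines, absence of hallucination) cannot be checked this mechanically, which is part of why the survey treats the two levels separately.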

OCTOPUS: operation control system for task optimization and job parallelization via a user-optimal scheduler 1.7k updated 2mo ago

The material acceleration platform, empowered by robotics and artificial intelligence, is a transformative approach for expediting material discovery processes across diverse domains. However, the development of an operating system for material acceleration platforms faces challenges in simultaneously managing diverse experiments from multiple users. Specifically, when a platform is used by multiple users, overlapping demands on experimental modules or devices can lead to both inefficient resource utilization and safety hazards. To overcome these challenges, this work presents an operation control system for material acceleration platforms, namely OCTOPUS, an acronym for operation control system for task optimization and job parallelization via a user-optimal scheduler. OCTOPUS streamlines experiment scheduling and optimizes resource utilization by integrating its interface node, master node, and module nodes. Leveraging process modularization and a network protocol, OCTOPUS ensures the homogeneity, scalability, safety, and versatility of the platform. In addition, OCTOPUS embodies a user-optimal scheduler. Job parallelization and task optimization techniques mitigate delays and safety hazards within realistic operational environments, while the closed-packing schedule algorithm efficiently executes multiple jobs with minimal resource waste. Copilot of OCTOPUS is developed to promote the reusability of OCTOPUS for potential users with their own sets of lab resources, and it substantially simplifies code generation and customization through GPT recommendations and client feedback. This work offers a solution to the challenges encountered within a platform accessed by multiple users, and thereby will facilitate its widespread adoption in material development processes.

Declarative DSL Applications

Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language 16 updated 8mo ago

Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is a critical aspect for enabling ML integration in research workflows, yet many data models impose significant rigidity, making it difficult to accommodate the broad array of experiment and data types found in polymer science. This inflexibility presents a significant barrier for researchers to leverage their historical data in ML development. This work shows that a domain-specific language, termed Chemical Markdown Language (CMDL), provides flexible, extensible, and consistent representation of disparate experiment types and polymer structures. CMDL enables seamless use of historical experimental data to fine-tune regression transformer (RT) models for generative molecular design tasks. The authors demonstrate the utility of this approach through the generation and experimental validation of catalysts and polymers in the context of ring-opening polymerization, although they also provide examples of how CMDL can be more broadly applied to other polymer classes. Critically, this work shows how the CMDL-tuned model preserves key functional groups within the polymer structure, allowing for experimental validation. These results reveal the versatility of CMDL and how it facilitates translation of historical data into meaningful predictive and generative models to produce experimentally actionable output.

OpenLaw

It is now possible to model all or parts of legal agreements using code (smart contracts), decreasing the cost and friction of creating, securing, and generating binding legal agreements. Lawyers lack basic tools to build these dynamic, "smart" contracts in a way that is enforceable and understandable to a legal professional. OpenLaw is a technology stack to help power next-generation "smart" legal agreements, with a domain-specific markup language, an integration framework, and a series of general applications.

A Language for Counterfactual Generative Models 178 updated 7mo ago

This paper presents Omega, a probabilistic programming language with support for counterfactual inference. This feature is accomplished by introducing a new operator to probabilistic programming akin to Pearl’s do.
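
To make the idea of a `do`-style operator concrete, here is a small sketch in plain Python. It does not use Omega or reproduce its API: the `model` function, the rain/sprinkler mechanisms, and the probabilities are hypothetical. The counterfactual is computed in the usual way for structural causal models: keep the exogenous noise of the worlds consistent with the evidence, then rerun the model with one mechanism overridden.

```python
import random
from typing import Callable, Dict, Optional

World = Dict[str, bool]


def model(noise: Dict[str, float], do: Optional[World] = None) -> World:
    """Toy structural causal model; `do` overrides a variable's mechanism."""
    do = do or {}
    rain = do.get("rain", noise["u_rain"] < 0.3)
    sprinkler = do.get("sprinkler", (not rain) and noise["u_sprinkler"] < 0.5)
    wet = do.get("wet", rain or sprinkler)
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}


def counterfactual_prob(
    query: Callable[[World], bool],
    evidence: Callable[[World], bool],
    do: World,
    n: int = 20000,
    seed: int = 0,
) -> float:
    """Estimate P(query in the intervened world | evidence in the actual world)."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(n):
        noise = {"u_rain": rng.random(), "u_sprinkler": rng.random()}
        if not evidence(model(noise)):  # keep only noise consistent with the evidence
            continue
        total += 1
        hits += query(model(noise, do))  # same noise, intervened mechanisms
    return hits / total if total else float("nan")


# "It rained and the grass is wet. Would the grass be wet had it not rained?"
p = counterfactual_prob(
    query=lambda w: w["wet"],
    evidence=lambda w: w["rain"] and w["wet"],
    do={"rain": False},
)
```

Under these assumed mechanisms the estimate is close to 0.5: without rain, the grass is wet only if the sprinkler would have turned on, which depends on noise that the evidence says nothing about.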

Goals as reward-producing programs 13 updated 1y ago

People are remarkably capable of generating their own goals, beginning with child’s play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. This work bridges this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, the authors learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. The authors also discovered that the model’s internal fitness scores predict games that are evaluated as more fun to play and more human-like.
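
The notion of a goal as a reward-producing program can be illustrated with a very small sketch. The combinators below (`holds`, `both`, `count_then`) and the state keys are hypothetical and far simpler than the paper's DSL; they only show how symbolic predicates compose, how a temporal constraint is added, and how the resulting program is executed on a behavioural trace to produce a score.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, Optional[str]]
Pred = Callable[[State], bool]


def holds(key: str, value: Optional[str]) -> Pred:
    """Predicate: the state variable `key` currently equals `value`."""
    return lambda s: s.get(key) == value


def both(p: Pred, q: Pred) -> Pred:
    """Predicate composition: both predicates hold in the same state."""
    return lambda s: p(s) and q(s)


def count_then(first: Pred, second: Pred) -> Callable[[List[State]], int]:
    """Temporal constraint: score one point each time `first` is later followed by `second`."""
    def score(trace: List[State]) -> int:
        points, armed = 0, False
        for state in trace:
            if armed and second(state):
                points, armed = points + 1, False
            elif first(state):
                armed = True
        return points
    return score


# Goal: "pick up the ball, then get it into the bin"; one point per success.
goal = count_then(
    holds("holding", "ball"),
    both(holds("ball_in", "bin"), holds("holding", None)),
)

trace = [
    {"holding": None, "ball_in": "floor"},
    {"holding": "ball", "ball_in": None},
    {"holding": None, "ball_in": "bin"},    # first success
    {"holding": "ball", "ball_in": None},
    {"holding": None, "ball_in": "floor"},  # missed throw, no point
    {"holding": "ball", "ball_in": None},
    {"holding": None, "ball_in": "bin"},    # second success
]
reward = goal(trace)
```

Because the goal is an ordinary program, it can be scored on any recorded trace, which is what allows a fitness function to be defined over a space of such programs.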

The Scene Language: Representing Scenes with Programs, Words, and Embeddings

This paper introduces the Scene Language, a visual scene representation that concisely and precisely describes the structure, semantics, and identity of visual scenes. It represents a scene with three key components: a program that specifies the hierarchical and relational structure of entities in the scene, words in natural language that summarize the semantic class of each entity, and embeddings that capture the visual identity of each entity. This representation can be inferred from pre-trained language models via a training-free inference technique, given text or image inputs.

Configurable 3D Scene Synthesis and 2D Image Rendering with Per-pixel Ground Truth Using Stochastic Grammars

This work proposes a systematic learning-based approach to the generation of massive quantities of synthetic 3D scenes and arbitrary numbers of photorealistic 2D images thereof, for the purposes of training, benchmarking, and diagnosing learning-based computer vision and robotics algorithms. In particular, the authors devise a learning-based pipeline of algorithms capable of automatically generating and rendering a potentially infinite variety of indoor scenes by using a stochastic grammar, represented as an attributed Spatial And-Or Graph, in conjunction with state-of-the-art physics-based rendering. The pipeline is capable of synthesizing scene layouts with high diversity, and it is configurable inasmuch as it enables the precise customization and control of important attributes of the generated scenes. It renders photorealistic RGB images of the generated scenes while automatically synthesizing detailed, per-pixel ground truth data, including visible surface depth and normal, object identity, and material information (detailed to object parts), as well as environments (e.g., illuminations and camera viewpoints). The authors demonstrate the value of the synthesized dataset by improving performance in certain machine-learning-based scene understanding tasks (depth and surface normal prediction, semantic segmentation, reconstruction, etc.) and by providing benchmarks for and diagnostics of trained models by modifying object attributes and scene properties in a controllable manner.
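
The sampling step of a stochastic And-Or grammar can be sketched as below. This is a toy example, not the paper's Spatial And-Or Graph: the `GRAMMAR` table, its symbols, and the branching probabilities are hypothetical, and spatial attributes and rendering are left out. And-nodes expand into all of their children, while Or-nodes pick exactly one child according to its probability.

```python
import random
from typing import List, Union

# Hypothetical toy grammar. "and": expand every child; "or": choose one (child, probability).
GRAMMAR = {
    "scene": ("and", ["room", "furniture"]),
    "room": ("or", [("bedroom", 0.5), ("office", 0.5)]),
    "furniture": ("and", ["seat", "surface"]),
    "seat": ("or", [("chair", 0.7), ("sofa", 0.3)]),
    "surface": ("or", [("desk", 0.6), ("table", 0.4)]),
}

Tree = Union[str, dict]


def sample(symbol: str, rng: random.Random) -> Tree:
    """Sample a parse tree rooted at `symbol`; symbols not in GRAMMAR are terminals."""
    if symbol not in GRAMMAR:
        return symbol
    kind, children = GRAMMAR[symbol]
    if kind == "and":
        return {symbol: [sample(child, rng) for child in children]}
    names = [name for name, _ in children]
    weights = [weight for _, weight in children]
    choice = rng.choices(names, weights=weights, k=1)[0]
    return {symbol: sample(choice, rng)}


def leaves(tree: Tree) -> List[str]:
    """Flatten a parse tree into its terminal symbols, left to right."""
    if isinstance(tree, str):
        return [tree]
    (children,) = tree.values()
    if isinstance(children, list):
        return [leaf for child in children for leaf in leaves(child)]
    return leaves(children)


rng = random.Random(0)
layout = leaves(sample("scene", rng))  # e.g. a room type, a seat, and a surface
```

Controlling the Or-node probabilities is one simple way such a pipeline can be made configurable: changing a weight shifts the distribution of generated scenes without touching the rest of the grammar.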

Domain Specific Language for Smart Contract Development

This research addresses the difficulty of understanding smart contracts that arises from the conceptual discrepancy between contractual clauses and the corresponding Solidity code, through the design and study of a domain-specific smart contract language based on a higher level of abstraction that can be automatically transformed into an implementation.

Product Line Engineering Using Domain-Specific Languages

This paper investigates the application of domain-specific languages in product line engineering (PLE). It starts by analyzing the limits of expressivity of feature models. Feature models correspond to context-free grammars without recursion, which prevents the expression of multiple instances and references. The authors then show how domain-specific languages (DSLs) can serve as a middle ground between feature modeling and programming. They can be used in cases where feature models are too limited, while keeping the separation between problem space and solution space provided by feature models. This work then categorizes useful combinations of configuration with feature models and construction with DSLs, and provides an integration of DSLs into the conceptual framework of PLE. Finally, the authors show how use of a consistent, unified formalism for models, code, and configuration can yield important benefits for managing variability and traceability.
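
The claim that feature models correspond to context-free grammars without recursion can be illustrated with a small sketch. The grammar below is a hypothetical example, not one from the paper: an alternative group becomes a choice between productions, and an optional feature becomes a choice between an empty and a non-empty production. Because no symbol refers back to itself, the set of configurations is finite and can be enumerated.

```python
from itertools import product
from typing import Dict, List

# Hypothetical feature model written as a non-recursive grammar.
# Each symbol maps to its alternative productions; other names are terminal features.
GRAMMAR: Dict[str, List[List[str]]] = {
    "Car": [["Engine", "Extras"]],          # mandatory children
    "Engine": [["petrol"], ["electric"]],   # alternative group: exactly one
    "Extras": [[], ["sunroof"]],            # optional feature
}


def expand(symbol: str) -> List[List[str]]:
    """Enumerate every feature selection derivable from `symbol`."""
    if symbol not in GRAMMAR:
        return [[symbol]]
    results: List[List[str]] = []
    for production in GRAMMAR[symbol]:
        parts = [expand(child) for child in production]
        for combo in product(*parts):
            results.append([feature for part in combo for feature in part])
    return results


configurations = expand("Car")
```

A product such as "a car with any number of seats, each with its own settings" needs repetition and references between instances, which a grammar of this shape cannot express; that is the gap the paper proposes to fill with DSLs.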

Design Automation

Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs
