Machine Learning
- Data Packaged Core Datasets - Important, commonly-used datasets in high quality, easy-to-use & open form as data packages
- awesome-public-datasets - A topic-centric list of high-quality open datasets in public domains
- public data sets - A collection of public data sets, at various stages of preparation, for testing out visualization methods
Datasets
Reinforcement Learning
- Gym by OpenAI
- Gym Retro by OpenAI - 1000+ games, plus tools for analyzing them
- VGL - high-level video game description language (lets you treat game objects as primitives in Python)
- Habitat - high-performance 3D simulator with configurable agents, multiple sensors, and generic 3D dataset handling
- Visdoom - tool for visualizing reinforcement learning experimental data
Cognitive Science/Neuroscience
- Dallinger (for Mechanical Turk) - Laboratory automation for the behavioral and social sciences
- BrainFacts - Initiative to disseminate information about the brain, run by the Gatsby Institute, Kavli Foundation, & Society for Neuroscience
Cognitive Science Datasets
- Data on the Mind - Datasets on human activity, led by Professor Tom Griffiths at UC Berkeley
Social Science Datasets
- IRIS Dataset - Large dataset collected by UMich on research investments, scientific production, career outcomes, etc. across the United States over the past 10+ years
Neuroscience Datasets
- Brain Map - Data collected by the Allen Brain Atlas
- NeuroQuery - given a query, visually shows which parts of the brain have been related to it, along with the corresponding papers