Sources of Big Open Data
This page links to various places where large (hundreds of megabytes to gigabytes) open data sets can be obtained for analysis using big data techniques.
Swiss
- mingle.io (Swiss open data startup)
Worldwide
- DBpedia Downloads - semantically interlinked Wikipedia content
- Planet.osm - bulk downloads of OpenStreetMap
- GeoFabrik.de - extracts of OSM per continent
- arXiv bulk access - preprints of scientific articles
- Open Library - 20 million editions, 6 million authors
- BioTorrents - (legal!) open data torrents
- Visible Earth - NASA's image archives
- Public Library of Science - can be scraped
USA-specific
- Open States - bills & legislatures of the US states
- Geospatial Data Gateway - USDA environmental and natural resources data
- Enron Email Dataset - 500K messages