====== Sources of Big Open Data ======

This page links to sources of large open data sets (hundreds of megabytes to gigabytes) suitable for analysis with [[http://en.wikipedia.org/wiki/Big_data|big data]] techniques.

==== Swiss ====

  * [[data:270a|Statistical Linked Datasets]]
  * [[http://www.mingle.io|mingle.io]] (Swiss open data startup)

==== Worldwide ====

  * [[http://wiki.dbpedia.org/Downloads|DBpedia Downloads]] - semantically interlinked Wikipedia content
  * [[http://wiki.openstreetmap.org/wiki/Planet.osm|Planet.osm]] - bulk downloads of OpenStreetMap
  * [[http://download.geofabrik.de/|GeoFabrik.de]] - OSM extracts per continent
  * [[http://arxiv.org/help/bulk_data|Arxiv bulk access]] - preprints of scientific articles
  * [[http://openlibrary.org/data|Open Library]] - 20 million editions, 6 million authors
  * [[http://www.biotorrents.net/browse.php|BioTorrents]] - (legal!) open data torrents
  * [[http://visibleearth.nasa.gov/|Visible Earth]] - NASA's image archives
  * [[https://earth.esa.int/web/guest/pi-community/apply-for-data|ESA Mission data]] - some data in the [[https://earth.esa.int/web/guest/data-access/browse-data-products|products]] and [[https://earth.esa.int/web/guest/campaigns|campaigns]] requires registration or an application
  * [[http://www.ploscollections.org/static/pcbiCollections|Public Library of Science]] - can be [[http://www.neurogems.org/fetchplos/|scraped]]

==== USA-specific ====

  * [[http://openstates.org/downloads/|Open States]] - bills and legislatures from the US
  * [[http://datagateway.nrcs.usda.gov/|Geospatial Data Gateway]] - USDA environmental and natural resources data
  * [[http://www.cs.cmu.edu/~enron/|Enron Email Dataset]] - 500K messages