project:chparlscraping

Differences

This shows you the differences between two versions of the page.

-project:chparlscraping [2015/09/05 17:20] shalf
+project:chparlscraping [2015/09/07 16:23] (current) yrochat
Line 1: Line 1:
 ==== Swiss parliament minutes scraping ====
-Is the Swiss parliament really useful ? Once elected, what are our councilors talking about ? Who is answering to who ?
+
+[[http://parlement.letemps.ch]]
+
+Is the Swiss parliament really useful ? Once elected, what are our councilors talking about ? Who is answering to whom ?
  
 Goal of this project is to answer some of these questions and many more. To do this, we are planning to:
Line 16: Line 19:
  
 === Data ===
-Raw data are available as one single JSON file, and its .cvs counterpart. We had size problems, thus exploring ways to produce several .csv
+Raw data are available as one single JSON file, and its .csv counterpart. We had size problems, thus exploring ways to produce several .csv.
-**The [[https://github.com/douglas-watson/parl-scraping/tree/master/data|final folder for our data is on github]], the [[https://github.com/douglas-watson/parl-scraping/blob/master/data/merged-csv.zip|.csv files are here]].**
+**The [[https://github.com/douglas-watson/parl-scraping/tree/master/data|final folder for our data is on github]], the [[https://github.com/douglas-watson/parl-scraping/blob/master/data/with-bio-split-csv.zip|.csv files split by session are here]].**
  
-== Structure of the main JSON (on Giovanni's side, complement with bio data from Jeremie) ==
+== Structure of the main JSON (on Giovanni's side, complement with the bio data from the Parliament API) ==
  
 list of interventions, with the following fields:
Line 34: Line 37:
   * Data: transcript of the intervention
   * Name: of the person speaking
- 
-== Graph data for Yannick == 
- 
-graph.csv: edgeless with Source (bio url as id of person replying to) - Destination (bio url as id of person talking before) - Subject (id of subject under discussion) - Date (of intervention, YY.MM.DD) 
-nodes.csv: nodelist with bio id - name - surname - canton - political group 
  
 == Structure of the Parliament API data via Yannick ==
Line 52: Line 50:
 == Structure of the final files from Jérémie ==
  
-1 JSON file + its .csv counterpart for each Parliament session from 1995.
+1 JSON file + its .csv counterpart for each Parliament session of the National Council from 1995.
+The same datasets are also available split as one JSON/CSV file per legislative session.
+
+== Graph data for Yannick ==
+
+graph.csv: edgeless with Source (bio url as id of person replying to) - Destination (bio url as id of person talking before) - Subject (id of subject under discussion) - Date (of intervention, YY.MM.DD)
+nodes.csv: nodelist with bio id - name - surname - canton - political group
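
The graph.csv layout described above can be illustrated with a short reader. This is only a minimal sketch in Python: it assumes the file is a comma-separated edge list whose header row uses exactly the names Source, Destination, Subject and Date, and the sample rows and bio ids below are made up for illustration. The real file may use other headers or another delimiter.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: the real graph.csv may differ in headers,
# delimiter and id format. Dates follow the YY.MM.DD convention.
SAMPLE_GRAPH_CSV = """Source,Destination,Subject,Date
bio/1,bio/2,100,15.06.01
bio/2,bio/1,100,15.06.01
bio/1,bio/2,101,15.06.02
"""


def read_edges(handle):
    """Yield (source, destination, subject, date) tuples from an edge list."""
    for row in csv.DictReader(handle):
        # %y is the two-digit year, so "15.06.01" becomes 2015-06-01.
        date = datetime.strptime(row["Date"], "%y.%m.%d").date()
        yield row["Source"], row["Destination"], row["Subject"], date


def reply_counts(edges):
    """Count how often each source replies to each destination."""
    return Counter((src, dst) for src, dst, _subject, _date in edges)


edges = list(read_edges(io.StringIO(SAMPLE_GRAPH_CSV)))
counts = reply_counts(edges)
print(counts.most_common(1))
```

To run this on the actual data, replace the in-memory sample with an open file handle (for example `open("graph.csv", newline="")`) and adjust the column names if they differ.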
  
 === Results visualization ===
  
-  Kibana Dashboard iframe:
+  Kibana Dashboard iframe:
 <code><iframe src="http://178.62.236.56:3335/#/dashboard/Swiss-parliament-minutes-scraping?embed&_a=(filters:!(),panels:!((col:1,id:Total-count,row:1,size_x:3,size_y:2,type:visualization),(col:9,id:Who,row:3,size_x:4,size_y:4,type:visualization),(col:1,id:Parties,row:3,size_x:4,size_y:4,type:visualization),(col:4,id:Members,row:1,size_x:9,size_y:2,type:visualization),(col:5,id:County,row:3,size_x:4,size_y:4,type:visualization)),query:(query_string:(analyze_wildcard:!t,query:'*')),title:'Swiss%20parliament%20minutes%20scraping')&_g=()" height="600" width="800"></iframe></code>
  
-  [[http://178.62.236.56:3335/#/dashboard/Swiss-parliament-minutes-scraping?_a=(filters:!(),panels:!((col:1,id:Total-count,row:1,size_x:3,size_y:2,type:visualization),(col:9,id:Who,row:3,size_x:4,size_y:4,type:visualization),(col:1,id:Parties,row:3,size_x:4,size_y:4,type:visualization),(col:4,id:Members,row:1,size_x:9,size_y:2,type:visualization),(col:5,id:County,row:3,size_x:4,size_y:4,type:visualization)),query:(query_string:(analyze_wildcard:!t,query:'*')),title:'Swiss%20parliament%20minutes%20scraping')&_g=()|Example of a Kibana Dashboard]]:
+  [[http://178.62.236.56:3335/#/dashboard/Swiss-parliament-minutes-scraping?_a=(filters:!(),panels:!((col:1,id:Total-count,row:1,size_x:3,size_y:2,type:visualization),(col:9,id:Who,row:3,size_x:4,size_y:4,type:visualization),(col:1,id:Parties,row:3,size_x:4,size_y:4,type:visualization),(col:4,id:Members,row:1,size_x:9,size_y:2,type:visualization),(col:5,id:County,row:3,size_x:4,size_y:4,type:visualization)),query:(query_string:(analyze_wildcard:!t,query:'*')),title:'Swiss%20parliament%20minutes%20scraping')&_g=()|Example of a Kibana Dashboard]]:
 <pic sylvain>
  
-  Example viz graph "who talks to who":
+  Example viz graph "who talks to who":
 <pic yannick>
  
-  Semantic distance between members of parliament: <viz pa
+  Semantic distance between members of parliament: <viz pa>
-
+
-  - A simple gender gap visualization for the current Parliament: <[[https://docs.google.com/spreadsheets/d/1MiO6w331UMGX4vYTyhsMs5uUAgAUCJfyzPkRqAUSjww/edit?usp=sharing|gsheet shalf]]>
+
 
+  * A simple gender gap visualization for the current Parliament that kind of summarizes it all: <[[https://docs.google.com/spreadsheets/d/1MiO6w331UMGX4vYTyhsMs5uUAgAUCJfyzPkRqAUSjww/edit?usp=sharing|gsheet shalf]]>
 ===== Team =====
   * Giovanni Colavizza, [[https://github.com/Giovanni1085|github: Giovanni1085]]
Line 74: Line 77:
   * [[http://shalf.me|Yann Heurtaux]] [[https://twitter.com/shalf|@shalf]], [[https://github.com/shalf|github: shalf]]
   * Fabrice Hong, [[https://github.com/fabricehong|github: fabricehong]]
-  * Jérémie Knüsel [[https://twitter.com/ambystome|@ambystome]], [[https://github.com/knuessel|github: knuessel]]
+  * Jan Iwaszkiewicz, [[https://github.com/jan44|github: jan44]]
+  * Jérémie Knüsel [[https://twitter.com/ambystome|@ambystome]], [[https://github.com/knuesel|github: knuesel]]
   * Sylvain Moesching
   * [[user:nray|Nicolas Ray]]
   * [[http://yro.ch|Yannick Rochat]] [[https://twitter.com/yrochat|@yrochat]], [[https://github.com/yrochat|github: yrochat]]
   * Douglas Watson, [[https://github.com/douglas-watson|github: douglas-watson]]
-  * Jan Iwaszkiewicz, [[https://github.com/jan44|github: jan44]]
+
  
 ===== Links =====
  • project/chparlscraping.1441466457.txt.gz
  • Last modified: 2015/09/05 17:20
  • by shalf