Differences
This shows you the differences between two versions of the page.
project:jung_rilke_correspondance_network [2017/09/15 12:30] – [Data] basimar | project:jung_rilke_correspondance_network [2017/10/26 16:36] (current) – [Team] mgasser | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Jung - Rilke Correspondance Network | + | ===== Jung - Rilke Correspondence Networks |
- | (screenshots or sketches up here) | + | Joint project bringing together three separate projects: Rilke correspondence, |
- | Joint project bringing together three separate | + | Objectives: |
+ | * agree on a common metadata structure for correspondence datasets | ||
+ | * clean and enrich the existing datasets | ||
+ | * build a database that can be used not just by these two projects | ||
+ | * experiment with existing visualization | ||
===== Data ===== | ===== Data ===== | ||
- | * List and link your actual and ideal data sources. | + | **ACTUAL INPUT DATA** |
- | ACTUAL | ||
* For Jung correspondence: | * For Jung correspondence: | ||
* For Rilke correspondence: | * For Rilke correspondence: | ||
* | * | ||
- | Comment: The Rilke data is cleaner than the Jung data. Some cleaning needed to make them match. | + | Comment: The Rilke data is cleaner than the Jung data. Some cleaning needed to make them match: |
- | IDEAL | + | 1) separate sender and receiver; clean up and cluster (OpenRefine) |
- | Jung: comments. Needs to be cleaned up. Names of sender | + | 2) clean up dates and put in a format that IT developers need (Perl) |
- | To do: extract place. | + | 3) clean up placenames and match to geolocators (Dariah-DE) |
+ | 4) match senders and receivers to Wikidata where possible (OpenRefine, | ||
+ | |||
+ | **METADATA STRUCTURE** | ||
+ | |||
+ | The following fields were included in the common basic data structure: | ||
+ | |||
+ | sysID; callNo; titel; sender; senderID; recipient; recipientID; | ||
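The field list above could be mirrored in code roughly as follows. This is only a sketch: it covers the seven field names visible in the list (which is cut off after `recipientID`), the class name `LetterRecord` and the helper `record_from_row` are hypothetical, and treating every field as plain text is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class LetterRecord:
    # Only the field names visible in the (truncated) list above.
    # Storing everything as text is an assumption of this sketch.
    sysID: str = ""
    callNo: str = ""
    titel: str = ""
    sender: str = ""
    senderID: str = ""
    recipient: str = ""
    recipientID: str = ""


def record_from_row(row: dict) -> LetterRecord:
    """Build a record from a dict, ignoring columns this sketch does not know."""
    known = {f.name for f in fields(LetterRecord)}
    cleaned = {k: (v or "").strip() for k, v in row.items() if k in known}
    return LetterRecord(**cleaned)
```

Rows from either dataset could then be passed through `record_from_row`, so that extra source-specific columns do not leak into the shared structure.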
+ | |||
+ | **DATA CLEANSING AND ENRICHMENT** | ||
+ | |||
+ | |||
+ | * Description of steps, and issues, in Process (please correct and refine). | ||
+ | |||
+ | |||
+ | The main issue with the Jung correspondence is its data structure: sender and recipient are in one column. | ||
+ | Also dates need both cleaning for consistency (e.g. removal of " | ||
+ | |||
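The date clean-up described above was done in Perl according to this page; the sketch below uses Python instead, purely as an illustration. The accepted input formats in `ASSUMED_FORMATS` and the function name `normalize_date` are assumptions, not taken from the project. Values that match none of the formats are returned as `None` so they can be reviewed by hand rather than guessed.

```python
from datetime import datetime
from typing import Optional

# Hypothetical input formats; the real datasets may use others.
ASSUMED_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if it cannot be parsed."""
    text = raw.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed format
    return None
```

For example, `normalize_date("12.03.1925")` yields `"1925-03-12"`, while a free-text value such as `"ca. 1925"` yields `None`.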
+ | For geocoding the placenames: OpenRefine was used for the normalization | ||
+ | |||
+ | The C.G. Jung dataset contains sending-location information for 16,619 out of 32,127 letters; 10,271 places were georeferenced. In the Rilke dataset all the sending locations were georeferenced. | ||
+ | |||
+ | For matching senders and recipients | ||
+ | |||
+ | Doing this all at once poses some project management challenges, since several people may be working on the same files to clean different data. All files then need to be integrated. | ||
DATA after cleaning: | DATA after cleaning: | ||
Line 23: | Line 50: | ||
https:// | https:// | ||
+ | **DATABASE** | ||
+ | Issues with the target database: | ||
+ | Fields defined; SQL databases and visualization programs being evaluated. | ||
+ | How - and whether - to integrate with Wikidata is still not clear. | ||
- | ===== Team ===== | + | Issues: letters are too detailed to be imported as Wikidata items, although it looks like the senders and recipients have the notability and networks to make it worthwhile. Trying to keep options open. |
- | Please add yourself | + | As IT guys are building the database |
+ | They took the cleaned CSV files, converted them to SQL, then JSON. | ||
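A conversion of this kind (CSV to SQL to JSON) might look roughly like the sketch below. It is not the team's actual pipeline. SQLite stands in for whichever SQL database was evaluated, the table name `letters`, the semicolon delimiter and the three-column subset are all assumptions, and the function names are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Assumed subset of the common fields, to keep the sketch short.
COLUMNS = ("sysID", "sender", "recipient")


def csv_to_sqlite(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load semicolon-separated CSV text into a 'letters' table; return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS letters "
        "(sysID TEXT PRIMARY KEY, sender TEXT, recipient TEXT)"
    )
    rows = [tuple(row[col] for col in COLUMNS) for row in reader]
    conn.executemany("INSERT OR REPLACE INTO letters VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def sqlite_to_json(conn: sqlite3.Connection) -> str:
    """Dump the 'letters' table as a JSON array of objects, ordered by sysID."""
    cursor = conn.execute(
        "SELECT sysID, sender, recipient FROM letters ORDER BY sysID"
    )
    return json.dumps(
        [dict(zip(COLUMNS, row)) for row in cursor], ensure_ascii=False
    )
```

With an in-memory database (`sqlite3.connect(":memory:")`) the two functions can be tried out without creating any files.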
- | Flor Méchain (Wikimedia CH): working on cleaning and matching with Wikidata Q codes using OpenRefine. | ||
- | Lena Heizman (Doda): Mentoring with OpenRefine. | ||
- | Hugo Martin | + | Additional issues encountered: |
- | Samantha Weiss | + | - Visualization: |
- | Michael Gasser | + | - Ensuring that the files from different projects respect the same structure in final, cleaned-up versions. |
- | Irina Schubert | ||
- | Sylvie Béguelin | ||
- | Basie Manti | ||
- | Jérome Zbinden | ||
- | Deborah Kyburz | ||
- | Paul Varé | ||
- | Laurel Zuckerman | ||
- | Christian Sisi?? | ||
- | Adrien Zemma | + | ===== Visualization (examples) ===== |
- | Dominik Sievi | + | {{: |
- | * [[user:wdparis2017]] | + | Heatmap of Rainer Maria Rilke’s correspondence (visualized with Google Fusion Tables) |
+ | |||
+ | |||
+ | {{:project: | ||
+ | |||
+ | Correspondence from and to C. G. Jung visualized as a network. The two large nodes are Carl Gustav Jung (below) and his secretary’s office (above). Visualized with the tool Gephi | ||
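A network view like the one described above needs an edge list of sender and recipient pairs. The sketch below shows one possible way to derive it, assuming each letter is available as a (sender, recipient) pair. The function names are hypothetical, and counting letters per pair as the edge weight is an assumption. The `Source`, `Target` and `Weight` column names are the ones Gephi's spreadsheet import recognizes for edge tables.

```python
import csv
import io
from collections import Counter


def build_edge_list(letters):
    """Aggregate (sender, recipient) pairs into sorted (source, target, weight) edges."""
    counts = Counter(
        (sender.strip(), recipient.strip())
        for sender, recipient in letters
        if sender.strip() and recipient.strip()  # skip letters missing either side
    )
    return [(s, r, w) for (s, r), w in sorted(counts.items())]


def edges_to_csv(edges) -> str:
    """Serialize edges as CSV text with Source/Target/Weight headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()
```

Direction is kept (A to B and B to A are separate edges), which suits a directed graph; for an undirected view the pairs would have to be sorted before counting.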
+ | ===== Video of the presentation ===== | ||
+ | {{vimeo> | ||
| | ||
- | {{tag> | + | {{tag> |
+ | |||
+ | |||
+ | ===== Team ===== | ||
+ | |||
+ | |||
+ | |||
+ | * Flor Méchain (Wikimedia CH): working on cleaning and matching with Wikidata Q codes using OpenRefine. | ||
+ | * Lena Heizman (Dodis / histHub): Mentoring with OpenRefine. | ||
+ | * Hugo Martin | ||
+ | * Samantha Weiss | ||
+ | * Michael Gasser (Archives, ETH Library): provider of the dataset [[https:// | ||
+ | * Irina Schubert | ||
+ | * Sylvie Béguelin | ||
+ | * Basil Marti | ||
+ | * Jérome Zbinden | ||
+ | * Deborah Kyburz | ||
+ | * Paul Varé | ||
+ | * Laurel Zuckerman | ||
+ | * Christiane Sibille (Dodis / histHub) | ||
+ | * Adrien Zemma | ||
+ | * Dominik Sievi [[user: | ||