Jung - Rilke Correspondence Network
A joint project bringing together three separate projects: Rilke correspondence, Jung correspondence and ETH Library.
Data
- List and link your actual and ideal data sources.
ACTUAL
- For Jung correspondence: https://opendata.swiss/dataset/c-g-jung-correspondence (three files)
- For Rilke correspondence: https://opendata.swiss/en/dataset/handschriften-rainer-maria-rilke (two files, images and metadata)
Comment: The Rilke data is cleaner than the Jung data. Some cleaning is needed to make them match:
- Separate sender and receiver; clean up and cluster (OpenRefine).
- Clean up dates and put them in the format the developers need (Perl).
- Clean up place names and match them to geolocators (Dariah-DE).
- Match senders and receivers to Wikidata where possible (OpenRefine; problem with volume).
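The first two cleaning steps can be sketched roughly as follows. This is only an illustration in Python (the project itself used OpenRefine and Perl). It assumes, hypothetically, that the combined column looks like "Sender an Recipient" and that raw dates are written day.month.year; both assumptions would need checking against the actual Jung files.

```python
import datetime
import re
from typing import Optional, Tuple

# ASSUMPTION: the combined column uses " an " (German "to") between the two names.
def split_correspondents(field: str, separator: str = " an ") -> Tuple[str, Optional[str]]:
    """Split a combined 'Sender an Recipient' value into (sender, recipient)."""
    sender, sep, recipient = field.partition(separator)
    if not sep:
        # Separator not found: keep the whole value as sender and flag the gap.
        return field.strip(), None
    return sender.strip(), recipient.strip()

# ASSUMPTION: raw dates look like "12.3.1925" (day.month.year).
DATE_RE = re.compile(r"^\s*(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})\s*$")

def to_iso_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if it cannot be parsed safely."""
    match = DATE_RE.match(raw)
    if match is None:
        return None
    day, month, year = (int(group) for group in match.groups())
    try:
        return datetime.date(year, month, day).isoformat()
    except ValueError:
        # Impossible calendar date, e.g. 31 February.
        return None
```

Values that return None are left for manual review rather than guessed, which keeps uncertain dates and unsplit names visible in the cleaned file.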
IDEAL
DATA after cleaning:
https://github.com/basimar/hackathon17_jungrilke
* Description of steps, and issues, in Process (please correct and refine).
Objective: provide a framework for correspondence by defining a database that can be used not just by these two projects but by others as well, and that works well with visualisation software in order to see correspondence networks.
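To make the objective more concrete, here is a minimal sketch of what such a shared database could look like, using SQLite from Python. The table and column names (person, letter, wikidata_qid, collection and so on) are hypothetical and are not the fields the team actually defined. The edge_list query shows one way to derive sender-recipient pairs with letter counts, the kind of input a network visualisation tool usually expects.

```python
import sqlite3

# ASSUMPTION: illustrative schema only; the real field list was still being defined.
SCHEMA = """
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    wikidata_qid TEXT              -- optional link to Wikidata
);
CREATE TABLE letter (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES person(id),
    recipient_id INTEGER NOT NULL REFERENCES person(id),
    sent_on TEXT,                  -- ISO date, may be missing
    place TEXT,
    collection TEXT NOT NULL       -- e.g. 'Jung' or 'Rilke'
);
"""

def create_db(path=":memory:"):
    """Open a database and create the two tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def person_id(conn, name):
    """Return the id for a name, inserting the person if not yet present."""
    conn.execute("INSERT OR IGNORE INTO person (name) VALUES (?)", (name,))
    return conn.execute("SELECT id FROM person WHERE name = ?", (name,)).fetchone()[0]

def add_letter(conn, sender, recipient, sent_on, place, collection):
    """Store one letter, creating sender and recipient records as needed."""
    conn.execute(
        "INSERT INTO letter (sender_id, recipient_id, sent_on, place, collection) "
        "VALUES (?, ?, ?, ?, ?)",
        (person_id(conn, sender), person_id(conn, recipient), sent_on, place, collection),
    )

def edge_list(conn):
    """Return (sender, recipient, letter_count) rows for network visualisation."""
    return conn.execute(
        """
        SELECT s.name, r.name, COUNT(*) AS n
        FROM letter l
        JOIN person s ON s.id = l.sender_id
        JOIN person r ON r.id = l.recipient_id
        GROUP BY s.name, r.name
        ORDER BY n DESC, s.name, r.name
        """
    ).fetchall()
```

Keeping persons in their own table means a correspondent who appears in both collections is stored once, so the two networks join up automatically.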
The main issue with the Jung correspondence is data quality: sender and recipient are in one column, and data cleaning is still needed. Dates also need both cleaning for consistency and transformation to meet the developers' specs (Basil is using Perl).
We will look for Wikidata Q codes with OpenRefine, note them, and list the names that need to be created in Wikidata for future use.
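Once the reconciled data is exported, the list of names still to be created could be produced along these lines. This is a sketch under assumptions: the column names "name" and "qid" are hypothetical, and any value that does not look like a Q code (empty, "n/a", and so on) is treated as unmatched.

```python
import re

# A Q code is the letter Q followed by a positive integer.
QID_RE = re.compile(r"^Q[1-9]\d*$")

def names_to_create(rows, name_col="name", qid_col="qid"):
    """Return the sorted, de-duplicated names that have no valid Q code.

    rows is an iterable of dicts, e.g. from csv.DictReader over an export.
    ASSUMPTION: the column names are placeholders for the real export headers.
    """
    missing = set()
    for row in rows:
        qid = (row.get(qid_col) or "").strip()
        if not QID_RE.match(qid):
            missing.add(row[name_col].strip())
    return sorted(missing)
```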
Issues with the target database: fields are defined; SQL databases and a visualisation program are being evaluated. How, and whether, to integrate with Wikidata is still not clear.
Issues with Wikidata: letters are too detailed to be Wikidata items, although it looks like the senders and recipients have the notability and networks to make it worthwhile. We are trying to keep options open.
While the developers are building the database to be used with the visualization tool, the data is being cleaned and Q codes are being extracted.
Doing this all at once poses some project management challenges, since several people may be working on the same files to clean different data. All files need to be integrated afterwards.
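One possible way to integrate files cleaned in parallel is sketched below. It assumes, hypothetically, that every file keeps a shared record identifier column (called "id" here) and that each person cleaned different columns. Where two files disagree on the same column, the first value is kept and the disagreement is reported instead of being silently overwritten.

```python
def merge_cleaned(*tables, key="id"):
    """Merge several lists of row dicts on a shared key.

    ASSUMPTION: every row carries the same record identifier in `key`.
    Returns (merged, conflicts): merged maps key -> combined row, and
    conflicts lists (key, column, kept_value, rejected_value) tuples.
    """
    merged = {}
    conflicts = []
    for table in tables:
        for row in table:
            record = merged.setdefault(row[key], {key: row[key]})
            for column, value in row.items():
                if column == key or value in ("", None):
                    continue  # nothing to contribute for this column
                if column in record and record[column] != value:
                    # Two contributors disagree: keep the first, report the second.
                    conflicts.append((row[key], column, record[column], value))
                    continue
                record[column] = value
    return merged, conflicts
```

The conflict list gives the team a short review queue, which is usually easier to handle than comparing whole files by hand.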
Additional issues encountered: Wikidata Q codes that OpenRefine linked to seem to have disappeared?
Team
Please add yourself to the list
Flor Méchain (Wikimedia CH): working on cleaning and matching with Wikidata Q codes using OpenRefine.
Lena Heizman (Doda): Mentoring with OpenRefine.
Hugo Martin
Samantha Weiss
Michael Gasser
Irina Schubert
Sylvie Béguelin
Basie Manti
Jérome Zbinden
Deborah Kyburz
Paul Varé
Laurel Zuckerman
Christian Sisi??
Adrien Zemma
Dominik Sievi