project:hds_out_of_the_box



The Historical Dictionary of Switzerland (HDS) is an academic reference work that documents “the most important topics and objects of Swiss history from prehistory up to the present”.

The HDS digital edition comprises about XXXX articles organized into four main headword groups:
- Biographies,
- Families,
- Geographical Entities and
- Thematic contributions.

Beyond the encyclopaedic description of entities and concepts, each article contains references to the primary and secondary sources that its authors drew on.

We have the following data:

- metadata about the articles of the Historical Dictionary of Switzerland (HDS)
- Le Temps digital archive for the year 1914
- bibliographic references of HDS articles

Our projects revolve around linking the HDS to external data and have two aims:

1. Entity Linking towards HDS

The objective is to link named entity mentions found in historical Swiss newspapers to their corresponding HDS articles.

2. Exploring reference citation of HDS articles

The objective is to reconcile HDS bibliographic data with SwissBib.

We work on the list of references in all articles of the HDS, with three goals:

  1. Find all the sources cited in the HDS (several sources are cited multiple times).
  2. Link all the sources with the SwissBib catalog, if possible.
  3. Interactively explore the citation networks of the HDS.

The dataset: lists of references in every HDS article:

Result of source disambiguation and look-up into SwissBib:

Bibliographic coupling network of the HDS articles (giant component). In bibliographic coupling, two articles are connected if they cite at least one source in common. Biographies (white), Places (green), Families (blue) and Topics (red):
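The coupling rule described above can be illustrated with a minimal sketch. It assumes the references have already been disambiguated, so that each article maps to a set of source identifiers; the article and source names below are hypothetical toy data, not taken from the HDS.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: article -> set of identifiers of the sources it cites.
citations = {
    "article_1": {"src_a", "src_b"},
    "article_2": {"src_b", "src_c"},
    "article_3": {"src_d"},
}

def bibliographic_coupling(citations):
    """Return the set of article pairs that share at least one cited source."""
    # Invert the mapping: source -> articles that cite it.
    cited_by = defaultdict(set)
    for article, sources in citations.items():
        for source in sources:
            cited_by[source].add(article)

    edges = set()
    for articles in cited_by.values():
        # Every pair of articles citing the same source is coupled.
        for a, b in combinations(sorted(articles), 2):
            edges.add((a, b))
    return edges

edges = bibliographic_coupling(citations)
```

Here only article_1 and article_2 end up connected, because src_b is the only source cited by more than one article.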

Co-citation network of the HDS sources (giant component of degree > 15). In co-citation networks, two sources are connected if they are cited together by one or more articles. Publications (white), Works of the subject of an article (green), Archival sources (cyan) and Critical editions (grey):
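As a rough sketch of the co-citation rule, the snippet below counts how many articles cite each pair of sources together and then drops edges that touch low-degree sources. The function names and the toy data are hypothetical, and the single-pass degree filter is only one possible reading of the "degree > 15" restriction in the caption.

```python
from collections import Counter
from itertools import combinations

def cocitation_edges(citations):
    """Count, for each pair of sources, how many articles cite both."""
    weights = Counter()
    for sources in citations.values():
        for pair in combinations(sorted(sources), 2):
            weights[pair] += 1
    return weights

def filter_by_degree(weights, min_degree):
    """Keep edges whose two endpoints both have a degree above min_degree.

    The degree is computed once on the unfiltered network (no iteration).
    """
    degree = Counter()
    for a, b in weights:
        degree[a] += 1
        degree[b] += 1
    return {
        pair: w
        for pair, w in weights.items()
        if degree[pair[0]] > min_degree and degree[pair[1]] > min_degree
    }

# Hypothetical toy data: article -> set of cited source identifiers.
citations = {
    "article_1": {"s1", "s2", "s3"},
    "article_2": {"s1", "s2"},
    "article_3": {"s3", "s4"},
}
weights = cocitation_edges(citations)
core = filter_by_degree(weights, min_degree=1)
```

The edge weight records how often two sources appear together, which can be used later to size or filter edges in a visualisation.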

Bibliographic data in the HDS citations is unfortunately not structured. There is no logical separation between work title, publication year, page numbers, etc. other than typographical convention. Furthermore, many citations contain abbreviations. We explored the dataset with OpenRefine and tried several approaches to querying the SwissBib SRU API with unstructured citation data.
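To illustrate what such a query could look like, here is a minimal sketch that builds an SRU searchRetrieve URL from a raw citation string. The base URL is a placeholder rather than the actual SwissBib endpoint, and the choice of the `cql.serverChoice` index with the `all` relation is an assumption about what the server accepts. The sketch only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the catalogue's real SRU base URL.
SRU_BASE_URL = "https://sru.example.org/search"

def build_sru_url(citation, max_records=5, base_url=SRU_BASE_URL):
    """Build an SRU searchRetrieve URL that searches for the raw citation text."""
    # Escape backslashes and double quotes so the text fits in a quoted CQL term.
    term = citation.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        # Assumed CQL query: every word of the citation must occur in the record.
        "query": f'cql.serverChoice all "{term}"',
        "maximumRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"

url = build_sru_url("Stat. Jb. des Kt. L., 2002-")
```

Because abbreviations such as "Jb." or "Kt." are unlikely to appear in catalogue records, a query on the raw string will often return nothing; expanding abbreviations first is probably needed.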

Examples of unstructured data issues

  • L'oro bruno - Cioccolato e cioccolatieri delle terre ticinesi, Ausstellungskat. Lottigna, 2007 - Elements of the citation are sometimes divided by commas (,) but there is no fixed rule. In this case, the first comma separates the title from an object type. Another comma separates the place of publication from the publication year.
  • A. Niederer, «Vergleichende Bemerkungen zur ethnolog. und zur volkskundl. Arbeitsweise», in Beitr. zur E. der Schweiz 4, 1980, 1-25 - This citation refers to an article within a collection; here the commas separate the publication year and the page numbers.
  • La visite des églises du diocèse de L. en 1453, hg. von A. Wildermann et al., 1993 - The subject of the dictionary entry is often abbreviated in the related citations. In this example, “L.” stands for Lausanne, because the citation comes from the dictionary entry for Lausanne.
  • Stat. Jb. des Kt. L., 2002- - In this example, “L.” stands for Luzern. The other abbreviations are standard and can be resolved using the dictionary's list of abbreviations.
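Two of the issues illustrated above, the abbreviated headword and the unmarked publication year, can be approached with simple heuristics. The sketch below is one possible approach, not the method used in the project; both function names are hypothetical. It would wrongly expand an author initial that matches the headword's initial, and it would misread a four-digit page number as a year.

```python
import re

def expand_headword(citation, headword):
    """Replace the abbreviated headword (its initial plus '.') with the full headword."""
    abbreviation = re.escape(headword[0] + ".")
    # Match only a standalone token such as "L.", not the end of a longer word.
    pattern = r"(?<![\w.])" + abbreviation + r"(?!\w)"
    return re.sub(pattern, lambda _match: headword, citation)

def extract_year(citation):
    """Return the last four-digit year (1000-2099) in the citation, or None."""
    years = re.findall(r"\b(1[0-9]{3}|20[0-9]{2})\b", citation)
    return int(years[-1]) if years else None

citation = "La visite des églises du diocèse de L. en 1453, hg. von A. Wildermann et al., 1993"
expanded = expand_headword(citation, "Lausanne")
year = extract_year(citation)
```

Taking the last year rather than the first is a deliberate guess: in the examples above a year inside the title (1453) precedes the publication year (1993).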

Last modified: 2016/07/02 15:21 by timtom