Differences
This shows you the differences between two versions of the page.
project:hds_out_of_the_box [2016/07/02 16:42] – [Named Entity Recognition] maudehrmann | project:hds_out_of_the_box [2016/07/04 14:53] – [Data] pmau | ||
---|---|---|---|
Line 1: | Line 1: | ||
===== Historical Dictionary of Switzerland Out of the Box ===== | ===== Historical Dictionary of Switzerland Out of the Box ===== | ||
- | The [[http:// | + | The [[http:// |
- | The HDS digital edition comprises about XXXX articles organized in 4 main headword groups: \\ | + | The HDS digital edition comprises about 36.000 |
- Biographies, | - Biographies, | ||
- Families, \\ | - Families, \\ | ||
- | - Geographical | + | - Geographical |
- Thematic contributions. | - Thematic contributions. | ||
- | Beyond the encyclopaedic description of entities/ | + | Beyond the encyclopaedic description of entities/ |
Line 17: | Line 17: | ||
We have the following data:\\ | We have the following data:\\ | ||
- | - [[http:// | + | |
- | - [[http:// | + | * bibliographic references of HDS articles\\ |
- | - bibliographic references of HDS articles\\ | + | * article titles\\ |
+ | * [[http:// | ||
| | ||
===== Goals ===== | ===== Goals ===== | ||
Line 25: | Line 27: | ||
Our projects revolve around **Linking the HDS to external data** and aim at:\\ | Our projects revolve around **Linking the HDS to external data** and aim at:\\ | ||
- | ** 1. Entity | + | ** 1. Entity |
The objective is to link named entity mentions discovered in historical Swiss newspapers to their corresponding HDS articles. | The objective is to link named entity mentions discovered in historical Swiss newspapers to their corresponding HDS articles. | ||
Line 31: | Line 33: | ||
** 2. Exploring reference citation of HDS articles** | ** 2. Exploring reference citation of HDS articles** | ||
- | The objective is to reconcile HDS bibliographic data with SwissBib. | + | The objective is to reconcile HDS bibliographic data contained in articles |
Line 37: | Line 39: | ||
===== Named Entity Recognition ===== | ===== Named Entity Recognition ===== | ||
- | We used web-services to annotate text with Named Entities:\\ | + | We used web-services to annotate text with named entities:\\ |
- Dandelion\\ | - Dandelion\\ | ||
- Alchemy\\ | - Alchemy\\ | ||
- OpenCalais \\ | - OpenCalais \\ | ||
+ | |||
+ | |||
+ | {{: | ||
Named entity mentions (persons and places) are matched against entity labels of HDS entries and directly linked when only one HDS entry exists. | Named entity mentions (persons and places) are matched against entity labels of HDS entries and directly linked when only one HDS entry exists. | ||
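The matching rule described above can be sketched as follows. This is a minimal illustration only, not the project's actual code: the function names, the normalization step (case folding and whitespace collapsing), and the entry ids are assumptions made for the example.

```python
from collections import defaultdict


def normalize(text):
    """Case-fold and collapse whitespace (an assumed normalization)."""
    return " ".join(text.casefold().split())


def build_label_index(entries):
    """Map each normalized label to the set of entry ids that carry it.

    `entries` is an iterable of (entry_id, label) pairs.
    """
    index = defaultdict(set)
    for entry_id, label in entries:
        index[normalize(label)].add(entry_id)
    return index


def link_mentions(mentions, index):
    """Link a mention only when exactly one entry matches its label."""
    links = {}
    for mention in mentions:
        candidates = index.get(normalize(mention), set())
        if len(candidates) == 1:
            links[mention] = next(iter(candidates))
    return links


# Hypothetical entry ids, for illustration only.
entries = [
    ("hds-1", "Henri Dunant"),
    ("hds-2", "Müller"),
    ("hds-3", "Müller"),
]
index = build_label_index(entries)
links = link_mentions(["Henri  Dunant", "Müller", "Paris"], index)
```

In this sketch "Müller" stays unlinked because two entries share the label, and "Paris" stays unlinked because no entry matches it.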
Further developments would include:\\ | Further developments would include:\\ | ||
- | - handling name variants, e.g. 'W.A. Mozart' | + | - handling name variants, e.g. 'W.A. Mozart' |
- | - real disambiguation by comparing the newspaper article context with the HDS article context (a first simple similarity could be tf-idf based) | + | - real disambiguation by comparing the newspaper article context with the HDS article context (a first simple similarity could be tf-idf based)\\ |
- | - working with a more refined NER output which comprises information about name components (first, middle, last names) | + | - working with a more refined NER output which comprises information about name components (first, middle, last names)\\ |
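The tf-idf based disambiguation mentioned in the list above could look roughly like the sketch below. It assumes that each ambiguous mention comes with the newspaper article text and the texts of the candidate HDS articles. The smoothed idf formula, the tokenizer, and all names and ids are assumptions for illustration, not a method confirmed by the project.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.casefold())
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict) per document.

    Uses a smoothed idf, log((1 + n) / (1 + df)) + 1, which is one
    common variant among several.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context, candidates):
    """Pick the candidate entry whose text is most similar to the context.

    `candidates` maps an entry id to the text of that entry.
    """
    ids = list(candidates)
    vectors = tfidf_vectors([context] + [candidates[i] for i in ids])
    scores = {i: cosine(vectors[0], v) for i, v in zip(ids, vectors[1:])}
    return max(scores, key=scores.get), scores


# Hypothetical texts and ids, for illustration only.
best, scores = disambiguate(
    "Painter who exhibited landscapes in Geneva",
    {
        "hds-a": "Painter and engraver, known for alpine landscapes",
        "hds-b": "Politician, member of a federal council",
    },
)
```

A real setup would probably compute the document frequencies over the whole corpus rather than over the handful of texts compared here, and would set a minimum score below which no link is made.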
+ | |||
+ | === Some statistics === | ||
+ | In the 23.622 articles from 1914 in the «Le Temps digital archive», we linked 90.603 entity mentions to 1.417 articles of the «Historical Dictionary of Switzerland». | ||
+ | |||
+ | {{: | ||
+ | |||
+ | |||
+ | === Web Interface === | ||
+ | |||
+ | We developed a simple web interface for searching in the corpus and displaying the texts with the links.\\ | ||
+ | It consists of 3 views: | ||
+ | |||
+ | |||
+ | 1. Home\\ | ||
+ | {{: | ||
+ | \\ | ||
+ | 2. Search\\ | ||
+ | {{: | ||
+ | \\ | ||
+ | 3. Article with links to HDS, Wikipedia and dbpedia\\ | ||
+ | {{: | ||
+ | \\ | ||
+ | |||
+ | === Further work === | ||
+ | Further work would include: | ||
+ | - evaluate and improve the method.\\ | ||
+ | - apply the method to the Historical Dictionary of Switzerland itself for internal linking.\\ | ||
Line 178: | Line 211: | ||
===== Team ===== | ===== Team ===== | ||
+ | * Pierre-Marie Aubertel | ||
+ | * Francesco Beretta | ||
* Giovanni Colavizza | * Giovanni Colavizza | ||
- | * Jonas Schneider | ||
* Maud Ehrmann | * Maud Ehrmann | ||
* [[https:// | * [[https:// | ||
- | * Pierre-Marie Aubertel | + | * Jonas Schneider |
- | * Francesco Beretta | + | |
+ | |||
+ | |||
| | ||
- | {{tag> | + | {{tag> |