project:dncsonyc

Differences

This shows you the differences between two versions of the page.

project:dncsonyc [2018/10/27 13:41] – created beat_estermann
project:dncsonyc [2018/10/28 15:42] (current) – [Dog Name Creativity Survey of New York City] birk
Line 1: Line 1:
 ===== Dog Name Creativity Survey of New York City =====
  
-(screenshots or sketches up here)
+{{:project:dncrsesult1.png?600|How does the creativity of given dog names relate to the amount of culture found in the different boroughs of New York City?}}
 + 
 +We started this project to see whether art and cultural institutions in the environment have an impact on the creativity of dog names. This was not possible with the data from Zurich, because the name dataset does not contain information about location and the dataset about the owners does not include the dog names. We chose to stick with our idea but used a different dataset: the NYC Dog Licensing Dataset.
 + 
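 +The creativity of a name is measured in two steps. Each letter scores its rank in English letter frequency (e = 1 for the most common letter, z = 26 for the rarest), and the ranks are summed. The sum is then multiplied by a factor that depends on how many dogs share the name: 5.2 for a name that occurs only once, falling to 0.2 for the most common name. The data for the cultural environment comes from Wikidata.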
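
As a worked illustration of the scoring described above, here is a minimal, self-contained sketch. It follows the formula used in the project code but is not the project's script: the function names, the guard against a maximum count of 1, and the name counts are hypothetical and chosen only for this example.

```python
# Letter ranks by English frequency, in the same order as the project's table
# (e = 1 is the most common letter, z = 26 the rarest).
LETTER_RANK = {
    letter: rank
    for rank, letter in enumerate("etaonishrlducmwyfgpbvkjxqz", start=1)
}


def letter_score(name):
    """Sum the ranks of the letters a-z in the name; other characters are skipped."""
    return sum(LETTER_RANK[ch] for ch in name.lower() if ch in LETTER_RANK)


def rarity_factor(count, max_count):
    """Return 5.2 for a name used once, falling linearly to 0.2 for the most common name."""
    return (max_count - count) / (max_count - 1) * 5 + 0.2


def creativity(name, counts):
    """Letter score weighted by rarity, rounded to one decimal place."""
    max_count = max(counts.values())
    score = letter_score(name)
    # Assumption for this sketch: skip the weighting when every name occurs
    # only once, because the factor would otherwise divide by zero.
    if name in counts and max_count > 1:
        score *= rarity_factor(counts[name], max_count)
    return round(score, 1)


# Hypothetical name counts, not taken from the NYC dataset.
counts = {"Max": 100, "Bella": 50, "Zaxby": 1}

print(creativity("Max", counts))    # 41 * 0.2 = 8.2
print(creativity("Zaxby", counts))  # 89 * 5.2 = 462.8
```

With these made-up counts, a short and very common name such as "Max" ends up with a low score, while a unique name built from rare letters scores highest.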
 + 
 +After some data cleaning with OpenRefine and failed attempts with OpenCalc, we arrived at the following code:
 +<code> 
 +import string 
 +import pandas as pd 
 + 
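 +# Rank of each letter by its frequency in English (e = 1 is the most common, z = 26 the rarest).
 +numbers_ = {"e":1,"t":2,"a":3,"o":4,"n":5,"i":6,"s":7,"h":8,"r":9,"l":10,"d":11,"u":12,"c":13,"m":14,"w":15,"y":16,"f":17,"g":18,"p":19,"b":20,"v":21,"k":22,"j":23,"x":24,"q":25,"z":26}
 +name_list = []
 +
 +def KreaWert(name_):
 +    """Return the creativity score of a name: the sum of its letter ranks, weighted by how rare the name is."""
 +    name_ = str(name_)
 +    wert_ = 0
 +    for letter in name_.lower():
 +        # Only the letters a-z count; spaces, digits and other characters are skipped.
 +        if letter in string.ascii_lowercase:
 +            wert_ += numbers_[letter]
 +    # H_ (dogs per name) and Hmax are defined further down, before this function is first called.
 +    if name_ in H_:
 +        # The factor runs from 5.2 (name used once) down to 0.2 (most common name).
 +        wert_ = wert_ * ((Hmax - H_[name_]) / (Hmax - 1) * 5 + 0.2)
 +    return round(wert_, 1)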
 + 
 +# Load the cleaned licensing data and count how often each name occurs.
 +df = pd.read_csv("Vds3.csv", sep=";")
 +df["AnimalName"] = df["AnimalName"].str.strip()
 +H_ = df["AnimalName"].value_counts()
 +Hmax = max(H_)
 +Hmin = min(H_)
 +
 +# Score every dog and save the full table.
 +df["KreaWert"] = df["AnimalName"].map(KreaWert)
 +df.to_csv("namen2.csv")
 +
 +# One row per distinct name with its score.
 +dftemp = df[["AnimalName", "KreaWert"]].drop_duplicates().set_index("AnimalName")
 +dftemp.to_csv("dftemp.csv")
 + 
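 +# Number of dogs per name, joined with the score of that name.
 +df3 = pd.DataFrame()
 +df3["amount"] = H_
 +df3 = df3.join(dftemp, how="outer")
 +df3.to_csv("data3.csv")
 +
 +# Mean of the numeric columns per borough (numeric_only avoids errors on text columns in recent pandas).
 +df1 = df.groupby("Borough").mean(numeric_only=True).round(2)
 +df1.to_csv("data1.csv")
 +
 +# The same means, split by borough and gender.
 +df2 = df.groupby(["Borough", "AnimalGender"]).mean(numeric_only=True).round(2)
 +df2.to_csv("data2.csv")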
 +</code> 
 + 
 +Visualisations were made with D3: https://d3js.org/
  
-Brief description goes here. Add sections below if you need more room. Include links to your demo and/or source code, relevant documentation, tools, etc. 
  
 ===== Data =====
 +Dog data of the City of Zurich:
 +  * https://opendata.swiss/de/dataset/hundenamen-aus-dem-hundebestand-der-stadt-zurich
 +  * https://opendata.swiss/de/dataset/hundebestand-der-stadt-zurich
  
-  List and link your actual and ideal data sources.
+NYC Dog Licensing Dataset:
 +  * https://data.cityofnewyork.us/Health/NYC-Dog-Licensing-Dataset/nu7n-tubp
 + 
  
 ===== Team =====
project/dncsonyc.1540640471.txt.gz · Last modified: 2018/10/27 13:41 by beat_estermann