Cumulizer
This is a Make OpenData.ch 2013 finance hackdays project focused on analyzing personal consumer data. We started by aggregating shopping receipts available through Cumulus, the incentive points program run by Migros, one of the largest supermarket chains in Switzerland, and would be interested in expanding the concept to other retailers.
We believe that the data we help to collect while making purchases can be relevant and useful to us as individual shoppers, and we want to start a grassroots effort to aggregate it (anonymously) with others in our community. Although it is unlikely that we will get broad access to this highly valuable data, we want to at least learn some of the same things that shops are already learning about us: what, when and where we buy, what patterns emerge, and how we compare with one another. To get around commercial closed-data policies, we can liberate our personal data, that is, open it up by uploading it to a shared repository.
The project is live at cumulizer.eu
Data
Any customer who takes part in the Cumulus program can currently access detailed data about their purchases in CSV format. This is an excellent initiative from Migros, but it is somewhat hampered by a web application that is not very user-friendly. We want to automate this process to make it easier for users.
For now, here are the steps to collect your personal data:
- If you have not already done so, create an M-Connect account by logging into the Cumulus program with your customer number and the password sent to you on a paper mail-out
- Under Mein Konto - Kassenbons ("My account - Receipts") you can browse and view details of your shopping receipts
- Browsing month by month, on every page you need to click Alle ("All") and then Ausgewählte Kassenbons als Excel-Liste (csv) ("Selected receipts as Excel list"), not "Übersicht" ("Overview"), to download a file with the details for those shopping trips
You are now ready to share your personal data with us. At the moment we do not have a live application, but if you are really eager to help, please send us your CSV files directly.
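Because the export works page by page, you will likely end up with several CSV files. As a minimal sketch, assuming that all exports share the same header row and use a semicolon as delimiter (both are assumptions, the actual export format is not documented here), they could be combined into a single file like this. The function name `merge_receipt_exports` is hypothetical and not part of the Cumulizer source.

```python
import csv


def merge_receipt_exports(paths, out_path, delimiter=";", encoding="utf-8"):
    """Concatenate several CSV exports that share one header row.

    The delimiter and encoding defaults are assumptions; adjust them to
    match the files you actually downloaded.
    Returns the number of data rows written.
    """
    header = None
    rows = []
    for path in paths:
        with open(path, newline="", encoding=encoding) as handle:
            reader = csv.reader(handle, delimiter=delimiter)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip completely empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path}: header differs from the first file")
            # Keep every non-empty data row, in download order.
            rows.extend(row for row in reader if row)

    with open(out_path, "w", newline="", encoding=encoding) as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Rows are deliberately not de-duplicated, since a receipt may legitimately contain two identical lines.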
Legal status
Here is what Migros has to say about data usage in their data protection policy:
Article 8 of the Data Protection Act stipulates that any person may request information from the owner of a data collection about the data stored and processed concerning them. Such requests for information must be submitted in writing. Sample letters are available on the homepage of the Federal Data Protection and Information Commissioner.
(Translated from the German original. In short: they point to the government's standard form letter, which you can fill out to request an export of all your personal data.)
The Terms and Conditions and the Impressum do not mention any restrictions on the use of personal Cumulus data. For now we assume, and will contact their management to confirm, that they will not restrict us from using the receipt data from our personal accounts in any way we choose, including sharing it publicly.
We will obviously not try to collect any other person's data without their full cooperation, attempt to circumvent any of their security, or put any automated scraping/spidering in place. This requires clear statements and terms of use on the site. We are hoping to find a way to cooperate with the Cumulus program, and ensure that we do not abuse the allowances they have made that make this project possible.
If you are able to provide additional legal input, please do.
Project status
We have created an initial working prototype which gives a view into the data of five Cumulus users who have volunteered to submit one year of data anonymously to seed the project.
- To complete Phase I we need to finish developing the dashboard tools that should conveniently and accurately visualize their purchases over time, sorted by category, and so on.
- In Phase II we are aiming to let users upload their own data, so that we get aggregate data from the community, allowing comparison between users who may wish to add a simple profile, and show common trends.
- In Phase III we will be able to link our data with other sources, showing additional information about user purchases and how buying trends align with, for example, semantically interlinked economic data (World Bank via 270a) and open consumer surveys (data.gov). We will also create APIs and, once it has a sufficient user base, the project itself should be interesting to social scientists, who are reportedly starved for accurate personal economic data.
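To make the Phase I goal of viewing purchases over time and by category more concrete, here is a minimal sketch of the kind of aggregation a dashboard needs. It assumes each purchase is available as a `(date, category, amount)` tuple; that record layout and the function name `monthly_spending_by_category` are illustrative assumptions, not the project's actual schema or code.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def monthly_spending_by_category(purchases):
    """Sum amounts per month and category.

    `purchases` is an iterable of (purchase_date, category, amount) tuples.
    This layout is an assumption made for the sketch.
    Returns {"YYYY-MM": {category: Decimal total}} with months in order.
    """
    totals = defaultdict(lambda: defaultdict(Decimal))
    for purchase_date, category, amount in purchases:
        month = purchase_date.strftime("%Y-%m")
        # Decimal avoids floating-point rounding errors on currency values.
        totals[month][category] += Decimal(str(amount))
    return {month: dict(totals[month]) for month in sorted(totals)}


# Hypothetical sample data, for illustration only.
sample = [
    (date(2013, 1, 5), "Bakery", "3.20"),
    (date(2013, 1, 19), "Bakery", "2.80"),
    (date(2013, 1, 19), "Dairy", "1.95"),
    (date(2013, 2, 2), "Bakery", "4.10"),
]
summary = monthly_spending_by_category(sample)
```

The same per-month, per-category totals could then feed a chart or be compared across users once community uploads exist.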
If you'd like to take part in the planning, or otherwise contribute your expertise, please contact us.
Technical information
Here is a basic guide to the application's pages and our JSON API:
Link | Info |
---|---|
/dashboard/simpleupload | Simple upload form |
/dashboard/heatmap | Heatmap of the stores |
/maintenance/geocodestores | Geocode store addresses |
/maintenance/autocategorize | Start autocategorization |
/api?action=summary | General statistics |
/api?action=stores | Stores and sales |
/api?action=monthlypurchases | List of purchases by month |
/api?action=spendings | Monthly spending sums by category |
/api?action=categories | List of all categories |
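As a rough illustration of how the `/api?action=...` endpoints in the table above might be called from a script, the sketch below builds the request URL and decodes the JSON response using only the Python standard library. The base URL is an assumption (point it at your own installation), the helper names `api_url` and `fetch_action` are hypothetical, and the structure of the returned JSON is not described on this page, so the result is returned as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Actions taken from the table above.
API_ACTIONS = ("summary", "stores", "monthlypurchases", "spendings", "categories")


def api_url(base_url, action):
    """Build the URL for one API action, e.g. <base>/api?action=summary."""
    if action not in API_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return f"{base_url.rstrip('/')}/api?{urlencode({'action': action})}"


def fetch_action(base_url, action, timeout=10):
    """Request one API action and decode the JSON body.

    The response structure is not documented here, so no fields are assumed.
    """
    with urlopen(api_url(base_url, action), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_action("http://localhost/cumulizer", "categories")` would request the category list from a local installation, assuming it is served under that path.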
For more details, please see the source.
Source on GitHub: cstuder/cumulizer
Installation
- Download Cumulizer sources from GitHub
- Install the project on your Apache/MySQL/PHP server
- Create a database; if you don't use 'cumulizer' as its name, modify the top of the import script accordingly
- Import the initial schema (_docs/cumulizer.sql)
- Modify username and password (application/config/database.php)
- Upload your CSV receipts (dashboard/simpleupload)
- Run geotagging generator (admin/geocodestores)
- Your dashboard will be ready at /
Team
Links
- Migros is one of Switzerland's top three supermarkets and has been immensely helpful in creating a legal interface to download our personal data
- Solikarte is a related initiative for anonymously collecting points for charitable causes, and would be a possible data source and partner for this project
- One Receipt lets customers in the US see all their purchases in one place
- Opening Product Data for a more Responsible World (OKFN Blog) talks about the opportunities for open product data, and the Product Open Data project
- A Deep Dive into Facebook and Datalogix (Electronic Frontier Foundation) explores the marketing bonanza that is consumer data collection, and warns of the privacy risks
- What does the consumer data industry know about you (The Atlantic) explores a few more perspectives on the subject