Cumulizer

This is a Make OpenData.ch 2013 finance hackdays project focused on analyzing personal consumer data. We started by aggregating shopping receipts available from Cumulus, the incentive points program run by Migros, one of the largest supermarket chains in Switzerland, and would be interested in expanding the concept to other retailers.

This project is no longer live. We are looking for support to bring it to the general public.

We believe that the data we help to collect while making purchases can be relevant and useful to us as individual shoppers, and we want to try starting a popular action to aggregate it (anonymously) with others in our community. Although it is unlikely that we will get broad access to this highly valuable data, we want to at least learn some of the same things that shops are already learning about us: what, when and where we buy, what the patterns are, and how we compare. To get around commercial closed-data policies, we can liberate our personal data, i.e. open it up by uploading it to a shared repository.

Cumulizer Heatmap

Data

Any customer who takes part in the Cumulus program can currently access detailed data about their purchases in CSV format. This is an excellent initiative from Migros, but it is somewhat hampered by a web application that is not very user-friendly. We want to automate this process to make it easier for users.

For now, these are the steps to collect your personal data:

  1. Log in to the Cumulus program. If you do not have an M-Connect account yet, create one using your customer number and the password sent to you in a paper mail-out
  2. Under Mein Konto - Kassenbons (My Account - Receipts) you can browse and view details of your shopping receipts
  3. Browsing month by month, on every page, click Alle (All) and then Ausgewählte Kassenbons als Excel-Liste (csv) (Selected receipts as Excel list, not “Übersicht” / Overview) to download a file with the details for those shopping trips
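Because the export works one page at a time, you end up with many small files. Here is a minimal sketch of combining them into a single file before sharing. It assumes that every export has the same header row and uses a semicolon delimiter; both are assumptions about the export format, and `merge_receipt_csvs` is a hypothetical helper, not part of Cumulizer.

```python
import csv
from pathlib import Path


def merge_receipt_csvs(paths, out_path, delimiter=";", encoding="utf-8"):
    """Concatenate CSV exports that share one header row.

    Returns the number of data rows written. The delimiter and encoding
    are assumptions; adjust them to match the files you downloaded.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding=encoding) as out_file:
        writer = csv.writer(out_file, delimiter=delimiter)
        for path in sorted(Path(p) for p in paths):
            with open(path, newline="", encoding=encoding) as in_file:
                reader = csv.reader(in_file, delimiter=delimiter)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first file defines the header of the merged output
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header row")
                for row in reader:
                    if row:  # ignore blank lines
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

Calling `merge_receipt_csvs(["2013-01.csv", "2013-02.csv"], "all_receipts.csv")` (file names are placeholders) would write one header row followed by the rows of both files.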

You are now ready to share your personal data with us. At the moment we do not have a live application, but if you are really eager to help, please send us your CSV files directly.

Here is what Migros has to say about data usage in their data protection policy:

"Article 8 of the Data Protection Act stipulates that any person may request information from the holder of a data collection about the data stored and processed concerning them. Such requests for information must be submitted in writing. Sample letters are available on the homepage of the Federal Data Protection and Information Commissioner." (translated from the German original)

(In short: they link to the government standard form, which you can fill out to request an export of all your personal data.)

The Terms and Conditions and the Impressum do not mention any restrictions on the use of personal Cumulus data. For now we assume that Migros will not restrict us from using the receipt data from our personal accounts in any way we can, including sharing it publicly, and we will contact their management to confirm this.

We will obviously not try to collect any other person's data without their full cooperation, attempt to circumvent any of Migros' security measures, or put any automated scraping/spidering in place. This requires clear statements and terms of use on the site. We are hoping to find a way to cooperate with the Cumulus program, and to ensure that we do not abuse the allowances they have made that make this project possible.

If you are able to provide additional legal input, please do.

Project status

We have created an initial working prototype which gives a view into the data of five Cumulus users who have volunteered to submit one year of data anonymously to seed the project.

  • To complete Phase I we need to finish developing the dashboard tools that should conveniently and accurately visualize their purchases over time, sorted by category, and so on.
  • In Phase II we aim to let users upload their own data, so that we get aggregate data from the community. This will allow us to show common trends and to compare users, who may wish to add a simple profile.
  • In Phase III we will be able to link our data with other sources, showing additional information about user purchases and how buying trends align with, for example, semantically interlinked economic data (World Bank via 270a) and open consumer surveys (data.gov). We will also create APIs and, once there is a sufficient user base, the project itself will be interesting to social scientists, who are reportedly starved for accurate personal economic data.
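To illustrate the kind of aggregation the dashboard needs (purchases over time, grouped by category), here is a minimal sketch. The record shape, a `(purchase_date, category, amount)` tuple, and the function name are assumptions for illustration and do not reflect the actual Cumulizer schema or code.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def monthly_spending_by_category(purchases):
    """Sum amounts per month ("YYYY-MM") and category.

    `purchases` is an iterable of (purchase_date, category, amount) tuples;
    this record shape is an assumption, not the Cumulizer schema.
    """
    totals = defaultdict(lambda: defaultdict(Decimal))
    for purchase_date, category, amount in purchases:
        month = purchase_date.strftime("%Y-%m")
        # Decimal avoids floating-point rounding errors on currency amounts
        totals[month][category] += Decimal(str(amount))
    # Convert to plain dicts, with months in chronological order
    return {month: dict(totals[month]) for month in sorted(totals)}


# Made-up sample data, for illustration only
sample = [
    (date(2013, 2, 1), "bakery", "2.20"),
    (date(2013, 1, 5), "dairy", "1.50"),
    (date(2013, 1, 20), "dairy", "2.00"),
]
print(monthly_spending_by_category(sample))
```

Keeping the result as a nested mapping of month to category makes it straightforward to feed into a chart, or to compare across users once several people have uploaded data.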

If you'd like to take part in the planning, or otherwise contribute your expertise, please contact us.

Technical information

Here is a basic guide to our JSON API:

Link                              Info
/dashboard/simpleupload           Simple upload form
/dashboard/heatmap                Heatmap of the stores
/maintenance/geocodestores        Geocode store addresses
/maintenance/autocategorize       Start autocategorization
/api?action=summary               General statistics
/api?action=stores                Stores and sales
/api?action=monthlypurchases      List of purchases by month
/api?action=spendings             Monthly spending sums by category
/api?action=categories            List of all categories
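As a rough sketch of how a client might call the `/api?action=...` endpoints listed above, the following uses only the Python standard library. The base URL is a placeholder for wherever you installed Cumulizer, the helper names are hypothetical, and the structure of the returned JSON is not documented here, so the code simply returns whatever the server sends.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Actions taken from the endpoint list above
API_ACTIONS = ("summary", "stores", "monthlypurchases", "spendings", "categories")


def build_api_url(base_url, action):
    """Build the URL for one API action, e.g. <base>/api?action=summary."""
    if action not in API_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return base_url.rstrip("/") + "/api?" + urlencode({"action": action})


def fetch_api(base_url, action, timeout=10):
    """Fetch one action and decode the JSON body (response shape not assumed)."""
    with urlopen(build_api_url(base_url, action), timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_api("http://localhost/cumulizer", "categories")` would request the category list from a local installation, assuming it is served at that (hypothetical) address.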

For more details, please see the source.

Installation

  1. Download the Cumulizer sources from GitHub
  2. Install Composer and run it at the root of the project
  3. Install the project on your Apache/MySQL/PHP server
  4. Create a database. If you do not use 'cumulizer' as its name, modify the top of the import script accordingly
  5. Import the initial schema (_docs/cumulizer.sql)
  6. Modify username and password (application/config/database.php)
  7. Upload your CSV receipts (dashboard/simpleupload)
  8. Run the geotagging generator (admin/geocodestores)
  9. Your dashboard will be ready at /

Team