Javier D. Fernández – Green Big Data

April 18, 2018 in Opinion


I have an MSc and a PhD in Computer Science, and it’s sad (but honest) to say that throughout my academic and professional career the word “privacy” was hardly ever mentioned. We do learn about “security”, but only as what is called a non-functional requirement. Don’t get me wrong: I do care about privacy, and I envision a future where “ethical systems” are the rule and no longer the exception. But when people suggest, promote or ask for privacy-by-design systems, one should also understand that we engineers (at least my generation) are mostly not yet privacy-by-design educated.

That’s why, caring about privacy, I enjoy reading the diverse theories and manifestos that provide general principles for coming up with ethical, responsible and sustainable designs for our systems, in particular where personal Big Data (and all its variants, e.g. Data Science) is involved. The Copenhagen Letter (promoting open, humanity-centered designs that serve society), the Responsible Data Science principles (fairness, accuracy, confidentiality, and transparency) and the Ethical Design Manifesto (focused on maximizing human rights and human experience and respecting human effort) are good examples, to name but a few.

Acknowledging that these are inspiring works, an engineer might find the aforementioned principles a bit too general to serve as an everyday reference guide for practitioners. In fact, one could argue that they are deliberately open to interpretation so that they can be adapted to each particular use case: they point to the goal(s) and some intermediate stepping stones (e.g. openness or decentralization), while the work of filling in the gaps is by no means trivial.

Digging a bit deeper to find more fine-grained principles, I came up with the concept of Green Big Data, referring to Big Data made and used in a “green”, healthy fashion, i.e., human-centered, ethical, sustainable and valuable for society. Interestingly, the closest reference for such a term was a highly cited article from 2003 on “green engineering” [1]. In this article, Anastas and Zimmerman proposed 12 principles to serve as a “framework for scientists and engineers to engage in when designing new materials, products, processes, and systems that are benign to human health and the environment”.

Inspired by the 12 principles of green engineering, I started an exercise to map those principles to my idea of Green Big Data. This mapping is by no means complete, and it remains subject to interpretation and discussion. Ben Wagner and my colleagues at the Privacy & Sustainable Computing Lab provided valuable feedback and encouraged me to share these principles with the community in order to start a discussion openly and widely. As an example, Axel Polleres already pointed out that “green” is interpreted here as mostly covering the privacy-aware aspect of sustainable computing, but other concepts such as “transparency-aware” (making data easy to consume) or “environmentally-aware” (avoiding wasted energy from people running the same computations over and over again) could be developed further.

You can find the Green Big Data principles below; I am looking forward to your thoughts!

Principle 1
Green Engineering: Designers need to strive to ensure that all material and energy inputs and outputs are as inherently non-hazardous as possible.
Green Big Data: Big Data inputs, outputs and algorithms should be designed to minimize exposing persons to risk.
Related topics: Security, privacy, data leaks, fairness, confidentiality, human-centric

Principle 2
Green Engineering: It is better to prevent waste than to treat or clean up waste after it is formed.
Green Big Data: Design proactive strategies to minimize, prevent, detect and contain personal data leaks and misuse.
Related topics: Security, privacy, accountability, transparency

Principle 3
Green Engineering: Separation and purification operations should be designed to minimize energy consumption and materials use.
Green Big Data: Design distributed and energy-efficient systems and algorithms that require as little personal data as possible, favoring anonymous and person-independent processing.
Related topics: Distribution, anonymity, sustainability

Principle 4
Green Engineering: Products, processes, and systems should be designed to maximize mass, energy, space, and time efficiency.
Green Big Data: Use the full capabilities of existing resources and monitor that they serve the needs of individuals and society in general.
Related topics: Sustainability, human-centric, societal challenges, accuracy

Principle 5
Green Engineering: Products, processes, and systems should be “output pulled” rather than “input pushed” through the use of energy and materials.
Green Big Data: Design systems and algorithms to be versatile, flexible and extensible, independently of the scale of the personal data input.
Related topics: Sustainability, scalability

Principle 6
Green Engineering: Embedded entropy and complexity must be viewed as an investment when making design choices on recycle, reuse, or beneficial disposition.
Green Big Data: Treat personal data as a first-class but hazardous citizen, with extreme precautions in third-party personal data reuse, sharing and disposal.
Related topics: Privacy, confidentiality, human-centric

Principle 7
Green Engineering: Targeted durability, not immortality, should be a design goal.
Green Big Data: Define the “intended lifespan” of the system, algorithms and involved data, and design them to be transparent to data subjects, who control their data.
Related topics: Transparency, openness, right to amend and to be forgotten, human-centric

Principle 8
Green Engineering: Design for unnecessary capacity or capability (e.g., “one size fits all”) solutions should be considered a design flaw.
Green Big Data: Analyze the expected system/algorithm load and design the system to meet those needs while minimizing excess capacity.
Related topics: Sustainability, scalability, data leaks

Principle 9
Green Engineering: Material diversity in multicomponent products should be minimized to promote disassembly and value retention.
Green Big Data: Data and system integration must be carefully designed to avoid further personal data risks.
Related topics: Integration, confidentiality, cross-correlation of personal data

Principle 10
Green Engineering: Design of products, processes, and systems must include integration and interconnectivity with available energy and materials flows.
Green Big Data: Design open and interoperable systems to leverage the full potential of existing systems and data, while maximizing transparency for data subjects.
Related topics: Integration, openness, interoperability, transparency

Principle 11
Green Engineering: Products, processes, and systems should be designed for performance in a commercial “afterlife”.
Green Big Data: Design modularly for potential system and data obsolescence, maximizing reuse.
Related topics: Sustainability, obsolescence

Principle 12
Green Engineering: Material and energy inputs should be renewable rather than depleting.
Green Big Data: Prefer data, systems and algorithms that are open, well-maintained and sustainable in the long term.
Related topics: Integration, openness, interoperability, sustainability
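To make Principle 3 slightly more tangible, here is a minimal sketch, in Python and with entirely hypothetical field names and records, of what favoring minimal, pseudonymous processing could look like in practice. It is an illustration of the principle rather than a recipe: keyed hashing is pseudonymization, not anonymization, so the output below still counts as personal data, just with far less exposure.

```python
import hashlib
import hmac
import os

# Secret key kept outside the analytics environment (hypothetical setup).
SECRET = os.environ.get("PSEUDONYM_KEY", "change-me").encode()

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    Note: this is pseudonymization, not anonymization; the result is
    still personal data, but raw identifiers never leave this step.
    """
    return hmac.new(SECRET, identifier.encode(), hashlib.sha256).hexdigest()

def minimize(record: dict) -> dict:
    """Keep only the fields the analysis actually needs (data minimization)."""
    return {
        "user": pseudonymize(record["email"]),  # never store the raw e-mail
        "age_band": record["age"] // 10 * 10,   # coarsen exact age to a decade band
        "event": record["event"],
        # the IP address is simply dropped, not transformed
    }

# A purely hypothetical raw event record:
raw = {"email": "alice@example.org", "age": 34, "event": "login", "ip": "203.0.113.7"}
print(minimize(raw))
```

Coarsening and dropping fields at ingestion time, as above, is usually cheaper and safer than trying to scrub personal data out of downstream systems later, which is also the spirit of Principle 2.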

 

[1] Anastas, P. T. & Zimmerman, J. B. (2003). Design through the 12 principles of green engineering. Environmental Science & Technology, 37(5): 94A–101A.

Axel Polleres: What is “Sustainable Computing”?

March 20, 2018 in Opinion

Blog post written by Axel Polleres and originally posted on http://doingthingswithdata.wordpress.com/

A while ago, together with my colleagues Sarah Spiekermann-Hoff, Sabrina Kirrane, and Ben Wagner (who joined in a bit later), we founded a joint research lab to foster interdisciplinary discussions on how information systems can be built in a private, secure, ethical, value-driven, and eventually more human-centric manner.

We called this lab the Privacy & Sustainable Computing Lab, a platform to jointly promote and discuss our research and views, and a think-tank, also open to others, on how these goals can be achieved. Since then, we have had many discussions, at times heated but first and foremost always very rewarding, creating mutual understanding between researchers coming from engineering, AI, social sciences, or legal backgrounds on how to address the challenges around digitization.

Not surprisingly, the first (and maybe still unresolved) discussion was about how to name the lab. Back then, our research was very much focused on privacy, but we all felt that the topic of societal challenges in the digital age needs to be viewed more broadly. Consequently, one of the first suggestions floating around was “Privacy-aware and Sustainable Computing Lab“, emphasizing privacy-awareness as one of the main pillars but aiming for a broader definition of sustainable computing, which we later shortened to just “Privacy & Sustainable Computing Lab” (merely for reasons of length, if I remember correctly; my co-founders may correct me if I am wrong 😉 ).

Towards defining Sustainable Computing

When we tried to come up with a joint definition of the term “Sustainable Computing” back then, I answered in an internal e-mail thread that

Sustainable Computing for me obviously encompasses: 

  1. human-friendly 
  2. ecologically-friendly
  3. societally friendly 

aspects of [the design and usage of] Computing and Information Systems. In fact, in my personal understanding these three aspects are, in some contexts, potentially conflicting, but resolving and discussing these conflicts is one of the reasons why we founded this lab in the first place.

Conflicts add Value(s)

Conflicts can arise, for instance, from individual well-being being weighed higher than ecological impacts (or vice versa), or likewise from the question of how much a society as a whole needs to respect and protect the individual’s rights and needs, and in which cases (if ever) the common well-being should be put above those individual rights.

These are fundamental questions in which I would by no means consider myself an expert, but where obviously, if you think them into the design of systems or into a technology research agenda (which would be more my home turf), it both adds value and makes us discuss values as such. That is, making value conflicts explicit, and resolving conflicts about the understanding and importance of these values, is a necessary part of Sustainable Computing. This is why Sarah suggested the addition of

4. value-based

computing, as part of the definition.

Sabrina added that, although sustainable computing is not explicitly mentioned therein, the notion of Sustainable Computing resonates well with what was postulated in the Copenhagen Letter.

Overall, we haven’t finished the discussion about a crisp definition of what Sustainable Computing is (which is maybe why you don’t find one yet on our website), but for me this is actually ok: it keeps the definition evolving and agile, keeps us ready for discussions about it, and keeps us learning from each other. We also discussed sustainable computing quite extensively in a mission workshop in December 2017, trying to better define what sustainable computing is and how it influences our research.

What I mainly learned is that we as technology experts play a crucial role and carry responsibility in defining Sustainable Computing: by being able to explain the limitations of technology, but also by acting as advocates of the benefits of technologies in spite of risks and justified skepticism, and by helping to develop technologies that minimize these risks.

Some Examples

Some examples of what, for me, falls under Sustainable Computing:

  • Government Transparency through Open Data, and making such Open Data easily accessible to citizens – we try to get closer to this vision in our national research project CommuniData
  • Building technical infrastructures to support transparency in personal data processing for data subjects, but also to help companies fulfill the respective requirements of legal regulations such as the GDPR – we are working on such an infrastructure in our EU H2020 project SPECIAL (a minimal sketch of what such a transparency log could record follows this list)
  • Building standard model processes for value-based, ethical system design, as the IEEE P7000 group does it (with involvement of my colleague Sarah Spiekermann).
  • Thinking about how AI can support ethics (instead of fearmongering the risks of AI) – we will shortly publish a special issue on some examples in a forthcoming volume of ACM Transactions on Internet Technology (TOIT)
  • Studying phenomena and social behaviours online with the purpose of detecting and pinpointing biases as for example our colleagues at the Complexity Science Hub Vienna do in their work on Computational Social Sciences, understanding Systemic Risks and Socio-Economic Phenomena
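Purely as an illustration of the transparency point above, and not as a description of the actual SPECIAL infrastructure, here is a minimal Python sketch of one possible building block: an append-only log of processing events that a data subject can query. All class, field, and organization names below are hypothetical.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class ProcessingEvent:
    """One record of who processed which data, for what purpose, on what basis."""
    data_subject: str   # pseudonymous identifier for the person concerned
    data_category: str  # e.g. "location" or "contact-data"
    purpose: str        # the declared processing purpose
    legal_basis: str    # e.g. "consent" or "contract" (GDPR Art. 6)
    controller: str     # the organization responsible for the processing
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

LOG: list[ProcessingEvent] = []

def record(event: ProcessingEvent) -> None:
    """Append-only: events are only ever added, never updated or deleted."""
    LOG.append(event)

def events_for(subject: str) -> str:
    """What a data subject could see when exercising their right of access."""
    return json.dumps(
        [asdict(e) for e in LOG if e.data_subject == subject], indent=2
    )

record(ProcessingEvent("u-42", "location", "route planning", "consent", "ExampleCorp"))
print(events_for("u-42"))
```

The one design choice worth noting is the append-only discipline: transparency towards data subjects is only credible if processing records cannot be silently rewritten after the fact.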

Many more such examples are hopefully coming out of our lab through cross-fertilizing, interdisciplinary research and discussions in the years to come…

 

Let’s Switch! Some Simple Steps for Privacy-Activism on the Ground

March 13, 2018 in Opinion

by Sarah Spiekermann, Professor of Business Informatics & Author,

Vienna University of Economics and Business, Austria

Being an “activist” sounds like the next big hack to change society for the better: important work done by really smart and courageous people. But I wonder whether these high standards for activism suffice to really change things on the ground. I think we need more: we need activism on the ground.

What is activism on the ground?

By activism on the ground I mean that all of us need to be involved: anyone who consumes products and services; anyone who currently does not engage in any of those “rational choices” that economists ascribe to us. Let’s become rational! Me, you, we all can become activists on the ground and make markets move OUR way. How? By switching! Switching away from the products and services we currently buy and use whenever we feel that the companies providing them don’t deserve our money or attention or, most importantly, any information about our private lives.

For the digital service world, I have been thinking about how to switch for quite some time. In November last year I started a project with my Master Class in Privacy & Security at the Vienna University of Economics and Business: we went out and tested the market-leading Internet services that most of us use. We looked into their privacy policies and checked to what extent they give us fair control over our data or, in contrast, hide important information from us. We benchmarked the market leaders against their privacy-friendly competitors, looking at their privacy defaults and at the information and decision control they give us over our data, to check whether switching to a privacy-friendly alternative is a realistic option. We also compared all services’ user experience (nothing is worse than functional but unusable security…). And guess what? Ethical machines are indeed out there.

So why not switch?

Here is the free benchmark study for download, which gives you the overview.

Switching your messenger services

For the messenger world, I can personally recommend Signal, which works just as well as WhatsApp; only that it is blue instead of green. I actually think that WhatsApp does not deserve to be green, because the company shares our contact network information with anyone interested in buying it. My students found that Signal’s privacy design is not quite as good as Wickr Me’s. I must admit that I had some trouble using Signal on my new GSMK Cryptophone, where I obviously reject the idea of installing Google Play; but for normal phones Signal works just fine.

Switching your social network

When it comes to social networks, I quit Facebook long ago. I found the content got a bit boring over the past 4-5 years, as people have become more cautious about posting their really interesting stuff. I am on Twitter and find it really cool, but the company’s privacy settings and controls are not good. We did not test for Twitter addictiveness…

I signed up with diaspora*, which I have known for a long time, because its architecture and early set-up were done by colleagues in the academic community. It is built on a peer-to-peer infrastructure and hence possesses the architecture of choice for a privacy-friendly social network. Not surprisingly, my students found it really good in terms of privacy. I am not fully done with testing it myself. I certainly hate the name “diaspora”, which is associated with displacement from your homeland; the name signals too much negativity for a service that is actually meant to be a safe haven. But other than that, I think we should support it more. Interestingly enough, my students also benchmarked Ello, which by now is really a social network for artists. But as Joseph Beuys famously proclaimed, “Everyone is an artist”, right? I really support this idea! And since their privacy settings are ok (just minor default issues…), this is also an alternative for creative social nomads wanting to start afresh.

Switching your maps service

HERE WeGo is my absolute favorite when it comes to a location service. And this bias has a LONG history, because I already knew the guys who built the service in its earliest versions back in Berlin (at the time the company was called Gate5). Many of this service’s founding fathers were also members of the Chaos Computer Club. And guess what: when hackers build for themselves, they build really well.

For good reasons, my students argue that OsmAnd is a great alternative as well; especially its decisional data control seems awesome. No matter what you do, don’t waste your time throwing your location data into the capitalist hands of Google and Apple. Get rid of them! Maps.me and Waze are not any better according to our benchmark. Location services that don’t get privacy right are the worst thing we can carry around with us, because letting anyone know where we are at any point in time is really stupid. If you don’t switch for the sake of privacy, switch for the sake of activism.

Switching E-Mail services

I remember when a few of my friends started to be beta users of Gmail; everyone wanted to have an account. But Google then decided not only to scan all our e-mails for advertising purposes but also to combine all this knowledge with everything else we do with the company (including search, YouTube, etc.). As a result, I turned away from Google. I do not even search with Google anymore, but use Startpage as a very good alternative.

That said, Gmail is really not the only online mail provider that scans everything you write and exchange with others. As soon as you handle your e-mail in the cloud with free providers, you must more or less expect this to be the case. My students therefore recommend switching to Runbox. It is a pay-for e-mail service, but the price is really affordable, starting at €1.35 per month for the smallest package and staying below €5 for a really comfortable one. Also, Runbox is a hydropowered e-mail service, so you also do something good for the environment by supporting it. An alternative to Runbox is Tutanota. Its usability was rated a bit weaker in comparison to Runbox, but it is available for free.

Switching Calendar Systems

Calendars are, next to our physical locations and contact data, an important service to care about when it comes to privacy. After all, your calendar tells whether or not you are at home at a certain time. Just imagine an online calendar being hacked and your home broken into while you are away. These fears were quite evident in the class discussions I had with the students who created the benchmark study, and so we compared calendar apps as well. All the big service providers are really not what you want to use. Simple came up as the service of choice to use on your phone, at least if you have an Android operating system. If you do not want the calendar on your phone, or do not have Android, Fruux is the alternative of choice for you.

In conclusion, there are alternatives available, and you can make meaningful choices about your privacy. The question now is: will you be willing to do so?