Fellow Footprint 2013 – The “Big Data, Small Data” YouTube Cinema
In mid-August we met with Elena Pfautsch, the project manager for the Summer Fellow Programme, and discussed the possibility of the four of us working together on what we called the Fellow Footprint Project: in other words, something to leave behind as a digital memento of our stay at the Humboldt Institute for Internet and Society. We had several ideas, some more feasible than others. Eventually we decided to walk down the road less travelled. Taking a cue from Julian’s experience in Vienna, we decided to organise a YouTube cinema night loosely revolving around the crucial relationship between Big Data and Small Data and what it means to us.
The following four very different sets of YouTube videos are what the four of us came up with. They encapsulate our different backgrounds and approaches to the topic in focus and the chosen medium.
Giovanni Navarria: The Journey and the Moment
A few weeks ago, after the other fellows and I discussed this event, I started thinking about what kind of short film I wanted to show. There are millions of self-made films on YouTube; it is a cornucopia of good and bad taste, and one just needs to choose. Most of them would be interesting to discuss here tonight.
But the more I thought about the themes we had discussed, the more I felt like I was looking at a whiteboard covered with a series of words, and the more I read those words, the more they rearranged themselves into the title of the first movie, The Address is Approximate, even before it was entirely clear to me why I had chosen it. The choice of the second film, I Forgot My Phone, is instead a consequence of the first.
The Address is Approximate has had about 4 million views. It is a stop-motion film made by Tom Jenkins of the British-based company The Theory. It was a personal project, and once it was uploaded to Vimeo, YouTube and other video-sharing platforms, the video became a success and gained Jenkins and Simon Sharp, his partner at The Theory, a contract with a large talent agency in Los Angeles.
The first thing that struck me about this video is the concept of the journey that is at its foundation: in all its romantic and poetic meaning, Jenkins’s short film captures the importance of the idea of travelling, of reaching new places to answer an almost vital yearning to discover or reach beyond the limits of our daily life. But the toys at the centre of the story are supposed to stay inside; they are trapped within. Yet, thanks to a bit of imagination and the myriad of data stored on Google’s servers, they manage to travel, without moving, all the way to the west coast. It is perhaps not the same as the real thing, but nonetheless the power of discovery is not to be underestimated.
We live in a world in which, notwithstanding the fact that more people than ever before in our history can travel for pleasure, the gap between the rich and the poor has widened quite considerably.
There are billions who don’t earn enough money every month for a couple of tickets to the cinema, let alone to take a flight and travel across the world. The Internet in this case has helped to shrink the world to fit in a 15-inch monitor. It is not just about reading or watching material online; it is also about experiencing, almost peeping into, another country’s life. Television has done much in this respect, and so have movies, but the journey of the little robot in Jenkins’s short film is a new type of journeying without moving, one that is often done in solitude and gives the illusion of being in the driving seat.
The emancipatory potential of that personal journey is not easy to calculate, especially when we put it together with the rest of the experiences one can live through the web (and all its applications). This movie is certainly an interesting and, I would dare say, positive example of the importance of making sense of the wealth of data that is out there on the web.
The title of the short film is also very important. The Address is Approximate is not just about Google Street View, and it is not simply about the lack of a precise address in the robot’s virtual journey to the west coast. Those words encapsulate exactly my feelings about the field of studies we are all in (and naturally, on a much larger scale, about life): in our specific field of research, we have more or less a sense of direction, but the final destination is quite unknown. This is a very complex field that evolves quickly and continuously (a bit like the loading images of Google Street View), and within this context it is far too easy to be stuck in the past, to be lost or to follow the wrong path.
The movie, however, brings to us, through the eyes of the protagonist, a certain sense of wonder and magic, and perhaps with it a certain idea of humility and, at the same time, boldness: two virtues that should always be attached to any journey towards the unknown.
The last point regarding Address is Approximate is about its production merit. This is quite an artistic feat. With a very small budget and basic equipment (a rented iMac, a Canon 5D Mk II DSLR, ordinary stop-motion software and data available from the web) and one week of work in after-work hours, Jenkins managed to produce an award-winning short, reach a wide audience, score a contract with an important talent agency in LA and create great publicity for his firm. It is a perfect example of how the web can serve creativity and business needs.
From an artistic point of view, I Forgot My Phone is, overall, a much more conventional and simple short film. It was written by and stars a young American TV and film star (Charlene de Guzman), and her presence was certainly instrumental in its success with the audience. So far it has had about 22 million views on YouTube. Philosophically it is perhaps a bit more monothematic than Jenkins’s movie, but it certainly makes an important point.
I see I Forgot My Phone as the opposite of Address is Approximate. In that first video, data and machines, linked together by a network, enrich us, bring us beyond our limits, let us travel beyond the cages of our lives and experience what we cannot always experience.
Charlene de Guzman’s short film instead depicts a world we are very familiar with, a world in which we no longer feel the need to live the moment, to experience it, store it within our consciousness and be changed by it, to travel through life as a series of interconnected experiences whose memories and feel make us who we are.
Instead, Miles Crawford (the director) shows us a world in which we live through the other and others: we outsource, so to speak, our experiences and the journey itself to a cool gadget, a hard disk, or a cloud system and a 4G network. The machine, the gadget, acts almost as a proxy or a buffer between us and the experience we are having, somehow disconnecting us from that experience.
The lost gaze of De Guzman shows us a world in which we reduce emotions and feelings to bits, to 1s and 0s, forgetting that the emotion one feels in the moment is entirely different from the one we relive through the electronic spectacle that our digital life has become.
Memory stored in our brains is selective and fallible, rather imperfect, and precisely because of that it is far more important than crystal-clear digital images. We are not meant to overload ourselves with the everyday clatter of the life we live; we are not meant to record all our life, most of which is in fact boring and pointless. We will never have the time to watch it and, if we did, we would never want to. But most importantly, we seem to forfeit living that life for the sake of recording it, sharing it, storing it and so on and so forth.
Seen from the perspective of this little movie, the cornucopia of digital machinery that shapes our lives makes us poorer; in fact it robs us of that very life and what it means to us. We are, and always will be, the life we live, not the life we watch.
Julian Ausserhofer: Metaphors of data
When scientists and practitioners try to explain matters about data, they very often refer to metaphors from the physical world. Most of the terms were established long before the digital era; they come from commerce (“data storage”, “data retrieval”, “data mining” or “data harvesting”) and nature (“data explosion”, “data is the new oil”, “Datenberg” (German for “data mountain”)). Han-Teng likes to speak of “data massage”. He uses the term to describe the manual effort of getting raw data (!) into the right shape before it can be processed further.
The terminology of data is full of metaphors. And, as lies in the nature of metaphors, they are never precise: the words are taken out of context, they stem from another sphere of meaning, and they are meant to explain entities that are otherwise difficult to understand. For instance, the “new oil” comparison is inadequate because data is (usually) not a finite resource.
But explaining data without metaphors is also very hard. I usually speak about structured information or values in (MS Excel) tables. Video makers and news reporters also face that problem when they produce films on data issues. Usually they show the physical repositories of data: screens with visualizations, server farms and crowds on the streets. This rather new BBC documentary on big data is a perfect example of that. At the same time, I find the animated data flows in the intro an appealing idea:
Very often, data metaphors are so common that we do not even realize they are metaphors. The Trekkies here certainly know an officer called Data. Until recently I did not associate Data with structured information. To me, Data was just a normal name. But Data is called Data because he is an android. As one of the leading officers on Picard’s Enterprise, he has impressive computational capabilities. He is also a know-it-all, more like an encyclopedia. But I guess that name would not have been so sexy. Take this example, where Data explains complex systems:
Data saves the Enterprise on numerous occasions. His biggest problem is that he does not understand human behavior. He is unable to feel emotions. One thread that runs through the whole TV show is that Data tries to find a chip that would give him human feelings. Until then, Data is very slow to pick up on jokes and habits:
Although Star Trek is set in the 24th century, Data is not able to differentiate between an idiom and a factual expression, something computers are already very close to understanding today.
In the following clip I found another noteworthy anthropomorphism of data. This video is an episode of a mini web show by Oracle to advertise their data processing solutions.
The video reminded me of The Internet Party, another short film, in which services become humans so that we can better understand their nature. Although the technique is very old, I still find that personification can explain abstract concepts very well, especially when it comes to data.
Ulrike Klinger: Surveillance and Power: The rise of panopticism
“If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place. (…) It is possible that that information could be made available to the authorities.” Eric Schmidt, Google CEO
The hype about “Big Data” should not let us forget that the interest of elites and states in knowledge about their subjects is actually nothing new. The story of Christmas begins with the pregnant Mary and Joseph having to leave their hometown for a census. Many centuries later, Thomas Cromwell demanded lists of weddings, baptisms and funerals all over England. Since the beginning of statistics, elites have collected all sorts of information about their populations. While we are still in shock and awe at today’s technological possibilities and the unscrupulousness of state institutions in using them to spy on friends and citizens, historical accounts show that even the technologies are not so new; they just look better today. And, of course, many people are careless enough to share their private and even intimate “data” voluntarily.
“What are people worried about? What is the problem? Are you doing something you’re not supposed to?” Trent Lott, former US Senator
The problem lies not so much in the data itself as in the interpretation of this data. Today, the Stasi files, whether shredded or intact, are of interest again, but from a different perspective. While thousands of Stasi officials tried to distinguish harmless talk about soccer games from demagogic metaphors by using qualitative document and text analysis, the same documents now serve to identify unofficial Stasi informants. The power of data lies in its interpretation. And much of this interpretation is now being delegated to algorithms, which decide on the relevance and consequences of pieces of information. Are you eligible to receive a credit card? Are you a potential criminal? Algorithms are starting to make decisions that humans should make, and this is why we should find them scary.
“If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged.” Cardinal Richelieu (1585 – 1642)
Relevance of data depends on interpretation frames. That is why we might be misled by the argument “I did not do anything wrong”. Who gets to define what is wrong and what is right? Is it “wrong” to attend a punk concert, as seen in video 1? What if things that are right today will be wrong tomorrow? Total surveillance, Foucault’s panopticon, is already here, we learn from a former high-ranking Stasi official who talks about Facebook in video 3. And the best thing was that people willingly shared personal information.
Of course, the Stasi and the NSA are worlds apart. But the question that they pose to society remains the same: How do we want to live?
Han-Teng Liao: The human condition facing big data
On 11 September 2013, I gave the following talk at a “Big Data, Small Data” YouTube Cinema that I co-organized with three other summer fellows at the Humboldt Institute for Internet and Society. By showing videos that we each picked individually (e.g. Julian Ausserhofer’s picks, documented in his blog post “Metaphors of Data”), we started a conversation on how data, big and small, has interfered with our lives and our research. My talk was the last of the four, and I reconstruct it below in a conversational tone.
Yes, as Julian Ausserhofer rightly mentioned, I prefer the terms data massage/masseuse over data science/scientists because I want to put human labour into the process of big data in general. I want to contribute to this conversation by asking this question: Why does *my* small data matter to the *big* world of humanity? I will use some videos on sex, porn and genes to illustrate both my questions and arguments. Please believe me that the video selection shows my seriousness about the question and arguments, and is not a superficial attempt to sell my talk with sex and porn. I deliberately chose topics such as sex, porn and genes because these are intimate subjects that are critical cases for the human condition and, if I may call it that, the data condition. I also chose these topics because data about sex, porn and genes are not necessarily detached or masculine. I want to use these videos to make us reflect on the possibilities and directions of the human condition and human values when all human data may be collected and processed.
Having just come back from a Berlin protest against NSA surveillance, I appreciate the efforts to defend privacy. Very few things could be more private and intimate than sex. What if sex became a big data business?
The movie I.K.U. tells such a story, in which corporate control over sex data may enhance, and thus take over, our most intimate and private experiences. The main storyline is indeed about orgasm-data collection and reuse. A corporation sends shapeshifting cyborgs into New Tokyo to gather “orgasm data” by having all kinds of sex with different kinds of people.
As the first pornographic film screened at the Sundance Film Festival (in 2000), the independent film I.K.U. is a rare porn film in the cyberpunk genre. The geographic context of the film’s production is worth mentioning: a Taiwanese-American director used the Japanese-language title “iku” (Japanese slang for orgasm). The choice of Tokyo and the Japanese references are not arbitrary. “Modern Japan simply was cyberpunk,” said William Gibson, arguably the best-known cyberpunk author. The film Blade Runner (1982) also inspired this movie.
This description provided by sci-fi cyberpunk author Lawrence Person is of interest here for our conversation on big data. The idea of a “ubiquitous datasphere of computerized information” is not new, and cyberpunk writers had already gone a step further, imagining a world where cyborgs collect “orgasm data” through invasive excitation of the human body. So I suggest that the cultural critique of corporate control made available by cyberpunk is worth reading about and reflecting upon. I also want to highlight that the “big data” of diverse orgasm experiences enhances (or promises to enhance) human sexual experience beyond the limits of one’s sexuality and sexual orientation. The business model allows a heterosexual male to experience sex as a lesbian, and the other way around. So the video provides an answer to the question posed here: Why does *my* small data matter to the *big* world of humanity? “Your small orgasm data, collected by our company, can enrich the big exotic world of humanity,” might be a very likely answer given by the “big-orgasm-data” company.
- Jacobs, Katrien (2003). ‘Queer voyeurism and the pussy-matrix in Shu Lea Cheang’s Japanese pornography’ in Mobile Cultures: New Media in Queer Asia.
- Person, Lawrence (1998). ‘Notes Toward a Postcyberpunk Manifesto’, first published in Nova Express issue 16, later posted to Slashdot.
Cyberpunk predates the rise of user-generated content, so it is worth mentioning a user-generated porn website that actually collects a particular kind of “orgasm-data”. The “Beautiful Agony” website collects user-uploaded videos that show only head shots of users having orgasms (mostly through masturbation). By concealing anything below the neck and upper chest, the format of such user-generated content structures the “orgasm-data” in a particular way that differs from conventional mainstream porn. Let me show you a US music video featuring a series of such “orgasm-data” from the website. Note that it is not an original porn video but a series of “Beautiful Agony” videos edited together for a music video. I believe that the first 20 seconds of the music video are enough for today’s discussion.
Here we find an interesting case where personal *small* orgasm data, in a certain limited format, are voluntarily given by individual users to a for-profit website. Is this kind of data sharing liberating? Or is this still a form of corporate control? It seems to me that both the empowering and the enslaving possibilities depend much more on “how” we share our own data than on “whether” we share it. Here my normative answer to the question of big versus small data derives from Hannah Arendt’s concepts of the public, the private, and the social. When fighting or resisting governmental or corporate data control, we must avoid the “loss of the world”. We still need the human world of the public sphere and collective action. Sharing and contributing an individual’s small piece of data, as a component of action and speech in the public sphere, is important and political for the bigger world. Thus, while we need to be careful to keep some things private and unshared, avoiding the further government and corporate control that big data may help facilitate, we also need to proactively link our data for public and political action.
- Paasonen, S. (2010). Labors of Love: Netporn, Web 2.0, and the Meanings of Amateurism. New Media & Society 12(8). doi:10.1177/1461444810362853
- Ward, Anna E. (2010). Pantomimes of Ecstasy: BeautifulAgony.com and the Representation of Pleasure. Camera Obscura 25(1). doi:10.1215/02705346-2009-018
Let me show you two videos about a civic science project that collects human gene data to map out the tree and paths of human evolution and migration. Here we find yet another project that shares individual, small but quite personal data (one’s DNA) in a way that contributes to our understanding of the bigger world and longer history of humanity. It is the Genographic Project, launched in 2005 by the National Geographic Society and IBM with the aim of mapping human migration by collecting and analyzing DNA samples from people around the world. (First: 1:15 –> 3:55; second: 11:14 –> 14:54)
It is an inherently political project, as it constructs a compelling narrative that we, despite our racial and ethnic differences, all come from the same ancestor in Africa. It is relatively more difficult to construct a racist ideology if we can see how our individual genes actually belong to the grand tree of human beings throughout history and around the world. In particular, pay special attention to the second video, regarding not only how the personal DNA is collected voluntarily but also how the non-profit financial support is established. The funds raised from selling the DNA-collecting kits to the general public (most likely in the first world) go into the Legacy Fund, which funds projects that directly preserve or revitalize indigenous culture. Here we see both the necessity and the possibility of alternative thinking about data sharing that goes beyond dystopian corporate and/or governmental control.
- TallBear, Kimberly (2007). Narratives of Race and Indigeneity in the Genographic Project. The Journal of Law, Medicine & Ethics, 35(3), 412–424. doi:10.1111/j.1748-720X.2007.00164.x
- Spencer Wells and Theodore Schurr (2009). Response to Decoding Implications of the Genographic Project. International Journal of Cultural Property, 16, pp 182-187. doi:10.1017/S0940739109090109
- Bandelt, H.-J., Yao, Y.-G., Richards, M. B. and Salas, A. (2008), The brave new era of human genetic testing. Bioessays, 30: 1246–1251. doi: 10.1002/bies.20837
Getting Political: Personal story/data become big story/data
Hannah Arendt’s concepts provide me with a tentative answer to the big-versus-small data question. We have to be cautious about the rise of the social, and active about expanding and enriching the public and the political.
Ultimately, defending privacy is not about keeping something to oneself as isolated individuals (think closet gay), but rather about taking control over individual, social and public lives. To conclude my talk, I want to share with you some videos from the “It Gets Better” project. These videos, contributed by celebrities and by employees of major US companies and government agencies, aim to help gay teenagers who may lose hope for life because of bullying or social alienation.
Thus, for data politics we need the proactive definition of freedom found in Hannah Arendt’s vita activa: freedom is about participation in public action, which is the opposite of the undisturbed private life (Canovan, 1974). As a Chinese proverb goes, “When your hands are tightly closed into fists, you have nothing inside; when your hands are wide open, you have the world.” We need better knowledge of data politics to reclaim the wider public world that is beyond corporate and government control.
This blog entry is part of the Summer Fellow Programme and was created by the Summer Fellows 2013 of the Humboldt Institute for Internet and Society. This entry does not necessarily represent the opinion of the Institute.