LIVE MOBILE VIDEO INTERACTION
Mudassar A. Mughal
DSV Report Series No. 16-001

Live Mobile Video Interaction
Inventing and investigating technology, formats and applications
Mudassar A. Mughal

© Mudassar A. Mughal, Stockholm University 2015
ISSN 1101-8526
ISBN 978-91-7649-324-3
Printed by Holmbergs, Malmö 2015
Distributor: Dept. of Computer and System Sciences, Stockholm University

To the memory of my beloved father

Abstract

The convergence of inexpensive video-enabled mobile phones, high-speed mobile data networks and ubiquitous sensing devices opens up a new design space called "live mobile video interaction". It gives rise to a new genre of applications concerning live mobile video production, which can be seen as an instance of this space. In this work we are particularly interested in exploring the technical challenges and opportunities presented by live mobile video interaction. We started our investigation by studying two existing prototypes from this genre, i.e. the Instant Broadcasting System (IBS) and the Mobile Vision Mixer (MVM). We studied their applicability for amateur users of collaborative mobile video production tools and the problems caused by inherent communication delays in the Internet. We acquired initial user feedback and conducted technical tests on both systems. Our results indicate that lack of synchronisation among video streams causes problems for directors in such systems that are not present in professional systems. We also identified two distinct video production modes, depending on the director's visual access to the event being filmed. Based on our study we proposed technical design suggestions and indications on how to solve the synchronisation problems in the respective mixing modes. We also proposed an algorithm for frame-rate exclusive synchronisation management of live streams in a collaborative mobile production environment. We further probed the design space using the research through design method, which resulted in a fully functional prototype system called "Livenature", designed to evoke the emotional connection that exists between people and the places they cherish. Further investigation of Livenature allowed us to produce detailed studies about experiential and technical aspects of the system, thus revealing phenomenological and technical dimensions of the design space.

Acknowledgements

When I joined MobileLife Center I had recently graduated from The Royal Institute of Technology (KTH). I was fundamentally trained as an engineer who would immediately jump to the functional details of a problem, with little inclination to understand ideas and concepts on an abstract level. I could never imagine that one day I would be able to accomplish something like this work. Here I would like to take the opportunity to express my deepest gratitude to all those who, in one way or another, were part of this journey.

I feel immensely fortunate to have worked at MobileLife Center, a place full of fascinating people and their ideas. There are so many people who have influenced and inspired me over the years at MobileLife. If I miss anyone, I want you to know that I am filled with respect and gratitude for you. I would like to take the opportunity to acknowledge that this work would not have been possible had it not been for the intellectual grooming I received from my supervisor and mentor Professor Oskar Juhlin.
I am thankful for his continuous support, patience and immense knowledge. It is not possible for me to thank him enough for being such an incredible supervisor, colleague and, above all, a friend. I owe special thanks to Arvid Engström, Ramin Toussi, You Le Chong, Elin Önnevall, Jinyi Wang, Yanqing Zhang, Fredrik Aspling, Alexandra Weilenmann and Thamwika Bergström for their invaluable contributions to this work in many ways. I also want to extend sincere thanks to Gina Venolia for agreeing to act as my opponent. I am grateful to committee members Konrad Tollmar and Professor Theo Kanter for their highly valuable feedback during my "final seminar". I am also thankful to Professor Mikael Wiberg and Professor Gudrun Dahl as committee members. I must not forget to mention Lars Erik Holmquist and Goranka Zoric, who acted as my co-supervisors at different times, for their always encouraging and inspiring role.

I cannot be grateful enough to Kristina Höök, Annika Waern, Barry Brown, Maria Holm and Oskar Juhlin for creating a workplace where inspiring ideas are always floating in the air. This intellectually nurturing atmosphere enabled me to slowly open up to the strange and amazing world of design-oriented research. I want to express my gratitude to all those with whom I have worked closely and who have helped make this journey easier and more pleasant. These include Airi Lampinen, Anna Ståhl, Donny McMillan, Elena Marquez, Elsa Kosmack Vaara, Eva-Carin, Ilias Bergström, Jakob Tholander, Jarmo Laaksolahti, Johanna Mercurio, Jon Back, Jordi Solsona Belenguer, Lucian Leahu, Kim Nevelsteen, Moira McGregor, Mattias Jacobsson, Pedro Ferreira, Sebastian Büttner, Sophie Landwehr Sydow, Stina Nylander, Syed Naseh, Vygandas Simbelis, Vasiliki Tsaknaki, Vincent Lewandowski and Ylva Fernaeus.

This was a long journey and it would have been incredibly difficult had I not been lucky enough to have amazing friends and family. The incredible comfort and support I received from my family pushed me through tough times. For that I am thankful to my mother, my brothers, my sister and my wife. Lastly, my sincere thanks go to the Higher Education Commission of Pakistan and Stockholm University, who funded this endeavour.

Stockholm, December 2015
Mudassar A. Mughal

Contents

Abstract .......... vi
Acknowledgements .......... viii
List of figures .......... xii
1 Introduction .......... 13
2 Background .......... 17
2.1 Development of live video streaming .......... 17
2.2 Video streaming in technical research .......... 19
2.3 Mobile video production in HCI .......... 20
2.4 Liveness in HCI and Media studies .......... 22
3 Methodology .......... 26
3.1 General approach .......... 26
3.2 Points of investigation .......... 27
3.2.1 Professional oriented multi-camera video production .......... 28
3.2.2 Live mobile ambient video .......... 29
3.3 Specific methods .......... 30
3.3.1 Designing and building prototypes .......... 31
3.3.2 Performance testing .......... 33
3.3.3 Field trials .......... 33
3.4 Ethical considerations .......... 34
4 Live mobile video interaction .......... 36
4.1 Professional oriented multi-camera video production .......... 36
4.1.1 Synchronisation and delays in multi-camera production .......... 36
4.1.2 Frame-rate exclusive synchronisation .......... 40
4.2 Live mobile ambient video .......... 41
4.2.1 Ambient video format and mobile webcasting .......... 41
4.2.2 Liveness experienced through ambient video .......... 48
4.2.3 Resource efficiency in live ambient video .......... 50
5 Findings and results .......... 53
5.1 Varying delays give rise to asynchrony .......... 53
5.2 Mixing decisions are affected by lack of synchronisation and delays .......... 55
5.3 Specific mixing scenarios have distinct demands .......... 55
5.4 Frame rate-based synchronisation .......... 56
5.5 Mobile webcasting extends ambient video format .......... 57
5.6 Live ambient video is a new form of mobile webcasting .......... 58
5.7 There is a "magic" to the experience of liveness .......... 59
5.8 Liveness theory revisited .......... 59
5.9 Resource efficiency and live ambient video formats .......... 60
6 Conclusion .......... 63
7 Dissemination of results .......... 65
7.1 Paper I: Collaborative live video production .......... 65
7.2 Paper II: FESM .......... 65
7.3 Paper III: Livenature system .......... 66
7.4 Paper IV: Liveness and ambient live videos .......... 66
7.5 Paper V: Resource efficient ambient live video .......... 66
7.6 Related Publications .......... 67
8 Bibliography .......... 69
The Papers .......... 76
List of figures

Figure 3-1: Points of investigation in live mobile video interaction design space .......... 28
Figure 4-1: User roles in a collaborative video production setting .......... 37
Figure 4-2: Mixing interfaces of IBS (left) and MVM (right), graphics by Engström, A. .......... 38
Figure 4-3: a) Camera phones on mobile stand, b) interaction manager .......... 44
Figure 4-4: Livenature architecture .......... 45
Figure 4-5: a) Large display, b) small screen, c) windmill .......... 46
Figure 5-1: Delays in professional oriented mobile video mixing .......... 54

1 Introduction

In recent years, the availability of high-speed mobile networks, together with camera-enabled mobile phones, has given rise to a new generation of mobile live video streaming services. These developments have opened a new avenue for live video production. Most such services and applications today are limited to a single mobile camera. Recent works indicate that there is a demand for more extended resources for amateur storytelling that resemble professional TV production technology (Engström and Juhlin et al, 2010; Engström, 2012). There is an emerging class of applications to fill this gap. Such applications focus on enabling the use of collaborative resources in live video production. They allow users to produce videos collaboratively using multiple mobile cameras, in a manner similar to how professional live TV production teams work.

In mediating live events as they happen, video and audio have been the dominant media. However, the proliferation of environment sensors and networking within the area of the Internet of Things provides yet another interesting source of real-time data. The Internet of Things will provide access to a rich amount of real-time sensor data. Such data can be combined with existing forms of multimedia, generating new hybrid media that could support more diverse ways of experiencing remote contexts. This convergence of live video streaming, high-speed mobile networks and ubiquitous environment sensor data acts as a catalyst for the development of yet another new genre of video interaction applications.

This thesis explores the potential technical challenges within the applications generated as a result of the juxtaposition of the aforementioned technologies. Furthermore, this work provides design indications, algorithmic solutions and insight into the experiential dimension through investigation of existing applications and by designing novel systems. We investigate a design space called live mobile video interaction, situated mainly at the intersection of the domains of mobile video production and Internetworking. The mobile video production domain concerns emerging mobile live video production tools/technologies and the social practices emerging around them.
The Internetworking domain pertains to the supporting technology that enables those video recording devices to talk to each other and to the rest of the Internet. A design space is a hypothetical concept with vague boundaries and multiple dimensions. While the constituting research domains are known, pre-defined dimensions, others are investigated by acquiring a concrete design example that represents the space. We imagine a design space as an alien territory, and an example system that represents it as a probe, which we use to obtain as much data as possible. Against the backdrop of this notion of a design space, we aim to investigate the following relatively open research question:

What are the challenges and the opportunities presented by live mobile video interaction in terms of technology, user practices and design?

We started this investigation by studying amateur users of collaborative mobile live video production, and the problems they face caused by inherent communication delays in the Internet. We acquired initial user feedback and conducted technical tests on two examples of live mobile collaborative video production systems, i.e. the Instant Broadcasting System (IBS) and the Mobile Vision Mixer (MVM). We identified two distinct video production modes, depending on the director's visual access to the event being filmed. Based on our study we proposed technical design suggestions and indications to solve the synchronisation problems in the respective mixing modes. We also proposed an algorithm for frame-rate exclusive synchronisation management of live streams in a collaborative mobile production environment.

The real-time nature of live mobile webcasting makes it a challenge to find relevant, interesting content for broadcast. Also, the amount of camera work required obscures the experience of the moment itself. An attempt to invent alternative video formats that may address these challenges led us to further exploration of the space. The term "video format" should not be confused with "media encoding format". A video format here is more in line with the concept of a TV format, which is an overall arrangement from production to presentation of video content that defines the structure of the storytelling involved. A few examples of TV formats are game shows, reality shows and current affairs shows. This endeavour resulted in a fully functional prototype called Livenature. Taking inspiration from emerging user practices around the newly arrived mobile streaming technology, coupled with mobile internet streaming services and leisure, we conceived a system that would evoke the emotional connection that exists between people and the places they cherish.

We take the research through design approach to investigate the experiential qualities of live ambient videos used as objects of decoration within the context of the home. We studied people who have occasional access to highly appreciated geographical locations; the results reinforced our conviction that there exists an important emotional connection between people and their cherished places, and that people like to dream about such places when they are away. The final implementation of the system consists of three subsystems: a media capture system, communication infrastructure and decorative media. Livenature captures live video feeds and weather data from the cherished place and presents the live streams and weather data in an ambient, aesthetic manner in the living room of a household.
During the design process, we take inspiration from ambient video, which is an artistic form of video with slow and aesthetically appealing content recorded from natural scenery. The prototype gave rise to a new kind of ambient video by incorporating liveness, mobility and sensor data. As an exemplar of the design space in question, Livenature helped us ask questions about the experience of this new kind of hybrid media, and revealed new technical challenges that are likely to emerge. As a result, we produced detailed studies about the experiential and technical aspects of the system, thus revealing phenomenological and technical dimensions of the design space.

2 Background

This chapter sets a background against which this investigation into the live mobile video interaction design space should be understood. First, we present an account of the historical development of live video streaming, which led to the presently available mobile broadcast services. Then we present a brief overview of related work on video streaming in technical research. Finally, we present the state of the art on the topic of liveness in HCI and media studies.

2.1 Development of live video streaming

Transmission of live media has been around for over 70 years in one form or another. The invention of television brought moving images into the home in the early 1940s. The original television service was analogue in nature and delivered live images from the camera via a distribution network. Streaming media is defined as transmission of audio-visual digital information from source to destination over the Internet (Austerberry & Starks, 2004). By the time the Internet reached home users, digital video was stored on compact discs (CDs) using MPEG-1 compression. Unfortunately, MPEG-1 video files were too large to be streamed in real time. Therefore, true streaming had to wait until more sophisticated compression techniques emerged.

Applications used the Internet only for file transfer in the early days of multimedia streaming. In such applications, the content could be played back only after a complete multimedia file had been downloaded to the local computer. This is referred to as "download and play". In contrast, in "true streaming" the media content is transferred to the viewer's player at the same time as it is played, and is not stored anywhere between its origin and the receiving application. As indicated earlier, this kind of streaming became possible because of three major developments: progress in content delivery mechanisms, advances in data compression techniques, and overall improvement in the Internet throughput available to home users.

Developments in techniques for content delivery and media serving, together with ongoing progress in audio compression, resulted in codecs that made audio streaming possible at a bit rate of 28 kbps, the maximum bitrate available to home users in the early 1990s. In later years, telecommunication companies and service providers began offering broadband Internet to home users. At that time, cable modems and ADSL supported up to 1 Mbps downlink speed. This particular convergence of technological trends made video streaming over the Internet a reality (Austerberry & Starks, 2004). On 5 September 1995, a Seattle-based company named Progressive Networks enabled ESPN SportsZone to stream a live radio broadcast of a baseball game. It was the world's first live streaming event over the Internet.
Although the idea of streaming video over IP was received with excitement by the tech industry, in its early days streaming media faced practical challenges, such as how to deliver watchable video over 56 kbps modem lines. In the early 2000s, Microsoft and RealNetworks were big players in the arena of media streaming. However, by the mid-2000s Flash Player, developed by Macromedia and later acquired by Adobe Systems, began dominating the media-streaming scene. Flash Player revolutionised the streaming industry by bringing interactivity, media streaming and Web 2.0 together (Zambelli, 2013).

This success of streaming media comes as a result of a number of earlier technical achievements. Thanks to digital video applications such as storage and playback from CD-ROM, developments in digital signal processing and data compression techniques were already underway before the arrival of streaming media. Such developments led to efficient video codecs. The raw data rate required for a digital video with a resolution of 720 x 576 and a frame rate of 25 fps easily exceeds 25 Mbps. On the other hand, the data rate supported for desktop video playback from CD-ROM was no more than 1.3 Mbps. This makes it clear that video compression is a key prerequisite for enabling transmission of video over the Internet. If not compressed, only 30 seconds of such video requires a whole 5 Gbit compact disc (Paul, 2010). A compression ratio of around 240:1 is required for saving an entire movie of this kind on a CD of the same capacity.
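These figures can be reproduced with a rough back-of-the-envelope calculation, assuming uncompressed 4:2:2 sampling at 16 bits per pixel and a roughly two-hour movie; the sampling and running-time assumptions are ours, not stated in the cited source:

```latex
% Raw bitrate of 720 x 576, 25 fps video at an assumed 16 bits per pixel (4:2:2 sampling)
\[ 720 \times 576 \times 25\,\tfrac{\text{frames}}{\text{s}} \times 16\,\tfrac{\text{bits}}{\text{pixel}} \approx 166\ \text{Mbit/s} \]
% Thirty seconds of such video already fills a 5 Gbit disc:
\[ 166\ \text{Mbit/s} \times 30\ \text{s} \approx 5\ \text{Gbit} \]
% A two-hour movie on the same disc therefore needs roughly
\[ \frac{166\ \text{Mbit/s} \times 7200\ \text{s}}{5\ \text{Gbit}} \approx 240:1\ \text{compression} \]
```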
Moore's Law states that the number of transistors that can be incorporated on a semiconductor chip of a given size can be expected to double roughly every two years (Paul, 2010). The growth in the capabilities of digital hardware, e.g. processing power, storage space, pixel density and data throughput, is linked to Moore's Law in one way or another. In previous years, the world has witnessed storage capacity per dollar increasing following the same trend (Paul, 2010). This rapid growth in technology has also driven improvements in the bandwidth of home broadband networks, wireless local area networks and mobile networks. This continuing trend led to the widespread availability of highly powerful mobile handheld devices fitted with cameras. This growth in the computing power of devices, along with the ever-improving image quality of integrated cameras, enabled services that allow mobile phone users to broadcast live video streams at any time and from anywhere within the reach of a cellular network. Where the use of live video on the Internet was once marginal and primarily used for person-to-person video chat and proprietary video conferencing systems, it is now a common feature of web TV services, blogs and news services (Engström, 2012).

With the emergence of powerful streaming-capable smartphones, new services supporting live video broadcasting started emerging. Today such services have a vast number of users. The first such service, called ComVu Pocket Caster, which allowed mobile broadcasting to a public webpage, was launched in 2005 (Reponen, 2008; Juhlin et al, 2010). ComVu Pocket Caster was later renamed Livecast. In the following years, similar services emerged, including Qik, Bambuser, Flixwagon, Floobs, Next2Friends and Ustream. More recent examples include Meerkat and Periscope.

2.2 Video streaming in technical research

Video streaming is a vast field in which there has been continuous research and development. Today, video streaming has a plethora of application areas, ranging from recreational uses of the technology, e.g. in entertainment and sports, to more practical areas such as video surveillance, traffic monitoring and industrial inspection. The variety of applications and the sheer amount of interest from the public and private sectors alike help generate a huge amount of research contributions. In this study we are concerned with research related to collaborative mobile video streaming and issues like end-to-end delays, audio/video synchronisation and inter-stream synchronisation.

Kaheel et al. present a system called Mobicast (Kaheel et al., 2009) that enables production of a better viewing experience of an event. It selects a suitable angle of view from among multiple live mobile streams. It is also capable of stitching more than one video stream together to provide a wider field of view. Mobicast uses a network time protocol (NTP) based time-code to achieve synchronisation among video streams from multiple mobile devices. Juhlin and Engström presented a collaborative mobile video mixing application called SwarmCam, which allows multiple mobile cameras to stream live video data over 3G to a mixing station, where a video mixer enables the director to select one of the streams, mix it with pre-recorded material and broadcast the final produced content in real time (Engström, 2008). Wang, L. et al. (2008) report another example of a mobile collaborative system called Mobile Audio Wiki, which enables audio-mediated collaboration on the move. Delays and synchronisation are not of any concern there, since this application offers asynchronous audio-based collaboration.

Transmission delay in multimedia applications generates different kinds of effects. Ito et al. (2004) studied average end-to-end delays and jitter, i.e. variation in end-to-end delay, in live audio-video transmissions to investigate their effect on user-perceived quality of service. They found that the standard deviation of delay affected the user experience more than a constant delay. Baldi et al. (2000) studied the question of how end-to-end delay in video conferencing in packet-switched networks can be minimised. They analysed end-to-end delay with six different settings, combining three generic network architectures: circuit switching, synchronous packet switching, and asynchronous packet switching. They performed their study with raw video and a variable bit rate codec, and showed that variable bit rate video encoding is a better choice for delay-sensitive systems. Endoh et al. (2008) propose a new live video streaming system featuring low end-to-end delay and jitter; the system does not incorporate audio data, however. In all, many other researchers have focused on the problem of delay in end-to-end video transmission (Sareenan & Narendaran, 1996; Gualdi et al., 2006; Weber, 2006). However, no one has performed delay analysis in mobile collaborative settings.

Synchronisation in a set of video streams is defined as maintaining the same temporal relationship between frames of the video streams at the time and place of reception as they had at the time of acquisition (Boronat et al., 2009).
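To make this definition concrete, the sketch below shows one simple way a receiving mixer could estimate how much extra buffering each stream needs so that the original temporal relationship is restored, assuming every frame carries a capture timestamp from a common, NTP-synchronised clock (as in Mobicast above). This is an illustrative sketch only, not the mechanism used in IBS or MVM, nor the FESM algorithm proposed later in this thesis; all names are hypothetical.

```python
# Illustrative sketch: align live streams whose frames carry NTP-based capture
# timestamps. All identifiers are hypothetical, not taken from the thesis systems.

def playout_offsets(latest_frames):
    """latest_frames maps stream_id -> (capture_ts, arrival_ts), in seconds.

    Each stream's end-to-end delay is arrival_ts - capture_ts. To preserve the
    temporal relationship at the mixer, every stream is buffered further so
    that all of them match the slowest (most delayed) stream.
    """
    delays = {sid: arrival - capture
              for sid, (capture, arrival) in latest_frames.items()}
    slowest = max(delays.values())
    # Extra buffering each stream needs before being shown in the multi-view.
    return {sid: slowest - d for sid, d in delays.items()}


if __name__ == "__main__":
    # Example: four camera phones streaming to a mixer over a mobile network.
    frames = {
        "cam1": (100.00, 101.20),  # 1.2 s end-to-end delay
        "cam2": (100.00, 102.10),  # 2.1 s delay (slowest stream)
        "cam3": (100.00, 101.65),
        "cam4": (100.00, 101.40),
    }
    print(playout_offsets(frames))
    # cam2 needs no extra buffering; the others are delayed to match it.
```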
Synchronisation in the context of networked video streaming services is a critical topic that has been extensively studied (Sareenan & Narendaran, 1996; Blum; Rautiainen et al., 2009). Researchers have put forth several solutions for media synchronisation in a variety of contexts. Most have proposed new systems that do not fit the requirements of our heterogeneous collaborative mobile setting. Others (Haitsma & Kalker, 2002; Cremer & Cook, 2009; Kennedy & Naaman, 2009; Shrestha, 2009) explored the possibility of using common features, such as audio signatures and sequences of camera flashes in video streams, as reference points for calculating the synchronisation offset. However, with every new application and service, the problem re-emerges due to the new intricacies and constraints involved.

Summing up, there is a large body of research in the area of video streaming. We have only presented selected work from technical research here that is closely related to this thesis. A number of works address the topic of synchronisation and delays. However, they do not focus on delays in the mixing of live video streams, which is an essential feature both of professional live TV systems and of the upcoming new types of mobile collaborative systems.

2.3 Mobile video production in HCI

With the proliferation of powerful mobile devices with digital cameras, mobile video has gained significant attention within the research community. Kirk et al. (2007) frame user practices around video as a "videowork" life cycle. They show how users produce, edit and consume videos with existing devices. They argue that mobile video production is typically spontaneous, unlike the heavyweight production domain, and that videos produced by lightweight devices are shared in the moment, mostly without editing, and are primarily meaningful in the context of shared experience. Editing before sharing is seen as an inconvenience. This difference between spontaneous video recording on a camera phone and the production practices around heavier equipment echoes in another study, conducted by Lehmuskallio et al. (2008). They compare videography on mobile phones to camera-phone photography and suggest that mobile video practices are more closely related to snapshot photography than to traditional video production and filmmaking. The following quotation from David's work seems to be in harmony with these characteristics of video production by mobile devices:

"Repositories of digital pocket videos often tell stories that feel like old spaghetti western films. Most of the home camera phone videos lack dramatic action. While the spectator just wants to see what will happen next, the takes are long and the desert dry. Camera phones are an appurtenance of everyday life, which we rarely storyboard. The images so produced, therefore, tend to be spontaneous, at least in their content" (David, 2010).

O'Hara et al (2007) study the social practices around video consumption on mobile devices and unveil a variety of motivations and values in different contexts. Puikkonen et al. (2008; 2009) focus on mobile videowork in daily life and identify usage patterns involved in the practice, highlighting challenges that mobile videographers face. Previously, researchers have explored affordances and properties of live streaming video in a variety of contexts. Some examples are: a group of friends (Reponen, 2008), visual performance in nightclubs (Sá, M et al., 2014) and emergency response work (Bergstrand & Landgren, 2011). After the emergence of early streaming services that allowed live broadcasting from mobile devices, people have performed content analysis of such services (Juhlin et al., 2010; Dougherty, 2011).
Jacucci et al. (2005) investigated how camera phones can enhance the spectator's shared experience. They suggest that mobile imaging is not merely a means for documenting the spectator's in situ experience; it can also be a participative practice that enhances the experience. Juhlin et al. (2010) argue that there is a demand for more sophisticated mobile video tools. Such tools would support real-time storytelling by allowing collaboration as well as editing before broadcast of the final production, just as is done in the professional TV production realm. Early works on such applications include Jokela's (2008) prototype and design for editing on mobile devices. Bentley et al. (2009) discussed technical challenges in live streaming with the help of their prototypes TuVisa I and TuVisa II, which explore collaboration and mixing of mobile feeds. Juhlin and Engström et al. (Engström, 2012; 2008) also developed two working prototypes, Mobile Vision Mixer (MVM) and Instant Broadcasting System (IBS), that were modelled on professional video production and supported multi-camera live collaborative video production and editing in real time.

Since the dawn of mobile live streaming services, we have witnessed a number of services in this class. Some notable examples are Qik, Livestream, Justin.tv, Bambuser, Flixwagon, Floobs, Next2Friends and Ustream. Most of these services did not translate into the explosive success they were initially expected to be. Many services disappeared from the scene altogether, and those left have never really taken off. One reason could be related to the challenges associated with producing meaningful and interesting live content. Even if one manages to capture an interesting story, it is hard to find an audience ready to consume the content in the moment. Recently, two increasingly popular live streaming applications, Periscope and Meerkat, seem to have cleverly solved at least the problem of reaching the audience, by connecting with Twitter, which matches such services in character for being "live".

2.4 Liveness in HCI and Media studies

All these developments around real-time streaming have also increased interest in the HCI community in understanding liveness (Hook et al., 2012). Previously, HCI research has shown little interest in the production aspect of live videos; accounts of liveness have appeared only in a scattered manner. Media theorist Scannell argues that there is a "magical quality" in the temporal "now" of live broadcasts (Shirky, 2008). Technical research (Maloney et al., 1995; Joshi et al., 2012; Maleki et al, 2014) is usually occupied with achieving media presentation occurring as close in time to the captured content as possible. In addition to the studies of live webcasting, works about art, music and performance have focused on various experiences associated with such temporal adjacency (Joshi et al., 2012; Hook et al., 2013). Design-oriented research projects touch upon the experiences of liveness without explicitly articulating it (Gaver, 2008; Gaver, et al., 2013). Given this interest, we are motivated to align and articulate liveness experiences existing in CHI research with the theoretical concepts that are well established in the area of media studies. Here we provide a media studies-based account of the experiential qualities in liveness, such as immediacy, unpredictability, engagement and authenticity, and how they relate to liveness.
Liveness is seen as an immediate experience, as it occurs in the present moment of "now" (Auslander, 2008). It provides instant sensory access to remote events as they occur, thus acting as an extension of human sensory perception. Video chat, for example, is considered to possess the quality of liveness, facilitating an immediate connection between remotely located friends and family (Massimi & Neustaedter, 2014). In a study of a photo/video-sharing application, Weilenmann et al. (2013) touch upon the importance of immediacy and discuss how the ability to share content in real time makes the application different from other similar services. Gaver et al. (2008) use real-time data for creating an immediate experience of liveness. The so-called "threshold devices" gather real-time data, such as wind speed and passing aeroplanes, from the house's surroundings and present them in an aesthetically pleasing manner in the home. Such immediate access to information surrounding the home enables the inhabitants to experience liveness from the outside. In all, the collective indication from these works is that immediacy, or the sense of now, is essential for various liveness experiences.

Live content potentially brings the experience of the unexpected, conflated with unpredictability and spontaneity (Auslander, 2008). There is an associated anticipation that something unplanned might occur in the viewing experience, since events are unfolding as they happen in the present moment. Hook et al. (2013) explore non-mediated live events where a performer is co-located with the audience and investigate how the experience of liveness degrades if part of the creative work is produced in advance of a performance. For example, co-present experiences of "laptop"-generated VJ performances feel less "live" than traditional live concerts. Liveness pertains to some sort of improvisation, responsiveness and uniqueness (Leong & Wright, 2013; Bowers et al, 2014). Gaver also discusses the character of "unexpectedness" in the liveness experience in his design work Video Window, in which he attached a camera to the outside wall of his home that continuously streamed video to a monitor on his bedroom wall (Gaver, 2006). He gives an extended description of how unanticipated changes in the weather, e.g. raindrops or snowflakes landing on the camera lens, brought about aesthetically pleasing experiences. All this suggests that unpredictability is connected to the liveness experience.

In live events such as concerts or sports events, it is a commonplace observation that there is an elevated sense of engagement among viewers. The spectators feel an emotional connection, i.e. a "despatialised simultaneity", with the event and its participants (Thompson, 1995). The presence of an audience also creates a sense of liveness, as a performer's awareness of the audience is often fundamental to the flow of the performance (Reeves et al, 2005). The understanding of liveness in relation to engagement appears to emphasise its "event character", where co-presence and co-location are key to the experiences. It implies that to engage fully in liveness experiences is something special that stands out from "everyday life".

Liveness is considered to have a connection with the experience of authenticity and trust. The most immediate transmission makes it less plausible that someone has manipulated the content through post-processing or by censoring the content prior to its presentation.
However, in a live coverage situation, the camera work and the production setup involved facilitate ways in which media content can be pre-computed into the system to give a viewpoint that is not quite as neutral as it may appear (MacNeill, 1996). The relation between liveness and authenticity has been a topic of concern in HCI. It is argued that the co-presence of performers and audiences at a live event inherently provides authenticity, which might be lost if the event is mediated. Co-present mediatisation could be interesting if it increases engagement through parallel activities. Jacobs et al. (2013) discuss the balance of data authenticity and audience engagement in liveness. Research in HCI recognises authenticity as an important aspect of liveness experiences mediated through technology. However, the detailed ways in which such experiences occur lack articulation. The abundance of mobile cameras, sensors and networks makes real-time media more ubiquitous. Thus, there is a need for an increased understanding of how this maps to authenticity in liveness experiences.

In sum, the "magic" of liveness has already been identified within the studies of traditional broadcast media. With the emergence of technology for personalised media production and consumption, interest in liveness experiences has also increased within HCI. The latter interest is both recent and fragmentary, which motivates further investigation.

3 Methodology

This chapter describes the general approach for the inquiry into the design space of live mobile video interaction, as well as the specific methods we adopted for conducting the practical activities involved.

3.1 General approach

We present new technology, and novel contexts of use of that technology. We engage with designing and developing functional prototypes of such technology in order to generate new knowledge and to demonstrate a research contribution, rather than to develop a market-ready product. The result is the generated knowledge embodied by the prototype. On a general level, this approach of generating knowledge through design matches well with Zimmerman et al.'s (2007) notion of research through design (RtD). RtD emphasises the designed artifact as an end product in which the generated knowledge is embedded. RtD has become a widely adopted approach because it embraces the practice-based nature of design investigations (Gaver & Bowers, 2012; Ylirisku et al, 2013). The knowledge is intended to be "generative and suggestive" rather than providing generalisation through falsification (Gaver, 2012).

A design space is a hypothetical space having multiple dimensions. The research areas conflated by the space, in this case e.g. mobile streaming and high-speed mobile networks, are among its pre-defined dimensions. When a design space is uncharted, and its boundaries and dimensions are not defined, there is no way in which we can ask precise questions that we know would span the entire space, or develop general theoretical models, conduct experiments and draw conclusions in the traditional sense of scientific research. The critical problem is that we know too little about the area to suggest which parameters to control and which to vary in a laboratory. Therefore, it is better to conduct initial observations before any stable questions can emerge. In such a situation, one way of investigating is to acquire an example system that is designed with the initial, vague understanding of the design space.
Once such an example is acquired, one can use it to ask questions, form specific hypotheses and conduct investigations such as user studies and technical tests, to obtain concrete knowledge about technical characteristics, user practices and design. The design example in this case is strictly a discrete case that reveals some of the aspects of the design space and serves as a probe into an unknown territory.

We are inspired by the approach of associative design (Juhlin, 2011) throughout this work in general, and at this stage of the process in particular. In such an approach, various research, design and development activities are tightly associated, and participants engage in all of the parts (Engström, 2012). Juhlin suggests associating ethnography, design and evaluation by co-presence for situations where the goal is exploratory rather than solving a predefined problem. Brown & Juhlin (2015) refer to an example given by Latour of associative work in which a senior scientist is sitting in her office. The papers and articles that she has read over the past years are arranged in boxes, discipline by discipline, on a bookshelf. One day her bookshelf falls over, turning all the articles into one big chaotic pile of papers. The scientist starts to clean up. While she is putting the articles back one by one in the right places, holding a paper on recent advances in Internet protocols in one hand and a paper on endocrinology in the other, she makes an association whereby recent developments in digital communication can be used to design a new device for endocrinology. In other words, combination and recombination of various design materials gives rise to new associations, which in turn generate innovation (Brown & Juhlin 2015).

This approach goes hand in hand with a multi-disciplinary research group environment. Diverse competences and knowledge backgrounds of group members are combined and recombined in various ways to generate novel concepts. Thus, collaboration among group members plays a pivotal role in setting the direction of an emerging concept and its refinement. Multiple workshops and brainstorming sessions with group members belonging to disciplines like sociology, interaction design and media studies helped with the conceptualisation of the design space and directed us towards more specific points of investigation within it. The process of conceptualisation in our case is not a stage that has a definitive end before the other activities in the work could start. Rather, it is an overarching blanket activity that goes on in an iterative manner. Each point of investigation is driven by a knowledge interest concerning design, technology and user practices.

3.2 Points of investigation

We imagine investigating a design space to be analogous to mapping out an uncharted area. Probing a point in such an area roughly involves the following four steps: prototyping, technical trials or performance testing, field trials, and proposing solutions. It is important to note, however, that there is no logical way that leads from one point of investigation to the next.

Figure 3-1: Points of investigation in live mobile video interaction design space

This work attempts to probe the space, or map the area out, at two separate points of investigation, i.e.
professional oriented multi-camera live video production and continuous ambient video broadcasts for home décor, where we are interested in technical characteristics, user practices and design (see Figure 3-1).

3.2.1 Professional oriented multi-camera video production

We selected professional oriented multi-camera video production as our first point of investigation. We selected an area where the initial steps, i.e. design and prototyping, had already been executed. We conducted performance tests and field trials to identify new, interesting challenges and propose relevant solutions, as we are also interested in technological aspects of live mobile video interaction. In terms of the investigation of a point, as described earlier, we thus conducted performance testing and field trials, and proposed solutions, using existing prototypes.

Live video broadcasting has largely been used as a way of experiencing remote events as they unfold in real time. Before the arrival of recording technologies like videotape, live broadcast dominated the canvas of TV production. Since the arrival of recordable media in TV production, live video broadcast has been seen only in conjunction with production formats such as coverage of large events, sports television and breaking news (Engström, 2012). In recent times, with the emergence of affordable video equipment and its integration with the Internet, video broadcasting is no longer confined to the TV production industry, and content distribution costs have dropped dramatically. The arrival of mobile broadband network services, e.g. 3G/4G, changed the game further by adding the possibility of accessing the distribution network in mobile and/or remote situations. In the meantime, video production tools have become relatively inexpensive, abundant and more available to the general public. We have also seen the development of technology that allows live video streaming over the Internet using web cameras. These developments, together with the arrival of mobile broadband, i.e. 3G and 4G, gave rise to live broadcasting services like Bambuser, Qik and Ustream, which allow a mobile user to broadcast live videos from virtually anywhere within the reach of mobile networks. Such services have grown rapidly in terms of numbers of active users. Bambuser (www.bambuser.com) reported over 100,000 broadcasts covering the political unrest in Egypt in 2010 alone (Engström, 2012). The emergence of these services and the growing number of users tell us that live video sharing is an increasingly popular activity. With growing technical development, this trend is inevitably leading towards more advanced video production tools.

Although it is argued that widespread user practices around online video broadcasting stem from practices related to snapshot photography rather than traditional TV production (Lehmuskallio, 2008; Engström, 2012), a continuation of professional-TV-like trends of collaborative production with multiple cameras can be observed in emerging applications. The emergence of the aforementioned real-time mobile streaming services created a completely new platform for non-professionals to generate live content. Understandably, most of these services are limited to a single-camera production mode. This production model has the advantage of reduced complexity compared to a multi-camera setting. However, it offers only one viewing perspective to the consumer. It is conceivable that situations may arise for amateur live videographers where more than one perspective is important.
The demand for such storytelling tools, following the production model of professional TV, has been discussed in previous research (Juhlin, 2010). In response, there is an emerging class of tools that enables amateur users to engage in collaborative live video production using mobile devices over 3G/4G networks, modelled on traditional TV production practices. Such applications allow amateur users to produce live videos collaboratively using multiple mobile cameras, in a manner similar to how professional live TV production teams work. While this new class of applications offers new production opportunities for users, it also presents a new set of challenges.

3.2.2 Live mobile ambient video

Our second point of investigation is live mobile ambient video. As discussed earlier, multi-camera visual storytelling tools with an orientation towards traditional TV production are emerging. However, it is hard to expect that wide groups of amateurs will adopt professional-style mobile webcasting, which attains some form of "production-oriented" standard, as a form of leisure. Previous research has acknowledged a number of general characteristics that make video production challenging, and which also hamper its use. Finding an interesting story to tell through live video, and capturing it in the moment, is difficult (Lehmuskallio et al, 2008; Juhlin et al, 2010). The demand for editing material before presenting it also makes the activity cumbersome (Kirk et al 2007; Vihavainen et al, 2011). The ability to capture video "anytime and anywhere" also has privacy-related implications (Lehmuskallio et al, 2008). Visual storytelling in video is a complicated task that involves skills that until recently were exclusive to the professional domain. The inherent complexity of multi-camera live production further exacerbates the problem, as the organisation of teamwork makes the task cumbersome and production decisions become increasingly time-critical (Zimmerman et al., 2007; Juhlin et al, 2010; 2014).

In recent times, a simpler video capture format has also surfaced in parallel, which follows the capture model of snapshot photography. Such a format addresses the challenges mentioned above, but only at the cost of the affordances of the new media. We argue for a third format, inspired by ambient video, that may tackle the problems associated with multi-camera live mobile webcasting without compromising on the leverage it offers. This gives us our second point of enquiry, called live mobile ambient video.

3.3 Specific methods

We model our method specifically on the approach adopted by Östergren (2006) in his dissertation Traffic encounters: drivers meeting face-to-face and peer-to-peer. Östergren argues for using design programs for investigating an unfamiliar design space. Once a design space has been outlined, one needs to conceive a design program. A design program here is equivalent to a strategic plan for how we probe the said space. Such programs not only act as a starting point but also guide the whole process of investigation (Redström, 2001). A design program requires practical design work to initiate and advance the investigation into the space. So the next step is to formulate more specific ideas and working hypotheses that can be addressed in practical work, i.e. by working with design, implementation, testing and evaluation of prototypes (Redström, 2001).
The practical work that went into this investigation can be seen as a collection of activities consisting of designing and building the prototype, performance testing, and field trials.

3.3.1 Designing and building prototypes

In this activity, the aim is to acquire a concrete design example that represents a point of investigation. It involves conceiving the design of a prototype system and its implementation, which further includes building hardware and developing the required software. At the beginning of developing the design concept, we argue that gaining insight into the social domain relevant for the system is highly important, since such an exercise gives us access to empirical data from the field. Such data helps to better understand needs, expectations, desires and emotions in the existing user practice. These insights provide important initial indications for the design. The process of acquiring a design example involves an ideation phase, leading up to a number of design experiments. This is an iterative process in which we design low-fidelity prototypes that guide us to a final, acceptable design instantiation. At this stage, a prototype is deemed stable enough to proceed to performance testing. The ambition here is to obtain a fully functional system rather than compromising with mock-ups. Dummy prototyping may be desirable and helpful in investigations about interaction design and user experience to some extent. However, when the aim is to gain insights about the technical challenges and opportunities involved in the process of building, we need to implement as much functionality as possible (Östergren, 2006).

Prototyping, after the ideation phase, proceeds to the stage where the practical work of building hardware and software begins. The boundary between design conceptualisation and building/programming a prototype is not always well defined, and it is not always a one-time sequential process. There may be several iterations back and forth between design ideation and the building process. The exploratory nature of the investigation also allows us to bring in stimuli from other areas thought to be relevant. We were influenced by concepts such as ambiance, mobility, hybridity, individuality and liveness as guiding keywords during the design exercises. These keywords guided us to focus on the relevant technological materials. While selecting the hardware and software platforms to be used in building the system, it is important to keep availability, flexibility and ease of use in mind, so that the resulting system may be built with less time and effort, and the learning curve for the building tools is less steep.

The activity of building a working prototype that is fit for testing in a natural setting is a complex task and involves selection of ready-made hardware as well as software components, building novel hardware, and programming software modules. It is likely that, while building such a system, one may spend time and effort on one aspect disproportionately relative to the others. To avoid such a state, one must have a clear prioritisation of design goals. An understanding of the match and/or mismatch between design requirements and technical constraints and opportunities also plays an important role in assigning such a priority. This is not a strictly linear progression; it emerges from the process of building itself. As the building process progresses, one learns more and more about the limitations and opportunities offered by the design material. This new knowledge consequently encourages one to revisit the prioritisation of design goals. So the whole routine, from design to building and programming, is a process of an iterative nature, where the result of each step may trigger a revision of another.

We argue for developing systems and applications on Android-based smartphones as the mobile device of choice for capturing and streaming live video data over 4G mobile networks on the media capture side of the system, and for receiving and presenting videos on the decorative media side of the Livenature system. As of January 2015, there were more than 1 billion active Android users around the world (Trout, 2014). The abundance of such devices and their availability for the masses gives credibility to our assumption that streaming-capable devices are becoming more and more affordable. The open-source nature of the Android platform has encouraged a large community of developers and enthusiasts to use the system as a platform for a growing number of community-driven projects. This nurturing ecosystem for software development ensures active support for the latest developments in relevant technologies. These are the factors that encouraged us to use Android devices as our platform of choice.

While building a prototype, it is important to keep the work that goes into building the system balanced and to avoid re-inventing the wheel. In order to keep the development scope achievable within our limited resources we argue, first, for striving to use as much existing technology as possible. For example, we selected an off-the-shelf weather-sensing device to be able to sense weather parameters and to communicate the data to the receiving end. Secondly, we rely on industry-standard streaming protocols for streaming live video data (RTMP, H.264, FLV, RTSP). When programming the software parts of the system, we prefer to rely on available knowledge of pre-existing, relevant open-source solutions. Working with standard protocols and existing open-source solutions reduces complexity significantly and ensures the resulting solution will work and can be tested against extant systems that are known to work according to the same specifications (Östergren, 2006). Thirdly, we argue for using programming languages and tools that are known to have been used in relevant development, so that it is certain that relevant support exists. For example, we selected Max/MSP/Jitter as the programming language for developing the video and sensor data mixing application to be run as the mixer server. Max/MSP/Jitter is a fifth-generation visual language well suited for real-time video and audio editing.
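As an illustration of the second point above (building on standard protocols rather than inventing new ones), the sketch below publishes a test clip as an H.264 stream in an FLV container to an RTMP ingest point by wrapping FFmpeg. It is a minimal sketch under stated assumptions: FFmpeg is installed on the machine, and the input file and server URL are placeholders rather than actual Livenature components.

```python
# Minimal sketch: publish a local test clip over RTMP (H.264 video in an FLV
# container), the kind of industry-standard pipeline discussed above.
# Assumes FFmpeg is installed; the input file and RTMP URL are placeholders.
import subprocess

INPUT_FILE = "test_clip.mp4"                   # placeholder test material
RTMP_URL = "rtmp://example.org/live/stream1"   # placeholder ingest point

subprocess.run([
    "ffmpeg",
    "-re",                   # read input at its native frame rate (live-like pacing)
    "-i", INPUT_FILE,
    "-c:v", "libx264",       # encode video as H.264
    "-preset", "veryfast",
    "-tune", "zerolatency",  # favour low end-to-end delay over compression efficiency
    "-c:a", "aac",
    "-f", "flv",             # FLV container, as expected by RTMP servers
    RTMP_URL,
], check=True)
```

Because the stream follows these widely implemented specifications, it can be consumed or re-mixed by any standard RTMP-capable player or server, which is exactly the interoperability benefit argued for above.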
3.3.2 Performance testing

When the basic functionality of a prototype is achieved, and before putting it in the field for user trials, it is important to test its performance. The performance test generally aims to confirm the functionality of the system under field-like circumstances. As an objective of this study is the investigation of a design space from a technical perspective, performance testing becomes important here, and is preferred over simulations and laboratory studies. The results of such tests are particularly interesting when the aim is a technical investigation. Such experiments provide access to important technical data that may be interesting in its own right, and/or be used in comparisons with data collected during user tests to draw important conclusions. In some cases, performance tests are conducted for the sake of collecting technical data to uncover newly raised underlying challenges.
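As an example of the kind of technical data such a performance test yields, the sketch below summarises per-stream end-to-end delay and jitter (the standard deviation of delay, following the definition discussed in section 2.2) from logged frame timestamps. It is a hypothetical sketch: the log format and field names are assumptions, and it presumes that sender and receiver clocks are synchronised, e.g. via NTP.

```python
# Illustrative sketch: summarise end-to-end delay and jitter from a performance
# test log. Assumes each logged frame has a capture and a display timestamp
# taken from clocks synchronised between sender and receiver (e.g. via NTP).
from statistics import mean, stdev

def delay_stats(samples):
    """samples: list of (capture_ts, display_ts) pairs in seconds."""
    delays = [display - capture for capture, display in samples]
    return {
        "mean_delay_s": mean(delays),
        "jitter_s": stdev(delays) if len(delays) > 1 else 0.0,  # std dev of delay
        "max_delay_s": max(delays),
    }

if __name__ == "__main__":
    # Hypothetical measurements for one camera stream during a field-like test.
    log = [(0.00, 1.31), (0.04, 1.42), (0.08, 1.35), (0.12, 1.56), (0.16, 1.38)]
    print(delay_stats(log))
```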
In some cases, performance tests are conducted for the sake of collecting technical data to uncover newly raised challenges.

3.3.3 Field trials

Once the prototype has passed the performance tests, it is safe to install it in situ and evaluate it with users. This stage is analogous to deploying a probe in previously uncharted territory. The actual setting of a field test varies depending on the context of use for the system. For example, for a prototype like Livenature, where the aim is to let the user have an ambient and continuous experience of elevated connectedness to their cherished place, the field trial must involve a setting where the system is installed in the user's living place, and at their specific cherished place, for an extended period of time. For the MVM and IBS, the field trials are required to be conducted in the context of some event, e.g. a skateboard contest or a music concert. It is challenging to find willing participants for the field trials because such trials mostly tend to obstruct the experience of the situation. For example, trials of MVM in the field with skateboarding teenagers would get in the way of their experience of the activity itself. The same can be said about Livenature, since the kind of trials needed for such a system requires a longer-term commitment from the user's side. It demands their deeper and extended involvement in the test. This guides the limits on the scope and length of the tests. Different means of data acquisition are applicable in various situations. For example, video is best suited for cases where details of the activity and the situation under study are important, and there is not enough time to capture everything by relying on the participant's memory, such as in the case of our initial user feedback study of MVM with skateboarders. In such situations, video recording allows one to revisit the situation and observe in detail what is going on. On the other hand, audio-recorded interviews conducted after the experience itself are more suitable when overall reflection on the experience is important, as in the case of the Livenature study.

3.4 Ethical considerations

Ethical considerations in research pertain to protecting the interests of the participants, ensuring voluntary participation after informed consent, and integrity in conduct (Denscombe, 2011). While conducting user studies, we made sure that the participants were aware of the nature of the respective studies so that they were in a position to make informed decisions. The participants were given a description of the research and its aims before each study. In instances where audio or video recordings were involved, participants were explicitly informed about it. In order to protect their privacy, their identities were not revealed in any of the studies and we referred to them by fictitious names.

4 Live mobile video interaction

The method described in the previous chapter, when applied to the design space of live mobile video interaction, resulted in several detailed studies that belong to two separate points of investigation. This chapter describes the studies belonging to each point.

4.1 Professional production oriented multi-camera video

As described in the previous chapter, we started investigating this point in the design space by selecting pre-existing prototypes called IBS and MVM. We performed the following studies using these prototypes.
4.1.1 Synchronisation and delays in multi-camera production This section describes our attempt to explore underlying technical challenges posed by applications modelled on professional collaborative video production. As described in section 3.2, there is an emerging class of applications that enables amateur users of collaborative video production using mobile devices over cellular data networks. As with any emerging technology and user practice, such systems come with their own new challenges. Considering that video streaming in such applications relies solely on the Internet via mobile communication networks, we discovered problems related to inherent communication delays involved in such networks. In this production format, the applications are modelled on professional TV production environment. The team structure and nature of collaborative work involved is essentially also similar to professional systems. These applications involve three user roles: camerapersons, a director, or producer, and viewers (see Figure 4-1). The terms director and producer are used interchangeably in this text. Camerapersons carry mobile phones and film the object of interest. Currently, such systems support up to four different live feeds. The director sits at the control location viewing these live feeds on a mixer console. This typically shows all the live feeds at the same time in separate windows, allowing the director to ‘‘multi-view’’ all available live 36 content. The task is then to decide, on a moment-by-moment basis, which camera to select for the live broadcast. The viewer consumes the final video output in real time, based on the director’s selection. There is also a feedback channel for real-time one-way communication of instructions, alerts and cues that the director may need to send to camerapersons. Figure 4-1: User roles in a collaborative video production setting In a professional TV production environment, there are always a few seconds of delay between the event and its presentation on the viewer’s receiving device. This is caused by the communication delay experienced by data signals in TV transmission network; its details vary depending on the kind of transmission network being used in the content distribution. In live TV transmission, such a communication delay is almost never experienced as a problem by viewers due to separation between the event that is being covered and the site of its consumption. In most cases there is no way in which the viewer can compare the temporal difference between the occurrence of the event and its presentation on the receiving device, e.g. a TV set. However, in an actual production situation where the team is at work to produce live content collaboratively, demand for minimum delay and tight synchronisation between different camera feeds is very high. Professional production teams use specialised hardware to keep the camera feeds synchronised; high-speed dedicated media helps to minimise delays. In the domain of mobile live collaborative applications, maintaining a low communication delay and high level of synchronisation among multiple camera feeds becomes even more of a serious challenge, since such applications rely completely on mobile broadband networks and the Internet for video transmission from production and broadcasting to consumption. These issues can be attributed to two reasons. First, data transmission experiences larger delays since we are relying on mobile data networks as opposed to dedicated media used in professional systems. 
Figure 4-2: Mixing interfaces of IBS (left) and MVM (right). Graphics by Engström, A.

Secondly, delays in each camera feed may be different to the others due to the architecture of the Internet. This inequality in delays causes a lack of synchronisation among camera feeds. The problem of large delays and a lack of synchronisation may affect the task of video production negatively. We provide an investigation to unpack the details surrounding delays and the synchronisation problem in such applications, as a contribution to the overall understanding of the live mobile video interaction design space, where these innovations take place. In the spirit of the research through design method, where knowledge is embedded in the design exemplar, we avail ourselves of prototype systems to conduct studies in order to obtain indications about how these problems affect the production process. We selected two existing prototypes, i.e. the Instant Broadcasting System (IBS) and the Mobile Vision Mixer (MVM), for this purpose (see Figure 4-2).

IBS (Engström, 2012) is a mobile video production system that can be seen as a miniaturised version of professional live TV production systems. It allows amateurs to produce live video broadcasts in real time using live streams from multiple mobile cameras over a mobile data network. It supports up to four camera feeds. Live broadcasts are produced in teams, as is done in the professional domain, with camerapersons using mobile phones as cameras. Each camera-phone sends a live feed of the event that is being filmed to a mixing application, called the IBS-node, on a laptop computer. The IBS-node presents all four live feeds that it receives from the camera-phones together. The user who acts as a director uses the IBS-node. He/she can select any of the four feeds to broadcast in real time. The IBS-node also provides other live mixing functionalities like instant replay, visual overlays, graphic effects etc. MVM offers a more minimalistic version of mobile live video production with a similar design philosophy. It allows all the members in a production team to be mobile. In this application, the director receives four live camera feeds from mobile camerapersons on his/her mobile phone. The mixing application on the phone presents the four live feeds together and allows the director to select any of these feeds in real time to be broadcast. Thus the director can cut from one camera perspective to another in a live broadcast. Due to limitations of the mobile platform, the mixing application in MVM does not provide extended options such as replay and video effects.

We conducted two kinds of studies to investigate delays and synchronisation issues in such applications. First, we performed an initial user feedback test by employing an ethnographic method of enquiry in which we filmed and observed the users while they were using the system. Secondly, we conducted technical tests to measure the delays in the two systems, i.e. IBS and MVM. We identify two problems specific to such applications. First, end-to-end delays, which in professional systems are of no consequence because of the separation between the event and the production environment, turn out to be a source of confusion in mobile systems. Such delays are problematic here since the director may choose between looking at the event itself and at the video feeds of it when making broadcast selections. The time for the actual selection of a cut, as decided by looking at the event itself, is not aligned with the video feeds presented in the system.
Secondly, if all the cameras are filming the same event from different angles, which is likely in a collaborative production, inter-camera asynchrony also becomes a serious issue. The initial user feedback study identified a new problem, which was not present in professional TV production. The mobile nature of technology allows the director to be at the site of event and be co-present with the camerapersons. In professional TV production, the director is always off-site sitting in a mixing facility. We term such kind of mixing as “out-of-view mixing”. In this scenario, he/she only has visual access to the event that is being filmed mediated through screens in his mixing console. The mixing console is a panel of multiple screens that shows live camera feeds received from the cameras at work. The mobile character of the systems in question provides the director an opportunity to be at the site of event. The director is able to observe the event directly unmediated as it unfolds, in addition to accessing its mediated version through his mixing console. We term this way of conducting multi-camera production ‘‘in-view mixing.’’ The delay, in this case, between what is happening and what is presented on mixer console becomes visible. We argue that these two mixing modes i.e. “in-view mixing” and “out-ofview mixing” are distinct and have different demands concerning delays and synchronisation. In the former mode, the director is producing/mixing the live streams sitting away from the actual event that is being filmed and he/she can only see the event through the camera feeds that are presented to him/her in the mixer console, as is done in professional live TV. In this case, the director cannot notice delays between camera feeds showing the event 39 and the event per se, as he/she does not know when the actual event is taking place in time. Synchronisation among streams and smoothness of video presentation is of high importance here, because it affects multi-viewing, thus affecting the director’s mixing decisions. In the in-view mode, the director is present on the site of event and he/she can observe the event directly as well as through live camera feeds in the mixer console. In this case, the director can notice a delay between camera feeds and the event; thus high delays cause problems with production of a consistent live broadcast and cannot be tolerated. Compared to out-of-view mixing, synchronisation between streams still has great importance. However, smoothness may be compromised since the director can directly see the event. In such applications, synchronisation can be achieved following two steps. First, we need to ensure that there is a way to compare the feeds temporally. This could be done by marking them with a common point of reference, such as by identifying common audio features etcetera or by means of synchronised time stamps. The next step is to introduce techniques to align the feeds, either by buffering at the receiving mixer side or by dropping early frames at the receiver. The first approach provides high synchronisation and a smooth video, however, with a larger delay because of the extra buffering. The second approach ensures less delay in achieving synchronisation, although at the cost of smoothness. As has been discussed, in an ‘‘in-view mixing” setting, delays and asynchrony are quite intolerable as they confuse the director and affect his/her production decisions. 
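To make the two alignment approaches concrete, the sketch below contrasts them for a set of feeds whose frames carry synchronised capture timestamps. The buffering variant delays every feed to a common playout point behind real time; the dropping variant always shows the newest available frame and discards any backlog. This is an illustrative sketch, with an assumed playout delay, not the mechanism implemented in IBS or MVM.

```python
import time
from collections import deque

class BufferingAligner:
    """Delay all feeds to a common playout point behind real time.
    Feeds stay synchronised and playback is smooth, at the cost of the
    deliberate buffering delay."""
    def __init__(self, num_feeds, playout_delay=2.0):
        self.queues = [deque() for _ in range(num_feeds)]
        self.playout_delay = playout_delay            # seconds of added buffering

    def push(self, feed_id, capture_ts, frame):
        self.queues[feed_id].append((capture_ts, frame))

    def frames_for_display(self, now=None):
        """Return, per feed, the newest frame captured before the playout point."""
        playout_time = (now or time.time()) - self.playout_delay
        out = []
        for queue in self.queues:
            frame = None
            while queue and queue[0][0] <= playout_time:
                frame = queue.popleft()[1]            # advance up to the playout point
            out.append(frame)
        return out

class DroppingAligner:
    """Always show the newest frame of each feed and drop anything older.
    Minimal added delay, suitable for in-view mixing, but playback is less smooth."""
    def __init__(self, num_feeds):
        self.latest = [None] * num_feeds

    def push(self, feed_id, capture_ts, frame):
        # Keep only the most recent frame; late-arriving older frames are discarded.
        if self.latest[feed_id] is None or capture_ts > self.latest[feed_id][0]:
            self.latest[feed_id] = (capture_ts, frame)

    def frames_for_display(self):
        return [entry[1] if entry else None for entry in self.latest]
```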
The two temporal alignment techniques above represent two different priorities in the trade-off between delay and smoothness. Smoothness may be compromised in case of in-view mixing, as the director can also see and observe the event itself. As the frame dropping technique ensures a shorter delay in the streams on the mixer console, this is quite suitable for scenarios where the director is mixing and producing live videos while looking directly at the event. In the “out-of-view mixing” case as discussed above, delays are tolerable. However, as the director solely relies on the mixer-console to make production decisions, video quality and smoothness are much more important. In such a situation, synchronisation techniques with buffering are more suitable. 4.1.2 Frame-rate exclusive synchronisation Buffering-based techniques for synchronisation (Shepherd et al, 1990; Escobar et al, 1994) are a well-researched area. There is a number of choices of techniques that are potentially suitable for “out-of-view mixing”. Still, there are only a few synchronisation techniques that are suitable for the in-view mixing case. The existing approaches employ a combination of transmission control techniques, frame skipping and duplication along with other buffer 40 control mechanisms, which introduce additional delay. Variable bitrate encoding (VBR) is also not good for our delay-sensitive application as it takes more time to encode due to the increased complexity of the process. We propose an algorithm for synchronisation called “Frame rate Exclusive Sync Manager” (FESM) that relies solely on frame rate adaptation to handle synchronisation among multiple live streams with bandwidth fluctuations. This method completely avoids buffering, and thus provides synchronisation with minimal delay. The downside is that the video playback loses smoothness in the mixer console when the frame rate is dropped to handle synchronisation. As we focus on a specific scenario in collaborative live video mixing systems where the director is present at the filming location, this drawback is not believed to affect the director’s work. We evaluated the proposed algorithm by performing simulation tests. The results showed that the algorithm handles synchronisation with an average recovery time of 3.5 seconds (Mughal et al, 2014). This simulation study indicates the potential in the concept and unpacks the influence of different parameters involved in recovering synchronisation in this manner. However, implementation is needed to demonstrate the performance of the proposed solution in the real mobile networks, as well as to understand how long the synchronisation recovery time could be tolerated in order not to influence the director’s decisions in the in-view mixing scenario. 4.2 Live mobile ambient video The second point of investigation is called live ambient video, as described in the methodology chapter. This section is dedicated to covering the studies that we performed to explore challenges and opportunities, in terms of design and technology, offered by a combination of multi-camera mobile webcasting, sensor networks, and home décor. 4.2.1 Ambient video format and mobile webcasting As we have discussed earlier, When mobile-based multi-camera live webcasting applications are employed to produce content in a professional fashion, the demand on finding relevant content, and performing coordinated and complex teamwork, makes such a kind of video production too demanding for non-professional use. 
On the one hand, the development of quicker and simpler capture models based on snapshot photography may address challenges related to multi-camera live webcasting by limiting use to a single camera and restricting the content's length. On the other hand, such applications achieve this at the cost of the affordances offered by multi-camera live webcasting. We propose a complementary approach for mobile webcasting that employs multi-camera webcasting to generate a novel variation of ambient video for home décor. Ambient video is defined in previous literature as a form of art presented on high-resolution displays (Bizzocchi, 2003; 2006; 2008) showing slow, continuous and aesthetically pleasing content with captivating transition effects. Its prime characteristics are to be "pleasant", visually interesting and capable of supporting occasional close viewing. The content of an ambient video changes slowly during a given time interval. Preferred content includes views of natural scenery, e.g. clouds in the sky, since such elements usually involve slow and gradual changes that allow longer and closer examination.

We argue that the ubiquitous presence of mobile devices capable of high-quality live video recording and streaming makes mobile webcasting more and more available. Thus the combination of live webcasting and the ambient video format becomes viable. This combination may address some of the challenges associated with multi-camera video production for the following two reasons. First, video production is a time-consuming task that sometimes interferes with other practices at hand. Live ambient video requires only initial attention when placing the cameras, and will then provide continued meaningful broadcast over a long time. Secondly, the ambient format provides guidelines for what to record, and makes the selection easier, since the prolonged broadcast makes it possible for the producer to be the same as the viewer. This addresses the problem of finding interesting content for live broadcast.

The suggested mix of technologies brings new opportunities to the area of ambient video. It introduces mobility and affordability in data acquisition that make it possible for many to produce content. Live broadcast is also considered an integral part of television, and the immediacy of this media has a "magical" component, which explains the value it is assigned among general viewers (Scannell, 1996; Juhlin et al., 2010). The fact that cameras are wireless enables content to be recorded from anywhere within the reach of a mobile network and allows an extended freedom for camerawork. This makes mobile broadcasting different to earlier webcam-based technologies. Such freedom of movement allows users to select sceneries of their personal interest, e.g. from a cherished place, for broadcast. Accessibility of other real-time sensor data sources supports production of hybrid media, i.e. "the simultaneous appearance of multiple media within the same frame" (Manovich, 2007). Hybrid media has been becoming the norm in moving-image culture since the 1990s, with cinematography, graphics, photography, animation and typography being combined in numerous ways. With the advances of sensor technology, such opportunities also emerge in mobile webcasting. Moreover, we argue that the abundance of mobile technology can also be employed for developing ambient interfaces not only for art galleries, but also as home decorations (Anderson, 2011; Meese et al., 2013; Miller, 2013).
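As a simple illustration of such hybridity, a real-time sensor reading can be normalised and used to drive the intensity of a visual effect applied to a live stream, so that sensor data and video appear within the same frame. The parameter ranges below are illustrative assumptions, not calibrated values.

```python
def normalise(value, lo, hi):
    """Clamp and scale a sensor reading into the 0..1 range."""
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)

# Illustrative mappings from weather parameters to effect intensities (0..1).
# The value ranges are assumptions made for the sketch, not measured constants.
EFFECT_MAPPINGS = {
    "humidity": lambda v: normalise(v, 20.0, 100.0),      # % RH   -> e.g. saturation level
    "wind_speed": lambda v: normalise(v, 0.0, 20.0),       # m/s    -> e.g. motion effect
    "temperature": lambda v: normalise(v, -15.0, 30.0),    # deg C  -> e.g. colour warmth
}

def effect_parameters(sensor_readings):
    """Turn a dict of raw weather readings into per-effect intensities."""
    return {name: fn(sensor_readings[name])
            for name, fn in EFFECT_MAPPINGS.items() if name in sensor_readings}

# Example: effect_parameters({"humidity": 65.0, "wind_speed": 4.2})
# -> {"humidity": 0.5625, "wind_speed": 0.21}
```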
42 Over the course of this study we designed a fully functional prototype called Livenature following a research through design (RtD) approach, where the process of building prototypes is considered a form of enquiry and generated knowledge is embedded in the acquired design artifact (Zimmerman, 2007). The design instance acquired through such a process indicates the potential of combining technologies and users’ interest, to realise a novel concept. It includes making explicit theoretical influences, articulating design considerations, and in this case providing early technical feedback and lessons learned. We started this process with three field studies aiming at people’s relationships for a particular place they cherish. Insights gained through the fieldwork coupled with the study of the technological context of live webcasting and theoretical understanding of aesthetic interaction in homes instigated system design explorations. The design process involved several workshops and brainstorming sessions with group members belonging to fields of interaction design and sociology. These activities generated several design ideas. We tested these ideas through iterations of experimentation with lo-fi prototypes and their evaluation through pilot studies. This led us to a fully functional prototype system called Livenature. The final prototype was deployed in a context of a “test-apartment” in collaboration with an international furniture manufacturing company. Livenature’s design is influenced by the idea of people’s emotional connection with a geographical place, to which they have occasional access. They “dream” about such a place when they are away from it. We term such a place as a “cherished place”. It is separate from where they live, although they revisit it on a regular basis. We started off with a set of study visits in Sweden to interview people who might be representative of a section of population having such a connection with a place. We visited a small island named Bastuholmen in Stockholm archipelago, an island called Småskär in Luleå archipelago, and a ski resort in Jämtland County. We conducted short interviews with 20 people in the location of their respective cherished places. The interviews contained open-ended questions regarding their relation to the place and how they imagined it when they are not at the location itself. We presented the interview results to the rest of the team as transcripts and photographs. If we describe the results in general terms, the participants supported our initial hypothesis of them having a cherished place. On a more detailed level, several of the participants envisaged the cherished place in their mind’s eye when away. The respondents reflected what they saw as well as when they saw it. It appeared in their mind’s eye when they were longing for it, when they needed to relax as well as for practical reasons. Their visualisations are diverse, some being related to scenery of landscapes, experiencing sunny days, and looking at open seas while others involve people engaged in different activities. The description of the respondents’ imaginations presented here is in no sense an overview of such contemplations. 43 However, we argue that they are sufficient to inspire the design of a system that attempts to encapsulate aspects of such fancies and support an enriched remote experience of a cherished place. We used ambience, liveness, individuality, hybridity and aesthetics as guiding keywords during the design process of Livenature. 
A number of ideas were generated during the ideation phase, three of which we developed into lo-fi prototypes. We installed these prototypes in our workspace to run small pilot studies with visitors and research colleagues. Thus we narrowed down to one concept, i.e. Livenature, which then took its final shape over a number of iterations. It consists of the following integrated components: a media capture device that is responsible for recording live video streams and weather data at the cherished place, a communication infrastructure that transports the captured content from the cherished place to the user's home, and decorative media in the home that displays the captured video and weather data combined in an aesthetically pleasing and ambient manner. A component called the interaction manager is also part of the system (see Figure 4-3, Figure 4-4).

Figure 4-3: (a) camera phones on the mobile stand, (b) interaction manager

It enables, as apparent from its name, the user to interact with the system. While designing the system, we imagined the media capture part to consist of a set of mobile phones streaming live views continuously. We used the Samsung Galaxy S4 Active, a device that is IP67 rated for water and dust resistance. We mounted four of these phones onto a custom-built "stand" attached to a pole (see Figure 4-3a). The phones have to be plugged into the mains, as they are required to stream video data persistently. The mobile phones stream live video to a streaming server using an FFmpeg-based Android application that we developed. When it comes to what kind of visual material should be covered, we argue for it being abstract and poetic in nature, acting as a trigger that encourages the user to dream about a cherished place in their mind's eye, rather than providing a visual "replica" of the place. Therefore we envisioned views of the sky and clouds in particular to be captured in live streaming from the cherished place. This approach would also solve privacy-related issues involved with unsupervised continuous live streaming. We included other real-time weather-related sensor data and presented it in combination with live video streams in the home environment through the decorative media. We attached a weather station to the mobile stand for this purpose. It measures temperature, humidity, atmospheric pressure, wind speed and wind direction, and transmits this data over the Internet to the decorative media via a server. The data captured at the cherished place is transported and processed using a communication infrastructure that includes mobile data networks, a mixing server and a streaming server. The system also allows user interaction via an interaction manager that enables users to choose between camera feeds and weather parameters to be presented in the home environment.

Figure 4-4: Livenature architecture

Figure 4-5: (a) large display, (b) small screen, (c) windmill

Through a series of design experiments, the final design of the decorative media consists of a set of four small digital displays and a large display (see Figure 4-5). For aesthetic and decorative reasons, it is important that the screens do not look like computational components. Therefore, all the screens were given a "picture frame" appearance and were spread in different places in the home and blended in with other home decoration items.
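The interaction manager described above is essentially a thin remote control for the mixing server. The sketch below shows one way such command handling might look; the JSON message format and the port number are assumptions made for illustration and do not describe the actual protocol used between the Livenature components.

```python
import json
import socketserver

class MixerState:
    """State that the mixing server exposes to the interaction manager."""
    def __init__(self):
        self.selected_camera = 0              # which of the four feeds goes to the large screen
        self.weather_effects_enabled = False  # whether sensor-driven effects are applied

    def apply(self, command):
        if command.get("type") == "select_camera" and command.get("camera") in (0, 1, 2, 3):
            self.selected_camera = command["camera"]
        elif command.get("type") == "toggle_weather_effects":
            self.weather_effects_enabled = bool(command.get("enabled", False))
        return {"camera": self.selected_camera, "effects": self.weather_effects_enabled}

STATE = MixerState()

class CommandHandler(socketserver.StreamRequestHandler):
    """Receive one JSON command per line from the interaction manager over Wi-Fi."""
    def handle(self):
        for line in self.rfile:
            reply = STATE.apply(json.loads(line))
            self.wfile.write((json.dumps(reply) + "\n").encode())

# Hypothetical usage:
# socketserver.TCPServer(("0.0.0.0", 9100), CommandHandler).serve_forever()
```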
Each of the small screens is capable of receiving a live stream via the Internet from a corresponding camera phone installed at the cherished place, looking into the sky above it. The large screen is connected to the mixer server and displays a chosen camera view with visual effects influenced by the real-time weather sensor data from that place. The mixer server receives video data from the streaming server via the Internet and allows users, with the help of the interaction manager, to select any of the four camera feeds to be displayed on the large screen connected to the mixing server. It also fetches live weather data via the Internet from the weather station. When a user selects to see weather information in the selected video feed on the large display, the mixing server applies appropriate visual effects corresponding to the selected weather parameter(s). For example, if the user selects temperature, the mixing server will automatically apply a predefined visual effect on the selected camera feed in real time. The intensity of the visual effect corresponds to the value of the temperature at the media capture location. The interaction manager is developed on Android, and it acts as a remote control that allows the user to select from among the four camera feeds as well as to enable/disable the hybrid visual effects. It communicates with the mixing server over a Wi-Fi network.

To introduce hybridity and emphasise the decorative value and aesthetic experiences, the sensor data collected from the weather station is mapped to different visual effects that can be applied to the large screen. The system maps, for example, the humidity value at the cherished place to the saturation level in the selected video stream. Similarly, temperature, atmospheric pressure and wind speed values are mapped to different visual effects that can be applied to the selected stream on the fly. In order to expand the aesthetic interactions and hybridity beyond mere visualisations, we associated the "sense" of a remote place with a decorative item at home. We fashioned a small decorative object called the interactive windmill, which consists of an Arduino board that receives wind-speed data from the weather station at the cherished place and controls a fan that blows at a paper-made windmill according to the received data. Thus the speed with which this windmill spins represents the actual wind speed at the cherished place. The spinning motion of the windmill indirectly illustrates a sense of the cherished place and is intended to trigger an imagination of that place without presenting formal data.

The implemented system has to meet certain requirements in order to support the suggested blend of ambient video with mobile webcasting. Generation of ambient video requires continuous streaming of both weather data and real-time video for an extended period of time with minimal supervision and maintenance. Home decoration and ambient video require high visual quality, which must at least be equivalent to standard spatial definition (640 x 480) at a frame rate of 24 fps. The system must use an advanced and flexible data compression mechanism for streaming, to minimise the data transmission costs. The system should be mobile and utilise mobile Internet connections for data streaming, since a cherished place can be located outdoors or away from the fixed Internet.
The decorative media part of the system must be able to receive data from weather sensors and video streams from mobile cameras and mix them together in an aesthetical and meaningful way. Ambient video requires glance-based interaction that does not interfere or disturb other activities at hand. We preferred to deploy the system in a natural setting for testing the system against these requirements. We installed the decorative media part of the prototype in a test apartment and installed the media capture system on a balcony of an office building. We conducted a test of two weeks’ continuous operation. The system was running and streaming for 14 days to test the requirement of long-term continuous streaming. The media capture system sustained 13 mm rain in the two-week long period. The lessons learned from this test show that current mobile webcasting technology is an interesting and plausible candidate for live ambient video. Livenature generated two weeks of continuous multiple broadcasts of compressed video, with a spatial resolution of 640 x 480 and an average frame rate of 14.6 fps. However, the design and implementation of this media as home decoration were more challenging. The requirement of the furniture manufacturers’ stakeholders led to unforeseen technical problems, such as unsuccessful charging given the need to conceal adapters, as well as demands on lowered noise levels. 4.2.2 Liveness experienced through ambient video This performance test was followed by an initial user experience study of Livenature, which gives us deep insights into the experience of Liveness. Conventionally, live webcasting almost always serves the purpose of covering some sort of event. The live ambient video format, which arguably is enabled by Livenature, allows a shift of focus from “event” to the experience of “liveness” itself. In media studies it has been noted that the experience of liveness has a magical “now” quality (Reeves et al., 2005). It has for long been a topic of interest in media studies to understand this elusive quality (Friederichs-Büttner et al., 2012). The advent of sophisticated mobile cameras and ubiquitous wireless sensors make live content more diverse and accessible for production as well as consumption. This turns personalisation of liveness experiences into a possibility (MacNeill, 1996). The HCI research community also has shown a growing interest in the experience of liveness, especially in the kind of experiences that provide a “real sense of access to an event in its moment by moment unfolding” (Reeves et al., 2005; Hook et al., 2012). As it has been pointed out, the liveness experience has been articulated in contemporary work mostly in the context of an “event” that is to be covered. If we decouple liveness from the “event” character, there are poten48 tials for the design of liveness experiences that are beyond the concept of “real-time”. When we say that there is a magical quality in “liveness”, it is not strictly about “real-time”. The former emphasises experiential qualities; the latter highlights time constraints, i.e. between an action and system response. The heterogeneity of the experience of “now” as opposed to being strictly a measure of time makes liveness an exciting concept. In traditional broadcasts, live is an important viewing format since it retains qualities like immediacy and authenticity. We conducted an interview study for gaining an insight into user experience of liveness through this system. 
As described previously, the system was installed in a test apartment, which is an ordinary apartment in a residential building furnished and maintained by the research department at a European furniture manufacturer. For this investigation into the experiential dimension of the design space we interviewed four adults, after each of whom had lived in the test apartment with their families and experienced the Livenature system for two weeks. Although the living experience in an apartment on a temporary basis is very different from the home-experience, considering our limited resources, the context of a test apartment is still sufficient for an initial user feedback. The families were interviewed and asked about the location of their cherished place and their feeling thereof prior to their arrival in the test apartment. We then installed the media capture part of the system, i.e. camera phones and weather station, as close as possible to the identified place for each family, which then constantly provided live video and weather data to be presented on decorative media part in the apartment. On the last day of the test period, we conducted a semi-structured interview with the main respondent from each family for the duration of an hour. Semistructured interviews as a data collection technique are practical for making both experience and interactions accessible at an early stage while preserving the user’s privacy. Such has been employed in existing literature related to technology evaluation in the home (Harper, 2012). The interview questions concerned: 1) the use of the system, e.g. when and for how long did they look at the screens, and if they discussed it with other family members; and 2) the experiences, e.g. feelings and thoughts while seeing the visualisations from their cherished places through the system. A set of categories that characterised prominent features in the material was formed. This was achieved following a qualitative approach where these were developed by attending to individual answers and comments as well as to the theoretical understanding of liveness reflected in the existing works. Detailed analysis of the collected data with a focus on the user’s experience in the light of previous liveness theory indicates that liveness, understood as experiences of immediacy and unpredictability, provides captivating experiences. The study extends the understanding of liveness experiences by showing that continuous live content presented in an ambient and aesthetical manner may encour49 age a new type of engagement given the context of use; and that authenticity is not an inherent quality in live media, but occurs through the actions of acquiring authenticity. We also discovered transcendence as a quality of liveness, unarticulated in previous research, which seems to bring an important experience to everyday life. 4.2.3 Resource efficiency in live ambient video The deployment and performance testing of Livenature, in the first study for live ambient videos, revealed that the media capture part of the system consumed approximately 28 W of power. The system generated 40 GB of data per day in the form of live video streams. This resource-intensive nature of the system is linked to the requirements of continuous uninterrupted connectivity for live ambient videos. Such a high resource consumption presents an interesting technical challenge. The data capture part of Livenature is supposed to be installed in a user’s cherished place. 
Such a place is unlikely to have a broadband Internet connection or to be connected to the power grid. Therefore, such a video format must be designed in such a way that it consumes power and network resources efficiently without compromising the live ambient nature of the media. We investigated how we can adapt live ambient video formats for better efficiency in terms of energy and network resources, employing a non-conventional video form factor and image analyses coupled with "duty cycling"-like techniques. We attend to the live streaming part of the system in order to achieve efficient live ambient videos, as it accounts for the most significant part of the resource consumption in the system.

In an attempt to achieve a resource-efficient live ambient video format, we propose the incorporation of a number of modes of operation for the Livenature system. We built a revised and optimised version of the Livenature system. We will refer to this revised system as eLivenature for the sake of clarity. We enabled the system to respond to the sunrise and sunset times at the location and stream live videos only in the daylight hours. It turns on one hour before sunrise and turns itself off an hour after the sun sets. The modes of operation that the system affords are: live mode, smart-picture mode, picture mode and masked mode. The system reacts to the user's presence in the living room where the decorative media part is deployed. The system operates in live mode or in smart-picture mode, depending on the user's preferred configuration, as long as the living room is occupied. The system switches to one of the low-power modes, i.e. picture mode or masked mode, as soon as the living room becomes unoccupied. The specific low-power mode at a given time is determined by the user's configuration. When the system is in live mode, it operates with no regard for resource efficiency. In smart-picture mode, the system takes a photograph from the media capture side every five minutes. An image analysis algorithm then decides whether the captured image is interesting or not based on its chromatic properties. If the image is not interesting, it is displayed on the screen as it is. In the case where the algorithm indicates that the captured image is visually interesting, the system switches to live mode. When in picture mode, the system stops the video streams from all four cameras, captures a picture once every five minutes with a selected camera and displays it on the screen in the living room. In masked mode, the system conserves resources by limiting the spatial area of a live video stream. The content in this case is still live, except that the visible area of the video stream is reduced. The ambient nature of the system requires a persistent ambient connection with the cherished place. The aforementioned low-power modes maintain a connection in the background, albeit a semi-live one. We argue that the picture mode is an intermediate state between still photographs and live video streaming, as at any given time the presented photo in the living room is not older than five minutes. We explored the possibility of using a non-conventional video format in masked mode to reduce resource consumption.

The main contributions of this study are as follows. We extended the Livenature system to make it resource-efficient by employing presence-aware adaptive modes of operation; a sketch of this mode-switching logic is given below. In doing so, we explored the use of an image analysis algorithm for aesthetic assessment of an image.
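A minimal sketch of this mode selection follows. It assumes hypothetical helpers for presence detection, sunrise/sunset lookup and the chromatic analysis of the latest snapshot; the saturation threshold is an illustrative assumption rather than the value used in eLivenature.

```python
from datetime import timedelta

OFF, LIVE, SMART_PICTURE, PICTURE, MASKED = "off", "live", "smart_picture", "picture", "masked"

def within_streaming_hours(now, sunrise, sunset):
    """Stream only from one hour before sunrise until one hour after sunset."""
    return sunrise - timedelta(hours=1) <= now <= sunset + timedelta(hours=1)

def looks_interesting(snapshot, saturation_threshold=0.35):
    """Stand-in for the chromatic analysis: treat a colourful sky as 'interesting'.
    `snapshot` is assumed to expose a mean_saturation() helper; the threshold is
    an assumption made for the sketch."""
    return snapshot.mean_saturation() > saturation_threshold

def select_mode(now, sunrise, sunset, room_occupied, prefer_live, prefer_masked, latest_snapshot):
    """Choose the operating mode for the next interval (e.g. the next five minutes)."""
    if not within_streaming_hours(now, sunrise, sunset):
        return OFF                                   # no streaming outside daylight hours
    if room_occupied:
        if prefer_live:
            return LIVE
        # Smart-picture mode: go live only when the periodic snapshot looks interesting.
        return LIVE if looks_interesting(latest_snapshot) else SMART_PICTURE
    # Living room unoccupied: fall back to the user's configured low-power mode.
    return MASKED if prefer_masked else PICTURE
```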
We also explored a non-conventional video format with an arbitrary spatial boundary for such systems to reduce resource consumption. We performed an evaluation by conducting experiments with the system in three modes. The results of our evaluation show that, when compared to live mode, the masked mode showed a 65% reduction in network bandwidth usage. However, when it comes to energy efficiency, it consumes only 9% less energy than the live mode. The picture mode saves 99.6% bandwidth and 96.9% energy compared to the live mode.

5 Findings and results

Here we discuss general findings and results from these investigations and see how they contributed to illuminating the space in question. The first part of the investigation revealed the underlying challenges by showing how varying delays give rise to asynchrony among multiple camera feeds and how mixing decisions are affected by lack of synchronisation and delays. It unveiled emerging production scenarios within live mobile video interaction and showed how these distinct mixing scenarios have distinct demands. Finally, these studies offered design suggestions for the new demands arising in such mixing modes and proposed frame rate-based synchronisation. The second point of investigation in our research aims at finding alternative content formats for live mobile video interaction; in doing so we also developed a nuanced understanding of the liveness experience. The related studies inform how mobile webcasting extends the ambient video format and also how live ambient video is a new form of mobile webcasting. While conforming with existing media theory on several points, we argue that there is a "magic" to the experience of liveness; however, it was also uncovered that there is a need to revisit liveness as defined in media theory.

5.1 Varying delays give rise to asynchrony

The initial user feedback study and technical test that we conducted helped us gain a decent comprehension of synchronisation problems and their role in systems like the MVM and the IBS. The systems that are designed with an orientation towards professional standards of live TV production are fairly complex both in terms of technological infrastructure and in the required collaborative teamwork. The technical tests revealed that video transmission delays occur from one point to another in such systems, which may prove to be problematic for the production team at work. However, this is already known from professional-grade systems. It is also known how these delays are dealt with in that environment using dedicated hardware. Such hardware is not available in mobile systems. This makes the problem a demanding one to be addressed in this new technical setting.

In live video streaming applications, end-to-end delay is made up of the following components: 1) sender delay, which is the time it takes to capture and process video data; 2) transmission delay, which is the time taken by a video data unit to travel from source to destination over the network; and 3) receiver delay, which constitutes the time it takes to process video data and display it on the final output screen. The sender and receiver delays will be reduced more and more with the increasing availability of more powerful devices. The transmission can also be optimised in terms of delays by using better-suited transport protocols. Today, applications like IBS and MVM, in most situations, are bound to use TCP-based transport for video streaming.
This is because most mobile carriers do not allow UDP-based video streaming traffic in their networks. TCP-based data transport is not suitable for live video streaming, as it can introduce undesirable delays because of its retransmission mechanisms (Wang et al., 2008). Live streaming protocols that are based on UDP, such as the Real-Time Transport Protocol (RTP), are better suited for this kind of data traffic.

We measured two types of delays in our study: first, the delay that occurs between a mobile camera and the mixer console; secondly, the overall delay between the occurrence of an event and its presentation to the viewer. For the sake of clarity, let us call the camera-to-mixer delay DC-M and the overall delay DO (see Figure 5-1). In this kind of system, DC-M may become a source of confusion for the director and may lead to bad mixing decisions. On the other hand, DO is not as significant, since the viewer cannot perceive it: he/she has no alternative access to the event against which to compare what is received on the viewing device. Nevertheless, if a real-time feedback channel is introduced between the viewer and the director, such a delay can also generate similar problems for the task of production, as described above.

Figure 5-1: Delays in professional oriented mobile video mixing

Furthermore, the DC-M for a camera feed may change over time and may also be different to that of the other camera feeds. Since these applications involve multiple camera feeds streaming over a mobile data network, it is highly likely that each stream experiences a different amount of DC-M delay. This generates the problem of asynchrony when camera feeds are presented on the mixer console for "multi-viewing". In the case that four camerapersons are filming the same event from different angles and streaming the live feeds to the mixer console, the mixer console ends up presenting the event as it occurs from different points in time through each camera feed. This is because each camera feed may experience a different amount of delay. For future work it will be interesting to investigate an acceptable level of delay that does not interfere with the production work, and how it relates to the new mixing contexts allowed by the mobility of such systems.

5.2 Mixing decisions are affected by lack of synchronisation and delays

We found that delay and asynchrony in the video presentation at the director's end cause inaccurate and bad mixing decisions in video production. This severely impairs the final broadcast quality. For example, during the course of video production, the director may need to communicate via the feedback channel with a cameraperson, say, to request following a passing bird. The director does this based on the visual image that is available on the console at that moment. As the available instance of a video stream on the console is a representation of the event that occurred some moments before, by the time a cameraperson receives and understands the director's instruction, the bird would already have flown out of visual range. Thus the director's feedback is rendered meaningless in this particular situation due to the delays involved. We learned from our initial user feedback study that delays also confused the director in the task of cutting from one view to another at the right moment. Lack of synchronisation among different live feeds also leads to similar mixing problems.
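Written out explicitly, and assuming that displayed frames carry synchronised capture timestamps, the two delays and the asynchrony they give rise to can be estimated as in the sketch below; the numbers in the example are invented for illustration.

```python
def camera_to_mixer_delay(capture_ts, mixer_display_ts):
    """DC-M for one frame: time from capture on the phone to display on the mixer console."""
    return mixer_display_ts - capture_ts

def overall_delay(capture_ts, viewer_display_ts):
    """DO for one frame: time from the event being captured to its presentation to the viewer."""
    return viewer_display_ts - capture_ts

def worst_case_asynchrony(per_feed_delays):
    """Asynchrony seen in multi-viewing: spread between the fastest and slowest feed.
    `per_feed_delays` maps feed id -> current DC-M estimate in seconds."""
    values = list(per_feed_delays.values())
    return max(values) - min(values)

# Example (invented values): feeds with DC-M of 1.2 s, 1.9 s, 3.4 s and 2.0 s differ by
# up to about 2.2 s, so the same moment of the event appears roughly 2.2 s apart on the
# mixer console: worst_case_asynchrony({0: 1.2, 1: 1.9, 2: 3.4, 3: 2.0}) ~ 2.2
```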
In a case where two streams that are out of sync in such a way that one stream lags behind the other one by a couple of seconds, the director may find cutting from one feed to another meaningless since the instance of time represented by each stream does not match. 5.3 Specific mixing scenarios have distinct demands Mobility is one of the main features of the video production systems in question. This feature, in combination with the real-time nature of such systems, brings about even more interesting problems. In professional TV production, the director is almost always working while sitting in a production room that is away from the site of event itself. Even in relatively ad-hoc situations, e.g. live coverage of an ice hockey game where the whole production team 55 would have to be co-situated, the director sits in a closed OB-BUS based production room. This maintains a separation between the event per se and the director. So the delays between the event and its presentation on the mixer console are rendered benign as long as all the camera feeds are presented synchronously at the mixer console. In this case synchronization is ensured using dedicated synchronisation hardware. As a director is working away from the site of the event, we term this mixing scenario as out-of-view mixing. On the other hand there exists no such separation any more when extended mobility is introduced into the collaborative production environment. The visual access of the event for the director as it unfolds both via mixer console as well as by looking directly makes camera to mixer (DC-M) delays observable for the director. We call this scenario as in-view mixing. We argue that the mobile live production systems afford both in-view mixing and out of view mixing. In in-view mixing a director can compare what is happening and what is displayed on the console; the delays are visible independent of the level of synchronisation between the feeds. Thus, this kind of mixing demands minimal delays. Shorter delays at the expense of smoothness of video presentation can also be accommodated as the director has a choice of glancing at the event directly too. The “out-of-view” mode is not sensitive to camera-to-mixer delays and the task of video mixing heavily relies on synchronous presentation of camera feeds. So, camera-to-mixer delays can be allowed with high synchronisation. 5.4 Frame rate-based synchronisation As described earlier, in-view mixing and out-of-view mixing have different demands when it comes to handling synchronisation and delay problems. Here we focus on the in-view mixing context that requires high synchronisation with minimal delay. In such a case, the smoothness of video presentation can be sacrificed, as discussed in the previous section. We proposed a synchronisation algorithm specifically for in-view mixing scenario called Frame rate Exclusive Sync Management (FESM) that completely avoids buffering while achieving synchronisation though dynamically adjusting the frame generation rate at the live video source. The frame rate is adapted based on the receiving side’s cues. The evaluation of FESM through simulation unpacked several parameters involved. It provided initial indications for implementation of such solutions. The evaluation indicates that FESM is capable of handling the arising asynchrony between two streams with an average synch recovery time of 3.5 seconds. 
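The core idea behind FESM can be sketched as a sender-side controller that scales the frame generation rate in response to cues from the mixer about how far the stream lags behind the common reference; once the lag is recovered, the nominal rate is restored. The gain and bounds below are illustrative assumptions, not the parameters evaluated in the simulation study.

```python
class FrameRateController:
    """FESM-style sketch: frame-rate adaptation driven by receiver cues, with no buffering."""

    def __init__(self, nominal_fps=24.0, min_fps=4.0, gain=4.0):
        self.nominal_fps = nominal_fps   # rate used when the stream is in sync
        self.min_fps = min_fps           # never drop below this, to keep the feed usable
        self.gain = gain                 # how aggressively fps is cut per second of lag
        self.current_fps = nominal_fps

    def on_receiver_cue(self, lag_seconds):
        """Called whenever the mixer reports this stream's lag behind the reference."""
        if lag_seconds <= 0.05:                      # effectively back in sync
            target = self.nominal_fps
        else:
            target = self.nominal_fps - self.gain * lag_seconds
        self.current_fps = max(self.min_fps, min(self.nominal_fps, target))
        return self.current_fps

# Example: a cue reporting 2 s of lag lowers the rate to max(4, 24 - 4 * 2) = 16 fps,
# reducing the data produced so the stream can catch up; once the lag is gone the
# controller returns to the nominal 24 fps.
```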
Although the simulation study gives us insights into different parameters involved, the real prototype would be required to demonstrate the performance of the algorithm and to gain further understanding of the details regarding recovery time and its impact on the mixing decisions. 56 5.5 Mobile webcasting extends ambient video format As we have described in section 4.2.1, there are multiple problems associated with professional oriented multi-camera mobile video production. These include: high demand on skills, difficulty in finding relevant stories in the moment, requirement of complex teamwork, and concerns about privacy. We explored alternative video formats that would avoid these problems without compromising the affordances that multi-camera webcasting offers. We performed a study to obtain an insight into the concept of a cherished place and how such a place is represented in a domestic environment via imagination, pictures and other artifacts. The process of attaining the final design and then its implementation equally informs us about various affordances and challenges associated with such a concept. The interesting mix of mobile webcasting technology, ambient video and home decoration has sparked a generation of a novel video format. The results from this design research endeavour illuminate the design space by providing a better understanding of the potential of extending ambient video with mobile webcasting technology. The salient characteristics of this newly invented format and supporting system are: being ever available and ready to present content, mobility in the media capture system’s design, and ambient design of the presentation that does not demand attention and supports the notion of glance-based model of consumption. The two-week long performance test that we conducted gives us a good idea of what kind of problems may arise while designing for such an experience. The four media capture system was able to operate unsupervised under weather conditions such as continuous rain and strong wind. It provided continuous live data streams. It is a promising indication of the plausibility of such a design choice in this context. However, the amount of data generated to be transported over the network, being up to 40 GB per day, with spatial resolution of 640 x 480 at average frame rate of 14.6 fps, show that considerations about the efficient use of network bandwidth are important areas to investigate. Ambient video format is characterised by high resolution, which obviously depends on high-quality video streams. Mobile webcasting is still far from providing the video quality that can be achieved in wired networks. However, the recent shift from 3G to 4G networks has made a difference. Since Livenature is designed with the focus on slowly moving natural sceneries as part of investigating the ambient video format, we are able to compromise on the amount of frames per second on behalf of screen resolution. The result shows a good quality on the output screens, except when birds rapidly pass over the cameras. In traditional ambient video production, the videos are preproduced by selected artists for other people to view. Livenature brings 57 liveness and personalisation to the ambient video format. Users can deploy the media capture system to their personal cherished place as the system’s design allows for mobility. We placed components of decorative media, i.e. 
screens and an interactive windmill, with other objects of home decoration, so that the presentation of the captured data streams may support a glance-based interaction that demands only occasional focus. This encouraged interaction with the system displays in the same way as with other decorative objects like paintings and figurines. In conclusion, Livenature's development and the performance tests conducted in the context of a living laboratory contribute to understanding the potential challenges. They also demonstrate the viability of an extended ambient video format with liveness, personalisation and hybridity, which points to a broader use context for such a video format beyond that of the art scene.

5.6 Live ambient video is a new form of mobile webcasting

As argued before, a critical problem is that users of "TV-production" oriented webcasting struggle with the amount of production work and with finding interesting stories to tell. The live ambient format, which we propose, requires little production effort, since one sets up the system only once and leaves it to broadcast for a very long time. The concept of production with Livenature is that the broadcast is automatic and will continue over an extended period, i.e. for months or years. This involves the initial effort of installing the capture system at the cherished place. The custom-built mobile stand ensures it can be done with ease. With the cameras and the weather station in place, the user is also required to administer the broadcast through the interaction manager. Moreover, the design of the system draws on our study of people who have a strong relation to a cherished place. Based on that, we suggest that having live access to such a place would provide a meaningful source of content and a story to tell, e.g. the weather at the location, which will in turn entice imaginations of days past and days to come in relation to the beloved spot. The process of designing, building, deploying and testing the system in situ extended our understanding of the design space in many ways. Every step of the process has embedded knowledge about the underlying terrain of the space. In general, this study informs the potential of sustaining and developing mobile webcasting with the ambient video format. It extends the arena of mobile video production, where traditionally TV production-based webcasting and more recently "snapshot"-based webcasting models are dominant.

5.7 There is a "magic" to the experience of liveness

In addition to the knowledge contributions mentioned above, there are findings and implications to be discussed in the experiential dimension of the system. Following qualitative research principles, we presented a descriptive study of a small number of families' experiences over the period of their stay at the test apartment where Livenature was installed. By and large, the results from the field trial support the basic design idea of combining ambient video with mobile webcasting to trigger users' emotional connection with a cherished place. Participants appreciated the connection with their cherished place and valued its immediacy and unpredictability. The sense of "now" in the mediated sceneries of nature, and unexpected and unanticipated experiences like looking at a bird passing by the camera, were regarded positively. This indicates that existing approaches to experiencing liveness through user-generated webcasting could be extended beyond mediatising events, to account for content of an ambient nature.
The ambient video format is characterised by high resolution, which depends on high-quality video streams. Mobile webcasting is still far from providing the video quality that can be achieved in wired networks, although the recent shift from 3G to 4G networks has made a difference. Since Livenature focuses on slowly moving natural sceneries as part of investigating the ambient video format, we are able to trade frames per second for screen resolution. The result is good quality on the output screens, except when birds rapidly pass over the cameras. In traditional ambient video production, the videos are pre-produced by selected artists for other people to view. Livenature brings liveness and personalisation to the ambient video format. Users can deploy the media capture system at their personal cherished place, as the system’s design allows for mobility. We placed the components of decorative media, i.e. the screens and an interactive windmill, among other objects of home decoration, so that the presentation of the captured data streams supports a glance-based interaction that demands only occasional focus. This encouraged interaction with the system displays in the same way as with other decorative objects such as paintings and figurines. In conclusion, Livenature’s development and the performance tests conducted in the context of a living laboratory contribute to an understanding of the potential challenges. They also demonstrate the viability of an ambient video format extended with liveness, personalisation and hybridity, which points to a broader use context for such a video format beyond that of the art scene.

5.6 Live ambient video is a new form of mobile webcasting

As argued before, a critical problem is that users of “TV production”-oriented webcasting struggle with the amount of production work and with finding interesting stories to tell. The live ambient format, which we propose, requires little production effort, since one sets up the system only once and leaves it to broadcast for a very long time. The concept of production with Livenature is that the broadcast is automatic and continues over an extended period, i.e. for months or years. This involves the initial effort of installing the capture system at the cherished place; the custom-built mobile stand ensures that this can be done with ease. With the cameras and the weather station in place, the user is also required to administer the broadcast through the interaction manager. Moreover, the design of the system draws on our study of people who have a strong relation to a cherished place. Based on that, we suggest that having live access to such a place provides a meaningful source of content and a story to tell, e.g. the weather at the location, which in turn entices imaginations of days past and days to come in relation to the beloved spot.

The process of designing, building, deploying and testing the system in situ extended our understanding of the design space in many ways. Every step of the process carries embedded knowledge about the underlying terrain of the space. In general, this study informs the potential of sustaining and developing mobile webcasting with the ambient video format. It extends the arena of mobile video production, where traditionally TV production-based webcasting and more recently “snapshot”-based webcasting models are dominant.

5.7 There is a “magic” to the experience of liveness

In addition to the knowledge contributions mentioned above, there are findings and implications to be discussed in the experiential dimension of the system. Following qualitative research principles, we presented a descriptive study of a small number of families’ experiences over the period of their stay at the test apartment installed with Livenature. By and large, the results from the field trial support the basic design idea of combining ambient video with mobile webcasting to trigger users’ emotional connection with a cherished place. Participants appreciated the connection with their cherished place and valued its immediacy and unpredictability. The sense of “now” in the mediated sceneries of nature, and unexpected and unanticipated experiences such as seeing a bird pass by the camera, were regarded positively. This indicates that existing approaches to experiencing liveness through user-generated webcasting could be extended beyond mediatising events, to account for the content of ambient nature.

5.8 Liveness theory revisited

In HCI research on liveness, the major focus is often on the merits of immediacy (Vihavainen et al., 2011), which is tantamount to equating the whole concept of liveness with the notion of “real-time” alone. The concept of real-time pinpoints measurable temporal differences between an action and the system’s responses, whereas experiences related to liveness have much more to them and are elusive to articulate. The experience of liveness seems to have a set of characteristics that is heterogeneous in nature. For example, the experience of “now” is plastic: its prime stimulus may have occurred a second, a day or years ago, or it may refer to the mobility of objects that are part of the presentation. Appreciating “now” is thus a heterogeneous experience rather than a distinct measure of time. The users seemed to experience immediacy when they identified the movements of a boat or a bird, although with very different understandings of what “now” means. In media theory, liveness is especially associated with authenticity; in HCI it has been noted that the liveness experience decreases, for example in VJ performances, if media is perceived as pre-produced (Hook et al., 2013), regardless of how “real-time” the digital animation might be. When applied to new domains, we argue that there is a need to revisit the concept of aggregation of liveness experiences. In particular, our study reveals a need to revisit existing associations, such as those of the experiences of engagement and authenticity with the liveness conglomeration.

Traditionally, live content is said to be particularly engaging, and there exists some sort of focused interaction between the viewer and the mediatised content. In our case, the content is presented in a domestic setting, and its presentation and form factor need to be such that they do not demand attention and blend into the home environment aesthetically. The engagement, in this case, is blended into the everyday life taking place in homes in a way that allows varied levels of attention. The viewers showed a tendency towards sporadic yet prolonged engagement with the system over days and weeks. We refer to this as “sporadic engagement”. This invites a broader conceptualisation of the engagement experience in relation to liveness. Furthermore, in the existing media studies literature, the notion of authenticity is thought to be an inherent feature of live “real-time” content. However, this investigation indicates that this might not always be the case, and that the design of liveness experiences also needs to account for authentication work. Our study identifies a form of sporadic engagement and authenticity work by focusing both on the content and on the context of use. Both these extensions are of relevance to providing a nuanced understanding of the liveness experience.

In addition to the incremental contributions mentioned above, we also identify an experience associated with live ambient videos that has not been discussed in theoretical articulations of liveness before. We call it “transcendence”. The participants not only seemed to enjoy live views from a cherished place, but also surpassed these views in their mind’s eye. When looking at the clouds and the sky above their requested cherished place, they started thinking about pleasant memories, other places far away, or plans for the future. In other words, they transcended the mediated connection to a remote place. Liveness “transported” the participants’ minds away from the mediated cherished place to elsewhere in time and space. We argue that recognising the transcendence that emerged from liveness could inspire new design ideas.
5.9 Resource efficiency and live ambient video formats

While the implementation and performance test of the Livenature system threw light on the new opportunities offered by the design space, they also uncovered underlying challenges related to resource efficiency. The fact that the Livenature system consumed 28 W of power and 40 GB of network data per day demanded an investigation into the resource-efficiency aspects of live ambient videos. This led us to conduct a study focused on issues of resource efficiency in the live ambient video format, described in a previous chapter; we refer to Paper V for more details.

The results from this endeavour include an extended version of the Livenature system with multiple modes of operation, characterised by different demands on resources: live mode, smart-picture mode, picture mode, and masked mode. The live mode corresponds to a fully functional mode where no resource conservation is applied; the other modes conserve resources in one way or another. This brings us to our evaluation of the efficient modes, namely masked mode and picture mode. Compared to live mode, masked mode showed a 65% reduction in network bandwidth usage, but only a 9% reduction in energy consumption. According to our experimental results, the most efficient mode, in terms of both energy and network bandwidth, is picture mode, which saves 99.6% bandwidth and 96.9% energy compared to live mode.

While exploring various ways to conserve resources while attending to the specific requirements posed by the ambient nature of the media, we employed a non-conventional spatial form of video stream. Conventionally, video content is presented in a rectangular form. We argue that video with arbitrary shapes can be used in an ambient context; such video formats help conserve network resources while keeping the sense of connection intact. We exploit them in the aforementioned masked mode to make it efficient with respect to network resources.
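To make the notion of arbitrarily shaped video more concrete, the sketch below masks each frame so that only a region of interest carries image data, while everything outside it is flattened to a constant colour that a conventional encoder then compresses to almost nothing. This is an illustrative reconstruction of the general technique under our own assumptions (an elliptical region of interest and plain numpy masking); it is not the masked-mode implementation evaluated in Paper V.

```python
import numpy as np

# Illustrative sketch of an arbitrarily shaped ("masked") video frame:
# pixels outside the region of interest are flattened so that a standard
# codec spends almost no bits on them.

def apply_mask(frame: np.ndarray, mask: np.ndarray, fill: int = 0) -> np.ndarray:
    """frame: H x W x 3 uint8 image; mask: H x W boolean region of interest."""
    out = np.full_like(frame, fill)
    out[mask] = frame[mask]
    return out

# Example: keep only an elliptical region (e.g. the sky above the horizon).
h, w = 480, 640
yy, xx = np.mgrid[0:h, 0:w]
ellipse = ((xx - w / 2) / (w / 2)) ** 2 + ((yy - h / 4) / (h / 3)) ** 2 <= 1.0

frame = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)  # stand-in frame
masked = apply_mask(frame, ellipse)
print(masked.shape, ellipse.mean())  # fraction of pixels inside the region of interest
```

In practice the mask would follow the relevant scenery, for example the sky above the horizon, and the achievable bandwidth saving grows with the fraction of pixels that fall outside the mask.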
6 Conclusion

The ambition of this work is to examine the design space of live mobile video interaction from the perspectives of design, technology and user practices, in order to reveal the underlying challenges and opportunities it affords. To achieve this goal we approached the design space from two separate points of investigation, namely professionally oriented mobile video production and live ambient video formats.

For the first point, professionally oriented mobile video production, we selected the pre-existing prototypes IBS and MVM. We conducted an initial user feedback study coupled with a measurement study to investigate the role of communication delays in such applications. These studies indicate how the lack of synchronisation among multiple video streams becomes a source of problems for the director, and they throw light on how the mobile characteristics put such systems in a unique position, presenting new challenges related to delay and synchronisation compared to professional systems. We identified two modes of video production afforded by these systems, based on the director’s visual access to the event being filmed. We also proposed an algorithm that handles synchronisation relying solely on frame-rate adaptation, for one of the mixing scenarios specific to such systems.

In the second point of investigation, we focused on inventing alternative mobile live video production formats using a research through design approach. This resulted in a functional prototype system called Livenature, which exploits a combination of mobile webcasting and ambient video to induce an emotional connection between people and the places they cherish. The insights gained from the process of prototyping, and from the studies around it, allowed us to uncover experiential dimensions of live ambient video as well as to reveal the underlying technical challenges associated with the media at hand.

In conclusion, this work has contributed new insights into the opportunities and challenges offered by the design space of live mobile video interaction. These include, first, the articulation of the problems associated with professionally oriented mobile video production, a description of the newly available production scenarios associated with them, and the proposition of a frame-rate exclusive synchronisation algorithm; second, the suggestion of a novel video production format, i.e. live ambient video; and third, an extended understanding of the experiential qualities and technical challenges associated with such production contexts.

7 Dissemination of results

In this chapter we list the publications that are part of this thesis. In summary, the thesis builds on five papers, four of which have been published and one submitted. The first two papers are our attempt to chart the landscape of the design space of this thesis, oriented towards multi-camera, professionally oriented video production with mobile webcasting, in terms of design, formats and user practices. The third, fourth and fifth papers take this investigation to the point where ambient video formats and mobile webcasting in combination with home décor are explored.

7.1 Paper I: Collaborative live video production

Mughal, M. A., Juhlin, O. (2013). Context-dependent software solutions to handle video synchronization and delay in collaborative live mobile video production. Personal Ubiquitous Computing, Springer-Verlag, London, Volume 18, Issue 3, pp. 709–721.

This paper presents our initial user feedback study, delay measurement tests and a detailed analysis of synchronisation and delay issues in an emerging class of applications called mobile live video production systems. The author of this thesis was solely responsible for the delay measurement tests and collaborated with others on the writing.

7.2 Paper II: FESM

Mughal, M. A., Zoric, G., Juhlin, O. (2014). Frame Rate Exclusive Sync Management of Live Video Streams in Collaborative Mobile Production Environment. Proceedings of the 6th ACM MoVid '14, Singapore.

This paper is a follow-up to the first one. It proposes a synchronisation algorithm called Frame Rate Exclusive Sync Management (FESM) and presents its evaluation by simulation. The author was responsible for the development of the algorithm and its simulation; the paper was written in collaboration with others.

7.3 Paper III: Livenature system

Mughal, M. A., Wang, J., Juhlin, O. (2014). Juxtaposing Mobile Webcasting and Ambient Video for Home Décor. Proceedings of the 13th International Conference on Mobile and Ubiquitous Multimedia (MUM 2014).
This paper presents our associative design work that resulted in the Livenature prototype, which attempts to entice the emotional connection that exists between people and their cherished place. The paper concerns the design, the prototype and its performance testing in situ. The author was solely responsible for the technical design, prototyping and performance testing. The author was not involved in writing the user study part, and collaborated on the other parts of the paper.

7.4 Paper IV: Liveness and ambient live videos

Wang, J., Mughal, M. A., Juhlin, O. (2015). Experiencing Liveness of a Cherished Place in the Home. Proceedings of ACM TVX 2015.

This paper presents our field trial of the system, which was installed in a test apartment in collaboration with a furniture company in Malmö. The author’s contributions are the deployment and maintenance of the system over the period of eight weeks. The author collaborated with the co-writers in conducting interviews, interpreting the results and writing the paper.

7.5 Paper V: Resource efficient ambient live video

Mughal, M. A. Resource efficient ambient live video. Submitted.

This paper proposes an efficient design for ambient live video by enhancing the Livenature prototype. The proposed solution is evaluated through measurement tests. The paper also explores the use of non-conventional spatial video formats for conserving network resources. The author of this thesis is solely responsible for this work.

7.6 Related Publications

Mughal, M. A., Tousi, R. Delay and collaboration in live mobile video production. Position paper, CHI 2011 workshop “Video interaction – Making broadcasting a successful social media”.

Wang, J., Mughal, M. A., Juhlin, O. (2015). Experiencing Liveness of a Cherished Place in the Home. Proceedings of ACM TVX 2015.

Mughal, M. A., Juhlin, O., Engström, A. Dynamic delay handling in mobile live video production systems. Patent: EP20120186600.

8 Bibliography

Anderson, A. (2011). The ‘New Old School’: Furnishing with Antiques in the Modern Interior—Frederic, Lord Leighton's Studio-House and Its Collections. Journal of Design History, 24 (4), 315–338.

Auslander, P. (2008). Liveness: Performance in a mediatized culture, 2nd ed. New York: Routledge.

Austerberry, D., & Starks, G. (2004). The Technology of Video and Audio Streaming. New York: Elsevier Science Inc.

Baldi, M., & Ofek, Y. (2000). End-to-end delay analysis of videoconferencing over packet-switched networks. IEEE/ACM Transactions on Networking, 8 (4), 479–492.

Bentley, F., & Groble, M. (2009). TuVista: Meeting the multimedia needs of mobile sports fans. ACM MM '09, 471–480.

Bergstrand, F., & Landgren, J. (2011). Visual reporting in time-critical work: Exploring video use in emergency response. MobileHCI 2011, 415–424.

Bizzocchi, J. (2008). Winterscape and ambient video: an intermedia border zone. Proceedings of the 16th ACM international conference on Multimedia (MM '08), 949–952. New York: ACM.

Bizzocchi, J. (2006). Ambient Video. Proceedings of the 2006 ACM SIGCHI international conference on Advances in computer entertainment technology (ACE '06). New York: ACM.

Bizzocchi, J. (2003). The magic window: the emergent aesthetics of high-resolution large-scale video display. The second international conference on Entertainment computing, 1–4. Pittsburgh: Carnegie Mellon University.

Blum, C. A. Practical Method for the Synchronization of Live Continuous Media Streams. Institut Eurecom.

Boronat, F., Lloret, J., & García, M. (2009). Multimedia group and interstream synchronization techniques: A comparative study. Information Systems, 34 (1), 108–131.
Bowers, J., Taylor, R., Hook, J., Freeman, D., Bramley, C., & Newell, C. (2014). Human-computer improvisation. The 2014 companion publication on Designing interactive systems (DIS Companion '14), 203–206. New York: ACM.

Brown, B., & Juhlin, O. (2015). Enjoying Machines. MIT Press.

Cremer, M., & Cook, R. (2009). Machine-assisted editing of user generated content. SPIE 7254, Media Forensics and Security, 725404, doi:10.1117/12.807515.

David, G. (2010). Camera phone images, videos and live streaming: a contemporary visual trend. Visual Studies, 25 (1), 89–98.

Denscombe, M. (2011). The Good Research Guide: For small-scale social research projects (4th ed.). Berkshire, England: Open University Press.

Dougherty, A. (2011). Live-streaming mobile video: Production as civic engagement. MobileHCI 2011, 425–434.

Endoh, K., Yoshida, K., & Yakoh, T. (2008). Low delay live video streaming system for interactive use. The IEEE international conference on industrial informatics (INDIN 2008), 1481–1486.

Engström, A. E. (2008). Mobile collaborative live video mixing. MobileHCI 2008, 157–166. New York: ACM.

Engström, A. (2012). Going Live: Collaborative Video Production After Television. Stockholm: Stockholm University. PhD dissertation.

Friederichs-Büttner, G., Walther-Franks, B., & Malaka, R. (2012). An Unfinished Drama: Designing Participation for the Theatrical Dance Performance Parcival XX-XI. DIS '12, 770–778. New York: ACM.

Gaver, W. (2006). The video window: my life with a ludic system. Personal Ubiquitous Computing, 10 (2–3), 60–65.

Gaver, W. (2012). What should we expect from research through design? CHI '12, 937. New York: ACM.

Gaver, W., & Bowers, J. (2012). Annotated portfolios. Interactions, 19 (4), 40.

Gaver, W., Boucher, A., Law, A., et al. (2008). Threshold Devices: Looking out from the Home. CHI '08, 1429–1438. New York: ACM.

Gaver, W., Bowers, J., Boehner, K., et al. (2013). Investigating a Ludic Approach to Environmental HCI Through Batch Prototyping. CHI '13, 3451–3460. New York: ACM.

Gualdi, G., Cucchiara, R., & Prati, A. (2006). Low-Latency Live Video Streaming over Low-Capacity Networks. Eighth IEEE International Symposium on Multimedia, 449–456.

Haitsma, J., & Kalker, T. (2002). A highly robust audio fingerprinting system. International symposium on music information retrieval (ISMIR 2002), 107–115.

Harper, R. (2012). The Connected Home: The Future of Domestic Life. Springer Science & Business Media.

Hook, J., McCarthy, J., Wright, P., & Olivier, P. (2013). Waves: Exploring Idiographic Design for Live Performance. CHI '13, 2969–2978. New York: ACM.

Hook, J., Schofield, G., Taylor, R., Bartindale, T., McCarthy, J., & Wright, P. (2012). Exploring HCI's Relationship with Liveness. CHI '12 Extended Abstracts on Human Factors in Computing Systems (CHI EA '12), 2771–2774. New York: ACM.

Ito, Y., Tasaka, S., & Fukuta, Y. (2004). Psychometric analysis of the effect of end-to-end delay on user-level QoS in live audio-video transmission. 2004 IEEE International Conference on Communications, 4.

Jacobs, R., Benford, S., Selby, M., Golembewski, M., Price, D., & Giannachi, G. (2013). A Conversation Between Trees: What Data Feels Like in the Forest. CHI '13, 129–138. New York: ACM.

Jacucci, G., Oulasvirta, A., Salovaara, A., & Sarvas, R. (2005). Supporting the Shared Experience of Spectators through Mobile Group Media. Proceedings of Group 2005, 207–216. New York: ACM.
Jokela, T., J., L., & Korhonen, H. (2008). Mobile multimedia presentation editor: enabling creation of audio-visual stories on mobile devices. The twenty-sixth annual SIGCHI conference on Human factors in computing systems (CHI '08), 63–72. New York: ACM.

Joshi, N., Kar, A., & Cohen, M. (2012). Looking at You: Fused Gyro and Face Tracking for Viewing Large Imagery on Mobile Devices. CHI '12, 2211–2220. New York: ACM.

Juhlin, O. (2011). Social media on the road: the future of car based computing. London: Springer.

Juhlin, O., Engström, A., & Reponen, E. (2010). Mobile broadcasting: the whats and hows of live video as a social medium. 12th international conference on Human computer interaction with mobile devices and services (MobileHCI '10), 35–44. New York: ACM.

Juhlin, O., Zoric, G., Engström, A., & Reponen, E. (2014). Video interaction: a research agenda. Personal Ubiquitous Computing, 18 (3), 685–692.

Kaheel, A., El-Saban, M., Refaat, M., & Ezz, M. (2009). Mobicast: a system for collaborative event casting using mobile phones. 8th International Conference on Mobile and Ubiquitous Multimedia (MUM '09), 7–8. New York: ACM.

Kennedy, L., & Naaman, M. (2009). Less talk, more rock: automated organization of community-contributed collections of concert videos. The 18th international conference on World Wide Web, 311–320. New York: ACM.

Kirk, D., Sellen, A., Harper, R., & Wood, K. (2007). Understanding videowork. The SIGCHI Conference on Human Factors in Computing Systems (CHI '07), 61–70. New York: ACM.

Lehmuskallio, A., & Sarvas, R. (2008). Snapshot video: everyday photographers taking short video-clips. 5th Nordic conference on Human-computer interaction: building bridges (NordiCHI '08), 257–265. New York: ACM.

Leong, T., & Wright, P. (2013). Revisiting Social Practices Surrounding Music. CHI '13, 951–960. New York: ACM.

MacNeill, M. (1996). Networks: producing Olympic ice hockey for a national television audience. Sociology of Sport Journal, 13 (2), 103–124.

Maleki, M., Woodbury, R., & Neustaedter, C. (2014). Liveness, Localization and Lookahead: Interaction Elements for Parametric Design. DIS '14, 805–814. New York: ACM.

Maloney, J., & Smith, R. (1995). Directness and Liveness in the Morphic User Interface Construction Environment. UIST '95, 21–28. New York: ACM.

Manovich, L. (2007). Understanding Hybrid Media. http://manovich.net/index.php/projects/understanding-hybrid-media (last accessed 20 Dec 2015).

Massimi, M., & Neustaedter, C. (2014). Moving from Talking Heads to Newlyweds: Exploring Video Chat Use During Major Life Events. DIS '14, 43–52. New York: ACM.

Meese, R., Shakir Ali, S., Thorne, E., Benford, S. D., Quinn, A., Mortier, R., et al. (2013). From codes to patterns: designing interactive decoration for tableware. CHI '13, 931–940. New York: ACM.

Miller, D. (2013). Stuff. Wiley.

Mughal, M. A., Wang, J., & Juhlin, O. (2014). Juxtaposing mobile webcasting and ambient video for home décor. The 13th International Conference on Mobile and Ubiquitous Multimedia (MUM '14), 151–159. New York: ACM.

O'Hara, K., Slayden Mitchell, A., & Vorbau, A. (2007). Consuming video on mobile devices. The SIGCHI Conference on Human Factors in Computing Systems (CHI '07), 857–866. New York: ACM.

Paul, S. (2010). Digital Video Distribution in Broadband, Television, Mobile and Converged Networks: Trends, Challenges and Solutions. New Delhi, India: Wiley Publishing.

Puikkonen, A., Häkkilä, J., Ballagas, R., & Mäntyjärvi, J. (2009). Practices in creating videos with mobile phones. The 11th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI '09). New York: ACM.
Puikkonen, A., Ventä, L., Häkkilä, J., & Beekhuyzen, J. (2008). Playing, performing, reporting: a case study of mobile minimovies composed by teenage girls. The 20th Australasian Conference on Computer-Human Interaction: Designing for Habitus and Habitat (OZCHI '08), 140–147. New York: ACM.

Rautiainen, M., Aska, H., Ojala, T., Hosio, M., Makivirta, A., & Haatainen, N. (2009). Swarm synchronization for multi-recipient multimedia streaming. International Conference on Multimedia and Expo (ICME 2009), 786–789. IEEE.

Redström, J. (2001). Designing everyday computational things. Göteborg University. PhD dissertation.

Reeves, S., Benford, S., O'Malley, C., & Fraser, M. (2005). Designing the Spectator Experience. CHI '05, 741–750. New York: ACM.

Reponen, E. (2008). Live @ Dublin – Mobile Phone Live Video Group Communication Experiment. EUROITV '08: Proceedings of the 6th European conference on Changing Television Environments, 133–142. Berlin, Heidelberg: Springer-Verlag.

Sá, M., Shamma, D., & Churchill, E. (2014). Live mobile collaboration for video production: design, guidelines, and requirements. Personal Ubiquitous Computing, 18 (3), 693–707.

Sareenan, C., & Narendaran, B. A. (1996). Internet stream synchronization using concord. Proceedings of IS&T/SPIE International Conference on Multimedia Computing and Networking (MMCN).

Scannell, P. (1996). Radio, television, and modern life: A phenomenological approach. Oxford: Blackwell Publishers.

Shirky, C. (2008). Here Comes Everybody: The Power of Organizing Without Organizations. New York: Penguin Press.

Shrestha, P. (2009). Automatic mashup generation of multiple-camera videos. Technische Universiteit Eindhoven. PhD dissertation.

Thompson, J. B. (1995). The media and modernity: A social theory of the media. Cambridge: Polity.

Trout, C. (2014). Android still the dominant mobile OS with 1 billion active users. Retrieved Sept 20, 2015, from Engadget: http://www.engadget.com/2014/06/25/google-io-2014-by-the-numbers/

Vihavainen, S., Mate, S., Seppälä, L., Cricri, F., & Curcio, I. D. (2011). We want more: human-computer collaboration in mobile social video remixing of music concerts. CHI '11, 287–296. New York: ACM.

Wang, B., Kurose, J., Shenoy, P., & Towsley, D. (2008). Multimedia streaming via TCP: An analytic performance study. 12th annual ACM international conference on Multimedia, 908–915. New York: ACM.

Wang, L., Roe, P., Pham, B., & Tjondronegoro, D. (2008). An audio wiki supporting mobile collaboration. The 2008 ACM symposium on Applied computing (SAC '08), 1889–1896. New York: ACM.

Weber, M. G. (2006). Measurement and analysis of video streaming performance in live UMTS networks. Int'l Symp. on Wireless Personal Multimedia Communications (WPMC '06), 1–5.

Weilenmann, A., Hillman, T., & Jungselius, B. (2013). Instagram at the Museum: Communicating the Museum Experience Through Social Photo Sharing. CHI '13, 1843–1852. New York: ACM.

Ylirisku, S., Lindley, S., Jacucci, G., Banks, R., Stewart, C., Sellen, A., et al. (2013). Designing web-connected physical artefacts for the 'aesthetic' of the home. CHI '13, 909–918. New York: ACM.

Zambelli, A. (2013). A history of media streaming and the future of connected TV. Retrieved Sept 20, 2015, from The Guardian: http://www.theguardian.com/media-network/media-networkblog/2013/mar/01/history-streaming-future-connected-tv
Zimmerman, J., Forlizzi, J., & Evenson, S. (2007). Research through design as a method for interaction design research in HCI. CHI '07, 493–502. New York: ACM.

Östergren, M. (2006). Traffic encounters: drivers meeting face to face and peer to peer. IT University of Göteborg. PhD dissertation.

The Papers