Visual Cloud

Visual Cloud is the implementation of visual computing applications that rely on cloud computing architectures, cloud-scale processing and storage, and ubiquitous broadband connectivity between connected devices, network edge devices, and cloud data centers. It is a model for delivering visual computing services to consumers and business users that allows service providers to realize the general benefits of cloud computing, such as low cost, elastic scalability, and high availability, while providing infrastructure optimized for the requirements of visual computing applications.

History and overview

The rise of cloud computing was enabled by a convergence of powerful, low-cost computer hardware, high-capacity networks, and advances in hardware virtualization. To satisfy high consumer demand for visually-based entertainment such as video and gaming, as well as online social interaction, service providers began to deploy visually oriented applications in centralized data centers and use distributed content delivery networks to make that content accessible to their users.

Mobile consumption of video content in particular makes cloud delivery of video attractive, because remote processing and storage can compensate for the limitations of mobile devices. As much as 75% of the world's mobile data traffic is expected to be video by 2020.[1]

The first generation of visual cloud technologies was oriented mostly toward streaming media applications. The mid-2000s saw the introduction of professional and user-generated video-on-demand services like Netflix and YouTube, multiplayer online games (MOGs) like Call of Duty, and massively multiplayer online games (MMOGs) like World of Warcraft. Another common usage of visual cloud that emerged during this time frame was desktop virtualization, based on remote desktop instances hosted on cloud infrastructure.

As visual cloud technology has become more capable, more demanding usages have begun to emerge, such as the use of visual cloud for virtual reality, augmented reality, 3D scene understanding and interactivity, and immersive live experiences.[2] Visual cloud applications can be roughly divided into four primary domains:

  • Media content creation and delivery
  • Cloud graphics
  • Media analytics
  • Immersive media

Media content creation and delivery

The overall amount of video being delivered throughout the world is increasing significantly, as new sources develop. Processing and distribution of that content may increasingly be addressed by means of the visual cloud. Sources of that content include applications in cloud, communications, media/entertainment, and enterprise environments. Global mobile data traffic is forecast to increase nearly 7x between 2016 and 2021.[3] There are three primary models of content distribution.

  • Broadcasting of linear, live, and on-demand content by traditional communications service providers such as Comcast and DirecTV. This content has typically been consumed on televisions. The visual cloud trends in this model include cloud-based DVRs and virtual set-top boxes that enable broadcast content to be watched on other devices.
  • Over the Top: Video on Demand (VOD) of professional content from cloud media companies such as Amazon Video and Netflix, and of user-generated content hosted on platforms like YouTube. Over-the-top content refers to audio/visual content that is transmitted directly to end users via the Internet, without relying on a communications service provider for control or distribution.
  • Over the Top: Live streaming of content on video platforms such as Twitch, Facebook Live, WatchESPN, and Sling TV that is distributed by private or public cloud.

Compute-intensive visual workloads in the media content creation and delivery segment include media processing (e.g., compression and transcoding), enhancement, restoration, and compositing. As this content is stored in data centers and ultimately transmitted to end users, these and many other workloads may be applied to it, with factors such as bit rate and resolution tailored to match the transmission medium and the capabilities of the end-consumer device.
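As a rough illustration of this kind of processing, the Python sketch below shells out to the ffmpeg command-line tool (assumed to be installed) to produce several renditions of a hypothetical source file, input.mp4, at different resolutions and bit rates; a delivery system would then pick the rendition matching the viewer's network and device. The specific resolutions and bit rates shown are illustrative assumptions, not recommendations.

```python
# Minimal adaptive-bit-rate transcoding sketch: produce several renditions
# of one source video so a player can pick the one that fits the network
# and device. Assumes ffmpeg is installed and "input.mp4" exists; the
# ladder of resolutions and bit rates below is illustrative only.
import subprocess

SOURCE = "input.mp4"

# (height, video bit rate) pairs for each rendition -- a hypothetical ladder.
LADDER = [
    (1080, "5000k"),
    (720, "2500k"),
    (480, "1000k"),
]

for height, bitrate in LADDER:
    output = f"output_{height}p.mp4"
    cmd = [
        "ffmpeg", "-y",               # overwrite existing outputs
        "-i", SOURCE,                 # source file
        "-vf", f"scale=-2:{height}",  # keep aspect ratio, set target height
        "-c:v", "libx264",            # H.264 video encode
        "-b:v", bitrate,              # target video bit rate
        "-c:a", "aac",                # re-encode audio to AAC
        output,
    ]
    subprocess.run(cmd, check=True)
    print(f"wrote {output}")
```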

Cloud graphics

Interactive 3D operations (e.g., virtual desktop infrastructure) and batch rendering operations (e.g., RenderMan) may be performed at scale in visual cloud usages, with the user remote from the site of the rendering. Example usages in this domain include the following:

  • Remote desktops allow end-user virtualized computing environments to be centrally hosted, stored, and managed in the cloud, for content access with consistent user experiences on multiple types of devices, including tablets and phablets with limited footprints. Examples of remote desktop technologies include Citrix, VMware, and Xen.
  • Remote batch rendering enables resource-intensive graphics processing to be done using either public or private cloud resources, or a hybrid combination of the two. This approach is particularly valuable for on-demand usages that may have large peak workloads, such as in the final stages of production for animated films. Render farms operated by Pixar and Lucasfilm are established examples of this usage model.[4]
  • Cloud game streaming stores, executes, and renders the game itself in the cloud, transmitting an encoded video stream to user PCs, consoles, or other devices, where the game video is displayed. Controller and keystroke signals are transmitted back to the cloud. Early pioneers in this space included OnLive and Gaikai (both acquired by Sony). Hardware providers such as Sony and NVIDIA and smaller companies like GameFly, GameStream, and PlayGiga are developing cloud gaming products and services today.

Compute-intensive visual workloads in the cloud graphics segment include computer graphics technologies such as ray and raster rendering, 3D design, 3D modeling, and visual simulation. The graphics workload is manipulated in the cloud, with final rendering to the client device. Examples of this kind of workload include Petrel (a software platform used in petroleum exploration and production) and Autodesk 3ds Max (a computer graphics application for making 3D animations).

The varying requirements for performance and density among these workloads have implications for the cloud resources that optimally support them. For example, a visual cloud infrastructure to support remote desktops would likely be configured to host the greatest practical number of desktop instances per server. Cloud game streaming, on the other hand, requires far greater attention to peak graphics performance, likely requiring lower density per server. While both of those interactive usages are also highly latency-sensitive, remote batch rendering values time to completion, with latency playing a far less important role.
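The round trip at the heart of cloud game streaming can be sketched in a few lines of Python. The sketch below is purely illustrative: all class and function names are hypothetical, zlib compression stands in for a real video codec such as H.264, and a production service would use hardware-accelerated encoding and a low-latency network transport rather than in-process calls.

```python
# Illustrative cloud game streaming round trip: the client sends controller
# input, the "cloud" updates game state, renders a frame, compresses it, and
# the client decompresses it for display. zlib stands in for a real video
# codec such as H.264; all names here are hypothetical.
import time
import zlib


class CloudGameServer:
    """Runs game logic and rendering remotely, returning encoded frames."""

    def __init__(self, width=320, height=180):
        self.width, self.height = width, height
        self.player_x = 0

    def step(self, controller_input: str) -> bytes:
        # Update game state from the client's controller input.
        if controller_input == "right":
            self.player_x += 1
        elif controller_input == "left":
            self.player_x -= 1
        # "Render" a frame: a flat grayscale image whose brightness depends
        # on player position (a stand-in for real 3D rendering).
        pixel = (self.player_x * 8) % 256
        frame = bytes([pixel]) * (self.width * self.height)
        # Encode the frame before it would cross the network.
        return zlib.compress(frame)


def client_loop(server, inputs):
    for controller_input in inputs:
        start = time.perf_counter()
        encoded = server.step(controller_input)  # input up, frame down
        frame = zlib.decompress(encoded)         # client-side decode
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"input={controller_input!r:8} encoded={len(encoded):5d} B "
              f"decoded={len(frame):6d} B round trip={elapsed_ms:.2f} ms")


client_loop(CloudGameServer(), ["right", "right", "left", "noop"])
```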

Media analytics

Computations based on media in the visual cloud can be used to manipulate or provide deeper understanding of the media content itself, as well as to provide insights based on how users interact with it. Media analytics treats visual information as unstructured data to be processed and fed into analytics engines for interpretation of images, audio, or video to implement usages such as web visual search, autonomous transportation, surveillance, smart cities, and robotics. Visual computing technologies in the media analytics segment fall into three subdomains:

  • Media processing technologies required to prepare visual content for analysis include transcoding, decoding, enhancement, restoration, edge detection, and segmentation (a brief sketch of such a preparation step follows this list).
  • Content analysis consists of capabilities such as object detection and recognition, event detection and recognition, and scene understanding.
  • Media analytics produces metrics based on performance and usage factors related to video or other media; measurements of video quality and audience behavior are common examples.[5]
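As a small illustration of the media processing subdomain noted above, the Python sketch below uses the OpenCV library (the opencv-python package, assumed to be installed) to convert a hypothetical input frame to grayscale and extract edges, a typical preparation step before content analysis.

```python
# Minimal media-processing preparation step: load a frame, convert it to
# grayscale, and run edge detection before handing it to content analysis.
# Assumes the opencv-python package is installed; "frame.jpg" is a
# hypothetical input file.
import cv2

frame = cv2.imread("frame.jpg")
if frame is None:
    raise SystemExit("frame.jpg not found")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # drop color information
gray = cv2.GaussianBlur(gray, (5, 5), 0)        # suppress sensor noise
edges = cv2.Canny(gray, 100, 200)               # edge detection

cv2.imwrite("frame_edges.png", edges)
print("edge pixels:", int((edges > 0).sum()))
```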

Media analytics often makes use of “deep learning” frameworks, in which an algorithm is trained on large amounts of source data. Training typically takes place over an extended period of time and teaches the algorithm by mapping large amounts of labeled input data to specific output classifications. The resulting trained algorithm can then make rapid or near-instantaneous interpretations of new input data based on the rules developed during the training stage.
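A minimal illustration of this train-then-infer split is sketched below in Python using scikit-learn, with its small bundled digits dataset standing in for real visual media; the choice of library and model here is an illustrative assumption made for brevity.

```python
# Train-then-infer illustration: a classifier is fitted offline on labeled
# image data, then reused for fast predictions on new inputs. scikit-learn's
# small digits dataset stands in for real visual media.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 grayscale digit images with labels
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# "Training stage": slow, done once on large amounts of labeled data.
model = LogisticRegression(max_iter=2000)
model.fit(X_train, y_train)

# "Inference stage": fast classification of new, unseen inputs.
predictions = model.predict(X_test[:5])
print("predicted:", predictions.tolist())
print("actual:   ", y_test[:5].tolist())
print("test accuracy:", round(model.score(X_test, y_test), 3))
```

The expensive fit() call corresponds to the extended training stage, while predict() shows how the trained model then classifies new inputs quickly.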

Immersive media

Making use of the capabilities in the three usage areas described above (i.e., media content creation and delivery, cloud graphics, and media analytics), visual data can be manipulated based on its contents to support emerging usages such as live panoramic video and augmented reality (AR) or virtual reality (VR). Immersive reality gaming, for example, builds a game experience on top of the physical surroundings of the player, in real time. These experiences can be consumed on purpose-built displays such as VR head-mounted displays or on conventional devices such as PCs, tablets, or phones.

Some of the more visible usages of these technologies include Google Street View and Pokémon Go. Other commercially available examples of immersive media include Voke VR live streaming, FreeD virtual replay technology, Facebook 360° photos, and the Oculus Rift and Microsoft HoloLens head-mounted displays.

Most usages in the immersive media segment require compute-intensive scene analysis, which must often be performed in real time or near-real time. As with all visual cloud applications, workloads will be distributed between end devices and the cloud. For example, head-mounted display rendering might be done locally to the user to minimize latency, but live VR content distribution could be done predominantly from the cloud.
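A back-of-the-envelope latency budget makes this local-versus-cloud split concrete. The numbers below are illustrative assumptions only (a motion-to-photon latency on the order of 20 ms is a commonly cited comfort target for VR), not measurements.

```python
# Back-of-the-envelope latency budget for deciding what to render locally
# versus in the cloud. All numbers are illustrative assumptions.
MOTION_TO_PHOTON_TARGET_MS = 20  # rough comfort target for VR

cloud_render_path_ms = {
    "capture/track": 2,
    "uplink to cloud": 15,
    "cloud render + encode": 12,
    "downlink to headset": 15,
    "decode + display": 6,
}

local_render_path_ms = {
    "capture/track": 2,
    "local render": 9,
    "display": 5,
}

for name, path in [("cloud rendering", cloud_render_path_ms),
                   ("local rendering", local_render_path_ms)]:
    total = sum(path.values())
    verdict = "within" if total <= MOTION_TO_PHOTON_TARGET_MS else "over"
    print(f"{name}: {total} ms ({verdict} the "
          f"{MOTION_TO_PHOTON_TARGET_MS} ms target)")
```

Under these assumed figures only the local path fits the motion-to-photon budget, which is why head-mounted display rendering tends to stay on the device while bulk content distribution can come from the cloud.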

References