TM-3D-SM Group of Digital Video Broadcast (DVB)

The DVB Project is an industry-led consortium of over 250 broadcasters, manufacturers, network operators, software developers, regulatory bodies, and others in over 35 countries committed to designing open technical standards for the global delivery of DTV and data services. The DVB Project, which is responsible for the definition of today’s 2D DTV broadcast infrastructure in Europe, requires the use of the MPEG-2 Systems Layer specification for the distribution of audiovisual data via cable (DVB-C i.e., Digital Video Broadcast-Cable), satellite (DVB-S i.e., Digital Video Broadcast-Satellite), or terrestrial (DVB-T i.e., Digital Video Broadcast-Terrestrial) transmitters. Owing to the almost universal acceptance and worldwide use of this transport technology, it is of major importance for any future 3DTV system to build its distribution services on it [16] (services using DVB standards are available on every continent, with more than 500 million DVB receivers deployed).

During 2009, DVB closely studied the various aspects of (potential) 3DTV solutions. A Technical Module Study Mission report was finalized, leading to the formal creation of the TM-3DTV group. There have already been broadcasts using a conventional display-compatible system, and the first HDTV channel-compatible broadcasts are scheduled to start in Europe in spring 2010. As the DVB process is business- and market-driven, a 3DTV Commercial Module has also now been created to go back to the first step of the DVB process: what kind of 3DTV solution does the market want and need, and how can DVB play an active part in the creation of that solution? To start answering some of these questions, the CM-3DTV group hosted a DVB 3DTV Kick-off Workshop in Geneva in early 2010, followed immediately by the first CM-3DTV meeting.

3DTV Standardization and Related Activities

Standardization efforts have to be understood in the context of where stakeholders and proponents see the technology going. We already defined what we believe to be five generations of 3DTV commercialization in Chapter 1, which the reader will certainly recall. These generations fit in well with the following menu of research activity being sponsored by various European and global research initiatives, as described in Ref. [1]:

Short-term 3DV R&D (immediate commercialization, 2010–2013)

  • Digital stereoscopic projection
    • better/perfect alignment to minimize “eye-fatigue.”
  • End-to-end digital production-line for stereoscopic 3D cinema
    • digital stereo cameras;
    • digital baseline correction for realistic perspective;
    • digital postprocessing.

Medium-term 3DV R&D (commercialization during the next few years, 2013–2016)

  • End-to-end multi-view 3DV with autostereoscopic displays
    • cameras and automated camera calibration;
    • compression/coding for efficient delivery;
    • standardization;
    • view interpolation for free-view video;
    • better autostereoscopic displays, based on current and near future technology (lenticular, barrier-based);
    • natural immersive environments.

Long-term 3DV R&D (10+ years, 2016–2020+)

  • realistic/ultrarealistic displays;
  • “natural” interaction with 3D displays;
  • holographic 3D displays, including “integral imaging” variants;
  • natural immersive environments;
  • total decoupling of “capture” and “display”;
  • novel capture, representation, and display techniques.

One of the goals of the current standardization effort is to decouple the capture function from the display function. This is a very typical requirement for service providers, going back to voice and Internet services: there will be a large pool of end users, each choosing a distinct Customer Premises Equipment (CPE) device (e.g., phone, PC, fax machine, cell phone, router, 3DTV display); therefore, the service provider needs to utilize a network-intrinsic protocol (encoding, framing, addressing, etc.) that can then be utilized by the end device to create its own internal representation, as needed. The same applies to 3DTV.

As noted in Chapter 1, there is a lot of interest shown in this topic by the industry and standards bodies. The MPEG of ISO/IEC is working on a coding format for 3DV. Standards are the key to cost-effective deployment of a technology; examples of what happens in their absence include the Beta–VHS (Video Home System) and the HD DVD–Blu-ray controversies. SMPTE is working on some of the key standards needed to deliver 3D to the home. As far back as 2003, a 3D Consortium with 70 partner organizations had been founded in Japan and, more recently, four new activities have been started: the 3D@Home Consortium, the SMPTE 3D Home Entertainment Task Force, the Rapporteur Group on 3DTV of ITU-R Study Group 6, and the TM-3D-SM group of DVB. It will probably be around 2012 before an interoperable standard is available in consumer systems to handle all the delivery mechanisms for 3DTV.

At a broad level and in the context of 3DTV, the following major initiatives had been undertaken at press time:

  • MPEG: standardizing multi-view and 3DV coding;
  • DVB: standardizing digital video transmission to TVs and mobile devices;
  • SMPTE: standardizing 3D delivery to the home;
  • ITU-T: standardizing user experience of multimedia content;
  • VQEG (Video Quality Experts Group): standardizing objective video quality assessment.

There is a pragmatic possibility that in the short term, equipment providers may have to support a number of formats for stereo 3D content. The ideal approach for stereoscopic 3DTV is to provide sequential left and right frames at twice the chosen viewing rate. However, because broadcasters and some devices may lack transport/interface bandwidth for that approach, a number of alternatives may also be used (at least in the short term). Broadcasters appear to be focusing on top/bottom interleaving; however, trials are still ongoing to examine other approaches that involve some form of compression, including checkerboard, side-by-side, or interleaved rows or columns.
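The frame-compatible arrangements just mentioned can be made concrete with a short sketch. This is a minimal illustration, assuming grayscale numpy arrays for the left and right views and naive subsampling with no prefiltering; it is not taken from any particular encoder implementation.

```python
import numpy as np

def pack_side_by_side(left, right):
    # Halve each view horizontally and place the halves next to each other.
    # Naive column decimation; a real encoder would low-pass filter first.
    return np.concatenate([left[:, ::2], right[:, ::2]], axis=1)

def pack_top_bottom(left, right):
    # Halve each view vertically and stack left over right.
    return np.concatenate([left[::2, :], right[::2, :]], axis=0)

def pack_checkerboard(left, right):
    # Interleave the two views on a pixel-level (quincunx) checkerboard.
    h, w = left.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    mask = (yy + xx) % 2 == 1
    packed = left.copy()
    packed[mask] = right[mask]
    return packed

# Example: two synthetic 1080p views packed into a single 1080p frame.
left = np.zeros((1080, 1920), dtype=np.uint8)
right = np.full((1080, 1920), 255, dtype=np.uint8)
frame = pack_side_by_side(left, right)   # frame.shape stays (1080, 1920)
```

In each case the packed frame occupies the bandwidth of a single 2D frame, which is exactly the compromise in resolution referred to above.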

 

 

MPEG-4 and/or Other Data Support

For satellite transmission, and to remain consistent with already existing MPEG-2 technology, MPEG-4 TSs (or other data) are further encapsulated in Multiprotocol Encapsulation (MPE – RFC 3016) and then segmented again and placed into TS streams via a device called an IP Encapsulator (IPE; Fig. A4.5). MPE is used to transmit datagrams that exceed the length of the DVB “cell,” just as Asynchronous Transfer Mode Adaptation Layer 5 (AAL5) is used for a similar function in an ATM context. MPE allows one to encapsulate IP packets into MPEG-2 TSs (“packets,” or “cells”; Fig. A4.6). IPEs handle statistical multiplexing and facilitate coexistence. The IPE receives IP packets from an Ethernet connection and encapsulates packets using MPE, and then maps these streams into an MPEG-2 TS. Once the device has encapsulated the data, the IPE forwards the data packets to a satellite link. Generic data (IP) for transmission over the MPEG-2 transport multiplex (or IP packets containing MPEG-4 video) is passed to an encapsulator that typically receives PDUs (Ethernet frames, IP datagrams, or other network layer packets); the encapsulator formats each PDU into a series of TS packets (usually after adding an encapsulation header) that are sent over a TS logical channel. The MPE packet has the format shown in Fig. A4.7. Figure A4.8 shows the encapsulation process.

Pictorial view of encapsulation.

Note: IPEs are usually not employed if the output of the layer 2 switch is connected to a router for transmission over a terrestrial network; in this case, the headend is responsible for proper downstream enveloping and distribution of the traffic to the ultimate consumer. In other, purely IP-based video environments, where DVB-S or DVB-S2 are not used (e.g., a greenfield IP network that is designed to just handle video), the TSs are included in IP packets that are then transmitted as needed (Fig. A4.9). Specifically, with the current generation of equipment, the encoder will typically generate IP packets; these have a source IP address and a unicast or multicast destination IP address. The advantage of having video in IP format is that it can be carried over a regular (pure) Local Area Network (LAN) or carrier Wide Area Network (WAN).

IPE protocol stack.

MPE packet.

Encapsulation process.

Consider Fig. A4.9 again. It clearly depicts video (and audio) information being organized into PESs that are then segmented into TS packets. Examining the protocol stack of Fig. A4.9, one should note that in a traditional MPEG-2 environment of DTV, whether over-the-air transmission or cable TV transmission, the TSs are handled directly by an MPEG-2-ready infrastructure formally known as an MPEG-2 Transport Multiplex (see the left-hand side of the stack). The MPEG-2 Transport Multiplex offers a number of parallel channels, known as TS logical channels, that correspond to logical links (forming the MPEG TS) [24, 28]. Each TS logical channel is uniquely identified by the PID value that is carried in the header of each MPEG-2 TS packet; TS logical channels are independently numbered on each MPEG-2 TS multiplex (MUX). The MPEG-2 TS has been widely accepted not only for providing digital TV services but also as a subnetwork technology for building IP networks, say in cable TV–based Internet access.

Simplified protocol hierarchy.

There may be an interest in also carrying actual IP datagrams over this MPEG-2 transport multiplex infrastructure (this may be generic IP data or IP packets emerging from MPEG-4 encoders that contain MPEG-4 frames). To handle this requirement, packet data for transmission over an MPEG-2 transport multiplex is passed to an IPE. This receives PDUs, such as Ethernet frames or IP packets, and formats each into a Subnetwork Data Unit (SNDU), by adding an encapsulation header and trailer. The SNDUs are subsequently fragmented into a series of TS packets. To receive IP packets over an MPEG-2 TS Multiplex, a receiver needs to identify the specific TS multiplex (physical link) and also the TS logical channel (the PID value of a logical link). It is common for a number of MPEG-2 TS logical channels to carry SNDUs; therefore, a receiver must filter (accept) IP packets sent with a number of PID values, and must independently reassemble each SNDU [29]. Some applications require transmission of MPEG-4 streams over a preexisting MPEG-2 infrastructure, for example, in a cable TV application. This is also done via the IPE; here the IP packets generated by the MPEG-4 encoder are treated as if they were data, as just described above in this paragraph (Fig. A4.10).

The encapsulator receives PDUs (e.g., IP packets or Ethernet frames) and formats these into SNDUs. An encapsulation (or convergence) protocol transports each SNDU over the MPEG-2 TS service and provides the appropriate mechanisms to deliver the encapsulated PDU to the receiver IP interface. In forming an SNDU, the encapsulation protocol typically adds header fields that carry protocol control information, such as the length of SNDU, receiver address, multiplexing information, payload type, and sequence numbers. The SNDU payload is typically followed by a trailer that carries an Integrity Check (e.g., Cyclic Redundancy Check, CRC). When required, an SNDU may be fragmented across a number of TS packets (Figs A4.11 and A4.12) [29].

Encapsulator function.

Encapsulation of a subnetwork IPv4 or IPv6 PDU to form an MPEG-2 payload unit.

Encapsulation of a PDU (e.g., IP packet) into a series of MPEG-2 TS packets. Each TS packet carries a header with a common packet ID (PID) value denoting the MPEG-2 TS logical channel.

In summary, the standard DVB way of carrying IP datagrams in an MPEG-2 TS is to use MPE; with MPE, each IP datagram is encapsulated into one MPE section. A stream of MPE sections is then put into an ES, that is, a stream of MPEG-2 TS packets with a particular PID. Each MPE section has a 12-B header, a 4-B CRC-32 trailer, and a payload whose length is identical to the length of the IP datagram carried by the section [30].
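To make the MPE description concrete, the following sketch wraps an IP datagram in a simplified MPE-like section and slices it into 188-byte TS packets. It is illustrative only: the 12-byte header is a zero-filled placeholder for the real section header fields, zlib's CRC-32 stands in for the MPEG-2 CRC-32 actually used by DVB, and details such as the pointer_field and continuity counters are omitted.

```python
import zlib

TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE   # 184 payload bytes per packet

def build_mpe_section(ip_datagram):
    # Placeholder 12-byte section header (real MPE carries table_id,
    # section_length, MAC address bytes, and flags here).
    header = bytes(12)
    # Stand-in integrity check; DVB uses the MPEG-2 CRC-32, not zlib's variant.
    crc = zlib.crc32(header + ip_datagram).to_bytes(4, "big")
    return header + ip_datagram + crc

def fragment_into_ts(section, pid):
    packets = []
    for offset in range(0, len(section), TS_PAYLOAD_SIZE):
        chunk = section[offset:offset + TS_PAYLOAD_SIZE]
        chunk = chunk.ljust(TS_PAYLOAD_SIZE, b"\xff")     # stuff the final packet
        ts_header = bytes([
            0x47,                # sync byte
            (pid >> 8) & 0x1F,   # PID (high bits); other flag bits omitted here
            pid & 0xFF,          # PID (low bits)
            0x10,                # payload only; continuity counter not maintained
        ])
        packets.append(ts_header + chunk)
    return packets

section = build_mpe_section(b"\x45" + bytes(1499))     # a 1500-byte "IP datagram"
ts_packets = fragment_into_ts(section, pid=0x0100)     # 9 packets of 188 bytes each
```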

 

DVB (Digital Video Broadcasting)-Based Transport in Packet Networks

As we discussed in the body of this chapter, DVB-S is set up to carry MPEG-2 TS streams encapsulated with 16 bytes of Reed–Solomon FEC to create a packet that is 204 bytes long (Fig. A4.4). DVB-S embodies the concept of “virtual channels” in a manner analogous to ATM; virtual channels are identified by PIDs (one can think of the DVB packets as being similar to an ATM cell, but with different length and format). DVB packets are transmitted over an appropriate network. The receiver looks for specific PIDs that it has been configured to acquire (directly in the headend receiver for terrestrial redistribution purposes or in the viewer’s set-top box for a DTH application or in the set-top box via an IGMP join in an IPTV environment).

DVB packet.

Specifically, to display a channel of IPTV digital television, the DVB-based application configures the driver in the receiver to pass up to it the packets with a set of specific PIDs, for example, PID 121 containing video and PID 131 containing audio (these packets are then sent to the MPEG decoder which is either hardware- or software-based). So, in conclusion, a receiver or demultiplexer extracts ESs from the TS in part by looking for packets identified by the same PID.
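A receiver-side filter of the kind just described can be sketched in a few lines. This simplified illustration assumes a byte-aligned stream of 188-byte TS packets and ignores adaptation fields, scrambling, and PES reassembly; PID values 121 and 131 are the video/audio example used in the text.

```python
TS_PACKET_SIZE = 188

def extract_pid_payloads(ts_bytes, wanted_pids=(121, 131)):
    # Collect the payload bytes of every TS packet whose PID we were told to acquire.
    streams = {pid: bytearray() for pid in wanted_pids}
    for offset in range(0, len(ts_bytes) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts_bytes[offset:offset + TS_PACKET_SIZE]
        if packet[0] != 0x47:            # sync byte check; resynchronization logic omitted
            continue
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        if pid in streams:
            streams[pid] += packet[4:]   # skip the 4-byte header; no adaptation field handling
    return streams
```

The resulting video and audio byte streams would then be handed to the (hardware- or software-based) MPEG decoder.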

 

DVB-H

There is interest in the industry in delivering 3DTV services to mobile phones. It is perceived that simple lenticular screens can work well in this context and that the bandwidth (even though always at a premium in mobile applications) would not be too onerous overall. Even assuming a model with two independent streams being delivered, the bandwidth would only double to 2 × 384 kbps or 2 × 512 kbps, and the use of spatial compression (which should not be such a "big" compromise here) could be handled at the traditional data rate of 384 kbps or 512 kbps.

DVB-H, as noted in Table 4.2, is a DVB specification that deals with approaches and technologies to deliver commercial-grade medium-quality real-time linear and on-demand video content to handheld, battery-powered devices such as mobile telephones and PDAs (Personal Digital Assistants). IP Multicast is typically employed to support DVB-H.

DVB-H addresses the requirements for reliable, high-speed, high data rate reception for a number of mobile applications, including real-time video to handheld devices. DVB-H is generating significant interest in the broadcast and telecommunications worlds, and commercial DVB-H services were expected to start around press time. The DVB-H standards have been standardized through ETSI.

ETSI EN 302 304 “Digital Video Broadcasting (DVB); Transmission System for Handheld Terminals (DVB-H)” is an extension of the DVB-T standard. Additional features have been added to support handheld and mobile reception. Lower power consumption for mobile terminals and secure reception in mobile environments are key features of the standard. It is meant for IP-based wireless services. DVB-H can share the DVB-T MUX with MPEG-2/MPEG-4 services, so it can be part of the IPTV infrastructure described in the previous chapter, except that lower bitrates are used for transmission (typically in the 384-kbps range). DVB-H was published as an ETSI standard in 2004 as an umbrella standard defining how to combine the existing (now updated) ETSI standards to form the DVB-H system (Fig. 4.11).

DVB-H is based on DVB-T, a standard for digital transmission of terrestrial over-the-air TV signals. When DVB-T was first published in 1997, it was not designed to target mobile receivers. However, DVB-T mobile services have been launched in a number of countries. Indeed, with the advent of diversity antenna receivers, services that target fixed reception can now largely be received on the move as well. DVB-T is deployed in more than 50 countries. Yet, a new standard was sought, namely, DVB-H.

Despite the success of mobile DVB-T reception, the major concern with any handheld device is that of battery life. The current and projected power consumption of DVB-T front-ends is too high to support handheld receivers that are expected to last from one to several days on a single charge. The other major requirements for DVB-H were an ability to receive 15 Mbps in an 8-MHz channel and in a wide area Single Frequency Network (SFN) at high speed. These requirements were drawn up after much debate and with an eye on emerging convergence devices providing video services and other broadcast data services to 2.5G and 3G handheld devices. Furthermore, all this should be possible while maintaining maximum compatibility with existing DVB-T networks and systems. Figure 4.12 depicts a block-level view of a DVB-H network.

DVB-H Framework.

Block-level view of a DVB-H network.

In order to meet these requirements, the newly developed DVB-H specification includes the capabilities discussed next.

  • Time-Slicing: Rather than continuous data transmission as in DVB-T, DVB-H employs a mechanism where bursts of data are received at a time—a so-called IP datacast carousel. This means that the receiver is inactive for much of the time, and can thus, by means of clever control signaling, be "switched off." The result is a power saving of about 90%, and more in some cases (a rough duty-cycle estimate is sketched after this list).
  • “4-K Mode”: With the addition of a 4-K mode with 3409 active carriers, DVB-H benefits from the compromise between the high-speed small-area SFN capability of 2-K DVB-T and the lower speed but larger area SFN of 8-K DVB-T. In addition, with the aid of enhanced in-depth interleavers in the 2-K and 4-K modes, DVB-H has even better immunity to ignition interference.
  • Multiprotocol Encapsulation–Forward Error Correction (MPE-FEC): The addition of an optional, multiplexer-level FEC scheme means that DVB-H transmissions can be even more robust. This is advantageous when considering the hostile environments and poor (but fashionable) antenna designs typical of handheld receivers.
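As referenced in the Time-Slicing item above, the power-saving figure can be related to the burst duty cycle. The numbers below are illustrative assumptions, not values from the DVB-H specification, and they ignore wake-up time and burst overheads.

```python
# Rough DVB-H time-slicing duty-cycle estimate (illustrative numbers only).
burst_bitrate_mbps = 10.0      # rate at which a burst is delivered (assumed)
service_bitrate_mbps = 0.384   # average rate the service actually needs
duty_cycle = service_bitrate_mbps / burst_bitrate_mbps
power_saving = 1.0 - duty_cycle
print(f"front end active {duty_cycle:.1%} of the time; power saving about {power_saving:.0%}")
# -> front end active 3.8% of the time; power saving about 96%
```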

Like DVB-T, DVB-H can be used in 6-, 7-, and 8-MHz channel environments. However, a 5-MHz option is also specified for use in non-broadcast environments. A key initial requirement, and a significant feature of DVB-H, is that it can coexist
with DVB-T in the same multiplex. Thus, an operator can choose to have two DVB-T services and one DVB-H service in the same overall DVB-T multiplex.

Broadcasting is an efficient way of reaching many users with a single (configurable) service. DVB-H combines broadcasting with a set of measures to ensure that the target receivers can operate from a battery and on the move, and is thus an ideal companion to 3G telecommunications, offering symmetrical and asymmetrical bidirectional multimedia services.

DVB-H trials have been conducted in recent years in Germany, Finland, and the United States. Such trials help frequency planning and improve understanding of the complex issue of interoperability with telecommunications networks and
services. However, to date at least in the United States, there has been limited interest (and success) in the use of DVB-H to deliver video to hand-held devices. Providers have tended to use proprietary protocols.

Proponents have suggested the use of DVB-H for delivery of 3DTV to mobile devices. Some make the claim that wireless 3DTV may be introduced at an early point because of the tendency of wireless operators to feature new applications
earlier than traditional carriers. While this may be true in some parts of the world—perhaps mostly driven by the regulatory environment favoring wireless in some countries, by the inertia of the wireline operators, and by the relative
ease with which “towers are put up”—we remain of the opinion that the spectrum limitations and the limited QoE of a cellular 3D interaction do not make cellular 3D such a financially compelling business case for the wireless operators to
induce them to introduce the service "overnight."

 

DVB

DVB is a consortium of over 300 companies in the fields of broadcasting and manufacturing that work cooperatively to establish common international standards for digital broadcasting. DVB-generated standards have become the leading international standards, commonly referred to as “DVB,” and the accepted choice for technologies that enable efficient, cost-effective, high-quality, and interoperable digital broadcasting. The DVB standards for digital television have been adopted in the United Kingdom, across mainland Europe, in the Middle East, South America, and in Australasia. DVB standards are used for DTH satellite transmission (and also for terrestrial and cable transmission).

The DVB standards are published by a Joint Technical Committee (JTC) of European Telecommunications Standards Institute (ETSI), European Committee for Electrotechnical Standardization (Comité Européen de Normalisation Electrotechnique—CENELEC), and European Broadcasting Union (EBU). DVB produces specifications that are subsequently standardized in one of the European statutory standardization bodies. They cover the following DTV-related areas:

  • conditional access,
  • content protection and copy management,
  • interactivity,
  • interfacing,
  • IP,
  • measurement,
  • middleware,
  • multiplexing,
  • source coding,
  • subtitling,
  • transmission.

Standards have emerged in the past 10 years for defining the physical layer and data link layer of a distribution system, as follows:

  • satellite video distribution (DVB-S and DVB-S2),
  • cable video distribution (DVB-C),
  • terrestrial television video distribution (DVB-T),
  • terrestrial television for handheld mobile devices (DVB-H).

Distribution systems differ mainly in the modulation schemes used (because of specific technical constraints):

  • DVB-S (SHF) employs QPSK (Quadrature Phase-Shift Keying).
  • DVB-S2 employs QPSK, 8PSK (Phase-Shift Keying), 16APSK (Amplitude and Phase-Shift Keying) or 32APSK; 8PSK is the most common at this time (it supports about 30 megasymbols per second per satellite transponder and provides a usable rate in the 75 Mbps range, or about 25 SD-equivalent MPEG-4 video channels; a rough capacity calculation follows this list).
  • DVB-C (VHF/UHF) employs QAM (Quadrature Amplitude Modulation): 64-QAM or 256-QAM.
  • DVB-T (VHF/UHF) employs 16-QAM or 64-QAM (or QPSK) along with COFDM (Coded Orthogonal Frequency Division Multiplexing).
  • DVB-H: refer to the next section.
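As referenced in the DVB-S2 item above, the 75-Mbps figure can be reproduced with a back-of-the-envelope calculation. The code rate and the per-channel SD rate are assumptions chosen to match the numbers in the text; actual throughput also depends on roll-off, pilots, and framing overhead.

```python
# Approximate DVB-S2 8PSK transponder throughput (illustrative).
symbol_rate_msym_s = 30.0    # ~30 Msym/s on one satellite transponder
bits_per_symbol = 3          # 8PSK carries 3 bits per symbol
fec_rate = 5 / 6             # assumed inner code rate
useful_mbps = symbol_rate_msym_s * bits_per_symbol * fec_rate    # = 75.0 Mbps
sd_channel_mbps = 3.0        # assumed MPEG-4 SD channel rate
print(useful_mbps, int(useful_mbps // sd_channel_mbps))          # 75.0, 25 SD channels
```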

Because these systems have been widely deployed, especially in Europe, they may well play a role in near-term 3DTV services. IPTV also makes use of a number of these standards, particularly when making use of satellite links (an architecture that has emerged is to use satellite links to provide signals to various geographically distributed headends, which then distribute these signals terrestrially to a small region using the telco IP network—these headends act as rendezvous points in the IP Multicast infrastructure). Hence, on the reasonable assumption that IPTV will play a role in 3DTV, these specifications will also be considered for 3DTV in that context.

As implied above, transmission is a key area of activity for DVB. See Table 4.2 for some of the key transmission specifications.

In particular, EN 300 421 V1.1.2 (1997-08) describes the modulation and channel coding system for satellite digital multiprogram television (TV)/HDTV services to be used for primary and secondary distribution in Fixed Satellite Service (FSS) and Broadcast Satellite Service (BSS) bands. This specification is also known as DVB-S. The system is intended to provide DTH services for consumer Integrated Receiver Decoders (IRDs), as well as cable television headend stations with a likelihood of remodulation. The system is defined as the functional block of equipment performing the adaptation of the baseband TV signals, from the output of the MPEG-2 transport multiplexer (ISO/IEC DIS 13818-1) to the satellite channel characteristics. The following processes are applied to the data stream:

  • transport multiplex adaptation and randomization for energy dispersal;
  • outer coding (i.e., Reed–Solomon);
  • convolutional interleaving;
  • inner coding (i.e., punctured convolutional code);
  • baseband shaping for modulation;
  • modulation.

DVB-S/DVB-S2 as well as the other transmission systems could be used to deliver 3DTV. As seen in Fig. 4.9, MPEG information is packed into PESs (Packetized Elementary Streams), which are then mapped to TSs that are then handled by the DVB adaptation. The system is directly compatible with MPEG-2 coded TV signals. The modem transmission frame is synchronous with the MPEG-2 multiplex transport packets. Appropriate adaptation to the signal formats (e.g., MVC ISO/IEC 14496-10:2008 Amendment 1 and ITU-T Recommendation H.264, the extension of AVC) will have to be made, but this kind of adaptation has recently been defined in the context of IPTV to carry MPEG-4 streams over an MPEG-2 infrastructure (Fig. 4.10).

Key DVB Transmission Specifications


For Digital Rights Management (DRM), the DVB Project has developed Digital Video Broadcast Conditional Access (DVB-CA), which defines a Digital Video Broadcast Common Scrambling Algorithm (DVB-CSA) and a Digital Video Broadcast Common Interface (DVB-CI) for accessing scrambled content:

  • DVB system providers develop their proprietary conditional access systems within these specifications;
  • DVB transports include metadata called service information (DVB-SI i.e., Digital Video Broadcast Service Information) that links the various Elementary Streams (ESs) into coherent programs and provides human-readable descriptions for electronic program guides.

Functional block diagram of DVB-S.

Mapping of MPEG-2/MPEG-4 to DVB/DVB-S2 systems.

3DTV/3DV Transmission Approaches and Satellite Delivery

We start with a generic discussion about transmission approaches and then look at DVB-based satellite approaches.

Overview of Basic Transport Approaches

3DTV for home use will likely first see penetration via stored media delivery (e.g., Blu-ray Disc). The broadcast commercial delivery of 3DTV (whether over satellite/DTH, over the air, over cable, or via IPTV) may take a few years because of the relatively large-scale infrastructure that has to be put in place by the service providers and the limited availability of 3D-ready TV sets in the home (implying a small subscriber base and, so, a small revenue base). Delivery of downloadable 3DTV files over the Internet may occur at any point in the immediate future, but the provision of a broadcast-quality service over the Internet is not likely in the foreseeable future.

There are a number of alternative transport architectures for 3DTV signals, also depending on the underlying media. The service can be supported by traditional broadcast structures including the DVB architecture, wireless 3G/4G transmission such as DVB-H approaches, Internet Protocol (IP) in support of an IPTV-based service (in which case it also makes sense to consider IPv6), and the IP architecture for Internet-based delivery (both non–real time and streaming).

The specific approach used by each of these transport methods will also depend on the video-capture approach, as depicted in Table 4.1. Initially, conventional stereo video (with temporal multiplexing or spatial compression) will be used by all commercial 3DTV service providers; later in the decade other methods may be used. Also note in this context that the United States has a well-developed cable infrastructure in all Tier 1 and Tier 2 metropolitan and suburban areas; in Europe and Asia this is less so, with more DTH delivery (in the United States, DTH tends to serve more exurban and rural areas). A 3DTV rollout must take these differences into account and/or accommodate both.

Note that the V + D data representation can be utilized to build 3DTV transport evolutionarily on the existing DVB infrastructure. The in-home 3D images are reconstructed at the receiver side by using DIBR. MPEG has established a standardization activity that focuses on 3DTV using V + D representation.

There are generally two potential approaches for transport of 3DTV signals: (i) connection-oriented (time/frequency division multiplexing) over the existing DVB infrastructure over traditional channels (e.g., satellite, cable, over-the-air broadcast, DVB-H/cellular), and (ii) connectionless/packet using IP (e.g., a "private/dedicated" IPTV network, Internet streaming, Internet on-demand servers/P2P i.e., peer-to-peer). References [1–5], among others, describe various methods for traditional video over packet/ATM (Asynchronous Transfer Mode)/IPTV/satellite/Internet; many of these approaches and techniques can be extended/adapted for use in 3DTV.

Figures 4.1–4.7 depict graphically system-level views of the possible delivery mechanisms. We use the term “complexity” in these figures to remind the reader that it will not be trivial to deploy these networks on a broad national basis.

A challenge in the deployment of multi-view video services, including 3D and free-viewpoint TV, is the relatively large bandwidth requirement associated with transport of multiple video streams. Two-stream signals (CSV, V + D, and LDV) are doable: the delivery of a single stream of 3D video in the range of 20 Mbps is not outside the technical realm of most providers these days, but delivering a large number of channels in an unswitched mode (requiring, say, 2-Gbps access to a domicile) will require FTTH capabilities. It is not possible to deliver that content over an existing copper plant of the xDSL (Digital Subscriber Line) variety unless a provider deploys ADSL2+ (Asymmetric Digital Subscriber Line; but why bother upgrading a plant to a new copper technology such as this when the provider could actually deploy fiber? ADSL2+ may, however, be used in Multiple Dwelling Units as a riser for an FTTH plant). A way to deal with this is to provide user-selected multicast capabilities, where a user can select an appropriate content channel using IGMP (Internet Group Management Protocol). Even then, a household may have multiple TVs (say three or four) switched on simultaneously (and maybe even an active Digital Video Recorder or DVR), thus requiring bandwidth in the 15–60 Mbps range. MV + D, where one wants to carry three or even more intrinsic (raw) views, becomes much more challenging and problematic for practical commercial applications.
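The IGMP-based channel selection mentioned above can be sketched with a standard multicast socket. The group address and port below are hypothetical; in a real IPTV deployment the set-top box would learn them from the operator's channel map.

```python
import socket
import struct

GROUP, PORT = "239.1.2.3", 5004   # hypothetical multicast group/port for one 3DTV channel

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))
# Joining the group makes the host emit an IGMP membership report, so the access
# network forwards only the channel actually being watched to this subscriber.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
data, addr = sock.recvfrom(2048)  # e.g., a UDP/RTP datagram carrying MPEG-2 TS packets
```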

Video Capture and Transmission Possibilities

 

(Simplicity of) initial enjoyment of 3DTV in the home.

Complexity of a commercial-grade 3DTV delivery environment using IPTV.

Off-the-air broadcast could be accomplished with some compromise by using the entire HDTV bandwidth for a single 3DTV channel—here, multiple TVs in a household could be tuned to different programs.

However, a traditional cable TV plant would find it a challenge to deliver a (large) pack of 3DTV channels, but it could deliver a subset of its total selection in 3DTV (say 10 or 20 channels) by sacrificing bandwidth on the cable that could otherwise carry distinct channels. The same is true for DTH applications.

For IP, a service provider–engineered network could be used. Here, the provider can control the latency, jitter, effective source–sink bandwidth, packet loss, and other service parameters. However, if the approach is to use the Internet, performance issues will be a major consideration, at least for real-time services. A number of multi-view encoding and streaming strategies using RTP (Real-Time Transport Protocol)/UDP (User Datagram Protocol)/IP or RTP/DCCP (Datagram Congestion Control Protocol)/IP exist for this approach. Video streaming architectures can be classified as (i) server to single client unicast, (ii) server multicasting to several clients, (iii) P2P unicast distribution, where each peer forwards packets to another peer, and (iv) P2P multicasting, where each peer forwards packets to several other peers. Multicasting protocols can be supported at the network layer or application layer [6]. Figure 4.8 provides a view of the framework and system for 3DTV streaming transport over IP.

Complexity of a commercial-grade 3DTV delivery environment using the cable TV infrastructure.

Complexity of a commercial-grade 3DTV delivery environment using a satellite DTH infrastructure.

Complexity of a commercial-grade 3DTV delivery environment using over-the-air infrastructure.

Complexity of a commercial-grade 3DTV delivery environment using the Internet.

Complexity of a commercial-grade 3DTV delivery environment using a DVB-H (or proprietary) infrastructure.

Block diagram of the framework and system for 3DTV streaming transport over IP.

Yet, there is a lot of current academic research and interest in connectionless delivery of 3DTV content over shared packet networks. 3D video content needs to be protected when transmitted over unreliable communication channels. The effects of transmission errors on the perceived quality of 3D video are no less significant than those for equivalent 2D video applications, because the errors will influence several perceptual attributes (e.g., naturalness, presence, depth perception, eye strain, and viewing experience) associated with 3D viewing [7].

It has long been known that IP-based transport can accommodate a wide range of applications. Transport and delivery of video in various forms goes back to the early days of the Internet. However, (i) the delivery of quality (jitter- and loss-free) content, particularly HD or even 3D; (ii) the delivery of content in a secure, money-making subscription-based manner; and (iii) the delivery of streaming real-time services for thousands of channels (worldwide) and millions of simultaneous customers remain a long shot at this juncture. Obviously, at the academic level, transmission of video over the Internet (whether 2D or 3D) is currently an active research and development area where significant results have already been achieved. Some video-on-demand services that make use of the Internet, both for news and entertainment applications, have emerged, but desiderata (i), (ii), and (iii) have not been met. Naturally, it is critical to distinguish between the use of the IP (IPv4 or IPv6) protocol and the use of the Internet (which is based on IP) as a delivery mechanism (a delivery channel). IPTV services delivered over a restricted IP infrastructure appear to be more tenable in the short term, both in terms of Quality of Service (QoS) and Quality of Experience (QoE). Advocates now advance the concept of 3D IPTV. The transport of 3DTV signals over IP packet networks appears to be a natural extension of video over IP applications, but the IPTV model (rather than the Internet model) seems more appropriate at this time. The consuming public will not be willing, we argue based on experience, to purchase new (fairly expensive) TV displays for 3D if the quality of the service is not there. The technology has to “disappear into the background” and not be smack in the foreground if the QoE is to be reasonable.

To make a comparison with Voice over Internet Protocol (VoIP), it should be noted that while voice over the Internet is certainly doable end-to-end, specialized commercial VoIP providers tend to use the Internet mostly for access (except
for international calling to secondary geographic locations). Most top-line (traditional) carriers use IP over their own internally designed, internally engineered, and internally provisioned networks, and also for core transport [8–21].

Some of the research issues associated with IP delivery in general, and IP/Internet streaming in particular, include but are not limited to, the following [6]:

  1. Determination of the best video encoding configuration for each streaming strategy: multi-view video encoding methods provide some compression efficiency gain at the expense of creating dependencies between views that
    hinder random access to views.
  2. Determination of the best rate adaptation method: adaptation refers to adaptation of the rate of each view as well as inter-view rate allocation depending on available network rate and video content, and adaptation of the number and quality of views transmitted depending on available network rate and user display technology and desired viewpoint.
  3. Packet-loss resilient video encoding and streaming strategies as well as better error concealment methods at the receiver: some ongoing industry research includes the following [7].
    a. Some research related to Robust Source Coding is of interest. In a connectionless network, packets can get lost; the network is lossy. A number of standard source coding approaches are available to provide robust source coding for 2D video to deal with this issue, and many of these can be used for 3D V + D applications (features such as slice coding, redundant pictures, Flexible Macroblock Ordering or FMO, Intra-refresh, and Multiple Description Coding or MDC are useful in this context). Loss-aware rate-distortion optimization is often used for 2D video to optimize the application of robust source coding techniques. However, the models used have not been validated for use with 3D video in general and FVV in particular.
    b. Some research related to Cross-Layer Error Robustness is also of interest for transport of V + D signals over a connectionless network. In recent years, attention has focused on cross-layer optimization of 2D video quality. This has resulted in algorithms that have optimized channel coding and prioritized the video data. Similar work is needed to uncover/assess appropriate methods to transport 3D video across networks.
    c. Other research work pertains to Error Concealment that might be needed when transporting V + D signals over a connectionless network. Most 2D error concealment algorithms can be used for 3D video. However, there is additional information that can be used in 3D video to enhance the concealed quality: for example, information such as motion vectors can be shared between the color and depth video; if color information is lost, then depth motion vectors can be used to carry out concealment. There are other opportunities with MVC, where adjacent views can be used to conceal a view that is lost.
  4. Best peer-to-peer multicasting design methods are required, including topology discovery, topology maintenance, forwarding techniques, exploitation of path diversity, methods for enticing peers to send data and to stay connected, and use of dedicated nodes as relays.

Some have argued that stereo streaming allows for flexibility in congestion control methods, such as video rate adaptation to the available network rate, methods for packet loss handling, and postprocessing for error concealment, but it is unclear how a commercial service with paying customers (possibly paying a premium for the 3DTV service) would indeed be able to accept any degradation in quality.

Developing laboratory test bed servers that unicast content to multiple clients with stereoscopic displays should be easily achievable, as should be the case for other comparable arrangements. Translating those test beds to scalable, reliable (99.999% availability), cost-effective commercial-service-supporting infrastructures is altogether another matter.

In summary, real-time delivery of 3DTV content can make use of satellite, cable, broadcast, IP, IPTV, Internet, and wireless technologies. Any unique requirements of 3DTV need to be taken into account. The requirements are very similar to those needed for delivery of entertainment-quality video (e.g., with reference to latency, jitter, and packet loss), but with the observation that a number (if not most) of the encoding techniques require more bandwidth. The incremental bandwidth is as follows: (i) from 20% to 100% more for stereoscopic viewing compared with 2D viewing; (ii) from 50% to 200% more for multi-view systems compared with 2D viewing; and (iii) a lot more bandwidth for holoscopic/holographic designs (presently not even being considered for near-term commercial 3DTV service). We mentioned explicit coding earlier: that would indeed provide more efficiency, but as noted, most video systems in use today (or anticipated to be available in the near future) do not use explicit coding. Synthetic video generation based on CGI techniques needs less bandwidth than actual video. There can also be content with Mixed Reality (MR)/Augmented Reality (AR) that mixes graphics with real images, such as content that uses depth information together with image data for 3D scene generation. These systems may also require less bandwidth than actual full video. Stereoscopic video (CSV) may be used as a reference point; holoscopic/holographic systems require the most. It should also be noted that while graphic techniques and/or implicit coding may require a very large transmission bandwidth, the tolerance to lost information (packet loss) is typically very low.

We conclude this section by noting, again, that while connectionless packet networks offer many research opportunities as related to supporting 3DTV, we believe that a commercial 3DTV service will more likely occur in a connection-oriented (e.g., DTH, cable TV) environment and/or a controlled-environment IPTV setting.

 

Adoption of 3DTV in the Marketplace

It should be noted that 3D film and 3DTV trials have a long history, as shown in Fig. 1.7 (based partially on Ref. 2). However, the technology has finally progressed enough at this juncture, for example with the deployment of digital television (DTV) and High Definition Television (HDTV), that regular commercial services can now be introduced.

History of 3D in film and television.

We start by noting that there are two general commercial-grade display approaches for 3DTV: (i) stereoscopic TV, which requires special glasses to watch 3D movies, and (ii) autostereoscopic TV, which displays 3D images in such a manner that the user can enjoy the viewing experience without special accessories.

Short-term commercial 3DTV deployment, and the focus of this book, is on stereoscopic 3D imaging and movie technology. The stereoscopic approach follows the cinematic model, is simpler to implement, can be deployed more
quickly (including the use of relatively simpler displays), can produce the best results in the short term, and may be cheaper in the immediate future. However, the limitations are the requisite use of accessories (glasses), somewhat limited positions of view, and physiological and/or optical limitations including possible eye strain. In summary, (i) glasses may be cumbersome and expensive (especially for a large family) and (ii) without the glasses, the 3D content is unusable.

Autostereoscopic 3DTV eliminates the use of any special accessories: it implies that the perception of 3D is in some manner automatic, and does not require devices—either filter-based glasses or shutter-based glasses. Autostereoscopic displays use additional optical elements aligned on the surface of the screen, to ensure that the observer sees different images with each eye. From a home screen hardware perspective the autostereoscopic approach is more challenging, including the need to develop relatively more complex displays; also, more complex acquisition/coding algorithms may be needed to make optimal use of the technology. It follows that this approach is more complex to implement, will require longer to be deployed, and may be more expensive in the immediate future. However, this approach can produce the best results in the long term, including accessories-free viewing, multi-view operation allowing both movement and different perspective at different viewing positions, and better physiological and/or optical response to 3D.

Table 1.1 depicts a larger set of possible 3DTV (display) systems than what we identified above. The expectation is that 3DTV based on stereoscopy will experience earlier deployment compared with other technological alternatives.
Hence, this text focuses principally on stereoscopy. Holography and integral imaging are relatively newer technologies in the 3DTV context compared to stereoscopy; holographic and/or integral imaging 3DTV may be feasible late in
the decade. There are a number of techniques to allow each eye to view the separate pictures, as summarized in Table 1.2 (based partially on Ref. 3). All of these techniques work in some manner, but all have some shortcomings.

Various 3D Display Approaches and Technologies

Current Techniques to Allow Each Eye to View Distinct Picture Streams

To highlight the commercial interest in 3DTV at press time, note that ESPN announced in January 2010 that it planned to launch what would be the world’s first 3D sports network with the 2010 World Cup soccer tournament in June 2010, followed by an estimated 85 live sports events during its first year of operation. DIRECTV announced that it would start 3D programming in 2010. DIRECTV’s new HD 3D channels will deliver movies, sports, and entertainment content from some of the world’s most renowned 3D producers. DIRECTV is currently working with AEG/AEG Digital Media, CBS, Fox Sports/FSN, Golden Boy Promotions, HDNet, MTV, NBC Universal, and Turner Broadcasting System, Inc., to develop additional 3D programming that will debut in 2010–2011. At launch, the new DIRECTV HD 3D programming platform will offer a 24/7 3D pay-per-view channel focused on movies, documentaries, and other programming; a 24/7 3D DIRECTV on Demand channel; and a free 3D sampler demo channel featuring event programming such as sports, music, and other content. Comcast has announced that its VOD (Video-On-Demand) service is offering a number of movies in anaglyph 3D (as well as HD) form. Comcast customers can pick up 3D anaglyph glasses at Comcast payment centers and malls "while supplies last" (anaglyph is a basic and inexpensive method of 3D transmission that relies on inexpensive colored glasses, but its drawback is relatively low quality). Verizon’s FiOS was expected to support 3DTV programming by late 2010. Sky TV in the United Kingdom was planning to start broadcasting programs in 3D in the fall of 2010 on a dedicated channel available to anyone who has the Sky HD package; there are currently 1.6 million customers who have a Sky HD set-top box. Sky TV has not announced what programs will be broadcast in 3D, but it is expected to broadcast live the main Sunday afternoon soccer game from the Premiership in 3D from the 2011 season, along with some arts documentaries and performances of ballet [4]. Sky TV has already invested in installing special twin-lens 3D cameras at stadiums.

3DTV displays could be purchased in the United States and United Kingdom as of the spring of 2010 for $1000–5000 initially, depending on technology and approach. Liquid Crystal Display (LCD) systems with active glasses tend to generally cost less. LG released its 3D model, a 47-in. LCD screen, expected to cost about $3000; with this system, viewers will need to wear polarized dark glasses to experience broadcasts in 3D. Samsung and Sony also announced they were bringing their own versions to market by the summer of 2010, along with 3D Blu-ray players, allowing consumers to enjoy 3D movies such as Avatar and Up in their own homes [4]. Samsung’s and Sony’s models use LED (Light-Emitting Diode) screens, which are considered to give a crisper picture and are, therefore, expected to retail for about $5000 or possibly more. While LG is adopting the use of inexpensive polarizing dark glasses, Sony and Samsung are using active shutter technology. This requires users to buy expensive dark glasses, which usually cost more than $50 and are heavier than the $2–3 plastic polarizing ones. Active shutter glasses alternately darken over one eye, and then the other, in synchronization with the refresh rate of the screen, using shutters built into the glasses (synchronized via infrared or Bluetooth connections). Panasonic Corporation has developed a full HD 3D home theater system consisting of a plasma full HD 3D TV, a 3D Blu-ray player, and active shutter 3D glasses. The 3D display was originally available in 50-in., 54-in., 58-in., and 65-in. class sizes. High-end systems are also being introduced; for example, Panasonic announced a 152-in. 4K × 2K (4096 × 2160 pixels)-definition full HD 3D plasma display. The display features a new Plasma Display Panel (PDP) that uses self-illuminating technology. Self-illuminating plasma panels offer excellent response to moving images with full motion picture resolution, making them suitable for rapid 3D image display (its illuminating speed is about one-fourth the speed of conventional full HD panels). Each display approach has advantages and disadvantages, as shown in Table 1.3.

Summary of Possible, Commercially Available TV Screen/System Choices for 3D


3D Blu-ray disc logo.

3DTV for home use is likely to first see penetration via stored media delivery. As a content source, proponents make the case that BD "is the ideal platform" for the initial penetration of 3D technology in the mainstream market because of the high quality of pictures and sound it offers film producers. Many products are being introduced by manufacturers: for example, at the 2010 International Consumer Electronics Show (CES), vendors introduced eight home theater product bundles (one with 3D capability), 14 new players (four with 3D capability), three portable players, and a number of software titles. In 2010 the Blu-ray Disc Association (BDA) launched a new 3D Blu-ray logo to help consumers quickly discern 3D-capable Blu-ray players from 2D-only versions (Fig. 1.8) [5].

The BDA makes note of the strong adoption rate of the Blu-ray format. In 2009, the number of Blu-ray households increased by more than 75% over 2008 totals. After four years in the market, total Blu-ray playback devices (including
both set-top players and PlayStation3 consoles) numbered 17.6 million units, and 16.2 million US homes had one or more Blu-ray playback devices. By comparison, DVD playback devices (set-tops and PlayStation2 consoles) reached
14.1 million units after four years, with 13.7 million US households having one or more playback devices. The strong performance of the BD format is due to a number of factors, including the rapid rate at which prices declined due to
competitive pressures and the economy; the rapid adoption pace of HDTV sets, which has generated a US DTV household penetration rate exceeding 50%; and, a superior picture and sound experience compared to standard definition and even
other HD sources. Another factor in the successful adoption pace has been the willingness of movie studios to discount popular BD titles [5]. Blu-ray software unit sales in 2009 reached 48 million, compared with 22.5 million in 2008, up
by 113.4%. A number of movie classics were available at press time through leading retailers at sale prices as low as $10.

The BDA also announced (at the end of 2009) the finalization and release of the Blu-ray 3D specification. These BD specifications for 3D allow for full HD 1080p resolution to each eye. The specifications are display agnostic, meaning
they apply equally to plasma, LCD, projector, and other display formats regardless of the 3D systems those devices use to present 3D to viewers. The specifications also allow the PlayStation3 gaming console to play back 3D content.
The specifications, which represent the work of the leading Hollywood studios and consumer electronics and computer manufacturers, will enable the home entertainment industry to bring the stereoscopic 3D experience into consumers’ living rooms on BD, but will require consumers to acquire new players, HDTVs, and shutter glasses. The specifications allow studios (but do not require them) to package 3D Blu-ray titles with 2D versions of the same content on the same disc. The specifications also support playback of 2D discs in forthcoming 3D players and can enable 2D playback of Blu-ray 3D discs on the installed base of BD players. The Blu-ray 3D specification encodes 3D video using the Multi-View Video Coding (MVC) codec, an extension to the ITU-T H.264 Advanced Video Coding (AVC) codec currently supported by all BD players. MPEG-4 (Moving Picture Experts Group 4)-MVC compresses both left and right eye views with a typical 50% overhead compared to equivalent 2D content, according to the BDA, and can provide full 1080p resolution with backward compatibility with current 2D BD players [6].

The broadcast commercial delivery of 3DTV on a large scale—whether over satellite/Direct-To-Home (DTH), over the air, over cable systems, or via IPTV—may take some number of years because of the relatively large-scale infrastructure that has to be put in place by the service providers and the limited availability of 3D-ready TV sets in the home (implying a small subscriber base and, so, a small revenue base). A handful of providers were active at press time, as described earlier, but general deployment by multiple providers serving a geographic market will come at a future time. Delivery of downloadable 3DTV files over the Internet may occur at any point in the immediate future, but the provision of a broadcast-quality service over the Internet is not likely for the foreseeable future.

At the transport level, 3DTV will require more bandwidth than regular programming, perhaps even twice the bandwidth in some implementations (e.g., simulcasting—the transmission of two fully independent channels); some newer schemes such as "video + depth" may require only 25% more bandwidth compared to 2D, but these schemes are not the leading candidate technologies for actual deployment in the next 2–3 years. Other interleaving approaches use the same bandwidth of a channel now in use, but at a compromise in resolution. Therefore, in principle, if HDTV programming is broadcast at high quality, say, 12–15 Mbps using MPEG-4 encoding, 3DTV using the simplest method of two independent streams will require 24–30 Mbps. This data rate does not fit a standard over-the-air digital TV (DTV) channel of 19.2 Mbps, and will also be a challenge for non-Fiber-To-The-Home (non-FTTH) broadband Internet connections. However, one expects to see the emergence of bandwidth reduction techniques, as alluded to above. On the other hand, DTH satellite providers, terrestrial fiber-optic providers, and some cable TV firms should have adequate bandwidth to support the service. For example, the use of the Digital Video Broadcast Satellite Second Generation (DVB-S2) allows a transponder to carry 75 Mbps of content with modulation using an 8-point constellation and twice that much with a 16-point constellation. The trade-off would be, however (if we use the raw HD bandwidth just described as a point of reference), that a DVB-S2 transponder that would otherwise carry 25 channels of standard definition video or 6–8 channels of HD video would now only carry 2–3 3DTV channels. To be pragmatic about this issue, most 3DTV providers are not contemplating delivering full resolution as just described and/or the transmission of two fully independent channels (simulcasting), but some compromise; for example, lowering the per-eye data rate such that a 3DTV program fits into a commercial-grade HDTV channel (say 8–10 Mbps), using time interleaving or spatial compression—again, this is doable but comes with some degradation of the ultimate resolution quality.
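A quick arithmetic check of the figures in the preceding paragraph (all values approximate and taken from the text above):

```python
hd_channel_mbps = (12, 15)                                  # high-quality MPEG-4 HDTV channel
simulcast_3dtv_mbps = tuple(2 * r for r in hd_channel_mbps) # two independent streams: (24, 30)
ota_dtv_channel_mbps = 19.2                                 # over-the-air DTV channel capacity
dvb_s2_transponder_mbps = 75                                # 8PSK transponder payload

print(simulcast_3dtv_mbps[0] > ota_dtv_channel_mbps)        # True: even 24 Mbps does not fit
print(dvb_s2_transponder_mbps // simulcast_3dtv_mbps[1],
      dvb_s2_transponder_mbps // simulcast_3dtv_mbps[0])    # 2 to 3 full-rate 3DTV channels
```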

There are a number of alternative transport architectures for 3DTV signals, also depending on the underlying media. As noted, the service can be supported by traditional broadcast structures including the DVB architecture, wireless 3G/4G transmission such as DVB-H approaches, Internet Protocol (IP) in support of an IPTV-based service (in which case it also makes sense to consider IPv6), and the IP architecture for Internet-based delivery (both non–real time and streaming). The specific approach used by each of these transport methods will also depend on the video-capture approach. One should note that the United States has a well-developed cable infrastructure in all Tier 1 and Tier 2 metropolitan and suburban areas; in Europe and Asia this is less so, with more DTH delivery (in the United States, DTH tends to serve more exurban and rural areas). A 3DTV rollout must take these differences into account and/or accommodate both. In reference to possible cable TV delivery, CableLabs announced at press time that it had started to provide testing capabilities for 3DTV implementation scenarios over cable; these testing capabilities cover a full range of technologies including various frame-compatible, spatial multiplexing solutions for transmission [7].

Standards are critical to achieving interworking and are of great value to both consumers and service providers. The MPEG of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) has been working on coding formats for 3D video (and has already completed some of them). The Society of Motion Picture and Television Engineers (SMPTE) 3D Home Entertainment Task Force has been working on mastering standards. The Rapporteur Group on 3DTV of the International Telecommunication Union–Radiocommunication Sector (ITU-R) Study Group 6 and the TM-3D-SM group of DVB were working on transport standards.