Upcoming WebRTC World Conference

Last month, WebRTC was included in Firefox for Android. It has been available for a while in the Chrome, Firefox and Opera browsers. Justin Uberti of Google claims that this adds up to a billion devices running WebRTC.

To get up to speed on the challenges and opportunities of WebRTC, and the future of real-time voice/video communications, Wirevolution’s Charlie Gold will be attending the WebRTC World Conference next month and writing about it here. The conference runs Nov 19-21 at the Convention Center in Santa Clara, CA.

At the conference we hope to make sense of the wide range of applications integrating WebRTC, and to relate them to integration opportunities for service providers. The applications range from standard video-enabled apps such as webcasting and security, to enterprise applications such as CRM and contact centers, to emerging opportunities such as virtual reality and gaming. Service providers can combine WebRTC with IMS and RCS, and can use it to manage network capabilities and end users’ quality of experience.

The conference is organized around four tracks: developer workshops, B2C, enterprise, and service providers.

  • The developer workshop topics include concepts and structures of WebRTC, implementation and build options, standardization efforts, signaling options, applications to the Internet of Things (IoT), codec evolution, monitoring and alarms, firewall traversal with STUN/TURN/ICE, and large-scale simultaneous delivery in applications such as webcasting, gaming, virtual reality (VR) and security.
  • Business and consumer applications sessions cover successful deployment strategies of use cases like collaboration and conferencing, call centers and the Internet of Things (IoT). Other sessions on this track cover security, device requirements and regulatory issues.
  • Service provider workshops include IMS value in a world of WebRTC, how to use WebRTC, deployment strategies, how to extend existing services and offer new services using WebRTC, using WebRTC to acquire new users, and understanding the network impact of WebRTC.
  • The enterprise track has additional sessions on integrating WebRTC into your contact center and websites (public, supplier, internal). These sessions cover details like mapping out your integration strategy between WebRTC and SIP, using a media server vs. direct media interoperation, and how to deploy a WebRTC portal.

Keynotes will be from Ericsson, Alcatel-Lucent, Mozilla, Genband, Mavenir, Radisys, CafeX and presumably others.

To round it out, there will be a plethora of special workshops, realtime demos, panels and round tables.

With the momentum of WebRTC growing in leaps and bounds, we are looking forward to attending and sharing more on WebRTC next month.

ITExpo: The Realities of Mobile Videoconferencing

I will be moderating a panel on this topic at ITExpo East 2012 in Miami at 1:00pm on Thursday, February 2nd.

The panelists will be Girish Khavasi of Dialogic, Trent Johnsen of Hookflash, Anatoli Levine of RADVISION and Al Balasco of RadiSys. This is a heavy-hitting collection of panelists. Come with your toughest questions – you will get useful, authoritative answers.

The pitch for the panel is:

As 4G mobile networks continue to be rolled out and new devices are adopted by end users, mobile video conferencing is becoming an increasingly important component in today’s Unified Communications ecosystem. The ability to deliver enterprise-grade video conferencing including high definition voice, video and data-sharing will be critical for those playing in this space. Mobile video solutions require vendors to consider a number of issues including interoperability with new and traditional communications platforms as well as mobile operating systems, user interfaces that maximize the experience, and the ability to interoperate with carrier networks. This session will explore the business-class mobile video platforms available in the market today as well as highlight some end-user experiences with these technologies.

ITExpo: The Future is Now: Mobile Callers Want Visuals with Voice over the existing network

I will be moderating a panel on this topic at ITExpo East 2012 in Miami at 2:30 pm on Wednesday, February 1st.

The panelists will be Theresa Szczurek of Radish Systems, LLC, Jim Machi of Dialogic, Niv Kagan of Surf Communications Solutions and Bogdan-George Pintea of Damaka.

The concept of visuals with voice is a compelling one, and there are numerous kinds of visual content that you may want to convey. For example, when you do a video call with FaceTime or Skype, you can switch the camera to show what you are looking at if you wish, but you can’t share your screen or photos during a call.

FaceTime, Skype and Google Talk all use the data connection for both the voice and video streams, and the streams travel over the Internet.

A different, non-IP technology for videophone service, called 3G-324M, is widely used by carriers in Europe and Asia. It carries the video over the circuit-switched channel, which enables better quality (lower latency) than the data channel. An interesting application of this lets companies put their IVR menus into a visual format, so instead of having to listen through a tedious listing of options that you don’t want, you can instantly select your choice from an on-screen menu. Dialogic makes back-end equipment that makes applications like on-screen IVR possible on 3G-324M networks.

Radish Systems uses a different method to provide a similar visual IVR capability for when your carrier doesn’t support 3G-324M (none of the US carriers do). The Radish application is called Choiceview. When you make a call from your iPhone to a Choiceview-enabled IVR, you dial the call the regular way, then start the Choiceview app on your iPhone. The Choiceview IVR matches the Caller ID on the call with the phone number that you typed into the app setup, and pushes a menu to the appropriate client. So the call goes over the old circuit-switched network, while Choiceview communicates over the data network. Choiceview is strictly a client-server application. A Choiceview server can push any data to a phone, but the phone can’t send data the other way, nor can two phones exchange data via Choiceview.
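The Caller ID matching described above can be sketched in a few lines. This is only an illustration of the idea, not Radish's implementation: the class, the method names and the 10-digit normalization are all assumptions made for the example, and a plain list stands in for the data connection to the app.

```python
# Hypothetical sketch of matching a voice call to a data-side client by
# phone number. None of these names come from Radish Systems.

class VisualIvrServer:
    def __init__(self):
        # normalized phone number -> inbox (a list standing in for the
        # data connection to the app on that phone)
        self.clients = {}

    @staticmethod
    def normalize(number):
        """Keep digits only and compare on the last 10 (assumes US numbers)."""
        digits = "".join(ch for ch in number if ch.isdigit())
        return digits[-10:]

    def register_client(self, phone_number):
        """Called when the app starts and reports the number typed in setup."""
        inbox = []
        self.clients[self.normalize(phone_number)] = inbox
        return inbox

    def on_incoming_call(self, caller_id, menu):
        """Called when a voice call arrives; push the menu if a client matches."""
        inbox = self.clients.get(self.normalize(caller_id))
        if inbox is None:
            return False  # no app running for this caller: audio-only IVR
        inbox.append(menu)  # server-to-phone push only, as in the text
        return True


server = VisualIvrServer()
inbox = server.register_client("(408) 555-0100")
server.on_incoming_call("+1 408 555 0100", ["Billing", "Support"])
```

Note that the data only flows one way in this sketch: the server appends to the phone's inbox, and there is no path from one phone to another.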

So this ITExpo session will try to make sense of this mix: multiple technologies, multiple geographies and multiple use cases for visual data exchange during phone calls.

ITExpo East 2011: C-01 “Connecting the Distributed Enterprise via Video”

I will be moderating this panel at IT Expo in Miami on February 3rd at 9:00 am:

In today’s distributed workforce environment, it’s essential to be able to communicate to employees and customers across the globe both efficiently and effectively. Prior to today, doing so was far more easily said than done because, not only was the technology not in place, but video wasn’t accepted as a form of business communication. Now that video has burst onto the scene by way of Apple’s FaceTime, Skype and Gmail video chat, consumers are far more likely to pick video over voice – both in their home and at their workplaces. But, though demand has never been higher, enterprise networks still experience a slow-down when employees attempt to access video streams from the public Internet because the implementation of IP video is not provisioned properly. This session will provide an overview of the main deployment considerations so that IP video can be successfully deployed inside or outside the corporate firewall, without impacting the performance of the network, as well as how networks need to adapt to accommodate widespread desktop video deployments. It will also expose the latest in video compression technology in order to elucidate the relationship between video quality, bandwidth, and storage. With the technology in place, an enterprise can efficiently leverage video communication to lower costs and increase collaboration.

The panelists are:

  • Mike Benson, Regional Vice President, VBrick Systems
  • Anatoli Levine, Sr. Director, Product Management, RADVISION Inc.
  • Matt Collier, Senior Vice President of Corporate Development, LifeSize

VBrick claims to be the leader in video streaming for enterprises. Radvision and LifeSize (a subsidiary of Logitech) are oriented towards video conferencing rather than streaming. It will be interesting to get their respective takes on bandwidth constraints on the WLAN and the access link, and what other impairments are important.

Video calling from your cell phone

Although phone numbers are an antiquated kind of thing, we are sufficiently beaten down by the machines that we think of it as natural to identify a person by a 10-digit number. Maybe the demise of the numeric phone keypad as big touch-screens take over will change matters on this front. But meanwhile, phone numbers are holding us back in important ways. Because phone numbers are bound to the PSTN, which doesn’t carry video calls, it is harder to make video calls than voice calls: we don’t have people’s video addresses so handy.

This year, three new products attempted to address this issue in remarkably similar ways – clearly an idea whose time has come. The products are Apple’s FaceTime, Cisco’s IME and a startup product called Tango.

In all three of these products, you make a call to a regular phone number, which triggers a video session over the Internet. You only need the phone number – the Internet addressing is handled automatically. The two problems the automatic addressing has to handle are finding a candidate address, then verifying that it is the right one. Here’s how each of those three new products does the job:

1. FaceTime. When you first start FaceTime, it sends an SMS (text message) to an Apple server. The SMS contains sufficient information for the Apple server to reliably associate your phone number with the XMPP (push services) client running on your iPhone. With this authentication performed, anybody else who has your phone number in their address book on their iPhone or Mac can place a videophone call to you via FaceTime.

2. Cisco IME (Inter-Company Media Engine). The protocol used by IME to securely associate your phone number with your IP address is ViPR (Verification Involving PSTN Reachability), an open protocol specified in several IETF drafts co-authored by Jonathan Rosenberg, who is now at Skype. ViPR can be embodied in a network box like IME, or in an endpoint like a phone or PC.
Here’s how it works: you make a phone call in the usual way. After you hang up, ViPR looks up the phone number you called to see if it is also ViPR-enabled. If it is, ViPR performs a secure mutual verification, by using proof-of-knowledge of the previous PSTN call as a shared secret. The next time you dial that phone number, ViPR makes the call through the Internet rather than through the phone network, so you can do wideband audio and video with no per-minute charge. A major difference between ViPR and FaceTime or Tango is that ViPR does not have a central registration server. The directory that ViPR looks up phone numbers in is stored in a distributed hash table (DHT). This is basically a distributed database with the contents stored across the network. Each ViPR participant contributes a little bit of storage to the network. The DHT itself defines an algorithm – called Chord – which describes how each node connects to other nodes, and how to look up information.

3. Tango, like FaceTime, has its own registration servers. The authentication on these works slightly differently. When you register with Tango, it looks in the address book on your iPhone for other registered Tango users, and displays them in your Tango address book. So if you already know somebody’s phone number, and that person is a registered Tango user, Tango lets you call them in video over the Internet.
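To make the distributed hash table idea in the ViPR description more concrete, here is a much-simplified sketch of a Chord-style ring. It only shows the core rule, that a key is stored on the first node whose identifier follows the key's hash around the ring. Real Chord nodes use finger tables to route a lookup in a logarithmic number of hops, which is omitted here, and nothing below is taken from the ViPR drafts: the node names, the 32-bit ring and the stored route string are assumptions for illustration.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # assumption for readability; Chord uses 160-bit SHA-1 ids


def ring_id(key):
    """Hash a string onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    def __init__(self, node_names):
        # Nodes sorted by their position on the ring.
        self.nodes = sorted((ring_id(name), name) for name in node_names)
        # Each participant contributes a little storage.
        self.store = {name: {} for name in node_names}

    def successor(self, key):
        """Return the first node at or after the key's position, wrapping round."""
        ids = [node_id for node_id, _ in self.nodes]
        index = bisect_left(ids, ring_id(key)) % len(self.nodes)
        return self.nodes[index][1]

    def put(self, phone_number, route):
        self.store[self.successor(phone_number)][phone_number] = route

    def get(self, phone_number):
        return self.store[self.successor(phone_number)].get(phone_number)


ring = Ring(["node-a", "node-b", "node-c"])
ring.put("+14085550100", "sip:alice@example.com")
print(ring.get("+14085550100"))
```

Because every participant computes the same hash, any node can work out which peer holds the entry for a given phone number without asking a central registration server.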

ITExpo West — Building Better HD Video Conferencing & Collaboration Systems

I will be moderating a session at ITExpo West on Tuesday 5th October at 9:30 am. The session, “Building Better HD Video Conferencing & Collaboration Systems,” will be held in room 306A.

Here’s the session description:

Visual communications are becoming more and more commonplace. As networks improve to support video more effectively, the moment is right for broad market adoption of video conferencing and collaboration systems.

Delivering high quality video streams requires expertise in both networks and audio/video codec technology. Often, however, audio quality gets ignored, despite it being more important to efficient communication than the video component. Intelligibility is the key metric here, where wideband audio and voice quality enhancement algorithms can greatly improve the quality of experience.

This session will cover both audio and video aspects of today’s conferencing systems, and the various criteria that are used to evaluate them, including round-trip delay, lip-sync, smooth motion, bit-rate required, visual artifacts and network traversal – and of course pure audio quality. The emphasis will be on sharing best practices for building and deploying high-definition conferencing systems.

The panelists are:

  • James Awad, Marketing Product Manager, Octasic
  • Amir Zmora, VP Products and Marketing, RADVISION
  • Andy Singleton, Product Manager, MASERGY

These panelists cover the complete technology stack from chips (Octasic), to equipment (Radvision) to network services (Masergy), so please bring your questions about any technical aspect of video conferencing systems.

Top ten uses for an Internet Tablet/Web Slate

The tablet wars are imminent, with Microsoft, Google and Apple breaking out their big guns. Here’s what you will be doing with yours later this year:

  1. Internet browser of course: think iPhone experience with a bigger screen. It will be super-fast with 802.11n in your home, and somewhat slower when you are out and about, tethering to your cell phone for wide-area connectivity. You don’t need a cellular connection in the Internet Tablet itself, though the cellcos wish you would.
  2. TV accessory: treat it as a personal picture-in-picture display. View the program guide without disturbing the other people watching the main screen. Use it for voting on shows like American Idol. Use it as a remote to change channels and set up recordings.
  3. TV replacement: a 10 inch screen at two feet is the same relative size as a 50 inch screen at ten feet. Use it with Hulu and the other streaming video services.
  4. Video iPod, but with a much nicer screen. Say goodbye to portable DVD players.
  5. VideoPhone: some Internet Tablets will have hi-res user-facing cameras and high definition microphones and speakers: the perfect Skype phone to keep on your coffee table. How about on your fridge door for an always-on video connection to the grandparents? Or in a suitable charging base, a replacement office desk phone.
  6. Electronic picture frame: sure it’s overkill for such a trivial application, but when it’s not doing anything else, why not?
  7. eBook reader: maybe not in 2010, but as screen and power technology evolve the notion of a special-function eBook reader will become as quaint as a Wang word processor. (Never heard of a Wang word processor? I rest my case.)
  8. Home remote: take a look at AMX. This kind of top-of-the-line home control will be available to the masses. Set the thermostat, set the burglar alarm, look at the front door webcam when the doorbell rings…
  9. Game console: look at all the games for the iPhone. Many of them will work on Apple’s iSlate from day one. And you can bet there will be plenty of cool games for Android, and even Windows-based Internet Tablets.
  10. PND display: Google Maps on the iPhone is miraculously good, but it’s not perfect. The display is way too small for effective in-car navigation. It’s possible that some Internet Tablets will have GPS chips in them (GPS only adds a few dollars to the bill of materials), but for this application there’s no need. Tether it to your cell phone for the Internet connectivity and the GPS, and use the tablet for display and input only.
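The "same relative size" claim in item 3 is easy to check: what matters is the angle the screen subtends at your eye, which depends only on the ratio of screen size to viewing distance. A quick sketch, using the diagonal as the screen size (the function name is just for this example):

```python
import math


def angular_size_degrees(diagonal_inches, distance_inches):
    """Angle subtended at the eye by a screen of the given diagonal."""
    return math.degrees(2 * math.atan((diagonal_inches / 2) / distance_inches))


tablet = angular_size_degrees(10, 2 * 12)   # 10 inch screen at two feet
tv = angular_size_degrees(50, 10 * 12)      # 50 inch screen at ten feet
print(round(tablet, 1), round(tv, 1))       # both roughly 23.5 degrees
```

Both cases reduce to the same ratio (5 inches of half-diagonal per 24 inches of distance), so the two angles come out the same.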

2010 will be the year of the Internet Tablet. The industry has pretty much converged on the form factor: ten-inch-plus screen, touch interface, Wi-Fi connectivity. What’s a little more up in the air are minor details that will provide differentiation, like cellular connectivity, cameras, speakers and microphones. Apple will jump-start the category, but there will quickly be a slew of contenders at sub-$200 price points.

Several technology advances have converged to make now the right time. Low-cost, low energy ARM processors like the Qualcomm Snapdragon have enough processing muscle to drive PC-scale applications, and their pricing piggy-backs on the manufacturing scale of phones. 802.11n is fast enough for responsive web-based applications and HD video streaming. LCD screens continue to get cheaper. Personal Wi-Fi networks enable tethering and wireless keyboards for when you need them.

This is also the perfect form factor for grade school kids. Once the screen resolutions get high enough, books will disappear almost overnight. No more backs bent under packs laden with schoolbooks. Just this.

All you can eat?

The always-good Rethink Wireless has an article, “AT&T sounds deathknell for unlimited mobile data.”

It points out that with “3% of smartphone users now consuming 40% of network capacity,” the carrier has to draw a line. Presumably because if 30% of AT&T’s subscribers were to buy iPhones, they would consume 400% of the network’s capacity.
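The 400% figure is a straight linear extrapolation. The sketch below spells it out. It assumes, as the sentence above implicitly does, that the 3% can be treated as a share of all subscribers and that each additional heavy user consumes as much as the current ones.

```python
def projected_capacity_share(heavy_fraction, capacity_share, new_fraction):
    """Scale today's capacity share linearly with the fraction of heavy users."""
    return capacity_share * (new_fraction / heavy_fraction)


# 3% of users take 40% of capacity today; what if 30% behaved the same way?
share = projected_capacity_share(0.03, 0.40, 0.30)
print(f"{share:.0%} of current network capacity")
```

Ten times the users at the same per-user consumption means ten times the load, which is where four times the network's total capacity comes from.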

Wireless networks are badly bandwidth constrained. AT&T’s woes with the iPhone launch were caused by lack of backhaul (wired capacity to the cell towers), but the real problem is on the wireless link from the cell tower to the phone.

The problem here is one of setting expectations. Here’s an excerpt from AT&T’s promotional materials: “Customers with capable LaptopConnect products or phones, like the iPhone 3G S, can experience the 7.2 [megabit per second] speeds in coverage areas.” A reasonable person reading this might think that it is an invitation to do something like video streaming. Actually, a single user of this bandwidth would consume the entire capacity of a cell-tower sector:
[Figure: HSPA cell capacity per sector per 5 MHz]
Source: High Speed Radio Access for Mobile Communications, edited by Harri Holma and Antti Toskala.

This provokes a dilemma – not just for AT&T but for all wireless service providers. Ideally you want the network to be super responsive, for example when you are loading a web page. This requires a lot of bandwidth for short bursts. So imposing a bandwidth cap, throttling download speeds to some arbitrary maximum, would give users a worse experience. But users who use a lot of bandwidth continuously – streaming live TV for example – make things bad for everybody.

The cellular companies think of users like this as bad guys, taking more than their share. But actually they are innocently taking the carriers up on the promises in their ads. This is why the Rethink piece says “many observers think AT&T – and its rivals – will have to return to usage-based pricing, or a tiered tariff plan.”

Actually, AT&T already appears to have such a policy – reserving the right to charge more if you use more than 5GB per month. This is a lot, unless you are using your phone to stream video. For example, it’s over 10,000 average web pages or 10,000 minutes of VoIP. You can avoid running over this cap by limiting your streaming videos and your videophone calls to when you are in Wi-Fi coverage. You can still watch videos when you are out and about by downloading them in advance, iPod style.
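To put the 5GB figure in perspective, here is the arithmetic behind those comparisons. The page and VoIP numbers follow directly from the figures above; the 1 megabit per second streaming rate is purely an assumption chosen for illustration, and the cap is taken as decimal gigabytes.

```python
CAP_BYTES = 5 * 10**9  # 5 GB per month, decimal gigabytes (assumption)


def bytes_per_unit(cap_bytes, units):
    """How many bytes each page (or minute) can use before hitting the cap."""
    return cap_bytes / units


def kbps_from_bytes_per_minute(bytes_per_minute):
    """Convert a per-minute byte budget to kilobits per second."""
    return bytes_per_minute * 8 / 60 / 1000


def hours_at_bitrate(cap_bytes, bits_per_second):
    """Hours of continuous streaming that fit under the cap."""
    return cap_bytes * 8 / bits_per_second / 3600


per_page = bytes_per_unit(CAP_BYTES, 10_000)            # bytes per web page
voip_rate = kbps_from_bytes_per_minute(per_page)        # same budget per minute
video_hours = hours_at_bitrate(CAP_BYTES, 1_000_000)    # hypothetical 1 Mbps stream
print(per_page, round(voip_rate, 1), round(video_hours, 1))
```

So the cap works out to about 500 KB per page or roughly 67 kilobits per second of VoIP, while a stream at the assumed 1 Mbps would exhaust the whole month's allowance in about 11 hours.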

This doesn’t seem particularly burdensome to me.

Open up Skype?

Skype is the gorilla of HD Voice. Looking at my Skype client I see that there are at this moment about 16 million people enjoying the wideband audio experience on Skype. The other main type of Voice over IP, SIP, is rarely used for HD Voice conversations, though I wrote an HD Voice Cookbook to help to popularize wideband codecs on SIP. Since Skype has the largest base of wideband codec users, those who are enthusiasts of both HD Voice and SIP are eager for SIP networks to interoperate with Skype, allowing all HD-capable endpoints to talk HD to each other. Skype does already kind of interoperate with SIP, but only through the PSTN, which reduces the wideband media stream to narrowband. Opening up Skype would solve this problem, so it’s obviously a good idea. What is not so clear, however, is what it means to “open up Skype.”

Skype reinvented Voice over IP, and did it better than SIP. SIP was originally intended to be a lightweight way to set up real-time communications sessions. It was the Internet Engineering Task Force’s response to the complexities of the ITU VoIP standard, H.323. But SIP got hijacked by the telephone industry, and recast into the familiar mold of proliferating standards and proprietary implementations. SIP is no longer lightweight, implementation is a challenge and only the basic features are easily interoperable.

Take a look at my HD Voice Cookbook to see what it takes to set up a typical SIP phone, then compare this to installing Skype on your PC. Or compare it to the simplicity of plugging in a POTS phone to your wall socket. So we have:

  • Skype, free video calls with HD voice from your PC to anywhere in the world;
  • POTS, narrowband voice-only calls that cost about $30 per month plus per-minute charges for international calls; or
  • SIP, that falls somewhere in between the two but which is way too complex for consumers to set up, and which people only really use for narrowband because everybody else only uses it for narrowband, so there’s no network effect.

Open VoIP standards got a several-year start on Skype, starting with H.323 and going on to SIP; but from its inception Skype blew them out of the water. To be sure it had a strong hype amplifier since P2P file sharing was controversial at that time, and Skype came from the same people as Kazaa, but at that time NetMeeting (an H.323 VoIP program) had an enormous installed base, since it came as part of Windows. The problem Skype solved was ease of use.

Skype doesn’t just give you video and wideband voice. It’s all encrypted and you get all sorts of bonus features like conferencing, presence, chat, desktop sharing, NAT traversal and dial-by-name. And did I mention it’s free?

The open standards VoIP community was beaten fair and square by Skype, blowing a several year start in the process.

Let me clarify that. In terms of minutes of voice traffic on network backbones, SIP traffic outweighs Skype, so from that point of view, SIP is not so beaten by Skype. The sense in which Skype has trounced the open standards VoIP community is in providing users with something better and cheaper than the decades-old PSTN experience, which carrier VoIP merely strives to emulate at a marginally lower price.

So it seems to me like sour grapes to clamor for Skype to make technical changes to conform to open standards, especially if those changes would impair some of the benefits that Skype offers users. How would users benefit from opening up Skype? Would the competition lower the cost of a Skype call? It’s hard to see how, when Skype calls are free. Would the service be more accessible, or accessible to more customers? No, because anybody with a browser can download Skype free by typing “Skype” or even “Skipe” into their browser’s search field. Would the open standards community innovate faster than Skype, and provide more and better features? Not based on their respective track records. The open standards community has had plenty of time to out-innovate Skype and manifestly failed.

Anyway, what are the senses in which Skype is not open? It is certainly interoperable with the PSTN; SkypeIn and SkypeOut are among the cheapest ways to make calls on the PSTN. Actually, this may be the greatest threat to Skype’s innovation. SkypeIn and SkypeOut are the only way that Skype makes money; this is a powerful motivation for Skype to not incent users to abandon them. If this remains the only economic force acting on the company Skype is likely to decay into an old-style regular phone service provider.

After a lot of debate with people who know about these things, there seem to be two main ways in which Skype could be said to be not open:

  1. The protocol is proprietary and not published, so third parties can’t implement endpoints that interoperate with Skype endpoints.
  2. Only Skype can issue Skype addresses, and Skype controls the directories rather than using DNS like SIP.

Let’s look at the issue of the proprietary protocol first. Let’s break it into two parts, first who defines the protocols and second, their secrecy. In the debate between the cathedral and the bazaar, the cathedral has recently been losing out to the bazaar amongst the theorizers. We see the success of Apache, MySQL, Linux and Firefox and it looks as though the cathedral is being routed in the marketplace, too. But on the other hand we have successful companies like Apple, Google, Intel and Skype, whose success demonstrates that a design monopoly can often deliver a more elegant and tight user experience. There is no Linus Torvalds of SIP. Having taken the decision to implement a protocol other than SIP, it seems fine to me that whoever invented the Skype protocol should continue to design it, especially since they have manifestly done a much better job than the designers of SIP – ‘better’ in the sense of being more appealing to users.

What about the secrecy? A while back one of the original designers of SIP, Henning Schulzrinne, with his colleague Salman Baset, reverse engineered the Skype network and published their findings here. There is more technical background on Skype here. According to Baset and Schulzrinne:

Login is perhaps the most critical function to the Skype operation. It is during this process a Skype client authenticates its user name and password with the login server, advertises its presence to other peers and its buddies, determines the type of NAT and firewall it is behind, discovers online Skype nodes with public IP addresses, and checks the availability of latest Skype version.

Opening up the protocol to let other people use it would enable them to implement their own Skype login servers. This would enable a parallel network, but in the absence of a new protocol that enabled the login servers to exchange information, it would not lead to interoperability, in the sense of users on Skype being able to view the presence information of users on the parallel network, or even retrieve their IP address to make a call. So it would have the effect of fragmenting the Skype network, rather than opening it. Alternatively the Skype login servers could implement the SIP protocol to exchange presence information. But then it would start to be a SIP network, not a Skype network. And the market numbers say that users find SIP inferior to Skype. So why do it?

Opening up the protocol to let other people write Skype clients that logged into the Skype login servers would open up the network, but at the risk of introducing interoperability issues due to faulty interpretations of the specification. Network protocols are notoriously prone to this kind of problem. But guaranteed interoperability of the clients is one of the primary benefits of Skype over SIP from the point of view of the user, who would therefore not benefit from this step.

So why not have Skype distribute binaries that expose to third party applications the functionality of the protocols and the ability to log into the Skype login server through a published API? Wait a sec – they already do that.

Another objection to Skype publishing the protocols for third parties to implement is that there would be a danger of the third parties implementing some parts of the protocol but not others. For example not the encryption part, or not the parts that enable clients to be super-nodes or relays. A proliferation of this kind of free-rider would stress the network, making it more prone to failure.

Related to the issue of who implements the login servers is who issues Skype addresses. There is a central authority for issuing phone numbers (the ITU), and a central authority for issuing IP addresses (the IANA). But in both cases, the address space is hierarchical, allowing the central authority to delegate blocks of addresses to third party issuers. The Skype address space is not hierarchical, so it would require some kind of reworking to enable delegation. Alternatively the Skype login servers could accept logins from anybody with a SIP address. But there would be no guarantee that the client logging in was interoperable.

Scanning back through this posting, I see that my arguments could be parodied as “you can’t argue with success,” and “if it ain’t broke don’t fix it.” Arguments of this type are normally weak, so in this case I think my points are actually “there are reasons for Skype’s success,” “fixes could break it,” and “users would be better served if Skype competitors concentrated on seducing them with a superior offering,” the last of which, after all, is how Skype has won its users away from the traditional telecom industry. Some people are trying this approach, notably Gizmo5, which I plan to write about later.

Skype on Nokia phones. Video telephony for the masses?

At the end of 2008 there were 415 million broadband subscribers world-wide, and Skype claimed 405 million subscribers after a 47% year-on-year growth. So Skype must be topping out, right?

Perhaps not. At the end of 2008 there were 4 billion mobile phone users. Ten times as many as fixed broadband, and four times as many as PCs. Skype just announced that Nokia will be putting Skype on some of its high end phones. If the idea spreads Skype will still have plenty of room to grow.

But there is bigger news hidden here. Video telephony has been just around the corner for about 50 years. This announcement may soon make it commonplace.

I have written before about Skype sound quality, but Skype’s video capabilities also kick the competition. My children make regular intercontinental Skype video calls to their grandmother, and both the sound and video quality are generally excellent now that I have discarded my Linksys router and got an Apple Airport Extreme. If the numbers don’t convince you that Skype video calling is perfectly mainstream, perhaps Oprah will.

The phone mentioned by Nokia as the first to have Skype built in is the N97. Almost all of Nokia’s high end smart phones (the Eseries and Nseries) have Wi-Fi, and many (including the N97) have a “secondary camera” on the same side as the screen for use in video calling. Video calling is supported by the SIP soft-phone software that Nokia puts in almost all these phones, but SIP VoIP is nowhere compared to Skype. So the news that Nokia will be loading Skype onto some of these phones is tantalizing. The existing base of Skype users on PCs will bestow a massive network effect on Skype video calls from Nokia handsets.

The Wi-Fi aspect will help users to get around the carriers’ resistance, which in any case may be waning if the Skype interview linked above is correct.