Real Time Communication with WebRTC

Technology advances at a rapid pace. We are now on the verge of true unified communications, thanks to WebRTC. What is WebRTC, and why does it matter?

A VERY SHORT HISTORY OF WEBRTC

One of the last major challenges for the web is to enable human communication via voice and video: Real Time Communication, RTC for short. RTC should be as natural in a web application as entering text in a text input. Without it, we’re limited in our ability to innovate and develop new ways for people to interact.

Historically, RTC has been corporate and complex, requiring expensive audio and video technologies to be licensed or developed in house. Integrating RTC technology with existing content, data and services has been difficult and time consuming, particularly on the web.

Gmail video chat became popular in 2008, and in 2011 Google introduced Hangouts, which uses the Google Talk service (as does Gmail). Google bought GIPS, a company which had developed many components required for RTC, such as codecs and echo cancellation techniques. Google open sourced the technologies developed by GIPS and engaged with relevant standards bodies at the IETF and W3C to ensure industry consensus. In May 2011, Ericsson built the first implementation of WebRTC.

Source: HTML5 Rocks

As explained by HTML5 Rocks, RTC is technology that enables us to communicate in real time over the internet. For years we have seen it advance, from audio with VoIP to video with Skype. The barrier, however, lies in the proprietary nature of the platforms used to deliver RTC, which makes them expensive due to licensing costs.

Today, the W3C is developing an open standard that will enable a free and common platform for delivering audio and video through web browsers. It is still an Editor's Draft right now, so it is perhaps a bit early. But given the rapid pace of technology, it will soon become commonplace, and we should be prepared for it.

The common usage for RTC right now is video chat, such as Skype and Google Hangouts. Very soon, video will no longer be one-way communication, where we only watch; it will be interactive, with both sides of a video conference communicating directly.

With the rapid deployment of the internet worldwide, the barriers between countries are breaking down. Global companies with offices in multiple countries can communicate better, with fewer delays due to communication issues.

How does WebRTC work?

The overall architecture of WebRTC includes two layers: the top layer covers the web, individual web applications, and the Web API for web developers, which is being standardized by the W3C; the second layer is the WebRTC portion, concerning the C++ API and peer connections, aimed at browser developers. The top sub-layer of the WebRTC portion handles session management and abstract signaling; below it sit three areas: the Voice Engine, the Video Engine, and the Transport components.

The Voice Engine provides the framework for the audio media connection, from sound card to network and back between clients, and includes the iSAC / iLBC / Opus codecs, NetEQ for Voice, Acoustic Echo Cancellation (AEC), and Noise Reduction (NR) technologies. The iSAC / iLBC / Opus codecs handle VoIP and audio streaming over wideband and narrowband, supporting constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s, frame sizes from 2.5 ms to 60 ms, and sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth). NetEQ for Voice is a dynamic jitter buffer and error concealment algorithm that conceals the negative effects of network jitter and packet loss, keeping latency as low as possible while maintaining the highest voice quality. The Acoustic Echo Canceler is a software-based signal processing component that removes the acoustic echo that results when the voice being played out is picked up by the active microphone. The Noise Reduction component removes certain types of background noise usually associated with VoIP, such as hiss and fan noise.
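The core idea behind a jitter buffer like NetEQ can be illustrated with a toy sketch. The names and behavior below are our own simplification, not taken from the WebRTC codebase: packets arrive out of order with variable delay, the buffer reorders them by sequence number, and a lost packet is concealed at playout time rather than stalling the stream.

```typescript
// Toy jitter buffer: reorders packets by sequence number and conceals
// missing ones at playout time. Illustration only; the real NetEQ
// adapts its buffer depth to measured network jitter.

interface Packet {
  seq: number;      // sequence number
  payload: string;  // stand-in for an audio frame
}

class ToyJitterBuffer {
  private buffer = new Map<number, string>();
  private nextSeq = 0;

  insert(p: Packet): void {
    this.buffer.set(p.seq, p.payload);
  }

  // Pop the next frame in order; if it was lost, conceal the gap
  // (a real implementation synthesizes plausible audio, not a marker).
  pop(): string {
    const payload = this.buffer.get(this.nextSeq);
    this.buffer.delete(this.nextSeq);
    this.nextSeq += 1;
    return payload ?? "<concealed>";
  }
}

// Packets 0..3: two arrive out of order, packet 2 is lost entirely.
const jb = new ToyJitterBuffer();
jb.insert({ seq: 1, payload: "B" });
jb.insert({ seq: 0, payload: "A" });
jb.insert({ seq: 3, payload: "D" });

const played = [jb.pop(), jb.pop(), jb.pop(), jb.pop()];
console.log(played); // [ 'A', 'B', '<concealed>', 'D' ]
```

The point of the concealment step is the latency trade-off the paragraph above describes: playout never waits indefinitely for a late packet, so latency stays low at the cost of occasionally synthesizing audio.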

The Video Engine is the framework for the video media connection, from camera to network and from network to screen, and includes the VP8 codec, the Dynamic Jitter Buffer, and Image Enhancements. The VP8 codec from the WebM Project is well suited for RTC as it is designed for low latency. The Dynamic Jitter Buffer for video helps conceal the effects of jitter and packet loss on overall video quality, and Image Enhancements removes video noise from the images captured by the webcam.

The Transport components are built by reusing components from libjingle, without using or requiring the XMPP/Jingle protocol. The Real-time Transport Protocol (RTP) network stack and the STUN and ICE components allow calls to establish connections across various types of networks.
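To give a flavor of what STUN does: a client behind a NAT sends a Binding Request to a STUN server, and the server replies with the public IP and port it saw, which ICE then uses as a candidate address for the peer connection. The 20-byte header of a Binding Request, as defined in RFC 5389, can be built in a few lines. This sketch only constructs the message bytes; it does not send them anywhere:

```typescript
import { randomBytes } from "crypto";

// Build the 20-byte header of a STUN Binding Request (RFC 5389):
// message type 0x0001 (Binding Request), length 0 (no attributes),
// the fixed magic cookie 0x2112A442, then a random 96-bit transaction ID.
function stunBindingRequest(): Buffer {
  const header = Buffer.alloc(20);
  header.writeUInt16BE(0x0001, 0);      // message type: Binding Request
  header.writeUInt16BE(0x0000, 2);      // message length: no attributes
  header.writeUInt32BE(0x2112a442, 4);  // magic cookie
  randomBytes(12).copy(header, 8);      // transaction ID
  return header;
}

const msg = stunBindingRequest();
console.log(msg.length);                        // 20
console.log(msg.readUInt32BE(4).toString(16));  // "2112a442"
```

In a real client these bytes would be sent over UDP to a STUN server, and the XOR-MAPPED-ADDRESS attribute in the response would reveal the client's public address.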

Source: Tech Republic

Basically, the platform must be implemented by web browsers; currently it is supported by Chrome and Firefox. As Tech Republic reports, the core is written in C++, and that core is exposed to us through the Web API. It is this Web API that interests us the most; after all, we are web developers.
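From a web developer's point of view, a call boils down to three steps: capture local media, create a peer connection, and exchange an offer/answer over a signaling channel of your choosing (WebRTC deliberately leaves signaling to the application). The sketch below shows those steps using the standard browser globals; it is meant to run in a browser page, so here we only define the function, and the `signaling` object is a hypothetical stand-in for whatever channel (e.g. WebSocket) your application uses.

```typescript
// Minimal sketch of a WebRTC call setup from the Web API side.
// navigator and RTCPeerConnection are browser globals; the declarations
// below exist only so this sketch type-checks outside a browser.
declare const navigator: any;
declare const RTCPeerConnection: any;

async function startCall(signaling: { send(desc: unknown): void }) {
  // 1. Capture local audio and video from the user's devices.
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
  });

  // 2. Create a peer connection; a STUN server lets peers behind
  //    NAT discover their public addresses via ICE.
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
  });
  for (const track of stream.getTracks()) pc.addTrack(track, stream);

  // 3. Create an offer and hand it to the application's own
  //    signaling channel for delivery to the remote peer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send(offer);

  return pc;
}
```

The remote peer would answer with `createAnswer()` and the two sides exchange ICE candidates the same way; once a candidate pair connects, media flows directly between the browsers.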

While it is still an early technology, there is nothing wrong with taking a look in the WebRTC direction to see what the future holds. For more information, do check the official WebRTC website at www.webrtc.org.