From Snom User Wiki


Author: Asad Khan

This document gives the bandwidth required by different codecs at different packet durations. The figures are presented both as a table and in graph form. They assume uncompressed IP headers; the actual requirement can be considerably lower because of several factors, as explained below.


Header overhead

The standard method of transporting voice packets through IP-based networks requires the addition of three headers: IP, UDP and RTP. An IPv4 header is 20 octets, a UDP header is 8 octets and an RTP header is 12 octets. A total of 40 octets (20 + 8 + 12), or 320 bits, of header information is therefore sent each time a packet containing voice samples is transmitted.

The additional bandwidth occupied by this header information is determined by the number of packets sent each second. For example, if the voice samples in one packet represent a duration of 20 milliseconds, then 50 (1000 ms / 20 ms) packets are required each second. (The choice of payload duration is a compromise between bandwidth requirements and quality.) Each packet carries an IP/UDP/RTP header overhead of 320 bits, so 16,000 (50 × 320) header bits are sent each second. At a packet duration of 20 ms, header information therefore adds 16 kbps to the bandwidth requirement for voice over IP. For example, if an 8 kbps algorithm such as G.729 is used, the total bandwidth required to transmit each voice channel is 24 kbps (8 + 16).

The bandwidth requirements given in this document for the different codecs are for uncompressed IP headers, which, as explained above, take significant bandwidth. Mechanisms that reduce the overhead of the IP, UDP and RTP headers, such as RTP header compression and RTP multiplexing, have not been taken into account. The bandwidth requirement can thus be considerably lower if these mechanisms are in use.
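The header arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of any snom software; the function names `header_overhead_bps` and `total_bandwidth_kbps` are made up for this example, and the header sizes are the uncompressed ones quoted in the text.

```python
# Header sizes in octets, as quoted in the text (uncompressed headers).
IP_HEADER_OCTETS = 20
UDP_HEADER_OCTETS = 8
RTP_HEADER_OCTETS = 12

HEADER_BITS = (IP_HEADER_OCTETS + UDP_HEADER_OCTETS + RTP_HEADER_OCTETS) * 8  # 320


def header_overhead_bps(packet_duration_ms: float) -> float:
    """Header bits sent per second for a given packet duration."""
    packets_per_second = 1000 / packet_duration_ms
    return HEADER_BITS * packets_per_second


def total_bandwidth_kbps(codec_kbps: float, packet_duration_ms: float) -> float:
    """Codec bit rate plus IP/UDP/RTP header overhead, in kbps."""
    return codec_kbps + header_overhead_bps(packet_duration_ms) / 1000


if __name__ == "__main__":
    # The G.729 example from the text: 8 kbps codec, 20 ms packets.
    print(header_overhead_bps(20))      # header bits per second
    print(total_bandwidth_kbps(8, 20))  # total kbps per voice channel
```

For the example in the text, 20 ms packets give 16,000 header bits per second, and an 8 kbps codec therefore needs 24 kbps in total.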

Silence suppression

Silence suppression is another way to reduce the bandwidth requirement. It has not been considered in the bandwidth figures given in this document. With silence suppression, the sender does not transmit packets during silence, and the receiving side generates comfort noise locally (comfort noise generation, CNG). This considerably reduces the amount of data transferred and therefore the bandwidth required.
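As a rough illustration of the saving, the sketch below scales the full-rate bandwidth by the fraction of time a speaker is actually talking. Both the function name `average_bandwidth_kbps` and the talk fraction used in the example are assumptions made for illustration; the source gives no figure for speech activity, and this simple model ignores any comfort-noise update packets a real implementation might send.

```python
def average_bandwidth_kbps(full_rate_kbps: float, talk_fraction: float) -> float:
    """Estimate the average one-way bandwidth with silence suppression.

    full_rate_kbps: bandwidth while speech packets are being sent.
    talk_fraction:  assumed share of time (0.0 to 1.0) with speech present.
    """
    if not 0.0 <= talk_fraction <= 1.0:
        raise ValueError("talk_fraction must be between 0.0 and 1.0")
    # No packets are sent during silence, so only the talking time counts.
    return full_rate_kbps * talk_fraction


if __name__ == "__main__":
    # Hypothetical example: 24 kbps channel, speaker active half of the time.
    print(average_bandwidth_kbps(24, 0.5))
```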

Packets sent in between

During a conversation, while voice packets are being transferred, other packets of 1, 2 or 4 Kbytes (SIP INVITE or other messages) may also be sent or received in between. These delay the voice packets that follow and thus introduce jitter. The jitter buffer on the receiving side schedules packets for playout according to the jitter it has measured. However, if packets are delayed for longer than the buffer allows, the voice becomes choppy. These effects have also not been taken into consideration in the bandwidth requirement.
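One simple way to estimate how long such a packet holds up the voice packets behind it is its transmission time on the link. The sketch below assumes that 1 Kbyte means 1024 bytes and uses a hypothetical 512 kbps link and a hypothetical 60 ms jitter buffer; none of these values come from the source, and the function names are made up for this example.

```python
def transmission_delay_ms(packet_bytes: int, link_kbps: float) -> float:
    """Time needed to put one packet on the wire, in milliseconds.

    bits / (kbit/s) yields milliseconds directly.
    """
    return packet_bytes * 8 / link_kbps


def may_cause_chopping(packet_bytes: int, link_kbps: float, buffer_ms: float) -> bool:
    """True if the extra delay alone exceeds what the jitter buffer can absorb."""
    return transmission_delay_ms(packet_bytes, link_kbps) > buffer_ms


if __name__ == "__main__":
    LINK_KBPS = 512   # assumed link rate, for illustration only
    BUFFER_MS = 60    # assumed jitter buffer depth, for illustration only
    for kbytes in (1, 2, 4):
        size = kbytes * 1024
        delay = transmission_delay_ms(size, LINK_KBPS)
        print(kbytes, "Kbytes:", delay, "ms, chopping risk:",
              may_cause_chopping(size, LINK_KBPS, BUFFER_MS))
```

Under these assumptions a 4 Kbyte packet occupies the link for 64 ms, slightly more than the assumed buffer depth, while the smaller packets stay within it.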


Quality of service (QoS) depends not only on a robustly working VoIP system but also on the network in which that system is running. In addition, many other factors and situations affect the quality of service. If the bandwidth is insufficient, the network is not properly set up, or other factors outside the VoIP system are not working properly, it is very difficult to achieve good voice quality. However, provided all requirements are met and the details explained in this document are kept in mind, there is no reason why a VoIP system should not provide high-quality voice.


Although codecs vary in their quality and delay characteristics and there is not yet an agreed standard, G.723.1 and G.729A are the most common codecs used for Internet voice transmission. Similarly, there is no recommendation on which packet duration to use in different environments, but 20 ms is a good choice for normal Internet conversation with acceptable bandwidth. For office environments where there is almost no bandwidth restriction, G.711 at a packet duration of 20 ms is recommended.

Codec                                     Packet duration         Bandwidth (kbps)
G.711 (PCM), 64 kbps, uncompressed        10 ms (80 samples)      96
G.711 (PCM), 64 kbps, uncompressed        20 ms (160 samples)     80
G.711 (PCM), 64 kbps, uncompressed        30 ms (240 samples)     75
G.711 (PCM), 64 kbps, uncompressed        40 ms (320 samples)     72
G.711 (PCM), 64 kbps, uncompressed        80 ms (640 samples)     68
G.723.1 (ACELP), 5.3 kbps, compressed     30 ms (1 frame)         16
G.723.1 (ACELP), 5.3 kbps, compressed     60 ms (2 frames)        11
G.723.1 (ACELP), 5.3 kbps, compressed     90 ms (3 frames)        9
G.723.1 (MP-MLQ), 6.4 kbps, compressed    30 ms (1 frame)         18
G.723.1 (MP-MLQ), 6.4 kbps, compressed    60 ms (2 frames)        12
G.723.1 (MP-MLQ), 6.4 kbps, compressed    90 ms (3 frames)        10
G.726 (ADPCM), 32 kbps, compressed        10 ms (80 samples)      64
G.726 (ADPCM), 32 kbps, compressed        20 ms (160 samples)     48
G.726 (ADPCM), 32 kbps, compressed        30 ms (240 samples)     43
G.726 (ADPCM), 32 kbps, compressed        40 ms (320 samples)     40
G.726 (ADPCM), 32 kbps, compressed        80 ms (640 samples)     36
G.728 (LD-CELP), 16 kbps, compressed      10 ms (16 vectors)      48
G.728 (LD-CELP), 16 kbps, compressed      20 ms (32 vectors)      32
G.728 (LD-CELP), 16 kbps, compressed      30 ms (48 vectors)      27
G.728 (LD-CELP), 16 kbps, compressed      40 ms (64 vectors)      24
G.728 (LD-CELP), 16 kbps, compressed      80 ms (128 vectors)     20
G.729A (CS-ACELP), 8 kbps, compressed     10 ms (1 frame)         40
G.729A (CS-ACELP), 8 kbps, compressed     20 ms (2 frames)        24
G.729A (CS-ACELP), 8 kbps, compressed     30 ms (3 frames)        19
G.729A (CS-ACELP), 8 kbps, compressed     40 ms (4 frames)        16
G.729A (CS-ACELP), 8 kbps, compressed     80 ms (8 frames)        12
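The table values can be recomputed from the nominal codec bit rate plus 320 header bits per packet. The sketch below is an illustration only; the names `CODEC_RATES_BPS`, `PACKET_DURATIONS_MS` and `bandwidth_kbps` are made up for this example. Rounding up to the next whole kbps is an inference from the published figures (it reproduces every row of the table) and is not stated in the source.

```python
HEADER_BITS = (20 + 8 + 12) * 8  # IPv4 + UDP + RTP headers, uncompressed

# Nominal codec bit rates in bits per second, as listed in the table.
CODEC_RATES_BPS = {
    "G.711 (PCM)": 64000,
    "G.723.1 (ACELP)": 5300,
    "G.723.1 (MP-MLQ)": 6400,
    "G.726 (ADPCM)": 32000,
    "G.728 (LD-CELP)": 16000,
    "G.729A": 8000,
}

# Packet durations in milliseconds used in the table for each codec.
PACKET_DURATIONS_MS = {
    "G.711 (PCM)": (10, 20, 30, 40, 80),
    "G.723.1 (ACELP)": (30, 60, 90),
    "G.723.1 (MP-MLQ)": (30, 60, 90),
    "G.726 (ADPCM)": (10, 20, 30, 40, 80),
    "G.728 (LD-CELP)": (10, 20, 30, 40, 80),
    "G.729A": (10, 20, 30, 40, 80),
}


def bandwidth_kbps(codec_bps: int, packet_duration_ms: int) -> int:
    """Codec rate plus header overhead, rounded up to whole kbps.

    Integer arithmetic is used so the rounding is exact:
    kbps = (codec_bps + HEADER_BITS * 1000 / duration_ms) / 1000
    """
    numerator = codec_bps * packet_duration_ms + HEADER_BITS * 1000
    denominator = packet_duration_ms * 1000
    return -(-numerator // denominator)  # ceiling division


if __name__ == "__main__":
    for codec, rate in CODEC_RATES_BPS.items():
        for duration in PACKET_DURATIONS_MS[codec]:
            print(f"{codec:18} {duration:3d} ms  {bandwidth_kbps(rate, duration):3d} kbps")
```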

