Segment

Protocol

A small chunk of encoded video or audio, typically 2-10 seconds long. Segments are the actual media files that players download and stitch together for playback. Common formats: .ts (MPEG-TS), .m4s (fragmented MP4), .mp4.

Updated Apr 3, 2026

Full Explanation

Segments carry the actual audio and video data that make up a stream. The player reads the manifest to discover segment URLs, downloads them in sequence, decodes them, and stitches them together into continuous playback. Each segment covers a few seconds of content and normally starts on a keyframe, so it can be decoded without the segments before it (fragmented MP4 segments additionally need the rendition's initialization segment).
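As an illustration of the manifest-to-segment step, here is a minimal sketch that pulls segment durations and URLs out of an HLS media playlist. The playlist text, the cdn.example.com URL, and the function name parse_media_playlist are assumptions made up for this example; a real player also handles byte ranges, discontinuities, encryption keys, and live playlist reloads.

```python
from urllib.parse import urljoin


def parse_media_playlist(text, base_url):
    """Return (duration_seconds, absolute_url) for each segment in an HLS media playlist."""
    segments = []
    pending_duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Tag format: "#EXTINF:<duration>,<optional title>"
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            # First non-tag line after #EXTINF is the segment URI (may be relative).
            segments.append((pending_duration, urljoin(base_url, line)))
            pending_duration = None
    return segments


# Hypothetical playlist used only for this example.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
seg_00001.ts
#EXTINF:6.0,
seg_00002.ts
#EXTINF:4.5,
seg_00003.ts
#EXT-X-ENDLIST
"""

segments = parse_media_playlist(SAMPLE, "https://cdn.example.com/video/720p/index.m3u8")
for duration, url in segments:
    print(f"{duration:>4}s  {url}")
```

Relative segment URIs are resolved against the playlist URL, which is how players typically locate segments stored alongside the manifest.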

Typical segment durations range from 2 to 10 seconds, and the choice is a tradeoff. Shorter segments (2s) lower latency because the player can start playing sooner and switch quality faster, but they generate more HTTP requests and reduce compression efficiency, since each segment normally has to begin with a keyframe and carries its own container overhead. Longer segments (10s) compress better and reduce request count but increase startup time and quality-switching delay. Most services land on 4-6 seconds as a good balance.
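The tradeoff can be put into rough numbers. The sketch below assumes a player that buffers three full segments before starting playback; that figure is an assumption for illustration, and real players differ. The function name segment_tradeoff is likewise made up for this example.

```python
def segment_tradeoff(segment_seconds, buffered_segments=3):
    """Return (requests per minute per rendition, seconds buffered before playback).

    buffered_segments is an illustrative assumption, not a fixed player rule.
    """
    requests_per_minute = 60 / segment_seconds
    startup_buffer_seconds = buffered_segments * segment_seconds
    return requests_per_minute, startup_buffer_seconds


for duration in (2, 4, 6, 10):
    rpm, startup = segment_tradeoff(duration)
    print(f"{duration:>2}s segments: {rpm:.0f} requests/min, "
          f"{startup}s of media buffered before playback")
```

Under this assumption, moving from 10-second to 2-second segments cuts the startup buffer to a fifth while multiplying the request rate by five.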

Common container formats include .ts (MPEG Transport Stream, the original HLS format), .m4s (fragmented MP4, used by DASH and by modern HLS via CMAF), and plain .mp4 for progressive download. The industry is moving toward fMP4/.m4s everywhere because the same files work with both HLS and DASH and support features like Common Encryption (CENC).

Segments are ideal CDN objects. For VOD, a segment is immutable once created. It will never change. Set a long cache TTL and forget about it.

# CDN caching strategy for video segments
# VOD segments: immutable, so a long TTL is safe (one day here; up to a year is common)
Cache-Control: public, max-age=86400, immutable

# Live segments: cache for the DVR window duration (a 2-hour window here)
Cache-Control: public, max-age=7200

# Segment file extensions to cache aggressively:
# .m4s (fragmented MP4), .ts (MPEG-TS), .mp4

For live streams, segments are created in real time and have a natural lifetime tied to the DVR window. Once a segment ages out of the window, it can be evicted from cache.
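The caching rules described here could be applied at an origin server roughly as follows. This is a sketch under stated assumptions: the function name cache_control_for, the example paths, and the 7200-second default DVR window are illustrative, and the header values are simply the ones used in this entry.

```python
import posixpath

# Extensions treated as media segments in this entry.
SEGMENT_EXTENSIONS = {".m4s", ".ts", ".mp4"}


def cache_control_for(path, is_live, dvr_window_seconds=7200):
    """Pick a Cache-Control value for a URL path (without query string).

    Returns None for anything that is not a segment, such as a manifest.
    """
    extension = posixpath.splitext(path)[1].lower()
    if extension not in SEGMENT_EXTENSIONS:
        return None
    if is_live:
        # Live segments only need to stay cached while inside the DVR window.
        return f"public, max-age={dvr_window_seconds}"
    # VOD segments never change once written.
    return "public, max-age=86400, immutable"


print(cache_control_for("/vod/movie/720p/seg_00042.m4s", is_live=False))
print(cache_control_for("/live/channel1/720p/seg_91234.ts", is_live=True))
print(cache_control_for("/live/channel1/720p/index.m3u8", is_live=True))
```

Manifests are deliberately left out: a live playlist changes every few seconds and needs a much shorter TTL than the segments it points to.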

Examples

A 90-minute VOD movie encoded at 5 quality levels with 6-second segments produces 900 segments per rendition (5,400 seconds / 6 seconds), or 4,500 segments total. Each one is cached with max-age=86400 on the CDN. A live channel with 6-second segments produces 10 new segments per minute per rendition, and old segments drop out of cache as they leave the DVR window.
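The arithmetic behind these figures can be checked with a few lines. The helper name segment_count is made up for this sketch, and it counts one segment per time slice per rendition, ignoring separate audio or subtitle tracks.

```python
import math


def segment_count(content_seconds, segment_seconds):
    """Number of segments needed to cover the content; the last one may be shorter."""
    return math.ceil(content_seconds / segment_seconds)


per_rendition = segment_count(90 * 60, 6)   # 90-minute film, 6-second segments
total = per_rendition * 5                   # 5 quality levels
live_per_minute = 60 // 6                   # new live segments per minute per rendition

print(per_rendition, total, live_per_minute)
```

Rounding up matters when the running time is not an exact multiple of the segment duration: the final, shorter segment still counts as one object on the CDN.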
