Content Conditioning

Considerations for frame-accurate ad insertion

Content conditioning refers to the process of preparing ABR streaming content to ensure that meaningful parts of it (defined by specific time points) can be retrieved accurately and manipulated. Frame-accurate insertion is crucial for a good viewing experience, both with Live and VOD, in a variety of scenarios:

  • In AVOD scenarios, you want to ensure that ads are inserted accurately between content scenes, for a smooth transition without glitches or awkward jumps.
  • With Live Ad Replacement, you want to ensure that the original ads are replaced perfectly with new ads, without cutting off any of the actual content.
  • It's a similar story (though usually less stringent) with other manipulations such as in Content Replacement and Virtual Channel scenarios, for example if you want to extract and schedule a specific portion of a source asset.

It is the responsibility of your encoder and packager to splice the content to allow frame-accurate insertion:

  • The encoder needs to insert an I-frame (more precisely, an IDR frame) as the first frame after the splice point.
  • The packager needs to ensure that there are segment boundaries at the splice point. A segment will end at the frame before the splice point, and the next segment will start with that IDR frame. Usually this will lead to either shorter or longer segments than the normal cadence around the splice points.
  • The packager will also need to ensure that the HLS and DASH manifests are written in such a way that those segments are clearly identifiable.
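The effect of a splice point on segment boundaries can be illustrated with a minimal Python sketch. It assumes that the packager keeps its regular segment grid and simply adds one extra boundary at each splice point; real packagers may instead restart their cadence after the splice. The function names and the 24-second duration are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def segment_boundaries(total, nominal, splice_points):
    """Return the sorted segment boundaries (in seconds) for one track.

    Assumption: the regular `nominal` grid is kept, and every splice point
    adds one extra boundary on top of it.
    """
    total, nominal = Fraction(total), Fraction(nominal)
    bounds = set()
    t = Fraction(0)
    while t < total:
        bounds.add(t)
        t += nominal
    bounds.add(total)
    # Fractions avoid floating-point drift when accumulating durations.
    bounds.update(Fraction(p) for p in splice_points if 0 < Fraction(p) < total)
    return sorted(bounds)


def segment_durations(bounds):
    """Duration of each segment, derived from consecutive boundaries."""
    return [end - start for start, end in zip(bounds, bounds[1:])]


bounds = segment_boundaries(total=24, nominal=4, splice_points=["14.5"])
durations = [float(d) for d in segment_durations(bounds)]
print(durations)  # [4.0, 4.0, 4.0, 2.5, 1.5, 4.0, 4.0]
```

The 4-second segment that would have spanned 12 s to 16 s is split into a 2.5-second and a 1.5-second segment, so that a new segment (starting with the IDR frame) begins exactly at 14.5 s.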

All media tracks that make up a stream (video, audio) need to be spliced in the same way, so that media segments line up across them all.

Similarly, subtitles and thumbnail tracks will need to be segmented and spliced to correspond to media segments.


In the examples below, normal segments are 4 seconds long, but a splice point was inserted at 14.5 seconds, resulting in two shorter segments (2.5 and 1.5 seconds) around that time point.




<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static" profiles="urn:mpeg:dash:profile:isoff-live:2011">
    <Period id="1" duration="PT1M30S">
        <AdaptationSet contentType="video" segmentAlignment="true" mimeType="video/mp4">
            <SegmentTemplate timescale="600" initialization="init.mp4" media="segment-$Time$.dash">
                <SegmentTimeline>
                    <S t="0" d="2400" r="2" />
                    <S d="1500" />
                    <!-- comment: this is the splice point -->
                    <S d="900" />
                    <S d="2400" r="20" />
                </SegmentTimeline>
            </SegmentTemplate>
            <Representation bandwidth="501000" width="426" height="240" codecs="avc1.42C01E" scanType="progressive" />
        </AdaptationSet>
    </Period>
</MPD>
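To check that a manifest like this one really has a segment boundary at the splice point, the timeline entries can be expanded into absolute start times. The sketch below is one possible way to do it in Python, assuming the `<S>` entries are wrapped in a `SegmentTimeline` element and that `r` is never negative (the "repeat until the end of the Period" form is not handled). The helper name `expand_timeline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The <S> entries from the MPD above, without namespace for simplicity.
TIMELINE = """
<SegmentTimeline>
  <S t="0" d="2400" r="2" />
  <S d="1500" />
  <S d="900" />
  <S d="2400" r="20" />
</SegmentTimeline>
"""


def expand_timeline(xml_text, timescale):
    """Expand <S> entries into (start_seconds, duration_seconds) tuples."""
    segments = []
    t = 0
    for s in ET.fromstring(xml_text).findall("S"):
        t = int(s.get("t", t))       # an explicit @t overrides the running time
        d = int(s.get("d"))
        repeat = int(s.get("r", 0))  # @r counts the additional repetitions
        for _ in range(repeat + 1):
            segments.append((t / timescale, d / timescale))
            t += d
    return segments


segments = expand_timeline(TIMELINE, timescale=600)
starts = [start for start, _ in segments]
print(segments[3], segments[4])  # (12.0, 2.5) (14.5, 1.5)
print(14.5 in starts)            # True
```

With a timescale of 600, a duration of 2400 ticks is 4 seconds, 1500 ticks is 2.5 seconds and 900 ticks is 1.5 seconds, which places the start of the fifth segment at 14.5 seconds.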

Conditioning for AVOD

It remains very common in VOD streaming solutions to deliver subtitle and thumbnail tracks through sidecar files (i.e. files that contain the full track for the whole duration of the VOD asset), even when the audio and video tracks are segmented. It is just as common for the URIs of those files to be issued to the players outside of the manifest.

Unfortunately, in the majority of cases, and for the majority of players, this type of delivery is incompatible with DAI scenarios. Inserting ads in the middle of the content alters the logical timeline of the stream, whereas the subtitle timeline remains unchanged. Subtitles therefore go out of sync with the other media tracks after the first ad has played.
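The drift can be quantified with a small Python sketch. It assumes a simplified model in which each ad break shifts everything after it by the full duration of the ads; the ad positions and durations below are hypothetical values, not taken from a real asset.

```python
def stitched_time(content_time, ad_breaks):
    """Map a time on the original content timeline to the stitched timeline.

    ad_breaks: list of (position_in_content_seconds, ad_duration_seconds).
    Every ad break located at or before `content_time` pushes it later.
    """
    shift = sum(duration for position, duration in ad_breaks
                if position <= content_time)
    return content_time + shift


# Hypothetical mid-rolls: 30 s of ads at 5 min, 60 s of ads at 15 min.
ad_breaks = [(300, 30), (900, 60)]

for cue in (120, 600, 1200):
    drift = stitched_time(cue, ad_breaks) - cue
    print(f"cue at {cue}s is shown {drift}s too early")
# cue at 120s is shown 0s too early
# cue at 600s is shown 30s too early
# cue at 1200s is shown 90s too early
```

A sidecar subtitle cue stays at its original content time, while the matching audio and video have moved later on the stitched timeline, so the error grows with every ad break that has played.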

Instead of sidecar files, you should therefore do the following to ensure that those tracks are taken into consideration in the manipulation:

  • Use a segmented output
  • Ensure that information about those tracks is provided in the manifest

The format to use (among those supported) will largely depend on the capabilities of the target players and clients.

Conditioning for Live Ad Replacement

In a Live Ad Replacement scenario, the live stream contains the original ads, as well as SCTE-35 markers providing the time when those ads start (the "cue-out") and stop (the "cue-in", when the main content resumes).

In this case it is important that the encoder and packager have spliced the content with the same timing as in the SCTE-35 markers.
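One way to verify this alignment is to compare the marker time against the start times of the media segments. The following Python sketch assumes that the wall-clock time of the first segment and the list of segment durations have already been read from the playlist; the dates, durations and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def segment_starts(first_start, durations):
    """Wall-clock start time of each segment, given the first start time."""
    starts = []
    t = first_start
    for d in durations:
        starts.append(t)
        t += timedelta(seconds=d)
    return starts


def cue_is_aligned(cue_time, starts, tolerance=timedelta(milliseconds=1)):
    """True if the cue falls on a segment boundary, within the tolerance."""
    return any(abs(cue_time - s) <= tolerance for s in starts)


# Hypothetical live window: 4 s segments, spliced at 14.5 s.
first = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
starts = segment_starts(first, [4, 4, 4, 2.5, 1.5, 4])

cue_out = datetime.fromisoformat("2024-01-01T12:00:14.500+00:00")
print(cue_is_aligned(cue_out, starts))                         # True
print(cue_is_aligned(cue_out + timedelta(seconds=0.5), starts))  # False
```

A cue that does not land on a boundary means that replacing the ad would either cut into the main content or leave part of the original ad visible.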


In this HLS example, media segments are 4 seconds long, but because of the SCTE-35 marker, the content is spliced so that a media segment boundary falls exactly on the marker's START-DATE attribute value.