Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!boulder!newsfeed.berkeley.edu!howland.erols.net!nntp.abs.net!news.chicago.winstar.net!wasp.rahul.net!samba.rahul.net!rahul.net!a2i!jfm.a2i!jfm
From: [email protected]
Newsgroups: rec.video.desktop,comp.graphics.animation,comp.multimedia,comp.os.ms-windows.video,news.answers,rec.answers,comp.answers
Subject: AVI Graphics Format Overview
Followup-To: rec.video.desktop,comp.graphics.animation,comp.multimedia,comp.os.ms-windows.video
Date: 3 Aug 1999 21:04:10 GMT
Organization: a2i network
Lines: 11014
Approved: [email protected]
Message-ID: <[email protected]>
NNTP-Posting-Host: foxtrot.rahul.net
NNTP-Posting-User: jfm
Summary: Answers to many commonly asked questions about AVI files, Video For Windows, and DirectShow (formerly ActiveMovie). This includes how to convert to and from other video formats, playing, editing, and authoring AVI files as well as information on programming.
Xref: senator-bedfellow.mit.edu rec.video.desktop:132629 comp.graphics.animation:67035 comp.multimedia:88846 comp.os.ms-windows.video:21513 news.answers:164215 rec.answers:51576 comp.answers:37262
Archive-name: graphics/avi-faq
Posting-Frequency: monthly
Last-modified: 1999/8/1
Version: 1.276
URL: http://www.rahul.net/jfm/avi.html
Copyright: (c) 1996-1999 John F. McGowan, Ph.D.
Maintainer: John F. McGowan, Ph.D. <[email protected]>
<HTML>
<META NAME="Description" CONTENT="John McGowan's AVI Overview">
<META NAME="Keywords" CONTENT="AVI, Video, Video for Windows, ActiveMovie, DirectShow, NetShow">
<META HTTP-EQUIV="keywords" CONTENT="AVI, Video, Video for Windows, ActiveMovie, DirectShow, NetShow">
<HEAD><TITLE>John McGowan's AVI Overview</TITLE></HEAD>
<BODY BGCOLOR="#FFFFFF">
<PRE>
AVI Overview
by John F. McGowan, Ph.D.
(c) 1996-1999, John F. McGowan
http://www.rahul.net/jfm/
----------------------------------------------------------------------------
</PRE>
<!-- This document is best viewed using an HTML browser. However, -->
<!-- it has been composed with limited use of HTML so that it can -->
<!-- be used as a plain text file as well. -->
<!-- Release $Id: avi.html,v 1.276 1999/07/22 17:16:26 jfm Exp $ -->
<!-- Permission to copy and distribute this document is granted -->
<!-- so long as the title, author's name, and URL are retained -->
<!-- Any additions or modifications made to the original should be -->
<!-- clearly marked as such. -->
<!-- The author welcomes and encourages suggested changes and -->
<!-- additions to the overview. Contributors will be credited. -->
<PRE>
<A NAME="Top">
What is in this Overview?
</A>
- Overview of Video for Windows, DirectShow (ActiveMovie), and AVI
<A HREF="#Definition">What is AVI?</A>
<A HREF="#Disclaimer">Disclaimer</A>
<A HREF="#Get">How to Get the AVI Overview</A>
<A HREF="#New">WHAT'S NEW</A>
Brief Table of Contents
<A HREF="#Common">Most Common AVI Question: What does "could not find vids:xxxx ..." error mean?</A>
<A HREF="#Util">UTILITIES, SYSTEM ADMINISTRATION, AUTHORING, ETC.</A>
<A HREF="#Install">Installation, Configuration, and Other Issues</A>
<A HREF="#AVIAndWEB">AVI and the WORLDWIDE WEB</A>
<A HREF="#Codecs">Audio and Video Codecs</A>
<A HREF="#Business">Business and Economics of AVI</A>
PROGRAMMING/TECHNICAL TOPICS
<A HREF="#WinProg">Microsoft Windows Video Programming</A>
<A HREF="#MMProg">Multimedia Technical Information</A>
<A HREF="#WinDriver">Microsoft Windows Device Drivers</A>
<A HREF="#Glossary">Glossary</A>
<A HREF="#Chronology">Chronology</A>
Detailed Table of Contents
<A HREF="#Common">Most Common AVI Question: What does "could not find vids:xxxx ..." error mean?</A>
<A NAME="Util">UTILITIES, SYSTEM ADMINISTRATION, AUTHORING, ETC.</A>
- <A HREF="#Play">How to play an AVI file?</A>
- DOS
- Windows
- Macintosh
- Unix
- VAX/VMS
- Amiga
- OS/2
- How to convert AVI to various audio/video formats.
- <A HREF="#ToMPEG">MPEG (.MPG Files)</A>
- <A HREF="#ToMOV">QuickTime (.MOV or .MooV)</A>
- <A HREF="#ToGIF89a">Animated GIFs (GIF89a)</A>
- <A HREF="#ToASF">Microsoft ASF (Active Streaming Format)</A>
- <A HREF="#ToSequence">Sequence of Still Images in Separate Files</A>
- <A HREF="#ToSmacker">Smacker (.SMK Files)</A>
- <A HREF="#ToRM">Progressive Networks RealMedia Streaming Format (.RM Files)</A>
- How to convert other audio/video formats to AVI
- <A HREF="#FromMOV">QuickTime (.MOV or .MooV)</A>
- <A HREF="#FromSequence">Sequence of Still Images in Separate Files</A>
- <A HREF="#FromAutodesk">Autodesk Animation (FLI or FLC)</A>
- <A HREF="#FromMPEG">MPEG (.MPG)</A>
- <A HREF="#FromGIF89a">Animated GIF (.GIF)</A>
- <A HREF="#Bitmaps">How to Convert a Sequence of Still Images in One Format to a Sequence in Another Format</A>
- Authoring AVI Files
CREATING AVI FILES
- <A HREF="#ScreenCapture">How to capture screen to AVI files</A>
- <A HREF="#AuthorAVI">Multimedia Authoring Tools to Create AVI Files</A>
- <A HREF="#LW">How to import AVI files into Lightwave</A>
- <A HREF="#Capture">How to create AVI files from analog video (Video Capture Cards)</A>
- From VHS tapes and video cameras
- From Hi8 tapes and video cameras
- <A HREF="#VfWCapture">Video Capture under Video for Windows</A>
- <A HREF="#CaptureCards">Video Capture Cards</A>
- <A HREF="#ParCapture">Video Capture through PC Parallel Port</A>
- <A HREF="#TBC">What to do about horizontal tearing in the video?</A>
- <A HREF="#HDCapture">Hard Drive Video Capture Issues</A>
- <A HREF="#NTCapture">Video Capture Cards with Windows NT Drivers</A>
- <A HREF="#TV">How to create AVI files from Television</A>
- <A HREF="#Morph">How to Create Morph Effects for AVI Files</A>
- <A HREF="#Audio">How to compress the audio sound track in AVI files</A>
MODIFYING AND EDITING AVI FILES
- <A HREF="#FPS">How to change frame rate of AVI files</A>
- <A HREF="#Crop">How to crop an AVI file</A>
- <A HREF="#Edit">How to edit AVI files</A>
- VidEdit
- Personal AVI Editor
- MGI VideoWave
- Corel Lumiere Suite for 32-bit Windows
- Ulead Media Studio Pro
- Adobe Premiere
- in:sync SpeedRazor
- Asymetrix Digital Video Producer (DVP)
- Fast Movie Processor
- Peck's Power Join
- <A HREF="#Audio">How to compress the audio sound track in AVI files</A>
- <A HREF="#ToNTSC">How to create NTSC (or PAL) Safe AVI</A>
- <A HREF="#Phantom">The Phantom Final Frame when Viewing an AVI</A>
- <A HREF="#BinEd">Binary File Editors for Viewing and Editing AVI</A>
- <A HREF="#VidTrace">RIFF and AVI Parser/Viewers</A>
- John McGowan's VidTrace
- Microsoft RIFFWALK
- Bill Luken's RIFFSCAN
- <A HREF="#Wave">Editing and converting WAV files</A>
- <A HREF="#MacSound">Editing and converting Sound Files on Macintosh</A>
MISCELLANEOUS AUTHORING QUESTIONS
- <A HREF="#Output">How to output AVI files to videotape</A>
- <A HREF="#Size">Size limits on AVI files</A>
- <A HREF="#Corel">How to Fix Problem with AVI files from CorelMove 4.0</A>
<A NAME="Install">Installation, Configuration, and Other Issues</A>
INSTALLATION AND CONFIGURATION
- <A HREF="#VfW16">Where to get the 16-bit Video for Windows for Windows 3.x</A>
- <A HREF="#Win95">Reinstalling Microsoft's Video-for-Windows in Windows 95</A>
- <A HREF="#GETAM">How to get ActiveMovie 1.0</A>
- <A HREF="#NT40">Installing and configuring AVI Codecs in Windows NT 4.0</A>
- <A HREF="#Extension">How to give AVI files a different extension in Windows 3.1</A>
- <A HREF="#AVI95">How AVI Files are Handled in Windows 95</A>
INFORMATION SOURCES
- <A HREF="#Biblio">Bibliography of sources of information on Video for Windows and AVI</A>
- <A HREF="#News">Internet Newsgroups with Information on AVI and Video </A>
- <A HREF="#VideoStandards">Where to find information
on digital audio and video standards other than AVI.</A>
MISCELLANEOUS QUESTIONS
- <A HREF="#VideoChips">PC Video Card and Video Chips</A>
- <A HREF="#NTVideo">Video Cards with Windows NT Drivers</A>
- <A HREF="#Word">How to embed an AVI file in a Microsoft Word Document</A>
- <A HREF="#Names">Microsoft's Changing Names</A>
- <A HREF="#Misc">Answers to miscellaneous other frequently asked questions about AVI </A>
- <A HREF="#Health"> AVI and Your Health (Eye Strain)</A>
<A NAME="AVIAndWEB">AVI and the WORLDWIDE WEB</A>
- <A HREF="#Style">Effective use of video on a Web page</A>
- <A HREF="#Web">How to embed an AVI file in a Web page</A>
- <A HREF="#NS">Configuring Netscape Navigator 3.0x to Display AVI Files</A>
- <A HREF="#Plug-ins">Netscape Navigator Plug-ins to play AVI</A>
- <A HREF="#IE">Configuring Internet Explorer 3.0x to Display AVI Files</A>
- <A HREF="#Mail">Sending AVI by E-Mail or Network News Postings</A>
- <A HREF="#Crypto">How to encrypt AVI Files</A>
- <A HREF="#MIME">MIME types of AVI</A>
- <A HREF="#HTTPD">Configuring Web Servers to Handle AVI Files</A>
- Apache
- CERN (or W3C)
- NCSA HTTPd
- Microsoft Internet Information Server 3.0
- Netscape Enterprise Server 3.0
- <A HREF="#Java">AVI and Java</A>
- <A HREF="#VRML">AVI and VRML</A>
- <A HREF="#NetShow">AVI and NetShow</A>
- <A HREF="#ToASF">Converting AVI to Microsoft Active Streaming Format (ASF) Files</A>
- <A HREF="#Content">Sources of AVI Video Clips on the Web</A>
- <A HREF="#AVILBR">Low Bit Rate AVI for the Web</A>
REAL-TIME OR STREAMING VIDEO OVER IP NETWORKS
- <A HREF="#Limits">Limitations of AVI and Video for Windows over Networks</A>
- <A HREF="#NetShow">NetShow</A>
- <A HREF="#Names">Microsoft's Changing Names</A>
- <A HREF="#RFC">Internet Video Standards and Pseudo-Standards</A>
<A NAME="Codecs">AUDIO and VIDEO CODECS</A>
- <A HREF="#Codec">Video for Windows compressors and decompressors</A>
WHAT THEY ARE, WHERE TO GET THEM, WHICH WORK BEST!
- The Old Guard
- <A HREF="#DIB">Full Frames (Uncompressed)</A>
- <A HREF="#ColorFormats">Color Formats</A>
- <A HREF="#RT21">Intel Real Time Video 2.1 (Indeo 2.1?) (RT21)</A>
- <A HREF="#IV32">Indeo 3.2/3.1</A>
- <A HREF="#MRLE">Microsoft Run Length Encoding</A>
- <A HREF="#MSVC">Microsoft Video 1</A>
- <A HREF="#CVID">Cinepak</A>
- <A HREF="#MJPG">Motion JPEG</A>
- <A HREF="#XMPG">Editable MPEG</A>
- The New Wave
- <A HREF="#VDOW">VDOWave (VDOLive)</A>
- <A HREF="#IV41">Indeo Video Interactive (Indeo 4.1)</A>
- <A HREF="#IV50">Indeo Video Interactive (Indeo 5.x)</A>
- <A HREF="#UCOD">ClearVideo (aka RealVideo)</A>
- <A HREF="#SFMC">SFM (Surface Fitting Method)</A>
- <A HREF="#QPEG">QPEG</A>
- <A HREF="#H261">H.261</A>
- <A HREF="#H263">H.263</A>
- Microsoft H.263
- Vivo Software H.263
- Intel I263 H.263
- Shannon Communication Systems (SCS) H.263+
- Telenor R&D H.263
- <A HREF="#MPG4">MPEG-4</A>
- <A HREF="#LS">Lightning Strike (Infinop)</A>
- <A HREF="#VxTreme">VxTreme</A>
Video Codecs NOT Available for AVI
- <A HREF="#Sorenson">Sorenson Video</A>
- <A HREF="#BestCodec">Which AVI video codec is best?</A>
- <A HREF="#CodecPerformance">Performance of the AVI Codecs</A>
A table with typical compression ratios of
Video for Windows codecs.
- <A HREF="#QTCodec">Which Video for Windows codecs are
supported by QuickTime on the Apple Macintosh?</A>
- <A HREF="#VfWInstalled">How to determine which codecs are installed</A>
- <A HREF="#WhichAVICodec">How to determine which codec was used to compress an AVI file</A>
- <A HREF="#FourCC">Microsoft Four Character Codes (FOURCC)</A>
- <A HREF="#FOURCCGUID">Microsoft GUIDs for Video for Windows Codecs</A>
- <A HREF="#ColorFormats">Color Formats</A>
- <A HREF="#ALGO">Video Compression Technologies</A>
<A HREF="#RLE">Run Length Encoding</A>
<A HREF="#VQ">Vector Quantization</A>
<A HREF="#DCT">Discrete Cosine Transform</A>
<A HREF="#DWT">Discrete Wavelet Transform</A>
<A HREF="#Contour">Contour-Based Image Coding</A>
<A HREF="#FD">Frame Differencing</A>
<A HREF="#Motion">Motion Compensation</A>
- <A HREF="#ACM">Audio Codecs</A>
- <A HREF="#ACMInstalled">How to determine which Audio Codecs are Installed</A>
<A HREF="#Glossary">GLOSSARY</A>
<A HREF="#Chronology">CHRONOLOGY</A>
PROGRAMMING/TECHNICAL TOPICS
<A NAME="WinProg">MICROSOFT WINDOWS PROGRAMMING </A>
THE OLD REGIME
- <A HREF="#WINMM">Windows Multimedia System</A>
- <A HREF="#VFW">Video for Windows</A>
- <A HREF="#WAVE">Wave (Waveform Audio)</A>
- <A HREF="#Format">AVI file format</A>
- RIFF Files
- Original AVI File Format
- <A HREF="#OpenDML">OpenDML AVI File Format Extensions</A>
- <A HREF="#AVISpec">Where to get the exact specification of AVI?</A>
- <A HREF="#AVIDIB">AVI and Windows Bitmaps (DDB, DIB, ...)</A>
THE NEW WAVE
- <A HREF="#ActiveMovie">ActiveMovie</A>
- <A HREF="#GUID">GUID's and AVI</A>
- <A HREF="#DirectShow">DirectShow (ActiveMovie 2.0)</A>
- <A HREF="#DirectDraw">DirectDraw</A>
- <A HREF="#MMX">MMX</A>
- <A HREF="#ActiveX">ActiveX</A>
- <A HREF="#Names">Microsoft's Changing Names</A>
HOW TO PROGRAM IN WINDOWS
- <A HREF="#AviPlay">Playing an AVI file within a Windows Application</A>
- <A HREF="#AviWrite">Reading and Writing an AVI file within a Windows Application</A>
<A NAME="MMProg">USEFUL INFORMATION FOR AVI AND VIDEO PROGRAMMING (NOT WINDOWS)</A>
SOURCE CODE
- <A HREF="#AVISRC">Where to get C source code for an AVI Player Including Many Codecs</A>
- <A HREF="#JPEGSRC">Where to get C source code for a JPEG Encoder or Decoder</A>
- <A HREF="#H263SRC">Where to get C source code for an H.263 Video Encoder or Decoder</A>
- <A HREF="#MPEGSRC">Where to get C source code for an MPEG Video Encoder or Decoder</A>
- <A HREF="#WAVELETSRC">Where to get C/C++ Source Code for Wavelet Image Compression</A>
TECHNICAL INFORMATION
- <A HREF="#COLORFAQ">Where to get an explanation of Color, Color Spaces, Gamma and All That</A>
- <A HREF="#FileFormats">Where to get Detailed Information on Graphics File Formats?</A>
- <A HREF="#AudioFmts">Where to get Detailed Information on Audio File Formats?</A>
USEFUL INFORMATION FOR NETWORKED VIDEO PROGRAMMING
- <A HREF="#RFC">Internet Video Standards</A>
- MIME
- RTP (Real Time Protocol)
- RSVP (Resource Reservation Protocol)
- IP Multicast
- UDP (User Datagram Protocol)
<A NAME="WinDriver">WINDOWS DEVICE DRIVERS AND VIDEO</A>
- <A HREF="#Driver">What is a driver?</A>
- <A HREF="#GDI">GDI Device Drivers</A>
- <A HREF="#DHAL">DirectDraw Hardware Abstraction Layer</A>
- <A HREF="#VXD">Virtual Device Drivers</A>
- <A HREF="#NTDM">Windows NT Driver Model</A>
- <A HREF="#WDM">Win32 Driver Model (WDM)</A>
- <A HREF="#INF">Setup Information Files</A>
<A HREF="#Awards">Awards</A>
<A HREF="#Credits">Credits</A>
ABOUT THE AUTHOR
- John McGowan is a software engineer with experience
in digital audio and video on PC/Windows, Unix/X Windows, and
PowerMacintosh platforms. He has developed commercial MPEG-1 and
MPEG-2 player software. His experience includes development,
optimization, and implementation of audio, video, and still image
compression and decompression algorithms in C/C++ on Intel, MIPS,
SPARC, and PowerPC based platforms. He has also developed Microsoft
Windows user interface software. He has a Ph.D. in physics from the
University of Illinois at Urbana-Champaign and a B.S. in physics from
the California Institute of Technology.
- <A HREF="http://www.rahul.net/jfm/index.html">John McGowan's Home Page</A>
<A NAME="Disclaimer">
<H2>Disclaimer</H2>
</A>
In no event shall John McGowan or other contributors be liable for
direct, indirect, special, incidental or consequential damages
arising out of the use or inability to use information, software,
bitstreams and other data found on or referenced by the AVI Graphics
Overview.
Permission to copy and distribute this document is granted so long as
the title, author's name, URL, and this disclaimer are retained. Any
additions or modifications made to the original should be clearly
marked as such. The author welcomes and encourages suggested changes
and additions to the overview. Contributors will be credited.
<A HREF="#Top">Return to Top</A>
<A NAME="Get">
<H2>How to Get the AVI Overview</H2>
</A>
The AVI Overview is available at:
<A HREF="http://www.rahul.net/jfm/avi.html">http://www.rahul.net/jfm/avi.html</A>
If you are in a Web Browser such as Internet Explorer or Netscape
Navigator, you can save a copy of the page you are viewing to your
local hard disk as an HTML File.
In Internet Explorer, Select File | Save As...
In Netscape Navigator, Select File | Save As...
Using FTP (File Transfer Protocol):
<A HREF="ftp://ftp.rahul.net/pub/jfm/avi/avi.html">ftp://ftp.rahul.net/pub/jfm/avi/avi.html</A>
<A HREF="#Top">Return to Top</A>
<A NAME="New">
<H2>What's New</H2>
</A>
(May 11, 1999)
Aachen, Germany
MainConcept releases Linux version of Main Actor Video Editor.
Linux is a free implementation of the Unix operating system
for the IBM PC-compatible and other platforms.
(April 15, 1999)
Aachen, Germany
MainConcept releases Main Actor Video Editor 3.0 for
Microsoft Windows 95/98 and Microsoft Windows NT 4.0.
(April 13, 1999)
RealNetworks acquires Xing Technology Corporation.
(March 21, 1999)
XAnim 2.80.1 released on March 21, 1999. XAnim is a video and animation
player for the X Window System and Unix. It includes support for
AVI files.
XAnim 2.80.1 contains some minor changes to XAnim 2.80.0 which was
released on March 14, 1999.
(March 14, 1999)
XAnim 2.80.0 released on March 14, 1999. XAnim is a video and animation
player for the X Window System and Unix. It includes support for
AVI files.
According to the XAnim Web site:
XAnim 2.80.0 is now ready for consumption. In addition to several new
video codecs, the new revision also supports dynamically loadable
video decompression libraries. This means you no longer need to
recompile xanim each time a new video codec is released or
upgraded. There are currently dll's for: Creative CYUV, Radius
Cinepak, Intel Indeo 3.2, Intel Indeo 4.1, Intel Indeo 5.0, CCITT
H.261 and CCITT H.263.
(Feb. 2, 1999)
Intel Indeo 5.10 video codec released, supersedes 5.06.
(November 5, 1998) Guillaume de Bailliencourt writes:
John,
I'm an anonymous reader of your FAQ for years.
First, congratulation for your work in this FAQ !
I've just released a software MJPEG codec and main features are :
Decompress hardware M-JPEG AVI files (Rainbow Runner, DC30, ...) without the
capture hardware.
Win9x, WinNT, Video for Windows, ActiveMovie & DirectShow compliant.
MMX and 3DNow! optimized.
DirectDraw YUV accelerated output supported (YUY2 & UYVY).
Most of the M-JPEG formats supported (4:2:2, 4:1:1, mjpg, dmb1, jpeg).
You can download it at www.morgan-multimedia.com
For me it is better than the Paradigm Matrix codec. I'll send you benchmark
..
For the moment it has been tested on :
Play back (decompression) :
Paradigm Matrix software codec compressed AVI Files in every resolutions &
compressions.
Matrox Rainbow Runner AVI files in every resolutions & compressions.
Miro/Pinnacle DC30 AVI files in 384 x 288 & 720x540
Fast Screen Machine II + MJPEG card AVI file in 368 x 276
AVI files created with MainActor "Software & Hardware MJPEG output"
AVI file converted from a 'jpeg' QuickTime file with SmartVid
Compressed with my codec & played back with :
Paradigm Matrix software codec
Matrox Rainbow Runner
Best regards,
Guillaume de Bailliencourt
(January 6, 1999)
Radius Incorporated, the Cinepak company, renames itself
Digital Origin Incorporated.
(August 24, 1998)
[email protected] writes:
Hi there,
We have an H263+ avi codec and analysis tool at www.shansys.com
(August 3, 1998) David Gartner of Equilibrium writes:
Equilibrium adds AVI with sound support to DeBabelizer Pro 4.5
..
**AVI Video with Sound**
DeBabelizer Pro 4.5's new full AVI support enables users to batch process
legacy Video for Windows files for use on most any Macintosh and Windows
systems for the Web, CD-ROM or kiosk. Video for Windows (AVI) was built
into Windows 95 and NT and runs only on Windows machines. Now, with a few
keystrokes, DeBabelizer Pro 4.5 users can automatically optimize, convert,
and compress tens, hundreds or thousands of videos to QuickTime 3.0,
animated GIFs or a variety of other cross-platform video and animation
formats.
..
(June 6, 1998) Wolfgang Hesseler writes:
Hello,
Hello, I just wanted to let you know that I've released the new
QuickView 2.30. It now supports a bunch of new video codecs like
Motion JPEG, several audio codecs and QuickTime video codecs.
Please update your FAQ. Thanks.
(May 4, 1998) Microsoft plans to release first public test version
of NetShow 3.0
(March 23, 1998) The AVI Overview selected as an
'Outstanding Page' by the PC Webopaedia
(November 12, 1997) Microsoft has a new NetShow distribution, NetShow
2.1. NetShow 2.1 adds support for RealNetworks (formerly Progressive
Networks) RealVideo and RealAudio, NetShow clients for Windows 3.1,
MacOS, and the Linux, Solaris, SunOS, and HP-UX versions of Unix, and
TheaterServer for streaming broadcast-quality video over high-bandwidth
networks such as ATM and fast Ethernet. Microsoft has invested in
RealNetworks within the last few months.
(October 30, 1997) Wolfgang Hesseler writes about his AVI viewer
for DOS:
Hello, I just wanted to let you know that I've released the new
QuickView 2.20. Besides supporting more hardware and MOV files it
supports the QPEG codec. Please update your FAQ. Thanks.
(September 8, 1997) Microsoft distributes Advanced Streaming
Format (ASF) Specification for a "Public Design Review".
<A HREF="http://www.microsoft.com/asf/">Microsoft ASF Page</A>
(September, 1997) Lernout and Hauspie Speech Products forms
a strategic partnership with and receives investment capital
from Microsoft, $45 million according to some reports. BT Alex Brown
acted as financial advisor. Lernout and Hauspie audio codecs are
used in Microsoft's NetShow product.
(September, 1997) avi2mpg1 released. A Windows 95/NT console
application to convert AVI to MPEG-1.
(August 5, 1997) Microsoft acquires VxTreme (wavelet based
streaming video) for its NetShow product line.
<A HREF="http://www.vxtreme.com/">VxTreme Inc.</A>
Intel's Indeo Video Interactive 5.0 software is
now available on Intel Web site.
Wolfgang Hesseler announces version 2.13 of QuickView, an AVI
player for DOS (July 28, 1997)
MainConcept announces version 1.1 of MainActor shareware.
MainActor can convert between AVI and many video, animation,
and image formats. (July 9, 1997)
Marcus Moenig of MainConcept writes:
John,
well here comes the press release of v1.1 We now support full MPEG-I and
MPEG-II without audio. So you can now convert MPEG into AVI and vice versa.
I dont want to get on your nerves on what MainActor can and cannot do but
we also support full Motion JPEG for AVIs. Even interlaced JPEG from Miro
and FAST hardware can now be read and written by MainActor.
---End---
RAD Game Tools announces a new version of their Smacker
utilities, including the ability to read and write AVI
files with optimized 8 bit color palettes. (June 27, 1997)
<A HREF="#Top">Return to Top</A>
<A NAME="Definition">
<H2>What is AVI?</H2>
</A>
AVI stands for Audio Video Interleave. It is a special case
of the RIFF (Resource Interchange File Format). AVI is defined by
Microsoft. AVI is the most common format for audio/video data on the
PC. AVI is an example of a de facto (by fact) standard.
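As a rough illustration of what "a special case of RIFF" means in
practice, the Python sketch below inspects the first 12 bytes of a
file: the four characters RIFF, a 32-bit little-endian size, and the
form type "AVI " (with a trailing space). The function names
looks_like_avi and riff_size are made up for this example and are not
part of any Microsoft API.

```python
def looks_like_avi(header: bytes) -> bool:
    """Return True if the first 12 bytes look like a RIFF file of form type 'AVI '."""
    if len(header) < 12:
        return False
    chunk_id = header[0:4]     # should be b"RIFF"
    form_type = header[8:12]   # should be b"AVI " (note the trailing space)
    return chunk_id == b"RIFF" and form_type == b"AVI "


def riff_size(header: bytes) -> int:
    """Size field of the RIFF chunk: bytes 4..7, stored little-endian."""
    return int.from_bytes(header[4:8], "little")
```

To try it on a real file, read 12 bytes in binary mode, for example
open("movie.avi", "rb").read(12), where movie.avi is a placeholder
file name, and pass the result to looks_like_avi.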
<A HREF="#Top">Return to Top</A>
<A NAME="WINMM">
<H2>Windows Multimedia System</H2>
</A>
In Win16 and Win32, Microsoft created a partially unified system
for handling multimedia. This system consists of the high level
Media Control Interface or MCI Application Programming Interface (API)
and associated MCI drivers. Playback of AVI files can be controlled
through the high level MCI API and the MCIAVI.DRV MCI driver.
The Windows Multimedia System also provides a number of low level
API's such as the WAVE API for waveform audio and associated
device drivers such as the WAVE device drivers for sound cards.
Under Windows NT 4.0, the MCI and low level API's are stored in
the file WINMM.DLL.
The API's are:
MCI (high level API - useful for AVI playback)
joy (joystick devices)
midi (MIDI devices)
mixer (MIXER devices)
wave (waveform audio input and output devices)
mmio (low level functions to parse RIFF files)
time (timers etc.)
aux (auxiliary sound device)
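The mmio functions identify RIFF chunks by four character codes
(FOURCC). The sketch below shows the usual packing of four ASCII
characters into a 32-bit value, with the first character in the
low-order byte. The names make_fourcc and fourcc_to_str are invented
for this example; padding short codes with spaces and rejecting long
ones are choices made here, not a statement of how the Windows
functions behave.

```python
def make_fourcc(code: str) -> int:
    """Pack up to four ASCII characters into a 32-bit integer, first character in the low byte."""
    raw = code.encode("ascii")
    if len(raw) > 4:
        raise ValueError("a FOURCC has at most four characters")
    raw = raw.ljust(4, b" ")   # assumption of this sketch: pad short codes with spaces
    return int.from_bytes(raw, "little")


def fourcc_to_str(value: int) -> str:
    """Unpack a 32-bit FOURCC value back into its four characters."""
    return value.to_bytes(4, "little").decode("ascii")
```

For example, make_fourcc("RIFF") gives 0x46464952, and passing that
value to fourcc_to_str gives back "RIFF".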
When a program opens an AVI file through MCI, the Multimedia
System locates and loads the appropriate MCI driver
(MCIAVI.DRV in 16-bit Windows or MCIAVI32.DLL in 32-bit Windows)
and passes MCI commands such as MCI_PLAY to the driver.
The MCIAVI driver then calls Video for Windows to decompress
the video, GDI (or another graphics API) to display the
decoded frames, and WAVE to output the decoded audio samples.
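From the application's side, the sequence above can be sketched with
MCI command strings. This is a minimal sketch, assuming a Windows
machine with the MCIAVI driver installed. The alias "movie" and the
helper names avi_play_commands and play_avi are invented for the
example. The commands are sent with mciSendStringW, which appears in
the WINMM.DLL export list; on any other platform play_avi simply
raises an error.

```python
import sys


def avi_play_commands(path: str, alias: str = "movie") -> list:
    """Build the MCI command strings to open, play, and close an AVI file."""
    return [
        f'open "{path}" type avivideo alias {alias}',
        f"play {alias} wait",
        f"close {alias}",
    ]


def play_avi(path: str) -> None:
    """Send the command strings to WINMM.DLL (Windows only)."""
    if sys.platform != "win32":
        raise OSError("MCI is only available on Windows")
    import ctypes
    winmm = ctypes.windll.winmm
    for command in avi_play_commands(path):
        # Arguments: command string, return buffer, buffer length, callback window
        error = winmm.mciSendStringW(command, None, 0, None)
        if error != 0:
            raise RuntimeError(f"MCI command failed ({error}): {command}")
```

The string form is used here because it is easy to read; the same
steps can also be issued as command messages such as MCI_PLAY.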
A dump of the functions exported by winmm.dll under NT 4.0
generated with the Microsoft DUMPBIN.EXE utility follows:
Microsoft (R) COFF Binary File Dumper Version 5.00.7022
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.
Dump of file winmm.dll
File Type: DLL
Section contains the following Exports for WINMM.dll
0 characteristics
31EC70B4 time date stamp Tue Jul 16 21:48:52 1996
0.00 version
2 ordinal base
197 number of functions
197 number of names
ordinal hint name
3 0 CloseDriver (000026CE)
4 1 DefDriverProc (00005AF4)
5 2 DriverCallback (0000254E)
6 3 DrvGetModuleHandle (00001D37)
7 4 GetDriverModuleHandle (00001D37)
8 5 MigrateAllDrivers (00013E79)
9 6 MigrateMidiUser (00013E60)
10 7 MigrateSoundEvents (00011A3C)
11 8 NotifyCallbackData (0000B2C2)
12 9 OpenDriver (00002036)
13 A PlaySound (00008ACB)
2 B PlaySoundA (00008ACB)
14 C PlaySoundW (00009AE1)
15 D SendDriverMessage (00001000)
16 E WOW32DriverCallback (0000C448)
17 F WOW32ResolveMultiMediaHandle (0000CC3C)
18 10 WOWAppExit (00009D3F)
19 11 aux32Message (0000C507)
20 12 auxGetDevCapsA (0000A3FD)
21 13 auxGetDevCapsW (00008C77)
22 14 auxGetNumDevs (00006AE4)
23 15 auxGetVolume (0000A4A1)
24 16 auxOutMessage (00008BFF)
25 17 auxSetVolume (0000A4C9)
26 18 joy32Message (0000C768)
27 19 joyConfigChanged (0000AE40)
28 1A joyGetDevCapsA (0000A99A)
29 1B joyGetDevCapsW (0000AB40)
30 1C joyGetNumDevs (0000AB96)
31 1D joyGetPos (0000ABAA)
32 1E joyGetPosEx (0000ABFD)
33 1F joyGetThreshold (0000AC5C)
34 20 joyReleaseCapture (0000ACA8)
35 21 joySetCapture (0000ACFC)
36 22 joySetThreshold (0000AE06)
37 23 mci32Message (00007566)
38 24 mciDriverNotify (00007006)
39 25 mciDriverYield (00008727)
40 26 mciExecute (0000D92C)
41 27 mciFreeCommandResource (000035CE)
42 28 mciGetCreatorTask (0000DCD5)
43 29 mciGetDeviceIDA (0000DCA3)
44 2A mciGetDeviceIDFromElementIDA (0000DBC6)
45 2B mciGetDeviceIDFromElementIDW (0000DBF5)
46 2C mciGetDeviceIDW (00005372)
47 2D mciGetDriverData (0000158B)
48 2E mciGetErrorStringA (0000DA46)
49 2F mciGetErrorStringW (0000352F)
50 30 mciGetYieldProc (0000E1F3)
51 31 mciLoadCommandResource (00002A75)
52 32 mciSendCommandA (000015D4)
53 33 mciSendCommandW (000014A1)
54 34 mciSendStringA (00004927)
55 35 mciSendStringW (00004A24)
56 36 mciSetDriverData (000058BD)
57 37 mciSetYieldProc (000034C9)
58 38 mid32Message (0000BDFD)
59 39 midiConnect (0001019E)
60 3A midiDisconnect (0001018C)
61 3B midiInAddBuffer (0001004A)
62 3C midiInClose (0000FF42)
63 3D midiInGetDevCapsA (0000FCCC)
64 3E midiInGetDevCapsW (0000FC71)
65 3F midiInGetErrorTextA (0000FDEB)
66 40 midiInGetErrorTextW (0000FDB2)
..
SetInfo (0000EBF4)
140 8A mmioStringToFOURCCA (0000ED9A)
..
(00008BC5)
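Each complete row of the export listing above has four fields:
ordinal, hint, name, and an address in parentheses. The following
sketch pulls those fields out of one row. It assumes, from the rows
shown, that ordinals are decimal and that hints and addresses are
hexadecimal. The name parse_export_row is made up for this example,
and incomplete rows are skipped by returning None.

```python
import re

# ordinal (decimal), hint (hexadecimal), name, address (eight hex digits in parentheses)
EXPORT_ROW = re.compile(
    r"^\s*(\d+)\s+([0-9A-Fa-f]+)\s+(\S+)\s+\(([0-9A-Fa-f]{8})\)\s*$"
)


def parse_export_row(line: str):
    """Return (ordinal, hint, name, address) for one export row, or None if it is not a full row."""
    match = EXPORT_ROW.match(line)
    if match is None:
        return None
    ordinal, hint, name, address = match.groups()
    return int(ordinal), int(hint, 16), name, int(address, 16)
```

Applied to every line of a listing, this gives a simple table that can
be searched by function name or sorted by address.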
Under Windows 3.x and Windows 95, the DLL MMSYSTEM.DLL (short for
MultiMedia System) contains the multimedia API's.
<A HREF="#Top">Return to Top</A>
<A NAME="VFW">
<H2>Video for Windows</H2>
</A>
Video for Windows is an entire system for handling video
in Microsoft Windows. It was introduced for MS Windows 3.1. The
original Video for Windows is a collection of 16-bit
Windows utilities, dynamic link libraries, and other
components.
The AVI file format is a central part of Video
for Windows.
Microsoft Visual C++ 5.0 has a Video for Windows
include file Vfw.h which contains the various API's that
make up Video for Windows:
* COMPMAN - Installable Compression Manager.
* DRAWDIB - Routines for drawing to the display.
* VIDEO - Video Capture Driver Interface
*
* AVIFMT - AVI File Format structure definitions.
* MMREG - FOURCC and other things
*
* AVIFile - Interface for reading AVI Files and AVI Streams
* MCIWND - MCI/AVI window class
* AVICAP - AVI Capture Window class
*
* MSACM - Audio compression manager.
Microsoft released Video for Windows 1.0 for
Windows 3.1 in November 1992, followed by Video for Windows 1.1. There
have been several versions of Video for Windows 1.1,
identified by a trailing alphabetical character such as
1.1e. The last version of Video for
Windows 1.1 for Windows 3.x is Video for Windows 1.1e.
This is available by ftp from Microsoft.
Microsoft has provided a 32-bit version of Video for Windows
for Windows 95, while threatening to replace Video for Windows with
ActiveMovie. This version has 32-bit versions of the Video
for Windows codecs such as Cinepak. Other DLL's in the
Video for Windows for Windows 95 are also 32-bit. How much of the Video for
Windows in Windows 95 is 32-bit code is not clear; many of the
codecs are clearly 32-bit codecs. Nor is it clear how much has been
changed or modified besides the conversion to 32-bit code.
Windows NT 3.5, 3.51 and Windows NT 4.0 include a Video for Windows for
NT. Presumably this is strictly 32-bit. It is not clear how
much code is shared between the NT Video for Windows and the
Windows 95 Video for Windows. Note that hardware device
drivers are different between Windows 95 and NT 3.5/3.51/4.0.
ActiveMovie 1.0 and DirectShow (formerly ActiveMovie 2.0) are
32-bit successors to Video for Windows for both Windows 95
and Windows NT. These support AVI files. ActiveMovie started
out life under the code name Quartz; early Beta releases of
ActiveMovie were known as Quartz.
ActiveMovie 1.0 is bundled with Windows 95b (OEM Service Release 2.x)
and Internet Explorer 3.x/4.x for Windows 95. It can also be downloaded
and installed in Windows 95 separately. Note that ActiveMovie 1.0
does NOT completely replace Video for Windows. For example, ActiveMovie
1.0 does not provide a video capture mechanism. Video capture still
uses Video for Windows capture drivers.
ActiveMovie 1.0 is a 32 bit software component that can run in NT's
user mode. It runs under Windows NT 4.0 as well as Windows 95.
DirectShow (ActiveMovie 2.0) will supposedly add a number of new
features including video capture support, kernel mode streaming, and
miscellaneous other features.
VIDEO FOR WINDOWS under WINDOWS NT 4.0
Under NT 4.0, Video for Windows is implemented as a collection of
32-bit DLL's in the Microsoft 32-bit Common Object File Format (COFF).
These are usually located in the \WINNT\SYSTEM32 directory
where Windows NT stores most of the system DLL's, drivers, and so
forth.
MSVFW32.DLL ( Microsoft Video for Windows DLL - NT 4.0 )
AVIFIL32.DLL ( AVIFILE API for Reading and Writing AVI Files and Streams )
AVICAP32.DLL ( AVI Capture Window Class )
MCIAVI32.DLL ( Video for Windows MCI Driver )
MSACM32.DRV ( Microsoft Audio Compression Manager )
MSACM32.DLL ( more Microsoft Audio Compression Manager )
MSRLE32.DLL ( Microsoft RLE Video Codec )
IR32_32.DLL ( Intel Indeo 3.2 Video Codec )
MSVIDC32.DLL ( Microsoft Video 1 Codec )
ICCVID.DLL ( Cinepak for Windows 32 - Radius )
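To see which of the files listed above are present on a given machine,
a short script can look in the system directory. This is only a
sketch: the function name find_vfw_files is invented, and the
directory to check (usually \WINNT\SYSTEM32, as noted above) has to be
supplied by the caller. File names are compared without regard to
case.

```python
import os

# File names taken from the list above
VFW_NT4_FILES = [
    "MSVFW32.DLL", "AVIFIL32.DLL", "AVICAP32.DLL", "MCIAVI32.DLL",
    "MSACM32.DRV", "MSACM32.DLL", "MSRLE32.DLL", "IR32_32.DLL",
    "MSVIDC32.DLL", "ICCVID.DLL",
]


def find_vfw_files(system_dir: str) -> dict:
    """Map each Video for Windows file name to True if it exists in system_dir."""
    present = {name.upper() for name in os.listdir(system_dir)}
    return {name: name in present for name in VFW_NT4_FILES}
```

A missing codec DLL in the result is one possible reason why an AVI
file compressed with that codec will not play.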
What is in MSVFW32.DLL?
MSVFW32.DLL includes the DRAWDIB, Installable Compression
Manager or ICM, and MCI Windows components of Video for
Windows. Other components are stored in other DLL's.
This is a dump of the functions exported by MSVFW32.DLL
Version 4.00
Microsoft (R) COFF Binary File Dumper Version 5.00.7022
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.
Dump of file msvfw32.dll
File Type: DLL
Section contains the following Exports for MSVFW32.dll
0 characteristics
31EC70E9 time date stamp Tue Jul 16 21:49:45 1996
0.00 version
2 ordinal base
47 number of functions
47 number of names
ordinal hint name
3 0 DrawDibBegin (00001E14)
4 1 DrawDibChangePalette (00008C30)
5 2 DrawDibClose (0000888A)
6 3 DrawDibDraw (000010A6)
7 4 DrawDibEnd (00008BEC)
8 5 DrawDibGetBuffer (00008EFC)
9 6 DrawDibGetPalette (00002F97)
10 7 DrawDibOpen (00003E0A)
11 8 DrawDibProfileDisplay (00003EBA)
12 9 DrawDibRealize (00001D49)
13 A DrawDibSetPalette (00001C0D)
14 B DrawDibStart (00002EEB)
15 C DrawDibStop (00002F42)
16 D DrawDibTime (00008C2B)
17 E GetOpenFileNamePreview (0000C7DC)
18 F GetOpenFileNamePreviewA (0000C7DC)
19 10 GetOpenFileNamePreviewW (0000C6A5)
20 11 GetSaveFileNamePreviewA (0000C7EC)
21 12 GetSaveFileNamePreviewW (0000C7CC)
22 13 ICClose (000035E0)
23 14 ICCompress (00004CE5)
24 15 ICCompressorChoose (00005F61)
25 16 ICCompressorFree (00005615)
26 17 ICDecompress (00004D4B)
27 18 ICDraw (0000106A)
28 19 ICDrawBegin (00001B95)
29 1A ICGetDisplayFormat (00004D8E)
30 1B ICGetInfo (00004C60)
31 1C ICImageCompress (00005A96)
32 1D ICImageDecompress (00005D2A)
33 1E ICInfo (00002FEB)
34 1F ICInstall (00004574)
35 20 ICLocate (0000372E)
36 21 ICMThunk32 (0000841C)
37 22 ICOpen (0000337C)
38 23 ICOpenFunction (00003B53)
39 24 ICRemove (0000488B)
40 25 ICSendMessage (00001000)
41 26 ICSeqCompressFrame (000059A7)
42 27 ICSeqCompressFrameEnd (00005907)
43 28 ICSeqCompressFrameStart (000056E4)
44 29 MCIWndCreate (0000C988)
45 2A MCIWndCreateA (0000C988)
46 2B MCIWndCreateW (0000C8CC)
47 2C MCIWndRegisterClass (0000C83F)
48 2D StretchDIB (00009D13)
2 2E VideoForWindowsVersion (000041D1)
Summary
8000 .data
3000 .rdata
2000 .reloc
3000 .rsrc
11000 .text
VIDEO FOR WINDOWS FOR WINDOWS 95
Microsoft distributed a new Video for Windows for Windows 95
while emphasizing Quartz/ActiveMovie/DirectShow in its
marketing.
Video for Windows 95 Files
MSRLE32.DLL (32-bit Microsoft RLE Video Codec)
IR32_32.DLL (32-bit Indeo 3.2 Video Codec)
ICCVID.DLL (32-bit Radius Cinepak Video Codec)
MSVIDC32.DLL (32-bit Microsoft Video 1 Video Codec )
MSVIDEO.DLL (16-bit Video for Windows DLL)
MCIAVI.DRV (16-bit AVI Video MCI Driver)
AVIFILE.DLL (16-bit AVIFILE)
AVICAP.DLL (16-bit AVICAP)
AVICAP32.DLL (32-bit AVICAP)
MSVFW32.DLL (32-bit Video for Windows DLL - with VfW API)
AVIFIL32.DLL (32-bit AVIFILE)
SUMMARY
Video for Windows 1.0 (Windows 3.x)
Video for Windows 1.1 (a-e) (Windows 3.x)
Video for Windows (Windows 95 - has 32-bit codecs and other 32-bit DLL's)
Video for Windows (Windows NT 3.5, 3.51, and 4.0)
Quartz (Betas of ActiveMovie) (Windows 95)
ActiveMovie 1.0 (Windows 95 and NT 4.0)
ActiveMovie 2.0 (DirectShow) (probably Windows 97/98/Memphis and NT 5.0)
<A HREF="#Top">Return to Top</A>
<A NAME="WAVE">
<H2>WAVE</H2>
</A>
The Microsoft Windows audio (sound) input/output system, commonly
referred to as Wave or WAVE, predates Video for Windows, which is
wrapped around WAVE in various ways. The audio tracks in AVI files
are simply waveform audio (or WAV) data used by the wave system.
Video for Windows parses the AVI files, extracts the WAV data, and
pipes the WAV data to the WAVE system. Video for Windows handles the
video track if present.
Traditionally, audio input and output devices such as Sound Blaster
Cards have a WAVE audio input/output driver to play WAV (waveform
audio) files.
The simplest waveform audio file consists of a header followed by
Pulse Code Modulation (PCM) sound data, usually uncompressed 8 or 16
bit sound samples. WAVE also provides a mechanism for audio codecs.
See elsewhere in the AVI Overview for further information on audio
codecs and audio compression.
WAVE is present in Windows 3.1 and Windows 95. A different WAVE
system is present in Windows NT 3.5, 3.51, and 4.0. At least the
hardware device drivers for sound cards must be different in NT.
ActiveMovie appears to be replacing WAVE.
<A HREF="#Top">Return to Top</A>
<A NAME="Format">
<H2>What is the AVI File Format?</H2>
</A>
AVI Files are a special case of RIFF files. RIFF is
the Resource Interchange File Format. This is a general
purpose format for exchanging multimedia data types
that was defined by Microsoft and IBM during their
long forgotten alliance.
Kevin McKinnon writes:
In fact, RIFF is a clone of the IFF format invented by Electronic Arts in
1984. They invented the format for Deluxe Paint on the Amiga, and IFF
quickly became the standard for interchange on that platform,
maintained eventually by Commodore right up 'til it's demise. EA also
ported Deluxe Paint to the PC platform and brought IFF with it.
IFF even used the 4-character headers (FourCC), though at the time it was
simply called a LONGWORD that some clever people decided to pair into
four charcter because they looked good in #define's. ;)
RIFF is so close to IFF that the good IFF parser routines will (mostly)
correctly parse RIFF files.
----End of Kevin----
Further information on the IFF format is available at:
<A HREF="
http://www.ipahome.com/gff/textonly/summary/iff.htm">
http://www.ipahome.com/gff/textonly/summary/iff.htm</A>
<H3>RIFF Files</H3>
RIFF files are built from
(1) RIFF Form Header
'RIFF' (4 byte file size) 'xxxx' (data)
where 'xxxx' identifies the specialization (or form)
of RIFF. 'AVI ' for AVI files.
where the data is the rest of the file. The
data is comprised of chunks and lists. Chunks
and lists are defined immediately below.
(2) A Chunk
(4 byte identifier) (4 byte chunk size) (data)
The 4 byte identifier is a human readable sequence
of four characters such as 'JUNK' or 'idx1'
(3) A List
'LIST' (4 byte list size) (4 byte list identifier) (data)
where the 4 byte identifier is a human readable
sequence of four characters such as 'rec ' or
'movi'
where the data is comprised of LISTS or CHUNKS.
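As a rough illustration of the chunk and list layout described above,
the following C sketch walks a RIFF image held in memory and counts its
ordinary chunks. The names riff_le32 and riff_count_chunks are made up
for this example and are not part of any Microsoft API. The sketch relies
on two facts about RIFF that are not spelled out above: sizes are stored
least significant byte first, and a chunk with an odd size is followed by
one pad byte.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit little-endian value (RIFF sizes are little-endian). */
uint32_t riff_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the ordinary chunks (not lists) found in buf[0..len),
   descending into 'RIFF' and 'LIST' containers.
   Returns -1 if a size field runs past the end of the buffer. */
int riff_count_chunks(const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    int count = 0;

    while (pos + 8 <= len) {
        uint32_t size = riff_le32(buf + pos + 4);
        const unsigned char *data = buf + pos + 8;

        if (size > len - pos - 8)
            return -1;                  /* truncated or corrupt */

        if (memcmp(buf + pos, "RIFF", 4) == 0 ||
            memcmp(buf + pos, "LIST", 4) == 0) {
            int inner;
            if (size < 4)
                return -1;              /* no room for the form/list id */
            /* skip the 4 byte form or list identifier, then recurse */
            inner = riff_count_chunks(data + 4, size - 4);
            if (inner < 0)
                return -1;
            count += inner;
        } else {
            count++;                    /* an ordinary chunk */
        }
        /* advance past header, data, and the pad byte if size is odd */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return count;
}
```

A real parser would dispatch on the identifiers ('hdrl', 'movi', 'idx1',
and so on) instead of just counting, but the traversal is the same.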
<H3>AVI File Format</H3>
AVI is a specialization or "form" of RIFF, described below:
'RIFF' (4 byte file length) 'AVI ' // file header (a RIFF form)
'LIST' (4 byte list length) 'hdrl' // list of headers for AVI file
The 'hdrl' list contains:
'avih' (4 byte chunk size) (data) // the AVI header (a chunk)
'strl' lists of stream headers for each stream (audio, video, etc.) in
the AVI file. An AVI file can contain zero or one video stream and
zero, one, or many audio streams. For an AVI file with one video and
one audio stream:
'LIST' (4 byte list length) 'strl' // video stream list (a list)
The video 'strl' list contains:
'strh' (4 byte chunk size) (data) // video stream header (a chunk)
'strf' (4 byte chunk size) (data) // video stream format (a chunk)
'LIST' (4 byte list length) 'strl' // audio stream list (a list)
The audio 'strl' list contains:
'strh' (4 byte chunk size) (data) // audio stream header (a chunk)
'strf' (4 byte chunk size) (data) // audio stream format (a chunk)
'JUNK' (4 byte chunk size) (data - usually all zeros) // an OPTIONAL junk chunk to align on 2K byte boundary
'LIST' (4 byte list length) 'movi' // list of movie data (a list)
The 'movi' list contains the actual audio and video data.
This 'movi' list contains one or more ...
'LIST' (4 byte list length) 'rec ' // list of movie records (a list)
'##wb' (4 byte chunk size) (data) // sound data (a chunk)
'##dc' (4 byte chunk size) (data) // video data (a chunk)
'##db' (4 byte chunk size) (data) // video data (a chunk)
A 'rec ' list (a record) contains the audio and video data for a single frame.
'##wb' (4 byte chunk size) (data) // sound data (a chunk)
'##dc' (4 byte chunk size) (data) // video data (a chunk)
'##db' (4 byte chunk size) (data) // video data (a chunk)
The 'rec ' list need not be used for AVI files with only audio or only
video data. I have seen video only uncompressed AVI files that did
not use the 'rec ' list, only '00db' chunks. The 'rec ' list is used
for AVI files with interleaved audio and video streams, but it
may also be used for AVI files with only video.
## in '##dc' refers to the stream number. For example, video data chunks
belonging to stream 0 would use the identifier '00dc'. A chunk of
video data contains a single video frame.
Alexander Grigoriev writes ...
John,
##dc chunk was intended to keep compressed data, whereas ##db chunk
nad(sic) to be used for uncompressed DIBs (device independent bitmap),
but actually they both can contain compressed data. For example,
Microsoft VidCap (more precisely, video capture window class) writes
MJPEG compressed data in ##db chunks, whereas Adobe Premiere writes
frames compressed with the same MJPEG codec as ##dc chunks.
----End of Alexander
The ##wb chunks contain the audio data.
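The naming scheme above can be sketched in C as follows. This is only an
illustration: avi_classify_chunk and the enum are hypothetical names, and
the sketch assumes the two stream digits are decimal. Following
Alexander's remark, '##dc' and '##db' are both treated simply as video.

```c
#include <ctype.h>

enum avi_chunk_kind { AVI_CHUNK_OTHER, AVI_CHUNK_AUDIO, AVI_CHUNK_VIDEO };

/* Classify a 4 character 'movi' chunk identifier such as "00dc" or
   "01wb". For audio and video chunks the stream number is stored in
   *stream; for anything else *stream is left untouched. */
enum avi_chunk_kind avi_classify_chunk(const char id[4], int *stream)
{
    enum avi_chunk_kind kind = AVI_CHUNK_OTHER;

    if (!isdigit((unsigned char)id[0]) || !isdigit((unsigned char)id[1]))
        return AVI_CHUNK_OTHER;

    if (id[2] == 'w' && id[3] == 'b')
        kind = AVI_CHUNK_AUDIO;                 /* ##wb: sound data */
    else if (id[2] == 'd' && (id[3] == 'c' || id[3] == 'b'))
        kind = AVI_CHUNK_VIDEO;                 /* ##dc or ##db: video data */

    if (kind != AVI_CHUNK_OTHER)
        *stream = (id[0] - '0') * 10 + (id[1] - '0');
    return kind;
}
```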
The audio and video chunks in an AVI file do not contain
time stamps or frame counts. The data is ordered in time sequentially as
it appears in the AVI file. A player application should display the
video frames at the frame rate indicated in the headers. The
application should play the audio at the audio sample rate indicated
in the headers. Usually, the streams are all assumed to start at
time zero since there are no explicit time stamps in the AVI file.
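Since the chunks carry no time stamps, a player has to compute
presentation times from the rates in the headers. A minimal sketch,
assuming the Microseconds Per Frame value from 'avih' and the Average
Bytes Per Second value from the audio 'strf' chunk, and assuming constant
bit rate audio such as PCM. The function names are hypothetical.

```c
#include <stdint.h>

/* Time, in microseconds from the start of the file, at which video
   frame number frame_index (counting from 0) should be displayed. */
uint64_t avi_video_time_us(uint32_t frame_index, uint32_t usec_per_frame)
{
    return (uint64_t)frame_index * usec_per_frame;
}

/* Time, in microseconds from the start of the file, of the audio data
   located byte_offset bytes into the audio stream. Returns 0 if the
   header rate is 0 (an invalid header). */
uint64_t avi_audio_time_us(uint64_t byte_offset, uint32_t avg_bytes_per_sec)
{
    if (avg_bytes_per_sec == 0)
        return 0;
    return byte_offset * 1000000u / avg_bytes_per_sec;
}
```

Both functions assume the streams start at time zero, as described above.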
The lack of time stamps is a weakness of the original AVI file
format. The OpenDML AVI Extensions add new chunks with time
stamps. Microsoft's ASF (Advanced or Active Streaming Format), which
Microsoft claims will replace AVI, has time stamp "objects".
In principle, a video chunk contains a single frame of video. By
design, the video chunk should be interleaved with an audio chunk
containing the audio associated with that video frame. The data
consists of pairs of video and audio chunks. These pairs may be
encapsulated in a 'rec ' list. Not all AVI files obey this simple
scheme. There are even AVI files with all the video followed by all
of the audio; this is not the way an AVI file should be made.
The 'movi' list may be followed by:
'idx1' (4 byte chunk size) (index data) // an optional index into movie (a chunk)
The optional index contains a table of offsets to each
chunk within the 'movi' list. The 'idx1' index supports rapid
seeking to frames within the video file.
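The following C sketch shows how a program might use the index to find a
particular frame. It assumes each index entry is 16 bytes (identifier,
flags, offset, length, each 4 bytes, little-endian), which is my reading
of the AVIINDEXENTRY structure in Vfw.h; check the header before relying
on it. The struct and function names are invented for the example. Note
that some files store offsets relative to the 'movi' list and others
relative to the start of the file, so the offset needs interpreting.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* One decoded 'idx1' entry (layout assumed from AVIINDEXENTRY). */
struct avi_index_entry {
    char     id[4];     /* chunk identifier, e.g. "00dc" */
    uint32_t flags;
    uint32_t offset;    /* position of the chunk */
    uint32_t length;    /* size of the chunk data in bytes */
};

static uint32_t idx_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find the n-th (counting from 0) entry whose identifier equals id in
   the raw 'idx1' chunk data. Returns 1 and fills *out on success,
   0 if there is no such entry. */
int avi_index_find(const unsigned char *idx, size_t idx_len,
                   const char id[4], uint32_t n,
                   struct avi_index_entry *out)
{
    size_t pos;

    for (pos = 0; pos + 16 <= idx_len; pos += 16) {
        if (memcmp(idx + pos, id, 4) != 0)
            continue;                   /* entry for another stream */
        if (n > 0) {
            n--;                        /* not the one we want yet */
            continue;
        }
        memcpy(out->id, idx + pos, 4);
        out->flags  = idx_le32(idx + pos + 4);
        out->offset = idx_le32(idx + pos + 8);
        out->length = idx_le32(idx + pos + 12);
        return 1;
    }
    return 0;
}
```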
The 'avih' (AVI Header) chunk contains the following information:
Total Frames (for example, 1500 frames in an AVI)
Streams (for example, 2 for audio and video together)
InitialFrames
MaxBytes
BufferSize
Microseconds Per Frame
Frames Per Second (for example, 15 fps)
Size (for example 320x240 pixels)
Flags
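As a sketch of how these fields might be read, the C code below pulls a
few values out of the raw 'avih' chunk data. The byte offsets follow my
understanding of the MainAVIHeader structure in Vfw.h (a run of 32-bit
little-endian values) and should be checked against that header; the
struct and function names here are hypothetical. Frames Per Second is
computed from Microseconds Per Frame.

```c
#include <stdint.h>
#include <stddef.h>

/* A few fields of the 'avih' chunk, decoded. */
struct avi_main_header {
    uint32_t usec_per_frame;
    uint32_t total_frames;
    uint32_t streams;
    uint32_t width;
    uint32_t height;
};

static uint32_t avih_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 'avih' chunk data. Returns 1 on success, 0 if the chunk
   is too short to hold the fields read here. */
int avi_parse_avih(const unsigned char *data, size_t size,
                   struct avi_main_header *h)
{
    if (size < 40)
        return 0;
    h->usec_per_frame = avih_le32(data + 0);    /* assumed offset 0  */
    h->total_frames   = avih_le32(data + 16);   /* assumed offset 16 */
    h->streams        = avih_le32(data + 24);   /* assumed offset 24 */
    h->width          = avih_le32(data + 32);   /* assumed offset 32 */
    h->height         = avih_le32(data + 36);   /* assumed offset 36 */
    return 1;
}

/* Frame rate derived from the microseconds per frame field. */
double avi_frames_per_second(const struct avi_main_header *h)
{
    return h->usec_per_frame ? 1000000.0 / h->usec_per_frame : 0.0;
}
```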
The 'strh' (Stream Header) chunk contains the following information:
Stream Type (for example, 'vids' for video 'auds' for audio)
Stream Handler (for example, 'cvid' for Cinepak)
Samples Per Second (for example 15 frames per second for video)
Priority
InitialFrames
Start
Length (for example, 1500 frames for video)
Length (sec) (for example 100 seconds for video)
Flags
BufferSize
Quality
SampleSize
For video, the 'strf' (Stream Format) chunk contains the following
information:
Size (for example 320x240 pixels)
Bit Depth (for example 24 bit color)
Colors Used (for example 236 for palettized color)
Compression (for example 'cvid' for Cinepak)
For audio, the 'strf' (Stream Format) chunk contains the following
information:
wFormatTag (for example, WAVE_FORMAT_PCM)
Number of Channels (for example 2 for stereo sound)
Samples Per Second (for example 11025)
Average Bytes Per Second (for example 11025 for 8 bit mono sound at 11025 samples per second)
nBlockAlign
Bits Per Sample (for example 8 or 16 bits)
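For uncompressed PCM, the fields above are related by simple arithmetic:
the block alignment is the size of one sample across all channels, and
the average bytes per second is the sample rate times the block
alignment. A small C sketch, with a hypothetical struct and function name
(the real structures are in the Windows headers):

```c
#include <stdint.h>

/* Mirror of the audio 'strf' fields listed above, for PCM only. */
struct pcm_format {
    uint16_t format_tag;          /* 1 is WAVE_FORMAT_PCM */
    uint16_t channels;
    uint32_t samples_per_sec;
    uint32_t avg_bytes_per_sec;
    uint16_t block_align;
    uint16_t bits_per_sample;
};

/* Fill in a PCM format, deriving block_align and avg_bytes_per_sec. */
struct pcm_format pcm_format_make(uint16_t channels,
                                  uint32_t samples_per_sec,
                                  uint16_t bits_per_sample)
{
    struct pcm_format f;

    f.format_tag        = 1;
    f.channels          = channels;
    f.samples_per_sec   = samples_per_sec;
    f.bits_per_sample   = bits_per_sample;
    /* bytes for one sample on every channel */
    f.block_align       = (uint16_t)(channels * bits_per_sample / 8);
    f.avg_bytes_per_sec = samples_per_sec * f.block_align;
    return f;
}
```

For example, 8 bit mono at 11025 samples per second gives 11025 bytes
per second, while 16 bit stereo at 44100 samples per second gives 176400.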
Each 'rec ' list contains the sound data and video data for a single
frame in the sound data chunk and the video data chunk.
Other chunks are allowed within the AVI file. For example, I have
seen info lists such as
'LIST' (4 byte list size) 'INFO' (chunks with information on video)
These chunks that are not part of the AVI standard are simply
ignored by the AVI parser. AVI can be and has been extended by adding
lists and chunks not in the standard. The 'INFO' list is a registered
global form type (across all RIFF files) to store information that
helps identify the contents of a chunk.
The sound data is typically 8 or 16 bit PCM, stereo or mono,
sampled at 11, 22, or 44.1 KHz. Traditionally, the sound has
been uncompressed Windows PCM. With the advent of
the WorldWide Web and the severe bandwidth limitations of the
Internet, there has been increasing use of audio codecs. The
wFormatTag field in the audio 'strf' (Stream Format) chunk
identifies the audio format and codec.
<A NAME="OpenDML">
<H3>OpenDML AVI File Format Extensions</H3>
</A>
The Open Digital Media (OpenDML) Consortium has defined the OpenDML
AVI File Format Extensions, which extend AVI to support a variety of
features required for professional video production. These include
support for fields (not just frames), file sizes larger than 1 GB,
timecodes, and many other features. Microsoft has reportedly
incorporated OpenDML AVI support in DirectShow 5.1 (ActiveMovie 5.1).
It is also used by various professional video applications for the PC,
in particular Matrox's DigiSuite software.
The Open Digital Media Consortium AVI File Format Extensions
add new lists and chunks to the AVI file which contain extra
data such as timecodes not incorporated in the original AVI
standard.
OpenDML appears to have been spearheaded by Matrox to improve AVI
for professional video authoring and editing. Matrox makes a variety
of PC video products such as DigiSuite for professional and broadcast
video authoring and editing. The OpenDML AVI File Format Extensions
are primarily for the Motion JPEG AVI files used for professional
video authoring and editing. The OpenDML effort seems to have been
pushed to one side with the advent of ActiveMovie, NetShow, Advanced
(formerly Active) Streaming Format (ASF) Files, and other Microsoft
initiatives.
On Oct. 2, 1997, the OpenDML AVI File Format Extensions Version 1.02
specification document (dated February 28, 1996) was available at
the Matrox Electronic Systems, Ltd. Web site at:
<A HREF="
http://www.matrox.com/videoweb/odmlff2.htm">
http://www.matrox.com/videoweb/odmlff2.htm</A>
The specification is in Adobe Portable Document Format (PDF). Since
Matrox seems to rearrange their site from time to time and one can't
always find the specification, I've included a link to a copy of the PDF
version of the specification on my Web site.
<A HREF="
http://www.rahul.net/jfm/odmlff2.pdf">PDF OpenDML AVI File Format Extensions Specification Document</A>
<A HREF="
http://www.adobe.com/prodindex/acrobat/readstep.html">Get Adobe Acrobat Reader (PDF Viewer)</A>
<A NAME="AVISpec">
<H3>Where to get the exact AVI specification?</H3>
</A>
Microsoft Visual C++ 5.0 has a Video for Windows include
file Vfw.h which gives the exact AVI data structures such as
the various headers used in AVI files. The file also has
comments explaining the structure of the AVI file.
Video for Windows refers to the AVI Format by the mnemonic
AVIFMT. At one time, the format information was apparently
stored in an AVIFMT.H header file. The format
information now appears consolidated in Vfw.h.
In addition to the Video for Windows header files, Chapter Four of the
Video for Windows Programmer's Guide, "AVI Files", gives a detailed
specification of the AVI file format.
<A HREF="#Top">Return to Top</A>
<A NAME="AVIDIB">
<H2>AVI and Windows Bitmaps (DDB, DIB, ...)</H2>
</A>
Microsoft Windows represents bitmapped images internally and in files
as Device Dependent Bitmaps (DDB), Device Independent Bitmaps (DIB), and
DIB Sections. Uncompressed 'DIB ' AVI files represent video frames as
DIB's. Various multimedia API's that work with AVI use Windows
bitmapped images.
Prior to Windows 3.0, Windows relied on Device Dependent Bitmaps for
bitmapped images. A DDB is stored in a format understood by the
device driver for a particular video card. As the name suggests, DDB's
are not generally portable.
The structure of a DDB is:
typedef struct tagBITMAP { // bm
LONG bmType; /* always zero */
LONG bmWidth; /* width in pixels */
LONG bmHeight; /* height in pixels */
LONG bmWidthBytes; /* bytes per line of data */
WORD bmPlanes; /* number of color planes */
WORD bmBitsPixel; /* bits per pixel */
LPVOID bmBits; /* pointer to the bitmap pixel data */
} BITMAP;
Usually the pixel data immediately follows the BITMAP header.
(BITMAP header)(Pixel Data)
The HBITMAP handles used by GDI are handles to Device Dependent Bitmaps.
The GDI functions BitBlt and StretchBlt are actually using Device
Dependent Bitmaps.
With Windows 3.0, Microsoft introduced the Device Independent Bitmap or
DIB, the reigning workhorse of bitmapped images under Windows. The DIB
provided a device independent way to represent bitmapped images, both
monochrome and color.
Windows retains DDB's despite the introduction of the DIB. For
example, to use a DIB, you might call:
hBitmap = CreateDIBitmap(...)
CreateDIBitmap creates a DDB from a DIB, returning the GDI HBITMAP
handle of the DDB for further GDI calls. At a low level, Windows
and GDI are still using DDB's.
DIB files have a standard header that identifies the format, size,
and color palette (if applicable) of the bitmapped image. The header
is a BITMAPINFO structure.
typedef struct tagBITMAPINFO {
BITMAPINFOHEADER bmiHeader;
RGBQUAD bmiColors[1];
} BITMAPINFO;
The BITMAPINFOHEADER is a structure of the form:
typedef struct tagBITMAPINFOHEADER{ // bmih
DWORD biSize;
LONG biWidth;
LONG biHeight;
WORD biPlanes;
WORD biBitCount;
DWORD biCompression; /* a DIB can be compressed using run length encoding */
DWORD biSizeImage;
LONG biXPelsPerMeter;
LONG biYPelsPerMeter;
DWORD biClrUsed;
DWORD biClrImportant;
} BITMAPINFOHEADER;
bmiColors[1] is the first entry in an optional color palette or color
table of RGBQUAD data structures. True color (24 bit RGB) images
do not need a color table. 4 and 8 bit color images use a color table.
typedef struct tagRGBQUAD { // rgbq
BYTE rgbBlue;
BYTE rgbGreen;
BYTE rgbRed;
BYTE rgbReserved; /* always zero */
} RGBQUAD;
A DIB consists of
(BITMAPINFOHEADER)(optional color table of RGBQUAD's)(data for the
bitmapped image)
A Windows .BMP file is a DIB stored in a disk file. .BMP files prepend
a BITMAPFILEHEADER to the DIB data structure.
typedef struct tagBITMAPFILEHEADER { // bmfh
WORD bfType; /* always 'BM' */
DWORD bfSize; /* size of bitmap file in bytes */
WORD bfReserved1; /* always 0 */
WORD bfReserved2; /* always 0 */
DWORD bfOffBits; /* offset to data for bitmap */
} BITMAPFILEHEADER;
Structure of Data in a .BMP File
(BITMAPFILEHEADER)(BITMAPINFOHEADER)(RGBQUAD color table)(Pixel Data)
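The layout above fixes where the pixel data starts and how large it is.
The C sketch below computes these values for an uncompressed (BI_RGB)
DIB. It assumes a 14 byte BITMAPFILEHEADER and a 40 byte
BITMAPINFOHEADER, 4 byte RGBQUAD entries, and scan lines padded to a
multiple of 4 bytes. The function names are made up for this example,
and BI_BITFIELDS masks are not handled.

```c
#include <stdint.h>

/* Bytes per scan line: rows are padded to a 4 byte (DWORD) boundary. */
uint32_t dib_stride(uint32_t width, uint32_t bit_count)
{
    return ((width * bit_count + 31u) / 32u) * 4u;
}

/* Number of RGBQUAD entries in the color table. If biClrUsed is 0,
   images of 8 bits per pixel or less have a full table of
   2^bit_count entries and deeper images have none. */
uint32_t dib_palette_entries(uint32_t bit_count, uint32_t clr_used)
{
    if (clr_used != 0)
        return clr_used;
    return bit_count <= 8 ? (1u << bit_count) : 0u;
}

/* Value expected in bfOffBits: file header + info header + palette. */
uint32_t bmp_pixel_offset(uint32_t bit_count, uint32_t clr_used)
{
    return 14u + 40u + 4u * dib_palette_entries(bit_count, clr_used);
}

/* Size of the pixel data in bytes (biSizeImage for BI_RGB). */
uint32_t dib_image_size(uint32_t width, uint32_t height, uint32_t bit_count)
{
    return dib_stride(width, bit_count) * height;
}
```

For a 320x240 frame in 24 bit RGB this gives 960 bytes per line and
230400 bytes per frame, which is also the size of each video chunk in an
uncompressed AVI of that format.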
The Win32 API documentation from Microsoft provides extensive
information on the data structures in a DIB.
In Windows 95 and Windows NT, Microsoft added the DIBSection to
provide a more efficient way to use DIB's in programs. The DIBSection
was originally introduced in Windows NT to reduce the number of
memory copies during blitting (display) of a DIB.
<A HREF="#Top">Return to Top</A>
<A NAME="Codec">
<H2>Meet the Codecs</H2>
</A>
The video data in an AVI file can be formatted and compressed in
a variety of ways. Video for Windows 1.1e comes with several
compressors:
Intel Indeo (version 3.2)
Microsoft Video 1
Microsoft RLE (Run Length Encoding)
Cinepak
AVI is not restricted to these compressors. They
are the compressors provided with Video for Windows.
These compressors are the Old Guard, the video codecs
from the early days of Video for Windows and QuickTime (Cinepak originated
with the Macintosh and QuickTime). During this period the
focus of video was playback from hard drives and CD-ROM's.
The advent of the WorldWide Web and Internet Mania
has created a New Wave of audio and video codecs, trying to
apply "advanced" technologies such as sophisticated motion
estimation and compensation, wavelets, fractals, and other
techniques to achieve extremely low bitrates (such as
28.8 Kbits/second for phone lines) for the Internet.
<H3>The Old Guard</H3>
<A NAME="DIB">
<H4>Full Frames (Uncompressed)</H4>
</A>
Users can store AVI files with uncompressed frames. No codec is
required for this.
The Four Character Code (FOURCC) for this is 'DIB ', DIB for
the Microsoft Device Independent Bitmap.
NOTE: Unfortunately, at least three other Four Character Codes are
sometimes used for uncompressed AVI videos:
'RGB '
'RAW '
0x00000000 ( a FOURCC whose hexadecimal value is 0 )
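A program that reads AVI files therefore has to test for all of these
codes. A minimal C sketch follows. It packs the four characters with the
first character in the least significant byte, the same order used by the
Windows mmioFOURCC macro; the function names themselves are hypothetical.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit Four Character Code,
   first character in the least significant byte. */
uint32_t make_fourcc(char a, char b, char c, char d)
{
    return (uint32_t)(unsigned char)a |
           ((uint32_t)(unsigned char)b << 8) |
           ((uint32_t)(unsigned char)c << 16) |
           ((uint32_t)(unsigned char)d << 24);
}

/* Return 1 if the code is one of the values listed above for
   uncompressed video, 0 otherwise. */
int fourcc_is_uncompressed(uint32_t fourcc)
{
    return fourcc == 0 ||
           fourcc == make_fourcc('D', 'I', 'B', ' ') ||
           fourcc == make_fourcc('R', 'G', 'B', ' ') ||
           fourcc == make_fourcc('R', 'A', 'W', ' ');
}
```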
<A NAME="ColorFormats">
<H5>Color Formats</H5>
</A>
Not all uncompressed bitmap images and AVI frames are the same!
A variety of color formats for image pixels exist. Some of these
color formats are essentially standard and supported on all systems.
Some color formats (such as 8 bit grayscale Y8) require special
drivers to display or capture.
Color Formats are also known as IMAGE FORMATS or PIXEL FORMATS. Some
components of Microsoft Windows identify Color Formats with a Four
Character Code (FOURCC) such as 'RGB8' or 'YUY2'. Some components such
as the Windows Device Independent Bitmap or DIB do not use Four Character
Codes for color formats.
24 BIT RGB (DE FACTO STANDARD)
24-bit RGB is the best-known color format. All common graphics
programs support 24-bit RGB. In 24 bit
RGB a pixel is represented as three bytes, one byte for the red
component, one byte for the green component, and one byte for
the blue component.
255 0 0 (a bright red pixel in 24 bit RGB)
0 255 0 (a bright green pixel in 24 bit RGB)
0 0 255 (a bright blue pixel in 24 bit RGB)
0 0 0 (a black pixel in 24 bit RGB)
255 255 255 (a white pixel in 24 bit RGB)
128 128 128 (a gray pixel in 24 bit RGB)
.. and so forth.
Other color formats include:
8 bit grayscale Y8
9 bit YUV9
12 bit BTYUV 4:1:1
12 bit YUV12
16 bit YUY2 4:2:2
8 bit RGB (uses a color palette)
15 bit RGB (16 bits with most significant bit zero, 5 bits for red, 5 bits for green, and 5 bits for blue)
16 bit RGB (16 bits with 5 bits for red, 6 bits for green, and 5 bits for blue)
(24 bit RGB - described above)
32 bit RGB (most significant byte is zero, 8 bits for red, 8 bits for green, and 8 bits for blue)
<H6>Original DIB Color Formats</H6>
In the Windows 3.1 Software Development Kit, DIB's were defined to
allow values of 1,4,8, and 24 bits per pixel in the biBitCount
field of the BITMAPINFOHEADER. The biCompression field was allowed
the values BI_RGB, BI_RLE4 (for run length encoding of 4 bit per pixel
images), and BI_RLE8 (for run length encoding of 8 bit per pixel
images). That was it. This original specification of the DIB
provided the 8 bit RGB and 24 bit RGB color formats described
above.
The original DIB specification had no support for 16 bit per pixel
formats, 32 bit per pixel formats, or special encodings like YUV.
Not surprisingly, the original formats and specification of the DIB
are the most widely supported in software.
<H6>New DIB Color Formats and More Complexity</H6>
Microsoft added support for 16 bit per pixel and 32 bit per pixel
images to the DIB specification. These formats are identified by
setting the biBitCount field in the BITMAPINFOHEADER of the DIB to 16 or
32. An uncompressed AVI file that stores images using the RGB 15, RGB
16, or RGB 32 color formats stores the video frames as DIB's using
these "new" color formats.
By default, the Microsoft 16 bit per pixel format is actually RGB 15
where one bit is unused, 5 bits for red, 5 bits for green, and 5 bits
for blue. This was done because the 15-bit RGB or 5-5-5 format was
used in 16-bit-per-pixel color video cards. Hardware designers found
it easier to build chips using a 5-5-5 pixel format with one bit
unused than the slightly higher resolution 5-6-5 color format.
The Microsoft 32 bit per pixel format has the most significant byte of
the pixel set to zero. Then 8 bits for red, 8 bits for green, and 8
bits for blue. This is RGB 32 bit. Why do this? The 32-bit pixels
in this format are DWORD aligned on 32 bit boundaries, which is more
efficient for operations and memory transfers under a 32 bit
processor architecture than the unaligned 24-bit RGB format.
Microsoft also added a new value for the biCompression field of the
BITMAPINFOHEADER called BI_BITFIELDS. If biCompression is set
to BI_BITFIELDS, then the color table is three DWORD (32 bit)
masks giving the bits used for the red, green, and blue components
of a pixel. In this way, a "custom" format such as RGB 5-6-5 (5 bits
for Red, 6 bits for Green, and 5 bits for Blue) can be defined. This
is 16-bit RGB.
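To make the mask mechanism concrete, here is a C sketch that unpacks a
pixel using three masks of the kind BI_BITFIELDS supplies. The usual
masks are 0x7C00, 0x03E0, 0x001F for RGB 5-5-5 and 0xF800, 0x07E0, 0x001F
for RGB 5-6-5. The struct and function names are invented for the
example, and each mask is assumed to be a contiguous run of bits.

```c
#include <stdint.h>

struct rgb8 { uint8_t r, g, b; };

/* Extract the field selected by mask and scale it to 0..255.
   The mask is assumed to be a contiguous run of 1 bits. */
static uint8_t field_to_8bit(uint32_t pixel, uint32_t mask)
{
    uint32_t value;

    if (mask == 0)
        return 0;
    while ((mask & 1u) == 0) {      /* shift the field down to bit 0 */
        mask >>= 1;
        pixel >>= 1;
    }
    value = pixel & mask;           /* mask now equals the field maximum */
    return (uint8_t)((value * 255u + mask / 2u) / mask);
}

/* Unpack one pixel given the red, green, and blue masks. */
struct rgb8 unpack_bitfields(uint32_t pixel, uint32_t rmask,
                             uint32_t gmask, uint32_t bmask)
{
    struct rgb8 c;

    c.r = field_to_8bit(pixel, rmask);
    c.g = field_to_8bit(pixel, gmask);
    c.b = field_to_8bit(pixel, bmask);
    return c;
}
```

The same routine handles 5-5-5 and 5-6-5 pixels; only the masks passed
in differ.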
Although Four Character Codes (FOURCC's) such as 'RGB8' are used to
identify different Color Formats in some parts of Microsoft Windows,
the DIB data structures don't use FOURCC's, rather they use combinations
of biBitCount and biCompression.
In addition to 16-bit and 32-bit formats, Microsoft also defined a
mechanism for custom encodings such as YUV, YUY2, and so forth.
Manufacturers can register the new format with Microsoft. Microsoft
also concocted a JPEG-DIB specification for wrapping the JPEG still
image compression standard in a DIB. The JPEG DIB specification is
virtually unused, having lost to the JFIF JPEG file format. JPEG
images in general use, such as on Web pages, are JFIF's not JPEG
DIB's.
<H6>YUV Color Space and Color Formats</H6>
YUV is the color space used in the European PAL broadcast television
standard. PAL was originally introduced in Britain and Germany in
1967. PAL is used by most European nations and many nations around
the world. The United States and Japan use the NTSC standard. France
and a few other nations use the SECAM standard.
Y refers to the luminance, a weighted sum of the red, green, and blue
components. The human visual system is most sensitive to the luminance
component of an image. Analog video systems such as NTSC, PAL, and
SECAM transmit color video signals as a luminance (Y) signal and
two color difference or chrominance signals (the U and V above).
If, R, G, and B are the red, green, and blue values, then:
Y = 0.299 R + 0.587 G + 0.114 B
U = 0.493 (B - Y)
V = 0.877 (R - Y)
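The three formulas above translate directly into C. The function name
rgb_to_yuv is made up for this example, and the sketch works in floating
point without the offsets, scaling, or clamping that a real 8 bit color
format such as YUY2 would apply.

```c
/* Convert red, green, and blue values to Y, U, and V using the
   formulas above. Inputs are typically in the range 0..255. */
void rgb_to_yuv(double r, double g, double b,
                double *y, double *u, double *v)
{
    *y = 0.299 * r + 0.587 * g + 0.114 * b;   /* luminance */
    *u = 0.493 * (b - *y);                    /* blue color difference */
    *v = 0.877 * (r - *y);                    /* red color difference */
}
```

Any gray pixel (R = G = B) gives U = V = 0, since the three luminance
weights sum to one.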
U is very similar to the difference between the blue and yellow
components of a color image. V is very similar to the difference
between the red and green components of a color image. There is
evidence that the human visual system processes color information
into something like a luminance channel, a blue - yellow
channel, and a red - green channel. For example, while we
perceive blue-green hues, we never perceive a hue that is
simultaneously blue and yellow. This may be why the YUV
color space of PAL is so useful.
To exploit this, digital color formats such as YUV9 or YUY2
exist that represent pixels as levels of Y, U, and V.
<H6>Summary</H6>
Both BMP still image files and AVI files may be saved in many
different color formats. While 24 bit RGB is almost universally
supported, there is no guarantee that your graphics software or
AVI playback drivers will support some of the less well-known
color formats. You may need to get special software, drivers, or
even hardware to use some of these formats.
For example, all of the color formats listed above are supported by
hardware and software drivers with the miro miroMEDIA PCTV TV Tuner
and Video Capture card. Some of these, like 8 bit grayscale Y8, are
not widely known or supported.
In general, video capture drivers allow selection of the color
format used when the video (AVI file) is captured. These color
format options should be accessible through the video capture software
application.
In Microsoft's VidCap and VidCap32 video capture applications, the
user may select the color format through Options | Video Format...
<A HREF="#Top">Return to Top</A>
<A NAME="MRLE">
<H4>Microsoft Run Length Encoding</H4>
</A>
Microsoft Run Length Encoding uses the Four Character Code MRLE
[drivers32]
VIDC.MRLE=MSRLE32.DLL
in Windows 95
[drivers]
VIDC.MRLE=MSRLE.DLL
in Windows 3.x
Microsoft Run Length Encoding usually appears with the
name "Microsoft RLE" in lists of Video Compression options.
8 BIT ONLY
Microsoft RLE only supports 8 bit color, a maximum of 256 colors using
a color lookup table. It does NOT support 16 bit color, also known as
"High Color" or "Thousands of Colors", or 24 bit color, also known as
"True Color" or "Millions of Colors".
WHERE TO GET Microsoft RLE
Historically, Microsoft RLE has been one of the standard Video for
Windows codecs from Microsoft.
The 16 bit Microsoft Video for Windows 1.1e installs a 16 bit
version of the Microsoft RLE video codec.
The Microsoft Windows 95 CD-ROM installs a 32 bit Microsoft
RLE video codec.
The Microsoft Windows NT Workstation Operating System Version 4.0
installs a 32 bit Microsoft RLE video codec.
<A HREF="#Top">Return to Top</A>
<A NAME="MSVC">
<H4>Microsoft Video 1</H4>
</A>
Microsoft Video 1 uses the Four Character Code MSVC
[drivers32]
VIDC.MSVC=MSVIDC32.DLL
in Windows 95
[drivers]
VIDC.MSVC=MSVIDC.DLL
in Windows 3.x
<B>NOTE:</B> The Four Character Code CRAM is also used for Microsoft Video 1.
8 OR 16 BIT ONLY
Microsoft Video 1 supports only 8 bit or 16 bit color. 16 bit color is
also known as "High Color" or "Thousands of Colors". Microsoft Video 1
does not support 24 bit color, also known as "True Color" or "Millions of
Colors".
WHERE TO GET Microsoft Video 1
Historically, Microsoft Video 1 has been one of the standard
Video for Windows codecs from Microsoft.
The 16 bit Microsoft Video for Windows 1.1e installs a 16 bit
version of the Microsoft Video 1 video codec.
The Microsoft Windows 95 CD-ROM installs a 32 bit version of the
Microsoft Video 1 video codec.
The Microsoft Windows NT Workstation Operating System 4.0 CD-ROM
installs a 32 bit version of the Microsoft Video 1 video codec.
<A HREF="#Top">Return to Top</A>
<A NAME="RT21">
<H4>Intel Real Time Video 2.1 (Indeo 2.1?) (RT21) </H4>
</A>
RT21 is the Microsoft Four Character Code for the Intel Real Time
Video 2.1 (Indeo 2.1?) video compressor-decompressor (codec).
Microsoft's 16-bit Video for Windows Version 1.1e for Windows 3.x
includes a 16-bit Video for Windows codec for RT21.
WHERE TO GET Intel Real Time Video 2.1
Microsoft appears to have discontinued support for RT21 in the
32-bit Video for Windows in Windows 95 and Windows NT 4.0. There are
32-bit Video for Windows codecs for Indeo 3.1/3.2, Indeo 4.x, and
Indeo 5.0.
For example, the Microsoft Windows NT Workstation Operating System
Version 4.0 CD-ROM DOES NOT install RT21. It DOES install Intel
Indeo R 3.2.
See the <A HREF="#VfW16">Where to get the 16-bit Video for Windows</A>
section.
<A HREF="#Top">Return to Top</A>
<A NAME="IV32">
<H4>Intel Indeo 3.1/3.2</H4>
</A>
Indeo uses the Microsoft Four Character Codes IV31 and IV32,
originally for Indeo 3.1 and Indeo 3.2, but these are usually now
mapped to Indeo 3.2.
[drivers32]
VIDC.IV31=IR32_32.DLL
VIDC.IV32=IR32_32.DLL
in Windows 95
[drivers]
VIDC.IV31=IR32.DLL
VIDC.IV32=IR32.DLL
in Windows 3.x
Indeo 3.x uses Vector Quantization based image compression.
WHERE TO GET INDEO 3.x
Indeo R3.2 is one of the default standard Video for Windows codecs.
Microsoft's 16 bit Video for Windows 1.1e includes a 16 bit Indeo 3.2
codec (the IR32.DLL above).
The Microsoft Windows 95 CD-ROM installs a 32 bit Indeo 3.2 video codec.
Microsoft Windows NT Workstation Operating System 4.0 CD-ROM installs
a 32 bit Indeo R3.2 video codec.
<A HREF="#Top">Return to Top</A>
<A NAME="CVID">
<H4>Cinepak</H4>
</A>
Cinepak is the most widely used Video for Windows codec. Cinepak
reportedly provides the fastest playback of video. While Indeo 3.2
provides similar or slightly superior image quality for the same
compression, Indeo decompression is much more CPU intensive than
Cinepak. Cinepak was originally developed for the Mac and licensed to
Apple by SuperMac Technology Inc. It is now free with Video for
Windows. It is also free with Apple's QuickTime.
There are at least three Cinepak Video for Windows codecs in existence:
Cinepak by SuperMac (the original, 16 bit)
Cinepak by Radius (newer, better?, 16 bit)
Cinepak by Radius[32] (32 bit version of Radius Cinepak, shipped with Windows 95)
Peter Plantec's Caligari TrueSpace2 Bible strongly recommends using
the Radius codec for superior results when generating AVI files from
TrueSpace.
Cinepak uses the Microsoft Four Character Code CVID
[drivers32]
VIDC.CVID=ICCVID.DLL
in Windows 95
[drivers]
VIDC.CVID=ICCVID.DRV
in Windows 3.x
Apple QuickTime supports Cinepak.
Mark Podlipec's XAnim Unix X11 video player supports Cinepak.
Cinepak uses Vector Quantization based image compression and frame
differencing.
Historical Note: On or about Jan. 6, 1999, Radius Inc. renamed itself
Digital Origin Inc. SuperMac Technology, the original owner
of Cinepak, was a predecessor to Radius.
WHERE TO GET CINEPAK
Cinepak is one of the default standard video codecs in Video for Windows.
A 16-bit version of Cinepak is included in the 16 bit Video for Windows 1.1e.
The Microsoft Windows 95 CD-ROM installs a 32 bit Cinepak by Radius video
codec.
Microsoft Windows NT Workstation Operating System Version 4.0 CD-ROM installs
a 32 bit Cinepak by Radius video codec.
CINEPAK PRO
Compression Technologies Inc.
Oakland, CA
<A HREF="mailto:
[email protected]">E-Mail:
[email protected]</A>
<A HREF="
http://www.Cinepak.com/">
http://www.Cinepak.com/</A>
sells an improved Cinepak compressor that purportedly can fix
some common problems with video encoded with Cinepak. They
say their product, CinepakPro, generates movies that are 100 percent
Cinepak compatible and that play back using all existing native
Cinepak decompressors.
CinepakPro and Cinepak Toolkit (Macintosh OS 7.5 or newer and
QuickTime 2.5 or 3)
CinepakPro QTX (Windows 95/98/NT and QuickTime 3.0 for Windows)
CinepakPro AVI (Windows 95/98/NT and Video for Windows/
ActiveMovie)
An updated AVI Cinepak codec for Windows is available at the
Compression Technologies Inc. Web site (May 12, 1999).
<A HREF="#VQ">Vector Quantization</A>
<A HREF="#Top">Return to Top</A>
<A NAME="MJPG">
<H4>Motion JPEG</H4>
</A>
Most PC video capture and editing systems capture video to AVI
files using Motion JPEG video compression. In Motion JPEG, each
video frame is compressed separately using the JPEG still image
compression standard. No frame differencing or motion estimation
is used to compress the images. This makes frame accurate
editing without any loss of image quality during the editing
possible.
The standards situation for Motion JPEG is complicated since
at one time there was no industry standard for Motion JPEG.
Microsoft has a Microsoft Motion JPEG Codec and a JPEG DIB
Format. The OpenDML Avi File Format Extensions (another
standard for extending AVI to support professional video
features) includes Motion JPEG support. See the Paradigm
Matrix site below for more information on these standards.
Motion JPEG codecs usually use the Four Character Code 'MJPG'.
Motion JPEG is used for editing and authoring, but rarely for
distribution. Usually, once the video has been edited, it is
compressed further using Cinepak or another codec for distribution.
Because Motion JPEG does not use frame differencing or motion
estimation, better compression is possible with other codecs.
A software Motion JPEG codec for Windows NT and Windows 95 is available
from Paradigm Matrix at:
<A HREF="
http://www.pmatrix.com/Goodies.htm">
http://www.pmatrix.com/Goodies.htm</A>
The Paradigm Matrix Motion JPEG codec uses the Four Character Code
MJPG.
A 32-bit software Motion JPEG codec for Windows NT and Windows 95 is
available from Morgan Multimedia at:
<A HREF="
http://www.morgan-multimedia.com/">
http://www.morgan-multimedia.com/</A>
A 32-bit software Motion JPEG codec for processors with MMX instructions
(MainConcept Motion JPEG Codec) is available from MainConcept at:
<A HREF="
http://www.mainconcept.com/">
http://www.mainconcept.com/</A>
Motion JPEG uses the Block Discrete Cosine Transform (DCT) for
image compression.
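As a rough illustration of the block transform mentioned above, the
following Python sketch computes the textbook orthonormal two-dimensional
DCT-II of a single 8x8 block, the block size used by baseline JPEG. It is
not taken from any Motion JPEG codec; the function name dct_8x8 is made
up for this example, and the quantization and entropy coding steps that
JPEG applies to the coefficients are left out.

```python
import math

N = 8  # JPEG-style block size


def dct_8x8(block):
    """Return the orthonormal 2-D DCT-II of an 8x8 block (list of 8 rows)."""
    def scale(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A flat gray block: all of the energy lands in the single DC coefficient.
flat = [[128] * N for _ in range(N)]
flat_coeffs = dct_8x8(flat)
```

For the flat block, the DC coefficient works out to 8 x 128 = 1024 and the
other 63 coefficients are zero to within rounding. Smooth image regions
behave similarly, which is why the block DCT compresses them well.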
<A HREF="#Top">Return to Top</A>
<A NAME="XMPG">
<H4>Editable MPEG</H4>
</A>
At least two companies defined schemes to wrap editable MPEG (I frames
only MPEG) in AVI files. Xing Technology's editable MPEG AVI uses
the Four Character Code XMPG. Sigma Designs defined an AVI format
using the Four Character Code MPGI.
Editable MPEG consists of only MPEG I frames. This omits the MPEG
motion estimation. It is very similar to Motion JPEG. By wrapping
I frames only MPEG in AVI, editable MPEG works with standard Video for
Windows editing and authoring applications such as Adobe Premiere.
Xing Technology
<A HREF="
http://www.xingtech.com/">
http://www.xingtech.com/</A>
Sigma Designs
<A HREF="
http://www.graphcomp.com/info/specs/ms/editmpeg.htm">MPEG Extensions to AVI File
Format (Draft 1.1 by Sigma Designs)</A>
<A HREF="#Top">Return to Top</A>
<H3>The New Wave</H3>
Recently (5/18/97), there has been a proliferation of new Video for
Windows codecs. A few like H.261 have been around for a while, but
most represent implementations of new or improved technologies such as
wavelets. Many are targeted toward low bitrate video over the
Internet. For lack of better terminology, I refer to these as the New
Wave to differentiate them from the older codecs like Cinepak included
with Video for Windows 1.1e (the last release prior to Windows 95).
Microsoft appears to be developing or licensing some of these codecs as part of
NetShow, NetMeeting, and other Microsoft initiatives.
<A NAME="VDOW">
<H4>VDOWave or VDOLive from VDONet</H4>
</A>
VDONet <A HREF="
http://www.vdo.net/">
http://www.vdo.net/</A>
4009 Miranda Ave., Suite 250
Palo Alto, CA 94304
Voice: (415) 846-7730
FAX: (415) 846-7900
markets a wavelet based video codec which includes a Video for Windows
(32 bit) implementation. Microsoft has licensed VDOWave as
part of the NetShow product. There are two versions of the
VDOWave codec. VDOWave 2.0 is a fixed rate video codec which uses
the Microsoft Four Character Code VDOM. This codec adds the line
[drivers32]
vidc.vdom=vdowave.drv
to the SYSTEM.INI file in Windows 95.
VDOWave 3.0 is a "scalable" video codec. This codec uses the Microsoft Four
Character Code (FOURCC) VDOW and adds the line
[drivers32]
VIDC.VDOW=vdowave.drv
to the Windows 95 SYSTEM.INI file.
In NetShow 2.0, the standalone Client Setup installs a VDOWave
decode-only codec. The NetShow 2.0 Tools Setup installs a
VDONet VDOWave encoder.
In some of my tests, VDOWave appears significantly superior to MPEG-1
and the other block Discrete Cosine Transform based codecs at low
bitrates.
VDONet uses the trademark VDOWave for its wavelet based video codec.
VDONet uses the trademark VDOLive for its VDOLive On-Demand Product
Line. This includes the VDOLive On-Demand Server, the VDOLive tools
including VDOCapture and VDOClip, and the VDOLive Player. Sometimes
VDOLive and VDOWave are used interchangeably by users and in some company
literature.
VDONet also has a VDOPhone product for real-time videoconferencing.
Based on the company documentation, published reports, and viewing the
technology, VDOWave appears to be a combination of wavelet based image
compression and motion compensation or frame differencing.
<A HREF="#Top">Return to Top</A>
<A NAME="IV41">
<H4>What is Indeo Video Interactive?</H4>
</A>
Indeo Video Interactive, Indeo 4.1, is a new version of Indeo from
Intel based on a "hybrid wavelet algorithm" according to Intel. This
is a different compression algorithm than Indeo 3.2 which is included
with Video for Windows. Indeo 3.2 uses Vector Quantization.
Indeo Video Interactive supports a number of features in addition
to the new compression algorithm such as transparency.
Indeo Video Interactive can be installed as a Video for Windows
codec or in the new ActiveMovie environment from Microsoft.
For further information on Indeo Video Interactive
<A HREF="
http://www.intel.com/pc-supp/multimed/indeo/index.htm">
http://www.intel.com/pc-supp/multimed/indeo/index.htm</A>
How to program "sprites" in Indeo Video Interactive?
Some of Intel's marketing material touts the ability to
add sprites to applications using Indeo Video Interactive.
Most of Intel's technical documentation on Indeo and the API's
for using Indeo Video Interactive neglects to explain what Intel
means by "sprite". There is a brief mention in the Overview
document for Indeo Video Interactive.
Indeo "sprite" means TRANSPARENCY.
Indeo Video Interactive supports TRANSPARENCY. Indeo has
transparent pixels to create transparent backgrounds to
implement effects such as chroma-keying. The well-known
example of chroma-keying is the television weather forecaster
standing in front of a satellite weather picture. The forecaster
stands in front of a blue screen (sometimes a green screen) and
video gadgetry replaces the blue color with another video signal.
Anything that is not the blue "key" color is left unchanged.
In Indeo jargon a "video sprite" is a foreground object such as the
mythical weather-caster on a transparent background. Your application
can then provide a bitmap image or even another video as a background
in the transparent areas of the image. This provides a crude
mechanism for the video to change depending on interactions with
a user.
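The chroma-key idea described above can be sketched in a few lines of
Python. This is only an illustration of the concept. It does not use the
Indeo Video Interactive API, and the function name, the pixel layout
(rows of (R, G, B) tuples), and the exact key color are assumptions made
for the example.

```python
KEY_COLOR = (0, 0, 255)  # pure blue, used here as a hypothetical key color


def composite(foreground, background, key=KEY_COLOR):
    """Replace key-colored foreground pixels with the background pixel.

    Both images are lists of rows; each pixel is an (R, G, B) tuple.
    Pixels that are not the key color are left unchanged.
    """
    result = []
    for fg_row, bg_row in zip(foreground, background):
        result.append([bg if fg == key else fg
                       for fg, bg in zip(fg_row, bg_row)])
    return result


# 1x3 example: the middle pixel is the "forecaster", the rest is blue screen.
fg = [[(0, 0, 255), (200, 150, 120), (0, 0, 255)]]
bg = [[(10, 10, 10), (20, 20, 20), (30, 30, 30)]]
out = composite(fg, bg)
```

A practical keyer would match colors within a tolerance rather than
testing for exact equality, since captured video never has a perfectly
uniform blue.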
Look up the API's in Intel's documentation for TRANSPARENCY
to implement video sprites.
How to identify an AVI file that uses Indeo Video Interactive
for the video compressor?
Video for Windows identifies different video compressors through
four character codes. For example, 'cvid' is the four character
code for the widely used Cinepak compressor. The four character
code is found in the video stream header 'strh' in the AVI file.
Indeo Video Interactive (Indeo 4.1) uses the four character code
'iv41'
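In the file, a four character code is simply four consecutive ASCII
bytes. Windows code commonly handles it as one 32-bit integer with the
first character in the lowest byte, which is the layout produced by the
mmioFOURCC macro in the Windows multimedia headers. The Python sketch
below shows that packing. The helper names are made up for this example.

```python
def fourcc_to_int(code):
    """Pack a four character code into a 32-bit integer, first char lowest."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FOURCC must be exactly four ASCII characters")
    return int.from_bytes(raw, "little")


def int_to_fourcc(value):
    """Inverse of fourcc_to_int."""
    return value.to_bytes(4, "little").decode("ascii")


packed = fourcc_to_int("iv41")   # bytes 69 76 34 31 (hex) in file order
```

Note that 'cvid' and 'CVID' pack to different integers, so a program that
compares codes may want to normalize case first. Whether a given player
does so is not something this sketch can tell you.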
If Indeo Video Interactive is not installed, Video for
Windows will report an error, indicating that it cannot find the
compressor for 'iv41'. The specific message appears to be:
"Video not available, cannot find 'vids:iv41' decompressor."
<B>NOTE:</B> Indeo 4.1 claims to implement a hybrid wavelet transform.
Some of the behavior of the codec at low bitrates differs from other
wavelet based image and video compressors such as VDOWave, Infinop's
Lightning Strike, and some public domain wavelet compression software.
In particular, at low bitrates, I have seen the characteristic checkerboard
pattern of 8x8 pixel blocks seen in block based transform coding methods such
as MPEG-1. I'm not sure what Intel means by hybrid wavelet transform.
In general, at low bitrates, image and video compression schemes using the
Discrete Wavelet Transform (DWT) exhibit a blurring at the edges of objects and
also "ringing" artifacts near edges. They do not exhibit the blocking
artifacts, checkerboard pattern in extreme cases, seen in block Discrete Cosine
Transform based image and video compression.
<A HREF="#Top">Return to Top</A>
<A NAME="IV50">
<H4>Indeo Video Interactive 5.0</H4>
</A>
Intel is now (2/22/99) distributing the Indeo Video Interactive 5.10
software on their Web site. Indeo 5.0 claims to use a new, improved
wavelet compression algorithm for better video quality. Indeo 5
includes features such as progressive download for the Internet,
transparency, sprites, etc.
As of 2/22/99, the latest version of Intel Indeo Video appears to
be Indeo Video 5.10. The latest version of Intel Indeo Audio appears
to be Indeo Audio 2.5.
The previous version of Intel Indeo Video was Indeo 5.06.
<A HREF="
http://developer.intel.com/ial/indeo/video/">
http://developer.intel.com/ial/indeo/video/</A>
Known releases of Indeo Video 5.x
Intel Indeo 5.10 (02-Feb-1999)
Intel Indeo 5.06 (1998)
Intel Indeo 5.0 (1997????)
NOTE: All releases of Indeo 5.x appear to use the Four Character
Code IV50
Apple QuickTime 4 includes support for Indeo 5, allowing playback on
Apple Macintosh platforms.
<A HREF="
http://www.apple.com/quicktime/technologies/indeo/">
http://www.apple.com/quicktime/technologies/indeo/</A>
<A HREF="#Top">Return to Top</A>
<A NAME="UCOD">
<H4>ClearVideo (aka RealVideo)</H4>
</A>
ClearVideo is a video codec from Iterated Systems(<A
HREF="
http://www.iterated.com/">
http://www.iterated.com</A>).
Iterated has also licensed the ClearVideo technology to Progressive
Networks, makers of RealAudio, under the name RealVideo.
You can (or could at one time) download a Video for Windows demo of
ClearVideo from the Iterated Web site. This includes a demo Video for
Windows codec that allows both encoding and decoding. The video can
only be played on the same machine with the demo encoder. This codec
works with Video for Windows applications such as Media Player and
VidEdit.
Fractal video encoding appears to be very slow (computationally
intensive). The video is similar or somewhat
superior to MPEG-1 in quality.
ClearVideo uses Fractal Image Compression. Iterated is the main
(only?) producer of commercial fractal image and video compression
technology.
The Video for Windows evaluation version of ClearVideo
installs
[drivers32]
VIDC.UCOD=CLRVIDCD.DLL
in SYSTEM.INI in Windows 95.
<A HREF="#Top">Return to Top</A>
<A NAME="SFMC">
<H4>SFM (Surface Fitting Method)</H4>
</A>
Crystal Net Corporation (<A
HREF="
http://www.crystalnet.com/">
http://www.crystalnet.com/</A>) seeks
to license a technology called SFM or Surface Fitting Method. This is
supposed to be a low bitrate video technology for ISDN and POTS (Plain
Old Telephone Service) bitrates. They have a Video for Windows demo
to download from their Web site.
SFM uses the Microsoft Four Character Code (FOURCC) SFMC.
The demo installs (actually the instructions tell you to manually install):
[drivers32]
VIDC.SFMC=SFMdemo.dll
in Windows 95.
The demo does not include an encoder which presents problems in evaluating
the technology. However, SFM appears to be some sort of edge detection based
encoding technology.
White Pine's Enhanced CU-See Me desktop videoconferencing product uses
Crystal Net's SFM under the name White Pine Color Software Codec.
NEC has reportedly licensed SFM for its Network Video Audio Tool (NVAT).
(February, 1998)
Crystal Net also reportedly has relationships with Shepherd Surveillance
and Winnov. (February, 1998)
<A HREF="#Top">Return to Top</A>
<A NAME="QPEG">
<H4>QPEG</H4>
</A>
Q-Team Dr. Knabe produces a Video for Windows codec known as QPEG.
Currently (6/27/97), QPEG supports 8 bit color. Q-Team plans 16 and
24 bit color, MMX support, and other additional features in the
future.
Sample AVI/QPEG files and Video for Windows QPEG codecs for
Windows 3.x and Windows 95/NT are available at the Q-Team
Web site.
<A HREF="
http://www.q-team.de/">
http://www.q-team.de/</A>
Q-Team is also working on MPEG-4 for the PC.
<A HREF="#Top">Return to Top</A>
<A NAME="H261">
<H4>H.261</H4>
</A>
H.261 is an international standard, widely used for video conferencing
in the 128 Kbits/second to 384 Kbits/second range. This is a block
Discrete Cosine Transform method. Actually, H.261 was the first
international standard developed using the block Discrete Cosine
Transform and motion compensation. MPEG-1, which is probably better
known, followed the H.261 effort.
Intel's ProShare videoconferencing product installs a Video for Windows H.261 codec.
<B>NOTE:</B> I've never generated an AVI file with Intel's H.261, so it
may only be used for Intel ProShare videoconferencing and not with AVI.
Microsoft has a Microsoft H.261 Video for Windows 32 bit codec.
[drivers32]
VIDC.M261=MSH261.DRV
in Windows 95.
<A HREF="#Top">Return to Top</A>
<A NAME="H263">
<H4>H.263</H4>
</A>
H.263 is another international standard, based on the Block Discrete
Cosine Transform (DCT) and motion compensation. H.263 has a number of
improvements, mostly in the area of motion compensation, over the
earlier H.261 standard. It is targeted toward very low bitrate video
compression.
Microsoft's NetShow 2.0 installs a Microsoft H.263 video codec.
Microsoft H.263 uses the Four Character Code M263.
[drivers32]
VIDC.M263=msh263.drv
in Windows 95.
The Microsoft H.263 video codec is one of several "keyed" codecs
installed by NetShow. Others are Vivo H.263 and Duck's TrueMotion 2.0.
These codecs will not encode video as AVI files, although they
apparently will create Microsoft ASF files or provide compression for
streaming video products such as Microsoft's NetMeeting
videoconferencing. See the section on NetShow for more information
on the NetShow video codecs.
Some versions of the msh263.drv driver will crash when trying to
encode an AVI file from VidEdit or similar applications. Other
versions of msh263.drv don't crash but give an "Unable to begin
compression" message box.
Vivo Software Inc. markets streaming H.263 and G.723 audio for the
Web under the brand name VivoActive. Vivo has its own file format
called .VIV which can be embedded in Web pages. Vivo provides
a player called VivoActive player and an authoring tool for
creating .VIV files called VivoActive Producer.
Microsoft NetShow installs a "keyed" codec that identifies itself as
Vivo H.263 Video Codec[32] which installs
[drivers32]
VIDC.VIVO=IVVIDEO.DLL
in Windows 95. Vivo H.263 uses the four character code 'VIVO'.
The Vivo Software Web Site:
<A HREF="
http://www.vivo.com/">
http://www.vivo.com/</A>
Intel distributes an Intel "I263" H.263 video codec at their Web
site as part of the NetCard product. This installs
[drivers32]
VIDC.I263=C:\WINDOWS\I263_32.DLL
VIDC.I420=C:\WINDOWS\I263_32.DLL
in Windows 95.
As of June 18, 1998, this codec could be found at:
<A HREF="
http://support.intel.com/support/createshare/camerapack/CODINSTL.HTM">
http://support.intel.com/support/createshare/camerapack/CODINSTL.HTM</A>
Note that Intel, like Microsoft, seems to rearrange their Web site
constantly.
Unlike the keyed Microsoft H.263 video codecs, this codec can be
used to encode AVI's through Microsoft VidEdit 1.1 (and presumably
other video editing products). How well this codec in fact
implements the H.263 standard is not clear.
Shannon Communication Systems (SCS) has an H.263+ AVI codec and
analysis tool available for download at their Web site:
<A HREF="
http://www.shansys.com/">
http://www.shansys.com/</A>
Telenor R&D of Norway distributes the source code for an H.263 encoder
and decoder that will reportedly compile and run under Windows. This
is a standalone application not a Video for Windows codec or
ActiveMovie filter. See elsewhere in the AVI Overview for a link to
the Telenor Web site.
<A HREF="#Top">Return to Top</A>
<A NAME="MPG4">
<H4>MPEG-4</H4>
</A>
Microsoft's NetShow 2.0 installs a Video for Windows codec for
MPEG-4. MPEG-4 is a new international standard that has not
been officially released as yet. Microsoft is deeply involved in
the MPEG-4 standardization effort. Microsoft has been using
its MPEG-4 for the Microsoft NBC Business Video broadcasting over
the Internet.
The Microsoft MPEG-4 Video for Windows Codec identifies itself
as "MPEG-4 High Speed Compressor" in Control Panel | Multimedia |
Devices | Video Compression Codecs.
Adding to confusion in true Microsoft fashion, there are two
versions of each NetShow video codec. The NetShow 2.0 Player (Client)
installation program installs codecs that provide only decoding
functionality. The MPEG-4 video codec installed by NetShow 2.0 Player
can only play back an MPEG-4 AVI. The MPEG-4 video codec installed by
the NetShow 2.0 Tools can encode AVI files with MPEG-4 video
compression.
If you want to author MPEG-4 compressed AVI, make sure to get and
install the NetShow 2.0 Tools, not just the NetShow 2.0 Player.
The Microsoft MPEG-4 High Speed Compressor Video Codec installed
by the NetShow 2.0 Tools will not encode arbitrary dimension
video. The video must have the dimensions 176 x 144 (QCIF).
In Microsoft VidEdit, "MPEG-4 High Speed Compressor" becomes
visible in the list of compression options only if the video
is sized to 176x144.
MPEG-4 uses the Microsoft Four Character Code (FOURCC) MPG4.
[drivers32]
VIDC.MPG4=msscrc32.dll
<A HREF="#Top">Return to Top</A>
<A NAME="LS">
<H4>Lightning Strike (Infinop)</H4>
</A>
Infinop markets a wavelet based video codec called Lightning Strike
Streaming Video. A Lightning Strike video decoder compatible with
Microsoft NetShow can be downloaded from the Infinop Web site. There
are several sample Lightning Strike Video files at the Infinop
site. The Lightning Strike encoder does not seem to be generally
available.
<A HREF="
http://www.infinop.com/">
http://www.infinop.com/</A>
<A HREF="#Top">Return to Top</A>
<A NAME="VxTreme">
<H4>VxTreme</H4>
</A>
VxTreme was acquired by Microsoft in September of 1997.
Microsoft has invested in numerous low bitrate audio and video
companies during the second half of 1997, including VxTreme, VDONet,
Progressive Networks/RealNetworks, and Lernout and Hauspie Speech
Products.
Although I have not seen a Video for Windows implementation of VxTreme
(12/20/97), I thought that I should include this codec. Undoubtedly,
it will be ported to Video for Windows and/or ActiveMovie if this has
not already happened.
VxTreme(<A HREF="
http://www.vxtreme.com/">
http://www.vxtreme.com/</A>)
markets a video codec that is usually identified as a wavelet based
codec.
A VxTreme player, a Plug-In for Internet Explorer and Netscape, is
available at the VxTreme Web site.
VxTreme has some very impressive demos of QCIF (160x120) talking heads
material on their Web sites. The subjective image quality during
scenes with small changes is quite good, much superior to the block
Discrete Cosine Transform based codecs and probably VDONet's VDOWave.
Text such as movie titles and credits appears to encode very well.
Preserving the sharp edges of text is a major problem in block
Discrete Cosine Transform based encoders such as the JPEG still image
compression standard and the MPEG video compression standard. In
general, wavelet image compression encounters problems with sharp
edges as well.
VxTreme clearly uses some sort of motion compensation or frame
differencing. Image quality drops dramatically during periods with
rapid changes. I viewed a number of movie trailers encoded with
VxTreme for 28.8 Kbits/second such as the trailer for "Goldeneye".
These trailers contain many scene changes and motion. Video quality
is poor, hardly superior to competitors such as H.261 or Microsoft's
MPEG-4. The talking heads material at 28.8 looks almost natural.
VxTreme may be a combination of the Discrete Wavelet Transform (??)
and motion compensation. The preservation of sharp edges suggests
something beyond the vanilla Discrete Wavelet Transforms described in
the technical literature on wavelet based image and video compression.
<A HREF="#Top">Return to Top</A>
<A NAME="#Sorenson">
<H2>Sorenson Video</H2>
</A>
Sorenson Video, from Sorenson Vision, is a low bitrate video codec
that appears to be available only for Apple QuickTime as of May 10,
1999. Sorenson Video was used to compress the Star Wars trailer for
"Star Wars Episode I: The Phantom Menace" distributed over the Internet
as a QuickTime file in the spring of 1999. No Video for Windows
Sorenson Video codec appears to be available.
Sorenson Video is reportedly based on some kind of vector
quantization technology that can achieve very high compression.
According to Mark Podlipec's XAnim site (May 10, 1999), he contacted
Sorenson Vision to find out if he could license Sorenson Video for
incorporation in the XAnim Unix X11 animation, audio, and video
player. According to his Web site, Sorenson replied that Apple will
not allow Sorenson to license Sorenson Video to others.
<A HREF="
http://www.s-vision.com/">
http://www.s-vision.com/</A>
<A HREF="#Top">Return to Top</A>
<A NAME="VfWInstalled">
<H2>How to determine which Video for Windows decompressors are installed on a PC?</H2>
</A>
In the SYSTEM.INI file, there is a section [drivers] which will contain
some lines as follows:
[drivers]
VIDC.MSVC=msvidc.drv
VIDC.YVU9=isvy.drv
VIDC.IV31=indeor3.drv
VIDC.RT21=indeo.drv
VIDC.CVID=iccvid.drv
VIDC.MRLE=msrle.drv
AVI files contain a four character code (such as 'IV31' or 'CVID')
in the stream header for the video stream. This four character
code identifies the video compressor used for the video stream.
For example, 'CVID' is the identifier for Cinepak (formerly Compact
Video) compression.
Video for Windows prefixes the four character code with VIDC. and
uses it to look up the video decompressor driver in SYSTEM.INI.
iccvid.drv is the driver for Cinepak in the example above.
Note: These are 16-bit drivers. Windows 95 adds a section [drivers32]
for 32 bit drivers. There are 32 bit versions of the Video for
Windows drivers. See below (and notice that the 32 bit drivers have
different names from the 16 bit drivers).
[drivers32]
vidc.cvid=iccvid.dll ; Cinepak for Windows 32
vidc.iv31=ir32_32.dll
vidc.iv32=ir32_32.dll
vidc.msvc=msvidc32.dll
vidc.mrle=msrle32.dll
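The lookup described above can be imitated with a short Python sketch
that reads the text of a SYSTEM.INI file and collects the VIDC. entries
from the [drivers] and [drivers32] sections. The function name and the
sample text are made up for illustration. Treating the keys as
case-insensitive and stripping ';' comments are assumptions based on the
examples shown here, not a description of what Video for Windows does
internally.

```python
def installed_video_codecs(ini_text):
    """Map each FOURCC to a {section name: driver file} dictionary."""
    codecs = {}
    section = None
    for raw in ini_text.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].lower()
            continue
        if section in ("drivers", "drivers32") and "=" in line:
            key, driver = (part.strip() for part in line.split("=", 1))
            if key.lower().startswith("vidc."):
                fourcc = key[5:].upper()
                codecs.setdefault(fourcc, {})[section] = driver
    return codecs


sample = """\
[drivers]
VIDC.CVID=iccvid.drv
[drivers32]
vidc.cvid=iccvid.dll ; Cinepak for Windows 32
vidc.iv31=ir32_32.dll
"""
table = installed_video_codecs(sample)
```

With the sample text, table["CVID"] lists both the 16 bit and the 32 bit
Cinepak driver, while table["IV31"] lists only the 32 bit Indeo driver.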
WINDOWS 95
In Windows 95:
(1) Open the Control Panel
(2) Double click on the Multimedia Icon (applet)
(3) Select the "Advanced" Tab
(4) Under the Multimedia Drivers icon, double click on the Video Compression Codecs icon
to open it. This gives a list of installed video codecs.
<A HREF="#Top">Return to Top</A>
<A NAME="WhichAVICodec">
<H3>How to determine which codec was used to compress an AVI file</H3>
</A>
LOW LEVEL WAY THAT WORKS ON ANY OPERATING SYSTEM WITH A FILE EDITOR!
A low level way to find out is to view the avi file with an editor,
for example the standard EDIT command in DOS will work. Search for
the four character code vids (usually lower case). vids indicates a
VIDeo Stream. vids is immediately followed by the four character code
for the compressor used for the AVI file. For example, a full frames
(uncompressed) AVI will contain the string:
vidsDIB
An AVI compressed using Microsoft Video 1 will contain the string:
vidsmsvc
And so forth.
See elsewhere in this overview for information on the Microsoft Four Character Codes.
WINDOWS 95
In Windows 95 (or Windows NT 4.0):
Right click on the avi file's icon.
This brings up a menu of items.
Select Properties.
Click on the Details tab in the Properties sheet.
Look under Video Format in the Details. This will list the
compression used. The compression is identified using an explanatory
human-readable string such as "16 x 16, 24 Bits, 8 Frames, 60.150
Frames/Sec, 76 KB/Sec, Uncompressed". The Microsoft Four Character
Code is not used.
NOTE: Windows 95 needs to have the Video for Windows codec installed
to correctly identify the codec used in the AVI file. If you have
an AVI that Windows 95 can't play because the codec is not
installed, you will have to use another method.
<A HREF="#Top">Return to Top</A>
<A NAME="BestCodec">
<H3>Which AVI video compressor is best?</H3>
</A>
"Best" depends on what the user is trying to do. Selection of a
video codec depends on several variables: time to encode the video,
how widely known and available the video codec is, compression ratios
that can be achieved for a target subjective quality level. The
<A HREF="#CodecPerformance">Performance of AVI Codecs</A> section gives
detailed information on the performance, compression ratios, video
quality, etc. of AVI video codecs.
Cinepak is the most widely used AVI video codec. Cinepak reportedly
provides the fastest playback of video. While Indeo 3.2 provides
similar or slightly superior image quality for the same compression, Indeo
decompression is much more CPU intensive than Cinepak. Cinepak was
originally developed for the Mac and licensed to Apple by SuperMac.
It is now free with Video for Windows. It is also free with Apple's
QuickTime.
There are at least three Cinepak codecs in existence:
Cinepak by SuperMac (the original, 16 bit)
Cinepak by Radius (newer, better?, 16 bit)
Cinepak by Radius[32] (32 bit version of Radius Cinepak, shipped with Windows 95)
Peter Plantec's Caligari TrueSpace2 Bible strongly recommends using the Radius
codec for superior results when generating AVI files from TrueSpace.
Cinepak is the best codec to use to ensure ease of playback. Few people
will have problems or need to install special codecs or software
to play an AVI compressed with Cinepak.
Cinepak is based on Vector Quantization and Frame Differencing to
achieve video compression. Other technologies such as the
Block Discrete Cosine Transform and Motion Compensation can achieve
superior compression (smaller files for the same subjective visual
quality).
Codecs that beat Cinepak
Codec                                       Compression Technology
H.263 (probably H.261)                      Block DCT/Motion Compensation
MPEG-4 Video Verification Model             Block DCT/Motion Compensation
Indeo Video Interactive (Indeo 4.x)         "hybrid wavelet"
VDONet's VDOWave                            Discrete Wavelet Transform/Motion Compensation
Iterated Systems' RealVideo or ClearVideo   Fractal Compression
Although not integrated into AVI, the MPEG-1 digital video standard
with IPB frames outperforms Cinepak.
The block DCT/Motion Compensation based codecs seem to perform 1.5 - 2.0
times better than Cinepak. VDOWave, a wavelet based codec, seems somewhat
better than this.
<A HREF="#Top">Return to Top</A>
<A NAME="CodecPerformance">
<H2>Performance of AVI Codecs</H2>
</A>
<H3>How do the Video Codecs Perform on Typical Video?</H3>
To test the performance of the many Video for Windows codecs, I
created a ten second video of myself using the U.S. Robotics
Bigpicture video capture card which is based on the Brooktree Bt848
chip. I am talking, picking up a microphone, and waving my hands
against an essentially static background. This sequence was captured
at: 30 frames per second, 320 by 240 pixels, with 24 bit RGB color, no
frames dropped during video capture. Note that 320 by 240, RGB 24, at
30 frames per second is similar to a single field of NTSC television
video and to the spatial resolution and frame rates of the successful
VideoCD products based on MPEG-1 digital video compression.
I then compressed the video using different codecs and Microsoft's VidEdit
1.1 video editor.
A table of results follows. Except where noted, the video
resembles the original uncompressed video closely. In cases where the
video was significantly degraded, this is noted IN CAPS.
The encoders for most codecs have an adjustable quality factor,
frequently displayed as a value between 0 and 100. A higher quality
factor means the compressed video looks better but has a higher
bitrate, a lower compression ratio. There is a trade-off between
quality and bitrate. In the technical literature on image and video
compression this is known as the rate-distortion function R(D). In
the table below, the entries give the quality factor used to encode
the test video where appropriate.
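The compression ratio and bitrate columns of the results table are
consistent with simple arithmetic on the file sizes: the ratio is the
uncompressed size divided by the compressed size, and the bitrate is the
file size times eight divided by the ten second duration. The Python
sketch below reproduces the Radius Cinepak entry this way. The function
names are made up for this example, and the formulas are inferred from
the numbers in the table.

```python
def compression_ratio(raw_mb, compressed_mb):
    """Uncompressed size divided by compressed size."""
    return raw_mb / compressed_mb


def bitrate_mbps(size_mb, seconds):
    """Average bitrate in megabits per second (8 bits per byte)."""
    return size_mb * 8 / seconds


RAW_MB = 66.187      # uncompressed test clip, from the table
CINEPAK_MB = 6.92    # Radius Cinepak entry, from the table
DURATION_S = 10      # length of the test clip in seconds

ratio = compression_ratio(RAW_MB, CINEPAK_MB)   # about 9.6
rate = bitrate_mbps(CINEPAK_MB, DURATION_S)     # about 5.5
```

The same two functions applied to the other file sizes give the
remaining ratio and bitrate figures to within rounding.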
Results:
Codec                                       File Size   Compression Ratio / Bitrate
--------------------------------------------------------------------------
Raw 24 bit RGB (Full Frames Uncompressed)   66.187 MB   1:1 / 53 Mbps
    30 frames per second
    320 by 240 pixels
    a talking head with some hand waving
Radius Cinepak (32-bit)                     6.92 MB     9.6:1 / 5.5 Mbps
    Quality Factor 100, keyframe every 15 frames
    Compression Technology: Vector Quantization
Intel Indeo 5.1 (32-bit)                    4.41 MB     15.0:1 / 3.5 Mbps
    Quality Factor 85, keyframe every 15 frames
    Compression Technology: Wavelet
Intel Indeo 5.1 (32-bit)                    0.98 MB     67.8:1 / 784 Kbps
    Quality Factor 50, keyframe every 15 frames
    Compression Technology: Wavelet
Intel Indeo 5.1 (32-bit)                    0.81 MB     81.7:1 / 648 Kbps
    Quality Factor 25, keyframe every 30 frames
    Compression Technology: Wavelet
Intel Indeo 4.3 (32-bit)                    2.46 MB     26.9:1 / 2 Mbps
    Quality Factor 85
    Compression Technology: "Hybrid Wavelet"
Intel Indeo R3.2 (32-bit)                   3.93 MB     16.8:1 / 3.1 Mbps
    Quality Factor 65, keyframe every 4 frames
    Version 3.24.15.03
    Compression Technology: Vector Quantization
Microsoft Video 1 (32-bit)                  3.16 MB     20.7:1 / 2.5 Mbps
    Microsoft Video 1 Compressor Version 1.0
    LOW QUALITY - NOTICEABLY GRAINY
Microsoft MPEG-4 (32-bit)                   0.625 MB    105.9:1 / 500 Kbps
    MPEG-4 Video High Speed Compressor
    keyframe every 3600 frames
    Compression Control 0
    Data Rate 128 Kilobits/second
    LOW QUALITY - BLOCKING ARTIFACTS
    Compression Technology: Block Discrete Cosine Transform,
                            Motion Compensation
Intel Indeo Raw R1.2 (32-bit)               24.6 MB     2.7:1 / 19.7 Mbps
    Version 1.20.15.01
Intel I.263 H.263 (32-bit)                  0.764 MB    86.6:1 / 612 Kbps
    keyframe every 15 frames
    Quality Factor 50 %
    LOW QUALITY - BLOCKING ARTIFACTS
    Compression Technology: Block Discrete Cosine Transform,
                            Motion Compensation
Intel I.263 H.263 (32-bit)                  1.99 MB     33.2:1 / 1.6 Mbps
    keyframe every 15 frames
    Quality Factor 100 %
    Compression Technology: Block Discrete Cosine Transform,
                            Motion Compensation
Brooktree YUV 411 Raw                       32.6 MB     2.0:1 / 26 Mbps
    BtV MediaStream
    Version: 2.01
I also tested the codecs on a 160 by 120, 30 frames per second, 239
frame (7.996 second), 24 bit color video sequence, "Space Shuttle", taken
from a NASA promotional video showing the launch of the space shuttle
and some crowds watching and cheering. Here I was able to calculate and
report the <A
HREF="#PSNR">Peak Signal to Noise Ratio</A> (in dB, decibels), an
objective measure of image and video quality.
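For readers unfamiliar with the measure, the Python sketch below computes
the standard PSNR for 8 bit samples: 10 times the base 10 logarithm of
255 squared divided by the mean squared error between the original and
the decoded samples. The function name and the sample values are made up
for illustration. How the per-frame and per-color-channel values were
combined for the figures reported here is not stated, so the sketch
treats its input as one flat sequence of samples.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf   # identical data, as for an uncompressed file
    return 10 * math.log10(peak * peak / mse)


reference = [100, 120, 140, 160]
noisy = [101, 119, 141, 159]     # every sample off by one, so MSE = 1
value = psnr(reference, noisy)   # about 48.13 dB
```

Higher values mean the decoded video is closer to the original. Identical
data gives an infinite PSNR, which is why an uncompressed file is listed
as INFINITE.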
Codec                                File Size   Compression Ratio/
                                                 Bitrate/<A HREF="#PSNR">PSNR</A>
-----------------------------------------------------------------------
Full Frames (Uncompressed)           13.1 MB     1 : 1 / 13.5 Mbps
    24 bits, 160 x 120, 30 fps                   / INFINITE
Cinepak Codec by Radius[32]          0.85 MB     15.9 : 1 / 848 Kbps
    Version 1.8.0.12                             / 31.08 dB
    Key frame every 15 frames
    Quality Factor 100
    (DEFAULT ENCODING PARAMETERS)
Cinepak Codec by Radius[32]          0.78 MB     17.7 : 1 / 780 Kbps
    Version 1.8.0.12                             / 30.49 dB
    Key Frame every 15 frames
    Target Data Rate 100 KBytes/sec
    Quality Factor 100
Cinepak Codec by Radius[32]          0.42 MB     32.93 : 1 / 420 Kbps
    Version 1.8.0.12                             / 27.087 dB
    Key Frame every 15 frames
    Target Data Rate 50 KBytes/sec
    Quality Factor 100
    VISIBLE BLOCKING
Cinepak Codec by Radius[32]          0.23 MB     59.6 : 1 / 231 Kbps
    Version 1.8.0.12                             / 23.762 dB
    Key Frame every 15 frames
    Target Data Rate 25 KBytes/sec
    Quality Factor 100
    HEAVY BLOCKING/UNACCEPTABLE VIDEO
Intel Indeo 5.10                     0.993 MB    13.2 : 1 / 992 Kbps
    Key frame every 15 frames                    / 32.43 dB
    Quality Factor 85
    (DEFAULT ENCODING PARAMETERS)
Intel Indeo 5.10                     0.22 MB     59.6 : 1 / 216 Kbps
    Key frame every 15 frames                    / 29.6 dB
    Quality Factor 50
    RINGING ARTIFACTS JUST VISIBLE
Intel Indeo 5.10                     0.177 MB    74 : 1 / 176 Kbps
    Key frame every 15 frames                    / 28.4 dB
    Quality Factor 25
    RINGING ARTIFACTS
Intel Indeo 5.10                     0.158 MB    82.9 : 1 / 152 Kbps
    Key frame every 15 frames                    / 27.82 dB
    Quality Factor 10
    RINGING ARTIFACTS
Intel Indeo Video Interactive[32]    1.564 MB    8.8 : 1 / 1.564 Mbps
    Indeo V 4.11.15.62                           / 28.686 dB
    Key frame every 15 frames
    Target Data Rate 1687 KBytes/sec
    Quality Factor 85
    (DEFAULT)
Intel Indeo Video Interactive[32]    0.732 MB    18.9 : 1 / 732 Kbps
    Indeo V 4.11.15.62                           / 28.284 dB
    Key frame every 15 frames
    Target Data Rate DISABLED
    Quality Factor 85
Intel Indeo Video Interactive[32]    0.297 MB    46.5 : 1 / 297 Kbps
    Indeo V 4.11.15.62                           / 26.622 dB
    Key frame every 15 frames
    Target Data Rate DISABLED
    Quality Factor 50
    BLOCKING AND RINGING ARTIFACTS
Intel Indeo Video Interactive[32] 0.256 MB 53.9 : 1 / 256 Kbps
Indeo V 4.11.15.62 / 25.389 dB
Key frame every 15 frames
Target Data Rate DISABLED
Quality Factor 25
HEAVY BLOCKING AND RINGING ARTIFACTS/UNACCEPTABLE
Microsoft Video 1 5.198 MB 2.7 : 1 / 5.198 Mbps
Key frame every 15 frames / 32.209 dB
Quality Factor 100
SLIGHTLY GRAINY
Microsoft Video 1 0.79 MB 17.5 : 1 / 790 Kbps
Key frame every 15 frames / 30.286 dB
Quality Factor 75 (DEFAULT)
BANDING AND BLOCKING
Microsoft Video 1 0.17 MB 82.8 : 1 / 166 Kbps
Key frame every 15 frames / 23.915 dB
Quality Factor 50
VERY BLOCKY/UNACCEPTABLE
Microsoft Video 1 0.08 MB 163.7 : 1 / 84 Kbps
Key frame every 15 frames / 18. 524 dB
Quality Factor 25
VERY VERY BLOCKY/UNACCEPTABLE
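
The PSNR figures in a table like this are derived from the mean squared
error between the original and the decoded frames. The following is a
minimal sketch only, assuming 8 bit samples (peak value 255) and the
common definition PSNR = 10 * log10(peak^2 / MSE). The function name
psnr and the flat list representation of a frame are my own
illustrative choices, not part of any codec or Video for Windows API.

```python
import math

def psnr(original, decoded, peak=255):
    """Peak Signal to Noise Ratio in dB between two equal-length sample lists."""
    if len(original) != len(decoded):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames, as for an uncompressed entry
    return 10 * math.log10(peak ** 2 / mse)

frame = [128, 130, 127, 125]
print(psnr(frame, frame))                  # identical samples give infinity
print(psnr(frame, [129, 130, 126, 125]))   # small errors give a high PSNR
```

Identical frames have zero error, which is why an uncompressed file is
reported with an infinite PSNR.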
The following results are for a 160 by 120 pixel, 15 frames per second,
24 bit color version of the "Talking Head" 10 second video sequence,
created by downsampling in space and time with Microsoft VidEdit 1.1:
Codec                                 File Size   Compression Ratio /
                                                  Bitrate
------------------------------------------------------------------------
Full Frames Uncompressed              8.53 MB     1:1 / 6.816 Mbps
  24 bit RGB
  160 x 120 pixels
  15 frames per second
Cinepak from Radius (32 bit)          1.20 MB     7.1:1 / 960 Kbps
  Version 1.10.0.6
  Quality Factor 100
  Keyframe every 15 frames
Intel Indeo 4.5 (32 bit)              0.677 MB    12.6:1 / 541 Kbps
  Quality Factor 85
  Keyframe every 15 frames
Intel Indeo R3.2 (32 bit)             0.98 MB     8.7:1 / 784 Kbps
  Version 3.24.15.03
  Quality Factor 65
  Keyframe every 4 frames
Microsoft Video 1                     0.947 MB    9:1 / 758 Kbps
  Quality Factor 75
  SOME BLOCKING ARTIFACTS
Intel Indeo Raw 1.2                   3.25 MB     2.6:1 / 2.6 Mbps
Intel I.263 H.263                     0.67 MB     12.8:1 / 535 Kbps
Intel Indeo 5.10 (32 bit)             0.973 MB    8.8:1 / 778 Kbps
  Quality Factor 85
  Keyframe every 15 frames
Intel Indeo 5.10 (32 bit)             0.367 MB    23.2:1 / 294 Kbps
  Quality Factor 50
  Keyframe every 15 frames
Intel Indeo 5.10 (32 bit)             0.339 MB    25.2:1 / 271 Kbps
  Quality Factor 25
  No fixed keyframes
  LOW QUALITY - BLURRY
Brooktree YUV 411 Raw                 4.42 MB     1.9:1 / 3.536 Mbps
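
The compression ratio and bitrate columns can be reproduced from the
file sizes. The sketch below assumes that the ratio is the uncompressed
size divided by the compressed size, and that the bitrate is the file
size divided by the 10 second duration with 1 MB taken as 8000
kilobits. This convention appears to match the figures above, but it is
my assumption, as are the function names.

```python
def compression_ratio(uncompressed_mb, compressed_mb):
    """Ratio of uncompressed to compressed file size."""
    return uncompressed_mb / compressed_mb

def bitrate_kbps(file_mb, duration_s):
    """Average bitrate, taking 1 MB as 8000 kilobits (an assumption)."""
    return file_mb * 8000 / duration_s

# Cinepak entry: 1.20 MB for the 10 second clip, 8.53 MB uncompressed.
print(round(compression_ratio(8.53, 1.20), 1))  # 7.1
print(round(bitrate_kbps(1.20, 10)))            # 960
```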
<A HREF="#Top">Return to Top</A>
<A NAME="QTCodec">
<H2>Which Video for Windows Codecs are Available for QuickTime on Apple Macintosh?</H2>
</A>
As of June 24, 1998, this is an incomplete list:
Cinepak (formerly Apple Compact Video)
The Video for Windows 1.1 Apple Macintosh utilities include
QuickTime system extensions for:
Microsoft Video 1
Microsoft Full Frames (uncompressed)
Microsoft RLE (Run Length Encoding)
Intel provided a QuickTime system extension for Intel Indeo
3.2. As of June 24, 1998 this was available at:
<A HREF="
http://developer.intel.com/ial/indeo/video/driver.htm">
http://developer.intel.com/ial/indeo/video/driver.htm</A>
As of June 24, 1998 Intel did NOT provide QuickTime versions
of Indeo 4.x or 5.x for the Apple Macintosh. Intel did provide
a version of Indeo 4.4 for QuickTime for Windows.
<A HREF="#Top">Return to Top</A>
<A NAME="FourCC">
<H2>Microsoft Four Character Codes (FOURCC)</H2>
</A>
A Four Character Code or FOURCC is a four byte code defined by
Microsoft as part of Video for Windows to identify various types
of video data.
Microsoft defined FOURCC's to uniquely identify pixel layouts and
video compressor types in Video for Windows. For example, the FOURCC
'CVID' identifies the Cinepak (formerly Compact Video) video
compressor. AVI files contain the FOURCC for the video compressor in the
video stream header.
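
A FOURCC is simply four ASCII characters packed into a 32 bit value.
The following sketch shows the packing, assuming the usual Windows
convention in which the first character occupies the low order byte.
The function names are hypothetical and are not part of Video for
Windows.

```python
def fourcc_to_int(code):
    """Pack a four character code into a 32-bit integer (first character in the low byte)."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FOURCC must be exactly four ASCII characters")
    return int.from_bytes(raw, "little")

def int_to_fourcc(value):
    """Unpack a 32-bit integer back into its four character code."""
    return value.to_bytes(4, "little").decode("ascii")

print(hex(fourcc_to_int("CVID")))   # 0x44495643
print(int_to_fourcc(0x44495643))    # CVID
```

Codes shorter than four characters are padded with trailing spaces
before packing.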
In addition to codecs, Four Character Codes identify the pixel layouts
used in uncompressed images and video. For example, codes such as
'YUY2' identify layouts of pixels in YUV space (as opposed to RGB).
These codes are used in interfacing with graphics cards. For example,
the S3 ViRGE/VX chip supports the YUY2 pixel layout. YUY2 is popular
because it refers to the 4:2:2 format used in <A
HREF="#CCIR601">CCIR-601</A> (D1) digital video.
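
As an illustration of what a pixel layout code means, the sketch below
unpacks YUY2 data, assuming the usual byte order Y0 U Y1 V in which
each group of four bytes describes two horizontally adjacent pixels.
The function name and the tuple output are my own choices for
illustration.

```python
def unpack_yuy2(data):
    """Split packed YUY2 bytes (Y0 U Y1 V per pixel pair) into (Y, U, V) pixels."""
    if len(data) % 4 != 0:
        raise ValueError("YUY2 data must be a multiple of 4 bytes")
    pixels = []
    for i in range(0, len(data), 4):
        y0, u, y1, v = data[i:i + 4]
        # Two horizontally adjacent pixels share one U and one V sample (4:2:2).
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels

print(unpack_yuy2(bytes([16, 128, 235, 128])))
# [(16, 128, 128), (235, 128, 128)]
```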
Video for Windows, Display Control Interface (DCI), and Direct Draw
all use FOURCC's.
Incomplete List of Four Character Codes for Video for Windows Codecs
(This is followed by a list of Codes Registered with Microsoft)
DIB Full Frames (Uncompressed)
RGB Full Frames (Uncompressed)
RAW Full Frames (Uncompressed)
0x00000000 Full Frames (Uncompressed)
0x00000000 indicates the hexadecimal value of the Four Character
Code is zero. A Four Character Code 'AAAA' has hexadecimal value
0x41414141 where 0x41 is the ASCII code for 'A'.
Some video capture and editing products will use the non-standard
FOURCC 0x00000000 for uncompressed AVI video instead of the easier to
understand 'DIB ' or 'RGB ' or 'RAW '.
MSVC or CRAM or WHAM Microsoft Video 1
MRLE Microsoft Run Length Encoding
IV31 Indeo 3.1/3.2
IV32 Indeo 3.1/3.2
CVID Cinepak (Radius)
ULTI Ultimotion (IBM)
MJPG Motion JPEG (Microsoft, Paradigm Matrix, video capture companies)
IJPG Intergraph JPEG
CYUV Creative YUV
YVU9 Intel Indeo Raw YUV9
XMPG Editable (I frames only) MPEG (Xing)
MPGI Editable MPEG (Sigma Designs)
VIXL miro Video XL
MVI1 Motion Pixels
SPIG Radius Spigot
PGVV Radius Video Vision
TMOT Duck TrueMotion S
DMB1 Custom Format Used by Matrox Rainbow Runner. This
appears to be a type of Motion JPEG
IV41 Indeo Interactive (Indeo 4.1 from Intel)
IV50 Indeo 5.x, including 5.0, 5.06, and 5.10
UCOD ClearVideo (Iterated Systems)
VDOW VDOWave (VDONet)
SFMC Surface Fitting Method (CrystalNet)
QPEG Q-Team Dr. Knabe's QPEG video compressor
H261 H.261
M261 Microsoft H.261
VIVO Vivo H.263
M263 Microsoft H.263
I263 Intel "I.263" H.263
MPG4 Microsoft MPEG-4
LIST OF CODES REGISTERED WITH MICROSOFT ( July 19, 1999 )
Compressor Code Description
ANIM Intel - RDX
AUR2 AuraVision - Aura 2 Codec - YUV 422
AURA AuraVision - Aura 1 Codec - YUV 411
BT20 Brooktree - MediaStream codec
BTCV Brooktree - Composite Video codec
CC12 Intel - YUV12 codec
CDVC Canopus - DV codec
CHAM Winnov, Inc. - MM_WINNOV_CAVIARA_CHAMPAGNE
CPLA Weitek - 4:2:0 YUV Planar
CVID Supermac - Cinepak
CWLT reserved
DUCK Duck Corp. - TrueMotion 1.0
DVE2 InSoft - DVE-2 Videoconferencing codec
DXT1 reserved
DXT2 reserved
DXT3 reserved
DXT4 reserved
DXT5 reserved
DXTC DirectX Texture Compression
FLJP D-Vision - Field Encoded Motion JPEG With LSI Bitstream Format
GWLT reserved
H260 Intel - Conferencing codec
H261 Intel - Conferencing codec
H262 Intel - Conferencing codec
H263 Intel - Conferencing codec
H264 Intel - Conferencing codec
H265 Intel - Conferencing codec
H266 Intel - Conferencing codec
H267 Intel - Conferencing codec
H268 Intel - Conferencing codec
H269 Intel - Conferencing codec
I263 Intel - I263
I420 Intel - Indeo 4 codec
IAN Intel - RDX
ICLB InSoft - CellB Videoconferencing codec
ILVC Intel - Layered Video
ILVR ITU-T - H.263+ compression standard
IRAW Intel - YUV uncompressed
IV30 Intel - Indeo Video 3 codec
IV31 Intel - Indeo Video 3.1 codec
IV32 Intel - Indeo Video 3 codec
IV33 Intel - Indeo Video 3 codec
IV34 Intel - Indeo Video 3 codec
IV35 Intel - Indeo Video 3 codec
IV36 Intel - Indeo Video 3 codec
IV37 Intel - Indeo Video 3 codec
IV38 Intel - Indeo Video 3 codec
IV39 Intel - Indeo Video 3 codec
IV40 Intel - Indeo Video 4 codec
IV41 Intel - Indeo Video 4 codec
IV42 Intel - Indeo Video 4 codec
IV43 Intel - Indeo Video 4 codec
IV44 Intel - Indeo Video 4 codec
IV45 Intel - Indeo Video 4 codec
IV46 Intel - Indeo Video 4 codec
IV47 Intel - Indeo Video 4 codec
IV48 Intel - Indeo Video 4 codec
IV49 Intel - Indeo Video 4 codec
IV50 Intel - Indeo 5.0
MP42 Microsoft - MPEG-4 Video Codec V2
MPEG Chromatic - MPEG 1 Video I Frame
MRCA FAST Multimedia - Mrcodec
MRLE Microsoft - Run Length Encoding
MSVC Microsoft - Video 1
NTN1 Nogatech - Video Compression 1
qpeq Q-Team - QPEG 1.1 Format video codec
RGBT Computer Concepts - 32 bit support
RT21 Intel - Indeo 2.1 codec
RVX Intel - RDX
SDCC Sun Communications - Digital Camera Codec
SFMC Crystal Net - SFM Codec
SMSC Radius - proprietary
SMSD Radius - proprietary
SPLC Splash Studios - ACM audio codec
SQZ2 Microsoft - VXtreme Video Codec V2
SV10 Sorenson - Video R1
TLMS TeraLogic - Motion Intraframe Codec
TLST TeraLogic - Motion Intraframe Codec
TM20 Duck Corp. - TrueMotion 2.0
TMIC TeraLogic - Motion Intraframe Codec
TMOT Horizons Technology - TrueMotion Video Compression Algorithm
TR20 Duck Corp. - TrueMotion RT 2.0
V422 Vitec Multimedia - 24 bit YUV 4:2:2 format (CCIR 601).
For this format, 2 consecutive pixels are represented by a 32 bit (4 byte) Y1UY2V color value.
V655 Vitec Multimedia - 16 bit YUV 4:2:2 format.
VCR1 ATI - VCR 1.0
VIVO Vivo - H.263 Video Codec
VIXL Miro Computer Products AG - for use with the Miro line of capture cards.
VLV1 Videologic - VLCAP.DRV
WBVC Winbond Electronics - W9960
XLV0 NetXL, Inc. - XL Video Decoder
YC12 Intel - YUV12 codec
YUV8 Winnov, Inc. - MM_WINNOV_CAVIAR_YUV8
YUV9 Intel - YUV9
YUYV Canopus - YUYV compressor
ZPEG Metheus - Video Zipper
The following list shows the FOURCC values for DIB compression.
Compressor Code Description
CYUV Creative Labs, Inc - Creative Labs YUV
FVF1 Iterated Systems, Inc. - Fractal Video Frame
IF09 Intel - Intel Intermediate YUV9
JPEG Microsoft - Still Image JPEG DIB
MJPG Microsoft - Motion JPEG DIB Format
PHMO IBM - Photomotion
ULTI IBM - Ultimotion
VDCT Vitec Multimedia - Video Maker Pro DIB
VIDS Vitec Multimedia - YUV 4:2:2 CCIR 601 for V422
YU92 Intel - YUV
Extensive information on Microsoft's Four Character Codes (FOURCC) may be
found at
<A HREF="
http://www.webartz.com/fourcc">Dave Wilson's The Almost Definitive FOURCC
Definition List</A>
Microsoft maintains a web page of FOURCC's for video, both pixel
layouts and compression, with the FOURCC's registered with Microsoft.
<A HREF="
http://www.microsoft.com/hwdev/devdes/fourcc.htm">
http://www.microsoft.com/hwdev/devdes/fourcc.htm</A>
Microsoft has defined 128-bit (16 byte) identifiers known as
Globally Unique Identifiers (GUIDs) to identify virtually everything
in the Microsoft Universe. Microsoft has defined mappings from
the Four Character Codes used for video and audio codecs to GUID's.
<A HREF="#FOURCCGUID">GUIDs for Video for Windows Codecs</A>
<A HREF="#Top">Return to Top</A>
<A NAME="ALGO">
<H2>Video Compression Technologies</H2>
</A>
There are several underlying technologies used by different Video for
Windows Codecs. For example, Indeo 3.2 and Cinepak both use Vector
Quantization. The international standards MPEG-1, MPEG-2, MPEG-4,
H.261, and H.263 all use a combination of the block Discrete Cosine
Transform (DCT) and motion estimation/compensation. Several of the
New Wave codecs use wavelet transform based image compression (the
Discrete Wavelet Transform or DWT). Other technologies include
Fractal Image Compression, represented by Iterated Systems.
Some general comments on image and video compression:
(1) Image compression may be lossless where no information is
lost during the compression process. The image produced by
the decompression (also known as decoding) process is identical
bit by bit with the original image. The widely used GIF format
is a lossless image and video (GIF89a or animated GIF) compression
format for images that use no more than 256 colors.
LOSSLESS COMPRESSION
(2) Image compression may be lossy where information is lost during
the compression process. These schemes exploit limitations of the
human visual system. Some errors are undetectable by the human
eye. Even though two images are different at the bit by bit level, the
human viewer cannot distinguish them. Some errors are detectable by
the human eye but acceptable. Some errors are detectable and very
annoying. The widely used JPEG image compression standard is a lossy
compression scheme.
LOSSY COMPRESSION
(3) Within lossy image and video compression, a compression scheme may
be perceptually lossless, in which case the human viewer cannot
distinguish between the original image or video and the decompressed
compressed image or video which has errors introduced by the lossy
compression. Most lossy image and video compression schemes have some sort of
quality factor or factors. If the quality is set high enough, then the image will
be perceptually lossless.
PERCEPTUALLY LOSSLESS COMPRESSION
(4) JPEG, MPEG, and other lossy compressed images and video
are often compressed beyond the point of perceptual losslessness, but
the compressed images and video are still acceptable to the human viewer.
If the compression and decompression degrade the image in a way that
is very similar or identical to the natural degradation of images that might
occur in the world then the human visual system will not object greatly.
Loss of fine detail in an image is often acceptable because humans
perceive objects in the natural world with widely varying levels of
detail depending on how close the human viewer is to the object and
whether the human viewer is looking directly at the object or not.
The human viewer sees less detail if an object is further away.
When a human viewer looks directly at an object, the viewer uses a
small very high resolution part of the retina. If an object is to one
side of the direction of view, the viewer is using lower resolution
parts of the retina. Human beings are also used to certain natural
forms of degradation such as rain, snow, and fog.
Note that in all these natural viewing situations, the human viewer
will still perceive sharp edges and lines in an image regardless of
the level of detail. The human viewer will usually perceive the
objects as the same object despite the variations in level of detail.
A horse is a horse is a horse.
NATURALLY LOSSY COMPRESSION
(5) Sufficiently low quality lossy compression will introduce visual
artifacts that are highly annoying to the human viewer. An example
is the blocking artifacts visible in highly compressed MPEG video and
other block Discrete Cosine Transform based image compression codecs.
At some point the lossy compression will introduce artifacts that are
very unnatural and are perceived as new objects in the scene or spurious
lines within the image.
The human visual system is very sensitive to lines or edges. One of its
main functions appears to be to detect and characterize physical
objects such as other people, potential threats such as predators, food
plants, and other things. Objects in the visual system are delineated by
edges. Anything such as a codec algorithm that destroys or creates an
edge in an image is noticed, particularly if the edge is interpreted as
the border of an object by the human visual and cognitive system.
UNNATURAL LOSSY COMPRESSION
All of the widely used video codecs are lossy compression algorithms.
At sufficiently high compression most of them will have problems with the
edges in the image. Vector quantization, block Discrete Cosine Transform,
and wavelet based image and video compression inherently do not
mathematically represent the intuitive notion of an edge or line.
THE POINT: In using and selecting video codecs, the author of an
AVI file (or a compressed digital video in general) needs to achieve
NATURALLY LOSSY COMPRESSION or better. Once the compression
introduces noticeable AND unnatural artifacts, the video is of very
limited use even in cases where some features and objects are
recognizable.
A basic description of video compression technologies follows. I have
tried to avoid the dense mathematics found in most of the technical
video and image compression literature.
<A HREF="#Top">Return to Top</A>
<A NAME="RLE">
<H3>Run Length Encoding</H3>
</A>
VIDEO CODECS THAT USE RUN LENGTH ENCODING
Microsoft RLE (MRLE)
Run length encoding is also used to encode the DCT coefficients in
the block Discrete Cosine Transform (DCT) based international
standards MPEG, H.261, H.263, and JPEG.
STRENGTHS AND WEAKNESSES
1. Works for bilevel or 8 bit graphic images such as cel animation.
2. Not good for high resolution natural images.
OVERVIEW
Run length encoding encodes a sequence or run of consecutive
pixels of the same color (such as black or white) as a single
codeword.
For example, the sequence of pixels
77 77 77 77 77 77 77
could be coded as
7 77 (for seven 77's)
Run length encoding can work well for bi-level images (e.g.
black and white text or graphics) and for 8 bit images, particularly
images such as cel animations which contain many runs of the same
color.
Run length encoding does not work well for 24 bit natural images
in general. Runs of the same color are not that common.
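
The idea can be sketched in a few lines. This is only an illustration
of run length encoding in general, using (run length, value) pairs. It
is not the actual bitstream format of Microsoft RLE, and the function
names are hypothetical.

```python
def rle_encode(pixels):
    """Encode a sequence of pixel values as (run length, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1] = (runs[-1][0] + 1, value)  # extend the current run
        else:
            runs.append((1, value))              # start a new run
    return runs

def rle_decode(runs):
    """Expand (run length, value) pairs back into pixel values."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels

print(rle_encode([77] * 7))          # [(7, 77)]
print(rle_encode([10, 11, 12, 13]))  # [(1, 10), (1, 11), (1, 12), (1, 13)]
```

The second call shows why natural images fare badly: when neighboring
pixels differ, every pixel becomes its own run and the output is larger
than the input.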
<A HREF="#Top">Return to Top</A>
<A NAME="VQ">
<H3>Vector Quantization</H3>
</A>
VIDEO CODECS THAT USE VECTOR QUANTIZATION
Indeo 3.2
Cinepak
Indeo 3.2 and Cinepak both use vector quantization. As with most
digital video, Indeo and Cinepak work in the YUV color space (not RGB
for example).
STRENGTHS AND WEAKNESSES
1. The encoding process is computationally intensive. It still cannot
be done in real time without dedicated hardware.
2. The decoding process is very fast.
3. Blocking artifacts at high compression.
4. Generally, block Discrete Cosine Transform and Discrete Wavelet
Transform based image compression methods can achieve higher compression.
OVERVIEW
The basic idea of Vector Quantization based image compression is to
divide the image up into blocks (4x4 pixels in YUV space for Indeo and
Cinepak). Typically, some blocks (hopefully many) are similar to
other blocks although usually not identical. The encoder identifies a
class of similar blocks and replaces these with a "generic" block
representative of the class of similar blocks. The encoder encodes a
lookup table that maps short binary codes to the "generic" blocks.
Typically, the shortest binary codes represent the most common classes
of blocks in the image.
The Vector Quantization (VQ) decoder uses the lookup table to assemble
an approximate image comprised of the "generic" blocks in the lookup
table.
Note that this is inherently a lossy compression process because the
actual blocks are replaced with a generic block that is a "good
enough" approximation to the original block.
The encoding process is slow and computationally intensive because the
encoder must accumulate statistics on the frequency of blocks and
calculate the similarity of blocks in order to build the lookup table.
The decoding process is very quick because it is lookup table based.
In Vector Quantization, the lookup table may be called a codebook.
The binary codes that index into the table may be called codewords.
Higher compression is achieved by making the lookup table smaller,
that is, by using fewer classes of similar blocks in the image. The quality of the
reproduced approximate image degrades as the lookup table becomes
smaller.
Vector Quantization is prone to blocking artifacts as compression is increased.
Vector Quantization is an entire sub-field in signal and image
processing. It goes well beyond the brief description above and
is applied to other uses than video compression.
The standard reference book on Vector Quantization is:
Vector Quantization and Signal Compression
A. Gersho and R. Gray
Boston, MA : Kluwer, 1992
Like many image and signal processing books, this is heavy on
abstract math.
<H4>A Simple Example</H4>
Consider the following 4 by 4 blocks of pixels. Each pixel has a value
in the range of 0-255. This is a grayscale image for simplicity.
(Block 1)
128 128 128 128
128 128 128 128
128 128 128 128
128 128 128 128
(Block 2)
128 127 128 128
128 128 128 128
128 128 127 128
128 128 128 128
(Block 3)
128 127 126 128
128 128 128 128
127 128 128 128
128 128 128 128
In practice, the blocks will look the same to a human viewer.
The second and third blocks could be safely replaced by the first
block. By itself, this does not compress the image. However, the
replacement block (Block 1) could be represented by a short index into
a lookup table of 4x4 blocks. For example, the index, in this case,
could be 1.
Lookup Table[1] = 128 128 128 128
128 128 128 128
128 128 128 128
128 128 128 128
The original image could be converted into a lookup table and a series
of indexes into the lookup table, achieving substantial compression.
In video, the same lookup table could be used for many frames, not just
a single frame.
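
The example above can be sketched as follows. This is a minimal
illustration, assuming blocks are stored as flat lists of 16 values and
that "similar" means the smallest sum of squared differences. The
three entry codebook is hypothetical, and real codecs such as Cinepak
build their codebooks from the video itself.

```python
def block_distance(a, b):
    """Sum of squared differences between two blocks of equal size."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def vq_encode(blocks, codebook):
    """Replace each block by the index of the closest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: block_distance(block, codebook[i]))
            for block in blocks]

def vq_decode(indexes, codebook):
    """Rebuild an approximate image from the codebook entries."""
    return [codebook[i] for i in indexes]

# 4x4 blocks stored as flat lists of 16 grayscale values.
block1 = [128] * 16
block2 = [128, 127] + [128] * 8 + [127] + [128] * 5
block3 = [128, 127, 126] + [128] * 5 + [127] + [128] * 7
codebook = [[0] * 16, [128] * 16, [255] * 16]  # hypothetical 3-entry codebook

indexes = vq_encode([block1, block2, block3], codebook)
print(indexes)  # [1, 1, 1]
```

All three blocks are coded as the same short index, and the decoder
reproduces each of them as the uniform block of 128's.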
<A HREF="#Top">Return to Top</A>
<A NAME="DCT">
<H3>Discrete Cosine Transform</H3>
</A>
VIDEO CODECS THAT USE DCT
Motion JPEG
Editable MPEG
MPEG-1
MPEG-2
MPEG-4
H.261
H.263
H.263+
STRENGTHS AND WEAKNESSES
1. Blocking artifacts at high compression.
2. Ringing at sharp edges. Occasional blurring at sharp edges.
3. Computationally intensive. Only recently has it been possible
to implement in real time on general purpose CPU's as opposed to
specialized chips.
OVERVIEW
The Discrete Cosine Transform (DCT) is a widely used transform in
image compression. The JPEG still image compression standard, the
H.261 (p*64) video-conferencing standard, the H.263 video-conferencing
standard and the MPEG (MPEG-1, MPEG-2, and MPEG-4) digital video
standards use the DCT. In these standards, a two-dimensional (2D) DCT
is applied to 8 by 8 blocks of pixels in the image that is compressed.
The 64 (8x8 = 64) coefficients produced by the DCT are then quantized
to provide the actual compression. In typical images, most DCT
coefficients from a DCT on an 8 by 8 block of pixels are small and
become zero after quantization. This property of the DCT on real world
images is critical to the compression schemes.
In addition, the human eye is less sensitive to the high frequency
components of the image represented by the higher DCT coefficients.
A large quantization factor can be, and usually is, applied to these
higher frequency components. The de facto standard quantization
matrix (a matrix of 64 quantization factors, one for each of the 64
DCT coefficients) in the JPEG standard has higher quantization factors
for higher frequency DCT coefficients.
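
Quantization itself is a simple division and rounding. The sketch
below uses made up coefficient values and illustrative quantization
factors that grow with frequency. Neither is taken from a particular
encoder, and the function names are hypothetical.

```python
def quantize(coefficients, factors):
    """Divide each DCT coefficient by its quantization factor and round."""
    return [round(c / q) for c, q in zip(coefficients, factors)]

def dequantize(levels, factors):
    """Approximate the original coefficients from the quantized levels."""
    return [level * q for level, q in zip(levels, factors)]

# Hypothetical values for 8 coefficients of a block, low to high frequency.
coefficients = [620.0, -43.0, 18.0, 7.0, -5.0, 3.0, -2.0, 1.0]
factors      = [16, 11, 10, 16, 24, 40, 51, 61]   # larger factors at higher frequencies

levels = quantize(coefficients, factors)
print(levels)                       # [39, -4, 2, 0, 0, 0, 0, 0]
print(dequantize(levels, factors))  # [624, -44, 20, 0, 0, 0, 0, 0]
```

The small high frequency coefficients become zero, and the decoder
recovers only an approximation of the others. This is where the loss in
lossy DCT compression occurs.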
The quantized DCT coefficients are then run-length encoded as variable
length codes that indicate some number of zero coefficients followed by
a non-zero coefficient. For example, a run-length code might indicate
4 zero coefficients followed by a non-zero coefficient of level 2. Short
variable length codes (e.g. 0110) are used for common combinations of
runs of zero and levels of the non-zero coefficient. Longer variable
length codes (e.g. 0000001101) are used for less common combinations of
runs of zero and levels of the non-zero coefficient. In this way,
substantial compression of the image is possible.
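
The conversion of quantized coefficients into run and level
combinations can be sketched as below. The variable length code tables
that map each combination to bits differ between standards and are not
shown. The function name is hypothetical.

```python
def run_level_pairs(levels):
    """Convert quantized coefficients to (zero run, non-zero level) pairs."""
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1            # count zeros preceding the next non-zero level
        else:
            pairs.append((run, level))
            run = 0
    return pairs                # trailing zeros are left to an end-of-block code

print(run_level_pairs([39, -4, 0, 0, 0, 0, 2, 0, 0, 0]))
# [(0, 39), (0, -4), (4, 2)]
```

The last pair, (4, 2), is the "4 zero coefficients followed by a
non-zero coefficient of level 2" combination used as an example above.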
The Discrete Cosine Transform itself is best explained as a 1 dimensional
(1D) DCT first. The 2D DCT is equivalent to performing a 1D DCT on each
row of a block of pixels followed by a 1D DCT on each column of the block of
pixels produced by the 1D DCT's on the rows.
The one dimensional Discrete Cosine Transform is applied to a block of
N samples (pixels in an image or sound pressure samples in an audio file).
The Discrete Cosine Transform is an NxN matrix whose rows are sampled
cosine functions:
DCT(m,n) = sqrt( (2 - delta(m,1) ) / N )
* cos( (pi/N) * (n - 1/2) * (m-1) )
where
DCT(m,n) is the 1D DCT Matrix
m,n = 1,...,N
pi = 3.14159265...
N = number of samples in block
delta(m,1) = 1 if m is 1
0 otherwise
cos(x) = cosine of x (radians)
* = multiply
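
The DCT matrix and the row and column procedure for the 2D DCT can be
sketched directly, without any fast algorithm. The sketch assumes the
usual orthonormal scaling, sqrt(1/N) for the first row and sqrt(2/N)
for the others, and uses indexes starting at 0 rather than 1. The
function names are my own.

```python
import math

def dct_matrix(n):
    """N x N DCT matrix; rows are sampled cosines (0-based indexes m, k)."""
    rows = []
    for m in range(n):
        scale = math.sqrt((1 if m == 0 else 2) / n)   # orthonormal scaling (assumed)
        rows.append([scale * math.cos(math.pi / n * (k + 0.5) * m) for k in range(n)])
    return rows

def dct_1d(samples):
    """Apply the 1D DCT to a list of N samples."""
    matrix = dct_matrix(len(samples))
    return [sum(c * s for c, s in zip(row, samples)) for row in matrix]

def dct_2d(block):
    """2D DCT: 1D DCT on every row, then on every column of the result."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]   # transpose back to row order

flat = [[128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 3))   # 1024.0, all energy in the first (DC) coefficient
```

For a perfectly flat block every coefficient except the first is zero,
which is the extreme case of the property that makes the DCT useful for
compression.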
Naively, performing a DCT on a block of N samples would require
N*N multiplies and adds. However, the DCT matrix has
a recursive structure that allows implementation with order
N log(N) multiplies and adds (many fewer). This makes the DCT practical
for implementation on current CPUs and DSPs.
<A HREF="#Top">Return to Top</A>
<A NAME="DWT">
<H3>Discrete Wavelet Transform (DWT)</H3>
</A>
Video Codecs Using the Discrete Wavelet Transform:
VDONet's VDOWave
VxTreme
Intel Indeo 5.x
(possibly) Intel Indeo 4.x
STRENGTHS AND WEAKNESSES
1. Most video and image compression implemented using the
Discrete Wavelet Transform does not exhibit the blocking, also known
as tiling, artifacts seen with the block Discrete Cosine Transform.
2. DWT based video and image compression often outperforms block DCT
compression if evaluated using the Peak Signal to Noise Ratio (PSNR)
or Mean Squared Error (MSE) metric (these are mathematically
equivalent).
3. The subjective quality of video or images compressed with
DWT can appear better than block DCT methods for the same
compression ratio.
4. Wavelet compression exhibits blurring and ringing artifacts
at sharp edges as compression increases. This is an inherent
weakness of the method, shared with the Discrete Cosine Transform
which also exhibits the same type of artifacts. The ringing artifacts
are also known as contouring, mosquito noise, and the Gibbs effect.
OVERVIEW
The Discrete Wavelet Transform essentially consists of passing a
signal, such as an image, through a pair of filters, a low pass filter
and a high pass filter. The low pass filter yields a coarse or low
resolution version of the signal. The high pass filter yields an
added detail or difference signal. The outputs of the two filters are
then downsampled by two. Thus, at this point the downsampled outputs
together have the same number of samples as the input signal. The parameters of
the two filters are selected so that when the upsampled output of the
low pass filter is added to the upsampled output of the high pass
filter, the original signal is reproduced.
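The output of the low pass filter, the coarse signal, may then be fed
into another pair of filters and the process repeated. In the usual
wavelet decomposition it is the coarse signal that is split again, while
the added detail signals are kept as they are.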
A simple example of the discrete wavelet transform is the Haar
wavelet transform.
The input signal is x[n], a series of samples indexed by n.
Haar Low Pass Filter (the average of two successive samples)
g[n] = 1/2 * ( x[n] + x[n+1] )
Haar High Pass Filter (the difference of two successive samples)
h[n] = 1/2 * ( x[n+1] - x[n] )
Note that:
x[n] = g[n] - h[n]
x[n+1] = g[n] + h[n]
The output sequences g[n] and h[n] contain redundant information.
One can safely downsample by two, that is omit the even or odd
samples, and still reproduce the original input signal x[n].
Usually the odd samples are omitted.
The complete input signal x[n] can be reproduced from
g[0], g[2], g[4], ....
h[0], h[2], h[4], ....
alone.
x[0] = g[0] - h[0]
x[1] = g[0] + h[0]
x[2] = g[2] - h[2]
x[3] = g[2] + h[2]
and so on.
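
The Haar example can be written out as a short sketch. The lists g and
h below hold only the samples kept after downsampling by two, that is
g[0], g[2], g[4], ... in the notation above. The function names are my
own and the input is assumed to have an even number of samples.

```python
def haar_forward(x):
    """Split an even-length signal into averages g and differences h."""
    g = [(x[n] + x[n + 1]) / 2 for n in range(0, len(x), 2)]
    h = [(x[n + 1] - x[n]) / 2 for n in range(0, len(x), 2)]
    return g, h

def haar_inverse(g, h):
    """Rebuild the signal: x[n] = g - h and x[n+1] = g + h for each pair."""
    x = []
    for avg, diff in zip(g, h):
        x.extend([avg - diff, avg + diff])
    return x

signal = [10, 12, 14, 14, 9, 3, 3, 3]
g, h = haar_forward(signal)
print(g)                   # [11.0, 14.0, 6.0, 3.0]
print(h)                   # [1.0, 0.0, -3.0, 0.0]
print(haar_inverse(g, h))  # [10.0, 12.0, 14.0, 14.0, 9.0, 3.0, 3.0, 3.0]
```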
The output of the low pass filter is a coarse approximation of the
original input signal. When the input signal is an image, this
means a - sometimes blurry - low resolution version of the original image.
The output of the high pass filter is an added detail or difference
signal. When combined with the coarse approximation, as described, the
original input signal is exactly reproduced.
The coarse approximation is sometimes called a base layer and the
added detail is sometimes called an enhancement layer.
The output of the low pass filter, g[n], can be fed into another
pair of filters, repeating the process. The Discrete Wavelet Transforms
used in wavelet image and video compression iterate the process many
times.
So far, no compression has occurred. The transform produces the same
number of values as the input signal.
The output values are called transform coefficients or wavelet transform
coefficients.
The Haar wavelet is used primarily for illustrative purposes. More
complex filters are used for most real wavelet implementations.
Compression is typically achieved by applying some form of quantization,
scalar quantization in simple implementations and vector quantization
in more complex implementations, to the added detail signals. Some type of
<A HREF="#Entropy">entropy coding</A> may be applied to the quantized
transform coefficients.
For example, in the crude Haar wavelet example, the input signal x[n]
might be a series of 8 bit samples such as a raster scan of a gray
scale image. One could continue to use 8 bits for the output of the
low pass filter, g[n]. However, use only 6 bits (divide by 4) for the
output of the high pass filter, h[n] - the detail signal. This is
scalar quantization. Further, the output of the high pass filter in
the Haar case will tend to be peaked toward zero, that is small
coefficients will be more common than large coefficients. Thus, some
form of <A HREF="#Entropy">entropy coding</A> of the detail signal
h[n] is possible.
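
The divide by 4 quantization of the detail signal can be sketched as
below. The sample values are made up for illustration and the function
names are hypothetical.

```python
def quantize_detail(h, step=4):
    """Scalar quantization: divide each detail value by step and round."""
    return [round(value / step) for value in h]

def dequantize_detail(levels, step=4):
    """Approximate detail values recovered by the decoder."""
    return [level * step for level in levels]

detail = [1.0, 0.0, -3.0, 0.0, 21.0, -1.5]
levels = quantize_detail(detail)
print(levels)                     # [0, 0, -1, 0, 5, 0]
print(dequantize_detail(levels))  # [0, 0, -4, 0, 20, 0]
```

Most of the quantized values are zero, which is what makes the entropy
coding of the detail signal effective.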
In fact, for most natural images, the low pass signal g[n]
will be correlated with
previous samples g[n-1] except at edges. g[n] will tend to be close
to or the same as g[n-1]. Objects tend to
have a continuous surface material, such as skin, with a constant
surface reflectance, a texture. The texture can be a simple
constant color level or a complex periodic or quasi-periodic
pattern, for example wallpaper or marble. Thus the low
pass signal g[n] could be encoded as a difference from previous
samples g[n-1] to achieve further compression.
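
Encoding the low pass signal as differences from the previous sample
can be sketched as follows. This is a lossless illustration with made
up sample values. The function names are my own.

```python
def delta_encode(g):
    """Store the first sample, then each sample as a difference from the previous one."""
    return [g[0]] + [g[n] - g[n - 1] for n in range(1, len(g))]

def delta_decode(deltas):
    """Undo delta_encode by accumulating the differences."""
    g = [deltas[0]]
    for d in deltas[1:]:
        g.append(g[-1] + d)
    return g

coarse = [100, 101, 101, 102, 180, 181, 181]   # one edge between 102 and 180
deltas = delta_encode(coarse)
print(deltas)                 # [100, 1, 0, 1, 78, 1, 0]
print(delta_decode(deltas))   # [100, 101, 101, 102, 180, 181, 181]
```

The differences are small everywhere except at the edge, so they can be
entropy coded with fewer bits than the samples themselves.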
For image coding, the notion is that the human visual system is less
sensitive to fine details in the image than to the gross features.
Thus quantization can be applied to the detail signals more strongly.
<A HREF="#Top">Return to Top</A>
<A NAME="Contour">
<H3>Contour-Based Image Coding</H3>
</A>
Video Codecs Using Contour-Based Image Coding:
Crystal Net's Surface Fitting Method (SFM) may be an example of
Contour-Based Image Coding.
The emerging ISO standard MPEG-4 incorporates some ideas associated
with Contour-Based Image Coding.
OVERVIEW
A contour is a line representing the outline of a figure, body, or
mass. A texture is a representation of the structure of a surface.
Contour-based image coding represents images as contours bounding
textured regions. Since contours frequently correspond to the boundaries
of objects in a scene there is a close relationship between
contour-based image coding and object-based image coding. Object-based
image coding represents an image as a collection of objects.
For example, once contours and textures have been extracted from an
image, the contours can be encoded as the control points of a spline -
a polynomial function used to represent curves - fitted to the
contour. The textures can be encoded as the transform coefficients
from a spatial frequency transform such as the Discrete Cosine
Transform or the many variants of the Discrete Wavelet Transform.
Compression can be achieved through <A HREF="#Entropy">entropy
coding</A> of scalar or vector quantization of the control parameters
of the spline and the transform coefficients used for the texture.
Contour-Based Image Coding is a leading edge image coding
technology (as of May, 1999). Extracting contours, also known as
edge or line detection, remains an unsolved problem in computer
science. Whereas human viewers have an easy sense of what is and
is not a contour or line in a scene, computer algorithms - so far -
miss some contours, as defined by humans, and also find spurious
contours, as defined by humans. Extracting contours is one of a number
of pattern recognition and reasoning tasks that seem almost effortless
in humans but have proven difficult - impossible so far - to emulate
with computers.
A number of edge detection and image segmentation algorithms exist that
could be applied to the contour extraction in contour-based image
coding.
In principle, contour-based image coding could circumvent the
problems that transform coding methods such as the Discrete Cosine
Transform (JPEG, MPEG, H.261, H.263, DV, etc.) and the Discrete
Wavelet Transform (Intel Indeo, VDONet VDOWave, etc.) encounter at
sharp edges, achieving higher compression.
<A NAME="FD">
<H3>Frame Differencing</H3>
</A>
VIDEO CODECS THAT USE FRAME DIFFERENCING
Cinepak
STRENGTHS AND WEAKNESSES
1. Generally can achieve better compression than independent encoding
of individual frames.
2. Errors accumulate in successive frames after a key frame, eventually
requiring another key frame. (see below)
OVERVIEW
Frame Differencing exploits the fact that little changes from frame to
frame in many video or animation sequences. For example, a video
might show a ball flying through the air in front of a static
background. Most of the image, the background, does not change from
frame to subsequent frame in the scene.
In frame differencing, a still image compression method such as
vector quantization is applied to the difference between the frame and
the decoded previous frame. Often, most of the difference is zero or
small values which can be heavily compressed.
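A minimal C sketch of this step follows. It assumes 8 bit grayscale frames stored as flat arrays, and it uses a crude uniform quantizer (QSTEP) as a stand-in for the still image compression method; Cinepak itself uses vector quantization. The names fd_encode, fd_decode, and QSTEP are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quantizer step; stands in for a real lossy coder. */
#define QSTEP 4

/* Clamp an int to the 0..255 range of an 8 bit sample. */
static unsigned char clamp_u8(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (unsigned char)v;
}

/* Encode: quantized difference between the current frame and the
 * DECODED previous frame. */
void fd_encode(const unsigned char *cur, const unsigned char *prev_decoded,
               int *residual, size_t n)
{
    size_t i;
    assert(cur != NULL && prev_decoded != NULL && residual != NULL);
    for (i = 0; i < n; i++) {
        int d = (int)cur[i] - (int)prev_decoded[i];
        residual[i] = d / QSTEP;   /* lossy: small differences become 0 */
    }
}

/* Decode: add the dequantized difference back onto the previous frame. */
void fd_decode(const int *residual, const unsigned char *prev_decoded,
               unsigned char *out, size_t n)
{
    size_t i;
    assert(residual != NULL && prev_decoded != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = clamp_u8((int)prev_decoded[i] + residual[i] * QSTEP);
}
```

For a static background nearly every residual is zero. Note that the encoder has to run the decoder itself to obtain the decoded previous frame that both sides share.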
Most often, frame differencing uses "key frames" which are frames
compressed without reference to a previous frame. This limits accumulated
errors and enables seeking within the video stream.
In the widely used Cinepak, a key frame is often set every 15 frames.
Cinepak movies usually use frame differencing combined with vector
quantization.
If the compression scheme is lossy (vector quantization is lossy), errors
will accumulate from frame to frame. Eventually these errors will become
visible. This necessitates key frames!
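The role of key frames in seeking can be sketched as follows, assuming a fixed key frame interval such as the 15 frames mentioned above. Real encoders may place key frames at other positions as well, so treat the fixed interval as an assumption of this sketch. Both function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 if frame `index` (counting from 0) is a key frame when a
 * key frame is set every `interval` frames. */
int is_key_frame(long index, long interval)
{
    assert(index >= 0 && interval > 0);
    return index % interval == 0;
}

/* To display frame `target`, a decoder must start at the nearest key
 * frame at or before it and decode every difference frame in between. */
long seek_start_frame(long target, long interval)
{
    assert(target >= 0 && interval > 0);
    return target - (target % interval);
}
```

With an interval of 15, seeking to frame 37 means decoding frames 30 through 37. A shorter interval makes seeking cheaper and limits error build-up, at the cost of more key frames, which compress less well.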
<A HREF="#Top">Return to Top</A>
<A NAME="Motion">
<H3>Motion Compensation</H3>
</A>
VIDEO CODECS THAT USE MOTION COMPENSATION
ClearVideo (RealVideo) Fractal Video Codec from Iterated Systems
VDOWave from VDONet
VxTreme
MPEG-1, 2, and 4
H.261
H.263
H.263+
STRENGTHS AND WEAKNESSES
1. Motion compensation generally achieves higher video compression
than frame differencing.
2. The encoding phase of motion compensation (known as motion
estimation) is computationally intensive. MPEG-1 with I, P, and
B frames cannot be encoded in real time without dedicated
hardware, a silicon implementation of motion estimation.
3. The motion compensation scheme used in the international
standards MPEG, H.261, and H.263 works best for scenes with
limited motion such as talking heads. In general, video with
heavy motion such as sports video is hard to compress with
motion compensation.
OVERVIEW
Motion Compensation codes for motion within a scene such as a ball
moving across a background. The block Discrete Cosine Transform (DCT)
based international video standards MPEG-1, MPEG-2, MPEG-4, H.261, and
H.263 use motion compensation. Iterated Systems ClearVideo
(RealVideo) fractal video codec, VDOWave from VDONet, and VxTreme's
video codec use forms of motion compensation.
Motion Compensation refers to a number of ideas and algorithms. The
motion compensation method used in MPEG and related international
standards (H.261 and H.263) is described below. This motion
compensation works for translational motion only. This is suited for
objects moving across a background or panning of the camera. It does
not work well for spinning objects, resizing objects, or camera zooms.
Alternative forms of motion compensation exist which handle
rotational, scaling, skewing, and other kinds of motion in a scene.
Recognizing objects such as a flying ball in a scene is an unsolved
problem in image processing and understanding. A way to exploit
the motion of the ball to achieve image compression is to partition the
image into blocks (16x16 pixels in MPEG-1). Code a "motion vector" for
each block which points to the 16x16 pixel block in a previous (or
future) frame that most closely approximates the block being coded.
In many cases this reference block will be the same block (no motion).
In some cases this reference block will be a different block (motion).
The encoder need not recognize the presence of a ball or other object, only
compare blocks of pixels in the current frame and the reference frame.
COMPRESSION IS ACHIEVED BY SENDING OR STORING ONLY THE MOTION VECTOR
(AND A POSSIBLE SMALL ERROR) INSTEAD OF THE PIXEL VALUES FOR THE
ENTIRE BLOCK.
Note that the reference block can be anywhere in the image. The coded
or "predicted" blocks must form a partition or tiling of the image (frame)
being decoded. A reference block can be any 16x16 pixel block in
the reference frame (image) that most closely approximates the coded
or "predicted" block. The reference frame must be decoded prior to
the current frame being decoded.
However, the reference frame need not be PRESENTED before the current
frame being decoded. In fact, the reference frame could be a future
frame!! MPEG allows for this through so-called B (bi-directionally
predicted) frames.
In the example of the ball in front of a static background, no motion
occurs in most of the blocks. For these cases, the motion vectors
are zero. Motion compensation for these blocks is then equivalent
to frame differencing, where the difference between the block and the
same block in a previous (or future) frame is coded.
For the block or blocks containing the moving ball, the motion vectors
will be non-zero, pointing to a block in a previous (or future) frame
that contains the ball. The displaced block is subtracted from
the current block. In general, there will be some left over non-zero
values, which are then coded using the still image compression scheme
such as Vector Quantization, the Block Discrete Cosine Transform, or
the Discrete Wavelet Transform.
MPEG style motion compensation does not require recognition of the
ball. An encoder simply compares the block being coded with displaced
blocks in the reference frame (a previous or future frame). The
comparison can use mean squared error or some other metric of
differences between images. The encoder selects the displaced block
with the smallest mean squared error difference! At no point has the
encoder recognized an object in the image.
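The block comparison described above can be sketched in C as an exhaustive search. This is a simplified illustration, not the search used by any particular MPEG encoder: it assumes 8 bit grayscale frames, whole-pixel displacements, and a search limited to +/- range pixels. It compares the sum of squared differences, which ranks candidates the same way as mean squared error since every block has the same number of pixels. The names MotionVector, block_sse, and estimate_motion are hypothetical.

```c
#include <assert.h>

typedef struct { int dx, dy; } MotionVector;

/* Sum of squared differences between the bsize x bsize block at (bx, by)
 * in cur and the block displaced by (dx, dy) in ref. */
static long block_sse(const unsigned char *cur, const unsigned char *ref,
                      int width, int bx, int by, int dx, int dy, int bsize)
{
    long sum = 0;
    int x, y;
    for (y = 0; y < bsize; y++) {
        for (x = 0; x < bsize; x++) {
            int a = cur[(by + y) * width + (bx + x)];
            int b = ref[(by + dy + y) * width + (bx + dx + x)];
            sum += (long)(a - b) * (a - b);
        }
    }
    return sum;
}

/* Exhaustive search for the displaced block with the smallest error.
 * The zero vector is tried first and is kept on ties. */
MotionVector estimate_motion(const unsigned char *cur, const unsigned char *ref,
                             int width, int height, int bx, int by,
                             int bsize, int range)
{
    MotionVector best = {0, 0};
    long best_err;
    int dx, dy;
    assert(cur != 0 && ref != 0);
    assert(bx >= 0 && by >= 0 && bx + bsize <= width && by + bsize <= height);
    best_err = block_sse(cur, ref, width, bx, by, 0, 0, bsize);
    for (dy = -range; dy <= range; dy++) {
        for (dx = -range; dx <= range; dx++) {
            long err;
            /* Skip candidates that fall outside the reference frame. */
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + bsize > width || by + dy + bsize > height)
                continue;
            err = block_sse(cur, ref, width, bx, by, dx, dy, bsize);
            if (err < best_err) {
                best_err = err;
                best.dx = dx;
                best.dy = dy;
            }
        }
    }
    return best;
}
```

For MPEG-1 style blocks, bsize would be 16. The cost grows with the square of the search range, which is one reason motion estimation is so much more expensive than decoding.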
In MPEG, the motion vectors are encoded as variable length codes for
greater compression.
The encoding process is called Motion Estimation. This finds the
motion vector (or vectors) for each block.
The decoding process is called Motion Compensation.
Motion Compensation achieves greater compression than simple Frame
Differencing.
ILLUSTRATIVE EXAMPLE:
        Predicted Region                    Reference Region
(the current frame being decoded)     (a previously decoded frame)
           _________                           _________
           |   |   |                           |   |   |
           | * |   |                           |   | * |    (moving ball)
           | 4 |-4 |                           |   |   |
           |   |   |                           |   |   |
           _________                           _________
           |   |   |                           |   |   |
           | 0 | 0 |                           |   |   |    (no change)
           |   |   |                           |   |   |
           |   |   |                           |   |   |
           _________                           _________
The asterisk (*) represents a ball flying across the scene from right
to left.
The region contains four blocks with associated motion vectors
(4, -4, 0, and 0). The upper left block looks like the upper right
block in the reference region (where the ball was). The upper right
block looks like the upper left block in the reference region. The
lower left and lower right blocks were unchanged. In this simple
example, the vertical displacement is zero and is ignored.
In this simple example, the region can be decoded using the
motion vectors alone. In more general cases, there is an
error between the frame predicted using motion vectors alone
and the actual frame. This error is coded using a still image
compression scheme such as the block Discrete Cosine Transform (DCT).
In this simple example, the previously decoded frame is also
previous in the presentation order. The previously decoded or
reference frame precedes the current frame in time. In general, keep
in mind the distinction between decode order and presentation order.
The reference frame could be a future frame.
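The decoding side of the example can be sketched in C. The sketch assumes 8 bit grayscale frames and whole-pixel motion vectors, and it takes the error signal as an optional array that has already been decoded; in a real codec that residual would come from the still image decoder. The name compensate_block and its parameter layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp an int to the 0..255 range of an 8 bit sample. */
static unsigned char clamp_u8(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (unsigned char)v;
}

/* Rebuild the bsize x bsize block at (bx, by) of the predicted frame by
 * copying the block displaced by (dx, dy) from the reference frame and
 * adding the residual. residual may be NULL when the motion vector alone
 * is enough, as in the simple example. */
void compensate_block(unsigned char *pred, const unsigned char *ref,
                      const int *residual, int width, int height,
                      int bx, int by, int dx, int dy, int bsize)
{
    int x, y;
    assert(pred != NULL && ref != NULL);
    assert(bx >= 0 && by >= 0 && bx + bsize <= width && by + bsize <= height);
    assert(bx + dx >= 0 && by + dy >= 0);
    assert(bx + dx + bsize <= width && by + dy + bsize <= height);
    for (y = 0; y < bsize; y++) {
        for (x = 0; x < bsize; x++) {
            int v = ref[(by + dy + y) * width + (bx + dx + x)];
            if (residual != NULL)
                v += residual[y * bsize + x];
            pred[(by + y) * width + (bx + x)] = clamp_u8(v);
        }
    }
}
```

Calling this once per block with the vectors (4, 0), (-4, 0), (0, 0), and (0, 0) reproduces the four-block example: the ball is copied from the upper right block of the reference into the upper left block of the predicted region.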
<A HREF="#Top">Return to Top</A>
<A NAME="ACM">
<H3>Audio Codecs</H3>
</A>
The sound tracks in an AVI file are Microsoft Waveform Audio (WAV) files.
The Waveform Audio files can be uncompressed PCM (Pulse Code Modulated) audio
or compressed with many different audio codecs (compressor/decompressors).
The Windows multimedia system uses the terms WAVE or waveform audio to
refer to audio that consists of digitally sampled sound, in contrast
to audio stored as notes of music as in MIDI, another type of audio
incorporated in the Windows multimedia system.
The WAV, WAVE, or waveform audio system (different authors use all
three to refer to the Microsoft Windows audio system) predates
Video for Windows. Video for Windows was wrapped around
WAVE. Various compromises were made to ensure backward compatibility
with existing WAVE applications, drivers, and files. Keep this
in mind as there are various differences between the audio
system and the video system described in other sections of this
overview.
To play a waveform audio (WAV) file or the sound track of an AVI
compressed with a codec, the codec must be installed in the Audio
Compression Manager under Windows. Windows PCM files (uncompressed)
are always supported. The Audio Compression Manager (ACM) is the
software system in Windows that manages waveform audio codecs and
filters.
Different audio codecs are identified with different waveform audio
format tags, 16 bit numbers stored in the wFormatTag field of the wave
format header. Wave audio format tags are registered with
Microsoft.
The following list of registered wave audio formats is from the
mmreg.h file in the Win32 SDK. mmreg.h is the Registered Multimedia
Information Public Header File.
/* WAVE form wFormatTag IDs */
#define WAVE_FORMAT_UNKNOWN 0x0000 /* Microsoft Corporation */
#define WAVE_FORMAT_ADPCM 0x0002 /* Microsoft Corporation */
#define WAVE_FORMAT_IBM_CVSD 0x0005 /* IBM Corporation */
#define WAVE_FORMAT_ALAW 0x0006 /* Microsoft Corporation */
#define WAVE_FORMAT_MULAW 0x0007 /* Microsoft Corporation */
#define WAVE_FORMAT_OKI_ADPCM 0x0010 /* OKI */
#define WAVE_FORMAT_DVI_ADPCM 0x0011 /* Intel Corporation */
#define WAVE_FORMAT_IMA_ADPCM (WAVE_FORMAT_DVI_ADPCM) /* Intel Corporation */
#define WAVE_FORMAT_MEDIASPACE_ADPCM 0x0012 /* Videologic */
#define WAVE_FORMAT_SIERRA_ADPCM 0x0013 /* Sierra Semiconductor Corp */
#define WAVE_FORMAT_G723_ADPCM 0x0014 /* Antex Electronics Corporation */
#define WAVE_FORMAT_DIGISTD 0x0015 /* DSP Solutions, Inc. */
#define WAVE_FORMAT_DIGIFIX 0x0016 /* DSP Solutions, Inc. */
#define WAVE_FORMAT_DIALOGIC_OKI_ADPCM 0x0017 /* Dialogic Corporation */
#define WAVE_FORMAT_YAMAHA_ADPCM 0x0020 /* Yamaha Corporation of America */
#define WAVE_FORMAT_SONARC 0x0021 /* Speech Compression */
#define WAVE_FORMAT_DSPGROUP_TRUESPEECH 0x0022 /* DSP Group, Inc */
#define WAVE_FORMAT_ECHOSC1 0x0023 /* Echo Speech Corporation */
#define WAVE_FORMAT_AUDIOFILE_AF36 0x0024 /* */
#define WAVE_FORMAT_APTX 0x0025 /* Audio Processing Technology */
#define WAVE_FORMAT_AUDIOFILE_AF10 0x0026 /* */
#define WAVE_FORMAT_DOLBY_AC2 0x0030 /* Dolby Laboratories */
#define WAVE_FORMAT_GSM610 0x0031 /* Microsoft Corporation */
#define WAVE_FORMAT_ANTEX_ADPCME 0x0033 /* Antex Electronics Corporation */
#define WAVE_FORMAT_CONTROL_RES_VQLPC 0x0034 /* Control Resources Limited */
#define WAVE_FORMAT_DIGIREAL 0x0035 /* DSP Solutions, Inc. */
#define WAVE_FORMAT_DIGIADPCM 0x0036 /* DSP Solutions, Inc. */
#define WAVE_FORMAT_CONTROL_RES_CR10 0x0037 /* Control Resources Limited */
#define WAVE_FORMAT_NMS_VBXADPCM 0x0038 /* Natural MicroSystems */
#define WAVE_FORMAT_CS_IMAADPCM 0x0039 /* Crystal Semiconductor IMA ADPCM */
#define WAVE_FORMAT_G721_ADPCM 0x0040 /* Antex Electronics Corporation */
#define WAVE_FORMAT_MPEG 0x0050 /* Microsoft Corporation */
#define WAVE_FORMAT_CREATIVE_ADPCM 0x0200 /* Creative Labs, Inc */
#define WAVE_FORMAT_CREATIVE_FASTSPEECH8 0x0202 /* Creative Labs, Inc */
#define WAVE_FORMAT_CREATIVE_FASTSPEECH10 0x0203 /* Creative Labs, Inc */
#define WAVE_FORMAT_FM_TOWNS_SND 0x0300 /* Fujitsu Corp. */
#define WAVE_FORMAT_OLIGSM 0x1000 /* Ing C. Olivetti & C., S.p.A. */
#define WAVE_FORMAT_OLIADPCM 0x1001 /* Ing C. Olivetti & C., S.p.A. */
#define WAVE_FORMAT_OLICELP 0x1002 /* Ing C. Olivetti & C., S.p.A. */
#define WAVE_FORMAT_OLISBC 0x1003 /* Ing C. Olivetti & C., S.p.A. */
#define WAVE_FORMAT_OLIOPR 0x1004 /* Ing C. Olivetti & C., S.p.A. */
//
// the WAVE_FORMAT_DEVELOPMENT format tag can be used during the
// development phase of a new wave format. Before shipping, you MUST
// acquire an official format tag from Microsoft.
//
#define WAVE_FORMAT_DEVELOPMENT (0xFFFF)
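To see which format tag a particular file uses, the tag can be read from the "fmt " chunk of a WAV file. The following C sketch works on a file image already loaded into memory and assumes the usual RIFF layout: a 12 byte RIFF/WAVE header followed by chunks, with wFormatTag as the first 16 bit little-endian field of the "fmt " chunk. The function name wav_format_tag is hypothetical, and the sketch does no validation beyond what is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read little-endian 16 and 32 bit values from a byte buffer. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Returns the wFormatTag of the first "fmt " chunk in a RIFF/WAVE image,
 * or -1 if the buffer is not a WAVE file or has no usable "fmt " chunk. */
long wav_format_tag(const unsigned char *buf, size_t len)
{
    size_t pos = 12;
    assert(buf != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;
    while (pos + 8 <= len) {
        unsigned long size = rd32(buf + pos + 4);
        if (memcmp(buf + pos, "fmt ", 4) == 0) {
            if (size < 2 || pos + 10 > len)
                return -1;
            return (long)rd16(buf + pos + 8);
        }
        if (size > len - pos - 8)
            return -1;                 /* chunk runs past the buffer */
        pos += 8 + size + (size & 1);  /* chunks are padded to even size */
    }
    return -1;
}
```

A returned value of 0x0011, for example, corresponds to WAVE_FORMAT_DVI_ADPCM in the list above.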
<A HREF="#Top">Return to Top</A>
<A NAME="ACMInstalled">
<H3>How to determine which Audio Codecs are Installed</H3>
</A>
View the SYSTEM.INI file. In Windows 95, 32 bit Audio Codecs are
listed in the [drivers32] section.
[drivers32]
msacm.lhacm=lhacm.acm
msacm.l3codec=l3codecb.acm
msacm.msg723=msg723.acm
msacm.msnaudio=msnaudio.acm
The string msacm stands for Microsoft Audio Compression Manager (ACM).
This is the system software component that manages audio codecs (and
other audio components) in 16 bit Windows and in Win32. The different
codecs are identified by a string of arbitrary length such as msnaudio
for Microsoft Network Audio. Note that this differs from Video for
Windows where everything is a Four Character Code.
The audio codecs in the example above were installed in Windows 95 by
Microsoft's NetShow streaming audio/video product.
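The same lookup can be done by a program. The C sketch below scans the text of a SYSTEM.INI file that has already been read into a string and reports the msacm. entries in the [drivers32] section. It is a simplified, portable illustration: it assumes lower-case section and key names exactly as shown above and ignores comments and whitespace variations. The function name list_acm_codecs is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scans INI-style text and writes each "msacm.<name>=<file>" line found
 * in the [drivers32] section to `out` (which may be NULL).
 * Returns the number of entries found. */
int list_acm_codecs(const char *ini, FILE *out)
{
    int in_section = 0;
    int count = 0;
    assert(ini != NULL);
    while (*ini != '\0') {
        const char *eol = strchr(ini, '\n');
        size_t len = eol ? (size_t)(eol - ini) : strlen(ini);
        size_t n = len;
        if (n > 0 && ini[n - 1] == '\r')
            n--;                                  /* tolerate CR LF */
        if (n > 0 && ini[0] == '[') {
            /* A new section starts; "[drivers32]" is 11 characters. */
            in_section = (n == 11 && strncmp(ini, "[drivers32]", 11) == 0);
        } else if (in_section && n > 6 && strncmp(ini, "msacm.", 6) == 0) {
            if (out != NULL)
                fprintf(out, "%.*s\n", (int)n, ini);
            count++;
        }
        ini += len;
        if (*ini == '\n')
            ini++;
    }
    return count;
}
```

On Windows itself, a program would more likely use the Win32 profile functions such as GetPrivateProfileString rather than parse the file by hand.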
The Audio Compression Manager (ACM) is in the process of being displaced by
ActiveMovie.
Alternatively, in Windows 95:
(1) Open the Control Panel
(2) Double Click on the Multimedia Icon
(3) Select the Advanced tab
(4) Under the Multimedia Drivers icon, double click on the Audio
Compression Codecs icon to see a list of installed audio codecs. This
information is the same as the information stored in the SYSTEM.INI
file (see above).
<A HREF="#Top">Return to Top</A>
<A NAME="ActiveMovie">
<H2>ActiveMovie</H2>
</A>
ActiveMovie is a new multimedia architecture for Windows 95 and
Windows NT (4.0 and after). ActiveMovie includes support for
playing AVI, QuickTime (.MOV), and MPEG files. ActiveMovie is
apparently intended to supersede Video for Windows.
ActiveMovie 1.0 ships with the OEM Service Release 2 (OSR2) of Windows 95.
It did not ship with prior releases of Windows 95 but was available
separately through the ActiveMovie SDK.
ActiveMovie 1.0 is also bundled with Microsoft's Internet Explorer for
Windows 95 and NT 4.0. Internet Explorer can be downloaded from the
Microsoft Web site at:
<A HREF="http://www.microsoft.com/ie/">http://www.microsoft.com/ie/</A>
ActiveMovie 1.0 can be downloaded by itself from the Microsoft Internet
Explorer site. (6/6/97)
ActiveMovie 1.0 appears to be a 32 bit software component that runs
under both Windows 95 and Windows NT 4.0 user mode.
ActiveMovie provides at least three different programming
interfaces:
- The ActiveMovie ActiveX Control
- ActiveMovie Component Object Model (COM) interfaces
- The OM-1 MPEG MCI (Media Control Interface) command set
Amongst other things, the ActiveMovie ActiveX Control can be embedded
in HTML Web pages and programmed via VBScript or JavaScript. It can also
be programmed using Visual C++ or Visual Basic as part of applications.
The ActiveMovie COM interfaces can be accessed through Visual C++ or
Visual Basic.
ActiveMovie supports a subset of the Media Control Interface (MCI)
commands familiar to Video for Windows programmers. These commands can
be accessed through the mciSendCommand(...) and mciSendString(...)
functions in C/C++.
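A minimal C sketch of the string form of the interface follows. The generic MCI command strings "open", "play", and "close" are used; whether a given command falls inside the subset that ActiveMovie supports is not confirmed here. The helper names mci_build_open and mci_play_file and the alias "movie" are hypothetical. The part that actually calls mciSendString is compiled only on Windows and must be linked with winmm.lib.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>   /* mciSendString; link with winmm.lib */
#endif

/* Builds the MCI "open" command string for a media file.
 * Returns 0 on success, -1 if the buffer is too small. */
int mci_build_open(char *cmd, size_t cap, const char *path, const char *alias)
{
    int n;
    assert(cmd != NULL && path != NULL && alias != NULL);
    n = snprintf(cmd, cap, "open \"%s\" alias %s", path, alias);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

#ifdef _WIN32
/* Opens a file, plays it to the end, and closes it.
 * Returns 0 on success, -1 if the file could not be opened. */
int mci_play_file(const char *path)
{
    char cmd[512];
    if (mci_build_open(cmd, sizeof cmd, path, "movie") != 0)
        return -1;
    if (mciSendStringA(cmd, NULL, 0, NULL) != 0)
        return -1;
    mciSendStringA("play movie wait", NULL, 0, NULL);
    mciSendStringA("close movie", NULL, 0, NULL);
    return 0;
}
#endif
```

The "wait" flag makes the play command return only when playback has finished, which keeps the sketch simple. An interactive program would normally omit it and request a notification instead.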
ActiveMovie 1.0 does NOT provide video capture. Windows 95 with
ActiveMovie 1.0 continues to use the Video for Windows video capture system
and drivers.
ActiveMovie 2.0 (renamed DirectShow in 1997) will provide a new,
alternative mechanism for video capture. According to information
distributed by Microsoft at the WDM Device Driver Conference in April,
1997, ActiveMovie 2.0 will use the WDM Stream Class under Memphis
(formerly Windows 97) and Windows NT 5.0 to implement video capture.
This is subject to possible change since neither Memphis nor NT 5.0
has been released (7/27/97).
Extensive information on ActiveX and ActiveMovie is available at the
Microsoft Web site.
ActiveMovie 1.0 SDK Documentation at (6/28/97):
<A HREF="http://www.microsoft.com/devonly/tech/amov1doc/">http://www.microsoft.com/devonly/tech/amov1doc/</A>
<A HREF="#Top">Return to Top</A>
<A NAME="GUID">
<H2>GUID's and AVI</H2>
</A>
GUID stands for Globally Unique IDentifier.
In Microsoft's Component Object Model (COM) morass, an object oriented
programming model that incorporates MFC (Microsoft Foundation
Classes), OLE (Object Linking and Embedding), ActiveX, ActiveMovie and
everything else Microsoft is hawking lately, a GUID is a 16 byte or
128 bit number used to uniquely identify objects, data formats,
everything.
Within ActiveMovie, there are GUID's for video formats, corresponding
to the FOURCC's or Four Character Codes used in Video for Windows.
These are specified in the file uuids.h in the ActiveMovie Software
Developer Kit (SDK). ActiveMovie needs to pass around GUID's that
correspond to the FOURCC for the video in an AVI file.
With proper programming, this should be hidden from end users but
ActiveMovie programmers need to know about GUID's.
<A HREF="#Top">Return to Top</A>
<A NAME="FOURCCGUID">
<H2>What are the GUIDs for the Video for Windows Codecs?</H2>
</A>
Video for Windows codecs are identified by a thirty-two bit Four
Character Code (FOURCC). A Four Character Code is a thirty-two bit
value formed from the ASCII codes for four characters. Typically, the
four characters are a mnemonic for the item identified. For example,
the popular Cinepak video codec is 'CVID'.
Microsoft has introduced 128-bit (16 byte) Globally Unique
Identifiers (GUIDs) for identifying everything in the
Microsoft Universe. Microsoft has established a mapping
procedure from the human readable Four Character Codes to
GUIDs for video codecs. Replace the "x"'s in the GUID
below with the 32-bit value built from the Four Character
Code. The Four Character Code is in 7-bit ASCII.
xxxxxxxx-0000-0010-8000-00AA00389B71
For example, the GUID for Radius Cinepak is:
44495643-0000-0010-8000-00AA00389B71
44 is the hexadecimal (base 16) ASCII code for 'D'
49 is the hexadecimal (base 16) ASCII code for 'I'
56 is the hexadecimal (base 16) ASCII code for 'V'
43 is the hexadecimal (base 16) ASCII code for 'C'
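Note that the order of the characters is reversed from
naive expectation. The first character of the Four Character
Code occupies the lowest-order byte of the 32-bit value, so it
appears last when the value is written in hexadecimal.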
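The mapping can be written as a short C routine. This sketch simply packs the four characters with the first character in the lowest-order byte and prints the result into the GUID template given above; the function names make_fourcc and fourcc_to_guid are hypothetical and are not part of any Microsoft SDK.

```c
#include <assert.h>
#include <stdio.h>

/* Packs four characters into a 32-bit value, first character in the
 * lowest-order byte. */
unsigned long make_fourcc(char a, char b, char c, char d)
{
    return (unsigned long)(unsigned char)a |
           ((unsigned long)(unsigned char)b << 8) |
           ((unsigned long)(unsigned char)c << 16) |
           ((unsigned long)(unsigned char)d << 24);
}

/* Writes the GUID text for a Four Character Code into `out`, which must
 * hold at least 37 bytes (36 characters plus the terminating zero). */
void fourcc_to_guid(unsigned long fourcc, char *out)
{
    assert(out != NULL);
    sprintf(out, "%08lX-0000-0010-8000-00AA00389B71",
            fourcc & 0xFFFFFFFFUL);
}
```

Calling fourcc_to_guid(make_fourcc('C', 'V', 'I', 'D'), buf) produces the Cinepak GUID shown above, with the characters appearing in reverse order in the first eight hexadecimal digits.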
<A HREF="#Top">Return to Top</A>
<A NAME="DirectShow">
<H2>DirectShow</H2>
</A>
DirectShow is Microsoft's new name for ActiveMovie 2.0. Microsoft
has shifted to marketing ActiveMovie as an integral part of
DirectX.
Apparently DirectShow (ActiveMovie 2.0) will be released to the general
public as part of DirectX 5.0.
<A HREF="#Top">Return to Top</A>
<A NAME="DirectDraw">
<H2>DirectDraw</H2>
</A>
DirectDraw, one of the components of DirectX, is a new Applications
Programming Interface (API) that is part of Windows 95 and Windows NT
4.0. DirectDraw allows programs to directly access video memory and
other hardware features in video display cards. DirectDraw also
defines new device drivers for graphic/video display adapters to
supersede the GDI display drivers. DirectDraw needs the new
device drivers.
DirectDraw consists of a new API and new hardware drivers known as
the DirectDraw Hardware Abstraction Layer (HAL).
In the Windows 3.1 Graphic Device Interface (GDI), an application
program never writes directly to the memory in a display card. It
writes to a buffer in main memory within Windows. GDI invokes a GDI
video device driver and copies the image from main memory to the video
memory of the video card. This multiple copying of the image
inevitably slows down the display.
The DirectDraw API provides a mechanism allowing application programs
to write directly into the video card's memory. It also provides a
mechanism to access various special features in video cards such as
color space conversion, hardware scaling, z-buffering, alpha blending,
and so forth.
Video card manufacturers must provide a DirectDraw driver for DirectDraw
to work with their card.
Microsoft's ActiveMovie uses DirectDraw to achieve faster playback of
AVI, QuickTime, and MPEG files.
There is extensive information on the DirectDraw and DirectX API's at the
Microsoft Web site.
The DirectX 3 SDK can be downloaded from the Microsoft Developer
Online Web site (6/28/97):
<A HREF="http://www.microsoft.com/msdn/">http://www.microsoft.com/msdn/</A>
Select Microsoft SDKs from the Technical Information section, or point
your browser at:
<A HREF="http://www.microsoft.com/msdn/sdk/">http://www.microsoft.com/msdn/sdk/</A>
Versions of DirectX
- DirectX 1
- DirectX 2
- DirectX 3
- DirectX 3A (latest as of 2/18/97)
- DirectX 5.0 (in development?)
- DirectX 6.0 (mentioned occasionally by Microsoft)
<A HREF="#Top">Return to Top</A>
<A NAME="Driver">
<H2>What is a driver?</H2>
</A>
Most often, driver refers to a software component that handles
control and communication with hardware in a computer. Most
but not all hardware device drivers run in a privileged mode
such as the Ring Zero mode of the Intel 80x86 processors.
Microsoft Windows uses the term driver to refer to several different
software components.
- Hardware Device Drivers
- Windows 3.x or 95 Virtual Device Drivers (VxD's)
- not all VxD's access hardware
- Microsoft Windows Installable Drivers
such as
- Media Control Interface or MCI Drivers
- Video for Windows Codecs (Compressor/Decompressors)
- Audio Codecs (Compressor/Decompressors)
Hardware Device Drivers include MS-DOS device drivers, DOS Terminate
and Stay Resident Programs that access hardware, Windows 3.x and 95
VxD's (Virtual Device Drivers) that access hardware, Windows DLL's
that access hardware but do not run in Ring Zero, Windows NT
kernel-mode device drivers, and the new Win32 Driver Model (WDM) drivers
for Memphis/Windows 98 and NT 5.0.
Microsoft Windows Installable Drivers are Ring Three (Windows 95) or
user-mode (Windows NT) Dynamic Link Libraries (DLL's) with a single
entry point DriverProc(). MCI drivers, Video for Windows Codecs,
Microsoft Audio Compression Manager Codecs, and a variety of other
software components are Installable Drivers. Some installable
drivers are hardware drivers.
<A HREF="#Top">Return to Top</A>
<A NAME="GDI">
<H2>GDI Device Drivers</H2>
</A>
In Windows 3.1, and to a lesser extent Windows 95, the Graphic Device
Interface or GDI is the system that handles graphic display, including
putting bitmaps on the display monitor. Amongst other things, GDI
defines a set of GDI functions that application programs call such as
BitBlt(...) to display graphics on the screen. GDI also controls
printers and other graphic output devices.
Windows NT also provides a GDI system, but the underlying hardware
device drivers are different. Windows 3.1 GDI drivers won't work
under NT. Application programs written using the GDI API will usually
work under NT.
GDI is device independent. To achieve this, GDI uses GDI device
drivers loaded dynamically as needed.
The most commonly used GDI device driver is the DISPLAY device (for
display monitors). In Windows 3.1, this is specified by lines such
as:
display.drv=SUPERVGA.DRV
in the SYSTEM.INI file. SUPERVGA.DRV is a generic Super VGA graphic
display adapter driver shipped with Windows 3.1. SUPERVGA.DRV is a GDI
Device Driver.
The printer driver is another common GDI device driver.
In Windows 3.1 or Windows 95 without DirectDraw, GDI handles display
of video frames on the display monitor.
GDI defines a set of standard functions exported by GDI Device
Drivers. A GDI Device Driver can also report that it does not support
a particular function.
Standard Functions for GDI Device Driver
Entry Name Description
01 BitBlt Transfer bits from src (source) to dest (destination) rect (rectangle)
02 ColorInfo Converts between logical and physical colors.