- Smaller file sizes
- Better quality
- I think the ipod video uses it
- Quicktime uses it
Tools
I use Debian Sarge stable (this is my main server ... and I'm happy to keep Debian on it) so chances are I have to compile everything from scratch:
- MPlayer 1.0pre8
- faac 1.25
- MP4Box (part of gpac 0.4.2)
- x264 0.53.583 (I think it was the latest daily snapshot)
FAAC was a prick to compile as all the source files seem to have CR/LF at the end of every line (obviously from a Windows box). Make sure you read the Wiki for FAAC; there is a tiny bit of shell code (see here) to simplify stripping out the carriage returns. I ran the configure step for faac as:
'./configure --with-mp4v2'.
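For what it's worth, stripping the carriage returns can be sketched in a few lines of shell. This is not the snippet from the FAAC Wiki, just a rough equivalent. The function name strip_cr and the directory name in the usage comment are made up for illustration, and it assumes a grep that supports -I (GNU and BSD grep do).

```shell
# strip_cr DIR: remove carriage returns from every text file under DIR.
# Note: this deletes ALL CR characters, not only those at line ends.
strip_cr() {
    dir=${1:-.}
    tmp=$(mktemp) || return 1
    find "$dir" -type f | while IFS= read -r f; do
        # grep -I treats binary files as non-matching, so they are skipped
        if grep -Iq . "$f"; then
            # write back with cat so permissions (e.g. on configure) survive
            tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
        fi
    done
    rm -f "$tmp"
}

# Usage (hypothetical directory name):
#   strip_cr ./faac
```

Run it on the unpacked source tree before the configure step.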
I can't remember the specifics of compiling x264 or gpac. You might have to do a bit of trawling to find the answers to get them compiled.
The technique from that Linux Journal article for converting video was the following:
- Dump the audio out as raw PCM audio
- Convert the PCM audio to AAC audio using FAAC
- Use mencoder to dump out the source video in YUV 4:2:0 and, using a fifo, pipe that into x264 to convert the video part of the file to h.264
- Use MP4Box to combine the h.264 video and AAC audio into an MP4 container
A simple example is:
# dump pcm audio
mplayer -ao pcm -vc null -vo null source.mpg
# convert to AAC (creates audiodump.aac)
faac --mpeg-vers 4 audiodump.wav
# Convert the video to h264
mkfifo /tmp/myfifo
mencoder -vf format=i420 -nosound -ovc raw -of rawvideo \
-ofps 25 -o /tmp/myfifo source.mpg >/dev/null 2>&1 &
x264 -o source.h264 --fps 25 --crf 26 --progress \
/tmp/myfifo 720x576
rm /tmp/myfifo
# Encapsulate in MP4 container
MP4Box -add source.h264 -add audiodump.aac -fps 25 source.mp4
In my case the source.mpg was a 720x576 PAL MPEG2 capture from a Hauppauge PVR card. You'd obviously need to change the resolution and fps settings depending on your source material.
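If you don't know the resolution and frame rate of your source, mplayer's -identify option prints them as ID_VIDEO_WIDTH, ID_VIDEO_HEIGHT and ID_VIDEO_FPS lines. A rough sketch of pulling those out follows; parse_identify is a made-up helper name, and the exact formatting of the fps value may differ between MPlayer versions.

```shell
# parse_identify: read `mplayer -identify` output on stdin and
# print "WIDTHxHEIGHT FPS", e.g. "720x576 25.000".
parse_identify() {
    awk -F= '
        $1 == "ID_VIDEO_WIDTH"  { w = $2 }
        $1 == "ID_VIDEO_HEIGHT" { h = $2 }
        $1 == "ID_VIDEO_FPS"    { fps = $2 }
        END { if (w && h) print w "x" h, fps; else exit 1 }
    '
}

# Usage:
#   mplayer -identify -frames 0 -vo null -ao null source.mpg 2>/dev/null \
#       | parse_identify
```

The first field is in the form x264 expects for its resolution argument, and the second can be fed to --fps and -ofps.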
That crf setting appears to be very clever. It stands for Constant Rate Factor and it seems good at maintaining a set level of quality in a video regardless of whether there are low motion or high motion scenes throughout. There is a bitrate setting that you can use with x264, but I find that this is ignored when you start using the crf setting. The Linux Journal article suggests using a crf setting of between 18 and 26, with lower values producing better quality but consequently larger file sizes.
With any kind of video encoding, there are often a bunch of bizarrely named parameters that you can tweak on your way to finding a good compromise between file size and quality. x264 has heaps of them. The example above used a single pass encode, but you can also do two pass or three pass (and possibly more passes) encodes to improve quality. In writing this article, I have only played around with a few of these settings. Some of my observations so far:
- I've somehow been able to convert a 23 minute show (typical half hour show minus commercials) into an 80MB mp4 file. While the quality wasn't marvellous it was a lot better than a divx of that file size.
- You need a lot of CPU power for h264 encoding. I've recently upgraded to a Core 2 Duo CPU ... and you need it ... unless you like waiting. x264 has a '--threads' option which is great for taking advantage of multi-core CPUs. Even with this CPU grunt, x264 tends to plod along at between 17fps and 32fps depending on my settings and the type of source material.
- Quicktime seems to be very fickle about what kinds of h264 encoded stuff you throw at it. I rarely use Windows these days, and mplayer and XBMC seem to be happy playing back what I've encoded. One thing I did notice early on was that a 720x576 H264 video won't play back very well on XBMC, giving me the impression that the XBox does not have enough CPU power. Since then I have downscaled the video while I convert it and XBMC is fine (so long as there isn't lots of onscreen motion).
There seems to be a reasonably active community of Windows users using these tools ... albeit with some GUI frontend to disguise it. Check out the Doom 9 forums for MPEG-4 Encoder GUIs, especially the ones that relate to MeGUI. MeGUI seems to have a bunch of 'profiles' which are just XML files that describe the settings to use with x264. I haven't quite worked out how to convert one of these XML profiles to x264 parameters ... as the parameter naming is slightly different, but they are definitely a good start once you get the hang of it.
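On the '--threads' point in the list above: rather than hard-coding the number, you could ask the system how many cores it has. A small sketch, assuming your getconf understands _NPROCESSORS_ONLN (it does on glibc systems); cpu_count is just a name picked for this example.

```shell
# cpu_count: print the number of online CPU cores, falling back to 1
# if the system cannot tell us.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: play safe
    esac
    echo "$n"
}

# Usage:
#   x264 --threads "$(cpu_count)" ...
```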
Encoding examples
NB: I have no idea how well these will fare if you try to play them back in Windows ... especially using Quicktime. You could try using the Windows ports of mplayer or even VLC as a better option than Quicktime.
I also soon discovered that the MP4Box command appends to an mp4 file if it already exists, so make sure the mp4 file is deleted before MP4Box runs. Another useful hint is to use the '-tmp' option with MP4Box, as the default behaviour of MP4Box is to create rather large temp files in /tmp. If you're running short on disk space, MP4Box will screw up ... and most likely not tell you.
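Both MP4Box hints can be wrapped up in a small helper so they don't get forgotten. This is only a sketch: mux_mp4 is a made-up name, the 25 fps is the PAL rate used throughout this article, and the "output exists and is not empty" check is a crude guess at catching a silent failure, not something MP4Box documents.

```shell
# mux_mp4 VIDEO AUDIO OUT [TMPDIR]
# Combine an h264 stream and an AAC file into a fresh MP4 file.
mux_mp4() {
    video=$1 audio=$2 out=$3 tmpdir=${4:-.}
    # MP4Box would otherwise append to an existing file
    rm -f "$out"
    # keep the large temp files out of /tmp
    MP4Box -tmp "$tmpdir" -add "$video" -add "$audio" -fps 25 "$out" || return 1
    # MP4Box may not complain when it runs out of disk, so check the result
    [ -s "$out" ] || { echo "mux_mp4: $out is missing or empty" >&2; return 1; }
}

# Usage:
#   mux_mp4 source.h264 source.aac source.mp4 /somedir
```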
In all the examples, I have previously encoded the sound like:
# You always get a 'your computer is too slow' message. Don't
# worry about it.
mplayer -ao pcm:fast:file=source.wav -vc null -vo null source.mpg
# Create source.aac
faac --mpeg-vers 4 source.wav
Resize 720x576 source to 720x304. Convert in one pass
FIFO=/tmp/myfifo
mkfifo $FIFO
mencoder -vf scale=720:304,format=i420 -nosound -ovc raw -of rawvideo -ofps 25 -o $FIFO source.mpg >/dev/null 2>&1 &
## Found a note somewhere that file output here needs to end in .264 or .h264
x264 -o source.h264 --crf 23 --sar 281:500 --threads 2 --fps 25 \
--bframes 1 --b-rdo --bime --weightb --direct auto --subme 6 \
--trellis 1 --analyse p8x8,b8x8,i4x4,p4x4 --me umh --progress \
--no-psnr $FIFO 720x304
MP4Box -tmp /somedir -add source.h264:par=281:500 -add source.aac -fps 25 source.mp4
The sar/par params above require some explanation. The video source is 720x576 PAL in 4:3 aspect ratio. Since the video is downscaled to 720x304, some aspect ratio info needs to go back into the video so that the player knows how to play it back (I think Quicktime ignores these settings). The calculation looks like:
DAR (display aspect ratio) = 4/3 = 1.3333
SAR/PAR = DAR * height / width = 1.3333 * 304 / 720 = 0.562 (roughly) = 562/1000 = 281/500
So the SAR/PAR = 281:500, i.e. the fraction reduced to its lowest terms.
For a 16:9 source video you could try a SAR/PAR of 3:4
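The same calculation can be done with integer arithmetic in the shell, which avoids the rounding step. This is a sketch only, and the function names gcd and sar are made up. Note that it gives the exact fraction: for 720x304 at 4:3 that is 76:135 (about 0.563), of which the 281:500 above is a three-decimal approximation.

```shell
# gcd A B: greatest common divisor (Euclid's algorithm)
gcd() {
    a=$1 b=$2
    while [ "$b" -ne 0 ]; do
        t=$((a % b)); a=$b; b=$t
    done
    echo "$a"
}

# sar WIDTH HEIGHT DAR_NUM DAR_DEN
# Prints DAR * HEIGHT / WIDTH as a fraction in lowest terms.
sar() {
    num=$(( $3 * $2 ))
    den=$(( $4 * $1 ))
    g=$(gcd "$num" "$den")
    echo "$((num / g)):$((den / g))"
}

sar 720 304 4 3     # prints 76:135
sar 720 304 16 9    # prints 304:405, close to the 3:4 suggested above
```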
Resize 720x576 source to 512x416. Convert in one pass. Lower quality
FIFO=/tmp/myfifo
mkfifo $FIFO
mencoder -vf scale=512:416,format=i420 -nosound -ovc raw -of rawvideo -ofps 25 -o $FIFO source.mpg >/dev/null 2>&1 &
## Found a note somewhere that file output here needs to end in .264 or .h264
x264 -o source.h264 --crf 26 --sar 541:500 --threads 2 --fps 25 \
--bframes 1 --b-rdo --bime --weightb --direct auto --subme 6 \
--trellis 1 --analyse p8x8,b8x8,i4x4,p4x4 --me umh \
--progress --no-psnr $FIFO 512x416
MP4Box -tmp /somedir -add source.h264:par=541:500 -add source.aac -fps 25 source.mp4
NB: The size used here, 512x416, was specifically chosen as one that approximates 4:3 aspect ratio so that it would hopefully work with Quicktime ... but I don't think it plays smoothly on Quicktime this way.
Resize 720x576 source to 512x416. Convert in two passes. Low quality, but small file
FIFO=/tmp/myfifo
mkfifo $FIFO
mencoder -vf scale=512:416,format=i420 -nosound -ovc raw \
-of rawvideo -ofps 25 -o $FIFO source.mpg >/dev/null 2>&1 &
# 1st pass (video output is discarded; only the stats file is kept)
x264 --bitrate 300 --crf 26 -o /dev/null --pass 1 --stats test.stats \
--sar 541:500 --threads 2 --fps 25 --bframes 2 --subme 6 --progress \
--analyse p8x8,b8x8,i4x4,p4x4 --no-psnr $FIFO 512x416
# 2nd pass
mencoder -vf scale=512:416,format=i420 -nosound -ovc raw \
-of rawvideo -ofps 25 -o $FIFO source.mpg >/dev/null 2>&1 &
## Found a note somewhere that file output here needs to end in .264 or .h264
x264 --bitrate 300 -o source.h264 --pass 2 --stats test.stats \
--sar 541:500 --threads 2 --fps 25 --bframes 2 --subme 6 \
--progress --analyse p8x8,b8x8,i4x4,p4x4 --no-psnr $FIFO 512x416
MP4Box -add source.h264:par=541:500 -add source.aac \
-fps 25 source.mp4