I am using FFmpeg to mux video (from .png stills) and audio (from .aiff). The mux completes successfully, and my settings are as follows:
ffmpeg -r 24 -i /input/%04d.png -i /input/audio.aiff -c:v libx264 -preset:v medium -crf:v 15 -pix_fmt yuv420p -ac 2 -c:a libfaac -b:a 500k -cutoff 20000 /output.mp4
The problem occurs in QuickTime, and in essentially every other player, when I try to skip or scrub through the video to a different position. The audio stays in perfect sync after I click or drag the time slider, but the video freezes on whatever frame it was showing for a few seconds before "catching up" and resuming normal playback, with video and audio in perfect sync again.
Please note that the mux itself is successful: the only problem is in video players, where the video lags after jumping to a different position.
It is also worth noting that I tried encoding the video on its own using FFmpeg with libx264, and the same problem occurs, so the audio and the mux can be ruled out and the cause is presumably in the libx264 encode. My guess is that there is an option for the libx264 encoder that will resolve this.
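Since my guess is that an encoder option is involved, one candidate I could test is the keyframe interval. The libx264 settings line in the log below reports keyint=250, and FFmpeg's generic -g option sets that GOP size. The sketch below is the same command as above with -g added; the value 24 (one keyframe per second at 24 fps) and the output file name are assumptions chosen for the test, not settings known to fix the problem.

```shell
#!/bin/sh
# Hypothetical test encode: same settings as the original command, plus a shorter GOP.
# The paths are the placeholder paths from the command above.
FPS=24
GOP=$FPS   # assumption: at most one second of video between keyframes

# Build the command as a string first so it can be inspected before running.
CMD="ffmpeg -r $FPS -i /input/%04d.png -i /input/audio.aiff -c:v libx264 -preset:v medium -crf:v 15 -g $GOP -pix_fmt yuv420p -ac 2 -c:a libfaac -b:a 500k -cutoff 20000 /output_g$GOP.mp4"
echo "$CMD"

# Uncomment to actually run the encode:
# eval "$CMD"
```

If scrubbing in the resulting file recovers within about a second, that would point at keyframe spacing; a shorter GOP will also make the file somewhat larger at the same CRF.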
Last login: Mon Sep 16 18:50:17 on ttys000
MGTFs-Acousmatibook:~ MGTF$ ffmpeg -r 24 -i /Volumes/MGTF_HDD_2/Full_Quality_Stills/%04d.png -i /Volumes/MGTF_HDD_2/Demo_Audio/MIX_2_48KHz_24-Bit.aif -c:v libx264 -preset:v medium -crf:v 15 -pix_fmt yuv420p -ac 2 -c:a libfaac -b:a 500k -cutoff 20000 /Volumes/MGTF_HDD_2/test_243234.mp4
ffmpeg version 1.2.1 Copyright (c) 2000-2013 the FFmpeg developers
built on Sep 15 2013 11:52:52 with gcc 4.2.1 (GCC) (Apple Inc. build 5666) (dot 3)
configuration: --prefix=/usr/local/Cellar/ffmpeg/1.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=cc --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid --enable-libvorbis --enable-libvpx --enable-libfdk-aac
libavutil 52. 18.100 / 52. 18.100
libavcodec 54. 92.100 / 54. 92.100
libavformat 54. 63.104 / 54. 63.104
libavdevice 54. 3.103 / 54. 3.103
libavfilter 3. 42.103 / 3. 42.103
libswscale 2. 2.100 / 2. 2.100
libswresample 0. 17.102 / 0. 17.102
libpostproc 52. 2.100 / 52. 2.100
[image2 @ 0x10180ec00] Stream #0: not enough frames to estimate rate; consider
increasing probesize
Input #0, image2, from '/Volumes/MGTF_HDD_2/Full_Quality_Stills/%04d.png':
Duration: 00:04:06.00, start: 0.000000, bitrate: N/A
Stream #0:0: Video: png, rgb48be, 1920x1080 [SAR 2835:2835 DAR 16:9], 25 tbr, 25 tbn, 25
tbc
[aiff @ 0x101809800] max_analyze_duration 5000000 reached at 5001333 microseconds
Guessed Channel Layout for Input Stream #1.0 : stereo
Input #1, aiff, from '/Volumes/MGTF_HDD_2/Demo_Audio/MIX_2_48KHz_24-Bit.aif':
Duration: 00:04:16.25, start: 0.000000, bitrate: 2304 kb/s
Stream #1:0: Audio: pcm_s24be, 48000 Hz, stereo, s32, 2304 kb/s
[libx264 @ 0x101809e00] using SAR=1/1
[libx264 @ 0x101809e00] using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1
Cache64
[libx264 @ 0x101809e00] profile High, level 4.0
[libx264 @ 0x101809e00] 264 - core 125 - H.264/MPEG-4 AVC codec - Copyleft 2003-2012 -
http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0
analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16
chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2
threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0
bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1
weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0
rc_lookahead=40 rc=crf mbtree=1 crf=15.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4
ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/Volumes/MGTF_HDD_2/test_243234.mp4':
Metadata:
encoder : Lavf54.63.104
Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 1920x1080 [SAR 1:1 DAR
16:9], q=-1--1, 12288 tbn, 24 tbc
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, s16, 500 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (png -> libx264)
Stream #1:0 -> #0:1 (pcm_s24be -> libfaac)
Press [q] to stop, [?] for help
frame= 6150 fps=3.1 q=-1.0 Lsize= 92964kB time=00:04:16.25 bitrate=2971.9kbits/s
video:88174kB audio:4619kB subtitle:0 global headers:0kB muxing overhead 0.184099%
[libx264 @ 0x101809e00] frame I:63 Avg QP: 6.33 size: 96239
[libx264 @ 0x101809e00] frame P:2238 Avg QP:13.66 size: 22767
[libx264 @ 0x101809e00] frame B:3849 Avg QP:16.70 size: 8645
[libx264 @ 0x101809e00] consecutive B-frames: 14.2% 6.0% 3.3% 76.6%
[libx264 @ 0x101809e00] mb I I16..4: 73.6% 18.0% 8.5%
[libx264 @ 0x101809e00] mb P I16..4: 2.8% 4.2% 1.4% P16..4: 10.1% 1.9% 1.6% 0.0%
0.0% skip:78.0%
[libx264 @ 0x101809e00] mb B I16..4: 0.2% 0.5% 0.2% B16..8: 4.8% 0.5% 0.2%
direct: 1.2% skip:92.4% L0:45.4% L1:45.9% BI: 8.7%
[libx264 @ 0x101809e00] 8x8 transform intra:42.8% inter:56.9%
[libx264 @ 0x101809e00] coded y,uvDC,uvAC intra: 42.9% 25.3% 23.0% inter: 3.3% 2.6% 0.8%
[libx264 @ 0x101809e00] i16 v,h,dc,p: 93% 3% 4% 1%
[libx264 @ 0x101809e00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 32% 11% 48% 1% 2% 1% 2% 1%
1%
[libx264 @ 0x101809e00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 27% 20% 5% 6% 5% 7% 4%
7%
[libx264 @ 0x101809e00] i8c dc,h,v,p: 83% 7% 6% 4%
[libx264 @ 0x101809e00] Weighted P-Frames: Y:10.7% UV:1.7%
[libx264 @ 0x101809e00] ref P L0: 39.8% 3.9% 34.5% 19.7% 2.3%
[libx264 @ 0x101809e00] ref B L0: 67.6% 24.9% 7.5%
[libx264 @ 0x101809e00] ref B L1: 88.2% 11.8%
[libx264 @ 0x101809e00] kb/s:2818.79
MGTFs-Acousmatibook:~ MGTF$
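To judge whether keyframe spacing could account for a freeze of a few seconds, here is a rough calculation using only figures from the log above. It assumes that every I-frame is a point a player can seek to and that playback runs at the 24 fps shown for the output stream; both are assumptions, not something the log confirms.

```shell
#!/bin/sh
# Figures copied from the FFmpeg/libx264 log above.
TOTAL_FRAMES=6150   # "frame= 6150"
I_FRAMES=63         # "frame I:63"
KEYINT=250          # "keyint=250", the maximum number of frames between keyframes
FPS=24              # output frame rate ("24 tbc")

# Average and worst-case time between keyframes, in seconds.
AVG_GAP=$(awk -v t="$TOTAL_FRAMES" -v i="$I_FRAMES" -v f="$FPS" 'BEGIN { printf "%.1f", t / i / f }')
MAX_GAP=$(awk -v k="$KEYINT" -v f="$FPS" 'BEGIN { printf "%.1f", k / f }')

echo "average keyframe spacing: ${AVG_GAP} s"
echo "worst-case keyframe spacing: ${MAX_GAP} s"
```

Under those assumptions the spacing works out to roughly 4 seconds on average and a little over 10 seconds at worst, which is the same order of magnitude as the freeze I see after scrubbing.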