I did some research about reducing the video latency.
My answer below shows that the relevant FFmpeg flags are -probesize 32 and -flags low_delay.
The above flags apply to the video decoder side (the receiving side).
The video encoding parameters on the "transmitter/encoder side" matter more for the end-to-end latency.
Adding the argument -tune zerolatency brings the encoder latency to a minimum, but the required bandwidth is much higher (and is probably irrelevant for streaming over the internet).
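For illustration, a minimal encoder-side sketch with -tune zerolatency could look like the following (building the command with shlex, as in the test sample below; the localhost RTSP address and the low test resolution are assumptions for testing, not a verified production setup):

```python
import shlex

rtsp_stream = 'rtsp://127.0.0.1:21415/live.stream'  # Assumed localhost address (for testing).

# Encoder-side command: synthetic test source encoded with libx264, where
# -tune zerolatency minimizes the encoder latency (no look-ahead, no B-frames).
encoder_cmd = shlex.split(
    'ffmpeg -re -f lavfi -i testsrc=size=256x144:rate=30 '
    '-vcodec libx264 -pix_fmt yuv420p -tune zerolatency -g 30 '
    f'-rtsp_transport tcp -f rtsp -muxdelay 0.1 -an {rtsp_stream}'
)

print(encoder_cmd)
```

The command list can then be passed to subprocess.Popen the same way as in the code samples below.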
I am going to limit my answer to the decoding latency, since it seems more relevant to the subject of your question.
The subject of "knowing how others manage to grab video frames with low latency" belongs in a separate question (and I don't know the answer).
For comparing the latency differences between FFplay and FFmpeg (the decoder), I created a "self-contained" test sample.
Main "principles":
- Execute an FFmpeg sub-process that transmits two RTSP output streams in parallel.
The streamed video is a synthetic pattern with a frame counter drawn as text on the video.
The two output streams apply identical encoding parameters (only the port differs).
The RTSP IP address is 127.0.0.1 (localhost).
(Note: we could probably use the tee muxer instead of encoding twice, but I have never tried it).
- Execute an FFplay sub-process that decodes and displays one of the video streams.
- Execute an FFmpeg sub-process that decodes the other video stream.
OpenCV imshow is used for displaying the video.
- The displayed video with the larger counter is the one with lower latency.
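The tee-muxer variant mentioned above could be sketched roughly as follows. This is an untested sketch: the per-slave [f=rtsp:rtsp_transport=tcp] option syntax follows the tee muxer documentation, so the stream is encoded once and the encoded output is duplicated to both RTSP addresses:

```python
import shlex

width, height, fps = 256, 144, 30  # Same low test resolution as below.
rtsp_stream0 = 'rtsp://127.0.0.1:21415/live.stream'
rtsp_stream1 = 'rtsp://127.0.0.1:31415/live.stream'

# Encode the synthetic video (with frame counter) once, then use the tee muxer
# to send the same encoded stream to both RTSP outputs.
tee_cmd = shlex.split(
    f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
    '-vf "drawtext=text=%{frame_num}: start_number=1: x=(w-tw)/2: y=h-(2*lh): '
    'fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5" '
    '-vcodec libx264 -pix_fmt yuv420p -g 30 -an -f tee -map 0:v '
    f'"[f=rtsp:rtsp_transport=tcp]{rtsp_stream0}|[f=rtsp:rtsp_transport=tcp]{rtsp_stream1}"'
)

print(tee_cmd)
```

The command list would replace the "encode twice" streaming command in the samples below; I have not verified that the latency behavior is identical.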
Code sample (updated):
import cv2
import numpy as np
import subprocess as sp
import shlex
rtsp_stream0 = 'rtsp://127.0.0.1:21415/live.stream' # Use localhost for testing
rtsp_stream1 = 'rtsp://127.0.0.1:31415/live.stream'
width = 256 # Use low resolution (for testing).
height = 144
fps = 30
# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')
# FFplay command before updating the code (latency is still too high):
# ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')
# Updated FFplay command - adding "-vf setpts=0" (fixing the latency issue):
# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -vf setpts=0 -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')
# Execute FFplay, to be used as a reference.
ffplay_process = sp.Popen(ffplay_cmd)
# Open a sub-process that reads the RTSP stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL
# The following FFmpeg sub-process streams the RTSP video.
# The video is a synthetic video with a frame counter (counting every frame) at 30fps.
# The encoder arguments are almost the default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
'-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
'split[v0][v1] ' # Split the input into [v0] and [v1]
'-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
f'-map "[v0]" -an {rtsp_stream0} '
'-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
f'-map "[v1]" -an {rtsp_stream1}'))
while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the bytes read into a numpy array, and reshape it to video frame dimensions.
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show the frame for testing.
    cv2.imshow('frame', frame)

    key = cv2.waitKey(1)
    if key == 27:  # Press Esc to exit.
        break
process.stdout.close()
process.wait()
ffplay_process.kill()
rtsp_streaming_process.kill()
cv2.destroyAllWindows()
Sample output before adding -vf setpts=0:
Sample output (left side is OpenCV and right side is FFplay):
It looks like the FFmpeg-OpenCV latency was lower by 6 frames before adding -vf setpts=0 to the FFplay command.
Note: it took me some time to find the solution, and I decided to keep the results of the original post to show the importance of adding the setpts filter.
Update:
Adding -vf setpts=0 solved the latency issue.
The most recent answer of the following post suggests adding the setpts video filter, which resets all video timestamps to zero.
It may not be a good idea when an audio stream is present, but when the lowest possible video latency is required, this is the best solution I could find.
After adding -vf setpts=0, the latency of FFplay and OpenCV is about the same:
Repeating the test with the mpv media player:
(Note: this seemed more relevant before I found the FFplay solution).
When applying all of the mpv "latency hacks" from this page, the latency of mpv and OpenCV is about the same:
There is probably a solution with FFplay as well, but I could not find it...
Code sample (using mpv instead of FFplay):
import cv2
import numpy as np
import subprocess as sp
import shlex
rtsp_stream0 = 'rtsp://127.0.0.1:21415/live.stream' # Use localhost for testing
rtsp_stream1 = 'rtsp://127.0.0.1:31415/live.stream'
width = 256 # Use low resolution (for testing).
height = 144
fps = 30
# https://stackoverflow.com/questions/60462840/ffmpeg-delay-in-decoding-h264
ffmpeg_cmd = shlex.split(f'ffmpeg -nostdin -probesize 32 -flags low_delay -fflags nobuffer -rtsp_flags listen -rtsp_transport tcp -stimeout 1000000 -an -i {rtsp_stream0} -pix_fmt bgr24 -an -vcodec rawvideo -f rawvideo pipe:')
# https://stackoverflow.com/questions/16658873/how-to-minimize-the-delay-in-a-live-streaming-with-ffmpeg
#ffplay_cmd = shlex.split(f'ffplay -probesize 32 -analyzeduration 0 -sync ext -fflags nobuffer -flags low_delay -avioflags direct -rtsp_flags listen -strict experimental -framedrop -rtsp_transport tcp -listen_timeout 1000000 {rtsp_stream1}')
# https://github.com/mpv-player/mpv/issues/4213
mpv_cmd = shlex.split(f'mpv --demuxer-lavf-o=rtsp_flags=listen --rtsp-transport=tcp --profile=low-latency --no-cache --untimed --no-demuxer-thread --vd-lavc-threads=1 {rtsp_stream1}')
# Execute FFplay, to be used as a reference.
#ffplay_process = sp.Popen(ffplay_cmd)
# Execute mpv media player (as reference)
mpv_process = sp.Popen(mpv_cmd)
# Open a sub-process that reads the RTSP stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE) #,stderr=sp.DEVNULL
# The following FFmpeg sub-process streams the RTSP video.
# The video is a synthetic video with a frame counter (counting every frame) at 30fps.
# The encoder arguments are almost the default arguments - not tuned for low latency.
# drawtext filter with the n or frame_num function https://stackoverflow.com/questions/15364861/frame-number-overlay-with-ffmpeg
rtsp_streaming_process = sp.Popen(shlex.split(f'ffmpeg -re -f lavfi -i testsrc=size={width}x{height}:rate={fps} '
'-filter_complex "drawtext=fontfile=Arial.ttf: text=''%{frame_num}'': start_number=1: x=(w-tw)/2: y=h-(2*lh): fontcolor=black: fontsize=72: box=1: boxcolor=white: boxborderw=5",'
'split[v0][v1] ' # Split the input into [v0] and [v1]
'-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
f'-map "[v0]" -an {rtsp_stream0} '
'-vcodec libx264 -pix_fmt yuv420p -g 30 -rtsp_transport tcp -f rtsp -muxdelay 0.1 -bsf:v dump_extra '
f'-map "[v1]" -an {rtsp_stream1}'))
while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!!!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the bytes read into a numpy array, and reshape it to video frame dimensions.
    frame = np.frombuffer(raw_frame, np.uint8)
    frame = frame.reshape((height, width, 3))

    # Show the frame for testing.
    cv2.imshow('frame', frame)

    key = cv2.waitKey(1)
    if key == 27:  # Press Esc to exit.
        break
process.stdout.close()
process.wait()
#ffplay_process.kill()
mpv_process.kill()
rtsp_streaming_process.kill()
cv2.destroyAllWindows()