Parsing Android Motion Photos

Processing and generating Android motion photos

Posted by wszqkzqk on August 1, 2024

Introduction

Android's motion photo is an increasingly common media format that combines a video clip (with its audio) and a still image into a single animated photo. It is supported on many devices, such as Google's Pixel series, Samsung's Galaxy series, and most models from Xiaomi and other vendors. This article describes the format of Android motion photos and how to process and generate them.

The motion photo format

An Android motion photo is essentially a still image with a video file appended directly to its end; the video file contains both the audio and video streams. The position of the video is recorded in XMP metadata, so a parser can locate it quickly. The benefit of this layout is that motion can be added without changing the original image: image viewers that do not understand the format simply display the still image, while supporting viewers can play the motion effect.
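
Schematically, where offset denotes the MicroVideoOffset value stored in the XMP metadata described below:

|<----------------------- FILE_SIZE ----------------------->|
| still image (with XMP metadata)  | video file (MP4, A+V)  |
0                       FILE_SIZE - offset                 EOF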

Structure and XMP metadata

A motion photo embeds the following XMP metadata:

  • Xmp.GCamera.MicroVideoVersion: the version number, e.g. 1
  • Xmp.GCamera.MicroVideo: whether this is a motion photo, e.g. 1
  • Xmp.GCamera.MicroVideoOffset: the offset of the video file
    • counted from the end of the file
    • a decimal string, e.g. 8532144
    • in effect, this is exactly the size of the video file

The bytes from the start of the file up to $FILE_SIZE - Xmp.GCamera.MicroVideoOffset form a complete still image, and the bytes from there to the end of the file form a complete video file, so the two can simply be split apart. Note that after splitting, the XMP metadata should be removed from the resulting still image, since it is no longer a motion photo.
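
As a minimal sketch of this lookup and split-point computation (it mirrors the GExiv2 calls used by the tool described later; the compile command is an assumption about your environment, e.g. valac --pkg gexiv2 --pkg gio-2.0 offset.vala):

int main (string[] args) {
    if (args.length < 2) {
        printerr ("Usage: %s <motion-photo>\n", args[0]);
        return 1;
    }
    try {
        var metadata = new GExiv2.Metadata ();
        metadata.open_path (args[1]);
        var tag_value = metadata.try_get_tag_string ("Xmp.GCamera.MicroVideoOffset");
        if (tag_value == null) {
            printerr ("No MicroVideoOffset tag; not a motion photo?\n");
            return 1;
        }
        // The offset is counted from the end of the file, i.e. it equals the video size
        int64 reverse_offset = int64.parse (tag_value);
        var file_size = File.new_for_commandline_arg (args[1])
            .query_info ("standard::size", FileQueryInfoFlags.NONE)
            .get_size ();
        // Bytes [0, file_size - reverse_offset) are the still image, the rest is the video
        print ("video starts at byte %s\n", (file_size - reverse_offset).to_string ());
        return 0;
    } catch (Error e) {
        printerr ("%s\n", e.message);
        return 1;
    }
}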

Parsing without XMP metadata

Since the videos embedded in motion photos shot by today's phones are all MP4, the position of the embedded video can also be found by parsing the MP4 header, which is structured as follows:

  • the first 4 bytes are a 32-bit big-endian integer giving the size of the box
  • the next 4 bytes are the string ftyp, identifying the file type

So it suffices to find the position of the ftyp byte sequence in the file and subtract 4 to locate the start of the video. This method does not depend on XMP metadata, but it carries some risk: in theory, the ftyp byte sequence could also appear elsewhere in the file.

Simple processing tools

Motion photos are easy to process with GNU Coreutils (plus grep). For example, the video file can be extracted with tail:

FILE=/path/to/your/img; tail -c +$(math $(grep -F --byte-offset --only-matching --text ftyp $FILE | grep -o '^[0-9]*') - 3) $FILE > /path/to/your/video
  • FILE: the path of the motion photo
  • math: a simple calculator command (e.g. the fish builtin; awk can be used instead)
    • 3 is subtracted rather than 4 because the + offset of tail counts from 1
  • grep -F --byte-offset --only-matching --text ftyp $FILE: finds the position of the ftyp byte sequence
  • grep -o '^[0-9]*': extracts the byte offset from grep's offset:match output
  • tail -c +$OFFSET $FILE: outputs the file starting from byte OFFSET
  • > /path/to/your/video: redirects the output into the video file

The same can be achieved with dd:

FILE=/path/to/your/img; dd bs=$(math $(grep -F --byte-offset --only-matching --text ftyp $FILE | grep -o '^[0-9]*') - 4) skip=1 if=$FILE of=/path/to/your/video

dd counts bytes from 0, so here 4 is subtracted instead of 3.

Extracting the still image works similarly:

FILE=/path/to/your/img; head -c $(math $(grep -F --byte-offset --only-matching --text ftyp $FILE | grep -o '^[0-9]*') - 4) $FILE > /path/to/your/img.jpg

A tool for extraction, editing, and composition

Using the Vala language, built on the GExiv2 and FFmpeg libraries, I wrote a simple motion photo tool, Live Photo Converter, which can extract, edit, and compose motion photos. It supports both Linux and Windows and is considerably more capable and convenient than the one-liners above; see the project homepage for more information.

The tool's frame-by-frame extraction was initially implemented with FFmpeg: the program simply spawns an FFmpeg process to do the extraction, and a GStreamer-based path was planned to enable more efficient processing such as parallel image encoding. I have since implemented frame extraction with GStreamer as well, and FFmpeg remains available as an alternative.

Architecture: LivePhoto, for extraction and unpacking

The tool currently represents a motion photo with the LivePhoto class, which exposes no public fields and contains only the following private ones:

public class LivePhotoConv.LivePhoto {
    string basename;
    string basename_no_ext;
    string extension_name;
    string filename;
    GExiv2.Metadata metadata;
    string dest_dir;
    int64 video_offset;
    bool make_backup;
    bool export_original_metadata;
    FileCreateFlags file_create_flags;

    ......
}

Constructing an instance requires the path of the motion photo; optionally, an output directory, whether to export the original XMP metadata, the file creation flags, and whether to make a backup can also be passed:

public LivePhoto (string filename, string? dest_dir = null, bool export_metadata = true,
                    FileCreateFlags file_create_flags = FileCreateFlags.REPLACE_DESTINATION, bool make_backup = false) throws Error {
    this.metadata = new GExiv2.Metadata ();
    this.metadata.open_path (filename);
    this.make_backup = make_backup;
    this.file_create_flags = file_create_flags;

    this.filename = filename;
    this.basename = Path.get_basename (filename);
    var last_dot = this.basename.last_index_of_char ('.');
    if (last_dot == -1) {
        this.basename_no_ext = this.basename;
        this.extension_name = "jpg"; // Default extension name
    } else {
        this.basename_no_ext = this.basename[:last_dot];
        if (last_dot + 1 < this.basename.length) {
            this.extension_name = this.basename[last_dot + 1:];
        } else {
            this.extension_name = "jpg"; // Default extension name
        }
    }
    if (dest_dir != null) {
        this.dest_dir = dest_dir;
    } else {
        this.dest_dir = Path.get_dirname (filename);
    }

    this.video_offset = this.get_video_offset ();
    if (this.video_offset < 0) {
        throw new NotLivePhotosError.OFFSET_NOT_FOUND_ERROR ("The offset of the video data in the live photo is not found.");
    }
    // Remove the XMP metadata of the main image since it is not a live photo anymore
    // MUST after `get_video_offset` because `get_video_offset` may use the XMP metadata
    this.metadata.clear_xmp ();
    this.export_original_metadata = export_metadata;
}

When a LivePhoto instance is created, the constructor calls the get_video_offset method, which is implemented as follows:

inline int64 get_video_offset () throws Error {
    try {
        // Get the offset of the video data from the XMP metadata
        var tag_value = this.metadata.try_get_tag_string ("Xmp.GCamera.MicroVideoOffset");
        if (tag_value != null) {
            int64 reverse_offset = int64.parse (tag_value);
            if (reverse_offset > 0) {
                var file_size = File.new_for_commandline_arg  (this.filename)
                    .query_info ("standard::size", FileQueryInfoFlags.NONE)
                    .get_size ();
                return file_size - reverse_offset;
            }
        }
    } catch {
        // If the XMP metadata does not contain the video offset, search for the video tag in the live photo
        Reporter.warning ("XMPOffsetNotFoundWarning",
            "The XMP metadata does not contain the video offset. Searching for the video tag in the live photo.");
    }

    const uint8[] VIDEO_TAG = {'f', 't', 'y', 'p'}; // The tag `....ftyp` of MP4 header.
    const int TAG_LENGTH = VIDEO_TAG.length; // The length of the tag.
    int64 offset = -1; // The offset of the video data in the live photo.

    var file = File.new_for_commandline_arg  (this.filename);
    var input_stream = file.read ();

    ssize_t bytes_read; // The number of bytes read from the input stream.
    int64 position = 0; // The current position in the input stream.
    uint8[] buffer = new uint8[Utils.BUFFER_SIZE];
    uint8[] prev_buffer_tail = new uint8[TAG_LENGTH - 1]; // The tail of the previous buffer to avoid boundary crossing.
    uint8[] search_buffer = new uint8[Utils.BUFFER_SIZE + TAG_LENGTH - 1];

    while ((bytes_read = input_stream.read (buffer)) > 0) {
        // Copy the tail of the previous buffer to check for boundary crossing
        Memory.copy (search_buffer, prev_buffer_tail, TAG_LENGTH - 1);
        // Copy the current buffer into the search buffer, right after the tail
        Memory.copy ((void*) ((int64) search_buffer + TAG_LENGTH - 1), buffer, bytes_read); // Vala does not support pointer arithmetic, so we have to cast the pointer to int64 first.

        // Search the spliced buffer so that a tag crossing a chunk boundary is also found.
        // `search_buffer[i]` corresponds to file position `position - (TAG_LENGTH - 1) + i`.
        ssize_t search_len = bytes_read + TAG_LENGTH - 1;
        ssize_t matched = 0;
        for (ssize_t i = 0; i < search_len; i += 1) {
            if (search_buffer[i] == VIDEO_TAG[matched]) {
                matched += 1;
                if (matched == TAG_LENGTH) {
                    // `i` is the index of the last byte of the tag
                    offset = position - (TAG_LENGTH - 1) + i - (TAG_LENGTH - 1);
                    break;
                }
            } else {
                // Restart the match, re-checking the current byte as a possible first byte of the tag
                matched = (search_buffer[i] == VIDEO_TAG[0]) ? 1 : 0;
            }
        }

        if (offset != -1) {
            break;
        }
        position += bytes_read;
        // Store the tail of the spliced buffer for the next round
        Memory.copy (prev_buffer_tail, (void*) ((int64) search_buffer + search_len - (TAG_LENGTH - 1)), TAG_LENGTH - 1);
    }

    // The feature of MP4: there are 4 bytes of box size right before the tag
    return (offset >= 4) ? (offset - 4) : -1;
}
  • First try to obtain the video offset from the XMP metadata
  • If that fails, search for the ftyp tag
    • read the file in chunks of BUFFER_SIZE bytes
    • splice the tail of the previous chunk onto the head of the current one, to avoid missing a tag that crosses the chunk boundary
    • search the spliced data for the ftyp tag
    • on success, return offset - 4, since the tag is preceded by 4 bytes of size
    • if the tag is not found, return -1

After creating a LivePhoto instance, the export_main_image method extracts the still image; it accepts an optional destination path:

public string export_main_image (string? dest = null) throws Error {
    // Export the bytes before `video_offset`
    var file = File.new_for_commandline_arg  (this.filename);
    var input_stream = file.read ();
    string main_image_filename;
    if (dest != null) {
        main_image_filename = dest;
    } else {
        if (this.basename.has_prefix ("MVIMG")) {
            // The main image of a live photo is named as `IMG_YYYYMMDD_HHMMSS.xxx`
            main_image_filename = Path.build_filename (this.dest_dir, "IMG" + this.basename[5:]);
        } else {
            // If the original image is xxx.yyy, the main image is xxx_0.yyy
            main_image_filename = Path.build_filename (this.dest_dir, this.basename_no_ext + "_0." + this.extension_name);
        }
    }

    var output_stream = File.new_for_commandline_arg  (main_image_filename).replace (null, make_backup, file_create_flags);
    // Write the bytes before `video_offset` to the main image file
    Utils.write_stream_before (input_stream, output_stream, this.video_offset);

    if (export_original_metadata) {
        // Copy the metadata from the live photo to the main image
        metadata.save_file (main_image_filename);
    }
    return (owned) main_image_filename;
}
  • read the file in chunks of BUFFER_SIZE bytes
  • write the data to the destination file, stopping at video_offset
  • if the original XMP metadata should be exported, save it into the destination file
  • return the path of the destination file
    • if no destination path is given, a default name is derived from the original file name
    • if the original name starts with MVIMG, the generated name starts with IMG
    • otherwise _0 is appended to the original base name
    • the file extension is kept unchanged

The export_video method extracts the video file in the same way; it also accepts an optional destination path:

public string export_video (string? dest = null) throws Error {
    /* Export the video of the live photo. */
    // Export the bytes after `video_offset`
    var file = File.new_for_commandline_arg  (this.filename);
    var input_stream = file.read ();
    string video_filename;
    if (dest != null) {
        video_filename = dest;
    } else {
        if (this.basename.has_prefix ("MVIMG")) {
            // The video of a live photo is named as `VID_YYYYMMDD_HHMMSS.mp4`
            video_filename = Path.build_filename (this.dest_dir, "VID" + this.basename_no_ext[5:] + ".mp4");
        } else if (this.basename.has_prefix ("IMG")) {
            // If the original image is IMG_YYYYMMDD_HHMMSS.xxx, the video is VID_YYYYMMDD_HHMMSS.mp4
            video_filename = Path.build_filename (this.dest_dir, "VID" + this.basename_no_ext[3:] + ".mp4");
        } else {
            video_filename = Path.build_filename (this.dest_dir, "VID_" + this.basename_no_ext + ".mp4");
        }
    }

    var output_stream = File.new_for_commandline_arg  (video_filename).replace (null, make_backup, file_create_flags);
    // Write the bytes after `video_offset` to the video file
    input_stream.seek (this.video_offset, SeekType.SET);
    Utils.write_stream (input_stream, output_stream);

    return (owned) video_filename;
}
  • read the file in chunks of BUFFER_SIZE bytes
    • skipping the first video_offset bytes
  • write the data to the destination file
  • return the path of the destination file
    • if no destination path is given, a default name is derived from the original file name
    • if the original name starts with MVIMG, the generated name starts with VID
    • if the original name starts with IMG, the generated name also starts with VID
    • otherwise VID_ is prefixed to the original base name
    • the file extension is .mp4
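
For illustration, here is a minimal usage sketch of the two export methods (the file name is hypothetical, and the calls must run where Error can propagate or be caught):

var photo = new LivePhotoConv.LivePhoto ("MVIMG_20240801_120000.jpg");
string main_image = photo.export_main_image (); // -> IMG_20240801_120000.jpg
string video = photo.export_video ();           // -> VID_20240801_120000.mp4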

Finally, the splites_images_from_video_ffmpeg method splits the embedded video into images. It accepts two optional parameters, the output format and the destination directory. Note that this method drives FFmpeg, so FFmpeg must be installed. It does not require exporting the video first: the program reads the video data starting at video_offset and pipes it straight into FFmpeg, so the frames are extracted directly from the motion photo without any intermediate file:

public void splites_images_from_video_ffmpeg (string? output_format = null, string? dest_dir = null) throws Error {
    /* Export the video of the live photo and split the video into images. */
    string name_to_printf;
    string dest;

    var format = (output_format != null) ? output_format : this.extension_name;

    if (this.basename.has_prefix ("MVIMG")) {
        name_to_printf = "IMG" + this.basename_no_ext[5:] + "_%d." + format;
    } else {
        name_to_printf = this.basename_no_ext + "_%d." + format;
    }

    if (dest_dir != null) {
        dest = Path.build_filename (dest_dir, name_to_printf);
    } else {
        dest = Path.build_filename (this.dest_dir, name_to_printf);
    }

    string[] commands;
    if (format.ascii_down () == "webp") {
        // Specify the `libwebp` encoder to avoid the `libwebp_anim` encoder in `ffmpeg`
        commands = {"ffmpeg", "-progress", "-", // Split progress to stdout
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-c:v", "libwebp",
                    "-y", dest};
    } else {
        commands = {"ffmpeg", "-progress", "-",
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-y", dest};
    }

    var subprcs = new Subprocess.newv (commands,
        SubprocessFlags.STDOUT_PIPE |
        SubprocessFlags.STDERR_PIPE |
        SubprocessFlags.STDIN_PIPE);

    var pipe_stdin = subprcs.get_stdin_pipe ();
    var pipe_stdout = subprcs.get_stdout_pipe ();
    var pipe_stderr = subprcs.get_stderr_pipe ();

    var file = File.new_for_commandline_arg  (this.filename);
    var input_stream = file.read ();
    input_stream.seek (this.video_offset, SeekType.SET);
    Utils.write_stream (input_stream, pipe_stdin);

    pipe_stdin.close (); // Close the pipe to signal the end of the input stream, MUST before `wait`
    subprcs.wait ();

    var exit_code = subprcs.get_exit_status ();

    if (exit_code != 0) {
        var subprcs_error = Utils.get_string_from_file_input_stream (pipe_stderr);
        throw new ConvertError.FFMPEG_EXIED_WITH_ERROR (
            "Command `%s' failed with %d - `%s'",
            string.joinv (" ", commands),
            exit_code,
            subprcs_error);
    }

    if (export_original_metadata) {
        MatchInfo match_info;

        var subprcs_output = Utils.get_string_from_file_input_stream (pipe_stdout);
        var re_frame = /.*frame=\s*(\d+)/s;

        re_frame.match (subprcs_output, 0, out match_info);
        if (match_info.matches ()) {
            // Set the metadata of the images
            var num_frames = int64.parse (match_info.fetch (1));
            for (int i = 1; i < num_frames + 1; i += 1) {
                var image_filename = Path.build_filename (this.dest_dir, name_to_printf.printf (i));
                metadata.save_file (image_filename);
            }
        } else {
            Reporter.warning ("FFmpegOutputParseWarning", "Failed to parse the output of FFmpeg.");
        }
    }
}
  • build the printf-style output file name from the output format
  • if a destination directory is given, use it; otherwise use the directory of the original file
  • build the FFmpeg command according to the output format
  • spawn the FFmpeg process
  • read the file in chunks of BUFFER_SIZE bytes
    • skipping the first video_offset bytes
  • pipe the data to FFmpeg
  • wait for the FFmpeg process to finish
    • if FFmpeg exits with a non-zero status, throw an error
  • if the original XMP metadata should be exported, save it into the output files
    • if FFmpeg's output contains the frame count, save the metadata into every output image
    • otherwise emit a warning
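
Continuing the usage sketch above, splitting the embedded video into frames is then a single call; the names in the comment follow the naming rule described above:

photo.splites_images_from_video_ffmpeg ("webp"); // -> IMG_20240801_120000_1.webp, IMG_20240801_120000_2.webp, ...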

A GStreamer-based implementation may be added later to support more efficient processing such as parallel image encoding; the current FFmpeg implementation does not parallelize the image encoding, so it may not be particularly fast.

Architecture: LiveMaker, for composition

The LiveMaker class composes motion photos; it contains the following private fields:

public class LivePhotoConv.LiveMaker {

    string main_image_path;
    string video_path;
    string dest;
    GExiv2.Metadata metadata;
    bool make_backup;
    FileCreateFlags file_create_flags;
    
    ......
}

Constructing an instance requires the main image path and the video path; optionally, an output path, whether to export the original XMP metadata, the file creation flags, and whether to make a backup can also be passed:

public LiveMaker (string main_image_path, string video_path,
                    string? dest = null, bool export_original_metadata = true,
                    FileCreateFlags file_create_flags = FileCreateFlags.REPLACE_DESTINATION,
                    bool make_backup = false) throws Error {
    this.main_image_path = main_image_path;
    this.video_path = video_path;
    this.make_backup = make_backup;
    this.file_create_flags = file_create_flags;

    if (dest != null) {
        this.dest = dest;
    } else {
        var main_basename = Path.get_basename (main_image_path);
        if (main_basename.has_prefix ("IMG")) {
            main_basename = "MVIMG" + main_basename[3:];
            this.dest = Path.build_filename (Path.get_dirname (main_image_path), main_basename);
        } else {
            var video_basename = Path.get_basename (video_path);
            if (video_basename.has_prefix ("VID")) {
                video_basename = "MVIMG" + video_basename[3:];
                this.dest = Path.build_filename (Path.get_dirname (main_image_path), video_basename);
            } else {
                this.dest = Path.build_filename (Path.get_dirname (main_image_path), "MVIMG_" + main_basename);
            }
        }
    }

    this.metadata = new GExiv2.Metadata ();
    if (export_original_metadata) {
        // Copy the metadata from the main image to the live photo
        this.metadata.open_path (main_image_path);
    }
}

When a LiveMaker instance is created, the output path is derived from the arguments; if exporting the original XMP metadata is requested, the main image's metadata is loaded so that it can later be copied into the output file.

After creating the instance, the export method combines the main image and the video into a motion photo; it accepts an optional destination path:

public void export (string? dest = null) throws Error {
    var dest_path = (dest == null) ? this.dest : dest;
    var live_file = File.new_for_commandline_arg (dest_path);
    var output_stream = live_file.replace (null, this.make_backup, this.file_create_flags);

    var main_file = File.new_for_commandline_arg  (this.main_image_path);
    var main_input_stream = main_file.read ();

    var video_file = File.new_for_commandline_arg  (this.video_path);
    var video_input_stream = video_file.read ();
    var video_size = video_file.query_info ("standard::size", FileQueryInfoFlags.NONE).get_size ();

    // Copy the main image to the live photo
    Utils.write_stream (main_input_stream, output_stream);
    // Copy the video to the live photo
    Utils.write_stream (video_input_stream, output_stream);

    output_stream.close ();

    // Copy the metadata from the main image to the live photo
    // Set the XMP tag `LivePhoto` to `True`
    this.metadata.try_set_tag_string ("Xmp.GCamera.MicroVideoVersion", "1");
    this.metadata.try_set_tag_string ("Xmp.GCamera.MicroVideo", "1");
    this.metadata.try_set_tag_string ("Xmp.GCamera.MicroVideoOffset", video_size.to_string ());
    this.metadata.save_file (dest_path);
}
  • create the output file
  • read the contents of the main image and of the video file
  • write the main image followed by the video into the output file
  • set the motion photo XMP tags (MicroVideo, MicroVideoVersion, MicroVideoOffset) and save the metadata into the output file
    • if exporting the original metadata was requested, the main image's metadata is included as well
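
A minimal usage sketch (the paths are hypothetical, and the calls must run where Error can propagate or be caught):

var maker = new LivePhotoConv.LiveMaker ("IMG_20240801_120000.jpg", "VID_20240801_120000.mp4");
maker.export (); // -> MVIMG_20240801_120000.jpg next to the main image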

Architecture: the Utils namespace, for shared helpers

The Utils namespace contains utility functions, such as helpers for reading and writing files. They are used by both classes above, so they were factored out for reuse.

At present the Utils namespace contains a single constant:

const int BUFFER_SIZE = 8192;

It specifies the buffer size used when reading files.

There are also several functions, for example one that reads a string from a file input stream:

public string get_string_from_file_input_stream (InputStream input_stream) throws IOError {
    StringBuilder? builder = null;
    uint8[] buffer = new uint8[BUFFER_SIZE + 1]; // allocate one more byte for the null terminator
    buffer.length = BUFFER_SIZE; // Set the length of the buffer to BUFFER_SIZE
    ssize_t bytes_read;

    while ((bytes_read = input_stream.read (buffer)) > 0) {
        buffer[bytes_read] = '\0'; // Add a null terminator to the end of the string
        if (builder == null) {
            builder = new StringBuilder.from_buffer ((char[]) buffer);
        } else {
            (!) builder.append ((string) buffer);
        }
    }

    return (builder != null) ? (!) builder.free_and_steal () : "";
}
  • create a StringBuilder
  • read from the stream in chunks of BUFFER_SIZE bytes
  • convert each chunk into a string
  • append the string to the StringBuilder
  • return the string accumulated in the StringBuilder
    • if nothing was read, return an empty string
  • useful for reading the output of a subprocess

There is also a function that copies the contents of an input stream to an output stream:

public void write_stream (InputStream input_stream, OutputStream output_stream) throws IOError {
    var buffer = new uint8[BUFFER_SIZE];
    ssize_t bytes_read;
    while ((bytes_read = input_stream.read (buffer)) > 0) {
        buffer.length = (int) bytes_read;
        output_stream.write (buffer);
        buffer.length = BUFFER_SIZE;
    }
}
  • read from the input stream in chunks of BUFFER_SIZE bytes
  • write the data to the output stream
  • until the end of the stream is reached

And a variant that writes only a given number of leading bytes:

public void write_stream_before (InputStream input_stream, OutputStream output_stream, int64 end) throws IOError {
    var bytes_to_write = end;
    var buffer = new uint8[BUFFER_SIZE];
    ssize_t bytes_read;
    while ((bytes_read = input_stream.read (buffer)) > 0 && bytes_to_write > 0) {
        if (bytes_read > bytes_to_write) {
            buffer.length = (int) bytes_to_write;
        } else {
            buffer.length = (int) bytes_read;
        }
        output_stream.write (buffer);
        buffer.length = BUFFER_SIZE;
        bytes_to_write -= bytes_read;
    }
}
  • read from the input stream in chunks of BUFFER_SIZE bytes
  • write the data to the output stream
  • until the end of the stream is reached or the requested number of bytes has been written
  • used for extracting the still image
    • stops once video_offset bytes have been written

Architecture improvement: the LivePhoto implementations

I have since improved the LivePhoto class. LivePhoto itself is now an abstract class with two concrete implementations, LivePhotoFFmpeg and LivePhotoGst, which implement frame extraction with FFmpeg and GStreamer respectively. The desired implementation can then be chosen as needed.

The rest of LivePhoto is unchanged, but splites_images_from_video became an abstract method that must be implemented by the subclasses:

public abstract void splites_images_from_video (string? output_format = null, string? dest_dir = null, int threads = 0) throws Error;
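
With this split, a caller can pick a backend at build time. A minimal sketch, assuming the subclasses keep the base-class constructor and using the ENABLE_GST Vala define from the Meson setup described at the end of this article:

LivePhotoConv.LivePhoto photo;
#if ENABLE_GST
photo = new LivePhotoConv.LivePhotoGst ("MVIMG_20240801_120000.jpg"); // hypothetical path
#else
photo = new LivePhotoConv.LivePhotoFFmpeg ("MVIMG_20240801_120000.jpg");
#endif
photo.splites_images_from_video ("webp");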

The LivePhotoFFmpeg implementation is as follows:

public override void splites_images_from_video (string? output_format = null, string? dest_dir = null, int threads = 1) throws Error {
    /* Export the video of the live photo and split the video into images. */
    string name_to_printf;
    string dest;

    var format = (output_format != null) ? output_format : this.extension_name;

    if (threads != 1) {
        Reporter.warning ("NotImplementedWarning", "The `threads` parameter of FFmpeg mode is not implemented.");
    }

    if (this.basename.has_prefix ("MVIMG")) {
        name_to_printf = "IMG" + this.basename_no_ext[5:] + "_%d." + format;
    } else {
        name_to_printf = this.basename_no_ext + "_%d." + format;
    }

    if (dest_dir != null) {
        dest = Path.build_filename (dest_dir, name_to_printf);
    } else {
        dest = Path.build_filename (this.dest_dir, name_to_printf);
    }

    string[] commands;
    if (format.ascii_down () == "webp") {
        // Specify the `libwebp` encoder to avoid the `libwebp_anim` encoder in `ffmpeg`
        commands = {"ffmpeg", "-progress", "-", // Split progress to stdout
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-c:v", "libwebp",
                    "-y", dest};
    } else {
        commands = {"ffmpeg", "-progress", "-",
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-y", dest};
    }

    var subprcs = new Subprocess.newv (commands,
        SubprocessFlags.STDOUT_PIPE |
        SubprocessFlags.STDERR_PIPE |
        SubprocessFlags.STDIN_PIPE);

    var pipe_stdin = subprcs.get_stdin_pipe ();
    var pipe_stdout = subprcs.get_stdout_pipe ();
    var pipe_stderr = subprcs.get_stderr_pipe ();

    var file = File.new_for_commandline_arg (this.filename);
    var input_stream = file.read ();
    input_stream.seek (this.video_offset, SeekType.SET);
    Utils.write_stream (input_stream, pipe_stdin);

    pipe_stdin.close (); // Close the pipe to signal the end of the input stream, MUST before `wait`
    subprcs.wait ();

    var exit_code = subprcs.get_exit_status ();

    if (exit_code != 0) {
        var subprcs_error = Utils.get_string_from_file_input_stream (pipe_stderr);
        throw new ExportError.FFMPEG_EXIED_WITH_ERROR (
            "Command `%s' failed with %d - `%s'",
            string.joinv (" ", commands),
            exit_code,
            subprcs_error);
    }

    if (export_original_metadata) {
        MatchInfo match_info;

        var subprcs_output = Utils.get_string_from_file_input_stream (pipe_stdout);
        var re_frame = /.*frame=\s*(\d+)/s;

        re_frame.match (subprcs_output, 0, out match_info);
        if (match_info.matches ()) {
            // Set the metadata of the images
            var num_frames = int64.parse (match_info.fetch (1));
            for (int i = 1; i < num_frames + 1; i += 1) {
                var image_filename = Path.build_filename (this.dest_dir, name_to_printf.printf (i));
                try {
                    metadata.save_file (image_filename);
                } catch (Error e) {
                    throw new ExportError.MATEDATA_EXPORT_ERROR ("Cannot save metadata to `%s': %s", image_filename, e.message);
                }
            }
        } else {
            Reporter.warning ("FFmpegOutputParseWarning", "Failed to parse the output of FFmpeg.");
        }
    }
}

This is essentially the same as before; the body of splites_images_from_video_ffmpeg simply moved into the LivePhotoFFmpeg class.

The LivePhotoGst implementation is as follows:

public override void splites_images_from_video (string? output_format = null, string? dest_dir = null, int threads = 0) throws Error {
    // Empty (null) args for Gst.init
    unowned string[] args = null;
    Gst.init (ref args);

    // Create a pipeline
    var pipeline = Gst.parse_launch ("appsrc name=src ! decodebin ! videoflip method=automatic ! queue ! videoconvert ! video/x-raw,format=RGB,depth=8 ! appsink name=sink") as Gst.Bin;
    var appsrc = pipeline.get_by_name ("src") as Gst.App.Src;
    var appsink = pipeline.get_by_name ("sink") as Gst.App.Sink;

    // Create a new thread to push data
    Thread<void> push_thread = new Thread<void> ("file_pusher", () => {
        try {
            // Set the video source
            var file = File.new_for_commandline_arg (this.filename);
            var input_stream = file.read ();
            input_stream.seek (this.video_offset, SeekType.SET);
            uint8[] buffer = new uint8[Utils.BUFFER_SIZE];
            ssize_t size;
            while ((size = input_stream.read (buffer)) > 0) {
                buffer.length = (int) size;
                var gst_buffer = new Gst.Buffer.wrapped (buffer);
                var flow_ret = appsrc.push_buffer (gst_buffer);
                if (flow_ret != Gst.FlowReturn.OK) {
                    warning ("Error pushing buffer to appsrc: %s", flow_ret.to_string ());
                    break;
                }
                buffer.length = Utils.BUFFER_SIZE;
            }
        } catch (Error e) {
            Reporter.error ("IOError", e.message);
        }
        appsrc.end_of_stream ();
    });
    pipeline.set_state (Gst.State.PLAYING);

    // Create a threadpool to process the images
    if (threads == 0) {
        threads = (int) get_num_processors ();
    }
    var pool = new ThreadPool<Sample2Img>.with_owned_data ((item) => {
        try {
            if (export_original_metadata) {
                item.export (this.metadata);
            } else {
                item.export ();
            }
        } catch (Error e) {
            Reporter.error ("Error", e.message);
        }
    }, threads, false);

    Gst.Sample sample;
    uint index = 1;
    string filename_no_index_ext = Path.build_filename (
        (dest_dir == null) ? this.dest_dir : dest_dir,
        ((this.basename.has_prefix ("MVIMG")) ?
            "IMG" + this.basename_no_ext[5:] :
            this.basename_no_ext)
    );
    unowned var extension = (output_format != null) ? output_format : this.extension_name;
    // for jpg, pixbuf requires the format to be "jpeg"
    unowned var format = (extension == "jpg") ? "jpeg" : extension;
    while ((sample = appsink.pull_sample ()) != null) {
        string filename = filename_no_index_ext + "_%u.".printf (index) + extension;
        var item = new Sample2Img (sample, filename, format);
        pool.add ((owned) item);
        index += 1;
    }

    push_thread.join ();
    ThreadPool.free ((owned) pool, false, true);
    pipeline.set_state (Gst.State.NULL);
}

In the GStreamer implementation I parallelized the encoding of the individual frames; the number of threads can be set with the threads parameter and defaults to the number of CPU cores. The ! videoflip method=automatic element renders the video in its correct orientation; the frames are then converted to RGB and finally encoded as images. This avoids some problems encountered with FFmpeg, such as incorrect orientation.

To make multi-threaded encoding easy to implement, I wrote a Sample2Img class that converts a Gst.Sample into an image file:

[Compact (opaque = true)]
public class LivePhotoConv.Sample2Img {
    Gst.Sample sample;
    string filename;
    string output_format;

    /**
     * Constructor for the Sample2Img class.
     *
     * @param sample The Gst.Sample object to be processed.
     * @param filename The name of the output file.
     * @param output_format The format of the output file.
     */
    public Sample2Img (owned Gst.Sample sample, string filename, string output_format) {
        this.sample = sample;
        this.filename = filename;
        this.output_format = output_format;
    }

    /**
     * Export the sample as an image.
     *
     * @param metadata The metadata to be saved along with the image. (optional)
     * @throws Error if an error occurs during the export process.
     */
    public void export (GExiv2.Metadata? metadata = null) throws Error {
        unowned var buffer = this.sample.get_buffer ();
        unowned var caps = this.sample.get_caps ();
        unowned var info = caps.get_structure (0);
        int width, height;
        info.get_int ("width", out width);
        info.get_int ("height", out height);
        
        Gst.MapInfo map;
        buffer.map (out map, Gst.MapFlags.READ);
        Gdk.Pixbuf pixbuf = new Gdk.Pixbuf.from_data (
            map.data,
            Gdk.Colorspace.RGB,
            false,
            8,
            width,
            height,
            width * 3
        );

        pixbuf.save (filename, output_format);
        buffer.unmap (map); // Unmap the buffer once the pixbuf has been written out
        Reporter.info ("Exported image", filename);

        if (metadata != null) {
            try {
                metadata.save_file (filename);
            } catch (Error e) {
                throw new ExportError.MATEDATA_EXPORT_ERROR ("Cannot save metadata to `%s': %s", filename, e.message);
            }
        }
    }
}

The current Sample2Img implementation is fairly simple: it converts the Gst.Sample into a Gdk.Pixbuf and saves it as an image file. Since this requires first converting the decoded video frames to RGB and then encoding them into images, the efficiency of this step still leaves room for optimization; I may improve it in a future version.

Architecture improvement (2024.08.26): LivePhotoFFmpeg

I improved the LivePhotoFFmpeg implementation, mainly by refactoring splites_images_from_video to separate the part that pushes the file data from the part that saves the metadata, which allows error cases to be handled better.

public override void splites_images_from_video (string? output_format = null, string? dest_dir = null, int threads = 1) throws Error {
    /* Export the video of the live photo and split the video into images. */
    string name_to_printf;
    string dest;

    var format = (output_format != null) ? output_format : this.extension_name;

    if (threads != 0 && threads != 1) {
        Reporter.warning ("NotImplementedWarning", "The `threads` parameter of FFmpeg mode is not implemented.");
    }

    if (this.basename.has_prefix ("MVIMG")) {
        name_to_printf = "IMG" + this.basename_no_ext[5:] + "_%d." + format;
    } else {
        name_to_printf = this.basename_no_ext + "_%d." + format;
    }

    if (dest_dir != null) {
        dest = Path.build_filename (dest_dir, name_to_printf);
    } else {
        dest = Path.build_filename (this.dest_dir, name_to_printf);
    }

    string[] commands;
    if (format.ascii_down () == "webp") {
        // Specify the `libwebp` encoder to avoid the `libwebp_anim` encoder in `ffmpeg`
        commands = {"ffmpeg", "-progress", "-", // Split progress to stdout
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-c:v", "libwebp",
                    "-y", dest};
    } else {
        commands = {"ffmpeg", "-progress", "-",
                    "-loglevel", "fatal",
                    "-hwaccel", "auto",
                    "-i", "pipe:0",
                    "-f", "image2",
                    "-y", dest};
    }

    var subprcs = new Subprocess.newv (commands,
        SubprocessFlags.STDOUT_PIPE |
        SubprocessFlags.STDERR_PIPE |
        SubprocessFlags.STDIN_PIPE);

    Thread<ExportError?> push_thread = new Thread<ExportError?> ("file_pusher", () => {
        try {
            // Set the video source
            var pipe_stdin = subprcs.get_stdin_pipe ();
            var file = File.new_for_commandline_arg (this.filename);
            var input_stream = file.read ();
            input_stream.seek (this.video_offset, SeekType.SET);
            Utils.write_stream (input_stream, pipe_stdin);

            // `subprcs.get_stdin_pipe ()`'s return value is **unowned**,
            // so we need to close it **manually**.
            // Close the pipe to signal the end of the input stream,
            // otherwise the process will be **blocked**.
            pipe_stdin.close ();
            return null;
        } catch (Error e) {
            return new ExportError.FILE_PUSH_ERROR ("Pushing to FFmpeg failed: %s", e.message);
        }
    });

    var pipe_stdout = subprcs.get_stdout_pipe ();
    var pipe_stderr = subprcs.get_stderr_pipe ();

    var pipe_stdout_dis = new DataInputStream (pipe_stdout);
    var re_frame = /^frame=\s*(\d+)/;
    MatchInfo match_info;

    string line;
    int64 frame_processed = 0;
    while ((line = pipe_stdout_dis.read_line ()) != null) {
        if (re_frame.match (line, 0, out match_info)) {
            var frame = int64.parse (match_info.fetch (1));
            for (; frame_processed < frame; frame_processed += 1) {
                var image_filename = Path.build_filename (
                    (dest_dir == null) ? this.dest_dir : dest_dir,
                    name_to_printf.printf (frame_processed + 1)
                );
                Reporter.info ("Exported image", image_filename);

                if (export_original_metadata) {
                    try {
                        metadata.save_file (image_filename);
                    } catch (Error e) {
                        // DO NOT throw the error, just report it
                        // because the image exporting is not affected
                        Reporter.error ("Error", e.message);
                    }
                }
            }
        }
    }

    var push_file_error = push_thread.join ();
    // Report the error of data pushing,
    // report here instead of throwing it to avoid zombie subprocess
    if (push_file_error != null) {
        Reporter.error ("FilePushError", push_file_error.message);
    }
    subprcs.wait ();

    var exit_code = subprcs.get_exit_status ();

    if (exit_code != 0) {
        var subprcs_error = Utils.get_string_from_file_input_stream (pipe_stderr);
        throw new ExportError.FFMPEG_EXIED_WITH_ERROR (
            "Command `%s' failed with %d - `%s'",
            string.joinv (" ", commands),
            exit_code,
            subprcs_error);
    }
}

In addition, the previous implementation saved the metadata only after the FFmpeg process had exited; now the metadata is saved whenever FFmpeg's progress output reports a frame count, so the user can follow the progress while the command runs.

Architecture improvement (2024.08.26): LivePhotoGst

I improved the LivePhotoGst implementation as well, changing the pipeline so that the video is automatically rotated according to its metadata, which avoids incorrectly oriented exported images.

public override void splites_images_from_video (string? output_format = null, string? dest_dir = null, int threads = 0) throws Error {
    // Empty (null) args for Gst.init
    unowned string[] args = null;
    Gst.init (ref args);

    // Create a pipeline
    var pipeline = Gst.parse_launch ("appsrc name=src ! decodebin ! videoflip method=automatic ! queue ! videoconvert ! video/x-raw,format=RGB,depth=8 ! appsink name=sink") as Gst.Bin;
    var appsrc = pipeline.get_by_name ("src") as Gst.App.Src;
    var appsink = pipeline.get_by_name ("sink") as Gst.App.Sink;

    // NOTE: `giostreamsrc` does not support `seek` and will read from the beginning of the file,
    // so use `appsrc` instead.
    // Create a new thread to push data
    Thread<Error?> push_thread = new Thread<Error?> ("file_pusher", () => {
        try {
            // Set the video source
            var file = File.new_for_commandline_arg (this.filename);
            var input_stream = file.read ();
            input_stream.seek (this.video_offset, SeekType.SET);
            uint8[] buffer = new uint8[Utils.BUFFER_SIZE];
            ssize_t size;
            while ((size = input_stream.read (buffer)) > 0) {
                buffer.length = (int) size;
                var gst_buffer = new Gst.Buffer.wrapped (buffer);
                var flow_ret = appsrc.push_buffer (gst_buffer);
                if (flow_ret != Gst.FlowReturn.OK) {
                    appsrc.end_of_stream ();
                    return new ExportError.FILE_PUSH_ERROR ("Pushing to appsrc failed, flow returned %s", flow_ret.to_string ());
                }
                buffer.length = Utils.BUFFER_SIZE;
            }
        } catch (Error e) {
            appsrc.end_of_stream ();
            return new ExportError.FILE_PUSH_ERROR ("Pushing to appsrc failed: %s", e.message);
        }
        appsrc.end_of_stream ();
        return null;
    });
    pipeline.set_state (Gst.State.PLAYING);

    // Create a threadpool to process the images
    if (threads == 0) {
        threads = (int) get_num_processors ();
    }
    var pool = new ThreadPool<Sample2Img>.with_owned_data ((item) => {
        try {
            if (export_original_metadata) {
                item.export (this.metadata);
            } else {
                item.export ();
            }
        } catch (Error e) {
            Reporter.error ("Error", e.message);
        }
    }, threads, false);

    Gst.Sample sample;
    uint index = 1;
    string filename_no_index_ext = Path.build_filename (
        (dest_dir == null) ? this.dest_dir : dest_dir,
        ((this.basename.has_prefix ("MVIMG")) ?
            "IMG" + this.basename_no_ext[5:] :
            this.basename_no_ext)
    );
    unowned var extension = (output_format != null) ? output_format : this.extension_name;
    // for jpg, pixbuf requires the format to be "jpeg"
    unowned var format = (extension == "jpg") ? "jpeg" : extension;
    while ((sample = appsink.pull_sample ()) != null) {
        string filename = filename_no_index_ext + "_%u.".printf (index) + extension;
        var item = new Sample2Img (sample, filename, format);
        pool.add ((owned) item);
        index += 1;
    }

    var push_file_error = push_thread.join ();
    if (push_file_error != null) {
        throw push_file_error;
    }
    ThreadPool.free ((owned) pool, false, true);
    pipeline.set_state (Gst.State.NULL);
}

Meson build system option (2024.08.26)

I added an option to the Meson build system that controls GStreamer support. It can be set with -Dgst=[enabled|auto|disabled] (for example, meson setup build -Dgst=disabled) and defaults to auto, which enables GStreamer automatically depending on the system environment. The option is declared in meson_options.txt as follows:

option('gst', type: 'feature', value: 'auto',
    description: 'GStreamer support')

src/meson.build then decides from this option whether to build the GStreamer-related source files:

# Find GStreamer dependencies
require_gst = get_option('gst')
gst = dependency('gstreamer-1.0', required: require_gst)
gst_app = dependency('gstreamer-app-1.0', required: require_gst)
gdk_pixbuf = dependency('gdk-pixbuf-2.0', required: require_gst)

# Check if all GStreamer dependencies are found
if gst.found() and gst_app.found() and gdk_pixbuf.found()
  add_project_arguments('-D', 'ENABLE_GST', language: 'vala')

  basic_deps += [
    gst,
    gst_app,
    gdk_pixbuf
  ]

  executable_sources += [
    'livephotogst.vala',
    'sample2img.vala'
  ]
endif

With this, the GStreamer-related sources are built only when the option allows it and all of the required dependencies are found.

In my experience so far, exporting webp or jxl from the embedded video is faster with GStreamer, while exporting jpg, png, or avif is faster with FFmpeg; moreover, when exporting avif the GStreamer implementation cannot export the metadata, while FFmpeg handles it correctly. Choose the implementation according to your needs.