
SDMatte_plus-fp16-and-bf16

This repository provides optimized, inference-only versions of the original SDMatte model by LongfeiHuang.

The models here have been specifically processed to be lightweight and efficient for deployment and use in applications like ComfyUI, without compromising the quality of the matting results.

What is this?

This repository contains inference-only weights for the SDMatte model. The original checkpoint file (.pth) was a full training checkpoint, which included not only the model weights but also ~6.5 GB of trainer states (like optimizer states). These trainer states are crucial for resuming training but are unnecessary for performing inference (i.e., actually using the model for matting).

Optimizations Performed

  1. Removal of Trainer States: The largest optimization was stripping the trainer key from the original checkpoint. This removes all unnecessary data related to the training process, significantly reducing the file size without affecting the model's output.
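The stripping step can be sketched in a few lines of PyTorch. The exact layout of the SDMatte checkpoint is an assumption here (a top-level dict holding the model weights alongside a trainer entry, as described above), and strip_trainer_state is a hypothetical helper name, not something from the repo:

```python
import torch

def strip_trainer_state(in_path: str, out_path: str) -> None:
    """Drop the 'trainer' key (optimizer and other training state) from a
    full training checkpoint, keeping only what inference needs."""
    # Assumed layout: a top-level dict with a 'trainer' entry.
    ckpt = torch.load(in_path, map_location="cpu")
    ckpt.pop("trainer", None)  # no-op if the key is already absent
    torch.save(ckpt, out_path)
```

Because only a key is dropped, the remaining weight tensors are byte-for-byte identical to the originals, which is why this step cannot affect output quality.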

  2. 16-bit Precision Quantization: The model weights have been converted from their original 32-bit floating-point precision (FP32) to 16-bit precision. We provide two popular formats:

    • FP16 (half-precision): Offers a great balance of speed, reduced memory usage, and high quality. It is supported by most modern NVIDIA GPUs (10-series and newer).
    • BF16 (bfloat16): Offers a dynamic range identical to FP32, making it more resilient to overflow/underflow issues. It provides the best performance on newer NVIDIA GPUs (RTX 30-series and newer).
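The conversion itself is a per-tensor cast. This sketch (cast_16bit is an illustrative name, not from the repo) casts only floating-point tensors and leaves integer buffers, such as index tensors, at their original dtype:

```python
import torch

def cast_16bit(state_dict: dict, precision: str = "fp16") -> dict:
    """Cast floating-point tensors to FP16 or BF16; leave everything
    else (e.g. integer buffers) untouched."""
    dtype = {"fp16": torch.float16, "bf16": torch.bfloat16}[precision]
    return {
        k: v.to(dtype) if torch.is_tensor(v) and v.is_floating_point() else v
        for k, v in state_dict.items()
    }
```

The difference between the two formats comes down to bit allocation: BF16 keeps FP32's 8 exponent bits (hence the identical dynamic range) but has fewer mantissa bits than FP16, trading a little precision for resistance to overflow.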

DIY Quantization with convert_precision.py

This repository also includes convert_precision.py, the Python script that was used to create these fp16 and bf16 models. You can use this script to convert the original FP32 checkpoint yourself.

  1. Place the original FP32 SDMatte_plus.pth file in the same folder as the script.
  2. Open the convert_precision.py file with a text editor.
  3. Modify the TARGET_PRECISION variable at the top to either 'fp16' or 'bf16'.
  4. Run the script from your terminal: python convert_precision.py.
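The steps above amount to a short load-cast-save script. The following is a minimal sketch of the kind of conversion such a script performs; the internals of the actual convert_precision.py may differ, and the handling of weights nested under a 'model' key is an assumption:

```python
import torch

TARGET_PRECISION = "fp16"  # or "bf16"

def convert_checkpoint(in_path: str, target: str = TARGET_PRECISION) -> str:
    """Cast a checkpoint's floating-point weights to 16-bit and save a copy."""
    dtype = {"fp16": torch.float16, "bf16": torch.bfloat16}[target]
    ckpt = torch.load(in_path, map_location="cpu")
    # Work on the nested weight dict if the checkpoint wraps it in 'model'
    # (an assumed layout); otherwise treat the whole dict as the state dict.
    sd = ckpt.get("model", ckpt)
    for k, v in sd.items():
        if torch.is_tensor(v) and v.is_floating_point():
            sd[k] = v.to(dtype)
    out_path = in_path.replace(".pth", f"_{target}.pth")
    torch.save(ckpt, out_path)
    return out_path
```

Casting on CPU with map_location="cpu" avoids needing a GPU for the conversion itself.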

Acknowledgements

Huge thanks to LongfeiHuang for creating and open-sourcing the original SDMatte model. This repository is merely an optimized packaging of their incredible work. Please visit the original repository for more details on the model's architecture and training.


Chinese Version

What is this?

This repository provides an optimized, inference-only version of the original SDMatte model by LongfeiHuang.

The models here have been specifically processed to be lightweight and efficient, for easy deployment and use in applications such as ComfyUI, without compromising the quality of the matting results.

Optimizations Performed

  1. Removal of trainer states: The largest optimization was stripping the trainer key from the original checkpoint. This removes all data related to the training process that is unnecessary for inference (such as optimizer states), greatly reducing the file size without affecting the model's output quality. Roughly 6.5 GB of the original file fell into this category.

  2. 16-bit precision quantization: The model weights have been converted from their original 32-bit floating-point precision (FP32) to 16-bit precision. Two popular formats are provided:

    • FP16 (half-precision): Strikes an excellent balance between speed, VRAM usage, and quality. It is supported by most modern NVIDIA GPUs (10-series and newer).
    • BF16 (bfloat16): Has the same dynamic range as FP32, making it more resilient to overflow/underflow. It delivers the best performance on newer NVIDIA GPUs (RTX 30-series and newer).

DIY Quantization with convert_precision.py

This repository also includes convert_precision.py, the Python script used to create these fp16 and bf16 models. You can use this script to convert the original FP32 checkpoint yourself.

  1. Place the original FP32 SDMatte_plus.pth file in the same folder as the script.
  2. Open convert_precision.py in a text editor.
  3. Set the TARGET_PRECISION variable at the top of the file to either 'fp16' or 'bf16'.
  4. Run the script from your terminal: python convert_precision.py.

Acknowledgements

Huge thanks to LongfeiHuang for creating and open-sourcing the excellent SDMatte model. This repository is merely an optimized repackaging of their outstanding work. For more details on the model's architecture and training, please visit the original repository.
