ShapeLLM model
This repository contains the ShapeLLM-13B model presented in ShapeLLM: Universal 3D Object Understanding for Embodied Interaction.
Install
- Clone this repository and navigate to the ShapeLLM folder
git clone https://github.com/qizekun/ShapeLLM.git
cd ShapeLLM
- Install Package
conda create -n shapellm python=3.10 -y
conda activate shapellm
pip install --upgrade pip # enable PEP 660 support
pip install -e .
- Install additional packages for training cases
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
- Install PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
ShapeLLM
model weights
Please check out our Model Zoo for all public ShapeLLM checkpoints.
Demo
CLI Inference
Chat about point clouds using the CLI. It also supports multi-GPU inference as well as 4-bit and 8-bit quantized inference.
python -m llava.serve.cli \
--model-path qizekun/ShapeLLM_13B_general_v1.0 \
--pts-file assets/instrument.npy
Training
Consistent with LLaVA, we adopt a two-stage training approach. In the first stage, we fine-tune only the projector for semantic alignment. In the second stage, we conduct full fine-tuning on instruction-following data.
Download the data following DATA, then organize it as follows in ./playground/data/shapellm/:
playground/data/shapellm/
├── cap3d_objaverse_785k.json
├── cap3d_objaverse_sft_45k.json
├── gapartnet_sft_27k_openai.json
├── gapartnet_pcs
│   ├── Box_100129_0_0.npy
│   └── ...
└── cap3d_pcs
    ├── 00000054c36d44a2a483bdbff31d8edf.pt
    └── ...
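Before launching training, it can help to confirm that this layout is in place. The following is a minimal sketch and is not part of the ShapeLLM codebase: the helper name `missing_shapellm_data` is hypothetical, and the expected entries are simply copied from the tree above.

```python
from pathlib import Path

# Entries copied from the directory tree above; adjust if your copy differs.
EXPECTED_FILES = [
    "cap3d_objaverse_785k.json",
    "cap3d_objaverse_sft_45k.json",
    "gapartnet_sft_27k_openai.json",
]
EXPECTED_DIRS = ["gapartnet_pcs", "cap3d_pcs"]


def missing_shapellm_data(root="./playground/data/shapellm"):
    """Return the expected files and folders that are absent or empty under root.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    for name in EXPECTED_DIRS:
        folder = root / name
        # A folder counts as missing if it does not exist or contains nothing.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name + "/")
    return missing


if __name__ == "__main__":
    problems = missing_shapellm_data()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Data layout looks complete.")
```

The check only looks at names and non-emptiness; it does not validate the contents of the JSON files or point clouds.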
Furthermore, ShapeLLM utilizes the Large version of ReCon++ as the point encoder.
You need to download the ReCon++ weight and save it to ./checkpoints/recon/large.pth.
checkpoints/recon/
└── large.pth
1. Feature Alignment Stage
sh scripts/pretrain.sh
2. Visual Instruction Tuning Stage
sh scripts/finetune.sh
The training takes around 14 hours for ShapeLLM-13B on 8x A100 (80G). It takes around 7 hours for ShapeLLM-7B.
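The two stages are meant to run in order, and the point encoder weight has to be in place first. Below is a minimal sketch of a wrapper that checks for the ReCon++ checkpoint and then runs both scripts, stopping at the first failure. The function `run_stages` and its `dry_run` and `weight` parameters are hypothetical conveniences, not part of the repository; the script and checkpoint paths are taken from the text above.

```python
import subprocess
from pathlib import Path

# Paths as given in this README.
RECON_WEIGHT = Path("./checkpoints/recon/large.pth")
STAGES = [
    ("feature alignment", ["sh", "scripts/pretrain.sh"]),
    ("visual instruction tuning", ["sh", "scripts/finetune.sh"]),
]


def run_stages(dry_run=False, weight=RECON_WEIGHT):
    """Run both stages in order; return the commands that were (or would be) run.

    Hypothetical helper for illustration only.
    """
    if not dry_run and not Path(weight).is_file():
        raise FileNotFoundError(f"ReCon++ weight not found at {weight}")
    executed = []
    for name, cmd in STAGES:
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so the second stage never starts after a failed first stage.
            subprocess.run(cmd, check=True)
        executed.append(cmd)
    return executed


if __name__ == "__main__":
    run_stages(dry_run=True)  # set dry_run=False to actually launch training
```

With `dry_run=True` the wrapper only prints the commands, which is a cheap way to confirm the order before committing GPU hours.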
Zero-shot Understanding on 3D MM-Vet
To evaluate 3D MLLMs on integrated capabilities and embodied interaction capabilities, run the script:
sh scripts/eval/mmvet.sh
Use GPT-4 to calculate the 3D MM-Vet score:
sh scripts/eval/eval_mmvet.sh
Visual Grounding on GAPartNet
To evaluate the performance of ShapeLLM on the GAPartNet dataset, run the script:
sh scripts/eval/gapartnet_ref.sh
Calculate the generative 3D visual grounding accuracy:
sh scripts/eval/eval_gapartnet.sh