kunishou/databricks-dolly-15k-ja
How to use Jumtra/rinna-3.6b-tune-ep5 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Jumtra/rinna-3.6b-tune-ep5")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Jumtra/rinna-3.6b-tune-ep5")
model = AutoModelForCausalLM.from_pretrained("Jumtra/rinna-3.6b-tune-ep5")
How to use Jumtra/rinna-3.6b-tune-ep5 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Jumtra/rinna-3.6b-tune-ep5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Jumtra/rinna-3.6b-tune-ep5",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, run the model with Docker:
docker model run hf.co/Jumtra/rinna-3.6b-tune-ep5
How to use Jumtra/rinna-3.6b-tune-ep5 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Jumtra/rinna-3.6b-tune-ep5" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Jumtra/rinna-3.6b-tune-ep5",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Jumtra/rinna-3.6b-tune-ep5" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Jumtra/rinna-3.6b-tune-ep5",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use Jumtra/rinna-3.6b-tune-ep5 with Docker Model Runner:
docker model run hf.co/Jumtra/rinna-3.6b-tune-ep5
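The OpenAI-compatible `/v1/completions` endpoints shown above (port 8000 for vLLM, port 30000 for SGLang) can also be called from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the response is assumed to follow the OpenAI completions schema (`choices[0].text`).

```python
import json
import urllib.request

MODEL_ID = "Jumtra/rinna-3.6b-tune-ep5"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl calls in this card."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request to a running server and return the generated text.

    Assumes an OpenAI-style response body: {"choices": [{"text": ...}]}.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` targets vLLM, and `complete("Once upon a time,", base_url="http://localhost:30000")` targets SGLang.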
This model was fine-tuned from rinna/japanese-gpt-neox-3.6b using MosaicML's llm-foundry repository.
Model date: June 28, 2023
License: MIT
Model accuracy was evaluated on Jumtra/test_data_100QA. The table also reports perplexity on the validation data used during training.
| Model name | Accuracy (correct answers / 100) | Perplexity |
|---|---|---|
| Jumtra/rinna-3.6b-tune-ep5 | 40/100 | 8.105 |
| Jumtra/rinna-v1-tune-ep1 | 42/100 | 7.458 |
| Jumtra/rinna-v1-tune-ep3 | 41/100 | 7.034 |
| Jumtra/calm-7b-tune-ep4 | 40/100 | 9.766 |
| Jumtra/calm-v3-ep1 | 35/100 | 9.305 |
| Jumtra/calm-v3-ep3 | 37/100 | 13.276 |
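The card does not state how the two metrics were computed. As a rough sketch, assuming the standard definitions (accuracy as the fraction of correct answers, perplexity as the exponential of the mean per-token negative log-likelihood in nats), they can be expressed as follows. The function names are hypothetical.

```python
import math


def accuracy(correct, total):
    """Fraction of questions answered correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def perplexity(token_nlls):
    """exp of the mean negative log-likelihood per token (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def mean_loss_from_perplexity(ppl):
    """Inverse mapping: mean per-token loss in nats for a given perplexity."""
    return math.log(ppl)
```

Under these assumptions, a perplexity of 8.105 corresponds to a mean validation loss of about 2.09 nats per token, and 40/100 is an accuracy of 0.40.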
The following prompt was used:
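# "### Input:"
INSTRUCTION_KEY = "### 入力:"
# "### Response:"
RESPONSE_KEY = "### 回答:"
# "The following is an input containing an instruction that describes a task, together with context. Generate a response that appropriately satisfies the request."
INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"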
JP_PROMPT_FOR_GENERATION_FORMAT = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
intro=INTRO_BLURB,
instruction_key=INSTRUCTION_KEY,
instruction="{instruction}",
response_key=RESPONSE_KEY,
)
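To show how the prompt template above might be combined with the Transformers loading code, here is a minimal sketch. The helpers `build_prompt` and `generate` are hypothetical, and the generation settings (`max_new_tokens`, `add_special_tokens=False`) are assumptions rather than the author's configuration.

```python
INSTRUCTION_KEY = "### 入力:"
RESPONSE_KEY = "### 回答:"
INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"

# Same layout as JP_PROMPT_FOR_GENERATION_FORMAT: intro, input key,
# instruction, response key, each on its own line.
PROMPT_TEMPLATE = "{intro}\n{instruction_key}\n{instruction}\n{response_key}\n"


def build_prompt(instruction):
    """Fill the prompt template with a user instruction."""
    return PROMPT_TEMPLATE.format(
        intro=INTRO_BLURB,
        instruction_key=INSTRUCTION_KEY,
        instruction=instruction,
        response_key=RESPONSE_KEY,
    )


def generate(instruction, max_new_tokens=256):
    """Generate a response with Transformers (downloads the model on first use)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Jumtra/rinna-3.6b-tune-ep5"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Assumption: encode without extra special tokens and pass raw input ids.
    input_ids = tokenizer.encode(
        build_prompt(instruction), add_special_tokens=False, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `generate("日本の首都はどこですか?")` would build the full prompt and return only the text produced after the response key.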