Instructions for using acul3/bahasa-4b-v3 with libraries and local apps.

Libraries

How to use acul3/bahasa-4b-v3 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="acul3/bahasa-4b-v3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("acul3/bahasa-4b-v3")
model = AutoModelForCausalLM.from_pretrained("acul3/bahasa-4b-v3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use acul3/bahasa-4b-v3 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "acul3/bahasa-4b-v3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "acul3/bahasa-4b-v3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
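
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on localhost:8000; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumption: a vLLM server started with `vllm serve` is reachable at this URL.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "acul3/bahasa-4b-v3"


def build_chat_request(prompt, model=MODEL_ID, url=BASE_URL):
    """Build (but do not send) a chat-completions POST request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` should print the model's reply. Pointing `url` at port 30000 would target an SGLang server in the same way, since both expose the OpenAI-compatible route.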

SGLang

How to use acul3/bahasa-4b-v3 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "acul3/bahasa-4b-v3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "acul3/bahasa-4b-v3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "acul3/bahasa-4b-v3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "acul3/bahasa-4b-v3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use acul3/bahasa-4b-v3 with Docker Model Runner:
```
docker model run hf.co/acul3/bahasa-4b-v3
```

Bahasa-4b Model Report

Model Name

Bahasa-4b

Model Detail

Bahasa-4b continues the training of qwen-4b on 10 billion tokens of high-quality Indonesian text. The model outperforms some 4B and even some 7B models on Indonesian tasks.

Model Developers

Bahasa AI

Intended Use

This model is intended for NLP tasks that require understanding and generating Indonesian text, such as question answering, sentiment analysis, and document summarization.

Training Data

Bahasa-4b was trained on a 10-billion-token subset of Indonesian data, selected from a collected pool of 100 billion tokens.

Benchmarks

The following table shows the performance of Bahasa-4b compared to the models Sailor_4b and Mistral-7B-v0.1 across several benchmarks:

| Dataset | Version | Metric | Mode | Sailor_4b | Bahasa-4b-hf | Mistral-7B-v0.1 |
|---|---|---|---|---|---|---|
| tydiqa-id | 0e9309 | EM | gen | 53.98 | 55.04 | 63.54 |
| tydiqa-id | 0e9309 | F1 | gen | 73.48 | 75.39 | 78.73 |
| xcopa-id | 36c11c | EM | ppl | 69.2 | 73.2 | 62.40 |
| xcopa-id | 36c11c | F1 | ppl | 69.2 | 73.2 | - |
| m3exam-id-ppl | ede415 | EM | ppl | 31.27 | 44.47 | 26.68 |
| belebele-id-ppl | 7fe030 | EM | ppl | 41.33 | 42.33 | 41.33 |
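
The report does not define how the EM and F1 metrics are computed. As a rough illustration only, the sketch below follows the common SQuAD-style definitions: exact match on normalized strings, and token-level F1 between prediction and reference. The normalization shown (lowercasing, removing ASCII punctuation, collapsing whitespace) is an assumption and may differ from what the evaluation harness behind the table actually does.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed rules)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Return the harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, the prediction "kota Jakarta" against the reference "Jakarta" gets an EM of 0 and an F1 of about 0.67, which is why F1 is typically higher than EM on generative QA.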

Example usage with Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model; device_map="auto" places the weights on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "Bahasalab/Bahasa-4b-chat-v2",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Bahasalab/Bahasa-4b-chat")

# System prompt: "You are a helpful assistant"; user prompt: "who are you"
messages = [
    {"role": "system", "content": "Kamu adalah asisten yang membantu"},
    {"role": "user", "content": "kamu siapa"}
]

# Render the conversation with the model's chat template, ending with the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Move the inputs to the device the model was actually loaded on.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    input_ids=model_inputs.input_ids,
    attention_mask=model_inputs.attention_mask,
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id
)

# Strip the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

The benchmark table above shows Bahasa-4b scoring higher than Sailor_4b on every listed Indonesian benchmark, in both EM (Exact Match) and F1. Against Mistral-7B-v0.1 it is competitive: it trails on tydiqa-id but leads on xcopa-id and m3exam-id-ppl, and is one point ahead on belebele-id-ppl.
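
To make the comparison concrete, the margins can be recomputed from the benchmark table. The sketch below copies the table's scores verbatim and subtracts each baseline from the Bahasa-4b-hf column; the names `SCORES`, `margin`, and `margins` are illustrative only.

```python
# Scores copied from the benchmark table: (Sailor_4b, Bahasa-4b-hf, Mistral-7B-v0.1).
# None marks the cell the table leaves empty ("-").
SCORES = {
    ("tydiqa-id", "EM"): (53.98, 55.04, 63.54),
    ("tydiqa-id", "F1"): (73.48, 75.39, 78.73),
    ("xcopa-id", "EM"): (69.2, 73.2, 62.40),
    ("xcopa-id", "F1"): (69.2, 73.2, None),
    ("m3exam-id-ppl", "EM"): (31.27, 44.47, 26.68),
    ("belebele-id-ppl", "EM"): (41.33, 42.33, 41.33),
}


def margin(ours, baseline):
    """Return ours - baseline rounded to 2 decimals, or None if the baseline is missing."""
    if baseline is None:
        return None
    return round(ours - baseline, 2)


def margins():
    """Map each (dataset, metric) to (margin vs Sailor_4b, margin vs Mistral-7B-v0.1)."""
    return {
        key: (margin(bahasa, sailor), margin(bahasa, mistral))
        for key, (sailor, bahasa, mistral) in SCORES.items()
    }


for (dataset, metric), (vs_sailor, vs_mistral) in margins().items():
    mistral_text = "n/a" if vs_mistral is None else f"{vs_mistral:+.2f}"
    print(f"{dataset:16} {metric}  vs Sailor_4b: {vs_sailor:+.2f}  vs Mistral-7B-v0.1: {mistral_text}")
```

The margin over Sailor_4b is positive in all six rows, from +1.00 on belebele-id-ppl to +13.20 on m3exam-id-ppl. Against Mistral-7B-v0.1 it ranges from -8.50 (tydiqa-id EM) to +17.79 (m3exam-id-ppl).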

Safetensors

Model size: 4B params

Tensor type: BF16