---
title: ChinaTravel
emoji: 🐢
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.34.0
app_file: app.py
pinned: false
license: cc-by-nc-sa-4.0
---

<!-- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference -->
<center>
<h1>ChinaTravel: An Open-Ended Benchmark for Language Agents in Chinese Travel Planning</h1>
</center>
The official codebase for our NeurIPS'25 (Datasets and Benchmarks Track) submission "ChinaTravel: An Open-Ended Benchmark for Language Agents in Chinese Travel Planning".
<!--
[Webpage](https://www.lamda.nju.edu.cn/shaojj/chinatravel/) | [Paper](https://arxiv.org/abs/2412.13682) | [Dataset(Huggingface)](https://huggingface.co/datasets/LAMDA-NeSy/ChinaTravel), [Dataset(ModelScope)](https://www.modelscope.cn/datasets/Cbphcr/ChinaTravel)
-->
[Dataset (Huggingface)](https://huggingface.co/datasets/LAMDA-NeSy/chinatravel_neurips25submission)
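To browse the benchmark queries without cloning this repo, the dataset can be pulled from the Hub with the `datasets` library. A minimal sketch; the available splits and fields are repo-specific, so check the dataset card:

```python
# Minimal sketch: load the ChinaTravel benchmark data from the Hugging Face Hub.
# The split names and fields are an assumption; print the object to inspect them.
from datasets import load_dataset

data = load_dataset("LAMDA-NeSy/chinatravel_neurips25submission")
print(data)  # shows the available splits and their columns
```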
## Quick Start

### Setup

1. Create a conda environment and install dependencies:

```bash
conda create -n chinatravel python=3.9
conda activate chinatravel
pip install -r requirements.txt
```

2. Download the database and unzip it to the `chinatravel/environment/` directory (download links: [Google Drive](https://drive.google.com/drive/folders/1bJ7jA5cfExO_NKxKfi9qgcxEbkYeSdAU?usp=drive_link), [NJU Drive](https://box.nju.edu.cn/d/dd83e5a4a9e242ed8eb4/)).
3. Download the necessary models or tokenizers (e.g., the [deepseek tokenizer](https://cdn.deepseek.com/api-docs/deepseek_v3_tokenizer.zip)) to `./chinatravel/local_llm`. You need to create the folder first; see the sketch after this list.
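A minimal sketch of step 3 in Python, assuming the tokenizer archive URL above is reachable and that extracting it into `chinatravel/local_llm` is all that is required:

```python
# Sketch of setup step 3: create ./chinatravel/local_llm and fetch the
# deepseek tokenizer into it. Uses the archive URL from the list above.
import io
import os
import urllib.request
import zipfile

os.makedirs("chinatravel/local_llm", exist_ok=True)
url = "https://cdn.deepseek.com/api-docs/deepseek_v3_tokenizer.zip"
with urllib.request.urlopen(url) as resp:
    archive = zipfile.ZipFile(io.BytesIO(resp.read()))
archive.extractall("chinatravel/local_llm")
```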
### Running

We support deepseek (the official DeepSeek API), gpt-4o (chatgpt-4o-latest), glm4-plus, and local inference with Qwen, Mistral, and Llama.

```bash
export OPENAI_API_KEY=""

# Pure neural agents: replace "Act" with "ReAct0" or "ReAct" as needed
python run_exp.py --splits easy --agent Act --llm gpt-4o

# LLM-modulo agent with 10 refine steps
python run_exp.py --splits medium --agent LLM-modulo --llm gpt-4o --refine_steps 10

# LLMNeSy agent with oracle translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek --oracle_translation

# LLMNeSy agent
python run_exp.py --splits human1000 --agent LLMNeSy --llm deepseek
```
Note:

1. Please download the weights of the open-source model to `./chinatravel/open_source_llm` and modify the corresponding model path in `./chinatravel/agent/llms.py`. This step is only necessary when using a locally deployed open-source model; see the sketch after this list.
2. We implement the following agents:
   1. `Act`: zero-shot Act agent
   2. `ReAct0`: zero-shot ReAct agent
   3. `ReAct`: one-shot ReAct agent
   4. `LLM-modulo`: LLM-modulo agent
   5. `LLMNeSy`: neuro-symbolic agent
3. We keep the DSL annotations of Human-1000 private to prevent performance fraud and unfair comparisons. Researchers are encouraged to submit their results to us for evaluation on Human-1000.
4. To skip already-completed queries, add the parameter `--skip 1`.
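The exact contents of `./chinatravel/agent/llms.py` are repo-specific, but the edit in note 1 amounts to pointing a model path at the locally downloaded weights. A hypothetical sketch, with the variable name and weight directory as illustrative assumptions:

```python
# Hypothetical sketch of the edit described in note 1; the actual variable
# names and structure of ./chinatravel/agent/llms.py will differ. The idea
# is to point the local loader at weights under ./chinatravel/open_source_llm.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./chinatravel/open_source_llm/Qwen2-7B-Instruct"  # illustrative path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map="auto")
```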
### Evaluation

The `--method` name follows the runs above: the agent name, then the LLM, with an `oracletranslation` suffix when `--oracle_translation` was used.

```bash
python eval_exp.py --splits human --method LLMNeSy_deepseek_oracletranslation
python eval_exp.py --splits human --method LLMNeSy_deepseek
```
## Docs

- [Environment](chinatravel/environment/readme.md)
- [Constraints](chinatravel/symbol_verification/readme.md)