---
library_name: transformers
base_model: medgemma27B
tags:
- medical
- medical-coding
- icd10
- cpt
- hcpcs
- healthcare
- clinical
- fine-tuned
- peft
- lora
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---

# medgemma-27b-medical-coding

## Model Description

This is a **LoRA adapter** for **medgemma27B**, fine-tuned for medical coding tasks. The model is specifically designed to:

- Extract diseases and medical conditions from discharge summaries
- Identify medical procedures and interventions
- Assign appropriate medical codes (ICD-10, CPT, HCPCS)
- Process clinical documentation with high accuracy

**Base Model:** `medgemma27B`
**Fine-tuning Method:** LoRA (Low-Rank Adaptation)

## Training Details

### LoRA Configuration

- **Rank (r):** 8
- **Alpha:** 16
- **Dropout:** 0.1
- **Target Modules:** `q_proj`, `v_proj`, `up_proj`, `gate_proj`

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer shipped with the adapter repository
tokenizer = AutoTokenizer.from_pretrained("TachyHealthResearch/medgemma-27b-medical-coding")

# Load the base model. "medgemma27B" must resolve to the actual base model
# repository ID on the Hub (for MedGemma 27B this is typically
# google/medgemma-27b-text-it).
base_model = AutoModelForCausalLM.from_pretrained(
    "medgemma27B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "TachyHealthResearch/medgemma-27b-medical-coding")
```

### Example Usage

```python
# Define the system prompt for medical coding
system_prompt = """You are an expert medical coding specialist. Analyze the discharge summary to extract diseases, procedures, and assign appropriate medical codes.

Return the response in JSON format with this structure:
{"diseases": ["disease1", "disease2"], "icd10_codes": ["code1", "code2"], "procedures": ["procedure1", "procedure2"], "cpt_codes": ["code1", "code2"], "hcpcs_codes": ["code1", "code2"]}"""

# Example discharge summary
discharge_summary = """
Patient admitted with chest pain and shortness of breath.
Diagnosed with acute myocardial infarction and congestive heart failure.
Underwent percutaneous coronary intervention with stent placement.
"""

# Prepare the conversation
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"Please analyze this discharge summary:\n\n{discharge_summary}"},
]

# Apply the chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the prompt
generated_ids = outputs[0][inputs["input_ids"].shape[-1]:]
generated_response = tokenizer.decode(generated_ids, skip_special_tokens=True).strip()

print("Generated Medical Codes:")
print(generated_response)
```

A sketch for parsing and sanity-checking this output follows below.
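### Parsing the Output

Because the model is prompted to return a single JSON object, downstream code should parse and sanity-check the output before using it. The helper below is a minimal illustrative sketch, not part of this repository: `parse_coding_response` and `plausible_icd10` are hypothetical names, and the regular expression only checks the surface shape of an ICD-10 code, not its existence in the official code set.

```python
import json
import re


def parse_coding_response(generated_response: str) -> dict:
    """Extract and parse the first JSON object in the model output.

    Generation can occasionally wrap the JSON in extra prose, so we
    search for the outermost {...} span instead of calling json.loads
    on the raw string.
    """
    match = re.search(r"\{.*\}", generated_response, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(0))


def plausible_icd10(code: str) -> bool:
    """Rough shape check (letter, two digits, optional subcategory).

    This does NOT confirm the code exists in the ICD-10 code set;
    always verify against an official code list.
    """
    return re.fullmatch(r"[A-Z]\d{2}(?:\.\w{1,4})?", code) is not None


result = parse_coding_response(generated_response)
suspect = [c for c in result.get("icd10_codes", []) if not plausible_icd10(c)]
if suspect:
    print("Codes needing manual review:", suspect)
```

Even well-formed codes should be reviewed by a qualified coder before billing or clinical use, as noted in the limitations below.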
## Model Performance

This model has been specifically fine-tuned for medical coding tasks and demonstrates strong performance in:

- Disease extraction from clinical text
- Medical procedure identification
- Medical code assignment (ICD-10, CPT, HCPCS)
- Structured JSON response generation

## Intended Use

### Primary Use Cases

- Medical coding automation
- Clinical documentation analysis
- Healthcare data processing

### Limitations

- Always verify generated codes with qualified medical coding professionals
- Performance may vary on clinical documents that differ significantly from the training data
- Intended for use only in appropriate healthcare environments

## License

This model is released under the Apache 2.0 License.

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{TachyHealthResearch_medgemma_27b_medical_coding_2024,
  title     = {medgemma-27b-medical-coding: Medical Coding Model},
  author    = {TachyHealthResearch},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/TachyHealthResearch/medgemma-27b-medical-coding}
}
```

---

**Important**: This model is intended for research and healthcare applications. Always ensure proper validation and human oversight when using AI models in medical contexts.