{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RAG - Chatbot\n",
    "\n",
    "## Background\n",
    "\n",
    "In this notebook, I demonstrate how to build the backend of a RAG (retrieval-augmented generation) chatbot that lets users ask questions about uploaded PDF documents and the content of provided URLs.\n",
    "\n",
    "The key advantage of a RAG chatbot over a standard chatbot is that it retrieves the chunks of user-provided information that best match a question and appends them as context to that question, as illustrated here:\n",
    "![RAG_Chatbot_Flowchart](RAG_Chatbot_transpBG.drawio.png)\n",
    "\n",
    "This technique therefore provides a feasible alternative to the expensive process of fine-tuning an LLM on information that was not available during pre-training.\n",
    "\n",
    "In this implementation, we will use:\n",
    "- LLM: [zephyr-7b-alpha](https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha) through [Hugging Face serverless Inference API](https://huggingface.co/docs/api-inference/en/index)\n",
    "- Embeddings: [HF Sentence Transformers all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)\n",
    "- Vector store: [FAISS](https://faiss.ai/index.html)\n",
    "- Glue between these components: [LangChain (v0.3)](https://python.langchain.com/docs/versions/v0_3/)\n",
    "\n",
    "Note: the code snippets below have been copied and simplified from my original code [here](https://github.com/OnurKerimoglu/chat_with_docs/blob/main/src/rag.py), which is deployed to a Hugging Face Space [here](https://huggingface.co/spaces/OnurKerimoglu/rag_chat). The Space may well be sleeping due to inactivity (don't hesitate to wake it up!)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building the Bot\n",
    "Let's first import all the packages that will be needed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "BtdxWdDWl6-2"
   },
   "outputs": [],
   "source": [
    "import dotenv\n",
    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "from langchain_community.document_loaders import UnstructuredURLLoader, PyPDFLoader\n",
    "from langchain_community.vectorstores import FAISS\n",
    "from langchain_huggingface import HuggingFaceEndpoint, HuggingFaceEmbeddings\n",
    "from langchain.chains import RetrievalQA\n",
    "from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we define some helper functions that load data from the given URL and PDF sources and split it into text chunks, which will later be vectorized:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def load_data(urls, pdfs):\n",
    "    \"\"\"Load the given URLs and PDF files into a list of LangChain documents.\"\"\"\n",
    "    documents = []\n",
    "    if urls:\n",
    "        url_loader = UnstructuredURLLoader(urls=urls)\n",
    "        documents.extend(url_loader.load())\n",
    "    for pdf in pdfs or []:  # tolerate pdfs=None, as is done for urls\n",
    "        pdf_loader = PyPDFLoader(pdf)\n",
    "        documents.extend(pdf_loader.load())\n",
    "    return documents\n",
    "\n",
    "def sources_to_texts(documents):\n",
    "    \"\"\"Split the documents into overlapping text chunks.\"\"\"\n",
    "    # Chunking parameters, measured in characters\n",
    "    chunk_size = 1000\n",
    "    chunk_overlap = 200\n",
    "\n",
    "    text_splitter = RecursiveCharacterTextSplitter(\n",
    "        chunk_size=chunk_size,\n",
    "        chunk_overlap=chunk_overlap)\n",
    "    texts = text_splitter.split_documents(documents)\n",
    "    return texts"
   ]
  },
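  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is a simplification: `RecursiveCharacterTextSplitter` additionally tries to split at separators such as paragraph and line breaks, so its chunks will generally differ. The function name `sliding_chunks` and the toy input are hypothetical and not part of LangChain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sliding_chunks(text, chunk_size, chunk_overlap):\n",
    "    \"\"\"Split text into fixed-size chunks; neighbours share chunk_overlap characters.\"\"\"\n",
    "    if not 0 <= chunk_overlap < chunk_size:\n",
    "        raise ValueError(\"chunk_overlap must be in [0, chunk_size)\")\n",
    "    step = chunk_size - chunk_overlap  # how far the window advances each time\n",
    "    chunks = []\n",
    "    for start in range(0, len(text), step):\n",
    "        chunks.append(text[start:start + chunk_size])\n",
    "        if start + chunk_size >= len(text):\n",
    "            break  # the window has reached the end of the text\n",
    "    return chunks\n",
    "\n",
    "# Hypothetical toy input: a window of 4 characters that advances by 2\n",
    "print(sliding_chunks(\"abcdefghij\", chunk_size=4, chunk_overlap=2))"
   ]
  },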
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, a function that creates a retriever from the loaded documents, using the chunking helper above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "id": "WHvjp6I1WMAp"
   },
   "outputs": [],
   "source": [
    "def create_retriever(documents, k):\n",
    "    texts = sources_to_texts(documents)\n",
    "    # Create embeddings\n",
    "    embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
    "    vectorstore = FAISS.from_documents(texts, embeddings)\n",
    "    retriever = vectorstore.as_retriever(search_kwargs={\"k\": k})\n",
    "    return retriever"
   ]
  },
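  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What the retriever does with `k` can be sketched without any vector store: score every chunk vector against the query vector and keep the `k` best. The toy example below uses hand-made 2-D vectors and cosine similarity. Both are assumptions for illustration only: the real vectors come from the embedding model, and FAISS may rank by a different metric such as L2 distance. `cosine_similarity` and `top_k` are hypothetical helper names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "def cosine_similarity(a, b):\n",
    "    \"\"\"Cosine of the angle between two non-zero vectors of equal length.\"\"\"\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(x * x for x in a))\n",
    "    norm_b = math.sqrt(sum(y * y for y in b))\n",
    "    return dot / (norm_a * norm_b)\n",
    "\n",
    "def top_k(query_vec, chunk_vecs, k):\n",
    "    \"\"\"Return the indices of the k chunk vectors most similar to query_vec.\"\"\"\n",
    "    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]\n",
    "    ranked = sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)\n",
    "    return ranked[:k]\n",
    "\n",
    "# Hypothetical toy vectors standing in for chunk embeddings\n",
    "toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]\n",
    "print(top_k([1.0, 0.0], toy_vectors, k=2))"
   ]
  },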
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's define a helper function that creates a LangChain `RetrievalQA` bot from a given LLM and retriever:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_QAbot(retriever, llm):\n",
    "    # System prompt and prompt template\n",
    "    system_template = \"\"\"\n",
    "    You are an AI assistant that answers questions based on the given context.\n",
    "    Your responses should be informative and relevant to the question asked.\n",
    "    If you don't know the answer or if the information is not present in the context, just say so.\n",
    "    After answering a user question, stop, and do not make up any follow up questions\"\"\"\n",
    "    human_template = \"\"\"Context: {context}\n",
    "\n",
    "    Question: {question}\n",
    "\n",
    "    Answer: \"\"\"\n",
    "    # Create the prompt\n",
    "    system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)\n",
    "    human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)\n",
    "    prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])\n",
    "    QAbot = RetrievalQA.from_chain_type(\n",
    "        llm=llm,\n",
    "        chain_type=\"stuff\",\n",
    "        retriever=retriever,\n",
    "        return_source_documents=True,\n",
    "        chain_type_kwargs={\"prompt\": prompt}\n",
    "    )\n",
    "    return QAbot"
   ]
  },
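  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `\"stuff\"` chain type places all retrieved chunks into the `{context}` slot of the prompt. A rough pure-Python sketch of that step follows. The helper `stuff_prompt`, the blank-line separator between chunks, and the example strings are assumptions for illustration, not LangChain internals."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def stuff_prompt(chunks, question):\n",
    "    \"\"\"Join the retrieved chunks and combine them with the question into one prompt.\"\"\"\n",
    "    context = \"\\n\\n\".join(chunks)  # assumed separator between chunks\n",
    "    return f\"Context: {context}\\n\\nQuestion: {question}\\n\\nAnswer: \"\n",
    "\n",
    "# Hypothetical retrieved chunks and question\n",
    "print(stuff_prompt([\"ML learns from data.\", \"AI is a broad field.\"], \"What is ML?\"))"
   ]
  },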
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now put everything together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def setup_rag_bot(\n",
    "        urls,\n",
    "        pdfs,\n",
    "        k=3  # i.e., retrieve the 3 best-matching chunks\n",
    "        ):\n",
    "    # Initial data\n",
    "    documents = load_data(\n",
    "        urls,\n",
    "        pdfs\n",
    "        )\n",
    "    # Create the retriever\n",
    "    retriever = create_retriever(\n",
    "        documents,\n",
    "        k=k\n",
    "        )\n",
    "    # Create the llm\n",
    "    llm = HuggingFaceEndpoint(\n",
    "        repo_id=\"HuggingFaceH4/zephyr-7b-alpha\",\n",
    "        # A small temperature reduces the potential for hallucination\n",
    "        # and makes the model more likely to follow the instructions\n",
    "        temperature=0.01,\n",
    "        max_new_tokens=512\n",
    "        )\n",
    "    # Create a QA bot\n",
    "    RAGbot = create_QAbot(\n",
    "        retriever,\n",
    "        llm\n",
    "    )\n",
    "    return RAGbot"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, let's define a function that acts as the interface between the RAG chatbot and the user:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def ask_ragbot(RAGbot, question):\n",
    "    result = RAGbot.invoke({\"query\": question})\n",
    "    sources = [doc.metadata.get('source', 'Unknown source') for doc in result[\"source_documents\"]]\n",
    "    response = {\n",
    "        \"question\": question,\n",
    "        \"answer\": result[\"result\"],\n",
    "        \"sources\": sources\n",
    "    }\n",
    "    print(f\"Question: {response['question']}\")\n",
    "    print(f\"Answer: {response['answer']}\")\n",
    "    print(\"Sources:\")\n",
    "    for source in response['sources']:\n",
    "        print(f\"- {source}\")"
   ]
  },
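  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because several retrieved chunks often come from the same document, the source list printed by `ask_ragbot` can contain repeats. One possible way to collapse them while keeping the retrieval order is sketched below. `unique_sources` is a hypothetical helper that is not used elsewhere in this notebook, and the file names are made up."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def unique_sources(sources):\n",
    "    \"\"\"Drop repeated sources while preserving first-seen order.\"\"\"\n",
    "    # dict keys are unique and keep insertion order\n",
    "    return list(dict.fromkeys(sources))\n",
    "\n",
    "# Hypothetical source list with one repeat\n",
    "print(unique_sources([\"a.pdf\", \"b.html\", \"a.pdf\"]))"
   ]
  },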
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example Usage\n",
    "\n",
    "Let's now create a RAG bot based on some URLs and a local PDF:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.\n"
     ]
    }
   ],
   "source": [
    "ragbot = setup_rag_bot(\n",
    "        urls = [\n",
    "            \"https://en.wikipedia.org/wiki/Artificial_intelligence\",\n",
    "            \"https://en.wikipedia.org/wiki/Machine_learning\"\n",
    "        ],\n",
    "        pdfs = [\"/home/onur/WORK/DS/repos/chat_with_docs/docs/the-big-book-of-mlops-v10-072023 - Databricks.pdf\"]\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Time to have a chat!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: What is Machine Learning?\n",
      "Answer: \n",
      "\n",
      "    Machine learning is the study of programs that can improve their performance on a given task automatically. It has been a part of AI from the beginning and is a field that started to flourish in the 1990s. There are several kinds of machine learning, including unsupervised learning, supervised learning (classification and regression), and probably approximately correct (PAC) learning. The term machine learning was coined in 1959 by Arthur Samuel, an IBM employee and pioneer in the field of computer gaming and artificial intelligence.\n",
      "Sources:\n",
      "- https://en.wikipedia.org/wiki/Artificial_intelligence\n",
      "- https://en.wikipedia.org/wiki/Machine_learning\n",
      "- https://en.wikipedia.org/wiki/Machine_learning\n"
     ]
    }
   ],
   "source": [
    "ask_ragbot(ragbot, \"What is Machine Learning?\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: How does Databricks help with model deployment?\n",
      "Answer: \n",
      "    Databricks provides a comprehensive platform for data science and machine learning, which includes tools and features for model deployment. Databricks released Delta Lake to the open source community in 2019, which provides all the data lifecycle management functions that are needed to make cloud-based object stores reliable and performant. This design allows clients to update multiple objects at once and to replace a subset of data in place, which is essential for model deployment. Additionally, Databricks provides MLflow, a popular open-source platform for managing the end-to-end machine learning lifecycle, which includes model training, evaluation, and deployment. MLflow provides a Model Registry, which allows for managing model artifacts directly via UI and APIs, and provides flexibility to update production models without code changes. Overall, Databricks provides a robust and flexible platform for model deployment, which can help organizations operationalize their machine learning models more efficiently and effectively.\n",
      "Sources:\n",
      "- /home/onur/WORK/DS/repos/chat_with_docs/docs/the-big-book-of-mlops-v10-072023 - Databricks.pdf\n",
      "- /home/onur/WORK/DS/repos/chat_with_docs/docs/the-big-book-of-mlops-v10-072023 - Databricks.pdf\n",
      "- /home/onur/WORK/DS/repos/chat_with_docs/docs/the-big-book-of-mlops-v10-072023 - Databricks.pdf\n"
     ]
    }
   ],
   "source": [
    "ask_ragbot(ragbot, \"How does Databricks help with model deployment?\")"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [
    "IlK08De8l6-2"
   ],
   "provenance": []
  },
  "kernelspec": {
   "display_name": "langchain_311",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}