🧠 Project Overview

This project showcases a Retrieval-Augmented Generation (RAG) chatbot capable of answering Apple iPhone support questions by referencing real documentation scraped from Apple’s official site.

The chatbot can run fully locally or with OpenAI models, and it streams responses through a Gradio interface. It demonstrates modern AI implementation practices, including local inference with Ollama and hybrid local/cloud deployment with GPT-4 for reliability.

🔧 Technologies Used

  • LangChain – RAG pipeline orchestration with ConversationalRetrievalChain (see the pipeline sketch after this list)
  • FAISS – Efficient vector similarity search with a saved local index
  • OpenAI Embeddings – For cloud-based text representation and semantic search
  • OpenAI GPT-4 – For LLM-based text generation
  • Ollama – For local LLM inference using LLaMA 3 and nomic-embed-text
  • Gradio – Interactive frontend for the chatbot demo with model switching
  • Playwright (Python) – For automated scraping of Apple Support documentation (see the scraping sketch after this list)
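
As a sketch of the scraping step, the snippet below pulls article text with Playwright's sync API. The URL and the CSS selector are illustrative assumptions, not the exact targets the project uses:

```python
# scrape_docs.py – sketch of the scraping step (Playwright sync API).
# The URL and the "#main" selector are assumptions for illustration.
from playwright.sync_api import sync_playwright

def scrape_article(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        text = page.inner_text("#main")  # main article body (assumed selector)
        browser.close()
        return text

if __name__ == "__main__":
    print(scrape_article("https://support.apple.com/iphone")[:500])
```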

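The retrieval side, condensed: scraped text is chunked, embedded, and stored in a local FAISS index that a ConversationalRetrievalChain queries on every turn. The chunk sizes, file names, and retriever k below are illustrative assumptions:

```python
# rag_chain.py – condensed sketch of the RAG pipeline.
# Chunk sizes, paths, and k are assumptions for illustration.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# 1. Split the scraped documentation into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(open("apple_docs.txt").read())  # assumed file

# 2. Embed the chunks and persist a local FAISS index.
store = FAISS.from_texts(chunks, OpenAIEmbeddings())
store.save_local("faiss_index")

# 3. Wire retriever, memory, and LLM into a conversational chain.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-4", streaming=True),
    retriever=store.as_retriever(search_kwargs={"k": 4}),
    memory=memory,
)

print(chain.invoke({"question": "How do I reset my iPhone?"})["answer"])
```
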
πŸ” Features

  • πŸ”„ Dynamic model selection between llama3.2 (local but not shown in the current demo) and gpt-4 (cloud)
  • 🧩 Document chunking and embedding with semantic retrieval
  • πŸ—ƒοΈ Vector search via FAISS to enhance question answering
  • πŸ’¬ Conversation memory with context persistence
  • 🌐 Self-hosted or deployable via Docker
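
Model switching boils down to swapping the LLM, and, in local mode, the embedding model, behind the same chain. A minimal sketch, assuming the langchain-openai and langchain-ollama integrations; the function names are illustrative:

```python
# model_select.py – sketch of backend selection; function names are
# illustrative, not the project's actual API.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_ollama import ChatOllama, OllamaEmbeddings

def get_llm(choice: str):
    """Return the chat model for the selected backend."""
    if choice == "gpt-4":
        return ChatOpenAI(model="gpt-4", streaming=True)  # cloud
    # Local inference; requires `ollama pull llama3.2` beforehand.
    return ChatOllama(model="llama3.2")

def get_embeddings(choice: str):
    """Local mode embeds with nomic-embed-text instead of OpenAI."""
    if choice == "gpt-4":
        return OpenAIEmbeddings()
    return OllamaEmbeddings(model="nomic-embed-text")
```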

🎯 Why It Matters

This project is ideal for:

  • Companies exploring AI-based customer support automation
  • Teams evaluating RAG pipelines and vector databases
  • Clients requiring private LLM deployment or hybrid cloud-local solutions

🚀 Try It Yourself

You can try the demo HERE.

Interface for the RAG chatbot.
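
Under the hood, the interface is a Gradio ChatInterface with a model dropdown. A minimal sketch, assuming a hypothetical build_chain(model_choice) helper that loads the FAISS index and picks the model as in the sketches above:

```python
# app.py – sketch of the Gradio frontend. `build_chain` is a hypothetical
# helper assembling the ConversationalRetrievalChain for the chosen model.
import gradio as gr

def respond(message, history, model_choice):
    chain = build_chain(model_choice)  # hypothetical; see sketches above
    result = chain.invoke({"question": message})
    # The real app streams tokens; yielding the complete answer here
    # keeps the sketch short.
    yield result["answer"]

demo = gr.ChatInterface(
    respond,
    additional_inputs=[
        gr.Dropdown(choices=["gpt-4", "llama3.2"], value="gpt-4", label="Model")
    ],
)

if __name__ == "__main__":
    demo.launch()
```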

The entire project is containerized with Docker Compose, allowing easy deployment in local or cloud environments.

Want a similar AI assistant trained on your own knowledge base or product documentation?

Contact me here or see the GitHub repository.