ImageSense AI - Image Search Quality Optimization

An AI-powered image search engine that enhances search capabilities by integrating Large Language Models (LLMs) to better understand user intent and deliver more relevant search results.

---
Source Code available in the repository
GitHub handle: saileshdwivedy30
Project: ImageSense AI - Image Search Quality Optimization
---


Overview

ImageSense AI is an image search engine that combines a Large Language Model (LLM) with OpenAI's CLIP model to interpret user queries and intent more accurately.
Key functionality includes LLM-based query expansion, text-to-image search, and efficient similarity search over CLIP embeddings using FAISS.

A LLaMA model expands each query, FAISS retrieves the closest-matching images, and a Streamlit interface presents the results.


Tech Stack

  • Python – Core language for development.
  • OpenAI CLIP Model – Extracts image and text embeddings.
  • FAISS (Facebook AI Similarity Search) – Enables fast similarity search among images.
  • LLaMA 3.2 3B Model – Hosted via Ollama API for query expansion.
  • Streamlit – Builds the interactive web interface.
  • PIL (Pillow) – Handles image processing.

Dataset Details

  • Dataset: Flickr8k
  • Dataset Source: Kaggle

Image Processing & Search Flow

1. Image Embedding Extraction

  • Extracts embeddings from all images using the OpenAI CLIP model.
  • Stores embeddings in a predefined folder for efficient retrieval.
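
The extraction step above could look roughly like the following sketch. It assumes the `clip` package from the OpenAI CLIP repository, the `ViT-B/32` variant, and hypothetical output file names (`embeddings.npy`, `paths.txt`); the project's actual model variant and storage layout are not stated here. Embeddings are normalized to unit length so that an inner-product search later behaves like cosine similarity.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return image paths in `folder`, sorted so the row order is reproducible."""
    return sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_EXTENSIONS
    )


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def extract_embeddings(image_dir, out_dir, model_name="ViT-B/32", batch_size=32):
    """Embed every image with CLIP and save the vectors next to their paths."""
    # Heavy dependencies are imported here so the helpers above work without them.
    import clip  # assumption: installed from the OpenAI CLIP repository
    import numpy as np
    import torch
    from PIL import Image

    paths = list_images(image_dir)
    if not paths:
        raise ValueError(f"no images found in {image_dir}")

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load(model_name, device=device)

    chunks = []
    with torch.no_grad():
        for batch in batched(paths, batch_size):
            pixels = torch.stack(
                [preprocess(Image.open(p).convert("RGB")) for p in batch]
            ).to(device)
            features = model.encode_image(pixels).float()
            # Unit-length vectors: inner product then equals cosine similarity.
            features = features / features.norm(dim=-1, keepdim=True)
            chunks.append(features.cpu().numpy())

    embeddings = np.concatenate(chunks).astype("float32")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    np.save(out / "embeddings.npy", embeddings)  # hypothetical file name
    (out / "paths.txt").write_text("\n".join(str(p) for p in paths))
    return embeddings, paths
```

Saving the path list alongside the matrix keeps row `i` of the embeddings tied to image `i`, which the search step needs in order to map an index hit back to a file.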

2. Query Expansion (LLaMA-based)

  • Enhances user queries using the LLaMA 3.2 3B model hosted via the Ollama API.
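
As an illustration of this step, here is a minimal sketch that calls Ollama's `/api/generate` endpoint using only the standard library. The model tag `llama3.2:3b`, the default local URL, and the prompt wording are assumptions, not the project's actual settings. Falling back to the original query when the server is unreachable is also a design choice made for this sketch.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "llama3.2:3b"  # assumed model tag

# Hypothetical prompt; the project's real prompt is not shown in this README.
PROMPT_TEMPLATE = (
    "Rewrite the following image search query as one short, descriptive "
    "sentence that names the objects, setting, and actions to look for. "
    "Return only the rewritten query.\n\nQuery: {query}"
)


def build_payload(query):
    """Build a non-streaming request body for the generate endpoint."""
    return {
        "model": MODEL_NAME,
        "prompt": PROMPT_TEMPLATE.format(query=query),
        "stream": False,  # ask for a single JSON object instead of chunks
    }


def post_json(url, payload, timeout=60):
    """POST `payload` as JSON and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def expand_query(query, send=post_json):
    """Return the LLM-expanded query, or the original query if expansion fails."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    try:
        reply = send(OLLAMA_URL, build_payload(query))
    except OSError:
        # Covers connection errors and HTTP errors raised by urllib.
        return query
    expanded = reply.get("response", "").strip()
    return expanded or query
```

The `send` parameter exists so the function can be exercised without a running Ollama server, for example `expand_query("dog beach", send=lambda url, payload: {"response": "a dog running on a beach"})`.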

3. Image Search with FAISS

  • Converts the expanded text query into embeddings using CLIP.
  • Uses FAISS for fast retrieval of the most similar images.
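
The two bullets above can be sketched as follows, assuming unit-length CLIP embeddings, the `clip` package from the OpenAI CLIP repository, and an exact `IndexFlatIP` index. The project may use a different FAISS index type, and the function names are hypothetical.

```python
def build_index(embeddings):
    """Build an exact inner-product index over float32 image embeddings."""
    # Imported here so the pure-Python helpers below work without FAISS.
    import faiss
    import numpy as np

    vectors = np.ascontiguousarray(embeddings, dtype="float32")
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(vectors)
    return index


def embed_text(query, model, device="cpu"):
    """Encode a text query with an already loaded CLIP model, shape (1, dim)."""
    import clip  # assumption: installed from the OpenAI CLIP repository
    import torch

    # truncate=True clips long expanded queries to CLIP's 77-token context.
    tokens = clip.tokenize([query], truncate=True).to(device)
    with torch.no_grad():
        features = model.encode_text(tokens).float()
    features = features / features.norm(dim=-1, keepdim=True)
    return features.cpu().numpy().astype("float32")


def pair_hits(scores_row, ids_row, paths):
    """Map one row of FAISS output to (path, score) pairs.

    FAISS pads the id row with -1 when the index holds fewer than k vectors,
    so those entries are skipped.
    """
    return [
        (paths[int(i)], float(s))
        for s, i in zip(scores_row, ids_row)
        if i >= 0
    ]


def search_images(index, query_vector, paths, k=5):
    """Return the k most similar images for a (1, dim) query vector."""
    scores, ids = index.search(query_vector, k)
    return pair_hits(scores[0], ids[0], paths)
```

A flat index compares the query against every stored vector, which is exact and typically adequate for a collection the size of Flickr8k; approximate FAISS indexes trade some accuracy for speed on much larger collections.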

4. Visual Interface (Streamlit)

The web interface lets users:

  • Enter search queries, which are expanded automatically.
  • View the matching images returned by FAISS retrieval.
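
A minimal sketch of such an interface is shown below. The page title, input label, and three-column grid are assumptions, and `expand_query` and `search` are hypothetical callables standing in for the query expansion and FAISS retrieval steps. The Streamlit module is passed in as `st` so the layout logic can be tried without launching a server.

```python
def rows_of(items, width):
    """Split results into rows of at most `width` items for a grid layout."""
    return [items[i:i + width] for i in range(0, len(items), width)]


def render_app(st, expand_query, search, columns=3):
    """Draw the search page.

    st           -- the streamlit module (or a stand-in with the same methods)
    expand_query -- callable: raw query -> expanded query
    search       -- callable: expanded query -> list of (image_path, score)
    """
    st.title("ImageSense AI")
    query = st.text_input("Describe the image you are looking for")
    if not query:
        return  # nothing typed yet, so there is nothing to search for

    expanded = expand_query(query)
    st.caption(f"Expanded query: {expanded}")

    hits = search(expanded)
    if not hits:
        st.write("No matching images found.")
        return

    # Lay the results out as a grid, one st.columns() call per row.
    for row in rows_of(hits, columns):
        for column, (path, score) in zip(st.columns(columns), row):
            column.image(path, caption=f"similarity {score:.2f}")
```

In an `app.py` file, this would be wired up with `import streamlit as st` followed by `render_app(st, expand_query, search)`, and started with `streamlit run app.py`.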

Major Takeaways

  • LLM-enhanced image search improves relevance by capturing more of the user's intent than the raw query does.
  • Query expansion via LLaMA helps refine ambiguous search terms.
  • CLIP and FAISS together enable fast and accurate image retrieval.
  • The Streamlit interface makes AI-powered image search easily accessible.