Using GPT-4 Turbo with Vision

This repository now includes an example that integrates GPT-4 Turbo with Vision and Azure AI Search. The feature makes it possible to index and search documents that contain images and graphs, such as financial documents, in addition to text-based content.

Feature Overview

  • Document Handling: Source documents are split into pages and saved as PNG files in blob storage. Each file's name and page number are embedded for reference.
  • Data Extraction: Text data is extracted using OCR.
  • Data Indexing: Text and image embeddings, generated using Azure AI Vision Embeddings, are indexed in Azure AI Search along with the raw text.
  • Search and Response: Searches can be conducted using vectors or hybrid methods. Responses are generated by GPT-4 Turbo with Vision based on the retrieved content.
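
To make the "hybrid" option above more concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the method Azure AI Search documents for merging keyword and vector rankings in a hybrid query. It is a toy illustration, not the service's implementation: the function name, the page identifiers, and the constant k = 60 are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one fused ranking.

    Each document earns 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k = 60 is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical page ids, as if returned by a keyword query and a vector query.
keyword_hits = ["report-page-2.png", "report-page-7.png", "report-page-1.png"]
vector_hits = ["report-page-7.png", "report-page-3.png", "report-page-2.png"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Pages that rank well in both lists rise to the top, which is why a hybrid query can surface a page whose text matches only partially but whose image embedding is close to the question.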

Getting Started

Prerequisites

Setup and Usage

  1. Update repository: Pull the latest changes.

  2. Enable GPT-4 Turbo with Vision:

    First, make sure integrated vectorization is disabled, since it is currently incompatible with this feature:

    azd env set USE_FEATURE_INT_VECTORIZATION false

    Then set the environment variable for enabling vision support:

    azd env set USE_GPT4V true

    When set, the flag provisions a Computer Vision resource and a GPT-4-vision model, uploads image versions of PDFs to Blob storage, stores image embeddings in a new imageEmbedding field, and enables the vision approach in the UI.

  3. Clean old deployments (optional): Run azd down --purge for a fresh setup.

  4. Start the application: Execute azd up to build, provision, deploy, and initiate document preparation.

  5. Web Application Usage:

    • Access the developer options in the web app and select "Use GPT-4 Turbo with Vision".
    • The sample questions are updated so you can test the feature.
    • Select a question to view the response.
    • The 'Thought Process' tab shows the retrieved data and its processing by GPT-4 Turbo with Vision.
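
The flag constraint from step 2 can be expressed as a small pre-flight check. The sketch below is hypothetical and not part of the repository: the helper names are invented for illustration, and treating only the string "true" (case-insensitive) as enabled is an assumption about how the values are interpreted. The two variable names come from the steps above.

```python
def parse_flag(value):
    """Interpret an environment value as a boolean (assumed convention)."""
    return str(value or "").strip().lower() == "true"


def check_vision_flags(env):
    """Return True if vision is enabled, raising if the flags conflict.

    `env` is any mapping of variable names to string values,
    for example os.environ or a plain dict.
    """
    use_vision = parse_flag(env.get("USE_GPT4V"))
    use_int_vectorization = parse_flag(env.get("USE_FEATURE_INT_VECTORIZATION"))
    if use_vision and use_int_vectorization:
        raise ValueError(
            "USE_GPT4V is incompatible with USE_FEATURE_INT_VECTORIZATION; "
            "set USE_FEATURE_INT_VECTORIZATION to false first."
        )
    return use_vision


# Example with the values set in step 2.
example_env = {"USE_FEATURE_INT_VECTORIZATION": "false", "USE_GPT4V": "true"}
print(check_vision_flags(example_env))
```

Passing a mapping instead of reading os.environ directly keeps the check easy to run against any set of values before azd up.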

Feel free to explore and contribute to enhancing this feature. For questions or feedback, use the repository's issue tracker.