AI - Azure AI services Blog

Ways to simplify your data ingestion pipeline with Azure AI Search

gia_mondragon
Microsoft
May 28, 2025

Updates to indexers and AI enrichment skills for multimodal data ingestion in Azure AI Search, plus a new no-code Logic Apps ingestion wizard that accelerates data preparation for AI agent grounding and RAG.

We are introducing multiple features in Azure AI Search that make data preparation for AI agent grounding and RAG easier. Here’s what’s new.

Introduction 

Azure AI Search is introducing new features and integrations designed to simplify and accelerate the creation of RAG-ready indexes. The GenAI Prompt Skill leverages generative AI during the indexing process, enabling sophisticated data transformations such as context expansion, image verbalization, content summarization, and sentiment classification to enhance multimodal search relevance. This skill is available in the 2025-05-01-preview REST API and in the most recent beta versions of the Azure AI Search SDKs for Python, .NET, and Java.

Additionally, we are releasing a new Azure Logic Apps integration within the Azure AI Search portal. This integration offers a no-code ingestion wizard for creating text and vector-based RAG-ready indexes, supporting multiple connectors for data ingestion. This simplifies data ingestion, enabling developers to focus on building innovative applications instead of plumbing integrations. 

Together, these features reduce data preparation time and help developers to spin up RAG-ready indexes more quickly. With faster data preparation and deeper enrichment, customers can expect RAG-powered apps to deliver more relevant search results, improved document understanding, and an overall enhanced user experience. 

GenAI Prompt Skill 

What it is 

The GenAI Prompt Skill is a new type of skill in Azure AI Search that uses chat-completion models during indexing. It enables developers to call any customer-owned model deployed in Azure AI Foundry or Azure OpenAI to enrich data during ingestion. By defining a custom prompt, developers can transform and enhance their data in ways that were previously cumbersome.

This skill is versatile, supporting virtually any text output transformation the underlying model can perform. Examples include: 

  • Image Verbalization: Generate alt-text or scene descriptions from image inputs for improved accessibility and search relevance. 
  • Content Summarization: Use extractive or abstractive summarization to distill large documents into concise, searchable summaries. 
  • Classification: Assign taxonomy labels or sentiment scores to content for better organization and filtering. 
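As a rough sketch of what a GenAI Prompt Skill entry in a skillset might look like for image verbalization: the odata.type, endpoint URI, and input/output names below are assumptions based on the preview schema, so verify them against the 2025-05-01-preview REST reference before use.

```python
import json

# Illustrative GenAI Prompt Skill entry for image verbalization.
# Property names and the odata.type are assumptions based on the
# 2025-05-01-preview schema; the endpoint and deployment are placeholders.
genai_prompt_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",
    "name": "image-verbalizer",
    "description": "Generate alt-text style descriptions for extracted images.",
    # Hypothetical chat-completions endpoint of your own model deployment:
    "uri": "https://<your-endpoint>/openai/deployments/<deployment>/chat/completions",
    "context": "/document/normalized_images/*",
    "inputs": [
        {"name": "systemMessage", "source": "='You describe images for a search index.'"},
        {"name": "userMessage", "source": "='Describe this image in two sentences.'"},
        {"name": "image", "source": "/document/normalized_images/*/data"},
    ],
    "outputs": [
        {"name": "response", "targetName": "verbalizedImage"},
    ],
}

print(json.dumps(genai_prompt_skill, indent=2))
```

The same pattern applies to summarization or classification: only the system and user messages change, which is what makes the skill versatile.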
     

Example Use Cases 

  • Image to Text: Convert images into descriptive text for indexing, enabling multimodal search capabilities. 
  • Sentiment Analysis: Automatically label customer feedback documents with positive, neutral, or negative sentiment. 
  • Summarization: Extract key information from long-form documents, making them easier to search and understand. 

 

Get Started with GenAI Prompt Skill 

The GenAI Prompt Skill integrates into the Azure AI Search indexing pipeline. Here’s an end-to-end flow of how this skill works within a RAG workflow: 

  1. Data Source: Documents, images, or multimodal content are ingested from various sources using Azure AI Search built-in indexers. 
  2. Crack and Parse Document: Text and image content are extracted from source documents and passed through the indexing pipeline. 
  3. GenAI Prompt Skill: Extracted content is processed by a skill specified in a skillset, which uses a chat-completion model deployed in Azure OpenAI or Azure AI Foundry. The model applies transformations such as summarization, classification, or image verbalization. 
  4. Enriched Data Output: Transformed data is output as enriched text or structured metadata, which is then stored in the Azure AI Search index for RAG-ready applications, as shown in the diagram below.
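To illustrate step 4, an indexer can route the enriched value produced by the skill into an index field via outputFieldMappings. The fragment below is a minimal, hypothetical sketch (indexer, skillset, and field names are made up for illustration):

```python
import json

# Hypothetical indexer fragment: outputFieldMappings route the enriched
# value emitted by the skill (here "verbalizedImage") into a searchable
# index field. All names are illustrative, not a real configuration.
indexer = {
    "name": "docs-indexer",
    "dataSourceName": "docs-datasource",
    "targetIndexName": "docs-index",
    "skillsetName": "docs-skillset",
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/normalized_images/*/verbalizedImage",
            "targetFieldName": "imageDescriptions",
        }
    ],
}

print(json.dumps(indexer, indent=2))
```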

This architecture not only simplifies the enrichment process but also allows for powerful customizations by modifying the system and user prompts to meet specific business needs.  

 

High-Level Architecture Diagram  

 

 

You can now have images verbalized with the GenAI Prompt Skill by using the new multimodal RAG functionality in the “Import and vectorize data” wizard. Read about the full set of multimodal search capabilities in our blog post “From diagrams to dialogue: Introducing new multimodal functionality in Azure AI Search”. 

Responsible AI Practices for the GenAI Prompt Skill 

When leveraging the GenAI Prompt Skill, it’s crucial to adhere to Microsoft’s Responsible AI guidelines and the GenAI Prompt Skill best practices to ensure ethical, transparent, and fair use of AI capabilities. These guidelines help maintain user trust, uphold data privacy, and mitigate potential risks associated with AI-driven data transformations. Refer also to the Azure AI Search transparency note. 

 

Portal Experience: Logic Apps Integration 

The Logic Apps integration in the Azure AI Search portal offers a no-code experience for building ingestion pipelines tailored for RAG workflows. This integration introduces new connectors and a simplified wizard to minimize the time and complexity of preparing data for indexing. 

Key Features of Logic Apps Integration 

1. New connector support in the Azure AI Search portal and a text transformation pipeline: 

  • Choose from a wide range of connectors, including SharePoint Online, OneDrive, and Amazon S3, to ingest data directly into Azure AI Search. 
  • The wizard chains together key steps such as document cracking and parsing, text chunking, Azure OpenAI embedding creation, and sending both text and embeddings to an Azure AI Search index, automating the ingestion and vectorization process for a RAG-ready index. 

2. Incremental Refresh: 

  • Update indexes by polling connectors for delta changes. 
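The text-chunking step the wizard performs can be illustrated conceptually with a simple fixed-size splitter with overlap. This is only a sketch of the idea, not the wizard’s actual implementation; the real chunking parameters are configured in the portal.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character chunks with overlap,
    so context at chunk boundaries is preserved for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

sample = (
    "Azure AI Search ingests documents, chunks them, embeds each chunk, "
    "and indexes the results for retrieval. "
) * 5
print(len(chunk_text(sample)))
```

Each chunk would then be embedded (e.g., with an Azure OpenAI embedding model) and pushed to the index alongside its text.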

High-Level Differences: Azure AI Search Built-in Indexers vs. Logic Apps Integration 

Compared with the Azure AI Search built-in indexers, the Logic Apps integration via the Azure AI Search portal wizard differs in the following ways: 

  • Data Sources: Flexible connectors for multiple sources (SharePoint, OneDrive, Amazon S3, etc.). Besides the few-clicks workflows offered in the Azure AI Search portal wizard, you can create your own pipelines from all existing Logic Apps connectors and actions according to your scenario. 
  • Execution: Workflows run independently of the AI Search service. 
  • Transformation: Customizable workflows; extend templates for advanced pre-processing or image extraction. 
  • Monitoring: Track runs via the Azure Logic Apps portal (step-level insights) and optional Azure Monitor logs. 
  • Cost: Pay-as-you-go pricing based on Consumption workflow connectors and actions. 
  • Value: No-code setup with fully editable workflows for custom scenarios via the Azure Logic Apps designer. 

Cost and Running Environment Note

The Logic Apps Consumption workflows follow a pay-per-execution model, making them ideal for proof-of-concept projects. For production workloads, customers can replicate the same pipeline using Logic Apps Standard workflows for greater scalability and cost optimization. This is an early integration, so you’re welcome to share feedback on this blog post about further requirements and functionality. 

SharePoint Online Built-in Indexer (Preview) vs. Pushing Data to Azure AI Search via the Logic Apps Connector 

The SharePoint Online built-in indexer in Azure AI Search (currently in preview) is not planned for General Availability at this time and is recommended only for non-production environments and testing. Microsoft Copilot Studio is the recommended solution for working with SharePoint data in copilot scenarios (RAG applications), as it provides advanced enterprise capabilities out-of-the-box. 

For scenarios where data from SharePoint needs to be indexed in Azure AI Search going forward, despite the Microsoft Copilot Studio recommendation, customers can instead build custom indexing pipelines using Azure Logic Apps. Logic Apps offers extensive functionality to create pipelines with the required transformations. Each connector and action within Logic Apps has its own product lifecycle status (preview or general availability). Currently, the Azure Logic Apps AI Search connector and related functionality are in public preview. 
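For context, “pushing” data means the pipeline itself writes documents into the index through the documents index endpoint (POST {endpoint}/indexes/{index}/docs/index), rather than having a built-in indexer pull them. A minimal sketch of such a push batch follows; the index field names and document values are hypothetical.

```python
import json

# Sketch of the request body for the Azure AI Search "index documents"
# REST call. "@search.action" of "mergeOrUpload" inserts a document or
# merges it into an existing one. Field names and values are hypothetical.
push_batch = {
    "value": [
        {
            "@search.action": "mergeOrUpload",
            "id": "doc-1",
            "content": "First chunk of a SharePoint document.",
        },
        {
            "@search.action": "mergeOrUpload",
            "id": "doc-2",
            "content": "Second chunk of a SharePoint document.",
        },
    ]
}

print(json.dumps(push_batch, indent=2))
```

This is effectively what the Logic Apps AI Search connector submits on your behalf after the cracking, chunking, and embedding steps.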
  

Getting Started with Azure Logic Apps integration with Azure AI Search portal wizard 

Here’s a tutorial on how to use the new Logic Apps connectors and functionality via the Azure AI Search portal wizard. You can also find more information on this functionality in this Logic Apps blog post. 

 

What’s Next? 

  • Feedback: For feedback regarding the Logic Apps connectors, there is a feedback option directly in the Azure portal, so you can share specific improvement requests or suggestions. 

 

Updated May 28, 2025
Version 1.0