Embeddings

Stream a JSONL file and generate a vector embedding for the configured text attribute on each record. The vector is written back onto the record as an array of floats. Records that do not have the source attribute (or whose value is not a string) are passed through unchanged.
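The per-record behavior described above can be sketched roughly as follows. This is an illustration only, not the platform's actual implementation: `embed_records` and `fake_embed` are hypothetical names, and `fake_embed` is a deterministic stand-in for a call to a real provider's embeddings endpoint. Skipping blank lines is also an assumption.

```python
import json
from typing import Callable, Iterable, Iterator, List


def fake_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a provider call; returns a tiny fake vector."""
    return [float(len(text)), float(text.count(" "))]


def embed_records(
    lines: Iterable[str],
    source_attr: str,
    output_attr: str,
    embed: Callable[[str], List[float]],
) -> Iterator[str]:
    """Yield one JSONL line per input record, adding a vector where possible."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        record = json.loads(line)
        value = record.get(source_attr) if isinstance(record, dict) else None
        if isinstance(value, str):
            # Source attribute is a string: write the vector onto the record.
            record[output_attr] = embed(value)
            yield json.dumps(record)
        else:
            # Missing or non-string source attribute: pass through unchanged.
            yield line


if __name__ == "__main__":
    sample = [
        '{"id": 1, "text": "hello world"}',
        '{"id": 2}',
        '{"id": 3, "text": 42}',
    ]
    for out_line in embed_records(sample, "text", "vector", fake_embed):
        print(out_line)
```

Because the function is a generator over lines, records are handled one at a time rather than loading the whole file, which matches the streaming behavior the description implies.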

Provider-agnostic: works with any installed AI application that exposes an embeddings endpoint (OpenAI, Mistral, Anthropic Voyage, Gemini, Cohere, Scaleway AI). For Mistral-specific embeddings, see AI::MistralEmbeddings; for OpenAI-specific embeddings, see AI::OpenAIEmbeddings.

Prerequisite: Install an AI provider application from Profile > {Organization} > Applications.

Parameters

Provider (REQUIRED)

Configured AI application.

Model

Embedding model identifier from the selected provider (e.g. text-embedding-3-small, mistral-embed, voyage-3-large). Defaults to the provider's recommended embedding model when left empty.

Source Attribute (REQUIRED)

Field on each incoming record whose string value will be vectorized.

Output Attribute (REQUIRED)

Field where the resulting embedding vector (array of floats) will be stored.

Input

File (REQUIRED)

JSONL file with one record per line.

Output

File

JSONL file where each record has the embedding vector written under the configured output attribute. Records without the source attribute, or whose source value is not a string, are passed through unchanged.
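As a quick sanity check on the output file, a script along these lines can count how many records received a vector and how many were passed through. It is a sketch under assumptions: `summarize_output` and `is_vector` are hypothetical helpers, not part of the product, and the attribute name `vector` in the usage example stands in for whatever Output Attribute was configured.

```python
import json
from typing import Iterable, Set, Tuple


def is_vector(value: object) -> bool:
    """True if value looks like a non-empty array of numbers (bools excluded)."""
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in value
        )
    )


def summarize_output(
    lines: Iterable[str], output_attr: str
) -> Tuple[int, int, Set[int]]:
    """Return (embedded_count, passed_through_count, distinct_vector_lengths)."""
    embedded = 0
    passed_through = 0
    dimensions: Set[int] = set()
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines are ignored
        record = json.loads(line)
        value = record.get(output_attr) if isinstance(record, dict) else None
        if is_vector(value):
            embedded += 1
            dimensions.add(len(value))
        else:
            passed_through += 1
    return embedded, passed_through, dimensions


if __name__ == "__main__":
    output_lines = [
        '{"id": 1, "text": "hello", "vector": [0.1, 0.2, 0.3]}',
        '{"id": 2}',
    ]
    print(summarize_output(output_lines, "vector"))
```

If every record was embedded with the same model, one would normally expect a single vector length in the returned set. More than one value may indicate that the file mixes outputs from different models.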