Version: 2.31-unstable

OracleDocumentStore


API reference: Oracle
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/oracle

OracleDocumentStore is a Document Store backed by Oracle AI Vector Search, available in Oracle Database 23ai and later. It stores documents alongside dense vector embeddings in a native VECTOR column. It supports both vector similarity search and keyword search, the latter through an automatically managed DBMS_SEARCH index.

Installation

shell

pip install oracle-haystack

Connection

OracleDocumentStore connects to Oracle using the OracleConnectionConfig dataclass, which supports two connection modes:

Thin mode (default): connects directly over TCP. No Oracle Instant Client required.
Thick mode: activated automatically when wallet_location is provided. Used for Oracle Autonomous Database (ADB-S) connections.

Set the connection parameters as environment variables:

shell

export ORACLE_USER="haystack"
export ORACLE_PASSWORD="secret"
export ORACLE_DSN="localhost:1521/freepdb1"
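
The initialization code below reads these values with Secret.from_env_var, so a missing variable typically surfaces only when the secret is resolved. A minimal pre-flight sketch, using only the standard library, can check for them up front. The helper name `missing_env_vars` is hypothetical and not part of the integration:

```python
import os

# The three variables exported above.
REQUIRED_VARS = ("ORACLE_USER", "ORACLE_PASSWORD", "ORACLE_DSN")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Oracle connection variables are set.")
```

Running the check before constructing the store turns a late connection failure into an early, readable message.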

Initialization

python

from haystack.utils import Secret
from haystack_integrations.document_stores.oracle import (
    OracleDocumentStore,
    OracleConnectionConfig,
)

document_store = OracleDocumentStore(
    connection_config=OracleConnectionConfig(
        user=Secret.from_env_var("ORACLE_USER"),
        password=Secret.from_env_var("ORACLE_PASSWORD"),
        dsn=Secret.from_env_var("ORACLE_DSN"),
    ),
    embedding_dim=768,
)

To learn more about the initialization parameters, see the API docs.
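
The embedding_dim value presumably fixes the size of the VECTOR column, so it should match the length of the vectors your embedding model produces. The sketch below, assuming embeddings are plain lists of floats, flags mismatches before anything is written. The function name `check_embedding_dims` is hypothetical and not part of the integration:

```python
def check_embedding_dims(embeddings, expected_dim):
    """Return the indices of embeddings that are missing or have the wrong length.

    Hypothetical helper for illustration; `embeddings` is a list of
    float lists (or None for documents that were never embedded).
    """
    return [
        index
        for index, embedding in enumerate(embeddings)
        if embedding is None or len(embedding) != expected_dim
    ]


# Toy data: the second vector is too short, the third is missing.
sample = [[0.1, 0.2, 0.3], [0.1, 0.2], None]
bad_indices = check_embedding_dims(sample, expected_dim=3)
print(bad_indices)
```

With Haystack documents, the input could be built as `[doc.embedding for doc in docs]` before calling `write_documents`.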

Connecting to Oracle Autonomous Database

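For Oracle Autonomous Database (ADB-S), provide a wallet for authentication. The store automatically activates thick mode when wallet_location is set. The example reads the wallet password from a WALLET_PASSWORD environment variable, which must be set in addition to the three connection variables: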

python

document_store = OracleDocumentStore(
    connection_config=OracleConnectionConfig(
        user=Secret.from_env_var("ORACLE_USER"),
        password=Secret.from_env_var("ORACLE_PASSWORD"),
        dsn=Secret.from_env_var("ORACLE_DSN"),
        wallet_location="/path/to/wallet",
        wallet_password=Secret.from_env_var("WALLET_PASSWORD"),
    ),
    embedding_dim=1536,
)

HNSW Vector Index

By default, the store performs exact vector search. To enable approximate nearest-neighbor search, which is faster on large datasets, set create_index=True so that the store creates an HNSW index:

python

document_store = OracleDocumentStore(
    connection_config=OracleConnectionConfig(
        user=Secret.from_env_var("ORACLE_USER"),
        password=Secret.from_env_var("ORACLE_PASSWORD"),
        dsn=Secret.from_env_var("ORACLE_DSN"),
    ),
    embedding_dim=768,
    distance_metric="COSINE",
    create_index=True,  # creates the HNSW index on startup
    hnsw_neighbors=32,
    hnsw_ef_construction=200,
    hnsw_accuracy=95,
)
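
To make the distinction concrete: exact search compares the query against every stored vector, so its cost grows linearly with the number of documents, while an HNSW index trades a little recall for speed. The pure-Python sketch below illustrates only the exact, brute-force variant with cosine distance. It is a conceptual illustration, not how Oracle or the integration implements search, and the function names are hypothetical:

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity. Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_search(query, vectors, top_k=3):
    """Score every stored vector against the query and return the
    indices of the top_k closest ones, nearest first."""
    scored = [(cosine_distance(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort()
    return [idx for _, idx in scored[:top_k]]


# Toy data: three 2-dimensional vectors.
stored = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
nearest = exact_search([1.0, 0.0], stored, top_k=2)
print(nearest)
```

An approximate index avoids the full scan in `exact_search` by visiting only a subset of candidates, which is why it may occasionally miss a true nearest neighbor.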

Supported Retrievers

OracleEmbeddingRetriever: Retrieves documents from OracleDocumentStore based on vector similarity to a query embedding.
OracleKeywordRetriever: Retrieves documents matching a keyword query using Oracle's DBMS_SEARCH full-text index.

Example: RAG pipeline

python

from haystack import Document, Pipeline
from haystack.document_stores.types import DuplicatePolicy
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

from haystack_integrations.document_stores.oracle import (
    OracleDocumentStore,
    OracleConnectionConfig,
)
from haystack_integrations.components.retrievers.oracle import OracleEmbeddingRetriever

document_store = OracleDocumentStore(
    connection_config=OracleConnectionConfig(
        user=Secret.from_env_var("ORACLE_USER"),
        password=Secret.from_env_var("ORACLE_PASSWORD"),
        dsn=Secret.from_env_var("ORACLE_DSN"),
    ),
    embedding_dim=384,  # all-MiniLM-L6-v2 produces 384-dimensional embeddings
)

# Index documents
documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(
        content="Elephants have been observed to behave in a way that indicates a high level of self-awareness.",
    ),
    Document(
        content="In certain places, you can witness the phenomenon of bioluminescent waves.",
    ),
]

doc_embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
)
doc_embedder.warm_up()
embedded_docs = doc_embedder.run(documents)["documents"]
document_store.write_documents(embedded_docs, policy=DuplicatePolicy.OVERWRITE)

# Build a RAG pipeline
template = [
    ChatMessage.from_user(
        """
        Given the following context, answer the question.
        Context: {% for doc in documents %}{{ doc.content }} {% endfor %}
        Question: {{ query }}
        """,
    ),
]

pipeline = Pipeline()
pipeline.add_component(
    "embedder",
    SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
)
pipeline.add_component(
    "retriever",
    OracleEmbeddingRetriever(document_store=document_store, top_k=3),
)
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component(
    "llm",
    OpenAIChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")),
)

pipeline.connect("embedder.embedding", "retriever.query_embedding")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run(
    {
        "embedder": {"text": "How many languages are there?"},
        "prompt_builder": {"query": "How many languages are there?"},
    },
)

print(result["llm"]["replies"][0].text)
