Docling on IBM Power (ppc64le & ppc64 AIX) — Document Extraction & RAG Preparation | LibrePower
{ } [ ]
from docling.document_converter import DocumentConverter converter = DocumentConverter() result = converter.convert(pdf) md = result.document .export_to_markdown()
# ppc64le & ppc64 native $ pip install docling $ docling my-report.pdf --to markdown --to json # Output: structured data ✓
Live ppc64le & ppc64 (AIX)
🦆
Docling

Docling
on Power

Document extraction and RAG preparation for IBM Power infrastructure.
PDF, DOCX, PPTX → structured chunks. Native on ppc64le & ppc64.

PDF parsing Table extraction RAG-ready chunks Markdown & JSON LLM search via API

IBM's document engine, now running on Power.

Extract and chunk any document for your RAG pipelines — on-premises extraction, LLM-powered search via API.

How it works

Documents in. Structured chunks out.

Docling parses page layouts, extracts tables, detects formulas, and outputs clean Markdown, JSON, or HTML — ready for your RAG pipeline. Retrieval and search are powered by LLMs via API.

Docling document processing pipeline — from PDF input through layout analysis, table structure recognition, and OCR to structured output in Markdown and JSON

Docling processing pipeline — from raw documents to structured, RAG-ready chunks.

Input & Output

Every format. One pipeline.

PDF
DOCX
PPTX
XLSX
HTML
PNG / JPEG
🦆 Docling
JSON
Markdown
HTML
Plain text
Scroll to explore the pipeline
Feed Docling any document format and get structured, machine-readable output.
Parse complex PDF layouts — tables, headers, reading order, formulas, and more.
Convert Office files — Word documents and PowerPoint presentations.
Process spreadsheets and web pages into structured data.
Handle images with OCR for text extraction from scanned documents.
Export to JSON, Markdown, or HTML — ready for RAG chunking.
Use plain text for simple extraction and downstream processing.
Structured chunks
On Power
01 — Document extraction

Advanced document extraction

Layout analysis, reading order detection, table structure recognition, and formula extraction — all running natively on Power without GPU requirements.

Layout analysis Reading order Table structure Formulas → LaTeX
02 — RAG preparation

RAG-ready chunking

Docling partitions documents into optimized chunks with preserved reading order and semantic boundaries — ready to feed into your vector databases and retrieval pipelines.

LangChain LlamaIndex CrewAI Haystack
03 — LLM-powered search

Search via LLM APIs

Document extraction and chunking run locally on Power. For retrieval and search, connect to any LLM provider via API — OpenAI, watsonx, or your preferred service. Hybrid architecture: local processing, cloud intelligence.

watsonx OpenAI API Any LLM Hybrid setup
04 — Integrations

AI framework integrations

Plug-and-play connectors for LangChain, LlamaIndex, CrewAI, and Haystack. Use the Python SDK or CLI to integrate document extraction into your existing workflows.

Docling Serve Python SDK CLI REST API
Platform support

Native on ppc64le & ppc64 (AIX)

LibrePower's port of Docling brings document extraction and RAG preparation to IBM Power Systems — Linux (ppc64le) and AIX (ppc64). No x86 emulation. No GPU required.

Linux ppc64le
AIX ppc64
quickstart.py
# Install on ppc64le or ppc64 (AIX)
# pip install docling

from docling.document_converter import DocumentConverter

source = "quarterly-report.pdf"
converter = DocumentConverter()
result = converter.convert(source)

# Export to Markdown for RAG
md = result.document.export_to_markdown()

# Or export to JSON for processing
json_data = result.document.export_to_dict()
Get started

Bring document extraction to your Power infrastructure

Deploy Docling on IBM Power today. Extract and chunk your documents for RAG — on-premises processing, LLM-powered search via API.