Docling on IBM Power (ppc64le & ppc64 AIX) — Document Extraction & RAG Preparation | LibrePower

from docling.document_converter import DocumentConverter converter = DocumentConverter() result = converter.convert(pdf) md = result.document .export_to_markdown()

# ppc64le & ppc64 native $ pip install docling $ docling my-report.pdf --to markdown --to json # Output: structured data ✓

Live — ppc64le & ppc64 (AIX)

🦆

Docling

Docling
on Power

Name: Docling on Power
Author: LibrePower

Document extraction and RAG preparation for IBM Power infrastructure.
PDF, DOCX, PPTX → structured chunks. Native on ppc64le & ppc64.

PDF parsing Table extraction RAG-ready chunks Markdown & JSON LLM search via API

How it works

Documents in. Structured chunks out.

Docling parses page layouts, extracts tables, detects formulas, and outputs clean Markdown, JSON, or HTML — ready for your RAG pipeline. Retrieval and search are powered by LLMs via API.

Docling document processing pipeline — from PDF input through layout analysis, table structure recognition, and OCR to structured output in Markdown and JSON

Docling processing pipeline — from raw documents to structured, RAG-ready chunks.

Input & Output

Every format. One pipeline.

PDF

DOCX

PPTX

XLSX

HTML

PNG / JPEG

🦆 Docling

JSON

Markdown

HTML

Plain text

Scroll to explore the pipeline

Feed Docling any document format and get structured, machine-readable output.

Parse complex PDF layouts — tables, headers, reading order, formulas, and more.

Convert Office files — Word documents and PowerPoint presentations.

Process spreadsheets and web pages into structured data.

Handle images with OCR for text extraction from scanned documents.

Export to JSON, Markdown, or HTML — ready for RAG chunking.

Use plain text for simple extraction and downstream processing.

Structured chunks

On Power

01 — Document extraction

Advanced document extraction

Layout analysis, reading order detection, table structure recognition, and formula extraction — all running natively on Power without GPU requirements.

Layout analysis Reading order Table structure Formulas → LaTeX

02 — RAG preparation

RAG-ready chunking

Docling partitions documents into optimized chunks with preserved reading order and semantic boundaries — ready to feed into your vector databases and retrieval pipelines.

LangChain LlamaIndex CrewAI Haystack

03 — LLM-powered search

Search via LLM APIs

Document extraction and chunking run locally on Power. For retrieval and search, connect to any LLM provider via API — OpenAI, watsonx, or your preferred service. Hybrid architecture: local processing, cloud intelligence.

watsonx OpenAI API Any LLM Hybrid setup

04 — Integrations

AI framework integrations

Plug-and-play connectors for LangChain, LlamaIndex, CrewAI, and Haystack. Use the Python SDK or CLI to integrate document extraction into your existing workflows.

Docling Serve Python SDK CLI REST API

Platform support

Native on ppc64le & ppc64 (AIX)

LibrePower's port of Docling brings document extraction and RAG preparation to IBM Power Systems — Linux (ppc64le) and AIX (ppc64). No x86 emulation. No GPU required.

Linux ppc64le

AIX ppc64

quickstart.py

# Install on ppc64le or ppc64 (AIX)
# pip install docling

from docling.document_converter import DocumentConverter

source = "quarterly-report.pdf"
converter = DocumentConverter()
result = converter.convert(source)

# Export to Markdown for RAG
md = result.document.export_to_markdown()

# Or export to JSON for processing
json_data = result.document.export_to_dict()

Get started

Bring document extraction to your Power infrastructure

Deploy Docling on IBM Power today. Extract and chunk your documents for RAG — on-premises processing, LLM-powered search via API.

Request access View source on GitLab