PdfContentChunker - Documentation (ODC)

Stable version 0.1.0 (Compatible with ODC)

Uploaded on 03 October 2025 by OutSystems Labs

Documentation

Actions

1. ExtractTextFromPDF

Parameters: pdfBinary, normalizeWhitespace, collapseRepeatedNewlines, includePageNumberPrefix, maxTotalChars, collectLogs, attachFilesLogs.

Output: Text + logsZipFile.

2. ChunkPlainText

Parameters: text, chunkSizeChars, overlapSizeChars, normalizeWhitespace, estimateTokens, maxTotalChars, collectLogs, attachFilesLogs.

Output: Chunk[] + stats + logsZipFile.
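
How chunkSizeChars and overlapSizeChars interact can be pictured as a sliding window over the text. The Python sketch below is illustrative only: the component's internal algorithm is not described here, so the windowing rule, the Chunk fields (index, start, text, est_tokens) and the 4-characters-per-token estimate are all assumptions, not the component's actual behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    index: int       # position of the chunk in the output list
    start: int       # offset of the chunk's first character in the source text
    text: str        # chunk content
    est_tokens: int  # rough estimate; ASSUMES about 4 characters per token


def chunk_plain_text(text: str, chunk_size_chars: int,
                     overlap_size_chars: int) -> List[Chunk]:
    """Split text into fixed-size character windows that share an overlap.

    Hypothetical stand-in for ChunkPlainText; names mirror the documented
    parameters, but the logic is an assumption.
    """
    if chunk_size_chars <= 0:
        raise ValueError("chunk_size_chars must be positive")
    if not 0 <= overlap_size_chars < chunk_size_chars:
        raise ValueError("overlap_size_chars must be in [0, chunk_size_chars)")

    # Each new window starts this many characters after the previous one.
    step = chunk_size_chars - overlap_size_chars
    chunks: List[Chunk] = []
    start = 0
    while start < len(text):
        piece = text[start:start + chunk_size_chars]
        chunks.append(Chunk(len(chunks), start, piece, max(1, len(piece) // 4)))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size_chars >= len(text):
            break
        start += step
    return chunks
```

Under these assumptions, a 10-character text with a chunk size of 4 and an overlap of 1 produces three chunks starting at offsets 0, 3 and 6, with each chunk repeating the last character of the previous one.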

3. ExtractAndChunkPDF

One-step pipeline (extraction + chunking). Returns Chunk[] + stats + logsZipFile.

Add the ExtractAndChunkPDF action to a client action or a workflow, then loop over the Chunk output to create an embedding for each chunk and store it in your vector database.
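
The loop over the Chunk output can be sketched outside ODC as follows. This is a Python illustration of the pattern only, not ODC logic: fake_embed is a placeholder that derives numbers from a hash (replace it with a call to a real embedding model), and InMemoryVectorStore stands in for whatever vector database you use. Both names, and index_chunks, are hypothetical.

```python
import hashlib
from typing import Dict, List


def fake_embed(text: str, dims: int = 8) -> List[float]:
    """Placeholder embedding: deterministic values from a SHA-256 digest.

    NOT a real embedding; swap in your embedding model's API call.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dims]]


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database table or collection."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def add(self, doc_id: str, chunk_index: int, text: str,
            vector: List[float]) -> None:
        self.rows.append({
            "doc_id": doc_id,
            "chunk_index": chunk_index,
            "text": text,
            "vector": vector,
        })


def index_chunks(doc_id: str, chunk_texts: List[str],
                 store: InMemoryVectorStore) -> int:
    """Embed each chunk and store it; returns the number of stored chunks."""
    stored = 0
    for i, chunk_text in enumerate(chunk_texts):
        if not chunk_text.strip():
            continue  # skip whitespace-only chunks, nothing useful to embed
        store.add(doc_id, i, chunk_text, fake_embed(chunk_text))
        stored += 1
    return stored
```

Storing the chunk index and document identifier next to each vector keeps retrieved results traceable to their position in the source PDF.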
