PdfContentChunker

PdfContentChunker (ODC)

Stable version 0.1.0 (Compatible with ODC)
Uploaded on 03 October 2025 by OutSystems Labs
Documentation
0.1.0

Actions


1. ExtractTextFromPDF

Parameters: pdfBinary, normalizeWhitespace, collapseRepeatedNewlines, includePageNumberPrefix, maxTotalChars, collectLogs, attachFilesLogs.

Output: Text + logsZipFile.


2. ChunkPlainText

Parameters: text, chunkSizeChars, overlapSizeChars, normalizeWhitespace, estimateTokens, maxTotalChars, collectLogs, attachFilesLogs.

Output: Chunk[] + stats + logsZipFile.


3. ExtractAndChunkPDF

One-step pipeline (extraction + chunking). Returns Chunk[] + stats + logsZipFile.
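To illustrate how chunkSizeChars and overlapSizeChars typically interact, here is a minimal Python sketch of fixed-size character chunking with overlap. It is not the component's implementation, which is not documented here. The function name chunk_plain_text and the fields index, start, end and text are assumptions made for illustration only.

```python
def chunk_plain_text(text: str, chunk_size_chars: int, overlap_size_chars: int) -> list[dict]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: the real ChunkPlainText action may split differently
    (for example on sentence or page boundaries).
    """
    if chunk_size_chars <= 0:
        raise ValueError("chunk_size_chars must be positive")
    if not 0 <= overlap_size_chars < chunk_size_chars:
        raise ValueError("overlap_size_chars must be >= 0 and smaller than chunk_size_chars")

    # Each new window starts this many characters after the previous one.
    step = chunk_size_chars - overlap_size_chars

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size_chars, len(text))
        chunks.append({
            "index": len(chunks),   # hypothetical field names
            "start": start,
            "end": end,
            "text": text[start:end],
        })
        if end == len(text):
            break  # the last window reached the end of the text
        start += step
    return chunks
```

With a chunk size of 4 and an overlap of 1, each chunk repeats the last character of the previous one, so content that straddles a boundary stays retrievable from at least one chunk. A larger overlap improves that continuity at the cost of more chunks to embed and store.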



Add the ExtractAndChunkPDF action to a client action or a workflow, then loop over the Chunk output to create an embedding for each chunk and store it in your vector database.
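In an ODC app this loop is built visually, but its shape can be sketched in Python. Everything below is a stand-in: the Chunk fields, toy_embed (a placeholder for a real embedding model call) and the in-memory list used as a vector store are all assumptions, not part of this component.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chunk:
    # Hypothetical shape; the component's actual Chunk structure may differ.
    index: int
    text: str


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Placeholder embedding: normalized character-bucket counts.

    A real app would call an embedding model here instead.
    """
    vec = [0.0] * dims
    for ch in text:
        vec[ord(ch) % dims] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def index_chunks(chunks: list[Chunk],
                 embed: Callable[[str], list[float]] = toy_embed) -> list[dict]:
    """Loop over chunks, embed each one, and collect records for a vector store."""
    store = []  # stand-in for a vector database
    for chunk in chunks:
        if not chunk.text.strip():
            continue  # skip empty or whitespace-only chunks
        store.append({
            "id": chunk.index,
            "vector": embed(chunk.text),
            "text": chunk.text,  # keep the source text for retrieval results
        })
    return store
```

Storing the chunk text alongside its vector lets a later similarity search return readable passages without a second lookup.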