Architecture
This component ships as two Forge assets, and that split is structural, not optional. ODC Libraries cannot access AI Gateway, so a Library has no path to an LLM call. The stateless computation therefore lives in Agentic Chunking Library, and the two model calls live here in Agentic Chunking ODC.
The entry point is a single Service Action: ChunkTextAgentic. Five steps run in sequence inside it.
PreChunkForExtraction: batches input chunks into prompt-safe sizes within your configured token budget.
CallExtractionModel: sends each batch to Claude 3.7 Sonnet via AI Gateway at Temperature=0. Returns a flat JSON array of atomic, self-contained propositions.
ParsePropositions: strips markdown fences, normalises the raw model response, and accumulates propositions across batches.
CallGroupingModel: sends the full proposition list to a second model call at Temperature=0. Groups propositions by subject domain regardless of original position.
NormaliseAgenticOutput: maps the grouping response into typed AgenticChunk structs.
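The five steps above can be sketched in Python to make the data flow concrete. This is a minimal illustration, not the component's implementation: the real steps are ODC actions, the two model calls are passed in here as plain callables, the token estimate (roughly four characters per token) is an assumption, and the grouping response shape (a list of objects with `category` and `propositions` keys) and the `AgenticChunk` field names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example


@dataclass
class AgenticChunk:
    # Field names are illustrative; the real struct is defined in the Library.
    document_id: str
    category: str
    propositions: List[str]


def estimate_tokens(text: str) -> int:
    # Assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def pre_chunk_for_extraction(texts: List[str], max_tokens_per_batch: int) -> List[str]:
    """Step 1: greedily pack texts into batches that aim to fit the token budget."""
    batches, current, used = [], [], 0
    for text in texts:
        cost = estimate_tokens(text)
        if current and used + cost > max_tokens_per_batch:
            batches.append("\n\n".join(current))
            current, used = [], 0
        # A single text larger than the budget still becomes its own batch.
        current.append(text)
        used += cost
    if current:
        batches.append("\n\n".join(current))
    return batches


def strip_fences(raw: str) -> str:
    """Remove a surrounding markdown fence (with optional language tag), if present."""
    lines = raw.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines).strip()


def chunk_text_agentic(
    texts: List[str],
    document_id: str,
    max_tokens_per_batch: int,
    call_extraction_model: Callable[[str], str],
    call_grouping_model: Callable[[List[str]], str],
) -> List[AgenticChunk]:
    propositions: List[str] = []
    for batch in pre_chunk_for_extraction(texts, max_tokens_per_batch):
        raw = call_extraction_model(batch)                   # step 2
        propositions.extend(json.loads(strip_fences(raw)))   # step 3
    grouped = json.loads(strip_fences(call_grouping_model(propositions)))  # step 4
    return [                                                 # step 5
        AgenticChunk(document_id, group["category"], group["propositions"])
        for group in grouped
    ]
```

Passing the model calls in as callables keeps the stateless parts (batching, fence stripping, normalisation) testable without a gateway connection, which mirrors the Library/App split described above.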
Installation
Step 1: Install Agentic Chunking ODC. Install this app from the Forge. Agentic Chunking Library installs automatically as a dependency. It provides the three stateless actions the pipeline calls internally, plus the AgenticChunk and AgenticResponse output types your consuming app will work with.
Step 2: Configure your AI Gateway connection. The pipeline calls Claude 3.7 Sonnet via ODC AI Gateway. In your ODC Portal, ensure an AI Gateway connection to Claude 3.7 Sonnet is active for the stage you're deploying to. The app references this connection through the TrialClaude3_7Sonnet AI model provider. No API keys are handled directly in the app.
Step 3: Verify site properties. Open the app's site properties and confirm the six configuration values match your environment. The defaults are functional, but review Temperature: the pipeline is designed for Temperature=0, while the property ships with a default of 1, so set it to 0 for deterministic runs.
Step 4: Run the test runner. Navigate to the Chunk Processing Test Runner screen. Select any test case and hit Run Scenario. If the pipeline completes and returns chunks, your AI Gateway connection is working correctly.
Using ChunkTextAgentic
Reference ChunkTextAgentic as a Service Action from your consuming app.
Inputs
InputChunks (AgenticChunkInput List): The chunks to process. Each carries a ChunkId and Text.
DocumentId (Text): Identifier stamped onto every output chunk. Use your document's natural key.
MaxTokensPerBatch (Integer): Token target for batching input before extraction. 2000 is a safe starting point.
Outputs
Response (AgenticResponse): The full normalised output including all chunks, totals, and success state.
Result (Result): Execution status. Check IsSuccess before consuming Response.
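In a consuming app the call is made in ODC's visual logic, so there is no code to copy. The Python below only mirrors the shape of the contract: build the input list, call the action, and check IsSuccess before touching Response. The `chunk_text_agentic` function is a stand-in so the example runs on its own, and the `Chunks` field, the `ChunkId` type, and the sample document key are assumptions; only the input, output, and IsSuccess names come from the lists above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AgenticChunkInput:
    ChunkId: str  # type assumed; the source only names the field
    Text: str


@dataclass
class Result:
    IsSuccess: bool


@dataclass
class AgenticResponse:
    # Hypothetical field; the real struct also carries totals and a success state.
    Chunks: List[str] = field(default_factory=list)


def chunk_text_agentic(
    input_chunks: List[AgenticChunkInput],
    document_id: str,
    max_tokens_per_batch: int,
) -> Tuple[AgenticResponse, Result]:
    """Stand-in for the Service Action: echoes each input text back as one chunk."""
    return AgenticResponse([c.Text for c in input_chunks]), Result(IsSuccess=True)


def consume(response: AgenticResponse, result: Result) -> List[str]:
    """Read Response only when the call reports success."""
    if not result.IsSuccess:
        return []
    return response.Chunks


inputs = [
    AgenticChunkInput("c1", "First chunk."),
    AgenticChunkInput("c2", "Second chunk."),
]
response, result = chunk_text_agentic(
    inputs, document_id="INV-2024-001", max_tokens_per_batch=2000
)
chunks = consume(response, result)
```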
Configuration
Six site properties control pipeline behaviour. All have working defaults.
ExtractionSystemPrompt (default: full prompt): System prompt for the extraction call. Encodes seven rules: one fact per proposition, no pronouns, self-contained, no invention, drop structure noise, deduplicate, JSON output only.
ExtractionUserMessageTemplate (default: "Extract propositions from the following text:"): Prefix prepended to each batch before the extraction call.
GroupingSystemPrompt (default: full prompt): System prompt for the grouping call. Encodes seven rules: theme not position, category label, no duplication, no invention, minimum size, ambiguous assignment, JSON output only.
GroupingUserMessageTemplate (default: "Group the following propositions by theme:"): Prefix prepended to the proposition array before the grouping call.
MaxTokens (default: 4000): Maximum token budget for both model calls.
Temperature (default: 1): Sampling temperature. Set to 0 for deterministic runs.
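As a rough picture of how these six properties fit together, the sketch below models them as a Python dataclass with the defaults listed above and shows how a user message might be assembled from a template. The two system prompts are left as placeholders because their full text is not reproduced here, and the blank-line separator and JSON serialisation of the proposition array are assumptions rather than documented behaviour.

```python
import json
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class SiteProperties:
    # Placeholders: the shipped defaults are full prompts, not shown here.
    ExtractionSystemPrompt: str = "<default extraction prompt>"
    ExtractionUserMessageTemplate: str = "Extract propositions from the following text:"
    GroupingSystemPrompt: str = "<default grouping prompt>"
    GroupingUserMessageTemplate: str = "Group the following propositions by theme:"
    MaxTokens: int = 4000
    Temperature: float = 1.0  # listed default; set to 0 for deterministic runs


def extraction_user_message(props: SiteProperties, batch_text: str) -> str:
    # Assumption: template and batch are separated by a blank line.
    return f"{props.ExtractionUserMessageTemplate}\n\n{batch_text}"


def grouping_user_message(props: SiteProperties, propositions: List[str]) -> str:
    # Assumption: the proposition array is serialised as JSON.
    return f"{props.GroupingUserMessageTemplate}\n\n{json.dumps(propositions)}"


# Override only Temperature, keeping every other default.
deterministic = replace(SiteProperties(), Temperature=0.0)
```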
The prompts are the most important configuration surface. They encode the substance rules that held across all 32 test cases. If you modify them, re-run the test suite before deploying.
The Test Runner
The app includes a built-in Chunk Processing Test Runner screen with 32 bundled test cases organised into four categories.
Extraction Phase: cases that test proposition extraction behaviour (compound sentence splitting, pronoun resolution, deduplication, and output format compliance).
Grouping Phase: cases that test thematic reassembly (cross-chunk consolidation, domain separation, and sub-theme granularity).
Edge Cases: boundary conditions (empty input, single sentences, single-domain documents, very short chunks).
Real Pipeline Integration: TC-031 and TC-032, the two end-to-end cases.
To use the test runner, select a test case and hit Run Scenario.
Results are cached per test case. Switching to a different case and back reloads the saved result without re-running. Hit Reset Demo in the top bar to clear all cached results and start fresh.
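The caching behaviour described above can be pictured with a small Python sketch. It is an illustration of the described behaviour only: the class and method names are hypothetical, and the actual screen is built in ODC, not Python.

```python
from typing import Callable, Dict, Optional


class ResultCache:
    """Keeps one saved result per test case until the demo is reset."""

    def __init__(self, run_scenario: Callable[[str], str]) -> None:
        self._run_scenario = run_scenario
        self._results: Dict[str, str] = {}

    def run(self, case_id: str) -> str:
        """Run Scenario: execute the case and save its result."""
        self._results[case_id] = self._run_scenario(case_id)
        return self._results[case_id]

    def select(self, case_id: str) -> Optional[str]:
        """Switching to a case reloads its saved result without re-running it."""
        return self._results.get(case_id)

    def reset_demo(self) -> None:
        """Reset Demo: clear every cached result."""
        self._results.clear()
```

Selecting a case never triggers a model call in this sketch; only an explicit run does, which is why switching between cases is cheap.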
TC-031 and TC-032 are the headline cases. TC-031 was hand-authored to test cross-chunk consolidation and mixed-domain splitting. TC-032 is built from a real Level 4 run on a six-domain document: nine input chunks with domain shifts mid-chunk and duplicated boundary sentences from overlap. Run both before drawing conclusions about behaviour on your own content.
When to Use It
Reach for Level 5 after Level 4, specifically when retrieval quality is failing because chunks still contain mixed facts, duplicated overlap, unresolved pronouns, or scattered related content that belongs together.
If Level 4 is giving you clean boundaries already, stay there. The cost difference is real and the judgment layer only earns its keep when there's actual judgment to do.