The library focuses on flexible header detection and robust data extraction, making it suitable for Excel files with:
- Variable header positions
- Unknown header row numbers
- Multiple possible header naming conventions
- Semi-structured or user-generated Excel templates

It supports:
- Automatic header detection
- Header detection using explicit cell location
- Header detection using expected column names
- Advanced matching using a minimum number of required header matches
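
To make the idea of matching on expected column names with a minimum match count more concrete, here is a minimal sketch in plain Python. It is not this library's API: the function names `normalize` and `find_header_row`, the `expected` alias mapping, and the `min_matches` parameter are all hypothetical, and the sheet is assumed to be already loaded as a list of rows.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def normalize(value: object) -> str:
    """Lower-case and collapse whitespace so 'Order  ID ' matches 'order id'."""
    return " ".join(str(value).split()).lower() if value is not None else ""


def find_header_row(
    rows: Sequence[Sequence[object]],
    expected: Dict[str, List[str]],
    min_matches: int,
) -> Optional[Tuple[int, Dict[str, int]]]:
    """Hypothetical helper: return (row_index, {canonical_name: column_index})
    for the first row matching at least `min_matches` expected headers,
    or None if no row qualifies."""
    # Map every normalized alias (and the canonical name itself) to its canonical name.
    alias_to_name = {
        normalize(alias): name
        for name, aliases in expected.items()
        for alias in [name, *aliases]
    }
    for row_index, row in enumerate(rows):
        matched: Dict[str, int] = {}
        for col_index, cell in enumerate(row):
            name = alias_to_name.get(normalize(cell))
            # Keep the first column seen for each canonical name.
            if name is not None and name not in matched:
                matched[name] = col_index
        if len(matched) >= min_matches:
            return row_index, matched
    return None


# Illustrative sheet: a title row, a blank row, then the real header row.
rows = [
    ["Quarterly report", None, None],
    [None, None, None],
    ["Order No.", "Customer", "Amount"],
    [1001, "Acme", 250.0],
]
expected = {
    "order_id": ["Order No.", "Order ID"],
    "customer": ["Customer"],
    "total": ["Total"],
}
result = find_header_row(rows, expected, min_matches=2)
print(result)
```

In this sketch the third row is accepted because two of the three expected headers are found. Raising `min_matches` to 3 would reject it, which is the trade-off a minimum-match threshold controls: tolerance for missing or renamed columns versus the risk of picking the wrong row.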

The output includes:
- JSON representation of Excel data
- Concatenated row strings for free-text processing
- Header row metadata
- Matched header information for traceability
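
As a rough illustration of how these four pieces could fit together, the sketch below assembles them from rows that have already been read and a header row that has already been located. The function `build_output` and the keys `data`, `row_strings`, `header_row`, and `matched_headers` are assumptions made for this example and may not match the library's actual output format.

```python
import json
from typing import Any, Dict, List, Sequence


def build_output(
    rows: Sequence[Sequence[Any]],
    header_index: int,
    matched: Dict[str, int],
) -> Dict[str, Any]:
    """Hypothetical helper: bundle records, row strings, and header details."""
    records: List[Dict[str, Any]] = []
    row_strings: List[str] = []
    for row in rows[header_index + 1:]:
        # Skip rows that are entirely empty.
        if all(cell is None for cell in row):
            continue
        # One record per data row, keyed by the matched header names.
        records.append({name: row[col] for name, col in matched.items()})
        # One flat string per data row, for free-text processing.
        row_strings.append(" ".join(str(cell) for cell in row if cell is not None))
    return {
        "data": records,
        "row_strings": row_strings,
        "header_row": {"index": header_index, "values": list(rows[header_index])},
        "matched_headers": matched,
    }


rows = [
    ["Quarterly report", None, None],
    ["Order No.", "Customer", "Amount"],
    [1001, "Acme", 250.0],
    [None, None, None],
]
output = build_output(rows, header_index=1, matched={"order_id": 0, "customer": 1})
print(json.dumps(output, indent=2))
```

Keeping the header row index and the name-to-column mapping alongside the data is what makes the result traceable: each extracted field can be followed back to the cell it came from.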

License: BSD 3-Clause (https://opensource.org/licenses/BSD-3-Clause)