Amazon Textract is a machine learning service that extracts text and handwriting from scanned documents. It can also analyze and extract data from forms and tables.
This connector allows you to use the following Textract capabilities:
Detect and extract handwriting and printed text from scanned documents
Identify, analyze, and extract data structured in forms and tables
Extract financial data from printed documents
This component is based on the AWS SDK for .NET v3.
To use this connector, you must have:
An AWS account, so you can use Amazon Textract
An AWS access key (access key ID and secret access key)
Some connector actions also require the input file to be stored in an S3 bucket. To use these actions, the AWS credentials in use must have permission to get objects (s3:GetObject) from the specified bucket.
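For orientation, the following Python sketch shows the shape of a Textract request that points at a file in S3 rather than sending the file bytes. It is not the connector's own code (the connector is built on the AWS SDK for .NET); the helper names s3_document and analyze_document_request, the bucket name, and the object key are hypothetical. Only the field names Document, S3Object, Bucket, Name, Version, and FeatureTypes come from the Textract API.

```python
from typing import Any, Dict, Optional, Sequence


def s3_document(bucket: str, key: str, version: Optional[str] = None) -> Dict[str, Any]:
    """Build the Textract 'Document' structure that references an S3 object.

    Hypothetical helper: Textract reads the object itself, which is why the
    caller's credentials need s3:GetObject on the bucket.
    """
    s3_object: Dict[str, Any] = {"Bucket": bucket, "Name": key}
    if version is not None:
        # Optional: pin a specific object version in a versioned bucket.
        s3_object["Version"] = version
    return {"S3Object": s3_object}


def analyze_document_request(
    bucket: str,
    key: str,
    feature_types: Sequence[str] = ("TABLES", "FORMS"),
) -> Dict[str, Any]:
    """Build keyword arguments for a forms-and-tables analysis request."""
    return {
        "Document": s3_document(bucket, key),
        "FeatureTypes": list(feature_types),
    }


# Example with placeholder values (assumed bucket and key).
request = analyze_document_request("my-bucket", "scans/invoice.png")
print(request)
```

If you test against the service directly with boto3, a dictionary like this can be passed as keyword arguments, for example boto3.client("textract").analyze_document(**request). That call is not executed here because it needs valid AWS credentials.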
See the AWS documentation for detailed information on getting started, developing, and working with Amazon Textract.
To configure your connector to access Amazon Textract, you need the following AWS authentication information:
The access key ID of your AWS access key
The secret access key of your AWS access key
The AWS Region of the service endpoint you want to connect to. To reduce latency, choose a Region close to your application server. See the API documentation for the list of Region names.
Use the above information to fill in the following site properties available in the AmazonTextractConnector_IS module:
Access Key
Secret Access Key
Region (default value is "USEast1")
The connector uses the values in these site properties as the default AWS credentials to authenticate with Amazon Textract.
Alternatively, you can specify different credentials when invoking an action from the connector in your logic by using the AWSCredentials input parameter.
Each time you run an action from the connector, authentication with Amazon Textract is performed as follows (see this logic in the GetAmazonCredentials action):
If the optional AWSCredentials input parameter is passed, its values are used to authenticate.
Otherwise, the authentication uses the values in the Access Key, Secret Access Key, and Region site properties.
If the values in the site properties are not set, an exception is raised: “Invalid configuration. Please validate the values.”
With this design, you have the flexibility to define a set of AWS credentials for specific actions, while using the ones configured in the site properties as default credentials for the remaining actions.
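The resolution order described above can be sketched in Python as follows. This is an illustration only, not the GetAmazonCredentials action itself: the names AwsCredentials, InvalidConfigurationError, and resolve_credentials, and the dictionary keys standing in for the site properties, are hypothetical. The sketch also assumes that a missing or empty value in any of the three site properties counts as "not set", which the connector may define differently.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwsCredentials:
    """Hypothetical stand-in for the AWSCredentials input parameter."""
    access_key: str
    secret_access_key: str
    region: str


class InvalidConfigurationError(Exception):
    """Raised when neither an override nor complete site properties exist."""


def resolve_credentials(
    override: Optional[AwsCredentials],
    site_properties: Mapping[str, str],
) -> AwsCredentials:
    # Step 1: credentials passed to the action take precedence.
    if override is not None:
        return override

    # Step 2: otherwise fall back to the site properties.
    access_key = site_properties.get("AccessKey", "")
    secret_access_key = site_properties.get("SecretAccessKey", "")
    region = site_properties.get("Region", "")

    # Step 3: assumption, any empty value is treated as "not set".
    if not (access_key and secret_access_key and region):
        raise InvalidConfigurationError(
            "Invalid configuration. Please validate the values."
        )
    return AwsCredentials(access_key, secret_access_key, region)


# Placeholder values, not real credentials.
defaults = {"AccessKey": "default-key", "SecretAccessKey": "default-secret", "Region": "USEast1"}
per_action = AwsCredentials("other-key", "other-secret", "EUWest1")

print(resolve_credentials(None, defaults).region)        # site properties are used
print(resolve_credentials(per_action, defaults).region)  # override wins
```

Keeping the override as an optional argument is what lets one action use a dedicated set of credentials while every other action continues to rely on the defaults.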