Script / Code Free

Extract Paragraphs - Sample Script

Extract Paragraphs - Sample Script Overview

Out of the box, Transformation and TotalAgility have no locator that can extract a paragraph from a document, and two-column documents make the task harder still. This document shows how to create a script locator that extracts paragraphs containing a specific keyword, or keywords, in the first line. The example used in this document finds and extracts the paragraphs that define the “effective date” and “term” of an agreement.

Features

The included sample script extracts paragraphs from single-column and two-column documents. It draws a rectangle that spans the area between two anchors and the width of the text column, and returns the text inside that rectangle. The keyword that identifies the paragraph must appear in the first line of that paragraph. The first line of the paragraph may be indented. The paragraph may be numbered (as “X.” or “X.Y.”), or it may instead have a header that a locator can find.
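The rectangle approach described above can be illustrated with a minimal Python sketch. This is not the sample script itself (which runs as a script locator inside Transformation or TotalAgility); it only mirrors the idea under stated assumptions. The `Word` class, the functions `group_lines` and `extract_paragraph`, the column bounds, and the 5-unit line tolerance are all hypothetical names and values chosen for illustration. The sketch assumes OCR words arrive with bounding boxes, that the top anchor is the line containing the keyword, and that the bottom anchor is the next line starting with a paragraph number of the form “X.” or “X.Y.”.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical OCR word with its bounding box (y grows downward)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


# Paragraph numbers of the form "X." or "X.Y.", e.g. "3." or "3.2."
NUMBERED = re.compile(r"^\d+\.(\d+\.)?$")


def group_lines(words, tol=5.0):
    """Group words into text lines by similar top coordinate (assumed tolerance)."""
    lines = []
    for w in sorted(words, key=lambda w: (w.top, w.left)):
        if lines and abs(lines[-1][0].top - w.top) <= tol:
            lines[-1].append(w)
        else:
            lines.append([w])
    for line in lines:
        line.sort(key=lambda w: w.left)  # reading order within the line
    return lines


def extract_paragraph(words, keyword, col_left, col_right):
    """Return the text between the keyword line and the next numbered line."""
    # Horizontal limits of the rectangle: keep only words inside the column.
    column = [w for w in words if w.left >= col_left and w.right <= col_right]
    lines = group_lines(column)

    top = bottom = None
    for line in lines:
        text = " ".join(w.text for w in line).lower()
        if top is None:
            if keyword.lower() in text:  # top anchor: keyword in this line
                top = min(w.top for w in line)
        elif NUMBERED.match(line[0].text):  # bottom anchor: next numbered line
            bottom = min(w.top for w in line)
            break

    if top is None:
        return ""  # keyword not found in this column
    if bottom is None:
        bottom = float("inf")  # no later anchor: take the rest of the column

    inside = [w for line in lines for w in line if top <= w.top < bottom]
    return " ".join(w.text for w in inside)


def make_word(text, left, top):
    """Helper for the sample data: 8 units per character, 12 units high."""
    return Word(text, left, top, left + 8 * len(text), top + 12)


sample = [
    make_word("1.", 10, 100), make_word("Effective", 40, 100),
    make_word("Date", 120, 100), make_word("means", 160, 100),
    make_word("the", 10, 120), make_word("signing", 40, 120),
    make_word("date.", 100, 120),
    make_word("2.", 10, 140), make_word("Term", 40, 140),
    make_word("means", 80, 140),
]
print(extract_paragraph(sample, "effective date", col_left=0, col_right=300))
```

Filtering by column bounds first is what keeps text from a neighboring column out of the result in a two-column layout. The keyword test here is a plain substring match, so a real implementation would likely need word-boundary handling and a way to detect paragraphs that are identified by indentation or a header rather than a number.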

Limitations: The script does not capture the remainder of a paragraph that continues in the next column or on the next page. Also, make sure that the document is not skewed (if necessary, use VRS to correct skewed text), because skewed text leads to words being extracted in the wrong order.
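As a rough pre-check for the skew limitation, the angle of a single text line can be estimated from the positions of its words. The sketch below is only an illustration and is unrelated to how VRS detects or corrects skew; the function name `estimate_skew_degrees` and its input format (a list of `(left, bottom)` points for the words of one line) are assumptions.

```python
import math


def estimate_skew_degrees(line_points):
    """Estimate the angle of one text line from (left, bottom) word points.

    Uses only the leftmost and rightmost points, so it is a coarse estimate.
    The sign of the result depends on the coordinate convention in use.
    """
    points = sorted(line_points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    if x1 == x0:
        return 0.0  # a single word (or stacked points) gives no usable angle
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# A perfectly horizontal line yields an angle of zero.
print(estimate_skew_degrees([(0, 100), (80, 100), (200, 100)]))
```

A document whose lines show more than a small angle could be routed through deskewing before paragraph extraction; the acceptable threshold would need to be determined by testing on real documents.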

Benefits

Catalog contracts and other document types in which paragraphs provide definitions.

Technical Details

Inputs

Documents

Outputs

Extracted paragraphs delivered to fields

Geographic Availability
Global

Additional Information

PLEASE NOTE: Tungsten Labs is independent of Tungsten Automation and this listing is not officially supported or maintained.

Required Software / Applications

None

Created By

Products

TotalAgility
Transformation
RPA

Industry

Other

Compatibility

TotalAgility 7.6+, Transformation, RPA

Business Process

Legal

Last Updated

March 26, 2024

Consulting Required

No

Support Available

No

Pricing

Free