Robot Free

Web Crawling and Content Extraction with Tungsten RPA

Web Crawling and Content Extraction with Tungsten RPA Overview

The Web Crawling and Content Extraction samples for Tungsten RPA provide a range of sample robots that illustrate approaches to gathering data from private or public websites. They also show the benefits of a scalable, low-code approach to data sourcing that requires minimal effort.

Features

The purpose of this asset is to provide a configuration baseline that reduces the effort involved in deploying web crawling and content extraction capabilities into product demos and robot deployments. It also serves as an enablement example that illustrates some of the key capabilities within Tungsten RPA:

  • Use of the inbuilt ‘headless’ RPA web browser – No desktops required!
  • Low-code database interaction (creating, updating, searching and retrieving content)
  • Downloading binary content and converting into usable formats
  • Use of looping to simplify process design
  • Calling and handling web service responses within a robot to support modular design
  • Using snippets within a robotic workflow to drive re-usability
  • Decisions to support process logic
  • Date conversion and manipulation of text-based content
  • And much more…
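For readers who think in code, several of the capabilities above (looping over extracted items, date conversion, and database inserts and queries) can be sketched in plain Python. This is only an analogy under stated assumptions: it is not how the sample robots are built, and the sample HTML, the `ItemParser` class, the `items` table and the function names are all hypothetical.

```python
import sqlite3
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical page fragment; a real robot would load pages in its own browser.
SAMPLE_HTML = """
<ul>
  <li class="item" data-date="26/03/2024">Quarterly results published</li>
  <li class="item" data-date="02/01/2024">New product announced</li>
</ul>
"""


class ItemParser(HTMLParser):
    """Collects (date_text, title) pairs from <li class="item"> elements."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._date = None  # date of the <li> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "item":
            self._date = attrs.get("data-date")

    def handle_data(self, data):
        # Only keep text that directly follows a matching <li> start tag.
        if self._date is not None and data.strip():
            self.items.append((self._date, data.strip()))
            self._date = None


def to_iso(date_text):
    """Convert a DD/MM/YYYY string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(date_text, "%d/%m/%Y").date().isoformat()


def extract_and_store(html, conn):
    """Parse items, normalise their dates, store them, and return sorted rows."""
    parser = ItemParser()
    parser.feed(html)
    conn.execute("CREATE TABLE IF NOT EXISTS items (published TEXT, title TEXT)")
    for date_text, title in parser.items:  # loop over each extracted item
        conn.execute("INSERT INTO items VALUES (?, ?)", (to_iso(date_text), title))
    conn.commit()
    return conn.execute(
        "SELECT published, title FROM items ORDER BY published"
    ).fetchall()
```

Calling `extract_and_store(SAMPLE_HTML, sqlite3.connect(":memory:"))` returns the two items ordered by their normalised dates.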

Benefits

Global businesses are in a continual race to evolve. This is especially true for companies in the business of information, whether they gather and analyse information to advise or sell to others, use information in-house to maintain a competitive edge, or leverage information in an entirely new and disruptive business model.

Yet many companies rely on manual research, incomplete information gathered from web scraping tools, third-party data and home-grown coding solutions to power this critical piece of their business. This is not only time-consuming and costly; it also hinders scalability, reduces data accuracy and standardisation, and ultimately dilutes the customer experience.

There is a wide range of real-world applications for automated web data extraction. Below is a snapshot of use cases where Kofax customers are applying this technology today to drive business value:

  • Market Intelligence
  • Investor Relations and Corporate Intelligence
  • Equity Research
  • Web Content Migration
  • Gathering Content Samples for Machine Learning

Technical Details

Inputs

Websites

Outputs

Data

Geographic Availability
Global

Additional Information

This asset is intended to illustrate the techniques that can be used through the Tungsten RPA platform to crawl and extract web-based content from private or public web resources. However, Tungsten Automation does not endorse its use for mass data gathering exercises where there could be a detrimental impact on the service provider. Careful consideration should be given to the resource demands that these assets could place on target sites.
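One way to limit the load placed on a target site is to honour its robots.txt rules and pause between requests. The sketch below shows that idea in plain Python with the standard library's `urllib.robotparser`. It is not part of the asset: the robots.txt text, the `example-crawler` user agent, the one-second default delay and the helper names are all assumptions made for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_policy(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    rp = RobotFileParser()
    rp.parse(robots_text.splitlines())
    return rp


def allowed_urls(rp, urls, user_agent="example-crawler"):
    """Keep only the URLs that the site's rules permit for this user agent."""
    return [u for u in urls if rp.can_fetch(user_agent, u)]


def polite_delay(rp, user_agent="example-crawler", default=1.0):
    """Use the site's Crawl-delay if it declares one, else an assumed default."""
    delay = rp.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def crawl(urls, fetch, rp, user_agent="example-crawler", sleep=time.sleep):
    """Fetch permitted URLs one at a time, pausing between consecutive requests."""
    results = []
    delay = polite_delay(rp, user_agent)
    for i, url in enumerate(allowed_urls(rp, urls, user_agent)):
        if i > 0:
            sleep(delay)  # throttle so the target site is not flooded
        results.append(fetch(url))
    return results
```

Here `fetch` is any caller-supplied function that retrieves a single URL, and `sleep` can be swapped out so the throttling logic can be exercised without real waiting.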

PLEASE NOTE: Tungsten Labs is independent of Tungsten Automation and this listing is not officially supported or maintained.

Required Software / Applications

None

Created By

Products

RPA

Industry

Other

Compatibility

RPA 11.0 and above

Business Process

General / Other

Last Updated

March 26, 2024

Consulting Required

No

Support Available

No

Pricing

Free