Data Analysis Scanning and Extraction Language
The Abacus Data Analysis Scanning and Extraction Language (DASEL) is a powerful scripting language that lets an analyst create text scanners that extract useful data from text files such as documents and downloaded HTML. Tools are provided for the following features:
DASEL Example Application: Internet Scanner Development
The DASEL tool can be used to develop text scanners for any target text file. As an example, the procedure for developing rule-based text scanners for web pages is summarized in the figure above. The first step is to identify the exact set of Internet Uniform Resource Locators (URLs) to be analyzed, usually by visiting the sites with a standard browser and copying the URLs from the address bar. Sample HTML is then retrieved from each site and saved. Next, rules are written in the DASEL language to create the text extraction scanners. The DASEL Interpreter tests the scanners by running them against the saved HTML. Using the resulting output reports, which include the Error Log Report, the Audit Trail Report, and the Parse Tree Report, the rules are revised until they correctly scan the target text file.
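The test-and-revise loop described above can be illustrated with a small Python analogue. This is only a sketch and is not DASEL syntax: the `Rule` and `ScanResult` classes, the `run_scanner` function, the regular-expression rules, and the sample HTML are all hypothetical names and data chosen for illustration, and the two lists stand in loosely for the Error Log Report and Audit Trail Report (the Parse Tree Report is not modeled).

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical extraction rule: a field name plus a regex with one capture group."""
    name: str
    pattern: str
    required: bool = True


@dataclass
class ScanResult:
    fields: dict = field(default_factory=dict)       # extracted values per rule
    audit_trail: list = field(default_factory=list)  # one entry per rule applied
    error_log: list = field(default_factory=list)    # required rules that found nothing


def run_scanner(rules, html):
    """Apply every rule to the saved HTML and record what happened."""
    result = ScanResult()
    for rule in rules:
        matches = re.findall(rule.pattern, html, flags=re.IGNORECASE | re.DOTALL)
        result.audit_trail.append(f"{rule.name}: {len(matches)} match(es)")
        if matches:
            result.fields[rule.name] = [m.strip() for m in matches]
        elif rule.required:
            result.error_log.append(f"{rule.name}: no match")
    return result


# Stand-in for HTML that was retrieved from a site and saved to disk.
SAVED_HTML = """<html><head><title>Sample Page</title></head>
<body><p>Hello, world.</p><a href="http://example.com/a">A</a></body></html>"""

RULES = [
    Rule("title", r"<title>(.*?)</title>"),
    Rule("url", r'<a\s+[^>]*href="([^"]+)"'),
    Rule("text", r"<p>(.*?)</p>"),
    # This rule does not fit the sample page, so it ends up in the error log.
    Rule("author", r'<meta\s+name="author"\s+content="([^"]+)"'),
]

result = run_scanner(RULES, SAVED_HTML)
print(result.fields)
print(result.audit_trail)
print(result.error_log)
```

In this sketch the `author` rule finds nothing in the sample page and is recorded in the error log, which is the cue to revise that rule and run the scanner again against the same saved HTML.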
DASEL Example Scanner
A sample DASEL scanner specification that extracts the title, text, and URL from a web page is shown below.
Figure: DASEL User Interaction Sequence (user interaction sequence for the DASEL demonstration)
2008, Abacus Programming Corporation