I developed the analysis agent shell and three specializations of it for deviation detection, association mining, and neural-net-based classification learning. My colleague Amy Unruh developed the Control Agent component. Together, we developed a number of application prototypes.
INFOSLEUTH:
Agent-Based System
for Data Gathering & Analysis
MY POSTMORTEM ON THE INFOSLEUTH PROJECT
I joined the InfoSleuth project hoping it would eliminate a fundamental obstacle to successful data mining: the problem of acquiring sufficient amounts of data to analyze. The InfoSleuth data gathering software was, after all, designed for rapidly creating virtual data warehouses. However, as I worked on developing prototype systems with real databases, I discovered that InfoSleuth faces the same poor data quality problems that plague efforts to create real data warehouses. As databases age and database managers change, the semantics of tables and fields are often lost or inadvertently altered. For example, users sometimes contribute their own informal semantics using inadequately documented coding schemes for data values. In addition, many databases contain a high percentage of missing and erroneous data. Perhaps my perspective is biased because I am a psychologist, but in hindsight, I wonder why we and our sponsors eagerly set out to develop a new internet and database technology without first attacking the underlying human-computer interaction problem of ensuring sufficient data quality.
SOME PAPERS ABOUT INFOSLEUTH
Bayardo, Bohrer, Brice, Cichocki, Fowler, Helal, Kashyap, Ksiezyk, Martin, Nodine, Rashid, Rusinkiewicz, Shea, Unnikrishnan, Unruh & Woelk (1997) InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments. SIGMOD 97 PDF
Martin, G., Unruh, A., & Urban, S. (1999) An Agent Infrastructure for Knowledge Discovery and Event Detection. MCC Technical Report. MCC-INSL-399-99. PDF
Unruh, A., Martin, G., & Perry, B. (1998) Getting only what you want: data mining and event detection using InfoSleuth Agents. MCC Technical Report. MCC-INSL-113-98. PDF
Martin, G. (1997) Healthcare knowledge mining in a decentralized web-based environment. MCC Technical Report. INSL-059-97. PDF
Ksiezyk, T., Martin, G., & Jia, Q. (2001) InfoSleuth: Agent-based system for data integration and analysis. 25th Annual International Computer Software and Applications Conference (COMPSAC '01), p. 474.
InfoSleuth is an agent-based system designed to automate the gathering and analysis of dynamically changing data located in heterogeneous databases accessible from the Internet. It was developed at the MCC research consortium in Austin during 1995-2000, and underwent further development, testing, and hardening at Motorola during 2000-2002. Prototype systems have been developed using databases containing manufacturing, healthcare, environmental, and defense data.
AUTOMATED DATA GATHERING
To gather data, an InfoSleuth user specifies an SQL query referencing elements from a domain ontology created for a given application area. Software agents then:
LOCATE resources (databases) that have advertised having data relevant to these elements,
TRANSLATE the ontology-based query into queries referencing elements in the different schemas of the identified local databases,
SUBMIT the queries to the resources,
INTEGRATE the results returned from these multiple resources, expressing them in terms of elements in the ontology used in the original SQL query, and
RETURN the results to the user.
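
The five steps above can be illustrated with a small sketch. It is not InfoSleuth code: the `Resource` class, the `gather` function, the per-resource mapping from ontology terms to local column names, and the sample databases are all hypothetical, and in-memory lists stand in for real databases and SQL. It only shows how a query phrased in ontology terms might be fanned out to differently structured sources and merged back into one vocabulary.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for a database that advertises ontology terms."""
    name: str
    mapping: dict  # ontology term -> local column name
    rows: list     # local rows, keyed by local column names

    def advertises(self, terms):
        # A resource is relevant here only if it covers every requested term.
        return all(t in self.mapping for t in terms)

    def run(self, local_columns):
        # Stand-in for submitting a translated query to the local database.
        return [{c: row[c] for c in local_columns} for row in self.rows]


def gather(terms, resources):
    results = []
    # LOCATE: keep resources that advertise the requested ontology terms.
    for res in (r for r in resources if r.advertises(terms)):
        # TRANSLATE: rewrite ontology terms as this resource's column names.
        local_columns = [res.mapping[t] for t in terms]
        # SUBMIT: run the translated query against the resource.
        local_rows = res.run(local_columns)
        # INTEGRATE: express each returned row in ontology terms again.
        for row in local_rows:
            results.append({t: row[res.mapping[t]] for t in terms})
    # RETURN: one merged result set in the vocabulary of the original query.
    return results


clinic = Resource("clinic_db", {"patient_id": "pid", "diagnosis": "dx"},
                  [{"pid": 1, "dx": "flu", "ward": "A"}])
hospital = Resource("hospital_db",
                    {"patient_id": "PatientNo", "diagnosis": "DiagCode"},
                    [{"PatientNo": 7, "DiagCode": "asthma"}])
lab = Resource("lab_db", {"patient_id": "id"}, [{"id": 3}])

merged = gather(["patient_id", "diagnosis"], [clinic, hospital, lab])
print(merged)
```

In this sketch `lab_db` is skipped because it does not advertise `diagnosis`; how the real system handled partially matching resources is not described here.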

InfoSleuth also supports subscription queries, making it possible for users to receive updated query results as the contents of the underlying databases change.
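
One simple way to picture a subscription query is as a stored query that is re-run and whose result is delivered only when it differs from the previous one. The sketch below uses polling for this, which is an assumption made for illustration; the text does not say how InfoSleuth detected changes, and the `Subscription` class and sample data are hypothetical.

```python
class Subscription:
    """Hypothetical subscription: re-runs a query and reports changed results."""

    def __init__(self, run_query, notify):
        self.run_query = run_query  # callable returning the current result set
        self.notify = notify        # callable invoked with each new result set
        self.last_result = None

    def poll(self):
        """Re-run the query; notify the subscriber only if the result changed."""
        result = self.run_query()
        if result != self.last_result:
            self.last_result = result
            self.notify(result)
            return True
        return False


readings = [3, 18]
received = []
sub = Subscription(lambda: [r for r in readings if r > 10], received.append)

sub.poll()           # first poll delivers the initial result
sub.poll()           # nothing changed, so no notification
readings.append(25)  # the underlying data changes
sub.poll()           # the changed result is delivered
print(received)
```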
    
AUTOMATED DATA ANALYSIS / MINING
InfoSleuth users can also configure a data analysis or mining task to run continuously or in batch mode. InfoSleuth's automated data gathering capabilities retrieve the data to be analyzed from distributed data sources and feed it, continuously or in batch mode, to one of three types of Analysis Agents. Analysis results are placed in a database to which interested users can subscribe in order to be notified of new results. The Control Agent coordinates execution of the analysis task. The three types of analysis are deviation detection in an ongoing data stream, association mining, and backpropagation neural-net-based classification and classification learning.
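
As an illustration of the first analysis type, the sketch below flags values in a stream that lie far from the running mean of the values seen so far. The choice of a running mean and standard deviation (maintained with Welford's update), the threshold of three standard deviations, and the `DeviationDetector` class itself are assumptions for the example; the text does not specify which deviation detection method the Analysis Agents used.

```python
import math


class DeviationDetector:
    """Hypothetical streaming detector based on a running mean and std."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # allowed distance from the mean, in stds
        self.warmup = warmup        # values to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def observe(self, x):
        """Return True if x deviates from earlier values, then absorb x."""
        deviant = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            deviant = std > 0 and abs(x - self.mean) > self.threshold * std
        # Welford's update of the running mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return deviant


stream = [10, 11, 9, 10, 12, 10, 11, 50, 10]
detector = DeviationDetector()
flagged = [i for i, value in enumerate(stream) if detector.observe(value)]
print(flagged)
```

Note that this sketch absorbs a flagged value into its statistics, which widens the tolerance afterwards; a production detector might exclude or down-weight such values.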
   