| I developed the analysis agent shell and three specializations of it for deviation detection, association mining, and neural-net-based classification learning. My colleague Amy Unruh developed the Control Agent component. Together, we developed a number of application prototypes. |
| INFOSLEUTH: Agent-Based System for Data Gathering & Analysis |
| MY POSTMORTEM ON THE INFOSLEUTH PROJECT |
| I joined the InfoSleuth project hoping it would eliminate a fundamental obstacle to successful data mining: the problem of acquiring sufficient amounts of data to analyze. The InfoSleuth data gathering software was, after all, designed for rapidly creating virtual data warehouses. However, as I worked on developing prototype systems with real databases, I discovered that InfoSleuth faces the same data quality problems that plague efforts to create real data warehouses. As databases age and database managers change, the semantics of tables and fields are often lost or inadvertently altered. For example, users sometimes contribute their own informal semantics using inadequately documented coding schemes for data values. In addition, many databases contain a high percentage of missing and erroneous data. Perhaps my perspective is biased because I am a psychologist, but in hindsight, I wonder why we and our sponsors eagerly set out to develop a new internet and database technology without first attacking the underlying human-computer interaction problem of ensuring sufficient data quality. |
| SOME PAPERS ABOUT INFOSLEUTH |
| Bayardo, Bohrer, Brice, Cichocki, Fowler, Helal, Kashyap, Ksiezyk, Martin, Nodine, Rashid, Rusinkiewicz, Shea, Unnikrishnan, Unruh & Woelk (1997) InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments. SIGMOD 97 PDF |
| Martin, G., Unruh, A., & Urban, S. (1999) An Agent Infrastructure for Knowledge Discovery and Event Detection. MCC-INSL-399-99. PDF |
| Unruh, A., Martin, G., & Perry, B. (1998) Getting only what you want: data mining and event detection using InfoSleuth Agents. MCC Technical Report. MCC-INSL-113-98. PDF |
| Martin, G. (1997) Healthcare knowledge mining in a decentralized web-based environment. MCC Technical Report. INSL-059-97. PDF |
| Ksiezyk, T., Martin, G. & Jia, Q. (2001) InfoSleuth: Agent-based system for data integration and analysis. 25th Annual International Computer Software and Applications Conference (COMPSAC '01) p. 474 |
| InfoSleuth is an agent-based system designed to automate the gathering and analysis of dynamically changing data located in heterogeneous databases accessible from the Internet. It was developed at the MCC research consortium in Austin during 1995-2000, and underwent further development, testing and hardening at Motorola during 2000-2002. Prototype systems have been developed using databases containing manufacturing, healthcare, environmental and defense data. |
| AUTOMATED DATA GATHERING To gather data, an InfoSleuth user specifies an SQL query referencing elements from a domain ontology created for a given application area. Software agents then (1) LOCATE resources (databases) that have advertised having data relevant to these elements; (2) TRANSLATE the ontology-based query into queries referencing elements in the different schemas of the identified local databases; (3) SUBMIT the queries to the resources; (4) INTEGRATE the results returned from these multiple resources, expressing them in terms of elements in the ontology used in the original SQL query; and (5) RETURN the results to the user. InfoSleuth also supports subscription queries, making it possible for users to receive updated query results as the contents of the underlying databases change. |
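
The locate, translate, submit, integrate, and return steps described above can be illustrated with a small toy model. This is a minimal sketch, not InfoSleuth code: the `Resource` class, the function names, and the sample databases are all hypothetical, and a "query" is reduced to a list of ontology terms rather than real SQL.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical database that advertises the ontology terms it can serve."""
    name: str
    term_to_column: dict  # ontology term -> local column name
    rows: list            # list of dicts keyed by local column names

    def advertises(self, terms):
        return all(t in self.term_to_column for t in terms)


def locate(resources, terms):
    """LOCATE: keep only resources that advertise every requested term."""
    return [r for r in resources if r.advertises(terms)]


def translate(resource, terms):
    """TRANSLATE: map ontology terms to this resource's local column names."""
    return [resource.term_to_column[t] for t in terms]


def submit(resource, local_columns):
    """SUBMIT: run the translated query against the local rows."""
    return [tuple(row[c] for c in local_columns) for row in resource.rows]


def integrate(terms, per_resource_results):
    """INTEGRATE: merge results, re-expressed in ontology terms."""
    merged = []
    for result in per_resource_results:
        for values in result:
            merged.append(dict(zip(terms, values)))
    return merged


def run_query(resources, terms):
    """RETURN: the integrated rows go back to the caller."""
    matches = locate(resources, terms)
    results = [submit(r, translate(r, terms)) for r in matches]
    return integrate(terms, results)


# Made-up example data: two resources with different schemas, one without "diagnosis".
clinic = Resource("clinic_db", {"patient_id": "pid", "diagnosis": "dx_code"},
                  [{"pid": 1, "dx_code": "A10"}])
hospital = Resource("hospital_db", {"patient_id": "PatientNo", "diagnosis": "Diag"},
                    [{"PatientNo": 7, "Diag": "B20"}])
lab = Resource("lab_db", {"patient_id": "id"}, [{"id": 3}])

rows = run_query([clinic, hospital, lab], ["patient_id", "diagnosis"])
```

Here `lab_db` is skipped at the locate step because it does not advertise `diagnosis`, and the rows from the other two resources come back keyed by ontology terms rather than by their local column names.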
| AUTOMATED DATA ANALYSIS / MINING InfoSleuth users can also configure a data analysis or mining task to run continuously or in batch mode. InfoSleuth's automated data gathering capabilities retrieve the to-be-analyzed data from distributed data sources, and feed it continuously or in batch mode to one of three types of Analysis Agents. Analysis results are placed in a database to which interested users can subscribe in order to be notified of new results. The Control Agent coordinates execution of the analysis task. The three types of analyses are deviation detection in an ongoing data stream, association mining, and backpropagation neural net-based classification & classification learning. |
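
To give a feel for deviation detection in an ongoing data stream, here is a minimal sketch. This page does not describe the method the deviation detection agent actually used, so a running z-score test (with Welford's online mean and variance update) is swapped in as a stand-in. The class name, the threshold, the warm-up length, and the sample stream are all assumptions made for illustration.

```python
import math


class RunningDeviationDetector:
    """Hypothetical stand-in: flags values far from the running mean of the stream."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # observations needed before flagging starts
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def observe(self, value):
        """Return True if value deviates from history, then fold it into the statistics."""
        deviant = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                deviant = True
        # Welford's online update of the mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return deviant


# Made-up stream with one obvious spike.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1]
detector = RunningDeviationDetector()
flags = [detector.observe(v) for v in stream]
```

Only the value 25.0 is flagged in this stream. Note that the sketch folds flagged values back into the running statistics, which inflates the variance after a spike; a real agent might exclude them or use a sliding window instead.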