Machine Learning


About the Machine Learning Group (Last updated 13 Jul 2001)

The Laboratory for Knowledge Discovery in Databases (KDD) is a research group in the Computing and Information Sciences (CIS) Department at Kansas State University. Its research emphasis is in the areas of applied artificial intelligence (AI) and knowledge-based software engineering (KBSE) for decision support systems.

More specifically, we are interested in machine learning, data mining and knowledge discovery from large spatial and temporal databases, human-computer intelligent interaction (HCII), and high-performance computation in learning and optimization. In our research, we look for ways to systematically decompose analytical learning problems based upon information-theoretic and probabilistic criteria, so that the most appropriate machine learning methods may be applied to the resulting transformed problems.

One of the major challenges in this area is the design of unsupervised learning and bias (or hyperparameter) optimization methods that produce an effective decomposition of learning tasks. This problem also presents an opportunity: by addressing the high-level control of inductive learning in a statistically sound fashion, we can improve our techniques for both model selection and model integration (as practiced in multimodal sensor fusion). We have developed such approaches to multistrategy learning, which are potentially computation-intensive, and applied them to analytical problems in the areas of decision support (uncertain reasoning) and control automation.

The goal of our work is to gain insight into the interaction between artifacts that adapt or learn - whether by Bayesian, neural, or genetic computation - and their users. Important examples of this interaction include data visualization in intelligent displays, software agents for distributed high-performance computation and information retrieval, and virtual environments for simulation and computer-assisted instruction.

Our current projects focus primarily on reimplementing a subset of MLC++ as MLJ (Machine Learning in Java) and on implementing wrappers for performance enhancement in KDD. Through these projects, we intend to better understand the workings of different induction algorithms and to build upon them in future research.
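As a rough illustration of the wrapper idea mentioned above, the sketch below wraps a search over feature subsets around an induction algorithm and scores each candidate subset by the learner's validation accuracy. It is not MLJ or MLC++ code: the function names are hypothetical, the learner is a simple 1-nearest-neighbour classifier, and the search is a greedy forward selection rather than the genetic algorithms used in the group's publications.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Evaluator = Callable[[List[Row], List[int], List[int]], float]


def loo_accuracy_1nn(X: List[Row], y: List[int], features: List[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour learner on the given features."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for j, xj in enumerate(X):
            if i == j:
                continue  # hold out the example being classified
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def forward_selection_wrapper(
    X: List[Row], y: List[int], evaluate: Evaluator = loo_accuracy_1nn
) -> Tuple[List[int], float]:
    """Greedy forward search: repeatedly add the feature that most improves the score."""
    n_features = len(X[0])
    selected: List[int] = []
    best_score = 0.0
    while True:
        best_candidate = None
        for f in range(n_features):
            if f in selected:
                continue
            # The "wrapper" step: run the learner on the candidate subset.
            score = evaluate(X, y, selected + [f])
            if score > best_score:
                best_score, best_candidate = score, f
        if best_candidate is None:
            break  # no remaining feature strictly improves the score
        selected.append(best_candidate)
    return selected, best_score


if __name__ == "__main__":
    # Toy data (made up for illustration): feature 0 separates the classes,
    # feature 1 is noise.
    X = [[0.0, 5.0], [0.1, -3.0], [0.2, 4.0], [1.0, -4.0], [1.1, 5.0], [1.2, -3.0]]
    y = [0, 0, 0, 1, 1, 1]
    print(forward_selection_wrapper(X, y))
```

Because the evaluator is passed in as a parameter, the same search loop can wrap any induction algorithm that reports a score for a feature subset. The cost is one full training and evaluation run per candidate subset, which is why wrapper methods can be computation-intensive.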

Resources Online (Last updated 13 Jul 2001)
Projects (Last updated 29 Jan 2002)

Machine Learning in Java (MLJ) - Download page
Presentations (Last updated 13 Jul 2001)
Publications (Last updated 11 Apr 2001)

Journals

[HWRC02] W. H. Hsu, M. Welge, T. Redman, and D. Clutter. Constructive Induction Wrappers in High-Performance Commercial Data Mining and Decision Support Systems. Knowledge Discovery and Data Mining. Kluwer Academic Publishers, to appear.  (PostScript .ps.gz)

[HRW00] W. H. Hsu, S. R. Ray, and D. C. Wilkins. A Multistrategy Approach to Classifier Learning from Time Series. Machine Learning, 38(1-2):213-236. Kluwer Academic Publishers, 2000. (PostScript .ps.gz)

[RH98] S. R. Ray and W. H. Hsu.  Self-Organized-Expert Modular Network for Classification of Spatiotemporal Sequences. Intelligent Data Analysis, 2(4). IOS Press, October, 1998. (PostScript .ps.gz)

[HZ95] W. H. Hsu and A. E. Zwarico.  Automatic Synthesis of Compression Techniques for Heterogeneous Files. Software: Practice and Experience, 25(10):1097-1116. Wiley, 1995. (PostScript .ps.gz)

Book Chapters

[Hs02] W. H. Hsu.  Control of Inductive Bias in Supervised Learning using Evolutionary Computation: A Wrapper-Based Approach.  In J. Wang, editor, Data Mining: Opportunities and Challenges. IDEA Group Publishing, to appear. (PostScript .ps.gz)

Conferences

[Gu02] H. Guo. A Bayesian Metareasoner for Algorithm Selection for Real-time Bayesian Network Inference Problems. AAAI02 Doctoral Consortium Abstract, to appear.

[GPSH02] H. Guo, B. B. Perry, J. A. Stilson, and W. H. Hsu. A Genetic Algorithm for Tuning Variable Orderings in Bayesian Network Structure Learning. AAAI02 Student Abstract, to appear.

[DGVH02] S. Das, S. Gosavi, S. Vaze, and W. H. Hsu. An Ant Colony Approach for the Steiner Tree Problem (poster abstract). In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), New York, NY, 2002, to appear. (PostScript .ps.gz)

[HG02] W. H. Hsu and S. M. Gustafson. Genetic Programming and Multi-Agent Layered Learning by Reinforcements. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), New York, NY, 2002, to appear. (PostScript .ps.gz)

[HGPS02] W. H. Hsu, H. Guo, B. B. Perry, and J. A. Stilson. A Permutation Genetic Algorithm for Variable Ordering in Learning Bayesian Networks from Data. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), New York, NY, 2002, to appear. (PostScript .ps.gz)

[HSL02] W. H. Hsu, C. P. Schmidt, and J. A. Louis. Genetic Algorithm Wrappers for Feature Subset Selection in Supervised Inductive Learning (poster abstract). In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2002), New York, NY, 2002, to appear. (PostScript .ps.gz)

[GH01] S. M. Gustafson and W. H. Hsu. Layered Learning in Genetic Programming for a Cooperative Robot Soccer Problem. In Proceedings of the 4th European Conference on Genetic Programming (EuroGP-2001), Lake Como (Milan), Italy, April, 2001. Springer-Verlag, 2001. (PostScript .ps.gz)

[HWRC00] W. H. Hsu, M. Welge, T. Redman, and D. Clutter. Genetic Wrappers for Constructive Induction in High-Performance Data Mining (poster abstract). In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000), Las Vegas, NV, July, 2000. Morgan Kaufmann Publishers, San Mateo, CA, 2000. (PostScript .ps.gz)

[HCGG00] W. H. Hsu, Y. Cheng, H. Guo, and S. Gustafson. Genetic Algorithms for Reformulation of Large-Scale KDD Problems with Many Irrelevant Attributes (poster abstract). In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000), Las Vegas, NV, July, 2000. Morgan Kaufmann Publishers, San Mateo, CA, 2000. (PostScript .ps.gz)

[GH00] S. M. Gustafson and W. H. Hsu. Genetic Programming for Strategy Learning in Soccer-Playing Agents: A KDD-Based Architecture. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000) Workshop Program, Las Vegas, NV, July, 2000. (PostScript .ps.gz)

[HAR+99] W. H. Hsu, L. S. Auvil, T. Redman, D. Tcheng, and M. Welge. High-Performance Knowledge Discovery and Data Mining Systems Using Workstation Clusters (poster abstract). Presented at National Conference on High Performance Networking and Computing (SC99), Portland, OR, November, 1999. (PostScript .ps.gz)

[HAP+99] W. H. Hsu, L. S. Auvil, W. M. Pottenger, D. Tcheng, and M. Welge. Self-Organizing Systems for Knowledge Discovery in Databases.  In Proceedings of the International Joint Conference on Neural Networks (IJCNN-99), Washington, DC, July, 1999. (PostScript .ps.gz)

[HR99] W. H. Hsu and S. R. Ray.  Construction of Recurrent Mixture Models for Time Series Classification.  In Proceedings of the International Joint Conference on Neural Networks (IJCNN-99), Washington, DC, July, 1999. (PostScript .ps.gz)

[HWWY99a] W. H. Hsu, M. Welge, J. Wu, and T. Yang.  Genetic Algorithms for Selection and Partitioning of Attributes in Large-Scale Data Mining Problems.  In Proceedings of the Joint AAAI-GECCO Workshop on Data Mining with Evolutionary Algorithms, Orlando, FL, July, 1999. (PostScript .ps.gz)

[HWWY99b] W. H. Hsu, M. Welge, J. Wu, and T. Yang. Genetic Algorithms for Synthesis of Attributes in Large-Scale Data Mining (poster abstract).  In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-99), Orlando, FL, July, 1999. (PostScript .ps.gz)

[GHVW98] E. Grois, W. H. Hsu, M. Voloshin, and D. C. Wilkins. Bayesian Network Models for Automatic Generation of Crisis Management Training Scenarios.  In Proceedings of the Tenth Innovative Applications of Artificial Intelligence Conference (IAAI-98), pp. 1113-1120.  Madison, WI, July, 1998. (PostScript .ps.gz)

[HGL+98a] W. H. Hsu, N. D. Gettings, V. E. Lease, Y. Pan, and D. C. Wilkins.  Crisis Monitoring: Methods for Heterogeneous Time Series Learning.  In Proceedings of the International Workshop on Multistrategy Learning (MSL-98). Milan, Italy, June, 1998. (PostScript .ps.gz)

[HGL+98b] W. H. Hsu, N. D. Gettings, V. E. Lease, Y. Pan, and D. C. Wilkins. Heterogeneous Time Series Learning for Crisis Monitoring.  In A. Danyluk, T. Fawcett, and F. Provost, editors, Proceedings of the Joint AAAI-ICML Workshop on AI Approaches to Time Series Problems, pp. 34-41.  Madison, WI, July, 1998. (PostScript .ps.gz)

[HR98a] W. H. Hsu and S. R. Ray.  A New Mixture Model for Concept Learning From Time Series (Extended Abstract). In A. Danyluk, T. Fawcett, and F. Provost, editors, Proceedings of the Joint AAAI-ICML Workshop on AI Approaches to Time Series Problems, pp. 42-43.  Madison, WI, July, 1998. (PostScript .ps.gz)

[HR98b] W. H. Hsu and S. R. Ray.  Quantitative Model Selection for Heterogeneous Time Series. In R. Engels, F. Verdenius, and D. Aha, editors, Proceedings of the Joint AAAI-ICML Workshop on the Methodology of Applying Machine Learning, pp. 8-12.  Madison, WI, July, 1998. (PostScript .ps.gz)

[Hs97a] W. H. Hsu. A Position Paper on Statistical Inference Techniques Which Integrate Bayesian and Stochastic Neural Network Models.  In Proceedings of the International Conference on Neural Networks (ICNN-97), pp. 1972-1977.  Houston, TX, June, 1997. (PostScript, no figures .ps.gz, no figures)

[Hs97b] W. H. Hsu. Probabilistic Learning in Bayesian and Stochastic Neural Networks (Doctoral Consortium Abstract). In Proceedings of the Fourteenth National Conference on Artificial Intelligence (AAAI-97), p. 810. Providence, RI, July, 1997. (PostScript .ps.gz)

[DKGH93a] A. Delcher, S. Kasif, H. Goldberg, and W. Hsu. Probabilistic Prediction of Protein Secondary Structure Using Causal Networks. In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), pp. 316-321.  Washington, DC, August, 1993.

[DKGH93b] A. Delcher, S. Kasif, H. Goldberg, and W. Hsu. Prediction of Protein Secondary Fold Using Probabilistic Networks. In Proceedings of the First International Conference on Intelligent Systems for Molecular Biology (ISMB-93).  Bethesda, MD, July, 1993.

Theses and Technical Reports

[HWRC00] W. H. Hsu, M. Welge, T. Redman, and D. Clutter. High-Performance Commercial Data Mining: A Multistrategy Machine Learning Application. National Center for Supercomputing Applications Technical Report NCSA-ALG-2000-01. Automated Learning Group (ALG), National Center for Supercomputing Applications (NCSA), UIUC, 2000.

[Hs98] W. H. Hsu.  Time Series Learning With Probabilistic Network Composites.  Ph.D. thesis, University of Illinois at Urbana-Champaign (Technical Report UIUC-DCS-R2063).  August, 1998. (PDF PostScript .ps.gz)

[WFH+96] D. C. Wilkins, C. Fagerlin, W. H. Hsu, E. T. Lin, and D. Kruse. Design of a Damage Control Simulator. Knowledge Based Systems Laboratory Technical Report UIUC-BI-KBS-96005. Beckman Institute, UIUC, 1996.

Work in Progress (Last updated 18 April 2002)
Group Members and Affiliates (Last updated 22 Jan 2002)

Faculty and Affiliates
Graduate Students
Undergraduate Students
Alumni

 


Back to the KDD Lab main page


Group founded: 01 Oct 1999
Page created: 13 Jul 2001
Last updated: 11 Apr 2002
William H. Hsu