IASTED - Tutorial Session | Artificial Intelligence and Applications | February 15 – 17, 2010

The Tenth IASTED International Conference on
Artificial Intelligence and Applications
AIA 2010

February 15 – 17, 2010
Innsbruck, Austria

TUTORIAL SESSION

The Development of Intelligent Computational Tools in Bioinformatics

Dr. Hesham Ali
University of Nebraska at Omaha, USA
hesham@unomaha.edu

Duration

3 hours

Abstract

The last few years have witnessed significant developments in Bioinformatics. The explosion of biological data demands a matching increase in the scale and sophistication of the automated systems and intelligent tools that let researchers take full advantage of the available databases. The needs range from effective data storage and the associated data models, through efficient algorithms that automate data mining and uncover non-obvious aspects of these datasets, to advanced software systems that support database curation and mining. As more researchers take on projects that integrate theoretical and applied concepts from both Bioscience and the Computational Sciences, Bioinformatics is quickly emerging as an exciting independent field of research. As a result, new educational and research programs have been developed, new scientific journals have been launched, and various professional conferences have dedicated sessions or mini-tracks to current Bioinformatics research. In this tutorial, we present how Bioinformatics has emerged from the integration of IT with the Biological Sciences. We provide an overview of this emerging discipline, with basic background from both the computational and the biological standpoints. We also present examples of recently developed intelligent tools and expert systems whose results could not have been obtained without such integration.

Objectives

The field of Bioinformatics has attracted considerable attention in recent years. The massive size of the currently available biological and medical databases, and their high rate of growth, strongly influence the types of research being conducted, and researchers are focusing more than ever on maximizing the use of these databases. It would therefore be of great advantage for researchers to use the stored information both to extract new information and to understand various biological and medical phenomena.
In addition, from the IT point of view, the problem of efficiently collecting, sharing, mining and analyzing the wealth of information in a growing set of biological and clinical data has common roots with many IT applications. The problem is particularly critical for biological and clinical data because relevant data comes in many shapes and forms, so employing all of it to extract meaningful properties is an enormous task. Heterogeneous data obtained from microarrays, mass spectrometry experiments and clinical records can all be used to find potential correlations between genes or proteins and susceptibility to a particular disease. The proposed tutorial will address these issues with a particular focus on the following objectives:

Provide a balanced perspective of the exciting discipline of Bioinformatics as an emerging interdisciplinary field of study with equally important roots in Biosciences and Computational Sciences.
Introduce the main traditional computational problems in Bioinformatics and survey the fundamental algorithmic tools, with a focus on AI concepts and expert systems.
Introduce the audience to recently developed intelligent approaches that take advantage of both the biological knowledge and the computational methods to address critical Bioinformatics problems.

Timeline

The tutorial is designed for three hours and is divided into two 80-minute parts separated by a 20-minute break. The first part, covered in points 1-5 below, presents the introduction, the background, and an overview of key problems and AI algorithms in Bioinformatics. The second part, covered in points 6-10 below, introduces the audience to new research projects, with a focus on how AI concepts and algorithmic methods address essential problems in Bioinformatics. Several case studies illustrate the key message of the tutorial: the need to develop expert systems with roots in both Biosciences and Computational Sciences to address the key problems in Bioinformatics.

1. Introduction to Bioinformatics
2. Background: the Bioscience aspect and the computational perspective
3. Bioinformatics now: current state of the emerging discipline
4. Overview of key problems, data structures and algorithms in Bioinformatics
5. Overview of new expert systems and intelligent tools to address Bioinformatics problems
6. Case Study 1: Recognition and classification of biological sequences using intelligent sequence comparison tools
7. Case Study 2: Advanced clustering techniques and phylogenetic analysis
8. Case Study 3: Prediction of novel regions of chromosomal alterations using gene expression profiling
9. Case Study 4: Intelligent Integrated Medical Data System (I2MeDS)
10. Translational research and the next steps in Bioinformatics research

Background Knowledge Expected of the Participants

The tutorial is intended for bioscientists and computational scientists who are interested in Bioinformatics and in developing or using AI tools to solve Bioinformatics-related problems. Some basic background in molecular biology would be helpful but is not required, since the tutorial introduces the needed concepts. Similarly, some basic background in algorithms and simple AI concepts would be useful but is not required.

Qualifications of the Instructor(s)

Hesham Ali is a Professor of Computer Science and Dean of the College of Information Science and Technology at the University of Nebraska at Omaha (UNO). He is the deputy director of the Nebraska Informatics Center for Life Sciences and serves as the director of the Bioinformatics Core Facility. He has published numerous articles, book chapters and two books in various areas, including scheduling, wireless networks, and Bioinformatics. He currently leads several projects funded by NSF and NIH in wireless networks and Bioinformatics.
