The BiTmaP curriculum consists of three online courses taken over three semesters, as well as university and industry seminars. During the first semester, all BiTmaP students are required to take BioE480: Introduction to Bioinformatics. Upon successful completion of this course (B or better), the student may select one of the elective BiTmaP courses: BioE439: Biostatistics; BioE494: Molecular Modeling in Bioinformatics; or BioE594: Design and Computations Methods in Bioinformatics. During the third semester, provided they maintain a B average for the Program, students may select a second elective BiTmaP course.
Selection of two elective courses during the same semester.
With approval from both the Academic Director and BiTmaP Management, students who meet the following criteria may choose two elective courses during the same semester.
- Student must have received an “A” in BioE480: Introduction to Bioinformatics.
- Dual course selection must occur during the Spring or Fall academic semester, not the Summer academic semester.
- Student must commit to 30 hours per week for BiTmaP coursework (15 hours per week per course).
- Academic Director and BiTmaP Management must provide approval after review of any other outstanding circumstances.
Training Cycle
| Semester 1 | Semester 2 | Semester 3 | Semester 4 |
| Introduction to Bioinformatics | Biostatistics in informatics; Structure modeling in bioinformatics; Genome analysis and microarray | Biostatistics in informatics; Structure modeling in bioinformatics; Genome analysis and microarray | |
| Tutor/Mentor | Tutor/Mentor | Internship | Internship/Job |
Students who successfully complete the BiTmaP Program will receive a Certification in Bioinformatics from BiTmaP and UIC as well as 12 college credits from the University of Illinois at Chicago, Department of BioEngineering.
Each BiTmaP student is required to begin the academic program with the core introduction to bioinformatics course:
BioE480: Introduction to Bioinformatics
This is the general introductory course in bioinformatics. The main techniques covered in this course are related to sequence analysis and include gene identification, genome sequencing, sequence comparison, database searching, and phylogenetic tree analysis.
Molecular biology basics are also introduced including the central dogma of molecular biology and DNA and protein sequence composition and structure. Students will be introduced to all of the biology necessary to understand the applications of bioinformatics algorithms and software taught in this course.
The algorithms and software that will be taught in this course include:
- Sequence comparison algorithms and the software program FASTA as well as the related programs.
- Sequence database searching with BLAST, PSI-BLAST, and HMMER
- Functional database searches with GO and PFAM for gene identification and functional assignment.
- Biology database design using SQL/MySQL
- Programming to solve text processing and other bioinformatics tasks with Perl, and learning how to use the BioPerl database to search for the available programs.
- Phylogenetic analysis with the program PHYLIP.
All of these programs are widely used in industry and academia.
Required Background:
The student should know the basic uses of probability. This course requires some programming skills in Perl, C++, or Java.
Recommended Pre-reading:
- Section 1.3 in the book “Biological Sequence Analysis” by R. Durbin et al. ISBN: 0-521-62971-3.
- “Beginning Perl for Bioinformatics” by James Tisdall (O’Reilly). ISBN: 0-596-00080-4
Course details:
This course consists of 21 lectures, 2 of which each cover 1.5 lectures' worth of material. We will normally post 2 lectures per week (on Mondays). On holiday weeks, only 1 lecture may be posted.
Homework worth 100 points will be assigned each week. The homework will be posted on Wednesday and will be due the following Wednesday. For any week, we may post homework early. This will not affect the due date of the homework.
Late homework will be accepted until the first Friday following the due date with a penalty of 20 points per day late. Homework will not be accepted after Friday.
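The late policy above amounts to simple arithmetic, sketched here in Python for illustration only. The function name is hypothetical, and two points are assumptions not stated in the policy: lateness is counted in whole calendar days after the Wednesday due date (so Friday is two days late), and a penalized score is not allowed to fall below zero.

```python
from typing import Optional

PENALTY_PER_DAY = 20   # points deducted per day late (from the policy above)
MAX_DAYS_LATE = 2      # assumption: Wednesday due date, Friday cutoff


def late_adjusted_score(raw_score: int, days_late: int) -> Optional[int]:
    """Return the homework score after the late penalty.

    Returns None when the submission arrives after the Friday cutoff and is
    therefore not accepted. Flooring the score at zero is an assumption.
    """
    if days_late < 0:
        raise ValueError("days_late must be zero or positive")
    if days_late > MAX_DAYS_LATE:
        return None  # not accepted after Friday
    return max(0, raw_score - PENALTY_PER_DAY * days_late)
```

Under these assumptions, a homework graded at 90 points would count as 70 if submitted on Thursday and 50 if submitted on Friday.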
In addition to homework, there will be a comprehensive project which will be due on the same day as the final exam.
Syllabus
Lecture 1: Intro to course with review of relevant molecular biology
Lecture 2: Review of relevant molecular biology.
Lecture 3: Pairwise sequence alignment.
Lecture 4: Scoring Matrices.
Lecture 5: Multiple Sequence Alignment.
Lecture 6: Statistical Methods in Sequence Analysis.
Lecture 7: Introduction to Hidden Markov Models.
Lecture 8: Using Hidden Markov Models for Sequence Analysis.
Lecture 9: Statistical Methods in Bioinformatics.
Lecture 10: Database Search. (1.5 lectures)
Lecture 11: Microarray Data Analysis. (1.5 lectures)
Lecture 12: Introduction to Phylogenetic Analysis.
Lecture 13: Phylogenetic Analysis.
Lecture 14: Gene Finding.
Lecture 15: Whole Genome Analysis.
Lecture 16: Genomic Circuits.
Lecture 17: Protein Structure and Prediction.
Lecture 18: Protein Secondary Structure Prediction & Homology Modeling.
Lecture 19: Protein Folding.
Lecture 20: Drug Design, Discovery, and Docking.
Lecture 21: RNA Secondary Structure Prediction.
BiTmaP students who have successfully completed the Introduction to Bioinformatics course with a grade of “B” or better may select one elective course from the following three courses offered for their second and third semesters.
BioE439: Biostatistics
This course teaches the student biostatistics. The industry-standard statistics software packages are R and S-PLUS, and these will be the focus of this course. Programming knowledge of Java (preferred), Perl, or C++ is required.
The application of basic algorithms and the theory behind the statistical analysis will be covered. Extensive examples and small projects will be used in order to learn how to use R and Java to accomplish bioinformatics tasks. Topics covered also include sample analysis, interval-censored survival data analysis, longitudinal data analysis, multivariate analysis, theory of distributions in statistics, and experimental design.
Course background: Topics include descriptive statistics, hypothesis testing, estimation, confidence intervals, t-tests, chi-squared tests, analysis of variance, linear regression, and correlation.
Required Background: Math 210 (Calculus III) or equivalent is required for this course.
Grading: homework, exams
Homework: TBA
Syllabus:
Lecture 1: Introduction to Biostatistics
Lecture 2: Descriptive Statistics
Lecture 3: Basic Probability Concepts -1
Lecture 4: Basic Probability Concepts -2
Lecture 5: Probability distribution -1
Lecture 6: Probability distribution -2
Lecture 7: Some important sampling distributions -1
Lecture 8: Some important sampling distributions -2
Lecture 9: Estimation -1
Lecture 10: Estimation -2
Lecture 11: Hypothesis testing -1
Lecture 12: Hypothesis tests -2
Lecture 13: Hypothesis tests -3
Lecture 14: Analysis of variance -1
Lecture 15: Analysis of variance -2
Lecture 16: Analysis of variance -3
Lecture 17: Simple linear regression and correlation -1
Lecture 18: Simple linear regression and correlation -2
Lecture 19: Multiple regression and correlation -1
Lecture 20: Multiple linear regression and correlation -2
Lecture 21: Chi-square distribution and the analysis of frequencies -1
Lecture 22: Chi-square distribution and the analysis of frequencies -2
Lecture 23: Chi-square distribution and the analysis of frequencies -3
BioE494: Molecular Modeling in Bioinformatics
This course teaches the students how to elucidate the structure of a biopolymer using related modeling tools and algorithms in bioinformatics.
The targeted areas are in protein structure modeling, structure based drug design, drug screening, cheminformatics, and binding prediction. Students will learn the principles and applications of each of the algorithms and programs used in structure modeling.
Popular software programs used in industry will also be covered including:
- Protein-ligand (drug) binding: AUTODOCK, DOCK
- General packages aimed at structure modeling: SYBYL, QUANTA, INSIGHT II
- Molecular dynamics simulations: CHARMm, AMBER
Required background:
Calculus: Basic - Example: Basic knowledge of integral and differential calculus, functional analysis, the meaning of the first derivative, the meaning of the second derivative, basic knowledge of differential equations…
Linear Algebra: Basic - Example: Basic knowledge of matrices, what they represent, and basic operations, i.e. matrix addition, multiplication, transposition, inversion… Knowledge of vector operations is required.
Programming: Any language, basic level - Matlab, C/C++, or Perl is fine. Visual Basic and Delphi are acceptable… Students should be able to open and read text files, extract the data from those files, convert the data from text format to numeric format and manipulate them using mathematical functions, and finally save the results as either text files or binary files.
Physics: Basic - Example: High school physics or Physics 101 is the required level of knowledge. Students have to know the difference between force and energy, the dependence of force on acceleration, the equations of motion, and coordinate systems.
Chemistry and Biology - For students without a chemistry background, BioE480 or a similar course should be a prerequisite. Students should know the basic properties of nucleic acids (DNA and RNA) and the properties of amino acids (the 20 natural amino acids which form proteins).
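As a rough self-check for the programming prerequisite above, the expected workflow (read a text file, extract the data, convert it to numbers, apply a mathematical function, save the results as text or binary) might look like the following minimal sketch. It is written in Python purely for illustration, since any language is acceptable. The file layout (whitespace-separated columns, with "#" marking comment lines), the function names, and the sample values are all assumptions made for this example, not course material.

```python
import math
import struct
import tempfile
from pathlib import Path


def read_numeric_rows(path):
    """Parse whitespace-separated numbers, skipping blank and '#' comment lines."""
    rows = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(token) for token in line.split()])  # text -> numeric
    return rows


def vector_lengths(rows):
    """Treat each row as a vector and return its Euclidean length."""
    return [math.sqrt(sum(v * v for v in row)) for row in rows]


def save_text(values, path):
    """Write one value per line with six decimal places."""
    Path(path).write_text("".join(f"{v:.6f}\n" for v in values))


def save_binary(values, path):
    """Write the values as little-endian 8-byte doubles."""
    Path(path).write_bytes(struct.pack(f"<{len(values)}d", *values))


def load_binary(path):
    """Read back a file written by save_binary."""
    data = Path(path).read_bytes()
    return list(struct.unpack(f"<{len(data) // 8}d", data))


def demo():
    """Round-trip a small made-up coordinate file through text and binary output."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "coords.txt"
        src.write_text("# x y z\n3.0 4.0 0.0\n1.0 2.0 2.0\n")
        lengths = vector_lengths(read_numeric_rows(src))
        save_text(lengths, Path(tmp) / "lengths.txt")
        save_binary(lengths, Path(tmp) / "lengths.bin")
        text_out = (Path(tmp) / "lengths.txt").read_text()
        binary_out = load_binary(Path(tmp) / "lengths.bin")
    return lengths, text_out, binary_out


if __name__ == "__main__":
    print(demo())
```

A student comfortable writing the equivalent in Matlab, C/C++, Perl, or another language should meet the stated programming level.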
Syllabus:
Lecture 1: Introduction
Lecture 2: Molecular Mechanical Force Field (1)
Lecture 3: Molecular Mechanical Force Field (2)
Lecture 4: Statistical Potentials (1)
Lecture 5: Statistical Potentials (2)
Lecture 6: Conformational Analysis (1)
Lecture 7: Conformational Analysis (2)
Lecture 8: Minimization
Lecture 9: Computer simulation
Lecture 10: Monte Carlo simulation
Lecture 11: Molecular Dynamics Simulation (1)
Lecture 12: Molecular Dynamics Simulation (2)
Lecture 13: Structure Modeling (1)
Lecture 14: Structure Modeling (2)
Lecture 15: Structure Modeling (3)
Lecture 16: Structure Modeling (4)
Lecture 17: Structure Modeling (5)
Lecture 18: Structure Modeling (6)
Lecture 19: Protein interactions
Lecture 20: Free energy calculation
Lecture 21: Drug design
Lecture 22: Special topics
BioE594: Design and Computations Methods in Bioinformatics
This course focuses on the study and implementation of methods of data mining and machine learning that are useful in the analysis of gene expression data from genome comparison, microarray experiments, and protein function prediction.
Students will gain practical skills, especially in the area of microarray data analysis, in addition to theoretical knowledge behind these methods.
The topics covered in this course include microarray technology, microarray experiments, preprocessing of microarray data, statistical methods (hypothesis testing, resampling, bootstrap, multiple testing), distances and expression measures, feature selection, cluster and classification analysis for microarray data, and inferring genetic networks.
We will use the R and Bioconductor packages for the microarray data analysis and MATLAB for other implementation tasks. All three tools are widely used in industry for gene expression analysis.
Required background: Calculus: Basic; Linear Algebra: Matrix operations; Statistics: Hypothesis testing such as the F-test and t-test.
Syllabus:
Lecture 1: Introduction of the course
Lecture 2: Introduction to Microarrays
Lecture 3: Introduction to R
Lecture 4: Review of statistics I
Lecture 5: Review of statistics II
Lecture 6: Review of statistics III
Lecture 7: Preprocessing DNA Microarray Data
Lecture 8: Preprocessing Affymetrix Data
Lecture 9: Introduction of Bioconductor
Lecture 10: Identification of differentially expressed genes
Lecture 11: Multiple testing for the identification of differentially expressed genes
Lecture 12: Significance Analysis of Microarrays (SAM)
Lecture 13: Annotation
Lecture 14: Genomic data-mining method 1 - overview
Lecture 15: Genomic data-mining method 2 - clustering
Lecture 16: Genomic data-mining method 3 - dimension reduction in unsupervised learning
Lecture 17: Genomic data-mining method 4 - classification
Lecture 18: Genomic data-mining method 5 - feature selection
Lecture 19: Application for Bayesian classifier
Lecture 20: Genomic data-mining method 6 - Support vector machines
Lecture 21: Identification of Transcription Binding Site
Lecture 22: Using Bayesian Networks to Analyze expression data
BiTmaP Course Textbooks
Introduction to Bioinformatics
“Bioinformatics: Sequence and Genome Analysis (Genome Analysis)” by David W. Mount, Cold Spring Harbor Laboratory Press, 2005
Biostatistics
“Biostatistics: A Foundation For Analysis In The Health Sciences (8th Edition)” by Wayne W. Daniel, Wiley, 2005
Computational Functional Genomics and Microarray
“Microarray Bioinformatics”, by Dov Stekel, Cambridge University Press, 2003. Materials from recent scientific literature will also be used.
Molecular Modelling
“Molecular Modelling: Principles and Applications,” by Andrew Leach, Prentice Hall, 2001