Course list (partial; CWRU courses only):

 

• Data Analysis I: Basic exploratory data analysis for a univariate response with single or multiple covariates. Graphical methods, data summarization, and model fitting using the S-PLUS computing language. Simple and multiple linear regression. Emphasis on model selection criteria, on diagnostics to assess goodness of fit, and on interpretation. Techniques include transformation, smoothing, median polish, and robust/resistant methods. Case studies and analysis of individual data sets.

• Data Analysis II: Extensions of exploratory data analysis and modeling to multivariate response observations and to non-Gaussian data. Singular value decomposition and projection, principal components, factor analysis and latent structure analysis, discriminant analysis and clustering techniques, cross-validation, the EM algorithm, and CART. Introduction to generalized linear models. Case studies of complex data sets with multiple objectives for analysis.

• Bayesian Data Analysis: Principles of Bayesian theory, methodology, and applications. Methods for forming prior distributions using conjugate families, reference priors, and empirically based priors. Derivation of posterior and predictive distributions and their moments. Properties when common distributions such as the binomial, normal, or other exponential family distributions are used. Hierarchical models. Computational techniques including Markov chain Monte Carlo and importance sampling.

• Statistical Computing: Basic topics in statistical computing: floating point arithmetic; semi-numerical computation, including generation and tests of random numbers, Monte Carlo methods, variance reduction methods, stochastic models, and simulation studies; numerical computation, including numerical linear algebra, optimization and root-finding, and numerical integration; statistical computing, e.g., resampling methods, EM algorithms, Gibbs sampling, and projection pursuit.

• Stochastic Modeling: Introduction to stochastic modeling of data, with emphasis on models and statistical analysis of data with a significant temporal and/or spatial structure. Markovian and semi-Markovian models, point processes, point cluster models, queuing models, risk models, likelihood methods, estimating equations.

• Theoretical Statistics: Point estimation: maximum likelihood, moment estimators. Methods of evaluating estimators, including mean squared error, consistency, "best" unbiasedness, and sufficiency. Hypothesis testing: likelihood ratio and union-intersection tests. Properties of tests, including power function and bias. Interval estimation.

• Linear Models: Theory of least squares estimation, interval estimation, and tests for models with normally distributed errors. Regression on dummy variables. ANOVA, ANCOVA. Variance component models. Model diagnostics. Robust regression. Analysis of longitudinal data.

• Advanced Techniques in Data Analysis: Topics drawn from resampling methods (including bootstrapping), MCMC (Gibbs sampling), nonparametric curve and surface fitting, kernel density estimation, projection pursuit, time series (time permitting), approaches to model uncertainty, models for repeated measures and structural-functional models, and statistical inference for non-statistical mathematical models of large systems.

• Theory and Methods of Experimental Design: Experimental design for polynomial regression models and for multi-factor models. Theory for construction of increased-efficiency designs, including fractional factorials and Latin squares. Designs for response surfaces. GOSSET-generated optimal designs for nonstandard problems.

• Survival Data Analysis: Basic concepts of survival analysis, including the hazard function, survival function, types of censoring, Kaplan-Meier estimates, log-rank tests, and generalized Wilcoxon tests. Parametric inference includes exponential and Weibull distributions with and without censoring. The proportional hazards model.

• Statistical Consulting: Statistical consulting practice under the guidance of the instructor.

• STAT METHOD/ANALYSIS OF DNA: Background on low-level processing and generation of high-throughput genomic data. Detection of differentially expressed genes via FDR theory, empirical Bayes, resampling-based approaches, and ANOVA methods, including non-Bayesian and fully Bayesian approaches. Optimality and theoretical measures of performance. Empirical comparisons and case studies.

• PRINCIPLES OF GENETIC EPID: A survey of the basic principles, concepts and methods of the discipline of genetic epidemiology, which focuses on the role of genetic factors in human disease and their interaction with environmental and cultural factors. Many important human disorders appear to exhibit a genetic component; hence the integrated approaches of genetic epidemiology bring together epidemiological and human genetic perspectives in order to answer critical questions about human disease. Methods of inference based upon data from individuals, pairs of relatives, and pedigrees will be considered.

• Real Analysis: Real and complex measure theory, integral theorems. Banach spaces. Riesz representation theorem, functional analysis, closed graph theorem, open mapping theorem. Weak topology. Hilbert spaces. Fourier series, etc.

• Abstract Algebra: Basic properties of groups, rings, modules and fields. Finitely generated modules over principal ideal domains, canonical forms for matrices; categories and functors; tensor product of modules, bilinear and quadratic forms; field extensions; fundamental theorem of Galois theory, solving equations by radicals.

• Set Topology: Metric spaces, topological spaces, and continuous functions. Compactness. Connectedness. Path connectedness. Topological manifolds. Topological groups. Polyhedra. Simplicial complexes. Fundamental groups.

• Algebraic Topology: The fundamental group and covering spaces; van Kampen's theorem. Higher homotopy groups. Long exact sequence of a pair. Homology theory; chain complexes; short and long exact sequences; Mayer-Vietoris sequence. Homology of surfaces and complexes; applications.

• Topological Dynamical Systems: Research topic with Professor Wu.

• Graph Theory: Building blocks, trees, connectedness, matching, traversability. NP-completeness. Major COP problems and algorithms.

• Combinatorics: Permutations, combinations and variations. Principle of inclusion and exclusion. Generating functions. Difference equations. Partitions. Stirling numbers. Eulerian numbers. Ballot problems. Ramsey's theorem. Finite groups. Pólya's theorem. De Bruijn's theorem. Graphs. Trees. Finite fields. Finite geometries. Orthogonal Latin squares. Hadamard matrices. Block designs. Coding theory.

• Algorithm Analysis: Sorting, searching, set manipulation, graph algorithms, matrix operations, polynomial manipulation, and fast Fourier transforms. Through specific examples and general techniques, the course covers the design of efficient algorithms as well as the analysis of the efficiency of particular algorithms. Certain important problems for which no efficient algorithms are known (NP-complete problems) are discussed to illustrate the intrinsic difficulty that can sometimes preclude efficient algorithmic solutions.

• Database Systems: Basic issues in file processing and database management systems. Physical data organization. Relational databases. Database design. Relational query languages, SQL. Query optimization. Database integrity and security. Object-oriented databases. Object-oriented query languages, OQL, XML.

• Operating Systems: CPU scheduling, memory management, concurrent processes, semaphores, monitors, deadlocks, secondary storage management, file systems, protection, the UNIX operating system, fork, exec, wait, UNIX System V IPC, sockets, remote procedure calls, threads. Students must be proficient in the C programming language.