Oral health is essential for overall well-being, but did you know that oral diseases affect more than half of the people in the world? Despite the high prevalence of oral diseases and the many statistical challenges that arise from analyzing complex oral health data, methodological research motivated by these challenges are limited. I develop statistical methods and tools to answer oral health questions arising from complex multilevel data.

To study the relationship between a disease and its risk factors, we collect information from a sample of individuals. This sample, ideally, should represent the population that we want to study. But when time and money are limited, the sample size must be small. To make meaningful inference about every person in the target study population, we need to purposefully oversample people from under-represented groups and correct for the sampling bias using proper statistical methods. I study the implications of using biased sampling designs under special circumstances.

Multiple imputation (MI) is a common approach to handle missing data in observational and experimental studies. MI is straightforward to implement with cross-sectional data with no derived variables, such as interaction terms. But if I want to include an interaction effect in the analysis, and if one or both variables that make up the interaction are missing, how do I impute them? Do I impute the original variables and compute the interaction variable or do we impute the interaction directly? What if I have clustered or longitudinal data with missing outcomes? How do I incorporate the correlation between observations in the imputation model? These are the questions I try to answer.

Radiologists read mammograms to screen for breast cancer but how
consistent are the results between two radiologists, or three?
Traditional agreement statistics – like Cohen’s Kappa – lack the ability
to measure agreement between many radiologists each reading many
mammograms. A generalized linear mixed effects models (GLMM) can be used
to compute a comprehensive summary measure of agreement (and association
if the screening result is an ordinal score) between many radiologists’
mammogram readings. I also authored an R package `modelkappa`

to calculate GLMM-based agreement and association.

*A complete list of my published works is available from my Google
scholar page.*

**Mitani AA**, Feng X, and Kaye EK. Modeling time-varying risk factors of tooth loss: re- sults from joint model compared to extended Cox regression model. Accepted to Journal of Clinical Periodontology.Dong M and

**Mitani A**. Multiple imputation methods for missing multilevel ordinal outcomes. BMC Medical Research Methodology 23(112) pp. 938–949, 2023. doi: 10.1186/s12874-023-01909-5.**Mitani AA**, Kaye EK, Nelson KP. Accounting for drop-out using inverse probability censoring weights in longitudinal clustered data with informative cluster size. Annals of Applied Statistics. 16(1), pp. 596-611, 2022. doi:10.1214/21-AOAS1518**Mitani AA**, Mercaldo ND, Haneuse S, Schildcrout JS. Survey design and analysis considerations when utilizing misclassified sampling strata. BMC Medical Research Methodology. 21(145), 2021. doi:10.1186/s12874-021-01332-8**Mitani AA**, Kaye EK, Nelson KP. Marginal analysis of multiple outcomes with informative cluster size. To appear in Biometrics. 77, pp. 271–282, 2021. doi:10.1111/biom.13241**Mitani AA**, Haneuse S. Small data challenges of studying rare diseases. Invited Commentary. JAMA Network Open. 3(3), 2020. doi:10.1001/jamanetworkopen.2020.1965**Mitani AA**, Kaye EK, Nelson KP. Marginal analysis of ordinal clustered longitudinal data with informative cluster size. Biometrics. 73(3), pp. 938– 949, 2019. doi:10.1111/biom.13050Nelson KP,

**Mitani AA**, Edwards D. Evaluating the effects of rater and subject factors on measures of association Biomedical Journal. 60, pp. 639–656, 2018. doi:10.1002/bimj.201700078**Mitani AA**, Nelson KP. Modeling Agreement between Binary Classifications of Multiple Raters in R and SAS. Journal of Modern Applied Statistical Methods. 15, 2017. doi:10.22237/jmasm/1509495300**Mitani AA**, Freer PE, Nelson KP. Summary measures of agreement and association between many raters’ ordinal classifications. Annals of Epidemiology. 27(10), 2017. doi:10.1016/j.annepidem.2017.09.001Nelson KP,

**Mitani AA**, Edwards D. Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings. Statistics in Medicine. 36(20), pp. 3181–3199, 2017. doi:10.1002/sim.7323Desai M,

**Mitani AA**, Bryson S, Robinson T. Multiple imputation when rate of change is the outcome of interest. Journal of Modern Applied Statistical Methods. 15(1), Article 10, 2016. doi:10.22237/jmasm/1462075740**Mitani AA**, Kurian AW, Das AK, Desai M. Navigating choices when applying multiple imputation in the presence of multi-level categorical interaction effects. Statistical Methodology. 27, 2015. doi:10.1016/j.stamet.2015.06.001

- modelkappa: Calculates model-based kappa of agreement and association and their standard errors for multiple raters each assessing multiple cases
- CWGEE: Cluster-weighted generalized estimating equations for clustered longitudinal data with informative cluster size
- ipccwGEE: Solves the cluster-weighted generalized estimating equations with inverse probability censoring weights for correlated binary responses in clustered longitudinal data with informative cluster size and informative drop-out.