Data Analyst - Research Scientist 3 - UW Genetic Analysis Center
The Genetic Analysis Center (GAC, www.biostat.washington.edu/research/centers/gac) in the Department of Biostatistics develops and applies statistical methods to genetic and health-related data to discover how genetic variation contributes to human well-being and disease. The genetic data include variants assayed by whole-genome sequencing and microarrays. The health data cover a broad range of traits, including complex diseases such as diabetes, asthma, atherosclerosis, and cancer, as well as responses to drug treatments. The projects are funded by various parts of NIH, as well as by non-profits and industry. Current major projects are the Trans-Omics for Precision Medicine (TOPMed) Data Coordinating and Analysis Centers (www.nhlbiwgs.org) and statistical/analytic support for the National Institutes of Health – Center for Inherited Disease Research (NIH-CIDR) program (http://www.cidr.jhmi.edu/services/Services.html).
Provide data analysis, programming, and reporting of results for statistical genetics/genomics projects, both independently and in collaboration with other GAC staff. Two current project examples:
(1) In the NIH-CIDR program, we collaborate with investigators at CIDR and with clinical study investigators to perform quality control and analysis of genotype data obtained from next-generation sequencing and/or microarray assays performed at CIDR. This involves data management, quality control procedures, estimating genotyping error rates, imputing genotypes for unmeasured variants, and performing genotype-phenotype association analyses. We give regular presentations to our collaborators to document progress, interpret results, and discuss issues that arise. A written report is provided at the end of the project.
(2) In the TOPMed program, we perform quality control of genotype data, estimate relatedness and population structure in genotypes from whole-genome sequencing, harmonize phenotypes across multiple studies, and perform pooled genotype-phenotype analyses. We also use genome annotations to select rare variants for aggregate testing. Results of these analyses are quality-checked and presented to TOPMed investigators, with interpretation of results and discussion of relevant issues.
In both projects, we participate in preparing manuscripts for publication: writing and editing methods descriptions, designing and creating figures and tables, and undertaking revisions after peer review.
These responsibilities involve the following activities:
- Work with clinical collaborators and GAC staff to develop goals for data analysis.
- Create, manipulate, and merge large and complex data sets.
- Program efficient statistical analyses of large data sets using R and parallel computing.
- Write reusable functions in R, document and package code, and follow standards for reproducible research.
- Summarize and communicate results of analyses to GAC staff and clinical collaborators.
- Assist in preparation of manuscripts for publication.
Training for these activities will be provided through working closely with experienced team members. The GAC provides a highly collaborative and collegial environment in which team members across the group share expertise and mentor each other. Continuing education opportunities include attending workshops, short courses and conferences.