Data Gaps

Together, we can help interpret cold-case forensic samples using more advanced statistical analysis

When you grip a mug, open a door or grasp a knife, you leave behind skin cells containing your DNA. 

Imagine that the same knife you used to slice zucchini for your dinner was later used in a crime. In addition to DNA from the victim and the perpetrator, your own genetic profile could very well be detected in samples obtained from the knife.

Investigators commonly find that a single forensic sample contains a mixture of DNA from several people in unequal proportions, as could happen when multiple people handle the same knife. Newer technologies can now detect and identify smaller, previously undetectable amounts of DNA within such a mixture, which enables more specific DNA identification.

But how do investigators determine whether DNA profiles in the evidence match a suspect’s profile? That takes math — and sophisticated software to generate key statistics presented in court. The process gets even harder with cold cases, where older DNA profiles often didn’t have refined levels of detail in the mixture.

That’s where statistical geneticists like Bruce Weir and his colleagues at the University of Washington School of Public Health come in.


Weir, a professor of biostatistics and the director of the School’s Institute for Public Health Genetics, developed the groundbreaking statistical methods used around the world to evaluate DNA evidence. Today, Weir and his colleagues are developing and refining statistical methods that will aid in solving cold cases, take advantage of new genetic sequencing technology, and improve how forensic DNA profiles are interpreted in cases with sparse genetic data. These efforts are helping forensic scientists, law enforcement and the courts use and accurately interpret increasingly sophisticated DNA profiling techniques.

“One of our missions in the School is to do things for general societal benefit,” says Weir, who also directs the Summer Institute in Statistical Genetics, which he founded 27 years ago to train forensic scientists and researchers. “I think it is to everyone’s benefit to have forensic science conducted and presented correctly.”

Cracking cold cases

Unsolved crimes, sometimes decades old, present extra challenges when it comes to comparing older DNA evidence with suspects’ profiles. That’s particularly true when evidence includes mixtures of DNA from more than one person, a common occurrence.

With such mixtures, one person usually contributes more DNA than others, Weir explains. For example, in sexual assault cases, most of the DNA comes from the victim and a smaller amount from the perpetrator. “That minor contributor’s DNA might not be complete,” Weir says. When portions of the DNA profile are not detected, that’s called “dropout,” which was harder to analyze in cold cases using previous methodologies.

Bruce Weir, professor in the School of Public Health’s Department of Biostatistics.

Now Weir and colleagues at the FBI and the University of North Texas have developed software that allows those old, incomplete profiles to be used. With the new software, analysts can say, “Well, there is still a strong association between the suspect and the evidence, even when we take into account the dropout,” Weir says. That can help the defense as well as the prosecution, he notes.
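To make the idea of "taking dropout into account" concrete, here is a minimal Python sketch of a likelihood ratio at a single locus for a single contributor. It is not the software described above. The function names, the per-allele dropout probability `d`, the rule that a homozygote's allele is lost only if both copies drop out, and the allele frequencies are all assumptions chosen for illustration, and the model ignores drop-in and mixtures entirely.

```python
from itertools import combinations_with_replacement

def prob_evidence_given_genotype(observed, genotype, d):
    """P(observed alleles | true genotype) with per-allele dropout d, no drop-in.

    Hypothetical simplified model, for illustration only.
    """
    a, b = genotype
    if not set(observed) <= {a, b}:
        return 0.0  # an observed allele this genotype cannot explain
    if a == b:
        # Assumption: a homozygote's allele is lost only if both copies drop out.
        p_lost = d * d
        return (1 - p_lost) if a in observed else p_lost
    prob = 1.0
    for allele in (a, b):
        prob *= (1 - d) if allele in observed else d
    return prob

def likelihood_ratio(observed, suspect, freqs, d):
    """Compare 'suspect is the source' against 'an unknown person is the source'."""
    numerator = prob_evidence_given_genotype(observed, suspect, d)
    denominator = 0.0
    # Sum over every possible genotype of an unknown person, weighted by how
    # common that genotype is (Hardy-Weinberg proportions assumed).
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        genotype_freq = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        denominator += genotype_freq * prob_evidence_given_genotype(observed, (a, b), d)
    return numerator / denominator

# Made-up allele frequencies at one locus (they sum to 1).
freqs = {12: 0.2, 13: 0.3, 14: 0.5}
# Only allele 13 was detected; the suspect is 13/14, so allele 14 may have dropped out.
print(likelihood_ratio({13}, (13, 14), freqs, d=0.2))
```

In this toy setup the incomplete profile does not exclude the suspect, but it supports the match far less strongly than a complete profile would. That is the kind of tradeoff a dropout-aware analysis has to quantify.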

Weir first developed statistics to interpret DNA mixtures back when genetic evidence took off in the early 1990s and calculations were done by hand. In fact, he testified in the first major case where such mixture analysis was used: the murder trial of O.J. Simpson. In recognition of his contributions to forensic science and population and quantitative genetics, Weir was elected as a Fellow of the Royal Society of London in 2021.

A paper on the new mixture analysis software has been published in a peer-reviewed journal, and forensic scientists in the U.S. and abroad are using the software.

“The software could be very useful to us,” says Sean Carhart, DNA technical leader for the Washington State Patrol’s crime laboratory. “It may allow interpretation of older data, particularly more complex DNA mixtures, that is not otherwise feasible.”

Tapping advances in genetic sequencing

Labs today can measure genetic variation in large portions of the human genome quickly and inexpensively in a process called “genetic genealogy,” enabling consumers to locate distant relatives and learn about their health risks from sites such as 23andMe. Forensic science has just begun to employ these large, public data sets, most famously in the use of such consumer sites to find suspects via their relatives.

Instead of obtaining exact DNA sequences, forensic analysts currently examine just 20 places, or loci, across the genome, counting the number of times short DNA sequences are repeated at each site. Because each person inherits one copy of every locus from each parent, a full profile records 40 repeat counts. Those counts vary from person to person, allowing forensic scientists to assess matches based on the length of the repeats.
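As a rough illustration of what such a profile looks like as data, the Python sketch below stores two repeat counts per locus and compares an evidence profile with a reference profile. The function name `compare_profiles` and all repeat counts are hypothetical. The locus names are real markers, but only four are shown rather than the full set of 20.

```python
from typing import Dict, List, Tuple

# A profile maps each locus to its two repeat counts, one inherited from each parent.
Profile = Dict[str, Tuple[float, float]]

def compare_profiles(evidence: Profile, reference: Profile) -> Tuple[List[str], List[str], List[str]]:
    """Return (matching loci, mismatching loci, loci missing from the evidence)."""
    matching, mismatching, missing = [], [], []
    for locus, ref_genotype in reference.items():
        if locus not in evidence:
            missing.append(locus)  # nothing was typed at this locus
        elif sorted(evidence[locus]) == sorted(ref_genotype):
            matching.append(locus)  # order of the two counts does not matter
        else:
            mismatching.append(locus)
    return matching, mismatching, missing

# Made-up repeat counts for illustration.
evidence = {"D8S1179": (13, 14), "TH01": (9.3, 6), "FGA": (21, 24)}
reference = {"D8S1179": (14, 13), "TH01": (6, 9.3), "FGA": (22, 24), "vWA": (17, 18)}

print(compare_profiles(evidence, reference))
```

A comparison like this only says whether the repeat counts agree. How much weight an agreement carries is a separate statistical question.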

The next frontier is sequencing the genome at those 20 sites. That finer-grained data could aid with interpreting mixtures, and even distinguish between twins. It could also show more clearly whether a suspect’s DNA does or does not match the evidence, Weir says. “So it can protect the innocent and make things look stronger against the guilty,” he says.

But first, for that wealth of sequencing data to be useful, methods for evaluating potential matches have to be updated. That’s the focus of Sanne Aalbers, a Ph.D. candidate working with Weir. Aalbers, who also has a background in forensic science, was awarded a National Institute of Justice grant to develop population estimates for expanded genetic profiles. Such estimates of how common each genetic variant is in a population are required to calculate the statistical weight of a DNA match. She is also considering the practical implications of her work, interviewing forensic scientists to assess how they understand and use the models she’s developing.
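To see why population estimates matter, consider a minimal Python sketch of the classic "product rule" for a random match probability. It is not Aalbers's method. It assumes Hardy-Weinberg proportions and independence between loci, and the function names and allele frequencies are invented for illustration.

```python
from math import prod

def genotype_frequency(a, b, freqs):
    """Expected population frequency of genotype (a, b) under Hardy-Weinberg."""
    p, q = freqs[a], freqs[b]
    return p * p if a == b else 2 * p * q

def random_match_probability(profile, allele_freqs):
    """Multiply genotype frequencies across loci (assumes loci are independent)."""
    return prod(
        genotype_frequency(a, b, allele_freqs[locus])
        for locus, (a, b) in profile.items()
    )

# Made-up repeat counts and allele frequencies for two loci.
profile = {"D8S1179": (13, 14), "TH01": (6, 6)}
allele_freqs = {
    "D8S1179": {13: 0.3, 14: 0.2},
    "TH01": {6: 0.25},
}

print(random_match_probability(profile, allele_freqs))
```

Every number in the result comes from the frequency table, so a poor population estimate feeds directly into the statistic a court hears. Sequence-based profiles distinguish more variants at each site, which means more frequencies to estimate.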

“I’m excited about the opportunity to combine both theoretical/quantitative research with qualitative research,” Aalbers says. “This is entirely due to the interdisciplinary nature of my Ph.D. program in Public Health Genetics, which allows me to sort of bridge that gap between theory and practice.”