Tag Archives: dna database

Is DNA the next frontier in privacy? - CRG in the News

In late January, President Barack Obama announced what some have called a moonshot.

The $215 million Precision Medicine Initiative seeks to transform the health care system to target therapies to patients according to their unique genetics and environment.

The most ambitious part of the initiative is a proposal to enroll 1 million people in essentially a superstudy. Their genomes will be sequenced, their medical experiences will be chronicled through their electronic health records, and sensors worn by them will track their activities, behaviors and environmental exposures.

Participation in the program would be voluntary, but many are already concerned about the protection of the participants’ privacy. That includes the president.

“We’re going to make sure that protecting patient privacy is built into our efforts from Day One,” Obama said at the program’s announcement. “It’s not going to be an afterthought.”

Privacy is a reasonable concern for an undertaking that’s likely to generate exabytes of data (an exabyte is 1 quintillion bytes) on people’s bodies, behaviors and cellular makeup. And the issues range from protecting people’s identities and information to making sure participants have some level of control or knowledge of where the data generated is going and what it’s being used to study.

The effort complements other emerging for-profit ventures that are using large amounts of genetic data to try to help humans live healthier and likely longer lives.

Google, through its Google X life sciences division, intends to sequence thousands of people’s genomes and collect other physiological information to draw the clearest picture of human health yet and identify biological indicators of future disease. And Human Longevity, co-founded by maverick biologist Craig Venter, built the largest sequencing center in the world with the hope of processing 1 million genomes by 2020. It plans to combine those with data on people’s gut bacteria, health records and more to aid in the development of new therapies to prevent and treat disease.

“This is a bold, audacious kind of idea, but the time has come,” National Institutes of Health Director Francis Collins told “Fault Lines” in March about the government’s effort. “We have the chance to really find out what are the factors that keep people healthy or cause illness to happen and, when it happens, how to manage it.”

Target for cybercrime?

At a hearing about the Precision Medicine Initiative in front of the Senate Health, Education, Labor and Pensions Committee on May 5, Democratic Sen. Patty Murray of Washington state warned of the risks to privacy.

“In the last few months we’ve seen serious security breaches impacting families’ personal health information, and that’s unacceptable,” she said. “We need to be aware that data is being created that cybercriminals will want to exploit, and that means we will need to develop a strategy to protect privacy that meets today’s challenges.”

Collins responded that the White House, the NIH and the Office of the National Coordinator for Health Information Technology were all “deeply serious” about protecting the data of the initiative’s volunteers.

He told “Fault Lines” that as a first step, he would de-identify the data, or remove any obvious identifiers — like names, addresses, Social Security numbers and dates of birth — from DNA samples and other health information.
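As a rough illustration of that first step, the sketch below drops the obvious identifiers from a single record. The field names and the sample record are hypothetical, not taken from the initiative, and removing these fields alone does not rule out re-identification from the data that remains.

```python
# Hypothetical field names for the "obvious identifiers" listed above.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "date_of_birth"}


def deidentify(record):
    """Return a copy of the record without the direct-identifier fields."""
    return {key: value for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS}


# Invented example record, for illustration only.
participant = {
    "name": "Jane Doe",
    "address": "1 Example St",
    "ssn": "000-00-0000",
    "date_of_birth": "1970-01-01",
    "genome_file": "sample_0001.vcf",
    "step_count": 8200,
}

research_record = deidentify(participant)
print(sorted(research_record))
```

Only the genome file reference and the activity data survive in the research copy; the original record is left unchanged.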

But the lack of a clear plan beyond that is troubling to some privacy advocates.

“I’ve heard almost nothing from any of the proponents of the Precision Medicine Initiative as to what structures they’re going to construct and put in place to protect this information,” said Jeremy Gruber, the president and executive director of the Council for Responsible Genetics. “I’m sure they’ll make all kinds of promises that they are appreciative of the issue and will address it.”

Details on security for the new effort may be scant, but medical centers all over the country are protecting sensitive information every day, largely successfully, noted Atul Butte. He heads the California Initiative to Advance Precision Medicine, which is starting two pilot research studies as a proof of principle for this approach to medicine.

“You could legitimately argue about how secure you can keep things on the Internet,” he said. “But would cybercriminals go after [genetic data] if there are banks with money storing their data online?”

Collins told “Fault Lines” that the Initiative aims to do the best it can to avoid obvious slip-ups that would leave volunteers’ data exposed.

“You can’t guarantee people that there’s no way that somebody — either accidentally or for nefarious purposes — might figure out who they are,” he said. “There’s no perfect way to deal with privacy in 2015.”

Concerns about data sharing

In the effort to build its cohort of 1 million, the Precision Medicine Initiative is likely to look for collaborators with information to marry into its database. Possible partners include the U.S. Department of Veterans Affairs, which is attempting to partially sequence 1 million veterans for its own research, and the extensive patient files at Kaiser Permanente of Northern California.

“We’re starting to see, very quickly in the last couple of years, particularly with these biobanks, an incredible effort to try to network now,” said Gruber. “So we’ve had the type of privacy protections that we’ve had not because any type of framework was in place to ensure them but simply because of the fact that these databases were separated.”

Sharing data is a key to the efficacy of precision medicine: the more data, the better when trying to find patterns and signals to develop possible treatments. But offering access to a data set can bring up issues of consent, with study participants not knowing what their information is being used for.

Take the case of 23andMe, a company in which Google is a major investor. Founded in 2006, 23andMe offered customers an analysis of their DNA that included assessments of their risk of developing various illnesses. In 2013 the U.S. Food and Drug Administration shut down that service.

In the past year, the company announced deals to share its database of 950,000 partial genome sequences with Pfizer and Genentech, among others, to study diseases. The moves prompted many critics, including Gruber, to accuse the company of misleading its customers by selling their data.

“More than 80 percent of 23andMe customers consent to research,” said a statement that the company provided to “Fault Lines.” “Ultimately, 23andMe customers own their data. Customers can decide to stop participating in research at any time and remove all of their information from our database.”

Hank Greely, a Stanford law professor who studies biomedical ethics, said that 23andMe customers might not have understood what they were agreeing to when they read the company’s consent form.

“It says, ‘I’m willing to let my data be shared for research into disease,’” he said. “It doesn’t say, ‘I’m willing to let you sell my data to Pfizer for $200 million.’”

When asked how he planned to avoid the sort of issues 23andMe had run into, Collins told “Fault Lines” that participants in the Precision Medicine Initiative would be treated as partners in the research and that they would have a say in deciding how much exposure they want their data to have.

He said it’s imperative that the program be flexible enough to accommodate people who will share anything if it will help others, as well as those who might later decide that they would like to limit the exposure their personal data gets.

“It’s going to be up to them to decide what’s an acceptable level of privacy and where are they willing to take risks,” said Collins. “And each individual, ideally, ought to be able to set that bar for themselves. Because we all differ.”

Nikhil Swaminathan, Al Jazeera

False positive highlights limitations of familial DNA searching

Michael Usry, a young New Orleans filmmaker, has a flair for the macabre. His award-winning output includes titles like “Murderabilia,” a dark production that spotlights the trade in collectibles related to real-life killings and other violent crimes.

Usry’s work can come off as gratuitously violent, as in a clip from the short film “Winding Down” in which his protagonist disembowels a corpse in a basement.

But for the Idaho Falls Police Department, his filmography seemed outright suspicious.

The authorities in that Idaho city targeted Usry last year as a suspect in the 1996 murder of Angie Dodge, a case that’s drawn national media attention amid claims that the wrong man was convicted.

The fatal stabbing remains steeped in mystery, as DNA from the crime scene failed to match any of the millions of genetic profiles on file in the national criminal database. Police still believe Dodge had multiple attackers, but traditional forensic testing has failed to identify the man whose semen was found on the 18-year-old’s body.

Investigators last year turned to a controversial technique known as familial searching, which seeks to identify the last name of potential suspects through a DNA analysis focusing on the Y chromosome. A promising “partial match” emerged between the semen sample and the genetic profile of Usry’s father, Michael Usry Sr. — a finding that excluded the father but strongly suggested one of his relatives had a hand in the young woman’s murder.

The results instantly breathed new life into a high-profile investigation in which Idaho Falls authorities have weathered intense criticism. But the story of how the police came to suspect the younger Usry and then eventually clear him of murder raises troubling questions about civil liberties amid the explosive — and increasingly commercial — growth of DNA testing.

The elder Usry, who lives outside Jackson, Mississippi, said his DNA entered the equation through a project, sponsored years ago by the Mormon church, in which members gave DNA samples to the Sorenson Molecular Genealogy Foundation, a nonprofit whose forensic assets have been acquired by Ancestry.com, the world’s largest for-profit genealogy company.

Ancestry.com received a court order last summer requiring it to reveal Usry’s name to the police, although it is listed as “protected” in the Sorenson Y-chromosome database, according to court records obtained by The New Orleans Advocate. Following this new lead, the police mapped out five generations of Usry’s family, narrowing their focus to three men.

Only one, the New Orleans filmmaker, fit the mold of a plausible suspect, according to an application for a search warrant. Usry, 36, had ties to Idaho, including two sisters who attended a private university about 25 miles from the crime scene.

In addition, some of Usry’s Facebook friends lived in the state. And in their search warrant application, police also mentioned Usry’s short films, which they said generally have “dealt with some sort of homicide or killings.”

“All of the circumstantial evidence was right,” said Sgt. James Hoffman, of the Idaho Falls Police Department. “He seemed like a really good candidate. But we’ve had that happen before.”

Days of suspense

Detectives traveled to New Orleans in December and persuaded a magistrate judge to sign a search warrant ordering Usry to provide his DNA for comparison. For about a month, Usry lived in a state of suspense, fearing he’d be taken into custody regardless of the test results.

“I had lots of days sitting at the house with the dog,” he recalled in an interview, “wondering if these guys were going to use a battering ram to bust open the door and shoot my dog after he started barking at them.”

On Jan. 13, Usry received the email he’d been awaiting. His DNA, Hoffman wrote, did not match the semen from the scene of Dodge’s murder.

“It turned out to be nothing,” Hoffman said in a telephone interview. “I wish it wasn’t a dead end, but it was.”

Several experts agreed the Idaho Falls police had probable cause to take a swab from Usry. But his experience touches on the delicate balance between the right to privacy and public safety.

“It’s not very common to see this sort of thing, and I frankly hope it doesn’t become very common because an awful lot of people won’t bother testing” their DNA, said Judy G. Russell, a genealogist and attorney who writes The Legal Genealogist blog.

Ancestry.com did not respond to questions about how frequently it receives court orders in criminal investigations and whether it ever resists law enforcement requests to learn the identity of otherwise protected genetic information.

Erin Murphy, a professor at the New York University School of Law who has written about familial searching, said Usry’s case is the first she’s seen in which law enforcement used a publicly accessible database like Sorenson, as opposed to a private law enforcement database, to obtain an investigative lead.

“I think what we’re looking at is a series of totally reasonable steps by law enforcement,” Murphy said. “But it has this really Orwellian state feeling to it, and it is a huge indictment of private genetic testing companies and the degree to which people seamlessly share that information online.”

A controversial tool

Familial searching differs from traditional DNA testing, a mainstream tool used to identify criminals. In familial searching, the number of partial matches — in which genetic profiles share several common “alleles,” or variant forms of genes — can be overwhelming.

In Usry’s case, Hoffman’s search of the Sorenson database yielded 41 partial matches. Though the name was initially shielded, the strongest match was Usry’s father, with 34 of 35 alleles, which the Sorenson Forensics Lab in Utah deemed “exceptionally good” and indicative of a close family relation to the suspect.
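A simplified way to picture a partial match such as “34 of 35 alleles” is to compare two profiles marker by marker and count the positions that agree. The sketch below assumes each profile is a list of numeric allele values in a fixed marker order. The profiles, names and threshold are invented for illustration, and real forensic comparison relies on more careful statistics than a raw count.

```python
def shared_allele_count(profile_a, profile_b):
    """Count marker positions where both profiles carry the same allele value."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same markers")
    return sum(1 for a, b in zip(profile_a, profile_b) if a == b)


def rank_partial_matches(evidence, database, min_shared):
    """Return (name, shared count) pairs at or above min_shared, best first."""
    matches = []
    for name, profile in database.items():
        shared = shared_allele_count(evidence, profile)
        if shared >= min_shared:
            matches.append((name, shared))
    matches.sort(key=lambda item: item[1], reverse=True)
    return matches


# Invented 35-marker profiles, for illustration only.
evidence = list(range(10, 45))
close_relative = list(evidence)
close_relative[7] += 1                      # differs at exactly one marker
unrelated = [value + 2 for value in evidence]  # differs at every marker

database = {"profile_A": close_relative, "profile_B": unrelated}
print(rank_partial_matches(evidence, database, min_shared=30))
```

Lowering the hypothetical `min_shared` threshold lets more distant profiles into the list, which is one way the number of partial matches can become overwhelming.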

Familial searching in law enforcement dates back over a decade to Great Britain, which employed the technique in the 2003 case of Craig Harman, who threw a brick through the windshield of a passing vehicle, causing the driver to suffer a fatal heart attack. British authorities could find no direct match between the DNA found on the brick — Harman’s blood — and a national database of criminal defendants.

But they found two dozen similar genetic profiles belonging to people from the surrounding region, including Harman’s brother, who represented the closest partial match to the blood sample. Further probing led investigators to Harman, who reportedly gave his DNA and confessed to the crime in the face of a match.

In the United States, the most notable use of familial searching was the case of the notorious Grim Sleeper. An alleged serial killer, Lonnie Franklin, was indicted in 2011 on 10 counts of murder in Los Angeles after authorities found similarities between crime scene evidence and the DNA of Franklin’s son, who recently had been jailed on a weapons charge.

The public debate largely has centered on the use of law enforcement databases like the FBI’s Combined DNA Index System.

Proponents argue familial searching is a harmless way for police to crack otherwise unsolvable cases. The closest partial matches can steer investigators toward a criminal’s family members, whose DNA profiles closely resemble those of a convicted or incarcerated relative.

Skeptics like Murphy, the NYU law professor, warn that the technique drastically expands DNA testing beyond the function envisioned by states that compel criminal defendants to submit DNA samples upon arrest. Many states lack formal legal rules governing the use of familial searching by law enforcement, while Maryland has explicitly outlawed the practice.

In Louisiana, familial searching by law enforcement appears to be exceedingly rare. The Louisiana State Police Crime Lab does not conduct that method of testing, said Capt. Doug Cain, a State Police spokesman.

“It’s not been a bone of contention here as it has in other states,” said Kevin Ardoin, director of the Acadiana Crime Lab. “I don’t believe we have any restrictions on it, though, probably because it’s not been pursued or challenged.”

Usry’s experience, Murphy noted, differs from a familial search involving a law enforcement database, in that Usry’s father voluntarily submitted his DNA as part of a genealogical project. But, she added, “that doesn’t mean that many people wouldn’t find it surprising — and troubling — to know that a DNA swab they sent in for recreational purposes” could years later land their child in an interrogation room.

“I think this case puts into stark relief some of the critical privacy concerns surrounding consumer genetic testing companies, public repositories of genetic information, DNA familial search practices and the general degree to which law enforcement should be able to access genetic information,” Murphy said.

A cloud of suspicion

The call came in early December, as Usry was visiting his parents in Mississippi. The caller identified himself as a law enforcement official in New Orleans and said he was investigating a hit-and-run in City Park.

“I didn’t take it as a big deal,” Usry said. “He said he wanted to get together, ask a couple of questions and take a couple pictures of my car.”

Knowing he had nothing to hide, Usry agreed to meet with the officer. After he returned to New Orleans, three lawmen arrived at his door, including two detectives from Idaho Falls. The other, Usry said, was a federal agent.

The officers led Usry to the 13th floor “of some building by the Superdome, an interrogation room with a one-way mirror.”

“They told me that it was not, in fact, about a hit-and-run,” Usry said. “They wanted to know about my travels to Idaho.”

Usry piqued the officers’ interest, he said, when he recounted a ski trip he’d taken when he was about 19 with three friends. While Usry couldn’t recall ever setting foot in Idaho Falls, his interrogators determined he must have passed through the city when driving to Rexburg, Idaho, to pick up a friend.

At the end of the interview, Usry said, the federal agent entered the small room with “gloves on, ready to go.” It finally occurred to him that he might want to consult a lawyer when the lawmen hovered over him and extended a long Q-tip into his mouth.

“It really took me until almost the end of the entire experience to realize that they were saying, ‘We think that you, Michael Usry, are a suspect in this murder,’ ” he said.

When the officers dropped Usry off at home, he said, “I just kind of stood on my sidewalk dazed for a few minutes.”

He called a friend, a co-producer of the short film “Murderabilia.”

“You’re not going to believe what just happened to me,” Usry told him.

It took his friend about 30 seconds to search “Idaho Falls cold case murder” online and determine the interrogation was related to the Dodge killing, he said.

Coming to terms

In the weeks since the questioning, Usry said, he’s wondered whether someone else in his family — perhaps a cousin or distant relative he hasn’t even met — had something to do with the killing.

Colleen M. Fitzpatrick, a well-known forensic genealogist, said a partial match of 34 of 35 alleles is “very close to a 100 percent” indication that the donor of the semen is an Usry. She suggested the authorities take a closer look at the family lineage.

“There’s still a small percent chance that it’s not an Usry due to an adoption or illegitimate” child in the family lineage, she noted.

Greg Hampikian, a professor at Boise State University who has worked on the Dodge case, also seemed impressed by the partial match. According to the search warrant application, he told police that Usry’s father “would probably be within three or four generations” of the unidentified suspect.

Asked whether police thought the semen donor might be someone else in Usry’s family, Hoffman, the Idaho Falls sergeant, said, “Not that I can see so far. We’re still investigating everything we can.”

Usry said he regarded his experience as an invasion of privacy, made even worse because his father’s DNA had been submitted during a church-sponsored event. But it also has emboldened Usry to try to find out the truth about the Dodge case — in the best way he knows how.

He is working on a documentary and researching the well-publicized case of Christopher Tapp, whose conviction in Dodge’s killing has drawn widespread criticism.

“I feel a connection to this case because I’ve been forced to be involved in it,” Usry said. “The guy that really killed Angie Dodge deserves to be put in jail.”

Jim Mustian, New Orleans Advocate

Seizing DNA In Vermont

Say this for the Vermont Legislature: Sometimes it just won’t take “no” for an answer, even when it should know better.

In 2009, the Legislature passed a law extending the requirement to provide DNA samples from only those convicted of felonies to those merely charged with them. Last summer, the Vermont Supreme Court overturned that provision, holding it to be a violation of the Vermont Constitution’s protections against unreasonable searches and seizures. Thankfully, the presumption of innocence still counts for something in Vermont.

Now Sen. Richard Sears, D-Bennington, is back at it with new legislation that would extend the DNA sampling requirement to people convicted of misdemeanors that carry prison time as a potential punishment.

“The more people you have in the database, the easier it is to eliminate suspects and to zero in on the right suspect,” says Sears. That’s true, and under that theory, the state should require all newborn Vermonters to submit DNA samples in the event that they grow up to be serial killers.

But this bill is flawed in so many ways that perhaps it won’t come to that. First, it casts an exceptionally wide net; an estimated 4,600 people would be added to the DNA database in the first year alone. Thus thousands of Vermonters who have committed only minor offenses would be required to provide to the state what the Vermont Supreme Court has called “a massive amount of unique, private information” that goes beyond mere identification. That’s a disproportionate imposition on the privacy of someone who may have done nothing more sinister than wander onto posted land. And if you think that this is not a matter of concern to you, consider that by some estimates, 70 percent of adult Americans have, often unintentionally, committed a crime for which they could be imprisoned.

Second, entrusting this vast new amount of personal information to the management of the state of Vermont is folly. The folks who brought us the ace Vermont Health Connect website have demonstrated that beyond a reasonable doubt. If the bill passes, maybe Anthem will be hired as a consultant to provide advice on how to keep sensitive personal information secure from misuse.

Third, according to Vermont Public Radio, the annual costs of complying with the legislation would exceed $142,000 for laboratory tests alone, at a time when the Department of Public Safety is looking to cut more than a million dollars from next year’s budget because of reductions in revenue estimates that are forcing retrenchment throughout state government. If it’s crime this bill seeks to combat, a better use of that money would be to shore up the prison program under which inmates earn high school degrees and get job training — something the Shumlin administration is proposing to shrink.

This legislation is yet another example of technological capability outrunning thoughtful consideration of privacy costs and a failure to anticipate further advances that might pose even thornier problems. Suppose scientists somehow discerned a genetic pattern that in some cases prefigures criminal activity of a certain kind. With all these DNA samples in hand resulting from minor offenses, would the state be tempted to subject those who fit the profile to pre-emptive monitoring in the effort to prevent future crimes? Sounds far-fetched, but so did a lot of other scary invasions of privacy that have become reality in recent years.

Valley News editorial

CRG in the News - Millions of DNA samples stored in warehouse worry privacy advocates

Privacy advocates are calling for more safeguards related to a state collection of DNA samples from 16 million Californians in a nondescript government warehouse in the Bay Area.

The biobank holds blood taken with the prick of a heel from almost every baby born in California for the last three decades. It is used to screen for 80 health disorders, such as cystic fibrosis and sickle cell anemia.

Unlike most states, California keeps the frozen samples indefinitely and shares them with genetic researchers, for a fee.

State officials say the samples are secure and are used to save lives. But the privacy advocates and an influential state lawmaker, concerned about the potential misuse of DNA information, say parents and donors should have a clear choice about whether the state can keep theirs.

“Throughout the process, from the point of screening to the point of storage to the point of third-party use, public understanding, knowledge and consent is almost completely” absent, said Jeremy Gruber, president of the nonprofit Council for Responsible Genetics.

The blood samples are stored on special paper cards without names — just numbers that can be used to find identifying information stored separately, according to officials. Names are not provided to researchers.

But Assemblyman Mike Gatto (D-Glendale), chairman of a new committee created to address privacy issues in a world where technology is seen as outpacing the law, says that “whenever data is stored, data can fall into the wrong hands.”

“Imagine the discrimination a person might face if their HIV status or genetic predisposition to a mental disorder were revealed to the public,” he said.

The lawmaker wants the state to get written consent from parents before storing children’s blood samples indefinitely and allowing their use in research after the initial screening. Parents can already opt out if they do so in writing, but Gatto said many don’t realize it.

He wants to require that the state get them to opt in, to make sure parents consider the issue. Gatto said he plans to amend such a proposal into a bill he has introduced, AB 170, that would allow the blood donors, when they turn 18, to have their samples destroyed.

He said he may also include stiff financial penalties against researchers if DNA information in their possession is breached or leaked.

One researcher who has used the blood samples to make a breakthrough in public health said requiring parental consent would be “very damaging” to work like hers.

“The parents who don’t suspect anything is wrong with their kids are going to say no, because they are not going to understand” how research might help their children or others later, said Jennifer M. Puck, a professor of immunology in the Department of Pediatrics at UC San Francisco.

Puck developed a test for severe combined immunodeficiency, or “bubble boy” disease, using blood samples from 20 newborns who later developed the malady. That test has been used since 2010 to identify dozens of infants who could ultimately lose their immune systems without treatment, Puck said.

California’s biobank “has been exceedingly useful in my clinical work and my research,” she said.

All 50 states collect and test blood samples from newborns. California is one of 20 that keep blood samples after screening. It has samples dating to 1982 and adds to the collection with every birth.

California is one of seven states that release samples for research without parental consent and one of four that charge a fee for their release, according to a survey published recently in the medical journal Pediatrics.

The state Department of Public Health collected $71,000 in processing fees and approved six research requests in the current fiscal year, according to Scott Sandow, a spokesman for the agency.

A 16-page brochure given to all expectant parents when they arrive at the hospital says they may decline to have their child’s samples stored after the initial screening by sending a written request to the state, Sandow said.

Gatto, who with his wife has been through the childbirth experience twice in the last four years, says he does not remember anything said about blood samples kept indefinitely and used in research.

“Obviously, when that blood is taken, it’s a very emotional time, and it’s a very fast-moving time,” Gatto said. At the births of both of his daughters, he and his wife were absorbed in “arguing about what we were going to name them.”

Gatto said he has spoken to hundreds of parents who were not aware that their children’s DNA was kept by the state.

“You assume they will test your child for the disease and then burn the [sample] as medical waste,” Gatto said.

Gatto envisions a notice that parents would receive well before entering the hospital to give birth, with a box they could check to allow storage of and research with their child’s blood after screening.

Robert Nussbaum, chief of the Division of Medical Genetics at UCSF, said that there is no such thing as a guarantee of absolute security but that new restrictions are unnecessary. The state program is doing a good job of keeping the information stored in the samples confidential, he said.

The program’s benefits to healthcare, he said, “outweigh the risks” of any data compromise.

How drug companies will mine your genes

Imagine a world where genetic sequencing is free, like Gmail. That’s where we’re headed. Genetic data is going the way the rest of our data has gone on the web. Companies will mine it, repackage it, and find a way to make money off it.

For eight years, personal genomics company 23andMe has been giving consumers access to their genes. It started off costing $999. Today it’s $99. In 2012, the company allowed consumers to share their data with third-party apps, the way you might link your Facebook profile to your Hulu account.

And this week, what could be seen as another step toward the full webification of our DNA went down, thanks to a new deal between 23andMe and Genentech, one of its early investors.

The Deal

To start, Genentech will pay to get access to the genetic and health information 23andMe has amassed from more than 10,000 of its customers with Parkinson’s disease, a disorder that affects movement. Genentech scientists will be able to see what medications they’re on, what other conditions they have, the symptoms they experience, along with whether they have tweaks in their DNA that put them at risk for developing the disease. (All these customers previously consented to be part of research by 23andMe and its partners.)

But the deal goes further. 23andMe will help Genentech identify approximately 3,000 customers who want to participate in more nuanced research.

23andMe only analyzes snippets of your DNA. Genentech needs the whole genome. So it’s going to pay to sequence those customers’ genomes, if they agree. Whether or not the patients will get that information back is still being worked out, 23andMe’s Emily Drabant Conley, who brokered the Genentech partnership, told me.

“We’re now starting to do [studies] at this really unprecedented scale,” she added. “Our database has hit a critical mass. These people can be recontacted. They’re engaged. The other thing that’s happening is pharma is seeing the value of bringing in genetics into the R&D pipeline earlier.”

You can see how this might evolve.

“With 23andMe,” wrote Lisa Miller in New York magazine last April, “[CEO Anne Wojcicki] wants to do with DNA what Google did for data, because, after all, DNA is data.” And what Google and other web giants have done is to create powerful platforms through which other companies can sell us stuff.

This vision of Googlized genetics thrives only in a world with minimal hurdles to getting data. It might take a decade or more, but eventually the cost will get low enough to allow companies to offer full genome sequencing for free, at massive scales.

“I would love that. It would be great…We want to have a product that’s accessible,” said 23andMe’s Drabant Conley. “I don’t foresee that happening in the near future. There are real costs involved.”

But when cost does hit that critical minimum, the true democratization of genetics will unfurl. It will just be something very different from what we were sold on.

The Promise

Eight years ago, when 23andMe first launched, it wooed technophiles with the idea that genetic information shouldn’t be locked away in a lab. It belonged to you. We had the right to spit in a tube and find out what diseases we might be at risk for, bureaucracy be damned.

Geneticist Misha Angrist, a senior fellow in Science & Policy at Duke University’s Social Science Research Institute, was one of the people who bought into that vision. He sent his DNA off to be analyzed by 23andMe and other companies that offered direct-to-consumer (DTC) genetic tests. This was the future.

Until it wasn’t.

Reports started coming out that genetic tests offered by different companies served up different interpretations of people’s genes. Sometimes, they were misleading. Genome-wide association studies, the science upon which DTC genetics companies were built, started coming under fire. How much could you really tell from looking at individual mutations without analyzing the whole genome? Regulators started to notice. Then last year, the U.S. government slapped 23andMe with a cease-and-desist letter ordering it to stop marketing its spit-box genetic test as a health report. That reduced 23andMe, if only temporarily, to a glorified ancestry service. The over-the-counter genetic testing industry, as we imagined it, seemed in jeopardy.

And then, this Genentech deal.

“I kind of think of this as a sad day. I think it’s sort of the final confirmation that direct-to-consumer genetics is not a viable business,” said Angrist. “I think this day was probably inevitable. It’s hard for me to feel betrayed. And you know, if my genotype contributes to a blockbuster drug…and it improves people’s lives, I can’t be an asshole and say, ‘Boy that’s terrible!’, but given the rhetoric and the marketing of DTC genetics, which again, I admittedly bought into and wanted to buy into (I wanted it to be true), it’s hard for me to stand on the sidelines and say, ‘Yay! Rah-rah! Go Genentech.’”

That rhetoric harkened back a bit to the days of the early internet, when we learned to equate the sound of beeping modems to a superhighway of information that would empower us to be better. The information was seemingly accessible to all, for a fraction of the price it might have cost to access in the real world. That democratization morphed into something beyond our expectations, too.

“Just as it became clear that people didn’t want to pay money to AOL or Prodigy for email, Google found a way to give email away and derive ad revenue from somewhere else,” said Angrist. “It’s all the same business model,” a model in which you are the product. This internet-age adage is true, especially for DNA.

The new Genentech deal is only the beginning. Reset Therapeutics put out a press release saying it was going to use 23andMe’s massive database to study diseases related to circadian rhythms, the internal clocks that dictate when we sleep and eat. (Whether Reset will also be sponsoring its own genetic sequencing wasn’t clear.) And there are eight other deals coming, the details of which will trickle out in coming weeks. All the company would say for now is that they span a wide range of diseases and conditions.

As disappointing as the news may be to some, the strategy isn’t all bad. It could really open up new areas of research, in ways that current methods simply can’t.

“This is very exciting. An integrated, enthusiastic cohort of 800,000 is far beyond academic capacity,” said George Church, one of the leaders of the Human Genome Project, in an email. (He’s also an advisor to 23andMe and was on the advisory board of a company recently acquired by Roche, Genentech’s parent company.) “I expect that the 23andMe cohort will continue to grow and be increasingly targeted to diseases of interest to pharma.”

Say Genentech wants to screen compounds for a new heart-disease drug it suspects might only work on a subset of patients with a certain genetic makeup. Its scientists can query 23andMe’s database, come up with potential candidates, and recontact them to sequence them. What’s more, using new technologies like organoid chips, they can take skin cells from these patients, reprogram them, and build up tiny tissue-specific systems on which they can test their compounds. The results could give them a good indication of which compounds might be toxic or work in individual patients. They could run multiple experiments at once, potentially speeding R&D up significantly.

For pharma companies, it’s a no-brainer, especially with 23andMe customers at the ready, willing to supply more data in the hopes of finding a cure for the condition that ails them or their relatives.

As 23andMe co-founder Linda Avey put it on Twitter:

That economic value, though, will accrue to 23andMe.

The company’s terms of service clearly state that “by providing any sample…you acquire no rights in any research or commercial products that may be developed by 23andMe or its collaborating partners.”

Daniela Hernandez, FUSION

Geneticists Begin Tests of an Internet for DNA

A coalition of geneticists and computer programmers calling itself the Global Alliance for Genomics and Health is developing protocols for exchanging DNA information across the Internet. The researchers hope their work could be as important to medical science as HTTP, the protocol created by Tim Berners-Lee in 1989, was to the Web.

One of the group’s first demonstration projects is a simple search engine that combs through the DNA letters of thousands of human genomes stored at nine locations, including Google’s server farms and the University of Leicester, in the U.K. According to the group, which includes key players in the Human Genome Project, the search engine is the start of a kind of Internet of DNA that may eventually link millions of genomes together.

The technologies being developed are application program interfaces, or APIs, that let different gene databases communicate. Pooling information could speed discoveries about what genes do and help doctors diagnose rare birth defects by matching children with suspected gene mutations to others who are known to have them.

The alliance was conceived two years ago at a meeting in New York of 50 scientists who were concerned that genome data was trapped in private databases, tied down by legal consent agreements with patients, limited by privacy rules, or jealously controlled by scientists to further their own scientific work. It styles itself after the World Wide Web Consortium, or W3C, a body that oversees standards for the Web.

“It’s creating the Internet language to exchange genetic information,” says David Haussler, scientific director of the genome institute at the University of California, Santa Cruz, who is one of the group’s leaders.

The group began releasing software this year. Its hope—as yet largely unrealized—is that any scientist will be able to ask questions about genome data possessed by other laboratories, without running afoul of technical barriers or privacy rules.

The researchers felt they had to act because the falling cost of decoding a genome—then about $10,000, and now already closer to $2,000—was producing a flood of data they were not prepared for. They feared ending up like U.S. hospitals, with electronic systems that are mostly balkanized and unable to communicate.

The way genomic data is siloed is becoming a problem because geneticists need access to ever larger populations. They use DNA information from as many as 100,000 volunteers to search for genes related to schizophrenia, diabetes, and other common diseases. Yet even these quantities of data are no longer seen as large enough to drive discovery. “You are going to need millions of genomes,” says David Altshuler, deputy director of the Broad Institute in Cambridge and chairman of the new organization. And no single database is that big.

The Global Alliance thinks the answer is a network that would open the various databases to limited digital searches by other scientists. Using that concept, says Heidi Rehm, a Harvard Medical School geneticist, the alliance is already working on linking together some of the world’s largest databases of information about the breast cancer genes BRCA1 and BRCA2, as well as nine currently isolated databases containing data about genes that cause rare childhood diseases.

In March, the group launched a test of whether scientific organizations would be willing to share data. A product called Beacon lets the owner of a database open it up for strictly limited searches.

“We are not trying to invent a technical feat; it’s breaking down this problem of people not sharing data,” says Marc Fiume, a computer science graduate student at the University of Toronto who built part of the interface. “This lets you probe, but without identifying anyone or violating patient privacy.”
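To make the idea of a strictly limited search concrete, here is a minimal Python sketch of a Beacon-style lookup. It assumes, hypothetically, that the database answers only a yes-or-no question about whether any stored genome carries a given DNA letter at a given position. The class name, the data layout and the example coordinates are illustrative and are not taken from the alliance's actual software.

```python
from dataclasses import dataclass, field


@dataclass
class Beacon:
    """Hypothetical Beacon-style service: answers only yes or no."""
    # Maps (chromosome, position) to the set of DNA letters seen at that
    # site in any stored genome. No per-person record is kept here.
    observed: dict = field(default_factory=dict)

    def add_genome(self, variants):
        """Fold one genome's (chromosome, position, letter) calls into the index."""
        for chromosome, position, letter in variants:
            self.observed.setdefault((chromosome, position), set()).add(letter)

    def has_allele(self, chromosome, position, letter):
        """Return True if any stored genome has this letter at this site."""
        return letter in self.observed.get((chromosome, position), set())


# Made-up coordinates and letters, for illustration only.
beacon = Beacon()
beacon.add_genome([("13", 32900000, "A"), ("17", 41200000, "G")])
beacon.add_genome([("13", 32900000, "C")])

print(beacon.has_allele("13", 32900000, "C"))  # True
print(beacon.has_allele("17", 41200000, "T"))  # False
```

In this sketch the caller learns only whether a match exists somewhere in the database, not how many genomes match or whose they are.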

So far, 15 databases are compatible with Beacon, which Fiume rates a reasonable success. Three are stores of public genomes that Google maintains a copy of, and one is at a software company called Curoverse in Boston.

Haussler says a future protocol would offer access to progressively more data, but in a controlled way. Scientists would have to register, or even sign legal agreements. “If it’s ‘Give me the whole genome,’ you’d enter a contract for that,” he says.

One change the alliance is pushing is a new type of master consent form, the document that lays out volunteers’ rights when they hand over their genomes. The new consent is broader than most, giving permission for “controlled access” by “researchers around the world.” It promises that no researcher will identify a participant, although since DNA is unique, like a fingerprint, there would be no guarantees.

Like the W3C, the Global Alliance has “host institutions” that pay its bills. So far, they are the Broad Institute, the Wellcome Trust Sanger Institute in the U.K., and the Ontario Institute for Cancer Research, according to Altshuler, who declined to say how much money each had contributed.

John Wilbanks, founder of the nonprofit Sage Bionetworks, is working with the alliance and is also a former member of the W3C. He says the alliance has a harder task than the W3C did. “The Web existed long before the Web Consortium did. That is the big difference,” he says. “The Web got traction, and the consortium was created to manage it. They didn’t have to create the Web.”

Antonio Regalado, MIT Technology Review

Google Wants Your DNA: Are You Willing to be a Project in the Cloud?

For the past 18 months, Google has quietly been approaching hospitals and universities to acquire genome data in an effort to roll out a cloud computing service for DNA, according to Technology Review.

Google Genomics is the search giant’s first product for the DNA age, providing an API to store, process, explore and share DNA sequence reads, reference-based alignments, and variant calls, using Google’s cloud infrastructure.

For $25 a year, Google will host a copy of genome sequences in the cloud.

While genetic databases already exist online, Google Genomics is the latest and most ambitious iteration. Genealogy databases for finding ancestors and public genetic databases run by national research centers, while impressive and useful, have nothing on the DNA storage service.

Connecting and comparing genomes by the thousands, and soon by the millions, will propel medical discoveries for the next decade. Among Google, IBM, Microsoft and Amazon, the question of who will store the data is already a point of growing competition.

“We saw biologists moving from studying one genome at a time to studying millions,” David Glazer, the software engineer who led the effort, told Technology Review. “The opportunity is how to apply breakthroughs in data technology to help with this transition.”

Why Google Genomics is Important

The collection of data is vastly increasing in labs all over the world as faster equipment for decoding DNA is becoming more accessible. The Broad Institute in Cambridge, Massachusetts, reported that during the month of October it decoded the equivalent of one human genome every 32 minutes – roughly 200 terabytes of raw data.

This flow of data exceeds what biologists have previously handled, prompting the effort to store and access data at a central point. (To put this in perspective, in just over two months the Broad Institute will produce the equivalent of the amount of material that gets uploaded to YouTube in one day.)
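The Broad Institute figures above can be checked with some quick arithmetic. The Python sketch below assumes that the sequencers ran around the clock for all 31 days of October and that 1 terabyte equals 1,000 gigabytes; both are assumptions made for illustration, not numbers reported by the institute.

```python
MINUTES_PER_GENOME = 32   # from the article
RAW_DATA_TB = 200         # from the article, for the month of October
DAYS_IN_OCTOBER = 31      # assumption: sequencing ran all month without pause

minutes_in_month = DAYS_IN_OCTOBER * 24 * 60
genomes_per_month = minutes_in_month / MINUTES_PER_GENOME
gb_per_genome = RAW_DATA_TB * 1000 / genomes_per_month  # assumes 1 TB = 1,000 GB

print(f"{genomes_per_month:.0f} genome equivalents in the month")
print(f"{gb_per_genome:.0f} GB of raw data per genome equivalent")
```

Under those assumptions the month works out to about 1,395 genome equivalents, or roughly 143 gigabytes of raw data apiece.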

The National Cancer Institute said in October that it would pay $19 million to move copies of the 2.6 petabyte Cancer Genome Atlas into the cloud. Copies of the data will reside at both Google Genomics and in Amazon’s data centers.

The Future of Medical Discoveries

Without comparing genome sequences, it is tough for researchers to determine what is a mutation and what is not. With a database that houses thousands of genomes, the chances of pinpointing inconsistencies become much higher.

A database such as Google Genomics can serve as a search catalogue for doctors to determine the best treatment options for a patient.

“Our bird’s eye view is that if I were to get lung cancer in the future, doctors are going to sequence my genome and my tumor’s genome, and then query them against a database of 50 million other genomes,” said Deniz Kural, CEO of Seven Bridges, which stores genome data on behalf of 1,600 researchers in Amazon’s cloud. “The result will be ‘Hey, here’s the drug that will work best for you.’”

Solving the Privacy Issues

With big data come big privacy issues. Genome databases have to carefully calibrate how much information they provide alongside DNA sequences. The more information a database provides, such as age, sex, location and dietary habits, the more useful it is to researchers, but the easier it becomes to identify whom the genome belongs to.
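A small Python sketch can illustrate that trade-off. It uses made-up records and a hypothetical helper, `unique_fraction`, to count how many entries become one of a kind as more descriptive fields are released alongside the DNA. The data and field names are invented for illustration only.

```python
from collections import Counter

# Invented metadata records; no real people or databases are represented.
records = [
    {"age": 34, "sex": "F", "state": "UT"},
    {"age": 34, "sex": "F", "state": "CA"},
    {"age": 34, "sex": "M", "state": "UT"},
    {"age": 61, "sex": "M", "state": "CA"},
    {"age": 61, "sex": "M", "state": "CA"},
    {"age": 61, "sex": "F", "state": "UT"},
]


def unique_fraction(rows, fields):
    """Share of rows whose combination of `fields` appears exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


for fields in (["age"], ["age", "sex"], ["age", "sex", "state"]):
    print(fields, round(unique_fraction(records, fields), 2))
```

In this toy data no record is unique by age alone, a third are unique by age and sex, and two thirds are unique once the state is added.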

A study in Science last year was able to identify several men from the publicly available 1000 Genomes Project based on their Y chromosomes and age, location and family tree data. While Google Genomics’ data is geared towards researchers rather than the general public, the wide accessibility of this information leaves the privacy matter open.

Additionally, what if researchers who are studying a patient’s genome for cancer come across information that reveals a newly discovered rare disease, or that the patient has an unknown sibling? Do they tell the patient?

While these privacy worries aren’t unique to Google Genomics, the sheer magnitude of the project magnifies the potential problems. According to Gizmodo, researchers have advocated for central genomic data centers to standardize privacy policies. Once these privacy concerns are reckoned with, Google Genomics has the capability to succeed where others haven’t.

According to Technology Review, at least 3,500 genomes from public projects are already stored on Google’s servers.

Stephanie Ocano, Healthcare Global

Taking your Genome to the Bank

What’s more valuable than your money, equally vulnerable, and unique to you? Answer: Your genome. And just like your money, your genome should be stored as securely as possible, and those institutions that store your genome should be regulated on how they store it, use it, and potentially share it.

As medical science advances, it’s going to be increasingly important for people to be able to control and manage access to their personal genomes. To make this possible, we need to establish a formal, well-regulated system of genome banking. Just as the government regulates the banks that hold our money, we must also have the government, or an equivalent group or system, govern how institutions manage our genomic data. Only by guaranteeing the security and proper use of that information will we be able to exploit the full potential of the growing pool of genomic data for the betterment of the individual and of mankind.

Everyone’s genomic data, after all, is potentially life saving and life changing. We’ve long known that each of us has a unique string of three billion or so tiny molecules linked together in our own genetic code. That code governs much of our health and well being. It dictates the color of our eyes, how tall we can grow, our relative risk of developing cancer, and much more. Your genome also has big implications for your children and other members of your family. If a family member develops an inheritable disease such as breast cancer or Huntington’s, the information in your genome could be crucial to determine if other family members are also at risk. Think of this as the estate you pass on to your heirs.

Craig Venter was recently quoted in a Businessweek article saying: “It’s going to be important to know what the variant is you got from your mother and from your father, and whether that correlates with 30 other variants across the genome that are associated with susceptibility for a certain type of cancer, for example.”

Genomic data is also becoming increasingly important in medical research. Studies of Alzheimer’s suggest that one reason so many clinical trials have failed so far is that the studies need to be done in people with earlier-stage disease, which is currently very challenging to detect. In the future, it’s likely that people whose relatives have the disease will be able to get a simple blood test that will help estimate their own risk. Additional studies will then be used to select optimal candidates for early-stage trials of new Alzheimer’s drugs.

But there’s also a dark side to genome sequencing. Hackers can sneak into databases and determine the owner of an “anonymous” genome using DNA identifiers. They can also uncover previously unrecognized sensitive information about someone’s genome, or unmask areas that researchers have attempted to keep from public view.

In probably the most famous example of this, James Watson (the co-discoverer of the DNA double helix) made almost his entire genome public in 2007. One gene—APOE, which helps predict risk of Alzheimer’s disease—had been masked before Watson’s genome was published. However, several geneticists said they could tell whether or not he carried the gene, based on other mutations that are commonly inherited with APOE. No one publicized his APOE status, but it became clear that publishing your genome could entail risk to your privacy.

Then in January 2013, a researcher at Massachusetts’ world-renowned Whitehead Institute tracked down five people whom he selected at random from a DNA database. Using just their DNA, ages, and the states where they lived, in a matter of hours he identified the five as well as some of their relatives.

Such stories are unnerving to many, who worry that insurance companies or employers will use genetic information to discriminate against them. Even though such discrimination is technically against the law, the fear is widespread.

Some experts claim it’s time to simply accept that we are in a genomic era, and that will entail some risk to privacy. They are encouraging people to donate their DNA to large public databases so that we can more rapidly advance genomic testing and diagnosis. But for those who are not comfortable sharing their data (whether with public or private unregulated entities) there should be an alternative that doesn’t involve simply trusting that our data will be safe. We should be able to have a guarantee that our genomes will be safe and managed according to standards. The same choice already exists for money: you can store your money in your mattress, or you can choose to store it in a bank that is regulated and must abide by guidelines.

The system I am suggesting is one that would set up firm rules for how genomic data is stored, used, and transferred or loaned between institutions, just as we have rules for transferring money. Once that system and those rules are in place and have been widely communicated, more people will trust the system. With greater confidence in the system, more people will start to participate and we can finally enter a truly genomic era that can profoundly affect the future of humanity. My challenge to lawmakers and policy makers is to look at this area not as something that is managed by researchers but as something of significant value to society that affects every human being on earth. And if that is believed to be true, then this valuable asset should be rightfully standardized, regulated and managed to create a system that benefits all individuals both here and abroad.

Harry Glorikian, GEN

Mandatory DNA collection during arrest is unconstitutional, court says

A state appeals court decided unanimously Wednesday that California’s practice of taking DNA from people arrested for felonies — though not necessarily convicted or even charged — violates the state constitution.

The decision, handed down by an appeals panel here, is likely to be appealed to the California Supreme Court.

A three-judge panel of the First District Court of Appeal struck down a portion of a 2004 law passed by voters permitting the state to take and store DNA profiles from people arrested for felonies.

The U.S. Supreme Court has upheld a more limited Maryland law under the federal Constitution.

But Wednesday’s decision was based on the California Constitution, which specifically gives residents privacy rights.

“The California DNA Act intrudes too quickly and too deeply into the privacy interests of arrestees,” Presiding Justice J. Anthony Kline wrote for the panel.

“The fact that DNA is collected and analyzed immediately after arrest means that some of the arrestees subjected to collection will never be charged, much less convicted, of any crime,” Kline wrote.

In 2012, 62% of people arrested on suspicion of felonies in California were ultimately not convicted, and almost 20% were never even charged, the court said.

Unlike Maryland, California does not require DNA profiles to be automatically expunged if a person is not convicted.

California stores the DNA profiles in a database. People who are not convicted or charged must apply to have their genetic profiles removed, and “the expungement process is neither quick nor guaranteed,” Kline said.

Arrestees must submit a request to the trial court and the county prosecutor as well as to the state Department of Justice’s DNA laboratory to have their DNA profiles removed from the database, and courts may deny the requests.

“California places the burden on the arrestee to pursue an onerous judicial process which seemingly vests the prosecutor with power to prevent expungement merely by objecting to the request,” Kline wrote.

A similar challenge to California’s law being heard in federal court was put on hold Wednesday as a result of the state court ruling.

A spokesman for state Atty. Gen. Kamala D. Harris, whose office has defended the DNA law, said lawyers were reviewing the ruling.

Harumi Mass, a senior staff attorney at the ACLU of Northern California, which filed a friend-of-the-court brief in the state case, praised the court’s decision for recognizing that “DNA is fundamentally different from a fingerprint.”

“People who have never been charged with a crime should not have their DNA put in a government database,” Mass said.

Maura Dolan, LA Times

In Search for Killer, DNA Sweep Exposes Intimate Family Secrets in Italy

The body of Yara Gambirasio, a 13-year-old student, was found on a chilly day in a quiet northern city in Lombardy in February 2011, three months after she was reported missing following a gymnastics practice. The autopsy showed she had been beaten but not sexually abused, stabbed several times and left to die in a field less than 10 miles from her home.

The autopsy also revealed stains on the young girl’s frayed clothing: a liquid that contained the DNA of a person whom investigators called “Unknown Male No. 1.”

In the absence of other evidence, investigators embarked on the country’s largest DNA dragnet, taking genetic samples from nearly 22,000 people.

Finally, last month, they said they had found their suspect, Massimo Giuseppe Bossetti, a 43-year-old father of three from Lombardy.

But what some praised as a triumph of modern science and 21st-century sleuthing has also set off a debate about the risks of privacy violations and the darkened corners of the past that such testing can expose. The DNA testing also revealed something unknown even to the suspect’s family: that he was the illegitimate son of a man who had died in 1999.

Geneticists arrived at this conclusion after narrowing the DNA search to one local family and a long-dead bus driver whose known sons did not match the genetic sample found on the slain girl’s body. The dead man’s DNA was first lifted from a stamp he had licked. Later, he was exhumed to carry out further tests.

When a match was made last month, and Mr. Bossetti was arrested, the ensuing media frenzy was set off in part by the details of what one newspaper described as a “genetic soap opera.”

Recent confidences on the part of the residents of small towns in Lombardy about a long-ago — and until now secret — affair eventually led investigators to the mother of “Unknown Male No. 1,” who presumably cuckolded her husband with a local acquaintance in the late 1960s.

Italy’s privacy watchdog eventually berated the media for leaking news of the alleged affair and for pursuing comment from its unsuspecting protagonists, notably the cuckolded husband and the wife, and the suspect’s known and newly discovered siblings.

A sharply worded statement by Italy’s privacy authority called on media outlets to use “maximum respect” in reporting personal details incidental to the case of the murdered girl. “Not even the public interest legitimizes the media frenzy over intimate personal details such as to create irreparable damage to family life and personal relations,” Antonello Soro, the chief of Italy’s Data Protection agency, said in a statement.

Investigators had initially concentrated their efforts on tracking down people whose mobile phones had been used near where the girl was abducted and killed. The families of classmates were tested, as were regulars at a nearby discothèque and the members of the sports center where Ms. Gambirasio had taken gym classes.

Investigators said they could have eventually tested all 120,000 phone owners intercepted in the area that night. “But I don’t know if we would have gotten them all,” said Letizia Ruggeri, the lead prosecutor in the case. “Luckily we didn’t have to.”

Despite the sheer number of people tracked down for DNA testing, the investigation raised few eyebrows in Italy. A court order can be issued in case someone refuses to be tested, “but I never needed one because everyone submitted to the test voluntarily,” Ms. Ruggeri said. She acknowledged the extraordinary scale of the case, both in costs and in manpower, noting that it had been a one-of-a-kind situation. “But there was no legal obstacle to our doing it,” Ms. Ruggeri said.

The previously unknown familial ties and their public exposure are the unfortunate “collateral effects” of a complicated case, said Carlo Previderé, a forensic geneticist at the University of Pavia, where hundreds of DNA samples — including swabs from some 500 women identified as potential birth mothers of the suspect — were tested. The women included former classmates, colleagues, shopkeepers, neighbors and clients who would have known the deceased bus driver in the late 1960s. The testing eventually identified the birth mother.

Professor Previderé’s lab positively identified Mr. Bossetti after testing DNA material taken during a breath analyzer test staged by investigators, who pretended to stop him by chance.

“The important fact is that this man’s DNA is the same as the DNA found on Yara,” he said, noting that the odds of anyone else sharing the same genetic profile were “one person out of two billion of billions of billions,” or practically nonexistent.

For investigators and the national media, the positive DNA match immediately emerged as the smoking gun in a case that had floundered for months and had resulted in one wrongful arrest.

By most media accounts, the case is effectively closed. Mr. Bossetti’s casual Facebook musings have been deconstructed to glean diabolical intent; personal habits — including a very Italian propensity to frequent tanning salons — have been reread by commentators in a dark, narcissistic key. Even the suspect’s reputation as a family man who attended church has been recast as a modern-day version of the classic wolf in sheep’s clothing.

“There’s the desire to punish, to dole out an exemplary punishment — the trial is as good as done because of the media effect,” said Roberta Barbieri, a lawyer who defended a Moroccan man arrested — and released — a few days after Yara’s disappearance.

Valerio Spigarelli, president of the national criminal lawyers’ body, is among those who have challenged the public dissemination of genetic information gathered through such vast DNA screening.

“When something leads to very personal rights, like paternity or maternity, perhaps the results could have been guarded a little more jealously,” he said.

Prosecutors say that while the DNA match is “very revealing,” other evidence consolidates their case, including the suspect’s presence in the area on the night of the murder. Furthermore, the suspect’s genetic material was found on the young girl’s underwear and leggings, “a sensitive spot, next to a cut,” and its presence in that spot would be difficult to explain otherwise, said Ms. Ruggeri, the prosecutor.

“DNA doesn’t fly,” said Luciano Garofano, the retired commander of the military police’s leading forensic department, based in Parma.

Mr. Bossetti insists he does not know how his DNA got onto Yara’s body. He spoke to prosecutors this month, “answering all the questions put to him, and explaining some elements,” one of his lawyers, Claudio Salvagni, said.

Yet even geneticists certain of the DNA match balk when it comes to definitively pinning the murder on the “Unknown Male No. 1.”

There is no doubt, said Giuseppe Novelli, an acclaimed geneticist and chancellor of the University of Rome Tor Vergata, whose department worked on the case, that the DNA traces found on the young girl belong to the match. There is no doubt, either, “that he is the illegitimate son of a dead man,” he said.

“But whether he is the murderer, I don’t know,” Professor Novelli said. “Now we have to decide how it got on Yara’s body.”

Elisabetta Povoledo, NY Times