Monthly Archives: August 2014

Bump Tracker: Nine Months of Big Data

Anne Morriss put down the phone and rushed to where her baby lay sleeping. She watched her newborn breathe for a minute before hurriedly picking up the phone. “He’s still alive,” she said shakily. Anne and her partner had returned home from the hospital with their newborn son when they received the most unnerving phone call of their lives. A physician they didn’t know, from the Massachusetts department of health, called to find out if their child was still alive. When Anne answered yes, the doctor asked her to go check again. Their next step was a mad rush to bring their baby to the hospital.

When baby Alec was just one day old a nurse pricked his heel with a lancet and collected his blood in a slender tube before blotting it on filter paper. That small sample of blood was broken apart into charged molecules then deflected by a magnet, so that the amino acids in the baby’s blood could be measured. Newborn genetic screening is mandatory in the United States, but if you ask an obstetrician what this test reveals they likely won’t be able to tell you. It simply does too much. More than 60 diseases can be identified from the assay, most of which few physicians have ever encountered in actual patients. Alec’s newborn screening test showed that he lacked the enzymes needed to break down fat and convert it into energy for the body. The disease is called MCADD, medium chain acyl-CoA dehydrogenase deficiency, and it affects one in 15,000 babies born in the United States. Before genetic screening, babies with the disorder usually died, often misdiagnosed as Reye’s syndrome or SIDS. Newborn genetic screening likely saved Alec’s life.

Morriss found herself pondering the genetic nature of the disorder. If only she and her wife had known the risk she carried within her DNA. This was magnified by the fact that the couple used a sperm donor to conceive their child, a donor who, like Morriss, also carried risk of the disease. If genetic testing could diagnose the disease why couldn’t it prevent it altogether? Building on this concept, Morriss co-founded GenePeeks in 2010. The company analyzes disease risk in potential children, predicting the likelihood of more than 500 severe disorders. Today it partners with sperm banks to offer parents a personal genetics analysis with their potential donors. One day the technology might change the way we conceive children, letting us pry into our potential. A simple blood draw could unveil a treasure trove of genetic data, including all the possible diseases we carry.

The genetics of conceiving a baby is only the first analysis in the network of data that forms during pregnancy. Like a spider web, the complexity builds, offering both a biologist’s banquet and a marketer’s dream. The fetal DNA running through a pregnant woman’s veins has never been easier to capture nor as capable of revealing the secrets of an unborn child. At the same time data brokers have uncovered the sequence of events that herald a pregnancy, able to precisely target expectant parents. Made possible by breakthrough technology, both approaches are having a radical influence on pregnancy today. Nine months of big data that hold infinite promise and worrisome risk.

* * * 

For me, like most potential parents, the first test I took was not genetic. Instead it was a simple pregnancy test. Filled with anticipation, I bought 25 pregnancy tests from Amazon in January. That’s when things started to change. Not necessarily with my body, but with my data. Immediately my Amazon search results began to skew. This was hardly surprising. We all expect the “recommendations for you” will shift based on past purchases. The big change came in late March. It seemed the Internet knew I was pregnant before I could be sure myself. In my excited frenzy over seeing a faint, barely visible red line, I googled “pregnancy test accuracy.” While I agonized over the results of the early pregnancy test, the Internet morphed around me, forming itself into a pregnancy paradise, offering pregnancy sites and baby products that followed me from news sites to blogs to online stores. Using tracking software, I found that with each site I visited more than a dozen data brokers followed me. Their alarm bells were ringing in response to my pregnancy-related behavior. From one of these targeted ads I registered on a popular pregnancy site, providing my due date and downloading an app, so that I could track the baby’s growth in a weekly fruit comparison: from a blueberry to a papaya to a watermelon. While I fawned over the adorable update that my embryo was now the size of an orange seed, the company was selling my information to as many retailers as possible. It was the beginning of my pregnancy but also the start of something else.

An expectant mother’s data is worth fifteen times that of the average person. Merchants know that a new baby means that serious purchases are about to be made and that brand loyalty, often acquired before the baby arrives, can yield years of dependable purchasing. To take advantage of this, data brokers are inventing new computational techniques to zero in on this lucrative group. This big data approach means that background information, purchases made both online and in stores, social media use and internet activity are combined to accurately categorize consumers into some 70 categories with names like “Urban Scramble,” “Savvy Single,” “Rural Everlasting,” “Skyboxes and Suburbans,” “Married Sophisticates,” and of course the most lucrative of all, “Expectant Parent.” Altogether, data brokers hold billions of data points. One data broker alone holds 3000 bits of data for nearly every consumer in the United States.

My seemingly innocent acts: ordering pregnancy tests online, an Internet search for pregnancy test accuracy, purchasing prenatal vitamins at Target, and signing up for a cute pregnancy app, were akin to opening Pandora’s box. Sandwiched between my friend’s updates on Facebook were now oddly poignant Pampers ads, seemingly scripted to prey upon the hormones of pregnant women. Enfamil began to send formula samples to my door complete with glossy ads of doe-eyed babes cuddled in their mother’s arms while Target cleverly slipped in coupons for nursery furniture among ones for toilet paper and soda. The advertising eerily mirrors the progress of my pregnancy with ads for morning sickness remedies coming in during my first trimester and maternity clothing popping up along with my growing belly in the second.

Sometimes this slick new marketing world backfires. In her article, “The Pregnancy is Gone but the Promotions Keep Coming,” April Salazar describes what happened when she received a package of formula from Enfamil. The samples came as she approached what would have been her due date with the box reading jauntily, “You’re almost there!” But instead of welcoming a baby, April and her husband had made the difficult choice to terminate her pregnancy months earlier after learning the baby had a lethal disorder. She had never signed up for free formula samples. What she had done was sign up at a popular pregnancy site, likely the same one I did, to track the growth of her baby. She’s not alone. Many women receive these welcoming packages from formula companies, ostensibly celebrating the due date of a baby, but instead offering only a heartbreaking reminder of what could have been. With some 10-20 percent of pregnancies ending in miscarriage (and this number is likely higher), this is an anguish shared by many, most of whom never signed up for formula in the first place but whose due dates were sold by data brokers, pregnancy websites, maternity stores and even obstetrician offices.

In a now famous case reported by the New York Times, a man learned of his teenage daughter’s pregnancy only after she received coupons for maternity clothing and nursery furniture from Target. The dad angrily complained to managers that the company was promoting teen pregnancy before sheepishly apologizing that, “It turns out there’s been some activities in my house I haven’t been completely aware of.” This invasion of privacy was made possible by newly engineered computer systems. Five years ago, big data explorations were fracturing under software running on generic hardware. A new strategy, pioneered by Oracle, takes an architectural approach to combining hardware and software. These advances are making data mining even more comprehensive, able to combine a pregnant teenager’s on and offline behavior for better targeting.

Massive data brokers such as the firm Epsilon, whose high profile customers include Wells Fargo, Toyota, Macy’s, as well as Target, are benefitting from these engineered systems, such as Oracle’s Exadata. The technology might be more sophisticated but concerns linger over the security of these massive data banks. Epsilon had what was once called “the biggest if not the most expensive, security breach of all time” when part of the company’s 40-billion-member email lists were hacked, resulting in widespread phishing scams and a loss of $4 billion. It raises the question, if these companies are collecting data both on and offline, what other information could be breached? Will pregnancy data be stolen as well as sold?

Data mining provides some benefits to consumers. It may be creepy that Target knows to send me diaper coupons so soon after I learn I’m pregnant, but I have a better chance of benefitting from them than I do, say, coupons for denture cream. Still, it seems we should have some say in who knows we’re pregnant. This, however, is not what government regulators have decided.

* * *

The sale of consumer information was examined by the Government Accountability Office (GAO), which in 2013 found that there is no comprehensive privacy law governing the collection and sale of consumer data and that, under current law, consumers have no right to know what information has been gathered about them or control how personal information is collected, even sensitive health information such as pregnancy.

While data miners pry into our private lives, the biological data that makes up a pregnancy is rapidly changing. Unheard of only a few years ago, today prenatal genetic testing is becoming routine in the United States. Blood is drawn from a pregnant woman early in pregnancy and fragments of the fetal genes, which make up roughly 10 percent of the DNA floating in her blood at the end of the first trimester, are teased out. The fetal DNA is then sequenced, the chromosomes counted. The test acts as a screen for genetic diseases caused by extra or missing chromosomes such as Down’s syndrome, Edwards syndrome and Turner’s syndrome, many of which are severe and can be lethal. Revealing this genetic information to parents gives a world of choice when deciding how to cope with a medical disorder. These genetic tests are slowly replacing the need for risky medical interventions. Before the tests were widely available, the only alternatives were amniocentesis and chorionic villus sampling (CVS). These tests use a needle inserted through the abdomen and into the womb to sample either the amniotic fluid or the placenta. Both procedures carry a small risk of miscarriage and are undoubtedly more dangerous than a simple blood draw.
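The chromosome counting described above can be illustrated with a little arithmetic. The sketch below is not any testing company's actual algorithm. It assumes an idealized case in which the mother carries two copies of a chromosome, the fetus carries two or three, and DNA fragments appear in the blood in proportion to copy number. The function name and its parameters are invented for illustration; the 10 percent fetal fraction is the figure quoted above.

```python
def relative_read_excess(fetal_fraction: float, fetal_copies: int = 3) -> float:
    """Expected fractional change in fragments from one chromosome when the
    fetus carries `fetal_copies` copies of it instead of the usual two."""
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    maternal = (1.0 - fetal_fraction) * 2      # assumption: mother has two copies
    fetal = fetal_fraction * fetal_copies      # fetal share, scaled by copy number
    baseline = 2.0                             # both genomes carrying two copies
    return (maternal + fetal) / baseline - 1.0


# Trisomy (three fetal copies) at a 10 percent fetal fraction.
excess = relative_read_excess(0.10)
print(f"{excess:.1%}")  # prints 5.0%
```

Under these assumptions an extra fetal chromosome shifts the expected count by only about 5 percent, which is why the test depends on counting a large number of fragments rather than a handful.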

Prenatal genetic tests are not only safer but also better than invasive procedures at detecting chromosomal abnormalities. The false positive rate, the share of parents who are told their child is positive for Down’s syndrome when it isn’t, is very low, 0.3 percent compared to 3.6 percent with an amniocentesis. Genetic testing can also be performed earlier, giving parents the option to end pregnancies afflicted by medical disorders. Prenatal genetic screening is widely available from multiple companies, including Harmony, Verifi, MaterniT21 and Panorama, although insurers may only cover the cost if a woman falls into a high-risk category because of her age (over 35) or medical history. In the two years noninvasive genetic testing has been available, it’s benefitted hundreds of thousands of women. Many believe the screening will soon become just another tube of blood of the many taken from pregnant women at their doctor’s office.

But while the blood draw is simple, the results are not always clear for parents. The test is not meant to diagnose but instead acts as a screen. Because of this, test results are presented in terms of risk, say, one in 10, or one in 10,000. This can be a little confusing for parents unaccustomed to looking at probabilities. While a doctor performs an amniocentesis or CVS, genetic test results are handed down from a private company. This means there’s no standard method for delivering the medical data—or for discussing the decisions that may come next. Some parents receive the news that their child is likely to have Down’s syndrome in a two-minute phone call. This is the moment when genetic counseling is most needed but may not always be offered. Roughly two-thirds of parents whose fetus has been diagnosed with Down’s syndrome will choose abortion. The percentage is much higher for other trisomies, which often cause a short, painful life.
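Why a screen is not a diagnosis can be seen with a short calculation. The sketch below applies Bayes' rule to estimate how often a positive result is a true one. The 0.3 percent false positive rate is the figure quoted earlier in this article; the prevalence (1 in 700) and detection rate (99 percent) are assumptions chosen only for illustration, not values reported here, and the function names are invented.

```python
def one_in_n_to_percent(n: float) -> float:
    """Convert a risk stated as 'one in n' to a percentage."""
    return 100.0 / n


def positive_predictive_value(prevalence: float,
                              detection_rate: float,
                              false_positive_rate: float) -> float:
    """Chance that a positive screen reflects a truly affected pregnancy."""
    true_positives = prevalence * detection_rate
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Assumed inputs: prevalence of 1 in 700 and a 99 percent detection rate.
ppv = positive_predictive_value(prevalence=1 / 700,
                                detection_rate=0.99,
                                false_positive_rate=0.003)
print(f"one in 10 is {one_in_n_to_percent(10):.2f} percent")
print(f"one in 10,000 is {one_in_n_to_percent(10_000):.2f} percent")
print(f"chance a positive screen is correct: {ppv:.0%}")
```

With these assumed inputs, only about one positive result in three would turn out to be a true positive, even though the false positive rate is a fraction of a percent. The answer changes sharply with prevalence, which is one reason results are reported as individual risks.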

For me, like most parents, the simple black and white test results box gave reassuring news and even something extra. Amidst its chromosome counting is the identification of sex chromosomes. It reveals XY for a boy and XX for a girl. Trembling with excitement, my husband and I learned we’d be welcoming a little girl. When we told our 4-year-old daughter she delighted in the news, made possible by a technology not available when she was born. (Not that the results surprised her; she’d been expecting a sister all along.) For the first time in history, sex is revealed early in pregnancy, often only a few weeks after seeing those double lines on a test. While it brings expectant parents joy, it’s the control over our reproduction that some ethicists worry over, citing, among other concerns, the woes of gender imbalance in countries such as China and India. It’s not just gender that has people worried. We have the potential to screen for much more. In 2012 the first noninvasive test for whole-genome fetal screening was published, opening the door to a whole new level of genetic understanding. In addition to the risk of childhood diseases, parents can peek into their child’s medical future, uncovering their risk of breast and ovarian cancer, diabetes, even physical traits such as hereditary baldness and eye color. The boundaries lie only in the ever-widening limits of our scientific knowledge. Genetic tests could lead to fantastic medical care (greater screenings for cancer, early treatment for diabetes), but they could also potentially sully the unbounded potential a newborn has in her parents’ eyes. For better or worse, these tests are a part of both our present and future. In the United States newborn genetic screening is mandatory, although the tests performed vary from state to state. How long until prenatal genetic screening becomes just as essential?

A human being is never more hidden from the world than inside the womb. Perhaps this is why people are drawn to the swollen bellies of pregnancy, rubbing them gently and asking endless questions. We want to know what’s concealed. A noninvasive prenatal test was key to the first peek inside: the ultrasound. The images were grainy black and white, hardly resembling a baby at all, but parents didn’t care. They were elated to sneak a glimpse inside the womb. The technology diagnosed abnormalities, but for most parents the screening simply brought them closer to the life growing inside. What for all of human history was hidden was now revealed. Our new era of pregnancy technology has the same ability to save lives while bonding us closer to our babies.

Yet there is also the risk of knowing too much. In the 40 years since ultrasounds became standard practice in the US, they too have transformed. The once grainy images have been replaced with 3D/4D ultrasounds that construct the features of the face. The detail is incredible. I gasped as I watched the fetus inside me, which I now knew was the size of a large banana, break into a smile. But the image was odd; the baby looked something like a melting Skeletor. It was a reminder that the data will keep getting bigger, the images will gain clarity, and yet none of it can possibly compare to holding the real thing in your arms.

Nathalia Holt, The Atlantic

New Genetic Privacy Network Report: Genetic Privacy and Non-Forensic Biobanks

The current structure of biobanks in the United States is missing substantial federal and state policies to address and resolve people’s privacy concerns as well as sufficient education for people about the laws that are there to protect their genetic privacy. Medical, research, and commercial DNA databases all generate privacy concerns largely because of the misunderstanding between what people anticipate from giving their genetic information to these biobanks and what actually occurs. Because of the various risks that are associated with non-forensic biobanks, it is necessary to have stringent regulations and federal oversight so that scientists may be able to use donated genetic data to research diseases without compromising the genetic privacy of donors as well as their close relatives. 

The new report, Do You Know Where Your DNA Is? Genetic Privacy and Non-Forensic Biobanks, explores the various forms of biobanks in the US, their privacy limitations, the current state of regulation and the need for reform.

Protecting privacy while gathering health data

As health care goes digital, data is an increasingly valuable commodity.

Correctly crunched, the massive amount of patient health information now online could lead to big improvements in medicine. But few agree on a way to acquire, see, share and use the data that satisfies everyone who wants a piece.

That debate took a turn this week when rival insurers Blue Shield of California and Anthem Blue Cross said they would team up to create a health information sharing network with their combined 9 million patients.

Because companies traditionally guard such data, wanting to hold on to every competitive edge possible, the insurers tout the digital network as a major step in sharing information that manages to respect patients’ privacy. But privacy advocates question whether the final product will truly take into account patients’ interests.

“People have been saying, ‘Wow, health information exchanges have not gotten off the ground, this is a jump start,’ ” said Lee Tien, a senior staff attorney at the Electronic Frontier Foundation in San Francisco. “But is it a real jump start? A public relations jump start? I’m not sure what it benefits.”

The California Integrated Data Exchange, or Cal Index, plans to pool electronic medical records from doctors and hospitals by the end of the year. Blue Shield and Anthem Blue Cross, two of the state’s largest insurers, are inviting other providers and insurers to join the network and add their patients’ information. All records will be integrated with claims data, and physicians will be able to share information through a secure system, the insurers say.

“We believe that providers and plans must collaborate to ensure that Californians receive quality health care at a sustainably affordable price – and a fundamental component is sharing comprehensive patient information broadly and efficiently,” Mark Morgan, Anthem Blue Cross’ president, said in a statement.

Network goals

The goal is to use the data to cut unnecessary expenses like duplicated tests, reduce mistakes and enable patients to receive care faster, while obeying state and federal privacy laws, the insurers say. California’s size will probably make the network the largest in the country, topping similar networks in Kansas, Nebraska and Massachusetts.

Like all electronic health services that strive to share data, Cal Index must comply with the federal law that defines protections for patient data.

The law “wants to find a framework where on the one hand, society is not deprived of the research benefits associated with health information, and neither are patients deprived of their privacy,” said Riyad Omar, general counsel at Practice Fusion in San Francisco. The startup’s cloud electronic health record system hosts 80 million patient records from 112,000 medical providers.

When and how patients agree to share data are critical matters in health information technology. In a recent survey by the state Office of Health Information Integrity, nearly 90 percent of 800 Californians said they prefer in theory to be asked if they want to share. Most said they do not want their records shared without their explicit consent. In practice, too, people are generally willing to share their information when offered the choice, the state agency’s March report found.

Still, some companies worry that giving patients the choice to opt in could lower participation rates. If too few people agree to share data, the intended benefits of pooling information could be lost.

As Tien put it, “The industry has never liked opt-in. Privacy advocates believe in opt-in.”

By default, the 9 million patients in Blue Shield and Anthem Blue Cross will become part of the network. Before the system goes live, members will be told of a website where they can choose to not have their information shared, said Dr. Ken Park, vice president of payer and provider solutions at Anthem Blue Cross. Those patients will still receive coverage and treatment.

Cal Index chose this approach to try to balance privacy protection and participation rates, given that rates can sometimes be lower when people are asked to opt in, Park said.
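The difference between opt-in and opt-out comes down to what happens when a member does nothing. The sketch below is a hypothetical illustration of that default, not a description of how Cal Index works; the class, its field names and the member IDs are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    # True models opt-out (shared unless the member declines);
    # False models opt-in (not shared unless the member agrees).
    share_by_default: bool
    choices: dict = field(default_factory=dict)  # member_id -> bool

    def record_choice(self, member_id: str, share: bool) -> None:
        """Store a member's explicit decision."""
        self.choices[member_id] = share

    def may_share(self, member_id: str) -> bool:
        """An explicit choice always wins; silence falls back to the default."""
        return self.choices.get(member_id, self.share_by_default)


opt_out = ConsentRegistry(share_by_default=True)
opt_in = ConsentRegistry(share_by_default=False)

# A member who visits the website and declines is excluded under either model.
opt_out.record_choice("member-2", False)

print(opt_out.may_share("member-1"))  # member-1 never responded
print(opt_in.may_share("member-1"))
print(opt_out.may_share("member-2"))
```

The member who never responds is included under one model and excluded under the other, which is why the choice of default has such a large effect on participation rates.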

In addition, Cal Index is ironing out the details of who can see the data, and to what extent.

Doctors may have abbreviated versions of patient records so they aren’t overwhelmed. Patients who want to see all their data will eventually have their own website, but that site won’t be available when the system starts, Park said.

Looking for trends

Researchers, too, may want access so they can look for trends in genetic risks, lifestyle habits and other factors in people’s health. Academics won’t be allowed into the entire database, but they can see data tailored to their line of inquiry that is stripped of any identifying information about patients.
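Handing researchers a tailored, stripped-down view of a record might look something like the sketch below. The field names, the list of identifiers and the sample record are hypothetical; nothing here describes Cal Index's actual design, and a real system would have to follow the federal privacy rules mentioned above.

```python
# Assumed set of direct identifiers; a real list would be much longer.
IDENTIFYING_FIELDS = {"name", "address", "phone", "member_id", "date_of_birth"}


def research_view(record: dict, requested_fields: set) -> dict:
    """Return only the fields a study asked for, never the identifying ones."""
    allowed = requested_fields - IDENTIFYING_FIELDS
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "name": "Jane Doe",            # invented sample data
    "member_id": "A-1001",
    "date_of_birth": "1980-01-01",
    "age_band": "30-39",
    "diagnosis": "type 2 diabetes",
    "smoker": False,
}

# Even if a study requests the name, it is withheld.
view = research_view(record, {"name", "age_band", "diagnosis"})
print(view)
```

Removing direct identifiers is only a first step. Combinations of the remaining fields can sometimes single a person out, so a real research policy would need to say more than this sketch does.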

No data will be sold, but Cal Index won’t necessarily shut out researchers sponsored by, say, the pharmaceutical industry. Tien wants to see the system’s research policy more spelled out.

“That tension between people’s willingness to contribute to the greater good, but at same time not wanting to be the basis on which someone else makes a lot of money – that tension needs to be resolved,” he said.

Stephanie M. Lee, San Francisco Chronicle