The Dark Side of DNA Data

by Aniika Gjesvold Cantero

Consumer DNA testing has drawn growing attention over the last decade, and with it has come a stream of promises for research, medicine, and consumer services.

Personalized medicine, cold cases, early disease detection, and family heritage are the main selling points.  However, there is an untold side to what is happening with our DNA data as access and ownership shuffle across borders behind the scenes.

At-home DNA tests are relatively straightforward.

As an example, Ancestry.com generates the consumer's results once the saliva sample is processed and run through its proprietary software.  According to its privacy policy, the company retains the data only if the customer agrees to let their DNA be used for "informed consent research."

The policy additionally states: "Neither your saliva nor the extracted DNA (together referred to as 'Biological Samples') are Personal Information under this Privacy Statement...  Future testing may be done if you agree to our Informed Consent for Research or if you consent to other tests of your Biological Samples.  If you do not consent to the storage of your Biological Sample, we will destroy your sample."

The saliva sample will be destroyed, but it appears the information extracted from the sample is not.  In the case of Ancestry.com, their DNA "network" contains the DNA information of 22 million people.

The company's advertising of itself as having the "World's largest consumer DNA network" (Ancestry.com) further supports the notion that genomic information is stored long-term and is not destroyed unless the customer requests it.

According to a recent study, "Vanderbilt University researchers found that 71 percent of companies used consumer information internally for purposes other than providing the results to consumers."  (Roberts 2020)

So what's the big deal?

DNA data is not like a Social Security number or other information associated with a person.  It is biometric, and an individual is identifiable by this information.  "DNA presents privacy issues different from those involved in other biometrics collection ... [since] it can contain information about a person's entire genetic make-up, including gender, familial relationships, ... race, health, disease history and predisposition to disease."  (Lynch 2012)

Combining personal information with genomic data produces a complete picture of an individual.  John Demers, when he was head of the DOJ's National Security Division, put it clearly in discussing the national security risks of genetic information.  This data, he said, "can be used from a counterintelligence perspective to either coerce you or convince you to help the Chinese," further adding, "the worst case would be the development of some kind of biological weapon ... if you had all of the data of a population, you might be able to see what the population is most vulnerable to."  That is in addition to the types of exploitation that follow when profiteers gain access to troves of personal information.

Combining genomic information with a complete background check can also identify an individual's closest living relatives and family circle.  Once an individual's DNA data is collected, today's technologies make that identification straightforward.
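
To see why matching relatives takes so little effort, consider a rough sketch.  On average, close relatives share predictable fractions of their autosomal DNA: about one half for a parent and child, one quarter for a grandparent and grandchild, and one eighth for first cousins.  The Python below is an illustration under those assumptions only, not a description of how any particular company's matching works; the function name, the tolerance value, and the sample inputs are all hypothetical.

```python
# Expected average fraction of autosomal DNA shared between two relatives.
# These are textbook averages; real pairs vary around them.
EXPECTED_SHARE = {
    "parent/child or full sibling": 0.5,
    "grandparent, half sibling, or aunt/uncle": 0.25,
    "first cousin": 0.125,
}


def likely_relationship(shared_fraction, tolerance=0.05):
    """Guess a relationship class from the fraction of DNA two people share.

    `tolerance` is an arbitrary cutoff chosen for this illustration.
    """
    # Pick the relationship class whose expected share is closest.
    label, expected = min(
        EXPECTED_SHARE.items(),
        key=lambda item: abs(item[1] - shared_fraction),
    )
    # Anything too far from every expected value is treated as no close match.
    if abs(expected - shared_fraction) > tolerance:
        return "distant or unrelated"
    return label


# Hypothetical comparison results between one uploaded profile and three others.
for fraction in (0.49, 0.27, 0.01):
    print(fraction, "->", likely_relationship(fraction))
```

The point of the sketch is that a single uploaded profile, compared against a large enough database, is enough to start sorting strangers into family members.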

The National Society of Collegiate Scholars (NSCS) stresses just how valuable DNA information is, stating: "Your DNA is the most valuable thing you own ... It is your unique genetic code and can enable tailored healthcare delivery to you.  Losing your DNA is not like losing a credit card ... you cannot replace your DNA.  The loss of your DNA not only affects you, but your relatives and, potentially, generations to come."

Direct-to-Consumer (DTC) genetic testing data is subject to few regulations that protect consumers: "While many companies have robust privacy and informed consent policies, no federal laws prohibit companies from providing individuals' genetic information to third-parties."  (National Human Genome Research Institute)

The Federal Trade Commission can provide some level of protection to consumers by taking enforcement action if a company makes false claims or misleading statements regarding privacy and security or fails to protect an individual's information.  But in the case of business as usual, DNA data falls through the cracks.

"Ancestry.com is not a covered entity under the Health Insurance Portability and Accountability Act (HIPAA), and as a result, no data provided by you is subject to or protected by HIPAA."  (Ancestry.com)

This is not to say there are no regulations around this type of information at all.  Currently, there is a well-defined set of standards issued by the FBI for handling and storing DNA information for inclusion in the Combined DNA Index System (CODIS); the program defines a standard for support of criminal justice DNA databases and extends to cover the software used to run them.  (CODIS and NDIS 2022)

However, this is specific to law enforcement and does not cover consumer DNA information generated, stored, and maintained by private companies.  (NIST)

Not all privacy policies at these consumer DNA testing companies are the same.

For example, 23andMe requires customers to opt in and provide consent before it shares their data.  However, this relationship can change if the customer downloads their DNA information and then uploads it to another website.

An example of this, provided by Segert, is GEDmatch.  GEDmatch's privacy policy is much looser, and the site displays users' real names and is publicly searchable.  The site gained notoriety when police used it to solve the Golden State Killer case.  (Segert)

There are other aspects to this gap.  The Genetic Information Nondiscrimination Act (GINA) was adopted to prevent employers from discriminating against employees based on genetic information.  GINA does not, however, apply to third-party direct-to-consumer testing like Ancestry and 23andMe or the handling of the information after it is collected.  (Roberts 2020)

De-identifying DNA data, meaning stripping the dataset of personal identifiers, has drawn skepticism about whether it can actually be done successfully.

De-identification as a solution to growing privacy concerns is not currently a viable option: "It is not clear if this is entirely effective because genetic data is intrinsically identifying.  This is because each person's genome is unique and may be traced back to them similar to a thumbprint."  (Segert)

In a recent case study, researchers could infer participants' last names using a small portion of their genetic data along with census information such as date of birth and their home state.  (Segert)

This confirms that it is possible to re-identify an individual after the information has been de-identified.
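
The re-identification described above is an instance of what is often called a linkage attack: each extra detail shrinks the pool of people a "de-identified" record could belong to.  The Python below is a minimal sketch of that narrowing, not a reconstruction of the study's method.  It assumes the genetic data has already suggested a short list of surname guesses, and every record, name, and field in it is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicRecord:
    """One row of a hypothetical public directory (all values invented)."""
    surname: str
    birth_year: int
    state: str


def candidates(records, surname_guesses, birth_year, state):
    """Return the public records consistent with every known detail."""
    return [
        r for r in records
        if r.surname in surname_guesses
        and r.birth_year == birth_year
        and r.state == state
    ]


# Entirely fictional records standing in for census-style public data.
RECORDS = [
    PublicRecord("Rivera", 1961, "UT"),
    PublicRecord("Rivera", 1984, "UT"),
    PublicRecord("Okafor", 1961, "UT"),
    PublicRecord("Rivera", 1961, "NV"),
]

# Assumed inputs: a surname guess derived from the genetic data, plus the
# birth year and home state that accompanied the "de-identified" sample.
matches = candidates(RECORDS, {"Rivera"}, 1961, "UT")
print(len(matches), "candidate(s) remain")
```

Here three harmless-looking details leave a single candidate out of four.  With real data the pool starts far larger, but the same intersection logic applies.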

Additionally, there seem to be lax regulations around a company's ability to sell its customers' genetic information.  As Segert explained, direct-to-consumer companies are able to offer their services at an affordable price point because "they can sell their customer's genetic data to pharmaceutical companies for a profit.  23andMe, for example, has a contract to license customer data to the biotech giant Genentech for their research efforts into Parkinson's disease."

As for 23andMe, the Virgin Acquisition Group announced in February 2021 that it was acquiring the company.  (Paul)

As the saying goes, follow the money.  In this case, you have to ask yourself what value a DNA testing company geared to learning about your ancestors has to investors like Richard Branson, who are willing to spend 3.5 billion U.S. dollars to acquire it.  The answer lies in the asset that a consumer DNA database is, and in the lack of regulations that would prevent companies from using and profiting from it.

There also seems to be a legal gap when it comes to limiting foreign entities' access to U.S. genetic information.  There is currently nothing preventing a foreign company from purchasing a U.S. company that holds DNA data as a primary asset.

This has already occurred in at least two documented instances.  On December 4th, 2020, it was announced that Blackstone had acquired Ancestry for 4.7 billion dollars.  Blackstone is a private equity fund with a stake in pharmaceuticals and healthcare-related businesses.

Even though Blackstone is an American investment management company, the nature of its partnership structure and the companies it has acquired in the past make it a global entity.  (Karr)

According to Keith Bradsher of The New York Times, the Chinese government holds a three billion dollar nonvoting stake in the Blackstone Group, which muddies questions of information transfer and ownership around the genetic data discussed above.

These are important factors to note when addressing regulatory concerns and determining how the data should be treated.  Currently, since this information does not fall under HIPAA, it is covered by regulations that apply to general personal information.

More specifically, as stated by the National Human Genome Research Institute, it is federally regulated on three criteria (analytical validity, clinical validity, and clinical utility) by the Food and Drug Administration (FDA), the Centers for Medicare & Medicaid Services (CMS), and the Federal Trade Commission (FTC).  These regulations, however, do not govern privacy and data handling measures, or access and ownership by foreign entities.

The problem is not a lack of transparency around genomic data, but rather a failure to identify the data as biometric and to handle it with appropriate security and privacy standards, regulations, and procedures.  Current measures have failed to secure DNA testing databases and to minimize exploitation of the information.

The key takeaways from this effort are an understanding of the relationship genomic data has with businesses and individuals, and of the inherent risks that emerge from it.  As we have identified, genomic data is biometric, and there are uses for this information that present threats to U.S. national security and citizens; it is not currently covered under HIPAA, and nothing regulates or prevents it from crossing borders or being purchased by foreign entities.

Countries like China are able to legally obtain genomic data on U.S. citizens by purchasing companies that possess DNA databases as an asset.  Since we have confirmed that the method of de-identification has yet to prove successful, we must also conclude that de-identified DNA data is still sensitive and presents the same risks to national security as the transfer and acquisition of identified DNA data.

Changes need to be made in how DNA data is recognized and handled.  It should first and foremost be treated as biometric information that cannot be stripped of personally identifiable information.

Monitoring and restriction should be implemented to prevent the legal and illegal acquisition of U.S. DNA data by China and other adversarial nations that have made clear their intentions are not in line with the best interests of the U.S.

As the value of DNA data grows and more companies acquire a vested interest in it, regulations and safeguards will only get harder to implement.  Implementing and enforcing new regulations and frameworks around this information will be challenging, as companies worldwide have already been making multibillion-dollar investments in which access to and use of large DNA databases are the primary assets.

Genomic information has been pitched as providing the necessary data to unlock medical breakthroughs that would change the future of medicine.

Though this is great from a medical research perspective, prioritizing privacy now will help ensure that it still exists for future generations.

Works Cited
