Friday, January 18, 2013

Who Am I?

In a recent article in Science, the investigators demonstrated an uncanny ability to identify family trees by using Y chromosome short tandem repeats (STRs) and integrating that with open source data. They use such sites as Y Search, which focus on aggregating such Y chromosome data with family trees.

As the authors state:

Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.
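
To make the mechanics concrete, here is a minimal Python sketch of the two steps the abstract describes: matching a Y-STR profile against a genealogy database to guess a surname, then narrowing by age and state. It is not the authors' method or code. The marker names are real Y-STR loci, but the repeat counts, surnames, people, function names, and the simple mismatch-count distance are all made up for illustration.

```python
from collections import Counter

# Hypothetical genealogy database: (surname, Y-STR profile) pairs.
# Repeat counts are invented for illustration only.
DATABASE = [
    ("Smith", {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}),
    ("Smith", {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}),
    ("Jones", {"DYS393": 12, "DYS390": 22, "DYS19": 15, "DYS391": 10}),
]

# Hypothetical public records used for the triangulation step.
PEOPLE = [
    {"name": "Person A", "surname": "Smith", "birth_year": 1950, "state": "UT"},
    {"name": "Person B", "surname": "Smith", "birth_year": 1972, "state": "CA"},
    {"name": "Person C", "surname": "Jones", "birth_year": 1950, "state": "UT"},
]


def distance(a, b):
    """Count the shared markers whose repeat counts differ (an assumed metric)."""
    shared = set(a) & set(b)
    return sum(1 for marker in shared if a[marker] != b[marker])


def infer_surname(target, database, max_distance=1):
    """Return the most common surname among close matches, or None."""
    hits = [
        surname
        for surname, profile in database
        if distance(target, profile) <= max_distance
    ]
    if not hits:
        return None
    return Counter(hits).most_common(1)[0][0]


def triangulate(people, surname, birth_year, state):
    """Keep only the records that agree on surname, birth year, and state."""
    return [
        p
        for p in people
        if p["surname"] == surname
        and p["birth_year"] == birth_year
        and p["state"] == state
    ]


# An "anonymous" profile taken from a hypothetical public genome.
target = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
surname = infer_surname(target, DATABASE)
candidates = triangulate(PEOPLE, surname, birth_year=1950, state="UT")
print(surname, [p["name"] for p in candidates])
```

Even in this toy version the point is visible: each piece of metadata is harmless alone, but a surname guess combined with age and state quickly shrinks the candidate list.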

So the concern about genetic privacy is not really a concern: there is no such privacy. The interesting observation is how quickly these tools are created and become public and free. This is indeed worth following.