Friday, April 23, 2010

A Bayesian Approach to Cancer

A recent Science article discusses the great deal of knowledge now available about genes and cancer, yet the ability to align that knowledge in some sensible fabric that allows both assessment and prediction is still wanting.

The problem discussed in the article is that, after thousands of samples and genes have been analyzed, information regarding cause and effect is still lacking. Pathways are understood, but what causes what is not. The approaches to this point have been basically inferential and correlative. The article introduces Bert Vogelstein as follows:

The skeptic is Bert Vogelstein, who spoke at a Monday plenary session on cancer genomes at the annual meeting of the American Association for Cancer Research in Washington, D.C. Vogelstein looked across all the studies published since 2007 that have sequenced the 21,000 or so protein-coding genes involved in cancer, known as the cancer "exome." The analysis covered 78 tumor samples and eight cancer types (the majority of the studies were done by Vogelstein's group). Vogelstein also threw in data for 22 medulloblastomas (a type of brain tumor) that his team has not yet published.

The article goes on to state:

Even at this early stage of the planned survey--with data on just 100 tumor samples—"we can already answer many of the fundamental questions about the cancer genome," Vogelstein said. For example, tumors typically have from 30 to 80 single-base mutations, except for types that take less time to develop such as leukemia (about 10 mutations). Melanoma and lung cancer round out the high end (100 to 200 mutations) because they are caused by environmental carcinogens that cause lots of mutations. (Deletions and amplifications add a few more genetic glitches.) To Vogelstein this looked like the outline of a basic pattern that won't change much. But he gathered some more details.

Vogelstein searched databases for all mutations in genes found in solid cancers in the past 2 decades; for 353 cancer subtypes, he came up with 130,072 mutations in 3142 genes. But not all contribute to cancer. The challenge is to figure out which mutations are "drivers" and which are "passengers." To pick out the drivers, Vogelstein assumed that mutations in suppressor genes had to truncate the gene's protein; for oncogenes he included only mutations seen in at least two tumors. That distilled the gene count to just 319 potential driver genes, 286 of them tumor suppressors and 33 oncogenes.

Nearly all these genes fall into 12 "core" signaling pathways, Vogelstein said. And that picture—about 320 genes in 12 pathways--is unlikely to change much even when thousands more tumor samples are sequenced, he argued. So far, the cancer exome projects have found only two new driver genes ( IDH1/2 in glioma and FOXL1 in granulosa tumors). Vogelstein predicts that most new driver mutations will be rare; and nearly all will be part of same 12 pathways.

The problem is that there is no clear underlying model for the temporal behavior of cancer. In addition, there are many genes and many small segments that yield micro RNA as well: almost 1000 micro RNA elements, generated by gene segments of about 20-40 base pairs. The question is how one integrates all of this into a model.

The article then reports Vogelstein's summary:

Vogelstein summed up by saying that cancer has gone from "a complete black box" to something that "we really kind of understand." The "sobering" part, he said, is that he doesn't expect there will be many new genes or genetic breakthroughs. He has pinned his own hopes for preventing cancer deaths on using genetics to diagnose cancers early, when they're more treatable.

But the problem is that most cancer researchers are discovering facts, not models. We know that genes yield RNA, which yields proteins. Proteins facilitate various pathways, either blocking or enabling them as on/off switches, or speeding them up or slowing them down in a catalytic manner. We may not know the specifics, but with the data we can build dynamic system models, and by applying Bayesian approaches to system identification we can then determine the details of those models.
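To make the idea of Bayesian system identification concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything taken from the Science article: a single hypothetical protein level x follows the linear dynamic model x[t+1] = a * x[t] + noise, the noise is Gaussian with a known standard deviation, and the unknown rate a is given a Gaussian prior. Under those assumptions the posterior for a is also Gaussian and can be written in closed form. A real pathway model would have many coupled states and would likely need numerical methods.

```python
import random


def simulate(a_true, x0, steps, noise_sd, rng):
    """Generate a toy time series x[t+1] = a_true * x[t] + Gaussian noise."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a_true * xs[-1] + rng.gauss(0.0, noise_sd))
    return xs


def posterior_for_a(xs, noise_sd, prior_mean, prior_var):
    """Posterior mean and variance of the rate a, given a Gaussian prior.

    Assumes the noise standard deviation is known, so the Gaussian prior
    is conjugate and the update is exact.
    """
    sxx = sum(x * x for x in xs[:-1])
    sxy = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    precision = 1.0 / prior_var + sxx / noise_sd ** 2
    mean = (prior_mean / prior_var + sxy / noise_sd ** 2) / precision
    return mean, 1.0 / precision


# Hypothetical numbers, chosen only for illustration.
rng = random.Random(0)
xs = simulate(a_true=0.8, x0=5.0, steps=200, noise_sd=0.5, rng=rng)
post_mean, post_var = posterior_for_a(xs, noise_sd=0.5,
                                      prior_mean=0.0, prior_var=1.0)
print("posterior mean of a:", post_mean)
print("posterior variance of a:", post_var)
```

The point of the sketch is the shape of the calculation: start from a prior belief about a model parameter, observe the time course, and end with a posterior that is both an estimate and a statement of how uncertain that estimate still is.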

Any good systems analyst knows that we can then ascertain whether such a system is observable and/or controllable. That is, we can ascertain whether we can observe the states, namely the genes and proteins, and whether, via controllability, we can drive the system, namely the cell, to a desired state, namely a non-malignant one.
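For a linear model, observability and controllability can be checked with the standard Kalman rank conditions. The sketch below assumes a hypothetical two-state linear system dx/dt = A x + B u, y = C x, in which state 1 feeds state 2; the matrices are invented for illustration and do not describe any real pathway. The system is controllable if [B, AB, ..., A^(n-1)B] has rank n, and observable if the stacked rows C, CA, ..., CA^(n-1) have rank n.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m, tol=1e-9):
    """Matrix rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r


def is_controllable(A, B):
    """Kalman test: rank of [B, AB, ..., A^(n-1)B] equals n."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    stacked = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(stacked) == n


def is_observable(A, C):
    """Kalman test: rank of the stacked rows C, CA, ..., CA^(n-1) equals n."""
    n = len(A)
    rows = [row[:] for row in C]
    block = C
    for _ in range(n - 1):
        block = matmul(block, A)
        rows.extend(block)
    return rank(rows) == n


# Hypothetical toy system: state 1 decays and drives state 2.
A = [[-1.0, 0.0],
     [1.0, -2.0]]
B_on_state1 = [[1.0], [0.0]]   # input acts on the upstream state
B_on_state2 = [[0.0], [1.0]]   # input acts on the downstream state only
C_measure_state2 = [[0.0, 1.0]]
C_measure_state1 = [[1.0, 0.0]]

print(is_controllable(A, B_on_state1))    # True
print(is_controllable(A, B_on_state2))    # False
print(is_observable(A, C_measure_state2))  # True
print(is_observable(A, C_measure_state1))  # False
```

In this toy case, acting on the upstream state lets us steer both states, while acting only on the downstream state does not. Likewise, measuring the downstream state reveals the upstream one, but not the other way around. Where we intervene and what we measure both matter.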

It will likely take a new generation of cancer researchers to get from the determination of genes and their effects to being able to model the "system". It is akin to the world of electronics and control going from handbook designs to fully computerized optimal designs. This is a cultural phenomenon and it just requires time.