Research Projects

Active Projects

Roberta Devito, PhD

Recovering Reproducible and Local Signal in Genomic Data

One of the most important challenges in biological science today is to elucidate the extent to which complex experiments, which measure hundreds of thousands of variables, can be analyzed both to generate consistent, global signal when repeated and to identify local signal related to tissues, cancer types, or population structure. Importantly, we must account for the intrinsic diversity of variation across different studies and control for technical confounders as part of this task. Learn More…

Ian Wong, PhD

Profiling Gene Expression and Mechanophenotype in Circulating Tumor Cells Ex Vivo

Primary tumors shed circulating tumor cells (CTCs) into the bloodstream that metastasize preferentially to distant organs, resulting in 90% of cancer-related fatalities. For example, estrogen receptor-positive (ER+) breast cancers exhibit high rates of metastasis to bone, with decreased rates to liver and lung. CTCs exhibit heterogeneous gene expression programs and functional phenotypes, which are selected by soluble and mechanical interactions within each metastatic “niche.”

A critical challenge is to predict how patient-specific CTCs disseminate throughout the body and respond to therapeutic treatments. An exciting strategy is to culture CTCs ex vivo for drug screening informed by genomic and transcriptional profiling. We seek to elucidate how CTCs respond to different features of the metastatic niche by engineering controlled interactions with tissue-specific extracellular matrix (ECM) and with human primary stromal cells, which may recapitulate disease progression and therapeutic resistance in these microenvironmental contexts. Learn More…

Jay Hou, PhD

Computational Prediction of Tumor Progression in Brachytherapy

Permanent brachytherapy brain implants offer a method to deliver radiation therapy intraoperatively immediately following glioblastoma (GBM) tumor resection. GammaTile with four 131Cs radiation seeds is an FDA-cleared device for brachytherapy, offering modular, localized radiation with biocompatible material. Studies have shown that GammaTile therapy improves overall survival for patients with recurrent GBM. The location and dosage of GammaTiles are designed and evaluated by neurosurgeons, radiation oncologists, and medical physicists. However, the relationships between radiation dosage, tumor recurrence, and infiltration probabilities in GBM tumors remain largely unknown. Currently, no computational tools exist to predict how GBM cells might be affected by the radiation dose and migrate to distant sites.

This proposal aims to simulate tumor progression in patients undergoing brachytherapy. In vitro 2D and 3D cell proliferation and migration (both random and directional) with localized radiation will be developed and studied. Previously developed biophysical models, such as the Cell Migration Simulator and Brownian Dynamics Model, will be expanded to computationally simulate and analyze the physical and molecular mechanisms of tumor growth and infiltration under localized radiation. Additionally, previously developed machine learning algorithms will be enhanced to extract single-cell features from simulation data and patient MRI images, and to predict tumor progression in GBM patients under brachytherapy. The computational tools developed will enable surgeons and oncologists to optimize patient-specific strategies, thereby improving patient outcomes in brachytherapy. Learn More…
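
As a rough illustration of the distinction between random and directional migration mentioned above, the following minimal sketch simulates a single cell as a 2D random walk with an adjustable directional bias. It is not the Cell Migration Simulator or the Brownian Dynamics Model named in the project description; the function names, the `bias` parameter, and the blending rule are assumptions made only for this example, and radiation effects are not modeled.

```python
import math
import random


def simulate_migration(n_steps, step_length=1.0, bias=0.0, bias_angle=0.0, seed=None):
    """Return the (x, y) track of one cell performing a 2D random walk.

    bias = 0.0 gives a purely random walk; bias = 1.0 makes every step point
    along bias_angle (fully directional migration). Hypothetical toy model.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        # Blend a uniformly random direction with the preferred direction.
        dx = (1.0 - bias) * math.cos(theta) + bias * math.cos(bias_angle)
        dy = (1.0 - bias) * math.sin(theta) + bias * math.sin(bias_angle)
        norm = math.hypot(dx, dy)
        if norm > 0.0:
            # Rescale so every step has the same length.
            dx, dy = dx / norm, dy / norm
        x += step_length * dx
        y += step_length * dy
        track.append((x, y))
    return track


def net_displacement(track):
    """Straight-line distance between the first and last track positions."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    return math.hypot(x1 - x0, y1 - y0)
```

Comparing `net_displacement` for tracks generated with `bias=0.0` and with a larger `bias` gives a simple readout of how strongly a directional cue dominates random motility in this toy setting.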

Completed Projects

Lalit Beura, PhD

Characterize the transcriptional and epigenetic networks that control FRT TRM identity. Our phenotypic characterization has revealed significant heterogeneity among the TRM populations. We hypothesize that FRT TRM are transcriptionally diverse and that their transcriptional heterogeneity is driven by differential chromatin accessibility. This hypothesis will be tested by performing computational integration of single-cell RNA-seq (scRNA-seq) and assay for transposase-accessible chromatin sequencing (ATAC-seq) to identify key molecular regulators that control the differentiation trajectories of these distinct populations. This will be further correlated with the functional potential of TRM in the event of pathogenic infection. We aim to generate a comprehensive transcriptional and epigenetic map of anti-pathogenic mucosal TRM.

Learn More…

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of coronavirus disease 2019 (COVID-19). Spike protein is the primary antigenic target for COVID-19 vaccines, and interfering with the interface between the RBD (Receptor Binding Domain) of spike and ACE2 is the mechanism of action for the majority of existing therapeutic antibodies, indicating the importance of the RBD and its binding to the cellular receptor for controlling SARS-CoV-2. It is unclear whether there are any intrinsic cellular proteins that inhibit viral entry of SARS-CoV-2.

We will leverage the screening platform to identify cellular receptors for secreted virulence proteins of SARS-CoV-2: Orf3a, Orf7a, and Orf8. Importantly, the same screening platform has been validated as an efficient platform for secreted virulence proteins. In a separate screening for norovirus secreted virulence protein (NS1), the surfaceome screening successfully identified Syndecan-4 as a putative cellular receptor for norovirus NS1 (Li et al., in preparation). In Aim 2, we will perform a surfaceome screening for Orf3a, Orf7a, and Orf8 to discover putative cellular receptors for the secreted virulence proteins.

Learn More…


Gene regulatory mechanisms are critical for proper cellular and protein function, and modern molecular biology has linked numerous pathologies to dysregulation of these processes. Although modification of the genome to correct pathogenic mutations is a promising therapeutic approach, these efforts cannot be successful without knowledge of the underlying biochemistry of protein machinery such as CRISPR-Cas9 (Cas9). Cas9 can be a customizable tool to edit and correct disease-linked (genomic) mutations; however, to fully realize these applications, novel strategies to overcome its off-target effects and poor temporal control must be investigated. Cas9 utilizes a guide RNA molecule to recruit, stabilize, and facilitate cleavage of double-stranded DNA after recognition of a well-known protospacer adjacent motif (PAM) sequence. Prior X-ray crystal structures indicate that conformational changes within the Cas9 nucleases, HNH and RuvC, are required for effective catalytic function. However, these structures offer little mechanistic information, as the target DNA and catalytic nucleases are never observed in an activated state. The conformational shift of HNH, in particular, is correlated with motions of neighboring subdomains, all of which are activated from >20 Å away by the PAM-binding domain, suggesting an allosteric mechanism. Understanding this allosteric coupling would have exciting potential for precision medicine by establishing novel paradigms to control and enhance the spatial and temporal function of Cas9. We recently identified a pathway of millisecond timescale motions spanning the HNH nuclease and reaching multiple Cas9 domains that computational results suggest is a portion of a larger allosteric network that controls Cas9 function.

To investigate the reach of this allosteric network and the role of molecular motions in its mechanism, my laboratory will undertake a synergistic solution NMR and computational study to map the long-range allosteric pathway of Cas9. We will (1) characterize the molecular determinants of protein motions in the HNH nuclease, (2) establish the biophysical roles of the neighboring REC2 and REC3 domains in Cas9 signal transduction, and (3) characterize the interaction of the PAM sequence with its binding domain to evaluate its role as an allosteric activator. Specifically, this multidisciplinary approach of NMR spin relaxation experiments and molecular dynamics, network theory, and Eigenvector Centrality simulations will probe differential protein motions in Cas9, revealing specific amino acids responsible for transmitting structural or dynamic information to affect biological response. These studies will use both full-length Cas9 and novel engineered constructs to interrogate specific domains within the 160 kDa enzyme. The structural and dynamic findings of this work will be correlated to function with biochemical and cellular assays to provide a detailed understanding of the Cas9 allosteric mechanism.
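
To make the eigenvector centrality idea mentioned above more concrete, here is a minimal sketch that ranks nodes of a small undirected network (for instance, residues connected by contacts or correlated motions) by power iteration. The function name and the adjacency-matrix input are assumptions for illustration only; this is not the laboratory's actual pipeline, and a real analysis would build the network from simulation or NMR data.

```python
import math


def eigenvector_centrality(adjacency, max_iter=1000, tol=1e-12):
    """Eigenvector centrality of an undirected graph by power iteration.

    `adjacency` is a square list of lists of non-negative edge weights.
    The iteration multiplies by (A + I), which has the same eigenvectors
    as A but avoids oscillation on bipartite graphs.
    """
    n = len(adjacency)
    scores = [1.0] * n
    for _ in range(max_iter):
        new = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(v * v for v in new))
        new = [v / norm for v in new]
        # Stop once the normalized scores no longer change.
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


# Toy network (hypothetical): node 0 contacts nodes 1, 2, and 3,
# which do not contact each other.
toy_contacts = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
toy_scores = eigenvector_centrality(toy_contacts)
```

In this toy network the hub node receives the highest score; in an allosteric analysis, high-scoring nodes would be candidate residues for relaying structural or dynamic information.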

CRISPR-Cas9 has the potential to modify disease-causing genes, but is prone to off-target alterations due to poor temporal control of its expression. It is therefore desirable to develop an allosterically controlled Cas9 that elicits no function unless activated, circumventing this limitation. Cas9 is reliant on conformational dynamics for allosteric function, but typical solution methods for characterizing motional ensembles, namely NMR, are insufficient as standalone techniques to fully characterize an enzyme of this size. Pairing experiments with in silico methods can elevate the level of insight from NMR alone by more precisely modeling NMR data and generating dynamic structural networks that illuminate regions of allosteric crosstalk that may become functional handles for enhanced control over genome editing. Learn More…


Dr. Sohini Ramachandran will develop new computational and analytical methodologies to identify risk alleles for leukemia that differ in incidence across ethnic groups and genders, and apply these methods to genome-wide association studies. Analyses of X-linked factors offer new insights into human genomic variation.

Learn More…

The most prominent diseases of modern times–including inflammatory bowel diseases (IBD) such as Crohn’s, ulcerative colitis, and colon cancer as well as metabolic diseases such as diabetes and obesity–result from a failure to maintain homeostatic interactions with commensal bacteria. However, at the moment, we do not fully understand the mechanisms that regulate host-microbe interactions. Moreover, attempts to identify common microbiome-associated patterns linked with these diseases have either failed or are inconsistent at best. It is likely that the intestinal flora is spatially stratified just like any other ecosystem; so far, large-scale sequence analysis of intestinal microbiota that uses fecal biota or entire intestinal luminal content as a surrogate for gut microflora has failed to pick up transverse stratification of the intestinal microbiome. We propose a novel approach for determining the structural and functional stratification of the intestinal microbiome. We will assess the role of host immunity in organizing bacterial communities by altering their functional gene content within the gut lumen. Finally, we will delineate how clinically relevant antibiotic regimens alter the spatial community structure of the human microbiome by employing humanized gnotobiotic mouse models. We will apply these approaches to understand how common genetic factors that are associated with IBD influence the function and structure of microbiota to cause dysbiosis and chronic inflammation.

Learn More…

While infection biology has largely focused on studying the immune response to a single infection, it is becoming increasingly clear that many infections involve more than one pathogen. Therefore, studying the effect of one pathogen on the response to another is of utmost clinical importance. Infection with the seasonal influenza virus leads to an estimated 500,000 deaths annually, and during global pandemics these numbers are even higher. Bacterial pneumonia is a common complication following infection with influenza virus, which leads to increased morbidity and mortality (1). We propose that the ability to survive an infection is determined by two main factors: resistance (the ability to respond to and clear the pathogen) and tolerance (the ability to tolerate the effects of a given pathogen burden) (2). Others and I have shown that infection with influenza virus compromises a variety of resistance mechanisms to many different bacterial pathogens. However, in a recent publication, I have shown that during influenza virus/bacterial coinfection, tolerance is also compromised (3). In a mouse model of influenza virus/Legionella pneumophila coinfection, the pathogen load remained unchanged, allowing us to focus on tolerance mechanisms. We found that by decreasing the inflammatory immune response and increasing the tissue repair response, we were able to increase tolerance to coinfection. As these complex infections are very difficult to treat effectively, this finding opens up a new avenue of research and potential treatments for human infectious disease.

In this current study, we will use a bioinformatics approach to explore the transcriptional profiles of coinfected lungs and lung epithelial cells by RNA-Seq (Aim 1). We will use these transcriptional profiles to find and screen small molecule drugs (perturbagens) that increase tolerance to coinfection in an in vitro system (Aim 2). We will then apply these findings to increasing tolerance in our in vivo model (Aim 3). This study will allow us to discover novel mechanisms of tolerance and treatments for viral/bacterial coinfections of the lung. This project has direct applications to human diseases. With the increase in organisms that are resistant to common antimicrobials, new treatment regimens are necessary to combat infectious diseases. In addition, even with effective antimicrobial treatments, damage can be caused that decreases tolerance, and we must focus on both treating the host and targeting the microbial pathogens. This is particularly true in the context of complex polymicrobial coinfections. Ultimately, these findings can be applied to tolerance mechanisms of other lung diseases.

Learn More…

Aging is the single most important risk factor for a wide range of chronic illnesses, including diabetes, heart disease, cancer and neurodegenerative diseases. Hence, interventions that can slow aging have the potential to prevent or at least retard the onset of these debilitating diseases. It was recently discovered that senescent cell clearance in mouse aging models improves healthspan and extends lifespan. Cellular senescence is a genetic program characterized by the irreversible arrest of proliferation and is integral to multiple in vivo processes including tumor suppression, embryonic development, wound healing and tissue repair. Senescent cells secrete inflammatory cytokines and accumulate with age due to a decline in immune system function. Thus, there is much interest in targeting senescent cells for clearance via pharmacological interventions, referred to as senolytic drugs, to alleviate pathologies of aging and improve healthspan. In this project we will characterize the heterogeneity of different forms of cellular senescence by single-cell transcriptomics analysis and will study the regulatory networks specific to the different sub-classes of senescent cells. We will then use their transcriptional signatures to identify novel putative senolytic drugs by querying available drug databases such as the Connectivity Map (CMAP).
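
The idea of querying a drug database with a transcriptional signature can be sketched as follows. This is a deliberately simplified scoring rule, not the statistic used by the Connectivity Map itself; the function names, gene labels, and drug profiles are all hypothetical. A compound scores low (more negative) when it pushes the signature's up-regulated genes down and its down-regulated genes up, that is, when it tends to reverse the signature.

```python
def reversal_score(up_genes, down_genes, drug_profile):
    """Mean drug-induced change of signature-up genes minus that of signature-down genes.

    `drug_profile` maps gene name -> expression change (e.g. log fold change)
    after treatment. Negative scores suggest the drug reverses the signature.
    """
    up = [drug_profile[g] for g in up_genes if g in drug_profile]
    down = [drug_profile[g] for g in down_genes if g in drug_profile]
    if not up or not down:
        raise ValueError("drug profile shares no genes with one side of the signature")
    return sum(up) / len(up) - sum(down) / len(down)


def rank_candidates(up_genes, down_genes, profiles):
    """Return (drug name, score) pairs sorted from most to least signature-reversing."""
    scores = {
        name: reversal_score(up_genes, down_genes, profile)
        for name, profile in profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data with placeholder gene and compound names (not real measurements).
signature_up = ["geneA", "geneB"]
signature_down = ["geneC"]
toy_profiles = {
    "compound_x": {"geneA": -2.0, "geneB": -1.0, "geneC": 1.5},
    "compound_y": {"geneA": 1.0, "geneB": 0.5, "geneC": -1.0},
}
ranking = rank_candidates(signature_up, signature_down, toy_profiles)
```

Here `compound_x` ranks first because it opposes the toy signature on every gene; candidates ranked this way would still need experimental validation.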

Learn More…

In the post-genome era, biological research and genomic medicine have been transformed by high-throughput technologies. New techniques have enabled researchers to investigate biological systems in great detail. Nonetheless, the extraordinary amount of information in the large number of emerging high-dimensional datasets has not been fully exploited. Increasingly, pathway analysis and other approaches based on a priori biological knowledge have improved success in extracting valuable information from high-throughput experiments and genome-wide association studies. Preeclampsia is a complex disease and one of the most common causes of fetal and maternal morbidity and mortality worldwide. It is one of the great but enigmatic health problems. Despite many studies, there has been little fundamental improvement in our understanding in decades. It is a multi-system hypertensive disorder of pregnancy, characterized by variable degrees of maternal symptoms including elevated blood pressure, proteinuria and fetal growth retardation, that affects 2-8% of deliveries in the US. Many clinicians believe there is a difference between preeclampsia and severe preeclampsia, or between early and late preeclampsia. However, to date there is little direct evidence that they represent different genetic etiologies. We hypothesize that preeclampsia is a complex, polygenic disorder that entails activation of a network of genes. We will perform a case/control study using whole exome sequencing. We will restrict our enrollment to patients with early, severe preeclampsia. The working hypothesis is that this will provide better power, lower heterogeneity, and higher genetic effect for this complex phenotype. We will develop new bioinformatic approaches to identify the gene networks and causal variants that contribute to severe preeclampsia. This will be coupled with high-throughput technologies applied to this carefully chosen cohort of patients.

Learn More…

Funding Opportunities

View more research from the Center for Computational Biology of Human Disease in our archives.