I’m a research assistant working toward a Doctor of Philosophy (Ph.D.) in Computer Science at Virginia Commonwealth University (VCU). I work with massive data sets and complex biological problems to understand how gene activity regulates disease progression and disease heterogeneity. My expertise includes Data Science, Machine Learning, Algorithms, Mathematical Modeling, and Optimization. I am proficient in C/C++, R, Matlab, and Python. I enjoy generating new ideas and devising feasible solutions to broadly relevant problems.
PhD in Computer Science, 2020
Virginia Commonwealth University
MS in Nanotechnology, 2014
Jadavpur University
BS in Electronics & Communication, 2012
West Bengal University of Technology
As manipulating the self-assembly of supramolecular and nanoscale constructs at the single-molecule level increasingly becomes the norm, new theoretical scaffolds must be erected to replace the thermodynamics- and kinetics-based models used to describe traditional bulk-phase active syntheses.
Correspondence between experimental results and EKS models on switching of pathway
Mutual information of multistep signaling cascades
Alzheimer’s disease (AD) and Parkinson’s disease (PD) are the most common neurodegenerative disorders related to aging. Though several risk factors are shared between these two diseases, the exact relationship between them is still unknown. In this paper, we analyzed how these two diseases relate to each other from the genomic, epigenomic, and transcriptomic viewpoints. Using extensive literature mining, we first compiled the list of genes from major genome-wide association studies (GWAS). Based on these studies, we observed that only one gene (HLA-DRB5) was shared between AD and PD. A subsequent literature search identified a few other genes involved in these two diseases, among which SIRT1 seemed to be the most prominent. While we listed all the miRNAs that have been previously reported for AD and PD separately, we found only 15 different miRNAs that were reported in both diseases. To gain better insight, we predicted the gene co-expression network for both AD and PD using network analysis algorithms applied to two GEO datasets. The network analysis revealed six clusters of genes related to AD and four clusters of genes related to PD; however, there was very low functional similarity between these clusters, pointing to insignificant similarity between AD and PD even at the level of affected biological processes. Finally, we postulated the putative epigenetic regulator modules that are common to AD and PD.
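As a rough illustration of what a gene co-expression network is, the sketch below links two genes whenever the Pearson correlation of their expression profiles exceeds a cutoff. The abstract does not say which network analysis algorithms were used, so this correlation-threshold approach, the function names, the toy data, and the 0.9 cutoff are all assumptions made for illustration, not the method of the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (a constant profile would
    give a zero denominator).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene, gene, r) for every pair with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is a hypothetical choice, not a value from the paper.
    """
    genes = sorted(expression)
    edges = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expression[gene_a], expression[gene_b])
            if abs(r) >= threshold:
                edges.append((gene_a, gene_b, r))
    return edges


# Toy data (invented): geneA and geneB rise together, geneC does not.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy)
```

Clusters such as those described in the abstract would then be sought among the connected genes of such a network; real analyses on GEO datasets typically involve normalization and more careful module detection than this sketch shows.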
Research that meaningfully integrates constraint-based modeling with machine learning is in its infancy but holds much promise. Here, we consider where machine learning has been implemented within the constraint-based modeling reconstruction framework and highlight the need to develop approaches that can identify meaningful features from large-scale data and connect them to biological mechanisms, establishing the causality that links genotype to phenotype. We motivate the construction of iterative integrative schemes in which machine learning fine-tunes the input constraints of a constraint-based model or, conversely, constraint-based model simulation results are analyzed by machine learning and reconciled with experimental data. Such a scheme can iteratively refine a constraint-based model until there is consistency between experimental data, machine learning results, and constraint-based model simulations.
Synthetic biologists endeavor to predict how the increasing complexity of multi-step signaling cascades impacts the fidelity of molecular signaling, whereby information about the cellular state is often transmitted with proteins that diffuse by a pseudo-one-dimensional stochastic process. This raises the question of how the cell leverages passive transport mechanisms to distinguish informative signals from the intrinsic noise of diffusion. We address this problem by using a one-dimensional drift-diffusion model to derive an approximate lower bound on the degree of facilitation needed to achieve single-bit informational efficiency in signaling cascades as a function of their length. Within the assumptions of our model, we find that a universal curve of the Shannon-Hartley form describes the information transmitted by a signaling chain of arbitrary length and depends upon only a small number of physically measurable parameters. This enables our model to be used in conjunction with experimental measurements to aid in the selective design of biomolecular systems that can overcome noise to function reliably, even at the single-cell level.
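For readers unfamiliar with the "Shannon-Hartley form" mentioned above, the sketch below evaluates the textbook Shannon-Hartley expression, C = B log2(1 + S/N). It shows only the generic shape of such a curve. The abstract does not give the paper's own formula or say how the drift-diffusion parameters enter it, so the function and parameter names here are hypothetical.

```python
from math import log2


def shannon_hartley_capacity(bandwidth_hz, snr):
    """Textbook Shannon-Hartley capacity in bits per second.

    bandwidth_hz: channel bandwidth B (must be positive)
    snr: linear signal-to-noise power ratio S/N (not in decibels)
    """
    if bandwidth_hz <= 0 or snr < 0:
        raise ValueError("bandwidth must be positive and snr non-negative")
    return bandwidth_hz * log2(1.0 + snr)


def min_snr_for_rate(bandwidth_hz, rate_bits_per_s):
    """Invert the formula: smallest S/N that supports the given rate."""
    if bandwidth_hz <= 0 or rate_bits_per_s < 0:
        raise ValueError("bandwidth must be positive and rate non-negative")
    return 2.0 ** (rate_bits_per_s / bandwidth_hz) - 1.0


# With unit bandwidth, one bit per second requires S/N = 1,
# i.e. signal power equal to noise power.
one_bit_snr = min_snr_for_rate(1.0, 1.0)
capacity_at_that_snr = shannon_hartley_capacity(1.0, one_bit_snr)
```

The inversion in `min_snr_for_rate` mirrors the idea of a lower bound for single-bit transmission, but only for this generic formula and not for the bound derived in the paper.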