Cancer Informatics Core pilot (CIC-p)

The Cancer Informatics Core pilot project (CIC-p) provides collaborative informatics support for cancer research by offering guidance on software use and by acting as a liaison for various NGS data services.

CIC-p Community and Leadership


Raja Mazumder, PhD

Associate Professor of Biochemistry and Molecular Medicine

Office Phone: 202-994-5004


Mazumder's Bio

Dr. Mazumder is an Associate Professor of Biochemistry and Molecular Medicine and Co-Director of The McCormick Genomic Proteomic Center at The George Washington University (GW). While working at the National Center for Biotechnology Information (NCBI) at NIH, UniProt, and the Protein Information Resource (PIR), Dr. Mazumder worked closely with colleagues developing international molecular biology resources and using these resources to identify therapeutic, diagnostic, and vaccine targets. Through current NIH, FDA, and industry grants and contracts, he is involved in genomic and bioinformatics research associated with cancer biology, glycobiology, metagenomics, and standards development. His expertise is in the development of big data analytics platforms and knowledgebases for NGS and proteomic data analysis.



Anelia Horvath, Ph.D.

Associate Research Professor of Pharmacology and Physiology

Office Phone: 202-994-2114 


Horvath's Bio

Dr. Horvath is an Associate Research Professor of Pharmacology and Physiology. She is also a Co-Director of The McCormick Genomic Proteomic Center at GW. In this role she works closely with colleagues from GW to support omics-level projects and to assist with project design, data analysis, and interpretation. Dr. Horvath's team has designed and implemented Next Generation Sequencing (NGS) analytical pipelines that integrate standard alignment, assembly, abundance, and variant analysis with a wide variety of custom modules that answer project-specific questions. Dr. Horvath's research has two main aspects. The first is identifying biologically significant trends in NGS datasets. Her team designs novel strategies for mining and integrating different levels of -omics data, and implements them in user-friendly software packages. The second aspect focuses on applying these methodologies to different disease conditions, including cancer. A special focus is breast cancer, where her team works on cancer-protective genomic signatures identified in high-risk mutation carriers who are free of cancer.



Keith A. Crandall, Ph.D.

Professor of Computational Biology

Office Phone: 571-553-0107


Crandall's Bio

Dr. Crandall is the founding Director of the Computational Biology Institute at GW. Professor Crandall studies computational biology, population genetics, and bioinformatics, developing and testing Big Data methods for DNA sequence analysis. He applies such methods to the study of the evolution of both infectious diseases (especially HIV) and crustaceans (especially crayfish). Professor Crandall has authored over 250 peer-reviewed publications, as well as three books. He has been a Fulbright Visiting Scholar to Oxford University and an Allen Wilson Centre Sabbatical Fellow at the University of Auckland. Professor Crandall has received a number of awards for research and teaching, including the American Naturalist Society Young Investigator Award, an NSF CAREER Award, a PhRMA Foundation Faculty Development Award in Bioinformatics, the Honors Professor of the Year award at Brigham Young University, ISI Highly Cited Researcher, and the Edward O. Wilson Naturalist Award. He was also recently elected a Fellow of the American Association for the Advancement of Science (AAAS). Professor Crandall earned a BA in Biology and Mathematics from Kalamazoo College, an MA in Statistics from Washington University, and a PhD in Biology and Biomedical Sciences from Washington University School of Medicine. He also served as a Peace Corps Volunteer in Puyo, Ecuador.


Hiroki Morizono, Ph.D.

Associate Research Professor of Integrative Systems Biology

Office Phone: 202-476-4862


Morizono's Bio

Dr. Morizono is an Associate Research Professor of Integrative Systems Biology and Pediatrics at the GW School of Medicine and Health Sciences. He is also a Principal Investigator at the Center for Genetic Medicine Research at Children's National Health System. Dr. Morizono is one of the leads for the Informatics component of the GW-Children's Clinical and Translational Science Institute, where he is developing systems that will enable researchers to interrogate electronic health record data and gain insights into factors that affect pediatric health. In this role, he has worked closely with colleagues at GW to bridge gaps in expertise, encourage the formation of new collaborations, and improve the mutual sharing of research resources.

His research has largely focused on inborn errors of metabolism, specifically urea cycle disorders, using informatics, biochemical, biophysical, and structural biology approaches. His group has shown that disruption of this pathway appears to increase levels of fat in the liver, a factor known to increase the risk for fibrosis and hepatocellular carcinomas, and that this can be reduced through gene delivery using AAV-based vectors.


Qing Zeng, Ph.D.

Professor of Biomedical Informatics 

Office Phone: 202-994-5615 


Zeng's Bio

Dr. Zeng is a tenured professor and the Director of the Biomedical Informatics Center at GW. She is also the Associate Director of the Center for Health and Aging at the Washington DC VA Medical Center. She is an elected Fellow of the American College of Medical Informatics. With 20 years of experience in informatics, her research focuses on data mining, consumer health informatics, and semantic data integration. She has published over 90 peer-reviewed articles and has served as the PI on more than a dozen NIH, VA, DOD, and corporate-sponsored research projects, and she has collaborated widely with colleagues in clinical domains ranging from cardiology to oncology. Her manuscripts have repeatedly been selected for the International Medical Informatics Association's Yearbooks. Other awards include the Harriet H. Werley Award at the 2007 AMIA Annual Symposium, the Distinguished Paper Award at the 2011 American Medical Informatics Association's Joint Summit, and the Distinguished Paper Award at the 2016 American Medical Informatics Association's Annual Symposium. In 2014, her "HeartsLikeMine" project won first place in the "Ideas That Work" contest at the iHealth conference sponsored by the American Medical Informatics Association and Academy Health. During her time at Harvard, she developed a natural language processing (NLP) tool (HITEx) for two large consortium projects (i2b2 and SPIN). HITEx was the first open-source, comprehensive clinical NLP system in the nation. At the VA, she developed NLP tools for two other large consortium projects (VINCI and CHIR) and organized a VA-wide collaboration to develop interoperability standards and collaborative NLP development environments for the broader clinical NLP community.

Commercial software available through MGPC

  • Geneious R10 

    Geneious is a genome browser, reference mapping and sequence assembly tool used for NGS analysis.
  • HGMD

    HGMD is the gold standard resource for comprehensive data on published human inherited disease mutations.
  • Ingenuity Pathway Analysis (IPA)

    IPA provides pathway and network analysis of complex omics data.
  • MetaCore

    MetaCore provides high-quality biological systems content in context, ideal for systems biology research.
  • MOE

    MOE is drug design and homology modeling software.
  • Oncomine

    Oncomine computes gene expression signatures, clusters, and gene-set modules for extracting biological insights from the data.
  • OriginLab

    OriginLab is data graphing software for scientists and engineers, offering 2D and 3D plotting, statistics, curve fitting, and peak fitting.
  • TRANSFAC

    TRANSFAC provides data on eukaryotic transcription factors, their binding sites, consensus binding sequences and regulated genes.


RNA Sequencing

How much does RNA sequencing cost?

RNA sequencing (RNA-seq), also called whole transcriptome shotgun sequencing, is a method for profiling the transcriptome of blood, cell, or tissue samples. RNA-seq allows users to explore transcript isoforms, gene fusions, single nucleotide variants, and the expression of specific genes. Prices depend on the number of reads (e.g. 15 million reads for a human sample cost approximately $500/sample). The user needs to provide the RNA library, the cell sample, or the tissue sample. Alternatively, the user may send raw material (tissue, cells, etc.) to the sequencing company, which can isolate RNA, including small RNA, for $50/sample.

Exome Sequencing

How much does Exome Sequencing cost?

Exome sequencing, also called whole exome sequencing (WES), is a technique for sequencing the protein-coding regions of a genome. Exome-seq allows users to uncover single nucleotide variants and chromosomal aberrations. Prices depend on the number of reads and the average coverage (e.g. 18 million reads at 35X average coverage cost approximately $1,100/sample). The user needs to provide total genomic DNA. Alternatively, the user may send raw material (tissue, cells, etc.) to the sequencing company, which can isolate DNA for $50/sample.

Metagenomics Sequencing

How much does Metagenomics Sequencing cost?

Metagenomics sequencing, also known as environmental genomics, ecogenomics, or community genomics, is a method for studying genetic material recovered from environmental samples. Metagenomic sequencing allows researchers to identify the organisms present in a sample, for example to evaluate bacterial diversity in an environment. The services include DNA isolation, amplification, library preparation, and sequencing on MiSeq for human samples. Prices vary for different analyses (e.g. 45 million reads at 2X average coverage cost approximately $300/sample).
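
For rough budgeting, the example prices quoted in the three sections above can be combined into a simple per-project estimate. The Python sketch below is only an illustration: the function name `estimate_cost` and the lookup tables are hypothetical, the figures are the approximate example prices from this page (each tied to a specific read depth and coverage), and it assumes that the metagenomics price already includes DNA isolation, as the service description suggests. Actual quotes from a sequencing provider may differ.

```python
# Approximate example prices per sample (USD), copied from the text above.
# Each is tied to one read depth/coverage; real quotes will vary.
SEQUENCING_PRICE = {
    "rna": 500,           # ~15 million reads, human sample
    "exome": 1100,        # ~18 million reads, 35X average coverage
    "metagenomics": 300,  # ~45 million reads, 2X average coverage
}

# Optional nucleic acid isolation fee per sample (USD).
# Assumption: metagenomics pricing already includes DNA isolation.
ISOLATION_PRICE = {"rna": 50, "exome": 50, "metagenomics": 0}


def estimate_cost(service, n_samples, isolate=False):
    """Return a rough total cost in USD for n_samples of one service."""
    if service not in SEQUENCING_PRICE:
        raise ValueError(f"unknown service: {service!r}")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    per_sample = SEQUENCING_PRICE[service]
    if isolate:
        # Add the isolation fee when raw material is sent instead of RNA/DNA.
        per_sample += ISOLATION_PRICE[service]
    return per_sample * n_samples


if __name__ == "__main__":
    # Ten RNA-seq samples sent as raw tissue: (500 + 50) * 10
    print(estimate_cost("rna", 10, isolate=True))
    # Four exome samples with user-supplied genomic DNA: 1100 * 4
    print(estimate_cost("exome", 4))
```

Under these assumptions, ten RNA-seq samples submitted as raw tissue come to about $5,500, and four exome samples with user-supplied DNA come to about $4,400.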