COAD Project

An educational research website for exploring COAD PanCanAtlas data, tumor/normal modeling, protein mutation sites, and ChEMBL compound context.

For education and research only.
Not for medical diagnosis.

COAD Workflow: From Data to Biological Insight

1. Data Selection (PanCanAtlas)

COAD Samples

Tumor | Normal

512 samples

  • COAD (colon adenocarcinoma) RNA-seq (TPM)
  • Sample filtering & QC
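
The labeling and filtering step could look roughly like the sketch below. It assumes the samples carry TCGA-style barcodes, where the two-digit code in the fourth field marks tumor (01 to 09) or normal (10 to 19) tissue. The QC rule (a minimum number of detected genes), the thresholds, and the function names are illustrative assumptions, not the project's actual settings.

```python
def sample_type(barcode: str) -> str:
    """Classify a TCGA-style barcode as 'tumor', 'normal', or 'other'."""
    parts = barcode.split("-")
    # The fourth field (e.g. "01A") starts with the two-digit sample-type code.
    if len(parts) < 4 or len(parts[3]) < 2 or not parts[3][:2].isdigit():
        return "other"
    code = int(parts[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    return "other"


def filter_samples(tpm_by_sample, min_detected=1000, min_tpm=1.0):
    """Keep tumor/normal samples with enough detected genes.

    tpm_by_sample maps barcode -> {gene: TPM}. Both thresholds are
    hypothetical defaults chosen only for illustration.
    """
    kept = {}
    for barcode, tpm in tpm_by_sample.items():
        label = sample_type(barcode)
        detected = sum(1 for value in tpm.values() if value >= min_tpm)
        if label != "other" and detected >= min_detected:
            kept[barcode] = label
    return kept
```

The function returns a barcode-to-label mapping, which can serve as the tumor/normal target for the later modeling step.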

2. Feature Matrix (Gene Expression)

Top variable genes

Genes x Samples

  • Normalize & log-transform
  • Select informative genes
  • Build feature matrix
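
The three steps above can be sketched in plain Python. This is a minimal illustration that assumes a log2(TPM + 1) transform and variance-based ranking. The pseudocount, the ranking criterion, and the function names are assumptions, since the page does not state which were used.

```python
import math
from statistics import pvariance


def log_transform(tpm_matrix):
    """Apply log2(TPM + 1) to a genes x samples mapping {gene: [TPM, ...]}."""
    return {
        gene: [math.log2(value + 1.0) for value in values]
        for gene, values in tpm_matrix.items()
    }


def top_variable_genes(log_matrix, k):
    """Return the k genes with the highest variance across samples."""
    ranked = sorted(log_matrix, key=lambda g: pvariance(log_matrix[g]), reverse=True)
    return ranked[:k]


def build_feature_matrix(log_matrix, genes):
    """Transpose the selected genes into samples x genes rows for a classifier."""
    n_samples = len(log_matrix[genes[0]])
    return [[log_matrix[gene][i] for gene in genes] for i in range(n_samples)]


# Toy input with hypothetical gene names and TPM values.
tpm = {"GENE_A": [0, 3, 15], "GENE_B": [1, 1, 1], "GENE_C": [0, 1, 3]}
logged = log_transform(tpm)
selected = top_variable_genes(logged, k=2)
features = build_feature_matrix(logged, selected)
```

The input here is genes x samples, matching the diagram, while the output is transposed to one row per sample because most classifiers expect that layout.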

3. Model Evaluation (Tumor vs Normal)

Confusion Matrix (figure; axis label: Predicted)

Accuracy 0.89   |   F1 0.88   |   AUC 0.93

  • XGBoost classification
  • Stratified CV evaluation
  • Robust performance
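
To make the evaluation step concrete, here is a minimal sketch of stratified fold assignment and of how accuracy and F1 follow from predictions. It deliberately leaves out the classifier itself, so no XGBoost call appears, and it does not compute AUC. The fold count, the seed, and the function names are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into folds that preserve the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and F1 for the positive (here: tumor) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1
```

In a full run, each fold would be held out once while a model is trained on the remaining folds, and the metrics would be averaged across folds. Stratification matters here because tumor samples typically far outnumber normal samples.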

4. Biology Interpretation (Proteins & Pathways)

Top Differential Protein Signals, by log2 Fold Change (Tumor / Normal): ANXA1, CEACAM6, REG1A, MUC2, CA9

WNT Signaling | Cell Adhesion | EGFR Signaling | PI3K-AKT
  • Protein & pathway analysis
  • Biological context
  • Hypothesis generation
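
The log2 fold change used to rank signals can be computed as in the sketch below. It assumes the ratio is taken between group means with a pseudocount of 1 to avoid division by zero. Both choices, and the function names, are illustrative assumptions rather than the project's documented method.

```python
import math
from statistics import mean


def log2_fold_change(tumor_values, normal_values, pseudocount=1.0):
    """Return log2((mean tumor + c) / (mean normal + c))."""
    tumor_mean = mean(tumor_values) + pseudocount
    normal_mean = mean(normal_values) + pseudocount
    return math.log2(tumor_mean / normal_mean)


def rank_by_fold_change(expression):
    """Sort genes by absolute log2 fold change, largest first.

    expression maps gene -> (tumor_values, normal_values).
    """
    changes = {
        gene: log2_fold_change(tumor, normal)
        for gene, (tumor, normal) in expression.items()
    }
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical gene names and values, for illustration only.
toy = {
    "GENE_UP": ([7, 7], [1, 1]),
    "GENE_DOWN": ([0, 0], [3, 3]),
    "GENE_FLAT": ([2, 2], [2, 2]),
}
ranking = rank_by_fold_change(toy)
```

A positive value means higher expression in tumor than in normal tissue, and a negative value means lower. The ranking is a starting point for hypotheses, not evidence of a causal role.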

COAD PanCanAtlas Data

COAD (Colon Adenocarcinoma) data from PanCanAtlas is the foundation of this project. Explore the sample composition, data types, and key characteristics used in the analysis.

Learn more about COAD data

Project Links

This is a personal learning project by Nancy. The code and research notes are available on GitHub, and questions can be sent by email.