RNA-seq DESeq2 Pipeline
A sequential Kubernetes Job pipeline for differential expression analysis on yeast RNA-seq data. Each stage runs as a one-shot Job against a shared PVC, in order.
Dataset
- Source: Gierliński et al., ENA accession PRJEB5348
- Reads: 50 bp single-end
- Conditions: wild-type (WT: ERR458493–495) vs. snf2 deletion mutant (snf2: ERR458500–502)
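The six run accessions above can be written out as a small sample sheet, which is the usual shape for the condition table a DESeq2 run needs. This is only a sketch: the `sample_sheet` function name and the `run,condition` column headers are illustrative assumptions, not something taken from the pipeline manifests.

```shell
#!/bin/sh
# Print a sample sheet (run accession, condition) for the six runs listed above.
# Column names are assumed for illustration; adjust to whatever the DESeq2 stage expects.
sample_sheet() {
  echo "run,condition"
  # Wild-type replicates: ERR458493 to ERR458495
  for n in 493 494 495; do
    echo "ERR458${n},WT"
  done
  # snf2 deletion mutant replicates: ERR458500 to ERR458502
  for n in 500 501 502; do
    echo "ERR458${n},snf2"
  done
}

sample_sheet
```

Redirecting the output to a file on the shared PVC would let later stages read it, assuming they are written to do so.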
Pipeline stages
| Order | File | Stage |
|---|---|---|
| 1 | 01-pvc.yaml | Shared PersistentVolumeClaim for pipeline data and intermediate files |
| 2 | 02-job-sra-download.yaml | Downloads raw FASTQ reads from SRA/ENA |
| 2b | 02b-job-sra-download-extra.yaml | Downloads the remaining replicate samples |
| 3 | 03-job-fastqc.yaml | FastQC read quality control |
| 4 | 04-job-star.yaml | STAR alignment to the reference genome |
| 4b | 04b-job-star-extra.yaml | STAR alignment for the remaining replicate samples |
| 5 | 05-job-featurecounts.yaml | Gene-level count matrix from aligned reads |
| 6 | 06-job-deseq2.yaml | DESeq2 differential expression analysis (WT vs. snf2) |
Results
- STAR alignment: ~85–90% mapping rate across samples
- DESeq2 output visualized (volcano plot, etc.) in a Jupyter R notebook
Running
Namespace: rnaseq. Jobs are sequential — each depends on the previous stage's output landing on the shared PVC, so apply and wait for completion before moving to the next:
kubectl apply -f 01-pvc.yaml
kubectl apply -f 02-job-sra-download.yaml
kubectl get jobs -n rnaseq -w # wait until the Job shows all completions (e.g. 1/1) before continuing; repeat after each Job
kubectl apply -f 02b-job-sra-download-extra.yaml
kubectl apply -f 03-job-fastqc.yaml
kubectl apply -f 04-job-star.yaml
kubectl apply -f 04b-job-star-extra.yaml
kubectl apply -f 05-job-featurecounts.yaml
kubectl apply -f 06-job-deseq2.yaml
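The apply-then-wait sequence can also be scripted so that each stage blocks until its Job finishes. The following is a minimal sketch, not part of the repository. It assumes the manifests either omit a namespace or set it to `rnaseq`, and that each Job manifest defines a single Job. The `run_stage` and `run_pipeline` function names, the `--run` argument, and the 6h timeout are hypothetical choices.

```shell
#!/bin/sh
# Sketch: apply each stage manifest, then block until its Job reports Complete.
# NAMESPACE comes from the README; TIMEOUT is an assumed value, tune it per stage.
NAMESPACE="${NAMESPACE:-rnaseq}"
TIMEOUT="${TIMEOUT:-6h}"

run_stage() {
  manifest="$1"
  kubectl apply -n "$NAMESPACE" -f "$manifest" || return 1
  # kubectl wait exits non-zero if the condition is not met within the timeout.
  kubectl wait -n "$NAMESPACE" --for=condition=complete \
    --timeout="$TIMEOUT" -f "$manifest"
}

run_pipeline() {
  # The PVC is applied without waiting: with some storage classes it stays
  # Pending until the first pod mounts it.
  kubectl apply -n "$NAMESPACE" -f 01-pvc.yaml || return 1
  for manifest in \
    02-job-sra-download.yaml \
    02b-job-sra-download-extra.yaml \
    03-job-fastqc.yaml \
    04-job-star.yaml \
    04b-job-star-extra.yaml \
    05-job-featurecounts.yaml \
    06-job-deseq2.yaml
  do
    run_stage "$manifest" || { echo "stage failed: $manifest" >&2; return 1; }
  done
}

# Only touch the cluster when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  run_pipeline
fi
```

Saved under a name of your choosing, the script runs the full sequence when invoked with `--run`. Because it waits only on the `complete` condition, a Job that fails will hold the script until the timeout expires, so checking `kubectl get jobs -n rnaseq` in another terminal is still worthwhile.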