SeqMinerCmd Tutorial
A command line tool for SeqMiner
(Updated October 1, 2013)
SeqMinerCmd is a convenient data extraction tool for VCF and BCF files. The software can be downloaded from this link, and its source code can be obtained from GitHub.
This tutorial demonstrates extraction by region and by gene; the extracted results can be stored as an R matrix or a list.
Extract genotype matrix from VCF file
- extract by region
./seqMiner -r "1:196621007-196634467" vcf/all.anno.filtered.extract.bcf.gz
## 1 region to be extracted.
## ----- 1:196621007-196634467 -----
## Position NA12286 NA12341 NA12342
## 1:196623337 1 2 0
## 1:196632129 0 2 0
## 1:196632470 1 2 0
## 1:196633606 2 2 2
- extract by gene
./seqMiner --geneFile vcf/refFlat_hg19_6col.txt.gz -n CFH vcf/all.anno.filtered.extract.vcf.gz
## 1 region to be extracted.
## ----- CFH -----
## Position NA12286 NA12341 NA12342
## 1:196623337 1 2 0
## 1:196632129 0 2 0
## 1:196632470 1 2 0
## 1:196633606 2 2 2
## 1:196623337 1 2 0
## 1:196632129 0 2 0
## 1:196632470 1 2 0
## 1:196633606 2 2 2
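In the by-gene output above, each variant row is printed twice. If the repeated rows are not wanted downstream, they can be filtered with standard shell tools. The following is a minimal sketch and not part of SeqMinerCmd: `dedup_rows` is a hypothetical helper name, and the sample text is copied from the output above with the `## ` display prefix removed, on the assumption that the tool's raw output does not include it.

```shell
# Hypothetical helper (not a SeqMinerCmd feature):
# print each distinct line once, keeping first-seen order.
dedup_rows() {
  awk '!seen[$0]++'
}

# Sample rows copied from the by-gene output above.
# Assumption: the "## " prefix is a display artifact and is omitted here.
gene_output='Position NA12286 NA12341 NA12342
1:196623337 1 2 0
1:196632129 0 2 0
1:196632470 1 2 0
1:196633606 2 2 2
1:196623337 1 2 0
1:196632129 0 2 0
1:196632470 1 2 0
1:196633606 2 2 2'

printf '%s\n' "$gene_output" | dedup_rows
```

With real data, the same filter would be applied to the saved output of the command instead of the inlined sample text.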
Extract arbitrary fields from VCF file
- extract by region
./seqMiner -r "1:196621007-196634467" -e CHROM,POS:DP:GT,GD vcf/all.anno.filtered.extract.vcf.gz
## CHROM POS DP NA12286:GD NA12341:GD NA12342:GD NA12286:GT NA12341:GT NA12342:GT
## 1 196623337 23 0 2 21 0/1 1/1 0/0
## 1 196632129 25 1 7 17 0/0 1/1 0/0
## 1 196632470 28 3 7 18 0/1 1/1 0/0
## 1 196633606 26 2 5 19 1/1 1/1 1/1
- extract by gene
./seqMiner --geneFile vcf/refFlat_hg19_6col.txt.gz -n CFH -e CHROM,POS:DP vcf/all.anno.filtered.extract.vcf.gz
## CHROM POS DP
## 1 196623337 23
## 1 196632129 25
## 1 196632470 28
## 1 196633606 26
## 1 196623337 23
## 1 196632129 25
## 1 196632470 28
## 1 196633606 26
Contact
SeqMiner is developed by Xiaowei Zhan and Dajiang Liu. We welcome your questions and feedback.