Questions tagged [vcftools]

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.

This toolset can be used to perform the following operations on VCF files:

Filter out specific variants
Compare files
Summarize variants
Convert to different file types
Validate and merge files
Create intersections and subsets of variants
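
To make the first and third operations above concrete, here is a minimal sketch in plain Python. It is not VCFtools itself and does not use its API. The helper names `filter_vcf_lines` and `count_per_contig`, and the example records, are hypothetical and chosen only for illustration. The sketch assumes an uncompressed, tab-separated VCF whose first column is CHROM.

```python
from collections import Counter


def filter_vcf_lines(lines, keep_contigs):
    """Yield header lines unchanged, plus data lines whose CHROM is in keep_contigs."""
    for line in lines:
        if line.startswith("#"):
            # Meta-information and the #CHROM header line are always kept.
            yield line
        elif line.split("\t", 1)[0] in keep_contigs:
            yield line


def count_per_contig(lines):
    """Count data records per CHROM value, ignoring header lines."""
    return Counter(
        line.split("\t", 1)[0] for line in lines if not line.startswith("#")
    )


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig1\t100\t.\tA\tT\t50\tPASS\t.",
    "contig2\t200\t.\tG\tC\t50\tPASS\t.",
    "contig1\t300\t.\tC\tG\t50\tPASS\t.",
]

kept = list(filter_vcf_lines(example, {"contig1"}))
counts = count_per_contig(kept)
```

For real data, a dedicated tool is usually the better choice, since it also handles compressed input, indexing, and header consistency.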

42 questions

vote

1 answer

Combine multiple VCF files into one large VCF file

I have a list of VCF files from specific ethnicities such as American Indian, Chinese, European, etc. Under each ethnicity, I have 100+ files. Currently, I have computed the VARIANT QC metrics such as call_rate, n_het, etc. for one file as shown…

asked Sep 08 '20 at 13:53

The Great

vote

0 answers

How to filter a VCF file with a list of CHR or contig IDs?

I need to subset/filter a SNP vcf file by a long list of non-sequential contig IDs, which appear in the CHR column. My VCF file contains 13,971 contigs currently, and I want to retain a specific set of 7,748 contigs and everything associated with…

vcf-variant-call-format vcftools

asked Jun 14 '20 at 01:00

acoles

vote

2 answers

vcftools - installing on MAC

I'm trying to install vcftools on mac. Looking at previous posts on this issue, I made sure I've got Mac OS X developer tools (http://www.cnet.com/how-to/install-command-line-developer-tools-in-os-x/). I followed the procedure recommended in the…

bioinformatics vcftools

asked Mar 29 '18 at 19:23

FcmC

vote

2 answers

vcf to ped format: redefine non-dbSNPs

When I am converting a vcf file to ped format (with vcftools or with vcf to ped converter of 1000G), I run into the problem that the IDs of the variants that don't have a dbSNP ID get the base pair position of that variant as an ID. Example of…

bioinformatics vcftools vcf-variant-call-format

asked Jan 28 '14 at 08:31

marie de boer

vote

1 answer

Preparing a Perl file to run with Ubuntu and tabix

I don't know Ubuntu or Perl but still need to install and run a program on it. This is what I am looking at: http://vcftools.sourceforge.net/docs.html On the installation section it says this: To build the vcftools executable, type "make" in…

perl ubuntu bioinformatics vcf-variant-call-format vcftools

asked Jul 31 '12 at 17:13

Bohn

votes

0 answers

Merging two plink files

I have two plink binary files: one containing only polymorphic sites (400K SNPs), and the other a plink file with reference data containing more sites (500K). How can I merge them so that those extra 100K SNPs will not be assigned to missing in a…

variant genetics vcftools

asked May 29 '23 at 22:56

Anna

votes

0 answers

vcf2maf - generate one maf file for two vcf files

I have 38 samples in vcf format and need to generate maf files for each to visualise them using MesKit in R. Some of the samples are matched tumour and normal and I was wondering if there is a way to generate one single maf file for the two vcf…

r vcftools

asked Jan 30 '23 at 09:29

CH1374

votes

1 answer

Is It Possible to Calculate Allele Frequency in a VCF File with Python?

I have a VCF file with 200 samples (mitochondrial genome of Plasmodium falciparum). I managed to transform the raw data into Pandas dataframe. Here is a pic to take a look at: And a few relevant lines from the actual…

python vcf-vcard vcftools

asked Jan 28 '23 at 12:29

eh329

votes

0 answers

python error: Traceback (most recent call last), IndexError: list index out of range

I'm trying to run the below python script (vcf2treemix.py) with the command <./vcf2treemix.py -vcf allsamples14_filtered_1_autosomes38_bisnps.vcf.gz -pop allsamples14.clust.pop> I got this error with both python 2 and 3 ######### error…

python vcftools

asked Nov 19 '22 at 08:42

Ali Basuony

votes

1 answer

Missing data per site

I want to calculate missing-data statistics for each site in my vcf file. Using vcftools --missing-site gives wrong stats for several sites. Is there any other way to calculate it? Thank you!

missing-data vcftools

asked May 01 '22 at 21:20

Anna

votes

1 answer

Extract variant positions from VCF dependent on contents of other columns

I have a vcf file, I am trying to extract the information from these columns: #CHROM POS REF ALT However I would like to extract these only if the SAMPLE-1 column contains the string DeNovo (Not DeNovoSV) and that SAMPLE-1, SAMPLE-2, and…

awk sed samtools vcftools bcftools

asked Feb 13 '22 at 21:30

hdjc90

votes

1 answer

How to run ensembl-vep in conda

I’ve installed it like so: conda install ensembl-vep=105.0-0 And then installed the human cache like this: vep_install -a cf -s homo_sapiens -y GRCh38 -c /mnt/gpfs/live/rd01__/ritd-ag-project-rd018o-mdflo13/refs/vep --CONVERT But I can’t get it to run…

variant vcf-variant-call-format vcftools gatk

asked Jan 21 '22 at 12:11

Mike

votes

1 answer

VCF file is missing mandatory header line ("#CHROM...")

I get an error when I try to read a VCF file using the scikit-allel library inside a Docker image running Ubuntu 18.04. It shows: raise RuntimeError('VCF file is missing mandatory header line ("#CHROM...")') RuntimeError: VCF file is…

pandas numpy scikit-learn python-3.6 vcftools

asked Dec 17 '21 at 17:14

Shahedul Islam

votes

1 answer

creating a per sample table from a vcf using bcftools

I have a multi-sample vcf file and I want to get a table with sample IDs in the left column, followed by the variants for which they carry an alternate allele. It should look like this: ID1 chr2:87432:A:T_0/1 chr10:43234:C:G_1/1 ID2 chr2:87432_A:T_1/1 ID3…

vcftools bcftools

asked Nov 16 '21 at 14:10

tacrolimus

votes

2 answers

Merge three columns in one (linux, python, or perl)

I have one file (.tsv) that contains variant calls for all the samples. I would like to merge the first three columns into one column: Example: Original: file name= variants.tsv > the first three columns that I want to merge are: lane sampleID …

linux csv merge columnsorting vcftools

asked Oct 17 '21 at 07:29

Alhu.A
