Variant Call Format

The Variant Call Format (VCF) is a standard text file format used in bioinformatics for storing gene sequence variations. The format was developed in 2010 for the 1000 Genomes Project and has since been used by other large-scale genotyping and DNA sequencing projects. VCF is a common output format for variant calling programs because of its relative simplicity and scalability. Many tools have been developed for editing and manipulating VCF files. These include VCFtools, which was released in conjunction with the VCF format in 2011, and BCFtools, which was part of SAMtools until it was split into an independent package in 2014.

Variant Call Format
Filename extension: .vcf
Developed by: 1000 Genomes Project
Latest release: 4.3 (January 13, 2021)
Type of format: Bioinformatics
Extended from: Tab-separated values
Extended to: gVCF
Open format?: Yes
Website: samtools.github.io/hts-specs/VCFv4.3.pdf

The standard is currently in version 4.3, although the 1000 Genomes Project has developed its own specification for structural variations such as duplications, which the existing schema does not easily accommodate.
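
Because VCF is a tab-separated text format, a file can be read with ordinary string handling. The Python sketch below is a minimal illustration, not a complete or validating parser. It assumes the usual layout of meta-information lines beginning with "##", a column header line beginning with "#", and data lines whose first eight columns are CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO. The function name parse_vcf_text and the sample records are made up for this example.

```python
# The eight fixed columns of a VCF data line, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_text(text):
    """Split VCF text into meta-information lines and data records.

    Hypothetical helper for illustration only: it does no validation
    and ignores any genotype columns after the first eight.
    """
    meta, records = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("##"):
            meta.append(line[2:])  # e.g. "fileformat=VCFv4.3"
        elif line.startswith("#"):
            continue  # column header line (#CHROM POS ID ...)
        else:
            fields = line.split("\t")
            record = dict(zip(VCF_COLUMNS, fields[:8]))
            record["POS"] = int(record["POS"])      # position on the chromosome
            record["ALT"] = record["ALT"].split(",")  # ALT may list several alleles
            records.append(record)
    return meta, records


# Illustrative input; the records are invented for this example.
EXAMPLE = "\n".join([
    "##fileformat=VCFv4.3",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tDP=14",
    "20\t1110696\t.\tA\tG,T\t67\tPASS\tDP=10",
])

meta, records = parse_vcf_text(EXAMPLE)
```

For real data, an established tool such as BCFtools or VCFtools is a safer choice than hand-written parsing, since it handles compression, indexing and the full specification.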

Additional file formats have been developed based on VCF, including genomic VCF (gVCF). gVCF is an extended format that also records "blocks" of positions matching the reference, together with their qualities.
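
As a rough illustration of how a reference block might be read, the Python sketch below computes the length of a block from a record's POS value and INFO column. It assumes the convention, used by some gVCF producers such as GATK, of giving the last position of the block in an END key in the INFO column; other producers may encode blocks differently. The function names block_end and block_length are hypothetical.

```python
def block_end(info):
    """Return the END value from an INFO string, or None if absent.

    Assumes INFO is a semicolon-separated list of KEY=VALUE entries
    and that reference blocks carry an END key (an assumption about
    the gVCF producer, not a guarantee).
    """
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        if key == "END" and sep:
            return int(value)
    return None


def block_length(pos, info):
    """Number of positions covered by a reference block, or None
    if the record has no END key and so is not treated as a block."""
    end = block_end(info)
    if end is None:
        return None
    return end - pos + 1  # POS and END are both inclusive


# Invented examples: a block spanning positions 10001 to 10050,
# and an ordinary variant record with no END key.
length_of_block = block_length(10001, "END=10050")
length_of_variant = block_length(14370, "DP=14")
```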

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.