Questions tagged [bed]

BED (Browser Extensible Data) format in bioinformatics provides a flexible way to define the data lines of genomic features that are displayed in a genome annotation track. BED lines have 3 required fields (chromosome, start position, end position) and 9 additional optional fields. For other bioinformatics formats, such as FASTA, FASTQ, VCF, GFF, BAM/SAM, etc., use their own separate tags.

BED (file format) - Wikipedia

BED format - Genome Browser FAQ
BED (Browser Extensible Data) format provides a flexible way to define the data lines that are displayed in an annotation track. BED lines have three required fields and nine additional optional fields. The number of fields per line must be consistent throughout any single set of data in an annotation track. The order of the optional fields is binding: lower-numbered fields must always be populated if higher-numbered fields are used.

BED variants should not be mixed within a single data set (for example, BED3 lines should not be mixed with BED4 lines). Instead, fill in the additional columns for consistency, for example with a "." where a field would otherwise be empty. BED fields in custom tracks can be whitespace-delimited or tab-delimited; only some BED variants, such as bedDetail, require tab delimiters for the detail columns.
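The consistency rule above can be checked mechanically. The following is a minimal sketch, not part of any BED tooling: the helper names `bed_field_counts` and `is_consistent` are hypothetical, and it assumes plain whitespace-delimited BED lines (so it would not suit bedDetail, where the detail columns must be tab-delimited).

```python
def bed_field_counts(lines):
    """Return the set of column counts seen across BED data lines."""
    counts = set()
    for line in lines:
        fields = line.split()  # splits on tabs or spaces
        # Skip blank lines, comments, and "track"/"browser" header lines.
        if not fields or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        counts.add(len(fields))
    return counts


def is_consistent(lines):
    """True when every data line has the same number of fields."""
    return len(bed_field_counts(lines)) <= 1


# A BED3 line mixed with a BED4 line: inconsistent.
mixed = ["chr1\t0\t100", "chr1\t200\t300\tfeatureA"]
# The same data with the empty name field filled with ".": consistent BED4.
padded = ["chr1\t0\t100\t.", "chr1\t200\t300\tfeatureA"]

print(is_consistent(mixed))   # False
print(is_consistent(padded))  # True
```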

The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or scaffold (e.g. scaffold10671).
chromStart - The starting position of the feature in the chromosome or scaffold. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or scaffold. The chromEnd base is not included in the display of the feature, although the number itself appears in position notation. For example, the first 100 bases of chromosome 1 are defined as chrom=1, chromStart=0, chromEnd=100, and span the bases numbered 0-99 in our software (not 0-100), but are written in position notation as chr1:1-100.
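The relationship between the zero-based, end-exclusive BED coordinates and the one-based position notation can be sketched in a few lines of Python. The function names `bed_to_position` and `feature_length` are hypothetical, and rejecting intervals where chromStart is not below chromEnd is an assumption made to keep the sketch simple.

```python
def bed_to_position(chrom, chrom_start, chrom_end):
    """Convert 0-based, end-exclusive BED coordinates to 1-based position notation."""
    # Assumption for this sketch: only non-empty, non-negative intervals are accepted.
    if chrom_start < 0 or chrom_start >= chrom_end:
        raise ValueError("expected 0 <= chromStart < chromEnd")
    # The start shifts by one; the end keeps its number because it is exclusive in BED
    # and inclusive in position notation.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def feature_length(chrom_start, chrom_end):
    """Number of bases covered by a BED interval."""
    return chrom_end - chrom_start


print(bed_to_position("chr1", 0, 100))  # chr1:1-100
print(feature_length(0, 100))           # 100
```

Because the end is exclusive, the feature length is simply chromEnd minus chromStart, with no off-by-one correction.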

26 questions
0
votes
0 answers

Obtain fraction of gene covered from bam file using bed file

I am trying to calculate the gene coverage of specific genes in a BAM file. I have a list of genes with their start and end positions in a BED file. I would essentially like to know the number of overlaps for each gene and how well it is covered. My…
0
votes
2 answers

Convert a data.frame in R to .bed format file

I have a data.frame that looks like this. bed <- data.frame(chrom=c(rep("Chr1",5)), chromStart=c(18915152,24199229,73730,81430,89350), chromEnd=c(18915034,24199347,74684,81550,89768), …
LDT
  • 2,856
  • 2
  • 15
  • 32
0
votes
2 answers

how to format a large txt file to bed

I am trying to format CpG methylation calls from the R package "methylKit" to simple bed format. Since it is a large file, I cannot do it in Excel. I also tried Seqmonk, but it does not allow me to export the data in the format I want. Linux Awk/sed…
0
votes
1 answer

Pandas convert a dataframe to a bed file?

I used DataFrame.to_csv() to convert a pandas dataframe to a BED file by doing this: df.to_csv('xxx.bed', index=False, sep='\t', header=None) I want to know if this can successfully convert a dataframe to a bed file, or whether I am just exporting the…
James
  • 1
0
votes
2 answers

Data frame to bed file conversion

I have pretty large data frames in R, which I need to convert to bed files. I use the code below for df->bed conversion, but it is extremely slow. I was wondering how to convert df to bed quicker and in a smarter way, again in R or bash. Here are…
gokberk
  • 47
  • 7
0
votes
0 answers

Chromosomal position to gene position conversion in a sample PLINK tped file using reference GTF file

At first, this thread might look related to genetics, but the problem is actually about shell scripting and programming. I am new to coding, so it was suggested that I seek help on SO. I am trying to intersect NCBI GTF files with PLINK tped files with…
0
votes
1 answer

How to save in two columns of the same file from different output in bash

I am working on a project that requires me to take some .bed files as input, extract one column from each file, keep only certain parameters and count how many of them there are in each file. I am extremely inexperienced with bash, so I don't know most of…
0
votes
3 answers

Python: Extract DNA sequence from FASTA file using Bed file

How can I extract a DNA sequence from a FASTA file? I tried bedtools and samtools. bedtools getfasta worked well, but for some of my files it returns "warning: chromosome was not found in fasta file", but the fact is that the chromosome name in the bed file and…
Allyson
  • 115
  • 1
  • 3
  • 12
0
votes
2 answers

Create VCF from .bim, .bed and .fam files

I have .fam, .bed and .bim files with markers for a few individuals. I need to convert them into a VCF file. Could someone help me create a VCF file? Are there any open-source tools that can do this?
chas
  • 1,565
  • 5
  • 26
  • 54
-2
votes
1 answer

AWK to handle bed files

I would like to grep and separate fields from bed files to generate a new bed file with these new arranged data. I would go from here: 1 15903 rs557514207 G G,A…
DaN
  • 3
  • 5
-2
votes
1 answer

awk to separate rows from bed files depending on character

I want to split rows on the comma delimiter in one field and keep the other information in the row. I have tab-delimited files with 4 columns and a lot of rows... From here: 1 13445 rs558318514 C G,T 1_13445 1 13453 rs568927457 T C 1_13455 1…
DaN
  • 3
  • 5