I am using software that needs a reference genome in .gbk format (which GenBank has made obsolete and replaced with .gbff). I searched for a file converter but could not find one. I assumed .gb and .gbk are the same, so I renamed the .gb file to .gbk, but that did not help. I would appreciate any help.

Here is the script I used to convert the .gbff file to .gbk:

    from Bio import SeqIO

    # Specify input and output filenames
    input_file = "GCF_000013425.1_ASM1342v1_genomic.gbff"
    output_file = "GCF_000013425.1_ASM1342v1_genomic.gbk"

    # Read the GenBank Flat File
    records = SeqIO.parse(input_file, "genbank")

    # Write records in GenBank format
    SeqIO.write(records, output_file, "genbank")
Nar_sys
  • Hi Narges. I am using Mauve and have the same issue. Did you manage to find a solution? Best, Rikki – Rikki Franklin Frederiksen Jun 29 '23 at 18:37
  • Dear Rikki, thank you for your comment. I have shared the script I used to convert the .gbff file to .gbk format. Tormes needs .gbk/.fasta because it uses Mauve. However, it failed with both .gbk and .fasta. It was as if the software skipped the reference for some reason and did the assembly with no reference. I am keeping my fingers crossed for you with Mauve. Please let me know if you have any questions. – Nar_sys Jun 30 '23 at 09:28

1 Answer

The FASTA to GenBank converter requires Python3 and the “Biopython” package (see https://biopython.org/).

Install it with: python3 -m pip install biopython
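
If you want to confirm that the install is visible to the interpreter you will run the converter with, something like the following should do. This is only a minimal sketch using the standard library; the helper name biopython_available is made up for illustration.

```python
import importlib.util


def biopython_available():
    """Return True if the top-level 'Bio' package (Biopython) can be found."""
    # find_spec returns None when a top-level package is not installed
    return importlib.util.find_spec("Bio") is not None


if __name__ == "__main__":
    if biopython_available():
        print("Biopython is installed")
    else:
        print("Biopython is missing; run: python3 -m pip install biopython")
```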

The Python code (assumes a DNA sequence):

    from Bio import SeqIO

    # Open the FASTA input and the GenBank output together
    with open("change_this_name_1.fna") as input_handle, \
            open("change_this_name_1.gbk", "w") as output_handle:
        records = list(SeqIO.parse(input_handle, "fasta"))
        # The GenBank writer needs a molecule type on every record
        for record in records:
            record.annotations["molecule_type"] = "DNA"
        count = SeqIO.write(records, output_handle, "genbank")

    print("Converted %i records" % count)
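
To sanity-check the resulting .gbk file before handing it to another tool, you could count the record headers and terminators. In a GenBank flat file each record starts with a LOCUS line and ends with a line containing only //. The sketch below uses only the standard library; the function names count_genbank_records and looks_like_genbank are hypothetical, and this is a rough structural check, not a full validator.

```python
def count_genbank_records(path):
    """Return (locus_count, terminator_count) for a GenBank flat file."""
    locus = 0
    terminators = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("LOCUS"):
                # Each record begins with a LOCUS header line
                locus += 1
            elif line.rstrip() == "//":
                # Each record ends with a line holding only "//"
                terminators += 1
    return locus, terminators


def looks_like_genbank(path):
    """Rough check: at least one record, and every LOCUS has a terminator."""
    locus, terminators = count_genbank_records(path)
    return locus > 0 and locus == terminators
```

For example, looks_like_genbank("change_this_name_1.gbk") should be True for the file written above, while a file that is really FASTA with a renamed extension would give False.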
Adrian Mole