I would like to make a circos-like plot to visualize SNPs only (with multiple tracks for SNP attributes). It could be done in Python or R, and I am happy to consider other languages.

So far, I have taken a look at the circlize R package. However, I get the error "Range of the sector ('C') cannot be 0" when initializing the circos plot. I believe that this error arises from the fact that I have discrete data (SNPs) instead of having data for all positions. Or maybe this is because I have some data points that are repeated.

I have simplified my data below and show the code that I have tried so far:

Sample  Gene  Pos      read_depth  Freq
1       A     20394    43          99
1       B     56902    24          99
2       A     20394    50          99
2       B     56902    73          99
3       A     20394    67          50
3       B     56902    20          99
3       C     2100394  21          50

install.packages("circlize")
library(circlize)
data <- read.table("test_circos.txt", sep='\t', header=TRUE)
circos.par("track.height" = 0.1)
circos.initialize(factors = data$Gene, x = data$Pos)

I would like to know whether it is possible to get a circos-like plot where each of my data points (7 in my example) is plotted as an individual data point without any other points being plotted, in the way of a discrete axis.
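The hypothesis about the error can be checked without circlize. The following is a minimal Python sketch, assuming (based only on the error message) that circlize derives each sector's x-range from the minimum and maximum position in that sector. The function name `sector_ranges` is made up for illustration.

```python
from collections import defaultdict

# Simplified SNP table from the question: (Sample, Gene, Pos, read_depth, Freq)
rows = [
    (1, "A", 20394, 43, 99),
    (1, "B", 56902, 24, 99),
    (2, "A", 20394, 50, 99),
    (2, "B", 56902, 73, 99),
    (3, "A", 20394, 67, 50),
    (3, "B", 56902, 20, 99),
    (3, "C", 2100394, 21, 50),
]


def sector_ranges(rows):
    """Return {gene: (min_pos, max_pos)} over all rows (hypothetical helper)."""
    positions = defaultdict(list)
    for _sample, gene, pos, _depth, _freq in rows:
        positions[gene].append(pos)
    return {gene: (min(p), max(p)) for gene, p in positions.items()}


ranges = sector_ranges(rows)
# Sectors whose minimum and maximum position coincide have a range of 0.
zero_width = sorted(gene for gene, (lo, hi) in ranges.items() if hi == lo)
print(zero_width)  # ['A', 'B', 'C']
```

In this simplified data every gene has a single distinct position, so each sector would have a range of 0 under that assumption, not only 'C'.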

Agathe
  • There seems to be a whole book about it https://jokergoo.github.io/circlize_book/ – JBGruber Sep 17 '19 at 11:21
  • Yes, I have seen it, but I do not see anything about the error I mention, and at a quick glance I do not see how my data differs from the test data used in the book, so I do not understand why I am stuck at the initializing step. Again, I believe more and more that this is because some data points are repeated (e.g. pos 20394 in Gene A), hence the question of whether it is possible to do a circos-like plot with such a data structure, or how I can change my data structure to enable a circos plot. – Agathe Sep 17 '19 at 12:07

1 Answer

If it is of interest to anyone, I decided to do as follows:

  1. Number datapoints per category (='Gene'); new column 'Number':
Sample  Gene  Pos     depth  Freq   Number
1       A     20394   43     99     1      
1       B     56902   24     99     1
2       A     20394   50     99     2
2       B     56902   73     99     2
3       A     20394   67     50     3
3       B     56902   20     99     3
3       C     2100394 21     50     1
  2. Design circos config file as follows (header not included in real config file):
chr - ID  LABEL START END COLOUR
chr - A   A     0     3   chr1
chr - B   B     0     3   chr2
chr - C   C     0     1   chr3

This means that each gene has a length equal to the number of SNPs identified in it, and that each bp of a gene represents one line (= one SNP) in my SNP file.
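The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `number_per_gene` and `karyotype_lines` are hypothetical, and the colours `chr1`, `chr2`, ... are assigned in alphabetical gene order to match the config shown above.

```python
from collections import Counter

# SNP table from the question: (Sample, Gene, Pos, depth, Freq)
rows = [
    (1, "A", 20394, 43, 99),
    (1, "B", 56902, 24, 99),
    (2, "A", 20394, 50, 99),
    (2, "B", 56902, 73, 99),
    (3, "A", 20394, 67, 50),
    (3, "B", 56902, 20, 99),
    (3, "C", 2100394, 21, 50),
]


def number_per_gene(rows):
    """Append a running 1-based index within each gene (the 'Number' column)."""
    seen = Counter()
    numbered = []
    for row in rows:
        gene = row[1]
        seen[gene] += 1
        numbered.append(row + (seen[gene],))
    return numbered, seen


def karyotype_lines(counts):
    """Build one 'chr - ID LABEL START END COLOUR' line per gene."""
    return [
        f"chr - {gene} {gene} 0 {n} chr{i}"
        for i, (gene, n) in enumerate(sorted(counts.items()), start=1)
    ]


numbered, counts = number_per_gene(rows)
karyotype = karyotype_lines(counts)
print("\n".join(karyotype))
```

Each gene's END value is simply its SNP count, so adding or removing SNP rows changes the sector lengths without any manual editing of the config.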

I can then use circos as normal.
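For the tracks themselves, Circos 2D data files use whitespace-separated `chr start end value` lines. A hedged Python sketch of how the numbered table could be turned into such lines follows. It assumes that SNP number n occupies the interval from n-1 to n on its gene, which is my reading of the scheme rather than something stated above, and `track_lines` is a hypothetical helper.

```python
# Numbered SNP table from the answer: (Sample, Gene, Pos, depth, Freq, Number)
snps = [
    (1, "A", 20394, 43, 99, 1),
    (1, "B", 56902, 24, 99, 1),
    (2, "A", 20394, 50, 99, 2),
    (2, "B", 56902, 73, 99, 2),
    (3, "A", 20394, 67, 50, 3),
    (3, "B", 56902, 20, 99, 3),
    (3, "C", 2100394, 21, 50, 1),
]


def track_lines(snps, value_index):
    """Return 'chr start end value' lines for one attribute column.

    Assumption: SNP number n is drawn on the interval [n-1, n] of its gene.
    """
    return [
        f"{row[1]} {row[5] - 1} {row[5]} {row[value_index]}"
        for row in snps
    ]


depth_track = track_lines(snps, 3)  # read depth per SNP
freq_track = track_lines(snps, 4)   # frequency per SNP
print("\n".join(depth_track))
```

Writing one file per attribute keeps each track independent, so depth and frequency can be drawn as separate rings over the same gene sectors.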

In the end, I chose circos because it seemed to be the best documented, and therefore the easiest to learn, and because it appeared more flexible.

Agathe