I have the following input:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[1,  "A"],[2, "A"],[3, "B"],[4, "C"],[5, "D" ],[6, "A" ],[7, "B" ],[8, "A"], 
                           [9, "C" ],[10, "D" ],[11,"A" ],
                           [12,  "A"],[13, "B"],[14, "B"],[15, "D" ],[16, "A" ],[17, "B" ],[18, "A" ], 
                           [19, "C" ],[20, "D" ],[21,"A" ],
                           [22,  "A"],[23, "A"],[24, "C"],[25, "D" ],[26, "A" ],[27, "C" ],[28, "A" ], 
                           [29, "C" ],[30, "D" ],[31,"A" ]]),
                            columns=['No.',  'Value'])

This is the output:

    No. Value
0   1   A
1   2   A
2   3   B
3   4   C
4   5   D
5   6   A
6   7   B
7   8   A
8   9   C
9   10  D
10  11  A
11  12  A
12  13  B
13  14  B
14  15  D
15  16  A
16  17  B
17  18  A
18  19  C
19  20  D
20  21  A
21  22  A
22  23  A
23  24  C
24  25  D
25  26  A
26  27  C
27  28  A
28  29  C
29  30  D
30  31  A

Now I want to visualize all sequences that are in the data.

The first sequence should start with the first value in the DataFrame and end with the next value of "D". For example, the first sequence runs from No. 1 to No. 5 (inclusive).

The second sequence runs from No. 6 to the next "D" value, No. 10, and so on.

The DataFrame has six sequences in it.
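The splitting rule described above can be sketched in plain Python, independent of pandas. This is only an illustration: `split_after` is a hypothetical helper name, and the `values` string is the 'Value' column typed out by hand.

```python
# The 'Value' column for No. 1 to No. 31, written out as one string.
values = list("AABCDABACDAABBDABACDAAACDACACDA")


def split_after(items, terminator="D"):
    """Split items into runs, closing a run right after each terminator."""
    runs, current = [], []
    for item in items:
        current.append(item)
        if item == terminator:
            runs.append(current)
            current = []
    # Return the complete runs plus any trailing items not closed by a terminator.
    return runs, current


sequences, tail = split_after(values)
for number, seq in enumerate(sequences, start=1):
    print(number, "-".join(seq))
print("unterminated tail:", tail)
```

Under this rule the data contains six sequences that end in "D", plus one trailing row (No. 31) that no "D" closes.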

How can I visualize the sequences?

ML-ME
  • `Now i want to visualize all sequences that are in the data.` Can you explain more? – jezrael Jan 14 '20 at 09:44
  • A possible chart might have the six sequences on the x-axis and a fixed value like 1 on the y-axis. Within each sequence, e.g. sequence 1, each of the values A-A-B-C-D could have its own colour for its bar. – ML-ME Jan 14 '20 at 09:48

2 Answers

I think you need:

# Start a new group on the row after each 'D'; fill_value=False avoids the NaN in the first row
g = df['Value'].eq('D').shift(fill_value=False).cumsum()
df1 = df.groupby(g)['Value'].value_counts().unstack(fill_value=0)
print (df1)
Value  A  B  C  D
Value            
0      2  1  1  1
1      2  1  1  1
2      2  2  0  1
3      2  1  1  1
4      3  0  1  1
5      2  0  2  1
6      1  0  0  0

df1.plot.bar()

Or:

# Same grouping key as before: a new group starts on the row after each 'D'
g = df['Value'].eq('D').shift(fill_value=False).cumsum()
idx = df.groupby(g)['Value'].agg(''.join)
df1 = df.groupby(g)['Value'].value_counts().unstack(fill_value=0).set_index(idx)
print (df1)
Value  A  B  C  D
Value            
AABCD  2  1  1  1
ABACD  2  1  1  1
AABBD  2  2  0  1
ABACD  2  1  1  1
AAACD  3  0  1  1
ACACD  2  0  2  1
A      1  0  0  0

df1.plot.bar()
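If the order inside each sequence matters more than the counts, one option is to lay the data out as a grid with one row per sequence and one column per position within it. The sketch below is an assumption about what is wanted rather than part of the approach above; the names `seq`, `pos` and `grid` are made up for illustration, and the DataFrame is rebuilt from a hand-typed string so the block runs on its own.

```python
import pandas as pd

# Rebuild the example data: No. 1..31 and the 'Value' column as one string.
values = list("AABCDABACDAABBDABACDAAACDACACDA")
df = pd.DataFrame({"No.": range(1, len(values) + 1), "Value": values})

# Sequence id: increases by one on the row after each 'D'.
seq = df["Value"].eq("D").shift(fill_value=False).cumsum()
# Position of each row inside its own sequence (0, 1, 2, ...).
pos = df.groupby(seq).cumcount()

# One row per sequence, one column per position, cells hold the letters.
grid = (
    df.assign(seq=seq, pos=pos)
      .pivot(index="seq", columns="pos", values="Value")
)
print(grid.fillna(""))
```

Rows that agree position by position are repeated patterns; in this data, rows 1 and 3 are both A-B-A-C-D. The last row holds only the trailing "A" that no "D" closes.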
jezrael
  • This looks nice, but I want something that can visualize the sequence. It might look like "Serial Programming" https://forums.fast.ai/uploads/default/original/2X/4/4ebf88f00742be9b31e450097131c6bbb40fb7d1.png Instead of seconds on the x-axis there should be "Sequence". The ticks should be 0, then 1 for the first tick, and so on. And instead of tasks on the y-axis there should be the Value, like A for the first tick, B for the second tick, and so on. – ML-ME Jan 14 '20 at 09:58
  • @New16122019 - hmmm, I don't understand. "And instead of tasks there should be Value": what is the value here? Counts? Or `1,2,3,4,5,6,7`? – jezrael Jan 14 '20 at 10:05
  • The values should be "A", "B", "C", and "D". – ML-ME Jan 14 '20 at 10:07
  • @ML-ME - Unfortunately I don't understand. – jezrael Jan 14 '20 at 10:09
  • @ML-ME - Is it possible to create the graph in Excel from the data in the question? – jezrael Jan 14 '20 at 10:13
  • I want to see whether there is a pattern in the sequences that appear. Therefore I need a suitable visualization. – ML-ME Jan 14 '20 at 10:14
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/205923/discussion-between-jezrael-and-ml-me). – jezrael Jan 14 '20 at 10:15

Visualizing the sequences can be thought of as showing how many sequences exist and where each one starts and ends. If that holds, you can try the following.

Mark every non-D row with 1 in a new column named seq, so the D rows stay NaN:

df.loc[df['Value'] != 'D', 'seq'] = 1

Then plot the DataFrame to visualize the sequences:

import matplotlib.pyplot as plt
plt.plot('seq','ro',data=df)

This draws a red circle at seq = 1 for every non-D row and leaves a gap at each D row, so each gap marks the end of a sequence.

If the D rows also need to be shown, try the code below:

df.loc[df['Value'] != 'D', 'seq'] = 1
df.loc[df['Value'] == 'D', 'seq'] = 2

Then plot the DataFrame to visualize the sequences:

import matplotlib.pyplot as plt
plt.plot('seq','rd',data=df,linestyle='dashdot')
plt.plot('seq','gd',data=df,linestyle='dashed')

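The comments on the question ask for the letters A to D on the y-axis. A minimal sketch of that idea, assuming one line per sequence is an acceptable reading of the request, follows; the `code` and `seq` columns and the `levels` list are names introduced here for illustration only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the example data so the block is self-contained.
values = list("AABCDABACDAABBDABACDAAACDACACDA")
df = pd.DataFrame({"No.": range(1, len(values) + 1), "Value": values})

# Map the letters to integer positions on the y-axis (A=0, B=1, C=2, D=3).
levels = ["A", "B", "C", "D"]
df["code"] = pd.Categorical(df["Value"], categories=levels).codes
# Sequence id: increases by one on the row after each 'D'.
df["seq"] = df["Value"].eq("D").shift(fill_value=False).cumsum()

fig, ax = plt.subplots(figsize=(10, 3))
# One line per sequence, so each sequence gets its own colour.
for seq_id, part in df.groupby("seq"):
    ax.plot(part["No."], part["code"], marker="o", label=f"sequence {seq_id + 1}")

ax.set_yticks(range(len(levels)))
ax.set_yticklabels(levels)
ax.set_xlabel("No.")
ax.set_ylabel("Value")
ax.legend(ncol=4, fontsize="small")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure. The seventh line is the single trailing "A" at No. 31, which no "D" closes.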

Anil Kumar