I have an Excel sheet like this:

A    B    C    D
3    1    2    8
4    2    2    8
5    3    2    9
          2    9
6    4    2    7

Now I am trying to plot 'B' over 'C' and label the data points with the entries of 'A'. It should show the points 1/2, 2/2, 3/2 and 4/2 with the corresponding labels.

import matplotlib.pyplot as plt
import pandas as pd
import os

df = pd.read_excel(os.path.join(os.path.dirname(__file__), "./Datenbank/Test.xlsx"))

fig, ax = plt.subplots()
df.plot('B', 'C', kind='scatter', ax=ax)
df[['B','C','A']].apply(lambda x: ax.text(*x),axis=1);

plt.show()

Unfortunately, the resulting plot is incomplete and I get this error:

ValueError: posx and posy should be finite values

The last data point was not labeled. I know this is caused by the empty cells in the sheet, but I cannot avoid them: there is simply no measurement data at these positions. I already looked for a solution in "Annotate data points while plotting from Pandas DataFrame", but it did not solve my problem.

So, is there a way to still label the last data point?

P.S.: The Excel sheet is just an example. Keep in mind that the real data has many empty cells at different positions.

Himanshu Poddar
Jack.O.

1 Answer

You can simply drop the rows with missing values from df before plotting:

df = df[df['B'].notnull()]
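If empty cells can also turn up in 'A' or 'C', the same idea extends to all three columns with dropna. The following is only a sketch: it replaces the Excel file with an in-memory DataFrame that mirrors the example sheet (an assumption made so it runs without Test.xlsx), and it formats the labels as integers because the empty cell makes pandas read column 'A' as floats.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# In-memory stand-in for the Excel sheet (assumption); NaN marks the empty cells.
df = pd.DataFrame({
    "A": [3, 4, 5, np.nan, 6],
    "B": [1, 2, 3, np.nan, 4],
    "C": [2, 2, 2, 2, 2],
    "D": [8, 8, 9, 9, 7],
})

# Keep only rows where the label and both coordinates are present.
clean = df.dropna(subset=["A", "B", "C"])

fig, ax = plt.subplots()
clean.plot(x="B", y="C", kind="scatter", ax=ax)

# The missing value turned column A into floats, so print labels without ".0".
for x, y, label in zip(clean["B"], clean["C"], clean["A"]):
    ax.text(x, y, f"{label:.0f}")

# Collect the label texts that were actually placed on the axes.
labels = [t.get_text() for t in ax.texts]
```

Call plt.show() afterwards to display the figure. With real data, read the sheet with pd.read_excel as in the question and apply the same dropna line before plotting.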
matlantis