I'm new to Python and need help. I'm trying to plot a binary (presence/absence) matrix from the following data, which is stored in CSV format. I have tried several code snippets found in other threads, but none of them solved the problem.

site_Name tool1 tool2 tool3
site1 0 1 0
site2 1 0 0
site3 0 0 1
site4 0 1 1

I have tried converting the dataset to a NumPy array, transposing it, dropping columns, etc.

marc_s
khady

1 Answer

You could try a seaborn heatmap:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Demo data: N sites x M tools, each cell randomly 0 (absent) or 1 (present)
N, M = 10, 20
df = pd.DataFrame((np.random.rand(N, M) * 2).astype(int),
                  columns=[f'tool{i + 1:02d}' for i in range(M)],
                  index=[f'site{i + 1:02d}' for i in range(N)])
# annot=True prints the 0/1 value in each cell; linewidths=1 separates the cells
sns.heatmap(df, annot=True, linewidths=1, cbar=False)
plt.tight_layout()
plt.show()

sns.heatmap of binary data

JohanC
  • Hi JohanC, thanks for your answer. Actually, the data I am trying to plot is not random, and every site and tool has a name (a bit difficult). That is why I used "site" and "tool" as placeholders. – khady Feb 20 '23 at 22:29
  • Well, you store your information in a dataframe (possibly reading in a dataframe). And then use `sns.heatmap(...)` using that dataframe. If you'd redo the edit of @SaaruLindestøkke you could directly read your dataframe from the website. Please note that data as image is not allowed at StackOverflow. – JohanC Feb 20 '23 at 22:37
  • I have tried your suggestion, but the column "site name" is the problem. When I drop it, I'm able to plot the data. Here is the error: "could not convert string to float: 'AnyamabaLayer _'" – khady Feb 20 '23 at 23:06
  • You need to set that column as index of your dataframe e.g. `df = df.set_index("site name")`. – JohanC Feb 21 '23 at 06:17
  • THANK YOU SO MUCH Johan !!!!! it worked. Setting the index was the solution. – khady Feb 21 '23 at 09:33
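
Putting the answer and the comments together, here is a minimal sketch of the whole flow using the sample data from the question. It assumes the CSV is comma-separated and that the label column is called `site_Name` as in the question (the comments call it "site name", so adjust to your real header). The inline string only stands in for the real file; with an actual file you would pass its path to `pd.read_csv` instead.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in for the real CSV file (assumed comma-separated, header as in the question)
csv_text = """site_Name,tool1,tool2,tool3
site1,0,1,0
site2,1,0,0
site3,0,0,1
site4,0,1,1
"""

df = pd.read_csv(io.StringIO(csv_text))

# Move the text column into the index so only the numeric 0/1 columns remain
# as data; leaving it as a regular column is what triggers
# "could not convert string to float"
df = df.set_index("site_Name")

# Rows are labelled by the index (sites), columns by the tool names
ax = sns.heatmap(df, annot=True, linewidths=1, cbar=False)
plt.tight_layout()
# Call plt.show() to display the figure, or plt.savefig(...) to write it to disk
```

Equivalently, `pd.read_csv(..., index_col="site_Name")` should set the index while reading, which saves the separate `set_index` step.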