
As a beginner, this is my first time using a larger dataset for clustering. Can someone please help me with this problem?
Here is my code:

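import matplotlib.pyplot as plt

# BAR is assumed to be a pandas DataFrame defined earlier; keep only the plotted columns
df1 = BAR[['symbol', 'average_price']]
plt.figure(figsize=(40, 30))
plt.scatter(df1['symbol'], df1['average_price'])
plt.xlabel('SYMBOL', fontsize=15)
plt.ylabel('AVERAGE_PRICE', fontsize=15)
plt.xticks(rotation=90, fontsize=12)
plt.show()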

Here is my scatterplot:

[image: scatter plot of AVERAGE_PRICE against SYMBOL]

scleronomic

1 Answer


One easy option is to change the figure size and make it really big, as you did. You could also change the aspect ratio and make the figure very wide only. For me, though, this only has an effect if I use a non-interactive backend such as Agg instead of an interactive one. If you then save the figure and look at the PNG file, you can see the extra space. Another thing I tend to do is arrange the labels in a smarter way so that they use the space a little better.

For example, you could move every second label down and draw a longer tick for it. In my opinion, this looks less cramped.

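import numpy as np
import matplotlib as mpl
mpl.use('Agg')  # non-interactive backend that renders to files
import matplotlib.pyplot as plt

def change_tick_length(ax, i, size):
    # get_majorticklines() returns two lines per tick (bottom and top),
    # so the bottom line of tick i sits at flat index 2 * i
    h = ax.xaxis.get_majorticklines()
    i = np.ravel_multi_index((i, 0), (len(h)//2, 2))
    h[i].set_markersize(size)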


n = 200
new_tick_length = 70
n_whitespace = int(new_tick_length / 3.3)  # found this ratio works for me

labels = [f"Dummy{i}" for i in range(n)]

fig = plt.figure(figsize=(40, 15))
ax = fig.subplots()
fig.subplots_adjust(bottom=0.3)
ax.plot(range(n))
ax.set_xticks(range(n))

for i, l in enumerate(labels):
    if i % 2 == 0:
        labels[i] = f"{l}{' ' * n_whitespace}"
        change_tick_length(ax=ax, i=i, size=new_tick_length)

ax.set_xticklabels(labels, rotation=90)
plt.savefig('dummy')

[image: a lot of ticks]
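The first option mentioned above (a very wide figure rendered with the Agg backend and saved to a PNG) could look roughly like the sketch below. The symbols and prices are made-up stand-ins for the `symbol` and `average_price` columns, and the figure size, the file name `wide_scatter.png` and the temporary output directory are assumptions chosen only for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, the figure is only written to disk
import matplotlib.pyplot as plt

# Hypothetical stand-in data for df1['symbol'] and df1['average_price']
symbols = [f"SYM{i}" for i in range(200)]
prices = [float(i % 17) for i in range(200)]

# Much wider than tall, so each x label gets more horizontal room
fig, ax = plt.subplots(figsize=(60, 8))
ax.scatter(symbols, prices)
ax.set_xlabel("SYMBOL", fontsize=15)
ax.set_ylabel("AVERAGE_PRICE", fontsize=15)
ax.tick_params(axis="x", labelrotation=90, labelsize=12)

# Save instead of plt.show(); open the PNG in an image viewer to inspect it
out_path = os.path.join(tempfile.mkdtemp(), "wide_scatter.png")
fig.savefig(out_path, dpi=100, bbox_inches="tight")
```

With real data, `symbols` and `prices` would be replaced by the two DataFrame columns. The width that works depends on the number of symbols, so it probably needs some trial and error.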

scleronomic