
I am plotting the widths of a text column so that I can determine the "knee point" of the resulting curve. I want more granular numbers so that I can be more specific about the text width I choose.

I have tried the following:

library(textshaping)
library(dplyr)

df %>%
  mutate(width_txt = text_width(title)) %>% 
  pull(width_txt) %>% 
  sort(decreasing = FALSE) %>% 
  plot()

Which gives me:

[image: plot of the sorted width_txt values against their index]

But I am not sure if this is correct. More specifically, I want a logarithmic curve with the numbers on the x-axis and the topic names on the y-axis of the plot.
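For reference, here is a minimal sketch of the layout described above, using the six sample titles posted with the question. It is only an illustration under stated assumptions: base R's `nchar()` stands in for `textshaping::text_width()` so that it runs without extra packages, and the names `titles`, `sorted_width`, and `sorted_titles` are made up for this example.

```r
titles <- c("ACID?", "Week 2", "Week 2_1", "Are we talking ACID?",
            "Where has the reading material gone?", "I like, I wish...")

# Stand-in width measure (assumption): character count, not rendered width.
width_txt <- nchar(titles)

# Order the titles from narrowest to widest.
ord <- order(width_txt)
sorted_width <- width_txt[ord]
sorted_titles <- titles[ord]

# Widths on a log-scaled x-axis, one row per title on the y-axis.
old_par <- par(mar = c(4, 16, 1, 1))  # widen the left margin for the labels
plot(sorted_width, seq_along(sorted_width), log = "x",
     yaxt = "n", xlab = "width", ylab = "", pch = 19)
axis(2, at = seq_along(sorted_titles), labels = sorted_titles, las = 1)
par(old_par)
```

Swapping `nchar(titles)` for `text_width(titles)` should give the same layout with rendered widths instead of character counts. With many titles the y-axis labels will overlap, so this is probably only readable for a subset of rows.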

> dput(head(df$title))
c("ACID?", "Week 2", "Week 2_1", "Are we talking ACID?", "Where has the reading material gone?", 
"I like, I wish...")
Ranji Raj
  • Use `dput()` on `df_PR` not on `df_PR$title` which is not used in your code at all. What do you mean by the "knee point"? You seem to have a smooth curve where the slope is continuously increasing. – dcarlson Jun 18 '22 at 16:39
  • Just a thought on 'more granular numbers', where some_txt is your `c('ACID?'...`, `rep.int(nchar(some_txt)[1:length(nchar(some_txt))], times=nchar(some_txt)[1:length(nchar(some_txt))] )`. – Chris Jun 18 '22 at 17:19
  • @Chris, based on the above plot I estimate a text width of 18217 to use as a filter, so I would reduce my search space according to that. – Ranji Raj Jun 18 '22 at 17:54
  • I found the estimate number hard to pin down when considered as a for loop. The nice thing about the approach is it doesn't matter how long `some_txt` is. It is all vectorized so runs quick. Aligning it back to topics on the y axis is when you will have taken acid. – Chris Jun 18 '22 at 20:57
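The thread never pins down what "knee point" means. One common reading, assumed here, is the point on the sorted curve that lies farthest from the straight line joining its first and last points. The sketch below is a rough illustration of that heuristic only: `find_knee()` is a hypothetical helper, and the widths are made-up numbers, not the asker's data.

```r
# Hypothetical helper: index of the point farthest from the chord joining
# the first and last points of a sorted numeric vector.
# Assumes length(y) >= 2 and y[1] != y[length(y)].
find_knee <- function(y) {
  n <- length(y)
  x <- seq_len(n)
  # Rescale both axes to [0, 1] so neither dominates the distance.
  xs <- (x - 1) / (n - 1)
  ys <- (y - y[1]) / (y[n] - y[1])
  # After rescaling, the chord is the line ys = xs, so the distance
  # to it is proportional to |ys - xs|.
  which.max(abs(ys - xs))
}

# Made-up widths for illustration only.
width_txt <- sort(c(5, 6, 8, 9, 11, 14, 17, 20, 36, 80))
knee <- find_knee(width_txt)
threshold <- width_txt[knee]

# Keep only the entries at or below the estimated threshold.
kept <- width_txt[width_txt <= threshold]
```

The resulting `threshold` plays the role of the hand-estimated filter value mentioned in the comments. Other knee definitions (for example, maximum curvature) can give a different index, so the result is worth checking against the plot.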
