
I'm running into issues and don't fully understand how to create a proper line chart in my Streamlit app.

main.py

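# Imports inferred from the names used below
import pandas as pd
import streamlit as st

import api_calls

st.title("Project1")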
st.header("Part A - The Top Stories API")
st.markdown("This app uses the Top Stories API to display the most common words used in the top current \
        articles based on a specified topic selected by the user. The data is displayed as line chart \
        and as a wordcloud image.")

st.subheader("I - Topic Selection")
name = st.text_input("Please enter your name")
topic = st.selectbox(
    "Select a topic of your interest",
    ["arts", "automobiles", "books", "business", "fashion", "food", "health", "home",
     "insider", "magazine", "movies", "nyregion", "obituaries", "opinion", "politics",
     "realestate", "science", "sports", "sundayreview", "technology", "theater", "t-magazine",
     "travel", "upshot", "us", "world"]
)

if 'automobiles' in topic:
    st.write("Hi " + name + ", you selected the " + topic + " topic.")

    api_calls.top_stories_api('automobiles')

    st.subheader("II - Frequency Distribution")
    if st.checkbox("Click here to generate frequency distribution"):
        chart_data = pd.DataFrame(
            api_calls.line_graph(),
            columns=['Words', 'Count']
        )
        chart_data = chart_data.set_index('Words')
        st.line_chart(chart_data)

    st.subheader("III - WordCloud")
    if st.checkbox("Click here to generate wordcloud"):
        api_calls.wordcloud()
        st.markdown("<p style='text-align: center;'>Wordcloud generated for " + topic + " topic.",
                    unsafe_allow_html=True)

api_calls.py

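# Imports inferred from the names used below; main_functions is the project's own helper module
import matplotlib.pyplot as plt
import requests
import streamlit as st
from nltk import FreqDist, word_tokenize
from wordcloud import WordCloud

import main_functions

api_key_dict = main_functions.read_from_file("JSON_Files/api_keys.json")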
api_key = api_key_dict["my_key"]

my_articles = main_functions.read_from_file("JSON_Files/response.json")

str1 = ""

for i in my_articles["results"]:
    str1 = str1 + i["abstract"]


def top_stories_api(topic):
    url = "https://api.nytimes.com/svc/topstories/v2/" + topic + ".json?api-key=" + api_key
    response = requests.get(url).json()

    main_functions.save_to_file(response, "JSON_Files/response.json")


def most_popular_api(social, period):
    url = "https://api.nytimes.com/svc/mostpopular/v2/" + social + "/" + period + ".json?api-key=" + api_key
    response = requests.get(url).json()

    main_functions.save_to_file(response, "JSON_Files/response.json")


def line_graph():
    from nltk.corpus import stopwords

    words = word_tokenize(str1)

    words_no_punc = []

    for w in words:
        if w.isalpha():
            words_no_punc.append(w.lower())

    stopwords = stopwords.words("english")

    clean_words = []
    for w in words_no_punc:
        if w not in stopwords:
            clean_words.append(w)

    fdist3 = FreqDist(clean_words)
    fdist3.most_common(10)

    return fdist3


def wordcloud():
    wc = WordCloud().generate(str1)
    plt.figure(figsize=(12, 12))
    plt.imshow(wc)
    plt.axis("off")

    return st.pyplot()

As you can see, I'm creating a FreqDist to get the 10 most common words. I'd like to use those words as the labels on one axis and their frequencies on the other. I'm not getting any errors; however, the line graph is not being displayed.

Another issue: I'd like to loop through the items in the selectbox so that whatever the user selects is what shows up in the line chart. The data should be pulled automatically from response.json. I'm trying to do this in a DRY way.

1 Answer

In Streamlit, a line chart is drawn with st.line_chart(data), where data is the DataFrame you want to display. You would need to change the last line of your chart code to:

st.line_chart(chart_data)

Note that st.line_chart is only a convenience wrapper and does not support custom charting; it is designed for simple data structures. If you want to customize your chart, use one of the other plotting methods, such as st.pyplot with a matplotlib figure, or an Altair chart.

Check out the docs here: https://docs.streamlit.io/en/stable/api.html?highlight=line_chart#streamlit.line_chart
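Looking at the refactored line_graph(), one likely cause of the blank chart is that the result of fdist3.most_common(10) is thrown away and the whole FreqDist is returned instead. A FreqDist is dict-like, so pd.DataFrame(..., columns=['Words', 'Count']) most likely looks for keys named 'Words' and 'Count', finds none, and ends up with an empty frame. A minimal sketch of returning the (word, count) pairs instead is below. It uses collections.Counter, which FreqDist subclasses, so it runs without NLTK; the helper name top_words and the sample words are made up for illustration.

```python
from collections import Counter


def top_words(words, n=10):
    """Return the n most frequent words as a list of (word, count) tuples."""
    # nltk.FreqDist subclasses collections.Counter, so most_common()
    # behaves the same way on a FreqDist.
    return Counter(words).most_common(n)


# Made-up sample standing in for clean_words
clean_words = ["car", "electric", "car", "truck", "electric", "car"]
pairs = top_words(clean_words, n=2)
print(pairs)  # [('car', 3), ('electric', 2)]
```

If line_graph() ended with return fdist3.most_common(10), the existing pd.DataFrame(api_calls.line_graph(), columns=['Words', 'Count']) call in main.py should receive one row per word and could stay as it is.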

Looping over the items in the selectbox to filter the line chart isn't ideal. It would be very slow, and it can be done much more easily with the built-in pandas filter method. You can select specific columns by name by passing a list to items, in your case the topic chosen in your selectbox:

chart_data.filter(items=[topic])
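As a small illustration of that call, the sketch below assumes a frame with one column of counts per topic, which is only my guess at how your data could be laid out; the numbers are invented. Since st.selectbox returns a single string, the value is wrapped in a list before it is passed to items.

```python
import pandas as pd

# Hypothetical frame: one column of counts per topic (invented values)
chart_data = pd.DataFrame(
    {"arts": [3, 1, 4], "books": [2, 7, 1], "food": [5, 0, 2]}
)

topic = "books"  # stands in for the value returned by st.selectbox

# items expects a list-like of column labels, so wrap the single string
selected = chart_data.filter(items=[topic])
print(list(selected.columns))  # ['books']
```

Labels that are not present in the frame are ignored rather than raising an error, so an unknown topic gives back a frame with no columns.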

Check out the pandas docs on this here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.filter.html
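On the DRY part of the question, one option is to pass the selected topic straight through instead of writing one branch per topic. The sketch below is only an outline under assumptions: build_top_stories_url and join_abstracts are hypothetical helper names, the URL pattern is copied from your top_stories_api, and the response shape (a "results" list whose items carry an "abstract") is taken from your loop over my_articles.

```python
BASE_URL = "https://api.nytimes.com/svc/topstories/v2/"


def build_top_stories_url(topic, api_key):
    """Build the Top Stories URL for any topic from the selectbox."""
    return f"{BASE_URL}{topic}.json?api-key={api_key}"


def join_abstracts(response):
    """Join all abstracts with spaces so adjacent words do not run together."""
    return " ".join(article["abstract"] for article in response["results"])


# Made-up response with the same shape as response.json
fake_response = {
    "results": [
        {"abstract": "First story."},
        {"abstract": "Second story."},
    ]
}
print(build_top_stories_url("arts", "KEY"))
print(join_abstracts(fake_response))
```

With helpers like these, main.py could call api_calls.top_stories_api(topic) for whatever was selected rather than checking for 'automobiles'. It may also help to build the text after the new response has been saved, since str1 is currently computed once when api_calls is imported and so may reflect the previous topic.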

Marisa
  • When you say "generate the rest of the data", what do you mean? In your original question, it sounds like you are trying to filter your pandas DataFrame by its feature names (i.e. the value from the select box), but now you're saying that you have to generate your data from somewhere? Does that data change based on what the user selects from the box? My method assumes that `topic` is the name of one of these features. – Marisa Jun 28 '21 at 20:51
  • Hi @marisa, I've deleted my other comments and updated my original post, as I've refactored my code. Are you able to look at what is going on now? I'm no longer getting errors; however, the line graph is still not displaying. – anonymousmonkey339 Jun 28 '21 at 23:24