I am trying to build a small dataset by pulling messages and responses from a Slack channel I belong to. I would like to use Python to pull the data from the channel, but I am having trouble figuring out my API key. I have created an app on Slack, but I am not sure how to find its API key. I see my client secret, signing secret, and verification token, but I can't find the API key.

Here is a basic example of what I believe I am trying to accomplish:

import slack
sc = slack.SlackClient("api key")
sc.api_call(
  "channels.history",
  channel="C0XXXXXX"
)

I am willing to just download the data manually if that is possible as well. Any help is greatly appreciated.

PDPDPDPD
  • You need to install your app to get the token. You can do that from the app management screen, where you also see the client secret etc. – Erik Kalkoken Jun 24 '19 at 23:03
  • Also, you want to look into `conversations.history` instead. It's the newer version, which works better for pagination and can also retrieve all types of channels. – Erik Kalkoken Jun 24 '19 at 23:12
  • What permissions did you give your Slack app to access messages from a channel? – mdrishan Jan 03 '22 at 15:40

2 Answers

messages

Below is example code showing how to pull messages from a channel in Python.

  • It uses the official Python Slack library and calls conversations_history with paging. It therefore works with any type of channel and can fetch large numbers of messages if needed.
  • The result is written to a file as a JSON array.
  • You can specify the channel and the maximum number of messages to retrieve.

threads

Note that the conversations.history endpoint will not return thread messages. Those have to be retrieved additionally, with one call to conversations.replies for every thread you want to retrieve messages for.

Threads can be identified in the messages for each channel by checking for the thread_ts property in the message. If it exists, there is a thread attached to it. See this page for more details on how threads work.
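
The second pass for threads could look roughly like the sketch below. It is only an illustration: `fetch_thread_replies` is a hypothetical helper name, `client` is assumed to be a WebClient like the one in the example code, and responses are assumed to behave like dicts (supporting `[]` and `.get`).

```python
from time import sleep


def fetch_thread_replies(client, channel, messages, pause=1.0):
    """Return {thread_ts: [messages]} for every thread parent in `messages`.

    `client` is assumed to expose conversations_replies(channel=..., ts=...,
    cursor=...), as the official WebClient does.
    """
    threads = {}
    for message in messages:
        thread_ts = message.get("thread_ts")
        # A thread parent carries a thread_ts equal to its own ts;
        # messages without thread_ts have no thread attached.
        if thread_ts is None or thread_ts != message.get("ts"):
            continue
        replies, cursor = [], None
        while True:
            kwargs = {"channel": channel, "ts": thread_ts}
            if cursor:
                kwargs["cursor"] = cursor
            response = client.conversations_replies(**kwargs)
            assert response["ok"]
            # The first page starts with the parent message itself.
            replies.extend(response["messages"])
            cursor = response.get("response_metadata", {}).get("next_cursor")
            if not response.get("has_more") or not cursor:
                break
            sleep(pause)  # stay below the rate limit between pages
        threads[thread_ts] = replies
        sleep(pause)  # and between threads
    return threads
```

Calling `fetch_thread_replies(client, CHANNEL, messages_all)` after the history download would give one list of messages per thread, keyed by the parent's timestamp.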

IDs

This script does not replace IDs with names, though. If you need that, here are some pointers on how to implement it:

  • You need to replace IDs for users, channels, bots, and usergroups (if on a paid plan).
  • You can fetch the lists for users, channels and usergroups from the API with users_list, conversations_list and usergroups_list respectively; bots need to be fetched one by one with bots_info (if needed).
  • IDs occur in many places in messages:
    • user top level property
    • bot_id top level property
    • as link in any property that allows text, e.g. <@U12345678> for users or <#C1234567> for channels. Those can occur in the top level text property, but also in attachments and blocks.
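
As a rough illustration of the user part of these pointers, the sketch below builds an ID-to-name map from the `members` array that users_list returns and substitutes user mentions in a text. The helper names `build_user_map` and `replace_user_mentions` are hypothetical, and handling a `<@ID|label>` variant is an assumption. Channels, bots, attachments and blocks are not covered.

```python
import re

# Matches <@U12345678>; the optional "|label" part is an assumption
# to also tolerate mentions of the form <@U12345678|label>.
MENTION = re.compile(r"<@([UW][A-Z0-9]+)(?:\|[^>]*)?>")


def build_user_map(members):
    """Map user ID -> name from a users_list 'members' array."""
    return {m["id"]: m.get("real_name") or m.get("name", m["id"]) for m in members}


def replace_user_mentions(text, user_map):
    """Replace <@ID> mentions with @name, leaving unknown IDs untouched."""
    def substitute(match):
        name = user_map.get(match.group(1))
        return "@" + name if name else match.group(0)

    return MENTION.sub(substitute, text)
```

The same map can be used for the top-level `user` property of each message, e.g. `user_map.get(message.get("user"))`.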

Example code

import os
import slack  # provided by the slackclient 2.x package (pip install slackclient)
import json
from time import sleep

CHANNEL = "C12345678"
MESSAGES_PER_PAGE = 200
MAX_MESSAGES = 1000

# init web client
client = slack.WebClient(token=os.environ['SLACK_TOKEN'])

# get first page
page = 1
print("Retrieving page {}".format(page))
response = client.conversations_history(
    channel=CHANNEL,
    limit=MESSAGES_PER_PAGE,
)
assert response["ok"]
messages_all = response['messages']

# get additional pages while below the max message count and if there are any
while len(messages_all) + MESSAGES_PER_PAGE <= MAX_MESSAGES and response['has_more']:
    page += 1
    print("Retrieving page {}".format(page))
    sleep(1)   # need to wait 1 sec before next call due to rate limits
    response = client.conversations_history(
        channel=CHANNEL,
        limit=MESSAGES_PER_PAGE,
        cursor=response['response_metadata']['next_cursor']
    )
    assert response["ok"]
    messages = response['messages']
    messages_all = messages_all + messages

print(
    "Fetched a total of {} messages from channel {}".format(
        len(messages_all),
        CHANNEL
    )
)

# write the result to a file
with open('messages.json', 'w', encoding='utf-8') as f:
    json.dump(
        messages_all,
        f,
        sort_keys=True,
        indent=4,
        ensure_ascii=False
    )
Erik Kalkoken
  • Hi Erik thank you for answering the question, it was very helpful. This seems to only receive the messages in the channel and not the response message threads to the initial messages. Is the best way to retrieve those to go through the meta data? – PDPDPDPD Jul 08 '19 at 16:37
  • yes, let me update the answer to add how it works for threads. – Erik Kalkoken Jul 08 '19 at 17:51
  • Thanks very much I was struggling to figure out a solution that did not have issues with api limits or taking a very long time – PDPDPDPD Jul 08 '19 at 18:35
  • Happy to help. As you can see in the code, my function waits 1 sec after each API call to ensure the rate limit is not violated. – Erik Kalkoken Jul 08 '19 at 18:38
  • Yeah, I have that in mine as well. I was just pulling every response individually before, rather than using the replies API, so my code was very slow. Do you know if I can make the wait shorter in the code? – PDPDPDPD Jul 08 '19 at 18:46
  • Not much. Max is 50 calls / minute. See here: https://api.slack.com/docs/rate-limits#tier_t3 – Erik Kalkoken Jul 08 '19 at 19:12
  • But you can increase the page size to a max of 1,000 to speed up the process, which means each API call will return up to 1,000 messages. Slack does not recommend more than 200, though, for stability. – Erik Kalkoken Jul 08 '19 at 19:15
  • It does not seem like the conversations.replies api returns much metadata. For example ideally I would like to know the reactions to each of the replies in the thread. Is the only way to retrieve this information to recursively use the conversations_history method? – PDPDPDPD Jul 08 '19 at 22:35
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/196162/discussion-between-erik-kalkoken-and-pdpdpdpd). – Erik Kalkoken Jul 08 '19 at 22:36

This uses the Slack Web API directly, so you need to install the requests package. getMessages() grabs the messages in a channel. You need a token, which you can get from the app management page, and a channel ID, which you can look up with the getChannels() function. Once you have the messages, you will want to know who wrote each one, which requires ID matching (mapping IDs to usernames); you can use the getUsers() function for that. Follow https://api.slack.com/custom-integrations/legacy-tokens to generate a legacy token if you do not want to use a token from your app.

import requests


def getMessages(token, channelId):
    """Return the parsed response with the message history of the given channel."""
    print("Getting Messages")
    response = requests.get(
        "https://slack.com/api/conversations.history",
        headers={"Authorization": "Bearer " + token},  # token sent as a header, not in the URL
        params={"channel": channelId},
    )
    return response.json()


def getChannels(token):
    """Return {"channels": {name: id}} for the channels in the workspace."""
    response = requests.get(
        "https://slack.com/api/conversations.list",
        headers={"Authorization": "Bearer " + token},
    )
    channelList = response.json()["channels"]  # an array of channel objects
    # put the channels and their IDs into a dictionary
    channels = {}
    for channel in channelList:
        channels[channel["name"]] = channel["id"]
    return {"channels": channels}


def getUsers(token):
    """Return the list of users in the workspace, including bots."""
    response = requests.get(
        "https://slack.com/api/users.list",
        headers={"Authorization": "Bearer " + token},
    )
    return response.json()["members"]
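
Each of the functions above makes a single request, so it returns only the first page of results. A minimal sketch of cursor-based paging that could wrap such a request is shown here. `fetch_all_pages` and its `call` parameter are hypothetical names, and `call` is assumed to take a cursor (or None) and return the parsed JSON of one response.

```python
from time import sleep


def fetch_all_pages(call, key, pause=1.0):
    """Collect the items under `key` from every page of a cursor-paginated method.

    `call` is a hypothetical callable: it receives the cursor for the next
    page (None for the first page) and returns one parsed JSON response.
    """
    items, cursor = [], None
    while True:
        data = call(cursor)
        if not data.get("ok"):
            raise RuntimeError(data.get("error", "unknown error"))
        items.extend(data[key])
        cursor = data.get("response_metadata", {}).get("next_cursor")
        if not cursor:  # an empty or missing cursor means this was the last page
            return items
        sleep(pause)  # wait between calls because of rate limits
```

For messages, `call` could be something like `lambda c: requests.get(url, headers=headers, params={"channel": channelId, "cursor": c}).json()` with `key="messages"`; requests leaves out parameters whose value is None, so the first call carries no cursor.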
MFK34
  • This is useful advice for the OP. I would, however, suggest using the official Python Slack library instead of coding all the API calls yourself. Much easier. Here is the link: https://github.com/slackapi/python-slackclient. Install with `pip3 install slackclient==2.0.0` – Erik Kalkoken Jun 25 '19 at 11:16
  • Also: you would need to add pagination logic when retrieving messages and users, or you will only get a portion (e.g. the first 200). – Erik Kalkoken Jun 25 '19 at 13:00