
I am using Python, Selenium, Tweepy, and OpenSSL to inspect links gathered from tweet data. My code checks whether a tweet has a link, checks whether the link is http or https, and, if it is https, checks whether the certificate has expired. Here is that chunk of code:

if rsecure.search(s1) != None:

    #driver.get(s1)
    cert=ssl.get_server_certificate((s1, 443))
    x509 = OpenSSL.crypto.load_certificate(OpenSSL.crypto.FILETYPE_PEM, cert, ssl_version=ssl.PROTOCOL_SSLv23)
    if x509.has_expired():
        print("Expired Cert")
    else:
        print( "Good Link")
    print(driver.current_url)

Everything else works, including searching Twitter for the phrases I put in and printing the bad http links. When execution reaches this portion, though, it doesn't print the link; it prints: [Errno -2] Name or service not known. I've looked around and haven't found much that helps resolve this error. I suspect it has something to do with the OpenSSL portion, which I don't know much about.

Any ideas?

EDIT: It also occasionally prints this error: encoding with 'idna' codec failed (UnicodeError: label too long)
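One thing that may be worth checking: `ssl.get_server_certificate` expects a `(hostname, port)` pair, while `s1` comes from `driver.current_url` and so is presumably a full URL with scheme and path. Passing a full URL as the hostname would plausibly explain both the name-resolution error and the 'idna' "label too long" error. Below is a minimal sketch of extracting the hostname first; the helper names `host_and_port` and `cert_has_expired` are hypothetical, and whether this is the actual cause here is an assumption.

```python
import ssl
from urllib.parse import urlsplit


def host_and_port(url, default_port=443):
    """Return (hostname, port) parsed from a full URL.

    Hypothetical helper: it strips the scheme, path and query so that only
    the host name is handed to the DNS lookup.
    """
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError("no hostname found in URL: %r" % (url,))
    return parts.hostname, parts.port or default_port


def cert_has_expired(url):
    """Fetch the server certificate for url and report whether it has expired.

    Hypothetical helper; needs network access and pyOpenSSL, as in the question.
    """
    # Imported here so host_and_port() works even without pyOpenSSL installed.
    import OpenSSL

    pem = ssl.get_server_certificate(host_and_port(url))
    x509 = OpenSSL.crypto.load_certificate(
        OpenSSL.crypto.FILETYPE_PEM, pem.encode("ascii")
    )
    return x509.has_expired()
```

With something like this, the check in the snippet above would become `if cert_has_expired(s1):`. Note that `load_certificate` is called with only the file type and the PEM data in this sketch.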

EDIT: Here is the code that precedes the portion above, for more context:

import tweepy
import re
from selenium import webdriver
from pyvirtualdisplay import Display
import time

from OpenSSL import SSL
import OpenSSL
import ssl, socket
PYOPENSSL = True
#from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

#binary = FirefoxBinary('/ex50/bin/geckodriver.exe')
display = Display(visible=0, size=(800, 800))  
display.start()
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--incognito")


driver = webdriver.Chrome('/usr/local/rvm/gems/ruby-2.4.0/bin/chromedriver', chrome_options=chrome_options)


# context = ssl.create_default_context()
# conn = context.wrap_socket(
#                   socket.socket(socket.AF_INET),
#                    server_hostname= hostname)

# ssl_info = conn.getpeercert()
# print(ssl_info)

securer = r'https:\S*'
badr = r'http:\S*'
rsecure = re.compile(securer)
rbad = re.compile(badr)

class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            if 'http' in status.text:
                if rsecure.search(status.text) != None:
                    driver.get(rsecure.search(status.text).group())
                    s1 = driver.current_url
                elif rbad.search(status.text) != None:

                    driver.get(rbad.search(status.text).group())
                    s1 = driver.current_url
  • Which line generates the error? and what's `s1` value? – CristiFati Oct 10 '17 at 19:04
  • The `s1` value is the `driver.current_url` value. I am using Cloud9, so I am not entirely sure which line it is, but I assume it is one of the lines where I fetch the certificate: when I comment them out the code works, but obviously without checking for the expired certificate. – Bernardo Silva Oct 12 '17 at 16:19
  • UPDATE: I found out it is this line giving the error: cert=ssl.get_server_certificate((s1, 443)) and further inspection yielded this error as well: socket.gaierror – Bernardo Silva Oct 12 '17 at 17:14
