I'm writing a scraper that uses Selenium to navigate to a certain website, log in, search for the newest data, and store it in a database. I'm using selenium-webdriver for the navigation, and now I'm trying to write tests for the most important edge cases.
I downloaded the HTML and built a fake Sinatra website that mimics the behavior of the original site so that I can test my code. However, I have to run the Puma server separately, in an environment independent of my code.
I need to be able to mock everything in the same environment so that I have better control over how the application behaves. I think I can take the same approach Capybara does, but I don't know where to start.
I created a short mocking class, and it runs, but as soon as Puma starts, RSpec halts and waits for Puma to stop executing.
What's the best approach to test this scraper correctly? Are there existing technologies that I can use?
My scraper works the same way as the one explained in this tutorial:
https://dev.to/mknycha/serverless-web-scraper-in-ruby-tutorial-50hg
I tried to make it work by starting the mocked website when RSpec starts, like this:
require 'webmock'
require 'puma'
require 'puma/events'
require 'spec/support/fake_website'
require 'rack/handler/puma'

include WebMock::API

WebMock.reset!

def enable_external_connections!
  WebMock.allow_net_connect!
end

def disable_external_connections!
  WebMock.disable_net_connect!(allow_localhost: true, allow: ['app.local'])
end

def stub_net_connections!(options = {})
  # Maps a service name to { url_pattern => proc returning the Rack app }.
  registry = {
    "fake_website.com" => { /fake_website.com/ => proc { FakeWebsite } }
  }

  # Array() accepts nil, a single key, or an array of keys
  # (String#to_a does not exist, so .to_a would fail on a single key).
  if !Array(options[:only]).empty?
    Array(options[:only]).each do |key|
      WebMock::API.stub_request(:any, registry[key].keys.first).to_rack(registry[key].values.first.call)
    end
    enable_external_connections!
  elsif !Array(options[:except]).empty?
    (registry.keys - Array(options[:except])).each do |key|
      WebMock::API.stub_request(:any, registry[key].keys.first).to_rack(registry[key].values.first.call)
    end
    enable_external_connections!
  else
    registry.keys.each do |key|
      WebMock::API.stub_request(:any, registry[key].keys.first).to_rack(registry[key].values.first.call)
    end
    disable_external_connections!
  end
end

def run_puma(example)
  options = { Host: '127.0.0.1', Port: '80', Threads: '0:4', workers: 1, daemon: true, Verbose: true }
  conf = Rack::Handler::Puma.config(FakeWebsite, options)
  events = conf.options[:Silent] ? ::Puma::Events.strings : ::Puma::Events.stdio
  puma_ver = Gem::Version.new(Puma::Const::PUMA_VERSION)

  events.log 'App starting Puma...'
  events.log "* Version #{Puma::Const::PUMA_VERSION} , codename: #{Puma::Const::CODE_NAME}"
  events.log "* Min threads: #{conf.options[:min_threads]}, max threads: #{conf.options[:max_threads]}"

  Puma::Server.new(FakeWebsite, ::Puma::Events.stdio, conf.options).tap do |s|
    s.binder.parse conf.options[:binds], s.events
    s.min_threads, s.max_threads = conf.options[:min_threads], conf.options[:max_threads]
  end.run.join
end

# Disable all external requests by default.
disable_external_connections!

RSpec.configure do |config|
  # Disable external connections and stub all external services
  #
  config.before(:each) do |example|
    stub_net_connections!

    if example.metadata[:external_connections] == true
      enable_external_connections!
    elsif example.metadata[:external_connections] == false
      run_puma(example)
      disable_external_connections!
    end
  end

  config.after(:each) do |example|
  end
end
However, as soon as this function runs, the tests stop because the server starts and never returns control:
def run_puma(example)
  options = { Host: '127.0.0.1', Port: '80', Threads: '0:4', workers: 1, daemon: true, Verbose: true }
  conf = Rack::Handler::Puma.config(FakeWebsite, options)
  events = conf.options[:Silent] ? ::Puma::Events.strings : ::Puma::Events.stdio
  puma_ver = Gem::Version.new(Puma::Const::PUMA_VERSION)

  events.log 'App starting Puma...'
  events.log "* Version #{Puma::Const::PUMA_VERSION} , codename: #{Puma::Const::CODE_NAME}"
  events.log "* Min threads: #{conf.options[:min_threads]}, max threads: #{conf.options[:max_threads]}"

  Puma::Server.new(FakeWebsite, ::Puma::Events.stdio, conf.options).tap do |s|
    s.binder.parse conf.options[:binds], s.events
    s.min_threads, s.max_threads = conf.options[:min_threads], conf.options[:max_threads]
  end.run.join
end
These lines are the ones that make the tests stop:
Puma::Server.new(FakeWebsite, ::Puma::Events.stdio, conf.options).tap do |s|
  s.binder.parse conf.options[:binds], s.events
  s.min_threads, s.max_threads = conf.options[:min_threads], conf.options[:max_threads]
end.run.join
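To illustrate why I think the trailing `.join` is what halts RSpec, here is a minimal sketch in plain Ruby with no Puma involved. The helper `join_would_block?` and the fake server loop are hypothetical names made up for this illustration. The assumption is that the Puma call above hands back a thread that keeps running for as long as the server is up, so joining it waits indefinitely.

```ruby
# Returns true when `thread` is still running after waiting `timeout` seconds,
# which is the situation in which a plain `thread.join` would keep blocking.
def join_would_block?(thread, timeout: 0.1)
  # Thread#join(limit) returns nil on timeout and the thread itself otherwise.
  thread.join(timeout).nil?
end

# Stand-in for a server loop: it runs until the flag is flipped.
stop_requested = false
fake_server = Thread.new { sleep 0.01 until stop_requested }

# While the loop is running, a join cannot return.
puts join_would_block?(fake_server)

# Once the loop is told to stop, the join completes.
stop_requested = true
puts join_would_block?(fake_server, timeout: 1)
```

In the spec above there is nothing that would ever stop the server thread, so the `before(:each)` hook never returns.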
Is there another way to achieve this? Are there any tools to test this type of application out there?
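For reference, the behavior I am after looks roughly like the sketch below: the fake site is served from a background thread inside the test process, and the caller gets control back immediately. This uses only the standard library instead of Puma and Sinatra, and `TinyFakeSite` is a hypothetical stand-in for my `FakeWebsite`. I assume the Puma equivalent would be to keep the thread returned by `run` without calling `.join` on it, but I have not confirmed that.

```ruby
require 'socket'
require 'net/http'

# Hypothetical stand-in for the fake site: answers every request with one page.
class TinyFakeSite
  attr_reader :port

  def initialize
    # Port 0 lets the OS pick a free port, so no root privileges are needed.
    @server = TCPServer.new('127.0.0.1', 0)
    @port = @server.addr[1]
  end

  # Starts the accept loop in a background thread and returns immediately.
  def start
    @thread = Thread.new do
      loop do
        client = @server.accept
        # Read and discard the request line and headers.
        loop do
          line = client.gets
          break if line.nil? || line == "\r\n"
        end
        body = '<html><body>fake login page</body></html>'
        client.write("HTTP/1.1 200 OK\r\n" \
                     "Content-Type: text/html\r\n" \
                     "Content-Length: #{body.bytesize}\r\n" \
                     "Connection: close\r\n\r\n#{body}")
        client.close
      end
    end
    self
  end

  # Stops the background thread and releases the port.
  def stop
    @thread&.kill
    @thread&.join
    @server.close unless @server.closed?
  end
end

site = TinyFakeSite.new.start
# Passing nil as the third argument disables any proxy taken from the environment.
response = Net::HTTP.new('127.0.0.1', site.port, nil).get('/')
puts response.code
site.stop
```

In RSpec, `start` and `stop` would presumably go into `before(:suite)` and `after(:suite)` hooks so that the server lives for the whole run rather than being started per example.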