Skip to content

'Imports' feature and better spider support

Compare
Choose a tag to compare
@ChrisBAshton ChrisBAshton released this 25 Nov 15:06
· 65 commits to master since this release

Introduces (via #487):

  • 'Imports' functionality (#486)
  • wraith spider [config_name] command (#488)
  • wraith info command (#484)

Spidering and Imports

We have a lot of open 'Spider mode' issues. Many of these arise from the fact that the Wraith spidering logic is quite complex and difficult to test. e.g. Wraith has to maintain a lot of internal state, triggering spidering automatically when certain config properties are missing, but only triggering the spidering if it hasn't been done in the last x-config-option days, etc.

We can make Wraith much simpler under the hood by giving spidering its own dedicated spider command, which the user must choose to run manually, as regularly as they choose. A new 'Imports' feature allows users to import configs into one another.

Paths determined through the spider command are stored to a file as YAML, instead of in a .txt file, which reduces complexity further by storing paths in the same way as non-spider use of Wraith. It also removes the Nokogiri dependency, which will speed up Wraith setup times slightly.

The 'Imports' feature has lots of potential uses beyond just supporting spidering. For example, you may have a common Wraith config defining browser engine, screen sizes to capture, colour of diff image, etc - and then you can have multiple different Wraith configs for each of your sites, all of which import the base config and stop you from having to duplicate all of that information.

Proposed spider workflow

Say we have a simplified config:

imports: "spider_configs.yml"

domains:
  test: http://www.test.example.com
  live: http://www.test.example.com
  1. wraith spider test.yml => spiders the sites, and saves the paths to spider_configs.yml.
  2. wraith capture test.yml => the spider_configs.yml paths are automatically imported into the test.yml, and Wraith continues as if the paths were specified manually.