Scan Discovery

Also known as Scan Spider

How HawkScan navigates your web application during a scan is fully configurable. To find meaningful vulnerabilities, HawkScan tries to discover the parts of your site, intercepting the HTTP request and response payloads as it navigates your web application.

This process is called Scan Discovery and is configured under the hawk.spider section of the stackhawk.yml file.

stackhawk.yml

hawk:
  spider:
    maxDurationMinutes: 2 # maximum allowed time in minutes for any enabled spiders to crawl your web application.
    seedPaths: [] # list of paths to directly add to the site tree.
    base: true # basic spider utility that looks at HTML source files and follows URLs it finds. Enabled by default.
    ajax: false # more complex spider operation that follows dynamic links and buttons in the application.
    custom: {} # bring your own developer tools and use generated web traffic to discover your application.

These mechanisms are best suited for discovering running web applications that serve Content-Type: text/html;, including server-side rendered and MVC-shaped web applications. While HawkScan will try to deterministically and consistently scan a running website, the results of the Scan Discovery phase can be more variable for larger web applications with more links and changing content.

For more consistent, protocol-constrained REST API scanning, you should specify an API definition such as an OpenAPI specification instead of relying on Scan Discovery mechanisms. HawkScan also supports scanning GraphQL and SOAP APIs.
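As a rough sketch of what pointing HawkScan at an OpenAPI specification might look like: the app.openApiConf.path key and the /openapi.json route below are assumptions not covered on this page, and the host value is a placeholder, so check the OpenAPI configuration documentation for the exact options.

```yaml
app:
  host: http://localhost:3000   # placeholder host, assumed for illustration
  openApiConf:
    path: /openapi.json         # assumed key and route; where the running app serves its spec
```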

maxDurationMinutes

Multiple spiders can be enabled for a scan; however, the full navigation of your web application may take a long time if the app is sufficiently large. This setting limits the amount of time all configured spiders may take when operating. By default this is 2 minutes. Larger web applications may need more time to scan in pre-production, whereas a shorter feedback time is better when scanning in development.
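For illustration, a pre-production configuration for a larger application might raise the limit. The value of 10 below is an arbitrary example, not a recommendation.

```yaml
hawk:
  spider:
    maxDurationMinutes: 10 # example value; the default is 2
```

A development configuration would typically keep the default or a similarly small value to shorten feedback time.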

seedPaths

Explicitly adds routes to the site tree. HawkScan visits the host URL and any routes added here directly during the scan, and uses these paths as additional starting points for crawling your application. This parameter is useful for defining routes that are not readily crawlable from the root of your application host, such as a hidden page like /admin.

NOTE: This configuration is NOT a replacement for an API definition and provides no benefit to pure REST APIs.
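A minimal sketch of seeding the hidden /admin page mentioned above, alongside a second path that is purely hypothetical:

```yaml
hawk:
  spider:
    seedPaths:
      - "/admin"            # the hidden page used as the example above
      - "/reports/archive"  # hypothetical path, for illustration only
```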

base Spider

The base spider is the basic web crawler for discovering your application’s routes. This spider is appropriate for most traditional web applications. This spider will reach new pages in the web application by finding URLs in the Content-Type: text/html; responses and breadth-first-searching those paths until it has reached all feasible pages.

Toggle its operation with true or false.

NOTE: This feature is enabled by default.

ajax Spider

The ajax spider is a more complex web crawler designed to discover new pages in more dynamic websites or Single Page Applications. This spider leverages Selenium to follow an unscripted process for clicking any buttons and links it encounters.

Toggle its operation with true or false. You can additionally configure which browser to use with the spider.ajaxBrowser setting. Options include:

  • FIREFOX_HEADLESS (default)
  • FIREFOX
  • CHROME_HEADLESS
  • CHROME

NOTE: To use the spider.ajax option with the CLI you must have Firefox or Chrome installed and set spider.ajaxBrowser appropriately. This spider is not available in the arm64 HawkScan Docker image.
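As an example, the following sketch enables the ajax spider alongside the base spider and selects headless Chrome. It assumes Chrome is installed on the machine running the CLI; any of the browser options listed above could be substituted.

```yaml
hawk:
  spider:
    base: true
    ajax: true
    ajaxBrowser: CHROME_HEADLESS # assumes Chrome is installed; FIREFOX_HEADLESS is the default
```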

custom Scan Discovery

Developers who use HawkScan often rely on other application testing tools as well. Many of these tools generate web traffic and support proxying that traffic through other software. HawkScan can reuse that traffic to discover your application and check it for software vulnerabilities.

Toggle its operation by specifying a custom.command to be run.
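A minimal sketch, assuming an existing end-to-end test suite that can be started from the command line. The npm script name below is hypothetical, and any further custom options are described on the Custom Scan Discovery page.

```yaml
hawk:
  spider:
    custom:
      command: npm run test:e2e # hypothetical command that drives traffic through your application
```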

See the Custom Scan Discovery page for more details.