Other Tips for Running an IdP

Robots.txt

By default, robots or spiders such as Googlebot will try to crawl sites that are protected by an SP, which leads them to crawl through the steps of the IdP authentication process as well. The spidering algorithms used by most robots leave long delays between each URL that is fetched. As a result, the LoginContext expires from the StorageService long before the robot returns for the next step of the authentication process. This wasn't as much of an issue before 2.2.0, but now that the IdP uses more redirects instead of internal forwards, it generates many errors over the course of a day. One step toward reducing the number of errors generated by robots is to add a robots.txt file to the root of the site.

/robots.txt

User-agent: *
Disallow: /idp/
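
One way to check that these rules have the intended effect is to run them through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser module. The host name idp.example.org and the helper is_allowed are hypothetical and used only for illustration. It only shows how a well-behaved robot would interpret the rules; robots that ignore robots.txt are unaffected.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in /robots.txt above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /idp/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; not part of the IdP.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# idp.example.org is an assumed host name, used only for this example.
idp_blocked = not is_allowed(
    ROBOTS_TXT, "Googlebot", "https://idp.example.org/idp/profile/SAML2/Redirect/SSO"
)
other_allowed = is_allowed(ROBOTS_TXT, "Googlebot", "https://idp.example.org/index.html")

print("IdP paths blocked:", idp_blocked)
print("Other paths allowed:", other_allowed)
```

Both checks should report True if the file is written as shown. To test a deployed file instead, the parser's set_url() and read() methods can fetch it from the server.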
