(:Summary: Settings available to control robots:)
%rfloat%''This page is under construction.''
This page describes "robots" (webcrawlers) in general, and the settings available in PmWiki 2.1 and later to control them.

!!Nofollow links
The default PmWiki skin adds "rel=nofollow" to all of the [[available actions]] (the edit, diff, upload, and print action links at the top and bottom of the page).
This is a convention introduced by Google: it tells a robot crawler not to give any weight to the content reached through "nofollow" links. '^[[http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html|#]]^'
Note that many robots, including Yahoo! Slurp and msnbot, completely ignore the nofollow attribute on links.

The anti-spam convention that first introduced "nofollow" doesn't say anything about robots not following the links, only that links with rel="nofollow" shouldn't be given any weight in search results.

You can also add rel=nofollow with a [[wiki style(s)]], viz [=%rel=nofollow%=], to apply it to individual links.

!! The file @@robots.txt@@

Here is one example of the contents of a ''robots.txt'' file; it primarily tells robots to ignore links containing @@action=@@:

-> [@
User-agent: *
Disallow: /pmwiki.php/Main/AllRecentChanges
Disallow: /pmwiki.php/
Disallow: */search
Disallow: SearchWiki
Disallow: *RecentChanges
Disallow: RecentChanges
Disallow: *action=
Disallow: action=
User-Agent: W3C-checklink
Disallow:
@]

!! Robot variables
PmWiki provides several variables for controlling robot (webcrawler) interactions with a site, to reduce server load and bandwidth. They are set in ''local/config.php''; a sample configuration sketch appears at the end of this page.
:$RobotPattern: a pattern used to detect robots, matched against the user-agent string
:$RobotActions: any actions not listed in this array will return a 403 Forbidden response to robots
:$EnableRobotCloakActions: setting this flag removes any forbidden ?action= values from page links returned to robots, which reduces the bandwidth consumed by robots even further [-(PITS:00563)-]
:$MetaRobots: controls the robots [=<meta>=] tag on each page; see [[layout variables]]

!! See also
* [[Cookbook:Controlling WebRobots]] - {Cookbook/ControllingWebRobots$:Summary}
* %newwin rel=nofollow%[[http://robotstxt.org/]]
* %newwin rel=nofollow%[[http://sitemaps.org/]]
* %newwin rel=nofollow%[[Wikipedia:Robots.txt | Robots Exclusion Standard]]
* %newwin rel=nofollow%http://microformats.org/wiki/rel-nofollow, http://www.nonofollow.net/
* Category: [[!Robots]]
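
!! Example configuration (sketch)
The sketch below shows how the robot variables described above might be set in ''local/config.php''. It is a minimal illustration only: the user-agent pattern and the list of permitted actions are assumptions chosen for the example, not necessarily the values shipped with PmWiki.

-> [@
<?php if (!defined('PmWiki')) exit();

# Treat any user-agent matching this pattern as a robot
# (illustrative pattern only).
$RobotPattern = 'googlebot|slurp|msnbot|crawler|spider';

# Robots may only perform these actions; any other action
# returns a "403 Forbidden" response to the robot.
$RobotActions = array('browse', 'print', 'rss');

# Strip forbidden ?action= values from links on pages served
# to robots, so they have fewer useless URLs to crawl.
$EnableRobotCloakActions = 1;

# Value used for the page's robots <meta> tag (see LayoutVariables).
$MetaRobots = 'index,follow';
@]

With settings like these, a crawler matching $RobotPattern can still browse pages and read feeds, but requests such as ?action=edit or ?action=diff are refused, and with cloaking enabled those action links are not even presented to it.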