Transparency: The President’s New “Robots.txt” Shoes
From Kottke, still relevant after all these years.
Here’s a small and nerdy measure of the huge change in the executive branch of the US government today. Here’s the robots.txt file from whitehouse.gov yesterday:
User-agent: *
Disallow: /cgi-bin
Disallow: /search
Disallow: /query.html
Disallow: /omb/search
Disallow: /omb/query.html
Disallow: /expectmore/search
Disallow: /expectmore/query.html
Disallow: /results/search
Disallow: /results/query.html
Disallow: /earmarks/search
Disallow: /earmarks/query.html
Disallow: /help
Disallow: /360pics/text
Disallow: /911/911day/text
Disallow: /911/heroes/text
Plus 2400 lines.
The new robots.txt? 2 lines:
User-agent: *
Disallow: /includes/
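The robots.txt file tells well-behaved crawlers, such as the ones search engines like Google use, which directories not to fetch and index. Archiving services that honor the file skip those directories too, which means it also blocks archival for things like legal proceedings.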
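To see what the change means for a crawler, here is a rough sketch using Python's standard-library urllib.robotparser. It is only an illustration: OLD_RULES holds just a handful of the old rules quoted above (not the full file), the "ExampleBot" user agent is made up, and the allowed() helper is a hypothetical name of my own.

```python
from urllib.robotparser import RobotFileParser

# A small excerpt of the old file quoted above, NOT the full ~2400 lines.
OLD_RULES = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /search
Disallow: /help
Disallow: /911/heroes/text
"""

# The new two-line file, as quoted above.
NEW_RULES = """\
User-agent: *
Disallow: /includes/
"""


def allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a crawler that honors robots_txt may fetch path.

    "ExampleBot" is a made-up user agent used only for illustration.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ["/search", "/help", "/911/heroes/text", "/includes/"]:
        print(path, "old:", allowed(OLD_RULES, path), "new:", allowed(NEW_RULES, path))
```

The rules are simple path prefixes, so anything under a disallowed directory is off limits as well. Keep in mind that robots.txt is a request, not an enforcement mechanism: it only affects crawlers that choose to honor it.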