~liljamo/robots.txt

ref: c8d6ca21955c2194fa482dc509765ad88ec7ff7f, robots.txt/README.md (-rw-r--r--, 753 bytes)
c8d6ca21, Jonni Liljamo, feat: generate-nginx.sh, 27 days ago
                                                                                
# robots.txt

This repository contains my very opinionated (and possibly overengineered)
robots.txt generation setup.

`lists/` contains lists of user agents to disallow at root:
 * .txt files, one user agent per line
 * empty lines are ignored
 * lines starting with # are ignored
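
As a rough illustration of the list format above, the following sketch reads
one list file and prints only the user agents. The function name
`read_agents` is hypothetical and not taken from this repository, and it
treats only truly empty lines as empty (whitespace-only lines are kept).

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: print the user agents
# from one list file, skipping empty lines and lines starting with '#'.
read_agents() {
    local file="$1" line
    # "|| [ -n "$line" ]" keeps a final line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;   # empty line or comment: ignore
        esac
        printf '%s\n' "$line"
    done < "$file"
}
```

For example, `read_agents lists/some-list.txt` (a placeholder file name)
would print one user agent per line.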

`bases/` contains site specific base robots.txt files:
 * in the robots.txt format
 * names represent the domains they're served at

`generate.sh` is a bash script for generating output robots.txt files:
 * arg $1 is a required path to the output file
 * arg $2 is an optional path to a base file
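
The argument handling described above could look roughly like the sketch
below. Only the two arguments come from this README. The function name
`generate_robots`, the `LISTS_DIR` variable, the base-first ordering and the
one-group-per-agent output are assumptions, not the actual contents of
`generate.sh`.

```bash
#!/usr/bin/env bash
# Sketch only: argument handling follows the README; LISTS_DIR, the output
# layout and the base-first ordering are assumptions.

LISTS_DIR="${LISTS_DIR:-lists}"

generate_robots() {
    local out="${1:-}" base="${2:-}" list agent
    if [ -z "$out" ]; then
        echo "usage: generate_robots <out file> [base file]" >&2
        return 1
    fi

    {
        # Start from the site-specific base file when one is given.
        if [ -n "$base" ]; then
            cat "$base" || return 1
            echo
        fi

        # One "disallow at root" group per listed user agent.
        for list in "$LISTS_DIR"/*.txt; do
            [ -e "$list" ] || continue
            while IFS= read -r agent || [ -n "$agent" ]; do
                case "$agent" in
                    ''|'#'*) continue ;;   # empty line or comment
                esac
                printf 'User-agent: %s\nDisallow: /\n\n' "$agent"
            done < "$list"
        done
    } > "$out"
}
```

Under these assumptions, `generate_robots out/robots.txt bases/example.com`
(both paths are placeholders) would write the base file followed by one
`Disallow: /` group per user agent.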

`generate-nginx.sh` is a bash script for generating an nginx `if` block that
blocks user agents at the web server level:
 * prints out an `if` block with every user agent in the lists
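
The exact block that `generate-nginx.sh` prints is not shown here, so the
following is only a guess at its general shape. The function name
`print_nginx_block`, the `LISTS_DIR` variable, the case-insensitive regex
match on `$http_user_agent` and the `return 403` are all assumptions. User
agents containing double quotes are not handled.

```bash
#!/usr/bin/env bash
# Sketch only: the real output of generate-nginx.sh may differ. The regex
# match and "return 403" are assumptions.

LISTS_DIR="${LISTS_DIR:-lists}"

print_nginx_block() {
    local list agent pattern=""
    for list in "$LISTS_DIR"/*.txt; do
        [ -e "$list" ] || continue
        while IFS= read -r agent || [ -n "$agent" ]; do
            case "$agent" in
                ''|'#'*) continue ;;   # empty line or comment
            esac
            # Escape regex metacharacters so names match literally.
            agent="$(printf '%s' "$agent" | sed 's/[][\.|$(){}?+*^]/\\&/g')"
            # Join the agents with '|' into one alternation.
            pattern="${pattern:+$pattern|}$agent"
        done < "$list"
    done

    # Print nothing when no user agents were found.
    [ -n "$pattern" ] || return 0

    printf 'if ($http_user_agent ~* "(%s)") {\n    return 403;\n}\n' "$pattern"
}
```

The printed block would then be pasted or included inside an nginx `server`
or `location` context.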