# forgejo-crawler-blocker

A reverse proxy that ENFORCES robots.txt against malicious crawlers that don't respect it.
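
To make the idea concrete, here is a minimal sketch of the enforcement concept: a reverse proxy that refuses requests for paths disallowed by robots.txt, since a well-behaved crawler would never ask for them. This is not this project's actual code (see main.go and the HardBlockBasedOnRobotsTxt option in config.json for the real implementation); the upstream address, listen port, and hard-coded prefixes below are illustrative placeholders.

```go
// Minimal sketch only: refuse requests for paths that robots.txt disallows.
// The upstream URL, port, and prefix list are placeholders, not this
// project's real configuration.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// Assumption: the forgejo/gitea backend is reachable at this address.
	upstream, err := url.Parse("http://gitea:3000")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	// In a real implementation these prefixes would be parsed out of the
	// backend's robots.txt rather than hard-coded.
	disallowed := []string{"/api/", "/explore/"}

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, prefix := range disallowed {
			if strings.HasPrefix(r.URL.Path, prefix) {
				// A crawler that respected robots.txt would never request
				// this path, so refuse it instead of proxying it upstream.
				http.Error(w, "robots.txt disallows this path", http.StatusForbidden)
				return
			}
		}
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8080", handler))
}
```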

## unblocking someone

If anyone needs to clear the data to unblock someone, these are the commands to run on paimon:

    sudo -i

    docker stop gitea_forgejo-crawler-blocker_1
    rm /etc/docker-compose/gitea/forgejo-crawler-blocker/traffic.db
    docker start gitea_forgejo-crawler-blocker_1
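
Since these commands delete the whole `traffic.db` rather than a single record, they presumably reset tracking for every client, not just the one being unblocked.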

## persistent data storage

`/forgejo-crawler-blocker/data` inside the docker container, presumably bind-mounted from `/etc/docker-compose/gitea/forgejo-crawler-blocker` on the host (the path the unblock commands above delete `traffic.db` from).
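
As a rough illustration of why deleting `traffic.db` unblocks everyone, here is a sketch of a blocker persisting its state in a single data file. The JSON format, the path under `/forgejo-crawler-blocker/data`, and the field names are guesses for illustration, not this project's actual schema: if the process starts and the file is missing, it simply begins with an empty block list.

```go
// Sketch of state persistence for a crawler blocker. The JSON format and
// the path under /forgejo-crawler-blocker/data are assumptions for
// illustration; the real traffic.db schema may be entirely different.
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
)

const dbPath = "/forgejo-crawler-blocker/data/traffic.db" // assumed location

// BlockList maps a client IP to whether it is currently hard-blocked.
type BlockList map[string]bool

// load reads the persisted block list; a missing file just means
// "nothing is blocked yet", which is why deleting it unblocks everyone.
func load() (BlockList, error) {
	data, err := os.ReadFile(dbPath)
	if errors.Is(err, fs.ErrNotExist) {
		return BlockList{}, nil
	}
	if err != nil {
		return nil, err
	}
	var list BlockList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	return list, nil
}

// save writes the block list back to disk after each change.
func (b BlockList) save() error {
	data, err := json.Marshal(b)
	if err != nil {
		return err
	}
	return os.WriteFile(dbPath, data, 0o644)
}

func main() {
	list, err := load()
	if err != nil {
		fmt.Println("failed to load block list:", err)
		return
	}
	fmt.Printf("%d clients currently blocked\n", len(list))
}
```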

## forest's manual build process

Run on the server (paimon):

    cd /home/forest/forgejo-crawler-blocker && git pull sequentialread main \
      && cd /etc/docker-compose/gitea \
      && docker stop gitea_forgejo-crawler-blocker_1 || true \
      && docker rm gitea_forgejo-crawler-blocker_1 || true \
      && docker image rm gitea_forgejo-crawler-blocker || true \
      && rm -f forgejo-crawler-blocker/traffic.db \
      && docker-compose up -d \
      && sleep 1 \
      && docker logs -n 1000 -f gitea_forgejo-crawler-blocker_1
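
Note that this sequence includes `rm -f forgejo-crawler-blocker/traffic.db`, so a manual rebuild also wipes the blocker's traffic data, unblocking everyone as a side effect.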