# Sitemap and Robots.txt Generation
This repository includes an automated GitHub Actions workflow that generates `sitemap.xml` and `robots.txt` files whenever changes are pushed to the `main` branch.
## Workflow: `.github/workflows/generate-sitemap.yml`
### Trigger
- Automatically runs on every push to the `main` branch
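As a rough sketch, the trigger section of such a workflow typically looks like the following. This is an assumption about the workflow's contents, not a verbatim copy of the repository's file:

```yaml
# Illustrative trigger: run the workflow on every push to main.
on:
  push:
    branches:
      - main
```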
### What it does
- Generates `sitemap.xml` with all pages from the Jekyll blog:
  - Homepage with highest priority (1.0)
  - All blog posts from the `_posts/` directory (priority 0.8)
  - Other Jekyll pages such as `index.md` (priority 0.6)
- Generates `robots.txt` that:
  - Allows all search engines to crawl the website
  - References the `sitemap.xml` location (see the sketch after this list)
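A minimal `robots.txt` matching that description could look like this; the sitemap URL is derived from the GitHub Pages format noted below and is an assumption about the exact generated output:

```text
User-agent: *
Allow: /

Sitemap: https://lxndrj.github.io/blog.pandango.de/sitemap.xml
```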
### URL Structure
- Uses the GitHub Pages format: `https://lxndrj.github.io/blog.pandango.de/`
- Blog posts follow the Jekyll permalink structure: `/YYYY/MM/DD/post-slug.html`
## Files Generated
- `sitemap.xml`: XML sitemap following the sitemaps.org protocol
- `robots.txt`: Robots exclusion protocol file
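For illustration, a single blog-post entry in the generated `sitemap.xml` might look like the following; the post date and slug are hypothetical placeholders, while the priority matches the value listed above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Hypothetical blog post entry; date and slug are placeholders. -->
  <url>
    <loc>https://lxndrj.github.io/blog.pandango.de/2024/01/15/example-post.html</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>
```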
## Automatic Updates
Both files are automatically updated and committed back to the repository whenever:
- New blog posts are added
- Existing posts are modified
- Other Jekyll pages are added or changed
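The commit-back step itself is not shown in this document; a common GitHub Actions pattern for it, offered here only as a sketch of how such a step might look, is:

```yaml
# Illustrative commit step; the actual workflow may use a dedicated
# action or different commit details.
- name: Commit generated files
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "github-actions[bot]@users.noreply.github.com"
    git add sitemap.xml robots.txt
    # Commit only if the files actually changed.
    git diff --staged --quiet || git commit -m "Regenerate sitemap and robots.txt"
    git push
```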
## Manual Regeneration
If needed, the sitemap can be manually regenerated by pushing any change to the `main` branch, which will trigger the workflow.
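One way to push a change without touching any files is an empty commit; this is a general Git technique, not something the workflow itself prescribes:

```bash
# Trigger the workflow without modifying any files.
git commit --allow-empty -m "Trigger sitemap regeneration"
git push origin main
```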