
bear.nolt.io/113
Preview meta tags from the bear.nolt.io website.
Linked Hostnames (4)
- 3 links to bear.nolt.io
- 1 link to developers.cloudflare.com
- 1 link to neil-clarke.com
- 1 link to outdatedbrowser.com
Search Engine Appearance
Block LLMs from scraping by allowing customization of robots.txt · ʕ•ᴥ•ʔ Bear Feedback
I'm more than happy to have my site indexed by search engines such as Google, Bing, Kagi, etc. But I'd love to have a way to block LLMs, such as ChatGPT, from training on data from my site. As far as I'm aware, the main viable option for this would be robots.txt. Page-level no-follow tags would turn away search engines as well, since there's no way to specify user-agents. Additionally, DNS-level anti-scraping products from companies such as Cloudflare only prevent malicious scrapers (e.g. ones looking for emails/phone numbers).
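As a rough illustration of the robots.txt approach the post asks for, the sketch below uses Python's standard `urllib.robotparser` to check a hypothetical robots.txt that disallows one LLM crawler while leaving every other crawler allowed. The `GPTBot` token is the user-agent OpenAI documents for its crawler; the file contents, the `can_fetch` helper, and the `example.com` URL are assumptions for illustration, not anything Bear currently provides. Note that robots.txt only works for crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one LLM crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the author's site.
    for agent in ("GPTBot", "Googlebot", "Bingbot"):
        print(agent, can_fetch(agent, "https://example.com/blog/"))
```

Per-crawler rules like this are the reason the post prefers robots.txt: each `User-agent` group can be targeted separately, so search engines stay unaffected.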
General Meta Tags (8)
- title: Block LLMs from scraping by allowing customization of robots.txt · ʕ•ᴥ•ʔ Bear Feedback
- cache-control: private, no-cache, no-store, must-revalidate
- expires: 0
- pragma: no-cache
- charset: utf-8
Open Graph Meta Tags (5)
- og:description: I'm more than happy to have my site indexed by search engines such as Google, Bing, Kagi, etc. But I'd love to have a way to block LLMs, such as ChatGPT, from training on data from my site. As far as I'm aware, the main viable option for this would be robots.txt. Page-level no-follow tags would turn away search engines as well, since there's no way to specify user-agents. Additionally, DNS-level anti-scraping products from companies such as Cloudflare only prevent malicious scrapers (e.g. ones looking for emails/phone numbers).
- og:image: https://nolt.io/static/dist/images/[email protected]
- og:title: Block LLMs from scraping by allowing customization of robots.txt · ʕ•ᴥ•ʔ Bear Feedback
- og:type: website
- og:url: https://bear.nolt.io/113
Twitter Meta Tags (5)
- twitter:card: summary
- twitter:title: Block LLMs from scraping by allowing customization of robots.txt · ʕ•ᴥ•ʔ Bear Feedback
- twitter:description: I'm more than happy to have my site indexed by search engines such as Google, Bing, Kagi, etc. But I'd love to have a way to block LLMs, such as ChatGPT, from training on data from my site. As far as I'm aware, the main viable option for this would be robots.txt. Page-level no-follow tags would turn away search engines as well, since there's no way to specify user-agents. Additionally, DNS-level anti-scraping products from companies such as Cloudflare only prevent malicious scrapers (e.g. ones looking for emails/phone numbers).
- twitter:image: https://nolt.io/static/dist/images/[email protected]
- twitter:site: @TryNolt
Item Prop Meta Tags (3)
- name: Block LLMs from scraping by allowing customization of robots.txt · ʕ•ᴥ•ʔ Bear Feedback
- description: I'm more than happy to have my site indexed by search engines such as Google, Bing, Kagi, etc. But I'd love to have a way to block LLMs, such as ChatGPT, from training on data from my site. As far as I'm aware, the main viable option for this would be robots.txt. Page-level no-follow tags would turn away search engines as well, since there's no way to specify user-agents. Additionally, DNS-level anti-scraping products from companies such as Cloudflare only prevent malicious scrapers (e.g. ones looking for emails/phone numbers).
- image: https://nolt.io/static/dist/images/[email protected]
Link Tags (2)
- canonical: https://bear.nolt.io/113
- shortcut icon: https://nolt.io/static/dist/images/logo.1034f87571.png
Links (6)
- http://outdatedbrowser.com
- https://bear.nolt.io/1
- https://bear.nolt.io/2
- https://bear.nolt.io/new-post
- https://developers.cloudflare.com/waf/tools/user-agent-blocking/#cloudflare-user-agent-blocking