💻 README Bullshit Detector Script
Instantly identify AI-generated fluff in documentation to save development time.
```python
import re

def detect_bullshit_score(readme_text):
    """
    Analyzes README text and returns a bullshit percentage.
    Higher score = more AI-generated marketing speak.
    """
    # Common AI-generated buzzwords and patterns
    buzzwords = [
        'ecosystem', 'seamless', 'synergistic', 'paradigm',
        'revolutionary', 'disrupt', 'leverage', 'transformative',
        'enterprise-grade', 'cutting-edge', 'next-generation',
        'robust', 'scalable', 'holistic', 'frictionless'
    ]

    # Empty claims without actual functionality
    empty_patterns = [
        r'empower.*developer',
        r'redefine.*industry',
        r'unlock.*potential',
        r'streamline.*workflow'
    ]

    text_lower = readme_text.lower()
    words = text_lower.split()
    if len(words) == 0:
        return 0

    # Count buzzword occurrences (substring match catches 'leverages' etc.)
    buzzword_count = 0
    for buzzword in buzzwords:
        buzzword_count += text_lower.count(buzzword)

    # Count empty pattern matches
    pattern_count = 0
    for pattern in empty_patterns:
        pattern_count += len(re.findall(pattern, text_lower))

    # Calculate bullshit percentage, capped at 100
    total_indicators = buzzword_count + pattern_count
    bullshit_score = min(100, (total_indicators / len(words)) * 1000)
    return round(bullshit_score, 2)

# Example usage:
# readme = "This revolutionary tool leverages synergistic paradigms to transform your developer ecosystem..."
# score = detect_bullshit_score(readme)
# print(f"Bullshit Score: {score}%")
```
The Problem: When Your Documentation Sounds Smarter Than Your Code
Remember when README files used to tell you what software did? Those were simpler times. You'd get a quick description, installation instructions, maybe an example or two. Today, thanks to the miracle of AI writing assistants, we've entered the era of documentation that sounds like it was written by a marketing executive who just discovered the thesaurus.
The problem isn't just aesthetic—it's a genuine productivity killer. Developers now spend more time parsing phrases like "leveraging synergistic paradigms" than actually understanding code. I recently spent 15 minutes reading about a "revolutionary new approach to data transformation" only to discover it was a JSON prettifier. Another gem promised to "redefine the developer ecosystem" while being a 20-line script that renamed files.
The worst offenders follow a predictable pattern: they start with grandiose claims about "disrupting" something, drop at least three industry buzzwords per sentence, and somehow manage to say absolutely nothing concrete. By the time you reach the installation section, you're not even sure if you're looking at a database ORM or a new meditation app for developers.
The Solution: Cutting Through the Buzzword Fog
After one too many encounters with documentation that sounded like it was written by a corporate AI trying to sound human (and failing spectacularly), I decided enough was enough. I built the README GPT Bullshit Detector to solve this exact problem.
The tool works on a simple but effective principle: AI-generated content tends to use certain patterns and vocabulary that actual technical documentation doesn't. While humans might occasionally slip in a "seamless" or "robust," AI documentation sounds like it's trying to convince you to invest in a startup that hasn't built anything yet.
Despite the humorous premise, this is actually a useful tool. It helps you quickly assess whether a project's documentation contains actual information or just marketing fluff. It's particularly helpful when evaluating dependencies—if a library's README scores 95% on the bullshit meter, maybe think twice before adding it to your production codebase.
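To see that contrast in numbers, here's a self-contained sketch that runs a condensed version of the scorer above (shorter buzzword list, same logic) on two made-up README blurbs. Both sample strings are invented for illustration:

```python
import re

def detect_bullshit_score(readme_text):
    """Condensed version of the scorer above: buzzword hits per word, scaled."""
    buzzwords = ['ecosystem', 'seamless', 'revolutionary', 'leverage',
                 'paradigm', 'transformative', 'cutting-edge', 'robust']
    empty_patterns = [r'empower.*developer', r'unlock.*potential']
    text_lower = readme_text.lower()
    words = text_lower.split()
    if not words:
        return 0
    hits = sum(text_lower.count(b) for b in buzzwords)
    hits += sum(len(re.findall(p, text_lower)) for p in empty_patterns)
    return round(min(100, hits / len(words) * 1000), 2)

# A buzzword-laden blurb vs. a plain-spoken one
fluffy = ("This revolutionary tool leverages seamless paradigms to "
          "empower every developer in a transformative ecosystem.")
plain = "Pretty-prints JSON files. Usage: jsonfmt input.json > output.json"

print(detect_bullshit_score(fluffy))  # pegged at the 100 cap
print(detect_bullshit_score(plain))   # 0
```

Seven indicator hits in fifteen words blows past the cap, while the plain description doesn't trigger a single one.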
How to Use It: Your Quick Start Guide
Installation is refreshingly simple—no "revolutionary deployment paradigms" here. Just clone the repository and run the script:
```shell
git clone https://github.com/BoopyCode/readme-gpt-bullshit-detector
cd readme-gpt-bullshit-detector
python detector.py /path/to/your/README.md
```

The core logic is beautifully straightforward. Here's a snippet from the main detection function that shows how it identifies problematic patterns:
```python
def calculate_bullshit_score(text):
    """Returns a percentage score of how bullshitty the README is"""
    buzzwords = [
        'revolutionary', 'leverage', 'ecosystem', 'seamless',
        'synergy', 'paradigm', 'robust', 'cutting-edge',
        'next-generation', 'innovative', 'transformative'
    ]
    words = text.lower().split()
    if not words:
        return 0
    # Strip trailing punctuation so 'seamless,' still matches
    buzzword_count = sum(1 for word in words
                         if word.strip('.,;:!?') in buzzwords)
    # Scale buzzword density so a handful of hits pushes the score up
    return min(100, (buzzword_count / len(words)) * 1000)
```

Check out the full source code on GitHub to see all the features, including the logic for extracting actual technical descriptions and comparing README length to source code size.
Key Features That Actually Do Something
- Buzzword Density Analysis: Scans for telltale AI vocabulary like "revolutionary," "leverage," "ecosystem," and "seamless"—the four horsemen of the documentation apocalypse.
- Bullshit Score (0-100%): Generates a quantitative measure of how much marketing fluff you're dealing with. Scores above 70% suggest you might want to look elsewhere.
- Technical Description Extraction: Attempts to find and display what the project actually does in three sentences or less, cutting through the nonsense.
- README-to-Code Ratio Check: Flags projects where the documentation is longer than the actual source code—a classic sign of overpromising and underdelivering.
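The repo's actual extraction logic isn't shown here, but one crude way to approximate the "Technical Description Extraction" feature above is to keep the first few buzzword-free sentences (the function name, buzzword set, and sentence splitter are illustrative assumptions):

```python
import re

BUZZWORDS = {'revolutionary', 'leverage', 'ecosystem', 'seamless',
             'synergistic', 'paradigm', 'transformative'}

def extract_technical_description(readme_text, max_sentences=3):
    """Keep the first few sentences containing no buzzwords --
    a rough stand-in for the repo's extraction step."""
    sentences = re.split(r'(?<=[.!?])\s+', readme_text.strip())
    clean = [s for s in sentences
             if not any(b in s.lower() for b in BUZZWORDS)]
    return ' '.join(clean[:max_sentences])

text = ("This revolutionary tool transforms your workflow. "
        "It pretty-prints JSON files from the command line. "
        "Install with pip.")
print(extract_technical_description(text))
```

The buzzword-laden opener gets dropped; what survives is the part that actually tells you what the thing does.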
Conclusion: Reclaiming Your Time from AI-Generated Nonsense
In a world where AI can generate convincing-sounding nonsense at scale, tools like this help maintain some semblance of sanity. The README GPT Bullshit Detector won't write better documentation for you, but it will help you identify when you're reading documentation that was clearly written by something that doesn't understand what it's describing.
Try it out on your own projects—you might be surprised at what you find. And if your README scores particularly high, maybe take it as a sign to write something that actually explains what your code does instead of what it "revolutionizes."
Try it out: https://github.com/BoopyCode/readme-gpt-bullshit-detector
Remember: just because AI can generate endless paragraphs about "synergistic ecosystems" doesn't mean it should. Your time is too valuable to spend deciphering what should have been a simple explanation.
Quick Summary
- What: A tool that scans README files for AI-generated marketing fluff and extracts the actual technical description.