
Scribe

AI-powered cold emails for research positions, with real citations and anti-hallucination validation


Three years ago, when the GPT API was released, I wrote a scrappy Python script to help myself cold email professors. It worked well enough that friends started asking for it. One of them, Gurnoor, used it to land spots in labs at Harvard and Stanford. That script stayed the same until I nuked the codebase and rebuilt the entire thing from scratch.

Scribe is now a full platform. Upload your resume, create a template with customizable placeholders, and generate personalized emails backed by real citations.

The Backend (The Fun Part)

I'm genuinely proud of how the architecture came together. The system runs a 4-step AI pipeline that turns a simple template into a personalized email. The key ingredient is the placeholders in the email template (like professor_most_recent_paper), which tell the system where to scrape and what kind of information to extract. The pipeline has four steps:

  1. Template Parser: Claude Haiku analyzes your template and extracts search terms
  2. Web Scraper: Playwright scrapes Google results, then a two-tier summarization system condenses everything without losing important details
  3. ArXiv Enricher: Pulls the professor's actual papers and scores them for relevance
  4. Email Composer: Claude Sonnet writes the final email, validated up to 3 times to catch hallucinations
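
To make the placeholder idea concrete, here is a minimal sketch of how placeholder names could be pulled out of a template and turned into search terms. It is not Scribe's actual parser (step 1 uses Claude Haiku for this). The {{name}} syntax, the function names, and the query heuristic are all assumptions made for illustration.

```python
import re

# Assumed placeholder syntax: {{snake_case_name}}. Scribe's real syntax may differ.
PLACEHOLDER = re.compile(r"\{\{\s*([a-z_][a-z0-9_]*)\s*\}\}")


def extract_placeholders(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for name in PLACEHOLDER.findall(template):
        seen.setdefault(name, None)
    return list(seen)


def to_search_query(placeholder: str, professor: str) -> str:
    """Turn a placeholder name into a naive search query (hypothetical heuristic)."""
    words = placeholder.replace("_", " ")
    # Drop the leading "professor" token, since the actual name replaces it.
    if words.startswith("professor "):
        words = words[len("professor "):]
    return f"{professor} {words}"


template = "Dear {{professor_name}}, I enjoyed {{professor_most_recent_paper}}."
names = extract_placeholders(template)
queries = [to_search_query(n, "Jane Doe") for n in names]
```

A regex only finds the placeholders. Deciding what each one means, and what to search for, is the part an LLM handles better than a string heuristic like the one above.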

The pipeline itself is stateless: all intermediate data lives in memory and is written to the database only once, at the very end. As a result, workers scale horizontally without fighting over database locks, and Logfire captures the entire execution trace for debugging.
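
A rough sketch of that "in memory until the end" shape, assuming a single data object passed from step to step. The step bodies are stubs, and PipelineData, the step function names, and the emails table are hypothetical. An in-memory SQLite database stands in for the real one.

```python
import sqlite3
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class PipelineData:
    """All intermediate results for one email; never persisted mid-run."""
    template: str
    professor: str
    search_terms: tuple = field(default_factory=tuple)
    summary: str = ""
    papers: tuple = field(default_factory=tuple)
    email: str = ""


# Stub steps: each returns a new PipelineData and touches no shared state.
def parse_template(d: PipelineData) -> PipelineData:
    return replace(d, search_terms=(f"{d.professor} research",))


def scrape_web(d: PipelineData) -> PipelineData:
    return replace(d, summary="(stub summary of scraped pages)")


def enrich_with_arxiv(d: PipelineData) -> PipelineData:
    return replace(d, papers=("(stub paper title)",))


def compose_email(d: PipelineData) -> PipelineData:
    return replace(d, email=f"Dear Professor {d.professor}, (stub body)")


def run_pipeline(data: PipelineData, conn: sqlite3.Connection) -> PipelineData:
    for step in (parse_template, scrape_web, enrich_with_arxiv, compose_email):
        data = step(data)
    # The only database write in the whole run happens here.
    with conn:
        conn.execute(
            "INSERT INTO emails (professor, body) VALUES (?, ?)",
            (data.professor, data.email),
        )
    return data


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (professor TEXT, body TEXT)")
result = run_pipeline(PipelineData(template="Dear ...", professor="Jane Doe"), conn)
```

Because no step reads or writes shared state, two workers running this function at the same time only meet at the final insert.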

FastAPI handles incoming requests and places each one on a Redis queue, where Celery workers pick it up and run the pipeline asynchronously. This prevents HTTP timeouts, since email generation can take up to 60 seconds, and lets the system add workers as load grows.
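
The pattern here is "enqueue, return a job id right away, let the client poll". The sketch below shows that pattern using only the standard library: a thread pool stands in for the Celery workers and a plain dict stands in for Redis-backed task state. None of these names come from Scribe's codebase, and the real FastAPI and Celery wiring will look different.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)  # stand-in for Celery workers
jobs: dict[str, Future] = {}                   # stand-in for Redis task state


def generate_email(professor: str) -> str:
    """Pretend to be the slow pipeline (the real one can take up to 60 seconds)."""
    time.sleep(0.1)
    return f"Dear Professor {professor}, ..."


def submit(professor: str) -> str:
    """What a POST handler would do: enqueue the job and return its id immediately."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(generate_email, professor)
    return job_id


def status(job_id: str) -> dict:
    """What a GET handler would do: report progress without blocking."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "PENDING"}
    if future.exception() is not None:
        return {"state": "FAILURE", "error": str(future.exception())}
    return {"state": "SUCCESS", "email": future.result()}


job_id = submit("Doe")
first_poll = status(job_id)  # PENDING unless the job has already finished
```

The HTTP request that calls submit returns in milliseconds no matter how long generation takes, which is what keeps the connection from timing out.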

The anti-hallucination system was the trickiest part. When the scraper pulls multiple pages, facts are cross-checked across sources, and claims that appear in only one source are marked as uncertain. The email composer then runs validation loops that check whether the generated email actually mentions the professor's real work before saving it.
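
Both checks can be sketched in a few lines. This is a simplified illustration, not Scribe's implementation. It assumes claims are already normalized strings, that "real work" means a known paper title appears in the draft, and that compose is a hypothetical callable wrapping the LLM call. The example claims and titles are made up.

```python
from collections import defaultdict
from typing import Callable, Optional


def label_claims(claims_by_source: dict[str, set[str]]) -> dict[str, str]:
    """Mark a claim 'verified' if at least two sources state it, else 'uncertain'."""
    support: dict[str, set[str]] = defaultdict(set)
    for source, claims in claims_by_source.items():
        for claim in claims:
            support[claim].add(source)
    return {
        claim: "verified" if len(sources) >= 2 else "uncertain"
        for claim, sources in support.items()
    }


def mentions_real_work(email: str, paper_titles: list[str]) -> bool:
    """Naive check: does the draft contain any known paper title?"""
    text = email.lower()
    return any(title.lower() in text for title in paper_titles)


def compose_with_validation(
    compose: Callable[[int], str],
    paper_titles: list[str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry composition until a draft passes validation, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        draft = compose(attempt)
        if mentions_real_work(draft, paper_titles):
            return draft
    return None  # nothing passed, so nothing gets saved


labels = label_claims({
    "page_a": {"works on robotics", "joined in 2019"},
    "page_b": {"works on robotics"},
})


def fake_compose(attempt: int) -> str:
    # The first draft ignores the paper; the second one mentions it.
    if attempt == 1:
        return "I love your work!"
    return "I read 'Example Paper Title' closely."


email = compose_with_validation(fake_compose, ["Example Paper Title"])
```

Returning None when every attempt fails keeps an unvalidated draft from being saved.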

Scribe is open source. If you're curious about how the AI pipeline works or want to contribute, check out the GitHub repository.

Try It

If you're a student looking to break into research, or know someone who is: scribe.manitmishra.com

Thanks to credit grants, Scribe is free. No paywalls. Built for students, by a student.