X Bookmarks Architect
Automating a Knowledge Ingestion Engine for a PM's Second Brain
Deliverables / Skills Utilized
Playwright
Python
Notion API
MCP
Warp Terminal
The Problem
PMs in Fintech and AI face a growing "black hole" of bookmarks on X: thousands of saved posts that remain unsorted, disconnected from workflows, and effectively useless for knowledge retrieval.
I had 2,000+ bookmarks accumulated over years. Finding anything meant endless scrolling. The knowledge was there, but completely inaccessible.
The Solution
Built an end-to-end data pipeline that extracts, processes, and ingests all bookmarks into a structured Notion "Second Brain", with no manual effort after the initial setup.
Implementation Steps
Authentication & Harvesting (Playwright)
- Automated browser session with cookie-based authentication
- Handled X's anti-bot measures with realistic interaction patterns
- Session persistence for reliable multi-hour extractions
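A minimal sketch of how cookie-based session persistence could look with Playwright's sync API, offered as an illustration rather than the project's actual code. The file name `x_session.json`, the helper `context_options`, the function `open_bookmarks`, and the manual first-run login are all assumptions; the bookmarks URL is likewise assumed.

```python
from pathlib import Path

# Hypothetical location for the saved cookies and local storage.
STATE_FILE = Path("x_session.json")


def context_options(state_file: Path) -> dict:
    """Keyword arguments for browser.new_context().

    Reuses the saved session when the file exists, otherwise starts clean.
    """
    if state_file.exists():
        return {"storage_state": str(state_file)}
    return {}


def open_bookmarks(state_file: Path = STATE_FILE) -> None:
    # Imported lazily so context_options() can be used without a browser.
    from playwright.sync_api import sync_playwright

    first_run = not state_file.exists()
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context(**context_options(state_file))
        page = context.new_page()
        page.goto("https://x.com/i/bookmarks")  # assumed bookmarks URL
        if first_run:
            # Assumption: the first login is done by hand in the open window.
            input("Log in in the browser window, then press Enter here...")
        # Persist cookies so later runs skip the login step.
        context.storage_state(path=str(state_file))
        browser.close()
```

Keeping the session in a file means a long extraction can be restarted without logging in again, which is one plausible way to get the session persistence described above.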
Infinite Scroll Engineering
- Custom scroll algorithm to handle X's lazy-loading
- Implemented backoff strategies to avoid rate limits
- Checkpoint system to resume interrupted sessions
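The scroll, backoff, and checkpoint ideas above can be sketched as one loop. This is an illustrative sketch, not the original algorithm: `harvest`, `read_visible`, `scroll_down`, the JSON checkpoint format, and the default delays are all hypothetical. The browser-specific parts are passed in as callables so the loop itself stays independent of Playwright.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable


def load_checkpoint(path: Path) -> dict:
    """Return previously harvested items keyed by URL, or an empty dict."""
    return json.loads(path.read_text()) if path.exists() else {}


def save_checkpoint(path: Path, items: dict) -> None:
    """Write to a temp file first so an interrupted write cannot corrupt the checkpoint."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(items))
    tmp.replace(path)


def harvest(
    read_visible: Callable[[], Iterable[dict]],  # returns posts currently in the DOM
    scroll_down: Callable[[], None],             # triggers the next lazy-load
    checkpoint: Path,
    max_idle_rounds: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    items = load_checkpoint(checkpoint)  # resume: already-seen URLs are skipped
    idle = 0
    while idle < max_idle_rounds:
        before = len(items)
        for item in read_visible():
            items.setdefault(item["url"], item)  # de-duplicate by URL
        if len(items) > before:
            idle = 0
            save_checkpoint(checkpoint, items)
        else:
            idle += 1
        # Exponential backoff: wait longer after each round with nothing new.
        sleep(min(base_delay * 2 ** idle, max_delay))
        scroll_down()
    return items
```

With Playwright, `scroll_down` could be something like `lambda: page.mouse.wheel(0, 2000)`, and `read_visible` a function that extracts post data from the loaded elements. Note that in this sketch a resumed run still scrolls from the top; the checkpoint only prevents re-collecting items that were already saved.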
Zero-Token Sync Engine
- Direct Notion API integration via MCP (Model Context Protocol)
- Structured data mapping: URL, text, author, timestamp, categories
- 95% reduction in API costs vs standard AI processing approaches
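To illustrate the structured mapping, here is a sketch that turns one bookmark into the JSON body Notion's create-page endpoint expects. The property names (`Name`, `URL`, `Text`, `Author`, `Posted`, `Categories`), the bookmark dictionary shape, and the function `to_notion_page` are assumptions; the target database would need matching property types. The mapping is plain deterministic code with no language model involved, and how the payload is sent (through an MCP tool or a direct HTTP call) is left out here.

```python
from datetime import datetime, timezone

# Notion limits the content of a single rich-text object to 2000 characters.
MAX_TEXT = 2000


def to_notion_page(bookmark: dict, database_id: str) -> dict:
    """Build a create-page payload from a bookmark (hypothetical schema)."""
    text = bookmark["text"]
    return {
        "parent": {"database_id": database_id},
        "properties": {
            # Short title derived from the first characters of the post.
            "Name": {"title": [{"text": {"content": text[:80]}}]},
            "URL": {"url": bookmark["url"]},
            "Text": {"rich_text": [{"text": {"content": text[:MAX_TEXT]}}]},
            "Author": {"rich_text": [{"text": {"content": bookmark["author"]}}]},
            "Posted": {"date": {"start": bookmark["timestamp"].isoformat()}},
            "Categories": {
                "multi_select": [{"name": c} for c in bookmark["categories"]]
            },
        },
    }


example = to_notion_page(
    {
        "url": "https://x.com/someone/status/1",  # placeholder URL
        "text": "Notes on pricing experiments",
        "author": "@someone",
        "timestamp": datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
        "categories": ["Product"],
    },
    database_id="YOUR_DATABASE_ID",  # placeholder
)
```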
Algorithmic Noise Filtering
- Rule-based filtering to exclude sports, memes, and off-topic content
- Category auto-assignment based on content patterns
- Priority scoring for fintech/AI/product content
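A rule-based filter of this kind might be sketched with simple keyword sets. The keyword lists, the `classify` function, and the scoring rule (one point per matched topic keyword) are invented for illustration and are not the project's actual rules.

```python
import re

# Hypothetical keyword rules; a real list would be longer and tuned by hand.
NOISE = {"nba", "nfl", "meme", "memes", "touchdown"}
CATEGORIES = {
    "AI": {"ai", "llm", "agents", "model"},
    "Fintech": {"fintech", "payments", "banking", "lending"},
    "Product": {"product", "roadmap", "prd", "discovery"},
}


def tokens(text: str) -> set:
    """Lowercased alphanumeric words in the post."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(text: str):
    """Return None for noise, else the matched categories and a priority score."""
    words = tokens(text)
    if words & NOISE:
        return None  # excluded from the sync
    categories = sorted(name for name, kws in CATEGORIES.items() if words & kws)
    # Priority: total number of topic keywords that appear in the post.
    priority = sum(len(words & kws) for kws in CATEGORIES.values())
    return {"categories": categories, "priority": priority}


print(classify("LOL this meme about the NBA finals"))
print(classify("How LLM agents change payments roadmap planning"))
```

Because the rules are ordinary set lookups, this step runs locally and costs nothing per item, which fits the low-cost goal described earlier.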
Technical Highlights
| Metric | Result |
|---|---|
| Items Migrated | 2,000+ |
| API Cost Reduction | 95% vs AI-based approaches |
| Manual Effort | Zero post-setup |
| Success Rate | 99.2% |
Why This Matters for PMs
This project demonstrates:
- Technical Literacy: Writing production code, not just specs
- Scalable Workflows: Building systems that compound over time
- Tool Integration: Connecting disparate tools into cohesive pipelines
- Problem-Solution Fit: Identifying personal pain points and engineering solutions
Stack
- Playwright (Python): Browser automation and scraping
- Warp Terminal: Modern terminal for development
- Notion API + MCP: Structured knowledge base integration
- Python 3.x: Core scripting and data processing