Founding Software Engineer
The web follows a power law: 0.01% of changes drive outcomes, but you can't predict which 0.01% without watching everything. We're building the platform that solves this — agents that patrol broadly, filter with judgment, and deliver the signals that matter.
About Zipf.AI
Zipf's Law says a tiny fraction of events drive every outcome. The engineering challenge is finding that fraction across billions of web pages that change daily. We build persistent web monitoring for AI agents and teams — agents that watch sources on a schedule, extract structured data, detect meaningful change, and deliver signals to Slack, CRM, or webhooks.
Our founding team built search and LLM systems at Microsoft Bing, Snowflake, Neeva, Amazon, Walmart, Qualtrics, and Mendel.AI. We've published dozens of papers with thousands of citations, hold dozens of patents, and shipped systems handling billions of queries daily.
We're making it trivial to run complex agentic workflows on web-scale data. Describe what to watch in plain language, and our platform figures out the rest — what to crawl, how to extract, when something meaningful changed, and where to deliver the signal.
We are not building yet another hype-chasing company that works its employees to the bone. We are a deeply curious, genuine group of people. While most of us could talk shop at all hours, we strongly encourage everyone to get healthy doses of reality.
The Role
We are building persistent monitoring infrastructure for AI agents — search, crawl, workflow execution, and signal delivery. There is simply no better learning opportunity.
We have paying customers and real revenue. We need software engineers who are scrappy and curious — people who can own a problem from database schema to Kubernetes deployment to customer call.
We are obsessed with metrics of true quality. We are thrilled to be building completely outside the gravitational pull of ad-dollars.
Choose your own job title. Grow into the Head of Product, Head of Platform Engineering, or Head of Forward Deployed Engineering.
What You'll Build
Reality: you'll work on whatever needs doing. Here's what's actively being built and shipped:
Workflow Execution Engine
- Multi-step DAG execution with dependency resolution and cascade conditions
- AI-planned workflows — users describe intent, Claude generates the patrol pipeline
- Baseline adaptation — analyze first execution, auto-tune steps
- Change detection, information gain scoring, and signal delivery
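If "DAG execution with dependency resolution" sounds abstract, here's a toy sketch of the core idea. This is purely illustrative — the `run_dag` helper and the step names are made up for this posting, not our actual API:

```python
from collections import deque

def run_dag(steps: dict[str, list[str]], execute) -> list[str]:
    """Run steps in dependency order; `steps` maps step -> prerequisites."""
    indegree = {s: len(deps) for s, deps in steps.items()}
    dependents: dict[str, list[str]] = {s: [] for s in steps}
    for step, deps in steps.items():
        for dep in deps:
            dependents[dep].append(step)

    ready = deque(s for s, d in indegree.items() if d == 0)
    order = []
    while ready:
        step = ready.popleft()
        execute(step)          # e.g. crawl, extract, score, deliver
        order.append(step)
        for nxt in dependents[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(steps):
        raise ValueError("cycle detected in workflow DAG")
    return order

# A toy patrol pipeline: crawl -> extract -> detect change -> deliver signal
pipeline = {
    "crawl": [],
    "extract": ["crawl"],
    "detect_change": ["extract"],
    "deliver_signal": ["detect_change"],
}
```

The real engine adds cascade conditions, retries, and per-step state — but the skeleton is this.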
Crawl & Search Infrastructure
- Intent-driven crawling with OPIC/PageRank URL prioritization
- Structured extraction with LLM-powered classification
- Query decomposition — break complex queries into parallel sub-queries
- Session-based research with URL deduplication and context accumulation
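To give a flavor of the sub-query fan-out: decomposed queries run concurrently, and results merge with URL deduplication. A hedged toy sketch — `research` and `search` are illustrative names, not our real interfaces:

```python
import asyncio

async def research(sub_queries: list[str], search) -> list[dict]:
    """Run sub-queries concurrently and merge results, deduplicating URLs."""
    results = await asyncio.gather(*(search(q) for q in sub_queries))
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in results:
        for item in batch:
            if item["url"] not in seen:   # keep first occurrence of each URL
                seen.add(item["url"])
                merged.append(item)
    return merged
```

The production version also accumulates session context across rounds, so later sub-queries can build on what earlier ones found.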
Scale & Reliability
- Python/FastAPI services on Kubernetes with queue-driven autoscaling
- Thousands of concurrent workflow patrols with sub-minute step latency
- Infrastructure-as-code across multiple environments
- Graceful shutdown, message redrive, and self-healing workflows
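The graceful-shutdown and redrive bullets boil down to a worker loop like this sketch. The queue interface here is illustrative, not a real SQS client:

```python
import signal

class Worker:
    def __init__(self, queue):
        self.queue = queue
        self.running = True
        # Kubernetes sends SIGTERM before killing a pod: stop taking new
        # messages, but finish the one in flight.
        signal.signal(signal.SIGTERM, lambda *_: self.stop())

    def stop(self):
        self.running = False

    def run(self) -> int:
        processed = 0
        while self.running:
            msg = self.queue.receive()      # long-poll in a real client
            if msg is None:
                break
            try:
                self.handle(msg)
                self.queue.delete(msg)      # ack only after success
                processed += 1
            except Exception:
                # No delete: the message becomes visible again after the
                # visibility timeout, and lands in a DLQ for redrive after
                # repeated failures.
                pass
        return processed

    def handle(self, msg):
        ...  # subclasses do the actual work
```

Delete-after-success plus visibility timeouts is what makes workflows self-healing: a crashed worker's messages simply get picked up again.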
Observability & Quality
- Metrics, tracing, and structured logging across distributed services
- LLM trace logging for debugging multi-step reasoning chains
- Feedback loops — agents learn from signals and auto-tune over time
- Quality scoring and notification intelligence to avoid noise
What We're Looking For
The Intersection
You sit somewhere at the intersection of:
- Infrastructure builders who design systems that scale and don't fall over.
  Experience with: Kubernetes, SQS/job queues, PostgreSQL, AWS, Terraform, API design, monitoring & observability
- Modelers who understand ML/AI and can improve retrieval quality.
  Experience with: LLM integration, prompt engineering at scale, embedding models, reranking, inference optimization
- Forward-deployed engineers who talk to customers and ship what they need.
  Experience with: Customer-facing technical work, debugging production issues, rapid prototyping, turning feedback into features
Customer Obsession
- Genuinely care about solving customer problems, not just elegant code
- Willing to do whatever customers need, even if it's "not your job"
- Take ownership of outcomes, not just outputs
Technical Fundamentals
We care more about how you think than your specific tech stack. That said, you should have:
- Strong fundamentals in distributed systems and backend infrastructure
- Experience with Kubernetes, job queues (SQS/BullMQ/Temporal), or high-scale async systems
- Comfort with Python and TypeScript (we use both daily)
- Familiarity with LLM integration, web crawling, search systems, or ML infrastructure
- Willingness to write code that may not exist in a few months — iteration over perfection
Early-Stage Mindset
- We're pre-PMF. You'll build things that get thrown away and may crack as we scale. You're excited by that.
- Default to action and iteration over lengthy planning
- Comfortable with ambiguity and changing priorities
- Want ownership and impact more than a defined role
Our 6-Month Goals (You'll Help Us Get There)
1. Platform: Workflow execution, synthesis, and signal delivery running on Kubernetes with production reliability and full observability.
2. Scale: Thousands of concurrent workflow patrols with sub-minute step latency. Autoscaling that responds to real demand, not fixed limits.
3. Intelligence: Close the quality loop — AI-planned workflows that self-tune via baseline analysis, adapt based on feedback signals, and deliver genuinely useful monitoring to paying customers.
The Reality Check
What Makes This Hard
- Making distributed systems feel simple to users while handling real complexity underneath
- Context switching between infrastructure, LLM prompts, and customer calls
- Concurrency, queues, and state management across async workflows
- Fast pace — things break, priorities shift
What Makes This Rewarding
- Own real infrastructure end-to-end — from config to production traffic
- Huge impact as one of the first engineers at a company with paying customers
- Learn from a team that's built billion-scale systems at Bing, Snowflake, and Amazon
- Early-stage equity (at a very modest, hype-free valuation) with real potential to 10x in the next two years
Tech Stack Deep Dive
This is our current stack, but we're pragmatic — if something better fits the problem, we'll use it.
Backend & Data
- Python 3.12+ - Core services (FastAPI)
- PostgreSQL - Primary datastore
- AWS SQS - Async job processing
- SQLAlchemy 2.0 - Async database access
AI & LLM
- Claude (Anthropic) - Core LLM for planning, extraction, evaluation
- Multi-provider adapters - Anthropic, OpenAI-compatible (Cerebras, Fireworks)
- Web search & crawl - Custom pipelines for retrieval and extraction
- Semantic reranking - Result scoring and relevance tuning
Infrastructure
- Kubernetes (EKS) - Container orchestration
- Queue-driven autoscaling - Workers scale to demand
- Terraform - Infrastructure as code
- GitHub Actions - CI/CD
Frontend
- Next.js 15 - React framework
- TypeScript - Dashboard and public site
- Tailwind CSS - Styling
- Vercel - Frontend hosting
- Playwright - E2E testing
Observability
- OpenTelemetry - Metrics and tracing
- Prometheus + Grafana - Dashboards and alerting
- Sentry - Error tracking
- Fluent Bit - Structured log shipping to CloudWatch
- PostHog - Product analytics
Architecture Patterns
- Layered services - Clean separation of routes, services, and data access
- Link graph analysis - PageRank and HITS for crawl prioritization
- DAG execution - Multi-step workflows with dependency resolution
- LLM orchestration - Multi-provider with fallback and retry
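The fallback-and-retry pattern, as a hedged sketch: provider objects and their `complete` method are assumed interfaces here, not a specific SDK.

```python
import time

class ProviderError(Exception):
    """Transient failure from one LLM provider (timeout, rate limit, 5xx)."""

def complete_with_fallback(providers, prompt, retries=2, backoff=0.5):
    """Try each provider in order; retry transient failures with backoff."""
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider.complete(prompt)
            except ProviderError as exc:
                last_error = exc
                if attempt < retries:
                    time.sleep(backoff * (2 ** attempt))  # exponential backoff
        # This provider is exhausted; fall through to the next one.
    raise RuntimeError("all providers failed") from last_error
```

Retrying within a provider before failing over keeps transient blips cheap, while the ordered provider list encodes cost/quality preference.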
Compensation
We believe in transparent, fair compensation. Salary and equity scale with experience and impact potential.
Early Career (0-2 years)
Entry: Strong fundamentals, eager to learn, excited about early-stage chaos
Mid-Level (3-7 years)
Core: Proven ability to ship, comfortable with ambiguity, can own entire features
Senior+ (8+ years)
Leadership: Deep expertise in search/ML/infra, can architect systems, comfortable leading and mentoring
Note: Equity is at a very modest, hype-free valuation. We raised at reasonable terms and aren't playing the Silicon Valley valuation inflation game. Your equity has real potential to 10x+ in the next 2-3 years as we hit product-market fit and scale.
Details
Location
Flexible, with strong preference for being in NYC a few days per month. Remote-first culture, but we believe some in-person time builds better teams. Quarterly offsites to focus and recharge.
Benefits
We take care of our team: unlimited vacation (honor system), company-wide holiday closures (Christmas week + Fourth of July week), flexible hybrid schedule, comprehensive medical/dental, 401(k) match, generous parental leave (6 months birthing / 3 months non-birthing), conference travel support, and team offsites twice a year.
Ready to Apply?
Send us your resume, a note about why you're interested, and something you've built that you're proud of.
We care about what you can do and how you think, not credentials.
Questions?
"Is this more ML or infra?"
Yes. Both. And product. And ops. That's the point.
"What's the tech stack?"
Python (FastAPI) and TypeScript (Next.js). PostgreSQL, SQS for job processing, Kubernetes for orchestration, Terraform for infrastructure. We build our own search, crawl, and extraction pipelines with LLMs in the loop. We care more about picking the right tool than religious adherence to specific tech.
"Do I need a PhD?"
No. Some of our employees have advanced degrees, but what we care about is your ability to ship and solve problems.
"What if I haven't done search before?"
That's fine. If you've built scalable systems or worked on ML infrastructure, you'll figure it out.
Zipf is an equal opportunity employer. We value diversity and are committed to creating an inclusive environment for all employees.