Overview
Much of the technical work at a VC fund is building glue between tools: connecting meeting notes to your CRM, data providers to your warehouse, and portfolio data to dashboards. Some integrations exist out of the box, but many require custom code. This chapter covers common integration patterns, validation strategies, webhook handling, and rate limits.
Common Integration Patterns
The integrations you build fall into a few common patterns.
Tool-to-tool glue: Connecting SaaS tools your team uses. Examples:
- Meeting transcription tool (Granola, Otter) → CRM (Attio) to automatically log conversations
- CRM → data warehouse for analysis
- Email → CRM to track outreach
- Calendar → CRM to log meetings
Validating API Responses
The biggest source of problems in integrations is trusting external APIs to return what you expect. API schemas change. Vendors return errors in unexpected formats. Required fields are sometimes null. Data types don’t match documentation. Never trust external APIs. Validate everything.
Use validation libraries
For TypeScript: Zod. For Python: Pydantic. These libraries let you define schemas for your data and automatically validate objects against them.
Webhook Handling
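To make the schema-validation advice from the previous section concrete, here is a minimal Pydantic sketch. The `Company` model, its fields, and the `parse_company` helper are hypothetical and not taken from any vendor's schema; adapt them to the payload you actually receive.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Company(BaseModel):
    """Hypothetical shape of a company record returned by a data vendor."""

    id: str
    name: str
    # Assumption: these fields may be missing or null in vendor responses.
    domain: Optional[str] = None
    employee_count: Optional[int] = None


def parse_company(payload: dict) -> Optional[Company]:
    """Return a validated Company, or None if the payload does not match."""
    try:
        return Company(**payload)
    except ValidationError as exc:
        # Log and skip instead of letting bad data flow into the CRM or warehouse.
        print(f"Rejected vendor payload: {exc}")
        return None
```

Calling `parse_company({"id": "c_1"})` returns `None` because the required `name` field is missing. Whether to skip, quarantine, or fail loudly on a rejected record is a design choice that depends on the integration.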
Many vendors (especially CRM systems) provide webhooks: they call your HTTP endpoint when events happen (company updated, deal stage changed, meeting logged). This is more efficient than polling their API constantly.
Setting up webhooks
You need:
- An HTTPS endpoint the vendor can reach (use webhook.site for local development with dummy data)
- To register your endpoint with the vendor (usually through their dashboard)
- To handle webhook verification (vendors send a signature to prove the request came from them)
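The verification step in the list above can be sketched as follows. This assumes the vendor signs the raw request body with HMAC-SHA256 using a shared secret and sends the hex digest in a header; that scheme is common but not universal, so check your vendor's documentation for the exact algorithm, encoding, and header name. The function name and parameters are illustrative.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if signature_header equals the HMAC-SHA256 hex digest of body.

    Assumption: the vendor signs the raw, unparsed request body with a shared secret.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

Verify against the raw bytes you received, before any JSON parsing, and reject the request (for example with a 401) when the check fails.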
Rate Limits and API Costs
External APIs have rate limits. PitchBook might allow 100 requests per minute. People Data Labs might allow 500 requests per day. Exceed these and you get 429 errors or get blocked.
API costs
Beyond rate limits, many vendors charge per request or per entity returned. This changes how you think about API usage:
- Only request data you actually need
- Cache responses so you don’t request the same data repeatedly
- Batch requests when possible
- Validate input before making API calls (don’t waste money on requests that will fail)
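As one way to act on the caching advice above, here is a small in-memory cache with a time-to-live. It is a single-process sketch; the `TTLCache` class, the `fetch` callable, and the one-hour default are all assumptions for illustration, and a production setup might use a shared store instead.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache results of a paid lookup so repeated keys do not trigger new requests."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # hypothetical function that calls the vendor API
        self._ttl = ttl_seconds  # how long a cached answer is considered fresh
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh entry: no API call, no cost
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

The right TTL depends on how quickly the vendor's data goes stale and how much each request costs.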
When a request is rate limited, retry with exponential backoff:
- First retry: wait 1 second
- Second retry: wait 2 seconds
- Third retry: wait 4 seconds
- Fourth retry: wait 8 seconds
- Give up after 5 attempts
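The retry schedule above can be sketched as a small helper. `RetriableError` is a hypothetical exception your own request code would raise for errors worth retrying (such as a 429), and `request` is any zero-argument callable; neither comes from a specific library.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RetriableError(Exception):
    """Hypothetical marker for errors worth retrying, e.g. an HTTP 429."""


def backoff_delays(max_attempts: int = 5, base_seconds: float = 1.0) -> List[float]:
    """Delays between attempts, doubling each time: 1, 2, 4, 8 for five attempts."""
    return [base_seconds * 2 ** i for i in range(max_attempts - 1)]


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on RetriableError and re-raising after the last attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays(max_attempts, base_seconds)
    for attempt in range(max_attempts):
        try:
            return request()
        except RetriableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the schedule can be tested without actually waiting.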
Error Handling
Not all errors should be retried. Some are permanent, others are transient.
Categorize errors
- Retriable: 429 (rate limit), 503 (service unavailable), 504 (timeout), network errors. Retry with backoff.
- Non-retriable: 400 (bad request), 401 (unauthorized), 403 (forbidden), 404 (not found). Fix the problem or skip the request.
- Context-dependent: 500 (server error) might be temporary or might indicate a vendor bug. Retry a few times, but not indefinitely.
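The three categories above translate into a small lookup. This sketch covers HTTP status codes only; network errors have no status code and would need to be caught as exceptions in your HTTP client. The `RetryDecision` names are illustrative, and treating unlisted codes as non-retriable is an assumption, not something the categories above specify.

```python
from enum import Enum


class RetryDecision(Enum):
    RETRY = "retry with backoff"
    LIMITED = "retry a few times, then give up"
    FAIL = "do not retry"


RETRIABLE = frozenset({429, 503, 504})
NON_RETRIABLE = frozenset({400, 401, 403, 404})


def classify_status(status_code: int) -> RetryDecision:
    """Map an HTTP status code to a retry decision."""
    if status_code in RETRIABLE:
        return RetryDecision.RETRY
    if status_code == 500:
        # Could be a transient failure or a vendor bug, so cap the retries.
        return RetryDecision.LIMITED
    if status_code in NON_RETRIABLE:
        return RetryDecision.FAIL
    # Assumption: anything not listed is treated conservatively as non-retriable.
    return RetryDecision.FAIL
```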
Working with LLM Providers
If you’re building features that use LLMs:
Use an AI gateway for failover. LLM providers have frequent outages and rate limits. Services like Vercel’s AI SDK or LiteLLM provide automatic failover between providers.
Handle streaming responses. When building UIs, you’ll want LLM APIs to stream tokens incrementally.
Set timeouts. LLM requests can take 30+ seconds. Set 60-90 second timeouts so slow requests don’t block indefinitely.
Authentication Patterns
API tokens
Most vendors provide REST APIs authenticated with bearer tokens. Store tokens securely:
- Use environment variables instead of hardcoding tokens in code
- Use a secrets manager for production
- Never commit tokens to git
- Rotate periodically
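A minimal sketch of the environment-variable approach from the list above. The variable name `CRM_API_TOKEN` and both helper functions are hypothetical; the `Authorization: Bearer` header is the standard form for bearer tokens, but confirm the exact header your vendor expects.

```python
import os


def load_token(var_name: str = "CRM_API_TOKEN") -> str:
    """Read an API token from the environment and fail fast if it is missing."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        # Failing at startup beats sending unauthenticated requests later.
        raise RuntimeError(f"Set the {var_name} environment variable")
    return token


def auth_headers(token: str) -> dict:
    """Build the bearer-token header to attach to each request."""
    return {"Authorization": f"Bearer {token}"}
```

In production the environment variable would typically be populated by your secrets manager rather than set by hand.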
Orchestration tools:
- Dagster: Data orchestration, can schedule imports and transformations
- Airflow: Workflow orchestration, similar to Dagster