Competitive analysis, technology due diligence, market landscape mapping. The output you actually want — a structured report with citations, resolved contradictions, and a clear picture of what's out there — requires dozens of searches, not one. One agent can't hold all of it. A coordinated team can.
A single agent with web access will search, read, and summarize — but it will miss things, lose track of earlier sources, and hit context limits long before it covers the topic. The real problem is parallelism and synthesis: how do you run 50 searches at once, extract consistently from each, and then combine the results without dropping half of them?
The workflow decomposes a research brief into parallel search agents, normalizes their outputs into a shared workspace, and hands everything to a synthesis agent that produces the final deliverable. The dag topology handles the fan-out; map_reduce handles aggregation at scale.
An agent reads the research brief — say, "map the vector database landscape: key vendors, pricing models, differentiation, recent funding, customer segments" — and decomposes it into a set of specific search queries grouped by topic. This decomposition is itself an LLM task: it requires understanding the domain and knowing which questions are worth asking.
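One way to picture the output of this step is as a flat list of queries that gets grouped by topic before fan-out. The sketch below is illustrative only: `SearchQuery`, `group_by_topic`, and the example queries are hypothetical names and data, not part of Epsilon, and in a real run the list would come from the LLM rather than being hard-coded.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchQuery:
    """One search task produced by the decomposition step (hypothetical shape)."""
    topic: str
    text: str


def group_by_topic(queries: list[SearchQuery]) -> dict[str, list[str]]:
    """Group query strings by topic, keeping the order they were produced in."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for query in queries:
        grouped[query.topic].append(query.text)
    return dict(grouped)


# Stand-in for what a model might return for the vector database brief.
example_queries = [
    SearchQuery("pricing", "Pinecone pricing tiers"),
    SearchQuery("pricing", "Weaviate cloud pricing"),
    SearchQuery("funding", "Qdrant funding rounds"),
]

plan = group_by_topic(example_queries)
```

Each entry in `plan` would then map to one or more search agents in the next step.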
Each query spawns a dedicated search agent. For a competitive analysis, you might have 8–12 agents running simultaneously: one per competitor, plus agents covering pricing, open-source alternatives, recent press coverage, and analyst takes. Each agent runs independently, writes its raw findings to the shared workspace, and terminates.
Each search agent in the parallel fan-out fetches pages, extracts relevant content, and normalizes it into a consistent schema: source URL, publication date, key claims, named entities, confidence level. The step contract enforces the schema, so downstream steps receive structured data, not raw HTML.
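A minimal sketch of what such a schema and its validation could look like. The field names follow the list above, but the `Finding` class, the `normalize` helper, and the choice of a numeric 0.0 to 1.0 confidence scale are assumptions made for illustration; Epsilon's actual step contract may look different.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    """Normalized record for one source (hypothetical shape)."""
    source_url: str
    publication_date: date
    key_claims: tuple[str, ...]
    named_entities: tuple[str, ...]
    confidence: float  # assumption: numeric scale from 0.0 to 1.0


def normalize(raw: dict) -> Finding:
    """Validate a raw extraction dict and coerce it into a Finding.

    Raises ValueError (or KeyError for a missing field) instead of letting
    malformed data reach downstream steps.
    """
    url = str(raw["source_url"]).strip()
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not an absolute URL: {url!r}")
    confidence = float(raw["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Finding(
        source_url=url,
        publication_date=date.fromisoformat(raw["publication_date"]),
        # Drop empty claims and trim whitespace.
        key_claims=tuple(c.strip() for c in raw["key_claims"] if c.strip()),
        # Deduplicate entities and sort them for stable output.
        named_entities=tuple(sorted(set(raw["named_entities"]))),
        confidence=confidence,
    )
```

Rejecting bad records at this boundary means the reconciliation step never has to guess whether a date or URL is usable.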
A dedicated reconciliation agent reads all the normalized per-source outputs from the shared workspace and identifies contradictions, for example when Pinecone's pricing page says one thing and a three-month-old TechCrunch article says another. The agent flags discrepancies, assigns confidence weights based on source recency and authority, and produces a reconciled fact set.
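The weighting idea can be sketched in a few lines. Everything here is an assumption made for illustration: the `Claim` shape, the 0.0 to 1.0 authority score, and the exponential decay with a 180-day half-life are one plausible scheme, not the reconciliation agent's actual logic.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    subject: str      # what is being asserted, e.g. "vendor_x.starter_price"
    value: str
    source_url: str
    published: date
    authority: float  # assumption: 0.0 to 1.0, higher for primary sources


def weight(claim: Claim, today: date, half_life_days: float = 180.0) -> float:
    """Authority discounted by age: the weight halves every half_life_days."""
    age_days = max((today - claim.published).days, 0)
    return claim.authority * 0.5 ** (age_days / half_life_days)


def reconcile(claims: list[Claim], today: date) -> dict[str, dict]:
    """Pick the highest-weight value per subject and flag disagreements."""
    by_subject: dict[str, list[Claim]] = {}
    for claim in claims:
        by_subject.setdefault(claim.subject, []).append(claim)

    reconciled = {}
    for subject, group in by_subject.items():
        best = max(group, key=lambda c: weight(c, today))
        reconciled[subject] = {
            "value": best.value,
            "source_url": best.source_url,
            # True when the sources for this subject do not all agree.
            "contradicted": len({c.value for c in group}) > 1,
        }
    return reconciled
```

Keeping the `contradicted` flag in the output lets the final report surface the disagreement instead of silently picking a winner.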
The synthesis agent reads the reconciled fact set and writes the final report: executive summary, vendor-by-vendor breakdown, comparison tables, key themes, gaps, and recommendations. Every claim links back to a source in the workspace. The output is a structured document, not a chat response. This step runs as an agent plus a QA loop.
Running 12 searches in parallel isn't just a loop with asyncio. It's a coordination problem. Which agents are done? Which failed? What do downstream steps get to see? Epsilon's dag topology handles all of that. You write the search agent once; Epsilon fans it out and collects the results.
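To make the coordination problem concrete, here is roughly the bookkeeping you would otherwise write by hand: track which queries succeeded, which failed, and retry the failures. This is a simplified sketch using Python's standard `concurrent.futures`, not Epsilon's implementation; the `fan_out` function and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def fan_out(
    search: Callable[[str], str],
    queries: list[str],
    max_workers: int = 12,
    max_attempts: int = 2,
) -> tuple[dict[str, str], dict[str, str]]:
    """Run search(query) for every query in parallel.

    Returns (results, failures): results maps query -> output, failures maps
    query -> last error message. Failed queries are retried until they
    succeed or max_attempts passes have been made.
    """
    results: dict[str, str] = {}
    failures: dict[str, str] = {}
    pending = list(queries)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _attempt in range(max_attempts):
            if not pending:
                break
            futures = {pool.submit(search, q): q for q in pending}
            pending = []
            for future in as_completed(futures):
                query = futures[future]
                try:
                    results[query] = future.result()
                    failures.pop(query, None)  # a retry succeeded
                except Exception as exc:
                    # Record the error and queue the query for the next pass.
                    failures[query] = str(exc)
                    pending.append(query)
    return results, failures
```

Even this small version has to decide what counts as done, how often to retry, and what to hand downstream when some agents never succeed, which is the part a topology is meant to own.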
Every agent writes its findings to a shared workspace directory. The reconciliation and synthesis agents read from that same directory. There is no context window to manage, no prompt engineering to pass findings between steps. The filesystem is the message bus.
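A minimal sketch of the filesystem-as-message-bus pattern, assuming one JSON file per agent in the workspace directory. The file layout and the `write_finding` and `read_findings` helpers are hypothetical; the source does not specify how Epsilon lays out its workspace.

```python
import json
import os
import tempfile
from pathlib import Path


def write_finding(workspace: Path, agent_id: str, finding: dict) -> Path:
    """Write one agent's findings as JSON, atomically.

    The data goes to a temporary file first and is then renamed, so a reader
    listing *.json never sees a half-written file.
    """
    workspace.mkdir(parents=True, exist_ok=True)
    target = workspace / f"{agent_id}.json"
    fd, tmp_name = tempfile.mkstemp(dir=workspace, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(finding, handle)
    os.replace(tmp_name, target)
    return target


def read_findings(workspace: Path) -> dict[str, dict]:
    """Load every finished findings file, keyed by agent id."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(workspace.glob("*.json"))
    }
```

A search agent would call `write_finding` once before terminating; the reconciliation and synthesis agents would call `read_findings` on the same directory.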
For large research projects — due diligence across 200 companies, patent landscape across 500 filings — switch from dag to map_reduce. Epsilon distributes the work across worker pools, aggregates at each level of the tree, and produces the same structured output regardless of input size.
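The tree-shaped aggregation can be illustrated with a small reducer that combines results in groups, level by level, until one remains. This is a sketch of the general technique only: `tree_reduce` and its `fan_in` parameter are hypothetical, and the map phase (whatever produced `items`) and the worker pools are left out.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def tree_reduce(
    items: Sequence[T],
    combine: Callable[[Sequence[T]], T],
    fan_in: int = 10,
) -> tuple[T, int]:
    """Aggregate items level by level; return (result, number_of_levels).

    Each level splits the current list into groups of at most fan_in items
    and replaces every group with combine(group). A single input item is
    returned unchanged with zero levels.
    """
    if not items:
        raise ValueError("nothing to aggregate")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")

    level = list(items)
    levels = 0
    while len(level) > 1:
        level = [combine(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
        levels += 1
    return level[0], levels
```

Because no single `combine` call ever sees more than `fan_in` inputs, the per-step workload stays bounded as the item count grows. With 200 items and a fan-in of 6, for instance, this takes three levels (200 to 34 to 6 to 1); that fan-in is an arbitrary choice for illustration, not a documented Epsilon setting.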
The research brief becomes the task. Epsilon handles the decomposition, fan-out, and synthesis. You get a structured document with citations.
    # competitive analysis: vector database landscape
    $ epsilon runs create --topology dag \
        --task "Competitive analysis: vector database market. Cover Pinecone, Weaviate, Qdrant, Milvus, pgvector. Pricing, differentiation, funding, customer segments." \
        --implementation python:research_workflow.py:run

    run_id:    r-9e2f7d
    topology:  dag
    status:    running
    step 1/5   decompose          complete (12 search queries)
    step 2/5   search_agents      running (12 parallel)

    # check progress mid-run
    $ epsilon runs get r-9e2f7d
    step 2/5   search_agents      complete (11/12 succeeded, 1 retried)
    step 3/5   extract_normalize  complete
    step 4/5   reconcile          running
    step 5/5   synthesize         queued

    # large-scale research: switch to map_reduce
    $ epsilon runs create --topology map_reduce \
        --task "Due diligence: 200 climate-tech startups from Crunchbase export" \
        --implementation python:research_workflow.py:run

    run_id:       r-1a9b3c
    topology:     map_reduce
    status:       running
    workers:      20
    items:        200
    aggregation:  3 levels