Housecarl
Why Arbiter

When your sources disagree

You have twelve tabs open. The analyst note says one thing, the filing implies another, the insider thread on Reddit is sure of a third — and the most confident voice in the room is the one with the least to lose if it's wrong. You've read everything twice. You know more than when you started, and you're less sure.

This is the moment most research dies: not from too little information, but from conflicting information and no honest way to settle it.

What people do instead — and why it fails

Faced with sources that disagree, almost everyone falls back on one of three moves: trust the most recent thing they read, trust the most confident voice, or average everything into a mushy "the truth is somewhere in the middle." All three feel reasonable. None of them is — recency isn't reliability, confidence isn't evidence, and the middle of a fact and a falsehood is just a smaller falsehood.

The failure has a shape: nobody is keeping score of who said each thing, how reliable they've been, when they said it, and whether the agreements are real or just the same rumor wearing five hats.

What Arbiter does about it

Housecarl Arbiter keeps that score, formally, on every investigation:

  • Who's talking matters. A regulator on the record outweighs an anonymous tip. A company defending itself gets discounted for the conflict of interest. You set the trust levels; Arbiter applies them without fatigue or favoritism.
  • When they said it matters. Stale claims quietly lose their grip; fresh, corroborated ones hold theirs. Last quarter's rumor doesn't get an equal vote on this quarter's question.
  • Repetition isn't corroboration. Five outlets citing the same wire story count as one source. Two genuinely independent confirmations count for far more — and Arbiter knows the difference.
  • Disagreement is shown, not smoothed. When the sources truly contradict, you get each version of events that holds together, who backs it, and how strong it is — so you can see exactly why the leading reading leads, and what would have to be true for the other side to win.
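
For readers who think in code, the four ideas above can be sketched in a few lines of Python. This is an illustration only, not Arbiter's actual model: the trust values, the 90-day half-life, the `Claim` fields, and the rule of counting each origin once per reading are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical trust levels and half-life, chosen only for illustration.
TRUST = {"regulator": 0.9, "journalist": 0.6, "insider": 0.4, "subject": 0.3}
HALF_LIFE_DAYS = 90.0


@dataclass(frozen=True)
class Claim:
    source_type: str  # key into TRUST
    origin: str       # where the information ultimately came from
    age_days: float   # how old the claim is
    supports: str     # which reading of events it backs


def weight(claim: Claim) -> float:
    """Trust level discounted by age: the weight halves every HALF_LIFE_DAYS."""
    return TRUST[claim.source_type] * 0.5 ** (claim.age_days / HALF_LIFE_DAYS)


def score_readings(claims: list[Claim]) -> dict[str, float]:
    """Share of total support per reading, counting each origin only once."""
    # Repetition isn't corroboration: keep the strongest claim per (reading, origin).
    best: dict[tuple[str, str], float] = {}
    for c in claims:
        key = (c.supports, c.origin)
        best[key] = max(best.get(key, 0.0), weight(c))

    totals: dict[str, float] = defaultdict(float)
    for (reading, _origin), w in best.items():
        totals[reading] += w

    grand = sum(totals.values())
    if grand == 0:
        return {}
    # Every reading keeps its own share, so the disagreement stays visible.
    return {reading: t / grand for reading, t in totals.items()}


# Five outlets repeating one wire story versus two independent confirmations.
claims = [Claim("journalist", "wire-story", 0, "fraud") for _ in range(5)]
claims += [
    Claim("regulator", "filing", 0, "no_fraud"),
    Claim("insider", "forum-thread", 0, "no_fraud"),
]
scores = score_readings(claims)
```

In this sketch the five wire-story repeats add up to a single journalist-weight vote, so the two independent sources carry the larger share.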

At the end you have a verdict with a confidence number, every claim ranked from strong to refuted, and a pointer to the one piece of missing evidence that would most change the answer. That's your next phone call, chosen instead of guessed.

See the whole thing run on a contested fraud case: the worked example.

Common questions

Isn't this what an AI summary does?

No. A summary compresses the disagreement; it doesn't settle it. Ask an AI to summarize five conflicting sources and you get fluent prose that quietly picks sides without telling you. Arbiter does the opposite: it makes the disagreement explicit, weighs it in the open, and hands you the reasoning along with the verdict. (If your deeper problem is the AI itself sounding sure when it shouldn't, that is covered on its own page.)

Do I have to score the sources myself?

You assign each source a type and a rough trust level — expert, journalist, insider, the subject of the story — and Arbiter does everything downstream. If you're working through Claude or ChatGPT, the assistant drafts those for you and you adjust.

What if the evidence genuinely doesn't settle it?

Then Arbiter says so. Uncertainty is reported as a number, not papered over — and the result tells you which missing piece of evidence would do the most to resolve it. "Not yet knowable, and here's what to check" is a real answer, and sometimes it's the one that saves you.
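
One way to picture "which missing piece would do the most" is a what-if loop: add each candidate piece of evidence in turn and see how far the leading reading's share moves. The sketch below assumes that simple approach. The function names, the example weights, and the candidate list are hypothetical, and nothing here is taken from Arbiter's implementation.

```python
def confidence(totals: dict[str, float]) -> float:
    """Share of total support held by the leading reading (0.0 if no evidence)."""
    total = sum(totals.values())
    return max(totals.values()) / total if total else 0.0


def most_decisive(
    totals: dict[str, float],
    candidates: dict[str, tuple[str, float]],
) -> str:
    """Return the candidate evidence that would shift the leader's share most.

    `candidates` maps a label to (reading it would support, assumed weight).
    """
    leader = max(totals, key=totals.get)
    base = totals[leader] / sum(totals.values())

    def shift(item: tuple[str, tuple[str, float]]) -> float:
        _label, (reading, w) = item
        updated = dict(totals)
        updated[reading] = updated.get(reading, 0.0) + w
        return abs(updated[leader] / sum(updated.values()) - base)

    return max(candidates.items(), key=shift)[0]


# Hypothetical current support and two things you could still go and check.
totals = {"fraud": 0.6, "no_fraud": 1.3}
candidates = {
    "auditor statement": ("fraud", 0.9),
    "second blog post": ("no_fraud", 0.2),
}
next_call = most_decisive(totals, candidates)
```

Here the auditor statement is the better next call: it would move the leading share far more than another post agreeing with what the evidence already suggests.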

What does it cost?

It starts free. Paid plans run from about thirty to a few hundred dollars a month depending on volume — see the pricing page. Weigh that against what's riding on the question you're staring at right now.


Settle it with the evidence weighed. Run your first investigation from Claude or open the console — free, no card required.