The Question That Has Started Showing Up

A specific question is showing up more often in pharma-leadership conversations: should we build AI in-house, or partner with a vendor?

It is a useful procurement question. It is not the best architecture question.

The framing arrives with assumptions. Internal builds offer control but move slowly. Vendor partnerships offer speed but create dependency. Pilots do not scale; platforms do. Compliance is hard to embed. Integration creates complexity. Performance is costly to sustain.

Each of those statements is sometimes true. None is always true.

The problem is that the build-vs-partner frame makes leadership think it is choosing an operating model, when the deeper question is what kind of AI system is being put in front of regulated writers. For regulatory writing, the better question is this:

What does the platform do on behalf of the writer, and what does the writer still have to reconstruct after the fact?

By purpose-built, we do not mean "vendor software with pharma language on the website." We mean an architecture designed around regulated-writing constraints from the start: bounded source sets, visible inference, writer-controlled claims, sentence-level provenance, audit logs, and review workflows that preserve human accountability. A purpose-built platform may be bought, built, or co-developed. The distinction is whether the system was designed around the regulated-writing task — not who delivered it.

The Architecture Behind the Procurement Question

If a writing team uses a generic LLM wrapper — internal or external — to generate regulatory text without source constraints, claim-level provenance, or visible inference boundaries, the system pushes the verification burden downstream. The draft sounds finished. The writer then has to reconstruct what was sourced, what was inferred, and what requires judgment.

That is not a writing acceleration problem. It is a review-liability problem.

By generic LLM wrapper, we mean a chat or document-generation interface placed on top of a frontier model without deep regulatory-writing controls: no task-specific source binding, no claim-level provenance, no statement-type distinction, and no regulatory review workflow built into the generation process. That definition applies whether the wrapper was built internally by a pharma platform team or delivered by an external vendor.

If a platform binds each task to writer-approved sources, surfaces the difference between evidence description and regulatory conclusion, and produces a provenance trail by default, the architecture changes. The platform does the constraint work. The writer does the judgment work.

This is adjacent to the concern behind FDA's January 2025 draft guidance on using AI to support regulatory decision-making for drug and biological products: the agency wants sponsors to define the AI model's context of use, understand the model's role in the regulatory question, and establish credibility proportionate to risk. In regulated writing, that same logic points to a practical question: where does inference enter the document, and can the writer see and control it?

Build vs partner does not tell you which architecture you have.

A strong internal build can be purpose-built. A weak vendor platform can be generic. A strong vendor platform can be purpose-built. The distinction is not who built it. The distinction is whether the system was designed around the regulated-writing task.

Why the Binary Framing Persists

The build-vs-partner framing persists because it maps neatly to existing procurement and operating models.

Vendors naturally emphasise speed, specialisation, and lower internal lift. Internal teams naturally emphasise control, governance, and enterprise integration. Both frames are useful. Neither is sufficient.

The question that matters for regulated writing cuts across both:

Does the architecture constrain inference at generation time, or does it rely on human review to catch unsupported claims after the fact?

A well-architected internal build can constrain inference. A well-architected vendor platform can too. A poorly architected version of either passes the verification burden back to the regulatory writer, who is then signing off on output engineered to sound complete.

The choice that matters is an architectural one, not merely a procurement one.

The Three Diagnostic Questions

Whether the answer is build, partner, or hybrid, three questions determine whether the platform is fit for regulated writing.

1. Can the writer bind the task to an approved source set? The system should not rely on the model "probably" staying within scope. The writer should be able to define which protocol, SAP, TFLs, CSR, label, guidance, or prior submission the task is allowed to use. Unsupported claims should be blocked or flagged before they enter the draft. The output is bounded by what the writer chose, not by the model's training data, prior submissions the writer did not authorise, or the open internet.

But source grounding is necessary, not sufficient. A sentence can be perfectly sourced and still make the wrong regulatory argument. That is why the platform should constrain evidence use while leaving judgment with the writer.
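To make the first question concrete, here is a minimal sketch of source binding, not a description of any real product. The names Claim, BoundTask, and screen, and the source identifiers, are hypothetical. It assumes each generated claim arrives already tagged with the source it cites, and it only checks that the source is on the writer's approved list. It does not check whether the claim is a sound regulatory argument, which stays with the writer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    text: str
    source_id: Optional[str]  # None means the model cited no source at all


@dataclass(frozen=True)
class BoundTask:
    """A drafting task restricted to sources the writer approved (hypothetical)."""
    approved_sources: frozenset

    def screen(self, claims):
        """Split claims into (accepted, flagged) before they reach the draft."""
        accepted, flagged = [], []
        for claim in claims:
            if claim.source_id in self.approved_sources:
                accepted.append(claim)
            else:
                # Unapproved or missing source: hold for the writer, do not draft.
                flagged.append(claim)
        return accepted, flagged


# Illustrative identifiers only; a real system would use document-control IDs.
task = BoundTask(approved_sources=frozenset({"protocol-v3", "sap-v2"}))
claims = [
    Claim("The primary endpoint was change from baseline at week 12.", "protocol-v3"),
    Claim("Similar results were seen in a prior submission.", "prior-nda"),
    Claim("The effect is well established.", None),
]
accepted, flagged = task.screen(claims)
```

In this sketch the second and third claims are flagged: one cites a submission the writer never authorised, the other cites nothing.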

2. Does the system distinguish evidence description from regulatory conclusion? "The study showed X" is not the same as "X supports a favorable benefit-risk conclusion." The first is description. The second is judgment. A purpose-built system should make that boundary visible, especially when the model is moving from reported data to interpretation. The platform should make it difficult for conclusions to appear in the register of neutral description without writer confirmation.

A platform that allows conclusions to be written in the register of plain reporting puts the writer in the position of having to find the judgment after it has been disguised as description.
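One way to picture the second question is as a statement-type gate. The sketch below is illustrative and assumes the platform has already labelled each sentence as description or conclusion; how that labelling is done is out of scope here. StatementType, Sentence, and needs_writer_confirmation are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class StatementType(Enum):
    DESCRIPTION = "description"  # reports what the source says
    CONCLUSION = "conclusion"    # interprets or argues from the data


@dataclass
class Sentence:
    text: str
    kind: StatementType
    writer_confirmed: bool = False


def needs_writer_confirmation(sentence):
    """Conclusions stay pending until the writer explicitly confirms them."""
    return sentence.kind is StatementType.CONCLUSION and not sentence.writer_confirmed


draft = [
    Sentence("The study showed X.", StatementType.DESCRIPTION),
    Sentence("X supports a favorable benefit-risk conclusion.", StatementType.CONCLUSION),
]
pending = [s.text for s in draft if needs_writer_confirmation(s)]
```

Here only the benefit-risk sentence is held back. The descriptive sentence passes through, and the judgment remains visibly the writer's to make.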

3. Does every claim have provenance and an audit trail? A regulatory writer should be able to show where the sentence came from: source, page, table, figure, prompt, retrieved passage, and generation step. The audit record is not only a compliance artifact. It changes review behaviour. A reviewer can challenge the source, the interpretation, or the conclusion directly instead of trying to reconstruct where the sentence came from.
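A provenance record for the third question might look something like the sketch below. The field names mirror the list above (source, page, table, figure, prompt, retrieved passage, generation step); the class names, the sentence_id key, and the in-memory log are assumptions made for illustration. A production audit trail would need durable, access-controlled storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    sentence_id: str          # hypothetical key linking the record to a draft sentence
    source: str
    page: Optional[int]
    table: Optional[str]
    figure: Optional[str]
    prompt: str
    retrieved_passage: str
    generation_step: int


class AuditLog:
    """Append-only, in-memory log (illustrative; not persistent)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def trace(self, sentence_id):
        """Return every record behind a sentence, in the order it was logged."""
        return [r for r in self._records if r.sentence_id == sentence_id]


log = AuditLog()
log.append(ProvenanceRecord(
    sentence_id="s-001",
    source="CSR",
    page=87,
    table="Table 14.2.1",
    figure=None,
    prompt="Summarise the primary efficacy result.",
    retrieved_passage="(passage text retrieved from the source)",
    generation_step=1,
))
history = log.trace("s-001")
```

With a record like this, a reviewer can go from sentence s-001 straight to the cited table rather than reconstructing where the sentence came from.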

These three questions are the architecture diagnostic. They apply equally to a vendor SaaS, an internal build, or a hybrid system. Two platforms can both call themselves "AI for regulatory writing" and answer these three questions completely differently.

What the Third Option Looks Like

The third option — purpose-built — is not a procurement category. It is an architectural posture.

A purpose-built regulatory-writing platform can be bought, built, or co-developed. What makes it purpose-built is that its defaults are designed for regulated writing:

  • source sets are bounded by the writer;
  • unsupported claims are blocked or flagged;
  • evidence description and regulatory conclusion are treated differently;
  • provenance is generated by default;
  • the audit ledger is a default, not an enterprise add-on;
  • review happens against sources, not against fluent text alone.

A generic LLM wrapper has the opposite default. It begins with a permissive model and asks the writer, through prompts, SOPs, and review, to add the constraints later. When the writer forgets a constraint, the model produces the consumer-grade default: fluent, plausible, and potentially unsupported.

The architectural difference determines whether the platform reduces the writer's burden or merely moves it downstream.

The Hybrid Model Is Often the Real Shape

In practice, the winning model may be hybrid. A pharma company may keep identity, access control, data governance, model-risk management, and enterprise integration internal — while using a purpose-built regulatory-writing platform for the application layer.

That is still not a build-vs-partner decision. It is an architecture decision. The internal stack handles the layers the pharma company has the strongest claim to own (data residency, identity, governance). The purpose-built layer handles the parts where being designed for the regulated-writing task matters (source binding, inference surfacing, provenance, audit).
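The layer split described here can be written down as a simple ownership map. This is only a sketch of one possible hybrid arrangement, using the layers named in this section. Owner, LAYER_OWNERS, and layers_for are hypothetical names, and a given organisation may draw the line differently.

```python
from enum import Enum


class Owner(Enum):
    INTERNAL = "internal stack"
    PURPOSE_BUILT = "purpose-built layer"


# One possible assignment, taken from the layers named in the text.
LAYER_OWNERS = {
    "identity": Owner.INTERNAL,
    "access control": Owner.INTERNAL,
    "data governance": Owner.INTERNAL,
    "data residency": Owner.INTERNAL,
    "model-risk management": Owner.INTERNAL,
    "enterprise integration": Owner.INTERNAL,
    "source binding": Owner.PURPOSE_BUILT,
    "inference surfacing": Owner.PURPOSE_BUILT,
    "provenance": Owner.PURPOSE_BUILT,
    "audit": Owner.PURPOSE_BUILT,
}


def layers_for(owner):
    """List the layers assigned to one owner, in declaration order."""
    return [layer for layer, assigned in LAYER_OWNERS.items() if assigned is owner]


purpose_built_layers = layers_for(Owner.PURPOSE_BUILT)
```

Writing the map out layer by layer turns "build or partner?" into a set of separate decisions, one per layer.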

A leader asking "build or partner?" frames the choice as either-or. A leader asking "where does each layer need to be purpose-built?" gets to a more realistic answer.

The Question Pharma Leadership Should Be Asking

If the AI conversation in your organisation is framed only as build-or-partner, the framing is already too narrow.

The better question is whether the platform — whoever owns it — constrains source use, surfaces inference, preserves writer judgment, and produces provenance by default. If the answer is yes, the system is working with your regulatory writers. If the answer is no, the system is shifting the verification burden back to them. In that case, the organisation is paying for speed at generation and repaying the debt during review.

Our interpretation of FDA's AI direction is this: the issue is not whether AI is used. It is whether the sponsor can define the AI's role, control where inference enters, and take responsibility for the output. Ingrid Witherell, Asthra's Regulatory Writing Partner, wrote a position paper on this last week. Its argument: where inference enters the record, and who owns it, is the question regulators care about.

Build vs partner doesn't answer that question. Architecture does.

What the Architecture Diagnostic Actually Asks

For pharma leaders making AI decisions in 2026, the better starting point is not the procurement form.

It is the architecture diagnostic:

  • Can writers bind the task to approved sources?
  • Can the system distinguish description from conclusion?
  • Can every claim be traced back to source and generation history?

A platform that answers yes can be internal, vendor-led, or hybrid.

A platform that answers no will still produce fluent drafts. But the writer will be left to reconstruct the judgment after the fact.

That is the difference between AI that accelerates regulated writing and AI that merely accelerates review debt.