The Inference Gap: Why the 'AI Bank' Collides with CJEU C-184/20

Aggregated bank data lets AI infer Article 9 categories (CJEU C-184/20). Why sensitive inferences belong before model training, not in a downstream audit.

Late last week, on a public LinkedIn thread about the “AI bank”, a board member of a German direct bank used a phrase that stuck with me: a new species of bank is forming, one that fuses online-banking prices with branch-grade advice through AI. The thread was lively, the optimism genuine. So I asked a narrow question. Where in this architecture do you classify data from which special categories under Article 9 GDPR can be inferred, in the sense of the Court of Justice ruling C-184/20? The bank answered honestly, and the answer is the reason for this article: that point had “not been addressed explicitly at the legal-operational level.”

That sentence is not an admission of wrongdoing. It is a precise description of where most AI-banking roadmaps have a blind spot. The gap between the new “AI bank” category and a court ruling that is now almost four years old is not a survey gap and not a consent gap. It is an architecture decision that almost everyone postpones to the audit, where it is already too late.

The ruling almost nobody has operationalised

On 1 August 2022, the Court of Justice of the European Union (CJEU, the EU’s highest court for interpreting Union law) decided case C-184/20, ECLI:EU:C:2022:601. The case was about a Lithuanian anti-corruption declaration, not about banking. The reasoning, however, reaches every aggregated dataset in Europe.

The court ruled that data from which special categories within the meaning of Article 9 GDPR can be derived by an intellectual operation, by combination or deduction, fall under the protection of Article 9 themselves. The source data does not have to be health data, religious data, or trade-union data. The mere possibility of inferring such a category is enough to trigger the regime: processing prohibited by default, no comfortable reliance on legitimate interest, in many cases explicit consent under Article 9(2) required, and a data protection impact assessment for large-scale processing.

Aggregated bank data meets that threshold almost by design. Three examples make the mechanism concrete. Recurring pharmacy charges at the start of every week, paired with a quarterly payment to a named medical practice, signal a chronic condition, which is health data. A standing donation to a religious organisation, or recurring union dues, reads directly as religious affiliation or trade-union membership. A location profile assembled from card usage at a clinic, a counselling centre, or a particular neighbourhood reveals living circumstances and, in some constellations, data on sex life. None of those payments is itself a special category. The inferred profile is.
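A minimal Python sketch shows how little it takes to get from ordinary transactions to an inferable category. The merchant categories, the mapping, and the `Transaction` type are hypothetical and chosen only for illustration; they are not a legal taxonomy and not taken from any bank's system.

```python
from dataclasses import dataclass

# Hypothetical mapping from a merchant category to the Article 9 category
# that could be inferred from it. Illustrative only, not a legal taxonomy.
INFERABLE_CATEGORY = {
    "pharmacy": "health",
    "medical_practice": "health",
    "religious_organisation": "religious_belief",
    "trade_union": "trade_union_membership",
}


@dataclass(frozen=True)
class Transaction:
    merchant_category: str
    amount_eur: float


def inferable_categories(transactions):
    """Return the Article 9 categories a transaction history could reveal."""
    return {
        INFERABLE_CATEGORY[t.merchant_category]
        for t in transactions
        if t.merchant_category in INFERABLE_CATEGORY
    }


# Toy history: no single payment is a special category in itself.
history = [
    Transaction("pharmacy", 23.40),
    Transaction("supermarket", 61.15),
    Transaction("trade_union", 18.00),
]
categories = inferable_categories(history)
```

A real classifier would have to look at recurrence, counterparties, and location as well. The point of the sketch is only that the lookup happens on the aggregate, not on any individual payment.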

Why personalised AI advice deepens the exposure

The standard reassurance from the AI-banking camp is that better models mean better service. On the substance, that reassurance misses the point.

Highly personalised advice is, by construction, an aggregation engine. The value proposition is precisely that the system reasons across the whole picture: salary, fixed costs, savings goals, spending patterns, the mortgage, the planned car. The denser that reasoning, the higher the inference depth, and the more reliably the model crosses the C-184/20 threshold. The feature and the legal exposure grow on the same axis. You cannot turn up the personalisation while turning down the inference, because they are the same dial.

This is the same territory I covered when OpenAI shipped its personal-finance feature in mid-May. There, too, aggregated account data meets the Article 9 threshold the moment a model reasons across it. A supervised European bank is in fact the stronger fiduciary anchor here, with a banking licence and a regulator, where a cloud AI provider has neither. But that fiduciary anchor raises the bar: supervised institutions are precisely the ones a data protection authority examines first, so the inference doctrine binds the bank with full force.

Why this lands now

Two clocks are running at once, and they are not aligned with the audit calendar.

The first is competitive. The board member’s “new species” framing is a response to deposit pressure from Trade Republic, BBVA, and J.P. Morgan’s Chase entering the German market. The bank that fuses a low-cost model with credible, AI-assisted advice first wins a structural argument. That race rewards speed, and speed is exactly what pushes the inference question to the end of the build.

The second clock is regulatory, and it points the other way. Regulation (EU) 2024/1689, the EU AI Act, adopted on 13 June 2024, becomes generally applicable on 2 August 2026, with a longer transition for some high-risk systems. Article 16 sets out provider obligations and Article 17 mandates a quality-management system, with Article 17(4) addressing financial institutions explicitly. Both require documented procedures before a system is placed on the market, not retrofitted after an incident. One obligation is already live: Article 4 on AI literacy has applied since 2 February 2025. Understanding where your pipeline infers sensitive categories is no longer optional knowledge. It is a legal duty that took effect more than a year ago. My article on what the EU AI Act means for the Mittelstand in 2026 sets out the broader timeline, but the inference point sits underneath all of it.

The escalation tempo is on record, too. The US class action filed against OpenAI in the Southern District of California on 13 May 2026 turned aggregated conversation and financial data into a litigation subject within a single week. That is how fast this territory moves from product launch to courtroom.

Architecture as the answer

The gap does not close with a sharper survey question. It closes with a build decision, and the decision is sequential.

The mechanism is classify-before-training. Before any feature reaches the model, transaction signals are tagged at the source: “pharmacy”, “NGO donation”, “location cluster home/clinic”. Those tags then enter a feature-selection gate as block tags. A feature carrying a block tag does not pass into the training set without an explicit, documented decision. Sensitive inferences are made visible and steerable at the entrance, before the model learns them, rather than discovered in a downstream audit when they are already baked in and hard to reverse.
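As a rough sketch, the gate can be expressed in a few lines of Python. The `Feature` and `Decision` types, the feature names, and the sample decision are assumptions made for illustration; a production gate would sit in whatever feature store or pipeline tool is actually in use.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Feature:
    name: str
    block_tags: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Decision:
    """A documented, explicit decision to admit a block-tagged feature."""
    feature_name: str
    approved_by: str
    rationale: str


def feature_selection_gate(features, decisions):
    """Split features into those admitted to training and those held back.

    A feature carrying any block tag passes only if a documented decision
    exists for it. Untagged features pass unchanged.
    """
    approved = {d.feature_name for d in decisions}
    admitted, held_back = [], []
    for f in features:
        if f.block_tags and f.name not in approved:
            held_back.append(f)
        else:
            admitted.append(f)
    return admitted, held_back


# Hypothetical features and one placeholder decision record.
features = [
    Feature("monthly_salary"),
    Feature("pharmacy_spend_weekly", frozenset({"pharmacy"})),
    Feature("donation_recurring", frozenset({"NGO donation"})),
]
decisions = [Decision("donation_recurring", "DPO", "placeholder rationale")]
admitted, held_back = feature_selection_gate(features, decisions)
```

The design choice that matters is the default: a tagged feature is held back unless someone records a decision, so the audit trail is a by-product of the build and not a reconstruction after the fact.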

This is also the only way the words “transparency” and “control” become technically real. The direct bank used both terms in its survey communication, and the customer expectation behind them is genuine. But a consent banner is a promise, not a mechanism. Block tags in the feature-selection gate are the mechanism. They are what lets you tell a customer, and a regulator, exactly which inferable categories were excluded and why. Control that you can point to on an architecture diagram is worth more than control asserted in a press release.

The Mittelstand bridge

None of this is a banking-only problem, and that is the part worth sitting with. The CJEU threshold is identical for a major bank and a fifty-person company. Any Mittelstand firm that runs AI over aggregated customer or employee data sits on the same doctrine.

A workforce analysis that infers health status from absences and canteen purchases is the pharmacy-charge problem in a different uniform. A churn model that reads political or religious orientation out of customer behaviour is C-184/20 territory whether or not anyone intended it. The reflex answer, “but we have consent”, does not hold, because a blanket consent cannot cover a sensitive inference that nobody classified before training. You cannot consent precisely to a category you never knew the model would derive. This is the same class-blending risk that appears when employees feed aggregated company data into their own AI tools, only now the aggregation layer carries inferable special categories.

What both architectures don’t solve

A fair objection deserves a fair hearing, and the direct bank made the strongest version of it. Its position is that the legal-operational level and the customer-expectation level are two different things. The YouGov survey it commissioned measured the second: customers are open to more AI in advice when transparency, control, and human responsibility are in place. The survey was never meant to adjudicate Article 9 at the operational level. That distinction is correct, and it is not a straw man.

Here is where the distinction holds only halfway. Without an explicit measurement requirement, call it “inference robustness”, the legal-operational level never becomes a measurable product property. What is not measured does not enter the backlog, and what does not enter the backlog does not get built. The separation between expectation and operation is clean on paper. In practice it holds only if someone builds the operational bridge, and that bridge is classification before training.

The harder limit cuts the other way, against my own thesis. Block tags do not make a model safe on their own. Inference is probabilistic, and a sufficiently capable model can reconstruct a sensitive category from features that no one thought to tag. A pure cloud architecture and a pure local one share this exposure. Classification before training narrows the surface and creates an audit trail, without abolishing the risk. That is why the honest claim stays narrow: this is the first control that makes Article 9 governable, where the outcome is otherwise accidental. The bank that builds it first still carries the inference problem, yet gains a sales argument that US providers, operating without a European supervisor and without this doctrine in their bones, structurally lack.
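The residual risk can at least be probed. The following Python sketch flags untagged features that track a blocked signal closely enough to act as a proxy for it. Everything in it is an assumption for illustration: the feature names, the toy data, and the 0.8 threshold. Pearson correlation only catches linear, single-feature proxies, so this is a floor for such a check, not a measure of what a capable model can reconstruct.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(vx * vy)


def proxy_candidates(untagged, blocked_signal, threshold=0.8):
    """Names of untagged features whose |correlation| with the blocked
    signal reaches the (arbitrary) threshold."""
    return sorted(
        name
        for name, values in untagged.items()
        if abs(pearson(values, blocked_signal)) >= threshold
    )


# Toy data: one row per customer. The blocked signal marks customers
# with recurring pharmacy charges; both feature columns are hypothetical.
blocked = [1, 0, 1, 0, 1, 0]
untagged = {
    "monday_card_payments": [5, 1, 6, 0, 5, 1],
    "monthly_salary": [3000, 3100, 2900, 3100, 3000, 2900],
}
flagged = proxy_candidates(untagged, blocked)
```

In this toy data only `monday_card_payments` is flagged. A flagged feature is a candidate for a block tag of its own, which is how the tagging scheme grows over time.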

Frequently asked questions on the inference gap

What does the “inference gap” actually mean for my data?

On 1 August 2022, the CJEU ruled in C-184/20 that data from which sensitive information can be inferred is treated, in law, like sensitive data itself. From innocuous-looking transactions, a pharmacy charge, a donation, a location, a model can derive health status, conviction, or living circumstances. The moment an AI system can do that, Article 9 GDPR applies, even if you never deliberately collected those categories.

A blanket consent does not protect you when the sensitive inference first arises during training and no one has classified it. Article 9(2) requires explicit consent for exactly those categories. Without classification before training, you do not know which inference you would even need consent for. Consent to a category you never identified is not informed consent.

Why before training and not in the audit?

In an audit you examine a finished model. What it infers from aggregated data is already trained in by then, and hard to extract. Classify sensitive inferences upfront as block tags in the feature-selection gate, and you decide deliberately what enters the model before it learns. That is also the logic behind Articles 16 and 17 of the EU AI Act (2024/1689): documented procedures before a system goes to market, not corrective measures afterwards.

Does this affect only banks, or my Mittelstand company too?

It affects anyone running AI over aggregated customer or employee data. A workforce analysis that infers health status from absences and canteen purchases is the same problem as pharmacy charges at a bank. The CJEU threshold is identical for a major bank and a fifty-person firm. For mid-sized companies the practical entry point is usually AI architecture and automation done with judgement, paired with the compliance and mandatory topics around AI.

What is a “block tag” in technical terms?

A block tag is a classification label assigned to a data signal before feature selection, such as “pharmacy”, “NGO donation”, or “location cluster home/clinic”. It prevents that signal, or features derived from it, from entering training unchecked. It turns a sensitive inference into something visible and governable, rather than a by-product buried inside the model where only an audit can find it, and only after the fact.
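The phrase “or features derived from it” is the part that needs bookkeeping. A minimal Python sketch, assuming a simple lineage map from each feature to its inputs, shows how tags can be inherited. The signal and feature names are hypothetical.

```python
def propagate_block_tags(source_tags, derived_from):
    """Compute block tags for derived features from their lineage.

    source_tags:  raw signal name -> set of block tags assigned at the source
    derived_from: feature name -> list of inputs (raw signals or features)
    A feature inherits every tag carried by anything it is built from.
    """
    resolved = {}

    def tags_of(name, seen=()):
        if name in resolved:
            return resolved[name]
        if name in seen:
            raise ValueError(f"cyclic lineage at {name!r}")
        tags = set(source_tags.get(name, ()))
        for parent in derived_from.get(name, ()):
            tags |= tags_of(parent, seen + (name,))
        resolved[name] = frozenset(tags)
        return resolved[name]

    return {name: tags_of(name) for name in derived_from}


# Hypothetical lineage: lifestyle_score is two steps away from the raw
# pharmacy signal and still inherits its tag.
source_tags = {"pharmacy_charges": {"pharmacy"}}
derived_from = {
    "weekly_health_spend": ["pharmacy_charges"],
    "lifestyle_score": ["weekly_health_spend", "salary"],
    "savings_rate": ["salary"],
}
feature_tags = propagate_block_tags(source_tags, derived_from)
```

Without this step a tag protects only the raw signal, and an aggregate built on top of it would pass into training untagged.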


Next step

Where in your AI pipeline are sensitive inferences classified, and does that happen before or after training?

If you cannot point to the gate on a whiteboard in ten minutes, a sparring session is worth the hour. I work solo, with no tool-reseller agenda, and the conversation is free of charge.

Sources and links: CJEU C-184/20 (curia.europa.eu) · Article 9 GDPR · Regulation (EU) 2024/1689, EU AI Act · EU AI Act Article 4 (AI literacy) · ING press release on AI in banking and customer expectations

Read further on pfisterer.xyz: ChatGPT Finances: the architecture question before the trust question · EU AI Act 2026 and the Mittelstand · Shadow AI: when employees build their own AI agents

About the Author René Pfisterer

10+ years in ERP integration, data migration, and process automation for mid-sized companies. Specialized in DATEV, SAP, and AI implementation.
