How I help financial institutions identify fraud patterns

Across finance, healthcare, e-commerce, and social platforms, the same core algorithmic principle applies: efficiently discovering meaningful co-occurrence patterns at scale. PCY enables this by reducing memory, computation, and noise—turning raw data into actionable insight.

AI/ML

1/26/20263 min read

When financial institutions come to me about fraud detection, they rarely start with algorithms.

They usually ask something like:

“We see suspicious transactions, but it’s hard to tell what’s truly abnormal versus normal customer behavior.”

That distinction matters.

Fraud detection isn’t about labeling everything unusual as fraud — it’s about first understanding what normal behavior looks like at scale, and then flagging what deviates from it.

Banks process millions of transactions per day. Without a principled way to define “normal,” fraud systems either:

Miss real threats, or
Overwhelm analysts with false positives

How I model the problem

Before I touch any algorithm, I decide how banking activity should be represented.

This modeling step is critical, because fraud detection depends on context, not isolated events.

Here’s how I structure it:

Transaction → User session
One banking session or customer activity window (e.g., login → actions → logout)
Items → Actions
Actions such as ATM withdrawal, online transfer, password change, foreign login
Itemsets → Behavioral patterns
Actions that frequently occur together define normal behavior
Support threshold → Normality filter
A pattern must occur often enough to be considered expected behavior

Once framed this way, fraud becomes a deviation problem:

Anything outside frequent patterns is potentially suspicious.

Why PCY is needed in banking systems

After modeling the problem, a hard constraint appears immediately.

Banks deal with:

Millions of daily sessions
Dozens of possible actions per session
Explosive combinations of action sequences

Brute-force counting of all action pairs doesn’t scale.

That’s why I use PCY.

PCY allows me to:

Identify frequent, normal behavior patterns efficiently
Filter rare or anomalous combinations early
Scale fraud analysis without blowing up compute or storage

What banks get from this approach

By combining careful modeling with PCY, I produce:

Frequent action pairs → baseline customer behavior
Missing or rare pairs → potential fraud signals
A clear definition of normality grounded in data

💡 Business value: reduced fraud losses, faster detection, and scalable analysis without overwhelming fraud teams.

What banks are really asking

Underneath all fraud tooling, the core question is simple:

“What transaction behaviors normally occur together?”

Once that’s answered, fraud detection becomes much easier:

Transfers without login?
Unusual action combinations?
Sudden spikes in rare behaviors?

Those are no longer vague concerns — they become measurable deviations.

Step 1: Transaction sessions as transactions

I start by modeling each banking session as a transaction.

I then define:

Minimum support = 4

This means an action pattern must appear in at least 4 sessions to be considered normal behavior.

Step 2: Pass 1 – Single action support (what I eliminate first)

Before looking at combinations, I determine which actions are common enough to matter.

Only actions meeting the support threshold survive.

Frequent actions: {Login, Transfer}

Numeric view:

Formally: L1 = {1,2}

ChangePassword is dropped at this stage — not because it’s unimportant, but because it doesn’t occur often enough to define normal session behavior in this dataset.

This step dramatically reduces noise and future computation.

Step 3: PCY hashing (how I define normality efficiently)

Now I apply PCY’s first pass.

Instead of explicitly counting all action pairs, I hash them into buckets and count bucket frequency.

Hashed pair:

Pair Bucket

(1,2) 3

Bucket support:

(Login, Transfer) appears in 4 sessions

Since the bucket meets the minimum support threshold, it is marked frequent.

This step ensures:

Normal behavior survives
Rare combinations die early
Computation stays bounded

Step 4: Pass 2 – Candidate action pairs

Next, I construct candidate pairs (C₂).

Rules:

Both actions must be frequent
Their hash bucket must be frequent

Candidate pairs: C2 = {(1,2)}

There is only one pair worth examining further.

Step 5: Final counting (what I trust)

I now perform exact counting — but only for candidate pairs.

Pair Support

(1,2) 4

Final frequent pair:

L2 = {(1,2)}

Translated back to actions:

This pair defines normal customer behavior.

Fraud insight (where detection happens)

Because I’ve clearly defined normality, fraud signals become obvious.

When the bank sees:

Transfer without Login
ChangePassword + Transfer spikes
Action combinations not in L₂
🚨 Alerts are triggered.

These alerts aren’t heuristic guesses — they’re grounded in statistically frequent behavior patterns.

Business impact

By modeling the problem correctly and using PCY:

Banks define normal behavior empirically
Fraud detection becomes deviation-based, not rule-heavy
Systems scale across millions of sessions per day

PCY doesn’t detect fraud directly.

It defines normality — and that’s what makes fraud visible.