Screening long names – is your filter up to the task?

Screening long names is a frequent source of confusion when assessing the performance of a sanctions filter, and the lack of official guidance makes it harder to manage. Simply verifying that your filter can hit the truncated version of a name is naive and potentially exposes an organisation to substantial risk.

The challenge of screening long names comes not from the length per se, but from the fact that common target fields have length restrictions that effectively truncate the name. Take a common target field such as the MT 103 50K. This provides four lines of 35 characters for the name and address, 140 characters in total (the first line is reserved for account identification). That’s enough to fit the majority of SDN entity names and all individual names. Since common practice is to reserve two lines for address information, this is a very optimistic figure. Even so, there are currently nine names on the SDN list that exceed even this limit.

There are roughly the same number of individual and entity names (prime names and strong aliases).

The median length of individual names is 20 characters and that of entity names is 26 characters.

The longest individual name is 66 characters, whereas 455 entity names are longer than this.

Nine entity names exceed the 140-character limit of an MT 103 50K field.
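Figures like these can be reproduced for any list of names. The following is a minimal sketch, assuming the names have already been loaded into a Python list; the function name `length_profile` and the sample names are invented for illustration and are not SDN entries.

```python
from statistics import median

FIELD_LINES = 4   # name-and-address lines in an MT 103 50K field
LINE_WIDTH = 35   # characters per line
FIELD_CAPACITY = FIELD_LINES * LINE_WIDTH  # 140 characters


def length_profile(names, limit=FIELD_CAPACITY):
    """Summarise name lengths and collect the names longer than the limit."""
    lengths = [len(name) for name in names]
    return {
        "median": median(lengths),
        "longest": max(lengths),
        "over_limit": [name for name in names if len(name) > limit],
    }


# Invented placeholder names, used only to exercise the function.
sample = ["ACME TRADING LLC", "JOHN DOE", "X" * 150]
profile = length_profile(sample)
```

Running the same function separately over individual and entity names would give the kind of comparison listed above.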

For the handful of entity names that exceed the 140-character limit, does it matter? How should they be screened? Clearly the name cannot be presented in full, so what should the ideal filter screen for?

At the time of writing, the longest name on the SDN list is a strong alias of OFAC 35342 (VSEROSSISKIY INSTITUT AVIATSIONNYKH MATERIALOV), which has 196 characters:

FEDERAL STATE UNITARY ENTERPRISE ALL-RUSSIAN SCIENTIFIC RESEARCH INSTITUTE OF AVIATION MATERIALS OF THE NATIONAL RESEARCH CENTER KURCHATOV INSTITUTE STATE RESEARCH CENTER OF THE RUSSIAN FEDERATION

So if we were to test this in a 50K field, assuming words are not broken across lines, it may look something like this:

Line 2: FEDERAL STATE UNITARY ENTERPRISE
Line 3: ALL-RUSSIAN SCIENTIFIC RESEARCH
Line 4: INSTITUTE OF AVIATION MATERIALS
Line 5: OF THE NATIONAL RESEARCH CENTER
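A layout like this can be generated rather than typed by hand. The sketch below uses Python's standard `textwrap` module to wrap the name into 35-character lines without breaking words, then keeps only the four lines the field allows. The function name `fit_to_field` is an assumption made for illustration, and greedy wrapping is only one possible way an upstream system might fill the field.

```python
import textwrap

NAME = (
    "FEDERAL STATE UNITARY ENTERPRISE ALL-RUSSIAN SCIENTIFIC RESEARCH "
    "INSTITUTE OF AVIATION MATERIALS OF THE NATIONAL RESEARCH CENTER "
    "KURCHATOV INSTITUTE STATE RESEARCH CENTER OF THE RUSSIAN FEDERATION"
)


def fit_to_field(name, width=35, max_lines=4):
    """Wrap a name on word boundaries and split it into kept lines and dropped words."""
    lines = textwrap.wrap(
        name,
        width=width,
        break_long_words=False,   # never split a word across lines
        break_on_hyphens=False,   # keep ALL-RUSSIAN in one piece
    )
    kept = lines[:max_lines]
    dropped = " ".join(lines[max_lines:]).split()
    return kept, dropped


kept, dropped = fit_to_field(NAME)
for number, line in enumerate(kept, start=2):
    print(f"Line {number}: {line}")
```

Greedy wrapping places the second OF at the end of Line 4 rather than at the start of Line 5, but the same words are retained as in the layout above. `dropped` holds the words that do not fit, beginning with KURCHATOV.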

This truncated version is missing the last nine words – KURCHATOV INSTITUTE STATE RESEARCH CENTER OF THE RUSSIAN FEDERATION – which account for roughly a third of the name (67 of its 196 characters). Should this hit? Many filters will hit this by essentially recognising that the name is truncated, but is this the best approach? If we have to sacrifice some words, which are the ones we would want to keep to help us the most in our screening endeavours?

Well, our task is to identify the most likely match on the SDN list so let’s look at the number of times each of the component words appears on the list and assign a relative importance based on this:

Each word within the name is assigned a commonality factor, which represents how common this word is within the names on the SDN list relative to the other words.

The most common word, OF, can be found in 764 SDN name entries, whereas the word KURCHATOV appears only twice, both within the same profile.

A commonality can be assigned based on the relative frequency of occurrence of the words. SCIENTIFIC and RESEARCH occur 80 and 159 times respectively. This is reflected in the ratio of their commonalities (3.5% : 7.0%).

It’s no surprise that ‘the’ and ‘of’ dominate as the most common words, which is why these are typically ignored for screening purposes. The least frequent word, KURCHATOV, appears twice and it turns out that they are both for the same profile entry, so effectively this narrows the match immediately. Note that this is one of the words that doesn’t “make the cut” when truncation occurs.
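The commonality idea can be sketched in a few lines. The article does not spell out the exact formula, so the version below is an assumption: each distinct word's count is divided by the sum of the counts of all distinct words in the name, which keeps the ratio between any two words equal to the ratio of their counts. The function names and the toy list are invented for illustration and are not SDN data.

```python
from collections import Counter


def name_frequency(names):
    """Count, for each word, how many names on the list contain it."""
    counts = Counter()
    for name in names:
        counts.update(set(name.split()))  # count each word once per name
    return counts


def commonality(name, counts):
    """Share of each distinct word's count among the words of one name."""
    words = list(dict.fromkeys(name.split()))  # distinct words, original order
    total = sum(counts[word] for word in words)
    if total == 0:
        return {}
    return {word: counts[word] / total for word in words}


# Invented toy list, used only to exercise the functions.
toy_list = [
    "INSTITUTE OF AVIATION MATERIALS",
    "BANK OF AVIATION",
    "INSTITUTE OF PHYSICS",
    "KURCHATOV INSTITUTE",
]
counts = name_frequency(toy_list)
shares = commonality("KURCHATOV INSTITUTE OF AVIATION", counts)
rarest = min(shares, key=shares.get)  # the most discriminating word
```

In this toy list the rarest word is KURCHATOV, which is the behaviour described above: the least common word does the most to narrow the match.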

Let’s look at another example of some reasonable combinations of name components:

KURCHATOV ALL-RUSSIAN FEDERATION OF AVIATION MATERIALS

In this example, any two of the components ALL-RUSSIAN, FEDERATION, AVIATION and MATERIALS will narrow it down to the correct profile, with the exception of the combination of FEDERATION and AVIATION, which would also hit another profile. This indicates that it is extremely easy to pinpoint names by selecting a few of the less commonly occurring words.
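One way to run this kind of check is to test every pair of words against the list and record which profiles contain both. The sketch below is illustrative only: the profile identifiers and names are invented, the function names are assumptions, and a word pair is treated as matching a profile when a single name on that profile contains both words.

```python
from itertools import combinations


def profiles_matching(words, profiles):
    """Return the ids of profiles with at least one name containing every word."""
    wanted = set(words)
    return {
        profile_id
        for profile_id, names in profiles.items()
        if any(wanted <= set(name.split()) for name in names)
    }


def pair_selectivity(words, profiles):
    """Map each pair of words to the set of profiles it would hit."""
    return {
        pair: profiles_matching(pair, profiles)
        for pair in combinations(words, 2)
    }


# Invented toy profiles, used only to exercise the functions.
toy_profiles = {
    "P1": ["ALL-RUSSIAN INSTITUTE OF AVIATION MATERIALS OF THE FEDERATION"],
    "P2": ["FEDERATION OF AVIATION CLUBS"],
}
components = ["ALL-RUSSIAN", "FEDERATION", "AVIATION", "MATERIALS"]
hits = pair_selectivity(components, toy_profiles)
ambiguous = [pair for pair, ids in hits.items() if len(ids) > 1]
```

With these toy profiles every pair resolves to a single profile except FEDERATION and AVIATION, which mirrors the pattern described above. Run against a real list, the same check would show which word combinations a filter needs in order to pinpoint a profile.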

The widespread adoption of the ISO 20022 standard will certainly help in this area. However, with many institutions adopting a translation approach at least in the interim, the problem does not go away.

Ultimately, the question of which variations of the name components should hit or miss is a risk-appetite decision. Fundamentally, not anticipating realistic scenarios where shorthand may be adopted could introduce significant risk. Relying on filter behaviour that has been introduced simply to pass poorly designed tests is no substitute for defining your own strategy of how to deal with these scenarios.

How does your filter handle this?

Deep Lake specialises in advanced analytical techniques and expert business knowledge to provide deeper insight into screening environments. Contact us to find out more about our products and services.