In part one of this series, we looked at the procedures undertaken by OFAC to classify weak aliases. We examined the entries on the SDN list to determine the consistency of the application of these rules for individuals (see ofac-weak-aliases-part-1-individuals). By and large these rules are applied fairly consistently for individuals, although there are a couple of observations that could improve them.
How are these rules applied when it comes to entity profiles? To recap, these are the procedures OFAC follows to classify weak aliases:
1. Character length (shorter strings were assumed to be less effective in a screening environment than longer strings);
2. The presence of numbers in an alias (digits 0-9);
3. The presence of common words that are generally considered to constitute a nickname (example: Ahmed the Tall);
4. References in the alias to geographic locations (example: Ahmed the Sudanese);
5. The presence of very common prefixes in a name where the prefix was one of only two strings in a name (example: Mr. Smith).
There’s no explicit note that these apply only to individuals, but clearly one can see from the examples provided the bias towards individual profiles rather than entity profiles.
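To make the five rules concrete, here is a minimal sketch of how they might be approximated as simple string checks. The length cut-off, the word lists and the function name `weak_alias_signals` are all illustrative assumptions; OFAC does not publish the thresholds or vocabularies it uses.

```python
import re

# Every threshold and word list here is an illustrative assumption,
# not something published by OFAC.
MAX_SHORT_LEN = 6                            # rule 1 cut-off (assumed)
NICKNAME_WORDS = {"TALL", "SHORT", "OLD"}    # rule 3 vocabulary (assumed)
LOCATION_WORDS = {"SUDANESE", "SYRIAN"}      # rule 4 vocabulary (assumed)
COMMON_PREFIXES = {"MR", "MRS", "DR"}        # rule 5 vocabulary (assumed)


def weak_alias_signals(name):
    """Return the numbers of the rules (1-5) that the name triggers."""
    tokens = re.findall(r"[A-Z0-9]+", name.upper())
    signals = []
    if len(name.strip()) <= MAX_SHORT_LEN:      # rule 1: short string
        signals.append(1)
    if re.search(r"\d", name):                  # rule 2: contains a digit
        signals.append(2)
    if NICKNAME_WORDS.intersection(tokens):     # rule 3: nickname word
        signals.append(3)
    if LOCATION_WORDS.intersection(tokens):     # rule 4: geographic reference
        signals.append(4)
    if len(tokens) == 2 and tokens[0] in COMMON_PREFIXES:
        signals.append(5)                       # rule 5: prefix plus one string
    return signals


for example in ["Ahmed the Tall", "Ahmed the Sudanese", "Mr. Smith", "M23", "HAMAS"]:
    print(example, weak_alias_signals(example))
```

With these assumed word lists, the person-style examples trigger rules 3 to 5, while entity-style names such as ‘M23’ or ‘HAMAS’ can only ever trigger rules 1 and 2.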
OK, so once again let’s start with rule #1 above and look at the SDN entity names that consist of a single word of fewer than ten characters:
When we looked at individuals we found that all single word names of six characters or fewer were classified as weak aliases. It’s clear from the above table that this is not the case for entities: over a third of the names that meet this criterion are not classified as weak, 28 of them being prime entries:
In addition to the three-character prime entry ‘M23’, there are 26 strong aliases of three characters:
There are 141 weak aliases consisting of three letters, which would indicate that the norm is to treat these as weak. But then what is the criterion used for the three letter strong aliases?
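A tally of this kind can be reproduced with a few lines of code. The sketch below assumes the names have already been extracted into `(name, classification)` pairs; that layout, the function name `tally_short_single_word` and every sample row except ‘M23’ as a prime entry are placeholders invented for illustration, not SDN data.

```python
from collections import Counter

# Hypothetical rows: (name, classification). Only 'M23' as a prime entry
# comes from the article; the other rows are invented placeholders.
ROWS = [
    ("M23", "prime"),
    ("ALPHA", "strong"),
    ("BETA", "weak"),
    ("GAMMA TRADING", "weak"),   # two words, so excluded
    ("LONGERNAME1", "weak"),     # 11 characters, so excluded
]


def tally_short_single_word(rows, max_len=9):
    """Count single-word names of at most max_len characters,
    keyed by (length, classification)."""
    counts = Counter()
    for name, kind in rows:
        cleaned = name.strip()
        if " " not in cleaned and len(cleaned) <= max_len:
            counts[(len(cleaned), kind)] += 1
    return counts


for (length, kind), n in sorted(tally_short_single_word(ROWS).items()):
    print(length, kind, n)
```

Running the same tally over the real list, grouped by length, is what exposes the mix of prime, strong and weak classifications among three-character names.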
So it looks like rule #1 does not really apply to entity names. Should these be screened? It’s certainly common practice to not screen against the 3 letter entity aliases. Any further optimisation needs careful consideration and should be documented as an explicit risk appetite decision. Perhaps one could be justified in excluding the screening of entries like ‘OPERA’; but ‘HAMAS’, ‘HASM’, ‘SMT-K’, ‘MS-13’, etc. are clearly important entries that should be screened and handled with false positive reduction strategies.
Rule #2, covering the presence of numerical digits, is again seemingly not applicable for entities. There are 134 entity names containing digits, spread fairly evenly across prime, strong and weak aliases. Here is a sample of these:
Although by inspection it does appear that the quality (number of words, length) of these is reduced moving from left to right, it’s difficult to spot any pattern to this. For example, compare the strong alias ‘AO 10 SRZ’ to the weak alias ‘AO 356 ARZ’.
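One way to look for a pattern is to reduce each name to its "shape", replacing letters and digits with placeholder characters. The sketch below does this for three names quoted in this article; the helper names `has_digit` and `shape` are our own and purely illustrative.

```python
import re

# Names and classifications as quoted in this article.
SAMPLES = [
    ("M23", "prime"),
    ("AO 10 SRZ", "strong"),
    ("AO 356 ARZ", "weak"),
]


def has_digit(name):
    """Rule 2 check: does the name contain any digit 0-9?"""
    return re.search(r"[0-9]", name) is not None


def shape(name):
    """Map letters to 'A' and digits to '9', keeping other characters."""
    out = []
    for ch in name:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


for name, kind in SAMPLES:
    print(kind, name, has_digit(name), shape(name))
```

The strong alias ‘AO 10 SRZ’ and the weak alias ‘AO 356 ARZ’ reduce to almost identical shapes, differing by a single digit, which illustrates why the digit rule alone cannot explain the classification.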
It’s difficult to see how to apply rules #3-#5 above to entities, so once again it appears that there are no clear criteria for determining which entity names should be classified as weak.
Can we find any systematic patterns in the data that would help us understand the difference between a weak and strong alias classification? Well for a start, there are 8 two letter names, so we are probably on safe ground in assuming all future two letter entries will be classified as weak.
What about differences in content? We saw above when looking at the digit rule (#2) that there is no obvious stand-out difference in the names. Let’s look at a few complete profiles and compare the different qualities of entries.
Firstly, entity 25647:
The notable difference in the weak aliases is the absence of any geographical qualifier. Without this information we are left with fairly neutral words like ‘ADVANCED’ and ‘COMPANY’.
However, entry 11913 does not seem to support this conclusion:
It’s not easy to see a clear rationale for distinguishing between the weak and strong entries here. This is where the purpose of some of these weak entries is questionable, given that one would likely expect to hit them anyway based on a reasonable fuzzy match capability.
Let’s look at one more, entry 33983:
The differences seem even less clear here. For example, there’s a lot of overlap between the prime entry name ‘REVOLUTIONARY ARMED FORCES OF COLOMBIA – PEOPLE’S ARMY’ and the weak alias ‘REVOLUTIONARY ARMED FORCES OF COLOMBIA DISSIDENTS FARC-EP’. Also, all of the weak aliases contain the entire text of the strong alias ‘FARC-EP’, so if tested they would all create alerts.
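The overlap described above can be checked with a rough sketch. Here Python's standard `difflib.SequenceMatcher` stands in for a fuzzy matcher; real screening engines use their own algorithms and thresholds, so the score is only indicative. The helper names `similarity` and `contains_alias` are our own.

```python
from difflib import SequenceMatcher

# Names as quoted in this article for entry 33983.
PRIME = "REVOLUTIONARY ARMED FORCES OF COLOMBIA – PEOPLE’S ARMY"
STRONG = "FARC-EP"
WEAK = "REVOLUTIONARY ARMED FORCES OF COLOMBIA DISSIDENTS FARC-EP"


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def contains_alias(text, alias):
    """True if the alias appears verbatim (ignoring case) inside the text."""
    return alias.upper() in text.upper()


print("weak contains strong alias:", contains_alias(WEAK, STRONG))
print("weak vs prime similarity:", round(similarity(WEAK, PRIME), 2))
```

If a screening system already alerts on the strong alias or on a close match to the prime name, text matching the weak alias would be caught either way, which is the redundancy the paragraph above points to.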
So what are the takeaways here? Well firstly, it appears that the rules stated by OFAC cannot be applied to entity entries.
- There are 26 three letter strong aliases and one three letter prime entry
- The presence of numbers does not affect the quality classification
- Entity names containing geographic locations are, if anything, more likely to be classified as strong rather than weak aliases
In addition, comparisons of content within individual profiles fail to reveal a usable rationale for the quality classification.
- No consistent clear patterns emerge from looking at detailed profile entries
- Many profiles contain weak aliases that are close enough to the prime or strong alias entries to create a fuzzy match
In part 3 of this series we will be exploring some cross-list issues with weak aliases.
Deep Lake specialises in advanced analytical techniques and expert business knowledge to provide deeper insight into screening environments. Contact us to find out more about our products and services.