Wednesday, 4 January 2017

FIU issues: OSINT and Amber Flags

The banking industry relies heavily on third-party systems to alert it to “red flags” that highlight specific risks associated with a client or counterparty. Typically these alerts capture actions or risks that have already occurred: a bankruptcy, an arrest, or a default. This means that client risk management is invariably on the back foot, reacting to events by looking in the rear-view mirror.

The explosion of information that can be openly captured on the web, coupled with advances in the systems that capture relevant content, means that banks can now build forward-looking systems that sniff out potential risks before they develop into red flags.

These signals of potential risk are amber flags; they can act as an early-warning alarm for risk managers and investigators. This approach works on both positive and negative signals, but is particularly relevant in the areas of financial crime, AML detection, and KYC remediation and on-boarding.

     To build an early-warning system:

There are three principal ingredients to building an effective early-warning system that will deliver actionable intelligence:
  • first, the broadest possible capture of news content;
  • second, the ability to create and adapt filters on that news-flow to ensure that only relevant news flows from the process;
  • finally, a team of investigators and analysts who have been trained in the processes, methodologies and techniques of what is called Open Source Intelligence (OSINT for short).
      In putting together such a system one is invariably touching upon issues more commonly tackled in big-data projects: multiple content types (text, image, voice, numbers etc.), multiple file types (html, gif, jpeg, wav etc.), multiple languages and heavy content flows. In today’s post I will deal with the first ingredient:

     The capture of news

     By definition the banking industry is used to dealing with large flows of information, but the tendency is to focus on breaking news and to rely upon well-known third-party vendors such as Bloomberg and Reuters for alerts and data management. These vendors have their own journalists (roughly 2,000 each) and import news from external agencies with which they have agreements (the news-wires and specialists). Yet as the diagram below shows, the actual breadth of their news capture is very narrow and focuses on ‘breaking news’. This means that a vast amount of potentially relevant information is discarded and does not make it onto their systems.
The most effective news-capture system that I have found to date is that of Moreover Technologies (now owned by LexisNexis). Whereas Bloomberg and Reuters bring in external feeds from a couple of thousand sources plus their own journalists, Moreover currently captures content from 87,000 news-sources and 2.7 million social media feeds globally. This makes it the broadest commercially available news-capture system on the planet. Just as important as its breadth of capture (in some eighty languages) is Moreover’s user interface, which enables both broad and ultra-defined filtering of its average daily flow of 2.5 million articles. The portal offers a large number of pre-set filters as well as a generous field in which to build your own search algorithms. In contrast to Google, where the field size of the search request is limited to 256 characters, I have built highly effective algorithms of over 5,000 distinct terms in Moreover, thus ensuring very high relevance in the resulting flow of information. This ability to actively filter massive news flows is critical to capturing actionable intelligence.
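To make the idea of term-based filtering concrete, here is a minimal sketch in Python. It is not Moreover's query syntax or API; the `Article` class, the `build_filter` helper and the sample articles are all hypothetical, and the logic is simply "keep an article if it contains any include term and no exclude term". A real term list would be far longer, but the principle is the same.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record for one captured news item."""
    source: str
    language: str
    text: str


def _compile_terms(terms):
    """Compile a list of terms into one case-insensitive, whole-word pattern."""
    if not terms:
        return None
    # Longest terms first so multi-word phrases win over their shorter prefixes.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def build_filter(include_terms, exclude_terms=()):
    """Return a predicate: True if any include term and no exclude term appears."""
    include_re = _compile_terms(include_terms)
    exclude_re = _compile_terms(exclude_terms)

    def matches(article):
        if include_re is None or not include_re.search(article.text):
            return False
        if exclude_re is not None and exclude_re.search(article.text):
            return False
        return True

    return matches


# Invented sample data, for illustration only.
articles = [
    Article("regional-daily", "en", "Local prosecutors open a fraud inquiry into Acme Holdings."),
    Article("sports-blog", "en", "Acme Holdings sponsors the city marathon."),
    Article("trade-journal", "en", "Film review: a money laundering thriller."),
]

is_relevant = build_filter(
    include_terms=["fraud inquiry", "money laundering"],
    exclude_terms=["film review"],
)
relevant_sources = [a.source for a in articles if is_relevant(a)]
```

The exclude list matters as much as the include list: without it, the film review above would pass the filter purely because it mentions money laundering.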


As this diagram shows, the distribution of news generation follows a typical Gaussian curve. The peak of that curve, i.e. the news that is most widely distributed, is what we term “breaking news”: the terrorist attacks, plane crashes, bankruptcies and suchlike. Yet very little news erupts as “breaking news”. Typically stories develop over time and start their life way down in the tails of the curve, on the periphery. As a story develops it travels up the curve, gaining more and more distribution, being reported by more and more news-outlets and in a growing number of languages. The initial event grows into a ‘developing story’; from there into a ‘mainstream news’ event, and possibly into ‘breaking news’. Therefore if one can monitor the periphery for relevant news, one gains a significant information edge on the competition, meaning for the bank that it may be able to offset its risk before a red flag is posted. This requires considerable preparation, but in a world where black swan events are an urban myth, in a world where we live within a scale of grey, he who can prepare and monitor effectively will win. The good news is that the systems discussed in this post are inexpensive, and the training required is fast and immediately effective.
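One rough way to picture a story travelling up the curve is to count how many distinct outlets and languages are carrying it. The sketch below does just that. It is an illustration only: the `summarise_spread` function, the input format and, above all, the outlet thresholds for each stage are assumptions of mine, not figures taken from any vendor or from the diagram.

```python
from collections import defaultdict

# Hypothetical thresholds: minimum number of distinct outlets for each stage.
# These numbers are illustrative assumptions and would need calibrating.
STAGES = [
    (1000, "breaking news"),
    (100, "mainstream news"),
    (10, "developing story"),
    (0, "periphery"),
]


def stage_for(outlet_count):
    """Map a distinct-outlet count to the first stage whose minimum it meets."""
    for minimum, label in STAGES:
        if outlet_count >= minimum:
            return label
    return "periphery"


def summarise_spread(mentions):
    """Summarise how widely each story is carried.

    mentions: iterable of (story_id, outlet, language) tuples.
    Returns {story_id: {"outlets": int, "languages": int, "stage": str}}.
    """
    outlets = defaultdict(set)
    languages = defaultdict(set)
    for story_id, outlet, language in mentions:
        outlets[story_id].add(outlet)
        languages[story_id].add(language)
    return {
        story_id: {
            "outlets": len(outlets[story_id]),
            "languages": len(languages[story_id]),
            "stage": stage_for(len(outlets[story_id])),
        }
        for story_id in outlets
    }


# Invented sample data: one story on the periphery, one already developing.
mentions = [("supplier-dispute", "local-paper", "pt"),
            ("supplier-dispute", "industry-forum", "pt")]
mentions += [("ceo-investigation", f"outlet-{i}", "en" if i % 2 else "es")
             for i in range(12)]

spread = summarise_spread(mentions)
```

Run over successive days, a summary like this would show which peripheral stories are gaining outlets and languages, which is the point at which an amber flag becomes worth an analyst's attention.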
