FIU ISSUES

FIU ISSUE #1: OSINT & Amber Flags



The banking industry relies heavily on third-party systems to alert it to “red flags” that highlight specific risks associated with a client or counterparty. Typically these alerts capture events that have already occurred: a bankruptcy, an arrest, or a default. This means that client risk management is invariably on the back foot, reacting to events by looking in the rear-view mirror.

The explosion of information that can be openly captured on the web, coupled with advances in the systems that capture relevant content, means that banks can now create forward-looking systems able to sniff out potential risks before they develop into red flags.

These signals of potential risks are amber flags; they can act as an early-warning alarm for risk managers and investigators. This approach works on both positive and negative signals, but is particularly relevant in the areas of financial crime, AML detection, and KYC remediation and on-boarding.

     To build an early-warning system:

There are three principal ingredients in an effective early-warning system that delivers actionable intelligence:
  • first, the broadest possible capture of news content;
  • second, the ability to create and adapt filters on that news flow, so that only relevant news emerges from the process;
  • finally, a team of investigators and analysts trained in the processes, methodologies and techniques of what is called Open Source Intelligence (OSINT for short).
      In putting together such a system one is invariably touching upon issues more commonly tackled in big-data projects: multiple content types (text, image, voice, numbers etc.), multiple file types (html, gif, jpeg, wav etc.), multiple languages and heavy content flows. In today’s post I will deal with the first ingredient:

     The capture of news

     By definition the banking industry is used to dealing with large flows of information, but the tendency is to focus on breaking news and to rely upon well-known third-party vendors such as Bloomberg and Reuters for alerts and data management. These vendors have their own journalists (roughly 2,000 each) and import news from external agencies with which they have agreements (the news-wires and specialists). Yet as the diagram below shows, the actual breadth of their news capture is very narrow and focuses on ‘breaking news’. This means that a vast amount of potentially relevant information is discarded and does not make it onto their systems.
The most effective news-capture system that I have found to date is that of Moreover Technologies (now owned by LexisNexis). Whereas Bloomberg and Reuters bring in external feeds from a couple of thousand sources plus their own journalists, Moreover currently captures content from 87,000 news sources and 2.7 million social media feeds globally. This makes it the broadest commercially available news-capture system on the planet. Just as important as its breadth of capture (in some eighty languages) is Moreover’s user interface, which enables both broad and ultra-defined filtering of its average daily flow of 2.5 million articles. The portal offers a large number of pre-set filters as well as a very ample field in which to build your own search algorithms. In contrast to Google, where the field size of the search request is limited to 256 characters, I have built highly effective algorithms of over 5,000 distinct terms in Moreover, thus ensuring a very high relevance in the resulting flow of information. This ability to actively filter massive news flows is critical to capturing actionable intelligence.


As this diagram shows, the distribution of news generation follows a typical Gaussian curve. The peak of that curve, i.e. the news that is most widely distributed, is what we term “breaking news”: the terrorist attacks, plane crashes, bankruptcies and the like. Yet very little news erupts as “breaking news”. Typically stories develop over time and start their life way down in the tails of the curve, on the periphery. As a story develops it travels up the curve, gaining more and more distribution, being reported by more and more news outlets and in a growing number of languages. The initial event grows into a ‘developing story’; from there into a ‘mainstream news’ event, and possibly into ‘breaking news’. Therefore, if one can monitor the periphery for relevant news, one gains a significant information edge on the competition, meaning that the bank may be able to offset its risk before a red flag is posted. This requires considerable preparation, but in a world where black swan events are an urban myth, in a world where we live within a scale of grey, he who can prepare and monitor effectively will win. The good news is that the systems discussed in this paper are inexpensive, and the training required is fast and immediately effective.

....end/
   

FIU ISSUE #2: Filtering for Amber Flags

  • In last week’s post I discussed how massive-scale, multi-lingual content capture can be achieved, enabling investigators to use OSINT techniques and methodologies to discover relevant content from the periphery.
  • This week I look at how to create effective search algorithms and filters that cut out the white noise and deliver a refined flow of actionable intelligence.
  • Financial intelligence is not just about transactional intel and by definition cannot be delivered by backward-looking systems (i.e. red flags).
  • What follows is an example of how to set up an effective forward-looking monitoring system for amber flags, and of the benefits that such systems can deliver.

In the world of finance, events tend to repeat themselves time and time again: bombs, earthquakes, oil leaks, fires, court cases, frauds etc. Everything repeats at some time or other (hence my assertion that black swan events are an urban myth). The fact that events repeat means that one can build search algorithms that will effectively monitor for a pre-defined future event with relative ease.
Take the example of BP’s tragedy in the Gulf of Mexico, with the Deepwater Horizon. Such an event is an ongoing operational risk for any company exploring for hydrocarbons. By building a data-set with the names of all the offshore rigs in the world (available by subscription from RigData.com) and by adding to that list the names of companies involved in producing or exploring for oil and gas, you can create the base for a highly effective alert system. Then you build an ontology (a list of keywords) around the possible types of accident that can occur on offshore rigs, in various languages, and put the two lists together using some basic Boolean logic. Plugging the resulting search algorithm into Moreover’s Newsdesk should ensure that you are one of the first people to know of such an event; furthermore, the system’s alerting function will notify you as soon as the first piece of relevant content triggers a capture.
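As a rough illustration of how such lists can be combined, here is a minimal Python sketch that turns three term lists into a single Boolean search string. The sample companies, rigs and incident keywords are placeholders rather than the real data-sets, the function names are hypothetical, and the AND/OR/quoted-phrase syntax is generic Boolean notation; the exact syntax that Newsdesk accepts is an assumption and should be checked against its own documentation.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are matched as a phrase."""
    return f'"{term}"' if " " in term else term


def any_of(terms: list[str]) -> str:
    """Join a list of terms into one parenthesised OR group."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(*groups: list[str]) -> str:
    """AND together one OR group per list: an article must hit every list."""
    return " AND ".join(any_of(g) for g in groups)


# Placeholder lists; a real system would load thousands of entries.
companies = ["Total", "BP"]
rigs = ["Elgin Platform", "Deepwater Horizon"]
incidents = ["leak", "fuite", "blowout", "evacuation"]  # "fuite" = leak (French)

query = build_query(companies, rigs, incidents)
print(query)
```

Keeping each list separate means that the ontology of incident keywords can be extended, or translated into further languages, without touching the company and rig lists.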
This is exactly what happened on Sunday 25th March 2012. At about 5:30pm the alarms on Total’s Elgin gas platform in the North Sea were triggered by the detection of a gas leak. The platform immediately “went dark”, meaning that all power was shut down to reduce the risk of any sparks igniting the escaping gas, and an orderly evacuation of the platform began. The first alert of this potentially catastrophic situation was captured by the search algorithm on Moreover within the hour, as local Scottish press reported the arrival in Aberdeen of helicopters evacuating workers from the platform. At 6:21pm that same Sunday, BBC Radio Shetland carried a report of a major evacuation being carried out from the Elgin platform, citing a gas leak. The Shetland Isles might be about as peripheral as one can find; but as a source, the BBC is a global leader. The radio report was transcribed from voice to text by Moreover and hence captured on its systems. (Note the technological feat of turning an Aberdeen accent into printed English!)
Just three terms had triggered the news alert: the word “Total” (which by itself has numerous meanings), tied to the term “Elgin Platform” and to the keyword “leak”. Separately, each of these would generate a massive amount of noise; brought together in a structured format within a dedicated search algorithm, they triggered an alert as soon as all three appeared together in a single news report. Searching on Google for the word “total” would generate 3.5 billion instances; the word “Elgin” another 57 million; and “leak” about 147 million. However, searching for instances of those three terms locked together (Total AND “Elgin Platform” AND leak) delivers just 383 pieces of content: a volume that can easily be filtered further.
The point is this: a structured search of global content, using Boolean logic to create the search strings and filters, will deliver relevant news to the end-user in real time, even from the periphery.
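The effect of locking the three terms together can be sketched in a few lines of Python. This is only an illustration of AND logic applied to free text, not a description of how Moreover matches content internally; the function names and sample headlines are invented for the example.

```python
import re


def contains(term: str, text: str) -> bool:
    """Case-insensitive match of a whole word or whole phrase."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def matches_all(terms: list[str], text: str) -> bool:
    """Boolean AND: every term must appear in the same piece of content."""
    return all(contains(t, text) for t in terms)


ALERT_TERMS = ["Total", "Elgin platform", "leak"]

# Invented headlines: each noisy one contains at least one alert term.
articles = [
    "Helicopters evacuate workers after gas leak on Total's Elgin platform",
    "Total rainfall in Elgin breaks record",
    "Leak of budget documents embarrasses minister",
]

hits = [a for a in articles if matches_all(ALERT_TERMS, a)]
print(hits)
```

Each term on its own matches at least one of the noisy headlines, but only the first headline satisfies all three at once.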
A gas leak is far more dangerous than an oil leak: one spark and the whole platform is at risk. Consequently it was likely that as soon as the news went mainstream the share price of Total would react negatively, especially given that the BP disaster was still fresh in people’s minds. Yet whilst the Moreover system captured over 120 instances of the news the following day (Monday), there was no reaction in Total’s share price; in fact the shares went up. One of those reports came via BBC Radio Shetland again, quoting a local union representative as saying that workers coming off the rig talked of a major subsea leak that was visible from the support vessels present, mentioning that “the sea was seen to be boiling gas below the rig”. That is not a minor event by any means. Why was there no reaction in the share price? Because as far as traders were concerned, there was no such news: as it wasn’t carried on either Bloomberg or Reuters, it “wasn’t news”.

This changed the following day: minutes after Total started an emergency executive meeting (some 42 hours after the first public reports emerged), the fall in Total’s share price wiped €7bn off its market value. The French press started talking about a major evacuation on one of the company’s North Sea platforms, and both Reuters and Bloomberg finally picked up the story, which was then elevated to being “breaking news”. Finally, when the stock market closed (at 5:30pm Paris time), Total announced that the platform had been evacuated and that, whilst the situation was ongoing, there was no risk to human life. A few days later the leak was plugged and the story over. Three years later Total was fined a record £1.125mn for the shortcomings that led to that leak.
In this example, there were multiple amber flags over a 60-hour period that were not picked up by the market, despite being easy to capture for anyone with the foresight to put an early-warning system in place. On Monday 26th March Moreover processed over 2 million articles; just 120 of those were relevant to the event (0.006% of the day’s throughput); yet effective filtering ensured that they were all picked up and that not one mention “fell through the floorboards”.
Now take that example and relate it across to money-laundering, financial crime or KYC, and it quickly becomes apparent that:
  1. relying on third-party vendors selling red-flag data leaves a bank hopelessly out of date;
  2. creating relevant search algorithms to act as early-warning signals for potential risks is increasingly straightforward.


End…/
