From Authorization to Insight

The data supply chain is the life cycle of data that enters an organization, is transformed and standardized, and comes out as information ready for users within the organization to consume.

Big-data-empowered organizations, such as banks, other financial institutions, insurers, and health organizations, hold vast amounts of data, gathered from their own internal activity as well as from external sources.

Raw data is of little use on its own, so data-driven organizations must digest and analyze it to gain useful insights and make smarter business decisions. Data scientists, engineers, analysts, and developers use tools such as Looker, Tableau, SAS, TensorFlow, Jupyter, and even Python to process the data into a human-readable report.

But a question arises: who can actually access the data?

Data Usage Restrictions

Data-driven organizations in the financial industry face many regulations when using their data. The best known is the GDPR, but in fact a bank must comply with thousands of regulatory rules when using its data. In addition to regulations, there are restrictions issued by standards organizations, on top of the obligations and guidance generated by the bank itself.

Regulatory compliance can become a mess, especially for multinational banking institutions, which must follow regulatory alerts from multiple sources, both international and federal.

The United Kingdom has 6 financial regulatory authorities, the European Union has 5, and the USA has 11 federal regulatory authorities. Each state in the USA has its own banking authority, and almost every country in the world has one (Wikipedia).

Regulatory rules from different banking authorities sometimes overlap or conflict.

Take an investment analyst who manages a nostro account and is analyzing a strategic model, or a marketing analyst who wants to start a mortgage campaign: are they allowed to access all the data to gain their insights? As you might expect, the answer is no, and this is where the regulatory rules come into effect.

Since the 2008 financial crisis, strict regulatory enforcement has brought cumulative financial penalties of roughly $321 billion (through the end of 2016). While US regulators have assessed most of the fines, their counterparts in Europe and Asia will likely step up the pace (Boston Consulting Group).

Data Access Authorization

It may take many months from the time a bank analyst issues an access request until it is fulfilled. Most of that time goes to the compliance team’s tedious and risky process of verifying that both the request and the requester comply with the regulations, obligations, standards, and organizational guidelines. Many requests for data are denied because of cost considerations: privacy breaches and data usage issues can cause significant financial losses through regulatory investigations and fines.

Denying a request for information is bad for business, and fulfilling it months after it was raised means the report is no longer relevant.

The data journey from raw data to the final report has three key elements that need to be considered:

  1. Does the requester have permissions to access the dataset?
    Coarse-grained authorization (RBAC and even ABAC) is not enough. Within the banking industry, a requester must have a specific role to be able to access a dataset, but other restrictions may apply. For example, the requester must also physically reside in the same region as the data in order to use it, and the request must be issued during normal business hours.
  2. Can the data be used for such a report?
    Fine-grained authorization is a must, and ideally it extends down to the cell level within the dataset. Authorizing a requester to a dataset while limiting the view to certain columns or cells of a data table has major benefits. Sometimes only limited access is allowed, and some of the data needs to be anonymized or de-identified to prevent a privacy breach. Each user should “see” only the relevant parts of the data.
  3. Could the analyzed report violate compliance?
    Even if a report is produced from data that has been cleaned to prevent privacy or other regulatory issues, the final report may contain information that causes a compliance breach.
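
To make the first two elements concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `Requester`, `DatasetPolicy`, `may_access`, and `restricted_view`, the 9:00 to 17:00 weekday window, and the sample data are all hypothetical and do not describe any particular product or regulation.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Requester:
    name: str
    roles: frozenset
    region: str


@dataclass(frozen=True)
class DatasetPolicy:
    required_role: str
    region: str
    visible_columns: frozenset
    masked_columns: frozenset
    # Assumed business hours; a real policy would take these from the rules.
    open_from: time = time(9, 0)
    open_until: time = time(17, 0)


def may_access(requester, policy, when):
    """Element 1: role, region, and business-hours checks must all pass."""
    return (
        policy.required_role in requester.roles
        and requester.region == policy.region
        and when.weekday() < 5  # Monday to Friday
        and policy.open_from <= when.time() < policy.open_until
    )


def restricted_view(rows, policy):
    """Element 2: keep visible columns, mask sensitive ones, drop the rest."""
    view = []
    for row in rows:
        out = {}
        for column, value in row.items():
            if column in policy.masked_columns:
                out[column] = "***"
            elif column in policy.visible_columns:
                out[column] = value
        view.append(out)
    return view


# Hypothetical policy, requester, and data for a mortgage campaign report.
policy = DatasetPolicy(
    required_role="marketing_analyst",
    region="EU",
    visible_columns=frozenset({"mortgage_balance", "branch"}),
    masked_columns=frozenset({"customer_name"}),
)
analyst = Requester("dana", frozenset({"marketing_analyst"}), "EU")
rows = [
    {
        "customer_name": "A. Smith",
        "mortgage_balance": 120000,
        "branch": "Lyon",
        "national_id": "X1",
    }
]

# A Tuesday morning request inside the assumed business hours.
if may_access(analyst, policy, datetime(2024, 3, 5, 10, 30)):
    print(restricted_view(rows, policy))
```

In this sketch the analyst sees the balance and branch, gets a masked customer name, and never receives the `national_id` column at all. The third element, checking the finished report itself, is not covered here.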

Data Journey and Beyond

In general terms, the data journey is the process of generating, storing, transforming, aggregating, and modeling data, and in the end visualizing it as a table, a graph, or whatever form is suitable.

But in a highly regulated industry, regulation imposes a major additional requirement when dealing with data: a record of the data usage history.

Which data was used when making a business decision?

Business decisions are based on insights from reports. Reports are based on raw data. Especially when dealing with large amounts of data through AI and machine learning, it is crucial to be able to investigate the history of a report, the raw data it was derived from, as well as where and when a specific report was used.
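
One way to picture such a usage history is a small lineage log that links each report to the raw datasets it was derived from and to the decisions it was used for. The Python sketch below is an assumption-laden illustration: `LineageLog`, its methods, and the dataset and report names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageLog:
    # report id -> set of raw datasets the report was derived from
    derived_from: dict = field(default_factory=dict)
    # list of (report id, decision, timestamp) tuples
    usages: list = field(default_factory=list)

    def record_report(self, report_id, sources):
        """Remember which raw datasets fed a report."""
        self.derived_from.setdefault(report_id, set()).update(sources)

    def record_usage(self, report_id, decision, when=None):
        """Remember where and when a report was used."""
        timestamp = when or datetime.now(timezone.utc)
        self.usages.append((report_id, decision, timestamp))

    def sources_for_decision(self, decision):
        """Answer: which data was used when making this decision?"""
        sources = set()
        for report_id, used_for, _ in self.usages:
            if used_for == decision:
                sources |= self.derived_from.get(report_id, set())
        return sources


# Hypothetical report, datasets, and decision.
log = LineageLog()
log.record_report("q3_mortgage_report", ["loans_raw", "branches_raw"])
log.record_usage("q3_mortgage_report", "launch_mortgage_campaign")
print(sorted(log.sources_for_decision("launch_mortgage_campaign")))
```

A production system would also need to persist this log and protect it from tampering, which this sketch does not attempt.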

Wouldn’t it be great if there were a system that could automatically create organizational policies based on regulation and guidance, attach the specific authorization to both users and data, and deliver the final report after taking all the restrictions into account?