Business Problem

  • Data is spread across multiple databases in multiple proprietary formats on a reporting server, which in turn feeds the Reporting Services Portal, Cubes, Power BI and the Actuarial Server.
  • These databases are individual silos – PMFIC does not have a data warehouse.
  • Ad-hoc queries are written against Reporting Server data for additional data requests submitted to IT.
  • Ad-hoc queries are written against Actuarial Server data for additional data requests submitted to Actuarial.
  • Data Governance standards and best practices are not yet implemented.
  • Data Cubes have had limited success in user acceptance.
  • The reporting team does not have an Enterprise Data Warehouse, so there is little to no reusable logic for building reports.

Solution

  • Pilot business use cases delivered new capabilities and insights rather than replicating existing data and reporting.
  • With aging systems and technology prevalent in the client's landscape, the EDW/Data Lake and its tools gave the client access to a rich pool of available talent to help it move forward. This contrasts with the aging COBOL processing system and the mismatched technology of the current data infrastructure.
  • The ecosystem design provided a proven, balanced repository of data, allowing for faster development of operational reports. All users of reporting data were entitled to the same views, which provided a single source of truth.

Tech Stack:

  • Cloudera CDP
  • Spark on Kubernetes
  • Spark
  • Scala / PySpark