EDW Regulations and Implications

Some of the regulations that needs to be considered while building an Enterprise Data Warehouse in North America are SOX and GLB of the US, Bill 198 and PIPEDA of Canada and PCI. A brief explanation and its implications are given below.

Regulatory Compliance

Sarbanes-Oxley (SOX) and Bill 198
: According to SOX, the CEO and CFO need to certify that the financial statements and disclosures of an institution are fairly presented, in all material respects, the operations and financial condition of the issuer. The management of the institution is responsible for establishing and maintaining an adequate internal control structure. An assessment of the effectiveness of the internal control structure and procedures need to be done at the end of the issuer’s fiscal year.

Bill 198 creates civil liability for the first time for continuous disclosure in the secondary market in Ontario, creating personal liability for directors and experts for misrepresentations and failure to make timely disclosure

Gramm-Leach-Bailey (GLB) and Personal Information Protection and Electronic Documents Act (PIPEDA)
: These regulations insure the security and confidentiality of customer records and information. They mandate institutions to protect against any anticipated threats or hazards to the security or integrity of such records. Access controls needs to be in place to protect against unauthorized access to or use of such records or information

Payment Card Industry (PCI)
: This regulation has everything to do with credit cards. An institution needs to be concerned of this regulation if their business uses credit cards and related systems. Some of the highlights of the regulation are –

  • Install and maintain a firewall configuration to protect data
  • Protect stored data
  • Develop and maintain secure systems and applications
  • Restrict access to data by business need-to-know
  • Assign a unique ID to each person with computer acess
  • Track and monitor all access to network resources and cardholder data
  • Regularly test security systems  and processes

Regulatory Compliance – Implications

The above regulations have a lot of implications that needs to be considered while designing an EDW.

  • Evaluate data usage requirements for all users. Create a data usage control policy that defines what, where, when, and how each type of data may be used by each user.
  • Record database activity and report on deviations from the data usage control policy.
  • Alert (and where appropriate block) user activity when a deviation from usage control policy represents a threat to data privacy or integrity.


  • Identify and prevent storage of sensitive data to demonstrate compliance.


  • Application controls to ensure the completeness of financial transactions
  • Provide comprehensive data security for databases used to generate financial reports.


  • Assure secure, stable and reliable performance of the data warehouse environment.
  • Ability to test security systems and processes.


  • Ensure that complete detailed information about user activity is gathered and maintained.
  • Ability to verify that only users with legitimate need have to data.

Access Control

  • Ensure that only appropriate individuals can alter or access critical information
  • Access controls within networks or applications to ensure that that only appropriate individuals can access or alter financial data.
  • Controls should be in place to examine live database traffic and create profiles of all legitimate activity for each user or application that accesses the database.
  • Enforcement of need-to-know access policy based on business activities.

Incident Management & Disaster Recovery

  • Incident management to address responses and initiates continuous improvement.
  • The system should have the ability to implement controls by notifying administrators about suspicious activity in real time, and even preventing known malicious activity as it occurs in the IT environment.
  • Network Controls
  • Network firewall and IPS should be in place to protect data warehouse.
  • Web Application firewall that provides a layer of defense to protect applications.


  • Permit ongoing real-time monitoring and enforcement of corporate and IT controls.
  • Logging permits ongoing real-time monitoring of systems, evidence of security vulnerabilities, evidence for forensic investigation (audit trails), and incident investigation.
  • Ability to audit access of sensitive data and alert administrators of suspicious activities.
  • Monitor and report shared user accounts and other potential user account violations.
  • All notifications of security incidents to Information Security should be documented and contain explicit, clear and concise information.

EDW Security Considerations

An Enterprise Data Warehouse (EDW), according to California State University, is a collection of data that can be defined and shared across the whole enterprise along the lines of common dimensions to be used for analysis. While you are in the designing phase of an EDW, there are certain security and related functional requirements that needs to be considered.

Security Services Requirements

Authentication: The primary purpose of an EDW is business intelligence. The information contained in it could be confidential to restrictive in nature and will be used for formulating future market of the company.  The EDW needs to have an authentication services that assures the identity of the user or system seeking access to the EDW. Authentication services can be implemented using passwords, tokens, biometrics (e.g., fingerprint readers), and encryption.

Access Control: Access control services needs to be in place assuring that people, computer systems, and processes can use only those resources (e.g., files, directories, computers, networks) that they are authorized to use and only for the purposes for which they are authorized. Access control mechanisms can be identity based (e.g., UNIX protection bits, access control lists), label based (also known as mandatory access controls), or role-based (implemented as a combination of the above, plus system privileges). Access control plays an important role in protecting against illegitimate use and in providing confidentiality and integrity protection.

Confidentiality: A typical EDW would contain Personal Identifiable Information (PII) and other sensitive information related Payment Card Industry (PCI). An EDW needs to implement services that protect sensitive and private information from unauthorized disclosure. Confidentiality services are generally implemented using encryption.

Integrity: Since EDW are used for long time purpose specifically for business intelligence, it needs to implement services that assure that data, computer programs, and system resources are as they are expected to be and that they cannot be modified by unauthorized people, software, or computer equipment. Mechanisms for implementing data integrity include cyclic redundancy checks and checksums, and encryption. The type of controls assuring system integrity includes physical protection, virus-protection software, secure initialization mechanisms, and configuration control.

Attribution: “You should be able to identify the person who shot you and not just the gun”. An EDW needs to implement services that assure actions performed on a system are attributable to the entities performing them, and that neither individuals nor systems are able to repudiate their actions. This is possible through audit trails, encryption, and digital signatures.

Availability: An EDW is usually an offline repository which is used whenever there is a need for it. It need not be part of a live or online system providing real team information. The availability is not that critical as a database supporting a live system. However, today some EDW are designed to support real time.  Services need to be in place in the EDW assuring systems, applications, and data are available when they are needed. Considerable efforts may be required to safeguard data and critical system services, ensuring that correct and complete information is available. Architects needs to ensure that IT services, utilising EDW, needs to be available for authorized individuals to deliver and process reliably. An important requirement of any privacy protection schema is to ensure that critical data and services are available at all times. The mechanisms providing availability include fault-resilient computers, virus protection software, and RAID (Redundant Array of Inexpensive Disks) storage.

Security Functional Requirements

The security functional requirements that are mentioned below depends on the business requirement and the type of business the EDW is going to support.

Anonymity: While designing a system, sometime there could be business requirements that mandate to protect identity of a user using a system while maintaining audit trails of the means of accessing the system. Here a separate system used by the user could be the subject accessing the system. In case of such requirements for anonymity, an EDW needs to provide protection of the user identity. Anonymity is not intended to protect the subject identity.

: There are use cases where a user or a system seeking access does not want to disclose their identity. This is a rare requirement and depends on the business that is supported. Pseudonymity ensures that a user may use a resource or service without disclosing its user identity, but can still be accountable for that use.

Unlinkability: A subject may various resources for completely different purposes. Unlinkability ensures that a user may make multiple uses of resources or services without others being able to link these uses together. This is a good case for an EDW having wealth information that could be used for different purposes.

Unobservability: In an enterprise where there could be multiple departments, one department may not want the other to know that it is accessing certain resource. Unobservability ensures that a user may use a resource or service without others, especially third parties, being able to observe that the resource or service is being used. This scenario is a good case where the business is to provide EDW services for multiple competing departments or even companies.

Production Data as Test Data

In order to maintain high quality of code, a company needs to use production quality source data for development, unit test and QA functional test purposes. There could be situations when the company uses unscrambled production source data, which potentially exposes customer sensitive data. Customer sensitive data must be protected. Given that there is a correlation between the quality of test data and the quality of code delivered to production, all efforts should be made to minimize the disruption/distortion of test data, while satisfying the privacy concerns.

Try to desensitize data that is brought down to Development from Production while maintaining its quality such as referential integrity between files/tables/entities that needs to be maintained. Some projects, such as fraud detection, need to maintain meaningful data in fields such as ‘name/address/postal code’ so that patterns and groupings can be detected. All fields are within some field specific domain. The field domain or context cannot be specified ahead of time and may vary with projects. Some projects may need to maintain certain relationships of the field.

Security Guidelines

  • Data that would or deemed to have very serious or significant impact, if exposed, on confidentiality of a customer or an entity should be decoupled. The decoupling of data should be accomplished in such a way, so that after treatment it will not be possible to trace back the sensitive customer data to their real owners.
  • Data that would or deemed to have very serious or significant impact after decoupling process, if exposed, on confidentiality of a customer or an entity should be masked.

Separation of Duties

  • A developer shall determine and request the data that needs to be downloaded to development, however the request has to be reviewed and approved by a person responsible for the data.
  • The personnel executing the extraction and transformation of the data should not be the requester (or developer) and should have the approval for the execution from a person responsible for the data.
  • The above mentioned approvals will be obtained on a per project basis and not per request.

Maintaining and Protecting Referential Integrity

  • Referential Integrity shall be maintained to the records downloaded to development; however the key that maintains referential integrity may help in identifying a customer in production and should be protected. This option is possible only if the source system start protecting it.
  • Developers who have access to downloaded files should not have access to production, since the key that maintain the referential integrity may help identify a customer and associated details in production. In some cases, this option is not practically possible since developers may need to have read access to production for triage purpose.

Audit trail

  • Appropriate mechanism should be in place to properly demonstrate trail of activities (including approval) that led to the execution of a particular extraction.