PeopleSoft/SQR technical development PeopleSoft/SQR technical development PeopleSoft/SQR technical development
• Mail/phone
  Search
  Keyword:
  Tools
SQR Debugger
 
 
 
TECH TALK
Lightly technical observations on PeopleSoft and related topics
 

October 11, 2004

Scrambling Data

Do you ever have a need to hide personal information from users of your PeopleSoft database? This would be the case, for example, if certain developers should not have access to salary or other personal information, or if you want to clone a database for training purposes. As another example, I once wrote a program to "scramble" the data in a database for PeopleSoft's internal HR department, so that it could be used for a large-scale user test. How is this done?

Many organizations today recognize the risk of giving too much access to personal information. However, this must be balanced with the need to customize, support and test the system. The overall direction you take may be dictated by an internal audit department or by the IT, Legal, or HR department's policy. Work closely with them to define the scope of the project. If you decide to "scramble" sensitive data, you must carefully determine which types of data to change. Scramble too little, and you have accomplished nothing. Scramble too much, and the system could be nearly unusable for any support and testing.

Also, an HRMS/HCM database is more appropriate for this than, say, a financials database.

First, determine the needs of your developers and users. A developer who supports payroll processing, for example, needs full access to real salary and pay period data. At the other end of the scale, a temporary consultant working on a training module project may not need any access to salary at all. Can this requirement be met through the normal security mechanisms?

If you decide to continue, the next step is to determine the fields to be scrambled. Take the social security number (national ID), for example. There is rarely a need to know an actual SSN during development and initial testing. We could simply update the PERS_NID table, setting all SSNs to the same thing. However, this can interfere with testing of interfaces that use SSN as a "key". It may be better to set the SSN to a unique value that includes the EMPLID.

Salary-related fields are difficult to scramble, especially if you are running payroll or Payroll Interface. You could simply set each employee's annual salary to some arbitrary number, such as $50,000. However, the system also stores monthly and hourly pay; pay components; earnings, deduction and tax balances; and many other amouonts that should combine in a reasonable way. I've come to the conclusion that it's just about impossible to scramble salary figures meaningfully, so I generally just set everything to $1. It is then obvious to any user that these are scrambled numbers and the user should not expect them to "tie" to each other.

Note that there is nothing to prevent a user from "hiring" a new employee and giving that employee a reasonable salary for more realistic testing purposes.

When scrambling salary, don't forget about related fields such as salary grade. Knowing someone's grade is often about the same as knowing the salary itself.

An alternative method is to hide the identity of an employee rather than other personal information about the employee. To do this, you could leave the salary information alone and instead change the EMPLID, name, address, job title, and other identifying information. This is what I did for PeopleSoft.

Of course, if you decide to use this method, you must change every instance of EMPLID throughout the system. Otherwise, tables won't join accurately. Remember also that an employee ID field is not always called "EMPLID". You could find employee IDs in other fields--SUPERVISOR_ID, for example. How to find all instances of a particular field is another topic that I'll deal with here soon.

Once you have determined your approach and the fields to be changed, write a combination of programs and/or SQL scripts to accomplish the scrambling. Keep these in a safe place, since you will need to rerun them after every database refresh.

I believe that we should do everything we can to protect our data and to minimize unnecessary access, but scrambling data effectively is a significant effort, so plan carefully.

Until next time...







 

  HOME  |  ABOUT US  |  PRODUCTS  |  SERVICES  |  TECH TALK  |  LINKS  |  SQR  |  CONTACT
© 2003-2010 SparkPath Technologies, Inc. & its licensors. All rights reserved. Trademarks used are property of their respective owners. | Terms of Use