Thread: Database Random Name Generator

  1. #1
    Senior Member
    Join Date
    Feb 2004
    Posts
    1,891
    Thanks
    218
    Thanked
    61 times in 53 posts
    • jonathan_phang's system
      • Motherboard:
      • Asus Rampage III Extreme
      • CPU:
      • i7 930 @ 4.2 ghz (200x21)
      • Memory:
      • 12GB Corsair XMS3 1600
      • Storage:
      • Crucial M4 128GB SSD + Misc Data Drive
      • Graphics card(s):
      • EVGA GTX 1080 FTW
      • PSU:
      • Corsair HX850 Modular
      • Case:
      • Antec 300
      • Operating System:
      • Windows 7 x64
      • Monitor(s):
      • Asus PB278Q (27" 2560x1440)
      • Internet:
      • Virgin Media 100mb

    Database Random Name Generator

    Hello,

    Quick one here - for those of you more versed in data anonymisation, do you know of a place where I can generate a large list of random names? I need to rename 8046 people, who are split by gender, so I would like to generate two CSV lists of x length.

    I can then just write a quick C# program to parse each pair and update the names in the DB.

    However, most places I find let me do about 30 at a time. Any ideas or better ways to achieve this?

    Thanks in advance

    JP

  2. #2
    Senior Member
    Join Date
    May 2007
    Location
    Cheshire
    Posts
    329
    Thanks
    16
    Thanked
    41 times in 25 posts
    • chadders's system
      • CPU:
      • Sony Vaio VGN-AW11Z/B
      • Operating System:
      • Windows 7 x64
      • Monitor(s):
      • Samsung 226BW + Belinea 17" lcd
      • Internet:
      • Be Pro - 14000/1200 (down/up kps)

    Re: Database Random Name Generator

    Probably missing the point, but could you not take the names 30 at a time (first and last), whack them in two tables and cross-join them (i.e. 30 first names x 30 last names = 900 first and last name combinations)?

    edit: you'd only need 90 first names and 90 last names to get 8,100 different names (90 x 90) - does it matter if you get every combination of "Smith"... John Smith, Mary Smith, etc...?
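
    As a rough illustration of the cross-join idea, here is a minimal Python sketch. The placeholder name lists, the `cross_join_names` and `to_csv` helpers and the CSV column names are all assumptions made for the example, not anything from the thread. For a gender split you would presumably run it once per gender with a different first-name list.

```python
import csv
import io
import itertools
import random


def cross_join_names(first_names, last_names, count, seed=None):
    """Return `count` distinct (first, last) pairs drawn from the cross join."""
    # Every first name paired with every last name, like a SQL CROSS JOIN.
    pairs = list(itertools.product(first_names, last_names))
    if count > len(pairs):
        raise ValueError(f"only {len(pairs)} combinations available, need {count}")
    # Shuffle so the output is not every "Smith" in a row.
    rng = random.Random(seed)
    rng.shuffle(pairs)
    return pairs[:count]


def to_csv(pairs):
    """Render the pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["first_name", "last_name"])
    writer.writerows(pairs)
    return buf.getvalue()


# Placeholder lists standing in for 90 real first names and 90 real surnames.
first_names = [f"First{i:02d}" for i in range(90)]
last_names = [f"Last{i:02d}" for i in range(90)]

# 90 x 90 = 8,100 combinations, enough to cover 8046 people.
fake_people = cross_join_names(first_names, last_names, count=8046, seed=1)
csv_text = to_csv(fake_people)
```

    Because the pairs come from a cross join of lists with no repeated entries, no full name is handed out twice, although each surname will still appear many times.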

  3. #3
    Not a good person scaryjim's Avatar
    Join Date
    Jan 2009
    Location
    Gateshead
    Posts
    15,196
    Thanks
    1,231
    Thanked
    2,291 times in 1,874 posts
    • scaryjim's system
      • Motherboard:
      • Dell Inspiron
      • CPU:
      • Core i5 8250U
      • Memory:
      • 2x 4GB DDR4 2666
      • Storage:
      • 128GB M.2 SSD + 1TB HDD
      • Graphics card(s):
      • Radeon R5 230
      • PSU:
      • Battery/Dell brick
      • Case:
      • Dell Inspiron 5570
      • Operating System:
      • Windows 10
      • Monitor(s):
      • 15" 1080p laptop panel

    Re: Database Random Name Generator

    Can you not use C# to access the database and SHA1 hash a concatenation of the first and last names? That'll give you pseudonymised data (as long as every person in the database has a different name)! Or do you need something that looks like an actual name?

    If you're not sure about generating a SHA1 hash from a String in C# I can post some sample code I've written for work (my work receives ~1.2 million personal sensitive data records a month, so pseudonymisation of data is something we are quite familiar with).

  4. #4
    Senior Member
    Join Date
    Feb 2004
    Posts
    1,891
    Thanks
    218
    Thanked
    61 times in 53 posts
    • jonathan_phang's system

    Re: Database Random Name Generator

    @chadders - yeah, ended up doing that temporarily and it worked fine... It's only a temp fix for now, but better than the character-scrambling script that we had before. Might look into a better way for larger sets.

    @Jim, unfortunately, the whole reason I got asked about this is that what we had before turned the names into nonsense. Mind you, if you don't mind, I would be interested in taking a look at the hash method you use, if only to see how others do things!

    Thanks for the input chaps

  5. #5
    Not a good person scaryjim's Avatar
    Join Date
    Jan 2009
    Location
    Gateshead
    Posts
    15,196
    Thanks
    1,231
    Thanked
    2,291 times in 1,874 posts
    • scaryjim's system

    Re: Database Random Name Generator

    No worries. Generating a SHA1 hash is easy in .NET as the libraries *should* be included with the default SDK.

    Use System.Text.Encoding.UTF8.GetBytes(String) to return a Byte array based on the name - you can just concatenate the first and last names, but we use an identifier comprising first and last initial, DOB and sex (M or F).

    Use System.Security.Cryptography.SHA1.Create().ComputeHash(Byte[]) - this creates a default SHA1 hash algorithm instance (SHA1 is a hash function rather than a cipher) then uses it to create the hash based on the Byte array you generated in the first step.

    Then, step through the array building a String (or using a StringBuilder) by calling the .ToString("X2") method on each Byte, which converts the byte into a 2-character hex string.
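
    For anyone who wants to see the three steps end to end without a .NET project to hand, here is a minimal sketch of the same sequence in Python. The `pseudonymise` function name and the example identifier layout are assumptions for illustration; the exact format of the identifier used at work is not given above.

```python
import hashlib


def pseudonymise(identifier: str) -> str:
    """Return the SHA1 hash of `identifier` as a 40-character uppercase hex string."""
    # Step 1: text to bytes, the counterpart of Encoding.UTF8.GetBytes(String).
    data = identifier.encode("utf-8")
    # Step 2: bytes to a 20-byte digest, the counterpart of SHA1.Create().ComputeHash(Byte[]).
    digest = hashlib.sha1(data).digest()
    # Step 3: each byte to two uppercase hex characters, the counterpart of Byte.ToString("X2").
    return "".join(f"{b:02X}" for b in digest)


# Hypothetical identifier: first initial, last initial, DOB, sex. The layout is an assumption.
token = pseudonymise("JS1980-01-31M")
```

    Python's `hexdigest()` would do step 3 in one call (in lowercase); the loop is kept here only to mirror the byte-by-byte approach described above.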

    Because the hash of a given message will always be the same, you can then use the hash as an identifier in the database without storing personally identifiable data, and match it to other datasets which can generate a hash of the same identifier. It allows data matching across sensitive data sources (e.g. matching police data to health data) without disclosing any personally identifiable information, which is important for data protection. The hash cannot be algorithmically reversed, although with sufficient computation power and knowledge of the structure of the data that was hashed you *could* determine the original message by regenerating hashes of candidate messages until one matches. By the sound of it, it's not relevant to the work you're currently doing, but if you ever get deeper into data anonymisation you may find it useful...
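
    To make the matching idea and the caveat concrete, here is a small Python sketch. The two datasets, their fields, the identifier layout and the `sha1_hex` and `recover` helpers are all made up for the example.

```python
import hashlib


def sha1_hex(identifier: str) -> str:
    """SHA1 of the UTF-8 bytes of `identifier`, as uppercase hex."""
    return hashlib.sha1(identifier.encode("utf-8")).hexdigest().upper()


# Two hypothetical sources that each store only the hash, never the identifier itself.
police = {
    sha1_hex("JS1980-01-31M"): {"incidents": 2},
    sha1_hex("AB1975-06-15F"): {"incidents": 1},
}
health = {
    sha1_hex("JS1980-01-31M"): {"admissions": 3},
    sha1_hex("CD1990-12-01F"): {"admissions": 1},
}

# Records match wherever both sources hold the same hash.
matched = {h: {**police[h], **health[h]} for h in police.keys() & health.keys()}


def recover(target_hash, candidates):
    """The caveat: anyone who can enumerate candidate identifiers can re-hash them to find a match."""
    for candidate in candidates:
        if sha1_hex(candidate) == target_hash:
            return candidate
    return None
```

    The `recover` function is why a short, predictable identifier offers limited protection on its own: the space of initials, dates of birth and sex is small enough to enumerate.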

  6. Received thanks from:

    jonathan_phang (08-06-2009)
