Incident(s) of the Week: Double Feature

Incident 1: UNC Data Breach Exposes Information On Over 100,000 Women Listed In Mammogram Registry

The University of North Carolina at Chapel Hill recently disclosed a data breach that exposed information on 160,000 women, including the Social Security Numbers of 114,000.  Original reports estimated that more than 200,000 women were affected.  The source of the breach was a computer intrusion into a server housing the Carolina Mammography Registry, which is "a 14-year-old project that compiles and analyzes mammography data submitted by radiologists across North Carolina."

Evidently, the breach was discovered in July, but it may have occurred over two years ago.  According to Matt Mauro, chairman of the UNC Department of Radiology, traces of computer viruses dating back to 2007 were found on the server.  The school delayed notifying those affected while it conducted a forensic investigation to determine exactly who was affected.  To this point, however, the school still does not know who committed the breach, where the attack originated, how the server (which had all required security measures) was breached, or whether any data was actually downloaded.

Incident 2: Massachusetts Inmate Pleads Guilty to Charges that He Hacked Prison Computer While Incarcerated, Accessed Personal Information On 1,100 Correctional Officers

On September 14, 2009, Francis G. Janosko pled guilty to charges that he hacked a legal research computer provided to inmates in the Plymouth County Correctional Facility.  A highly restricted computer terminal was provided to inmates for the sole purpose of allowing them access to legal research resources.  Janosko apparently circumvented security measures restricting the computer to legal research tools and obtained access to the administrator's username and password, the prison's internal network, and a report listing the names, birthdays, Social Security Numbers and contact information for 1,100 current and former prison personnel.  He also used the computer to send email and download publicly-available photographs and videos.

A grand jury in Boston indicted Janosko for these activities about a year ago in a sealed indictment (.pdf).  In the plea agreement (.pdf) recently reached with the U.S. Attorney's Office in Boston, federal prosecutors have agreed to dismiss the original charge of aggravated identity theft in exchange for Janosko's guilty plea to charges under the Computer Fraud and Abuse Act.  Janosko has agreed to accept an additional incarceration of 18 months for the hack.  Sentencing in the case is scheduled for December 15th.

Social Security Numbers (SSNs) Can Be Predicted Using Basic, Widely-Available Public Data. Social Security Administration Not Surprised, and Continues to Offer Detailed SSN Information to the Public

As has been recently reported, researchers from Carnegie Mellon University have announced that they have uncovered a method to accurately predict the Social Security Numbers (SSNs) of individuals by simply knowing two of the most basic and widely-available facts about people today: their dates of birth, and their States of birth. In their paper titled “Predicting Social Security Numbers from Public Data” (.pdf), researchers Alessandro Acquisti and Ralph Gross warn that they have uncovered a distinct and identifiable statistical pattern across SSNs of deceased persons – that, ironically, are made publicly available by the Social Security Administration (SSA or Agency) itself – and have used that pattern to accurately predict the SSNs of live Americans by simply knowing their birthdays and in which States they were born. In other words: “[A]ny third party with internet access and some statistical knowledge . . . [can deduce the pattern of SSN assignment] by analyzing publicly available records in the [Social Security Administration] Death Master File [and] interpolating an alive person’s state and date of birth with the patterns detected across deceased individuals.” 

What has received considerably less media attention, however, is the SSA's muted response to this fiasco and, in stark contrast to any concern for secrecy, the alarmingly broad set of explanatory guides and almost-complete SSNs that the Agency makes available to the public on its website.

While SSNs are the most ubiquitous personal identifiers today, and while their disclosure is a gateway to identity theft and other potentially disastrous mischief, the SSA's response to the report has been astoundingly nonchalant. Rather than provide any assurances over the integrity of its SSN assignment system, the SSA instead appears amused that the researchers are taking credit for “crack[ing] a code” that, in the SSA's words, has been “a matter of public record for years.”

The SSA is right: the Agency's website contains user-friendly guides that explain, in sometimes surprising detail, the SSN assignment system. The excerpt below, for example, is from the section helpfully titled “Structure of the Social Security Number (SSN)”:

The SSN consists of nine digits separated into three parts by hyphens (i.e., 000-00-0000) representing the area, group, and serial numbers.

1. Area Number The first three digits of the SSN are the area number. The area number reflects the State as derived from the ZIP Code in the mailing address the number holder provided on his/her application for an original SSN card.

 2. Group Number The middle two digits of the SSN are the group number. The group number ranges from 01 to 99, but group numbers are not released for SSN assignment in consecutive order. Instead, for administrative reasons, group numbers are released in the following sequence:

Odd numbers from 01 through 09; then even numbers from 10 through 98; then even numbers from 02 through 08; and finally, odd numbers 11 through 99.

3. Serial Number The last four digits of the SSN are the serial number. The serial number represents a straight numerical series of numbers from 0001-9999 within each group.
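The structure quoted above is mechanical enough to express in a few lines. The following is only an illustrative sketch in Python based on the excerpt as quoted; the function names group_release_order and split_ssn are hypothetical, not anything the SSA publishes, and the sketch describes the format only, not how any particular number was actually assigned.

```python
def group_release_order():
    """Return the 99 group numbers in the release sequence quoted above."""
    order = list(range(1, 10, 2))      # odd numbers 01 through 09
    order += list(range(10, 99, 2))    # even numbers 10 through 98
    order += list(range(2, 9, 2))      # even numbers 02 through 08
    order += list(range(11, 100, 2))   # odd numbers 11 through 99
    return order


def split_ssn(ssn):
    """Split a hyphenated 'AAA-GG-SSSS' string into (area, group, serial)."""
    parts = ssn.split("-")
    if len(parts) != 3:
        raise ValueError("expected three hyphen-separated parts")
    area, group, serial = parts
    if (len(area), len(group), len(serial)) != (3, 2, 4):
        raise ValueError("expected the pattern 000-00-0000")
    if not (area + group + serial).isdigit():
        raise ValueError("expected digits only")
    return area, group, serial


if __name__ == "__main__":
    order = group_release_order()
    # Print the first few groups in release order, zero-padded to two digits.
    print([f"{g:02d}" for g in order[:8]])
    # The placeholder pattern from the excerpt, not a real SSN.
    print(split_ssn("000-00-0000"))
```

The point of the sketch is that the release sequence covers every group from 01 to 99 exactly once, so a group number's position in the sequence indicates roughly how far along an area's assignments were when the number was issued.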

The SSA website also provides a chart that enables any lay person to make a very reasonable guess at the first 3 digits of any person’s SSN if that person’s State of birth is known, and to do so with extreme accuracy if the individual was born in one of the smaller States, such as Hawaii or Rhode Island.  Other resources include:

  • a list, updated monthly, of the Area Number and Group Number (or the first 5 digits of the SSN) that have been assigned each month beginning December, 2003;
  • instructions on how to access the Death Master File, which the researchers used to deduce the statistical patterns by which SSNs are assigned, as presumably any statistically savvy third party could; and
  • FAQs, directed to employers, that among other things let us know which SSNs are invalid: “no SSNs with an area number in the 800 or 900 series, or ‘000’ area number, have been assigned. No SSNs with an area number above 772 have been assigned in the 700 series.”
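To show how little effort it takes to apply the FAQ language quoted in the last bullet, here is a minimal Python sketch. It encodes only the rules as quoted in that excerpt at the time of writing; the function name area_ruled_out is hypothetical, and the thresholds should be treated as assumptions that may no longer match current SSA practice.

```python
def area_ruled_out(area):
    """Return True if a 3-digit area number is unassigned per the quoted FAQ."""
    if len(area) != 3 or not area.isdigit():
        raise ValueError("area number must be exactly three digits")
    n = int(area)
    if n == 0:
        return True            # the '000' area number has not been assigned
    if n >= 800:
        return True            # nothing in the 800 or 900 series has been assigned
    if 772 < n <= 799:
        return True            # nothing above 772 in the 700 series has been assigned
    return False


if __name__ == "__main__":
    for candidate in ("000", "001", "772", "773", "900"):
        print(candidate, area_ruled_out(candidate))
```

A check like this can only reject numbers that could never have been issued under the quoted rules; it says nothing about whether a number that passes belongs to a real person.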

To its credit, the SSA at least acknowledges the obvious reality that “the use of the SSN as a general identifier has grown to the point where it is the most commonly used and convenient identifier for all types of record-keeping systems and data exchanges in the U.S.,” and that identity theft associated with SSNs is a pressing concern. While the SSA has for decades refused to adopt the most obvious safeguard of completely randomizing SSN assignment, the Agency has finally announced that it is currently developing a system to randomize all SSNs beginning next year.  That system, however, would only apply to the assignment of new SSNs – and would in no way help the hundreds of millions of Americans alive today whose SSNs remain vulnerable. 
