LocalBlox, a US-based data technology company that “crawls, discovers, extracts, indexes, maps and augments data in a variety of formats from the web and from exchange networks” and ties it all together to create profiles on individuals that contain personal, business and consumer data for marketing purposes, has been found leaking information on tens of millions of individuals.
The discovery was made by UpGuard researcher Chris Vickery, who stumbled upon the unsecured Amazon Web Services S3 bucket holding the data, bundled in a single, compressed file. When decompressed, it revealed 48 million records in a format that’s easy for anyone to peruse.
“The sheer breadth of the exposed data includes such information as individuals’ names, physical addresses, dates of birth, scraped LinkedIn job histories, public Facebook data, and individuals’ Twitter handles. In addition, it appears the prominent real estate site Zillow is used in the process as well, with information being somehow blended from the service’s listings into the larger data pool,” UpGuard shared in a blog post.
Other types of information that may be included in the profiles are whether the person is married or single, owns a car, uses credit cards, and so on.
“The database appears to work by tracking an IP address, matching collected data to that IP address when able, and thus providing a clearer image of the behavior and background of the user at that IP address,” UpGuard noted.
“Also of interest are exposed source fields, providing some indication of where the scraps of data were collected from. Some are fairly unambiguous, pointing to aggregated content, purchased marketing databases, or even information caches sold by payday loan operators to businesses seeking marketing data. Other fields are more mysterious, such as a source field labeled ‘ex.’”
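To make the IP-matching idea described by UpGuard more concrete, here is a minimal Python sketch of how records from different sources might be folded into one profile per IP address. It is purely illustrative: the function name `build_profiles`, the field names (`ip`, `source`, `name`, `owns_car`) and the sample records are assumptions made up for this example, not LocalBlox's actual schema or code.

```python
from collections import defaultdict


def build_profiles(records):
    """Group records by IP address and merge their fields (hypothetical schema)."""
    profiles = defaultdict(lambda: {"fields": {}, "sources": set()})
    for record in records:
        ip = record.get("ip")
        if ip is None:
            continue  # no IP to match on, so the record cannot be attached
        profile = profiles[ip]
        for key, value in record.items():
            if key in ("ip", "source"):
                continue
            # Keep the first value seen for each field
            profile["fields"].setdefault(key, value)
        if "source" in record:
            # Remember where each scrap of data came from
            profile["sources"].add(record["source"])
    return dict(profiles)


# Made-up sample data using documentation-reserved IP ranges
records = [
    {"ip": "203.0.113.7", "name": "Jane Doe", "source": "social_scrape"},
    {"ip": "203.0.113.7", "owns_car": True, "source": "purchased_list"},
    {"ip": "198.51.100.4", "name": "John Roe", "source": "ex."},
    {"name": "No Address", "source": "social_scrape"},
]
profiles = build_profiles(records)
```

In this sketch, the two records sharing an IP address end up in a single profile that lists both of their sources, while the record without an IP address is left out because there is nothing to match it on.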
Who’s to blame?
It was easy for Vickery to pinpoint whom the exposed bucket belongs to, as the metadata in the header file pointed to LocalBlox.
Vickery contacted Ashfaq Rahman, co-founder and CTO of LocalBlox, who confirmed it and secured the bucket a few hours later. Rahman later claimed to ZDNet, however, that the bucket had been secured and that Vickery hacked into it.
Rahman did not say why he restricted the bucket’s permissions after he was contacted by Vickery, and said that most of the data found was fake and used for internal tests. He also claimed that no one besides Vickery had accessed the bucket and file in question.
Technically, all of this data was already public somewhere online, so it was not exactly secret. Facebook, LinkedIn, Twitter and the rest of the online services from which the data was scraped prohibit the practice, but their efforts to prevent it often fall short.
It has also become patently obvious that the prohibition means little to those who seek to monetize the data in question, as evidenced by the recent Cambridge Analytica revelations.