Breadth of Data: Why it Matters

The last decade has seen tremendous growth in the availability of identity theft protection and identity monitoring services. The breadth of data involved has been driven by the even faster-growing online presence of the world’s population. While the internet has been around for decades, how we use it changes every day. Fifteen years ago, most people did most of their shopping in-store, whereas today, e-commerce platforms such as Amazon dominate how we shop. Ten years ago, you might have found yourself in the back of a dimly lit bar hoping to meet your significant other; today, we can stay at home on the couch, drinking a beer we ordered online, swipe right and left on our online dating platform of choice, and find our soulmate. In the last couple of years alone, we’ve seen tremendous transformations: traditionally face-to-face business is conducted via online teleconferencing, classroom education has moved to the digital realm, we consult our doctors online, and many of us make money on digital entertainment platforms doing jobs that didn’t even exist a few years ago.

Okay, so we’re doing more things over the internet. So what?

The more we do online, the more of our personal data leaves our control. This isn’t a bad thing, as the potential downsides are easily outweighed by the convenience and technological advancements afforded by our digitized world. We simply need to be mindful of what’s out there. If you haven’t already, take a moment to read our blog about your digital exhaust, where we discuss how you leave behind digital breadcrumbs and, in part 2 of the publication, what to do about it. In short, the bits of your personal data you may not realize are floating in the digital ether are highly valuable to threat actors and, if exploited, can cost you money and your privacy. Thankfully, there are remedies for this problem, the most prominent of which is to enroll in an identity theft protection or identity monitoring service.

But not all identity theft protection services are created equal. You can read about what makes a great identity monitoring service here, which includes key factors such as data quality, volume, accuracy and global coverage, but today, we’ll discuss the importance of breadth of data.

What exactly is “breadth of data”?

Let’s begin by looking at a more traditional metric, “depth” or “volume” of data. This refers to “coverage”, or “does this data set include the data exposures that matter to me?” This is typically the first thing an identity data provider will implement, which is good, but unfortunately it is no longer “good enough” on its own. Many identity exposure data providers are still focused on what mattered most at the inception of ID protection: email addresses and passwords. Yes, these are important attributes to track, and they’re among the most prevalent, but they don’t paint a complete picture. A focus on just emails and passwords leads down a narrow avenue. What we need today is something broader, and so we arrive at breadth of coverage.

We’ve established how much of our data gets exposed, but is your identity theft protection service capturing it all? Breadth of coverage refers to not only identifying that your email address and password appear in a data breach, but also looking further and understanding that your telephone number, name, address and so on have been exposed as well, and that they’re linked to your email. While this seems trivial on the surface, there are a lot of complexities and intricacies that must be considered when capturing a broad swath of data. Constella has spent the last decade mastering this art.

Adequately Capturing Broad Data

Think back to the last several times you’ve signed up for a new website or filled out a digital form. You almost certainly provided an email address and made up a password, you probably gave your name and mobile phone number, and you likely even revealed something more obscure, such as your license plate number and the make, model and year of your car for a parking service. When a website captures such obscure data, it needs to designate a spot in its database for it; think of it as adding another column to a spreadsheet. In turn, when an identity monitoring data provider captures breached data, it needs to recognize these obscure data types and identify them. Without this identification, the additional data would be useless. Consider a 9-digit number captured from a breach without additional context: is it an international phone number, a social security number, a passport number, or something completely different? To generate value through breadth of data, we need to not only capture the broader data points, but also identify the type of data we’re dealing with.
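To make the 9-digit example concrete, here is a minimal Python sketch of how field identification might combine the shape of a value with the name of the column it came from. It is an illustration only, not a description of Constella’s actual pipeline: the patterns, type names, column hints and function names are all hypothetical and heavily simplified.

```python
import re

# Hypothetical, simplified patterns; real formats vary by country and source.
VALUE_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\d{9}|\d{3}-\d{2}-\d{4}"),
    "phone_number": re.compile(r"\+?\d{7,15}"),
    "passport_number": re.compile(r"[A-Z0-9]{6,9}"),
}

# Hypothetical hints: a substring of the column name suggests a data type.
COLUMN_HINTS = {
    "ssn": "us_ssn",
    "social": "us_ssn",
    "phone": "phone_number",
    "mobile": "phone_number",
    "passport": "passport_number",
    "email": "email",
}


def candidate_types(value):
    """Return every type whose pattern matches the whole value."""
    cleaned = value.strip()
    return sorted(
        name for name, pattern in VALUE_PATTERNS.items()
        if pattern.fullmatch(cleaned)
    )


def identify_field(column_name, value):
    """Pick one type using the column name; return None if still ambiguous."""
    candidates = candidate_types(value)
    lowered = column_name.lower()
    for hint, type_name in COLUMN_HINTS.items():
        if hint in lowered and type_name in candidates:
            return type_name
    # Without a usable hint, only accept a value that matches exactly one type.
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    print(candidate_types("123456789"))
    print(identify_field("col_7", "123456789"))
    print(identify_field("mobile_phone", "123456789"))
```

On its own, the bare value matches several types at once, so the sketch declines to label it. Only the surrounding context, here a column name, settles the question, which is why field identification deserves dedicated analysis.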

Constella recognizes over 250 data types and is constantly growing that list. We dedicate part of our data analysis and quality checks to field identification, so we’re able to identify less common identity attributes such as gender, weight, height, sexual orientation, hair color, eye color and more; we can classify attributes such as license plate numbers and vehicle identification numbers; and we recognize the name of the company you work for, job titles, the university you attended, your graduation year and salary information. As we grow the data types we support, we’ve recently added coverage for a variety of online profile IDs from over a dozen different platforms, coverage for gaming IDs (a “gamertag”) and even business-centric attributes like VAT numbers and company registration numbers. Most prominently, Constella has worked hard to maintain international coverage by recognizing the tax ID and national ID number formats of over 50 countries.

How does broad data help us?

In short, capturing data exposures at full breadth means we can see the complete picture. The totality of an individual’s or business’s exposures gives an in-depth perspective on the risk profile that person or business carries. Understanding the extent of your digital footprint allows you to anticipate where you may be vulnerable. These are the details that cyber criminals are capturing and using to their advantage. For instance, a data breach that exposes users’ automobile information allows a malicious actor to carry out targeted phishing attacks in bulk. Consider a generic phishing email that reads, “Dear sir or madam, this is your auto insurance company, your policy is about to be cancelled for non-payment; click here to fix this problem.” It doesn’t sound very credible and will be dismissed as a phishing email by many. On the other hand, imagine the same email but with personal details: “Dear Joe, this is your auto insurance company, Geico, letting you know that your policy ending in 1234 for your 2019 Toyota Camry is about to expire for non-payment…” That version is bound to trick quite a few people into giving up payment info to a fraudster. Having a complete picture can help you remain cognizant of the attack vectors a malicious actor may use against you.

Get Breadth, Volume and Quality Data from Constella

Constella not only captures and curates the full breadth of data exposed in a breach, but also meticulously verifies and validates the data we publish, ensuring you receive high-quality, high-confidence alerts that you can count on. Contact us today to see how Constella’s industry-leading data can power your solution.

Keon Ramezani

Sr. Sales Engineer