Unravelling the Mystery of Bad NAP Data
Incorrect business information is encountered online far more often than correct information, and one of the main reasons is the way business data flows through the ecosystem.
Problems with the Business Data Exchange System
Business data flows in one direction: from a small oligopoly of business data providers to an enormous number of online platforms, ranging from GPS navigation systems, through Internet yellow pages (IYPs), to mobile applications of every sort. Unfortunately, the technical and data-exchange relationships between the data providers and the data receivers are not always as straightforward as anyone looking for correct business information online would like them to be. There are two main issues:
1) Business data is not transferred in real time.
Normally, the data receivers are “fed” the data or they “pull” it at set intervals, most frequently between 1 and 6 months long. This means that if a record has been updated in a business data provider’s database, the change might reach a data receiver (for instance, a GPS navigation system) no sooner than 1 month after it was made at the source.
2) Business data updated in the source is not always “mapped” to the old business data in the public platform.
This could most easily be explained with an example:
LocalEze is one of the most important business data providers in the US. They provide business information to a number of online platforms, one of which is MerchantCircle.com.
We do not know exactly how frequently MerchantCircle receives or pulls data from LocalEze’s database, but for the purposes of this example we will set the cycle at 45 days. Here is the example itself:
On January 1, new business information from an official government source is added to LocalEze’s database: a listing for Bob’s Painting with an incorrect phone number.
This information is made available to the “data receivers”, and MerchantCircle.com gets it on February 15. In the meantime, however, on January 25, the business owner (Bob) claims the listing on LocalEze and updates it with his correct phone number, so the LocalEze listing now features the new phone number.
Unfortunately, MerchantCircle.com don’t recognize (due either to the way LocalEze provides the new data or to imperfections in their own data clustering system) that this record is the same business they already know about: everything matches the existing listing except the phone number. Thus, on March 10, a new listing appears on MerchantCircle.com, and there are now two listings for Bob’s business on their site, one with the old phone number and one with the new.
Here’s a summary of the order of events:
- January 1 – The original listing is added to LocalEze
- January 25 – The phone number on the original listing on LocalEze is edited
- February 15 – The original listing finally lands on MerchantCircle (it’s a 45-day cycle)
- March 10 – The edited information lands on MerchantCircle. However, because MerchantCircle suck at matching and de-duping info, they create a separate listing (duplicate), instead of editing the original listing from February 15
Now Bob has a problem he doesn’t even suspect he has.
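To make the matching problem concrete, here is a minimal Python sketch of how a data receiver might ingest updates. It is not how MerchantCircle or LocalEze actually work; the `Listing` fields, the address, and both phone numbers are made up for illustration. It compares a naive match on name, address, and phone with a looser match on name and address only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    name: str
    address: str
    phone: str


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial differences don't block a match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def exact_key(listing: Listing) -> tuple:
    """Naive key: name, address AND phone must all agree."""
    return (normalize(listing.name), normalize(listing.address),
            re.sub(r"\D", "", listing.phone))


def name_address_key(listing: Listing) -> tuple:
    """Looser key: the phone number is treated as an attribute that may change."""
    return (normalize(listing.name), normalize(listing.address))


def ingest(existing: list, incoming: list, key) -> list:
    """Merge incoming records: overwrite a record with the same key, otherwise append."""
    merged = list(existing)
    index = {key(item): pos for pos, item in enumerate(merged)}
    for item in incoming:
        k = key(item)
        if k in index:
            merged[index[k]] = item      # same business: update in place
        else:
            index[k] = len(merged)       # unknown business: new listing
            merged.append(item)
    return merged


# Hypothetical data: the address and phone numbers are invented.
old = Listing("Bob's Painting", "123 Main St, Springfield", "(555) 010-0100")
new = Listing("Bob's Painting", "123 Main St., Springfield", "(555) 010-0199")

february = ingest([], [old], exact_key)
march_naive = ingest(february, [new], exact_key)            # two listings: a duplicate
march_matched = ingest(february, [new], name_address_key)   # one listing, new phone
```

Matching on name and address alone has its own risks (two different businesses can share an address and a similar name), which is part of why clustering business data is hard to get right.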
The situation described above has one more downside: because the record on LocalEze has been updated, it might be very difficult to track down the original source of the issue. There is no publicly available information about “Old Phone Numbers” related to the business record of Bob’s Painting on LocalEze. Therefore, a specialist whom Bob might hire in the future to help with his local SEO might have a difficult, if not impossible, time determining where all these incorrect listings on MerchantCircle.com and other sites originated.
I came to truly realize the complexity of this issue a few days ago when Andrew Shotland, one of the most reputable local SEO specialists and a professional whom I have great respect for, posted an article about some research he had done for one of his clients’ online citation profiles. In it, Andrew concludes that a listing on Factual is the source of the bad data that’s seeding the listing at Google. Here is the relevant information from Andrew’s article:
A person named Alexander Jubb “considered joining” a dental practice located at “804 Carlsbad Village Dr, Carlsbad, CA”, which used the phone number “(760) 739-8500”. According to Andrew’s article, Alexander “never did” join the practice. However, Andrew discovered a listing on Google Maps, and then (using his NAP Hunter) a few citations on the web, in which Alexander Jubb’s name was mentioned together with the practice’s business details. He wondered about the source of this inaccurate information and focused his research on listings on four sites: UCompareHealthcare, Angieslist, Citysearch, and Factual.
The problem: none of these four sites is a direct data provider for Google.
How to figure out the actual source?
I knew the following before I started digging:
- There are two official, recognized business data sources for the US for Google Maps – Acxiom, and Infogroup (ExpressUpdate) (see here for reference).
- The only place where Citysearch gets their business data from is Infogroup (see their FAQ for reference).
I had a match! Infogroup supplies data both to Citysearch and to Google Maps. However, I knew that Infogroup do not supply data to Factual, and they probably do not supply data to ucomparehealthcare.com either. I went hunting for a few more citations and was lucky enough to find two more listings on dental directory sites.
These two listings are essentially part of the same network of business directories, so the data on the two of them is identical.
If you click through to see the listings you will notice that the NAP there is different from what we are looking for: they feature the current business information for Dr. Jubb.
At first glance, this was discouraging. However, if you take a closer look at the URLs you will notice that they contain the business name, address, and phone number as they were originally shown on the listings.
Additionally, the number at the end of each URL is the NPI number of the practitioner. According to the network’s “Data Lists” page, they receive all their data from the NPI registry, so the source of this information should be NPI record number 1326282690.
The NPI record page currently features the same business information as the two listings on dental-yellow.com and dentists-directory.info. The NPI registry is one of the most trustworthy data sources in the medical industry, so the chances that this was the initial source of the bad data are very high. What is of particular interest is the fact that the NPI record was last updated on December 15, 2011, more than two and a half years ago, and there are still listings online that feature that outdated information.
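Spotting an NPI number at the end of a URL can be partly automated. The sketch below assumes the identifier is the last 10-digit run in the URL, and then checks it with the NPI check-digit rule (the Luhn algorithm applied after prepending the prefix 80840). The URL in the example is a made-up placeholder, not the actual listing URL.

```python
import re
from typing import Optional

NPI_PREFIX = "80840"  # prepended to a 10-digit NPI before the Luhn check


def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_valid_npi(candidate: str) -> bool:
    """True if the string is 10 digits and passes the NPI check-digit test."""
    return bool(re.fullmatch(r"\d{10}", candidate)) and luhn_valid(NPI_PREFIX + candidate)


def npi_from_url(url: str) -> Optional[str]:
    """Return a trailing 10-digit number if it is a plausible NPI, else None."""
    match = re.search(r"(?<!\d)(\d{10})/?$", url)
    if match and is_valid_npi(match.group(1)):
        return match.group(1)
    return None


# Hypothetical URL shape: only the trailing number comes from the article.
example = "https://directory.example/some-practice-slug-1326282690"
print(npi_from_url(example))
```

A passing check digit only says the number is well formed. Whether it belongs to the practitioner in question still has to be confirmed against the NPI registry itself.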
It’s easy to imagine a scenario in which Infogroup picked up that information and added it to their database, and the information then spread across their network of “data receivers”. Later, after the data had already gone out to their subscribers, Infogroup updated the record (they make 100,000 data verification phone calls per day), but the old information stuck on some of the receivers’ sites, as in my LocalEze and MerchantCircle example above.
The last question left was – does Factual source data for practitioners from the NPI registry? Fortunately, Factual share statistics about their database of healthcare providers directly on their site. According to the stats there are only two “attributes” that are available in every single (100%) healthcare provider record – category, and National Provider Identifier (NPI). Boom!
To recap, here is how I came to the conclusion that NPI is the original source of the bad data that made its way to Google Maps:
Andrew found listings on UCompareHealthcare, Angieslist, Citysearch, and Factual. However:
- None of these feeds data directly to Google
- Citysearch receives data only from Infogroup (ExpressUpdate)
But I was left with the remaining questions:
- Where did Infogroup get the data from?
- Where did Factual get the data from?
I did some additional research, and found two listings, one on dentalyellow.com, and one on dentists-directory.info. Their URLs gave me new clues:
- The outdated information for the dentist originally reached those sites from the NPI registry, because these sites receive business data only from there
- The information has since been updated at the source
This led me to believe that the information originally came from the NPI registry, but was updated no later than 15 December 2011 (as per the NPI registry log).
I was left with a new question:
- Do Infogroup and Factual source data from the NPI registry?
I did some additional research on where Infogroup and Factual obtain their business data:
- It is not certain that Infogroup obtain data from NPI, but it is quite possible, given the prominence and trustworthiness of the NPI registry in the medical niche
- Factual seem to source all their data in the healthcare niche from the NPI registry
CONCLUSION: The data came to Factual (and probably Infogroup) from the NPI registry! Phew!
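The reasoning in this recap amounts to walking a “who feeds whom” graph backwards. Here is a rough Python sketch of that idea, using only the relationships mentioned in this article. The Infogroup to NPI registry link is an assumption (it is likely but not confirmed), and the function names are invented for illustration.

```python
from collections import deque

# receiver -> providers it takes data from, as described in the article
FEEDS = {
    "Google Maps": {"Acxiom", "Infogroup"},
    "Citysearch": {"Infogroup"},
    "Factual": {"NPI registry"},
    "dentists-directory.info": {"NPI registry"},
    "Infogroup": {"NPI registry"},  # ASSUMPTION: plausible, not confirmed
}


def upstream_sources(platform: str, feeds: dict) -> set:
    """All direct and indirect providers behind a platform (breadth-first walk)."""
    seen = set()
    queue = deque([platform])
    while queue:
        current = queue.popleft()
        for provider in feeds.get(current, ()):
            if provider not in seen:
                seen.add(provider)
                queue.append(provider)
    return seen


def common_origins(platforms: list, feeds: dict) -> set:
    """Providers that sit behind every platform and have no known provider of their own."""
    if not platforms:
        return set()
    shared = set.intersection(*(upstream_sources(p, feeds) for p in platforms))
    return {source for source in shared if not feeds.get(source)}


sites_with_bad_data = ["Google Maps", "Citysearch", "Factual"]
print(common_origins(sites_with_bad_data, FEEDS))

# Without the assumed Infogroup link, no single origin explains all three sites.
confirmed_only = {k: v for k, v in FEEDS.items() if k != "Infogroup"}
print(common_origins(sites_with_bad_data, confirmed_only))
```

The second call shows how much the conclusion leans on the unconfirmed link: drop it and the graph no longer points to a single origin for the Google Maps listing.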
This just goes to show how bad NAP data can make its way into the ecosystem and almost completely cover its tracks as it gets distributed. It’s a crazy world out there, but maybe not as crazy as the local search ecosystem. 😉