The Trojan Horse of Online Tracking — Amazon Inc.
Amazon, the company that once laid claim to the (now quaint and almost humble) bragging rights of being the “world’s biggest bookstore,” has branched out from this single-minded beginning into a queasily dominant player in many markets. It has proven exceptionally disruptive in many arenas, including the obvious ones of online retail, groceries, and data logistics. Today, Amazon’s control runs the gamut from Whole Foods Market to web hosting and content delivery. It’s in this last role that Amazon’s tendrils are most complexly intertwined with our data, and where privacy-invading practices are either happening or likely to happen.
It’s a complicated matter involving a content delivery network, known by its acronym CDN. The beta launch of Amazon’s CloudFront CDN occurred in November 2008, meaning the network has been working with websites large and small for over a decade. CloudFront is one part of Amazon Web Services (AWS), the company’s broader collection of hosting and cloud computing services.
The purpose of a CDN is simple: send requested website data to the consumer as quickly as possible. In practice, this means serving the data from the server geographically closest to where the request was made, so that it loads much faster. The more servers a company has, and the more widely they are spread, the more requests it can handle quickly across the globe. Amazon enjoys a healthy 40% CDN market share worldwide, making it a predominant player, and all the more likely to disrupt the data protection space.
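To make the "closest server" idea concrete, here is a minimal Python sketch. It assumes a made-up table of edge locations (the names and coordinates in `EDGE_LOCATIONS` are hypothetical, not CloudFront's actual list) and picks the one nearest to a visitor by great-circle distance. Real CDNs generally route on measured latency and network conditions rather than raw geography, so treat this as an illustration of the principle only.

```python
import math

# Hypothetical edge locations: name -> (latitude, longitude) in degrees.
# These are illustrative values, not an actual CloudFront server list.
EDGE_LOCATIONS = {
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
    "tokyo": (35.68, 139.69),
    "sao_paulo": (-23.55, -46.63),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(user_location, edges=EDGE_LOCATIONS):
    """Return the name of the edge location closest to the visitor."""
    return min(edges, key=lambda name: haversine_km(user_location, edges[name]))


if __name__ == "__main__":
    paris = (48.85, 2.35)
    print(nearest_edge(paris))
```

With this table, a visitor in Paris would be served from the Frankfurt entry rather than from Virginia or Tokyo, which is the whole point of spreading servers around the globe.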
Because of its sheer market share, Amazon handles a lot of data going to and from the servers it controls. Just last month, nearly 100,000 websites started using CloudFront’s services, which is about 40,000 more than Amazon’s next biggest competitor, Cloudflare. Between August 2012 and February 2019, worldwide usage jumped from just 9,000 websites to 2.2 million. Nearly every type of company is represented among AWS customers. Some popular websites that use CloudFront’s services include:
- Hulu (television content)
- Spotify (music content)
- Canon (enterprise camera company)
- Nextdoor (local social networking for homeowners and renters)
- Bandai Namco (gaming content)
It’s easy to see just how many different kinds of companies Amazon works with and can potentially collect data from. Because most of its web services deal with third parties, Amazon can share this information among them in order to further exploit the personal data of users across the internet, regardless of whether or not they use Amazon products directly.
Amazon already collects a ton of personal data from the users of its services and hardware, and there’s no easier way to track a consumer’s habits than by tying the websites they visit to certain ads or recommendations. Amazon grabs your device ID and matches it to your Amazon account, sending that info to a CloudFront-hosted website when you visit it. Everything from your email and phone number to your past purchases is fair game. You already agree to share this information with the company when you choose to use its services, so why wouldn’t it do so?
The potential consequences are palpable, as Amazon is becoming a bigger threat to consumer data than even Google or Facebook. You can shop on Amazon, host a website through it, and buy a Kindle. Your grocery list, website information, and favorite books are all known to the company. Once again, a large company is getting away with collecting the personal information of millions. Just one breach could spell disaster for consumers worldwide, sending their most personal data into the wild west of the internet; fair (free!) game for marketers and identity thieves. It won’t be surprising if we end up seeing the company in front-page news sooner rather than later.