Friday, August 14, 2009

Tr.im's Brief Demise and the Privacy Implications of the Bit.ly Monopoly


In case you missed it, last Sunday the URL shortener Tr.im announced that it was going to close. Then, on Wednesday, it announced that it would keep the service open after all. I spent yesterday morning listening to the TechZing interview with Tr.im and Nambu founder Eric Woodward, which was interesting in a lot of ways. His perspective on having worked with a Chinese development team, though not relevant for this post, explains why development on Nambu (an OS X and iPhone Twitter client) has been slow: the Chinese developers left to get rich making iPhone apps. What I was more interested in was the discussion of business models for URL shorteners in particular and the Twitter ecosystem in general. It was the lack of any plausible business model for Tr.im that led to the decision to close the service. According to Woodward, there are only three plausible business models for a legitimate URL shortener:
  1. you can charge users
  2. you can sell advertising
  3. you can sell data that you generate
and given that Bit.ly has an inside track with Twitter and offers everything to users for free, none of the three business models was going to work for Tr.im.

Part of Tr.im's problem is that the cost of running a URL shortener scales with usage. (In a good web business, you have a high fixed cost and a cost that scales very sublinearly with usage.) Woodward says that he has to spend an hour a day just dealing with spam, and the problem is getting worse. Why is spam a problem for URL shortening? Spammers (and presumably phishers, too) use URL shorteners to hide links to porn, scams, malicious content, etc. Afflicted users report the links to the URL shortener's ISP host (Tr.im uses Rackspace), and the URL shortener will be shut down unless the spam links are turned off. Another problem for popular URL shorteners is popular twitterers. When Ashton Kutcher or Shaquille O'Neal tweets a link to his millions of followers, the URL shortener can suddenly be hit with 10,000 hits per minute. A read-only website can handle this traffic easily by spreading service over multiple servers, but a URL shortener such as Tr.im, architected to generate dynamic usage data by writing to a MySQL database, needs beefy hardware to avoid getting overloaded.
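To put a number on the write load: 10,000 hits per minute is roughly 167 database writes per second if every click is recorded individually. The sketch below shows one common way to soften that, by counting clicks in memory and writing them out in batches. This is an illustration only, not something Woodward described Tr.im doing; the class name ClickBuffer, the flush callback, and the max_pending threshold are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict


class ClickBuffer:
    """Accumulate per-link click counts and hand them off in batches.

    `flush` is a hypothetical callback standing in for a database write;
    it receives a dict mapping short code -> number of clicks.
    """

    def __init__(self, flush: Callable[[Dict[str, int]], None],
                 max_pending: int = 1000) -> None:
        self._flush = flush
        self._max_pending = max_pending
        self._pending: Counter = Counter()
        self._count = 0

    def record(self, code: str) -> None:
        """Note one click; write a batch once enough clicks are pending."""
        self._pending[code] += 1
        self._count += 1
        if self._count >= self._max_pending:
            self.flush()

    def flush(self) -> None:
        """Send any pending counts to the callback and reset the buffer."""
        if self._count == 0:
            return
        batch = dict(self._pending)
        self._pending.clear()
        self._count = 0
        self._flush(batch)


if __name__ == "__main__":
    writes = []
    buf = ClickBuffer(writes.append, max_pending=3)
    for code in ["abc", "abc", "xyz", "abc"]:
        buf.record(code)
    buf.flush()
    print(writes)
```

The trade-off is that clicks still sitting in the buffer are lost if the process crashes, so the usage data becomes slightly approximate in exchange for far fewer writes.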

In addition to the problems highlighted by Woodward, I see three core difficulties for URL shortener businesses:
  1. It's really easy to build a small-scale URL shortener. A good web developer could probably build one in a day. Because of this low barrier to entry, casual users are forever going to be able to find free URL shorteners.
  2. There are plenty of illegitimate (or at least annoying) business models for URL shorteners. These involve stealing traffic, stealing "Google juice", putting interstitial advertising in links, framing links, etc. This attracts entrants who make it harder for someone wanting to run a non-annoying business to attract paying users.
  3. Links need to be reliable, because if your shortener fails, the user doesn't get the content they are trying to access. In a lot of applications, links are meant to keep working forever. Reliability and persistence are expensive.
So basically, URL shorteners are a high cost, tiny revenue business.
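To make the first point above concrete, here is a minimal sketch of the core of a small-scale shortener: a counter encoded in base 62 and two lookup tables. It is an in-memory toy under stated assumptions, with no web server, persistence, spam handling, or statistics, and the names Shortener, encode, shorten, and resolve are my own, not taken from any real service.

```python
import string
from typing import Dict, Optional

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_letters


def encode(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class Shortener:
    """In-memory mapping between long URLs and short codes."""

    def __init__(self) -> None:
        self._by_code: Dict[str, str] = {}
        self._by_url: Dict[str, str] = {}

    def shorten(self, url: str) -> str:
        """Return the existing code for url, or assign the next one."""
        if url in self._by_url:
            return self._by_url[url]
        code = encode(len(self._by_code))
        self._by_code[code] = url
        self._by_url[url] = code
        return code

    def resolve(self, code: str) -> Optional[str]:
        """Return the long URL for code, or None if it is unknown."""
        return self._by_code.get(code)


if __name__ == "__main__":
    s = Shortener()
    code = s.shorten("http://example.com/a/very/long/path")
    print(code, "->", s.resolve(code))
```

Everything that makes a shortener expensive (reliability, persistence, abuse handling, traffic spikes) is exactly what this sketch leaves out.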

Nonetheless, URL shorteners are very useful in today's 140 character world. Woodward has concluded, and I agree, that there is no room for more than one URL shortener business in the Twittersphere, and that Bit.ly has won. Bit.ly thus finds itself with an odd sort of natural monopoly. Of the three plausible business models for Bit.ly, it seems to me that generating and selling data is the only one that would maintain the monopoly, and thus the business.

What kind of data might Bit.ly sell? I'll place my bet on "psychographic data". With its URL shortening monopoly, Bit.ly has access to a huge number of clicks. Bit.ly knows who I am, because I signed up for an account. Whenever I click a bit.ly link, my browser sends a cookie to Bit.ly, which it could use to track my interests and what I read. Aggregated over all the people who click Bit.ly links, the dataset of who clicked what could be very interesting to advertisers. Just as Google has the ability to tailor advertising to me based on my search history, Bit.ly could use a psychographic profile of me to help advertisers target their ads.

Bit.ly has an interesting advantage over Google, however. Because so many of Google's properties rely on users' perception of Google as a company that can be trusted with sensitive information, Google is quite limited in how far it can go in tracking users. Bit.ly is not inhibited in this way. In fact, Bit.ly's ability to profile users could make it even more attractive to people putting links into tweets. Bit.ly could even provide this sort of profiling without violating its privacy policy, which promises that it
... discloses potentially personally-identifying and personally-identifying information only to those of its employees, contractors and affiliated organizations that (i) need to know that information in order to process it on Bitly, Inc.'s behalf or to provide services available at Bitly, Inc. websites, and (ii) that have agreed not to disclose it to others. Some of those employees, contractors and affiliated organizations may be located outside of your home country; by using Bitly, Inc. websites, you consent to the transfer of such information to them. Bitly, Inc. will not rent or sell potentially personally-identifying and personally-identifying information to anyone. Other than to its employees, contractors and affiliated organizations, as described above, Bitly, Inc. discloses potentially personally-identifying and personally-identifying information only when required to do so by law, or when Bitly, Inc. believes in good faith that disclosure is reasonably necessary to protect the property or rights of Bitly, Inc., third parties or the public at large.
Ironically, URL shorteners could also be used in ways that enhance user privacy from a different direction. As I discussed in my post on the semantics of redirectors, most URL shorteners use HTTP redirects. Although it's browser-dependent, either 301 or 302 redirects will result in the originating page being sent in the referrer header. "META refresh" redirects, on the other hand, can be used to wipe (or replace) the value of the referrer header. (Unfortunately, these can also be used annoyingly to cause "referrer spam".)
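The two redirect styles differ in what the shortener sends back to the browser. The sketch below builds both responses as plain (status, headers, body) tuples so the difference is visible; the function names are hypothetical, and how a given browser fills in the referrer header after each kind of redirect is, as noted, browser-dependent.

```python
from html import escape
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def http_redirect(target: str, permanent: bool = True) -> Response:
    """An HTTP-level redirect: a 301 or 302 status plus a Location header."""
    status = 301 if permanent else 302
    return status, {"Location": target}, b""


def meta_refresh_redirect(target: str, delay_seconds: int = 0) -> Response:
    """An HTML-level redirect: a normal 200 page whose META tag
    tells the browser to load the target after delay_seconds."""
    safe_target = escape(target, quote=True)
    page = (
        "<!DOCTYPE html><html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe_target}">'
        "</head><body></body></html>"
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return 200, headers, page.encode("utf-8")


if __name__ == "__main__":
    print(http_redirect("http://example.com/article"))
    print(meta_refresh_redirect("http://example.com/article"))
```

With the HTTP redirect the browser never renders a page from the shortener; with META refresh it loads the shortener's own page first, which is what gives the shortener the chance to change what the destination sees as the referrer.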

Redirectors deployed for purposes other than URL shortening also have market-share related privacy implications. For example, the dx.doi.org redirector which handles most DOI traffic could be a very useful vantage point for industrial or technology espionage. Because this redirector serves so many scientific article links, a spy agency might be able to monitor everyone in the world doing research on nuclear fission or anthrax weaponization, to give two examples.

In preparing my post on privacy mechanisms for Google Book Search, I was struck by the many directions that someone intent on privacy intrusion could take to collect potentially sensitive information. Part of the skepticism I expressed about being able to "sell" privacy comes from a feeling that privacy as traditionally thought of is pretty much a lost cause on the internet, no matter what Google, Bit.ly, or anyone does. Somehow, traditional concepts of privacy need to be recast into something that people still value as they use the internet.