Thursday, October 04, 2007

To scrape or not to scrape?

Recently Zap2it, the company that had been providing XML-based TV listings for free to open source software such as MythTV, decided to shut down its free XMLTV service. The "official" replacement for the service is a non-profit startup, Schedules Direct, which now provides XML TV listings for a very reasonable price of $20/year.

I always have a hard time forking over money for something that I can get for free elsewhere, and in the wake of the labs.zap2it.com closure several enterprising individuals have developed programs that scrape TV listings from commercial websites. One of these is a Perl program called Zap2XML. I can report that the program works great, and there are even a couple of setup guides floating around that target MythTV users.

However, in my opinion, using the program poses a bit of a moral and ethical dilemma. When I signed up for a free Zap2it.com account, I agreed to their Terms of Service, which say, "You may not scrape or otherwise copy our Content without permission." Technically, that prohibits me from scraping their website. However, their TOS also says, "you may download or print a single copy of any portion of the Content solely for your personal, non-commercial use." I believe that even though they prohibit scraping their website for listings with a program like Zap2XML, I am probably OK as long as I use it only for "personal, non-commercial use."

I believe that their reason for prohibiting scraping is twofold. First, it prevents other websites from scraping Zap2it's TV listings (which they paid TMS a lot of money for), rebranding the content, and displaying it elsewhere. I am clearly not doing this. The second reason is an even bigger moral dilemma. Zap2it (like 90% of the rest of the internet) depends on advertising revenue to cover at least part of the cost of doing business, and when I use a program like Zap2XML to scrape listings from their website, I am obtaining content from them without viewing their advertising. Some people have argued that this is stealing. That leads into the even bigger question of the morality of online advertising and ad blockers such as Adblock Plus, which is a whole other ball of wax that I don't really want to get into right now. Suffice it to say that I believe there is some validity to the arguments on both sides of the issue.

I believe that this lost ad revenue is the second reason why Zap2it specifically prohibits scrapers. They are providing content that costs them money to produce. Additionally, there are bandwidth and server costs, and they believe that they have a right to recover part of those costs through advertising revenue. Now, I am not opposed to their collecting ad revenue from my viewing of their website, but I also feel that I have the right to select which content I view and how I view it, and so for now I am using Zap2XML.

Anyway, I just thought that I would throw some of this out there and see how my readers feel about it. Please feel free to post any and all comments or opinions that you might have on the morality and ethics of scraping.

4 comments:

Anonymous said...

You are a pirate.

Engineman said...

Ha ha, very funny N. Is that the worst that you can do?

Anonymous said...

LoL. I'm all about being honest, but I think I'd lean toward scraping. If that bears on your conscience then you could contact zap2it and cancel your contract with them or send them 20 smackers and enjoy the service. I guess you could also fly under the cover of "I don't know that zap2xml uses any info from zap2it therefore I am not breaking contract by using it"

Anonymous said...

I'm using zap2xml for my MythTV and it works well, except:

- The program IDs have changed format (perhaps proper format?) so repeat show detection isn't working.

- Some programs don't have the full details they did before. Some, like The Simpsons, have no real detail. The data is there on the zap2it site, but zap2xml doesn't pick it up.


Schedules Direct is offering a valuable service; kudos to them. I can afford $20 per month but choose not to deal with SD since I can get good enough listings for me with zap2xml.

Yeah, I violate copyrights on movies, shows, books and music too. So what; I have no sympathy for big business given that they have no sympathy for me. As for the "little people" that might get hurt a bit, well I'm a "little people" too and I get hurt all the time and "the system" doesn't care.

 
