Can we make an Open Source IMDb replacement?

greenspun.com : LUSENET : Robot Wisdom : One Thread

Jorn asked:
Before the Internet Movie Database (IMDb) was bought by Amazon, it was the showpiece of Net.cooperation. Does anyone know how its volunteers reacted to the purchase? It seems a pale shadow of its old energy, these days.

I know how I reacted to it: anger and outrage. The purchase and subsequent redesign and quiet removal of features (such as the huge tally of film subgenres) was completely clueless and irresponsible.

When they first made the switch, I timed some casual lookups as well as detailed searches from both the new site and a mirror site that hadn't yet switched over. The new site took more than 2x as long for me to get the same info (because of all the images, the weird layout, and the extra clicks to get to the real data). The new design was not even available as an alternate, and the IMDb "community" had no say in the matter (yet alt.cult-movies and other film groups were full of criticism on the new design).

The new design is more like an advertisement than a collaborative group creation.

But I should have known better; corporations can only be responsible by chance and not choice. I don't think that Net.cooperation is possible under the auspices of Amazon, eBay, epinions or any of these other legal structures. We need an Open Source, net.community (non-corporate) replacement for IMDb -- the Free Film Database? (ffdb.org is open, as is freebay.org and freepinions.org.)

I was able to download and archive a copy of the entire db, but I think a clean-room reimplementation may be required because of copyright restrictions. (You were free to download a copy for your own use, at least before the buyout, but you weren't free to share your results with others.)

-- Michael Stutz (stutz@dsl.org), November 01, 1999

Answers

I am not intimately familiar with the original copyright restriction clause, but it is most likely that the user ratings and comments could not be reproduced in any way at another location. As we all know, user input was, and continues to be, an integral part of the IMDb. Therein lies the value of the site.
Certainly it would be possible to recreate an Open Source equivalent of the site. At least part of the database could be legally used and could provide the backbone for the new site.

With the right combination of MySQL, PHP/Perl, it seems like a very doable project if divided among a few people. I would be interested in hearing more from those who would like to start such a site. I too am thoroughly disappointed with the new bloated and feature poor design.
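
As a rough illustration of what such a backbone might look like, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (film, person, credit) are hypothetical; they are not taken from the IMDb data files.

```python
import sqlite3

# Hypothetical schema: these table and column names are illustrative
# assumptions, not the structure of the IMDb data files.
SCHEMA = """
CREATE TABLE film (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    year  INTEGER
);
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE credit (
    film_id   INTEGER NOT NULL REFERENCES film(id),
    person_id INTEGER NOT NULL REFERENCES person(id),
    role      TEXT NOT NULL
);
"""


def build_db(path=":memory:"):
    """Create the tables in a new database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def film_id(conn, title, year):
    """Return the id of a film, inserting it if it is not there yet."""
    row = conn.execute(
        "SELECT id FROM film WHERE title = ? AND year = ?", (title, year)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO film (title, year) VALUES (?, ?)", (title, year)
    )
    return cur.lastrowid


def person_id(conn, name):
    """Return the id of a person, inserting them if they are not there yet."""
    row = conn.execute(
        "SELECT id FROM person WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
    return cur.lastrowid


def add_credit(conn, title, year, name, role):
    """Record that a person worked on a film in the given role."""
    conn.execute(
        "INSERT INTO credit (film_id, person_id, role) VALUES (?, ?, ?)",
        (film_id(conn, title, year), person_id(conn, name), role),
    )


def credits_for(conn, name):
    """List (title, year, role) for one person, oldest film first."""
    return conn.execute(
        """
        SELECT film.title, film.year, credit.role
        FROM credit
        JOIN film ON film.id = credit.film_id
        JOIN person ON person.id = credit.person_id
        WHERE person.name = ?
        ORDER BY film.year, film.title
        """,
        (name,),
    ).fetchall()
```

Calling add_credit(conn, "Example Film", 1999, "Jane Doe", "director") and then credits_for(conn, "Jane Doe") would give that person's filmography. Ratings, comments and plot summaries would each need their own tables.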

Jorn: if you can, send me a copy of the db--I would like to see exactly how it was structured. (please send it to me@witold.org)

-- Witold (witold@rnchq.org), November 02, 1999.

The databases themselves are rather large. Just the actors db was 20 megs compressed, 49 megs uncompressed, and this was back in May of this year when I last downloaded a copy. All told, the various dbs came to nearly 100 megs compressed.

The section that is relevant to our discussion about the copyright restrictions is as follows:
 3. Specifically the files may NOT be used to construct
    any kind of on-line database (except for individual
    personal use). Clearance for ALL such on-line data
    resources must be requested from Internet Movie
    Database Ltd


-- Kip DeGraaf (kip@monroe.lib.mi.us), November 02, 1999.

Witold's right, we probably couldn't use the current ratings and recommendations. But I suspect that the plot descriptions and comments could be reused (by individuals who consent to this) -- when you contribute to the IMDb you don't sign a formal copyright assignment statement.

You can get the plain text data files by ftp:
http://us.imdb.com/interfaces#plain

Does anyone have access to a listserv where we can set up a discussion list for this topic?

-- Michael Stutz (stutz@dsl.org), November 03, 1999.


I would very much be interested in such a product as well, and would appreciate any inquiries from someone interested in doing the maintenance.

I am willing to pay initial costs to serve the site, as well as do the backend programming (I have a community framework, based on data models from ACS and done in PHP/MySQL, that would work perfectly) - but have no time or energy to devote to maintenance of the data.

If you have ideas for the features you'd like the community to have, and if you have the resources (basically a group of volunteers) to transcribe and update data - then I'm all for doing it and keeping it commercial free.

Feel free to email: mailto:hboutwel@ix.netcom.com

-- Heath Boutwell (hboutwel@ix.netcom.com), November 21, 1999.


Would it be possible to create a system like the old CDDB, but for movies on DVD?

I hope it's not too far-fetched, but surely a copy of DeCSS and an OCR package could be hooked together to parse the credits at the end of the DVD and send them to a central server. Anyone with the software would be helping to build open IMDB themselves.

Once DeCSS has worked its magic, we would need software that converts the scrolling text into a single (if very tall) image. Running that through OCR shouldn't take too long. My guess is that the entire process from inserting the DVD to sending the data to the server could happen in two minutes or less.
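
To make the "single tall image" step concrete, here is a minimal sketch of one way the stitching could work. It is only an idealized model: each frame is a list of pixel rows, all frames have the same height, the credits scroll straight up, and consecutive frames overlap exactly (no compression noise). The function names find_shift and stitch are made up for this example.

```python
def find_shift(prev, cur):
    """Return how many rows the picture scrolled up between two frames.

    Tries the smallest shift first, i.e. the largest overlap between the
    bottom of the previous frame and the top of the current one.
    Assumes both frames have the same height and match row for row.
    """
    height = len(prev)
    for shift in range(height + 1):
        if prev[shift:] == cur[:height - shift]:
            return shift
    return height  # not reached: shift == height always matches


def stitch(frames):
    """Join scrolling frames into one tall image (a list of rows)."""
    if not frames:
        return []
    tall = list(frames[0])
    prev = frames[0]
    for cur in frames[1:]:
        shift = find_shift(prev, cur)
        # Only the rows that scrolled into view at the bottom are new.
        tall.extend(cur[len(cur) - shift:])
        prev = cur
    return tall


if __name__ == "__main__":
    # Toy example: ten distinct "rows" seen through a 4-row window
    # that scrolls two rows per frame.
    credits = ["row%d" % i for i in range(10)]
    frames = [credits[i:i + 4] for i in range(0, 7, 2)]
    print(stitch(frames) == credits)
```

Real frames would need a tolerant comparison instead of exact equality, and long runs of identical blank rows between credit blocks would make the shift ambiguous, so treat this as a starting point only.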

The system could also transmit English subtitles for a dialogue search feature, though that may infringe copyright. Another possibility would be to transmit the credits image, and have more powerful software on the server do the OCR, though again that might infringe copyright.

The only thing I'm not certain could be automated is detection of the credit scroll video on the DVD. Plus there is the issue of non-standard credits (Pixar movies show images while credits roll and older movies have credits at the beginning. Then there's Monty Python's Quest for the Holy Grail...). A user could simply cue the DVD to the beginning of the credits.

What do you think? Doable?

M

-- Mike Shivas (mshivas@hotmail.com), September 10, 2002.


