Study Shows Up to 69% of Megaupload Files Were Legitimate…

When the United States Department of Justice removed an estimated 250 million files from Megaupload, it must have released data proving the site’s main purpose was copyright infringement, right? Wrong. Northeastern University has now carried out the study that should have been done.

The university has released the findings in a study titled “Holiday Pictures or Blockbuster Movies? Insights into Copyright Infringement in User Uploads to One-Click File Hosters”. Its purpose was to find out exactly what types of files are uploaded to Megaupload and similar sites. To do this, metadata from 2011 was extracted from six sites: FileFactory, Easy-share, Filesonic, Wupload, Megaupload, and Undeadlink.

Exactly what type of files led to the freezing of over $39 million in Megaupload assets?

The study shows that 4.3% or 10.75 million Megaupload files were definitely legitimate, and the legitimacy of a further 65% of files remained unknown.

This means that anywhere between 10.75 million and 173.25 million seized files were non-infringing, all of which are being held.

However, 31% or 77.5 million files were deemed infringing. Does that justify seizing 100% of the files?
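
The arithmetic behind these ranges is easy to check. The short sketch below is illustrative only: it assumes the estimated 250 million total and the three percentages quoted above, and the names in it (`SHARE_PERCENT`, `to_millions`, `bounds`) are made up for this example rather than taken from the study.

```python
# Figures quoted in the article. The 250 million total is itself an estimate,
# and the percentages come from a sample, so treat the results as rough.
TOTAL_FILES_MILLIONS = 250.0
SHARE_PERCENT = {"legitimate": 4.3, "unknown": 65.0, "infringing": 31.0}


def to_millions(percent: float) -> float:
    """Convert a percentage of all files into millions of files."""
    return round(TOTAL_FILES_MILLIONS * percent / 100, 2)


def bounds(category: str) -> tuple[float, float]:
    """Lower bound counts only the known share; upper bound adds every unknown file."""
    known = SHARE_PERCENT[category]
    return to_millions(known), to_millions(known + SHARE_PERCENT["unknown"])


legit_low, legit_high = bounds("legitimate")
infringing_low, infringing_high = bounds("infringing")
print(f"Legitimate: {legit_low} to {legit_high} million files")
print(f"Infringing: {infringing_low} to {infringing_high} million files")
```

The upper bound in each case assumes that every file of unknown status falls into that category, so the two upper bounds cannot both hold at once. The quoted percentages also add up to 100.3% because of rounding.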


Image by Ricard Clupés, licensed under Creative Commons Attribution 2.0 Generic (CC BY 2.0)

43 Responses

  1. MATH?


    Study Shows Up to 96% of Megaupload Files Could Be Infringing…


    “For Megaupload (MU) the researchers found that 31% of all uploads were infringing, while 4.3% of uploads were clearly legitimate. This means that with an estimated 250 million uploads, 10.75 million uploads were non-infringing. For the remaining 65% the copyrighted status was either unknown, or the raters couldn’t reach consensus.” – TorrentFreak

    • NoJoke

      No kidding.

      No offense Nina… but you should probably be writing for TorrentFreak instead of here.

      …talk about propaganda…

    • Nina Ulloa

      I didn’t know that authorities seizing information that may or may not be illegal was ok…
      “We’re not sure but we assume it’s illegal so let’s seize it all”.

      • Anonymous

        Again Nina:

        You have to understand that removing commercially available songs and movies from well known organized crime sites doesn’t constitute ‘seizing information’. :)

        And you’ve come to the wrong place if you think artists have any interest in suppressing information.

        It’s the other way around:

        People often accuse us of being self-serving because we have been fighting censorship and protecting free speech throughout history. And to some extent they may even be right, since free speech is the very heart of what we do. Take it away, and art is dead.

        But I’ll take a self-serving artist who makes millions happy over a self-serving pro-pirate like you any day.

  2. jw

    What I think is telling here is that “For the remaining 65% the copyrighted status was either unknown, or the raters couldn’t reach consensus.” What that suggests to me is that it wasn’t Beyonce mp3s or episodes of Breaking Bad or a copy of the Goonies. It was stuff that was somehow borderline, or it was encrypted & unavailable to the general public. To me, that paints a pretty specific picture.

    However, I think we can all agree that the holiday pictures weren’t using up the bulk of the bandwidth. The heavily trafficked files were all most likely copyright infringing.

  3. Hipolipolopigus

    “Vote”? What, do you think the FBI are part of a democracy or somethi– Oh, right.

  4. Anonymous

    Haha Nina, I knew it was your ‘story’ before I clicked…

    Man, it’s beyond me how people can support organized crime, it hurts sooo many people.

    • Zac Shaw

      The only organized crime going on here is perpetrated by the “Hit Men” who dominate the global market for music. Payola, price fixing, price gouging, extortion… these are the tactics of the RIAA, the Big 3, and multinational corporations worldwide.

      You’re calling fans freely accessing the music they love organized crime? Absurd.

      This free access to music vs. sharing is stealing debate has been going on since Napster, and look where we’re at today. Free access to music is not something to be debated anymore, it’s something to be adapted to. Professional musicians who care about their careers aren’t complaining about fans accessing their music for free. They’re exploring direct fan patronage, they’re revamping their approach to monetizing live shows, they’re using digital tools to effectively market directly to their fans.

      Meanwhile, a bunch of crusty old professional musicians and industry people, who were complicit in the organized crime of a music industry built on exploitation, sit around crying into their patch bays about kids listening to a greater diversity of music, more frequently than ever before. You have to be blinded by your own ignorance not to see the opportunity that provides to musicians. Deal with the challenge instead of siding with the labels, trying to sue your own fans out of existence to line the pockets of the corporations you sold your rights to.

      • Anonymous

        Zac, here’s information for people who don’t understand how and why organized copyright crime hurts us all:

        The cost of piracy:

        10 billion Euros and 185,000 European jobs in 2008.

        58 billion dollars and 373,000 American jobs in 2007.
        Siwek, Stephen E., The True Cost of Piracy to the U.S. Economy, report for the Institute for Policy Innovation, Oct. 2007.

        Hope this helps.

      • Anonymous

        Honestly guy, you sound like you’re stuck back in 1997, when people still believed the stuff you’re spouting. The data is in, and unless you are in the top 1-2% of artists, your income is way down. As a musician your product is the music you make. It’s not the t-shirt, it’s not the tour. That’s why so many of the so-called experimenters, Radiohead, Nine Inch Nails, etc., have signed back up with record & management labels/groups. Honestly guy, get informed and try to keep up, and don’t get stuck in the past.

    • Nina Ulloa

      For the record, I don’t torrent, never really have, and I don’t support downloading music illegally. I actually buy music…

      However, I don’t support vast overstretching of suppression of information because the government is being pressured by their corporate buddies to take it out.

      • noJoke

        Music, Film, & software isn’t “Information”. sorry.

        The VAST majority of actual information (you know, history, facts, statistics, and knowledge…) is freely and legally available on the web and elsewhere. When you start to define entertainment products as “information”, that’s where you jump the rails, IMO.

        And, yeah… I’m glad that Kim Dotcom is on trial for his blatant criminal enterprise. Could it have gone down differently? Sure. But criminals like him need to face justice. And sorry… if you put your faith into people like that, and store your data with him, that’s the choice you made.

        • Nina Ulloa

          Where are the stats that show that all of the files seized were unauthorized music, film, and software?

          • Vistor

            Nina, perhaps the stats you are seeking will be presented in the trial – you know, like in court – you know, like where criminals are tried.

          • Yves Villeneuve

            I recall at around that time the Government or FBI said 90% was infringing. Some sort of study was likely done before a judge gave approval to raid MU servers. According to stolen documents by Snowden, NSA at the very least is capable of unlocking encryption available to the public thus the unlocking technology could have been used to conduct the study. The study authors did not unzip any files due to privacy concerns but the FBI may have with a warrant if needed to further prove its case in front of the eventual raid-approving judge.

          • GGG

            The US govt, NSA specifically I believe, invented Tor, essentially as a tool random people would take and run with (as they did) so they could keep up to date on how to break it. It would make sense for them to say the vast majority of things are illegal to give themselves free rein to look into everyone’s shit.

      • Anonymous

        Nina, we all have to start somewhere, but you do have to understand that removing stolen movies, songs and other commercially available files has nothing to do with ‘suppression of information’.

        As far as the rest of your little crusade:

        If you choose to upload legitimate files to or otherwise deal with well known organized crime sites such as MegaUpload, SilkRoad, IsoHunt or Pirate Bay, you shouldn’t be surprised — let alone whine — if you lose said files for one reason or another.

  5. Anonymous

    Honestly what passes for studies and journalism today is shocking!

    “the legitimacy of a further 65% of files remained unknown” = more than half the files were never analyzed
    “4.3%” = legit
    “31%” = infringing

    How the conclusion “Study Shows Up to 69% of Megaupload Files Were Legitimate” follows from this data is a remarkable piece of bizarre logic.

    And if you want to wail against corporatism, try the multibillion-dollar internet and technology industry: its massively funded lobbying, its influence, and the revolving employment door between government and big tech make the music industry look totally insignificant.

    • Anonymous


      I’m sure Paul can find better ways to capture our attention than to import writers who support the commercial Piracy Industry.

  6. lolwut

    “The study shows that 4.3% or 10.75 million Megaupload files were definitely legitimate, and the legitimacy of a further 65% of files remained unknown.”

    Okay, I guess I’ll just blithely assume that the 65% of indeterminate files were legitimate and write a completely fabricated, click-baity title because FUCK ACTUAL JOURNALISM.
    -Nina Ulloa

  7. jw

    A lot of people seem to misunderstand the method of the study, mostly because it wasn’t explained well in the post.

    It’s not that 65% weren’t checked… the sampling was only 1,000 files. It’s that 65% of the 1,000 WERE checked, but their legitimacy, for whatever reason, either couldn’t be determined or agreed upon. To me, it’s not clear what “meta data” means, if it’s anything beyond name/file size/upload date/file format. It could mean that the content of some of those files was purposely obscured because it’s infringing content. Other than that, it suggests that the files aren’t, for instance, Beyonce or Breaking Bad, but something that’s not clearly infringing.

  8. david lowery

    first comment is absolutely correct. bad math. but further.

    If you read any of the indictment, at one point megaupload is freaking out about having to remove 36 files. Why? Cause these were their most popular files. It doesn’t matter how many files were on the site. It matters how many of the downloads were infringing.

    Look I could put Wolverine for a free download on my website. slather it with advertising. but then also upload 9,999 non infringing files that nobody cares about. I could then claim that %1 of my website was non-infringing.

    Bad math. Bad logic.

  9. david lowery

    should have read: “I could then claim less than 1% was infringing.”

  10. stupid

    okay, so a drug dealer has a store. he’s selling stolen guns in the back, but has groceries up front. he shouldn’t be shut down for illegal activities?

  11. Why Is Nina Allowed to Write This Article?

    No post required, but I feel compelled to tell Nina that up to 96% of readers here believe you’re a _______ (fill in the blank)… Nina clearly doesn’t understand what “information” is… and for the love of _____, “up to 69%”? NINA… you realize that when anyone says “UP TO”, their argument, sales pitch and credibility are greatly undermined by the fact that it’s not a solid fact… right? Now, if you were to say 65-69%, using the +/- 4% possibility based on ACTUAL FACTS that can’t be disputed or argued, then you’d have a case, albeit not on the “information” suppression debate. “THE SUPPRESSION OF INFORMATION”… What the hell are you talking about? You’re simply ranting and raging against the machine: the RIAA and the major labels. FYI NINA, the government is lobbied and PAID by lobbyists for any and every sort of self-serving issue. Your story sounds like nothing more than BULLSHIT… and it’s incredibly disheartening to know who wrote the damn article before even clicking on the story, based on the title alone. That says quite a bit right there.

    • Paul Resnikoff

      Nina’s definitely allowed to write this article, just as you are allowed to vehemently disagree with it. Those are the laws of Digital Music News.

      Sheriff Paul Resnikoff

      • Anonymous

        You couldn’t be more right.

        But it’s also important to remember that many — most? — of your readers in one way or another are victims of the commercial Piracy Industry that Nina defends.

          • Anonymous

            Not sure what you mean… piracy obviously hurts everybody who loves music.

    • Nina Ulloa

      The point of the story is the uncertainty; I think saying it again might be redundant.

      • noJoke

        the point of the story is that since you’ve arrived, the quality of this site has taken a drastic nose-dive.
        I’m all for differing points of view… but i’m not going to visit this once fine site-turned Big Tech propaganda anymore.
        Mike Masnick called, he misses you over at TechDirt.

        Good Bye

  12. Central Scrutinizer

    As a former government employee involved with copyright I can tell you that if the people reviewing the files were similar to the employees I worked with they probably wouldn’t recognize the title of any recorded pop song released within the last 25 years. If the works were pdf copies of printed scores or a recording of a classical work they would be all over it.

  13. DirtSoapMusic

    I think an overlooked topic is the money that IS being spent. Billions of dollars are paid out by Adchoice, pay per click, etc. If some of that money went into the hands of the owners of the copyrights, it would be a much different world than it is today. What if the Megauploads of the world paid out to the artist like a PRO does (or something like that)? I think (in general) we as an industry need to rethink a lot of things. Technology is going to continue to advance no matter what. If you know just a little about the history of performing rights, then you may know that a similar, but dated, debate went on when radio was invented. That might not be the best example, but I think it makes my point. P2P and digital lockers are here to stay. Just like radio and the like, they ALL make their money from advertising. Fact is, that is the way of the world. We only have so many fingers to put in this leaky dike we call an industry. Time to change with the times and build a better dam! A new business needs to emerge. Not based on what was, but based on what is. The tidal waters of the music industry are rising. Do we continue to build an Ark and save a few, or do we look beyond the flood and build the future?

  14. AmAmusedGeek

    Personally, I find the fact that 60+% of the content ‘couldn’t be classified’ worrying. Really needs to be some sort of central registry or database or something…If computers can’t classify stuff with a reasonable degree of accuracy, there will be no chance to keep up with all the content being produced. Human reviewers are just too expensive and too slow…

  15. kf

    Again, DMN’s statistical illiteracy is shocking/funny/depressing.

  16. hippydog

    I want to comment on a few things.. :-)
    1.) no ‘upload’ site guarantees to hold your files forever (unless it’s a site dedicated to backing up people’s computers in the ‘cloud’).. IE: if anyone lost their files because they were seized, then those people are idiots..

    2.) 65% unknown if they were legitimate or not..
    The original article also mentioned they did not BOTHER including porn (or trying to find out if the porn was legitimate or not).. ERGO that 65% could also be made up of porn! LOL

    3.) Quote “Therefore, using only public data (as done in [1,2]) likely underestimates legitimate uploads on OCHs [1]. An exception is the expert report produced by Richard Waterman for the plaintiffs in the Disney v. Hotfile lawsuit. Based on internal data obtained from Hotfile, Waterman estimated that approximately 90.2% of the daily downloads from Hotfile were highly likely infringing copyright.”
    so.. yes, places like these might have legitimate content on the hard drives..
    but if 90% of the bandwidth is being used to break copyright, then obviously something is wrong here..
    and that seems like a pretty good justification to shut the place down..

    4.) As to shutting down some of the pirates making a difference?
    I know enough about “tech” to realize our industry has been changed forever, and the chances of closing the flood gates are almost zero (if you’re relying on ‘music cops’ to make a big enough difference, you’re gonna be waiting a long time).. Should it at least be slowed down? yes.. but it ain’t the answer..

    > We have the current ‘Y’ Generation who ACTUALLY believes stealing media is their right (or at least has ZERO repercussions.) , and DO NOT feel guilty about it (they will support some, who they truly like, but everyone else can go frack themselves)..

    > We have a new generation ‘Z’? Who are being raised where ALL CONTENT IS AT THEIR FINGERTIPS.. If you think it’s bad now, wait another decade.. it’s gonna be worse..

  17. Why Is Nina Allowed to Write This Article?

    Paul and Nina, the point is… the report is misleading and your spin is misleading, and unfortunately it makes you out to at least appear as a PRO-PIRATE supporter. Either way, I honestly couldn’t care less what Nina “writes”, but when I come to this site I expect decent & accurate reporting. You want to throw in an opinion, great, but make it known on which side of the fence you stand. If you’re for piracy thriving, make it known. Me making that statement is no different than Nina taking the stance that it’s illogical to shut down a company who may or may not have an insane amount of illegal content… both are extremes.. have a great one, Paul and Nina.
