What Copyright Laws? Protected Media Is Actually ‘Freeware’ When Available ‘On the Open Web,’ Microsoft AI Chief Says

Microsoft AI CEO Mustafa Suleyman. Photo Credit: Christopher Wilson

It turns out copyrighted content is actually “freeware” that artificial intelligence models can freely ingest – at least according to Microsoft AI CEO Mustafa Suleyman.

Suleyman made the ill-advised remark, presumably unvetted by Microsoft’s legal and PR teams, during an interview with CNBC’s Andrew Ross Sorkin. That discussion took place at the Aspen Ideas Festival.

And in keeping with the annual event’s name, the conversation predictably touched on generative AI’s well-documented and highly controversial training processes. According to critics, multiple lawsuits, and even artificial intelligence chatbots themselves, those processes include the ingestion of all manner of protected media.

After introducing the DeepMind co-founder Suleyman as “one of the OGs of the AI world,” Sorkin asked about “whether the AI companies have effectively stolen the world’s IP.”

“It appears that a lot of the information that has been trained on over the years has come from the web,” explained Sorkin. “And some of it’s the open web, and some’s not. … Who is supposed to own the IP? Who is supposed to get value from that IP? And whether – to put it in very blunt terms – whether the AI companies have effectively stolen the world’s IP?”

Not hesitating to answer, Suleyman, who joined Microsoft by bringing his Inflection AI company to the tech giant, dove into the “freeware” comment.

“With respect to content that is already on the open web,” relayed Suleyman, “the social contract of that content since the 90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been freeware, if you like. That’s been the understanding.”

While there’s some ambiguity surrounding the definition of “open web,” the longtime AI exec didn’t indicate in the remainder of his answer that he was referring only to non-copyrighted production libraries or public domain databases, for instance.

“There’s a separate category where a website or a publisher or a news organization had explicitly said, ‘Do not scrape or crawl me for any other reason than indexing me so that other people can find that content,’” he continued. “That’s a gray area, and I think that’s going to work its way through the courts.”

(As many have pointed out in similar situations, it’s readily apparent which side benefits from this purported “social contract” and which side is getting the decidedly short end of the stick in the form of no credit, compensation, or upside whatsoever.)

Besides its obvious conflict with actual U.S. copyright law and the fact that the open web is itself replete with infringement, the statement, voiced by an individual who’s said to be an artificial intelligence “OG,” seems to underscore the AI sector’s general unwillingness to acknowledge even basic creative rights.

Different execs have echoed the pernicious claim that the unauthorized training of models on protected media is transformative and constitutes fair use. Relatedly, OpenAI and Microsoft (a sizable backer) are facing multiple related suits, music publishers remain embroiled in litigation against Amazon-funded Anthropic, and most recently, the major labels sued Suno and Udio for alleged infringement.

Behind the firmly worded legal actions and even the threat of IP devaluation as well as lost revenue are, of course, broader concerns about where exactly the runaway AI train is heading.

Bankrolled in large part by a collection of multi-trillion-dollar companies, the unprecedented technology is seemingly threatening to replace (or at least make things far more financially difficult for) the very creatives and professionals whose works it has allegedly used en masse to build its core products.

During the same interview highlighted above, Suleyman visited the subject when addressing the perceived ability of AI “to make the raw materials necessary to be creative” – presumably meaning a bit of imagination and work – “more available than ever before.”

“GPT-3 cost tens of millions of dollars to train and is now available free and open source – you can operate [it] on a single phone, certainly on a laptop,” indicated Suleyman. “GPT-4, the same story. I think that that’s going to make the raw materials necessary to be creative and entrepreneurial cheaper and more available than ever before.”