Thursday, November 16, 2023

What Meta learned from Galactica, the doomed model

One year ago, and two weeks before OpenAI released ChatGPT, Meta released a research demo called Galactica. Meta touted the model, an open source “large language model for science” trained on data including 48 million scientific papers, for its ability to “summarize academic literature, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.”

Galactica survived publicly for only three days. On November 17, 2022, Meta took down the demo after an outcry over what was, back then, a word that had not yet made it into the mainstream: hallucinations. Many were appalled by Galactica’s sometimes very unscientific output, which, like that of other LLMs, included information that sounded plausible but was factually wrong and, in some cases, highly offensive.

At the time, Meta chief scientist Yann LeCun stuck up for the model and posted a series of defensive tweets (“It’s no longer possible to have some fun by casually misusing it. Happy?”), but to no avail. Galactica would not be the game-changing model for the generative AI era.

Two weeks later, ChatGPT was released into the wild

That same week, however, tantalizing rumors about the upcoming release of GPT-4 — which some predicted could be in a few months — made the rounds. And just two weeks later, on November 30, as many AI researchers attending NeurIPS in New Orleans whispered hopefully that OpenAI might release GPT-4 at the conference, suddenly there it was — ChatGPT, released into the wild.

Of course, it soon became clear that ChatGPT had its own hallucination problem. Like Galactica and other generative AI models, ChatGPT spit out eloquent, confident responses that often sounded plausible and true even when they were not. OpenAI made this weakness very clear in its blog post announcing ChatGPT and explained that fixing it is “challenging.”

Still, that did not slow down ChatGPT’s ride to LLM stardom: Over the past year it has become one of the fastest growing services of all time, with an estimated 100 million monthly users in just two months and, now, 100 million weekly users.

However, Galactica’s legacy endures. “There were a lot of good lessons learned,” Joelle Pineau, VP of AI research at Meta, recently told VentureBeat. “That’s a good model — I still get a lot of requests from people who want the model.”

Pineau emphasized that Galactica was never meant to be a product. “It was absolutely a research project,” she said. “We released with the intent, we did a low-key release, put it on GitHub, the researcher tweeted about it.”

But everyone got so excited by it, she explained. “The gap between the expectation, and where the research was, was too big.” People were surprised by things like hallucinations that would hardly be news a year later, she added, and Galactica’s level of hallucination was actually lower than that of other models because it was fine-tuned on scientific literature.

“Suddenly people had a product expectation, like you would use it to actually write your papers — no, that’s not the intent,” she said.

Galactica lessons led to decisions about Llama release

Meta pulled down the Galactica demo, Pineau explained, “to make sure that people were not misled into using it,” adding that it was not released with a responsible use guide “which we’ve learned to do.”

Overall, Pineau said, “If I was to do it today, we would just manage the release.” She added that Meta “probably misjudged” the expectations around Galactica, but “the lessons of that have been folded into our next generation of models.”

That next generation of models was Llama, Meta’s large language model that took the AI research world by storm in February 2023 — followed by the commercial Llama 2 in July and Code Llama in August. With Llama, the first major free ‘open source’ LLM (Llama and Llama 2 are not fully open by traditional license definitions), open source AI began to have a moment — and a red-hot debate — that has not ebbed all year long.

When Llama was released on February 24, Meta was careful — Yann LeCun, in sharing the paper, posted that “Meta is committed to open research and releases all the models [to] the research community under a GPL v3 license.”

When asked why researchers had to fill out a form to get access to Llama, LeCun retorted: “Because last time we made an LLM available to everyone (Galactica, designed to help scientists write scientific papers), people threw vitriol at our face and told us this was going to destroy the fabric of society.”

[EDITOR’S NOTE: A week after its release, Llama’s model weights were leaked by someone who posted the download link to 4chan]
