
GPT-3: Advancing the understanding of cues for coding, writing

OpenAI says it is backlogged with a waitlist of prospective testers seeking to assess whether the first private beta of its GPT-3 natural language processing (NLP) tool really can push the boundaries of artificial intelligence (AI).

Since OpenAI made the GPT-3 beta available in June as an API to those who pass its vetting process, it has generated considerable buzz on social media. GPT-3 is the latest iteration of OpenAI’s neural-network-based language model. The first to evaluate the beta, according to OpenAI, include Algolia, Quizlet and Reddit, as well as researchers at the Middlebury Institute.

Although GPT-3 is based on the same technology as its predecessor GPT-2, released last year, the new version is an exponentially larger model. With 175 billion trainable parameters, GPT-3 is more than 100 times larger than GPT-2, and roughly 10 times larger than its closest rival, Microsoft’s Turing NLG, which has 17 billion.
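As a rough sanity check on those scale claims, the ratios can be computed from the widely reported parameter counts (GPT-2’s largest variant has 1.5 billion parameters; the figures below are the published counts, not anything from OpenAI’s API):

```python
# Published parameter counts, in billions, for the models compared above.
PARAMS_B = {
    "GPT-2": 1.5,        # largest released GPT-2 variant
    "Turing-NLG": 17.0,  # Microsoft, early 2020
    "GPT-3": 175.0,
}

def scale_factor(larger: str, smaller: str) -> float:
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS_B[larger] / PARAMS_B[smaller]

print(f"GPT-3 vs GPT-2:      {scale_factor('GPT-3', 'GPT-2'):.0f}x")       # ~117x
print(f"GPT-3 vs Turing-NLG: {scale_factor('GPT-3', 'Turing-NLG'):.1f}x")  # ~10.3x
```

The 175B/1.5B ratio works out to roughly 117, which is why the comparison is usually rounded to “more than 100 times larger.”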

RELATED CONTENT: Microsoft announces it will exclusively license OpenAI’s GPT-3 language model

Experts have described GPT-3 as the most capable language model created to date. Among them is David Chalmers, professor of Philosophy and Neural Science at New York University and co-director of NYU’s Center for Mind, Brain, and Consciousness. Chalmers underscored in a recent post that GPT-3 is trained on key datasets such as Common Crawl, an open repository of searchable internet data, along with a huge library of books and all of Wikipedia. Beyond its scale, GPT-3 is raising eyebrows with its ability to automatically generate text rivaling what a human can write.

“GPT-3 is instantly one of the most interesting and important AI systems ever produced,” Chalmers wrote. “This is not just because of its impressive conversational and writing abilities. It was certainly disconcerting to have GPT-3 produce a plausible-looking interview with me. GPT-3 seems to be closer to passing the Turing test than any other system to date (although “closer” does not mean “close”).” 

Another early tester of GPT-3, Arram Sabeti, was also impressed. Sabeti, an investor who remains chairman of ZeroCater, was among the first to get his hands on the GPT-3 API in July. “I have to say I’m blown away. It’s far more coherent than any AI language system I’ve ever tried,” Sabeti noted in a post where he shared his findings.

“All you have to do is write a prompt and it’ll add text it thinks would plausibly follow,” he added. “I’ve gotten it to write songs, stories, press releases, guitar tabs, interviews, essays, technical manuals. It’s hilarious and frightening. I feel like I’ve seen the future and that full AGI [artificial general intelligence] might not be too far away.”
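The workflow Sabeti describes is just that simple in shape: a request carries a prompt plus a few sampling parameters, and the model responds with a continuation. A minimal sketch of how such a completion request could be assembled is below; the parameter names mirror the beta API’s style, but the endpoint, engine name, and default values are illustrative, and a real call requires an approved API key:

```python
# Sketch: you supply a prompt, and GPT-3 returns text it judges a plausible
# continuation. This only builds the request payload; it does not call the API.

def build_completion_request(prompt: str, max_tokens: int = 64,
                             temperature: float = 0.7) -> dict:
    """Assemble a JSON payload for a text-completion request."""
    return {
        "prompt": prompt,           # the text the model will continue
        "max_tokens": max_tokens,   # cap on how much text it may add
        "temperature": temperature, # higher values = more varied output
    }

req = build_completion_request("Write a press release announcing a new guitar:")
# A beta tester would POST this payload, with an API key, to a completions
# endpoint such as https://api.openai.com/v1/engines/davinci/completions
print(req["prompt"])
```

Everything Sabeti lists, songs, guitar tabs, interviews, essays, uses this same prompt-and-continue pattern; only the prompt changes.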

It is the “frightening” aspect that OpenAI is not taking lightly, which is why the company is taking a selective stance in vetting who can test the GPT-3 beta. In the wrong hands, GPT-3 could be the recipe for misuse. Among other things, one could use GPT-3 to create and spread propaganda on social media, now commonly called “fake news.” 

OpenAI’s Plan to Commercialize GPT-3
The potential for misuse is why OpenAI chose to release GPT-3 as an API rather than open sourcing the technology, the company said in an FAQ. “The API model allows us to more easily respond to misuse of the technology,” the company explained. “Since it is hard to predict the downstream use cases of our models, it feels inherently safer to release them via an API and broaden access over time, rather than release an open source model where access cannot be adjusted if it turns out to have harmful applications.”

OpenAI had other motives for going the API route as well. Notably, because the NLP models are so large, they take significant expertise to develop and deploy and are expensive to run. By offering an API, the company is looking to make the technology accessible to smaller organizations as well as larger ones.

Not surprisingly, commercializing GPT-3 also lets OpenAI fund its ongoing AI research, its continued safety efforts, and its policy advocacy as issues arise.

Ultimately, OpenAI will release a commercial version of GPT-3, although the company hasn’t announced when, or how much it will cost. The latter could be significant in determining how accessible it becomes. The company says part of the private beta aims to determine what type of licensing model it will offer. 

OpenAI, started as a non-profit research organization in late 2015 with help from deep-pocketed founders including Elon Musk, last year transitioned into a for-profit business with a $1 billion investment from Microsoft. As part of that investment, OpenAI runs in the Microsoft Azure cloud.

The two companies recently shared the fruits of their partnership one year later. At this year’s Microsoft Build conference, held as a virtual event in May, Microsoft CTO Kevin Scott said the company has created one of the world’s largest supercomputers running in Azure.

OpenAI Seeds Microsoft’s AI Supercomputer in Azure 
Speaking during a keynote session at the Build conference, Scott said Microsoft completed its supercomputer in Azure at the end of last year, after just six months of work. The effort, he said, will help bring these large models within reach of all software developers.

Scott likened it to the automotive industry, which has used the niche high-end racing use case to develop technologies such as hybrid powertrains, all-wheel drive and anti-lock brakes. The benefits of those supercomputing capabilities in Azure, and of the large ML models hosted there, are significant for developers, Scott said.

“This new kind of computing power is going to drive amazing benefits for the developer community, empowering previously unbelievable AI software platforms that will accelerate your projects, large and small,” he said. “Just like the ubiquity of sensors in smartphones, multi-touch, location, high-quality cameras, accelerometers enabled an entirely new set of experiences, the output of this work is going to give developers a new platform to build new products and services.”

Scott said OpenAI is conducting the most ambitious work in AI today, indicating work like GPT-3 will give developers access to very large models that were out of their reach until now. Sam Altman, OpenAI’s CEO, joined Scott in his Build keynote to explain some of the implications.

Altman said OpenAI wants to build large-scale systems and see how far the company can push them. “As we do more and more advanced research and scale it up into bigger and bigger systems, we begin to make this whole new wave of tools and systems that can do things that were in the realm of science fiction only a few years ago,” Altman said.

“People have been thinking for a long time about computers that can understand the world and sort of do something like thinking,” Altman added. “But now that we have those systems beginning to come to fruition, I think what we’re going to see from developers, the new products and services that can be imagined and created are going to be incredible. I think it’s like a fundamental new piece of computing infrastructure.” 

Beyond Natural Language
As the models become a platform, Altman said OpenAI is already looking beyond just natural language. “We’re interested in trying to understand all the data in the world, so language, images, audio, and more,” he said. “The fact that the same technology can solve this very broad array of problems and understand different things in different ways, that’s the promise of these more generalized systems that can do a broad variety of tasks for a long time. And as we work with the supercomputer to scale up these models, we keep finding new tasks that the models are capable of.”

Despite its promise, OpenAI and its vast network of ML models don’t close the gap on everything that’s missing in AI.

Boris Paskalev, co-founder and CEO of DeepCode, said GPT-3 provides models that are an order of magnitude larger than GPT-2. But he warned that developers should beware of drawing any conclusions that GPT-3 will help them automate code creation.

“Using NLP to generate software code does not work for the very simple reason that software code is semantically complex,” Paskalev told SD Times. “There is absolutely no actual use for it for code synthesis or for finding issues or fixing issues. Because it’s missing that logical step that is actually embedded, or the art of software development that the engineers use when they create code, like the intent. There’s no way you can do that.”

Moiz Saifee, a principal on the analytics team of Correlation Ventures, posted a similar assessment. “While GPT-3 delivers great performance on a lot of NLP tasks — word prediction, common sense reasoning — it doesn’t do equally well on everything. For instance, it doesn’t do great on things like text synthesis, some reading comprehension tasks, etc. In addition to this, it also suffers from bias in the data, which may lead the model to generate stereotyped or prejudiced content. So, there is more work to be done.”


The post GPT-3: Advancing the understanding of cues for coding, writing appeared first on SD Times.



