Prompt engineering-the next thing on generative AI platform

5 min read

Reading Time: 5 minutes

Prompt Engineering: The New Era of AI

Trends are changing in the data science domain every day. Many tools, techniques, libraries, and algorithms are developing daily. This constantly changing landscape keeps the data science domain at the bleeding edge. The techniques and methods used to solve different tasks in Machine Learning (ML)/ Deep Learning (DL) and Natural Language Processing (NLP) are also changing.

What is next in Artificial Intelligence (AI)? This is one of the questions every data science aspirant should ask themself. In the initial years of the AI revolution, symbolic AI-related rule-based systems solved the different ML/DL or NLP problems such as text classification, natural language text extraction, passage summarization, data mining, etc. As time passed, statistical models replaced symbolic AI-based ML solutions. In statistical models, we use machine learning or deep learning systems with lots of training data and use model prediction. Statistical models need lots of training data for training. Realistically, obtaining that much labeled, cleaned data to train isn’t easy. This significant limitation to statistical modeling was overcome by a new learning method called prompt engineering.

Prompt-based machine learning, aka prompt engineering, has already opened many possibilities in ML. Lots of research is taking place in this area globally. Let’s dive into prompt engineering and why it is becoming more popular in data science.

What is Prompt Engineering?

Prompt engineering is the process of designing and creating prompts, or input data for AI models to train on as they learn specific tasks.

Prompt engineering is a new concept in artificial intelligence, particularly, in NLP. In prompt engineering, the task description is embedded in the input.. Prompt engineering typically works by converting one or more tasks into a prompt-based dataset and training a language model with what has been called “prompt-based learning” or just “prompt learning.”

This technique became popular in 2020 with GPT- 3, the third generation of Generative Pretrained Transformer. AI solution development has never been simple, but with GPT-3, you only need a meaningful training prompt written in simple English.

The first thing we must realize while working with GPT-3 is how to design required prompts according to our use case. Like In statistical models, the quality of training data boosts prediction quality. Here, the quality of input prompts increases the quality of the task it performs. So, a good prompt can deliver a good task performance in the desired favorable context. Writing the best prompt is often a matter of trial and error.

Prompt-based learning is based on Large Language Models (LLM) that directly model text probability. In contrast, traditional supervised learning trains a model to take an input x and predict an output y as P(y|x). In order to use these models for prediction tasks, the initial input (x) is changed using a template into a textual string prompt with some empty slots (x’). The language model is then used to probabilistically fill the empty information to obtain a final string (x^), from which the final output (y) can be deduced. This architecture is effective and appealing for various reasons. For example, this architecture allows the user to pre-train the language model on enormous volumes of unlabeled text and conduct few-shot, one-shot, or even zero-shot learning by specifying a new prompting function.

Pretrain – Prompt – Predict Paradigm

As you now know, prompt engineering has evolved with the pretrained LLMs. As mentioned above, prompt engineering is based on the LLM. So all the prompts we trigger are hit on a Language Model behind the scene. A powerful, well-trained language model should be selected/Identified before we start the prompt engineering. The first step is selecting the pre-trained model.

Pretrain

While selecting a pre-trained model, considering the pretraining objective is significant. One can pretrain a language model in multiple ways depending on the context.

Pretraining via next token prediction

Next-token prediction is when we simply want to predict the next word after giving all the previous words in context. This is one of the most straightforward and easy-to-understand pretraining strategies. These pretraining objectives are so important for prompting considerations because they will affect the types of prompts we can give the model and how we can incorporate answers into those prompts.

Pretraining via masked token prediction

Instead of predicting the next token by taking all other previous tokens, the masked token prediction method predicts any masked tokens within the input, given all the surrounding contexts. The classic BERT language model has been trained via this strategy.

Training an entailment model

More than a pretraining strategy, the entailment helps in the prompted classification tasks. Entailment means, given two statements, we want to determine if they imply one another, conflict with one another, or are neutral, meaning they have no bearing on one another.

Prompt and Prediction

Once we select the pretrained language model, the next step is designing the appropriate prompt for our desired task. Understanding what the pretrained language model knows about the world and how to get the model to use that knowledge to produce beneficial and contextual results is the key to generating successful prompts.

Here, in the form of a training prompt, we provide the model with just enough information to enable it to recognize the patterns and complete the task at hand. We don’t want to overwhelm the model’s natural intelligence by providing all the information simultaneously.

Given that the prompt describes the task, picking the right prompt significantly impacts both the accuracy and the first task that the model does.

As a general guideline, we should strive to elicit the required response from the model in a zero-shot learning paradigm when developing a helpful prompt. This indicates that the task should be finished without undergoing any sort of fine-tuning. If, after receiving the model’s response, you believe it falls short of your expectations, give the model one specific example along with the suggestion. The One-Shot Learning paradigm is used to describe this. If you still believe that the responses do not satisfy your needs, give the model a couple of additional instances along with the request. The Few-Shot Learning paradigm is what is used in this.

For simplicity, the standard flow for training prompt design should look like this: Zero-Shot → One Shot → Few shot

Let us check how we can generate prompts on the GPT-3 language model to perform our custom tasks.

Prompt with Zero-Shot

Consider the domain identification problem in NLP. In statistical modeling, we are used to solving it by training a machine learning or deep learning model using extensive training data. Modeling and training are not feasible in statistical machine learning without significant time and proper technical knowledge.

We can identify domains of some famous personalities using prompts in GPT-3 with zero-shot learning. Since the setup and background of GPT-3 are beyond this topic, we are directly moving to generally understanding prompts.

Prompt:
The following is a list of celebrities and the domains they fall into:
Leonardo DiCaprio, Elon musk, Ishan Kishan, Kamala Harris, Andrey Kurkov

Response:

Looks amazing right? Without providing prior knowledge, the pretrained model detects each domain based on the prompt.

Prompt with One-Shot

Consider the question-answering problem. Question-answering is one of the fundamental problems in NLP. Others include word sense, disambiguation, named entity identification, anaphora and cataphora resolution. Training a question-answering model from scratch and ensuring its performance is very complex in traditional machine learning.

Question-answering in machine learning is complex because it requires the model to understand natural language and context, reason, and retrieve relevant information from a large amount of data. Furthermore, natural language is complex and ambiguous, with multiple meanings and interpretations for the same words or phrases. This makes it challenging to accurately understand the question’s intent and provide an appropriate answer. The model must handle various forms of language, such as idioms, colloquialisms, and slang.

Context is another important factor in question-answering. The meaning of a word or phrase can vary depending on the context in which it is used. For example, the word “bat” can refer to a flying mammal or a piece of sports equipment, depending on the context. Thus, the model needs to be able to understand the broader context of a question to provide a relevant answer

Let us see how we can resolve this problem with prompt engineering.

An example and input are distinguished with the token

Prompt:
Context: The World Health Organization is a specialized agency of the United Nations responsible for international public health. Headquartered in Geneva, Switzerland, it has six regional offices and 150 field offices worldwide. The WHO was established on 7 April 1948.

Question: When was the WHO established?

Answer: 7 April 1948

Context: Greg Barclay has been unanimously re-elected as chairman of the International Cricket Council (ICC) for a second two-year term. The former New Zealand Cricket chair was unopposed following the withdrawal of Zimabwe’s Tavengwa Mukuhlani from the process, and the ICC Board reaffirmed its full support to Barclay to continue at the helm.

Question: Who is the chairman of ICC?

Answer:
Response:

Prompt with Few-Shot

Consider the text pattern challenge. Let’s see how the prompts with few-shot learning in GPT-3 can solve the problem.

Prompt:
Extract the ISO number from the Context:
Context: ISO 9001 is probably the most well-recognized ISO number in the world
ISO numbers of examples:
ISO 187288|
ISO 972012|
ISO 569295|

ISO number:
Response:

Prompt engineering is rising as a new era in Artificial Intelligence. Lots of research is happening in this domain today. Many prompt-based tools and libraries other than GPT-3 have already been developed.

Some of them include:

OpenAI Playground
GPTTools
LangChain
ThoughtSource
EveryPrompt
DUST
Dyno
Metaprompt
Prompts.ai
Lexica
Scale SpellBook
Interactive Composition Explorer
LearnGPT
AI Test Kitchen
betterprompt
Prompt Engine
PromptSource
sharegpt
DreamStudio
PromptInject

Prompt-based machine learning has already started to help us solve several bottleneck issues. Prompts will work in NLP and image processing domains. All AI aspirants worldwide look forward to this field’s latest research results.

So, why wait? Let’s prompt!

Author

Pradeep T

AI Engineer- Data Science, Fosfor

Pradeep has 3+ years of progressive experience in executing data-driven solutions. He is proficient in machine learning and statistical modeling algorithms/techniques for solving data science problems. His expertise lies in unstructured data processing and Large Language Model (LLM) based applications. Pradeep holds a Master's degree in Computational Linguistics from APJ Abdul Kalam Technological University and a Bachelor's in Computer science from Cochin University of Science and Technology. He actively contributes to open-source projects and authors blog posts on artificial intelligence. He is a passionate data scientist who believes in making incremental progress toward daily growth.

More on the topic

Read more thought leadership from our team of experts

Generative AI - Accelerate ML operations using GPT

As Data Science and Machine Learning practitioners, we often face the challenge of finding solutions to complex problems. One powerful artificial intelligence platform that can help speed up the process is the use of Generative Pretrained Transformer 3 (GPT-3) language model.

Bias in AI: A primer

While Artificial Intelligence (AI) systems can be highly accurate, they are imperfect. As such, they may make incorrect decisions or predictions. Several challenges need to be solved for the development and adoption of technology. One major challenge is the bias in AI systems. Bias in AI refers to the systematic differences between a model's predicted and true output. These deviations can lead to incorrect or unfair outcomes, which can seriously affect critical fields like healthcare, finance, and criminal justice.

Choosing the best AI/ML platform from a multimodel vendor

Artificial intelligence (AI) and machine learning (ML) technologies are expanding rapidly as organizations seek to capitalize on the value of their data. Half of the companies surveyed in a 2020 Mckinsey study have already adopted AI in at least one business function.

Privacy & Cookie policy

Privacy & Cookies policy

Cookie name	Active
sess_map

What is a cookie?

A cookie is a small piece of data that a website asks your browser to store on your computer or mobile device. The cookie allows the website to “remember” your actions or preferences over time. On future visits, this data is then returned to that website to help identify you and your site preferences. Our websites and mobile sites use cookies to give you the best online experience. Most Internet browsers support cookies; however, users can set their browsers to decline certain types of cookies or specific cookies. Further, users can delete cookies at any time.

Why do we use cookies?

We use cookies to learn how you interact with our content and to improve your experience when visiting our website(s). For example, some cookies remember your language or preferences so that you do not have to repeatedly make these choices when you visit one of our websites.

What kind of cookies do we use?

We use the following categories of cookie:

Category 1: Strictly Necessary Cookies

Strictly necessary cookies are those that are essential for our sites to work in the way you have requested. Although many of our sites are open, that is, they do not require registration; we may use strictly necessary cookies to control access to some of our community sites, whitepapers or online events such as webinars; as well as to maintain your session during a single visit. These cookies will need to reset on your browser each time you register or log in to a gated area. If you block these cookies entirely, you may not be able to access gated areas. We may also offer you the choice of a persistent cookie to recognize you as you return to one of our gated sites. If you choose not to use this “remember me” function, you will simply need to log in each time you return.

Cookie Name	Domain / Associated Domain / Third-Party Service	Description	Retention period
__cfduid	Cloudflare	Cookie associated with sites using CloudFlare, used to speed up page load times	1 Year
lidc	linkedin.com	his is a Microsoft MSN 1^st party cookie that ensures the proper functioning of this website.	1 Day
PHPSESSID	ltimindtree.com	Cookies named PHPSESSID only contain a reference to a session stored on the web server	When the browsing session ends
catAccCookies	ltimindtree.com	Cookie set by the UK cookie consent plugin to record that you accept the fact that the site uses cookies.	29 Days
AWSELB		Used to distribute traffic to the website on several servers in order to optimise response times.	2437 Days
JSESSIONID	linkedin.com	Preserves users states across page requests.	334,416 Days
checkForPermission	bidr.io	Determines whether the visitor has accepted the cookie consent box.	1 Day
VISITOR_INFO1_LIVE		Tries to estimate users bandwidth on the pages with integrated YouTube videos.	179 Days

Category 2: Performance Cookies

Performance cookies, often called analytics cookies, collect data from visitors to our sites on a unique, but anonymous basis. The results are reported to us as aggregate numbers and trends. LTI allows third-parties to set performance cookies. We rely on reports to understand our audiences, and improve how our websites work. We use Google Analytics, a web analytics service provided by Google, Inc. (“Google”), which in turn uses performance cookies. Information generated by the cookies about your use of our website will be transmitted to and stored by Google on servers Worldwide. The IP-address, which your browser conveys within the scope of Google Analytics, will not be associated with any other data held by Google. You may refuse the use of cookies by selecting the appropriate settings on your browser. However, you have to note that if you do this, you may not be able to use the full functionality of our website. You can also opt-out from being tracked by Google Analytics from any future instances, by downloading and installing Google Analytics Opt-out Browser Add-on for your current web browser: https://tools.google.com/dlpage/gaoptout & cookiechoices.org and privacy.google.com/businesses

Cookie Name	Domain / Associated Domain / Third-Party Service	Description	Retention period
_ga	ltimindtree.com	Used to identify unique users. Registers a unique ID that is used to generate statistical data on how the visitor uses the web site.	2 years
_gid	ltimindtree.com	This cookie name is asssociated with Google Universal Analytics. This appears to be a new cookie and as of Spring 2017 no information is available from Google. It appears to store and update a unique value for each page visited.	1 day
_gat	ltimindtree.com	Used by Google Analytics to throttle request rate	1 Day

Category 3: Functionality Cookies

We may use site performance cookies to remember your preferences for operational settings on our websites, so as to save you the trouble to reset the preferences every time you visit. For example, the cookie may recognize optimum video streaming speeds, or volume settings, or the order in which you look at comments to a posting on one of our forums. These cookies do not identify you as an individual and we don’t associate the resulting information with a cookie that does.

Cookie Name	Domain / Associated Domain / Third-Party Service	Description	Retention period
lang	ads.linkedin.com	Set by LinkedIn when a webpage contains an embedded “Follow us” panel. Preference cookies enable a website to remember information that changes the way the website behaves or looks, like your preferred language or the region that you are in.	When the browsing session ends
lang	linkedin.com	In most cases it will likely be used to store language preferences, potentially to serve up content in the stored language.	When the browsing session ends
YSC		Registers a unique ID to keep statistics of what videos from Youtube the user has seen.	2,488,902 Days

Category 4: Social Media Cookies

If you use social media or other third-party credentials to log in to our sites, then that other organization may set a cookie that allows that company to recognize you. The social media organization may use that cookie for its own purposes. The Social Media Organization may also show you ads and content from us when you visit its websites.

Ref links:

LinkedIn – https://www.linkedin.com/legal/privacy-policy Twitter – https://gdpr.twitter.com/en.html & https://twitter.com/en/privacy & https://help.twitter.com/en/rules-and-policies/twitter-cookies Facebook – https://www.facebook.com/business/gdpr Also, if you use a social media-sharing button or widget on one of our sites, the social network that created the button will record your action for its own purposes. Please read through each social media organization’s privacy and data protection policy to understand its use of its cookies and the tracking from our sites, and also how to control such cookies and buttons.

Category 5: Targeting/Advertising Cookies

We use tracking and targeting cookies, or ask other companies to do so on our behalf, to send you emails and show you online advertising, which meet your business and professional interests. If you have registered on our websites, we may send you emails, tailored to reflect the interests you have shown during your visits. We ask third-party advertising platforms and technology companies to show you our ads after you leave our sites (retargeting technology). This technology allows us to make our website services more interesting for you. Retargeting cookies are used to record anonymized movement patterns on a website. These patterns are used to tailor banner advertisements to your interests. The data used for retargeting is completely anonymous, and is only used for statistical analysis. No personal data is stored, and the use of the retargeting technology is subject to the applicable statutory data protection regulations. We also work with companies to reach people who have not visited our sites. These companies do not identify you as an individual, instead rely on a variety of other data to show you advertisements, for example, behavior across websites, information about individual devices, and, in some cases, IP addresses. Please refer below table to understand how these third-party websites collect and use information on our behalf and read more about their opt out options.

Cookie Name	Domain / Associated Domain / Third-Party Service	Description	Retention period
BizoID	ads.linkedin.com	These cookies are used to deliver adverts more relevant to you and your interests	183 days
iuuid	demandbase.com	Used to measure the performance and optimization of Demandbase data and reporting	2 years
IDE	doubleclick.net	This cookie carries out information about how the end user uses the website and any advertising that the end user may have seen before visiting the said website.	2,903,481 Days
UserMatchHistory	linkedin.com	This cookie is used to track visitors so that more relevant ads can be presented based on the visitor’s preferences.	60,345 Days
bcookie	linkedin.com	This is a Microsoft MSN 1st party cookie for sharing the content of the website via social media.	2 years
__asc	ltimindtree.com	This cookie is used to collect information on consumer behavior, which is sent to Alexa Analytics.	1 Day
__auc	ltimindtree.com	This cookie is used to collect information on consumer behavior, which is sent to Alexa Analytics.	1 Year
_gcl_au	ltimindtree.com	Used by Google AdSense for experimenting with advertisement efficiency across websites using their services.	3 Months
bscookie	linkedin.com	Used by the social networking service, LinkedIn, for tracking the use of embedded services.	2 years
tempToken	app.mirabelsmarketingmanager.com		When the browsing session ends
ELOQUA	eloqua.com	Registers a unique ID that identifies the user’s device upon return visits. Used for auto -populating forms and to validate if a certain contact is registered to an email group .	2 Years
ELQSTATUS	eloqua.com	Used to auto -populate forms and validate if a given contact has subscribed to an email group. The cookies only set if the user allows tracking .	2 Years
IDE	doubleclick.net	Used by Google Double Click to register and report the website user’s actions after viewing clicking one of the advertiser’s ads with the purpose of measuring the efficiency of an ad and to present targeted ads to the user.	1 Year
NID	google.com	Registers a unique ID that identifies a returning user’s device. The ID is used for targeted ads.	6 Months
PREF	youtube.com	Registers a unique ID that is used by Google to keep statistics of how the visitor uses YouTube videos across different web sites.	8 months
test_cookie	doubleclick.net	This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor’s browser supports cookies.	1,073,201 Days
UserMatchHistory	linkedin.com	Used to track visitors on multiple websites, in order to present relevant advertisement based on the visitor’s preferences.	29 days
VISITOR_INFO1_LIVE	youtube.com		179 days

Third party companies	Purpose	Applicable Privacy/Cookie Policy Link
Alexa	Show targeted, relevant advertisements	https://www.oracle.com/legal/privacy/marketing-cloud-data-cloud-privacy-policy.html To opt out: http://www.bluekai.com/consumers.php#optout
Eloqua	Personalized email based interactions	https://www.oracle.com/legal/privacy/marketing-cloud-data-cloud-privacy-policy.html To opt out: https://www.oracle.com/marketingcloud/opt-status.html
CrazyEgg	CrazyEgg provides visualization of visits to website.	https://help.crazyegg.com/article/165-crazy-eggs-gdpr-readiness Opt Out: DAA: https://www.crazyegg.com/opt-out
DemandBase	Show targeted, relevant advertisements	https://www.demandbase.com/privacy-policy/ Opt out: DAA: http://www.aboutads.info/choices/
LinkedIn	Show targeted, relevant advertisements and re-targeted advertisements to visitors of LTI websites	https://www.linkedin.com/legal/privacy-policy Opt-out: https://www.linkedin.com/help/linkedin/answer/62931/manage-advertising-preferences
Google	Show targeted, relevant advertisements and re-targeted advertisements to visitors of LTI websites	https://policies.google.com/privacy Opt Out: https://adssettings.google.com/ NAI: http://optout.networkadvertising.org/ DAA: http://optout.aboutads.info/
Facebook	Show targeted, relevant advertisements	https://www.facebook.com/privacy/explanation Opt Out: https://www.facebook.com/help/568137493302217
Youtube	Show targeted, relevant advertisements. Show embedded videos on LTI websites	https://policies.google.com/privacy Opt Out: https://adssettings.google.com/ NAI: http://optout.networkadvertising.org/ DAA: http://optout.aboutads.info/
Twitter	Show targeted, relevant advertisements and re-targeted advertisements to visitors of LTI websites	https://twitter.com/en/privacy Opt out: https://twitter.com/personalization DAA: http://optout.aboutads.info/

Save settings

Overview

Partners

What’s hot

Industries

Roles

Knowledge hub

About Fosfor

The Fosfor Decision Cloud

What’s hot

An ecosystem geared for value

Industries

Roles

Knowledge hub

About Fosfor