Just like us humans, AIs are not perfect and have their flaws. In this second part of the post we look at how to use artificial intelligence in a way that minimizes the risk of hallucinations by writing better prompts.
In this post I summarize some of last year's experience with Large Language Models and explain how to improve your prompts to reduce hallucinations and, at the same time, get more out of your AI interrogation sessions.
Summary of the previous episode.
In the first part dedicated to hallucinations we explored how language models such as ChatGPT, Bard and Claude hallucinate (yes, I never mentioned it before, but I really like Claude!).
We saw that AIs hallucinate for the following reasons:
AIs learn from huge amounts of data, and we can't control whether this information is entirely accurate. Therefore, we cannot always claim 100% accuracy.
As I have already explained in other posts, AIs do not understand the meaning of words; they only know how to combine them in statistically correct ways. They don't feel emotions, they don't have a biological body, they aren't weather-sensitive.
No matter how hard they try, they struggle to capture the intent of the user asking the questions. Especially if badly expressed.
They need a bounded context; otherwise they feel 'free to wander' and compelled to answer something rather than say 'I don't know'.
There are configuration parameters such as Temperature, which define how much leeway an AI can take in generating responses for us. The higher it is, the more imagination increases in the answers.
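To see why temperature changes how much "imagination" goes into the answers, here is a minimal, self-contained sketch of temperature-scaled sampling. The logits are made up, and this is not any vendor's actual implementation; it is just the standard softmax-with-temperature idea:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick a token index from raw model scores (logits).

    Dividing the scores by the temperature before the softmax is the
    standard trick: a low temperature sharpens the distribution toward
    the top-scoring token, a high temperature flattens it."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()  # weighted random choice over the tokens
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return index
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]  # fake scores for three candidate tokens
picks_cold = [sample_with_temperature(logits, 0.05, random.Random(i)) for i in range(100)]
picks_hot = [sample_with_temperature(logits, 5.0, random.Random(i)) for i in range(100)]
```

At temperature 0.05 every draw picks the top-scoring token; at 5.0 all three candidates get chosen, which is exactly the extra "imagination" a high temperature buys you.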
There can be errors, biases, manipulative intent or cultural slant in the training phase, overfitting on limited data, and other technical factors that give AIs 'emergent abilities' that are not sufficiently tested.
The only thing GAIs do is look for the next best word in their context window.
They tend to lie all the time, they do it very well and often without realizing it.
Keep in mind that some of the previous points are also valid for us humans: understanding of the context, basic culture, knowledge acquired over time determine different answers to the same questions.
The importance of providing context
If you think it's effective to ask questions like:
Tell me about artificial intelligence
Or
Write a summary of the Divine Comedy in 100 words
You are way off track! As already mentioned, models need a lot of context, i.e. a lot of text that guides them in the direction you want to go.
Imagine you have just hired an assistant with a Harvard degree, a specialization at Stanford and a PhD at the Polytechnic (just so as not to always be foreign-referenced).
He is very intelligent, very cultured, he can do thousands of things.
But he doesn't know you, he doesn't know anything about you, your company, your processes, your suppliers, your way of communicating, the goals you have and the way you think they can be achieved.
Basically you always have to explain everything to him every time you start a new conversation.
What an effective prompt should look like
Each prompt should have the following characteristics:
Start with a detailed explanation of the role you want to give the AI. If you ask an AI:
“I have a headache. Why?”
or
“You're a stand-up comedian who makes fun of people's problems with dry jokes. Generate 5 jokes starting from this sentence: ‘I have a headache. Why?’”
or
“You are a diagnostic doctor with 10 years of experience. I have a headache. Why?”
you will get very different outputs (be aware that the AI is not your doctor!).
Provide initial information, even if you think the AI already has it, perhaps by sending it links (for GPT you will need a plug-in). For example, before asking
“Describe an innovative idea to launch a new product on the market.”
you will first have to give background information: explain the company, how it was born, what it does, what its objectives are, what customers think of it. Remember that it is a newly hired assistant and knows nothing about you.
Describe which medium and which audience the content is intended for. Writing for a 6-year-old child or a follower of Umberto Eco, for a generalist social network or a technical blog, will change the shape, structure and syntax of the content.
Provide examples if possible. We could talk about this for hours. The more examples you provide, the more coherent the answers will be. Obviously, if you expect long outputs it will be more tiring. But this way you will receive surprisingly more valid answers.
Clearly specify the goal you want to achieve and what you expect. For example, if you want the AI to generate data, tell it what format you want (table, CSV, etc.); if you need a bulleted or numbered list, ask for it; if you are interested in a semantic analysis of a text, the psychology of a character or an analysis of the KPIs of a balance sheet, say so clearly and without beating around the bush.
Work in an iterative mode. Ask one question at a time and use the AI's response as context for the next question. The conversation becomes progressively more coherent and focused, the context window grows longer and more structured, and the chance of mistakes is reduced.
Be written in English. ChatGPT, Claude and Bard have been trained mostly on English text. If you master it, ask questions in English and then ask for the output in Italian. The difference is tangible.
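The ingredients above (role, background, audience, examples, task, output format) can be assembled mechanically. Here is an illustrative helper that builds them into a chat-style message list; the structure mirrors the OpenAI-style "messages" format, but the function and every name in it are my own sketch, not a real API:

```python
def build_prompt(role, background, audience, examples, task, output_format):
    """Assemble the prompt ingredients into a chat-style message list.

    The system message carries role + context; the examples become
    few-shot user/assistant pairs; the real task comes last."""
    system = (
        f"{role}\n\n"
        f"Background:\n{background}\n\n"
        f"Audience and medium: {audience}\n\n"
        f"Required output format: {output_format}"
    )
    messages = [{"role": "system", "content": system}]
    for question, answer in examples:  # few-shot examples
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": task})
    return messages

msgs = build_prompt(
    role="You are a senior copywriter with 10 years of experience.",
    background="Acme S.r.l. sells handmade leather bags to a premium market.",
    audience="Instagram post for a general audience",
    examples=[("Write a tagline for our summer line.",
               "Carry the summer with you.")],
    task="Write 3 taglines for the new autumn collection.",
    output_format="Numbered list",
)
```

Note the order: context first, task last, so the model reads everything before it answers, exactly as the iterative-mode advice suggests.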
Moral: the ideal initial prompt, if you want to work seriously and reduce hallucinations, should be around 400-500 tokens.
After that, you'll be ready to ask direct questions like the ones above.
I know it may seem long and boring, but if you want plausible results it is the recommended way.
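As a rough sanity check on that 400-500 token budget, you can estimate token counts with the common heuristic of about four characters per token for English prose; exact counts require the model's real tokenizer (e.g. tiktoken for GPT models):

```python
def rough_token_count(text):
    # Very rough heuristic: English prose averages about 4 characters
    # per token. Use the model's actual tokenizer for exact counts.
    return max(1, len(text) // 4)

# A context-setting prompt of ~1800 characters lands near the
# 400-500 token budget suggested above.
prompt_tokens = rough_token_count("x" * 1800)
```

So "400-500 tokens" means roughly a page and a half of dense context, not a one-liner.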
The Master Prompts
If you work with GAIs on recurring topics, I suggest you create Master Prompts: prompt templates that set out the context as described above, to be reused every time you want to query the AI on those topics.
You can save them in your chat history or, if they contain sensitive information and you work anonymously, in any text or note editor. (Remember that by default the data you leave to GAI will be used in the next training cycle!)
Once you've done this the first time, you can afford simpler questions afterwards.
I created one for each area I work on: proofreading these texts; producing content for the companies I work with (where I always explain at the start, every time, who the company is, what it does, how it communicates and who the target is); correction and practice of English (or whichever language you prefer); preparation of lessons; writing content of any kind.
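In practice a Master Prompt is just a stored template with blanks for today's specifics. A minimal sketch, where the store, the area names and all the wording are hypothetical examples of mine:

```python
# Hypothetical master-prompt store: one reusable context template per
# recurring area of work, filled in at query time.
MASTER_PROMPTS = {
    "proofreading": (
        "You are an experienced editor for a technology blog. "
        "Keep the author's informal tone; fix grammar and typos only. "
        "If you are not sure about a correction, say 'I don't know' "
        "instead of guessing.\n\nText to proofread:\n{text}"
    ),
    "company_content": (
        "You are the social media manager of {company}, which {description}. "
        "Target audience: {audience}. Tone: {tone}.\n\nTask: {task}"
    ),
}

def render_master_prompt(name, **fields):
    """Fill a stored master prompt with today's specifics."""
    return MASTER_PROMPTS[name].format(**fields)

prompt = render_master_prompt(
    "company_content",
    company="Acme S.r.l.",
    description="sells handmade leather bags",
    audience="premium customers on Instagram",
    tone="warm and concise",
    task="Announce the autumn collection in 3 short posts.",
)
```

The same template carries the full context every time, so after the first long setup the questions themselves can stay short.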
After the Master Prompt, don't be shy: ask for 10, 100 different answers, modify it, evaluate the results and then choose.
Always consider the following in the Master Prompt:
If it can't give you an answer, it has to tell you: ask it to do so explicitly. GAIs tend to want to answer everything, always and in any case, even at the cost of telling you some bullshit!
Work on the temperature: you can state in the master prompt to keep the temperature HIGH (guaranteed hallucinations) or LOW (almost absent, but with drier responses). ChatGPT works with a range of 0 to 2; the default is around 1.
Ask it if it's sure, or if it has further ideas: if it gives you a different answer, keep going until it gives the same answer several times.
If the prompt is long, ask it to read everything before starting to produce output.
Check and verify: every AI interface states it very clearly: perfection does not exist.
The best is the enemy of the good: in 99% of situations a job well done is enough; a perfect one isn't required. You can always give it the final touch.
So…
AIs do not learn from experience: they know nothing about you until you tell them, and the more specific the information you ask for, the harder it is to get the right answers. It's absurd when famous journalists complain that an AI hallucinated their biography: probably their work simply wasn't used to train the AI in question, so there's no need to complain.
We have to know a little about the subject; we have to know how to ask. If you are completely ignorant about a topic, first of all ask the AI what you should ask it.
Simple questions = Simple answers.
Wrong questions = Probably wrong answers.
The human factor in the use of AI is decisive. They seem very easy to use but to get good results you need to know them; like any other software.
Every LLM is built to always respond as if it knows, like the best communicators. We must be aware of this and, when needed, ask it not to invent anything.
There is no 100% perfect answer, the final check is up to you.
What do you think? What else would you like to hear about in the next posts?
PS. If necessary, you will find a short glossary here.