推断（Inferring）

本文是《基于 ChatGPT 的 Prompt 工程》系列课程的第三讲，本文展示如何使用 ChatGPT 来推断文本的情感、总结文本的主题等。

示例是一个台灯的评论：

lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

情感判断

下面的 Prompt 主要询问文本的情感是 Positive，还是 Negative。

prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple backticks?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

模型回答：这段文本是 Positive。

如果只想要 Positive 或者 Negative 的回答，我们需要在 Prompt 中加入这一要求

prompt = f"""
What is the sentiment of the following product review, which is delimited with triple backticks?

Give your answer as a single word, either "positive" \
or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

模型回答： Positive。

列举情绪的种类

prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

模型回答：happy, satisified, grateful, impressed, content

从这个例子可以看出，大模型擅长在文本中挖掘内容。

识别消极情绪

公司可能会有需要识别评论者的消极情绪，比如 anger。

prompt = f"""
Is the writer of the following review expressing anger?\
The review is delimited with triple backticks. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

模型回答：No。

如果采用传统的机器学习，想要识别一段文本中是否有消极情绪，需要 supervised learning。

提取信息

下面的例子是要模型提取出产品和公司的信息，并且以 JSON 格式展示，key 是 “Item” 和 “Brand”。

prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

模型回答：{“Item”: “lamp”, “Brand”: “Lumina”}

这种方式也可以一次性提取多种信息。

prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)

这里把 Anger value 变成 boolean 值，厉害。

推断主题

story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""

prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple backticks.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

response.split(sep=',')

模型打印：

'goverment surver',
'job satisfaction',
'NASA',
'Social security administration',
'employ concerns'

如果我们想知道这个 story 的主题是否包含在 topic_list 中，可以用如下 Prompt 来推断。

topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]

prompt = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple backticks.

Give your answer as list with 0 or 1 for each topic.\

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

模型打印：1 0 0 1 1

在机器学习中，这个例子叫 zero-shot。

下面的代码可以判断出这个故事是不是 NASA 的故事：

1
2
3

topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

模型打印：ALERT: New NASA story!

总结

本文使用电灯和 NASA 故事的例子展示了 ChatGPT 强大的推断能力。传统机器学习领域中训练出的模型，需要精细的调节和带标签的数据集；但是 ChatGPT 不需要就可以做到，真的非常令人震惊。

参考

Inferring