Cover image source: ChatGPT-4 DALL·E

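Abstract: The introduction of ChatGPT has changed the way people handle textual data. This paper designs a prompting strategy for ChatGPT to identify and analyze unusual aspects of financial communication, focusing on the earnings calls of S&P 500 firms. Using the recent GPT-4-Turbo model, unusual financial communication is identified along 25 dimensions and grouped into four categories: unusual communication by executives, unusual communication by financial analysts, unusual content, and technical issues. Most earnings calls contain unusual financial communication, which is related to certain firm characteristics and varies with the business cycle. The stock market reacts negatively to unusual communication, and trading activity is elevated. Large language models such as ChatGPT show potential for financial analysis and can offer new insights into complex textual data and its economic consequences for markets.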

Citation: Beckmann, L., Beckmeyer, H., Filippou, I., Menze, S., & Zhou, G. (2024). Unusual Financial Communication - Evidence from ChatGPT, Earnings Calls, and the Stock Market (January 15, 2024).

Paper:

Introduction

Data

Firms in the S&P 500 typically enjoy the largest following by financial analysts and reflect novel information sooner than smaller firms ($\color{blue}{Zhang,\ 2006}$).
For smaller firms, data quantity and quality are much lower and the price delay may be too large, obscuring the pricing channel in question ($\color{blue}{Hou\ and\ Moskowitz,\ 2005}$).

Contribution

Engineer a suitable prompting approach for ChatGPT to identify and understand unusualness in earnings calls.
- Apply a three-step prompting strategy to identify unusualness in earnings calls. The strategy relies solely on ChatGPT’s general knowledge and does not require an external definition of unusualness.
- Feed ChatGPT the earnings call transcripts of S&P 500 firms; it returns ‘usual’ or ‘unusual’ together with a justification.
Identify unusual financial communication in earnings calls and investigate its correlation with firm characteristics, industry affiliation, and macroeconomic indicators across various business cycles.
- Identify 25 dimensions of unusualness in earnings calls, which can be classified into four broader categories: unusual communication by executives, by financial analysts, unusual contents, and unusual technical issues.
- Firms with unusual communication tend to be larger but less profitable, and are more likely to be momentum losers.
- Relate the different dimensions of unusual communication to various macroeconomic indicators.
- Document that the degree of unusual communication varies with the business cycle.
Investigate to what extent market participants react to unusual earnings calls and, if so, which dimensions of unusual communication are responsible for this reaction.
- Firms typically earn high returns on earnings announcement dates ($\color{blue}{Savor\ and\ Wilson,\ 2016}$).
- Announcement returns of firms with unusual communication are significantly lower, and in fact indistinguishable from zero.
- This result stands both when value-weighting and equally-weighting the respective firms.
- Find large differences in the effects of the 25 dimensions.
  - Technical difficulties are related to a positive return impact (equally-weighted) or no return impact (value-weighted).
  - The largest negative return impact is produced by a lack of critical questioning by analysts (−4.24%), repetitive questions (−2.83%), or the announcement of surprising information (−2.03%).

Findings

Trading activity is significantly elevated for firms with unusual communication along many of the 25 identified dimensions.
- The literature has proposed trading volume as a measure for disagreement across investors.
- $\color{blue}{Hong\ and\ Stein\ (2007)}$ argue that disagreement may arise when investors possess different information sets or if new information leads them to update their beliefs.
Find a negative and highly significant announcement return impact of unusual communication.

Advantage

Rely on the same model for the decision of whether a particular earnings call was unusual or not.
The model processes textual information significantly faster than humans would, and it combines much of the aggregate reasoning from its training corpus ($\color{blue}{de\ Kok,\ 2023}$).
Utilize the most recent iteration of ChatGPT, known as GPT-4-Turbo, which can process texts of up to 128,000 tokens in a single prompt.

Methodology

Data Description

Object: Firms in the S&P 500
- Presentation
- Q&A session※
  - .txt files
  - the date and time of when the earnings call took place
  - the reporting quarter
  - the company name and ticker
  - feed ChatGPT with the Q&A session only
Data sources: Refinitiv
Sample period: 2015.01-2022.12

Accessing ChatGPT: Access GPT-4-Turbo via its application programming interface (API).
- An updated version of GPT-4
- Released on November 6, 2023
- Token limit increased to 128,000

Prompt Engineering

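Three-step prompting approach (to learn whether an earnings call is unusual and what characterizes the unusualness):

Step 1: Provide ChatGPT with a random sample of 1,000 Q&A session transcripts from 2015 to 2023 and ask the model to judge whether each Q&A session is unusual; if it is, the model provides a written justification.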

Prompt 1: Please read the following transcript of a Question-and-Answer session from the earnings conference call of company {firm} ({ticker}) carefully. Determine whether the Question-and-Answer session of this earnings conference call is ‘usual’ or ‘unusual’: If the Question-and-Answer session is classified as ‘usual’, state ‘usual’ without any justifications or further output. If the Question-and-Answer session is classified as ‘unusual’, state ‘unusual’ and provide a justification for this classification. Transcript of the Question-and-Answer Session: ‘{qa}’.
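As a minimal sketch of how Step 1 could be scripted, the Python below fills the Prompt 1 template and parses a reply into a label and a justification. The function names `build_prompt_1` and `parse_step_1` are hypothetical, the reply format accepted by the parser is an assumption rather than something the paper specifies, and the API call itself is left out.

```python
from typing import Tuple

# Prompt 1 as quoted in these notes; {firm}, {ticker} and {qa} are filled per call.
PROMPT_1 = (
    "Please read the following transcript of a Question-and-Answer session "
    "from the earnings conference call of company {firm} ({ticker}) carefully. "
    "Determine whether the Question-and-Answer session of this earnings "
    "conference call is ‘usual’ or ‘unusual’: If the Question-and-Answer "
    "session is classified as ‘usual’, state ‘usual’ without any "
    "justifications or further output. If the Question-and-Answer session is "
    "classified as ‘unusual’, state ‘unusual’ and provide a justification for "
    "this classification. Transcript of the Question-and-Answer Session: ‘{qa}’."
)


def build_prompt_1(firm: str, ticker: str, qa: str) -> str:
    """Fill the Prompt 1 template for one earnings call (hypothetical helper)."""
    return PROMPT_1.format(firm=firm, ticker=ticker, qa=qa)


def parse_step_1(reply: str) -> Tuple[str, str]:
    """Split a reply into (label, justification).

    Assumption: the reply starts with 'usual' or 'unusual', optionally quoted,
    and any justification follows the label.
    """
    text = reply.strip().lstrip("‘'\"")
    lowered = text.lower()
    if lowered.startswith("unusual"):
        justification = text[len("unusual"):].lstrip("’'\" .:-/\n\t")
        return "unusual", justification
    if lowered.startswith("usual"):
        return "usual", ""
    raise ValueError("reply does not start with 'usual' or 'unusual'")
```

The justifications collected this way are what the second step feeds back to the model.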

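Step 2: Systematize the unusual results. Collect the justifications for unusualness from Step 1 and pass them to a new prompt that asks ChatGPT to derive high-level categories from them.

Prompt 2: Please read the provided text file with justifications for unusual Q&A sessions from earnings conference calls carefully. What are high-level categories to identify unusual Q&A sessions? Make sure that each statement from the text file can be assigned to one of the categories.

Step 3: Check every Q&A session for unusualness along the dimensions above.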

Prompt 3: Please read the following transcript of a Question-and-Answer session from the earnings conference call of company {firm} ({ticker}) carefully. Determine whether the Question-and-Answer session of this earnings conference call is ‘usual’ or ‘unusual’ in the following {len(categories)} categories: {categories}
For each category, state whether the Question-and-Answer session is ‘usual’ or ‘unusual’. If the Question-and-Answer session is classified as ‘usual’ in the respective category, state ‘usual’ without any justifications or further output. If the Question-and-Answer session is classified as ‘unusual’ in the respective category, state ‘unusual’, print a ‘/’, and provide a justification for this classification. Transcript of the Question-and-Answer Session: ‘{qa}’
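A possible way to script Step 3 is sketched below: the template is filled with the category list, and a reply is parsed into one label per category. The helper names, the "; " separator used to render the categories, and the assumed reply layout (one line per category, `Name: label / justification`) are illustrative assumptions, and the category names in the usage note are made up for the example.

```python
from typing import Dict, List, Tuple

# Prompt 3 as quoted in these notes. The {len(categories)} placeholder is
# renamed to {n_categories} because str.format cannot evaluate expressions.
PROMPT_3 = (
    "Please read the following transcript of a Question-and-Answer session "
    "from the earnings conference call of company {firm} ({ticker}) carefully. "
    "Determine whether the Question-and-Answer session of this earnings "
    "conference call is ‘usual’ or ‘unusual’ in the following {n_categories} "
    "categories: {categories}\n"
    "For each category, state whether the Question-and-Answer session is "
    "‘usual’ or ‘unusual’. If the Question-and-Answer session is classified as "
    "‘usual’ in the respective category, state ‘usual’ without any "
    "justifications or further output. If the Question-and-Answer session is "
    "classified as ‘unusual’ in the respective category, state ‘unusual’, "
    "print a ‘/’, and provide a justification for this classification. "
    "Transcript of the Question-and-Answer Session: ‘{qa}’"
)


def build_prompt_3(firm: str, ticker: str, qa: str, categories: List[str]) -> str:
    """Fill Prompt 3; joining categories with '; ' is an assumption."""
    return PROMPT_3.format(
        firm=firm,
        ticker=ticker,
        n_categories=len(categories),
        categories="; ".join(categories),
        qa=qa,
    )


def parse_step_3(reply: str, categories: List[str]) -> Dict[str, Tuple[str, str]]:
    """Map each recognized category to (label, justification).

    Assumption: each reply line looks like 'Name: usual' or
    'Name: unusual / justification'. Other lines are ignored.
    """
    result: Dict[str, Tuple[str, str]] = {}
    for line in reply.splitlines():
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or name not in categories:
            continue
        label, _, justification = rest.partition("/")
        label = label.strip().strip("‘’'\"").lower()
        if label in ("usual", "unusual"):
            result[name] = (label, justification.strip())
    return result
```

For example, with the illustrative categories `["Technical Difficulties", "Evasive Answers"]`, a reply line `Evasive Answers: unusual / Management declined to answer twice.` is stored as `("unusual", "Management declined to answer twice.")`.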

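Unusual communication in Tesla's 2018 Q1 earnings call serves as an example.

Table 1 summarizes the dimensions of unusual communication. ChatGPT provides 25 high-level categories, which are further reduced to four broader categories:
- Unusual communication by executives: unusual behavior of management when answering analysts' questions, such as overly long or emotional answers, vague or evasive answers, or even self-contradiction, suggesting that the management team may not have been fully prepared.
- Unusual analyst interaction: off-topic or repetitive questions from analysts, an unusually large number of questions, repeated focus on specific participants or topics, frequent questions unrelated to financials, and a lack of probing on key issues.
- Unusual content: company changes, leadership changes, strategic insights, discussion of legal and regulatory issues, operational and management issues, analysis of the firm's market behavior, detailed discussion of specific (financial or non-financial) topics, the impact of external events, macroeconomic factors, and announcements, as well as in-depth discussion of the company's products or services.
- Technical issues: technical or operational problems encountered during the call that may affect the accuracy of the information or data conveyed.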

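Advantages of ChatGPT

Reproducibility.
Interprets textual information faster and combines more information.
Outperforms manual methods and the LM dictionary.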

Identifying Unusual Communication

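Frequency of Unusual Communication

Table 1 reports the frequency of unusual communication.
Table 2 shows how likely a firm with unusual communication along one dimension is to also show unusual communication along another dimension of the same category.
Table 3 shows how often a firm with unusual communication in at least one dimension of one category (e.g., executives) also shows signs of unusual communication in another category (e.g., analysts).
- Financial analysts are important to set the tone of the discussions during the earnings call.
Taken together, Tables 1 to 3 show that patterns of unusual communication are highly heterogeneous.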
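The co-occurrence statistics described for Tables 2 and 3 can be read as conditional frequencies. The sketch below computes, for each ordered pair of dimensions, the share of calls flagged in the first dimension that are also flagged in the second. The function name, the input layout (one dict of boolean flags per call), and the toy flags are assumptions made for illustration, not the paper's data or code.

```python
from typing import Dict, List, Tuple


def conditional_cooccurrence(
    calls: List[Dict[str, bool]], dims: List[str]
) -> Dict[Tuple[str, str], float]:
    """Return P(unusual in b | unusual in a) for every ordered pair (a, b)."""
    table: Dict[Tuple[str, str], float] = {}
    for a in dims:
        n_a = sum(1 for call in calls if call[a])
        for b in dims:
            if a == b:
                continue
            both = sum(1 for call in calls if call[a] and call[b])
            # NaN when no call is flagged in dimension a
            table[(a, b)] = both / n_a if n_a else float("nan")
    return table


# Toy flags for four calls along two made-up dimensions "A" and "B".
toy_calls = [
    {"A": True, "B": True},
    {"A": True, "B": False},
    {"A": True, "B": False},
    {"A": False, "B": False},
]
toy_table = conditional_cooccurrence(toy_calls, ["A", "B"])
```

The table is not symmetric: a dimension that is rarely flagged can almost always coincide with a common one, while the reverse conditional frequency stays low.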

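Unusual Firms

Table 4 relates unusual communication to firm characteristics.
- Overview of Table 4
  - For each of the 25 dimensions in the 4 categories, compare firms with and without unusual communication to see whether they differ systematically in financial ratios and firm characteristics, and thus whether particular firms are more prone to unusual communication.
  - The characteristics considered are market capitalization (Size), book-to-market ratio (B2M), investment (Inv), profitability (Prof), momentum (Mom), maximum return over 21 days (MAX), liquidity (Illiq), and nominal share price (PRC).
  - Prof (-2%): firms with unusual communication are on average 2% less profitable.
- Unusual communication by executives
  - Firms with unusual executive communication are markedly less profitable (Prof), slightly larger on average (Size), and have slightly higher stock liquidity (Illiq).
  - The individual dimensions of unusual executive communication are highly heterogeneous.
- Unusual communication by analysts
  - Associated with larger market capitalization (Size), slightly higher investment (Inv) but lower profitability (Prof), $\color{red}{higher\ stock\ liquidity\ (Illiq)}$, and a higher trading price (PRC).
- Technical issues
  - Unrelated to firm characteristics.
- The above suggests that ChatGPT can pick up meaningful unusual communication from earnings call transcripts.
- Unusual content
  - Larger firms on average (Size), lower profitability (Prof), momentum (Mom) losers, higher stock liquidity (Illiq), and a higher nominal trading price (PRC).
  - Firms that discuss legal issues (Legal) tend to be value stocks with a high book-to-market ratio.
  - Firms that announce surprising news (Surprise) or company or management changes (Changes) are more likely to be momentum (Mom) losers, meaning their past stock returns have lagged those of their peers.
  - Firms with unusual communication along dimensions such as Legal, Surprise, and Changes are less profitable (Prof).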
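The Table 4 comparison can be thought of as a difference in means between flagged and unflagged firms for each characteristic. A minimal sketch follows. The helper name, the record layout, and the toy profitability values are illustrative assumptions (chosen so the gap is -2 percentage points), and no significance test is included.

```python
from statistics import mean
from typing import Dict, List, Union

Record = Dict[str, Union[bool, float]]


def mean_difference(firms: List[Record], flag: str, characteristic: str) -> float:
    """Mean of a characteristic for flagged firms minus the mean for the rest."""
    unusual = [f[characteristic] for f in firms if f[flag]]
    usual = [f[characteristic] for f in firms if not f[flag]]
    return mean(unusual) - mean(usual)


# Toy data, not taken from the paper.
toy_firms = [
    {"unusual_exec": True, "Prof": 0.10},
    {"unusual_exec": True, "Prof": 0.12},
    {"unusual_exec": False, "Prof": 0.13},
    {"unusual_exec": False, "Prof": 0.13},
]
prof_gap = mean_difference(toy_firms, "unusual_exec", "Prof")
```

Repeating the call for each of the 25 flags and each characteristic would fill a grid shaped like Table 4.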
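Table 5 shows whether firms in different industries differ systematically in their tendency toward unusual communication.
- Procedure: collect the SIC code of each firm in the sample and assign it to one of 12 industries following the definitions of $\color{blue}{Kenneth\ French}$.
- Unusual communication is largely unrelated to industry affiliation; that is, it is not specific to certain industries.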

Periods of Unusual Communication

Figure 1 shows the relative incidence over time of unusual communication along three dimensions: External Events Impact, Operational and Management Issues, and Surprising Announcements.
- External Events Impact
  - Around 10% in 2018-2019, peaking at over 25% in 2020 Q1 (that is, a quarter of firms discussed the impact of external events, mainly the Covid crisis).
  - 【Interestingly, however, ChatGPT considers the communication style with this regard of only 1/4 of all firms as unusual.】
  - Peaks again in early 2022 (the Russia-Ukraine war).
- Operational and Management Issues
  - Stable.
- Surprising Announcements
  - Peaks in 2018 and in 2021 Q1.
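The relative incidence plotted in Figure 1 is, in effect, the share of calls per period that are flagged along a given dimension. The sketch below shows one way to compute it. The function name, the quarter labels, the field names, and the toy records are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List


def incidence_by_quarter(calls: List[dict], dimension: str) -> Dict[str, float]:
    """Share of calls per quarter flagged as unusual along one dimension."""
    counts = defaultdict(lambda: [0, 0])  # quarter -> [flagged, total]
    for call in calls:
        counts[call["quarter"]][0] += 1 if call[dimension] else 0
        counts[call["quarter"]][1] += 1
    return {q: flagged / total for q, (flagged, total) in sorted(counts.items())}


# Toy records, not taken from the paper.
toy_calls = [
    {"quarter": "2019Q4", "external_events": False},
    {"quarter": "2019Q4", "external_events": False},
    {"quarter": "2020Q1", "external_events": True},
    {"quarter": "2020Q1", "external_events": False},
    {"quarter": "2020Q1", "external_events": False},
    {"quarter": "2020Q1", "external_events": False},
]
shares = incidence_by_quarter(toy_calls, "external_events")
```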
Table 6 relates unusual communication to the business cycle (univariate regressions).
- Macroeconomic indicators
  - The price-dividend ratio of the S&P 500, net share issuance, the current Treasury bill rate, the term spread, and the default spread ($\color{blue}{Welch\ and\ Goyal,\ 2008}$).
  - VIX.
  - Intermediary capital ratio (ICR) ($\color{blue}{He\ et\ al.,\ 2017}$).
  - Chicago Fed National Activity Index ($\color{blue}{CFNAI}$).
- Analysis of results
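To make the univariate setup behind Table 6 concrete, the sketch below fits an ordinary least squares line of one series on another. It assumes, without confirmation from the notes, that the dependent variable is the per-period share of unusual calls and the regressor is a single macroeconomic indicator. The function name is hypothetical and the numbers in the test are arbitrary values chosen to lie on a straight line.

```python
from typing import Sequence, Tuple


def univariate_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares fit y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no variation")
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope
```

A positive slope on, say, VIX would mean that the share of unusual calls rises when the indicator rises; standard errors are omitted from this sketch.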

Stock Market Reaction to Unusual Communication

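Calculation steps
- Using daily stock prices, compute the cumulative return over the window from the day before to the day after the event date $t$ (that is, $t-1$ to $t+1$).
- Compute equal-weighted and value-weighted returns separately for the 4 categories and 25 dimensions of unusual communication.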
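The two calculation steps can be sketched as follows. The notes do not say whether daily returns are compounded or summed, so compounding is an assumption here, as are the function names and the use of market capitalization as the value weight.

```python
from typing import Optional, Sequence


def event_window_return(daily_returns: Sequence[float], t: int) -> float:
    """Compounded return over trading days t-1, t and t+1 (t is a list index)."""
    if t < 1 or t + 1 >= len(daily_returns):
        raise ValueError("the window [t-1, t+1] must lie inside the series")
    total = 1.0
    for r in daily_returns[t - 1 : t + 2]:
        total *= 1.0 + r
    return total - 1.0


def average_return(
    returns: Sequence[float], weights: Optional[Sequence[float]] = None
) -> float:
    """Equal-weighted mean if weights is None, otherwise a weighted mean
    (weights would be market capitalizations for value weighting)."""
    if weights is None:
        return sum(returns) / len(returns)
    return sum(r * w for r, w in zip(returns, weights)) / sum(weights)
```

For daily returns of 1%, 2% and -1% around the event date, the window return is 1.01 × 1.02 × 0.99 − 1, or about 1.99%.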
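Table 7 relates unusual communication to announcement returns.
- Unusual communication is associated with lower announcement returns: returns are positive for usual communication ($\color{blue}{Savor\ and\ Wilson,\ 2016}$), while returns for unusual communication are indistinguishable from zero.
- Unusual communication by executives is associated with significantly smaller announcement returns (both equal-weighted and value-weighted).
- Unusual analyst interaction is significantly associated with negative returns.
- Unusual content yields significantly smaller announcement returns.
- Alternative method for computing cumulative abnormal returns: $\color{blue}{Fama\ and\ French\ (1993)}$
Table 8 shows that unusual communication leads to more trading.
- Trading volume increases significantly with unusual communication.
- Analysts' repeated focus on specific topics or participants, as well as repetitive questions and a large number of questions, all lead to significantly higher trading volume.
- Technical difficulties do not raise trading volume, which suggests that the higher volume under unusual communication stems from investor disagreement about the information.
Table 9 reports panel regressions.
- Unusual communication lowers announcement returns by 35 basis points.
- Technical difficulties have a positive effect on returns.
- Unusual communication and unusual content are both associated with a large decline in announcement returns.
- Adding SIC industry fixed effects leaves the coefficients on each type of unusual communication almost unchanged.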
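
The notes do not reproduce the regression equation behind Table 9. A generic form consistent with the bullets above, written here as an assumption rather than as the paper's exact specification, is:

$$
CAR_{i,[t-1,\,t+1]} = \alpha + \beta \cdot Unusual_{i,t} + \gamma^{\top} X_{i,t} + \delta_{ind(i)} + \varepsilon_{i,t}
$$

Here $CAR_{i,[t-1,\,t+1]}$ is the cumulative announcement return of firm $i$ around event date $t$, $Unusual_{i,t}$ is an indicator for unusual communication (overall, by category, or by dimension), $X_{i,t}$ is a vector of controls (assumed), and $\delta_{ind(i)}$ is the SIC industry fixed effect. Under this reading, the 35 basis point result corresponds to $\beta \approx -0.0035$.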