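Close reading | FRL | Can ChatGPT assist in picking stocks?
Cover image source: ChatGPT-4 DALL·E
Abstract: This paper examines whether ChatGPT-4 with internet access can provide valuable investment advice and evaluate financial information in a timely manner. Using a live experiment, the authors find that ChatGPT-4's ratings correlate positively with future earnings announcements and stock returns. The evidence indicates that ChatGPT-4 adjusts its ratings promptly in response to earnings surprises and news events, and that a strategy based on its "attractiveness ratings" can yield positive returns.
Citation: Pelster, M., & Val, J. (2024). Can ChatGPT assist in picking stocks? Finance Research Letters, 59, 104786.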
Highlights
We conduct a live experiment to evaluate whether ChatGPT can pick stocks.
ChatGPT’s earnings forecasts significantly correlate with actual earnings.
ChatGPT’s attractiveness ratings significantly correlate with future stock returns.
ChatGPT updates its ratings to news information in a timely manner.
Keywords
Information processing; Artificial intelligence (AI); ChatGPT;
Introduction
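News is central to financial markets: it gives investors the information they need to judge an asset's intrinsic value and make sound investment decisions. In today's era of information overload, however, the sheer volume of information (some of it conflicting) can leave investors unable to evaluate it thoroughly. This holds especially for individual investors, who face tighter time and technical constraints than institutional investors.
As an advanced artificial intelligence (AI) language model, ChatGPT has emerged as a tool that may benefit investors. Drawing on its natural language processing (NLP) capabilities, it distills large volumes of information into concise summaries based on previously observed patterns, which helps investors sift through news and other data efficiently. It also answers specific queries, enabling interactive communication tailored to an investor's needs. The technology is broadly accessible and may lower information costs, making financial knowledge available to a wider public.
This study evaluates ChatGPT's ability to synthesize financial news and asks whether it can deliver accurate, actionable investment insights. A live experiment was run during the Q2 2023 earnings season to test whether ChatGPT can reliably assess earnings surprises and other news events. Earnings announcements serve as the test setting because they are scheduled major corporate events that create an information-rich environment, with complexity and nuance that investors find hard to assess.
$\color{red}{A\ live\ experiment\ effectively\ avoids\ information\ leakage}$. Specifically, ChatGPT's performance cannot be analyzed on historical data, because there is no guarantee that ChatGPT does not draw on "future" data.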
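The empirical results show that, after controlling for consensus forecasts, ChatGPT's earnings forecasts are positively correlated with actual earnings. Its "attractiveness ratings" are positively correlated with future stock returns. In addition, ChatGPT responds to news in a timely manner.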
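To make the idea of ratings being "positively correlated with future stock returns" concrete, here is a minimal sketch that computes a Spearman rank correlation between ratings and subsequent returns. It is not the authors' specification: the choice of a rank correlation, the function names, and the sample numbers are all assumptions made for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data (not from the paper): one rating and one
# subsequent return per stock.
ratings = [1, 2, 3, 4, 5]
future_returns = [-0.02, 0.00, 0.01, 0.03, 0.05]
print(spearman(ratings, future_returns))
```

A value near +1 means higher-rated stocks tended to earn higher subsequent returns in the sample; a value near 0 means the ratings carried no ordering information.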
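By studying earnings announcement forecasts and stock attractiveness ratings, the paper explores two use cases for ChatGPT in support of investors' decision-making. In the months following an announcement, positive (negative) surprises lead to positive (negative) cumulative abnormal returns (CAR). Before earnings are released, ChatGPT can be asked to help select stocks and to assign attractiveness ratings. Because investors face cognitive and time constraints in processing information, working from a preselected set of stocks is more tractable than considering the entire market.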
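For readers unfamiliar with cumulative abnormal returns, the sketch below shows one common way to compute them. It assumes the simple market-adjusted model, in which the abnormal return on each day is the stock return minus the market return; the paper may use a different benchmark. The function name and the sample numbers are hypothetical.

```python
def cumulative_abnormal_return(stock_returns, market_returns):
    """Sum of daily abnormal returns over an event window.

    Assumes the market-adjusted model: abnormal return = stock - market.
    """
    if len(stock_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    return sum(r - m for r, m in zip(stock_returns, market_returns))


# Hypothetical three-day window after an earnings announcement.
stock = [0.02, 0.01, -0.005]
market = [0.01, 0.00, 0.005]
car = cumulative_abnormal_return(stock, market)
print(f"CAR over the window: {car:.4f}")
```

A positive CAR after a positive earnings surprise would be consistent with the drift pattern described above.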
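The paper contributes to the literature on use cases of ChatGPT in finance ($\color{blue}{Dowling\ and\ Lucey,2023}$; $\color{blue}{Fieberg\ et\ al.,2023}$; $\color{blue}{Niszczota\ and\ Abbas,2023}$). By providing tutoring-like support and helping to interpret data in a conversational manner, ChatGPT and similar technologies are likely to affect the decision-making processes of investors. The rapid integration of ChatGPT and similar technologies across industries suggests that people will rely increasingly on information retrieved, summarized, and analyzed by AI. The results suggest that the technology could support financial institutions in deployments that help clients and employees extract and interpret information.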
What is ChatGPT?
ChatGPT, based on OpenAI’s GPT-4 architecture, is a state-of-the-art NLP model designed to generate human-like text based on given prompts. The model is built upon the transformer architecture introduced by $\color{blue}{Vaswani\ et\ al.(2017)}$ and leverages a deep learning approach with several layers of attention mechanisms to process and generate language. Due to its vast training data and advanced architecture, ChatGPT excels at multiple applications such as content generation, translation, and even complex tasks like code writing.
Wide application of LLMs: In finance, large language models (LLMs) can be used in automated customer support, financial analysis, fraud detection, algorithmic trading, personal finance management, regulatory compliance, education, or consulting ($\color{blue}{Cucchiara, 2023}$). Given the wide range of applications, it is not surprising that Morgan Stanley is already employing ChatGPT in its wealth management business. In the context of investments, $\color{blue}{Fieberg\ et\ al.(2023)}$ argue that “publicly available information is the key driver of predictable asset price movements, which implies that AI applications such as GPT-4 may be particularly well-equipped to adjust portfolio allocation advice to the arrival of new information.” OpenAI’s efforts to integrate a browsing function may significantly change ChatGPT’s abilities in this field, as current information from the web bypasses ChatGPT’s knowledge cutoff in September 2021.
Considering the vast array of information that needs to be evaluated and the limited attention of investors ($\color{blue}{Hirshleifer\ and\ Teoh, 2003}$), leveraging LLMs to assist in the information processing could be a game-changer in investment management. However, ChatGPT’s responses are generated based on patterns observed in its training data, making it vulnerable to producing misinformation if presented with ambiguous or misleading prompts ($\color{blue}{Lipton\ and\ Steinhardt,\ 2019}$). It does not “understand” information in the human sense; rather, it regurgitates patterns it has learned. This can sometimes lead to outputs that are plausible-sounding but factually incorrect. (Compared to previous versions, GPT-4 reduces the probability of hallucination, i.e., that the language model generates random answers not in line with reality.) Additionally, while the model can generate text on a wide range of topics, its depth of expertise on specific, nuanced subjects may vary, often reflecting the distribution and quality of its training data. To summarize, the potential strength of the model in evaluating financial information comes from its ability to observe patterns in past data (the training data) and to apply these patterns to current information. If the same or very similar patterns lead to different outcomes, then the evaluation leads to misleading recommendations.
Related literature
Research on ChatGPT in financial contexts is still in its infancy. A few studies investigate whether ChatGPT can potentially assist portfolio construction. Ko and Lee (2023) focus on asset classes and find that ChatGPT’s selections are better diversified and outperform random portfolios. Lopez-Lira and Tang (2023) use ChatGPT to assess the sentiment of single stock news headlines and find that portfolios based on ChatGPT sentiment scores outperform portfolios based on other LLMs such as BERT. Smales (2023) studies whether ChatGPT is able to classify monetary policy decisions made by the Reserve Bank of Australia.
Niszczota and Abbas (2023) investigate whether GPT can serve as a financial robo-advisor. The authors show that ChatGPT based on GPT-4 achieves a near-perfect 99% score in a financial literacy test, which is a significant improvement over previous versions. Fieberg et al. (2023) design hypothetical investor profiles varying in their risk tolerance, age, and investment horizon. For those investors, the authors then ask ChatGPT to recommend a portfolio composition. They find that GPT’s suggestions reflect an investor’s individual circumstances such as their risk tolerance, risk capacity, and sustainability preferences but are insensitive to the investment horizon. Nevertheless, they conclude that the historical risk-adjusted performance is on par with a professionally managed benchmark portfolio.
Unlike the existing literature, we run a live experiment to examine a version of ChatGPT that autonomously browses the internet and weighs the relevance of the obtained information for its responses. We explore ChatGPT’s performance in processing time-sensitive information and ranking stocks, which is a common practice of investors to navigate the vast array of investment options.
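- $\color{purple}{Panel\ A}$ analyzes the full sample of common stocks, $\color{purple}{Panel\ B}$ analyzes NYSE small-cap stocks, and $\color{purple}{Panel\ C}$ analyzes the remaining non-small-cap stocks;
- $\color{green}{More\ complex\ models\ have\ higher\ Sharpe\ ratios}$;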
References
- Dowling, M., & Lucey, B. (2023). ChatGPT for (finance) research: The Bananarama conjecture. Finance Research Letters, 53, 103662.
- Fieberg, C., Hornuf, L., & Streich, D. (2023). Using GPT-4 for financial advice. Available at SSRN 4488891 (since taken down).
- Niszczota, P., & Abbas, S. (2023). GPT has become financially literate: Insights from financial literacy tests of GPT and a preliminary test of how people use it as a source of advice. Finance Research Letters, 58, 104333.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
- Cucchiara, R. (2023). What large language models like GPT can do for finance. https://www.nature.com/articles/d43978-023-00095-8.
- Hirshleifer, D., & Teoh, S. H. (2003). Limited attention, information disclosure, and financial reporting. Journal of Accounting and Economics, 36(1-3), 337-386.
- Lipton, Z. C., & Steinhardt, J. (2019). Troubling Trends in Machine Learning Scholarship: Some ML papers suffer from flaws that could mislead the public and stymie future research. Queue, 17(1), 45-77.