Tech | Stanford and UW Researchers Train a Powerful AI Reasoning Model for Under $50
On January 31, 2025, researchers from Stanford University and the University of Washington unveiled s1, a groundbreaking AI reasoning model trained for less than $50. Remarkably, s1 demonstrates reasoning and coding abilities comparable to OpenAI’s o1 and DeepSeek’s R1 models. Even more impressively, the model, along with its training data and code, has been open-sourced on GitHub, providing researchers and developers worldwide with a low-cost alternative for high-performance AI reasoning. This breakthrough challenges the notion that cutting-edge AI requires massive computational resources, signaling a shift toward more accessible and democratized AI development.
Tool | In-Depth Guide: How to Efficiently Use DeepSeek Model APIs
This article offers a detailed analysis of how to seamlessly integrate and use DeepSeek model APIs across three major platforms—Baidu, ByteDance, and Alibaba Cloud—helping developers leverage cutting-edge AI capabilities with ease.
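Platform deployments of DeepSeek models generally expose an OpenAI-style chat-completions interface. As a minimal sketch of the integration pattern, the snippet below builds and sends such a request; note that the base URL, model identifier, and endpoint path are placeholders, since each platform (Baidu, ByteDance, Alibaba Cloud) publishes its own values.

```python
import json
import urllib.request

# Placeholder endpoint and model name: each platform publishes its own
# base URL and model identifier; these are NOT official addresses.
BASE_URL = "https://example-platform.com/v1/chat/completions"
MODEL = "deepseek-r1"


def build_chat_payload(prompt: str, model: str = MODEL,
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def call_deepseek(prompt: str, api_key: str) -> str:
    """Send the request to the (placeholder) endpoint.

    Requires a valid API key issued by the hosting platform.
    """
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry the reply under choices[0].message
    return body["choices"][0]["message"]["content"]
```

In practice you would swap `BASE_URL` and `MODEL` for the values in the target platform's console and keep the API key out of source code (e.g. an environment variable).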
Draft | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.
Journal | MS | ChatGPT for textual analysis? How to use generative LLMs in accounting research
This paper explores the application of generative large language models (GLLMs) in accounting research and compares them with existing textual analysis techniques.
Journal | RSER | Study on how the digital economy affects urban carbon emissions
The digital economy is crucial in advancing an economically sustainable and low-carbon future and plays a key role in achieving carbon neutrality and carbon peaking. We measured the digital economy development level of each prefecture-level city in China from 2011 to 2019, then investigated the impact of the digital economy on carbon emissions and its mechanism using a panel fixed-effects model and a spatial Durbin model.
Journal | RFS | How to talk when a machine is listening: Corporate disclosure in the age of AI
What impact will the use of machine readers in the era of artificial intelligence have on corporate information disclosure?
Journal | JBF | How cheap talk in climate disclosures relates to climate initiatives, corporate emissions, and reputation risk
This paper introduces deep learning algorithms to identify climate-related greenwashing in the annual reports of companies within the MSCI World Index. The study finds that only targeted climate engagement reduces greenwashing, while voluntary climate disclosures are associated with more greenwashing. Additionally, greenwashing is linked to an increase in negative news coverage and rising emissions. The greenwashing measure can therefore serve as a tool to assess the effectiveness of climate initiatives and to predict reputation and transition risks.
Journal | JFE | A picture is worth a thousand words: Measuring investor sentiment by combining machine learning and photos from news
A large body of research has documented how investor sentiment helps researchers understand and predict stock market returns over time. However, few studies have utilized machine learning methods to extract investor sentiment from news images for predicting stock market returns.
Journal | JFuM | Anger in predicting the index futures returns
This paper studies how different emotions affect index futures returns. Using lagged, text-based emotion indices (anger, joy, fear, optimism, and pessimism), it tests their predictive power for S&P 500 futures returns. The results reveal an asymmetric predictive ability between the pessimism and optimism indices.
Literature | EAPS | Can ChatGPT reduce human financial analysts’ optimistic biases?
This paper explores the potential of the large language model ChatGPT as a financial advisor for forecasting the performance of publicly listed companies. Using the components of the CSI 300 index, ChatGPT's predictions for key financial performance indicators are compared with those of human analysts and actual values. The results suggest that ChatGPT can correct the optimistic biases of human analysts.