Earlier this month, Microsoft said it is spending $80 billion on AI infrastructure in 2025 alone, while Meta CEO Mark Zuckerberg said last week that the social media company planned to invest between $60 billion and $65 billion in capital expenditures in 2025 as part of its AI strategy.
“If model training costs prove to be significantly lower, we would expect a near-term cost benefit for advertising, travel, and other consumer app companies that use cloud AI services, while long-term hyperscaler AI-related revenues and costs would likely be lower,” wrote BofA Securities analyst Justin Post in a note Monday.
Nvidia’s comment also reflects a new theme that Nvidia CEO Jensen Huang, OpenAI CEO Sam Altman and Microsoft CEO Satya Nadella have discussed in recent months.
Much of the AI boom and the demand for Nvidia GPUs was driven by the “scaling law,” a concept in AI development proposed by OpenAI researchers in 2020.
That concept suggested that better AI systems could be developed by greatly expanding the amount of computation and data that went into building a new model, requiring more and more chips.
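For context, the scaling law as formulated in the 2020 OpenAI paper (Kaplan et al., "Scaling Laws for Neural Language Models") is a power law: a model's test loss falls predictably, but slowly, as training compute grows. A rough sketch of the compute relation, with the exponent value approximate:

L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}, \qquad \alpha_C \approx 0.05

where L is the model's test loss, C is total training compute, and C_c is a fitted constant. Under this fit, a tenfold increase in compute trims loss by only about 11 percent, which is why each further increment of model quality demanded ever more chips.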
Since November, Huang and Altman have been focusing on a new wrinkle to the scaling law, which Huang calls “test-time scaling.”
This concept says that if a fully trained AI model spends more time using extra computing power when making predictions or generating text or images, allowing it to "reason," it will provide better answers than it would have if it ran for less time.
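One simple way to see test-time scaling in action is best-of-N sampling with majority voting, often called self-consistency: ask the model the same question several times and keep the most common answer. The sketch below is a minimal illustration of that idea, not OpenAI's or DeepSeek's actual method; best_of_n, sample_fn, and noisy_model are hypothetical names.

import random
from collections import Counter

def best_of_n(sample_fn, prompt, n):
    # Draw n independent answers; more samples means more
    # inference-time compute spent on the same question.
    samples = [sample_fn(prompt) for _ in range(n)]
    # Keep the majority answer (self-consistency voting).
    answer, _count = Counter(samples).most_common(1)[0]
    return answer

# Toy stand-in for a model that answers correctly 60% of the time.
def noisy_model(prompt):
    return "42" if random.random() < 0.6 else str(random.randint(0, 99))

print(best_of_n(noisy_model, "What is 6 * 7?", n=1))   # often wrong
print(best_of_n(noisy_model, "What is 6 * 7?", n=25))  # almost always "42"

With n=1 the toy model is right only 60 percent of the time; with n=25 the majority vote is nearly always correct. Spending more compute at inference buys a better answer, which is the effect Huang and Altman have been describing.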
Forms of the test-time scaling law are used in some of OpenAI’s models such as o1 as well as DeepSeek’s breakthrough R1 model.