NVIDIA Cosmos 是一個世界基礎(chǔ)模型（WFMs, world foundation models）開發(fā)平臺，用于推動物理 AI 的發(fā)展，包含先進(jìn)的視覺標(biāo)記器、護(hù)欄以及加速視頻數(shù)據(jù)處理工具管線。它專為加速智能駕駛汽車和星空機(jī)器人領(lǐng)域的合成數(shù)據(jù)生成、AI 模型訓(xùn)練與評估而設(shè)計。

本篇文章介紹 Cosmos 最新世界基礎(chǔ)模型 Cosmos Reason-1 如何在阿里云星空人工智能平臺PAI上進(jìn)行快速部署使用。

Cosmos Reason-1 模型簡介

Cosmos Reason-1 是一款可完全定制的多模態(tài) AI 推理模型，它專門為理解運(yùn)動、物體交互以及時空關(guān)系而構(gòu)建。基于思維鏈（Chain-of-thought, CoT）推理，Cosmos Reason-1 模型可以解讀視覺輸入、根據(jù)給定的提示詞預(yù)測結(jié)果、并獎勵最佳決策。

該模型基于真實世界的物理規(guī)律實現(xiàn)推理，從而生成清晰且能夠感知上下文環(huán)境的自然語言回復(fù)。Cosmos Reason-1 能夠通過充當(dāng)判別器或?qū)Ａ恳曈X數(shù)據(jù)進(jìn)行標(biāo)注，從而增強(qiáng)合成數(shù)據(jù)管理能力。

Cosmos Reason-1-7B 基于 Qwen2.5-VL 使用物理常識和具身推理數(shù)據(jù)進(jìn)行后訓(xùn)練，并使用了監(jiān)督微調(diào)（SFT）和強(qiáng)化學(xué)習(xí)（RL）技術(shù)。

更多關(guān)于 Cosmos Reason-1 模型的介紹，您可訪問：

● NVIDIA Research 官網(wǎng)：

https://research.nvidia.com/labs/dir/cosmos-reason1/

● NVIDIA Cosmos 官網(wǎng)：

https://www.nvidia.cn/ai/cosmos/

● NVIDIA Cosmos 開發(fā)者官網(wǎng)：

https://developer.nvidia.cn/cosmos

PAI-Model Gallery 簡介

阿里云 PAI-Model Gallery 已同步接入 Cosmos Reason-1 模型，提供企業(yè)級部署方案。

PAI-Model Gallery 是阿里云星空人工智能平臺 PAI 的產(chǎn)品組件，它集成了國內(nèi)外 AI 開源社區(qū)中優(yōu)質(zhì)的預(yù)訓(xùn)練模型，涵蓋了 LLM、AIGC、CV、NLP 等各個領(lǐng)域。通過 PAI 對這些模型的適配，用戶可以以零代碼方式實現(xiàn)從訓(xùn)練到部署再到推理的全過程，簡化了模型的開發(fā)流程，為開發(fā)者和企業(yè)用戶帶來了更快、更高效、更便捷的 AI 開發(fā)和應(yīng)用體驗。

PAI-Model Gallery 訪問地址：

https://pai.console.aliyun.com/#/quick-start/models

? 零代碼一鍵部署

? 自動適配云資源

? 部署后開箱即用API

? 全流程運(yùn)維托管

? 企業(yè)級安全數(shù)據(jù)不出域

PAI 一鍵部署 Cosmos Reason-1

極簡流程立即體驗

1. 在 PAI-Model Gallery 模型廣場找到 Cosmos Reason-1-7B 模型，或通過鏈接直達(dá)該模型：

https://pai.console.aliyun.com/?regionId=cn-beijing#/quick-start/models/Cosmos-Reason1-7B/intro

2. 在模型詳情頁右上角點擊「部署」，在選擇計算資源后，即可一鍵完成模型的云上部署。

3. 部署成功后，在服務(wù)頁面可以點擊“查看調(diào)用信息”獲取調(diào)用的 Endpoint 和 Token，想了解服務(wù)調(diào)用方式可以點擊預(yù)訓(xùn)練模型鏈接，返回模型介紹頁查看調(diào)用方式說明。

4. 使用推理服務(wù)：您可以使用 API 調(diào)用模型服務(wù)，也可以使用 PAI 平臺提供的 WebUI 界面與模型交互。

模型實測

我們使用NVIDIA官網(wǎng)提供的視頻案例來測試模型。視頻中一個人正在往杯子中倒牛奶，我們給模型的問題是“視頻中的人將牛奶倒入杯中，下一步最合理的動作是什么？” 模型的回答展示出模型考慮到了人物下一步動作的各種可能性，甚至考慮到了倒完牛奶放好奶瓶后，用勺子攪拌牛奶和咖啡，最后模型在進(jìn)行了一番深入思考后得出了結(jié)論，下一步是“將奶瓶放回到臺面上”。

從這個例子可以看出，模型準(zhǔn)確地“理解”了視頻中人的動作并給出了正確的預(yù)測，這種對運(yùn)動、物體交互以及時空關(guān)系的理解及預(yù)測能力，當(dāng)前一般的語言模型是難以做到的，而且模型輸出了完整的思維鏈，讓我們能清楚地知道模型的整個思考過程。

下面附上完整的視頻、輸入問題、模型輸出結(jié)果，供大家參考。

● 視頻：

點擊鏈接觀看視頻：https://research.nvidia.com/labs/dir/cosmos-reason1/

視頻來源于NVIDIA

● Question：The overall goal is "The agent pours milk into the cup". The agent in the video is currently performing one subtask out of many to complete this instruction. For the agent in the video, what is the most plausible next immediate subtask?

● 模型輸出：The agent is currently pouring milk into the cup. The next immediate subtask after pouring would logically be to stop pouring once the desired amount is achieved. Then, the agent would need to place the milk bottle back on the countertop. After that, they might close the bottle's cap to prevent spills. Finally, they could stir the mixture with the spoon to mix the milk and coffee thoroughly. However, since the current action is pouring, the very next step after pouring would be to stop pouring and set the bottle down. But the question asks for the most plausible next immediate subtask. Since the pouring is ongoing, the next step after completing the pour would be to place the milk bottle back on the countertop. Alternatively, if the pouring is still in progress, maybe adjusting the angle to ensure all milk is poured. But given the context, the agent is actively pouring, so once that's done, placing the bottle down makes sense. However, the answer might be to stop pouring and set the bottle aside. But the exact next step after pouring would be to place the bottle back. Alternatively, maybe the agent needs to check if the cup is full enough. But the most logical next step after pouring is to stop and place the bottle down. So the answer is likely "Place the milk bottle back on the countertop."

Coming Soon：NVIDIA Cosmos Reason-1 tools 模型后訓(xùn)練最佳實踐

通過上面的例子我們已經(jīng)可以看出 Cosmos Reason-1 模型的強(qiáng)大之處，此外，NVIDIA 本次除了開源 Cosmos Reason-1 模型，也開放了 Cosmos Reason-1 tools，包括模型后訓(xùn)練腳本（SFT + RL）。這就意味著，用戶可以根據(jù)自身數(shù)據(jù)定制自己的 Cosmos Reason-1 模型，相信這會吸引眾多開發(fā)者和企業(yè)定制自己的 physical AI 模型。

我們也第一時間測試了 Cosmos Reason-1 tools 的性能表現(xiàn)。在 Qwen2.5-32B-Instruct 模型 + gsm8k數(shù)據(jù)集（Batch size = 2,048）組合上進(jìn)行后訓(xùn)練測試，相比開源框架verl，Cosmos Reason-1 tools 在小規(guī)模集群上實測有1-2倍的性能優(yōu)勢。

PAI平臺將在近期集成 Cosmos Reason-1 tools 模型后訓(xùn)練能力，歡迎您持續(xù)關(guān)注。

聯(lián)系星空

歡迎各位小伙伴持續(xù)關(guān)注使用 PAI-Model Gallery，平臺會不斷上線 SOTA 模型，如果您有任何模型需求，也可以聯(lián)系星空。您可通過搜索釘釘群號（79680024618），加入PAI-Model Gallery用戶交流群。

繼續(xù)閱讀：

星空人工智能技術(shù)網(wǎng) 倡導(dǎo)尊重與保護(hù)知識產(chǎn)權(quán)。如發(fā)現(xiàn)本站文章存在版權(quán)等問題，煩請30天內(nèi)提供版權(quán)疑問、身份證明、版權(quán)證明、聯(lián)系方式等發(fā)郵件至1851688011@qq.com我們將及時溝通與處理。！：首頁 > 星空人工智能產(chǎn)業(yè) > 智能物聯(lián) » Cosmos on PAI系列一：PAI-Model Gallery云上一鍵部署NVIDIA Cosmos Reason-1

星空人工智能技術(shù)網(wǎng)

Cosmos on PAI系列一：PAI-Model Gallery云上一鍵部署NVIDIA Cosmos Reason-1