
Contact 📫

The main contributor to this repository is a Master's student graduating in 2026. Feel free to reach out about collaboration or opportunities.

News 📅

[2026/03]: We now support a CLI and have released fine-tuned models on Hugging Face 🤗!
[2026/01]: Added free-form and template-based PPTX export, and offline mode is now available! Added context management to avoid context overflow.
[2025/12]: 🔥 Released V2 with major improvements - deep research integration, free-form visual design, autonomous asset creation, text-to-image generation, and an agent environment with a sandbox & 20+ tools.
[2025/09]: 🛠️ Added MCP server support - see MCP Server for configuration details
[2025/09]: 🚀 Released v2 with major improvements - see the release notes for details
[2025/08]: 🎉 Paper accepted to EMNLP 2025!
[2025/05]: ✨ Released v1 with core functionality and 🌟 a milestone: 1,000 stars on GitHub! - see the release notes for details
[2025/01]: 🔓 Open-sourced the codebase, with the experimental code archived at experiment release

Usage 📖

[!IMPORTANT]

Windows is not supported. If you are on Windows, please use WSL.

We recommend starting with the CLI and a minimal task to confirm that the dependencies and environment are configured correctly.

Configuration

If you use the CLI, pptagent onboard can help you create and update these configuration files interactively. If you use Docker Compose or build from source, you need to prepare them manually:

cp deeppresenter/config.yaml.example deeppresenter/config.yaml
cp deeppresenter/mcp.json.example deeppresenter/mcp.json
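
The two copies above can also be scripted so that an existing configuration is never overwritten. This is a minimal sketch, assuming the repository root is the working directory; `ensure_configs` is a hypothetical helper name and is not part of the project.

```python
import shutil
from pathlib import Path

# Config files named in the manual setup step; each has a ".example" template beside it.
CONFIG_FILES = ("deeppresenter/config.yaml", "deeppresenter/mcp.json")


def ensure_configs(root: Path = Path(".")) -> list[Path]:
    """Copy each `<name>.example` to `<name>` unless the target already exists.

    Returns the list of files that were created.
    """
    created = []
    for rel in CONFIG_FILES:
        target = root / rel
        template = target.with_name(target.name + ".example")
        if target.exists():
            continue  # keep the user's existing settings
        if not template.is_file():
            raise FileNotFoundError(f"missing template: {template}")
        shutil.copyfile(template, target)
        created.append(target)
    return created
```

Calling `ensure_configs()` from the repository root creates only the files that are missing, so it is safe to run again after editing a configuration.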

#### Optional Services for Higher Quality

The following services can significantly improve generation quality, especially research depth, PDF parsing, and visual asset generation:

Tavily: improves web search quality. Get an API key at tavily.com, then set TAVILY_API_KEY in deeppresenter/mcp.json.
MinerU: improves PDF parsing quality. You can get an API key at mineru.net and set MINERU_API_KEY in deeppresenter/mcp.json, or deploy MinerU locally and set MINERU_API_URL instead.
Text-to-image model: improves image generation quality. Configure t2i_model in deeppresenter/config.yaml.
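
To see which of the optional keys are filled in, the sketch below scans deeppresenter/mcp.json for them. The layout of mcp.json is not described here, so the code makes only one assumption: each key appears somewhere in the file as a JSON object key with a string value. The function names are hypothetical, and a placeholder value copied from the example file would still count as set.

```python
import json
from pathlib import Path

# Optional keys named in the list above.
OPTIONAL_KEYS = ("TAVILY_API_KEY", "MINERU_API_KEY", "MINERU_API_URL")


def find_values(node, key):
    """Yield every value stored under `key` anywhere in a parsed JSON tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def optional_service_status(mcp_json_path):
    """Map each optional key to True if any occurrence holds a non-empty string."""
    data = json.loads(Path(mcp_json_path).read_text(encoding="utf-8"))
    return {
        key: any(isinstance(v, str) and bool(v.strip()) for v in find_values(data, key))
        for key in OPTIONAL_KEYS
    }
```

For example, `optional_service_status("deeppresenter/mcp.json")` returns a dictionary with one boolean per key.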

For a fully offline setup, deploy MinerU locally and set offline_mode: true in deeppresenter/config.yaml to avoid network-dependent tools such as web search.

More configuration variables can be found in constants.py.

1. Personal Use / OpenClaw Integration: CLI

[!NOTE]

On macOS, the CLI can automatically install several local dependencies, including Homebrew, Node.js, Docker, poppler, Playwright, and llama.cpp.

On Linux, you need to prepare the environment yourself.

Use this mode if you want the fastest local setup or want to connect DeepPresenter to OpenClaw through the CLI.

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# First-time interactive setup
uvx pptagent onboard
# Generate a presentation
uvx pptagent generate "Single Page with Title: Hello World" -o hello.pptx
# Generate with attachments
uvx pptagent generate "Q4 Report" \
  -f data.xlsx \
  -f charts.pdf \
  -p "10-12" \
  -o report.pptx

| Command           | Description                                       |
| ----------------- | ------------------------------------------------- |
| pptagent onboard  | Interactive configuration wizard                  |
| pptagent generate | Generate a presentation                           |
| pptagent config   | Show the current configuration                    |
| pptagent reset    | Reset the configuration                           |
| pptagent serve    | Start the local inference service used by the CLI |
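
When the CLI is driven from a script, the generate command can be assembled as an argument list instead of a shell string, which avoids quoting problems in prompts. This is a minimal sketch that uses only the flags shown in the examples above (-f, -p, -o). The function names are hypothetical, and the value for -p is passed through unchanged because its exact meaning is not documented here.

```python
import shlex
import subprocess


def build_generate_command(prompt, output, attachments=(), pages=None):
    """Build the argv list for `uvx pptagent generate` from the flags in the examples."""
    cmd = ["uvx", "pptagent", "generate", prompt]
    for path in attachments:
        cmd += ["-f", str(path)]  # one -f per attachment
    if pages is not None:
        cmd += ["-p", str(pages)]  # passed through as given, e.g. "10-12"
    cmd += ["-o", str(output)]
    return cmd


def run_generate(prompt, output, **kwargs):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    cmd = build_generate_command(prompt, output, **kwargs)
    print("running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For instance, `run_generate("Q4 Report", "report.pptx", attachments=["data.xlsx", "charts.pdf"], pages="10-12")` mirrors the attachment example above.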

2. Minimal Setup / Development: Build from Source

Use this mode if you want the thinnest abstraction layer and full control over dependencies during development.

uv pip install -e .
playwright install-deps
playwright install chromium
npm install --prefix deeppresenter/html2pptx
modelscope download forceless/fasttext-language-id
docker pull forceless/deeppresenter-sandbox
docker pull forceless/deeppresenter-host
docker tag forceless/deeppresenter-sandbox deeppresenter-sandbox
# Or build from the Dockerfile
docker build -t deeppresenter-sandbox -f deeppresenter/docker/SandBox.Dockerfile .

Start the app:

python webui.py
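
The build-from-source commands call several external executables, so a quick check for what is missing from PATH can save a failed run. This is a sketch only: the tool list is read off the commands above, `missing_tools` is a hypothetical helper, and some of these executables may only appear on PATH after the install step has run.

```python
import shutil

# Executables invoked by the build-from-source commands above.
REQUIRED_TOOLS = ("uv", "playwright", "npm", "modelscope", "docker")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given.

    `which` is injectable so the check can be exercised without touching PATH.
    """
    return [name for name in tools if which(name) is None]
```

Running `print(missing_tools())` lists anything that still needs to be installed; an empty list means every executable was found.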

3. Server Deployment: Docker Compose

Use this mode for a stable server environment with explicit dependencies.

# Pull the public images to avoid building from source
docker pull forceless/deeppresenter-sandbox
docker tag forceless/deeppresenter-sandbox deeppresenter-sandbox
# Or build from source
docker build -t deeppresenter-sandbox -f deeppresenter/docker/SandBox.Dockerfile .
# Start the host service
docker compose up -d

The service exposes the web UI on http://localhost:7861.
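
After `docker compose up -d` returns, the containers may still be starting, so a short polling loop can confirm that the web UI is reachable. This is a minimal sketch using only the Python standard library. The URL is the one stated above; the function name, timeout, and interval are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url="http://localhost:7861", timeout=60.0, interval=2.0):
    """Poll `url` until the server answers or `timeout` seconds pass.

    Returns True as soon as any HTTP response arrives (even an error status),
    and False if the server never becomes reachable.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # the server is up, it just returned a 4xx/5xx status
        except OSError:
            pass  # connection refused or timed out: not ready yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A deployment script could call `wait_for_http()` and exit with an error if it returns False.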

Case Study 💡

#### Prompt: Please present the given document to me.

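#### Prompt: 请介绍小米 SU7 的外观和价格 (Please introduce the exterior and pricing of the Xiaomi SU7)

#### Prompt: 请制作一份高中课堂展示课件，主题为“解码立法过程：理解其对国际关系的影响” (Please create a high school classroom presentation on the topic "Decoding the Legislative Process: Understanding Its Impact on International Relations")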

Contributors 🌟

Force1ess, Puelloc, hongyan, Dnoob, Sadahlu,
KurisuMakiseSame, Angelen, BrandonHu, Eliot White, EvolvedGhost,
ISCAS-zwl, James Brown, JunZhang, Open AI Tx, Sense_wang,
SuYao, Zakir Jiwani, Zhenyu, lnennnn

Citation 🙏

If you find this project helpful, please cite it using the following entries:

@inproceedings{zheng-etal-2025-pptagent,
    title = "{PPTA}gent: Generating and Evaluating Presentations Beyond Text-to-Slides",
    author = "Zheng, Hao and Guan, Xinyan and Kong, Hao and Zhang, Wenkai and Zheng, Jia and Zhou, Weixiang and Lin, Hongyu and Lu, Yaojie and Han, Xianpei and Sun, Le",
    editor = "Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.728/",
    doi = "10.18653/v1/2025.emnlp-main.728",
    pages = "14413--14429",
    ISBN = "979-8-89176-332-6",
    abstract = "Automatically generating presentations from documents is a challenging task that requires accommodating content quality, visual appeal, and structural coherence. Existing methods primarily focus on improving and evaluating the content quality in isolation, overlooking visual appeal and structural coherence, which limits their practical applicability. To address these limitations, we propose PPTAgent, which comprehensively improves presentation generation through a two-stage, edit-based approach inspired by human workflows. PPTAgent first analyzes reference presentations to extract slide-level functional types and content schemas, then drafts an outline and iteratively generates editing actions based on selected reference slides to create new slides. To comprehensively evaluate the quality of generated presentations, we further introduce PPTEval, an evaluation framework that assesses presentations across three dimensions: Content, Design, and Coherence. Results demonstrate that PPTAgent significantly outperforms existing automatic presentation generation methods across all three dimensions."
}

@misc{zheng2026deeppresenterenvironmentgroundedreflectionagentic,
    title={DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation},
    author={Hao Zheng and Guozhao Mo and Xinru Yan and Qianhao Yuan and Wenkai Zhang and Xuanang Chen and Yaojie Lu and Hongyu Lin and Xianpei Han and Le Sun},
    year={2026},
    eprint={2602.22839},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    url={https://arxiv.org/abs/2602.22839},
}
