
GOT-OCR-2-GUI

By XJF2332


🛑 Support discontinued; future updates will be occasional


About this project

  • Model weights: Mirror site, Original site
  • Original GitHub: GOT-OCR2.0
  • This project was developed on Windows. I have not used Linux and am not familiar with it, so I cannot guarantee that it will run properly there. If you want to deploy on Linux, you can refer to this issue
  • Some code comes from: GLM4, Deepseek

Please give a star

To do

Usage

If any of the folders mentioned here do not exist, please create them
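The folder setup can be scripted. This is a minimal sketch, not part of the project: the folder names (models, imgs, edge_driver, gguf\decoders) are the ones this README names, while the function name `ensure_folders` and the choice of project root are assumptions for illustration.

```python
from pathlib import Path

# Folder names taken from this README; adjust if your checkout differs.
EXPECTED_FOLDERS = ["models", "imgs", "edge_driver", "gguf/decoders"]


def ensure_folders(root: Path) -> list[Path]:
    """Create any missing folders under root and return the ones created."""
    created = []
    for name in EXPECTED_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            # parents=True also creates the intermediate gguf folder.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling `ensure_folders(Path("."))` from the repository root creates only the folders that are missing and leaves existing ones untouched.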

Choose a branch

#### Alpha

The fastest-updating branch; the latest changes are committed here first. Code may sometimes be untested, so this branch is very unstable and sometimes even unusable.

#### main

A relatively stable branch, but some new features may be missing.

Dependencies

This environment has been tested to work properly under Python 3.11.9

#### torch

Choose the appropriate GPU build of torch from the official torch website. I previously used Stable 2.4.1 + cu124. I am currently using Stable 2.0.1 + cu118, which resolves the "1Torch was not compiled with flash attention" warning; no other issues have been found so far.
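To confirm which torch build is actually installed, a short check like the following may help. It is only a sketch: the helper name `describe_torch` is hypothetical, and it returns early if torch is not installed rather than failing.

```python
import importlib.util


def describe_torch() -> dict:
    """Report the installed torch version and whether CUDA is usable."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "version": torch.__version__,      # e.g. "2.0.1+cu118" for a CUDA 11.8 build
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch())
```

If `cuda_build` is `None`, a CPU-only build was installed and the GPU build should be reinstalled from the torch website.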

#### PyMuPDF

In practice, installing PyMuPDF directly from requirements.txt results in ModuleNotFoundError: No module named 'frontend', whereas installing it separately does not; the specific reason is unclear. If the ModuleNotFoundError persists, uninstall both fitz and PyMuPDF first and then reinstall them, which should solve it. In practice, pip install -U PyMuPDF does not work.

pip install fitz
pip install PyMuPDF
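When debugging the ModuleNotFoundError, it can help to see which of the two distributions is currently installed. The sketch below uses only the standard library; the helper name `installed_versions` is an assumption for illustration, and it reports versions without changing anything.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names=("PyMuPDF", "fitz")) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions())
```

Running it before and after the uninstall/reinstall step shows whether both packages were actually removed and restored.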

#### Installing with pip

pip install -r requirements.txt

Someone also reported dependency conflicts when installing from requirements.txt. I could not reproduce this, and pipdeptree did not show any conflicts. The requirements.txt comes directly from pip freeze in my own virtual environment, so it should be fine. For more information, please see issue #4. However, since such issues have indeed occurred, here is a requirements-noversion.txt without version numbers for you to try:

pip install -r requirements-noversion.txt
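For reference, an unpinned requirements list can be derived from pip freeze style lines roughly as follows. This is a sketch of the general idea, not necessarily how requirements-noversion.txt was produced; the function name `strip_versions` is hypothetical.

```python
import re


def strip_versions(lines):
    """Drop version pins such as '==1.2.3' from pip-freeze style lines."""
    result = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Keep everything before the first version operator.
        name = re.split(r"\s*(?:===|==|~=|!=|<=|>=|<|>)", line, maxsplit=1)[0]
        result.append(name)
    return result
```

Lines without a version operator (for example direct references of the form `name @ url`) are passed through unchanged.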

#### Others

Download the Edge WebDriver compressed package, extract it, and place its contents in the edge_driver folder

Everyone should have Edge on their computer, right? Hopefully? This thing comes pre-installed...

The file structure should be:

    GOT-OCR-2-GUI
    └─edge_driver
       ├─msedgedriver.exe
       └─...

Download Model Files

Only one of the following models is needed to perform OCR, but automatic model loading requires the Safetensors model. Support for GGUF models is not yet complete; for now, you can try it separately in the GGUF tab.

#### Safetensors

  • Download into the models folder
  • Don’t miss any files
  • The new GOT-OCR-2-HF model is not supported yet; if you download it anyway, put it in the models-hf folder
  • The file structure should be:
    GOT-OCR-2-GUI
    └─models
       ├─config.json
       ├─generation_config.json
       ├─got_vision_b.py
       ├─model.safetensors
       ├─modeling_GOT.py
       ├─qwen.tiktoken
       ├─render_tools.py
       ├─special_tokens_map.json
       ├─tokenization_qwen.py
       └─tokenizer_config.json
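Since a missing file is a common cause of load failures, the listing above can be checked with a few lines of Python. This is a minimal sketch: the file names are copied from the structure above, while the function name `missing_model_files` is an assumption for illustration.

```python
from pathlib import Path

# File names copied from the expected structure of the models folder.
REQUIRED_MODEL_FILES = [
    "config.json", "generation_config.json", "got_vision_b.py",
    "model.safetensors", "modeling_GOT.py", "qwen.tiktoken",
    "render_tools.py", "special_tokens_map.json",
    "tokenization_qwen.py", "tokenizer_config.json",
]


def missing_model_files(models_dir: Path) -> list[str]:
    """Return the required file names that are not present in models_dir."""
    return [n for n in REQUIRED_MODEL_FILES if not (models_dir / n).is_file()]
```

For example, `missing_model_files(Path("models"))` returns an empty list when the download is complete.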

#### GGUF

GGUF models are supported by got.cpp. Go to the MosRat/got.cpp repository to download the models, save Encode.onnx as gguf\Encoder.onnx, and place the remaining Decoder GGUF models in gguf\decoders.
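The GGUF layout can be verified in the same spirit. The sketch below assumes that the decoder files carry a `.gguf` extension, which this README does not state explicitly; the function name `check_gguf_layout` is also hypothetical.

```python
from pathlib import Path


def check_gguf_layout(gguf_dir: Path) -> list[str]:
    """Return a list of problems found in the gguf folder (empty if none)."""
    problems = []
    if not (gguf_dir / "Encoder.onnx").is_file():
        problems.append("missing Encoder.onnx")
    decoders = gguf_dir / "decoders"
    # Assumption: decoder models use the .gguf file extension.
    if not decoders.is_dir() or not any(decoders.glob("*.gguf")):
        problems.append("no .gguf files in decoders")
    return problems
```

An empty result from `check_gguf_layout(Path("gguf"))` means both the encoder and at least one decoder are in place.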

Getting Started

> GUI users can ignore this, but CLI users should remember to put the images they want to OCR into the imgs folder (the CLI currently only detects .jpg and .png images)
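To preview which files in the imgs folder would qualify, a filter like this can be used. It is a sketch and not the project's actual detection code: the name `find_images` is hypothetical, and matching extensions case-insensitively is an assumption that the CLI may not share.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".jpg", ".png"}


def find_images(imgs_dir: Path) -> list[Path]:
    """Return .jpg and .png files in imgs_dir, sorted by path."""
    return sorted(
        p for p in imgs_dir.iterdir()
        # Assumption: extensions are compared case-insensitively.
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Files with other extensions, such as .jpeg, are skipped here, so renaming or converting them first avoids surprises.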

Localization Support

Notes

> If you accidentally delete it, a backup can be found in the scripts folder; just copy one over

FAQ

---
  • Q: What is an "HTML local file"? Are there HTML files not saved locally?
  • A: The HTML files that the model outputs are saved locally, but they load external scripts, so a network connection is still needed to open them. Therefore, I downloaded the external scripts locally, namely the markdown-it.js mentioned earlier. This is mainly to prevent PDF export failures caused by network issues.
---
  • Q: Why did my model fail to load?
  • A: Check whether any files are missing. The model files downloaded from Baidu Cloud seem to be incomplete; I recommend downloading them from Hugging Face, as mentioned earlier.
---
  • Q: Do you have any suggestions for deploying this project?
  • A: See this issue #5
---
