DeFi Daily News
Friday, June 12, 2026

DeepSeek OCR: AI Doesn’t Just Read Texts, It “Sees” Them

by MetaversePlanet
October 21, 2025
in Metaverse

DeepSeek’s new OCR system processes text as images and compresses it by up to a factor of 10. The technology, which can analyze 33 million pages a day, lets AI models read much longer documents.

DeepSeek, a Chinese artificial intelligence company, is attracting attention with a new OCR (Optical Character Recognition) system built to process text-heavy documents more efficiently. The system compresses text rendered as images, so AI models can handle much longer documents without hitting their memory limits.

Processing Text as Visual Data

According to DeepSeek’s technical report, the system analyzes text in image form instead of processing it directly as text. This approach significantly reduces the computational load: the new OCR system can compress text by up to a factor of 10 while retaining 97% of the information.

Large language models represent text as tokens, each covering a few characters. Researchers are working to expand the context window so that models can handle long documents and conversations running to millions of tokens. However, the more tokens a model processes at once, the higher the computational cost. A large context window therefore lets a model take in long documents without running out of room, but at a higher price. DeepSeek’s OCR approach instead treats very long content as an image, effectively viewing it as pixels.

Seeing Long Texts as Pixels

The system has two main components: DeepEncoder and DeepSeek3B-MoE. DeepEncoder, which handles image processing, has 380 million parameters. DeepSeek3B-MoE, which generates the text, has 570 million active parameters. DeepEncoder combines Meta’s 80-million-parameter SAM (Segment Anything Model) with OpenAI’s 300-million-parameter CLIP model. Between them, a 16x compressor sharply reduces the image data, which increases processing speed. For example, a 1,024 × 1,024 pixel image that starts out as 4,096 tokens is reduced to only 256 tokens after compression.
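The token arithmetic in that example can be reproduced with a short sketch. It assumes the image is cut into 16 × 16 pixel patches, a value chosen here because it yields the 4,096 tokens quoted for a 1,024 × 1,024 image; the patch size itself is not stated in the article, and the function name is hypothetical.

```python
def vision_token_counts(image_side: int, patch_size: int = 16,
                        compression: int = 16) -> tuple[int, int]:
    """Return (tokens before compression, tokens after compression).

    patch_size=16 is an assumption picked to match the article's numbers;
    compression=16 is the 16x compressor the article describes.
    """
    if image_side % patch_size != 0:
        raise ValueError("image_side must be a multiple of patch_size")
    patches_per_side = image_side // patch_size
    raw_tokens = patches_per_side ** 2         # one token per patch
    return raw_tokens, raw_tokens // compression


raw, compressed = vision_token_counts(1024)
print(raw, compressed)  # 4096 256
```

Under the same assumption, a 512-pixel square input would give 64 tokens and a 1,280-pixel one would give 400. Whether those are the resolutions the system actually uses is not stated here.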

DeepSeek OCR can operate with between 64 and 400 “vision tokens,” depending on the resolution, where classic OCR systems typically need thousands of tokens for the same work. In OmniDocBench tests, the system outperformed GOT-OCR 2.0 while using only 100 vision tokens. It also outperformed MinerU 2.0, which required over 6,000 tokens, while using fewer than 800.

The system is tuned for different document types: it uses 64 tokens for simple presentations, 100 tokens for books and reports, and up to 800 tokens in a special “Gundam mode” for complex newspapers.

DeepSeek OCR can process not only text but also complex visual elements such as diagrams, chemical formulas, and geometric shapes. It also works in approximately 100 languages, can preserve formatting, and can output plain text or general visual descriptions on request.
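To make these token budgets concrete, here is a minimal sketch that compares a page’s text-token count with the vision-token budget quoted for its document type. The dictionary keys, function names, and the example page lengths are hypothetical; only the budgets (64, 100, 800) and the 10x figure come from the article.

```python
# Vision-token budgets per document type, as quoted in the article.
VISION_TOKEN_BUDGET = {
    "presentation": 64,
    "book_or_report": 100,
    "newspaper": 800,  # the article's "Gundam mode"
}


def compression_ratio(text_tokens: int, document_type: str) -> float:
    """Text tokens represented per vision token for one page."""
    return text_tokens / VISION_TOKEN_BUDGET[document_type]


def within_reported_range(text_tokens: int, document_type: str,
                          max_ratio: float = 10.0) -> bool:
    """True if the page stays at or below the 10x compression the article cites."""
    return compression_ratio(text_tokens, document_type) <= max_ratio


# Hypothetical page lengths, for illustration only.
print(compression_ratio(900, "book_or_report"))       # 9.0
print(within_reported_range(900, "book_or_report"))   # True
print(within_reported_range(1500, "book_or_report"))  # False
```

A page that falls outside the range is not necessarily unreadable. The article reports the 97% retention figure only for compression of up to 10 times.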

Processes 33 Million Pages a Day

Approximately 30 million PDF pages were used to train the system, about 25 million of them English and Chinese documents. The training data also included 10 million synthetic diagrams, 5 million chemical formulas, and 1 million geometric shapes.

In real-world use, DeepSeek OCR reaches a very high throughput. The system can process over 200,000 pages a day on a single Nvidia A100 GPU. With 20 servers, each housing eight A100 GPUs, this capacity rises to 33 million pages per day. That speed could make it much easier to produce training data for new AI models. Both the code and the model weights are publicly available (see the source link).
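The cluster figure follows from the per-GPU figure by simple multiplication, as the sketch below shows. It assumes throughput scales linearly with the number of GPUs, which the article implies but does not state; the function names are hypothetical.

```python
def cluster_pages_per_day(servers: int, gpus_per_server: int,
                          pages_per_gpu: int) -> int:
    """Daily page throughput, assuming perfectly linear scaling across GPUs."""
    return servers * gpus_per_server * pages_per_gpu


def implied_pages_per_gpu(total_pages: int, servers: int,
                          gpus_per_server: int) -> float:
    """Per-GPU rate needed to reach a given cluster total."""
    return total_pages / (servers * gpus_per_server)


# 20 servers x 8 GPUs at the quoted floor of 200,000 pages per GPU per day.
print(cluster_pages_per_day(20, 8, 200_000))     # 32000000
# Per-GPU rate implied by the quoted 33 million pages per day.
print(implied_pages_per_gpu(33_000_000, 20, 8))  # 206250.0
```

The 32 million lower bound and the 33 million total are consistent, since the single-GPU figure is given as “over” 200,000.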


Source link

Copyright © 2024 Defi Daily.
Defi Daily is not responsible for the content of external sites.
