Korean TrOCR model

  • A TrOCR model cannot recognize characters that are missing from its decoder's tokenizer. This TrOCR model therefore uses a decoder whose tokenizer covers Korean initial consonants (choseong), so that standalone initial consonants are also recognized instead of being output as UNK.
  • It was built with know-how gained from the 2023 Kyowon Group AI OCR Challenge.
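
To make the out-of-vocabulary problem above concrete, here is a toy sketch. It is not this model's tokenizer: `encode_chars`, `UNK`, and both vocabularies are hypothetical and only illustrate how a character that is absent from a vocabulary collapses to an unknown token, while adding standalone initial consonants avoids that.

```python
# Toy illustration only: a character-level lookup with a hypothetical vocabulary.
# Real tokenizers (including this model's) are more elaborate.
UNK = "[UNK]"


def encode_chars(text: str, vocab: set[str]) -> list[str]:
    """Keep each character that is in the vocabulary, replace the rest with UNK."""
    return [ch if ch in vocab else UNK for ch in text]


# Hypothetical vocabulary that contains only complete syllables.
syllable_only_vocab = {"안", "녕", "하", "세", "요"}
# The same vocabulary extended with a few standalone initial consonants.
with_choseong_vocab = syllable_only_vocab | {"ㅋ", "ㅎ", "ㅇ"}

text = "안녕ㅋㅋ"
syllable_only_tokens = encode_chars(text, syllable_only_vocab)
with_choseong_tokens = encode_chars(text, with_choseong_vocab)

print(syllable_only_tokens)  # the two standalone consonants become UNK
print(with_choseong_tokens)  # every character is preserved
```

With the syllable-only vocabulary the trailing consonants are lost, which is the failure mode the decoder choice described above is meant to avoid.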

train datasets

AI Hub

model structure

how to use

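from io import BytesIO
import unicodedata

import requests
from PIL import Image
from transformers import AutoTokenizer, TrOCRProcessor, VisionEncoderDecoderModel

# Load the image processor, the encoder-decoder model, and the decoder's tokenizer.
processor = TrOCRProcessor.from_pretrained("ddobokki/ko-trocr")
model = VisionEncoderDecoderModel.from_pretrained("ddobokki/ko-trocr")
tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

# Download the example image and fail early on an HTTP error.
url = "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg"
response = requests.get(url, timeout=30)
response.raise_for_status()
# Convert to RGB so grayscale or palette images are also handled.
img = Image.open(BytesIO(response.content)).convert("RGB")

# Preprocess the image, generate token ids, and decode them to text.
pixel_values = processor(img, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_length=64)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
# Normalize to NFC so any decomposed jamo are composed into Hangul syllables.
generated_text = unicodedata.normalize("NFC", generated_text)
print(generated_text)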
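
The final `unicodedata.normalize("NFC", ...)` call in the snippet above is easy to overlook. The minimal sketch below uses only the standard library to show what NFC does to Hangul. Whether this model's tokenizer actually emits decomposed jamo is an assumption inferred from that call, not something the model card states.

```python
import unicodedata

# Three conjoining jamo (initial, medial, final) written as escapes
# so the example does not depend on how this file itself is normalized.
decomposed = "\u1112\u1161\u11ab"

# NFC composes the jamo sequence into one precomposed Hangul syllable (U+D55C).
recomposed = unicodedata.normalize("NFC", decomposed)

# A standalone compatibility jamo (U+314B) has nothing to compose with,
# so NFC leaves it unchanged.
standalone = unicodedata.normalize("NFC", "\u314b")

print(len(decomposed), len(recomposed))
print(recomposed == "\ud55c", standalone == "\u314b")
```

Under that assumption, skipping the normalization step would leave text that looks similar on screen but fails exact string comparison against precomposed ground-truth labels.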