jakep-allenai committed · Commit 0c2fd16 · verified · 1 Parent(s): 924afd2

Update README.md

Files changed (1)
  1. README.md +24 -2
README.md CHANGED
@@ -22,12 +22,34 @@ Quick links:

  The best way to use this model is via the [olmOCR toolkit](https://github.com/allenai/olmocr).

- ## Prompting
+ ## Usage

  This model expects as input a single document image, rendered such that the longest dimension is 1024 pixels.

  The prompt must then contain the additional metadata from the document, and the easiest way to generate this
- prompt is via the [olmOCR toolkit](https://github.com/allenai/olmocr).
+
+
+ ## Manual Prompting
+
+ ```python
+ image_base64 = [base64 image of PDF rendered down to 1024 px on longest edge]
+
+ "messages": [
+     {
+         "role": "user",
+         "content": [
+             {"type": "text", "text": "Below is the image of one page of a document, as well as some raw textual content that was previously extracted for it. Just return the plain text representation of this document as if you were reading it naturally.
+ Do not hallucinate.
+ RAW_TEXT_START
+ Page dimensions: 1836.8x2267.2
+ [Image 0x0 to 1837x2267]
+
+ RAW_TEXT_END"},
+             {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_base64}"}},
+         ],
+     }
+ ],
+ ```

  ## License and use

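The Manual Prompting block added in this commit is pseudocode: `image_base64` is a placeholder and `"messages": [...]` is a fragment rather than valid Python. The following is a minimal runnable sketch of the same structure, using only the standard library. The helper name `build_messages` is hypothetical, the PNG bytes are a stand-in for a real page image (assumed to be already rendered so its longest edge is 1024 px), and the rest of the request (model name, endpoint) is not specified by the README and is left out.

```python
import base64
import json

# Instruction text copied from the Manual Prompting example in the diff.
INSTRUCTION = (
    "Below is the image of one page of a document, as well as some raw textual "
    "content that was previously extracted for it. Just return the plain text "
    "representation of this document as if you were reading it naturally.\n"
    "Do not hallucinate."
)


def build_messages(png_bytes: bytes, raw_text: str) -> list:
    """Hypothetical helper: assemble the chat messages for one page.

    png_bytes is assumed to be a PNG already rendered so that its longest
    edge is 1024 px; raw_text is the metadata placed between the markers.
    """
    image_base64 = base64.b64encode(png_bytes).decode("ascii")
    prompt = f"{INSTRUCTION}\nRAW_TEXT_START\n{raw_text}\nRAW_TEXT_END"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_base64}"},
                },
            ],
        }
    ]


# Raw text taken from the README example (page dimensions plus one image box).
raw_text = "Page dimensions: 1836.8x2267.2\n[Image 0x0 to 1837x2267]\n"

# Placeholder bytes, not a real image; substitute the rendered page PNG.
messages = build_messages(b"\x89PNG placeholder bytes", raw_text)

# Serialized form of the "messages" field, ready to embed in a request body.
payload = json.dumps({"messages": messages})
```

Building the prompt in a function keeps the `RAW_TEXT_START` / `RAW_TEXT_END` markers in one place, so only the per-page metadata and image change between calls. For real documents, the olmOCR toolkit linked in the README remains the recommended way to generate the metadata.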