PDFToSlides in ComfyUI
Description
Purpose of This Workflow:
OCR, summarize, and generate pictures for a PDF file.
1. Use ComfyUI-Document to read English PDF files (now available in ComfyUI Manager).
2. Perform OCR and summarize the document contents, outputting text.
3. Convert the text into an image generation prompt.
4. Generate images using SD3 and combine them with the text, outputting a 1920x1080 image (which looks like a presentation slide).
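The four steps above can be sketched as a plain data flow. This is only an illustration, not the workflow's actual node graph: the function name `pdf_to_slides` and all four callables are hypothetical, and processing one slide per page is an assumption the description does not state.

```python
from typing import Callable, List


def pdf_to_slides(
    page_images: List[str],
    ocr: Callable[[str], str],
    summarize: Callable[[str], str],
    to_image_prompt: Callable[[str], str],
    render_slide: Callable[[str, str], str],
) -> List[str]:
    """Run the stages once per page and return one slide per page.

    Every callable is a stand-in (hypothetical) for a node group:
    ocr -> Florence2, summarize / to_image_prompt -> Ollama,
    render_slide -> SD3 generation plus text overlay.
    """
    slides = []
    for page in page_images:
        text = ocr(page)                   # step 2: OCR the page image
        summary = summarize(text)          # step 2: summarize the OCR text
        prompt = to_image_prompt(summary)  # step 3: summary -> image prompt
        slides.append(render_slide(prompt, summary))  # step 4: image + text
    return slides
```

Passing the stages in as callables keeps the sketch runnable with simple stubs, without any model installed.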
Summarization uses Ollama, and OCR uses Florence2. This workflow can read PDFs from arXiv, summarize them, and generate images to accompany the summaries. However, it is not suitable for scientific or academic papers, because the images produced for such papers (like those for the SD3 paper shown in another image) are randomly generated by SD3.
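For readers who want to try the summarization step outside ComfyUI, here is a minimal sketch that calls a local Ollama server over its REST API. It is not how the OllamaGenerate node is implemented. The model name `llama3`, the prompt wording, and the `max_words` parameter are assumptions made for illustration.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(text: str, model: str = "llama3", max_words: int = 60) -> dict:
    """Build the JSON body for a non-streaming generate call.

    The model name and prompt wording are illustrative assumptions.
    """
    prompt = f"Summarize the following text in at most {max_words} words:\n\n{text}"
    return {"model": model, "prompt": prompt, "stream": False}


def summarize(text: str, model: str = "llama3") -> str:
    """Send the request to Ollama and return the generated summary text."""
    body = json.dumps(build_summary_request(text, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # With "stream": False the server replies with one JSON object
        # whose "response" field holds the generated text.
        return json.loads(response.read())["response"]
```

Calling `summarize(ocr_text)` requires Ollama to be running with the chosen model already pulled.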
The summary can also be in Chinese. To do this, modify the prompt to request a Chinese summary and adjust the text layout (24 characters per line, font size 30); see the assets for an example.
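The 24-characters-per-line layout can be illustrated with a small helper. This is a sketch under the assumption that lines are cut at a fixed character count, which suits Chinese text because it has no spaces to wrap on. The function name `wrap_fixed` is hypothetical and is not part of the workflow.

```python
from typing import List


def wrap_fixed(text: str, chars_per_line: int = 24) -> List[str]:
    """Split text into lines of at most chars_per_line characters.

    Existing line breaks are kept; each paragraph is cut at a fixed
    character count instead of at word boundaries.
    """
    if chars_per_line <= 0:
        raise ValueError("chars_per_line must be positive")
    lines: List[str] = []
    for paragraph in text.splitlines():
        if not paragraph:
            lines.append("")  # preserve blank lines
            continue
        for start in range(0, len(paragraph), chars_per_line):
            lines.append(paragraph[start:start + chars_per_line])
    return lines
```

The wrapped lines can then be joined with newlines and passed to the text-image node at the chosen font size.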
If the document is a text-based PDF rather than a scanned one, ComfyUI-Document can read the PDF content directly, without OCR.
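One way to choose between the two paths is to check how much text the PDF's text layer actually yields. The heuristic below is a hypothetical sketch, not ComfyUI-Document's behavior. The function name `needs_ocr` and the threshold of 20 characters per page are assumptions.

```python
from typing import Sequence


def needs_ocr(page_texts: Sequence[str], min_chars_per_page: int = 20) -> bool:
    """Return True when the extracted text layer looks empty or too sparse.

    page_texts holds the text extracted directly from each page.
    Scanned PDFs typically yield little or no text, so a low average
    character count suggests falling back to OCR.
    """
    if not page_texts:
        return True  # nothing extracted at all
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars < min_chars_per_page * len(page_texts)
```

In the workflow itself, the equivalent choice is made by enabling or bypassing the OCR node group.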
This workflow is experimental, and the LLM's summaries may contain errors. The main goal is to test using ComfyUI-Document in conjunction with OCR.
Versions (3)
- latest (a year ago)
- v20240711-015200
- v20240711-014219
Node Details
Primitive Nodes (11)
- DocumentLoader (1)
- DownloadAndLoadFlorence2Model (1)
- EmptySD3LatentImage (1)
- Fast Groups Bypasser (rgthree) (1)
- Florence2Run (1)
- ImageConcatMulti (1)
- LayerUtility: SD3NegativeConditioning (1)
- Note (2)
- PDFToImage (1)
- TextChunker (1)
Custom Nodes (23)
ComfyUI
- CLIPTextEncode (2)
- CheckpointLoaderSimple (1)
- PreviewImage (2)
- SaveImage (1)
- ImpactSwitch (2)
- LayerUtility: ColorImage V2 (1)
- LayerUtility: SimpleTextImage (1)
- LayerStyle: DropShadow (1)
- OllamaGenerate (2)
- Gemini_API_Zho (1)
- KSampler (Efficient) (1)
- String Literal (4)
- Text List (2)
- Text List to Text (2)
Model Details
Checkpoints (1)
SD3\sd3_medium_incl_clips_t5xxlfp8.safetensors
LoRAs (0)