[Bug]: Performance Issue: Financial PDF ingestion with multiple tables takes more than 5 minutes to process #1144

@sreedevi-m-2026

OpenRAG Version

0.3.0

Deployment Method

uv add (installed in project)

Operating System

macOS 15.6

Python Version

3.13

Affected Area

Ingestion (document processing, upload, Docling)

Bug Description

Financial PDF documents containing multiple tables (FinanceBench dataset) take approximately 5 minutes to complete ingestion in OpenRAG.

Ingestion eventually completes successfully, but it takes a long time, and the UI shows no progress indicator while the document is being processed.

This may affect usability when ingesting complex financial reports.

Steps to Reproduce

  1. Open Knowledge → Add Knowledge
  2. Upload a financial PDF from the FinanceBench dataset
  3. Start ingestion
  4. Observe the ingestion duration
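
To put a number on step 4 outside the UI, the same PDF can be timed through Docling directly. This is a minimal sketch, not OpenRAG's actual ingestion path: it assumes `docling` is installed in the same environment, and the helper names `timed` and `convert_with_docling` are hypothetical. Comparing a run with table structure recognition enabled against one with it disabled may indicate whether the tables account for most of the time. The first run can also include model loading, so it is worth repeating the measurement.

```python
import sys
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed(label: str, fn: Callable[[], T]) -> tuple[T, float]:
    """Run fn once, print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f}s")
    return result, elapsed


def convert_with_docling(pdf_path: str, table_structure: bool = True):
    """Convert one PDF with Docling directly, bypassing OpenRAG.

    Assumption: docling is importable here. The import is local so the
    timing helper above can be used without docling installed.
    """
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_table_structure = table_structure  # toggle table recognition
    converter = DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
    return converter.convert(pdf_path)


if __name__ == "__main__" and len(sys.argv) > 1:
    pdf = sys.argv[1]  # path to a PDF extracted from the attached archive
    timed("with table structure", lambda: convert_with_docling(pdf, True))
    timed("without table structure", lambda: convert_with_docling(pdf, False))
```

If the two timings are close, the delay is more likely elsewhere in the pipeline (for example OCR, chunking, or embedding) than in table extraction.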

Expected Behavior

Large financial documents should ingest within a reasonable time frame, or the UI should display progress indicators for long-running processing tasks.

Actual Behavior

File ingestion completes successfully but takes approximately 5 minutes.

Relevant Logs

Screenshots

Image

financial doc.zip

Additional Context

No response

Checklist

  • I have searched existing issues to ensure this bug hasn't been reported before.
  • I have provided all the requested information.

Metadata

Assignees

No one assigned

Labels

bug 🔴 Something isn't working.
