refactor: make office docs optional and reduce core dependency footprint #255
Merged
mlikasam-askui merged 4 commits into main on Apr 3, 2026
- Move MarkItDown to new `office_document` extra and lazy-load in markdown conversion
- Remove bson usage; generate time-ordered IDs via `time_ns` + UUID fragment
- Promote `pure-python-adb` to default deps; replace `android` extra in `all`
- Relax Python constraint to `>=3.10` and align setup/readme docs
- Remove obsolete mypy ignore for `bson`
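For reviewers unfamiliar with the `time_ns` + UUID approach, here is a minimal sketch of how such an ID could be built. The field widths, the hex encoding, and the function body are assumptions for illustration and may not match the code merged in this PR.

```python
import time
import uuid


def generate_time_ordered_id(prefix: str) -> str:
    """Return an ID of the form `<prefix>_<timestamp><random>` (illustrative layout)."""
    # Nanoseconds since the epoch as zero-padded hex. The fixed width means
    # plain string comparison follows the timestamp order.
    timestamp = f"{time.time_ns():016x}"
    # A short random fragment keeps IDs distinct when two calls share a timestamp.
    # Assumption: 8 hex characters (32 random bits) is enough for this use case.
    suffix = uuid.uuid4().hex[:8]
    return f"{prefix}_{timestamp}{suffix}"
```

IDs created at different clock readings sort chronologically as strings. IDs created within the same clock tick are ordered only by their random suffix, so ordering is approximate at that granularity.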
| str: Time-ordered ID string |
| """ |
| return f"{prefix}_{str(bson.ObjectId())}" |
Contributor
I do not know what bson was doing and what effects removing it has. Out of curiosity: can you maybe explain?
Contributor
Author
It was the reason why the SDK was not compatible with Python 3.14 and later, but now the imagehash library is the new issue causing incompatibility.
Collaborator
FYI: bson.objectid is used for MongoDB
Contributor
Author
which we don't use anymore
programminx-askui
approved these changes
Apr 2, 2026
Summary
This PR reduces the default installation footprint and clarifies optional dependency usage.
Dependency changes
- Removed `markitdown` from core dependencies and introduced optional extra: `office_document`
- Updated `all` extra to include `office-document` (and no longer depend on the removed `android` extra)
- Moved `pure-python-adb` into default dependencies
- Removed `bson` from core dependencies

Runtime/code changes
- `generate_time_ordered_id()` no longer uses `bson.ObjectId`; now builds IDs from `time.time_ns()` + UUID suffix
- `convert_to_markdown()` now imports `markitdown` lazily and raises a clear install hint: `pip install "askui[office-document]"`

Documentation/config updates
- README documents the default install (`pip install askui`) and explains optional extras
- `docs/10_extracting_data.md` explicitly notes Excel/Word (`OfficeDocumentSource`) requires `office-document`
- `docs/01_setup.md` updated Python requirement text
- `pyproject.toml` / `pdm.lock` synchronized with new extras + deps
- Removed `mypy` ignore section for `bson`

Why
This keeps the base package lighter, avoids forcing Office-conversion dependencies on all users, and makes Office document support explicit and discoverable.
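The lazy-import pattern described under "Runtime/code changes" can be sketched roughly as follows. The signature and body of `convert_to_markdown()` shown here are hypothetical and only illustrate the technique; the `MarkItDown().convert(...).text_content` call is the one part taken from the `markitdown` package itself.

```python
def convert_to_markdown(source: str) -> str:
    """Convert a document to Markdown, importing markitdown only when needed."""
    try:
        # Deferred import: users without the optional extra can still import
        # the rest of the package; the cost is paid only on first use.
        from markitdown import MarkItDown
    except ImportError as exc:
        # Re-raise with an actionable hint instead of a bare missing-module error.
        raise ImportError(
            "Office document conversion requires an optional dependency. "
            'Install it with: pip install "askui[office-document]"'
        ) from exc
    return MarkItDown().convert(source).text_content
```

With this shape, importing the module never fails on a base install; the error appears only when a caller actually requests Office conversion, and it names the extra to install.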