A high-performance Python service designed to extract and sanitize sensitive metadata from your files. Stop leaking your GPS location, device serial numbers, and personal info before sharing documents or photos.
- Metadata Extraction: View all hidden technical tags before they are wiped.
- Deep Sanitization: Completely strips EXIF, XMP, and other metadata from images, PDFs, and Word documents.
- Privacy-First: Files are processed in memory and never stored on the server.
- Modern Stack: Built with FastAPI and powered by the ultra-fast uv package manager.
- Framework: FastAPI (Asynchronous API)
- Package Manager: uv
- Libraries: Pillow (Images), PyMuPDF (PDF), python-docx (Word)
- Containerization: Docker
- Build the image: docker build -t purefile-app .
- Run the container: docker run -d -p 8000:8000 --name purefile-container purefile-app
- Install dependencies: uv pip install -r pyproject.toml
- Run the application: python run.py
Once the service is running, open your browser at: http://localhost:8000/docs
You will see the interactive Swagger UI where you can upload a file and get a "purified" version back instantly.
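If you would rather script the upload than use the browser, something like the following standard-library sketch could work. Note that the `/purify` path and the `file` form-field name are placeholders assumed for illustration, not confirmed routes; check the Swagger UI for the actual endpoint and parameter names before using it.

```python
import os
import urllib.request
import uuid


def build_multipart(field, filename, payload,
                    content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def purify(path, url="http://localhost:8000/purify"):
    """POST a local file and return the response body as bytes.

    ASSUMPTION: the "/purify" route and the "file" field name are
    hypothetical; replace them with the values shown at /docs.
    """
    with open(path, "rb") as fh:
        body, ctype = build_multipart("file", os.path.basename(path), fh.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": ctype}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

With the service running, a call such as `open("clean.jpg", "wb").write(purify("photo.jpg"))` would save the returned file, provided the placeholder route and field name are adjusted to match the real API.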
- Images: .jpg, .jpeg, .png, .webp
- PDF: .pdf
- Documents: .docx
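As a rough illustration of how the extensions above could be checked before uploading, here is a small helper. The `SUPPORTED` table and the `classify` function are hypothetical names for this sketch and are not taken from the PureFile code base.

```python
from pathlib import Path
from typing import Optional

# Extension -> category, mirroring the supported-formats list above.
SUPPORTED = {
    ".jpg": "image",
    ".jpeg": "image",
    ".png": "image",
    ".webp": "image",
    ".pdf": "pdf",
    ".docx": "document",
}


def classify(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if it is unsupported."""
    return SUPPORTED.get(Path(filename).suffix.lower())
```

The lookup is case-insensitive, so `Holiday.JPG` is treated the same as `holiday.jpg`.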
PureFile acts as a digital filter. It parses the binary structure of your file, identifies metadata segments (like EXIF in photos or Author properties in DOCX), and re-saves the file while intentionally omitting these segments. The result is a visually identical file with a clean "digital history".
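To make the "parse, identify, omit" idea concrete, here is a minimal standard-library sketch for JPEG files only. It is not PureFile's actual implementation (which, per the stack list, relies on Pillow, PyMuPDF, and python-docx); the function name and the choice of which markers count as metadata are assumptions made for this example. It also assumes a well-formed file with no padding bytes between segments.

```python
import struct

# Segment markers treated as metadata in this sketch (an assumption):
# 0xE1 = APP1 (EXIF / XMP), 0xED = APP13 (IPTC), 0xFE = COM (comment).
METADATA_MARKERS = {0xE1, 0xED, 0xFE}
SOI = b"\xff\xd8"   # Start of Image
SOS = 0xDA          # Start of Scan: compressed pixel data follows


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte stream without metadata segments."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it untouched.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]  # keep structural segments (JFIF, tables, ...)
        pos = end
    raise ValueError("no SOS marker found")
```

Because the compressed image data is copied byte for byte, the pixels are unchanged; only the skipped segments disappear. PDF and DOCX files have different container formats and would need their own handling.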