universal-development/zero-clone

zero-clone

Portable Bash CLI to automatically sync from remote servers to local directories using rclone, with a simple convention-based layout per project.

Directory convention (per base directory)

  • clone (or the name set by --clone-dir / ZERO_CLONE_DIR): synchronized files go here
  • .zero-clone/rclone.conf: rclone configuration used for that base
  • .zero-clone/list.txt: sources to sync, one per line
  • .zero-clone/env.sh: optional environment overrides (e.g., JOBS, RCLONE_OPTS, CLONE_DIR)
  • .zero-clone/logs/: per-job rclone logs
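
The layout above can be created by hand or with a few lines of shell. The helper below is a minimal sketch; `scaffold_base` is a hypothetical name and is not part of zero-clone. It assumes empty placeholder files are an acceptable starting point, and it never overwrites existing files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with zero-clone): create the per-base layout.
# Usage: scaffold_base BASE_DIR [CLONE_DIR_NAME]
scaffold_base() {
  local base="$1"
  local clone_dir="${2:-clone}"   # "clone" is the documented default name
  mkdir -p "$base/$clone_dir" "$base/.zero-clone/logs"
  # Create empty placeholders only if the files do not exist yet.
  [ -e "$base/.zero-clone/rclone.conf" ] || : > "$base/.zero-clone/rclone.conf"
  [ -e "$base/.zero-clone/list.txt" ] || : > "$base/.zero-clone/list.txt"
}
```

For example, `scaffold_base ./my-project` prepares ./my-project/clone and ./my-project/.zero-clone; the rclone.conf and list.txt placeholders still have to be filled in.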

Project-level (current working directory)

  • init.sh: optional, sourced before CLI argument processing; use it to set ZERO_CLONE_DIR and other global defaults
  • zero-clone.txt: optional list of base directories to process (used when no --from-file is given)

Quick start

  • Install rclone and ensure it’s in PATH.
  • Create a base directory and add the structure above.
  • Put sync sources in .zero-clone/list.txt (format below).
  • Run: bash bin/zero-clone (or add bin/ to PATH and run zero-clone).

CLI usage

  • zero-clone [options] [PATH ...]
  • Options:
    • -y, --yes: skip confirmation prompt
    • -j, --jobs N: default number of parallel jobs per base, used when env.sh doesn’t set JOBS
    • --from-file FILE: file listing base directories to process (falls back to zero-clone.txt if present)
    • --clone-dir NAME: directory name to use as destination root within each base (default: clone, or ZERO_CLONE_DIR if set). For example, --clone-dir data syncs into <base>/data/<dest> instead of <base>/clone/<dest>.
    • --dest DIR: override destination root for all bases (data lake mode); syncs to DIR/<dest> instead of <base>/<clone-dir>/<dest>
    • --dry-run: pass --dry-run to rclone
    • --no-progress: hide rclone progress
    • --version, -h/--help
  • PATH arguments: search roots that are scanned recursively for .zero-clone directories. Defaults to current directory when not using --from-file.

Discovery of bases (.zero-clone)

  • If --from-file provided: read base directories from it (one per line, # comments allowed).
  • Else if zero-clone.txt exists in current working directory: read from it.
  • Else: recursively find .zero-clone directories under provided PATH(s) (or .) and use their parents as bases.
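
The three-step fallback above can be sketched as a single function. This is an illustration of the documented order, not the script's actual code; `discover_bases` is a hypothetical name, and the sketch assumes comments in list files occupy whole lines.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of base discovery (not the real implementation).
# Usage: discover_bases FROM_FILE [PATH ...]   (pass "" when --from-file is absent)
discover_bases() {
  local from_file="$1"; shift
  local list=""
  if [ -n "$from_file" ]; then
    list="$from_file"                 # 1. explicit --from-file
  elif [ -f "zero-clone.txt" ]; then
    list="zero-clone.txt"             # 2. zero-clone.txt in the working directory
  fi
  if [ -n "$list" ]; then
    # One base per line; drop blank lines and whole-line "#" comments.
    grep -v -E '^[[:space:]]*(#|$)' "$list" || true
  else
    # 3. scan the search roots (default ".") and print each parent directory.
    [ "$#" -gt 0 ] || set -- .
    find "$@" -type d -name .zero-clone -exec dirname {} \; | sort
  fi
}
```

Calling `discover_bases "" ~/projects` prints one base directory per line for every .zero-clone directory found under ~/projects, provided no zero-clone.txt exists in the working directory.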

list.txt format

  • One job per non-empty, non-comment line: SRC [DEST]
    • SRC: rclone source (e.g., remote:path/to/data or a URL supported by rclone)
    • DEST (optional): relative path under the clone directory. If omitted, it is derived from basename(SRC path).
  • Examples:
    • myremote:projects/repo repos/repo → syncs to <clone-dir>/repos/repo
    • myremote:datasets/cats → syncs to <clone-dir>/cats
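
To make the DEST derivation concrete, here is a small sketch of how one list.txt line could be split. `parse_job_line` is a hypothetical helper, not a zero-clone function; it assumes SRC contains no whitespace and that the basename is taken from the part after the first colon.

```shell
#!/usr/bin/env bash
# Hypothetical parser for one list.txt line (illustration only).
# Prints "SRC<TAB>DEST", or nothing for blank and comment lines.
parse_job_line() {
  local line="$1" src dest path
  local skip_re='^[[:space:]]*(#|$)'
  if [[ "$line" =~ $skip_re ]]; then
    return 0                      # blank line or "#" comment: no job
  fi
  read -r src dest <<< "$line"    # first word is SRC, the rest is DEST
  if [ -z "$dest" ]; then
    path="${src#*:}"              # drop the "remote:" prefix
    path="${path%/}"              # drop one trailing slash, if present
    dest="${path##*/}"            # basename of the remaining path
  fi
  printf '%s\t%s\n' "$src" "$dest"
}
```

With the two example lines above, the helper yields DEST values of repos/repo and cats respectively.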

env.sh (optional, per base)

  • Sourced before running jobs for the base; you may export:
    • JOBS: number of parallel rclone sync processes (default: the --jobs CLI value if given, otherwise 2)
    • RCLONE_OPTS: extra flags passed to rclone (e.g., "--checksum --transfers 8")
    • CLONE_DIR: override destination root for this base only (e.g., /data/shared-lake)
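
A per-base env.sh might look like the following. The values are illustrative; only the variable names come from the list above.

```shell
# <base>/.zero-clone/env.sh (illustrative values)
export JOBS=4                                   # run up to four rclone syncs in parallel
export RCLONE_OPTS="--checksum --transfers 8"   # extra flags passed to rclone
# export CLONE_DIR=/data/shared-lake            # uncomment to redirect this base only
```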

init.sh (optional, project-wide)

  • If a file named init.sh exists in the current working directory it is sourced automatically before CLI argument processing.
  • Use it to set project-wide defaults, most commonly ZERO_CLONE_DIR:
    # init.sh
    export ZERO_CLONE_DIR=data
  • ZERO_CLONE_DIR: sets the default clone directory name for all bases (equivalent to --clone-dir). Can also be set directly in the shell environment.

Logs

  • Per-job logs are written to <base>/.zero-clone/logs/<timestamp>_<dest>_<src>.log.
  • On failures, the script exits non-zero and points to the relevant log files.
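
When a run fails, the most recent log is usually the first one to inspect. The helper below is a convenience sketch and not part of zero-clone; `latest_log` is a hypothetical name, and it picks the newest file by modification time so it does not depend on the exact timestamp format in the file name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest *.log file for a base.
# Usage: latest_log BASE_DIR
latest_log() {
  local dir="$1/.zero-clone/logs" newest="" f
  for f in "$dir"/*.log; do
    [ -e "$f" ] || continue       # the glob matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"                 # keep the file with the latest mtime
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For example, `less "$(latest_log ./my-project)"` opens the latest job log for that base. The function returns a non-zero status when no log exists.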

Data Lake mode

  • Use --dest /path/to/lake to direct all syncs into a single directory.
  • Or set CLONE_DIR=/path/to/lake in a base's env.sh for per-base override.
  • Priority: --dest flag > CLONE_DIR in env.sh > --clone-dir CLI > ZERO_CLONE_DIR env / init.sh > default clone.
  • Put all your remotes in one rclone.conf and list sources in one list.txt:
    server-a:data/users     users
    server-b:data/products  products
    s3:bucket/analytics     analytics
    
    Then run: zero-clone --dest /data/lake .
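
The priority order above can be expressed as one function. This is a sketch of the documented precedence, not the script's own code; `resolve_dest_root` and its argument order are hypothetical, and it assumes --dest and CLONE_DIR values are used as given while clone directory names are resolved relative to the base.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of destination-root precedence (illustration only).
# Usage: resolve_dest_root BASE DEST_FLAG ENV_CLONE_DIR CLONE_DIR_FLAG
# Pass "" for any option that was not given.
resolve_dest_root() {
  local base="$1" dest_flag="$2" env_clone_dir="$3" clone_dir_flag="$4"
  if [ -n "$dest_flag" ]; then
    printf '%s\n' "$dest_flag"          # 1. --dest wins over everything
  elif [ -n "$env_clone_dir" ]; then
    printf '%s\n' "$env_clone_dir"      # 2. CLONE_DIR from the base's env.sh
  else
    # 3. --clone-dir, then 4. ZERO_CLONE_DIR, then 5. the default "clone"
    printf '%s/%s\n' "$base" "${clone_dir_flag:-${ZERO_CLONE_DIR:-clone}}"
  fi
}
```

With nothing set, `resolve_dest_root /srv/proj "" "" ""` prints /srv/proj/clone; passing /data/lake as the second argument prints /data/lake regardless of the other values.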

Examples

  • examples/sample-project/: minimal layout with placeholder files and an example init.sh.
  • examples/local-to-local/: end-to-end local-to-local sync with a runnable run.sh.
  • examples/data-lake/: data lake pattern with multiple sources syncing into one shared directory.

Testing

  • Run all tests: bash test/run.sh
  • Requires rclone to be installed from your OS package manager and available in PATH.
  • Tests use local filesystem paths (no network) by copying examples/sample-project to a temp dir, generating local sources, and verifying:
    • discovery, confirmation bypass, logging, and per-base concurrency wiring
    • creation of per-job logs in <base>/.zero-clone/logs/
    • local file sync results under <base>/clone/ (default), a custom dir via --clone-dir, and a data lake dir via --dest
    • that the ZERO_CLONE_DIR environment variable overrides the default clone directory name
    • that init.sh in the working directory is sourced and its ZERO_CLONE_DIR export is respected

Notes

  • rclone is executed with --config <base>/.zero-clone/rclone.conf so each base can have isolated configs, remotes, and keys.
  • The script groups jobs per base and runs at most JOBS of them concurrently within each base.
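
As an illustration of the per-base --config isolation, the sketch below prints the kind of command line one job could translate to. The exact arguments zero-clone passes are an assumption here; `print_rclone_cmd` is a hypothetical helper, and only `rclone sync`, `--config`, and RCLONE_OPTS are taken from this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) an rclone command for one job.
# Usage: print_rclone_cmd BASE SRC DEST_PATH
print_rclone_cmd() {
  local base="$1" src="$2" dest_path="$3"
  local -a cmd=(rclone sync "$src" "$dest_path"
                --config "$base/.zero-clone/rclone.conf")
  # Split RCLONE_OPTS on whitespace into separate arguments.
  local -a extra=()
  read -r -a extra <<< "${RCLONE_OPTS:-}"
  if [ "${#extra[@]}" -gt 0 ]; then
    cmd+=("${extra[@]}")
  fi
  printf '%s\n' "${cmd[*]}"
}
```

Because the config path is derived from the base, two bases can define a remote with the same name without interfering with each other.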

License

  • MIT License. See LICENSE file for details.
