MediumRepository SystemsPython 3

Repository Hash Tree

Build a deterministic repository hash tree that updates parent directories after file writes and deletes.

45m2 sample tests4 hidden tests

Implement RepoHashTree, an in-memory hash tree for repository snapshots.

Requirements

  • set_file(path, content) stores or replaces one file.
  • delete_path(path) removes a file or directory and returns whether anything was deleted.
  • root_hash() returns a deterministic hash for the whole tree.
  • File order must not change the root hash.
  • Directory hashes must depend on sorted child names, child types, and child hashes.
  • Reject empty paths, .. traversal, and file/directory conflicts.

Example

python
1left = RepoHashTree() 2left.set_file("src/app.py", b"print('hi')") 3left.set_file("README.md", b"hello") 4 5right = RepoHashTree() 6right.set_file("README.md", b"hello") 7right.set_file("src/app.py", b"print('hi')") 8 9assert left.root_hash() == right.root_hash()

Constraints

  • Keep the tree in memory.
  • Use a deterministic stable hash. Avoid process-randomized hashes.
  • Treat paths as POSIX-style / paths after normalization.

Editor