Backup
birdclaw can write the canonical SQLite store as deterministic JSONL shards that Git can diff and merge. The backup repo is the long-lived, version-controlled record; the SQLite file is just a fast local index built from it.
# Layout
manifest.json
data/accounts.jsonl
data/profiles.jsonl
data/tweets/YYYY.jsonl
data/tweets/unknown.jsonl
data/collections/likes.jsonl
data/collections/bookmarks.jsonl
data/dms/conversations.jsonl
data/dms/YYYY.jsonl
data/moderation/blocks.jsonl
data/moderation/mutes.jsonl
Design rules:
- tweets are sharded by year for human browsing, partial loads, and yearly analysis
- DMs are sharded by year with `conversation_id` in each row, so Git stays fast while preserving conversation membership
- collection-only tweets with unknown timestamps go to `data/tweets/unknown.jsonl` instead of pretending they belong to 1970
- likes and bookmarks are stored as collection edges and mirrored into the timeline rows so existing queries keep working
- profiles include bio plus follower/following counts so the snapshot is meaningful on its own
- no SQLite WAL/SHM, FTS shadow tables, or transient live cache rows ever land in the backup
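
The sharding and determinism rules above can be illustrated with a minimal Python sketch. It is not birdclaw's implementation: the row fields (`id`, `created_at`, `text`), the helper names `shard_path` and `serialize_shards`, and the choice of sorted keys with compact separators are all assumptions made for illustration. It only shows how year-based routing, an `unknown.jsonl` fallback, and stable ordering could combine to produce diff-friendly output.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def shard_path(created_at):
    """Map an ISO 8601 timestamp (or None) to a tweet shard path.

    Hypothetical helper: rows without a timestamp go to unknown.jsonl
    rather than being filed under 1970.
    """
    if not created_at:
        return "data/tweets/unknown.jsonl"
    parsed = datetime.fromisoformat(created_at)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    year = parsed.astimezone(timezone.utc).year
    return f"data/tweets/{year}.jsonl"


def serialize_shards(rows):
    """Group rows by shard and return {path: JSONL text}.

    Rows are sorted by id and keys are sorted within each row, so the
    same input set always yields byte-identical output.
    """
    shards = defaultdict(list)
    for row in rows:
        shards[shard_path(row.get("created_at"))].append(row)

    out = {}
    for path in sorted(shards):
        ordered = sorted(shards[path], key=lambda r: r["id"])
        lines = [
            json.dumps(r, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
            for r in ordered
        ]
        out[path] = "\n".join(lines) + "\n"
    return out
```

Because the output depends only on the set of rows and not on their input order, re-exporting an unchanged store would leave the Git working tree clean under these assumptions.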
The manifest pins per-shard byte counts, row counts, and SHA hashes. Validation walks every shard and checks that its size, row count, and hash match the values recorded in the manifest.
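
As a rough sketch of what such a validation pass might look like, the Python below recomputes each shard's size, row count, and hash and compares them to the manifest. The manifest schema used here (a top-level `shards` list whose entries have `path`, `bytes`, `rows`, and `sha256`) is an assumption, as is the use of SHA-256 specifically; the actual `manifest.json` layout may differ.

```python
import hashlib
import json
from pathlib import Path


def validate_backup(root):
    """Return a list of mismatch messages; an empty list means consistent.

    Assumes a hypothetical manifest shape:
    {"shards": [{"path": ..., "bytes": ..., "rows": ..., "sha256": ...}]}
    """
    root = Path(root)
    manifest = json.loads((root / "manifest.json").read_text(encoding="utf-8"))
    problems = []

    for entry in manifest["shards"]:
        rel = entry["path"]
        shard = root / rel
        if not shard.is_file():
            problems.append(f"{rel}: missing")
            continue

        data = shard.read_bytes()
        if len(data) != entry["bytes"]:
            problems.append(f"{rel}: expected {entry['bytes']} bytes, found {len(data)}")

        # One JSON object per non-empty line.
        rows = sum(1 for line in data.splitlines() if line.strip())
        if rows != entry["rows"]:
            problems.append(f"{rel}: expected {entry['rows']} rows, found {rows}")

        digest = hashlib.sha256(data).hexdigest()
        if digest != entry["sha256"]:
            problems.append(f"{rel}: hash mismatch")

    return problems
```

Collecting every mismatch instead of stopping at the first one makes a single run enough to see the full extent of any drift between the manifest and the shards.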
# backup export
Writes text shards to a local directory and validates the manifest by default.