Archive Import

birdclaw import archive parses a Twitter/X archive ZIP and writes everything into the canonical SQLite tables: tweets, likes, bookmarks, profiles, DMs, and (when present) blocklists.

It is idempotent. Re-running on the same archive replays the import without producing duplicates, so you can import, then re-import after a fresh archive download to top up.
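The property is easiest to see when every row is keyed by a stable ID. The sketch below is only an illustration, not birdclaw's code: it uses a tab-separated file in place of SQLite, and `import_rows` is a hypothetical helper. It assumes the first row seen for an ID wins, much as a primary key with `INSERT OR IGNORE` would behave; the real import may update rows instead.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of an idempotent import, using a TSV file in place
# of SQLite. Rows are "id<TAB>payload"; the first row seen for an id is kept.
import_rows() {
  local store="$1" incoming="$2"
  # Start from an empty store on the first run.
  [ -f "$store" ] || : > "$store"
  # Existing rows come first, so replayed ids are dropped and new ids appended.
  cat "$store" "$incoming" | awk -F'\t' '!seen[$1]++' > "$store.tmp"
  mv "$store.tmp" "$store"
}
```

Running `import_rows` twice with the same input leaves the store unchanged, and running it with a newer input appends only the rows whose IDs were not already present.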

#Get an archive

Twitter / X provides account archives at https://x.com/settings/your_archive. After you request one, it takes about 24 hours to prepare; you then receive a download link by email.

Save the ZIP somewhere autodiscovery can find it (~/Downloads is fastest), or pass an explicit path.

#Autodiscovery

On macOS, archives are autodiscovered via Spotlight (mdfind) plus name heuristics borrowed from Sweetistics:

birdclaw archive find

This searches ~/Downloads first, then runs an mdfind pass under $HOME for files matching twitter-*.zip, x-*.zip, and *archive*.zip.

The result lists every plausible candidate so you can confirm before importing.
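The name heuristics can be approximated in plain shell. The sketch below is not birdclaw's implementation: `find_archive_candidates` is a hypothetical helper that checks a single directory with `find`, and the Spotlight pass is left out entirely.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of birdclaw: list ZIP files in one directory
# whose names match the patterns documented above.
find_archive_candidates() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f \
    \( -name 'twitter-*.zip' -o -name 'x-*.zip' -o -name '*archive*.zip' \) |
    sort
}

# Usage: find_archive_candidates "$HOME/Downloads"
```

The output is sorted so that repeated runs list candidates in the same order.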

#Import

birdclaw import archive
birdclaw import archive ~/Downloads/twitter-archive-2025.zip

Flags:

  • --select <kinds> — subset of tweets,likes,bookmarks,profiles,directMessages,blocks
  • --dm-mode metadata|full — default is full; metadata skips message bodies for speed
  • --dry-run — analyze without writing
  • --force — re-import even if a manifest hash matches a previous run
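The `--force` description implies a check of roughly this shape: hash something that identifies the archive, compare it with earlier runs, and skip on a match unless forced. The sketch below is an assumption about that flow, not birdclaw's code. It hashes the whole ZIP rather than a manifest, and `file_hash`, `should_import`, `record_import`, and the plain-text record file are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of "skip when the hash matches a previous run".

# Print the SHA-256 of a file, using whichever common tool is installed.
file_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Print "skip" if the archive's hash is already recorded, else "import".
# Passing "yes" as the third argument mimics --force.
should_import() {
  local archive="$1" record="$2" force="${3:-no}"
  local hash
  hash="$(file_hash "$archive")"
  if [ "$force" != "yes" ] && [ -f "$record" ] && grep -qxF "$hash" "$record"; then
    echo "skip"
  else
    echo "import"
  fi
}

# Remember a completed import by appending its hash to the record file.
record_import() {
  file_hash "$1" >> "$2"
}
```

Keeping the hash check separate from the import itself means a forced run simply bypasses the comparison.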

Examples:

birdclaw import archive ~/Downloads/twitter-archive.zip --select tweets,directMessages
birdclaw import archive ~/Downloads/twitter-archive.zip --dm-mode metadata
birdclaw import archive ~/Downloads/twitter-archive.zip --dry-run
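
When scripting around `--select`, it can help to reject misspelled kinds before invoking the CLI. A minimal sketch, assuming the six kinds listed under Flags are the complete set; `validate_kinds` is a hypothetical helper, and birdclaw may well do its own validation.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for a --select value.
# Assumes the documented kinds are the complete set.
validate_kinds() {
  local kinds="$1" kind
  # An empty selection is treated as an error.
  [ -n "$kinds" ] || return 1
  # Split the value on commas for the loop below.
  local IFS=','
  for kind in $kinds; do
    case "$kind" in
      tweets|likes|bookmarks|profiles|directMessages|blocks) ;;
      *) echo "unknown kind: $kind" >&2; return 1 ;;
    esac
  done
}

# Usage:
#   validate_kinds "tweets,directMessages" &&
#     birdclaw import archive ~/Downloads/twitter-archive.zip --select tweets,directMessages
```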

#Hydrate profiles

The archive ships with stale profile metadata (bios, follower counts, avatars from years ago). Hydrate from live Twitter when you can:

birdclaw import hydrate-profiles

This walks the imported profiles table and refreshes each entry through whichever transport is available (xurl first, bird second). Without a live transport, hydration is a no-op and the archive's snapshot stays.
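
The fallback order can be pictured as a "first available wins" check. This is a sketch under a stated assumption, not birdclaw's detection logic: it treats each transport as an executable on `PATH`, which the text does not say, and `pick_transport` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "first available transport wins" check.
# Assumes each transport is an executable on PATH.
pick_transport() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  # Nothing found: the caller should treat hydration as a no-op.
  echo "none"
}

# Preference order from the text: xurl first, bird second.
transport="$(pick_transport xurl bird)"
if [ "$transport" = "none" ]; then
  echo "no live transport; archive snapshot stays as imported"
else
  echo "would hydrate profiles via $transport"
fi
```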

Avatars are written to ~/.birdclaw/media/thumbs/avatars/ so the web UI does not re-fetch them on every render.

#What ends up where

After import, archive data and live data live in the same canonical tables. There is no archive_* shadow universe.

  • Tweets → tweets table, indexed by FTS5; searchable via birdclaw search tweets
  • Likes → tweets table + a likes collection edge; searchable via --liked
  • Bookmarks → tweets table + a bookmarks collection edge; searchable via --bookmarked
  • DMs → dm_conversations and dm_events tables, indexed by FTS5; searchable via birdclaw search dms
  • Profiles → profiles table; drives @mention resolution and DM influence scoring
  • Blocks (when present in the archive export) → blocks table per account

Tweets whose archive timestamps are missing or impossible (1970-01-01 rows) are bucketed into data/tweets/unknown.jsonl on backup export instead of being filed under 1970.
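
The bucketing rule can be sketched as a small function. Only `data/tweets/unknown.jsonl` comes from the text above; the per-year shard names, the ISO 8601 timestamp format, and the `shard_for_timestamp` helper are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: choose a backup shard for a tweet timestamp.
# Per-year shard names are an assumption; only unknown.jsonl is documented.
shard_for_timestamp() {
  local ts="$1"
  case "$ts" in
    # Missing timestamps and epoch-zero rows go to the unknown bucket.
    ''|1970-01-01*) echo "data/tweets/unknown.jsonl" ;;
    # Anything that starts with a four-digit year is filed under that year.
    [0-9][0-9][0-9][0-9]-*) echo "data/tweets/${ts%%-*}.jsonl" ;;
    # Unparseable values are also treated as unknown.
    *) echo "data/tweets/unknown.jsonl" ;;
  esac
}
```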

#After import

birdclaw db stats
3 search tweets 0 1 5 2
3 search tweets 0 1 20 2

db stats prints row counts per table and the schema version so you can confirm the import landed.

#See also

  • Sync — top up archive data with cached live reads
  • Search — FTS5 over tweets and DMs
  • Backup — round-trip the canonical tables to deterministic JSONL shards