Initial import
This guide walks you through a one-shot backfill of your existing SaaS users (and their workspaces / accounts) into a fresh letter.app project, so you can target them with sequences and broadcasts from day one.
It uses the Node SDK to call the same ingestion API your app will use day-to-day, so anything you import here is indistinguishable from data arriving in real time. Re-running the import is safe; identify and group are idempotent on the external id and traits are deep-merged server-side.
What this guide covers
- Contacts with traits (name, plan, signup date, etc.).
- Accounts with traits (B2B workspaces / orgs).
- The contact, account link so segments and sequences can read
account.*traits.
It also explains how to dry-run, checkpoint, resume, and verify.
What this guide skips, and why
Historical events are intentionally out of scope. Inserted events are
trigger-evaluated synchronously, which means dumping months of historical
Signed Up events into a project with a published sequence would mass-enroll
your entire user base into that sequence. Until letter ships a
backfill-friendly skipTriggers flag, the safe pattern is:
- Import contacts and accounts (this guide).
- Build and publish your sequences.
- From that point on, send live events via the SDK.
If you also want historical events available for segment filters
(did event X in the last N days), wait for the dedicated backfill flag or
get in touch and we can stage one for your project.
Prerequisites
- A letter.app project. An empty one is ideal so you can sanity-check the counts after the run.
- A production API key from Settings, API keys. It is shown once on
creation; copy it into
LETTER_API_KEYin your env. - No published sequences yet. Keep them in draft until the import finishes so an unrelated change doesn’t accidentally enroll anyone.
- A Node 20+ environment with
tsxinstalled in the project you’ll run the script from (pnpm add -D tsx @letterapp/node).
The script
Save this as scripts/import-into-letter.ts in any repo with access to your
production database. Swap the iterateUsers body for your own paginated query
(Prisma, Drizzle, raw SQL, anything that streams rows in a stable order).
import { Letter } from "@letterapp/node";
import { readFile, writeFile } from "node:fs/promises";
const CHECKPOINT = ".letter-import-checkpoint.json";
const PAGE_SIZE = 100;
const DRY_RUN = process.argv.includes("--dry-run");
const letter = new Letter({
apiKey: process.env.LETTER_API_KEY!,
flushAt: PAGE_SIZE, // one batch per page
maxRetries: 5, // ride out transient 5xx / 429
onError: (err) => console.error("[letter] flush failed:", err),
});
type SourceUser = {
id: string;
email: string;
name: string | null;
plan: string;
signupAt: Date;
workspace?: {
id: string;
name: string;
mrr: number;
};
};
// Replace the body with your own paginated query. Yield rows in a stable
// order (id ASC works). The checkpoint stores the last id seen so you can
// resume after a crash without re-sending the whole table.
async function* iterateUsers(after: string | null): AsyncGenerator<SourceUser> {
// Example with raw SQL via your own client:
//
// const rows = await sql<SourceUser[]>`
// SELECT id, email, name, plan, signup_at, workspace_id, workspace_name, workspace_mrr
// FROM users
// WHERE ${after ? sql`id > ${after}` : sql`TRUE`}
// ORDER BY id ASC
// LIMIT ${PAGE_SIZE}
// `;
//
// Return each row, then call iterateUsers again with the last id.
void after;
yield* [];
}
async function loadCheckpoint(): Promise<string | null> {
try {
const raw = await readFile(CHECKPOINT, "utf8");
return (JSON.parse(raw) as { cursor: string | null }).cursor;
} catch {
return null;
}
}
async function saveCheckpoint(cursor: string | null): Promise<void> {
await writeFile(CHECKPOINT, JSON.stringify({ cursor }, null, 2));
}
async function main() {
let cursor: string | null = await loadCheckpoint();
let processed = 0;
let buffer: SourceUser[] = [];
for await (const user of iterateUsers(cursor)) {
buffer.push(user);
if (buffer.length < PAGE_SIZE) continue;
await flushPage(buffer);
cursor = buffer.at(-1)!.id;
await saveCheckpoint(cursor);
processed += buffer.length;
console.log(`imported ${processed}`);
buffer = [];
}
if (buffer.length > 0) {
await flushPage(buffer);
cursor = buffer.at(-1)!.id;
await saveCheckpoint(cursor);
processed += buffer.length;
console.log(`imported ${processed}`);
}
await letter.close();
console.log(`done. ${processed} contacts processed.`);
}
async function flushPage(page: SourceUser[]): Promise<void> {
if (DRY_RUN) {
for (const u of page) console.log("would import", u.id, u.email);
return;
}
for (const u of page) {
letter.identify({
userId: u.id,
email: u.email,
traits: {
name: u.name,
plan: u.plan,
signup_at: u.signupAt.toISOString(),
},
timestamp: u.signupAt,
// Stable messageId per source row keeps re-runs cheap on the wire.
messageId: `import:identify:${u.id}`,
});
if (u.workspace) {
letter.group({
userId: u.id,
accountId: u.workspace.id,
name: u.workspace.name,
traits: { mrr: u.workspace.mrr },
messageId: `import:group:${u.id}:${u.workspace.id}`,
});
}
}
await letter.flush();
}
main().catch(async (err) => {
console.error(err);
await letter.close();
process.exit(1);
});
Two things this template buys you:
- Resumable. The script writes the last processed id to
.letter-import-checkpoint.jsonafter every page. Kill it, restart it, and it picks up exactly where it left off. Delete the file to start over from scratch. - Re-runnable. Identify and group upsert on
(projectId, externalId); re-running with the same script is safe and merges any new traits on top of what’s already there. StablemessageIds mean even the wire calls are deduped server-side once we wire idempotency into upserts.
Step by step
- Drop the script into your repo and adjust
iterateUsersto talk to your own database. SetLETTER_API_KEYin your env. - Dry run first to make sure the rows look right:
You should see onepnpm tsx scripts/import-into-letter.ts --dry-run | head -50would import <id> <email>per row. Nothing hits the server yet. - Run for real:
Watch thepnpm tsx scripts/import-into-letter.tsimported Nlines. If something goes wrong, the checkpoint file means you can fix and re-run without re-sending what already landed. - When it prints
done., open the project’s Contacts page and confirm the total matches what you expect.
What happens on the server
Per call, the API does the following (see Ingestion API for the full contract):
identify: upsert on(projectId, userId), set the email, deep-merge the incoming traits into the existing JSONB blob (existing || incoming).group: upsert the account on(projectId, accountId), deep-merge traits, then setcontacts.accountIdfor the matching contact (so the link only lands if you calledidentifyfirst, which the script does).
Both are idempotent: re-running the script after fixing a trait simply merges the new values in. There is no “wipe and replace” mode by design, so you can’t accidentally erase data with a bad re-run.
Verifying the import
- Counts. The Contacts page header shows the total. Compare against the row count from your source query.
- Spot checks. Click into a few contacts, confirm email, traits, and the linked account.
- Segments. Build a quick segment like
plan equals proand confirm the count matches a manual SQL query against your own DB. This is the fastest way to catch a trait that was imported as the wrong type (string"true"vs. booleantrue, for example).
Fixing data after the fact
- Wrong value? Re-run the script with the corrected source data. The new
value wins because
existing || incomingputs the right-hand side last. - Renamed a trait? Add the new key in the script. The old key sticks around as orphan data until you clear it explicitly.
- Want to clear a trait? Send the key with value
nullfrom anidentify. The server keeps the key with a null value (full pruning is on the roadmap, not yet shipped). Filtering on it viatrait does not existworks as expected once it’s null.
Scale and throttling
- The Node SDK already backs off on
429and retries on5xx, so you do not need any custom rate-limit logic. - Per-key limit is 10 000 requests / minute. At
flushAt: 100that is a ceiling of roughly 1M contacts / minute, far beyond any realistic initial import. - One Node process is enough for tens of millions of rows. Don’t shard unless you’ve actually measured a bottleneck on the source DB side, since the SDK will saturate the API on its own.
After the import
- Publish your sequences.
- Wire
@letterapp/nodeinto your app so new signups, plan changes, and workspace updates flow in live. - Drive broadcasts and segments off the data you just imported.
See Ingestion API and Node SDK for the day-to-day reference.