A public archive
for sequences that
nature never wrote.
Generative models now produce proteins, genes and regulatory elements faster than the scientific community can catalogue them. ArtGene Archive is the first dedicated registry for AI-designed biological sequences. It provides watermarking, biosafety certification, and an auditable chain of custody from model to bench to publication, while separating naturally occurring sequences from those created by humans in collaboration with AI. The Art(ificial)-gene Archive only stores sequences that pass multiple biosecurity screening gates (see demo).
Idealised helix, rendered from
a watermarked 648 bp fragment.
What GenBank was to the sequencing machine,
ArtGene Archive is to the generative model.
In 1982 a small group at the National Institutes of Health recognised that automated sequencers had begun to produce DNA data faster than journals or institutions could track. The response, a public, federated, machine-readable catalogue, became the scaffolding of modern biology.
Forty-four years later we stand at a structurally identical moment. A frontier model can now propose a functional enzyme in milliseconds. A graduate student with a laptop and a credit card can order it as DNA before lunch. The community has no shared infrastructure to know which sequences were machine-designed, who designed them, what safety checks were applied, or whether they were ever observed in nature.
ArtGene Archive is that infrastructure. We provide a dedicated deposit path, automated three-gate biosafety screening (more under development), cryptographic watermarking via codon steganography, tamper-evident certification, and formal recognition for contributors. Deposits are free. Records are public. The registry will be governed by an international consortium.
The absence of this infrastructure is a biosecurity risk today. Its presence could accelerate safe, citizen-driven biological innovation at the scale the moment demands.
Four pillars.
Biosafety screening
Three automated gates run on every deposit: structural confidence (ESMFold pLDDT), off-target homology (BLAST + ToxinPred2), and ecological risk (HGT + DriftRadar). Flagged records route to a human panel within 24h.
Codon watermark
Synonymous codon substitution encodes a 128-bit institutional signature into the coding sequence. The watermark leaves the translated protein unchanged and is recoverable from re-sequenced DNA: provenance that travels with the molecule.
Tamper-evident certificate
Each deposit issues a signed JSON certificate anchored to a Merkle ledger. The AG-ID, the sequence hash, the model, and the biosafety scorecard are all cryptographically bound. Trust but verify.
Contributor recognition
Researchers, labs and institutions accumulate a verifiable public record of depositions. First-deposit priority is time-stamped and permanent. Models themselves are credited as generative provenance.
An invisible
signature, baked
into the molecule.
Because most amino acids are encoded by multiple synonymous codons, a coding sequence can carry arbitrary information without altering the protein it produces. ArtGene exploits this redundancy to embed a 128-bit institutional signature — a fingerprint that persists across plasmid transfer, re-synthesis, and publication.
Re-sequence the molecule in any lab on earth and our verifier will return the AG-ID, the depositing institution, and the original certificate — or fail cleanly and tell you it cannot.
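As an illustration of the idea only, here is a minimal Python sketch of codon steganography. It is not the ArtGene scheme: it assumes one payload bit per amino acid with at least two synonymous codons, picks between the two alphabetically first codons, and ignores codon-usage bias, error correction and signing. The function names `embed`, `extract` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> alphabetically sorted list of its synonymous codons.
SYNONYMS = {}
for codon, aa in sorted(CODON_TO_AA.items()):
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate a coding sequence (length divisible by 3) into a protein."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def embed(protein, bits):
    """Back-translate a protein, hiding one bit (0 or 1) per usable residue."""
    queue = list(bits)
    codons = []
    for aa in protein:
        options = SYNONYMS[aa]
        if len(options) >= 2 and queue:
            codons.append(options[queue.pop(0)])  # bit selects codon 0 or 1
        else:
            codons.append(options[0])  # Met, Trp, or payload already written
    if queue:
        raise ValueError("protein too short to carry the payload")
    return "".join(codons)


def extract(dna, n_bits):
    """Recover n_bits from a watermarked sequence, or fail with ValueError."""
    bits = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        options = SYNONYMS[CODON_TO_AA[codon]]
        if len(options) < 2:
            continue  # single-codon amino acids carry no information
        index = options.index(codon)
        if index > 1:
            raise ValueError("codon outside the watermark alphabet")
        bits.append(index)
        if len(bits) == n_bits:
            return bits
    raise ValueError("sequence too short to contain the payload")
```

Embedding a payload in a short peptide and translating the result returns the original protein, while `extract` returns the payload bits. A sequence that was not written by `embed` will usually raise `ValueError` rather than yield a plausible-looking signature.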
Read the technical specification →
From upload to bio-screening to certificate
in one seamless pipeline.
Every record is signed.
Every signature is
public and verifiable.
ArtGene certificates are signed JSON objects anchored to a public Merkle ledger. Any record can be verified offline with our open-source CLI. No vendor lock-in. No gatekeeping. No ambiguity about who deposited what, when, and under what biosafety conditions.
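To make the offline check concrete, here is a minimal Python sketch of verifying that a certificate is included in a Merkle tree. It is an assumption-laden illustration, not the ArtGene CLI or ledger format: the certificate field names, the canonical JSON encoding, the SHA-256 domain prefixes and the duplicate-last-node padding rule are all hypothetical, and signature verification is left out.

```python
import hashlib
import json


def sha256(data):
    return hashlib.sha256(data).digest()


def leaf_hash(cert):
    """Hash a certificate dict via a canonical (sorted-key) JSON encoding."""
    canonical = json.dumps(cert, sort_keys=True, separators=(",", ":")).encode()
    return sha256(b"\x00" + canonical)  # prefix separates leaves from nodes


def node_hash(left, right):
    return sha256(b"\x01" + left + right)


def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # assumed rule: duplicate the last node
        levels.append([node_hash(cur[i], cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        index //= 2
    return proof


def verify(cert, proof, root):
    """Recompute the root from a certificate and its inclusion proof."""
    h = leaf_hash(cert)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root
```

A verifier needs only the certificate, its proof and the published root: `verify(cert, proof, root)` returns `True` for an untouched record and `False` if any bound field, such as the sequence hash or the scorecard, has been altered.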
ArtGene is operated as public-interest infrastructure — not a product.
Deposit your first sequence.
The registry is free to use, open by default, and takes less than a minute for a standard deposit. No institutional account is required to read; one API key is required to deposit.