An independent scientific registry · For AI Generated Art(ificial) bio sequences · Operated under the ArtGene Consortium v1.0
Quick start · ≈ 3 minutes

Deposit from the command line.

(Feature under development.) The ArtGene CLI handles authentication, FASTA validation, biosafety screening, and certificate retrieval in a single command. Reading records does not require an API key; deposits require a free institutional key.

Install
$ pip install artgene
$ artgene auth login
Deposit a sequence
$ artgene deposit my-sequence.fasta \
    --model "ESM-3 v2.1" \
    --host "E. coli BL21(DE3)" \
    --license CC-BY-4.0

  ↳ validating FASTA         ✓
  ↳ gate α · structural      ✓ 0.91
  ↳ gate β · off-target      ✓ 0.97
  ↳ gate γ · ecological      ✓ 0.88
  ↳ watermark embedded       ✓ 128-bit
  ↳ certificate minted       ✓ AG-2026-018428
  ↳ anchored to ledger       ✓ block 148903
Verify any record
$ artgene verify AG-2026-018427
  ✓ certificate valid
Deposit lifecycle

How a deposit becomes a certificate.

01
Prepare your sequence
ArtGene Archive accepts protein-coding sequences in standard FASTA format. The sequence must be at least 10 characters. Longer sequences (300+ amino acids) provide more synonymous codon positions for fingerprinting.
>MyProtein | Homo sapiens
MKTIIALSYIFCLVFA…
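The requirements above can be checked locally before depositing. The following is a minimal sketch, not the registry's own validator: it assumes a single-record FASTA containing an amino acid sequence, and the helper names `parse_fasta` and `validate_protein` as well as the restriction to the 20 standard residue codes are illustrative assumptions. Only the 10-character minimum comes from the text above.

```python
# Hypothetical pre-deposit check; names and the residue alphabet are assumptions.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LENGTH = 10  # minimum length stated in step 01


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA record must start with a '>' header line")
    header = lines[0][1:].strip()
    # Sequence lines may be wrapped; join them and normalise case.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def validate_protein(sequence):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(sequence) < MIN_LENGTH:
        problems.append(
            f"sequence has {len(sequence)} residues; at least {MIN_LENGTH} required"
        )
    unknown = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unknown:
        problems.append("non-standard residue codes: " + "".join(unknown))
    return problems
```

Calling `validate_protein` on the sequence returned by `parse_fasta` gives an empty list when both checks pass, so the result can be used directly as a go/no-go signal before opening the deposit form.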
02
Fill in the deposit form
Go to the Register page and provide sequence metadata — name, molecule type, expression host, generating model, and an ethics approval code.
  • Owner ID — your email or researcher username.
  • Ethics Code — IRB / ethics committee approval reference.
  • Host Organism — expression system calibrates gate thresholds.
Open deposit form →
03
Three biosafety gates run automatically
After submission, the pipeline runs the α → β → γ gates in order. All must pass for a certificate to be issued. A failure at gate α short-circuits the remaining gates.
Read gate details →
04
Receive your certificate and AG-ID
A successful run returns a Certificate with a unique registry ID (e.g. AG-2026-000001). The certificate records the SHA3-512 hash, gate outcomes, and watermark carrier positions for auditing.
Browse registry →
05
Issue per-recipient distribution copies
From the sequence detail page, issue fingerprinted copies for each recipient. Each copy embeds a unique codon pattern — same protein, different synonymous codons. Paste any leaked copy into the Verify Source page to identify the origin.
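To illustrate the idea of "same protein, different synonymous codons", here is a small sketch of how a per-recipient copy could be derived. It is not the registry's watermarking scheme: the function names, the use of SHA3-256 over a recipient ID and position to pick codons, and the absence of host-specific codon optimisation are all assumptions made for illustration. The codon table is the standard genetic code.

```python
import hashlib

# Standard genetic code, grouped by amino acid (stop codons omitted).
SYNONYMOUS_CODONS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "N": ["AAT", "AAC"],
    "D": ["GAT", "GAC"],
    "C": ["TGT", "TGC"],
    "Q": ["CAA", "CAG"],
    "E": ["GAA", "GAG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "H": ["CAT", "CAC"],
    "I": ["ATT", "ATC", "ATA"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "K": ["AAA", "AAG"],
    "M": ["ATG"],
    "F": ["TTT", "TTC"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "W": ["TGG"],
    "Y": ["TAT", "TAC"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
}
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def fingerprinted_copy(protein, recipient_id):
    """Encode a protein as DNA, choosing each codon from a recipient-specific hash.

    Hypothetical scheme: the codon at each position is selected by the first
    byte of SHA3-256("<recipient_id>:<position>").
    """
    dna = []
    for position, aa in enumerate(protein):
        choices = SYNONYMOUS_CODONS[aa]
        digest = hashlib.sha3_256(f"{recipient_id}:{position}".encode()).digest()
        dna.append(choices[digest[0] % len(choices)])
    return "".join(dna)


def translate(dna):
    """Translate DNA back to protein to confirm the copy is synonymous."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

Because every choice is drawn from the synonymous set for that residue, `translate(fingerprinted_copy(p, r))` always returns `p`, whichever recipient the copy was issued for. Residues with a single codon (M, W) carry no fingerprint information, which is why longer sequences offer more usable positions.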
Biosafety gates

Three sequential checks.

Gates run in order. A hard FAIL at any gate prevents the next from running. Results are stored in the certificate for auditing.
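The ordering rule can be sketched as a short loop that stops at the first hard FAIL. This is an illustrative outline only: the `Outcome` enum, the `run_gates` function, and the treatment of WARN as non-blocking are assumptions, and the gate checks themselves are stand-ins supplied by the caller rather than the real α, β, and γ implementations.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "PASS"
    WARN = "WARN"
    FAIL = "FAIL"


def run_gates(sequence, gates):
    """Run (name, check) pairs in order and stop after the first FAIL.

    Returns the overall status, the recorded outcomes, and the names of any
    gates that were skipped. WARN is assumed not to block certification.
    """
    results = {}
    for name, check in gates:
        outcome = check(sequence)
        results[name] = outcome
        if outcome is Outcome.FAIL:
            break  # a hard FAIL prevents later gates from running
    skipped = [name for name, _ in gates if name not in results]
    status = "REJECTED" if Outcome.FAIL in results.values() else "CERTIFIED"
    return {"status": status, "results": results, "skipped": skipped}
```

Keeping the skipped gate names alongside the recorded outcomes mirrors the auditing goal described above: a reader of the result can tell the difference between a gate that passed and one that never ran.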

α
Structural confidence
ESMFold pLDDT · LinearFold ΔMFE

Uses ESMFold-derived pLDDT scores to assess predicted folding confidence, alongside the RNA minimum free energy change (ΔMFE) from LinearFold. Sequences predicted to fold into dangerous prion-like or amyloid-prone structures are flagged.

Fail → gates β and γ are skipped.

β
Off-target homology
BLAST · ToxinPred2 · SecureDNA · IBBIS

Amino acid composition analysis: Kyte-Doolittle hydropathy (GRAVY), cationic/amphipathic toxin scoring, allergen probability estimation, and a curated k-mer screen for known antimicrobial peptide scaffolds. Full BLAST screening against pathogen and toxin databases is in development.
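Of the measures listed, GRAVY is the simplest to reproduce: it is the mean Kyte-Doolittle hydropathy value over all residues. The sketch below shows that calculation only. The function name `gravy` is illustrative, and how the registry combines GRAVY with the other scores is not specified here.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("sequence is empty")
    # A KeyError here signals a residue code outside the standard 20.
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)
```

Positive values indicate a more hydrophobic sequence overall, negative values a more hydrophilic one.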

Toxin probability > 0.30, allergen probability > 0.40, or any k-mer match to a known toxic scaffold → FAIL. Allergen probability above 0.30 but not above 0.40 → WARN.
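These thresholds translate directly into a small decision function. The sketch below assumes the three inputs have already been computed by upstream scoring; the function and parameter names are hypothetical, while the cut-off values are the ones stated above.

```python
TOXIN_FAIL = 0.30
ALLERGEN_FAIL = 0.40
ALLERGEN_WARN = 0.30


def classify_beta(toxin_prob, allergen_prob, kmer_hits):
    """Map gate β scores to PASS, WARN, or FAIL using the stated thresholds."""
    if toxin_prob > TOXIN_FAIL or allergen_prob > ALLERGEN_FAIL or kmer_hits > 0:
        return "FAIL"
    if allergen_prob > ALLERGEN_WARN:
        return "WARN"
    return "PASS"
```

The FAIL conditions are checked first, so an allergen probability above 0.40 is never downgraded to WARN. Values exactly on a threshold pass, because the comparisons are strict.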

γ
Ecological risk
HGT scoring · DriftRadar

Horizontal Gene Transfer (HGT) propensity scoring and DriftRadar ecological-spread modelling estimate environmental containment risk.

High HGT score or escape probability → WARN or FAIL.

FAQ

Frequently asked questions.

Can I register RNA or DNA sequences?

Not at present. ArtGene currently requires protein-coding sequences provided as FASTA. The DNA sequence is synthesised via codon optimisation when a distribution copy is issued.

What is Provenance Tracing?

After registering a sequence, you can issue fingerprinted distribution copies for each recipient from the sequence detail page. Each copy embeds a unique codon pattern (same protein, different synonymous codons). If a copy leaks, paste it on the Verify Source page to identify which recipient it came from.
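One simple way to picture the lookup is a nearest-match search over the copies that were issued. The sketch below is an assumption about how such tracing could work, not a description of the Verify Source page: it compares a leaked DNA string against each issued copy by Hamming distance, and the names `hamming` and `trace_source` and the example recipients are hypothetical.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def trace_source(leaked_dna, issued_copies):
    """Return (recipient, mismatches) for the issued copy closest to the leak.

    issued_copies maps a recipient ID to the DNA string issued to them.
    Copies of a different length are ignored; returns None if none remain.
    """
    candidates = {
        recipient: hamming(leaked_dna, dna)
        for recipient, dna in issued_copies.items()
        if len(dna) == len(leaked_dna)
    }
    if not candidates:
        return None
    recipient = min(candidates, key=candidates.get)
    return recipient, candidates[recipient]
```

Reporting the mismatch count alongside the recipient lets the caller judge how strong the match is: zero mismatches is an exact copy, while a small non-zero count may indicate a copy that was mutated or partially re-encoded after leaving the recipient.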

What does CERTIFIED vs REJECTED mean?

CERTIFIED means the sequence passed all applicable biosafety gates and has been issued a certificate with a registry ID. REJECTED means one or more gates returned a hard FAIL, and the sequence cannot be registered until the safety concern is resolved.

Where can I find my Organisation UUID?

Your organisation UUID is assigned by the ArtGene platform administrator when your institution is onboarded. Contact your system administrator if you do not have it.

How is the certificate hash computed?

The certificate hash is a SHA3-512 digest of the canonical certificate JSON (excluding the hash field itself). It can be used to verify that a certificate has not been tampered with.
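A check along these lines can be sketched with the Python standard library. The exact canonicalisation rules are not given here, so this example assumes sorted keys, compact separators, and UTF-8 encoding; the field name `hash` and the function names are likewise assumptions. A digest produced this way will only match the registry's if the canonical forms agree.

```python
import hashlib
import json


def certificate_hash(certificate, hash_field="hash"):
    """SHA3-512 hex digest of the certificate JSON, excluding the hash field.

    Canonical form is assumed to be: keys sorted, no extra whitespace, UTF-8.
    """
    payload = {k: v for k, v in certificate.items() if k != hash_field}
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha3_512(canonical.encode("utf-8")).hexdigest()


def verify_certificate(certificate, hash_field="hash"):
    """Return True if the stored hash matches a freshly computed one."""
    return certificate.get(hash_field) == certificate_hash(certificate, hash_field)
```

Sorting keys before hashing makes the digest independent of the order in which fields happen to be serialised, which is the property a tamper check needs.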

Ready to deposit?

Register a sequence now.

Deposits are free for public records. A certificate is issued in under two minutes.

Open deposit form →