Features
Variant Object Types
AnyVar enables the registration and retrieval of a number of different variation object types, beginning with core elements of the GA4GH Variation Representation Specification (VRS):
Variant Translation
AnyVar implements a Translator abstraction that can be used to ingest free-text variation expressions of known nomenclatures. By way of the VRS-Python translator module, the following kinds of expressions are supported:
| Expression Type | Example |
|---|---|
| HGVS | |
| SPDI | |
| gnomAD/VCF | |
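To make the three nomenclatures concrete, the following is a minimal sketch that guesses which kind of expression a string is from its shape alone. It is not how AnyVar or VRS-Python performs translation: the function name `guess_expression_type`, the regular expressions, and the sample strings are all illustrative assumptions, and the patterns cover only simple substitution-style expressions.

```python
import re
from typing import Optional

# Illustrative patterns only (assumptions, not taken from AnyVar or VRS-Python).
# Real nomenclatures allow many more forms than these expressions accept.
PATTERNS = {
    # accession.version, a coordinate-type prefix such as "g.", then the edit
    "hgvs": re.compile(r"^[A-Z]{2}_\d+\.\d+:[cgmnpr]\.\S+$"),
    # sequence : 0-based position : deleted sequence : inserted sequence
    "spdi": re.compile(r"^[^:\s]+:\d+:[ACGTN]*:[ACGTN]*$"),
    # chromosome-position-ref-alt, as used by gnomAD and derived from VCF columns
    "gnomad": re.compile(r"^(?:chr)?(?:\d{1,2}|X|Y|M|MT)-\d+-[ACGTN]+-[ACGTN]+$"),
}


def guess_expression_type(expression: str) -> Optional[str]:
    """Return 'hgvs', 'spdi', 'gnomad', or None if no pattern matches."""
    text = expression.strip()
    for name, pattern in PATTERNS.items():
        if pattern.match(text):
            return name
    return None


if __name__ == "__main__":
    for sample in ("NC_000007.14:g.55181320A>T",
                   "NC_000007.14:55181319:A:T",
                   "7-55181320-A-T"):
        print(sample, "->", guess_expression_type(sample))
```

A shape check like this is only useful for routing input to a parser. Validating the expression against a reference sequence still requires a real translator.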
VCF Ingestion and Annotation
AnyVar can ingest and register all variants (and reference alleles) contained within a Variant Call Format (VCF) file, and return a file copy with variation IDs included as INFO field properties.
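As a rough illustration of what "IDs included as INFO field properties" means at the file level, the sketch below appends an ID list to the INFO column of a single VCF data line. It is a simplified stand-in, not AnyVar's annotator: the function `annotate_vcf_line`, the INFO key `VRS_Allele_IDs`, and the placeholder IDs are assumptions made for this example.

```python
from typing import List


def annotate_vcf_line(line: str, allele_ids: List[str],
                      key: str = "VRS_Allele_IDs") -> str:
    """Append `key=<id1>,<id2>,...` to the INFO column of one VCF data line.

    Header lines (starting with '#') are returned unchanged. The key name is
    an assumption for this sketch.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the eighth tab-separated VCF column
    entry = f"{key}={','.join(allele_ids)}"
    # An empty INFO column is written as "." in VCF, so replace it outright.
    fields[7] = entry if info in ("", ".") else f"{info};{entry}"
    return "\t".join(fields) + newline


if __name__ == "__main__":
    record = "7\t55181320\t.\tA\tT\t.\tPASS\tDP=10\n"
    # Placeholder IDs: one for the reference allele, one for the alternate.
    print(annotate_vcf_line(record, ["placeholder_ref_id", "placeholder_alt_id"]), end="")
```

A complete annotator would also add a matching `##INFO` definition to the file header so that the new key is declared. That step is omitted here for brevity.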
Object Extensions
Registered variation objects can be associated with extensions, which provide a flexible mechanism for attaching additional, lightweight metadata to a variation. An extension consists of a name (str) describing the type of metadata and a value, which may be any JSON-serializable object.
Extensions are intended to mirror the concept of extensions in VRS: they capture auxiliary information that is closely associated with the variation itself, but not part of its core, identity-defining representation. Extensions are deliberately unstructured and permissive in order to support a wide range of use cases without requiring schema changes.
Examples of appropriate uses include:
registration metadata (e.g., timestamps, provenance identifiers)
cross-references to external resources
links to associated samples, patients, or datasets
simple flags or attributes relevant to the stored object
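The name-plus-value shape described above can be sketched as a small data class. This is a hypothetical model written for illustration, not AnyVar's own class: the `Extension` type below and the sample names and values are assumptions, and the only rule it enforces is that the value is JSON-serializable.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Extension:
    """Hypothetical stand-in for an extension: a name and a JSON-serializable value."""

    name: str
    value: Any

    def __post_init__(self) -> None:
        # json.dumps raises TypeError for values it cannot serialize (e.g. a set).
        json.dumps(self.value)


if __name__ == "__main__":
    # Lightweight metadata of the kinds listed above (sample values are made up).
    extensions = [
        Extension(name="registered_at", value="2024-01-01T00:00:00Z"),
        Extension(name="external_refs", value=["example-db:12345"]),
        Extension(name="reviewed", value=True),
    ]
    for ext in extensions:
        print(json.dumps({"name": ext.name, "value": ext.value}))
```

Keeping the value untyped mirrors the permissive design described above: new kinds of metadata need no schema change, at the cost of leaving validation to the caller.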
Note
Extensions are not intended for storing large, complex, or highly interpretive data. In particular, they should not be used for:
detailed evidence or annotation records
clinical interpretations or assertions
large structured payloads or documents
Such information is better represented in dedicated data models or external systems.
Variant Mappings
Mappings register specific kinds of relationships between variations, such as reference assembly liftover. The VariationMappingType enum lists the supported relationship types:
- class anyvar.core.metadata.VariationMappingType(value)
Supported mapping types between VRS Variations.
Warning
Currently, use of mappings other than the liftover relation is experimental, and these parameters are subject to change. The following describes current intentions, which may or may not be validated within AnyVar.
LIFTOVER_TO: Genomic-to-genomic coordinate transformation. Use when mapping a variation between two reference sequences of the same molecule type, typically genomic DNA.
TRANSCRIBE_TO: Genomic-to-transcript (RNA/cDNA) projection. Use when projecting a genomic DNA object onto a transcript sequence (RNA or cDNA). This accounts for splicing, transcript strand orientation, and transcript-specific exon structure.
TRANSLATE_TO: Transcript-to-protein projection. This entails codon interpretation, amino acid substitution/insertion/deletion/extension, and protein coordinate changes.
- LIFTOVER_TO = 'liftover_to'
- TRANSCRIBE_TO = 'transcribe_to'
- TRANSLATE_TO = 'translate_to'
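To show how a mapping type might be attached to a pair of variations, here is a minimal sketch. The enum is redefined locally with the member names and values listed above so the example runs without AnyVar installed. The `VariationMapping` record, its field names, and the placeholder IDs are hypothetical and not part of the documented API.

```python
from dataclasses import dataclass
from enum import Enum


class VariationMappingType(str, Enum):
    """Local stand-in mirroring the documented members and values."""

    LIFTOVER_TO = "liftover_to"
    TRANSCRIBE_TO = "transcribe_to"
    TRANSLATE_TO = "translate_to"


@dataclass(frozen=True)
class VariationMapping:
    """Hypothetical record linking a source variation to a destination variation."""

    source_id: str
    dest_id: str
    mapping_type: VariationMappingType


if __name__ == "__main__":
    # Placeholder IDs standing in for two registered variation IDs.
    mapping = VariationMapping(
        source_id="placeholder_id_on_old_assembly",
        dest_id="placeholder_id_on_new_assembly",
        mapping_type=VariationMappingType.LIFTOVER_TO,
    )
    print(mapping.mapping_type.value)
    # Enum members can also be looked up from their string values.
    print(VariationMappingType("translate_to").name)
```

The relationship is directional: the mapping type reads from source to destination, so a liftover in the opposite direction would be a separate record.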
Stateless Mode
Optionally, an AnyVar server can be configured as stateless, providing its variant translation and bulk file annotation functions without a persistent object registration backend. This mode uses the NoObjectStore class in place of a relational data backend.
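The idea behind a stateless backend can be sketched with a null-object pattern: a store that accepts writes but keeps nothing. The class and function below are hypothetical illustrations of that pattern. The names `DiscardingStore`, `add`, `get`, and `register` are assumptions and do not describe the actual NoObjectStore interface.

```python
class DiscardingStore:
    """Hypothetical storage backend that persists nothing."""

    def add(self, object_id: str, obj: dict) -> None:
        # Accept the write and drop it; no state is kept between calls.
        return None

    def get(self, object_id: str) -> dict:
        # Nothing is ever stored, so every lookup fails.
        raise KeyError(object_id)


def register(store, object_id: str, obj: dict) -> str:
    """Hand an object to the store and return its ID, whatever the backend."""
    store.add(object_id, obj)
    return object_id


if __name__ == "__main__":
    store = DiscardingStore()
    # The caller still gets an ID back, so translation-style workflows keep working.
    print(register(store, "placeholder_id", {"type": "Allele"}))
    try:
        store.get("placeholder_id")
    except KeyError:
        print("nothing was persisted")
```

Because the calling code depends only on the store's methods, swapping a persistent backend for a discarding one requires no changes to the translation or annotation logic.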