anyvar.storage.duckdb
Provide DuckDB-based storage implementation.
Most live, persistent variant registration and search services will want to employ the PostgreSQL storage option for its performance, particularly for concurrent writes and large-scale datasets. However, DuckDB may be better in certain use cases:
The DuckDB file-based option can be used to assemble a cohort or a dataset into a static file, like an index, that can be easily passed along to other users for later lookup. This option can also function as a simple registration service in cases where a separately provisioned PostgreSQL server is logistically prohibitive, although this is not ideal.
The DuckDB in-memory option can be used for simple testing and demonstration purposes, and also works as a less-performant “stateless” translation service. Note that the database is wiped and recreated every time a FastAPI service restarts.
>>> from anyvar.storage.duckdb import DuckDbObjectStore
>>> file_based = DuckDbObjectStore("duckdb:///path/to/my/variants.duckdb")
>>> in_memory = DuckDbObjectStore("duckdb:///:memory:")
Under the hood, this should behave like a simpler but less-performant equivalent of the PostgreSQL store. Our implementation is designed to employ common SQLAlchemy resources, so there should be minimal DuckDB-specific maintenance required here.
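The two URI forms above differ only in the database portion after the scheme. As a minimal sketch, and not the store's own parsing logic, the helper below (`describe_duckdb_uri` is a hypothetical name that is not part of `anyvar`) uses the standard library to tell them apart. It assumes the SQLAlchemy convention of dropping one leading slash from the path.

```python
from urllib.parse import urlsplit


def describe_duckdb_uri(db_uri: str) -> tuple[str, str]:
    """Classify a DuckDB URI as in-memory or file-based.

    Hypothetical helper for illustration only; how DuckDbObjectStore
    actually interprets the URI is not shown in this documentation.
    """
    parts = urlsplit(db_uri)
    if parts.scheme != "duckdb":
        raise ValueError(f"Not a DuckDB URI: {db_uri!r}")
    # Assumed SQLAlchemy convention: drop one leading slash from the path.
    database = parts.path[1:] if parts.path.startswith("/") else parts.path
    if not database:
        raise ValueError(f"No database given in URI: {db_uri!r}")
    mode = "in-memory" if database == ":memory:" else "file"
    return mode, database


print(describe_duckdb_uri("duckdb:///:memory:"))
print(describe_duckdb_uri("duckdb:///path/to/my/variants.duckdb"))
```

A check like this can be useful before starting a long-running service, since an in-memory store loses its contents on restart.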
- class anyvar.storage.duckdb.DuckDbObjectStore(db_uri, *args, **kwargs)
DuckDB-backed AnyVar object store.
- __init__(db_uri, *args, **kwargs)
Initialize DuckDB storage.
- Parameters:
db_uri (str) – DuckDB connection URI. See above for options.
- delete_extensions(object_id, name=None, value=None)
Delete extension(s) for an object.
Supports gradual specificity – either delete all extensions, or delete all extensions under a given key/name, or delete all extensions with a given name AND value.
If no extension matching given args exists, do nothing.
Note that this gets a little slow in DuckDB, because we have to manually query the number of matching rows before deleting them.
- Parameters:
object_id (str) – The object ID
name (str | None) – Optional extension key/name to delete
value (Optional[TypeAliasType]) – Optional extension value to delete. Ignored if name is not provided
- Return type:
int
- Returns:
Number of deleted rows
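The gradual specificity and the count-then-delete behavior described above can be sketched as follows. This is an illustration under stated assumptions, not AnyVar's implementation: it uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the `extensions` table, its columns, the `make_demo_db` helper, and the `obj-1`/`obj-2` IDs are all hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE extensions (object_id TEXT, name TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO extensions VALUES (?, ?, ?)",
        [
            ("obj-1", "source", "clinvar"),
            ("obj-1", "source", "gnomad"),
            ("obj-1", "note", "reviewed"),
            ("obj-2", "source", "clinvar"),
        ],
    )
    return conn


def delete_extensions(conn, object_id, name=None, value=None) -> int:
    """Delete matching extensions and return how many rows were removed."""
    clauses, params = ["object_id = ?"], [object_id]
    if name is not None:
        clauses.append("name = ?")
        params.append(name)
        # The value filter only applies when a name is also given.
        if value is not None:
            clauses.append("value = ?")
            params.append(value)
    where = " AND ".join(clauses)
    # Count the matching rows first, then delete them.
    count = conn.execute(
        f"SELECT COUNT(*) FROM extensions WHERE {where}", params
    ).fetchone()[0]
    if count:
        conn.execute(f"DELETE FROM extensions WHERE {where}", params)
    return count


conn = make_demo_db()
print(delete_extensions(conn, "obj-1", "source", "clinvar"))  # name AND value
print(delete_extensions(conn, "obj-1", "source"))             # all under a name
print(delete_extensions(conn, "obj-1"))                       # everything left
```

Each call narrows or widens the `WHERE` clause depending on which arguments are supplied, and a call that matches nothing returns 0 without issuing a delete.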