Anfisa glossary

Variant

This documentation uses the term "variant" in two senses: DNA variant and transcript variant. In a generic context the primary meaning is DNA variant; transcript variant is a more detailed refinement of it.

DNA Variant

A DNA variant is the base object of the system. It corresponds to a difference (mutation) between the genome of a particular person and the "standard" genomic sequence.

Dataset

is a collection of variants; the system supports two kinds of datasets: XL-datasets and workspaces (WS-datasets)

Case

is a group of persons (samples) connected by family relationships, whose genomic information is registered in a dataset.

Cohort

is a group of persons, possibly large, collected for scientific purposes

Sample

a person in a case; may be affected or unaffected. The proband sample in a medical case is always affected

Trio

three persons: a child and two parents; the child is the subject of the trio. In practice, cases are often trios

Vault

Storage for all datasets represented in the system

XL-dataset

eXtra Large dataset; may contain arbitrarily many variants

Workspace
WS-dataset

comparatively small dataset (fewer than 9000 variants); the object of the Workspace page

Filtration in a workspace applies to transcript variants.

Tagging and zone selection mechanisms are available for workspaces.

Transcript

a gene transcription scenario. Transcripts of the most important type describe a protein-coding process with a known, fixed protein result.

Transcript variant

A particular DNA variant can affect zero, one, or many gene-coding transcripts, so a single DNA variant can be considered a list of transcript variants. Transcript variants are defined as (actual) logical pairs (DNA variant, transcript) and are used in WS-datasets as the low-level base information objects.
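
The pairing described above can be sketched in Python. This is only an illustration: the identifiers, the TranscriptVariant type, and the input mapping are hypothetical and do not reflect how the system stores its data.

```python
from typing import NamedTuple


class TranscriptVariant(NamedTuple):
    dna_variant: str  # hypothetical DNA variant identifier
    transcript: str   # hypothetical transcript identifier


def expand_to_transcript_variants(affected):
    """Turn {dna_variant: [transcripts]} into a flat list of pairs."""
    return [
        TranscriptVariant(variant, transcript)
        for variant, transcripts in affected.items()
        for transcript in transcripts
    ]


# Made-up identifiers, for illustration only.
affected = {
    "var_A": ["tr_1", "tr_2"],  # affects two transcripts
    "var_B": [],                # affects none, so it yields no pairs
    "var_C": ["tr_3"],
}
pairs = expand_to_transcript_variants(affected)
```

A DNA variant that affects no transcript contributes no transcript variants, which is why the number of pairs can differ from the number of DNA variants.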

Viewing regime

The user can view and study all properties of selected variants in this regime. See Viewing regimes for details.

Filtration

is the main analytic mechanism provided by the system: the user defines rules that select the variants (DNA or transcript ones) satisfying conditions on a variety of properties. The selected subset of variants can be studied in detail in the viewing regime. The user can also create a secondary workspace and continue studying the data inside it.

Two filtration mechanisms are supported in the system: filters and decision trees.

Filter

an implementation of the filtration mechanism in which a sequence of conditions is applied to variants one by one, conjunctively: a variant is selected only if it satisfies every condition. Filters are solution items
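
A minimal sketch of conjunctive filtering, assuming conditions can be modelled as plain Python predicates. The property names (quality, gene) and the apply_filter helper are hypothetical and are not part of the system's API.

```python
def apply_filter(variants, conditions):
    """Keep the variants that satisfy every condition (logical AND)."""
    selected = list(variants)
    for condition in conditions:
        # Each condition narrows the result of the previous one.
        selected = [v for v in selected if condition(v)]
    return selected


# Made-up variants with made-up properties.
variants = [
    {"id": 1, "quality": 50, "gene": "G1"},
    {"id": 2, "quality": 10, "gene": "G1"},
    {"id": 3, "quality": 70, "gene": "G2"},
]
conditions = [
    lambda v: v["quality"] >= 30,
    lambda v: v["gene"] == "G1",
]
result = apply_filter(variants, conditions)
```

Because the conditions are combined with AND, the order in which they are applied does not change the final selection.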

Filtering regime

Regime for working with filters; see Filtering regime

Decision tree

an implementation of filtration in which conditions are applied to variants in the form of a decision tree. See Decision Tree Syntax Reference for definitions. Decision trees are solution items

Decision tree code

The internal representation of a decision tree is a portion of code in a Python dialect. See Decision Tree Syntax Reference

Decision tree point

A single instruction of a decision tree. Points of the main types have a state: the set of variants that correspond to the point. See Decision Tree Syntax Reference
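
The idea of a point state can be illustrated with a simplified model. The sketch below is plain Python, not the decision tree dialect; it assumes that each point is a (condition, decision) pair, that variants matching a condition are decided and leave the flow, and that the state of a point is the set of variants reaching it. The run_tree function and the property names are hypothetical.

```python
def run_tree(variants, points, default=False):
    """Evaluate (condition, decision) points in order.

    Returns the accepted ids and, for each point, the ids that reached it.
    """
    remaining = set(variants)   # ids of variants not yet decided
    accepted = set()
    states = []
    for condition, decision in points:
        states.append(set(remaining))  # state of this point
        matched = {v for v in remaining if condition(variants[v])}
        if decision:
            accepted |= matched
        remaining -= matched           # decided variants leave the flow
    states.append(set(remaining))      # state of the final point
    if default:
        accepted |= remaining
    return accepted, states


# Made-up variants with made-up properties.
variants = {
    "v1": {"depth": 40, "rare": True},
    "v2": {"depth": 5, "rare": True},
    "v3": {"depth": 60, "rare": False},
}
points = [
    (lambda p: p["depth"] < 10, False),  # reject low-depth variants
    (lambda p: p["rare"], True),         # accept rare variants
]
accepted, states = run_tree(variants, points)
```

In this model the states shrink from point to point, because every decided variant is removed before the next point is evaluated.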

Decision tree state label

Name of the state of a decision tree point. Labels are used to set proper parameters for some complex functions. See Decision Tree Syntax Reference

Tagging

In a workspace, where the number of variants is not large, the user can tag variants manually. Tags are stored on the server side. See details in ws_tags

Zone

In a workspace, the user can use zone selection as an additional filtration mechanism.

Primary dataset

A dataset that was loaded into the vault directly. Usually it is an XL-dataset with a wide variety of variants.

Secondary workspace

The user can create workspace datasets as a result of the filtration process. The typical scenario is as follows: the user starts with a primary dataset, selects a comparatively small subset of variants, and puts it into a secondary workspace, where the subset is ready for careful, detailed manual study. The user can repeat the selection procedure more than once.

Root dataset

For a secondary workspace, the original dataset that was loaded into the vault directly
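
A minimal sketch of finding the root dataset, assuming each dataset records the name of the dataset it was derived from and a primary dataset records none. The base_of mapping, the dataset names, and the find_root helper are hypothetical.

```python
def find_root(name, base_of):
    """Follow base links until reaching a dataset that has no base."""
    seen = set()
    while base_of.get(name) is not None:
        if name in seen:
            raise ValueError("cyclic base links")
        seen.add(name)
        name = base_of[name]
    return name


# Made-up dataset names: a primary dataset and two derived workspaces.
base_of = {
    "xl_primary": None,
    "ws_first": "xl_primary",
    "ws_second": "ws_first",
}
root = find_root("ws_second", base_of)
```

Under this assumption a primary dataset is its own root, and every workspace derived from it, directly or through repeated selection, shares the same root.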

Viewing property

A property of a variant shown in the viewing regime

Conditions

Conditions on various filtering properties, see Condition descriptor.

Decision tree atomic condition
Atom

Atomic condition on a filtering property, used in a decision tree point; see Decision Tree Syntax Reference

Filtering property
Unit

A property of variants used in filtration procedures.

Numeric property

Filtering property with numeric values

Enumerated property

Filtering property with values from an enumerated list of strings

Status property

Enumerated property with a single value

Multiset property

Enumerated property with multiple values

Variety property

Enumerated property with a wide spectrum of values (or symbols). Dual to the corresponding panel property; see Variety and panel filtering complex.

Panel property

Enumerated property that represents the presence of the dual variety property's symbols in symbol panels; see Variety and panel filtering complex.
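
The duality between a variety property and its panel property can be sketched as follows. This assumes that the panel property of a variant lists the panels containing at least one of the variant's symbols; the panel names, the symbols, and the panel_values helper are hypothetical.

```python
def panel_values(variant_symbols, panels):
    """Names of the panels that contain at least one of the symbols."""
    symbols = set(variant_symbols)
    return sorted(
        name for name, members in panels.items()
        if symbols & set(members)
    )


# Made-up symbol panels.
panels = {
    "panel_a": ["SYM1", "SYM2"],
    "panel_b": ["SYM3"],
}
values = panel_values(["SYM2", "SYM9"], panels)
```

The variety property itself may take very many distinct values, while the panel property stays within the much smaller set of panel names.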

Functions
Filtering function

Aggregated information items that can be used in filtration procedures in the same way as filtering properties, provided that parameter data is defined. See Filtering functions.

Filtering properties classification

Classification of filtering properties, used for the UX settings of filtering properties in the filtering regime.

Dataset documentation

Collection of documents in various formats attached to a dataset or produced by the system when the dataset is loaded or created. Documentation on a secondary workspace includes references to the documentation on its base dataset.

Aspect

Representation of a part of the data on a variant within the full view representation. See Viewing regimes

Solution item

An item representing an application solution useful to the user. A general name for filters, decision trees, and some other items. See the discussions Solution items and Solution items in work.

Rules

Aggregated multiset property that reports which decision trees are positive on the variant. Available only in the filtering regime on the Workspace page.

Symbol panel

List of symbols prepared for special purposes. Used in Variety and panel filtering complex. Symbol panels are solution items

Gene panel

List of gene symbols. Gene panels are solution items

Active symbols

Symbols that are reported in variety properties in complete form. Implemented as a persistent hidden symbol panel, see details here

Export

Operation that creates an (external) Excel document for the selected variants. The document is stored on the server side; see configuration settings.

Delayed request

A request that needs to be completed only if the main request has returned incomplete information. Delayed requests form a series. See details in Status report mechanism (with delays)

Background task

The system cannot perform some tasks immediately, so it evaluates them with some delay. Once such a task is initiated, the client periodically calls the server request job_status to check whether the task is done.
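
The polling pattern can be sketched as below. The shape of the job_status reply is not described here, so the sketch takes a caller-supplied fetch_status function and assumes it returns None while the task is still running; wait_for_task and its parameters are hypothetical.

```python
import time


def wait_for_task(fetch_status, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll fetch_status() until it returns a non-None result."""
    for _ in range(max_polls):
        result = fetch_status()
        if result is not None:
            return result
        sleep(interval)  # wait before asking again
    raise TimeoutError("task did not finish within the polling budget")


# Stand-in for a client call to job_status: not ready twice, then done.
replies = iter([None, None, {"done": True}])
outcome = wait_for_task(lambda: next(replies), interval=0.0)
```

Limiting the number of polls keeps a client from waiting forever if a task never completes.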

Internal UI

A variant of the system's front end that is used in deep development of the system. It is more "primitive" than the NextGen UI, but it covers the whole functionality supported by REST UI. The Internal UI supports only the Chrome and Firefox browsers and is less convenient to use. It serves as a stopgap while the NextGen Front-End is being developed to its proper state

Anti-cache mechanism

The internal UI uses some files (with extensions *.js and *.css) that are checked out from the repository, so they can change after an update from the repository. If the UI used these files directly, the user’s browser might ignore a change and keep using an outdated cached copy instead of the fresh version. The workaround is to create a mirror directory and copy all the necessary files into it under slightly modified names, so that different versions of the same file have different names. See the mirror-ui configuration setting.
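
One way to obtain version-dependent names is to embed a short content hash in each file name. The sketch below illustrates that idea only; the naming scheme actually used by the system is not specified here, and mirror_files is a hypothetical helper.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def mirror_files(src_dir, mirror_dir, suffixes=(".js", ".css")):
    """Copy files into mirror_dir under content-dependent names.

    Returns a mapping {original name: mirrored name}.
    """
    src, dst = Path(src_dir), Path(mirror_dir)
    dst.mkdir(parents=True, exist_ok=True)
    mapping = {}
    for path in sorted(src.iterdir()):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # A changed file gets a new digest and therefore a new name.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()[:8]
        mirrored = f"{path.stem}.{digest}{path.suffix}"
        shutil.copyfile(path, dst / mirrored)
        mapping[path.name] = mirrored
    return mapping


# Demonstration in a temporary directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    ui = Path(tmp, "ui")
    ui.mkdir()
    (ui / "app.js").write_text("console.log(1);")
    first = mirror_files(ui, Path(tmp, "mirror"))
    (ui / "app.js").write_text("console.log(2);")
    second = mirror_files(ui, Path(tmp, "mirror"))
```

Since the browser has never seen the new name, it cannot serve the file from its cache and has to request the fresh version.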

Annotation pipeline

The process of preparing primary dataset information; it must be run before the dataset is created in the system. See Administration file formats reference for details.

Annotated JSON file

Result of the annotation pipeline, usually in one of the following formats: *.json, *.json.gz, *.json.bz2. See details in Annotated JSON file format overview
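
A minimal sketch of opening a file in any of the three listed formats as a text stream, choosing the decompressor by extension. The open_annotated helper is hypothetical, and the sketch makes no assumption about the structure of the JSON content itself.

```python
import bz2
import gzip
import os
import tempfile


def open_annotated(path):
    """Open a *.json, *.json.gz or *.json.bz2 file as a text stream."""
    path = str(path)
    if path.endswith(".json.gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith(".json.bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    if path.endswith(".json"):
        return open(path, "r", encoding="utf-8")
    raise ValueError(f"unsupported extension: {path}")


# Demonstration with a made-up compressed file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.json.gz")
    with gzip.open(sample, "wt", encoding="utf-8") as out:
        out.write('{"key": "value"}\n')
    with open_annotated(sample) as inp:
        text = inp.read()
```

Reading through a stream rather than loading the whole file at once is convenient for large annotated files.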