Administration file formats reference

Configuration of the service

The main control file of the service is anfisa.json.

Primary dataset information

Every primary dataset is built from data prepared by a preliminary annotation pipeline. The annotation pipeline is therefore a necessary part of the Anfisa project, but its description is outside the scope of this documentation: the Anfisa service uses only the results of the pipeline, in the form of annotated JSON files, usually with one of the following extensions: *.json, *.json.gz, *.json.bz2 (for details of the format, see Annotated JSON file format overview).
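As an illustration of the three extensions listed above, the following minimal sketch opens an annotated JSON file as a text stream regardless of compression. The function name open_annotated_json is hypothetical and is not part of Anfisa; the sketch uses only the Python standard library, assumes UTF-8 text, and says nothing about the internal layout of the file.

```python
import bz2
import gzip
from pathlib import Path


def open_annotated_json(path):
    """Open *.json, *.json.gz or *.json.bz2 as a UTF-8 text stream.

    Hypothetical helper for illustration only, not an Anfisa API.
    """
    name = Path(path).name.lower()
    if name.endswith(".json.gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if name.endswith(".json.bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    if name.endswith(".json"):
        return open(path, "r", encoding="utf-8")
    # Any other extension is outside the formats named in this section.
    raise ValueError(f"unsupported annotated JSON extension: {path}")
```

The returned object can be used in a with statement and read like any ordinary text file.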

  • The simplest way to create a dataset uses only the annotated JSON file: run app.storage - Dataset creation and upload with the --source option (the --config option is required, and the --kind option matters for an XL-dataset)

  • A more flexible way to create a dataset uses the Inventory (*.CFG) file format: run app.storage - Dataset creation and upload with the --inv option (the --config and --kind options play the same role as above)

  • Both ways above operate on a single dataset, but there is also an integrated solution. The system administrator can create a file in the storage.dir file format, collect in it all information on the primary datasets being maintained, and pass the path to this file as the --dir option.

In this case the options --config, --source, --inv and --kind are meaningless, because all the corresponding pieces of information are retrieved from the storage.dir file.
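The relationship between these options can be summarized in a short sketch. The helper below is hypothetical and is not part of app.storage: it only assembles a list of the options named in this section and rejects combinations that the text describes as meaningless. Treating --source and --inv as mutually exclusive, and raising an error rather than ignoring extra options, are assumptions of the sketch, as are the file names and the "xl" value used in the usage note. How app.storage itself is launched, and any further arguments it takes, are not covered here.

```python
def storage_options(config=None, source=None, inv=None, kind=None, directory=None):
    """Build the option list for one of the three dataset-creation ways.

    Hypothetical helper for illustration; the real tool may behave differently.
    """
    if directory is not None:
        # With --dir, all other information comes from the storage.dir file.
        if any(v is not None for v in (config, source, inv, kind)):
            raise ValueError(
                "--dir cannot be combined with --config, --source, --inv or --kind"
            )
        return ["--dir", directory]
    if config is None:
        raise ValueError("--config is required when --dir is not used")
    if (source is None) == (inv is None):
        # Assumption of this sketch: exactly one input description is given.
        raise ValueError("give exactly one of --source or --inv")
    args = ["--config", config]
    if source is not None:
        args += ["--source", source]
    else:
        args += ["--inv", inv]
    if kind is not None:
        args += ["--kind", kind]
    return args
```

For example, storage_options(config="anfisa.json", source="sample.json.gz", kind="xl") yields the options for the first way, while storage_options(directory="storage.dir") yields only the --dir option.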