Running
To run the pyIPSA workflow, specify the input and output folders in the configuration file.
Open config/config.yaml in the pyIPSA directory and set your input and output folders.
Paths must be absolute or relative to the pyIPSA directory. The input folder must contain at least one alignment file (BAM).
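Before starting the workflow, it can help to confirm that the input folder resolves as expected and contains alignment files. The following is a minimal sketch, not part of pyIPSA: the helper names `resolve_folder` and `find_bam_files` are hypothetical, and it assumes that BAM files carry the `.bam` extension and sit directly inside the input folder.

```python
from pathlib import Path
from typing import List


def resolve_folder(folder: str, pyipsa_dir: str = ".") -> Path:
    """Return the folder as given if absolute, else relative to the pyIPSA directory."""
    path = Path(folder)
    return path if path.is_absolute() else Path(pyipsa_dir) / path


def find_bam_files(folder: str, pyipsa_dir: str = ".") -> List[Path]:
    """List files ending in .bam directly inside the input folder (assumed layout)."""
    input_dir = resolve_folder(folder, pyipsa_dir)
    return sorted(p for p in input_dir.glob("*.bam") if p.is_file())


if __name__ == "__main__":
    # "input" is a placeholder folder name, not necessarily the pyIPSA default.
    bam_files = find_bam_files("input")
    if not bam_files:
        print("No BAM files found; the workflow needs at least one.")
    for bam in bam_files:
        print(bam)
```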
To run pyIPSA, use the following command from the root directory:
$ snakemake --cores <number of cores>
To run in a cluster environment using Grid Engine:
$ snakemake --cluster qsub --jobs <number of jobs>
You can also create a custom config: copy the default config to the same folder and change the values you need. To run with a custom config:
$ snakemake --configfile config/my_config.yaml
The default config file config/config.yaml must be present alongside the custom one.
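One plausible reason the default file has to stay in place is that the workflow loads it first and the file passed with --configfile then overrides individual values. Under that assumption, the result resembles a recursive dictionary update. The sketch below only illustrates this idea: `merge_config` is a hypothetical helper, not Snakemake's or pyIPSA's code, and the values are placeholders.

```python
from copy import deepcopy


def merge_config(default: dict, custom: dict) -> dict:
    """Return a copy of `default` with values from `custom` layered on top."""
    merged = deepcopy(default)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key rather than replaced.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Placeholder values for illustration; they are not pyIPSA's defaults.
default_config = {"threads": 1, "gtag": False, "pooled": False}
custom_config = {"threads": 4}

merged = merge_config(default_config, custom_config)
print(merged)
```

In this picture, a custom config only needs to list the options that differ, and every other option falls back to the value in config/config.yaml.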
For other run options, consult the Snakemake documentation.
Configuration
config/config.yaml has many other useful options:
pooled - if True, merge junctions from all samples before retrieving sites
primary - if True, use only the primary alignment for multimapped reads
unique - if True, do not use multimapped reads
threads - number of threads used to read a single alignment file
min_offset - minimal offset when aggregating junctions
min_intron_length - minimal allowed length of a junction
max_intron_length - maximal allowed length of a junction
entropy - minimal value of entropy used for filtering out junctions or sites
total_count - minimal allowed count when filtering out junctions
gtag - if True, use only junctions with GT/AG splice sites
genome_filenames - stores full names of genomes
genome_urls - stores URLs to genome files
annotation_urls - stores URLs to annotation files
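To make the filtering options more concrete, here is a minimal sketch of how thresholds such as min_intron_length, max_intron_length, total_count, entropy and gtag could be applied to a single junction. It is not pyIPSA's implementation: the `Junction` record, its fields, the `passes_filters` helper, and the threshold values are all hypothetical, and the splice motif is assumed to be already strand-corrected.

```python
from dataclasses import dataclass


@dataclass
class Junction:
    """Hypothetical junction record; not pyIPSA's internal representation."""
    intron_length: int
    total_count: int
    entropy: float
    splice_motif: str  # donor + acceptor dinucleotides, e.g. "GTAG"


def passes_filters(junction: Junction, config: dict) -> bool:
    """Return True if the junction satisfies every threshold in `config`."""
    if junction.intron_length < config["min_intron_length"]:
        return False
    if junction.intron_length > config["max_intron_length"]:
        return False
    if junction.total_count < config["total_count"]:
        return False
    if junction.entropy < config["entropy"]:
        return False
    # With gtag enabled, keep only junctions with GT/AG splice sites.
    if config["gtag"] and junction.splice_motif != "GTAG":
        return False
    return True


# Placeholder thresholds for illustration; they are not pyIPSA's defaults.
example_config = {
    "min_intron_length": 40,
    "max_intron_length": 100000,
    "total_count": 1,
    "entropy": 1.5,
    "gtag": True,
}

if __name__ == "__main__":
    junction = Junction(intron_length=500, total_count=10, entropy=2.0, splice_motif="GTAG")
    print(passes_filters(junction, example_config))
```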