SeqFeatTools

Sequence feature tools

Author: Roman Jaksik

This is a set of applications used to gather and analyse various nucleotide sequence features. This package includes the following applications:

sf_aex.py
     This application extracts data from the Gene Expression Atlas using a file exported from the ArrayExpress website.
     The application returns a table with unique genes/transcripts as rows and specific experiment factors as columns.
sf_combine.py
     Combines all *.sf files based on the first column of the main_locus_file. If the main_locus_file is not
     provided, it instead combines all tables, keeping only genes that have feature values in all columns.
sf_cpg.py
     Calculates the number of CpG islands in the promoter of each gene based on the gene coordinates file and
     UCSC data from the cpgIslandExt.txt.gz file, which is downloaded automatically from the UCSC database.
sf_dep_test.py
     Tests the dependency between each pair of columns from the input file, starting from column number two.
sf_iso.py
     Analyzes a set of genome sequences in a specified directory using IsoFinder, combines all results and assigns iso
     scores to each gene based on its location
sf_mirna.py
     Counts the occurrences of miRNA binding sites for each RefSeq transcript from the specified loci list.
sf_mrnaseq.py
     Downloads the RefSeq mRNA sequence file from the NCBI server and strips additional information from the
     locus identifiers.
sf_promseq.py
     Downloads the promoter region sequence files (1k, 2k and 5k bp long) from the UCSC server and strips
     additional information from the locus identifiers.
sf_uscs_feat.py
     Extracts various RefSeq transcript features from the latest human refGene.txt.gz file downloaded from
     the UCSC server
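
The two combining modes described for sf_combine.py can be illustrated with a minimal sketch. This is not the
actual implementation: it assumes each table is already loaded as a mapping from locus to feature value, and the
function name combine_tables, the locus names and the values are all hypothetical. Reporting a missing value as
None in the main-locus mode is also an assumption.

```python
def combine_tables(tables, main_loci=None):
    """Combine per-feature tables keyed by locus (illustrative sketch only).

    tables:    list of dicts mapping locus -> feature value (assumed layout).
    main_loci: optional ordered list of loci, standing in for the first
               column of a main locus file.
    """
    if main_loci is not None:
        # Main-locus mode: rows follow the given locus list.
        loci = list(main_loci)
    elif tables:
        # Intersection mode: keep only loci that have a value in every table.
        common = set(tables[0])
        for table in tables[1:]:
            common &= set(table)
        loci = sorted(common)
    else:
        loci = []
    # One row per locus, one column per table; absent values become None.
    return {locus: [table.get(locus) for table in tables] for locus in loci}


# Made-up example data, for illustration only.
gc_content = {"locus_A": 0.5, "locus_B": 0.4, "locus_C": 0.6}
length = {"locus_A": 1200, "locus_B": 3400}

print(combine_tables([gc_content, length]))
print(combine_tables([gc_content, length], main_loci=["locus_C", "locus_A"]))
```

Without a locus list, only loci present in both tables are kept; with one, every listed locus gets a row even if
some of its values are missing.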

The example usage of each application can be viewed by running the application without any parameters.
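
The behaviour of printing example usage when no parameters are given can be sketched as follows. This is only an
assumed pattern, not code taken from the package: the script name example_tool.py, the usage text and the main
function are hypothetical.

```python
USAGE = "Usage: example_tool.py <input_file> <output_file>"  # hypothetical usage text


def main(argv):
    """Return an exit status; argv[0] is the script name, as in sys.argv."""
    args = argv[1:]
    if not args:
        # No parameters given: show the example usage instead of running.
        print(USAGE)
        return 1
    # Placeholder for the real work a tool would do with its parameters.
    print("Would process:", ", ".join(args))
    return 0


# Simulate running the script with no parameters.
main(["example_tool.py"])
```

In a real script, main would typically be called with sys.argv and its return value passed to sys.exit.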

 

Python scripts: