Equipe de Bioinformatique Génomique et Moléculaire

Ecole Normale Supérieure - INSERM - U436

What is ASSIRC?

ASSIRC ('Accelerated Search for SImilarity Regions in Chromosomes') is a tool for finding regions of similarity in genomic sequences. The method involves three steps: (i) identification of short exact strings of fixed size, called 'seeds', common to both sequences, using hashing functions; (ii) extension of these seeds into putative regions of similarity by a 'random walk' procedure; (iii) final selection of regions of similarity by assessing alignments of the putative sequences. The pairs of regions are recorded in an internal database that uses a Red-Black Tree structure for rapid access, joins adjacent pairs together to take gaps into account, and checks whether a seed is already included in a previously recorded pair to reduce computation.
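Step (i) can be illustrated with a minimal Python sketch. This is not the ASSIRC implementation (the package itself is written in Ada): the function name `find_seeds`, the parameter `k`, and the use of a Python dictionary as the hash table are assumptions made for illustration only.

```python
from collections import defaultdict


def find_seeds(seq_a, seq_b, k):
    """Return sorted (i, j) pairs such that seq_a[i:i+k] == seq_b[j:j+k].

    Illustrative only: a dictionary stands in for the hashing
    functions mentioned in the text.
    """
    # Index every word of length k in the first sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Look up every word of length k of the second sequence in that index.
    seeds = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            seeds.append((i, j))
    return sorted(seeds)


if __name__ == "__main__":
    # The only word of length 4 shared by the two sequences is "ACGT".
    print(find_seeds("ACGTAC", "TTACGT", 4))
```

Each returned pair gives the start of a shared word in the first and in the second sequence. In the method described above, such pairs would then be passed to the extension step (ii), which this sketch does not cover.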

Different strategies for improving this program by distributing the operations and data across multiple processors have been developed in the program D-ASSIRC. It is based on three alternative strategies of task sharing: (1) a distributed search based on splitting the studied sequences into large overlapping subsequences (strategy ASS); (2) two distributed searches for repeated exact motifs of fixed size, either managed by a central processor (strategy AGD) or locally managed by numerous processors (strategy ALD).
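The splitting used by strategy ASS can be pictured with the short Python sketch below. It is not the D-ASSIRC code: the function name `split_overlapping` and the parameters `chunk_size` and `overlap` are hypothetical, and how D-ASSIRC actually chooses the subsequence size and overlap is not stated here.

```python
def split_overlapping(seq, chunk_size, overlap):
    """Split seq into (offset, subsequence) pairs that overlap by `overlap` symbols.

    Hypothetical helper: each pair could be handed to a separate
    processor, the offset mapping local positions back to seq.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, seq[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the sequence.
        if start + chunk_size >= len(seq):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for offset, chunk in split_overlapping("ABCDEFGHIJ", 4, 2):
        print(offset, chunk)
```

With an overlap of at least the seed size minus one, every fixed-size word of the original sequence lies entirely inside at least one subsequence, so no seed is lost at a boundary in this simplified setting.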

Here are the slides presented at ALBIO.

Related publications

More about the package

The package is written in Ada and has been reported to run successfully on various Unix platforms (Linux, Solaris). The package consists of 3 main programs:

Availability

Source programs are freely available at the following address: ftp://ftp.biologie.ens.fr/pub/molbio/assirc.tar.gz

----------------------------

Materials on this site are copyrighted and maintained by P. Vincens <vincens@biologie.ens.fr>.
Last modified: Wed Apr 3 17:33:03 MEST 2002