Equipe de Bioinformatique Génomique et Moléculaire

Ecole Normale Supérieure - INSERM - U436

What is ASSIRC?

ASSIRC ('Accelerated Search for SImilarity Regions in Chromosomes') is a tool for finding regions of similarity in genomic sequences. The method involves three steps: (i) identification of short exact chains of fixed size, called 'seeds', common to both sequences, using hashing functions; (ii) extension of these seeds into putative regions of similarity by a 'random walk' procedure; (iii) final selection of regions of similarity by assessing alignments of the putative sequences. The pairs of regions are recorded in an internal database that uses a Red-Black Tree structure for rapid access. Adjacent pairs are joined together to take gaps into account, and each seed is checked for inclusion in a previously recorded pair to reduce the computation.
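Step (i) can be illustrated with a minimal Python sketch. It is not taken from the ASSIRC sources (which are written in Ada): the function name find_seeds, the seed size k=8 and the use of a Python dictionary as the hash table are all assumptions made for illustration.

```python
from collections import defaultdict


def find_seeds(seq_a, seq_b, k=8):
    """Return (i, j) pairs such that seq_a[i:i+k] == seq_b[j:j+k].

    Hypothetical helper: indexes every k-letter word of seq_a in a
    hash table, then looks up every k-letter word of seq_b.
    """
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    seeds = []
    for j in range(len(seq_b) - k + 1):
        # .get() avoids creating empty entries for words absent from seq_a
        for i in index.get(seq_b[j:j + k], ()):
            seeds.append((i, j))
    return seeds
```

Each returned pair gives the start positions of one exact seed in the two sequences. Seeds lying on the same diagonal (equal i - j) are natural candidates for the joining of adjacent pairs described above.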

Different strategies for improving this program by distributing the operations and data across multiple processors have been developed in the program D-ASSIRC. It is based on three alternative strategies of task sharing: (1) a distributed search in which the studied sequences are split into large overlapping subsequences (strategy ASS); (2) two distributed searches for repeated exact motifs of fixed size, managed either by a central processor (strategy AGD) or locally by numerous processors (strategy ALD).
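The splitting used by a strategy such as ASS can be sketched as follows. This is only an illustration in Python, not the D-ASSIRC implementation: the function name split_overlapping and the parameters chunk_size and overlap are assumptions, and the way D-ASSIRC actually chooses sizes and dispatches the pieces is not described here.

```python
def split_overlapping(seq, chunk_size, overlap):
    """Split seq into (offset, subsequence) pieces that overlap.

    Hypothetical helper: consecutive pieces share `overlap` letters,
    so a region shorter than the overlap that straddles a boundary
    is still contained whole in at least one piece.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, seq[start:start + chunk_size]))
        if start + chunk_size >= len(seq):
            break  # the last piece reaches the end of the sequence
        start += step
    return chunks
```

Keeping the offset with each piece lets positions found by a worker be translated back to coordinates in the full sequence.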

The slides used for the presentation at ALBIO are available here.

Related publications

Vincens P., Buffat L., André C., Chevrolat J.P., Boisvieux J.F. and Hazout S. (1998)
A strategy for finding regions of similarity in complete genome sequences. Bioinformatics 14 (8), 715-725 (Abstract).
Vincens P., Badel-Chagnon A., André C. and Hazout S. (2001)
D-ASSIRC: distributed program for finding sequence similarities in genomes. Bioinformatics 18 (3), 246-251 (Abstract).

More about the package

The package is written in Ada and has been reported to run successfully on various Unix platforms (Linux, Solaris). It consists of three main programs:

assirc : The program for finding pairs of similar regions.
dassirc : A set of programs for finding pairs of similar regions using distributed processing units.
yaap : A program for aligning sequences. Yaap (Yet Another Alignment Program) performs local or global alignment.
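To illustrate what a local alignment computes, here is a minimal Python sketch of the classic Smith-Waterman scoring recurrence. It is not yaap's code, and nothing here states which algorithm or scoring scheme yaap uses: the function name and the match, mismatch and gap values are assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman, linear gaps).

    Hypothetical helper with assumed scoring values; it returns only
    the score, not the alignment itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # 0 lets the alignment restart anywhere, which makes it local
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A global alignment differs mainly in that the scores are not clamped at zero and the first row and column are initialised with gap penalties, so the whole of both sequences must be aligned.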

Availability

Source programs are freely available at the following address: ftp://ftp.biologie.ens.fr/pub/molbio/assirc.tar.gz

Materials on this site are copyrighted and maintained by P. Vincens <vincens@biologie.ens.fr>.
Last modified: Wed Apr 3 17:33:03 MEST 2002