RepeatMasker
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences.
The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked.
Currently over 56% of human genomic sequence is identified and masked by the program.
Usage
Submodules
RepeatMasker uses the following submodules:
- Tandem Repeats Finder version 4.0.9
- RMBlast search engine version 2.6.0
- Repbase Update Database of Repetitive DNA version 20170201 (February 2017)
License
To use RepeatMasker and its submodules, you must first accept the following license terms:
- RepeatMasker is licensed under the Open Software License v2.1.
- Tandem Repeats Finder is licensed under the following terms:
- RMBlast is licensed under the Open Software License v2.1.
- Repbase Update is a database of repetitive DNA published by the Genetic Information Research Institute. You may use the content of the database free of charge under the following conditions:
You won’t be able to use this program without a license agreement.
Basic usage
Warning
Load the repeatmasker software module, then run RepeatMasker (e.g. with the sample input file my_repeatmasker_sample.fasta):
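A minimal sketch of these two steps. It assumes the site uses the Environment Modules `module add` command (some installations use `module load` instead) and that the module is named `repeatmasker`, as stated above. The guards only keep the snippet from failing where the software is not installed.

```shell
# Sample input file named in the text above
INPUT=my_repeatmasker_sample.fasta

# Assumption: modules are loaded with `module add`; skipped if `module` is unavailable
if command -v module >/dev/null 2>&1; then
    module add repeatmasker || echo "module repeatmasker is not available" >&2
fi

# Run RepeatMasker with default settings if it is on PATH
if command -v RepeatMasker >/dev/null 2>&1; then
    RepeatMasker "$INPUT"
    STATUS=$?
else
    echo "RepeatMasker not found on PATH; load the module first" >&2
    STATUS=127
fi
```

By default RepeatMasker writes its results next to the input file, typically including the repeat annotation (`.out`) and the masked sequence (`.masked`).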
Interactive job
An interactive job can be started as follows:
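A sketch of such a request, assuming the cluster runs a PBS Professional style scheduler, where `qsub -I` opens an interactive session. The resource values are placeholders rather than recommendations, and the command will differ on other schedulers.

```shell
# Placeholder resources (assumptions): 1 CPU, 4 GB RAM, 1 hour of walltime
RESOURCES="select=1:ncpus=1:mem=4gb"
WALLTIME="walltime=1:00:00"

# Assumption: PBS scheduler; -I requests an interactive session
if command -v qsub >/dev/null 2>&1; then
    qsub -I -l "$RESOURCES" -l "$WALLTIME"
else
    echo "qsub not found; run this on a submit (frontend) node" >&2
fi
```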
You are then redirected to a specific machine, where you can run RepeatMasker with the my_repeatmasker_sample.fasta input file as follows (and then exit from the machine):
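Inside the session, the run could look like the sketch below. The module command is an assumption about the site setup. `-pa` (number of parallel batch jobs) and `-dir` (output directory) are standard RepeatMasker options, but the values and the directory name used here are illustrative only.

```shell
INPUT=my_repeatmasker_sample.fasta
OUTDIR=repeatmasker_out   # hypothetical output directory name
WORKERS=1                 # raise only if more CPUs were requested for the session

mkdir -p "$OUTDIR"

# Assumption: modules are loaded with `module add`
if command -v module >/dev/null 2>&1; then
    module add repeatmasker || echo "module repeatmasker is not available" >&2
fi

# Screen the input and write all result files into OUTDIR
if command -v RepeatMasker >/dev/null 2>&1; then
    RepeatMasker -pa "$WORKERS" -dir "$OUTDIR" "$INPUT"
else
    echo "RepeatMasker not found on PATH; load the module first" >&2
fi

# When finished, type `exit` to leave the interactive session
```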
Note that an interactive job does not bring any significant speed-up compared to running RepeatMasker locally on your machine unless parallelism is used. Use an interactive job to test that your job runs correctly (strongly recommended); once it succeeds, switch to running it as a batch job (see below).
Batch job
This is the preferred way of running jobs. Create the shell script my_repeatmasker_script.sh with the following content:
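One possible version of the script is sketched below, written out with a here-document so that it can be created from the command line. The `#PBS` directives assume a PBS Professional style scheduler, and the job name, resource values, worker count, and `results` directory are placeholders to adapt.

```shell
cat > my_repeatmasker_script.sh <<'EOF'
#!/bin/bash
#PBS -N repeatmasker
#PBS -l select=1:ncpus=4:mem=8gb
#PBS -l walltime=4:00:00

# Start in the directory the job was submitted from (PBS sets PBS_O_WORKDIR)
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Assumption: the site uses Environment Modules
module add repeatmasker

# Parallel batch jobs (placeholder); keep consistent with the CPUs requested above
mkdir -p results
RepeatMasker -pa 4 -dir results my_repeatmasker_sample.fasta
EOF

# Check the generated script for syntax errors without running it
bash -n my_repeatmasker_script.sh
```

Writing results to a separate directory with `-dir` keeps the output files apart from the input data.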
Submit the script with a command such as:
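For example, assuming a PBS scheduler, where `qsub` submits a job script and prints the identifier of the new job:

```shell
SCRIPT=my_repeatmasker_script.sh

# Assumption: PBS scheduler; other schedulers use a different submit command
if command -v qsub >/dev/null 2>&1; then
    JOB_ID=$(qsub "$SCRIPT") && echo "Submitted job: $JOB_ID"
else
    echo "qsub not found; run this on a submit (frontend) node" >&2
fi
```

With PBS, the state of the job can then be checked with `qstat`.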