Repeat Explorer
Galaxy is a web-based platform for running computational and statistical analyses, with a focus on openness and the use of FAIR data. It originated in biomedical science but now spans numerous scientific domains, including ecology, natural language processing, chemistry, climate science, and social sciences.
A worldwide network of Galaxy servers (“instances” of the service) provides open access to virtually all academic users. Some of the major ones are hosted in the United States, the EU, and Australia, and numerous specialized instances exist as well.
Many quickstart and advanced tutorials are available on the Galaxy Training Network.
Metacentrum currently maintains 3 independent Galaxy servers: usegalaxy.cz, RepeatExplorer, and UMSA.
RepeatExplorer
RepeatExplorer is a domain-specific Galaxy instance that includes utilities for graph-based clustering and characterization of repetitive sequences in next-generation sequencing data, as well as tools for detecting transposable element protein coding domains.
We maintain this Galaxy instance for our partners at the Institute of Plant Molecular Biology.
The RepeatExplorer Galaxy environment is available at https://repeatexplorer-elixir.cerit-sc.cz/galaxy/.
Registration
- Visit the registration URL.
- Select the account that you will use for registration:
  - If you have access to an eduID.cz identity (Czech academia), prefer that.
  - Otherwise, use Elixir/LS Login or another identity provider from the list.
- Log in to your selected account and agree when asked to share information with Perun.
- (Optional) If a similar user already exists in Perun, you will be asked to prove your identity by logging in to that existing account. Do this only if the account is truly yours and you want to use it; you will need its correct username and password. If you do not want to use the existing account, or the user Perun found is not you, click the “It is not me” button.
- Fill in the application form for Elixir CZ IT services.
- Next, you will see a form for choosing a username and password for RepeatExplorer Galaxy. If you chose to use an existing account in the previous steps, these fields will be pre-filled.
- Congratulations, you are now registered! If you changed your email address during registration, you will have to verify it, so check your inbox.
- Allow up to 30 minutes for the registration to propagate before you can log in to Galaxy.
User Quotas
The RepeatExplorer Galaxy server offers a free storage quota of 200 GB to every registered user. If your research requires more storage, contact us at regalaxy@rt.cesnet.cz with a description of your needs.
There is also a limit on the number of jobs a user can run concurrently. On the RepeatExplorer instance this limit is currently set to 5 jobs. Again, contact us if this is not sufficient for your needs.
The maximum size of a single dataset is limited to 250 GB.
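The limits above can be checked locally before starting a large upload. The following is only a minimal sketch: the function and constant names are hypothetical, and it assumes that "GB" means decimal gigabytes, which may differ from how the server counts storage.

```python
# Limits taken from the text above.
# Assumption: decimal gigabytes; the server may count differently.
GB = 10**9
STORAGE_QUOTA = 200 * GB      # free storage quota per user
MAX_DATASET_SIZE = 250 * GB   # maximum size of a single dataset


def check_upload(sizes, used=0):
    """Return a list of problems for a planned upload (hypothetical helper).

    sizes: mapping of file name to size in bytes
    used:  bytes already stored in the account
    """
    problems = []
    # Each file becomes one dataset, so check it against the dataset limit.
    for name, size in sizes.items():
        if size > MAX_DATASET_SIZE:
            problems.append(f"{name}: exceeds the single-dataset limit")
    # The sum of existing and new data must fit into the storage quota.
    if used + sum(sizes.values()) > STORAGE_QUOTA:
        problems.append("total: exceeds the storage quota")
    return problems
```

An empty list means the planned upload fits within the stated limits; sizes for local files can be obtained with `os.path.getsize`.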
FTP Access
RepeatExplorer’s FTP server runs at repeatexplorer-elixir.cerit-sc.cz on port 990 and uses the same username and password as RepeatExplorer Galaxy itself.
To learn how to connect to the server and import data into your history, follow the process described in the docs.
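For scripted uploads, a connection can be sketched with Python's standard `ftplib`. This is an illustration under stated assumptions, not the documented procedure: port 990 is conventionally used for implicit FTPS, so the sketch assumes the server expects TLS from the first byte. The `ImplicitFTP_TLS` class and the `upload` helper are hypothetical names; the standard library itself only implements explicit FTPS, hence the small subclass.

```python
import ftplib
import ssl
from pathlib import Path

HOST = "repeatexplorer-elixir.cerit-sc.cz"  # from the text above
PORT = 990                                   # from the text above


class ImplicitFTP_TLS(ftplib.FTP_TLS):
    """FTP_TLS variant that starts TLS immediately after connecting.

    Assumption: the server expects implicit FTPS (the usual meaning of
    port 990). Adjust or drop this class if the server behaves differently.
    """

    _sock = None

    @property
    def sock(self):
        return self._sock

    @sock.setter
    def sock(self, value):
        # Wrap the plain control socket in TLS as soon as ftplib creates it.
        if value is not None and not isinstance(value, ssl.SSLSocket):
            value = self.context.wrap_socket(value, server_hostname=self.host)
        self._sock = value


def upload(path, username, password, host=HOST, port=PORT):
    """Upload one local file to the user's FTP directory (hypothetical helper)."""
    path = Path(path)
    ftp = ImplicitFTP_TLS(context=ssl.create_default_context())
    ftp.connect(host, port)
    try:
        ftp.login(username, password)  # same credentials as for Galaxy
        ftp.prot_p()                   # encrypt the data channel as well
        with path.open("rb") as handle:
            ftp.storbinary(f"STOR {path.name}", handle)
    finally:
        ftp.close()
```

A call such as `upload("reads.fastq", "my_username", "my_password")` would then transfer the file; importing it into a history still happens in the Galaxy interface as described in the docs.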
Citing RepeatExplorer
Dear users of RepeatExplorer, please include the following acknowledgement in publications that used our infrastructure:
Computational resources were provided by the ELIXIR-CZ project (LM2015047), part of the international ELIXIR infrastructure.
Primary Publications
Novak, P., Neumann, P., Macas, J. (2020) - Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2. Nature Protocols 15:3745–3776, https://doi.org/10.1038/s41596-020-0400-y.
Novak, P., Neumann, P., Pech, J., Steinhaisl, J., Macas, J. (2013) - RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics 29:792–793, https://doi.org/10.1093/bioinformatics/btt054.
Classification of repetitive elements using REXdb:
Neumann, P., Novak, P., Hostakova, N., Macas, J. (2019) - Systematic survey of plant LTR-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification. Mobile DNA 10:1, https://doi.org/10.1186/s13100-018-0144-1.
The principle of repeat identification implemented in RepeatExplorer:
Novak, P., Neumann, P., Macas, J. (2010) - Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics 11:378, https://doi.org/10.1186/1471-2105-11-378.
Using TAREAN for satellite repeat detection and characterization:
Novak, P., Robledillo, L.A., Koblizkova, A., Vrbova, I., Neumann, P., Macas, J. (2017) - TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Research 45:e111, https://doi.org/10.1093/nar/gkx257.
Novak, P., Hostakova, N., Neumann, P., Macas, J. (2024) - DANTE and DANTE_LTR: computational pipelines implementing lineage-centered annotation of LTR-retrotransposons in plant genomes. NAR Genomics and Bioinformatics 6(3):lqae113, https://doi.org/10.1093/nargab/lqae113.
