Analysis of low-complexity protein sequence families
| ABG-134633 | Thesis topic | |
| 01/12/2025 | Doctoral contract |
- Biology
- Computer science
Topic description
In this project you will integrate experimental data from the scientific literature with diverse functional and structural annotations found in established biological databases (such as UniProtKB, ELM, DisProt, InterPro, and PhaSepDB, among others). Redundant or overlapping annotations will be unified and organized in a hierarchical structure based on Gene Ontology and LCRAnnotationsDB categories. This will improve the accessibility and interpretability of LCR-related data, so that researchers can efficiently search for biologically meaningful patterns across multiple annotation types.
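One simple form of unifying overlapping annotations is to merge coordinate spans that overlap on the same protein and in the same category. The sketch below illustrates that idea only; it is not the LCRAnnotationsDB procedure. The record layout (protein ID, category, start, end, source), the function name, and the choice to also merge directly adjacent spans are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_overlapping(annotations):
    """Merge overlapping or adjacent spans per (protein, category).

    `annotations` is an iterable of (protein_id, category, start, end, source)
    tuples with 1-based inclusive coordinates. This layout is hypothetical.
    Returns a list of (protein_id, category, start, end, sources) tuples,
    where `sources` is a sorted list of the contributing databases.
    """
    grouped = defaultdict(list)
    for protein_id, category, start, end, source in annotations:
        grouped[(protein_id, category)].append((start, end, source))

    merged = []
    for (protein_id, category), spans in sorted(grouped.items()):
        spans.sort()
        cur_start, cur_end, sources = spans[0][0], spans[0][1], {spans[0][2]}
        for start, end, source in spans[1:]:
            if start <= cur_end + 1:  # overlapping or directly adjacent
                cur_end = max(cur_end, end)
                sources.add(source)
            else:
                merged.append(
                    (protein_id, category, cur_start, cur_end, sorted(sources))
                )
                cur_start, cur_end, sources = start, end, {source}
        merged.append((protein_id, category, cur_start, cur_end, sorted(sources)))
    return merged
```

Keeping the list of contributing sources on each merged span preserves the evidence trail, which matters when annotations from several databases are collapsed into one record.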
The core aim is to generate new scientific insight by associating LCRs with their functions through a combination of expert curation and modern machine learning approaches, further enhancing the depth of the database to benefit the scientific community. To associate LCRs with specific functions, the project will:
• Combine manual expert curation (reviewing and annotating experimental findings from the literature) with advanced machine learning approaches to extend functional annotation of LCRs.
• Apply clustering algorithms to group LCRs by similar annotation profiles or biological properties, forming “cluster clans” that represent functionally coherent subgroups.
• Integrate both curated knowledge and algorithmic insights into the expanded LCRAnnotationsDB, so that users can retrieve clusters, clan information, and evidence linking LCRs to specific functions.
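The clustering step listed above can be pictured with a small example. The sketch below represents each LCR by a set of annotation terms, compares LCRs by Jaccard similarity, and groups them by single-linkage clustering at a fixed threshold. These are stand-in choices for illustration, not the algorithms the project will use; the LCR identifiers, the terms, and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of annotation terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_profile(profiles, threshold=0.5):
    """Single-linkage clustering: link two LCRs when Jaccard >= threshold.

    `profiles` maps a (hypothetical) LCR identifier to a set of annotation
    terms. Returns a sorted list of clusters, each a sorted list of IDs.
    """
    parent = {lcr: lcr for lcr in profiles}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for lcr in profiles:
        clusters.setdefault(find(lcr), []).append(lcr)
    return sorted(sorted(members) for members in clusters.values())
```

Single linkage is easy to reason about but can chain loosely related items into one group, so a real analysis would likely compare several clustering methods and thresholds.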
Project Impact
Notably, aside from well-studied homorepeats or selected cases, much of the functional and structural space of LCRs remains unexplored. This project aims to address this gap by systematically combining dispersed data from various databases and literature, thereby enabling the scientific community to uncover new LCR functions and biological roles.
Contribution to the Community
The improved LCRAnnotationsDB will be made freely available, providing an invaluable resource for protein research, functional genomics, and bioinformatics. By integrating curated annotations with clusters, this project will accelerate discoveries in the field of protein low complexity regions. The improvements will also include dynamic, user-friendly visualization features for interactively exploring LCR annotations and relationships within proteins. In addition, tools and interfaces will be developed to allow regular updates of the underlying databases and annotation sources, so that annotation coverage reflects the latest releases of UniProtKB, InterPro, DisProt, and others.
Host institution and laboratory
The Institute of Biochemistry and Biophysics of the Polish Academy of Sciences is one of the leading research institutes in the field of life sciences in Poland. Our mission is to carry out basic research in various areas of biology, biophysics, biochemistry, and genetics. We also understand the need to move basic science discoveries efficiently into practice, and we are therefore constantly increasing our efforts to transfer research results to industry and the clinic.
Our scientists perform innovative, ground-breaking research that is published in high impact, world-renowned journals. In our research we employ several model organisms such as bacteria, fungi, plants and animals, and use a wide range of modern biochemical, biophysical, bioinformatics, genetic and molecular methods.
Moreover, our Institute manages the Polish Henryk Arctowski Antarctic Station, located on King George Island, off the coast of Antarctica. The station provides an opportunity to carry out research not only by our employees but also by scientists from all over the world.
The quality and originality of our research are well recognized both in Poland and abroad; as a result, our scientists frequently receive prestigious prizes. We have also been successful in acquiring multiple research funds from national and international sources. Finally, we have developed a rich collaboration network that spans the entire world.
Since our vision is to be a leading, innovative research centre that is able to solve fundamental biological problems, we are willing to host talented, open-minded scientists who are eager to push the boundaries of knowledge.
LCR-LAB (lcr-lab.org)
The LCR Lab develops state-of-the-art tools for generation and analysis of protein low complexity regions, which we make freely available to the scientific community. Low complexity regions (LCRs) are characterised by a low level of amino acid diversity.
Our research group is composed of two cooperating teams from The Silesian University of Technology (Gliwice) and The Institute of Biochemistry and Biophysics PAS (Warsaw), Poland.
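The definition of low complexity given above (a low level of amino acid diversity) is often quantified with Shannon entropy over a sequence window. The sketch below is a minimal illustration of that idea and not a description of the LCR Lab tools; the function names, the window length of 12, and the entropy cut-off of 2.2 bits are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of `segment`."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_entropy_windows(sequence, window=12, max_entropy=2.2):
    """Return 0-based start positions of windows with entropy <= max_entropy.

    `window` and `max_entropy` are illustrative defaults, not fixed standards.
    """
    return [
        i
        for i in range(len(sequence) - window + 1)
        if shannon_entropy(sequence[i:i + window]) <= max_entropy
    ]
```

A homorepeat such as a run of glutamines has zero entropy, while a window of twelve distinct residues reaches log2(12), about 3.58 bits.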
Candidate profile
1. Holds a degree of Master of Science [Magister], Master of Engineering [Magister Inżynier], medical doctor, or equivalent in exact sciences, natural sciences, medical sciences, or related disciplines, granted by a Polish or foreign university. A person who does not yet hold these qualifications may take part in the competition but must obtain them and provide the relevant documents before the start of the programme at the Doctoral School. Education at the Doctoral School begins on 1st March 2026.
2. Familiar with a Linux environment, basic Bash commands, and scripting in Python (and/or another language).
3. Familiar with protein sequence and function analyses (e.g. manual annotation, database data analysis).
4. Knowledge of biological databases and pipelines.
5. Experience in working with phylogenetic/genomic/transcriptomic/systems biology tools will be a bonus.
6. Well organized, eager to learn, and ready to assimilate literature.
7. Fluent in spoken and written English.

