The Union Ministry of Science and Technology has recently dedicated to the nation India’s first digitised repository for life science data, the Indian Biological Data Center (IBDC), at the Regional Centre for Biotechnology (RCB), Faridabad, Haryana.
This will enable researchers to store biological data from publicly funded research, reducing their dependency on American and European data banks.
- The biobank now contains data including 200 human genomes sequenced as part of the ‘1000 Genomes Project’, an international endeavour to map genetic variations in people.
- In addition, the database comprises the majority of the 2.6 lakh SARS-CoV-2 genomes sequenced by the Indian SARS-CoV-2 Genomics Consortium (INSACOG).
- The database will also house 25,000 Mycobacterium tuberculosis sequences, helping researchers understand the spread of multidrug-resistant and extensively drug-resistant TB in the country and aiding the search for targets for new medicines and vaccines.
- Currently, the database contains the genomic sequences of crops such as rice, onion, tomatoes, and mustard, among others.
- The presence of genomes from humans, animals and bacteria in the same database would also aid researchers in the study of zoonotic diseases (spread from animals to humans).
- It is probable that the database will later be expanded to store protein sequences (amino acids) as well as imaging data such as ultrasound and MRI images.
About the Indian Biological Data Centre (IBDC) –
- According to the Government of India’s BIOTECH-PRIDE guidelines, IBDC is mandated to archive all life science data (in digitised form) generated from publicly funded research in India.
- Until now, India has had no specific guidelines for the storage, access and sharing of biological data.
- The BIOTECH-PRIDE guidelines will fill this gap and enable the exchange of information among research groups across the country to promote research and innovation.
- These guidelines will be implemented through the Indian Biological Data Centre (IBDC) at Regional Centre for Biotechnology supported by the Department of Biotechnology.
- The data centre, which is India’s first national repository for life science data, is supported by the Department of Biotechnology (DBT) in collaboration with the National Informatics Centre (NIC), India.
- The digitised data will be stored on a four-petabyte (1 petabyte = 10,00,000 gigabytes (GB)) supercomputer called ‘Brahm’.
- The biobank also has a backup data ‘Disaster Recovery’ site at National Informatics Centre (NIC)-Bhubaneswar.
- The database offers researchers two mechanisms for data submission: open access (the data can be used by other researchers across the country) and controlled access (the data will not be openly shared for a number of years).
- Need —
- Currently, most Indian researchers rely on the European Molecular Biology Laboratory (EMBL) and the National Center for Biotechnology Information to store biological data.
- Other, smaller datasets are held at some institutes, but they are not accessible to all researchers.
- Objectives —
- Provide an IT platform for permanent archiving of biological data in India.
- Development of SOPs for storing and sharing the data as per FAIR (findability, accessibility, interoperability and reusability) principles.
- Perform quality control, data backup and management of data life cycle.
- Development of web-based tools, organisation of training programs on ‘Big’ data analysis and benefits of data sharing.
- Significance —
- Because of the heterogeneity of life science data, IBDC is being built in a decentralised manner, with different sections dealing with specific types of data.
- It would also provide infrastructure and expertise for biological data analysis, aiding in knowledge discovery of numerous genetic diseases, vaccines and medicines.
- IBDC thus aims to meet the needs of not just the Indian but also the global scientific community.