Web Resources

From Bioinformatics Lab
Revision as of 19:20, 21 November 2016 by Il1001 (Talk | contribs)

Knowledgebase, GATEWAY DBs, Genome/Gene Annotations

Genome Sequencing/Re-sequencing Consortium Projects

  • Genome 10K Sequencing >16,000 vertebrates
  • i5K genome initiative Sequencing >5,000 insects and other arthropods
  • Bird 10K project Sequencing >10,000 bird species
  • 1000 Plants Sequencing 1,000 plant species
  • 1000 Genomes Sequencing 1,000 healthy people from various populations
  • UK10K Sequencing 10,000 people (4,000 healthy, 6,000 disease) in England
  • Genomics England Sequencing 100,000 people in England focusing on patients with a rare disease and their families and patients with cancer.

NGS data public depositories

  • SRA Sequence Read Archive by NCBI
  • ENA European Nucleotide Archive by EBI
  • GEO Gene Expression Omnibus (for processed data only)
  • ENCODE Encyclopedia of DNA Elements project (human)

Epigenomics and Regulomics Resources

Genomic Variation DBs

- Human genomics variations

  • ExAC (Exome Aggregation Consortium) Exome variation data from >60k individuals
  • DiscovEHR Collaboration between the Regeneron Genetics Center (WES) and Geisinger Health System (EHR); provides VCF data from 50,000 MyCode participants
  • European Variation Archive Most comprehensive and organized by study (includes clinical variants)
  • NCBI Variation Variation DBs (dbSNP, dbVar, dbGaP, ClinVar)
  • 1000 Genomes Project Catalog of 60 million variant sites (SNV, CNV, SV), 2535 individuals from 26 populations
  • Complete Genomics Very accurate 69 human WGS public data and more
  • Exome Variant Server by NHLBI GO Exome sequencing project (ESP)
  • iJGVD Integrative Japanese Genome Variation Database
  • HGV Database The HGV database is a fully searchable online database of genome variations published in peer-reviewed Data Reports in Human Genome Variation

- Human disease associated genomics variations

  • CGD Clinical Genomic Database
  • HGMD The human gene mutation database (The professional version of DB is commercial. The public version of DB is not downloadable.)
  • COSMIC DB for somatic mutations for cancer (largely by manual curation)
  • TCGA Germline/somatic mutations for cancer are available in Mutation Annotation Format (MAF).
  • OMIM Germline mutations for genetic diseases
  • Roche Cancer Genome Database (RCGDB) Germline/somatic mutations for cancer collected from diverse resources (not downloadable)
  • IDbase Human Immunodeficiency-causing mutation database

- Arabidopsis genomics variations

  • AtPolyDB Everything about Arabidopsis natural variants (by Magnus Nordborg, GMI)
  • RegMap panel Regional Mapping Project for Arabidopsis natural variants (by Joy Bergelson, U of Chicago)
  • 1001 Genomes Project Genetic variation of natural populations of Arabidopsis (by Detlef Weigel, MPI)

Genetic variant functional impact scoring Tools

Genotype-to-Phenotype Resources

Genotype-to-Expression (eQTL) Databases

Pathway/Signature gene set DBs

Pathway DBs

  • Pathguide.org A very comprehensive list of pathway and network databases
  • Gene Ontology by Gene Ontology Consortium
  • KEGG pathways and many more
  • Biocyc includes Metacyc, Ecocyc, Humancyc, Aracyc, Yeastcyc
  • Reactome A manually curated and peer-reviewed pathway DB
  • Pathway Interaction Database (PID) Human pathways curated by NCI-Nature/imported from BioCarta/Reactome
  • CORUM Comprehensive Resource of Mammalian Protein Complexes
  • NetPath A database for signaling pathways (cancer/immune signaling pathways)
  • SIGNOR 11000 manually-annotated causal relationships between proteins that participate in signal transduction
  • UniProt-GOA by EBI (support multi-species annotation)
  • UniPathway a fully manually curated resource of metabolic pathways (cross-linked to KEGG, MetaCyc)
  • Mapman Metabolic pathway databases
  • Plantcyc Plant metabolic network databases
  • Gramene A curated DB for grasses
  • agriGO A GO database for the agricultural community
  • AgBase Curated DB for functional analysis of agricultural animals and plants

Signature Gene Set DBs

  • MsigDB License required for redistribution
  • GeneSigDB
  • DSigDB Drug signature database for gene set analysis
  • L1000CDS2 Return 50 signature genes for each LINCS L1000 data set using Characteristic Direction (CD) method
  • CREEDS CRowd Extracted Expression of Differential Signatures: Signature gene sets from GEO selected by crowdsourcing project using CD method

Protein/Gene Interaction DBs

- PPIs by curation

  • iRefWeb A web interface to PPIs consolidated from 10 public DBs (BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI, OPHID (predicted PPIs))
  • IntAct
  • BIND the Biomolecular Interaction Network Database
  • BioGRID
  • HPRD Human Protein Reference Database
  • MINT Molecular Interaction DB
  • DIP Database of Interacting Proteins
  • MPact Representation of Interaction Data at MIPS
  • MPPI Mammalian PPI DB at MIPS

- Inferred gene interactions

  • STRING Known and predicted PPI
  • FunCoup DB of functional couplings between genes by data integration

TF Regulation DBs

-TFBS motif model DB

-Tools for MOTIF discovery and searching

  • MEME Suite has everything for motif based sequence analysis

-TF-target DB

  • TRED a transcriptional regulatory element database (contains curated TF-target links for 36 TF families)
  • ORegAnno DNA regulatory regions, TFBS, regulatory variants

-TF ChIP DB

  • hmChIP TF-ChIP DB for human and mouse

-Plant TF DB

  • AGRIS Arabidopsis Gene Regulatory Information Server (by OSU)
  • PlnTFDB Plant TF database by University of Potsdam, Germany
  • PlantTFDB Plant TF database by Peking University, China

-Others

  • Gene Regulation Info A very useful site for epigenetics and TF-DNA interaction studies (by Dr. Vladimir Teif)

miRNA DBs and target prediction tools

-microRNA list and expression atlas

  • miRBase miRNA database by Manchester University
  • microRNA.org download miRNA expression atlas for human, mouse, rat

-microRNA-target links (Gold standard)

  • miRWalk2.0 Validated links from 4 databases and text mining, predicted links from 13 prediction data sets
  • miRTarBase Manually curated microRNA-target links, miRNA-mRNA paired expression profiles, miRNA-disease links
  • miRecords Manually curated microRNA-target links + predicted links (by 11 computational algorithms)
  • miRTex Text mining system for miRNA-target, miRNA-gene/gene-miRNA regulation
  • mirSel microRNA-target links by text mining
  • Comir Combinatorial miRNA target prediction tool

-microRNA-disease

  • Human microRNA Disease Database(HMDD) Manually curated microRNA-disease links
  • miR2Disease Manually curated microRNA-target links and microRNA-disease links
  • PhenomiR A knowledgebase of miRNA expression in disease and biological processes
  • miRGator data for miRNA expression, miRNA-mRNA paired expression profile, miRNA perturbation experiments...

-Target prediction software

  • TargetScan executable
  • PITA executable
  • miRanda executable
  • miRmap target prediction by multiple algorithms; executable, precalculated, many other related data
  • miRDB Pre-calculated miRNA-target associations (based on SVM), not executable

-CLIP-seq database

-Plant microRNA DB

ncRNA related DBs and Servers

  • NONCODE Integrative annotation of long noncoding RNAs
  • lncRNAdb a reference DB for long noncoding RNAs
  • RAIN RNA–protein Association and Interaction Networks Intro to RAIN
  • NPInter ncRNA interaction database (ncRNA and other molecules)
  • LncRNADisease a DB for lncRNA associated diseases
  • ncFANs a web server for functional annotation of ncRNA

Gene Expression DBs (Microarray/RNA-seq)

Data deposit servers

  • GEO
  • AtGenExpress Arabidopsis gene expression DB by Weigel lab (there are unpublished non-GEO data here)
  • ImmGen Immunological Genome Project; Ontogenet TF-module networks based on ImmGen data

Expression Atlas

Mass Spectrometer or Immunohistochemistry Proteomics Resources

- Human Proteome Database

  • Human Proteome Map 85 samples from 17 adult tissues, 6 primary hematopoietic cells and 7 fetal tissues
  • ProteomicsDB >10,000 raw data files from 60 human tissues, 147 cell lines, and 13 body fluids
  • The Human Protein Atlas The tissue-based map of human proteome based on Immunohistochemistry (for 32 different tissues and organs)

- Open MS proteomics data analysis suite

- Open stand alone software for spectra database search (search engines)

  • MSblender A combined search engine
  • MS-GFDB: Its successor MS-GF+ is faster and more sensitive for high resolution MS data.
  • X!TANDEM
  • Comet: the direct descendant of Crux, which is an academic version of the commercial software SEQUEST
  • MyriMatch
  • OMSSA Due to budgetary constraints NCBI has discontinued OMSSA. Historical binaries are available from here.

-Raw spectra databases

Protein localization and Secretome DB

  • Vesiclepedia A DB for all types of Extracellular Vesicles (includes Exocarta)
  • Exocarta A DB for Exosome
  • EVpedia A DB for Extracellular Vesicles with many analysis tools
  • SUBA SUBcellular location DB for Arabidopsis proteins

Phenotype/Disease Annotation DBs

  • DISEASES gene-disease association from text mining (GHR, Uniprot, textmining)
  • GHR Genetics Home Reference (by NCBI)
  • Disease Ontology Disease ontology files FUNDO DOLite_term-to-genes map
  • Human Phenotype Ontology
  • OMIM Human disease DB (needs License to distribute)
  • OrphaData Open database for rare diseases and orphan drug (by Orphanet)
  • GAD Genetic Association Database: archive of human genetic association studies of complex diseases and disorders (includes summary data extracted from published candidate gene and GWAS studies).
  • UMLS Unified Medical Language System
  • ICD International Classification of Diseases by WHO
  • DGA Disease and Gene Annotation, an integrative set of disease-to-gene, gene-to-gene, disease-to-disease relationships
  • GenomeRNAi v12 contains 168 human RNAi, 181 D. melanogaster RNAi screen data sets
  • OGEE Online GEne Essentiality database
  • Human-Mouse Disease Connection a part of MGI

Drug/Bio-active chemical DBs

  • Drugable.com by National Library of Medicine, ~1 million chemicals, ~7000 structural pockets, ~4 million drug-protein interactions predicted by docking models
  • PubChem A DB of drug structures and functions by NCBI
  • ChEMBL A DB of drug structures and functions by EBI
  • Drugs@FDA A DB for FDA approved drugs
  • DailyMed High-quality information about marketed drugs by NCBI
  • SuperDrug A DB of 3D structures of drugs

Drug-Target relationship/ Chemical genomics DBs

  • DGIdb An integrated Drug-Gene Interaction DB (CancerCommons, ChEMBL, CIViC, Clearity Foundation, DoCM, DrugBank, Guide to Pharmacology, MyCancerGenome, PharmGKB, Targeted Agents in Lung Cancer, TDG, TEND, TTD); see the help page to download batch data files
  • KEGG DRUG contains information about only approved drugs
  • STITCH DB for known and predicted chemical-protein interaction
  • Drugbank A major DB of drug/target
  • Therapeutic Target Database (TTD) A major DB of drug/target
  • MATADOR Manually Annotated Targets and Drugs Online Resource
  • IUPHAR/BPS Guide to Pharmacology A DB of in-depth information of drug targets and ligands
  • PDSP Ki DB data warehouse for published and internally-derived Ki, or affinity of drugs at targets
  • Yeast Fitness DB Chemical genomics test for ~400 chemicals (Science 320:362)

Drug signature, Pharmacogenomics, Toxicogenomics, Clinical Trials

Cancer Genomics DBs

-Curated functional and clinical variants in Cancer

  • CIViC A knowledgebase for expert-crowdsourcing the clinical interpretation of variants in cancer
  • DoCM A database of curated mutations in cancer

-Catalog of Cancer genes and mutations

  • TSGene Literature-curated tumor suppressor genes (~1000 coding, ~200 non-coding); the v2 paper also provides ~300 oncogenes in its supplement
  • NCG The Network of Cancer Genes; a manually curated repository of cancer genes from the literature (1571 cancer genes by v5)
  • COSMIC Catalog Of Somatic Mutations In Cancer
  • CGC Cancer Gene Census

-Cancer Genomics Data Portals

  • Synapse TCGA-Pancancer The official resource for hosting analysis-ready TCGA Pan Cancer data
  • cBioPortal Data sets from published studies including TCGA
  • CGHub TCGA data portal by UCSC; TCGA The Cancer Genome Atlas project home
  • TumorPortal Pan-cancer data set from many tumor types.
  • ICGC data portal raw data from ICGC and TCGA
  • MethHC A database of DNA Methylation and gene expression in Human Cancer (use Pan-cancer data)

-Data for survival predictions

-Cancer Genomics Data Analysis Web server

  • CRAVAT Cancer-Related Analysis of Variants ToolKit
  • IntOGen Integrative Onco Genomics

-Cancer Pharmacogenomics

-Cancer cell essential genes

  • Achilles Project shRNA-based essential gene profiles for 216 cancer cell lines
  • COLT-cancer database shRNA-based essential gene profiles for 70 breast, pancreatic, ovarian cancer cell lines

Metagenome DBs and tools

-Metagenomic data central DB

-Human microbiome

Genome Engineering Resources

  • Addgene Plasmids for Genome Engineering
  • Zhang Lab Feng Zhang at MIT (CRISPR resource, Optic control)
  • Joung Lab Keith Joung at Harvard (TALEN resource, CRISPR resource)

Data-driven Omics companies

Other Resources

-Machine Learning

- Academic society

  • KSBSB Korean Society of Bioinformatics and Systems Biology
  • KGO Korea Genome Organization
  • KSMCB Korean Society of Molecular and Cellular Biology
  • KSBMB Korean Society of Biochemistry and Molecular Biology

- Other Systems Biology Links

  • DREAM Dialogue for Reverse Engineering Assessments and Methods
  • Sage Bionetworks
  • Assay depot Online marketplace for pharmaceutical research services
  • CAGI Critical Assessment of Genome Interpretation

- Neuroscience

- Others

Cool Web servers

  • REVIGO Visualize GO enrichment summary
  • VENNY Drawing Venn diagrams