Difference between revisions of "Web Resources"

From Bioinformatics Lab

Revision as of 16:14, 4 April 2016

Knowledgebase, GATEWAY DBs, Genome/Gene Annotations

Genome Sequencing/Re-sequencing Consortium Projects

  • Genome 10K Sequencing >16,000 vertebrates
  • i5K genome initiative Sequencing >5,000 insects and other arthropods
  • Bird 10K project Sequencing >10,000 bird species
  • 1000 Plants Sequencing 1,000 plant species
  • 1000 Genomes Sequencing 1,000 healthy people from various populations
  • UK10K Sequencing 10,000 people (4,000 healthy, 6,000 disease) in England
  • Genomics England Sequencing 100,000 people in England focusing on patients with a rare disease and their families and patients with cancer.

NGS data public depositories

  • SRA Sequence Read Archive by NCBI
  • ENA European Nucleotide Archive by EBI
  • GEO Gene Expression Omnibus (for processed data only)
  • ENCODE Encyclopedia of DNA Elements project (human)

NGS data analysis tools

  • BEDOPS A suite for common genome analysis tasks with high scalability and flexibility
  • BEDTools A suite for BED (Browser Extensible Data) and GFF (General Feature Format) formats
  • SAMtools A suite for SAM (Sequence Alignment/Map) format
  • Homer A suite of tools for Motif Discovery and NGS (ChIP-Seq, RNA-Seq, DNase-Seq, Hi-C). Excellent documentation!
  • F-Seq A Feature Density Estimator for High-Throughput Sequence Tags
  • IDR Reproducibility and automatic thresholding of ChIP-seq data

The Encyclopedia of DNA Elements (ENCODE) Data links

Epigenomics, Cis-regulatory regions Resources

  • GREAT Genomic Regions Enrichment of Annotations Tool; Predict functions for cis-regulatory regions
  • CistromeMap A knowledgebase for ChIP-Seq and DNase-Seq studies in mouse and human
  • Roadmap Epigenomics NIH Roadmap Epigenomics project home
  • BLUEPRINT epigenome Epigenome maps of >100 different blood cell types
  • Epigenie An informative web community for epigenetics-related research
  • EpiGeneSys European network bringing together epigenetics and systems biology
  • Gene Regulation Info A very useful site for epigenetics and TF-DNA interaction studies (by Dr. Vladimir Teif)

Genomic Variation DBs

- Human genomics variations

- Human disease associated genomics variations

  • CGD Clinical Genomic Database
  • HGMD The human gene mutation database (The professional version of DB is commercial. The public version of DB is not downloadable.)
  • COSMIC DB for somatic mutations for cancer (largely by manual curation)
  • TCGA Germline/somatic mutations for cancer are available in Mutation Annotation Format (MAF).
  • OMIM Germline mutations for genetic diseases
  • Roche Cancer Genome Database (RCGDB) Germline/somatic mutations for cancer collected from diverse resources (not downloadable)
  • IDbase Human Immunodeficiency-causing mutation database

- Arabidopsis genomics variations

  • AtPolyDB Everything about Arabidopsis natural variants (by Magnus Nordborg, GMI)
  • RegMap panel Regional Mapping Project for Arabidopsis natural variants (by Joy Bergelson, U of Chicago)
  • 1001 Genome Project Genetic variation of natural populations of Arabidopsis (by Detlef Weigel, MPI)

Genetic variant functional impact scoring Tools

Genotype-to-Phenotype Resources

  • UK10K Exome sequencing data for both healthy and disease populations
  • UK Biobank Genotype and extensive phenotype data for ~500k UK people
  • GWASdb includes moderate SNPs (p-value < 10^-3) with manual curation from original papers; manually mapped ~1600 GWAS traits to ~500 HPO terms, ~440 DO terms, ~230 DOLite terms
  • European Genome-phenome Archive(EGA) Raw data of GWAS, WGS, Exome-seq. A great resource for meta-analysis
  • dbGaP The database of Genotypes and Phenotypes (GWAS, WGS, Exome-seq...)
  • NCBI ClinVar Human variations and their relation to human health (does not include unreviewed data from GWAS)
  • GWAS Central Contains SNPs for any p-value
  • GWAS catalog now maintained by EBI
  • PheGenI Phenotype-Genotype Integrator
  • Genome-wide Repository of Associations between SNPs and Phenotypes (GRASP) Better than GWAS catalog, including eQTLs and QTLs
  • COGS nature resources Collaborative Oncological Gene-environment Study (COGS): Association study using ~211,000 SNPs (iCOGS) for breast, ovarian, and prostate cancers.

Genotype-to-Expression (eQTL) Databases

Pathway Annotation DBs

  • Pathguide.org A very comprehensive list of pathway and network databases
  • Gene Ontology by Gene Ontology Consortium
  • KEGG pathways and many more
  • Biocyc includes Metacyc, Ecocyc, Humancyc, Aracyc, Yeastcyc
  • Reactome A manually curated and peer-reviewed pathway DB
  • Pathway Interaction Database (PID) Human pathways curated by NCI-Nature/imported from BioCarta/Reactome
  • CORUM Comprehensive Resource of Mammalian Protein Complexes
  • NetPath A database for signaling pathways (cancer/immune signaling pathways)
  • UniProt-GOA by EBI (support multi-species annotation)
  • UniPathway a fully manually curated resource of metabolic pathways (cross-linked to KEGG, MetaCyc)
  • Mapman Metabolic pathway databases
  • Plantcyc Plant metabolic network databases
  • Gramene A curated DB for grasses
  • agriGO A GO database for the agricultural community
  • AgBase Curated DB for functional analysis of agricultural animals and plants

Protein/Gene Interaction DBs

- PPIs by curation

  • iRefWeb A web interface to PPIs consolidated from 10 public DBs (BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI, OPHID (predicted PPIs))
  • IntAct
  • BIND the Biomolecular Interaction Network Database
  • BioGRID
  • HPRD Human Protein Reference Database
  • MINT Molecular Interaction DB
  • DIP Database of Interacting Proteins
  • MPact Representation of Interaction Data at MIPS
  • MPPI Mammalian PPI DB at MIPS

- Inferred gene interactions

  • STRING Known and predicted PPI
  • FunCoup DB of functional couplings between genes by data integration

TF Regulation DBs

-TFBS motif model DB

-Tools for MOTIF discovery and searching

  • MEME Suite has everything for motif based sequence analysis

-TF-target DB

  • TRED a transcriptional regulatory element database (contains curated TF-target links for 36 TF families)
  • ORegAnno DNA regulatory regions, TFBS, regulatory variants

-TF ChIP DB

  • hmChIP TF-ChIP DB for human and mouse

-Plant TF DB

  • AGRIS Arabidopsis Gene Regulatory Information Server (by OSU)
  • PlnTFDB Plant TF database by University of Potsdam, Germany
  • PlantTFDB Plant TF database by Peking University, China

-Others

  • Gene Regulation Info A very useful site for epigenetics and TF-DNA interaction studies (by Dr. Vladimir Teif)

miRNA DBs and target prediction tools

-microRNA list and expression atlas

  • miRBase miRNA database by Manchester University
  • microRNA.org download miRNA expression atlas for human, mouse, rat

-microRNA-target links (Gold standard)

  • miRWalk2.0 Validated links from 4 databases and text mining; predicted links from 13 prediction data sets
  • miRTarBase Manually curated microRNA-target links, miRNA-mRNA paired expression profiles, miRNA-disease links
  • miRecords Manually curated microRNA-target links + predicted links (by 11 computational algorithms)
  • miRTex Text mining system for miRNA-target, miRNA-gene/gene-miRNA regulation
  • mirSel microRNA-target links by text mining
  • Comir Combinatorial miRNA target prediction tool

-microRNA-disease

  • Human microRNA Disease Database(HMDD) Manually curated microRNA-disease links
  • miR2Disease Manually curated microRNA-target links and microRNA-disease links
  • PhenomiR A knowledgebase of miRNA expression in disease and biological processes
  • miRGator data for miRNA expression, miRNA-mRNA paired expression profile, miRNA perturbation experiments...

-Target prediction software

  • TargetScan executable
  • PITA executable
  • miRanda executable
  • miRmap target prediction by multiple algorithms, executable, precalculated, many other related data
  • miRDB Pre-calculated miRNA-target associations (based on SVM), not executable

-CLIP-seq database

-Plant microRNA DB

ncRNA related DBs and Servers

  • NONCODE Integrative annotation of long noncoding RNAs
  • lncRNAdb a reference DB for long noncoding RNAs
  • NPInter ncRNA interaction database (ncRNA and other molecules)
  • LncRNADisease a DB for lncRNA associated diseases
  • ncFANs a web server for functional annotation of ncRNA

Gene Expression DBs (Microarray/RNA-seq)

Mass Spectrometer or Immunohistochemistry Proteomics Resources

- Human Proteome Database

  • Human Proteome Map 85 samples from 17 adult tissues, 6 primary hematopoietic cells and 7 fetal tissues
  • ProteomicsDB >10,000 raw data files from 60 human tissues, 147 cell lines, and 13 body fluids
  • The Human Protein Atlas The tissue-based map of human proteome based on Immunohistochemistry (for 32 different tissues and organs)

- Open MS proteomics data analysis suite

- Open stand alone software for spectra database search (search engines)

  • MSblender A combined search engine
  • MS-GFDB: Its successor MS-GF+ is faster and more sensitive for high resolution MS data.
  • X!TANDEM
  • Comet: the direct descendant of Crux, which is an academic version of the commercial software SEQUEST
  • MyriMatch
  • OMSSA Due to budgetary constraints NCBI has discontinued OMSSA. Historical binaries are available from here.

-Raw spectra databases

Protein localization and Secretome DB

  • Vesiclepedia A DB for all types of Extracellular Vesicles (includes Exocarta)
  • Exocarta A DB for Exosome
  • EVpedia A DB for Extracellular Vesicles with many analysis tools
  • SUBA SUBcellular location DB for Arabidopsis proteins

Phenotype/Disease Annotation DBs

  • OMIM Human disease DB
  • DISEASES gene-disease association from text mining
  • Disease Ontology Disease ontology files FUNDO DOLite_term-to-genes map
  • Human Phenotype Ontology
  • OrphaData Open database for rare diseases and orphan drug (by Orphanet)
  • GAD Genetic Association Database: archive of human genetic association studies of complex diseases and disorders (includes summary data extracted from published candidate gene and GWAS studies).
  • UMLS Unified Medical Language System
  • ICD International Classification of Diseases by WHO
  • DGA Disease and Gene Annotation, an integrative set of disease-to-gene, gene-to-gene, disease-to-disease relationships
  • GenomeRNAi v12 contains 168 human RNAi, 181 D. melanogaster RNAi screen data sets
  • OGEE Online GEne Essentiality database
  • Human-Mouse Disease Connection a part of MGI

Drug/Bio-active chemical DBs

  • Drugable.com by National Library of Medicine, ~1 million chemicals, ~7000 structural pockets, ~4 million drug-protein interactions by docking model
  • PubChem A DB containing drug structures and functions by NCBI
  • ChEMBL A DB containing drug structures and functions by EBI
  • Drugs@FDA A DB for FDA approved drugs
  • DailyMed High-quality information about marketed drugs by NCBI
  • SuperDrug A DB containing 3D structures of drugs

Drug-Target relationship/ Chemical genomics DBs

  • DGIdb An integrated Drug-Gene Interaction DB (CancerCommons, ChEMBL, CIVIC, Clearity Foundation, DoCM, DrugBank, Guide to Pharmacology, MyCancerGenome, PharmGKB, Targeted Agents in Lung Cancer, TDG, TEND, TTD); see the help page to download the batch data file
  • KEGG DRUG contains information about only approved drugs
  • STITCH DB for known and predicted chemical-protein interaction
  • Drugbank A major DB of drug/target
  • Therapeutic Target Database (TTD) A major DB of drug/target
  • MATADOR Manually Annotated Targets and Drugs Online Resource
  • IUPHAR/BPS Guide to Pharmacology A DB of in-depth information of drug targets and ligands
  • PDSP Ki DB data warehouse for published and internally-derived Ki, or affinity of drugs at targets
  • Yeast Fitness DB Chemical genomics test for ~400 chemicals (Science 320:362)

Clinical Trials and Pharmaco/Toxicogenomics DBs

Cancer Genome/Cell Line Biology DBs

-Catalog of Cancer genes and mutations

  • TSGene Literature-curated tumor suppressor genes (~1000 coding, ~200 non-coding); the v2 paper also provides ~300 oncogenes in its supplement
  • NCG The Network of Cancer Genes; a manually curated repository of cancer genes from the literature (1571 cancer genes by v5)
  • COSMIC Catalog Of Somatic Mutations In Cancer
  • CGC Cancer Gene Census

-Cancer Genomics Data Portals

  • Synapse TCGA-Pancancer The official resource for hosting analysis-ready TCGA Pan Cancer data
  • cBioPortal Data sets from published studies including TCGA
  • CGHub TCGA data portal by UCSC; TCGA The Cancer Genome Atlas project home
  • TumorPortal Pan-cancer data set from many tumor types.
  • ICGC data portal raw data from ICGC and TCGA
  • MethHC A database of DNA Methylation and gene expression in Human Cancer (use Pan-cancer data)

-Data for survival predictions

-Cancer Genomics Data Analysis Web server

  • CRAVAT Cancer-Related Analysis of Variants ToolKit
  • IntOGen Integrative Onco Genomics

-Cancer chemical genomics

-Cancer cell essential genes

  • Achilles Project shRNA-based essential gene profiles for 216 cancer cell lines
  • COLT-cancer database shRNA-based essential gene profiles for 70 breast, pancreatic, ovarian cancer cell lines

Stem Cell Biology DBs

  • SHOGoiN A comprehensive cell database by Kyoto Univ.
  • Stemformatics Datasets and Bioinformatics tools for Stem Cell Research
  • SCDE The Stem Cell Discovery Engine
  • ESCAPE Embryonic Stem Cell Atlas of Pluripotency Evidence (Many stem cell related networks)

Metagenome DBs and tools

-Metagenomic data central DB

-Human microbiome

Bacterial Antibiotics DBs

Organism-centric DBs: Microbes

Organism-centric DBs: Animals

Organism-centric DBs: Plants

- All Plants

- Arabidopsis

- Rice (Oryza sativa)

  • RGAP Rice Genome Annotation Project by MSU (Go get the part list here!)

- Maize (Zea mays)

  • MaizeGDB
  • Panzea Maize Genotype/Phenotype raw public data (Great resources for Maize genetics)

-Barley (Hordeum vulgare L.)

-Wheat (Triticum aestivum)

- Tomato (Solanum lycopersicum)

- Soybean (Glycine max)

Genome Engineering Resources

  • Addgene Plasmids for Genome Engineering
  • Zhang Lab Feng Zhang at MIT (CRISPR resource, Optic control)
  • Joung Lab Keith Joung at Harvard (TALEN resource, CRISPR resource)

Data-driven Omics companies

Other Resources

-Machine Learning

- Academic society

  • KSBSB Korean Society of Bioinformatics and Systems Biology
  • KGO Korea Genome Organization
  • KSMCB Korean Society of Molecular and Cellular Biology
  • KSBMB Korean Society of Biochemistry and Molecular Biology

- Other Systems Biology Links

  • DREAM Dialogue for Reverse Engineering Assessments and Methods
  • Sage Bionetworks
  • Assay depot Online marketplace for pharmaceutical research services
  • CAGI Critical Assessment of Genome Interpretation

- Neuroscience

- Others

Cool Web servers

  • REVIGO Visualize GO enrichment summary
  • VENNY Drawing Venn diagram