Difference between revisions of "Web Resources"

From Bioinformatics Lab
(Regulome Resources)
(Other Resources)
(18 intermediate revisions by one user not shown)
Line 99: Line 99:
 
- Signature Gene Set DBs
 
*[http://software.broadinstitute.org/gsea/msigdb MsigDB] License required for redistribution
 
*[http://www.genesigdb.org/genesigdb/ GeneSigDB]
+
*[http://www.genesigdb.org/genesigdb/ GeneSigDB] Manually curated gene sets from Pubmed literature
 +
*[https://www.immunespace.org/announcements/home/thread.view?rowId=50 ImmuneSigDB] Compendium of immune signatures (now available from MsigDB)
 +
*[http://biocc.hrbmu.edu.cn/CancerSEA/goDownload CancerSEA] which provides 14 signature profiles for characterization of cancer cells
 
*[http://tanlab.ucdenver.edu/DSigDB/DSigDBv1.0/ DSigDB] Drug signature database for gene set analysis
 
*[http://amp.pharm.mssm.edu/L1000CDS2/help/ L1000CDS2] Return 50 signature genes for each LINCS L1000 data set using Characteristic Direction (CD) method
 
Line 166: Line 168:
 
*[http://lulab.life.tsinghua.edu.cn/postar/ POSTAR] a DB of RNA binding protein binding sites in human and mouse transcriptome (experimental and computational methods)
 
== Single Cell Genomics Resources ==  
+
== Single Cell Analysis Resources ==  
*[https://github.com/seandavi/awesome-single-cell Awesome single cell] List of software packages for single-cell data analysis, including RNA-seq, ATAC-seq, etc (GitHub)
+
- Human Cell Atlas
 +
*[https://www.humancellatlas.org/ HCA]
 +
*[https://chanzuckerberg.com/science/programs-resources/humancellatlas/ Chan Zuckerberg Initiative HCA seed networks]
 +
- Flow cytometry data resources
 +
*[http://flowrepository.org/ FlowRepository] data deposition place for experimental findings published in peer-reviewed journals in the flow cytometry field
 +
- scRNA-seq data analysis resources
 +
*[https://satijalab.org/seurat/ Seurat] The package for scRNA-seq data analysis
 +
*[https://www.sanger.ac.uk/science/tools/scrna-seq-analysis-course scRNA-seq analysis course by Sanger]
 +
*[https://www.cellphonedb.org/ CellPhoneDB] A repository of curated receptors, ligands and their interactions.
 +
*[https://github.com/seandavi/awesome-single-cell Awesome single cell]  
 +
*[https://www.scrna-tools.org/ scRNA-tools DB]
 +
- Depositories for scRNA-seq data
 
*[https://portals.broadinstitute.org/single_cell Single Cell Portal] scRNA-seq database by Broad Institute
 
*[https://bioinfo.uth.edu/scrnaseqdb/ scRNASeqDB] scRNA-seq database by UTHSC
 
Line 271: Line 284:
 
*[http://www.metahit.eu/ MetaHIT] Metagenomics of the human intestinal tract
 
*[http://huttenhower.sph.harvard.edu/ Huttenhower Lab] A great resource for analysis tools
 
 +
- Bacterial Culture Collection
 +
*[https://kctc.kribb.re.kr/kctc.aspx KCTC] Korean Collection for Type Cultures
 +
*[https://www.atcc.org/ ATCC Microbiology collection]
 +
*[https://www.dsmz.de/ DSMZ] German Collection of Microorganisms
  
 
== Proteome Resources ==
 
Line 290: Line 307:
  
 
== Other Resources ==
 
- Academic society
+
- Academic society & Research Center
 
*[http://www.ashg.org/ ASHG] American Society of Human Genetics
 
*[http://www.aacr.org AACR] American Association for Cancer Research
 
Line 299: Line 316:
 
*[http://www.ksmcb.or.kr/ KSMCB] Korean Society of Molecular and Cellular Biology
 
*[http://new.ksbmb.or.kr/ KSBMB] Korean Society of Biochemistry and Molecular Biology
 
 +
*[http://mrc-systemsmed.org/ Yonsei Medical Research Center for Systems Medicine]
  
 
- Cool software
 
Line 305: Line 323:
 
*[http://bioinfogp.cnb.csic.es/tools/venny/ VENNY] Drawing Venn diagram
 
  
- Data-driven Omics companies
+
- Omics-driven Biotech companies
 +
*[https://celsiustx.com/ Celsius Therapeutics] Novel targets and biomarkers identified through single-cell RNA sequencing analysis
 
*[https://www.arivale.com/ Arivale] Health coaching for wellness based on multi-omics analysis
 
*[http://www.humanlongevity.com/ Human Longevity Inc.]
 
Line 317: Line 336:
  
 
- Machine Learning
 
 +
*[https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw StatQuest] Excellent tutorial videos for learning machine learning and more (by Josh Starmer at UNC)
 
*[http://scikit-learn.org/ Scikit learn] Open software for Machine Learning
 
*[https://www.coursera.org/course/ml?from_restricted_preview=1&course_id=971489&r=https%3A%2F%2Fclass.coursera.org%2Fml-004 Machine Learning by Andrew Ng]
 

Revision as of 17:44, 16 September 2019


Google DataSet Search Engine

Gene/Genome Annotations

  • CCDS The consensus protein-coding regions among the NCBI, Ensembl, and Sanger (Havana) annotations
  • GENCODE The Encyclopedia of Genes

Variome Resources

- Variation DBs

  • ExAC (Exome Aggregation Consortium) Exome variation data from >60k individuals
  • 1000 Genomes Project Catalog of 60 million variant sites (SNV, CNV, SV), 2535 individuals from 26 populations
  • UK10K Sequencing 10,000 people (4,000 healthy, 6,000 disease) in England
  • Genomics England Sequencing 100,000 people in England focusing on patients with a rare disease and their families and patients with cancer.
  • DiscovEHR Collaboration between the Regeneron Genetics Center (WES) and Geisinger Health System (EHR); provides VCF data from 50,000 MyCode participants
  • European Variation Archive Most comprehensive and organized by studies (include Clinical variants)
  • NCBI Variation Variation DBs (dbSNP, dbVar, dbGaP, ClinVar)
  • iJGVD Integrative Japanese Genome Variation Database
  • HGV Database The HGV database is a fully searchable online database of genome variations published in peer-reviewed Data Reports in Human Genome Variation

- Functional significance of variants

Phenome/Diseasome Resources

- Human Disease DBs

  • DisGeNET MetaDB for disease genes and variants (very comprehensive and open license)
  • Open Targets Another very comprehensive DB for disease target (mostly protein-coding genes) and related evidence
  • denovo-db a compendium of human de novo variants
  • DISEASES gene-disease association from text mining (GHR, Uniprot, textmining)
  • GHR Genetics Home Reference (by NCBI)
  • Disease Ontology Disease ontology files FUNDO DOLite_term-to-genes map
  • Human Phenotype Ontology
  • OMIM Human disease DB (needs License to distribute)
  • OrphaData Open database for rare diseases and orphan drug (by Orphanet)
  • GAD Genetic Association Database: archive of human genetic association studies of complex diseases and disorders (includes summary data extracted from published candidate gene and GWAS studies).
  • UMLS Unified Medical Language System
  • ICD International Classification of Diseases by WHO
  • DGA Disease and Gene Annotation, an integrative set of disease-to-gene, gene-to-gene, disease-to-disease relationships
  • GenomeRNAi v12 contains 168 human RNAi, 181 D. melanogaster RNAi screen datasets
  • OGEE Online GEne Essentiality database
  • Human-Mouse Disease Connection a part of MGI

- QTL depositories

  • GTEx Portal eQTL for ~50 different tissue types in humans

- GWAS resources

  • PheGenI Phenotype-Genotype Integrator: For a query trait, it returns GWAS loci collected from all available data resources (very convenient for building a single GWAS data set for each trait)
  • GWAS catalog Disease-associated variants; Now providing GWAS summary stat data
  • LDHUB a centralized database of summary-level GWAS results
  • Genome-wide Repository of Associations between SNPs and Phenotypes (GRASP) Better than GWAS catalog, including eQTL, QTLs
  • GWASdb includes moderate SNPs (p-value < 10^-3) with manual curation from original papers; manually mapped ~1600 GWAS traits to ~500 HPO terms, ~440 DO terms, ~230 DOLite terms
  • DistiLD Diseases and Traits in Linkage Disequilibrium Blocks

- Genotype raw data depositories

- Clinical/Disease variant databases

  • CGD Clinical Genomic Database
  • HGMD The human gene mutation database (The professional version of DB is commercial. The public version of DB is not downloadable.)
  • OMIM Germline mutations for genetic diseases
  • Roche Cancer Genome Database (RCGDB) Germline/somatic mutations for cancer collected from diverse resources (not downloadable)
  • IDbase Human Immunodeficiency-causing mutation database
  • NCBI ClinVar Human variations and their relation to human health (does not include unreviewed data from GWAS)

- Others

  • COGS nature resources Collaborative Oncological Gene-environment Study (COGS): Association study using ~211,000 SNPs (iCOGS) for breast, ovarian, and prostate cancers.
  • Personal Genome Project
  • DECIPHER Developmental Diseases to Phenotypes database with public patients (very useful for rare disease genetics research)

Interactome, Pathway/Signature Resources

- Interactome DBs

  • iRefWeb a web interface to PPIs consolidated from 10 public DBs (BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI, OPHID (predicted PPIs))
  • STRING Known and predicted PPI
  • Human Reference Interactome Project Y2H-based human protein interactions

- Pathway DBs

  • Pathguide.org A very comprehensive list of pathway and network databases
  • Gene Ontology by Gene Ontology Consortium
  • KEGG pathways and many more
  • Biocyc includes Metacyc, Ecocyc, Humancyc, Aracyc, Yeastcyc
  • Reactome A manually curated and peer-reviewed pathway DB
  • Pathway Interaction Database (PID) Human pathways curated by NCI-Nature/imported from BioCarta/Reactome
  • CORUM Comprehensive Resource of Mammalian Protein Complexes
  • NetPath A database for signaling pathways (cancer/immune signaling pathways)
  • SIGNOR 11000 manually-annotated causal relationships between proteins that participate in signal transduction
  • UniProt-GOA by EBI (support multi-species annotation)
  • UniPathway a fully manually curated resource of metabolic pathways (cross-linked to KEGG, MetaCyc)

- Signature Gene Set DBs

  • MsigDB License required for redistribution
  • GeneSigDB Manually curated gene sets from Pubmed literature
  • ImmuneSigDB Compendium of immune signatures (now available from MsigDB)
  • CancerSEA Provides 14 signature profiles for characterization of cancer cells
  • DSigDB Drug signature database for gene set analysis
  • L1000CDS2 Returns 50 signature genes for each LINCS L1000 data set using the Characteristic Direction (CD) method
  • CREEDS CRowd Extracted Expression of Differential Signatures: Signature gene sets from GEO selected by crowdsourcing project using CD method

Regulome Resources

- TF and motif DB

- Epigenomics Consortium projects

- Promoter DB

  • EPD Eukaryotic Promoter Database; Databases of experimentally validated (by either publication or in-house assay) promoters in various organisms

- Enhancer DB

  • Enhancer Atlas Human enhancers based on >=3 independent high-throughput experimental datasets (contains 2,534,123 enhancers for 76 cell lines and 29 tissues)
  • dbSUPER contains 82,234 super-enhancers in 102 human and 25 mouse tissue/cell types
  • HEDD Human Enhancer Disease Database (~2.8M enhancers from ENCODE, FANTOM5, RoadMap and annotations for disease, target, variant, conservation)
  • DiseaseEnhancer manual curation of disease-associated enhancers

- Transcriptional Start Site (TSS) DB

  • DBTSS contains 491 million TSS tag sequences for 20 tissues and 7 cell cultures in human and mouse

- ChIP-seq/DNase-seq DB

  • Cistrome DB the most comprehensive DB for ChIP-seq and DNase-seq data

- Enhancer-Promoter Interaction DB

  • JEME Computationally inferred EPI networks for 935 human primary cells, tissues, and cell lines

- microRNA list and expression atlas

  • miRBase miRNA database by Manchester University
  • microRNA.org download miRNA expression atlas for human, mouse, rat
  • microRNAome microRNA RNA-seq based atlas for 46 primary cell types and 42 cancer or immortalized cell lines

- microRNA-target links (Gold standard)

  • miRWalk2.0 Validated links from 4 databases and text mining; predicted links from 13 prediction data sets
  • miRTarBase Experimental-based microRNA-target links (most popular)

- microRNA-disease

  • Human microRNA Disease Database(HMDD) Manually curated microRNA-disease links (most comprehensive)
  • PhenomiR DB for dysregulated miRNA in diseases
  • dbDEMC DB for dysregulated miRNA in Cancer
  • miRGator data for miRNA expression, miRNA-mRNA paired expression profile, miRNA perturbation experiments...

- miRNA Target predictions

  • mirDIP >150M human miRNA-target predictions collected from 30 resources with integrative score

- CLIP-seq database

- lncRNA Resources

  • FANTOM-CAT An atlas of human long non-coding RNAs with accurate 5' ends
  • NONCODE Integrative annotation of long noncoding RNAs
  • lncRNAdb a reference DB for long noncoding RNAs
  • RAIN RNA–protein Association and Interaction Networks Intro to RAIN
  • NPInter ncRNA interaction database (ncRNA and other molecules)
  • RAID RNA-associated interaction DB (very comprehensive)
  • LncRNADisease a DB for lncRNA associated diseases
  • ncFANs a web server for functional annotation of ncRNA
  • LincSNP a DB of disease-associated SNP in human lncRNA and their TFBS
  • POSTAR a DB of RNA binding protein binding sites in human and mouse transcriptome (experimental and computational methods)

Single Cell Analysis Resources

- Human Cell Atlas

- Flow cytometry data resources

  • FlowRepository data deposition place for experimental findings published in peer-reviewed journals in the flow cytometry field

- scRNA-seq data analysis resources

- Depositories for scRNA-seq data

  • Single Cell Portal scRNA-seq database by Broad Institute
  • scRNASeqDB scRNA-seq database by UTHSC
  • conquer A repository of consistently processed, analysis-ready single-cell RNA-seq data sets
  • Jinglebells A repository of standardized single cell RNA-Seq datasets for analysis and visualization at the single cell level
  • SCPortalen human and mouse single-cell centric database
  • 10X Genomics Datasets by 10X Genomics

Chemical Biology and Drug Research Resources

- Drug and Bioactive chemical DBs

  • Drug Repurposing Hub a best-in-class drug screening collection of >3,000 clinical drugs and their annotation (structure, MoA, protein targets)
  • Drugable.com by National Library of Medicine, ~1 million chemicals, ~7000 structural pockets, ~4 million drug-protein interactions from docking models
  • PubChem A DB of drug structures and functions by NCBI
  • ChEMBL A DB of drug structures and functions by EBI
  • Drugs@FDA A DB for FDA approved drugs
  • DailyMed High-quality information about marketed drugs by NCBI
  • SuperDrug A DB of 3D structures of drugs

- Clinical Trial Information

- Drug Target DBs

- Drug signature, Pharmacogenomics, Toxicogenomics DBs

- Drug-Gene Interaction DBs

  • MOSAIC Chemical-genetic interactions in Yeast (cover >13000 compounds)

Cancer Biology Resources

- Cancer Somatic Mutations DBs

- Cancer Somatic Mutation Visualization

  • Proteinpaint Exploring genomic alteration in pediatric cancer

- Cancer Gene DBs

- Cancer Genomics Research Gateway

- Cancer Genomics Data Analysis Cloud Platforms

  • ISB-CGC Cancer Genomics Cloud by ISB
  • WebMeV Analysis of large genomic data, particularly for RNASeq and microarray data (TCGA, GEO, or user-uploaded).

- Tumor Microenvironment Analysis tools

  • TIMER Web server for a comprehensive TME analysis
  • xCell Tumor cellular heterogeneity analysis web server; R package is also available from github

- Cancer Pharmacogenomics

- Cancer cell essential genes

  • GenomeCRISPR A database for high-throughput CRISPR/Cas9 screening experiments
  • Achilles Project shRNA-based screen for 216 cancer cell lines (v2.4.3) and CRISPR-based screen for 33 cancer cell lines (v3.3.8)
  • COLT-cancer database shRNA-based essential gene profiles for 70 breast, pancreatic, ovarian cancer cell lines

Metagenome DBs and tools

- Metagenomic data central DB

- Human microbiome

- Bacterial Culture Collection

Proteome Resources

- Human Proteome Database

  • Human Proteome Map 85 samples from 17 adult tissues, 6 primary hematopoietic cells and 7 fetal tissues
  • ProteomicsDB >10,000 raw data files from 60 human tissues, 147 cell lines, and 13 body fluids
  • The Human Protein Atlas The tissue-based map of human proteome based on Immunohistochemistry (for 32 different tissues and organs)

- Open stand-alone software for mass spectra database search (search engines)

  • MSblender A combined search engine
  • MS-GFDB: Its successor MS-GF+ is faster and more sensitive for high-resolution MS data.
  • X!TANDEM
  • Comet: the direct descendant of Crux, which is an academic version of the commercial software SEQUEST
  • MyriMatch
  • OMSSA Due to budgetary constraints NCBI has discontinued OMSSA. Historical binaries are available from here.

- Protein localization and Secretome DB

  • Vesiclepedia A DB for all types of Extracellular Vesicles (includes Exocarta)
  • Exocarta A DB for Exosome
  • EVpedia A DB for Extracellular Vesicles with many analysis software

Other Resources

- Academic society & Research Center

- Cool software

  • REVIGO Visualize GO enrichment summary
  • UpSetR Shiny App Visualizes set intersections in a matrix layout and introduces aggregates based on groupings and queries; R package is also available from github
  • VENNY Drawing Venn diagram

- Omics-driven Biotech companies

- Machine Learning

- Neuroscience

- Others
