SSPsyGene Knowledge Base — data export bundle
=============================================

This bundle contains the processed data tables used by
https://psypheno.gi.ucsc.edu/. Gene identifiers have been resolved to
gene symbols (HGNC for human, MGI for mouse) where mappings exist; rows
otherwise carry the raw identifier they were loaded with.

Layout
------

  manifest.tsv
      Index: one row per data table, with row counts, columns, and
      pointers to the per-table files.

  tables/{table}.tsv
      Full per-table data dump, tab-separated.

  metadata/{table}.yaml
      Per-table metadata: description, columns + field labels, source,
      links, gene mappings, and citation.

  preprocessing/{table}.yaml
      Per-table preprocessing provenance: which gene-symbol rescues
      fired, what was dropped, row counts before/after each step.
      Only present for tables whose dataset ships a preprocessing.yaml.

  ensembl_to_symbol.tsv
      Ensembl ID ↔ gene symbol mapping used by the website.

  README.txt
      This file.

Loading in R
------------

    manifest <- read.delim("manifest.tsv", stringsAsFactors = FALSE)
    tbl <- read.delim("tables/perturb_fish_astro.tsv", stringsAsFactors = FALSE)
    head(tbl)

Loading in Python (pandas)
--------------------------

    import pandas as pd
    manifest = pd.read_csv("manifest.tsv", sep="\t")
    tbl = pd.read_csv("tables/perturb_fish_astro.tsv", sep="\t")

Citing a dataset
----------------

Each metadata/{table}.yaml file includes the underlying publication
block (first/last author, year, journal, DOI, PMID, SSPsyGene grants).
Please cite the original publication when using a dataset.
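
Loading every table at once (Python sketch)
-------------------------------------------

The layout described above can also be walked programmatically. The
following is a minimal sketch, not part of the bundle itself: it reads
every tables/{table}.tsv file using only the Python standard library and
reports which of the optional metadata/ and preprocessing/ YAML files
exist for a given table. The function names load_bundle_tables and
sidecar_files are hypothetical, and UTF-8 encoding is an assumption. The
sketch relies only on the directory layout, not on any particular column
names.

```python
import csv
from pathlib import Path


def load_bundle_tables(bundle_dir="."):
    """Read every tables/*.tsv file under bundle_dir (hypothetical helper).

    Returns a dict mapping table name -> list of row dicts. All values are
    kept as strings; no type conversion is attempted.
    """
    bundle = Path(bundle_dir)
    tables = {}
    for tsv_path in sorted((bundle / "tables").glob("*.tsv")):
        # Assumption: the files are UTF-8 encoded, tab-separated text.
        with tsv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            tables[tsv_path.stem] = list(reader)
    return tables


def sidecar_files(bundle_dir, table):
    """Return the per-table YAML files that exist (hypothetical helper).

    preprocessing/{table}.yaml is optional, so it is only included in the
    result when the file is actually present.
    """
    bundle = Path(bundle_dir)
    candidates = {
        "metadata": bundle / "metadata" / f"{table}.yaml",
        "preprocessing": bundle / "preprocessing" / f"{table}.yaml",
    }
    return {kind: path for kind, path in candidates.items() if path.exists()}
```

Run from the unpacked bundle directory, load_bundle_tables(".") returns
one entry per file in tables/, and sidecar_files(".", "perturb_fish_astro")
lists the YAML files available for that table. Parsing the YAML content
itself needs a third-party YAML library and is left out here.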