003 · gene_ref_human

Bioinformatics

Source

Human gene annotation reference downloaded from Ensembl using `evanverse::download_gene_ref("human")`.

The dataset is arranged at gene resolution: one row per Ensembl gene.

Overview

| Field | Value |
|---|---|
| Category | bioinformatics |
| Rows | 91,703 |
| Columns | 12 |
| Key variable | `ensembl_id` |
| File | `toy/003_gene_ref_human.qmd` |
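Since `ensembl_id` is the key variable and the dataset has one row per gene, a quick uniqueness check is a reasonable first step after loading. The sketch below assumes `gene_ref_human.csv` is a comma-separated file with a header row; `duplicated_keys` is a hypothetical helper, and the inline sample stands in for the real file.

```python
import csv
import io
from collections import Counter


def duplicated_keys(rows, key="ensembl_id"):
    """Return the key values that occur on more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Tiny inline sample using the documented column names. To check the real
# file, replace it with open("gene_ref_human.csv", newline="").
sample = io.StringIO(
    "ensembl_id,symbol,gene_type\n"
    "ENSG00000210049,MT-TF,Mt_tRNA\n"
    "ENSG00000211459,MT-RNR1,Mt_rRNA\n"
    "ENSG00000198888,MT-ND1,protein_coding\n"
)
rows = list(csv.DictReader(sample))
dupes = duplicated_keys(rows)
print(len(rows), "rows,", len(dupes), "duplicated ids")
```

An empty result is consistent with the one-row-per-gene layout; any returned IDs would point to rows worth inspecting.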

Variables

| Column | Description |
|---|---|
| ensembl_id | Ensembl gene identifier |
| symbol | Gene symbol |
| entrez_id | Entrez Gene identifier |
| gene_type | Ensembl gene biotype |
| chromosome | Chromosome or scaffold name |
| start | Gene start coordinate |
| end | Gene end coordinate |
| strand | Genomic strand |
| description | Ensembl gene description |
| species | Source species |
| ensembl_version | Ensembl release version |
| download_date | Date the reference was downloaded |
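As a rough illustration of how these columns might be typed when read from the CSV, the sketch below converts one row (a dict of strings) into a small record. The `Gene` class, `parse_gene`, and the set of missing-value markers are assumptions for illustration, not part of the dataset or of evanverse. The length calculation assumes 1-based, inclusive coordinates, as Ensembl uses.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed text markers for a missing value in the CSV.
MISSING = {"", "NA", "NaN", "nan"}


@dataclass
class Gene:
    ensembl_id: str
    symbol: Optional[str]
    entrez_id: Optional[int]
    chromosome: str
    start: int
    end: int
    strand: int

    @property
    def length(self) -> int:
        # Assumes 1-based, inclusive start and end coordinates.
        return self.end - self.start + 1


def _optional(value):
    """Map missing-value markers to None, otherwise strip whitespace."""
    if value is None or value.strip() in MISSING:
        return None
    return value.strip()


def parse_gene(row: dict) -> Gene:
    entrez = _optional(row.get("entrez_id"))
    return Gene(
        ensembl_id=row["ensembl_id"],
        symbol=_optional(row.get("symbol")),
        # Values such as "4535.0" are float-formatted, so go through float.
        entrez_id=int(float(entrez)) if entrez is not None else None,
        chromosome=row["chromosome"],
        start=int(row["start"]),
        end=int(row["end"]),
        strand=int(row["strand"]),
    )


nd1 = parse_gene({
    "ensembl_id": "ENSG00000198888", "symbol": "MT-ND1",
    "entrez_id": "4535.0", "chromosome": "MT",
    "start": "3307", "end": "4262", "strand": "1",
})
print(nd1.entrez_id, nd1.length)  # 4535 956
```

Storing `entrez_id` as an optional integer avoids the float formatting that appears when a numeric column contains missing values.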

Preview

Head Rows

| ensembl_id | symbol | entrez_id | gene_type | chromosome | start | end | strand | description | species | ensembl_version | download_date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ENSG00000210049 | MT-TF | NaN | Mt_tRNA | MT | 577 | 647 | 1 | mitochondrially encoded tRNA-Phe (UUU/C) [Source:HGNC Symbol;Acc:HGNC:7481] | human | 115 | 2026-04-17 |
| ENSG00000211459 | MT-RNR1 | NaN | Mt_rRNA | MT | 648 | 1601 | 1 | mitochondrially encoded 12S rRNA [Source:HGNC Symbol;Acc:HGNC:7470] | human | 115 | 2026-04-17 |
| ENSG00000210077 | MT-TV | NaN | Mt_tRNA | MT | 1602 | 1670 | 1 | mitochondrially encoded tRNA-Val (GUN) [Source:HGNC Symbol;Acc:HGNC:7500] | human | 115 | 2026-04-17 |
| ENSG00000210082 | MT-RNR2 | NaN | Mt_rRNA | MT | 1671 | 3229 | 1 | mitochondrially encoded 16S rRNA [Source:HGNC Symbol;Acc:HGNC:7471] | human | 115 | 2026-04-17 |
| ENSG00000209082 | MT-TL1 | NaN | Mt_tRNA | MT | 3230 | 3304 | 1 | mitochondrially encoded tRNA-Leu (UUA/G) 1 [Source:HGNC Symbol;Acc:HGNC:7490] | human | 115 | 2026-04-17 |
| ENSG00000198888 | MT-ND1 | 4535.0 | protein_coding | MT | 3307 | 4262 | 1 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 1 [Source:HGNC Symbol;Acc:HGNC:7455] | human | 115 | 2026-04-17 |
| ENSG00000210100 | MT-TI | NaN | Mt_tRNA | MT | 4263 | 4331 | 1 | mitochondrially encoded tRNA-Ile (AUU/C) [Source:HGNC Symbol;Acc:HGNC:7488] | human | 115 | 2026-04-17 |
| ENSG00000210107 | MT-TQ | NaN | Mt_tRNA | MT | 4329 | 4400 | -1 | mitochondrially encoded tRNA-Gln (CAA/G) [Source:HGNC Symbol;Acc:HGNC:7495] | human | 115 | 2026-04-17 |
| ENSG00000210112 | MT-TM | NaN | Mt_tRNA | MT | 4402 | 4469 | 1 | mitochondrially encoded tRNA-Met (AUA/G) [Source:HGNC Symbol;Acc:HGNC:7492] | human | 115 | 2026-04-17 |
| ENSG00000198763 | MT-ND2 | 4536.0 | protein_coding | MT | 4470 | 5511 | 1 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 [Source:HGNC Symbol;Acc:HGNC:7456] | human | 115 | 2026-04-17 |
| ENSG00000210117 | MT-TW | NaN | Mt_tRNA | MT | 5512 | 5579 | 1 | mitochondrially encoded tRNA-Trp (UGA/G) [Source:HGNC Symbol;Acc:HGNC:7501] | human | 115 | 2026-04-17 |
| ENSG00000210127 | MT-TA | NaN | Mt_tRNA | MT | 5587 | 5655 | -1 | mitochondrially encoded tRNA-Ala (GCN) [Source:HGNC Symbol;Acc:HGNC:7475] | human | 115 | 2026-04-17 |

Gene Type Counts

| gene_type | genes |
|---|---|
| lncRNA | 36749 |
| protein_coding | 24080 |
| processed_pseudogene | 10234 |
| rRNA | 4032 |
| unprocessed_pseudogene | 2853 |
| misc_RNA | 2418 |
| snRNA | 2230 |
| miRNA | 1945 |
| transcribed_unprocessed_pseudogene | 1885 |
| transcribed_processed_pseudogene | 1219 |
| TEC | 1084 |
| snoRNA | 1042 |
| rRNA_pseudogene | 517 |
| IG_V_pseudogene | 300 |
| IG_V_gene | 229 |
| transcribed_unitary_pseudogene | 206 |
| TR_V_gene | 160 |
| unitary_pseudogene | 97 |
| TR_J_gene | 93 |
| IG_D_gene | 64 |
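The table lists 20 biotypes, while the preview rows include `Mt_tRNA` and `Mt_rRNA`, which are not among them, so the listing appears to be partial. The arithmetic sketch below compares the listed counts with the row total from the Overview; the variable names are illustrative only.

```python
TOTAL_ROWS = 91_703  # "Rows" value from the Overview

# Counts copied from the Gene Type Counts table.
listed = {
    "lncRNA": 36749,
    "protein_coding": 24080,
    "processed_pseudogene": 10234,
    "rRNA": 4032,
    "unprocessed_pseudogene": 2853,
    "misc_RNA": 2418,
    "snRNA": 2230,
    "miRNA": 1945,
    "transcribed_unprocessed_pseudogene": 1885,
    "transcribed_processed_pseudogene": 1219,
    "TEC": 1084,
    "snoRNA": 1042,
    "rRNA_pseudogene": 517,
    "IG_V_pseudogene": 300,
    "IG_V_gene": 229,
    "transcribed_unitary_pseudogene": 206,
    "TR_V_gene": 160,
    "unitary_pseudogene": 97,
    "TR_J_gene": 93,
    "IG_D_gene": 64,
}

listed_total = sum(listed.values())
unlisted = TOTAL_ROWS - listed_total
print(f"listed: {listed_total}, in other biotypes: {unlisted}")
# listed: 91437, in other biotypes: 266
```

The 266 remaining genes belong to biotypes not shown in the table.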

Chromosome Counts

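| chromosome | genes |
|---|---|
| 1 | 8610 |
| 10 | 3284 |
| 11 | 4180 |
| 12 | 3875 |
| 13 | 2010 |
| 14 | 2810 |
| 15 | 2854 |
| 16 | 3234 |
| 17 | 3894 |
| 18 | 1663 |
| 19 | 3567 |
| 2 | 5720 |
| 20 | 1977 |
| 21 | 1987 |
| 22 | 1754 |
| 3 | 4170 |
| 4 | 3671 |
| 5 | 4017 |
| 6 | 4254 |
| 7 | 3981 |
| 8 | 3328 |
| 9 | 3211 |
| GL000008.2 | 3 |
| GL000009.2 | 7 |
| GL000194.1 | 5 |
| GL000195.1 | 9 |
| GL000205.2 | 7 |
| GL000213.1 | 4 |
| GL000214.1 | 2 |
| GL000216.2 | 1 |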
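The chromosome names above appear in string order (1, 10, 11, ..., 2, 20, ...), which is what a plain text sort produces. For plots or reports, a karyotype-style order is often easier to read. The sketch below is one possible sort key; `chrom_sort_key` is a hypothetical helper, and placing X, Y, and MT after the autosomes and scaffolds last is an assumed convention, not something the dataset prescribes.

```python
def chrom_sort_key(name: str):
    """Sort autosomes numerically, then X, Y, MT, then scaffolds by name."""
    if name.isdigit():
        return (0, int(name), "")
    special = {"X": 0, "Y": 1, "MT": 2}
    if name in special:
        return (1, special[name], "")
    # Anything else (e.g. GL000008.2) is treated as an unplaced scaffold.
    return (2, 0, name)


names = ["1", "10", "11", "2", "GL000008.2", "MT", "X", "3"]
print(sorted(names, key=chrom_sort_key))
# ['1', '2', '3', '10', '11', 'X', 'MT', 'GL000008.2']
```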

Missing Values

| variable | missing |
|---|---|
| symbol | 40268 |
| entrez_id | 54923 |
| description | 854 |
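To put these counts in proportion, they can be divided by the 91,703 rows given in the Overview. The sketch below does only that arithmetic; `coverage` is a hypothetical helper name.

```python
TOTAL_ROWS = 91_703  # "Rows" value from the Overview

# Missing counts copied from the Missing Values table.
missing = {"symbol": 40_268, "entrez_id": 54_923, "description": 854}


def coverage(missing_count: int, total: int = TOTAL_ROWS) -> float:
    """Fraction of rows with a non-missing value."""
    return 1 - missing_count / total


for name, n in missing.items():
    print(f"{name}: {n / TOTAL_ROWS:.1%} missing, {coverage(n):.1%} present")
```

Roughly 44% of genes lack a symbol and about 60% lack an Entrez ID, so joins on either column will drop a large share of rows; `ensembl_id` is the safer join key.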

Plot Notes

| Plot structure | Use |
|---|---|
| Ranked bar chart | Most frequent gene biotypes |
| Chromosome bar chart | Gene counts by chromosome |
| Genomic range plot | Gene positions along selected chromosomes |
| Missingness chart | Symbol and Entrez ID coverage |

Download

gene_ref_human.csv