004 · gene_ref_mouse

Bioinformatics

Source

Mouse gene annotation reference downloaded from Ensembl with `evanverse::download_gene_ref("mouse")`. The preview rows record Ensembl release 115 and a download date of 2026-04-17.

The dataset is arranged at gene resolution: one row per Ensembl gene.

Overview

| Field | Value |
|---|---|
| Category | bioinformatics |
| Rows | 78,873 |
| Columns | 12 |
| Key variable | `ensembl_id` |
| File | `toy/004_gene_ref_mouse.qmd` |

Variables

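| Column | Description |
|---|---|
| ensembl_id | Ensembl gene identifier (key variable, for example ENSMUSG00000064336) |
| symbol | Gene symbol (for example mt-Nd1); missing for some genes |
| entrez_id | NCBI Entrez Gene identifier; missing for many genes |
| gene_type | Ensembl gene biotype (for example protein_coding, lncRNA) |
| chromosome | Chromosome or scaffold name (for example 1, MT, GL456210.1) |
| start | Gene start coordinate in bp (1-based) |
| end | Gene end coordinate in bp (inclusive) |
| strand | Genomic strand: 1 for forward, -1 for reverse |
| description | Ensembl gene description, with the source accession in brackets |
| species | Source species (mouse) |
| ensembl_version | Ensembl release version (115 in the preview rows) |
| download_date | Date the reference was downloaded (YYYY-MM-DD) |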
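The Head Rows preview prints `entrez_id` as `NaN` or `17716.0`, which suggests the column is read as a float whenever values are missing. The sketch below shows one way to parse rows into typed records with the standard library, keeping `entrez_id` as an optional integer and `chromosome` as text. The `Gene` class, the helper names, and the set of tokens treated as missing are assumptions for illustration; the CSV itself may encode missing values differently.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Tokens treated as missing. Which one the real CSV uses is an assumption.
MISSING = {"", "NA", "NaN", "nan"}


@dataclass
class Gene:
    """Hypothetical typed record covering the columns used in this sketch."""
    ensembl_id: str
    symbol: Optional[str]
    entrez_id: Optional[int]
    gene_type: str
    chromosome: str
    start: int
    end: int
    strand: int


def _optional(value: str) -> Optional[str]:
    """Map a missing-value token to None, otherwise return the text unchanged."""
    return None if value in MISSING else value


def parse_gene(row: dict) -> Gene:
    """Convert one CSV row (a dict of strings) into a typed Gene record."""
    entrez = _optional(row["entrez_id"])
    return Gene(
        ensembl_id=row["ensembl_id"],
        symbol=_optional(row["symbol"]),
        # Go through float so both "17716" and "17716.0" parse to 17716.
        entrez_id=None if entrez is None else int(float(entrez)),
        gene_type=row["gene_type"],
        chromosome=row["chromosome"],  # text: "1", "MT", "GL456210.1"
        start=int(row["start"]),
        end=int(row["end"]),
        strand=int(row["strand"]),
    )


def read_genes(handle) -> List[Gene]:
    """Read every row from an open CSV handle."""
    return [parse_gene(row) for row in csv.DictReader(handle)]


# Two rows copied from the preview, trimmed to the columns used here.
SAMPLE = """ensembl_id,symbol,entrez_id,gene_type,chromosome,start,end,strand
ENSMUSG00000064336,mt-Tf,NaN,Mt_tRNA,MT,1,68,1
ENSMUSG00000064341,mt-Nd1,17716.0,protein_coding,MT,2751,3707,1
"""

genes = read_genes(io.StringIO(SAMPLE))
```

To read the downloaded file instead of the inline sample, pass an open handle, for example `open("gene_ref_mouse.csv", newline="")`. Keeping `chromosome` as text avoids mixing integers such as 1 with names such as MT or GL456210.1.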

Preview

Head Rows

| ensembl_id | symbol | entrez_id | gene_type | chromosome | start | end | strand | description | species | ensembl_version | download_date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ENSMUSG00000064336 | mt-Tf | NaN | Mt_tRNA | MT | 1 | 68 | 1 | mitochondrially encoded tRNA phenylalanine [Source:MGI Symbol;Acc:MGI:102487] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064337 | mt-Rnr1 | NaN | Mt_rRNA | MT | 70 | 1024 | 1 | mitochondrially encoded 12S rRNA [Source:MGI Symbol;Acc:MGI:102493] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064338 | mt-Tv | NaN | Mt_tRNA | MT | 1025 | 1093 | 1 | mitochondrially encoded tRNA valine [Source:MGI Symbol;Acc:MGI:102472] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064339 | mt-Rnr2 | NaN | Mt_rRNA | MT | 1094 | 2675 | 1 | mitochondrially encoded 16S rRNA [Source:MGI Symbol;Acc:MGI:102492] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064340 | mt-Tl1 | NaN | Mt_tRNA | MT | 2676 | 2750 | 1 | mitochondrially encoded tRNA leucine 1 [Source:MGI Symbol;Acc:MGI:102482] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064341 | mt-Nd1 | 17716.0 | protein_coding | MT | 2751 | 3707 | 1 | mitochondrially encoded NADH dehydrogenase 1 [Source:MGI Symbol;Acc:MGI:101787] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064342 | mt-Ti | NaN | Mt_tRNA | MT | 3706 | 3774 | 1 | mitochondrially encoded tRNA isoleucine [Source:MGI Symbol;Acc:MGI:102484] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064343 | mt-Tq | NaN | Mt_tRNA | MT | 3772 | 3842 | -1 | mitochondrially encoded tRNA glutamine [Source:MGI Symbol;Acc:MGI:102477] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064344 | mt-Tm | NaN | Mt_tRNA | MT | 3845 | 3913 | 1 | mitochondrially encoded tRNA methionine [Source:MGI Symbol;Acc:MGI:102480] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064345 | mt-Nd2 | 17717.0 | protein_coding | MT | 3914 | 4951 | 1 | mitochondrially encoded NADH dehydrogenase 2 [Source:MGI Symbol;Acc:MGI:102500] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064346 | mt-Tw | NaN | Mt_tRNA | MT | 4950 | 5016 | 1 | mitochondrially encoded tRNA tryptophan [Source:MGI Symbol;Acc:MGI:102471] | mouse | 115 | 2026-04-17 |
| ENSMUSG00000064347 | mt-Ta | NaN | Mt_tRNA | MT | 5018 | 5086 | -1 | mitochondrially encoded tRNA alanine [Source:MGI Symbol;Acc:MGI:102491] | mouse | 115 | 2026-04-17 |
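Ensembl coordinates are 1-based and inclusive, so a gene's length is `end - start + 1`, and neighbouring genes can share bases. A minimal sketch using values copied from the preview rows follows; the function names and the `+`/`-` strand mapping are illustrative choices, not part of the dataset.

```python
def gene_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    if end < start:
        raise ValueError(f"end ({end}) is before start ({start})")
    return end - start + 1


def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of bases shared by two 1-based, inclusive intervals (0 if none)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# Map the numeric strand code to the conventional symbol.
STRAND_SYMBOL = {1: "+", -1: "-"}

# (symbol, start, end, strand) copied from the preview rows.
preview = [
    ("mt-Tf", 1, 68, 1),
    ("mt-Nd1", 2751, 3707, 1),
    ("mt-Ti", 3706, 3774, 1),
    ("mt-Tq", 3772, 3842, -1),
]

lengths = {sym: gene_length(s, e) for sym, s, e, _ in preview}
strands = {sym: STRAND_SYMBOL[st] for sym, _, _, st in preview}

# mt-Nd1 ends at 3707 and mt-Ti starts at 3706, so the two genes share 2 bp.
nd1_ti_overlap = overlap_bp(2751, 3707, 3706, 3774)
```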

Gene Type Counts

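The table appears to be limited to the 20 most frequent biotypes. The listed rows account for 78,742 of the 78,873 genes; rarer biotypes such as Mt_tRNA and Mt_rRNA, which appear in the preview, are not shown.

| gene_type | genes |
|---|---|
| lncRNA | 32936 |
| protein_coding | 22067 |
| processed_pseudogene | 9315 |
| TEC | 3219 |
| miRNA | 2206 |
| unprocessed_pseudogene | 2168 |
| snRNA | 1608 |
| snoRNA | 1517 |
| transcribed_unprocessed_pseudogene | 983 |
| transcribed_processed_pseudogene | 938 |
| misc_RNA | 562 |
| rRNA | 354 |
| IG_V_gene | 218 |
| IG_V_pseudogene | 158 |
| TR_V_gene | 144 |
| unitary_pseudogene | 107 |
| transcribed_unitary_pseudogene | 87 |
| TR_J_gene | 70 |
| scaRNA | 51 |
| TR_V_pseudogene | 34 |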
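For summaries it can help to collapse the detailed biotypes into a few broad classes. The sketch below uses the counts from the table above and a hypothetical grouping rule (any biotype whose name contains "pseudogene" is a pseudogene; protein_coding and lncRNA stay separate; everything else is "other"). The grouping is an illustrative assumption, not an Ensembl classification.

```python
TOTAL_GENES = 78_873  # row count from the Overview table

# Counts copied from the Gene Type Counts table.
biotype_counts = {
    "lncRNA": 32936,
    "protein_coding": 22067,
    "processed_pseudogene": 9315,
    "TEC": 3219,
    "miRNA": 2206,
    "unprocessed_pseudogene": 2168,
    "snRNA": 1608,
    "snoRNA": 1517,
    "transcribed_unprocessed_pseudogene": 983,
    "transcribed_processed_pseudogene": 938,
    "misc_RNA": 562,
    "rRNA": 354,
    "IG_V_gene": 218,
    "IG_V_pseudogene": 158,
    "TR_V_gene": 144,
    "unitary_pseudogene": 107,
    "transcribed_unitary_pseudogene": 87,
    "TR_J_gene": 70,
    "scaRNA": 51,
    "TR_V_pseudogene": 34,
}


def broad_class(gene_type: str) -> str:
    """Hypothetical grouping rule used only for this illustration."""
    if "pseudogene" in gene_type:
        return "pseudogene"
    if gene_type in ("protein_coding", "lncRNA"):
        return gene_type
    return "other"


# Sum the detailed counts within each broad class.
broad = {}
for gene_type, n in biotype_counts.items():
    key = broad_class(gene_type)
    broad[key] = broad.get(key, 0) + n

# Genes whose biotype is not among the listed rows.
listed_total = sum(biotype_counts.values())
unlisted = TOTAL_GENES - listed_total

# Share of all genes, in percent, for each broad class.
shares = {key: 100.0 * n / TOTAL_GENES for key, n in broad.items()}

for key, n in sorted(broad.items(), key=lambda item: -item[1]):
    print(f"{key:<15} {n:>6}  {shares[key]:5.1f}%")
```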

Chromosome Counts

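Rows are in text sort order, so 10 follows 1. The table appears to be limited to the first 30 names in that order: the listed rows account for 73,380 of the 78,873 genes, and names that sort later, such as MT (seen in the preview), X, and Y, are not shown.

| chromosome | genes |
|---|---|
| 1 | 5012 |
| 10 | 3770 |
| 11 | 4529 |
| 12 | 3462 |
| 13 | 3540 |
| 14 | 3497 |
| 15 | 2868 |
| 16 | 2453 |
| 17 | 3346 |
| 18 | 2119 |
| 19 | 2037 |
| 2 | 5503 |
| 3 | 4119 |
| 4 | 4368 |
| 5 | 4752 |
| 6 | 4311 |
| 7 | 5977 |
| 8 | 3695 |
| 9 | 3981 |
| GL456210.1 | 6 |
| GL456211.1 | 8 |
| GL456212.1 | 2 |
| GL456219.1 | 2 |
| GL456221.1 | 10 |
| GL456239.1 | 1 |
| GL456354.1 | 5 |
| GL456372.1 | 1 |
| GL456381.1 | 1 |
| GL456385.1 | 2 |
| JH584296.1 | 3 |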
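Chromosome names are text, so plain string sorting places 10 before 2. A natural sort key is one way to get karyotype order for tables and plots. The sketch below puts numbered chromosomes first, then X, Y, and MT, then scaffolds alphabetically; that ordering is a presentation choice assumed here, not something the dataset prescribes.

```python
# Rank for the named chromosomes; the order X, Y, MT is an assumed convention.
SPECIAL_ORDER = {"X": 0, "Y": 1, "MT": 2}


def chromosome_sort_key(name: str):
    """Sort key: numbered chromosomes, then X/Y/MT, then scaffolds by name."""
    if name.isdigit():
        return (0, int(name), "")
    if name in SPECIAL_ORDER:
        return (1, SPECIAL_ORDER[name], "")
    return (2, 0, name)


names = ["1", "10", "11", "2", "GL456210.1", "MT", "X", "3", "JH584296.1", "Y"]
ordered = sorted(names, key=chromosome_sort_key)
# ordered == ["1", "2", "3", "10", "11", "X", "Y", "MT", "GL456210.1", "JH584296.1"]
```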

Missing Values

| variable | missing |
|---|---|
| symbol | 740 |
| entrez_id | 50064 |
| description | 53 |
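The missing counts are easier to interpret as coverage relative to the 78,873 rows. A small sketch of that arithmetic, using only the numbers reported above (the helper name `coverage` is illustrative):

```python
TOTAL_ROWS = 78_873  # row count from the Overview table

# Missing-value counts copied from the Missing Values table.
missing = {"symbol": 740, "entrez_id": 50_064, "description": 53}


def coverage(missing_count: int, total: int):
    """Return (rows with a value, percentage of rows with a value)."""
    present = total - missing_count
    return present, 100.0 * present / total


report = {name: coverage(n, TOTAL_ROWS) for name, n in missing.items()}

for name, (present, pct) in report.items():
    print(f"{name:<12} present={present:>6}  coverage={pct:6.2f}%")
```

Under these counts, roughly 37% of genes carry an Entrez ID, so joins on `entrez_id` drop most rows; `ensembl_id` is the safer key.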

Plot Notes

| Plot structure | Use |
|---|---|
| Ranked bar chart | Most frequent gene biotypes |
| Chromosome bar chart | Gene counts by chromosome |
| Genomic range plot | Gene positions along selected chromosomes |
| Missingness chart | Symbol and Entrez ID coverage |
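For a genomic range plot, one common preparation step is to count genes per fixed-width window along a chromosome. The sketch below bins the MT gene start coordinates from the preview into 1,000 bp windows; the bin size and function names are assumptions for illustration, and the resulting counts can be passed to any plotting library.

```python
from collections import Counter


def bin_index(start: int, bin_size: int) -> int:
    """Zero-based window index for a 1-based start coordinate."""
    return (start - 1) // bin_size


def genes_per_bin(starts, bin_size: int) -> dict:
    """Count gene starts per window, returned in window order."""
    counts = Counter(bin_index(s, bin_size) for s in starts)
    return dict(sorted(counts.items()))


# Start coordinates of the 12 MT genes shown in the preview rows.
mt_starts = [1, 70, 1025, 1094, 2676, 2751, 3706, 3772, 3845, 3914, 4950, 5018]

bins = genes_per_bin(mt_starts, bin_size=1000)

for index, count in bins.items():
    # Window boundaries reported in 1-based, inclusive coordinates.
    lo, hi = index * 1000 + 1, (index + 1) * 1000
    print(f"{lo:>5}-{hi:<5} {'#' * count}")
```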

Download

gene_ref_mouse.csv