KIAA1841

SANBR
Identifiers
Aliases	SANBR, SANT and BTB domain regulator of CSR, KIAA1841
External IDs	MGI: 1918925; HomoloGene: 19038; GeneCards: SANBR; OMA:SANBR - orthologs
Gene location (Human)
Chr.	Chromosome 2 (human)
End	61,138,034 bp
Gene location (Mouse)
Chr.	Chromosome 11 (mouse)
End	23,583,639 bp
RNA expression pattern
	Top expressed in (human)
	endothelial cell; bronchial epithelial cell; Brodmann area 23; primary visual cortex; Achilles tendon; testicle; mucosa of paranasal sinus; corpus callosum; middle temporal gyrus; ganglionic eminence
	Top expressed in (mouse)
	trigeminal ganglion; spermatocyte; spermatid; superior cervical ganglion; ganglionic eminence; ventral tegmental area; pontine nuclei; otic vesicle; dorsal tegmental nucleus; medial dorsal nucleus
Orthologs
Species	Human	Mouse
Entrez	84542	71675
Ensembl	ENSG00000162929	ENSMUSG00000042208
UniProt	Q6NSI8	Q68FF0
RefSeq (mRNA)	NM_001129993; NM_032506; NM_001330432; NM_001330433; NM_001330434; NM_001330435; NM_001330436	NM_027860; NM_001347242
RefSeq (protein)	NP_001123465; NP_001317361; NP_001317362; NP_001317363; NP_001317364; NP_001317365; NP_115895	NP_001334171; NP_082136

KIAA1841 is a gene in humans that encodes a protein known as KIAA1841 (uncharacterized protein KIAA1841). The KIAA1841 protein is targeted to the nucleus and is predicted to play a role in regulating transcription.

Gene

Location

Location of KIAA1841 on chromosome 2

KIAA1841 is located on the short arm of chromosome 2 (2p15), starting at 61297486 and ending at 61349294. The KIAA1841 gene spans 52809 base pairs and is oriented on the plus (+) strand. The coding region is made up of 4292 base pairs, and the protein sequence is 718 amino acids long.^[5]

Gene neighborhood

Genes PEX13 and C2orf74 neighbor KIAA1841 on chromosome 2.^[6]

Expression

Diagram depicting the expression of KIAA1841 in tissues throughout the body.

KIAA1841 is highly expressed in reproductive structures and nervous tissue, including the brain, prostate, cervix and ear. It is expressed at intermediate levels in the lungs and spinal cord.^[7]^[8] KIAA1841 is also expressed at low levels in a wide range of tissues throughout the human body.

Transcript variants

In humans, the KIAA1841 gene produces 18 alternatively spliced transcript variants as well as 3 unspliced forms. Of the 18 spliced variants, 4 encode a protein product. The main transcript in humans is transcript ID ENST00000402291, or OTTHUMT00000325477.^[9]^[10]^[11]

Homology

Paralogs

There are no known paralogs of KIAA1841.^[12]

Orthologs

Below is a table of selected orthologs of human KIAA1841. The table includes closely, intermediately and distantly related orthologs.

Species	NCBI accession #	Sequence length	Protein identity	mRNA identity
Homo sapiens (Human)	NP_001123465.1	718	100%	100%
Odobenus rosmarus divergens (Walrus)	XP_004397774.1	718	94%	97%
Canis lupus familiaris (Dog)	XP_538505	718	92%	96%
Equus caballus (Horse)	XP_001495879.1	765	93%	96%
Mus musculus (Mouse)	NP_082136.2	718	89%	94%
Echinops telfairi (Lesser hedgehog tenrec)	XP_004710320.1	718	86%	92%
Pelodiscus sinensis (Soft-shelled turtle)	XP_006122225.1	718	78%	88%
Anas platyrhynchos (Mallard duck)	XP_005016968.1	719	78%	87%
Gallus gallus (Red junglefowl)	NP_001186348.1	718	76%	87%
Xenopus tropicalis (Western clawed frog)	XP_004914757.1	715	71%	83%
Danio rerio (Zebrafish)	XP_001333668.2	735	60%	73%
Drosophila melanogaster (Fruit fly)	NP_648346.1	889	38%	62%
Apis mellifera (Western honeybee)	XP_006559923.1	849	40%	62%
Anopheles gambiae (Mosquito)	XP_558222.3	806	40%	61%

Orthologs of the human protein KIAA1841 are listed above, ordered approximately by increasing time since divergence from humans and, correspondingly, by decreasing percent identity. KIAA1841 is highly conserved across these orthologs: even the least similar ortholog listed retains 38% protein identity. KIAA1841 has evolved slowly and evenly over time.^[13]^[14]
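The identity percentages in the table come from pairwise sequence comparison. As a rough illustration of the arithmetic only, the sketch below computes percent identity over a pre-aligned pair of sequences, assuming the convention that gapped columns count toward the alignment length but never as matches. The function name and the toy sequences are hypothetical and are not taken from KIAA1841 or from any particular tool's output.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences share a residue.

    Columns containing a gap ('-') count toward the alignment length but
    never as a match. This is one common convention, not the only one.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment (10 columns, 7 identical residues).
toy_human = "MKT-LLVAGE"
toy_other = "MKTALLIAGD"
print(f"{percent_identity(toy_human, toy_other):.1f}% identity")  # 70.0% identity
```

Different tools may normalise by the shorter sequence or exclude gapped columns, so values computed this way are only comparable when the same convention is used throughout.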

Homologous domains

The domain of unknown function 3342 (DUF3342) is conserved in all orthologs and is the most highly conserved region of the protein. Conservation of this domain was traced as far back as the fungus Batrachochytrium dendrobatidis, which diverged from humans 1216 million years ago.^[15]

Protein

General properties

The molecular weight of KIAA1841 is 82 kilodaltons and its isoelectric point is 6.5. The protein sequence is neither enriched nor depleted in any amino acid. There are two non-polar stretches that could form transmembrane regions: a run of 21 residues scored "0" in the sequence analysis at positions 254–275, and a run of 24 such residues at positions 420–444. The DUF3342 domain stretches from 147–449.^[16]

	KIAA1841	DUF3342	N Terminus	C Terminus
Isoelectric point	6.5	6.74	5.5	8.2
Positive charge (%)	13.8	13.2	11.7	15.9
Negative charge (%)	14.4	13.5	14.2	14.5
Net charge	-0.6	-0.3	-2.5	1.4
Major hydrophobics (%)	24.8	28.7	27.2	22.3
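The net charge row in the table above is simply the positive-charge percentage minus the negative-charge percentage for each region. A minimal sketch of that arithmetic, using the values from the table (the variable and function names are illustrative only):

```python
# (positive %, negative %) per region, copied from the table above.
charge_percentages = {
    "KIAA1841": (13.8, 14.4),
    "DUF3342": (13.2, 13.5),
    "N Terminus": (11.7, 14.2),
    "C Terminus": (15.9, 14.5),
}


def net_charge(positive_pct: float, negative_pct: float) -> float:
    """Net charge as the difference of the two percentages, to one decimal."""
    return round(positive_pct - negative_pct, 1)


net = {region: net_charge(p, n) for region, (p, n) in charge_percentages.items()}
for region, value in net.items():
    print(f"{region}: {value:+.1f}")
```

The results reproduce the table's net charge row: the N terminus is the most negatively charged region and the C terminus is the only region with a net positive value.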

Composition

Amino acids are evenly distributed across KIAA1841. The percent composition of each amino acid is fairly consistent throughout the orthologs of the protein. The most distant ortholog displays the most variance in amino acid composition, with a higher percent composition of alanine, histidine and leucine and a lower composition of lysine.

The protein sequence of KIAA1841 is neither enriched nor depleted in any amino acid. The same is true in Mus musculus, Danio rerio and Drosophila melanogaster, but not in the most distantly related ortholog: the Batrachochytrium dendrobatidis protein is rich in histidine. Humans and closely related orthologs are composed of 2.2% to 3.8% histidine, compared to 5% in Batrachochytrium dendrobatidis.
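Composition figures such as the histidine percentages above are obtained by counting each residue and dividing by the sequence length. The following is a minimal sketch of that calculation, assuming a one-letter amino acid string as input. The toy sequence is hypothetical and is not part of KIAA1841.

```python
from collections import Counter


def percent_composition(sequence: str) -> dict[str, float]:
    """Return each residue's share of the sequence as a percentage."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


# Hypothetical 20-residue toy sequence containing 4 histidines (H).
toy = "MHLKAHLEAGHLSPTAVLHK"
composition = percent_composition(toy)
print(f"histidine: {composition['H']:.1f}%")  # histidine: 20.0%
```

Running the same function over each ortholog's full sequence would allow the per-residue percentages to be compared side by side.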

Domains

The DUF3342 domain stretches from 147–449 on KIAA1841 and has a molecular weight of 35.7 kDa. The DUF domain is low in glycine (2%) and rich in cysteine (6.3%). Both of the non-polar stretches in the protein are located within the DUF domain, one at the beginning and one at the end.
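As a quick consistency check on these figures, the sketch below derives the domain length from its inclusive boundaries, its share of the full 718-residue protein, and the implied average mass per residue. All inputs are the numbers quoted in this article; the constant names are illustrative.

```python
PROTEIN_LENGTH = 718                  # amino acids
PROTEIN_MASS_KDA = 82.0               # molecular weight of the full protein
DOMAIN_START, DOMAIN_END = 147, 449   # DUF3342 boundaries (inclusive)
DOMAIN_MASS_KDA = 35.7                # molecular weight of the domain

# Inclusive residue count for the domain.
domain_length = DOMAIN_END - DOMAIN_START + 1

# Share of the full protein covered by the domain, as a percentage.
domain_fraction = 100.0 * domain_length / PROTEIN_LENGTH

# Implied average residue mass in daltons (1 kDa = 1000 Da).
mean_residue_mass_domain = 1000.0 * DOMAIN_MASS_KDA / domain_length
mean_residue_mass_protein = 1000.0 * PROTEIN_MASS_KDA / PROTEIN_LENGTH

print(f"domain length: {domain_length} residues")
print(f"share of protein: {domain_fraction:.1f}%")
print(f"mean residue mass, domain: {mean_residue_mass_domain:.1f} Da")
print(f"mean residue mass, protein: {mean_residue_mass_protein:.1f} Da")
```

The domain works out to 303 residues, which covers a little over two fifths of the protein, and the two average residue masses are close to each other, as expected for a protein without a strongly skewed composition.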

The domain of unknown function (DUF3342) is part of the pfam11822 family. This family of proteins has yet to be functionally characterized and is found in bacteria. The domain is usually between 170 and 303 amino acids in length. The N-terminal half of this protein family is a BTB-like domain. The BTB domain is a multifunctional protein-protein interaction motif involved in a number of cellular functions, including transcriptional regulation, cytoskeleton dynamics, gating and assembly of ion channels, and ubiquitination of proteins. BTB domain structures are highly conserved and are found on proteins that have only one or two other types of domains.

Post-translational modifications

KIAA1841 is predicted to be highly phosphorylated after translation, with 37 predicted phosphorylation sites. There is one leucine-rich nuclear export signal toward the end of the protein. There is one sulfated tyrosine; tyrosine sulfation strengthens protein-protein interactions. Two motifs with a high probability of being sumoylation sites were also found. Sumoylation is involved in a number of cellular processes, including nuclear-cytosolic transport, transcriptional regulation and protein stability.

Secondary structure

KIAA1841 is primarily composed of alpha helices and beta sheets. Alpha helices make up the majority of the protein; this is true for the DUF domain and both termini. The DUF domain has fewer beta sheets than the protein as a whole, and the C terminus has an even smaller proportion of beta sheets in its secondary structure.^[17]^[18]

	KIAA1841	DUF3342	N Terminus	C Terminus
Alpha helix (%)	68	68	63.9	70.1
Beta sheet (%)	61	49.2	60.8	35.8
Beta turn (%)	14.1	9.9	12.8	15.4

Subcellular localization

Protein KIAA1841 is targeted to the nucleus.^[19]

Interacting proteins

KIAA1841 was found to interact with SRPK1 (serine/arginine-rich protein-specific kinase 1).^[20] The interaction was detected via a protein kinase assay. SRPK1 localizes to the nucleus and the cytoplasm. By regulating the intracellular localization of splicing factors, SRPK1 is thought to play a role in regulating both constitutive and alternative splicing. KIAA1841 is also found in the nucleus and is thought to play a role in regulating transcription.

Clinical significance

Disease association

Diseases associated with this gene are Crohn’s disease, celiac disease and inflammatory bowel disease.^[21]^[22]

References

[1] GRCh38: Ensembl release 89: ENSG00000162929 – Ensembl, May 2017
[2] GRCm38: Ensembl release 89: ENSMUSG00000042208 – Ensembl, May 2017
[3] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[4] "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[5] "NCBI gene database". NCBI.
[6] "NCBI gene database". NCBI.
[7] "GEO profiles". NCBI GEO profiles.
[8] "EST profiles". NCBI EST profiles.
[9] "Ensembl". Vega.
[10] "Genecards". The Gene Human Database.
[11] "Aceview". NCBI.
[12] "Genecards". The Gene Human Database.
[13] "BLAST". NCBI.
[14] Hedges, SB. "TimeTree". Bioinformatics.
[15] "Biology Workbench". San Diego Supercomputer Center. [permanent dead link]
[16] "SAPS". Statistical Analysis of Protein Sequence, Biology Workbench. [permanent dead link]
[17] "PELE". San Diego Supercomputer Center.
[18] "CHOFAS (Predict Secondary Structure of PS". Chou-Fasman. Archived from the original on 2003-08-11. Retrieved 2014-05-10.
[19] "PSORT II". Expasy.
[20] "2 binary interactions found for search term KIAA1841". IntAct Molecular Interaction Database. EMBL-EBI. Retrieved 2018-08-25.
[21] "NCBI gene database". NCBI.
[22] "Genecards". The Gene Human Database.
