2017 May 12;10:25. doi: 10.1186/s13072-017-0133-5

Table 1.

Linear regression models explaining DNA methylation level at CGIs in 60–65 µm oocytes

| Variable | Simple linear regression: significance (p value) | Simple linear regression: % variability explained | Lasso regression: coefficient value in the most regularised model^b,c |
| --- | --- | --- | --- |
| H3K4me3 enrichment, p10 ChIP-seq | 2.34 × 10^-6 | 4.8 | N/A |
| H3K4me2 enrichment, p10 ChIP-seq | 2.88 × 10^-13 | 11.2 | −0.132 |
| H3K36me3 enrichment, p10 ChIP-seq | 8.21 × 10^-8 | 6.2 | 0.050 |
| KDM1A dependence | 1.1 × 10^-11 | 9.7 | −0.132 |
| KDM1B dependence | 1.5 × 10^-12 | 10.5 | −0.106 |
| CpG density | 0.000767 | 2.5 | N/A |
| %GC content | 0.169199 | 0.9 | N/A |
| Transcription level (log-transformed) | 0.000297 | 2.9 | N/A |
| Enriched motif occurrences (CBCCGCC, CCCMAM, CBCCGGG)^a | 3.97 × 10^-8 | 6.5 | −0.019 |

Simple linear regressions (variables tested individually) and multiple linear regression (variables tested together) modelling the relationship between explanatory variables and DNA methylation level at CGIs in 60–65 µm oocytes. The outcome of each model is presented as the proportion of the variability in DNA methylation level at CGIs in 60–65 µm oocytes explained by the variables.
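As a rough illustration of what the "% variability explained" column measures, the sketch below computes the coefficient of determination (R²) for a one-variable linear fit. It assumes that the percentage reported in the table is R² × 100, which the legend suggests but does not state outright. The data are synthetic, and the names `simple_r_squared`, `enrichment` and `methylation` are hypothetical, not taken from the paper.

```python
import random


def simple_r_squared(x, y):
    """Fraction of the variance in y explained by a one-variable linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # With a single predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Synthetic data (an assumption for illustration, not the paper's data):
# a weak negative dependence of methylation on enrichment, plus noise.
rng = random.Random(0)
enrichment = [rng.gauss(0.0, 1.0) for _ in range(500)]
methylation = [-0.3 * e + rng.gauss(0.0, 1.0) for e in enrichment]

pct_explained = 100.0 * simple_r_squared(enrichment, methylation)
print(f"{pct_explained:.1f}% of variability explained")
```

A variable can be highly significant and still explain only a few percent of the variability, as several rows of the table show. The p value reflects how reliably the slope differs from zero, whereas R² reflects how much of the spread it accounts for.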

^a See Fig. 6a for motif details. These three motifs were selected as they represent binding sites of known proteins.

^b Coefficients of the variables in the most regularised model, as selected by the software's cross-validation of candidate models. These coefficients correspond to the values on the y axis in Fig. 8. N/A marks variables that are not included in the model.

^c The Lasso regression model including the 5 variables indicated in the column accounts for 18.5% of the variation.
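The N/A entries in the Lasso column follow from how an L1 penalty behaves: as the penalty grows, weak coefficients are shrunk to exactly zero and the corresponding variables drop out of the model. The sketch below illustrates this with a minimal coordinate-descent Lasso on synthetic, centred data. It is not the software or the cross-validation procedure used for the table, which is not named here. All names (`lasso_coordinate_descent`, `soft_threshold`, `centre`, `lam`) and the data are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it lies within it."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centre(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 for centred data.

    columns is a list of predictor columns, each a list of n values.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # y - Xb, with b starting at zero
    col_sq = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(c * r for c, r in zip(col, residual)) / n + col_sq[j] * coefs[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - coefs[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                coefs[j] = new
    return coefs


# Synthetic data (illustrative only): x1 and x3 influence y, x2 does not.
rng = random.Random(1)
n = 200
x1 = centre([rng.gauss(0.0, 1.0) for _ in range(n)])
x2 = centre([rng.gauss(0.0, 1.0) for _ in range(n)])
x3 = centre([rng.gauss(0.0, 1.0) for _ in range(n)])
y = centre([-0.5 * a + 0.2 * c + rng.gauss(0.0, 1.0) for a, c in zip(x1, x3)])
columns = [x1, x2, x3]

# Smallest penalty at which every coefficient is exactly zero.
lam_max = max(abs(sum(c * t for c, t in zip(col, y)) / n) for col in columns)

for lam in (0.0, 0.05, 0.2, 1.01 * lam_max):
    coefs = lasso_coordinate_descent(columns, y, lam)
    print(f"lam={lam:.3f}", [round(b, 3) for b in coefs])
```

In practice the penalty is chosen by cross-validation over a grid of values such as the one looped over here. Picking the most regularised model among those that perform acceptably favours fewer non-zero coefficients, which is consistent with only 5 of the 9 variables being retained in the table.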
