2.5 years ago

I have ChIP-seq data from several studies and would like to normalise them with the TMM and upper-quartile methods from the edgeR package, then see which method works better for my data.

As you can see in the tables below, the normalized values differ between the two methods, but when I compute the correlations and draw a heatmap, all the values are identical.

- I want to know whether correlation is a good way to compare the methods, and why all the values after `cor()` are the same. Is it correct to draw a heatmap from the correlation result?

```r
> dge <- DGEList(counts=data)
> data_upperquartile <- calcNormFactors(dge, method="upperquartile")
> data_upperquartile <- data.frame(cpm(data_upperquartile, normalized.lib.sizes = TRUE))
> data_upperquartile[c(100:105), c(1:3)]
          A         B          C
  0.1007585 0.1230328 0.01741683
  0.1151526 0.1730148 0.03483366
  0.1439407 0.2268417 0.04644487
  0.1727289 0.2768238 0.05225048
  0.1631328 0.2460656 0.04644487
  0.1103546 0.1461014 0.02902805
> data_TMM <- calcNormFactors(dge, method="TMM")
> data_TMM <- data.frame(cpm(data_TMM, normalized.lib.sizes = TRUE))
> data_TMM[c(100:105), c(1:3)]
           A         B          C
  0.09484844 0.1153246 0.01901974
  0.10839821 0.1621753 0.03803947
  0.13549776 0.2126298 0.05071930
  0.16259732 0.2594804 0.05705921
  0.15356413 0.2306493 0.05071930
  0.10388162 0.1369480 0.03169956
> cor_data_upperquartile <- cor(data_upperquartile)
          A         B         C
A 1.0000000 0.9878731 0.9383675
B 0.9878731 1.0000000 0.9739410
C 0.9383675 0.9739410 1.0000000
> cor_data_TMM <- cor(data_TMM)
          A         B         C
A 1.0000000 0.9878731 0.9383675
B 0.9878731 1.0000000 0.9739410
C 0.9383675 0.9739410 1.0000000
```
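The identical matrices are expected rather than a bug. With `normalized.lib.sizes = TRUE`, `cpm()` divides each sample's counts by that sample's effective library size, so every column is multiplied by a single positive constant, and Pearson correlation does not change when a column is rescaled that way. Here is a minimal sketch with simulated counts. The matrix `counts` and the factors `f_a` and `f_b` are made up for illustration and only stand in for two different sets of per-sample scaling factors; they are not real TMM or upper-quartile values.

```r
# Simulated count matrix: 1000 regions x 3 samples (illustrative data only)
set.seed(1)
base <- rgamma(1000, shape = 2, scale = 10)
counts <- sapply(c(A = 1, B = 1.5, C = 0.4),
                 function(depth) rpois(1000, lambda = base * depth))

# Two hypothetical sets of per-sample scaling factors,
# standing in for two linear normalization methods
f_a <- c(A = 0.90, B = 1.10, C = 1.01)
f_b <- c(A = 1.20, B = 0.80, C = 1.04)

# Divide each column by its factor (one constant per sample)
norm_a <- sweep(counts, 2, f_a, "/")
norm_b <- sweep(counts, 2, f_b, "/")

# The normalized values differ between the two methods ...
head(norm_a, 3)
head(norm_b, 3)

# ... but the Pearson correlation matrices agree
all.equal(cor(norm_a), cor(norm_b))
all.equal(cor(counts), cor(norm_a))
```

Because of this, a heatmap of the `cor()` output describes how similar the samples are to each other, but it cannot tell two scaling-based normalizations apart.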

As a remark, that is only true if normalization uses linear factors such as in TMM or the geometric mean approach of DESeq2. If you do something like quantile normalization or loess regression, `cor` will change dramatically.

Thanks for your reply. So how can I find which method is better?

I recommend reading the `csaw` manual on ChIP-seq normalization. It explains the concepts quite nicely and contains code to plot MA plots to visually check the normalization "efficiency".
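To give a rough idea of what such a check looks like, here is a minimal base-R sketch of an MA plot between two samples. It is not the code from the `csaw` manual. The simulated vectors `x` and `y` and the helper `ma_values()` are hypothetical and only illustrate the idea: after a suitable normalization, the bulk of the regions should be centred on M = 0.

```r
# Hypothetical helper: M (log ratio) and A (mean log abundance) for two samples
ma_values <- function(x, y, prior = 0.5) {
  lx <- log2(x + prior)   # small offset avoids log2(0)
  ly <- log2(y + prior)
  data.frame(A = (lx + ly) / 2, M = lx - ly)
}

# Simulated values for two samples (illustrative data only);
# y carries the same signal at 1.3x depth, mimicking an uncorrected bias
set.seed(2)
signal <- rgamma(2000, shape = 2, scale = 5)
x <- rpois(2000, signal)
y <- rpois(2000, signal * 1.3)

ma <- ma_values(x, y)

# Dashed line: M = 0. Red line: observed median of M.
# A gap between the two, or a trend of M with A, suggests residual bias.
plot(ma$A, ma$M, pch = 16, cex = 0.4,
     xlab = "A (mean log2 abundance)", ylab = "M (log2 ratio)")
abline(h = 0, lty = 2)
abline(h = median(ma$M), col = "red")
```

Drawing the same plot from the values produced by each normalization method, for each pair of samples, gives a visual way to compare them: the method that brings the cloud closer to M = 0 without leaving a trend is doing the better job for that pair.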