{"id":35491,"date":"2022-12-27T19:33:04","date_gmt":"2022-12-28T03:33:04","guid":{"rendered":"https:\/\/www.novogene.com\/us-en\/?post_type=resources&#038;p=35491"},"modified":"2025-05-29T03:41:42","modified_gmt":"2025-05-29T10:41:42","slug":"geo-data-mining-ii-differential-gene-expression-analysis-and-visualization-with-complete-code-sharing","status":"publish","type":"resources","link":"https:\/\/www.novogene.com\/us-en\/resources\/blog\/geo-data-mining-ii-differential-gene-expression-analysis-and-visualization-with-complete-code-sharing\/","title":{"rendered":"GEO Data Mining (II) &#8211; Differential gene expression analysis and visualization with complete code sharing"},"content":{"rendered":"<p>In the previous two articles, &#8220;<a href=\"https:\/\/www.novogene.com\/us-en\/resources\/blog\/gene-expression-omnibus-data-mining-ia-quick-and-easy-download-of-geo-data\/\" target=\"_blank\" rel=\"noopener noreferrer\">Gene Expression Omnibus Data Mining (IA) \u2013 Quick and Easy Download of GEO Data<\/a>&#8221; and &#8220;<a href=\"https:\/\/www.novogene.com\/us-en\/resources\/blog\/geo-data-mining-ib-downloading-sequence-read-archive-raw-data\/\" target=\"_blank\" rel=\"noopener noreferrer\">GEO Data Mining (IB) &#8211; Downloading Sequence Read Archive Raw Data<\/a>,&#8221; we shared procedures and code for downloading sequence data. Now that we have the data, we can begin mining it. This article covers how to extract the downloaded sequence data, perform differential gene expression (DGE) analyses, find up-regulated and down-regulated genes, and then visualize changes in gene expression.<\/p>\n<p>Why is differential expression analysis of genes valuable? During tumor pathogenesis or the progress of other diseases, gene expression can change. Some genes that were originally silent may become highly expressed, while other genes that were originally expressed normally may have increased or decreased expression in experimental samples. 
Genes whose expression levels differ from those in normal samples may control or affect tumorigenesis. Therefore, to study tumorigenesis mechanisms, investigators set up case (experimental or treatment) and control groups for DGE analysis, identify the genes that are differentially expressed between the treatment group and the control group, and run pathway enrichment analysis on those genes to study the pathogenesis of tumors. Investigators can then focus on interpreting the enriched pathways and explore how they relate to the phenotype of the samples.<\/p>\n<p>Through the following four steps, we present a complete set of shared code for differential gene expression analysis and visualization:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.novogene.com\/us-en\/wp-content\/uploads\/sites\/4\/2022\/12\/GEO-Data-Mining-II.png\" alt=\"\" \/><\/p>\n<p><strong>Step 1: Extract expression matrix, clinical information, and chip number<\/strong><\/p>\n<p>The data we downloaded, including its metadata, is stored in a list called \u201ceSet\u201d. We extract the first element of this list by subsetting: eSet[[1]]. We then use the \u201cexprs\u201d function to obtain the expression matrix, and preview a small corner of it to judge whether the values still need a log2 transformation. Values that are already log2-transformed are typically small (roughly 0 to 20), whereas untransformed intensities can reach the thousands.<\/p>\n<p>#(1) Extract expression matrix exp ----<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">library(Biobase) #provides exprs() and pData()<br \/>\nexp <- exprs(eSet[[1]])<br \/>\nexp[1:4,1:4] #preview the first 4 probes in the first 4 samples<br \/>\n#exp = log2(exp+1) #uncomment if the values are not yet log2-transformed<\/p>\n<p>#(2) Extract clinical information ----<br \/>\nThe clinical information of each sample can be obtained by using the pData() function. 
Usually, the source column (for example, source_name_ch1) or the characteristics column (for example, characteristics_ch1) of this data frame describes which samples are control and which are treatment.<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">pd <- pData(eSet[[1]])<\/p>\n<p>#(3) Extract chip platform number ----<br \/>\nDifferent data sets usually use different chip probes, so probe IDs are subsequently converted into Entrez IDs or gene symbols according to the GEO Platform (GPL) number. The Entrez ID is the NCBI Gene database&#8217;s numeric identifier for a gene, and the symbol is the official gene name abbreviation approved by HGNC.<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">gpl <- eSet[[1]]@annotation<br \/>\n#samples must be in the same order in pd and exp<br \/>\nstopifnot(identical(rownames(pd),colnames(exp)))<br \/>\n#gse is assumed to be the GSE accession string defined when the data were downloaded<br \/>\nsave(gse,exp,pd,gpl,file = \"step1_output.Rdata\") #Save the result<\/p>\n<p><img decoding=\"async\" style=\"display: block;margin: 0 auto;\" src=\"https:\/\/www.novogene.com\/us-en\/wp-content\/uploads\/sites\/4\/2022\/12\/GEO-Data-Mining-II1.png\" alt=\"\" \/><\/p>\n<p><strong>Step 2: Build grouping, chip annotation<\/strong><\/p>\n<p>1. Build group information<br \/>\nThe samples saved in Step 1 are grouped according to the sample source information in the GEO Series (GSE). In this example, samples whose source name contains \u201ctumor\u201d are labeled \u201ctest\u201d and all others are labeled \u201cnormal\u201d.<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">rm(list = ls())<br \/>\nload(\"step1_output.Rdata\")<br \/>\nlibrary(stringr)<br \/>\ntable(pd$source_name_ch1) #inspect the sample source labels<br \/>\ngroup_list = ifelse(str_detect(pd$source_name_ch1,\"tumor\"),\"test\",\"normal\")<br \/>\n#put \"normal\" first so that it is the reference level and logFC means test versus normal<br \/>\ngroup_list=factor(group_list,<br \/>\n                   levels = c(\"normal\",\"test\"))<br \/>\ngroup_list<br \/>\ntable(group_list)<\/p>\n<p>2. 
Chip annotation of GPL<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">library(GEOquery) #provides getGEO()<br \/>\nif(TRUE){<br \/>\n   a = getGEO(gpl,destdir = \".\")<br \/>\n   b = a@dataTable@table<br \/>\n   colnames(b) #check the column names; the gene symbol column differs between platforms<br \/>\n   ids2 = b[,c(\"ID\",\"GENE_SYMBOL\")]<br \/>\n   colnames(ids2) = c(\"probe_id\",\"symbol\")<br \/>\n   #drop probes without a symbol and probes that map to several genes<br \/>\n   ids2 = ids2[ids2$symbol!=\"\" &#038; !str_detect(ids2$symbol,\"\/\/\/\"),]<br \/>\n}<br \/>\nsave(group_list,ids2,exp,pd,file = \"step2_output.Rdata\")<\/p>\n<p><strong>Step 3: Data inspection \u2013 Principal Component Analysis<\/strong><\/p>\n<p>Next, perform principal component analysis (PCA) with the FactoMineR and factoextra packages to check whether the treatment and control samples separate into distinct groups. Use the code below to draw a PCA plot.<\/p>\n<p>#Import the Rdata results of the first two steps<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">rm(list = ls())#clear the environment<br \/>\nload(\"step1_output.Rdata\")<br \/>\nload(\"step2_output.Rdata\")<br \/>\ndat=as.data.frame(t(exp)) #PCA() expects samples in rows and probes in columns<br \/>\nlibrary(FactoMineR)<br \/>\nlibrary(factoextra)<br \/>\n# Run the PCA; graph = FALSE suppresses the default FactoMineR plots<br \/>\ndat.pca <- PCA(dat, graph = FALSE)<br \/>\npca_plot <- fviz_pca_ind(dat.pca,<br \/>\n                          geom.ind = \"point\", # show points only (but not \"text\")<br \/>\n                          col.ind = group_list, # color by groups<br \/>\n                          palette = c(\"#00AFBB\", \"#E7B800\"),<br \/>\n                          addEllipses = TRUE, # Concentration ellipses<br \/>\n                          legend.title = \"Groups\"<br \/>\n)<br \/>\nprint(pca_plot)<br \/>\nggsave(plot = pca_plot, filename = paste0(gse,\"PCA.png\"))<br \/>\nsave(pca_plot, file = \"pca_plot.Rdata\")<\/p>\n<p><img decoding=\"async\" style=\"display: block;margin: 0 auto;\" src=\"https:\/\/www.novogene.com\/us-en\/wp-content\/uploads\/sites\/4\/2022\/12\/GEO-Data-Mining-II2.png\" alt=\"\" \/><\/p>\n<p style=\"font-size: 12px;text-align: center;color: c0c0c0;\">PCA plot (significant difference between subgroups of data)<\/p>\n<p><strong>Step 4: Differential gene expression analysis and visualization &#8211; volcano map and heat map<\/strong><\/p>\n<p>Next, import the Rdata from Steps 1 and 2, and use the limma package to perform differential gene expression (DGE) analysis on each gene. For each gene, DGE analysis reports the log fold change (logFC), the average expression level, and whether the p value is significant. DGE analysis uses two main inputs: (1) the organized expression matrix, in which the row names are the probe or gene IDs and the column names are the sample names; and (2) the grouping information (group_list).<\/p>\n<p>DGE analysis with the limma package is divided into three steps: lmFit, eBayes, and topTable. limma works with up to three kinds of input: (1) the expression matrix (exp); (2) the design matrix, built from the grouping information; and (3) optionally, a contrast matrix that specifies which groups to compare. With only two groups, as here, the second coefficient of the design matrix already encodes the comparison, so the code below does not need a contrast matrix. topTable returns the table of differential expression results (deg), in which logFC and p value are the columns of most interest.<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">rm(list = ls())<br \/>\nload(\"step1_output.Rdata\")<br \/>\nload(\"step2_output.Rdata\")<\/p>\n<p>1. Differential gene expression analysis<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">library(limma)<br \/>\ndesign=model.matrix(~group_list)<br \/>\nfit=lmFit(exp, design)<br \/>\nfit=eBayes(fit)<br \/>\n#coef=2 compares the second level of group_list with the first (reference) level<br \/>\ndeg=topTable(fit, coef=2, number = Inf)<br \/>\nhead(deg)<\/p>\n<p>2. Add a few columns to the deg data frame<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">#1. 
Add the probe_id column and turn the row name into a column<br \/>\nlibrary(dplyr)<br \/>\ndeg <- mutate(deg,probe_id=rownames(deg))<br \/>\n#tibble::rownames_to_column()<br \/>\nhead(deg)<br \/>\n#merge the annotation table (ids2) into deg<br \/>\ntable(deg$probe_id %in% ids2$probe_id)<br \/>\n#deg <- inner_join(deg,ids2,by=\"probe_id\")<br \/>\ndeg <- merge(x = deg,y = ids2, by=\"probe_id\")<br \/>\n#merge() sorts by probe_id, so sort by p value again before dropping duplicate symbols<br \/>\ndeg <- deg[order(deg$P.Value),]<br \/>\ndeg <- deg[!duplicated(deg$symbol),]<br \/>\ndim(deg)<br \/>\n#2. Add change column (up, down, or stable), which is used for the volcano map<br \/>\n#logFC_t=mean(deg$logFC)+2*sd(deg$logFC)<br \/>\nlogFC_t=1.5<br \/>\n#P.Value is unadjusted; adj.P.Val is the stricter, multiple-testing-corrected alternative<br \/>\nchange=ifelse(deg$P.Value>0.05,'stable',<br \/>\n              ifelse( deg$logFC >logFC_t,'up',<br \/>\n                      ifelse( deg$logFC < -logFC_t,'down','stable') )<br \/>\n)<br \/>\ndeg <- mutate(deg, change)<br \/>\nhead(deg)<br \/>\ntable(deg$change)<\/p>\n<p>3. Add the ENTREZID column, which will be used later in the enrichment analysis<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">library(ggplot2)<br \/>\nlibrary(clusterProfiler)<br \/>\nlibrary(org.Hs.eg.db)<br \/>\n#map gene symbols to Entrez IDs; symbols without a match are dropped by the join below<br \/>\ns2e <- bitr(unique(deg$symbol), fromType = \"SYMBOL\",<br \/>\n            toType = c( \"ENTREZID\"),<br \/>\n            OrgDb = org.Hs.eg.db)<br \/>\nhead(s2e)<br \/>\nhead(deg)<br \/>\ndeg <- inner_join(deg,s2e,by=c(\"symbol\"=\"SYMBOL\"))<br \/>\nhead(deg)<br \/>\nsave(logFC_t,deg,file = \"step4_output.Rdata\")<\/p>\n<p><strong>Data visualization: volcano map and heat map<\/strong><\/p>\n<p>Several different types of graphs are available to visualize differential gene expression data, including volcano maps and heat maps.<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">rm(list = ls())<br \/>\nload(\"step1_output.Rdata\")<br \/>\nload(\"step2_output.Rdata\")<br \/>\nload(\"step4_output.Rdata\")<br \/>\nlibrary(dplyr)<br \/>\nlibrary(ggplot2)<\/p>\n<p>1. Volcano map<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">dat <- mutate(deg,v=-log10(P.Value))<br \/>\n#Option 1 (active): label selected genes of interest<br \/>\nif(TRUE){<br \/>\n  for_label <- dat%>%<br \/>\n    filter(symbol %in% c(\"RUNX2\",\"FN1\"))<br \/>\n}<br \/>\n#Option 2: label the first 10 rows of dat<br \/>\nif(FALSE){<br \/>\n  for_label <- dat %>% head(10)<br \/>\n}<br \/>\n#Option 3: label the first 3 up-regulated and the first 3 down-regulated genes<br \/>\nif(FALSE) {<br \/>\n  x1 = dat %>%<br \/>\n    filter(change == \"up\") %>%<br \/>\n    head(3)<br \/>\n  x2 = dat %>%<br \/>\n    filter(change == \"down\") %>%<br \/>\n    head(3)<br \/>\n  for_label = rbind(x1,x2)<br \/>\n}<br \/>\np <- ggplot(data = dat,<br \/>\n            aes(x = logFC,<br \/>\n                y = v)) +<br \/>\n  geom_point(alpha=0.4, size=3.5,<br \/>\n             aes(color=change)) +<br \/>\n  ylab(\"-log10(Pvalue)\")+<br \/>\n  #named values keep the colors correct even if one category is absent<br \/>\n  scale_color_manual(values=c(down=\"blue\", stable=\"grey\", up=\"red\"))+<br \/>\n  geom_vline(xintercept=c(-logFC_t,logFC_t),lty=4,col=\"black\",lwd=0.8) +<br \/>\n  geom_hline(yintercept = -log10(0.05), lty=4, col=\"black\", lwd=0.8) +<br \/>\n  theme_bw()<br \/>\np<br \/>\nvolcano_plot <- p +<br \/>\n  geom_point(size = 3, shape = 1, data = for_label) +<br \/>\n  ggrepel::geom_label_repel(<br \/>\n    aes(label = symbol),<br \/>\n    data = for_label,<br \/>\n    color=\"black\"<br \/>\n  )<br \/>\nvolcano_plot<br \/>\nggsave(plot = volcano_plot, filename = paste0(gse,\"volcano.png\"))<\/p>\n<p><img decoding=\"async\" style=\"display: block;margin: 0 auto;\" src=\"https:\/\/www.novogene.com\/us-en\/wp-content\/uploads\/sites\/4\/2022\/12\/description4.png\" alt=\"\" \/><\/p>\n<p style=\"font-size: 12px;text-align: center;color: c0c0c0;\">Volcano plot (Upregulation &#038; Downregulation in Gene Expression)<\/p>\n<p>2. 
Draw a heatmap<\/p>\n<p style=\"background-color: #aaaaaa;padding: 20px;\">#select the 1000 probes with the largest standard deviation across samples<br \/>\ncg=names(tail(sort(apply(exp,1,sd)),1000))<br \/>\nn=exp[cg,]<br \/>\nannotation_col=data.frame(group=group_list)<br \/>\nrownames(annotation_col) = colnames(n)<br \/>\nlibrary(pheatmap)<br \/>\nheatmap_plot <- pheatmap(n,<br \/>\n          show_colnames = FALSE,<br \/>\n          show_rownames = FALSE,<br \/>\n          annotation_col = annotation_col,<br \/>\n          scale = \"row\")<br \/>\n# save the result<br \/>\nlibrary(ggplot2)<br \/>\n#ggsave() opens and closes its own graphics device, so no png()\/dev.off() pair is needed<br \/>\nggsave(plot = heatmap_plot, filename = paste0(gse,\"heatmap.png\"))<\/p>\n<p><img decoding=\"async\" style=\"display: block;margin: 0 auto;\" src=\"https:\/\/www.novogene.com\/us-en\/wp-content\/uploads\/sites\/4\/2022\/12\/description-5.png\" alt=\"\" \/><\/p>\n<p style=\"font-size: 12px;text-align: center;color: c0c0c0;\">Heatmap (Measuring similarity of expression between subgroups)<\/p>\n<p>We have now completed the differential gene expression analysis and visualization part of GEO data mining and have identified significantly up-regulated and down-regulated genes. However, disease occurrence is often affected by more than one gene. 
For more tools to assist in studying disease and cancer mechanisms, stay tuned for our next tutorial on GEO data mining and identifying gene expression pathways.<\/p>\n","protected":false},"featured_media":0,"parent":0,"template":"","acf":[],"_links":{"self":[{"href":"https:\/\/www.novogene.com\/us-en\/wp-json\/wp\/v2\/resources\/35491"}],"collection":[{"href":"https:\/\/www.novogene.com\/us-en\/wp-json\/wp\/v2\/resources"}],"about":[{"href":"https:\/\/www.novogene.com\/us-en\/wp-json\/wp\/v2\/types\/resources"}],"wp:attachment":[{"href":"https:\/\/www.novogene.com\/us-en\/wp-json\/wp\/v2\/media?parent=35491"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}