
MicrobialLandScape



The MicrobialLandScape application lets users upload bacterial abundance files (bacterial 16S rDNA amplicon results) and predict soil physicochemical parameters and geographical location. The models were trained on AMI soil 16S rDNA datasets, with metadata drawn from the public AMI, NVIS and ALUM8 resources. The models are still under development: their accuracy and performance have yet to be evaluated. A built-in dataset is included for demonstration purposes.

The information below gives additional details about the training dataset used in the application.

UMAP for Australian Microbiome contextual data

Select samples with the rectangle tool in the widget at the top right of the dashboard's AMI map. Samples inside the selected geographical area are then highlighted on the UMAP figure below. Both graphs are interactive and support data filtering and navigation.
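The rectangle selection can be pictured as a simple bounding-box filter over sample coordinates. The sketch below is only an illustration of that idea, not the application's code: the `Sample` record, its field names and the `select_in_rectangle` helper are hypothetical, and the sketch assumes a rectangle that does not cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical record; field names are assumptions for illustration.
    sample_id: str
    latitude: float
    longitude: float


def select_in_rectangle(samples, south, west, north, east):
    """Return the IDs of samples inside the rectangle (edges inclusive).

    Assumes south <= north and west <= east, i.e. the rectangle
    does not wrap around the antimeridian.
    """
    return [
        s.sample_id
        for s in samples
        if south <= s.latitude <= north and west <= s.longitude <= east
    ]


# Made-up coordinates, purely for demonstration.
demo_samples = [
    Sample("s1", -35.0, 149.0),
    Sample("s2", -12.5, 131.0),
    Sample("s3", -42.0, 147.0),
]
selected_ids = select_in_rectangle(demo_samples, south=-45.0, west=140.0, north=-30.0, east=155.0)
```

The returned IDs are what a linked plot would need in order to highlight the matching points on the UMAP figure.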

UMAP embedding of OTU counts.

Dimensionality reduction was performed on bacterial profiles taken from the BPA website: the UMAP function was applied to the transformed OTU count matrix. Counts of OTUs with matching “unknown” taxonomies were added together; in other words, different sequences with unknown taxonomies were treated as identical. Selected metadata columns were taken from the main contextual mastersheet table and merged with the UMAP results.
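As a rough illustration of treating sequences with the same taxonomy label as identical, the sketch below sums per-sample counts of OTUs that share a label before any embedding is computed. The data layout (nested dictionaries), the `collapse_by_taxonomy` name and the fallback of unlabeled OTUs to "unknown" are assumptions made for this example; the transformation and the UMAP step itself are not shown.

```python
from collections import defaultdict


def collapse_by_taxonomy(otu_counts, taxonomy):
    """Sum per-sample counts of OTUs that share a taxonomy label.

    otu_counts: {otu_id: {sample_id: count}}
    taxonomy:   {otu_id: label}; OTUs absent from this map are
                treated as "unknown" (an assumption of this sketch).
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for otu_id, per_sample in otu_counts.items():
        label = taxonomy.get(otu_id, "unknown")
        for sample_id, count in per_sample.items():
            collapsed[label][sample_id] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {label: dict(per_sample) for label, per_sample in collapsed.items()}


# Made-up counts: otu1 and otu2 are distinct sequences with unknown taxonomy.
demo_counts = {
    "otu1": {"A": 3, "B": 0},
    "otu2": {"A": 2, "B": 5},
    "otu3": {"A": 1, "B": 1},
}
demo_taxonomy = {"otu1": "unknown", "otu2": "unknown", "otu3": "Bacillus"}
collapsed_counts = collapse_by_taxonomy(demo_counts, demo_taxonomy)
```

Collapsing this way reduces the number of columns in the count matrix, at the cost of merging sequences that may in fact belong to different organisms.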

Australian Microbiome contextual data.

Contextual metadata were taken from the BPA website.

Number of soil samples by locality.

Locality                      Samples
Christmas Island              16
Jervis Bay Territory          42
Sydney                        48
Monaro                        92
Australian Capital Territory  116
Antarctica: Windmill Islands  162
Tasmania                      237
New South Wales               270
Antarctica                    330
Queensland                    347
Victoria                      375
South Australia               406
Northern Territory            457
Western Australia             592

Major Vegetation Groups NVIS layer

The “Version 6.0 Major Vegetation Groups – Estimated Pre-1750 Vegetation” (NVIS 2023) is a classification system that categorizes and describes the major types of native vegetation estimated to have existed in Australia before European colonization and the land use changes that followed. The Major Vegetation Groups are a simplified representation of Australia’s native vegetation. They abstract away the details of individual plant species and focus on the overall patterns of vegetation distribution. They encompass various combinations of plant species within the canopy, shrub, or ground layers that display structural similarities and are often dominated by a single genus. From a mapping perspective, these groups represent the predominant vegetation found within a map unit, even where several vegetation types coexist.

The R library leaflet was used to generate this plot.

CRC TiME

TiME.

  • CRC – Cooperative Research Centres program
  • TiME – Transformation in Mining Economies

This project aims to establish a link between soil microbial communities and plant vegetation, with the possibility of extending the analysis to seed provisioning selection. The Shiny application supports exploratory analysis of the microbial dataset: sample profiles can be plotted on a scatter plot and highlighted by metadata values. UMAP is used for dimensionality reduction. Differential abundance analysis is performed with the DESeq2 package; only pairwise contrasts can be applied for heatmap plots.

To do:

  1. Sample selection (inclusion/exclusion) for the UMAP embedding.
  2. Better control of the heatmap figure:
    • number of expected clusters (k-means),
    • number of rows to show,
    • number of samples,
    • test interactive ComplexHeatmap.
  3. Convert non-numerical columns to factor columns for coloring the UMAP.

FINDMYSNP

GTCHIPS.

The findmysnp application is designed for quick navigation through Illumina microarray annotations. It was originally intended for genotyping BeadChip arrays only, but annotation from the methylation EPICv2 array was added later. The idea is to browse and compare arrays for the best gene and SNP content. Note that the EPICv2 array lists a large number of SNPs in its report, but these are not directly queried by the EPIC methylation assay; the reported SNPs are merely in close vicinity to a CpG methylation probe/site on the EPIC array. Another point worth mentioning is the gene annotation: it may not match the Illumina product files 100%, because extra work was done on the annotation files to harmonize them across chips. This mostly concerns matching rs-ids to associated genes. Some Illumina files may be missing gene symbols for particular rs-ids. To make rs-id annotations uniform, missing gene symbols were populated from the entire chip collection: in other words, if an rs-id is annotated with a gene symbol on at least one chip, that symbol is propagated across all chips. Two genome versions were added; note that the hg38 coordinates were created by liftover from hg19.
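The gene symbol propagation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used to build the annotation files: the dictionary layout, the `propagate_gene_symbols` name, and the chip, rs-id and gene names are all placeholders. The sketch also assumes that an existing annotation on a chip is never overwritten and that, when chips disagree, the first symbol encountered is used to fill gaps.

```python
def propagate_gene_symbols(chips):
    """Fill missing gene symbols using annotations from any chip.

    chips: {chip_name: {rs_id: gene_symbol or None}}
    Returns a new mapping of the same shape; the input is not modified.
    """
    # First pass: collect one known gene symbol per rs-id across all chips.
    known = {}
    for annotations in chips.values():
        for rs_id, gene in annotations.items():
            if gene and rs_id not in known:
                known[rs_id] = gene

    # Second pass: keep existing symbols, fill missing ones where possible.
    return {
        chip: {rs_id: gene or known.get(rs_id) for rs_id, gene in annotations.items()}
        for chip, annotations in chips.items()
    }


# Placeholder chips, rs-ids and gene symbols, for demonstration only.
demo_chips = {
    "chip_a": {"rs1": "GENE1", "rs2": None},
    "chip_b": {"rs1": None, "rs2": None, "rs3": "GENE3"},
}
harmonized = propagate_gene_symbols(demo_chips)
```

An rs-id with no gene symbol on any chip (here `rs2`) stays unannotated, since there is nothing to propagate.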

The application is reasonably mature. A few extra features would be good to add: 1) Region-based search by chromosome and coordinate input. 2) Input format flexibility: comma-separated, newline-separated and space-separated lists. 3) An interface to include or exclude particular chips from the analysis. 4) New Illumina methylation chip.
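For the input format flexibility item, one possible approach is to split the pasted text on any run of commas and whitespace. The sketch below assumes identifiers contain neither commas nor whitespace; the `parse_id_list` name is hypothetical and the de-duplication step is an extra design choice, not something the application is known to do.

```python
import re


def parse_id_list(text):
    """Split free-form input into a list of unique identifiers.

    Accepts comma-separated, space-separated and newline-separated
    input (or any mix) and preserves the order of first appearance.
    """
    tokens = re.split(r"[,\s]+", text.strip())
    seen = set()
    result = []
    for token in tokens:
        # Skip empty tokens (e.g. from a leading comma) and repeats.
        if token and token not in seen:
            seen.add(token)
            result.append(token)
    return result


# Placeholder identifiers mixing all three separators.
parsed_ids = parse_id_list("rs1, rs2\nrs3 rs1,,rs4\n")
```

Normalizing the input to a single list up front keeps the lookup code independent of how the user pasted the identifiers.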

Code and documentation for an older version are available on GitHub.

A Docker container with the Shiny application and R libraries is also available (Docker).

Note that this one includes only the GTA and EPIC chips.

misc

ServerStatus.

A visitor counter app for the Shiny logs.

Inspired by https://www.rcharlie.com/blog/shiny-monitor/.




Work in progress . . .



AMI map

TiME


Shiny application for exploring soil bacterial profiles of TiME samples. These are fixed 16S amplicon diversity profiles generated with the AGRF DivPro pipeline and merged with the available metadata. The first tab [UMAP] is dedicated to browsing samples and exploring sample clusters. The sample set can be navigated via an interactive dimensionality reduction performed on the fly with UMAP. This scatter plot can be colored by metadata columns. The second tab [Differential abundance] provides tools for testing the abundance of bacterial species contrasted by any metadata column.