MoSBi

Welcome

Welcome to the Molecular Signatures with Biclustering (MoSBi) webtool!

We apply biclustering algorithms to omics data in combination with a novel ensemble approach to find molecular signatures in the data. This can yield new insights into the data that go beyond traditional clustering, where only one dimension is considered.

Try it out with your own data or use the example data available on the '1. Upload' page.

A tutorial for the webtool can be found on the 'Tutorial' page.

MoSBi is not only available as a public webservice, but can also be used as an R package, or the webserver can be deployed on your own computer using Docker. To download and run MoSBi locally, look at the 'About' page.

Before using MoSBi, we recommend reading the publication:

Rose et al. "MoSBi: Automated signature mining for molecular stratification and subtyping"
Proceedings of the National Academy of Sciences, 2022; 119 (16): e2118210119;
doi: https://doi.org/10.1073/pnas.2118210119

Please note that your uploaded data will be processed on our servers and automatically deleted once your session is closed. Note that our server has only limited capacity. If the page freezes or your data exceeds the limit, consider running MoSBi locally. Further information is available on the 'About' page.

1. Upload Data

Upload your data in the format explained on the 'Tutorial' page. You can additionally upload labels and colors for your data. Once the data is uploaded, you can proceed to the computation tab.

The data size is limited to 1e+07 rows and 1e+07 columns. If your data exceeds this limit, download and run MoSBi on your own machine. For more information, visit the 'About' page.

Example data from:

Ku et al., Molecular Oncology (2020)

Data summary

UMAP visualization

3. Biclustering Results

3.1. Distribution of bicluster sizes

Gives an overview of the sizes of biclusters.

3.2. Biclusters

Heatmaps of all computed biclusters, sorted by algorithm.

3.3. Download biclusters

Biclusters can be downloaded here, or later on page 4.

Download Biclusters (as xlsx table)

Interactive Molecular Signatures

This page contains interactive versions of the visualizations seen on the molecular signatures page. You may expect increased load times when this page is opened for the first time.

Tutorial

This tutorial will walk you through the main functionalities of the webtool. For a full analysis, steps 1 to 4 must all be completed.

1. Data upload

To start the analysis, a dataset must be uploaded. You can either use an example dataset by clicking the 'Use example data' button or upload your own data. Additionally, you can upload labels for the samples in the dataset and custom colors, which will be used in the generated plots. Both are optional and can be added at any time, even after the analysis is done; the plots will then be updated automatically with the new colors or conditions.

The data must be uploaded in .csv format (comma-separated values). Here are examples of what the uploaded data should look like:

Dataset (required)

Names	Sample1	Sample2	...
Gene1	0.12	2.35	...
Gene2	5.12	0.98	...
...	...	...	...

Labels for samples (Optional)

id	condition
Sample1	Disease1
Sample2	Disease2
...	...

Colors (Optional)

condition	color
Disease1	green
Disease2	red
...	...
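The three files shown above could also be written programmatically. The following is a minimal Python sketch using only the standard library. The file names, the comma delimiter and the helper `write_csv` are assumptions made for illustration and are not prescribed by MoSBi.

```python
import csv
import os
import tempfile


def write_csv(path, header, rows):
    """Write a header line followed by data rows as comma-separated values."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical output directory and file names.
out_dir = tempfile.mkdtemp()
dataset_path = os.path.join(out_dir, "dataset.csv")
labels_path = os.path.join(out_dir, "labels.csv")
colors_path = os.path.join(out_dir, "colors.csv")

# Dataset: the first column holds the row names, the others are samples.
write_csv(dataset_path, ["Names", "Sample1", "Sample2"],
          [["Gene1", 0.12, 2.35], ["Gene2", 5.12, 0.98]])

# Labels: one condition per sample id.
write_csv(labels_path, ["id", "condition"],
          [["Sample1", "Disease1"], ["Sample2", "Disease2"]])

# Colors: one color per condition.
write_csv(colors_path, ["condition", "color"],
          [["Disease1", "green"], ["Disease2", "red"]])
```

The sample ids in the labels file match the column names of the dataset, and the conditions in the colors file match those in the labels file.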

Row and column names are required for the dataset. If no labels are uploaded, they can also be generated using k-means clustering. Other options include normalizing the data and replacing missing values. Once the data is uploaded, the size of the dataset and the number of missing values are shown, and a UMAP visualization gives an overview of the data and labels.
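To give a rough idea of what replacing missing values and normalizing can mean for a single row, here is a small Python sketch. The specific choices (row-mean imputation, z-score scaling) and the function names are assumptions for illustration; the methods the webtool actually applies may differ.

```python
import math
from statistics import mean, pstdev


def impute_row_mean(row):
    """Replace missing values (None or NaN) with the mean of the observed values."""
    observed = [v for v in row if v is not None and not math.isnan(v)]
    fill = mean(observed) if observed else 0.0
    return [fill if v is None or math.isnan(v) else v for v in row]


def z_score(row):
    """Scale a row to zero mean and unit (population) standard deviation."""
    mu = mean(row)
    sigma = pstdev(row)
    if sigma == 0:
        # A constant row carries no variation; map it to zeros.
        return [0.0 for _ in row]
    return [(v - mu) / sigma for v in row]


row = [1.0, None, 3.0]
filled = impute_row_mean(row)  # the missing entry becomes the row mean
scaled = z_score(filled)       # centered and scaled version of the row
```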

The data size is limited on our servers: the maximum number of rows is 1e+07 and the maximum number of columns is 1e+07. If your data exceeds this limit, please run MoSBi locally. Instructions can be found on the 'About' page.
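Before uploading, a file can be checked locally against these requirements. The sketch below is a hypothetical pre-upload check in Python, not part of MoSBi. It assumes a comma-separated file whose first column holds the row names and uses the limits quoted above.

```python
import csv
import io

MAX_ROWS = int(1e7)  # row limit quoted in the text
MAX_COLS = int(1e7)  # column limit quoted in the text


def check_dataset(handle, max_rows=MAX_ROWS, max_cols=MAX_COLS):
    """Return (n_rows, n_cols) of the data matrix or raise ValueError."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None or len(header) < 2:
        raise ValueError("a header with at least one sample column is required")
    n_cols = len(header) - 1  # the first column holds the row names
    if n_cols > max_cols:
        raise ValueError(f"too many columns: {n_cols} > {max_cols}")
    n_rows = 0
    for line_no, record in enumerate(reader, start=2):
        if not record:
            continue  # skip blank lines
        if len(record) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} fields")
        if not record[0].strip():
            raise ValueError(f"line {line_no}: row name is missing")
        n_rows += 1
        if n_rows > max_rows:
            raise ValueError(f"too many rows: more than {max_rows}")
    return n_rows, n_cols


example = io.StringIO("Names,Sample1,Sample2\nGene1,0.12,2.35\nGene2,5.12,0.98\n")
shape = check_dataset(example)
```

For a real file, the same function can be called with `open(path, newline="")` instead of the in-memory example.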

The data is now ready to continue with the analysis and compute biclusters.

2. Compute biclusters

The first step of the analysis is the computation of biclusters using multiple algorithms. If you have no prior experience with or knowledge of biclustering, it is recommended to use the preselected set of algorithms. If you are only interested in the result of one algorithm, it is also sufficient to select only that algorithm.

The computation can now be started by clicking the 'Compute biclusters' button. This will execute all selected algorithms with default parameters. The parameters of each algorithm can be changed individually in the blue boxes. A box in the lower right corner indicates the progress of the computation.

Once the computation has finished, an overview of the executed algorithms is shown. It indicates whether errors occurred in any of the algorithms and lists all error messages, which allows you to adapt the algorithm parameters and rerun the algorithms.
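The pattern of running several algorithms and collecting error messages instead of aborting can be sketched as follows. This is a generic Python illustration, not MoSBi's implementation. The names `run_algorithms`, `algo_a` and `algo_b` are hypothetical placeholders and do not correspond to real biclustering algorithms.

```python
def run_algorithms(algorithms, data):
    """Run every algorithm on the data, keeping results and errors apart."""
    results = {}
    errors = {}
    for name, func in algorithms.items():
        try:
            results[name] = func(data)
        except Exception as exc:  # record the message and continue
            errors[name] = str(exc)
    return results, errors


def algo_a(data):
    """Placeholder that 'finds' one bicluster covering the whole matrix."""
    return [{"rows": list(range(len(data))), "cols": list(range(len(data[0])))}]


def algo_b(data):
    """Placeholder that always fails, to show how errors are reported."""
    raise ValueError("parameter out of range")


data = [[0.12, 2.35], [5.12, 0.98]]
results, errors = run_algorithms({"algo_a": algo_a, "algo_b": algo_b}, data)
```

After the run, `errors` maps each failed algorithm to its message, so its parameters can be adjusted and only that algorithm rerun.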

In the next step, the resulting biclusters can be investigated.

3. Biclustering Results

This page gives you an overview of the computed biclusters and the option to download them.

The histogram in 3.1. plots the size distribution of all biclusters for rows and columns. Additionally, in 3.2. each bicluster can be selected individually and inspected as a heatmap with labels for the samples. Finally, all biclusters can be downloaded as an xlsx table.
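The size distribution described here can be illustrated with a few lines of Python. The representation of a bicluster as a dictionary with a `rows` set and a `cols` set is an assumption for this sketch, and the example biclusters are made up.

```python
from collections import Counter

# Made-up biclusters: each one is a set of rows (features) and columns (samples).
biclusters = [
    {"rows": {"Gene1", "Gene2"}, "cols": {"Sample1", "Sample2", "Sample3"}},
    {"rows": {"Gene2", "Gene3"}, "cols": {"Sample1", "Sample2"}},
    {"rows": {"Gene1", "Gene3", "Gene4"}, "cols": {"Sample2", "Sample3"}},
]


def size_distribution(biclusters):
    """Count how many biclusters have each row size and each column size."""
    row_sizes = Counter(len(b["rows"]) for b in biclusters)
    col_sizes = Counter(len(b["cols"]) for b in biclusters)
    return row_sizes, col_sizes


row_sizes, col_sizes = size_distribution(biclusters)
```

Each counter maps a size to the number of biclusters with that size, which is the information a histogram of bicluster sizes displays.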

4. Molecular Signatures

The final step is the extraction of molecular signatures by creating bicluster networks and community detection.

Additional parameters can be modified, but this is not recommended and should only be done after reading the publication. The networks can then be calculated using the corresponding green button. The first two plots show the estimation of the similarity threshold. For further information, please consult the publication.

The next plots show the bicluster similarity networks, colored by the labels of the samples and by biclustering algorithm. Two connected biclusters have a higher-than-random similarity. On the left, the biclusters are colored by the sample labels that are included in the biclusters. This gives you an overview of the network and the similarities in the results of different algorithms.
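The idea of a bicluster similarity network can be sketched in Python as follows. This is a simplified illustration under stated assumptions: biclusters are reduced to sets of row names, similarity is measured with the Jaccard index, and the threshold is a fixed number chosen by hand. MoSBi estimates its threshold from the data, and its similarity measure may differ; see the publication for the actual method.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(biclusters, threshold):
    """Return (i, j, score) for every pair whose similarity exceeds the threshold."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(biclusters), 2):
        score = jaccard(a, b)
        if score > threshold:
            edges.append((i, j, score))
    return edges


# Made-up biclusters, each given by its set of rows.
biclusters = [
    {"Gene1", "Gene2", "Gene3"},
    {"Gene2", "Gene3", "Gene4"},
    {"Gene7", "Gene8"},
]
edges = similarity_edges(biclusters, threshold=0.4)
```

Here the first two biclusters share two of four distinct genes and are connected, while the third shares nothing with the others and stays isolated.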

Now, communities can be calculated to extract the signatures. A minimum community size can be defined. Afterwards, all communities will be shown in tabs below as networks and tables, where the included samples and features can be investigated. The visualization of the network as a hybrid matrix diagram plots biclusters as heatmaps in the network. This helps to visually inspect the similarity of biclusters.
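The effect of a minimum community size can be shown with a small Python sketch. Note that it uses connected components as a simple stand-in for community detection, which is not necessarily the algorithm MoSBi applies. The function name `communities` and the example network are hypothetical.

```python
def communities(n_nodes, edges, min_size=2):
    """Group nodes into connected components and drop those below min_size."""
    adjacency = {node: set() for node in range(n_nodes)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    found = []
    for start in range(n_nodes):
        if start in seen:
            continue
        # Depth-first traversal collecting every node reachable from start.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) >= min_size:
            found.append(sorted(component))
    return found


# Made-up network of five biclusters (nodes 0 to 4).
groups = communities(5, [(0, 1), (1, 2), (3, 4)], min_size=3)
```

With a minimum size of 3, only the group of biclusters 0, 1 and 2 is kept; the pair 3 and 4 is filtered out.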

All results can be downloaded as a zip file afterwards. Additionally, it is possible to extract the content of a community and perform another round of biclustering on this subset of the data. This will temporarily save the data on our server and make it available through a public link that appears after clicking the corresponding button. The link is invalidated after 30 seconds. Clicking the button again generates a new link.

The 'Interactive Molecular Signatures' page provides the same community network plots, but in an interactive form.

About

Thank you for using MoSBi!

Citation

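If you use MoSBi, please cite our publication:

Rose et al. "MoSBi: Automated signature mining for molecular stratification and subtyping"
Proceedings of the National Academy of Sciences, 2022; 119 (16): e2118210119;
doi: https://doi.org/10.1073/pnas.2118210119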

Feedback

If you have questions or suggestions on how to improve MoSBi, please contact us:

<tim.rose[a.t_)tum.de>

<josch.pauling[a.t_)tum.de>

Code

MoSBi is free software. If you want to run MoSBi locally or have a look at the code, see:

MoSBi webapp

MoSBi R package

MoSBi R package on bioconductor

Impressum

Dr. Josch K. Pauling
bidt Junior Research Group LipiTUM
Chair of Experimental Bioinformatics
TUM School of Life Sciences Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising
Germany

Welcome

1. Upload Data

1.1. Upload a dataset

1.2. Labels for samples (Optional)

1.3. Upload colors (Optional)

1.4. Data modification (Optional)

Data summary

UMAP visualization

2. Compute Biclusters

2.1. Algorithms for bicluster computation

2.2. Optional Parameters for the biclustering algorithms

Fabia

CC

Xmotifs

Isa2

QUBIC

Plaid

Bimax

Spectral

Quest

BicARE

Biclustpy

QUBIC2

Bi-Force

UniBic

akmbiclust

2.3. Run Algorithms

3. Biclustering Results

3.1. Distribution of bicluster sizes

3.2. Biclusters

3.3. Download biclusters

4. Molecular Signatures

4.1. Ensemble parameters (Optional)

4.2. Calculate Networks

4.3. Error model statistics

4.4. Bicluster networks

Colored by conditions

Colored by algorithm

4.5. Molecular signatures

4.6. Download Results

(4.7. Recursive Biclustering)
