Genetic structure


Step 1: Data input

Users can upload PED and MAP files for analysis. The geographical source or breed name of each individual may be defined in a sample information TXT file. Please download the example datasets and check the tutorial for detailed information about the file formats. The PED, MAP, and sample information files should be named with the suffixes “.ped”, “.map”, and “.txt”, respectively. Please note that each uploaded file is limited to a maximum size of 50 MB. Larger datasets should be uploaded via FTP or analyzed locally with the standalone package.
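The naming and size rules above can be checked locally before uploading. The following is a minimal sketch, not part of the service: the function name check_upload and the kind keys are hypothetical, and it assumes that the 50 MB limit means 50 * 1024 * 1024 bytes and that suffixes are matched case-sensitively.

```python
import os

# Assumption: the 50 MB limit is interpreted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024

# Suffixes required by the upload form for each input file.
EXPECTED_SUFFIXES = {"ped": ".ped", "map": ".map", "sample_info": ".txt"}


def check_upload(path, kind):
    """Return a list of problems for one input file (empty if none were found).

    `kind` is one of the hypothetical keys "ped", "map", or "sample_info".
    """
    problems = []
    expected = EXPECTED_SUFFIXES[kind]
    if not path.endswith(expected):
        problems.append(f"{path}: expected the suffix {expected}")
    if not os.path.isfile(path):
        problems.append(f"{path}: file not found")
    elif os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append(f"{path}: larger than 50 MB, use FTP or the standalone package")
    return problems
```

For example, check_upload("cattle.ped", "ped") returns an empty list when the file exists, has the right suffix, and is within the size limit (the file name here is only illustrative).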


Please upload the .ped file:

or select an uploaded file

or use an example dataset



Please upload the .map file:

or select an uploaded file

or use an example dataset



Please upload the sample_information.txt file:

or select an uploaded file

or use an example dataset



Step 2: Set the maximum number of presumed populations

Please set the maximum number (K) of presumed populations. Genetic clustering will be performed for each assumed number of ancestries from 1 to K (i.e. 1, 2, 3, ..., K). The optimal number of ancestries is the one with the lowest cross-validation error estimate. K should be an integer slightly larger than the expected number of populations (10 by default).
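The selection rule described here (pick the number of ancestries with the lowest cross-validation error) can be sketched as follows. This is an illustration only: the function name best_k is hypothetical, ties are broken in favor of the smaller number (an assumption), and the error values in the usage note are made up, not output of the service.

```python
def best_k(cv_errors):
    """Return the number of ancestries with the lowest cross-validation error.

    `cv_errors` maps each tested number of ancestries (int) to its
    cross-validation error estimate (float). Ties go to the smaller number,
    which is an assumption of this sketch.
    """
    if not cv_errors:
        raise ValueError("no cross-validation errors were provided")
    return min(cv_errors, key=lambda k: (cv_errors[k], k))
```

With the made-up values {1: 0.62, 2: 0.55, 3: 0.51, 4: 0.53}, best_k returns 3. If the lowest error falls on the largest tested value, it may be worth rerunning with a larger K, since the true minimum could lie beyond the tested range.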


Step 3: Submit your job

Be notified by email (tick this box if you want to be notified by email when the results are available)




If provided, the title will be included in the subject of the notification email and can be used to identify your analysis.

Results

You may bookmark the following web address and view your results later. Please note that the results will be stored for 7 days.

Your job is currently running... Please be patient.
Analysis successful!
The analysis failed. Please verify the formats of your input files. If you are using the example datasets from FigShare (Version 1), please note that the error is caused by an incorrect sample_information file. We have updated this file in the latest version of the examples (https://figshare.com/articles/dataset/AMBP_case_study/19390652).


Ancestral components for each cluster are displayed in the following table (Ancestry_1: the largest ancestral population in the cluster; Avg. Q of Ancestry_1: the average fraction of Ancestry_1 in the cluster; Ancestry_2: the second largest ancestral population in the cluster; Avg. Q of Ancestry_2: the average fraction of Ancestry_2 in the cluster).
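The table columns can be understood through a small sketch that averages per-sample ancestry fractions (Q) within each cluster and reports the two largest components. This is not the service's implementation: the function name summarize_clusters is hypothetical, and it assumes that each sample has one cluster label and one row of ancestry fractions, with ancestral populations numbered from 1.

```python
from collections import defaultdict


def summarize_clusters(labels, q_rows):
    """Return the top two ancestral components per cluster.

    `labels` holds one cluster label per sample and `q_rows` holds the
    matching per-sample ancestry fractions (one list per sample).
    The result maps each label to a list of (ancestry_number, average_q)
    pairs, largest first, with ancestry numbers starting at 1.
    """
    grouped = defaultdict(list)
    for label, row in zip(labels, q_rows):
        grouped[label].append(row)

    summary = {}
    for label, rows in grouped.items():
        n_ancestries = len(rows[0])
        # Average fraction of each ancestral population within the cluster.
        avg_q = [sum(r[j] for r in rows) / len(rows) for j in range(n_ancestries)]
        # Rank ancestral populations by average fraction, largest first.
        ranked = sorted(range(n_ancestries), key=lambda j: avg_q[j], reverse=True)
        summary[label] = [(j + 1, avg_q[j]) for j in ranked[:2]]
    return summary
```

In the result, the first pair for a cluster corresponds to Ancestry_1 and its Avg. Q, and the second pair to Ancestry_2 and its Avg. Q. When only one ancestry is assumed, each cluster has a single pair.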


Sample clustering based on genomic ancestral compositions.

Optimal number of ancestries determined by cross-validation.

Sample clustering based on principal component analysis.

Download all the results