1,851,819
Imputed Genomes


126
Registered Users

19
Running Jobs

Pig Haplotypes Reference Panel

PHARP provides a free genotype imputation service for pigs using Minimac4. The server offers imputation against the following releases of the reference panel:

  • PHARP v4
      • 12,898 haplotypes
      • 56.1 million variants (42.3 million SNPs and 5.8 million Indels) of autosomes
      • 157 pig breeds

  • PHARP v3
      • 9,244 haplotypes
      • 53.8 million variants (42.3 million SNPs and 5.5 million Indels) of autosomes
      • 140 pig breeds

  • PHARP v2
      • 4,096 haplotypes
      • 53 million SNPs of autosomes
      • 122 pig breeds

  • PHARP v1
      • 2,012 haplotypes
      • 34 million SNPs of autosomes
      • 71 pig breeds

Citation

If you use PHARP in your work, please cite our publication: Wang Z., Zhang Z., Chen Z., Sun J., Cao C., Wu F., Xu Z., Zhao W., Sun H., Guo L., Zhang Z., Wang Q. & Pan Y. (2022) PHARP: A pig haplotype reference panel for genotype imputation. Scientific Reports 12, 12645.

The latest news


  • Ongoing
    More individuals are being sequenced, and the panel will be updated to PHARP v5.
  • 2024-10-12
    PHARP v4 (n=6449) was released
  • 2023-07-31
    PHARP v3 (n=4622) was released
  • 2022-09-30
    PHARP v2 (n=2048) was released
  • 2022-06-30
    PHARP was accepted for publication in Scientific Reports, 2022
  • 2021-06-03
    The preprint was published on bioRxiv.
  • 2021-01-01
    The first release of PHARP became publicly available.
  • 2020-12-29
    The evaluation of imputation accuracy for PHARP was completed.
  • 2020-08-29
    Newly released WGS datasets (81 Large White pigs) were collected from SRA, and SNP calling was performed.
  • 2020-06-30
    After QC, the first version of the pig haplotype reference panel was completed, comprising 1,006 individuals.
  • 2020-06-12
    gVCF calling with GATK HaplotypeCaller was completed.
  • 2020-04-27
    Duplicate and unmapped reads were removed, and a read group (RG) was added for each individual.
  • 2020-03-30
    Read QC, alignment, sorting, and merging of multiple BAMs into one per individual were completed.
  • 2020-02-27
    Approximately 2,000 SRR datasets were downloaded.
  • 2019-10-15
    WGS data were collected manually by reviewing papers on pig genome studies.
Schematic diagram of the pig haplotype reference panel’s construction, imputation accuracy evaluation, implementation platform and applications. A: Data resources and processing steps used to construct the PHARP. B: Imputation accuracy estimation of PHARP on multiple test datasets. C: Imputation platform development. D: Applications of PHARP in GWASs, GS and other potential studies such as eQTL mapping and TWASs.
Imputation accuracy under different scenarios. A: Mimicking three popular pig commercial chips (50K, 60K, and 80K) using three datasets by masking all variants (only autosomes were used) except those on the chips; the held-out genotypes were considered as ‘real’ to calculate the CR and r² values. B: Boxplot of imputation accuracy estimated by mimicking the target panel with different densities of SNPs on chromosome 1 using test datasets 1, 2 and 3. C: Boxplot of the imputation accuracy estimated by mimicking 50K chip genotypes from dataset 1 using different sizes of reference panels constructed by randomly extracting samples from 1,006 individuals (repeated 5 times). D: Mimicking the 50K chip genotypes from datasets 1 and 2 and using reference panels constructed by extracting samples according to pig breed (LW, Large White, n = 114; DU, Duroc, n = 85). E: The imputation accuracies of the different MAF bins ((0, 0.02], (0.02, 0.05], (0.05, 0.1], (0.1, 0.2], (0.2, 0.3], (0.4, 0.5]) estimated by mimicking the 50K chip genotypes using dataset 1. F: The imputation accuracy estimated from dataset 4 using our reference panel and that from Animal-ImputeDB. Dataset 1, Large White pig breed, LW, n = 81; dataset 2, Duroc pig breed, DU, n = 299; dataset 3, Jiaxinghei pig breed, JXH, n = 54; dataset 4, Duroc pig breed, n = 20, pigs were genotyped by both a 50K chip and ELC.
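The masking evaluation described in the caption above can be illustrated with a minimal sketch. It assumes that CR stands for the concordance rate between held-out and imputed best-guess genotypes, and that r2 is the squared Pearson correlation between held-out genotypes and imputed dosages. The page does not spell out either definition, so both are assumptions. The function names and the toy data are hypothetical and are not part of the PHARP server or of Minimac4.

```python
def concordance_rate(true_gt, imputed_gt):
    """Fraction of masked genotypes whose imputed call equals the held-out call.

    Assumption: this is what the caption's "CR" refers to.
    Genotypes are coded as alternate-allele counts (0, 1 or 2).
    """
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return matches / len(true_gt)


def squared_correlation(true_gt, imputed_dosage):
    """Squared Pearson correlation between held-out genotypes and dosages.

    Assumption: this is what the caption's "r2" refers to.
    Returns None when either input has no variance (e.g. a monomorphic site).
    """
    if len(true_gt) != len(imputed_dosage) or not true_gt:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_gt)
    mean_t = sum(true_gt) / n
    mean_d = sum(imputed_dosage) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(true_gt, imputed_dosage))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosage)
    if var_t == 0 or var_d == 0:
        return None
    return cov * cov / (var_t * var_d)


# Hypothetical toy data for one masked variant across six animals.
held_out = [0, 1, 2, 1, 0, 2]                # genotypes treated as 'real'
dosages = [0.1, 0.9, 1.8, 1.2, 0.0, 1.1]     # imputed alternate-allele dosages
best_guess = [round(d) for d in dosages]     # hard calls derived from dosages

cr = concordance_rate(held_out, best_guess)
r2 = squared_correlation(held_out, dosages)
print(f"CR = {cr:.3f}, r2 = {r2:.3f}")
```

In a per-variant evaluation, these two values would be computed for every masked site and then summarised, for example within the MAF bins listed in panel E.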
Association signals for growth phenotypes before and after imputation. Association test statistics on the −log10 (P-value) scale (y-axis) are plotted for each SNP position (x-axis) for the trait of backfat thickness at an age of 180 days (A), from Zhang et al., and at 100 kg (B), from Fu et al. To simplify the plot, only the variants with a P-value less than 1.08×10⁻⁴ are shown, and they are colored according to the annotated genes. The black-labeled genes are reported in the original paper, and the blue-labeled genes are novel genes detected after imputation. Examples of potential causal variants (marked by blue asterisks) in the SNRPC (C), GRM4 (D) and PACSIN1 (E) genes. Each dot represents a variant, whose LD (r²) with the Chip SNP (marked by blue diamonds) or the one with the lowest P-value (marked by a black circle) is indicated by the colour of the dot. The two horizontal lines divide SNPs with P-values < 2.05×10⁻⁶ and < 1.08×10⁻⁴ (A), and P-values < 6.46×10⁻⁷ and < 1.86×10⁻⁵ (B).