SeqPHASE

Flot (2010) SeqPHASE: a web tool for interconverting PHASE input/output files and FASTA sequence alignments. Molecular Ecology Resources 10 (1): 162-166.

If you would like to use SeqPHASE offline, an updated command-line version of SeqPHASE Steps 1 and 2 is available here as two Perl scripts to run locally (you may have to modify the first line of each script depending on the location of perl on your system).

Step 1: generating PHASE input files from FASTA alignments

Alignment of sequences from homozygous individuals and from heterozygotes to be phased (FASTA, one sequence per individual):

New! Alignment of fake haplotype pairs from heterozygotes to be phased (FASTA, two sequences per individual):
In this alignment, sequences for a given individual should have the same name except for the last character (for instance: "sample1a" for the first haplotype of sample 1, "sample1b" for the other).
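The naming convention above can be checked before uploading a file. The following is a minimal sketch, not part of SeqPHASE itself: the function name group_haplotype_pairs is hypothetical, and it assumes that the individual's name is simply the sequence name minus its last character.

```python
from collections import defaultdict

def group_haplotype_pairs(names):
    """Group sequence names by individual (the name minus its last character).

    Hypothetical helper for illustration; not taken from the SeqPHASE scripts.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name[:-1]].append(name)
    # Any individual without exactly two sequences breaks the convention.
    problems = {ind: seqs for ind, seqs in groups.items() if len(seqs) != 2}
    return dict(groups), problems

groups, problems = group_haplotype_pairs(["sample1a", "sample1b", "sample2a"])
print(groups)    # individuals and their sequence names
print(problems)  # individuals that do not have exactly two sequences
```

Here "sample2" is reported as a problem because only one of its two haplotypes is present.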

Fake haplotype pairs are an alternative way of entering genotypes to be phased. They are particularly useful for length-variant heterozygotes, since there is no IUPAC code for "A or indel", "C or indel", etc. However, I recommend resolving such cases first using Champuru and then entering the result as known haplotype pairs in the third SeqPHASE input field (below).

Alignment of known haplotype pairs, if any (FASTA, two sequences per individual):
In this alignment, sequences for a given individual should have the same name except for the last character (for instance: "sample1a" for the first haplotype of sample 1, "sample1b" for the other).

Important tips!!
1. Unlike many other programs, PHASE treats indels and missing data very differently. It is therefore important to code missing data as "N" or "?" in your FASTA alignments and not as "-", which should be reserved for "real" indels.
2. If you input more than one FASTA alignment (e.g. because you have already determined the haplotypes of some length-variant heterozygotes using Champuru), you should align all your sequences together first, then split them into the different FASTA files. If you align each file separately and the resulting alignments do not match, PHASE may not infer the haplotypes correctly.
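A quick way to catch the mismatch described in tip 2 is to confirm that every sequence in every input file has the same aligned length. The sketch below is an illustration only: read_fasta and alignment_lengths are hypothetical helper names, the parser handles only simple FASTA text, and equal lengths are a necessary but not sufficient sign that the files come from one joint alignment.

```python
def read_fasta(text):
    """Parse simple FASTA text into a dict mapping sequence names to sequences."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def alignment_lengths(*fasta_texts):
    """Return the set of sequence lengths found across all given alignments."""
    return {len(seq)
            for text in fasta_texts
            for seq in read_fasta(text).values()}

# Made-up example data: "N" marks missing data, "-" marks a real indel.
genotypes = ">ind1\nACGTN\n>ind2\nACRT-\n"
known_pairs = ">ind3a\nACGTA\n>ind3b\nAC-TA\n"

lengths = alignment_lengths(genotypes, known_pairs)
if len(lengths) > 1:
    print("Alignments do not match; sequence lengths found:", sorted(lengths))
else:
    print("All sequences have the same aligned length.")
```

If more than one length is reported, realign all sequences together before splitting them into separate files.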

Example dataset: sequences from homozygous individuals and from heterozygotes to be phased, and phased haplotypes

Frequently Asked Questions
Email the author

To proceed directly to Step 2, click here.