A Basic Guide to Angelfish Genetics
The following was intended to be a short and simple guide to genetics. Unfortunately, it's not quite as short, or as simple, as I had intended it to be! Nonetheless, I hope that you find it useful.
In an effort to keep things simple, there are some areas that I've glossed over, and to improve clarity I've broken things down into separate sections.
Animals are made up of billions of tiny cells, and each cell has a specific function and purpose within the animal. At the centre of each cell is an area called the nucleus, and contained within the nucleus is all the necessary information for the development of the animal. That information is stored on chromosomes. Angelfish have 24 pairs; one chromosome from each pair comes from the animal's mother, the other from its father. The term for having chromosomes in pairs is "diploid".
Each chromosome will have, arranged along its length, many hundreds or even thousands of genes. Each gene codes for a specific protein and each protein will have a specific effect upon the animal's development.
When the gametes (eggs & sperm) are produced, a special form of cell division called meiosis is employed. This process produces gametes that are said to be "haploid" having only one copy of each chromosome rather than the normal diploid complement.
When a haploid egg is fertilised by a haploid sperm, they fuse to produce one single cell, their nuclei fuse and the new nucleus produced is diploid as it receives one set of chromosomes from the egg & one set of chromosomes from the sperm.
Shortly after formation the cell divides into two. This process is called mitosis. Mitosis differs from meiosis in that copies of each chromosome are produced such that each daughter cell produced contains a nucleus with a full diploid set of chromosomes. As the animal develops, with each cell division a full set of chromosomes is passed on every time, so even though the adult animal contains literally billions of cells, each individual cell contains the complete "blueprint" for its development.
The really amazing thing about this process is that, as the animal develops, its cells start to differentiate and specialise, so some cells become skin cells, others develop to form the heart, the gills, the eyes, etc. The exact process by which this happens is not fully understood, but what is known is that the process is regulated by genes, which produce hormones which cause other genes to be switched on or turned off.
As yet, no-one knows how many genes an angelfish has. Human beings are thought to have between 20,000 and 25,000 pairs of genes, and we can reasonably assume that angelfish have a similar number. But at any one time, in any particular cell, most of those genes will be turned off. Throughout the life of each individual cell, different genes will be switched on and off. The timing of when they are turned on and off, and the proteins that they code for, produce not only the variations that we can see, but many that we can't.
What is a gene?
A gene is a length of DNA which codes for the manufacture of a specific protein; that protein will have some function within the fish at some point in its life. As stated in the previous post, these lengths of DNA are arranged along the length of chromosomes. Each individual gene is located at a specific point on the chromosome called a locus (plural loci).
DNA consists of long chains of sugar molecules, each linked to one of four bases: adenine, guanine, thymine and cytosine, often simply referred to as A, G, T & C. Proteins are constructed from building blocks called amino acids; all animal & plant proteins are built up using just twenty different amino acids. The sequence of amino acids in a protein is coded for in the corresponding gene using a triplet code, so for example the sequence adenine - guanine - cytosine (AGC) might code for one amino acid, whilst CCC might code for another. With four possible letters (A, G, T & C) in three different positions there are 4x4x4 = 64 possible words in the genetic code. Since there are only 20 amino acids, most amino acids are coded for by more than one "word", whilst a few "words" signal instructions such as "start here" or "stop here".
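The 4x4x4 arithmetic can be checked by simply listing every possible three-letter word. The following is a minimal Python sketch for illustration only; the variable names are made up for this example.

```python
from itertools import product

BASES = "AGCT"  # adenine, guanine, cytosine, thymine

# Build every possible three-letter "word" from the four letters.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))       # 4 x 4 x 4 = 64
print("AGC" in codons)   # the example word from the text is one of them
```

Sixty-four words is comfortably more than the twenty amino acids that need coding for, which is why several words can share a meaning.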
From time to time, changes can randomly occur in the genetic code. These are called mutations. A mutation can be caused by a chance error when the chromosomes are copied during cell division, or can result from damage caused by mutagens such as radiation or certain chemicals.
Mutations fall into the following 4 categories:-
Insertion, Deletion, Substitution & Inversion.
By way of explanation, think of the phrase "The Cat Sat On_ The Mat" as a short gene.
In the case of insertion, a section of genetic code becomes embedded in the original gene. This may be the result of a stutter in the replication (copying) process, e.g. "The Cat Sat On_ Sat On_ The Mat", or it could be that a bit of code floating around in the nucleus gets randomly inserted into the gene, e.g. "The Cat Sat On_ The Big Mat".
Deletion, as the name implies, is where a part of the code is lost, e.g. "The Cat On_ The Mat".
In the case of substitution, one or more letters of the gene are copied incorrectly, e.g. "The Rat Sat On_ The Mat".
In the case of inversion, a portion of the gene is spliced in backwards, e.g. "The Cat _nO taS The Mat".
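The four kinds of mutation can be mimicked with ordinary string operations on the example phrase. This is only a toy illustration in Python, with variable names invented here; real mutations act on DNA letters rather than English words.

```python
gene = "The Cat Sat On_ The Mat"

# Insertion: a stretch is duplicated (a "stutter"), or a stray piece is added.
stutter = gene.replace("Sat On_ ", "Sat On_ Sat On_ ", 1)
stray = gene.replace("The Mat", "The Big Mat")

# Deletion: part of the code is lost.
deletion = gene.replace("Sat ", "", 1)

# Substitution: a letter is copied incorrectly.
substitution = gene.replace("Cat", "Rat")

# Inversion: a section is spliced back in backwards.
segment = "Sat On_"
inversion = gene.replace(segment, segment[::-1])

for result in (stutter, stray, deletion, substitution, inversion):
    print(result)
```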
In some cases, minor changes to the gene make little or no difference to its function; the insertion or deletion of a single amino acid may have little or no effect on the way the gene functions. In other cases the insertion or deletion of a single base can render the gene useless.
For example just add the letter A and "The Cat Sat On_ The Mat" becomes "The CAa tSa tOn _Th eMa t" or delete one letter to get "ThC atS atO n_T heM at"
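The reason one extra or missing letter does so much damage is that the code is read in groups of three, so every word after the change is shifted. A small Python sketch of that regrouping, assuming (as the example does) that the spaces are just reading aids; the function name is hypothetical:

```python
def read_in_triplets(text):
    """Strip the spaces and regroup the letters into three-letter words."""
    letters = text.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))

gene = "The Cat Sat On_ The Mat"
inserted = gene.replace("C", "CA", 1)   # one extra letter after the C
deleted = gene.replace("e", "", 1)      # the first "e" is lost

print(read_in_triplets(gene))      # The Cat Sat On_ The Mat
print(read_in_triplets(inserted))  # The CAa tSa tOn _Th eMa t
print(read_in_triplets(deleted))   # ThC atS atO n_T heM at
```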
Most mutations either have little or no effect on the gene function, or seriously disrupt function so that the gene doesn't work. Just very occasionally a version of the gene is produced which is both functional & different in its effect from the original.
As angelfish keepers & breeders we tend to use the word gene to refer to a specific mutation, for example Dark, Marble, Gold Marble and New Gold might be referred to as four different genes. It is however, equally correct, to refer to them as 4 different mutations of the same gene. Because these genes are found at the same locus they are called alleles. There is a fifth allele of Dark, Marble, Gold Marble and New Gold and it is referred to as "Wild Type". In truth there may be multiple subtly different versions of the wild type gene, but since we have no practical way of differentiating them, they can all be lumped together as wild type.
Because angelfish are diploid (have only two sets of chromosomes), at each locus a fish may have either two copies of the same allele (this is called homozygous) or two different alleles (heterozygous). With 5 different alleles there are 15 different gene combinations that a fish could carry. (If you're thinking that there are 25 - read on!)
The following abbreviations are usually used :-
Dark (D), Marble (M), Gold Marble (Gm), New Gold (g) and Wild Type (+).
The 15 possible combinations can be written as follows
D/D : D/M : D/Gm : D/g : D/+
M/M : M/Gm : M/g : M/+ (note M/D is the same as D/M and was covered on the first line)
Gm/Gm : Gm/g : Gm/+
g/g : +/g (note I will cover why g is lower case & why I've written +/g rather than g/+ in the next post)
& finally +/+
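The count of 15 rather than 25 comes from treating D/M and M/D as the same fish. As a quick check, here is a minimal Python sketch that lists the unordered pairs; note that it prints the last-but-one pair as g/+ because it knows nothing about the convention of writing the dominant gene first.

```python
from itertools import combinations_with_replacement

ALLELES = ["D", "M", "Gm", "g", "+"]

# Unordered pairs, so D/M and M/D are only counted once.
pairs = list(combinations_with_replacement(ALLELES, 2))

print(len(pairs))         # 15 distinct combinations
print(len(ALLELES) ** 2)  # 25 if the order of the two genes mattered
print(" : ".join(f"{a}/{b}" for a, b in pairs))
```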
Dominant & Recessive Genes
What happens if you cross a gold angel with a normal
silver (wild type) angel?
Well, if you knew nothing about genetics you might expect to get offspring that were a blend of the two, a gold angel with stripes for example. Or you might expect to get a mixture of the two types.
In fact if you mate a homozygous silver (+/+) with a
homozygous gold (g/g) what you get is 100% silver angels.
However, if you were to mate one of those silver angels with a gold (g/g) angel you would get 50% gold and 50% silvers.
Whilst if you were to mate them brother to sister then you'd get 25% golds & 75% silvers. Why is this?
In the post entitled "Development" I explained
how a diploid animal produces haploid gametes & when these fuse the resulting
cell is once again diploid, having one set of chromosomes from the mother &
one set from the father. So in our first example (+/+ x g/g)
the first parent (let's say it's the male) has two chromosomes each with a + gene, so all his gametes will have + genes. Whilst the female has two g genes, so all her gametes will carry the g gene. Therefore all the resulting offspring will have one + and one g gene. In this case the wild type (+) gene is said to be dominant over the recessive gold (g) gene. That is to say that one gene overrides the effect of the other, so that the phenotype (the appearance) of the offspring is indistinguishable from one of the parents.
A couple of important points here. Firstly, the difference between genotype & phenotype: the genotype is the set of genes a fish carries, whilst the phenotype is its appearance.
In the above example there are two phenotypes, silver (wild-type) & gold; but there are three genotypes +/+, +/g & g/g.
When writing the genotype, it is convention that an uppercase letter is used for a dominant gene and a lowercase for a recessive gene. It is also convention that the dominant gene is written first, so +/g rather than g/+ or +/G.
So what happened in the brother to sister mating?
Since both are heterozygous (+/g) then they can produce gametes with either a + gene or a g gene, and because of the mechanism by which the gametes are produced these are produced in equal quantity...
So the female lays eggs 50% of which carry a gold gene
and 50% of which carry a silver gene. When these are fertilised there is a 50%
chance that it will be by a sperm carrying a gold gene and 50% of the time by
a wild type sperm.
This gives us four possible combinations...
1) gold egg plus gold sperm = homozygous gold: g & g = g/g: 25%
2) gold egg plus wild sperm = heterozygous: g & + = +/g: 25%
3) wild egg plus gold sperm = heterozygous: + & g = +/g: 25%
4) wild egg plus wild sperm = homozygous wild: + & + = +/+: 25%
Note that cases 2 & 3 both result in heterozygous (+/g) offspring, giving a 1:2:1 ratio of homozygote : heterozygote : homozygote, and because gold is recessive, it gives an observed ratio of 3:1 silver phenotype to gold phenotype.
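The three gold and silver crosses described in this section can be reproduced by pairing every gamete from one parent with every gamete from the other. The following is a rough Python sketch, assuming each gamete is equally likely; the function name cross is invented for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Return {genotype: fraction of offspring} for a single-locus cross.

    Each parent is written like "+/g"; all four egg and sperm pairings
    are assumed to be equally likely.
    """
    boxes = Counter()
    for a, b in product(parent1.split("/"), parent2.split("/")):
        # Sorting makes g/+ and +/g count as the same genotype
        # ("+" happens to sort before letters, giving the usual "+/g").
        boxes["/".join(sorted((a, b)))] += 1
    return {genotype: Fraction(n, 4) for genotype, n in boxes.items()}

print(cross("+/+", "g/g"))  # all +/g
print(cross("+/g", "g/g"))  # half +/g, half g/g
print(cross("+/g", "+/g"))  # 1/4 +/+, 1/2 +/g, 1/4 g/g
```

The function only gives genotypes. Turning them into phenotypes (3:1 silver to gold in the last cross) still relies on knowing that gold is recessive.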
Sometimes when two homozygous individuals are mated, the offspring do not resemble either parent, but instead appear to be a blend of the two. This is known as incomplete dominance (also referred to as partial dominance). The way the genes separate and recombine is identical to that detailed for dominant/recessive gene combinations; the difference is that the heterozygous individuals have a different phenotype to either of the homozygous forms.
Many angelfish mutations fall into this category; a good example would be the dark gene. A fish that is homozygous (D/D) will be a very black fish, with vertical bands that are only visible under certain lighting conditions, and even then only if you look closely. A homozygous wild type (+/+) is, of course, a typical wild type silver angel with vertical black banding. Mate them together and the heterozygous offspring (D/+) will be a dusty grey fish with darker black bands, in other words a black lace.
A really important point to remember is that the concept of dominant, incomplete dominant and recessive is not an absolute, but relates only to how the gene behaves in relation to the other gene or genes you are comparing it to. It is perfectly possible for a gene to be dominant over one of its alleles and recessive in respect of a different allele.
The (new) gold gene is probably the best example of this. Gold is recessive to wild type.
Gold Marble is partially dominant in respect of wild type, that is to say that a heterozygous Gm/+ fish would retain the normal wild type body colour and stripes, but in addition has random black patches, whilst a Gm/Gm fish has a gold body colour with a fair degree of black coverage. If however you mate a homozygous gold marble (Gm/Gm) with a homozygous gold (g/g), the (Gm/g) offspring will have fewer black patches than their marbled parent. In this case gold and gold marble are demonstrating incomplete dominance.
Punnett squares are a useful tool for working out the
result of potential crosses, they work in a similar way to a "times table".
For example, if you wanted to work out the results of a cross between a male Gm/g & a female D/M, you fill in the details as follows :-
Write each of the genes that the male carries in the two left hand boxes.
Do the same for the female in the two top boxes.
Now follow the arrows to fill in the results in the four answer boxes.
As there are four answer boxes each answer box represents 100/4 = 25% of the batch.
A Punnett square will give you the genotype, you need to supply the information as to what the phenotype is.
In some cases the same genotype could appear in different answer squares. These should be added together to give you the percentage of different genotypes produced.
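For readers without the diagram to hand, the Gm/g x D/M square can be laid out as text. This is a minimal Python sketch, not a definitive tool: the function name punnett and the ORDER list (which simply copies the writing order used for the Dark locus earlier in this guide) are assumptions made for the example.

```python
# Writing order for alleles at the Dark locus, as used earlier in the guide.
ORDER = ["D", "M", "Gm", "+", "g"]

def punnett(male, female):
    """Male genes label the rows (left hand boxes), female genes the columns."""
    rows, cols = male.split("/"), female.split("/")
    grid = [["/".join(sorted((r, c), key=ORDER.index)) for c in cols]
            for r in rows]
    return rows, cols, grid

rows, cols, grid = punnett("Gm/g", "D/M")

# Print the square: a header of female genes, then one line per male gene.
print(" " * 6 + "".join(f"{c:<7}" for c in cols))
for r, line in zip(rows, grid):
    print(f"{r:<6}" + "".join(f"{cell:<7}" for cell in line))
```

Each of the four answer boxes (D/Gm, M/Gm, D/g and M/g) is a different genotype here, so each makes up 25% of the batch.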
More complex Punnett squares can be used to determine the results of crosses where multiple gene loci are involved. I will cover those later.
Known Angelfish genes & loci
Angelfish probably have somewhere in the region of 25,000 pairs of genes spread about on their 24 pairs of chromosomes. We currently know about 17 gene mutations located on just 12 loci, so we clearly have a lot still to unravel!
The first locus discovered was the Dark locus, as already
discussed it is the site of 4 known gene mutations :- Dark(D), Marble(M), (New)
Gold (g) and Gold Marble (Gm).
(Note the correct abbreviation for gold marble is Gm. The convention is that where more than one letter is used to code for a non-recessive gene, the first letter should be upper case and the rest lower case. Some people use the abbreviation gM, on the basis that the gold part of the gold marble gene acts in a recessive fashion whilst the marble part is dominant. This is logical but goes against convention. Similarly some use the abbreviation B for black in place of D for Dark.)
Next is the Stripeless locus. There are two known mutations at this locus, Zebra (Z) which is dominant over the wild type gene and stripeless (S) which exhibits incomplete dominance both with the wild type and the zebra gene. Homozygous stripeless produces the blushing phenotype.
The third locus has only one known mutation, veiltail (V), this too is an incomplete dominant producing longer fins in the heterozygous condition and very long fins in the homozygous.
Like veiltail, all the remaining loci have only one known mutation.
Fourth is half-black. Half black (h) is a recessive gene which only expresses in homozygous condition, and only then if the fish is grown in optimum conditions.
Fifth is smokey (Sm), heterozygous individuals are called smokey whilst homozygous are called chocolate.
Sixth pearlscale (p), a recessive mutation which affects the shape of the scales.
Seventh streaked (St) an incomplete dominant, which causes white streaks in darker finned individuals such as blacks & marbles.
Eighth albino (a), a recessive mutation which causes the complete loss of melanin pigment, resulting in a pale fish with characteristically red eyes.
Ninth is the recently renamed Philippine Blue (pb). This mutation produces a metallic blue sheen but only in homozygous condition because it is recessive.
The remaining three genes and loci are Hong Kong Gold, Naja Gold and Notched. Hong Kong Gold and Naja Gold are both early gold varieties which are believed lost to the hobby; each was located on its own unique locus, and both were recessive. Finally, Notched (N) is a partially dominant gene which causes severe deformity in those fish which possess it.
Dealing with genes at more than one locus
Although a fish only has two genes at each individual
locus, it can, of course, have various mutations at the different loci. For
example you could have a streaked pearlscale hybrid black veiltail, how do we
represent this in genetic notation?
Firstly the genetic notation follows the hierarchy detailed in the previous post and not the phenotype name, so in the case of the example above we write D/g V/+ p/p St/+ and not St/+ p/p D/g V/+.
D/g = hybrid black, one dark & one gold gene.
V/+ = Veiltail, one veiltail and one wild type gene.
p/p = pearlscale, because pearlscale is recessive, the fish must be homozygous for the gene.
St/+ = streaked, although fish homozygous for streaked have more streaking, streaked is the phenotype name for both the homozygous and the heterozygous condition. In this example I have chosen the fish to be heterozygous.
Two important points to note :-
Firstly in the above example the + symbol appears twice, but it is not the same gene. The + symbol simply indicates the corresponding wild type allele to the particular mutation at that particular locus. When working out potential crosses, take care not to muddle your wild type alleles.
Secondly, although the genetics of our example fish look complicated, each individual gene locus will continue to function as we saw in the posts on dominant & recessive genes and Punnett squares. Crossing it to a homozygous gold will still produce 50% homozygous golds and 50% hybrid blacks; what happens at one locus is totally independent of any other loci, unless they are "linked" (see next post).
Let's look at another example...
What do you get if you cross a gold veiltail (g/g V/+)
with a smokey (Sm/+)?
For the purpose of this exercise let's assume that neither fish has any other known mutations.
The first thing to note is that smokey is not an allele of either gold or veiltail so we need to add in the corresponding wild type genes in the two fish. Giving us :-
g/g V/+ +/+ x +/+ +/+ Sm/+
There are three ways of dealing with this cross. Firstly you could draw up an appropriate Punnett square; if the genes at every locus were heterozygous you'd need something like this....
I'll leave you to fill in the details!
But because each fish has only 1 heterozygous locus, a much simpler Punnett square can be drawn up...
The second way is to look at each locus separately.
So g/g x +/+ all the offspring will be +/g, as g is recessive, phenotypically none will be gold.
V/+ x +/+ produces 50% veils & 50% standard fins.
+/+ x Sm/+ similarly 50% will be smokey and 50% won't
So half the offspring will be smokey, and half of those
will be veiltails and all will carry a hidden gold gene.
Of the other half of the batch, all will be silvers, all with a hidden gold gene and half will be veiltails.
The third approach is to plug the necessary details into an appropriate genetic calculator.
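To give a feel for what such a calculator does internally, here is a toy version in Python that works through each locus separately and then multiplies the results together. It is only a sketch under stated assumptions: the loci are unlinked, both fish are written with the same loci in the same order, and the helper names (rank, locus_cross, cross) are invented for this example rather than taken from any real calculator.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def rank(allele):
    """Writing order: upper-case symbols first, then "+", then lower-case."""
    if allele == "+":
        return 1
    return 0 if allele[0].isupper() else 2

def locus_cross(p1, p2):
    """Genotype fractions at one locus, e.g. locus_cross("V/+", "+/+")."""
    counts = Counter()
    for pair in product(p1.split("/"), p2.split("/")):
        counts["/".join(sorted(pair, key=rank))] += 1
    return {genotype: Fraction(n, 4) for genotype, n in counts.items()}

def cross(fish1, fish2):
    """Combine the per-locus results, assuming the loci are independent."""
    per_locus = [locus_cross(a, b)
                 for a, b in zip(fish1.split(), fish2.split())]
    offspring = {}
    for combo in product(*(d.items() for d in per_locus)):
        share = Fraction(1)
        for _, fraction in combo:
            share *= fraction  # multiply the chances at each locus
        offspring[" ".join(genotype for genotype, _ in combo)] = share
    return offspring

# Gold veiltail x smokey, loci in the order: Dark, Veiltail, Smokey.
for genotype, share in cross("g/g V/+ +/+", "+/+ +/+ Sm/+").items():
    print(f"{genotype}: {float(share):.0%}")
```

For the gold veiltail x smokey cross this gives four genotypes at 25% each, all carrying +/g, which agrees with the locus by locus reasoning above.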
Why don't my fish produce what the Punnett square predicts?
One possibility is that your fish aren't what you think
they are, and that there are "hidden genes". Hidden genes as the name
implies are genes which are not expressed in the phenotype (the appearance)
of the fish that carries them. There are three possible reasons why genes may
remain hidden :-
1. Recessive genes in the heterozygous condition, by definition, are not expressed in the phenotype.
2. Some genes only express if the appropriate environmental factors are in place. The best example in the world of angelfish is the half black gene. This is a recessive gene, but even homozygous individuals often don't express if they are raised in cramped or other substandard conditions.
3. Epistasis - Some genes can suppress or mask the expression of other genes at other loci, this is termed epistasis. Gold is epistatic to several genes including smokey.
Another reason that observed numbers differ from those predicted by calculation is that some genes are deleterious. That is to say that the individuals that carry them are weaker, less able to compete, grow slower or are more prone to disease than their siblings that don't carry the gene. This means that a disproportionate number of them die at a very early age, thus skewing the results when the numbers are counted.
The final thing to consider is "gene linkage". As stated earlier, many hundreds of genes are arranged along the length of a chromosome. You might therefore expect that if two genes were located at different loci on the same chromosome, they would always be transmitted to the next generation together. In fact during meiosis, the pairs of chromosomes become tangled, break apart, and are then joined back together randomly; this process is called "crossing over". Most chromosomes will cross over several times during meiosis, so genes that are located at opposite ends of the chromosome will show no signs of linkage. But genes that are positioned close together will tend to stay together.
There is no evidence that any of the known angelfish mutations are linked. But linkage is something to consider if you are trying to understand the inheritance of a particular trait not covered by the existing known genes.
More on Punnett squares
Let's start with a single gene locus. What do we get if we mate a male black lace (D/+) with a female black lace (D/+)? Well, both fish can produce gametes (eggs & sperm) with only one of the two genes that they carry, either D or +, and they do so in equal proportions. Half the eggs will have a dark gene, and the sperm that fertilises them will either carry a dark or a wild type (+) gene. Similarly with the eggs that carry the wild type gene, they can be fertilised either by a sperm carrying a wild type gene, or by sperm with a dark gene. All of this can be represented by this basic diagram,
which if you draw lines around it becomes a punnett square
A few things to notice
i. I have written the genes that originate from the male in red & the female in blue, this is to aid understanding of which parent donates which gene. In the offspring the parent they originate from makes no difference.
ii. there are four different offspring squares, each therefore represents 1/4 i.e. 25% of the resultant offspring.
iii. The offspring in the top left & bottom right squares are the same - D/+.
iv. The ratio of offspring produced by this particular cross is :-
25% : 50% : 25%
+/+ : D/+ : D/D
Now let's look at a different gene. What do we get if we mate two veiltails (V/+)?
Once again we can draw up a punnett square :-
and once again we end up with the same ratio of offspring :-
25% : 50% : 25%
+/+ : V/+ : V/V
So how do we handle two gene loci at the same time?
What do we get if we mate two black lace veiltail angels?
Well, from the two Punnett squares we've already looked at we can see we get a ratio of 25% : 50% : 25% both for the colour and the fins. So to answer the question "what percentage of normal finned silvers should I get from this cross?", we could simply combine the results of the Punnett squares as follows :-
A neater way to do this is to put the second punnett square into the appropriate square of the first punnett square! e.g.
obviously we can extend this to cover all four boxes in the original square
notice how, because of the way I have arranged the parental
genes, in this particular case, those genotypes that appear more than once are
arranged on the diagonals. THIS DOES NOT ALWAYS HAPPEN,
other combinations of heterozygous and homozygous parents will produce different
patterns in the grid.
In this case we can easily see that 4 out of 16, i.e. 1 in 4, i.e. 25% of the offspring will be black lace veiltails like their parents, 2 out of 16 i.e. 12.5% will be silver veiltails and a similar number will be double black veils.
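The 16-cell grid for two black lace veiltails can be rebuilt by listing each parent's four possible gametes and pairing them up. The following Python sketch is an illustration only; the function names are made up, the two loci are assumed to be unlinked, and the simple sort used here (wild type written last) only suits mutations written with an upper-case symbol.

```python
from collections import Counter
from itertools import product

def gametes(fish):
    """All equally likely gametes: one gene taken from each locus."""
    return list(product(*(locus.split("/") for locus in fish.split())))

def offspring_grid(male, female):
    """Count how many grid cells hold each offspring genotype."""
    cells = Counter()
    for sperm, egg in product(gametes(male), gametes(female)):
        genotype = " ".join(
            "/".join(sorted(pair, key=lambda allele: allele == "+"))
            for pair in zip(sperm, egg)
        )
        cells[genotype] += 1
    return cells

cells = offspring_grid("D/+ V/+", "D/+ V/+")
total = sum(cells.values())  # 4 sperm types x 4 egg types = 16 cells

for genotype, n in cells.items():
    print(f"{genotype}: {n} out of {total} = {n / total:.2%}")
```

In this sketch D/+ V/+ fills 4 cells out of 16, +/+ V/+ and D/D V/+ fill 2 each, and the normal finned silver (+/+ +/+) fills just 1, i.e. 6.25%, which is the same as multiplying 25% by 25%.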
Now to complicate things further!
Let's throw the stripeless gene into the mix. If both parents are also ghosts (S/+), what percentage of the offspring will be blushing double dark veiltails?
Once again we could draw up a single punnett square for S/+ x S/+ and as before with dark & veiltail we would find a 1:2:1 ratio of homozygous : heterozygous : homozygous, and again we could incorporate this into the relevant squares of our existing punnett square.
Once again we could extend this process to produce a grid containing 64 cells
I'll leave you to fill in the details if you want.
Clearly by the time we get to a grid for three different gene loci where the parents are heterozygous at all three loci, we have a very complicated grid. However in many cases parent fish are not heterozygous at all loci and this means we can produce a much simplified table.
Let's take the following as an example. Try to draw up a Punnett square grid to cross a Chocolate Ghost S/+ - Sm/Sm with a Smokey Turquoise D/+ - S/S - Sm/+.
The first thing to remember when drawing up a punnett
square is to get your gene loci in order....
So for the first fish, we need to insert the two wild genes it carries at the dark locus +/+, so we end up with :-
+/+ S/+ Sm/Sm crossed with D/+ S/S Sm/+
In this case, the first fish is homozygous at the first and third loci, so it will always pass on a + gene & a Sm gene to its offspring; the only question is at the second locus, where it could pass on either a stripeless gene or a + gene.
Similarly the second fish is homozygous at the second locus and will always pass on a stripeless gene, but we need to allow for the variations at the other two loci.
Thus we only need a 4 x 2 grid, giving the 8 possible genotypes predicted by the calculator, each genotype will be produced in equal proportion i.e. 12.5%
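The reduced grid can be built by listing only the distinct gametes each fish can produce. Here is a minimal Python sketch of that idea, with invented function names; it assumes unlinked loci, and the equal 12.5% shares hold in this particular cross because every distinct gamete of a given fish turns up equally often.

```python
from itertools import product

def distinct_gametes(fish):
    """One gene from each locus, with duplicate gametes removed."""
    combos = product(*(locus.split("/") for locus in fish.split()))
    return list(dict.fromkeys(combos))  # drops repeats, keeps the order

def write(pair):
    """Write a gene pair with the wild type (+) last, e.g. D/+."""
    return "/".join(sorted(pair, key=lambda allele: allele == "+"))

first = distinct_gametes("+/+ S/+ Sm/Sm")   # chocolate ghost
second = distinct_gametes("D/+ S/S Sm/+")   # smokey turquoise

grid = [[" ".join(write(pair) for pair in zip(g1, g2)) for g2 in second]
        for g1 in first]

for row in grid:
    print(" | ".join(row))

cells = [cell for row in grid for cell in row]
print(len(first), "x", len(second), "=", len(cells), "cells")
```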
This process can be extended to cover any number of gene loci, but the grids become larger with each extra locus. In practice it is often easier to find the answer you need by looking at each individual gene locus in turn, either by drawing up separate Punnett squares as I did in diagram 4, or simply from knowing that all single-locus crosses conform to one of six possible patterns. These are :-
AA x AA = 100% AA
AA x AB = 50% AA : 50% AB
AA x BB = 100% AB
AB x AB = 25% AA : 50% AB : 25% BB
AB x AC = 25% AA : 25% AB : 25% AC : 25% BC
AB x CD = 25% AC : 25% AD : 25% BC : 25% BD
Try drawing up punnett squares to represent each of the above.
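If you want to check your squares afterwards, the six patterns can be generated mechanically. This is a small Python sketch using the same A, B, C, D placeholder letters; the function name pattern is made up for the example.

```python
from collections import Counter
from itertools import product

def pattern(parent1, parent2):
    """Offspring percentages for a single-locus cross such as "AB" x "AC"."""
    # Each of the four boxes is worth 25%; sorting merges AB with BA.
    boxes = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    return {genotype: 25 * n for genotype, n in boxes.items()}

crosses = [("AA", "AA"), ("AA", "AB"), ("AA", "BB"),
           ("AB", "AB"), ("AB", "AC"), ("AB", "CD")]

for p1, p2 in crosses:
    print(p1, "x", p2, "=", pattern(p1, p2))
```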