Tandem Duplications Working Page

Links

Corey's page

Index of BLAST dotplots vs. whole genome

BLAST dotplots for chr 10 only

Other chr 10 info

Get rice DNA sequences


family number | General Location | Copies (direct / inverted) | Gene type | Other Chromos | Comments | Further | aligned bp | ungapped bp | ungapped bootstrap matches | ungapped min/max matches | synonymous bp | synonymous bootstrap matches | synonymous min/max matches
1 General (420,000)

410,000 - 465,000 region

6

(4/2)

hsr201 hypersensitivity-related protein none not all full length. 2 in one orientation, 4 in opposite. first copy has no BLAST homology with rest. 1800 450 0/1000 182/437 57 31/1000 42/57
2 General (510,000)

500,000 - 630,000 region

4

(4/0)

O-deacetylbaccatin III-10-O-acetyltransferase weak homology on 4 and 6 widely separated copies, all tandem. 2 further copies at 1,416,000 and 1,429,000. protein-coding sequences align well, but the annotated start sites of 2 of them are downstream of start sites indicated by homology region. 1404 1065 14/1000 794/895 567 869/1000 529/544
3 General (746,000)

740,000 - 830,000 region

6

(5/1)

LeOPT1-oligopeptide transporter weak copies on several chromos; strong copy at 11,085,000 2 separate exons; 5 tandem, last one inverted peptide transporter at 766,000 has a region similar to family 3 copy just to its left, but this region isn't found in the other copies--another dup? 1863 657 152/1000 282/550 90 345/1000 73/85
4

XXXXX

778,609 4 wall-associated protein kinase chr 7, plus copies from family 9 and additional single copies at 226,000 and 8,489,000 2 separate exons, and only 2 copies here, in reverse orientation  
5

XXXXX

1,123,130 4 O-methyltransferase ZRP4 chr 9 only 2 copies here: junk at 1,139,000 is unrelated 
6 General (1,607,000)

1,600,000 - 1,780,000 region

17

(17/0)

unknown weak on chr 2 all tandem! Another copy at 1,548,000 plus some partial copies; some annotated as part of single large genes 2232 256 0/1000 65/251 6 -- 3/6
7 General (2,106,612)

2,100,000 - 2,240,000 region

9

(8/1)

hypothetical none several sub-groups; all tandem except last one (far away from others), which is inverted some duplications here seem recent--more sequence than just genes match with BLAST 1584 609 535/1000 348/558 156 0/1000 129/152
8 General (2,925,860)

2,880,000 - 3,020,000 region

13

(9/4)

proline-rich protein none first copy is at 2,882,000, last at 3,016,000 3 groups plus some others. 2 groups have inverted duplications. Total of 4 inverted relative to others 1008 441 4/1000 291/436 147 0/1000 128/147
9

XXXXX

3,044,670 4 wall-associated protein kinase see family 4 part of large multiple inversion-duplication, not really a tandem array  
10

XXXXX

9,630,154

9,620,000 - 9,700,000 region

5

(3/2)

integral membrane protein weak copies on several chromos 5 copies between 9,624,000 and 9,698,000 2 of the 5 are inverted. Looks like a tandem dup, an inverted dup, and a very old inverted dup
11 9,960,184

9,954,000 - 9,984,000 region

6

(6/0)

lipid transfer protein chr 3, 4 (all weak). family 23 is related to this family. Very nice closely spaced tandem array. one copy is not well annotated as a gene 429 213 56/1000 69/192 30 3/1000 25/30
12

XXXXX

14,076,773

14,050,000 - 14,130,000 region

5

(3/2)

CCR4-associated factor none first copy at 14,034,000. Part of large multiple inversion/duplication region a tandem-duplicated inversion-duplication. Not too useful.
13
( 13a )
14,403,160

14,330,000 - 14,530,000 region

26

(22/4)

speckle-type protein one or two copies on almost every chromosome first copy at 14,334,000; last at 14,526,000 mostly tandem, with a couple of 1-gene inverted duplications. fam 13a has 2 truncated copies removed. 1770 177 0/1000 45/174 6 -- 4/6
14

XXXXX

15,026,061 6 cytochrome P450 copies on 2, 4, 7 two exons; only 3 copies of the duplication  
15
( 15a )
15,712,713

15,700,000 - 15,760,000 region

9

(9/0)

glycine-rich protein chr 6, probably an array 6 main copies, but 3 others weakly related just past them (15,739,000 - 15,751,000). Nice tandem array. fam15a_cds just has the 6 main copies. Maybe related to family 16--weak homology at farthest copies 879 408 156/1000 215/352 147 32/1000 110/134
16 15,844,166

15,810,000 - 15,880,000 region

9

(8/1)

unknown? none Good array. One inverted copy, weak homology 768 555 168/1000 361/526 261 148/1000 213/252
17
( 17a )
16,548,753

16,540,000 - 16,600,000 region

5

(4/1)

disease resistance protein another single copy at 19,769,000 2 exons, a bit confusing. Another partial copy just to left. fam 17a does not have the isolated single copy 4263 1551 0/1000 915/1338 606 40/1000 520/572
18 18,227,548

18,220,000 - 18,240,000 region

4

(4/0)

epoxide hydrolase small piece on chr 5 4 copies, nice tandem array some homology in introns too 987 896 999/1000 446/774 375 707/1000 300/350
19 19,036,776

19,030,000 - 19,120,000 region

10

(8/2)

cytochrome P450 chr 2, 6, 7, 8, plus family 14 good array; 2 inverted copies   2031 1128 0/1000 832/1128 459 0/1000 409/459
20
( 20a )
19,720,959

19,710,000 - 19,880,000 region

23

(21/3)

glutathione S-transferase chr 1, 3, 7, 9, 10 (10,930,000: 1 copy) first copy at 19,711,000 is a pseudogene deleted at 3' half: use fam20a_cds, which lacks this sequence 2 adjacent ones (plus one other) are inverted; all the rest in tandem. Also has interspersed TE duplication. 825 642 0/1000 422/609 123 0/1000 110/122
21 20,189,115

20,170,000 - 20,250,000 region

11

(11/0)

nucleoid DNA binding protein chr 12 (good copy) left-hand copy part of larger dup; nice array, interspersed with LTR elements all tandem 1395 990 0/1000 729/983 273 0/1000 252/271
22

XXXXX

20,437,636 4 chitinase chr 3, 5, 6 only 2 copies, 2 exons each  
23 20,887,576

20,870,000 - 20,930,000 region

9

(9/0)

lipid transfer protein chr 2,3,4,6,8,11,12, plus 10 (family 11) first copy at 20,882,000; last copy at 20,924,000 first copy has only weak homology; possible other weak copies in this area (pseudogenes?) 522 345 275/1000 208/335 135 19/1000 118/133
24

XXXXX

21,062,328

21,050,000 - 21,090,000 region

4

(3/1)

beta expansin EXPB6 chr 2, 3, 4, 5, 8, 10 (1 copy at 20,692,000) last copy is reverse orientation  
25 21,220,736

21,210,000 - 21,260,000 region

6

(6/0)

ethylene-forming enzyme chr 11 4 main copies, plus small part of a fifth, with 2 other genes (ethylene-forming enzyme) sharing a bit at 21,242,000 and 21,248,000. I include the 2 other genes even though homology is weak. all tandem 1098 825 729/1000 465/754 288 455/1000 241/278
26 11,093,000

11,050,000 - 11,110,000 region

4

(4/0)

Putative Glucan 1,3-beta-glucosidase precursor chr 5, 8 plus a bit of a 5th copy at very end of the main annotated gene several large introns; copies 2-4 annotated as part of 1 gene 1548 1410 771/1000 1005/1053 804 719/1000 683/700
27 12,390,000

12,380,000 - 12,430,000 region

4

(4/0)

putative lipase none weak homology with significant intron variation all tandem 1293 870 995/1000 597/666 330 686/1000 306/310
28 14,236,000

14,230,000 - 14,280,000 region

8

(5/3)

hypothetical protein chr 4, 5, 6, 8, 9, 11 5 in one orientation, 3 in opposite   1572 927 0/1000 364/872 225 0/1000 110/223
29

XXXXX

14,772,000 8 hypothetical protein chr 2, 10 (around 15,000,000) Part of a large inverted duplication. first copy at 14,718,000; last copy at 14,778,000 big mess; forget it for now
30 20,757,000

20,750,000 - 20,780,000 region

5

(5/0)

hypothetical protein chr 5 5 good copies plus a bit of a 6th; one has very strong homology to chr 5 copy. Good array Most of it part of one annotated protein. All tandem. 852 438 969/1000 115/241 21 -- 15/18
31 21,181,000

21,180,000 - 21,200,000 region

4

(4/0)

putative metalloproteinase chr 2, 4, 6 nice tandem array weak homology, but close together all in tandem 1209 684 1000/1000 429/513 213 461/1000 198/206
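
The copy counts above can be cross-checked against their direct/inverted split. This is a minimal sketch, assuming a few rows are transcribed by hand from the table as tuples; the names `rows` and `find_mismatches` are hypothetical and not part of the original analysis.

```python
# Hypothetical helper: each row is transcribed by hand from the table as
# (family, total copies, direct copies, inverted copies).
rows = [
    (1, 6, 4, 2),
    (3, 6, 5, 1),
    (8, 13, 9, 4),
    (13, 26, 22, 4),
    (19, 10, 8, 2),
    (20, 23, 21, 3),
    (28, 8, 5, 3),
]


def find_mismatches(rows):
    """Return families whose direct + inverted counts differ from the total."""
    return [fam for fam, total, direct, inverted in rows
            if direct + inverted != total]


# Families listed here need a second look at their copy counts.
print(find_mismatches(rows))
```

With the rows shown, only family 20 is flagged: 21 direct plus 3 inverted gives 24, but 23 copies are listed.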