CA2206815A1 - Methods and apparatus for DNA sequencing and DNA identification

Methods and apparatus for DNA sequencing and DNA identification

Info

Publication number
CA2206815A1
Authority
CA
Canada
Prior art keywords
probes
recited
nucleic acid
sequence
tion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002206815A
Other languages
French (fr)
Inventor
Radoje T. Drmanac
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hyseq Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00Preparation of compounds containing saccharide radicals
    • C12P19/26Preparation of nitrogen-containing carbohydrates
    • C12P19/28N-glycosides
    • C12P19/30Nucleotides
    • C12P19/34Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813Hybridisation assays
    • C12Q1/6827Hybridisation assays for detection of mutation or polymorphism
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • C12Q1/6874Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53Immunoassay; Biospecific binding assay; Materials therefor
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00279Features relating to reactor vessels
    • B01J2219/00306Reactor vessels in a multiple arrangement
    • B01J2219/00313Reactor vessels in a multiple arrangement the reactor vessels being formed by arrays of wells in blocks
    • B01J2219/00315Microtiter plates
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00351Means for dispensing and evacuation of reagents
    • B01J2219/00364Pipettes
    • B01J2219/00367Pipettes capillary
    • B01J2219/00369Pipettes capillary in multiple or parallel arrangements
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00351Means for dispensing and evacuation of reagents
    • B01J2219/00378Piezo-electric or ink jet dispensers
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00351Means for dispensing and evacuation of reagents
    • B01J2219/00387Applications using probes
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00497Features relating to the solid phase supports
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00497Features relating to the solid phase supports
    • B01J2219/00504Pins
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00497Features relating to the solid phase supports
    • B01J2219/00513Essentially linear supports
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00277Apparatus
    • B01J2219/00497Features relating to the solid phase supports
    • B01J2219/00527Sheets
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583Features relative to the processes being carried out
    • B01J2219/00603Making arrays on substantially continuous surfaces
    • B01J2219/00659Two-dimensional arrays
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00718Type of compounds synthesised
    • B01J2219/0072Organic compounds
    • B01J2219/00722Nucleotides
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00Libraries per se, e.g. arrays, mixtures
    • C40B40/04Libraries containing only organic compounds
    • C40B40/06Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B60/00Apparatus specially adapted for use in combinatorial chemistry or with libraries
    • C40B60/14Apparatus specially adapted for use in combinatorial chemistry or with libraries for creating libraries
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00Technologies relating to chemical industry
    • Y02P20/50Improvements relating to the production of bulk chemicals
    • Y02P20/582Recycling of unreacted starting or intermediate materials

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Hematology (AREA)
  • Urology & Nephrology (AREA)
  • Biomedical Technology (AREA)
  • Cell Biology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Saccharide Compounds (AREA)

Abstract

Sequencing by Hybridization (SBH) methods and apparatus employing subdivided filters for discrete multiple probe analysis of multiple samples may be used for DNA identification and for DNA sequencing. Partitioned filters are prepared. Samples are affixed to sections of partitioned filters and each sector is probed with a single probe or a multiplexed probe for hybridization scoring. Hybridization data is analyzed for probe complementarity, partial sequencing by SBH or complete sequencing by SBH.

Description

METHODS AND APPARATUS FOR
DNA SEQUENCING AND DNA IDENTIFICATION

Field of the Invention

This invention relates in general to methods and apparatus for nucleic acid analysis, and, in particular, to methods and apparatus for DNA sequencing.

Background

The rate of determining the sequence of the four nucleotides in DNA samples is a major technical obstacle for further advancement of molecular biology, medicine, and biotechnology. Nucleic acid sequencing methods which involve separation of DNA molecules in a gel have been in use since 1978. The only other proven method for sequencing nucleic acids is sequencing by hybridization (SBH).
The array-based approach of SBH does not require single base resolution in separation, degradation, synthesis or imaging of a DNA molecule. In the most commonly discussed variation of this method, using mismatch discriminative hybridization of short oligonucleotides K bases in length, lists of constituent K-mer oligonucleotides may be determined for target DNA. The sequence may be assembled through uniquely overlapping scored oligonucleotides.
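The overlap-based assembly described above may be illustrated by the following minimal Python sketch. It assumes an idealized, error-free list of scored K-mers and a known starting oligonucleotide; the function names `kmer_spectrum` and `assemble` are hypothetical and are not part of the disclosure.

```python
from __future__ import annotations


def kmer_spectrum(target: str, k: int) -> set[str]:
    """Idealized hybridization result: the set of K-mers present in the target."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def assemble(spectrum: set[str], start: str) -> str | None:
    """Extend from a starting K-mer through unique (K-1)-base overlaps.

    Returns None when no unique extension exists (a repeated K-1
    oligonucleotide gives several candidates; a dead end gives none).
    """
    k = len(start)
    unused = set(spectrum) - {start}
    sequence = start
    while unused:
        suffix = sequence[-(k - 1):]
        candidates = [probe for probe in unused if probe.startswith(suffix)]
        if len(candidates) != 1:
            return None
        sequence += candidates[0][-1]  # each overlap adds one new base
        unused.remove(candidates[0])
    return sequence


target = "ATGCGTACCTGA"
print(assemble(kmer_spectrum(target, 4), "ATGC") == target)
# A repeated 3-base overlap (CGT) makes the extension ambiguous:
print(assemble(kmer_spectrum("ACGTTACGTC", 4), "ACGT"))
```

The second call returns no sequence because two scored 4-mers overlap the same three bases, which is the kind of ambiguity that arises with repeated K-1 oligonucleotides.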
In SBH sequence assembly, K-1 oligonucleotides which occur repeatedly in analyzed DNA fragments due to chance or biological reasons may be subject to special consideration. If there is no additional information, relatively small fragments of DNA may be fully assembled inasmuch as every base pair (bp) is read several times. In assembly of relatively longer fragments, ambiguities may arise due to repeated occurrence of a K-1 nucleotide. This problem does not exist if mutated or similar sequences have to be determined. Knowledge of one sequence may be used as a template to correctly assemble a similar one.
There are several approaches for sequencing by hybridization. In SBH Format 1, DNA samples are arrayed and labelled probes are hybridized with the samples. Replica membranes with the same sets of sample DNAs may be used for parallel scoring of several probes and/or probes may be multiplexed. Arraying and hybridization of DNA samples on the nylon membranes are well developed. Each array may be reused many times. Format 1 is especially efficient for batch processing large numbers of samples.
In SBH Format 2, probes are arrayed and a labelled DNA sample fragment is hybridized to the arrayed probes. In this case, the complete sequence of one fragment may be determined from simultaneous hybridization reactions with the arrayed probes. For sequencing other DNA fragments, the same oligonucleotide array may be reused. The arrays may be produced by spotting or in situ synthesis. Specific hybridization has been demonstrated. In a variant of Format 2, DNA anchors are arrayed and ligation is used to determine oligosequences present at the end of target DNA.
In Format 3, two sets of probes are used. One set may be in the form of arrays and another, labelled set is stored in multiwell plates. In this case, target DNA need not be labelled. Target DNA and one labelled probe are added to the arrayed set of probes. If one attached probe and one labelled probe both hybridize contiguously on the target DNA, they are covalently ligated, producing a sequence twice as long to be scored. The process allows for sequencing long DNA fragments, e.g. a complete bacterial genome, without DNA subcloning in smaller pieces.

In the present invention, SBH is applied to the efficient identification and sequencing of one or more DNA samples in a short period of time. The procedure has many applications in DNA diagnostics, forensics, and gene mapping. It also may be used to identify mutations responsible for genetic disorders and other traits, to assess biodiversity and to produce many other types of data dependent on DNA sequence.

Summary of the Invention

As mentioned above, Format 1 SBH is appropriate for the simultaneous analysis of a large set of samples. Parallel scoring of thousands of samples on large arrays may be applied to one or a few samples in thousands of independent hybridization reactions using small pieces of membranes. The identification of DNA may involve 1-20 probes and the identification of mutations may in some cases involve more than 1000 probes specifically selected or designed for each sample. For identification of the nature of the mutated DNA segments, specific probes may be synthesized or selected for each mutation detected in the first round of hybridizations.
According to the present invention, DNA samples may be prepared in small arrays which may be separated by appropriate spacers, and which may be simultaneously tested with probes selected from a set of oligonucleotides kept in multiwell plates. Small arrays may consist of one or more samples. DNA samples in each small array may consist of mutants or individual samples of a sequence. Consecutive small arrays which form larger arrays may represent either replication of the same array or samples of a different DNA fragment. A universal set of probes consists of sufficient probes to analyze any DNA fragment with prespecified precision, e.g. with respect to the redundancy of reading each bp. These sets may include more probes than are necessary for one specific fragment, but fewer than are necessary for testing thousands of DNA samples of different sequence.

DNA or allele identification and a diagnostic sequencing process may include the steps of:

1) Selection of a subset of probes from a dedicated, representative or universal set to be hybridized with each of a plurality of small arrays;
2) Adding a first probe to each subarray on each of the arrays to be analyzed in parallel;
3) Performing hybridization and scoring of the hybridization results;
4) Stripping off previously used probes and repeating for the remaining probes that are to be scored;
5) Processing the obtained results to obtain a final analysis or to determine additional probes to be hybridized;
6) Performing additional hybridizations for certain subarrays; and
7) Processing complete sets of data and computing to obtain a final analysis.

The present invention solves problems in fast identification and sequencing of a small number of nucleic acid samples of one type (e.g. DNA, RNA) and in parallel analysis of many sample types by using a presynthesized set of probes of manageable size and samples attached to a support in the form of subarrays. Two approaches have been combined to produce an efficient and versatile process for the determination of DNA identity, for DNA diagnostics, and for identification of mutations. For the identification of known sequences a small set of shorter probes may be used in place of a longer unique probe. In this case, there may be more probes to be scored, but a universal set of probes may be synthesized to cover any type of sequence. For example, full sets of 6-mers and 7-mers are only 4,096 and 16,384 probes, respectively.
Full sequencing of a DNA fragment may involve two levels. One level is hybridization of a sufficient set of probes that cover every base at least once. For this purpose, a specific set of probes may be synthesized for a standard sample. This hybridization data reveals whether and where mutations (differences) occur in non-standard samples. To determine the identity of the changes, additional specific probes may be hybridized to the sample. In another embodiment, all probes from a universal set may be scored.
A universal set of probes allows scoring of a relatively small number of probes per sample in a two step process without unacceptable expenditure of time. The hybridization process involves successive probings, in a first step of computing an optimal subset of probes to be hybridized first and, then, on the basis of the obtained results, a second step of determining additional probes to be scored from among those in the existing universal set.
The use of an array of sample arrays avoids consecutive scoring of many oligonucleotides on a single sample or on a small set of samples. This approach allows the scoring of more probes in parallel by manipulation of only one physical object. By combining the use of the array formed with the universal set of probes and the four step hybridization process, a DNA sample 1000 bp in length may be sequenced in a relatively short period of time. If the sample is spotted at 50 subarrays in an array and the array is reprobed 10 times, 500 probes may be scored. This number of probes is highly sufficient. In screening for the occurrence of a mutation, approximately 335 probes may be used to cover each base three times. If a mutation is present, several covering probes will be affected. These negative probes may map the mutation with a two base precision. To solve a single base mutation mapped with this precision, an additional 15 probes may be employed. These probes cover any base combination for the two questionable positions (assuming that deletions and insertions are not involved). These probes may be scored in one cycle on 50 subarrays which contain the given sample. In the implementation of a multiple label color scheme (multiplexing), two to six probes labelled with different fluorescent dyes may be used as a pool, thereby reducing the number of hybridization cycles and shortening the sequencing process.
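The probe counts in the preceding paragraph may be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, and the reference dinucleotide "AG" is an arbitrary stand-in for the two questionable positions.

```python
from itertools import product

BASES = "ACGT"


def probes_scored(subarrays: int, probings: int) -> int:
    """Each probing scores one probe on each subarray."""
    return subarrays * probings


def alternative_combinations(reference: str) -> list[str]:
    """Every base combination at the questionable positions except the
    reference combination itself (substitutions only, no indels)."""
    return [
        "".join(combo)
        for combo in product(BASES, repeat=len(reference))
        if "".join(combo) != reference
    ]


print(probes_scored(50, 10))                # 500
print(len(alternative_combinations("AG")))  # 15
```

Two positions give 4 x 4 = 16 combinations, one of which is the unmutated reference, which is consistent with the 15 additional probes mentioned above.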
In more complicated cases, there may be two close mutations or insertions. They may be handled with more probes. For example, a three base insertion may be solved with 64 probes. The most complicated cases may be approached by several steps of hybridization, and the selecting of a new set of probes on the basis of results of previous hybridizations.
If a subarray consists of tens or hundreds of samples of one type, then several of them may be found to contain one or more changes (mutations, insertions, or deletions). For each segment where mutation occurs, a specific set of probes may be scored. The total number of probes to be scored for a type of sample may be several hundreds. The scoring of replica arrays in parallel allows scoring of hundreds of probes in a relatively small number of cycles. In addition, compatible probes may be pooled. Positive hybridizations may be assigned to the probes selected to check particular DNA segments because these segments usually differ in 75% of their constituent bases.
By using a larger set of longer probes, longer targets may be conveniently analyzed. These targets may represent pools of shorter fragments such as pools of exon clones.
The multiple step approach, which minimizes the number of necessary probes, may employ a specific hybridization scoring method to define the presence of heterozygotes (sequence variants) in a genomic segment to be sequenced from a diploid chromosomal set. There are two possibilities: i) the sequence from one chromosome represents a basic type and the sequence from the other represents a new variant; or, ii) both chromosomes contain new, but different variants. In the first case, the scanning step designed to map changes gives a maximal signal difference of two-fold at the heterozygotic position. In the second case, there is no masking; only a more complicated selection of the probes for the subsequent rounds of hybridizations may be required.
Scoring two-fold signal differences required in the first case may be achieved efficiently by comparing corresponding signals with controls containing only the basic sequence type and with the signals from other analyzed samples. This approach allows determination of a relative reduction in the hybridization signal for each particular probe in the given sample. This is significant because hybridization efficiency may vary more than two-fold for a particular probe hybridized with different DNA fragments having its full match target. In addition, heterozygotic sites may affect more than one probe depending on the number of oligonucleotide probes. Decrease of the signal for two to four consecutive probes produces a more significant indication of heterozygotic sites. The leads may be checked by small sets of selected probes among which one or a few probes are supposed to give a full match signal which is on average eight-fold stronger than the signals coming from mismatch-containing duplexes.
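The comparison with controls may be sketched in Python as follows. This is a minimal illustration under stated assumptions: signals are given in the order in which the probes tile the segment, the threshold of 0.6 and the minimum run of two probes are illustrative values chosen here (not values from the disclosure), and the function names are hypothetical.

```python
from __future__ import annotations


def relative_signals(sample: list[float], control: list[float]) -> list[float]:
    """Signal of each probe in the sample relative to the basic-type control."""
    return [s / c for s, c in zip(sample, control)]


def reduced_runs(ratios: list[float], threshold: float = 0.6,
                 min_run: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for runs of at least
    `min_run` consecutive probes whose relative signal is at or below
    `threshold` (a roughly two-fold reduction corresponds to about 0.5)."""
    runs = []
    start = None
    # A trailing sentinel closes any run that reaches the last probe.
    for i, ratio in enumerate(list(ratios) + [float("inf")]):
        if ratio <= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs


control = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
sample = [100.0, 95.0, 52.0, 48.0, 55.0, 102.0]
print(reduced_runs(relative_signals(sample, control)))  # [(2, 5)]
```

Requiring a run of consecutive reduced probes, rather than a single reduced probe, reflects the observation above that a decrease for two to four consecutive probes is a more significant indication of a heterozygotic site.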
Partitioned membranes allow a very flexible organization of experiments to accommodate relatively larger numbers of samples representing a given sequence type, or many different types of samples represented with a smaller number of samples. A range of 4-256 samples can be handled with particular efficiency. Subarrays within this range of numbers of dots may be designed to match the configuration and size of standard multiwell plates used for storing and labelling oligonucleotides. The size of the subarrays may be adjusted for different numbers of samples, or a few standard subarray sizes may be used. If all samples of one type do not fit in one subarray, additional subarrays or membranes may be used and processed with the same probes. In addition, by adjusting the number of replicas for each subarray, the time for completion of the identification or sequencing process may be varied.

Detailed Description

Example 1
Preparation of a Universal Set of Probes

Two types of universal sets of probes may be prepared. The first is a complete set (or at least a noncomplementary subset) of relatively short probes. For example, all 4096 (or about 2000 non-complementary) 6-mers, or all 16,384 (or about 8,000 non-complementary) 7-mers. Full noncomplementary subsets of 8-mers and longer probes are less convenient inasmuch as they include 32,000 or more probes.
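The set sizes quoted above may be reproduced with a short Python sketch. It assumes that a "noncomplementary subset" keeps one member of each pair of mutually reverse-complementary probes (a self-complementary probe is kept once); that reading is an assumption made here, and the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(probe: str) -> str:
    """Sequence of the strand that pairs with the probe, read 5' to 3'."""
    return probe.translate(COMPLEMENT)[::-1]


def universal_set(k: int) -> list[str]:
    """All 4**k probes of length k."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def noncomplementary_subset(k: int) -> list[str]:
    """One representative per reverse-complement pair."""
    return sorted({min(p, reverse_complement(p)) for p in universal_set(k)})


for k in (6, 7, 8):
    print(k, len(universal_set(k)), len(noncomplementary_subset(k)))
```

Under this assumption the subsets contain 2,080 6-mers, 8,192 7-mers and 32,896 8-mers, in line with the approximate figures of 2000, 8,000 and 32,000 or more given in the text.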
A second type of probe set is selected as a small subset of probes still sufficient for reading every bp in any sequence with at least one probe. For example, 12 of 16 dimers are sufficient. A small subset of 7-mers, 8-mers and 9-mers for sequencing double stranded DNA may be about 3000, 10,000 and 30,000 probes, respectively.
Probes may be prepared using standard chemistry with one to three non-specified (mixed A, T, C and G) or universal (e.g. M base, inosine) bases at the ends. If radiolabelling is used, probes may have an OH group at the 5' end for kinasing by radiolabelled phosphorous groups. Alternatively, probes labelled with fluorescent dyes may be employed. Other types of probes like PNA (Protein Nucleic Acids) or probes containing modified bases which change duplex stability also may be used.
Probes may be stored in barcoded multiwell plates. For small numbers of probes, 96-well plates may be used; for 10,000 or more probes, storage in 384- or 864-well plates is preferred. Stacks of 5 to 50 plates are enough to store all probes. Approximately 5 pg of a probe may be sufficient for hybridization with one DNA sample. Thus, from a small synthesis of about 50 µg per probe, ten million samples may be analyzed. If each probe is used for every third sample, and if each sample is 1000 bp in length, then over 30 billion bases (10 human genomes) may be sequenced by a set of 5,000 probes.
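The throughput estimate above may be reproduced with the following arithmetic sketch in Python. It reads the synthesis amount as about 50 micrograms per probe, which is the value consistent with the stated ten million samples, and assumes a human genome of about 3 billion bases, as implied by the parenthetical "10 human genomes". The function names are hypothetical.

```python
PG_PER_UG = 10**6                    # picograms per microgram
HUMAN_GENOME_BP = 3 * 10**9          # assumed genome size


def samples_per_synthesis(synthesis_ug: int, per_sample_pg: int) -> int:
    """Number of hybridizations one synthesis of a probe can supply."""
    return synthesis_ug * PG_PER_UG // per_sample_pg


def bases_sequenced(uses_per_probe: int, sample_interval: int,
                    sample_len_bp: int) -> int:
    """Total bases read when each probe is needed only for every
    `sample_interval`-th sample, so the probe set serves that many
    times more samples than a single probe has uses."""
    return uses_per_probe * sample_interval * sample_len_bp


uses = samples_per_synthesis(50, 5)
total = bases_sequenced(uses, 3, 1000)
print(uses)                      # 10000000
print(total)                     # 30000000000
print(total // HUMAN_GENOME_BP)  # 10
```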
Example 2
Preparation of DNA Samples

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be prepared in 2-500 µl of final volume.

Example 3
Preparation of DNA Arrays

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. Spotting may be performed by using arrays of metal pins (the positions of which correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the density of the wells is achieved. One to 25 dots may be accommodated in 1 mm2 depending on the type of label used. By avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same gene) from different individuals, or may be different, overlapped genomic clones. Each of the subarrays may represent replica spotting of the same samples. In one example, one gene segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By using a 96-pin device all samples may be spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the dot span may be 1 mm2 and there may be a 1 mm space between subarrays.
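The geometry of the 64-patient example may be sketched in Python as follows. The sketch assumes an 8 x 8 arrangement of the 64 dots within each subarray, a 1 mm dot pitch, and a 9 mm pitch between pins (the usual well spacing of a 96-well plate, which also equals eight 1 mm dots plus the 1 mm space between subarrays). These layout choices and the function name are assumptions for illustration.

```python
from __future__ import annotations


def dot_position_mm(pin_row: int, pin_col: int, patient: int,
                    dots_per_side: int = 8, dot_pitch_mm: float = 1.0,
                    pin_pitch_mm: float = 9.0) -> tuple[float, float]:
    """Return (x, y) in mm of the dot printed by one pin for one patient.

    Each pin prints one subarray; the offset inside the subarray is set by
    the patient index, i.e. by which plate is being spotted (offset printing).
    """
    if not 0 <= patient < dots_per_side ** 2:
        raise ValueError("patient index does not fit in the subarray")
    offset_row, offset_col = divmod(patient, dots_per_side)
    x = pin_col * pin_pitch_mm + offset_col * dot_pitch_mm
    y = pin_row * pin_pitch_mm + offset_row * dot_pitch_mm
    return (x, y)


# 8 x 12 pins, 64 patients: the farthest dot still fits an 8 x 12 cm membrane.
print(dot_position_mm(7, 11, 63))  # (106.0, 70.0)
```

With these assumptions the 96 x 64 = 6,144 dots occupy distinct positions within about 107 mm by 71 mm.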
Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage screens or x-ray films.

Example 4
Selection and Labelling of Probes

When an array of subarrays is produced, the sets of probes to be hybridized in each of the hybridization cycles on each of the subarrays is defined. For the samples in Example 3, a set of 384 probes may be selected from the universal set, and 96 probings may be performed in each of 4 cycles. Probes selected to be hybridized in one cycle preferably have similar G+C contents.
Selected probes for each cycle are transferred to a 96-well plate and then are labelled by kinasing or by other labelling procedures if they are not labelled (e.g. with stable fluorescent dyes) before they are stored.
On the basis of the first round of hybridizations, a new set of probes may be defined for each of the subarrays for additional cycles. Some of the arrays may not be used in some of the cycles. For example, if only 8 of 64 patient samples exhibit a mutation and 8 probes are scored first for each mutation, then all 64 probes may be scored in one cycle and 32 subarrays are not used. These subarrays may then be treated with hybridization buffer to prevent drying of the filters.
Probes may be retrieved from the storing plates by any convenient approach, such as a single channel pipetting device or a robotic station such as a Beckman Biomek 1000 (Beckman Instruments, Fullerton, California) or a Mega Two robot (Megamation, Lawrenceville, New Jersey).
A robotic station may be integrated with data analysis programs and probe managing programs. Outputs of these programs may be inputs for one or more robotic stations. Probes may be retrieved one by one and added to subarrays covered by hybridization buffer. It is preferred that retrieved probes be placed in a new plate and labelled or mixed with hybridization buffer. The preferred method of retrieval is by accessing stored plates one by one and pipetting (or transferring by metal pins) a sufficient amount of each selected probe from each plate to specific wells in an intermediary plate. An array of individually addressable pipettes or pins may be used to speed up the retrieval process.

Example 5
Hybridization and Scoring Process

Labelled probes may be mixed with hybridization buffer and pipetted preferentially by multichannel pipettes to the subarrays. To prevent mixing of the probes between subarrays (if there are no hydrophilic strips or physical barriers imprinted in the membrane), a corresponding plastic, metal or ceramic grid may be firmly pressed to the membrane. Also, the volume of the buffer may be reduced to about 1 µl or less per mm². The concentration of the probes and hybridization conditions used may be as described previously except that the washing buffer may be quickly poured over the array of subarrays to allow fast dilution of probes and thus prevent significant cross-hybridization. For the same reason, a minimal concentration of the probes may be used and hybridization time extended to the maximal practical level. For DNA detection and sequencing, knowledge of a "normal" sequence allows the use of the continuous stacking interaction phenomenon to increase the signal. In addition to the labelled probe, additional unlabelled probes which hybridize back to back with a labelled one may be added in the hybridization reaction. The amount of the hybrid may be increased several times. The probes may be connected by ligation. This approach may be important for resolving DNA regions forming "compressions".
In the case of radiolabelled probes, images of the filters may be obtained preferentially by phosphorstorage technology. Fluorescent labels may be scored by CCD cameras, confocal microscopy or otherwise. Raw signals are normalized based on the amount of target in each dot to properly scale and integrate data from different hybridization experiments. Differences in the amount of target DNA per dot may be corrected for by dividing signals of each probe by an average signal for all probes scored on one dot. Also, the normalized signals may be scaled, usually from 1-100, to compare data from different experiments. Also, in each subarray, several control DNAs may be used to determine an average background signal in those samples which do not contain a full match target. Furthermore, for samples obtained from diploid (polyploid) sources, homozygotic controls may be used to allow recognition of heterozygotes in the samples.

Example 6
Diagnostics - Scoring Known Mutations or Full Gene Resequencing

A simple case is to discover whether some known mutations occur in a DNA segment. Less than 12 probes may suffice for this purpose, for example, 5 probes positive for one allele, 5 positive for the other, and 2 negative for both. Because of the small number of probes to be scored per sample, large numbers of samples may be analyzed in parallel. For example, with 12 probes in 3 hybridization cycles, 96 different genomic loci or gene segments from 64 patients may be analyzed on one 6 x 9 in membrane containing 12 x 24 subarrays each with 64 dots representing the same DNA segment from 64 patients. In this example, samples may be prepared in sixty-four 96-well plates. Each plate may represent one patient, and each well may represent one of the DNA segments to be analyzed. The samples from 64 plates may be spotted in four replicas as four quarters of the same membrane.
A set of 12 probes may be selected by single channel pipetting or a single pin transferring device (or by an array of individually controlled pipets or pins) for each of the 96 segments and rearranged in twelve 96-well plates. Probes may be labelled if they are not prelabelled before storing, and then probes from four plates may be mixed with hybridization buffer and added to the subarrays preferentially by a 96-channel pipetting device. After one hybridization cycle it is possible to strip off previously used probes by incubating the membrane at 37° to 55°C in the preferably undiluted hybridization or washing buffer.
The likelihood that probes positive for one allele are positive and probes positive for the other allele are negative may be used to determine which of the two alleles is present. In this redundant scoring scheme, some level (about 10%) of errors in hybridization of each probe may be tolerated.
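The per-dot normalization described at the end of Example 5 and the redundant allele scoring above can be sketched together. This is a minimal illustration, not the scoring procedure of the specification: the vote-counting rule, the cutoff of 1.0 on mean-normalized signals, the probe names and the signal values are all assumptions, and heterozygotes are not handled.

```python
def normalize_dot(raw_signals):
    """Divide each probe signal by the mean of all probe signals on one dot."""
    mean = sum(raw_signals.values()) / len(raw_signals)
    if mean <= 0:
        raise ValueError("dot has no signal")
    return {probe: value / mean for probe, value in raw_signals.items()}


def call_allele(normalized, probes_a, probes_b, cutoff=1.0):
    """Return 'A', 'B' or None by counting probes that behave as expected."""
    def positive(probe):
        return normalized[probe] >= cutoff

    # A probe supports allele A if it is an A probe scoring positive
    # or a B probe scoring negative, and vice versa for allele B.
    support_a = sum(positive(p) for p in probes_a) + sum(not positive(p) for p in probes_b)
    support_b = sum(positive(p) for p in probes_b) + sum(not positive(p) for p in probes_a)
    if support_a == support_b:
        return None
    return "A" if support_a > support_b else "B"


# Hypothetical raw signals: 5 probes per allele plus 2 probes negative for both.
probes_a = [f"a{i}" for i in range(5)]
probes_b = [f"b{i}" for i in range(5)]
raw = {p: 100.0 for p in probes_a}
raw.update({p: 10.0 for p in probes_b})
raw.update({"neg0": 10.0, "neg1": 10.0})
raw["a3"] = 10.0   # one failed probe, standing in for a hybridization error

norm = normalize_dot(raw)
call = call_allele(norm, probes_a, probes_b)
print(call)   # A
```

Because ten probes vote on each call, a single failed probe does not change the result, which is the point of the redundancy.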
An incomplete set of probes may be used for scoring most of the alleles, especially if the smaller redundancy is sufficient, e.g. one or two probes which prove the presence or absence in a sample of one of the two alleles. For example, with a set of four thousand 8-mers there is a 91% chance of finding at least one positive probe for one of the two alleles for a randomly selected locus. The incomplete set of probes may be optimized to reflect G+C content and other biases in the analyzed samples.
For full gene sequencing, genes may be amplified in an appropriate number of segments. For each segment, a set of probes (about one probe per 2-4 bases) may be selected and hybridized. These probes may identify whether there is a mutation anywhere in the analyzed segments. Segments (i.e., subarrays which contain these segments) where one or more mutated sites are detected may be hybridized with additional probes to find the exact sequence at the mutated sites. If a DNA sample is tested by every second 6-mer, and a mutation is localized at the position that is surrounded by positively hybridized probes TGCAAA and TATTCC and covered by three negative probes: CAAAAC, AAACTA and ACTATT, the mutated nucleotides must be A and/or C occurring in the normal sequence at that position. They may be changed by a single base mutation, or by a one or two nucleotide deletion and/or insertion between bases AA, AC or CT.
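The localization step in this paragraph can be reproduced in a short sketch. The five 6-mers quoted in the text overlap by four bases each, which fixes the normal sequence as TGCAAAACTATTCC; the function names are illustrative, and the rule used (a base is suspect if it is covered only by negative probes) is one plausible reading of the reasoning in the text.

```python
def tile_probes(sequence, k=6, step=2):
    """Return (start, probe) pairs for every `step`-th k-mer of the reference."""
    return [(i, sequence[i:i + k]) for i in range(0, len(sequence) - k + 1, step)]


def suspect_positions(sequence, negative_probes, k=6, step=2):
    """Positions covered by negative probes but by no positive probe."""
    covered_by_positive = set()
    covered_by_negative = set()
    for start, probe in tile_probes(sequence, k, step):
        span = range(start, start + k)
        if probe in negative_probes:
            covered_by_negative.update(span)
        else:
            covered_by_positive.update(span)
    return sorted(covered_by_negative - covered_by_positive)


reference = "TGCAAAACTATTCC"                  # reconstructed from the five probes
negatives = {"CAAAAC", "AAACTA", "ACTATT"}    # probes that failed to hybridize

positions = suspect_positions(reference, negatives)
suspects = "".join(reference[i] for i in positions)
print(positions, suspects)   # [6, 7] AC
```

The two bases left uncovered by any positive probe are the A and C named in the text.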
One approach is to select a probe that extends the positively hybridized probe TGCAAA for one nucleotide to the right, and which extends the probe TATTCC one nucleotide to the left. With these 8 probes (GCAAAA, GCAAAT, GCAAAC, GCAAAG and ATATTC, TTATTC, CTATTC, GTATTC) two questionable nucleotides are determined.
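The eight extension probes listed above follow mechanically from the two flanking positive probes. A minimal sketch, with illustrative function names, that shifts a probe by one base and tries each of the four nucleotides at the new position:

```python
BASES = "ATCG"   # same order as the probe list in the text


def extend_right(probe):
    """Shift the probe one base to the right, trying each base at the new 3' end."""
    return [probe[1:] + base for base in BASES]


def extend_left(probe):
    """Shift the probe one base to the left, trying each base at the new 5' end."""
    return [base + probe[:-1] for base in BASES]


print(extend_right("TGCAAA"))   # ['GCAAAA', 'GCAAAT', 'GCAAAC', 'GCAAAG']
print(extend_left("TATTCC"))    # ['ATATTC', 'TTATTC', 'CTATTC', 'GTATTC']
```

Exactly one probe of each group of four is expected to be positive, which identifies the questionable base on that side.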
The most likely hypothesis about the mutation may be determined. For example, A is found to be mutated to G. There are two solutions satisfied by these results. Either replacement of A with G is the only change or there is in addition to that change an insertion of some number of bases between newly determined G and the following C. If the result with bridging probes is negative these options may then be checked first by at least one bridging probe comprising the mutated position (AAGCTA) and with an additional 8 probes: CAAAGA, CAAAGT, CAAAGC, CAAAGG and ACTATT, TCTATT, CCTATT, GCTATT. There are many other ways to select mutation-solving probes.
In the case of diploid, particular comparisons of scores for the test samples and homozygotic control may be performed to identify heterozygotes (see above). A few consecutive probes are expected to have roughly twice smaller signals if the segment covered by these probes is mutated on one of the two chromosomes.


Example 7
Identification of Genes (Mutations) Responsible for Genetic Disorders and Other Traits

The sequencing process disclosed herein has a very low cost per bp. Also, using larger universal sets of longer probes (8-mers or 9-mers), DNA fragments as long as 5-20 kb may be sequenced without subcloning. Furthermore, the speed of resequencing may be about 10 million bp/day/hybridization instrument. This performance allows for resequencing a large fraction of human genes or the human genome repeatedly from scientifically or medically interesting individuals. To resequence 50% of the human genes, about 100 million bp is checked. That may be done in a relatively short period of time at an affordable cost.
This enormous resequencing capability may be used in several ways to identify mutations and/or genes that encode for disorders or any other traits. Basically, mRNAs (which may be converted into cDNAs) from particular tissues or genomic DNA of patients with particular disorders may be used as starting materials. From both sources of DNA, separate genes or genomic fragments of appropriate length may be prepared either by cloning procedures or by in vitro amplification procedures (for example by PCR). If cloning is used, the minimal set of clones to be analyzed may be selected from the libraries before sequencing. That may be done efficiently by hybridization of a small number of probes, especially if a small number of clones longer than 5 kb is to be sorted. Cloning may increase the amount of hybridization data about two times, but does not require tens of thousands of PCR primers.
In one variant of the procedure, gene or genomic fragments may be prepared by restriction cutting with enzymes like Hga I which cuts DNA in following way: GACGC(N5')/CTGCG(N10'). Protruding ends of five bases are different for different fragments. One enzyme produces appropriate fragments for a certain number of genes. By cutting cDNA or genomic DNA with several enzymes in separate reactions, every gene of interest may be excised appropriately. In one approach, the cut DNA is fractionated by size. DNA fragments prepared in this way (and optionally treated with Exonuclease III which individually removes nucleotides from the 3' end and increases length and specificity of the ends) may be dispensed in the tubes or in multiwell plates. From a relatively small set of DNA adapters with a common portion and a variable protruding end of appropriate length, a pair of adapters may be selected for every gene fragment that needs to be amplified. These adapters are ligated and then PCR is performed by universal primers. From 1000 adapters, a million pairs may be generated, thus a million different fragments may be specifically amplified in the identical conditions with a universal pair of primers complementary to the common end of the adapters.
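The step from about 1000 adapters to about a million adapter pairs is simple combinatorics. The sketch below additionally assumes one adapter per possible five-base protruding end; that correspondence is an inference made here (the text gives only the round figures), and it happens to yield 1024 adapters.

```python
from itertools import product

OVERHANG_LEN = 5   # Hga I leaves five-base protruding ends (from the text)

# One adapter per possible protruding end (assumption for illustration).
overhangs = ["".join(bases) for bases in product("ACGT", repeat=OVERHANG_LEN)]
n_adapters = len(overhangs)

# A fragment has two ends, so it is addressed by an ordered pair of adapters.
n_pairs = n_adapters ** 2

print(n_adapters, n_pairs)   # 1024 1048576
```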
If a DNA difference is found to be repeated in several patients and that sequence change is nonsense or can change function of the corresponding protein, then the mutated gene may be responsible for the disorder. By analyzing a significant number of individuals with particular traits, functional allelic variations of particular genes could be associated by specific traits.
This approach may be used to eliminate the need for very expensive genetic mapping on extensive pedigrees and has special value when there is no such genetic data or material.

Example 8
Scoring Single Nucleotide Polymorphisms in Genetic Mapping

Techniques disclosed in this application are appropriate for an efficient identification of genomic fragments with single nucleotide polymorphisms (SNUPs). In 10 individuals by applying the described sequencing process on a large number of genomic fragments of known sequence that may be amplified by cloning or by in vitro amplification, a sufficient number of DNA segments with SNUPs may be identified. The polymorphic fragments are further used as SNUP markers. These markers are either mapped previously (for example they represent mapped STSs) or they may be mapped through the screening procedure described below.
SNUPs may be scored in every individual from relevant families or populations by amplifying markers and arraying them in the form of the array of subarrays. Subarrays contain the same marker amplified from the analyzed individuals. For each marker, as in the diagnostics of known mutations, a set of 6 or less probes positive for one allele and 6 or less probes positive for the other allele may be selected and scored. From the significant association of one or a group of the markers with the disorder, chromosomal position of the responsible gene(s) may be determined. Because of the high throughput and low cost, thousands of markers may be scored for thousands of individuals.
This amount of data allows localization of a gene at a resolution level of less than one million bp as well as localization of genes involved in polygenic diseases. Localized genes may be identified by sequencing particular regions from relevant normal and affected individuals to score a mutation(s).
PCR is preferred for amplification of markers from genomic DNA. Each of the markers requires a specific pair of primers. The existing markers may be convertible or new markers may be defined which may be prepared by cutting genomic DNA by Hga I type restriction enzymes, and by ligation with a pair of adapters as described in Example 7.
SNUP markers can be amplified or spotted as pools to reduce the number of independent amplification reactions. In this case, more probes are scored per one sample. When 4 markers are pooled and spotted on 12 replica membranes, then 48 probes (12 per marker) may be scored in 4 cycles.

Example 9
Detection and Verification of Identity of DNA Fragments

DNA fragments generated by restriction cutting, cloning or in vitro amplification (e.g. PCR) frequently may be identified in an experiment. Identification may be performed by verifying the presence of a DNA band of specific size on gel electrophoresis. Alternatively, a specific oligonucleotide may be prepared and used to verify a DNA sample in question by hybridization. The procedure developed here allows for more efficient identification of a large number of samples without preparing a specific oligonucleotide for each fragment. A set of positive and negative probes may be selected from the universal set for each fragment on the basis of the known sequences. Probes that are selected to be positive usually are able to form one or a few overlapping groups and negative probes are spread over the whole insert.
This technology may be used for identification of STSs in the process of their mapping on the YAC clones. Each of the STSs may be tested on about 100 YAC clones or pools of YAC clones. DNAs from these 100 reactions possibly are spotted in one subarray. Different STSs may represent consecutive subarrays. In several hybridization cycles, a signature may be generated for each of the DNA samples, which signature proves or disproves existence of the particular STS in the given YAC clone with necessary confidence.

To reduce the number of independent PCR reactions or the number of independent samples for spotting, several STSs may be amplified simultaneously in a reaction or PCR samples may be mixed, respectively. In this case more probes have to be scored per one dot. The pooling of STSs is independent of pooling YACs and may be used on single YACs or pools of YACs. This scheme is especially attractive when several probes labelled with different colors are hybridized together.

In addition to confirmation of the existence of a DNA fragment in a sample, the amount of DNA may be estimated using intensities of the hybridization of several separate probes or one or more pools of probes. By comparing obtained intensities with intensities for control samples having a known amount of DNA, the quantity of DNA in all spotted samples is determined simultaneously. Because only a few probes are necessary for identification of a DNA fragment and there are N possible probes that may be used for DNA N bases long, this application does not require a large set of probes to be sufficient for identification of any DNA segment. From one thousand 8-mers, on average about 30 full matching probes may be selected for a 1000 bp fragment.

Example 10
Identification of Infectious Disease Organisms and Their Variants

DNA-based tests for the detection of viral, bacterial, fungal and other parasitic organisms in patients are usually more reliable and less expensive than alternatives. The major advantage of DNA tests is to be able to identify specific strains and mutants, and eventually be able to apply more effective treatment. Two applications are described below.
The presence of 12 known antibiotic resistance genes in bacterial infections may be tested by amplifying these genes. The amplified products from 128 patients may be spotted in two subarrays and 24 subarrays for 12 genes may then be repeated four times on a 8 x 12 cm membrane. For each gene, 12 probes may be selected for positive and negative scoring. Hybridizations may be performed in 3 cycles. For these tests, as for the tests in Example 9, a much smaller set of probes is most likely to be universal. For example, from a set of one thousand 8-mers, on average 30 probes are positive in 1000 bp fragments and 10 positive probes are usually sufficient for a highly reliable identification. As described in Example 9, several genes may be amplified and/or spotted together and the amount of the given DNA may be determined. The amount of amplified gene may be used as an indicator of the level of infection.
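The figure of about 30 positive probes from a set of one thousand 8-mers on a 1000 bp fragment can be reproduced with a simple expectation. The sketch assumes a random target sequence with equal base frequencies, probes drawn at random from all 8-mers, and both strands of the fragment available as targets; none of these assumptions is stated in the text, and the function name is illustrative.

```python
def expected_full_matches(n_probes, k, fragment_bp, both_strands=True):
    """Expected number of probes in the set with a full-match site in the fragment."""
    windows = fragment_bp - k + 1          # k-mer start positions on one strand
    if both_strands:
        windows *= 2
    # Each window matches any given probe of the set with probability 1 / 4**k.
    return windows * n_probes / 4 ** k


estimate = expected_full_matches(n_probes=1000, k=8, fragment_bp=1000)
print(round(estimate, 1))   # 30.3
```

Counting a single strand only would halve the estimate to about 15, so the quoted 30 is consistent with double-stranded targets.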
Another example involves possible sequencing of one gene or the whole genome of an HIV virus. Because of rapid diversification, the virus poses many difficulties for selection of an optimal therapy. DNA fragments may be amplified from isolated viruses from up to 64 patients and resequenced by the described procedure. On the basis of the obtained sequence the optimal therapy may be selected. If there is a mixture of two virus types of which one has the basic sequence (similar to the case of heterozygotes), the mutant may be identified by quantitative comparisons of its hybridization scores with scores of other samples, especially control samples containing the basic virus type only. Scores twice as small may be obtained for three to four probes that cover the site mutated in one of the two virus types present in the sample (see above).

Example 11
Forensic and Parental Identification Applications

Sequence polymorphisms make an individual genomic DNA unique. This permits analysis of blood or other body fluids or tissues from a crime scene and comparison with samples from criminal suspects. A sufficient number of polymorphic sites are scored to produce a unique signature of a sample. SBH may easily score single nucleotide polymorphisms to produce such signatures.
A set of DNA fragments (10-1000) may be amplified from samples and suspects. DNAs from samples and suspects representing one fragment are spotted in one or several subarrays and each subarray may be replicated 4 times. In three cycles, 12 probes may determine the presence of allele A or B in each of the samples, including suspects, for each DNA locus. Matching the patterns of samples and suspects may lead to discovery of the suspect responsible for the crime.
The same procedure may be applicable to prove or disprove the identity of parents of a child. DNA may be prepared and polymorphic loci amplified from the child and adults; patterns of A or B alleles may be determined by hybridization for each. Comparisons of the obtained patterns, along with positive and negative controls, aid in the determination of familial relationships. In this case, only a significant portion of the alleles need match with one parent for identification. Large numbers of scored loci allow for the avoidance of statistical errors in the procedure or of masking effects of de novo mutations.

Example 12
Assessing Genetic Diversity of Populations or Species and Biological Diversity of Ecological Niches

Measuring the frequency of allelic variations on a significant number of loci (for example, several genes or entire mitochondrial DNA) permits development of different types of conclusions, such as conclusions regarding the impact of the environment on the genotypes, history and evolution of a population or its susceptibility to diseases or extinction, and others. These assessments may be performed by testing specific known alleles or by full resequencing of some loci to be able to define de novo mutations which may reveal fine variations or presence of mutagens in the environment.

Additionally, biodiversity in the microbial world may be surveyed by resequencing evolutionarily conserved DNA sequences, such as the genes for ribosomal RNAs or genes for highly conservative proteins.
DNA may be prepared from the environment and particular genes amplified using primers corresponding to conservative sequences. DNA fragments may be cloned preferentially in a plasmid vector (or diluted to the level of one molecule per well in multiwell plates and then amplified in vitro). Clones prepared this way may be resequenced as described above. Two types of information are obtained. First of all, a catalogue of different species may be defined as well as the density of the individuals for each species. Another segment of information may be used to measure the influence of ecological factors or pollution on the ecosystem. It may reveal whether some species are eradicated or whether the abundance ratios among species are altered due to the pollution. The method also is applicable for sequencing DNAs from fossils.

Example 13
DNA Sequencing

An array of subarrays allows for efficient sequencing of a small set of samples arrayed in the form of replicated subarrays. For example, 64 samples may be arrayed on a 8 x 8 mm subarray and 16 x 24 subarrays may be replicated on a 15 x 23 cm membrane with 1 mm wide spacers between the subarrays. Several replica membranes may be made. For example, probes from a universal set of three thousand seventy-two 7-mers may be divided in thirty-two 96-well plates and labelled by kinasing. Four membranes may be processed in parallel during one hybridization cycle. On each membrane, 384 probes may be scored. All probes may be scored in two hybridization cycles. Hybridization intensities may be scored and the sequence assembled as described below.
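The numbers in this example fit together as follows. The sketch uses only figures from the text and assumes one probe per subarray per cycle; the variable names are illustrative.

```python
plates, wells_per_plate = 32, 96
probes_total = plates * wells_per_plate          # universal set of 7-mers

subarrays_per_membrane = 16 * 24                 # one probe scored per subarray
membranes_in_parallel = 4
probes_per_cycle = subarrays_per_membrane * membranes_in_parallel

cycles = -(-probes_total // probes_per_cycle)    # ceiling division

# Membrane footprint: 8 mm subarrays separated by 1 mm spacers.
pitch_mm = 8 + 1
size_mm = (16 * pitch_mm, 24 * pitch_mm)

print(probes_total, probes_per_cycle, cycles)    # 3072 1536 2
print(size_mm)                                   # (144, 216)
```

The 144 mm by 216 mm footprint fits on the 15 x 23 cm membrane mentioned above.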
If a single sample subarray or subarrays contains several unknowns, especially when similar samples are used, a smaller number of probes may be sufficient if they are intelligently selected on the basis of results of previously scored probes. For example, if probe AAAAAAA is not positive, there is a small chance that any of 8 overlapping probes are positive. If AAAAAAA is positive, then two probes are usually positive. The sequencing process in this case consists of first hybridizing a subset of minimally overlapped probes to define positive anchors and then to successively select probes which confirm one of the most likely hypotheses about the order of anchors and size and type of gaps between them. In this second phase, pools of 2-10 probes may be used where each probe is selected to be positive in only one DNA sample which is different from the samples expected to be positive with other probes from the pool.
The subarray approach allows efficient implementation of probe competition (overlapped probes) or probe cooperation (continuous stacking of probes) in solving branching problems. After hybridization of a universal set of probes the sequence assembly program determines candidate sequence subfragments (SFs). For the further assembly of SFs, additional information has to be provided (from overlapped sequences of DNA fragments, similar sequences, single pass gel sequences, or from other hybridization or restriction mapping data). Competitive hybridization and continuous stacking interactions have been proposed for SF assembly. These approaches are of limited practical value for sequencing of large numbers of samples by SBH wherein a labelled probe is applied to a sample affixed to an array if a uniform array is used. Fortunately, analysis of small numbers of samples using replica subarrays allows efficient implementation of both approaches. On each of the replica subarrays, one branching point may be tested for one or more DNA samples using pools of probes similarly as in solving mutated sequences in different samples spotted in the same subarray (see above).
If in each of 64 samples described in this example, there are about 100 branching points, and if 8 samples are analyzed in parallel in each subarray, then at least 800 subarray probings solve all branches. This means that for the 3072 basic probings an additional 800 probings (25%) are employed. More preferably, two probings are used for one branching point. If the subarrays are smaller, fewer additional probings are used. For example, if subarrays consist of 16 samples, 200 additional probings may be scored (6%). By using 7-mer probes (N1-2B7N1-2) and competitive or collaborative branching solving approaches or both, fragments of about 1000 bp may be assembled by about 4000 probings. Furthermore, using 8-mer probes (NB8N) 4 kb or longer fragments may be assembled with 12,000 probings. Gapped probes, for example, NB4NB3N or NB4NB4N may be used to reduce the number of branching points.
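The count of additional probings for branch resolution follows from the figures in this paragraph. In the sketch below the function name and the use of ceiling division are illustrative choices; the inputs (100 branching points per sample, 8 samples resolved in parallel per probing, 3072 basic probings) are from the text.

```python
BASIC_PROBINGS = 3072


def extra_probings(samples_per_subarray, branch_points=100, parallel=8):
    """Probings needed to resolve all branching points in one subarray."""
    total_branch_points = samples_per_subarray * branch_points
    return -(-total_branch_points // parallel)   # ceiling division


for n in (64, 16):
    extra = extra_probings(n)
    print(n, extra, f"{100 * extra / BASIC_PROBINGS:.1f}%")
# 64 800 26.0%
# 16 200 6.5%
```

The computed overheads of 26.0% and 6.5% match the rounded 25% and 6% quoted in the text.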

Example 14
DNA Analysis by Transient Attachment to Subarrays of Probes and Ligation of Labelled Probes

Oligonucleotide probes having an informative length of four to 40 bases are synthesized by standard chemistry and stored in tubes or in multiwell plates. Specific sets of probes comprising one to 10,000 probes are arrayed by deposition or in situ synthesis on separate supports or distinct sections of a larger support. In the last case, sections or subarrays may be separated by physical or hydrophobic barriers. The probe arrays may be prepared by in situ synthesis. A sample DNA of appropriate size is hybridized with one or more specific arrays. Many samples may be interrogated as pools at the same subarrays or independently with different subarrays within one support. Simultaneously with the sample or subsequently, a single labelled probe or a pool of labelled probes is added on each of the subarrays. If attached and labelled probes hybridize back to back on the complementary target in the sample DNA they are ligated. Occurrence of ligation will be measured by detecting a label from the probe.
This procedure is a variant of the described DNA analysis process in which DNA samples are not permanently attached to the support. Transient attachment is provided by probes fixed to the support. In this case there is no need for a target DNA arraying process. In addition, ligation allows detection of longer oligonucleotide sequences by combining short labelled probes with short fixed probes.

The process has several unique features. Basically, the transient attachment of the target allows its reuse. After ligation occurs the target may be released and the label will stay covalently attached to the support. This feature allows cycling the target and production of detectable signal with a small quantity of the target. Under optimal conditions, targets do not need to be amplified, e.g. natural sources of the DNA samples may be directly used for diagnostics and sequencing purposes. Targets may be released by cycling the temperature between efficient hybridization and efficient melting of duplexes. More preferably, there is no cycling. The temperature and concentrations of components may be defined to have an equilibrium between free targets and targets entered in hybrids at about 50:50% level. In this case there is a continuous production of ligated products. For different purposes different equilibrium ratios are optimal.
An electric field may be used to enhance target use. At the beginning, a horizontal field pulsing within each subarray may be employed to provide for faster target sorting. In this phase, the equilibrium is moved toward hybrid formation, and unlabelled probes may be used. After a target sorting phase, an appropriate washing (which may be helped by a vertical electric field for restricting movement of the samples) may be performed. Several cycles of discriminative hybrid melting, target harvesting by hybridization and ligation and removing of unused targets may be introduced to increase specificity. In the next step, labelled probes are added and vertical electrical pulses may be applied. By increasing temperature, an optimal free and hybridized target ratio may be achieved. The vertical electric field prevents diffusion of the sorted targets.
The subarrays of fixed probes and sets of labelled probes (specially designed or selected from a universal probe set) may be arranged in various ways to allow an efficient and flexible sequencing and diagnostics process. For example, if a short fragment (about 100-500 bp) of a bacterial genome is to be partially or completely sequenced, small arrays of probes (5-30 bases in length) designed on the basis of known sequence may be used. If interrogated with a different pool of 10 labelled probes per subarray, an array of 10 subarrays each having 10 probes allows checking of 200 bases, assuming that only two bases connected by ligation are scored. Under the conditions where mismatches are discriminated throughout the hybrid, probes may be displaced by more than one base to cover the longer target with the same number of probes. By using long probes, the target may be interrogated directly without amplification or isolation from the rest of DNA in the sample. Also, several targets may be analyzed (screened for) in one sample simultaneously. If the obtained results indicate occurrence of a mutation (or a pathogen), additional pools of probes may be used to detect type of the mutation or subtype of pathogen. This is a desirable feature of the process which may be very cost effective in preventive diagnosis where only a small fraction of patients is expected to have an infection or mutation.

In the processes described in the examples, various detection methods may be used, for example, radiolabels, fluorescent labels, enzymes or antibodies (chemiluminescence), large molecules or particles detectable by light scattering or interferometric procedures.

Example 15 Oligonucleotide Probes and Targets Suitable for SBH

In order to obtain experimental sequence data defined as a matrix of (number of fragments-clones) x (number of probes), the number of probes may be reduced depending on the number of fragments used and vice versa. The optimal ratio of the two numbers is defined by the technological requirements of a particular sequencing by hybridization process.
There are two parameters which influence the choice of probe length. The first is the success in obtaining hybridization results that show the required degree of discrimination. The second is the technological feasibility of synthesis of the required number of probes.
The requirement of obtaining sufficient hybridization discrimination with practical and useful amounts of target nucleic acid limits the probe length. It is difficult to obtain a sufficient amount of hybrid with short probes, and to discriminate end mismatches with long probes.
Traditionally, the use of probes shorter than 11-mers in the literature is limited to very stable probes [Estivill et al., Nucl. Acids Res. 15: 1415 (1987)]. On the other hand, probes longer than 15 bases discriminate end mismatches with difficulty [Wood et al., Proc. Natl. Acad. Sci. USA 82: 1585 (1985)].
One solution for the problems of unstable probes and end mismatch discrimination is the use of a group of longer probes representing a single shorter probe in an informational sense. For example, groups of sixteen 10-mers may be used instead of single 8-mers. Every member of the group has a common core 8-mer and one of three possible variations on outer positions with two variations at each end. The probe may be represented as 5'(A, T, C, G) (A, T, C, G) B8 (A, T, C, G) 3'. With this type of probe one does not need to discriminate the non-informative end bases (two on 5' end, and one on 3' end) since only the internal 8-mer is read. This solution employs higher mass amounts of probes and label in hybridization reactions. These disadvantages are eliminated by the use of a few sets of discriminative hybridization conditions for oligomer probes as short as 6-mers.
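The grouping idea can be illustrated with a short Python sketch. It assumes the simplest reading of the text, one degenerate position on each side of the core 8-mer, which yields the sixteen 10-mers mentioned above; the function name `probe_group` and the example core are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "ATCG"

def probe_group(core: str) -> list[str]:
    """Return every probe of the form 5'-N + core + N-3'.

    Assumption for illustration: exactly one degenerate base (N) is added
    at each end of the core, so a group has 4 * 4 = 16 members.
    """
    return [five + core + three for five, three in product(BASES, repeat=2)]

group = probe_group("TGCTCATG")   # arbitrary example core 8-mer
print(len(group))                 # number of probes in the group
print(group[0], group[-1])        # first and last member of the group
```

Only the shared internal 8-mer carries sequence information; the flanking bases serve to stabilize the hybrid.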
The number of hybridization reactions is dependent on the number of discrete labelled probes. Therefore, in the cases of sequencing shorter nucleic acids using a smaller number of fragments-clones than the number of oligonucleotides, it is better to use oligomers as the target and nucleic acid fragments as probes.
Target nucleic acids which have undefined sequences may be produced as a mixture of representative libraries in a phage or plasmid vector having inserts of genomic fragments of different sizes or in samples prepared by PCR. Inevitable gaps and uncertainties in alignment of sequenced fragments arise from nonrandom or repetitive sequence organization of complex genomes and difficulties in cloning poisonous sequences in Escherichia coli. These problems are inherent in sequencing large complex molecules using any method. Such problems may be minimized by the choice of libraries and number of subclones used for hybridization. Alternatively, such difficulties may be overcome through the use of amplified target sequences, e.g. by PCR amplification, ligation reactions, ligation-amplification reactions, etc.
Nucleic acids and methods for isolating, cloning and sequencing nucleic acids are well known to those of skill in the art. See e.g., Ausubel et al., Current Protocols in Molecular Biology, Vol. 1-2, John Wiley & Sons (1989); and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Press (1989), both of which are incorporated by reference herein.
SBH is a well developed technology that may be practiced by a number of methods known to those skilled in the art. Specifically, the techniques related to sequencing by hybridization in the following documents are incorporated by reference herein: Drmanac et al., U.S. Patent No. 5,202,231 (hereby incorporated by reference herein), issued April 13, 1993; Drmanac et al., Genomics, 4, 114-128 (1989); Drmanac et al., Proceedings of the First Int'l. Conf. Electrophoresis Supercomputing Human Genome, Cantor, DR & Lim HA eds, World Scientific Pub. Co., Singapore, 47-59 (1991); Drmanac et al., Science, 260, 1649-1652 (1993); Lehrach et al., Genome Analysis: Genetic and Physical Mapping, 1, 39-81 (1990), Cold Spring Harbor Laboratory Press; Drmanac et al., Nucl. Acids Res., 4691 (1986); Stevanovic et al., Gene, 79, 139 (1989); Panusku et al., Mol. Biol. Evol., 1, 607 (1990); Nizetic et al., Nucl. Acids Res., 19, 182 (1991); Drmanac et al., J. Biomol. Struct. Dyn., 5, 1085 (1991); Hoheisel et al., Mol. Gen., 4, 125-132 (1991); Strezoska et al., Proc. Nat'l. Acad. Sci. (USA), 88, 10089 (1991); Drmanac et al., Nucl. Acids Res., 19, 5839 (1991); and Drmanac et al., Int. J. Genome Res., 1, 59-79 (1992).

Example 16 Determining Sequence from Hybridization Data

Sequence assembly may be interrupted wherever a given overlapping (N-1)-mer is duplicated two or more times. Then either of the two N-mers differing in the last nucleotide may be used in extending the sequence. This branching point limits unambiguous assembly of sequence.
Reassembling the sequence of known oligonucleotides that hybridize to the target nucleic acid to generate the complete sequence of the target nucleic acid may not be accomplished in some cases. This is because some information may be lost if the target nucleic acid is not in fragments of appropriate size in relation to the size of oligonucleotide that is used for hybridizing. The quantity of information lost is proportional to the length of a target being sequenced. However, if sufficiently short targets are used, their sequence may be unambiguously determined.
The probable frequency of duplicated sequences that would interfere with sequence assembly, distributed along a certain length of DNA, may be calculated. This derivation requires the introduction of the definition of a parameter having to do with sequence organization: the sequence subfragment (SF). A sequence subfragment results if any part of the sequence of a target nucleic acid starts and ends with an (N-1)-mer that is repeated two or more times within the target sequence. Thus, subfragments are sequences generated between two points of branching in the process of assembly of the sequences in the method of the invention. The sum of all subfragments is longer than the actual target nucleic acid because of overlapping short ends. Generally, subfragments may not be assembled in a linear order without additional information since they have shared (N-1)-mers at their ends and starts. Different numbers of subfragments are obtained for each nucleic acid target depending on the number of its repeated (N-1)-mers. The number depends on the value of N-1 and the length of the target.
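The subfragment definition above can be sketched in Python as follows. This is a minimal illustration, not the program referred to in this document: the function name `subfragments` is hypothetical, the ends of the target are treated as fragment boundaries (an assumption), and the code only locates branching points without deciding whether the order of the pieces is actually ambiguous.

```python
from collections import Counter

def subfragments(seq: str, n: int) -> list[str]:
    """Split seq at every (N-1)-mer that occurs two or more times.

    Each piece starts at a repeated (N-1)-mer (or at the start of seq) and
    ends at the next repeated (N-1)-mer (or at the end of seq), so that
    neighbouring pieces overlap by N-1 bases.
    """
    k = n - 1
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    # Start positions of all repeated (N-1)-mers: the branching points.
    cuts = [i for i, kmer in enumerate(kmers) if counts[kmer] > 1]
    starts = [0] + cuts
    ends = [c + k for c in cuts] + [len(seq)]
    # Drop degenerate pieces that consist of a repeated (N-1)-mer only.
    return [seq[s:e] for s, e in zip(starts, ends) if e - s > k]

pieces = subfragments("AACGTCCGTA", n=4)   # the 3-mer CGT occurs twice
print(pieces)                              # ['AACGT', 'CGTCCGT', 'CGTA']
print(sum(len(p) for p in pieces))         # 16, longer than the 10-base target
```

In the toy example the summed length of the pieces exceeds the target length because each piece shares its terminal (N-1)-mer with a neighbour.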
Probability calculations can estimate the interrelationship of the two factors. If the ordering of positive N-mers is accomplished by using overlapping sequences of length N-1 or at an average distance of $A_0$, the number of subfragments, $N_{sf}$, of a fragment $L_f$ bases long is given by equation one:

$$N_{sf} = 1 + A_0 \times \sum_{K \ge 2} K \times P(K, L_f)$$

where K is greater than or equal to 2, and $P(K, L_f)$ represents the probability of an N-mer occurring K times on a fragment $L_f$ bases long. Also, a computer program that is able to form subfragments from the content of N-mers for any given sequence is described below in Example 18.
The number of subfragments increases with the increase of lengths of fragments for a given length of probe. Obtained subfragments may not be uniquely ordered among themselves. Although not complete, this information is very useful for comparative sequence analysis and the recognition of functional sequence characteristics. This type of information may be called partial sequence. Another way of obtaining partial sequence is the use of only a subset of oligonucleotide probes of a given length.
There may be relatively good agreement between the sequence predicted according to theory and a computer simulation for a random DNA sequence. For instance, for N-1 = 7 [using an 8-mer or groups of sixteen 10-mers of type 5' (A,T,C,G) B8 (A,T,C,G) 3'] a target nucleic acid of 200 bases will have an average of three subfragments. However, because of the dispersion around the mean, a library of target nucleic acid should have inserts of 500 bp so that less than 1 in 2000 targets have more than three subfragments. Thus, in an ideal case of sequence determination of a long nucleic acid of random sequence, a representative library with sufficiently short inserts of target nucleic acid may be used. For such inserts, it is possible to reconstruct the individual target by the method of the invention. The entire sequence of a large nucleic acid is then obtained by overlapping of the defined individual insert sequences.
To reduce the need for very short fragments, e.g. 50 bases for 8-mer probes, the information contained in the overlapped fragments present in every random DNA fragmentation process, like cloning or random PCR, is used. It is also possible to use pools of short physical nucleic acid fragments. Using 8-mers or 11-mers like 5' (A, T, C, G) N8 (A, T, C, G) 3' for sequencing 1 megabase, instead of needing 20,000 50 bp fragments, only 2,100 samples are sufficient. This number consists of 700 random 7 kb clones (basic library), 1250 pools of 20 clones of 500 bp (subfragments ordering library) and 150 clones from a jumping (or similar) library. The developed algorithm (see Example 18) regenerates sequence using hybridization data of these described samples.

Example 17 Hybridization With Oligonucleotides

Oligonucleotides were either purchased from Genosys Inc., Houston, Texas or made on an Applied Biosystems 381A DNA synthesizer. Most of the probes used were not purified by HPLC or gel electrophoresis. For example, probes were designed to have both a single perfectly complementary target in interferon, an M13 clone containing a 921 bp Eco RI-Bgl II human β1-interferon fragment [Ohno and Taniguchi, Proc. Natl. Acad. Sci. 74: 4370-4374 (1981)], and at least one target with an end base mismatch in the M13 vector itself.
End labelling of oligonucleotides was performed as described [Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1982)] in 10 µl containing T4-polynucleotide kinase (5 units, Amersham), γ32P-ATP (3.3 pM, 10 µCi, Amersham, 3000 Ci/mM) and oligonucleotide (4 pM, 10 ng). Specific activities of the probes were 2.5-5 X 10^9 cpm/nM.
Single stranded DNA (2 to 4 µl in 0.5 M NaOH, 1.5 M NaCl) was spotted on a Gene Screen membrane wetted with the same solution; the filters were neutralized in 0.05 M Na2HPO4 pH 6.5, baked in an oven at 80°C for 60 min. and UV irradiated for 1 min. Then, the filters were incubated in hybridization solution (0.5 M Na2HPO4 pH 7.2, 7% sodium lauroyl sarcosine) for 5 min at room temperature and placed on the surface of a plastic Petri dish. A drop of hybridization solution (10 µl, 0.5 M Na2HPO4 pH 7.2, 7% sodium lauroyl sarcosine) with a 32P end labelled oligomer probe at 4 nM concentration was placed over 1-6 dots per filter, overlaid with a square piece of polyethylene (approximately 1 X 1 cm), and incubated in a moist chamber at the indicated temperatures for 3 hr. Hybridization was stopped by placing the filter in 6X SSC washing solution for 3 X 5 minutes at 0°C to remove unhybridized probe. The filter was either dried, or further washed for the indicated times and temperatures, and autoradiographed. For discrimination measurements, the dots were excised from the dried filters after autoradiography [a phosphoimager (Molecular Dynamics, Sunnyvale, California) may be used], placed in liquid scintillation cocktail and counted.
The uncorrected ratio of cpms for IF and M13 dots is given as D.
The conditions reported herein allow hybridization with very short oligonucleotides but ensure discrimination between matched and mismatched oligonucleotides that are complementary to and therefore bind to a target nucleic acid. Factors which influence the efficient detection of hybridization of specific short sequences based on the degree of discrimination (D) between a perfectly complementary target and an imperfectly complementary target with a single mismatch in the hybrid are defined. In experimental tests, dot blot hybridization of twenty-eight probes that were 6 to 8 nucleotides in length to two M13 clones or to model oligonucleotides bound to membrane filters was accomplished. The principles guiding the experimental procedures are given below.
Oligonucleotide hybridization to filter bound target nucleic acids only a few nucleotides longer than the probe, in conditions of probe excess, is a pseudo-first order reaction with respect to target concentration. This reaction is defined by:

$$S_t / S_0 = e^{-k_h [OP] t}$$

wherein $S_t$ and $S_0$ are target sequence concentrations at time t and $t_0$, respectively, [OP] is probe concentration and t is time. The rate constant for hybrid formation, $k_h$, increases only slightly in the 0°C to 30°C range [Porschke and Eigen, J. Mol. Biol. 62: 361 (1971); Craig et al., J. Mol. Biol. 62: 383 (1971)]. Hybrid melting is a first order reaction with respect to hybrid concentration (here replaced by mass due to the filter bound state) as shown in:

$$H_t / H_0 = e^{-k_m t}$$

In this equation, $H_t$ and $H_0$ are hybrid concentrations at times t and $t_0$, respectively; $k_m$ is a rate constant for hybrid melting which is dependent on temperature and salt concentration [Ikuta et al., Nucl. Acids Res. 15: 797 (1987); Porschke and Eigen, J. Mol. Biol. 62: 361 (1971); Craig et al., J. Mol. Biol. 62: 383 (1971)]. During hybridization, which is a strand association process, the back reaction (melting, or strand dissociation) takes place as well. Thus, the amount of hybrid formed in time is the result of forward and back reactions. The equilibrium may be moved towards hybrid formation by increasing probe concentration and/or decreasing temperature. However, during washing cycles in large volumes of buffer, the melting reaction is dominant and the hybridization reaction is insignificant, since the probe is absent. This analysis indicates that workable Short Oligonucleotide Hybridization (SOH) conditions can be varied for probe concentration or temperature.
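The two rate laws above can be evaluated numerically. The following Python sketch is illustrative only: the function names and every numeric value (rate constants, probe concentration, times) are hypothetical and are not taken from the experiments described here.

```python
import math

def fraction_target_free(k_h: float, probe_conc: float, t: float) -> float:
    """S_t / S_0 for pseudo-first order hybrid formation in probe excess."""
    return math.exp(-k_h * probe_conc * t)

def fraction_hybrid_remaining(k_m: float, t: float) -> float:
    """H_t / H_0 for first order hybrid melting during washing."""
    return math.exp(-k_m * t)

# Hypothetical values: a tenfold higher probe concentration leaves less
# target unhybridized after the same incubation time.
low = fraction_target_free(k_h=0.001, probe_conc=4.0, t=180.0)
high = fraction_target_free(k_h=0.001, probe_conc=40.0, t=180.0)
print(low > high)

# A melting rate constant of ln(2)/10 per minute corresponds to a hybrid
# half-life of 10 minutes during the wash.
print(fraction_hybrid_remaining(k_m=math.log(2) / 10, t=10.0))
```

The sketch treats formation and melting separately, mirroring the text: formation dominates while probe is present, melting dominates during washing.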
D or discrimination is defined in equation four:

$$D = H_p(t_w) / H_i(t_w)$$

$H_p(t_w)$ and $H_i(t_w)$ are the amounts of hybrids remaining after a washing time, $t_w$, for the identical amounts of perfectly and imperfectly complementary duplex, respectively. For a given temperature, the discrimination D changes with the length of washing time and reaches the maximal value when

$$H_i = B$$

which is equation five.

The background, B, represents the lowest hybridization signal detectable in the system. Since any further decrease of $H_i$ may not be examined, D increases upon continued washing only until this point. Washing past $t_w$ just decreases $H_p$ relative to B, and is seen as a decrease in D. The optimal washing time, $t_w$, for imperfect hybrids, from equation three and equation five is:

$$t_w = -\ln(B / H_i(t_0)) / k_{m,i}$$

Since $H_p$ is being washed for the same $t_w$, combining equations, one obtains the optimal discrimination function:

$$D = e^{\ln(B / H_i(t_0)) \, k_{m,p} / k_{m,i}} \times H_p(t_0) / B$$

The change of D as a function of T is important because of the choice of an optimal washing temperature. It is obtained by substituting the Arrhenius equation, which is:

$$k_m = A e^{-E_a / RT}$$

into the previous equation to form the final equation:

$$D = H_p(t_0)/B \times (B / H_i(t_0))^{(A_p / A_i) \, e^{(E_{a,i} - E_{a,p}) / RT}}$$

wherein B is less than $H_i(t_0)$.
Since the activation energy for perfect hybrids, $E_{a,p}$, and the activation energy for imperfect hybrids, $E_{a,i}$, can be either equal, or $E_{a,i}$ less than $E_{a,p}$, D is temperature independent, or decreases with increasing temperature, respectively. This result implies that the search for stringent temperature conditions for good discrimination in SOH is unjustified. By washing at lower temperatures, one obtains equal or better discrimination, but the time of washing exponentially increases with the decrease of temperature.
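The temperature behaviour described here can be checked with a small numerical sketch of the optimal washing time and optimal discrimination function given above. All parameter values (pre-exponential factors, activation energies, starting hybrid amounts, background) are hypothetical and chosen only to make the trend visible; the function names are likewise illustrative.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def melt_rate(A: float, Ea: float, T: float) -> float:
    """Arrhenius melting rate constant k_m = A * exp(-Ea / (R*T))."""
    return A * math.exp(-Ea / (R * T))

def optimal_wash(Hp0: float, Hi0: float, B: float,
                 k_mp: float, k_mi: float) -> tuple[float, float]:
    """Return (t_w, D): wash until the imperfect hybrid reaches background B."""
    t_w = -math.log(B / Hi0) / k_mi
    D = (Hp0 / B) * (B / Hi0) ** (k_mp / k_mi)
    return t_w, D

def wash_at(T: float, Ea_p: float = 45.0, Ea_i: float = 44.0) -> tuple[float, float]:
    """Hypothetical system: equal A for both hybrids, Hp0 = Hi0 = 100, B = 1."""
    k_mp = melt_rate(1e33, Ea_p, T)
    k_mi = melt_rate(1e33, Ea_i, T)
    return optimal_wash(100.0, 100.0, 1.0, k_mp, k_mi)

t_cold, D_cold = wash_at(280.0)   # about 7 degrees C
t_warm, D_warm = wash_at(298.0)   # about 25 degrees C
print(D_cold > D_warm)            # better discrimination at lower temperature
print(t_cold > t_warm)            # at the cost of a longer wash
```

With equal activation energies for both hybrids the exponent becomes temperature independent, so the same sketch returns the same D at both temperatures.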

Discrimination more strongly decreases with T if $H_i(t_0)$ increases relative to $H_p(t_0)$.
D at lower temperatures depends to a higher degree on the $H_p(t_0)/B$ ratio than on the $H_p(t_0)/H_i(t_0)$ ratio. This result indicates that it is better to obtain a sufficient quantity of $H_p$ in the hybridization regardless of the discrimination that can be achieved in this step. Better discrimination can then be obtained by washing, since the higher amounts of perfect hybrid allow more time for differential melting to show an effect. Similarly, using larger amounts of target nucleic acid, a necessary discrimination can be obtained even with small differences between $k_{m,p}$ and $k_{m,i}$.
Extrapolated to a more complex situation than covered in this simple model, the result is that washing at lower temperatures is even more important for obtaining discrimination in the case of hybridization of a probe having many end-mismatches within a given nucleic acid target.
Using the described theoretical principles as a guide for experiments, reliable hybridizations have been obtained with probes six to eight nucleotides in length. All experiments were performed with a floating plastic sheet providing a film of hybridization solution above the filter. This procedure allows maximal reduction in the amount of probe, and thus reduced label costs in dot blot hybridizations. The high concentration of sodium lauroyl sarcosine instead of sodium lauroyl sulfate in the phosphate hybridization buffer allows dropping the reaction from room temperature down to 12°C. Similarly, the 4-6 X SSC, 10% sodium lauroyl sarcosine buffer allows hybridization at temperatures as low as 2°C. The detergent in these buffers is for obtaining tolerable background with up to 40 nM concentrations of labelled probe. Preliminary characterization of the thermal stability of short oligonucleotide hybrids was determined on a prototype octamer with 50% G+C content, i.e. a probe of sequence TGCTCATG. The theoretical expectation is that this probe is among the less stable octamers. Its transition enthalpy is similar to those of more stable heptamers or even to probes 6 nucleotides in length [Breslauer et al., Proc. Natl. Acad. Sci. U.S.A. 83: 3746 (1986)]. The parameter Td, the temperature at which 50% of the hybrid is melted in unit time of a minute, is 18°C. The result shows that Td is 15°C lower for the 8 bp hybrid than for an 11 bp duplex [Wallace et al., Nucleic Acids Res. 6: 3543 (1979)].
In addition to experiments with model oligonucleotides, an M13 vector was chosen as a system for a practical demonstration of short oligonucleotide hybridization. The main aim was to show useful end-mismatch discrimination with a target similar to the ones which will be used in various applications of the method of the invention. Oligonucleotide probes for the M13 model were chosen in such a way that the M13 vector itself contains the end mismatched base. Vector IF, an M13 recombinant containing a 921 bp human interferon gene insert, carries a single perfectly matched target. Thus, IF has either the identical or a higher number of mismatched targets in comparison to the M13 vector itself.
Using low temperature conditions and dot blots, sufficient differences in hybridization signals were obtained between the dot containing the perfect and the mismatched targets and the dot containing the mismatched targets only. This was true for the 6-mer oligonucleotides and was also true for the 7 and 8-mer oligonucleotides hybridized to the large IF-M13 pair of nucleic acids.
The hybridization signal depends on the amount of target available on the filter for reaction with the probe. A necessary control is to show that the difference in signal intensity is not a reflection of varying amounts of nucleic acid in the two dots. Hybridization with a probe that has the same number and kind of targets in both IF and M13 shows that there is an equal amount of DNA in the dots. Since the efficiency of hybrid formation increases with hybrid length, the signal for a duplex having six nucleotides was best detected with a high mass of oligonucleotide target bound to the filter. Due to their lower molecular weight, a larger number of oligonucleotide target molecules can be bound to a given surface area when compared to large molecules of nucleic acid that serve as target.
To measure the sensitivity of detection with unpurified DNA, various amounts of phage supernatants were spotted on the filter and hybridized with a 32P-labelled octamer. As little as 50 million unpurified phage containing no more than 0.5 ng of DNA gave a detectable signal, indicating that the sensitivity of the short oligonucleotide hybridization method is sufficient. Reaction time is short, adding to the practicality.
As mentioned in the theoretical section above, the equilibrium yield of hybrid depends on probe concentration and/or temperature of reaction. For instance, the signal level for the same amount of target with 4 nM octamer at 13°C is 3 times lower than with a probe concentration of 40 nM, and is decreased 4.5-times by raising the hybridization temperature to 25°C.
The utility of the low temperature wash for achieving maximal discrimination is demonstrated. To make the phenomenon visually obvious, 50 times more DNA was put in the M13 dot than in the IF dot, using hybridization with a vector specific probe. In this way, the signal after the hybridization step with the actual probe was made stronger in the mismatched than in the matched case. The Hp/Hi ratio was 1:4. Inversion of signal intensities after prolonged washing at 7°C was achieved without a massive loss of perfect hybrid, resulting in a ratio of 2:1. In contrast, it is impossible to achieve any discrimination at 25°C, since the matched target signal is already brought down to the background level with 2 minute washing; at the same time, the signal from the mismatched hybrid is still detectable. The loss of discrimination at 13°C compared to 7°C is not so great but is clearly visible. If one considers the 90 minute point at 7°C and the 15 minute point at 13°C, when the mismatched hybrid signal is near the background level, which represent optimal washing times for the respective conditions, it is obvious that the amount of perfect hybrid remaining is several times greater at 7°C than at 13°C. To illustrate this further, the time course of the change of discrimination with washing of the same amount of starting hybrid at the two temperatures shows the higher maximal D at the lower temperature. These results confirm the trend in the change of D with temperature and the ratio of amounts of the two types of hybrid at the start of the washing step.
In order to show the general utility of the short oligonucleotide hybridization conditions, we have looked at hybridization of 4 heptamers, 10 octamers and an additional 14 probes up to 12 nucleotides in length in our simple M13 system. These include the nonamer GTTTTTTAA and octamer GGCAGGCG, representing the two extremes of GC content. Although GC content and sequence are expected to influence the stability of short hybrids [Breslauer et al., Proc. Natl. Acad. Sci. U.S.A. 83: 3746 (1986)], the low temperature short oligonucleotide conditions were applicable to all tested probes in achieving sufficient discrimination. Since the best discrimination value obtained with probes 13 nucleotides in length was 20, a several fold drop due to sequence variation is easily tolerated.
The M13 system has the advantage of showing the effects of target DNA complexity on the levels of discrimination. For two octamers having either none or five mismatched targets and differing in only one GC pair, the observed discriminations were 18.3 and 1.7, respectively.
In order to show the utility of this method, three probes 8 nucleotides in length were tested on a collection of 51 plasmid DNA dots made from a library in Bluescript vector. One probe was present and specific for Bluescript vector but was absent in M13, while the other two probes had targets that were inserts of known sequence. This system allowed the use of hybridization negative or positive control DNAs with each probe. This probe sequence (CTCCCTTT) also had a complementary target in the interferon insert. Since the M13 dot is negative while the interferon insert in either M13 or Bluescript was positive, the hybridization is sequence specific. Similarly, probes were tested that detect the target sequence in only one of 51 inserts, or in none of the examined inserts, along with controls that confirm that hybridization would have occurred if the appropriate targets were present in the clones.
Thermal stability curves for very short oligonucleotide hybrids that are 6-8 nucleotides in length are at least 15°C lower than for hybrids 11-12 nucleotides in length [Fig. 1 and Wallace et al., Nucleic Acids Res. 6: 3543-3557 (1979)]. However, performing the hybridization reaction at a low temperature and with a very practical 0.4-40 nM concentration of oligonucleotide probe allows the detection of complementary sequence in a known or unknown nucleic acid target. To determine an unknown nucleic acid sequence completely, an entire set containing 65,536 8-mer probes may be used. Sufficient amounts of nucleic acid for this purpose are present in convenient biological samples such as a few microliters of M13 culture, a plasmid prep from 10 ml of bacterial culture or a single colony of bacteria, or less than 1 µl of a standard PCR reaction.
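The size of a complete probe set follows from simple counting: with four bases there are 4^8 = 65,536 distinct octamers. A minimal Python sketch that enumerates such a set is shown below; the helper name `all_probes` is hypothetical and used only for illustration.

```python
from itertools import product

def all_probes(length: int, bases: str = "ACGT"):
    """Lazily yield every possible probe sequence of the given length."""
    return ("".join(p) for p in product(bases, repeat=length))

n_octamers = sum(1 for _ in all_probes(8))
print(n_octamers)            # 65536
print(n_octamers == 4 ** 8)  # True
```

The generator avoids holding the whole set in memory, which matters little for octamers but grows quickly for longer probes.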
Short oligonucleotides 6-10 nucleotides long give excellent discrimination. The relative decrease in hybrid stability with a single end mismatch is greater than for longer probes. Results with the octamer TGCTCATG support this conclusion. In the experiments, the mismatched target carried a G/T end mismatch; hybridization to a target with this type of mismatch is the most stable of all mismatch types. The discrimination achieved is the same as or greater than that for an internal G/T mismatch in a 19 base pair duplex [Ikuta et al., Nucl. Acids Res. 15: 797 (1987)]. Exploiting these discrimination properties using the described hybridization conditions for short oligonucleotide hybridization allows a very precise determination of oligonucleotide targets.
In contrast to the ease of detecting discrimination between perfect and imperfect hybrids, a problem that may exist with using very short oligonucleotides is the preparation of sufficient amounts of hybrids. In practice, the need to discriminate Hp and Hi is aided by increasing the amount of DNA in the dot and/or the probe concentration, or by decreasing the hybridization temperature. However, higher probe concentrations usually increase background. Moreover, there are limits to the amounts of target nucleic acid that are practical to use. This problem was solved by the higher concentration of the detergent Sarcosyl, which gave an effective background with 4 nM of probe. Further improvements may be effected either in the use of competitors for unspecific binding of probe to filter, or by changing the hybridization support material. Moreover, for probes having Ea less than 45 Kcal/mol (e.g. for many heptamers and a majority of hexamers), modified oligonucleotides give a more stable hybrid [Asseline et al., Proc. Nat'l Acad. Sci. 81: 3297 (1984)] than their unmodified counterparts. The hybridization conditions described in this invention for short oligonucleotide hybridization using low temperatures give better discrimination for all sequences and duplex hybrid inputs. The only price paid in achieving uniformity in hybridization conditions for different sequences is an increase in washing time from minutes to up to 24 hours, depending on the sequence. Moreover, the washing time can be further reduced by decreasing the salt concentration.
Although there is excellent discrimination of one matched hybrid over mismatched hybrids in short oligonucleotide hybridization, signals from mismatched hybrids exist, with the majority of the mismatched hybrids resulting from end mismatches. This may limit insert sizes that may be effectively examined by a probe of a certain length.
The influence of sequence complexity on discrimination cannot be ignored. However, the complexity effects are more significant when defining sequence information by short oligonucleotide hybridization for specific, nonrandom sequences, and can be overcome by using an appropriate probe to target length ratio. The length ratio is chosen to make unlikely, on statistical grounds, the occurrence of specific sequences which have a number of end-mismatches which would be able to eliminate or falsely invert discrimination. Results suggest the use of oligonucleotides 6, 7, and 8 nucleotides in length on target nucleic acid inserts shorter than 0.6, 2.5, and 10 kb, respectively.
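One way to read these suggested limits, under an assumption that is not stated in the source, is through the expected number of chance occurrences of one specific probe-length sequence on a single strand of a random insert with equal base frequencies. The Python sketch below computes that expectation for the three probe-length and insert-length pairs given above; the function name is hypothetical.

```python
def expected_chance_sites(insert_length_bp: int, probe_length: int) -> float:
    """Expected occurrences of one specific sequence of probe_length bases.

    Illustrative model only: random sequence, equal base frequencies,
    one strand, every start position counted independently.
    """
    positions = insert_length_bp - probe_length + 1
    return positions / 4 ** probe_length

for probe_length, insert_bp in [(6, 600), (7, 2500), (8, 10000)]:
    print(probe_length, insert_bp,
          round(expected_chance_sites(insert_bp, probe_length), 3))
```

Under this simplified model the three pairs give a similar expectation of roughly 0.15 chance sites per insert, which is consistent with the length ratio being chosen on statistical grounds.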

Example 18 Sequencing a Target Using Octamers and Nonamers

In this example, the hybridization conditions that were used are described supra in Example 17. Data resulting from the hybridization of octamer and nonamer oligonucleotides shows that sequencing by hybridization provides an extremely high degree of accuracy. In this experiment a known sequence was used to predict a series of contiguous overlapping component octamer and nonamer oligonucleotides.
In addition to the perfectly matching oligonucleotides, mismatch oligonucleotides wherein internal or end mismatches occur in the duplex formed by the oligonucleotide and the target were examined. In these analyses, the lowest practical temperature was used to maximize hybridization formation. Washes were accomplished at the same or lower temperatures to ensure maximal discrimination by utilizing the greater dissociation rate of mismatched versus matched oligonucleotide/target hybrids. These conditions are shown to be applicable to all sequences, although the absolute hybridization yield is shown to be sequence dependent. The least destabilizing mismatch that can be postulated is a simple end mismatch, so that the test of sequencing by hybridization is the ability to discriminate perfectly matched oligonucleotide/target duplexes from end-mismatched oligonucleotide/target duplexes.
The discriminative values for 102 of 105 hybridizing oligonucleotides in a dot blot format were greater than 2, allowing a highly accurate generation of the sequence. This system also allowed an analysis of the effect of sequence on hybridization formation and hybridization instability.


One hundred base pairs of a known portion of a human β-interferon gene prepared by PCR, i.e. a 100 bp target sequence, was generated with data resulting from the hybridization of 105 oligonucleotide probes of known sequence to the target nucleic acid. The oligonucleotide probes used included 72 octamer and 21 nonamer oligonucleotides whose sequence was perfectly complementary to the target. The set of 93 probes provided consecutive overlapping frames of the target sequence displaced by one or two bases.
To evaluate the effect of mismatches, hybridization was examined for 12 additional probes that contained at least one end mismatch when hybridized to the 100 bp test target sequence. Also tested was the hybridization of twelve probes with target end-mismatched to four other control nucleic acid sequences chosen so that the 12 oligonucleotides formed perfectly matched duplex hybrids with the four control DNAs. Thus, the hybridization of internal mismatched, end-mismatched and perfectly matched duplex pairs of oligonucleotide and target were evaluated for each oligonucleotide used in the experiment. The effect of absolute DNA target concentration on the hybridization with the test octamer and nonamer oligonucleotides was determined by defining target DNA concentration by detecting hybridization of a different oligonucleotide probe to a single occurrence non-target site within the co-amplified plasmid DNA.
The results of this experiment showed that all oligonucleotides containing perfect matching complementary sequence to the target or control DNA hybridized more strongly than those oligonucleotides having mismatches. To come to this conclusion, we examined Hp and D values for each probe. Hp defines the amount of hybrid duplex formed between a test target and an oligonucleotide probe. By assigning values of between 0 and 10 to the hybridization obtained for the 105 probes, it was apparent that 68.5% of the 105 probes had an Hp greater than 2.


Discrimination (D) values were obtained where D was defined as the ratio of signal intensities between 1) the dot containing a perfect matched duplex formed between test oligonucleotide and target or control nucleic acid and 2) the dot containing a mismatch duplex formed between the same oligonucleotide and a different site within the target or control nucleic acid. Variations in the value of D result from either 1) perturbations in the hybridization efficiency which allows visualization of signal over background, or 2) the type of mismatch found between the test oligonucleotide and the target. The D values obtained in this experiment were between 2 and 40 for 102 of the 105 oligonucleotide probes examined. Calculations of D for the group of 102 oligonucleotides as a whole showed the average D was 10.6.
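The definition of D above can be illustrated with a minimal sketch. The function name, the probe names and the signal intensities below are hypothetical and are not data from the experiment; only the ratio itself and the threshold of 2 come from the text.

```python
def discrimination(perfect_signal: float, mismatch_signal: float) -> float:
    """Return D: signal of the perfect-match dot divided by the mismatch dot."""
    if mismatch_signal <= 0:
        raise ValueError("mismatch signal must be positive")
    return perfect_signal / mismatch_signal


# Hypothetical dot intensities (arbitrary units): (perfect match, mismatch).
dots = {"probe_A": (530.0, 50.0), "probe_B": (120.0, 80.0)}

d_values = {name: discrimination(p, m) for name, (p, m) in dots.items()}

# Probes whose discrimination exceeds 2, the threshold quoted in the text.
reliable = [name for name, d in d_values.items() if d > 2]
```

In this made-up example only probe_A would count toward the group of probes with D greater than 2.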
There were 20 cases where oligonucleotide/target duplexes exhibited an end-mismatch. In five of these, D was greater than 10. The large D value in these cases is most likely due to hybridization destabilization caused by other than the most stable (G/T and G/A) end mismatches. The other possibility is there was an error in the sequence of either the oligonucleotides or the target.
Error in the target for probes with low Hp was excluded as a possibility because such an error would have affected the hybridization of each of the other eight overlapping oligonucleotides. There was no apparent instability due to sequence mismatch for the other overlapping oligonucleotides, indicating the target sequence was correct. Error in the oligonucleotide sequence was excluded as a possibility after the hybridization of seven newly synthesized oligonucleotides was re-examined. Only 1 of the seven oligonucleotides resulted in a better D value. Low hybrid formation values may result from hybrid instability or from an inability to form hybrid duplex. An inability to form hybrid duplexes would result from either 1) self complementarity of the chosen probe or 2) target/target self hybridization. Oligonucleotide/oligonucleotide duplex formation may be favored over oligonucleotide/target hybrid duplex formation if the probe was self-complementary. Similarly, target/target association may be favored if the target was self-complementary or may form internal palindromes. In evaluating these possibilities, it was apparent from probe analysis that the questionable probes did not form hybrids with themselves. Moreover, in examining the contribution of target/target hybridization, it was determined that one of the questionable oligonucleotide probes hybridized inefficiently with two different DNAs containing the same target. The low probability that two different DNAs have a self-complementary region for the same target sequence leads to the conclusion that target/target hybridization did not contribute to low hybridization formation. Thus, these results indicate that hybrid instability and not the inability to form hybrids was the cause of the low hybrid formation observed for specific oligonucleotides. The results also indicate that low hybrid formation is due to the specific sequences of certain oligonucleotides. Moreover, the results indicate that reliable results may be obtained to generate sequences if octamer and nonamer oligonucleotides are used.
These results show that using the methods described, long sequences of any specific target nucleic acid may be generated by maximal and unique overlap of constituent oligonucleotides. Such sequencing methods are dependent on the content of the individual component oligomers regardless of their frequency and their position.
The sequence which is generated using the algorithm described below is of high fidelity. The algorithm tolerates false positive signals from the hybridization dots, as is indicated by the fact that the sequence generated from the 105 hybridization values, which included four less reliable values, was correct. This fidelity in sequencing by hybridization is due to the "all or none" kinetics of short oligonucleotide hybridization and the difference in duplex stability that exists between perfectly matched duplexes and mismatched duplexes. The ratio of duplex stability of matched and end-mismatched duplexes increases with decreasing duplex length. Moreover, binding energy decreases with decreasing duplex length, resulting in a lower hybridization efficiency. However, the results provided show that octamer hybridization allows the balancing of the factors affecting duplex stability and discrimination to produce a highly accurate method of sequencing by hybridization. Results presented in other examples show that oligonucleotides that are 6, 7, or 8 nucleotides can be effectively used to generate reliable sequence on targets that are 0.5 kb (for hexamers), 2 kb (for septamers) and 6 kb (for octamers). The sequence of long fragments may be overlapped to generate a complete genome sequence.
An algorithm to determine sequence by hybridization is described in Example 18.

Example 19
Algorithm

This example describes an algorithm for generation of a long sequence written in a four letter alphabet from constituent k-tuple words in a minimal number of separate, randomly defined fragments of a starting nucleic acid sequence where K is the length of an oligonucleotide probe. The algorithm is primarily intended for use in the sequencing by hybridization (SBH) process. The algorithm is based on subfragments (SF), informative fragments (IF) and the possibility of using pools of physical nucleic sequences for defining informative fragments.
As described, subfragments may be caused by branch points in the assembly process resulting from the repetition of a K-1 oligomer sequence in a target nucleic acid. Subfragments are sequence fragments found between any two repetitive words of the length K-1 that occur in a sequence. Multiple occurrences of K-1 words are the cause of interruption of ordering the overlap of K-words in the process of sequence generation. Interruption leads to a sequence remaining in the form of subfragments. Thus, the unambiguous segments between branching points whose order is not uniquely determined are called sequence subfragments.
Informative fragments are defined as fragments of a sequence that are determined by the nearest ends of overlapped physical sequence fragments.
A certain number of physical fragments may be pooled without losing the possibility of defining informative fragments. The total length of randomly pooled fragments depends on the length of k-tuples that are used in the sequencing process.
The algorithm consists of two main units. The first part is used for generation of subfragments from the set of k-tuples contained in a sequence. Subfragments may be generated within the coding region of physical nucleic acid sequence of certain sizes, or within the informative fragments defined within long nucleic acid sequences. Both types of fragments are members of the basic library. This algorithm does not describe the determination of the content of the k-tuples of the informative fragments of the basic library, i.e. the step of preparation of informative fragments to be used in the sequence generation process.
The second part of the algorithm determines the linear order of obtained subfragments with the purpose of regenerating the complete sequence of the nucleic acid fragments of the basic library. For this purpose a second, ordering library is used, made of randomly pooled fragments of the starting sequence. The algorithm does not include the step of combining sequences of basic fragments to regenerate an entire, megabase plus sequence. This may be accomplished using the link-up of fragments of the basic library which is a prerequisite for informative fragment generation. Alternatively, it may be accomplished after generation of sequences of fragments of the basic library by this algorithm, using search for their overlap, based on the presence of common end-sequences.

The algorithm requires neither knowledge of the number of appearances of a given k-tuple in a nucleic acid sequence of the basic and ordering libraries, nor does it require the information of which k-tuple words are present on the ends of a fragment. The algorithm operates with the mixed content of k-tuples of various length. The concept of the algorithm enables operations with the k-tuple sets that contain false positive and false negative k-tuples. Only in specific cases does the content of the false k-tuples primarily influence the completeness and correctness of the generated sequence. The algorithm may be used for optimization of parameters in simulation experiments, as well as for sequence generation in the actual SBH experiments, e.g. generation of the genomic DNA sequence. In optimization of parameters, the choice of the oligonucleotide probes (k-tuples) for practical and convenient fragments and/or the choice of the optimal lengths and the number of fragments for the defined probes are especially important.
This part of the algorithm has a central role in the process of the generation of the sequence from the content of k-tuples. It is based on the unique ordering of k-tuples by means of maximal overlap. The main obstacles in sequence generation are specific repeated sequences and false positive and/or negative k-tuples. The aim of this part of the algorithm is to obtain the minimal number of the longest possible subfragments, with correct sequence. This part of the algorithm consists of one basic, and several control steps. A two-stage process is necessary since certain information can be used only after generation of all primary subfragments (pSFs).

The main problem of sequence generation is obtaining a repeated sequence from word contents that by definition do not carry information on the number of occurrences of the particular k-tuples. The concept of the entire algorithm depends on the basis on which this problem is solved. In principle, there are two opposite approaches: 1) repeated sequences may be obtained at the beginning, in the process of generation of pSFs, or 2) repeated sequences can be obtained later, in the process of the final ordering of the subfragments. In the first case, pSFs contain an excess of sequences and in the second case, they contain a deficit of sequences. The first approach requires elimination of the excess sequences generated, and the second requires permitting multiple use of some of the subfragments in the process of the final assembling of the sequence.
The difference in the two approaches is in the degree of strictness of the rule of unique overlap of k-tuples. The less severe rule is: k-tuple X is unambiguously maximally overlapped with k-tuple Y if and only if the rightmost k-1 end of k-tuple X is present only on the leftmost end of k-tuple Y. This rule allows the generation of repetitive sequences and the formation of surplus sequences.
A stricter rule which is used in the second approach has an additional caveat: k-tuple X is unambiguously maximally overlapped with k-tuple Y if and only if the rightmost K-1 end of k-tuple X is present only on the leftmost end of k-tuple Y and if the leftmost K-1 end of k-tuple Y is not present on the rightmost end of any other k-tuple. The algorithm based on the stricter rule is simpler, and is described herein.
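The stricter rule can be sketched as a small predicate. This is an illustrative reading of the rule, not the implementation used in the work described; the function name `strict_successor` is hypothetical, and the sketch assumes all k-tuples have the same length and contain no errors.

```python
from typing import Optional, Set


def strict_successor(x: str, ktuples: Set[str]) -> Optional[str]:
    """Return Y if X is unambiguously maximally overlapped with Y, else None."""
    k = len(x)
    suffix = x[1:]  # rightmost k-1 end of X

    # First part of the rule: exactly one k-tuple starts with that k-1 end.
    successors = [y for y in ktuples if y[:k - 1] == suffix]
    if len(successors) != 1:
        return None
    y = successors[0]

    # Stricter part: no k-tuple other than X may end with the
    # leftmost k-1 end of Y.
    rivals = [z for z in ktuples if z != x and z[1:] == y[:k - 1]]
    return None if rivals else y


# Unique overlap: ACG -> CGT.
simple = strict_successor("ACG", {"ACG", "CGT", "GTA"})
# Two possible successors (CGT, CGA): ambiguous under both rules.
branch = strict_successor("ACG", {"ACG", "CGT", "CGA"})
# One successor, but TCG also ends in CG: rejected only by the stricter rule.
rival = strict_successor("ACG", {"ACG", "TCG", "CGT"})
```

The third case is the one that separates the two rules: the less severe rule would accept the overlap, while the stricter rule stops elongation at the repeated k-1 word.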
The process of elongation of a given subfragment is stopped when the right k-1 end of the last k-tuple included is not present on the left end of any k-tuple or is present on two or more k-tuples. If it is present on only one k-tuple the second part of the rule is tested. If in addition there is a k-tuple which differs from the previously included one, the assembly of the given subfragment is terminated only on the first leftmost position. If this additional k-tuple does not exist, the conditions are met for unique k-1 overlap and a given subfragment is extended to the right by one element.

Beside the basic rule, a supplementary one is used to allow the usage of k-tuples of different lengths. The maximal overlap is the length of k-1 of the shorter k-tuple of the overlapping pair. Generation of the pSFs is performed starting from the first k-tuple from the file in which k-tuples are displayed randomly and independently from their order in a nucleic acid sequence. Thus, the first k-tuple in the file is not necessarily on the beginning of the sequence, nor on the start of the particular subfragment. The process of subfragment generation is performed by ordering the k-tuples by means of unique overlap, which is defined by the described rule. Each used k-tuple is erased from the file. At the point when there are no further k-tuples unambiguously overlapping with the last one included, the building of the subfragment is terminated and the buildup of another pSF is started. Since generation of a majority of subfragments does not begin from their actual starts, the formed pSFs are added to the k-tuple file and are considered as a longer k-tuple. Another possibility is to form subfragments going in both directions from the starting k-tuple. The process ends when further overlap, i.e. the extension of any of the subfragments, is not possible.
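The elongation procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the program used by the authors: the function names are hypothetical, a single probe length k is assumed (so the overlap is always k-1), and formed pSFs are put back into the pool and treated as longer k-tuples, as the text describes.

```python
from typing import List, Optional


def unique_successor(word: str, pool: List[str], k: int) -> Optional[str]:
    """Strict rule: the single word starting with word's last k-1 bases,
    provided no other word in the pool ends with the same k-1 bases."""
    end = word[-(k - 1):]
    starts = [w for w in pool if w[:k - 1] == end]
    if len(starts) != 1 or starts[0] == word:
        return None
    rivals = [w for w in pool if w != word and w[-(k - 1):] == end]
    return None if rivals else starts[0]


def primary_subfragments(ktuples: List[str], k: int) -> List[str]:
    """Merge uniquely overlapping words until no further extension is possible."""
    pool = list(dict.fromkeys(ktuples))  # drop duplicates, keep file order
    merged = True
    while merged:
        merged = False
        for word in pool:
            nxt = unique_successor(word, pool, k)
            if nxt is not None:
                # Erase the used words and add the longer word to the file.
                pool = [w for w in pool if w not in (word, nxt)]
                pool.append(word + nxt[k - 1:])
                merged = True
                break
    return sorted(pool)


# 3-tuples of ACGTTGCA in random order: no repeated k-1 word, one subfragment.
no_repeat = primary_subfragments(["GTT", "ACG", "GCA", "TTG", "CGT", "TGC"], 3)
# CG is followed by both T and A: assembly stops at the branch point.
branched = primary_subfragments(["AAC", "ACG", "CGT", "CGA"], 3)
```

Because each merge shortens the pool by one word, the loop always terminates; the input order does not need to follow the order of k-tuples in the sequence.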
The pSFs can be divided in three groups: 1) subfragments of the maximal length and correct sequence in cases of exact k-tuple set; 2) short subfragments, formed due to the use of the maximal and unambiguous overlap rule on the incomplete set, and/or the set with some false positive k-tuples; and 3) pSFs of an incorrect sequence. The incompleteness of the set in 2) is caused by false negative results of a hybridization experiment, as well as by using an incorrect set of k-tuples. The pSFs in 3) are formed due to the false positive and false negative k-tuples and can be: a) misconnected subfragments; b) subfragments with the wrong end; and c) false positive k-tuples which appear as false minimal subfragments.
Considering false positive k-tuples, there is the possibility for the presence of a k-tuple containing more than one wrong base or containing one wrong base somewhere in the middle, as well as the possibility for a k-tuple with a wrong base on the end. Generation of short, erroneous or misconnected subfragments is caused by the latter k-tuples. The k-tuples of the former two kinds represent wrong pSFs with length equal to k-tuple length.


In the case of one false negative k-tuple, pSFs are generated because of the impossibility of maximal overlapping. In the case of the presence of one false positive k-tuple with the wrong base on its leftmost or rightmost end, pSFs are generated because of the impossibility of unambiguous overlapping. When both false positive and false negative k-tuples with a common k-1 sequence are present in the file, pSFs are generated, and one of these pSFs contains the wrong k-tuple at the relevant end.
The process of correcting subfragments with errors in sequence and the linking of unambiguously connected pSFs is performed after subfragment generation and in the process of subfragment ordering. The first step, which consists of cutting the misconnected pSFs and obtaining the final subfragments by unambiguous connection of pSFs, is described below.
There are two approaches for the formation of misconnected subfragments. In the first, a mistake occurs when an erroneous k-tuple appears on the points of assembly of the repeated sequences of lengths k-1. In the second, the repeated sequences are shorter than k-1. These situations can occur in two variants each. In the first variant, one of the repeated sequences represents the end of a fragment. In the second variant, the repeated sequence occurs at any position within the fragment. For the first possibility, the absence of some k-tuples from the file (false negatives) is required to generate a misconnection. The second possibility requires the presence of both false negative and false positive k-tuples in the file. Considering the repetitions of k-1 sequence, the lack of only one k-tuple is sufficient when either end is repeated internally. The lack of two is needed for strictly internal repetition. The reason is that the end of a sequence can be considered informatically as an endless linear array of false negative k-tuples. From the "smaller than k-1 case", only the repeated sequence of the length of k-2, which requires two or three specific erroneous k-tuples, will be considered. It is very likely that these will be the only cases which will be detected in a real experiment, the others being much less frequent.
Recognition of the misconnected subfragments is more strictly defined when a repeated sequence does not appear at the end of the fragment. In this situation, one can detect two further subfragments, one of which contains on its leftmost, and the other on its rightmost end, k-2 sequences which are also present in the misconnected subfragment. When the repeated sequence is on the end of the fragment, there is only one subfragment which contains the k-2 sequence causing the mistake in subfragment formation on its leftmost or rightmost end.
The removal of misconnected subfragments by their cutting is performed according to the common rule: if the leftmost or rightmost sequence of the length of k-2 of any subfragment is present in any other subfragment, the subfragment is to be cut into two subfragments, each of them containing the k-2 sequence. This rule does not cover rarer situations of a repeated end when there are more than one false negative k-tuple on the point of repeated k-1 sequence. Misconnected subfragments of this kind can be recognized by using the information from the overlapped fragments, or informative fragments of both the basic and ordering libraries. In addition, the misconnected subfragment will remain when two or more false negative k-tuples occur on both positions which contain the identical k-1 sequence. This is a very rare situation since it requires at least 4 specific false k-tuples. An additional rule can be introduced to cut these subfragments on sequences of length k if the given sequence can be obtained by combination of sequences shorter than k-2 from the end of one subfragment and the start of another.
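One possible reading of the common cutting rule is sketched below. It is an assumption-laden illustration: the function name `cut_misconnected` is hypothetical, only a single pass is made, only the first internal match in each subfragment is cut, and the rarer situations and the additional rule mentioned above are not handled.

```python
from typing import List


def cut_misconnected(subfragments: List[str], k: int) -> List[str]:
    """Cut a subfragment where it contains, internally, the k-2 end
    sequence of another subfragment; both pieces keep that k-2 sequence."""
    m = k - 2
    result = []
    for i, frag in enumerate(subfragments):
        # Leftmost and rightmost k-2 sequences of all other subfragments.
        motifs = {
            end
            for j, other in enumerate(subfragments)
            if j != i
            for end in (other[:m], other[-m:])
        }
        # Look only at strictly internal positions of this subfragment.
        cut_at = next(
            (p for p in range(1, len(frag) - m) if frag[p:p + m] in motifs),
            None,
        )
        if cut_at is None:
            result.append(frag)
        else:
            result.extend([frag[:cut_at + m], frag[cut_at:]])
    return result


# With k = 5 the tested end sequences have length 3. "CGT", the left end of
# the second subfragment, occurs inside the first one, which is therefore cut.
pieces = cut_misconnected(["AAACGTTTT", "CGTGG"], 5)
```

As the text goes on to note, a rule of this kind trades completeness for accuracy: correctly assembled subfragments that happen to match the pattern are cut as well.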
By strict application of the described rule, some completeness is lost to ensure the accuracy of the output. Some of the subfragments will be cut although they are not misconnected since they fit into the pattern of a misconnected subfragment. There are several situations of this kind. For example, a fragment, beside at least two identical k-1 sequences, contains any k-2 sequence from k-1, or a fragment contains a k-2 sequence repeated at least twice and at least one false negative k-tuple containing the given k-2 sequence in the middle, etc.
The aim of this part of the algorithm is to reduce the number of pSFs to a minimal number of longer subfragments with correct sequence. The generation of unique longer subfragments or a complete sequence is possible in two situations. The first situation concerns the specific order of repeated k-1 words. There are cases in which some or all maximally extended pSFs (the first group of pSFs) can be uniquely ordered. For example, in fragment S-R1-a-R2-b-R1-c-R2-E where S and E are the start and end of a fragment, a, b, and c are different sequences specific to respective subfragments and R1 and R2 are two k-1 sequences that are tandemly repeated, five subfragments are generated (S-R1, R1-a-R2, R2-b-R1, R1-c-R2, and R2-E). They may be ordered in two ways: the original sequence above or S-R1-c-R2-b-R1-a-R2-E. In contrast, in a fragment with the same number and types of repeated sequences but ordered differently, i.e. S-R1-a-R1-b-R2-c-R2-E, there is no other sequence which includes all subfragments. Examples of this type can be recognized only after the process of generation of pSFs. They represent the necessity for two steps in the process of pSF generation. The second situation, the generation of false short subfragments on positions of nonrepeated k-1 sequences when the files contain false negative and/or positive k-tuples, is more important.
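The two examples can be checked by brute force. The sketch below is illustrative only: it represents each subfragment as a tuple of symbolic labels (an assumption made for the example) and enumerates every chain that uses each subfragment exactly once, starts at S and ends at E.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def orderings(subfragments: Sequence[Tuple[str, ...]]) -> List[str]:
    """All chains from 'S' to 'E' that use every subfragment once and join
    neighbours whose end label equals the next start label."""
    found = []
    for perm in permutations(subfragments):
        if perm[0][0] != "S" or perm[-1][-1] != "E":
            continue
        if all(a[-1] == b[0] for a, b in zip(perm, perm[1:])):
            chain = list(perm[0])
            for piece in perm[1:]:
                chain.extend(piece[1:])  # skip the shared repeat label
            found.append("-".join(chain))
    return found


# Subfragments of S-R1-a-R2-b-R1-c-R2-E
first = [("S", "R1"), ("R1", "a", "R2"), ("R2", "b", "R1"),
         ("R1", "c", "R2"), ("R2", "E")]
# Subfragments of S-R1-a-R1-b-R2-c-R2-E
second = [("S", "R1"), ("R1", "a", "R1"), ("R1", "b", "R2"),
          ("R2", "c", "R2"), ("R2", "E")]

first_orders = orderings(first)
second_orders = orderings(second)
```

The first set admits two orderings and the second only one, which is the distinction the paragraph above draws.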
The solution for both pSF groups consists of two parts. First, the false positive k-tuples appearing as the nonexisting minimal subfragments are eliminated. All k-tuple subfragments of length k which do not have an overlap on either end, of the length of longer than k-a on one end and longer than k-b on the other end, are eliminated to enable formation of the maximal number of connections. In our experiments, the values for a and b of 2 and 3, respectively, appeared to be adequate to eliminate a sufficient number of false positive k-tuples.

The merging of subfragments that can be uniquely connected is accomplished in the second step. The rule for connection is: two subfragments may be unambiguously connected if, and only if, the overlapping sequence at the relevant end or start of two subfragments is not present at the start and/or end of any other subfragment. The exception is if one subfragment from the considered pair has the identical beginning and end. In that case connection is permitted, even if there is another subfragment with the same end present in the file. The main problem here is the precise definition of overlapping sequence. The connection is not permitted if the overlapping sequence unique for only one pair of subfragments is shorter than k-2, or if it is k-2 or longer but an additional subfragment exists with the overlapping sequence of any length longer than k-4. Also, both the canonical ends of pSFs and the ends after omitting one (or few) last bases are considered as the overlapping sequences.
After this step some false positive k-tuples (as minimal subfragments) and some subfragments with a wrong end may survive. In addition, in very rare occasions where a certain number of some specific false k-tuples are simultaneously present, an erroneous connection may take place. These cases will be detected and solved in the subfragment ordering process, and in the additional control steps along with the handling of uncut "misconnected" subfragments.

The short subfragments that are obtained are of two kinds. In the common case, these subfragments may be unambiguously connected among themselves because of the distribution of repeated k-1 sequences. This may be done after the process of generation of pSFs and is a good example of the necessity for two steps in the process of pSF generation. In the case of using the file containing false positive and/or false negative k-tuples, short pSFs are obtained on the sites of nonrepeated k-1 sequences. Considering false positive k-tuples, a k-tuple may contain more than one wrong base (or one wrong base somewhere in the middle), or a wrong base on the end. Generation of short and erroneous (or misconnected) subfragments is caused by the latter k-tuples. The k-tuples of the former kind represent wrong pSFs with length equal to k-tuple length.
The aim of the merging pSF part of the algorithm is the reduction of the number of pSFs to the minimal number of longer subfragments with the correct sequence. All k-tuple subfragments that do not have an overlap on either end, of the length of longer than k-a on one, and longer than k-b on the other end, are eliminated to enable the maximal number of connections. In this way, the majority of false positive k-tuples are discarded. The rule for connection is: two subfragments can be unambiguously connected if, and only if, the overlapping sequence of the relevant end or start of two subfragments is not present on the start and/or end of any other subfragment. The exception is a subfragment with the identical beginning and end. In that case connection is permitted, provided that there is another subfragment with the same end present in the file. The main problem here is of precise definition of overlapping sequence. The presence of at least two specific false negative k-tuples on the points of repetition of k-1 or k-2 sequences, as well as combining of the false positive and false negative k-tuples, may destroy or "mask" some overlapping sequences and can produce an unambiguous, but wrong connection of pSFs. To prevent this, completeness must be sacrificed on account of exactness: the connection is not permitted on the end-sequences shorter than k-2, and in the presence of an extra overlapping sequence longer than k-4. The overlapping sequences are defined from the end of the pSFs, or omitting one, or few last bases.
In the very rare situations, with the presence of a certain number of some specific false positive and false negative k-tuples, some subfragments with the wrong end can survive, some false positive k-tuples (as minimal subfragments) can remain, or the erroneous connection can take place. These cases are detected and solved in the subfragments ordering process, and in the additional control steps along with the handling of uncut, misconnected subfragments.

The process of ordering of subfragments is similar to the process of their generation. If one considers subfragments as longer k-tuples, ordering is performed by their unambiguous connection via overlapping ends. The informational basis for unambiguous connection is the division of subfragments generated in fragments of the basic library into groups representing segments of those fragments. The method is analogous to the biochemical solution of this problem based on hybridization with longer oligonucleotides with relevant connecting sequence. The connecting sequences are generated as subfragments using the k-tuple sets of the appropriate segments of basic library fragments. Relevant segments are defined by the fragments of the ordering library that overlap with the respective fragments of the basic library. The shortest segments are informative fragments of the ordering library. The longer ones are several neighboring informative fragments or total overlapping portions of corresponding fragments of the ordering and basic libraries. In order to decrease the number of separate samples, fragments of the ordering library are randomly pooled, and the unique k-tuple content is determined.
By using the large number of fragments in the ordering library very short segments are generated, thus reducing the chance of the multiple appearance of the k-1 sequences which are the reasons for generation of the subfragments. Furthermore, longer segments, consisting of the various regions of the given fragment of the basic library, do not contain some of the repeated k-1 sequences. In every segment a connecting sequence (a connecting subfragment) is generated for a certain pair of the subfragments from the given fragment. The process of ordering consists of three steps: (1) generation of the k-tuple contents of each segment; (2) generation of subfragments in each segment; and (3) connection of the subfragments of the segments. Primary segments are defined as significant intersections and differences of k-tuple contents of a given fragment of the basic library with the k-tuple contents of the pools of the ordering library. Secondary (shorter) segments are defined as intersections and differences of the k-tuple contents of the primary segments.
There is a problem of accumulating both false positive and negative k-tuples in both the differences and intersections. The false negative k-tuples from starting sequences accumulate in the intersections (overlapping parts), as well as false positive k-tuples occurring randomly in both sequences, but not in the relevant overlapping region. On the other hand, the majority of false positives from either of the starting sequences is not taken up into intersections. This is an example of the reduction of experimental errors from individual fragments by using information from fragments overlapping with them. The false k-tuples accumulate in the differences for another reason. The set of false negatives from the original sequences is enlarged for false positives from intersections, and the set of false positives for those k-tuples which are not included in the intersection by error, i.e. are false negative in the intersection. If the starting sequences contain 10% false negative data, the primary and secondary intersections will contain 19% and 28% false negative k-tuples, respectively. On the other hand, a mathematical expectation of 77 false positives may be predicted if the basic fragment and the pools have lengths of 500 bp and 10,000 bp, respectively. However, there is a possibility of recovering most of the "lost" k-tuples and of eliminating most of the false positive k-tuples.
First, one has to determine a basic content of the k-tuples for a given segment as the intersection of a given pair of the k-tuple contents. This is followed by including all k-tuples of the starting k-tuple contents in the intersection, which contain at one end k-1 and at the other end k-+ sequences which occur at the ends of two k-tuples of the basic set. This is done before generation of the differences, thus preventing the accumulation of false positives in that process. Following that, the same type of enlargement of the k-tuple set is applied to differences with the distinction that the borrowing is from the intersections. All borrowed k-tuples are eliminated from the intersection files as false positives.
The intersection, i.e. a set of common k-tuples, is defined for each pair (a basic fragment) X (a pool of ordering library). If the number of k-tuples in the set is significant it is enlarged with the false negatives according to the described rule. The primary difference set is obtained by subtracting from a given basic fragment the obtained intersection set. The false negative k-tuples are appended to the difference set by borrowing from the intersection set according to the described rule and, at the same time, removed from the intersection set as false positive k-tuples. When the basic fragment is longer than the pooled fragments, this difference can represent the two separate segments, which somewhat reduces its utility in further steps. The primary segments are all generated intersections and differences of pairs (a basic fragment) X (a pool of ordering library) containing the significant number of k-tuples. K-tuple sets of secondary segments are obtained by comparison of k-tuple sets of all possible pairs of primary segments. The two differences are defined from each pair which produces the intersection with the significant number of k-tuples. The majority of available information from overlapped fragments is recovered in this step so that there is little to be gained from the third round of forming intersections and differences.
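The formation of primary segments can be sketched with ordinary set operations. This is a simplified illustration: the function names and the `min_size` threshold standing in for a "significant number" of k-tuples are hypothetical, error-free k-tuple sets are assumed, and the enlargement and borrowing steps for false negatives are omitted.

```python
from typing import List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """k-tuple content of a sequence (multiplicity is not recorded)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def primary_segments(basic: Set[str], pools: List[Set[str]],
                     min_size: int) -> List[Set[str]]:
    """Intersections and differences of a basic fragment with each pool,
    keeping only sets with at least min_size k-tuples."""
    segments = []
    for pool in pools:
        common = basic & pool          # intersection: the overlapping part
        if len(common) < min_size:
            continue                   # no significant overlap with this pool
        segments.append(common)
        rest = basic - common          # primary difference set
        if len(rest) >= min_size:
            segments.append(rest)
    return segments


basic = kmers("ACGTTGCAAT", 3)
pools = [kmers("TTGCAATGG", 3), kmers("CCCCCC", 3)]
segments = primary_segments(basic, pools, min_size=3)
```

Secondary segments would be obtained in the same way by applying the intersection and difference operations to pairs of primary segments.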
(2) Generation of the subfragments of the segments is performed identically as described for the fragments of the basic library.
(3) The method of connection of subfragments consists of sequentially determining the correctly linked pairs of subfragments among the subfragments from a given basic library fragment which have some overlapped ends. In the case of 4 relevant subfragments, two of which contain the same beginning and two having the same end, there are 4 different pairs of subfragments that can be connected. In general 2 are correct and 2 are wrong. To find correct ones, the presence of the connecting sequences of each pair is tested in the subfragments generated from all primary and secondary segments for a given basic fragment. The length and the position of the connecting sequence are chosen to avoid interference with sequences which occur by chance. They are k+2 or longer, and include at least one element 2 beside overlapping sequence in both subfragments of a given pair. The connection is permitted only if the two connecting sequences are found and the remaining two do not exist. The two linked subfragments replace former subfragments in the file and the process is cyclically repeated.
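The four-pair test can be illustrated with a short Python sketch. It is a simplified reading of the rule, not the patented procedure: the helper names, the toy sequences, and the choice of two flanking bases on each side of the overlap (`flank=2`) are assumptions made for illustration.

```python
from itertools import product


def connecting_sequence(left, right, overlap, flank=2):
    """Return the overlap plus `flank` bases on each side, as it would read
    if `left` and `right` were joined through their shared overlap."""
    if not (left.endswith(overlap) and right.startswith(overlap)):
        raise ValueError("subfragments do not share the given overlap")
    start = len(overlap)
    return left[-(start + flank):] + right[start:start + flank]


def permitted_links(lefts, rights, overlap, evidence, flank=2):
    """Return the two accepted (left, right) pairs, or [] if undecided.

    `evidence` lists subfragment sequences generated from the segments.
    A link is accepted only if exactly two of the four connecting sequences
    are found and they pair each subfragment exactly once.
    """
    found = [
        (a, b) for a, b in product(lefts, rights)
        if any(connecting_sequence(a, b, overlap, flank) in e for e in evidence)
    ]
    if len(found) != 2:
        return []
    (a1, b1), (a2, b2) = found
    return found if a1 != a2 and b1 != b2 else []


overlap = "ACGTACG"                       # k-1 bases for k = 8
left1, left2 = "TTGGCA" + overlap, "CCATGT" + overlap
right1, right2 = overlap + "GGATCC", overlap + "TTCAGA"
evidence = ["GGCAACGTACGGGAT", "ATGTACGTACGTTCA"]

links = permitted_links([left1, left2], [right1, right2], overlap, evidence)
```

With both evidence sequences present, the sketch accepts left1 with right1 and left2 with right2; with only one of them it returns no link, since the rule requires two connecting sequences to be found.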
Repeated sequences are generated in this step. This means that some subfragments are included in linked subfragments more than once. They will be recognized by finding the relevant connecting sequence which engages one subfragment in connection with two different subfragments. The recognition of misconnected subfragments generated in the processes of building pSFs and merging pSFs into longer subfragments is based on testing whether the sequences of subfragments from a given basic fragment exist in the sequences of subfragments generated in the segments for the fragment. The sequences from an incorrectly connected position will not be found, indicating the misconnected subfragments. Beside the described three steps in ordering of subfragments, some additional control steps or steps applicable to specific sequences will be necessary for the generation of more complete sequence without mistakes.
The determination of which subfragment belongs to which segment is performed by comparison of contents of k-tuples in segments and subfragments. Because of the errors in the k-tuple contents (due to the primary error in pools and statistical errors due to the frequency of occurrences of k-tuples) the exact partitioning of subfragments is impossible. Thus, instead of "all or none" partition, the chance of coming from the given segment (P(sf,s)) is determined for each subfragment. This possibility is the function of the lengths of k-tuples, the lengths of subfragments, the lengths of fragments of ordering library, the size of the pool, and of the percentage of false k-tuples in the file:
P(sf,s) = (Ck - F)/Lsf,
where Lsf is the length of subfragment, Ck is the number of common k-tuples for a given subfragment/segment pair, and F is the parameter that includes relations between lengths of k-tuples, fragments of basic library, the size of the pool, and the error percentage.
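A minimal Python sketch of this score follows. The text does not give a formula for F, so it is passed in as a plain number here; the function names and the toy sequences are likewise assumptions for illustration.

```python
def ktuples(seq, k):
    """Return the set of all overlapping k-tuples in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def membership_score(subfragment, segment_kt, k, F):
    """Evaluate P(sf,s) = (Ck - F) / Lsf for one subfragment/segment pair.

    Ck is counted as the number of distinct k-tuples shared by the
    subfragment and the segment; F is a caller-supplied correction term
    (hypothetical value, not derived here).
    """
    Ck = len(ktuples(subfragment, k) & segment_kt)
    return (Ck - F) / len(subfragment)


k = 4
segment_kt = ktuples("ACGTTGCAAGGCTTAC", k)   # toy segment

inside = membership_score("TGCAAGGC", segment_kt, k, F=1.0)
outside = membership_score("TTTTCCCC", segment_kt, k, F=1.0)
```

The subfragment taken from within the toy segment scores higher than the unrelated one, which is the ordering a thresholded partition would rely on.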
Subfragments attributed to a particular segment are treated as redundant short pSFs and are submitted to a process of unambiguous connection. The definition of unambiguous connection is slightly different in this case, since it is based on a probability that subfragments with overlapping end(s) belong to the segment considered. Besides, the accuracy of unambiguous connection is controlled by following the connection of these subfragments in other segments. After the connection in different segments, all of the obtained subfragments are merged together, shorter subfragments included within longer ones are eliminated, and the remaining ones are submitted to the ordinary connecting process. If the sequence is not regenerated completely, the process of partition and connection of subfragments is repeated with the same or less severe criteria of probability of belonging to the particular segment, followed by unambiguous connection.
Using severe criteria for defining unambiguous overlap, some information is not used. Instead of a complete sequence, several subfragments that define a number of possibilities for a given fragment are obtained. Using less severe criteria an accurate and complete sequence is generated. In a certain number of situations, e.g. an erroneous connection, it is possible to generate a complete, but an incorrect sequence, or to generate "monster" subfragments with no connection among them. Thus, for each fragment of the basic library one obtains: a) several possible solutions where one is correct and b) the most probable correct solution. Also, in a very small number of cases, due to the mistake in the subfragment generation process or due to the specific ratio of the probabilities of belonging, no unambiguous solution is generated or one, the most probable solution. These cases remain as incomplete sequences, or the unambiguous solution is obtained by comparing these data with other, overlapped fragments of basic library.
The described algorithm was tested on a randomly generated, 50 kb sequence, containing 40% GC to simulate the GC content of the human genome. In the middle part of this sequence were inserted various Alu, and some other repetitive sequences, of a total length of about 4 kb. To simulate an in vitro SBH experiment, the following operations were performed to prepare appropriate data.
- Positions of sixty 5 kb overlapping "clones" were randomly defined, to simulate preparation of a basic library.
- Positions of one thousand 500 bp "clones" were randomly determined to simulate making the ordering library. These fragments were extracted from the sequence. Random pools of 20 fragments were made, and k-tuple sets of pools were determined and stored on the hard disk. These data are used in the subfragment ordering phase.
For the same density of clones, 4 million clones in basic library and 3 million clones in ordering library are used for the entire human genome. The total number of 7 million clones is several fold smaller than the number of clones a few kb long for random cloning of almost all of genomic DNA and sequencing by a gel-based method. From the data on the starts and ends of 5 kb fragments, 117 "informative fragments" were determined to be in the sequence. This was followed by determination of sets of overlapping k-tuples of which the single "informative fragment" consists. Only the subset of k-tuples matching a predetermined list was used. The list contained 65% 8-mers, 30% 9-mers, and 5% 10-12-mers. Processes of generation and the ordering of subfragments were performed on these data.
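The data preparation described above can be approximated with a short Python sketch. It is not the original simulation: the repetitive insert is omitted, a single k of 8 replaces the mixed 8- to 12-mer probe list, and the function names and the random seed are assumptions chosen for illustration.

```python
import random


def random_sequence(length, gc, rng):
    """Random DNA string with expected GC fraction `gc`."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))


def random_clones(seq_len, n, size, rng):
    """Pick n random (start, end) clone positions of a fixed size."""
    starts = (rng.randrange(seq_len - size + 1) for _ in range(n))
    return [(s, s + size) for s in starts]


def ktuples(seq, k):
    """Return the set of all overlapping k-tuples in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


rng = random.Random(0)                      # fixed seed for repeatability
genome = random_sequence(50_000, 0.40, rng)

basic_library = random_clones(len(genome), 60, 5_000, rng)
ordering_library = random_clones(len(genome), 1_000, 500, rng)

# Random pools of 20 ordering-library clones, each reduced to a k-tuple set.
pools = [ordering_library[i:i + 20] for i in range(0, len(ordering_library), 20)]
pool_ktuples = [
    set().union(*(ktuples(genome[a:b], 8) for a, b in pool)) for pool in pools
]
```

Because the clone positions are already drawn at random, slicing the list into consecutive groups of 20 yields random pools without a separate shuffle.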
The testing of the algorithm was performed on the simulated data in two experiments. The sequence of 50 informative fragments was regenerated with the 100% correct data set (over 20,000 bp), and 26 informative fragments (about 10,000 bp) with 10% false k-tuples (5% positive and 5% negative ones).
In the first experiment, all subfragments were correct and in only one out of 50 informative fragments the sequence was not completely regenerated but remained in the form of 5 subfragments. The analysis of positions of overlapped fragments of ordering library has shown that they lack the information for the unique ordering of the 5 subfragments. The subfragments may be connected in two ways based on overlapping ends, 1-2-3-4-5 and 1-4-3-2-5. The only difference is the exchange of positions of subfragments 2 and 4. Since subfragments 2, 3, and 4 are relatively short (total of about 100 bp), the relatively greater chance existed, and occurred in this case, that none of the fragments of ordering library started or ended in the subfragment 3 region.
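The two-way ambiguity can be reproduced with a toy example. In the Python sketch below, the five sequences are invented so that subfragments 2 and 4 share both their beginning and their end; the 3-base overlap length and the helper name are assumptions for illustration only.

```python
from itertools import permutations


def orderings(subfragments, overlap_len, first, last):
    """List every order of all subfragments, from `first` to `last`, in which
    each neighbor pair shares an `overlap_len`-base end and beginning."""
    middle_names = [n for n in subfragments if n not in (first, last)]
    valid = []
    for middle in permutations(middle_names):
        path = (first, *middle, last)
        if all(
            subfragments[a][-overlap_len:] == subfragments[b][:overlap_len]
            for a, b in zip(path, path[1:])
        ):
            valid.append(path)
    return valid


# Toy subfragments: 2 and 4 both start with ACG and end with TCA.
subfragments = {
    1: "GGATTACG",
    2: "ACGTTTCA",
    3: "TCAGGACG",
    4: "ACGCCTCA",
    5: "TCAGATTC",
}
possible = orderings(subfragments, overlap_len=3, first=1, last=5)
```

Overlapping ends alone leave exactly two orders, and only extra information, such as an ordering-library clone ending inside subfragment 3, could distinguish them.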
To simulate real sequencing, some false ("hybridization") data was included as input in a number of experiments. In oligomer hybridization experiments, under proposed conditions, the only situation producing unreliable data is the end mismatch versus full match hybridization. Therefore, in simulation only those k-tuples differing in a single element on either end from the real one were considered to be false positives. These "false" sets are made as follows. On the original set of k-tuples of the informative fragment, a subset of 5% false positive k-tuples is added. False positive k-tuples are made by randomly picking a k-tuple from the set, copying it and altering a nucleotide on its beginning or end. This is followed by subtraction of a subset of 5% randomly chosen k-tuples. In this way the statistically expected number of the most complicated cases is generated, in which the correct k-tuple is replaced with a k-tuple with the wrong base on the end.
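The construction of such "false" sets might be sketched in Python as follows. The function name, the seed, and the decision to reject an altered k-tuple that already belongs to the true set are assumptions; the text does not say how that case was handled.

```python
import random


def add_noise(true_set, rate, rng):
    """Return (noisy_set, added, removed) for a set of k-tuples.

    `rate` of the set size is added as end-mismatch false positives, then
    the same number of randomly chosen true k-tuples is removed.
    """
    true_list = sorted(true_set)          # stable order for reproducibility
    n = round(rate * len(true_list))

    added = set()
    while len(added) < n:
        kt = rng.choice(true_list)
        pos = rng.choice((0, len(kt) - 1))              # first or last base
        base = rng.choice([b for b in "ACGT" if b != kt[pos]])
        mutant = kt[:pos] + base + kt[pos + 1:]
        if mutant not in true_set:                      # assumed rejection rule
            added.add(mutant)

    removed = set(rng.sample(true_list, n))
    return (true_set | added) - removed, added, removed


rng = random.Random(1)
fragment = "".join(rng.choices("ACGT", k=500))          # toy informative fragment
true_kt = {fragment[i:i + 8] for i in range(len(fragment) - 7)}

noisy_kt, false_pos, false_neg = add_noise(true_kt, 0.05, rng)
```

Since additions and removals are drawn independently, the share of false data comes out at up to 10% of the set.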
Production of k-tuple sets as described leads to up to 10% of false data. This value varies from case to case, due to the randomness of choice of k-tuples to be copied, altered, and erased. Nevertheless, this percentage 3-4 times exceeds the amount of unreliable data in real hybridization experiments. The introduced error of 10% leads to the two fold increase in the number of subfragments both in fragments of basic library (basic library informative fragments) and in segments. About 10% of the final subfragments have a wrong base at the end, as expected for the k-tuple set which contains false positives (see generation of primary subfragments). Neither the cases of misconnection of subfragments nor subfragments with the wrong sequence were observed. In 4 informative fragments out of 26 examined in the ordering process the complete sequence was not regenerated. In all 4 cases the sequence was obtained in the form of several longer subfragments and several shorter subfragments contained in the same segment. This result shows that the algorithmic principles allow working with a large percentage of false data.
The success of the generation of the sequence from its k-tuple content may be described in terms of completeness and accuracy. In the process of generation, two particular situations can be defined: 1) Some part of the information is missing in the generated sequence, but one knows where the ambiguities are and to which type they belong, and 2) the regenerated sequence that is obtained does not match the sequence from which the k-tuple content is generated, but the mistake cannot be detected. Assuming the algorithm is developed to its theoretical limits, as in the use of the exact k-tuple sets, only the first situation can take place. There the incompleteness results in a certain number of subfragments that may not be ordered unambiguously and the problem of determination of the exact length of monotonous sequences, i.e. the number of perfect tandem repeats.
With false k-tuples, incorrect sequence may be generated. The reason for mistakes does not lie in the shortcomings of the algorithm, but in the fact that a given content of k-tuples unambiguously represents the sequence that differs from the original one. One may define three classes of error, depending on the kind of the false k-tuples present in the file. False negative k-tuples (which are not accompanied with the false positives) produce "deletions". False positive k-tuples are producing "elongations (unequal crossing over)". False positives accompanied with false negatives are the reason for generation of "insertions", alone or combined with "deletions". The deletions are produced when all of the k-tuples (or their majority) between two possible starts of the subfragments are false negatives. Since every position in the sequence is defined by k k-tuples, the occurrence of the deletions in a common case requires k consecutive false negatives. (With 10% of the false negatives and k=8, this situation takes place after every 10^8 elements). This situation is extremely infrequent even in mammalian genome sequencing using random libraries containing ten genome equivalents.
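The frequency quoted for k consecutive false negatives follows from a one-line calculation. The block below assumes that false negatives occur independently with probability p at each position, which the text implies but does not state.

```latex
P(\text{$k$ consecutive false negatives}) = p^{k} = (0.1)^{8} = 10^{-8}
\quad\Longrightarrow\quad
\text{expected spacing} \approx \frac{1}{p^{k}} = 10^{8} \text{ elements}
```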
Elongation of the end of the sequence caused by false positive k-tuples is the special case of "insertions", since the end of the sequence can be considered as the endless linear array of false negative k-tuples. One may consider a group of false positive k-tuples producing subfragments longer than one k-tuple. Situations of this kind may be detected if subfragments are generated in overlapped fragments, like random physical fragments of the ordering library. An insertion, or insertion in place of a deletion, can arise as a result of specific combinations of false positive and false negative k-tuples. In the first case, the number of consecutive false negatives is smaller than k. Both cases require several overlapping false positive k-tuples. The insertions and deletions are mostly theoretical possibilities without sizable practical repercussions, since the requirements in the number and specificity of false k-tuples are simply too high.
In every other situation of not meeting the theoretical requirement of the minimal number and the kind of the false positive and/or negatives, mistakes in the k-tuples content may produce only the lesser completeness of a generated sequence.

Claims (34)

1. A method for analyzing nucleic acids by hybridization, comprising the steps of:
arraying a first plurality of nucleic acid segments on a first sector of a substrate;
disposing a second plurality of nucleic acid segments on a second sector of said substrate;
exposing, under conditions discriminating between full complementarity and a one base mismatch, said first plurality of nucleic acid segments to a first hybridization probe in said first sector, said first hybridization probe being shorter than one from among said first plurality of nucleic acid segments, to said plurality of nucleic acid segments;
incubating under conditions discriminating between full complementarity and a one base mismatch, a second hybridization probe in said second sector, said second hybridization probe being shorter than a segment from among said second plurality of nucleic acid segments and said second hybridization probe being different in sequence from said first hybridization probe;
detecting hybridization of a hybridization probe to a nucleic acid segment; and analyzing the result.
2. The method as recited in claim 1, further comprising, prior to said disposing step, the step of introducing a barrier to movement of a nucleic acid.
3. The method as recited in claim 1 further comprising, after said arraying and said disposing step but before said incubating step, the step of introducing a barrier to movement of a nucleic acid.
4. The method as recited in claim 3 wherein said introducing step comprises pressing a physical barrier against said substrate.
5. The method as recited in claim 2 wherein said introducing step comprises the step of applying a direction-switching electrical field perpendicular to said support to prevent the mixing of probes between sectors.
6. The method as recited in claim 3 wherein said introducing step comprises the step of applying a direction-switching electrical field perpendicular to said support to prevent the mixing of probes between sectors.
7. The method as recited in claim 1 wherein said arraying step comprises the step of spotting nucleic acid samples by means of a pin array.
8. The method as recited in claim 1 wherein said arraying step comprises the step of dispensing nucleic acid samples by an array of tubes.
9. The method as recited in claim 1 wherein said arraying step comprises the step of jet printing nucleic acid samples.
10. The method as recited in claim 1 wherein said exposing step comprises the step of applying a plurality of contiguously hybridizing probes.
11. The method as recited in claim 1 wherein said incubating step comprises the step of applying a plurality of contiguously hybridizing probes.
12. The method as recited in claim 10 further comprising the step of ligating at least two of said plurality of contiguously hybridizing probes.
13. The method as recited in claim 11 further comprising the step of ligating at least two of said plurality of contiguously hybridizing probes.
14. The method as recited in claim 1 wherein said exposing step comprises the step of applying a plurality of competitively hybridizing probes having overlapping nucleic acid sequences.
15. The method as recited in claim 1 wherein said incubating step comprises the step of applying a plurality of competitively hybridizing probes having overlapping nucleic acid sequences.
16. The method as recited in claim 1 wherein at least two of said first plurality of nucleic acid segments are arrayed as a mixture.
17. The method as recited in claim 1 wherein at least two of said second plurality of nucleic acid segments are disposed as a mixture.
18. The method as recited in claim 1 further comprising the steps of preparing samples by digestion with an Hga 1 type restriction enzyme and ligating the resulting restriction fragments with an anchor.
19. The method as recited in claim 1 further comprising the step of selecting probes from a universal set of probes of a given length.
20. The method as recited in claim 1 further comprising the step of selecting probes from an incomplete set of probes of a given length.
21. The method as recited in claim 1 further comprising the step of selecting deoxyribonucleotide probes.
22. The method as recited in claim 1 further comprising the step of selecting ribonucleotide probes.
23. The method as recited in claim 1 further comprising the step of selecting a nucleic acid analog selected from the group consisting of protein nucleic acid probes and probes containing base analogs.
24. The method as recited in claim 1 further comprising the step of multiplex labelling of probes.
25. The method as recited in claim 1 further comprising the step of degrading a label on an unhybridized probe.
26. The method as recited in claim 19 wherein said exposing or said incubating step comprises the step of assembling a set of universal probes 6, 7, 8, 9 or 10 bases in length.
27. The method as recited in claim 19 wherein said exposing or said incubating step comprises the step of assembling a set of universal probes 6, 7, 8, 9 or 10 bases in length.
28. The method as recited in claim 20 wherein said exposing or said incubating step comprises the step of assembling an incomplete set of probes 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 bases in length.
29. Apparatus analyzing nucleic acids by hybridization comprising a substrate having points of attachment for nucleic acid fragments, said substrate being segmented by hydrophobic regions.
30. The method as recited in claim 20 wherein said disposing step comprises the step of assembling an incomplete set of probes 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 bases in length.
31. The method of claim 1 further comprising the step of confirming the relative order of at least two bases in a segment by detecting hybridization of two or more probes having overlapping nucleic acid sequences including said at least two bases.
32. A method for nucleotide sequence analysis comprising the steps of:
introducing a sample to an array of probes;
adjusting the temperature to be one at which a majority of sample molecules are unassociated with ligated probes at any given time;
adding a labelled probe to the mixture;
incubating the mixture with ligase;
removing free probes; and detecting ligation products.
33. The method as recited in claim 1 further comprising the steps of defining additional probes for improving a desired result and repeating said exposing, incubating, detecting and analyzing steps.
34. The method as recited in claim 1 further comprising the step of stripping the substrate of probes for reuse of said pluralities of nucleic acid segments.
CA002206815A 1994-12-09 1995-12-08 Methods and apparatus for dna sequencing and dna identification Abandoned CA2206815A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US353,554 1994-12-09
US08/353,554 US6270961B1 (en) 1987-04-01 1994-12-09 Methods and apparatus for DNA sequencing and DNA identification

Publications (1)

Publication Number Publication Date
CA2206815A1 true CA2206815A1 (en) 1996-06-13

Family

ID=23389627

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002206815A Abandoned CA2206815A1 (en) 1994-12-09 1995-12-08 Methods and apparatus for dna sequencing and dna identification

Country Status (10)

Country Link
US (4) US6270961B1 (en)
EP (1) EP0797683A4 (en)
JP (1) JPH10512745A (en)
KR (1) KR980700433A (en)
CN (1) CN1175283A (en)
AU (1) AU715506B2 (en)
CA (1) CA2206815A1 (en)
FI (1) FI972429A (en)
NO (1) NO972535L (en)
WO (1) WO1996017957A1 (en)

Families Citing this family (227)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6270961B1 (en) * 1987-04-01 2001-08-07 Hyseq, Inc. Methods and apparatus for DNA sequencing and DNA identification
US6416952B1 (en) 1989-06-07 2002-07-09 Affymetrix, Inc. Photolithographic and other means for manufacturing arrays
US6346413B1 (en) 1989-06-07 2002-02-12 Affymetrix, Inc. Polymer arrays
US5547839A (en) * 1989-06-07 1996-08-20 Affymax Technologies N.V. Sequencing of surface immobilized polymers utilizing microflourescence detection
US6040138A (en) 1995-09-15 2000-03-21 Affymetrix, Inc. Expression monitoring by hybridization to high density oligonucleotide arrays
US5800992A (en) * 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US5925525A (en) * 1989-06-07 1999-07-20 Affymetrix, Inc. Method of identifying nucleotide differences
US5143854A (en) * 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
US6919211B1 (en) * 1989-06-07 2005-07-19 Affymetrix, Inc. Polypeptide arrays
US5744101A (en) * 1989-06-07 1998-04-28 Affymax Technologies N.V. Photolabile nucleoside protecting groups
US6551784B2 (en) 1989-06-07 2003-04-22 Affymetrix Inc Method of comparing nucleic acid sequences
US6955915B2 (en) * 1989-06-07 2005-10-18 Affymetrix, Inc. Apparatus comprising polymers
US6506558B1 (en) 1990-03-07 2003-01-14 Affymetrix Inc. Very large scale immobilized polymer synthesis
DE69132843T2 (en) * 1990-12-06 2002-09-12 Affymetrix Inc N D Ges D Staat Identification of nucleic acids in samples
US6468740B1 (en) 1992-11-05 2002-10-22 Affymetrix, Inc. Cyclic and substituted immobilized molecular synthesis
US6401267B1 (en) * 1993-09-27 2002-06-11 Radoje Drmanac Methods and compositions for efficient nucleic acid sequencing
US6287850B1 (en) * 1995-06-07 2001-09-11 Affymetrix, Inc. Bioarray chip reaction apparatus and its manufacture
US5795716A (en) 1994-10-21 1998-08-18 Chee; Mark S. Computer-aided visualization and analysis system for sequence evaluation
US8236493B2 (en) * 1994-10-21 2012-08-07 Affymetrix, Inc. Methods of enzymatic discrimination enhancement and surface-bound double-stranded DNA
US6312894B1 (en) 1995-04-03 2001-11-06 Epoch Pharmaceuticals, Inc. Hybridization and mismatch discrimination using oligonucleotides conjugated to minor groove binders
US5801155A (en) 1995-04-03 1998-09-01 Epoch Pharmaceuticals, Inc. Covalently linked oligonucleotide minor grove binder conjugates
US6720149B1 (en) * 1995-06-07 2004-04-13 Affymetrix, Inc. Methods for concurrently processing multiple biological chip assays
US6660233B1 (en) * 1996-01-16 2003-12-09 Beckman Coulter, Inc. Analytical biochemistry system with robotically carried bioarray
EP0880598A4 (en) 1996-01-23 2005-02-23 Affymetrix Inc Nucleic acid analysis techniques
US6391550B1 (en) 1996-09-19 2002-05-21 Affymetrix, Inc. Identification of molecular sequence signatures and methods involving the same
WO1998012354A1 (en) 1996-09-19 1998-03-26 Affymetrix, Inc. Identification of molecular sequence signatures and methods involving the same
US6297006B1 (en) * 1997-01-16 2001-10-02 Hyseq, Inc. Methods for sequencing repetitive sequences and for determining the order of sequence subfragments
US20020042048A1 (en) * 1997-01-16 2002-04-11 Radoje Drmanac Methods and compositions for detection or quantification of nucleic acid species
EP0972078B1 (en) 1997-03-20 2005-06-01 Affymetrix, Inc. (a California Corporation) Iterative resequencing
US20030036084A1 (en) * 1997-10-09 2003-02-20 Brian Hauser Nucleic acid detection method employing oligonucleotide probes affixed to particles and related compositions
US6322968B1 (en) 1997-11-21 2001-11-27 Orchid Biosciences, Inc. De novo or “universal” sequencing array
US7715989B2 (en) 1998-04-03 2010-05-11 Elitech Holding B.V. Systems and methods for predicting oligonucleotide melting temperature (TmS)
US6683173B2 (en) 1998-04-03 2004-01-27 Epoch Biosciences, Inc. Tm leveling methods
US6127121A (en) 1998-04-03 2000-10-03 Epoch Pharmaceuticals, Inc. Oligonucleotides containing pyrazolo[3,4-D]pyrimidines for hybridization and mismatch discrimination
US6949367B1 (en) 1998-04-03 2005-09-27 Epoch Pharmaceuticals, Inc. Modified oligonucleotides for mismatch discrimination
US7045610B2 (en) 1998-04-03 2006-05-16 Epoch Biosciences, Inc. Modified oligonucleotides for mismatch discrimination
US7875440B2 (en) 1998-05-01 2011-01-25 Arizona Board Of Regents Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US6780591B2 (en) 1998-05-01 2004-08-24 Arizona Board Of Regents Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US6872521B1 (en) 1998-06-16 2005-03-29 Beckman Coulter, Inc. Polymerase signaling assay
US6703228B1 (en) 1998-09-25 2004-03-09 Massachusetts Institute Of Technology Methods and products related to genotyping and DNA analysis
EP1001037A3 (en) * 1998-09-28 2003-10-01 Whitehead Institute For Biomedical Research Pre-selection and isolation of single nucleotide polymorphisms
AU1204600A (en) * 1998-10-13 2000-05-01 Brown University Research Foundation Systems and methods for sequencing by hybridization
US7034143B1 (en) 1998-10-13 2006-04-25 Brown University Research Foundation Systems and methods for sequencing by hybridization
US7071324B2 (en) 1998-10-13 2006-07-04 Brown University Research Foundation Systems and methods for sequencing by hybridization
US6545264B1 (en) 1998-10-30 2003-04-08 Affymetrix, Inc. Systems and methods for high performance scanning
DE60042775D1 (en) * 1999-01-06 2009-10-01 Callida Genomics Inc IMPROVED SEQUENCING BY HYBRIDIZATION THROUGH THE USE OF PROBABLE MIXTURES
DE60031506T2 (en) 1999-01-08 2007-08-23 Applera Corp., Foster City FASERMATRIX FOR MEASURING CHEMICALS, AND METHOD FOR THE PRODUCTION AND USE THEREOF
US7595189B2 (en) 1999-01-08 2009-09-29 Applied Biosystems, Llc Integrated optics fiber array
WO2000056937A2 (en) * 1999-03-25 2000-09-28 Hyseq, Inc. Solution-based methods and materials for sequence analysis by hybridization
US6516276B1 (en) * 1999-06-18 2003-02-04 Eos Biotechnology, Inc. Method and apparatus for analysis of data from biomolecular arrays
US7501245B2 (en) * 1999-06-28 2009-03-10 Helicos Biosciences Corp. Methods and apparatuses for analyzing polynucleotide sequences
US6818395B1 (en) * 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
US6339147B1 (en) 1999-07-29 2002-01-15 Epoch Biosciences, Inc. Attachment of oligonucleotides to solid supports through Schiff base type linkages for capture and detection of nucleic acids
EP1235932A2 (en) 1999-10-08 2002-09-04 Protogene Laboratories, Inc. Method and apparatus for performing large numbers of reactions using array assembly
JP3668075B2 (en) * 1999-10-12 2005-07-06 光夫 板倉 Suspension system for determining genetic material sequence, method for determining genetic material sequence using the suspension system, and SNPs high-speed scoring method using the suspension system
US7332275B2 (en) * 1999-10-13 2008-02-19 Sequenom, Inc. Methods for detecting methylated nucleotides
US6660845B1 (en) 1999-11-23 2003-12-09 Epoch Biosciences, Inc. Non-aggregating, non-quenching oligomers comprising nucleotide analogues; methods of synthesis and use thereof
US20040081959A9 (en) 1999-12-08 2004-04-29 Epoch Biosciences, Inc. Fluorescent quenching detection reagents and methods
US6727356B1 (en) 1999-12-08 2004-04-27 Epoch Pharmaceuticals, Inc. Fluorescent quenching detection reagents and methods
US7205105B2 (en) 1999-12-08 2007-04-17 Epoch Biosciences, Inc. Real-time linear detection probes: sensitive 5′-minor groove binder-containing probes for PCR analysis
US20010039014A1 (en) * 2000-01-11 2001-11-08 Maxygen, Inc. Integrated systems and methods for diversity generation and screening
EP1944310A3 (en) 2000-03-01 2008-08-06 Epoch Biosciences, Inc. Modified oligonucleotides for mismatch discrimination
JP2003525292A (en) 2000-03-01 2003-08-26 エポック・バイオサイエンシーズ・インコーポレイテッド Modified oligonucleotides for mismatch discrimination
JP3502803B2 (en) * 2000-03-06 2004-03-02 日立ソフトウエアエンジニアリング株式会社 Microarray, method for producing microarray, and method for correcting spot amount error between pins in microarray
CA2410950A1 (en) * 2000-05-30 2001-12-06 Hans-Michael Wenz Methods for detecting target nucleic acids using coupled ligation and amplification
KR100865664B1 (en) * 2000-06-14 2008-10-29 비스타겐 인코포레이티드 Toxicity typing using liver stem cells
US7846733B2 (en) 2000-06-26 2010-12-07 Nugen Technologies, Inc. Methods and compositions for transcription-based nucleic acid amplification
US6913879B1 (en) 2000-07-10 2005-07-05 Telechem International Inc. Microarray method of genotyping multiple samples at multiple LOCI
US6984522B2 (en) 2000-08-03 2006-01-10 Regents Of The University Of Michigan Isolation and use of solid tumor stem cells
US6681186B1 (en) 2000-09-08 2004-01-20 Paracel, Inc. System and method for improving the accuracy of DNA sequencing and error probability estimation through application of a mathematical model to the analysis of electropherograms
US6858413B2 (en) 2000-12-13 2005-02-22 Nugen Technologies, Inc. Methods and compositions for generation of multiple copies of nucleic acid sequences and methods of detection thereof
US7030292B2 (en) * 2001-01-02 2006-04-18 Stemron, Inc. Method for producing a population of homozygous stem cells having a pre-selected immunotype and/or genotype, cells suitable for transplant derived therefrom, and materials and methods using same
WO2003012147A1 (en) * 2001-02-20 2003-02-13 Datascope Investment Corp. Method for reusing standard blots and microarrays utilizing dna dendrimer technology
BR0205268A (en) 2001-03-09 2004-11-30 Nugen Technologies Inc Processes and compositions for mRNA sequence amplification
CA2440754A1 (en) * 2001-03-12 2002-09-19 Stephen Quake Methods and apparatus for analyzing polynucleotide sequences by asynchronous base extension
DE10120798B4 (en) * 2001-04-27 2005-12-29 Genovoxx Gmbh Method for determining gene expression
WO2002090599A1 (en) * 2001-05-09 2002-11-14 Genetic Id, Inc. Universal microarray system
US20030036073A1 (en) * 2001-06-07 2003-02-20 Saba James Anthony Matrix Sequencing: a novel method of polynucleotide analysis utilizing probes containing universal nucleotides
US6767731B2 (en) 2001-08-27 2004-07-27 Intel Corporation Electron induced fluorescent method for nucleic acid sequencing
WO2003020898A2 (en) * 2001-08-30 2003-03-13 Spectral Genomics, Inc. Arrays comprising pre-labeled biological molecules and methods for making and using these arrays
JP2005504275A (en) * 2001-09-18 2005-02-10 ユー.エス. ジェノミクス, インコーポレイテッド Differential tagging of polymers for high-resolution linear analysis
US20030170678A1 (en) * 2001-10-25 2003-09-11 Neurogenetics, Inc. Genetic markers for Alzheimer's disease and methods using the same
WO2003054143A2 (en) * 2001-10-25 2003-07-03 Neurogenetics, Inc. Genes and polymorphisms on chromosome 10 associated with alzheimer's disease and other neurodegenerative diseases
US20030224380A1 (en) * 2001-10-25 2003-12-04 The General Hospital Corporation Genes and polymorphisms on chromosome 10 associated with Alzheimer's disease and other neurodegenerative diseases
GB0202462D0 (en) * 2002-02-04 2002-03-20 Tepnel Medical Ltd Nucleic acid analysis
WO2003093296A2 (en) * 2002-05-03 2003-11-13 Sequenom, Inc. Kinase anchor protein muteins, peptides thereof, and related methods
EP1573056A4 (en) * 2002-05-17 2007-11-28 Nugen Technologies Inc Methods for fragmentation, labeling and immobilization of nucleic acids
DE10224824A1 (en) * 2002-06-05 2003-12-24 Eppendorf Ag Analysis of target nucleic acid, useful particularly for detecting polymorphisms, uses at least two hybridization probes, with different labels and different binding strengths
EP2385139A1 (en) 2002-07-31 2011-11-09 University of Southern California Polymorphisms for predicting disease and treatment outcome
US20040235005A1 (en) * 2002-10-23 2004-11-25 Ernest Friedlander Methods and composition for detecting targets
US6641899B1 (en) * 2002-11-05 2003-11-04 International Business Machines Corporation Nonlithographic method to produce masks by selective reaction, articles produced, and composition for same
CN103397082B (en) * 2003-02-26 2017-05-31 考利达基因组股份有限公司 The random array DNA analysis carried out by hybridization
US20070141570A1 (en) * 2003-03-07 2007-06-21 Sequenom, Inc. Association of polymorphic kinase anchor proteins with cardiac phenotypes and related methods
FR2852317B1 (en) 2003-03-13 2006-08-04 PROBE BIOCHIPS AND METHODS OF USE
JP2006520199A (en) 2003-03-19 2006-09-07 ザ ユニバーシティ オブ ブリティッシュ コロンビア Plasminogen activator inhibitor-1 (PAI-1) haplotype useful as an indicator of patient outcome
US7393207B2 (en) * 2003-03-26 2008-07-01 Shin-Etsu Handotai Co., Ltd. Wafer support tool for heat treatment and heat treatment apparatus
CA2521084A1 (en) 2003-04-14 2004-10-28 Nugen Technologies, Inc. Global amplification using a randomly primed composite primer
US8652774B2 (en) * 2003-04-16 2014-02-18 Affymetrix, Inc. Automated method of manufacturing polymer arrays
US7425700B2 (en) 2003-05-22 2008-09-16 Stults John T Systems and methods for discovery and analysis of markers
WO2005001129A2 (en) * 2003-06-06 2005-01-06 Applera Corporation Mobility cassettes
US20050170367A1 (en) * 2003-06-10 2005-08-04 Quake Stephen R. Fluorescently labeled nucleoside triphosphates and analogs thereof for sequencing nucleic acids
JP4067463B2 (en) * 2003-07-18 2008-03-26 トヨタ自動車株式会社 Control device for hybrid vehicle
US20050038776A1 (en) * 2003-08-15 2005-02-17 Ramin Cyrus Information system for biological and life sciences research
US7348146B2 (en) 2003-10-02 2008-03-25 Epoch Biosciences, Inc. Single nucleotide polymorphism analysis of highly polymorphic target sequences
CA2542768A1 (en) 2003-10-28 2005-05-12 Epoch Biosciences, Inc. Fluorescent probes for dna detection by hybridization with improved sensitivity and low background
US7169560B2 (en) 2003-11-12 2007-01-30 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
WO2005049849A2 (en) 2003-11-14 2005-06-02 Integrated Dna Technologies, Inc. Fluorescence quenching azo dyes, their methods of preparation and use
US7276338B2 (en) * 2003-11-17 2007-10-02 Jacobson Joseph M Nucleotide sequencing via repetitive single molecule hybridization
WO2005054441A2 (en) * 2003-12-01 2005-06-16 California Institute Of Technology Device for immobilizing chemical and biomedical species and methods of using same
CA2552007A1 (en) 2003-12-29 2005-07-21 Nugen Technologies, Inc. Methods for analysis of nucleic acid methylation status and methods for fragmentation, labeling and immobilization of nucleic acids
US7981604B2 (en) 2004-02-19 2011-07-19 California Institute Of Technology Methods and kits for analyzing polynucleotide sequences
US20060046258A1 (en) * 2004-02-27 2006-03-02 Lapidus Stanley N Applications of single molecule sequencing
WO2005085273A1 (en) 2004-03-04 2005-09-15 The University Of British Columbia Thrombomodulin (thbd) haplotypes predict outcome of patients
GB2413796B (en) * 2004-03-25 2006-03-29 Global Genomics Ab Methods and means for nucleic acid sequencing
US20050239085A1 (en) * 2004-04-23 2005-10-27 Buzby Philip R Methods for nucleic acid sequence determination
US20050260609A1 (en) * 2004-05-24 2005-11-24 Lapidus Stanley N Methods and devices for sequencing nucleic acids
JP2008512084A (en) * 2004-05-25 2008-04-24 ヘリコス バイオサイエンシーズ コーポレイション Methods and devices for nucleic acid sequencing
US20070117104A1 (en) * 2005-11-22 2007-05-24 Buzby Philip R Nucleotide analogs
US20070117103A1 (en) * 2005-11-22 2007-05-24 Buzby Philip R Nucleotide analogs
US7476734B2 (en) * 2005-12-06 2009-01-13 Helicos Biosciences Corporation Nucleotide analogs
US20060024678A1 (en) * 2004-07-28 2006-02-02 Helicos Biosciences Corporation Use of single-stranded nucleic acid binding proteins in sequencing
EP1807146A4 (en) * 2004-09-29 2013-07-03 Tel Hashomer Medical Res Infrastructure & Services Ltd Composition for improving efficiency of drug delivery
US20060118754A1 (en) * 2004-12-08 2006-06-08 Lapen Daniel C Stabilizing a polyelectrolyte multilayer
US20060172328A1 (en) * 2005-01-05 2006-08-03 Buzby Philip R Methods and compositions for correcting misincorporation in a nucleic acid synthesis reaction
US7482120B2 (en) * 2005-01-28 2009-01-27 Helicos Biosciences Corporation Methods and compositions for improving fidelity in a nucleic acid synthesis reaction
AU2006237613A1 (en) * 2005-02-18 2006-10-26 Abraxis Bioscience, Inc. Q3 SPARC deletion mutant and uses thereof
WO2006127507A2 (en) 2005-05-20 2006-11-30 Integrated Dna Technologies, Inc. Compounds and methods for labeling oligonucleotides
US20060263790A1 (en) * 2005-05-20 2006-11-23 Timothy Harris Methods for improving fidelity in a nucleic acid synthesis reaction
JP5331476B2 (en) * 2005-06-15 2013-10-30 カリダ・ジェノミックス・インコーポレイテッド Single molecule array for genetic and chemical analysis
CA2612859A1 (en) * 2005-06-23 2006-12-28 The University Of British Columbia Coagulation factor iii polymorphisms associated with prediction of subject outcome and response to therapy
US7666593B2 (en) 2005-08-26 2010-02-23 Helicos Biosciences Corporation Single molecule sequencing of captured nucleic acids
WO2007030759A2 (en) 2005-09-07 2007-03-15 Nugen Technologies, Inc. Improved nucleic acid amplification procedure
US7960104B2 (en) 2005-10-07 2011-06-14 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
US20070117102A1 (en) * 2005-11-22 2007-05-24 Buzby Philip R Nucleotide analogs
US20070128610A1 (en) * 2005-12-02 2007-06-07 Buzby Philip R Sample preparation method and apparatus for nucleic acid sequencing
WO2007091077A1 (en) 2006-02-08 2007-08-16 Solexa Limited Method for sequencing a polynucleotide template
SG10201405158QA (en) * 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
EP1994180A4 (en) * 2006-02-24 2009-11-25 Callida Genomics Inc High throughput genome sequencing on dna arrays
US20100022403A1 (en) * 2006-06-30 2010-01-28 Nurith Kurn Methods for fragmentation and labeling of nucleic acids
GB0618514D0 (en) * 2006-09-20 2006-11-01 Univ Nottingham Trent Method of detecting interactions on a microarray using nuclear magnetic resonance
EP2084296B1 (en) * 2006-09-29 2015-08-05 Agendia N.V. High-throughput diagnostic testing using arrays
US7910302B2 (en) 2006-10-27 2011-03-22 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
US20090111705A1 (en) 2006-11-09 2009-04-30 Complete Genomics, Inc. Selection of dna adaptor orientation by hybrid capture
US20080221832A1 (en) * 2006-11-09 2008-09-11 Complete Genomics, Inc. Methods for computing positional base probabilities using experimental base value distributions
EP2102392A4 (en) 2006-11-15 2010-07-14 Univ British Columbia Polymorphisms predictive of anthracycline-induced cardiotoxicity
AU2008205457A1 (en) * 2007-01-18 2008-07-24 University Of Southern California Gene polymorphisms predictive for dual TKI therapy
WO2008088860A2 (en) 2007-01-18 2008-07-24 University Of Southern California Polymorphisms in the egfr pathway as markers for cancer treatment
US7881933B2 (en) * 2007-03-23 2011-02-01 Verizon Patent And Licensing Inc. Age determination using speech
US7572990B2 (en) * 2007-03-30 2009-08-11 Intermec Ip Corp. Keypad overlay membrane
EP2915564B1 (en) 2007-09-28 2020-11-04 Portola Pharmaceuticals, Inc. Antidotes for factor XA inhibitors and methods of using the same
US8278047B2 (en) 2007-10-01 2012-10-02 Nabsys, Inc. Biopolymer sequencing by hybridization of probes to form ternary complexes and variable range alignment
WO2009052214A2 (en) * 2007-10-15 2009-04-23 Complete Genomics, Inc. Sequence analysis using decorated nucleic acids
US20090263872A1 (en) * 2008-01-23 2009-10-22 Complete Genomics Inc. Methods and compositions for preventing bias in amplification and sequencing reactions
US8518640B2 (en) * 2007-10-29 2013-08-27 Complete Genomics, Inc. Nucleic acid sequencing and process
US8298768B2 (en) * 2007-11-29 2012-10-30 Complete Genomics, Inc. Efficient shotgun sequencing methods
US8415099B2 (en) 2007-11-05 2013-04-09 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US7897344B2 (en) * 2007-11-06 2011-03-01 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs
WO2009061840A1 (en) * 2007-11-05 2009-05-14 Complete Genomics, Inc. Methods and oligonucleotide designs for insertion of multiple adaptors employing selective methylation
EP2212437A4 (en) * 2007-11-07 2011-09-28 Univ British Columbia Microfluidic device and method of using same
CN102016579B (en) 2007-11-30 2015-04-08 健泰科生物技术公司 VEGF polymorphisms and anti-angiogenesis therapy
US8592150B2 (en) 2007-12-05 2013-11-26 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
WO2009097368A2 (en) 2008-01-28 2009-08-06 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US20090203531A1 (en) 2008-02-12 2009-08-13 Nurith Kurn Method for Archiving and Clonal Expansion
WO2009117698A2 (en) 2008-03-21 2009-09-24 Nugen Technologies, Inc. Methods of rna amplification in the presence of dna
WO2009132028A1 (en) * 2008-04-21 2009-10-29 Complete Genomics, Inc. Array structures for nucleic acid detection
JP4667490B2 (en) * 2008-07-09 2011-04-13 三菱電機株式会社 Cooker
US9650668B2 (en) 2008-09-03 2017-05-16 Nabsys 2.0 Llc Use of longitudinally displaced nanoscale electrodes for voltage sensing of biomolecules and other analytes in fluidic channels
US8262879B2 (en) 2008-09-03 2012-09-11 Nabsys, Inc. Devices and methods for determining the length of biopolymers and distances between probes bound thereto
WO2010028140A2 (en) 2008-09-03 2010-03-11 Nabsys, Inc. Use of longitudinally displaced nanoscale electrodes for voltage sensing of biomolecules and other analytes in fluidic channels
JP2012501658A (en) * 2008-09-05 2012-01-26 ライフ テクノロジーズ コーポレーション Methods and systems for nucleic acid sequencing validation, calibration, and standardization
CN102203296B (en) 2008-11-05 2014-12-03 健泰科生物技术公司 Genetic polymorphisms in age-related macular degeneration
US20120185177A1 (en) * 2009-02-20 2012-07-19 Hannon Gregory J Harnessing high throughput sequencing for multiplexed specimen analysis
EP2411505A4 (en) 2009-03-26 2013-01-30 Univ California Mesenchymal stem cells producing inhibitory rna for disease modification
EP3121271B1 (en) 2009-03-30 2019-07-24 Portola Pharmaceuticals, Inc. Antidotes for factor xa inhibitors and methods of using the same
WO2010123625A1 (en) 2009-04-24 2010-10-28 University Of Southern California Cd133 polymorphisms predict clinical outcome in patients with cancer
US9524369B2 (en) 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
US20120283136A1 (en) 2009-06-24 2012-11-08 The University Of Southern California Compositions and methods for the rapid biosynthesis and in vivo screening of biologically relevant peptides
WO2011008885A1 (en) 2009-07-15 2011-01-20 Portola Pharmaceuticals, Inc. Unit dose formulation of antidotes for factor xa inhibitors and methods of using the same
RU2577726C2 (en) 2009-10-21 2016-03-20 Дженентек, Инк. Genetic polymorphisms in age-related macular degeneration
WO2011084757A1 (en) 2009-12-21 2011-07-14 University Of Southern California Germline polymorphisms in the sparc gene associated with clinical outcome in gastric cancer
WO2011085334A1 (en) 2010-01-11 2011-07-14 University Of Southern California Cd44 polymorphisms predict clinical outcome in patients with gastric cancer
US20120100548A1 (en) 2010-10-26 2012-04-26 Verinata Health, Inc. Method for determining copy number variations
US10388403B2 (en) 2010-01-19 2019-08-20 Verinata Health, Inc. Analyzing copy number variation in the detection of cancer
US9260745B2 (en) 2010-01-19 2016-02-16 Verinata Health, Inc. Detecting and classifying copy number variation
US9323888B2 (en) 2010-01-19 2016-04-26 Verinata Health, Inc. Detecting and classifying copy number variation
US8700341B2 (en) 2010-01-19 2014-04-15 Verinata Health, Inc. Partition defined detection methods
EP2366031B1 (en) 2010-01-19 2015-01-21 Verinata Health, Inc Sequencing methods in prenatal diagnoses
WO2011103467A2 (en) * 2010-02-19 2011-08-25 Life Technologies Corporation Methods and systems for nucleic acid sequencing validation, calibration and normalization
US9506057B2 (en) 2010-03-26 2016-11-29 Integrated Dna Technologies, Inc. Modifications for antisense compounds
AU2011230496B2 (en) 2010-03-26 2015-09-17 Integrated Dna Technologies, Inc. Methods for enhancing nucleic acid hybridization
WO2012033848A1 (en) 2010-09-07 2012-03-15 Integrated Dna Technologies, Inc. Modifications for antisense compounds
WO2012037456A1 (en) 2010-09-17 2012-03-22 President And Fellows Of Harvard College Functional genomics assay for characterizing pluripotent stem cell utility and safety
US8715933B2 (en) 2010-09-27 2014-05-06 Nabsys, Inc. Assay methods using nicking endonucleases
US8859201B2 (en) 2010-11-16 2014-10-14 Nabsys, Inc. Methods for sequencing a biomolecule by detecting relative positions of hybridized probes
WO2012068519A2 (en) 2010-11-19 2012-05-24 Sirius Genomics Inc. Markers associated with response to activated protein c administration, and uses thereof
US11274341B2 (en) 2011-02-11 2022-03-15 NABsys, 2.0 LLC Assay methods using DNA binding proteins
US8969003B2 (en) 2011-03-23 2015-03-03 Elitech Holding B.V. Functionalized 3-alkynyl pyrazolopyrimidine analogues as universal bases and methods of use
US9085800B2 (en) 2011-03-23 2015-07-21 Elitech Holding B.V. Functionalized 3-alkynyl pyrazolopyrimidine analogues as universal bases and methods of use
HUE050032T2 (en) 2011-04-12 2020-11-30 Verinata Health Inc Resolving genome fractions using polymorphism counts
US9411937B2 (en) 2011-04-15 2016-08-09 Verinata Health, Inc. Detecting and classifying copy number variation
EP2701722A4 (en) 2011-04-28 2014-12-31 Univ Southern California Human myeloid derived suppressor cell cancer markers
ES2570591T3 (en) 2011-05-24 2016-05-19 Elitechgroup B V Methicillin-resistant Staphylococcus aureus detection
GB2497838A (en) 2011-10-19 2013-06-26 Nugen Technologies Inc Compositions and methods for directional nucleic acid amplification and sequencing
JP6285865B2 (en) 2011-11-14 2018-02-28 アルファシグマ ソシエタ ペル アチオニ Assays and methods for selecting treatment regimens for subjects with depression
CN105861487B (en) 2012-01-26 2020-05-05 纽亘技术公司 Compositions and methods for targeted nucleic acid sequence enrichment and efficient library generation
WO2013119923A1 (en) 2012-02-09 2013-08-15 The Regents Of The University Of Michigan Different states of cancer stem cells
CN104619894B (en) 2012-06-18 2017-06-06 纽亘技术公司 For the composition and method of the Solid phase of unexpected nucleotide sequence
US20150011396A1 (en) 2012-07-09 2015-01-08 Benjamin G. Schroeder Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
GB201220924D0 (en) 2012-11-21 2013-01-02 Cancer Res Inst Royal Materials and methods for determining susceptibility or predisposition to cancer
US9914966B1 (en) 2012-12-20 2018-03-13 Nabsys 2.0 Llc Apparatus and methods for analysis of biomolecules using high frequency alternating current excitation
WO2014113557A1 (en) 2013-01-18 2014-07-24 Nabsys, Inc. Enhanced probe binding
EP2971086A1 (en) 2013-03-14 2016-01-20 ELITechGroup B.V. Functionalized 3-alkynyl pyrazolopyrimidine analogues as universal bases and methods of use
EP2971130A4 (en) 2013-03-15 2016-10-05 Nugen Technologies Inc Sequential sequencing
US20160186263A1 (en) 2013-05-09 2016-06-30 Trustees Of Boston University Using plexin-a4 as a biomarker and therapeutic target for alzheimer's disease
JP6697380B2 (en) 2013-06-10 2020-05-20 プレジデント・アンド・フェロウズ・オブ・ハーバード・カレッジ Early developmental genomic assay to characterize the utility and safety of pluripotent stem cells
US20160230231A1 (en) 2013-10-18 2016-08-11 Institut De Cardiologie De Montreal Genotyping tests and methods for evaluating plasma creatine kinase levels
WO2015073711A1 (en) 2013-11-13 2015-05-21 Nugen Technologies, Inc. Compositions and methods for identification of a duplicate sequencing read
WO2015131107A1 (en) 2014-02-28 2015-09-03 Nugen Technologies, Inc. Reduced representation bisulfite sequencing with diversity adaptors
WO2016016157A1 (en) 2014-07-30 2016-02-04 F. Hoffmann-La Roche Ag Genetic markers for predicting responsiveness to therapy with hdl-raising or hdl mimicking agent
US9789087B2 (en) 2015-08-03 2017-10-17 Thomas Jefferson University PAR4 inhibitor therapy for patients with PAR4 polymorphism
EP3635103A1 (en) 2017-06-05 2020-04-15 Research Institute at Nationwide Children's Hospital Enhanced modified viral capsid proteins
US11655498B2 (en) * 2017-07-07 2023-05-23 Massachusetts Institute Of Technology Systems and methods for genetic identification and analysis
US11099202B2 (en) 2017-10-20 2021-08-24 Tecan Genomics, Inc. Reagent delivery system
CN113348513B (en) * 2019-01-07 2024-04-09 私人基因诊断公司 Inspection of sequencing instruments and reagents for use in molecular diagnostic methods
CN109801679B (en) * 2019-01-15 2021-02-02 广州柿宝生物科技有限公司 Mathematical sequence reconstruction method for long-chain molecules
AU2020326698A1 (en) 2019-08-05 2022-02-24 Seer, Inc. Systems and methods for sample preparation, data generation, and protein corona analysis
WO2023096996A2 (en) 2021-11-24 2023-06-01 Research Institute At Nationwide Children's Hospital Chimeric hsv expressing hil21 to boost anti-tumor immune activity

Family Cites Families (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4302204A (en) * 1979-07-02 1981-11-24 The Board Of Trustees Of Leland Stanford Junior University Transfer and detection of nucleic acids
US4562159A (en) * 1981-03-31 1985-12-31 Albert Einstein College Of Medicine, A Division Of Yeshiva Univ. Diagnostic test for hepatitis B virus
CA1180647A (en) * 1981-07-17 1985-01-08 Cavit Akin Light-emitting polynucleotide hybridization diagnostic method
FI63596C (en) * 1981-10-16 1983-07-11 Orion Yhtymae Oy Microbial diagnostic method based on sandwich hybridization of nucleic acids, and combinations of reagents used in the method
US4591567A (en) 1982-04-21 1986-05-27 California Institute Of Technology Recombinant DNA screening system including fixed array replicator and support
JPS5927900A (en) * 1982-08-09 1984-02-14 Wakunaga Seiyaku Kk Oligonucleotide derivative and its preparation
DE3486467T3 (en) * 1983-01-10 2004-10-14 Gen-Probe Inc., San Diego Methods for the detection, identification and quantification of organisms and viruses
JPS6010174A (en) * 1983-06-29 1985-01-19 Fuji Photo Film Co Ltd Screening method of gene by auto radiography
CA1222680A (en) * 1983-07-05 1987-06-09 Nanibhushan Dattagupta Testing dna samples for particular nucleotide sequences
US4677054A (en) * 1983-08-08 1987-06-30 Sloan-Kettering Institute For Cancer Research Method for simple analysis of relative nucleic acid levels in multiple small samples by cytoplasmic dot hybridization
AU575586B2 (en) * 1983-09-02 1988-08-04 Syngene, Inc. Oligonucleotide synthesis employing primer with oxidizable substituents in system
US4613566A (en) * 1984-01-23 1986-09-23 President And Fellows Of Harvard College Hybridization assay and kit therefor
FI71768C (en) * 1984-02-17 1987-02-09 Orion Yhtymae Oy Enhanced nucleic acid reagents and process for their preparation.
CA1223222A (en) 1984-02-22 1987-06-23 Nanibhushan Dattagupta Immobilized nucleic acid-containing probes
US4766062A (en) * 1984-05-07 1988-08-23 Allied Corporation Displacement polynucleotide assay method and polynucleotide complex reagent therefor
US5242794A (en) 1984-12-13 1993-09-07 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
US4883750A (en) 1984-12-13 1989-11-28 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
GB8432118D0 (en) * 1984-12-19 1985-01-30 Malcolm A D B Sandwich hybridisation technique
DE3506703C1 (en) * 1985-02-26 1986-04-30 Sagax Instrument AB, Sundbyberg Process for sequence analysis of nucleic acids, in particular deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), as well as carriers for carrying out the method and process for producing the carrier
GB8509880D0 (en) * 1985-04-17 1985-05-22 Ici Plc Testing device
EP0200113A3 (en) * 1985-04-30 1987-03-18 Pandex Laboratories, Inc. A method of solid phase nucleic acid hybridization assay incorporating a luminescent label
AU558846B2 (en) * 1985-06-21 1987-02-12 Miles Laboratories Inc. Solid-phase hybridization assay using anti-hybrid antibodies
US4794073A (en) * 1985-07-10 1988-12-27 Molecular Diagnostics, Inc. Detection of nucleic acid hybrids by prolonged chemiluminescence
US4775631A (en) * 1985-07-26 1988-10-04 Janssen Pharmaceutica, N.V. Method of localizing nucleic acids bound to polyamide supports
US4806631A (en) * 1985-09-30 1989-02-21 Miles Inc. Immobilization of nucleic acids on solvolyzed nylon supports
US4806546A (en) * 1985-09-30 1989-02-21 Miles Inc. Immobilization of nucleic acids on derivatized nylon supports
TW203120B (en) * 1985-10-04 1993-04-01 Abbott Lab
US4770992A (en) * 1985-11-27 1988-09-13 Den Engh Gerrit J Van Detection of specific DNA sequences by flow cytometry
US4882269A (en) * 1985-12-13 1989-11-21 Princeton University Amplified hybridization assay
EP0228075B1 (en) * 1986-01-03 1991-04-03 Molecular Diagnostics, Inc. Eucaryotic genomic dna dot-blot hybridization method
EP0231010A3 (en) * 1986-01-27 1990-10-17 INCSTAR Corporation A method of solid phase enzyme immunoassay and nucleic acid hybridization assay and dip-stick design and stabilized chromogenic substrate
NO870613L (en) * 1986-03-05 1987-09-07 Molecular Diagnostics Inc DETECTION OF MICROORGANISMS IN A SAMPLE CONTAINING NUCLEIC ACID.
US5348855A (en) * 1986-03-05 1994-09-20 Miles Inc. Assay for nucleic acid sequences in an unpurified sample
CA1284931C (en) * 1986-03-13 1991-06-18 Henry A. Erlich Process for detecting specific nucleotide variations and genetic polymorphisms present in nucleic acids
US5310893A (en) * 1986-03-31 1994-05-10 Hoffmann-La Roche Inc. Method for HLA DP typing
EP0238332A2 (en) * 1986-03-19 1987-09-23 Cetus Corporation Liquid hybridization method and kit for detecting the presence of nucleic acid sequences in samples
US4981783A (en) * 1986-04-16 1991-01-01 Montefiore Medical Center Method for detecting pathological conditions
EP0245206A1 (en) * 1986-05-05 1987-11-11 IntraCel Corporation Analytical method for detecting and measuring specifically sequenced nucleic acid
JP2641880B2 (en) * 1986-08-11 1997-08-20 シスカ・ダイアグノスティックス・インコーポレーテッド Nucleic acid probe assays and compositions
DE3850273T2 (en) * 1987-03-02 1994-09-29 Gen Probe Inc Polycationic carriers for the purification, separation and hybridization of nucleic acid.
US4885250A (en) * 1987-03-02 1989-12-05 E. I. Du Pont De Nemours And Company Enzyme immobilization and bioaffinity separations with perfluorocarbon polymer-based supports
AU601021B2 (en) * 1987-03-11 1990-08-30 Molecular Diagnostics, Inc. Assay for nucleic acid sequences in a sample
IL85551A0 (en) * 1987-04-01 1988-08-31 Miles Inc Rapid hybridization assay and reagent system used therein
US6270961B1 (en) * 1987-04-01 2001-08-07 Hyseq, Inc. Methods and apparatus for DNA sequencing and DNA identification
US5525464A (en) * 1987-04-01 1996-06-11 Hyseq, Inc. Method of sequencing by hybridization of oligonucleotide probes
US5202231A (en) * 1987-04-01 1993-04-13 Drmanac Radoje T Method of sequencing of genomes by hybridization of oligonucleotide probes
US4849334A (en) * 1987-06-09 1989-07-18 Life Technologies, Inc. Human papillomavirus 43 nucleic acid hybridization probes and methods for employing the same
WO1988010313A1 (en) * 1987-06-26 1988-12-29 E.I. Du Pont De Nemours And Company Affinity removal of contaminating sequences from recombinant cloned na using capture beads
US5120643A (en) * 1987-07-13 1992-06-09 Abbott Laboratories Process for immunochromatography with colloidal particles
US4921805A (en) * 1987-07-29 1990-05-01 Life Technologies, Inc. Nucleic acid capture method
US4942124A (en) * 1987-08-11 1990-07-17 President And Fellows Of Harvard College Multiplex sequencing
EP0305145A3 (en) * 1987-08-24 1990-05-02 Ortho Diagnostic Systems Inc. Methods and probes for detecting nucleic acids
DE3888653T2 (en) * 1987-12-21 1994-07-07 Applied Biosystems Method and test kit for the detection of a nucleic acid sequence.
US5354657A (en) * 1988-01-12 1994-10-11 Boehringer Mannheim Gmbh Process for the highly specific detection of nucleic acids in solid
GB8810400D0 (en) * 1988-05-03 1988-06-08 Southern E Analysing polynucleotide sequences
US4988617A (en) * 1988-03-25 1991-01-29 California Institute Of Technology Method of detecting a nucleotide change in nucleic acids
US5002867A (en) * 1988-04-25 1991-03-26 Macevicz Stephen C Nucleic acid sequence determination by multiple mixed oligonucleotide probes
WO1989011548A1 (en) * 1988-05-20 1989-11-30 Cetus Corporation Immobilized sequence-specific probes
US5094939A (en) * 1988-07-19 1992-03-10 Fujirebio, Inc. Chemiluminescence assays using stabilized dioxetane derivatives
GB8822228D0 (en) * 1988-09-21 1988-10-26 Southern E M Support-bound oligonucleotides
JPH02299598A (en) * 1989-04-14 1990-12-11 Ro Inst For Molecular Genetics & Genetic Res Determination by means of hybridization, together with oligonucleotide probe of all or part of extremely short sequence in sample of nucleic acid connecting with separate particle of microscopic size
US5424186A (en) 1989-06-07 1995-06-13 Affymax Technologies N.V. Very large scale immobilized polymer synthesis
US5143854A (en) 1989-06-07 1992-09-01 Affymax Technologies N.V. Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof
WO1992007093A1 (en) * 1990-10-17 1992-04-30 Jack Love Identification and paternity determination by detecting presence or absence of multiple nucleic acid sequences
DE69132843T2 (en) * 1990-12-06 2002-09-12 Affymetrix Inc N D Ges D Staat Identification of nucleic acids in samples
EP0514927A1 (en) * 1991-05-24 1992-11-25 Walter Gilbert Method and apparatus for rapid nucleic acid sequencing
AU2547592A (en) * 1991-08-23 1993-03-16 Isis Pharmaceuticals, Inc. Synthetic unrandomization of oligomer fragments
US5474796A (en) 1991-09-04 1995-12-12 Protogene Laboratories, Inc. Method and apparatus for conducting an array of chemical reactions on a support surface
US6017696A (en) 1993-11-01 2000-01-25 Nanogen, Inc. Methods for electronic stringency control for molecular biological analysis and diagnostics
US5503980A (en) * 1992-11-06 1996-04-02 Trustees Of Boston University Positional sequencing by hybridization
EP0723598B1 (en) * 1993-09-27 2004-01-14 Arch Development Corporation Methods and compositions for efficient nucleic acid sequencing
EP0754241B1 (en) * 1994-04-04 1998-12-02 Ciba Corning Diagnostics Corp. Hybridization-ligation assays for the detection of specific nucleic acid sequences
GB9507238D0 (en) * 1995-04-07 1995-05-31 Isis Innovation Detecting dna sequence variations
US5545531A (en) * 1995-06-07 1996-08-13 Affymax Technologies N.V. Methods for making a device for concurrently processing multiple biological chip assays
EP0937159A4 (en) * 1996-02-08 2004-10-20 Affymetrix Inc Chip-based speciation and phenotypic characterization of microorganisms

Also Published As

Publication number Publication date
JPH10512745A (en) 1998-12-08
KR980700433A (en) 1998-03-30
WO1996017957A1 (en) 1996-06-13
US6403315B1 (en) 2002-06-11
NO972535D0 (en) 1997-06-04
AU715506B2 (en) 2000-02-03
US6270961B1 (en) 2001-08-07
NO972535L (en) 1997-08-06
FI972429A0 (en) 1997-06-06
FI972429A (en) 1997-08-06
CN1175283A (en) 1998-03-04
EP0797683A4 (en) 1999-03-03
US20020192691A1 (en) 2002-12-19
EP0797683A1 (en) 1997-10-01
US6025136A (en) 2000-02-15
AU4468796A (en) 1996-06-26

Similar Documents

Publication Publication Date Title
CA2206815A1 (en) Methods and apparatus for dna sequencing and dna identification
US6309824B1 (en) Methods for analyzing a target nucleic acid using immobilized heterogeneous mixtures of oligonucleotide probes
US6383742B1 (en) Three dimensional arrays for detection or quantification of nucleic acid species
US20020034737A1 (en) Methods and compositions for detection or quantification of nucleic acid species
Drmanac et al. Sequencing of megabase plus DNA by hybridization: theory of the method
US5763175A (en) Simultaneous sequencing of tagged polynucleotides
US5780231A (en) DNA extension and analysis with rolling primers
EP1967592B1 (en) Method of improving the efficiency of polynucleotide sequencing
EP0793718B1 (en) Molecular tagging system
WO1999009217A1 (en) Methods and compositions for detection or quantification of nucleic acid species
JPH11243999A (en) Polynucleotide array for analysis
ES2306485T3 (en) PARALLEL SCREENING PROCEDURE FOR INSUTING MUTANTS AND A DEVICE FOR CARRYING OUT THIS PROCEDURE.
CN115521977A (en) Spatial sequencing Using MICTAG
US20030036084A1 (en) Nucleic acid detection method employing oligonucleotide probes affixed to particles and related compositions
AU739963B2 (en) Method of mapping restriction sites in polynucleotides
US20040224324A1 (en) Happier mapping
WO1999036567A2 (en) Enhanced discrimination of perfect matches from mismatches using a modified dna ligase
AU2002302835A1 (en) Genomic mapping method
CZ254699A3 (en) Processes and compositions suitable for detection of quantification of types of nucleic acids

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead