Michael Hunkapiller (PhD '74), Biological Engineer
November 10, 2021
From his time as a postdoctoral scholar in Lee Hood's group at Caltech to his appointment as CEO of Pacific Biosciences, Michael Hunkapiller has been a key participant in the revolution in biotechnology as a result of DNA sequencing.
As is often the case, Hunkapiller's interests began in fundamental research, and as both the technology and analytical tools advanced, he recognized early on that DNA sequencing would play a key role in translational applications in fields as diverse as agriculture and human health.
Interview Transcript
DAVID ZIERLER: This is David Zierler, Director of the Caltech Heritage Project. It is Wednesday, November 10, 2021. It's my great pleasure to be here with Dr. Michael W. Hunkapiller. Mike, it's great to see you. Thank you for joining me today.
MICHAEL HUNKAPILLER: I'm glad to do it.
ZIERLER: To start, would you please tell me your current or most recent title and institutional affiliation?
HUNKAPILLER: Well, my current title is retired. [laugh] Prior to that, I was Chief Executive Officer at Pacific Biosciences in California.
ZIERLER: What is Pacific Biosciences?
HUNKAPILLER: They are a scientific instrumentation company focused on DNA sequencing.
ZIERLER: Just for you, if somebody asked you at a dinner party what kind of a scientist you are, what would you say?
HUNKAPILLER: [laugh] It usually is a chemist.
ZIERLER: Yet your field is bioscience.
HUNKAPILLER: Yeah, my PhD at Caltech was in Chemical Biology, whatever that is. [laugh]
ZIERLER: Let's go all the way back to the beginning. Where did you grow up?
HUNKAPILLER: I grew up in Oklahoma, mostly in a small town called Seminole about 40 miles southeast of Oklahoma City.
ZIERLER: And the name Hunkapiller, what's the national origin to that?
HUNKAPILLER: Oh, boy. Well, the national origin was German. It wasn't Hunkapiller, it was Hungerbuhler. And when distant relatives came into the US in the late 1700s, early 1800s, that's how the name got transcribed. [laugh] Which is probably typical for a lot of European names that weren't English.
ZIERLER: Were you always interested in science, even as a young boy?
HUNKAPILLER: Yes. Which might be a surprise, coming from Oklahoma, particularly rural Oklahoma. But when you remember that one of the first seven and one of the second nine astronauts came from the middle of Oklahoma, both within 15 miles of where I grew up, it's maybe not so strange. I was just right in the middle of the frenzy over space exploration and so forth.
ZIERLER: Do you remember Sputnik?
HUNKAPILLER: Oh, I remember Sputnik and all of that, yes.
ZIERLER: Did your high school offer a strong curriculum in math and science?
HUNKAPILLER: No. [laugh] Not at all. My chemistry teacher had never taken a chemistry class in her life. It had a reasonably good math program. But I would say science was pretty pitiful. But that didn't stop me from being interested in it.
ZIERLER: When it was time to think about colleges, what was available to you? Did you want to stay close to home?
HUNKAPILLER: No, not particularly. I actually applied for Caltech undergraduate school with no particular reason as to why I did it that I can remember. I didn't get in. I didn't have the requisite science and even math background at that point in time. And I wound up going to a reasonably small college not too far away from where I grew up, mostly because they gave me a very, very good scholarship.
ZIERLER: Being at Oklahoma Baptist University in the late 1960s, I assume it was a pretty conservative place.
HUNKAPILLER: You got it.
ZIERLER: Was there any anti-war sentiment or anything like that?
HUNKAPILLER: Yeah. It was somewhat limited, obviously, but it was there.
ZIERLER: And did you focus on chemistry from the beginning? Was that always your plan?
HUNKAPILLER: Yes.
ZIERLER: Was it a good program in chemistry?
HUNKAPILLER: Well, it was. It was a very Baptist school. I wasn't a Baptist, but that didn't seem to matter that much. And the reason it had a good chemistry program is because they had a very big pre-nursing program. And they had to have all of the chemistry and some of the biology to be accredited for that. And so, it was anomalously good in science, actually.
ZIERLER: Was there a senior thesis, or were there any professors who were mentors to you?
HUNKAPILLER: There were a couple. I did have a senior thesis. I think it's listed in my CV. But the person who was the sort of senior biology professor, biochemistry professor was sort of, I think, a mentor to most of the sort of serious chemistry, biology students, including me.
Caltech by Way of Oklahoma
ZIERLER: Were you encouraged to apply for graduate school? Did professors think that you had a career maybe even in academic science?
HUNKAPILLER: Well, I think I encouraged myself. I knew that was what I wanted to do. And I guess I'm a little stubborn. I applied for Caltech in graduate school and got in. The only encouragement there in terms of applying to graduate school at Caltech also came from a previous student who was maybe four or five years in front of me who had gone to Caltech as a grad student.
ZIERLER: I imagine most of your cohort in graduate school came from bigger programs, more elite schools than you did. How well-prepared were you?
HUNKAPILLER: Actually, I think that may be a bit of a misperception. There certainly were some who were. But there were a lot from relatively small schools. Caltech, at least at the time, I assume they do now, tends to take a fairly diverse set of students from their backgrounds. It's more sort of how they perceived, based on your recommendations, your scores, your grades, what your interests were. And they had to compete with a lot of big schools for graduate students, which was strange.
ZIERLER: What program did you settle on at Caltech?
HUNKAPILLER: Well, I went into the Chemistry and Engineering Division, which is where I applied for and got in. But I kind of knew I wanted to focus on biochemistry-type programs there. Which is what I did.
ZIERLER: And what was exciting? What were some of the big research areas at the time?
HUNKAPILLER: I went to work with Professor John Richards, who was a biochemist. He was actually a photochemist by his early training, but he went into biochemistry and was involved in looking at the mechanism of actions of enzymes, which I'd had some interest in when I was in undergraduate work at Oklahoma Baptist. A professor there had a small grant, and that's what I'd worked on there. Pretty rudimentary stuff. You might imagine, they didn't have a huge laboratory set up. And that's kind of what got me started in Caltech, and I got through the graduate program fairly quickly, interrupted by a brief stint in the Army. And then, eventually, I moved over into the Division of Biology with Lee Hood.
ZIERLER: Was he a professor at that point? Or he was still in graduate school?
HUNKAPILLER: No, he was a professor.
ZIERLER: What was Lee working on at that point?
HUNKAPILLER: Lee primarily was focused on understanding some of the components of the immune system. A lot of his studies were on looking at how organisms recognize and respond to non-self. And his biggest group, although it got bigger, and bigger, and bigger during the time I was there, was looking at the molecules that are involved in the cell surface reception from T cells. And then, in the effector part of it, the B cells. And those molecules were present in tiny, tiny quantities on a fairly small number of cells. And he studied mostly the response in mice. And he needed a way to be able to study the structure of those. And what they had was a fairly archaic way of doing it. It gave you very fragmented information. And what he asked me with kind of a chemistry and instrumentation background was to come up with systems that would allow him to study protein structure with tiny amounts of material to start with.
ZIERLER: Would you say Lee Hood's group was more focused on applications of this research, or it was more a basic science approach?
HUNKAPILLER: It was mostly a basic science approach.
ZIERLER: And where is Bill Dreyer in this?
HUNKAPILLER: Bill was one of Lee's mentors. And I guess Lee had done some tiny amount of protein sequence analysis with Bill, but they were, as far as I could tell, completely separate groups at that point.
ZIERLER: So the DNA sequencing work and the protein sequencing work were separate endeavors?
HUNKAPILLER: Well, the protein sequencing, in terms of developing technology, at the time I was there, mostly came from Lee's efforts, when Lee was able to get funding to do that. He was amazing at convincing even non-governmental agencies to fund him. And together, we were able to secure a fair amount of funding. In the end, I worked on the protein sequencing a little bit with Bill's post-doc, who was also interested in it and didn't have the funds to do it, but we were able to put together a pretty good scientific collaboration.
ZIERLER: Who was Bill's post-doc?
HUNKAPILLER: Rod Hewick. It's on the first paper.
ZIERLER: What ended up being your thesis project?
HUNKAPILLER: Well, there were several. But they were focused on mechanism of action of serine proteases.
ZIERLER: If you wouldn't mind translating that, what does that mean?
HUNKAPILLER: Proteases are enzymes that chew up other proteins. In the stomach, there are two that chew up protein foods. But they're also involved in a whole host of other kinds of reactions. Blood clotting, for example, the processes that go on in that are triggered by lots of events, but one of the first biochemical steps, in fact, two, three, four, or five, are proteases that are making a specific clip in a protein that then goes on to do something else, and something else, and so forth. They're involved in a lot of triggering events of proteins that are present in a sort of inactive form. And when that proteolytic cleavage, which can be very specific, happens, they suddenly turn into an active protein. And it's a very widespread general class of what are called proteases. They tend to be fairly important.
ZIERLER: Was Lee your advisor?
HUNKAPILLER: No, I went to work in chemistry with John Richards.
ZIERLER: Was Lee on your thesis committee? Did you stay in contact with him?
HUNKAPILLER: No, I was a grad student from '70 to '74, and then I stayed on as a post-doc with Jack Richards for a year and a half, two years and started working a little bit with Lee because he had this interest in figuring out how to come up with better ways of doing protein sequencing for tiny amounts of material. And I was ready to actually leave Caltech, and as he put it, he made me an offer I couldn't refuse, which was to stay. It wasn't a difficult decision to make. Obviously, Caltech is a great place. I went to work for Lee officially and moved into biology in '76.
ZIERLER: And what was your research during your post-doc time?
HUNKAPILLER: A lot of the instrumental technology used in Jack Richards's lab was NMR spectroscopy. And so, part of it was learning a little bit more about how to look at and tweak NMR, as opposed to just looking at the biology that we were doing with it. And that got me sort of working more into the instrumentation side of the world from a fundamental perspective. And part of Lee's project in his lab was studying antibody molecules, which you could get in large amounts. And we were interested in looking at how antibodies interact with their antigens, and studying that by NMR spectroscopy. And that's how I got associated with Lee.
ZIERLER: Now, the focus on instrumentation, was the term biomedical engineering in use? Was that a field to go into at this point?
HUNKAPILLER: If it was, I don't remember it.
NMR Spectroscopy and Lee Hood
ZIERLER: Now, being a post-doc, did that put you necessarily on an academic track? What were you thinking in terms of next moves?
HUNKAPILLER: It didn't put me on an academic track at Caltech per se. If you were in the Chemistry Division, they wanted you to go somewhere else. Maybe come back. It was more of an interim step. Moving into Lee's lab was changing fields a fair amount and into a different program. I had no idea how long I was going to stay there at that point. The collaboration with Lee turned out to be pretty fruitful. But to some degree, you're winging it. It's hard to wean yourself away from Caltech. [laugh] Lee had graduate students for 12 years.
ZIERLER: And the group was getting bigger by the year.
HUNKAPILLER: It was getting bigger by the month at times, yes. In the summer, I don't remember what year it was, he had 120 people, which was ridiculous for Caltech.
ZIERLER: Yeah. Did you have a front row seat to some of those administrative conversations about how Lee's group was just getting too big, that's not what Caltech was built to do?
HUNKAPILLER: I don't think I had a front row, but I certainly knew about it.
ZIERLER: Was he already making noise about possibly leaving Caltech at that point?
HUNKAPILLER: Not while I was there. He left several years later.
ZIERLER: And with the group growing so big, in what way was it moving into new directions all the time?
HUNKAPILLER: I think he was primarily in the immunology space. But that's a pretty broad field. Mostly, he just tried to broaden out into that and tried to move into programs that were, themselves, larger scale so that they required a large amount of data as opposed to what one student could generate.
ZIERLER: Was there a sense, just to foreshadow to the 1990s, that this genome revolution was on the horizon? Was anybody thinking along those lines that early?
HUNKAPILLER: The first hints of how to do DNA sequencing in any scale came in the sort of mid to late 70s, which is how we got started in Lee's lab. I think there probably were people who viewed it as something that eventually would happen, but the scale that you could operate at at that point in time was a long way from being able to think of genomes.
ZIERLER: I wonder just intellectually how you might connect DNA sequencing ultimately to the Human Genome Project, for example.
HUNKAPILLER: That's a good question. Obviously, you needed to do DNA sequencing to do the Human Genome Project, that's what it was.
ZIERLER: Just nuts and bolts, why do you need it?
HUNKAPILLER: The Human Genome Project was sequencing the human genome. That's what its end goal was. The purpose of doing that was to be able to take the information that you got from that and understand what information is in a genome, particularly one the size, say, of a mammal, and how alterations in that might impact health or disease from a human perspective. Or in other areas, say, in agriculture, how it might impact not just disease and so forth in animals or plants, but their ability to withstand harsh conditions, their ability to produce whatever you're trying to get out of them. I think it was understood, certainly way before I was around, that genomic structure was going to be important in a whole slew of areas.
ZIERLER: And was the group thinking about drugs and drug delivery also at this point?
HUNKAPILLER: No, Lee's group was focused on pretty early-stage science. And Caltech, particularly at that point, was not a clinically driven institute by any means. They beefed up a little bit of that through collaborations later. But Caltech was a basic science institute.
ZIERLER: Tell me about the opportunity at Applied Biosystems, which finally got you out of Caltech.
HUNKAPILLER: Well, we worked on a series of instruments and technologies within Lee's group, all with the goal of enabling some of the programs that his group, in a broader sense, was working on in basic science. And Lee was on the scientific advisory board for one of the early biotech companies, Amgen. And in the course of that, he got to know some of the venture capitalists who were interested in the whole bio space. And they got to know him and the technology things we were working on. And they had the idea that there might be an opportunity to fund an instrument company that was focused on providing tools for what was a pretty nascent biotech industry. And they sort of interested one scientist and one engineer at Hewlett-Packard at the time in maybe heading up that company, which became Applied Biosystems. And that's sort of how we got connected. We were the initial technology source for that company. And I started off as a consultant. And then, after two years of a lot of plane flights between Southern California and the Bay Area, I got persuaded to move up here permanently.
ZIERLER: What was the overall mission of Applied Biosystems?
HUNKAPILLER: The tagline was to sort of provide the picks and shovels for the biotech revolution.
ZIERLER: Which was already well-underway at that point.
HUNKAPILLER: It was still pretty early. You had Biogen, Amgen, and Genentech, and that was about it.
ZIERLER: And in R&D, your first job there, was that also a basic science kind of endeavor? Or you were already on to applications at that point?
HUNKAPILLER: Well, they were an instrumentation company essentially. It was designing better instruments for doing protein sequence analysis, DNA synthesis, peptide synthesis, but then the big hitter was DNA sequencing.
ZIERLER: Who was the customer base for Applied Biosystems?
HUNKAPILLER: They varied over time, obviously, as we got from a startup to a much bigger company. They were biotech companies, they were government labs like NIH or the CDC, USDA, anyone involved in biological research. And then, they were universities. And their equivalents outside the US as well.
ZIERLER: And in your promotion, what were your responsibilities as scientific director?
HUNKAPILLER: Though I say it was an instrumentation company, it wasn't just an engineering company. Because the instrumentation frequently involved chemistry or biochemistry that was performed in the instruments. The purpose of the instruments, and the chemistry and biochemistry associated with that, kind of fell under the direction of the scientific officer role. And there was a head of engineering as well that I worked with. But the, "What's the instrument going to do, and what is required to get there from a purely chemical perspective?" is more what I focused on. I wasn't an engineer. Despite the fact I'm in the National Academy of Engineering, I was never and have never been an engineer. [laugh]
ZIERLER: Tell me about your membership in the Human Genome Organization, which goes all the way back to 1989.
HUNKAPILLER: Which is when it started. It was pretty clear by the mid-80s, once we had begun to get the core development of the first of the DNA sequencing systems done commercially, which was actually done almost exclusively at Applied Biosystems, that that might begin to enable sequencing large genomes. Might be expensive, but it's possible. And you had people like Wally Gilbert at the time really pushing, "We need to do something like that." And that kind of helped get NIH and even the Department of Energy involved a little bit, being interested in it, so there was a potential source of funding for that. And given that we were sort of leading the effort at Applied Biosystems from the commercial world to develop technology for doing that, it was pretty easy to get involved in that whole effort at an early stage.
ZIERLER: By the time you become executive vice president, are you sufficiently removed from the science at that point? Are you really more on the business side of things?
HUNKAPILLER: It was a scientific company, and my expertise was not sales, per se, it wasn't executive management. I always joked that the only executive management courses I had was when I was in the Army as an officer. [laugh] Somewhat perverted sense of what management is, but it was more true than not. And so, in a company like that that's in a rapidly evolving field, both from what customers are doing, what they want to do, and what they need to do it, the successful managers and executives in those kinds of companies tended to have a technical background. You stay pretty close to that.
ZIERLER: And when did PerkinElmer Corporation buy Applied Biosystems?
HUNKAPILLER: I think in '93, '94, thereabouts.
ZIERLER: Did that change the day-to-day at all?
HUNKAPILLER: A little, but they got into it recognizing that was not their knowledge base. They were a well-established but really the old line analytical chemistry instrumentation company. They started in the mid-30s building infrared spectrometers. And they got pretty big doing that, but they also got very diversified. They had a division that made ceramic coatings for the inside of jet engines. They had a few years where they made small-type computers. Not desktops, but still pretty small by any standard even at the time. They made the nose cones for Sidewinder missiles, which is an infrared detector. They made the mirrors for the Hubble Telescope famously. But that was a sideline to what they really made, which was detectors that looked down from space, not up.
They had a lot of very high-security-type operations. And they maintained their scientific instrument stuff. And they kind of realized that that was such a diverse setup. And there were a couple of other things they got involved with. And they spun off all of those, the last one being the group that made the IR detectors for the Sidewinder missiles. And the business for them was not a growth business. They hadn't really thought about that, which was kind of weird, other than through acquiring other companies and businesses even outside some of their expertise, but one of the executives there, sort of the number-two guy, realized that they needed to get involved in the biology side. And they came up with a partnership with one of the early biotech companies called Cetus, where the polymerase chain reaction, or PCR, was invented. They had rights to that.
But they didn't really know how to exploit it very well. And that's how they got interested in us at Applied Biosystems at the time. And they knew enough to leave us alone. In fact, they moved the management of the PCR group over to Applied Biosystems. And one other tiny group at the time, which was a mass spectrometry venture they had with a company in Canada, and then kept the rest of it with the goal of eventually–and it wasn't so long after that–spinning out their older business, leaving behind more of a biology-focused business. And so, we were in charge of doing that and at one point became PerkinElmer. They sold the old name off, and we took back the Applied Biosystems name.
ZIERLER: By the mid-1990s, when you were GM and president, what were some of the tools that really brought molecular biology into the modern era?
HUNKAPILLER: There were a whole host of tools, but the ones I said that we focused on the most were ones around automated DNA sequencing and what you needed to go along with that, one of which was PCR. You put those two things together–PCR was the way you got a lot of samples you were going to sequence. And you could do other things with PCR besides that. But those, I think, were the tools. We were the first company that really came out with a functional, semiautomated, at the time, PCR system for doing quantitative PCR, as opposed to the original version, which was sort of a binary type test. And that wound up being a pretty big business. DNA sequencing was always the biggest of those.
But that little mass spec business that I mentioned that moved over when we merged with PerkinElmer also ended up being a half-billion-dollar-a-year sales business, and it was focused on looking at larger molecules than the small stuff that mass spec at the time was mostly focused on. Proteins and looking at a lot of studies in the pharmaceutical world of how drugs get metabolized. Which is a big deal because if you don't know that, you don't know what's happening once you give somebody a drug. And so, it was all focusing on looking at analysis of biological material.
The Origins of DNA Sequencing
ZIERLER: The term automated DNA sequencing, is that to say that back in your Caltech days, DNA sequencing was not automated? What would you call it? Manual? How would the sequencing happen before it was automated?
HUNKAPILLER: It was manual. The original procedure, which we sort of modified in a sense to make it semi-automatable, was developed by Fred Sanger's lab in Cambridge. And the process there was, you start with a piece of DNA, and you use the natural process of copying a piece of DNA, which involves a small starter piece of DNA that binds onto a portion of that target DNA that you're trying to sequence, and then, polymerase, which is an enzyme that basically extends that little piece of DNA one step at a time using the target DNA that you're trying to sequence as a template for that. And so, you're measuring whether or not you're adding an A, G, C, or T in each little step. And the way that was done was a process involving–adding those four bases in a cocktail along with the polymerase, and the enzyme starts its thing.
But you have a tiny amount in a particular reaction of, let's say, one of the bases that is a poison, so if it gets added, the polymerase has to stop. And you do a tiny amount because you don't want it to stop, in a large population of molecules, all of them. You want to stop just a little, and then have the thing still going so next time you come to that nucleotide type needing to be added, it will stop a little bit more. And so, you wind up with kind of a ladder. And if you measure the length of the pieces from the start, you know along that length, every time you had that particular base added. And you do four separate reactions, one for each of the four bases, and you wind up with four ladders.
And you separate those on an electrophoretic system, which separates on the basis of size. And you line up the four reactions and their ladders, and you can kind of go down from smallest, to next smallest or somewhat larger, one base at a time and kind of read the pattern out. And the way that pattern was read out is, those modified bases were labeled with a radioactive tag. And so, you basically spread all these things out on a big electrophoresis plate, a gel material, and stop the separation, slap a piece of x-ray film onto it, and let it develop. And you get an image of the separation pattern. And then, you sit down with a piece of paper and a pencil, and you go through, and you decide, "OK, first one is an A, next one is a T, next one is a G," and so forth.
And you write those down. That was manual, to say the least. The automation was replacing the radioactive tags with fluorescent tags. And in the form that we did, it was four different fluorescent tags, one for each of the four types of bases. Then, you could mix them all together and separate them in one track in an electrophoresis system. And more importantly, you can have a detector near the end of the gel that was looking for the color bases that were coming through at any point in time. And that went directly into a computer, which interpreted whether you had a red, green, blue, whatever, signal coming out. It was automated in the sense that instead of manually looking at the gel with your eyes and asking, "What's happening?" you're having a computer interpret the pattern because it's getting data fed in automatically.
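[Editorial illustration, not part of the interview.] The read-out logic described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the software used at Caltech or Applied Biosystems: the function names, the example strand, and the dye-to-base color assignment are all hypothetical, and real data would require signal processing that is omitted here. The sequence read is that of the newly synthesized strand, which is complementary to the template.

```python
# Simplified model of Sanger read-out. All names and the dye colors are
# hypothetical, chosen only to illustrate the logic described in the text.

def make_ladders(synthesized):
    """For each base, list the fragment lengths at which a chain-terminating
    copy of that base stopped extension (one 'ladder' per reaction)."""
    ladders = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        ladders[base].append(length)
    return ladders


def read_four_lanes(ladders):
    """Manual-style read: walk from the shortest fragment to the longest and
    note which of the four lanes contains the band at each length."""
    bands = sorted(
        (length, base)
        for base, lengths in ladders.items()
        for length in lengths
    )
    return "".join(base for _, base in bands)


# Assumed color assignment; the actual dyes are not specified in the text.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def read_one_lane(detector_events):
    """Automated-style read: fragments pass the detector in size order, so the
    order of colors seen is the order of bases."""
    return "".join(DYE_TO_BASE[color] for color in detector_events)


strand = "ATGCGTAC"  # hypothetical newly synthesized strand
ladders = make_ladders(strand)
four_lane_read = read_four_lanes(ladders)

# Simulate what a single-lane, four-color detector would report.
base_to_dye = {base: dye for dye, base in DYE_TO_BASE.items()}
events = [base_to_dye[base] for base in strand]
one_lane_read = read_one_lane(events)
```

In this model both read-outs recover the same strand. The difference is that the four-lane version needs four separate reactions lined up side by side, while the four-color version needs only one lane and produces a stream a computer can interpret directly.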
ZIERLER: Besides the obvious benefits to efficiency, what else is valuable in automating DNA sequencing?
HUNKAPILLER: I'll tell you a story, which comes a little bit later, but it gets to the point. The initial system that was commercialized was using what's called a slab gel, a big electrophoresis plate, or two plates with the gel sandwiched in between it, and you put each end of that big plate–and sometimes, they were four or five feet long–into a bath of electrolyte solution, and attach electrodes to each end, and turn the power on, and the DNA migrates through it on the basis of size. And the whole process of making the gels, and reading the gels, and so forth was a complicated process. In Lee's lab, he had a technician who made all the gels for people in the old, manual process who only worked at night.
Which was a lab crew mandate because the failure rate on making those things was pretty high. And when she would get frustrated, she would tend to throw them across the room at the wall, we've got glass going everywhere. And no one would work next to her. But she made a really good plate, so it was tolerable. When we were trying to figure out if we could take this technology into sequencing the human genome, this was after we got commercialized, and it was successful as a technique in the fluorescent, semi-automated process, I went to a group in the UK, who was one of Applied Biosystems's early customers, and asked about their process for scaling up, how they're going to jump up to sequencing the human genome at that point.
And they had a very efficient system that was the group of about eight people who were responsible for making the gels, making the samples of DNA, prepping them for the sequencing reactions, and so forth, carrying that out. Making the gels, loading the gels, running it, making sure the data got transferred into the computer properly, and interpreting the computer data from one sample in a few of the other samples they were running, just how to stitch the stuff together. And that little group of eight people had a certain amount of productivity. And their process for how they were going to scale up was hiring hundreds of those groups of eight people. They calculated that you could do it if you did that.
The problem, of course, is they had no clue where they were going to find all those people. And so, our next task, and it's where we figured out that we could actually do it at the human level, was taking another step in the automation process that eliminated the gel bit completely and allowed you to run multiple samples over the space of a week, in which you load the samples in a tray first, and then the system from then on does everything. It loads the sample, runs it, collects and automates analysis of the data, flushes out–the gel material was not jello at that point, it was a liquid that we used, which is how we were able to do it.
And it moved from what are called slabs to capillaries, which are tiny tubes that you just put your liquid separation material in, inject the sample at the beginning, it runs through, and then you can flush out what was left over from that. And with a group of eight people, you could do 100 times as much work in a period of time. You didn't have to handle the same number of people in order to make it work.
And that's when we approached one of our other customers, which was the group that Craig Venter headed up–and he was sort of halfway at NIH and halfway leaving the NIH and setting up his own institute at the time anyway–on the fact that you could actually do this with a nontrivial number of people, but it wasn't in the tens of thousands. It was when we figured out how to fully automate the process that something on the scale of the human genome was possible. And we worked with the NIH group and the Sanger group who were also involved in that effort to sequence the human genome at that point along the same lines.
ZIERLER: It's well-known, of course, the central role that the NIH played in the Human Genome Project, but what about the Department of Energy and the national labs? Did you interface at all with them?
HUNKAPILLER: Yes. One of the things NIH doesn't like to emphasize is that a lot of the initial impetus for DNA sequencing and something akin to sequencing the human genome came out of the DOE, not NIH. And their interest was in understanding the impact of radioactive damage on chromosomes. You can imagine where that came from. And so, they were involved in it from the start. They created their own lab here in the Bay Area. But NIH took it over. NIH just had the bureaucracy and the number of labs around the world or the US to participate in it. But you interacted with both agencies.
ZIERLER: When did the company start to get involved in DNA forensic analysis, and at what point did, for example, law enforcement agencies start to take notice of this?
HUNKAPILLER: I don't know whether you know the history of DNA analysis, but it started with Alec Jeffreys in the UK. And there's an interesting story on that, but you can read it somewhere if you want to, as to how he came up with the idea. The problem with it was that it was a little too artsy. It was hard to read the patterns a little bit and figure out the way you could kind of encode it in the database that got beyond one person. But when PCR came along, there was the potential for doing that in a way that you could do that because PCR gives you very defined pieces of DNA that you're able to analyze, and you come up with the right way to do it.
And we realized pretty early on the system that Hoffmann-La Roche, who bought the PCR rights from Cetus, was using to do that, mostly for paternity testing at the time, but was also becoming applied to forensic analysis, could be analyzed very well with the DNA sequencing system. Because the DNA sequencing system was measuring automatically the lengths of fragments of DNA. And however you got those fragments, in our case, mostly DNA sequencing, in the case of that was a PCR reaction, was a better way to do that. And we came up with a scheme for how to do that. We worked with the FBI, the Armed Forces Institute of Pathology, which handled it for the military, and the Home Office in the UK to kind of validate that technology.
And we got into it from the perspective that we had this thing from our relationship with the PCR group. Because we owned all the instrument rights, Roche owned all the chemistry rights, but you needed both. And so, they focused with Roche in the clinical diagnostics world. We focused on the non-clinical-diagnostic areas. And we provided each other, in our case, instruments to them, and reagents to us to make the whole system. And it was kind of a natural fit for how to take the technology that we had from an instrument perspective and apply it to that particular application.
DNA and Translational Science
ZIERLER: At the turn of the century, you are very well-recognized with some significant awards. The Edman Award from the Protein Society, SBS Award for Achievement in Biomolecular Screening, the National Biotechnology Award from the National Conference of Biotechnology Ventures. To put it all together, what was making waves exactly? Why were you and your company being recognized for such important work?
HUNKAPILLER: It helps to be first sometimes. And when Applied Biosystems got started, the company that was most known for focusing on biological-type analyses systems was Beckman Instruments. And I'm sure if you've done any history of Caltech, you know Arnold Beckman from Caltech. And we had tried to interest them in some of the early technology stuff with Lee's lab multiple times, and although Arnold was not running the company at that point, he still had some input into them. And he would get very interested, tell his people to come look at what we were doing, and they'd come and say, "Ah, it's not worth anything." Multiple times. It was pretty frustrating. And so, we got started with a startup company because we got turned down by Beckman a lot.
But Beckman Instruments, at that point, was mostly focusing on developments in clinical diagnostics, and none of these tools were looked at as clinical diagnostic tools at that point in time. And that's why they didn't get interested. It kind of left open the field in the sort of basic research side for new instruments to get made. And Applied Biosystems got there first is what it amounted to. And the stuff worked. It kind of revolutionized what was an old system for doing protein sequence analysis. It made the first automated tools for doing DNA synthesis, which became really important once PCR came along because you need to make these little pieces of DNA to start whatever the PCR reactions are. It was being first and being successful. That's kind of what happened.
ZIERLER: And then, in 2003, what was it like when Caltech recognized you with the Distinguished Alumni Award?
HUNKAPILLER: It was nice. [laugh] I've never been that big on personal rewards. I like to tell people I like team sports far more than individual ones. Which you can tell, if you can read the jersey and the patch here. It was more, from my perspective, a recognition of what we'd accomplished at that point at Applied Bio.
ZIERLER: And when did you get involved with Alloy Ventures?
HUNKAPILLER: I was at Applied Bio for 21 years, and we were just about at $2 billion in annual sales. At which point, it becomes much more of an executive management operation. And we had really four quite diverse product lines already and a pretty good team to head each of those groups. One of the very early VC investors in Applied Biosystems, which was a VC-started group, as I mentioned before, I kept in touch with for a long time. He was on the board for a while. We spun out a little company initially to be into the therapeutics world that he was also one of the early investors in and was a chairman there. Had been after me for a while, and I finally caved in and said, "OK. I've got to try that." And it was time to turn over the reins to somebody else at ABI.
ZIERLER: And what was your position at Alloy? What did you do there?
HUNKAPILLER: I was a general partner. You're looking for new investments, helping decide to do investments in particular companies that you're bringing in as well as helping the companies sort of get off the ground, if nothing else. But you wind up being on their board.
ZIERLER: How big of a switch in field was this for you?
HUNKAPILLER: For me, personally, it wasn't. It was all in a scientific instrumentation, biological analyses-type field that I focused on. The firm was much broader than that in the sort of things that the firm as a whole invested in, but that was my focus. And that's why I was brought in to do that.
ZIERLER: When you were named to the National Academy of Engineering, just from a business perspective, it's such an enormous honor, I wonder if that was valuable to you at all in terms of the connections it might've made or doors it might've opened up to you.
HUNKAPILLER: I don't think in any large degree. I was already pretty well-known even in the VC world around here. Although, I wound up doing a few investments outside of the Bay Area, I was more focused on this area so I wasn't traveling all over the place.
ZIERLER: What's the origin story for Pacific Biosciences?
HUNKAPILLER: The technology came out of Cornell. The people who started it, from a technology perspective in Cornell–it was a group headed by a guy named Harold Craighead, and that was mostly a physics-based technology company. Steve Turner came out of his lab. And they were looking at what happens with the sort of microstructures in light and had this concept they called zero-mode waveguides, which could be little, tiny things in which light of a certain wavelength can't go all the way through. And they used that as essentially a way to isolate a single molecule and the fluorescent activity going on around that molecule without it being clouded by neighbors. And they worked with another group, and I don't know who the senior investigator there was, but it's where Jonas Korlach came out of, who was more a biochemist, molecular biologist, and came up with this concept of how to do DNA sequencing that way using a single molecule of DNA, whereas before, you had basically a clonal population of identical molecules that you started with.
If you did a range of molecules, and you just tried to follow in real time, if you could, the base additions with fluorophores without this trick of stopping them, a tiny portion of the population each time, you couldn't make sense out of it. Because a different molecule is not going to get copied at the same rate as its identical clone partner is. It gets jumbled really quickly. But if you did them with a single molecule, you could do that. That's what their goal was. And at Applied Biosystems the last year I was there, we were looking at what was going to happen after the technology that we had because we knew it'd been on the world for a fair amount of time at that point, and technologies don't have an infinite lifetime.
And we'd actually identified the stuff out of Cornell as a potential. You have to get there with a lot of inventions to make it really work, but it was interesting. And we approached them, and by that point, one of the inventors, Steve Turner, had already connected with a venture firm in the Bay Area, Mohr Davidow, to create a company to commercialize that. I think Steve set the company up, and then they came in and decided they were going to fund it and move it out to California. So at AB, we missed out by a month or so.
That was in the spring of 2004, and the fall is when I left and went into the VC world. And one of the venture partners, Brook Byers at Kleiner Perkins, was looking at investment in that company along with Mohr Davidow. He called me up and said, "What do you know about them?" So I went through the story with him. He invested, and that was one of my early investments at Alloy Ventures as well. That's how that connected there.
ZIERLER: What were some of the big strategic initiatives there? What was exciting to develop?
HUNKAPILLER: It was the most complicated instrument that I've ever been associated with from a technology perspective because they were breaking ground in so many disciplines. Not many people did enzymology with single molecules at that point, and they didn't have ways of interrogating not just one individual molecule, but you had to do a lot. Because the equipment required to do one molecule was way too expensive from a cost perspective to generate a DNA sequencing tool. You needed to be looking at lots of pieces of DNA simultaneously. It wound up being about a 2,400-pound instrument.
And it was about eight or nine feet wide by three feet deep. It had four light paths in it because you're detecting light coming off of each one of these little holes in four different colors. And the individual light path coming out of that little, tiny hole, which is about 140 nanometers in diameter, and there were 75,000 of them, you had to break four different laser lights into 75,000 little beams and focus those individually on one of those little holes that had your sequencing reaction going in it. And then, as the light came back fluorescent, you had to take that bunch of lights and focus them on cameras. And the light path from the little hole that hits the cameras was about two meters long.
And all this was wrapped inside this big box. And you were dealing with numbers of photons in the high tens to 100 or 200, so it was pretty high-precision and sensitivity in order to do that. And then, the software for interpreting all that stuff was horrendous as well. And then, there's the chemistry for how you prepare the samples, how you do the chemistry, how you make the right kind of dyes to make all this work. It was pushing the envelope every step of the process. And from a technologist, that was exciting, to say the least. But it also had big potential.
ZIERLER: To what extent was the rapidly expanding world of computational power relevant in this?
HUNKAPILLER: It was, to some degree. The problem that we were trying to solve was this: the next generation beyond the original Sanger stuff that Applied Biosystems managed was what became the technology that was in the end developed commercially by Illumina, and it's made them into the biggest company in the space. And the good thing about that was, you could at the time sequence millions of pieces of DNA, now tens of billions of pieces, simultaneously. The problem was that the pieces of DNA that you could look at were really short. 100 base pairs roughly. And the problem is the same one as working a jigsaw puzzle.
You take a jigsaw picture, and you split it into ten million pieces. Putting those together is going to take you a long time and not be very easily done if you have a lot of areas of the picture that are identical. Because you're doing it by sticking the pieces together in the jigsaw puzzle the right way. And it's just hard if they're all nothing but solid green, and you've got 1,000 pieces that are just that. And that's kind of what the little piece problem was of putting it together as a whole genome. And so, for a long time, what's called the short-read or next generation sequencing tended to focus on things where you didn't need information so much that was stitched together, but you could do it from a piecemeal perspective.
But that kind of gave up a complete reading of a whole genome. And the one thing about the Pac Bio technology is that it didn't suffer from that problem. You could think in terms of doing really long stretches of DNA from a single DNA molecule. And big pieces in a jigsaw puzzle or in a sequencing stitching together are much easier to put together than a whole bunch of little pieces. And so, that's what was exciting about the technology, and it opened up an opportunity there that you couldn't easily address with the short-read Illumina technology that was in the field at that time.
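The jigsaw analogy can be made concrete with a toy example. The following is a minimal sketch, not a description of any real assembler: the sequence, the repeat unit, and the helper name `count_placements` are all invented for illustration. It shows that a short read drawn from inside a repetitive region fits in several places, while a long read that spans the repeat and its flanks fits in only one.

```python
def count_placements(genome: str, read: str) -> int:
    """Count the start positions where `read` matches `genome` exactly."""
    n = len(read)
    return sum(1 for i in range(len(genome) - n + 1) if genome[i:i + n] == read)


# Hypothetical toy "genome": unique flank, tandem repeat, unique flank.
REPEAT_UNIT = "ACGT"
genome = "TTGCA" + REPEAT_UNIT * 6 + "GGATC"

# A short read that lies entirely inside the repeat (the "solid green" piece).
short_read = REPEAT_UNIT * 2

# A long read that spans the whole repeat plus a little of each flank.
long_read = "GCA" + REPEAT_UNIT * 6 + "GGA"

short_hits = count_placements(genome, short_read)
long_hits = count_placements(genome, long_read)

print(f"short read fits in {short_hits} places")
print(f"long read fits in {long_hits} place(s)")
```

In this toy case the short read has multiple equally valid placements, so it cannot say how many repeat copies there are, whereas the long read is anchored by the unique sequence on either side.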
Leading Pacific Biosciences
ZIERLER: As president and CEO of Pac Bio, to what extent were you focused on diversifying the company into, for example, plant sciences or infectious disease?
HUNKAPILLER: Pacific Biosciences is not an infectious disease company from a therapeutic perspective. And it's not an agricultural company from a perspective of breeding new plants or new animals. It's providing the people doing that with the tool to help them do it. And so, you're interested in what the issues and problems are in those applications so that you're designing the tools that you're providing them to do the appropriate things necessary for those applications. And you learned a lot about them in that process, but you're not a practitioner directly.
ZIERLER: Can you explain the SMRT sequencing project and what made it so versatile with all its applications?
HUNKAPILLER: SMRT is "single molecule, real-time." The single molecule was what you needed to think about getting really long reads as opposed to a collection of molecules, and you're trying to get a single sequence out of a bunch of molecules that are all identical, but the chemistry you're going to apply to them is not uniform in time, so you get out of phase real fast. And so, single molecule got around that problem, and that allows you to get long stretches of information from a single molecule. And then, the real time was, in order to do that and do it efficiently, you needed to be monitoring that process in real time with as close to the normal process for the way DNA is copied as possible. Cornell came up with this idea, it was the way to think about doing long reads again with sequencing.
And initially, we were trying to get up to the same lengths you could get with the old Sanger method on these big gels, which is up to maybe 1,000 bases of DNA in a single piece. We went way beyond that. But getting up to that point distinguished us from the short read approaches, which were, say, 100 base pairs at the time. And that meant that we could get into looking at problems, say, for genome assembly, where you're trying to, in the end, stitch something together that may be three billion base pairs long, like the human genome, or problems even in individual genes, where you need to be able to distinguish a molecule that maybe is three or four thousand base pairs long, but may have a series of very closely related counterparts in the gene as well. And if you can't go from one end to the other on each individual one, you'll wind up with short reads, kind of mixing and matching the pieces from different homologues of these genes.
ZIERLER: Because one of the applications for SMRT is RNA sequencing, which obviously is in the news right now, would you say that Pac Bio, even from a basic science perspective, played a role in the development of the COVID-19 vaccine?
HUNKAPILLER: I don't know the details of what Pfizer, or Moderna, or J&J did in their early research. They were all customers of Pac Bio. The Coronavirus that's responsible for the pandemic is a fairly short strand of RNA, less than 30,000 base pairs long, which, with a little push, you can sequence, even with short-read stuff. As long as you have a fairly defined population. It got harder as all these mutants came out to do it with really short pieces. The labs in China were also our customers. We had a pretty big business there. And we had instruments in Tony Fauci's lab at NIH and at the CDC.
And we expanded that. I think the bigger use of the sequencing technology–because once you had the basic sequence, even from China, the RNA vaccine people knew what sequence to make to put into their vaccine, and they were off and running at that point. The use of the technology at the sequencing level has come more in the analysis of the mutation patterns as the virus evolves over time. Because that's a sequencing problem. And you need to know what happens when those mutations start taking hold. Delta variant wasn't the initial one that was a big deal, but it took over the world really fast and to devastating effect. And so, a sequence that comes in where you're trying to follow that pattern and you get all these variants–and the initial thought was that Coronaviruses don't mutate very fast because they have what's called a proofreading polymerase, or reverse transcriptase, but that turned out not to be true when you have that many people getting infected all over the world.
And they start getting challenged with things that are trying to kill the virus off. You tend to pull up subspecies really fast because you have errors in any rapidly growing system that involves nucleic acid copying. Our system is used there fairly extensively. And long reads help when you start getting a lot of variation because you get a population of one individual now that's got multiple sequences in it.
ZIERLER: Moving right up to the present, when you decided to retire, are you enjoying a true retirement now? Are you still involved with Pac Bio or other boards?
HUNKAPILLER: I'm a consultant. The company just officially reopened last week to everybody going to work since March 15 of 2020. I have done and continue to do a lot of Zoom calls. [laugh] I sit in mostly on a lot of the R&D meetings, which is my first interest anyway. But I'm other than that retired. I've been asked, but I haven't made a decision to go on any other boards yet. I was on a lot of boards, as you saw on my CV, from the time I was at Alloy. And now that it's a little easier to travel around here, as long as that holds in the next couple of months, I may add to that. But who knows?
ZIERLER: Now that we've worked right up to the present, for the last part of our talk, I'd like to ask a few broadly retrospective questions about your career and research, and then we'll end looking to the future. First, on the advances side, between the technology and the intellectual leaps and bounds, just understanding the basic science, what stands out in your memory as some real game changers over the years that really advanced the field exponentially and not incrementally?
HUNKAPILLER: There's a paper that I think I'm the first author, or at least the senior author, on that talked about a microchemical facility for biological analysis or something like that. And I used to give talks early on at Applied Bio when I would do business talks to the outside world. And I distinguished chemistry from biology in this grossly oversimplified way. Chemists got used to using automated analytical instrumentation really early in the process. It thought of itself, in a sense, as an automated tool discipline in a lot of ways. Biologists early on were agar gel plates and fruit fly models. Or back to Mendel, growing plants and writing things down, see the results of it. It was a very manual process.
And it was still that way to a large degree when we were thinking about how to apply modern chemical analytical techniques to at least a lot of biological molecules. And I give Lee a lot of credit for this. It was knowing that the field needed to get infused with a lot more real analytical automated technology in order for it to advance, particularly in the DNA world, where the things you could do, which were pretty remarkable, understanding how DNA worked, were always going to be inhibited until you got detailed structure of the DNA out in a big way. And so, it was this concept that biology needed something like an Applied Bio or whatever. I think it was a pretty good insight into it.
And like I said, I give Lee credit for coming up with that idea. I did a lot of the implementation of it within his lab. But that as a fundamental concept was hard to get NIH to grab onto. Lee tried multiple times to get funding from NIH early on for some of these instrument developments. He couldn't do it. It wasn't their role, I think was the general message from the peer review panels. And so, he went to local sources at some level. He got a fair amount of money for an instrument development group in his lab, or we did, out of what's called the Weingart Foundation, which is some wealthy industrialist who had Alzheimer's, and his family was interested in putting money into that field. And he was able to persuade two or three of the big industrial companies, Monsanto was the biggest one, to put money into it, to fund research in his lab before he could interest somebody like NIH to do it.
It was a hard thing to kind of jump beyond that biomedical research was not an institute that was doing fundamental instrument development very easily at that time. It seems foreign now, but it was really true at the time. At least in the US. I think that was the first big thing. And that's what kind of converted me from thinking of myself as more of an academic research scientist on a particular scientific problem to being into more the development of tools perspective. And the concept there was that you really had to have a lot of disciplines working together. And, you know, there aren't many da Vincis around anymore. You had to have a lot of really sharp people in different disciplines working together in a program. That's kind of what made me switch from more Caltech. As good as Caltech is at that at some degree, each individual professor is a little fiefdom that they run.
They would collaborate with their partners, but the fiefdom is the primary purpose most of the time. JPL is a little different in that sense. And I'm stretching the point a little bit with the individual groups, but it was hard to get them devoting a lot of their resources there because the groups, by and large, are pretty small. They tend to be pretty focused. So that was one. And then, I think the second one was that massive DNA sequencing opens up the ability to study things that I don't even think Fred Sanger realized completely early on. It was an intellectual exercise with Fred. Brilliant guy. But the ramifications of what you can do with DNA sequencing now and do it quickly, I think, would've been hard put for people to realize before the technology got to where it is today.
You could easily have done a PhD thesis on sequencing one small gene's DNA not that long ago. [laugh] You can't do that today, I don't think. Because you can sequence the human genome in a day today. Even from where you were 15 years ago, doing that the first time around. It's just a different world. Heaven forbid what would've happened with COVID if people hadn't been able in China to generate that DNA sequence really early on, and we were struggling with figuring that out now.
The Clinical Value of DNA Sequencing
ZIERLER: If you can step back, from all of the advances in the instrumentation, and of course, the business side of things, all of this funnels to advances in medical science and human health outcomes. Even though, to some extent, you're not involved at the clinical or therapeutic level, what have been some of the advances that give you the most satisfaction that at the end of the day, all of this has led to better medicine, better health outcomes?
HUNKAPILLER: Obviously, I mentioned one of them just now. Pac Bio has moved more into the clinical uses of DNA sequencing than we started off because, again, it has value there. The first of those, in a weird way, don't classify it necessarily as clinical, but DNA forensics. In social meetings, most of the people wouldn't have a clue what you were doing or could understand it in simple terms. The only story I would lead off with was CSI or whatever it was at the time. Everybody knows about DNA forensic testing. That didn't exist really in a big way before DNA sequencing came along.
And it's one of the things we recognized at ABI pretty early on that could be a pretty big deal. And for Thermo Fisher, now, it's a pretty big business. And they still sort of dominate the space there. But you have similar things in the clinical space. The first one we got involved with at Pac Bio was in HLA typing, which is involved in tissue transplant rejections. It's a complicated set of genes, and there's enormous diversity in each one of those from one person to the next. But there's also this closely related family of genes that are different but similar in certain regions. Analysis of them, even with short-read sequencing, is limited. And the tests that have been developed over the years, which are DNA sequencing-based tests, came from the original stuff at Applied Biosystems, are OK, but you can only look at portions of those genes.
And people have found once they used Pac Bio technology and long read, where you can sequence in one go the entire sequence of each of these genes and not worry about getting confused by the other genes that are in the same family and so forth, it means that you can get information you couldn't get. And it turns out that your body can recognize those differences. And so, you get better outcomes if you can sequence the whole genes. It's become more of the standard for how to do things that aren't emergency sequencing needs. You have registries now of people who submit their DNA sequence material, it gets sequenced, it gets put into a database so that if somebody needs a bone marrow transplant, they can find the best possible match because that's kind of an all-or-none thing.
If you don't get it right, the person dies. Because you've obliterated their own immune system, and you're giving them a new set of bone marrow. If it fights the host that you put the sample into, it'll kill them because they recognize it as bad or different. Things like that, even at the gene level, became really important. And there's a whole host of genes that are like that, that you just can't get the right information on by looking at pieces of the genes sometimes, or little bits of them. And where that's essential is the genes that are responsible for metabolizing drugs. It's a similar problem. You've got complicated genes, and they have all kinds of mutations and stuff in them from one individual to another that make them over-metabolize a drug or under-metabolize it, and that determines how much dosage you should be getting, if nothing else, or whether you should be getting that drug at all.
All those things are now coming to fruition, which was the dream of the NIH people for the Human Genome Project. It's taken a while to kind of go from having the information on one sample to how to look at the differences from individuals and their chromosomal makeup, and applying that into medical treatment. And that's just blossoming now in ways that it wasn't before. Even now to the whole genome, where you have these little kids who are born, and they don't thrive. Why? Is there something in their genetic makeup, if they can't identify anything else, that's causing them not to thrive? And more and more of those are being subjected to whole genome DNA sequencing to figure out what those things are. And they're much more complicated than just a single point mutation, one base is changed in the way you have a problem like sickle cell, where there are chunks of the DNA that are missing, or moved into the wrong place, or they're copied too many times, or any number of things.
And that starts to get the attention of large-scale medical practices. And the UK did it first, in a sense, because they have a public health system, and they control the healthcare for everybody. And they started a program several years ago for collecting DNA for everybody. And their goal is to get every newborn kid DNA sequenced over time. And that's been copied all over the place. The Middle East, several countries in Europe. The US, with the All of Us program, to a degree. In China. It's become sort of a key part of medical practice. My personal physician occasionally does these little tests, and you go in, and you get one little thing looked at to see whether you've metabolized statins properly or something like that.
And those are interesting, but when you get to the whole genome level, it opens up a lot of possibilities. You have to be careful about privacy and all that kind of stuff. But it has the potential for helping people understand what they need to be careful about in their health. Whether they pay any attention to it's a different matter. But it can guide medical practice in ways that doctors are starting to understand the importance of.
ZIERLER: Last question. Looking to the future, you're a coauthor on a paper that's in press with Science right now, the complete sequence of a human genome. I wonder in what ways this paper serves as a capstone for a project a long time coming, and in what ways it heralds the next chapter in human genomics?
HUNKAPILLER: The human genome sequence had been sequenced multiple times, although it hadn't been finished. And so, the problem was that, even with the length of sequence that you could read with Sanger, there were regions that you couldn't decipher early on. And it was pretty "holey," in a sense. And the average size of the pieces that could be assembled was about 20,000 base pairs long. When you divide that into three billion, you still have a lot of pieces left after you've sequenced all of these pieces that not only have pieces that cover everywhere, but you don't know where to put them together, you don't always know where to place them.
And so, there were still kind of little chunks here along the chromosome. And over the years, that got refined, but it was still missing a lot of areas. You have areas at the ends of chromosomes that are highly repetitive, and you have similar sequences repeated in tandem over, and over, and over. And understanding that, putting them together with short-reads is impossible. You had much bigger regions roughly in the middle of chromosomes that are involved in how chromosomes are segregated during cell division. Because you've got two copies, one from your mother, and one from your father, in every cell. And when the cell's dividing, you have to pull the 23 pairs of chromosomes apart. If you get both the copies going into one cell, and it's missing in the other, you have a problem.
And there's a mechanism, and that's involved in a lot of the stuff that goes on in this complex region in the middle of the chromosomes called centromeres. And that was completely missing from the original human genome sequence. And it's because there are very complicated, repetitive areas there where you have big chunks that are multiple copies of a tandem. And short reads were hopeless. Only recently have the longer-read technologies gotten to a point where you can do that. And this was the first one in which, from a human perspective, they had done that completely and done from a single–it really wasn't a single individual. It's almost a full human chromosome. It's a cell called a hydatidiform mole, which is kind of a misnomer. It's basically a clonal cell line from a female cell that has essentially fertilized itself. I'm probably doing a disservice to the way the thing was generated.
But both chromosomes are now identical within the cell. It didn't come from one parent and another parent, it's from one parent. And it made it easier to sequence it because you don't have the interference from a slightly different sequence in every position twice in a single individual. But it allowed you to go from one end of each chromosome to the other. It required long reads. And it required highly accurate long reads in order to get there. And so, one of the things that was unique about PacBio's long-read technology–there's now another long-read technology that's out there–is, not only can we sequence a long piece of DNA that's 20,000 base pairs long, say, reading it down once, we construct it so that it's basically in a circle, so you can go around that circle many times.
So you can interrogate each base position multiple times. And you will make mistakes in any individual read, but it's random, the mistakes that you make. If you just do a simple vote–not quite that simple, but more or less it is–you get a very highly accurate read on even something that's that long. And you need that when you're dealing with regions that are long and repeated and have a bit of difference from one repeat to the next, but not a lot of differences. And so, that kind of enabled that as much as anything else, to be able to do that. It means that you can think, now, about doing that level of work on any genome and not have big stretches that are just invisible to your analytical tool.
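[Editor's illustration: the "simple vote" described above can be sketched in a few lines of Python. This is a toy model, not PacBio's actual consensus method. It assumes the repeated passes around the circle have already been aligned to the same length and that errors are simple substitutions; the function name `consensus` and the example reads are hypothetical.]

```python
from collections import Counter


def consensus(passes):
    """Majority vote at each base position.

    Assumes (for illustration only) that every pass is already aligned
    and has the same length; real reads also contain insertions and
    deletions, which this sketch ignores.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")
    # zip(*passes) yields one column of bases per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


# Five made-up passes over the same 8-base molecule, three of them
# carrying a single random error at different positions (0-indexed).
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # error at position 4
    "ACCTTAGC",  # error at position 2
    "ACGTTAGC",
    "ACGTTTGC",  # error at position 5
]
print(consensus(passes))  # prints ACGTTAGC
```

[Because the errors fall at different positions in different passes, each one is outvoted by the correct base from the other passes, which is the intuition behind the accuracy gain from going around the circle many times.]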
ZIERLER: It seems like the developments will only continue from here on out.
HUNKAPILLER: They will. [laugh] You asked a while back about computing. You needed a lot of computing, both to collect all the data–because we're now doing it simultaneously, even in this continuous process, where you're collecting data every 10 milliseconds, and you're doing it on ten to hundreds of billions of pieces. That's a lot of data in real time. And so, part of the process is up front just collecting that. You can't store it that fast, unless you've got the resources of the Large Hadron Collider. And coming up with algorithms, as well as the right sort of fast processing tools, that allow you to do that at that rate up front is where the compute problem comes in. For a long time, it was the problem of how you stitch it together. That's become much easier, particularly when you're dealing with large pieces that, by themselves, are highly accurate reads as opposed to noisy reads. But that data processing up front is half the cost of the sequencer. [laugh]
ZIERLER: More computing. That's the story here.
HUNKAPILLER: Yep, more computing. Faster computing.
ZIERLER: Well, Mike, it's been a great pleasure to be able to engage you in your life and your career. I'm so glad we were able to do this, so I'd like to thank you so much.
[End]