Databases, consistency, ACID, space, time. When you sit down to talk to Pat Helland, any of those things might come up, and today all of them did. We started out with a paper that he wrote for the Communications of the ACM about a year ago called Space Time Discontinuum, which is about what happens with distributed systems at a distance. And honestly, we talked about a lot of interesting things. We love talking to Pat, and you get to listen in on today's episode of Streaming Audio, a podcast about Kafka, Confluent, and the cloud.
Hello, and welcome to another episode of Streaming Audio. I am as always your host, Tim Berglund, and I'm joined here today by my friend, Pat Helland. Pat is a principal architect at Salesforce, Pat, welcome to the show.
Well, thank you Tim. And I want to point out principal architect is not the same as an architect with principles. The word is spelled differently.
They are spelled differently and it's difficult to get that right.
It is. Principal with a P-A-L architect. Yes.
P-A-L. That's not principled architect.
Maybe you are.
That's a different discussion.
It is.
We could figure that out.
Yes.
Yeah, we're not saying that you're not, we're just also not asserting that you are.
Correct.
If there's anybody who doesn't know Pat, who hasn't been exposed to your work before, tell us a little bit about your background.
Well, let's see, I dropped out of college in 1976 because I had an instant family and I was a kid. And then I eventually got a job being paid to program and build databases and transaction systems and distributed systems in 1978. So I've been doing that for, I don't know, 43 years or something. And it's been a lot of fun. I've been writing papers since the eighties, but really a lot since 2000, about how to think about databases, how to think about data, distributed systems and so forth. And I'm fortunate that ACM lets me publish articles about crazy things I think of, including one we're going to talk about today. But one of my joys is they just let me take random, silly ideas and run with them and tell stories that are impactful in how to think about computing. And so, that along with building complex systems is how I keep off the streets.
I like it. Been doing it for 43 years and it's working out pretty well so far.
Yeah. I don't think I'm going to change careers at this point. It's kind of...
No.
Yeah. I was just telling you a minute ago, I'm going to do this as long as they don't throw me out the door, take me out feet first. Because it's fun. This is an interesting and fun area and I'm super lucky. So.
It really is. And it's clear that you have fun with it. And I like that.
It's good.
You published this paper a little over a year ago with ACM and it was called, Space Time Discontinuum?
Correct. I was thrilled. It was the first paper I was able to cite Monty Python in.
You are going to have to tell us how, don't tell us yet. But if it doesn't come up I'll make sure-
It will come up.
We'll make it come up. Yes. We're in the spirit of Chekhov's gun. You can't just say a thing like that and not have it happen. Anyway, talk to us. What's the-
This paper, it's kind of fun when you publish the papers. You either write a 6,000-word paper or a 2,000-word shorter piece. And this I think was one of the shorter ones. The idea here is, if you're building something in a distributed system and you need data from multiple places and you need to do it on a schedule, you can't always count on that stuff coming here. Because for things that are not here on this computer right now, right here in my hand, there's only a probability of whether you're going to get them in time and when you will get them.
So if I'm going to do a report based upon the stuff I'm learning from 20 different machines and some of them are here close, and some of them are far, usually I can get them by the deadline I've picked. So I can analyze all 20 of these things to produce a composite result. And that's fine. Then maybe I can give you your daily report or your hourly report if I can get that input on time.
But if you're at a distance, you can't always guarantee the stuff's going to get here on a schedule. So now, if I needed these 20 input things to create my report and I have 19 when it comes time to create the report, what do I do? You have two broad choices. One choice is to wait, and then get exactly the report you want, but it's late. The other choice is to say, what can I tell you out of those 19 that I did get when I didn't get all 20? Can I put together a cogent, reasonable, useful answer based upon a subset of the knowledge?
That took me to the scene from Monty Python and the Holy Grail where the Black Knight is trying to keep the hero from crossing the bridge, and the hero cuts off the knight's arm, and the knight says, "It's just a scratch." And he cuts off his other arm. "It's just a scratch." And pretty soon he has no legs, and he's, "Come back here, I'll bite you." So the question is, how many of those 20 inputs can you deal with losing? How fervently do you fight forward based upon partial knowledge? And I don't have an answer, I only have an assertion. Either you wait for everything and it might be delayed, or you figure out a personal answer. What is just a scratch? That's the story.
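To make the two choices concrete, here's a minimal sketch in Python of the second option: report from whatever subset arrived by the deadline. Everything here is invented for illustration, the machine IDs, the delays, and the metric values; it's a sketch of the idea, not anyone's production code.

```python
import concurrent.futures as cf
import random
import time

def fetch_metric(machine_id):
    """Simulate fetching one input; distance shows up as random delay."""
    time.sleep(random.uniform(0.01, 2.0))
    return machine_id, random.random()

def report_by_deadline(machine_ids, deadline_secs):
    """Build a report from whatever inputs arrive before the deadline."""
    results = {}
    pool = cf.ThreadPoolExecutor(max_workers=len(machine_ids))
    futures = {pool.submit(fetch_metric, m): m for m in machine_ids}
    done, not_done = cf.wait(futures, timeout=deadline_secs)
    for future in done:
        machine, value = future.result()
        results[machine] = value
    # Abandon the stragglers rather than waiting for them (Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)
    missing = {futures[f] for f in not_done}
    return results, missing

arrived, missing = report_by_deadline(range(20), deadline_secs=1.0)
# The caller decides: is reporting on this subset "just a scratch"?
print(f"report built from {len(arrived)}/20 inputs; missing: {sorted(missing)}")
```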
That's the story. So, there's a... well, it's at least an analogy, and this is just occurring to me, not the Black Knight thing. That's definitely an analogy and it's strong. There's one that might be theoretically unjustified, but it sounds like choosing between consistency and availability in the presence of a partition for a distributed database.
I have pretty interesting fun opinions about that.
I would love to hear that, let's go there for a second.
So, I actually did a blog post. I have this blog, pathelland.substack.com, and I did a blog post recently called Don't Get Stuck in the "Con" Game, C-O-N, talking about consistency, and me railing about eventual consistency and how I don't understand what it means because everybody interprets it differently. That blog post is a lot about words.
Okay.
But in the process of doing that, I went and revisited a lot of the distributed systems literature to figure out what the heck it was. And I love this concept of linearizability, which comes from a really, really good paper by Herlihy and Wing. And the paper says, if I have an object and I have a bunch of clients to it on the outside, and each of the clients is going to talk to that object, the object is just going to do one thing at a time, okay?
And so the client only knows the timing: after you issue the request, you don't know what's happened, and when you do get the response, you do know what happened. And you know that somewhere in between is when you got your turn at that object. And so if there's dozens and dozens of things partying on that object, they're interleaving, but from the outside you only know that your operation landed somewhere between when you asked and when you received.
Right.
Many people get confused about whether linearizability means reads and writes, because the paper says any operation. I will read lots of stuff in the literature, and then find out that there's an assumption that the operation is reads and writes, without stating that it's necessarily reads and writes. Linearizability can apply, for example, to CRDTs and commutative operations; that's linearizable.
Okay.
There is an appearance of a single order. The wonderful thing about commutativity is there's an ambiguity about the order, but it works out because as far as you know, from the outside, you got what you wanted within the time window.
Right.
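As a tiny illustration, here's a sketch in Python (names invented) of an object whose operation is a commutative increment rather than a pure read or write. The object does one thing at a time, the interleaving of concurrent callers is ambiguous, and yet every interleaving yields the same result, so each caller's operation can be placed at some point inside its own request/response window:

```python
import threading

class Counter:
    """The object handles one operation at a time; increments commute."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The caller only knows its operation took effect at *some*
        # point between issuing the call and receiving the return:
        # the linearizability window.
        with self._lock:
            self.value += 1
            return self.value

counter = Counter()
threads = [threading.Thread(target=counter.increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# The order of the 100 increments was ambiguous; the outcome is not.
assert counter.value == 100
```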
Okay. If I want a king-size non-smoking room, and I get there and I get a king-size non-smoking room, it might be room 301 by the elevator, it might not, but it's a king-size non-smoking room, and that window of time is satisfied by that. Now, when you read the CAP theorem, the CAP conjecture and then the proof that makes it the theorem, the conjecture is fuzzy about what consistency means.
Yes.
It's just kind of like, "Yeah, you know, like what those database people mean. Just what they mean." And you're like, "Huh." And then it's like, "Well, maybe it's the linearizability thing." Okay. And then you read the proof of it, and the proof sets up linearizability as the behavior, except when you prove the theorem, as opposed to when the paper asserts what it's going to do, it slips in reads and writes in the proof. And so, they are a hundred percent correct.
If you're doing reads and writes at a distance across a distributed system, you are limited to either availability or that read-and-write linearizable consistency. So again, if I have single objects and I do linearizable work in the reads and writes, the CAP conjecture ties me up, it does. If on the other hand I'm going to say, "Well, maybe it's a commutative CRDT operation, or maybe it's something else," then the whole nature of the CAP conjecture-theorem is not applicable, because the proof of the theorem counts on reads and writes. Go read it. It's really interesting. Or go read my blog post, Don't Get Stuck in the "Con" Game. I have a version one of the blog post and a version two. Because after I did version one, I got a bunch of back and forth, and I went and read more papers. And then I wrote more crap down in version two. I got to spend a whole weekend on those two. It was fun.
Yes. Particularly in our line of work, there's going to be input. Yeah.
No, it was awesome. It was great. But again, my response was to spend another 12 hours obsessing over literature and write more crap down, because it was fun. I had a big smile on my face at the end of the weekend. It was great.
Good.
So you get into this world of what you do and don't know with respect to distributed systems consistency, and it's a whole fun world. But I don't think consistency is a useful word today, unless it is prefixed by the word strict or prefixed by the word causal, or sequential consistency is also meaningful. Because as we use it, eventual consistency means basically nothing, because everybody means a different thing by it.
Right.
If you say consistency, it means different things to different people. If you talk eventual consistency, that's even more muddied. And so, I'm not a fan of that nomenclature; even though I've used it in the past, I went and studied and I learned maybe we should not use that word. There's two words I like better. One is convergent. Convergent means you take a replica and I take a replica, and I do a bunch of stuff and you do a bunch of stuff. If that replica has the property of being convergent, then when your stuff comes back and meets with my stuff, we come up with a unified answer that includes all of it.
Got it.
So convergent is a property of disconnected replicas and what the result is when they reconnect.
Okay.
That is a meaningful and useful phrase.
And can I say something like, they will have the same state again at some point in the future, that's-
When you gather all of the replicas with all of their respective inputs and you merge them, you will get a consistent result. And you will get a consistent result even when you have partial merges; as long as all of it comes together, you'll get the same answer from all of the replicas. That is a property that is called convergence. And the object is convergent when it does that.
Okay.
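Here's a minimal sketch of a convergent object in Python, using a grow-only set (one of the classic CRDTs); the replica names and items are invented. Merging is set union, which is commutative, associative, and idempotent, so any order of partial merges settles on the same answer:

```python
class GSet:
    """Grow-only set: a convergent replicated object."""
    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so
        # replicas can merge in any order and still converge.
        self.items |= other.items

# Two disconnected replicas each do a bunch of stuff...
yours, mine = GSet(), GSet()
yours.add("your update")
mine.add("my update")

# ...and when they reconnect, both arrive at the same unified state.
yours.merge(mine)
mine.merge(yours)
assert yours.items == mine.items == {"your update", "my update"}
```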
Now, in my blog post, I talked about this paper called [inaudible 00:12:28], which was great. It was all about replication and bringing things together. And in the operating systems community, the word consistent is used differently. And so the phrase eventual consistency came from that paper. My friend, Doug Terry, wrote the paper, and we talked about it and I pushed on him. He said, "Yeah, it's really convergent. We should've said eventually convergent." So that's in my blog post, it's kind of fun.
And the other funky thing about the word consistency, I'm now on this blog post about-
Yes, we can go anywhere we need to go.
Is that ACID transactional consistency. And I did another blog post before the "Con" Game one, which is called ACID: My Own Personal C Change, because I studied this and I realized, when the original paper by my dear friend, Andreas Reuter, was written, in I think '83 or '85, I forget exactly, he defined ACID, and consistent in the ACID literally meant that the application gets to pick the end of the transaction.
Okay.
I told him later, like 30 years later, I told him this year.
Yeah.
You should've said complete, not consistent.
Right.
The reason you said consistent is because you can now build things that have multiple changes and make sure all of the multiple changes are in the transaction.
So, I could do referential integrity in the database, and the transaction includes all the referential integrity, multiple changes, and so it's consistent. But it doesn't say what consistent is. It doesn't say there's an application thing on top of the database that does this and this and this, with its own notion of consistency. So it's absolutely correct that by defining that the app and the upper part of the database can say when enough is enough, and you don't get less than that, you can build arbitrarily complex ideas of what you want in a consistent world and make sure the transaction doesn't break them. However, all the transaction is giving you is the ability to say, "Well, here's 30 changes." And it's all 30, and I'm not going to add 31, and you're not going to make it 29. This is the end. That's all the C in ACID really means.
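A small sketch of that point using Python's sqlite3 module (the table and amounts are made up): the application, not the database, decides where the unit of work ends, and everything inside that boundary commits or rolls back as one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

# The application picks the end of the transaction: both updates below
# are one unit, all of the changes, never just some of them.
try:
    with conn:  # commits on success, rolls back on any exception
        conn.execute(
            "UPDATE accounts SET balance = balance - 40 WHERE name = 'alice'")
        conn.execute(
            "UPDATE accounts SET balance = balance + 40 WHERE name = 'bob'")
except sqlite3.Error:
    pass  # neither change is visible if either one failed
```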
That sounds like atomicity.
It's sort of, but there's this subtlety of whether the application on top of the transaction system actually gets to define the boundary.
Okay.
Okay. And I don't disagree with what you just said, but the point I want to make is that the word consistent is very poorly defined in database world. And so when you say eventual consistency, you have the database people hearing it this way. And when you have an operating system world where there's different objects and the different objects have to settle out to be the same across all the replicas, which is really convergent, then the operating system people are thinking a different thing.
Yep.
Okay. So consistency is misunderstood by different communities in computer science.
Because they're defining it differently.
Well, they've all lived in these different worlds. They use the word differently.
Which is fine on its own terms. We do that. I live in this valley, we use this word, and you live in the next valley and that word means a slightly different thing.
Correct. Totally agree.
Yeah.
I would actually like to write a paper enumerating all the different usages of words across all the communities in computer science, I really would like to call it The Joy of Sects, S-E-C-T-S, but I don't think I'm going to get away with that. And I may have to cut this out of this blog post. We'll see. But I just had to share that.
Authors don't always get to choose our titles. Right? That's kind of the editor job, so.
I'm pretty good at getting away with murder. But that's me. Sorry.
But this whole dilemma of misunderstanding is endemic in all humans. In addition, it's endemic in the computer science people as, I'll say, a subset of the humans.
A subset of humans, that's fair. Not that anyone was wanting to treat them as a separate thing from humans.
Right. That's not true. Definitely.
[crosstalk 00:16:36] Right. But we all have our quirks.
But so, I-
Did this make sense, what I just said?
Oh, so much. And it's enlightening. Because I don't go into the literature as you do, but I'm definitely aware that the C in ACID is the squirrely one. The other things mean things. Like durability is okay.
Well, I can talk to you for an hour about durability. It's got all sorts of funky aspects, but-
That's true. One thinks of it as trivial, but it's really not; at every level it's fraught with complexity. And atomicity and isolation. Again, isolation is an insanely complex topic; it means different things to different relational databases. It's never the same thing. But at least you can wrap your mind around those. And consistency has always seemed squirrely to me. Honestly, what I've wondered in the back of my mind is: did AID just not sound cool enough, the way ACID sounded cool?
I don't know, but it's not very often you coin a phrase and it lasts that long and sticks that well, so I'm not going to go too negative on it, but-
Exactly. It's, somebody did something right.
And I think, according to Jim Gray, some of it was that Andreas' wife didn't like sweet things, and she liked more vinegary things. And so he did that as an honorific to her. So.
That is delightful. I don't care if it's true.
It is delightful.
I just like to celebrate it.
Yes. It's awesome. However now, we're in this world of how do you do these things at a distance where you don't necessarily know all the pieces are connected together.
Yeah. I want to talk about distance.
There's two major things, one of which I only recently learned about, even after all these decades of fiddling with this stuff. Okay. But let's go with the one we just did a second ago. There's convergence, and that has a clear meaning. I have a thing, we're going to call it an object. It's got a bound, it's got an identity, and we're going to take replicas at a distance and we're going to do things to it. And when we bring them back together, it will have a predictable result that includes all of the things we did. And so that's convergence.
Okay.
Okay. So there's another property that I literally just learned about recently, and I'm really having a blast with it, which is called confluence.
Oh, well.
I know. You work for Confluent. It's just, that's a coincidence. I'm honestly not here doing [crosstalk 00:18:58]. I love your company, you're awesome. But this is a separate discussion.
Separate thing. Got it.
Just being clear, by the way, and it bears repeating: you guys are awesome. You're great. But that's not what I'm saying here. Confluence is a property of the object, a function, where it has a set of inputs and it has a set of outputs. However, the set of outputs is not dependent upon the ordering of the set of inputs, and new inputs never cause you to retract an existing output. Not everything is confluent.
No.
But if you get a set of inputs and you push it through and make your outputs, and it doesn't matter the order of the inputs, you're going to have the same set of outputs, independent of the order. Okay. And you never take back an output. Now this gets subtle. You can have a second output, which says, just kidding on the first output, but that ends up being the subtlety in some of this.
Okay.
But it turns out there's some really, really interesting properties about this, where if you have a bunch of them and they're each getting a subset of the inputs and they're each producing their own outputs, you can merge those and compose these confluent data flows that never have to go backwards. And so what that means is, if I take some of the stuff and send it to the East Coast and run it on a computer over there, and some other stuff to the West Coast and run it on a computer over there, as long as I bring them together, the outputs are the same, no matter the order, no matter the things. As long as you bring it all together, you get the same answer.
Okay.
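Here's a minimal sketch of that property in Python. The operator is deliberately trivial, emit every even input; what matters is that its output set depends only on the input set, never on arrival order, and outputs are never retracted:

```python
from itertools import permutations

def confluent_filter(inputs):
    """Emits every even input it has seen; order never matters."""
    return {x for x in inputs if x % 2 == 0}

batch = [3, 8, 1, 4, 10]
expected = confluent_filter(batch)

# Every arrival order produces exactly the same output set...
assert all(confluent_filter(p) == expected for p in permutations(batch))

# ...and splitting the work East Coast / West Coast, then merging,
# gives the same answer as running it all in one place.
east, west = batch[:2], batch[2:]
assert confluent_filter(east) | confluent_filter(west) == expected
```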
That is in many cases what we want out of our replicated systems. You see this kind of behavior in, for example, email with different offline personal devices. You've got your phone, it's disconnected; you've got your laptop. There's a predictable behavior of how they merge and come together, and you get a predictable answer independent of ordering.
Sure enough.
Okay. But that's because of the nature of how the operations on the mail are done. When you delete an email, you are actually creating a tombstone. The tombstone says that email with that unique ID should go away. So when that knowledge gets to another copy of your email collection, things work out. That is a confluent system; it has this wonderful behavior.
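A sketch of that email behavior in Python, with deletion modeled as a tombstone; the mail IDs and replica shapes are invented. Because a delete is itself new knowledge rather than a retraction, merging only ever adds facts, and every merge order converges:

```python
def merge_mailboxes(*replicas):
    """Merge disconnected mailbox replicas; tombstones win over mails."""
    mails, tombstones = {}, set()
    for replica in replicas:
        mails.update(replica["mails"])
        tombstones |= replica["tombstones"]  # a delete is new knowledge
    return {mid: m for mid, m in mails.items() if mid not in tombstones}

# The phone deleted m2 while offline; the laptop never saw the delete.
phone = {"mails": {"m1": "hello", "m2": "invoice"}, "tombstones": {"m2"}}
laptop = {"mails": {"m1": "hello", "m3": "news"}, "tombstones": set()}

assert (merge_mailboxes(phone, laptop)
        == merge_mailboxes(laptop, phone)
        == {"m1": "hello", "m3": "news"})
```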
Okay. That sounded kind of stream processory when you were first describing it, but that's not necessarily the case at all.
Not necessarily, stream processing is of course one option.
Yes. But everything could be synchronous as all get out. And then still...
And at one level that's a convergent thing and at one level it's a confluent thing, but it is in fact, the nature of the confluence that allows the convergence to work well. The nature of the operations-
Intuitively satisfactory.
The operations are independent of order and the outputs don't get retracted. Right. Which gets really interesting when you talk about a mail getting deleted and what that means, because that's like a second output.
Okay.
Which is, "Oh, I had that mail, but now it's gone."
Right.
But the ordering is independent. So all of these things blur together and we're beginning to explore them as an industry. How do we create useful, vibrant offline systems to allow that stuff to happen?
Right. So give me some other examples. Email makes sense. Where else would we see this property?
Well, let me talk for a second about where you don't get the property.
Oh, gotcha. Okay.
Confluent computation has a very nice time answering the question "does it exist?" If I show you in my output that it does exist, it exists. Okay. That output is there. And you get into this funky world with tombstones and emails, but on the whole, the idea is: does it exist? And much of this is echoing off of some brilliant work out of Berkeley and Santa Cruz called the CALM conjecture, where Joe Hellerstein and Peter Alvaro did a beautiful paper called Keeping CALM, which is all about how you make these resilient kinds of systems. And the idea that they've proven is, if you want to have these behaviors of replication and coming together and all that stuff, you need to have knowledge increase over time. And they call that monotonic behavior, things moving forward.
Okay.
That's the correct theoretical term for it. But most of us think about it just going forward rather than going backward.
Right.
Which is the same thing as saying you're not going to get rid of the output. It's just there. So you can do things like a shopping cart or a mail system in this confluent thing, where you're answering questions of the form "yes, it's there, it does exist." What's really hard is something like a garbage collection, where you run it across different machines and you're trying to analyze what's missing, something that doesn't have anything pointing to it. So questions of the form "does not exist" can't actually be answered until you shut off the input to that confluent set of systems. And only when you shut it off and let things rattle together can you then say, "Oh yeah, I didn't get that." Because you don't know what you're going to get from the other sibling confluent operations that are happening at a disconnected point. So there's this whole inversion of our thinking about what we can learn from new knowledge, and only when we shut off the new knowledge for some subset of what we're analyzing do we then say, "Oh, well, I didn't get this."
Now I know I didn't get it. [crosstalk 00:24:37].
Right. Which is not any different than having a joint checking account. I don't know what... when you're married, and this is no criticism, I love being married. But you don't know what checks are flying around, back in the day when you had the paper checks and everything.
I remember those days.
But I can say what did or did not occur in the month of May in my checking account based upon the checks that arrived.
Right.
But when I seal off that stuff, then I can answer questions about what didn't happen. "What do you mean you didn't pay the rent?" "Well, I didn't pay the rent, because in May I didn't pay the rent." Okay. In this other world, you don't know if the rent's being paid. And so this issue of confluence is these independent things that will come together, which can answer questions of "yes, I did." Because you can look at a subset of May's knowledge, find the rent check for May, and you're good.
You can see, oh, the rent got paid, but you can't conclude the rent didn't get paid.
Correct. Until you cap it, with a term called sealing. So these are, in my mind, the areas of fun research. How many of the things we're taking apart in our distributed systems can be captured in this way, where independent work can happen, and then when it all comes together, you get an answer that meets our business need?
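Here's a small sketch of that distinction in Python, using the checking-account example; the class and method names are invented. A "yes, it happened" answer is safe at any time, but "it did not happen" is only answerable once the month's input is sealed:

```python
class MonthOfChecks:
    """Positive knowledge is monotone; negatives need a sealed input."""
    def __init__(self):
        self.cleared = set()
        self.sealed = False

    def record(self, check):
        assert not self.sealed, "no new knowledge after sealing"
        self.cleared.add(check)

    def did_clear(self, check):
        # Safe to answer "yes" at any time: a yes is never retracted.
        return check in self.cleared

    def did_not_clear(self, check):
        # A negative can only be proven once input is shut off.
        if not self.sealed:
            raise RuntimeError("can't prove a negative while input is open")
        return check not in self.cleared

may = MonthOfChecks()
may.record("rent")
assert may.did_clear("rent")   # answerable mid-month
may.sealed = True              # seal off May
assert may.did_not_clear("car payment")
```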
And this relates, if I could tie this back to space and time: if you're gathering data from places, there's a significant probability distribution on the arrival time of that data, I think is a-
Right. And the supposition is, you wanted to give a report closing out all 20 of those things. That's a probability.
Yes. You want that. And you can make the decision, "I'm not talking until I have it all."
That's a fine decision.
Yes it is. The other one seems like it would rely... Do these tie together, these views of consistency and the space problem?
Sure. [crosstalk 00:26:44]. It's like, "What do I know? What do I know did not happen? What don't I know?" And at a distance, you never know. You only know when it comes here. And that's the fun of distributed systems.
Yeah. It really is. So, distance, back to the title, the initial title, which is Space, Time Discontinuity.
Discontinuum, I said.
Discontinuum. Time is clear enough. Everyone has their own clock. There is no,
The only thing you know is what Lamport said: happens-before.
Yes. We know what happened before what, but we don't know what "now" is.
We don't know if your time is moving ahead at the same rate as my time. We don't know any of that.
Don't know any of that stuff, yeah. No. Even, I guess, from a physical standpoint, if one of you is moving very fast, you really don't know. But at the very least, you don't know if your clock oscillators are exactly the same. In general, they're not in phase, so.
Correct. And there's a whole bunch of issues there, and you don't know if the other person's dead.
That's the other problem.
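For reference, here's a minimal sketch of Lamport's idea in Python (process names invented): logical clocks give you happens-before, and nothing more. If event a happened-before event b, then clock(a) < clock(b); but the clocks say nothing about "now" or about rates.

```python
class LamportClock:
    """Logical clock: captures happens-before, not wall-clock time."""
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: just advance.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Merge the sender's notion of time with our own.
        self.time = max(self.time, msg_time) + 1
        return self.time

alice, bob = LamportClock(), LamportClock()
t_send = alice.tick()         # alice stamps an outgoing message: 1
t_recv = bob.receive(t_send)  # bob receives it: max(0, 1) + 1 = 2
assert t_send < t_recv        # send happens-before receive, that's all
```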
So there's another fun blog post I did last September-ish, I think, which is called Fail Fast is Failing Fast. Let's position this for a sec. I'll explain this.
Please.
I've been doing fault-tolerant systems since '82, when I was at Tandem and I built the transaction engine for the NonStop system. That was really interesting. And we always had this notion that you could have a primary that's doing the work, and there's exactly one, or maybe zero for a window of time when it crashes before you make a new one, but there's never two. There's a single one. So now I can do a database, I can do things out of the single one without ambiguity. Okay. And I can say, "Oh, if he dies, then we're going to make a new one over here." But I'm carefully making sure the first one's gone. Now I've got a new one. That is a characteristic called fail fast.
Okay.
It works or it doesn't work. It's called fail fast. Most distributed systems have been built on that premise. The problem is how do I know when the other one's gone?
You assume that you have a reliable heartbeat mechanism. You've got some way-
Right. It's called synchronous networking. Absolutely. And there's a famous paper called FLP, for Fischer, Lynch, and Paterson, and it basically says if you're in an asynchronous world where you can't count on the message times, you can't guarantee that a group of you will all agree, because you don't know the difference between the other one going slow and the other one being dead.
Correct.
You don't know. So in an asynchronous world you can't do fail fast, because you can't get everybody to agree that the old one's gone so we're going to make a new one. And so you end up in things like Paxos, where a subset of you agree, and is that good enough? And all these kinds of things. The dilemma we have in this new world is so fascinating. I think cloud computing is awesome because we're sharing, we're using resources more efficiently. And I love to say it this way: I really wish I had a dedicated lane on the freeway for my commute home; then I could predict when I'd get home.
Yes. It'd be great.
Right.
And inexpensive.
That would be more cost-effective.
Yeah.
So one of the things that's happening as we get these bigger and bigger data centers in these bigger clouds is, it's hard to know that the timeliness of the network is good.
Right.
And when failures happen and they're rejiggering the routing tables and all this stuff, there are even very well-documented cases where different people see different things. A can see B, B can see C, but A can't see C. And that goes on for a long time, sometimes seconds, with lots of confusion. It's very hard to know you have a single primary node. It's very hard to know that when you can't accurately communicate in a very synchronous, timely fashion, and synchrony literally is a probability curve.
And you're trying to get into a world where I can get an answer with an excruciatingly high probability. And as the delays get longer, it's harder to predict. So the reason I say fail fast is failing fast: imagine you're in a small town and your commute home is a six-minute walk down the sidewalk.
Okay.
Okay. You can stand up, you can go home to your family and you can promise that you will walk in the door at 6:23 in the evening every night, no problems.
Okay.
Now you take that same algorithm and your family and your job, and you go to work at the top of a skyscraper in a big downtown, and they're out in the suburbs. Now, it turns out that your ability to get home at the end of the evening is nowhere near as predictable.
Right.
You can actually talk about the average pretty handily, but you can't talk about the jitter or the variation in that timing very well, stuff happens.
Right.
You can't just always get home like that. So if somebody in your family decides you're just dead at 6:24, because you're usually walking in at 6:23, that's not going to be a pattern for success. And if you come home at 6:25 and your family has got a new spouse, right? Your wife's re-married, this is not [inaudible 00:32:18].
I will not stand by.
It's not a good approach to life, right? So you need to rethink these algorithms in this world, because as things are spreading out and we're sharing, the distance is effectively farther and the unpredictability is larger. That is, by the way, a different manifestation of the same thing I said about Space Time Discontinuum. I want to do a report, 20 inputs. It's just a finer grain, but the same phenomenon occurs. Distance means uncertainty.
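Here's a tiny simulation in Python of why a fixed fail-fast timeout is a bet against jitter rather than a reliable detector of death. The numbers are invented; the peer in the simulation is perfectly healthy, its network just has a long tail:

```python
import random
import statistics

random.seed(42)
# Heartbeat gaps from a healthy peer on a jittery network: most arrive
# around the mean, but the distribution has a long tail.
gaps = [random.lognormvariate(0, 0.6) for _ in range(1000)]
mean = statistics.mean(gaps)

# FLP's point in miniature: a slow heartbeat and a dead peer look
# identical, so every timeout trades timeliness for false alarms.
for timeout in (2 * mean, 4 * mean, 8 * mean):
    false_alarms = sum(gap > timeout for gap in gaps)
    print(f"timeout={timeout:.2f} -> declared dead {false_alarms} "
          f"times out of 1000 heartbeats from a live peer")
```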
That's what I want to ask about. We're coming up against time, but if you're okay, I'm okay. Because I want to nail this down for just a few minutes. People blame the speed of light. And I have the sense that that's really a red herring, because the travel time from the skyscraper to the suburbs is longer, it's not a six minute walk, but it's not the propagation that's the problem. The propagation time is going to drive the average, and this gets messy. You mentioned cloud infrastructure where there are adaptive systems where a part breaks and the infrastructure [crosstalk 00:33:29].
Which are amazing and wonderful. And I love them.
In their own right? Yes.
We need to leverage our investment in that and make more investment in that.
Yeah, totally true.
But it introduces problems.
But it strikes me that, we were talking about asynchronous and synchronous a minute ago. Let me throw this out and you tell me what you think. Inside a computer, when it's not a distributed system, everything for some reason is okay there. We don't have these problems and-
We do, the probabilities are really low, yeah.
The probabilities are very low, and it's not a computer if it doesn't work out.
By the way, we actually have more of those problems in that we have a shared memory and many, many different cores, and we coordinate with latches. And so a lot of this stuff, what you said is true and not true.
There you go. Yeah. For cache access and everything.
This is fractal, right? This whole world is a fractal of these phases.
So, my problem is computers are still what I learned in about 1986. There's a register file and there's memory. It was a wonderful world, Pat.
No. Dude, I started in 73. I remember.
Yes. Okay. To some kind of first order approximation, life is synchronous inside that computer. So there is access to memory and there is literally clock edges on which this stuff occurs.
Correct.
Subject to the complexity of some things.
Yeah, of course the multiprocessors, all these symmetric multiprocessors, multi-core, all that stuff wonkies out a little bit, but it is way easier inside of a computer to figure out how you're going to do it.
Yeah. Because it's synchronous and the distribution of the arrival time probability is managed for you by that shared clock.
Yeah. Although latching and locking and queues on latches. There's a bunch of stuff-
Still stuff. Okay. There's still some [crosstalk 00:35:17].
There's a bunch of stuff. Go read the old [inaudible 00:35:19] paper from 1976, about what happens when you have a queue on a latch and then one of you gets preempted by the operating system. This is-
Okay. 76. All right.
Right. And so this is not a hundred percent crystal clear here.
But at least outside the computer, we know it's asynchronous. There isn't a shared clock anymore. So things are going to happen.
It certainly is asynchronous across computers. I'm only pointing out there's this weirdness inside too.
Yes. Yeah, you're right. Okay. So it is a... To no approximation is it synchronous outside the computer.
But even more. Look at our communications paths. I think TCP is an amazing thing that was built in 1974. It was designed in '74, when bandwidth was the problem, not latency; you were literally copying files. That's what it was built for. And if you had an error and you had to go chatter back and forth to clean up the error, fine. Now we're running it in data centers, and it mostly is fine, unless you drop a packet, and then you clean the mess up and then you keep going. And so, you will not get a consistent low latency. The tail latency of TCP is quite onerous if you're trying to do things in lock step.
Okay.
Okay. So, example: computational fluid dynamics. Great. You're trying to figure out air flow, or even machine learning training. You have such a big data set, it's spread across a hundred, a thousand computers in the data center. Let's just assume a single data center. But all of them have to finish this step within this fraction of a second to calculate the pressures and the weather prediction stuff in this little cube. And most of your cubes are on the same machine, but then the edge of the machine is next to the other machine. So you've got to tell that edge to the next guy so he can do the next step. You end up in a world where moving from step to step for hundreds or thousands of computers is going to take a delay based upon the worst, slowest single-node-to-single-node sharing of that edge condition.
Okay.
So the ability to properly utilize these very fast, very big machines is strongly dependent upon the consistency. Not even precisely the delay itself, but the consistency of the ability to get that data across.
The predictability.
Yes. Or I like to talk about jitter.
Yeah. Jitter is...
Like in electronics, jitter is just: how much did it happen at the same time, versus the variation in that time?
Right.
The jitter in a big complex system becomes really noticeable, because the probability that any one of them is jittering is very high. If you're all dependent upon that, then you're just a mess. So there's really some interesting things that are coming out as... these systems don't tend to use TCP, because TCP will sometimes say, "I dropped a packet. Crap, let's go get it." And that's not the thing you want to do, because of the head-of-line blocking, because you've got all this stuff behind it.
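A quick simulation of that compounding in Python (the latency distribution is invented): in a lock-step computation like the weather example, each step waits for the slowest of N link transfers, so the step time is the max rather than the mean, and it grows with N even though every individual link is healthy:

```python
import random
import statistics

random.seed(7)

def step_time(n_links):
    """One bulk-synchronous step: everyone waits for the slowest link."""
    return max(random.lognormvariate(0, 0.5) for _ in range(n_links))

for n in (1, 10, 100, 1000):
    med = statistics.median(step_time(n) for _ in range(200))
    print(f"{n:5d} links: median step time {med:.2f} "
          f"(one link's median is ~1.00)")
```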
Now, Kafka being a queuing system, I'm positive has got similar challenges to work through. And you guys have got a lot of stuff that helps. But these things are endemic in ordered queuing, right? The ordering in itself introduces a compounding of the unpredictable arrival of the things within the queue.
Yeah.
Right.
It makes sense.
It's how the thing rolls. And so TCP being ordered gives you penalties where you might not have them otherwise. So time is really interesting. Distance is really interesting. These are the big things that get in the way of us solving our problems. And so in my mind, the fun challenges are how to be okay if things are delayed. How are you okay if this is out of wonky whack, and that has varying solutions for varying business domains. But most of what we do introduces ordering where we maybe didn't need it, because we're kind of used to ordering. And so people are comfortable working on top of things that provide ordering.
However, that tends to compound the problems. So I'm always fascinated by letting people express their business, ask their questions, in ways where we can solve the problem without being quite so dependent upon ordering in as many areas. So we can let them talk naturally to the person writing a business solution, and underneath the covers, we do things that are far more tolerant of these orderings and delays. Did that make sense?
That made sense.
Okay. I'll restate it so it doesn't.
With that, I'm going to say my guest today has been Pat Helland. Thank you so much for being a part of Streaming Audio.
Thank you so much. It's been wonderful to talk and to chat. Thanks for having me.
And there you have it. Hey, you know what you get for listening to the end? Some free Confluent Cloud. Use the promo code 60PDCAST—that's 60PDCAST—to get an additional $60 of free Confluent Cloud usage. Be sure to activate it by December 31st, 2021, and use it within 90 days after activation. Any unused promo value after the expiration date is forfeit and there are a limited number of codes available. So don't miss out. Anyway, as always, I hope this podcast was useful to you. If you want to discuss it or ask a question, you can always reach out to me on Twitter @tlberglund, that's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on a YouTube video or reach out on Community Slack or on the Community Forum. There are sign-up links for those things in the show notes. If you'd like to sign up and while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. And if you subscribe through Apple podcasts, be sure to leave us a review there that helps other people discover it, especially if it's a five-star review. And we think that's a good thing. So thanks for your support, and we'll see you next time.
When compiling database reports using a variety of data from different systems, obtaining the right data when you need it in real time can be difficult. With cloud connectivity and distributed data pipelines, Pat Helland (Principal Architect, Salesforce) explains how to produce educated partial answers when working with the Apache Kafka® platform. After all, you can't get guarantees across a distance, making it critical to consider partial results.
Despite best efforts, managing systems at a distance can result in lag time. The secret, according to Helland, is to anticipate these situations and have a plan for when (not if) they happen. Your outputs may be incomplete from time to time, but that doesn't mean that there isn't valuable information and data to be shared. Although you cannot guarantee that stream data will be available when you need it, you can merge replicas to obtain a consistent result, a property known as convergence. Distributed systems of all sizes, operating across large distances, rely on these properties for dependable database reporting.
Plan and anticipate that there will be incomplete inputs at times. Regardless of the types of data that you're using within a distributed database, there are many inferences that can be made from repetitive monitoring over time. There's no reason to throw out data from 19 machines when you're only waiting on one as a deadline approaches. You can make the sources that you have work by making the most of what is available, even in the presence of a partition in the overall distributed database.
Confluent Cloud and convergence capabilities have allowed Salesforce to make decisions very quickly, even when only partial data is available, using replicated systems across multiple databases. This analytical approach is vital for consistency at large enterprises, especially those that depend on multi-cloud functionality.