Vast Data really does appear to be "vast"

Transcript

curtis: 00:00:00

I'm pretty sure we've said smoking hole more times

curtis: 00:00:02

than we've on this podcast.

curtis: 00:00:05

Just for the record.

curtis: 00:00:06

Just saying

curtis: 00:00:08

It's getting a

curtis: 00:00:08

lot of play today.

curtis: 00:00:29

Hi and welcome to Backup Central's Restore it All podcast.

curtis: 00:00:32

I'm your host, W.

curtis: 00:00:32

Curtis Preston, AKA Mr.

curtis: 00:00:34

Backup and I have with me, my Bhangra dance consultant, Prasanna

curtis: 00:00:39

Malaiyandi, how's it going Prasanna?

Prasanna: 00:00:41

I'm good Curtis, but I have to warn you.

Prasanna: 00:00:43

I am not a dancer at all.

Prasanna: 00:00:44

So probably the wrong person to be seeking advice about dancing from,

curtis: 00:00:50

But you said that you knew about Bhangra dancing and that you

curtis: 00:00:53

could advise me on these things.

Prasanna: 00:00:55

I told you that it's like an Indian dance style, if you will.

Prasanna: 00:01:02

And you had asked a question of, have I seen it because I bet I've

Prasanna: 00:01:05

seen a bunch of Bollywood movies.

curtis: 00:01:07

You expanded my horizons. I bought my tickets; my wife and I will be going

curtis: 00:01:12

to see the show. It's called Bhangin' It!

curtis: 00:01:16

It's "bangin'" spelled with an H, so it's trying to, like,

curtis: 00:01:23

do an homage to the Bhangra.

curtis: 00:01:26

So it's a new musical at the La Jolla Playhouse, which is a very nice

curtis: 00:01:33

playhouse that I've actually never been to.

curtis: 00:01:35

I've lived here 20 something years.

curtis: 00:01:36

I've never watched a show there, but a lot of like big Broadway

curtis: 00:01:39

shows actually start out.

curtis: 00:01:40

I've never started.

curtis: 00:01:42

I've, always watched the Broadway shows

Prasanna: 00:01:46

gone to Broadway.

curtis: 00:01:47

This is the kind of show that could possibly hit big on Broadway.

curtis: 00:01:50

And so we'll see it and we'll see if it's any good and

curtis: 00:01:55

I'll

Prasanna: 00:01:56

waiting for

curtis: 00:01:56

with my review.

Prasanna: 00:01:58

Yes.

Prasanna: 00:01:59

I think our listeners will be curious.

Prasanna: 00:02:02

And by the way, for those in San Diego, when is it running?

Prasanna: 00:02:05

Do you know how long?

curtis: 00:02:06

It's running.

curtis: 00:02:07

It's running until April.

Prasanna: 00:02:09

Okay.

Prasanna: 00:02:10

So that

Prasanna: 00:02:10

was

Prasanna: 00:02:10

folks in San Diego.

curtis: 00:02:12

Yeah.

curtis: 00:02:14

Yeah, depending on when this goes live, if it goes live less than a month from

curtis: 00:02:20

now, then you have two days left to go see it because it runs until April

curtis: 00:02:27

17th at the La Jolla Playhouse, by which time all the tickets will be

curtis: 00:02:31

gone and you won't be able to see it.

curtis: 00:02:33

Sorry, I don't know what to tell you. But we have a longtime

curtis: 00:02:40

friend on the podcast here today.

curtis: 00:02:42

Prasanna.

curtis: 00:02:43

I'm excited to bring him on, and not just because he's one of those

curtis: 00:02:47

people that make me feel young.

curtis: 00:02:52

He has been in the IT industry for an awfully long time; he makes me feel like

curtis: 00:02:57

a young whippersnapper sometimes.

curtis: 00:02:59

He is now the technologist extraordinary and plenipotentiary at Vast Data.

curtis: 00:03:05

Welcome to the podcast, Howard Marks.

Howard: 00:03:09

Thank you.

Howard: 00:03:10

It's very nice to be here.

Howard: 00:03:11

I was always a Bob Fosse guy, so I don't know much about Indian dance.

Prasanna: 00:03:16

Curtis didn't either before he met me.

Prasanna: 00:03:18

So it's fine.

curtis: 00:03:20

Yeah, my knowledge of Indian dance basically consists of the

curtis: 00:03:31

reference to it in what was that movie?

Prasanna: 00:03:35

Millionaire.

curtis: 00:03:36

Bride and Prejudice.

curtis: 00:03:40

There's a

Prasanna: 00:03:41

yeah.

Prasanna: 00:03:41

I th I think

curtis: 00:03:42

it's a Pride and Prejudice

Prasanna: 00:03:47

yeah.

curtis: 00:03:48

knockoff done with, what's her name?

Prasanna: 00:03:51

Aishwarya Rai.

curtis: 00:03:52

I think.

curtis: 00:03:54

Remember, in the movie she gives two dance moves.

curtis: 00:03:58

It was petting the dog and screwing in the light bulb.

curtis: 00:04:01

I don't know if you remember that.

curtis: 00:04:02

She says that.

curtis: 00:04:06

That's literally the extent of my knowledge of Indian dance.

curtis: 00:04:09

That, and the fact that I've watched a bunch of Bollywood movies, but

curtis: 00:04:11

that's all thanks to Prasanna.

Prasanna: 00:04:13

Yeah.

curtis: 00:04:15

So you never know what you're going to get when you're listening to the

curtis: 00:04:18

Backup Central Restore it All podcast.

curtis: 00:04:22

Speaking of which, let me throw out our usual disclaimer, Prasanna

curtis: 00:04:25

and I work for different companies.

curtis: 00:04:26

Prasanna works for Zoom.

curtis: 00:04:27

I work for Druva.

curtis: 00:04:28

This is not a podcast of either company and the opinions that you hear are

curtis: 00:04:31

ours. And be sure to rate this podcast at ratethispodcast.com/restore, or just

curtis: 00:04:37

go to your favorite podcatcher, such as Apple Podcasts, and just scroll down

curtis: 00:04:43

to the bottom and give us some stars.

curtis: 00:04:45

And if you really want to make my day, actually put some words there.

curtis: 00:04:49

Yeah, absolutely.

curtis: 00:04:51

And if you are interested in the things that we're interested in, like

curtis: 00:04:55

backups and storage and resilience and ransomware recovery and cyber

curtis: 00:05:00

warfare and all of these things.

curtis: 00:05:03

Then just send me a note @wcpreston on Twitter, or wcurtispreston@gmail, and

curtis: 00:05:10

I'll be happy to get you on the podcast.

Prasanna: 00:05:14

friendly.

Prasanna: 00:05:14

We ask questions.

curtis: 00:05:17

we even apparently, although the last episode I said, unless your

curtis: 00:05:21

name was Stuart, and apparently Stuart has now reached out to you, Prasanna.

Prasanna: 00:05:25

Yes, he has.

curtis: 00:05:26

and,

Howard: 00:05:30

So even

curtis: 00:05:31

and

Howard: 00:05:31

name

curtis: 00:05:31

Even Stuart can get on this podcast.

curtis: 00:05:34

So if we're going to let, you know, a mouse on the podcast, then surely we can let

curtis: 00:05:42

you. His name is Stuart Liddle, for those of you that didn't get that reference,

curtis: 00:05:44

anyway.

Howard: 00:05:45

to make me feel honored here.

curtis: 00:05:50

We literally let anybody in the door,

curtis: 00:05:55

including guys who always wear Hawaiian shirts.

Howard: 00:06:01

They're comfortable.

Howard: 00:06:01

They come in my size and at this point I'm just known for them.

Howard: 00:06:07

I have been known to tell people I'm going to meet at the

Howard: 00:06:11

Starbucks at some conference.

Howard: 00:06:14

Just look for Santa Claus in an Aloha shirt.

Howard: 00:06:16

That will be me.

curtis: 00:06:21

much.

curtis: 00:06:21

It pretty much

Prasanna: 00:06:22

that's.

Howard: 00:06:24

Yeah.

Howard: 00:06:24

You know how many 350 pound guys with a gray beard are there walking around the

Howard: 00:06:29

average tech show, wearing an Aloha shirt?

Howard: 00:06:32

Two

curtis: 00:06:33

I'm going to, yeah.

curtis: 00:06:35

Two, yeah.

curtis: 00:06:36

At most.

curtis: 00:06:37

Absolutely.

curtis: 00:06:38

And one of them is going to be you.

Howard: 00:06:40

Yeah.

curtis: 00:06:41

so how long have you been at Vast Data?

Howard: 00:06:45

I've been at Vast Data three years and 15 days.

curtis: 00:06:50

Wow.

Prasanna: 00:06:52

And the company is fairly new as well.

Howard: 00:06:55

I joined Vast Data the day before we came out of stealth.

Howard: 00:06:59

My, my first official act at Vast Data was a briefing for Chris Mellor followed

Howard: 00:07:06

the next day by Storage Field Day.

curtis: 00:07:09

Wow.

Howard: 00:07:11

Nothing like starting off running

Howard: 00:07:15

Now, I joined Vast from being an independent analyst.

Howard: 00:07:20

So there were a couple of weeks there where I was getting brought up to speed

Howard: 00:07:25

and such before my official start date.

Howard: 00:07:30

But yeah,

curtis: 00:07:30

And why don't you give a for those that aren't familiar with Vast

curtis: 00:07:36

Data, give us, you know, the elevator

Howard: 00:07:40

sure.

curtis: 00:07:40

and

Howard: 00:07:41

The really short form on Vast Data is that we make very large scale all

Howard: 00:07:47

flash file and object storage systems.

Howard: 00:07:52

And when I say very large scale our average selling price for

Howard: 00:07:58

our cluster is well on the north side of a million dollars.

Howard: 00:08:02

It's multiple petabytes.

Howard: 00:08:05

Today we're just introducing a new storage enclosure that brings

Howard: 00:08:12

our building block down from 675 terabytes per HA enclosure to 338.

Howard: 00:08:22

So we're taking it down by a factor of two.

Howard: 00:08:24

We're going from a two U to a one U enclosure.

Howard: 00:08:28

We'll talk about that in a little bit, but the innovative thing

Howard: 00:08:34

about Vast is the architecture.

Howard: 00:08:37

If you talk about a large scale system like we build, traditionally that's been

Howard: 00:08:43

done with a scale out, shared nothing model where you have a lot of x86 servers.

Howard: 00:08:51

Each of those x86 servers owns some set of media and they communicate

Howard: 00:08:58

on a backend network and software makes it look like one big system.

Howard: 00:09:03

But those systems start to break down at really large scale.

Howard: 00:09:07

And so we've come up with a new model.

Howard: 00:09:09

We call it DASE, the shared-everything architecture. Instead of having a field of

Howard: 00:09:19

peer nodes, each of which owns some media, we disaggregated the media into these HA

Howard: 00:09:27

enclosures that I was just talking about.

Howard: 00:09:29

So no single point of failure, 400 gig connections to an NVMe fabric, and

Howard: 00:09:38

that's typically a hundred gig Ethernet.

Howard: 00:09:40

Some of our HPC customers like to run InfiniBand so we

Howard: 00:09:44

can do InfiniBand as well.

Howard: 00:09:47

All those enclosures do is hold data.

Howard: 00:09:51

There's no services there.

Howard: 00:09:54

All of the services, everything that you would think of as the controller function

Howard: 00:10:00

of the system runs in stateless Docker containers in the front end servers.

Howard: 00:10:08

So when a user makes a request to a protocol server to one of

Howard: 00:10:13

those front end servers could be NFS, could be SMB, could be S3.

Howard: 00:10:18

That server looks in the metadata that's stored in storage class memory

Howard: 00:10:25

in the enclosures, finds the data the user's requesting in

Howard: 00:10:32

QLC flash in those same enclosures, retrieves it over the NVMe over Fabrics

Howard: 00:10:38

fabric and delivers it to the user.

Howard: 00:10:41

So there's none of the traffic from node to node required to reassemble

Howard: 00:10:50

data. Everything's north-south across that NVMe over Fabrics connection.

Howard: 00:10:57

And since the metadata is in storage class memory, it's fast enough to

Howard: 00:11:03

directly access by all of the front end servers that they can just share it.

Howard: 00:11:09

They don't have to cache it.

Howard: 00:11:11

And by not having the cache, we don't have all the complexities

Howard: 00:11:15

of keeping the cache coherent.

Prasanna: 00:11:17

I was just going to ask about that, Howard.

Prasanna: 00:11:19

So it looks like you're disaggregating the actual storage

Prasanna: 00:11:23

and metadata from all the front end processing, which allows,

Prasanna: 00:11:28

I would assume, the front end to scale independently of the backend.

Howard: 00:11:32

So each of those front end protocol servers, mounts all of the

Howard: 00:11:37

SSDs in the cluster at boot time.

Howard: 00:11:40

And then it looks at all of those SSDs, and those are the SCM

Howard: 00:11:47

SSDs that hold the metadata and the QLC SSDs that hold the data.

Howard: 00:11:52

So everybody has access to everything.

Howard: 00:11:56

And instead of sending messages back and forth between the front end servers,

Howard: 00:12:01

they simply write to a single source of truth in the shared metadata,

Howard: 00:12:09

so that you can place a lock on the metadata or update the metadata.

Howard: 00:12:14

But you never have to tell everybody else you updated it because if they want

Howard: 00:12:18

to know what the state is, they'll go look in the one place where it's true.
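
The shared-metadata model Howard describes here can be sketched in a few lines of Python. This is a toy illustration, not Vast's implementation: the `Enclosure` and `FrontEnd` classes, the `fe-a`/`fe-b` names, and the dictionaries standing in for SCM and QLC are all assumptions made for the example. The point it shows is that when every front end reads the one shared copy of the metadata on each request, an update made through one front end is visible through another without any node-to-node message or cache invalidation.

```python
from dataclasses import dataclass, field


@dataclass
class Enclosure:
    """Hypothetical stand-in for an HA enclosure holding all shared state."""
    metadata: dict = field(default_factory=dict)  # path -> block id ("SCM")
    blocks: dict = field(default_factory=dict)    # block id -> bytes ("QLC")


class FrontEnd:
    """Stateless protocol server: keeps a reference to shared media, no cache."""

    def __init__(self, name, enclosure):
        self.name = name
        self.enclosure = enclosure

    def write(self, path, data):
        block_id = len(self.enclosure.blocks)
        self.enclosure.blocks[block_id] = data
        # Update the single shared copy of the metadata; no peer is notified.
        self.enclosure.metadata[path] = block_id

    def read(self, path):
        # Every read consults the shared metadata directly.
        block_id = self.enclosure.metadata[path]
        return self.enclosure.blocks[block_id]


shared = Enclosure()
a = FrontEnd("fe-a", shared)
b = FrontEnd("fe-b", shared)

a.write("/reports/q1.csv", b"v1")
first = b.read("/reports/q1.csv")   # b sees a's write
a.write("/reports/q1.csv", b"v2")
second = b.read("/reports/q1.csv")  # b sees the update, nothing to invalidate
print(first, second)
```

Because neither front end holds private state, either one could be restarted or replaced without losing anything, which is the property Prasanna picks up on next.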

Prasanna: 00:12:22

Yeah.

Prasanna: 00:12:22

And because everything is stateless in the front end, you don't have to worry

Prasanna: 00:12:26

about that necessarily to everyone

Howard: 00:12:27

Right,

Prasanna: 00:12:28

that backend

Howard: 00:12:29

right.

curtis: 00:12:30

So the backend has both SSDs and QLC.

Howard: 00:12:35

It has SCM, storage class memory, SSDs, and that can be

Howard: 00:12:40

Optane or and it has low end QLC SSDs.

curtis: 00:12:49

And the, the, yeah the, storage class memory is what's

curtis: 00:12:55

holding the metadata and the

curtis: 00:12:57

QLC is, what's holding the data.

Howard: 00:12:59

Primarily.

Howard: 00:13:00

It's also used as a write buffer.

curtis: 00:13:02

Okay.

curtis: 00:13:02

Okay.

Howard: 00:13:03

So writes come into the storage class memory and get mirrored to two

Howard: 00:13:10

different SCM SSDs and then get ACKd.

Howard: 00:13:14

And then the migration from SCM to QLC happens after the ACK.

Howard: 00:13:19

So we have more time to do things like compress more fully.
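
A minimal Python sketch of that write path, under stated assumptions: the `WritePath` class is hypothetical, plain dictionaries stand in for the two SCM SSDs and the QLC tier, and `zlib` at its highest level stands in for whatever data reduction the real system applies. It shows only the ordering Howard describes: mirror to two SCM devices, acknowledge, then migrate and compress later.

```python
import zlib


class WritePath:
    """Toy model: two mirrored SCM buffers in front of a compressed QLC tier."""

    def __init__(self):
        self.scm_mirrors = ({}, {})  # two independent write buffers
        self.qlc = {}                # long-term, compressed copies

    def write(self, key, data):
        # Land the write on both SCM mirrors before acknowledging.
        for mirror in self.scm_mirrors:
            mirror[key] = data
        return "ACK"

    def migrate(self):
        # Runs after the ACK, so it can afford the slowest compression level.
        primary, secondary = self.scm_mirrors
        for key in list(primary):
            self.qlc[key] = zlib.compress(primary[key], 9)
            del primary[key]
            del secondary[key]

    def read(self, key):
        primary = self.scm_mirrors[0]
        if key in primary:
            return primary[key]
        return zlib.decompress(self.qlc[key])


path = WritePath()
payload = b"backup block " * 1000
ack = path.write("obj-1", payload)
before = path.read("obj-1")  # served from the SCM buffer
path.migrate()
after = path.read("obj-1")   # served from QLC, decompressed
print(ack, before == after, len(path.qlc["obj-1"]) < len(payload))
```

The client's latency is set by `write`, while the expensive work sits in `migrate`, which is why the compression can be more thorough.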

curtis: 00:13:22

This is a very different game than.

curtis: 00:13:26

This idea of all of the front end nodes, being able to mount the entire

Howard: 00:13:34

Yes.

curtis: 00:13:35

the background

Howard: 00:13:36

Yeah.

Howard: 00:13:36

We eliminate the whole concept of ownership and all the

Howard: 00:13:40

complexity that, that creates.

Howard: 00:13:44

And now I'm going to blow your mind because when I say the metadata is in

Howard: 00:13:49

the SCM, I don't mean just the element store metadata, the metadata for our

Howard: 00:13:54

merged file system object store, but also the data reduction metadata.

Howard: 00:14:00

And so when you add another enclosure to the cluster, you add more SCM, which

Howard: 00:14:07

means you add more room for that metadata.

Howard: 00:14:10

So regardless of the size of cluster, the cluster is one data reduction realm

Howard: 00:14:15

across tens or hundreds of petabytes.

Prasanna: 00:14:18

Because everything looks like one cluster, if you will, or one system.

Howard: 00:14:22

right.

Howard: 00:14:23

And, we don't have to hold the data deduplication hash

Howard: 00:14:27

table in memory any place.

Howard: 00:14:30

It's all in SCM where it's fast enough we don't need that.

Howard: 00:14:34

So we don't have the limitations of how big a deduplication realm can be

Howard: 00:14:38

that most deduplication systems have.
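
To make the "one data reduction realm" idea concrete, here is a small Python sketch of a deduplicating store with a single shared hash index. Everything in it is an assumption for illustration: the `DedupStore` class, the fixed 4-byte chunks, and SHA-256 as the fingerprint are not taken from the conversation, and the real product's data reduction may work quite differently. What it shows is that when every writer consults the same index, a chunk already stored by one file is never stored again for another.

```python
import hashlib


class DedupStore:
    """Toy dedup store with one shared index for the whole cluster."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.index = {}  # fingerprint -> chunk (stands in for the SCM-resident table)
        self.files = {}  # file name -> ordered list of fingerprints

    def put(self, name, data):
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if no file anywhere has stored it before.
            self.index.setdefault(digest, chunk)
            recipe.append(digest)
        self.files[name] = recipe

    def get(self, name):
        return b"".join(self.index[digest] for digest in self.files[name])


store = DedupStore()
store.put("monday.bak", b"AAAABBBBCCCC")
store.put("tuesday.bak", b"BBBBCCCCDDDD")
unique_chunks = len(store.index)
print(unique_chunks, store.get("tuesday.bak"))
```

The two files contain six chunks between them but only four distinct ones, so the index holds four entries. The size limit Howard mentions comes from where a table like `index` has to live as it grows.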

curtis: 00:14:42

right.

curtis: 00:14:43

They typically top out around a a petabyte or so, and then you

curtis: 00:14:47

can't get any bigger than that.

curtis: 00:14:50

I don't know where to start on my questions!

Howard: 00:14:55

so from that, from the backup point of view, we're discovering that

Howard: 00:15:02

customers are starting to demand higher restore speeds. Traditionally,

Howard: 00:15:11

all a customer worried about when they were picking the storage for their

Howard: 00:15:15

backups was: is it fast enough that I can make my backup within the window?

Howard: 00:15:23

And so we got systems like Data Domain and other disk based deduplicating systems,

Howard: 00:15:30

where there was a big write read asymmetry where you could write data faster to

Howard: 00:15:37

them than you could read data from them.

Howard: 00:15:40

Because reading data caused the system to rehydrate, which turned

Howard: 00:15:46

sequential IO into random IO.

Howard: 00:15:50

And they had disks on the backend.

Howard: 00:15:53

And as disk drives have gotten bigger, this has gotten worse

Howard: 00:15:58

because a 20 terabyte disk drive today delivers exactly the same

Howard: 00:16:01

number of IOPS that a one terabyte disk drive delivered 10 years ago.

Howard: 00:16:05

So now each terabyte of data gets a 20th as many IOPS.

Howard: 00:16:12

And so you discover, yes, it takes me eight hours to back this up.

Howard: 00:16:17

It takes me 82 hours to restore it
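
The IOPS-density arithmetic behind that point is simple enough to write down. In this sketch the figure of 150 IOPS per drive is an assumed round number for a spinning disk, not something stated in the conversation; only the ratio matters.

```python
def iops_per_tb(drive_iops, capacity_tb):
    """Random IOPS available per terabyte stored on one drive."""
    return drive_iops / capacity_tb


DRIVE_IOPS = 150  # assumed, and taken to be the same for both drive generations

old_density = iops_per_tb(DRIVE_IOPS, 1)   # 1 TB drive
new_density = iops_per_tb(DRIVE_IOPS, 20)  # 20 TB drive
ratio = new_density / old_density

print(old_density, new_density, ratio)
```

With the per-drive IOPS held constant, the density falls in direct proportion to capacity, so a restore that depends on random reads slows down even though the backup, which is mostly sequential writes, does not.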

Howard: 00:16:21

and

curtis: 00:16:22

Yeah.

curtis: 00:16:23

Dedupe has never been very friendly for large restores, especially if

curtis: 00:16:28

you're doing any sort of, if you want to do a live mount, forget it right.

curtis: 00:16:32

Directly from a Data Domain.

curtis: 00:16:36

It's possible in the same way, it's possible that...

Howard: 00:16:39

That's, but that's, you can bring up the Oracle or the SQL server VM.

Howard: 00:16:46

So that the IT guys can access the passwords database, so that everybody

Howard: 00:16:52

can start running ERP on it again.

Prasanna: 00:16:55

Yeah.

Prasanna: 00:16:55

Don't use it as production.

Prasanna: 00:16:56

That's a bad thing.

Howard: 00:16:58

Right.

curtis: 00:16:58

right.

Howard: 00:17:00

And we're discovering that people's requirements are getting tighter.

Howard: 00:17:08

You start thinking about software as a service providers where, you know, if you

Howard: 00:17:14

run some industry specific accounting as a service for a thousand

Howard: 00:17:20

customers, that's a thousand databases.

Howard: 00:17:23

And when something goes wrong, you want to restore those databases

Howard: 00:17:27

as fast as you can, because your customers are going to be standing

Howard: 00:17:31

over your shoulder, yelling at you.

Howard: 00:17:34

And the last thing that's kicked a couple of our potential customers over

Howard: 00:17:39

the edge is the ransomware threat.

Howard: 00:17:43

Because the size of the restore grows so much with ransomware.

Howard: 00:17:48

You start off with, I need to protect my data against ransomware

Howard: 00:17:52

and use various methods to do that.

Howard: 00:17:54

And so we have indestructible snapshots.

Howard: 00:17:57

So you can say snapshot this folder at 6:00 AM when the backup window

Howard: 00:18:05

closes and retain it for 30 days.

Howard: 00:18:08

And even if the administrator wants to delete it he can't.

Prasanna: 00:18:11

So I

Howard: 00:18:12

but

Prasanna: 00:18:12

about that.

Prasanna: 00:18:14

So I did read a little small blurb about that.

Prasanna: 00:18:20

What prevents, is that locked down forever?

Prasanna: 00:18:23

Like an admin can't delete it no matter what, or is it just, there

Prasanna: 00:18:27

are additional safeguards in place to make sure that someone doesn't

Prasanna: 00:18:30

compromise the admin password,

Howard: 00:18:32

Anyone who ever talked to any customer of EMC Centera knows that if you

Howard: 00:18:41

build a system where you literally can't delete data someone will get themselves in

Howard: 00:18:48

trouble and fill it a hundred percent up with junk, and it will be a bad situation.

Howard: 00:18:57

So you have to provide some mechanism for overriding this because customers

Howard: 00:19:03

will paint themselves in corners.

Howard: 00:19:07

As I said, our average selling price is well over a million dollars.

Howard: 00:19:11

We don't have small customers who we only know third hand through VARs.

Howard: 00:19:18

We are in relatively intimate contact with every one of our customers.

Howard: 00:19:23

And so we don't have a fixed policy that says, if you jump through these

Howard: 00:19:28

hoops, then we will let you delete the undeletable snapshots. We and the

Howard: 00:19:34

customer agree what the hoops are.

Howard: 00:19:36

Yeah, multifactor authentication must be three of the five people on this list.

Howard: 00:19:42

They have to know the passphrase and the proper response to the passphrase.

Howard: 00:19:48

And if they respond with this other response to the passphrase, then for

Howard: 00:19:52

the next 24 hours, do not give anybody the secret. As complicated as you want.

Howard: 00:19:58

As long as we can write it down, those are the rules.

Howard: 00:20:02

And then once you've jumped through the hoops, we give you a time limited

Howard: 00:20:08

token that allows you to delete snapshots for a short period of time.

Howard: 00:20:18

And that token is a one-time pad.

Howard: 00:20:23

So that you can't re it's not good for

Prasanna: 00:20:26

Yeah.

Howard: 00:20:26

an hour whenever you use it.

Howard: 00:20:29

It is good for the time when we issue it for some limited period of time.

Howard: 00:20:34

And then you have to know the next one.

Howard: 00:20:38

And it's just, it was the best solution we could come up with.
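
The retention lock and override token can be modeled roughly as follows. This is a sketch under assumptions, not the product's mechanism: the `SnapshotStore` class and its method names are hypothetical, the out-of-band approval process Howard describes is reduced to a single `issue_token` call, and the token is treated as valid for any number of deletions inside its window. The one detail taken directly from the conversation is that the token's clock starts when it is issued, not when it is first used.

```python
import secrets

DAY = 24 * 60 * 60


class SnapshotStore:
    """Toy model of retention-locked snapshots with a time-limited override."""

    def __init__(self):
        self.snapshots = {}  # name -> time (seconds) until which it is locked
        self.tokens = {}     # token -> expiry time, counted from issue

    def create(self, name, now, retain_seconds):
        self.snapshots[name] = now + retain_seconds

    def issue_token(self, now, ttl_seconds):
        # Stands in for the agreed approval hoops, which happen out of band.
        token = secrets.token_hex(16)
        self.tokens[token] = now + ttl_seconds
        return token

    def delete(self, name, now, token=None):
        locked_until = self.snapshots[name]
        if now < locked_until:
            expires_at = self.tokens.get(token)
            if expires_at is None or now > expires_at:
                return False  # still locked and no valid override
        del self.snapshots[name]
        return True


store = SnapshotStore()
store.create("daily-0600", now=0, retain_seconds=30 * DAY)
store.create("weekly", now=0, retain_seconds=30 * DAY)

admin_attempt = store.delete("daily-0600", now=1 * DAY)
token = store.issue_token(now=1 * DAY, ttl_seconds=3600)
with_token = store.delete("daily-0600", now=1 * DAY + 60, token=token)
stale_token = store.delete("weekly", now=1 * DAY + 7200, token=token)
after_retention = store.delete("weekly", now=31 * DAY)
print(admin_attempt, with_token, stale_token, after_retention)
```

An administrator, or an attacker holding the administrator's password, gets `False` from `delete` during the retention period; only a token issued after the approval process opens a short window.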

Prasanna: 00:20:43

And this probably helps in cases where someone

Prasanna: 00:20:46

attacks a company, they get access to a storage system.

Prasanna: 00:20:50

They start deleting backups or what have you, it gives you

Prasanna: 00:20:54

that extra layer of protection.

Howard: 00:20:58

I've seen ransomware, you know, we think of ransomware as being on the

Howard: 00:21:04

order of the viruses we've dealt with.

Howard: 00:21:07

And the ransomware reports I see are much more frequently: this ransomware

Howard: 00:21:13

opened a door and then someone physically hacked for a long period of time.

Howard: 00:21:20

And they eventually took over some workstation that some

Howard: 00:21:24

administrator logged into and they have an administrator password.

Howard: 00:21:29

And if we're just worried about, if we're just worried about the

Howard: 00:21:35

script kiddies, I can protect against the script kiddies in

Howard: 00:21:40

building my backup infrastructure and architecture and those permissions.

Howard: 00:21:47

But we're talking about more sophisticated attacks than that.

Howard: 00:21:51

And frankly we talk about it as ransomware, but it's also

Howard: 00:21:55

rogue administrator protection.

Howard: 00:21:58

Then it's also just the guy who is disgruntled and decides his

Howard: 00:22:03

way out the door, he's going to make life for his employer.

Howard: 00:22:06

You're protected against that too.

curtis: 00:22:09

Yeah.

curtis: 00:22:09

Yeah.

curtis: 00:22:10

And, sometimes rogue administrator is a true rogue administrator, meaning

curtis: 00:22:13

it's a, it's someone masquerading as an administrator as well.

curtis: 00:22:18

That hacker that you talked about.

curtis: 00:22:20

So let me ask, call it a difficult question, call it

curtis: 00:22:28

whatever you want to call it.

curtis: 00:22:29

But when I hear about boxes where you're not supposed to be able to

curtis: 00:22:38

delete data, but then there is this other way where you can delete data.

curtis: 00:22:42

I immediately have to ask the question: doesn't that suggest

curtis: 00:22:49

that, I'm assuming this is a Unix-based OS, and that

curtis: 00:22:56

there is a root account?

Howard: 00:22:58

We run in containers under Linux.

curtis: 00:23:01

So there is an account, there is a root account and that

curtis: 00:23:06

if someone did some sort of just the right attack against that box.

curtis: 00:23:11

And again you've already mentioned that there is that

curtis: 00:23:15

these are sophisticated attacks.

curtis: 00:23:18

If someone Did a privilege escalation attack against

curtis: 00:23:23

the CoreOS, and now they've gained access to a privileged Couldn't want

Howard: 00:23:31

if someone

curtis: 00:23:32

want.

Howard: 00:23:34

administrative access to the management network, because the

Howard: 00:23:43

ports that face users as storage

Howard: 00:23:45

ports, can't be logged into

curtis: 00:23:51

Okay.

curtis: 00:23:52

they're

curtis: 00:23:52

cause they're back.

curtis: 00:23:52

Cause they're backend,

Howard: 00:23:56

so if you're wondering, if you want to log into

Howard: 00:23:58

Linux as root on one of our appliances, then you need,

Howard: 00:24:03

then the management network has to be compromised.

Howard: 00:24:08

And we start saying, are you looking for protection against destruction?

Howard: 00:24:17

Because if your data center is compromised, everything can be destroyed,

Howard: 00:24:27

but that's not really the level of attack that we're, concerned about.

Howard: 00:24:35

We're not talking about and someone walked into the data center because we

Howard: 00:24:40

hadn't disabled their key card and left 20 pounds of thermite in the middle of

Howard: 00:24:44

the floor, who would do such a thing.

Howard: 00:24:49

I've done that on video. I was being paid.

Howard: 00:24:54

So you know, I, it is a vulnerability, but it's the

Howard: 00:25:04

most general of the vulnerabilities.

Howard: 00:25:06

You're pointing out that if I have sufficient

Howard: 00:25:09

access, I can destroy anything.

curtis: 00:25:14

But it sounds like you have protected from the rogue

curtis: 00:25:21

administrator, the stupid administrator.

curtis: 00:25:25

And someone gaining access to those.

curtis: 00:25:30

But let me just ask you to clarify something from your previous answer, when you said

curtis: 00:25:34

that means the management network has been compromised, what do you mean by that?

Howard: 00:25:40

So you manage the system through different ethernet ports,

Howard: 00:25:45

than you use to access the system.

Howard: 00:25:48

And so if there's a vulnerability where a user could log

Howard: 00:25:54

into the appliance as the Linux root user that Linux root user can only

Howard: 00:26:02

Howard: 00:26:06

on the gigabit NVMe over fabric port.

curtis: 00:26:13

Gotcha.

curtis: 00:26:13

Okay.

Howard: 00:26:14

so network security should keep that from being an internet

Howard: 00:26:19

connected network and to attack.

curtis: 00:26:23

Gotcha.

curtis: 00:26:23

Gotcha.

curtis: 00:26:24

sense.

curtis: 00:26:24

Okay.

Prasanna: 00:26:27

I had a

Prasanna: 00:26:27

question.

Prasanna: 00:26:29

So Howard, before we dive more into the data protection side, one thing that

Prasanna: 00:26:33

was curious to me was you mentioned that Vast supports file and object.

Prasanna: 00:26:38

Could you talk about some of the use cases that you see

Prasanna: 00:26:41

your customers using Vast Data?

Prasanna: 00:26:43

And then I think maybe some of the protection stuff will

Prasanna: 00:26:45

probably come alongside that.

Howard: 00:26:47

Sure.

Howard: 00:26:49

The majority of our customers use us for primary storage.

Howard: 00:26:54

And that includes one of the biggest travel sites who uses us for their

Howard: 00:26:59

big data analytics and are using the S3 Presto connectors to store

Howard: 00:27:06

all of their analytic data on us.

Howard: 00:27:11

So that we're much faster than a disk based object store, obviously.

Howard: 00:27:15

And they can do that processing faster.

Howard: 00:27:19

We have a lot of hedge funds who do time series analysis of trade

Howard: 00:27:24

data against large databases to try and predict the market.

Howard: 00:27:29

We have a lot of life sciences customers who are doing things like

Howard: 00:27:34

molecular modeling and cryo-electron microscopy, where one microscope generates

Howard: 00:27:42

many terabytes of data a day because we have very high resolution images.

Howard: 00:27:48

And we have a major motion picture studio who makes movies.

Prasanna: 00:27:56

And so it looks like they are using both sort of the file and the object

Prasanna: 00:28:00

interfaces for a lot of these use cases.

Prasanna: 00:28:03

So specifically around data protection and backup.

Prasanna: 00:28:08

A lot of times you hear vendors' customers say, object

Prasanna: 00:28:12

store doesn't need to be backed up.

Howard: 00:28:19

This is a subject that personally I find myself on the fence about. Part

Howard: 00:28:29

of me goes I've built a huge amount of resiliency into this single system.

Howard: 00:28:37

And for availability, I may need to have it in another

Howard: 00:28:44

location, but for durability, assuming that the whole data center doesn't end

Howard: 00:28:50

up being a smoking hole in the ground I could get away without backing this up.

Howard: 00:28:57

I remain firmly on the fence there.

Howard: 00:29:03

But

curtis: 00:29:05

assuming you have the second copy somewhere, you're going to

curtis: 00:29:11

write.

Howard: 00:29:12

I may decide that it is data that, if the whole data

Howard: 00:29:16

center goes away, I don't need.

curtis: 00:29:17

Okay.

curtis: 00:29:18

Yeah.

curtis: 00:29:18

Agreed.

curtis: 00:29:19

Yeah, if we have that data, I would argue, why did we make

curtis: 00:29:23

it in the first place, but,

Howard: 00:29:24

The risk of that is small enough that I'm

Howard: 00:29:29

going to go once every thousand years this is going to cost me a million

Howard: 00:29:33

dollars, but it's going to cost me a million dollars a year to protect.

Howard: 00:29:36

So I'm going to take that risk.
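
Howard's trade-off is an expected-loss comparison, and it can be written out directly. The dollar figures are the ones he uses in the conversation; the function name and the simple "loss divided by years between events" model are assumptions made for this sketch.

```python
def expected_annual_loss(loss_per_event, years_between_events):
    """Average yearly cost of an event that happens once every N years."""
    return loss_per_event / years_between_events


LOSS_PER_EVENT = 1_000_000        # cost of losing the data once
YEARS_BETWEEN_EVENTS = 1_000      # "once every thousand years"
PROTECTION_COST_PER_YEAR = 1_000_000

annual_loss = expected_annual_loss(LOSS_PER_EVENT, YEARS_BETWEEN_EVENTS)
worth_protecting = PROTECTION_COST_PER_YEAR < annual_loss

print(annual_loss, worth_protecting)
```

On these numbers the expected loss is a thousandth of the yearly protection cost, which is why he would accept the risk for that class of data. A different loss figure or event frequency can flip the answer, which is Curtis's point about how rarely such data classes turn up.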

curtis: 00:29:38

okay.

curtis: 00:29:41

So I will agree that such data classes exist.

curtis: 00:29:45

I don't run into them much, but I will agree

Howard: 00:29:47

yeah.

Howard: 00:29:47

And then we get to the, okay, so this is the object store that does a

Howard: 00:29:53

deep dispersal coding, and they have three locations and I can lose one.

Howard: 00:29:59

So do I need to back that up?

Howard: 00:30:02

That starts getting really close to now I need to back it up because there could be

Howard: 00:30:07

a bug in the software that loses my data.

Howard: 00:30:11

'Cause that's the only thing that could cause that. It's like

Howard: 00:30:14

I'm protected against one of my three data centers being a smoking hole.

Howard: 00:30:18

Again, I could see you going, I want to be safe, and I can

Howard: 00:30:26

see you going, it's not worth it.

curtis: 00:30:28

And.

Howard: 00:30:29

Now for us, most of our users use us for primary storage.

Howard: 00:30:34

And for someone like that, big data analytics data, they may not back it

Howard: 00:30:40

up because it's regenerable, and it's not actually in the form

Howard: 00:30:45

it's in on the object store, but it's extracts from other things and they

Howard: 00:30:50

can run the ETL again and it would be really annoying, but it is replaceable.

Howard: 00:30:57

And then for other use cases, this is primary data.

Howard: 00:31:01

I gotta protect it.

Howard: 00:31:03

And so we can do snapshots to an S3 compatible object store

Howard: 00:31:08

and back ourselves up that way.

Howard: 00:31:11

Or you can back us up the usual ways.

curtis: 00:31:16

And could you use one of the, like ones that are like

curtis: 00:31:24

Glacier Deep Archive, where I hope I don't ever have to use this.

curtis: 00:31:27

I know it's going to cost me a crap ton of money, but it'll save me a lot of money.

curtis: 00:31:30

In the meantime, can you use that kind of storage?

Howard: 00:31:36

Reading data out of that kind of storage

Howard: 00:31:41

requires a few manual steps.

Howard: 00:31:44

If you just use S3 standard then data in those snapshots is available

Howard: 00:31:52

in a .Remote folder, like the .Snapshots folder in the file system.

Howard: 00:31:57

So users can do self-service restore, but

Howard: 00:32:02

that feature means the object has to be immediately readable.

Howard: 00:32:09

And so if you, if it went to

Howard: 00:32:11

Glacier, then.

Howard: 00:32:15

And it would be like your NetBackup:

Prasanna: 00:32:19

Okay.

Howard: 00:32:20

this backup isn't in the catalog anymore.

Howard: 00:32:22

So I've got to put those files someplace where I can catalog it, and then I've got

Howard: 00:32:26

to catalog it, and then I can restore it.

Howard: 00:32:30

So if you

curtis: 00:32:31

so it's possible.

curtis: 00:32:32

doesn't sound like it's very it's the smoking hole copy, right?

Howard: 00:32:39

It is annoying.

Howard: 00:32:41

But if it's just, but if you're protecting against the smoking hole,

Howard: 00:32:44

then you know, you may be willing to put up with the annoyance.

curtis: 00:32:48

I'm pretty sure we've said smoking hole more times

curtis: 00:32:51

than we've on this podcast.

curtis: 00:32:53

Just for the record.

curtis: 00:32:55

Just saying

curtis: 00:32:57

It's getting a lot of play today.

Howard: 00:33:00

I spent way too long as a disaster recovery planner.

curtis: 00:33:03

Yeah.

curtis: 00:33:04

Yeah.

curtis: 00:33:08

So the majority of your customers use you for primary storage, but clearly

curtis: 00:33:13

you're trying to expand your TAM,

Howard: 00:33:15

Well, we deliver all-flash at a substantially lower

Howard: 00:33:21

price than anybody else does.

Howard: 00:33:24

We start with using the cheapest QLC flash.

Howard: 00:33:28

We have a file system designed to treat that flash properly.

Howard: 00:33:34

So we never do small writes that would consume a lot of write amplification.

Howard: 00:33:41

We do very wide erasure code stripes.

Howard: 00:33:45

So we've got under 3% overhead, and then we do guaranteed better data reduction

Howard: 00:33:52

than anybody else in the business.
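
As a rough illustration of why very wide erasure code stripes keep overhead low, here is a minimal sketch. The stripe geometries below are illustrative assumptions, not figures from the conversation, and the function name is hypothetical; only the "under 3%" target comes from what Howard says.

```python
def parity_overhead(data_strips: int, parity_strips: int) -> float:
    """Return the fraction of a stripe's raw capacity that holds parity."""
    if data_strips <= 0 or parity_strips < 0:
        raise ValueError("data_strips must be positive and parity_strips non-negative")
    return parity_strips / (data_strips + parity_strips)


# Hypothetical geometries (data + parity), chosen only to show the trend.
geometries = {"8+2": (8, 2), "16+2": (16, 2), "146+4": (146, 4)}

for name, (data, parity) in geometries.items():
    print(f"{name}: {parity_overhead(data, parity):.1%} of raw capacity is parity")
```

The wider the stripe for a fixed number of parity strips, the smaller the share of flash spent on protection, which is the effect being described.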

Howard: 00:33:56

And so that combination means that on an effective byte basis, from whatever backup

Howard: 00:34:06

data mover you're planning on using, we're going to be cheaper than a Data Domain.

Howard: 00:34:11

When you start saying that it's, you have more than a petabyte of data

Howard: 00:34:16

and you need multiple Data Domains.

Howard: 00:34:19

And each one of those is going to be a separate deduplication realm.

Howard: 00:34:23

Then the gap starts to grow substantially.

Howard: 00:34:26

So if so for these very large customers who have five or 10 or 20

Howard: 00:34:30

petabytes data across a bunch of Data Domains, simply the fact that we're

Howard: 00:34:36

one reduction realm makes us much more efficient than that can be.

Howard: 00:34:44

it's one system to manage.

Howard: 00:34:45

It's one namespace. It's one 20-petabyte or 50-petabyte system.

curtis: 00:34:53

So you're saying, so let me just make sure I understood

curtis: 00:34:56

what you said there correctly.

curtis: 00:34:59

You're saying, regardless of the size of the system, you should

curtis: 00:35:03

be priced competitively with a Data Domain, but then the bigger you get,

curtis: 00:35:07

the better you look.

Howard: 00:35:08

under about 500, any pricing experiments under about 500 terabytes,

curtis: 00:35:14

Okay.

curtis: 00:35:14

Okay.

Howard: 00:35:15

in the large end of the business, but yes.

curtis: 00:35:16

Right, That is interesting though, that sort of.

curtis: 00:35:22

into that end of the business.

curtis: 00:35:25

And you had another there was another large, all flash competitor that's

curtis: 00:35:32

doing very well, but they have a very different architecture. I'm referring,

curtis: 00:35:37

of course, to the orange company.

curtis: 00:35:40

And

Howard: 00:35:41

Yeah, but there,

curtis: 00:35:42

than you.

Howard: 00:35:45

If you're talking about FlashBlade, that's really a shared-nothing

Howard: 00:35:50

architecture. Instead of being pizza box servers, they're blade servers, and each

Howard: 00:35:57

blade has flash modules built in. And they don't scale nearly as large.

curtis: 00:36:09

So it sounds like you, you just took, you've built an

curtis: 00:36:13

architecture based on several new pieces of technology that simply

curtis: 00:36:17

weren't available, say, five years ago,

Howard: 00:36:21

Yeah.

Howard: 00:36:23

We are the storage system designed from a clean slate around the 2016 toolbox.

Howard: 00:36:33

So QLC flash,

Howard: 00:36:36

SCM, NVMe over fabrics. Other people shoehorn one or two of those technologies

Howard: 00:36:46

into an existing architecture, but we built the whole architecture

Howard: 00:36:51

around having those technologies.

Howard: 00:36:56

Yeah, putting all of the metadata in SCM with no cache meant it had to be in SCM.

Howard: 00:37:02

And it meant the connection between the compute server and that SCM had to be

Howard: 00:37:07

fast enough that we weren't going, "If we cached this, it would be a lot faster."

Howard: 00:37:13

So that meant it had to be NVMe over fabrics.

Howard: 00:37:17

And then the QLC flash gives us the cost.

Howard: 00:37:20

But it, really is if you look at any storage system, it's by definition built

Howard: 00:37:28

with the parts that the industry is making when they sat down to design it.

curtis: 00:37:35

Yeah.

Howard: 00:37:37

And when the x86 processors, when Nehalem came along and the

Howard: 00:37:46

memory bandwidth and the number of PCIe lanes on processors got big enough,

Howard: 00:37:53

All of a sudden we stopped seeing FPGAs and ASICs in storage systems, we started

Howard: 00:37:57

seeing software defined storage, cause what was available for the designers

Howard: 00:38:02

changed and the NVMe over fabrics has been used by most of the storage

Howard: 00:38:10

vendors for that last mile connection going well, it's going to be fast and

Howard: 00:38:14

then fiber channel or iSCSI for the user machine to access the storage.

Howard: 00:38:20

But it hasn't been as effectively used for the server that is the logical

Howard: 00:38:26

controller to access the media on the back end and the way we use it, we broke the

Howard: 00:38:36

traditional limitation that a drive had to be owned by one or two controllers.

Howard: 00:38:42

'Cause a drive, a SAS drive or an NVMe drive, has one or two ports.

Prasanna: 00:38:47

Yea.

Howard: 00:38:49

We connect that NVMe SSD to what we call a fabric module, which

Howard: 00:38:56

is an NVMe over fabrics router.

Howard: 00:38:59

And in fact, in the new box, it's going to be a pair of Nvidia Bluefield cards

Howard: 00:39:07

and the Bluefield card routes, NVMe over fabrics requests from the ethernet network

Howard: 00:39:13

to the SSDs and routes the responses back.

Howard: 00:39:16

But that's all it does.

Howard: 00:39:18

We don't need x86 servers in the enclosure.

Howard: 00:39:22

We can do it on the ARMs and the offloads and the Bluefields.

Prasanna: 00:39:25

and these are the DPUs, correct?

Howard: 00:39:27

Yes.

Howard: 00:39:28

Yeah.

Howard: 00:39:28

The Bluefield is the DPU. It's the Nvidia Mellanox version of that.

Howard: 00:39:36

And so it has some ARM cores and NVMe over fabrics and RDMA and

Howard: 00:39:42

other built-in offloads in the chip.

Howard: 00:39:45

And so we leverage that to do the routing of requests from the front

Howard: 00:39:49

end servers, everything is, all the work gets done the SSDs and get that

Howard: 00:39:58

clean fast, more cost-effective channel

curtis: 00:40:03

Let me go back in time when you did that first presentation that

curtis: 00:40:09

you did to the Storage Field Day folks,

Howard: 00:40:12

Yep.

curtis: 00:40:13

how did that go over with, with those folks?

Howard: 00:40:16

It went over pretty well.

Howard: 00:40:18

There was a little being from Missouri and,

Howard: 00:40:23

you,

Howard: 00:40:23

know, we should show you,

curtis: 00:40:24

'Cause you were brand new

curtis: 00:40:25

at that point.

Howard: 00:40:26

We were brand new.

Howard: 00:40:29

And now we're going, okay, look, we've sold a couple of exabytes of storage.

Howard: 00:40:36

Now, our go-to-market model's a little different: we sell software.

Howard: 00:40:42

We arrange for customers to buy the pre-approved hardware at cost.

Howard: 00:40:50

And the

Howard: 00:40:51

software licenses are,

curtis: 00:40:53

a little interesting.

Howard: 00:40:54

and the software licenses are transferable.

Howard: 00:40:58

So you license a petabyte of software.

Howard: 00:41:03

And you upgrade the hardware when you feel like you want to upgrade the hardware.

Howard: 00:41:06

Cause you want the denser faster one that is always coming, but we'll write

Howard: 00:41:11

the support contract for 10 years for any appliance from install date.

Prasanna: 00:41:18

That's very different

Howard: 00:41:21

well, a typical

Howard: 00:41:22

vendor, you would buy an appliance, it would come with an OEM software license.

Howard: 00:41:27

They would write five years of support.

Howard: 00:41:30

And in year six they would encourage you very strongly to rebuy.

Prasanna: 00:41:35

yep.

Howard: 00:41:37

And then when you rebuy, you have to buy another appliance the

Howard: 00:41:41

software license isn't transferable.

Howard: 00:41:43

So you have to buy another software license.

Howard: 00:41:46

So with us, you go to a VAR. We're a hundred

Howard: 00:41:53

percent channel, so you go to a VAR.

Howard: 00:41:55

Your VAR goes to Avnet and says, "I want this hardware for Vast."

Howard: 00:42:02

Now $1.2 million average selling price.

Howard: 00:42:07

One of our sales guys is involved.

Howard: 00:42:09

We're writing the high touch sale.

Howard: 00:42:12

It's not somebody went on a website someplace.

Howard: 00:42:17

Um, but essentially the VAR, writes two POs: one to Avnet for the hardware and one

Howard: 00:42:27

to us for the actually he writes one PO to Avnet, Avnet cuts us a PO for the software

Howard: 00:42:37

and, that's a capacity subscription.

Howard: 00:42:43

So if you bought a 675 terabyte, enclosure and an appliance, that's got

Howard: 00:42:51

four servers that provide the front end, which is our usual entry point.

Howard: 00:42:56

You could license a hundred terabytes for a year.

Howard: 00:43:00

Multiples of a hundred terabytes for multiples of a year.

curtis: 00:43:06

And so that, I think that addresses the question that I had.

curtis: 00:43:10

Cause I listened to the Chris Evans podcasts that you guys did.

Howard: 00:43:16

Yeah.

curtis: 00:43:17

and there was this talk of the 10 year And, again I'm gonna, I'm gonna just

Howard: 00:43:25

Perfect.

curtis: 00:43:26

acknowledge that I live in a SaaS world where we preach against

curtis: 00:43:31

large capacity licensing and capital purchases and all of that stuff.

curtis: 00:43:37

So when I heard 10 year purchase.

curtis: 00:43:41

I was like, what?

curtis: 00:43:41

I gotta, I got to decide now how much I need for 10 years, but that doesn't

curtis: 00:43:46

sound like what you're talking about.

Howard: 00:43:47

No, No, no.

Howard: 00:43:48

no.

Howard: 00:43:49

You buy the hardware.

curtis: 00:43:52

Right.

Howard: 00:43:52

We will write a support contract and software license.

Howard: 00:43:58

One agreement.

Howard: 00:43:59

For that hardware for up to 10 years from install date at the same rate.

Howard: 00:44:08

So if you want to keep it for 10 years, you keep it for 10 years

Howard: 00:44:14

Bought

curtis: 00:44:15

I could buy a smaller one and then add capacity.

Howard: 00:44:19

Oh yeah.

Howard: 00:44:22

Our ARR is three.

Howard: 00:44:27

Lots of people buy small and add capacity.

Howard: 00:44:35

We had a 300% NRR.

Prasanna: 00:44:37

I think you meant NRR,

Prasanna: 00:44:38

right?

curtis: 00:44:39

Thanks for explaining.

curtis: 00:44:42

Yeah.

curtis: 00:44:42

NRR,

curtis: 00:44:43

you said ARR.

curtis: 00:44:45

That's why you

curtis: 00:44:45

have me confused there for a minute.

Howard: 00:44:47

Yeah

curtis: 00:44:48

I was like an annual recurring revenue of three, three.

curtis: 00:44:52

You meant net retention rate, you're saying?

curtis: 00:44:55

yeah.

curtis: 00:44:56

So you're saying 300% your customers start out at X and they end up

curtis: 00:45:01

with three X very regularly.

curtis: 00:45:05

Okay.

Howard: 00:45:08

You know, and you can do that.

curtis: 00:45:09

it just grows as they need it to grow.

Howard: 00:45:12

Yeah.

Howard: 00:45:12

And you can do it in the hardware, so if you want to start really small, then

Howard: 00:45:17

you can buy hardware and license it

Prasanna: 00:45:21

oh, interesting.

Howard: 00:45:22

So you can buy a 600 terabyte box and a hundred terabyte software

Howard: 00:45:28

license, and the 600 terabyte box you bought at what would be our cost

Howard: 00:45:34

if we were still selling hardware. We negotiate the cost with Intel

Howard: 00:45:38

and Kioxia and those vendors.

Prasanna: 00:45:42

so you used to sell hardware and then you

Prasanna: 00:45:43

of,

Howard: 00:45:44

started off in an appliance model.

curtis: 00:45:49

Why would I do that?

curtis: 00:45:51

Is that just like ease of large capital purchase thing?

Howard: 00:45:55

Yeah.

curtis: 00:45:56

why

curtis: 00:45:56

would I buy a bigger box

Howard: 00:45:57

We had a university that had this much money in this year's budget.

curtis: 00:46:04

Oh, okay.

Howard: 00:46:05

We won't put more than a hundred terabytes on it before the next budget

Howard: 00:46:09

comes around when we renew, we'll renew it as a 400 terabyte license.

Prasanna: 00:46:15

and I think this is where at the beginning, you said Howard, that you're

Prasanna: 00:46:18

looking at releasing a smaller unit.

Howard: 00:46:22

Yeah.

Howard: 00:46:22

So the new box is 1U.

Howard: 00:46:24

You,

Howard: 00:46:25

it uses the E1.L, the ruler form factor SSDs.

Howard: 00:46:31

So we have 22 15-terabyte SSDs for 338 raw, about 300 usable.

Howard: 00:46:40

And that's half the physical size, half the capacity, because what we

Howard: 00:46:47

have now holds 56 SSDs in 2U.

Prasanna: 00:46:51

Gotcha.

Howard: 00:46:54

Yeah, the fabric modules are those NVMe routers. Today,

Howard: 00:47:00

each one has to be a dual Xeon

Howard: 00:47:02

so we have enough PCIe

Howard: 00:47:04

lanes, and the processors don't do hardly anything.

Howard: 00:47:09

So there's costs there

Howard: 00:47:11

we don't need with the Bluefield

Howard: 00:47:13

thing.

Prasanna: 00:47:15

That's exciting.

curtis: 00:47:15

right.

curtis: 00:47:18

So let's, focus for a little bit on.

curtis: 00:47:24

When I historically heard the

curtis: 00:47:29

idea of using flash for backup, I'm like, that sounds ridiculous

curtis: 00:47:35

for cost reasons: too expensive. I'm hearing you, so I

curtis: 00:47:44

would put it this way that, in, in this upcoming world, in this current world

curtis: 00:47:51

in a world where we have large nation states invading other nation states

curtis: 00:47:57

and then large ransomware organizations in those countries, we had this, was

curtis: 00:48:04

our last th they're talking about.

curtis: 00:48:08

So we're, talking about being retaliated against because of this other country.

curtis: 00:48:13

It's crazy.

curtis: 00:48:15

So you have this this, need more than ever before for large recoveries.

curtis: 00:48:25

And I, do believe strongly that there's really only one of two

curtis: 00:48:28

ways to be really successful in any sort of ransomware situation.

curtis: 00:48:35

And it's basically about fighting the laws of physics. Either you

curtis: 00:48:38

have to have already restored it.

curtis: 00:48:40

So you already have a hot standby ready to go to switch over to or you're

curtis: 00:48:48

doing live mount directly from your backup and live mount directly from

curtis: 00:48:54

your backup is only going to happen if you either aren't deduplicating

curtis: 00:49:02

like, the way Data Domain does, or

Howard: 00:49:04

Right.

curtis: 00:49:04

have flash, as far

curtis: 00:49:06

as I can tell.

Howard: 00:49:06

Even if you're not deduplicating, when you start talking

Howard: 00:49:11

about big hard drives, the IO density just

Howard: 00:49:15

isn't there. It's better

curtis: 00:49:19

Somewhere between you and Data Domain, I would put ExaGrid,

curtis: 00:49:23

because ExaGrid has that front end.

curtis: 00:49:25

It's not deduplicated. Now,

curtis: 00:49:28

they're nowhere near the size of you.

Howard: 00:49:30

right, no.

Howard: 00:49:30

And they have some flash cache.

Howard: 00:49:35

And if you look at guys who do integrated appliances where the

Howard: 00:49:39

software and the target are one thing, those are typically hybrids.

Howard: 00:49:44

And, so they'll do an instant recover for one or two VMs pretty well.

Howard: 00:49:48

Cause there's enough flash for that.

Howard: 00:49:51

But when you start going, I need the database server behind my ERP, instant

Howard: 00:49:57

recovered, or I need all 50 of these VMs instant recovered, then you

Howard: 00:50:05

just don't have enough flash and you're going to get hard drive performance,

curtis: 00:50:09

And so

curtis: 00:50:10

it sounds like you've replaced the hard drives with QLC

Howard: 00:50:14

right,

curtis: 00:50:15

Help me because I don't live in this world QLC from

curtis: 00:50:18

a cost perspective regular.

Howard: 00:50:22

It's not just QLC.

Howard: 00:50:24

So QLC means quad-level cell: it holds four bits per cell.

curtis: 00:50:30

okay?

Howard: 00:50:31

The more bits you hold, the closer the voltage levels

Howard: 00:50:35

that represent the differences are, and the more sensitive the cells

Howard: 00:50:40

become to a few electrons escaping.

Howard: 00:50:45

If you have SLC, it's like a light switch it's on or off,

Howard: 00:50:51

It doesn't matter if a few electrons escape, you can still

Howard: 00:50:53

tell whether it's on or off.

Howard: 00:50:55

QLC.

Howard: 00:50:56

You got 16 values.

Howard: 00:50:59

The difference between value 13 and value 14 might only be a handful of electrons.
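
The relationship between bits per cell and voltage levels can be sketched in a few lines. This is a simplified model that assumes evenly spaced levels within a fixed voltage window, which is an assumption for illustration and not something stated in the conversation; the function names are hypothetical.

```python
# Bits stored per cell for the common NAND flash types.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(bits_per_cell: int) -> int:
    """Number of distinct charge levels a cell must distinguish."""
    return 2 ** bits_per_cell


def relative_margin(bits_per_cell: int) -> float:
    """Gap between adjacent levels as a fraction of the SLC gap.

    Assumes levels are evenly spaced across the same voltage window,
    so N levels leave N - 1 equal gaps.
    """
    return 1 / (voltage_levels(bits_per_cell) - 1)


for name, bits in CELL_TYPES.items():
    print(f"{name}: {voltage_levels(bits)} levels, "
          f"margin {relative_margin(bits):.3f} of SLC")
```

Under that simplification, each extra bit roughly halves the room between neighboring values, which is why a few escaped electrons matter so much more for QLC than for SLC.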

Howard: 00:51:06

So QLC has less endurance.

Howard: 00:51:09

Cause every time you erase it, the insulating layers wear down

Howard: 00:51:13

a little and a few more electrons have opportunities to escape.

Howard: 00:51:18

And it's slower to write because you have to adjust the voltage level just right

Howard: 00:51:23

to be one of those 16 voltage levels.

Howard: 00:51:25

And that takes a little bit longer.

Howard: 00:51:28

Now the slower to write, we don't really care about because

Howard: 00:51:31

we acknowledge the writes while it's still in the SCM.

Howard: 00:51:35

So as long as we are flushing that data out of the SCM, in bandwidth terms

Howard: 00:51:41

fast enough, Latency is unimportant.

Howard: 00:51:46

and the endurance we specifically do a lot of things in our

Howard: 00:51:50

software to manage endurance.

Howard: 00:51:53

So we write very large writes so that the SSD doesn't have to garbage collect

Howard: 00:52:01

internally to accommodate small writes.

Howard: 00:52:05

We erase very large erases so that we delete all of the data in an erase block

Howard: 00:52:12

in the flash so that the SSD doesn't have to garbage collect internally.

Howard: 00:52:17

And that means not only can we use QLC, but we can use dirt cheap QLC

Howard: 00:52:23

SSDs that don't have a DRAM buffer in them to protect the QLC from wear.

Howard: 00:52:33

If you have a DRAM buffer, then you can aggregate multiple small

Howard: 00:52:37

writes, but now if power fails, it's DRAM, you lose the data.

Howard: 00:52:42

So you need a power fail protection circuit, and you need big capacitors

Howard: 00:52:47

to power the power fail protection

Howard: 00:52:49

circuit so that you can dump the DRAM into flash,

Howard: 00:52:54

right, and it all starts to add up.

Howard: 00:52:56

So the SSDs we buy, the other customers are hyperscalers.

Howard: 00:53:02

They put them in servers.

Howard: 00:53:04

They only need one port they're writing long tail data.

Howard: 00:53:08

It's not like they're overwriting this stuff all the time.

Howard: 00:53:10

It's just too many people are looking at that drunken frat

Howard: 00:53:14

boy picture on Facebook for it to be on disk, so it's on flash.

Howard: 00:53:23

We're leveraging all of that so that we can literally

Howard: 00:53:29

use that lowest cost flash.

Howard: 00:53:32

And do the 10 year support because the 10 year support includes if the

Howard: 00:53:37

SSD wears out, we'll replace it.

Prasanna: 00:53:41

cause normally QLC isn't rated for that long.

Prasanna: 00:53:45

I believe.

Prasanna: 00:53:45

Right.

Prasanna: 00:53:45

SLC is years

Howard: 00:53:48

SLC is the very high endurance flash, but the typical

Howard: 00:53:53

flash that you see for volume use today is TLC triple level cell.

Howard: 00:53:58

So it's three bits instead of four bits.

Howard: 00:54:00

So QLC is 30% cheaper to make because it holds more bits per cell.

Howard: 00:54:07

And QLC has substantially less endurance.

Howard: 00:54:13

So when you start looking at enterprise SSDs on Newegg,

Howard: 00:54:19

the 0.1 drive-write-per-day SSD is slightly better than the ones we use.

Howard: 00:54:27

And the three drive-write-per-day SSD, you notice, has less capacity because

Howard: 00:54:33

it's got the same amount of flash.

Howard: 00:54:35

It's just more over-provisioned so they can wear level across more of it.

Howard: 00:54:40

And the three drive-write-per-day SSD probably has a DRAM cache

Howard: 00:54:44

and all this stuff to protect it.
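
To put the drive-writes-per-day (DWPD) ratings in rough numbers, here is a minimal sketch. The 15 TB capacity echoes the drive size mentioned earlier in the conversation and the 10-year period echoes the support term; applying both ratings to the same capacity and period is an assumption made only for comparison, and the function name is hypothetical.

```python
def lifetime_writes_tb(capacity_tb: float, dwpd: float, years: float) -> float:
    """Total terabytes that can be written at a given DWPD rating.

    DWPD means full-drive writes per day, so the total is
    capacity x DWPD x number of days.
    """
    days = 365 * years
    return capacity_tb * dwpd * days


# Assumed comparison: same nominal capacity and period, different ratings.
low_endurance = lifetime_writes_tb(capacity_tb=15, dwpd=0.1, years=10)
high_endurance = lifetime_writes_tb(capacity_tb=15, dwpd=3, years=10)

print(f"0.1 DWPD: about {low_endurance:,.0f} TB written over 10 years")
print(f"3 DWPD:   about {high_endurance:,.0f} TB written over 10 years")
print(f"Ratio: {high_endurance / low_endurance:.0f}x")
```

The gap between the two ratings is the write budget that the software has to live within when it uses the lower-endurance drives.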

Prasanna: 00:54:46

Yeah

Howard: 00:54:47

And that's what most enterprise storage systems need because how

Howard: 00:54:53

they put the data in the drive dates back to when it was a disk drive.

Howard: 00:54:59

And you were trying to keep data logically adjacent, not try and manage

Howard: 00:55:05

the write pool inside the drive.

Prasanna: 00:55:07

yeah,

Howard: 00:55:07

The requirements were different.

curtis: 00:55:09

Yeah.

curtis: 00:55:10

Interesting.

curtis: 00:55:10

Yeah.

curtis: 00:55:10

So again, going back to.

curtis: 00:55:11

the fact that you built this from scratch with that toolbox

curtis: 00:55:17

from 2016, and you were like we need to, manage write leveling,

Howard: 00:55:21

And look, our founder Renen Hallak was the chief engineer at XtremIO.

Howard: 00:55:28

And when he got tired of working for Michael Dell, he got to talk to XtremIO

Howard: 00:55:33

customers and find out what they wanted.

Howard: 00:55:35

And nobody said we want faster. XtremIO was already all flash.

Howard: 00:55:40

They were still adjusting to all flash.

Howard: 00:55:43

And it was plenty fast, but everybody wanted to be able to use

Howard: 00:55:47

that all flash for more things.

Howard: 00:55:50

And so our whole system is designed to provide very high, random read

Howard: 00:55:57

performance, across large amounts of flash at an affordable price.

curtis: 00:56:04

Got it.

Howard: 00:56:06

And so our performance asymmetry is exactly

Howard: 00:56:09

the opposite of Data Domain's.

curtis: 00:56:14

wait, explain what you just said.

Howard: 00:56:17

Our performance asymmetry is exactly the opposite of Data Domain's.

Howard: 00:56:23

They don't publish restore speeds anymore.

Howard: 00:56:26

Haven't for years. We publish read speeds and write speeds, and reads

Howard: 00:56:32

are eight times faster than writes.

Prasanna: 00:56:34

That doesn't mean your writes are slow either.

Prasanna: 00:56:36

Just for

Howard: 00:56:38

No. Our smallest system does five gigabytes per second of writes.

Howard: 00:56:46

Yeah.

Howard: 00:56:47

Or your story system probably doesn't keep up with that, but that's the SLOs.

Howard: 00:56:54

But what that means is if you scale a system the traditional way, and

Howard: 00:56:59

you say, I need to move this many terabytes over this many hours, so you

Howard: 00:57:03

have to scale it by write performance.

Howard: 00:57:07

Your backups are going to be much faster than your restores.

Howard: 00:57:12

Excuse me.

Howard: 00:57:12

your restores are much

Howard: 00:57:13

faster than your

Howard: 00:57:14

backups.

Prasanna: 00:57:14

Yeah,

Howard: 00:57:15

Yeah

Howard: 00:57:16

we read much faster than we write.

Howard: 00:57:18

And so if you size for backup speed, your restore

Howard: 00:57:21

speed's going to be

curtis: 00:57:22

yeah.

Howard: 00:57:23

nice.
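
The sizing argument can be sketched as simple arithmetic. The 8x read-to-write ratio comes from the conversation; the 100 TB data set and 8-hour backup window are made-up inputs, the function names are hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
def required_write_gbps(data_tb: float, window_hours: float) -> float:
    """Write throughput (GB/s) needed to land data_tb within the backup window."""
    return data_tb * 1000 / (window_hours * 3600)


def restore_hours(data_tb: float, write_gbps: float,
                  read_to_write_ratio: float = 8.0) -> float:
    """Hours to read data_tb back, given reads are a multiple of write speed."""
    read_gbps = write_gbps * read_to_write_ratio
    return data_tb * 1000 / read_gbps / 3600


# Assumed example: 100 TB must be backed up in an 8-hour window.
write_speed = required_write_gbps(data_tb=100, window_hours=8)
print(f"Write throughput needed: {write_speed:.2f} GB/s")
print(f"Full restore at 8x read speed: {restore_hours(100, write_speed):.1f} hours")
```

In other words, a system sized only to meet the backup window ends up with a restore time that is a fraction of that window, rather than a multiple of it.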

curtis: 00:57:27

All right.

curtis: 00:57:29

Consider me impressed, Howard.

curtis: 00:57:31

you know, I,

Prasanna: 00:57:32

do by the way

Howard: 00:57:34

I I've

curtis: 00:57:34

I, I,

Howard: 00:57:35

time.

Howard: 00:57:36

I've impressed him once.

Howard: 00:57:38

This makes twice.

Howard: 00:57:39

I'm really, I'm happy with that,

curtis: 00:57:42

yeah it sounds like you're, clearly you've been

curtis: 00:57:47

in the business a long time.

curtis: 00:57:48

You've seen those companies that have really interesting technology

curtis: 00:57:51

and nobody's buying anything.

curtis: 00:57:53

You're not that you,

Howard: 00:57:55

but

curtis: 00:57:55

the really interesting technology, but you're also actually selling it,

curtis: 00:57:59

right?

Howard: 00:58:00

I decided it was time to get a job.

Howard: 00:58:03

And I talked to the folks at Vast, who were still in stealth.

Howard: 00:58:07

And I said to myself, look, Howard, you're a storyteller.

Howard: 00:58:10

And this is a really good story.

Howard: 00:58:14

And it doesn't matter whether it succeeds or not.

Howard: 00:58:17

You're going to have a good story to tell.

Howard: 00:58:21

And lo and behold, it's one of those cases where it was a good

Howard: 00:58:25

story and the market requirement fit.

Howard: 00:58:30

And

curtis: 00:58:31

don't have to create the need.

Howard: 00:58:34

We are selling. We have, for the past couple of years,

Howard: 00:58:41

done comparisons with all the storage companies that have gone public.

Howard: 00:58:44

Yeah.

Howard: 00:58:44

We're growing faster than all of them put together

curtis: 00:58:48

all right Howard thanks a lot for coming on.

curtis: 00:58:52

We might have to have you back.

curtis: 00:58:53

'Cause I know we've just begun to scratch the surface, but

curtis: 00:59:00

sounds like you got a good gig over there.

curtis: 00:59:02

I'm glad.

curtis: 00:59:03

Both of us could be

curtis: 00:59:04

employed.

curtis: 00:59:07

well.

Howard: 00:59:08

For the people who have known us a long time,

Howard: 00:59:10

it really must be shocking that you and I have both had the same job multiple years, but

Howard: 00:59:19

I'm still having fun at Vast.

Howard: 00:59:21

And there's lots of interesting stuff still to come.

Howard: 00:59:27

Having taken a fresh eye to the market.

Howard: 00:59:31

We got all sorts of good stuff coming.

curtis: 00:59:35

Cool.

curtis: 00:59:36

All right.

curtis: 00:59:37

I wish you the best.

curtis: 00:59:38

And thanks Prasanna.

curtis: 00:59:40

This is one of those cases where your background was very helpful.

curtis: 00:59:43

I think,

Prasanna: 00:59:45

Oh, I try.

Prasanna: 00:59:47

I try,

Prasanna: 00:59:49

Yeah Yeah.

Prasanna: 00:59:56

Having spent a bunch of time building storage arrays.

Prasanna: 00:59:59

It helps, but

Prasanna: 01:00:03

no, it's still interesting problems though, and, yeah.

Prasanna: 01:00:07

Thank you, Howard, for sharing some of the details and indulging in my questions.

Prasanna: 01:00:10

So.
