Moving from relational data to events (event-driven.io)
240 points by alexzeitler 5 months ago | 134 comments



2c: if you need PostgreSQL elsewhere in your app anyway, then store your event data in PostgreSQL + FOSS reporting tools (apache superset, metabase, etc) until you hit ~2TB. After that, decide if you need 2TB online or just need daily/hourly summaries - if so, stick with PostgreSQL forever[1]. I have one client with 10TB+ and 1500 events per sec @ 600 bytes/rec (80GB/day before indexing), 2 days of detail online and the rest summarized and details moved to S3 where they can still query via Athena SQL[2]. They're paying <$2K for everything, including a reporting portal for their clients. AWS RDS multi-AZ with auto-failover (db.m7g.2xlarge) serving both inserts and reporting queries at <2% load. One engineer spends <5 hours per MONTH maintaining everything, in part because the business team builds their own charts/graphs.

Sure, with proprietary tools you get a dozen charts "out of the box", but with pgsql, your data is in one place, there's one system to learn, one system to keep online/replicate/backup/restore, one system to secure, one system to scale, one vendor (vendor-equivalent) to manage, and millions of engineers who know the system. Building a dozen charts takes an hour in systems like Preset or Metabase, and non-technical people can do it.

Note: I'm biased, but over 2 decades I've seen databases and reporting systems come & go, and good ol' PostgreSQL just gets better every year.

https://instances.vantage.sh/aws/rds/db.m7g.2xlarge?region=u...

[1] if you really need it, there are PostgreSQL-compatible systems for additional scaling: Aurora for another 3-5x scaling, TimescaleDB for 10x, CitusDB for 10x+. With each, there are tradeoffs for being slightly non-standard, so I don't recommend using them until you really need to.

[2] customer reporting dashboards require sub-second response, which is provided by PostgreSQL queries to indexed summary tables; Athena delivers in 1-2 sec via parallel scans.


Along these lines, if you need the ability to "time travel", "recover overwritten state", and "reinterpret the events of the past", sometimes all you need is audit logs that keep snapshots of pre-save data, plus a script that identifies and collects instances of a specific sequence of events, which a human can review and bulk-apply as needed to backfill the effects of new logic.

https://django-simple-history.readthedocs.io/en/latest/ and similar tools are a simple, semi-reliable way to build audit tables, or you can add Postgres triggers if you need to audit direct database access.
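For the trigger route, a minimal sketch could look like the following (the audited table "orders", the function names, and the jsonb snapshot layout are all just illustrative; assumes Postgres 11+ for EXECUTE FUNCTION):

  CREATE TABLE orders_audit (
      audit_id   bigserial   PRIMARY KEY,
      changed_at timestamptz NOT NULL DEFAULT now(),
      operation  text        NOT NULL,
      old_row    jsonb,
      new_row    jsonb
  );

  CREATE FUNCTION orders_audit_fn() RETURNS trigger AS $$
  BEGIN
      -- OLD is NULL for INSERT, NEW is NULL for DELETE
      INSERT INTO orders_audit (operation, old_row, new_row)
      VALUES (TG_OP, to_jsonb(OLD), to_jsonb(NEW));
      RETURN NULL;  -- return value of an AFTER trigger is ignored
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER orders_audit_trg
  AFTER INSERT OR UPDATE OR DELETE ON orders
  FOR EACH ROW EXECUTE FUNCTION orders_audit_fn();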

I love event sourcing in theory. In practice, there's so much boilerplate necessary to add a new CRUD workflow, or to quickly and reliably roll out the types of interventions and hotfixes that early-to-mid-stage startups need to do all the time for unforeseen circumstances. Unless you're doing something like implementing payment processing rails, event sourcing may not be the right choice.

https://news.ycombinator.com/item?id=17817375 (2018) has some good conversations on the downsides to event sourcing as well.


> I love event sourcing in theory. In practice, there's so much boilerplate necessary to add a new CRUD workflow, or to quickly and reliably roll out the types of interventions and hotfixes that early-to-mid-stage startups need to do all the time for unforeseen circumstances.

The biggest pain point I had working on an event sourced system at an early stage startup was schema changes: a quick or unplanned change would introduce something pathological into the event data for some period of time, and we wouldn't realize it was an issue until it conflicted with some other change much later on.

The discovery of these issues often came at a bad time and blocked something important, e.g. because of small differences between dev environment seed data and real production data. That led to temptation and/or pressure to "just do a quick tiny mutation" of old events to address the problematic events from the past mistake, which realistically often just caused a different issue, because reasoning about all of the potential impacts of the change was hard.

These days in early-stage land I am only interested in event sourcing for contained areas of a product where it can provide real value, rather than as a cornerstone of an entire application.


> reasoning about all of the potential impacts of the change was hard

It's far easier for us as humans to think "this change will affect new data going forward" rather than "when I make a change in the present, it will transform all data from the past as if it had always been the case." Perhaps comic book writers would make good event sourcing engineers - but I digress.

One of the saving graces of event sourcing is that you can spin up a parallel world where you make those changes, rerun your entire event history, and do a diff between your production derived data and your proposed production derived data as of a point in time, identifying the actual situations where pathological cases would occur. But this takes a lot of discipline, and it's not perfect, because events don't exist in a vacuum - they are snapshots of a human being taking an action relative to the data they saw at the time, and if you're retconning the data they saw, it might not match the reasons they took the actions they did.
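If the derived data lands in SQL tables, the diff step itself can be simple: a symmetric difference between the current projection and the reprojected one. Rough sketch (table and column names are placeholders, not anyone's real schema):

  -- rows in the current projection missing from the reprojection,
  -- plus rows in the reprojection missing from the current one
  (SELECT account_id, balance FROM account_balances
   EXCEPT
   SELECT account_id, balance FROM account_balances_reprojected)
  UNION ALL
  (SELECT account_id, balance FROM account_balances_reprojected
   EXCEPT
   SELECT account_id, balance FROM account_balances);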

A common pathological case I find is: "Someone overrode value X to Y, now I have the ability to automate that fix - but did they mean to freeze Y for a different business reason outside the one I did? If at some time in the future it changes upstream from X to Z, should I keep the user's intent to freeze Y or change it to Z because the freeze to Y was just because we didn't have the fix?"

Relational approaches don't solve this, to be sure, and in fact they give you worse tooling for answering this question. But they do incentivize the discipline of talking to stakeholders before making a giant wide-scoped UPDATE on data at rest. Event sourcing's power to retcon makes it far too easy to say "let's just reinterpret the history, what's the worst that can happen?"


From your experience, where does it bring value?


I liked it on an application with a complex conversion flow that could be initiated from a few different starting points with a few required steps and a mix of optional steps that might be executed in different orders. Storing the user's profile data from each step as an event helped us model the whole process as a state machine, which made it easier to reason about this big complex thing, as well as to introspect the previous stages of a user's journey right in the application.

I've also thought it made sense on an application that regularly synced down data from 3rd party systems, where it wasn't uncommon to have some kind of weird data introduced and struggle to reconcile when/where it came from against changes made by application users on our end. Having everything as a series of events can be really awesome in that case.


I'm not the OP, but I'll chime in: first and foremost, it's on the resume.

Of course I'm not bashing event sourcing altogether, but right now resume boosting seems to be the driving force behind applying it everywhere.

And this comes from my experience at my current workplace. Event sourcing has brought so much unnecessary complexity to an older system, and it's not even implemented properly. The engineers who took the initiative padded their resumes, then went on to greener pastures while leaving a giant mess behind.


The problem with audit logs for deriving state is that migrations are a forgetful operation. They're great for telling you who changed what and when… but if the table lives long enough, the audit logs will hold references to columns that no longer exist, or to data that is gone, with no way to trust that it can be recovered properly.

The author has another post on that site on when to avoid event sourcing.


Even in theory, the article says: "The end is near: The end of flattened data and losing business information."

Honestly, I can't see how events are not more flattened data than a properly designed relational database.

In fact, IMHO the only reason to use events exclusively, is to sell one of these "fancy" NoSQL databases.


> I love event sourcing in theory.

I've never heard this term before. From a quick scan it seems like a design tool for reasoning with unfamiliar schemas but that doesn't match the conversation. Would you mind explaining the context around its use or suggesting a link?



When a comment on HN has more merit than the article.

The only problem with postgres is that inserting has some interesting scaling problems. Putting a queue between the event sources and the db is usually recommended.


"normalize until it hurts, denormalize until it works" is evergreen advice for scaling both reads and writes. Synchronously enforcing referential integrity and other forms of normalized constraints is what gets expensive.

Pat Helland has some really good writing on this stuff, e.g. https://pathelland.substack.com/p/i-am-so-glad-im-uncoordina...


And in between, at the comma, seriously revisit your indexing policy and don't hesitate to remove unused and underused indexes (even on foreign keys if you don't need them that much). People too often underestimate the impact of rebuilding an index on large inserts.


Absolutely, there's no beating the RUM Conjecture.


> Putting a queue between the event sources and the db is usually recommended.

That depends on the nature of the events and whether you can live with the database being out-of-date while the events are still in the queue.


> Putting a queue between the event sources and the db is usually recommended.

Emphasis on _usually_. If your db is at 2% CPU utilization with/without queues ... you probably don't need a queue.


see my reply above - one classic case is db system maintenance.


thanks! forgot to mention queuing (e.g. SQS), which is SUPER valuable, for example when you want to do large scale maintenance on the database (major version upgrade where the on-disk format can change)


SQS has eventual consistency: you can get the same message twice on two different instances, for example (which was often the case for my projects). I would rather suggest Amazon MQ.


This is often possible to work around by adding a UUID to the message, on the sender side. Then either handle dupes at the DB level by ignoring duplicate inserts, or using something like redis. In practice, even with other queue systems, you can wind up with dupes due to bugs, timeouts, retries, like duplicate sends when the first one actually went through but wasn't acknowledged. I worked on systems sending thousands of messages/second through rabbitmq, originating from various third parties, and dupes would happen often enough that we needed to work around it on the receiving "worker" side.
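A rough sketch of the DB-level dedup, assuming the sender attaches a UUID to every message (the table and column names here are invented):

  CREATE TABLE processed_messages (
      message_id  uuid        PRIMARY KEY,
      payload     jsonb       NOT NULL,
      received_at timestamptz NOT NULL DEFAULT now()
  );

  -- a duplicate delivery simply inserts zero rows
  INSERT INTO processed_messages (message_id, payload)
  VALUES ('0b67bd6e-2f1c-4f55-9f29-3f1a6a1f0c11', '{"type": "order_placed"}')
  ON CONFLICT (message_id) DO NOTHING;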


SQS is weird in that you can get the same message in Japan and Germany at the exact same time and "race" to process the message. It's annoying af. I do not recommend SQS unless you are at some kind of scale (in company/team size, not revenue) where you can deal with it properly.


I used it extensively at a previous company for longer running background tasks. It was simpler to use SQS than dealing with standing up our own RabbitMQ cluster. Their Amazon MQ service did not exist at the time. Our system was built to tolerate duplicates and it worked well enough. For something higher volume I'd definitely use RabbitMQ though.


A cluster! Yeah, if you have enough messages warranting a cluster, SQS might be simpler (but probably far more expensive). It's always tradeoffs I guess.


Is the general idea to have a table with definition {id:uuid,created_at:timestamptz,data:jsonb}?

It’s difficult to get index functionality in JSONB, especially against diverse event structures and event definitions that evolve.

I guess I should become more familiar with: https://www.postgresql.org/docs/current/datatype-json.html#J...


yes re table definition.

To answer your question, for indexing there are a couple of options:

- if you know which fields you want to index, use an expression index with a regular B-tree.
- if you don't know which fields you want to index, use a GIN index on the whole field.

Blogpost which covers both cases: https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-e...
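To make that concrete against the {id, created_at, data jsonb} table from the question (the table name "events" and the field names below are invented):

  -- known field: expression index with a regular B-tree
  CREATE INDEX events_customer_idx ON events ((data ->> 'customer_id'));

  -- unknown fields: GIN index over the whole jsonb column
  CREATE INDEX events_data_gin_idx ON events USING gin (data);

  -- the GIN index serves containment queries such as
  SELECT count(*) FROM events WHERE data @> '{"type": "order_placed"}';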

That said, I generally don't bother indexing the "raw" table, but instead create a parsed+summary table that breaks out the fields I want to index into 'real' sql columns and summarizes the data to reduce the number of records. In particular, I use a partitioned table for the raw data which keeps the data-per-table tiny enough to fit in RAM, and "real time" queries just hit the latest raw table(s). For big, real-time data, I sometimes use TABLESAMPLE queries to create estimates, or triggers to do exact counting (which is what NoSQL databases do, just without the nice structure of triggers).
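Very roughly, the layout I'm describing looks like this (every name and the daily/hourly granularity are arbitrary choices for the sketch):

  CREATE TABLE events_raw (
      id         uuid        NOT NULL,
      created_at timestamptz NOT NULL,
      data       jsonb       NOT NULL
  ) PARTITION BY RANGE (created_at);

  CREATE TABLE events_raw_2024_06_01 PARTITION OF events_raw
      FOR VALUES FROM ('2024-06-01') TO ('2024-06-02');

  -- parsed + summarised table with "real" columns; this is what gets
  -- indexed and what reporting queries hit
  CREATE TABLE events_hourly (
      bucket      timestamptz NOT NULL,
      event_type  text        NOT NULL,
      event_count bigint      NOT NULL,
      PRIMARY KEY (bucket, event_type)
  );

  INSERT INTO events_hourly
  SELECT date_trunc('hour', created_at), data ->> 'type', count(*)
  FROM events_raw_2024_06_01
  GROUP BY 1, 2;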


In event sourcing, as described in DDD, the aggregate is more important. So there (also) has to be an aggregate id, which is non-unique (one aggregate has many events).

Then, there's the optimistic locking system that you'll need if there's a lot of concurrency. For that, typically, a 'sequence' number is added. Simple, effective.

Last, you're quite often only interested in one, or a few, types when doing reporting. A stringly typed 'type' column is often Good Enough.

Created at is handy, but not crucial.

So, you'll want something like

{ Id: uuid, aggregate_id: ?, sequence: Int, type: String, data: jsonb }

Querying then fetches "everything for aggregate x, ordered by sequence". Possibly limited to type X (e.g. when rebuilding a query index). But the first is what happens every time an Aggregate is hydrated (i.e. ShoppingCart::load(”acme/1337”) ) so it has to be indexed for that, primarily.
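A minimal sketch in Postgres terms (exact types and names are just a guess, adjust to taste):

  CREATE TABLE events (
      id           uuid        PRIMARY KEY,
      aggregate_id text        NOT NULL,
      sequence     int         NOT NULL,
      type         text        NOT NULL,
      data         jsonb       NOT NULL,
      created_at   timestamptz NOT NULL DEFAULT now(),
      -- doubles as the optimistic-concurrency check: two writers appending
      -- the same next sequence number will conflict, and it backs the
      -- "everything for aggregate x, ordered by sequence" hydration query
      UNIQUE (aggregate_id, sequence)
  );

  -- hydrating an aggregate, i.e. ShoppingCart::load("acme/1337")
  SELECT type, data
  FROM events
  WHERE aggregate_id = 'acme/1337'
  ORDER BY sequence;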


Let's say I want to set up such a system; any idea where I could find a detailed write-up?


This is an extremely well documented postgresql event sourcing reference implementation: https://github.com/eugene-khyst/postgresql-event-sourcing


Seconded as an excellent resource.

My team used it as a reference point for creating an event store for the write model and separate projection based read model in Golang. If you happen to be a Java shop you could obviously use the implementation too.


I was on a team once that strongly considered event sourcing. To me, it seemed like a solution looking for a problem. It could have worked for us, but we ended up passing on it as the benefits were not immediately clear, and the risk of doing something new and the lessons learned that would come with it just didn't seem in the best interest of the project/company. Maybe that makes us tools for passing up a learning opportunity, but I don't regret declining to go down that rabbit hole when there was no fox chasing us into it.


A boring, conventional system that works is a threat to a bloated engineering team who don't have any work to do or anything to polish their resumes with, and who might feel under threat of redundancy. That is the "problem" this solution solves.


Temporal databases make a lot of sense for financial data, for example.

But in most cases you can just have a normal database and store the historic changes in auxiliary tables. So the main database is kind of a materialized view.


The traditional way to handle this is to use a GAAP style transaction/journal and a roll-up summary table. The current state can be reconstructed from the read only transaction table and you don't need any complex event processing system.
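In sketch form (names made up), that's an append-only journal plus a roll-up you can recompute from it at any time:

  CREATE TABLE journal (
      entry_id   bigserial   PRIMARY KEY,
      account_id text        NOT NULL,
      amount     numeric     NOT NULL,  -- signed: credits positive, debits negative
      posted_at  timestamptz NOT NULL DEFAULT now()
  );

  -- the roll-up summary, reconstructed purely from the read-only journal
  SELECT account_id, sum(amount) AS balance
  FROM journal
  GROUP BY account_id;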


> The current state can be reconstructed from the read only transaction table ..

Isn't that called event-sourcing?


Almost every piece of data we store in SQL would be better on a document database, but since nobody is familiar with those we keep on trucking. I don’t mind too much, I don’t even think we made the wrong choice, but it does cause us some issues with how we have to handle data model changes.

I think most data storage didn't really keep pace with how a lot of software is being built now, though, and things like events and queues are what we build on top of what we have because we need them. For the most part, a lot of the data relations happen outside of our databases today through various services, because that's just how the modern IT landscape looks in many organisations. You'll have internal master data that supports different teams in the business and interacts with 300+ different IT systems and applications in order to streamline things. With microservices it's easy to keep the business logic and data models clean, but then you need to manage events, queues and data states as well as reliable storage. Which is just so complicated right now.

I do like SQL, but these days the systems we're building could frankly be put in a SQLite database and be perfectly fine. Well, almost.


Something that might be missing from these discussions is when event driven architecture is even appropriate. The short answer is: if your customer did something and expects a response, it's not event driven, that's just request/response.

Event driven is when something happens out of band. E.g. you push your code to GH, which triggers a build. In this example, reloading the page to see your updated code is request/response; the CI build that was enqueued, however, is event driven.

Hope that helps.


It's not that Simple. Request-response isn't a factor to choose ES or ED architectures on.

You can have request-response, inline, blocking, cycles with ES or ED. And you can have async without ES or ED just fine too (e.g. workers, queues, actors, multithreaded etc)


> It's not that Simple.

It can be as complicated as you want to make it. My original point was to supply the missing context that explains 95% of use cases.

> Request-response isn't a factor to choose ES or ED architectures on.

One's business use case and how it translates into an appropriate messaging pattern is arguably the most important factor to base an architecture on.

> You can have request-response, inline, blocking, cycles with ES or ED [...]

Yes, of course. And it's fair to say, at any point in your life, that one can do anything that one wants. Like, one could have rollerblades for hands, or one could implement request/response functionality with an event driven system.

But once we've moved past that, probably the next order of business is to justify the extremely good reason for overriding all prevailing common sense.


No need to be condescending.

ES and ED architectures have good use-cases. And CRUD has good use-cases, and RDBMSes and time series and document databases. None will ever fit perfectly and none is a silver bullet.

Choosing the right architecture is an art. Experience helps. And being able to look into the future even more.

Point being: as silly a parameter as "whether you need the response to a request to have the data" is certainly not the only metric to base architectural choices on. It's not even a significant metric.

It's at most a technical requirements. And one that can be fulfilled with many architectures. Like event sourced architectures even.


Modelling domain events is useful for describing the problem you're trying to solve with the domain experts, and it should probably be left in the documentation when planning a solution.

For actually implementing a system that provides an audit trail of long-lived state machines, you're probably better off using something like Temporal.io/durable functions, which use event sourcing internally for persistence and have a programming model that forces you to think about deduplication/idempotency by adding different constraints for the code that orchestrates the functionality (workflows) vs. the code that actually interacts with the real world (activities).


Durable functions suffer from lack of Observability tho.

I’d love to hear suggestions on overcoming this issue.


temporal.io just released a .NET SDK. The observability and scalability of the platform are really good.

Disclaimer: I'm one of the founders of the project.


Nice to see you dropping in Maxim!

For the GP poster - I agree with Maxim here. We've been evaluating workflow orchestration and durable function systems for a while and finally whittled down to where we think we're going to pull the trigger on either Azure Durable Functions or Temporal. Temporal is really nice - the fact that you are "just writing code" is such a huge bonus over some other stuff like AWS Step Functions, Cadence, and Conductor.

As an aside, the engineering/sales engineering team over there seems top notch.


What I meant specifically is that the current state of a workflow is stored in a format that’s opaque to any component other than the workflow itself.

E.g. if I have a "shopping cart checkout" workflow and the user is not making progress, how can I tell which step of the workflow the user is stuck at?


Every step of the workflow is durably recorded. So you have the full information about the exact state of each workflow. To troubleshoot, you can even download the event history and replay workflow in a debugger as many times as needed.

The ease of troubleshooting is one of the frequently cited benefits of the approach.

Check the UI screenshot at https://www.temporal.io/how-it-works.


The function's event data and current state is all stored in table storage, so you could query that - I'd expect you'd need to query an event-store-based solution in a similar way?


The concept sounds interesting, but the article doesn't do a great job of explaining how it works. How do I efficiently reconstruct the current state from the event stream? How would the event stream be modelled in the database?



Two ways to do it.

1. Use a database designed for this stuff: Google BigQuery, Amazon Redshift, ClickHouse, etc. All current data is essentially a type of aggregation; in other words, it's equivalent to a group-by query on an event database.

It makes sense, right? With events I can technically rebuild the current state or the past state of the data through some aggregation query (see the sketch at the end of this comment).

2. Rename your relational storage and call it a caching layer that lives next to the event system. It's functionally the same thing but won't trigger any red flags in people who are obsessed with making everything event driven.

The architecture he describes exists. It's just massively complicated, so services that utilize it usually do very targeted things. Think Google Analytics, Datadog, Splunk, etc.
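For example (assuming an append-only event table with invented column names), both the current state and any past state are just the same aggregation with different cutoffs:

  -- current balances
  SELECT account_id, sum(amount) AS balance
  FROM account_events
  GROUP BY account_id;

  -- balances as they were at the end of last year
  SELECT account_id, sum(amount) AS balance
  FROM account_events
  WHERE occurred_at <= '2023-12-31 23:59:59+00'
  GROUP BY account_id;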


> How do I efficiently reconstruct the current state from the event stream?

There isn't one 'current state'. That thinking comes from centralising everything in one DB.

You create different states in different systems according to different requirements. If you're building a shopping system, with Purchases and Customers, one service could read events and produce a relational table for finance purposes. Another service could read events and produce a key-value store of customer data. A third service could power an OpenSearch service for searching over products.

> How would the event stream be modelled in the database?

It's a list. If you're using something fit-for-purpose like Kafka, then it's multiple lists (topics, partitions, etc.).


It would make more sense to use this for certain streams that change a lot and where the data is interesting enough to see what happened along the line. But that could be solved within the relational model.


It's top-down vs bottom-up, or custom vs generic.

Top-down vs bottom-up:

Top-down: starting from the business domain, and then mapping an implementation onto available technologies, tools, and vendors.

Bottom-up: starting from the available technologies, tools, and vendors, and thinking how to bolt up a working solution out of them.

Custom vs generic:

Custom: DDD, CQRS/ES, Sagas, TBUI (Task-based/driven UI), GraphQL, Algebraic Data Types, etc.

Generic: RDBMS, CRUD, REST, ACID transactions, CDC, generic admin UIs, nocode/lowcode, limited/generic types, etc.


Yeah, ehh I'm just going to stick to good old fashioned relational data.


Good, do it until you can’t. Don’t use a hammer on a screw.


I'm on board with event-based architectures, but this article struggles to get its point across.

I would focus on the difference between data relations and business behaviors. Once you start thinking in terms of behaviors and business activities, the move away from operational relational data stores becomes much more obvious.


On an abstract level, events can be modelled as relations.


Event sourcing has a lot of nice properties, so I’m intrigued. But don’t you still need relations? And then how do you implement those?

If the answer is “they’re all implicit in the application layer code” then that’s not really acceptable. I still need some way to query for relations, or keep relation views up to date, or something like that.

I don’t mind if relations are not core to your persistence model, but they have to be implemented _somewhere_ in your data layer, and I’m not seeing any mention of that here.

I have the same issue with Firestore, everyone does relations _somehow_ but it’s all just spaghetti application code which isn’t scalable.


In event sourced systems, you project the event stream into read models, of which there can be many (relational, time series, etc.). If you're familiar with functional programming, it is essentially a fold operation over the stream of events into a single state.

Having worked with event sourced systems in the past, there are benefits in having a persisted explicit event history, but there is much added complexity (how do those read models actually get generated? how do you version the model? do you have snapshots of your read models?). In my experience, the additional complexity was not worth it for most contexts in which the pattern was applied...


No, what you need is a command queue; a command event is not a domain event.


I was not aware of event driven design until recently, but coincidentally concluded something like it, after considering the optimal data structure in an AI powered world.

While it's clear that event driven design might have been worth the trade-off before (assuming you were able to manage the complexities and actually made use of the data), being able to query an AI with knowledge of every event that happened to your business will make it ubiquitous over the coming years.


I made a PHP demo for an idea: an event-based, observer-based modelling system, e.g. for the Game of Life: https://github.com/ulrischa/OCell


All the comments are negative, but the post has 64 upvotes at this time - why? I've seen this so often on HN, but I really don't understand it.


Certain topics get a lot of interest (positive and negative) based on the title alone.

Event-driven isn't quite peak hype any more, but it still gets a lot of instinctive love from a certain group of people, and a lot of instinctive hate from another. So you get a whole bunch of upvotes (but they don't have anything substantial to say about it), then a whole bunch of negative reactions in the comments based on the title alone. And then in this case, you get a whole bunch of negative reactions from people who tried to read the article and couldn't get past the weird tone.


I don't think it's just the tone. At the start of the article, it implies relational and CRUD approaches should and will be replaced by the Event Sourcing approach, but he doesn't support that, or even provide a good sense of the pros and cons. In one of his comments (comment section) he mentions that the article was supposed to be a "how" not a "why", and links to a "why" article, but that "why" article also doesn't do a good job of explaining why, with pros and cons.

In summary: not a good presentation of the pros and cons which allows a person to clearly identify under what conditions this approach might be a reasonable tool.

Additional note: A typical competing model, IME, is a relational model that writes business events at the same time state updates occur.


It's definitely a bad article. I was more commenting on why there isn't much substantial conversation here, even though there are lots of upvotes. Very few people even bothered to read it because the tone was so bizarre—I couldn't get past it, but it's not surprising to me that the content is low quality too.


Hate the article enough to leave a negative comment, want to see the fallout of it and have it not drop off the front page, so stick an upvote on it as well, would be my guess. I have commented on this but haven't upvoted, as it is clearly a bonkers article.


Because you can't downvote submissions, you can only go in and write a negative comment. Makes complete sense to me to see the effects of that with this article.


Yes, I'm not denying the comments; it's the upvotes that I don't understand.


There are probably some people who find the article interesting and well written or who upvote based solely on the headline without reading it.


Sounds like junior work


IT is doomed


This article is not good. Event Sourcing and the Relational Model are orthogonal.

SQL:2011 added a lot of temporal features. [1]

Datomic is based on Datalog, which, even though it's not relational, is kind of the same thing, and has temporal support. [2]

[1] https://en.wikipedia.org/wiki/SQL:2011

[2] https://vvvvalvalval.github.io/posts/2018-11-12-datomic-even...

(BTW 24 points at the top of HN and no comments? Hmm)


Datalog is definitely relational. More so than SQL.

In terms of temporal data handling & relational & Datalog, it's worth looking at Differential Datalog: https://github.com/vmware/differential-datalog


Datalog is relational in the original sense of relational algebra.


Sure. But the point of the article is about document or object-oriented stores. And the specific points would apply to Datalog, too.

My point was you can do the equivalent of Event Sourcing with SQL and even better with Datomic.

> In Event Sourcing Church, we’re not doing that anymore. We’re not losing business data; we keep them. We keep them as events.


> Datomic [...] has temporal support

Note it only supports "transaction time" (or "system time" per SQL:2011) but not "valid time" (~"event time") which is needed for a bitemporal data model. [1]

[1] https://vvvvalvalval.github.io/posts/2017-07-08-Datomic-this...


You don't need specific temporal support for an event-oriented DB, nor would I want it. I usually design a vanilla relational schema around events regardless of which DBMS I'm using. E.g. instead of an "order" table with multiple states updated in-place, I'd have "order_placed" and "order_filled" where each row is an event, insert-only.
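A small sketch of what I mean (the column names are invented):

  CREATE TABLE order_placed (
      order_id    uuid        PRIMARY KEY,
      customer_id uuid        NOT NULL,
      placed_at   timestamptz NOT NULL DEFAULT now()
  );

  CREATE TABLE order_filled (
      order_id  uuid        PRIMARY KEY REFERENCES order_placed (order_id),
      filled_at timestamptz NOT NULL DEFAULT now()
  );

  -- "current state" is a query over the events, not an in-place update
  SELECT p.order_id,
         CASE WHEN f.order_id IS NULL THEN 'placed' ELSE 'filled' END AS status
  FROM order_placed p
  LEFT JOIN order_filled f USING (order_id);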


What on earth is this article trying to accomplish? The tone is bizarre, and the underlying concept sounds horrendous to work with if you truly want to replace your static data store with it. By all means add a formal event layer on top of your existing data store, but replacing it sounds like madness.

If that's not what the article is proposing, then for once I'm going to say it's not a failure of my intelligence; it's the article's fault here.


> What on earth is this article trying to accomplish?

Most articles explain building event-driven systems from a greenfield point of view. This article is for when you want to build an event-driven system but you already have brownfield relational data.


Four years ago I heard "Kafka IS your database".

I thought maybe these insane people (probably parroting some tech company enterprise penetration propaganda e.g. Confluent) would have a better story, but... no.

Anyway, yeah, sure, keep logs. But a lot of that article about commands and events is something that only exists if you had one ubiquitous language, system, and OS. You know, the almost literal "seamless" where there aren't any seams.

Sure that will probably plug into some enterprise bus and enterprise integration and enterprise ... anyway.

Competent developers will understand what events to preserve and log and possibly allow retry/repeats.

Anyone who has looked at a Splunk bill will realize that just storing all the logs everywhere is very expensive, which is another way of saying "wasteful". But any generic enterprisey event system will basically start and end with splunk-level log aggregation and kinda-analysis.


Dealt with something like this when I joined a previous job years ago: Kafka was the main data store, and microservices kept their state in memory, building it at startup by processing their whole history of events (which was ridiculously slow for some of them). Then GDPR came into place and the whole "keep everything in Kafka forever" had to face the reality of "not allowed to keep PII for longer than 30 days" :)

(Before someone suggests it: no, messages weren't encrypted, so just throwing the key away wasn't an option.)


> ... they would build at startup by processing their whole history of events (that was ridiculously slow for some of them)

There is a basic technique to solve this: you snapshot every "nth" event.


Would be expensive but you can do a copy and replace to keep the data you need, on a new topic.

If you don't need the data, then you don't need it.


> What on earth is this article trying to accomplish

Most of these articles are for the author to promote themselves


Eh. The model he describes is actually standard for analytics.

And because it's standard there are literally databases designed and optimized to do what he says. It's not madness when it already exists and is really common.

Think Redshift, Snowflake, BigQuery, ClickHouse, etc.

Additionally, there already exist user interfaces and web services that do what he says.

Datadog, Splunk, Google Analytics... anything related to logs, analytics and aggregation of those analytics. What he proposes actually already exists.

That being said I don't agree with the articles point to replace everything with this model. Usually these types of services target very specific use cases.

I think your reaction is a bit extreme here. I don't agree with his proposed model but I see where he's coming from and it's not that the model won't work... It's been proven to work from all the examples I gave above.

The problem with it is that it's just slower and much more complicated. But his proposal does increase the capabilities of your data.

You can increase speed by having a pre-caching layer for your aggregations. Basically, what was originally your static store is now a caching layer where the developer or user pre-specifies an aggregation that the system should count live as the events come in, as well as throwing the events into the event db. If, when querying for that aggregation, you get a "cache miss", then it hits the event layer and has to do the aggregation job live.

So essentially, if you build it like this, you have all the capabilities and speed of your original static data store, but now you have the ability to re-aggregate events differently, so you have MORE ways to deal with your data. It can work and it will have more features; it's just really, really, really complicated to make an entire system centered around events. Additionally, there's also a boatload of extra data to deal with, which is another engineering problem.

That's why when people do build these systems it's usually centered around some business requirement that absolutely needs this ability to dynamically query and aggregate events. Logs and analytics being the two big ones. Or some service to data scientists as well.

The theory behind it is attractive. All static data can be represented as a series of events. In fact, static data is simply the result of a certain aggregation query on an event database. It's attractive to use smaller primitives in programming and build higher level abstractions through composition, so this style of event driven services seems more fundamental and proper. But of course, like I said, there are practical issues with it when you look past the theory, such that this model is usually only applied to the specific use cases I mentioned above.

So there is a failure here. Not of your intelligence. Failure of your experience.

And as a side note, I agree with you on the tone of the article. He's trying to be witty but he's trying too hard.


Relational and event-driven aren't exclusive concepts, that's the problem with the article. Also, it'd help to have a real example of the solution it proposes, since we all know the "old" way it describes is in Postgres/MySQL/whatever.


Where's the hitchhikers guide to moving back to relational data after our whiz bang dev made us event driven and left for a new opportunity?


Next to the 97 part video tutorial on how to get back from your semi-hydrated micro-frontend to something sane for your 12 person company ;)


The problem is that everyone's interpretation and implementation of micro-frontends is different.


So long, and thanks for all the eventual consistency issues


Why not get rid of git too while you're at it?

Just store the current state of the code base. You can't have merge conflicts without conflicting commits.


Actually event driven stuff is highly consistent. Event insertion is fast as hell and all of it is timestamped and tagged with a unique Id so there's no consistency issues.

The problem is availability. You have to aggregate those events to get anything meaningful about it and aggregation can be really really slow.


Exactly. Because the aggregation is so slow, you cache the aggregates, but then the caching layer becomes slow to invalidate, and you get eventual consistency problems.


I wouldn't call that a consistency issue. That's just lag. The aggregation is valid for a specific time point. Caches aren't the source of truth. The source of truth remains consistent here.


The event store and the cache are both part of the same system, and it's this whole system that is "eventually consistent".

Say I create widgets on screen 1 and they are persisted with event sourcing into Postgres. I see the list of created widgets on screen 2, loaded from a materialised view in Postgres (or Elasticsearch). The "lag" between it being created on screen 1 and appearing on screen 2 is the "eventual consistency" issue I'm referring to here, whereas I think you're referring to the consistency only of the persistence on screen 1.

I'm sure we both agree there's no getting away from CAP theorem. Event sourcing accepts less consistency, and every part of the system needs to deal with that.


>The event store and the cache are both part of the same system, and it's this whole system that is "eventually consistent".

Then every single system on the face of the earth, at a high enough level, has a consistency issue. Just go to the level of the full client and backend system using web browsers. If you don't refresh the browser, you will of course eventually have a "consistency" issue.

Usually when they refer to this stuff it's referring to the source of truth: The database. When you shard the database into two synchronizing copies, you increase the availability with a second copy but that leads to the potential for both copies to be inconsistent.


It's hardly a cache if it's the only way to query the data; it's a materialized view.


Then it's an outdated materialized view. If you don't refresh your view from a browser, is it a consistency issue? No. Not in the way the term is usually used.


You're using terminology in non-standard ways.

Read the first paragraph on eventual consistency on wikipedia, it describes the agreed upon definition.

"Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value"


False. And wrong.

Read: https://en.m.wikipedia.org/wiki/Consistency_(database_system...

We are talking about the word consistency. Which is most often used in the context of databases. Eventual consistency is here:

https://en.m.wikipedia.org/wiki/Eventual_consistency

It refers to the same concept, and technically and colloquially it's referring to databases or things similar to databases. If it didn't, then every system on the face of the earth is eventually consistent because of browsers and caches and timing. Every browser eventually presents an outdated view if it's not refreshed. If that's the case, what is the point of the word? The word is obviously used for categorization.

Thus the scope is usually and colloquially a database system, or some combination of services that represents a source of truth.

If you read further into the article you cited you will encounter this:

"In order to ensure replica convergence, a system must reconcile differences between multiple copies of distributed data. This consists of two part..."

Literally the article assumes we are talking about distributed systems where replicas can exist.

Is a cache a replica? Is a browser a replica? No. The scope is obviously the source of truth at the resolution where you can have replicas aka: two or more sources of truth.


Exactly. These kinds of devs (like OP) cannot keep quiet and need to constantly introduce something "cool" so that they can get a good salary raise. The moment they cannot introduce more BS, they move on to another company. The poor other "average" devs need to maintain all the crap.


Do it like an accountant would:

Throw away all invoices and receipts, and just represent each customer's balance as a single number. When it changes, get your crayon, cross it out, and write the new number.


He tried at the beginning to say that only when the DB gets too big should you even think about this, but there are things you can do there as well.


There is no “too big” in databases and in particular size of data is not the criterion for deciding on using something like event sourcing. It’s a really niche paradigm that is only ever going to be useful in quite unusual circumstances. Most of the time people don’t need it, and most of the time when people do it, they find their immutable event source is very inconvenient for lots of the normal things you want to do with your data, so they end up doing something like CQRS[1] (ie having a database as well). This is one of those Martin Fowler[2] type things that looks good on a whiteboard but most people would be better off avoiding most of the time.

[1] https://en.wikipedia.org/wiki/Command_Query_Responsibility_S...

[2] This Martin Fowler, https://en.wikipedia.org/wiki/Martin_Fowler_(software_engine... not this Martin Fowler https://metro.co.uk/2023/11/06/eastenders-star-reveals-why-m...


The way I see it, either your business domain requires querying over a large amount of data, or it doesn't.

If an application allows someone to be able to enter let's say an order number from anywhere in the world from the last 10 years and be able to find the order, there is no magic - some server out there is going to have to scan a huge amount of data to find a match.

Tricks such as indexes, partitioned tables, etc can be employed, but those tricks have nothing to do with event-sourcing and are independent of it.


> Tricks such as indexes, partitioned tables, etc can be employed, but those tricks have nothing to do with event-sourcing and are independent of it.

You might want to use different tricks in different situations. Different situations means different services, and different tricks means different storage/query technologies.

So how do you get your data into three systems - and more crucially - keep them in sync? Webhooks? Triggers? Some bidirectional sync magic app that claims to beat CAP?

Just use event-sourcing (append-only, disallow modification) and the multiple systems will stay in sync as long as they know how to process one more message.


Agreed with your points but this article seems to present event-sourcing as a replacement for your database(s) and even makes claims about saving storage space, thus at least hinting at not using databases anymore.


It's a replacement for your source-of-truth, not your database(s). Although you're right about the article not explicitly mentioning slurping the events back into a DB. I suspect the reason is that there are plenty of articles which explain how to event-source from greenfield, but this is the first one I've seen which focuses on existing brownfield relational data - see the title.

> makes claims about saving storage space

I don't think that was the right reading about saving storage space:

> We’ve been trying to optimise the storage size; we’ve made some sins of overriding and losing our precious business data

I think it's his strawman RDBMS developer who optimised for saving storage space, and lost business data as a result. The suggested approach is:

> We can optimise for information quality instead of its size.


This reminds me of https://youtu.be/b2F-DItXtZs


I was looking forward to reading this based solely on the title, but I find the writing style and tone to be quite unbearable. The forced attempt at being relatable and light-hearted comes across as patronizing and distracts from the intended message or points being conveyed.


I don't take offense to the tone, it's just too much text and too little substance.


Didn't read the article, but I'm in the process of eradicating Event Sourcing from a codebase and returning to the classical ACID database model. The boneheaded decisions made by our predecessors are staggering, and choosing to use Event Sourcing for everything is the dumbest of them all.


I worked on a project where we decided that event sourcing was the way to go for a variety of legitimate business needs. We then implemented it with ACID transactions—an application-level framework writes the event to the Postgres database and in the same transaction updates all the computed views. At the scale we were working, this was totally fine performance-wise.

Most people who have had bad experiences with event sourcing were actually having bad experiences with eventual consistency. All that event sourcing means is that you treat the events as the source of truth and everything else as computed from those events (and you could theoretically recompute it all at any time). Eventual consistency is an implementation detail and not a necessary one: you can implement event sourcing in a single Excel file if need be.
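In sketch form, with made-up table names, the write path looked something like this; the key point is that the event and its projections commit or roll back together, so there's no eventual consistency anywhere:

  BEGIN;

  INSERT INTO cart_events (aggregate_id, sequence, type, data)
  VALUES ('cart-42', 7, 'item_added', '{"sku": "A1", "qty": 2}');

  -- update the computed view in the same transaction; readers never see
  -- the event without its projection
  UPDATE cart_totals
  SET item_count = item_count + 2
  WHERE cart_id = 'cart-42';

  COMMIT;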


Ok? Not sure what you want to talk about here; you probably need to give us a bit more context, especially since you acknowledge you haven't even opened up the article to talk about the submission itself...

Sometimes the situation in which something gets created and designed looks very different from the current situation you're in N years later. So what might have looked like a boneheaded decision could have been the best decision at that point.

But we engineers like to lament our predecessors' code; I'm guilty of this sometimes too. But I try to remember that I don't have the full context of how things were when the code was initially written.


> But I try to remember that I don't have the full context of how things were when the code was initially written.

Ironically, that's what event-sourcing is for.

If the facts were kept, then a better decision can be made today.


Ironically, you missed the point of my comment :) I'm talking about things outside of code and certainly not about storing data about domain-specific things.


What are the major deficiencies of ES in that particular codebase? If you could drop some specifics in bullets that would be really helpful for me. I promise not to ambush you with apologetics.


This system was developed from 2008-2013, a very different time when hardware was "cheap" and the Cloud was not yet a thing.

Event Sourcing dictates that Events are never deleted which means that the data volume keeps growing and growing. There is - in this system - no way to delete old events. When I brought this up, the response was "Just add another hard drive". In the modern Cloud era, adding a hard drive is extremely expensive.

The system uses CQRS, and all events generate reports that are stored in Elasticsearch. Data is never deleted; only an extra event gets added /saying/ it's deleted. The data is still cached in Elasticsearch. All of it, all the data back to 2013. Adding an extra 32GB of RAM just to keep up with it is ludicrously expensive.

We're in Europe. Guess what a system like this does to GDPR. Can you tell me which events I need to delete when somebody says they want to be forgotten? Yeah.

I can't delete old data. It's impossible without collapsing the entire system like a house of cards.

Finally, and this is the pièce de résistance, the developers decided to develop a relational database structure ON TOP OF EVENT SOURCING. We're talking primary keys, foreign keys, cascading and non-cascading deletes. Importing an Excel sheet of 10000 rows takes TWO WEEKS because it generates hundreds of events per Excel cell that is being read. We brought it down to 10 minutes, and there is plenty of room for improvement; it's just that we have other priorities right now. Currently, simple flat "tables" with no foreign keys take several seconds to import (just like in a regular RDBMS).

Oh yeah note that this is a system that is used by 4-5 users at a time, not hundreds or thousands of users.


Just beware, if you're using Postgres or MySQL, it's not fully ACID (specifically "I") unless you run xacts in serializable mode.


The architecture from hell. Hard to debug. Why didn't my service pick up that event? Error handling. What happens with the state if a service throws an exception? Resource hog. How do I map/reduce all these events into a state?

I do like events. They go into my elk stack where I can look at pretty graphs which gives me a story of how my system behaves over time.


> Why didn't my service pick up that event?

What event? If you're not event-sourcing, you can't even ask that question. Instead, some user interacted with the system and got a 500. Maybe you got a stack trace in the logs, but the user's data is gone (or maybe half of it was stored).

Persistent events mean you get to fix the problem and try again - no data lost.

> What happens with the state if a service throws an exception?

Whatever you programmed it to do. Same as any other service.

> How do I map/reduce all these events into a state?

There's no one way to store them like there is in an RDBMS. UserService listens to User events. TransactionService listens to Transaction events. SearchService listens to both. All three have different schemas. Those schemas can be thrown away and redesigned at any time without data loss.

Do a bad job of map/reducing them into a state today. Do it better tomorrow.


> Once you distinguish all events you’re fine with and want to migrate your relational data, don’t try to cheat; don’t put your events as small and granular. Relational data is flattened; if you try to retrofit what happened from the final state, you will likely fail or not be precise at best.

.. what?


I think I follow.

The article is aimed at people who already have relational data, and want to build an event-driven system (whose events will eventually end up as relational data again downstream.)

Your system A might look like:

  |    Name | Balance |
  | Michael |   $3.03 |
You might design your system B to have the events AmountCredited{name, amount} and AmountDebited{name, amount}.

You don't know how system A ended up at its current state. That's what's meant by "flattened". When you want to "migrate the relational data", i.e. convert system A's relations into events, it's tempting to use the obvious AmountCredited{"Michael", $3.03} because you know it will result in Michael having the correct balance in the final system.

But it's not good to reuse AmountCredited, _because no such event actually happened_, which is why it could be called "cheating". If future-you looks at historical data, $3.03 won't correspond to any real transaction that happened. Instead, you should make a special event like AmountImported{name, amount}.
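Sketched as SQL (every name here is invented), the migration then becomes an explicit, one-off import event per legacy row rather than a fabricated credit:

  INSERT INTO events (type, data)
  SELECT 'AmountImported',
         jsonb_build_object('name', name, 'amount', balance)
  FROM legacy_accounts;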


For this example the convention in accounting is to use 'balance brought forward'.

It's a real transaction that happened, and everyone knows what it means, i.e. the previous ledger with this account was closed off and the new one has the balance that was there when the book was closed.

Using 'imported' describes what you did, but not what the intention was.


I’ve never before seen a post hit #1 here after 30 minutes with zero comments. Is that normal?


To give the benefit of the doubt, maybe it was an intriguing title?

I didn't personally think that the style of the writing or the content itself was particularly good, but event sourcing has an almost narcotic appeal.

ES (nosql did too) has this wild ability to slip past scepticism. It feels right, I think because it milks some dopamine by feeling both simpler in a rewarding abstract sense and sufficiently (for Devs) complex in a day to day operations sense.


And at 1 hour it's still at the top in spite of all the negative comments. I'm assuming it got flagged by enough of us and yet it prevails. Odd.


Someone got around the upvote ring detection? I know there are lots of folks that claim to know how to get around it, but agreed it seems sketchy


I've got nothing smart to say about databases on a Saturday


Funny, I had the same thought


Lots of negative comments, so I thought I should counter, as I quite liked it. The article style isn't great, but I get the impression it was only ever meant to be a toe dip into one aspect of thinking about events.

I’ve worked at a couple of places that have had a lot of success with EDA, DDD and microservices. I’m sure these patterns are considered by many as buzzwords but they’ve been around for a long, long time now and they can be very effective. There’s no such thing as a silver bullet but the problems these ideas look to solve I’ve experienced very often, so I don’t buy into it being a solution in search of a problem.


The question gets asked every time someone wants to use event sourcing: why do you want to redo the stuff that databases already did? "Ensure your design works if scale changes by 10X or 20X, not 100x".


Because "the database is inconsistent and incorrect" is usually the answer.




