Good point. However, approaching this problem from a “YAGNI” point of view is a bit misleading, I think. If you are not going to need the timestamp, you shouldn’t add it to your code base.
In my opinion, hastiness is the culprit. When a property appears to be binary, we jump to a boolean way too quickly. We should instead stop and ask ourselves whether we are really dealing with a situation that can be reduced to a single bit. The point raised by the article is a good example: you may want to record the state change as a timestamp. Moreover, in a lot of cases, the answer is not even binary. The values for is_published may be “Yes”, “No” or “I don’t know” (and then we will be too quick to assign null to “I don’t know”). The underlying problem is that we don’t spend enough time modeling our problems. And this is a sure way of accumulating technical debt.
I think this timestamp-as-a-boolean is a good idea if the field is always going to be interpreted as either True or False and nothing more. If the field in question allows for a 3rd (uncertain) value, then using a timestamp would be extremely confusing.
And it all depends on the problem at hand. Any of those solutions can be acceptable as long as you have a well thought out model.
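For the strictly two-state case, a minimal Python sketch of the timestamp-as-a-boolean idea might look like the following. The Post class and its method names are hypothetical, made up for illustration; the only thing taken from the thread is that a missing timestamp is read as False.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Post:
    # Hypothetical model: None means "not published",
    # any datetime means "published at that moment".
    published_at: Optional[datetime] = None

    @property
    def is_published(self) -> bool:
        # The boolean is derived from the timestamp, never stored separately.
        return self.published_at is not None

    def publish(self, now: Optional[datetime] = None) -> None:
        # Assumption: publishing twice keeps the first publication time.
        if self.published_at is None:
            self.published_at = now or datetime.now(timezone.utc)

    def unpublish(self) -> None:
        # Setting the timestamp back to None loses the old publication time.
        self.published_at = None
```

Note that there is no room here for an “I don’t know” state: None already means “No”.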
Ehhh, I don’t quite agree with this. I’ve done the same thing where I used a timestamp field to replace a boolean. However, they are technically not the same thing. In databases, boolean fields can be nullable so you actually have 3-valued boolean logic: true, false, and null. You can technically only replace a non-nullable boolean field with a timestamp column because you are treating null in the timestamp as false.
Two examples:
- A table of generated documents for employees to sign. There’s a field where they need to agree to something, but it’s optional. You want to differentiate between employees who agreed, employees who disagreed, and employees who have yet to agree. You can’t change the column from is_agreed to agreed_at.
- Adding a boolean column to an existing table. These columns need to either default to a value (which is fair) or be nullable.
Using a nullable Boolean to represent 3 distinct states just adds confusion and complexity to your system. In most cases I would prefer to use a non-nullable enum with 3 values.
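As a rough sketch of the enum approach, applied to the document-signing example from the parent comment: the names Agreement, SignedDocument and count_by_state are invented for illustration, and the default of PENDING is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class Agreement(Enum):
    # Three explicit states instead of True / False / NULL.
    PENDING = "pending"
    AGREED = "agreed"
    DISAGREED = "disagreed"


@dataclass
class SignedDocument:
    employee_id: int
    # Assumption: a freshly generated document starts out as PENDING.
    agreement: Agreement = Agreement.PENDING


def count_by_state(docs: Iterable[SignedDocument]) -> Dict[Agreement, int]:
    # Every state gets a bucket, so "no answer yet" is counted like any other value.
    counts = {state: 0 for state in Agreement}
    for doc in docs:
        counts[doc.agreement] += 1
    return counts
```

Each state has a name, so nobody has to remember what NULL was supposed to mean.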
Completely agree, I cram a timestamp column in every table, but booleans have their purpose too.
Yeah, this feels like “premature optimization”. When you design your applications and databases, it should reflect your understanding of the problem and how you solved it as best as possible. Using DATETIMEOFFSET NULL when you actually mean BIT NOT NULL isn’t saying what you mean. If you already understand that you have a boolean option and you think you might need a timestamp to track it, use 2 columns. Or an audit table. So sayeth the holy SRP.
A CRDT Boolean is also pretty easy to write. Essentially it is just a last-write-wins element set with a single possible value and instead of representing if it is added or removed it represents true or false. To get the current state you take the latest timestamp. To merge two values you update the true and false timestamps to the latest of each.
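The description above can be sketched in Python roughly as follows. LWWBool and its field names are hypothetical; timestamps are plain integers here, and the tie-break (equal timestamps read as False, which also makes a fresh value False) is an assumption the comment does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWBool:
    # Latest timestamp at which the value was set to True / False.
    true_at: int = 0
    false_at: int = 0

    def set(self, value: bool, timestamp: int) -> "LWWBool":
        # Only the timestamp for the written value moves, and never backwards.
        if value:
            return LWWBool(max(self.true_at, timestamp), self.false_at)
        return LWWBool(self.true_at, max(self.false_at, timestamp))

    @property
    def value(self) -> bool:
        # Current state is whichever side was written last.
        # Assumption: a tie reads as False.
        return self.true_at > self.false_at

    def merge(self, other: "LWWBool") -> "LWWBool":
        # Take the latest of each timestamp; this is commutative,
        # associative and idempotent, so replicas converge.
        return LWWBool(
            max(self.true_at, other.true_at),
            max(self.false_at, other.false_at),
        )
```

Two replicas that saw different writes end up in the same state no matter which one merges into the other.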
How do you update it to unpublish? Add another timestamp column and whichever is latest wins, or just set published_at to null?
You wouldn’t store this information on the same table (unless you’re using a wide row db like dynamo/Cassandra). In a SQL world, you’d store version information in a separate table - one table for the HEAD state and another for history.
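A small sketch of that split, using Python’s built-in sqlite3 module: the table names post and post_history, their columns, and the set_published helper are all made up for illustration, not taken from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- HEAD state: one row per post, holding only the current value.
CREATE TABLE post (
    id INTEGER PRIMARY KEY,
    is_published INTEGER NOT NULL DEFAULT 0 CHECK (is_published IN (0, 1))
);
-- History: one row per state change, with the time it happened.
CREATE TABLE post_history (
    post_id INTEGER NOT NULL REFERENCES post(id),
    is_published INTEGER NOT NULL,
    changed_at TEXT NOT NULL
);
""")


def set_published(conn, post_id, published, changed_at):
    # Update HEAD and append to history in a single transaction.
    with conn:
        conn.execute(
            "UPDATE post SET is_published = ? WHERE id = ?",
            (int(published), post_id),
        )
        conn.execute(
            "INSERT INTO post_history (post_id, is_published, changed_at) "
            "VALUES (?, ?, ?)",
            (post_id, int(published), changed_at),
        )


conn.execute("INSERT INTO post (id) VALUES (1)")
set_published(conn, 1, True, "2024-01-01T00:00:00Z")
set_published(conn, 1, False, "2024-02-01T00:00:00Z")
```

Unpublishing is then just another row in the history table, and the HEAD table stays a plain non-nullable boolean.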
So, the history table has every column, but the user table has only user id and version, right?
user_history table user table