Lying abstractions
tl;dr
- Functions are abstraction boundaries. Like modules and interfaces, functions create boundaries that hide implementation details.
- When a function does more (or something else) than the name suggests, it is lying to you and your peers.
- Being decieved by the function name, you will write code that causes bugs.
A programmer's main job
Every successful system inevitably grows in size and features. Our job is to release new features, while keeping the system stable and bug-free.
This is a dichotomy - the more state we need to "keep in our head" - the worse we will perform. We are in a constant battle to keep the system in such a shape that we are able to update and extend it. And so:
A programmer's main job is to fight complexity
Functions are abstractions
The common way to see functions: "they are a way to re-use code", completely misses their most important aspect!
Functions are abstractions, in the same way as modules or interfaces. They create a boundary that hides implementation details. This keeps the complexity down, since you can reason about code at a higher level - without needing to care about the details.
// If we don't care about the details
// This is easier to reason about...
WaterFlower("flower2", "10 ml");
// ... than this
var waterCan = WaterCanPool.GetFirstAvailableWaterCan();
if (waterCan is null) {
throw new NoAvailableWaterCanException();
}
waterCan.FillUp();
boolean hasWatered = false;
for(var flower of flowers) {
if (flower.name.Trim() == "flower2") {
flower.ApplyWater("10 ml");
break;
}
}
if (!hasWatered) {
throw new FlowerNotFoundException("flower2");
}
When you call a function, you should always be able to understand what it does without looking at its implementation. But this only works if the abstraction is honest.
Lying abstractions
Consider this example:
function getUserMarketingCampaign(userid) {
// implementation
}
What does this function do? Clearly it gets something - it retrieves the user's marketing campaign. Simple, right?
But look at the actual implementation:
function getUserMarketingCampaign(userid) {
const campaign = database.query(sql`SELECT * FROM campaigns WHERE user = ${userid}`);
if (!campaign) {
// The lie: it also creates!
return database.insert(sql`INSERT INTO campaigns(user_id) VALUES(${userid})`);
}
return campaign;
}
The function doesn't just get the requested campaign. If the campaign doesn't exist, a new campaign is created. The abstraction is lying to you!
Picture the following scenario:
- A developer is implementing a feature which will only work for users having a marketing campaign. They use the already ubiquitous function
getUserMarketingCampaign() - You are doing code review, looking at a diff - and make the reasonable assumption that
const campaign = getUserMarketingCampaign(123);will do what it says and try to fetch a campaign. - The code is deployed, and now ineglible users are having marketing campagins created at an alarming rate.
- The business gets sued by EU GDPR lawyers and goes bankrupt.
All this because of a function name!
Good naming
We need to use good and precise names for our functions. In this case:
function getOrCreateUserMarketingCampaign(userid) {
// ...
}
Now the name accurately describes what the function does. Yes, it's a bit wordier. I don't care.
Someone might say: "That's too long! Function names should be concise!" But this misses the point entirely. The abstraction boundary has to be clear.
Functions as a contract
Based on a function's in-parameters, we expect it to return specific values or perform specific tasks. This can be seen as a technical "contract", and is often validated in unit tests. But:
The function's name is a contract between you and future readers of the code (human, AI, you).
Therefore the name must describe clearly what the function does, especially including unexpected conditionals or side effects.
Make your abstractions honest. Don't hide surprises behind innocent-sounding names. If a function gets or creates, say so. If it validates and throws, say so. Heck, even if it gets and caches, you can say so.
Yes, this might lead to longer function names. That's okay. Clarity beats brevity every time.
Examples
Some quite harmless examples where I would consider different options for the naming:
// Lying: sound like it will always happen
function killUnit(u) {
if (!u.isImmortal()) {
u.isDead = true;
}
}
// Honest
function tryKillUnit(u) { /* ... */ }
// Lying: sounds like it just checks
function validateUser(user) {
if (!user.email) {
throw new Error("Email required"); // Surprise: it throws!
}
}
// Better: "assert" indicates throwing
function assertUserValid(user) { /* ... */ }
// Lying: sounds like it fetches a value every time
public DataSet fetchPayments(String id) {
var cachedPayment = _paymentCache.get(id);
if (cachedPayment != null)
return cachedPayment;
return _paymentRepository.get(id);
}
// Better
public DataSet getCachedOrFetchPayments { /* ... */ }
If the type system clearly defines a function as being able to fail, I would not worry too much about the naming. For instance:
// It is clear from the return type
// that it can fail
fn calculate_tax() -> Result<u64, Box<dyn Error>> { /* ... */ }
Likewise, it can be clear from the type-system that a function can modify a value:
fn main() {
let values = vec![4,3,2,1];
// "read_first" probably removes the first element
// since the function requires a mutable vector
let first = read_first(&mut values);
println!("{}", first);
println!("{:?}", values);
}
// Though arguably "take_first" would be a more appropriate name in any case :)