A Data Scientist Walks into a Bar

Published

February 20, 2021

A dispirited data scientist walks into a moderately populated bar, makes his way to the counter, and plops himself onto a vacant seat. It’s Wednesday afternoon. The bartender knows what’s coming and has already prepared a double shot of vodka for her new weekly patron.

“Another tough day in the tower, huh?” the bartender says. “Is this gonna be a recurring pattern? I mean, not just you and the double shots.” She slides the drink across to him.

“See, here’s the thing,” Henri begins. “Instead of solving proofs and developing their theorems, those damn mathematicians are on my ass again with that same stupid question! They keep asking me in their pretentious tone, ‘So Hen, what do you really do as a data scientist?’ as if I were a waste of taxpayer dollars. It is exponentially aggravating.”

The bartender smirks. “Exponentially huh? That’s a new one.”

“Yes! That’s precisely how it feels.” Henri slams the double shot back. “Another one please. And don’t even get me started about the computer scientists and statisticians this week.”

“Kid, you really gotta watch yourself with the double shots. Your remaining brain cells will thank you later. Besides, what’s the big deal? Aren’t they all just arbitrary titles?”

“Not in my world.”

The bartender rolls her eyes. “You’ll have to elaborate then. If my philosophy degree has taught me anything, it’s that none of this matters.”

“Oh quit being so cynical Mal.”

“Well we wouldn’t be having this discussion if you didn’t at least partly believe me. It’s all just smoke and mirrors, means of further division veiled behind the oh-so-noble guise of academic progress, when in reality things are more connected than the specs would like us to believe. Hell, doesn’t it all just boil down to math? Science I mean: generating supposedly novel hypotheses, testing said hypotheses, and publishing results in ivory-encased journals – if you’re lucky – all of which must be reduced to fancy models and statistics in order to garner any form of western validity, aka mathematics.”

“Oh come on, you sound just like them now.” Henri begins to trace the rim of his empty drinking glass, then looks up. “Okay think about neuroscience. You’ve got computational neuroscience, developmental neuroscience, cognitive neuroscience, and now the up-and-coming hotshots of network neuroscience. Sure, there’s some overlap, and sure they all involve mathematics, statistics, and computer science. But the focus of each is on different problems, different open questions within the umbrella of neuroscience. The discipline guides the questions.”

Mal meets Henri’s pensive gaze. “Tell me then: what’s the focus of data science? Which reminds me, I read an article in the Harvard Business Review the other day that said being a data scientist is the sexiest job of the 21st century. You should do a study on that! ‘Cultural perception of scientists in the 21st century’. I bet that’ll get published.”

“Yeah, Obama had a chief data scientist during his presidency too. But that’s beside the point. The emergence of data science is partly a response to the deluge of data we’re now faced with thanks to the advent of personal computers and the internet. God, I used to be worried about running out of storage on my 5 gigabyte iPod – which was so much memory not too long ago! – and now we’re drowning in yotta- and zettabytes.”

“Imagine that. We’re gonna need more letters in the alphabet.” Mal grins. “Now that’d be an innovation.”

“Precisely. And I don’t think it’s slowing down anytime soon. The data comes in so many forms, from tweets and cute cat pictures to puppy emojis and YouTube videos. It wasn’t feasible to analyze any of this prior to our current computing capabilities. The scale is jarring, frankly, and it also requires one to think about the cleaning, management, and storage of all this data. The holy trifecta of mathematics, statistics, and computer science doesn’t focus on this specific combination of problems.”

“So let me get this straight. Data science is about leveraging modern computers to manage, process, and analyze this growing tsunami of data, all of which comes in forms not considered prior to the genesis of the information society?”

“Yeah, I’d say that’s a good summary.”

Mal laughs. “Sounds like applied statistics and computer science, and math.”

“Okay okay, but that’s a bit decontextualized. You have to frame science in terms of the problems the field is concerned with. And data science is specific to, well, data, virtually always of the digital kind. There’s data visualization too, which has always been important in communicating science, but has taken more of a center-stage position alongside all the other aspects of data science thanks to the increasing number of tools available and more computational power.”

“What about the ethics of all this?”

“That’s another important dimension of data science, although I can’t say we’ve been terribly good at putting it into practice. Just isn’t in the job description or the culture.”

“Great, just what humanity ordered.”

“Hey that’s where your philosophy background would come in handy.”

“Well, when institutions and companies stop suggesting and start backing their words with jobs and paychecks I’ll consider it. Until then I’m left to commiserate with data scientists and their identity crises in the bright limelight of Shakeys.”

Henri perks up. “There are others?”

“Yep.”

“I’m not alone!”

Mal pours Henri another drink. “Computational statistician.”

“What?”

“You’re a computational statistician, full stop.”

Henri considers it. “Data scientist sounds better.”