Baldur Bjarnason

My Mother's Advice on How to be a Great YouTuber

You can also watch it on YouTube.

Below is the essay version of this video. It is not a transcript as it differs in the parts where a raw transcript would just be outright nonsense or clumsy as hell.

Getting off to a weird start #

How would you even start a video?

How do you open the first real video in a series?

This particular video took too many takes before I arrived at something, so I just went with this one.

It all made sense when I was planning this.

Where people get their information #

It’s the norm today for people to rely on video as their primary source of information.

It doesn’t matter whether it’s TikTok, YouTube, or some other form of video. This is the medium of our age.

Even people who do read for pleasure or for information quite often discover which books they want to read through video.

So, if I want to promote my work – and I have things to promote – it makes sense to be doing video in some form.

I’ve written books. I’d like you to know about those books in due course.

I do consulting, I’d like you to know about the consulting in due course.

I have a newsletter, I’d love for you to subscribe.

But if I’m going to reach people on video, I need to actually do video, and not procrastinate on it forever.

That’s where the tricky part came up. This is not something I’m used to doing.

So I did what I always do. I did research. I figured out how to do the typical talking-head well-lit YouTube video at a minimal expense.

I used one of the better microphones I own – you know, the one that sounds like radio. I had the entire script written out in detail in advance. I had an interesting backdrop, and I was posed standing, because standing is best for your voice.

And it was just awful. It did not work at all. It just was awkward and stilted, and I froze several times.

I seemed to find it difficult to figure out what to say, even when the text was right there in front of me.

I did what any person would do under these circumstances.

The logical next path.

I called my mother and asked her for her advice.

My mother’s advice #

Now, this is not quite as unusual as you’d expect, although mothers often do have good advice. But my mother in particular used to work in TV and radio before she retired, and she worked specifically as a TV reporter for the Icelandic National Broadcasting Service. Her experience is quite extensive.

And she also has, like most of us in my family, the personality quirk, where we do our info-dumps and thoughtlessly offer unfiltered and unvarnished criticism of whatever work is being done, which is fine.

She did that when I was growing up and read my school essays, which is awesome.

Because actionable advice is always fantastic.

She came up with a list of solid pointers.

What does the audience actually want? #

I need to think about what the audience wants, not just what I want.

I had the desire to look like a professional, to look like somebody who was a pro YouTuber who had this polished look to everything he did, and that was not based on anything that the audience is likely to want.

The only reason to listen to a video by, you know, me, is for my expertise – for me to convey what’s in my head over to you so you can build on it or benefit from it.

The audience is only going to be here for my expertise, for the things that I know how to do well, that I know how to explain well.

Her first piece of advice is that I need to drop all of the professional aesthetics and just focus on a video production setup where I can be myself.

Turns out to be a bit trickier than expected, because there’s this minor issue of crippling anxiety about being on camera and being recorded.

But it’s just something I need to cope with, hence the fact that this is something like the fifth take.

The first step is to figure out ways that let me be more relaxed in front of the camera, and the first way to be more relaxed in front of the camera, which I’m very bad at following, is to speak more slowly.

I am extremely bad at this because I automatically start to ramp up the speed of my speech as soon as I get onto something that I find interesting, because that’s the fun part!

But that is less relaxed and harder for the audience to keep up with.

More important than speaking slowly is figuring out ways for me to be natural.

As a reporter, my mother’s beat was science and healthcare, which required a lot of interviews of people with expertise, and finding ways of getting people with expertise to convey that knowledge in a natural manner.

That can be tricky, because academics tend to lock up and stiffen up as soon as you take them out of their environment where they’re most natural, like being in the lecture hall or their offices, or an archaeologist at the dig.

You shoot the interview where they feel natural, where they’re most used to thinking about their work and their expertise. In my case, that would be my home office where I’m surrounded by work-related books, my typewriters and pens, and the photos I use to liven the place up. And, most importantly, a small statue of Gaston Lagaffe, the patron saint of absent-minded klutzes at work.

This is where I feel the most comfortable. It feels so much more normal to spout my nonsense in this environment than where I’d set up before.

Another issue she pointed out is that as soon as you start to stress about pauses and errors in your speech, things can go haywire very quickly. You basically go off the rails.

She has a theory about that.

Errors and mistakes in speech are natural #

One of the consistent results in research on retention – that is, memory and comprehension – is that audio books, on average, tend to score lower than printed books.

Ebooks have historically been all over the place. It depends on the device, it depends on the toolkit, the machinery that surrounds the ebook, and the sample sizes in these studies tend to be so small that it’s next to impossible to get reliable results.

But in terms of comparing printed books with audio books, audio books tend to score lower, although there’s also some variability in both.

My mother’s favourite theory on the topic is that people tend to have an easier time comprehending and retaining audio information that feels like a natural dialogue, that feels like it’s something – it might not be a back and forth between the listener and the speaker, but it feels like that. It taps into the skill sets and the experience we have at conversation.

That would mean that some voice actors, some narrators, can do this just purely through skill, through the way they perform. Even when they’re just reading the text off the page, they make it feel like they’re talking to you, like this is a personal conversation between you and them.

That, in this theory, could improve retention. But from the perspective of a video like the one I’m doing, the pauses, the hesitation, the markers of natural language and natural conversation might make this thing more accessible than the paced, stilted reading off the screen.

I mean, it’s a theory, it doesn’t feel that implausible. I guess we’ll find out.

The shallow aesthetics of professionalism #

There’s also an aesthetic to professionalism. Not just in terms of the visuals, but also in terms of acoustics and voice style.

There’s a pro YouTube look. There’s a pro podcast sound. None of these represent actual professionalism. They aren’t what makes a podcast professional. They aren’t what makes a video professional. They are the trappings of professionalism – the acoustics and aesthetics of professionalism.

They’re a look that can be replicated without an understanding of the underlying professionalism that drove the decisions in the first place. Professionalism is practice, and the specific look that comes with a professional video is a consequence of an expert applying their practice to deliver that video.

If you replicate that result without the expertise, without the practice, you just end up with the aesthetics, even when it’s inappropriate, even when the look of it actually makes the end result less credible.

You can see this happening in, for example, documentaries, where everything is shot like it’s a corporate training video, and it immediately has a kind of artificiality – an implied falsehood that you don’t get when you cover the same topic using a more run-and-gun dynamic and, let’s face it, uglier aesthetic.

But that’s the professional product. The professional understands the context, and understands the role that the end result is supposed to have – the job it’s doing for the audience. The same applies here.

I don’t think a more professional aesthetic would make this video more credible. It wouldn’t make my video more credible if it was lit – pro YouTube style – with a softbox and shot with a 6K open gate camera and colour-graded to match the YouTube default. It might even make it feel a little bit fake to a lot of people, understandably, because what does a photographer know about strategy or management or machine learning or software development or world development?

The video might benefit from better lighting or a better camera, but they would need to be applied in such a way as to not compromise the credibility or authenticity of the work.

The aesthetics of the video can betray a different expertise than what you’d expect and that makes it less credible.

The “AI” bugbear #

Authenticity and credibility matter as they connect with one of the biggest issues of any media production today: AI.

Anything that is surface-level replicable, like a specific aesthetic, a specific style, a specific look, or a specific sound, can be replicated using machine learning techniques.

These techniques use fairly detailed statistical modelling of collections of similar material to produce new material that matches a common style or form. That means that when something is more broadly represented in the data set that they’re modelling, like video, that aesthetic becomes easier to replicate.

So the actual YouTube look, the standard softbox fill light, backlight for the hair, or for the bald head, the common colour-grading, that look with the standard sound and the podcast voice, all of that can be automatically replicated, at least in short segments, especially if they’re intercut – video essay-style – with clips and images from other sources.

If you don’t need to convey expertise, if you’re just conveying the clichés and the stereotypes of the day, and want to do so in a way that’s represented using the clichés and stereotypes of video of the day, then you’re competing directly with AI. That’s what it does extremely well.

The simplistic look in this video is, it turns out, genuinely harder for AI to replicate. I move my head a little bit too much, which is a problem when I use proper microphones because it makes my voice go in and out. That’s why I’ve got the lavalier microphone set up.

I use my hands too much. AI models have problems with fingers for a reason: they don’t actually model the anatomy of their subjects. They model the 2D plane that’s represented on the screen. So they don’t recognise the physical reality of what they’re replicating.

Recognising that the lo-fi effort – the imperfect and personal – is sometimes actually stronger, more convincing, and harder to automatically replicate in the era of AI is the first step to doing a project like this.

If you’re going to do video, don’t do whatever it is that AI can do easily.

It’s the same thing with writing: if you’re going to be writing today, avoid the twee LinkedIn-style professional and shallow writing style that large language models are so great at.

Avoid the mundane, the typical, the traditional structure. Be a little bit chaotic, but be yourself.

The humanity of it all #

People will forgive chaos and a little bit of disorder if there’s a person behind it.

No matter what you think about TikTok, new social media, the generation today, or whatever age-related bugbear you might have – whether it’s against older people or younger people – you need to bear in mind that everybody’s doing this to connect with other people.

That’s literally what TikTok is for. That’s literally what social media is for, even YouTube. People are doing social media because they want a human connection. Even when they follow celebrities, that’s because they want to connect with them personally.

Nobody is doing any of this to follow a robot.

It doesn’t matter how polished the robot is, it doesn’t matter how polished the videos are that the robot makes, it’ll never be a person, and that’s what the audience wants. They want people.

This is my plan: I’m going to do the whole personal thing and be a little bit… error-prone. I’m going to be trying a few different approaches.

The next video is not gonna be as meta as this one. I might touch on the process, because it’s interesting when you’re starting.

But this is the plan.

A note on my accent #

My accent is a bit of a mess, and that’s because none of it is natural. I was born in Iceland, partially raised in Lancashire, in England, and my original English accent as a child was very Lancaster, and quite indecipherable.

When I moved back home to Iceland as a slightly older kid, I tried to get rid of the accent. So, I figured out how to speak English with an Icelandic accent when I want to, because that can be a useful talent. (There was a period I didn’t mention in the video where I spoke with an American accent, but that didn’t last long.)

But then when I moved to Bristol, 25 years ago, the Icelandic accent came to be a little embarrassing, because one of my friends said it sounded like I was Dutch.

“No, can’t have that.”

I began to consciously (and unconsciously) pick up on the accents of the people around me. The problem there is that this was at a university with a mixed bag of people from all over the UK, so my accent is all over the place.

You might have the occasional word that’s weirdly North English, and then you’ve got Bristolian, you’ve got South English, and an older mixture of things, and then the occasional Icelandic pronunciation slip-up.

If you find the chaotic accent off-putting, then I’m sorry, there’s nothing I can do about it.

It’s just the result of literally more than 45 years of messy language.