In this episode, John, Martin, and Gary discuss the future of SEO. They talk about the changes they’ve seen in the past decade and anticipate what’s next for SEO. Find out if HTML knowledge will be necessary in the future, if SEO will still be relevant, and more!
Episode transcript → https://goo.gle/sotr026-transcript
Search Off the Record is a podcast series that takes you behind the scenes of Google Search with the Search Relations team.
Search Off the Record – 26th episode
[00:00:01] ♪ [music] ♪
[00:00:10] John Mueller: [00:00:10] Welcome, everyone, to the next episode of the Search Off the Record podcast. Our plan is to talk a bit about what’s happening at Google Search, how things work behind the scenes, and who knows, maybe have some fun along the way.
[00:00:24] My name is John Mueller. I’m a Search Advocate on the Search Relations Team here at Google in Switzerland. I’m joined today by Martin and Gary, both also on the Search Relations Team, and today, we’ll be talking about the future of SEO.
[00:00:41] ♪ [music] ♪
[00:00:46] Gary Illyes: [00:00:46] Talking about the future, what could possibly go wrong, John?
[00:00:49] John Mueller: [00:00:49] The future! I don’t know. I mean, looking back and looking forward a little bit. One of the big changes I’ve been seeing over, I don’t know, maybe the last ten or so years is how more and more sites are moving to hosted platforms where, basically, you don’t have to run your own server anymore or where it’s almost better that you don’t run your own server anymore because you don’t have to deal with all of the technical infrastructure that’s associated with everything around… like a server.
[00:01:22] And to me that seems like, I don’t know, it seems like a reasonable move because you’re kind of off-loading a lot of the technical details, and it gives you more room to focus on what actually matters, like the content that you’re creating. So I guess, in a sense, if we think about SEOs, it means SEOs won’t need to learn HTML anymore, right?
[00:01:45] Martin Splitt: [00:01:45] Oh no, no, no. Abort mission, abandon ship. No, John, no.
[00:01:51] John Mueller: [00:01:51] Well, I mean it’s like if you just have a rich editor and you just type things in and then you format your text properly and you add some links. What do you need to do with HTML?
[00:02:03] Gary Illyes: [00:02:03] I mean, SEO is more than just doing the text part, right? It’s not just about writing the content.
[00:02:10] Basically, if you hire a copywriter, that could do the content for you. But SEO is also about link tags and meta tags and title elements and all those weird things in the head section of the HTML that you can put there.
[00:02:25] So you kind of want to know about them to control how your snippets look like or how your titles show up in search results and the rel canonical tag to control what will be the– or what should be the canonical version of a URL. You kind of want to know that.
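As a rough illustration of the head elements mentioned here, the following minimal Python sketch reads the title, the meta description, and the rel canonical link out of a page. The sample page and the `HeadSignals` class are made up for illustration, and this is not how any search engine actually parses pages.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the title, meta description and rel=canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


# Hypothetical page, used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Blue Widget | Example Shop</title>
<meta name="description" content="A sturdy blue widget.">
<link rel="canonical" href="https://example.com/widgets/blue">
</head><body><p>Hello</p></body></html>"""

signals = HeadSignals()
signals.feed(SAMPLE_PAGE)
print(signals.title)
print(signals.description)
print(signals.canonical)
```

A check like this can be handy for confirming that a CMS template really emits the tags you think it does.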
[00:02:44] John Mueller: [00:02:44] But couldn’t you just have that in your CMS? It’s like you just have like a big field for text and then you have some extra fields for metadata.
[00:02:54] Gary Illyes: [00:02:54] But if you just started a website, then do you want to learn, for example, what is rel canonical? Could you explain to me, as an absolute noob, what is a canonical URL and what is its relation to rel canonical?
[00:03:12] It’s very hard to simplify it. Or hreflang, for example. Try to explain hreflang to someone who just came to the Internet and they’ve never even thought about localization before.
[00:03:23] Martin Splitt: [00:03:23] I mean, explaining hreflang to someone who has been on the Internet for a decade is hard too.
[00:03:30] Gary Illyes: [00:03:30] I mean, there are people who understand it and can do a good job.
[00:03:33] Martin Splitt: [00:03:33] That’s true. Also, having worked with CMSs in the past, I’ve seen that the ones that are pretty much the more successful ones allow you to always insert custom HTML, because an editor that needs to be as simple as possible for everyone to use it and to create content should also give you the opportunity to say: “And now, I want to jump out of this simple mode and actually go and do something more advanced.”
[00:03:58] And that usually contains HTML, or means that you need to write HTML. So if you don’t know any HTML, you’ll be very, very quickly out of your depth there. So I think that’s a risky direction to take.
[00:04:11] Gary Illyes: [00:04:11] Also, when we are launching new meta tags for example, or directives, robots directives, for example, then we kind of rely on SEOs to make use of them instead of site owners use them, and we kind of tailor the documentation towards SEOs versus site owners.
[00:04:31] Like, for site owners. If you look on .dev site, in our search documentation, then there, the site-owner tailored content is very simplistic and for good reason. They need to get the basic information that can get their site to do OK in search, but then if you want to take the next step, then you either learn more about search and how you can do stuff on your website that makes your site do better in search or you hire an SEO.
[00:05:03] And then we have the SEO documentation which is much more in-depth, I guess, and explains things differently with the assumption that the person who’s reading it already knows stuff about search and how things work and how things connect to it. You probably don’t want to throw the site owner into that water, just when they started with their site.
[00:05:27] John Mueller: [00:05:27] OK, so I guess taking a step back, you’re also saying that SEOs should know HTML from the beginning, even now.
[00:05:35] Martin Splitt: [00:05:35] Yeah, I would say so. I think it’s one of the fundamental technologies that make the web what it is. And if you want to work with it, you should know at least a little bit.
[00:06:01] John Mueller: [00:06:01] So HTML is not going to go away?
[00:06:36] John Mueller: [00:06:36] OK.
[00:06:37] Martin Splitt: [00:06:37] So John, do you think SEOs don’t need to know HTML?
[00:06:41] John Mueller: [00:06:41] Oh, I don’t know. I mean, it’s a question that comes up every now and then, and it’s… I think a part of the reason behind that question is also that there’s just so many different things that SEOs do, and some of them focus purely on content or focus purely on building relationships with other websites. And for some of those activities, you don’t need to know HTML, but there are also lots of technical on-page things that you do have to use.
[00:07:14] I mean it’s… I don’t know how that will evolve, if those directions will kind of merge more or separate out more. But yeah, I mean it’s always interesting to see HTML is not going to go away, so you might as well get used to it. And if, as an SEO, you don’t know anything about HTML, then maybe it’s time to actually try things out.
[00:07:41] ♪ [music] ♪
[00:07:44] John Mueller: [00:07:44] Another thing that I think is happening more and more and that we’ll see more of in the future is kind of the migration from native apps more and more into web apps which we’ve seen with some of the PWA things that are happening, but also with more and more…
[00:08:05] I think the user is kind of expecting to be able to use any app that they have in any platform, any device that they use. And it feels like that kind of work is going to continue as well. And probably, that means
[00:08:29] But it probably also means that a lot of these apps suddenly have to think about SEO in general. Like what do they actually want to have findable on the web, because in the past, they were just apps.
[00:08:42] Gary Illyes: [00:08:42] Yeah, that’s actually a big topic and a very interesting one, and one that I think is… has been in the past underrepresented or under investigated or under researched.
[00:08:57] In the future, I think a lot of our applications will just happen to run in the browser and you can already see that like you have so many APIs, and opportunities. You can have a video chat in the browser. No one has to necessarily install a client to– or like a desktop app or mobile app to do a video chat. Video chats have been quite popular in the last couple of years and I think that has shown that applications can shift to the web even if they are a little more intricate. But then the question becomes, how do you represent that to someone who searches the web? And what kind of content do you want to highlight?
[00:09:34] And we had this challenge at a company that I worked for in the past where they would create interactive 3D models of real estate spaces like apartments or offices and you could furnish them and walk around them. You could basically do like a virtual viewing in the browser of a different space. You could even do that in AR and try out different furniture in your own home and you could– and all of that in the browser.
[00:10:01] And then the question became: “OK, so we have this amazing application that happens to run in the browser. But if a search engine looks at it, and because it is all visual, it is a black box for a bot, for a computer. As far as Googlebot or any other computer looking at the website would go, they would see a title, a meta description and then a canvas, which is effectively a big rectangle of pixels that they have no idea what they represent or what they mean.” And then the question became: “How do we… how do we get that into a search engine?”
[00:10:34] So for instance we made a virtual model of Don Draper’s apartment in Mad Men. We also made the Simpsons family home, a 3D model that you could visit in the browser, which is really, really cool. But if you search for the Simpsons house or Don Draper’s apartment, you wouldn’t really find our 3D model, because as far as Google Search and other search engines are concerned, our page is really, really low on content and not very relevant to the query that you enter.
[00:11:02] So how do you do that? And then there’s obviously a bunch of strategies, but I think that will be a topic in the future where SEOs need to identify together with the people making the app and using the app on what content do we want to expose? How do we expose it so that it is useful and understandable to search engines? And that will also be something that search engines will work on and we can already see that 3D models for products or like white sharks or tiger or whatever. There are 3D models for this so that you can get a spatial feeling for how these things look and interact with their environment.
[00:12:10] John Mueller: [00:12:10] So it sounds a lot like you have to combine the SEO strategies that you have for existing sites, suddenly, with completely different site models where–
[00:12:22] Gary Illyes: [00:12:22] Yeah.
[00:12:23] John Mueller: [00:12:23] maybe their developers have purely focused on the technical aspects, like: “How does this 3D model actually work?” And at some point, you combine it with the marketing side of SEO and kind of like: “How do I package information in here so that text-based search engines can actually make use of that?”
[00:12:42] Gary Illyes: [00:12:42] Yeah. Yeah. And also how can users navigate this application? Because if you give me, let’s say, an empty paint application, I don’t necessarily know if that is the application I need to do what I need to do. So then, how do you package this content and how do you interlink functionality with content? And yeah, that’s going to be interesting to see.
[00:13:03] John Mueller: [00:13:03] Cool. OK. So that’s like one area where I guess SEOs will be required, almost. That sounds pretty good. What kind of things do you think will stay the same? You mentioned HTML, both of you, the future will be based on HTML on the web.
[00:13:24] Gary Illyes: [00:13:24] I think so.
[00:14:15] Gary Illyes: [00:14:15] Fortunately, URLs cannot go away.
[00:14:17] John Mueller: [00:14:17] What do you mean?
[00:14:18] Gary Illyes: [00:14:18] At least not in the foreseeable future, because the URLs they are the standard way to communicate addresses on the Internet. And without that the Internet is just not the Internet. The same way domain names cannot go away because of how the Internet is built or IP addresses cannot go away because of how the Internet is built. The same way URLs cannot go away.
[00:14:42] John Mueller: [00:14:42] OK.
[00:14:43] Gary Illyes: [00:14:43] If you think about it, how hard was it to introduce IPv6 to the Internet? It took many, many years to introduce it. Not to replace IPv4, but to introduce a new format for IP addressing. Changing URLs, that would be even crazier than changing IPs or IP formats.
[00:15:04] John Mueller: [00:15:04] OK.
[00:15:05] Martin Splitt: [00:15:05] I mean, we did change IP formats when we changed from IP version 4 to IP version 6, but IPs will stay around. And so, I think, well, URLs, they might look different, but they’ll stay around, I think.
[00:15:16] John Mueller: [00:15:16] So by look different, do you mean, instead of path and filenames, everything will be parameters and it’ll be like a machine learning hash instead of words?
[00:15:30] Martin Splitt: [00:15:30] Maybe, or maybe we will decentralize the web somehow and everything will be identified by the hash of the content. And then you might get it from different sources, who knows. But I think URLs, in terms of addressing contents on the network, will stick around.
[00:15:50] John Mueller: [00:15:50] OK, so I guess like the URL-based mechanisms will also stick around. Like at least, looking forward, maybe five or ten years, it feels like a long time for the web, but at the same time, it’s like looking back ten years, like what has changed on the web? Not much. It’s like different ads, more cat videos, which is kind of sad, but the other URL-based mechanisms, I guess would also be similar.
[00:16:18] You mentioned the rel canonical. That seems like something that will stick around or do you think something will come along and be able to replace the rel canonical?
[00:16:30] Gary Illyes: [00:16:30] I mean, there’s no need to replace it. Usually these changes are prompted by a need, and unless there is a need for changing rel canonical, something that’s extremely widely used, why would we want to try to change it? When we know that something is broken and we need to come up with something else, then we might change it or try to find new solutions for the same problem. But if it’s not broken, we generally don’t want to touch it.
[00:16:59] John Mueller: [00:16:59] But if we can… I mean, the rel canonical is kind of a mechanism to let search engines know that two pieces of content are the same and you should pick this one. It feels like at some point in the future, we will be able to look at pages and say things like: “Oh, it’s like pretty much the same. We’ll just pick one of these.”
[00:17:19] Gary Illyes: [00:17:19] I mean, technically, we could do that already. We appreciate the help that we get from rel canonical, and we use it quite aggressively, in fact, in canonical selection, but it would work without it. If we removed that criterion from canonical selection, it would work. But then people would have less control over their desired canonical URL, and that’s not something that we want.
[00:17:46] We do want to give people control over what we show in search results, what the canonical version of the URLs are and rel canonical is just a good match for that and that’s why we also standardized it. There’s an internet draft for that, or I think it’s an internet draft.
[00:18:02] John Mueller: [00:18:02] OK, so I guess I think that’s also a really interesting aspect, because on the one hand there’s all of the machine learning work that’s being done to kind of automatically understand things better. But the control aspect is something that machine learning can’t really replace there because that’s like… that’s my personal preference kind of thing and less understanding of a piece of content.
[00:18:31] Gary Illyes: [00:18:31] Yeah, what about meta tags in general? I mean, we have talked about hreflang and we have talked about canonicals, but do you think we will need more meta tags in the future, or will the need for meta tags go away?
[00:18:45] Martin Splitt: [00:18:45] I hope that we are not introducing more meta tags. And usually, when you see internal threads about, like, this search team wants to introduce a new meta tag. Then usually both John and I jump on that thread and we are pushing back quite aggressively because there’s very rarely a good reason to introduce a new meta tag. Usually, there is already something that might be used for that, like for example, someone wants to let people control whether the content can be translated. And then they want to introduce a new meta tag.
[00:19:18] It’s like, well, is your translate service a robot? Well, technically yes. Then just use the robots meta tag and then just introduce a new directive there. Don’t introduce a completely new meta tag, because then people just have to pile or to learn actually a new meta tag, which is not necessarily a good thing. I think we don’t want more meta tags, and I hope that we are not going more meta tags. But then teams have weird ideas and unfortunately, John and I are not always there to fight back.
[00:19:50] John Mueller: [00:19:50] I mean, it’s also a matter of control that you mentioned. Where sometimes, site owners have very strong preferences one way or the other and it’s useful to listen to them because we kind of want to work together. But it’s always a bit tricky. So I guess robots.txt falls into the similar category of on the one hand, it’s a URL, so it’ll probably be the same. And on the other hand it’s about site owners’ preferences and controls. So probably, that will remain as well.
[00:20:22] Gary Illyes: [00:20:22] One correction there. It’s for URIs.
[00:20:25] John Mueller: [00:20:25] URIs. Oh, what’s the difference?
[00:20:28] Gary Illyes: [00:20:28] I mean, URL is a form of URI.
[00:20:31] John Mueller: [00:20:31] OK.
[00:20:31] Martin Splitt: [00:20:31] Could you give us an example of a URI versus a URL?
[00:20:36] Gary Illyes: [00:20:36] App indexing, deep links, for example, that’s a URI and not a…
[00:20:43] Martin Splitt: [00:20:43] not a URL.
[00:20:44] Gary Illyes: [00:20:44] … URL.
[00:20:45] Martin Splitt: [00:20:45] Yeah, OK.
[00:20:46] John Mueller: [00:20:46] And you do that in robots.txt too?
[00:20:48] Gary Illyes: [00:20:48] So one of the things that we tried to do with robots.txt when we started the process of standardizing it was to expand the language to accept URIs versus URLs, because the original de facto standard was describing the protocol to be used with URLs. We had to change some wording, I don’t recall exactly how, plus the ABNF language to make it work on URIs, because if we already have a protocol that can do a very good job controlling crawling on the Internet, then why would we want to introduce yet another one in case a new form of URI shows up on the Internet and invent a new control mechanism for crawling those? And that’s why we expanded the language to accept URIs to be used with robots.txt versus just URLs.
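To make the basic matching concrete, here is a small sketch that uses the robots.txt parser from Python’s standard library. The rules, the `ExampleBot` user agent, and the example.com URLs are invented for illustration, and this parser is not Google’s implementation, so edge cases may be handled differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
# parse() takes the file content as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

blocked = parser.can_fetch("ExampleBot", "https://example.com/private/report.html")
allowed = parser.can_fetch("ExampleBot", "https://example.com/blog/post.html")
print(blocked, allowed)
```

The rules are matched against the path part of the address by prefix, which is why anything under `/private/` is blocked here while the rest of the site stays crawlable.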
[00:21:43] John Mueller: [00:21:43] Oh, cool. OK. I totally didn’t know about that. That’s pretty cool. So basically, if you, as an SEO, work to understand the foundation of robots.txt and how the matching goes there, then if some new form of URI pops up and becomes popular in the future, then they can keep building on that. OK, that’s cool. OK. What about things like structured data? You mark up a product, you put a price…
[00:22:16] Martin Splitt: [00:22:16] Oh-h-h!
[00:22:17] John Mueller: [00:22:17] Do search engines really need to have structured data? Can’t you just look at a product page and recognize it’s a product page? Come on!
[00:22:27] Gary Illyes: [00:22:27] I just want to say that I have very strong opinions about structured data.
[00:22:30] Martin Splitt: [00:22:30] I expect structured data, in terms of the data that you present, to be superfluous if you do things right. But I think of structured data as a way to opt in to certain features of Search and other products that we offer, so that you basically say: I add structured data specifically so that I don’t accidentally end up in certain products or services, but I specifically say this is a website that contains product information. So, you know, if there is someone out there who specifically looks for that, they might pick it up and then use it in some sort of user experience or some sort of app or service or whatever. And I like it for that. I like it as a kind of like implicit agreement to provide this information in a more structured form. But I don’t think many people think about it like that.
[00:23:30] John Mueller: [00:23:30] OK, so it’s almost like a control mechanism,
[00:23:34] Martin Splitt: [00:23:34] Kind of, yeah.
[00:23:35] John Mueller: [00:23:35] where you kind of say: “Well, I’m OK with Google understanding this is a product page.”
[00:23:39] Martin Splitt: [00:23:39] Yeah.
[00:23:40] John Mueller: [00:23:40] Google probably understands what a product page is anyway. At least looking into the future where machine learning is everywhere.
[00:23:49] Martin Splitt: [00:23:49] Yeah. I’m pretty sure we can understand: “Oh, this is a product, and the product’s name is this and the product’s price is that and this is a product image.” But it is kind of nice to have this explicit machine-readable information where you can say: “Oh, so they specifically want us to think of it as a product.” It’s basically a glorified meta tag that says is product page and then the value of that meta tag would probably be true or something like that.
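As a sketch of what spelling out that explicit, machine-readable product information might look like, the snippet below builds a schema.org Product description as JSON-LD. The helper name `product_json_ld` and the sample values are assumptions for illustration, and which properties a given search feature actually requires is documented separately.

```python
import json


def product_json_ld(name, image_url, price, currency, in_stock):
    """Return a script element describing a product with schema.org vocabulary."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "image": image_url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": (
                "https://schema.org/InStock"
                if in_stock
                else "https://schema.org/OutOfStock"
            ),
        },
    }
    return (
        '<script type="application/ld+json">'
        + json.dumps(data, indent=2)
        + "</script>"
    )


# Hypothetical product, used only for illustration.
markup = product_json_ld(
    "Blue Widget", "https://example.com/img/blue.jpg", 19.5, "USD", True
)
print(markup)
```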
[00:24:19] John Mueller: [00:24:19] OK, but that almost sounds like the rest of structured data is going to be optional at some point in the future. Not like next week in the future, but in 10 years or so. Who knows what machine learning progress will have happened and we should be able to look at a page and say: “Oh, these are 12 attributes of this page, and if someone searches for an attribute we should be able to match that.”
[00:24:51] Martin Splitt: [00:24:51] Generally, yes, but there’s many ways of doing things on the web, and even with machine learning, there might be creative ways where a machine does not necessarily pick up the information correctly. So having it spelled out quite literally is probably helpful nonetheless, even in the near future.
[00:25:07] John Mueller: [00:25:07] Yeah.
[00:25:08] Martin Splitt: [00:25:08] I don’t know, maybe not.
[00:25:09] John Mueller: [00:25:09] I don’t know. Gary, what are your really strong opinions or…
[00:25:13] Gary Illyes: [00:25:13] These are more like internally strong opinions because we have a very strong team and leadership that focuses a lot on structured data, and there’s massive use of structured data in indexing and also in understanding entities. And my opinion is that yes, structured data is amazing for these kind of things and to power features, but we certainly can get to a point where we don’t need it anymore and we do have the granular controls that enable people to opt out of different presentations.
[00:25:49] For example, you could add a span tag, well, span element, and mark with data-nosnippet the part of the text that you don’t want to see in a rich result. Like, for example, if you don’t want your price to show up in a rich result, then you could just opt that out, and you don’t even have to wait for the price to end up in a rich result. You could do that proactively as well, as many people do it already with, for example, their news articles where they actually opt out complete paragraphs or even complete news articles from the page or from search results.
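A toy sketch of that opt-out idea: the code below collects the text of a page fragment while skipping anything inside an element that carries the data-nosnippet attribute. The `SnippetText` class and the sample markup are made up for illustration, and this only mimics the concept, not how a search engine actually builds snippets.

```python
from html.parser import HTMLParser

# Elements without a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SnippetText(HTMLParser):
    """Collects text, leaving out content nested inside data-nosnippet elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a data-nosnippet element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth += 1
        elif any(name == "data-nosnippet" for name, _ in attrs):
            self._skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def snippet_candidate_text(html):
    parser = SnippetText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed parts.
    return " ".join("".join(parser.parts).split())


# Hypothetical markup, used only for illustration.
SAMPLE = "<p>Blue widget. <span data-nosnippet>Price: $10.</span> Ships worldwide.</p>"
print(snippet_candidate_text(SAMPLE))
```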
[00:26:25] So I don’t see a reason why people couldn’t do that with rich snippets, especially because we are getting better at understanding these, for example, product pages. We are getting there where we are really good at figuring out that this is a product page and this is the image of the product and this is the price. This is the stock, whether they have inventory of that particular item. I think it’s a matter of time until we start using that. It’s not an if anymore, it’s more like a when. And if we jump ahead 10 years or 15 years where our computers are actually way better than they are now, then that will just enable us to do more of this and also better.
[00:27:13] And when I say we, I mean search engines in general, not just Google because we definitely see other search engines do amazing things with language understanding and machine learning in general. So it is coming. The question is when it’s going to land.
[00:27:29] John Mueller: [00:27:29] OK.
[00:27:31] Martin Splitt: [00:27:31] And I guess for, at least for the foreseeable future, it’ll still be there as something where if you have strong opinions about what your page should be and what the attributes should be, then you can specify it there. And at some point, it’ll be almost like, I don’t know, like a rel canonical where it’s like if you don’t care, then you don’t have to do it, and we’ll try to figure it out. But if you do care, then you can tell us what your real product name is and we won’t have to try to identify it on a page.
[00:28:03] Gary Illyes: [00:28:03] I mean, it could also work as an override. I imagine that there would be a transition period where we move from structured data to machine learning data, and then in the transition period, it could be also used as an override. Like for example, you provide something on the page, we misunderstood it, you notice that we misunderstood it, then you could provide structured data to correct what we show in search results if you want that piece of data to be shown.
[00:28:28] John Mueller: [00:28:28] OK.
[00:28:29] Gary Illyes: [00:28:29] Maybe that would work.
[00:28:31] John Mueller: [00:28:31] Cool. Yeah. So I guess SEOs still have to think about structured data, and at the very least they’ll have to think about what they actually put on their pages and to be clear with regards to the actual page’s content. So it’s almost like good content will continue to be important for SEO!
[00:28:51] Gary Illyes: [00:28:51] Well, that’s a shocker.
[00:28:52] John Mueller: [00:28:52] No shocker. OK.
[00:28:54] ♪ [music] ♪
[00:28:57] John Mueller: [00:28:57] Getting things into search engines, do you think that will change? Like the crawling part of the web, that feels so antiquated. It’s like: “Oh, you find one URL and then you look to see if there are links to other pages and then you request those pages.” Couldn’t we, I don’t know, just get a dump of all HTML pages on a website and we’ll just process that at once?
[00:29:23] Martin Splitt: [00:29:23] I am so excited to see how that’s going because currently, there is this push towards a more push based approach. So right now, it’s a bit of a pull thing where search engines kind of make a decision which URL to crawl when and how often. So they go and pull information from people’s websites. I know that other search engines, namely Bing, is experimenting with a push approach where I, as the website owner, proactively tell the search engine: “Hey, there is information on this URL. Please come
[00:30:00] We are experimenting with it as well for certain use cases, I think, like livestreams and something, or live blogs or something like that and job adverts. I think we are using an index push-based approach. But looking in the future, now this is kind of nice and useful because there’s only very few people using it and very few URLs being pushed, if you compare to the size of the Internet as a whole or of the web as a whole.
[00:30:29] And thus, obviously, we can give those priority that are pushing us proactively or pinging us proactively. But if everyone does it in the future, let’s paint a picture, in three years our indexing API opens for all websites and all pages, and Bing does it as well and everyone wants to be in Bing. I think Bing is now open for everyone already, but I’m not sure how many people are actually using it, I don’t think they publish this information anywhere. But let’s assume like everyone pushes all their pages all the time.
[00:30:59] First things first, there is additional work on the side of the website creators, of the website owners, because they have to feed this information to the API somehow. So there will be specific tooling that does this for you or you have like a script that runs every hour and pushes any potentially updated pages to this API. So it’s actually more effort on your side.
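The hourly script described here could look roughly like the sketch below. It only works out which pages changed since the last run and prepares one notification per page; the payload shape, the function names, and the sample URLs are hypothetical, and no real indexing API is called.

```python
from datetime import datetime, timezone


def urls_to_push(pages, last_run):
    """Return the URLs whose last-modified time is newer than the previous run.

    `pages` maps a URL to the datetime of its most recent change.
    """
    return sorted(url for url, modified in pages.items() if modified > last_run)


def build_notifications(urls):
    # Hypothetical payload shape; a real push API defines its own format.
    return [{"url": url, "event": "updated"} for url in urls]


# Hypothetical site state, used only for illustration.
LAST_RUN = datetime(2021, 10, 1, 12, 0, tzinfo=timezone.utc)
PAGES = {
    "https://example.com/": datetime(2021, 9, 30, 8, 0, tzinfo=timezone.utc),
    "https://example.com/jobs/42": datetime(2021, 10, 1, 12, 30, tzinfo=timezone.utc),
    "https://example.com/live-blog": datetime(2021, 10, 1, 12, 45, tzinfo=timezone.utc),
}

notifications = build_notifications(urls_to_push(PAGES, LAST_RUN))
print(notifications)
```

Even in this small form, the site has to track modification times and run the job on a schedule, which is the extra effort on the owner’s side.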
[00:31:22] And then also, if everyone does it all the time, I don’t see how anyone, Google, Bing, whoever else would be able to process this with the same high priority as they do it today, and then we would be back to square one, I feel, where they have to schedule things and then just pushing it to the API doesn’t mean that it gets indexed right away or indexed at all because there will be delays, there will be scheduling, there will be dismissal of spammy or bad URLs and I wonder how that’s going to look like in a couple of years and if there are some certain solutions to this problem that I’m not seeing, but I can’t really imagine one. But I would love to be wrong on this.
[00:32:07] Gary Illyes: [00:32:07] I think one more problem with push is the amount of spam that you will ingest, and we’ve seen this with the submit URL feature that we had on google.com where, I don’t remember the exact number, but the vast majority of the submissions was spam and not low quality content or something. It was spam. It was very obvious that it was spam.
[00:32:38] And then I’m just very skeptical about exposing more push interfaces because of that reason, because of spam, because it just opens yet another door into search engines for spammers to push spam. And do we really want that or we want to just find our way to good content?
[00:33:05] Martin Splitt: [00:33:05] But is that a filtering problem or an inherent problem that can’t be solved?
[00:33:10] Gary Illyes: [00:33:10] That’s a good question. It can be tailored or filtered to some extent, but then, you end up with false positives where you filter stuff that you shouldn’t have, and then people get grumpy about it. People who are trying to spam also get grumpy about it and they create lots of noise, externally, and then it’s your and John’s and my job to keep them at bay. So, is that nice?
[00:33:37] Martin Splitt: [00:33:37] No.
[00:33:38] Gary Illyes: [00:33:38] No! What I would actually really want to see, and we are kind of working on some solutions, is to be more intelligent about crawling. Because if we are more intelligent about crawling and we are not hitting sites repeatedly for the same URL, for example, or we are more intelligent about discovery, then we are not wasting resources, either on the site’s side or on our side, and we are just doing a better job at getting content into the index. And I think that’s much nicer, and that also leaves the thing that SEOs or technical SEOs do nowadays there for them to work on.
[00:34:20] John Mueller: [00:34:20] So you’re saying links will not go away?
[00:34:23] Gary Illyes: [00:34:23] Why would you bring up links?
[00:34:25] John Mueller: [00:34:25] Well, it’s like crawling.
[00:34:26] Gary Illyes: [00:34:26] Why?
[00:34:26] John Mueller: [00:34:26] It’s like crawling through a website. You need links.
[00:34:29] Gary Illyes: [00:34:29] But why, why, why?
[00:34:32] John Mueller: [00:34:32] Oh, so you’re saying links are going away?
[00:34:35] Martin Splitt: [00:34:35] I’m staying out of this.
[00:34:36] Gary Illyes: [00:34:36] No, why are you twisting my words? Why are you bringing this up, even? Links shouldn’t go away.
[00:34:45] John Mueller: [00:34:45] OK.
[00:34:45] Gary Illyes: [00:34:45] I think we shou– Well, we got better at using them, and perhaps, we don’t need as many links as people believe to do ranking well. But I don’t think they are going away. They are the same as HTML, they are basic building blocks… Well, because they are HTML, and they just cannot go away.
[00:35:14] John Mueller: [00:35:14] OK. OK. Cool. More things not going away. Sounds like SEOs will have a future at work anyway. What about things like keyword research?
[00:35:25] Gary Illyes: [00:35:25] What’s that?
[00:35:26] John Mueller: [00:35:26] It’s like when you research specific topics that people are interested in and then SEOs try to encourage writers to write about these topics because they would drive attention and search.
[00:35:38] Gary Illyes: [00:35:38] I guess it will stay.
[00:35:39] John Mueller: [00:35:39] OK. OK. What about content in general? It’s like with all of these text generation algorithms, basically, you just tell the machine what the topic should be and it’ll create a full page for you, right? So writing will go away?
[00:35:58] Gary Illyes: [00:35:58] I think that could be a topic on its own for a future podcast episode because we can see the pros and the cons of machine-generated content, and we are quite strict about what we allow in our index. But on the flip side, you can also see very good and smart machine-generated– I don’t know if smart is a good word, but very intelligent machine-generated content.
[00:36:26] I recently saw a short article about yeast, for example, and it was generated by GPT-3 (search for it on your favorite search engine if you don’t know what it is), and it was very well written. I couldn’t tell that it was written by a machine. And then there’s the thing that if you can’t tell that it was written by a machine, then does it matter if it’s in search or not?
[00:36:54] John Mueller: [00:36:54] OK, yeah. So it’s good content. It’s OK. But what if the machine makes stuff up? It’s a topic about yeast and it tells you, you put gasoline into bread and then it generates yeast and it’s well-written English. But anyone who knows the topic is like: “This is wrong.”
[00:37:14] Gary Illyes: [00:37:14] Well, but it also depends how the content was generated or what were the sources for it, right? Like how it was taught. Yeah, I think this deserves its own podcast episode. We could debate about this a lot. Right now, our stance on machine-generated content is that if it’s without human supervision, then we don’t want it in search. If someone reviews it before putting it up for the public, then it’s fine.
[00:37:45] John Mueller: [00:37:45] Cool, it sounds like one of those areas where SEOs could evolve and try to learn more about fancy machine learning technologies and kind of build out a niche for themselves. What about images, video, audio? That seems like another one of those areas where machine learning could pick up and say: “Oh, this video is about cats. We will just rank it for cats.”
[00:38:11] Gary Illyes: [00:38:11] Wait, audio. Why would you ever produce audio?
[00:38:15] John Mueller: [00:38:15] Audio?
[00:38:15] Martin Splitt: [00:38:15] Yeah, podcasts are overrated.
[00:38:17] John Mueller: [00:38:17] It’s, it’s…
[00:38:18] Gary Illyes: [00:38:18] Oh podca– Well, oops. [all laugh]
[00:38:24] John Mueller: [00:38:24] I mean…
[00:38:25] Martin Splitt: [00:38:25] Or ASMR.
[00:38:25] John Mueller: [00:38:25] I mean, like with images, that’s one of those things where the machine learning teams, the research teams, always kind of try to show off how well they recognize the objects in an image and what they’re doing. Do you see that kind of going into SEO where suddenly, people won’t have to do alt attributes for images anymore and images will just rank perfectly?
[00:38:52] Gary Illyes: [00:38:52] I gave a presentation a couple of years ago at a conference called [Sipic], and we were showing entities that we could detect from simple images, simple images being a picture of an apple on a white background. I think the Eiffel Tower and then a person in a picture, again, white background and we could tell the general topic of the image.
[00:39:18] So we could say that a red apple or Eiffel Tower or person, but for example, for the person, we couldn’t even tell the gender. And from the picture, if you were looking at the picture as a human, it was very obvious, the perceived gender. But the machine just couldn’t actually say anything about the perceived gender of the person in the picture, and I think that’s still true.
[00:39:43] In general, we can tell the basic topics or topic of the picture or what’s in the picture, but we can get quite confused. If you put up a grapefruit and an orange and the perspective is off or confusing, then we might not be able to tell the two apart. So yeah, I think for now, we are going to rely on alt attributes and the surrounding text, quite a bit. I can see that eventually we get there where we can detect more concepts and more accurately in images and then we can use that for ranking purposes. But I don’t think that we are there yet. But I definitely think that we are going to get there eventually.
[00:40:26] John Mueller: [00:40:26] Yeah, I think the aspect that I always think about is, well, is that usually, when it comes to images, it’s not that people are looking for images, but they’re looking for kind of like what’s represented by the image. Where if you’re looking for luggage, then you might use image search to kind of try to find the luggage, but it’s not that you’re looking for a photo of a nice suitcase. You want, actually, to buy a physical suitcase and you just want to see what it looks like.
[00:41:00] Gary Illyes: [00:41:00] Yeah, I think we called that visual exploration. It was actually the basis of that presentation that I was referencing, and the vast majority of our users are actually doing that, visually exploring the web versus going and finding an image for a meme or whatever.
[00:41:21] John Mueller: [00:41:21] OK. So working with images and videos will also continue to be something for SEOs. What about voice search? Will SEOs have to optimize for voice search?
[00:41:32] Martin Splitt: [00:41:32] Oh God, the future that never will be. I think no, because if we learn anything– I remember a bunch of years ago, people were like: “Oh, we’ll stop using keyboards and just do voice.” And I think that has been a recurring theme from the 90s. But I think in the future, it won’t change and naturally or magically become the number one thing that we need to worry about, simply because it changes the input modality, and it changes probably how queries are phrased, but it doesn’t change the fundamental use of natural language to retrieve information from the Internet.
[00:42:13] So I think you don’t have to worry too much about it, to be honest, but that’s maybe just me. Maybe the future will be completely different and we’ll… I don’t know. I don’t think so.
[00:42:25] Gary Illyes: [00:42:25] I think we are going to experiment with just projecting our thoughts into search engines and then that’s how we are going to find things.
[00:42:32] Martin Splitt: [00:42:32] But I wonder if our… I don’t know. So apparently there’s two different kinds of people. There’s one kind which has an inner monologue, the other one doesn’t. I’m of the kind of inner monologue, so my thoughts are fully formed sentences, so I would still use normal natural language in my thoughts. But maybe the other kind doesn’t, I don’t know.
[00:42:54] John Mueller: [00:42:54] Then you have three voices in your head, two of your own, and then one Google. And then, you have Bing and the other search engines too. It’s like you have– you wake up in the morning and you’re like: “What should I have for breakfast?” It’s like: “This, this, this.” [Martin laughs]
[00:43:10] Martin Splitt: [00:43:10] And maybe it’s like these video calls or general, like, conference call situations, like: “Can you hear me now? Loud enough?” [all laugh]
[00:43:18] Gary Illyes: [00:43:18] Martin, you are muted. [Martin laughs]
[00:44:06] Martin Splitt: [00:44:06] Dead.
[00:44:07] John Mueller: [00:44:07] Understanding more about web apps.
[00:44:09] Martin Splitt: [00:44:09] Dead.
[00:44:10] John Mueller: [00:44:10] No, no. Web apps is like that new opportunity that sounds good. Structured data seems like, well… There’s like, if you plan to retire in the next 10-20 years, probably you’ll continue to do it. But maybe at some point less.
[00:44:30] Crawling and all of that probably stays the same or similar, and machine-generated content seems like one of those research opportunities for people to kind of plan ahead on what might happen, I don’t know, maybe 5-10 years in the future. Is that about right?
[00:44:49] Gary Illyes: [00:44:49] Sounds about right to me, yeah, but maybe we are all wrong. Who knows? That’s the beauty about the future. We can’t really predict things. But yeah, I think that encapsulates what we think will happen. Let’s see how right or wrong we are in the future.
[00:45:05] John Mueller: [00:45:05] OK. Well, it sounds like we’ll continue to have search engines because people will continue to ask “Is SEO dead?”, and for that, you need a search engine. So at least for that topic, we’ll continue to need SEOs.
[00:45:21] Gary Illyes: [00:45:21] True. Very meta.
[00:45:24] John Mueller: [00:45:24] Cool. Alright. And with that, I think we’ve kind of made it to the end of our episode, which is pretty cool. Thank you two for joining in here. Thank you, all of the listeners who are watching or hearing, I guess. Thanks for joining us here. We’ve been having fun with these podcast episodes and I hope you all find them insightful and interesting too. And regardless, let us know if there’s anything that you think we should be talking about more in the future.
[00:45:57] Feel free to drop me a note on Twitter or chat with us at any of the virtual events that we sometimes go to, which is not a lot at the moment. And of course, don’t forget to like and subscribe and update all of your links to point to these podcast episodes because Gary says links will not be going away. So thank you and goodbye.
[00:46:20] Martin Splitt: [00:46:20] [speaks in a foreign language]
[00:46:22] Gary Illyes: [00:46:22] Goodbye.
[00:46:24] ♪ [music] ♪