Welcome to Press This, the WordPress community podcast from WMR. Each episode features guests from around the community and discussions of the largest issues facing WordPress developers. The following is a transcription of the original recording.
Doc Pop: You’re listening to Press This, a WordPress community podcast on WMR. Each week we spotlight members of the WordPress community.
I’m your host, Doc Pop. I support the WordPress community through my role at WP Engine and my contributions on TorqueMag.io. You can subscribe to Press This on RedCircle, iTunes, Spotify, or your favorite podcasting app. You can also download the episodes directly from WMR.fm.
Now, I have to tell you, I am extremely excited about today’s topic: Data portability. I’ve been blogging since about 2004. Back then, if you wanted to migrate your content from MSN Spaces to Blogger, you pretty much had to manually copy/paste every post, every title, and every accompanying image and just go through and do that.
Now, luckily, I only had like a dozen posts at the time, so it didn’t take that long. But in 2006, when I migrated my site from Blogger to WordPress, I was blown away by how easy things had gotten. Blogger had this export function, and LaughingSquid, the WordPress host I was using at the time, had an Import From Blogger function.
It was super easy and seamless to move, and I honestly took it for granted. I just thought, “This is how it’s going to be now. I’m just going to be able to move things from one place to another.” And, obviously, things didn’t work out that way.
As the years progressed, it started getting harder and harder to export your content and move it elsewhere. For example, you wouldn’t even dream of migrating your posts from Facebook to Elon Musk’s X and vice versa. But you also might have some trouble going from one CMS to another these days. Enter the Data Liberation project, announced by Matt Mullenweg at the State of the Word in 2023.
This open-source initiative aims to break down the barriers of content migration, making it easier and much more seamless to move your precious content from one platform to another. So joining us today to shed light on the challenges and aspirations of the Data Liberation project is Jordan Gillman, a Happiness Engineer at Automattic and the shepherd of this groundbreaking endeavor.
Jordan, how are you doing today?
Jordan Gillman: I’m doing well. Doc, how are you doing?
Doc Pop: I’m doing so well. Right before the show, I asked you this, and I just want to kind of get this—I want to brag about it: This is your first interview about the Data Liberation project, right?
Jordan Gillman: It is. I’ve done one written interview, but this is definitely my first podcast, hopefully of many, about the Data Liberation project, which I’m very excited about.
Doc Pop: And I’m excited too. There’s a lot of fascinating technology out there right now, and it’s kind of funny: at the State of the Word 2023, this is the thing that got me most excited. So why don’t you tell us about the Data Liberation project, what your goals are, and the history of the project?
Jordan Gillman: Yeah, sure thing. Well, you’re right. Matt did announce it at the State of the Word late last year. And at its core, the Data Liberation project is a community project by the community, for the community, and for an open web. It’s imagining a time where you can move your content with a single click, bring it into WordPress and take it out of WordPress.
Ultimately, our mission is to democratize publishing, and what that really means in terms of content is having the power to move your content freely to WordPress from wherever. The freedom to move your WordPress site to another host with a minimum of fuss, but also the freedom to export your content out of WordPress in a format that’s usable however you want to use it.
So I’m excited about the potential of that in freeing people’s content from walled gardens.
Doc Pop: As you alluded to there, this isn’t just about importing from one platform to WordPress. This isn’t about a Squarespace to WordPress importer. This is also trying to unlock even migrating from WordPress. We’re not trying to lock people into anything, right?
Jordan Gillman: I think that’s exactly it. I mean, as much as we want everyone to come to WordPress, and that’s obviously, you know, a huge goal of the project, we don’t want to be doing that in horrible ways. We don’t want to lock people in. We don’t want to free them from somewhere else only to feel locked into us.
So I think it’s an important part of this conversation, an important part of this project, to be talking about what freedom of content coming out of WordPress looks like.
Doc Pop: And if we were to go to wordpress.org/data-liberation, we would find a lot of guides.
And that kind of feels like what the current state of the Data Liberation project is right now. I might be wrong, but it looks like instead of saying, “Here’s the tools,” it’s a little bit more like, “Hey, here’s a guideline. If you want to go from RSS to WordPress or Wix to WordPress or Drupal to WordPress or WordPress to WordPress, here’s some guides, well-written.” Is that kind of the current state of the project right now?
Jordan Gillman: That is the current state of, I guess, what’s user-facing for the project so far. A lot of that happened around the timing of the State of the Word, and a lot of that was “let’s take a look at what existing resources we have.” So a lot of those guides come from the support documents on wordpress.org, and a lot of the tools, which are in another section there, are existing tools that people have linked to. And so at that stage it was really a case of, well, let’s pull together a resource of the information we have to use as a starting point.
I’m hopeful that those guides will continue to be really useful, even as we see more tooling introduced, because the nature of things at the moment in terms of migrating to WordPress is that even when we have importers, and we’ve got a range of core importers that are available, many of those still require a little bit of user work.
You know, so we’ve got the guide to how to export your content from Squarespace, which gives you a WordPress import file. Or we’ve got the guide to use RSS. And so some sites will give you an RSS record that you can download and then import that with an RSS importer.
So at the moment, there are some solutions, which are part manual work and then part a tool to finish the job, and that’s really what those guides are for. So we’re hoping to build those out further, but ultimately, by the end of the project, I’d love to see those guides be integrated in a way where they’re part of the tools and if there’s any manual steps required by users that we’re guiding them through those kind of in real time as they’re needed.
Doc Pop: And that’s what I’d like to know about. Looking long term, is the goal for the Data Liberation project to create a tool, or to create some sort of standardized data structure that everyone adheres to?
Or is it… I don’t know. Currently, everything’s so different, you’re sharing the tools and the resources, and it’s just a nice hub for learning how to do that.
So what are the long-term goals for the project?
Jordan Gillman: That’s a really good question, and it’s something that ultimately I’m hoping that the community as a whole will help drive exactly what that’s going to end up looking like.
There are a few ideas that have been floated. So we’ve got a GitHub repository where kind of the work and discussion is happening at the moment.
One proposal in that discussion is centered around the idea of maybe a plugin, like a generic import plugin that you might install on your site. And you would give it the source URL, and it would detect the kind of site, the platform of your existing site, and then it would walk you through the steps that might be needed.
So it might show you the guide, and then it would direct you to the plugin that you’d need to install to then use whatever you’ve been able to export. So that’s, I suppose, an idea that fits kind of within the existing paradigm of WordPress plugins and importing. We’re just kind of putting a neat front on it to tidy it up.
There’s another proposal which kind of goes a step further and imagines almost a hosted service on WordPress.org itself, where that would happen seamlessly behind the scenes. You would provide the URL of your existing site, it would detect what platform you’re using, and then it would get the content however it needs to and roll up a new Playground site for you, you know, within a couple of minutes, so that ideally with one click you have a Playground site of your content in a WordPress install. And then once it’s in Playground, we have options for how you might want to export or migrate that in a WordPress format to use as you wish.
So they’re both really interesting proposals. It may be that they are different phases of the same kind of idea. But I suppose to answer the second part of your question as to what the end goal is: That’s really something that I don’t see it as my job to decide. Which is why I consider myself a shepherd of the project.
The current phase we’re in is really just about facilitating discussion amongst the community. We want to see brainstorming, we want to see ideas, and then we want to see people, you know, challenge those ideas and together come up with what the actual work might look like and what our actual end goal might look like.
Doc Pop: And I imagine part of this is also not just about creating resources and creating tools. I imagine there’s some sort of political element here where we’re trying to call out services that aren’t allowing exporting. I feel like Squarespace does have an export option. So if you’re in Squarespace, you’re not locked in as a consumer, you can export to WordPress.
Wix, on the other hand, doesn’t, and I feel like part of maybe what’s going on here is sort of trying to get people on board. And as you get those numbers, then you have an easier time saying, “Wix, everybody else does this. Why aren’t you allowing your consumers to migrate off of your platform?”
Jordan Gillman: Yeah, I mean, it’s not a current goal to make this a political or social statement, I suppose. It really is about empowering WordPress users as a first step. However, I won’t deny that I can imagine that a successful project of data liberation within WordPress certainly does start to ask those questions.
And yeah, if our, again, if our mission is to democratize the web, then perhaps those are good conversations to be having. But I should say that it’s, you know, it’s not a goal of the project to be starting those bigger, broader conversations and lobbying and pressuring, I suppose, other organizations to have to be on board with that.
Doc Pop: It is fun though, when you look through seeing how RSS comes to save the day with so many of these projects. If they don’t currently support any sort of migration, data portability, there’s always that RSS feed, which is this open-source thing that everyone still uses, thank God.
And so, worst case—it seems like with some of these—the worst case scenario is, “Hey, at least you still have your RSS and you probably still have to build a front end, I suppose, but at least you’ll have all that metadata and blog posts and titles and images and alt text that’s in the images.”
All that stuff should hopefully get sucked up in the RSS and be very easy—that’s a very easy format to move around from one place to another, right?
Jordan Gillman: Yeah, that’s exactly right. As part of my role, I’m actually building out a list of basically the current state of migratability of a whole bunch of platforms. And what we’re going to see, with the potential of roping it into WordPress, is that there are going to be some platforms where we’ve got API access. So with the right development, we can actually probably make API calls and pull the content in a very similar format.
Or we might have services like Squarespace does at the moment, which allow you an export of content. And then so we import that XML file, and you’ll get the content, but you’ll lose some fidelity of the experience.
And then we’ve got RSS, where of course you’re still going to be able to get your content, but you’re not going to be pulling in a layout, and you might have trouble with some of the media not coming across. So I think there’s going to be a differing level of fidelity, or parity of display, in how we might be able to migrate things.
And for me, that’s one of the reasons that I’ve been very much thinking about the Data Liberation project at this point as being about content portability. Getting a full migration, from “my site looks like this on Squarespace” to “my site looks like this on WordPress,” is an admirable goal, but it’s a big shot.
And I think what we definitely can do, and this is, you know, the open web side of it, is say, “Well, you created this content, you should own it and take it where you want. It may not look exactly the same, but they’re your words, they’re your images. It’s your video, your audio, and you can take it with you.”
Doc Pop: We’re going to take a short break and when we come back, we’ll pick up our conversation about data portability with Jordan Gillman. So stay tuned for more after the short break.
Doc Pop: Welcome back to Press This. Today, we’re talking data portability with Jordan Gillman, a Happiness Engineer at Automattic who’s also a shepherd of the Data Liberation project.
Jordan, before the break, you were talking about the different places that we can migrate from and how we currently have this resource center for learning how to migrate from one place to another.
And part of that, and part of the data liberation, is not just for making it easier to import to WordPress, but even making WordPress easier to migrate to other platforms, including a thing that’s kind of surprising: WordPress to WordPress. There’s challenges that some people have migrating from WordPress—one WordPress host to another.
Can you talk a little bit about that? Maybe what thoughts y’all have heard so far about that process?
Jordan Gillman: Yeah, absolutely, I’d love to. It’s actually very fresh in my mind. I was lucky enough to attend WordCamp Asia the week before last, and I spent a lot of time in the sponsors area talking to hosts there about the challenges that they have with migrations, because many web hosts these days, you know, offer free migrations for sites, and it’s a big part of their onboarding for users.
So, one of the bigger things that came up is just, in many ways, the shortcomings of the WXR export format that WordPress uses natively. It has served us really well. It’s done a fantastic job. But there’s no denying it has shortcomings when it comes to a full-site migration.
There are challenges with filtering what content you want to export. It doesn’t natively bring the images with the export, so the source site still needs to be live to fetch those images. So, in speaking to hosts, very rarely did they use the native WordPress features for migrations. They were most often using third-party plugins and tools to do a full migration of basically the wp-content folder and then bringing the database over.
And so, there was very little work using native import-export tools. And speaking to them, they had a lot of troubles when it came, usually, to just dealing with access to the source site.
So trouble with credentials for logging in or two-factor authentication being active. Clients who wanted to migrate and had already pointed the DNS, so the domain was pointing to the new site instead of the old one, so they couldn’t access it. And issues with existing hosting, like timeouts and memory issues.
So a lot of the time, the biggest successes they had when doing migrations were using these third-party tools, which are really great. They do a good job, but I don’t think that means we shouldn’t try and bring some of that stability and ability to migrate site to site into WordPress itself.
Because I think part of democratizing publishing is that if you, for some reason, want to move hosts, you shouldn’t be locked into a WordPress host.
You should be able to very freely shift to whoever suits you best.
Doc Pop: I haven’t really thought about WXR in a while, that’s the—I just had to look that up. The WordPress extended RSS, that’s the export/import file that’s usually used for WordPress. Is that correct?
Jordan Gillman: Yeah, that’s correct. So if you are doing an export from your WordPress site just using the native tools, it will download an XML file in that specific WordPress export format.
It contains all of your content, it contains references to your media, but it is a little bit limited in that it’s just a single file, so it doesn’t bundle media. And of course, it doesn’t include your theme or your plugins or any of those things that kind of make your site visually and functionally your site. It’s just a content export.
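To make the point about WXR concrete, here is a minimal Python sketch of reading such a file. It is an illustration under stated assumptions, not part of the Data Liberation project: the sample XML is a made-up, heavily trimmed fragment, and `summarize_wxr` is a hypothetical helper name. It shows that the export carries post content plus URLs that point at media, not the media files themselves.

```python
import xml.etree.ElementTree as ET

# Namespace used by WXR 1.2 exports for WordPress-specific elements.
WP_NS = {"wp": "http://wordpress.org/export/1.2/"}

# Made-up, heavily trimmed sample: one post and one media attachment.
SAMPLE_WXR = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <title>Example Site</title>
    <item>
      <title>Hello world</title>
      <wp:post_type>post</wp:post_type>
    </item>
    <item>
      <title>photo</title>
      <wp:post_type>attachment</wp:post_type>
      <wp:attachment_url>https://example.com/wp-content/uploads/photo.jpg</wp:attachment_url>
    </item>
  </channel>
</rss>
"""


def summarize_wxr(xml_text):
    """Return (post titles, attachment URLs) found in a WXR document."""
    root = ET.fromstring(xml_text)
    posts, media_urls = [], []
    for item in root.iter("item"):
        post_type = item.findtext("wp:post_type", namespaces=WP_NS)
        if post_type == "attachment":
            # Only a URL is stored; the file must still be fetched
            # from the source site, which is why it has to stay live.
            url = item.findtext("wp:attachment_url", namespaces=WP_NS)
            if url:
                media_urls.append(url)
        elif post_type == "post":
            posts.append(item.findtext("title", default=""))
    return posts, media_urls


posts, media = summarize_wxr(SAMPLE_WXR)
print("Posts:", posts)
print("Media still hosted on the source site:", media)
```

An importer working from this file alone would have the words but would need to download each attachment URL separately, which matches the limitation described above.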
Doc Pop: So what does the Data Liberation project mean for WordPress migration plugins?
Jordan Gillman: You know, that’s a great question. And the answer is at the moment, we don’t know.
I think it’s a highly competitive space, the WordPress migration plugin space, at the moment. And I think there’s plenty that core WordPress can learn from approaches there. But I don’t think we’ll be aiming to, you know, overtake any of those.
But I think it’s fair that native WordPress allows a bit more flexibility of migration than it does currently.
Doc Pop: I understand why maybe there’d be a plugin to help you import to a certain tool. I guess, is there any reason that plugins or—I don’t know how to phrase this.
But I’m thinking about how, when we’re exporting and importing, we’re oftentimes given options of both sides. Like, where are you importing from? Where are you exporting from?
Why is that? Why can’t we just have one type of file that we export as, and then maybe have that interpreted to whatever platform it’s going to in whatever way it needs to.
Jordan Gillman: You know what, I think that’s a really great question. The short answer is I don’t know. It sounds like the question you’re asking is perhaps a little bit around a standardization of format for migrational content, which I think is another thing that’s not off the table, but it’s a very big conversation.
So I do know that, kind of, over the last year, there was a working group amongst learning management systems, kind of the LMS plugins in WordPress, to standardize their format so that there was greater interoperability migrating between, you know, Sensei and LearnPress, and those kinds of learning plugins.
And that was really successful, but it was also in a very specific small niche. So I think there is some precedent there, but aiming to standardize that is not, at the moment, part of the goals of the project. Certainly, you know, like I said we’re in the stage of discussion, and so if that’s a discussion that grows and interest is fierce then we’ll need to consider that.
But I think that approaching it from that point of view runs the risk of getting bogged down in conversations about what the standard is before we actually provide anything useful to users. So I’m very strongly in favor of let’s give users some tools. Let’s make life easier for them. Let’s make it easier for them to get onto WordPress.
Perhaps at the same time that does raise questions about a standard, but it’s certainly not where we were looking to start.
Doc Pop: It’s early days for the project and I keep asking you about, like, the end goal, and I apologize for that. You’re still figuring this stuff out as well.
One thing I’d like to hear though is maybe just your personal opinion so far. Some of these companies might not ever want to, quote unquote, opt into allowing users to export their data. Maybe they have some sort of a user agreement saying, “Hey, once you’re with us, you’re not allowed to use some sort of content migration tool.”
And I’m just wondering if you have any thoughts about, like, do platforms need to opt into this or is it okay for us to work around when platforms are being selfish with our content?
Jordan Gillman: Obviously, it would be great for people to opt in. That’s always going to be the preference, if we can be encouraging a broader environment on the web where this idea of content portability is the norm, but we’re not always going to get everyone on board with that. Our first approach will definitely be to try and use, I suppose, open methods of doing that.
And so we’ll be looking at places that do already offer exports. For those that do not offer exports yet, it may be that we start to be in a position to, you know, have that conversation with them about opening it up, but I do fully expect that there will be many platforms for which we’ll have to figure out, you know, workarounds to try and liberate the content for people.
And I say we, in the very broad sense of the word we, because I’m not from a development background, I’m from a kind of design and front end background. So the work of doing that kind of thing is exactly why I’m talking to the community and looking for folks to get involved who have much more experience in those kinds of areas than I do.
Doc Pop: We’re going to take one more quick break. And when we come back, we’re going to wrap up our conversation with Jordan Gillman about the Data Liberation project. So stay tuned.
Doc Pop: Welcome back to Press This. We’re wrapping up our conversation about the Data Liberation project with Jordan Gillman.
In 2023, Matt Mullenweg said the following at his State of the Word: “Imagine a more open web where people can switch between any platform of their choosing. A web where being locked into a system is a thing of the past. This is the web I’ve always wanted to see.”
When he announced this project, I was extremely excited about it. And I know it’s still the early days for the project, but I just wanted to hear how is the project doing amongst other key projects in the WordPress space, such as marketing and site editing and feature polish? Like how do you think this Data Liberation project is kind of currently doing compared to the other projects within WordPress?
Jordan Gillman: Yeah, that’s a great question. I suppose in terms of the goal for the project, it is one of the key projects that has been earmarked for 2024 by Matt and Josepha, the Executive Director of WordPress.
So in terms of the plans for the year, it’s a really major part of that. One question I’ve had from a few folks in the community has been, why are we focusing on this instead of X?
You know, why are we starting this new project rather than fixing the editor more? Or why are we doing this rather than working on user management more or any of those things? And I think the important thing about the Data Liberation project is it’s not in place to try and take resources away from any of the other work that’s being done on the project.
We are a huge community of people. There are a huge amount of people working on the project, but there’s always room for more. So part of what we’re seeking to do is really hopefully activate people for whom this is exciting, people who are passionate about content freedom, people who have skill sets in those areas.
And so rather than taking resources from other important work, like working on the editor, like, you know, the existing teams within Make WordPress, whether that be the marketing team, or the meta team, or any of them, we’re seeking to really engage with and activate potential new contributors back into the project and really kind of resource things that way.
So in terms of the overall goal it’s a big goal for the year, but it’s something that we’re starting at a really grassroots level, I suppose.
Doc Pop: We’re running low on time. I have two questions for you. Maybe we can wrap them together.
I’d like to know if any organizations outside of the WordPress space have volunteered to contribute to this project. And I’d also like to know what organizations have currently been the biggest contributors to the Data Liberation project.
Jordan Gillman: Sure. So the short answer for the first one is that we’ve not got any resources coming from outside the WordPress space at the moment. Again, as we’re not at this stage seeking to make it a more political, broader statement, that’s not too surprising to me, but it’s definitely something that I’m open to.
And in terms of the second one, the shortest answer is that there has been a little bit of discussion within the WordPress community. Within the Data Liberation Slack channel, in the Make Slack instance, there’s been some good discussion, which has come from, you know, a variety of organizations. There’s also been a little bit of contribution by members of Automattic who are also full-time contributors towards WordPress, particularly from the meta team.
Some of the proposals that have been raised for discussion have come from those sources. But what we’re really seeking to do is get more eyes and a more diverse set of opinions into that conversation, which is a major part of what I’m working on at the moment.
Doc Pop: Jordan, I really appreciate your time today. Where can people learn more about what you’re working on?
Jordan Gillman: Awesome. That is the best question you could have asked. The best place to head right now is to go to WordPress.org/data-liberation. That will get you to the existing tools, but there’s also a couple of notes at the top about where you can get involved and join the discussion and join the planning.
A secondary place that I would recommend is if you are in the Make WordPress Slack community—which you absolutely should be, it’s a great place to be—the Data Liberation channel within Slack is where most of the brainstorming and thinking and on-the-fly stuff is going to happen. And I’ll be increasing, kind of, the activity in there over the coming weeks.
Doc Pop: That’s awesome. Thank you so much for your time today. And thanks to everyone who’s listened so far.
Press This is a WordPress community podcast on WMR. You can visit TorqueMag.io to read transcribed versions of these podcasts, plus more WordPress news and tutorials, that’s TorqueMag.io. You can subscribe on RedCircle, iTunes, Spotify, or download directly from WMR.fm.
I’m your host, Dr. Popular. I support the WordPress community through my role at WP Engine, and I love spotlighting members of that community each and every week on Press This.