Look at it this way. If it works, it works.
If it doesn’t, it’s all over anyway.
Having a copy of the source data would also be useful to build a static archive, without having to race against the October deadline.
No, I don’t think so. I’m talking about the publicly accessible parts of the forum, of course. I am aware that some parts no longer are, though I can’t say whether they were just hidden or actually deleted.
AFAIK there are tools that are Discourse-specific, there are ways that Discourse itself can help serve the content, and there are general purpose scraping tools that can probably be used, too.
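To give an idea of the "Discourse itself can help" route: as far as I know, Discourse serves a JSON version of a topic if you append `.json` to its URL. Below is a rough Python sketch under that assumption. The function names, the User-Agent string and the output fields are all made up by me for illustration, and I haven't run it against RPS. Also note that, to my understanding, long topics only return the first chunk of posts this way, so a real archiver would need extra requests for the rest.

```python
import json
import urllib.request


def topic_json_url(base_url, topic_id):
    # Assumption: the forum exposes /t/<topic_id>.json, as Discourse normally does.
    return f"{base_url.rstrip('/')}/t/{topic_id}.json"


def extract_posts(topic):
    # Pull out the bits worth keeping from an already-parsed topic JSON dict.
    # "cooked" is the rendered HTML of a post in Discourse's JSON.
    posts = topic.get("post_stream", {}).get("posts", [])
    return [
        {
            "number": p.get("post_number"),
            "author": p.get("username"),
            "created_at": p.get("created_at"),
            "html": p.get("cooked", ""),
        }
        for p in posts
    ]


def fetch_topic(base_url, topic_id):
    # Network call; be polite, add delays between requests in real use.
    req = urllib.request.Request(
        topic_json_url(base_url, topic_id),
        headers={"User-Agent": "forum-archive-sketch"},  # hypothetical name
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Something like `extract_posts(fetch_topic(base_url, topic_id))` would then give a list of posts for one topic, which could be dumped to disk and turned into static pages later.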
It’s something I was looking into a few weeks ago, because apparently I have some sixth sense when it comes to shit like this (nevertheless, my head was buzzing after finding that post in the queue). I never got to the bottom of it, though, because of other commitments and this nasty habit of days only having 24 hours and me needing sleep, too.
I’m not sure what challenges will emerge if anybody tries to archive, say, WAYP or Screenshot, let alone the whole thing, starting from scratch and with no experience.
This forum is quite large. FWIW, I think at some point they had to raise some limit, because it wouldn’t index the whole message base otherwise.
And would RPS be OK with (possibly multiple) people scraping their forum? Or would their tech people try to block that?
My understanding is that Discourse doesn’t have the lowest requirements out there, but not having to serve a huge number of users might actually be a blessing in disguise, placing the task largely in the domain of the affordable.
I have no idea how much space RPSF takes, though, and that might require an upgrade.
Even then, it would probably only take a few users agreeing to contribute a little money every now and then. That would be my hope.
Now, for some reason I have this creeping feeling that while we’re here trying to look at the bright side and generally make the best of it, they’ll just stride casually along and drop another bomb on poor, unsuspecting us in a short while. I wonder why…