That is great. I'm glad they were big enough to do that.

For what it is worth, I think you're still wrong.

I just looked again, and I am a fool. To begin with, all I could see was monthly subscription pricing, and even one month would be too much. But there's also a batch price that should cover what I need for a much more reasonable amount.

// @japchap

No, I don't really need the content, as I have already scraped that.

I need metadata, if you will, about each page.

I found a thing called Scrapy [doc.scrapy.org] that seems like it ought to be able to do what I want, but it is way beyond my abilities to implement, and not really worth it for this one job. I doubt that there will be other jobs like it.
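
Having skimmed the Scrapy tutorial, the spider itself at least looks like it would only be a handful of lines. Very roughly, and completely untested (example.com and the file names are just stand-ins for the real thing):

import scrapy

class SiteMetadataSpider(scrapy.Spider):
    name = "site_metadata"
    # Placeholders: swap in the real domain and start page.
    allowed_domains = ["example.com"]
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # One row of metadata per page.
        text = " ".join(response.css("body *::text").getall())
        yield {
            "url": response.url,
            "title": (response.css("title::text").get() or "").strip(),
            "size_bytes": len(response.body),
            "word_count": len(text.split()),
            "links_out": len(response.css("a::attr(href)").getall()),
        }
        # Follow internal links so the whole site gets visited.
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)

Apparently you then run it with something like scrapy runspider site_metadata_spider.py -o pages.csv and it writes the rows straight out as a CSV.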

Maybe I should just pony up and pay the online service.

// @japchap

@japchap Thanks. SiteSucker seems to do a similar job to httrack, so that end is covered. What I really want is something that will crawl the mirror and map out its structure.

I'll keep looking.

Yup. Online, there's one I tried called Content Insight [content-insight.com] that provides URL, Type, Size, Level, Title, Word Count, and Links In and Out for each page.
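
Mind you, most of those columns could probably be pulled out of the httrack mirror with a short Python script rather than paying for the service. Something along these lines, assuming BeautifulSoup is installed and the mirror sits in a folder called mirror; I've left off Type (it's really just the file extension) and Links In (that would need a second pass over everyone else's Links Out):

import csv
import os
from bs4 import BeautifulSoup   # pip install beautifulsoup4

MIRROR_ROOT = "mirror"              # placeholder: wherever httrack put the files
OUTPUT_CSV = "site_structure.csv"   # placeholder output name
FIELDS = ["url", "size_bytes", "level", "title", "word_count", "links_out"]

rows = []
for dirpath, _, filenames in os.walk(MIRROR_ROOT):
    for name in filenames:
        if not name.lower().endswith((".html", ".htm")):
            continue
        path = os.path.join(dirpath, name)
        rel = os.path.relpath(path, MIRROR_ROOT)
        with open(path, encoding="utf-8", errors="ignore") as f:
            soup = BeautifulSoup(f, "html.parser")
        text = soup.get_text(" ", strip=True)
        rows.append({
            "url": rel.replace(os.sep, "/"),
            "size_bytes": os.path.getsize(path),
            "level": rel.count(os.sep),     # how deep below the site root
            "title": soup.title.get_text(strip=True) if soup.title else "",
            "word_count": len(text.split()),
            "links_out": len(soup.find_all("a", href=True)),
        })

with open(OUTPUT_CSV, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

No idea how well that copes with httrack's rewritten links, but that's the general shape of the thing.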

I'm not even sure what I should be searching for (what search terms to use, I mean).

I'm working on migrating some of the content of a website into its new incarnation. I'd like, if possible, to construct a spreadsheet that mirrors the structure and contents of the old site. There are online tools to do this, but they are expensive. At the moment I am downloading a mirror of the entire site using httrack. When that's done, can anyone recommend an app that will step through it and extract the information I need?

What's my number?

[attached image: slice.png]

In case anyone is interested, I finally wrote up the recipe for my carrot cake [fornacalia.com].

@kdfrawg Oh yeah. Fully confirmed now as I snuffle myself to sleep, pausing only to down a hot Lemsip.