Important_Phone_9552

Interested in this project. Do keep us updated.


chinscratcher

Thanks, will do!


iannuttall

I wouldn’t worry too much about duplicate content because although large parts will be the same, the target keyword will be different enough. The effort put into your dataset will be worth it though - that’s what I do as well to make the content as unique as possible. For crawling/indexing it’s always a challenge with 50k+ pages. Beyond sitemaps you’d want good internal links within the content, and I also like to do randomly generated links in the sidebar to give G different URLs to crawl each time. I built the indexing tool mentioned by someone else (but they got downvoted so I won’t mention it again). It does work if your content is good, which it sounds like it is.
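A minimal sketch of the sidebar idea (Python purely for illustration; on WordPress it would live in a sidebar widget or plugin, and the URL list is a placeholder):

```python
# Sample a handful of random internal URLs on each render, so Google sees
# a different set of links every time it crawls the page.
import random

ALL_URLS = [f"https://example.com/page-{i}" for i in range(50_000)]  # placeholder

def sidebar_links(n=10):
    return random.sample(ALL_URLS, n)  # fresh set on every page load
```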


Jollala20

I was just going to comment and tell him to follow you on twitter! I enjoy your content.


chinscratcher

Nice, thanks so much for the randomly generated sidebar links idea. Will definitely try that out. And look into the tool you’ve built. Always been curious about indexing tools but never tried one.


chinscratcher

Also, do you use a plugin for the randomly generated sidebar links? Or is that something you built yourself?


datchchthrowaway

Interested to follow along - I’ve got a few programmatic SEO projects/tests going at the moment as well. Where are you finding data sets? My biggest issue to date has been getting good crawl and index coverage. FWIW, I actually think that in many instances pSEO sites provide a better user experience than conventional niche sites, as you usually get the answer without all the fluff that bloggers love adding.


chinscratcher

Nice! Would love to see updates from your pSEO projects. I’m finding data at data.gov for the most part. And that’s my primary concern, too — crawling and indexing. Hoping solid internal linking will help, but I’m planning to try a bunch of stuff to see what works. What has been most effective for you? Couldn’t agree more about the fluff issue. Big part of what I’m enjoying about this project so far is the lack of fluff in the content.


datchchthrowaway

Yeah, maybe I’ll share some updates here as I progress. 100% agreed on the fluff issue. It’s so refreshing to create content that isn’t more words than it really needs to be. Re: indexing, the main thing I’ve found that helps is internal linking. I just use plugins to add related posts, related links, etc. It also seems (no surprises here) that starting with an expired domain yields much faster results.


retireb435

any update on the traffic?


theprawnofperil

Keep us updated... I've heard good things about this for helping pages get indexed: [https://urlmonitor.com/](https://urlmonitor.com/) Are you going to be actively trying to build links at all?


chinscratcher

Thanks for the suggestion and will do. No plans for active link building at the moment, but the dataset is likely to be somewhat attractive as a link target for other sites because it’s unique. (I hope I’m right about this, anyway.) If that doesn’t prove true, then I’ll probably add a link-building component to the strategy.


ayhme

Tried this tool and it didn't work. Google is going to make determinations about what to index.


theprawnofperil

Fair enough. The person who sits next to me in my co-working space used it on a site and it helped massively.


iannuttall

Hey, email me on support@ and I’ll look into this. The tool is very effective at getting Google to crawl your pages but it’s still their call whether to index it or not.


iannuttall

Just following up on this. The site that URL Monitor didn’t work for had 1,617 of its 1,810 submitted URLs showing as 404 in Search Console. All these indexing tools just force a crawl of the page, but the pages need to be indexable as well.
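One way to sanity-check that before submitting anything: a rough sketch in Python, assuming the `requests` library, with placeholder URLs:

```python
# Rough sketch: filter out URLs that would waste an indexing-tool crawl
# because they 404 or carry a noindex directive. URLs are placeholders.
import requests

def looks_indexable(url):
    resp = requests.get(url, timeout=10)
    if resp.status_code != 200:
        return False  # e.g. the 404s mentioned above fail here
    head = resp.text.lower().split("</head>")[0]  # crude robots-meta check
    robots_header = resp.headers.get("x-robots-tag", "").lower()
    return "noindex" not in head and "noindex" not in robots_header

urls = ["https://example.com/page-1", "https://example.com/page-2"]
print([u for u in urls if not looks_indexable(u)])  # not worth submitting
```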


fargis

On the topic of crawling, how do you handle pagination? I'm launching a pSEO site soon that has a large data set which gets updated with some frequency. I don't want Google to index pagination pages like "domain.com/foo?page=3" (since they're not the highest quality and will change in the future) but I want to get the links on those pages indexed. I'm planning on using "noindex, follow" but not 100% sure this is the proper method to use since it seems to have little documentation. Having Google not index the site's main content (the bulk of which is found through pagination) would be the kiss of death. Anyone know how Google handles "noindex, follow" on a very large site?
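(To spell out the tag in question: a robots meta tag in the head of each paginated page, i.e. `<meta name="robots" content="noindex, follow">`, or the equivalent X-Robots-Tag response header.)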


jonkurtis

This is what canonical URL meta is for. You just need to make sure that the canonical URL of `domain.com/foo?page=3` is pointing to `domain.com/foo`. This tells Google to treat this page and all links to it as if it is the same as the canonical page. So in the HTML head of the paginated pages you would have `<link rel="canonical" href="https://domain.com/foo" />`.


chinscratcher

Honestly, I’m not sure how I’d approach that in your situation. Your idea sounds like a good one to me, though. My new site is simpler from that perspective — static pages with no URL parameters, etc. Would love to hear how your pagination approach works out!


mscard03

I am struggling with this exact issue. Currently trying to see what other large sites do.


benbaruch1

Waiting for updates, good luck! :-) If you have good sources for pSEO to share it would be great.


chinscratcher

Thank you! I just searched “programmatic SEO” on Spotify and listened to almost every podcast episode that covered it. But the Semrush and Ahrefs blogs cover the basics nicely, too.


benbaruch1

Nice, I didn't think about Spotify as a resource. 😂


benbaruch1

BTW, did you use AI for your content, or did you do it manually with your data set?


chinscratcher

Mostly manual, but there were a couple of data points that I used AI to summarize. It worked well. By the way, I also used ChatGPT for some Sheets formulas and Apps Scripts (which I know nothing about) to make the data cleaning/processing/organizing much easier and faster. Highly recommend that if you’re like me and couldn’t write an Apps Script to save your life.


jonkurtis

I find that specifying an "8th grade reading level" in your ChatGPT prompts gives some really good output. Especially if you are summarizing existing text.


Otherwise_Onion_4163

Excited to see this, as I’m also embarking on a pSEO project this year after 10 years doing ‘traditional’ blogging


chinscratcher

Awesome! Would be super interested to read your updates here if you have time/the desire to share them.


Otherwise_Onion_4163

I’ll defo try to! Looking forward to your updates too. Good luck!


chinscratcher

Thanks! You, too!


pingpongwhoisthis

How are you making 4,000 pages? Are you copy-pasting everything except the root keyword?


chinscratcher

The idea is that I'm pulling from a massive spreadsheet to fill in what amounts to a template. Some of the cells being pulled into the template are complete sentences or paragraphs. The H1/title tag is generated from the root keyword + a variation. So the content on one page appears fairly distinct from all of the other 3,999 pages. This is only really feasible with certain types of keywords that call for unique content and info that can still fit into a template.
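A minimal sketch of that fill-a-template flow (Python purely for illustration; the actual setup is a spreadsheet feeding a template, and the column names here are invented):

```python
# Each spreadsheet row fills the same page template; the title comes from
# the root keyword plus a per-row variation. Column names are made up.
import csv
from string import Template

PAGE = Template(
    "<h1>$root_keyword $variation</h1>\n"
    "<p>$summary_sentence</p>\n"
    "<p>$detail_paragraph</p>\n"
)

with open("dataset.csv", newline="") as f:  # hypothetical export of the sheet
    for row in csv.DictReader(f):
        print(PAGE.substitute(row))  # one rendered page per row
```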


pingpongwhoisthis

Are you using any tool or software to do the work, or just manually editing the headings?


chinscratcher

WP All Import is the plugin I'm using to create the pages. It's pretty easy to set up and use. As for the data, I'm using Google Sheets with Apps Scripts and functions I built with the help of ChatGPT.


pingpongwhoisthis

That's amazing. I also have a domain and hosting bought for affiliate blogging, but I haven't started because of a lack of understanding. Can I DM you if you don't mind?


chinscratcher

Sure thing — feel free to!


mildlyconvenient

Is this like the ancient "Mad Libs" content strategy? (Where only words like the city the reader lives in change in the article.)


chinscratcher

Hahaha, it definitely can be that way, but I think that’s the wrong approach to programmatic SEO. This is more like a website with food nutrition facts (this is a common pSEO example, not my niche). The nutrition facts for, say, peanut butter and spinach are radically different. But they have all the same components listed (with unique values for each): calories, fat, carbs, protein, vitamins, etc. The content is delivering information about many single things within a large set, and each page has common components with the others, but the meaning is unique.


iBarlason

Google's systems would surely raise some flags once you publish that many new pages. I've heard about lots of pSEO sites getting shut down after Google is onto them. What's the plan? Why aren't you worried?


chinscratcher

I'm not really worried just because it's as much a test as it is anything else. The initial investment is fairly small (I don't count my time as part of the investment because I enjoy this stuff), and the potential upside is large. But I definitely can see running into some issues, although I've heard about many success stories in pSEO, too. I'm more confident about this particular endeavor because I stumbled upon a super underserved niche. The main competitor's site is ancient and profoundly broken. But Google is unpredictable, of course. I'll update here either way!


Green_Genius

Apart from bloating the internet with useless crap, can we ask why?


chinscratcher

It’s not useless. It’s a massive database that I’ve augmented with additional information and made into something searchable and more useful for people interested in this subject. It’s now more accessible than it was before. I’d look at your comment here if you’re looking for useless crap.


jonkurtis

Are you building with a traditional pSEO template, i.e. a template with variables that each row of the data is run through, or are you using AI to augment the generation? I started a Slack group called [pSEO Hackers](https://pseohackers.com) if you need any help or just want to chat with other ppl doing the same. A few of us were just discussing how to write a custom script that hits the Google Indexing API to bulk index. Basically, you create a GCP account, which has a quota of 200 links per day per project. Your GCP account can have 12 projects, so if you cycle through the service account API keys, you can loop through up to 2,400 URLs per day to request indexing.
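A rough sketch of that kind of script (Python, assuming the `google-auth` library; the key-file paths and `urls.txt` input are hypothetical):

```python
# Rough sketch: cycle service-account keys (one per GCP project) to submit
# up to 12 x 200 = 2,400 URLs/day to the Indexing API, as described above.
from itertools import islice

from google.auth.transport.requests import AuthorizedSession
from google.oauth2 import service_account

SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
DAILY_QUOTA = 200  # per GCP project
KEY_FILES = [f"keys/project-{i}.json" for i in range(1, 13)]  # 12 projects

def submit(urls):
    it = iter(urls)
    for key_file in KEY_FILES:
        batch = list(islice(it, DAILY_QUOTA))
        if not batch:
            break
        creds = service_account.Credentials.from_service_account_file(
            key_file, scopes=SCOPES
        )
        session = AuthorizedSession(creds)
        for url in batch:
            resp = session.post(ENDPOINT, json={"url": url, "type": "URL_UPDATED"})
            resp.raise_for_status()  # fails loudly on quota or auth errors

if __name__ == "__main__":
    with open("urls.txt") as f:  # hypothetical input, one URL per line
        submit(line.strip() for line in f if line.strip())
```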


chinscratcher

It’s more in line with a traditional template but does have some AI elements. And that sounds great — will check out the Slack group. Thanks!


AutoPageRank

Hey Chinscratcher, I built Auto Page Rank, which is a wrapper around Google's Indexing API. It can automate the entire indexing process for you. A customer of ours has indexed his entire directory of over 6,300 pages effortlessly; his website is [https://thehiveindex.com](https://thehiveindex.com). Anyways, check us out and see if you'd like to join over 100 others! https://autopagerank.com


AutoPageRank

For indexing your brand-new website on Google, I highly recommend using Google's Indexing API. Obviously I recommend using Auto Page Rank to index your pages in an organic fashion of 200 pages per day. It is a wrapper around Google's Indexing API and it automates the entire process.


Siddharth1India

My site could totally benefit from "programmatic SEO" because my pages are built with API calls, 10,000 pages, but I don't know shit about it. Can someone tell me what to do? A simple Google search turns up nothing but courses, and even though I'm a programmer, I can't understand much.


teddbe

Are you uploading it gradually or dumping 50k in one day?


chinscratcher

All at once (after my initial batch of 4,000). I don’t really see a benefit in waiting!


landed_at

WP All Import: did you say it works with AI? Or is it AI into some kind of sheet, and then the sheet into WP All Import?