How to scrub URLs in one list against another?



So I have 2 files of URLs. One of them has more than the other. What I want to do is remove from the bigger list all the URLs that are also in the smaller list. If that makes sense?

I'll say it again just in case.

So I have 2 files of URLs right.

One has about 350+ URLs in it.

And the other list has about 100 URLs in it.

I want to remove all the URLs that are in the list with about 100 URLs, from the list of 350 URLs.

So I want to scrub the list of 350 URLs against the list of 100 URLs.

And remove any URLs in the 350 list that are in the 100 list.

Does anyone know of any free online duplicate list scrubber tool or otherwise for that?

Or perhaps some list comparison tool online or equivalent?

I used to use for this, but they went down ages ago, which is a shame because they had a full suite of text tools, list comparison tools, etc.

Anyone know anything like that?


If I'm understanding it right, you want to compare two files of URLs and remove the duplicates from them, right? Personally I mostly use Notepad++ as it does the job perfectly, but I think there are a lot of duplicate remover tools online too.

I'd never heard of that website before, but I thought I'd check it with the Wayback Machine first, and yeah, it was archived by them. What's interesting is that you can still use it from there just like you did on their domain.

However, after doing some research on Google I found a copy of it; actually I think it's the same website under a different domain name. Here it is: Hope that's what you wanted.

Cheers pro! Respect brother. Yeah, compare and scrub, or compare and remove from the big one the duplicates that are in the small one. How do you scrub duplicate URLs/lines using Notepad++? Is there some tool or option for that? Never even knew it could do that!

Yeah, it was a good site for doing stuff like this, and yeah, it figures that it would still be in the Wayback Machine. Cool how it still works even though it's only a cached version of it! That site you found seems to be very similar, cheers for that. I also did a bit more digging and found a site that is ideal and seemingly dedicated to this kind of thing. It's

Any lines, code or URLs that are not in the first field are highlighted in green for you when you compare them. It's not the most ideal way around it, as it puts them all together and you have to select and copy those lines/URLs. But it's a workaround nonetheless.

You're welcome mate! Yeah, you need a plugin called "TextFx" to do that; it can be installed very easily from Notepad++. To install it, open Notepad++ and go to Plugins, then Plugin Manager, then Show Plugin Manager. You'll see all the available plugins that you can install. Search for TextFx, select it and install it, and that's it. It's a huge plugin with very useful features; simply put, it's great.

Basically, if you want to remove duplicate lines of the same text, you can do so in cPanel alone. It uses JavaScript for that.

How to remove duplicate text:

  1. Open your cPanel
  2. Create a new document (.html, php doesn't matter)
  3. Make sure you're in the code editor (click Code Editor)
  4. Use CTRL + F
  5. A window should pop up
  6. Input the text to find
  7. Input the replacement text
  8. Click on Replace

This is how I do stuff like this. It's quick and easy. Now, actually comparing two lists and removing from one list the lines that the other list also has is quite a tricky process. This can't be done with any cPanel tools, so you'll probably have to use another tool to do it, or even code a script to do it for you if you know how to code.
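
For what it's worth, the script route can be pretty short. Below is a minimal Python sketch, assuming each file has one URL per line. The function names `scrub` and `scrub_files` are made up for illustration, not from any tool mentioned in this thread. It compares lines exactly after trimming whitespace, so `http://` vs `https://` or a trailing slash would count as different URLs.

```python
from pathlib import Path


def scrub(big_lines, small_lines):
    """Return the lines of big_lines that do not appear in small_lines.

    Lines are compared exactly after stripping surrounding whitespace.
    Blank lines are dropped and the order of big_lines is preserved.
    """
    # A set makes each "is this URL in the small list?" check fast.
    to_remove = {line.strip() for line in small_lines if line.strip()}
    kept = []
    for line in big_lines:
        url = line.strip()
        if url and url not in to_remove:
            kept.append(url)
    return kept


def scrub_files(big_path, small_path, out_path):
    """Read both lists, scrub the big one, and write the result to out_path."""
    big = Path(big_path).read_text(encoding="utf-8").splitlines()
    small = Path(small_path).read_text(encoding="utf-8").splitlines()
    kept = scrub(big, small)
    # One URL per line in the output file.
    Path(out_path).write_text("".join(url + "\n" for url in kept), encoding="utf-8")
    return kept
```

With hypothetical file names, calling `scrub_files("big_list.txt", "small_list.txt", "scrubbed.txt")` would leave the two input files alone and write the remaining URLs to `scrubbed.txt`.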

Oh yeah, that's a good point actually. WordPress does a similar thing for revisions too, so you can see what is actually different/new compared to the old version. I guess I could create a post with one list, save it as a draft so a revision is saved, then edit it again, paste in the other URLs and save again so another revision is made, and then compare them to see what's different, as it highlights what's new in green for you. But yeah, like you say, it could be a bit tricky removing dupes, but at least it would be a workaround anyway. I might have to try this trick with cPanel to see if it works sort of the same in practice.

I knew that you guys would have a solution for me. Surprising what you can get done when you put your mind (or other people's minds) to it, eh lol

This may sound stupid, but what if you paste everything into Excel and just use the Remove Duplicates tool? It will remove every duplicate line and just keep one.
At least that's the way I'm doing it, both for URLs and anything else for that matter. Especially when I do very extensive keyword research, I get a lot of duplicate keywords that need to be removed and organized into groups, so I always use MS Excel.
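
If you'd rather not open a spreadsheet, the same "keep one of each" idea can be sketched in a few lines of Python. This is only an illustration, and `remove_duplicates` is a made-up name. Note that it keeps one copy of every URL, which is a different job from dropping every URL that also appears in a second list.

```python
def remove_duplicates(lines):
    """Return lines with repeats dropped, keeping the first copy of each."""
    cleaned = (line.strip() for line in lines)
    # dict keys keep insertion order (Python 3.7+), so first occurrences win.
    return list(dict.fromkeys(line for line in cleaned if line))


urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/a",
]
print(remove_duplicates(urls))
# prints ['http://example.com/a', 'http://example.com/b']
```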

For some reason, I haven't quite adapted to Google Sheets, even though they are a lot faster and you can basically do the same things you are able to do in Excel.

