Checking your website automatically with Linkchecker

To install linkchecker on Debian or Ubuntu:

sudo apt install linkchecker

Linkchecker is written in Python.

Then run it like this:

linkchecker https://your-domain.com

If your website is large and the check takes too long, press Ctrl+C to stop it.

The output uses quite a few terms, so it is worth going through them now:

  • thread
  • links queued
  • URL checked
  • runtime
  • URL
  • Name
  • Parent URL
  • Real URL
  • Check time
  • Result

What is a thread?

A thread is a unit of execution that runs in parallel with others; here, each thread checks links on its own. The more threads you run, the faster the check finishes. This is not without consequences: every thread sends its own requests to your server. If the server cannot keep up with all those requests, some of them will not be handled properly and you may even overload ("break") the server.
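
The number of threads can be lowered with linkchecker's --threads option (short form -t). A minimal sketch, assuming a hypothetical wrapper function named check_gently and an arbitrary value of 2 threads:

```shell
# Hypothetical helper: check a site with only 2 threads so that a small
# server is not flooded with parallel requests.
# The value 2 is an arbitrary example, not a recommendation.
check_gently() {
    linkchecker --threads=2 "$1"
}
```

It would be called as: check_gently https://your-domain.com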

Links queued

The number of links still waiting to be checked. As the crawler progresses it keeps discovering new links, so depending on your settings a run may take a long time.

URL checked

The number of URLs the crawler has already checked. Once a URL has been checked, it is not checked again during the same run.

Runtime

The time the run has taken so far.
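
Threads, links queued, URLs checked and runtime appear together in the status line that linkchecker prints while it runs. The sketch below splits such a line into its parts. The sample line was written for this example, so the numbers are made up and the exact wording may differ between versions; split_status is a hypothetical helper name.

```shell
# Illustrative status line (assumed format, not captured from a real run).
status_line='10 threads active, 25 links queued, 152 links in 140 URLs checked, runtime 12 seconds'

# Print each comma-separated part of a status line on its own line.
split_status() {
    printf '%s\n' "$1" | awk -F', ' '{ for (i = 1; i <= NF; i++) print $i }'
}

split_status "$status_line"
```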

URL

The link as it is written in the source of the page, for example a relative path.

Name

The name of the link, usually the text of the link (or the alt text of an image), not the URL itself.

Parent URL

The page on which the link was found, followed by the position of the link in that page's source (line and column).

Real URL

The full, absolute URL that linkchecker actually checked, after resolving the link against its parent page.

Check time

The time linkchecker needed to check this URL.

Result

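The outcome of the check: whether the URL is valid or in error, usually with the HTTP status (for example 200 OK or 404 Not Found).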
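
To see how these fields fit together, here is a small sketch that keeps only the failing records of a text report. It assumes that records are separated by blank lines and that a failing record has a Result line starting with "Error". The sample report is made up for illustration, and broken_records and sample_report are hypothetical helper names.

```shell
# Print every record of a linkchecker text report whose Result line
# starts with "Error". Records are assumed to be separated by blank lines.
broken_records() {
    awk 'BEGIN { RS = ""; FS = "\n" }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Result +Error/) { print $0; print ""; break }
        }
    }' "$@"
}

# Illustrative report with two records (made-up values).
sample_report() {
    cat <<'EOF'
URL        `/about'
Name       `About us'
Parent URL https://your-domain.com/, line 12, col 5
Real URL   https://your-domain.com/about
Check time 0.201 seconds
Result     Valid: 200 OK

URL        `/missing-page'
Name       `Old article'
Parent URL https://your-domain.com/, line 42, col 7
Real URL   https://your-domain.com/missing-page
Check time 0.312 seconds
Result     Error: 404 Not Found
EOF
}

sample_report | broken_records
```

With a real report saved to a file, the call would be: broken_records your-report.txt (the file name is hypothetical).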


Linkchecker options

Linkchecker shows its full value once you know how to use its options.

linkchecker https://your-domain.com --check-extern

This command also checks your external links, that is, the links pointing to other sites.

linkchecker https://your-domain.com --ignore-url=/xmlrpc.php$

This command tells linkchecker to skip the URLs matching the regular expression you give it (here, URLs ending in /xmlrpc.php), so the run does not get stuck on them.
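
Because the value of --ignore-url is a regular expression, it can help to preview what a pattern matches before using it. The sketch below uses grep -E as a stand-in; linkchecker itself uses Python regular expressions, so this is only an approximation that holds for simple patterns like this one. The sample URLs and the helper names sample_urls and preview_ignore are made up for illustration.

```shell
# Made-up URLs to test the pattern against.
sample_urls() {
    cat <<'EOF'
https://your-domain.com/xmlrpc.php
https://your-domain.com/xmlrpc.php?rsd
https://your-domain.com/xmlrpc-php
https://your-domain.com/blog/
EOF
}

# Show which sample URLs match a pattern (grep -E stands in for
# linkchecker's own regex matching here).
preview_ignore() {
    sample_urls | grep -E -- "$1"
}

# Unescaped dot: "." matches any character, so xmlrpc-php matches too.
preview_ignore '/xmlrpc.php$'

# Escaped dot: only the literal file name matches.
preview_ignore '/xmlrpc\.php$'
```

On the command line, putting the pattern in single quotes, as in --ignore-url='/xmlrpc\.php$', keeps the shell from interpreting the backslash or the dollar sign.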

linkchecker https://your-domain.com -r1

The -r option sets the recursion depth: with -r1, linkchecker follows links only one level down from the start page.

linkchecker -V

This prints the installed version of linkchecker.

linkchecker https://your-domain.com --verbose

This logs every URL that linkchecker checks, not only the ones with errors or warnings.

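linkchecker https://your-domain.com -r1 --verbose --output=csv >> your-file.csv

This saves the results in CSV format. Note that >> appends to your-file.csv; use > instead to overwrite the file on each run.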
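
A CSV report like the one written by the command above can then be filtered from the shell. The sketch below lists the rows marked as invalid. It rests on several assumptions that should be checked against a real file: fields separated by ";", comment lines starting with "#", a header row containing columns named urlname, result and valid, the value False for failed URLs, and no ";" or line breaks inside quoted fields. The sample data is made up and shows only a few columns; invalid_rows and sample_csv are hypothetical helper names.

```shell
# Print "urlname<TAB>result" for every row whose "valid" column is False.
# Columns are looked up by name in the header row, not by position.
invalid_rows() {
    awk -F';' '
        /^#/ || /^$/ { next }        # skip comment lines and empty lines
        !seen_header {               # first remaining line is the header
            for (i = 1; i <= NF; i++) col[$i] = i
            seen_header = 1
            next
        }
        $(col["valid"]) == "False" { print $(col["urlname"]) "\t" $(col["result"]) }
    ' "$@"
}

# Illustrative sample with a reduced set of columns (made-up values).
sample_csv() {
    cat <<'EOF'
# illustrative sample, not real linkchecker output
urlname;parentname;result;valid
/about;https://your-domain.com/;200 OK;True
/missing-page;https://your-domain.com/;404 Not Found;False
EOF
}

sample_csv | invalid_rows
```

With a real report the call would be: invalid_rows your-file.csv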

To check the links in a local HTML file:

linkchecker url-file-list.html --verbose --check-extern
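
If the URLs to check exist only as a plain list, an HTML page like url-file-list.html can be generated from it first. A minimal sketch, assuming a hypothetical input with one URL per line and URLs that contain no characters needing HTML escaping (such as & or quotes); make_link_page is a hypothetical helper name.

```shell
# Read URLs from standard input (one per line) and write a minimal HTML
# page that links to each of them.
make_link_page() {
    printf '%s\n' '<html><body>'
    while IFS= read -r url; do
        # Skip empty lines.
        [ -n "$url" ] || continue
        printf '<a href="%s">%s</a><br>\n' "$url" "$url"
    done
    printf '%s\n' '</body></html>'
}
```

It could be used as: make_link_page < urls.txt > url-file-list.html (urls.txt is a hypothetical file name).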

Last modified: Sunday, 29 September 2019, 7:26 PM