Checking your website automatically with Linkchecker
To install linkchecker on a Debian-based system such as Ubuntu:
sudo apt install linkchecker
Linkchecker is written in Python.
Then run it like this:
linkchecker https://your-domain.com
If your website is large and you are tired of waiting for the check to finish, just press Ctrl+C to stop it.
As you will see, the output uses a lot of vocabulary, so it is worth going through it now:
- thread
- links queued
- URL checked
- runtime
- URL
- Name
- Parent URL
- Real URL
- Check time
- Result
What is a thread?
A thread is nothing more than a task that runs in parallel with other tasks. Here, the more threads you run, the faster the check will be. Note that this is not without consequences: more threads mean more simultaneous requests to your server. If your server cannot handle all those requests, they won't be processed properly and you may even "break" the server.
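As a minimal sketch of how to keep the load down, the function below runs linkchecker with a small, explicit number of threads through its --threads option (linkchecker uses 10 threads by default). The function name gentle_check, its default of 2 threads, and the LINKCHECKER variable (handy for a dry run with echo) are assumptions made for this example; they are not part of linkchecker.

```shell
#!/bin/sh
# Command to run. Override it (for example LINKCHECKER=echo) to print the
# arguments instead of sending requests. This variable is hypothetical.
LINKCHECKER="${LINKCHECKER:-linkchecker}"

# gentle_check URL [THREADS]
# Check URL with at most THREADS threads (2 if THREADS is not given).
gentle_check() {
    url="$1"
    threads="${2:-2}"
    "$LINKCHECKER" --threads="$threads" "$url"
}

# Example: gentle_check https://your-domain.com 4
```

Starting with a low number and raising it only if the server copes well is the cautious choice.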
Links queued
The number of links that still need to be checked. As the linkchecker crawler progresses it keeps finding new links, so depending on your settings the check may run for a long time.
URL checked
The number of URLs the crawler has already checked. Once a URL has been checked, it is not checked again.
Runtime
The time it took to run this series of checks.
URL
The URL of the link as it was found in the page.
Name
The name of the link, usually the text of the link itself.
Parent URL
The page on which the checked link was found, together with the position of the link on that page (line and column).
Real URL
The final URL once it has been normalized and any redirects from the server have been followed.
Check time
The time it took linkchecker to analyze it.
Result
The outcome of the check, for example whether the URL is valid or returned an error.
Linkchecker options
Linkchecker shows its full value once you know how to use its options.
linkchecker https://your-domain.com --check-extern
This command also checks your external links, that is, the links pointing to other websites.
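linkchecker https://your-domain.com --ignore-url='/xmlrpc\.php$'
This command tells linkchecker to ignore the URLs matching the regular expression you give it (here, any URL ending in /xmlrpc.php). Those URLs are only syntax-checked and never requested, so the check won't get stuck on them. The quotes keep the shell from touching the pattern.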
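Since the value of --ignore-url is a regular expression, it can help to try a pattern before starting a long crawl. Here is a small sketch that uses grep -E to show which URLs a pattern would match. It assumes the pattern may match anywhere in the URL, and grep's regex dialect is not identical to the Python one linkchecker uses, although simple patterns like this one behave the same in both. The function name would_ignore and the sample URLs are made up for the example.

```shell
#!/bin/sh
# Pattern in the style of --ignore-url: a URL ending in /xmlrpc.php.
# The backslash makes the dot literal; "$" anchors the end of the URL.
pattern='/xmlrpc\.php$'

# would_ignore URL
# Succeeds (exit status 0) when the pattern matches somewhere in URL.
would_ignore() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Try the pattern on a few sample URLs.
for url in \
    "https://your-domain.com/xmlrpc.php" \
    "https://your-domain.com/xmlrpc.php?rsd" \
    "https://your-domain.com/blog/"
do
    if would_ignore "$url"; then
        echo "ignored: $url"
    else
        echo "checked: $url"
    fi
done
```

Only the first URL is reported as ignored: the second one does not end in /xmlrpc.php because of its query string, so the "$" anchor rejects it.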
linkchecker https://your-domain.com -r1
The -r1 option limits the recursion depth to 1, so linkchecker goes just one level down from the start URL.
linkchecker -V
This prints the version of linkchecker.
linkchecker https://your-domain.com --verbose
This logs every URL linkchecker checks, not only the ones with errors or warnings.
linkchecker https://your-domain.com -r1 --verbose --output=csv >> your-file.csv
This command writes the results in CSV format and appends them to your-file.csv.
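A CSV file produced this way can then be filtered to keep only the broken links. The sketch below is one possible way to do it with awk. It assumes the file uses ";" as separator, starts with comment lines beginning with "#", and has a header row with columns named urlname, result and valid; check the header of your own file before relying on it. The function name broken_links is made up for the example, and fields that themselves contain a ";" are not handled.

```shell
#!/bin/sh
# broken_links FILE
# Print "URL -> result" for every row whose "valid" column is not "True".
broken_links() {
    awk -F';' '
        /^#/ || NF == 0 { next }          # skip comment and empty lines
        !seen_header {                    # first remaining line is the header
            for (i = 1; i <= NF; i++) col[$i] = i
            seen_header = 1
            next
        }
        $(col["valid"]) != "True" {
            print $(col["urlname"]) " -> " $(col["result"])
        }
    ' "$1"
}

# Example: broken_links your-file.csv
```

Looking the columns up by name rather than by position keeps the sketch working even if the column order differs between versions.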
To check a local file instead of a website:
linkchecker url-file-list.html --verbose --check-extern