This is a text-only version of the following page on https://raymii.org:
---
Title       :   Site update, self-hosted search via pagefind
Author      :   Remy van Elst
Date        :   01-07-2023 21:32
URL         :   https://raymii.org/s/blog/Site_update_self_hosted_search_via_pagefind.html
Format      :   Markdown/HTML
---



This is a static site, meaning that no server-side processing occurs. All HTML is generated out of a few folders full of markdown source and then uploaded to the cluster. Searching on this site was always provided by a text-box form that sent you to google with 'site:raymii.org' appended to it. Works fine, but it sends all data to Google. With my recent removal of all Google Ads on this site, as well as tracking via Google Analytics, sending searches via Google seems wrong. <br/><br/>I recently found the `pagefind` program which I now use on here, it is a self hosted static site search engine of sorts.

<p class="ad"> <b>Recently I removed all Google Ads from this site due to their invasive tracking, as well as Google Analytics. Please, if you found this content useful, consider a small donation using any of the options below:</b><br><br> <a href="https://leafnode.nl">I'm developing an open source monitoring app called  Leaf Node Monitoring, for windows, linux & android. Go check it out!</a><br><br> <a href="https://github.com/sponsors/RaymiiOrg/">Consider sponsoring me on Github. It means the world to me if you show your appreciation and you'll help pay the server costs.</a><br><br> <a href="https://www.digitalocean.com/?refcode=7435ae6b8212">You can also sponsor me by getting a Digital Ocean VPS. With this referral link you'll get $100 credit for 60 days. </a><br><br> </p>


You can read [all site-updated articles here][2].

Pagefind is [written][8] [in][9] [Rust][10] and runs after my static site generator binary.
`pagefind` indexes all generated HTML and provides an API to query that,
including a search UI which you can see at the bottom of every page here.
Perfect for my static site setup and it aims to not use much storage or
bandwidth.

The search box used to look like this:

![old search img][3]

When you entered a term and pressed `ENTER` you were sent to Google:

![old search via Google][4]



Now the search box looks like this:

![new search box][5]

I know, it's such a major change! Searching is instant and shows the results
right on the page:

![new search results][6]


**Notable changes include thumbnails and publication dates**. I have not done
any configuration whatsoever for the thumbnails, it just figured that out
by itself. Cool!


### What is pagefind?

Quoting the [pagefind website][1]:


> Pagefind is a fully static search library that aims to perform well on large
 sites, while using as little of your users' bandwidth as possible, and
 without hosting any infrastructure.

> Pagefind runs after Hugo, Eleventy, Jekyll, Next, Astro, SvelteKit, or any
 other SSG. The installation process is always the same: Pagefind only
 requires a folder containing the built static files of your website, so in
 most cases no configuration is needed to get started.

> After indexing, Pagefind adds a static search bundle to your built files,
 which exposes a JavaScript search API that can be used anywhere on your site.
 Pagefind also provides a prebuilt UI that can be used with no configuration
 (you can see the prebuilt UI at the top of this page).

> The goal of Pagefind is that websites with tens of thousands of pages should
 be searchable by someone in their browser, while consuming a reasonable
 amount of bandwidth. Pagefind's search index is split into chunks, so that
 searching in the browser only ever needs to load a small subset of the search
 index. Pagefind can run a full-text search on a 10,000 page site with a total
 network payload under 300KB, including the Pagefind library itself. For most
 sites, this will be closer to 100KB.



In my case this site has 489 articles as of the time this page was written. The
search index is around 5MB in size (files on the filesystem, this includes a
webassembly runtime).


Using the firefox devtools network tab performance analyzer I can see that
searching for the term `QObject` uses around 250kB, excluding the images:

![search usage][7]


This matches the statement above regarding network payload. The search term
`QObject` returns 8 results currently.


[1]: https://web.archive.org/web/20230526035657/https://pagefind.app/
[2]: https://github.com/cloudcannon/pagefind
[3]: /s/inc/img/search1.png
[4]: /s/inc/img/search2.png
[5]: /s/inc/img/search3.png
[6]: /s/inc/img/search4.png
[7]: /s/inc/img/search5.png
[8]: https://web.archive.org/web/20230701173548/https://enet4.github.io/rust-tropes/rust-evangelism-strike-force/
[9]: https://web.archive.org/web/20220815110335/https://kasma1990.gitlab.io/2017/06/25/nobody-expects-the-rust-evangelism-strike-force/
[10]: https://web.archive.org/web/20230701173833/http://n-gate.com/hackernews/2017/02/21/0/

---

License:
All the text on this website is free as in freedom unless stated otherwise.
This means you can use it in any way you want, you can copy it, change it
the way you like and republish it, as long as you release the (modified)
content under the same license to give others the same freedoms you've got
and place my name and a link to this site with the article as source.

This site uses Google Analytics for statistics and Google Adwords for
advertisements. You are tracked and Google knows everything about you.
Use an adblocker like ublock-origin if you don't want it.

All the code on this website is licensed under the GNU GPL v3 license
unless already licensed under a license which does not allows this form
of licensing or if another license is stated on that page / in that software:

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.

Just to be clear, the information on this website is for meant for educational
purposes and you use it at your own risk. I do not take responsibility if you
screw something up. Use common sense, do not 'rm -rf /' as root for example.
If you have any questions then do not hesitate to contact me.

See https://raymii.org/s/static/About.html for details.