Continued; filtering Drupal comment spam using an LLM
A continuation of my previous post on Filtering Drupal comments for spam
Overview
- Old Drupal 5 blog with my old posts stuck in the Drupal `node` table
- ~19k comments on my pages before I fixed spam filtering
- Keeping it local with Ollama
- Google Gemma3 offered a good speed/quality trade-off for detecting spam
- DeepSeek R1 takes way too long because it reasons through every comment
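A minimal sketch of what classifying a single comment against a local Ollama server could look like. It uses Ollama's `/api/generate` endpoint on the default port; the `gemma3` model tag, the prompt wording, and the `ask_ollama` / `is_spam` helper names are assumptions for illustration, not the exact script used here.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumed model tag; use whatever `ollama list` shows on your machine.
MODEL = "gemma3"

# Hypothetical prompt; asking for a one-word answer keeps parsing trivial.
PROMPT = (
    "You are a spam filter for blog comments. "
    "Answer with exactly one word, SPAM or HAM.\n\nComment:\n{comment}"
)


def ask_ollama(prompt, model=MODEL, url=OLLAMA_URL):
    """Send one non-streaming generate request and return the model's text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]


def is_spam(comment, ask=ask_ollama):
    """Return True when the model's reply starts with SPAM."""
    reply = ask(PROMPT.format(comment=comment))
    return reply.strip().upper().startswith("SPAM")
```

Passing the `ask` callable in as a parameter makes it possible to try the parsing logic with a stub before pointing it at a real model. Anything that is not clearly `SPAM` is treated as ham, so an unexpected reply keeps the comment rather than dropping it.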
TLDR:
- I filtered out a lot of spam comments with Gemma3, boiling the remaining comments down to a more manageable set
- Fully automating this process means I could easily miss genuine comments due to false positives
- Need to log the Drupal node ID so that I can link filtered comments back to my markdown blog posts
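One way the node ID logging could be sketched: write a verdict line per comment, keyed by node ID and comment ID, so that anything flagged as spam can still be reviewed by hand and traced back to its post. The `(nid, cid, body)` row shape, the `log_verdicts` name, and the CSV layout are assumptions for illustration; the classifier is passed in as any callable that returns `True` for spam.

```python
import csv
import io


def log_verdicts(rows, classify, out):
    """Write one CSV line per comment and return how many were kept.

    rows:     iterable of (nid, cid, body) tuples (assumed shape, e.g. the
              result of a query against the Drupal comments table).
    classify: callable taking the comment body, returning True for spam.
    out:      any writable text file object.
    """
    writer = csv.writer(out)
    writer.writerow(["nid", "cid", "verdict"])
    kept = 0
    for nid, cid, body in rows:
        spam = classify(body)
        # Spam verdicts are logged too, so false positives can be found later.
        writer.writerow([nid, cid, "spam" if spam else "ham"])
        if not spam:
            kept += 1
    return kept


if __name__ == "__main__":
    # Made-up sample rows and a trivial stand-in classifier.
    sample = [(12, 1, "cheap casino bonus"), (12, 2, "Thanks, this helped!")]
    buffer = io.StringIO()
    print(log_verdicts(sample, lambda body: "casino" in body, buffer))
    print(buffer.getvalue())
```

Keeping the spam rows in the log rather than discarding them is what makes false positives recoverable: the node ID points back to the post the comment belonged to.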