Hi everyone, I’ve been building my own log search server because I wasn’t satisfied with any of the alternatives out there and wanted a project to learn Rust with. It still needs a ton of work, but I wanted to share what I’ve built so far.

The repo is up here: https://codeberg.org/Kryesh/crystalline

I’ve also started putting together some documentation here: https://kryesh.codeberg.page/crystalline/

There are a lot of features I plan to add to it, but I’m curious to hear what people think and whether there’s anything you’d like to see out of a project like this.

Some examples from my lab environment:

events view searching for SSH logins from systemd journals and syslog events:

counting raw event size for all indices:

Performance is looking pretty decent so far, and it can be configured to not be too much of a resource hog, depending on the use case. Some numbers from my test install:

  • raw events ingested: ~52 million
  • raw event size: ~40GB
  • on disk size: ~5.8GB
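
As a rough sanity check on those numbers, here is a minimal Rust sketch of the arithmetic behind them. It assumes decimal gigabytes (1 GB = 10^9 bytes), and the function names are made up for illustration; none of this is taken from the Crystalline code. With the approximate figures above it works out to a compression ratio of roughly 6.9x and an average raw event of roughly 770 bytes.

```rust
/// Ratio of raw event size to on-disk size (both in the same unit).
fn compression_ratio(raw_gb: f64, disk_gb: f64) -> f64 {
    raw_gb / disk_gb
}

/// Average bytes per event, assuming 1 GB = 10^9 bytes.
fn bytes_per_event(size_gb: f64, events: f64) -> f64 {
    size_gb * 1e9 / events
}

fn main() {
    // Approximate figures quoted in the post.
    let events = 52e6;
    let raw_gb = 40.0;
    let disk_gb = 5.8;

    println!("compression ratio: {:.1}x", compression_ratio(raw_gb, disk_gb));
    println!("avg raw event: {:.0} bytes", bytes_per_event(raw_gb, events));
    println!("avg on-disk per event: {:.0} bytes", bytes_per_event(disk_gb, events));
}
```

Since the inputs are themselves rounded, the results are only good to about two significant figures.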

RAM usage:

  • with no searches running and 600MB-1GB ingested per day, it uses about 500MB of RAM
  • running the SSH search examples above brings it to about 600MB of RAM while the search is running
  • running the last example search, which gets the size of all events (and requires decompressing the entire event store), peaked at about 3.5GB of RAM usage
  • warmaster@lemmy.world · 20 days ago

    That looks great, congrats!

    If you’re targeting us, homelabbers, I’ll tell you what I would want from a log server:

    • Stupid easy installation (Docker / Proxmox)
    • Integrations with:
      • Proxmox
      • Docker
      • Home Assistant
      • Frigate
      • Scrypted
      • Jellyfin
      • Immich
      • Unifi
    • Kryesh@lemmy.world (OP) · 20 days ago

      Thanks! I’m definitely aiming for stupid easy installation and management for the app itself, but in my experience supporting a wide range of log sources is no small feat. I’ve been using fluentbit to handle collection from different sources, and the following setup has been working well for me:

      • docker ‘journald’ log driver
      • fluentbit ‘systemd’ input
      • fluentbit ‘http’ output like the one in the readme

      With that setup you can search for container logs by name, which works great with compose:

      or process logs from an nginx container like this to see traffic from external hosts:

      I’ll add a more complete example to the docs, but if you look in the repo there’s a complete example for receiving and ingesting syslog that you can run with just “docker compose up”

      • PlusMinus@lemmy.world · 19 days ago

        Maybe you should add OTLP support? I don’t know how you are ingesting from Fluentbit at the moment, but I think with OTLP basically any log source can be integrated either through the fluentbit OTLP plugin or an OTEL collector.

        • Kryesh@lemmy.world (OP) · 19 days ago

          I’m currently using the fluentbit http output plugin. Fluentbit can act as an otel collector with an input plugin, which could then be routed to the http output plugin. Long term I’ll probably look at adding OTLP support, but there are other features that take priority in the app itself, such as scheduled searching and notifications/alerting.

  • LiPoly@lemmynsfw.com · 20 days ago

    Kudos on the project! I often thought about building something similar myself, because I wasn’t happy with what’s out there. Everything is so complicated to set up and way too oversized for a simple log collection service, or the UI is just bad and super unintuitive for no reason. Glad you’re bringing some new wind into the space.

  • pineapple@lemmy.ml · 20 days ago

    I don’t really understand the point of this. What kind of logs are you storing and why would you want to?

        • Kryesh@lemmy.world (OP) · 19 days ago (edited)

          Applications like metrics because they’re good for doing statistics so you can figure out things like “is this endpoint slow” or “how much traffic is there”

          Security teams like logs because they answer questions like “who logged in to this host between these times?” or “when did we receive a weird looking http request?” Basically, any time you want to find specific details about a single event, logs are typically better, and threat hunting does a lot of analysis on specific one-time events.

          Logs are also helpful when troubleshooting: metrics can tell you there’s a problem, but in my experience you’ll often need logs to actually find out what the problem is so you can fix it.

        • Possibly linux@lemmy.zip · 19 days ago

          When you start seeing a lot of failed login attempts or other suspicious activity you know you are in trouble

  • ancoraunamoka@lemmy.dbzer0.com · 20 days ago

    This is very cool.

    I am slowly building my own syslog server with visualization, but it’s cool to see new stuff on the block.

    I have always been wary of big commercial services like Kibana, Grafana, etc.