7 comments

  • swaminarayan 1 hour ago
    How well does the Postgres-only approach hold up as data grows — did you benchmark it against Elasticsearch or a dedicated vector DB?
    • prvnsmpth 1 hour ago
      I've done small scale experiments with up to 100-500k rows, and did not notice any significant degradation in search query latency - p95 still well under 1s.

      I haven't directly compared against Elasticsearch yet, but I plan to do that next and publish some numbers. There's a benchmark harness setup already: https://github.com/getomnico/omni/tree/master/benchmarks, but there's a couple issues with it right now that I need to address first before I do a large scale run (the ParadeDB index settings need some tuning).

  • keyle 58 minutes ago
    I've done some RAG using postgres and the vector db extension, look into it if you're doing that type of search; it's certainly simpler than bolting another solution for it.
    • prvnsmpth 51 minutes ago
      Yeah, Omni uses Postgres and pgvector for search. ParadeDB is essentially just Postgres with the pgsearch extension that brings in Tantivy, a full-text search engine (like Apache Lucene).
  • Doublon 1 hour ago
    Interesting!

    I also started to build something similar for us, as an PoC/alternative to Glean. I'm curious how you handle data isolation, where each user has access to just the messages in their own Slack channels, or Jira tickets from only workspaces they have access to? Managing user mapping was also super painful in AWS Q for Business.

    • prvnsmpth 1 hour ago
      Thank you!

      Currently permissions are handled in the app layer - it's simply a WHERE clause filter that restricts access to only those records that the user has read permissions for in the source. But I plan to upgrade this to use RLS in Postgres eventually.

      For Slack specifically, right now the connector only indexes public channels. For private channels, I'm still working on full permission inheritance - capturing all channel members, and giving them read permissions to messages indexed from that channel. It's a bit challenging because channel members can change over time, and you'll have to keep permissions updated in real-time.

  • vladdoster 1 hour ago
    Multiple pages link to a `API Reference` that returns a 404
    • prvnsmpth 59 minutes ago
      Oops, sorry! That page is still a WIP, haven't pushed it yet. The plan was to expose the main search and chat APIs so that users can build integrations with third-party messaging apps (e.g. Slack), but haven't gotten around to properly documenting all the APIs yet.
  • octoclaw 1 hour ago
    [dead]
  • shablulman 1 hour ago
    [dead]