
At the recent CiviCon in San Francisco, there were a few discussions about how we might increase the use of AI in CiviCRM. San Francisco seemed like the right place to be talking about this: on my first day there I came across a disturbing and provocative billboard that read "Stop Hiring Humans!", yet AI already seemed to be part of the city, with Waymo driverless taxis cruising around and robots making your coffee. For the record, Waymo was actually a really capable driver, but the robot was nowhere near as good as a human barista on so many levels. And that's the point: AI should not be replacing humans but making us more productive.
So what might AI look like in CiviCRM? And is it even likely to happen any time soon? Well, we already have DocBot helping you find out how to do things based on the documentation. At the Sprint we started improving the documentation using ChatGPT, which is really good at summarising docs, ordering things sensibly and giving the writing a consistent tone of voice that matches your desired style (we did supply it with a style guide). As the documentation improves, this should make DocBot even better.
Insights and agents
The next steps then seem to be in two main areas: 1) using some kind of AI to produce insights from the data within the system and 2) giving DocBot (or similar) some level of agency so it doesn’t just tell you how to build a SearchKit but can actually build it for you (referred to as agentic AI).
In this article I'm going to focus on the insights we can obtain from the data held in CiviCRM, as this is something we're working on at Circle and it feels like we can make some pretty quick progress. We recently did a project that involved more traditional data visualisation, and a key concern there was much the same as with letting an AI near your data: do we really trust that other system?
B(I) before A(I)
In the case of the BI tools used for building the visualisation dashboards, we had to go through a process known as ETL (Extract, Transform and Load), and as part of that we removed names, phone numbers, addresses and any other obvious PII. We think we need to be sure something similar happens before any AI is allowed to access the data. This also has the advantage of producing a simpler data structure, which makes it easier for the AI to extract information.
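To make that concrete, here is a minimal sketch of the "transform" step in Python; the field names and flat record shape are made up for illustration rather than taken from our actual pipeline:

```python
# Minimal sketch of the "Transform" step: keep the contact ID and analytical
# fields, drop anything that looks like PII before the data goes anywhere near
# an AI or BI tool. Field names are illustrative, not CiviCRM's actual schema.

PII_FIELDS = {"first_name", "last_name", "email", "phone", "street_address"}

def anonymise(contact: dict) -> dict:
    """Return a copy of a contact record with obvious PII removed."""
    return {k: v for k, v in contact.items() if k not in PII_FIELDS}

raw = {
    "contact_id": 42,
    "first_name": "Ada",
    "email": "ada@example.org",
    "membership_type": "Gold",
    "join_date": "2019-03-01",
    "total_donations": 1250.00,
}

print(anonymise(raw))
# {'contact_id': 42, 'membership_type': 'Gold', 'join_date': '2019-03-01', 'total_donations': 1250.0}
```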
Alain Benbassat from Business and Code has done some really interesting work on an ETL process that uses CiviCRM itself as the visualisation platform. Taking advantage of entities and SearchKit, his code creates a star schema inside CiviCRM, with the advantage that the interface is familiar and roles can easily be transferred. Keeping the CiviCRM contact ID (cid) references also makes it easy to construct links back to the actual contacts, to view further details or to mark them in some way based on patterns found in the anonymised data.
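To illustrate the idea (this is not Alain's actual code), a fact row in such a star schema would carry only the cid, dimension keys and measures, and the cid is enough to build a link straight back to the contact using the standard CiviCRM contact view URL:

```python
# Sketch of a star-schema fact row: measures plus keys only, no PII.
# The cid is enough to link back to the real contact inside CiviCRM.

fact_contribution = {
    "contact_id": 42,        # cid - joins back to the real contact
    "date_key": 20250115,    # dimension key
    "campaign_key": 7,       # dimension key
    "amount": 50.00,         # measure
}

def contact_link(base_url: str, cid: int) -> str:
    """Standard CiviCRM contact summary URL built from a cid."""
    return f"{base_url}/civicrm/contact/view?reset=1&cid={cid}"

print(contact_link("https://example.org", fact_contribution["contact_id"]))
```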
We also think that different people will probably want to use different AIs. There are loads of options available and the number is only likely to increase, so we think the best approach here is a connector to a safe set of partially anonymised data that Claude, GPT et al. can query after some kind of authentication. In this case the end user produces the prompt: something like "hey GPT, can you tell me what my membership renewals look like next year if we increase our fees by X%, and give me a list of members most at risk of not renewing?"
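As a rough sketch of what that flow could look like once the connector has authenticated and returned anonymised rows, here is how a prompt plus data might be sent to one of those AIs. It uses the OpenAI Python SDK as one example; the connector itself, the data shape and the model choice are all assumptions:

```python
# Very rough sketch of the "user asks their chosen AI" flow, assuming the
# connector has already authenticated and returned anonymised rows.
from openai import OpenAI
import json

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Pretend these rows came back from the anonymised connector (no names, no PII).
members = [
    {"contact_id": 42, "membership_type": "Gold", "renewal_date": "2026-02-01",
     "years_member": 6, "events_attended_last_year": 0},
    {"contact_id": 97, "membership_type": "Standard", "renewal_date": "2026-05-14",
     "years_member": 1, "events_attended_last_year": 4},
]

question = ("What do my membership renewals look like next year if we increase "
            "our fees by 10%, and which members look most at risk of not renewing?")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are analysing anonymised CiviCRM membership data."},
        {"role": "user", "content": f"{question}\n\nData:\n{json.dumps(members, indent=2)}"},
    ],
)
print(response.choices[0].message.content)
```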
I have used terms like "safe" and "anonymised", but if we pass through contact IDs, those can be taken back into CiviCRM to reveal the actual people. Actions could then be defined, such as putting a set of contacts into a group for further processing or comms.
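For example, taking the cids an AI has flagged and adding them to a group could be a small APIv4 call. The sketch below assumes the APIv4 REST route and token-based authentication are available on your site, and the group ID is hypothetical:

```python
# Sketch of the "take the cids back into CiviCRM" step: add the flagged
# contacts to a group via the APIv4 REST endpoint. The URL, auth header and
# group ID are assumptions - adjust for how your site exposes APIv4.
import json
import requests

SITE = "https://example.org"
API_TOKEN = "..."          # e.g. an AuthX bearer token / API key
AT_RISK_GROUP_ID = 12      # hypothetical "At risk of not renewing" group

def add_to_group(contact_id: int) -> None:
    resp = requests.post(
        f"{SITE}/civicrm/ajax/api4/GroupContact/create",
        headers={"X-Civi-Auth": f"Bearer {API_TOKEN}"},
        data={"params": json.dumps({
            "values": {"group_id": AT_RISK_GROUP_ID, "contact_id": contact_id},
        })},
    )
    resp.raise_for_status()

for cid in [42, 97]:       # cids the AI flagged as at risk
    add_to_group(cid)
```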
Keeping it in-house
Another approach we're exploring involves not letting external AIs anywhere near the data, and instead getting any insights from a self-hosted LLM that we've trained on CiviCRM's data structures. At Circle we already operate something like this to help us with support tickets: it reads an incoming ticket and suggests similar tickets and documentation that may help our team deal with the issue more efficiently. It works pretty well, and we can see that it is capable of much more.
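We haven't published that tool, but the "suggest similar tickets" part boils down to embedding-based similarity search, which a few lines of Python can illustrate (the model and ticket text here are placeholders, not our production setup):

```python
# Toy illustration of "suggest similar tickets": embed past tickets and the
# incoming one, then rank by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

past_tickets = [
    "Scheduled reminders are not being sent for event registrations",
    "Contribution page shows a fatal error after upgrading",
    "Membership renewal emails going to spam",
]
incoming = "Event reminder emails stopped going out after the last update"

past_emb = model.encode(past_tickets, convert_to_tensor=True)
new_emb = model.encode(incoming, convert_to_tensor=True)

scores = util.cos_sim(new_emb, past_emb)[0]
best = scores.argmax().item()
print(f"Most similar past ticket: {past_tickets[best]} (score {scores[best].item():.2f})")
```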
So we are looking at how to connect this tool to a CiviCRM database and get it to run in the background with some pre-determined prompts, so that it produces useful insights. In this case the prompts might be along the lines of "highlight unusual trends in membership, donations and event registrations, and summarise each notable pattern in no more than three sentences". The prompts would evolve over time based on user feedback, but would be more curated than in the first example, where we imagine giving the user full freedom to ask whatever questions they come up with.
In this case the AI is operated by us, and as we are already processing that data, no additional agreements should be needed. The trust issue should also dissipate, as we are already the trusted guardians of the source data. And because we don't need to train our model on the entirety of human knowledge, it doesn't need as much processing power: our current ticketing model runs on a modest virtual machine.
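A background job like that could be as simple as the sketch below, which assumes a self-hosted model behind an Ollama-style local endpoint and feeds it a pre-aggregated, anonymised extract; everything here is illustrative rather than a finished design:

```python
# Sketch of a background job running curated prompts against a self-hosted
# model. An Ollama-style local endpoint is assumed; the summary figures,
# model name and prompt wording are illustrative only.
import json
import requests

CURATED_PROMPTS = [
    "Highlight unusual trends in membership, donations and event registrations "
    "and summarise each notable pattern in no more than three sentences.",
]

def monthly_summary() -> str:
    """Pretend extract of anonymised, aggregated figures from CiviCRM."""
    return json.dumps({
        "memberships": {"new": 34, "renewed": 210, "lapsed": 41},
        "donations": {"count": 480, "total": 18250.00},
        "event_registrations": {"count": 96},
    })

def run_prompt(prompt: str, data: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",   # local model, data never leaves the server
        json={"model": "llama3", "prompt": f"{prompt}\n\nData:\n{data}", "stream": False},
    )
    resp.raise_for_status()
    return resp.json()["response"]

for p in CURATED_PROMPTS:
    print(run_prompt(p, monthly_summary()))
```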
Next steps
We hope to have made some progress on both of these approaches in time for the October CiviCamp in the Netherlands. We're also hoping that the documentation push will continue there.
If you have any questions or thoughts on any of this, or if you'd like to be involved in any way, we'd love to discuss it. There is no such thing as a stupid question, and it's mainly through discussions around new ideas that we make breakthroughs. As we make concrete progress we'll post further announcements, and we very much hope to have something to show in October.