5 Data Mining Techniques for Nonprofit Organizations: Extracting Insights from Quantitative Sources
Introduction
While data mining is often associated with the corporate world, the benefits of this tool extend into the nonprofit realm.[1] Data mining is “a process of analyzing information that you’re collecting, including information like demographics, likenesses of site visitors or any other helpful information.”[2] According to some data science organizations, between 80% and 90% of nonprofits collect data, but roughly 50% to 60% of those organizations do not take full advantage of what they collect.[3] [4] Given that data mining has become more affordable and accessible than it was in the past, we anticipate that nonprofit organizations will more broadly embrace artificial intelligence; currently, education nonprofits utilize it the most.[5]
Nonprofit organizations must approach data mining intentionally and with specific objectives in mind in order to be maximally effective.[6] Data mining can, for example, provide invaluable insights into the characteristics of current donors and volunteers, which organizations can use to increase donations and volunteer numbers and to reduce donor and volunteer attrition. Indeed, nonprofits have a lot to gain from adopting the practice of data mining and artificial intelligence in general.
One basic, straightforward approach a nonprofit might take to mine its own data is to request that volunteers and donors complete surveys that ask about demographics and other information that may assist with future recruitment.[7] For example, how did volunteers hear about the organization, and what motivated them to take advantage of opportunities to volunteer? Once a nonprofit has collected sufficient data, analysts can then graph the results and use them to inform future recruitment efforts.
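As a rough illustration of the survey approach described above, the following Python sketch tallies answers to a single question and prints a simple text bar chart. The question, the answer options, and the function names are all hypothetical; a real survey export would likely need more cleanup than the whitespace and capitalization handling shown here.

```python
from collections import Counter

# Hypothetical survey responses to "How did you hear about us?"
responses = [
    "Friend or family", "Social media", "Social media", "Community event",
    "Friend or family", "Social media", "Newsletter", "Community event",
    "Social media", "friend or family ",
]

def tally(answers):
    """Count answers after trimming whitespace and normalizing case."""
    return Counter(a.strip().lower() for a in answers if a.strip())

def text_bar_chart(counts):
    """Return one line per answer, most common first, with a bar of '#'."""
    width = max(len(label) for label in counts)
    return [
        f"{label.ljust(width)} | {'#' * n} {n}"
        for label, n in counts.most_common()
    ]

counts = tally(responses)
for line in text_bar_chart(counts):
    print(line)
```

Normalizing the answers before counting matters because free-text survey fields often contain the same answer typed in slightly different ways.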
5 Data Mining Techniques
Of course, there are multiple data mining methods suitable for nonprofits, many of which are forms of machine learning or “the underlying process that drives artificial intelligence…[which]…analyzes sets of data to find complex patterns in order to ‘learn’ how that data will likely perform in the future”.[8] Read on for five additional techniques useful to nonprofit organizations.
- Predictive Analytics, also known as predictive modeling, “is the use of algorithms and machine learning to analyze data and make predictions based on that information.”[9] Giving Behavior Models are used to predict outcomes such as donors’ preferred means of giving, how likely they are to renew their giving commitment, and how much they are likely to give next time.[10] These models, each a form of propensity model,[11] can also provide valuable insights into how nonprofits may acquire new donors, recurring donor gift rates, and what campaign priorities are in an organization’s best interest.[12]
- Descriptive and Profiling Score Models help nonprofit organizations generate fundraising predictive analytics.[13] Furthermore, descriptive analytics assess historical data in order to identify patterns[14] and allow nonprofit organizations to learn about engagement scoring, RFM value, and estimated giving capacity.[15] RFM refers to the recency, frequency, and monetary value of donations to a nonprofit. A third form of predictive modeling, Giving Program Success Models, can be used to provide visualizations of the potential successes of giving programs such as principal giving, planned giving, annual giving, major giving, mid-level giving, and unit-specific giving.[16]
- Natural Language Processing (NLP) technology is used to “analyze text data, such as donor feedback or social media posts, to gain insights into donor sentiment and preferences.”[17] Perhaps the most appealing aspect of NLP is its ability to quickly process and make sense of large amounts of text, which allows nonprofit employees to focus on work that requires a human brain to complete.[18] In regard to its applicability to nonprofit organizations, NLP can be used in a variety of ways: term frequencies (e.g., term frequencies in a word cloud), part-of-speech tagging (e.g., identifying nouns), named entity recognition (classifying entities), topic modeling (text pattern discovery), and automatic text summarization (“TL;DR”).[19] One notable example is the use of NLP to analyze a dataset of 28,000 U.S. bills that lawmakers codified.[20] Given its marked efficiency, NLP enables organizations to make data-driven decisions much more quickly than they otherwise could have.
- Large Language Models (LLMs) are a subset of NLPs; they are related but not the same.[21] One contemporary example of an LLM is ChatGPT (GPT stands for Generative Pre-trained Transformer). In preparation for its current role, ChatGPT was trained using massive amounts of text to help it pick up on the patterns that would eventually enable it to communicate coherently, contextually, and convincingly. Another way to think about the difference between NLPs and LLMs is that LLMs process language and then produce new, complex language in response to that analysis. While NLPs are also capable of generating responses, they are limited to simpler models such as chatbots, virtual assistants (e.g., Siri and Alexa), and automated customer service.[22] These capabilities are undoubtedly invaluable to nonprofit website interfaces. Furthermore, while NLPs can be used on much simpler models like decision trees, LLMs require deep learning models (hence the name large language model).[23]
- DataDives are a data mining method created by DataKind[24] in which data scientists volunteer their time and skills to use high-powered technology to help nonprofit organizations solve problems and obtain the necessary insights to make informed decisions.[25] [26] During these DataDives, which are typically 48 hours in length and approached marathon-style, participants “collaborate and together analyze and visualize data sets to provide partner organizations initial insights or prototypes to inform their work and create real world change.”[27] These events may also follow a theme, such as web scraping, ChatGPT, data storytelling, or data governance.[28] In recent years, at least one DataDive focused on mining data relevant to COVID-19 topics such as vaccine hesitancy.[29]
- Smart Automation refers to “the use of AI to automate routine and repetitive tasks.”[30] This data mining method is useful to nonprofit organizations in its ability to complete tasks that are monotonous to humans so that employees and volunteers can spend their time on tasks that AI cannot effectively perform. Two forms of smart automation are Robotic Process Automation (RPA) and Intelligent Process Automation (IPA): RPA uses software robots to automate tasks that are manual and repetitive in nature, whereas IPA is capable of automating more complex tasks.[31] One example of smart automation is Microsoft Excel’s VBA macro, which uses RPA to automate simple repetitive tasks according to rules that data scientists predefine. Smart automation is indeed useful and relevant to nonprofits given that these organizations undoubtedly use spreadsheets to organize important information pertinent to their work and mission.
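To make the RFM measure mentioned under the predictive modeling techniques above more concrete, here is a minimal Python sketch that computes recency, frequency, and monetary value per donor. The gift records, the donor IDs, and the reference date are invented for illustration, and the sketch stops at the raw values; how an organization converts them into scores or segments is a separate design decision that the sources cited here do not specify.

```python
from collections import defaultdict
from datetime import date

# Hypothetical gift records: (donor_id, gift_date, amount)
gifts = [
    ("D1", date(2024, 1, 15), 50.0),
    ("D1", date(2024, 6, 1), 75.0),
    ("D2", date(2023, 3, 10), 500.0),
    ("D3", date(2024, 5, 20), 20.0),
    ("D3", date(2024, 6, 20), 20.0),
    ("D3", date(2024, 7, 1), 25.0),
]

def rfm_table(records, as_of):
    """Return {donor_id: (recency_days, frequency, monetary_total)}."""
    by_donor = defaultdict(list)
    for donor_id, gift_date, amount in records:
        by_donor[donor_id].append((gift_date, amount))

    table = {}
    for donor_id, donor_gifts in by_donor.items():
        last_gift = max(d for d, _ in donor_gifts)
        recency = (as_of - last_gift).days      # days since most recent gift
        frequency = len(donor_gifts)            # number of gifts on record
        monetary = sum(a for _, a in donor_gifts)  # total amount given
        table[donor_id] = (recency, frequency, monetary)
    return table

rfm = rfm_table(gifts, as_of=date(2024, 7, 31))
for donor_id, (r, f, m) in sorted(rfm.items()):
    print(f"{donor_id}: last gift {r} days ago, {f} gifts, ${m:.2f} total")
```

In this invented data, D2 has given the most money but has not given recently, while D3 gives small amounts often; RFM keeps those three dimensions separate so that each kind of donor remains visible.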
How to Perform Standard Data Mining Procedures
The Fu Foundation School of Engineering and Applied Science at Columbia offers a six-step standard procedure for data mining: business (objective) understanding, data understanding, data preparation, modeling, evaluation, and deployment.[32]
- Business (Objective) Understanding
Organizations must begin by identifying and understanding their objective(s): what kind of data are they mining? To what end? What technique will be used? Who will do the mining?[33] This first step may seem trivial or obvious, but organizations must have a solid understanding of their goal before spending the time, energy, and resources to mine data. Furthermore, any data mining goals should be informed by the nonprofit’s mission and objectives.
- Data Understanding
Another name for this step is “defining the data.”[34] To define and understand the data, analysts will first gather the data and then describe it in terms of how much data there is, how the data is coded, whether it contains numbers and/or strings, etc. Data scientists will then assess its quality by asking questions such as: What data gaps are there, if any? What errors are in the data, if any? Does the data contain any duplicates?
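The quality questions above (gaps, errors, duplicates) can be checked with a few lines of code. The sketch below is only an illustration under assumed data: the field names, the sample records, and the helper functions are hypothetical and would differ for any real donor database.

```python
# Hypothetical donor records, as they might come from a spreadsheet export
records = [
    {"donor_id": "D1", "zip": "10027", "last_gift": "50"},
    {"donor_id": "D2", "zip": "", "last_gift": "500"},
    {"donor_id": "D3", "zip": "60614", "last_gift": "twenty"},
    {"donor_id": "D1", "zip": "10027", "last_gift": "50"},
]

def profile(rows):
    """Summarize row count, fields, blank values per field, and duplicate rows."""
    fields = sorted({key for row in rows for key in row})
    missing = {
        f: sum(1 for row in rows if not str(row.get(f, "")).strip())
        for f in fields
    }
    seen, duplicates = set(), 0
    for row in rows:
        key = tuple(row.get(f) for f in fields)
        if key in seen:
            duplicates += 1  # an exact repeat of an earlier row
        seen.add(key)
    return {
        "rows": len(rows),
        "fields": fields,
        "missing": missing,
        "duplicates": duplicates,
    }

def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

report = profile(records)
bad_gifts = [r["donor_id"] for r in records if not is_number(r["last_gift"])]
print(report)
print("Non-numeric gift amounts for donors:", bad_gifts)
```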
- Data Preparation
Considered by some to be the most involved step in the data mining process, data preparation can take 50% to 70% of the effort and time that a project requires.[35] According to IBM, data preparation generally involves tasks such as merging datasets and/or records, choosing a sample subset of the data, aggregating records, deriving new attributes, sorting the data prior to modeling, replacing or removing missing data, and splitting the data into separate sets for training and testing. In other words, data must be “cleaned” before it can be effectively used for mining or other purposes.[36]
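A few of the preparation tasks listed above (removing duplicates, handling missing data, deriving a new attribute, and splitting into training and testing sets) might look like the following in Python. This is a sketch under simplifying assumptions: the records and field names are invented, rows with a missing total are simply dropped rather than filled in, and the 25% test fraction is an arbitrary choice.

```python
import random

# Hypothetical raw records; None marks a missing value
raw = [
    {"donor_id": "D1", "gifts": 2, "total": 125.0},
    {"donor_id": "D2", "gifts": 1, "total": 500.0},
    {"donor_id": "D2", "gifts": 1, "total": 500.0},   # duplicate
    {"donor_id": "D3", "gifts": 3, "total": None},    # missing total
    {"donor_id": "D4", "gifts": 4, "total": 80.0},
    {"donor_id": "D5", "gifts": 1, "total": 40.0},
]

def prepare(rows):
    """Drop repeated donors and rows with a missing total, then derive avg_gift.

    Assumes every remaining row has at least one gift.
    """
    cleaned, seen = [], set()
    for row in rows:
        if row["donor_id"] in seen or row["total"] is None:
            continue
        seen.add(row["donor_id"])
        cleaned.append({**row, "avg_gift": row["total"] / row["gifts"]})
    return sorted(cleaned, key=lambda r: r["donor_id"])

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly, then hold out a fraction of rows for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

prepared = prepare(raw)
train, holdout = split(prepared)
print(len(prepared), "clean rows;", len(train), "for training,", len(holdout), "held out")
```

Fixing the random seed makes the split repeatable, which helps when several analysts need to reproduce the same results.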
- Modeling
Data modeling refers to data mining techniques implemented to identify patterns and/or outliers in the data.[37] As previously mentioned, the predictive modeling data mining method enables data analysts to test for predictive patterns within a set of data. Data scientists may also use other modeling techniques, such as association rule mining, to identify unexpected patterns that lie within the data.
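Association rule mining, mentioned above, looks for items that tend to occur together. The Python sketch below is a deliberately small version that only considers pairs of items and uses the two standard measures, support (how often the pair appears) and confidence (how often the second item appears when the first does). The supporter activities and the thresholds are hypothetical; production work would more likely rely on an established algorithm such as Apriori from a dedicated library.

```python
from itertools import combinations

# Hypothetical record of what each supporter has done with the organization
baskets = [
    {"donated", "volunteered", "newsletter"},
    {"donated", "newsletter"},
    {"volunteered", "event"},
    {"donated", "volunteered", "event"},
    {"donated", "newsletter", "event"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is too rare to be worth reporting
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

rules = pair_rules(baskets)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support {s:.2f}, confidence {c:.2f}")
```

With this invented data, every newsletter subscriber has also donated, which is the kind of pattern an analyst might then investigate further before drawing conclusions.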
- Evaluation
The evaluation step in the data mining process is when data analysts determine whether the results of the data mining fulfill the objective(s) identified before the data mining began.[38] Should data scientists decide that any objectives were not met, or discover that new questions have emerged, they will then return to the modeling phase of the process; the data may also need to be changed before analysts undertake any additional modeling.
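One way to picture the evaluation step is to compare a model's predictions on held-out data against a target agreed on during the first step. The Python sketch below assumes a hypothetical renewal model and a hypothetical objective (at least 70% of donors flagged as likely to renew should actually renew); both the data and the target are invented for illustration.

```python
def precision_recall(predicted, actual):
    """Precision and recall for boolean predictions against boolean outcomes."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    flagged = sum(predicted)     # donors the model flagged as likely to renew
    positives = sum(actual)      # donors who actually renewed
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / positives if positives else 0.0
    return precision, recall

# Hypothetical held-out results: was the donor flagged, and did they renew?
predicted = [True, True, False, True, False, True, False, False]
actual = [True, False, False, True, True, True, False, False]

TARGET_PRECISION = 0.70  # hypothetical objective set during the first step

precision, recall = precision_recall(predicted, actual)
objective_met = precision >= TARGET_PRECISION
print(f"precision {precision:.2f}, recall {recall:.2f}, objective met: {objective_met}")
```

If the target were not met, the analysts would return to modeling, or to data preparation, as described above.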
- Deployment
Once data analysts are satisfied with the results of steps 1-5, the final step is to apply (“deploy”) the results.[39] During the deployment stage of the process, data mining results are both reported and recommended for implementation in the real world. For example, insights obtained from an analysis of consumer psychology may eventually inform grocery store layouts.
Conclusion
According to philanthropic consulting firm BWF, “AI technology is changing the way organizations fundraise.”[40] AI technology, such as data mining, indeed offers a powerful tool for nonprofit organizations, allowing them to unlock valuable insights from the data they collect. By leveraging techniques such as predictive analytics, natural language processing, large language models, DataDives, and smart automation, nonprofits can enhance their understanding of donor behavior, improve volunteer recruitment, and streamline operations. Following standard procedures, including business understanding, data preparation, modeling, and evaluation, ensures that nonprofits can effectively harness the potential of data mining. As this technology becomes more accessible and affordable, nonprofits that embrace data mining and artificial intelligence will be better positioned to fulfill their missions and create meaningful change in their communities.
Take Away
Nonprofits can greatly benefit from data mining and AI technologies by enhancing donor engagement, improving volunteer recruitment, and streamlining operations. Key techniques such as predictive analytics and natural language processing provide valuable insights, enabling data-driven decisions that align with their missions. As the article notes, “AI technology is changing the way organizations fundraise,” making these tools indispensable for modern nonprofit strategies.
[1] Nonprofit Hub. Data Mining for a Nonprofit Gold Rush. https://nonprofithub.org/data-mining-for-a-nonprofit-gold-rush/
[2] Nonprofit Hub. Data Mining for a Nonprofit Gold Rush. https://nonprofithub.org/data-mining-for-a-nonprofit-gold-rush/
[3] Discover Data Science. Data Science for Nonprofits. https://www.discoverdatascience.org/social-good/nonprofits/
[4] BWF. How to Unlock the Power of AI Fundraising: A Complete Guide. https://www.bwf.com/ai-fundraising/
[5] Di Troia, S. Parli, V., Pava, J., Zaman, Badi Uz Zaman, H. & Fitzsimmons, K. Inspiring Action: Identifying the Social Sector AI Opportunity Gap. Stanford University Human-Centered Artificial Intelligence. https://hai.stanford.edu/inspiring-action-identifying-social-sector-ai-opportunity-gap
[6] Nonprofit Hub. Data Mining for a Nonprofit Gold Rush. https://nonprofithub.org/data-mining-for-a-nonprofit-gold-rush/
[7] Nonprofit Hub. Data Mining for a Nonprofit Gold Rush. https://nonprofithub.org/data-mining-for-a-nonprofit-gold-rush/
[8] Paver, C. Artificial Intelligence for Non-Profits: Complete Explainer. Dataro. https://dataro.io/2024/02/16/artificial-intelligence-for-nonprofits/
[9] NPOInfo. 4 Data-Driven Fundraising Trends: Staying Ahead in the Field. https://npoinfo.com/data-driven-fundraising-trends/
[10] BWF. Fundraising Predictive Analytics: Expert Tips for Nonprofits. https://www.bwf.com/fundraising-predictive-analytics/
[11] Paver, C. Artificial Intelligence for Non-Profits: Complete Explainer. Dataro. https://dataro.io/2024/02/16/artificial-intelligence-for-nonprofits/
[12] BWF. Fundraising Predictive Analytics: Expert Tips for Nonprofits. https://www.bwf.com/fundraising-predictive-analytics/
[13] NPOInfo. 4 Data-Driven Fundraising Trends: Staying Ahead in the Field. https://npoinfo.com/data-driven-fundraising-trends/
[14] BWF. AI for Nonprofits: How to Leverage Machine Learning for Good. https://www.bwf.com/ai-for-nonprofits/
[15] Di Troia, S. Parli, V., Pava, J., Zaman, Badi Uz Zaman, H. & Fitzsimmons, K. Inspiring Action: Identifying the Social Sector AI Opportunity Gap. Stanford University Human-Centered Artificial Intelligence. https://hai.stanford.edu/inspiring-action-identifying-social-sector-ai-opportunity-gap
[16] NPOInfo. 4 Data-Driven Fundraising Trends: Staying Ahead in the Field. https://npoinfo.com/data-driven-fundraising-trends/
[17] Humanata. AI For Social Good. https://www.humanata.ca/bigdata
[18] Lee, A. & Kinsella, B. How the Social Sector Can Use Natural Language Processing. Stanford Social Innovation Review. https://ssir.org/articles/entry/how_the_social_sector_can_use_natural_language_processing
[19] Lee, A. & Kinsella, B. How the Social Sector Can Use Natural Language Processing. Stanford Social Innovation Review. https://ssir.org/articles/entry/how_the_social_sector_can_use_natural_language_processing
[20] Lee, A. & Kinsella, B. How the Social Sector Can Use Natural Language Processing. Stanford Social Innovation Review. https://ssir.org/articles/entry/how_the_social_sector_can_use_natural_language_processing
[21] Geeks for Geeks. NLP vs LLM: Understanding Key Differences. https://www.geeksforgeeks.org/nlp-vs-llm/#nlp-vs-llm
[22] Geeks for Geeks. NLP vs LLM: Understanding Key Differences. https://www.geeksforgeeks.org/nlp-vs-llm/#nlp-vs-llm
[23] Geeks for Geeks. NLP vs LLM: Understanding Key Differences. https://www.geeksforgeeks.org/nlp-vs-llm/#nlp-vs-llm
[24] Porway, J. DataKind expanding to help more social good organizations use data to advance their work. Knight Foundation. https://knightfoundation.org/articles/datakind-expanding-help-more-social-good-organizations-use-data-advance-their-work/
[25] Discover Data Science. Data Science for Nonprofits. https://www.discoverdatascience.org/social-good/nonprofits/
[26] Elsevier Foundation. DataKind. https://elsevierfoundation.org/partnerships/inclusive research/datakind/#:~:text=DataDives%20are%20high%20energy%2C%20marathon,serve%20and%20advance%20their%20missions.
[27] Elsevier Foundation. DataKind. https://elsevierfoundation.org/partnerships/inclusive research/datakind/#:~:text=DataDives%20are%20high%20energy%2C%20marathon,serve%20and%20advance%20their%20missions.
[28] National Center for State Courts. Data Dives. https://www.ncsc.org/consulting-and-research/areas-of-expertise/data/data-dives
[29] DataKind. Powering Public Data for Communities. DataKind Hosts Virtual DataDive Event. https://www.datakind.org/2021/02/25/powering-public-data-for-communities-datakind-hosts-virtual-datadiver-event/
[30] Song, Z. & Chun Ping, N. From Data to Impact: How Artificial Intelligent is Driving Non-Profit Success. https://books.google.com/books?hl=en&lr=&id=vOi1EAAAQBAJ&oi=fnd&pg=PT3&dq=nonprofits+AND+data+mining&ots=zjIE2IFR8M&sig=hPSfIm2iDGjQg990tp-PyAI2aG8#v=onepage&q=nonprofits%20AND%20data%20mining&f=false
[31] Song, Z. & Chun Ping, N. From Data to Impact: How Artificial Intelligent is Driving Non-Profit Success. https://books.google.com/books?hl=en&lr=&id=vOi1EAAAQBAJ&oi=fnd&pg=PT3&dq=nonprofits+AND+data+mining&ots=zjIE2IFR8M&sig=hPSfIm2iDGjQg990tp-PyAI2aG8#v=onepage&q=nonprofits%20AND%20data%20mining&f=false
[32] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[33] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[34] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[35] IBM. Data Preparation Overview. https://www.ibm.com/docs/en/spss-modeler/saas?topic=preparation-data-overview
[36] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[37] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[38] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[39] Columbia Engineering Boot Camps. Data Mining: The Complete Guide. https://bootcamp.cvn.columbia.edu/blog/data-mining-guide/
[40] BWF. How to Unlock the Power of AI Fundraising: A Complete Guide. https://www.bwf.com/ai-fundraising/