
Using PandasAI as a Method for Database Analysis

PandasAI is a Python library that lets users ask questions about databases in natural language and answers them by analyzing the data.

Preparation involves two steps: training an adapter for a Large Language Model (LLM) on a dataset of instruction/solution prompts, and writing a Python script that gives PandasAI access to a provided data table. Given a text prompt, PandasAI has the adapter generate Python code that analyzes the data table. Answers are returned as either text or a graph of the data.
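The prompt-to-code-to-answer flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's script and not the PandasAI API: the table contents, the fake_adapter_generate function, and the answer helper are all hypothetical, and the "generated" code is hard-coded where the trained adapter would normally produce it.

```python
import pandas as pd

# Hypothetical data table; the project's real table is not shown in the text.
table = pd.DataFrame(
    {
        "region": ["north", "south", "north", "east"],
        "sales": [120, 80, 100, 50],
    }
)

prompt = "Which region has the highest total sales?"


def fake_adapter_generate(prompt: str) -> str:
    """Stand-in for the fine-tuned adapter: returns hard-coded pandas code."""
    return "result = df.groupby('region')['sales'].sum().idxmax()"


def answer(prompt: str, df: pd.DataFrame):
    """Generate code for the prompt, run it against df, return `result`."""
    code = fake_adapter_generate(prompt)
    namespace = {"df": df}
    exec(code, namespace)  # runs the generated code; unsafe for untrusted code
    return namespace["result"]


top_region = answer(prompt, table)
print(top_region)
```

Because the generated code is executed directly, a real setup would normally run it in a restricted environment instead of a bare exec call.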

Current limits of using PandasAI:
– Datasets must be cleaned of inaccurate or null entries before PandasAI can return accurate results. Rows that contain blank entries, or that hold averages of other rows, are deleted.
– The sophistication of PandasAI's responses can be adjusted only within a narrow range. That range is determined by the rank of the Low-Rank Adaptation (LoRA), which fine-tunes the LLM by optimizing the parameters needed to follow the instructions in the text prompt. As the LoRA rank increases, PandasAI can answer more complicated questions. However, answers to simpler questions will then include more information than needed.
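To make the role of the LoRA rank more concrete, the sketch below counts the trainable parameters LoRA adds to a single weight matrix. In the standard LoRA formulation, a d_out x d_in weight is adapted by two low-rank factors of shapes d_out x r and r x d_in, so the adapter holds r * (d_out + d_in) trainable parameters. The layer size of 4096 and the ranks shown are assumptions chosen for illustration, not values taken from this project.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size; real models vary.
d_out = d_in = 4096
full = d_out * d_in  # parameters in the frozen full weight matrix

# Adapter size for a few illustrative ranks.
counts = {r: lora_trainable_params(d_out, d_in, r) for r in (4, 16, 64)}

for r, n in counts.items():
    share = n / full
    print(f"rank {r:>2}: {n:>7,} trainable parameters ({share:.2%} of the full matrix)")
```

The adapter's capacity grows linearly with the rank, which is consistent with higher ranks handling more complicated questions while tending to over-answer simple ones.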

Project development is ongoing.