Center for Applied Artificial Intelligence

Where cutting-edge technology meets translational science.

The Center for Applied Artificial Intelligence functions within the University of Kentucky Institute for Biomedical Informatics to explore new technologies, foster innovation, and support the application of artificial intelligence in translational science. 

Recent Projects


A Specialized, Customizable, Secure Self-Service Tool

Large Language Models (LLMs) are a powerful resource, but developing a custom model can be costly, technically complex, and may raise security and privacy issues. Furthermore, running a model is resource-intensive and may require more effort than individuals or small groups can spare. LLM Factory addresses these challenges by providing a secure environment in which users can develop their own tailored models, along with centralized inference for easy application distribution. LLM Factory is a self-service tool developed at the Center for Applied AI that offers users access to cutting-edge LLMs while prioritizing control, flexibility, accessibility, and security. Use LLM Factory to fine-tune your own LLM: train a model with your own data in a secure environment controlled by the University of Kentucky, configure a model to meet your needs, and produce more accurate and effective AI applications. Securely query cutting-edge models, like Llama 3 and Whisper, or interface with them via OpenAI-compatible APIs.

Citation

A paper detailing the development and usage of this tool can be found here: arXiv:2402.00913

More Information

Additional information, including a technical overview and user guide documentation, can be found here: https://hub.ai.uky.edu/llm-factory/

What supports LLM Factory?

The Center for Applied AI functions within the Institute for Biomedical Informatics. Efforts are supported by the University of Kentucky College of Medicine.

Why use LLM Factory?

Customization and Flexibility

LLM Factory allows users to fine-tune their own LLM. Develop specialized models tailored to a project’s specific needs. By leveraging cutting-edge AI advancements and the […]
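Because LLM Factory exposes its hosted models through OpenAI-compatible APIs, existing OpenAI client code can point at it with little more than a base-URL change. Below is a minimal sketch using the official openai Python client; the endpoint URL, API key, and model name are illustrative placeholders, not the service's actual values (see the user guide linked above for those).

```python
# Minimal sketch: querying an LLM Factory-hosted model through its
# OpenAI-compatible API. The base_url, api_key, and model name are
# hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm-factory.example.uky.edu/v1",  # hypothetical endpoint
    api_key="YOUR_LLM_FACTORY_KEY",                     # issued by the service
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # whichever hosted model you have access to
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what LLM Factory provides."},
    ],
)
print(response.choices[0].message.content)
```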


Say you have a dataset that you want a large language model (LLM) to have access to so that it can answer questions relating to that dataset. One option would be to fine-tune the model itself by retraining on the new dataset, but there are several problems with this approach. Retraining models is expensive, and new parameter tuning can overwrite old information the model was initially trained on. Additionally, LLMs can hallucinate and make up information about the dataset. Finally, if the dataset contains sensitive information, then you likely won’t want to use it for retraining, as the LLM could give away that information to users without proper access.

A better way to approach this problem is to use Retrieval-Augmented Generation (RAG) to connect the LLM to the dataset. This does not require any retraining of the model, and access to the data can be controlled at a user level. With this approach, the LLM can access the stored data and use it to answer questions.

This post will walk through an in-depth example of utilizing an LLM with a graph database. For this example, we will use a synthetic Fast Healthcare Interoperability Resources (FHIR) connected graph. FHIR is a system used to store healthcare information, formatted as an interconnected graph of resources, such as patients, treatments, visits, providers, and more. Thus, it can easily be represented as a graph database, through a platform such as Neo4j. FHIR is a complicated schema, with over 25 different node types […]
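To make the retrieval step concrete, here is a hedged sketch of the graph-RAG pattern described above: fetch records from a Neo4j graph with a Cypher query, then inject them into an LLM prompt as grounding context. The connection details, node labels, relationship names, and patient ID are hypothetical, not the actual FHIR graph schema used in the post.

```python
# Minimal graph-RAG sketch: pull context from a Neo4j graph, then ground an
# LLM prompt with it. Labels, relationships, and credentials are hypothetical.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def fetch_patient_context(patient_id: str) -> str:
    # Retrieve the patient's conditions from the graph instead of retraining.
    query = (
        "MATCH (p:Patient {id: $pid})-[:HAS_CONDITION]->(c:Condition) "
        "RETURN c.code AS code, c.display AS display"
    )
    with driver.session() as session:
        rows = session.run(query, pid=patient_id)
        return "\n".join(f"{r['code']}: {r['display']}" for r in rows)

def build_prompt(question: str, context: str) -> str:
    # The retrieved records become grounding context, so access control can
    # happen here, at the query layer, rather than inside the model.
    return (
        "Answer using only the patient records below.\n\n"
        f"Records:\n{context}\n\nQuestion: {question}"
    )

context = fetch_patient_context("example-patient-001")
prompt = build_prompt("What conditions does this patient have?", context)
# `prompt` can now be sent to any LLM endpoint the user is authorized to use.
print(prompt)
```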


The All of Us Research Program is a National Institutes of Health (NIH) initiative with the intent to build a robust dataset of at least one million participants with a focus on social, economic, racial, and age diversity. The mission of All of Us is to ensure that researchers can access data on groups that have been harmed by or underrepresented in medical research. The program collects a variety of data on consenting participants, including:

– Genome sequences
– FitBit/wearable fitness tracker data
– Social determinant surveys (questions relating to socioeconomic status, rural/urban living, access to healthcare, etc.)
– COVID-19 vaccination status
– COVID-19 pandemic experience surveys (how you were personally affected by the pandemic)
– Electronic health data

All data is collected on a consent basis for each collection period; participants are not required to turn over any data they are not comfortable having collected. If you would like to sign up to be a participant, you can do so here.

The program makes this data available to researchers through a state-of-the-art cloud analysis platform which supports analysis through SAS, Python, and R Studio. To access the researcher tools you must be associated with an organization that has a Data Use and Registration Agreement (the University of Kentucky has a DURA, which allows you to access the data given appropriate training). Additionally, if you would like to explore a database showing specific categories of data collected, you can do so using the All of Us Data Browser. University of […]


The Lower Extremity Amputation and Deficiencies Registry (LEADeR) site is a website we created for Shriners Children’s Lexington for reporting and analyzing congenital and acquired limb differences in pediatric patients. Through this web interface, Shriners researchers can add new patients, view data for existing patients, and form complex queries to extract exactly the information they need for studies and analysis. Below is a brief walkthrough of the primary pages of this web interface.

New Entry

On this page, researchers can input data for new patients, such as demographic information and medical data, like causes of injuries or deformities. Additionally, there are dropdowns for different classification and surgery types. There are additional modals to make data entry easier; for example, an image of a foot can make the process more visual, allowing the user to fill in the form with the checkboxes on the image.

View All

Users can view data for all patients on this page. Through the ‘column visibility’ button, users can change which columns are displayed to view the specific data they need. On the right side, there are buttons to edit or delete each entry. When choosing the edit button, the user is sent back to the “New Entry” page with all data filled out for that patient, giving a comprehensive overview. Users can edit the data on that page as needed and resubmit.

Search

The LEADeR site also contains a comprehensive search page. This page begins with a single dropdown, containing […]


The MELT-Mixtral-8x7B-Instruct-v0.1 Large Language Model (LLM) is a generative text model pre-trained and fine-tuned using publicly available medical data. As of now, our model is 6% more accurate than Google’s Med-PaLM, a 540-billion-parameter model roughly 10x its size. MELT is intended for research purposes only. MELT models are best suited for prompts using a QA or chat format.

The Medical Education Language Transformer (MELT) models have been trained on a wide range of text, chat, Q/A, and instruction data in the medical domain. While the model was evaluated using publicly available USMLE, Indian AIIMS, and NEET example questions, its use is intended to be more broadly applicable. MELT was trained using publicly available collections, which likely contain biased and inaccurate information. The training and evaluation datasets have not been inspected for content or accuracy.

Dataset

MELT-Mixtral-8x7B-Instruct-v0.1 is 68.2% accurate across 3 USMLE, Indian AIIMS, and NEET medical examination benchmarks, surpassing the pass mark (>60%) for U.S. Medical Licensing Examination (USMLE)-style questions.

Model Description

Developed by: Center for Applied AI
Funded by: Institute for Biomedical Informatics
Model type: LLM
Language(s) (NLP): English
License: Apache 2.0
Finetuned from model: Mixtral-8x7B-Instruct-v0.1

https://huggingface.co/IBI-CAAI/MELT-Mixtral-8x7B-Instruct-v0.1
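Since the model is published on Hugging Face under the ID above, it can be queried with the standard transformers chat workflow. A minimal sketch follows; the example question is illustrative, and in practice the 8x7B weights would likely need multiple GPUs or quantization rather than the plain fp16 load shown here.

```python
# Minimal sketch of querying MELT via Hugging Face transformers. Mixtral-8x7B
# is large; in practice you would likely need multiple GPUs or quantization.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IBI-CAAI/MELT-Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# MELT is tuned for QA/chat-style prompts, so use the chat template.
messages = [{"role": "user", "content": "What are common symptoms of anemia?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```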


In late 2023, we purchased a robot called the temi 3 (available here). This robot features a large touch screen, LIDAR for obstacle avoidance, and built-in text-to-speech, natural language processing, and automatic speech recognition, with no setup or training required. The tablet attached to the top of the robot runs on Android and controls all functions of the robot through an open-source SDK. Temi demos and videos are available on the Robot Temi YouTube channel. After learning about all the functionality baked into this robot and the extensive documentation of its SDK, we were confident that we could use it to develop applications in our environment.

There are two projects we are interested in working on. Both involve the smell inspector device described in this post to classify smells.

Analyzing Smells in Hospital Rooms

First, we want to detect whether urine or stool is present in a patient’s hospital room. This is in early development, and testing is taking place within our own office. Temi comes with a “patrol” mode, which visits all predefined waypoints on a floor map. To allow temi to roam freely, the robot needed to be led around our floor to build a map of the area and to add waypoints (e.g., Cody’s office, Snack Room, Conference Room). Once complete, we could programmatically start a patrol using the SDK. After mapping and patrolling were set up, we needed a way to mount the smell inspector sensor to the robot. Dmitry Strakovsky (UK […]


In 2023, Kentucky had the fifth-highest drug overdose fatality rate in the United States, and 79% of those deaths involved opioids. The purpose of this project is to provide research support to combat the opioid epidemic through machine learning and forecasting. The goal is to produce accurate forecasts at different geographical levels to identify which areas of the state are likely to be the most “high risk” in future weeks or months. With this information, adequate support could be prepared and provided to those areas, in the hope of treating victims in time and reducing the number of deaths associated with opioid-related incidents.

The first step was to analyze which geographical level would be most appropriate for building and training a forecasting model. We had EMS data containing counts of opioid-related incidents at six different geographical levels: state, county, zip code, tract, block group, and block. Through experimentation, the county level emerged as the most appropriate scale: the state level is too broad for useful results, while any level smaller than zip code proved too sparse. Smaller geographical levels contain too few positive examples of incidents for any model to successfully learn the trends of each area. However, data sparsity remains a problem even at the county level in less-populated areas, so we have also worked with Area Development Districts (ADDs), which are larger groupings of counties. Additionally, the temporal level was chosen to be monthly, rather than yearly or weekly, due […]
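As a rough illustration of the setup described above, the sketch below aggregates incident-level records to county-month counts and fits a simple lag-feature regressor. The column names, synthetic data, and model choice are all hypothetical stand-ins; the project's actual EMS features and forecasting models are not shown here.

```python
# Illustrative sketch of county-level monthly forecasting with lag features.
# Data and column names are made up; the real EMS pipeline differs.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
months = pd.period_range("2020-01", periods=36, freq="M")
frames = []
for county, rate in [("Fayette", 40), ("Pike", 8), ("Warren", 15)]:
    # Toy stand-in for monthly opioid-related incident counts per county.
    frames.append(pd.DataFrame({
        "county": county,
        "month": months,
        "count": rng.poisson(rate, size=len(months)),
    }))
monthly = pd.concat(frames, ignore_index=True).sort_values(["county", "month"])

# Lag features: predict this month's count from the previous two months.
monthly["lag1"] = monthly.groupby("county")["count"].shift(1)
monthly["lag2"] = monthly.groupby("county")["count"].shift(2)
train = monthly.dropna()

model = GradientBoostingRegressor(random_state=0)
model.fit(train[["lag1", "lag2"]], train["count"])

# Forecast the next month for each county from its two most recent counts.
latest = monthly.groupby("county").tail(2).groupby("county")["count"].apply(list)
for county, (prev2, prev1) in latest.items():
    pred = model.predict(pd.DataFrame({"lag1": [prev1], "lag2": [prev2]}))
    print(f"{county}: forecast ~{pred[0]:.1f} incidents next month")
```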


CLASSify is a web-based tool developed at the Center for Applied AI to make machine learning easier and more accessible. It provides a platform to train and evaluate machine learning (ML) classification models on any tabular data without requiring any programming background. Users simply upload their dataset to the site and choose the training parameters, and the job is sent off to train all chosen models and return results in the form of tables and visualizations. CLASSify also provides options for synthetic data generation to bolster imbalanced class labels or create entirely new datasets, as well as explainability scores that provide insight into which features of the data are most important to the models’ predictions. CLASSify is an all-in-one machine learning tool that allows researchers to easily and quickly compare models, gather results, and download any generated artifacts for later use.

Citation

A paper detailing the development and usage of this tool was submitted and accepted to the American Medical Informatics Association (AMIA) in 2023. This paper can be found here: https://arxiv.org/abs/2310.03618

More Information

Additional information, including a technical overview, instructional video, and user guide documentation, can be found here: https://hub.ai.uky.edu/classify/

Accessing CLASSify

CLASSify is available on an individual basis on CAAI’s self-service tool website. Before you can get started, you must be granted the necessary permissions by a CAAI Administrator. Please contact us for access or submit our collaboration intake form here.

Introduction

Clinicians often produce large amounts of data, from patient metrics to drug component analysis. Classical […]
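For a sense of the workflow CLASSify automates behind its interface, here is a hedged sketch of the equivalent manual process: training several scikit-learn classifiers on one tabular dataset and comparing them on a common metric. The models, dataset, and metric shown are illustrative choices, not necessarily the ones CLASSify actually uses.

```python
# Rough illustration of the model-comparison workflow CLASSify wraps in a UI.
# The model set, dataset, and metric here are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "random_forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
}

# Train every selected model and report a comparable score for each.
for name, model in models.items():
    model.fit(X_train, y_train)
    score = f1_score(y_test, model.predict(X_test))
    print(f"{name}: F1 = {score:.3f}")
```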


Segment Anything is a segmentation algorithm created by Meta Research. To make segmentation of medical images available to UK hospital staff, we needed a web interface that lets a layperson interact with segmentation. Meta Research provided a sample web interface that precompiled segmentations automatically, but it did not include correction or manual segmentation features. From there, however, the open-source community began to tinker, and we now have Segment-Anything-WebUI, which features a more robust toolset for segmenting images in the browser without needing to precompile any of the segmentations for viewing. Additionally, it allows you to upload local files to be segmented, then save the segmentations as JSON objects. This repository was the basis of the version we have developed at the Institute for Biomedical Informatics.

Accessing the Application

The web application is available in two forms. The first is through the hub site, which is hosted on University of Kentucky systems and is intended to assist in the annotation of medical images, as well as the training of more capable model checkpoints for Segment Anything that will improve annotation, with the goal of automatic or single-click annotation. The second is downloading and building the repository on your own local machine. Instructions for building and running the site are available in the repository readme.

How It Works

Upload A File: opens a file browser and allows you to upload an image to segment. The image must […]
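Underneath an interface like this, a user prompt (e.g., a click on the image) is turned into candidate masks by Meta's segment-anything package. Below is a minimal sketch of that step; the checkpoint filename, input image, click coordinates, and the JSON layout are example values (the WebUI's actual export schema may differ).

```python
# Minimal sketch of click-to-mask segmentation with Meta's segment-anything
# package (pip install segment-anything). Checkpoint path, image, click
# coordinates, and the JSON layout are example values.
import json

import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

# Load the uploaded image (RGB) and embed it once per image.
image = cv2.cvtColor(cv2.imread("scan.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# A single foreground click (label 1) at pixel (x=400, y=300).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[400, 300]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return a few candidate masks to choose from
)

# Save the best-scoring mask as JSON, echoing the WebUI's save-as-JSON idea.
best = masks[int(np.argmax(scores))]
with open("segmentation.json", "w") as f:
    json.dump({"mask": best.astype(int).tolist(), "score": float(scores.max())}, f)
```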